Kendall's Tau metric, based loosely on scipy. #2169

sorensenjs · 2020-09-17T13:29:32Z

Description

Adds Kendall's Tau, a measure of ordinal concordance, as a Keras-compatible metric.

Type of change

Checklist:

I've properly formatted my code according to the guidelines
- By running Black + Flake8
- By running pre-commit hooks
This PR addresses an already submitted issue for TensorFlow Addons
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
This PR contains modifications to C++ custom-ops

How Has This Been Tested?

Multiple tests have been included, and because this metric exists in non-TF form in scipy, it's easy to verify correctness.
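For readers who want an independent check, a naive reference can be sketched in plain Python. This is an illustrative O(n^2) tau-b computation (the variant scipy.stats.kendalltau reports by default), not the code in this PR; the function name `kendall_tau_b` is a hypothetical helper.

```python
import math


def kendall_tau_b(x, y):
    """Naive O(n^2) Kendall's tau-b (hypothetical reference helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    concordant = discordant = 0
    tied_x = tied_y = 0  # a pair tied in both inputs counts in each tally
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0:
                tied_x += 1
            if dy == 0:
                tied_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    total = n * (n - 1) // 2
    # tau-b normalizes by the pairs that are untied in x and in y.
    denom = math.sqrt((total - tied_x) * (total - tied_y))
    if denom == 0:
        return float("nan")
    return (concordant - discordant) / denom


# 7 concordant and 3 discordant pairs out of 10, no ties.
print(kendall_tau_b([1, 2, 3, 4, 5], [3, 1, 2, 5, 4]))  # 0.4
```

A test could compare a TensorFlow result against a reference like this on small inputs, including inputs with ties.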


bot-of-gabrieldemarmiesse · 2020-09-17T13:30:50Z

@marload

You are owner of some files modified in this pull request.
Would you kindly review the changes whenever you have the time to?
Thank you very much.

tensorflow_addons/metrics/kendalls_tau.py

AakashKumarNain · 2020-09-18T05:19:44Z

Thanks for the contribution. I am not very familiar with the scipy implementation. @WindQAQ @bhack are you familiar with this one?

AakashKumarNain · 2020-09-20T09:32:17Z

LGTM! Thanks again. Some tests are failing, but they are on our end and not yours. Once we fix them, I think we can merge this.

bhack · 2020-09-20T11:06:58Z

/cc @brianwa84 @jvdillon Are you interested in this for the Tensorflow Probability repo? It could fit in the stats sub package: https://www.tensorflow.org/probability/api_docs/python/tfp/stats

jvdillon · 2020-09-21T22:42:35Z

Hello! I'd love to see this in TFP!

I'd also love to see the top-k loss, as described here: https://arxiv.org/pdf/1206.5280.pdf

jvdillon

I've left some initial comments, should you be interested in moving it to TFP.

jvdillon · 2020-09-21T22:43:03Z

tensorflow_addons/metrics/kendalls_tau.py

@@ -0,0 +1,195 @@
+# Copyright 2019 The TensorFlow Authors. All Rights Reserved.


jvdillon · 2020-09-21T22:47:16Z

tensorflow_addons/metrics/kendalls_tau.py

+    exchanges = 0
+    num = tf.size(y)
+    k = tf.constant(1, tf.int32)
+    while tf.less(k, num):


Since TFP is a general-purpose library, we need to write everything so that it does not presume eager execution. Minimally, this means no Python control flow (e.g., for, while, and if are not allowed).

More generally, code of this nature would be quite inefficient in a SIMD regime. Perhaps we could examine the full quadratic comparison? E.g.,

0.5 * tf.math.reduce_mean(tf.cast(score[..., tf.newaxis] < score[..., tf.newaxis, :], tf.float32), axis=-1)

In general this would be quite large. I therefore recommend extending this idea to "chunks" using a tf.while_loop to reduce them. In so doing we can handle many cases in memory yet also bound the memory by way of chunks.
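The chunking idea can be sketched in plain Python. This is only an illustration of the memory bound, not TensorFlow code: the outer Python loop stands in for what would be a tf.while_loop, and each block stands in for a broadcast comparison. The names `concordance_sum_chunked` and `chunk_size` are hypothetical.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def concordance_sum_chunked(x, y, chunk_size=2):
    """Concordant minus discordant pairs, computed one row chunk at a time.

    Only a (chunk_size x n) block of the full n x n comparison matrix
    exists at any moment, so peak memory is bounded by the chunk size.
    """
    n = len(x)
    total = 0
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # One block of rows from the pairwise sign-product matrix.
        block = [
            [sign(x[i] - x[j]) * sign(y[i] - y[j]) for j in range(n)]
            for i in range(start, stop)
        ]
        total += sum(sum(row) for row in block)
    # The matrix is symmetric with a zero diagonal, so each unordered
    # pair was counted twice.
    return total // 2


x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
n_pairs = len(x) * (len(x) - 1) // 2
# With no ties, tau is the concordance sum divided by the pair count.
print(concordance_sum_chunked(x, y) / n_pairs)  # 0.4
```

The result does not depend on `chunk_size`; the parameter only trades memory for the number of loop iterations.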

Thoughts?

sorensenjs · 2020-09-23

Only after I finished writing this did I fully examine the implementation of the AUC metrics in Keras, which use a bucketized approximation not that dissimilar to the two-dimensional bucket approach presented in https://arxiv.org/abs/1712.01521, which would be much more amenable to streaming. (OTOH, I'm unclear why Metrics need to be streaming; the space of classifier labels is usually much smaller than the input features.)

I also note that I should have looked at the completely rewritten scipy implementation, which works differently but still requires sorting by both of the full inputs, which seems largely incompatible with the tf Metrics design:
https://github.com/scipy/scipy/blob/01d8bfb6f239df4ce70c799b9b485b53733c9911/scipy/stats/stats.py#L4452

I suspect a reasonable thing to do here is to fork this into two separate efforts: one for tfp, which would focus on the explicit exact solution with a potential backoff to an approximation, and a TensorFlow Addons submission that focuses on implementing a streaming approximation. I think I can manage that, but it's going to take some time, and at present I don't have a good sense of which of these would be easier to do first.

sorensenjs · 2020-10-20

Hi, sorry, I'm finally back to this. I think from the discussion this PR should be moved to tfp, most likely in the https://github.com/tensorflow/probability/tree/master/tensorflow_probability/python/stats directory.

However, I think I need to make a significant redesign to comply with the suggestion of removing the control flow.

I'm not quite able to distill the consensus here on next steps. Maybe I should do all of these:

- Make a tfp PR?
- Reimplement this as a streaming algorithm per the previously mentioned arXiv paper.
- Change this current PR to use tf.while_loop, possibly with chunks.

bhack · 2020-09-21T23:35:06Z

@jvdillon Thanks, I have a question. It seems that the TFP repository currently doesn't host metrics, right? We could have cases like this one, or our Matthews correlation impl, where the user may need to consume it as an object inheriting from tf.keras.metrics.Metric.
So if the op is in TFP but the metric is here, we need to depend on TFP. Do you plan to support metrics directly?

jvdillon · 2020-09-22T02:28:04Z

We don't have Keras metrics, but we do have standalone functions which might be regarded as metrics. We strive to be as library-agnostic as possible. Where possible we prefer simple functions, and we have a strict policy that all functions must support batch input. Take a look at the other tfp.stats functions for details.


bhack · 2020-09-22T17:52:44Z

@jvdillon OK, but I see that you have Keras layers and optimizers in TFP.
I think it is fine to have an op like this in TFP, but then why not also have a metric that wraps the operator (when it is useful to consume it as a metric)?
I think it will be hard for TF Addons to depend on TFP currently just to implement metrics.

jvdillon · 2020-09-22T18:06:23Z

I will bring this question to the team. In the meantime, maybe proceed with adding the core function?


bhack · 2020-09-22T18:09:34Z

@jvdillon It is ok for me.
@AakashKumarNain @sorensenjs what do you think?

P.S. Are you also interested in https://github.com/tensorflow/addons/blob/master/tensorflow_addons/metrics/matthews_correlation_coefficient.py?

brianwa84 · 2020-09-23T01:27:09Z

I guess keras metrics have some streaming functionality? At least the old tf.metrics did.
Note that we have similar code for streaming reductions baking under tfp.experimental.mcmc (covariance, moments, etc). This uses tensors instead of Variables, though.

AakashKumarNain · 2020-09-24T17:54:43Z

> I guess keras metrics have some streaming functionality? At least the old tf.metrics did.
> Note that we have similar code for streaming reductions baking under tfp.experimental.mcmc (covariance, moments, etc). This uses tensors instead of Variables, though.

@brianwa84 Yes, the Metric api has streaming functionality.
PS: I am okay with adding the core functionality for the time being.

cc: @seanpmorgan

bhack · 2020-10-18T10:50:17Z

@seanpmorgan Before we lose this PR: I am still inclined to include this in TFP.

bhack · 2020-10-26T17:07:45Z

@jvdillon @brianwa84 Do you have sufficient GitHub permissions to migrate this to TFP?

brianwa84 · 2020-10-26T17:41:45Z

I don't know what you mean by "migrate". I think you can just open a new PR against TFP.

For now, we could perhaps put something like this under tfp.experimental.stats, where we have other streaming metrics like covariance etc. Or you could modify to drop the streaming component and simply compute on a single tensor of observations, which would be more consistent w/ e.g. tfp.stats.covariance. We would likely want to eliminate the for loops for TFP. If you really need a loop, we usually use tf.while_loop or tf.scan to avoid bloating the tf.function graph.

bhack · 2020-10-26T17:48:30Z

> I don't know what you mean by "migrate".

Yes, I had assumed there was a related issue for this feature, since our contribution policy requires one. But in this case we only have the PR itself.

> For now, we could perhaps put something like this under tfp.experimental.stats, where we have other streaming metrics like covariance etc. Or you could modify to drop the streaming component and simply compute on a single tensor of observations, which would be more consistent w/ e.g. tfp.stats.covariance. We would likely want to eliminate the for loops for TFP. If you really need a loop, we usually use tf.while_loop or tf.scan to avoid bloating the tf.function graph.

The plan is ok for me

@sorensenjs
Can you open the PR at https://github.com/tensorflow/probability/ ?

sorensenjs · 2020-10-29T17:14:54Z

Created tensorflow/probability#1147, following up there with requested changes from this thread.

sorensenjs · 2021-03-12T14:09:27Z

This was added to TensorFlow Probability in tensorflow/probability@9999dd4.

See https://github.com/tensorflow/probability/blob/master/tensorflow_probability/python/stats/kendalls_tau.py for more details.

bhack · 2021-03-12T14:14:50Z

It would be nice if it could be exposed as a metric.

sorensenjs · 2021-03-12T14:22:35Z

Kendall's tau doesn't fit the metrics streaming style well. There are approximate versions (see https://arxiv.org/pdf/1712.01521.pdf) that are more interesting and well suited to the metrics. I might try doing that.
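To illustrate why a bucketized approximation suits the streaming style, here is a minimal plain-Python sketch. It is a simple 2D-histogram approach under stated assumptions, not the algorithm from the linked paper and not the code from any PR; the class name `BucketizedKendallTau` and its parameters are hypothetical. Values that fall in the same bucket are treated as ties, which is where the approximation error comes from.

```python
import math


class BucketizedKendallTau:
    """Approximate tau-b from a fixed-size grid of bucket counts (sketch)."""

    def __init__(self, low, high, num_buckets):
        self.low = low
        self.high = high
        self.num_buckets = num_buckets
        # State is O(num_buckets^2) regardless of how many samples arrive.
        self.counts = [[0] * num_buckets for _ in range(num_buckets)]

    def _bucket(self, v):
        frac = (v - self.low) / (self.high - self.low)
        # Clamp out-of-range values into the first or last bucket.
        return min(self.num_buckets - 1, max(0, int(frac * self.num_buckets)))

    def update(self, xs, ys):
        """Accumulate one batch of paired observations."""
        for x, y in zip(xs, ys):
            self.counts[self._bucket(x)][self._bucket(y)] += 1

    def result(self):
        b, m = self.num_buckets, self.counts
        concordant = discordant = 0
        for i in range(b):
            for j in range(b):
                if m[i][j] == 0:
                    continue
                for k in range(i + 1, b):
                    # Cells with larger x bucket and larger / smaller y bucket.
                    concordant += m[i][j] * sum(m[k][j + 1:])
                    discordant += m[i][j] * sum(m[k][:j])
        n = sum(sum(row) for row in m)
        total = n * (n - 1) // 2
        row_sums = [sum(row) for row in m]
        col_sums = [sum(m[i][j] for i in range(b)) for j in range(b)]
        tied_x = sum(r * (r - 1) // 2 for r in row_sums)
        tied_y = sum(c * (c - 1) // 2 for c in col_sums)
        denom = math.sqrt((total - tied_x) * (total - tied_y))
        return float("nan") if denom == 0 else (concordant - discordant) / denom


metric = BucketizedKendallTau(low=0.0, high=1.0, num_buckets=10)
metric.update([0.05, 0.15], [0.05, 0.15])  # first batch
metric.update([0.25, 0.35], [0.25, 0.35])  # second batch
print(metric.result())  # 1.0
```

Because only the count grid is kept between batches, the update step is a plain accumulation, which is the shape a Keras Metric's state update expects. The cost is resolution: more buckets reduce the error but grow the state quadratically.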

bhack · 2021-03-12T14:30:17Z

> Kendall's tau doesn't fit the metrics streaming style well. There are approximate versions (see https://arxiv.org/pdf/1712.01521.pdf) that are more interesting and well suited to the metrics. I might try doing that.

Interesting. Thanks for the resource.

sorensenjs · 2021-03-23T03:25:46Z

I've created a completely new streaming-based algorithm for this abandoned PR: #2423

Kendall's Tau metric, based loosely on scipy.

419c5bb

boring-cyborg bot added the metrics label Sep 17, 2020

googlebot added the cla: yes label Sep 17, 2020

sorensenjs added 4 commits September 17, 2020 09:48

Some formatting, removing google only includes.

4058b33

Remove future imports.

4b294d7

Fix visually indented line.

c1e78b6

pre_commit reformatting.

7828aa5

AakashKumarNain reviewed Sep 18, 2020

tensorflow_addons/metrics/kendalls_tau.py (outdated, resolved)

AakashKumarNain reviewed Sep 18, 2020

tensorflow_addons/metrics/kendalls_tau.py (resolved)

AakashKumarNain reviewed Sep 18, 2020

tensorflow_addons/metrics/kendalls_tau.py (resolved)

sorensenjs added 2 commits September 18, 2020 07:13

Address comments, adding warnings, removing compat and future imports

d50ad60

Reformating.

196fd07

jvdillon reviewed Sep 21, 2020

bhack added the ecosystem-review label Oct 26, 2020

sorensenjs closed this Nov 1, 2020

sorensenjs deleted the kendallstau branch November 1, 2020 21:03

bhack mentioned this pull request Aug 5, 2022

Fix the Kendalls Tau metric when used in graph mode #2739

Merged

8 tasks
