Add fallback protection to Pixie cell clustering #999
…rite functionality for SOM assignment and consensus clustering
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks.
@@ -297,7 +297,9 @@
"cell_type": "markdown",
Line #15. compression='uncompressed'
I looked at an issue you investigated a while back, #860, about feather alternatives. You seemed to conclude that feather files are pretty solid and have efficient compression, but that they fall short on actual storage savings. Do you think it may be worth revisiting the compression aspect to see if zstd on the scale of a cohort would be beneficial?
Definitely worth looking into in the near future.
Looks good, just a question if compressing feather files yields anything useful.
Looks good!
Looks good!
What is the purpose of this PR?
Closes #992. Because cell clustering now takes a considerable amount of time, the chance of an intermediate failure also increases. There is currently no way to recover intermediate cell training data if this happens, so this needs to be added.
How did you implement your changes
Add overwrite functionality to the cell SOM assignment and cell consensus clustering functions. This allows the user to redo these steps if they use different parameters.
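As a rough illustration of the pattern, and not the code in this PR, the sketch below shows one way an overwrite flag can guard a long-running step: reuse the saved result if it exists, recompute only when asked. The helper name `run_step`, the JSON format, and the file name are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def run_step(output_path, compute, overwrite=False):
    """Run `compute` and save its result at `output_path`.

    Returns (result, ran). If the file already exists and `overwrite` is
    False, the saved result is loaded and `ran` is False.
    """
    output_path = Path(output_path)
    if output_path.exists() and not overwrite:
        return json.loads(output_path.read_text()), False

    result = compute()
    # Write to a temporary file first, then rename, so a crash mid-write
    # never leaves a partial result behind.
    tmp_path = output_path.with_suffix(output_path.suffix + ".tmp")
    tmp_path.write_text(json.dumps(result))
    tmp_path.replace(output_path)
    return result, True


with tempfile.TemporaryDirectory() as tmp_dir:
    out = Path(tmp_dir) / "cell_som_assignments.json"
    first, ran_first = run_step(out, lambda: {"cell_1": 3})
    # Same call again: the saved result is reused, nothing is recomputed.
    second, ran_second = run_step(out, lambda: {"cell_1": 7})
    # Different parameters: force a redo with overwrite=True.
    third, ran_third = run_step(out, lambda: {"cell_1": 7}, overwrite=True)
    print(ran_first, ran_second, ran_third)
```

Defaulting `overwrite` to False keeps a rerun after a failure cheap, since completed steps are skipped unless the user explicitly asks to redo them.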