Skip to content

Conversation

pp-mo
Copy link
Member

@pp-mo pp-mo commented Apr 3, 2023

Successor to the experimental #5069.
Addresses #5053

Draft while I see what tests it may break.

@codecov
Copy link

codecov bot commented Apr 3, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.01 ⚠️

Comparison is base (6b25c3a) 89.28% compared to head (79d16d1) 89.27%.

❗ Current head 79d16d1 differs from pull request most recent head 3dabeca. Consider uploading reports for the commit 3dabeca to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5229      +/-   ##
==========================================
- Coverage   89.28%   89.27%   -0.01%     
==========================================
  Files          88       88              
  Lines       22269    22273       +4     
  Branches     4870     4875       +5     
==========================================
+ Hits        19882    19884       +2     
- Misses       1641     1642       +1     
- Partials      746      747       +1     
Impacted Files Coverage Δ
lib/iris/fileformats/netcdf/loader.py 80.37% <100.00%> (+0.37%) ⬆️

... and 3 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@pp-mo pp-mo changed the title Small netCDF variable data is real. Use real array for data of of small netCDF variables. Apr 4, 2023
@bjlittle bjlittle requested a review from ESadek-MO April 4, 2023 09:11
@pp-mo pp-mo marked this pull request as ready for review April 4, 2023 10:05
@pp-mo
Copy link
Member Author

pp-mo commented Apr 4, 2023

@ESadek-MO I think this is good to go now.
Please take a look + see what you think.

N.B. there is basically no API and no "public view" of this, it's just a performance tweak.

On the other hand, it does affect some loads in a backwards-incompatible way.
@trexfeathers can we accept that the real/lazy status of data arrays is not something that we have to absolutely preserve,
or should this count as a breaking change ?

@trexfeathers
Copy link
Contributor

On the other hand, it does affect some loads in a backwards-incompatible way.
@trexfeathers can we accept that the real/lazy status of data arrays is not something that we have to absolutely preserve,
or should this count as a breaking change ?

I'm comfortable with this change. We encourage people to CHECK if their data is lazy or not, rather than to assume it is one way or the other.

Copy link
Contributor

@ESadek-MO ESadek-MO left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good on the whole, only one thing to address then I think you're good!

@ESadek-MO
Copy link
Contributor

Looks good, thanks @pp-mo!

@ESadek-MO ESadek-MO merged commit c74d783 into SciTools:main Apr 6, 2023
tkknight added a commit to tkknight/iris that referenced this pull request Apr 13, 2023
* upstream/main: (59 commits)
  Updated environment lockfiles (SciTools#5211)
  update ci locks location (SciTools#5228)
  Fixes to _discontiguity_in_bounds (attempt 2) (SciTools#4975)
  Finalises Lazy Data documentation (SciTools#5137)
  Modernize and simplify iris.analysis._Groupby (SciTools#5015)
  clarity on whatsnew entry contributors (SciTools#5240)
  Handle derived coordinates correctly in `concatenate` (SciTools#5096)
  Use real array for data of of small netCDF variables. (SciTools#5229)
  Bump scitools/workflows from 2023.04.1 to 2023.04.2 (SciTools#5236)
  fixing whatsnew entry
  remove results creation commit from blame
  configure codecov
  adding a whatsnew entry
  Replacing numpy legacy printing with array2string and remaking results for dependent tests
  Adding a whatsnew entry for 5224 (SciTools#5234)
  Cf cell method (SciTools#5224)
  Bump scitools/workflows from 2023.03.3 to 2023.04.1 (SciTools#5231)
  [pre-commit.ci] pre-commit autoupdate (SciTools#5230)
  Bump scitools/workflows from 2023.03.2 to 2023.03.3 (SciTools#5227)
  raise dask min pin (SciTools#5225)
  ...
pp-mo added a commit to pp-mo/iris that referenced this pull request Apr 13, 2023
* Small netCDF variable data is real.

* Various test fixes.

* More test fixing.

* Fix printout in Mesh documentation.

* Whatsnew + doctests fix.

* Tweak whatsnew.
lbdreyer pushed a commit that referenced this pull request Apr 21, 2023
* Basic functional lazy saving.

* Simplify function signature which upsets Sphinx.

* Non-lazy saves return nothing.

* Now fixed to enable use with process/distributed scheduling.

* Remove dask.utils.SerializableLock, which I think was a mistake.

* Make DefferedSaveWrapper use _thread_safe_nc.

* Fixes for non-lazy save.

* Avoid saver error when no deferred writes.

* Reorganise locking code, ready for shareable locks.

* Remove optional usage of 'filelock' for lazy saves.

* Document dask-specific locking; implement differently for threads or distributed schedulers.

* Minor fix for unit-tests.

* Pin libnetcdf to avoid problems -- see #5187.

* Minor test fix.

* Move DeferredSaveWrapper into _thread_safe_nc; replicate the NetCDFDataProxy fix; use one lock per Saver; add extra up-scaled test

* Update lib/iris/fileformats/netcdf/saver.py

Co-authored-by: Bouwe Andela <bouweandela@gmail.com>

* Update lib/iris/fileformats/netcdf/_dask_locks.py

Co-authored-by: Bouwe Andela <bouweandela@gmail.com>

* Update lib/iris/fileformats/netcdf/saver.py

Co-authored-by: Bouwe Andela <bouweandela@gmail.com>

* Small rename + reformat.

* Remove Saver lazy option; all lazy saves are delayed; factor out fillvalue checks and make them delayable.

* Repurposed 'test__FillValueMaskCheckAndStoreTarget' to 'test__data_fillvalue_check', since old class is gone.

* Disable (temporary) saver debug printouts.

* Fix test problems; Saver automatically completes to preserve existing direct usage (which is public API).

* Fix docstring error.

* Fix spurious error in old saver test.

* Fix Saver docstring.

* More robust exit for NetCDFWriteProxy operation.

* Fix doctests by making the Saver example functional.

* Improve docstrings; unify terminology; simplify non-lazy save call.

* Moved netcdf cell-method handling into nc_load_rules.helpers, and various tests into more specific test folders.

* Fix lockfiles and Makefile process.

* Add unit tests for routine _fillvalue_report().

* Remove debug-only code.

* Added tests for what the save function does with the 'compute' keyword.

* Fix mock-specific problems, small tidy.

* Restructure hierarchy of tests.unit.fileformats.netcdf

* Tidy test docstrings.

* Correct test import.

* Avoid incorrect checking of byte data, and a numpy deprecation warning.

* Alter parameter names to make test reports clearer.

* Test basic behaviour of _lazy_stream_data; make 'Saver._delayed_writes' private.

* Add integration tests, and distributed dependency.

* Docstring fixes.

* Documentation section and whatsnew entry.

* Various fixes to whatsnew, docstrings and docs.

* Minor review changes, fix doctest.

* Arrange tests + results to organise by package-name alone.

* Review changes.

* Review changes.

* Enhance tests + debug.

* Support scheduler type 'single-threaded'; allow retries on delayed-save test.

* Improve test.

* Adding a whatsnew entry for 5224 (#5234)

* Adding a whatsnew entry explaining 5224

* Fixing link and format error

* Replacing numpy legacy printing with array2string and remaking results for dependent tests

* adding a whatsnew entry

* configure codecov

* remove results creation commit from blame

* fixing whatsnew entry

* Bump scitools/workflows from 2023.04.1 to 2023.04.2 (#5236)

Bumps [scitools/workflows](https://github.com/scitools/workflows) from 2023.04.1 to 2023.04.2.
- [Release notes](https://github.com/scitools/workflows/releases)
- [Commits](SciTools/workflows@2023.04.1...2023.04.2)

---
updated-dependencies:
- dependency-name: scitools/workflows
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Use real array for data of of small netCDF variables. (#5229)

* Small netCDF variable data is real.

* Various test fixes.

* More test fixing.

* Fix printout in Mesh documentation.

* Whatsnew + doctests fix.

* Tweak whatsnew.

* Handle derived coordinates correctly in `concatenate` (#5096)

* First working prototype of concatenate that handels derived coordinates correctly

* Added checks for derived coord metadata during concatenation

* Added tests

* Fixed defaults

* Added what's new entry

* Optimized test coverage

* clarity on whatsnew entry contributors (#5240)

* Modernize and simplify iris.analysis._Groupby (#5015)

* Modernize and simplify _Groupby

* Rename variable to improve readability

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* Add a whatsnew entry

* Add a type hint to _add_shared_coord

* Add a test for iris.analysis._Groupby.__repr__

---------

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* Finalises Lazy Data documentation (#5137)

* cube and io lazy data notes added

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added comments within analysis, as well as palette and iterate, and what's new

* fixed docstrings as requested in @trexfeathers review

* reverted cube.py for time being

* fixed flake8 issue

* Lazy data second batch

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated lastest what'snew

* I almost hope this wasn't the fix, I'm such a moron

* adressed review changes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Bill Little <bill.james.little@gmail.com>

* Fixes to _discontiguity_in_bounds (attempt 2) (#4975)

* update ci locks location (#5228)

* Updated environment lockfiles (#5211)

Co-authored-by: Lockfile bot <noreply@github.com>

* Increase retries.

* Change debug to show which elements failed.

* update cf standard units (#5244)

* update cf standard units

* added whatsnew entry

* Correct pull number

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

---------

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* libnetcdf <4.9 pin (#5242)

* Pin libnetcdf<4.9 and update lock files.

* What's New entry.

* libnetcdf not available on PyPI.

* Fix for Pandas v2.0.

* Fix for Pandas v2.0.

* Avoid possible same-file crossover between tests.

* Ensure all-different testfiles; load all vars lazy.

* Revert changes to testing framework.

* Remove repeated line from requirements/py*.yml (?merge error), and re-fix lockfiles.

* Revert some more debug changes.

* Reorganise test for better code clarity.

* Use public 'Dataset.isopen()' instead of '._isopen'.

* Create output files in unique temporary directories.

* Tests for fileformats.netcdf._dask_locks.

* Fix attribution names.

* Fixed new py311 lockfile.

* Fix typos spotted by codespell.

* Add distributed test dep for python 3.11

* Fix lockfile for python 3.11

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Bouwe Andela <bouweandela@gmail.com>
Co-authored-by: Henry Wright <84939917+HGWright@users.noreply.github.com>
Co-authored-by: Henry Wright <henrywright@sky.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Manuel Schlund <32543114+schlunma@users.noreply.github.com>
Co-authored-by: Bill Little <bill.james.little@gmail.com>
Co-authored-by: Bouwe Andela <b.andela@esciencecenter.nl>
Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>
Co-authored-by: Elias <110238618+ESadek-MO@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: stephenworsley <49274989+stephenworsley@users.noreply.github.com>
Co-authored-by: scitools-ci[bot] <107775138+scitools-ci[bot]@users.noreply.github.com>
Co-authored-by: Lockfile bot <noreply@github.com>
@pp-mo pp-mo deleted the nc_real_small_coords branch April 28, 2023 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants