Description
Closes #5165
User story
- As an: Iris user loading NetCDF files written according to the CF Conventions.
- I want: Iris to capture malformed CF information during loading, instead of crashing or disposing of it.
- So that: I can fix CF problems within my Iris script - avoiding the complexity of multiple scripts/tools.
Why this is hard
There are many ways Iris crashes when it encounters bad CF. It is tempting to think of these crashes as deliberate - with easy-to-modify code blocks for each rule - and there are a few of these, but Iris is not a CF-checker. Instead we have used CF to make assumptions so that the code can be simpler/smaller; barely any of the crashes are raised from dedicated lines, and they are often hard to predict.
Any 'fix' must therefore be a form of generic error handling, which cannot have knowledge of what precisely might go wrong.
The architecture
- `iris.LOAD_PROBLEMS` - a global object where Iris can capture objects that could not be loaded, and the stack trace error that was raised.
- `build_raw_cube()` - a routine that will represent any `CFVariable` as a very basic `Cube`, with as little interpretation as possible.
- Separate building objects versus adding them to the `Cube` being loaded.
- Ensure all objects - including names, units, etcetera - are contained within their own building routine.
- `_add_or_capture()`:
  - Can't build the object? Use `build_raw_cube()` and store in `iris.LOAD_PROBLEMS`.
  - Object built but can't add to the `Cube`? Store in `iris.LOAD_PROBLEMS`.
- Issue a warning at the end of loading if anything is found within `iris.LOAD_PROBLEMS`.
- Make it easier to convert `Cube`s - output by `raw_cube_from_cf_var()` - into other objects e.g. `DimCoord`s? Documentation at the very least.
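The build-then-add flow in the list above can be sketched roughly as follows. This is a minimal illustration only, not the Iris implementation: `LOAD_PROBLEMS`, `build_raw_cube` and `add_or_capture` here are simplified stand-ins that use plain dicts and lists instead of real `Cube` and `CFVariable` objects, and `build_coord` is a hypothetical toy builder.

```python
import traceback

# Hypothetical stand-in for iris.LOAD_PROBLEMS: (object, stack trace) pairs.
LOAD_PROBLEMS = []


def build_raw_cube(cf_var):
    """Stand-in fallback: wrap the variable with minimal interpretation."""
    return {"raw_name": cf_var["name"], "data": cf_var.get("data")}


def add_or_capture(build_func, add_func, cf_var):
    """Build an object from cf_var and add it; capture anything that fails.

    Returns True if the object was built and added, otherwise False.
    """
    try:
        built = build_func(cf_var)
    except Exception:
        # Could not build: store a raw representation plus the stack trace.
        LOAD_PROBLEMS.append((build_raw_cube(cf_var), traceback.format_exc()))
        return False

    try:
        add_func(built)
    except Exception:
        # Built, but could not be attached: keep the built object itself.
        LOAD_PROBLEMS.append((built, traceback.format_exc()))
        return False

    return True


def build_coord(cf_var):
    # Toy builder: insists on a units attribute, as an example of a failure.
    return {"name": cf_var["name"], "units": cf_var["units"]}


cube_coords = []  # stands in for the Cube being loaded
ok = add_or_capture(build_coord, cube_coords.append, {"name": "lat", "units": "degrees"})
bad = add_or_capture(build_coord, cube_coords.append, {"name": "time"})
```

Because the handler only sees a builder, an adder and a variable, it needs no knowledge of which CF rule was broken; that is the sense in which the error handling is generic.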
Note that I have checked `cf.py` and believe it can remain unchanged. It already has a defensive philosophy: it checks whether variables can be interpreted as different types, and the remainder are all represented as `CFDataVariable`s, so we already have an existing fallback in place. Anything here that is not formatted correctly just shows up as extra `Cube`(s) in the loaded `CubeList`.
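As a rough illustration of that "classify what you can, default the rest to data" idea, here is a heavily simplified sketch. It is not the logic in `cf.py`: the function name, the three groups and the dict-based inputs are all assumptions made for the example.

```python
def classify_variables(variables, dimensions):
    """Split variables into coordinate-like groups, defaulting to data.

    `variables` maps a name to its attribute dict; `dimensions` is the set of
    dimension names in the file. Both are simplified, hypothetical inputs.
    """
    groups = {"coordinate": [], "auxiliary": [], "data": []}

    # Names listed in any variable's "coordinates" attribute.
    referenced = set()
    for attrs in variables.values():
        referenced.update(attrs.get("coordinates", "").split())

    for name in variables:
        if name in dimensions:
            groups["coordinate"].append(name)
        elif name in referenced:
            groups["auxiliary"].append(name)
        else:
            # Nothing matched: treat as a data variable, never discard it.
            groups["data"].append(name)
    return groups


result = classify_variables(
    {
        "time": {},
        "lat2d": {},
        "lon_2d": {},  # misnamed: air_temp refers to "lon2d"
        "air_temp": {"coordinates": "lat2d lon2d"},
    },
    dimensions={"time", "x", "y"},
)
```

In this toy case the misnamed `lon_2d` is not lost; it falls through to the data group, which corresponds to the "extra `Cube`" behaviour described above.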
More specifics on implementation
For reference when writing #6318 and #6319
- ✔ Iris: `__init__.py`
  - Create the `LOAD_PROBLEMS` object
  - Example structure: `{"file/path/1": [(problem_object_1, error_or_stacktrace), (problem_object_2, error_or_stacktrace)]}`
- `helpers.py`:
  - ✔ Introduce a new function - `_add_or_capture()` - that:
    - Attempts to call a `build_` routine (passed as an argument) in a `try`-`except`
      - On failure: falls back on `build_raw_cube()`.
    - Attempts to add successfully built objects (e.g. `DimCoord`) to a `Cube` (passed as an argument) in a `try`-`except`
      - On failure: adds the built objects to `iris.LOAD_PROBLEMS` instead.
  - ✔ Create the `build_raw_cube` function
  - Separate as much as possible into `build_` routines. We already have many, but even getting hold of standard names etcetera should be separated in this way.
  - Refactor `build_` routines to only perform the building - returning the built object rather than adding it to the `Cube`.
  - Make the build routines private - `_build...` - and call them from `build_and_add...` routines, which prepare the necessary arguments for `_add_or_capture()`. Here are examples that have already been completed:
    - `build_and_add_dimension_coordinate`
    - `build_and_add_names`
- ✔ `actions.py`:
  - Three `action_` routines have success criteria and failure information. These should be refactored so that a failure falls back to `build_raw_cube()`. The failure reason (already being recorded) should be captured in `iris.LOAD_PROBLEMS`. See "Tolerant handling of `standard_name` and dimension coordinate loading" (#6338) for how this was done with dimension coordinates.
  - ALL `action_` routines should be refactored to call the new `build_and_add` routines.
- Tests
- Confirm this can fix known cases (Common agreement on loading CF non-compliant NetCDF files #5165)
- Docstrings
- What's New
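The per-file `LOAD_PROBLEMS` structure and the end-of-load warning described above could look something like the sketch below. It only illustrates the shape of the data: `record_problem` and `warn_if_problems` are hypothetical helper names, the module-level `LOAD_PROBLEMS` is a stand-in for `iris.LOAD_PROBLEMS`, and the warning text and category are assumptions.

```python
import traceback
import warnings
from collections import defaultdict

# Hypothetical stand-in mirroring the example structure:
# {"file/path": [(problem_object, stack_trace), ...]}
LOAD_PROBLEMS = defaultdict(list)


def record_problem(file_path, problem_object, error):
    """Store the object together with a formatted stack trace of `error`."""
    trace = "".join(
        traceback.format_exception(type(error), error, error.__traceback__)
    )
    LOAD_PROBLEMS[file_path].append((problem_object, trace))


def warn_if_problems():
    """Issue a single summary warning once loading has finished."""
    total = sum(len(entries) for entries in LOAD_PROBLEMS.values())
    if total:
        warnings.warn(
            f"{total} object(s) could not be loaded from "
            f"{len(LOAD_PROBLEMS)} file(s); see LOAD_PROBLEMS.",
            UserWarning,
            stacklevel=2,
        )
    return total


# Simulate one failure during loading, then the end-of-load check.
try:
    float("not-a-number")
except ValueError as err:
    record_problem("file/path/1", {"name": "bad_coord"}, err)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    n_problems = warn_if_problems()
```

Keying by file path keeps problems from different files apart, and storing the formatted trace as a string avoids holding references to live frames; both are design choices of this sketch rather than confirmed details of the Iris implementation.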