Skip to content

Unstructured meshes attached to cubes should be read-only #4749

@jamesp

Description

@jamesp

The cube.mesh, or its component parts, should probably be read-only values.

When loading a ugrid netcdf file, all cubes returned:

  • share a reference to the same Mesh object with lazy realisation of mesh properties such as nodes and face-edges.
  • have their own MeshCoord coordinate objects that point to the same dask objects as those in the Mesh.

When iris realises data on e.g. calling .points it replaces the dask array with the numpy array, but only at the specific object on which it was called. Which means:

  • while the Mesh object with lazy properties is shared between cubes, realised instantiations of those properties are not, and will be held only by the cube, or mesh, separately
  • memory taken by these numpy arrays will not be shared
  • changing the values at one location does not change them elsewhere

A concrete example:

cube1, cube2 = iris.load('ugrid_data.nc')
# both cubes point to the same mesh object
assert cube1.mesh is cube2.mesh
# both cube meshcoord core_points point to the same dask object
assert cube1.coord('latitude').core_points() is cube2.coord('latitude').core_points()

# but if we realise one set of points, they become detached
cube1.coord('latitude').points[0] = 999
print(cube2.coord('latitude').points[0]) # => 37

The same will apply if the values on the cube.mesh are changed. Realising the nodes on cube.mesh does not replace the dask graph references in the meshcoords so changes will not propagate through.

# create two meshcoords from the (lazy) mesh
mc1_lazy = mesh.to_MeshCoord(location='face', axis='y')
mc2_lazy = mesh.to_MeshCoord(location='face', axis='y')

mc1_lazy.points[0] = 999
print(mc2_lazy.points[0]) # => 37

All of this does not hold if the mesh is initialised with real data instead of dask arrays. then all objects do share the same numpy buffer and changing one will affect the others:

# realise the latitude points on the faces
mesh.face_coords.face_y.points

# generate two new meshcoords now that the data is realised
mc1_real = mesh.to_MeshCoord(location='face', axis='y')
mc2_real = mesh.to_MeshCoord(location='face', axis='y')

mc1_real.points[0] = 999
print(mc2_real.points[0]) # => 999

All of this is very confusing. There may be better ways of dealing with this complexity (ideas welcome!) but for now a good first step might be to make the mesh properties read-only to avoid these inconsistencies from occurring

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    Status

    No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions