Support lazy saving #4190

@bouweandela

Description

✨ Feature Request

The iris.save function only supports saving a single file at a time and is not lazy. However, the dask.array.store function that backs the NetCDF saver supports delayed saving. For our use case it would be computationally more efficient if iris.save supported this too. Would it be possible to add a compute=False option to iris.save, so that instead of saving immediately it returns a dask.delayed.Delayed object that can be computed at a later time?
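To illustrate the mechanism the request builds on (this is plain dask, not iris API): dask.array.store with compute=False returns delayed objects that can all be computed in a single graph, so shared upstream chunks are evaluated only once.

```python
import dask
import dask.array as da
import numpy as np

# A lazy source array; in our use case this would be an expensively
# regridded cube's core data.
source = da.ones((4, 4), chunks=(2, 2))

# In-memory stand-ins for netCDF variables.
targets = [np.zeros((4, 4)), np.zeros((4, 4))]

# compute=False makes da.store return a Delayed instead of writing now.
delayed = [
    da.store(source + i, target, compute=False, lock=False)
    for i, target in enumerate(targets)
]

# One compute call executes both stores in a single graph, so the chunks
# of `source` are computed only once and re-used for both targets.
dask.compute(*delayed)
```

The same pattern is what a compute=False option on iris.save would expose for cubes.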

Alternatively, a save function in iris that saves a list of cubes to a matching list of files, one cube per file (similar to da.store, but working on cubes), would also work for us.
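This alternative also maps directly onto existing dask behaviour: da.store accepts matching lists of sources and targets and writes them all within one graph. A minimal sketch with in-memory targets standing in for netCDF variables:

```python
import dask.array as da
import numpy as np

# A shared lazy input, analogous to data that several output cubes
# derive from.
shared = da.arange(8, chunks=4)

# Matching lists: one source per target, like "one cube per file".
sources = [shared * 2, shared + 1]
targets = [np.empty(8), np.empty(8)]

# A single da.store call builds one graph for all stores, so the chunks
# of `shared` are computed only once.
da.store(sources, targets, lock=False)
```

A cube-aware version of this would save each cube in the list to the corresponding file while re-using any shared intermediate chunks.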

Motivation

In our case, multi model statistics in ESMValTool, we are interested in computing statistics (e.g. mean, median) over a number of climate models (cubes). Before we can compute those statistics, we need to load the data from disk and regrid the cubes to the same horizontal grid (and optionally to the same vertical levels). Then we merge all cubes into a single cube with a 'model' dimension and collapse along that dimension using e.g. iris.analysis.MEAN to compute the mean.

We want to store both the regridded input cubes and the cube(s) containing the statistics, each cube in its own netCDF file according to the CMIP/CMOR conventions. Because iris.save only saves a single cube to a single file and executes immediately, the load and regrid steps need to be executed (1 + the number of statistics) times. Support for delayed saving (or for saving a list of cubes to a matching list of files) would save computational time, because each chunk would be loaded and regridded only once and then re-used both for storing the regridded cube and for computing each statistic.

Additional context

Example script that shows the use case

This is an example script that demonstrates our workflow and how we could use the requested save function to speed up the multi-model statistics computation. Note that the script uses lazy multi-model statistics, which are still in development in ESMValGroup/ESMValCore#968.

import os
import sys

import dask
import dask.array as da
import iris
from netCDF4 import Dataset

from esmvalcore.preprocessor import multi_model_statistics, regrid


def save(cube, target, compute):
    """Save the data from a 3D cube to file using da.store."""
    # Note: the file handle must stay open until the delayed store is computed.
    dataset = Dataset(target, "w")
    dataset.createDimension("time", cube.shape[0])
    dataset.createDimension("lat", cube.shape[1])
    dataset.createDimension("lon", cube.shape[2])
    dataset.createVariable(
        "var",
        "f4",
        (
            "time",
            "lat",
            "lon",
        ),
    )

    return da.store(cube.core_data(), dataset["var"], compute=compute)


def main(in_filenames):
    """Compute multi-model statistics over the input files."""
    target_grid = "1x1"
    cubes = {}
    for in_filename in in_filenames:
        cube = iris.load_cube(in_filename)
        cube = regrid(cube, target_grid, scheme="linear")
        out_filename = os.path.basename(in_filename)
        cubes[out_filename] = cube

    statistics = multi_model_statistics(cubes.values(), "overlap", ["mean", "std_dev"])
    for statistic, cube in statistics.items():
        out_filename = statistic + ".nc"
        cubes[out_filename] = cube

    results = []
    for out_filename, cube in cubes.items():
        result = save(cube, out_filename, compute=False)
        results.append(result)

    dask.compute(results)

    # for out_filename, cube in cubes.items():
    #     iris.save(cube, out_filename)


if __name__ == "__main__":
    # This script takes a list of netCDF files containing 3D variables as arguments
    main(sys.argv[1:])
