Dask-friendly 1D interpolation

I'm writing an xarray+dask library to perform 1D interpolation (https://github.com/crusaderky/xarray_extras/blob/master/xarray_extras/interpolate.py).
The library takes a ND xarray with dask backend as y and one of its coords (a plain numpy array) as x.
Currently I'm wrapping around ``scipy.interpolate.make_interp_spline``, and it's not great for a couple of reasons:

1. when working in dask, I might have multiple chunks on any axis of y except the interpolation axis; each chunk will translate to an equally-shaped array of spline coefficients. However, across all chunks there is only one array of knots, which (since my x is always a pure numpy array) is also a pure numpy array. With ``make_interp_spline``, I am forced to needlessly re-compute the knots vector multiple times.
2. dask makes it super easy to wrap around functions that return numpy arrays. ``make_interp_spline`` however returns a ``BSpline`` object, from which I have to extract the ``t`` and ``c`` arrays.

My current code looks pretty awful:
```
spline = map_blocks(
    make_interp_spline,
    x, a.data, k=k, check_finite=False, dtype=float)
t_name = 'spline_t-' + tokenize(getattr, spline.name, 't')
spline_first_key = (spline.name, ) + (0, ) * spline.ndim
t = Array(
    {(t_name, 0): (getattr, spline_first_key, 't')},
    name=t_name, dtype=float,
    chunks=((x.size + k + 1, ), ))
    t.dask.update(spline.dask)
c = map_blocks(getattr, spline, 'c', dtype=float)
```
Besides looking ugly and needlessly recomputing t for each chunk, it also introduces a dependency on the first chunk from all other chunks in order to save memory (the spline evaluation of all chunks will read t from the first chunk). It would have been much nicer to just compute t in plain numpy.

# Proposed design
Gut ``make_interp_spline`` and create two new functions out of it, ``make_interp_knots(x, k)`` and ``make_interp_coeffs(y, t, k)``, both of which return  a plain np.ndarray.
``make_interp_spline`` would become a simple wrapper around those two. The only non-trivial piece of logic it would retain is the special short-circuit for ``k=0`` and ``(k1=1 and t=None)`` (https://github.com/scipy/scipy/blob/v1.0.0/scipy/interpolate/_bsplines.py#L706).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Dask-friendly 1D interpolation #8810

Proposed design

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Dask-friendly 1D interpolation #8810

Description

Proposed design

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions