distance matrix lacks dtype checks / warnings #10262

@robaki

scipy.spatial.distance_matrix issues no warning about possible overflow when it is given unusual dtypes (e.g. uint16). It still calculates the distances, but the results are affected by the overflow and are therefore incorrect. Also, the overflow-affected results are returned as float64 (instead of the original dtype), which makes it harder to figure out what happened.

I'm not sure whether a warning is worth adding, but I came across this behaviour today in a project that uses OpenCV, where NumPy arrays with unsigned integer dtypes are often used.
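For illustration, the kind of dtype check being asked about could look roughly like the sketch below. `warn_on_narrow_ints` is a hypothetical helper, not an existing SciPy function, and the 64-bit threshold is an arbitrary assumption chosen only for this example.

```python
import warnings

import numpy as np


def warn_on_narrow_ints(*arrays, min_bits=64):
    """Hypothetical helper (not a SciPy API): warn about small integer dtypes.

    Returns True if at least one warning was issued.
    """
    warned = False
    for arr in arrays:
        dtype = np.asarray(arr).dtype
        # Only integer dtypes wrap around silently; floats are left alone.
        if np.issubdtype(dtype, np.integer) and dtype.itemsize * 8 < min_bits:
            warnings.warn(
                f"input has dtype {dtype}; intermediate results may overflow",
                RuntimeWarning,
                stacklevel=2,
            )
            warned = True
    return warned
```

Calling it on the inputs before `distance_matrix` would flag uint16 arrays while leaving float64 inputs untouched.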

Reproducing code example:

from scipy.spatial import distance_matrix
import numpy as np

# Default integer dtype: the result is correct.
points_1 = np.array([[352, 916]])
points_2 = np.array([[350, 660]])

print(distance_matrix(points_1, points_2))  # [[256.00781238]] (OK)

# Same points as uint16: the result is silently wrong.
points_1 = points_1.astype('uint16')
points_2 = points_2.astype('uint16')

print(distance_matrix(points_1, points_2))  # [[2.]] (not OK)
print(distance_matrix(points_1, points_2).dtype)  # float64 (confusing)
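The `[[2.]]` result can be reproduced with NumPy alone. The sketch below assumes the distance is computed as `sum(|x - y|**2)**0.5` directly on the input dtype; that assumption is consistent with the output reported above but has not been checked against the SciPy 1.3.0 source here. It also shows a possible workaround: casting the inputs to float64 before doing any arithmetic.

```python
import numpy as np

a = np.array([[352, 916]], dtype=np.uint16)
b = np.array([[350, 660]], dtype=np.uint16)

# Step 1: the coordinate differences (2 and 256) still fit in uint16.
diff = np.abs(a - b)

# Step 2: squaring stays in uint16, so 256**2 = 65536 wraps around to 0.
squared = diff ** 2

# Step 3: sum, then take the square root; the fractional power yields float64.
wrapped = np.sum(squared, axis=-1) ** 0.5

# Workaround: cast to float64 first so nothing can wrap around.
diff_f = np.abs(a.astype(np.float64) - b.astype(np.float64))
correct = np.sum(diff_f ** 2, axis=-1) ** 0.5

print(squared, squared.dtype)
print(wrapped, wrapped.dtype)
print(correct)
```

Under this assumption the overflow happens in the squaring step, not in the subtraction, and the final float64 dtype comes from the square root rather than from any upcasting of the inputs.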

Scipy/Numpy/Python version information:

import sys, scipy, numpy; print(scipy.__version__, numpy.__version__, sys.version_info)
1.3.0 1.16.4 sys.version_info(major=3, minor=6, micro=7, releaselevel='final', serial=0)

Labels: defect, scipy.spatial
