Closed
Labels: scipy.spatial, wontfix (Not actionable, rejected or unplanned changes)
Description
cdist currently treats DataFrames as arrays, ignoring column names. If passing DataFrames with the same columns but in a different order, this will cause unexpected results. Aligning the columns would be more intuitive IMO.
import pandas as pd
from scipy.spatial.distance import cdist

df = pd.DataFrame({'a': [0, 0],
                   'b': [1, 1]})
# Zero distance from df to itself.
cdist(df, df)
# array([[0., 0.],
#        [0., 0.]])
# But switching the columns creates distance.
df2 = df[['b', 'a']]
cdist(df, df2)
# array([[1.41421356, 1.41421356],
#        [1.41421356, 1.41421356]])
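Since cdist compares columns by position, one possible workaround is to reorder the second DataFrame's columns to match the first before the call. The sketch below is only an illustration: cdist_aligned is a hypothetical helper name, not part of SciPy, and it assumes both DataFrames carry exactly the same set of column labels.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist


def cdist_aligned(xa, xb, **kwargs):
    """Hypothetical helper (not a SciPy function).

    Reorders xb's columns to match xa's by label, then calls cdist
    on the underlying arrays. Extra keyword arguments such as
    metric are passed through to cdist unchanged.
    """
    if set(xa.columns) != set(xb.columns):
        raise ValueError("DataFrames must have the same column labels")
    # Selecting with xa.columns puts xb's columns in xa's order.
    return cdist(np.asarray(xa), np.asarray(xb[xa.columns]), **kwargs)


df = pd.DataFrame({'a': [0, 0],
                   'b': [1, 1]})
df2 = df[['b', 'a']]

# Columns are matched by label, so the swapped order no longer matters.
result = cdist_aligned(df, df2)
```

With the columns aligned this way, df and df2 hold the same values in the same positions, so every pairwise distance is zero again.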
Scipy/Numpy/Python version information:
import sys, scipy, numpy; print(scipy.__version__, numpy.__version__, sys.version_info)
1.1.0 1.15.4 sys.version_info(major=3, minor=6, micro=6, releaselevel='final', serial=0)