dtype.byteorder is not consistent during cross platform joblib.load()  #1123

@pradghos

dtype.byteorder is not consistent when joblib.load() runs on a big-endian platform against a file that was written on a little-endian platform.

On a little-endian machine:

Python 3.8.5 (default, Sep  4 2020, 07:30:14)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import joblib
>>> jb = joblib.load('joblib_0.11.0_pickle_py36_np111.pkl')
>>> jb[0].dtype
dtype('int64')
>>> jb[0].dtype.byteorder
'='
>>> jb1 = joblib.load('joblib_0.9.2_compressed_pickle_py34_np19.gz')
>>> jb1[0].dtype
dtype('int64')
>>> jb1[0].dtype.byteorder
'='
>>>

On a big-endian machine (linux-s390x):

(sklearn-test2) [root@plexors1 data]# python
Python 3.7.4 (default, Oct  7 2020, 05:24:07)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import joblib
>>> jb = joblib.load('joblib_0.11.0_pickle_py36_np111.pkl')
>>> jb[0].dtype
dtype('<i8')
>>> jb[0].dtype.byteorder
'<'

This is probably because the dtype is stored as little endian in joblib_0.11.0_pickle_py36_np111.pkl.

I think this is the backward-compatibility path, which calls load_compatibility.
On the big-endian system, the test expects dtype '<i8' but receives 'int64'.

>>> jb1 = joblib.load('joblib_0.9.2_compressed_pickle_py34_np19.gz')
>>> jb1[0].dtype
dtype('int64')
>>> jb1[0].dtype.byteorder
'='
>>>

As a result, the test_joblib_pickle_across_python_versions test fails on the s390x system.
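
The difference between the two sessions is consistent with how NumPy reports byte order relative to the host: a dtype whose order matches the host shows '=', while a non-native one shows '<' or '>'. Below is a minimal sketch of that behaviour using only NumPy and the standard library. The helper name reported_byteorder is made up for illustration and is not part of joblib or NumPy.

```python
import sys

import numpy as np


def reported_byteorder(type_string):
    """Return the byteorder character NumPy reports for a dtype string."""
    return np.dtype(type_string).byteorder


# Pick the explicit byte-order prefixes that are native / foreign on this host.
native_char = '<' if sys.byteorder == 'little' else '>'
foreign_char = '>' if sys.byteorder == 'little' else '<'

# An explicitly native dtype is normalized to '=' ...
print(reported_byteorder(native_char + 'i8'))
# ... while a foreign one keeps its explicit character.
print(reported_byteorder(foreign_char + 'i8'))

# Consequently, comparing '<i8' with 'int64' gives different answers
# on little-endian and big-endian hosts.
print(np.dtype('<i8') == np.dtype('int64'))
```

On a little-endian host the last line prints True; on a big-endian host such as s390x it prints False, which is the kind of mismatch a dtype equality check in a test would trip over.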

Another question: after loading data from a file written in little-endian order on a big-endian system, dtype.byteorder generally stays '<' (little endian). Please advise: should dtype.byteorder always be '=' after joblib.load (provided the data is read correctly)? Would that be a more consistent approach?
Thank you!
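
To make the question concrete, here is a rough sketch of what normalizing a loaded array to native byte order could look like on the caller's side. It is an assumption about one possible approach, not a description of what joblib does; to_native_byteorder is a hypothetical helper, and the foreign-order array is built by hand rather than loaded from the pickle files above.

```python
import sys

import numpy as np


def to_native_byteorder(arr):
    """Return arr with a native-byte-order dtype (hypothetical helper).

    Arrays that are already native are returned unchanged; others are
    converted with astype, which preserves the values and swaps the bytes.
    """
    native_dtype = arr.dtype.newbyteorder('=')
    if native_dtype == arr.dtype:
        return arr
    return arr.astype(native_dtype)


# Build an array whose byte order is foreign to this host, standing in
# for data loaded from a file written on the other kind of machine.
foreign_char = '>' if sys.byteorder == 'little' else '<'
foreign = np.arange(5, dtype=foreign_char + 'i8')

converted = to_native_byteorder(foreign)
print(foreign.dtype.byteorder, converted.dtype.byteorder)
print(converted.tolist())
```

The conversion copies the data, so for large arrays this trades memory and time for a consistent dtype.byteorder of '='.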
