BUG: Inconsistent conversion of object-type bytes array to StringDType #28269

@crusaderky

Describe the issue:

Converting an object-dtype array of bytes items, or a plain list of bytes, to variable-width strings (StringDType, "T") produces the repr of each bytes object, including the b'...' wrapper, instead of its text content.
Every other bytes-to-string conversion shown below (object to "U", "S" to "T") returns the plain text without the wrapper.

Reproduce the code example:

>>> import numpy as np

>>> np.array(["foo", "bar"], dtype="O").astype("U")
array(['foo', 'bar'], dtype='<U3')

>>> np.array(["foo", "bar"], dtype="O").astype("T")
array(['foo', 'bar'], dtype=StringDType())

>>> np.array([b"foo", b"bar"], dtype="O").astype("U")
array(['foo', 'bar'], dtype='<U3')

>>> np.array([b"foo", b"bar"], dtype="S").astype("T")
array(['foo', 'bar'], dtype=StringDType())

>>> np.array([b"foo", b"bar"], dtype="T")
array(["b'foo'", "b'bar'"], dtype=StringDType())

>>> np.array([b"foo", b"bar"], dtype="O").astype("T")
array(["b'foo'", "b'bar'"], dtype=StringDType())

Python and NumPy Versions:

numpy 2.2.2
