
[Feature request] LabelEncoder with Tensor attributes #5412

@adityagoel4512


System information

No response

What is the problem that this feature solves?

Since version 2 of the ai.onnx.ml opset, LabelEncoder has stored its mapping in typed list attributes (e.g. keys_int64s, keys_strings, keys_floats), where only one "keys_" attribute and only one "values_" attribute should be set. Because these are list attributes rather than tensors, this arrangement prevents the use of External Data, which can be very beneficial for users with large LabelEncoders. It also enforces a very strict cross product of key and value types. That decision has pros (it explicitly defines which key/value type pairings a backend must implement) and cons (it forces users to downcast or upcast their data to conform to the API).
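To make the "only one of each" constraint concrete, here is a minimal pure-Python sketch of how a backend might resolve the typed list attributes before building its lookup table. It does not use the onnx package. The helper names `resolve_label_encoder_attrs` and `encode` are hypothetical, and the `values_*` names are assumed to mirror the `keys_*` names quoted above.

```python
# Attribute families as described in the issue; the values_* names are
# assumed to mirror the keys_* names.
KEY_ATTRS = ("keys_int64s", "keys_strings", "keys_floats")
VALUE_ATTRS = ("values_int64s", "values_strings", "values_floats")


def resolve_label_encoder_attrs(attrs):
    """Return (keys, values) from a dict of typed list attributes.

    Hypothetical helper: an empty or missing list counts as "not set".
    Exactly one keys_* and one values_* attribute must be set.
    """
    set_keys = [name for name in KEY_ATTRS if attrs.get(name)]
    set_values = [name for name in VALUE_ATTRS if attrs.get(name)]
    if len(set_keys) != 1 or len(set_values) != 1:
        raise ValueError(
            f"expected one keys_* and one values_* attribute, "
            f"got {set_keys} and {set_values}"
        )
    keys = attrs[set_keys[0]]
    values = attrs[set_values[0]]
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    return keys, values


def encode(inputs, keys, values, default):
    """Map each input through the key -> value table, falling back to default."""
    table = dict(zip(keys, values))
    return [table.get(x, default) for x in inputs]
```

For example, `resolve_label_encoder_attrs({"keys_strings": ["a", "b"], "values_int64s": [0, 1]})` yields a string-to-int64 mapping, while setting both `keys_strings` and `keys_int64s` raises a `ValueError`.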

Alternatives considered

No response

Describe the feature

Using 1D tensor attributes instead of lists would make for a cleaner API, with just one keys attribute and one values attribute. It would also enable the use of External Data for models with large label encoders.
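As a rough illustration of why tensor attributes help with large encoders: the bytes of a 1D tensor can live in a separate file and be referenced by location, offset and length, whereas a list attribute has to be stored inline in the model. The sketch below uses only the Python standard library. `ExternalRef`, `store_int64_tensor`, `load_int64_tensor` and `roundtrip_demo` are hypothetical helpers, not ONNX APIs, and the little-endian int64 layout is an assumption made for the example.

```python
import os
import struct
import tempfile
from dataclasses import dataclass


@dataclass
class ExternalRef:
    """Hypothetical pointer to tensor bytes kept outside the model file."""
    location: str
    offset: int
    length: int


def store_int64_tensor(path, values):
    """Append a 1D int64 tensor to `path` as little-endian bytes."""
    payload = struct.pack(f"<{len(values)}q", *values)
    # The new tensor starts where the file currently ends.
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    with open(path, "ab") as f:
        f.write(payload)
    return ExternalRef(location=path, offset=offset, length=len(payload))


def load_int64_tensor(ref):
    """Read the tensor back using only the reference."""
    with open(ref.location, "rb") as f:
        f.seek(ref.offset)
        payload = f.read(ref.length)
    return list(struct.unpack(f"<{ref.length // 8}q", payload))


def roundtrip_demo():
    """Store keys and values side by side in one file, then reload both."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "label_encoder.bin")
        keys_ref = store_int64_tensor(path, [10, 20, 30])
        values_ref = store_int64_tensor(path, [0, 1, 2])
        return load_int64_tensor(keys_ref), load_int64_tensor(values_ref)
```

With this layout the model itself only needs to carry the two small references, regardless of how many entries the encoder has.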

We can retain the benefit of a strictly specified cross product of key and value element types by stating explicitly in the standard which tensor element types are permitted, so that the change does not suddenly permit every possible LabelEncoder implementation.
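One way to picture this restriction is an explicit allowlist of (key element type, value element type) pairs that a backend checks before building its lookup table. The sketch below is plain Python. `Tensor1D`, `ALLOWED_PAIRS`, `build_table` and the particular pairs listed are illustrative assumptions, not what the standard specifies.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Tensor1D:
    """Hypothetical stand-in for a 1D tensor attribute: element type plus data."""
    dtype: str
    data: Sequence


# Illustrative allowlist only; the real set would be fixed by the standard.
ALLOWED_PAIRS = {
    ("string", "int64"),
    ("int64", "string"),
    ("int64", "float"),
    ("string", "float"),
}


def build_table(keys, values):
    """Validate the dtype pairing, then build the key -> value mapping."""
    pair = (keys.dtype, values.dtype)
    if pair not in ALLOWED_PAIRS:
        raise TypeError(f"unsupported key/value element types: {pair}")
    if len(keys.data) != len(values.data):
        raise ValueError("keys and values must have the same length")
    return dict(zip(keys.data, values.data))
```

A backend following this pattern would accept `build_table(Tensor1D("string", ["a", "b"]), Tensor1D("int64", [0, 1]))` and reject any pairing absent from the allowlist, so the tensor-based API stays as constrained as the list-based one.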

Will this influence the current api (Y/N)?

Yes - it would require an opset bump for ai.onnx.ml.

Feature Area

Operators

Are you willing to contribute it (Y/N)

Yes

Notes

No response


Labels: topic: enhancement (request for new feature or operator), topic: operator (issues related to ONNX operators)
