Description
System information
No response
What is the problem that this feature solves?
Since version 2 of the `ai.onnx.ml` opset, LabelEncoder has encoded its attributes as typed lists (e.g. `keys_int64s`, `keys_strings`, `keys_floats`), where exactly one "keys_" attribute and exactly one "values_" attribute should be set. This arrangement prevents the use of External Data, which can be very beneficial for users with large LabelEncoders. It also enforces a very strict cross product of types for keys and values, a decision which has pros (it explicitly defines the key/value type pairings that backends must implement) and cons (it forces users to downcast or upcast to conform to the API).
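To make the current constraint concrete, here is a minimal sketch in plain Python of the "exactly one keys_ and one values_ attribute" rule. The `select_mapping` helper and the plain dict of attributes are hypothetical stand-ins for illustration, not part of the ONNX API; only the attribute names come from the operator.

```python
# Hypothetical illustration of the typed-list attribute rule; not ONNX code.
KEY_ATTRS = ("keys_int64s", "keys_strings", "keys_floats")
VALUE_ATTRS = ("values_int64s", "values_strings", "values_floats")


def select_mapping(attrs):
    """Return (keys, values) from a dict of LabelEncoder-style attributes.

    An attribute counts as "set" when it is present and non-empty.
    """
    set_keys = [name for name in KEY_ATTRS if attrs.get(name)]
    set_values = [name for name in VALUE_ATTRS if attrs.get(name)]
    if len(set_keys) != 1 or len(set_values) != 1:
        raise ValueError(
            "exactly one keys_* and one values_* attribute must be set, "
            f"got keys={set_keys} values={set_values}"
        )
    keys, values = attrs[set_keys[0]], attrs[set_values[0]]
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    return keys, values


# A valid int64 -> string encoder.
keys, values = select_mapping(
    {"keys_int64s": [1, 2, 3], "values_strings": ["a", "b", "c"]}
)
```

Every key/value type pairing needs its own pair of attribute names, which is the source of the strict cross product described above.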
Alternatives considered
No response
Describe the feature
Having 1D tensor attributes instead of lists would make for a cleaner API with just one `keys` and one `values` attribute. It would also enable the use of External Data for models with large label encoders.
We can retain the benefit of a strictly defined cross product of key and value element types by stating explicitly in the standard which tensor element types are permitted, so the change does not suddenly permit every possible LabelEncoder implementation.
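As a rough sketch of how the type restriction could carry over to single `keys` and `values` tensors, the check below validates the pair of element types against an explicit allow-list. Everything here is an assumption for illustration: the `Tensor1D` stand-in, the `check_label_encoder` helper, and in particular the contents of `ALLOWED_PAIRS`, which the standard would have to define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tensor1D:
    """Hypothetical stand-in for a 1D tensor attribute (element type + data)."""
    dtype: str
    data: tuple


# Assumed allow-list of (key dtype, value dtype) pairs; the real list would
# be fixed by the operator specification, not by this sketch.
ALLOWED_PAIRS = {
    ("int64", "string"),
    ("string", "int64"),
    ("int64", "float"),
    ("float", "int64"),
    ("string", "float"),
    ("float", "string"),
}


def check_label_encoder(keys, values):
    """Validate a keys/values tensor pair and return the lookup table."""
    if (keys.dtype, values.dtype) not in ALLOWED_PAIRS:
        raise TypeError(
            f"unsupported key/value types: {keys.dtype} -> {values.dtype}"
        )
    if len(keys.data) != len(values.data):
        raise ValueError("keys and values must have the same length")
    return dict(zip(keys.data, values.data))


table = check_label_encoder(
    Tensor1D("string", ("cat", "dog")),
    Tensor1D("int64", (0, 1)),
)
```

With this shape, supporting a new pairing means adding one entry to the allow-list rather than introducing new attribute names.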
Will this influence the current api (Y/N)?
Yes - it would require an opset bump for `ai.onnx.ml`.
Feature Area
Operators
Are you willing to contribute it (Y/N)
Yes
Notes
No response