Skip to content

Conversation

Mytherin
Copy link
Collaborator

@Mytherin Mytherin commented Nov 5, 2024

This PR does some clean-up in the Value interface and serialization:

  • Value::EMPTYLIST and Value::EMPTYARRAY are removed, instead the interface Value::LIST(type, values) can be used to generate an empty list/array
  • Value::MAP and Value::STRUCT now correctly insert casts to enforce that the values are internally consistent with the defined types
  • LogicalType::ANY is explicitly supported in nested value types - in which case these types are ignored, i.e. we can do Value::LIST(LogicalType::ANY, values) with a list of values of any type
  • Avoid unnecessary recursive type serialization: when serializing nested values, we no longer need to serialize the type of every value if the type is defined (i.e. not LogicalType::ANY). Instead, the type from the upper-layer is used (i.e. if we have LogicalType::LIST(LogicalType::INT), the child elements are deserialized as LogicalType::INT).
  • For backwards compatibility, we still serialize the type in every Value also in nested values, unless the serialization version is set to >= 1.2.0.

@Maxxen
Copy link
Member

Maxxen commented Nov 5, 2024

Avoid unnecessary recursive type serialization

This got me thinking that we probably do this for Vector too? Although we don't serialize vectors as often it might still be something to look into in the future (I think vector serialization is a bit scuffed anyway, esp in relation to validity masks IIRC).

@Mytherin
Copy link
Collaborator Author

Mytherin commented Nov 5, 2024

Avoid unnecessary recursive type serialization

This got me thinking that we probably do this for Vector too? Although we don't serialize vectors as often it might still be something to look into in the future (I think vector serialization is a bit scuffed anyway, esp in relation to validity masks IIRC).

Vectors only serialize their data - the type is serialized in the DataChunk:

serializer.WriteProperty(100, "has_validity_mask", ...)
serializer.WriteProperty(101, "validity", ...)
serializer.WriteProperty(102, "data", ...)
serializer.WriteList(103, "children", ...)

@Mytherin Mytherin merged commit 564eb25 into duckdb:main Nov 5, 2024
40 checks passed
@Mytherin Mytherin deleted the cleanuparraylistinterface branch December 8, 2024 06:52
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Dec 20, 2024
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request Dec 21, 2024
github-actions bot added a commit to duckdb/duckdb-r that referenced this pull request Dec 21, 2024
Value interface & serialization clean-up (duckdb/duckdb#14710)

Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants