-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Implement JSON <-> Nested types casting #7366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is fantastic! Implicit casts will make this so much simpler for our users. Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Looks good to me. One comment -
INSERT INTO jsons SELECT * FROM maps | ||
|
||
query I | ||
SELECT * FROM jsons |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add more tests with the structs/vectors in test_all_types
and test_vector_types
(e.g. select * from test_vector_types(null::int[], false);
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to add more tests, but I'm not sure I understand what you mean.
test_all_types
is in DuckDB core, while the JSON
type is only in the extension. How would this work? Happy to discuss in the office today
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think I understand. I'll write some more tests and then you can check if it's OK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just meant including the casts to and from the vectors, e.g.:
select test_vector::JSON from test_vector_types(null::int[], false);
select test_vector::JSON::int[]::json from test_vector_types(null::int[], false);
select test_vector::JSON from test_vector_types(null::int[][], false);
select test_vector::JSON::int[][]::json from test_vector_types(null::int[][], false);
select test_vector::JSON from test_vector_types(null::row(i varchar, j int), false);
...
The false
indicates whether or not to flatten the vectors, this can be used to find bugs in handling of non-flat vectors by checking if the result is the same for both settings (true and false), e.g.:
foreach flatten true false
query I nosort expected_result_row
select test_vector::JSON from test_vector_types(null::row(i varchar, j int), ${flatten});
endloop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was an excellent suggestion, thanks! I was able to quickly squash a bunch of bugs. I've also added more functionality in the process. We can now cast all of our types to JSON
and back, e.g.:
D select '1996-03-27'::DATE::JSON d;
┌──────────────┐
│ d │
│ json │
├──────────────┤
│ "1996-03-27" │
└──────────────┘
D select '"1996-03-27"'::JSON::DATE d;
┌────────────┐
│ d │
│ date │
├────────────┤
│ 1996-03-27 │
└────────────┘
VARCHAR
is a special case of this, and still behaves like before. It is parsed and validated instead of being interpreted as a string that has to be quoted.
Thanks! |
This PR implements casting from
JSON
to our nested types (LIST
,STRUCT
,MAP
) and vice-versa. This mostly uses the existingto_json
andfrom_json
functionality.For example, inserting
JSON
directly into tables that contain our nested types now works, using our implicit cast rules:And of course, the other way around also works (also using our implicit cast rules!).
I've taken a not-so-strict approach with
TRY_CAST
, i.e.:Usually, the entire result becomes
NULL
if a cast fails somewhere. Instead, for these kinds of casts, only the specific nested value that fails becomesNULL
. If aJSON
almost matches the schema, but not completely, the matching values will still be converted, instead of it being entirelyNULL
.