Description
What happens?
DuckDB crashes with a segmentation fault when trying to COPY the result of a POSITIONAL JOIN with PARQUET_VERSION V2, but not with V1.
To Reproduce
I have been unable to generate equivalent synthetic data, so attached is my trimmed-down actual data in CSV form (two files, each with a single column). There is some size component to this crash: if I limit the input to fewer than 1000 data points, it does not appear. It also crashes on my real data, but not when using a simple unnest(generate_series(...)) in place of the CSVs (see the sketches after the failing query below).
This works:
COPY (
SELECT * FROM 'tbl1.csv'
POSITIONAL JOIN (FROM 'tbl2.csv')
) TO 'test_out.parquet'
(PARQUET_VERSION V1);
This doesn't:
COPY (
SELECT * FROM 'tbl1.csv'
POSITIONAL JOIN (FROM 'tbl2.csv')
) TO 'test_out.parquet'
(PARQUET_VERSION V2);
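
For reference, these are roughly the variations I tried that do not crash (sketches only; the exact row counts and column aliases here are placeholders, not my real data):

COPY (
    -- Limiting the CSV inputs to fewer than 1000 rows avoids the crash.
    SELECT * FROM (FROM 'tbl1.csv' LIMIT 999)
    POSITIONAL JOIN (FROM 'tbl2.csv' LIMIT 999)
) TO 'test_out.parquet'
(PARQUET_VERSION V2);

COPY (
    -- A purely synthetic input also does not reproduce the crash.
    SELECT * FROM (SELECT unnest(generate_series(1, 2000)) AS a)
    POSITIONAL JOIN (SELECT unnest(generate_series(1, 2000)) AS b)
) TO 'test_out.parquet'
(PARQUET_VERSION V2);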
When running roughly the equivalent of the above from Python with Ray, I also see:
PC: @ 0x7f47e9339b0d (unknown) duckdb::StandardColumnWriter<>::WriteVectorInternal<>()
Backtrace:
OS:
Linux x86_64
DuckDB Version:
1.3.0
DuckDB Client:
CLI
Hardware:
No response
Full Name:
Julian Meyers
Affiliation:
Personal Use / Advanced Robotics Group (when working)
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a source build
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?
Yes, I have