What happens?
Hello,
I noticed a lot of variation in how long it takes to write CSV files. Running the statement below to write an 850 MB CSV file can be very fast (less than a second) or very slow (more than 40 seconds). It seems to happen randomly on my machine, and I'm not sure why.
COPY "table1" TO './output/ahccd.csv' (DELIMITER ',', HEADER TRUE);
The resulting ahccd.csv file is 850 MB and contains data like the following:
time,station,station_name,tas,decade
1913-10-01,1012010,COWICHAN BAY CHERRY,10.2,1910
1913-10-02,1012010,COWICHAN BAY CHERRY,12.0,1910
1913-10-03,1012010,COWICHAN BAY CHERRY,11.0,1910
1913-10-04,1012010,COWICHAN BAY CHERRY,8.4,1910
1913-10-05,1012010,COWICHAN BAY CHERRY,8.1,1910
1913-10-06,1012010,COWICHAN BAY CHERRY,8.1,1910
1913-10-07,1012010,COWICHAN BAY CHERRY,6.3,1910
1913-10-08,1012010,COWICHAN BAY CHERRY,8.1,1910
1913-10-09,1012010,COWICHAN BAY CHERRY,5.8,1910
1913-10-10,1012010,COWICHAN BAY CHERRY,6.7,1910
1913-10-11,1012010,COWICHAN BAY CHERRY,10.9,1910
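Since I can't easily share the full 850 MB file, a stand-in table with the same columns can be generated directly in DuckDB. Below is a rough sketch using the @duckdb/node-api (neo) client; the database path, row count, and generated values are placeholders rather than my actual setup.

// Sketch only: builds a stand-in "table1" with the same columns as ahccd.csv.
// Assumes the @duckdb/node-api (neo) client; 'ahccd.duckdb' and the row count
// are placeholders chosen to land near the reported ~850 MB CSV.
import { DuckDBInstance } from '@duckdb/node-api';

async function createStandInTable(): Promise<void> {
  const instance = await DuckDBInstance.create('ahccd.duckdb');
  const connection = await instance.connect();

  // ~20 million rows of this shape is roughly the size of the real file.
  await connection.run(`
    CREATE OR REPLACE TABLE "table1" AS
    SELECT
      DATE '1913-10-01' + (i % 40000)::INTEGER AS time,
      1012010 AS station,
      'COWICHAN BAY CHERRY' AS station_name,
      round(random() * 30, 1) AS tas,
      (year(DATE '1913-10-01' + (i % 40000)::INTEGER) // 10) * 10 AS decade
    FROM range(20000000) AS t(i);
  `);
}

createStandInTable();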
With v1.2.x this wasn't a problem; the export was always very fast.
Please let me know if there's anything I can do differently to help.
Thank you.
To Reproduce
COPY "table1" TO './output/ahccd.csv' (DELIMITER ',', HEADER TRUE);
OS:
macOS
DuckDB Version:
v1.3.1
DuckDB Client:
Node (neo)
Hardware:
M1 Pro 16GB
Full Name:
Nael Shiab
Affiliation:
CBC News
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
No - I cannot easily share my data sets due to their large size
Did you include all code required to reproduce the issue?
Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?
Yes, I have