Performance varies a lot with COPY ... TO ... #18139

@nshiab

What happens?

Hello,

I noticed a lot of variation in how long it takes to write CSV files. Writing an 850 MB CSV file with the statement below can be very fast (less than a second) or very slow (more than 40 seconds). The slowdowns appear to happen randomly on my machine, and I'm not sure why.

COPY "table1" TO './output/ahccd.csv' (DELIMITER ',', HEADER TRUE);

The ahccd.csv file is 850 MB and has the following data in it.

time,station,station_name,tas,decade
1913-10-01,1012010,COWICHAN BAY CHERRY,10.2,1910
1913-10-02,1012010,COWICHAN BAY CHERRY,12.0,1910
1913-10-03,1012010,COWICHAN BAY CHERRY,11.0,1910
1913-10-04,1012010,COWICHAN BAY CHERRY,8.4,1910
1913-10-05,1012010,COWICHAN BAY CHERRY,8.1,1910
1913-10-06,1012010,COWICHAN BAY CHERRY,8.1,1910
1913-10-07,1012010,COWICHAN BAY CHERRY,6.3,1910
1913-10-08,1012010,COWICHAN BAY CHERRY,8.1,1910
1913-10-09,1012010,COWICHAN BAY CHERRY,5.8,1910
1913-10-10,1012010,COWICHAN BAY CHERRY,6.7,1910
1913-10-11,1012010,COWICHAN BAY CHERRY,10.9,1910

When I was using v1.2.x, it wasn't a problem. It was always very fast.

Please let me know if there's anything I can do differently to help.

Thank you.

To Reproduce

COPY "table1" TO './output/ahccd.csv' (DELIMITER ',', HEADER TRUE);
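
Because the slowdown is intermittent, a single run may not show it. Timing the same statement over several runs makes the spread visible. The following is a minimal sketch under stated assumptions, not the reporter's actual setup: it is written in Python rather than for the Node (neo) client, the helper names `time_runs` and `summarize` are hypothetical, and the commented usage assumes the `duckdb` Python package and a database that already contains `table1`.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(run_once: Callable[[], object], repeats: int = 10) -> List[float]:
    """Call run_once `repeats` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Reduce a list of durations to its min, median, and max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


# Hypothetical usage (assumes `pip install duckdb` and an existing "table1";
# the database path is a placeholder):
#
#   import duckdb
#   con = duckdb.connect("my.duckdb")
#   sql = """COPY "table1" TO './output/ahccd.csv' (DELIMITER ',', HEADER TRUE);"""
#   print(summarize(time_runs(lambda: con.execute(sql))))
```

A large gap between the reported `min` and `max` for identical runs would match the behaviour described above; similar values would suggest the variation comes from something outside the `COPY` statement itself.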

OS:

macOS

DuckDB Version:

v1.3.1

DuckDB Client:

Node (neo)

Hardware:

M1 Pro 16GB

Full Name:

Nael Shiab

Affiliation:

CBC News

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

No - I cannot easily share my data sets due to their large size

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
