
How can I call the pre-Spring 2024 CSV parser in v1.1.1? #14111


Description

@marklit

What happens?

When I wrote the 1.1B taxi rides benchmark (https://tech.marksblogg.com/duckdb-1b-taxi-rides.html#an-alternative-code-path), I was advised to call DuckDB's old CSV parser rather than the new one that was released this summer.

This worked fine on v1.0.0, but on v1.1.1 (af39bd0) the following raises errors every few hundred million records, reporting that 1 or 2 fewer columns than expected could be parsed.

$ ~/duckdb_111/duckdb \
    test.duckdb \
    < create.sql

$ for FILENAME in trips_*.csv.gz; do
    echo $FILENAME
    /usr/bin/time \
        ~/duckdb_111/duckdb \
        -c "COPY trips FROM '$FILENAME';" \
        test.duckdb
    du -hs test.duckdb
  done
trips_xae.csv.gz:

Expected Number of Columns: 51 Found: 49

trips_xag.csv.gz:

Expected Number of Columns: 51 Found: 50

trips_xbb.csv.gz:

Expected Number of Columns: 51 Found: 50
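
(Not part of the failing run, just a sketch of what I could try next: the CSV reader's null_padding and ignore_errors options relax the column-count check by padding short rows with NULLs or skipping unparseable rows. I'm assuming COPY accepts them the same way READ_CSV does, and I haven't verified this at the full 1.1B-row scale.)

$ for FILENAME in trips_*.csv.gz; do
    echo $FILENAME
    /usr/bin/time \
        ~/duckdb_111/duckdb \
        -c "COPY trips FROM '$FILENAME' (NULL_PADDING true, IGNORE_ERRORS true);" \
        test.duckdb
    du -hs test.duckdb
  done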

In v1.0.0, this was the way to call the new parser:

$ ~/duckdb_111/duckdb test.duckdb < create.sql

$ for FILENAME in trips_x*.csv.gz; do
    echo $FILENAME
    /usr/bin/time \
        ~/duckdb_111/duckdb \
        -c "INSERT INTO trips
                    SELECT *
                    FROM   READ_CSV('$FILENAME');" \
        test.duckdb
    du -hs test.duckdb
  done

That new parser still has the same sorts of inference issues in v1.1.1 as it did in v1.0.0.
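
(Again only a sketch I haven't benchmarked: the inference can be widened by passing sample_size=-1 so the sniffer looks at every row before picking types, or bypassed entirely with auto_detect=false plus an explicit columns={...} map for all 51 columns.)

$ for FILENAME in trips_x*.csv.gz; do
    echo $FILENAME
    /usr/bin/time \
        ~/duckdb_111/duckdb \
        -c "INSERT INTO trips
                    SELECT *
                    FROM   READ_CSV('$FILENAME',
                                    sample_size=-1);" \
        test.duckdb
    du -hs test.duckdb
  done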

The import times for the two CSV parsers are now almost identical which, together with the parsing issues, has me wondering: is there still a way to call the old CSV parser, which was able to import the 1.1B rows without issue, or has its code been removed from DuckDB v1.1.1?

To Reproduce

See above.

OS:

Ubuntu for Windows on Windows 11

DuckDB Version:

v1.1.1 af39bd0

DuckDB Client:

CLI

Hardware:

6 GHz Intel Core i9-14900K, 96 GB of RAM

Full Name:

Mark Litwintschik

Affiliation:

Green Idea

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

No - I cannot share the data sets because they are confidential

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
