Skip to content

segmentation fault when running read_csv() #14552

@pkoppstein

Description

@pkoppstein

What happens?

The 'ignore_errors' flag is supposed to cause erroneous lines to be skipped
when reading a CSV file. The following line, by contrast, is sufficient to cause
a segmentation fault or allocation failure, depending on the DuckDB version:

a,"

Here are some sample runs with different versions of duckdb:

duckdb < csv.sql
┌───────────┐
│ version() │
│  varchar  │
├───────────┤
│ v1.0.0    │
└───────────┘
Out of Memory Error: Allocation failure
duckdb1.1.3 < csv.sql
┌──────────────┐
│ "version"()  │
│   varchar    │
├──────────────┤
│ v1.1.3-dev38 │
└──────────────┘
Segmentation fault: 11

$ duckdb1.1
┌─────────────┐
│ "version"() │
│   varchar   │
├─────────────┤
│ v1.1.0      │
└─────────────┘
Segmentation fault: 11

To Reproduce

select version();
from read_csv('input.csv',
   header=false,
   quote='"',
   escape = '"',
   sep=',',
   ignore_errors=true);

OS:

MacOS

DuckDB Version:

1.0, 1.1, and others

DuckDB Client:

CLI

Hardware:

No response

Full Name:

Peter Koppstein

Affiliation:

Princeton University

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a source build

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions