Bug when reading csv with # and quoted values using read_csv #17744

@tcboles

Description

What happens?

This is a minimal reproduction, but the issue occurs naturally in our production data. If the first column of row 1 starts with a #, a later row contains a quoted value, and another column on row 2048 contains another #, DuckDB fails to parse the CSV. In the production data the offending rows were not rows 1 and 2048, but while narrowing down the rows causing the issue it appeared that the file needed at least 2048 rows to trigger it.

NOTE: It seems to be related to the comment parameter. If I set comment='' in my read_csv, the query works. The docs say the comment parameter defaults to (empty), yet setting it to anything other than '#' works fine, while setting it to '#' reproduces the issue. So for some reason the sniffer appears to be using '#' as the comment character even though it is not set.

Error Message

select * from read_csv('~/downloads/test.csv', strict_mode=false);
Invalid Input Error:
Error when sniffing file "/Users/thomas.boles/downloads/test.csv".
It was not possible to automatically detect the CSV Parsing dialect/types
The search space used was:
Delimiter Candidates: ',', '|', ';', '	'
Quote/Escape Candidates: ['(no quote)','(no escape)'],['"','(no escape)'],['"','"'],['"','''],['"','\'],[''','(no escape)'],[''','''],[''','"'],[''','\']
Comment Candidates: '\0', '#'
Encoding: utf-8
Possible fixes:
* Make sure you are using the correct file encoding. If not, set it (e.g., encoding = 'utf-16').
* Set delimiter (e.g., delim=',')
* Set quote (e.g., quote='"')
* Set escape (e.g., escape='"')
* Set comment (e.g., comment='#')
* Set skip (skip=${n}) to skip ${n} lines at the top of the file
* Enable ignore errors (ignore_errors=true) to ignore potential errors
* Enable null padding (null_padding=true) to pad missing columns with NULL values
* Check you are using the correct file compression, otherwise set it (e.g., compression = 'zstd')
* Be sure that the maximum line size is set to an appropriate value, otherwise set it (e.g., max_line_size=10000000)

To Reproduce

Add the following to iss17744.sql and run it with duckdb -bail -f iss17744.sql:

COPY (
    SELECT
        -- The string must start with a hash to reproduce the issue
        '#hash start' AS first_column,
        1 AS second_column

    UNION ALL

    SELECT
        -- The quoted value can appear on any row strictly between rows 1 and 2048
        -- (not row 1 or row 2048 itself); it must fall between the two hashes
        '"my, quoted value"' AS first_column,
        1 AS second_column

    UNION ALL

    -- These rows pad the CSV to the 2048 rows required to reproduce
    SELECT
        'any value' AS first_column,
        1 AS second_column
    FROM range(0, 2045)

    UNION ALL

    SELECT
        -- The hash must appear somewhere in this string, just not at the beginning
        'hash not at start #' AS first_column,
        1 AS second_column
) TO 'test.csv' (format csv, header 1);

-- This errors with or without strict mode, with or without quote='"'
select * from read_csv('test.csv', strict_mode=false);

-- Setting comment to an empty string '' works. It seems that the config is automatically setting comment to '#' which is causing the issue.
select * from read_csv('test.csv', strict_mode=false, comment='');
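For anyone without the DuckDB CLI at hand, the same test.csv can be generated with a short stdlib-only Python sketch (this is my approximation of the layout the COPY above produces, not the issue author's script; the file shape is what matters: '#' at the start of row 1, a quoted value in between, '#' elsewhere on row 2048):

```python
import csv

# Row 1: value starts with '#' (candidate comment character).
rows = [("#hash start", 1)]
# A value containing quotes and a comma, somewhere between the two hashes;
# csv.writer quotes it automatically because of the comma and quotes.
rows.append(('"my, quoted value"', 1))
# Padding so the file reaches 2048 data rows in total.
rows.extend([("any value", 1)] * 2045)
# Row 2048: '#' present, but not at the start of the value.
rows.append(("hash not at start #", 1))

with open("test.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["first_column", "second_column"])
    writer.writerows(rows)
```

Reading the resulting file with read_csv('test.csv') should then hit the same sniffer error.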

OS:

Mac OS 15.5 arm64

DuckDB Version:

1.3.0

DuckDB Client:

Java and Shell

Hardware:

No response

Full Name:

Thomas Boles

Affiliation:

Finqore

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have not tested with any build

Did you include all relevant data sets for reproducing the issue?

No - Other reason (please specify in the issue body)

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
