Description
What happens?
A CSV file that was previously parseable under v1.2 is no longer parseable under v1.3 because of an error while parsing timezones. In addition, the ignore_errors = true flag appears to be ignored.
To Reproduce
Consider the following CSV taken from a much larger file:
day_night,operation_no,flight_date_time
Night,38615452,2022/01/27 11:04:57 PM
Night,38615452,2022/01/27 11:04:57 PM
Night,38615475,2022/01/27 11:09:20 PM
Using the CLI with the following command:
select * from read_csv('sample.csv');
Produces the following error:
Not implemented Error:
Unknown TimeZone 'PM'!
Candidate time zones: "Pacific/Marquesas", "PLT", "PNT", "PRC", "GMT"
Setting ignore_errors = true produces the same error as above:
select * from read_csv('sample.csv', ignore_errors = true);
Setting strict_mode = false also produces the same error:
select * from read_csv('sample.csv', strict_mode = false);
In DuckDB v1.2, this column was parsed as a VARCHAR and no error was reported. If DuckDB v1.3 is now "correct", it should still honour the ignore_errors option.
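For anyone hitting the same error, a possible workaround is sketched below. It has not been verified against v1.3.0 and rests on the assumption that giving the reader an explicit type or timestamp format for flight_date_time bypasses the type sniffing that raises the error. The types and timestampformat parameters and the strptime function are existing DuckDB features; the format string is inferred from the three sample rows only.

```sql
-- Option 1 (assumption: an explicit type skips timestamp sniffing for this column):
-- keep the column as VARCHAR, matching the v1.2 behaviour, and convert it afterwards.
select
    day_night,
    operation_no,
    -- %I is the 12-hour clock hour, %p is the AM/PM marker
    strptime(flight_date_time, '%Y/%m/%d %I:%M:%S %p') as flight_date_time
from read_csv('sample.csv', types = {'flight_date_time': 'VARCHAR'});

-- Option 2 (assumption: an explicit format stops 'PM' being read as a time zone):
-- tell the reader how timestamps in this file are written.
select *
from read_csv('sample.csv', timestampformat = '%Y/%m/%d %I:%M:%S %p');
```

Neither option addresses the underlying report: the v1.3 regression and the ignore_errors flag having no effect.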
This may be related to Sniff Timestamp_TZ from CSV FIles #15730
OS:
macOS
DuckDB Version:
DuckDB v1.3.0 (Ossivalis) 71c5c07 clang-17.0.0
DuckDB Client:
CLI
Hardware:
MacBook Pro
Full Name:
Kenny Carruthers
Affiliation:
None
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
- Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?
- Yes, I have