Fixing Parallel CSV Reader over multiple files #6131
Thanks! LGTM
Looks like there is a test failing because of a missing sort order (globs can be read in any order; it is file-system dependent). Could you add a rowsort or change the glob to a list of files?
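For illustration, the order dependence can be reproduced outside DuckDB. The following is a minimal Python sketch, not the project's test code: the helper `list_csv_files` and the file names are hypothetical. It shows the same idea as the suggestion above, namely that sorting the matched paths makes the read order independent of the file system.

```python
import glob
import os
import tempfile


def list_csv_files(pattern):
    """Return the files matching pattern in lexicographic order."""
    # glob.glob makes no promise about ordering, so sort the matches to
    # remove the dependence on whatever order the file system returns.
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical input files, created out of order on purpose.
    for name in ("3.csv", "1.csv", "2.csv"):
        with open(os.path.join(tmp, name), "w") as f:
            f.write("a\n1\n")
    matches = list_csv_files(os.path.join(tmp, "[0-9].csv"))
    files = [os.path.basename(p) for p in matches]
    print(files)  # ['1.csv', '2.csv', '3.csv']
```

In a test, either approach removes the flakiness: sort the results (the rowsort suggestion) or fix the input order by listing the files explicitly.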
Thanks!
Hi, I saw the updates here and that #6074 was closed. I checked out head, rebuilt, and ran unittest.exe with all tests passing. I then tried my example from that issue and am still seeing the issue, though with a better error message :-)
Sorry if I jumped the gun on giving this a try...
Hi @pdet, I spent a little more time looking at the fix and saw that my test code became a test case. I took a look at the file and the two tests there both use
and that worked. I then added something closer to what failed in my previous comment: `select * from read_csv('test/sql/copy/csv/data/auto/glob/[0-9].csv', sample_size=-1, header=True, columns={'row_id':'BIGINT','integer':'INTEGER','float':'DOUBLE', 'text':'VARCHAR'})` and that still fails. If I I've attached a modified version of Hopefully that is helpful...
Could you perhaps open a new issue or re-open your existing issue? It is easy to lose track of these reports otherwise.
Yeah, I get that for sure... The problem is I'm not sure how to re-open #6074, as I'm not seeing any option to do that. I was hoping that mentioning it (@pdet mentioned it at the beginning of this pull request too) would provide enough breadcrumbs... Anyway, I'll look a little harder, and if I don't have any luck I will submit something new.
I have reopened it
Properly resets some buffer variables when moving from one file to the next.
Fix: #6074
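To illustrate the kind of bug described above, here is a minimal Python sketch of a chunked multi-file reader. It is not DuckDB's implementation: `read_rows`, `leftover`, and `header_skipped` are hypothetical names, and the sketch assumes every file starts with a header line. The point is only that per-file state must be re-initialised at each file boundary.

```python
import io


def read_rows(files, chunk_size=8):
    """Yield CSV rows from several file-like objects, read in fixed-size chunks."""
    for f in files:
        # Per-file state, reset at every file boundary. If these were
        # initialised once outside the loop, a partial line or the
        # "header already skipped" flag would leak into the next file.
        leftover = ""
        header_skipped = False
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            lines = (leftover + chunk).split("\n")
            leftover = lines.pop()  # last piece may be an incomplete line
            for line in lines:
                if not header_skipped:
                    header_skipped = True
                    continue
                if line:
                    yield line.split(",")
        # Emit a final data line that has no trailing newline.
        if leftover and header_skipped:
            yield leftover.split(",")


files = [io.StringIO("id,text\n1,a\n2,b"), io.StringIO("id,text\n3,c\n")]
rows = list(read_rows(files))
print(rows)  # [['1', 'a'], ['2', 'b'], ['3', 'c']]
```

Moving `header_skipped = False` above the outer loop would make the second file's header come through as a data row, which is the general shape of a state-carry-over failure across files.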