Skip to content

Conversation

fgregg
Copy link
Contributor

@fgregg fgregg commented Sep 26, 2022

Relates to #526

This is a minimal set of changes needed for having query CSVs attempt to download all the rows.

What's good about it is the minimalism.

What's bad about it:

  1. We are abusing the _size argument to indicate we don't want truncation, which isn't the most obvious thing. Additionally, there are various checks that make sure the "_size" URL parameter is a positive integer, which we are relying on to prevent overloading.
  2. The default CSV on a table page will use the max_returned_rows argument. Changing this could be a breaking change, since that's currently a place that has some facilities for pagination. Additionally, i think there's a limit under the hood somewhere which if we removed could lead to sql timeouts
  3. There are similar reasons for leaving the current streaming method alone, as the current methods could allow for downloading very large files that could have a sql timeout if we tried to get them in one go.

📚 Documentation preview 📚: https://datasette--1820.org.readthedocs.build/en/1820/

@fgregg fgregg closed this Sep 26, 2022
@fgregg fgregg reopened this Sep 26, 2022
@fgregg fgregg closed this Sep 26, 2022
@fgregg fgregg reopened this Sep 26, 2022
@fgregg fgregg changed the title No limit csv Don't truncate query CSVs Sep 26, 2022
@fgregg fgregg changed the title Don't truncate query CSVs [SPIKE] Don't truncate query CSVs Sep 26, 2022
@codecov
Copy link

codecov bot commented Sep 26, 2022

Codecov Report

Base: 92.50% // Head: 92.51% // Increases project coverage by +0.01% 🎉

Coverage data is based on head (9bead2a) compared to base (eff1124).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1820      +/-   ##
==========================================
+ Coverage   92.50%   92.51%   +0.01%     
==========================================
  Files          35       35              
  Lines        4400     4406       +6     
==========================================
+ Hits         4070     4076       +6     
  Misses        330      330              
Impacted Files Coverage Δ
datasette/app.py 94.11% <ø> (ø)
datasette/views/base.py 94.80% <100.00%> (+0.05%) ⬆️
datasette/views/database.py 95.29% <100.00%> (+0.06%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@fgregg
Copy link
Contributor Author

fgregg commented Sep 27, 2022

the pattern in this PR max_returned_rows control the maximum rows rendered through html and json, and the csv render bypasses that.

i think it would be better to have each of these different query renderers have more direct control for how many rows to fetch, instead of relying on the internals of the execute method.

generally, users will not want to paginate through tens of thousands of results, but often will want to download a full query as json or as csv.

@fgregg fgregg marked this pull request as draft September 27, 2022 01:06
@fgregg fgregg closed this Oct 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant