I just noticed something that could make for a huge performance improvement in faceting.
The default query used by Datasette when faceting looks like this:
select
  country_long,
  count(*)
from (
  select * from [global-power-plants] order by rowid
)
where
  country_long is not null
group by
  country_long
order by
  count(*) desc
Note that there's an order by rowid in there which isn't necessary - the order on that inner query doesn't matter since we're grouping and counting.
I had assumed SQLite would optimize this away - but it turns out it doesn't! Consider this version of the query, with that pointless order by removed:
select
  country_long,
  count(*)
from (
  select * from [global-power-plants]
)
where
  country_long is not null
group by
  country_long
order by
  count(*) desc
I tried this optimization on a table with 2.5 million rows in it - without the optimization the query took 5 seconds, with the optimization it took 450ms, roughly an 11x speedup. So this is a very significant improvement!
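For anyone who wants to try the comparison locally, here is a minimal sketch using Python's standard sqlite3 module. The plants table, its synthetic rows and the build_demo_db and timed helpers are all made up for illustration; this is not Datasette's code or the table from my test. It checks that both query shapes return the same counts and prints how long each one took. The gap depends heavily on the SQLite version and the size of the table, so the numbers may look nothing like the ones above.

```python
import sqlite3
import time

# Hypothetical stand-in for the facet query, with the inner "order by rowid".
WITH_ORDER_BY = """
select country_long, count(*)
from (select * from plants order by rowid)
where country_long is not null
group by country_long
order by count(*) desc
"""

# Same query with the inner "order by" removed.
WITHOUT_ORDER_BY = """
select country_long, count(*)
from (select * from plants)
where country_long is not null
group by country_long
order by count(*) desc
"""


def build_demo_db(n_rows=100_000):
    """Create an in-memory table of synthetic rows (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table plants (name text, country_long text)")
    countries = ["China", "India", "Brazil", "France", None]
    conn.executemany(
        "insert into plants values (?, ?)",
        ((f"plant-{i}", countries[i % len(countries)]) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def timed(conn, sql):
    """Run a query, returning its rows and the elapsed wall-clock seconds."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, time.perf_counter() - start


conn = build_demo_db()
rows_a, secs_a = timed(conn, WITH_ORDER_BY)
rows_b, secs_b = timed(conn, WITHOUT_ORDER_BY)

# Tied counts may come back in a different order, so compare as dicts.
assert dict(rows_a) == dict(rows_b)

print(f"SQLite {sqlite3.sqlite_version}")
print(f"with inner order by:    {secs_a * 1000:.1f}ms")
print(f"without inner order by: {secs_b * 1000:.1f}ms")
```

Running the same script against a copy of a real database (swap sqlite3.connect(":memory:") for a file path and adjust the table and column names) should give a more realistic picture than the synthetic rows here.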