-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Description
What happens?
Adding multiple INNER JOIN statements to a query makes it exponentially slower.
Number of records in tables do not make any difference.
In the sample below, using 20 joins takes 9 minutes to complete on empty tables.
What's more strange, the same timings apply to EXPLAIN statements.
After some research, I found out that it's actually duckdb_prepare function call makes it slow.
To Reproduce
CREATE or REPLACE TABLE MainTable (Id INT,
Value1_Id INT, Value2_Id INT, Value3_Id INT, Value4_Id INT, Value5_Id INT,
Value6_Id INT, Value7_Id INT, Value8_Id INT, Value9_Id INT, Value10_Id INT,
Value11_Id INT, Value12_Id INT, Value13_Id INT, Value14_Id INT, Value15_Id INT,
Value16_Id INT, Value17_Id INT, Value18_Id INT, Value19_Id INT, Value20_Id INT);
CREATE or REPLACE TABLE ValueTable (Id INT, Value TEXT);
SELECT * FROM MainTable T
INNER JOIN ValueTable T1 ON T.Value1_Id = T1.Id
INNER JOIN ValueTable T2 ON T.Value2_Id = T2.Id
INNER JOIN ValueTable T3 ON T.Value3_Id = T3.Id
INNER JOIN ValueTable T4 ON T.Value4_Id = T4.Id
INNER JOIN ValueTable T5 ON T.Value5_Id = T5.Id
--5ms
SELECT * FROM MainTable T
INNER JOIN ValueTable T1 ON T.Value1_Id = T1.Id
INNER JOIN ValueTable T2 ON T.Value2_Id = T2.Id
INNER JOIN ValueTable T3 ON T.Value3_Id = T3.Id
INNER JOIN ValueTable T4 ON T.Value4_Id = T4.Id
INNER JOIN ValueTable T5 ON T.Value5_Id = T5.Id
INNER JOIN ValueTable T6 ON T.Value6_Id = T6.Id
INNER JOIN ValueTable T7 ON T.Value7_Id = T7.Id
INNER JOIN ValueTable T8 ON T.Value8_Id = T8.Id
INNER JOIN ValueTable T9 ON T.Value9_Id = T9.Id
INNER JOIN ValueTable T10 ON T.Value10_Id = T10.Id
--90ms
SELECT * FROM MainTable T
INNER JOIN ValueTable T1 ON T.Value1_Id = T1.Id
INNER JOIN ValueTable T2 ON T.Value2_Id = T2.Id
INNER JOIN ValueTable T3 ON T.Value3_Id = T3.Id
INNER JOIN ValueTable T4 ON T.Value4_Id = T4.Id
INNER JOIN ValueTable T5 ON T.Value5_Id = T5.Id
INNER JOIN ValueTable T6 ON T.Value6_Id = T6.Id
INNER JOIN ValueTable T7 ON T.Value7_Id = T7.Id
INNER JOIN ValueTable T8 ON T.Value8_Id = T8.Id
INNER JOIN ValueTable T9 ON T.Value9_Id = T9.Id
INNER JOIN ValueTable T10 ON T.Value10_Id = T10.Id
INNER JOIN ValueTable T11 ON T.Value11_Id = T11.Id
INNER JOIN ValueTable T12 ON T.Value12_Id = T12.Id
INNER JOIN ValueTable T13 ON T.Value13_Id = T13.Id
INNER JOIN ValueTable T14 ON T.Value14_Id = T14.Id
INNER JOIN ValueTable T15 ON T.Value15_Id = T15.Id
--3200ms
SELECT * FROM MainTable T
INNER JOIN ValueTable T1 ON T.Value1_Id = T1.Id
INNER JOIN ValueTable T2 ON T.Value2_Id = T2.Id
INNER JOIN ValueTable T3 ON T.Value3_Id = T3.Id
INNER JOIN ValueTable T4 ON T.Value4_Id = T4.Id
INNER JOIN ValueTable T5 ON T.Value5_Id = T5.Id
INNER JOIN ValueTable T6 ON T.Value6_Id = T6.Id
INNER JOIN ValueTable T7 ON T.Value7_Id = T7.Id
INNER JOIN ValueTable T8 ON T.Value8_Id = T8.Id
INNER JOIN ValueTable T9 ON T.Value9_Id = T9.Id
INNER JOIN ValueTable T10 ON T.Value10_Id = T10.Id
INNER JOIN ValueTable T11 ON T.Value11_Id = T11.Id
INNER JOIN ValueTable T12 ON T.Value12_Id = T12.Id
INNER JOIN ValueTable T13 ON T.Value13_Id = T13.Id
INNER JOIN ValueTable T14 ON T.Value14_Id = T14.Id
INNER JOIN ValueTable T15 ON T.Value15_Id = T15.Id
INNER JOIN ValueTable T16 ON T.Value16_Id = T16.Id
--7600ms
SELECT * FROM MainTable T
INNER JOIN ValueTable T1 ON T.Value1_Id = T1.Id
INNER JOIN ValueTable T2 ON T.Value2_Id = T2.Id
INNER JOIN ValueTable T3 ON T.Value3_Id = T3.Id
INNER JOIN ValueTable T4 ON T.Value4_Id = T4.Id
INNER JOIN ValueTable T5 ON T.Value5_Id = T5.Id
INNER JOIN ValueTable T6 ON T.Value6_Id = T6.Id
INNER JOIN ValueTable T7 ON T.Value7_Id = T7.Id
INNER JOIN ValueTable T8 ON T.Value8_Id = T8.Id
INNER JOIN ValueTable T9 ON T.Value9_Id = T9.Id
INNER JOIN ValueTable T10 ON T.Value10_Id = T10.Id
INNER JOIN ValueTable T11 ON T.Value11_Id = T11.Id
INNER JOIN ValueTable T12 ON T.Value12_Id = T12.Id
INNER JOIN ValueTable T13 ON T.Value13_Id = T13.Id
INNER JOIN ValueTable T14 ON T.Value14_Id = T14.Id
INNER JOIN ValueTable T15 ON T.Value15_Id = T15.Id
INNER JOIN ValueTable T16 ON T.Value16_Id = T16.Id
INNER JOIN ValueTable T17 ON T.Value17_Id = T17.Id
INNER JOIN ValueTable T18 ON T.Value18_Id = T18.Id
INNER JOIN ValueTable T19 ON T.Value16_Id = T19.Id
INNER JOIN ValueTable T20 ON T.Value20_Id = T20.Id
--540000ms
OS:
win11
DuckDB Version:
6.0.1
DuckDB Client:
c++
Full Name:
xllance
Affiliation:
.NET
Have you tried this on the latest master
branch?
- I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- I agree