Skip to content

Using multiple INNER JOINs causes significant slow down #5630

@xllance

Description

@xllance

What happens?

Adding multiple INNER JOIN statements to a query makes it exponentially slower.

Number of records in tables do not make any difference.

In the sample below, using 20 joins takes 9 minutes to complete on empty tables.

What's more strange, the same timings apply to EXPLAIN statements.

After some research, I found out that it's actually duckdb_prepare function call makes it slow.

To Reproduce

CREATE or REPLACE TABLE MainTable (Id INT,
Value1_Id INT, Value2_Id INT, Value3_Id INT, Value4_Id INT, Value5_Id INT,
Value6_Id INT, Value7_Id INT, Value8_Id INT, Value9_Id INT, Value10_Id INT,
Value11_Id INT, Value12_Id INT, Value13_Id INT, Value14_Id INT, Value15_Id INT,
Value16_Id INT, Value17_Id INT, Value18_Id INT, Value19_Id INT, Value20_Id INT);

CREATE or REPLACE TABLE ValueTable (Id INT, Value TEXT);

SELECT * FROM MainTable T
INNER JOIN ValueTable T1 ON T.Value1_Id = T1.Id
INNER JOIN ValueTable T2 ON T.Value2_Id = T2.Id
INNER JOIN ValueTable T3 ON T.Value3_Id = T3.Id
INNER JOIN ValueTable T4 ON T.Value4_Id = T4.Id
INNER JOIN ValueTable T5 ON T.Value5_Id = T5.Id
--5ms

SELECT * FROM MainTable T
INNER JOIN ValueTable T1 ON T.Value1_Id = T1.Id
INNER JOIN ValueTable T2 ON T.Value2_Id = T2.Id
INNER JOIN ValueTable T3 ON T.Value3_Id = T3.Id
INNER JOIN ValueTable T4 ON T.Value4_Id = T4.Id
INNER JOIN ValueTable T5 ON T.Value5_Id = T5.Id
INNER JOIN ValueTable T6 ON T.Value6_Id = T6.Id
INNER JOIN ValueTable T7 ON T.Value7_Id = T7.Id
INNER JOIN ValueTable T8 ON T.Value8_Id = T8.Id
INNER JOIN ValueTable T9 ON T.Value9_Id = T9.Id
INNER JOIN ValueTable T10 ON T.Value10_Id = T10.Id
--90ms

SELECT * FROM MainTable T
INNER JOIN ValueTable T1 ON T.Value1_Id = T1.Id
INNER JOIN ValueTable T2 ON T.Value2_Id = T2.Id
INNER JOIN ValueTable T3 ON T.Value3_Id = T3.Id
INNER JOIN ValueTable T4 ON T.Value4_Id = T4.Id
INNER JOIN ValueTable T5 ON T.Value5_Id = T5.Id
INNER JOIN ValueTable T6 ON T.Value6_Id = T6.Id
INNER JOIN ValueTable T7 ON T.Value7_Id = T7.Id
INNER JOIN ValueTable T8 ON T.Value8_Id = T8.Id
INNER JOIN ValueTable T9 ON T.Value9_Id = T9.Id
INNER JOIN ValueTable T10 ON T.Value10_Id = T10.Id
INNER JOIN ValueTable T11 ON T.Value11_Id = T11.Id
INNER JOIN ValueTable T12 ON T.Value12_Id = T12.Id
INNER JOIN ValueTable T13 ON T.Value13_Id = T13.Id
INNER JOIN ValueTable T14 ON T.Value14_Id = T14.Id
INNER JOIN ValueTable T15 ON T.Value15_Id = T15.Id
--3200ms

SELECT * FROM MainTable T
INNER JOIN ValueTable T1 ON T.Value1_Id = T1.Id
INNER JOIN ValueTable T2 ON T.Value2_Id = T2.Id
INNER JOIN ValueTable T3 ON T.Value3_Id = T3.Id
INNER JOIN ValueTable T4 ON T.Value4_Id = T4.Id
INNER JOIN ValueTable T5 ON T.Value5_Id = T5.Id
INNER JOIN ValueTable T6 ON T.Value6_Id = T6.Id
INNER JOIN ValueTable T7 ON T.Value7_Id = T7.Id
INNER JOIN ValueTable T8 ON T.Value8_Id = T8.Id
INNER JOIN ValueTable T9 ON T.Value9_Id = T9.Id
INNER JOIN ValueTable T10 ON T.Value10_Id = T10.Id
INNER JOIN ValueTable T11 ON T.Value11_Id = T11.Id
INNER JOIN ValueTable T12 ON T.Value12_Id = T12.Id
INNER JOIN ValueTable T13 ON T.Value13_Id = T13.Id
INNER JOIN ValueTable T14 ON T.Value14_Id = T14.Id
INNER JOIN ValueTable T15 ON T.Value15_Id = T15.Id
INNER JOIN ValueTable T16 ON T.Value16_Id = T16.Id
--7600ms

SELECT * FROM MainTable T
INNER JOIN ValueTable T1 ON T.Value1_Id = T1.Id
INNER JOIN ValueTable T2 ON T.Value2_Id = T2.Id
INNER JOIN ValueTable T3 ON T.Value3_Id = T3.Id
INNER JOIN ValueTable T4 ON T.Value4_Id = T4.Id
INNER JOIN ValueTable T5 ON T.Value5_Id = T5.Id
INNER JOIN ValueTable T6 ON T.Value6_Id = T6.Id
INNER JOIN ValueTable T7 ON T.Value7_Id = T7.Id
INNER JOIN ValueTable T8 ON T.Value8_Id = T8.Id
INNER JOIN ValueTable T9 ON T.Value9_Id = T9.Id
INNER JOIN ValueTable T10 ON T.Value10_Id = T10.Id
INNER JOIN ValueTable T11 ON T.Value11_Id = T11.Id
INNER JOIN ValueTable T12 ON T.Value12_Id = T12.Id
INNER JOIN ValueTable T13 ON T.Value13_Id = T13.Id
INNER JOIN ValueTable T14 ON T.Value14_Id = T14.Id
INNER JOIN ValueTable T15 ON T.Value15_Id = T15.Id
INNER JOIN ValueTable T16 ON T.Value16_Id = T16.Id
INNER JOIN ValueTable T17 ON T.Value17_Id = T17.Id
INNER JOIN ValueTable T18 ON T.Value18_Id = T18.Id
INNER JOIN ValueTable T19 ON T.Value16_Id = T19.Id
INNER JOIN ValueTable T20 ON T.Value20_Id = T20.Id
--540000ms

OS:

win11

DuckDB Version:

6.0.1

DuckDB Client:

c++

Full Name:

xllance

Affiliation:

.NET

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions