-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Description
What happens?
When compared to PostgreSQL , DuckDB is giving incorrect results. Spark SQL and PostgreSQL give 2 rows and DuckDB gives 3 rows.
PostgreSQL and Spark SQL results
t1b,8
t1c,8
DuckDB results
t1e,10
t1b,8
t1c,8
To Reproduce
-- Create table t1
CREATE TABLE t1 (
t1a TEXT,
t1b SMALLINT,
t1c INTEGER,
t1d BIGINT,
t1e REAL,
t1f DOUBLE PRECISION,
t1g NUMERIC(10,2),
t1h TIMESTAMP,
t1i DATE
);
INSERT INTO t1 (t1a, t1b, t1c, t1d, t1e, t1f, t1g, t1h, t1i) VALUES
('t1a', 6, 8, 10, 15.0, 20, 20.00, '2014-04-04 01:00:00', '2014-04-04'),
('t1b', 8, 16, 19, 17.0, 25, 26.00, '2014-05-04 01:01:00', '2014-05-04'),
('t1a', 16, 12, 21, 15.0, 20, 20.00, '2014-06-04 01:02:00.001', '2014-06-04'),
('t1a', 16, 12, 10, 15.0, 20, 20.00, '2014-07-04 01:01:00', '2014-07-04'),
('t1c', 8, 16, 19, 17.0, 25, 26.00, '2014-05-04 01:02:00.001', '2014-05-05'),
('t1d', NULL, 16, 22, 17.0, 25, 26.00, '2014-06-04 01:01:00', NULL),
('t1d', NULL, 16, 19, 17.0, 25, 26.00, '2014-07-04 01:02:00.001', NULL),
('t1e', 10, NULL, 25, 17.0, 25, 26.00, '2014-08-04 01:01:00', '2014-08-04'),
('t1e', 10, NULL, 19, 17.0, 25, 26.00, '2014-09-04 01:02:00.001', '2014-09-04'),
('t1d', 10, NULL, 12, 17.0, 25, 26.00, '2015-05-04 01:01:00', '2015-05-04'),
('t1a', 6, 8, 10, 15.0, 20, 20.00, '2014-04-04 01:02:00.001', '2014-04-04'),
('t1e', 10, NULL, 19, 17.0, 25, 26.00, '2014-05-04 01:01:00', '2014-05-04');
-- Create table t2
CREATE TABLE t2 (
t2a TEXT,
t2b SMALLINT,
t2c INTEGER,
t2d BIGINT,
t2e REAL,
t2f DOUBLE PRECISION,
t2g NUMERIC(10,2),
t2h TIMESTAMP,
t2i DATE
);
INSERT INTO t2 (t2a, t2b, t2c, t2d, t2e, t2f, t2g, t2h, t2i) VALUES
('t2a', 6, 12, 14, 15.0, 20, 20.00, '2014-04-04 01:01:00', '2014-04-04'),
('t1b', 10, 12, 19, 17.0, 25, 26.00, '2014-05-04 01:01:00', '2014-05-04'),
('t1b', 8, 16, 119, 17.0, 25, 26.00, '2015-05-04 01:01:00', '2015-05-04'),
('t1c', 12, 16, 219, 17.0, 25, 26.00, '2016-05-04 01:01:00', '2016-05-04'),
('t1b', NULL, 16, 319, 17.0, 25, 26.00, '2017-05-04 01:01:00', NULL),
('t2e', 8, NULL, 419, 17.0, 25, 26.00, '2014-06-04 01:01:00', '2014-06-04'),
('t1f', 19, NULL, 519, 17.0, 25, 26.00, '2014-05-04 01:01:00', '2014-05-04'),
('t1b', 10, 12, 19, 17.0, 25, 26.00, '2014-06-04 01:01:00', '2014-06-04'),
('t1b', 8, 16, 19, 17.0, 25, 26.00, '2014-07-04 01:01:00', '2014-07-04'),
('t1c', 12, 16, 19, 17.0, 25, 26.00, '2014-08-04 01:01:00', '2014-08-05'),
('t1e', 8, NULL, 19, 17.0, 25, 26.00, '2014-09-04 01:01:00', '2014-09-04'),
('t1f', 19, NULL, 19, 17.0, 25, 26.00, '2014-10-04 01:01:00', '2014-10-04'),
('t1b', NULL, 16, 19, 17.0, 25, 26.00, '2014-05-04 01:01:00', NULL);
SELECT t1a,
t1b
FROM t1
WHERE t1c IN (SELECT t2c
FROM t2
WHERE t1a = t2a)
GROUP BY t1a,
t1b;
OS:
x86_64
DuckDB Version:
1.1.1
DuckDB Client:
Python
Hardware:
No response
Full Name:
Rishab C
Affiliation:
Student
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
- Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?
- Yes, I have