-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Description
What happens?
The documentation says "The inverse of IGNORE NULLS is RESPECT NULLS, which is the default for all functions."
(1) I don't think count
/sum
and most other aggregate functions "respect nulls" in any sense of the word. A NULL-respecting count
would just count all rows like count(*)
does or pandas.shape
does; a null-respecting sum
would return NULL
if any value was NULL
, analogous to how 3.14::FLOAT + NULL
evaluates to NULL
in duckdb and how df.sum(skipna=False)
behaves in pandas.
In fact, the documentation also says "[...] will not work as expected when using an aggregate function that does not ignore NULL values. The first
function falls into this category" when talking about count
.
(2) I therefore don't think it makes sense to allow count/sum/...(... RESPECT NULLS)
, which currently does the same as count/sum/...(...)
and is better described as IGNORE NULLS
(3) Also, the documentation should be updated, but I can open an issue for that once I get confirmation on the technical side here
To Reproduce
FROM
(VALUES (NULL), (1)) t(b)
SELECT
sum(b RESPECT NULLS)
GROUP BY ()
and
FROM
(VALUES (NULL), (1)) t(b)
SELECT
sum(b RESPECT NULLS) OVER ()
work, but behave like IGNORE NULLS
(i.e., return 1
).
OS:
Linux
DuckDB Version:
'0.10.1-dev717'
DuckDB Client:
Python
Full Name:
Soeren Wolfers
Affiliation:
G-Research
Have you tried this on the latest nightly build?
I have tested with a nightly build
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- Yes, I have