-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Description
What happens?
DuckDB doesn't let me use aliases in the HAVING clause that I define in the SELECT clause and that override an existing column. (More precisely, it gives higher precedence to the original columns over aliases in the SELECT when resolving column names in HAVING expressions, and then throws because the original column is not being aggregated over.)
DuckDB does let me use aliases in the HAVING clause that I define in the SELECT clause as long as they do not override an existing column.
Prior art:
- MySQL lets me use all aliases in the HAVING clause, see https://www.db-fiddle.com/f/6CMxm3gtr2h7PWLXkJkax3/0
- SQLite allows aliases in the HAVING clause but in case of overrides gives precedence to original columns (and if those original columns are not GROUPED over, it'll filter the groups by checking the filter condition on a randomly selected row of each group, which seems insane to me)
- PostgreSQL and SQLServer don't let me use any alias in the HAVING clause.
This may as well be intentional, just opening in case there is interest in adopting MySQL behavior here, which I find more convenient, since duckdb is already deviating from PostgreSQL.
Personally, I find that:
- even in non-grouped by select statements, the aliases should take precedence over the original columns, see Unexpected values when using alias that happens to be a (joined) column name #13991
- even if not, typing the HAVING instead of the ubiquitous WHERE raises enough alerts in my brain that I'm now talking about groups, so I see even less danger of accidentally referring to the alias when the original column was meant. Also, MySQL here offers a precedent, the lack of which was given in Unexpected values when using alias that happens to be a (joined) column name #13991 as a reason not to make a general switch to alias precedence.
To Reproduce
SELECT b, sum(a) AS a
FROM (VALUES (1, 0), (1, 1)) t(a, b)
GROUP BY b
HAVING a > 0
Error: column t.a must appear in the GROUP BY clause or be used in an aggregate function
while
SELECT b, sum(a) AS c
FROM (VALUES (1, 0), (1, 1)) t(a, b)
GROUP BY b
HAVING c > 0
┌───────┬────────┐
│ b │ c │
│ int32 │ int128 │
├───────┼────────┤
│ 1 │ 1 │
│ 0 │ 1 │
└───────┴────────┘
OS:
Linux
DuckDB Version:
'0.10.1-dev717'
DuckDB Client:
Python
Full Name:
Soeren Wolfers
Affiliation:
G-Research
Have you tried this on the latest nightly build?
I have tested with a nightly build
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- Yes, I have