-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Description
What happens?
Arithmetic expressions on ubigint (== uint64) columns and integers are easily implicitly cast to double which arguably is quite unexpected because that way unsigned values larger than 2**53-1
are approximated, which violates the expectation that results in the ubigint range (i.e. up to 2**64-1
) are represented exactly.
This arguably can be considered as a hidden foot gun (or anit-personal mine) when you are querying ubigint columns that contain large IDs or high-precision timestamps (such as nanoseconds since some epoche), because your query seems to work but a subset of rows might be silently dropped (when such expressions are used in the where or join clause) or some rows might contain approximated and thus erroneous values.
NB: A similar issue was discussed in #3931 (comment):
In principle this makes sense to me, but perhaps it would make sense to turn constants into the smallest type that can represent them, so the constant 2 should instead be a UTINYINT.
FWIW, this idea would solve this issue - alternatively, warning about implicit double casting in such expressions or even throwing an error would be a very useful improvement, already, in my opinion.
To Reproduce
To reproduce enter a DuckDB shell and enter the following query:
select x + 1 from (select 9007199254740992::ubigint as x);
Actual result:
D select x + 1 from (select 9007199254740992::ubigint as x);
┌────────────────────┐
│ (x + 1) │
│ double │
├────────────────────┤
│ 9007199254740992.0 │
└────────────────────┘
Expected result:
D select x + 1 from (select 23::ubigint as x);
┌──────────────────────────┐
│ (x + CAST(1 AS UTINYINT))│
│ uint64 │
├──────────────────────────┤
│ 9007199254740993 │
└──────────────────────────┘
NB: Of course, if you are very familiar with the current duckdb casting mechanics you can directly work around this by always casting all the integer literals in all you expressions, e.g. like this:
D select x + 1::ubigint from (select 9007199254740992::ubigint as x);
┌──────────────────────────┐
│ (x + CAST(1 AS UBIGINT)) │
│ uint64 │
├──────────────────────────┤
│ 9007199254740993 │
└──────────────────────────┘
NB: I also tested it with the latest stable release, i.e. version 0.9.2, same results.
OS:
Fedora 39 x86_64
DuckDB Version:
v0.9.3-dev1996 dae3b28
DuckDB Client:
CLI
Full Name:
Georg Sauthoff
Affiliation:
Have you tried this on the latest main
branch?
I have tested with a main build
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- Yes, I have