Skip to content

Arithmetic expressions involving ubigint columns and integer literals are implicitly cast 'upwards' to double #10096

@gsauthof

Description

@gsauthof

What happens?

Arithmetic expressions on ubigint (== uint64) columns and integers are easily implicitly cast to double which arguably is quite unexpected because that way unsigned values larger than 2**53-1 are approximated, which violates the expectation that results in the ubigint range (i.e. up to 2**64-1) are represented exactly.

This arguably can be considered as a hidden foot gun (or anit-personal mine) when you are querying ubigint columns that contain large IDs or high-precision timestamps (such as nanoseconds since some epoche), because your query seems to work but a subset of rows might be silently dropped (when such expressions are used in the where or join clause) or some rows might contain approximated and thus erroneous values.

NB: A similar issue was discussed in #3931 (comment):

In principle this makes sense to me, but perhaps it would make sense to turn constants into the smallest type that can represent them, so the constant 2 should instead be a UTINYINT.

FWIW, this idea would solve this issue - alternatively, warning about implicit double casting in such expressions or even throwing an error would be a very useful improvement, already, in my opinion.

To Reproduce

To reproduce enter a DuckDB shell and enter the following query:

select x + 1 from (select 9007199254740992::ubigint as x);

Actual result:

D select x + 1 from (select 9007199254740992::ubigint as x);
┌────────────────────┐
│      (x + 1)       │
│       double       │
├────────────────────┤
│ 9007199254740992.0 │
└────────────────────┘

Expected result:

D select x + 1 from (select 23::ubigint as x);
┌──────────────────────────┐
│ (x + CAST(1 AS UTINYINT))│
│          uint64          │
├──────────────────────────┤
│         9007199254740993 │
└──────────────────────────┘

NB: Of course, if you are very familiar with the current duckdb casting mechanics you can directly work around this by always casting all the integer literals in all you expressions, e.g. like this:

D select x + 1::ubigint from (select 9007199254740992::ubigint as x);
┌──────────────────────────┐
│ (x + CAST(1 AS UBIGINT)) │
│          uint64          │
├──────────────────────────┤
│         9007199254740993 │
└──────────────────────────┘

NB: I also tested it with the latest stable release, i.e. version 0.9.2, same results.

OS:

Fedora 39 x86_64

DuckDB Version:

v0.9.3-dev1996 dae3b28

DuckDB Client:

CLI

Full Name:

Georg Sauthoff

Affiliation:

https://gms.tf

Have you tried this on the latest main branch?

I have tested with a main build

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • Yes, I have

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions