regression: TABLESAMPLE does not respect seed when using system method #16373

@NickCrews

Description

What happens?

SET threads = 1;
CREATE OR REPLACE TABLE test AS SELECT UNNEST(RANGE(100000)) as x;
SELECT COUNT(*), min(x) FROM test TABLESAMPLE system (25 PERCENT) REPEATABLE (42);
-- Results from repeated runs differ:
-- (32416, 0)
-- (18432, 0)
-- (28320, 2048)
-- (24224, 0)
-- (24224, 2048)

However, if I replace the system method with bernoulli or reservoir, the results are reproducible.

This appears to be a regression in 1.2.0. In DuckDB version 1.1.3 and earlier, the results are consistent.
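For anyone who wants to check this from the Python client, here is a minimal sketch that runs the same seeded query several times per sampling method and counts the distinct results. The helper names `distinct_results` and `check_methods` are made up for illustration and are not part of DuckDB; the sketch assumes the `duckdb` Python package is installed and only uses `duckdb.connect()`, `execute()` and `fetchone()`.

```python
from collections import Counter

SETUP = "CREATE OR REPLACE TABLE test AS SELECT UNNEST(RANGE(100000)) AS x"
QUERY = (
    "SELECT COUNT(*), min(x) FROM test "
    "TABLESAMPLE {method} (25 PERCENT) REPEATABLE (42)"
)


def distinct_results(run_once, runs=5):
    """Call run_once() `runs` times and count how often each result occurs."""
    return Counter(run_once() for _ in range(runs))


def check_methods(methods=("system", "bernoulli", "reservoir"), runs=5):
    """Return {method: Counter of result tuples} for the seeded sample query."""
    # Imported lazily so distinct_results() can be used without DuckDB installed.
    import duckdb

    con = duckdb.connect()  # in-memory database
    con.execute("SET threads = 1")
    con.execute(SETUP)

    report = {}
    for method in methods:
        sql = QUERY.format(method=method)
        # A seeded sample is reproducible if every run yields the same tuple.
        report[method] = distinct_results(
            lambda: con.execute(sql).fetchone(), runs
        )
    return report
```

Calling `check_methods()` returns one `Counter` per method. A method whose `Counter` has more than one key returned different results across runs with the same seed, which is what this report describes for `system` on 1.2.0.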

To Reproduce

See above.

OS:

macOS 14.6.1

DuckDB Version:

1.2.0

DuckDB Client:

python

Hardware:

No response

Full Name:

Nick Crews

Affiliation:

Ship Creek Group

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Not applicable - the reproduction does not require a data set

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
