-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Closed
duckdb/duckdb-r
#571Labels
Description
What happens?
when you run multiple pivot statements before turning their relations into tables, all but the last get clobbered with nonsense.
I'm running linux but the same problem occurs on windows.
details in reproduction code.
To Reproduce
# demonstrate bug
import numpy as np
import pandas as pd
import duckdb
N = 1000
df = pd.DataFrame({"a":np.random.choice(["u", "v", "w"], N, True),
"b":np.random.choice(["x", "y", "z"], N, True),
"c":np.random.randn(N),})
conn = duckdb.connect()
ddf = conn.from_df(df)
ddf.create("input_data")
xx = conn.query("pivot input_data on a using max(c) group by b;")
yy = conn.query("pivot input_data on b using max(c) group by a;")
xx.create("xx")
yy.create("yy")
print(conn.query("SHOW ALL TABLES;").df())
xx_table = conn.table("xx")
yy_table = conn.table("yy")
assert set(yy.columns)==set(yy_table.columns) # Succeeds.
assert set(xx.columns)==set(xx_table.columns) # Fails.
# xx_table shows up with cols [b, x, y, z] and all nans in the x,y,z cols
# The first col should indeed by b, but the next 3 should be u,v,w and they shouldn't have all nans
OS:
Ubuntu 22.04.5 LTS x86_64
DuckDB Version:
1.1.1
DuckDB Client:
python
Hardware:
No response
Full Name:
Michael Hankin
Affiliation:
Mothball Labs
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have not tested with any build
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
- Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?
- Yes, I have