Creating index seems to lead to incorrect query results

### What happens?

When I use DuckDB, it seems to produce incorrect query results. I'm not exactly sure the root cause, but I tried to reproduce the error with much a smaller test case. 

### To Reproduce

I put the data `obj_clevrer_short.csv` on Google Drive: https://drive.google.com/file/d/1QkwvNnEEPT_MyRSrW4Nm-WS5luUl2V0P/view?usp=sharing. Note that commenting out the line `conn.execute("CREATE INDEX IF NOT EXISTS idx_obj_clevrer ON Obj_clevrer (vid);")` will resolve the error. 

```py
import duckdb

# Create database, and load data
conn = duckdb.connect(database='test.duckdb', read_only=False)
conn.execute("CREATE TABLE Obj_clevrer (oid INT, vid INT, fid INT, shape varchar, color varchar, material varchar, x1 float, y1 float, x2 float, y2 float)")
conn.execute("COPY Obj_clevrer FROM 'obj_clevrer_short.csv' (FORMAT 'csv', DELIMITER ',', HEADER 0);")

# Creating index causes incorrect results
conn.execute("CREATE INDEX IF NOT EXISTS idx_obj_clevrer ON Obj_clevrer (vid);")

# Run some queries
query_results = []
for vid in range(500):
    df1 = conn.execute(f"""
        SELECT oid as o0_oid, fid
        FROM Obj_clevrer
        WHERE color = 'red' AND vid = {vid}
    """).df()
    df2 = conn.execute(f"""
        SELECT oid as o0_oid, fid
        FROM Obj_clevrer
        WHERE material = 'rubber' AND vid = {vid}
    """).df()
    graph_df1 = conn.execute(f"SELECT * FROM df1 natural join df2").df()
    if len(graph_df1):
        query_results.append(vid)

# query_results should be equal to ground_truth
ground_truth = [1, 2, 3, 6, 14, 21, 23, 25, 29, 30, 32, 33, 40, 41, 44, 50, 51, 54, 63, 71, 74, 75, 78, 79, 83, 84, 90, 91, 94, 95, 97, 98, 100, 105, 107, 109, 111, 114, 115, 117, 125, 126, 128, 132, 138, 139, 144, 146, 149, 151, 154, 156, 163, 165, 169, 171, 177, 180, 183, 185, 186, 189, 191, 193, 197, 201, 202, 205, 206, 207, 209, 214, 217, 221, 222, 223, 235, 241, 244, 248, 249, 250, 252, 254, 257, 261, 266, 268, 270, 271, 276, 287, 291, 292, 299, 308, 314, 315, 317, 326, 342, 344, 346, 347, 349, 350, 351, 355, 364, 366, 367, 368, 369, 372, 380, 384, 392, 394, 399, 400, 402, 406, 407, 410, 412, 414, 416, 418, 419, 420, 424, 428, 431, 433, 435, 438, 442, 448, 450, 452, 456, 462, 463, 464, 476, 477, 479, 480, 484, 485, 486, 489, 490, 493]
diff1 = sorted(list(set(query_results) - set(ground_truth)))
print("diff1", diff1)
if len(diff1):
    # This gives incorrect results. Specifically, I found that 'oid' and 'fid' values of some tuples become 0 when running this query
    test_df1 = conn.execute("select * from Obj_clevrer where vid = {} and fid = 0".format(diff1[0])).df()
    print(test_df1.to_string())

    # This gives correct results
    test_df2 = conn.execute("select * from Obj_clevrer").df()
    print(test_df2.loc[(test_df2['vid'] == diff1[0]) & (test_df2['fid'] == 0)].to_string())
```

Output: the output is non-deterministic, but I'm able to obtain a non-empty `diff1` every time I run the above code. In this example output, `query_results` includes three vids that are not in `ground_truth`. Looking at vid 181, test_df1 gives an incorrect result, where the last tuple's oid and fid are incorrectly set to 0. However, test_df2 gives the correct result. 
```
diff1:
[181, 232, 352]

test_df1:
   oid  vid  fid     shape  color material     x1     y1     x2     y2
0    0  181    0    sphere   blue   rubber  321.0   73.0  355.0  108.0
1    1  181    0      cube  brown   rubber  114.0   72.0  154.0  116.0
2    2  181    0  cylinder  green   rubber  138.0    1.0  161.0   18.0
3    3  181    0    sphere    red    metal  112.0  109.0  152.0  149.0
4    0  181    0    sphere    red    metal  335.0   77.0  371.0  112.0

test_df2:
        oid  vid  fid     shape  color material     x1     y1     x2     y2
110051    0  181    0    sphere   blue   rubber  321.0   73.0  355.0  108.0
110052    1  181    0      cube  brown   rubber  114.0   72.0  154.0  116.0
110053    2  181    0  cylinder  green   rubber  138.0    1.0  161.0   18.0
110054    3  181    0    sphere    red    metal  112.0  109.0  152.0  149.0
```

### OS:

macos (and also linux)

### DuckDB Version:

0.9.2 and 0.9.3-dev2705

### DuckDB Client:

Python

### Full Name:

Enhao Zhang

### Affiliation:

University of Washington 

### Have you tried this on the latest `main` branch?

I have tested with a main build

### Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

- [X] Yes, I have

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Creating index seems to lead to incorrect query results #10251

What happens?

To Reproduce

OS:

DuckDB Version:

DuckDB Client:

Full Name:

Affiliation:

Have you tried this on the latest `main` branch?

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Creating index seems to lead to incorrect query results #10251

Description

What happens?

To Reproduce

OS:

DuckDB Version:

DuckDB Client:

Full Name:

Affiliation:

Have you tried this on the latest main branch?

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Have you tried this on the latest `main` branch?