Out of Memory Error When Unnesting Large Array Field

### What happens?

I have large dataset that contains a deeply nested json field I am attempting to unnest. It is the ingredients column that is part of this dataset:
https://huggingface.co/datasets/openfoodfacts/product-database

I have loaded this parquet file into my database and I am attempting to unnest it with the following query:
```sql
with
    parsed_ingredients as (
        select code, ingredients::json[] as json_array
        from "mydb"."main_raw"."raw__open_food_facts"
    ),
    ingredient_hierarchy as (
        select
            code,
            unnest(json_array, max_depth := 2) as ingredient_name,
            generate_subscripts(json_array, 1) as ingredient_order,
        from parsed_ingredients
    )

select *
from ingredient_hierarchy
order by code, ingredient_order
```

When running it results in the following error (note I am limiting my duckdb instance to only use 8gb to run on my hardware):
`Out of Memory Error: could not allocate block of size 256.0 KiB (7.4 GiB/7.4 GiB used)`

I have also tried converting the field to a struct first but this causes it run out of memory before I even begin unnesting. How can i properly process this field without running OOM?

### To Reproduce

Download the parquet data and run the code above to reporoduce.

### OS:

Windows

### DuckDB Version:

1.7.0

### DuckDB Client:

DuckDB-DBT

### Hardware:

_No response_

### Full Name:

Kyle Burke

### Affiliation:

None

### What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

### Did you include all relevant data sets for reproducing the issue?

Yes

### Did you include all code required to reproduce the issue?

- [x] Yes, I have

### Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

- [x] Yes, I have

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Out of Memory Error When Unnesting Large Array Field #16277

What happens?

To Reproduce

OS:

DuckDB Version:

DuckDB Client:

Hardware:

Full Name:

Affiliation:

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

Did you include all relevant data sets for reproducing the issue?

Did you include all code required to reproduce the issue?

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Out of Memory Error When Unnesting Large Array Field #16277

Description

What happens?

To Reproduce

OS:

DuckDB Version:

DuckDB Client:

Hardware:

Full Name:

Affiliation:

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

Did you include all relevant data sets for reproducing the issue?

Did you include all code required to reproduce the issue?

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions