Skip to content

Assertion failed when reading JSON with nullable fields #7145

@matt-allan

Description

@matt-allan

What happens?

I ran into the following error while querying a few JSON files with the json extension from Node:

node: ../src/duckdb/extension/json/json_functions/read_json.cpp:251: void duckdb::ReadJSONFunction(duckdb::ClientContext&, duckdb::TableFunctionInput&, duckdb::DataChunk&): Assertion `result_vectors.size() == gstate.bind_data.names.size()' failed.

The JSON I am querying is essentially a linked list that looks like this:

// data_1.json
{
  "id": 1,
   "next": null
}
// data_2.json
{
  "id": 2,
  "next": 1
}

I am using a recursive CTE to query the files which looks like this:

CREATE OR REPLACE VIEW segments AS WITH RECURSIVE
raw_segments AS (
  SELECT *
  FROM read_json(
    '/tmp/duckdb-reprofOPqUc/*.json',
    columns = {
      id: 'INTEGER',
      next: 'INTEGER'
  })
),
sorted_segments AS (
  SELECT
    0 AS seq,
    raw_segments.*,
    FROM raw_segments
  WHERE raw_segments.next IS NULL
  UNION ALL
  (
    SELECT
      sorted_segments.seq + 1 AS seq,
      raw_segments.*,
      FROM raw_segments
      INNER JOIN sorted_segments ON (raw_segments.next = sorted_segments.id)
  )
)
SELECT *
FROM sorted_segments
ORDER BY seq;

Running the same query from the DuckDB CLI works correctly on 0.7.1, while using 0.7.1 from Node does not (still need to try comparing with exact versions though).

$ cat repro.sql|duckdb
┌───────┬───────┬───────┐
│  seq  │  id   │ next  │
│ int32 │ int32 │ int32 │
├───────┼───────┼───────┤
│     0 │     1 │       │
│     1 │     2 │     1 │
│     2 │     3 │     2 │
└───────┴───────┴───────┘

It also works from Node.JS if I make next always an integer and never null. If I first insert the JSON into a table, then query that it prevents the issue as well.

To Reproduce

I've attached a file with a Node.JS script that reproduces the issue. There's also a SQL script and a folder of test data for comparison.

duckdb-issue.zip

OS:

Arch Linux x86_64

DuckDB Version:

0.7.2-dev2233 / c81600e

DuckDB Client:

Node

Full Name:

Matt Allan

Affiliation:

Destructure LLC (consulting)

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions