UB when persisting nested data #11621

@krlmlr

What happens?

Writing a nested table to an on-disk database triggers Valgrind errors and crashes the R client.

Downstream: duckdb/duckdb-r#141.

To Reproduce

The error only occurs when writing to disk.

rm my.duckdb;
echo "CREATE TABLE test_list_2 (a integer, b STRUCT(c VARCHAR[], d VARCHAR[], e INTEGER[]))" | duckdb/duckdb my.duckdb;
for i in $(seq 1 10); do echo "INSERT INTO test_list_2 VALUES (1, row(['a', 'b', 'c', 'd', 'e', 'f'], ['A', 'B'], [1, 5, 9]))" | duckdb/duckdb my.duckdb; done;
echo "INSERT INTO test_list_2 VALUES (1, row(['a', 'b', 'c', 'd', 'e', 'f'], ['A', 'B'], [1, 5, 9]))" | valgrind duckdb/duckdb my.duckdb
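The steps above can be wrapped in a small script so that the run fails with a non-zero exit code when Valgrind reports an error. This is only a sketch: the function names (`create_stmt`, `insert_stmt`, `reproduce`) and the `DUCKDB` and `DB` variables are made up for illustration, and the paths are assumed to match the commands above. `--error-exitcode` and `--track-origins=yes` are standard Valgrind options; the latter may help locate where the uninitialised bytes originate, at the cost of a slower run.

```shell
#!/bin/sh
# Hypothetical wrapper around the reproduction steps; adjust paths as needed.
DUCKDB="${DUCKDB:-duckdb/duckdb}"   # assumed location of the CLI binary
DB="${DB:-my.duckdb}"               # on-disk database file

# Emit the CREATE TABLE statement used in the report.
create_stmt() {
  echo "CREATE TABLE test_list_2 (a integer, b STRUCT(c VARCHAR[], d VARCHAR[], e INTEGER[]))"
}

# Emit the INSERT statement used in the report.
insert_stmt() {
  echo "INSERT INTO test_list_2 VALUES (1, row(['a', 'b', 'c', 'd', 'e', 'f'], ['A', 'B'], [1, 5, 9]))"
}

reproduce() {
  rm -f "$DB"
  create_stmt | "$DUCKDB" "$DB" || return 2
  # Ten plain inserts, each in its own process, as in the original steps.
  for i in $(seq 1 10); do
    insert_stmt | "$DUCKDB" "$DB" || return 2
  done
  # Final insert under Valgrind. The exit status is 1 if Valgrind reports
  # an error, otherwise the exit status of the CLI itself.
  insert_stmt | valgrind --error-exitcode=1 --track-origins=yes "$DUCKDB" "$DB"
}

# Run only when both tools are available.
if command -v valgrind >/dev/null 2>&1 && [ -x "$DUCKDB" ]; then
  reproduce
fi
```

Pointing `DB` at a different file keeps an existing `my.duckdb` untouched, for example `DB=/tmp/repro.duckdb sh repro.sh` if the script is saved under the (hypothetical) name `repro.sh`.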

Relevant Valgrind output:

==28183== Syscall param pwrite64(buf) points to uninitialised byte(s)
==28183==    at 0x4CB083F: __libc_pwrite64 (pwrite64.c:25)
==28183==    by 0x4CB083F: pwrite (pwrite64.c:23)
==28183==    by 0x8ABF00: duckdb::LocalFileSystem::Write(duckdb::FileHandle&, void*, long, unsigned long) (in /duckdb/duckdb)
==28183==    by 0xB578C8: duckdb::BlockManager::ConvertToPersistent(long, std::shared_ptr<duckdb::BlockHandle>) (in /duckdb/duckdb)
==28183==    by 0xC1840F: duckdb::ColumnSegment::ConvertToPersistent(duckdb::optional_ptr<duckdb::BlockManager>, long) (in /duckdb/duckdb)
==28183==    by 0xC2583C: duckdb::PartialBlockForCheckpoint::Flush(unsigned long) (in /duckdb/duckdb)
==28183==    by 0xC57316: duckdb::PartialBlockManager::FlushPartialBlocks() (in /duckdb/duckdb)
==28183==    by 0xC6F9B2: duckdb::SingleFileCheckpointWriter::CreateCheckpoint() (in /duckdb/duckdb)
==28183==    by 0xC6FC58: duckdb::SingleFileStorageManager::CreateCheckpoint(bool, bool) (in /duckdb/duckdb)
==28183==    by 0xAF755E: duckdb::AttachedDatabase::Close() (in /duckdb/duckdb)
==28183==    by 0xAF7AC7: duckdb::DatabaseManager::ResetDatabases(duckdb::unique_ptr<duckdb::TaskScheduler, std::default_delete<duckdb::TaskScheduler>, true>&) (in /duckdb/duckdb)
==28183==    by 0xAF7B64: duckdb::DatabaseInstance::~DatabaseInstance() (in /duckdb/duckdb)
==28183==    by 0xAEB399: duckdb::DuckDB::~DuckDB() (in /duckdb/duckdb)
==28183==  Address 0x5c00800 is in a rw- anonymous segment

A second instance of UB shows up in the box renderer if we add a SELECT * FROM test_list_2 at the end:

==27977== Conditional jump or move depends on uninitialised value(s)
==27977==    at 0x8AB796: duckdb::BoxRenderer::ComputeRenderWidths(duckdb::vector<std::string, true> const&, duckdb::vector<duckdb::LogicalType, true> const&, std::list<duckdb::ColumnDataCollection, std::allocator<duckdb::ColumnDataCollection> >&, unsigned long, unsigned long, duckdb::vector<unsigned long, true>&, unsigned long&) (in /duckdb/duckdb)
==27977==    by 0x8B614B: duckdb::BoxRenderer::Render(duckdb::ClientContext&, duckdb::vector<std::string, true> const&, duckdb::ColumnDataCollection const&, std::ostream&) (in /duckdb/duckdb)
==27977==    by 0x8B6E58: duckdb::BoxRenderer::ToString(duckdb::ClientContext&, duckdb::vector<std::string, true> const&, duckdb::ColumnDataCollection const&) (in /duckdb/duckdb)
==27977==    by 0x6D4B2A: duckdb_shell_sqlite3_print_duckbox (in /duckdb/duckdb)
==27977==    by 0x6C4321: exec_prepared_stmt (in /duckdb/duckdb)
==27977==    by 0x6C54D4: shell_exec (in /duckdb/duckdb)
==27977==    by 0x6C6FAF: runOneSqlLine.constprop.0 (in /duckdb/duckdb)
==27977==    by 0x6CF440: process_input (in /duckdb/duckdb)
==27977==    by 0x6A2FB0: main (in /duckdb/duckdb)

OS:

Ubuntu Linux amd64

DuckDB Version:

v0.10.2-dev684

DuckDB Client:

CLI

Full Name:

Kirill Müller

Affiliation:

cynkra GmbH

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a nightly build

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
