USE databases causes export database to produce duplicate data #7660

@bleskes

What happens?

If you attach a database, issue a USE command to make it the default, and then export the data to a folder, the resulting load.sql contains duplicate instructions to import the tables.

The issue comes from bind_export.cpp which asks for a list of schemas:

auto schemas = Catalog::GetSchemas(context, catalog);

At that point the search path has two relevant entries: one produced by the USE command and one for the default catalog:

[1] = {duckdb::CatalogSearchEntry} 
 catalog = {std::string} "test"
 schema = {std::string} "main"
[2] = {duckdb::CatalogSearchEntry} 
 catalog = {std::string} ""
 schema = {std::string} "main"

Both entries then resolve to the same schema, so the resulting schema list contains test.main twice.

As a result, the tables in that schema are visited twice and the exported load.sql contains duplicate COPY statements.

I'm not sure whether the export command is using the search path incorrectly or whether this is a more fundamental issue with the catalog API. It feels to me that GetSchemas should never return duplicates.
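To make the suspected mechanism concrete, here is a minimal standalone sketch. It does not use DuckDB's actual types or code: SearchEntry, ResolveNaive and ResolveUnique are hypothetical names, and the assumption is simply that an empty catalog name falls back to the default catalog set by USE. It compares names as exact strings, which may differ from how the real catalog compares identifiers.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a search path entry (not DuckDB's real type).
struct SearchEntry {
    std::string catalog; // empty is assumed to mean "use the default catalog"
    std::string schema;
};

using QualifiedSchema = std::pair<std::string, std::string>; // (catalog, schema)

// Resolve every search path entry independently. Two entries that map to the
// same catalog and schema both end up in the result.
std::vector<QualifiedSchema> ResolveNaive(const std::vector<SearchEntry> &path,
                                          const std::string &default_catalog) {
    std::vector<QualifiedSchema> result;
    for (const auto &entry : path) {
        const std::string &catalog = entry.catalog.empty() ? default_catalog : entry.catalog;
        result.emplace_back(catalog, entry.schema);
    }
    return result;
}

// Same resolution, but each (catalog, schema) pair is kept only once,
// preserving the order of first appearance.
std::vector<QualifiedSchema> ResolveUnique(const std::vector<SearchEntry> &path,
                                           const std::string &default_catalog) {
    std::vector<QualifiedSchema> result;
    std::set<QualifiedSchema> seen;
    for (const auto &schema : ResolveNaive(path, default_catalog)) {
        if (seen.insert(schema).second) {
            result.push_back(schema);
        }
    }
    return result;
}
```

With the search path shown above, {"test", "main"} followed by {"", "main"}, and "test" as the default catalog, ResolveNaive yields two entries for test.main while ResolveUnique yields one. Whether such deduplication belongs in the export binder or in the catalog API is exactly the open question here.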

To Reproduce

v0.8.0 e8e4cea5ec
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D attach ':memory:' as test;
D use test;
D create table tbl1 as select 1 as a;
D export database 'test_export';

Which results in:

» cat test_export/load.sql 
COPY tbl1 FROM 'test_export/tbl_.csv' (FORMAT 'csv', quote '"', delimiter ',', header 0);
COPY tbl1 FROM 'test_export/tbl__1.csv' (FORMAT 'csv', quote '"', delimiter ',', header 0);

OS:

MacOS

DuckDB Version:

0.8.0

DuckDB Client:

cli

Full Name:

Boaz Leskes

Affiliation:

MotherDuck

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree
