Skip to content

Conversation

pdet
Copy link
Owner

@pdet pdet commented Jun 22, 2021

No description provided.

@pdet pdet merged commit 52e7044 into pdet:master Jun 22, 2021
pdet pushed a commit that referenced this pull request Mar 18, 2022
restore original code
pdet pushed a commit that referenced this pull request Jun 6, 2022
pdet pushed a commit that referenced this pull request Aug 9, 2022
Pull from duckdb/duckdb
pdet pushed a commit that referenced this pull request Aug 29, 2022
basically completed the new join order optimizer. Need to inspect a few tpcds queries though

* added imdb_parquet benchmark

* Adding multiplicities branch

* add benchmark to query direct from query files

* clean up code that is unnecessary

* fix bugs in join ordering code

* code clean up. This commit has code that changes how analyze structures are printed

* small changes but nothing major

* fix tpcds query error, but now there is one that times out

* tpc-ds passes now

* format fix

* final fixes

* fix tpcds benchmark again

* fixed hopefully the last issues with tpcds and tpch and imdb

* lawrence comments

* format fix

* forgot a D_ASSERT

* make sure it can build

* mult and sel looks good. Still needs cleanup and variable renaming. Going to add mult and sel on column levels now

* first commit, have some ideas

* now we are tracking at the column level, very nice

* mults and sels are now tracked on a column level. stable for tpch, imdb not as much, going to do some memory switching

* no more memory errors

* getting there but we'll see. works if you init the left and right every time

* it works on my machine, but I think there is something wrong with my hardware

* things work, going to work on a big refactor now

* refactor is working so far, but results aren't any better

* remove some comments

* format fix

* some more smaller commits to clean up code

* ok looks good, computing cardinality sums should work now

* fix cmake stuff

* comments and remove old code

* this is going in the right direction

* you can now see some of the join stats when you print the querygraph

* should be able to see sels and muls in graph, but for some reason not yet

* can now see selectivities and multiplicities on the query graph

* some refactoring and cleaning, adding more metadata to query graph should be straightforward now

* remove unnecessary header file

* add cost to JoinStats

* refactor looks much nicer now

* more refactor

* updated benchmark

* cardinality estimation is a little bit better, still seg faulting on tpch query 5

* code to later add HLL

* last commit before attemptin to implement new idea made with laurens

* it works now. going to attempt big rebase

* tracking uniqur values, debugging

* works for the most part, except for a DP bug in q08a for the JOB tests

* checkpoint, less segfaults, some laurens code as well

* when applying a filter set a min resulting cardinality of at least 1

* fix rebase issues

* properly update DP table

* stopping coding to prove symmetry

* ok looks good

* ok here we have some really good results with very approximate joining. DP seems to still have an issue

* we are defs on the right track here. Just implemented a way to check if a table filter is an equality comparison

* good version going to do a clean up commit next

* did some clean up and it builds

* refactor, add or filter estimation logic

* fixed non-reoderable joins bug

* some more refactoring

* some more refactoring

* fix all benchmark files

* fix benchmark files again

* more refactoring

* more refactoring

* more refactoring and commentts

* format fix refactor

* remove two lines

* big refactor. putting cardinality estimation logic in its own class

* refactor looks good now. compiles and runs. Just need to fix laurens comments

* using column_binding_t now, new copy function for estimated properties, unnesting some code

* more refactoring

* format-fix commit

* removing dead code and other not necessary things

* fixed a couple of issues regarding cartesian joins and the approximate join order solver. Still need to handle the case where there is more than one filter

* allunit tests should pass now

* format-fix

* fix the std issue for finding the max element

* fix more tests

* format fix changes

* debug and unittests pass

* format-fix changes

* more fixes and comments from lawrence

* styling changes and also changes to make tpcds pass when using parquet

* forgot to do a format fix

* remove semi colon not found by format fix

* try to get tests passing again, going to merge with master soon

* fix rebase conflicts

* format fix

* more refactoring, need to look at some results because we regressed, but don't really want to do that now

* make debug pass

* should fix debug now

* try to get more tests to pass. run analyze statement tests have better results as well

* fix int/double error breaking some tests

* everything looks good

* fix divide by 0

* fix bug

* make allunit passes now, still need to test make unittestci

* add a parquet vs base table test. A test in select4.test_slow is still failing :(

* fix last bug

* fix compiling bugs

* fix debug failing tests

* ok, fixed bug where not all query edges are considered

* fix issue where not all edges are considered, but results are worse

* fix regressions that took me 3 hours to find

* fix small bug involving idx_t to double casting, and fix executing queries on just parquet

* fix the make tidy errors (hopefully)

* remove not important files

* some clang tidy stuff, but also some stuff for phase timings, but for some reason they aren't reported in python

* deleted stuff for phase timings

* tidy and format fixes

* last comments

* fix last test case

Co-authored-by: Laurens Kuiper <laurens.kuiper@cwi.nl>
pdet pushed a commit that referenced this pull request Nov 25, 2024
pdet pushed a commit that referenced this pull request Dec 4, 2024
I was investigating the following crash where a checkpoint task had its
underlying resources being destroyed while it was still running. The two
interesting threads are the following:

```
thread #1, name = 'duckling', stop reason = signal SIGTRAP
    frame #0: 0x0000ffff91bb71ec
    frame #1: 0x0000aaaad73a38e8 duckling`duckdb::InternalException::InternalException(this=<unavailable>, msg=<unavailable>) at exception.cpp:336:2
    frame #2: 0x0000aaaad786eb68 duckling`duckdb::unique_ptr<duckdb::RowGroup, std::default_delete<duckdb::RowGroup>, true>::operator*() const [inlined] duckdb::unique_ptr<duckdb::RowGroup, std::default_delete<duckdb::RowGroup>, true>::AssertNotNull(null=<unavailable>) at unique_ptr.hpp:25:10
    frame #3: 0x0000aaaad786eaf4 duckling`duckdb::unique_ptr<duckdb::RowGroup, std::default_delete<duckdb::RowGroup>, true>::operator*(this=0x0000aaacbb73e008) const at unique_ptr.hpp:34:4
    frame #4: 0x0000aaaad787abbc duckling`duckdb::CheckpointTask::ExecuteTask(this=0x0000aaabec92be60) at row_group_collection.cpp:732:21
    frame #5: 0x0000aaaad7756ea4 duckling`duckdb::BaseExecutorTask::Execute(this=0x0000aaabec92be60, mode=<unavailable>) at task_executor.cpp:72:3
    frame #6: 0x0000aaaad7757e28 duckling`duckdb::TaskScheduler::ExecuteForever(this=0x0000aaaafda30e10, marker=0x0000aaaafda164a8) at task_scheduler.cpp:189:32
    frame duckdb#7: 0x0000ffff91a031fc
    frame duckdb#8: 0x0000ffff91c0d5c8

thread duckdb#120, stop reason = signal 0
    frame #0: 0x0000ffff91c71c24
    frame #1: 0x0000ffff91e1264c
    frame #2: 0x0000ffff91e01888
    frame #3: 0x0000ffff91e018f8
    frame #4: 0x0000ffff91e01c10
    frame #5: 0x0000ffff91e05afc
    frame #6: 0x0000ffff91e05e70
    frame duckdb#7: 0x0000aaaad784b63c duckling`duckdb::RowGroup::~RowGroup() [inlined] std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release(this=<unavailable>) at shared_ptr_base.h:184:10
    frame duckdb#8: 0x0000aaaad784b5b4 duckling`duckdb::RowGroup::~RowGroup() [inlined] std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count(this=<unavailable>) at shared_ptr_base.h:705:11
    frame duckdb#9: 0x0000aaaad784b5ac duckling`duckdb::RowGroup::~RowGroup() [inlined] std::__shared_ptr<duckdb::ColumnData, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr(this=<unavailable>) at shared_ptr_base.h:1154:31
    frame duckdb#10: 0x0000aaaad784b5ac duckling`duckdb::RowGroup::~RowGroup() [inlined] duckdb::shared_ptr<duckdb::ColumnData, true>::~shared_ptr(this=<unavailable>) at shared_ptr_ipp.hpp:115:24
    frame duckdb#11: 0x0000aaaad784b5ac duckling`duckdb::RowGroup::~RowGroup() [inlined] void std::_Destroy<duckdb::shared_ptr<duckdb::ColumnData, true>>(__pointer=<unavailable>) at stl_construct.h:151:19
    frame duckdb#12: 0x0000aaaad784b5ac duckling`duckdb::RowGroup::~RowGroup() [inlined] void std::_Destroy_aux<false>::__destroy<duckdb::shared_ptr<duckdb::ColumnData, true>*>(__first=<unavailable>, __last=<unavailable>) at stl_construct.h:163:6
    frame duckdb#13: 0x0000aaaad784b5a0 duckling`duckdb::RowGroup::~RowGroup() [inlined] void std::_Destroy<duckdb::shared_ptr<duckdb::ColumnData, true>*>(__first=<unavailable>, __last=<unavailable>) at stl_construct.h:195:7
    frame duckdb#14: 0x0000aaaad784b5a0 duckling`duckdb::RowGroup::~RowGroup() [inlined] void std::_Destroy<duckdb::shared_ptr<duckdb::ColumnData, true>*, duckdb::shared_ptr<duckdb::ColumnData, true>>(__first=<unavailable>, __last=<unavailable>, (null)=<unavailable>) at alloc_traits.h:848:7
    frame duckdb#15: 0x0000aaaad784b5a0 duckling`duckdb::RowGroup::~RowGroup() [inlined] std::vector<duckdb::shared_ptr<duckdb::ColumnData, true>, std::allocator<duckdb::shared_ptr<duckdb::ColumnData, true>>>::~vector(this=<unavailable>) at stl_vector.h:680:2
    frame duckdb#16: 0x0000aaaad784b5a0 duckling`duckdb::RowGroup::~RowGroup(this=<unavailable>) at row_group.cpp:83:1
    frame duckdb#17: 0x0000aaaad786ee18 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] std::default_delete<duckdb::RowGroup>::operator()(this=0x0000aaacbb73e1a8, __ptr=0x0000aaab75ae7860) const at unique_ptr.h:85:2
    frame duckdb#18: 0x0000aaaad786ee10 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() at unique_ptr.h:361:4
    frame duckdb#19: 0x0000aaaad786ee08 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] duckdb::SegmentNode<duckdb::RowGroup>::~SegmentNode(this=0x0000aaacbb73e1a0) at segment_tree.hpp:21:8
    frame duckdb#20: 0x0000aaaad786ee08 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] void std::_Destroy<duckdb::SegmentNode<duckdb::RowGroup>>(__pointer=0x0000aaacbb73e1a0) at stl_construct.h:151:19
    frame duckdb#21: 0x0000aaaad786ee08 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] void std::_Destroy_aux<false>::__destroy<duckdb::SegmentNode<duckdb::RowGroup>*>(__first=0x0000aaacbb73e1a0, __last=0x0000aaacbb751130) at stl_construct.h:163:6
    frame duckdb#22: 0x0000aaaad786ede8 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] void std::_Destroy<duckdb::SegmentNode<duckdb::RowGroup>*>(__first=<unavailable>, __last=0x0000aaacbb751130) at stl_construct.h:195:7
    frame duckdb#23: 0x0000aaaad786ede8 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] void std::_Destroy<duckdb::SegmentNode<duckdb::RowGroup>*, duckdb::SegmentNode<duckdb::RowGroup>>(__first=<unavailable>, __last=0x0000aaacbb751130, (null)=0x0000fffefc81a908) at alloc_traits.h:848:7
    frame duckdb#24: 0x0000aaaad786ede8 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector(this=size=4883) at stl_vector.h:680:2
    frame duckdb#25: 0x0000aaaad7857f74 duckling`duckdb::RowGroupCollection::Checkpoint(this=<unavailable>, writer=<unavailable>, global_stats=0x0000fffefc81a9c0) at row_group_collection.cpp:1017:1
    frame duckdb#26: 0x0000aaaad788f02c duckling`duckdb::DataTable::Checkpoint(this=0x0000aaab04649e70, writer=0x0000aaab6584ea80, serializer=0x0000fffefc81ab38) at data_table.cpp:1427:14
    frame duckdb#27: 0x0000aaaad7881394 duckling`duckdb::SingleFileCheckpointWriter::WriteTable(this=0x0000fffefc81b128, table=0x0000aaab023b78c0, serializer=0x0000fffefc81ab38) at checkpoint_manager.cpp:528:11
    frame duckdb#28: 0x0000aaaad787ece4 duckling`duckdb::SingleFileCheckpointWriter::CreateCheckpoint() [inlined] duckdb::SingleFileCheckpointWriter::CreateCheckpoint(this=<unavailable>, obj=0x0000fffefc81ab38)::$_7::operator()(duckdb::Serializer::List&, unsigned long) const::'lambda'(duckdb::Serializer&)::operator()(duckdb::Serializer&) const at checkpoint_manager.cpp:181:43
    frame duckdb#29: 0x0000aaaad787ecd8 duckling`duckdb::SingleFileCheckpointWriter::CreateCheckpoint() [inlined] void duckdb::Serializer::List::WriteObject<duckdb::SingleFileCheckpointWriter::CreateCheckpoint()::$_7::operator()(duckdb::Serializer::List&, unsigned long) const::'lambda'(duckdb::Serializer&)>(this=<unavailable>, f=(unnamed class) @ 0x0000600002cbd2b0) at serializer.hpp:385:2
    frame duckdb#30: 0x0000aaaad787ecc4 duckling`duckdb::SingleFileCheckpointWriter::CreateCheckpoint() [inlined] duckdb::SingleFileCheckpointWriter::CreateCheckpoint()::$_7::operator()(this=<unavailable>, list=<unavailable>, i=2) const at checkpoint_manager.cpp:181:8
    frame duckdb#31: 0x0000aaaad787ecb0 duckling`duckdb::SingleFileCheckpointWriter::CreateCheckpoint() at serializer.hpp:151:4
    frame duckdb#32: 0x0000aaaad787ec94 duckling`duckdb::SingleFileCheckpointWriter::CreateCheckpoint(this=0x0000fffefc81b128) at checkpoint_manager.cpp:179:13
    frame duckdb#33: 0x0000aaaad78954a8 duckling`duckdb::SingleFileStorageManager::CreateCheckpoint(this=0x0000aaaafe1de140, options=(wal_action = DONT_DELETE_WAL, action = CHECKPOINT_IF_REQUIRED, type = FULL_CHECKPOINT)) at storage_manager.cpp:365:17
    frame duckdb#34: 0x0000aaaad78baac0 duckling`duckdb::DuckTransactionManager::Checkpoint(this=0x0000aaaafe167e00, context=<unavailable>, force=<unavailable>) at duck_transaction_manager.cpp:198:18
    frame duckdb#35: 0x0000aaaad69d02c0 duckling`md::Server::BackgroundCheckpointIfNeeded(this=0x0000aaaafdbfe900) at server.cpp:1983:31
    frame duckdb#36: 0x0000aaaadac5d3f0 duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] std::function<void ()>::operator()(this=<unavailable>) const at std_function.h:590:9
    frame duckdb#37: 0x0000aaaadac5d3e0 duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] md::BackgroundCronTask::Start(unsigned long)::$_0::operator()(this=0x0000aaaafdf169a8) const at background_cron_task.cpp:25:4
    frame duckdb#38: 0x0000aaaadac5d30c duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] void std::__invoke_impl<void, md::BackgroundCronTask::Start(unsigned long)::$_0>((null)=<unavailable>, __f=0x0000aaaafdf169a8) at invoke.h:61:14
    frame duckdb#39: 0x0000aaaadac5d30c duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] std::__invoke_result<md::BackgroundCronTask::Start(unsigned long)::$_0>::type std::__invoke<md::BackgroundCronTask::Start(unsigned long)::$_0>(__fn=0x0000aaaafdf169a8) at invoke.h:96:14
    frame duckdb#40: 0x0000aaaadac5d30c duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] void std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>::_M_invoke<0ul>(this=0x0000aaaafdf169a8, (null)=<unavailable>) at std_thread.h:259:13
    frame duckdb#41: 0x0000aaaadac5d30c duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>::operator()(this=0x0000aaaafdf169a8) at std_thread.h:266:11
    frame duckdb#42: 0x0000aaaadac5d30c duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run(this=0x0000aaaafdf169a0) at std_thread.h:211:13
    frame duckdb#43: 0x0000ffff91a031fc
    frame duckdb#44: 0x0000ffff91c0d5c8
```

The problem is that if there's an IO exception being thrown in
`RowGroupCollection::Checkpoint` after some (but not all) checkpoint
tasks have been scheduled but before
`checkpoint_state.executor.WorkOnTasks();` is called, it results in an
InternalException / DuckDB crash as the `Checkpoint ` method does not
wait for the scheduled tasks to have completed before destroying the
referenced resources.
pdet pushed a commit that referenced this pull request Jan 20, 2025
Fixes duckdblabs/duckdb-internal#3922

The failing query
```SQL
SET order_by_non_integer_literal=true;
SELECT DISTINCT ON ( 'string' ) 'string', GROUP BY CUBE ( 'string', ), 'string' IN ( SELECT 'string' ), HAVING 'string' IN ( SELECT 'string');
```

The Plan generated before optimization is below. During optimization
there is an attempt to convert the mark join into a semi. Before this
conversion takes place, we usually check to make sure the mark join is
not used in any operators above the mark join to prevent plan
verification errors. Up until this point, only logical projections were
checked for mark joins. Turns out this query is planned in such a way
that the mark join is in one of the expressions of the aggregate
operator. This was not checked, so the mark to semi conversion would
take place. The fix is to modify the filter pushdown optimization so
that it stores table indexes from logical aggregate operators.

```
┌───────────────────────────┐
│       PROJECTION #1       │
│    ────────────────────   │
│    Expressions: #[2.0]    │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│           FILTER          │
│    ────────────────────   │
│    Expressions: #[2.1]    │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│    AGGREGATE #2, #3, #4   │
│    ────────────────────   │
│          Groups:          │
│          'string'         │
│          #[14.0]          │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│      COMPARISON_JOIN      │
│    ────────────────────   │
│      Join Type: MARK      │
│                           ├──────────────┐
│        Conditions:        │              │
│    ('string' = #[8.0])    │              │
└─────────────┬─────────────┘              │
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│       DUMMY_SCAN #0       ││       PROJECTION duckdb#8       │
│    ────────────────────   ││    ────────────────────   │
│                           ││   Expressions: 'string'   │
└───────────────────────────┘└─────────────┬─────────────┘
                             ┌─────────────┴─────────────┐
                             │       DUMMY_SCAN duckdb#7       │
                             │    ────────────────────   │
                             └───────────────────────────┘
```
pdet pushed a commit that referenced this pull request Feb 17, 2025
We had two users crash with the following backtrace:

```
    frame #0: 0x0000ffffab2571ec
    frame #1: 0x0000aaaaac00c5fc duckling`duckdb::InternalException::InternalException(this=<unavailable>, msg=<unavailable>) at exception.cpp:328:2
    frame #2: 0x0000aaaaac1ee418 duckling`duckdb::optional_ptr<duckdb::OptimisticDataWriter, true>::CheckValid(this=<unavailable>) const at optional_ptr.hpp:34:11
    frame #3: 0x0000aaaaac1eea8c duckling`duckdb::MergeCollectionTask::Execute(duckdb::PhysicalBatchInsert const&, duckdb::ClientContext&, duckdb::GlobalSinkState&, duckdb::LocalSinkState&) [inlined] duckdb::optional_ptr<duckdb::OptimisticDataWriter, true>::operator*(this=<unavailable>) at optional_ptr.hpp:43:3
    frame #4: 0x0000aaaaac1eea84 duckling`duckdb::MergeCollectionTask::Execute(this=0x0000aaaaf1b06150, op=<unavailable>, context=0x0000aaaba820d8d0, gstate_p=0x0000aaab06880f00, lstate_p=<unavailable>) at physical_batch_insert.cpp:219:90
    frame #5: 0x0000aaaaac1d2e10 duckling`duckdb::PhysicalBatchInsert::Sink(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSinkInput&) const [inlined] duckdb::PhysicalBatchInsert::ExecuteTask(this=0x0000aaaafa62ab40, context=<unavailable>, gstate_p=0x0000aaab06880f00, lstate_p=0x0000aab12d442960) const at physical_batch_insert.cpp:425:8
    frame #6: 0x0000aaaaac1d2dd8 duckling`duckdb::PhysicalBatchInsert::Sink(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSinkInput&) const [inlined] duckdb::PhysicalBatchInsert::ExecuteTasks(this=0x0000aaaafa62ab40, context=<unavailable>, gstate_p=0x0000aaab06880f00, lstate_p=0x0000aab12d442960) const at physical_batch_insert.cpp:431:9
    frame duckdb#7: 0x0000aaaaac1d2dd8 duckling`duckdb::PhysicalBatchInsert::Sink(this=0x0000aaaafa62ab40, context=0x0000aab2fffd7cb0, chunk=<unavailable>, input=<unavailable>) const at physical_batch_insert.cpp:494:4
    frame duckdb#8: 0x0000aaaaac353158 duckling`duckdb::PipelineExecutor::ExecutePushInternal(duckdb::DataChunk&, duckdb::ExecutionBudget&, unsigned long) [inlined] duckdb::PipelineExecutor::Sink(this=0x0000aab2fffd7c00, chunk=0x0000aab2fffd7d30, input=0x0000fffec0aba8d8) at pipeline_executor.cpp:521:24
    frame duckdb#9: 0x0000aaaaac353130 duckling`duckdb::PipelineExecutor::ExecutePushInternal(this=0x0000aab2fffd7c00, input=0x0000aab2fffd7d30, chunk_budget=0x0000fffec0aba980, initial_idx=0) at pipeline_executor.cpp:332:23
    frame duckdb#10: 0x0000aaaaac34f7b4 duckling`duckdb::PipelineExecutor::Execute(this=0x0000aab2fffd7c00, max_chunks=<unavailable>) at pipeline_executor.cpp:201:13
    frame duckdb#11: 0x0000aaaaac34f258 duckling`duckdb::PipelineTask::ExecuteTask(duckdb::TaskExecutionMode) [inlined] duckdb::PipelineExecutor::Execute(this=<unavailable>) at pipeline_executor.cpp:278:9
    frame duckdb#12: 0x0000aaaaac34f250 duckling`duckdb::PipelineTask::ExecuteTask(this=0x0000aab16dafd630, mode=<unavailable>) at pipeline.cpp:51:33
    frame duckdb#13: 0x0000aaaaac348298 duckling`duckdb::ExecutorTask::Execute(this=0x0000aab16dafd630, mode=<unavailable>) at executor_task.cpp:49:11
    frame duckdb#14: 0x0000aaaaac356600 duckling`duckdb::TaskScheduler::ExecuteForever(this=0x0000aaaaf0105560, marker=0x0000aaaaf00ee578) at task_scheduler.cpp:189:32
    frame duckdb#15: 0x0000ffffab0a31fc
    frame duckdb#16: 0x0000ffffab2ad5c8
```

Core dump analysis showed that the assertion `D_ASSERT(lstate.writer);`
in `MergeCollectionTask::Execute` (i.e. it is crashing because
`lstate.writer` is NULLPTR) was not satisfied when
`PhysicalBatchInsert::Sink` was processing merge tasks from (other)
pipeline executors.

My suspicion is that this is only likely to happen for heavily
concurrent workloads (applicable to the two users which crashed). The
patch submitted as part of this PR has addressed the issue for these
users.
pdet pushed a commit that referenced this pull request Feb 26, 2025
pdet pushed a commit that referenced this pull request Aug 27, 2025
I have seen this crashing due to invalid pointer on which a destructor is called, on last night `main` (`2ed9bf887f`) using
unittester compiled from sources (clang 17) and extensions installed from default extension repository.

Basically:
```
DYLD_INSERT_LIBRARIES=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/17/lib/darwin/libclang_rt.asan_osx_dynamic.dylib LOCAL_EXTENSION_REPO=http://extensions.duckdb.org ./build/release/test/unittest --autoloading all --skip-compiled  --order rand test/parquet/test_parquet_schema.test
```
and seeing runtime sanitizer assertions such as
```
==56046==ERROR: AddressSanitizer: container-overflow on address 0x6100000d4dcf at pc 0x000116c7f450 bp 0x00016fc1d170 sp 0x00016fc1d168
READ of size 1 at 0x6100000d4dcf thread T0
    #0 0x000116c7f44c in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>* std::__1::__uninitialized_allocator_copy_impl[abi:ne190102]<std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*>(std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*)+0x318 (parquet.duckdb_extension:arm64+0xab44c)
    #1 0x000116c7ec90 in void std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__construct_at_end<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, unsigned long)+0x160 (parquet.duckdb_extension:arm64+0xaac90)
    #2 0x000116c7e7d8 in void std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__assign_with_size[abi:ne190102]<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, long)+0x1e0 (parquet.duckdb_extension:arm64+0xaa7d8)
    #3 0x000116e8cd48 in duckdb::ParquetMultiFileInfo::BindReader(duckdb::ClientContext&, duckdb::vector<duckdb::LogicalType, true>&, duckdb::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, true>&, duckdb::MultiFileBindData&)+0xf18 (parquet.duckdb_extension:arm64+0x2b8d48)
    #4 0x000116e6e5fc in duckdb::MultiFileFunction<duckdb::ParquetMultiFileInfo>::MultiFileBindInternal(duckdb::ClientContext&, duckdb::unique_ptr<duckdb::MultiFileReader, std::__1::default_delete<duckdb::MultiFileReader>, true>, duckdb::shared_ptr<duckdb::MultiFileList, true>, duckdb::vector<duckdb::LogicalType, true>&, duckdb::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, true>&, duckdb::MultiFileOptions, duckdb::unique_ptr<duckdb::BaseFileReaderOptions, std::__1::default_delete<duckdb::BaseFileReaderOptions>, true>, duckdb::unique_ptr<duckdb::MultiFileReaderInterface, std::__1::default_delete<duckdb::MultiFileReaderInterface>, true>)+0x1210 (parquet.duckdb_extension:arm64+0x29a5fc)
```

or these failures while using ducklake
```
==56079==ERROR: AddressSanitizer: container-overflow on address 0x616000091a78 at pc 0x0001323fc81c bp 0x00016bd0e890 sp 0x00016bd0e888
READ of size 8 at 0x616000091a78 thread T2049
    #0 0x0001323fc818 in duckdb::MultiFileColumnDefinition::~MultiFileColumnDefinition()+0x258 (parquet.duckdb_extension:arm64+0x2a4818)
    #1 0x0001323fb588 in std::__1::vector<duckdb::MultiFileColumnDefinition, std::__1::allocator<duckdb::MultiFileColumnDefinition>>::__destroy_vector::operator()[abi:ne190102]()+0x98 (parquet.duckdb_extension:arm64+0x2a3588)
    #2 0x0001324a09e4 in duckdb::BaseFileReader::~BaseFileReader()+0x2bc (parquet.duckdb_extension:arm64+0x3489e4)
    #3 0x0001324a23ec in duckdb::ParquetReader::~ParquetReader()+0x22c (parquet.duckdb_extension:arm64+0x34a3ec)
```
pdet pushed a commit that referenced this pull request Aug 27, 2025
I have seen this crashing due to invalid pointer on which a destructor
is called, on last night `main` (`2ed9bf887f`) using unittester compiled
from sources (clang 17) and extensions installed from default extension
repository.

Basically:
```
DYLD_INSERT_LIBRARIES=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/17/lib/darwin/libclang_rt.asan_osx_dynamic.dylib LOCAL_EXTENSION_REPO=http://extensions.duckdb.org ./build/release/test/unittest --autoloading all --skip-compiled  --order rand test/parquet/test_parquet_schema.test
```
and seeing runtime sanitizer assertions such as
```
==56046==ERROR: AddressSanitizer: container-overflow on address 0x6100000d4dcf at pc 0x000116c7f450 bp 0x00016fc1d170 sp 0x00016fc1d168
READ of size 1 at 0x6100000d4dcf thread T0
    #0 0x000116c7f44c in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>* std::__1::__uninitialized_allocator_copy_impl[abi:ne190102]<std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*>(std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*)+0x318 (parquet.duckdb_extension:arm64+0xab44c)
    #1 0x000116c7ec90 in void std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__construct_at_end<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, unsigned long)+0x160 (parquet.duckdb_extension:arm64+0xaac90)
    #2 0x000116c7e7d8 in void std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__assign_with_size[abi:ne190102]<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, long)+0x1e0 (parquet.duckdb_extension:arm64+0xaa7d8)
    #3 0x000116e8cd48 in duckdb::ParquetMultiFileInfo::BindReader(duckdb::ClientContext&, duckdb::vector<duckdb::LogicalType, true>&, duckdb::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, true>&, duckdb::MultiFileBindData&)+0xf18 (parquet.duckdb_extension:arm64+0x2b8d48)
    #4 0x000116e6e5fc in duckdb::MultiFileFunction<duckdb::ParquetMultiFileInfo>::MultiFileBindInternal(duckdb::ClientContext&, duckdb::unique_ptr<duckdb::MultiFileReader, std::__1::default_delete<duckdb::MultiFileReader>, true>, duckdb::shared_ptr<duckdb::MultiFileList, true>, duckdb::vector<duckdb::LogicalType, true>&, duckdb::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, true>&, duckdb::MultiFileOptions, duckdb::unique_ptr<duckdb::BaseFileReaderOptions, std::__1::default_delete<duckdb::BaseFileReaderOptions>, true>, duckdb::unique_ptr<duckdb::MultiFileReaderInterface, std::__1::default_delete<duckdb::MultiFileReaderInterface>, true>)+0x1210 (parquet.duckdb_extension:arm64+0x29a5fc)
```

or these failures while using ducklake
```
==56079==ERROR: AddressSanitizer: container-overflow on address 0x616000091a78 at pc 0x0001323fc81c bp 0x00016bd0e890 sp 0x00016bd0e888
READ of size 8 at 0x616000091a78 thread T2049
    #0 0x0001323fc818 in duckdb::MultiFileColumnDefinition::~MultiFileColumnDefinition()+0x258 (parquet.duckdb_extension:arm64+0x2a4818)
    #1 0x0001323fb588 in std::__1::vector<duckdb::MultiFileColumnDefinition, std::__1::allocator<duckdb::MultiFileColumnDefinition>>::__destroy_vector::operator()[abi:ne190102]()+0x98 (parquet.duckdb_extension:arm64+0x2a3588)
    #2 0x0001324a09e4 in duckdb::BaseFileReader::~BaseFileReader()+0x2bc (parquet.duckdb_extension:arm64+0x3489e4)
    #3 0x0001324a23ec in duckdb::ParquetReader::~ParquetReader()+0x22c (parquet.duckdb_extension:arm64+0x34a3ec)
```

With these changes, once having the `parquet` extension build by CI,
this works as expected.

I am not sure if the fix could / should be elsewhere.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants