Closed
Description
What happens?
Calling conn.register()/conn.unregister() from multiple threads sometimes triggers a segfault. The cause appears to be a race condition in which the entry being removed is already gone or invalid by the time unregister() calls DependencyManager::DropObject.
To Reproduce
The following script causes a crash about a third of the time:
import duckdb
import pandas as pd
import numpy as np
import time
from threading import Thread

df = pd.DataFrame(np.zeros((10_000, 3)))
conn = duckdb.connect()
conn.execute("""CREATE TABLE foo
(
    x FLOAT,
    y FLOAT,
    z FLOAT,
);""")

def work(i):
    db = conn.cursor()
    db.register("df", df)
    #print(f"{i} registered")
    db.execute("INSERT INTO foo SELECT * from df")
    #print(f"{i} inserted")
    db.unregister("df")
    #print(f"{i} done")

print("Start")
threads = []
for i in range(100_000):
    threads.append(Thread(target=work, args=(i,), name=f"thread_{i}"))

for t in threads:
    t.start()
for t in threads:
    t.join()
print("done!")
The backtrace under lldb typically looks like this:
(lldb) bt
* thread #41, stop reason = EXC_BAD_ACCESS (code=1, address=0xbeadd8af5d08)
* frame #0: 0x000000010373b990 duckdb.cpython-38-darwin.so`std::__1::unordered_map<duckdb::CatalogEntry*, std::__1::unordered_set<duckdb::Dependency, duckdb::DependencyHashFunction, duckdb::DependencyEquality, std::__1::allocator<duckdb::Dependency> >, std::__1::hash<duckdb::CatalogEntry*>, std::__1::equal_to<duckdb::CatalogEntry*>, std::__1::allocator<std::__1::pair<duckdb::CatalogEntry* const, std::__1::unordered_set<duckdb::Dependency, duckdb::DependencyHashFunction, duckdb::DependencyEquality, std::__1::allocator<duckdb::Dependency> > > > >::operator[](duckdb::CatalogEntry* const&) + 408
frame #1: 0x000000010373a8f4 duckdb.cpython-38-darwin.so`duckdb::DependencyManager::DropObject(duckdb::ClientContext&, duckdb::CatalogEntry*, bool) + 44
frame #2: 0x000000010373a864 duckdb.cpython-38-darwin.so`duckdb::CatalogSet::DropEntryDependencies(duckdb::ClientContext&, unsigned long long, duckdb::CatalogEntry&, bool) + 132
frame #3: 0x000000010373ab24 duckdb.cpython-38-darwin.so`duckdb::CatalogSet::DropEntryInternal(duckdb::ClientContext&, unsigned long long, duckdb::CatalogEntry&, bool) + 84
frame #4: 0x0000000103734ab4 duckdb.cpython-38-darwin.so`duckdb::CatalogSet::DropEntry(duckdb::ClientContext&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool) + 228
frame #5: 0x0000000103745100 duckdb.cpython-38-darwin.so`duckdb::SchemaCatalogEntry::DropEntry(duckdb::ClientContext&, duckdb::DropInfo*) + 124
frame #6: 0x0000000103a8b4fc duckdb.cpython-38-darwin.so`duckdb::PhysicalDrop::GetData(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::GlobalSourceState&, duckdb::LocalSourceState&) const + 236
frame #7: 0x00000001040bf5e8 duckdb.cpython-38-darwin.so`duckdb::PipelineExecutor::FetchFromSource(duckdb::DataChunk&) + 96
frame #8: 0x00000001040bcf88 duckdb.cpython-38-darwin.so`duckdb::PipelineExecutor::ExecutePull(duckdb::DataChunk&) + 160
frame #9: 0x00000001040bce40 duckdb.cpython-38-darwin.so`duckdb::Executor::FetchChunk() + 112
frame #10: 0x0000000103ff66d0 duckdb.cpython-38-darwin.so`duckdb::ClientContext::FetchInternal(duckdb::ClientContextLock&, duckdb::Executor&, duckdb::BaseQueryResult&) + 76
frame #11: 0x0000000103ff7fd8 duckdb.cpython-38-darwin.so`duckdb::ClientContext::FetchResultInternal(duckdb::ClientContextLock&, duckdb::PendingQueryResult&) + 540
frame #12: 0x0000000103ffcb04 duckdb.cpython-38-darwin.so`duckdb::PendingQueryResult::ExecuteInternal(duckdb::ClientContextLock&) + 92
frame #13: 0x0000000103ffefac duckdb.cpython-38-darwin.so`duckdb::ClientContext::Query(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool) + 316
frame #14: 0x0000000104004a28 duckdb.cpython-38-darwin.so`duckdb::Connection::Query(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 36
frame #15: 0x000000010429c39c duckdb.cpython-38-darwin.so`duckdb::DuckDBPyConnection::UnregisterPythonObject(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 288
frame #16: 0x00000001042aa2fc duckdb.cpython-38-darwin.so`void pybind11::cpp_function::initialize<pybind11::cpp_function::cpp_function<duckdb::DuckDBPyConnection*, duckdb::DuckDBPyConnection, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, pybind11::name, pybind11::is_method, pybind11::sibling, char [25], pybind11::arg>(duckdb::DuckDBPyConnection* (duckdb::DuckDBPyConnection::*)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, char const (&) [25], pybind11::arg const&)::'lambda'(duckdb::DuckDBPyConnection*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&), duckdb::DuckDBPyConnection*, duckdb::DuckDBPyConnection*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, pybind11::name, pybind11::is_method, pybind11::sibling, char [25], pybind11::arg>(duckdb::DuckDBPyConnection*&&, duckdb::DuckDBPyConnection (*)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, char const (&) [25], pybind11::arg const&)::'lambda'(pybind11::detail::function_call&)::operator()(pybind11::detail::function_call&) const + 172
frame #17: 0x0000000104254818 duckdb.cpython-38-darwin.so`pybind11::cpp_function::dispatcher(_object*, _object*, _object*) + 3544
frame #18: 0x0000000100021960 python`cfunction_call_varargs + 140
frame #19: 0x0000000100021324 python`_PyObject_MakeTpCall + 372
frame #20: 0x00000001000244e0 python`method_vectorcall + 196
frame #21: 0x00000001000f6840 python`call_function + 296
frame #22: 0x00000001000f3bf0 python`_PyEval_EvalFrameDefault + 23796
frame #23: 0x0000000100021d08 python`function_code_fastcall + 120
frame #24: 0x0000000100021688 python`PyVectorcall_Call + 104
frame #25: 0x00000001000f3e48 python`_PyEval_EvalFrameDefault + 24396
frame #26: 0x0000000100021d08 python`function_code_fastcall + 120
frame #27: 0x00000001000f6840 python`call_function + 296
frame #28: 0x00000001000f3bcc python`_PyEval_EvalFrameDefault + 23760
frame #29: 0x0000000100021d08 python`function_code_fastcall + 120
frame #30: 0x00000001000f6840 python`call_function + 296
frame #31: 0x00000001000f3bcc python`_PyEval_EvalFrameDefault + 23760
frame #32: 0x0000000100021d08 python`function_code_fastcall + 120
frame #33: 0x000000010002453c python`method_vectorcall + 288
frame #34: 0x0000000100021688 python`PyVectorcall_Call + 104
frame #35: 0x000000010019247c python`t_bootstrap + 80
frame #36: 0x0000000100142b2c python`pythread_wrapper + 28
frame #37: 0x000000019e50606c libsystem_pthread.dylib`_pthread_start + 148
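Since the backtrace ends in the unregister() path, one possible stopgap is to serialize the whole register/execute/unregister sequence behind a single lock. This is only a sketch and has not been verified against DuckDB itself: the helper name run_with_registered and the module-level _catalog_lock are hypothetical, and the helper assumes nothing about the cursor beyond the register(), execute() and unregister() methods used in the script above.

```python
import threading

# Hypothetical process-wide lock guarding catalog changes made through
# register()/unregister(). The name is an assumption for this sketch.
_catalog_lock = threading.Lock()


def run_with_registered(cursor, name, obj, sql):
    """Register obj as name, run sql, then unregister, all under one lock.

    cursor is any object exposing register(name, obj), execute(sql) and
    unregister(name), such as the cursors created in the script above.
    """
    with _catalog_lock:
        cursor.register(name, obj)
        try:
            return cursor.execute(sql)
        finally:
            # Always drop the registration, even if execute() raises.
            cursor.unregister(name)
```

With this helper, the body of work(i) would reduce to run_with_registered(conn.cursor(), "df", df, "INSERT INTO foo SELECT * from df"). The obvious cost is that the threads no longer run their inserts in parallel, so this is a way to avoid the crash path rather than a fix for the underlying race.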
OS:
MacOS 13.0 - arm64
DuckDB Version:
0.6.1-dev153
DuckDB Client:
Python
Full Name:
Ronan Lamy
Affiliation:
iterative.ai
Have you tried this on the latest master branch?
- I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- I agree