Eviction queue: Sort purged nodes and bulk re-add #17913
Conversation
Looks good, I don't remember a particular reason - I'd say if the tests pass, all good :))
Thanks for the PR! The bulk enqueue makes sense to me and seems like an improvement, but the sort on sequence numbers does not make that much sense to me. Sequence numbers are somewhat arbitrary: they are incremented when a block is pinned, and they are never decremented. That means a block that is pinned/unpinned many times in a short period and then never used again will have a very high sequence number, compared to a block that is pinned/unpinned constantly but at a lower frequency. If we want to do this kind of sorting, we could add a timestamp or similar to the buffer nodes instead.
@Mytherin Thanks, I did not realize that. I thought it was a global thing. I've removed the sort for now; we can add timestamps and sort in v1.4.0.
Eviction queue: Sort purged nodes and bulk re-add (duckdb/duckdb#17913)
[CI] adding DONT_LINK parameter to the test extension configuration for `inet` extension (duckdb/duckdb#17967)
bump julia to v1.3.1 (duckdb/duckdb#17966)
Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
```cpp
// bulk re-add (TODO order them by timestamp to better retain the LRU behavior)
q.enqueue_bulk(purge_nodes.begin(), alive_nodes);
```
Hi @lnkuiper @Mytherin @taniabogatsch. This function call will cause the DuckDB WASM build to fail, because the MVP/EH feature of WASM doesn't support threads (DUCKDB_NO_THREADS=1). In that case, the file https://github.com/duckdb/duckdb/blob/main/third_party/concurrentqueue/concurrentqueue.h is not included; instead, a simple polyfill for the `ConcurrentQueue` class in https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/parallel/concurrentqueue.hpp is used. However, that class does not implement an `enqueue_bulk` method, so the following error is thrown:

/path/to/duckdb-wasm/submodules/duckdb/src/storage/buffer/buffer_pool.cpp:227:4: error: no member named 'enqueue_bulk' in 'duckdb_moodycamel::ConcurrentQueue<duckdb::BufferEvictionNode>'

I think a method with the same name, `enqueue_bulk`, needs to be added to the `ConcurrentQueue` class in the DuckDB `src` directory. Alternatively, the macro DUCKDB_NO_THREADS (or another macro that indicates whether the real `ConcurrentQueue` class was included) could be checked in https://github.com/duckdb/duckdb/blob/main/src/storage/buffer/buffer_pool.cpp.
Thank you
Following the discussion here:
https://github.com/duckdblabs/motherduck/issues/320
Our eviction queue uses an LRU policy to evict pages. However, we periodically need to "purge dead nodes" (nodes for pages that have been (un)pinned again since they were enqueued), which re-adds some of the "alive nodes" to the back of the queue. This means that the LRU behavior is not maintained for a small portion of the pages.
This PR sorts the alive nodes to slightly improve the LRU behavior, and re-adds them in bulk to slightly improve performance (instead of re-adding them one-by-one).
EDIT: @taniabogatsch Do you remember if there was any reason we did not bulk re-add already?