@Mytherin Mytherin commented Apr 3, 2025

This PR adds a new setting, `scheduler_process_partial`, that allows partial scheduling of tasks in the background threads. Normally, the background threads pass `TaskExecutionMode::PROCESS_ALL` to the tasks they execute, so each thread stays on a task until it has been fully processed. As a result, long-running queries can block short-running queries.

In regular DuckDB operation that is not a big problem, because every connection has its own "processing thread" that works only on the queries it issues, so short-running queries still make some progress. That is not always the case, however. When using certain APIs (e.g. the pending query API), the foreground thread might not participate in query processing, leaving everything to the background threads. As a result, short-running queries can become starved.

Enabling this setting improves fairness by working on each task only partially before rescheduling it. As a result, bigger tasks need more "rounds", and short-running queries complete without stalling. The setting is disabled by default for now, since in most cases it is not required, but it is enabled in CI with the alternative verification setting.
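
To make the fairness argument concrete, here is a toy model of a single background thread draining a task queue. It is only a sketch and not DuckDB's scheduler: the `simulate` function, the `quantum` parameter, and the work-unit costs are all hypothetical, chosen just to show how finishing each task in one go compares with doing a slice of work and rescheduling.

```python
from collections import deque


def simulate(tasks, partial, quantum=1):
    """Toy single-worker scheduler (hypothetical, not DuckDB code).

    tasks:   list of (name, work_units) in arrival order
    partial: if False, each task runs to completion once picked up
             (analogous to PROCESS_ALL); if True, the worker does at most
             `quantum` units and then puts the task back in the queue
    Returns a dict mapping task name to the clock time at which it finished.
    """
    queue = deque(tasks)
    clock = 0
    finished = {}
    while queue:
        name, remaining = queue.popleft()
        # Decide how much of this task to process before yielding.
        step = min(quantum, remaining) if partial else remaining
        clock += step
        remaining -= step
        if remaining > 0:
            # Unfinished work goes to the back of the queue for another round.
            queue.append((name, remaining))
        else:
            finished[name] = clock
    return finished


# One long-running task queued ahead of one short task.
tasks = [("long", 100), ("short", 2)]

process_all = simulate(tasks, partial=False)
process_partial = simulate(tasks, partial=True, quantum=1)

print(process_all)      # {'long': 100, 'short': 102}
print(process_partial)  # {'short': 4, 'long': 102}
```

In this model the total work is the same in both modes (the long task still ends at time 102 when rescheduled), but the short task finishes at time 4 instead of waiting behind the long one until 102. The long task simply needs more rounds.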

@duckdb-draftbot duckdb-draftbot marked this pull request as draft April 3, 2025 15:18
@Mytherin Mytherin marked this pull request as ready for review April 3, 2025 18:32
@Mytherin Mytherin merged commit a299d15 into duckdb:main Apr 4, 2025
51 of 52 checks passed
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 15, 2025
Add a setting `scheduler_process_partial` that allows partial scheduling of tasks in the background threads (duckdb/duckdb#16973)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 16, 2025
Add a setting `scheduler_process_partial` that allows partial scheduling of tasks in the background threads (duckdb/duckdb#16973)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 17, 2025
Add a setting `scheduler_process_partial` that allows partial scheduling of tasks in the background threads (duckdb/duckdb#16973)
Mytherin added a commit that referenced this pull request Jun 12, 2025
…17877)

#16973 added a setting for tasks to
be returned to the scheduler after partially processing them. This PR
makes the executor (which is used on application threads) respect that
setting as well.
@Mytherin Mytherin deleted the rescheduletasks branch June 12, 2025 15:29