Skip to content

Conversation

jkub
Copy link
Contributor

@jkub jkub commented Mar 19, 2024

Add TRIM method to file system, which enables excising a range of the DB file. The OS can drop pages from the page-cache, and the file-system is free to deallocate the range (sparse file support). Reads to the range will succeed but will return undefined data (though most platforms will return 0s).

This is implemented and tested on linux. Windows/ntfs also has support for trim but I have not implemented it, and so tests do not exercise it.

The feature is enabled via a dbconfig.options.trim_free_blocks, which defaults to false.

@jkub jkub force-pushed the fs_trim_support branch from cc7f321 to 79697fd Compare March 20, 2024 05:20
@Mytherin Mytherin marked this pull request as ready for review March 20, 2024 06:24
Copy link
Contributor

@carlopi carlopi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one substantial comment on the testing side of this, that I think might require @Mytherin to take a look and make a call on how to move forward, and then took a look also at the rest and left two minor clarifications.

@@ -547,6 +551,28 @@ void SingleFileBlockManager::WriteHeader(DatabaseHeader header) {
active_header = 1 - active_header;
//! Ensure the header write ends up on disk
handle->Sync();
// Release the free blocks to the filesystem.
TrimFreeBlocks();
newly_freed_list.clear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
newly_freed_list.clear();

This is done already as part of TrimFreeBlocks.
Potentially to consider an D_ASSERT that `newly_freed_list is actually empty.

Comment on lines +439 to +440
free_list.erase(free_list.lower_bound(max_block), free_list.end());
newly_freed_list.erase(newly_freed_list.lower_bound(max_block), newly_freed_list.end());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not obvious to me looking here that both free_list and newly_freed_list are guaranteed to be sorted (given they are backed by a set<>).

Unsure if that should be made more visible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's part of the spec that std::set is sorted. Are you suggesting that we rename these members, to make it clearer that they're sorted?

Comment on lines 151 to 154
// Currently this is only supported on linux.
// Because we test this mode on linux, and the no-trim mode on other platforms,
// we get coverage with the option both on and off.
result->options.trim_free_blocks = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this part. I think it's great to test this thoroughly, but it looks this also means we will stop testing duckdb on Linux with its default setting, and I can see potentially some problems being visible only with one of the two modes.

Unsure what's the way forward.
Maybe leaving default to false but having some unittester options to turn this on based on environment variable?

And also, I would consider this to not be linux only, and when this is turned on all filesystems handle this properly (via Trim being a no-op).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I wasn't sure what to do here and was hoping for feedback.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can enable the setting if DUCKDB_ALTERNATIVE_VERIFY is defined?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can make that change. Although DUCKDB_ALTERNATIVE_VERIFY also tests a slightly different checkpoint/shutdown behavior, which could conceivably interact?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is true, maybe putting it behind DESTROY_UNPINNED_BLOCKS would be better? Or RUN_SLOW_VERIFIERS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use DUCKDB_RUN_SLOW_VERIFIERS

@carlopi carlopi requested a review from Mytherin March 20, 2024 09:31
@Mytherin
Copy link
Collaborator

Thanks for the PR! Cool functionality. I was not aware this was even possible. I'm very curious how well this works on various file systems/operating systems, but we can experiment with that separately. Could you have a look at the failing CI?

@github-actions github-actions bot marked this pull request as draft March 20, 2024 15:36
@jkub jkub marked this pull request as ready for review March 20, 2024 19:48
@Mytherin Mytherin merged commit cf02e63 into duckdb:main Mar 21, 2024
@Mytherin
Copy link
Collaborator

Thanks!

github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 21, 2024
Merge pull request duckdb/duckdb#11258 from jkub/fs_trim_support
Merge pull request duckdb/duckdb#11268 from lnkuiper/shell_json
Merge pull request duckdb/duckdb#11271 from lnkuiper/json_stuff
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request Mar 23, 2024
Merge pull request duckdb/duckdb#11258 from jkub/fs_trim_support
Merge pull request duckdb/duckdb#11268 from lnkuiper/shell_json
Merge pull request duckdb/duckdb#11271 from lnkuiper/json_stuff
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 28, 2024
Merge pull request duckdb/duckdb#11258 from jkub/fs_trim_support
Merge pull request duckdb/duckdb#11268 from lnkuiper/shell_json
Merge pull request duckdb/duckdb#11271 from lnkuiper/json_stuff
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants