-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Implement partial snapshot recovery #6206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
06f8a7f
to
f4dcec0
Compare
This comment was marked as resolved.
This comment was marked as resolved.
I've already implemented the APIs (and also merged |
f4dcec0
to
cc6c0db
Compare
5d7581d
to
0bf5d2b
Compare
0bf5d2b
to
4e38c00
Compare
4e38c00
to
65ec3d9
Compare
65ec3d9
to
d55e366
Compare
Condition to mark as `Dead` was inverted the whole time... 😅 Thanks coderabbitai!
d55e366
to
ed5f2e1
Compare
pub fn validate(&self) -> OperationResult<()> { | ||
for (segment_id, manifest) in &self.manifests { | ||
for (file, version) in &manifest.file_versions { | ||
if version.or_segment_version(manifest.segment_version) > manifest.segment_version { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see manifest.segment_version
on both sides of the >
here, is that on purpose?
Would it make sense to take an option instead and do .is_some_and(version < segment version)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, is_some_and
would make more sense here, I'll fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. It's not an Option
, but a enum FileVersion { Version(SeqNumberType), Unversioned }
and or_segment_version
is basically Option::unwrap_or
for Unversioned
. I can add is_versioned_and
method, but in this particular case version.or_segment_version(manifest.segment_version) > manifest.segment_version
works the same.
fn failed_to_add(what: &str, path: &Path, err: impl fmt::Display) -> OperationError { | ||
OperationError::service_error(format!( | ||
"failed to add {what} {} into snapshot: {err}", | ||
path.display(), | ||
)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is exclusively used in the function above, how do you feel about inlining it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never bother with explicit inlining, but sure.
@@ -82,6 +104,7 @@ impl ShardReplicaSet { | |||
pub async fn restore_local_replica_from( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is humongous, spanning a few hundred lines. I think we should split it, but we can do so in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Looking forward to the tests and integration of this feature :)
Please split the restore_local_replica_from
function.
- Fix creating partial snapshot without RocksDB backup - Fix `segment_manifest.json` file placement - Implement partial snapshot recovery - Allow restoring partial snapshot for all shard types - Validate segment manifests when recovering partial snapshot - Log error, if streaming shard/partial snapshot failed - Contextualize errors in `snapshot_files` function - Tweak log messages in `Segment::take_snapshot` method - Remove unnecessary lifetimes from `SegmentHolder` iter methods
This PR extends
ShardReplicaSet::restore_local_replica_from
to support partial snapshot recovery.APIs to create and recover partial snapshots and integration tests will be added in a follow-up PRs.
All Submissions:
dev
branch. Did you create your branch fromdev
?New Feature Submissions:
cargo +nightly fmt --all
command prior to submission?cargo clippy --all --all-features
command?Changes to Core Features: