-
Notifications
You must be signed in to change notification settings - Fork 3.6k
add LoadSnapshotFailed in snapshot protocol, #21842 #21848
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Refs #21842 |
If the snapshot itself is corrupt (and hence prevents recovery) then one can still just delete it? |
LGTM |
yes, you can delete the snapshot if you know that you have events that cover it. It's also possible to start a PersistentActor without using snapshots:
|
Ok, thanks for confirming. |
Test PASSed. |
Great, that's what I vaguely remembered - that's all we need then 👍 |
* i.e. there was no snapshot. If snapshot could not be loaded the `Future` | ||
* should be completed with failure. That is important because events may | ||
* have been deleted and just replaying the events might not result in a valid | ||
* state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
case Some(md) ⇒ | ||
Try(withInputStream(md)(deserialize)) match { | ||
case Success(s) ⇒ Some(SelectedSnapshot(md, s.data)) | ||
case Success(s) ⇒ | ||
Success(Some(SelectedSnapshot(md, s.data))) | ||
case Failure(e) ⇒ | ||
log.error(e, s"Error loading snapshot [${md}]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe make it: log.error(e, "Error loading snapshot [{}], remaining attempts: {}", md, remaining)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
improved: 4de7317
Mention of this new behaviour in the docs (along with the |
TCK will be hard, we can't simulate deserialization failure because the choice of serialization is up to the journal implementation |
Ah, didn't think of that. I just realized we discussed the same thing about journal deserialization problems coverage recently. Thanks for clarifying a second time then. |
No problem, it's not obvious (and not optimal from testing perspective). |
@johanandren I have added the additional documentation, see 7ee86ae |
Test PASSed. |
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great!
* treat snapshot load failure in same way as other recovery failures * if load of snapshot fails the persistent actor will be stopped, since we can't assume that a consistent state would be recovered just by replaying all events, since events may have been deleted * additional recovery docs * improve log message
4de7317
to
ea84b4b
Compare
we can't assume that a consistent state would be recovered just by
replaying all events, since events may have been deleted