Skip to content

Conversation

patriknw
Copy link
Contributor

  • treat snapshot load failure in same way as other recovery failures
  • if load of snapshot fails the persistent actor will be stopped, since
    we can't assume that a consistent state would be recovered just by
    replaying all events, since events may have been deleted

@patriknw
Copy link
Contributor Author

Refs #21842

@akka-ci akka-ci added the validating PR is currently being validated by Jenkins label Nov 17, 2016
@drewhk
Copy link
Contributor

drewhk commented Nov 17, 2016

If the snapshot itself is corrupt (and hence prevents recovery) then one can still just delete it?

@drewhk
Copy link
Contributor

drewhk commented Nov 17, 2016

LGTM

@patriknw
Copy link
Contributor Author

yes, you can delete the snapshot if you know that you have events that cover it.

It's also possible to start a PersistentActor without using snapshots:

Recovery(fromSnapshot = SnapshotSelectionCriteria.None)

@drewhk
Copy link
Contributor

drewhk commented Nov 17, 2016

Ok, thanks for confirming.

@akka-ci akka-ci added the tested PR that was successfully built and tested by Jenkins label Nov 17, 2016
@akka-ci
Copy link

akka-ci commented Nov 17, 2016

Test PASSed.

@akka-ci akka-ci removed the validating PR is currently being validated by Jenkins label Nov 17, 2016
@ktoso
Copy link
Contributor

ktoso commented Nov 17, 2016

It's also possible to start a PersistentActor without using snapshots:
Recovery(fromSnapshot = SnapshotSelectionCriteria.None)

Great, that's what I vaguely remembered - that's all we need then 👍

* i.e. there was no snapshot. If snapshot could not be loaded the `Future`
* should be completed with failure. That is important because events may
* have been deleted and just replaying the events might not result in a valid
* state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

case Some(md) ⇒
Try(withInputStream(md)(deserialize)) match {
case Success(s) ⇒ Some(SelectedSnapshot(md, s.data))
case Success(s) ⇒
Success(Some(SelectedSnapshot(md, s.data)))
case Failure(e) ⇒
log.error(e, s"Error loading snapshot [${md}]")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make it: log.error(e, "Error loading snapshot [{}], remaining attempts: {}", md, remaining)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

improved: 4de7317

@johanandren
Copy link
Contributor

Mention of this new behaviour in the docs (along with the SnapshotSelectionCriteria.None) & coverage in the TCK? (LGTM except for those missing)

@patriknw
Copy link
Contributor Author

TCK will be hard, we can't simulate deserialization failure because the choice of serialization is up to the journal implementation

@johanandren
Copy link
Contributor

Ah, didn't think of that. I just realized we discussed the same thing about journal deserialization problems coverage recently. Thanks for clarifying a second time then.

@patriknw
Copy link
Contributor Author

No problem, it's not obvious (and not optimal from testing perspective).
Good point with the docs, I'll add that.

@patriknw
Copy link
Contributor Author

@johanandren I have added the additional documentation, see 7ee86ae

@akka-ci akka-ci added validating PR is currently being validated by Jenkins tested PR that was successfully built and tested by Jenkins and removed tested PR that was successfully built and tested by Jenkins labels Nov 17, 2016
@akka-ci
Copy link

akka-ci commented Nov 17, 2016

Test PASSed.

@akka-ci akka-ci added validating PR is currently being validated by Jenkins tested PR that was successfully built and tested by Jenkins and removed validating PR is currently being validated by Jenkins tested PR that was successfully built and tested by Jenkins labels Nov 17, 2016
@akka-ci
Copy link

akka-ci commented Nov 18, 2016

Test PASSed.

Copy link
Contributor

@johanandren johanandren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great!

* treat snapshot load failure in same way as other recovery failures
* if load of snapshot fails the persistent actor will be stopped, since
  we can't assume that a consistent state would be recovered just by
  replaying all events, since events may have been  deleted
* additional recovery docs
* improve log message
@patriknw patriknw force-pushed the wip-21842-snap-fail-patriknw branch from 4de7317 to ea84b4b Compare November 18, 2016 09:52
@patriknw patriknw merged commit bf3d27c into master Nov 18, 2016
@patriknw patriknw deleted the wip-21842-snap-fail-patriknw branch November 18, 2016 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tested PR that was successfully built and tested by Jenkins
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants