Add a function to restore a snapshot #10373
Conversation
Codecov Report

Attention: Patch coverage is

@@            Coverage Diff             @@
##           master   #10373      +/-   ##
==========================================
+ Coverage   62.22%   62.26%   +0.04%
==========================================
  Files        1232     1232
  Lines       88407    88612     +205
  Branches     1820     1820
==========================================
+ Hits        55013    55177     +164
- Misses      32543    32591      +48
+ Partials      851      844       -7
ctx := pctx.TestContext(t)
db := dockertestenv.NewMigratedTestDB(t, clusterstate.DesiredClusterState)
tracker := track.NewPostgresTracker(db)
storage := fileset.NewStorage(fileset.NewPostgresStore(db), tracker, chunk.NewStorage(kv.NewMemStore(), nil, db, tracker))
We have a test constructor for this.
fileset.NewTestStorage only works against an unmigrated database, and can only create the tables used by the storage system. This test wants to write to the recovery.snapshots table, but there is no way to create that table without conflicting with NewTestStorage. (Applying migrations prevents fileset.NewTestStorage from working.)
I definitely get why storage tests don't want all those tables around, but for this "dump the entire database" test, we want the entire database.
(I also like kv.NewMemStore() better than NewFSStore, though not for any real reason.)
return 0, errors.Wrap(err, "create snapshot row")
b.WriteRune('\n')
b.WriteString("COPY storage.chunksets (id) FROM stdin;\n")
fmt.Fprintf(&b, "%v\n", int64(chunksetID))
Shouldn't this be a log line?
No, this is the format of the database dump. We have to manually create it here because pg_dump can't see writes from this transaction even though its reads are in this transaction. The statement after COPY is basically a TSV file of the table content.
I'm concerned about this. I thought the whole point of having ChunkSets was to encapsulate that logic behind an API that could remain stable while the storage representation continued to change. At the very least, those functions should be exported from the storage package, so we know that they are a commitment to the recovery system.
We need to sort out the cross-system database queries; see my other comment.
Force-pushed from 335f732 to 81c72f7
@brendoncarroll I have moved the storage dumping functionality to the storage and tracker packages. The reason for this is that pg_dump and Go start with the same database state (i.e. the same MVCC snapshot at the postgres level), but can't see each other's writes. Thus, anything new created by Go in this transaction has to be dumped by Go. We dump our newly-created chunkset row and newly-created snapshot row in this package, and dump the state of the tracker in the tracker package. (Ideally we would dump the storage.chunksets row in the storage package, but it would be a real mess for other callers, so the mess is contained here instead. storage.chunksets is just an integer id.)

All this complexity is to avoid any new data being added to the system between creating the chunkset and creating the database dump. If data were added between those two points, we'd potentially restore to an unrecoverable state. So we compromise and create + dump in the same transaction, but because we're using two different programs to do the work (pg_dump and pachd), the best we can do is start from the same postgres snapshot (https://www.postgresql.org/docs/current/sql-set-transaction.html). Unfortunately, snapshots just represent the state of the database at transaction start time and can't see changes made since then. So pg_dump is dumping from the start of our transaction, and we dump our own writes ourselves.

With all that in mind, I think this has been cleaned up significantly. It should be obvious when modifying the tracker what breaks. I have also made the test fail if chunksets don't actually work, so that should be another thing that catches un-snapshottable-with-this-code tracker changes.
Commits:
- dump and restore in a transaction
- run database migrations after restore
- test version compatibility; fix bug
- make snapshots atomic
- …ng a public interface to dump them

Force-pushed from cc061f9 to fe76302
This adds the ability to restore the database state to that contained in a snapshot. After restoring, the snapshot is still restorable, because we re-upload the database dump and adjust the snapshot to show the new ID. This makes it look like snapshots contain themselves, which is a nice feature ;)
To restore cleanly, we can't rely on queries running against a default schema (public), it seems, so I updated the migrations code to be explicit about the schema the queries are in.
`psql` is used to restore the dump, because that's what the `pg_dump` documentation says to do. I thought about parsing the SQL and sending one statement at a time, but it seems like something that is likely to go wrong in weird ways. As a result of this, `psql` is available in //tools/postgres/psql now, which should be convenient.

We use pg_dump v17 to dump any version of postgres. For various reasons, these dumps can only be restored onto postgres 17. The snapshotting code has been adjusted to remove the v17-specific functionality so that we can take snapshots of v13 and restore onto v13.
Finally, I adjusted the snapshotting code to atomically create the pachyderm snapshot and the postgres snapshot. This involves sharing the transaction between Go and pg_dump, which is possible (see pg_export_snapshot). We have to manually dump everything created in that transaction, though, which we now do. (They start reading from the same MVCC snapshot, but can't see each other's writes because postgres doesn't have READ UNCOMMITTED isolation. Thus, we make a list of everything we wrote and add it to the end of the dump file ourselves.) This fixes MLDM-143.
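The hand-off works because pg_dump has a real `--snapshot` flag that makes it read from a snapshot another session exported with `pg_export_snapshot()`, as long as the exporting transaction stays open. A minimal sketch of the wiring (the `pgDumpCmd` helper, the URL, and the argument set are illustrative assumptions, not the PR's actual code; the SQL in the comment is what the Go side would run over its open transaction):

```go
package main

import (
	"fmt"
	"os/exec"
)

// On the Go side, inside an open transaction, we would run:
//   BEGIN ISOLATION LEVEL REPEATABLE READ;
//   SELECT pg_export_snapshot();  -- returns an ID like "00000003-0000001B-1"
// and keep that transaction open while pg_dump executes.

// pgDumpCmd builds (but does not run) a pg_dump invocation that reads the
// exported snapshot, so both programs see the same database state.
func pgDumpCmd(dbURL, snapshotID string) *exec.Cmd {
	return exec.Command("pg_dump",
		"--snapshot="+snapshotID, // use the Go transaction's MVCC snapshot
		"--format=plain",         // plain SQL, restorable with psql
		dbURL,
	)
}

func main() {
	cmd := pgDumpCmd("postgres://localhost/pach", "00000003-0000001B-1")
	fmt.Println(cmd.Args[1])
}
```

Note that even with a shared snapshot, pg_dump still cannot see the Go transaction's uncommitted writes, which is why the COPY blocks for the new chunkset and snapshot rows are appended to the dump manually.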