Skip to content

Conversation

jrockway
Copy link
Contributor

@jrockway jrockway commented Oct 7, 2024

This adds the ability to restore the database state to that contained in a snapshot. After restoring, the snapshot is still restorable, because we re-upload the database dump and adjust the snapshot to show the new ID. This makes it look like snapshots contain themselves, which is a nice feature ;)

To restore cleanly, we can't rely on queries running against a default schema (public), it seems, so I updated the migrations code to be explicit about the schema the queries are in.

psql is used to restore the dump, because that's what the pg_dump documentation says to do. I thought about parsing the SQL and sending one statement at a time, but it seems like something that is likely to go wrong in weird ways. As a result of this, psql is available in //tools/postgres/psql now, which should be convenient.

We use pg_dump v17 to dump any version of postgres. For various reasons, these dumps can only be restored onto postgres 17. The snapshotting code has been adjusted to remove the v17-specific functionality so that we can take snapshots of v13 and restore onto v13.

Finally, I adjusted the snapshotting code to atomically create the pachyderm snapshot and the postgres snapshot. This involves sharing the transaction between Go and pg_dump which is possible (see pg_export_snapshot). We have to manually dump everything created in that transaction, though, which we now do. (The start READING the same MVCC snapshot, but can't see each other's writes because postgres doesn't have READ UNCOMMITTED isolation. Thus, we make a list of everything we wrote and add it to the end of the dump file ourselves.) This fixes MLDM-143.

Copy link

github-actions bot commented Oct 7, 2024

size-limit report 📦

Path Size
Entry 93.13 KB (0%)
Vendor XL 800.86 KB (0%)
Chunks 292.51 KB (0%)
Everything 1.16 MB (0%)

Copy link

codecov bot commented Oct 7, 2024

Codecov Report

Attention: Patch coverage is 79.23077% with 54 lines in your changes missing coverage. Please review.

Project coverage is 62.26%. Comparing base (8b6e714) to head (fe76302).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
src/internal/recovery/recovery.go 78.35% 42 Missing ⚠️
src/internal/storage/track/postgres_tracker.go 80.85% 9 Missing ⚠️
src/internal/storage/fileset/storage.go 76.92% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10373      +/-   ##
==========================================
+ Coverage   62.22%   62.26%   +0.04%     
==========================================
  Files        1232     1232              
  Lines       88407    88612     +205     
  Branches     1820     1820              
==========================================
+ Hits        55013    55177     +164     
- Misses      32543    32591      +48     
+ Partials      851      844       -7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ctx := pctx.TestContext(t)
db := dockertestenv.NewMigratedTestDB(t, clusterstate.DesiredClusterState)
tracker := track.NewPostgresTracker(db)
storage := fileset.NewStorage(fileset.NewPostgresStore(db), tracker, chunk.NewStorage(kv.NewMemStore(), nil, db, tracker))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a test constructor for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fileset.NewTestStorage only works against an unmigrated database, and can only create tables used by the storage system. This test wants to write to the recovery.snapshots table, but there is no way we can create that in a way that doesn't conflict with NewTestStorage. (Applying migrations prevents fileset.NewTestStorage from working.)

I definitely get why storage tests don't want all those tables around, but for this "dump the entire database" test, we want the entire database.

(I also like kv.NewMemStore() better than NewFSStore, though not for any real reason.)

return 0, errors.Wrap(err, "create snapshot row")
b.WriteRune('\n')
b.WriteString("COPY storage.chunksets (id) FROM stdin;\n")
fmt.Fprintf(&b, "%v\n", int64(chunksetID))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a log line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is the format of the database dump. We have to manually create it here because pg_dump can't see writes from this transaction even though its reads are in this transaction. The statement after COPY is basically a TSV file of the table content.

@brendoncarroll
Copy link
Contributor

brendoncarroll commented Oct 14, 2024

I'm concerned about the dumpTrackerRefs and dumpTrackerObjects accessing storage tables which may need to change in future releases. That information is included in the database snapshot, and we should get it back for free when the backup is restored.

I thought the whole point of having ChunkSets was to encapsulate that logic behind an API that could remind stable while the storage representation continued to change.

At the very least, those function should be exported from the storage package, so we know that they are a commitment to the recovery system.

Copy link
Contributor

@brendoncarroll brendoncarroll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to sort out the cross-system database queries. see my other comment.

@jrockway jrockway force-pushed the jonathan/mldm-107-restore-machinery branch from 335f732 to 81c72f7 Compare October 18, 2024 02:35
@jrockway
Copy link
Contributor Author

jrockway commented Oct 18, 2024

@brendoncarroll I have moved the storage dumping functionality to the storage and tracker package. The reason for this is that pg_dump and Go start with the same database state (i.e. the same MVCC snapshot at the postgres level), but can't see each other's writes. Thus, anything new created by Go in this transaction has to be dumped by Go. We dump our newly-created chunkset row and newly-created snapshot row in this package, and dump the state of the tracker in the tracker package.

(Ideally we would dump the storage.chunksets row in the storage package, but it would be a real mess for other callers, so the mess is contained here instead. storage.chunksets is just an integer id.)

All this complexity is to avoid any new data being added to the system between creating the chunkset and creating the database dump. If data were added between those two points, we'd potentially restore to an unrecoverable state. So we compromise and create + dump in the same transaction, but because we're using two different programs to do the work (pg_dump and pachd), the best we can do is starting from the same postgres snapshot (https://www.postgresql.org/docs/current/sql-set-transaction.html). Unfortunately snapshots just represent the state of the database at tx start time, but can't see changes made since then. So pg_dump is dumping from the start of our WithTx call, and that involves creating snapshot, chunkset, and tracker rows; we then dump those manually after pg_dump is done (since we can read our own writes); thus we end up with a simulation of doing the snapshot creation + dump in one transaction, even though that's not quite what happens.

With all that in mind, I think this has been cleaned up significantly. It should be obvious when modifying the tracker what breaks. I have also made the test fail if chunksets don't actually work, so that should be another thing that catches un-snapshottable-with-this-code tracker changes.

@jrockway jrockway force-pushed the jonathan/mldm-107-restore-machinery branch from cc061f9 to fe76302 Compare October 18, 2024 18:58
@jrockway jrockway merged commit ade7122 into master Oct 18, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants