Enable verbose & progress logging for pg_basebackup #665

hmac · 2019-06-05T11:13:14Z

These two changes cause pg_basebackup to log the names of the files it is processing, along with the estimated progress through the entire backup. For large databases this can be very useful in estimating how long the backup will take.
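For readers unfamiliar with the flags, here is a minimal sketch of the kind of invocation being discussed. It assumes the two changes correspond to pg_basebackup's `-v` (verbose) and `-P` (progress) options; `baseBackupCmd` and `dataDir` are hypothetical names for illustration, not the keeper's actual code.

```go
package main

import "os/exec"

// baseBackupCmd builds a pg_basebackup invocation with verbose (-v) and
// progress (-P) output enabled. dataDir is a hypothetical target directory
// passed with -D; the real keeper passes additional options not shown here.
func baseBackupCmd(dataDir string) *exec.Cmd {
	return exec.Command("pg_basebackup", "-v", "-P", "-D", dataDir)
}
```

The command is only constructed here, not started, so the sketch does not need pg_basebackup to be installed.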

sgotti · 2019-06-06T13:12:49Z

@hmac Thanks for your PR. I think it's a good idea to show progress.

The only small issue is that progress reporting on a tty uses the carriage return control char to update its progress state (rewriting the current line without a line feed), so it'll probably mess up or overwrite some concurrent logging output from the keeper. This doesn't happen when not on a tty (pg_basebackup detects whether its output is a tty or not).

One possible solution could be to not set pg_basebackup's stdout/stderr to the same fds as the keeper, but to pipe them to a goroutine that will log them.

sgotti · 2019-06-07T08:49:16Z

internal/postgresql/postgresql.go

+		return fmt.Errorf("error: %v", err)
+	}
+
+	cmd.Start()


missing err check

sgotti · 2019-06-07T08:51:04Z

internal/postgresql/postgresql.go

+	// stderr.
+	stderr, err := cmd.StderrPipe()
+	if err != nil {
+		return fmt.Errorf("error: %v", err)


If there's no additional context to add to err just return err

I saw that in this file we are doing return fmt.Errorf("error: %v", err) on cmd.Run() calls. That was an oversight. It should carry better context (like "failed to execute command"). So if you did the same here because you saw that pattern, sorry for misleading you.

sgotti · 2019-06-07T08:51:26Z

internal/postgresql/postgresql.go

+		return fmt.Errorf("error: %v", err)
+	}
+
+	if err := cmd.Wait(); err != nil {
 		return fmt.Errorf("error: %v", err)


If there's no additional context to add to err just return err

sgotti · 2019-06-07T09:01:05Z

internal/postgresql/postgresql.go

+
+	cmd.Start()
+	if _, err := io.Copy(os.Stderr, stderr); err != nil {
+		return fmt.Errorf("error: %v", err)


Returning an error here (even if it's very unlikely to happen) will leave a zombie process, since cmd.Wait() will not be called.

I'd prefer to run the io.Copy inside a goroutine so the function can go ahead and block on cmd.Wait(). The goroutine can ignore possible io.Copy errors since they're not really important.

hmac · 2019-06-07T11:10:39Z

@sgotti thanks for the review and apologies for the sloppy code! I'm still getting used to Go 😅

What I've done is pass a context to the exec.Command and cancel it when we return from the function, to ensure that it gets cleaned up. We should then block on io.Copy until the process completes, but we'll be streaming to stderr the whole time so we'll still get incremental progress logs.

I think this covers all your points, but please let me know if I've missed anything!

sgotti · 2019-06-07T11:41:36Z

> What I've done is pass a context to the exec.Command and cancel it when we return from the function, to ensure that it gets cleaned up. We should then block on io.Copy until the process completes, but we'll be streaming to stderr the whole time so we'll still get incremental progress logs.

Passing a context to the command only serves to kill the command when the context expires; it won't implicitly wait for it, so it'll leave a zombie process anyway. So using a context is not useful here and is confusing.

If you want to avoid adding a goroutine, then a solution is to do the io.Copy like you're doing now. io.Copy will return without an error when the process exits, and then the function will proceed to cmd.Wait. If io.Copy returns an error, just log it without returning so that cmd.Wait() is called anyway.

hmac · 2019-06-07T12:14:19Z

> passing a context to exec.Command is used to kill the command when the context expires but it won't implicitly wait for it so it'll leave a zombie process anyway. So using a context is not useful and confusing.

Just for my own understanding, do you mean that the process will be killed, but this might take some time and meanwhile stolon would have carried on?

I think your suggestion of using a goroutine makes sense since there's obviously a lot of nuance here. I'll make that change shortly.

sgotti · 2019-06-07T12:32:13Z

> Just for my own understanding, do you mean that the process will be killed, but this might take some time and meanwhile stolon would have carried on?

At the lower level, killing a process is a kill syscall that doesn't block, so the function could return while the process is still alive. That's why you must always call cmd.Wait: to be sure that the process has exited, to be able to get its exit code, and to avoid leaving a zombie child process.

But what I was trying to say is that the only purpose of attaching a context to a cmd is to have the process killed when the context expires. That's not what we need here, so a context is not needed.

hmac · 2019-06-14T14:01:43Z

@sgotti thank you for your comments so far and sorry this PR has taken so long! I've updated the changes to match what I think you intended - does it look like the right approach?

sgotti

@hmac Thanks! Just a small nit. Can you then squash your commits into a single one?

sgotti · 2019-06-15T09:24:31Z

internal/postgresql/postgresql.go

+	if err := cmd.Start(); err != nil {
+		return err
+	}
+	go io.Copy(os.Stderr, stderr)


I'd also log the io.Copy error. Just in case.

This will cause pg_basebackup to output some extra steps during startup and shutdown. If progress reporting is also enabled (-P) then it will show the exact file name that is currently being processed. This can be useful for debugging and monitoring the backup process.

Combined with verbose mode (-v) this will give detailed information on what files are being processed and how far through the backup we are. There's a slight initial time cost introduced as Postgres has to determine the total database size, but this should be minimal compared to the backup time.

Co-authored-by: me@lawrencejones.dev
Co-authored-by: harrymaclean@gmail.com

lawrencejones · 2019-06-18T14:09:12Z

The io.Copy error should now be logged @sgotti. Hope this now looks good?

sgotti · 2019-07-03T13:28:45Z

@hmac @lawrencejones Thanks! Merging

hmac force-pushed the hmac/pgbasebackup-options branch from fce17c5 to 6239021 Compare June 7, 2019 08:27

sgotti requested changes Jun 7, 2019

hmac force-pushed the hmac/pgbasebackup-options branch from 6239021 to 48bf1e3 Compare June 7, 2019 11:06

hmac force-pushed the hmac/pgbasebackup-options branch from 48bf1e3 to aef9ae1 Compare June 7, 2019 12:37

hmac force-pushed the hmac/pgbasebackup-options branch 2 times, most recently from 3131b7f to a4f4ef8 Compare June 14, 2019 14:00

sgotti approved these changes Jun 15, 2019

sgotti requested changes Jun 15, 2019

lawrencejones force-pushed the hmac/pgbasebackup-options branch from a4f4ef8 to 393b512 Compare June 18, 2019 14:08

sgotti approved these changes Jul 3, 2019

sgotti merged commit ed435bd into sorintlab:master Jul 3, 2019

sgotti added this to the v0.14.0 milestone Jul 26, 2019

lawrencejones deleted the hmac/pgbasebackup-options branch September 11, 2019 16:59
