Skip to content

Enable verbose & progress logging for pg_basebackup #665

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 3, 2019

Conversation

hmac
Copy link
Contributor

@hmac hmac commented Jun 5, 2019

These two changes cause pg_basebackup to log the names of the files it is processing, along with the estimated progress through the entire backup. For large databases this can be very useful in estimating how long the backup will take.

@sgotti
Copy link
Member

sgotti commented Jun 6, 2019

@hmac Thanks for your PR. I think it's a good idea to show progress.

The unique small issue it that progress reporting on a tty uses line feed control char to update its progress state (rewrite the current line without carriage return) so it'll probably mess up/remove some concurrent logging output from the keeper. This doesn't happen when not on a tty (pg_basebackup detects if stdout is a tty or not).

One possible solution could be to not set pg_basebackup stdout/stderr to the same fd of the keeper but pipe them to a goroutine that will log them.

@hmac hmac force-pushed the hmac/pgbasebackup-options branch from fce17c5 to 6239021 Compare June 7, 2019 08:27
return fmt.Errorf("error: %v", err)
}

cmd.Start()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing err check

// stderr.
stderr, err := cmd.StderrPipe()
if err != nil {
return fmt.Errorf("error: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's no additional context to add to err just return err

I saw that in this file we are doing return fmt.Errorf("error: %v", err) on cmd.Run() calls. This was an oversight. It should be a better context (like failed to execute command). So if you did the same here because you saw that pattern, sorry since I've tricked you.

return fmt.Errorf("error: %v", err)
}

if err := cmd.Wait(); err != nil {
return fmt.Errorf("error: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's no additional context to add to err just return err


cmd.Start()
if _, err := io.Copy(os.Stderr, stderr); err != nil {
return fmt.Errorf("error: %v", err)
Copy link
Member

@sgotti sgotti Jun 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning an error here (also if it's very hard that this will happen) will keep a zombie process since cmd.Wait() will not be called.

I'll prefer to execute the io.Copy inside a go routine so the program function can go ahead and block on cmd.Wait(). The goroutine will ignore possible io.Copy errors since they're not really important.

@hmac hmac force-pushed the hmac/pgbasebackup-options branch from 6239021 to 48bf1e3 Compare June 7, 2019 11:06
@hmac
Copy link
Contributor Author

hmac commented Jun 7, 2019

@sgotti thanks for the review and apologies for the sloppy code! I'm still getting used to Go 😅

What I've done is pass a context to the exec.Command and cancel it when we return from the function, to ensure that it gets cleaned up. We should then block on io.Copy until the process completes, but we'll be streaming to stderr the whole time so we'll still get incremental progress logs.

I think this covers all your points, but please let me know if I've missed anything!

@sgotti
Copy link
Member

sgotti commented Jun 7, 2019

What I've done is pass a context to the exec.Command and cancel it when we return from the function, to ensure that it gets cleaned up. We should then block on io.Copy until the process completes, but we'll be streaming to stderr the whole time so we'll still get incremental progress logs.

passing a context to exec.Command is used to kill the command when the context expires but it won't implicitly wait for it so it'll leave a zombie process anyway. So using a context are is not useful and confusing.

If you want to avoid adding a goroutine then a solution is to do the io.Copy like you're doing now. io.Copy will return without an error when the process exits and then the function will pass to the cmd.Wait. If io.Copy returns an error then just log it without returning so cmd.Wait() will be called anyway.

@hmac
Copy link
Contributor Author

hmac commented Jun 7, 2019

passing a context to exec.Command is used to kill the command when the context expires but it won't implicitly wait for it so it'll leave a zombie process anyway. So using a context are is not useful and confusing.

Just for my own understanding, do you mean that the process will be killed, but this might take some time and meanwhile stolon would have carried on?

I think your suggestion of using a goroutine makes sense since there's obviously a lot of nuance here. I'll make that change shortly.

@sgotti
Copy link
Member

sgotti commented Jun 7, 2019

Just for my own understanding, do you mean that the process will be killed, but this might take some time and meanwhile stolon would have carried on?

At the lower level killing a process is a kill syscall that won't block so the function could exit while the process is still alive. That's why you must always call cmd.Wait to be sure that the process is exited, be able to get its exit code and avoid having a zombie child process.

But what I was trying to says is that the unique purpose of attaching a context to a cmd is to have the process killed when it expires. But that's not our need so a context is not needed.

@hmac hmac force-pushed the hmac/pgbasebackup-options branch from 48bf1e3 to aef9ae1 Compare June 7, 2019 12:37
@hmac hmac force-pushed the hmac/pgbasebackup-options branch 2 times, most recently from 3131b7f to a4f4ef8 Compare June 14, 2019 14:00
@hmac
Copy link
Contributor Author

hmac commented Jun 14, 2019

@sgotti thank you for your comments so far and sorry this PR has taken so long! I've updated the changes to match what I think you intended - does it look like the right approach?

Copy link
Member

@sgotti sgotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hmac Thanks! Just a small nit. Can you then squash your commits in a single one?

if err := cmd.Start(); err != nil {
return err
}
go io.Copy(os.Stderr, stderr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll also log the io.Copy error. Just in case.

This will cause pg_basebackup to output some extra steps during startup
and shutdown. If progress reporting is also enabled (-P) then it will
show the exact file name that is currently being processed. This can be
useful for debugging and monitoring the backup process.

Combined with verbose mode (-v) this will give detailed information on
what files are being processed and how far through the backup we are.
There's a slight initial time cost introduced as Postgres has to
determine the total database size, but this should be minimal compared
to the backup time.

Co-authored-by: me@lawrencejones.dev
Co-authored-by: harrymaclean@gmail.com
@lawrencejones lawrencejones force-pushed the hmac/pgbasebackup-options branch from a4f4ef8 to 393b512 Compare June 18, 2019 14:08
@lawrencejones
Copy link
Contributor

io.Copy should now be logged @sgotti. Hope this now looks good?

@sgotti
Copy link
Member

sgotti commented Jul 3, 2019

@hmac @lawrencejones Thanks! Merging

@sgotti sgotti merged commit ed435bd into sorintlab:master Jul 3, 2019
@sgotti sgotti added this to the v0.14.0 milestone Jul 26, 2019
@lawrencejones lawrencejones deleted the hmac/pgbasebackup-options branch September 11, 2019 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants