Enable verbose & progress logging for pg_basebackup #665
Conversation
@hmac Thanks for your PR. I think it's a good idea to show progress. The one small issue is that progress reporting on a tty uses a carriage return control character to update its progress state (rewriting the current line in place without a line feed), so it'll probably mess up or overwrite some concurrent logging output from the keeper. This doesn't happen when not on a tty (pg_basebackup detects whether stdout is a tty or not). One possible solution could be to not set pg_basebackup's stdout/stderr to the same fds as the keeper, but instead pipe them to a goroutine that will log them.
(force-pushed fce17c5 to 6239021)
internal/postgresql/postgresql.go
Outdated
	return fmt.Errorf("error: %v", err)
}

cmd.Start()
missing err check
internal/postgresql/postgresql.go
Outdated
// stderr.
stderr, err := cmd.StderrPipe()
if err != nil {
	return fmt.Errorf("error: %v", err)
If there's no additional context to add to err, just return err.
I saw that in this file we are doing return fmt.Errorf("error: %v", err)
on cmd.Run() calls. This was an oversight; it should have better context (like failed to execute command). So if you did the same here because you saw that pattern, sorry, I've misled you.
internal/postgresql/postgresql.go
Outdated
	return fmt.Errorf("error: %v", err)
}

if err := cmd.Wait(); err != nil {
	return fmt.Errorf("error: %v", err)
If there's no additional context to add to err just return err
internal/postgresql/postgresql.go
Outdated
cmd.Start()
if _, err := io.Copy(os.Stderr, stderr); err != nil {
	return fmt.Errorf("error: %v", err)
Returning an error here (even though it's very unlikely to happen) will leave a zombie process, since cmd.Wait() will not be called.
I'd prefer to execute the io.Copy inside a goroutine so the function can go ahead and block on cmd.Wait(). The goroutine can ignore possible io.Copy errors since they're not really important.
(force-pushed 6239021 to 48bf1e3)
@sgotti thanks for the review and apologies for the sloppy code! I'm still getting used to Go 😅 What I've done is pass a context to the command. I think this covers all your points, but please let me know if I've missed anything!
Passing a context to exec.Command is used to kill the command when the context expires, but it won't implicitly wait for it, so it'll leave a zombie process anyway. Using a context here is therefore not useful and is confusing. If you want to avoid adding a goroutine, then one solution is to do the io.Copy like you're doing now: io.Copy will return without an error when the process exits, and then the function will proceed to cmd.Wait. If io.Copy returns an error, just log it without returning, so cmd.Wait() will be called anyway.
Just for my own understanding, do you mean that the process will be killed, but this might take some time, and meanwhile stolon would have carried on? I think your suggestion of using a goroutine makes sense since there's obviously a lot of nuance here. I'll make that change shortly.
At the lower level, killing a process is a kill syscall that won't block, so the function could exit while the process is still alive. That's why you must always call cmd.Wait: to be sure the process has exited, to get its exit code, and to avoid leaving a zombie child process. But what I was trying to say is that the sole purpose of attaching a context to a cmd is to have the process killed when the context expires. That's not our need here, so a context is not needed.
(force-pushed 48bf1e3 to aef9ae1)
(force-pushed 3131b7f to a4f4ef8)
@sgotti thank you for your comments so far, and sorry this PR has taken so long! I've updated the changes to match what I think you intended; does it look like the right approach?
@hmac Thanks! Just a small nit. Can you then squash your commits in a single one?
internal/postgresql/postgresql.go
Outdated
if err := cmd.Start(); err != nil {
	return err
}
go io.Copy(os.Stderr, stderr)
I'd also log the io.Copy error, just in case.
This will cause pg_basebackup to output some extra steps during startup and shutdown. If progress reporting is also enabled (-P), then it will show the exact file name that is currently being processed. This can be useful for debugging and monitoring the backup process. Combined with verbose mode (-v), this gives detailed information on what files are being processed and how far through the backup we are. There's a slight initial time cost introduced, as Postgres has to determine the total database size, but this should be minimal compared to the backup time.

Co-authored-by: me@lawrencejones.dev
Co-authored-by: harrymaclean@gmail.com
(force-pushed a4f4ef8 to 393b512)
@hmac @lawrencejones Thanks! Merging.
These two changes cause pg_basebackup to log the names of the files it is processing, along with the estimated progress through the entire backup. For large databases this can be very useful in estimating how long the backup will take.