-
Notifications
You must be signed in to change notification settings - Fork 1.3k
progressui: json mode for a buffered json output #4165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9541c64
to
0a576fd
Compare
0a576fd
to
411eb36
Compare
411eb36
to
aa6a522
Compare
7379d20
to
c68e09a
Compare
} | ||
|
||
func (d *jsonDisplay) update(ss *client.SolveStatus) { | ||
d.t.update(ss, 80) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting 80
in here seems terminal-specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This number isn't used for non-tty displays. It's mostly an artifact of the previous implementation where trace
was used for both the tty display and the plain display. The text display used 80 as a default just to give a value (which is unused) and I just carried that on.
Might be worth refactoring this code at some point just to reduce confusion for the implementation but it seemed a bit too large of a task for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is unused, then just setting a zero value is ok for now.
|
||
// JSON format keeps the logs separated into their streams | ||
// and takes the aggregated result. The trace does not do that | ||
// so we use and update logMux for log messages by digest/stream. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to disable log behavior in d.t
to avoid duplicates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation should just ignore the logs from trace
. The trace
will still do the log splitting and aggregating stdout/stderr into the same stream, but this specific display just ignores that property. It would be possible for us to disable the log behavior to improve performance, but it would make that section of code more complicated and I think that section of code may already be a bit too complicated and in need of a refactor so, unless it's causing a documented performance problem, I think we should just ignore it.
It won't create a wrong answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it means that we use double the memory then we should fix it. But if we don't need to store everything for JSON and can release the memory once vertex is printed then not needed until some benchmark proves that this is a bottleneck.
util/progress/progressui/display.go
Outdated
|
||
func (d *jsonDisplay) printVtx(v *vertex) { | ||
intervals := make([]VertexInterval, len(v.mergedIntervals)) | ||
for i, interval := range v.mergedIntervals { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this prints same vertex multiple times with all intervals. The logic should be to print interval after it completes and if there is another interval later then it prints a new object but without the previous data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presently have it so the output gets printed when the entire vertex completes which means we have the completed aggregate data from all of the intervals. Were we going to print a JSON object for each interval that completes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The vertex does complete, but you don't know if there will be another run for the same vertex (eg. some runs needs to happen for cache lookup and later for execution, but there are more complicated cases).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this. I didn't realize that the intervals were caused by a vertex completing and then starting again. We're going to print the json object when the vertex completes, erase any state associated with it, and print a new one if it runs again. Most builds this shouldn't happen with but there are some cases where a vertex can run multiple times.
util/progress/progressui/json.go
Outdated
} | ||
|
||
type VertexInterval struct { | ||
Start time.Time `json:"start"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the same names as the original types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean use *time.Time
? This structure is intended to mimic interval
which is defined as:
type interval struct {
start *time.Time
stop *time.Time
}
I modified it to use a non-pointer because this version of the type should not have nullable versions of start and stop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I meant that in current types these fields are called Started
and Completed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had an offline conversation regarding this. The inclusion of intervals at all shouldn't be done. We're going to split things up so each interval gets its own json object. That means this structure is just going to be removed.
c68e09a
to
fd7023d
Compare
Looking at this output, it seems to expose some weird output in progress stream that should be checked out in follow-up. Shouldn't block this PR but not sure if we want to expose this in buildx for example before these cases are handled:
Additional comments:
Here is an example of current output: @AkihiroSuda PTAL as well |
util/progress/progressui/json.go
Outdated
Detail []string `json:"detail,omitempty"` | ||
URL string `json:"url,omitempty"` | ||
|
||
SourceInfo *pb.SourceInfo `json:"sourceInfo,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we don't want to have this pb
info in the output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can remove this. I figured the source info might be useful and the protobuf structure is already annotated with the correct json markers.
continue | ||
} | ||
|
||
if v.isCompleted() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this again, would the case that all blobs are shown separately be solved if in here we instead check that all the mergedIntervals
are completed, then in printVtx
we combine the (unprinted) mergedIntervals
and reset the list after printing to discard them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can see, mergedIntervals
only contains the one interval during this time. From what I see, we only know that the current vertex has a status that is completed and we don't see any in progress statuses at this point.
util/progress/progressui/display.go
Outdated
dt = entry.ts.Sub(*ival.start) | ||
} | ||
logs[stream] = append(logs[stream], VertexLog{ | ||
Timestamp: dt.Seconds(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use math.Round(n*1000)/1000
so we don't get too long decimals.
util/progress/progressui/display.go
Outdated
} | ||
|
||
type digestLogs struct { | ||
entries []*logEntry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this be easier with just one field with type map[int][]*logEntry
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to remember why I did it this way and I'm drawing a blank so I'll change this to use a map.
util/progress/progressui/display.go
Outdated
return out | ||
} | ||
|
||
type logMux struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think calling this "mux" is confusing as this is just a map. You could argue that the digestLogs
part does some multiplexing (but more like indexing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can probably consolidate this with digestLogs
and rename it. I'm just not the biggest fan of things like this:
map[digest.Digest]map[int][]*logEntry
This kind of structure is a bit confusing to follow even if it's correct but it's sometimes hard to think of good names for them.
I think it likely makes sense to only print the one status. The json progress indicator doesn't currently implement anything for in progress statuses. I don't think it makes sense to print out in progress statuses for the aggregated result. I do imagine it could be helpful in some capacity but it can always be an additive change at that point.
My personal opinion is it's probably more simple to just have started and completed as both timestamps. It's easy enough to convert between them if needed. |
86b60fc
to
b021657
Compare
@AkihiroSuda What do you think regarding timestamps vs timestamp+duration? Other things I think we can look at follow-ups, but we wouldn't be able to change the fields. |
util/progress/progressui/display.go
Outdated
logs := make([]VertexLog, 0, len(entries)) | ||
for _, entry := range entries { | ||
var dt time.Duration | ||
if ival := v.mostRecentInterval(); ival != nil && ival.start != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The need to do mostRecentInterval
is somewhat confusing to me. These times should always be relative to v.Vertex.StartedAt
. Otherwise, user would not be able to understand the meaning as ival.strart
is never printed. Is it always guaranteed to be the same value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. This was due to a misunderstanding on my part for what the intervals represented.
The former one for machines, the latter one for humans. |
Ok, but note that we already made log times relative to the object. We also have Anyway, not blocking on it. |
Based on the feedback, I'm going to update the |
b021657
to
76b797d
Compare
5ff9d29
to
90ad59b
Compare
This adds a json progress mode to the progressui. This json mode, in contrast to `rawjson`, produces an aggregated json output for each vertex after the vertex has completed. The primary use of this progress output is for machine tools to parse to determine the details of the build. For raw events and interactive tooling, `rawjson` is a better option due to providing more real time detail. Signed-off-by: Jonathan A. Sternberg <jonathan.sternberg@docker.com>
90ad59b
to
e27286a
Compare
Closing for now, if demand/design resurfaces we can re-open or link |
This adds a json progress mode to the progressui. This json mode, in
contrast to
rawjson
, produces an aggregated json output for each vertexafter the vertex has completed.
The primary use of this progress output is for machine tools to parse to
determine the details of the build. For raw events and interactive tooling,
rawjson
is a better option due to providing more real time detail.Related to #4113.