Skip to content

Conversation

jsternberg
Copy link
Collaborator

@jsternberg jsternberg commented Aug 22, 2023

This adds a json progress mode to the progressui. This json mode, in
contrast to rawjson, produces an aggregated json output for each vertex
after the vertex has completed.

The primary use of this progress output is for machine tools to parse to
determine the details of the build. For raw events and interactive tooling,
rawjson is a better option due to providing more real time detail.

Related to #4113.

@jsternberg jsternberg force-pushed the json-buffered-printer branch 2 times, most recently from 9541c64 to 0a576fd Compare August 24, 2023 14:42
@jsternberg jsternberg force-pushed the json-buffered-printer branch from 0a576fd to 411eb36 Compare August 25, 2023 14:06
@jsternberg jsternberg marked this pull request as ready for review August 25, 2023 14:08
@jsternberg jsternberg force-pushed the json-buffered-printer branch from 411eb36 to aa6a522 Compare August 25, 2023 14:08
@jsternberg jsternberg force-pushed the json-buffered-printer branch 4 times, most recently from 7379d20 to c68e09a Compare September 12, 2023 19:40
}

func (d *jsonDisplay) update(ss *client.SolveStatus) {
d.t.update(ss, 80)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting 80 in here seems terminal-specific.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This number isn't used for non-tty displays. It's mostly an artifact of the previous implementation where trace was used for both the tty display and the plain display. The text display used 80 as a default just to give a value (which is unused) and I just carried that on.

Might be worth refactoring this code at some point just to reduce confusion for the implementation but it seemed a bit too large of a task for this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is unused, then just setting a zero value is ok for now.


// JSON format keeps the logs separated into their streams
// and takes the aggregated result. The trace does not do that
// so we use and update logMux for log messages by digest/stream.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to disable log behavior in d.t to avoid duplicates?

Copy link
Collaborator Author

@jsternberg jsternberg Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation should just ignore the logs from trace. The trace will still do the log splitting and aggregating stdout/stderr into the same stream, but this specific display just ignores that property. It would be possible for us to disable the log behavior to improve performance, but it would make that section of code more complicated and I think that section of code may already be a bit too complicated and in need of a refactor so, unless it's causing a documented performance problem, I think we should just ignore it.

It won't create a wrong answer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it means that we use double the memory then we should fix it. But if we don't need to store everything for JSON and can release the memory once vertex is printed then not needed until some benchmark proves that this is a bottleneck.


func (d *jsonDisplay) printVtx(v *vertex) {
intervals := make([]VertexInterval, len(v.mergedIntervals))
for i, interval := range v.mergedIntervals {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this prints same vertex multiple times with all intervals. The logic should be to print interval after it completes and if there is another interval later then it prints a new object but without the previous data.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presently have it so the output gets printed when the entire vertex completes which means we have the completed aggregate data from all of the intervals. Were we going to print a JSON object for each interval that completes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The vertex does complete, but you don't know if there will be another run for the same vertex (eg. some runs needs to happen for cache lookup and later for execution, but there are more complicated cases).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this. I didn't realize that the intervals were caused by a vertex completing and then starting again. We're going to print the json object when the vertex completes, erase any state associated with it, and print a new one if it runs again. Most builds this shouldn't happen with but there are some cases where a vertex can run multiple times.

}

type VertexInterval struct {
Start time.Time `json:"start"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the same names as the original types.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean use *time.Time? This structure is intended to mimic interval which is defined as:

type interval struct {
    start *time.Time
    stop *time.Time
}

I modified it to use a non-pointer because this version of the type should not have nullable versions of start and stop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I meant that in current types these fields are called Started and Completed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had an offline conversation regarding this. The inclusion of intervals at all shouldn't be done. We're going to split things up so each interval gets its own json object. That means this structure is just going to be removed.

@tonistiigi
Copy link
Member

Looking at this output, it seems to expose some weird output in progress stream that should be checked out in follow-up. Shouldn't block this PR but not sure if we want to expose this in buildx for example before these cases are handled:

  • all the locals like "load build definition from Dockerfile" have two intervals. I'm not sure now. Also it would be nice to mark the cache lookup interval somehow so it isn't confusing to the user.
  • The blob downloads are all in their own vertex. I know this started with introducing lazy pulls but I think it should be possible to find a way to detect that there are ongoing pulls and combine them. @sipsma

Additional comments:

  • The blob download always have one status atm. while in the progress stream we have the timestamped progress while the blob is pulled. I'm not sure if it makes sense to hide them.
  • As the log timestamps are relative, would it make sense to make vertex & status times relative as well. Eg. Started + duration instead of started and completed.

Here is an example of current output:
https://gist.github.com/tonistiigi/8e27054d2947da3e9c4a47a8103a3ac0

@AkihiroSuda PTAL as well

Detail []string `json:"detail,omitempty"`
URL string `json:"url,omitempty"`

SourceInfo *pb.SourceInfo `json:"sourceInfo,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we don't want to have this pb info in the output?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove this. I figured the source info might be useful and the protobuf structure is already annotated with the correct json markers.

continue
}

if v.isCompleted() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this again, would the case that all blobs are shown separately be solved if in here we instead check that all the mergedIntervals are completed, then in printVtx we combine the (unprinted) mergedIntervals and reset the list after printing to discard them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see, mergedIntervals only contains the one interval during this time. From what I see, we only know that the current vertex has a status that is completed and we don't see any in progress statuses at this point.

dt = entry.ts.Sub(*ival.start)
}
logs[stream] = append(logs[stream], VertexLog{
Timestamp: dt.Seconds(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use math.Round(n*1000)/1000 so we don't get too long decimals.

}

type digestLogs struct {
entries []*logEntry
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this be easier with just one field with type map[int][]*logEntry?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to remember why I did it this way and I'm drawing a blank so I'll change this to use a map.

return out
}

type logMux struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think calling this "mux" is confusing as this is just a map. You could argue that the digestLogs part does some multiplexing (but more like indexing).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can probably consolidate this with digestLogs and rename it. I'm just not the biggest fan of things like this:

map[digest.Digest]map[int][]*logEntry

This kind of structure is a bit confusing to follow even if it's correct but it's sometimes hard to think of good names for them.

@jsternberg
Copy link
Collaborator Author

The blob download always have one status atm. while in the progress stream we have the timestamped progress while the blob is pulled. I'm not sure if it makes sense to hide them.

I think it likely makes sense to only print the one status. The json progress indicator doesn't currently implement anything for in progress statuses. I don't think it makes sense to print out in progress statuses for the aggregated result. I do imagine it could be helpful in some capacity but it can always be an additive change at that point.

As the log timestamps are relative, would it make sense to make vertex & status times relative as well. Eg. Started + duration instead of started and completed.

My personal opinion is it's probably more simple to just have started and completed as both timestamps. It's easy enough to convert between them if needed.

@tonistiigi
Copy link
Member

@AkihiroSuda What do you think regarding timestamps vs timestamp+duration?

Other things I think we can look at follow-ups, but we wouldn't be able to change the fields.

logs := make([]VertexLog, 0, len(entries))
for _, entry := range entries {
var dt time.Duration
if ival := v.mostRecentInterval(); ival != nil && ival.start != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The need to do mostRecentInterval is somewhat confusing to me. These times should always be relative to v.Vertex.StartedAt. Otherwise, user would not be able to understand the meaning as ival.strart is never printed. Is it always guaranteed to be the same value?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. This was due to a misunderstanding on my part for what the intervals represented.

@AkihiroSuda
Copy link
Member

@AkihiroSuda What do you think regarding timestamps vs timestamp+duration?

The former one for machines, the latter one for humans.
Assuming that the JSON mode is for machines, the former one seems better after all.

@tonistiigi
Copy link
Member

Ok, but note that we already made log times relative to the object. We also have jsonraw that is only for machines. I think this is more for both cases.

Anyway, not blocking on it.

@jsternberg
Copy link
Collaborator Author

Based on the feedback, I'm going to update the Completed field on both the vertex and the status to Duration and convert it to a float64.

@jsternberg jsternberg force-pushed the json-buffered-printer branch from b021657 to 76b797d Compare September 15, 2023 18:40
@jsternberg jsternberg force-pushed the json-buffered-printer branch 2 times, most recently from 5ff9d29 to 90ad59b Compare September 15, 2023 20:52
This adds a json progress mode to the progressui. This json mode, in
contrast to `rawjson`, produces an aggregated json output for each
vertex after the vertex has completed.

The primary use of this progress output is for machine tools to parse
to determine the details of the build. For raw events and interactive
tooling, `rawjson` is a better option due to providing more real time
detail.

Signed-off-by: Jonathan A. Sternberg <jonathan.sternberg@docker.com>
@jsternberg jsternberg force-pushed the json-buffered-printer branch from 90ad59b to e27286a Compare September 18, 2023 14:46
@thompson-shaun
Copy link
Collaborator

Closing for now, if demand/design resurfaces we can re-open or link

@jsternberg jsternberg deleted the json-buffered-printer branch August 12, 2024 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants