Skip to content

Conversation

Sushisource
Copy link
Member

@Sushisource Sushisource commented Feb 14, 2025

Closes temporalio/sdk-go#1811

What changed?
Title. More reliable way of discovering everything

Why?
Avoid missing newly added top-level paths to payloads/failures

How did you test it?
Existing tests / comparing generated output.

Potential risks
Too much visiting!

@Sushisource Sushisource requested review from a team as code owners February 14, 2025 00:59
@Sushisource Sushisource marked this pull request as draft February 14, 2025 00:59
@Sushisource Sushisource force-pushed the proxy-generator-improvement branch from 0ae9bbd to 452e2e7 Compare February 14, 2025 01:01
Comment on lines +623 to +624
case *errordetails.MultiOperationExecutionFailure:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some things did, in fact, get missed!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if the proxy ever look into error details though...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? It won't call with them? If so, oh well, no harm having it here anyway since they do contain payloads

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah because these are part of the error

It won't call with them? If so, oh well

Well the consequence is a payload or failure that should be encoded would not be so it is kinda important no?

Copy link
Member Author

@Sushisource Sushisource Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see what you mean now.

Well that's because it's not referenced by any other proto. In fact I don't understand this at all:

    // ExecuteMultiOperation executes multiple operations within a single workflow.
    //
    // Operations are started atomically, meaning if *any* operation fails to be started, none are,
    // and the request fails. Upon start, the API returns only when *all* operations have a response.
    //
    // Upon failure, it returns `MultiOperationExecutionFailure` where the status code
    // equals the status code of the *first* operation that failed to be started.
    //
    // NOTE: Experimental API.
    rpc ExecuteMultiOperation (ExecuteMultiOperationRequest) returns (ExecuteMultiOperationResponse) {

That comment is the only place it's referenced

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is part of the error details https://grpc.io/docs/guides/error/#richer-error-model

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proxy doesn't look at errors at all right now AFAIK @cretz please correct me if I am wrong

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not saying you need to fix this as part of this PR, but it is another gap I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, OK that makes sense. Yeah, I agree that is a possible gap. I also think probably another PR.

Copy link
Member

@cretz cretz Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proxy doesn't look at errors at all right now AFAIK @cretz please correct me if I am wrong

Correct, ref

if err != nil {
return err
}
. The only cases I am aware of where user-provided payloads can be in an error is ExecuteMultiOperation and QueryWorkflow. I agree it is a hole in our logic in the interceptor that we should fix (walk the gRPC status details). I have opened temporalio/sdk-go#1825. But yes we should keep them in this switch statement of course.

if err := visitFailures(
ctx,
options,
o.GetUserMetadata(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetUserMetadata doesn't contain a failure so why would we need to search it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like maybe some of these are still reporting as being needed to search because they have payloads rather than failures, I'll see if that's it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok these were fixed by removing the hardcoded check for payloads in walk

@Sushisource
Copy link
Member Author

Sushisource commented Feb 14, 2025

I changed the make stuff to generate the proto descriptors on the fly since checking in a 1mb file is annoying. However that would mean adding proto to CI which isn't there currently. Should we just add it?

Discussed @ SDK standup. No objections to adding protoc to CI

@Sushisource Sushisource marked this pull request as ready for review February 14, 2025 17:04
@Sushisource Sushisource force-pushed the proxy-generator-improvement branch from 441650a to 062634e Compare February 14, 2025 18:21
Comment on lines +623 to +624
case *errordetails.MultiOperationExecutionFailure:

Copy link
Member

@cretz cretz Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proxy doesn't look at errors at all right now AFAIK @cretz please correct me if I am wrong

Correct, ref

if err != nil {
return err
}
. The only cases I am aware of where user-provided payloads can be in an error is ExecuteMultiOperation and QueryWorkflow. I agree it is a hole in our logic in the interceptor that we should fix (walk the gRPC status details). I have opened temporalio/sdk-go#1825. But yes we should keep them in this switch statement of course.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmm, I didn't dig, but why are these files appearing now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to regenerate these to work properly with the updates to the proto lib

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmm, ok. I forget, but I guess as part of the the tests we added the generated proto code in we never committed the protos themselves?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly

Comment on lines +679 to +680
if resultType.String() == types.NewPointer(directMatchType).String() {
hasDirectMatch = true
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't love that I had to do this but I could not for the life of me get reflection to be equal properly

@Sushisource Sushisource merged commit 306db99 into master Feb 18, 2025
4 checks passed
@Sushisource Sushisource deleted the proxy-generator-improvement branch February 18, 2025 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[api-go] Code gen for payload visiting should walk all descriptors instead of workflow service reachability
3 participants