Skip to content

Conversation

Shivs11
Copy link
Member

@Shivs11 Shivs11 commented Jan 28, 2025

What changed?

  • ListWorkerDeployments API
  • Also removed presence of locks when adding a version to a worker-deployment workflow. Causes deadlocks and we don't need them.

Why?

  • Versioning-3.1

How did you test it?

  • Added functional tests (happy-path)
  • Existing suite

Potential risks

  • None, feature

Documentation

Is hotfix candidate?

  • No

Comment on lines -661 to -672
// userData, _, err := c.partitionMgr.GetUserDataManager().GetUserData()
// if err != nil {
// return err
// }

// deploymentVersionData := userData.GetData().GetPerType()[int32(c.queue.TaskType())].GetDeploymentVersionData()
// if findDeploymentVersion(deploymentVersionData, workerDeployment.SeriesName, workerDeployment.BuildId) >= 0 {
// // already registered in user data, we can assume the workflow is running.
// // TODO: consider replication scenarios where user data is replicated before
// // the deployment workflow.
// return nil
// }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had forgotten to uncomment this and the next block of code - important since it basically serves as a de-dup mechanism for multiple polls coming in with the same worker-deployment information!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch!

@Shivs11 Shivs11 marked this pull request as ready for review January 28, 2025 14:50
@Shivs11 Shivs11 requested a review from a team as a code owner January 28, 2025 14:50
request.PageSize = maxPageSize
}

resp, nextPageToken, err := wh.workerDeploymentClient.ListWorkerDeployments(ctx, namespaceEntry, int(request.PageSize), request.NextPageToken)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's worth changing for all of these APIs, but personally I find it easier to write code / keep up with shifting parameters if the input to my helper function (in this case workerDeploymentClient.ListWorkerDeployments) is just the proto request object + namespaceEntry. Then all the necessary parameters are already present.

Up to you though and I think not worth changing now, just letting you know for the future that it's totally fine to pass the whole request object if that's easier for you.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was a design decision which was followed since the pre-release days and I didn't wanna alter it - fwiw, I agree with you - I shall keep things as is for now but not repeat the same thing in future

Comment on lines -661 to -672
// userData, _, err := c.partitionMgr.GetUserDataManager().GetUserData()
// if err != nil {
// return err
// }

// deploymentVersionData := userData.GetData().GetPerType()[int32(c.queue.TaskType())].GetDeploymentVersionData()
// if findDeploymentVersion(deploymentVersionData, workerDeployment.SeriesName, workerDeployment.BuildId) >= 0 {
// // already registered in user data, we can assume the workflow is running.
// // TODO: consider replication scenarios where user data is replicated before
// // the deployment workflow.
// return nil
// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch!

Comment on lines 698 to 700
deploymentVersions := userData.GetData().GetPerType()[int32(c.queue.TaskType())].GetDeploymentData()
if findDeploymentVersion(deploymentVersions, worker_versioning.DeploymentVersionFromDeployment(workerDeployment)) >= 0 {
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
deploymentVersions := userData.GetData().GetPerType()[int32(c.queue.TaskType())].GetDeploymentData()
if findDeploymentVersion(deploymentVersions, worker_versioning.DeploymentVersionFromDeployment(workerDeployment)) >= 0 {
break
deploymentData := userData.GetData().GetPerType()[int32(c.queue.TaskType())].GetDeploymentData()
if findDeploymentVersion(deploymentData, worker_versioning.DeploymentVersionFromDeployment(workerDeployment)) >= 0 {
break

This is still deploymentData, it just has a Deployments field and a Versions field now.

select {
case <-userDataChanged:
case <-ctx.Done():
c.logger.Error("timed out waiting for deployment to appear in user data")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
c.logger.Error("timed out waiting for deployment to appear in user data")
c.logger.Error("timed out waiting for worker deployment version to appear in user data")

Is this more accurate?

@@ -539,7 +588,7 @@ func (d *ClientImpl) AddVersionToWorkerDeployment(
outcome, err := d.updateWithStart(
ctx,
namespaceEntry,
WorkerDeploymentVersionWorkflowType,
WorkerDeploymentWorkflowType,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch -- although I'm confused why our tests didn't catch this. Seems like sending an update to the wrong WF type should have always failed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know - I wondered on this too....

@@ -44,7 +43,7 @@ const (
WorkerDeploymentWorkflowType = "temporal-sys-worker-deployment-workflow"

// Namespace division
WorkerDeploymentVersionNamespaceDivision = "TemporalWorkerDeploymentVersion"
WorkerDeploymentVersionNamespaceDivision = "TemporalWorkerDeployment"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this variable be called WorkerDeploymentNamespaceDivision ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the naming with "version" is left over from when "Deployment" (instance) was the "first-class-citizen" of this whole API

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, yeah you are right - changing this to be called WorkerDeploymentNamespaceDivision

Comment on lines -241 to -251
// add version to worker-deployment workflow
activityCtx = workflow.WithActivityOptions(ctx, defaultActivityOptions)
err = workflow.ExecuteActivity(activityCtx, d.a.AddVersionToWorkerDeployment, &deploymentspb.AddVersionToWorkerDeploymentRequest{
DeploymentName: d.VersionState.DeploymentName,
Version: d.VersionState.Version,
RequestId: d.newUUID(ctx),
}).Get(ctx, nil)
if err != nil {
return err
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this deleted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah sorry - we were adding the same version twice to the worker-deployment workflow (we were doing this twice)!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see the duplicate code above -- got it

@@ -115,7 +117,7 @@ func (d *WorkflowRunner) run(ctx workflow.Context) error {
}

func (d *WorkflowRunner) validateSetCurrent(args *deploymentspb.SetCurrentVersionArgs) error {
if d.State.CurrentVersion != args.Version {
if d.State.RoutingInfo.CurrentVersion != args.Version {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think it's more idiomatic in Go, and more readable, to structure the function like

func foo(...) error {
   if errorCondition {
      return err
   }
   return nil
}

Returning success at the end and errors in the middle makes it easier to read. The section "A longer example" in https://8thlight.com/insights/exploring-error-handling-patterns-in-go this article also calls this out with another example.

@Shivs11 Shivs11 merged commit a1fc7bd into versioning-3.1 Jan 28, 2025
41 of 49 checks passed
@Shivs11 Shivs11 deleted the ss/listworkerdeployment-versioning3.1 branch January 28, 2025 20:07
ShahabT pushed a commit that referenced this pull request Feb 4, 2025
## What changed?
<!-- Describe what has changed in this PR -->
- ListWorkerDeployments API
- Also removed presence of locks when adding a version to a
worker-deployment workflow. Causes deadlocks and we don't need them.

## Why?
<!-- Tell your future self why have you made these changes -->
- Versioning-3.1

## How did you test it?
<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
- Added functional tests (happy-path)
- Existing suite

## Potential risks
<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
- None, feature

## Documentation
<!-- Have you made sure this change doesn't falsify anything currently
stated in `docs/`? If significant
new behavior is added, have you described that in `docs/`? -->

## Is hotfix candidate?
<!-- Is this PR a hotfix candidate or does it require a notification to
be sent to the broader community? (Yes/No) -->
- No
ShahabT pushed a commit that referenced this pull request Feb 4, 2025
## What changed?
<!-- Describe what has changed in this PR -->
- ListWorkerDeployments API
- Also removed presence of locks when adding a version to a
worker-deployment workflow. Causes deadlocks and we don't need them.

## Why?
<!-- Tell your future self why have you made these changes -->
- Versioning-3.1

## How did you test it?
<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
- Added functional tests (happy-path)
- Existing suite

## Potential risks
<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
- None, feature

## Documentation
<!-- Have you made sure this change doesn't falsify anything currently
stated in `docs/`? If significant
new behavior is added, have you described that in `docs/`? -->

## Is hotfix candidate?
<!-- Is this PR a hotfix candidate or does it require a notification to
be sent to the broader community? (Yes/No) -->
- No
ShahabT pushed a commit that referenced this pull request Feb 5, 2025
## What changed?
<!-- Describe what has changed in this PR -->
- ListWorkerDeployments API
- Also removed presence of locks when adding a version to a
worker-deployment workflow. Causes deadlocks and we don't need them.

## Why?
<!-- Tell your future self why have you made these changes -->
- Versioning-3.1

## How did you test it?
<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
- Added functional tests (happy-path)
- Existing suite

## Potential risks
<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
- None, feature

## Documentation
<!-- Have you made sure this change doesn't falsify anything currently
stated in `docs/`? If significant
new behavior is added, have you described that in `docs/`? -->

## Is hotfix candidate?
<!-- Is this PR a hotfix candidate or does it require a notification to
be sent to the broader community? (Yes/No) -->
- No
ShahabT pushed a commit that referenced this pull request Feb 5, 2025
## What changed?
<!-- Describe what has changed in this PR -->
- ListWorkerDeployments API
- Also removed presence of locks when adding a version to a
worker-deployment workflow. Causes deadlocks and we don't need them.

## Why?
<!-- Tell your future self why have you made these changes -->
- Versioning-3.1

## How did you test it?
<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
- Added functional tests (happy-path)
- Existing suite

## Potential risks
<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
- None, feature

## Documentation
<!-- Have you made sure this change doesn't falsify anything currently
stated in `docs/`? If significant
new behavior is added, have you described that in `docs/`? -->

## Is hotfix candidate?
<!-- Is this PR a hotfix candidate or does it require a notification to
be sent to the broader community? (Yes/No) -->
- No
ShahabT pushed a commit that referenced this pull request Feb 6, 2025
## What changed?
<!-- Describe what has changed in this PR -->
- ListWorkerDeployments API
- Also removed presence of locks when adding a version to a
worker-deployment workflow. Causes deadlocks and we don't need them.

## Why?
<!-- Tell your future self why have you made these changes -->
- Versioning-3.1

## How did you test it?
<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
- Added functional tests (happy-path)
- Existing suite

## Potential risks
<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
- None, feature

## Documentation
<!-- Have you made sure this change doesn't falsify anything currently
stated in `docs/`? If significant
new behavior is added, have you described that in `docs/`? -->

## Is hotfix candidate?
<!-- Is this PR a hotfix candidate or does it require a notification to
be sent to the broader community? (Yes/No) -->
- No
ShahabT pushed a commit that referenced this pull request Feb 6, 2025
## What changed?
<!-- Describe what has changed in this PR -->
- ListWorkerDeployments API
- Also removed presence of locks when adding a version to a
worker-deployment workflow. Causes deadlocks and we don't need them.

## Why?
<!-- Tell your future self why have you made these changes -->
- Versioning-3.1

## How did you test it?
<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
- Added functional tests (happy-path)
- Existing suite

## Potential risks
<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
- None, feature

## Documentation
<!-- Have you made sure this change doesn't falsify anything currently
stated in `docs/`? If significant
new behavior is added, have you described that in `docs/`? -->

## Is hotfix candidate?
<!-- Is this PR a hotfix candidate or does it require a notification to
be sent to the broader community? (Yes/No) -->
- No
ShahabT pushed a commit that referenced this pull request Feb 6, 2025
## What changed?
<!-- Describe what has changed in this PR -->
- ListWorkerDeployments API
- Also removed presence of locks when adding a version to a
worker-deployment workflow. Causes deadlocks and we don't need them.

## Why?
<!-- Tell your future self why have you made these changes -->
- Versioning-3.1

## How did you test it?
<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
- Added functional tests (happy-path)
- Existing suite

## Potential risks
<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
- None, feature

## Documentation
<!-- Have you made sure this change doesn't falsify anything currently
stated in `docs/`? If significant
new behavior is added, have you described that in `docs/`? -->

## Is hotfix candidate?
<!-- Is this PR a hotfix candidate or does it require a notification to
be sent to the broader community? (Yes/No) -->
- No
ShahabT pushed a commit that referenced this pull request Feb 6, 2025
## What changed?
<!-- Describe what has changed in this PR -->
- ListWorkerDeployments API
- Also removed presence of locks when adding a version to a
worker-deployment workflow. Causes deadlocks and we don't need them.

## Why?
<!-- Tell your future self why have you made these changes -->
- Versioning-3.1

## How did you test it?
<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
- Added functional tests (happy-path)
- Existing suite

## Potential risks
<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
- None, feature

## Documentation
<!-- Have you made sure this change doesn't falsify anything currently
stated in `docs/`? If significant
new behavior is added, have you described that in `docs/`? -->

## Is hotfix candidate?
<!-- Is this PR a hotfix candidate or does it require a notification to
be sent to the broader community? (Yes/No) -->
- No
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants