taylanisikdemir
Member

What changed?
Updating replication simulation to support active-active domains and their failovers.

  • The simulation config structure now supports both a single active cluster (regular global domain) and multiple active clusters (active-active domain).
  • The failover operation implementation now supports both domain types.
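
To illustrate the two config shapes, here is a minimal Go sketch of a domain entry that can describe either kind of domain. The struct, its field names, and the `IsActiveActive` and `Validate` helpers are assumptions made for illustration; they only mirror the `activeClusterName` and `activeClusters` keys visible in the scenario file and are not the PR's actual types.

```go
package main

import "fmt"

// DomainConfig is a hypothetical shape for one simulation domain entry.
// The fields mirror the YAML keys activeClusterName and activeClusters;
// the real struct in the simulation package may differ.
type DomainConfig struct {
	Name              string
	ActiveClusterName string   // single active cluster (regular global domain)
	ActiveClusters    []string // multiple active clusters (active-active domain)
}

// IsActiveActive reports whether the entry lists any active-active clusters.
func (d DomainConfig) IsActiveActive() bool {
	return len(d.ActiveClusters) > 0
}

// Validate rejects entries with no name or with no active cluster at all.
// Setting both fields is allowed, as in the scenario file under review.
func (d DomainConfig) Validate() error {
	if d.Name == "" {
		return fmt.Errorf("domain name is required")
	}
	if d.ActiveClusterName == "" && len(d.ActiveClusters) == 0 {
		return fmt.Errorf("domain %q: set activeClusterName or activeClusters", d.Name)
	}
	return nil
}
```

With this shape, a failover routine could branch on `IsActiveActive()` to decide whether to update one active cluster name or the per-cluster active-active configuration.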

Misc change:

  • Updated the execution manager to explicitly return EntityNotExistsError when there is no cluster selection policy for the given workflow, and to handle that error in the active cluster manager.

How did you test it?

  • Validate default simulation scenario passes:
./simulation/replication/run.sh default
  • Validate that the activeactive scenario runs (active-active domain creation and workflow creation both work):
./simulation/replication/run.sh activeactive

Next steps
The frontend cluster redirection layer needs to be updated to look up the active cluster of the given workflow. Currently this layer only checks the domain of the request. I need to update the template and policy implementation to plumb the workflow ID and run ID through.

name: test-domain-2
activeClusters:
- cluster1
activeClusterName: cluster0
Member

Should it be cluster1?

Member Author

Previously, the primary cluster name was used as the active cluster name, so both of these domains were active on cluster0. If @fimanishi's intent was different, that can be addressed in a follow-up PR.

@taylanisikdemir merged commit d910e5e into cadence-workflow:master on Jun 25, 2025
25 of 26 checks passed