Skip to content

Conversation

joamaki
Copy link
Contributor

@joamaki joamaki commented May 21, 2025

In cilium/cilium the job groups were provided for each module and its lifecycle hooks appended when constructed. This is problematic when the job group is used multiple times as then the dependencies of the later uses won't influence the order in which start hooks run, e.g. if a later job depends on X it might happen that the job is started before X is started.

For example:

  cell.Module(...,
    cell.Provide(newX),

    // First use of the job.Group will construct it and append hooks to
    // lifecycle.
    cell.Invoke(func(jg job.Group, ...) { jg.Add( ... )}),

    // This second use will construct X and append its hooks, but these
    // are now *after* the job.Group's hooks and hence we might start the
    // job before X has been started.
    cell.Invoke(func(jg job.Group, x *X) { jg.Add( ... )}),
  )

To fix this, let's use the lifecycle to start the jobs that are added before the job group is started. This way the start order will respect the order in which [job.Group.Add] is called and thus be arranged according to the dependency order.

This is a breaking change in the interface, but there's not that many direct uses of "NewGroup" in cilium/cilium and thus fairly easy to fix.

@joamaki joamaki requested a review from a team as a code owner May 21, 2025 10:39
@joamaki joamaki requested review from pippolo84 and removed request for a team May 21, 2025 10:39
@joamaki joamaki force-pushed the pr/joamaki/jobs-append-lifecycle branch 2 times, most recently from 4d77c87 to 854353c Compare May 21, 2025 11:38
Copy link
Member

@pippolo84 pippolo84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, minus some fixes to the tests to please the data race detector.

@joamaki joamaki force-pushed the pr/joamaki/jobs-append-lifecycle branch from 854353c to c5bd9c4 Compare May 22, 2025 11:51
In cilium/cilium the job groups were provided for each module and its
lifecycle hooks appended when constructed. This is problematic when the
job group is used multiple times as then the dependencies of the later
uses won't influence the order in which start hooks run, e.g. if a later
job depends on X it might happen that the job is started before X is started.

For example:

  cell.Module(...,
    cell.Provide(newX),

    // First use of the job.Group will construct it and append hooks to
    // lifecycle.
    cell.Invoke(func(jg job.Group, ...) { jg.Add( ... )}),

    // This second use will construct X and append its hooks, but these
    // are now *after* the job.Group's hooks and hence we might start the
    // job before X has been started.
    cell.Invoke(func(jg job.Group, x *X) { jg.Add( ... )}),
  )

To fix this, let's use the lifecycle to start the jobs that are added
before the job group is started. This way the start order will respect
the order in which [job.Group.Add] is called and thus be arranged according
to the dependency order.

This is a breaking change in the interface, but there's not that many
direct uses of "NewGroup" in cilium/cilium and thus fairly easy to fix.

Signed-off-by: Jussi Maki <jussi.maki@isovalent.com>
@joamaki joamaki force-pushed the pr/joamaki/jobs-append-lifecycle branch from c5bd9c4 to 9abb9c0 Compare May 22, 2025 12:27
@joamaki joamaki merged commit 2946c49 into main May 22, 2025
1 check passed
@joamaki joamaki deleted the pr/joamaki/jobs-append-lifecycle branch May 22, 2025 12:32
joamaki added a commit to cilium/statedb that referenced this pull request May 22, 2025
This pulls in cilium/hive#51 and fixes the
usage.

Signed-off-by: Jussi Maki <jussi.maki@isovalent.com>
joamaki added a commit to cilium/statedb that referenced this pull request May 22, 2025
This pulls in cilium/hive#51 and fixes the
usage.

Signed-off-by: Jussi Maki <jussi.maki@isovalent.com>
joamaki added a commit to joamaki/cilium that referenced this pull request May 22, 2025
This pulls in the new Hive version that has a safer API for jobs:
cilium/hive#51.

Code is fixed as both Hive and StateDB had breaking API changes.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
joamaki added a commit to joamaki/cilium that referenced this pull request May 22, 2025
This pulls in the new Hive version that has a safer API for jobs:
cilium/hive#51.

Code is fixed as both Hive and StateDB had breaking API changes.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
joamaki added a commit to joamaki/cilium that referenced this pull request May 23, 2025
This pulls in the new Hive version that has a safer API for jobs:
cilium/hive#51.

Code is fixed as both Hive and StateDB had breaking API changes.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
github-merge-queue bot pushed a commit to cilium/cilium that referenced this pull request May 25, 2025
This pulls in the new Hive version that has a safer API for jobs:
cilium/hive#51.

Code is fixed as both Hive and StateDB had breaking API changes.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
github-merge-queue bot pushed a commit to cilium/cilium that referenced this pull request May 25, 2025
This pulls in the new Hive version that has a safer API for jobs:
cilium/hive#51.

Code is fixed as both Hive and StateDB had breaking API changes.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
zzuckerfrei pushed a commit to zzuckerfrei/cilium that referenced this pull request May 29, 2025
This pulls in the new Hive version that has a safer API for jobs:
cilium/hive#51.

Code is fixed as both Hive and StateDB had breaking API changes.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
joamaki added a commit to joamaki/cilium that referenced this pull request Jun 2, 2025
[ upstream commit a719911 ]
[ backporter note: disabled the experimental tests as they're not relevant to
  run for v1.17 and these old versions were flaky ]

This pulls in the new Hive version that has a safer API for jobs:
cilium/hive#51.

Code is fixed as both Hive and StateDB had breaking API changes.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
github-merge-queue bot pushed a commit to cilium/cilium that referenced this pull request Jun 4, 2025
[ upstream commit a719911 ]
[ backporter note: disabled the experimental tests as they're not relevant to
  run for v1.17 and these old versions were flaky ]

This pulls in the new Hive version that has a safer API for jobs:
cilium/hive#51.

Code is fixed as both Hive and StateDB had breaking API changes.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants