Skip to content

Conversation

joamaki
Copy link
Contributor

@joamaki joamaki commented Dec 21, 2023

This is not intended to be merged and is open only for running CI. We'll upstream commits from this branch in separate PRs. Periodic merges from main back to this.

Patchsets to upstream:

Extras not part of the devs&addrs work that we'll upstream later:

Cool things to check out:

  • pkg/statedb/reconciler/example (generic reconciler)
  • pkg/statedb/reconciler/benchmark (benchmark for reconciler + statedb)
  • pkg/statedb/reflector/k8s.go (k8s to statedb reflection)
  • pkg/bpf/ops/example (statedb to BPF map reconciliation)
  • pkg/datapath/linux/sysctl (sysctl reconciler)

Remaining work:

  • Removal of option.Config.GetDevices(), AreDevicesRequired() (@bimmlerd)
  • Removal of option.Config.DirectRoutingDevice, DirectRoutingDeviceRequired (@bimmlerd)
  • Move IPv6MCastDevice logic to pkg/mcast (@bimmlerd)
  • Removal of DeviceManager (@bimmlerd)
  • Removal of node.ipv[46]MasqAddrs (@joamaki)
  • Replace uses of NodeAddressing.{LocalAddresses,LoadBalancerNodeAddresses} with Table[NodeAddress] (@joamaki)
  • Removal of NodeAddressing (@joamaki)
  • Simple loader reconciliation loop (@dylandreimerink)
  • Simple iptables reconciliation loop (@pippolo84)
  • Reconcile address changes in pkg/service (@joamaki)
  • L2 announcements should watch Table[Device] (@dylandreimerink)
  • Remove daemon/cmd/device-reloader.go (@bimmlerd)
  • e2e/unit tests for validating reconfiguration on device/address changes
  • Replace LinkList/AddrList etc. calls with lookups to Table[*Device]? As follow-up?
  • Upstream all the things (everyone)

joamaki and others added 22 commits December 11, 2023 15:00
Allow specifying a negative retry count for infinite retries of
OneShot jobs.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
The tests took 16s before, now down to less than a second.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
…ealization time

The 500ms default is way too long for tests that validate health status multiple times.
Add a function to allow overriding this.

TBD whether we want to keep this or whether to use a fake instead or some other
approach.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
This implements a generic utility for reconciling between
a statedb table and a target defined as a set of idempotent
operations. It implements incremental reconciliation with
per-object retries and a periodic full reconciliation
(forced per-object update and pruning).

A single shared reconciler implementation helps building a
more resilient architecture as it centralizes the decisions
on how to deal with failures and how to report them.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Example application for showing how to use the reconciler
to reconcile set of files on disk ("memos") created through
an HTTP API.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Example run:

  benchmark % go build .
  benchmark % ./benchmark -objects=100000 -batchsize=1000
  Inserting batch 100/100 ...
  Waiting for reconciliation to finish ...

  100000 objects reconciled in 1.02 seconds (batch size 1000)
  Throughput 98225.12 objects per second

Signed-off-by: Jussi Maki <jussi@isovalent.com>
…onciliation-v1.16

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Most of the functions and types in the loader package are only used
internally. This commit unexports them. Some functions were placed in
the loader package but without much added benefit.

These changes make it easier to identify the API boundary of the loader
package.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
Not that all global functions are methods of the loader object, we can
provide the loader via hive.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
This commit changes the loader into a cell. The Datapath object depends
on it and re-distributes it still (for now). All locations where a
new loader was created are now replaced by the datapath interface so
there is now only ever one loader object at a time.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
Moved the compilation lock into the loader as part of the effort to
separate the tight integration of the loader with the rest of the
agent and daemon.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
The loader.Loader type is problematic. Remove it and only use
the types.Loader.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Avoid flakyness by using goleak.IgnoreCurrent()

Signed-off-by: Jussi Maki <jussi@isovalent.com>
To prevent an import cycle from occurring when importing the device
table for the bandwidth manager.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
The SetReporterMinTimeoutForTest triggers the race detector. Just
set a lower timeout for now. Do not upstream this.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Derive transforms objects from an input table to an output table.
Useful in conjunction with a reconciler where the desired state
is derived from a single input table.

Example:

  // Assuming we have Table[*Foo] and RWTable[*Bar] and that
  // *Bar is an object we want reconciled.
  cell.Invoke(
    statedb.Derive[*Foo, *Bar](
      func(foo *Foo, deleted bool) (*Bar, statedb.DeriveResult) {
        if deleted {
            return &Bar{
              ID: foo.ID, // Only need enough for primary key
              Status: reconciler.StatusPendingDelete(),
            }, statedb.DeriveUpdate
        }
        return &Bar{
          ID: foo.ID,
          Quux: foo.Quux,
          Status: reconciler.StatusPending(),
        }, statedb.DeriveInsert
      },
    ),
  )

Signed-off-by: Jussi Maki <jussi@isovalent.com>
In some cases we want to only register the reconciler when a feature
is enabled. To make this easy to do, export the Params struct.
Now the reconciler can be registered based on config:

  func registerReconciler(cfg Config, params reconciler.Params[*Foo]) error {
    if !cfg.Enabled {
      return nil
    }
    return reconciler.Register[*Foo]
  }

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Since the "cilium-dbg statedb" commands aren't currently tested we
want to avoid typoing the table names, so define a string constant
for each of them.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Implement the qdisc setup for bandwidth manager using
the generic reconciler.

The desired state is derived from Table[*Device] using the
Derive() helper.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
@maintainer-s-little-helper
Copy link

Commits 51421b3, fff04a4 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Dec 21, 2023
Sysctl should be part of the datapath module, so the commit moves the
package accordingly.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Refactor the sysctl package using the statedb table and the generic
reconciler. The sysctl specific reconciler operations are in charge of
syncing the desired state with the actual kernel sysctl parameters.

To be backward compatible with the old sysctl interface, the exported
methods are kept synchronous, so they wait for the reconciliation to
actually happen before returning to the caller.

Moreover, ReadSysctl and WriteSysctl offer a low-level "direct"
implementation to change sysctl parameters without relying on the
reconciler. They are meant to be used *ONLY* in binaries that are not
based on the "hive and cells" framework, IOW where a statedb and the
reconcilers are not available (e.g: cilium-health and cilium-cni).

The sysctl package tests are adapted to the new implementation, and the
TestSysctl is written as a unit-style integration test to verify the
sysctl reconciler operations.

All the privileged tests that don't rely on a hive but require a working
implementation of the sysctl package are temporarily left broken. The
same goes for the controlplane tests that require a fake sysctl
implementation.
They will be fixed in the subsequent commits.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Add an alternative implementation of sysctl to be used only in
privileged tests or benchmarks where a hive is not available.

Fix all the privileged tests and benchmarks using the newly added
implementation.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Add a fake implementation of Sysctl interface to be used in controlplane
testing.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Add sysctl table to cilium-dbg statedb output and update related
documentation.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
bimmlerd and others added 2 commits January 15, 2024 10:45
Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
@maintainer-s-little-helper
Copy link

Commits 51421b3, fff04a4, 86d7a19, 4a16c44, f30cad0, 082fa89, fce80a7 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
@maintainer-s-little-helper
Copy link

Commits 51421b3, fff04a4, 86d7a19, 4a16c44, f30cad0, 082fa89, fce80a7 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

Not all parameters were passed to NewDatapath (MTU, DB, Devices).
Fix by refactoring the construction in the datapath cell and remove
the intermediate newDatapath function.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
@maintainer-s-little-helper
Copy link

Commits 51421b3, fff04a4, 86d7a19, 4a16c44, f30cad0, 082fa89, fce80a7 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

…onciliation-v1.16

Signed-off-by: Jussi Maki <jussi@isovalent.com>
@maintainer-s-little-helper
Copy link

Commits 51421b3, fff04a4, 86d7a19, 4a16c44, f30cad0, 082fa89, fce80a7 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

Previous iteration of this fix tried to fill in things by hand, which
works until something changes, and then breaks in this unhelpful way.
Instead of fixing this time and time again, just remove the dependency
on the real linux datapath, and use the fake one.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
@maintainer-s-little-helper
Copy link

Commits 51421b3, fff04a4, 86d7a19, 4a16c44, f30cad0, 082fa89, fce80a7 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

Replace the global map of BPF masquerade addresses with a query to
Table[NodeAddress] in the loader's patchHostNetdevDatapath and
remove the now unused code around masquerade addresses in pkg/node.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
@maintainer-s-little-helper
Copy link

Commits 51421b3, fff04a4, 86d7a19, 4a16c44, f30cad0, 082fa89, fce80a7 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@joamaki
Copy link
Contributor Author

joamaki commented Jan 22, 2024

/ci-runtime

Signed-off-by: Jussi Maki <jussi@isovalent.com>
@maintainer-s-little-helper
Copy link

Commits 51421b3, fff04a4, 86d7a19, 4a16c44, f30cad0, 082fa89, fce80a7 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

* Use DevicesControllerCell in all tests that need devices
* Refactor out the duplicated code to assert against NeighList
* Use proper fake nodemap in TestNodeUpdateIDs

Signed-off-by: Jussi Maki <jussi@isovalent.com>
@maintainer-s-little-helper
Copy link

Commits 51421b3, fff04a4, 86d7a19, 4a16c44, f30cad0, 082fa89, fce80a7 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@joamaki
Copy link
Contributor Author

joamaki commented Jan 23, 2024

/test

@maintainer-s-little-helper
Copy link

Commits 51421b3, fff04a4, 86d7a19, 4a16c44, f30cad0, 082fa89, fce80a7, 2bf7bf0 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

Copy link

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label May 10, 2024
Copy link

This pull request has not seen any activity since it was marked stale.
Closing.

@github-actions github-actions bot closed this May 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants