daemon: Create all global maps in cilium-agent #10626
Conversation
@pchaigno Do we still have BPF code that could potentially create maps? If so, maybe remove it so that we can be sure we're not depending on that code any more?
I think the existing code to create BPF maps is actually in tc (cf. …)
One small remark, didn't review thoroughly yet. Will also check regarding memory consumption tomorrow.
CallMapName = "cilium_policy" | ||
// PolicyCallMapName is the name of the map to do tail calls into policy | ||
// enforcement programs. | ||
PolicyCallMapName = "cilium_call_policy" |
I assume this won't break upgrade and downgrade because we always create the map during initialization?
My understanding is that we (tc) repopulate the map when we reload the datapath, so this won't be an issue for up/downgrades. I expect the old map to remain behind though; I don't know if we usually clean up this sort of stale BPF object (?)
My inclination is to agree with @pchaigno here, but I think it'd be worthwhile to run some tests on this. For instance, set up a couple of chatty peers, perform an upgrade, and check that they continue chatting. We could probably use one of the chaos monkeys for this, e.g.:
https://github.com/cilium/chaos-monkeys/tree/master/monkeys/clique
(Worth double-checking the behaviour by deploying the monkey, restarting an existing version of Cilium first to understand how it performs, then trying the upgrade and observing.)
> I don't know if we usually clean up this sort of stale BPF object (?)
Yes, we clean up the old objects. I think there are still proxymap references in the code to help clean up after upgrades from older versions, maybe v1.5 or so. Of course, if we downgrade again then we'll leave the map with the new name on the filesystem... though I'm less concerned about that, since we assume users will eventually get onto a stable version of the newer release.
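Since the discussion touches on cleaning up stale pinned maps, here is a minimal sketch of what unpinning the old `cilium_policy` object could look like, assuming the map is pinned under the usual tc globals directory on the BPF filesystem; the path and helper below are illustrative, not the agent's actual cleanup code:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Assumed pin directory; tc traditionally pins globals here.
const bpffsTCGlobals = "/sys/fs/bpf/tc/globals"

// removeStalePin deletes a pinned BPF object if it exists. Unpinning is
// just an unlink; the kernel frees the map once nothing references it.
func removeStalePin(name string) error {
	path := filepath.Join(bpffsTCGlobals, name)
	if err := os.Remove(path); err != nil && !os.IsNotExist(err) {
		return fmt.Errorf("removing stale pin %s: %w", path, err)
	}
	return nil
}

func main() {
	// Old map name before the rename to cilium_call_policy.
	if err := removeStalePin("cilium_policy"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```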
Force-pushed from 2a7b89d to bcbb6fa
When starting
Woah, not sure how I missed that. Fix seems straightforward; I'll send it after community meeting. Thanks for spotting it @tklauser!
I wonder if we should add the warnings from the above failure into the set of strings that the CI looks for to fail a build, so that we only ever break these things very intentionally. (See lines 258 to 268 in cacf250.)
The warnings @tklauser found look unrelated to this pull request. They concern maps that were already created by
That sounds like a good idea. I'll prepare a pull request for that tomorrow morning and will see if it breaks any existing test.
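As a rough illustration of the idea (not the actual CI helper), a check like this could scan the agent log for known-bad substrings and fail the build on any match; the message list below is assumed, not Cilium's real set:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// badLogMessages holds substrings that should never appear in a healthy
// agent log. Entries here are placeholders for illustration.
var badLogMessages = []string{
	"size mismatch for BPF map", // assumed warning text
	"Unable to create map",      // assumed warning text
}

// findBadLogMessages returns every blacklisted substring found in the log.
func findBadLogMessages(log string) []string {
	var found []string
	for _, msg := range badLogMessages {
		if strings.Contains(log, msg) {
			found = append(found, msg)
		}
	}
	return found
}

func main() {
	log := `level=warning msg="size mismatch for BPF map" name=cilium_signals`
	if hits := findBadLogMessages(log); len(hits) > 0 {
		fmt.Println("build should fail, found:", hits)
		os.Exit(1)
	}
}
```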
and use it to check that all Go-side keys and values for maps have the correct sizes.
Signed-off-by: Paul Chaignon <paul@cilium.io>
getNumPossibleCPUs is needed for all per-CPU maps and perf event arrays.
Signed-off-by: Paul Chaignon <paul@cilium.io>
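For context, a minimal sketch of how such a helper might derive the possible-CPU count, assuming the kernel's simple "0-N" range format in /sys/devices/system/cpu/possible (real files can also contain comma-separated ranges):

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// getNumPossibleCPUs parses /sys/devices/system/cpu/possible, which
// per-CPU maps and perf event arrays size themselves by.
func getNumPossibleCPUs() (int, error) {
	data, err := os.ReadFile("/sys/devices/system/cpu/possible")
	if err != nil {
		return 0, err
	}
	s := strings.TrimSpace(string(data)) // e.g. "0-7" or "0"
	if i := strings.IndexByte(s, '-'); i >= 0 {
		last, err := strconv.Atoi(s[i+1:])
		if err != nil {
			return 0, err
		}
		return last + 1, nil
	}
	// Single-CPU case: the file contains just "0".
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, err
	}
	return n + 1, nil
}

func main() {
	n, err := getNumPossibleCPUs()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("possible CPUs:", n)
}
```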
This commit declares the sizes and names of the reverse NAT sock maps as constants on the Go side and writes them into the BPF program when it is generated.
Signed-off-by: Paul Chaignon <paul@cilium.io>
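A hedged sketch of the general technique (the names and sizes below are made up, not the actual Cilium identifiers): define a map's name and size once in Go and emit them as #defines into the generated C header, so the two sides cannot drift apart.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

const (
	// Hypothetical Go-side constants mirroring the C map definition.
	SockRevNatMapName = "cilium_snat_rev" // hypothetical name
	SockRevNatMapSize = 262144            // hypothetical size
)

// writeHeader renders the constants as C macros for the generated
// BPF program to include at compile time.
func writeHeader(w io.Writer) error {
	_, err := fmt.Fprintf(w,
		"#define SOCK_REV_NAT_MAP %s\n#define SOCK_REV_NAT_MAP_SIZE %d\n",
		SockRevNatMapName, SockRevNatMapSize)
	return err
}

func main() {
	if err := writeHeader(os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```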
Signed-off-by: Paul Chaignon <paul@cilium.io>
Create the signal map during daemon initialization. The map is closed as soon as it's created since cilium-agent never uses it.
To generate the deepcopy file for the new map, I used:
make generate-k8s-api
Signed-off-by: Paul Chaignon <paul@cilium.io>
Create the policy call map during daemon initialization. The map is closed as soon as it's created since cilium-agent never uses it.
To generate the deepcopy files for the new maps, I used:
make generate-k8s-api
Signed-off-by: Paul Chaignon <paul@cilium.io>
Create per-endpoint policy maps during daemon initialization.
Signed-off-by: Paul Chaignon <paul@cilium.io>
Create global and per-endpoint CT maps during daemon initialization.
Signed-off-by: Paul Chaignon <paul@cilium.io>
Create global NAT maps during daemon initialization.
Signed-off-by: Paul Chaignon <paul@cilium.io>
Create the neighbors maps used to store IP-to-MAC associations when NodePort services are implemented in the BPF datapath.
To generate the deepcopy file for the new maps, I used:
make generate-k8s-api
Signed-off-by: Paul Chaignon <paul@cilium.io>
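Several of the commits above share one pattern: create a global map at agent startup so the datapath can reuse it, then immediately close the agent's file descriptor when the agent itself never touches the map. A minimal sketch of that pattern using cilium/ebpf-style APIs (the map name, pin path, and sizing below are assumptions, not the PR's actual code):

```go
package main

import (
	"log"
	"runtime"

	"github.com/cilium/ebpf"
)

// createSignalMap creates (or reuses) a pinned perf event array and then
// closes our fd: the pinned object keeps the map alive for the datapath.
func createSignalMap() error {
	m, err := ebpf.NewMapWithOptions(&ebpf.MapSpec{
		Name:       "cilium_signals",         // assumed map name
		Type:       ebpf.PerfEventArray,      // one ring buffer per CPU
		MaxEntries: uint32(runtime.NumCPU()), // real code would use possible CPUs
		Pinning:    ebpf.PinByName,           // survive our Close below
	}, ebpf.MapOptions{PinPath: "/sys/fs/bpf/tc/globals"}) // assumed pin dir
	if err != nil {
		return err
	}
	// cilium-agent never reads or writes this map itself.
	return m.Close()
}

func main() {
	if err := createSignalMap(); err != nil {
		log.Fatal(err)
	}
}
```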
Force-pushed from bcbb6fa to f575116
test-me-please
@pchaigno @joestringer Hmm, looks like these don't trigger in our CI? :-( In a later PR job, I've seen the following in the log (maybe due to padding or other size mismatch?):
5d6b669 also triggers the following warning:
This needs to update
Three new maps were added to pkg/maps/ in #10626. The corresponding checks in alignchecker are however missing. This commit adds them.
Signed-off-by: Paul Chaignon <paul@cilium.io>
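To make the alignchecker commit above concrete, here is a simplified, self-contained version of the kind of assertion such a tool performs; the struct and the expected size are invented for illustration, while the real tool derives the C sizes from the compiled BPF object's debug info.

```go
package main

import (
	"fmt"
	"os"
	"unsafe"
)

// SignalMsg is a hypothetical Go mirror of a C struct used as a map value.
type SignalMsg struct {
	Which uint32
	Data  uint32
}

func main() {
	// Size of the corresponding C struct (assumed for this example).
	const wantSize = 8
	if got := unsafe.Sizeof(SignalMsg{}); got != wantSize {
		fmt.Fprintf(os.Stderr,
			"size mismatch: Go %d bytes, C expects %d bytes\n", got, wantSize)
		os.Exit(1)
	}
	fmt.Println("sizes match")
}
```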
The first 6 commits are preparatory work for the last 9 commits, which create all global maps in the cilium-agent. The last 9 commits can easily be squashed if preferred, as they're fairly similar (I initially broke them down to bisect a bug).
Fixes: #9276
/cc @tklauser to check I didn't revert any memory optimization effort :-)