Add container event support to containerd #7073
Conversation
Hi @ruiwen-zhao. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
pkg/cri/server/container_events.go
Outdated
```go
)

func (c *criService) GetContainerEvents(r *runtime.GetEventsRequest, s runtime.RuntimeService_GetContainerEventsServer) error {
    // TODO: replace with a real implementation that broadcasts containerEventsChan to all subscribers.
```
Would it make sense to use containerd events here?
https://github.com/containerd/containerd/blob/main/events/events.go#L79
It supports broadcasting and has notion of filters.
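For illustration, a minimal sketch of what using the in-process exchange might look like (an assumption-heavy example, not this PR's code; containerd itself already runs an exchange internally, here one is created directly):

```go
package main

import (
    "context"
    "fmt"

    eventstypes "github.com/containerd/containerd/api/events"
    "github.com/containerd/containerd/events/exchange"
    "github.com/containerd/containerd/namespaces"
)

func main() {
    // An in-process exchange supporting publish/subscribe with filters.
    ex := exchange.NewExchange()

    // Both Publish and Subscribe are namespace-aware.
    ctx := namespaces.WithNamespace(context.Background(), "k8s.io")

    // Subscribe to all events; filter strings (e.g. on topic) are also supported.
    ch, errs := ex.Subscribe(ctx)

    go func() {
        // Event types from api/events are registered with typeurl and can be
        // published directly under a topic.
        _ = ex.Publish(ctx, "/containers/create", &eventstypes.ContainerCreate{ID: "abc"})
    }()

    select {
    case env := <-ch:
        fmt.Println(env.Namespace, env.Topic)
    case err := <-errs:
        fmt.Println("subscription error:", err)
    }
}
```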
thanks Maksym! I actually thought about it, but I assumed events.go was for containerd-internal use cases and could not be easily generalized. Some questions I had:
- events/events.go is just a set of interfaces, so I would still need to write my own broadcasting implementation. Is that right?
- events.go seems to have some implementations, but it looks like they spin up an event service and rely on local gRPC for pub/sub. I am not sure whether that is overkill for my use case here. (My change only touches the CRI server and does not touch the containerd service itself.)
- There is an implementation as well - https://github.com/containerd/containerd/blob/main/events/exchange/exchange.go
- The event service is already launched by containerd anyway, and some events are already consumed by CRI (https://github.com/containerd/containerd/blob/main/pkg/cri/server/events.go#L105).

So I'm just wondering whether it's possible to reuse existing code for a similar task instead of building another one.
a simple implementation to start would be ok for this POC I think, then maybe look at refactoring after if needed? I think he's reusing container exit/stopped off the CRI events code correctly, and I like how that looks below. I would rather not make this message API a broadcast-able, filterable-by-container-id subscription for all event types .. not in the alpha of the API anyhow. IMO this message-subscription API should be about the orchestrator asking for its PLEG messages for its namespace, its owned containers and pods.. period. Not something to replace NRI or containerd events or otel or.. :-) Probably should only be one active registration per namespace unless reset due to connection loss/restart of kubelet/rancher etc..? (See the single-subscriber sketch below.)
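For reference, a minimal single-subscriber sketch along these lines (a hypothetical illustration, not the PR's code: it assumes a criService field named containerEventsChan of type chan runtime.ContainerEventResponse, matching the snippet above):

```go
// Minimal sketch: stream every event from the single shared channel to the
// one caller. With more than one subscriber this would split events between
// them, which is exactly what the TODO about broadcasting is for.
func (c *criService) GetContainerEvents(r *runtime.GetEventsRequest, s runtime.RuntimeService_GetContainerEventsServer) error {
    for event := range c.containerEventsChan {
        if err := s.Send(&event); err != nil {
            return err
        }
    }
    return nil
}
```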
cri container create and cri container start also look right..
the pod ones need work at least, due to overloading with container events (an artifact of crio overloading container/pod ids, one of the few differences of opinion on the api between containerd/crio)
> the pod ones need work at least due to overloading with container events

I am not fully understanding the rationale, and asked about it in the KEP PR (kubernetes/enhancements#3387 (comment)), so let's discuss there.
But anyway, if we want to separate sandbox events from container events, then a new GetSandboxEvents CRI API will be needed.

> not do this message api as a broadcast-able filter-able by container id subscription for all event types

To make sure I understand correctly, you are suggesting that there should be only one caller/subscriber to the GetContainerEvents API, and the caller should not use any filters. Containerd will just return all events of k8s.io containers to this one and only subscriber.
I agree that the caller should just get all events of its namespace. We can continue the discussion about whether ids/labels are needed in the KEP PR or the CRI API PR.

> Probably should only be one active registration

Kubelet will subscribe for sure, but on top of that, what if a user calls this API directly through crictl? As an implementation of the API, we should be able to handle the case of multiple subscribers, even though ideally this case should not happen, right?
great questions .. should probably take this to a zoom call..

> I am not fully understanding the rationale, and asked about it in the KEP PR (kubernetes/enhancements#3387 (comment)), so let's discuss there.

answered there

> a new GetSandboxEvents CRI API will be needed.

agreed

> To make sure I understand correctly, you are suggesting that there should be only one caller/subscriber to the GetContainerEvents API, and the caller should not use any filters. Containerd will just return all events of k8s.io containers to this one and only subscriber.

two issues.. someone "proxying" CRI (a non-issue for a single user); someone registering for kubelet's PLEG events (what is the use case here? monitoring, some new trace? is it to be truly broadcast or sequenced per subscriber? do we need n subscribers for alpha?)

> I agree that the caller should just get all events of its namespace. We can continue the discussion about whether ids/labels are needed in the KEP PR or the CRI API PR.

nod

> Kubelet will subscribe for sure, but on top of that, what if a user calls this API directly through crictl? As an implementation of the API, we should be able to handle the case of multiple subscribers, even though ideally this case should not happen, right?

if we are not supporting two (to n) subscriptions for one namespace, the crictl request would be able to validate that "alpha" limitation.. in alpha, critest is the more likely use case, and it would own the k8s.io ns
the logs/metrics/trace already have this information in them... these events are targeted, I think, at the orchestrator's cache updates, and are not guaranteed (thus requiring sync); this isn't some set of messages for a hook pattern or logs/metrics/trace.. ?
see comments
```proto
// GetEventsFilter is used to filter a list of events.
// All those fields are combined with 'AND'.
message GetEventsFilter {
  // ID of the container or sandbox.
```
see comments in the KEP PR.. this (combining container/sandbox ids) won't work...
would rather start with namespace as the filter ("k8s.io"), not a pod id, or if nil a container id.. but that may only make sense to containerd users.. Maybe just leave it more abstract for now..
```proto
  // LabelSelector to select matches.
  // Only api.MatchLabels is supported for now and the requirements
  // are ANDed. MatchExpressions is not supported yet.
  map<string, string> label_selector = 3;
```
is there a use case for this?
```proto
message ContainerEventResponse {
  // ID of the container
  string container_id = 1;
```
tabs broken..
see notes regarding not combining container and pod ids / events
force-pushed fa5e31f to 6cc59f1
force-pushed 9304e91 to 93588a9
force-pushed 6549a7a to b642888
Some CI checks failed. Most of the failures seem irrelevant:
One failure on the new test added
Looking.
btw I am getting ... when running ...
Update: adding CONTAINERD_RUNTIME solves this issue (...).
Still looking
pkg/cri/server/container_events.go
Outdated
```go
)

func (c *criService) GetContainerEvents(r *runtime.GetEventsRequest, s runtime.RuntimeService_GetContainerEventsServer) error {
    // TODO: replace with a real implementation that broadcasts containerEventsChan to all subscribers.
```
What's the status of the TODOs in this PR? Are you planning to address them prior to this PR being ready to merge or do you intend to track their completion somewhere else?
This todo is NOT a blocker of this PR because in most, if not all, use cases, there will be only one client calling GetContainerEvents.
I can create a separate containerd issue and reference that issue in the code if that's a better way to use TODOs?
Yes, please.
Can you try ...
Are you perhaps using a machine with cgroups v2 configured? I believe the v1 shim depends on cgroups v1.
force-pushed 4f7af9e to 65cef12
force-pushed 61cff81 to 28a0216
just some comments on the testing.. reviewing server code next
See comments.. mostly I think we need to be tolerant of events for containers when the pod sandbox is not in the store.. I think it should be possible to ignore that and continue, vs leaving the abandoned container un-stoppable/removable
force-pushed a20f1e8 to a911a6f
waiting for confirmation that kubernetes/kubernetes#114351 is merged into a k/k v1.26 release
Updated the code to skip sending events with nil PodSandboxStatus. See the TODO issue (#7785) for context and next steps.
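For context, a hypothetical sketch of that guard (the helper name and wiring are assumptions; the actual change is in the commits above):

```go
// Hypothetical helper: drop the event rather than sending a nil
// PodSandboxStatus, which a v1.26.0 kubelet would dereference.
func (c *criService) sendContainerEvent(containerID string, eventType runtime.ContainerEventType, status *runtime.PodSandboxStatus) {
    if status == nil {
        logrus.Warnf("Skipping %s event for container %q: nil PodSandboxStatus", eventType, containerID)
        return
    }
    c.containerEventsChan <- runtime.ContainerEventResponse{
        ContainerId:        containerID,
        ContainerEventType: eventType,
        CreatedAt:          time.Now().UnixNano(),
        PodSandboxStatus:   status,
    }
}
```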
LGTM
As discussed, we can't send the events with nil PodSandboxStatus for the k/k v1.26.0 release because it would cause kubelet to hit a nil pointer dereference. We will do the TODOs on both sides once Kubernetes v1.26.1 includes support for ignoring nil and that release has been around for a while.
I hit the wrong button there, sorry!
LGTM
np, give it a good look, ping if you have questions..
@fuweid and @samuelkarp expressed interest in reviewing
LGTM
The CI issue can be handled in the follow-up
```go
const (
    drainContainerEventChannelTimeout = 2 * time.Second
    readContainerEventChannelTimeout  = 500 * time.Millisecond
```
IMO, we should increase it to [5, 10] seconds just in case the CI machine is under high load. Otherwise, it would be a new flaky test case.
```go
}
containerStatuses, err := c.getContainerStatuses(ctx, sandboxID)
if err != nil {
    logrus.Errorf("Failed to get container statuses for container event for sandboxID %q: %v", sandboxID, err)
```
[non-blocker] If there are 4 containers and only one fails, we currently send an empty container status list. What if we send a partial response?
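A hypothetical sketch of the partial-response idea (the helper is invented; containerStore.List and ContainerStatus are assumed from the surrounding CRI plugin code):

```go
// Hypothetical variant: collect per-container statuses individually and skip
// only the ones that fail, instead of sending an empty list when any single
// lookup errors out.
func (c *criService) getContainerStatusesPartial(ctx context.Context, sandboxID string) []*runtime.ContainerStatus {
    var statuses []*runtime.ContainerStatus
    for _, cntr := range c.containerStore.List() {
        if cntr.SandboxID != sandboxID {
            continue
        }
        resp, err := c.ContainerStatus(ctx, &runtime.ContainerStatusRequest{ContainerId: cntr.ID})
        if err != nil {
            logrus.Errorf("Failed to get status for container %q: %v", cntr.ID, err)
            continue // skip just this container
        }
        statuses = append(statuses, resp.GetStatus())
    }
    return statuses
}
```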
Follow-up: containerd#7073 (comment) Signed-off-by: Wei Fu <fuweid89@gmail.com>
This PR adds container event support (kubernetes/enhancements#3386) to containerd by implementing the GetContainerEvents() CRI API, which supports a single caller and returns all container events (i.e., not filtered by container ID or any other filter) to that caller.
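For illustration, a hedged sketch of a client consuming the stream (assumptions: the k8s.io/cri-api v1 Go bindings and containerd's default socket path; error handling kept minimal):

```go
package main

import (
    "context"
    "fmt"
    "io"

    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
    runtime "k8s.io/cri-api/pkg/apis/runtime/v1"
)

func main() {
    // Assumption: containerd's CRI socket at its default path.
    conn, err := grpc.Dial("unix:///run/containerd/containerd.sock",
        grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        panic(err)
    }
    defer conn.Close()

    client := runtime.NewRuntimeServiceClient(conn)
    stream, err := client.GetContainerEvents(context.Background(), &runtime.GetEventsRequest{})
    if err != nil {
        panic(err)
    }
    for {
        ev, err := stream.Recv()
        if err == io.EOF {
            return
        }
        if err != nil {
            panic(err)
        }
        fmt.Println(ev.GetContainerId(), ev.GetContainerEventType())
    }
}
```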
Tests Done
a. Created a k8s cluster with a k8s version that has Evented PLEG support and containerd from this PR.
b. Deployed a simple Job and verified that the pod status went from Running to Terminating, and then disappeared.
To-do list before un-drafting: