
Conversation

@ningmingxiao (Contributor) commented Jan 18, 2024

Using crictl to create 1000 containers fills containerEventsChan. If nothing then calls GetContainerEvents to consume the events, they are lost, and the log shows "containerEventsChan is full, discarding event ...".
Issue: #8892
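
To make the failure mode concrete, here is a small self-contained sketch (illustrative names only, not containerd code) of the non-blocking-send pattern involved: once the buffered channel is full and nothing is consuming it, further events are discarded.

package main

import "fmt"

// Simplified, self-contained sketch of the publish pattern under discussion
// (names are illustrative): a non-blocking send into a buffered channel means
// that once the buffer is full, new events are counted and discarded.
func main() {
	events := make(chan string, 2) // stands in for containerEventsChan
	dropped := 0

	for i := 0; i < 5; i++ {
		ev := fmt.Sprintf("event-%d", i)
		select {
		case events <- ev:
		default: // buffer full and nothing consuming: the event is lost
			dropped++
			fmt.Printf("channel is full, discarding %s\n", ev)
		}
	}
	fmt.Println("buffered:", len(events), "dropped:", dropped)
}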

@k8s-ci-robot

Hi @ningmingxiao. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ningmingxiao force-pushed the dev14 branch 2 times, most recently from 9affa74 to d1e0739 on January 18, 2024 at 12:59.
@ningmingxiao (Contributor, Author) commented Jan 18, 2024

Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
@lengrongfu (Contributor) commented:

@ningmingxiao can you add the issue link? Thanks.

@ningmingxiao (Contributor, Author) replied:

@ningmingxiao can you add the issue link? Thanks.

#8892

containerEventsDroppedCount.Inc()
log.G(ctx).Debugf("containerEventsChan is full, discarding event %+v", event)
}
eq.enqueue(event)

Member

What does the queue add over just

go func(e runtime.ContainerEventResponse) {
        c.containerEventsChan <- e
}(event)

Neither approach seems to guarantee the order if that is the intent.
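
As a small illustration (not from the PR), goroutine-per-event sends race independently for the channel, so arrival order is not guaranteed even when events are produced in order:

package main

import "fmt"

func main() {
	ch := make(chan int, 2)
	for i := 0; i < 2; i++ {
		go func(v int) {
			ch <- v // each send races with the other; either may win
		}(i)
	}
	// may print "0 1" or "1 0" depending on scheduling
	fmt.Println(<-ch, <-ch)
}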

@dmcgowan (Member) commented Jan 19, 2024

If order preservation is important, you could also just have a single dequeue goroutine which runs when there are items in the queue, like:

func (eq *eventQueue) enqueue(c chan<- runtime.ContainerEventResponse, event runtime.ContainerEventResponse) {
	eq.lock.Lock()
	defer eq.lock.Unlock()
	eq.items = append(eq.items, event)
	if len(eq.items) == 1 {
		go func() {
			var event *runtime.ContainerEventResponse
			for {
				eq.lock.Lock()
				if event != nil {
					eq.items = eq.items[1:]
				}
				if len(eq.items) == 0 {
					eq.lock.Unlock()
					return
				}
				event = &eq.items[0]
				eq.lock.Unlock()
				c <- *event
			}
		}()
	}
}

@ningmingxiao (Contributor, Author) replied:

If only one event is generated:

for {
	eq.lock.Lock()
	// here len(eq.items) = 1, event is not nil
	if event != nil {
		// after the next step len(eq.items) = 0
		eq.items = eq.items[1:]
	}

	// here it returns because len(eq.items) = 0, so an event is lost
	if len(eq.items) == 0 {
		eq.lock.Unlock()
		return
	}
	event = &eq.items[0]
	eq.lock.Unlock()
	c <- *event
}

@ningmingxiao (Contributor, Author) commented Jan 19, 2024

How about adding a lock to the criService struct:

	c.mu.Lock()
	eq.enqueue(event)
	c.mu.Unlock()

to keep events in order?
@dmcgowan (Member) replied:

This needs a better solution overall rather than just adding more locking/queuing. There are already TODOs around the publishing of the events, and it currently uses a channel with a large buffer, which is not a good pattern for avoiding lost messages. It probably makes more sense to use a queue, but to avoid leaking memory when no one is calling GetContainerEvents, the queue should expire items if nothing is removing them. I don't think a quick fix that just adds a queue is going to work, as it may lead to a memory leak.
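
A minimal sketch (not part of this PR) of the kind of expiring queue described above; all names are hypothetical, and it assumes the CRI runtime API types already used in the diff:

package cri

import (
	"sync"
	"time"

	runtime "k8s.io/cri-api/pkg/apis/runtime/v1"
)

// expiringQueue is a hypothetical bounded queue: items older than ttl are
// dropped on the next push, and a hard cap keeps memory bounded when nothing
// is calling GetContainerEvents to drain it.
type expiringQueue struct {
	mu    sync.Mutex
	items []timedEvent
	ttl   time.Duration
	max   int
}

type timedEvent struct {
	added time.Time
	event runtime.ContainerEventResponse
}

func (q *expiringQueue) push(e runtime.ContainerEventResponse) {
	q.mu.Lock()
	defer q.mu.Unlock()
	now := time.Now()
	// Expire items that have waited longer than the TTL.
	for len(q.items) > 0 && now.Sub(q.items[0].added) > q.ttl {
		q.items = q.items[1:]
	}
	// Enforce a hard cap so a stalled consumer cannot exhaust memory.
	for q.max > 0 && len(q.items) >= q.max {
		q.items = q.items[1:]
	}
	q.items = append(q.items, timedEvent{added: now, event: e})
}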

@k8s-ci-robot

PR needs rebase.

This PR is stale because it has been open 90 days with no activity. This PR will be closed in 7 days unless new comments are made or the stale label is removed.

The github-actions bot added the Stale label on Apr 27, 2024.

github-actions bot commented May 5, 2024

This PR was closed because it has been stalled for 7 days with no activity.

The github-actions bot closed this PR on May 5, 2024.