
Conversation

taylorsilva
Member

Changes proposed by this PR

closes #8638

Previously we could spawn any number of in-memory checks, and they wouldn't be
checked for duplication until atc/db/lock acquired a Mutex and tried to get a
lock from the database.

I'm guessing, based on the info from #8638, that contention for this lock
would get really high. I'm also guessing that clusters seeing this issue are
high-usage clusters; how high, I don't know, but I'd expect they have a large
number of resource checks occurring at all times. It's possible that lidar
sends off multiple checks for the same resource, which would result in
multiple goroutines running and fighting over the same atc/db/lock. Lidar
would send off multiple in-memory checks if, every time it runs, it still
finds that a resource has exceeded its check interval.
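To illustrate the idea (this is a hypothetical sketch, not the actual Concourse code or the change in this PR): one way to drop duplicate in-memory checks before they ever contend for the database lock is to track in-flight checks in a process-local map keyed by resource ID. The `checkTracker` type and the resource ID `42` below are made up for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// checkTracker records which resources already have an in-memory check
// in flight, so duplicate requests can be dropped before they reach the
// database lock. (Hypothetical sketch, not Concourse's implementation.)
type checkTracker struct {
	mu       sync.Mutex
	inFlight map[int]struct{} // keyed by resource ID
}

func newCheckTracker() *checkTracker {
	return &checkTracker{inFlight: make(map[int]struct{})}
}

// TryStart reports whether a check for resourceID may begin. It returns
// false if one is already running, so the caller can skip spawning a
// duplicate goroutine.
func (t *checkTracker) TryStart(resourceID int) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, running := t.inFlight[resourceID]; running {
		return false
	}
	t.inFlight[resourceID] = struct{}{}
	return true
}

// Finish marks the check for resourceID as done, allowing a new one.
func (t *checkTracker) Finish(resourceID int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.inFlight, resourceID)
}

func main() {
	tracker := newCheckTracker()

	// Simulate lidar asking for the same resource check twice.
	if tracker.TryStart(42) {
		fmt.Println("first check for resource 42 started")
	}
	if !tracker.TryStart(42) {
		fmt.Println("second check for resource 42 skipped as a duplicate")
	}

	// The running check would call Finish when it completes.
	tracker.Finish(42)
}
```

With something like this, duplicate requests are rejected cheaply in memory instead of each spawning a goroutine that blocks on the db/lock Mutex.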

I'm not 100% sure this is the problem and I don't have an easy way to test
it; I don't have reproducible steps for the issue users are reporting. This
is all based on my reading of the code as it is today and the pprof memory
graph reported by users, which showed a lot of in-memory checks waiting on
the db/lock Mutex.

Release Note

  • Avoid creating duplicate in-memory checks

Signed-off-by: Taylor Silva <dev@taydev.net>
This code isn't being used anywhere. Dead code.

Signed-off-by: Taylor Silva <dev@taydev.net>
@taylorsilva taylorsilva added the bug label Mar 8, 2025
@taylorsilva taylorsilva requested a review from a team as a code owner March 8, 2025 17:14
@taylorsilva taylorsilva merged commit 8c0fcad into master Mar 9, 2025
12 checks passed
@taylorsilva taylorsilva deleted the issue/8638 branch March 9, 2025 00:24