Description:
When Popeye is run against a single namespace with the -n flag, the scan intermittently (around half the time) fails because of problems in other namespaces.
To Reproduce
Steps to reproduce the behavior:
- Deploy a pod that will cause Popeye to fail:
kubectl run fail-pod --image=nonexistent/nonexistentimage:latest -n test
- Scan a different namespace that is healthy:
popeye -n healthy -l error -f ./spinach.yml
- Repeat the scan until it fails.
- Most scans will return healthy with no issues, e.g.:
PODS (5 SCANNED) 💥 0 😱 0 🔊 0 ✅ 5 100٪
┅┅┅┅┅┅┅
· Nothing to report.
- Occasionally, it will fail due to the pod in the other namespace (notice a lot more pods are included in the scan):
PODS (30 SCANNED) 💥 1 😱 0 🔊 0 ✅ 29 96٪
┅┅┅┅┅┅┅
· test/fail-pod...............................................................................💥
💥 [POP-207] Pod is in an unhappy phase (Pending).
🐳 fail-pod
💥 [POP-203] Pod is waiting [0/1] ImagePullBackOff.
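To spot this without reading each report by hand, the resource lines can be filtered for namespaces that should not be there. This is only a minimal sketch: `foreign_namespaces` is a hypothetical helper, not part of Popeye, and it assumes the uncoloured text report format shown above, where each namespaced resource appears as `· namespace/name`. It also relies on `grep -o`, which GNU and BSD grep provide.

```shell
# foreign_namespaces EXPECTED_NS
# Reads a Popeye text report on stdin and prints every namespace other than
# EXPECTED_NS that appears in a "· namespace/name" resource line.
# Hypothetical helper for illustration; not part of Popeye.
foreign_namespaces() {
  expected=$1
  # Namespaces are DNS labels, so they contain no dots, slashes, or spaces.
  # grep -v exits 1 when nothing is left; "|| true" treats that as success.
  grep -o '· [^/. ]*/' \
    | sed -e 's/^· //' -e 's|/$||' \
    | sort -u \
    | grep -v -x -F -- "$expected" || true
}
```

For example, `popeye -n healthy -l error -f ./spinach.yml | foreign_namespaces healthy` should print nothing when the namespace flag is honoured, and `test` when the failing pod leaks into the scan.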
Using the following (crude) command, I was able to reproduce the error easily:
> repeat 20 { popeye -n healthy -l error -f ./spinach.yml > /dev/null 2>&1; echo $?}
1
0
0
1
0
0
1
0
0
1
1
0
0
0
0
1
1
1
0
1
The exit codes show that, in this instance, 9 out of 20 scans failed because they included resources from other namespaces. Across repeated runs of this command, the number of failures has always been between 8 and 12, so the scan fails roughly half the time.
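The same measurement can be made without counting the exit codes by hand. The following is a minimal sketch for a POSIX-compatible shell; `count_failures` is a hypothetical helper name, not something Popeye provides.

```shell
# count_failures N CMD [ARGS...]
# Runs CMD N times, discards its output, and prints "<failures>/<N>".
# Hypothetical helper for illustration; not part of Popeye.
count_failures() {
  runs=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Any non-zero exit status counts as a failed scan.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures/$runs"
}
```

Running `count_failures 20 popeye -n healthy -l error -f ./spinach.yml` would then print a single tally for the 20 scans.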
Expected behavior
- The namespace flag should restrict the Popeye scan to that namespace.
- Scans should be consistent in the resources they include.
Versions (please complete the following information):
- OS: macOS 14.7 and Ubuntu 22.04
- Popeye: 0.21.5
- K8s: 1.29.8
Additional context
Our team owns/manages a number of namespaces on shared Kubernetes (AKS) clusters. We scan each namespace individually using the -n flag and then aggregate the JUnit output.
These namespaces are looped through, so the scans happen immediately after one another. I've tried adding sleeps between scans, but this didn't help.
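For context, the loop looks roughly like the sketch below. It is a simplified illustration rather than our exact pipeline: `scan_namespaces` and the `POPEYE_BIN` override are hypothetical names, the aggregation step is left out, and the only Popeye options used are the ones from the reproduction steps plus `-o junit`.

```shell
# scan_namespaces OUT_DIR NS [NS...]
# Runs one scan per namespace, writes each JUnit report to OUT_DIR/<ns>.xml,
# and prints the number of scans that exited non-zero.
# POPEYE_BIN is a hypothetical override so the loop can be tried without a
# cluster; it defaults to the real popeye binary.
scan_namespaces() {
  out_dir=$1
  shift
  failed=0
  mkdir -p "$out_dir"
  for ns in "$@"; do
    # -n limits the scan to one namespace; -o junit selects JUnit output.
    if ! "${POPEYE_BIN:-popeye}" -n "$ns" -l error -f ./spinach.yml -o junit \
        > "$out_dir/$ns.xml"; then
      failed=$((failed + 1))
    fi
  done
  echo "$failed"
}
```

Called as `scan_namespaces ./reports healthy another-ns`, it leaves one report per namespace in `./reports`, with the scans running back to back as described.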
This could be related to #314, but I've created a new issue because the namespace filter does work some of the time.
Spinach config:
---
# Popeye configuration using the AKS sample as a base.
# See: https://github.com/derailed/popeye/blob/master/spinach/spinach_aks.yml
popeye:
  allocations:
    cpu:
      # Checks if cpu is under allocated by more than x% at current load.
      underPercUtilization: 200
      # Checks if cpu is over allocated by more than x% at current load.
      overPercUtilization: 50
    memory:
      # Checks if mem is under allocated by more than x% at current load.
      underPercUtilization: 200
      # Checks if mem is over allocated by more than x% at current load.
      overPercUtilization: 50
  # Excludes define rules to exempt resources from sanitization
  excludes:
    global:
      fqns:
        # Exclude kube-system namespace
        - rx:^kube-system/
    linters:
      # Exclude system CRBs
      clusterrolebindings:
        instances:
          - fqns:
              - rx:^aks
              - rx:^omsagent
              - rx:^system
      # Exclude system CRs
      clusterroles:
        instances:
          - fqns:
              - rx:^system
              - admin
              - cluster-admin
              - edit
              - omsagent-reader
              - view
            codes: [400]
      # Exclude unused windows daemonset
      daemonsets:
        instances:
          - fqns: [calico-system/calico-windows-upgrade]
            codes: [508]
      # Exclude due to intermittent false positives
      serviceaccounts:
        codes: ["305"]
  resources:
    # Nodes specific sanitization
    node:
      limits:
        cpu: 90
        memory: 80
    # Pods specific sanitization
    pod:
      limits:
        # Fail if cpu is over x%
        # Set intentionally high to ignore (if you comment it out, it'll default to 80)
        cpu: 250
        # Set intentionally high to ignore (if you comment it out, it'll default to 90)
        # Fail if pod mem is over x%
        memory: 900
      # Fail if more than x restarts on any pods
      restarts: 3