Describe the bug
Enabling `orphanedResources` monitoring for a project with all cluster resources whitelisted leads to excessive growth in traffic and resource usage for etcd, as well as for ArgoCD itself.
For etcd we have observed:
- Traffic into and out of etcd increasing ~10x
- Database size increasing ~3x
- Memory usage increasing ~4x
And for ArgoCD:
- Reconciliation activity increasing ~6x
- Application controller CPU usage increasing ~5x
- Cluster events almost doubling
The same cannot be reproduced for a project with no cluster resources whitelisted (`clusterResourceWhitelist: []`).
To Reproduce
We have a single application that applies all of our cluster-scoped resources:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kube-system
spec:
  destination:
    namespace: kube-system
    server: https://kubernetes.default.svc
  project: cluster
  source:
    path: cluster-name/kube-system
    repoURL: git@github.com:example/example-manifests.git
    targetRevision: master
  ignoreDifferences:
    - group: apiextensions.k8s.io
      kind: CustomResourceDefinition
      jsonPointers:
        - /status
  syncPolicy:
    automated:
      selfHeal: true
      prune: true
```
This application belongs to a project that whitelists all cluster resources:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: cluster
spec:
  description: Project for applying cluster resources
  clusterResourceWhitelist:
    - group: "*"
      kind: "*"
  sourceRepos:
    - "*"
  destinations:
    - server: https://kubernetes.default.svc
      namespace: "*"
```
The application in question manages 1259 resource objects (as reported by `argocd_cluster_api_resource_objects`) and the cluster holds 80 resource types altogether (`argocd_cluster_api_resources`).
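For reference, the monitoring is turned on simply by adding an `orphanedResources` block to the project spec; a minimal sketch of that change (the `warn` value shown is illustrative, not copied from our config):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: cluster
spec:
  # ...rest of the spec as above...
  orphanedResources:
    warn: false  # illustrative; the presence of the orphanedResources block is what enables monitoring
```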
Enabling `orphanedResources` on the project immediately results in the described uptick in resource usage, with memory usage and DB size seeming to plateau after several hours. Refer to the screenshots below.
Expected behavior
I would not expect this to be so resource-intensive for etcd or for ArgoCD, given how negligible it is for a project without any cluster resources whitelisted.
Screenshots
In these screenshots you can see when `orphanedResources` was enabled at 10:30 on 7/7 and then disabled at ~11:00 on 7/9.
Cluster events as reported by ArgoCD:
Version
```
argocd: v1.5.4+36bade7
  BuildDate: 2020-05-05T19:02:56Z
  GitCommit: 36bade7a2d7b69d1c0b0c4d41191f792a847d61c
  GitTreeState: clean
  GoVersion: go1.14.1
  Compiler: gc
  Platform: darwin/amd64
argocd-server: v1.6.1+159674e
  BuildDate: 2020-06-19T00:41:05Z
  GitCommit: 159674ee844a378fb98fe297006bf7b83a6e32d2
  GitTreeState: clean
  GoVersion: go1.14.1
  Compiler: gc
  Platform: linux/amd64
  Ksonnet Version: v0.13.1
  Kustomize Version: {Version:kustomize/v3.6.1 GitCommit:c97fa946d576eb6ed559f17f2ac43b3b5a8d5dbd BuildDate:2020-05-27T20:47:35Z GoOs:linux GoArch:amd64}
  Helm Version: version.BuildInfo{Version:"v3.2.0", GitCommit:"e11b7ce3b12db2941e90399e874513fbd24bcb71", GitTreeState:"clean", GoVersion:"go1.13.10"}
  Kubectl Version: v1.14.0
```
Additional notes
- After disabling `orphanedResources` we found we needed to compact and defrag etcd to bring memory usage and DB size back down to normal levels (https://www.compose.com/articles/how-to-keep-your-etcd-lean-and-mean/); see the sketch after this list.
- This could be related to "Installing argocd causes unbounded etcd memory usage" #3556, although as you can see from the screenshots, in our case etcd memory seems to plateau after a sharp rise rather than continuing to grow unbounded.
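For completeness, a minimal sketch of the compaction/defragmentation along the lines of the linked article; the default endpoint, omitted auth flags, and the jq extraction are assumptions about a typical etcd v3 setup, not our exact invocation:

```sh
# Find the current revision (assumes ETCDCTL_API=3 and default endpoint/credentials)
rev=$(etcdctl endpoint status --write-out=json | jq -r '.[0].Status.header.revision')

# Compact away superseded revisions, then defragment to release the freed space
etcdctl compact "$rev"
etcdctl defrag
```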