
Conversation

@alexmt alexmt commented Oct 20, 2020

Implements first part of #4583

This PR changes the repository server so that it executes multiple operations that require the same repository and the same commit SHA concurrently, as long as the operations do not change files in the repository. In theory, this should significantly improve manifest generation for a mono repo with many Argo CD apps in it.
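For illustration only, here is a minimal sketch of the locking idea (this is not the actual repo-server code; the type and method names are assumptions): operations that only read an already checked-out revision share a read lock and can run in parallel, while operations that need to change the working tree (for example checking out a different revision) take an exclusive lock.

package repository

import "sync"

// repoLock guards a single local repository checkout.
type repoLock struct {
    mu sync.RWMutex
}

// withReadAccess runs op under a shared lock, so any number of manifest
// generations against the same commit SHA can proceed concurrently.
func (l *repoLock) withReadAccess(op func() error) error {
    l.mu.RLock()
    defer l.mu.RUnlock()
    return op()
}

// withWriteAccess runs op exclusively, e.g. when a different revision must be
// checked out and the files on disk will change.
func (l *repoLock) withWriteAccess(op func() error) error {
    l.mu.Lock()
    defer l.mu.Unlock()
    return op()
}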

@alexmt alexmt force-pushed the repo-performance branch 4 times, most recently from c7936e7 to f772a9b Compare October 21, 2020 00:26

codecov-io commented Oct 21, 2020

Codecov Report

Attention: Patch coverage is 85.10638% with 14 lines in your changes missing coverage. Please review.

Project coverage is 41.52%. Comparing base (c7dbe48) to head (3bbbf62).
Report is 5331 commits behind head on master.

Files with missing lines Patch % Lines
reposerver/repository/repository.go 71.42% 9 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4625      +/-   ##
==========================================
+ Coverage   41.24%   41.52%   +0.28%     
==========================================
  Files         124      125       +1     
  Lines       16976    17063      +87     
==========================================
+ Hits         7001     7086      +85     
+ Misses       8969     8963       -6     
- Partials     1006     1014       +8     


@alexmt alexmt force-pushed the repo-performance branch 2 times, most recently from 1d185c9 to d173a2c Compare October 21, 2020 00:55
@alexmt alexmt marked this pull request as draft October 21, 2020 01:12

alexmt commented Oct 22, 2020

Tested on a real-life mono repository. It seems the change actually does improve repo-server performance significantly (~10x faster).

Here is how I performed the testing:

Run a script that refreshes all apps using a soft refresh (cached manifests), then run the same script with a hard refresh (forced manifest regeneration). The difference between the two runs is how much time manifest generation took.

Environment:
40 Kustomize-based apps in the repo (kustomize build takes ~0.3 seconds)
2 argocd-repo-server replicas

Script:

#!/bin/bash
# Refresh every app listed in /tmp/apps in parallel. TYPE selects the refresh
# mode and defaults to --hard-refresh (e.g. run with TYPE="--refresh" for a
# cached refresh).
REFRESH=${TYPE:=--hard-refresh}

echo "Refresh type: $REFRESH"

for app in $(cat /tmp/apps); do
   argocd app get $REFRESH "$app" > /dev/null &
done

# Wait for all background refreshes and report how long they took.
time wait

Results:


Before PR: manifest generation took ~15 seconds (27.1s total with hard refresh vs 12.2s with cached refresh)

TYPE="--refresh" /tmp/check.sh
Refresh type: --refresh
real	0m12.150s
 /tmp/check.sh                                                                                      
Refresh type: --hard-refresh
real	0m27.091s

After PR: manifest generation took ~1.2 seconds (13.6s total with hard refresh vs 12.3s with cached refresh)

TYPE="--refresh" /tmp/check.sh
Refresh type: --refresh

real	0m12.344s
/tmp/check.sh
Refresh type: --hard-refresh

real	0m13.555s

@alexmt alexmt force-pushed the repo-performance branch 16 times, most recently from d3c0978 to 46dc14d Compare October 25, 2020 14:11
@alexmt alexmt marked this pull request as ready for review October 25, 2020 14:11
@alexmt alexmt requested review from jannfis and jessesuen October 25, 2020 14:11
}

// Helm based apps might download dependencies and cannot be processed concurrently
return v1alpha1.ApplicationSourceType(appType) != v1alpha1.ApplicationSourceTypeHelm
alexmt (Collaborator Author):

Unfortunately, helm dependency build fails if run concurrently, so Helm based apps from the same revision cannot be processed in parallel. But we can process such apps in different directories; I will work on that in a separate PR.

alexmt (Collaborator Author):

Figured that we can simply add a lock that prevents running concurrent "helm dependency build" in the same directory.
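For context, a minimal sketch of that idea (the package and function names are assumptions, not the code that landed in this PR): keep one mutex per checkout directory and serialize "helm dependency build" invocations that target the same directory, while builds in different directories still proceed in parallel.

package helmutil

import (
    "os/exec"
    "sync"
)

var (
    depBuildLocks   = map[string]*sync.Mutex{}
    depBuildLocksMu sync.Mutex
)

// lockFor returns the mutex guarding the given checkout directory, creating it
// on first use.
func lockFor(dir string) *sync.Mutex {
    depBuildLocksMu.Lock()
    defer depBuildLocksMu.Unlock()
    if _, ok := depBuildLocks[dir]; !ok {
        depBuildLocks[dir] = &sync.Mutex{}
    }
    return depBuildLocks[dir]
}

// dependencyBuild runs `helm dependency build` in dir, making sure only one
// such invocation runs per directory at a time.
func dependencyBuild(dir string) error {
    l := lockFor(dir)
    l.Lock()
    defer l.Unlock()
    cmd := exec.Command("helm", "dependency", "build")
    cmd.Dir = dir
    return cmd.Run()
}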

@jannfis jannfis left a comment (Member):

I've not had the chance to take a thorough look at the code, nor did I perform any tests yet. But from glancing over the changes, I have one concern so far (please see the inline comment).

Also, looking forward, I was wondering whether we could just create shallow clones of the repositories at the requested revision for each application that refers to the repository, adding the name of the application to the path of the local checkout.

For example, when we have two applications app-a and app-b which both refer to https://github.com/argoproj/argocd-example-apps, repo-server would check out two repositories at /tmp/app-a_https:__github.com_argoproj_argocd-example-apps and /tmp/app-b_https:__github.com_argoproj_argocd-example-apps instead of just /tmp/https:__github.com_argoproj_argocd-example-apps.

This would not require any locking, but at the expense of more temporary disk space. Performing a shallow fetch could keep that expense low, though. For really large mono repos, we could introduce something like a weighted semaphore to limit concurrent checkouts per repo-server instance to a configurable value if temporary disk space on the node is sparse (a rough sketch of that idea follows after this comment). But it would also have the nice effect of allowing Kustomize and (J|K)Sonnet sources to be processed concurrently.

However, I'm not sure about the rationale behind the checkout and general repo handling design, so it may well be that I've overlooked something that was already discussed in the past and led to the implementation we currently have.
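To make the weighted-semaphore idea above concrete, here is a rough sketch (purely illustrative, not part of this PR; the limit value and the function name are assumptions) using golang.org/x/sync/semaphore to cap concurrent per-application checkouts on a repo-server instance:

package repository

import (
    "context"

    "golang.org/x/sync/semaphore"
)

// Allow at most 5 simultaneous checkouts per repo-server instance; the value
// would be configurable in practice.
var checkoutSem = semaphore.NewWeighted(5)

// withCheckoutSlot blocks until a checkout slot is available (or ctx is
// cancelled), then runs the checkout and releases the slot.
func withCheckoutSlot(ctx context.Context, checkout func() error) error {
    if err := checkoutSem.Acquire(ctx, 1); err != nil {
        return err
    }
    defer checkoutSem.Release(1)
    return checkout()
}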

}

// allowConcurrent returns true if the given application source can be processed concurrently
func allowConcurrent(source *v1alpha1.ApplicationSource) bool {
jannfis (Member):

I think it might be somewhat dangerous to test here, and in this way, whether a source allows concurrent repository processing.

The reason being, we might want to add further source options later on, and this is a hard-to-catch dependency. Maybe we should move this to pkg/apis/application/v1alpha1/types.go and make it a receiver function on the various source types, so it becomes more obvious and easier for people writing and reviewing code. For example, in some pseudo-ish code:

func (k *ApplicationSourceKustomize) AllowsConcurrentProcessing() bool {
  return len(k.Images) == 0 &&
    len(k.CommonLabels) == 0 &&
    k.NamePrefix == "" &&
    k.NameSuffix == ""
}

func (k *ApplicationSourceKsonnet) AllowsConcurrentProcessing() bool {
   return ...
}

func (a *ApplicationSource) AllowsConcurrentProcessing() bool {
   switch {
     case a.Kustomize != nil:
        return a.Kustomize.AllowsConcurrentProcessing()
      ...
   }
}

alexmt (Collaborator Author):

Good point. Implemented the change


alexmt commented Oct 26, 2020

Thank you for the review, @jannfis. I implemented your suggestion about moving AllowsConcurrentProcessing to types.go, closer to the structure definitions.

Cloning the same repository multiple times, with some optimization like shallow clones, was the first idea. I decided against it after evaluating our own use case. We have a repository with ~200+ applications (1 addon per cluster). A single commit to the repository invalidates the manifest cache of all applications, and after receiving the webhook notification the controller fires 200 manifest generation requests. Cloning the same repository 200 times definitely won't work. As you've mentioned, we might come up with some limit on clones per repo and run several experiments to find the right number of clones. I'm afraid it won't be that easy, and we would need to deal with things like GitHub/GitLab rate limiting, insufficient network bandwidth, etc.

On the other hand, the optimization in this PR already provides the best possible combination: 1 repo clone handles all 200 requests. The optimization doesn't require any significant redesign or code changes, so it is really easy to try. There is a chance that processing the same commit in parallel is enough; otherwise, we can try multiple repository clones.

What do you think?


jannfis commented Oct 26, 2020

I agree that your implementation is the most reasonable and efficient one for plain YAML manifests, and I really do like the approach of re-using the same repository for the same commit SHA to generate the manifests for reconciliation.

However, I was thinking about solving the use case of more dynamically generated resources, i.e. when using any of the generators such as Kustomize that modify the source to some extent, or when there are simultaneous commits to multiple tracked branches. But IMO your PR solves >80% of use cases, so we can think about solutions for the remaining 20% and implement them later, extending your work in this PR if possible. Maybe instead of cloning the repository multiple times, we could just recursively copy the initial clone from the Git provider on disk for applications that need to modify the source (a rough sketch of that idea follows after this comment). This would save network bandwidth and reduce pressure on the Git provider, though it still does not address the disk space requirements.

But anyway, let's go with the parallel use of the same checkout for the time being :)
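As a thought experiment, here is a rough sketch of the on-disk copy idea mentioned above (illustrative only, not part of this PR; it ignores symlinks and file permissions for brevity): recursively copy an already-fetched checkout so that an application which needs to modify the source gets its own working copy without another network clone.

package repository

import (
    "io"
    "io/fs"
    "os"
    "path/filepath"
)

// copyCheckout recursively copies src (an already-fetched repository checkout)
// into dst, so the copy can be modified without affecting the shared checkout.
func copyCheckout(src, dst string) error {
    return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }
        rel, err := filepath.Rel(src, path)
        if err != nil {
            return err
        }
        target := filepath.Join(dst, rel)
        if d.IsDir() {
            return os.MkdirAll(target, 0o755)
        }
        in, err := os.Open(path)
        if err != nil {
            return err
        }
        defer in.Close()
        out, err := os.Create(target)
        if err != nil {
            return err
        }
        defer out.Close()
        _, err = io.Copy(out, in)
        return err
    })
}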

@jannfis jannfis left a comment (Member):

LGTM

@alexmt alexmt merged commit 5fdbe20 into argoproj:master Oct 26, 2020
@alexmt alexmt deleted the repo-performance branch October 26, 2020 20:36
@dudicoco

@alexmt I want to use the concurrent processing feature along with a helmfile custom plugin within a monorepo. Each app has the following config:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
spec:
  destination:
    namespace: my-ns
    server: https://kubernetes.default.svc.cluster.local
  project: default
  source:
    path: .
    repoURL: git@github.com:my-repo.git
    targetRevision: HEAD
    plugin:
      name: helmfile
  syncPolicy: {}

Since the plugin is triggered from the root of the repo (path: .), should I add the .argocd-allow-concurrency file to the root dir?

Thanks
