feat: support generating manifests for the same commit in parallel #4625
Codecov Report

Attention: patch coverage is …

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           master    #4625      +/-   ##
==========================================
+ Coverage   41.24%   41.52%   +0.28%
==========================================
  Files         124      125       +1
  Lines       16976    17063      +87
==========================================
+ Hits         7001     7086      +85
+ Misses       8969     8963       -6
- Partials     1006     1014       +8
==========================================
```

View full report in Codecov by Sentry.
Tested on a real-life mono repository. The change actually improves repo-server performance significantly (~10x faster). Here is how I performed the testing: run a script that refreshes all apps using soft refresh (cached manifests), and then the same script with hard refresh (forced manifest regeneration). The difference between the runs is how much time manifest generation took.

Script:

```shell
REFRESH=${TYPE:=--hard-refresh}
echo "Refresh type: $REFRESH"
for app in $(cat /tmp/apps); do
  argocd app get $REFRESH $app > /dev/null &
done
time wait
```

Results:

- Before PR: ~15 seconds
- After PR: ~1.2 seconds
reposerver/repository/repository.go (Outdated)

```go
// Helm based apps might download dependencies and cannot be processed concurrently
return v1alpha1.ApplicationSourceType(appType) != v1alpha1.ApplicationSourceTypeHelm
```
Unfortunately, `helm dependency build` fails if run concurrently, so Helm-based apps from the same revision cannot be processed in parallel. But we can process such apps in different directories; I will work on it in a separate PR.
Figured out that we can simply add a lock that prevents running concurrent `helm dependency build` commands in the same directory.
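The per-directory lock mentioned above could be sketched as a keyed mutex: one `sync.Mutex` per repository directory, so only one `helm dependency build` runs in a given directory while other directories proceed in parallel. This is an illustrative sketch with hypothetical names (`lockForDir`, `helmDependencyBuild`), not the PR's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// dirLocks maps a repository directory to its mutex; mu guards the map itself.
var (
	mu       sync.Mutex
	dirLocks = map[string]*sync.Mutex{}
)

// lockForDir returns the (lazily created) mutex for a directory.
func lockForDir(dir string) *sync.Mutex {
	mu.Lock()
	defer mu.Unlock()
	l, ok := dirLocks[dir]
	if !ok {
		l = &sync.Mutex{}
		dirLocks[dir] = l
	}
	return l
}

// helmDependencyBuild stands in for invoking the real command; the point
// shown here is the locking pattern, not the helm invocation itself.
func helmDependencyBuild(dir string) {
	l := lockForDir(dir)
	l.Lock()
	defer l.Unlock()
	fmt.Println("building dependencies in", dir)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			helmDependencyBuild("/tmp/repo")
		}()
	}
	wg.Wait()
}
```

With this pattern, builds targeting different directories never block each other, while builds in the same directory serialize.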
I've not had the chance to take a thorough look at the code, nor have I performed any tests yet. But from glancing over the changes, I have one concern so far (please see the inline comment).

Also, looking forward, I was wondering whether we could just create shallow clones of the repositories at the requested revision for each application that refers to the repository, adding the name of the application to the path of the local checkout. For example, when we have two applications `app-a` and `app-b` which both refer to https://github.com/argoproj/argocd-example-apps, repo-server would check out two repositories at `/tmp/app-a_https:__github.com_argoproj_argocd-example-apps` and `/tmp/app-b_https:__github.com_argoproj_argocd-example-apps` instead of just `/tmp/https:__github.com_argoproj_argocd-example-apps`.

This would not require any locking at all, but at the expense of more temporary disk space. Performing a shallow fetch could keep that expense low, though. For really large mono repos, we could introduce something like a weighted semaphore to limit concurrent checkouts per repo-server instance to a configurable value, if temporary disk space on the node is sparse. It would also have the nice effect of allowing Kustomize and (J|K)Sonnet sources to be processed concurrently.

However, I'm not sure about the rationale behind the checkout and general repo-handling design, so it may well be that I've overlooked something that was already discussed in the past and led to the implementation we currently have.
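The semaphore idea above could look roughly like this; the sketch below uses a buffered channel as a counting semaphore (a stdlib-only stand-in for a weighted semaphore), and `shallowCheckout` is a hypothetical placeholder for a real `git clone --depth 1` into an application-specific path:

```go
package main

import (
	"fmt"
	"sync"
)

// At most maxConcurrentCheckouts per-application checkouts run at once on a
// repo-server instance; the buffered channel acts as a counting semaphore.
const maxConcurrentCheckouts = 2

var checkoutSlots = make(chan struct{}, maxConcurrentCheckouts)

// shallowCheckout stands in for a shallow git clone into an app-specific
// directory; only the throttling pattern is shown here.
func shallowCheckout(app, repo string) {
	checkoutSlots <- struct{}{}        // acquire a slot (blocks when full)
	defer func() { <-checkoutSlots }() // release the slot
	fmt.Printf("checking out %s for %s\n", repo, app)
}

func main() {
	var wg sync.WaitGroup
	for _, app := range []string{"app-a", "app-b", "app-c"} {
		wg.Add(1)
		go func(a string) {
			defer wg.Done()
			shallowCheckout(a, "https://github.com/argoproj/argocd-example-apps")
		}(app)
	}
	wg.Wait()
}
```

A real implementation would more likely use `golang.org/x/sync/semaphore` so that larger repositories can be given a higher weight, but the backpressure behavior is the same.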
reposerver/repository/repository.go (Outdated)

```go
// allowConcurrent returns true if given application source can be processed concurrently
func allowConcurrent(source *v1alpha1.ApplicationSource) bool {
```
I think it might be somewhat dangerous to test here, and in this way, whether a source allows concurrent repository processing. The reason is that we might want to add further source options later on, and this is a hard-to-catch dependency. Maybe we should move this to `pkg/apis/application/v1alpha1/types.go` and make it a receiver function on the various source types, so it becomes more obvious and easier for people writing and reviewing code. For example, in some pseudo-ish code:
```go
func (k *ApplicationSourceKustomize) AllowsConcurrentProcessing() bool {
	return len(k.Images) == 0 &&
		len(k.CommonLabels) == 0 &&
		k.NamePrefix == "" &&
		k.NameSuffix == ""
}

func (k *ApplicationSourceKsonnet) AllowsConcurrentProcessing() bool {
	return ...
}

func (a *ApplicationSource) AllowsConcurrentProcessing() bool {
	switch {
	case a.Kustomize != nil:
		return a.Kustomize.AllowsConcurrentProcessing()
	...
	}
}
```
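A self-contained, runnable version of that sketch might look like the following. The types here are simplified stand-ins, not the real `v1alpha1` definitions, and the Helm case follows the earlier observation that `helm dependency build` writes into the source directory:

```go
package main

import "fmt"

// Simplified stand-ins for the real Argo CD source types; field names follow
// the review comment, but this is only an illustrative sketch.
type ApplicationSourceKustomize struct {
	Images       []string
	CommonLabels map[string]string
	NamePrefix   string
	NameSuffix   string
}

type ApplicationSourceHelm struct{}

type ApplicationSource struct {
	Kustomize *ApplicationSourceKustomize
	Helm      *ApplicationSourceHelm
}

// A Kustomize source is safe to process concurrently only when it does not
// rewrite the checked-out files (no image overrides, labels, or renames).
func (k *ApplicationSourceKustomize) AllowsConcurrentProcessing() bool {
	return len(k.Images) == 0 &&
		len(k.CommonLabels) == 0 &&
		k.NamePrefix == "" &&
		k.NameSuffix == ""
}

func (s *ApplicationSource) AllowsConcurrentProcessing() bool {
	switch {
	case s.Kustomize != nil:
		return s.Kustomize.AllowsConcurrentProcessing()
	case s.Helm != nil:
		// helm dependency build modifies the source directory.
		return false
	}
	return true
}

func main() {
	plain := &ApplicationSource{}
	kustomized := &ApplicationSource{Kustomize: &ApplicationSourceKustomize{NamePrefix: "dev-"}}
	fmt.Println(plain.AllowsConcurrentProcessing())      // true
	fmt.Println(kustomized.AllowsConcurrentProcessing()) // false
}
```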
Good point. Implemented the change
Thank you for the review @jannfis. Implemented your suggestion about moving the check into receiver functions on the source types.

Cloning the same repository multiple times with some kind of optimization such as a shallow clone was the first idea. I decided to consider the current approach first after evaluating our own use case: we have a repository with ~200+ applications (one add-on per cluster). A single commit to the repository invalidates the manifest cache of all applications, and after receiving the webhook notification the controller fires 200 manifest generation requests. Cloning the same repository 200 times definitely won't work. As you've mentioned, we might come up with some limit on clones per repo and run several experiments to find the right number. I'm afraid it won't be that easy, and we would need to deal with things like GitHub/GitLab rate limiting, insufficient network bandwidth, etc.

On the other hand, the optimization in this PR already provides the best possible combination: one repo clone handles all 200 requests. The optimization doesn't require any significant redesign or code changes, so it is really easy to try. There is a chance that processing the same commit in parallel is enough; otherwise, we can try multiple repository clones. What do you think?
I agree that your implementation is the most reasonable and efficient one for plain YAML manifests, and I really do like the approach of re-using the same repository for the same commit SHA to generate the manifests for reconciliation. However, I was thinking about the use case of more dynamically generated resources, i.e. when using any of the generators such as Kustomize that modify the source to some extent, or when there are simultaneous commits to multiple tracking branches.

But IMO your PR solves >80% of use cases, so we can think about solutions for the remaining 20% and implement them later, extending your work in this PR if possible. Maybe instead of cloning the repository multiple times, we can just recursively copy the initial clone from the Git provider on disk for applications that need to modify the source. This would save network bandwidth and reduce pressure on the Git provider (while still not addressing the disk space requirements, though). But anyway, let's go with parallel use of the same checkout for the time being :)
LGTM
@alexmt I want to use the concurrent processing feature along with a plugin. Since the plugin is triggered from the root of the repo (…). Thanks
Implements the first part of #4583.

This PR changes the repository server so that it executes in parallel multiple operations that require the same repository and the same commit SHA, provided the operation does not change files in the repository. In theory, this should significantly improve manifest generation for a mono repo with many Argo CD apps in it.
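The "parallel for read-only operations, exclusive for file-modifying ones" behavior described above can be pictured with a per-repository `sync.RWMutex`. This is an illustrative sketch with a hypothetical `generateManifests` function, not the PR's actual implementation:

```go
package main

import (
	"fmt"
	"sync"
)

// repoLock guards one repository checkout. Manifest generations that only
// read the checked-out commit share the read lock and run in parallel;
// operations that modify files in the checkout take the exclusive write lock.
var repoLock sync.RWMutex

func generateManifests(app string, modifiesFiles bool) {
	if modifiesFiles {
		repoLock.Lock()
		defer repoLock.Unlock()
	} else {
		repoLock.RLock()
		defer repoLock.RUnlock()
	}
	fmt.Println("generated manifests for", app)
}

func main() {
	var wg sync.WaitGroup
	// Five read-only generations against the same checkout proceed concurrently.
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			generateManifests(fmt.Sprintf("app-%d", n), false)
		}(i)
	}
	wg.Wait()
}
```

This is why one clone can serve hundreds of simultaneous requests for the same commit: the read path never serializes, and only sources that rewrite the checkout (e.g. Helm dependency builds) fall back to exclusive access.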