Initiate Dapr shutdown after expiry of grace period - Issue 5481 #5562
Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
Codecov Report
@@ Coverage Diff @@
## master #5562 +/- ##
==========================================
- Coverage 65.08% 65.06% -0.02%
==========================================
Files 143 143
Lines 15286 15295 +9
==========================================
+ Hits 9949 9952 +3
- Misses 4637 4640 +3
- Partials 700 703 +3
I do not support this change, sorry.
A few months ago we went through an extensive amount of work to be able to shut down PubSub and input binding before the grace period, as this is the most correct behavior for the majority of users.
Imagine a pod that gets shut down. Both daprd and the app get the SIGTERM at the same time and should immediately begin the shutdown sequence. Usually apps terminate instantly. If Dapr were to still take messages from PubSub and input bindings, we'd try to deliver them to apps that are already shut down, and that's not polite (and could cause messages to go to the DLQ incorrectly).
The grace period exists for outputs, so that the app can complete its work and store the output. But we shouldn't try to bring more work to the app during the grace period.
This makes sense. We are closing all Dapr APIs, and Dapr to App communication. Though I have one question: most of the output component APIs are synchronous in Dapr, so what happens to in-flight requests if we suddenly close down the servers? @ItalyPaleAle WDYT?
Right now, the servers are shut down at the end of the grace period. Interesting idea about not accepting new requests, but I am not sure how I feel about it. The app should be able to invoke Dapr APIs during the graceful shutdown period if it needs to store its output somewhere IMHO. Here's how I see the shutdown sequence working:
That's why I don't think Dapr should stop accepting new work (from the app), as it may still be needed. It's the app's responsibility to complete all work within the grace period, however. |
But with the current sequence in the code, the Dapr APIs are closed immediately. We are closing the APIs (the closer.Close() calls) even before the graceful shutdown period:

    log.Info("dapr shutting down.")
    log.Info("Stopping PubSub subscribers and input bindings")
    a.stopSubscriptions()
    a.stopReadingFromBindings()
    a.cancel()
    a.stopActor()
    log.Info("Stopping Dapr APIs")
    for _, closer := range a.apiClosers {
        if err := closer.Close(); err != nil {
            log.Warnf("error closing API: %v", err)
        }
    }
    shutdownCtx, shutdownCancel := context.WithCancel(context.Background())
    go func() {
        if a.tracerProvider != nil {
            a.tracerProvider.Shutdown(shutdownCtx)
        }
    }()
    log.Infof("Waiting %s to finish outstanding operations", duration)
    <-time.After(duration)

Shouldn't the
Additionally, if Dapr APIs are still available during the graceful shutdown period, shouldn't actors (
We are also immediately calling the
Should the line
Yes I agree with you.
I think that would probably be correct
Not 100% sure on this, but I think what you're saying makes sense
Possibly, but I'm not exactly sure what the context is used for right now. But you are probably correct. |
Yes, that is the correct behavior. |
Also @yaron2 ... what is your thought on the following two lines ?
|
Actors are different, the actor runtime should be stopped when the signal is received to give the runtime the chance to disconnect from placement as soon as possible and finish the rehashing in a controlled manner. Since Dapr is the actual compute orchestrator here, I think that makes sense. Ongoing requests will be drained as part of the actor runtime behavior. If ongoing requests getting drained isn't guaranteed by the actor runtime, then the above isn't true and we should stop actors after the grace period has elapsed.
It should probably be called after the timeout. |
@akhilac1 please make the changes as discussed above
…ndings Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
@ItalyPaleAle @yaron2 @mukundansundar - tagging for review |
Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
Thanks for adding the UTs. They seem to be flaky on Windows; please take a look.
Have added comments.
…hutdown for windows tests Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
Yes. Sending SIGTERM fails on Windows, so the assertion fails. Fixed this by calling rt.Shutdown when we are unable to send SIGTERM.
Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
…nto graceperiod_5481 Resolve merge conflict and add pubsub shutdown order check
…face to return Tracer Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
@ItalyPaleAle @mukundansundar - Pinging for attention |
…e method, moving trace shutdown to after api shutdown and removing go routine Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
…nto graceperiod_5481
Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
/ok-to-test |
Dapr E2E test
Commit ref: 45a75fd
✅ Build succeeded for linux/amd64
✅ Infrastructure deployed
✅ Build succeeded for linux/arm64
✅ Build succeeded for windows/amd64
    }

    func sendSigterm(rt *DaprRuntime) {
        rt.runtimeConfig.GracefulShutdownDuration = 5 * time.Second
Can you change this to 2? Just so tests end quicker.
lgtm
Initiate Dapr shutdown after expiry of grace period - Issue 5481 (dapr#5562)
* Kick off shutdown after expiry of grace period
* update shutdown with grace period. Add tests for pubsub actors and bindings
* fix linting and windows incompatibility
* fixed tests on windows. SIGTERM sending fails on windows. So invoke shutdown for windows tests
* review comments incorporated
* review comments incorporated
* removed comment
* update branch and add pubsub order check
* Fixed trace initiation and shutdown. Updated trace Registration interface to return Tracer
* reverting timeout pushed in test and moving trace shutdown to seperate method, moving trace shutdown to after api shutdown and removing go routine
* re-trigger pipeline
Signed-off-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
Co-authored-by: Akhila Chetlapalle <akhila@Akhilas-MacBook-Pro.local>
Co-authored-by: Yaron Schneider <schneider.yaron@live.com>
Co-authored-by: Alessandro (Ale) Segala <43508+ItalyPaleAle@users.noreply.github.com>
Co-authored-by: Loong Dai <long.dai@intel.com>
Co-authored-by: Mukundan Sundararajan <65565396+mukundansundar@users.noreply.github.com>
Co-authored-by: Artur Souza <artursouza.ms@outlook.com>
Signed-off-by: Akhila Chetlapalle akhila@Akhilas-MacBook-Pro.local
Description
Issue #5481 limits usage of Dapr for a subset of users, especially during rolling upgrades. The load balancer continues to send requests even after SIGTERM is sent to the Pod. An app written in Spring can use a pre-stop hook and delay its shutdown so that it keeps serving the requests that arrive while the load balancer identifies the pod as Terminating and removes it from the list of active pods.
Dapr, however, closes its inbound channels as soon as SIGTERM is received and so will not receive any further requests. As a result, many failed requests are seen in this environment.
This PR moves the closing of inbound channels to after the grace period expires, to address the immediate closure and the resulting dropped requests.
Issue reference
#5481
Please reference the issue this PR will close: #5481
Checklist
Please make sure you've completed the relevant tasks for this PR, out of the following list: