Skip to content
This repository was archived by the owner on Sep 14, 2020. It is now read-only.

Conversation

nolar
Copy link
Contributor

@nolar nolar commented May 8, 2020

What do these changes do?

Fix an issue in 0.27rc5 when resource deletion hangs for daemons/timers combined with other handlers — due to not allowing them a fair chance to exit instantly, despite they can.

Description

Previously, when a daemon/timer was capable of exiting instantly, it didn't have enough event loop cycles to do that — only one cycle was given with only one asyncio.sleep(0), while there could be a few await calls both in the daemon itself and in the framework's wrapping routines (even if it is zero-time, await gives control back to the event loop).

This is especially important for timers, which spent most of their time in sleeping until the next trigger, so they definitely can exit instantly, but didn't have a chance due to 2-3 await uses internally.

Now, Kopf will give 10 zero-time asyncio cycles to such daemons/timers — i.e. up to 10 potential await statements since the stopper is set or the task is cancelled — and this is configurable. Alternatively, instead of the cycles, a short timeout can be used (e.g. 0.001-0.01-0.1 seconds or so) — also configurable.

This PR fixes the most common case when #356 can happen: in case of daemons and timers with "instant exit" implementation. The issue is still possible in case of actually slowly existing daemons combined with specific operator setup (due to no-op patches) — to be fixed separately in #???.


Additionally, some cleanup is performed in the code:

  • Module-level constants are moved to configuration settings.
  • Better log presentation is used for daemons/timers.
  • daemon_id is removed since it is not needed anymore.

Issues/PRs

Issues: #356

Type of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Refactoring (non-breaking change which does not alter the behaviour)

Checklist

  • The code addresses only the mentioned problem, and this problem only
  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt

The settings are intentionally not documented in the docs, in order to not over-complicate the docs. These settings are extremely advanced, and the defaults should be enough for most or all uses-cases. Those who might ever need to change it, will probably dive deep to the source code to find it themselves.

nolar added 5 commits May 8, 2020 17:40
Generally, no magic constant should be left, everything must be
a configurable value — since the configuration structures were added
(some can be undocumented).

This is just one of the leftovers with the module-level constants.
Previously, handlers were a reactor's module, and importing upper-level
handlers into lower-level structs.containers was not desired. Now, they
are both low-level, so we can import HandlerId into structs.containers.

The daemon_id is not needed also due to usage of self-presentation of
daemons & timers (previously, `f"Daemon {daemon_id}"` was used always).
There is nothing "instant" in the world, everything takes time.
We have to specifically define what we consider as "instant"
before switching to external resource patching & status polling.

These are quite advanced features, so they are not reflected
in the configuration docs. A docstring should be enough.
@zincr
Copy link

zincr bot commented May 8, 2020

🤖 zincr found 0 problems , 1 warning

ℹ️ Large Commits
✅ Approvals
✅ Specification
✅ Dependency Licensing

Details on how to resolve are provided below


Large Commits

Checks all commits for large additions to a single file. Large commits should be reviewed more carefully for potential copyright and licensing issues

This file contains a substantial change, please review to determine if the change comes from an external source and if there are any copyright or licensing issues to be aware of

  • ℹ️ kopf/reactor/daemons.py had +100 lines of code changed in a single commit
    Please review this file to determine the source of this change
     

@nolar nolar added the bug Something isn't working label May 8, 2020
@nolar nolar requested review from 3abdelazim and mnarodovitch May 11, 2020 08:47
@nolar nolar added this to the 0.27 milestone May 11, 2020
@nolar nolar merged commit 94933eb into zalando-incubator:master May 11, 2020
@nolar nolar deleted the 356-timer-with-delete branch May 11, 2020 16:23
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants