Skip to content
This repository was archived by the owner on Sep 14, 2020. It is now read-only.

Conversation

dbazhal
Copy link
Contributor

@dbazhal dbazhal commented Jun 12, 2019

Trying to fix blocking operations, such as objects patching, event posting, etc

Issue : #111

Description

  1. Added config - kopf.config.WorkersConfig - class attributes based options to control async workers. So user can just import it before all other kopf modules and change values to required. Maybe not the best decision, but any.
  2. So now can be configured number of next types of workers: queue processing, object patching, event creation.
  3. Blocking synchronous operations such as event creation, changed objects patching, are now asynchronous (i guess).
  4. Added kopf.config.EventsConfig.events_loglevel to control what types of logs should trigger event sending - INFO/WARNING/ERROR/CRITICAL

Existing tests updated for new async functions.
E2E tests required to test performance(how fast large number of new/changed objects can be processed). Such tests would prove efficiency of kopf concurrency.

Types of Changes

  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Configuration change
  • Refactor/improvements

Tasks

List of tasks you will do to complete the PR

  • Documentation
  • E2E performance tests
  • Review?

Review

List of tasks the reviewer must do to review the PR

  • Tests (don't know if there could be tests to do)
  • Documentation (maybe?)

…options to control async workers. So user can just import it before all other kopf modules and change values to required. Maybe not the best decision, but any.

2. So now can be configured number of next types of workers: queue processing, object patching, event creation.
3. Blocking synchronous operations such as event creation, changed objects patching, are now asynchronous (i guess).

Existing tests updated for new async functions.
E2E tests required to test performance(how fast large number of new/changed objects can be processed). Such tests would prove efficiency of kopf concurrency.
@dbazhal dbazhal requested review from nolar and samurang87 as code owners June 12, 2019 15:56
@zincr
Copy link

zincr bot commented Jun 12, 2019

🤖 zincr found 0 problems , 0 warnings

✅ Large Commits
✅ Specification
✅ Dependency Licensing

Copy link
Contributor

@nolar nolar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! This can speed up the framework significantly.

Solution-wise, I have no questions anymore — thanks for reacting fast to the notes given in the chat. All is clear for me now.

Code-wise, I've added few notes. To summarise:

  • Constants' docstrings for Sphinx.
  • Google-style imports.
  • Explicit event-loop management.
  • Async tests (a simpler way).

And couple places where I didn't get what is happening — some clarification would be helpful.

Looking forward when these improvements get into the master branch!

@nolar
Copy link
Contributor

nolar commented Jun 17, 2019

As a side-note: I tried to check why Travis CI pipeline is not triggered for this PR, and for this PR only — it works for other PRs & contributors. Travis CI says "Abuse detected" without any additional explanation.

I will try to contact Travis CI support to clarify what is the actual reason.

My only assumption for now is that this is a PR from the "master" branch of a fork to the "master" branch of the upstream — maybe Travis does not like it. This is the only "unusual" thing in this PR. Just guessing.

nolar
nolar previously approved these changes Jun 19, 2019
Copy link
Contributor

@nolar nolar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, I'm ready to merge the PR — the default behaviour with the default config is as intended, public interfaces are kept intact, blocking operations removed, asyncio magic is clear.

@dbazhal However, I have one last-minute question regarding how the configs affect the thread pool sizes (and do they):

logger = logging.getLogger(__name__)

MAX_MESSAGE_LENGTH = 1024
CUT_MESSAGE_INFIX = '...'


def post_event(*, obj, type, reason, message=''):
event_executor = concurrent.futures.ThreadPoolExecutor(max_workers=config.WorkersConfig.synchronous_event_post_workers_limit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a new question. The intention of this PR is to make thread-pools' sizes configurable (and a lot of other things too).

However, if you import kopf (or kopf run in CLI), all the modules will be loaded, this one included, and the executors will be created with the default hard-coded config (as they are the module-level objects).

Any config changes applied after the import will have no effect on the executors.

What is the intended way of changing the configs to alter the Kopf's runtime setup?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, after a quick discussion with the PR's author (in a chat), we decided to go as is for now.

These things (thread pool sizes) were not configurable before the PR, so they will be not configurable after the PR — nothing changes here.

However, blocking operations (API calls) are now moved from asyncio event-loops to the thread pools — which gives a great performance gain already, and was the main goal in this PR.

And the introduced config objects can be used to dynamically configure the Kopf's runtime later, in a separate PR (since configurability is a separate topic).



Configure logging events
=================
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These "==="-lines should be at least the same length as the text they use as a caption/title: http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#sections

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, thx

.. code-block:: python
:caption: test_example_operator.py

from kopf import config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work? I tried with a local simulation:

# pkg/__init__.py
print('init!')
import pkg.unwanted
import pkg.mod
# pkg/mod.py
print('mod')
# pkg/unwanted.py
print('UNWANTED')
>>> from pkg import mod
init!
UNWANTED
mod

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Omg, I missed init :) no, that does not work at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to figure out how it could be configured.

Copy link
Contributor

@nolar nolar Jun 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbazhal I believe, it cannot be configured. This is why I've made a comment above (the one with "after a quick discussion") about skipping this topic for now, postponing it for another PR about configurability and isolated user-controlled event-loops.

One way would be to have kopf_config module outside of kopf package, as a top-level module. But that doesn't look nice (it becomes a time-frozen public interface for the users). A better way would be to have a cfg instance object passed down the call stack starting from run() or create_tasks() — but it is a big changeset.

from kopf import config

# Now kopf will send events only when error or critical occasion happens
config.EventsConfig.events_loglevel = config.LOGLEVEL_ERROR
Copy link
Contributor

@nolar nolar Jun 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything beside thread pool sizes, such as this loglevel, can be set at any time after the imports. The values are used as they are at the moment of their usage, they are never remembered locally in the functions. Or do I miss something?

Which means, the two-step imports are not needed for them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, fixed that.

nolar
nolar previously approved these changes Jun 20, 2019
Copy link
Contributor

@nolar nolar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I was wrong. It can be configured 😄

kopf/config.py Outdated
@staticmethod
def set_synchronous_tasks_threadpool_limit(new_limit: int):
if new_limit < 1:
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, ValueError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your truth, ValueError it be :)

@nolar nolar merged commit c759d68 into zalando-incubator:master Jun 21, 2019
@nolar nolar mentioned this pull request Jul 1, 2019
2 tasks
@nolar nolar added the refactoring Code cleanup without new features added label Oct 22, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
refactoring Code cleanup without new features added
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants