Skip to content
This repository was archived by the owner on Apr 26, 2024. It is now read-only.
This repository was archived by the owner on Apr 26, 2024. It is now read-only.

synapse wedges after logrotation #8892

@michaelkaye

Description

@michaelkaye

No obvious goings on or CPU spinning in a py-spy top - looked like everything was hung. Process was generally just sitting idle.

Did see this in a py-spy dump:

Thread 52103 (idle): "MainThread"
    start (synapse/metrics/background_process_metrics.py:266)
    set_current_context (synapse/logging/context.py:628)
    __exit__ (synapse/logging/context.py:587)
    call_later (synapse/util/__init__.py:131)
    run_sighup (synapse/app/_base.py:263)
    start (synapse/metrics/background_process_metrics.py:267)
    set_current_context (synapse/logging/context.py:628)
    __enter__ (synapse/logging/context.py:584)
    messageReceived (synapse/replication/tcp/redis.py:108)
    replyReceived (txredisapi.py:1919)
    dataReceived (txredisapi.py:1857)
    _dataReceived (twisted/internet/tcp.py:249)
    doRead (twisted/internet/tcp.py:243)
    _doReadOrWrite (twisted/internet/posixbase.py:614)
    callWithContext (twisted/python/context.py:85)
    callWithContext (twisted/python/context.py:122)
    callWithContext (twisted/python/log.py:86)
    callWithLogger (twisted/python/log.py:103)
    doPoll (twisted/internet/epollreactor.py:235)
    mainLoop (twisted/internet/base.py:1295)
    run (twisted/internet/base.py:1283)
    run (synapse/app/_base.py:117)
    start_reactor (synapse/app/_base.py:133)
    start_worker_reactor (synapse/app/_base.py:80)
    start (synapse/app/generic_worker.py:988)
    <module> (synapse/app/generic_worker.py:993)
    _run_code (runpy.py:85)
    _run_module_as_main (runpy.py:193)
Thread 52107 (idle): "Thread-1"
    select (selectors.py:415)
    serve_forever (socketserver.py:232)
    run (threading.py:865)
    _bootstrap_inner (threading.py:917)
    _bootstrap (threading.py:885)
Thread 52108 (idle): "PoolThread-140238481588800-0"
    wait (threading.py:296)
    get (queue.py:170)
    work (twisted/_threads/_threadworker.py:45)
    run (threading.py:865)
    run (sentry_sdk/integrations/threading.py:67)
    _bootstrap_inner (threading.py:917)
    _bootstrap (threading.py:885)

... more of the same
Thread 52118 (idle): "PoolThread-twisted.internet.reactor-0"
    wait (threading.py:296)
    get (queue.py:170)
    work (twisted/_threads/_threadworker.py:45)
    run (threading.py:865)
    run (sentry_sdk/integrations/threading.py:67)
    _bootstrap_inner (threading.py:917)
    _bootstrap (threading.py:885)
Thread 44785 (idle): "raven-sentry.BackgroundWorker"
    wait (threading.py:296)
    get (queue.py:170)
    _target (sentry_sdk/worker.py:122)
    run (threading.py:865)
    run (sentry_sdk/integrations/threading.py:67)
    _bootstrap_inner (threading.py:917)
    _bootstrap (threading.py:885)
Thread 56726 (idle): "Thread-19758"
    collect (synapse/metrics/background_process_metrics.py:121)
    collect (prometheus_client/registry.py:75)
    collect (synapse/metrics/__init__.py:58)
    generate_latest (synapse/metrics/_exposition.py:97)
    do_GET (synapse/metrics/_exposition.py:203)
    handle_one_request (http/server.py:414)
    handle (http/server.py:426)
    __init__ (socketserver.py:720)
    finish_request (socketserver.py:360)
    process_request_thread (socketserver.py:650)
    run (threading.py:865)
    run (sentry_sdk/integrations/threading.py:67)
    _bootstrap_inner (threading.py:917)
    _bootstrap (threading.py:885)
... and this stack trace repeated as well

Symptoms were that the log rotation ran, the signal was sent, logrotate was happy, then the service stopped responding to monitoring requests. No logs and i think no processing either.

Metadata

Metadata

Assignees

Labels

z-bug(Deprecated Label)

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions