Upgrade to 5.0.2 potentially caused hanging threads #692

@PhilCoggins

Hello,

We recently upgraded to 5.0.2 and almost immediately started to notice major degradation on one of our Sidekiq queues that makes heavy use of HTTP. I have attached a screenshot of our jobs.wait metric (how long a job takes to complete from the time it was enqueued, in ms) for the last four weeks. We started seeing a lot of "dead" threads: jobs making external HTTP requests would hang indefinitely and exhaust our worker pool until we restarted the dynos, which resolved the problem immediately. The vertical line shows when the release with the HTTP upgrade was rolled out. I downgraded back to 5.0.1 today and so far we have observed no dead threads.

I strongly suspect a regression in this PR, but I haven't yet been able to build a reproducible test case because of the intermittent nature of the external resources and the concurrent access to them. I'm happy to provide additional information and welcome any tips on how to troubleshoot. We do use the global .timeout(20) in these requests.
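For anyone trying to narrow this down, one possible way to see where a hung thread is blocked is to dump the backtrace of every live thread from inside the worker process. The sketch below uses only the Ruby standard library. `thread_dump` is a hypothetical helper name, not part of http or Sidekiq, and the sleeping thread only stands in for a request that never returns.

```ruby
# Hypothetical helper (not part of http or Sidekiq): build one text entry
# per thread with its status and the top of its backtrace, so a hung
# worker shows which call it is blocked in.
def thread_dump(threads = Thread.list)
  threads.map do |thread|
    frames = thread.backtrace || [] # backtrace is nil once a thread is dead
    header = "Thread #{thread.object_id} status=#{thread.status.inspect}"
    ([header] + frames.first(20)).join("\n  ")
  end
end

# Stand-in for a hung HTTP call: a thread blocked forever in sleep.
stuck = Thread.new { sleep }
sleep 0.01 until stuck.status == "sleep"

puts thread_dump([stuck])
stuck.kill
```

In a real worker, the dump would presumably be triggered once a job has run well past the 20-second timeout, and the frames near the top of each entry would show whether the thread is stuck in a connect, read, or write call.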

Thanks!

[Screenshot: Screen Shot 2021-09-28 at 6 12 00 PM]
