Skip to content

Conversation

chradcliffe
Copy link
Contributor

@chradcliffe chradcliffe commented Jan 22, 2021

Commit Message:
Fix a potential use-after-free error in ScaledRangeTimerManagerImpl by adding a processing_timers_ flag to the timer queues that is set during onQueueTimerFired processing. This flag is checked when a timer is removed to ensure that the timer's queue isn't deleted while it is in a callback triggered by onQueueTimerFired.

Additional Description: N/A
Risk Level: Low - the issue does not appear possible in current Envoy code
Testing: unit test that reproduces the issue under ASAN
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A

Issue #14798

Craig Radcliffe added 2 commits January 22, 2021 12:55
…ssing a callback

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
Add a `processing_timers_` flag to the timer queues that is set during
`onQueueTimerFired` processing. This flag is checked when a timer is
removed to ensure that the timer's queue isn't deleted while it is in
a callback triggered by `onQueueTimerFired`.

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
@chradcliffe
Copy link
Contributor Author

cc @akonradi

@akonradi
Copy link
Contributor

Sweet, thanks for the fix!

akonradi
akonradi previously approved these changes Jan 22, 2021
Comment on lines 170 to 171
timer1->enableTimer(std::chrono::seconds(100));
timer2->enableTimer(std::chrono::seconds(100));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these timers are enabled for the same duration, they can fire in either order. Consider making timer1 fire in 95 seconds with an absolute minimum of 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need both timers to end up in the same queues so their durations should be the same.

I think the solution is to do an advance of 5 seconds between the first enable and the second enable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either of those will work. The duration that matters is (max - min) which is (100 - 5) or (95 - 0) as adjusted.

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
while (!timers.empty() &&
computeTriggerTime(timers.front(), queue.duration_, scale_factor_) <= now) {
auto item = std::move(queue.range_timers_.front());
queue.range_timers_.pop_front();
item.timer_.trigger();
}
queue.processing_timers_ = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So processing_timers_ needs to be per queue since otherwise the queue not empty invariant would not hold if a timer from queue A removes a timer from queue B. It would be good to also cover that case in tests.

Also, is there code somewhere that verifies the invariant that all active queues are non-empty?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there code somewhere that verifies the invariant that all active queues are non-empty?

I don't believe there is currently. The number of queues and sizes is not currently exposed. I'm okay to keep this PR focused on the bugfix and add additional testing in a follow-on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

023d9ec adds a test for cancelling a timer from a different queue in the context of a timer callback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc: @asraa

@akonradi Please do followup. Regarding ASSERTs, I think failing this one will result in a crash in the next line.

ASSERT(!queue.range_timers_.empty());

…ent queue

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
Copy link
Contributor

@antoniovicente antoniovicente left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reporting this bug and providing a fix.

I would like to keep the issue you filed open until we manage to followup on some of the test improvements mentioned in various review comments.

@chradcliffe
Copy link
Contributor Author

Does this make sense as a candidate for a backport to 1.17? That is where I originally discovered this issue.

@antoniovicente antoniovicente added the backport/review Request to backport to stable releases label Jan 23, 2021
@antoniovicente
Copy link
Contributor

Tagged for backport.

@envoyproxy/senior-maintainers could one of you review the PR?

@antoniovicente
Copy link
Contributor

ping @envoyproxy/senior-maintainers for review since previous mention was over the weekend.

@ggreenway ggreenway merged commit f5468fa into envoyproxy:main Jan 27, 2021
@chradcliffe chradcliffe deleted the chradcliffe/scaled-range-timer branch January 27, 2021 17:47
@chradcliffe chradcliffe restored the chradcliffe/scaled-range-timer branch January 27, 2021 17:47
@Shikugawa Shikugawa added backport/approved Approved backports to stable releases and removed backport/review Request to backport to stable releases labels Feb 1, 2021
Shikugawa pushed a commit to Shikugawa/envoy that referenced this pull request Feb 10, 2021
…nvoyproxy#14799)

Fix a potential use-after-free error in ScaledRangeTimerManagerImpl by adding a processing_timers_ flag to the timer queues that is set during onQueueTimerFired processing. This flag is checked when a timer is removed to ensure that the timer's queue isn't deleted while it is in a callback triggered by onQueueTimerFired.

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
Signed-off-by: Shikugawa <rei@tetrate.io>
Shikugawa pushed a commit to Shikugawa/envoy that referenced this pull request Feb 10, 2021
…nvoyproxy#14799)

Fix a potential use-after-free error in ScaledRangeTimerManagerImpl by adding a processing_timers_ flag to the timer queues that is set during onQueueTimerFired processing. This flag is checked when a timer is removed to ensure that the timer's queue isn't deleted while it is in a callback triggered by onQueueTimerFired.

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
Signed-off-by: Shikugawa <rei@tetrate.io>
akonradi added a commit to akonradi/envoy that referenced this pull request Feb 10, 2021
Signed-off-by: Alex Konradi <akonradi@google.com>
antoniovicente pushed a commit that referenced this pull request Feb 11, 2021
The bug is triggered when a scaled timer callback disables another
scaled timer, which was causing the shared queue to be deleted before
being referenced again by the triggered timer.

Signed-off-by: Alex Konradi <akonradi@google.com>
antoniovicente pushed a commit that referenced this pull request Feb 11, 2021
…g timer fire (#14799) (#15007)

Fix a potential use-after-free error in ScaledRangeTimerManagerImpl by adding a processing_timers_ flag to the timer queues that is set during onQueueTimerFired processing. This flag is checked when a timer is removed to ensure that the timer's queue isn't deleted while it is in a callback triggered by onQueueTimerFired.

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
Signed-off-by: Shikugawa <rei@tetrate.io>
antoniovicente pushed a commit that referenced this pull request Feb 11, 2021
…g timer fire (#14799) (#15006)

Fix a potential use-after-free error in ScaledRangeTimerManagerImpl by adding a processing_timers_ flag to the timer queues that is set during onQueueTimerFired processing. This flag is checked when a timer is removed to ensure that the timer's queue isn't deleted while it is in a callback triggered by onQueueTimerFired.

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
Signed-off-by: Shikugawa <rei@tetrate.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/approved Approved backports to stable releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants