-
Notifications
You must be signed in to change notification settings - Fork 5.1k
upstream: fix degraded health check and thread posting #5662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixes two bugs with degraded health checks: 1) When an active health check succeeded but marked the host as degraded, we'd always set the change_state to ChangePending, which meant we'd not ever invoke the callback when hosts went from healthy -> degraded. 2) Fix typo preventing degraded hosts from being updated on worker threads. Signed-off-by: Snow Pettersen <snowp@squareup.com>
@@ -656,7 +656,7 @@ void ClusterManagerImpl::postThreadLocalClusterUpdate(const Cluster& cluster, ui | |||
// TODO(htuch): Can we skip these copies by exporting out const shared_ptr from HostSet? | |||
HostVectorConstSharedPtr hosts_copy(new HostVector(host_set->hosts())); | |||
HostVectorConstSharedPtr healthy_hosts_copy(new HostVector(host_set->healthyHosts())); | |||
HostVectorConstSharedPtr degraded_hosts_copy(new HostVector(host_set->healthyHosts())); | |||
HostVectorConstSharedPtr degraded_hosts_copy(new HostVector(host_set->degradedHosts())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find a good place to add a test for this - any ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should be able to add a test in cluster_manager_impl_test that does a sequence of updates and then verifies that the thread local host sets have the hosts they should?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good pending test.
/wait
@@ -656,7 +656,7 @@ void ClusterManagerImpl::postThreadLocalClusterUpdate(const Cluster& cluster, ui | |||
// TODO(htuch): Can we skip these copies by exporting out const shared_ptr from HostSet? | |||
HostVectorConstSharedPtr hosts_copy(new HostVector(host_set->hosts())); | |||
HostVectorConstSharedPtr healthy_hosts_copy(new HostVector(host_set->healthyHosts())); | |||
HostVectorConstSharedPtr degraded_hosts_copy(new HostVector(host_set->healthyHosts())); | |||
HostVectorConstSharedPtr degraded_hosts_copy(new HostVector(host_set->degradedHosts())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should be able to add a test in cluster_manager_impl_test that does a sequence of updates and then verifies that the thread local host sets have the hosts they should?
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
Fixes two bugs with degraded health checks: 1) When an active health check succeeded but marked the host as degraded, we'd always set the change_state to ChangePending, which meant we'd not ever invoke the callback when hosts went from healthy -> degraded. 2) Fix typo preventing degraded hosts from being updated on worker threads. Signed-off-by: Snow Pettersen <snowp@squareup.com> Signed-off-by: Dan Zhang <danzh@google.com>
Fixes two bugs with degraded health checks: 1) When an active health check succeeded but marked the host as degraded, we'd always set the change_state to ChangePending, which meant we'd not ever invoke the callback when hosts went from healthy -> degraded. 2) Fix typo preventing degraded hosts from being updated on worker threads. Signed-off-by: Snow Pettersen <snowp@squareup.com> Signed-off-by: Fred Douglas <fredlas@google.com>
Fixes two bugs related to degraded health checks:
degraded, we'd always set the change_state to ChangePending, which meant
we'd not ever invoke the callback when hosts went from healthy ->
degraded.
threads.
Signed-off-by: Snow Pettersen snowp@squareup.com
Risk Level: Medium
Testing: Updated unit tests
Docs Changes: n/a
Release Notes: n/a
#5063