adds copilot snapshot during cf config setup #4019
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Assign the PR to them by writing The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/assign @geeknoid
/ok-to-test
/retest
/retest
ignore the prow test failure for now; they are broken on master/all prs
@ZackButcher @costinm could we get a cursory review on this? Should fall in line with the file snapshot pattern for config injection that we refactored into.
Other than refactoring the large method - it seems reasonable.
I would appreciate more comments or a pointer to the docs on how copilot works with istio, how it is tested, etc.
pilot/pkg/bootstrap/server.go
Outdated
// Defer starting the file monitor until after the service is created.
s.addStartFunc(func(stop chan struct{}) error {
	fileMonitor.Start(stop)
	return nil
})
Could you move this (or the entire block) to a separate method, initCopilotConfig() or similar? This method is getting too big for readability.
@costinm thanks for the review. For more details on how copilot works, you can see the e2e test that we've constructed here (it's got a mocked copilot, but it should provide a rough sketch): https://github.com/istio/istio/blob/master/tests/e2e/tests/pilot/cloudfoundry/copilot_test.go
copilot is meant to pull the relevant data from the Cloud Foundry API and hold it until Pilot / the Service Registry can come through and grab it.
LG overall, just some small stuff.
)

// CopilotClient defines a local interface for interacting with Cloud Foundry Copilot
//go:generate counterfeiter -o fakes/copilot_client.go --fake-name CopilotClient . CopilotClient
This needs to be a go run invocation. I (and the build machines) do not have counterfeiter installed already, so anything running a go generate over this dir will just break. Go run should be portable, however.
Just to clarify: would you like us to add counterfeiter to the vendor so that we can go run it, kinda like
istio/mixer/template/sample/doc.go
Line 31 in fae56a6
//go:generate go run $GOPATH/src/istio.io/istio/mixer/tools/codegen/cmd/mixgenbootstrap/main.go -f $GOPATH/src/istio.io/istio/mixer/template/sample/inventory.yaml -o $GOPATH/src/istio.io/istio/mixer/template/sample/template.gen.go |
discussed in slack, we are leaving this as is.
We chatted on Slack: I thought go run would download the binary (or rather, download the source and build it) if provided a fully qualified name; it does not. So this is fine. If we hit this footgun down the line and have issues due to the binary missing, we can vendor it. The PR can go in with this as-is though.
// ReadConfigFiles returns a complete set of VirtualServices for all Cloud Foundry routes known to Copilot.
// It may be used for the getSnapshotFunc when constructing a NewMonitor
func (c *CopilotSnapshot) ReadConfigFiles() ([]*model.Config, error) {
	resp, err := c.client.Routes(context.Background(), new(copilotapi.RoutesRequest))
Does copilot have a default deadline on incoming requests? Is there a reasonable default we can set here? We shouldn't be making any calls w/o some kind of deadline set on the context as a matter of hygiene. Too easy to leak connections/goroutines on calls that never come back otherwise.
I think we haven't been doing a good job of checking for that kind of thing in the codebase to date, but we need to (especially as we're zeroing in on Pilot perf).
@ZackButcher great catch - we'll just add a context with deadline here. Don't know what is reasonable for this call.
We talked in Slack, but to capture discussion here for future reference: the specific value isn't really important so much as that there's some deadline. A decent value is easy to get from running the system in prod for a bit (you're running copilot behind Envoy, to get those sweet, sweet Mixer metrics, right? :D ).
Name: fmt.Sprintf("route-for-%s", hostname),
},
Spec: &networking.VirtualService{
	Gateways: gatewayNames,
This is CF specific logic, I assume, since you're attaching virtual services to every gateway in the mesh? (IIRC from our chats y'all weren't planning on exposing everything on the gateways out of the box? Sorry it's been a busy week and I can't quite recall where we stand on that.)
Yes, this is a bit odd.. That said, I think what they are doing here is just that (exposing all services on the gateway), for this is how the default CF go router works.
We are filtering as well (note the blacklisted bit in the middle), so this is selective, depending on the route suffix. In this case, .internal is removed.
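A rough sketch of that kind of suffix filtering, for anyone skimming the thread. The function names and the sample hostnames here are made up for illustration; the PR's removeBlacklistedHostnames works on the Copilot response rather than a plain slice.

```go
package main

import (
	"fmt"
	"strings"
)

// hasAnySuffix reports whether s ends with one of the given suffixes.
func hasAnySuffix(s string, suffixes []string) bool {
	for _, suf := range suffixes {
		if strings.HasSuffix(s, suf) {
			return true
		}
	}
	return false
}

// removeBlacklisted returns the hostnames that do not end in a
// blacklisted suffix, preserving their input order.
func removeBlacklisted(hostnames, suffixes []string) []string {
	filtered := make([]string, 0, len(hostnames))
	for _, h := range hostnames {
		if !hasAnySuffix(h, suffixes) {
			filtered = append(filtered, h)
		}
	}
	return filtered
}

func main() {
	hosts := []string{"app.example.com", "db.internal", "api.example.com"}
	fmt.Println(removeBlacklisted(hosts, []string{".internal"}))
}
```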
resp, err := c.client.Routes(context.Background(), new(copilotapi.RoutesRequest))
if err != nil {
	log.Warnf("Error connecting to copilot: %v", err)
	return nil, err
Here and elsewhere, just return the err please. Logging the error and returning it will nearly always result in duplicate errors being written to the log. If you want some indication at this call site, make it a log.Debug statement instead so it's only printed while debug flags are on, and not during normal server operation.
Alternatively, just annotate the error with your log message as you pass it up to the caller:
return nil, fmt.Errorf("Error connecting to copilot: %v", err)
var gatewayConfig = fmt.Sprintf(`
- apiVersion: config.istio.io/v1alpha2
+ apiVersion: config.istio.io/v1alpha3
this is wrong I think.. check with @GregHanson / @frankbu .. It's networking.istio.io IIRC - recent change
Yup, it's networking.istio.io/v1alpha3, thanks for catching that.
},
},
},
}
@zachgersh / @rosenhouse is this meant to be an initial POC implementation? If so, please clearly add a big TODO to optimize this entire function. This function effectively ends up running every second, and generates 250K virtual services (one per route), and throws away all precomputed data. You could avoid this by having indices, and generating only those virtual services that changed recently. This is what we are doing in the Kube section.
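One way to picture the "only what changed" idea is to diff consecutive snapshots by hostname and rebuild virtual services just for the additions. The sketch below is an assumption-laden illustration: diffHostnames is a hypothetical helper, not something in Pilot or in the Kube code path mentioned above.

```go
package main

import (
	"fmt"
	"sort"
)

// diffHostnames reports which hostnames appeared and which disappeared
// between two snapshots, so only those need to be built or dropped.
func diffHostnames(prev, next map[string]struct{}) (added, removed []string) {
	for h := range next {
		if _, ok := prev[h]; !ok {
			added = append(added, h)
		}
	}
	for h := range prev {
		if _, ok := next[h]; !ok {
			removed = append(removed, h)
		}
	}
	// Sort so the result does not depend on map iteration order.
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	prev := map[string]struct{}{"a.example.com": {}, "b.example.com": {}}
	next := map[string]struct{}{"b.example.com": {}, "c.example.com": {}}
	added, removed := diffHostnames(prev, next)
	fmt.Println(added, removed)
}
```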
My take on this. This is fine for a PoC PR to get things going, but this definitely requires a ton of optimization. As such, it won't scale. I am also skeptical of the (mis)use of the routing API.
What you essentially need is a gateway that can dynamically pull entries from the service registry (if configured to do so). The gateway exposes all service names in the service registry to the outside world, with a default route (matching prefix /), if and only if there is no virtualservice defined for that service.
You could abstract this nicely by doing something like
hosts := Gateway.GetHosts(gatewayName)
// if configured, the gateway spec returns all hosts in the service registry instead of hosts in the gateway spec.
.. generate Envoy virtualhost for all the hosts
...apply virtualservice specs in envoy route..
The implementation of GetHosts could be platform specific. In kubernetes, it could simply retrieve only the hosts from the Gateway spec. In CF, you can call copilot.getRoutes(gatewayName), which can choose to either return ALL routes, or just return those meant to be exposed via that gateway.
If you follow this path, you could potentially easily apply some optimizations (common to both CF/K8S) - such as add/remove a single virtual host from the gateway, update a single route rule (virtualservice), etc., without churning the entire envoy configuration.
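To make the proposal above a bit more concrete, here is a hedged sketch of what a platform-specific GetHosts could look like in Go. Every name in it (HostSource, specHosts, registryHosts, buildVirtualHosts, virtualHost) is hypothetical and none of it exists in Istio; it only illustrates the shape of the abstraction.

```go
package main

import "fmt"

// HostSource is a hypothetical per-platform abstraction over
// "which hosts does this gateway expose".
type HostSource interface {
	GetHosts(gatewayName string) []string
}

// specHosts returns only the hosts listed per gateway (Kubernetes-style).
type specHosts map[string][]string

func (s specHosts) GetHosts(gw string) []string { return s[gw] }

// registryHosts returns every host in the registry, whatever the
// gateway (the Cloud Foundry behavior described above).
type registryHosts []string

func (r registryHosts) GetHosts(string) []string { return r }

// virtualHost pairs a host with its default route prefix.
type virtualHost struct{ Host, Prefix string }

// buildVirtualHosts gives every host from the source a default "/" route.
func buildVirtualHosts(src HostSource, gw string) []virtualHost {
	hosts := src.GetHosts(gw)
	out := make([]virtualHost, 0, len(hosts))
	for _, h := range hosts {
		out = append(out, virtualHost{Host: h, Prefix: "/"})
	}
	return out
}

func main() {
	k8s := specHosts{"ingress": {"shop.example.com"}}
	cf := registryHosts{"shop.example.com", "blog.example.com"}
	fmt.Println(len(buildVirtualHosts(k8s, "ingress")), len(buildVirtualHosts(cf, "ingress")))
}
```

The generator only depends on the interface, so per-host add/remove optimizations could be shared between platforms.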
@rshriram yes, thanks for pointing that out. short answer: we will follow-up with optimizations. we are still in this POC phase for everything CF+Istio, unfortunately.
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the |
@utako please comment here and note you've signed the CLA 👍
I've signed the CLA.
LGTM other than the array initializations. I know in this code specifically we're not really at risk of having too many allocations, but as a general practice across the codebase we should allocate precisely sized arrays when we can.
pilot/pkg/bootstrap/server.go
Outdated
})
configController := memory.NewController(store)

err := s.makeFileMonitor(args, configController)
nit: if a method only returns an error merge onto a single line with if: if err := s.MakeFileMonitor(); err != nil {} here and elsewhere.
}
}

// ReadConfigFiles returns a complete set of VirtualServices for all Cloud Foundry routes known to Copilot.
// It may be used for the getSnapshotFunc when constructing a NewMonitor
func (c *CopilotSnapshot) ReadConfigFiles() ([]*model.Config, error) {
- resp, err := c.client.Routes(context.Background(), new(copilotapi.RoutesRequest))
+ ctx, cancel := context.WithTimeout(context.Background(), c.timeout)
+ defer cancel()
Not blocking: you can actually ignore cancel here, it'll never be used. It's not required by the context API that you call cancel, and we immediately block until either we get a return (we don't need to cancel) or we time out (we don't need to cancel).
virtualServices = append(virtualServices, config)
var cachedHostnames sort.StringSlice
This slice should be make(sort.StringSlice, 0, len(c.virtualServices)) to avoid having to resize the backing array.
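As a small illustration of the presizing nit, here is a sketch that collects map keys into a sort.StringSlice allocated with the final capacity up front. sortedHostnames and the cache's value type are assumptions made for the example, not the PR's actual types.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedHostnames copies the cache keys into a slice whose capacity is
// known in advance, so append never has to grow the backing array.
func sortedHostnames(cache map[string]struct{}) sort.StringSlice {
	names := make(sort.StringSlice, 0, len(cache))
	for h := range cache {
		names = append(names, h)
	}
	names.Sort()
	return names
}

func main() {
	cache := map[string]struct{}{"b.example.com": {}, "a.example.com": {}}
	names := sortedHostnames(cache)
	fmt.Println(names, cap(names) == len(cache))
}
```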
func (c *CopilotSnapshot) collectVirtualServices(cachedHostnames sort.StringSlice, resp *copilotapi.RoutesResponse) []*model.Config {
	var virtualServices []*model.Config
	for _, hostname := range cachedHostnames {
		if _, ok := resp.Backends[hostname]; !ok {
Is the blacklist constant for the life of the cache? If it can change, I think there is an opportunity to send back results with hostnames in the blacklist here. Might be safer to use filteredCopilotHostnames here instead (and change it to a map so membership checks are fast).
Thinking more, IMO the fact that above we have to do virtualServices := make([]*model.Config, 0, len(resp.Backends)) even though we know len(virtualServices) is at most len(filteredCopilotHostnames) is a pretty strong signal we ought to be using filteredCopilotHostnames here. (We know that because we'll never construct and cache something in the blacklist, therefore if the blacklist is constant for the life of the cache the virtual services in the cache must be resp.Backends with the blacklisted elements removed, which is filteredCopilotHostnames by definition.)
That's a good point, and we did consider it. We ended up choosing not to use a map so we could preserve the sort order for the hostnames.
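If sort order matters and a map is off the table, one possible middle ground (an assumption on our part, not what this PR does) is to keep the slice sorted and use binary search for membership, which keeps lookups at O(log n). containsSorted is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sort"
)

// containsSorted reports whether x is present in a slice that is
// already sorted in ascending order.
func containsSorted(sorted []string, x string) bool {
	i := sort.SearchStrings(sorted, x)
	return i < len(sorted) && sorted[i] == x
}

func main() {
	hostnames := []string{"b.example.com", "a.example.com", "c.example.com"}
	sort.Strings(hostnames)
	fmt.Println(containsSorted(hostnames, "b.example.com"))
	fmt.Println(containsSorted(hostnames, "z.example.com"))
}
```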
}

func (c *CopilotSnapshot) collectVirtualServices(cachedHostnames sort.StringSlice, resp *copilotapi.RoutesResponse) []*model.Config {
	var virtualServices []*model.Config
virtualServices := make([]*model.Config, 0, len(resp.Backends))
}

func (c *CopilotSnapshot) removeBlacklistedHostnames(resp *copilotapi.RoutesResponse) []string {
	var filteredHostnames []string
Initialize at the correct size please (len(resp.Backends) to avoid having to resize in worst case).
	return nil, err
}

var gatewayNames []string
initialize array at the correct size
/test istio-unit-tests
for _, hostname := range filteredCopilotHostnames {
	if _, ok := c.virtualServices[hostname]; ok {
		continue
where do you remove routes? This is only skipping computation for existing routes.
renamed collectVirtualServices to pruneAndCollectVirtualServices to make this clearer.
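For readers wondering what "prune and collect" amounts to, here is a rough sketch under simplifying assumptions: the cache maps hostname to a plain string instead of *model.Config, and the live set stands in for resp.Backends. pruneAndCollect is an illustrative name, not the PR's method.

```go
package main

import (
	"fmt"
	"sort"
)

// pruneAndCollect deletes cached entries whose hostname is no longer
// live, then returns the surviving values ordered by hostname.
func pruneAndCollect(cache map[string]string, live map[string]bool) []string {
	for h := range cache {
		if !live[h] {
			delete(cache, h) // deleting while ranging over a map is safe in Go
		}
	}
	names := make([]string, 0, len(cache))
	for h := range cache {
		names = append(names, h)
	}
	sort.Strings(names)
	out := make([]string, 0, len(names))
	for _, h := range names {
		out = append(out, cache[h])
	}
	return out
}

func main() {
	cache := map[string]string{
		"a.example.com": "route-for-a.example.com",
		"b.example.com": "route-for-b.example.com",
	}
	live := map[string]bool{"b.example.com": true}
	fmt.Println(pruneAndCollect(cache, live), len(cache))
}
```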
LGTM, subject to Zack's nits
/retest
Prow is clear, please manually retest circle (I have no permissions)
- cloudfoundry can now dynamically inject route rules based on data from copilot
- during a snapshot, the config monitor will keep its config store intact if a connection error occurs while attempting to update itself.
Signed-off-by: Zach LaVallee <zlavallee@pivotal.io>
- copilot e2e test: fix apiversion
- don't log error in monitor package
- copilot_snapshot: add deadline for querying routes
- we are now attempting to cache the routes in the copilot snapshot. Needlessly creating route objects that may already be there was bad.
- server bootstrap has a pattern emerging now that the two monitors and one controller have been extracted.
- change v3routing to networking everywhere
Signed-off-by: Zachary Gershman <zgershman@pivotal.io>
Pulling the following changes from github.com/istio/proxy: 7a0fca9 Update Envoy SHA to latest with LcTrie optimizations (release-1.0). (istio#1919) d93f0fe Fix macOS build on CircleCI (release-1.0). (istio#1921) Pulling the following changes from github.com/envoyproxy/envoy: 73bd3d95c http_filter: add addEncodedTrailers and addDecodedTrailers (istio#3980) c3652aad5 rbac/fuzz: fix build (istio#4150) 07bc27c05 fix flaky RBAC integration test. (istio#4147) b150d61a9 header_map: copy constructor for HeaderMapImpl. (istio#4129) f345c8b23 test: moving websocket tests to using HTTP codec. (istio#4143) da500d20f upstream: init host hc value based on hc value from other priorities (istio#3959) da6194b94 test: add tests for corner-cases around sending requests before run() starts or after run() ends. (istio#4114) 3527f7799 perf: reduce the memory usage of LC Trie construction (istio#4117) b538e46d8 test: moving redundant code in websocket_integration_test to utilities (istio#4127) a3c55bf7b test: make YamlLoadFromStringFail less picky about error msg. (istio#4141) c283439b6 rbac: add rbac network filter. (istio#4083) 5a7152d21 fuzz: route lookup and header finalization fuzzer. 
(istio#4116) 589467360 Set content-type and content-length (istio#4113) 714ae130a fault: use FractionalPercent for percent (istio#3978) fde378705 test: Fix inverted exact match logic in IntegrationTcpClient::waitForData() (istio#4134) 794a00126 Added cluster_name to load assignment config for static cluster (istio#4123) 19f51e5e1 ssl: refactor ContextConfig to use TlsCertificateConfig (istio#4115) 0a4bffc5a syscall: refactor OsSysCalls for deeper errno latching (istio#4111) ec0d98e5e thrift_proxy: fix oneway bugs (istio#4025) 1381673ad Do not crash when converting YAML to JSON fails (istio#4110) 2662bf1f2 config: allow unknown fields flag (take 2) (istio#4096) 1ab839c1f Use a jittered backoff strategy for handling HdsDelegate stream/connection failures (istio#4108) 7309c14cf bazel: use GCS remote cache (istio#4050) 5fe4e14f0 Add thread local cache of overload action states (istio#4090) 3bb7fbc5f Added TCP healthcheck capabilities to the HdsDelegate (istio#4079) 98037ed37 secret: add secret provider interface and use it for TlsCertificates (istio#4086) 3e15c9490 upstream: allow custom extension protocol options (istio#4098) 9b33c49d1 Rename message types in hds.proto to improve readability (istio#4109) bb70b42bb fuzz: router header formatter/parser fuzz test. (istio#4105) fe57f6b33 fuzz: http parsing utility fuzzer. (istio#4107) 73dfedc95 ci: link ninja-buid to ninja for centos (istio#4106) 1cd509ef1 docs: add curl to Ubuntu deps (istio#4104) 45b900829 Handling updates from the management server on HDS (istio#4077) 510994c6a Don't use SIGTERM for admin /quitquitquit, just shut down directly. (istio#4099) 29b60291e fuzz: access log formatter fuzz test. 
(istio#4102) 765cac42f Destroy pending updates when updating a cluster (istio#4084) aafdf6037 authz_client_fix: fixed ext_authz http client when request contains content-length greater than 0 (istio#3888) 22ae0ab93 HttpConnectionManager and upstream counters for total completed requests (istio#3995) 04616d676 tcp_proxy: convert TCP proxy to use TCP connection pool (istio#4067) e759eab17 buffer: add prepend functions to Buffer::Instance (istio#4064) 14baa40ea fuzz: h1_capture_fuzz with direct response (istio#3787) d47365a9a Per endpoint load report (istio#4044) 70e9878ed Fix bug in `HostSetImpl::chooseLocality()` (istio#4061) 797e82484 deps: update gRPC to 1.14.0 (istio#4047) 628730666 Remove std::string cast in upstream impl lib and tests. (istio#4080) 33ab6ddac bot: exempt label "no stalebot" for PRs (istio#4081) 699c008d6 Absl string view to std string in dynamic metadata (istio#4078) e9dc1090e collect metrics for RBAC shadow policy (istio#4062) e9d81e179 Combine query-params into admin API's path, with API access from MainCommon sinking to main thread (istio#4059) fccaeade9 Revert "Revert "Basic Implementation of HDS (istio#3973)" (istio#4063)" (istio#4068) e96d4a6c4 http: fix upstream_rq stat increment (istio#4055) 14140ad83 Add overload manager to bootstrap config (istio#4038) b14dee5ee thrift_proxy: introduce MessageMetadata to track message headers and other metadata (istio#3991) 9ee2b2759 authz: correct stat names (istio#4074) c68063c05 Stats interface atomization (istio#4071) 82e3541b0 docs: fix incorrect doc about cluster warming in CDS (istio#4040) 3868326bd Support ListValue for metadata matcher (istio#3964) 4e5258953 Revert "Basic Implementation of HDS (istio#3973)" (istio#4063) f3b0f8580 Basic Implementation of HDS (istio#3973) 7b03f2ef5 tracing: Fixes issue with small LightStep reports. 
(istio#3989) fd517b356 request_info: initial implementation of dynamic metadata object (istio#3918) d5bbd1e0c Ability to specify a test or a test group when building with docker release (istio#4030) a1c646102 Remove stats_impl.h (istio#4057) 7bf713a93 fuzz: H2 codec fuzzer. (istio#4017) a614808b9 upstream: fix typo (s/lb_type/lb_policy/g) in previous commit. (istio#4051) 346059548 upstream: require opt-in for the x-envoy-original-dst-host header. (istio#4046) f2c9652a9 owners: add Dhi is maintainer (istio#4042) 6a1868dff Revert "tcp_proxy: convert TCP proxy to use TCP connection pool (istio#3938)" (istio#4043) cc3657797 docs: document request_timeout in version_history (istio#4041) a3364380a rest-api: make request timeout configurable (istio#4006) fa628c44e logging: optional details for ASSERT (istio#3934) 55606ec3f bump abseil-cpp commit (istio#4034) 4c3219c0c owners: promote Stephan and Greg to senior maintainer! (istio#4039) ddd661ac0 hot restarter: Log errno for 'panic: cannot open shared memory' error (istio#4032) cb3356fc5 Sds: Ssl socket factory owns ContextConfig (istio#4028) 9bc047226 Refactor TransportSocketFactoryContext and Cluster interfaces. (istio#4026) f8f21c26d Rename duplicated ads integration test case name (istio#4035) 02281809b fix duplicate listeners in lds response (istio#4029) 61421bddf upstream: fix duplicate clusters (istio#4012) 1f1166167 split up stats_impl_test to match the *impl.h and and *impl.cc files. (istio#4024) 5ec8b37da Remove "DO NOT SUBMIT" comment. (istio#4020) 882c49832 Add more information to errors about rejected cipher suite configuration. 
(istio#4019) ffc8258e5 Rename common/stats/stats_impl.* to common/stats/source_impl.* and fix refs (istio#4021) 891135e38 Fix overload manager unit test build (istio#4022) c2f204cc7 Add stats for overload manager (istio#4001) aec92237a remove unused variables (istio#4013) e999cfacc Re-order functions in stats_impl to group classes together (istio#4004) d5805b171 typos (istio#4009) aeb3f2875 Fix perf_annotation_test compilation under gcc 8.1.1 (istio#4000) da3c1eaf8 test/mock: Add 3 new gmock matchers (istio#3972) 6a8b84384 test: Add timeouts to methods that could wait forever in test/integration/fake_upstream.h. (istio#3936) d0f10faff HeapStatData with a distinct allocation mechanism for RawStatData (istio#3710) 2012c3e4c rds: make RouteConfigProvider unique_ptr (istio#3967) 62441f9fe Add option for merging cluster updates (istio#3941) eb5ea98ff fuzz: fixes oss-fuzz: 9599, 9600 (istio#3979) b27068bd0 listener: add socket api in os sys calls for additional tests (istio#3968) 83b9e2da8 Add overload manager for Envoy (istio#3954) f0ca75415 Fix prometheus typo. (istio#3999) 028387a3b tcp_proxy: convert TCP proxy to use TCP connection pool (istio#3938) f882e74dc syscall: use Api::SysCallResult in buffer impl (istio#3976) 7d61b0017 fuzz: fixes oss-fuzz: 9621 (istio#3988) dc03a9a41 docs: fix grammar errors (istio#3983) ed131cfa9 docs: minor typo and grammar fixups (istio#3984) 08fadcc41 http: fix segfault when idle timer fires before request headers received. (istio#3970) 8b9fd9aa7 Refactor setSocketOption for better errno latching (istio#3915) 6b65dbe3a Change drop_percentage to FractionalPercent (istio#3974) f28dc53f4 Remove deprecated handling of mutating admin requests from GET. (istio#3975) 324e628b7 syscall: refactor address APIs for deeper errno latching (istio#3897) Fixes istio#7710, fixes istio#7817, and hopefully fixes istio#7759. Signed-off-by: Piotr Sikora <piotrsikora@google.com>
* Update Envoy SHA to latest (release-1.0). Pulling the following changes from github.com/istio/proxy: 7a0fca9 Update Envoy SHA to latest with LcTrie optimizations (release-1.0). (#1919) d93f0fe Fix macOS build on CircleCI (release-1.0). (#1921) Pulling the following changes from github.com/envoyproxy/envoy: 73bd3d95c http_filter: add addEncodedTrailers and addDecodedTrailers (#3980) c3652aad5 rbac/fuzz: fix build (#4150) 07bc27c05 fix flaky RBAC integration test. (#4147) b150d61a9 header_map: copy constructor for HeaderMapImpl. (#4129) f345c8b23 test: moving websocket tests to using HTTP codec. (#4143) da500d20f upstream: init host hc value based on hc value from other priorities (#3959) da6194b94 test: add tests for corner-cases around sending requests before run() starts or after run() ends. (#4114) 3527f7799 perf: reduce the memory usage of LC Trie construction (#4117) b538e46d8 test: moving redundant code in websocket_integration_test to utilities (#4127) a3c55bf7b test: make YamlLoadFromStringFail less picky about error msg. (#4141) c283439b6 rbac: add rbac network filter. (#4083) 5a7152d21 fuzz: route lookup and header finalization fuzzer. 
(#4116) 589467360 Set content-type and content-length (#4113) 714ae130a fault: use FractionalPercent for percent (#3978) fde378705 test: Fix inverted exact match logic in IntegrationTcpClient::waitForData() (#4134) 794a00126 Added cluster_name to load assignment config for static cluster (#4123) 19f51e5e1 ssl: refactor ContextConfig to use TlsCertificateConfig (#4115) 0a4bffc5a syscall: refactor OsSysCalls for deeper errno latching (#4111) ec0d98e5e thrift_proxy: fix oneway bugs (#4025) 1381673ad Do not crash when converting YAML to JSON fails (#4110) 2662bf1f2 config: allow unknown fields flag (take 2) (#4096) 1ab839c1f Use a jittered backoff strategy for handling HdsDelegate stream/connection failures (#4108) 7309c14cf bazel: use GCS remote cache (#4050) 5fe4e14f0 Add thread local cache of overload action states (#4090) 3bb7fbc5f Added TCP healthcheck capabilities to the HdsDelegate (#4079) 98037ed37 secret: add secret provider interface and use it for TlsCertificates (#4086) 3e15c9490 upstream: allow custom extension protocol options (#4098) 9b33c49d1 Rename message types in hds.proto to improve readability (#4109) bb70b42bb fuzz: router header formatter/parser fuzz test. (#4105) fe57f6b33 fuzz: http parsing utility fuzzer. (#4107) 73dfedc95 ci: link ninja-buid to ninja for centos (#4106) 1cd509ef1 docs: add curl to Ubuntu deps (#4104) 45b900829 Handling updates from the management server on HDS (#4077) 510994c6a Don't use SIGTERM for admin /quitquitquit, just shut down directly. (#4099) 29b60291e fuzz: access log formatter fuzz test. 
(#4102) 765cac42f Destroy pending updates when updating a cluster (#4084) aafdf6037 authz_client_fix: fixed ext_authz http client when request contains content-length greater than 0 (#3888) 22ae0ab93 HttpConnectionManager and upstream counters for total completed requests (#3995) 04616d676 tcp_proxy: convert TCP proxy to use TCP connection pool (#4067) e759eab17 buffer: add prepend functions to Buffer::Instance (#4064) 14baa40ea fuzz: h1_capture_fuzz with direct response (#3787) d47365a9a Per endpoint load report (#4044) 70e9878ed Fix bug in `HostSetImpl::chooseLocality()` (#4061) 797e82484 deps: update gRPC to 1.14.0 (#4047) 628730666 Remove std::string cast in upstream impl lib and tests. (#4080) 33ab6ddac bot: exempt label "no stalebot" for PRs (#4081) 699c008d6 Absl string view to std string in dynamic metadata (#4078) e9dc1090e collect metrics for RBAC shadow policy (#4062) e9d81e179 Combine query-params into admin API's path, with API access from MainCommon sinking to main thread (#4059) fccaeade9 Revert "Revert "Basic Implementation of HDS (#3973)" (#4063)" (#4068) e96d4a6c4 http: fix upstream_rq stat increment (#4055) 14140ad83 Add overload manager to bootstrap config (#4038) b14dee5ee thrift_proxy: introduce MessageMetadata to track message headers and other metadata (#3991) 9ee2b2759 authz: correct stat names (#4074) c68063c05 Stats interface atomization (#4071) 82e3541b0 docs: fix incorrect doc about cluster warming in CDS (#4040) 3868326bd Support ListValue for metadata matcher (#3964) 4e5258953 Revert "Basic Implementation of HDS (#3973)" (#4063) f3b0f8580 Basic Implementation of HDS (#3973) 7b03f2ef5 tracing: Fixes issue with small LightStep reports. (#3989) fd517b356 request_info: initial implementation of dynamic metadata object (#3918) d5bbd1e0c Ability to specify a test or a test group when building with docker release (#4030) a1c646102 Remove stats_impl.h (#4057) 7bf713a93 fuzz: H2 codec fuzzer. 
(#4017) a614808b9 upstream: fix typo (s/lb_type/lb_policy/g) in previous commit. (#4051) 346059548 upstream: require opt-in for the x-envoy-original-dst-host header. (#4046) f2c9652a9 owners: add Dhi is maintainer (#4042) 6a1868dff Revert "tcp_proxy: convert TCP proxy to use TCP connection pool (#3938)" (#4043) cc3657797 docs: document request_timeout in version_history (#4041) a3364380a rest-api: make request timeout configurable (#4006) fa628c44e logging: optional details for ASSERT (#3934) 55606ec3f bump abseil-cpp commit (#4034) 4c3219c0c owners: promote Stephan and Greg to senior maintainer! (#4039) ddd661ac0 hot restarter: Log errno for 'panic: cannot open shared memory' error (#4032) cb3356fc5 Sds: Ssl socket factory owns ContextConfig (#4028) 9bc047226 Refactor TransportSocketFactoryContext and Cluster interfaces. (#4026) f8f21c26d Rename duplicated ads integration test case name (#4035) 02281809b fix duplicate listeners in lds response (#4029) 61421bddf upstream: fix duplicate clusters (#4012) 1f1166167 split up stats_impl_test to match the *impl.h and and *impl.cc files. (#4024) 5ec8b37da Remove "DO NOT SUBMIT" comment. (#4020) 882c49832 Add more information to errors about rejected cipher suite configuration. (#4019) ffc8258e5 Rename common/stats/stats_impl.* to common/stats/source_impl.* and fix refs (#4021) 891135e38 Fix overload manager unit test build (#4022) c2f204cc7 Add stats for overload manager (#4001) aec92237a remove unused variables (#4013) e999cfacc Re-order functions in stats_impl to group classes together (#4004) d5805b171 typos (#4009) aeb3f2875 Fix perf_annotation_test compilation under gcc 8.1.1 (#4000) da3c1eaf8 test/mock: Add 3 new gmock matchers (#3972) 6a8b84384 test: Add timeouts to methods that could wait forever in test/integration/fake_upstream.h. 
(#3936) d0f10faff HeapStatData with a distinct allocation mechanism for RawStatData (#3710) 2012c3e4c rds: make RouteConfigProvider unique_ptr (#3967) 62441f9fe Add option for merging cluster updates (#3941) eb5ea98ff fuzz: fixes oss-fuzz: 9599, 9600 (#3979) b27068bd0 listener: add socket api in os sys calls for additional tests (#3968) 83b9e2da8 Add overload manager for Envoy (#3954) f0ca75415 Fix prometheus typo. (#3999) 028387a3b tcp_proxy: convert TCP proxy to use TCP connection pool (#3938) f882e74dc syscall: use Api::SysCallResult in buffer impl (#3976) 7d61b0017 fuzz: fixes oss-fuzz: 9621 (#3988) dc03a9a41 docs: fix grammar errors (#3983) ed131cfa9 docs: minor typo and grammar fixups (#3984) 08fadcc41 http: fix segfault when idle timer fires before request headers received. (#3970) 8b9fd9aa7 Refactor setSocketOption for better errno latching (#3915) 6b65dbe3a Change drop_percentage to FractionalPercent (#3974) f28dc53f4 Remove deprecated handling of mutating admin requests from GET. (#3975) 324e628b7 syscall: refactor address APIs for deeper errno latching (#3897) Fixes #7710, fixes #7817, and hopefully fixes #7759. Signed-off-by: Piotr Sikora <piotrsikora@google.com> * reivew: fix for duplicate clusters (backported from master). Signed-off-by: Piotr Sikora <piotrsikora@google.com> * review: disable broken tests (backported from master). Signed-off-by: Piotr Sikora <piotrsikora@google.com>