-
Notifications
You must be signed in to change notification settings - Fork 8.1k
Implement Citadel prometheus monitoring feature. #5015
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5015 +/- ##
======================================
+ Coverage 74% 74% +1%
======================================
Files 310 312 +2
Lines 25816 25792 -24
======================================
- Hits 18888 18875 -13
+ Misses 6167 6154 -13
- Partials 761 763 +2
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not familiar with promethus, take it as a similar thing as google internal monitoring sys.
security/cmd/istio_ca/main.go
Outdated
|
||
flags.BoolVar(&opts.signCACerts, "sign-ca-certs", false, "Whether the CA signs certificates for other CAs") | ||
|
||
// Monitoring configuration | ||
flags.IntVar(&opts.monitoringPort, "monitoring-port", 9093, "The port number for monitoring Citadel. "+ | ||
"If unspecified, Istio CA will disable monitoring.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be consistently using "Citadel"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Good catch.
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
package monitoring |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put this in pkg/citadel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I will do the migration in another PR.
) | ||
|
||
// StartMonitor starts the Citadel monitor. | ||
func StartMonitor(port int, enableProfiling bool, errCh chan<- error) (*Monitor, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be consistent, consider to define:
- NewMonitor() *Monitor, error
- Monitor.Start()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Help: "The number of certificates issuances that have succeeded.", | ||
}, caLabels) | ||
|
||
cSRErrorCount = prometheus.NewCounterVec(prometheus.CounterOpts{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use csrErrorCount, the abbreviation should be all little case or capital case.
https://github.com/golang/go/wiki/CodeReviewComments#initialisms
Same for the rest of variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
security/pkg/pki/ca/ca.go
Outdated
@@ -82,7 +83,8 @@ type IstioCA struct { | |||
|
|||
keyCertBundle util.KeyCertBundle | |||
|
|||
livenessProbe *probe.Probe | |||
monitoringCounters monitoring.CAOperationCounters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make pkg/pki
not depend on monitoring libraries. Instead can we:
- Define the customized error code for signing related logic, say TTL_FAIL, INVALID_PEM, etc.
- ca.go returned the customized error to server.go
- The metrics defined there can do signingError[errorCode].Inc()
In this way, we also don't have to define one monitoring metric per error type, which makes code simpler, also in monitoring post-processing side, we can use more expressive to query the motoring metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. I will update the code.
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
package monitoring |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think as a top level package, pkg/monitoring
should only provide generic monitoring facilities. We can define the metrics only where it's used. So that, secrets.go is never able to NewCAOperationCounters().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good call. I will put them into each directory.
@incfly I think they are similar. Citadel only needs to use the client lib to expose the HTTP endpoint for Prometheus to call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this. Just couple of minor suggestions.
} | ||
|
||
// HandleCSR handles an incoming certificate signing request (CSR). It does | ||
// proper validation (e.g. authentication) and upon validated, signs the CSR | ||
// and returns the resulting certificate. If not approved, reason for refusal | ||
// to sign is returned as part of the response object. | ||
func (s *Server) HandleCSR(ctx context.Context, request *pb.CsrRequest) (*pb.CsrResponse, error) { | ||
s.monitoring.CSR.Inc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is promethus counter thread safe? otherwise, server needs to have mutex to avoid racing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I looked it up and found:
"All exported functions and methods are safe to be used concurrently unless specified otherwise."
https://godoc.org/github.com/prometheus/client_golang/prometheus
That's good news :)
@@ -179,6 +179,7 @@ func TestSign(t *testing.T) { | |||
port: 8080, | |||
authorizer: c.authorizer, | |||
authenticators: c.authenticators, | |||
monitoring: newMonitoringMetrics(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add some test for monitoring if monitoring dependency can be arranged? Could do it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will add the unit tests later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably add a dashboard for Citadel to the existing dashboards and add an e2e test to handle while we are at it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the dashboard, what do we need to change except the yaml file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we would need to build a new dashboard like the mixer and pilot ones in: https://github.com/istio/istio/tree/master/addons/grafana/dashboards
And then add a test that validates the queries in that dashboard work.
I can help with that after this stuff goes in.
caller := s.authenticate(ctx) | ||
if caller == nil { | ||
log.Warn("request authentication failure") | ||
s.monitoring.AuthenticationError.Inc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're good to move forward for now. This is for potential future improvements: would it better to have one metric and using its different labels for each type of error? These metrics users can just look at these errors together or break down by error (label value) if they want. Otherwise they have to find all metrics' name and sum them up.
(Above is assuming promethus behaves simliarly to google's internal monitoring sys).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. That's my plan. I've added a TODO in ca file to have Sign() return an error code. I will implement it later.
/approve |
/test all [submit-queue is verifying that this PR is safe to merge] |
} | ||
go monitor.Start(monitorErrCh) | ||
log.Info("Citadel monitor has started.") | ||
defer monitor.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is runCA()
expected to not return until server shutdown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Any concerns?
serviceAccountCreationCount = prometheus.NewCounterVec(prometheus.CounterOpts{ | ||
Namespace: "citadel", | ||
Subsystem: "secret_controller", | ||
Name: "certs_created_due_to_service_account_creation", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming suggestion: service_account_created_cert_count
or even svc_acct_create_certs_count
This will end up being citadel_secret_controller_certs_created_due_to_service_acount_creation
currently. That's a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
serviceAccountDeletionCount = prometheus.NewCounterVec(prometheus.CounterOpts{ | ||
Namespace: "citadel", | ||
Subsystem: "secret_controller", | ||
Name: "certs_deleted_due_to_service_account_deletion", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming suggestion: service_account_deleted_cert_count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
secretDeletionCount = prometheus.NewCounterVec(prometheus.CounterOpts{ | ||
Namespace: "citadel", | ||
Subsystem: "secret_controller", | ||
Name: "certs_created_due_to_secret_deletion", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming suggestion: secret_deleted_cert_count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// newMonitoringMetrics creates a new monitoringMetrics. | ||
func newMonitoringMetrics() monitoringMetrics { | ||
labels := prometheus.Labels{ | ||
configID: "1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this label adding to the metrics ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not useful for now. I thought it could be meaningful in the future, but I can remove it.
) | ||
|
||
const ( | ||
configID = "SecretController" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason to use mixed case here?
and is this needed? you already use secret_controller
as the subsystem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
security/pkg/server/ca/monitoring.go
Outdated
) | ||
|
||
const ( | ||
configID = "CitadelServer" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
@@ -179,6 +179,7 @@ func TestSign(t *testing.T) { | |||
port: 8080, | |||
authorizer: c.authorizer, | |||
authenticators: c.authenticators, | |||
monitoring: newMonitoringMetrics(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably add a dashboard for Citadel to the existing dashboards and add an e2e test to handle while we are at it.
/do-not-merge |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: incfly The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test all [submit-queue is verifying that this PR is safe to merge] |
@myidpt: The following test failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Automatic merge from submit-queue. |
* integrate circle ci with testgrid (#4649) Automatic merge from submit-queue. Integrate circle CI with Testgrid The integration will start with mixer and simple e2e tests. Will expand to other jobs in subsequent PRs. Results are now available on https://k8s-testgrid.appspot.com/istio#circleci-e2e-mixer https://k8s-testgrid.appspot.com/istio#circleci-e2e-simple Those are test results from this PR exclusively. After this is merged, the results will be more interesting. The `ci_to_gubernator` binary is built from istio/test-infra#769. Feel free to also take a look. * Fixing postsubmit (#5104) * Implement Citadel prometheus monitoring feature. (#5015) Automatic merge from submit-queue. Implement Citadel prometheus monitoring feature. This monitoring feature exposes a service (port 9093) about Citadel status to prometheus. Yaml files are changed in #5072. Tests will be added shortly. * let stackdriver adapter figure out project metadata by itself (#5057) Automatic merge from submit-queue. let stackdriver adapter figure out project metadata by itself This will make stackdriver adapter config generalized and usable for different projects. * Fix labelNames getting filled in stackdriver logging. (#5058) Automatic merge from submit-queue. Fix labelNames getting filled in stackdriver logging. Added LabelNames in testdata/stackdriver.yaml file Also, fixed that if label was not string, it was getting filled as null. * Don't attempt to qualify the wildcard domain. (#5106) * Don't attempt to qualify the wildcard domain. * fmt * pilot-agent: add a flag to disable internal telemetry (#5101) Automatic merge from submit-queue. pilot-agent: add a flag to disable internal telemetry Enabled by default, but allows to turn off telemetry. Signed-off-by: Kuat Yessenov <kuat@google.com> * Add tcp proxy support for multi-cluster environment (#4694) Automatic merge from submit-queue. tcp proxy support in a multi-cluster environment. Services need to maintain per cluster service IPs. For sidecars running in a cluster, cluster-specific service IPs should be used to build destination_ip_list for envoy listeners. * Add missing namespace to citadel template (#5108) Automatic merge from submit-queue. Add missing namespace to citadel template * prune old version resources that no longer exist (#5107) Automatic merge from submit-queue. prune old version resources that no longer exist * [vendor-change] CloudWatch Mixer adapter (#4617) Automatic merge from submit-queue. [vendor-change] CloudWatch Mixer adapter Adding an adapter to send metrics to cloudwatch * Enable Ingress/Egress gateways in Helm for bookinfo demos (#5120) Automatic merge from submit-queue. Enable Ingress/Egress gateways in Helm for bookinfo demos * Consume labeled multicluster secrets on startup (#5117) Automatic merge from submit-queue. Consume labeled multicluster secrets on startup This patch when run against istio.yaml or istio-auth.yaml runs in the new config mode using only labels rather than configmaps. The configmap functionality can be removed in 0.9. * Add a linter check to make sure types.go are generated. (#5110) Automatic merge from submit-queue. Add a linter check to make sure types.go are generated. addresses #4418 * Remove outdated manifests from install/kubernetes (#4882) * Remove orig_ manifests * Remove istio-mixer-validator and istio-mixer-with-health-check manifests * Remove unwanted manifests before archiving * Remove istio-sidecar-injector.yaml from install/README.md * Remove *one-namespace*.yaml from install/README.md * Make helm-generated manifests overwrite updateVersion_orig.sh manifests
* Do not add jwt and authn filter for TCP listner type (they are http filters) (#5034) * Improve virtual service validation (#5027) Automatic merge from submit-queue. Improve virtual service validation Adds more tests to virtual service validation and fixes a few issues. This PR improves the coverage but more test cases need to be added. * istioctl: fix default namespace handling for (de)register commands (#4493) * istioctl: fix default namespace handling for (de)register commands PersistentPreRun is only invoked for the leaf subcommand in the command chain and parent PersistentPreRun are skipped. In case of (de)register's PersistentPreRun, this skipped default namespace handling. Fortunatly, (de)register was only calling getRealKubeConfig which was already handled by the parent command. * fix lint errors * Removed duplicate core v1 import. (#4957) Automatic merge from submit-queue. Removed duplicate core v1 import. * update the --grpc-host-identities=istio-ca to preserve the default behavior. (#4972) Automatic merge from submit-queue. update the --grpc-host-identities=istio-ca to preserve the default behavior. We still need to set default value of --grpc-host-identities=istio-ca to support mesh expansion customer. * Support multiple route rules versions in Bookinfo e2e tests (#4896) * move route-rule-reviews-90-10 and route-rule-reviews-80-20 with all other routing rules otherwise, complex logic will be required to test them on v1alpha3 and v1alpha2 * added v1alpha3 versions of route-rule-reviews-80-20/90-10 * add TestFlags to e2e test framework * add reprocessRule * split demo_test.go into demo_test.go and main_test.go * refactor: extract getPreprocessedRulePath * add TestFlags definition and initialization * remove adding default rules from setUpDefaultRouting * remove deleting default rules on testConfig cleanup * remove adding default routing rule * print the migration rate in TestVersionMigration * add setting default rules in TestVersionMigration * ignore an error in deleting a rule that was deleted before ignore the error that will happen since the fifty rule redefines "reviews-default" rule and is deleted * add default rules to test version routing and test fault * add handling of rules config versions * refactor: extract allRules variable * add reviewsDestinationRule for v1Alpha3 * Skip -> Skipf * Remove forCA from CsrRequest. (#5018) Automatic merge from submit-queue. Remove ForCA from CsrRequest. ForCA is not used by Citadel any more. Instead, Citadel decides whether to sign certificate for workload or CA from the flag `sign-ca-certs`: https://github.com/istio/istio/blob/master/security/cmd/istio_ca/main.go#L218 * Use pod IP instead of host IP (#4988) Automatic merge from submit-queue. Correct IP for pilot debug tool. * Remove vendor files from .gitignore (#5048) * update Go control plane (#5053) Signed-off-by: Shriram Rajagopalan <shriramr@vmware.com> * updated istio/api repo version (#5050) * updated istio/api repo version * code review comments * Add myself to install/OWNERS (#5046) Automatic merge from submit-queue. Add myself to install/OWNERS * cleanup unused files (#5042) Automatic merge from submit-queue. cleanup unused files istio-pilot-e2e-v1alpha3 is not run anymore. istio-pilot-e2e-envoyv2-v1alpha3 replaced that test * Don't build anything in the init phase (#5035) Automatic merge from submit-queue. Don't build anything in the init phase make init shouldn't actually build any binary. That was introduced in #4773. * [Part2] Added `ImagePullSecrets` to `SidecarInjectionSpec` (#5002) Automatic merge from submit-queue. [Part2] Added `ImagePullSecrets` to `SidecarInjectionSpec` This PR updated the test cases related to SidecarInjectionSpec. Fixed #4870 /cc @ayj @yusuoh @linsun * Mixer filter fixes (#5012) * Add proxy instances to plugin input params * Fix yet another slice handling issue * Update mixer cluster names * Fix mixer cluster addresses * Fix mixer cluster addresses #2 * Put registry and secrets into a nodeagent package. (#5054) Automatic merge from submit-queue. Put registry and secrets into a nodeagent package. Two sub-package under nodeagent/ - `registry/`, is to handle the workload pod creation, deletion, etc, contains interaction with flexvolumedriver. - `secrets/`, is to handle the secrets management, containing envoy SDS API interaction. * Fix pilot gRPC port in mTLS mode (#4998) * Use 15011 for pilot gRPC port when mTLS enabled. * Update bootstrap test. * Add tls context to bootstrap v1 (ingress using this). * Update bootstrap golden data. * Add a prow test for bookinfo v1alpha3 route rules (#5049) Automatic merge from submit-queue. Add a prow test for bookinfo v1alpha3 route rules * Sets up Pilot and remote cluster for multicluster (#4997) Automatic merge from submit-queue. Sets up Pilot and remote cluster for multicluster This change creates a secret and configmap in order to start pilot in mulitcluster mode. It also creates the necessary resources on the remote cluster to deploy applications there. * Update helm chart version to 0.8.0 (#5030) * simple e2e test: fix error output and exit condition (#5060) * [pilot] Some minor cleanup of bootstrap code (#5071) Automatic merge from submit-queue. [pilot] Some minor cleanup of bootstrap code * Start GRPC server using native go stack, cleanup auth (#4867) * Refactoring the init code for grpc. * Merge cleanup * Add the secure grpc port (also serving https) * Revert cleanup * Refactor and fix AZ retrieval, add test * Change default since the tls-via-envoy uses that, remove verbose log * Allow custom DNS names for pilot * Panic if certs not found an policy requires it * Setting the add to emtpty will disable the mtls listener * Revert vendor again * Fix test failure * Typo * Make sure previous logs are dumped, it seems pilot may crash * Manual fix of fmt -c * Better fix for 'prev log' * Lint errors * routes no longer working * Mount certs to pilot pod * Lint fix and revert pod logs, it's used for something else * Format * Add missing transitive dependency envoyproxy/go-control-plane (#5076) Automatic merge from submit-queue. Add missing transitive dependency envoyproxy/go-control-plane Not entirely sure how these were missed previously. Regardless, running `dep ensure` adds them to vendor (no changes to our lock file since they're transitive). * Remove EUC validation code (#5080) Automatic merge from submit-queue. Remove EUC validation code Issue: #4744 * renamed e2e-bookInfoTests-v1alpha3.sh -> e2e-bookInfoTests-envoyv2-v1alpha3.sh (#5087) Automatic merge from submit-queue. rename e2e-bookInfoTests-v1alpha3.sh -> e2e-bookInfoTests-envoyv2-v1alpha3.sh * Revert "[pilot] Some minor cleanup of bootstrap code (#5071)" (#5090) This reverts commit d5fa2b7. Conflicts: pilot/pkg/bootstrap/server.go * Enable RBAC (#5082) * Use zipkin 2.6.0 in istio. (#4726) * Modernize istio-remote helm chart (#5083) Automatic merge from submit-queue. Modernize istio-remote helm chart Modifies endpoints and services to match master This PR further changes the Makefile target to use istio-${service}.istio-system This PR further adds in the CA service and deploys it by default * Remove gcloud docker calls (#5092) Automatic merge from submit-queue. Remove gcloud docker calls Fixes #4797 * Add per proxy pilot querying (#5096) * Add per proxy pilot querying This is the final chunk of functionality for the base proxy-config command. It adds the ability to query Pilot for a specific proxy as well as the full mesh support. * Linting... * Correct the reference file path in galley (#5000) * Add secret controller for multicluster (#5017) * Vendor changes adding Informers and Listers * Secret Controller code * Linter detected issues * Vendor update related sha change * Adding final bits to the controller * Fixing controller startup code * Adding required RBAC rules to watch for secrets * Refactor Cluster Store initialization place * Fixing Unit test failure * Fixing Unit test failure * Addressing comments part #1 * Fixing Unit test * Switching to different type of Informer * Add create k8s_cr.Cluster object * Fixing if statement * Fixing lint error * Cosmetic changes * Fixing lint error * Tests reproing 503s (was 404s) during routerule apply - with t.Skip() until we have fix (#1041) Reproduces 3 classes of bugs. with t.Skip() for now/until they are fixed. * Multicluster deployment of Apps on remote (#5099) Automatic merge from submit-queue. Multicluster deployment of Apps on remote This change deploys the remote applications in a multicluster test. It alps fixes a couple issues with the remote.yaml and a typo from a prior PR. It makes some fixes to match up with PR5083 * Use iptables TPROXY instead of REDIRECT for inbound traffic (#4654) * Pilot: Support running Envoy with CAP_NET_ADMIN to support TPROXY Add iptables as a dependency to the istio.deb package. Signed-off-by: Romain Lenglet <romain@covalent.io> * Pilot: Support iptables TPROXY instead of REDIRECT for inbound traffic Add iproute2 as dependency to the istio.deb package and the proxy_init Docker image. Add a "-m" command-line flag to istio-ipstables.sh to select the inbound traffic interception mode ("REDIRECT" or "TPROXY"). Fix the usage doc for the other command-line options. Signed-off-by: Romain Lenglet <romain@covalent.io> * Fix deb/* Makefile targets Signed-off-by: Romain Lenglet <romain@covalent.io> * Pilot: Configure transparent proxy redirection from proxy config Configure the redirection mode (redirect vs. tproxy) in ProxyConfig. Pass the redirection mode to Pilot in the Node's metadata, in environment variable ISTIO_META_INTERCEPTION_MODE. Fix Envoy bootstrap to remove the ISTIO_META_ prefix from metadata key names, instead of ISTIO_META. Signed-off-by: Romain Lenglet <romain@covalent.io> * Pilot: Parse Node Metadata and store it in Proxy Signed-off-by: Romain Lenglet <romain@covalent.io> * Update pilot/pkg/kube/inject tests Fix pilot/pkg/kube/inject tests to use the new templates. Add unit tests for the ProxyConfig.InterceptionMode. Signed-off-by: Romain Lenglet <romain@covalent.io> * Pilot: Define sidecar.istio.io/interceptionMode annotation Define sidecar.istio.io/interceptionMode to override the ProxyConfig.InterceptionMode mesh-wide setting. Signed-off-by: Romain Lenglet <romain@covalent.io> * added missing external service definition (#5094) * Fix the node agent E2E test, manually tested. (#5063) Automatic merge from submit-queue. Fix the node agent E2E test, manually tested. Fixed several things to get it work: - Update outdated flags value. - In `start_app.sh`, retain the node agent process log, instead just run it in background. - Make sure the initial root and cert is loaded. How to test this: - `make docker && make docker.push` - `go test -v istio.io/istio/security/tests/integration/nodeAgentTest --tag $(git log -1 --format="%H") --hub gcr.io/<your-project> -kube-config ~/.kube/config --skip_cleanup` We should also figure out how to not run `apt-get` install for every test run. Will follow up with key cert generation issue and then ensure it's stability before re-enabling it. cc @wattli * Bug fix in stackdriver adapter. (#5026) Automatic merge from submit-queue. stackdriver adapter bug fixes and clean up Several bug fixes and cleanups in stackdriver adapter: - add duration type into distribution processing, which is used for latency metrics. - remove a duplicated layer of loop. - remove metric kind override since now adapter could deal with not only custom metrics. - remove a log line which overwhelms mixer logs. * Only query the specific trace using the provided x-client-trace-id tag (#5066) Automatic merge from submit-queue. Only query the specific trace using the provided x-client-trace-id tag This change should make the trace query more efficient as it should only retrieve the single required trace instance. cc @kyessenov @nmittler Signed-off-by: Gary Brown <gary@brownuk.com> * integrate circle ci with testgrid (#4649) Automatic merge from submit-queue. Integrate circle CI with Testgrid The integration will start with mixer and simple e2e tests. Will expand to other jobs in subsequent PRs. Results are now available on https://k8s-testgrid.appspot.com/istio#circleci-e2e-mixer https://k8s-testgrid.appspot.com/istio#circleci-e2e-simple Those are test results from this PR exclusively. After this is merged, the results will be more interesting. The `ci_to_gubernator` binary is built from istio/test-infra#769. Feel free to also take a look. * Fixing postsubmit (#5104) * Implement Citadel prometheus monitoring feature. (#5015) Automatic merge from submit-queue. Implement Citadel prometheus monitoring feature. This monitoring feature exposes a service (port 9093) about Citadel status to prometheus. Yaml files are changed in #5072. Tests will be added shortly. * let stackdriver adapter figure out project metadata by itself (#5057) Automatic merge from submit-queue. let stackdriver adapter figure out project metadata by itself This will make stackdriver adapter config generalized and usable for different projects. * Fix labelNames getting filled in stackdriver logging. (#5058) Automatic merge from submit-queue. Fix labelNames getting filled in stackdriver logging. Added LabelNames in testdata/stackdriver.yaml file Also, fixed that if label was not string, it was getting filled as null. * Don't attempt to qualify the wildcard domain. (#5106) * Don't attempt to qualify the wildcard domain. * fmt * pilot-agent: add a flag to disable internal telemetry (#5101) Automatic merge from submit-queue. pilot-agent: add a flag to disable internal telemetry Enabled by default, but allows to turn off telemetry. Signed-off-by: Kuat Yessenov <kuat@google.com> * Add tcp proxy support for multi-cluster environment (#4694) Automatic merge from submit-queue. tcp proxy support in a multi-cluster environment. Services need to maintain per cluster service IPs. For sidecars running in a cluster, cluster-specific service IPs should be used to build destination_ip_list for envoy listeners. * Add missing namespace to citadel template (#5108) Automatic merge from submit-queue. Add missing namespace to citadel template * prune old version resources that no longer exist (#5107) Automatic merge from submit-queue. prune old version resources that no longer exist * [vendor-change] CloudWatch Mixer adapter (#4617) Automatic merge from submit-queue. [vendor-change] CloudWatch Mixer adapter Adding an adapter to send metrics to cloudwatch * Enable Ingress/Egress gateways in Helm for bookinfo demos (#5120) Automatic merge from submit-queue. Enable Ingress/Egress gateways in Helm for bookinfo demos * Consume labeled multicluster secrets on startup (#5117) Automatic merge from submit-queue. Consume labeled multicluster secrets on startup This patch when run against istio.yaml or istio-auth.yaml runs in the new config mode using only labels rather than configmaps. The configmap functionality can be removed in 0.9. * Add a linter check to make sure types.go are generated. (#5110) Automatic merge from submit-queue. Add a linter check to make sure types.go are generated. addresses #4418 * Remove outdated manifests from install/kubernetes (#4882) * Remove orig_ manifests * Remove istio-mixer-validator and istio-mixer-with-health-check manifests * Remove unwanted manifests before archiving * Remove istio-sidecar-injector.yaml from install/README.md * Remove *one-namespace*.yaml from install/README.md * Make helm-generated manifests overwrite updateVersion_orig.sh manifests
This monitoring feature exposes a service (port 9093) about Citadel status to prometheus.
Yaml files are changed in #5072.
Tests will be added shortly.