Skip to content

auto_san_validation support #38722

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 17, 2022
Merged

Conversation

kfaseela
Copy link
Member

@kfaseela kfaseela commented May 4, 2022

Signed-off-by: Faseela K faseela.k@est.tech

Please provide a description of this PR:

RFC: auto san validation support

To help us figure out who should review this PR, please put an X in all the areas that this PR affects.

  • Configuration Infrastructure
  • Docs
  • Installation
  • [X ] Networking
  • Performance and Scalability
  • Policies and Telemetry
  • Security
  • Test and Release
  • User Experience
  • Developer Infrastructure

@kfaseela kfaseela requested a review from a team as a code owner May 4, 2022 15:48
@istio-testing istio-testing added the do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. label May 4, 2022
@istio-testing istio-testing added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 4, 2022
@kfaseela kfaseela force-pushed the auto_san_validation branch from 64f5015 to 0977e8d Compare May 4, 2022 16:03
@kfaseela kfaseela requested review from a team as code owners May 4, 2022 16:03
Signed-off-by: Faseela K <faseela.k@est.tech>
@kfaseela kfaseela force-pushed the auto_san_validation branch from 0977e8d to 250266a Compare May 4, 2022 16:34
@istio-testing istio-testing added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 4, 2022
mc.httpProtocolOptions.UpstreamHttpProtocolOptions = &core.UpstreamHttpProtocolOptions{}
}
if len(tls.Sni) == 0 {
mc.httpProtocolOptions.UpstreamHttpProtocolOptions.AutoSni = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we still want this even if VerifyCertAtClient is false?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, line 1149 is an OR condition so that we don't unnecessarily set empty attrs..did I miss something?

// Set auto_san_validation if VerifyCertAtClient feature flag is enabled
func (cb *ClusterBuilder) setAutoSniAndAutoSanValidation(mc *MutableCluster, tls *networking.ClientTLSSettings) {
if mc != nil && features.EnableAutoSni {
if len(tls.Sni) == 0 || (features.VerifyCertAtClient && len(tls.SubjectAltNames) == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not we return with some warning message if EnableAutoSni is true and len(tls.Sni) != 0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DR allows setting the sni explicitly, in that case we donot want to use auto_sni, and that is an allowed usecase. So I am not sure if we need a warning message in that case.

// Set auto_san_validation if VerifyCertAtClient feature flag is enabled
func (cb *ClusterBuilder) setAutoSniAndAutoSanValidation(mc *MutableCluster, tls *networking.ClientTLSSettings) {
if mc != nil && features.EnableAutoSni {
if len(tls.Sni) == 0 || (features.VerifyCertAtClient && len(tls.SubjectAltNames) == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we coupling AutoSanValidation with VerifyCertAtClient? both seems unrelated to me.
IMO, for setting AutoSanValidation, we should just look for EnableAutoSni(may be rename it to EnableAutoSniAndSANValidation) and len(tls.SubjectAltNames) == 0).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider the case, mode MUTUAL with user configured ca certificates. In this case VerifyCertAtClient could be false because client side does not need system ca for server cert verification, but AutoSanValidation can still be desired for auto-sni

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SAN validation doesn't even work without VerifyCertAtClient unless you are extremely careful as far as I understand. 99% of users would end up not actually validating the SAN.

VerifyCertAtClient=false is a horrible default only for backwards compatibility

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SAN validation doesn't even work without VerifyCertAtClient unless you are extremely careful

careful means configuring user ca certs in the DR correctly?

VerifyCertAtClient=false is a horrible default only for backwards compatibility

yeah, enabling it by default will mean SAN is validated using system ca certs if user has not provided ca certs and that seems better ok.

If we want to go ahead with relying on VerifyCertAtClient for AutoSanValidation, does it make sense to add validation logic to verify VerifyCertAtClient is set to true if EnableAutoSni is set to true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vikaschoudhary16 : We wanted to keep EnableAutoSni separate, as that is not tightly coupled with any validation. That way it gives flexibility to use only auto_sni part. However AutoSanValidation is turned on, only if VerifyCertAtClient is enabled, and that looks better?

mc.httpProtocolOptions.UpstreamHttpProtocolOptions = &core.UpstreamHttpProtocolOptions{}
// Set auto_sni if sni field is not explicitly set and if EnableAutoSni feature flag is enabled.
// Set auto_san_validation if VerifyCertAtClient feature flag is enabled
func (cb *ClusterBuilder) setAutoSniAndAutoSanValidation(mc *MutableCluster, tls *networking.ClientTLSSettings) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is too complex, need to simplify

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please check if it looks better now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would split into auto_sni and auto_san since rioght now they are very mixed but don't seem to need to be

if len(tls.Sni) == 0 {
mc.httpProtocolOptions.UpstreamHttpProtocolOptions.AutoSni = true
}
if features.VerifyCertAtClient && len(tls.SubjectAltNames) == 0 {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to check tls.SubjectAltNames, because if the TLS layer SAN checking fails, the connection can not setup , so it will not impact to L7 SAN checking. And if this TLS layer SAN checking passed, it also does not impact L7 checking. But if you would like to only keep one check, you can do that. The pseudo-code may look like this:

func setAutoSniAndAutoSANValidation(){
	if len(tls.Sni) != 0 || !features.EnableAutoSni
	    return
		
	if mc.httpProtocolOptions == nil {
		mc.httpProtocolOptions = &http.HttpProtocolOptions{}
	}
	if mc.httpProtocolOptions.UpstreamHttpProtocolOptions == nil {
		mc.httpProtocolOptions.UpstreamHttpProtocolOptions = &core.UpstreamHttpProtocolOptions{}
	}
        mc.httpProtocolOptions.UpstreamHttpProtocolOptions.AutoSni = true
	
	if features.VerifyCertAtClient //&& len(tls.SubjectAltNames) == 0
	    mc.httpProtocolOptions.UpstreamHttpProtocolOptions.AutoSanValidation = true
}

BTW, I also think AutoSanValidation is not related to VerifyCertAtClient.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @hobbytp : can't we set auto_san_validation alone to true, if len(tls.Sni) != 0 ? That works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused. Isn't it all tls handshake checking?

L7 San checking only is in authorization policy?

Copy link

@hobbytp hobbytp May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kfaseela auto_san_validation only enabled when auto_sni is enabled. or I miss anything here?
refer here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto sni or auto san validation can be be set only when they are not set in UpstreamTlsContext, otherwise it will override.

So here must check tls. SubjectAltNames and SNI

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think we can, at least from the envoy proxy perspective, because one is for L4, one is for L7, and their configurations are separated, right?
But maybe I got your point, you want to disable the auto_san_validation (in L7) when L4 configured SAN list in DR from istio perspective? Just like you said "You should not be able to set explicitly SANs and use auto san", right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no L4 & L7, SAN check is during tls handshake.

According to envoy document, auto_sni and auto_san_validation will override that specified in TransportSocketMatch, which is set from DR's TLS setting.

So here we definitely not to override user explicitly set ones.

Copy link

@hobbytp hobbytp May 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I also found that now, you are right.
So the left questions are which SNI, SAN we want to take high priority. DR's or auto.
seems both you and John's comments prefer DR's to be the high priority, which is different from envoy logic?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we want to and must respect user explicitly setting, this is the convention of declaring API Semantics.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable.

Signed-off-by: Faseela K <faseela.k@est.tech>
@kfaseela kfaseela changed the title WIP: auto_san_validation support auto_san_validation support May 10, 2022
@istio-testing istio-testing removed the do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. label May 10, 2022
@kfaseela
Copy link
Member Author

/test unit-tests_istio

if setAutoSni {
mc.httpProtocolOptions.UpstreamHttpProtocolOptions.AutoSni = true
}
if setAutoSanValidation {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the auto san validation should be set with auto sni

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hzxuzhonghu Please take a look at my comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If only sni specified in DR, but not san. And then set auto_san_validation here, which san should it check>?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

envoy honors auto_sni and auto_san_validation config independent of each other: https://github.com/envoyproxy/envoy/blob/main/source/common/router/router.cc#L516

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering how should we disable san check when features.VerifyCertAtClient = true, the env indicates checking ca bundle only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is two different case, i mean how to only verify ca.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@howardjohn does it make sense to have a separate feature flag only for auto_san_validation? As per the envoy code mentioned above, both can be set independently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO no. I do not know any TLS client that separates these two. You either validate CA+SAN or neither. It doesn't make any sense to validate SAN if you do not validate CA IMO.
For example, curl has -k and Go has InsecureSkipVerify.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @howardjohn . In that case, probably the current PR is good enough, and you can do one more round of review?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense.

@istio-testing istio-testing merged commit ac4e391 into istio:master May 17, 2022
@jacob-delgado
Copy link
Contributor

/cherry-pick release-1.14

@istio-testing
Copy link
Collaborator

@jacob-delgado: new pull request created: #39004

In response to this:

/cherry-pick release-1.14

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

stevenctl pushed a commit to stevenctl/istio that referenced this pull request Jun 1, 2022
* fixed the multicluster tests failure for ca custom root tests (istio#38761)

* fixed the multicluster tests failure for ca custom root tests

* operator change

* removed print statements

* testing istiod remote multicluster env

* istiod remote testing

* testing istiod remote cases

* testing istiod remote multicluster

* updated ns creation logic in test

* added plugin certs to config cluster as well

* remove debugging print statement

* bump pdb version to policy/v1 (istio#38814)

* add v1 pdb

* release note

* refactor: remove redundant embeded interface in kube.Client (istio#38789)

* refactor: remove redundant embeded interface in kube.Client

* test: give a fake k8s client otherwise `s.kubeClient.Kube()` will panic

* Make use oftransport socket name directly from envoy api (istio#38821)

* Only expand vhost domains for local services (istio#38713)

* Only expand vhost domains for local services

When building outbound sidecar config, generate an expanded set of
hostnames (e.g. fqdn + short hostnames) only for local kubernetes
services in a given cluster.

* use DNSDomain instead of isClusterLocal(svc)

* gofumpt and gocritic

* release notes

* Clarify comments

* fix release notes

* fix unintended commit

* fix release notes

* add nil check to authz builder (istio#38825)

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* [tf] Refactor echo forwarder (istio#38253)

A number of improvements for better overall performance and code cleanup.

Connection pooling now better supported for grpc and http.]

* stackdriver: add support for testing with real Cloud Monitoring backend (istio#38605)

* stackdriver: add support for testing with real Cloud Monitoring backend

In certain testing environments, it may desirable to execute the tests in different contexts, including GKE clusters in distinct projects, from the projects that the data plane proxies are running in. In these cases, the Test Framework needs to support a way to "discover" the resource project for a cluster. This CL establishes a mechanism for that via `cluster.Config`'s `Meta` which is mapped into a new field in `Topology`.

With such a mechanism in place, the CL then expands the Test Framework stackdriver component interface to allow specification of a resource project as necessary (falling back to the discovered project of the running test code).

Test invocations can then establish the appropriate project information via `cluster.Config` as needed per environment. If left unspecified (as it is here) the project of the test will be used by default. As a result, this should be a no-op for the vast majority of Istio testing.

* log cluster project to aid debugging

* Add topologySpreadConstraints (istio#38558)

* revise groupversionaliaskind (istio#38829)

* fix outdate comments (istio#38842)

* istioctl: support `gateway` as input type (istio#38818)

This allows things like `istioctl proxy-config listeners gateway/my-gateway`. Also
included some bug fixes for namespace resolution (which is valid even
without the Gateway part).

* revert PR istio#38659 (istio#38826)

* Automator: update common-files@master in istio/istio@master (istio#38859)

* Automator: update istio/client-go@master dependency in istio/istio@master (istio#38860)

* Refactoring and extending `registryredirector` (formerly `containerregistry`) integration test component (istio#38838)

* Use Json instead of Post Form

* Renaming the container registry component to `registryredirector` in
order to avoid misconception of the integration testing component.

In addition, to override the target registry and the docker image of
registryredirector itself, options are added.

* Renaming the omitted portions

* Use a simpler field name

* initSidecarScopes: remove the unused variable (istio#38867)

* [tf] Add debug utils to echo images (istio#38873)

* update flaky testcases for envoy filter (istio#38862)

* Automator: update common-files@master in istio/istio@master (istio#38874)

* pilot: minor test simplification (istio#38810)

Was cleaning up for another bigger change I abandoned. But might as well
push up this part.

* Automator: update istio/api@master dependency in istio/istio@master (istio#38879)

* Bump go modules, including Kubernetes 1.24 (istio#38721)

* Bump go modules, including Kubernetes 1.24

* lint

* Revert "Additional changes for istioctl analyzer when checking envoyFilter patch operations (istio#38389)" (istio#38851)

This reverts commit 7e61166.

* Update BASE_VERSION to master-2022-05-12T19-01-29 (istio#38886)

* Merge changes from istio template (istio#38837)

* Merge changes from istio template

* Remove security section

* Missing end

* Make gen, remove the ifs and add lifecycle - grpc starts very fast

* Few debug statements to figure out why wait doesn't work

* Use the agent ready port

* Rever the deugging changes

* make gen

* Automator: update istio/client-go@master dependency in istio/istio@master (istio#38878)

* remove unused code (istio#38848)

* Refactor gateway server tls adjugement  (istio#38895)

* rewrite IsTCPServerWithTLSTermination

* Refactor gateway server adjugement

* Address comments

* Fix ut

* fix: increase 100% test coverage for telemetry_test.go (istio#38900)

* fix: increse 100% test coverage

* fix: remove comments

* remove deprecated access_log_path in bootstrap (istio#38902)

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* Automator: update common-files@master in istio/istio@master (istio#38906)

* Automator: update istio/client-go@master dependency in istio/istio@master (istio#38907)

* [tf] Consistent namespace creation (istio#38875)

Currently, the logic for creating a namespace is slightly different in `newKube` vs `claimKube`, but it shouldn't be. This change extracts the logic into a common function. Also does some general cleanup to consolidate code.

* minor renaming of function (istio#38904)

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* WasmPlugin: Fix to update the checksum cache when the module cache hits (istio#38911)

* Fix a bug that `checksums` mapping is not updated when the retrieved
OCI image was already pulled before.

* Update pkg/wasm/cache_test.go

Co-authored-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* Update pkg/wasm/cache_test.go

Co-authored-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* removea temporal debug log

Co-authored-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* fix incorect sds logs (istio#38901)

* fix incorect sds logs

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* fix ut

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* revert

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* Add custom ErrorLog handler to webhook server (istio#38928)

Reduce log level for the webhook server's TLS handshake error to debug to avoid flooding the log in case multiple probes are hitting :8080/ready concurrently.

Background: Pilot-discovery invokes its own webhook readiness server at :15017/httpsReady when receiving a readiness request on :8080/ready. This is done using the webhook readiness client based on http.Client.

The default behavior of http.Client is to reuse an already established TCP connection to the host if possible or otherwise to initiate one. When multiple readiness requests coincide, such as when both readiness and liveness probes are hitting the same endpoint, the webhook readiness client can encounter situations where the initial TCP connection is busy serving the first probe. In such cases http.Client will begin to open a second connection. If the first connection finishes to serve its request and subsequently becomes ready to handle another before the second connection finishes handshaking, http.Client will abort the connection setup by sending a TCP reset and reuse connection 1 also to serve probe 2.

When http.Server/the webhook server receives TCP RST during the TLS handshake, it produces the following error:

http: TLS handshake error from 127.0.0.1:<RANDOM PORT>: EOF

Signed-off-by: Faseela K <faseela.k@est.tech>

* Adds example under /samples for envoy SDS integration (istio#37948)

* Adds sample for a SPIRE integration quick start

* Adds README and supporting files for SPIRE integration example

* address linting issues

* address linting issues

* fix required version

* update CRD API version

* Set spire-agent deployment spec to DaemonSet

* add note about k8s-workload-registrar

Co-authored-by: Alexandre Alves Alvino <alexandre.alvino@hpe.com>

* move cluster cache code to a separate file (istio#38922)

* move cluster cache code to a separate file

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* lint

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* add more comments on full push (istio#38926)

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* Fix pod label panic (istio#38933)

* Fix panic

* rm esc.newEndpointBuilder

* Revert "[tf] Consistent namespace creation (istio#38875)" (istio#38937)

This reverts commit 10be586.

* EnvoyFilter is also applied to HTTP Proxy Listener (istio#38924)

The function patchOneListener, that applies the EnvoyFilter patches in the listener, is called in [buildHTTPProxyListener()](https://github.com/istio/istio/blob/bb93b2a69c73c0b3cbdb2f262e36a3e67bb369f5/pilot/pkg/networking/core/v1alpha3/listener_builder.go#L91). However, the `envoyFilterWrapper` is only populated in the [patchListener()](https://github.com/istio/istio/blob/bb93b2a69c73c0b3cbdb2f262e36a3e67bb369f5/pilot/pkg/networking/core/v1alpha3/listener_builder.go#L145), so the call to patchOneListener in buildHTTPProxyListener() do not apply any changes to the listener created for the HTTP Proxy.

So, to fix this bug, the patchOneListener for the HTTP Proxy listener moved to inside the patchListener(), where the other listener are patched. A new test were added to ensure that the listener are indeed removed.

Signed-off-by: Alan Diego dos Santos <alandiegosantos@gmail.com>

* Automator: update common-files@master in istio/istio@master (istio#38942)

* Automator: update istio/client-go@master dependency in istio/istio@master (istio#38943)

* [tf] Add more utils to echo containers (istio#38947)

This is in part to force a rebuild of the base images, now that the regex has been expanded to detect changes to the echo base images (istio/test-infra#4047).

* Update BASE_VERSION to master-2022-05-16T21-51-27 (istio#38950)

* istiod async changes (istio#38307)

* async code updated

    pick f1ad418285 async code updated

* fixed remotte multicluster test failure

* fixed flaky unit test

* fixed jwks resolver unit test

* update policy applier test

* fixed the race ccondition issues

* resolved the PR comments

    pick 52cca209df fixed the race ccondition issues
    pick ccb21951aa resolved the PR comments
    pick 0d1a4d6abc fixed a pr comment
    pick 4d9a67dda0 update one unit test

* fixed a pr comment

* update one unit test

* update the policy applier unit tests

* fixed linter issues

* resolved unit test failure issues

* added logic in main flow with an optimization with azync fetching

* updated the function name

* decreased the retry interval for async fetch

* refactored unit tests

* removed the print statements

* updated unit test

* logic to refresh one single jwksuri

* updated isiod pilot logic

* Automator: update proxy@master in istio/istio@master (istio#38952)

* tf: disable wait on YAML (istio#38939)

If we do this long term we should probably remove the option entirely.
For now, just hack it out to see the impact.

* Fix double run of controller (istio#38949)

* Add filter to endpointslice (istio#38961)

* feat: Add skaffold.yaml for helm charts (istio#36591)

* feat: Add skaffold.yaml

* fix: Correct indentation

* fix: Add dependancies

* feat: Add kiali to installation

* fix: move skaffold file

* fix: link kiali

* chore: Update skaffold readme

* feat: Add bookstore and prometheus

* Update samples/cicd/skaffold/README.md

* Update samples/cicd/skaffold/README.md

* Update samples/cicd/skaffold/README.md

* fix lint

Co-authored-by: craigbox <craigbox@google.com>

* fix: ImageType not taking effect (istio#38964)

* [tf] Improve echo server connection handling (istio#38948)

PR istio#37914 is seeing VMs crashing due to file descriptor exhaustion. This PR attempts to improve the way that the echo servers are handling server-side connections, allowing connections to close more quickly.

* auto_san_validation support (istio#38722)

* WIP: auto_san_validation support

Signed-off-by: Faseela K <faseela.k@est.tech>

* Simplify setAutoSniAndAutoSanValidation func logic

Signed-off-by: Faseela K <faseela.k@est.tech>

* [tf] Consistent namespace creation (istio#38969)

This reverts commit 26dc4d1.

Currently, the logic for creating a namespace is slightly different in `newKube` vs `claimKube`, but it shouldn't be. This change extracts the logic into a common function. Also does some general cleanup to consolidate code.

* add ut for util (istio#38965)

Signed-off-by: AllenZMC <zhongming.chang@daocloud.io>

* No delta for grpc XDS, no envoy prometheus fetching. (istio#38919)

* [tf] Cleanup test dump operations (istio#38970)

Adding cluster information to a number of error logs. Also avoiding errors when attempting to dump control plane debugging information for non-primary cluster.

* [tf] Reduce test traffic from converge (istio#38972)

This is really more of an RFC. The attempt here is to help reduce the overall test time and the amount of traffic we're sending for each test.

Many of our tests attempt to hit all clusters, and they do this by setting `count = 5 * num workloads`. However, the default converge value is 3 and the number of clusters in our multi-cluster setup is also 3, which means that the number of actual requests on the wire would be `converge * 5 * num workloads` = `3 * 5 * 3` = `45`.

In these cases it's not clear if converge is helpful. If we're already sending enough traffic to guarantee we're hitting each cluster and everything passes, do we need to do it all over again?

This PR disables converge for cases where `count > converge`. In the previous example, that would mean that a typical call would result in `5 * 3` = `15` requests on the wire, reducing our test traffic by up to 67%. It's not clear what impact this would have on test time, but it's probably safe to assume it would be significant.

* [tf] Add local authz server for vms (istio#38971)

* Sds cleanup (istio#38899)

* use k8s utility

* Move DisableAuthorizationForTest to fake.go

* Add license

* rm getLocalityFromTopology (istio#38955)

* prow: update default image to 1.24.0 and fix remote-secret creation (istio#38885)

* prow: update default image to 1.24.0

Change-Id: I5b8732367af80a948f857a71d14102ca0549436b

* wait for token data to be available

Change-Id: I1e91d11ade8f053ee05a962372357a5b40627fd5

* fix test

Change-Id: Ide970ec6dac60b1020adeeed5955bb5dcf0f69cd

* fix lint and release note

Change-Id: I8527d0325527899ab20dcf3b01c642c3dac94907

* remove go install

Change-Id: I248c3464b478c179f7e9b9a999de42121f88d5e1

* handle err

Change-Id: If0496c54b12105110ec93841961c40dd8d11fb49

* unit test coverage

Change-Id: I7756310338deb98f9284976cdd11f254424a45a3

* Only build workloadInstance when se selects pod enabled (istio#38925)

* Disable se selects pods

* Optimization: skip building workloadinstance when no workload handler registered

* Add release note

* Revert PILOT_ENABLE_SERVICEENTRY_SELECT_PODS

* Revert PILOT_ENABLE_SERVICEENTRY_SELECT_PODS

* Fix typos (istio#38985)

* make use of networks to determine internal address config (istio#38801)

* make use of networks to determine internal address config

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* add test and release notes

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* lint

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* Simplify ALPN protocols selection for inbound listener (istio#38994)

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

* enabled reachability tests in multi-cluster env (istio#38893)

* enabled reachability tests in multi-cluster env

* updated reachability tests

* added comments

* improve test coverage for camelcase (istio#38995)

Signed-off-by: AllenZMC <zhongming.chang@daocloud.io>

* remove warnings from istioctl envoy filter analyzer during first install (istio#38998)

* set priority in default envoyFilters to -1

* update release note msg

* update test

* Update BASE_VERSION to master-2022-05-18T19-02-54 (istio#39008)

* Fix cert rotation for proxyless, don't start SDS (istio#38980)

* Fix cert rotation for proxyless, don't start SDS

* Fix lint

* Remove trailing space

* make gen

* Make gen, remove the annotation check

* Update bookinfo images (istio#39012)

* Update bookinfo images

* put back docker.io prefix

* fix log line in aggregate controller (istio#39030)

* fix log line

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* fix log line

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* Automator: update istio/api@master dependency in istio/istio@master (istio#39033)

* Automator: update istio/client-go@master dependency in istio/istio@master (istio#39034)

* Allow all the insecure servers when WASM_INSECURE_REGISTRIES has a "*" (istio#39024)

* Allow all the insecure servers when the environment variable WASM_INSECURE_REGISTRIES has a "*"

* Fix the flakiness

* Automator: update common-files@master in istio/istio@master (istio#39035)

* WasmPlugin: Add the integration test for wasm in OCI Images. (istio#38918)

* Add the integration test for the wasm plugin.

* fix lint

* fix lint

* Fix the template parameter

* Fix test name and secret name

* Move the tests into the existing wasm folder, and share the suite with the others.

* Fix lint errors

* Use c++ wasm codes

* Add license and copyright

* Add "image-pull-policy" test feature

* Reflect the comments

* Simplify the retry structure

* [tf] Adding more utils to echo vms (istio#39042)

In particular, adds DNS utilities to aid in debugging.

* Update BASE_VERSION to master-2022-05-19T20-51-21 (istio#39048)

* lint: update to 1.46 rules (istio#38951)

* Rerun gofumpt

* Update remaining issues

* revert accidental change

* fix copy order

* [tf] Fix dump of echo vm (istio#39043)

Fixes the dumping of VMs pods that have an authz container. Also does some general cleanup around the handling of container names.

* [tf] Echo cleanup tcp server (istio#39038)

This adds better logging for the TCP server and also avoids forceClose when the request completes normally.

* Fixing typos in start_csrctrl.go (istio#39059)

Signed-off-by: Faseela K <faseela.k@est.tech>

* fix: update test cases (istio#39021)

* fix: udpate test cases

* fix: remove invalid test cases

* add log line to indicate envoy discovery request (istio#39060)

* add log line to indicate envoy discovery request

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* add comma

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* remove remote profile and update precheck (istio#38903)

* remove remote profile and update precheck

* add release note

* remove remote profiile check

* istioctl generate remote config add more useful info

* fix test

* update pr

* add String function in provider id (istio#39061)

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

* go.mod: update dependencies (istio#39065)

* Automator: update common-files@master in istio/istio@master (istio#39066)

* [tf] Fix DNS for echo VM (istio#39040)

When running on GKE, tests are flaky for VMs due to the fact that kubelet can overwrite the custom hosts in `/etc/hosts`. See [k8s docs](https://kubernetes.io/docs/tasks/network/customize-hosts-file-for-pods/#why-does-kubelet-manage-the-hosts-file).

This PR avoids the use of a custom `/etc/hosts` in favor of specifying the custom istiod host via `hostAliases`.

Tests seem to pass consistently on GKE with this change.

Fixes istio#39013

* Revert "WasmPlugin: Add the integration test for wasm in OCI Images. (istio#38918)" (istio#39069)

This reverts commit 0bbee37.

* TestTraffic: use a new TrafficContext instead of list of cases (istio#39053)

* wip

* TestTraffic: use a new `TrafficContext` instead of list of cases

* fix changed filters

* fix selector

* Automator: update istio/client-go@master dependency in istio/istio@master (istio#39068)

* [tf] Improvements to echo component lib (istio#39072)

* Automator: update common-files@master in istio/istio@master (istio#39074)

* Automator: update common-files@master in istio/istio@master (istio#39090)

* improve test coverage for string (istio#39064)

Signed-off-by: AllenZMC <zhongming.chang@daocloud.io>

* Bump go yaml v3 to fix crasher (istio#39087)

* gateway-api: update dependency and fix conformance test errors (istio#38910)

* gateway tests: sort golden files

* deps: bump gateway-api

* Fix mesh allowed

* bump again

* WasmPlugin: Change the file path scheme of the cached wasm modules (istio#39071)

* Change the filename scheme of the cached wasm modules

* Fix lint errors

* fix using gofumpt

* telemetry tests: use standard deployment for echo (istio#39051)

* telemetry tests: use standard deployment for echo

This brings consistency with the security and pilot tests of using the
same core deployment everywhere. In particular, I was motivated by
https://github.com/istio/istio/pull/38495/files#diff-b3848f393e9a44a4e5f8cdb34f9f5715eab5fa22f975870ddce60ee29cddab86R224
which is doing external traffic -- the common deployment has a
standardized (and, IMO, safer) way of doing this that can easily be
utilized with this change.

* move setup

* fix dashboard

* fix rebase

* Automator: update istio/client-go@master dependency in istio/istio@master (istio#39076)

* Automator: update common-files@master in istio/istio@master (istio#39092)

* Fuzzing: Fix OSS-fuzz build (istio#39096)

* Use `apps cdeployment.SingleNamespaceView` instead of appNsInst (istio#39098)

* fix FuzzServiceDeepCopy (istio#39104)

* Fix stale svc clusterVIP fo multicluster (istio#39058)

* Fix stale svc clusterVIP fo multicluster

* Rm log

* Address comments

* Fix test flake by using sync controller (istio#39105)

Co-authored-by: Aryan Gupta <garyan@google.com>
Co-authored-by: Xinnan Wen <iamwen@google.com>
Co-authored-by: drivebyer <wuyangmuc@163.com>
Co-authored-by: Zhonghu Xu <xuzhonghu@huawei.com>
Co-authored-by: Aaron Birkland <aaron.birkland@solo.io>
Co-authored-by: Rama Chavali <rama.rao@salesforce.com>
Co-authored-by: Nathan Mittler <nmittler@gmail.com>
Co-authored-by: Douglas Reid <douglas-reid@users.noreply.github.com>
Co-authored-by: Jesus Munoz <jesus.munoz@solo.io>
Co-authored-by: Xiaopeng Han <hanxiaop8@outlook.com>
Co-authored-by: zirain <hejianpeng2@huawei.com>
Co-authored-by: Istio Automation <istio-testing-bot@google.com>
Co-authored-by: Ingwon Song <102102227+ingwonsong@users.noreply.github.com>
Co-authored-by: dwq <41563853+dddddai@users.noreply.github.com>
Co-authored-by: AndreaM12345 <ayma@us.ibm.com>
Co-authored-by: Costin Manolache <costin@gmail.com>
Co-authored-by: ramanujadasu <vsrr.ramanujadasu@rakuten.com>
Co-authored-by: Faseela K <faseela.k@est.tech>
Co-authored-by: Juliano Fantozzi <juliano.fantozzi@hpe.com>
Co-authored-by: Alexandre Alves Alvino <alexandre.alvino@hpe.com>
Co-authored-by: alandiegosantos <alandiegosantos@gmail.com>
Co-authored-by: Matthew McLeod <mtthwmcld@gmail.com>
Co-authored-by: craigbox <craigbox@google.com>
Co-authored-by: Zhongmin <zhongming.chang@daocloud.io>
Co-authored-by: Steven Landow <landow@google.com>
Co-authored-by: Jacek Ewertowski <jewertow@redhat.com>
Co-authored-by: Frank Budinsky <frankb@ca.ibm.com>
Co-authored-by: LiuDui <liuduidui@beyondcent.com>
Co-authored-by: AdamKorcz <44787359+AdamKorcz@users.noreply.github.com>
@kfaseela kfaseela deleted the auto_san_validation branch August 5, 2022 08:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/networking size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants