
envoyfilter: http1.1 case preserve; using auto_config HttpProtocolOptions causes xds push failures; gw CDS stale #36299

@noah8713


Bug Description

To support HTTP/1.1 header case preservation and HTTP/2 at the same time for clients, we enabled the case-preserve formatter with the auto_config option in the gateway context, as shown below. This causes the gateway's CDS to go stale.

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: header-case-preserve-1.10
  namespace: istio-system
  labels:
    operator.istio.io/component: "Pilot"
spec:
  configPatches:
  - applyTo: CLUSTER
    match:
      context: GATEWAY
      proxy:
        proxyVersion: '^1\.10.*'
    patch:
      operation: MERGE
      value:
        typed_extension_protocol_options:
          envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
            "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
            commonHttpProtocolOptions:
              idleTimeout: 60s
            auto_config:
              http_protocol_options:
                header_key_format:
                  stateful_formatter:
                    name: preserve_case  # preserve header case for response from backend service
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.http.header_formatters.preserve_case.v3.PreserveCaseFormatterConfig

logs:

warn	ads	ADS:CDS: ACK ERROR router~10.130.111.35~istio-ingressgateway-01-7f94d456df-wh2pk.istio-system~istio-system.svc.45.x.io-48901 Internal:Error adding/updating cluster(s) outbound|443||httpbin-svc.httpbin.svc.45.x.io: ALPN configured for cluster outbound|443||httpbin-svc.httpbin.svc.45.x.io which has a non-ALPN transport socket: name: "outbound|443||httpbin-svc.httpbin.svc.45.x.io"

istioctl proxy-status:

istio-ingressgateway-01-7f94d456df-2tftb                                   STALE (Never Acknowledged)     SYNCED     SYNCED       SYNCED       istiod-58559c447b-kvfwt     1.10-dev

This has data plane impact, because new workloads are not pushed to or realized on the gateway. After switching to explicit_http_config, the pilot push works as expected and the gateway status shows SYNCED.

istio-ingressgateway-01-7f94d456df-2tftb                                   SYNCED    SYNCED     SYNCED       SYNCED       istiod-58559c447b-kvfwt     1.10-dev
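
For comparison, here is a sketch of the explicit_http_config variant. It assumes only the auto_config key is swapped and the rest of the filter stays exactly as posted above; the comments are added for illustration.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: header-case-preserve-1.10
  namespace: istio-system
  labels:
    operator.istio.io/component: "Pilot"
spec:
  configPatches:
  - applyTo: CLUSTER
    match:
      context: GATEWAY
      proxy:
        proxyVersion: '^1\.10.*'
    patch:
      operation: MERGE
      value:
        typed_extension_protocol_options:
          envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
            "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
            commonHttpProtocolOptions:
              idleTimeout: 60s
            # explicit_http_config pins every matched cluster to HTTP/1.1,
            # so no ALPN-capable transport socket is required.
            explicit_http_config:
              http_protocol_options:
                header_key_format:
                  stateful_formatter:
                    name: preserve_case
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.http.header_formatters.preserve_case.v3.PreserveCaseFormatterConfig
```

The trade-off is that every gateway cluster matched by this patch speaks HTTP/1.1 upstream only.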

NOTE: case preserve for the SIDECAR_INBOUND/SIDECAR_OUTBOUND contexts of application pods is applied separately, at the application's namespace level, to decouple it from the gateway context at the Istio namespace level.

Flow: client -> ingress gateway (Envoy) -> sidecar (Envoy) -> ingress gateway -> client

However, with explicit_http_config, clients outside the mesh can no longer access HTTP/2/gRPC services in the mesh, because we are forcing the gateway's HTTP options to HTTP/1 only.

As per envoyproxy/envoy#13922, it seems that using auto_config causes a config failure when ALPN is not set on the transport socket, rather than auto-detecting HTTP/1.1 or h2.

Our goal is to enable the HTTP/1.1 case-preserve formatter for all clusters in the gateway context while still supporting HTTP/2/gRPC. The clusters are a mix of HTTP/1 and HTTP/2; clusters using HTTP/2 should automatically ignore case preserve, since it is only meant for HTTP/1.1 upstreams. Do we need extra settings or changes in addition to the filter above? Is there a simple way to set an ALPN transport socket (via VirtualService, Gateway, etc.) to avoid the failure when using auto_config to auto-detect upstream HTTP/1 or HTTP/2? Or is this a new bug?

For example, the transport socket details from the proxy config show that the cluster uses tlsMode-istio with the default alpnProtocols:

  name: outbound|443||httpbin-svc.httpbin.svc.45.x.io
  transportSocketMatches:
  - match:
      tlsMode: istio
    name: tlsMode-istio
    transportSocket:
      name: envoy.transport_sockets.tls
      typedConfig:
        '@type': type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        commonTlsContext:
          alpnProtocols:
          - istio-peer-exchange
          - istio
          combinedValidationContext:
            defaultValidationContext:
              matchSubjectAltNames:
              - exact: spiffe://y.x.io/ns/httpbin/sa/default
  - match: {}
    name: tlsMode-disabled
    transportSocket:
      name: envoy.transport_sockets.raw_buffer

If we use a cluster match to apply case preserve to a specific outbound cluster, gRPC/HTTP/2 traffic passing through the ingress gateway is unblocked for clients outside the mesh accessing HTTP/2/gRPC services in the mesh. However, since a cluster match is always an exact match, n clusters require n filters, which can be cumbersome.

Kind: EnvoyFilter
cluster:
        name: outbound|443||httpbin-svc.httpbin.svc.45.x.io
context: GATEWAY
xxxx
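
Spelled out, one such per-cluster filter might look like the sketch below. The metadata name is hypothetical, and the patch body is an assumption: it pins only this cluster to HTTP/1.1 with case preserve via explicit_http_config, leaving all other gateway clusters untouched.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: header-case-preserve-httpbin  # hypothetical; one filter per cluster
  namespace: istio-system
spec:
  configPatches:
  - applyTo: CLUSTER
    match:
      context: GATEWAY
      cluster:
        # Exact match only, so this block is repeated for every HTTP/1.1 cluster.
        name: "outbound|443||httpbin-svc.httpbin.svc.45.x.io"
    patch:
      operation: MERGE
      value:
        typed_extension_protocol_options:
          envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
            "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
            # Assumed patch body: HTTP/1.1 only, for this one cluster.
            explicit_http_config:
              http_protocol_options:
                header_key_format:
                  stateful_formatter:
                    name: preserve_case
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.http.header_formatters.preserve_case.v3.PreserveCaseFormatterConfig
```

With n HTTP/1.1 clusters this means maintaining n near-identical resources that differ only in metadata.name and cluster.name.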

Version

$istioctl version
client version: 1.10.4
control plane version: 1.10-dev-3e1cdc98208a4824619469fc09d984315f9aece8
data plane version: 1.10-dev (883 proxies)

$kubectl version --short
Client Version: v1.20.0
Server Version: v1.18.8-354+0-51-15.9012d3db7b3436

Additional Information

Pilot environment variables for protocol sniffing (default values):

PILOT_ENABLE_PROTOCOL_SNIFFING_FOR_OUTBOUND=true
PILOT_ENABLE_PROTOCOL_SNIFFING_FOR_INBOUND=false 
