Some ServiceEntry hostnames can cause non-deterministic Envoy routes #38678

@birkland

Bug Description

If a ServiceEntry host has a name that fits the pattern *.<namespace>.svc.*, its hostname will undergo the same expansion as local Kubernetes Services. This can result in non-deterministic Envoy routes for short non-fully-qualified names.

An example patch to the existing unit tests will illustrate:

index 9ccd8c8b48..b6d5f65b73 100644
--- a/pilot/pkg/networking/core/v1alpha3/httproute_test.go
+++ b/pilot/pkg/networking/core/v1alpha3/httproute_test.go
@@ -111,6 +111,21 @@ func TestGenerateVirtualHostDomains(t *testing.T) {
                                "echo.default:8123",
                        },
                },
+               {
+                       name: "non-k8s service",
+                       service: &model.Service{
+                               Hostname:     "foo.default.svc.bar.baz",
+                               MeshExternal: false,
+                       },
+                       port: 8123,
+                       node: &model.Proxy{
+                               DNSDomain: "default.svc.cluster.local",
+                       },
+                       want: []string{
+                               "foo.default.svc.bar.baz",
+                               "foo.default.svc.bar.baz:8123",
+                       },
+               },
                {
                        name: "k8s service with default domain and different namespace",
                        service: &model.Service{

The result of this test is:

    ./github.com/istio/istio/pilot/pkg/networking/core/v1alpha3/httproute_test.go:246: unexpected virtual hosts:
        got  [foo.default.svc.bar.baz foo.default.svc.bar.baz:8123 foo foo:8123 foo.default.svc foo.default.svc:8123 foo.default foo.default:8123]
        want [foo.default.svc.bar.baz foo.default.svc.bar.baz:8123]

The problem is that the hostname foo.default.svc.bar.baz is being interpreted as the name of a Kubernetes service here, and therefore alternate service names are being generated for it.
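The expansion can be reproduced outside of pilot with a minimal sketch. The `altHosts` helper below is hypothetical and is not Istio's implementation; it assumes the decision is based only on where `.svc.` appears in the hostname and on the namespace matching the proxy's DNS domain, which is enough to reproduce the `got` list above. The `strict` flag shows one possible guard (requiring the hostname to end in the proxy's own `.svc.<cluster domain>` suffix); it is an illustration, not necessarily the fix that should be adopted.

```go
package main

import (
	"fmt"
	"strings"
)

// altHosts is a hypothetical helper, not Istio's code. It treats any hostname
// containing ".svc." as a Kubernetes Service name and derives short aliases.
// With strict set, aliases are produced only when the hostname also ends in
// the proxy's own ".svc.<cluster domain>" suffix.
func altHosts(hostname, proxyDomain string, port int, strict bool) []string {
	pi := strings.Index(proxyDomain, ".svc.")
	hi := strings.Index(hostname, ".svc.")
	dot := strings.Index(hostname, ".")
	if pi < 0 || hi < 0 || dot < 0 || dot >= hi {
		return nil // not shaped like <name>.<namespace>.svc.<domain>
	}
	if strict && !strings.HasSuffix(hostname, proxyDomain[pi:]) {
		return nil // looks like a k8s name but belongs to another domain
	}
	if hostname[dot+1:hi] != proxyDomain[:pi] {
		return nil // different namespace: left out of this sketch
	}
	names := []string{
		hostname[:dot],          // foo
		hostname[:hi] + ".svc",  // foo.default.svc
		hostname[:hi],           // foo.default
	}
	out := make([]string, 0, 2*len(names))
	for _, n := range names {
		out = append(out, n, fmt.Sprintf("%s:%d", n, port))
	}
	return out
}

func main() {
	proxy := "default.svc.cluster.local"
	// Loose matching expands the ServiceEntry hostname like a local Service.
	fmt.Println(altHosts("foo.default.svc.bar.baz", proxy, 8123, false))
	// Strict matching produces no aliases for the foreign domain.
	fmt.Println(altHosts("foo.default.svc.bar.baz", proxy, 8123, true))
	// A hostname in the proxy's own cluster domain is still expanded.
	fmt.Println(altHosts("foo.default.svc.cluster.local", proxy, 8123, true))
}
```

With `strict` set to false, the ServiceEntry hostname picks up the aliases `foo`, `foo.default.svc` and `foo.default`, exactly the extra entries reported by the failing test.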

If there are any other ServiceEntries or Kubernetes Services with a name of the form foo.default.svc.*, then the expanded alternate/short names will conflict and be de-duplicated. The de-duplication is non-deterministic, so which of the conflicting entries wins is arbitrary. The net result is that after restarting pilot, requests to http://foo might be routed to any matching Service or ServiceEntry.
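To make the conflict concrete, here is a small sketch of a "first claimant wins" de-duplication. Everything in it is hypothetical: the `claim` type, both functions, and the second hostname `foo.default.svc.cluster.local` are assumptions for illustration, and the sketch makes no claim about where the ordering in pilot actually comes from. It only shows that the owner of a short name such as `foo` depends on processing order unless that order is fixed first.

```go
package main

import (
	"fmt"
	"sort"
)

// claim pairs a virtual-host domain with the service hostname that generated
// it. The type and both functions are hypothetical illustrations.
type claim struct {
	domain  string
	service string
}

// firstWins keeps the first service that claims each domain, so the result
// depends entirely on the order in which claims arrive.
func firstWins(claims []claim) map[string]string {
	owner := make(map[string]string)
	for _, c := range claims {
		if _, taken := owner[c.domain]; !taken {
			owner[c.domain] = c.service
		}
	}
	return owner
}

// stableFirstWins sorts a copy of the claims by service hostname before
// de-duplicating, so the same set of claims always yields the same owner.
func stableFirstWins(claims []claim) map[string]string {
	sorted := append([]claim(nil), claims...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].service < sorted[j].service
	})
	return firstWins(sorted)
}

func main() {
	a := claim{domain: "foo", service: "foo.default.svc.bar.baz"}
	b := claim{domain: "foo", service: "foo.default.svc.cluster.local"}

	// The owner of "foo" flips with the input order.
	fmt.Println(firstWins([]claim{a, b})["foo"])
	fmt.Println(firstWins([]claim{b, a})["foo"])

	// After sorting, both orders agree.
	fmt.Println(stableFirstWins([]claim{a, b})["foo"] == stableFirstWins([]claim{b, a})["foo"])
}
```

A fixed ordering only makes the outcome repeatable; it does not decide which service ought to own the short name, which is why avoiding the expansion for non-Kubernetes hostnames is the more direct remedy.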

Inspecting Envoy's routes across re-starts can result in differences like:
[Screenshot: Screen Shot 2022-04-28 at 2 24 56 PM]

It looks like this bug was introduced in 1.13.

Version

client version: 1.13.2
control plane version: 1.13
data plane version: 1.13.2 (2 proxies), 1.13.2 (3 proxies)

Additional Information

No response
