Description
Bug Description
I've been building a proof-of-concept multi-cluster mesh with multi-primaries in different networks. In my case, one cluster is AWS EKS and another is DigitalOcean managed Kubernetes. The east-west gateway in the EKS cluster is exposed with an AWS NLB.
After linking the clusters as per the documentation, I discovered that the domain name of the AWS NLB in the EKS cluster is not resolved correctly in the DO cluster. It turned out that the upstream DNS configured on the DO Kubernetes nodes returns REFUSED answers to the ANY queries currently used in the Pilot code:
istio/pilot/pkg/model/network.go
Lines 485 to 486 in e67c34b
// TODO figure out how to query only A + AAAA
res := n.client.Query(new(dns.Msg).SetQuestion(dns.Fqdn(name), dns.TypeANY))
$ dig -t ANY k8s-istiomul-istioeas-4d501f177f-9a9de7682aacbcd6.elb.us-west-2.amazonaws.com
; <<>> DiG 9.16.1-Ubuntu <<>> -t ANY k8s-istiomul-istioeas-4d501f177f-9a9de7682aacbcd6.elb.us-west-2.amazonaws.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 44741
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: c367afe288a65985 (echoed)
;; QUESTION SECTION:
;k8s-istiomul-istioeas-4d501f177f-9a9de7682aacbcd6.elb.us-west-2.amazonaws.com. IN ANY
;; Query time: 7 msec
;; SERVER: 10.245.0.10#53(10.245.0.10)
;; WHEN: Mon May 02 03:20:07 UTC 2022
;; MSG SIZE rcvd: 118
ANY queries are not guaranteed to be implemented consistently across DNS servers. For example, Cloudflare deems them deprecated, and its name servers return NOTIMP to ANY queries.
I'd suggest replacing the ANY query with separate A and AAAA queries, as the comment in the code mentions. Although it is technically possible to craft a multi-type query with the library currently in use, such queries do not seem to be implemented consistently either, so we'd likely have to make two separate queries and merge the results. I have a patch tested in my environment and can follow up with a PR.
Version
$ istioctl version
client version: 1.13.3
control plane version: 1.13.3
data plane version: 1.13.3 (1 proxies)
$ kubectl version --short
Client Version: v1.23.6
Server Version: v1.22.8
Additional Information
No response
Affected product area
- Docs
- Installation
- Networking
- Performance and Scalability
- Extensions and Telemetry
- Security
- Test and Release
- User Experience
- Developer Infrastructure
- Upgrade
- Multi Cluster
- Virtual Machine
- Control Plane Revisions
Is this the right place to submit this?
- This is not a security vulnerability
- This is not a question about how to use Istio