Skip to content

Conversation

eminaktas
Copy link
Contributor

@eminaktas eminaktas commented Jun 5, 2022

Fixes: #20055

Update connectivity tests for clusters running NodeLocal DNSCache with Local Redirect Policy.

Signed-off-by: eminaktas eminaktas34@gmail.com

@eminaktas eminaktas requested review from a team and sayboras June 5, 2022 10:18
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 5, 2022
@eminaktas eminaktas force-pushed the nodelocaldns-test-resolution branch from a9a20a6 to fc0300b Compare June 5, 2022 10:22
@sayboras sayboras added the release-note/misc This PR makes changes that have no direct user impact. label Jun 6, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 6, 2022
Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment as per below, basically, I think it's better to have general connectivity check example. Anything related to a specific deployment could be mentioned in docs.

If we decide to go with changing *.cue files, please run make -C examples/kubernetes/connectivity-check all to make sure generated files are in sync.

https://docs.cilium.io/en/v1.11/gettingstarted/local-redirect-policy/#node-local-dns-cache

Comment on lines 235 to 252
toCIDR: [
169.254.25.10/32
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this CIDR changed depend on diff environment/cluster? having something specific to particular environment might not be a good example.

Also, is this required as you already have matchLabels above for node-local-dns ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about having below rule ? It's not ideal but should be generic enough. As mentioned above, having hard-code CIDR might help in some cases, but not for all cases, and in the long term it might not be easy to maintain.

  - toEntities:
    - world
    toPorts:
    - ports:
      - port: "53"
        protocol: UDP

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reply. In the first place, I wasn't sure that we should use world since the CIDR is an IP on the host. I expected that the cilium could know that IP is a host IP. However, your suggestion will cover all the cases and it will be generic enough to fix the problem. So, users wouldn't be misled anymore. I have tested in our environment. It works perfectly.

NAME                                                     READY   STATUS    RESTARTS   AGE
echo-a-5769c68dd8-hr88p                                  1/1     Running   0          2m47s
echo-b-6dd649f7f6-c4bl2                                  1/1     Running   0          2m47s
echo-b-host-687668f6fd-w6qz7                             1/1     Running   0          2m47s
host-to-b-multi-node-clusterip-557fc49b9-6knb8           1/1     Running   0          2m46s
host-to-b-multi-node-headless-6c74ddd66c-zkmxg           1/1     Running   0          2m46s
pod-to-a-7cccd8457c-dtwcd                                1/1     Running   0          2m47s
pod-to-a-allowed-cnp-bbc844c6f-v6p6p                     1/1     Running   0          2m47s
pod-to-a-denied-cnp-869fd9d7ff-hb2vd                     1/1     Running   0          2m47s
pod-to-b-intra-node-nodeport-b48c5ccc7-ctmsg             1/1     Running   0          2m46s
pod-to-b-multi-node-clusterip-f8647fd67-hpxzt            1/1     Running   0          2m46s
pod-to-b-multi-node-headless-6b49bc6d55-gdzpm            1/1     Running   0          2m46s
pod-to-b-multi-node-nodeport-855b49f8c5-2vv5k            1/1     Running   0          2m46s
pod-to-external-1111-69c7775457-w9xjg                    1/1     Running   0          2m47s
pod-to-external-fqdn-allow-google-cnp-699854bbff-7csz2   1/1     Running   0          2m47s

@eminaktas
Copy link
Contributor Author

One comment as per below, basically, I think it's better to have general connectivity check example. Anything related to a specific deployment could be mentioned in docs.

If we decide to go with changing *.cue files, please run make -C examples/kubernetes/connectivity-check all to make sure generated files are in sync.

https://docs.cilium.io/en/v1.11/gettingstarted/local-redirect-policy/#node-local-dns-cache

I think the same as you. Adding the node-local-dns label is enough when you use the local redirect policy. However, this won't apply when NodeLocal DNSCache is deployed with the hostNetwork=true setting since it will create an interface on the host machine in which 169.254.25.10 is the default IP.

When a Pod does a DNS request, it will send to this 169.254.25.10 IP. Cilium labels this request as world as I mentioned here. I expected to see that Cilium could catch this IP as host or with node-local-dns label. Since the IP leads to a Pod in the same cluster.

Consequently, without this CIDR, the tests will fail for this situation.

@github-actions
Copy link

github-actions bot commented Jul 7, 2022

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jul 7, 2022
@eminaktas
Copy link
Contributor Author

ping

@github-actions github-actions bot removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jul 8, 2022
@aanm aanm requested a review from sayboras July 8, 2022 11:04
Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for late replay, I have 2 comments as per below.

Comment on lines 235 to 252
toCIDR: [
169.254.25.10/32
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about having below rule ? It's not ideal but should be generic enough. As mentioned above, having hard-code CIDR might help in some cases, but not for all cases, and in the long term it might not be easy to maintain.

  - toEntities:
    - world
    toPorts:
    - ports:
      - port: "53"
        protocol: UDP

@eminaktas eminaktas force-pushed the nodelocaldns-test-resolution branch from fc0300b to 35dd1e1 Compare July 13, 2022 13:21
Copy link
Member

@aditighag aditighag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @eminaktas Sorry, I missed your ping on #20055. Can you please create a dedicated PR to work with LRP based node-local-dns deployment where you just need to add the k8s-app: node-local-dns label to the connectivity test policies. Please ping me on that PR, and I'll review it. Thanks!
Also, are you interested in making similar label related changes to the cilium-cli connectivity tests - https://github.com/cilium/cilium-cli/blob/master/connectivity/manifests/client-egress-to-fqdns-one-one-one-one.yaml?

@eminaktas eminaktas force-pushed the nodelocaldns-test-resolution branch from 35dd1e1 to b754172 Compare July 28, 2022 11:36
@eminaktas eminaktas requested a review from a team as a code owner July 28, 2022 11:36
@eminaktas eminaktas changed the title fix: add rules for allowDNS to allow node-local-dns with label and ip fix: add a rule for allowDNS to allow node-local-dns with its label for LRP use case Jul 28, 2022
@eminaktas eminaktas force-pushed the nodelocaldns-test-resolution branch from b754172 to e786111 Compare July 28, 2022 11:41
@eminaktas
Copy link
Contributor Author

Hi @eminaktas Sorry, I missed your ping on #20055. Can you please create a dedicated PR to work with LRP based node-local-dns deployment where you just need to add the k8s-app: node-local-dns label to the connectivity test policies. Please ping me on that PR, and I'll review it. Thanks! Also, are you interested in making similar label related changes to the cilium-cli connectivity tests - https://github.com/cilium/cilium-cli/blob/master/connectivity/manifests/client-egress-to-fqdns-one-one-one-one.yaml?

I just separated the PRs. This #20086 covers the LRP use case and the other PR #20683 covers the host network case. I'll check the cilium-cli today.

Copy link
Member

@aditighag aditighag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good. I've posted a minor comment. Should be ready for merge once you've addressed it.
Can you also update the commit description release note to something like - "Update connectivity tests for clusters running node-local dns caches with Local Redirect Policy."

@eminaktas eminaktas force-pushed the nodelocaldns-test-resolution branch from e786111 to cb0d013 Compare July 28, 2022 14:34
@eminaktas
Copy link
Contributor Author

Thanks! Looks good. I've posted a minor comment. Should be ready for merge once you've addressed it. Can you also update the commit description release note to something like - "Update connectivity tests for clusters running node-local dns caches with Local Redirect Policy."

Thanks! All done!

@eminaktas eminaktas force-pushed the nodelocaldns-test-resolution branch from cb0d013 to c7b956e Compare July 28, 2022 14:39
@aditighag
Copy link
Member

Conformance test failures seem related -

make: Leaving directory '/home/runner/work/cilium/cilium/examples/kubernetes/connectivity-check'
please run 'make -C examples/kubernetes/connectivity-check fmt all' and submit your changes
Error: Process completed with exit code 1.

Can you run make -C examples/kubernetes/connectivity-check fmt all, commit and push those changes?

@eminaktas eminaktas force-pushed the nodelocaldns-test-resolution branch from c7b956e to 392b9c0 Compare July 29, 2022 10:35
@eminaktas
Copy link
Contributor Author

Conformance test failures seem related -

make: Leaving directory '/home/runner/work/cilium/cilium/examples/kubernetes/connectivity-check'
please run 'make -C examples/kubernetes/connectivity-check fmt all' and submit your changes
Error: Process completed with exit code 1.

Can you run make -C examples/kubernetes/connectivity-check fmt all, commit and push those changes?

Thanks for the info. Done

@aanm aanm requested review from sayboras July 29, 2022 13:08
Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for late reply, this didn't show up in my filter unless there is a re-request for review.

Thanks 💯

@sayboras
Copy link
Member

also can you help to correct the commit message to make CI happy ? make -C bpf checkpatch can be used for local verification.

Error: ERROR:CUSTOM: Please avoid long commit subjects (max: 75, found: 84)

Signed-off-by: eminaktas <eminaktas34@gmail.com>
@eminaktas eminaktas force-pushed the nodelocaldns-test-resolution branch from 392b9c0 to 233e3a6 Compare July 29, 2022 17:12
@aditighag
Copy link
Member

Travis is broken on master so the failure is expected. Changes are only in the connectivity tests so marking the PR as ready for merge.

@aditighag aditighag added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. and removed release-note/misc This PR makes changes that have no direct user impact. labels Jul 29, 2022
@nathanjsweet nathanjsweet merged commit 40da4c6 into cilium:master Aug 1, 2022
@eminaktas eminaktas deleted the nodelocaldns-test-resolution branch August 1, 2022 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cilium connectivity test fails when nodelocaldns is running in cluster
4 participants