cilium-cli: add connectivity tests support for policy-default-local-cluster #39786
/test
/ci-ingress
Did we consider how to test this functionality without end-to-end tests, for instance using hive/script? As a general observation, e2e tests tend to fail at a higher rate due to integration problems and they also fail in ways that are not obvious to the general contributor. We already struggle a lot with the existing e2e tests, and I worry that each time we add one more permutation into the tests, we make that problem worse.
From a pure code perspective this looks good for @cilium/github-sec code owners.
If it is any reassurance, the goal is for this new option to eventually become the default behavior, and probably to deprecate or remove the option completely. Potentially, if I can contribute the second missing piece in time for 1.18 (a reporting tool that flags policies that would be changed by this, either in preflight or cilium-cli, TBD), it could already be a default option for 1.19 and be deprecated/removed somewhere in 1.20/1.21 (to be discussed ofc!). There's also ofc some unit testing in the original PR. That being said, if you feel it needs some additional testing through hive/script, I can take a look at doing that too.
I can't say I'm familiar enough with the particular feature or area to give good technical guidance on this. Maybe @cilium/sig-clustermesh reviewers could consider this and provide pointers on whether there's a concrete improvement we could make or whether these changes are good enough.
From my ci-structure point of view, this seems like a very heavy-handed way to test clustermesh policy functionality. This PR is basically touching every manifest in the Cilium CLI. I would agree with Joe here that exploring alternative means is probably the best move. Otherwise, we are adding another dimension to an already complex and error-prone test suite.
I now realize that I provided far too little context here, sorry about that! In the current cilium-cli connectivity tests, we deploy the (IIRC) "other-node" pod to another cluster in a multicluster scenario instead of to another node. This allows running the same tests whether there are multiple clusters or only one. Without this mode turned on, there was no need for the policies to be aware of their cluster mesh environment: unless explicitly specified, a network policy allowed traffic to every cluster. But this new option changes that. Apart from the fact that there is one more option in the test matrix (which is a totally fair concern!), this is mostly not a CI/testing problem, because if a user runs the connectivity test with this option turned on, it should hopefully succeed. So unless we give up on this strategy of deploying some pods to another cluster to run most tests, we unfortunately need to update most netpol manifests related to the CLI connectivity tests :/...
Thanks for the context, that helps. From a ci-structure point of view, the changes make sense and are as minimal as I can think of.
Thanks! A couple of comments inline.
cilium-cli/connectivity/builder/manifests/client-egress-to-echo-no-cluster-policy.yaml
f96c54f to 1d00621
/test
Add a new test for policy-default-local-cluster with a policy that does not specify any cluster to filter on. In a multicluster scenario this means that traffic to the remote cluster is denied, while in a single cluster the traffic should be allowed. Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
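To make the expectation in this commit message concrete, here is a rough sketch of the verdict the new test is meant to observe. It is only a toy model of the behavior described in this thread, not Cilium's policy engine or the cilium-cli test code: the function `peer_allowed` and all of its parameters are hypothetical names introduced for illustration.

```python
from typing import Optional


def peer_allowed(
    policy_cluster: Optional[str],
    source_cluster: str,
    dest_cluster: str,
    default_local_cluster: bool,
) -> bool:
    """Toy model: does a policy peer selector cover the destination cluster?

    policy_cluster is the cluster named explicitly by the policy, or None
    when the policy does not specify any cluster (hypothetical parameter).
    """
    if policy_cluster is not None:
        # An explicit cluster in the policy always wins.
        return dest_cluster == policy_cluster
    if default_local_cluster:
        # Assumed behavior with the option on: no cluster means "local only".
        return dest_cluster == source_cluster
    # Behavior described for the option being off: every cluster is covered.
    return True


# Single cluster: client and echo pod share a cluster, traffic is allowed.
print(peer_allowed(None, "cluster1", "cluster1", default_local_cluster=True))
# Multicluster: the echo pod lives in the remote cluster, traffic is denied.
print(peer_allowed(None, "cluster1", "cluster2", default_local_cluster=True))
```

With the option off, the same policy would cover both clusters, which is why the existing manifests did not need to be cluster-aware before.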
1d00621 to 000a08c
/test
Thanks!
Follow-up to #39338, adding connectivity test support for this new mode.
Also, the draft PR #39731 contains the same CLI commits but with the new option enabled, to get a full CI test run with it turned on by default.