Skip to content

Conversation

pchaigno
Copy link
Member

@pchaigno pchaigno commented Jul 22, 2020

This PR disables the host firewall by default in CI. It will only be enabled for all tests when the label ci/host-firewall is set. Tests that specifically enable the host firewall (see below) will still run regardless of the label.

The two last commits then add two new tests for the host firewall:

  • A test of a fromCIDR+toPorts host policy (based on the existing fromCIDR+toPorts test) from the third node.
  • A NodePort test with an ingress+egress host policy (initially written to catch a potential regression on the path client->node1->backend@node2).

I verified ci/host-firewall works by running the tests with that label in #12672.

@pchaigno pchaigno added area/CI-improvement Topic or proposal to improve the Continuous Integration workflow release-note/ci This PR makes changes to the CI. needs-backport/1.8 area/host-firewall Impacts the host firewall or the host endpoint. labels Jul 22, 2020
@pchaigno pchaigno force-pushed the pr/pchaigno/host-firewall-ci branch 2 times, most recently from 2987c8e to 9bb4503 Compare July 22, 2020 21:17
@coveralls

This comment has been minimized.

@pchaigno pchaigno force-pushed the pr/pchaigno/host-firewall-ci branch 2 times, most recently from 16b2055 to 2572a2a Compare July 24, 2020 13:19
@pchaigno pchaigno marked this pull request as ready for review July 24, 2020 19:03
@pchaigno pchaigno requested a review from a team as a code owner July 24, 2020 19:03
@pchaigno pchaigno requested a review from christarazi July 24, 2020 19:07
@pchaigno pchaigno requested a review from brb July 24, 2020 19:09
Copy link
Member

@aanm aanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From experience, having a label to run a particular set of tests will make the code to regress as soon this PR is merge. Even we are not enabling the host firewall by default, what's wrong on having more coverage for the host firewall?

@pchaigno
Copy link
Member Author

From experience, having a label to run a particular set of tests will make the code to regress as soon this PR is merge. Even we are not enabling the host firewall by default, what's wrong on having more coverage for the host firewall?

My original post was a bit unclear so not sure we're on the same page. The new host firewall tests I added will run regardless of the ci/host-firewall label. I've clarified that ☝️

On the general risk of regressions once we stop testing something, I agree. That is partly why I'm only making this change now that we also have some first host firewall tests.

Then, why don't I want the host firewall enabled by default in all tests? Because (1) with #12345, enabling the host firewall changes the path of packets in some cases (pod1@node1>node1 and pod1@node1->node2) and (2) it's not the default setup most users will run. I intend to use the new label to run additional tests on PRs that may impact the host firewall.

@aanm
Copy link
Member

aanm commented Jul 25, 2020

From experience, having a label to run a particular set of tests will make the code to regress as soon this PR is merge. Even we are not enabling the host firewall by default, what's wrong on having more coverage for the host firewall?

My original post was a bit unclear so not sure we're on the same page. The new host firewall tests I added will run regardless of the ci/host-firewall label. I've clarified that

👍

On the general risk of regressions once we stop testing something, I agree. That is partly why I'm only making this change now that we also have some first host firewall tests.

Then, why don't I want the host firewall enabled by default in all tests? Because (1) with #12345, enabling the host firewall changes the path of packets in some cases (pod1@node1>node1 and pod1@node1->node2) and (2) it's not the default setup most users will run. I intend to use the new label to run additional tests on PRs that may impact the host firewall.

Why can't we deploy Cilium to run with host firewall and run these tests (besides the setups that we already have)?

@pchaigno
Copy link
Member Author

test-me-please

@pchaigno
Copy link
Member Author

I pushed an update but this is currently blocked by #12834 so switching to draft.

@pchaigno pchaigno marked this pull request as draft August 10, 2020 19:24
@pchaigno pchaigno force-pushed the pr/pchaigno/host-firewall-ci branch 3 times, most recently from 6d785d9 to 722aeaf Compare August 14, 2020 06:58
@pchaigno
Copy link
Member Author

pchaigno commented Aug 14, 2020

@pchaigno pchaigno marked this pull request as ready for review August 14, 2020 12:51
@pchaigno pchaigno requested a review from aanm August 14, 2020 12:52
Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM! Just two nits.

The host firewall is only enabled in CI if label ci/host-firewall is
set. The goal is to have default CI options closer to common user
environments and host firewall is not enabled by default in those.

Signed-off-by: Paul Chaignon <paul@cilium.io>
This commit extends the existing fromCIDR+toPorts policy test to test
the same kind of policy for the host firewall. To that end, it:
1. Enables the host firewall. The issue in comment is not relevant
   anymore since masquerading is disabled.
2. Introduce a helper to get the ID of the host endpoint. This helper
   will likely be needed for other host firewall tests as well.
3. Load a new DaemonSet to instanciate a host-networking pod on each k8s
   node. This pod serves as the target for host firewall connectivity
   tests.
4. Extend the existing test cases with CCNP tests.

Signed-off-by: Paul Chaignon <paul@cilium.io>
This commit adds new tests, identical to NodePort tests under vxlan
tunneling and direct routing, but with an ingress+egress host policy
applied. The host policy only allow communications between nodes and to
specific endpoints for readiness probes.

Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno pchaigno force-pushed the pr/pchaigno/host-firewall-ci branch from 722aeaf to 3c86d17 Compare August 25, 2020 15:23
@pchaigno pchaigno requested a review from brb August 25, 2020 15:24
@pchaigno
Copy link
Member Author

test-me-please

@christarazi christarazi merged commit eecd5b9 into master Aug 25, 2020
@christarazi christarazi deleted the pr/pchaigno/host-firewall-ci branch August 25, 2020 22:40
pchaigno added a commit that referenced this pull request Feb 1, 2021
We currently have a couple of host firewall tests, but they don't cover
all possible packet paths. [1] added two tests for the path node <-> world
(one with fromCIDR+toPorts and one combined with NodePort handling).
Other firewall tests [2] are only validating correct loading without
enforcing policies.

This commit fills this gap by adding VXLAN and direct routing tests for
the host firewall, with L3+L4 policies enforced on the paths node <->
local pod, node <-> remote pod, and node <-> remote node.

The test design draws inspiration from early host firewall bugs and
regressions:
- Test ingress and egress at the same time with restrictions on allowed
  ports. This is meant to ensure we detect a regression where only one
  direction bypasses policy enforcement. If such a case arises, we will
  fail because the source port won't be allowed and the connection
  will be dropped.
- Allow connections to/from world and pods not used in tests. This is
  meant to reduce the risk of bricking the nodes. Node to node
  communications are still strongly restricted, but the ports defined
  there have been stable for a while.
- Test connections to local and remote pods separately. They follow very
  different paths through our datapath.

1 - #12621
2 - #14255
Signed-off-by: Paul Chaignon <paul@cilium.io>
pchaigno added a commit that referenced this pull request Feb 4, 2021
We currently have a couple of host firewall tests, but they don't cover
all possible packet paths. [1] added two tests for the path node <-> world
(one with fromCIDR+toPorts and one combined with NodePort handling).
Other firewall tests [2] are only validating correct loading without
enforcing policies.

This commit fills this gap by adding VXLAN and direct routing tests for
the host firewall, with L3+L4 policies enforced on the paths node <->
local pod, node <-> remote pod, and node <-> remote node.

The test design draws inspiration from early host firewall bugs and
regressions:
- Test ingress and egress at the same time with restrictions on allowed
  ports. This is meant to ensure we detect a regression where only one
  direction bypasses policy enforcement. If such a case arises, we will
  fail because the source port won't be allowed and the connection
  will be dropped.
- Allow connections to/from world and pods not used in tests. This is
  meant to reduce the risk of bricking the nodes. Node to node
  communications are still strongly restricted, but the ports defined
  there have been stable for a while.
- Test connections to local and remote pods separately. They follow very
  different paths through our datapath.

1 - #12621
2 - #14255
Signed-off-by: Paul Chaignon <paul@cilium.io>
pchaigno added a commit that referenced this pull request Feb 8, 2021
We currently have a couple of host firewall tests, but they don't cover
all possible packet paths. [1] added two tests for the path node <-> world
(one with fromCIDR+toPorts and one combined with NodePort handling).
Other firewall tests [2] are only validating correct loading without
enforcing policies.

This commit fills this gap by adding VXLAN and direct routing tests for
the host firewall, with L3+L4 policies enforced on the paths node <->
local pod, node <-> remote pod, and node <-> remote node.

The test design draws inspiration from early host firewall bugs and
regressions:
- Test ingress and egress at the same time with restrictions on allowed
  ports. This is meant to ensure we detect a regression where only one
  direction bypasses policy enforcement. If such a case arises, we will
  fail because the source port won't be allowed and the connection
  will be dropped.
- Allow connections to/from world and pods not used in tests. This is
  meant to reduce the risk of bricking the nodes. Node to node
  communications are still strongly restricted, but the ports defined
  there have been stable for a while.
- Test connections to local and remote pods separately. They follow very
  different paths through our datapath.

1 - #12621
2 - #14255
Signed-off-by: Paul Chaignon <paul@cilium.io>
borkmann pushed a commit that referenced this pull request Feb 9, 2021
We currently have a couple of host firewall tests, but they don't cover
all possible packet paths. [1] added two tests for the path node <-> world
(one with fromCIDR+toPorts and one combined with NodePort handling).
Other firewall tests [2] are only validating correct loading without
enforcing policies.

This commit fills this gap by adding VXLAN and direct routing tests for
the host firewall, with L3+L4 policies enforced on the paths node <->
local pod, node <-> remote pod, and node <-> remote node.

The test design draws inspiration from early host firewall bugs and
regressions:
- Test ingress and egress at the same time with restrictions on allowed
  ports. This is meant to ensure we detect a regression where only one
  direction bypasses policy enforcement. If such a case arises, we will
  fail because the source port won't be allowed and the connection
  will be dropped.
- Allow connections to/from world and pods not used in tests. This is
  meant to reduce the risk of bricking the nodes. Node to node
  communications are still strongly restricted, but the ports defined
  there have been stable for a while.
- Test connections to local and remote pods separately. They follow very
  different paths through our datapath.

1 - #12621
2 - #14255
Signed-off-by: Paul Chaignon <paul@cilium.io>
lyveng pushed a commit to lyveng/cilium that referenced this pull request Mar 4, 2021
We currently have a couple of host firewall tests, but they don't cover
all possible packet paths. [1] added two tests for the path node <-> world
(one with fromCIDR+toPorts and one combined with NodePort handling).
Other firewall tests [2] are only validating correct loading without
enforcing policies.

This commit fills this gap by adding VXLAN and direct routing tests for
the host firewall, with L3+L4 policies enforced on the paths node <->
local pod, node <-> remote pod, and node <-> remote node.

The test design draws inspiration from early host firewall bugs and
regressions:
- Test ingress and egress at the same time with restrictions on allowed
  ports. This is meant to ensure we detect a regression where only one
  direction bypasses policy enforcement. If such a case arises, we will
  fail because the source port won't be allowed and the connection
  will be dropped.
- Allow connections to/from world and pods not used in tests. This is
  meant to reduce the risk of bricking the nodes. Node to node
  communications are still strongly restricted, but the ports defined
  there have been stable for a while.
- Test connections to local and remote pods separately. They follow very
  different paths through our datapath.

1 - cilium#12621
2 - cilium#14255
Signed-off-by: Paul Chaignon <paul@cilium.io>
vadorovsky pushed a commit to vadorovsky/cilium that referenced this pull request May 19, 2021
[ upstream commit 6f59f4f ]

We currently have a couple of host firewall tests, but they don't cover
all possible packet paths. [1] added two tests for the path node <-> world
(one with fromCIDR+toPorts and one combined with NodePort handling).
Other firewall tests [2] are only validating correct loading without
enforcing policies.

This commit fills this gap by adding VXLAN and direct routing tests for
the host firewall, with L3+L4 policies enforced on the paths node <->
local pod, node <-> remote pod, and node <-> remote node.

The test design draws inspiration from early host firewall bugs and
regressions:
- Test ingress and egress at the same time with restrictions on allowed
  ports. This is meant to ensure we detect a regression where only one
  direction bypasses policy enforcement. If such a case arises, we will
  fail because the source port won't be allowed and the connection
  will be dropped.
- Allow connections to/from world and pods not used in tests. This is
  meant to reduce the risk of bricking the nodes. Node to node
  communications are still strongly restricted, but the ports defined
  there have been stable for a while.
- Test connections to local and remote pods separately. They follow very
  different paths through our datapath.

1 - cilium#12621
2 - cilium#14255
Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
twpayne pushed a commit that referenced this pull request May 21, 2021
[ upstream commit 6f59f4f ]

We currently have a couple of host firewall tests, but they don't cover
all possible packet paths. [1] added two tests for the path node <-> world
(one with fromCIDR+toPorts and one combined with NodePort handling).
Other firewall tests [2] are only validating correct loading without
enforcing policies.

This commit fills this gap by adding VXLAN and direct routing tests for
the host firewall, with L3+L4 policies enforced on the paths node <->
local pod, node <-> remote pod, and node <-> remote node.

The test design draws inspiration from early host firewall bugs and
regressions:
- Test ingress and egress at the same time with restrictions on allowed
  ports. This is meant to ensure we detect a regression where only one
  direction bypasses policy enforcement. If such a case arises, we will
  fail because the source port won't be allowed and the connection
  will be dropped.
- Allow connections to/from world and pods not used in tests. This is
  meant to reduce the risk of bricking the nodes. Node to node
  communications are still strongly restricted, but the ports defined
  there have been stable for a while.
- Test connections to local and remote pods separately. They follow very
  different paths through our datapath.

1 - #12621
2 - #14255
Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/CI-improvement Topic or proposal to improve the Continuous Integration workflow area/host-firewall Impacts the host firewall or the host endpoint. release-note/ci This PR makes changes to the CI.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants