Skip to content

Conversation

ruicao93
Copy link
Contributor

@ruicao93 ruicao93 commented Sep 9, 2024

For the scenario where there are multiple interfaces on a Node and delegated IPAM is used to determine the IP and the uplink interface of a Pod. The "interfaces" field can be used by IPAM plugin to pass uplink interface info to Cilium. And then Cilium can setup host routes for the Pod according to the uplink interface info.

The "interfaces" field is allowed to specify the related host interface according to the CNI spec and is already widely used by existing plugins, such as bridge and ptp plugin. It's actually not used by any IPAM plugin for now, but I think it's reasonable and conforms to the original definition of the field.

Motivations/Benefits of the PR for the native routing mode:

  • More cloud(public or private) envs with multiple uplink can be easily ingregrated with Cilium without involving any cloud specific logic. A common way is provided for the case: "uplink interface" support.
  • Cilium keeps full control of the Pod traffic. Cilium choose the implement of the "routing" for egress traffic, through Linux routes or eBPF forwarding rules or any other methods.

- interfaces: An array of all interfaces created by the attachment, including any host-level interfaces:

bridge.go

	// Assume L2 interface only
	result := &current.Result{
		CNIVersion: current.ImplementedSpecVersion,
		Interfaces: []*current.Interface{
			brInterface,
			hostInterface,
			containerInterface,
		},
	}

bridge plugin result:

{
    "ips": [
        {
          "address": "10.1.0.5/16",
          "gateway": "10.1.0.1",
          "interface": 2
        }
    ],
    "routes": [
      {
        "dst": "0.0.0.0/0"
      }
    ],
    "interfaces": [
        {
            "name": "cni0",
            "mac": "00:11:22:33:44:55"
        },
        {
            "name": "veth3243",
            "mac": "55:44:33:22:11:11"
        },
        {
            "name": "eth0",
            "mac": "99:88:77:66:55:44",
            "sandbox": "/var/run/netns/blue"
        }
    ],
    "dns": {
      "nameservers": [ "10.1.0.1" ]
    }
}

Fixes: #34604

@ruicao93 ruicao93 requested a review from a team as a code owner September 9, 2024 13:50
@ruicao93 ruicao93 requested a review from joamaki September 9, 2024 13:50
@maintainer-s-little-helper
Copy link

Commit 2916ed4 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Sep 9, 2024
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Sep 9, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Sep 9, 2024
@joamaki
Copy link
Contributor

joamaki commented Sep 10, 2024

As the title still has Draft in it, I'll wait before reviewing. Edit the title and ping me when you think it's ready.

@ruicao93 ruicao93 changed the title [Draft]Allow delegated IPAM to specify uplink interface Allow delegated IPAM to specify uplink interface Sep 10, 2024
@ruicao93
Copy link
Contributor Author

ruicao93 commented Sep 10, 2024

As the title still has Draft in it, I'll wait before reviewing. Edit the title and ping me when you think it's ready.

Thanks @joamaki, the commit is ready for review and more discussions could be found in #34604

@ruicao93
Copy link
Contributor Author

ruicao93 commented Sep 11, 2024

Hi @joestringer @gandro , could you please help review the PR as we discussed.
cc @cilium/sig-ipam

@joestringer joestringer added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Sep 11, 2024
@joestringer
Copy link
Member

I'm not so familiar with delegated IPAM, but one crucial part of this will be to test that this doesn't break existing delegated IPAM setups. I see that issue #34788 is filed to create automated testing for that.

@dylandreimerink
Copy link
Member

/test

@joestringer
Copy link
Member

I see that there is now #34839 out for review, so it should be possible to combine that test with this PR and test it out on your own repository?

@joestringer
Copy link
Member

@dylandreimerink I don't think it's meaningful to run /test on this PR, as the delegated IPAM path isn't enabled for any of the existing tests merged into the tree.

@ruicao93
Copy link
Contributor Author

I see that there is now #34839 out for review, so it should be possible to combine that test with this PR and test it out on your own repository?

Sure, l will run the test provided in #34839 . The Cilium e2e and k8s conformance tests have been verified on Volcengine cloud combined with Volcengine delegated IPAM:

  • CIlium e2e tests:

With one minor error, but it doesn't matters.
image

  • Kubernetes conformance network tests
    Plugin: e2e
    Status: passed
    Total: 6965
    Passed: 42
    Failed: 0
    Skipped: 6923

Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have a few questions on this approach that are still unclear to me

@ruicao93
Copy link
Contributor Author

ruicao93 commented Sep 18, 2024

hi @tamilmani1989. We're discussing whether it is better to install node routes for Pod in Cilium or in delegated IPAM plugin.

I have a few of questions about the Delegated Azure IPAM mode. Could you please help me with them?

  • Does the azure ipam will create multiple network interfaces for Pods like how aws ENI mode works?
  • If there're multile ENIs, is the azure IPAM compoment responsible for install egress routes for Pods to redirect traffic to right network interface?

Do you think it would be helpful if delegated IPAM plugin only handle IP/interface management and Cilium controls all Pod routes(implemented by linux routes or bpf forwarding rules)?

@wedaly
Copy link

wedaly commented Sep 20, 2024

hi @tamilmani1989. We're discussing whether it is better to install node routes for Pod in Cilium or in delegated IPAM plugin.

I have a few of questions about the Delegated Azure IPAM mode. Could you please help me with them?

  • Does the azure ipam will create multiple network interfaces for Pods like how aws ENI mode works?
  • If there're multile ENIs, is the azure IPAM compoment responsible for install egress routes for Pods to redirect traffic to right network interface?

Do you think it would be helpful if delegated IPAM plugin only handle IP/interface management and Cilium controls all Pod routes(implemented by linux routes or bpf forwarding rules)?

azure-ipam currently handles only IP address management, not installing routes or multiple network interfaces.

I mentioned this in the parent issue, but my understanding is that a conforming delegated IPAM plugin is not allowed to return interfaces. (The bridge and ptp plugins are CNI plugins, but are not delegated IPAM plugins.) We want azure-ipam to conform to the CNI spec, so I think even if we had this use case we'd look for a different solution or work with the CNI maintainers to extend the spec so the semantics of interfaces from a delegated IPAM plugin are well-defined.

@ruicao93
Copy link
Contributor Author

azure-ipam currently handles only IP address management, not installing routes or multiple network interfaces.

I mentioned this in the parent issue, but my understanding is that a conforming delegated IPAM plugin is not allowed to return interfaces. (The bridge and ptp plugins are CNI plugins, but are not delegated IPAM plugins.) We want azure-ipam to conform to the CNI spec, so I think even if we had this use case we'd look for a different solution or work with the CNI maintainers to extend the spec so the semantics of interfaces from a delegated IPAM plugin are well-defined.

Thanks @wedaly for your reply. Dose the single uplink(eth0) is enough for AKS? The IPs and bandwidth of single ENI may be limited by the
control plane(quota) or dataplane(performance).

@squeed
Copy link
Contributor

squeed commented Oct 15, 2024

Adding @dylandreimerink, who added the "uplink" or "parent" mechanism for ENI.

@squeed
Copy link
Contributor

squeed commented Oct 15, 2024

CNI spec author here :-). There's nothing that says the IPAM result is not allowed to contain interfaces; it's not done out of convention. However, if we would like to change the interpretation of the types, I would be amenable to revisiting that in the spec.

The best part is that it wouldn't require any code changes, so we could "ship" it entirely in text.

@ruicao93
Copy link
Contributor Author

ruicao93 commented Oct 15, 2024

CNI spec author here :-). There's nothing that says the IPAM result is not allowed to contain interfaces; it's not done out of convention. However, if we would like to change the interpretation of the types, I would be amenable to revisiting that in the spec.

The best part is that it wouldn't require any code changes, so we could "ship" it entirely in text.

Thanks @squeed for the clarification, it's extremely helpful for the current discussion. We also created another PR to update the deletation IPAM result part, could you help review containernetworking/cni#1121

So it's reasonable to handle the returned uplink hints. Install the routes according to the uplink info could help Cilium have full control of the Pod traffic in delegate IPAM mode. Based on the current code modifications, we might also consider adding an option for users to decide whether the routing is controlled by Cilium or controlled by the user.

@tamilmani1989
Copy link
Contributor

Late to the party. We have a similar requirement where pod egress traffic to leave via secondary interface instead of primary interface. i noticed eni ipam mode already does this. https://docs.cilium.io/en/latest/network/concepts/routing/#egress. @ruicao93 Does this PR follow the same pattern/code?

@ruicao93
Copy link
Contributor Author

Late to the party. We have a similar requirement where pod egress traffic to leave via secondary interface instead of primary interface. i noticed eni ipam mode already does this. https://docs.cilium.io/en/latest/network/concepts/routing/#egress. @ruicao93 Does this PR follow the same pattern/code?

Yes @tamilmani1989 , we reused the logic(codes) with ENI IPAM mode. For the long term, ENI IPAM mode could consider to be migrated to Delegation IPAM mode and can be developed and evolved independently of Cilium.

@ruicao93
Copy link
Contributor Author

ruicao93 commented Nov 28, 2024

Yes, I can gladly do that. But before, could also squash your two commits into one commit? Otherwise the CI results will be invalidated once the commits are changed again.

@gandro Okay, done. I thought just need to squash on merge.

@gandro
Copy link
Member

gandro commented Nov 28, 2024

/test

@ruicao93
Copy link
Contributor Author

/test

@tklauser
Copy link
Member

/test

@ruicao93 ruicao93 force-pushed the ipam-uplink branch 2 times, most recently from a0d6e75 to 2fb3e11 Compare December 2, 2024 06:22
@ruicao93
Copy link
Contributor Author

ruicao93 commented Dec 2, 2024

Rebsed the commit. @gandro @tklauser , could you help run the ci tests?

@gandro
Copy link
Member

gandro commented Dec 2, 2024

/test

Support uplink interface info from IPAM result and install related
ingress/egress routes.

Signed-off-by: Rui Cao <caorui.io@bytedance.com>
@gandro
Copy link
Member

gandro commented Dec 2, 2024

@ruicao93 There is no need to rebase if CI fails. We can restart individual tests separately (assuming those are flakes). In fact. You should only rebase if a) there are conflicts or b) there are persistent CI failures because the tests have been updated on main.

@gandro
Copy link
Member

gandro commented Dec 2, 2024

/test

@ruicao93
Copy link
Contributor Author

ruicao93 commented Dec 2, 2024

@ruicao93 There is no need to rebase if CI fails. We can restart individual tests separately (assuming those are flakes). In fact. You should only rebase if a) there are conflicts or b) there are persistent CI failures because the tests have been updated on main.

@gandro Thanks~ Gotcha~

@ruicao93
Copy link
Contributor Author

ruicao93 commented Dec 3, 2024

@ruicao93 There is no need to rebase if CI fails. We can restart individual tests separately (assuming those are flakes). In fact. You should only rebase if a) there are conflicts or b) there are persistent CI failures because the tests have been updated on main.

@gandro Thanks~ Gotcha~

@gandro Finally... all CI cases has passed. Dose the PR could be merged?

@gandro
Copy link
Member

gandro commented Dec 3, 2024

There is still a review needed from the @cilium/api CODEOWNER before we can merge. The designated reviewer is @learnitall

@gandro gandro removed the request for review from joamaki December 3, 2024 09:35
Copy link
Contributor

@learnitall learnitall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API changes LGTM! Thank you for the ping, I apologize for the delay.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Dec 3, 2024
@joestringer
Copy link
Member

@squeed @gandro I take it from the above discussion that with the spec loosening in containernetworking/cni#1137 we are happy with this solution? Just want to check given that this is now approved by all codeowners and the MLH bot has marked the PR as ready-to-merge.

@gandro
Copy link
Member

gandro commented Dec 4, 2024

@squeed @gandro I take it from the above discussion that with the spec loosening in containernetworking/cni#1137 we are happy with this solution? Just want to check given that this is now approved by all codeowners and the MLH bot has marked the PR as ready-to-merge.

Yes, from my side I'm happy with the proposed solution. The functionality in this PR is opt-in, so I the risk of Cilium accidentally misinterpreting a delegated plugin's result is very low.

@ldelossa ldelossa added this pull request to the merge queue Dec 4, 2024
Merged via the queue into cilium:main with commit c5308b5 Dec 4, 2024
64 checks passed
@rjdenney
Copy link

rjdenney commented Dec 6, 2024

@ruicao93 Do you have a timeline for the additional V6 changes that would be needed, or could you tell us what would need to be added so that we can try working on it from our side?

@ruicao93
Copy link
Contributor Author

ruicao93 commented Dec 9, 2024

@ruicao93 Do you have a timeline for the additional V6 changes that would be needed, or could you tell us what would need to be added so that we can try working on it from our side?

@rjdenney Let me create a PR for v6 this week. We need to modify it carefully to avoid disrupting the ENI feature. A draft PR will be available soon for early discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CFP: add Volcengine cloud ENI IPAM support