-
Notifications
You must be signed in to change notification settings - Fork 3.4k
eni: Assign primary IP to support multiple VPC CIDRs #15453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0c3bd6e
to
e362a33
Compare
pkg/ipam/crd_eni.go
Outdated
} | ||
|
||
newENIs, _ := diffResources(oldNode, newNode) | ||
go setupENI(newENIs, expectedENIByMac, eniByName, firstInterfaceIndex, mtuConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no synchronization around this goroutine. In theory this might cause problems if updateENIRulesAndRoutes
is called multiple times in quick succession, which could cause multiple setupENI
goroutines to run concurrently, possibly stepping on each other's toes. In practice is should be OK though. 🤞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great point. I have reworks the PR to only consider newly added interfaces. This should ensure that all possibly concurrent go routine only try to configure a distinct subset of newly added ENIs.
In addition to this change, I also increased the wait time and introduced exponential backoff.
e362a33
to
ef2254f
Compare
ef2254f
to
74a68ac
Compare
In CRD-backed IPAM modes such as ENI mode, some IPs were recently removed from the IPAM pool [1]. In user environments where Cilium was running a version prior to [1], it is possible for endpoints to be assigned "unavailable" IPs. Upon upgrade (to a version which includes [1]), endpoint restoration will fail with [2]. In order to workaround this failure and not disrupt the upgrade, this change introduces a hidden flag (`--bypass-ip-availability-upon-restore`) which will inform Cilium to continue on if the restored endpoint's IP is not available for reallocation, bypassing the specific error "IP is not available". Other errors will not be bypassed, in order to reduce the scope of this stop-gap solution. With the flag set, restored endpoints which had "unavailable" IPs will keep them. Any new endpoints / pods will be assigned fresh, valid IPs from the pool. This flag is only meant to be enabled with CRD-backed IPAM modes such as ENI mode. The reason is because of the change described in [1], where the primary ENI IP was removed from the IPAM pool. In any other mode that this flag is enabled in, the user is warned that the flag is not intended for other modes and will have no effect. This patch is intended to be reverted in the future, as this stop-gap solution will no longer be required, as users of Cilium don't upgrade from versions prior to [1]. I propose that we revert this in the following release that this patch makes it in (N+1). How was this tested? 1) Deployed a Cilium version that doesn't include [1] on EKS cluster 2) Created a Deployment object which I scaled to max out the ENI IPs, such that at least one pod is assigned an "unavailable" IP 3) Upgraded Cilium to a version which does include [1] and observe [2] failures 4) Reset cluster back to state from (2) 5) Upgrade Cilium to the version that contains this commit 6) Observe log msgs from this commit and endpoint restoration succeeding 7) Scale Deployment to 0 and back up, to restart all pods 8) Observe that they all get fresh IPs and none of the "unavailable" IPs [1]: cilium#15453 [2]: ``` { "time":"2021-09-20T16:57:00.400086481Z", "level":"WARN", "origin":"cilium.io/agent", "message":"Unable to restore endpoint, ignoring", "params":{ "endpointID":"992", "error":"Failed to re-allocate IP of endpoint: unable to reallocate 10.0.133.193 IPv4 address: IP 10.0.133.193 is not available", "k8sPodName":"default/pod-1", "subsys":"daemon" } } ``` Signed-off-by: Chris Tarazi <chris@isovalent.com>
In CRD-backed IPAM modes such as ENI mode, some IPs were recently removed from the IPAM pool [1]. In user environments where Cilium was running a version prior to [1], it is possible for endpoints to be assigned "unavailable" IPs. Upon upgrade (to a version which includes [1]), endpoint restoration will fail with [2]. In order to workaround this failure and not disrupt the upgrade, this change introduces a hidden flag (`--bypass-ip-availability-upon-restore`) which will inform Cilium to continue on if the restored endpoint's IP is not available for reallocation, bypassing the specific error "IP is not available". Other errors will not be bypassed, in order to reduce the scope of this stop-gap solution. With the flag set, restored endpoints which had "unavailable" IPs will keep them. Any new endpoints / pods will be assigned fresh, valid IPs from the pool. This flag is only meant to be enabled with CRD-backed IPAM modes such as ENI mode. The reason is because of the change described in [1], where the primary ENI IP was removed from the IPAM pool. In any other mode that this flag is enabled in, the user is warned that the flag is not intended for other modes and will have no effect. This patch is intended to be reverted in the future, as this stop-gap solution will no longer be required, as users of Cilium don't upgrade from versions prior to [1]. I propose that we revert this in the following release that this patch makes it in (N+1). How was this tested? 1) Deployed a Cilium version that doesn't include [1] on EKS cluster 2) Created a Deployment object which I scaled to max out the ENI IPs, such that at least one pod is assigned an "unavailable" IP 3) Upgraded Cilium to a version which does include [1] and observe [2] failures 4) Reset cluster back to state from (2) 5) Upgrade Cilium to the version that contains this commit 6) Observe log msgs from this commit and endpoint restoration succeeding 7) Scale Deployment to 0 and back up, to restart all pods 8) Observe that they all get fresh IPs and none of the "unavailable" IPs [1]: #15453 [2]: ``` { "time":"2021-09-20T16:57:00.400086481Z", "level":"WARN", "origin":"cilium.io/agent", "message":"Unable to restore endpoint, ignoring", "params":{ "endpointID":"992", "error":"Failed to re-allocate IP of endpoint: unable to reallocate 10.0.133.193 IPv4 address: IP 10.0.133.193 is not available", "k8sPodName":"default/pod-1", "subsys":"daemon" } } ``` Signed-off-by: Chris Tarazi <chris@isovalent.com>
[ upstream commit baa9e9a ] In CRD-backed IPAM modes such as ENI mode, some IPs were recently removed from the IPAM pool [1]. In user environments where Cilium was running a version prior to [1], it is possible for endpoints to be assigned "unavailable" IPs. Upon upgrade (to a version which includes [1]), endpoint restoration will fail with [2]. In order to workaround this failure and not disrupt the upgrade, this change introduces a hidden flag (`--bypass-ip-availability-upon-restore`) which will inform Cilium to continue on if the restored endpoint's IP is not available for reallocation, bypassing the specific error "IP is not available". Other errors will not be bypassed, in order to reduce the scope of this stop-gap solution. With the flag set, restored endpoints which had "unavailable" IPs will keep them. Any new endpoints / pods will be assigned fresh, valid IPs from the pool. This flag is only meant to be enabled with CRD-backed IPAM modes such as ENI mode. The reason is because of the change described in [1], where the primary ENI IP was removed from the IPAM pool. In any other mode that this flag is enabled in, the user is warned that the flag is not intended for other modes and will have no effect. This patch is intended to be reverted in the future, as this stop-gap solution will no longer be required, as users of Cilium don't upgrade from versions prior to [1]. I propose that we revert this in the following release that this patch makes it in (N+1). How was this tested? 1) Deployed a Cilium version that doesn't include [1] on EKS cluster 2) Created a Deployment object which I scaled to max out the ENI IPs, such that at least one pod is assigned an "unavailable" IP 3) Upgraded Cilium to a version which does include [1] and observe [2] failures 4) Reset cluster back to state from (2) 5) Upgrade Cilium to the version that contains this commit 6) Observe log msgs from this commit and endpoint restoration succeeding 7) Scale Deployment to 0 and back up, to restart all pods 8) Observe that they all get fresh IPs and none of the "unavailable" IPs [1]: cilium#15453 [2]: ``` { "time":"2021-09-20T16:57:00.400086481Z", "level":"WARN", "origin":"cilium.io/agent", "message":"Unable to restore endpoint, ignoring", "params":{ "endpointID":"992", "error":"Failed to re-allocate IP of endpoint: unable to reallocate 10.0.133.193 IPv4 address: IP 10.0.133.193 is not available", "k8sPodName":"default/pod-1", "subsys":"daemon" } } ``` Signed-off-by: Chris Tarazi <chris@isovalent.com>
[ upstream commit baa9e9a ] In CRD-backed IPAM modes such as ENI mode, some IPs were recently removed from the IPAM pool [1]. In user environments where Cilium was running a version prior to [1], it is possible for endpoints to be assigned "unavailable" IPs. Upon upgrade (to a version which includes [1]), endpoint restoration will fail with [2]. In order to workaround this failure and not disrupt the upgrade, this change introduces a hidden flag (`--bypass-ip-availability-upon-restore`) which will inform Cilium to continue on if the restored endpoint's IP is not available for reallocation, bypassing the specific error "IP is not available". Other errors will not be bypassed, in order to reduce the scope of this stop-gap solution. With the flag set, restored endpoints which had "unavailable" IPs will keep them. Any new endpoints / pods will be assigned fresh, valid IPs from the pool. This flag is only meant to be enabled with CRD-backed IPAM modes such as ENI mode. The reason is because of the change described in [1], where the primary ENI IP was removed from the IPAM pool. In any other mode that this flag is enabled in, the user is warned that the flag is not intended for other modes and will have no effect. This patch is intended to be reverted in the future, as this stop-gap solution will no longer be required, as users of Cilium don't upgrade from versions prior to [1]. I propose that we revert this in the following release that this patch makes it in (N+1). How was this tested? 1) Deployed a Cilium version that doesn't include [1] on EKS cluster 2) Created a Deployment object which I scaled to max out the ENI IPs, such that at least one pod is assigned an "unavailable" IP 3) Upgraded Cilium to a version which does include [1] and observe [2] failures 4) Reset cluster back to state from (2) 5) Upgrade Cilium to the version that contains this commit 6) Observe log msgs from this commit and endpoint restoration succeeding 7) Scale Deployment to 0 and back up, to restart all pods 8) Observe that they all get fresh IPs and none of the "unavailable" IPs [1]: #15453 [2]: ``` { "time":"2021-09-20T16:57:00.400086481Z", "level":"WARN", "origin":"cilium.io/agent", "message":"Unable to restore endpoint, ignoring", "params":{ "endpointID":"992", "error":"Failed to re-allocate IP of endpoint: unable to reallocate 10.0.133.193 IPv4 address: IP 10.0.133.193 is not available", "k8sPodName":"default/pod-1", "subsys":"daemon" } } ``` Signed-off-by: Chris Tarazi <chris@isovalent.com>
[ upstream commit baa9e9a ] In CRD-backed IPAM modes such as ENI mode, some IPs were recently removed from the IPAM pool [1]. In user environments where Cilium was running a version prior to [1], it is possible for endpoints to be assigned "unavailable" IPs. Upon upgrade (to a version which includes [1]), endpoint restoration will fail with [2]. In order to workaround this failure and not disrupt the upgrade, this change introduces a hidden flag (`--bypass-ip-availability-upon-restore`) which will inform Cilium to continue on if the restored endpoint's IP is not available for reallocation, bypassing the specific error "IP is not available". Other errors will not be bypassed, in order to reduce the scope of this stop-gap solution. With the flag set, restored endpoints which had "unavailable" IPs will keep them. Any new endpoints / pods will be assigned fresh, valid IPs from the pool. This flag is only meant to be enabled with CRD-backed IPAM modes such as ENI mode. The reason is because of the change described in [1], where the primary ENI IP was removed from the IPAM pool. In any other mode that this flag is enabled in, the user is warned that the flag is not intended for other modes and will have no effect. This patch is intended to be reverted in the future, as this stop-gap solution will no longer be required, as users of Cilium don't upgrade from versions prior to [1]. I propose that we revert this in the following release that this patch makes it in (N+1). How was this tested? 1) Deployed a Cilium version that doesn't include [1] on EKS cluster 2) Created a Deployment object which I scaled to max out the ENI IPs, such that at least one pod is assigned an "unavailable" IP 3) Upgraded Cilium to a version which does include [1] and observe [2] failures 4) Reset cluster back to state from (2) 5) Upgrade Cilium to the version that contains this commit 6) Observe log msgs from this commit and endpoint restoration succeeding 7) Scale Deployment to 0 and back up, to restart all pods 8) Observe that they all get fresh IPs and none of the "unavailable" IPs [1]: #15453 [2]: ``` { "time":"2021-09-20T16:57:00.400086481Z", "level":"WARN", "origin":"cilium.io/agent", "message":"Unable to restore endpoint, ignoring", "params":{ "endpointID":"992", "error":"Failed to re-allocate IP of endpoint: unable to reallocate 10.0.133.193 IPv4 address: IP 10.0.133.193 is not available", "k8sPodName":"default/pod-1", "subsys":"daemon" } } ``` Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit baa9e9a ] In CRD-backed IPAM modes such as ENI mode, some IPs were recently removed from the IPAM pool [1]. In user environments where Cilium was running a version prior to [1], it is possible for endpoints to be assigned "unavailable" IPs. Upon upgrade (to a version which includes [1]), endpoint restoration will fail with [2]. In order to workaround this failure and not disrupt the upgrade, this change introduces a hidden flag (`--bypass-ip-availability-upon-restore`) which will inform Cilium to continue on if the restored endpoint's IP is not available for reallocation, bypassing the specific error "IP is not available". Other errors will not be bypassed, in order to reduce the scope of this stop-gap solution. With the flag set, restored endpoints which had "unavailable" IPs will keep them. Any new endpoints / pods will be assigned fresh, valid IPs from the pool. This flag is only meant to be enabled with CRD-backed IPAM modes such as ENI mode. The reason is because of the change described in [1], where the primary ENI IP was removed from the IPAM pool. In any other mode that this flag is enabled in, the user is warned that the flag is not intended for other modes and will have no effect. This patch is intended to be reverted in the future, as this stop-gap solution will no longer be required, as users of Cilium don't upgrade from versions prior to [1]. I propose that we revert this in the following release that this patch makes it in (N+1). How was this tested? 1) Deployed a Cilium version that doesn't include [1] on EKS cluster 2) Created a Deployment object which I scaled to max out the ENI IPs, such that at least one pod is assigned an "unavailable" IP 3) Upgraded Cilium to a version which does include [1] and observe [2] failures 4) Reset cluster back to state from (2) 5) Upgrade Cilium to the version that contains this commit 6) Observe log msgs from this commit and endpoint restoration succeeding 7) Scale Deployment to 0 and back up, to restart all pods 8) Observe that they all get fresh IPs and none of the "unavailable" IPs [1]: #15453 [2]: ``` { "time":"2021-09-20T16:57:00.400086481Z", "level":"WARN", "origin":"cilium.io/agent", "message":"Unable to restore endpoint, ignoring", "params":{ "endpointID":"992", "error":"Failed to re-allocate IP of endpoint: unable to reallocate 10.0.133.193 IPv4 address: IP 10.0.133.193 is not available", "k8sPodName":"default/pod-1", "subsys":"daemon" } } ``` Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
This PR contains two fixes to support multiple AWS VPCs CIDRs in Cilium ENI IPAM mode.
The first fix is to assign the primary IP of each ENI to its corresponding Linux device. This ensures SNAT works correctly on the secondary interfaces. Before this change, SNAT on the secondary ENI interfaces would pick the node IP as the source IP address, which subsequently caused the packet to be dropped by the fabric. To facilitate this change, the primary IP is excluded from the allocation pool in the operator. This is similar to how AWS CNI treats the primary IP.
The second fix is around the cleanup of the additional rules created for secondary CIDRs (i.e. a follow-up to #15303)
The approach taken here has been validated manually by myself and by a community user.