Skip to content

Conversation

HadrienPatte
Copy link
Member

This PRs contains three fixes for the GetDetachedNetworkInterfaces function:

GC tags

GetDetachedNetworkInterfaces is used to find ENIs that should be GCed. For that to work, it's important to use the same tags to filter ENIs here as tags that are used to create ENIs. This is done here. The issue is that in GetDetachedNetworkInterfaces, we merge c.subnetsFilters to those existing tags filters. c.subnetsFilters is created by NewSubnetsFilters based on operatorOption.Config.IPAMSubnetsTags and operatorOption.Config.IPAMSubnetsIDs. The subnet IDs filter is fine as ENIs can be filtered by the ID of their subnet (ref), but filtering ENIs based on IPAMSubnetsTags is problematic, because at no point in the ENI creation process are those tags added to the ENIs, and they are not intended to be added to the ENIs. ENITags is the flag for tags that should be added to all ENIs, and these tags are properly passed down to GetDetachedNetworkInterfaces to ensure the GC logic also filters based on those tags.

So here the fix is to keep the subnet-ID based filtering of ENIs but remove the subnet-tag one as it currently prevents the operator from GC-ing ENIs if the operator is not configured with ENITags as a subset of IPAMSubnetsTags.

Relevant diff:

 func (c *Client) GetDetachedNetworkInterfaces(ctx context.Context, tags ipamTypes.Tags, maxResults int32) ([]string, error) {
 	result := make([]string, 0, int(maxResults))
 	input := &ec2.DescribeNetworkInterfacesInput{
-		Filters: append(NewTagsFilter(tags), c.subnetsFilters...),
+		Filters: NewTagsFilter(tags),
+	}
+	for _, subnetFilter := range c.subnetsFilters {
+		if aws.ToString(subnetFilter.Name) == "subnet-id" {
+			input.Filters = append(input.Filters, subnetFilter)
+		}
 	}

AWS API calls page size

When constructing the input DescribeNetworkInterfacesInput, its MaxResults field is set to the maxResults parameter of GetDetachedNetworkInterfaces. But despite having the same name, those two parameters are very different:

  • maxResults is the maximum number of ENIs we want returned by GetDetachedNetworkInterfaces
  • input.MaxResults is actually the (max) page size of paginated AWS API calls, see pagination doc

The way the AWS API constructs a DescribeNetworkInterfaces call response is:

  1. First gather all ENIs,
  2. Then break them into pages based on the requested max page size
  3. Then apply filters to remove ENIs that don't match the filters
  4. Finally send all pages with whatever is left in them.

This means that for example in an account with 1000 ENIs, with 5 ENIs that have a specific tag, if we make a DescribeNetworkInterfaces with a filter on this tag and request a page size of 25, the AWS API will start by gathering all 1000 ENIs. Then group them in 40 pages of 25 ENIs each. Then apply the filter which will leave most pages empty, only at most five pages will have at least one ENI depending on whether the 5 ENIs matching the tag filter were in the same pages or not. Then the API will send all 40 pages one by one even if most of those are empty. this means that we make 40 individual API calls to get 5 ENIs.

To avoid this situation, we set the AWS API response max page size to its maximum value of 1000. using the existing defaults.ENIMaxResultsPerApiCall.

Relevant diff:

 	}
 	if operatorOption.Config.AWSPaginationEnabled {
-		input.MaxResults = aws.Int32(maxResults)
+		input.MaxResults = aws.Int32(defaults.ENIMaxResultsPerApiCall)
 	}

Max results cutoff

GetDetachedNetworkInterfaces has a maxResults parameter which is set from GarbageCollectionParams.MaxPerInterval. Which in turn is set from the ENIGarbageCollectionMaxPerInterval const to 25.

The goal of this parameter is to cap the length of the returned result. But with the way it is currently used, GetDetachedNetworkInterfaces can actually return a result of length up to 49. This is because for each page, we append all ENIs from output.NetworkInterfaces to result and only after that check for the length of result. so if at the begining of a page iteration result is a slice of length 24 and the new page has 25 ENIs, then they will all be appended to result before the length check will break out of the paginator loop as result will be of length 24+25=49.

Relevant diff:

@@ -236,11 +241,11 @@ func (c *Client) GetDetachedNetworkInterfaces(ctx context.Context, tags ipamType
 			return nil, err
 		}
 		for _, eni := range output.NetworkInterfaces {
+			if len(result) >= int(maxResults) {
+				return result, nil
+			}
 			result = append(result, aws.ToString(eni.NetworkInterfaceId))
 		}
-		if len(result) >= int(maxResults) {
-			break
-		}
 	}
 	return result, nil
 }

Note: this was already reported in #37983:

I also noticed that the logic in GetDetachedNetworkInterfaces() is a bit strange: it actually uses pagination but the number of returned ENIs can exceed maxResults passed in the parameters. Mainly, this can happen here if the first page returns x results where x < maxResults and then the second page returns for example exactly maxResults, so the length of result will be maxResults + x > maxResults. The break should happen inside the loop on output.NetworkInterfaces. I think the fix for this problem is out of scope for this PR though.


Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!
aws/eni: Don't use subnet tags to filter ENIs for GC

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 22, 2025
@HadrienPatte
Copy link
Member Author

/test

@HadrienPatte HadrienPatte marked this pull request as ready for review July 22, 2025 17:37
@HadrienPatte HadrienPatte requested a review from a team as a code owner July 22, 2025 17:37
@HadrienPatte HadrienPatte requested a review from liyihuang July 22, 2025 17:37
@HadrienPatte HadrienPatte added release-note/bug This PR fixes an issue in a previous release of Cilium. and removed release-note/bug This PR fixes an issue in a previous release of Cilium. labels Jul 28, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 29, 2025
@liyihuang liyihuang added the area/eni Impacts ENI based IPAM. label Jul 30, 2025
@liyihuang liyihuang self-requested a review July 30, 2025 13:55
Copy link
Contributor

@liyihuang liyihuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. there are some questions but not related to this PR

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jul 30, 2025
@joestringer
Copy link
Member

Thanks for the breakdown. Typically the way you have formatted the PR description, we would encode directly into the git commits as three separate commits. This is valuable as well when referring to the code, because GitHub PR descriptions are often not available locally and that context is harder to come by. Would you mind pushing that content into the commits and split them up?

Additionally as a bugfix I'll nominate this for v1.18 backport.

@joestringer joestringer added the needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch label Jul 30, 2025
`GetDetachedNetworkInterfaces` is used to find ENIs that should be GCed. For that to work, it's important to use the same tags to filter ENIs here as tags that are used to create ENIs. This is done [here](https://github.com/cilium/cilium/blob/65c63cd6febecde7eb307a1d776fa63ca61c0cc6/pkg/ipam/allocator/aws/aws.go#L95-L97). The issue is that in `GetDetachedNetworkInterfaces`, we merge `c.subnetsFilters` to those existing tags filters. `c.subnetsFilters` is created by [`NewSubnetsFilters`](https://github.com/cilium/cilium/blob/65c63cd6febecde7eb307a1d776fa63ca61c0cc6/pkg/aws/ec2/ec2.go#L130) based on [`operatorOption.Config.IPAMSubnetsTags`](https://github.com/cilium/cilium/blob/65c63cd6febecde7eb307a1d776fa63ca61c0cc6/operator/option/config.go#L267-L268) and [`operatorOption.Config.IPAMSubnetsIDs`](https://github.com/cilium/cilium/blob/65c63cd6febecde7eb307a1d776fa63ca61c0cc6/operator/option/config.go#L264-L265). The subnet IDs filter is fine as ENIs can be filtered by the ID of their subnet ([ref](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeNetworkInterfaces.html#API_DescribeNetworkInterfaces_RequestParameters)), but filtering ENIs based on `IPAMSubnetsTags` is problematic, because at no point in the ENI creation process are those tags added to the ENIs, and they are not intended to be added to the ENIs. [`ENITags`](https://github.com/cilium/cilium/blob/65c63cd6febecde7eb307a1d776fa63ca61c0cc6/operator/option/config.go#L112) is the flag for tags that should be added to all ENIs, and these tags are properly passed down to `GetDetachedNetworkInterfaces` to ensure the GC logic also filters based on those tags.

So here the fix is to keep the subnet-ID based filtering of ENIs but remove the subnet-tag one as it currently prevents the operator from GC-ing ENIs if the operator is not configured with `ENITags` as a subset of `IPAMSubnetsTags`.

Signed-off-by: Hadrien Patte <hadrien.patte@datadoghq.com>
When constructing the `input` [`DescribeNetworkInterfacesInput`](https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/ec2#DescribeNetworkInterfacesInput), its `MaxResults` field is set to the `maxResults` parameter of `GetDetachedNetworkInterfaces`. But despite having the same name, those two parameters are very different:
* `maxResults` is the [maximum number of ENIs](https://github.com/cilium/cilium/blob/6d1aa98670fe046fe3ac5a2b4167cfece416603e/pkg/aws/eni/eni_gc.go#L29-L31) we want returned by `GetDetachedNetworkInterfaces`
* `input.MaxResults` is actually the (max) page size of paginated AWS API calls, see [pagination doc](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/Query-Requests.html#api-pagination)

The way the AWS API constructs a `DescribeNetworkInterfaces` call response is:
1. First gather all ENIs,
2. Then break them into pages based on the requested max page size
3. Then apply filters to remove ENIs that don't match the filters
4. Finally send all pages with whatever is left in them.

This means that for example in an account with 1000 ENIs, with 5 ENIs that have a specific tag, if we make a `DescribeNetworkInterfaces` with a filter on this tag and request a page size of 25, the AWS API will start by gathering all 1000 ENIs. Then group them in 40 pages of 25 ENIs each. Then apply the filter which will leave most pages empty, only at most five pages will have at least one ENI depending on whether the 5 ENIs matching the tag filter were in the same pages or not. Then the API will send all 40 pages one by one even if most of those are empty. this means that we make 40 individual API calls to get 5 ENIs.

To avoid this situation, we set the AWS API response max page size to [its maximum value of 1000](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeNetworkInterfaces.html#API_DescribeNetworkInterfaces_RequestParameters). using the existing `defaults.ENIMaxResultsPerApiCall`.

Signed-off-by: Hadrien Patte <hadrien.patte@datadoghq.com>
@HadrienPatte HadrienPatte force-pushed the pr/HadrienPatte/eni-gc branch from b1e2a94 to 14d4584 Compare July 31, 2025 08:49
`GetDetachedNetworkInterfaces` has a `maxResults` parameter which is set from [`GarbageCollectionParams.MaxPerInterval`](https://github.com/cilium/cilium/blob/6d1aa98670fe046fe3ac5a2b4167cfece416603e/pkg/aws/eni/eni_gc.go#L29-L31). Which in turn is set from the [`ENIGarbageCollectionMaxPerInterval`](https://github.com/cilium/cilium/blob/6d1aa98670fe046fe3ac5a2b4167cfece416603e/pkg/defaults/defaults.go#L391-L393) const to 25.

The goal of this parameter is to cap the length of the returned `result`. But with the way it is currently used, `GetDetachedNetworkInterfaces` can actually return a `result` of length up to 49. This is because for each page, we append all ENIs from `output.NetworkInterfaces` to `result` and only after that check for the length of `result`. so if at the beginning of a page iteration `result` is a slice of length 24 and the new page has 25 ENIs, then they will all be appended to `result` before the length check will `break` out of the paginator loop as `result` will be of length 24+25=49.

Signed-off-by: Hadrien Patte <hadrien.patte@datadoghq.com>
@HadrienPatte HadrienPatte force-pushed the pr/HadrienPatte/eni-gc branch from 14d4584 to 2da9819 Compare July 31, 2025 08:51
@HadrienPatte
Copy link
Member Author

/test

@joestringer joestringer added this pull request to the merge queue Jul 31, 2025
Merged via the queue into main with commit 5be4ea1 Jul 31, 2025
323 of 327 checks passed
@joestringer joestringer deleted the pr/HadrienPatte/eni-gc branch July 31, 2025 16:15
@rastislavs rastislavs mentioned this pull request Aug 6, 2025
17 tasks
@rastislavs rastislavs added backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. and removed needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch labels Aug 6, 2025
@github-actions github-actions bot added backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. and removed backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. labels Aug 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/eni Impacts ENI based IPAM. backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants