Skip to content

Conversation

sugangli
Copy link
Contributor

@sugangli sugangli commented Oct 8, 2024

NAT Map LRU behavior could evict any entry in the MAP. This change restore a reverse NAT entry when it is expected but not found. Previously, we were only relying on existence of forwarding NAT to determine if the NAT pair exists or not.

This fix is verified by using the same repro steps in #34833. After the fix, we saw much less timeout from the client, but the case like eviction happens between TCP SYN and TCP SYN ACK will still drop the returning packets.

Fixes: #34833

@sugangli sugangli requested a review from a team as a code owner October 8, 2024 19:17
@sugangli sugangli requested a review from gentoo-root October 8, 2024 19:17
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 8, 2024
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Oct 8, 2024
@sugangli sugangli force-pushed the snat_restore branch 2 times, most recently from 6314beb to ac97018 Compare October 8, 2024 21:27
@joestringer joestringer added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Oct 9, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 9, 2024
@joestringer
Copy link
Member

/test

@maintainer-s-little-helper
Copy link

Commit 50ad1eb does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Oct 15, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Oct 16, 2024
@sugangli
Copy link
Contributor Author

Hi @joestringer, let me know if I need to add anything else to this PR.

@aanm aanm requested a review from joestringer October 21, 2024 10:01
@sypakine
Copy link
Contributor

/test

@sugangli sugangli force-pushed the snat_restore branch 2 times, most recently from cc029f0 to d9934ce Compare October 21, 2024 20:05
@joestringer
Copy link
Member

@gentoo-root Do you have intentions to review this PR or can you find someone else from @cilium/sig-datapath ?

Copy link
Member

@julianwiedmann julianwiedmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please also add the IPv6 part of the fix?

Agree with Tom that this needs tests. I could see a bpf_host test under bpf/tests/ that walks through the desired packet flow, and reproduces the whole scenario you're fixing:

  1. transmit a packet, have it SNATed
  2. receive a reply for that connection, check that it gets revSNATed
  3. manually delete the revSNAT entry
  4. receive a second reply, check that it no longer gets revSNATed
  5. transmit a second packet, validate that it re-creates the missing state
  6. receive a third reply, check that it gets revSNATed again

@julianwiedmann julianwiedmann added area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. feature/snat Relates to SNAT or Masquerading of traffic kind/bug This is a bug in the Cilium logic. labels Oct 24, 2024
@joestringer joestringer removed the request for review from gentoo-root October 24, 2024 17:30
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Feb 20, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 10, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 10, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 10, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 11, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 19, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 19, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 19, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 20, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 21, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 21, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 21, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gyutaeb added a commit to gyutaeb/cilium that referenced this pull request Mar 21, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: cilium#35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
github-merge-queue bot pushed a commit that referenced this pull request Mar 21, 2025
ORG NAT entry could be removed by LRU eviction. This commit restore the
ORG NAT entry from REV NAT entry and tuple in
snat_v4_rev_nat_handle_mapping(). Thus, it can mitigate network failures
caused by LRU eviction.

Relates: #35304
Signed-off-by: Gyutae Bae <gyu.8ae@gmail.com>
gentoo-root added a commit that referenced this pull request Aug 6, 2025
snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
gentoo-root added a commit that referenced this pull request Aug 7, 2025
snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
gentoo-root added a commit that referenced this pull request Aug 7, 2025
snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request Aug 14, 2025
snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
rabelmervin pushed a commit to rabelmervin/cilium that referenced this pull request Aug 18, 2025
snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: cilium#35304
Ref: cilium#37747
Fixes: cilium#38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
pippolo84 pushed a commit that referenced this pull request Aug 25, 2025
[ upstream commit b1bdfdd ]

snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84 pushed a commit that referenced this pull request Aug 25, 2025
[ upstream commit b1bdfdd ]

snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84 pushed a commit that referenced this pull request Aug 25, 2025
[ upstream commit b1bdfdd ]

[ backporter's notes: minor conflicts due to different map name and some
missing tests in stable branch. ]

snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84 pushed a commit that referenced this pull request Aug 25, 2025
[ upstream commit b1bdfdd ]

[ backporter's notes: minor conflicts due to different map name and some
missing tests in stable branch. ]

snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84 pushed a commit that referenced this pull request Aug 25, 2025
[ upstream commit b1bdfdd ]

[ backporter's notes: minor conflicts due to different map name and some
missing tests in stable branch. ]

snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84 pushed a commit that referenced this pull request Aug 26, 2025
[ upstream commit b1bdfdd ]

snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84 pushed a commit that referenced this pull request Aug 26, 2025
[ upstream commit b1bdfdd ]

snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84 pushed a commit that referenced this pull request Aug 26, 2025
[ upstream commit b1bdfdd ]

[ backporter's notes: minor conflicts due to different map name and some
missing tests in stable branch. ]

snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84 pushed a commit that referenced this pull request Aug 26, 2025
[ upstream commit b1bdfdd ]

snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84 pushed a commit that referenced this pull request Aug 26, 2025
[ upstream commit b1bdfdd ]

[ backporter's notes: minor conflicts due to different map name and some
missing tests in stable branch. ]

snat_v4_new_mapping creates a RevSNAT entry, then a SNAT entry for new
connections. However, we also have a fallback mechanism in
snat_v4_rev_nat_handle_mapping, which restores the SNAT entry if it was
evicted by LRU, but the original RevSNAT entry still exists.

There is a hypothesis that the above two functions may race with each
other, and the following is possible:

1. snat_v4_new_mapping creates a RevSNAT entry.
2. snat_v4_rev_nat_handle_mapping finds the RevSNAT entry and "restores"
   the SNAT entry (which has never existed yet).
3. snat_v4_new_mapping goes on to create the SNAT entry, corresponding
   to the RevSNAT entry, created previously.
4. snat_v4_new_mapping fails because the SNAT entry already exists,
   created by snat_v4_rev_nat_handle_mapping.
5. snat_v4_new_mapping then removes the RevSNAT entry, trying to revert
   everything.

The right outcome in this case would be to keep the SNAT entry, whoever
created it first, so it makes sense not to fail on -EEXIST.

It also makes the fallback mechanism more robust, because there is a
time gap between __snat_lookup and __snat_create.

Ref: #35304
Ref: #37747
Fixes: #38466
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. feature/snat Relates to SNAT or Masquerading of traffic kind/bug This is a bug in the Cilium logic. kind/community-contribution This was a contribution made by a community member. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NAT LRU Eviction due to Full Capacity
7 participants