Skip to content

Conversation

tklauser
Copy link
Member

@tklauser tklauser commented Mar 7, 2025

Access to (*ProxyPorts).nRedirects needs to happen with (*ProxyPorts).mutex locked to avoid the following data race:

WARNING: DATA RACE
Read at 0x00c000365f38 by goroutine 24:
  github.com/cilium/cilium/pkg/proxy/proxyports.(*ProxyPorts).releaseProxyPort.func1()
      /home/tklauser/go/src/github.com/cilium/cilium/pkg/proxy/proxyports/proxyports.go:402 +0x224

Previous write at 0x00c000365f38 by goroutine 22:
  github.com/cilium/cilium/pkg/proxy/proxyports.(*ProxyPorts).releaseProxyPort()
      /home/tklauser/go/src/github.com/cilium/cilium/pkg/proxy/proxyports/proxyports.go:388 +0xf6
  github.com/cilium/cilium/pkg/proxy/proxyports.TestPortAllocator()
      /home/tklauser/go/src/github.com/cilium/cilium/pkg/proxy/proxyports/proxyports_test.go:147 +0x13e5
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1792 +0x225
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1851 +0x44

Goroutine 24 (running) created at:
  github.com/cilium/cilium/pkg/proxy/proxyports.(*ProxyPorts).releaseProxyPort()
      /home/tklauser/go/src/github.com/cilium/cilium/pkg/proxy/proxyports/proxyports.go:396 +0x30f
  github.com/cilium/cilium/pkg/proxy/proxyports.TestPortAllocator()
      /home/tklauser/go/src/github.com/cilium/cilium/pkg/proxy/proxyports/proxyports_test.go:75 +0x77a
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1792 +0x225
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1851 +0x44

Goroutine 22 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1851 +0x8f2
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:2279 +0x85
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1792 +0x225
  testing.runTests()
      /usr/local/go/src/testing/testing.go:2277 +0x96c
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:2142 +0xeea
  main.main()
      _testmain.go:47 +0x164

This can easily be reproduced by running:

go test -race -c ./pkg/proxy/proxyports && stress ./proxyports.test

The godoc comment for (*ProxyPorts).releaseProxyPort also specifies that it needs to be called with (*ProxyPorts).mutex held, so let's lock it when using it in TestPortAllocator.

Fixes #36375

Access to (*ProxyPorts).nRedirects needs to happen with
(*ProxyPorts).mutex locked to avoid the following data race:

WARNING: DATA RACE
Read at 0x00c000365f38 by goroutine 24:
  github.com/cilium/cilium/pkg/proxy/proxyports.(*ProxyPorts).releaseProxyPort.func1()
      /home/tklauser/go/src/github.com/cilium/cilium/pkg/proxy/proxyports/proxyports.go:402 +0x224

Previous write at 0x00c000365f38 by goroutine 22:
  github.com/cilium/cilium/pkg/proxy/proxyports.(*ProxyPorts).releaseProxyPort()
      /home/tklauser/go/src/github.com/cilium/cilium/pkg/proxy/proxyports/proxyports.go:388 +0xf6
  github.com/cilium/cilium/pkg/proxy/proxyports.TestPortAllocator()
      /home/tklauser/go/src/github.com/cilium/cilium/pkg/proxy/proxyports/proxyports_test.go:147 +0x13e5
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1792 +0x225
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1851 +0x44

Goroutine 24 (running) created at:
  github.com/cilium/cilium/pkg/proxy/proxyports.(*ProxyPorts).releaseProxyPort()
      /home/tklauser/go/src/github.com/cilium/cilium/pkg/proxy/proxyports/proxyports.go:396 +0x30f
  github.com/cilium/cilium/pkg/proxy/proxyports.TestPortAllocator()
      /home/tklauser/go/src/github.com/cilium/cilium/pkg/proxy/proxyports/proxyports_test.go:75 +0x77a
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1792 +0x225
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1851 +0x44

Goroutine 22 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1851 +0x8f2
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:2279 +0x85
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1792 +0x225
  testing.runTests()
      /usr/local/go/src/testing/testing.go:2277 +0x96c
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:2142 +0xeea
  main.main()
      _testmain.go:47 +0x164

This can easily be reproduced by running:

    go test -race -c ./pkg/proxy/proxyports && stress ./proxyports.test

The godoc comment for (*ProxyPorts).releaseProxyPort also specifies that
it needs to be called with (*ProxyPorts).mutex held, so let's lock it
when using it in TestPortAllocator.

Signed-off-by: Tobias Klauser <tobias@cilium.io>
@tklauser tklauser requested a review from a team as a code owner March 7, 2025 11:36
@tklauser tklauser requested a review from sayboras March 7, 2025 11:36
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 7, 2025
@tklauser tklauser added the release-note/ci This PR makes changes to the CI. label Mar 7, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 7, 2025
@tklauser tklauser added area/CI Continuous Integration testing issue or flake area/CI-improvement Topic or proposal to improve the Continuous Integration workflow dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Mar 7, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 7, 2025
@tklauser
Copy link
Member Author

tklauser commented Mar 7, 2025

/test

Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice finding 👍

@sayboras sayboras enabled auto-merge March 7, 2025 12:27
@sayboras sayboras added this pull request to the merge queue Mar 7, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 7, 2025
Merged via the queue into main with commit dd2295f Mar 7, 2025
153 checks passed
@sayboras sayboras deleted the pr/tklauser/proxyport-fix-flake-datarace branch March 7, 2025 12:48
@julianwiedmann julianwiedmann added the needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch label Apr 2, 2025
@julianwiedmann
Copy link
Member

discussed with @mhofstetter that this also affects the v1.17 branch

@mhofstetter mhofstetter mentioned this pull request Apr 2, 2025
3 tasks
@mhofstetter mhofstetter added backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. and removed needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels Apr 2, 2025
@github-actions github-actions bot added backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. and removed backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. labels Apr 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/CI Continuous Integration testing issue or flake area/CI-improvement Topic or proposal to improve the Continuous Integration workflow backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/ci This PR makes changes to the CI.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI: ci-integration: TestPortAllocator failed: Unexpected count for proxy port rules
4 participants