@tgraf tgraf commented Mar 20, 2021

Neighbor insertion can block indefinitely. See
#14710 for details.

```
goroutine 62 [syscall, 1080 minutes]:
syscall.Syscall6(0x2d, 0x4e, 0xc002e27000, 0x1000, 0x0, 0xc004ba33e0, 0xc004ba33d4, 0x20, 0x21b1380, 0xc005241101)
        /usr/local/go/src/syscall/asm_linux_amd64.s:44 +0x5
syscall.recvfrom(0x4e, 0xc002e27000, 0x1000, 0x1000, 0x0, 0xc004ba33e0, 0xc004ba33d4, 0x27637a0, 0xc0015555c0, 0x0)
        /usr/local/go/src/syscall/zsyscall_linux_amd64.go:1618 +0xa3
syscall.Recvfrom(0x4e, 0xc002e27000, 0x1000, 0x1000, 0x0, 0x4, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/syscall/syscall_unix.go:273 +0xaf
syscall.NetlinkRIB(0x12, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/syscall/netlink_linux.go:71 +0x23c
net.interfaceTable(0x0, 0x4, 0x4, 0x8, 0x18, 0xc002462cc0)
        /usr/local/go/src/net/interface_linux.go:17 +0x48
net.InterfaceByName(0xc0001dd298, 0x4, 0xc00a3e6a20, 0x10, 0x10)
        /usr/local/go/src/net/interface.go:157 +0x1a4
github.com/cilium/cilium/pkg/datapath/linux.(*linuxNodeHandler).insertNeighbor(0xc0008d4c30, 0xc004ba3b58, 0xc0001dd298, 0x4)
        /go/src/github.com/cilium/cilium/pkg/datapath/linux/node.go:591 +0x132
github.com/cilium/cilium/pkg/datapath/linux.(*linuxNodeHandler).nodeUpdate(0xc0008d4c30, 0x0, 0xc004ba3b58, 0x0, 0x0, 0x0)
        /go/src/github.com/cilium/cilium/pkg/datapath/linux/node.go:729 +0x18f
github.com/cilium/cilium/pkg/datapath/linux.(*linuxNodeHandler).NodeValidateImplementation(0xc0008d4c30, 0xc0006a6ed0, 0x29, 0xc00a3e6958, 0x7, 0xc00be311d0, 0x2, 0x2, 0xc001646368, 0x0, ...)
        /go/src/github.com/cilium/cilium/pkg/datapath/linux/node.go:1171 +0x9b
github.com/cilium/cilium/pkg/node/manager.(*Manager).backgroundSync.func1(0x27c8680, 0xc0008d4c30)
        /go/src/github.com/cilium/cilium/pkg/node/manager/manager.go:282 +0xaa
github.com/cilium/cilium/pkg/node/manager.(*Manager).Iter(0xc00091c280, 0xc004ba3e08)
        /go/src/github.com/cilium/cilium/pkg/node/manager/manager.go:149 +0xe7
github.com/cilium/cilium/pkg/node/manager.(*Manager).backgroundSync(0xc00091c280)
        /go/src/github.com/cilium/cilium/pkg/node/manager/manager.go:281 +0x421
created by github.com/cilium/cilium/pkg/node/manager.NewManager
        /go/src/github.com/cilium/cilium/pkg/node/manager/manager.go:191 +0x673
```
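
As a reading aid, the header of the trace (`goroutine 62 [syscall, 1080 minutes]`) is what shows the goroutine has been stuck in a syscall for 18 hours. A minimal sketch of how such headers could be picked out of a larger goroutine dump follows; `longBlocked`, `blockedGoroutine`, and the 60-minute threshold are hypothetical names and values chosen for illustration, not part of Cilium.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// goroutineHeader matches dump headers such as
// "goroutine 62 [syscall, 1080 minutes]:". The wait time is optional,
// because the runtime only prints it for goroutines blocked a while.
var goroutineHeader = regexp.MustCompile(
	`^goroutine (\d+) \[([^,\]]+)(?:, (\d+) minutes)?[^\]]*\]:`)

// blockedGoroutine is a hypothetical record type for this sketch.
type blockedGoroutine struct {
	ID      int
	State   string
	Minutes int
}

// longBlocked returns every goroutine in dump whose header reports a
// wait of at least minMinutes. Headers without a wait time are skipped.
func longBlocked(dump string, minMinutes int) []blockedGoroutine {
	var out []blockedGoroutine
	for _, line := range strings.Split(dump, "\n") {
		m := goroutineHeader.FindStringSubmatch(strings.TrimSpace(line))
		if m == nil || m[3] == "" {
			continue
		}
		id, _ := strconv.Atoi(m[1])   // digits guaranteed by the regexp
		mins, _ := strconv.Atoi(m[3]) // digits guaranteed by the regexp
		if mins >= minMinutes {
			out = append(out, blockedGoroutine{ID: id, State: m[2], Minutes: mins})
		}
	}
	return out
}

func main() {
	// Shortened, made-up dump that reuses the header from this PR.
	dump := "goroutine 62 [syscall, 1080 minutes]:\n" +
		"net.InterfaceByName(...)\n\n" +
		"goroutine 1 [chan receive]:\n" +
		"main.main()\n"
	for _, g := range longBlocked(dump, 60) {
		fmt.Printf("goroutine %d in %s for %d minutes\n", g.ID, g.State, g.Minutes)
	}
	// prints "goroutine 62 in syscall for 1080 minutes"
}
```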

@tgraf tgraf added release-note/bug This PR fixes an issue in a previous release of Cilium. backport/1.7 labels Mar 20, 2021
@tgraf tgraf requested a review from a team as a code owner March 20, 2021 00:15
@maintainer-s-little-helper maintainer-s-little-helper bot added the kind/backports This PR provides functionality previously merged into master. label Mar 20, 2021

tgraf commented Mar 20, 2021

test-backport-1.7

@tgraf tgraf force-pushed the pr/tgraf/v1.7-bump-netlink-dependency branch from 30df4b4 to 9ba51b9 Compare March 20, 2021 00:33
Signed-off-by: Thomas Graf <thomas@cilium.io>
@joestringer joestringer left a comment

I suspect we've been running these versions (or something very close) in the recent 1.8.8 and 1.9.5 releases that we put out last week, and we haven't heard much recently, so I don't really anticipate any sort of regression from this. I'm not sure there's anything practical we can do to mitigate the potential risk here if the intent is to pull in fixes that are only available in much newer trees.

@tgraf tgraf force-pushed the pr/tgraf/v1.7-bump-netlink-dependency branch from cfb2b53 to 3378798 Compare March 20, 2021 01:09
tgraf added 2 commits March 20, 2021 02:15
Signed-off-by: Thomas Graf <thomas@cilium.io>
The netlink library depends on a more recent x/sys/unix

Signed-off-by: Thomas Graf <thomas@cilium.io>
@tgraf tgraf force-pushed the pr/tgraf/v1.7-bump-netlink-dependency branch from 3378798 to c232b81 Compare March 20, 2021 01:16

tgraf commented Mar 20, 2021

test-backport-1.7

@joestringer joestringer merged commit b93db0a into cilium:v1.7 Mar 20, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 20, 2021

brb commented Mar 22, 2021

@tgraf @joestringer Unfortunately, this PR is not enough. We need to backport #15225 too. From the provided stack trace, we can see that it was Go's stdlib `net` package (`net.InterfaceByName`) that was making the netlink request, not the netlink library.

(Lots of issues, unnecessary work, and confusion could have been avoided if we had agreed to backport #13112 to v1.{7,8,9}.)
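
For illustration only: because the stall sits inside the stdlib `net.InterfaceByName` call, one generic way to keep a caller from hanging is to run the lookup in its own goroutine and stop waiting after a deadline. This sketch is not a description of what #15225 or #13112 change (their contents are not shown in this thread); `interfaceByNameTimeout`, `errLookupTimeout`, and `stuckLookup` are hypothetical names. The timed-out goroutine stays blocked in the syscall, so this only releases the caller and does not fix the underlying hang.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// errLookupTimeout is a hypothetical sentinel error for this sketch.
var errLookupTimeout = errors.New("interface lookup timed out")

// lookupResult carries one lookup outcome across the channel.
type lookupResult struct {
	iface *net.Interface
	err   error
}

// interfaceByNameTimeout runs lookup(name) in a separate goroutine and
// gives up waiting after d. The worker goroutine is not cancelled: if
// the lookup never returns, that goroutine is leaked.
func interfaceByNameTimeout(lookup func(string) (*net.Interface, error),
	name string, d time.Duration) (*net.Interface, error) {
	// Buffered so a late result does not block the worker forever.
	ch := make(chan lookupResult, 1)
	go func() {
		iface, err := lookup(name)
		ch <- lookupResult{iface: iface, err: err}
	}()
	select {
	case r := <-ch:
		return r.iface, r.err
	case <-time.After(d):
		return nil, errLookupTimeout
	}
}

// stuckLookup simulates a lookup that stays blocked, standing in for
// the recvfrom call seen in the stack trace.
func stuckLookup(string) (*net.Interface, error) {
	time.Sleep(time.Hour)
	return nil, errors.New("not reached in this demo")
}

func main() {
	// The simulated hang is cut off after 50ms instead of blocking.
	_, err := interfaceByNameTimeout(stuckLookup, "eth0", 50*time.Millisecond)
	fmt.Println("stuck lookup:", err)

	// Same wrapper around the real stdlib call; the result depends on
	// which interfaces exist on the host.
	iface, err := interfaceByNameTimeout(net.InterfaceByName, "lo", time.Second)
	if err != nil {
		fmt.Println("real lookup failed:", err)
		return
	}
	fmt.Println("real lookup found:", iface.Name)
}
```

A timeout on the socket itself would avoid the leaked goroutine, which is why a wrapper like this is at best a stopgap.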

@tgraf tgraf deleted the pr/tgraf/v1.7-bump-netlink-dependency branch March 22, 2021 19:33