Skip to content

Conversation

nathanjsweet
Copy link
Member

@nathanjsweet nathanjsweet commented Jul 23, 2020

Saving protocol information in the lb(x)_key in bpf, as well as
adding protocol information to service maps creation in the
lbmap package ensures that translation, receiving, and forwarding
will always take the protocol into proper account.

Signed-off-by: Nathan Sweet nathanjsweet@pm.me

Fixes: #9207

Differentiate load-balancer keys in the datapath by protocol (in addition to address and port), so that 
cilium can correctly differentiate protocols between services.

@nathanjsweet nathanjsweet added the area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. label Jul 23, 2020
@nathanjsweet nathanjsweet requested review from qmonnet and a team July 23, 2020 16:40
@nathanjsweet nathanjsweet requested a review from a team as a code owner July 23, 2020 16:40
@nathanjsweet nathanjsweet requested a review from a team July 23, 2020 16:40
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 23, 2020
@nathanjsweet nathanjsweet added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label Jul 23, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 23, 2020
@coveralls
Copy link

coveralls commented Jul 23, 2020

Coverage Status

Coverage decreased (-0.04%) to 37.081% when pulling 9f59de2 on nathanjsweet/pr/differentiate-protocol-in-services into 3abd218 on master.

@joestringer joestringer added the upgrade-impact This PR has potential upgrade or downgrade impact. label Jul 23, 2020
@joestringer
Copy link
Member

Added upgrade-impact label based on the concerns raised in the original issue #9207. Will need to review closely in that area.

@pchaigno pchaigno added the needs/e2e-test This issue is not covered by existing CI tests, but should be. label Jul 24, 2020
Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except for some minor indentation issues and tests.

Maybe a painless way to test this in e2e tests is to simply change test/k8sT/manifests/demo_ds.yaml to switch one of the services to use same ports for TCP and UDP. These services are then used in test/k8sT/Services.go tests for NodePort, ExternalIP, externalTrafficPolicy=Local, etc.

@@ -603,7 +603,7 @@ func (s *Service) upsertServiceIntoLBMaps(svc *svcInfo, onlyLocalBackends bool,

err := s.lbmap.UpsertService(
uint16(svc.frontend.ID), svc.frontend.L3n4Addr.IP,
svc.frontend.L3n4Addr.L4Addr.Port,
svc.frontend.L3n4Addr.L4Addr.Port, string(svc.frontend.L3n4Addr.L4Addr.Protocol),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some unit tests for this in service_test.go. I think they could be extended to add a new frontend with just the protocol changed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These look to be unit tests and the lbmap service is just the mock service, so I don't know how much that will really tell us. Is there something else that you're referencing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, no you're right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still add a unit test to check whether a svc doesn't get overwritten in pkg/service/service.go internal state if the same svc but with different proto gets added.

@brb
Copy link
Member

brb commented Jul 24, 2020

Thinking loud about the upgrade:

  • Both UDP and TCP services with the same IP/port will end up having the same rev_nat_index (allocated by pkg/service.AcquireID(...)). Same applies to service backends. So, the indices won't change, and thus no changes to existing CT entries are needed.
  • Last time I looked, the flow is the following:
    1. pkg/service.RestoreServices()
    2. Wait for k8s watchers to finish restoration (it will call pkg/service.UpsertService(..) for existing services)
    3. Reload BPF programs
    4. Call pkg/service.SyncWithK8sFinished(). This will remove the obsolete services.

From the first glance it looks that no existing flow will be interrupted, as 1) rev_nat_index doesn't change, 2) the old BPF programs will be able to access old svc entries, 3) the new BPF programs will be accessing new svc entries. However, we should definitely add an upgrade test (we can reuse the existing migrate-svc in test/k8sT/Updates.go) to gain more confidence.

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • pkg/loadbalancer/loadbalancer.go:L3n4Addr.Hash() should not omit the protocol.
  • I'd add some unit tests to pkg/service, also integration tests to k8sT/Services.go.
  • Extend migrate-svc in k8sT/Updates.go to test the upgrade.

@nathanjsweet nathanjsweet force-pushed the nathanjsweet/pr/differentiate-protocol-in-services branch 3 times, most recently from b755786 to f0b067e Compare July 25, 2020 00:02
@nathanjsweet nathanjsweet force-pushed the nathanjsweet/pr/differentiate-protocol-in-services branch 2 times, most recently from c954fa9 to db791c9 Compare August 5, 2020 17:38
@nathanjsweet nathanjsweet requested a review from a team as a code owner August 5, 2020 17:38
@nathanjsweet nathanjsweet requested a review from a team August 5, 2020 17:38
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall as a first-pass, just a minor item to address

@nathanjsweet nathanjsweet force-pushed the nathanjsweet/pr/differentiate-protocol-in-services branch 2 times, most recently from d7d88cc to 359f001 Compare August 6, 2020 03:02
@nathanjsweet nathanjsweet requested a review from a team as a code owner August 6, 2020 03:02
@nathanjsweet
Copy link
Member Author

nathanjsweet commented Oct 2, 2020

We have full kernel/e2e and runtime passes for this PR. The last commit, I think, doesn't need a full set of tests, as it is a simple matter of deleting a few lines of logging, and adding some integration tests. Assuming @brb is cool with putting off sockops protocol #ifdef work into another PR then I think this PR is safe to merge on 🍏 .

@nathanjsweet
Copy link
Member Author

ran into #13262

Copy link
Member

@aanm aanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've given a detailed review to all files except the ones in test/

func FromNumber(proto uint8) (U8proto, error) {
_, ok := protoNames[U8proto(proto)]
if !ok {
return 0, fmt.Errorf("unknown protocol %d", proto)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, any reason why the error on line 66 is quoted but this one is not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but since the primitive causing the error, in this case, isn't a string I think we should leave it as is.

@@ -22,7 +22,7 @@ import (

k8sConst "github.com/cilium/cilium/pkg/k8s/apis/cilium.io"
"github.com/cilium/cilium/pkg/versioncheck"
"github.com/cilium/cilium/test/ginkgo-ext"
ginkgoext "github.com/cilium/cilium/test/ginkgo-ext"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

irrelevant change

Copy link
Member Author

@nathanjsweet nathanjsweet Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go fmt goimports does this for me automatically (running go 1.15).

Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs, policy changes were trivial. I'll give approval for these two codeowners. CLI and BPF seemed to already be reviewed by others so I didn't look too closely, looks like others already have comments, some of which still need to be addressed.

Saving protocol information in the lb(x)_key in bpf, as well as
adding protocol information to service maps creation in the
lbmap package ensures that translation, receiving, and forwarding
will always take the protocol into proper account.

Refactor cilium to always assume a protocol with a service.
Refactor tests to account for service protocols.

Update Documentation for kubeproxy-free nodeport example
Since the output of `cilium service list` has changed
the documentation should reflect the change.

Fixes: #9207
Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
@nathanjsweet nathanjsweet force-pushed the nathanjsweet/pr/differentiate-protocol-in-services branch from b33559a to cfdb026 Compare October 9, 2020 17:42
@joestringer
Copy link
Member

test-me-please

@joestringer joestringer requested review from aanm and brb October 9, 2020 22:00
@joestringer
Copy link
Member

All green CI, as discussed out-of-band on Slack, feedback was addressed and recently requested changes on this PR have been largely cosmetic. Reviewers, if you still see anything worth following up on then please post those reviews and let @nathanjsweet know.

@joestringer joestringer merged commit 107b0f5 into master Oct 9, 2020
@joestringer joestringer deleted the nathanjsweet/pr/differentiate-protocol-in-services branch October 9, 2020 22:03
@nathanjsweet nathanjsweet restored the nathanjsweet/pr/differentiate-protocol-in-services branch October 15, 2020 15:05
@nathanjsweet nathanjsweet deleted the nathanjsweet/pr/differentiate-protocol-in-services branch October 15, 2020 15:45
@nathanjsweet nathanjsweet restored the nathanjsweet/pr/differentiate-protocol-in-services branch November 11, 2020 03:49
@nathanjsweet nathanjsweet deleted the nathanjsweet/pr/differentiate-protocol-in-services branch November 11, 2020 04:08
@nathanjsweet nathanjsweet restored the nathanjsweet/pr/differentiate-protocol-in-services branch February 5, 2021 17:13
@aanm aanm deleted the nathanjsweet/pr/differentiate-protocol-in-services branch March 4, 2022 03:12
@briantopping
Copy link
Contributor

Does this actually fix #9207? Last time I checked, it's still impossible for a pod to publish both a TCP and UDP as a service (such as 53) with the same port number.

@pchaigno
Copy link
Member

@briantopping No, this PR was later reverted because of an upgrade issue: #13587. This is now tracked at #9207, as you've seen.

@briantopping
Copy link
Contributor

Ah, thanks. I forgot a lot of the history, just noticed the heading on that issue and didn't know if it was getting lost because of it.
image

@pchaigno
Copy link
Member

Yeah, that's just the GitHub UI being a bit broken: it doesn't understand when you reopen an issue that should have been fixed by a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. upgrade-impact This PR has potential upgrade or downgrade impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Differentiate between UDP and TCP services
9 participants