
Conversation

johngmyers
Member

Fixes #10139

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 30, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johngmyers

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/provider/aws Issues or PRs related to aws provider approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 30, 2020
@johngmyers
Member Author

/retest

@seh
Contributor

seh commented Oct 30, 2020

I built kops including this patch, and it worked as intended: my worker machines were able to contact kops-controller on port 3988 and finish their bootstrap procedure.

@johngmyers johngmyers changed the title WIP Open ELB to kops-controller port when using it for internal API Open ELB to kops-controller port when using it for internal API Oct 30, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 30, 2020
@seh
Contributor

seh commented Oct 31, 2020

> I built kops including this patch, and it worked as intended: my worker machines were able to contact kops-controller on port 3988 and finish their bootstrap procedure.

I spoke too soon. It seemed that this worked earlier, but then failed again in my current test. What I observed was that opening up the ELB's security group to allow ingress on port 3988 from the "nodes" security group wasn't good enough. Opening ingress on port 3988 from everywhere did work.

I don't understand why this is so. The source machine—one of my worker machines—is a member of the "nodes" security group. It's also a member of another security group, but—not that I expected this to work—when I also tried allowing ingress to the ELB from that other security group, it didn't change the outcome. Only opening up the ELB to ingress from all sources has worked so far.

I can't tell if there's some SNAT going on here that's confusing AWS's ability to tell that the incoming traffic (that is, from a worker machine to the ELB) is coming from a blessed security group.

@seh
Contributor

seh commented Oct 31, 2020

I enabled access logs on the ELB, but the client IP addresses only show the IP addresses of the ELB listeners, which isn't helpful. There are a few clients with an address like 18.188.58.156, which does not look like one from any of my VPCs.

I tried replacing the ELB security group rule allowing ingress from anywhere with one allowing ingress from the same ELB security group itself, just to see if this firewall enforcement matches what the ELB logs show. That didn't work, though: traffic was still blocked from the worker machines.

I'm going to take a break and get some sleep, and see if a reasonable explanation comes to me—as usually occurs the moment I turn off the computer.

@hakman
Member

hakman commented Oct 31, 2020

Your problem seems to be that the ELB is public, and the SGs will not help much with that. Probably 18.188.58.156 is the VPC NAT GW address.
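The NAT-gateway explanation can be illustrated with a small, self-contained Go sketch (the VPC CIDR 10.0.0.0/16 and node address 10.0.42.7 are hypothetical values, not from this cluster): a security-group source rule can only match traffic whose source address stays inside the VPC, and the NAT gateway's Elastic IP does not.

```go
package main

import (
	"fmt"
	"net"
)

// inAnyCIDR reports whether ip falls inside any of the given CIDR blocks.
func inAnyCIDR(ip string, cidrs []string) bool {
	addr := net.ParseIP(ip)
	for _, c := range cidrs {
		_, block, err := net.ParseCIDR(c)
		if err != nil {
			continue
		}
		if block.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical VPC CIDR; the real value depends on the cluster.
	vpcCIDRs := []string{"10.0.0.0/16"}
	// Address observed in the ELB access logs above: the NAT gateway's
	// Elastic IP, which lies outside every VPC CIDR block.
	fmt.Println(inAnyCIDR("18.188.58.156", vpcCIDRs)) // false
	// A hypothetical private node address, which a source-group rule
	// could match when traffic does not traverse the NAT gateway.
	fmt.Println(inAnyCIDR("10.0.42.7", vpcCIDRs)) // true
}
```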

FromPort: fi.Int64(wellknownports.KopsControllerPort),
Protocol: fi.String("tcp"),
SecurityGroup: lbSG,
SourceGroup: nodeGroup.Task,
Contributor

Can we check if the API server load balancer is public, and if so, use a more lax source range here? As @hakman realized, for a public ELB, the traffic comes in from the nodes via a NAT gateway with a source address that's not even within any of the VPC's CIDR blocks. (The NAT gateways each have a private IP address within the VPC, but the corresponding "Elastic IP address" is outside the VPC.)
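The choice being discussed here can be sketched as follows. This is illustrative only; the type and function names are hypothetical and do not reflect kops's actual task API, which builds these rules from `fi` tasks as in the hunk above.

```go
package main

import "fmt"

// IngressSource models one security-group ingress rule's source:
// either a CIDR range or another security group. Sketch only.
type IngressSource struct {
	CIDR        string // set when allowing a CIDR range
	SourceGroup string // set when allowing another security group
}

// kopsControllerSource picks the ingress source for the kops-controller
// port. A public ELB sees node traffic NAT-translated to the gateway's
// Elastic IP, so a source-group rule cannot match it and a lax CIDR rule
// would be needed; an internal ELB can be scoped to the nodes' group.
func kopsControllerSource(lbIsPublic bool, nodeSG string) IngressSource {
	if lbIsPublic {
		return IngressSource{CIDR: "0.0.0.0/0"}
	}
	return IngressSource{SourceGroup: nodeSG}
}

func main() {
	fmt.Println(kopsControllerSource(true, "sg-nodes"))
	fmt.Println(kopsControllerSource(false, "sg-nodes"))
}
```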

Member Author

I'd prefer not to expose the port externally if possible. While the authentication is strong, there are denial of service considerations.

Contributor

Per #10139 (comment), I now think we're fixing the wrong problem here.

Member

If someone asks for a public LB and UseForInternalApi, I think we should allow access from 0.0.0.0/0, same as for the API.
Or maybe not allow UseForInternalApi in this case, as it won't work anyway.
I don't have a strong preference in general here, so feel free to ignore my comment.

Contributor

I want to make sure I understand your second proposal: Could kops reject "useForInternalApi" as invalid when the load balancer is public? I think that's the best choice, if it's feasible.
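The rejection being proposed could look roughly like this. A minimal sketch, assuming a simplified spec shape; the type and field names here are hypothetical, not kops's real API types or its actual validation code.

```go
package main

import (
	"errors"
	"fmt"
)

// LoadBalancerSpec is a simplified stand-in for the API load balancer
// configuration discussed in this thread.
type LoadBalancerSpec struct {
	Type              string // "Public" or "Internal"
	UseForInternalAPI bool
}

// validate rejects useForInternalApi on a public load balancer, since
// node traffic reaching a public ELB arrives via the NAT gateway and
// cannot be scoped to the nodes' security group.
func validate(lb LoadBalancerSpec) error {
	if lb.UseForInternalAPI && lb.Type == "Public" {
		return errors.New("spec.api.loadBalancer.useForInternalApi requires an Internal load balancer")
	}
	return nil
}

func main() {
	fmt.Println(validate(LoadBalancerSpec{Type: "Public", UseForInternalAPI: true}))
	fmt.Println(validate(LoadBalancerSpec{Type: "Internal", UseForInternalAPI: true}))
}
```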

Contributor

I liked the option—clearly, as I tried to enable it—for accessing the Kubernetes API servers, not really thinking about whether the load balancer was public or not. At that point, though, I didn't realize that it would be used for other things like this node bootstrapping.

And did this other use start in kops 1.19? I didn't experience this problem with kops 1.18.2.

@johngmyers johngmyers changed the title Open ELB to kops-controller port when using it for internal API WIP Open ELB to kops-controller port when using it for internal API Oct 31, 2020
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 31, 2020
@johngmyers
Copy link
Member Author

johngmyers commented Oct 31, 2020

Looks like the choices might be to expose the port externally, pay for a second, internal load balancer, or to use a dns-controller domain regardless of the UseForInternalAPI setting.

@johngmyers
Copy link
Member Author

Going with a separate dns-controller-managed domain instead, in #10239

@johngmyers johngmyers closed this Nov 14, 2020
@johngmyers johngmyers deleted the internal-api-elb branch November 15, 2020 03:21
Successfully merging this pull request may close these issues.

Enabling "spec.api.loadBalancer.useForInternalApi" requires access to kops controller port through API load balancer