Description
/kind bug
1. What kops version are you running? The command kops version will display this information.
Testing upgrade from Client version: 1.29.2 (git-v1.29.2)
to Client version: 1.30.1 (git-v1.30.1)
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.
v1.29.9
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
kops_v1.30.1 update cluster
No other changes were made to the manifest or environment; the only difference is running the newer kops binary.
5. What happened after the commands executed?
$ export AWS_PROFILE=company-name-dev3
$ kops_v1.30.1 update cluster
SDK 2024/09/20 14:31:06 DEBUG request failed with unretryable error https response error StatusCode: 403, RequestID: 623bd87e-11e1-4b06-9f16-10f60ba2f030, api error AccessDenied: User: arn:aws:sts::[redacted]006:assumed-role/OrganizationAccountAccessRole/aws-go-sdk-1726842666098977639 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::[redacted]006:role/OrganizationAccountAccessRole
Error: error determining default DNS zone: error querying zones: error listing hosted zones: operation error Route 53: ListHostedZones, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 623bd87e-11e1-4b06-9f16-10f60ba2f030, api error AccessDenied: User: arn:aws:sts::[redacted]006:assumed-role/OrganizationAccountAccessRole/aws-go-sdk-1726842666098977639 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::[redacted]006:role/OrganizationAccountAccessRole
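One detail in the error above may be worth calling out: the caller is already a session of OrganizationAccountAccessRole in account [redacted]006, and it is being denied sts:AssumeRole on that same role in that same account. One possible reading is that the role is being assumed a second time from credentials that already belong to it. The Python sketch below only makes that comparison explicit by parsing the two ARNs out of the message; `is_self_assumption` is a hypothetical helper name, not part of kops or the AWS SDK, and the regex is an assumption that tolerates the redacted account ID.

```python
import re

# Matches both STS assumed-role ARNs and IAM role ARNs. The account ID is
# captured loosely because the report redacts part of it.
ARN_RE = re.compile(
    r"arn:aws:(?P<service>sts|iam)::(?P<account>[^:]+):"
    r"(?P<kind>assumed-role|role)/(?P<role>[^/\s]+)"
)


def is_self_assumption(error_text: str) -> bool:
    """Return True if the caller is already a session of the role it tries to assume."""
    caller = target = None
    for m in ARN_RE.finditer(error_text):
        if m["kind"] == "assumed-role" and caller is None:
            caller = (m["account"], m["role"])  # who made the request
        elif m["kind"] == "role" and target is None:
            target = (m["account"], m["role"])  # what it tried to assume
    return caller is not None and caller == target


# Excerpt of the AccessDenied message from the report.
ERROR = (
    "api error AccessDenied: User: "
    "arn:aws:sts::[redacted]006:assumed-role/OrganizationAccountAccessRole/"
    "aws-go-sdk-1726842666098977639 is not authorized to perform: sts:AssumeRole "
    "on resource: arn:aws:iam::[redacted]006:role/OrganizationAccountAccessRole"
)

print(is_self_assumption(ERROR))
```

If this prints True, the failing call is the role trying to assume itself, which the role's trust policy would not normally allow. That would be consistent with, but does not prove, a double application of role_arn.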
6. What did you expect to happen?
With kops-1.29.2, the output shows the proposed changes, which then need to be applied with --yes.
The AWS CLI successfully lists the Route 53 zones from the same shell:
$ aws route53 list-hosted-zones
{
    "HostedZones": [
        {
            "Id": "/hostedzone/Z0[redacted]",
            "Name": "k8s.dev3.us-west-2.example.com.",
            "CallerReference": "8e483d8f-0d3c-4bcc-9c68-ecb4dea807ae",
            "Config": {
                "PrivateZone": false
            },
            "ResourceRecordSetCount": 8
        }
    ]
}
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.
8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
https://gist.github.com/vitaliyf/cfddd9ad771ee613ee850bb9e2d3fe14
9. Anything else we need to know?
$ cat ~/.aws/config
[default]
region = us-west-2
[profile company-name]
aws_account_id = company-name
region = us-west-2
output = json
color = ff0000
[profile company-name-dev1]
role_arn = arn:aws:iam::[redacted]385:role/OrganizationAccountAccessRole
source_profile = company-name
[profile company-name-dev2]
role_arn = arn:aws:iam::[redacted]813:role/OrganizationAccountAccessRole
source_profile = company-name
color = 00ff00
[profile company-name-dev3]
role_arn = arn:aws:iam::[redacted]006:role/OrganizationAccountAccessRole
source_profile = company-name
color = 0000ff
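For comparison with the error, the config above appears to describe a single hop for company-name-dev3: credentials from the company-name profile assume OrganizationAccountAccessRole in account [redacted]006 once. The Python sketch below walks the source_profile chain to count the expected hops. It is a simplified model of profile chaining, not the resolution code of the AWS CLI or aws-sdk-go-v2; `assume_role_chain` is a hypothetical helper, and only the relevant profiles are reproduced.

```python
import configparser

# Relevant profiles copied from the ~/.aws/config shown in the report.
CONFIG = """
[profile company-name]
aws_account_id = company-name
region = us-west-2
output = json
color = ff0000
[profile company-name-dev3]
role_arn = arn:aws:iam::[redacted]006:role/OrganizationAccountAccessRole
source_profile = company-name
color = 0000ff
"""


def assume_role_chain(config_text: str, profile: str) -> list[str]:
    """Return the role ARNs assumed when resolving `profile`, nearest hop first.

    Simplification: every profile is expected under a "[profile NAME]" header,
    so a source_profile named "default" is not handled here.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    chain, seen = [], set()
    section = f"profile {profile}"
    while section in parser and section not in seen:
        seen.add(section)  # guard against circular source_profile references
        role_arn = parser[section].get("role_arn")
        if role_arn:
            chain.append(role_arn)
        source = parser[section].get("source_profile")
        if source is None:
            break
        section = f"profile {source}"
    return chain


chain = assume_role_chain(CONFIG, "company-name-dev3")
print(len(chain), chain)
```

Under this model only one AssumeRole call is expected, so a second call made by the already-assumed role, as in the error output, would not be something the config asks for.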
This cluster has been continuously upgraded one kops/Kubernetes version at a time for at least a couple of years, so it is pretty routine for us to test and execute such upgrades in place.
I looked around, and I suspect this is related to the aws-sdk-go-v2 upgrade.
For example, aws-sdk-go-v2 has this issue: aws/aws-sdk-go-v2#2686. Coincidentally or not, that ticket is referenced by cert-manager/cert-manager#7236, where they are also dealing with a "Missing Region" error, just like #16645 from kops-1.30.0.