Improve E2E test coverage

**1. Describe IN DETAIL the feature/behavior/change you would like to see.**
Kops has many periodic prow jobs that run the kubernetes e2e suite against various combinations of k8s versions, OS, CNI, cloud provider, and CPU architecture.

However, a lot of kops functionality is not covered by e2e tests:

- Upgrading & downgrading Kops
- Upgrading & downgrading Kubernetes / etcd / CNI providers
- Applying & upgrading & downgrading all of the miscellaneous "bootstrapchannelbuilder" manifests
- Terraform & Cloudformation targets
- Rolling Updates
- Almost all feature flags
- Many API fields that aren't settable via `kops create cluster`, requiring a manifest file and `kops replace -f`
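To make the upgrade and downgrade items above concrete, here is a minimal sketch of how a runner might enumerate version transitions to test. The `scenario` type, the `transitions` function, and the version strings are all hypothetical placeholders for illustration. None of them exist in kops or kubetest.

```go
package main

import "fmt"

// scenario is a hypothetical description of one version-transition test.
type scenario struct {
	component string // for example "kops" or "kubernetes"
	from, to  string
}

// transitions returns an upgrade and a downgrade scenario for every pair of
// adjacent entries in versions, which must be ordered oldest to newest.
func transitions(component string, versions []string) []scenario {
	var out []scenario
	for i := 0; i+1 < len(versions); i++ {
		older, newer := versions[i], versions[i+1]
		out = append(out,
			scenario{component, older, newer}, // upgrade
			scenario{component, newer, older}, // downgrade
		)
	}
	return out
}

func main() {
	// Placeholder versions, not a claim about which versions should be covered.
	for _, s := range transitions("kubernetes", []string{"1.17", "1.18", "1.19"}) {
		fmt.Printf("%s: %s -> %s\n", s.component, s.from, s.to)
	}
}
```

Each generated scenario would still need its own cluster lifecycle (create, change version, validate, test), which is the part kubetest does not model today.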

Adding test coverage for these scenarios would require significant changes to kubetest (our e2e runner), which is in [maintenance mode](https://github.com/kubernetes/test-infra/tree/master/kubetest#deprecation-notice), so I'm proposing that we reevaluate how we run our e2e jobs. https://github.com/kubernetes/kops/pull/8605 was opened as an experiment, but I think we should settle on a design proposal before we merge anything.
 

The kubetest logic that we currently use involves:

1. download or build the appropriate kops version
2. download and extract the appropriate k8s version
3. choose a region
4. run `kops create cluster` with the appropriate flags and wait for validation to pass
5. download and run the appropriate k/k e2e test suite
6. dump node logs and troubleshooting info
7. tear down the cluster
8. publish the kops "version marker" if applicable
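For discussion, the steps above could be modeled in a kops-owned runner roughly as follows. This is only a sketch: the `step` type and `runPipeline` function are hypothetical names, the step bodies are stubs, and treating log dumping and teardown as cleanup that always runs is my assumption about the desired behavior, not a description of how kubetest is implemented.

```go
package main

import "fmt"

// step is one stage of a hypothetical e2e run.
type step struct {
	name string
	run  func() error
}

// runPipeline executes steps in order and stops at the first failure.
// The cleanup steps always run afterwards, even when a step failed, so that
// logs are still dumped and the cluster is still torn down.
// It returns the names of everything that ran and the first error seen.
func runPipeline(steps, cleanup []step) (ran []string, err error) {
	for _, s := range steps {
		ran = append(ran, s.name)
		if e := s.run(); e != nil {
			err = fmt.Errorf("step %q failed: %w", s.name, e)
			break
		}
	}
	for _, s := range cleanup {
		ran = append(ran, s.name)
		if e := s.run(); e != nil && err == nil {
			err = fmt.Errorf("cleanup %q failed: %w", s.name, e)
		}
	}
	return ran, err
}

// ok and fail are stubs standing in for real work.
func ok() error   { return nil }
func fail() error { return fmt.Errorf("simulated failure") }

func main() {
	steps := []step{
		{"acquire kops", ok},
		{"acquire kubernetes", ok},
		{"choose region", ok},
		{"create and validate cluster", ok},
		{"run e2e suite", fail}, // simulate a failing test run
	}
	cleanup := []step{
		{"dump logs", ok},
		{"tear down cluster", ok},
	}
	ran, err := runPipeline(steps, cleanup)
	fmt.Println("ran:", ran)
	if err != nil {
		fmt.Println("result:", err)
		return
	}
	// Assumption: the version marker is only published after a clean run.
	fmt.Println("publish version marker")
}
```

A structure like this would let kops-specific scenarios (upgrades, rolling updates, `kops replace -f`) be added as extra steps between cluster creation and the test suite without touching shared tooling.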

Of the above steps, fewer than half are also used by other projects. To cover the areas mentioned above, we would need to add a significant amount of kops-specific testing logic. I'm wondering if it makes more sense to have our test runner code live in the kops repo rather than test-infra. We could consider vendoring the necessary kubetest (or kubetest2) logic for some of the above steps, but keeping most of the development in the kops repo lets us iterate faster because changes would not require approvals from outside our OWNERS. It also reduces the potential impact on, and the compromises required by, a "shared" tool like kubetest2.

**2. Feel free to provide a design supporting your feature request.**

Some concerns:
* Do we continue to use Go for consistency? It would allow us to vendor certain logic from test-infra. If we choose a different language or technology, such as Jupyter, it would take additional effort to get it working in prow jobs.
* Does each job compile the runner code each time, or do we publish a runner artifact/image similar to our existing kops version markers?


Anyone have any other thoughts or suggestions?

Improve E2E test coverage #9598
