
Improve E2E test coverage #9598

@rifelpet

1. Describe IN DETAIL the feature/behavior/change you would like to see.
Kops has many periodic Prow jobs that run the Kubernetes e2e suite against various combinations of k8s version, OS, CNI, cloud provider, and CPU architecture.
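
To make the size of that job matrix concrete, here is a minimal sketch that enumerates every combination of a few dimensions. The `Dimension` type, the `Combinations` helper, and all values shown are hypothetical illustrations, not the actual kops job definitions.

```go
package main

import "fmt"

// Dimension is one axis of a hypothetical job matrix (e.g. CNI or arch).
type Dimension struct {
	Name   string
	Values []string
}

// Combinations returns the cartesian product of all dimensions,
// one map per job, keyed by dimension name.
func Combinations(dims []Dimension) []map[string]string {
	combos := []map[string]string{{}}
	for _, d := range dims {
		var next []map[string]string
		for _, c := range combos {
			for _, v := range d.Values {
				// Copy the partial combination and extend it with this value.
				m := make(map[string]string, len(c)+1)
				for k, val := range c {
					m[k] = val
				}
				m[d.Name] = v
				next = append(next, m)
			}
		}
		combos = next
	}
	return combos
}

func main() {
	// Illustrative values only; the real jobs cover more dimensions.
	dims := []Dimension{
		{Name: "k8s", Values: []string{"1.18", "1.19"}},
		{Name: "cni", Values: []string{"calico", "cilium"}},
		{Name: "arch", Values: []string{"amd64", "arm64"}},
	}
	for _, job := range Combinations(dims) {
		fmt.Println(job)
	}
}
```

Each added dimension multiplies the number of jobs, which is one reason the existing jobs only vary cluster creation flags and not the lifecycle operations listed below.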

However, a lot of kops functionality is not covered by e2e tests:

  • Upgrading & downgrading Kops
  • Upgrading & downgrading Kubernetes / etcd / CNI providers
  • Applying & upgrading & downgrading all of the miscellaneous "bootstrapchannelbuilder" manifests
  • Terraform & Cloudformation targets
  • Rolling Updates
  • Almost all feature flags
  • Many API fields that aren't settable via kops create cluster and therefore require a manifest file and kops replace -f

Adding test coverage for these scenarios would require significant changes to kubetest (our e2e runner), which is in maintenance mode, so I'm proposing that we reevaluate how we run our e2e jobs. #8605 was opened as an experiment, but I think we should settle on a design proposal before we merge anything.

The kubetest logic that we currently use involves:

  1. download or build the appropriate kops version
  2. download and extract the appropriate k8s version
  3. choose a region
  4. run kops create cluster with the appropriate flags and wait for validation to pass
  5. download and run the appropriate k/k e2e test suite
  6. dump node logs and troubleshooting info
  7. tear down the cluster
  8. publish the kops "version marker" if applicable

Of the above steps, fewer than half are also used by other projects. To cover the areas mentioned above, we would need to add a significant amount of kops-specific testing logic. I'm wondering if it makes more sense to have our test runner code live in the kops repo rather than test-infra. We could consider vendoring the necessary kubetest (or kubetest2) logic for some of the above steps, but keeping most of the development in the kops repo allows us to iterate faster by not requiring approvals from outside our OWNERS. It also reduces the potential impact on, and the compromises required of, a "shared" tool like kubetest2.

2. Feel free to provide a design supporting your feature request.

Some concerns:

  • Do we continue to use Go for consistency? It would allow us to vendor certain logic from test-infra. If we choose a different language or technology like Jupyter, it would involve additional effort to get working in Prow jobs.
  • Does each job compile the runner code each time, or do we publish a runner artifact/image similar to our existing kops version markers?

Does anyone have any other thoughts or suggestions?

Labels: area/testing, lifecycle/rotten (denotes an issue or PR that has aged beyond stale and will be auto-closed)