Skip to content

Conversation

sdake
Copy link
Member

@sdake sdake commented Oct 14, 2019

This adds a --use_operator flag to the e2e tests. The e2e tests
expect a different component to store a manifest for deployment
in the $ISTIO/install/kubernetes directory.

One aspect of this test case is incorrect. Deletion of the operator
appears not to work within the test framework. The namespace locks
up with the namespace in the terminating state.

Please provide a description for what this PR is for.

And to help us figure out who should review this PR, please
put an X in all the areas that this PR affects.

[ ] Configuration Infrastructure
[ ] Docs
[ ] Installation
[ ] Networking
[ ] Performance and Scalability
[ ] Policies and Telemetry
[ ] Security
[ ] Test and Release
[ ] User Experience
[ ] Developer Infrastructure

This adds a --use_operator flag to the e2e tests. The e2e tests
expect a different component to store a manifest for deployment
in the $ISTIO/install/kubernetes directory.

One aspect of this test case is incorrect. Deletion of the operator
appears not to work within the test framework. The namespace locks
up with the namespace in the terminating state.
@sdake sdake requested a review from a team as a code owner October 14, 2019 19:36
@googlebot googlebot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Oct 14, 2019
@istio-testing istio-testing added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 14, 2019
sdake pushed a commit to sdake/operator that referenced this pull request Oct 14, 2019
This work runs e2e testing using end to end testing against the
istio/istio repository.

To run - use `make e2e`.

Depends-On: istio/istio#17864
log.Errorf("Failed to delete istio-operator namespace.")
return err
}
if _, err := util.Shell("kubectl delete ns %s --grace-period=0 --force --kubeconfig=%s",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why force?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not really sure - operations determined by others. Will dig into that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grace period and force have been removed.

@howardjohn
Copy link
Member

howardjohn commented Oct 14, 2019 via email

@@ -605,7 +628,16 @@ func (k *KubeInfo) Teardown() error {
validatingWebhookConfigurationExists := false
log.Infof("Deleting namespace %v", k.Namespace)
for attempts := 1; attempts <= maxAttempts; attempts++ {
namespaceDeleted, _ = util.NamespaceDeleted(k.Namespace, k.KubeConfig)
if *useOperator {
if _, err := util.Shell("kubectl delete ns istio-operator --grace-period=0 --force",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k.namespace and k.kubeconfig not passed in

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k.namesapace is in this case istio-system. k.kubeconfig in this case maps to k.namespace. There is no context for these, hen ce why they are not passed into the shell util command since they are not absolutely needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but you are deleting istio-operator ns here? also why we don't remove the extra args if we don't pass in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why force and no grace period?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the non operator code this is checking if the namespace was deleted, but here we are deleting it. Don't we already delete it above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed force and grace period. The logic sets namespaceDeleted=true only if the namespace was deleted. The same logic checks that the namespace is deleted:

                if namespaceDeleted && !validatingWebhookConfigurationExists {
                        break
                }       

This basically just adds logic for the case the --use_operator is set.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the non operator code this is checking if the namespace was deleted, but here we are deleting it. Don't we already delete it above?

I see, I misunderstood the purpose of the function call. I will check the logic again and submit a followup.

@sdake
Copy link
Member Author

sdake commented Oct 14, 2019

I don't know the context behind the finalizer change, but something seems off about the solution. We should make sure to revisit that (unless this was already well thought through by someone, in which case we should document the rationale)

The finalizer hacking doesn't work. If I run it on the shell, all is grand. if run from the test cases, it has no effect - possibly because of parsing of the text string in the shell command.

More importantly we need a better way of handling deletion then no way at all (which is what we currently have). I am simply attempting to make an e2e test that tests the system. I could remove the deletion if you prefer.

Cheers
-steve

@sdake
Copy link
Member Author

sdake commented Oct 15, 2019

Regarding the discussion around deletion in this PR, the deletion has been a sticking point since Friday afternoon after the first standalone operator was birthed. I'd recommend we merge this PR and i will follow up with a proper delete that matches how we think the operator should delete Istio.

FWIW, We believe we have the delete working properly and none of this complex deletion procedure is needed, but we want to get the gate moving.

Cheers
-steve

Copy link
Member

@howardjohn howardjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, a few comments

@@ -605,7 +628,16 @@ func (k *KubeInfo) Teardown() error {
validatingWebhookConfigurationExists := false
log.Infof("Deleting namespace %v", k.Namespace)
for attempts := 1; attempts <= maxAttempts; attempts++ {
namespaceDeleted, _ = util.NamespaceDeleted(k.Namespace, k.KubeConfig)
if *useOperator {
if _, err := util.Shell("kubectl delete ns istio-operator --grace-period=0 --force",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why force and no grace period?

@@ -605,7 +628,16 @@ func (k *KubeInfo) Teardown() error {
validatingWebhookConfigurationExists := false
log.Infof("Deleting namespace %v", k.Namespace)
for attempts := 1; attempts <= maxAttempts; attempts++ {
namespaceDeleted, _ = util.NamespaceDeleted(k.Namespace, k.KubeConfig)
if *useOperator {
if _, err := util.Shell("kubectl delete ns istio-operator --grace-period=0 --force",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the non operator code this is checking if the namespace was deleted, but here we are deleting it. Don't we already delete it above?

@@ -754,12 +809,6 @@ func (k *KubeInfo) deployIstio() error {
}
}

// Create istio-system namespace
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may break the CNI stuff below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two branches. --use_operator and not. I have not implemented for the scenario where cni is set and operator is set. IMO if we we with to test this scenario, that is something we could tackle as follow-on work. I will post results of e2e.sh tested against this PR for make e2e with the cni repo though.

Cheers
-steve

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats not what I mean, for the CNI test before we create the namespace, then install CNI in the namespace. Now we install cni in the namespace, then create the namespace. This may not work, unless CNI is also creating the namespace?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am running e2e cni now. Actually I am building a buildtools image because e2e for cni repo is broken as a result of a recent build tools change around the /home dir. (the cni tests create /home/.helm). I will report back the results as well as my diff.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e2e cni image regression fix: istio/tools#441

Copy link
Member Author

@sdake sdake Oct 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my cni diff tested with:

sdake@beast-01:~/go/src/istio.io/cni$ git diff
diff --git a/Makefile b/Makefile
index cadc120..6da2153 100644
--- a/Makefile
+++ b/Makefile
@@ -56,7 +56,7 @@ ifeq ($(BUILD_WITH_CONTAINER),1)
 export TARGET_OUT = /work/out/$(TARGET_ARCH)_$(TARGET_OS)
 CONTAINER_CLI ?= docker
 DOCKER_SOCKET_MOUNT ?= -v /var/run/docker.sock:/var/run/docker.sock
-IMG ?= gcr.io/istio-testing/build-tools:2019-10-11T13-37-52
+IMG ?= gcr.io/istio-testing/build-tools:2019-10-15T17-10-27
 UID = $(shell id -u)
 PWD = $(shell pwd)
 
diff --git a/test/e2e.sh b/test/e2e.sh
index 8b43027..835830b 100755
--- a/test/e2e.sh
+++ b/test/e2e.sh
@@ -59,7 +59,13 @@ kubectl get pods --all-namespaces -o wide
 ISTIO_DIR="${GOPATH}/src/istio.io/istio"
 if [[ ! -d "${ISTIO_DIR}" ]]
 then
-  git clone https://github.com/istio/istio.git "${ISTIO_DIR}"
+  mkdir -p ${GOPATH}/src/istio.io
+  cd ${GOPATH}/src/istio.io
+  git clone https://github.com/sdake/istio.git "${ISTIO_DIR}"
+  pushd "${ISTIO_DIR}"
+  git fetch origin
+  git checkout operator_e2e
+  popd
 fi
 pushd "${ISTIO_DIR}" || exit
   make istioctl

Copy link
Member Author

@sdake sdake Oct 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just recognized the e2e tests in BUILD_WTIH_CONTAINER in the istio/cni repo use cached repos. I am in the process of a reboot so I may docker volume prune.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like cni tests pass. Makes me question the results of the cni e2e test cases, although I have not investigated in much detail what the local test does. What would you recommend next?

@@ -55,6 +56,7 @@ const (
mcAuthInstallFileNamespace = "istio-auth-multicluster.yaml"
mcRemoteInstallFile = "istio-remote.yaml"
mcSplitHorizonInstallFile = "istio-multicluster-split-horizon.yaml"
authOperatorInstallFile = "istio-operator.yaml"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file is created by the e2e.sh script in istio/operator#379

Cheers
-steve

@istio-testing istio-testing merged commit 88439bb into istio:master Oct 16, 2019
sdake pushed a commit to sdake/operator that referenced this pull request Oct 17, 2019
This work runs e2e testing using end to end testing against the
istio/istio repository.

To run - use `make e2e`.

Depends-On: istio/istio#17864
istio-testing pushed a commit to istio/operator that referenced this pull request Oct 17, 2019
* cross-repo e2e gating for the operator

This work runs e2e testing using end to end testing against the
istio/istio repository.

To run - use `make e2e`.

Depends-On: istio/istio#17864

* Address reviewer comments.

* Revert all gomod changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants