Skip to content

Subscription Channel ref without api version #5131

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

slinkydeveloper
Copy link
Contributor

@slinkydeveloper slinkydeveloper commented Mar 23, 2021

Signed-off-by: Francesco Guardiani francescoguard@gmail.com

Fixes #5086

Proposed Changes

  • 🎁 Refer to channels without specifying api version

Additional context

The idea of this PR is described here: #5086 (comment)
In short, when the "concrete api version" of a CRD is missing, it queries the resource definition to check what's the storage version and, using the storage version, it performs the reconciliation.

Example log creating a subscription:

{"level":"info","ts":"2021-03-23T12:53:58.733Z","logger":"controller","caller":"subscription/subscription.go:330","msg":"Getting channel","commit":"c039526","knative.dev/pod":"eventing-controller-69df4cd754-hgjwr","knative.dev/controller":"knative.dev.eventing.pkg.reconciler.subscription.Reconciler","knative.dev/kind":"messaging.knative.dev.Subscription","knative.dev/traceid":"9f407704-fd86-4c80-a0b9-b3bcea8121a2","knative.dev/key":"default/with-dead-letter-sink","channel":{"kind":"InMemoryChannel","name":"my-channel","apiVersion":"messaging.knative.dev"}}
{"level":"info","ts":"2021-03-23T12:53:58.733Z","logger":"controller","caller":"subscription/subscription.go:337","msg":"Resolved channel ref","commit":"c039526","knative.dev/pod":"eventing-controller-69df4cd754-hgjwr","knative.dev/controller":"knative.dev.eventing.pkg.reconciler.subscription.Reconciler","knative.dev/kind":"messaging.knative.dev.Subscription","knative.dev/traceid":"9f407704-fd86-4c80-a0b9-b3bcea8121a2","knative.dev/key":"default/with-dead-letter-sink","channel":{"kind":"InMemoryChannel","name":"my-channel","apiVersion":"messaging.knative.dev/v1"}}

This PR is a working PoC and can be merged as is, although I prefer to refactor https://github.com/knative/eventing/pull/5131/files#diff-f4ab96e0544e474732f224fa784e19598eacbbecd9db8944a4eebbc72c785d35R532 to a more proper place where we can use the same logic for whatever other place we use refs: maybe somewhere in pkg?

Signed-off-by: Francesco Guardiani <francescoguard@gmail.com>
@google-cla google-cla bot added the cla: yes Indicates the PR's author has signed the CLA. label Mar 23, 2021
@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please ask for approval from slinkydeveloper after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 23, 2021
@slinkydeveloper slinkydeveloper changed the title [PoC] Subscription Channel ref without api version Subscription Channel ref without api version Mar 23, 2021
Signed-off-by: Francesco Guardiani <francescoguard@gmail.com>
@aliok
Copy link
Member

aliok commented Mar 23, 2021

The PR looks good.

I won't approve it yet because of the discussion in #5086 though.

Signed-off-by: Francesco Guardiani <francescoguard@gmail.com>
@aliok
Copy link
Member

aliok commented Mar 23, 2021

BTW, how hard would it be to use latest served version instead of storage version?

There can be multiple served versions and they're strings. Some semver magic?

@knative-metrics-robot
Copy link

The following is the coverage report on the affected files.
Say /test pull-knative-eventing-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/subscription/subscription.go 86.9% 85.8% -1.1

@codecov
Copy link

codecov bot commented Mar 23, 2021

Codecov Report

Merging #5131 (34dd074) into main (c039526) will decrease coverage by 0.05%.
The diff coverage is 61.90%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5131      +/-   ##
==========================================
- Coverage   83.26%   83.20%   -0.06%     
==========================================
  Files         283      283              
  Lines        7957     7977      +20     
==========================================
+ Hits         6625     6637      +12     
- Misses        958      963       +5     
- Partials      374      377       +3     
Impacted Files Coverage Δ
pkg/reconciler/subscription/subscription.go 80.39% <60.00%> (-1.82%) ⬇️
pkg/reconciler/subscription/controller.go 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c039526...34dd074. Read the comment docs.

@matzew
Copy link
Member

matzew commented Mar 23, 2021

Looks simple enough to tackle the problem. Perhaps would be good to continue chatting about it on the issue itself

return version.Name, nil
}
}
return "", fmt.Errorf("this CRD %s doesn't have a storage version! Kubernetes, you're drunk, go home", crd.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold

@knative-prow-robot knative-prow-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 23, 2021
return err
}

actualGvk.Version, err = findCRDStorageVersion(crd)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this will do what you want. What you are trying to do is find the version of the channel that implements the duck type the subscription controller assumes to use, but this is not the way to do it. The stored version means nothing to the duck type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you are trying to do is find the version of the channel that implements the duck type the subscription controller assumes to use, but this is not the way to do it

What is the proper way, if there is any?

The stored version means nothing to the duck type.

From my understanding, the duck type doesn't care neither of the stored version, nor the original version used by the user to create the resource.

To be more specific: here I'm not performing discovery of duck types, nor looking for the duck type version etc. The user gives me a concrete channel kind and api group and I select for her/him the version, that's it. For what is worth, we could even pick just the first or the last version, since which version to pick is not important.

I query the storage version because, at the end of the day, this is the ultimate source of truth for k8s regarding that particular resource and skips some conversions in the middle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proper way is to specify the API version that you want to use. The label on the resource says which duck type if conforms to. Storage version is totally irrelevant, just in this case you happen to know that this version conforms to a duck type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea in this PR:

Kind: Subscription
...
Spec:
  Channel:
    API Version:  messaging.knative.dev
    Kind:         KafkaChannel
    Name:         testchannel

When the controller sees that there's no version in the "API version" field, it goes and checks the versions of this resource. It transiently sets the storage version of the resource on the reference. So, in memory, it becomes like this:

Kind: Subscription
...
Spec:
  Channel:
    API Version:  messaging.knative.dev/v1alpha1
    Kind:         KafkaChannel
    Name:         testchannel

And then, it fetches the channel as messaging.knative.dev/v1alpha1 KafkaChannel.

In KafkaChannel, we have v1alpha1 API supporting v1alpha1 subscribable duck and we have v1beta1 API supporting v1 subscribable duck.

When it fetches the channel as messaging.knative.dev/v1alpha1, doesn't it get v1alpha1 in the duck annotation? And then based on that the subscriber reconciler would operate as if this channel ref is a ref to a channel that's supporting v1alpha1 duck type.

This is why the storage version is used. If there's no version is specified, it just falls back to storage version to be able to operate with that resource.

What you are trying to do is find the version of the channel that implements the duck type the subscription controller assumes to use, but this is not the way to do it.

Nope, just find an arbitrary version (in this case, getting the storage version makes the reasonable arbitrary version to use) and then let the subscription controller take it from there. It can handle whatever duck version is specified in the resource fetched as storage version.

@n3wscott
Copy link
Contributor

We should not change the meaning of apiVersion. Just make the subscription go ready=false and an operator will need to step in and fix it.

@slinkydeveloper
Copy link
Contributor Author

slinkydeveloper commented Mar 24, 2021

We should not change the meaning of apiVersion. Just make the subscription go ready=false and an operator will need to step in and fix it.

@n3wscott Which operator are you talking about? And why this should be owned by a separate controller? We can just apply this logic to every controller that interacts with ref, without any external interaction needed.

Also some refs are immutable (the subscription channel ref, IIRC, is one of them), so we can't really do that without changing the mutability constraints.

@n3wscott
Copy link
Contributor

@n3wscott Which operator are you talking about? And why this should be owned by a separate controller?

I am talking about likely a human, maybe a robot that updates the links via a crawl.

We can just apply this logic to every controller that interacts with ref, without any external interaction needed.

This violates the principles of kubernetes. The spec says do this thing, point to this thing, and the spec has a version that is invalid. We can make no assumption about the intent. And we can't pick a version out of the hat and hope it will work out, or refs point to duck types and those have versions. The spec says what it says and it should be the truth.

Also some refs are immutable (the subscription channel ref, IIRC, is one of them), so we can't really do that without changing the mutability constraints.

That is a consequence of versioning and we provide depreciation policies to address this, the operator should make a new subscription with the correct version while upgrading. Hacking what the spec says and doing something different is not what we should get into the business of.

If I dep on package v1 and v1 is no longer published, I expect my update and vendor to fail, not auto-upgrade me to v2 and hope for the best.

@slinkydeveloper
Copy link
Contributor Author

I am talking about likely a human, maybe a robot that updates the links via a crawl.
Also some refs are immutable (the subscription channel ref, IIRC, is one of them), so we can't really do that without changing the mutability constraints.

Except the human can't do that without removing the subscription and re-creating it, because of the mutability constraints we have now.

Removing and re-creating the subscription IMO is not an answer here, because it disrupts the service and it's a bad UX: why the user should care at all about this? As a user, I'm telling the system to refer to x.y.z, it's up to the system itself to figure out how to communicate with x.y.z.

This violates the principles of kubernetes. The spec says do this thing, point to this thing, and the spec has a version that is invalid.

We can either replace or extend that type and use one made by us, where user can just specify the api group and the kind.

And we can't pick a version out of the hat and hope it will work out, or refs point to duck types and those have versions

That's at the end of the day what happens in the api server: whatever version the user is pushing, only the storage one will be saved. I can push an IMC with version v1beta1 and then refer to it in a sub with v1 or viceversa. For any duck controller this doesn't have any meaning, the duck controller just cares about using one version, our conversion wehooks will do the rest.

We're talking about references and the fact that they don't care about the referred resource version. The duck contract and their version conversions already take care of all of that complexity, that's why we have them, right? So why do we need to specify here the api version? Isn't that redundant?

If I dep on package v1 and v1 is no longer published, I expect my update and vendor to fail, not auto-upgrade me to v2 and hope for the best.

The referred resource upgrade won't be handled by this code. As soon as the CRD has a storage version (and it must have it by the CRD contract), this code is fine. If a vendor fix or doesn't fix its channel apis when upgrading is a concern not related at all to the reference system, it's an orthogonal problem IMO.

When you say If I dep on package v1 I think you're looking at the wrong analogy. I see it in another way: refs in yamls are the actual functions linked and kubernetes acts as the "build tool" that is able to find out library versions. When you code, you care about the function name, so ref doesn't care about the group version.

@vaikas
Copy link
Contributor

vaikas commented Mar 29, 2021

So, the resource can have multiple versions. Conversion webhook converts between those versions. So if you ask for v1beta1 or v1 regardless of what storage version you have, you get the correct version.
Version of the resource (regardless of which version is stored) might conform to a duck type. So, original subscription controller had logic to deal with different versions of the duck, but this can't and shouldn't be baked into the subscription controller as to which resource version conforms to which duck, and instead should be discovered when you fetch a resource.
So, I'm having a hard time understanding why we want to use the storage version for anything because it is irrelevant to how you read the resource or which duck type it conforms to.

@n3wscott
Copy link
Contributor

So, I'm having a hard time understanding why we want to use the storage version for anything because it is irrelevant to how you read the resource or which duck type it conforms to.

I think the reason is this PR allows an operator (human) to make a Subscription to a Channel with the incorrect version and it will get "corrected" by pulling the stored version out of the channel CRD.

The issue is, we are now assuming what that human wanted. They should know if they made a bad subscription because the subscription should report an error for not loading the correct channel. The consequence of this PR is I can fat finger any version into the subscription channel ref and it will likely work if I did not mess too much up. This is wrong, we should not assume what the operator wants to do.

There is a fine line between declarative apis and bespoke APIs.

@slinkydeveloper
Copy link
Contributor Author

slinkydeveloper commented Mar 30, 2021

So, I'm having a hard time understanding why we want to use the storage version for anything because it is irrelevant to how you read the resource or which duck type it conforms to.

There is not particular reason to choose the storage version, except avoiding unnecessary conversions. I could have picked whatever version of the CRD, like you sad. The point of this PR it to let user refer to resources without specifying api version.

I think the reason is this PR allows an operator (human) to make a Subscription to a Channel with the incorrect version and it will get "corrected" by pulling the stored version out of the channel CRD.

No, it allows humans to refer to resources without specifying the api version, just the group. It doesn't correct any user mistake.

The consequence of this PR is I can fat finger any version into the subscription channel ref and it will likely work if I did not mess too much up. This is wrong, we should not assume what the operator wants to do.

That's not true, check the code here: https://github.com/knative/eventing/pull/5131/files#diff-f4ab96e0544e474732f224fa784e19598eacbbecd9db8944a4eebbc72c785d35R533. If the user specifies explicitly the api group and version, the behaviour doesn't change from master. This code just defines what happens if you write messaging.knative.dev more than messaging.knative.dev/v1, but it doesn't change what happens if you write messaging.knative.dev/v2

@slinkydeveloper
Copy link
Contributor Author

Superseded by #5440

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Indicates the PR's author has signed the CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Experimental] KReference.Group to avoid being explicit on the APIVersion
7 participants