Skip to content

add feature gate mechanism to ray-operator #2219

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 15, 2024

Conversation

andrewsykim
Copy link
Member

Why are these changes needed?

Add feature gating to KubeRay. This uses standard feature gate implementation used by Kubernetes components.

Related issue number

None

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@@ -13,6 +13,7 @@ RUN go mod download
COPY main.go main.go
COPY apis/ apis/
COPY controllers/ controllers/
COPY pkg/ pkg/
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also includes pkg/client, which I think is fine? I can also update to pkg/features

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference in image size with and without pkg/?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just update this to only copy pkg/features, we can always change it to pkg/ when we introduce more packages that need to be compiled

Copy link
Member Author

@andrewsykim andrewsykim Jul 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@andrewsykim andrewsykim changed the title [WIP] add feature gate mechanism to ray-operator add feature gate mechanism to ray-operator Jul 2, 2024
@kevin85421 kevin85421 self-assigned this Jul 3, 2024
Copy link
Member

@kevin85421 kevin85421 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no experience about feature gate in K8s. Would you mind explaining what will happen if we turn off a feature flag?

I am reading this PR: kubernetes/kubernetes#116261. It adds // +featureGate=AdmissionWebhookMatchConditions to types.go. Will the +featureGate annotation affect the CRD generation if the feature is off?

	// This is an alpha feature and managed by the AdmissionWebhookMatchConditions feature gate.
	//
	// +featureGate=AdmissionWebhookMatchConditions
	// +optional
	MatchConditions []MatchCondition

return featuregatetesting.SetFeatureGateDuringTest(tb, utilfeature.DefaultFeatureGate, f, value)
}

// Enabled is helper for `utilfeature.DefaultFeatureGate.Enabled()`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to make Enabled a public method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, there's no usage of it in this PR, but it'll be nice to have for feature gating the new RayCluster status chfanges

@@ -13,6 +13,7 @@ RUN go mod download
COPY main.go main.go
COPY apis/ apis/
COPY controllers/ controllers/
COPY pkg/ pkg/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference in image size with and without pkg/?

@@ -92,6 +95,7 @@ func main() {
flag.StringVar(&configFile, "config", "", "Path to structured config file. Flags are ignored if config file is set.")
flag.BoolVar(&useKubernetesProxy, "use-kubernetes-proxy", false,
"Use Kubernetes proxy subresource when connecting to the Ray Head node.")
flag.StringVar(&featureGates, "feature-gates", "", "A set of key=value pairs that describe feature gates.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention that the value needs to be "true" or "false"?

https://pkg.go.dev/k8s.io/component-base/featuregate#MutableFeatureGate

feature1=true,feature2=false,...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, will update

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an example in the flag description

features := make(map[featuregate.Feature]bool, len(defaultFeatureGates))
for f := range utilfeature.DefaultMutableFeatureGate.GetAll() {
if _, ok := defaultFeatureGates[f]; ok {
features[f] = Enabled(f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabled seems to be a one-line function. Maybe we can use utilfeature.DefaultFeatureGate.Enabled(f) directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Enabled will be useful when we implement the feature gate check for the RayCluster status changes

}

var defaultFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{
RayClusterStatusConditions: {Default: false, PreRelease: featuregate.Alpha},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can turn it on by default for the alpha release if we successfully deliver it before v1.2.0. Unlike a big project like Kubernetes, if we don't turn on a feature, it is hard for us to collect feedback because no user will turn it on.

Copy link
Member Author

@andrewsykim andrewsykim Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we plan to turn feature gates on by default in alpha, then we probably don't need feature gates?

The main purpose of the feature gate is to introduce early versions of features in alpha that we can break / change before going to Beta / GA. Agree that significantly less users will use alpha features if disabled by default, but that's the trade-off of being able to break alpha stage features. If we enable by default, then breaking a feature in alpha phase will break many users, which defeats the purpose

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we turn it off at this moment? We can try to find some users to try the alpha feature. If we get enough users to test this alpha feature, we can turn it off by default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't find enough users, we can turn it on by default to collect user feedback more aggressively.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we turn it off at this moment? We can try to find some users to try the alpha feature. If we get enough users to test this alpha feature, we can turn it off by default.

This sounds good to me

@kevin85421
Copy link
Member

When the feature gate is off, the CRD will still be generated in the OpenAPI YAML file, but we will check the feature gate to determine whether to write to the conditions field or not. If it is off, conditions will always be empty.

Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
Copy link
Member

@kevin85421 kevin85421 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @rueian would you mind reviewing this PR? Thanks!


// SetFeatureGateDuringTest is a helper method to override feature gates in tests.
func SetFeatureGateDuringTest(tb testing.TB, f featuregate.Feature, value bool) func() {
return featuregatetesting.SetFeatureGateDuringTest(tb, utilfeature.DefaultFeatureGate, f, value)
Copy link
Contributor

@rueian rueian Jul 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels weird to import the testing and the featuregatetesting package outside a _test.go file. I believe that will have some impact on the compilation time. Is this a common practice while implementing a custom feature gate with the utilfeature package? Or could we just move the SetFeatureGateDuringTest into a _test.go file?

Copy link
Member Author

@andrewsykim andrewsykim Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that will have some impact on the compilation time.

I believe this is negligible impact to compilation time

Is this a common practice while implementing a custom feature gate with the utilfeature package? Or could we just move the SetFeatureGateDuringTest into a _test.go file?

I believe it's common to do this because many test packages will execute code that checks the feature gate and will compile this package already. If you move it into _test.go files, then each test will have to carry a copy of this method or import it from a common test package.

@kevin85421 kevin85421 merged commit b05c396 into ray-project:master Jul 15, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants