-
Notifications
You must be signed in to change notification settings - Fork 789
Implement TLS authentication for kafka mqt #1300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
details ---------- - uses secret to store keys and certificates
details ---------- - this will avoid confusion if the user assumes that the correct certificate files are already present - disable kafka (it is disabled by default)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 8 files reviewed, 4 unresolved discussions (waiting on @vadasambar)
charts/fission-all/auth/kafka/ca.crt, line 1 at r1 (raw file):
-----BEGIN CERTIFICATE-----
I think we can ask users to provide their own crt & key before installation like what we do in
{{- if .Files.Get "config/fluentbit.conf" }} |
so that we don't need to push crt & key to repo.
The helm installation should be stopped if a user doesn't provide necessary files.
charts/fission-all/templates/deployment.yaml, line 684 at r1 (raw file):
value: {{ .Values.debugEnv | quote }} - name: MESSAGE_QUEUE_SECRETS value: mqtrigger-kafka-secrets
It's better to mount secrets as a volume instead of passing it as environment variable.
pkg/mqtrigger/mqtrigger.go, line 52 at r1 (raw file):
var secrets *v1.Secret if mqSecretName != "" { secrets, _ = kubeClient.CoreV1().Secrets(getCurrentNamespace()).Get(mqSecretName, metav1.GetOptions{})
Mount secrets as volume and read crt/key from the it instead of calling kubernetes API service.
pkg/mqtrigger/messageQueue/kafka.go, line 154 at r1 (raw file):
cert, err := tls.X509KeyPair(kafka.certificates["userCert"], kafka.certificates["userKey"]) if err != nil { panic(err)
Return the error and let the caller handles the error instead of doing panic here.
Also, please check the error log in CI build: https://travis-ci.org/fission/fission/builds/580226550#L1172 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do this.
Reviewable status: 0 of 8 files reviewed, 4 unresolved discussions (waiting on @life1347 and @vadasambar)
charts/fission-all/auth/kafka/ca.crt, line 1 at r1 (raw file):
The cert files I have added are samples/examples. We can remove them if we want.
The helm installation should be stopped if a user doesn't provide necessary files.
I will do this. Thank you :)
charts/fission-all/templates/deployment.yaml, line 684 at r1 (raw file):
Previously, life1347 (Ta-Ching Chen) wrote…
It's better to mount secrets as a volume instead of passing it as environment variable.
Some advantages I can think of are:
- I can access it in the pod itself. So no need for an API call to the apiserver(a roundtrip which would probably include validations, decoding etc.,).
- No need for an environment variable. Environment variable can add another point of failure. i.e., if the environment variable is tampered with, kafka might not work anymore because it might not be able to find the secret.
Please let me know if there's anything that I have not considered.
If I am understanding this correctly, (1) would be a one-time thing since the secret is accessed only during initialization of mqtrigger pod.
As for (2), I agree. I think it is better not to pollute the environment variable pool (we might sacrifice some simplicity here because I would no longer be able to link some other secret instead of the one created by default, unless I edit the default secret).
I think overall, volume approach sounds better because it has less moving parts. I just wanted to know if I was understanding this correctly.
pkg/mqtrigger/messageQueue/kafka.go, line 154 at r1 (raw file):
Previously, life1347 (Ta-Ching Chen) wrote…
Return the error and let the caller handles the error instead of doing panic here.
Makes sense. I will fix this. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 8 files reviewed, 4 unresolved discussions (waiting on @life1347 and @vadasambar)
charts/fission-all/templates/deployment.yaml, line 684 at r1 (raw file):
Previously, vadasambar (Suraj Banakar) wrote…
Some advantages I can think of are:
- I can access it in the pod itself. So no need for an API call to the apiserver(a roundtrip which would probably include validations, decoding etc.,).
- No need for an environment variable. Environment variable can add another point of failure. i.e., if the environment variable is tampered with, kafka might not work anymore because it might not be able to find the secret.
Please let me know if there's anything that I have not considered.
If I am understanding this correctly, (1) would be a one-time thing since the secret is accessed only during initialization of mqtrigger pod.
As for (2), I agree. I think it is better not to pollute the environment variable pool (we might sacrifice some simplicity here because I would no longer be able to link some other secret instead of the one created by default, unless I edit the default secret).I think overall, volume approach sounds better because it has less moving parts. I just wanted to know if I was understanding this correctly.
Another advantage I did not add.
3. Volume will be available across pod restarts.
Also, we would have to write the code to load all the secrets from the directory ourselves (this would be handled by kubernetes if we do api call).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 8 files reviewed, 4 unresolved discussions (waiting on @life1347 and @vadasambar)
charts/fission-all/auth/kafka/ca.crt, line 1 at r1 (raw file):
Previously, vadasambar (Suraj Banakar) wrote…
The cert files I have added are samples/examples. We can remove them if we want.
The helm installation should be stopped if a user doesn't provide necessary files.
I will do this. Thank you :)
We can add steps to fission doc and charts to teach users how to generate their own.
charts/fission-all/templates/deployment.yaml, line 684 at r1 (raw file):
Previously, vadasambar (Suraj Banakar) wrote…
Another advantage I did not add.
3. Volume will be available across pod restarts.Also, we would have to write the code to load all the secrets from the directory ourselves (this would be handled by kubernetes if we do api call).
The points you list are correct. And here are some extra benefits I can think of
- Lower barrier
- User can easily change the volume part of the YAML file with their own secret without knowing how fission works internally.
- Conventional secret management:
- Services like Vault are able to rotate the secret periocally for security concerns. If we call API directly, we have to change it again for such cases.
- Don't reinvent the wheel if we already have a good one.
- Normally, we don't call API server directly unless we have no other convenient ways to get resource information like deployment, pod information.
details/others --------------------- - remove environment variable for kafka secrets. It's easier to tamper with - read secrets from volume instead of making an apiserver call - read secrets to a []byte map - fix build failing because of mutex copy error (tlsConfig)
other -------- - remove default path for TLS kafka keys
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 5 files reviewed, 4 unresolved discussions (waiting on @life1347)
charts/fission-all/auth/kafka/ca.crt, line 1 at r1 (raw file):
Previously, life1347 (Ta-Ching Chen) wrote…
We can add steps to fission doc and charts to teach users how to generate their own.
👍 I have removed the sample files.
charts/fission-all/templates/deployment.yaml, line 684 at r1 (raw file):
Previously, life1347 (Ta-Ching Chen) wrote…
The points you list are correct. And here are some extra benefits I can think of
- Lower barrier
- User can easily change the volume part of the YAML file with their own secret without knowing how fission works internally.
- Conventional secret management:
- Services like Vault are able to rotate the secret periocally for security concerns. If we call API directly, we have to change it again for such cases.
- Don't reinvent the wheel again if we already have a good one.
- Normally, we don't call API server directly unless we have no other convenient ways to get resource information like deployment, pod information.
👍 I have updated the code with volumes approach
pkg/mqtrigger/mqtrigger.go, line 52 at r1 (raw file):
Previously, life1347 (Ta-Ching Chen) wrote…
Mount secrets as volume and read crt/key from the it instead of calling kubernetes API service.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 5 files reviewed, 4 unresolved discussions (waiting on @life1347)
pkg/mqtrigger/messageQueue/kafka.go, line 154 at r1 (raw file):
Previously, vadasambar (Suraj Banakar) wrote…
Makes sense. I will fix this. Thank you!
Done.
- name: kafka-secrets | ||
mountPath: /etc/secrets | ||
{{- end}} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The final YAML file generated by helm will contain blank lines if they exist in the template, so please remove unnecessary blank lines here and other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me fix that. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
charts/fission-all/values.yaml
Outdated
# userCert: 'auth/kafka/user.crt' | ||
# userKey: 'auth/kafka/user.key' | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: one blank line here should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me fix that. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you missed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/mqtrigger/mqtrigger.go
Outdated
} | ||
messageQueue.MakeMessageQueueTriggerManager(logger, fissionClient, routerUrl, mqCfg) | ||
return nil | ||
} | ||
|
||
func getCurrentNamespace() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can remove this function since we don't need it anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove this. Thanks for pointing it out :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/mqtrigger/mqtrigger.go
Outdated
} | ||
|
||
func readSecrets(logger *zap.Logger) (map[string][]byte, error) { | ||
secretsPath := "/etc/secrets" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of hard coding the secretsPath
, I prefer to read it from the environment variable.
So that user can change the path later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea. Will make the change. Thank you :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/mqtrigger/mqtrigger.go
Outdated
secretFiles, err := ioutil.ReadDir(secretsPath) | ||
|
||
if err != nil { | ||
return secrets, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/mqtrigger/mqtrigger.go
Outdated
secret, fileReadErr := ioutil.ReadFile(filePath) | ||
|
||
if fileReadErr != nil { | ||
return secrets, fileReadErr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil, fileReadErr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
value: "true" | ||
volumeMounts: | ||
- name: kafka-secrets | ||
mountPath: /etc/secrets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/etc/fission/secrets
would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do this 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/mqtrigger/mqtrigger.go
Outdated
|
||
func readSecrets(logger *zap.Logger) (map[string][]byte, error) { | ||
secretsPath := "/etc/secrets" | ||
secrets := make(map[string][]byte) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest moving this line just right before the place where it's first called, like
secrets := make(map[string][]byte)
for _, secretFile := range secretFiles {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do this 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 5 files reviewed, 17 unresolved discussions (waiting on @life1347 and @vadasambar)
pkg/mqtrigger/mqtrigger.go, line 29 at r2 (raw file):
Previously, life1347 (Ta-Ching Chen) wrote…
You may need to do it manually. The order of imports follow the rule
built-in pkg 3rd pkg fission pkg
Thank you 👍
pkg/mqtrigger/mqtrigger.go, line 49 at r2 (raw file):
Previously, life1347 (Ta-Ching Chen) wrote…
Oops, sorry, I forgot to remove this comment last night. You're right for this.
No problem. :)
other -------- - return error instead of panic'ing - inject secrets path as an environment variable
@@ -680,7 +680,50 @@ spec: | |||
value: {{ .Values.traceSamplingRate | default "0.5" | quote }} | |||
- name: DEBUG_ENV | |||
value: {{ .Values.debugEnv | quote }} | |||
# TLS authentication is TLS with authentication (2 way) | |||
# More info: https://docs.confluent.io/current/kafka/authentication_ssl.html#ssl-overview | |||
{{- if .Values.kafka.authentication.tls.enabled}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Leave empty before }}
L685, L693, L695, L700
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops! Thank you for noticing it. Will fix it. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
charts/fission-all/values.yaml
Outdated
# userCert: 'auth/kafka/user.crt' | ||
# userKey: 'auth/kafka/user.key' | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you missed it.
pkg/mqtrigger/mqtrigger.go
Outdated
|
||
// return if no secrets exist | ||
if _, err := os.Stat(secretsPath); os.IsNotExist(err) { | ||
return nil, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here should return an error since we are not able to find the secrets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to use readSecrets
as a generic function (i.e., use them for other messaging queues as well), it would make more sense to log that the secretsPath was not found and let it return nil.
There are two cases which are of importance here:
- When no secrets are present and they are not required. i.e., there is no need for secrets e.g. connecting to a non TLS secured Kafka cluster would not need secrets.
- When no secrets are present but they should be present (secrets are required). This would be the case when the Kafka cluster is to be connected using TLS but the secrets were not mounted properly.
Returning nil should handle the first scenario just fine because there will be no error and throwing an error is not really required here. Problem arises in the second case. If the secrets cannot be found at the path but they should be present there, an error should be raised so that we know what went wrong. Returning nil here would be just ignoring this case and it would make it difficult to debug the problem.
I believe returning error is what will handle the second case. But returning error when secrets will not be required (connecting to non TLS Kafka cluster) and distinguishing this harmless thing with an actual error would be difficult.
One way to fix this can be, we log the information (maybe as a warning) so that we know that the secrets were not found and hope that the messaging queue would throw an error which together with the warning would help debug the problem. We can word the warning to convey the intensity of the problem. I am not very happy with this fix but this is something which can solve the problem for now.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite get it. If the secrets are not necessary, why calling this function?
It's the caller's responsibility to decide whether it needs to call the function, instead of calling a function that will silently ignore the error and pretend nothing happen. A generic function should always return the error instead of hiding it (unless the function itself can handle it) and let the caller to decide what to do to handle the error.
In this case, I think we can change the function signature
// caller checks whether it need to read secrets
secretsPath := strings.TrimSpace(os.Getenv("MESSAGE_QUEUE_SECRETS"))
if secretsPath > 0 {
secrets, err := readSecrets(secretsPath, logger) (map[string][]byte, error)
}
// change the function signature to
func readSecrets(path string, logger *zap.Logger) (map[string][]byte, error) {
// return if no secrets exist
if _, err := os.Stat(path); os.IsNotExist(err) {
return nil, err
}
....
It brings a couple of benefits,
- A more generic function that can accept any path instead of reading the path from the environment variable.
- It's easier to write unit tests for the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense. 👍
pkg/mqtrigger/messageQueue/kafka.go
Outdated
consumer, err := cluster.NewConsumer(kafka.brokers, string(trigger.Metadata.UID), []string{trigger.Spec.Topic}, consumerConfig) | ||
kafka.logger.Info("created a new consumer", zap.Any("consumer", consumer)) | ||
if err != nil { | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. Will fix this. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/mqtrigger/messageQueue/kafka.go
Outdated
tlsConfig, err := kafka.getTLSConfig() | ||
|
||
if err != nil { | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. Will fix this. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
other -------- - add spaces before trailing }}
@vadasambar any updates for this? |
Hi, sorry I was a little busy this week. I will check this as soon as I get some time. Thanks for the reminder 👍 |
@life1347 , I did not get a reply on the review comment. I thought I would just bring it to notice because it is buried in other review comments.
|
sorry.. i missed that comment. that make sense to me. |
I think I have made all the changes requested from my side. If there's anything left, please let me know. :) |
I will review the PR again before weekend, before that please fix the code conflicts. Thanks! |
There are still some unresolved comments, please fix them. |
Sorry, I thought I had fixed them. I have 3 pending changes. Thank you for pointing them out. I think I will fix them and then rebase/merge my branch with master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, after resolving the conflict with master branch. Then we are good to go.
Codecov Report
@@ Coverage Diff @@
## master #1300 +/- ##
===========================================
- Coverage 27.5% 17.39% -10.12%
===========================================
Files 70 50 -20
Lines 7348 5819 -1529
===========================================
- Hits 2021 1012 -1009
+ Misses 5093 4733 -360
+ Partials 234 74 -160
Continue to review full report at Codecov.
|
@vadasambar thanks for all your effort! Merged! |
details
Fix for #1293
This change is