-
Notifications
You must be signed in to change notification settings - Fork 48
Add Dockerfile to build a kafka connect image based on kafka trunk #376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mfvitale! I built the image locally and it looks good for the most part.
If we plan on using this image in our integration tests, we'll need a few extra tweaks:
- Debezium customizes the Log4j config file used by their Connect workers, and we have integration tests that rely on that. The config file is stored in
$KAFKA_HOME/config/log4j.properties
and right now it's getting overwritten when we build this image; can we preserve that file somehow? - Debezium exposes a KAFKA_VERSION environment variable that contains which version of Kafka the image is built against. We also use this in our integration tests; can we update this build file and/or our integration tests to read the correct Kafka version from this image? AFAICT we can't overwrite the environment variable itself without hardcoding a value (which seems distasteful), but we might consider reading it from the Kafka repo's gradle.properties file either during the build or in our integration tests.
…in kafka_version file under $KAFKA_HOME dir
Seems there is no way to set an ENV with command output. I saved it in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, sorry--it looks like license checking doesn't work on raw Dockerfiles (see mathieucarbou/license-maven-plugin#204).
Can we add an exclude for the Dockerfile? Should look something like this (in pom.xml):
<exclude>**/Dockerfile</exclude>
Also, just a thought--we may want to rename the Dockerfile
or at least add some detail to the path for it, since people may reasonably see a file in docker/Dockerfile
and think that that has to do with running kcctl
instead of running Kafka Connect. Do you think it'd make sense to put it in a path like docker/connect/trunk/Dockerfile
?
Sure.
What do you think to put it in the test resource folder? |
This is interesting... I think it depends on how we plan on using the Dockerfile in the future. If we need to access it using Java code at test runtime, then Did you have thoughts on how to leverage this in the future? I have a draft commit ready that adds support for running integration tests against this image, but not for building it. If it's fast enough, we could add |
I saw that there are test containers that uses the connect image from Debezium. Do you think we can use also this one here?
If you plan to test it using test containers maybe we can think to build it on the fly . WDYT?
It requires more or less 6 minutes. |
Yes, that's exactly what my draft commit does 😄
Sounds great! Not sure how easy this would be though--we use the DebeziumContainer class for our testing, which subclasses We don't have to block merging this PR on implementing build-on-the-fly logic; if you're confident that it can be done, we can merge this now (after moving the Dockerfile to a different directory) and tackle that later. |
I have found a way to build the image on the fly and use it as a Not sure if the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mfvitale, this is awesome!
Aside from the nit, I have two more thoughts about this. I don't think either has to be a blocker, but wanted to get your input before merging:
- Since the image build time is okay, we can start doing tests against trunk on our CI builds by adding
TRUNK
to the connect_version list. - If I want to test against trunk locally, the image will only be built once and then reused for subsequent test runs. This is great for cutting down test runtime, but may lead to using a stale image the next day, month, etc. We may want to add a cleaner way to force a rebuild of the image aside from manually deleting the
debezium-connect:trunk
(ordebezium/connect:trunk
) image.
Let me know what you think about these; happy to classify them both as follow-up items (and file issues for them) if you'd prefer not to address them in this PR.
Co-authored-by: Chris Egerton <fearthecellos@gmail.com>
Thank you!
I think it's reasonable.
Good point. I think that there isn't a solution that will safely save you from having a stale image. Two solution comes to my mind that can mitigate the risk:
Yes, better to have them on separate issues. |
Great, thanks for all the excellent work @mfvitale! |
I have created a Dockerfile based on the debezium-connect image.
This will simply add all the necessary stuff to build Kafka from the trunk branch.
Just did some tests and seems that it is working.