Skip to content

Better Dockerfile #645

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 31, 2023
Merged

Better Dockerfile #645

merged 3 commits into from
Mar 31, 2023

Conversation

frafra
Copy link
Contributor

@frafra frafra commented Feb 3, 2023

Fix #644.

I created a new stage to handle the key, so that many tools do not end up in the final image, then I further simplified both the build and release stage. This reduced the image size from ~200 MB to ~160 MB, and reduced the build time as well. I also removed apt-key, which is deprecated, used quiet mode for apt, and avoided installed recommended packages.

I dropped the postgres client as well, as it is needed just when running the tests. By doing that, the only external dependency is nc, which is provided by a minimal busybox:musl image (<1 MB). The final image ~15 MB, which is ~92 % less of the original image. Since I was not sure if this is the desired result, I kept the commit split.
Since the binary is now built with CGO_ENABLED=0, there is no need to use a golang image based on Debian, so the alpine version is used instead.

Replacing nc with a different way to do a self-check would further reduce the image, but using scratch as the base image could be problematic for some users, that are used to have a minimal shell within the container.

As the last step, I added https://github.com/eficode/wait-for, a small POSIX shell script, to wait for Postgres without having to rely on specific docker compose functionalities, which are not available in other systems. depends_on does not even work in some Docker environments. I replaced the health check that used nc with wait-for, which tries to establish a http connection by default.

The software works as usual, but a review would be helpful.
I also added build: . to allow local builds using docker compose.

The current version requires 27 seconds to be built, and its size 197 MB (docker system prune -af && time rootlesskit docker build --no-cache .). This includes the image pull from the server. In my case, I am using a gigabit connection, so the impact of using larger images is small, but it could affect other users more.
Using the same criteria, the version proposed in this PR takes 20 seconds to be built, and its size is 16 MB.

Possible improvements for the future:

  • make git an optional requirement, so that .git should not be copied when building the image, and git installation could be skipped as well;
  • use RUN --mount=type=cache,mode=0755,target=/go/pkg/mod and similar to avoid downloading dependencies multiple times when there is a minor change;
  • unbundle web static assets, so that they can be served independently of the backend if needed;
  • add end-to-end test using the Docker image.

RUN \
curl --silent https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add && \
echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
apt-get update && apt-get install -y postgresql-client
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pgweb absolutely needs the postgres client package since it utulizes pg_dump under the hood for exporting data dumps.

Copy link
Contributor Author

@frafra frafra Feb 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ok, I didn't know that, thanks.

Dockerfile Outdated
apt-get autoremove --yes && \
rm -rf /var/lib/{apt,dpkg,cache,log}/
FROM scratch AS wait
ADD https://raw.githubusercontent.com/eficode/wait-for/v2.2.4/wait-for /usr/bin/wait-for
Copy link
Owner

@sosedoff sosedoff Feb 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would advise against pulling a dependency from a random Github endpoint, especially if such dependency is a fork. Previously I've used a similar tool, though it seems like we could just grab the contents of https://github.com/eficode/wait-for/blob/master/wait-for (MIT licensed) and use in the repo moving forward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean with "random [...] endpoint". Would you rather use a different tool?

Dockerfile Outdated
# ------------------------------------------------------------------------------
# Release Stage
# ------------------------------------------------------------------------------
FROM busybox:musl
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to support current build targets as defined in here. https://github.com/sosedoff/pgweb/blob/master/.github/workflows/docker.yml#L35

As for having a very slim base image - there might be some other issues (like the missing postgres client package) and unfortunately I did not write down any notes since the last time I had to deal with this. A proper E2E test using Docker image is probably in order, however definitely out of scope of the PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem then :)


RUN \
apt-get update && \
apt-get install -y ca-certificates openssl netcat curl gnupg lsb-release && \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing ca-certs and openssl packages might have an impact on pgweb connectivity capabilities, I did run into similar issues before, however i don't recall the details.

Also, netcat and curl as super useful when you need to drop into container and troubleshoot things quickly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should pgweb connect to external servers while running?

@frafra frafra force-pushed the better-dockerfile branch from 10849dd to 362686b Compare March 14, 2023 13:23
@frafra
Copy link
Contributor Author

frafra commented Mar 14, 2023

I reduced the effects of this PR by dropping some commits and re-adding some missing packages.

@frafra frafra requested review from sosedoff and removed request for sosedoff March 23, 2023 09:30
@sosedoff
Copy link
Owner

Thanks, i'll review

@sosedoff sosedoff merged commit e905e5d into sosedoff:master Mar 31, 2023
@frafra frafra deleted the better-dockerfile branch October 19, 2023 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dockerfile cleanup actions are useless
2 participants