peterschmidt85 (Contributor) commented Jun 14, 2025

This PR:

  • Adds mpirun and /opt/nvcc-tests/build to the base image
  • Drops the separate images per Python version
  • Adds a devel-efa image that comes with /opt/nvcc-tests/build and uv (fixes [UX] Pre-build a EFA version of the default Docker image #2793)
  • Updates the Docker image version to 0.10
  • Updates the format of the base Docker image name to {version}-{'base|devel|devel-efa'}
  • Drops pre-pulling Docker images in VM images
  • Allows using any Python version (not only minor versions)
  • Automatically chooses devel-efa on EFA-enabled AWS instances
  • Adds the OMPI_MCA_pml, OMPI_MCA_btl, OMPI_MCA_btl_tcp_if_exclude, and NCCL_SOCKET_IFNAME environment variables. See an example.
  • Updates to Ubuntu 22.04
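
To make the new naming scheme concrete, here is a minimal sketch of how a tag in the `{version}-{'base|devel|devel-efa'}` format could be assembled, and how `devel-efa` could be picked for EFA-enabled AWS instances. The function names, the `dstackai/base` repository default, and the fallback to a caller-supplied flavor are assumptions made for illustration; they are not taken from the PR's actual code.

```python
from typing import Literal

Flavor = Literal["base", "devel", "devel-efa"]

# The three flavors named in the PR description.
ALLOWED_FLAVORS = ("base", "devel", "devel-efa")

# Assumption: repository name used only for this illustration.
DEFAULT_REPO = "dstackai/base"


def choose_flavor(default_flavor: str, is_aws_efa_instance: bool) -> str:
    """Hypothetical helper: prefer devel-efa on EFA-enabled AWS instances.

    What is used on other instances is not stated in the PR, so the
    caller supplies the default here.
    """
    if default_flavor not in ALLOWED_FLAVORS:
        raise ValueError(f"unknown flavor: {default_flavor!r}")
    return "devel-efa" if is_aws_efa_instance else default_flavor


def image_name(version: str, flavor: str, repo: str = DEFAULT_REPO) -> str:
    """Hypothetical helper: build '<repo>:<version>-<flavor>'."""
    if flavor not in ALLOWED_FLAVORS:
        raise ValueError(f"unknown flavor: {flavor!r}")
    return f"{repo}:{version}-{flavor}"


if __name__ == "__main__":
    flavor = choose_flavor("base", is_aws_efa_instance=True)
    print(image_name("0.10", flavor))
```

Note that the tag no longer encodes a Python version, which matches the bullet about dropping separate images per Python version.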

To be done separately:

  • Build Docker & cloud images
  • Update documentation and examples (incl. adding an AWS EFA example)

Staged Docker images: https://hub.docker.com/repository/docker/dstackai/base-stgn/tags?name=ubuntu22.04&page=1

peterschmidt85 and others added 30 commits June 12, 2025 15:09
@@ -1,3 +1,3 @@
__version__ = "0.0.0"
__is_release__ = False
base_image = "0.9"
Collaborator

You should change the image version on master only after the images are built and published, so it's better to change it in a separate PR.

Contributor Author

Because we also change the format of the Docker image name, it's not possible to change it in a separate PR.
Moreover, we can build the new VM images from the PR branch before the merge, so it's OK.

Collaborator

Ok, let's build the images from the PR then and merge when published.

@@ -1,15 +1,15 @@
ARG BASE_IMAGE=dstackai/base:py3.12-0.7-cuda-12.1
# syntax = edrevo/dockerfile-plus
Collaborator

Can we live without it? An unfamiliar dependency that is no longer maintained.

Contributor Author

Without this dependency, we would need to duplicate the code.

@peterschmidt85 peterschmidt85 requested a review from r4victor June 16, 2025 09:53
@peterschmidt85 peterschmidt85 merged commit b09844d into master Jun 18, 2025
35 of 50 checks passed
@peterschmidt85 peterschmidt85 deleted the 2793-ux-pre-build-a-efa-version-of-the-default-docker-image branch June 18, 2025 06:16
haydnli-shopify pushed a commit to haydnli-shopify/dstack that referenced this pull request Jun 23, 2025
Development

Successfully merging this pull request may close these issues.

[UX] Pre-build a EFA version of the default Docker image
3 participants