Skip to content

Conversation

wdbaruni
Copy link
Member

@wdbaruni wdbaruni commented Mar 5, 2025

Overview

This PR addresses a specific issue with platform compatibility verification for Docker images that are already cached locally. The fix ensures that Bacalhau properly verifies platform compatibility before using locally available Docker images.

Problem

We identified a specific issue in our Docker image handling logic:

When a Docker image is already pulled and available in the local cache:

  • The system was using the locally cached image without properly verifying if its platform is compatible with the host platform
  • This could lead to execution failures when the locally cached image was built for a different architecture than the host
  • For example, if an arm64 image was cached on an amd64 host, the system would incorrectly attempt to use the incompatible image

This issue only occurred when incompatible images were already present in the local Docker cache. The remote registry verification was working correctly when no local image was available.

Solution

This PR implements a targeted fix:

  • Enhanced the isPlatformCompatible check to properly verify locally cached images against the host platform
  • Added proper handling to fall back to pulling from the remote registry when a locally cached image has an incompatible platform
  • Improved logging to clearly indicate when a platform mismatch is detected with locally cached images

How to Reproduce the Issue

The issue can be reproduced with these steps:

# On an arm64 host machine:

# Pre: Remove the image from your local cache if it exists
docker image rm ubuntu:latest

# 1. Pull an amd64 image to local cache
docker pull --platform linux/amd64 ubuntu:latest

# 2. Run a Bacalhau job with Docker executor
bacalhau docker run ubuntu:latest -- echo "hello world"

# Result: Before the fix, Bacalhau would attempt to use the locally cached amd64 image 
# on the arm64 host, potentially causing execution issues

Summary by CodeRabbit

  • New Features

    • Enhanced image compatibility checks to ensure that only container images matching the host environment are used.
  • Chores

    • Optimized the platform information caching interval, reducing it from 24 hours to 1 hour for fresher data.
    • Improved error handling to better manage and report issues when platform details are incomplete.

Copy link

linear bot commented Mar 5, 2025

Copy link
Contributor

coderabbitai bot commented Mar 5, 2025

Walkthrough

The changes update the Docker client to better handle platform compatibility. A new constant unknownPlatform is introduced, and the Client struct is enhanced with fields for managing platform information in a thread-safe manner. The method ImagePlatforms is removed, replaced by getHostPlatform, which caches host platform data. Additionally, a new method isPlatformCompatible checks image platform details before distribution, logging issues for incomplete data. In the executor package, the caching duration for image platform information is reduced from one day to one hour.

Changes

File(s) Change Summary
pkg/docker/docker.go - Added new methods: getHostPlatform, isPlatformCompatible, and platformString.
- Removed ImagePlatforms and updated SupportedPlatforms to leverage the new caching mechanism.
- Modified ImageDistribution to log and handle platform mismatches.
- Added fields: hostPlatform, platformSet, and platformMu to Client struct.
pkg/executor/docker/bidstrategy/semantic/image_platform.go - Changed caching constant from one day (86400 seconds) to one hour (60 * 60 seconds).
- Updated caching logic and comments in the ShouldBid method.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Logger
    participant Registry

    Client->>Client: getHostPlatform(ctx)
    Note over Client: Retrieves and caches host platform info
    Client->>Client: isPlatformCompatible(imageInfo, hostPlatform)
    alt Platform compatible
        Client->>Registry: Proceed with ImageDistribution
    else Platform mismatch
        Client->>Logger: Log debug message for platform mismatch
    end
Loading

Poem

I'm a rabbit, hopping through code so bright,
With platform checks that are now just right.
Caching info safe, like carrots in a snare,
Debug logs alert when platforms aren't fair.
I nibble on errors with a smile in flight,
Celebrating changes deep into the night!

Warning

There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure.

🔧 golangci-lint (1.62.2)

{"Issues":[{"FromLinter":"typecheck","Text":"pattern build/: no matching files found","Severity":"","SourceLines":["//go:embed build/"],"Replacement":null,"Pos":{"Filename":"webui/webui.go","Offset":0,"Line":23,"Column":12},"ExpectNoLint":false,"ExpectedNoLintLinter":""}],"Report":{"Linters":[{"Name":"asasalint"},{"Name":"asciicheck"},{"Name":"bidichk"},{"Name":"bodyclose","Enabled":true},{"Name":"canonicalheader"},{"Name":"containedctx"},{"Name":"contextcheck"},{"Name":"copyloopvar","Enabled":true},{"Name":"cyclop"},{"Name":"decorder"},{"Name":"deadcode"},{"Name":"depguard","Enabled":true},{"Name":"dogsled","Enabled":true},{"Name":"dupl"},{"Name":"dupword"},{"Name":"durationcheck"},{"Name":"errcheck","Enabled":true,"EnabledByDefault":true},{"Name":"errchkjson"},{"Name":"errname"},{"Name":"errorlint"},{"Name":"execinquery"},{"Name":"exhaustive"},{"Name":"exhaustivestruct"},{"Name":"exhaustruct"},{"Name":"exportloopref"},{"Name":"forbidigo","Enabled":true},{"Name":"forcetypeassert"},{"Name":"fatcontext"},{"Name":"funlen","Enabled":true},{"Name":"gci"},{"Name":"ginkgolinter"},{"Name":"gocheckcompilerdirectives"},{"Name":"gochecknoglobals"},{"Name":"gochecknoinits","Enabled":true},{"Name":"gochecksumtype"},{"Name":"gocognit"},{"Name":"goconst","Enabled":true},{"Name":"gocritic"},{"Name":"gocyclo","Enabled":true},{"Name":"godot"},{"Name":"godox"},{"Name":"err113"},{"Name":"gofmt","Enabled":true},{"Name":"gofumpt"},{"Name":"goheader"},{"Name":"goimports","Enabled":true},{"Name":"golint"},{"Name":"mnd","Enabled":true},{"Name":"gomnd"},{"Name":"gomoddirectives"},{"Name":"gomodguard"},{"Name":"goprintffuncname","Enabled":true},{"Name":"gosec","Enabled":true},{"Name":"gosimple","Enabled":true,"EnabledByDefault":true},{"Name":"gosmopolitan"},{"Name":"govet","Enabled":true,"EnabledByDefault":true},{"Name":"grouper"},{"Name":"ifshort"},{"Name":"iface"},{"Name":"importas"},{"Name":"inamedparam"},{"Name":"ineffassign","Enabled":true,"EnabledByDefault":true},{"Name":"interfacebloat"},{"Name":"interfacer"},{"Name":"intrange"},{"Name":"ireturn"},{"Name":"lll","Enabled":true},{"Name":"loggercheck"},{"Name":"maintidx"},{"Name":"makezero"},{"Name":"maligned"},{"Name":"mirror"},{"Name":"misspell"},{"Name":"musttag"},{"Name":"nakedret","Enabled":true},{"Name":"nestif"},{"Name":"nilerr"},{"Name":"nilnil"},{"Name":"nlreturn"},{"Name":"noctx","Enabled":true},{"Name":"nonamedreturns"},{"Name":"nosnakecase"},{"Name":"nosprintfhostport"},{"Name":"paralleltest"},{"Name":"perfsprint"},{"Name":"prealloc"},{"Name":"predeclared"},{"Name":"promlinter"},{"Name":"protogetter"},{"Name":"reassign"},{"Name":"recvcheck"},{"Name":"revive"},{"Name":"rowserrcheck"},{"Name":"sloglint"},{"Name":"scopelint"},{"Name":"sqlclosecheck"},{"Name":"spancheck"},{"Name":"staticcheck","Enabled":true,"EnabledByDefault":true},{"Name":"structcheck"},{"Name":"stylecheck","Enabled":true},{"Name":"tagalign"},{"Name":"tagliatelle"},{"Name":"tenv"},{"Name":"testableexamples"},{"Name":"testifylint"},{"Name":"testpackage"},{"Name":"thelper"},{"Name":"tparallel"},{"Name":"typecheck","Enabled":true,"EnabledByDefault":true},{"Name":"unconvert","Enabled":true},{"Name":"unparam"},{"Name":"unused","Enabled":true,"EnabledByDefault":true},{"Name":"usestdlibvars"},{"Name":"varcheck"},{"Name":"varnamelen"},{"Name":"wastedassign"},{"Name":"whitespace","Enabled":true},{"Name":"wrapcheck"},{"Name":"wsl"},{"Name":"zerologlint"},{"Name":"nolintlint","Enabled":true}]}}

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
pkg/executor/docker/bidstrategy/semantic/image_platform.go (2)

76-82: Consider making cache duration configurable
The in-memory caching approach with a fixed duration is workable but somewhat inflexible. You might consider making the cache TTL user-configurable or rely on a more robust cache mechanism as you indicated in the TODO.

I can open a new issue or provide a patch if you'd like to incorporate a more dynamic caching approach (e.g., LRU with user-configurable TTL).


85-91: Minimal fallback if caching fails
Logging and continuing is user-friendly in a best-effort scenario, but repeated failures may degrade performance if images are not cached at all. Consider adding a fallback strategy or periodic reattempt.

pkg/docker/docker.go (2)

256-268: Eliminate repeated 'unknown' string usage
The static analysis warns about repeated usage of the string 'unknown'. Consider extracting it into a shared constant to improve maintainability.

Here's an example diff:

+const unknownStr = "unknown"

 func (c *Client) isPlatformCompatible(info types.ImageInspect, hostPlatform v1.Platform) bool {
-    if info.Os == "unknown" || info.Architecture == "unknown" {
+    if info.Os == unknownStr || info.Architecture == unknownStr {
         log.Debug()...
         return false
     }
     ...
 }
🧰 Tools
🪛 GitHub Check: lint / go-lint (ubuntu-latest)

[failure] 258-258:
string unknown has 4 occurrences, make it a constant (goconst)

🪛 GitHub Actions: Main Pipeline

[warning] 258-258: string unknown has 4 occurrences, make it a constant (goconst)


358-363: Unnecessary leading newline
A linter flags an unnecessary leading newline around line 356. Consider removing extra blank lines to comply with style guidelines.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b1882fe and db46ecf.

📒 Files selected for processing (2)
  • pkg/docker/docker.go (5 hunks)
  • pkg/executor/docker/bidstrategy/semantic/image_platform.go (3 hunks)
🧰 Additional context used
🪛 GitHub Check: lint / go-lint (ubuntu-latest)
pkg/docker/docker.go

[failure] 258-258:
string unknown has 4 occurrences, make it a constant (goconst)


[failure] 356-356:
unnecessary leading newline (whitespace)

🪛 GitHub Actions: Main Pipeline
pkg/docker/docker.go

[warning] 258-258: string unknown has 4 occurrences, make it a constant (goconst)

🔇 Additional comments (25)
pkg/executor/docker/bidstrategy/semantic/image_platform.go (7)

19-19: One-hour caching constant is well-chosen
Having a shorter caching duration (one hour) can help ensure more up-to-date platform information, especially if your environment changes frequently.


62-65: No issues found.


67-67: No issues found.


94-94: No issues found.


105-105: No issues found.


111-111: No issues found.


115-116: No issues found.

pkg/docker/docker.go (18)

41-43: Good concurrency design for caching
These new fields for hostPlatform, platformSet, and platformMu effectively manage concurrency around retrieving host platform data.


52-52: No issues found.


210-246: Synchronized Host Platform Retrieval
This function uses a mutex to safely cache the host platform. The logic is clear and avoids repeated calls to Docker. Looks good.


248-254: Single-element slice is straightforward
Returning just one platform from getHostPlatform is logical for local runs.


324-327: No issues found.


329-355: Local image digest parsing
Parsing the digest from the first entry in repoDigests is a practical solution, ensuring minimal overhead.


364-366: No issues found.


368-368: No issues found.


375-386: No issues found.


389-389: No issues found.


390-390: No issues found.


395-395: No issues found.


397-401: No issues found.


403-403: No issues found.


404-405: No issues found.


407-407: No issues found.


410-410: No issues found.


524-526: No issues found.

@wdbaruni wdbaruni self-assigned this Mar 5, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/docker/docker.go (1)

326-329: Consider additional logging for empty repository digests

The implementation for checking local image compatibility is good, but there's a potential edge case where info.RepoDigests might be empty, which would cause the function to silently continue to the registry check without any logging or notification.

Consider adding a debug log when repository digests are empty:

  if c.isPlatformCompatible(info, hostPlatform) {
    repos := info.RepoDigests
    if len(repos) >= 1 {
      // ...existing code...
+   } else {
+     log.Ctx(ctx).Debug().
+       Str("image", image).
+       Msg("Local image has compatible platform but no repository digests, checking registry")
    }
  }

Also applies to: 335-357

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db46ecf and cad2947.

📒 Files selected for processing (1)
  • pkg/docker/docker.go (5 hunks)
🔇 Additional comments (10)
pkg/docker/docker.go (10)

39-39: Well-defined constant for unknown platform values

Good addition - using a named constant instead of magic strings improves code readability and prevents typo-based errors when dealing with unknown platform values.


43-45: Good thread-safe implementation for caching platform information

The addition of these fields to cache host platform information with proper mutex synchronization is a solid approach. This prevents redundant calls to the Docker daemon and ensures thread safety when multiple goroutines access platform information.


212-248: Well-structured platform detection with proper error handling

This method effectively retrieves and caches the host platform information with good error handling and thread safety. The step-by-step approach to extract platform details is clear and robust.

Some additional points to note:

  1. The use of a mutex ensures thread-safe access to cached platform information
  2. The error messages are specific and informative
  3. The cache mechanism prevents redundant Docker API calls

250-256: Refactored to use the new centralized platform detection

Good refactoring to leverage the new getHostPlatform method instead of duplicating platform detection logic.


258-270: Effective platform compatibility check with debug logging

The isPlatformCompatible method provides a clear and effective way to determine if an image's platform matches the host platform. The debug logging for unknown platform values is helpful for troubleshooting.


358-364: Good debugging for platform mismatches

The debug logging when a platform mismatch is detected provides valuable information for troubleshooting. The use of platformString for consistent formatting is a nice touch.


376-382: Improved platform filtering to prevent issues with unknown platforms

Filtering out platforms with unknown values is an important improvement that prevents potential problems with incomplete platform information from the registry.


391-406: Platform compatibility check before pulling images

This change ensures that Docker images are only pulled when necessary - either when they don't exist locally or when the local version doesn't match the host platform. This optimization reduces unnecessary network traffic and improves performance.


408-412: Explicit platform specification in pull options

Using the platformString helper to set the platform in pull options ensures that the correct platform-specific image is pulled. This is crucial for preventing issues with multi-architecture images.


525-527: Useful helper for consistent platform string formatting

This simple helper function ensures consistent formatting of platform information throughout the codebase, which is good for maintainability and readability.

@wdbaruni wdbaruni merged commit 7a0d340 into main Mar 5, 2025
13 of 14 checks passed
@wdbaruni wdbaruni deleted the eng-254-how-do-i-enforce-a-different-architecture branch March 5, 2025 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

1 participant