Comparing changes

base repository: bacalhau-project/bacalhau
base: v1.6.3
head repository: bacalhau-project/bacalhau
compare: v1.6.4
  • 8 commits
  • 71 files changed
  • 4 contributors

Commits on Feb 5, 2025

  1. support plain encoding in s3 publisher (#4840)

    # Add plain encoding option to S3 Publisher
    
    ## Description
    This PR adds support for publishing job results without gzip compression
    through a new `Encoding` type. This enables more efficient data
    pipelines where subsequent jobs need to access individual files from
    previous job results.
    
    Previously, all job results were automatically gzip compressed before
    uploading to S3. While this is efficient for storage and download, it
    requires downloading and decompressing the entire archive to access any
    file. This can be inefficient for workflows like map-reduce where jobs
    only need specific files from previous results.
    
    ### Changes
    - Added new `Encoding` type with `EncodingGzip` (default) and
    `EncodingPlain` options
    - Updated validation to check for valid encoding values
    - Maintains gzip compression as default behavior for backwards
    compatibility
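
    As a rough illustration of the validation described above, here is a
    minimal sketch. The string values `"gzip"` and `"plain"` and the helper
    `ParseEncoding` are assumptions made for this example, not names taken
    from the PR; only `Encoding`, `EncodingGzip` and `EncodingPlain` come
    from the description.

    ```go
    package main

    import (
    	"fmt"
    	"strings"
    )

    // Encoding is an illustrative stand-in for the publisher's encoding option.
    type Encoding string

    const (
    	EncodingGzip  Encoding = "gzip"  // assumed value: one compressed archive (default)
    	EncodingPlain Encoding = "plain" // assumed value: one uncompressed object per file
    )

    // ParseEncoding (hypothetical helper) normalizes s, defaults to gzip when
    // s is empty, and rejects anything that is not a known encoding.
    func ParseEncoding(s string) (Encoding, error) {
    	switch e := Encoding(strings.ToLower(strings.TrimSpace(s))); e {
    	case "":
    		return EncodingGzip, nil
    	case EncodingGzip, EncodingPlain:
    		return e, nil
    	default:
    		return "", fmt.Errorf("invalid encoding %q: expected %q or %q", s, EncodingGzip, EncodingPlain)
    	}
    }

    func main() {
    	for _, in := range []string{"", "plain", "zip"} {
    		e, err := ParseEncoding(in)
    		fmt.Printf("%q -> %q, err=%v\n", in, e, err)
    	}
    }
    ```

    Defaulting the empty value to gzip is what keeps existing job specs
    working unchanged in this sketch.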
    
    
    ## Benefits and Tradeoffs
    
    ### Benefits
    - Enables efficient access to individual files from job results
    - Better support for data pipelines where jobs consume partial results
    - No decompression overhead when accessing results
    
    ### Tradeoffs
    - More S3 PUT requests (one per file vs one archive)
    - Higher storage costs (no compression)
    - Higher network costs for full result downloads
    - Cannot use `bacalhau job get` with pre-signed URLs when using plain
    encoding (requires individual file URLs)
    
    ## Usage Recommendations
    - Use plain encoding when:
      - Subsequent jobs need to access individual files
      - Results will be frequently accessed by file
      - Building data pipelines with partial result access
    - Keep default gzip encoding when:
      - Results are typically accessed as a complete set
      - Storage/transfer costs are a concern
      - Using `bacalhau job get` functionality
    
    <!-- This is an auto-generated comment: release notes by coderabbit.ai
    -->
    ## Summary by CodeRabbit
    
    - **New Features**
    - Enhanced S3 publishing now supports flexible encoding options,
    allowing for compressed archive uploads (gzip) or individual file
    uploads (plain).
    - New test cases added to validate different encoding scenarios for both
    publishing and downloading.
    - Improved handling of publisher specifications with a focus on encoding
    types.
    
    - **Bug Fixes**
    - Addressed error handling for invalid encoding values in publisher
    specifications.
    
    - **Documentation**
    - Updated test cases to reflect changes in encoding handling and improve
    overall test coverage.
    <!-- end of auto-generated comment: release notes by coderabbit.ai -->
    wdbaruni authored Feb 5, 2025
    8ae8a00

Commits on Feb 8, 2025

  1. Fix race conditions when starting dind containers (#4841)

    Fixes race conditions when multiple dind containers are started at the
    same time, which is usually the case with docker-compose
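
    One common way to avoid this kind of startup race is a random initial
    delay followed by bounded retries with cleanup between attempts. The
    sketch below illustrates that pattern in Go; it is not the code from this
    PR, and `startWithRetry` and its parameters are hypothetical.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    	"math/rand"
    	"time"
    )

    // startWithRetry sleeps for a random delay below maxJitter, then calls
    // start up to attempts times, running cleanup after every failure.
    func startWithRetry(start func() error, cleanup func(), attempts int, maxJitter, backoff time.Duration) error {
    	if maxJitter > 0 {
    		// Spread out containers that were launched at the same instant.
    		time.Sleep(time.Duration(rand.Int63n(int64(maxJitter))))
    	}
    	var err error
    	for i := 1; i <= attempts; i++ {
    		if err = start(); err == nil {
    			return nil
    		}
    		cleanup() // remove partial state before trying again
    		if i < attempts {
    			time.Sleep(backoff)
    		}
    	}
    	return fmt.Errorf("start failed after %d attempts: %w", attempts, err)
    }

    func main() {
    	calls := 0
    	start := func() error {
    		calls++
    		if calls < 3 {
    			return errors.New("daemon not ready")
    		}
    		return nil
    	}
    	err := startWithRetry(start, func() {}, 5, 10*time.Millisecond, time.Millisecond)
    	fmt.Println("calls:", calls, "err:", err)
    }
    ```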
    
    
    <!-- This is an auto-generated comment: release notes by coderabbit.ai
    -->
    ## Summary by CodeRabbit
    
    - **Chores**
    - Enhanced the container startup process for improved reliability.
    - Introduced a random initial delay to help prevent simultaneous startup
    issues.
    - Implemented a robust retry mechanism with better error handling and
    cleanup for smoother operation.
    <!-- end of auto-generated comment: release notes by coderabbit.ai -->
    
    ---------
    
    Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
    wdbaruni and coderabbitai[bot] authored Feb 8, 2025
    d151162

Commits on Feb 9, 2025

  1. Host Environment Variable Forwarding (#4842)

    This PR adds the ability to securely forward host environment variables
    to job executions. This enables passing credentials and secrets from the
    host to jobs through a controlled allowlist mechanism.
    
    ## Key Changes
    
    - Added support for referencing host environment variables using `env:`
    prefix
    - Implemented allowlist-based security controls at the compute node
    level
    - Added early validation in bid strategy to fail fast when jobs request
    non-allowlisted variables
    
    ## Usage Example
    
    ```yaml
    # Job specification
    Tasks:
      - Name: main
        Env:
          API_KEY: "env:API_KEY"     # Forward host's API_KEY
          LOG_PATH: "/logs"          # Regular literal value
        Engine:
          Type: docker
          Params:
            Image: ubuntu:latest
    
    # Compute node configuration
    compute:
      env:
        allowlist:
          - "API_*"    # Allow forwarding of any env var starting with API_
    ```
    
    ## Security Design
    
    - Host variables must be explicitly allowlisted using patterns (e.g.,
    `API_*`)
    - Jobs must explicitly request variables using `env:` prefix
    - Early validation during bid phase prevents scheduling jobs that
    request non-allowlisted variables
    - Creates clear audit trail of which credentials each job requested
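
    The allowlist check can be pictured roughly as follows. This is a sketch
    under assumptions: it treats allowlist entries as shell-style globs
    matched with Go's `path.Match`, and `allowed` and `resolveEnv` are
    hypothetical helpers, not the functions added in this PR.

    ```go
    package main

    import (
    	"fmt"
    	"os"
    	"path"
    	"strings"
    )

    const envPrefix = "env:"

    // allowed reports whether name matches any allowlist pattern.
    // Assumption: patterns such as "API_*" behave like globs.
    func allowed(name string, allowlist []string) bool {
    	for _, p := range allowlist {
    		if ok, err := path.Match(p, name); err == nil && ok {
    			return true
    		}
    	}
    	return false
    }

    // resolveEnv copies literal values as-is and replaces "env:NAME" values
    // with the host value of NAME, failing on non-allowlisted or unset names.
    func resolveEnv(env map[string]string, allowlist []string, lookup func(string) (string, bool)) (map[string]string, error) {
    	out := make(map[string]string, len(env))
    	for k, v := range env {
    		name, isRef := strings.CutPrefix(v, envPrefix)
    		if !isRef {
    			out[k] = v
    			continue
    		}
    		if !allowed(name, allowlist) {
    			return nil, fmt.Errorf("host variable %q is not allowlisted", name)
    		}
    		val, ok := lookup(name)
    		if !ok {
    			return nil, fmt.Errorf("host variable %q is not set", name)
    		}
    		out[k] = val
    	}
    	return out, nil
    }

    func main() {
    	os.Setenv("API_KEY", "s3cr3t")
    	env := map[string]string{"API_KEY": "env:API_KEY", "LOG_PATH": "/logs"}
    	resolved, err := resolveEnv(env, []string{"API_*"}, os.LookupEnv)
    	fmt.Println(resolved, err)
    }
    ```

    Running the same `allowed` check during bidding, before any execution
    starts, is what would give the fail-fast behaviour described above.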
    
    ## Future Work
    
    The architecture introduced in this PR is designed to be extensible to
    support secret management systems. The `env:` prefix pattern will evolve
    to support additional sources like:
    
    ```yaml
    Tasks:
      - Name: main
        Env:
          API_KEY: "vault:secrets/api-key"    # HashiCorp Vault
          DB_PASS: "aws:prod/db/password"     # AWS Secrets Manager
          CERT: "azure:certificates/prod"      # Azure Key Vault
    ```
    
    This foundation enables:
    - Integration with popular secret vaults and cloud provider secret
    managers
    - Dynamic credential generation and rotation
    - More granular access control patterns
    
    The key difference from the previous implementation is that we now
    support referencing host environment variables (prefixed with `env:`) in
    addition to literal values. This is implemented with security in mind -
    only explicitly allowlisted patterns can be forwarded, and jobs must
    declare which variables they need for audit purposes.
    
    <!-- This is an auto-generated comment: release notes by coderabbit.ai
    -->
    
    ## Summary by CodeRabbit
    
    - New Features
    - Introduced dynamic environment variable resolution for jobs and
    compute nodes, allowing secure usage of host variables through
    customizable allow lists.
    - Integrated environment variable handling into job execution and
    bidding workflows for more flexible configurations.
    
    - Documentation
    - Updated API specifications and schema descriptions to clarify how
    environment variables can be configured, including support for direct
    values and host references.
    
    <!-- end of auto-generated comment: release notes by coderabbit.ai -->
    wdbaruni authored Feb 9, 2025
    e36b9ab

Commits on Feb 12, 2025

  1. fixing logo on readme (#4845)

    Fixing the readme logo.
    
    (for some reason, couldn't get tests to pass until i updated retracted
    packages)
    
    <!-- This is an auto-generated comment: release notes by coderabbit.ai
    -->
    
    ## Summary by CodeRabbit
    
    - **Documentation**
    - Updated the Bacalhau logo image path in the documentation for clearer
    asset referencing.
    - **Chores**
    - Upgraded multiple dependency versions to enhance overall stability,
    security, and performance.
    
    <!-- end of auto-generated comment: release notes by coderabbit.ai -->
    aronchick authored Feb 12, 2025
    f0d4955

Commits on Feb 13, 2025

  1. Enhance value column configuration with width constraints in config list (#4848)
    
    Closes #4789 
    This pull request includes a change to the `cmd/cli/config/list.go` file
    to improve the display of the "Value" column in the configuration list
    table.
    
    Enhancements to table column configuration:
    
    * `cmd/cli/config/list.go`:
    Added `WidthMax` and `WidthMaxEnforcer` properties to the "Value" column
    configuration to ensure the column width does not exceed 80 characters
    and to wrap text softly if it does.
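
    For readers unfamiliar with soft wrapping, the effect on a long value is
    roughly what the stdlib-only sketch below does: break at word boundaries
    so no line exceeds the limit. The PR itself relies on the table library's
    `WidthMax` and `WidthMaxEnforcer` settings; `wrapSoft` here is a
    hypothetical helper written only to illustrate the behaviour.

    ```go
    package main

    import (
    	"fmt"
    	"strings"
    	"unicode/utf8"
    )

    // wrapSoft breaks s at whitespace so that each line holds at most width
    // runes. A single word longer than width is left intact on its own line.
    func wrapSoft(s string, width int) string {
    	var b strings.Builder
    	lineLen := 0
    	for _, word := range strings.Fields(s) {
    		n := utf8.RuneCountInString(word)
    		if lineLen > 0 && lineLen+1+n > width {
    			b.WriteByte('\n')
    			lineLen = 0
    		} else if lineLen > 0 {
    			b.WriteByte(' ')
    			lineLen++
    		}
    		b.WriteString(word)
    		lineLen += n
    	}
    	return b.String()
    }

    func main() {
    	value := strings.Repeat("some-long-config-value ", 8)
    	fmt.Println(wrapSoft(value, 80))
    }
    ```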
    
    <!-- This is an auto-generated comment: release notes by coderabbit.ai
    -->
    
    ## Summary by CodeRabbit
    
    - **New Features**
    - Enhanced the display of configuration information by enforcing a
    maximum width of 80 characters and applying soft text wrapping. This
    update improves the overall layout when displaying configuration data,
    ensuring that long entries are neatly formatted. Users will experience a
    clearer and more user-friendly presentation when reviewing configuration
    values.
    
    <!-- end of auto-generated comment: release notes by coderabbit.ai -->
    virajbhartiya authored Feb 13, 2025
    b7f9cd6

Commits on Feb 18, 2025

  1. fix golangci-lint workflow (#4854)

    The workflow is failing because `golangci-lint-action` added config
    validation, and our config still references outdated lints that were
    dropped a couple of years ago.
    
    This PR fixes the issue by removing the dropped lint configs, and also
    mitigates similar issues in the future by:
    - Pinning the `golangci-lint` version used by the action to the current
    latest version, v1.64.5, instead of always tracking the latest release,
    which may be incompatible with our configs
    - Pinning `golangci-lint-action` to the current latest version, v6.5.0, to
    avoid changes in behaviour such as config validation
    
    This allows us to update whenever we are ready, without blocking our
    pipeline when the latest versions are incompatible with our repo.
    
    
    <!-- This is an auto-generated comment: release notes by coderabbit.ai
    -->
    
    ## Summary by CodeRabbit
    
    - **Chores**
    - Updated and streamlined automated quality checks for improved
    consistency.
    - Upgraded linting tool versions across the development process to
    enhance code quality.
    
    <!-- end of auto-generated comment: release notes by coderabbit.ai -->
    wdbaruni authored Feb 18, 2025
    655223d
  2. install ca-certificates in docker images (#4853)

    <!-- This is an auto-generated comment: release notes by coderabbit.ai
    -->
    ## Summary by CodeRabbit
    
    - **Chores**
    - Improved container image builds by installing essential security
    certificates and performing necessary cleanup steps.
    - Streamlined dependency management by removing unnecessary utilities
    and focusing on essential packages.
    - Adjusted environment configuration by removing the `PATH` environment
    variable setting.
    <!-- end of auto-generated comment: release notes by coderabbit.ai -->
    wdbaruni authored Feb 18, 2025
    ea31142

Commits on Feb 20, 2025

  1. Enable port mapping and network configuration for jobs (#4855)

    This PR introduces networking capabilities to Bacalhau jobs, enabling
    containerized workloads to expose services and communicate with host
    networks. This is a significant enhancement that opens up new use cases
    for Bacalhau and enables inter-job communication.
    
    ## Core Changes
    
    ### 1. Network Modes
    - Introduced explicit network configuration in job specifications
    - Supported modes:
      * `bridge`
      * `host`: (Linux only) Direct access to host network stack
      * `none`: Complete network isolation
    - Renamed legacy 'full' mode to more accurate 'host' mode
    
    
    ### 2. Port Mapping System
    - Introduced `PortMapping` configuration:
      ```go
      type PortMapping struct {
          Name   string // Required identifier for the port
          Static int    // Optional host port (auto-allocated if not specified)
          Target int    // Container port to expose (only in bridge mode)
      }
      ```
    - Added port allocation system to manage host ports:
      * Support for both static (user-defined) and dynamic (auto-allocated)
        ports
      * Conflict detection and resolution
      * Port range configuration for dynamic allocation
      * Port release on job completion
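
    To make the allocation rules above concrete, here is a minimal in-memory
    sketch. `PortAllocator` and its methods are hypothetical names, and the
    sketch only tracks reservations; it does not check whether a port is
    actually free on the host, which a real allocator would likely need to do.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    	"sync"
    )

    // PortAllocator (hypothetical) reserves host ports for running jobs.
    type PortAllocator struct {
    	mu         sync.Mutex
    	start, end int // inclusive range used for dynamic allocation
    	used       map[int]struct{}
    }

    func NewPortAllocator(start, end int) *PortAllocator {
    	return &PortAllocator{start: start, end: end, used: make(map[int]struct{})}
    }

    // Allocate reserves the static port when it is non-zero, otherwise the
    // first unreserved port in the dynamic range.
    func (a *PortAllocator) Allocate(static int) (int, error) {
    	a.mu.Lock()
    	defer a.mu.Unlock()
    	if static != 0 {
    		if _, taken := a.used[static]; taken {
    			return 0, fmt.Errorf("port %d is already reserved", static)
    		}
    		a.used[static] = struct{}{}
    		return static, nil
    	}
    	for p := a.start; p <= a.end; p++ {
    		if _, taken := a.used[p]; !taken {
    			a.used[p] = struct{}{}
    			return p, nil
    		}
    	}
    	return 0, errors.New("no free port in the dynamic range")
    }

    // Release frees a port, for example when the job completes.
    func (a *PortAllocator) Release(port int) {
    	a.mu.Lock()
    	defer a.mu.Unlock()
    	delete(a.used, port)
    }

    func main() {
    	alloc := NewPortAllocator(20000, 20010)
    	static, _ := alloc.Allocate(8080) // user-defined host port
    	dynamic, _ := alloc.Allocate(0)   // auto-allocated host port
    	_, err := alloc.Allocate(8080)    // conflicts with the first reservation
    	fmt.Println(static, dynamic, err)
    	alloc.Release(static)
    }
    ```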
    
    ### 3. Environment Variables
    Added standardized environment variables for port discovery:
    - `BACALHAU_HOST_PORT_{name}`: Host port mapping
    - `BACALHAU_PORT_{name}`: Container port mapping
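
    From inside a job, these variables could be consumed along the following
    lines. This is a sketch: `lookupPort` is a hypothetical helper, and it
    assumes the mapping name is appended verbatim (for example `http`), which
    the description does not spell out.

    ```go
    package main

    import (
    	"fmt"
    	"os"
    	"strconv"
    )

    // lookupPort reads the variable prefix+name and validates it as a port.
    func lookupPort(prefix, name string) (int, error) {
    	key := prefix + name
    	raw, ok := os.LookupEnv(key)
    	if !ok {
    		return 0, fmt.Errorf("%s is not set", key)
    	}
    	port, err := strconv.Atoi(raw)
    	if err != nil || port < 1 || port > 65535 {
    		return 0, fmt.Errorf("%s=%q is not a valid port", key, raw)
    	}
    	return port, nil
    }

    func main() {
    	// Simulate what a job with a port mapping named "http" might see.
    	os.Setenv("BACALHAU_HOST_PORT_http", "8080")
    	os.Setenv("BACALHAU_PORT_http", "80")

    	hostPort, err := lookupPort("BACALHAU_HOST_PORT_", "http")
    	if err != nil {
    		fmt.Println("error:", err)
    		return
    	}
    	containerPort, err := lookupPort("BACALHAU_PORT_", "http")
    	if err != nil {
    		fmt.Println("error:", err)
    		return
    	}
    	fmt.Printf("listen on :%d, reachable on host port %d\n", containerPort, hostPort)
    }
    ```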
    
    
    Example Usage:
    ```yaml
    network:
      type: bridge
      ports:
        - name: http
          static: 8080 # Host port
          target: 80   # Container port
        - name: metrics
          target: 9090 # Dynamic host port allocation
    ```
    
    
    Testing:
    - Added tests for port mapping in both host and bridge modes
    - Added tests for environment variable injection
    - Added tests for host service accessibility
    
    Future Work:
    1. Auto-discovery of compute node IP addresses
    2. Service discovery system for job intercommunication
    
    
    <!-- This is an auto-generated comment: release notes by coderabbit.ai
    -->
    ## Summary by CodeRabbit
    
    
    - **New Features**
    - Enhanced networking with dynamic port allocation and support for both
    host and isolated bridge modes.
    - Improved compute node configuration by exposing advertised address
    details for better connectivity.
    - Added new properties and definitions for network configurations and
    port mappings in the API.
    - Introduced a comprehensive suite of integration tests for networking
    functionalities, validating various configurations and scenarios.
    - Added validation for network configurations during task submissions to
    ensure correctness.
    
    - **Refactor**
    - Streamlined network configuration handling and parameter management
    across the system.
    
    - **Tests**
    - Expanded test coverage for network configurations and port mapping
    scenarios, including new tests for various Docker networking modes and
    validation of port allocation behaviors.
      - Introduced new validation tests for task network configurations.
      - Added a new test suite for execution port allocation functionality.
    - Established a new integration test suite for networking
    functionalities.
    
    - **Documentation**
    - Updated API and configuration guides to reflect the new network
    capabilities and port mapping definitions.
    
    <!-- end of auto-generated comment: release notes by coderabbit.ai -->
    
    ---------
    
    Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
    wdbaruni and coderabbitai[bot] authored Feb 20, 2025
    5de7ab8