Comparing changes
base repository: bacalhau-project/bacalhau
base: v1.6.3
head repository: bacalhau-project/bacalhau
compare: v1.6.4
- 8 commits
- 71 files changed
- 4 contributors
Commits on Feb 5, 2025
-
support plain encoding in s3 publisher (#4840)
# Add plain encoding option to S3 Publisher

## Description

This PR adds support for publishing job results without gzip compression through a new `Encoding` type. This enables more efficient data pipelines where subsequent jobs need to access individual files from previous job results.

Previously, all job results were automatically gzip compressed before uploading to S3. While this is efficient for storage and download, it requires downloading and decompressing the entire archive to access any file. This can be inefficient for workflows like map-reduce where jobs only need specific files from previous results.

### Changes

- Added new `Encoding` type with `EncodingGzip` (default) and `EncodingPlain` options
- Updated validation to check for valid encoding values
- Maintains gzip compression as default behavior for backwards compatibility

## Benefits and Tradeoffs

### Benefits

- Enables efficient access to individual files from job results
- Better support for data pipelines where jobs consume partial results
- No decompression overhead when accessing results

### Tradeoffs

- More S3 PUT requests (one per file vs one archive)
- Higher storage costs (no compression)
- Higher network costs for full result downloads
- Cannot use `bacalhau job get` with pre-signed URLs when using plain encoding (requires individual file URLs)

## Usage Recommendations

- Use plain encoding when:
  - Subsequent jobs need to access individual files
  - Results will be frequently accessed by file
  - Building data pipelines with partial result access
- Keep default gzip encoding when:
  - Results are typically accessed as a complete set
  - Storage/transfer costs are a concern
  - Using `bacalhau job get` functionality

## Summary by CodeRabbit

- **New Features**
  - Enhanced S3 publishing now supports flexible encoding options, allowing for compressed archive uploads (gzip) or individual file uploads (plain).
  - New test cases added to validate different encoding scenarios for both publishing and downloading.
  - Improved handling of publisher specifications with a focus on encoding types.
- **Bug Fixes**
  - Addressed error handling for invalid encoding values in publisher specifications.
- **Documentation**
  - Updated test cases to reflect changes in encoding handling and improve overall test coverage.
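The encoding validation described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: only the names `Encoding`, `EncodingGzip`, and `EncodingPlain` come from the description, while the string values `"gzip"` and `"plain"`, the helper `ParseEncoding`, and its case-insensitive handling are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// Encoding describes how job results are written to S3.
// The string values below are assumptions made for this sketch.
type Encoding string

const (
	EncodingGzip  Encoding = "gzip"  // one compressed archive (default)
	EncodingPlain Encoding = "plain" // one object per result file
)

// ParseEncoding (hypothetical helper) normalizes user input and rejects
// unknown values. An empty value falls back to gzip so that existing
// publisher specs keep their current behavior.
func ParseEncoding(s string) (Encoding, error) {
	switch e := Encoding(strings.ToLower(strings.TrimSpace(s))); e {
	case "":
		return EncodingGzip, nil
	case EncodingGzip, EncodingPlain:
		return e, nil
	default:
		return "", fmt.Errorf("invalid encoding %q: expected %q or %q",
			s, EncodingGzip, EncodingPlain)
	}
}

func main() {
	for _, in := range []string{"", "plain", "zip"} {
		enc, err := ParseEncoding(in)
		fmt.Printf("input=%q encoding=%q err=%v\n", in, enc, err)
	}
}
```

Treating the empty string as gzip is one way to keep older specs valid without a migration; the real implementation may handle defaults differently.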
Commit 8ae8a00
Commits on Feb 8, 2025
-
Fix race conditions when starting dind containers (#4841)
Fixes race conditions when multiple dind containers are started at the same time, which is usually the case with docker-compose.

## Summary by CodeRabbit

- **Chores**
  - Enhanced the container startup process for improved reliability.
  - Introduced a random initial delay to help prevent simultaneous startup issues.
  - Implemented a robust retry mechanism with better error handling and cleanup for smoother operation.

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Commit d151162
Commits on Feb 9, 2025
-
Host Environment Variable Forwarding (#4842)
This PR adds the ability to securely forward host environment variables to job executions. This enables passing credentials and secrets from the host to jobs through a controlled allowlist mechanism.

## Key Changes

- Added support for referencing host environment variables using `env:` prefix
- Implemented allowlist-based security controls at the compute node level
- Added early validation in bid strategy to fail fast when jobs request non-allowlisted variables

## Usage Example

```yaml
# Job specification
Tasks:
  - Name: main
    Env:
      API_KEY: "env:API_KEY"  # Forward host's API_KEY
      LOG_PATH: "/logs"       # Regular literal value
    Engine:
      Type: docker
      Params:
        Image: ubuntu:latest

# Compute node configuration
compute:
  env:
    allowlist:
      - "API_*"  # Allow forwarding of any env var starting with API_
```

## Security Design

- Host variables must be explicitly allowlisted using patterns (e.g., `API_*`)
- Jobs must explicitly request variables using `env:` prefix
- Early validation during bid phase prevents scheduling jobs that request non-allowlisted variables
- Creates clear audit trail of which credentials each job requested

## Future Work

The architecture introduced in this PR is designed to be extensible to support secret management systems. The `env:` prefix pattern will evolve to support additional sources like:

```yaml
Tasks:
  - Name: main
    Env:
      API_KEY: "vault:secrets/api-key"  # HashiCorp Vault
      DB_PASS: "aws:prod/db/password"   # AWS Secrets Manager
      CERT: "azure:certificates/prod"   # Azure Key Vault
```

This foundation enables:

- Integration with popular secret vaults and cloud provider secret managers
- Dynamic credential generation and rotation
- More granular access control patterns

The key difference from the previous implementation is that we now support referencing host environment variables (prefixed with `env:`) in addition to literal values. This is implemented with security in mind: only explicitly allowlisted patterns can be forwarded, and jobs must declare which variables they need for audit purposes.

## Summary by CodeRabbit

- **New Features**
  - Introduced dynamic environment variable resolution for jobs and compute nodes, allowing secure usage of host variables through customizable allow lists.
  - Integrated environment variable handling into job execution and bidding workflows for more flexible configurations.
- **Documentation**
  - Updated API specifications and schema descriptions to clarify how environment variables can be configured, including support for direct values and host references.
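The allowlist check described above might look roughly like the following. This is a sketch under stated assumptions: glob matching via the standard library's `path.Match`, the helper names `isAllowed` and `resolveEnv`, and the decision to fail when an allowlisted host variable is unset are all illustrative choices, not details confirmed by the PR.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

const envPrefix = "env:" // prefix taken from the PR description

// isAllowed reports whether name matches any allowlist pattern.
// Using path.Match for glob patterns is an assumption of this sketch.
func isAllowed(name string, allowlist []string) bool {
	for _, pattern := range allowlist {
		if ok, err := path.Match(pattern, name); err == nil && ok {
			return true
		}
	}
	return false
}

// resolveEnv (hypothetical) copies literal values as-is and replaces
// "env:NAME" references with the host value returned by lookup. It rejects
// any reference that is not allowlisted, which mirrors the fail-fast
// validation the PR performs during bidding.
func resolveEnv(jobEnv map[string]string, allowlist []string,
	lookup func(string) (string, bool)) (map[string]string, error) {
	out := make(map[string]string, len(jobEnv))
	for key, val := range jobEnv {
		if !strings.HasPrefix(val, envPrefix) {
			out[key] = val
			continue
		}
		hostName := strings.TrimPrefix(val, envPrefix)
		if !isAllowed(hostName, allowlist) {
			return nil, fmt.Errorf("host variable %q is not allowlisted", hostName)
		}
		hostVal, found := lookup(hostName)
		if !found {
			return nil, fmt.Errorf("host variable %q is not set", hostName)
		}
		out[key] = hostVal
	}
	return out, nil
}

func main() {
	host := map[string]string{"API_KEY": "s3cr3t", "HOME": "/root"}
	lookup := func(k string) (string, bool) { v, ok := host[k]; return v, ok }

	env, err := resolveEnv(
		map[string]string{"API_KEY": "env:API_KEY", "LOG_PATH": "/logs"},
		[]string{"API_*"}, lookup)
	fmt.Println(env, err)

	_, err = resolveEnv(map[string]string{"H": "env:HOME"}, []string{"API_*"}, lookup)
	fmt.Println(err)
}
```

In a real node, `os.LookupEnv` has the same signature as `lookup` and could be passed directly; the map-backed lookup here only keeps the example deterministic.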
Commit e36b9ab
Commits on Feb 12, 2025
-
Fixing the readme logo. (For some reason, I couldn't get tests to pass until I updated retracted packages.)

## Summary by CodeRabbit

- **Documentation**
  - Updated the Bacalhau logo image path in the documentation for clearer asset referencing.
- **Chores**
  - Upgraded multiple dependency versions to enhance overall stability, security, and performance.
Commit f0d4955
Commits on Feb 13, 2025
-
Enhance value column configuration with width constraints in config list (#4848)

Closes #4789

This pull request includes a change to the `cmd/cli/config/list.go` file to improve the display of the "Value" column in the configuration list table.

Enhancements to table column configuration:

* [`cmd/cli/config/list.go`](diffhunk://#diff-30602d5cf9bbe56af2982dd1b800c529ec64a62985243376475e16c75b89c971L102-R102): Added `WidthMax` and `WidthMaxEnforcer` properties to the "Value" column configuration to ensure the column width does not exceed 80 characters and to wrap text softly if it does.

## Summary by CodeRabbit

- **New Features**
  - Enhanced the display of configuration information by enforcing a maximum width of 80 characters and applying soft text wrapping. This update improves the overall layout when displaying configuration data, ensuring that long entries are neatly formatted. Users will experience a clearer and more user-friendly presentation when reviewing configuration values.
Commit b7f9cd6
Commits on Feb 18, 2025
-
fix golangci-lint workflow (#4854)
The workflow is failing because `golangci-lint-action` added config validation, and our config still uses outdated linters that were dropped a couple of years ago. This PR fixes the issue by removing the dropped configs, and also mitigates similar issues in the future by:

- Pinning the `golangci-lint` version used by the action to the current latest version, v1.64.5, instead of always tracking the latest release, which may be incompatible with our configs
- Pinning `golangci-lint-action` to the current latest version, v6.5.0, to avoid changes in behaviour such as config validation

This allows us to update whenever we are ready, and keeps our pipeline from being blocked when the latest versions are incompatible with our repo.

## Summary by CodeRabbit

- **Chores**
  - Updated and streamlined automated quality checks for improved consistency.
  - Upgraded linting tool versions across the development process to enhance code quality.
Commit 655223d
install ca-certificates in docker images (#4853)
## Summary by CodeRabbit

- **Chores**
  - Improved container image builds by installing essential security certificates and performing necessary cleanup steps.
  - Streamlined dependency management by removing unnecessary utilities and focusing on essential packages.
  - Adjusted environment configuration by removing the `PATH` environment variable setting.
Commit ea31142
Commits on Feb 20, 2025
-
Enable port mapping and network configuration for jobs (#4855)
This PR introduces networking capabilities to Bacalhau jobs, enabling containerized workloads to expose services and communicate with host networks. This is a significant enhancement that opens up new use cases for Bacalhau and enables inter-job communication.

## Core Changes

### 1. Network Modes

- Introduced explicit network configuration in job specifications
- Supported modes:
  * `bridge`
  * `host`: (Linux only) Direct access to host network stack
  * `none`: Complete network isolation
- Renamed legacy 'full' mode to more accurate 'host' mode

### 2. Port Mapping System

- Introduced `PortMapping` configuration:

  ```go
  type PortMapping struct {
      Name   string // Required identifier for the port
      Static int    // Optional host port (auto-allocated if not specified)
      Target int    // Container port to expose (only in bridge mode)
  }
  ```

- Added port allocation system to manage host ports:
  * Support for both static (user-defined) and dynamic (auto-allocated) ports
  * Conflict detection and resolution
  * Port range configuration for dynamic allocation
  * Port release on job completion

### 3. Environment Variables

Added standardized environment variables for port discovery:

- `BACALHAU_HOST_PORT_{name}`: Host port mapping
- `BACALHAU_PORT_{name}`: Container port mapping

Example Usage:

```yaml
network:
  type: bridge
  ports:
    - name: http
      static: 8080  # Host port
      target: 80    # Container port
    - name: metrics
      target: 9090  # Dynamic host port allocation
```

Testing:

- Added tests for port mapping in both host and bridge modes
- Added tests for environment variable injection
- Added tests for host service accessibility

Future Work:

1. Auto-discovery of compute node IP addresses
2. Service discovery system for job intercommunication

## Summary by CodeRabbit

- **New Features**
  - Enhanced networking with dynamic port allocation and support for both host and isolated bridge modes.
  - Improved compute node configuration by exposing advertised address details for better connectivity.
  - Added new properties and definitions for network configurations and port mappings in the API.
  - Introduced a comprehensive suite of integration tests for networking functionalities, validating various configurations and scenarios.
  - Added validation for network configurations during task submissions to ensure correctness.
- **Refactor**
  - Streamlined network configuration handling and parameter management across the system.
- **Tests**
  - Expanded test coverage for network configurations and port mapping scenarios, including new tests for various Docker networking modes and validation of port allocation behaviors.
  - Introduced new validation tests for task network configurations.
  - Added a new test suite for execution port allocation functionality.
  - Established a new integration test suite for networking functionalities.
- **Documentation**
  - Updated API and configuration guides to reflect the new network capabilities and port mapping definitions.

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
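The port allocation and port discovery variables described in this PR can be illustrated with the sketch below. Only the `PortMapping` fields and the two variable name prefixes come from the description. The `portAllocator` type, its first-free scanning strategy, writing the allocated port back into `Static`, and emitting `BACALHAU_PORT_{name}` only when `Target` is set are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// PortMapping mirrors the struct shown in the PR description.
type PortMapping struct {
	Name   string
	Static int
	Target int
}

// portAllocator (hypothetical) hands out host ports from [start, end]
// and remembers which ports are taken.
type portAllocator struct {
	start, end int
	used       map[int]bool
}

func newPortAllocator(start, end int) *portAllocator {
	return &portAllocator{start: start, end: end, used: map[int]bool{}}
}

// allocate reserves static ports first, then assigns the lowest free port
// in the range to each mapping without one. On any failure it rolls back
// the ports reserved by this call.
func (a *portAllocator) allocate(in []PortMapping) ([]PortMapping, error) {
	out := append([]PortMapping(nil), in...)
	var reserved []int
	fail := func(err error) ([]PortMapping, error) {
		for _, p := range reserved {
			delete(a.used, p)
		}
		return nil, err
	}
	for _, m := range out {
		if m.Static == 0 {
			continue
		}
		if a.used[m.Static] {
			return fail(fmt.Errorf("port %d (%s) is already in use", m.Static, m.Name))
		}
		a.used[m.Static] = true
		reserved = append(reserved, m.Static)
	}
	for i := range out {
		if out[i].Static != 0 {
			continue
		}
		p := a.start
		for p <= a.end && a.used[p] {
			p++
		}
		if p > a.end {
			return fail(fmt.Errorf("no free port for %s", out[i].Name))
		}
		a.used[p] = true
		reserved = append(reserved, p)
		out[i].Static = p
	}
	return out, nil
}

// release frees the host ports of a finished job.
func (a *portAllocator) release(ms []PortMapping) {
	for _, m := range ms {
		delete(a.used, m.Static)
	}
}

// portEnv builds the discovery variables named in the PR.
func portEnv(ms []PortMapping) map[string]string {
	env := map[string]string{}
	for _, m := range ms {
		env["BACALHAU_HOST_PORT_"+m.Name] = strconv.Itoa(m.Static)
		if m.Target != 0 {
			env["BACALHAU_PORT_"+m.Name] = strconv.Itoa(m.Target)
		}
	}
	return env
}

func main() {
	alloc := newPortAllocator(20000, 20010)
	ports, err := alloc.allocate([]PortMapping{
		{Name: "http", Static: 8080, Target: 80},
		{Name: "metrics", Target: 9090},
	})
	fmt.Println(ports, err)
	fmt.Println(portEnv(ports))
	alloc.release(ports)
}
```

Reserving static ports before dynamic ones keeps a user-requested port inside the dynamic range from being handed out to another mapping in the same job.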
Commit 5de7ab8
You can try running this command locally to see the comparison on your machine:
git diff v1.6.3...v1.6.4