[tests] Introduce cozytest - a new testing framework #982

kvaps · 2025-05-23T18:59:15Z

Signed-off-by: Andrei Kvapil kvapss@gmail.com

Summary by CodeRabbit

New Features
- Introduced separate end-to-end test scripts for cluster and application provisioning, improving test clarity and modularity.
- Added a new test runner script with enhanced output formatting and live tracing for easier debugging of test runs.
Bug Fixes
- None.
Chores
- Removed the legacy end-to-end test automation script.
- Updated testing workflow to use new modular test scripts and runner for improved maintainability.

coderabbitai · 2025-05-23T18:59:22Z

Walkthrough

The end-to-end test process for CozyStack has been refactored from a single shell script to a modular system using Bats test scripts and a custom test runner. The Makefile now runs separate cluster and application tests, with new scripts introduced for each, and the original monolithic script removed.

Changes

File(s)	Change Summary
hack/e2e.sh	Deleted the comprehensive end-to-end test automation shell script.
hack/e2e-cluster.bats, hack/e2e-apps.bats	Added new Bats test scripts for cluster and application end-to-end testing.
hack/cozytest.sh	Added a custom shell-based test runner for Bats scripts with enhanced output and tracing.
packages/core/testing/Makefile	Refactored test targets: split into `test-cluster` and `test-apps`, updated `apply` env vars.

Sequence Diagram(s)

sequenceDiagram
    participant Dev as Developer
    participant Makefile
    participant CozyTest as hack/cozytest.sh
    participant BatsCluster as e2e-cluster.bats
    participant BatsApps as e2e-apps.bats

    Dev->>Makefile: make test
    Makefile->>CozyTest: Run test-cluster (e2e-cluster.bats)
    CozyTest->>BatsCluster: Execute all cluster E2E tests
    Makefile->>CozyTest: Run test-apps (e2e-apps.bats)
    CozyTest->>BatsApps: Execute all application E2E tests

Suggested labels

ok-to-test

Suggested reviewers

lllamnyp
klinch0

Poem

A rabbit hops through lines anew,
The old script’s gone, split into two.
With Bats and cozytest in tow,
The cluster and apps now each have their show.
Modular tests, clear and bright—
CozyStack’s future is looking right!
🐇✨

✨ Finishing Touches

📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (3)

hack/run_test.sh (2)
3-3: Replace non-ASCII characters in comments.

The comment contains Russian text which might cause encoding issues in some environments.
-#  run_test.sh — POSIX-sh test-runner с live-trace и корректным RC-выходом    #
+#  run_test.sh - POSIX-sh test-runner with live-trace and correct RC-exit     #
26-26: Translate Russian comments to English for better compatibility.

Several inline comments contain Russian text which might cause encoding issues.
-  skip_next="+ $fn"      # первую строку трассировки с именем функции пропустим
+  skip_next="+ $fn"      # skip first trace line with function name

           '__RC__'*) : ;;                        # маркер не выводим
+          '__RC__'*) : ;;                        # don't output marker
Also applies to: 37-42
hack/e2e.bats (1)
27-30: Consider checking file existence before reading.

While the current approach works with || true, it would be cleaner to check if PID files exist first.
 @test "Clean previous VMs" {
- kill $(cat srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid 2>/dev/null) 2>/dev/null || true
+ for i in 1 2 3; do
+   [ -f "srv${i}/qemu.pid" ] && kill "$(cat "srv${i}/qemu.pid")" 2>/dev/null || true
+ done
  rm -rf srv1 srv2 srv3
 }

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a8ba6b1 and d02f6bf.

📒 Files selected for processing (4)

hack/e2e.bats (1 hunks)
hack/e2e.sh (0 hunks)
hack/run_test.sh (1 hunks)
packages/core/testing/Makefile (1 hunks)

💤 Files with no reviewable changes (1)

hack/e2e.sh

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: Build and Test

🔇 Additional comments (2)

packages/core/testing/Makefile (1)

34-34: LGTM! Proper integration of the new test framework.

The Makefile correctly invokes the new test runner with the Bats test file, maintaining the same environment setup.

hack/e2e.bats (1)

234-385: Excellent comprehensive test coverage!

The test suite provides thorough end-to-end validation of the Cozystack provisioning process, including:

Proper timeouts for async operations

Appropriate error handling with reconciliation retries

Verification of all major components (storage, networking, monitoring, OIDC)

hack/run_test.sh

hack/e2e.bats

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (6)

hack/e2e-apps.bats (2)
3-3: Replace non-ASCII hyphen with standard ASCII hyphen.

The comment contains a non-ASCII hyphen character (‑) which could cause issues in some environments.
-# Cozystack end‑to‑end provisioning test (Bats)
+# Cozystack end-to-end provisioning test (Bats)
87-87: Remove redundant namespace status check.

The namespace tenant-test was already verified to be Active in the previous test. This check is redundant unless tests can be run independently.

If tests are meant to be run independently, consider adding a comment explaining this. Otherwise, remove this line:
-  kubectl wait namespace tenant-test --timeout=20s --for=jsonpath='{.status.phase}'=Active
hack/e2e-cluster.bats (4)
3-3: Replace non-ASCII hyphen with standard ASCII hyphen.

The comment contains a non-ASCII hyphen character (‑) which could cause issues in some environments.
-# Cozystack end‑to‑end provisioning test (Bats)
+# Cozystack end-to-end provisioning test (Bats)
59-59: Consider using consistent subnet mask.

The VMs are configured with /26 subnet mask while the bridge uses /24. This inconsistency might cause confusion.

For a 3-node test setup, using /24 throughout would be simpler and more consistent:
-      - "192.168.123.1${i}/26"
+      - "192.168.123.1${i}/24"
110-112: Consider more robust VM startup verification.

The fixed 5-second sleep might not be sufficient in all environments (slow systems, high load, etc.).

Since the next test already waits for the Talos API port, consider removing the sleep or replacing it with a network readiness check:
-  # Give qemu a few seconds to start up networking
-  sleep 5
+  # Network readiness will be verified in the next test
266-267: Improve readability by splitting the long kubectl wait command.

The command line is very long and hard to read.

Consider using a loop or splitting into multiple wait commands:
# Wait for all deployments
deployments=(
  "capi-controller-manager"
  "capi-kamaji-controller-manager"
  "capi-kubeadm-bootstrap-controller-manager"
  "capi-operator-cluster-api-operator"
  "capk-controller-manager"
)

for deploy in "${deployments[@]}"; do
  kubectl wait "deployment/${deploy}" -n cozy-cluster-api --timeout=1m --for=condition=available
done

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 326870a and b107d80.

📒 Files selected for processing (5)

hack/e2e-apps.bats (1 hunks)
hack/e2e-cluster.bats (1 hunks)
hack/e2e.sh (0 hunks)
hack/run_test.sh (1 hunks)
packages/core/testing/Makefile (2 hunks)

💤 Files with no reviewable changes (1)

hack/e2e.sh

🚧 Files skipped from review as they are similar to previous changes (2)

packages/core/testing/Makefile
hack/run_test.sh

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: Build and Test

🔇 Additional comments (2)

hack/e2e-apps.bats (1)

6-24: Test implementation looks good!

The tenant creation test is well-structured with appropriate resource definition and reasonable timeout values for waiting on HelmRelease and namespace readiness.

hack/e2e-cluster.bats (1)

1-384: Excellent comprehensive test implementation!

The test suite is well-structured with:

Logical progression from setup to deployment

Good error handling and informative messages

Appropriate use of timeouts and wait conditions

Idempotent operations where needed

End-to-end verification including HTTP checks

This provides a solid foundation for automated testing of the Cozystack cluster provisioning process.

hack/e2e-apps.bats

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (3)

hack/e2e-cluster.bats (3)
75-76: Progress output redirection may cause issues.

The combination of --output-file /dev/stdout and 2>/dev/null in the wget command could suppress important error messages while trying to redirect progress output.

Consider simplifying the progress handling:
-wget https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz \
-  -O nocloud-amd64.raw.xz --show-progress --output-file /dev/stdout --progress=dot:giga 2>/dev/null
+wget https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz \
+  -O nocloud-amd64.raw.xz --progress=dot:giga
101-107: Consider making VM resource allocation configurable.

The VM resources (8 CPUs, 16GB RAM) are hardcoded, which may not be suitable for all testing environments.

Consider using environment variables for resource configuration:
+# Set default values if not provided
+: ${QEMU_CPUS:=8}
+: ${QEMU_MEMORY:=16384}
+
qemu-system-x86_64 -machine type=pc,accel=kvm -cpu host -smp ${QEMU_CPUS} -m ${QEMU_MEMORY} \
115-115: Enhance timeout error message for better debugging.

The current timeout command doesn't provide clear information about which specific node failed to respond.

Consider adding more descriptive error handling:
-timeout 60 sh -ec 'until nc -nz 192.168.123.11 50000 && nc -nz 192.168.123.12 50000 && nc -nz 192.168.123.13 50000; do sleep 1; done'
+timeout 60 sh -ec '
+  until nc -nz 192.168.123.11 50000 && nc -nz 192.168.123.12 50000 && nc -nz 192.168.123.13 50000; do 
+    echo "Waiting for Talos API on nodes 11, 12, 13..." >&2
+    sleep 1
+  done'

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b107d80 and 8ac7143.

📒 Files selected for processing (3)

hack/e2e-apps.bats (1 hunks)
hack/e2e-cluster.bats (1 hunks)
packages/core/testing/Makefile (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

hack/e2e-apps.bats
packages/core/testing/Makefile

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: pre-commit
GitHub Check: Build and Test

🔇 Additional comments (2)

hack/e2e-cluster.bats (2)

258-261: Excellent error handling pattern for HelmRelease validation.

This is a good example of failing fast when critical components don't reconcile properly, with clear error messaging.

1-384: Well-structured comprehensive testing framework.

This Bats script provides excellent coverage of the Cozystack cluster provisioning lifecycle. The modular approach with individual test cases for each component makes it easy to debug failures and understand the setup process.

Key strengths:

Comprehensive coverage from VM setup to application verification

Good use of timeouts and retries for async operations

Proper cleanup and preparation steps

Clear test case organization

coderabbitai · 2025-05-24T22:05:32Z

hack/e2e-cluster.bats

+  if ! kubectl wait hr/monitoring -n tenant-root --timeout=2m --for=condition=ready; then
+    flux reconcile hr monitoring -n tenant-root --force
+    kubectl wait hr/monitoring -n tenant-root --timeout=2m --for=condition=ready
+  fi


💡 Verification agent

🧩 Analysis chain

Force reconciliation suggests potential instability.

The fallback to flux reconcile hr monitoring -n tenant-root --force suggests the monitoring HelmRelease may have intermittent issues that require manual intervention.

This pattern might indicate an underlying issue with the monitoring component. Consider investigating why force reconciliation is needed:

🏁 Script executed:

#!/bin/bash # Search for similar force reconciliation patterns in the codebase rg -A 5 -B 5 "flux reconcile.*--force"

Length of output: 944

Investigate the need for forced reconciliation

The flux reconcile hr monitoring … --force call in hack/e2e-cluster.bats is the only occurrence of a forced reconciliation in our test suite. Relying on --force can mask intermittent failures of the monitoring HelmRelease. Rather than suppressing these flakes, let’s:

Identify why kubectl wait hr/monitoring occasionally times out (e.g., resource contention, misconfiguration, insufficient timeout).

Consider increasing the --timeout or adding readiness checks to make the test more robust.

If --force remains necessary, document the root cause and rationale in a comment so future maintainers understand why this exception exists.

🤖 Prompt for AI Agents

In hack/e2e-cluster.bats around lines 350 to 353, the script uses `flux reconcile hr monitoring --force` as a fallback when `kubectl wait hr/monitoring` times out, which can hide intermittent issues. To fix this, first investigate why the wait command times out by checking for resource contention, misconfiguration, or insufficient timeout duration. Increase the `--timeout` value or add additional readiness checks to improve stability. If the forced reconciliation is still required, add a detailed comment explaining the root cause and why this exception is necessary for future maintainers.

hack/e2e-cluster.bats

coderabbitai · 2025-05-24T22:05:33Z

hack/e2e-cluster.bats

+}
+
+@test "Clean previous VMs" {
+ kill $(cat srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid 2>/dev/null) 2>/dev/null || true


⚠️ Potential issue

Unsafe process termination could affect unrelated processes.

The kill $(cat srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid 2>/dev/null) command could potentially kill unrelated processes if the PID files contain stale PIDs that have been reused by the system.

Consider adding process validation before termination:

-kill $(cat srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid 2>/dev/null) 2>/dev/null || true +for pidfile in srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid; do + if [ -f "$pidfile" ]; then + pid=$(cat "$pidfile" 2>/dev/null || echo "") + if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then + kill "$pid" 2>/dev/null || true + fi + fi +done

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

kill $(cat srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid 2>/dev/null) 2>/dev/null || true

for pidfile in srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid; do

if [ -f "$pidfile" ]; then

pid=$(cat "$pidfile" 2>/dev/null || echo "")

if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then

kill "$pid" 2>/dev/null || true

fi

fi

done

🤖 Prompt for AI Agents

In hack/e2e-cluster.bats at line 27, the current kill command uses PIDs from files without validation, risking termination of unrelated processes if PIDs are stale. To fix this, read each PID from the files, verify that the process corresponds to the expected QEMU instance (e.g., by checking the command line or process name), and only then issue the kill command. This ensures only the intended processes are terminated safely.

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (2)

hack/e2e-cluster.bats (2)
350-353: Force reconciliation still present from previous review

The forced reconciliation pattern was flagged in a previous review as potentially masking underlying issues.

As noted in the previous review, investigate why the monitoring HelmRelease occasionally requires force reconciliation instead of relying on this workaround.

27-27: ⚠️ Potential issue

Address unsafe process termination from previous review

This issue was flagged in a previous review but remains unresolved. The kill command could terminate unrelated processes if PIDs are stale.

Apply the suggested fix from the previous review:
-kill $(cat srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid 2>/dev/null) 2>/dev/null || true
+for pidfile in srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid; do
+  if [ -f "$pidfile" ]; then
+    pid=$(cat "$pidfile" 2>/dev/null || echo "")
+    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
+      kill "$pid" 2>/dev/null || true
+    fi
+  fi
+done

🧹 Nitpick comments (5)

hack/cozytest.sh (4)
22-23: Improve error handling for temporary directory creation

The error handling is good, but the cleanup should be more robust.
 run_one() {
   fn=$1 title=$2
-  tmp=$(mktemp -d) || { echo "Failed to create temp directory" >&2; exit 1; }
+  tmp=$(mktemp -d) || { echo "Failed to create temp directory" >&2; exit 1; }
+  trap 'rm -rf "$tmp"' EXIT
   log="$tmp/log"
27-27: Remove non-English comment

The comment contains non-English text which should be translated for consistency.
-  skip_next="+ $fn"      # первую строку трассировки с именем функции пропустим
+  skip_next="+ $fn"      # skip the first trace line with function name
86-86: Remove non-English comment

Another non-English comment that should be translated.
-    print "  set -e"           # ошибка → падение теста
+    print "  set -e"           # error causes test failure
90-90: Remove non-English comment

Another non-English comment that should be translated.
-    print "  return 0"         # если автор не сделал exit 1 — тест ОК
+    print "  return 0"         # if author didn't exit 1, test is OK
hack/e2e-cluster.bats (1)
233-262: Consider making timeouts configurable

The hard-coded timeouts throughout the script (60s, 15m, 2m, etc.) might not be suitable for all environments. Consider making these configurable via environment variables.

Example for the HelmRelease wait:
+ HELM_TIMEOUT=${HELM_TIMEOUT:-15m}
- kubectl get hr -A | awk 'NR>1 {print "kubectl wait --timeout=15m --for=condition=ready -n "$1" hr/"$2" &"} END {print "wait"}' | sh -ex
+ kubectl get hr -A | awk -v timeout="$HELM_TIMEOUT" 'NR>1 {print "kubectl wait --timeout="timeout" --for=condition=ready -n "$1" hr/"$2" &"} END {print "wait"}' | sh -ex

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4c36498 and 74e7e5c.

📒 Files selected for processing (5)

hack/cozytest.sh (1 hunks)
hack/e2e-apps.bats (1 hunks)
hack/e2e-cluster.bats (1 hunks)
hack/e2e.sh (0 hunks)
packages/core/testing/Makefile (2 hunks)

💤 Files with no reviewable changes (1)

hack/e2e.sh

🚧 Files skipped from review as they are similar to previous changes (2)

hack/e2e-apps.bats
packages/core/testing/Makefile

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: Build and Test

🔇 Additional comments (5)

hack/cozytest.sh (3)

1-6: LGTM: Clean header and strict error handling

The script header follows good practices with proper shebang and strict error handling via set -eu.

36-50: Consider security implications of command execution tracing

The live tracing output could potentially expose sensitive information (passwords, tokens) that might be used in test commands. Consider adding a mechanism to sanitize or filter sensitive data from the trace output.

Are there any sensitive credentials or tokens that might be exposed in the test trace output? If so, consider implementing output sanitization.

75-95: Robust AWK parsing for Bats conversion

The AWK script correctly handles the conversion from Bats syntax to shell functions, with proper escaping and sanitization of function names.

hack/e2e-cluster.bats (2)

6-14: LGTM: Clear environment validation

Good practice to validate required environment variables with helpful error messages.

376-379: Good improvement: Enhanced curl error handling

This appears to address the curl error handling issue from the previous review with proper error checking and informative messages.

coderabbitai · 2025-05-25T07:35:48Z

hack/cozytest.sh

+TEST_FILE=${1:?Usage: ./cozytest.sh <file.bats> [pattern]}
+PATTERN=${2:-*}


🛠️ Refactor suggestion

Add validation for TEST_FILE existence

The script should verify that the provided Bats file exists before proceeding.

TEST_FILE=${1:?Usage: ./cozytest.sh <file.bats> [pattern]} +[ -f "$TEST_FILE" ] || { echo "Error: Test file '$TEST_FILE' not found" >&2; exit 1; } PATTERN=${2:-*}

🤖 Prompt for AI Agents

In hack/cozytest.sh around lines 8 to 9, add a check after assigning TEST_FILE to verify that the file actually exists on the filesystem. If the file does not exist, print an error message indicating the missing file and exit the script with a non-zero status to prevent further execution.

coderabbitai · 2025-05-25T07:35:48Z

hack/e2e-cluster.bats

+  iptables -t nat -D POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE 2>/dev/null || true
+  iptables -t nat -A POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE


🛠️ Refactor suggestion

Add validation for iptables rule management

The iptables commands could fail on systems where iptables is not available or when permissions are insufficient.

+ # Check if iptables is available + if ! command -v iptables >/dev/null 2>&1; then + echo "iptables command not found. Please install iptables." >&2 + exit 1 + fi + # Masquerading rule – idempotent (delete first, then add) iptables -t nat -D POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE 2>/dev/null || true - iptables -t nat -A POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE + if ! iptables -t nat -A POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE; then + echo "Failed to add iptables masquerading rule. Check permissions." >&2 + exit 1 + fi

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

iptables -t nat -D POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE 2>/dev/null || true

iptables -t nat -A POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE

# Check if iptables is available

if ! command -v iptables >/dev/null 2>&1; then

echo "iptables command not found. Please install iptables." >&2

exit 1

fi

# Masquerading rule – idempotent (delete first, then add)

iptables -t nat -D POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE 2>/dev/null || true

if ! iptables -t nat -A POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE; then

echo "Failed to add iptables masquerading rule. Check permissions." >&2

exit 1

fi

🤖 Prompt for AI Agents

In hack/e2e-cluster.bats around lines 38 to 39, the iptables commands lack validation and may fail silently if iptables is not installed or if permissions are insufficient. Add checks to verify that iptables is available on the system before running these commands, and handle errors gracefully by checking command exit statuses and providing informative error messages or fallback behavior to avoid script failures.

coderabbitai · 2025-05-25T07:35:48Z

hack/e2e-cluster.bats

+    qemu-system-x86_64 -machine type=pc,accel=kvm -cpu host -smp 8 -m 16384 \
+      -device virtio-net,netdev=net0,mac=52:54:00:12:34:5${i} \
+      -netdev tap,id=net0,ifname=cozy-srv${i},script=no,downscript=no \
+      -drive file=srv${i}/system.img,if=virtio,format=raw \
+      -drive file=srv${i}/seed.img,if=virtio,format=raw \
+      -drive file=srv${i}/data.img,if=virtio,format=raw \
+      -display none -daemonize -pidfile srv${i}/qemu.pid
+  done


🛠️ Refactor suggestion

Add validation for QEMU command success

The QEMU commands should be validated to ensure VMs start successfully.

for i in 1 2 3; do - qemu-system-x86_64 -machine type=pc,accel=kvm -cpu host -smp 8 -m 16384 \ + if ! qemu-system-x86_64 -machine type=pc,accel=kvm -cpu host -smp 8 -m 16384 \ -device virtio-net,netdev=net0,mac=52:54:00:12:34:5${i} \ -netdev tap,id=net0,ifname=cozy-srv${i},script=no,downscript=no \ -drive file=srv${i}/system.img,if=virtio,format=raw \ -drive file=srv${i}/seed.img,if=virtio,format=raw \ -drive file=srv${i}/data.img,if=virtio,format=raw \ - -display none -daemonize -pidfile srv${i}/qemu.pid + -display none -daemonize -pidfile srv${i}/qemu.pid; then + echo "Failed to start QEMU VM srv${i}" >&2 + exit 1 + fi done

🤖 Prompt for AI Agents

In hack/e2e-cluster.bats around lines 101 to 108, the QEMU commands launching VMs lack validation for success. Modify the script to check the exit status of each qemu-system-x86_64 command after execution. If the command fails, output an error message indicating which VM failed to start and exit the script or handle the failure appropriately to prevent silent errors.

coderabbitai · 2025-05-25T07:35:48Z

hack/e2e-cluster.bats

+    wget https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz \
+      -O nocloud-amd64.raw.xz --show-progress --output-file /dev/stdout --progress=dot:giga 2>/dev/null
+    rm -f nocloud-amd64.raw
+    xz --decompress nocloud-amd64.raw.xz
+  fi


🛠️ Refactor suggestion

Add integrity verification for downloaded Talos image

The downloaded image should be verified for integrity to ensure it hasn't been tampered with.

if [ ! -f nocloud-amd64.raw ]; then wget https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz \ -O nocloud-amd64.raw.xz --show-progress --output-file /dev/stdout --progress=dot:giga 2>/dev/null + # Verify checksum if available + if wget -q --spider https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz.sha256; then + wget -q https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz.sha256 + if ! sha256sum -c nocloud-amd64.raw.xz.sha256; then + echo "Checksum verification failed for downloaded image" >&2 + exit 1 + fi + fi rm -f nocloud-amd64.raw xz --decompress nocloud-amd64.raw.xz fi

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

wget https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz \

-O nocloud-amd64.raw.xz --show-progress --output-file /dev/stdout --progress=dot:giga 2>/dev/null

rm -f nocloud-amd64.raw

xz --decompress nocloud-amd64.raw.xz

fi

if [ ! -f nocloud-amd64.raw ]; then

wget https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz \

-O nocloud-amd64.raw.xz --show-progress --output-file /dev/stdout --progress=dot:giga 2>/dev/null

# Verify checksum if available

if wget -q --spider https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz.sha256; then

wget -q https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz.sha256

if ! sha256sum -c nocloud-amd64.raw.xz.sha256; then

echo "Checksum verification failed for downloaded image" >&2

exit 1

fi

fi

rm -f nocloud-amd64.raw

xz --decompress nocloud-amd64.raw.xz

fi

🤖 Prompt for AI Agents

In hack/e2e-cluster.bats around lines 75 to 79, the downloaded Talos image is not verified for integrity, which risks using a tampered file. After downloading the nocloud-amd64.raw.xz file, add a step to verify its checksum or signature against a trusted source before decompressing. This can be done by downloading a checksum file or signature from the same release and using tools like sha256sum or gpg to validate the image's authenticity and integrity.

kvaps requested review from lllamnyp and klinch0 as code owners May 23, 2025 18:59

kvaps marked this pull request as draft May 23, 2025 18:59

kvaps force-pushed the test-framework branch from 1559aa5 to 0bb55aa Compare May 23, 2025 19:00

kvaps added the do-not-merge label May 23, 2025

kvaps force-pushed the test-framework branch from 0bb55aa to 8514799 Compare May 24, 2025 13:35

kvaps mentioned this pull request May 24, 2025

Introduce bats testing framework #981

Closed

kvaps force-pushed the test-framework branch from 8514799 to 1f498ad Compare May 24, 2025 13:45

kvaps removed the do-not-merge label May 24, 2025

kvaps marked this pull request as ready for review May 24, 2025 13:46

kvaps force-pushed the test-framework branch from 1f498ad to d02f6bf Compare May 24, 2025 13:47

coderabbitai bot reviewed May 24, 2025

View reviewed changes

hack/run_test.sh Outdated Show resolved Hide resolved

hack/run_test.sh Outdated Show resolved Hide resolved

hack/e2e.bats Outdated Show resolved Hide resolved

hack/e2e.bats Outdated Show resolved Hide resolved

kvaps changed the title ~~Introduce new test framework~~ [tests] Introduce new test framework May 24, 2025

kvaps mentioned this pull request May 24, 2025

[tests] Add Kubernetes test #986

Closed

kvaps changed the title ~~[tests] Introduce new test framework~~ [tests] Introduce new testing framework May 24, 2025

kvaps force-pushed the test-framework branch 6 times, most recently from 61a4536 to b107d80 Compare May 24, 2025 21:40

coderabbitai bot reviewed May 24, 2025

View reviewed changes

hack/e2e-apps.bats Outdated Show resolved Hide resolved

kvaps force-pushed the test-framework branch from b107d80 to 8ac7143 Compare May 24, 2025 22:02

coderabbitai bot reviewed May 24, 2025

View reviewed changes

kvaps force-pushed the test-framework branch from 8ac7143 to 4c36498 Compare May 24, 2025 23:35

kvaps added 2 commits May 25, 2025 09:29

[tests] Introduce cozytest - a new testing framework

2bf4032

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

[tests] Add tests for tenant Kubernetes cluster

74e7e5c

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

kvaps changed the title ~~[tests] Introduce new testing framework~~ [tests] Introduce cozytest - a new testing framework May 25, 2025

kvaps force-pushed the test-framework branch from 4c36498 to 74e7e5c Compare May 25, 2025 07:31

coderabbitai bot reviewed May 25, 2025

View reviewed changes

klinch0 approved these changes May 27, 2025

View reviewed changes

kvaps merged commit 50932ba into main May 28, 2025
6 of 8 checks passed

kvaps deleted the test-framework branch May 28, 2025 07:19

This was referenced May 29, 2025

[tests] Enhance timeouts awaitng for tenant Kubernetes cluster #1004

Closed

Update Talos Linux v1.10.3 and fix assets #1006

Merged

This was referenced Jun 8, 2025

test #1036

Closed

[tests] VM Disk, VMI, VM, DBs #1048

Merged

coderabbitai bot mentioned this pull request Jun 17, 2025

Split testing job into several #1075

Merged

coderabbitai bot mentioned this pull request Jun 24, 2025

[ci] Refactor Github workflows #1107

Merged

coderabbitai bot mentioned this pull request Aug 21, 2025

Fix: Sanitize v2 schema #1353

Merged

This was referenced Sep 4, 2025

add resource quota for testing ns #1382

Merged

Feat/tests with resource quota #1389

Merged

- kill $(cat srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid 2>/dev/null) 2>/dev/null || true
+ for pidfile in srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid; do
+   if [ -f "$pidfile" ]; then
+     pid=$(cat "$pidfile" 2>/dev/null || echo "")
+     if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
+       kill "$pid" 2>/dev/null || true
+     fi
+   fi
+ done

		TEST_FILE=${1:?Usage: ./cozytest.sh <file.bats> [pattern]}
		PATTERN=${2:-*}

		iptables -t nat -D POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE 2>/dev/null \|\| true
		iptables -t nat -A POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE

-  iptables -t nat -D POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE 2>/dev/null || true
-  iptables -t nat -A POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE
+  # Check if iptables is available
+  if ! command -v iptables >/dev/null 2>&1; then
+    echo "iptables command not found. Please install iptables." >&2
+    exit 1
+  fi
+  # Masquerading rule – idempotent (delete first, then add)
+  iptables -t nat -D POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE 2>/dev/null || true
+  if ! iptables -t nat -A POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE; then
+    echo "Failed to add iptables masquerading rule. Check permissions." >&2
+    exit 1
+  fi

[tests] Introduce cozytest - a new testing framework #982

[tests] Introduce cozytest - a new testing framework #982

Uh oh!

Conversation

kvaps commented May 23, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented May 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Suggested labels

Suggested reviewers

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot May 24, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot May 24, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot May 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot May 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot May 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot May 25, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

kvaps commented May 23, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented May 23, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)