Skip to content

Conversation

tolbrino
Copy link
Contributor

@tolbrino tolbrino commented May 13, 2025

This PR adds dynamic port discovery to the smoke tests such that they can be executed in parallel on the same machine, specifically in the CI.

Dynamic Port Allocation in SDK:

  • Replaced the PORT_BASE constant with a base_port parameter in sdk/python/localcluster/constants.py and propagated this change across the cluster, node, and anvil initialization. [1] [2] [3]
  • Updated the Cluster and Node classes to calculate API and P2P ports dynamically based on the base_port value. [1] [2]
  • Modified the Anvil class to accept a port parameter and use it for all operations, including server startup and shutdown. [1] [2] [3]

Test Infrastructure Updates:

  • Introduced a new base_port pytest fixture to allocate a block of available ports dynamically for tests. Updated the swarm7 fixture to leverage this dynamic port allocation.
  • Added error handling to ensure the cluster and anvil are properly cleaned up in case of failures during test setup.

Miscellaneous Fixes and Cleanups:

  • Removed unused imports and constants, such as PORT_BASE, from various files. [1] [2] [3]
  • Temporarily disabled certain snapshot-related operations that are incompatible with the dynamic port approach, with FIXME comments added for future resolution. [1] [2]

These changes improve the flexibility and scalability of the local cluster setup while enhancing the CI pipeline with additional smoke test coverage.

@tolbrino tolbrino added this to the 3.0.0 milestone May 13, 2025
@tolbrino tolbrino self-assigned this May 13, 2025
Copy link
Contributor

coderabbitai bot commented May 13, 2025

📝 Walkthrough

Walkthrough

The changes refactor port management in the local cluster SDK and tests by replacing the global PORT_BASE constant with a configurable base_port parameter, propagated through constructors and methods of Cluster, Node, and Anvil classes. Related logic in main process orchestration and tests is updated to dynamically select and use available port blocks, and snapshot file handling is temporarily disabled to accommodate the new port assignment approach.

Changes

File(s) Change Summary
sdk/python/localcluster/constants.py Renamed constant PORT_BASE to BASE_PORT.
sdk/python/localcluster/anvil.py Removed unused PORT_BASE import. Updated Anvil constructor to accept port parameter. Changed run and kill methods to use instance port instead of global constant. Converted kill from class method to instance method.
sdk/python/localcluster/cluster.py Updated Cluster constructor to require base_port parameter. Passed base_port to Node instances. Replaced all PORT_BASE usage with self.base_port. Commented out early return in shared_bringup to allow readiness checks after funding.
sdk/python/localcluster/node.py Updated Node constructor and fromConfig class method to accept base_port. Replaced all PORT_BASE usage with instance base_port for port calculations.
sdk/python/localcluster/main_process.py Renamed PORT_BASE to BASE_PORT in imports and function parameters. Updated bringup function to accept base_port parameter (defaulting to BASE_PORT). Passed base_port to Cluster, Anvil, and Snapshot. Commented out initial shared_bringup call and cleanup to support random base port. Added error handling for cluster bring-up. Changed final bring-up to not skip funding.
sdk/python/localcluster/snapshot.py Commented out copying and usability checks for database snapshot files with a FIXME, disabling these steps to avoid issues with random base port assignment.
tests/conftest.py Added find_available_port_block function to dynamically find a free block of ports. Modified swarm7 fixture to use dynamic port selection, improved error handling, and updated cluster bring-up to use discovered base_port. Added socket import.
tests/test_hopli.py Removed global ANVIL_ENDPOINT. Added generate_anvil_endpoint(base_port) function. Updated functions and test methods to accept base_port and generate endpoint dynamically.
tests/test_win_prob.py Removed global ANVIL_ENDPOINT. Added generate_anvil_endpoint(base_port) function. Updated set_minimum_winning_probability_in_network and test methods to accept base_port and generate endpoint dynamically.
tests/test_integration.py Removed unused global constant PORT_BASE.
tests/test_redeeming.py Removed unused global constant PORT_BASE.
tests/test_stress.py Removed unused global constant PORT_BASE.

Sequence Diagram(s)

sequenceDiagram
    participant Test as Test (pytest)
    participant Conftest as conftest.py
    participant Main as bringup()
    participant Cluster as Cluster
    participant Node as Node
    participant Anvil as Anvil

    Test->>Conftest: request swarm7 fixture
    Conftest->>Conftest: find_available_port_block()
    Conftest->>Main: bringup(config, ..., base_port)
    Main->>Anvil: Anvil(..., port=base_port)
    Main->>Cluster: Cluster(..., base_port)
    Cluster->>Node: Node(..., base_port)
    Main->>Cluster: cluster.shared_bringup(skip_funding=False)
    Cluster->>Node: Node.prepare() (uses base_port)
    Main->>Anvil: anvil.kill()
    Main->>Cluster: cluster.clean_up()
Loading
sequenceDiagram
    participant Snapshot as Snapshot
    participant Filesystem as Filesystem

    Snapshot->>Filesystem: Copy config, anvil, and node id/env files
    Note right of Snapshot: Skips copying DB snapshot files (FIXME)
    Snapshot->>Filesystem: (No DB file usability checks)
Loading

Suggested labels

toolchain

Suggested reviewers

  • Teebor-Choka
  • jeandemeusy

Tip

⚡️ Faster reviews with caching
  • CodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure Review - Disable Cache at either the organization or repository level. If you prefer to disable all data retention across your organization, simply turn off the Data Retention setting under your Organization Settings.

Enjoy the performance boost—your workflow just got faster.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the testing Related to tests label May 13, 2025
@tolbrino tolbrino marked this pull request as ready for review May 13, 2025 11:29
@Copilot Copilot AI review requested due to automatic review settings May 13, 2025 11:29
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enables smoke tests to run in parallel by introducing dynamic port allocation and propagating a new base_port parameter throughout the localcluster modules. Key changes include:

  • Adding a function to find an available contiguous block of ports and updating the swarm7 fixture in tests.
  • Modifying node, cluster, anvil, and main process modules to use a dynamic base_port instead of a fixed PORT_BASE.
  • Adjusting snapshot logic to temporarily skip operations that conflict with the random base port approach.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/conftest.py Introduces dynamic port selection and propagates base_port through test setup.
sdk/python/localcluster/snapshot.py Comments out snapshot file copying to avoid port inconsistencies with dynamic ports.
sdk/python/localcluster/node.py Updates node initialization and port assignment logic to use the dynamic base_port.
sdk/python/localcluster/main_process.py Propagates base_port during the cluster bringup process.
sdk/python/localcluster/constants.py Renames fixed PORT_BASE to BASE_PORT.
sdk/python/localcluster/cluster.py Updates node creation and funding logic to pass and use dynamic base_port.
sdk/python/localcluster/anvil.py Adjusts anvil initialization to utilize the dynamic base_port.
Comments suppressed due to low confidence (1)

sdk/python/localcluster/node.py:88

  • Verify that the new port arithmetic (api_port and p2p_port assignment) correctly avoids collisions across nodes during parallel tests. Consider adding tests or comments to document the scheme.
self.api_port = self.base_port + (self.id * 2)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
sdk/python/localcluster/cluster.py (1)

178-180: ⚠️ Potential issue

Fix attribute typo: self.nodeself.nodes

Cluster has no node attribute; iterating over it will raise an AttributeError the first time this helper is called.

-        for node in self.node.values():
+        for node in self.nodes.values():
🧹 Nitpick comments (7)
sdk/python/localcluster/node.py (1)

249-250: Line exceeds recommended length.

The method signature exceeds the recommended line length of 120 characters.

Consider breaking the parameter list across multiple lines:

-    def fromConfig(
-        cls, index: int, alias: str, config: dict, defaults: dict, network: str, use_nat: bool, exposed: bool, base_port: int
-    ):
+    def fromConfig(
+        cls,
+        index: int,
+        alias: str,
+        config: dict,
+        defaults: dict,
+        network: str,
+        use_nat: bool,
+        exposed: bool,
+        base_port: int
+    ):
🧰 Tools
🪛 Ruff (0.8.2)

249-249: Line too long (125 > 120)

(E501)

sdk/python/localcluster/cluster.py (1)

67-70: Remove dead-code comment to avoid confusion

The commented-out # return is a stale artefact from the old flow and could mislead future readers.

-            # return
sdk/python/localcluster/main_process.py (2)

26-28: Break long signature for readability & Ruff compliance

Line exceeds 120 chars (E501). Wrapping keeps the diff small and silences the linter.

-async def bringup(
-    config: str, test_mode: bool = False, fully_connected: bool = False, use_nat: bool = False, exposed: bool = False, base_port: int = BASE_PORT
-) -> Optional[Tuple[Cluster, Anvil]]:
+async def bringup(
+    config: str,
+    test_mode: bool = False,
+    fully_connected: bool = False,
+    use_nat: bool = False,
+    exposed: bool = False,
+    base_port: int = BASE_PORT,
+) -> Optional[Tuple[Cluster, Anvil]]:
🧰 Tools
🪛 Ruff (0.8.2)

27-27: Line too long (145 > 120)

(E501)


38-41: Long lines & trailing arguments

Lines 38-41 also violate E501. Consider the same wrapping style used above for consistency.

🧰 Tools
🪛 Ruff (0.8.2)

38-38: Line too long (124 > 120)

(E501)


39-39: Line too long (126 > 120)

(E501)

tests/conftest.py (3)

88-88: Split overly long bringup call

This line breaches 120 chars (E501) and is hard to diff. Wrap like so:

-        cluster, anvil = await localcluster.bringup(params_path, test_mode=True, fully_connected=False, use_nat=False, base_port=base_port)
+        cluster, anvil = await localcluster.bringup(
+            params_path,
+            test_mode=True,
+            fully_connected=False,
+            use_nat=False,
+            base_port=base_port,
+        )
🧰 Tools
🪛 Ruff (0.8.2)

88-88: Line too long (139 > 120)

(E501)


95-95: Remove superfluous f-string

The string contains no placeholders; keep it a raw literal to silence Ruff F541.

-        pytest.fail(f"Failed to bring up the cluster")
+        pytest.fail("Failed to bring up the cluster")
🧰 Tools
🪛 Ruff (0.8.2)

95-95: f-string without any placeholders

Remove extraneous f prefix

(F541)


129-194: Unused variable & race window in find_available_port_block

temp_sockets is assigned but never used (F841).
If you meant to hold the sockets open to reserve the ports, keep them and call bind(('', port)) / listen(1); otherwise drop the variable entirely.

-        # Store open sockets temporarily to prevent others from taking the ports while we're checking
-        temp_sockets = []

Optionally, actually bind to reserve:

temp_sockets = []
...
s.bind(('', port))
s.listen(1)
temp_sockets.append(s)

Remember to close the sockets before returning the result.

🧰 Tools
🪛 Ruff (0.8.2)

169-169: Local variable temp_sockets is assigned to but never used

Remove assignment to unused variable temp_sockets

(F841)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 05bba78 and 0b5585f.

📒 Files selected for processing (7)
  • sdk/python/localcluster/anvil.py (4 hunks)
  • sdk/python/localcluster/cluster.py (3 hunks)
  • sdk/python/localcluster/constants.py (1 hunks)
  • sdk/python/localcluster/main_process.py (5 hunks)
  • sdk/python/localcluster/node.py (5 hunks)
  • sdk/python/localcluster/snapshot.py (2 hunks)
  • tests/conftest.py (3 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
sdk/python/localcluster/main_process.py

27-27: Line too long (145 > 120)

(E501)


38-38: Line too long (124 > 120)

(E501)


39-39: Line too long (126 > 120)

(E501)

sdk/python/localcluster/node.py

249-249: Line too long (125 > 120)

(E501)

tests/conftest.py

88-88: Line too long (139 > 120)

(E501)


95-95: f-string without any placeholders

Remove extraneous f prefix

(F541)


169-169: Local variable temp_sockets is assigned to but never used

Remove assignment to unused variable temp_sockets

(F841)

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: hoprd / docker
  • GitHub Check: hopli / docker
  • GitHub Check: Docs / Rust docs
  • GitHub Check: zizmor
  • GitHub Check: tests-smoke (websocket_api)
  • GitHub Check: tests-unit
  • GitHub Check: tests-unit-nightly
  • GitHub Check: Linter
🔇 Additional comments (15)
sdk/python/localcluster/constants.py (1)

17-17: Appropriate rename to BASE_PORT.

The rename from PORT_BASE to BASE_PORT aligns with the pattern of using this as a base value for calculating individual component ports, making the code more self-descriptive.

sdk/python/localcluster/snapshot.py (2)

48-51: Good temporary solution with clear FIXME documentation.

Commenting out the database file copying with a clear explanation about how it breaks the random base port approach is appropriate. The FIXME comment properly documents the need for a future solution.

Have you considered using port-agnostic node announcement data or parameterizing the stored announcements with the current base port? This might be worth exploring as a potential solution.


81-85: Consistent temporary solution with proper documentation.

The changes here mirror those above, with appropriate commenting and clear FIXME notes explaining why the database file verification is being skipped.

sdk/python/localcluster/node.py (4)

44-44: Good addition of base_port parameter.

Adding the base_port parameter to the constructor is a key part of making port assignment configurable per instance.


56-56: Proper storage of base_port as instance attribute.

Storing the base_port as an instance attribute ensures it's available throughout the instance's lifecycle.


87-89: Logical port calculation based on instance base_port.

The port calculation logic has been updated appropriately:

  • anvil_port uses the base port directly
  • api_port uses an offset based on node ID, allowing each node to have a unique port
  • p2p_port is simply api_port + 1, maintaining the existing relationship

This approach ensures unique port assignments across multiple nodes in a parallelized test environment.


263-263: Proper base_port parameter passing.

The base_port is correctly passed to the constructor, completing the propagation of this parameter.

sdk/python/localcluster/anvil.py (6)

7-7: Removed unused PORT_BASE import.

Correctly removed the PORT_BASE import as it's no longer needed after switching to instance-based port configuration.


16-16: Added port parameter to constructor.

The port parameter is now explicitly passed to the Anvil constructor, supporting the dynamic port assignment strategy.


20-20: Proper storage of port as instance attribute.

Storing the port as an instance attribute ensures it's available throughout the Anvil instance's lifecycle.


34-34: Updated command to use instance port.

The anvil command properly uses the instance's port attribute instead of the previously hardcoded PORT_BASE constant.


67-67: Changed kill from class method to instance method.

The kill method has been appropriately changed from a class method to an instance method since it now needs to access the instance's port attribute.


69-69: Updated kill command to use instance port.

The kill command now correctly uses the instance's port attribute instead of the global PORT_BASE constant.

sdk/python/localcluster/cluster.py (1)

23-30: Constructor correctly exposes base_port

Making the port base explicit removes hidden dependencies on a global constant and is essential for running multiple clusters in parallel. 👍

tests/conftest.py (1)

80-85: Early port-block discovery is solid

Great idea to randomise and pre-allocate a contiguous 20-port block – this will dramatically cut down on collisions in CI.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
tests/conftest.py (3)

87-98: Well-structured refactoring of the swarm7 fixture

The fixture properly uses the new base_port parameter and has good error handling with try-except. However, there's a minor issue with the f-string on line 98.

-        pytest.fail(f"Failed to bring up the cluster")
+        pytest.fail("Failed to bring up the cluster")
🧰 Tools
🪛 Ruff (0.8.2)

91-91: Line too long (139 > 120)

(E501)


98-98: f-string without any placeholders

Remove extraneous f prefix

(F541)


132-197: Thorough implementation of port availability checking

The find_available_port_block function is well-implemented with good randomization to avoid race conditions. However, there's an unused variable that should be removed.

-        # Store open sockets temporarily to prevent others from taking the ports while we're checking
-        temp_sockets = []
+        # Check each port in the block
🧰 Tools
🪛 Ruff (0.8.2)

172-172: Local variable temp_sockets is assigned to but never used

Remove assignment to unused variable temp_sockets

(F841)


91-91: Consider breaking this line for better readability

This line exceeds the recommended line length limit of 120 characters.

-        cluster, anvil = await localcluster.bringup(params_path, test_mode=True, fully_connected=False, use_nat=False, base_port=base_port)
+        cluster, anvil = await localcluster.bringup(
+            params_path, 
+            test_mode=True, 
+            fully_connected=False, 
+            use_nat=False, 
+            base_port=base_port
+        )
🧰 Tools
🪛 Ruff (0.8.2)

91-91: Line too long (139 > 120)

(E501)

tests/test_win_prob.py (1)

337-337: Consider line length improvements for better readability

Several lines in this file exceed the recommended line length limit of 120 characters.

Consider breaking long lines like this one to improve code readability.

🧰 Tools
🪛 Ruff (0.8.2)

337-337: Line too long (134 > 120)

(E501)

tests/test_hopli.py (1)

406-406: Consider line length improvements

Several lines in this file exceed the recommended line length limit of 120 characters.

Consider breaking long lines like these to improve code readability.

Also applies to: 418-418, 433-433, 441-441

🧰 Tools
🪛 Ruff (0.8.2)

406-406: Line too long (124 > 120)

(E501)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0b5585f and aa7b05b.

📒 Files selected for processing (6)
  • tests/conftest.py (3 hunks)
  • tests/test_hopli.py (18 hunks)
  • tests/test_integration.py (0 hunks)
  • tests/test_redeeming.py (0 hunks)
  • tests/test_stress.py (0 hunks)
  • tests/test_win_prob.py (19 hunks)
💤 Files with no reviewable changes (3)
  • tests/test_redeeming.py
  • tests/test_integration.py
  • tests/test_stress.py
🧰 Additional context used
🧠 Learnings (1)
tests/test_win_prob.py (1)
Learnt from: NumberFour8
PR: hoprnet/hoprnet#6511
File: tests/test_integration.py:1182-1198
Timestamp: 2024-10-01T04:47:11.992Z
Learning: In the test `test_hoprd_should_not_accept_tickets_with_lower_than_min_win_prob`, the relay's `rejected_tickets_value` and `unredeemed_value` are already being verified to ensure tickets are rejected and the unredeemed value does not increase.
🪛 Ruff (0.8.2)
tests/conftest.py

91-91: Line too long (139 > 120)

(E501)


98-98: f-string without any placeholders

Remove extraneous f prefix

(F541)


172-172: Local variable temp_sockets is assigned to but never used

Remove assignment to unused variable temp_sockets

(F841)

tests/test_hopli.py

398-398: Line too long (134 > 120)

(E501)


406-406: Line too long (124 > 120)

(E501)


418-418: Line too long (124 > 120)

(E501)


433-433: Line too long (124 > 120)

(E501)


441-441: Line too long (132 > 120)

(E501)

tests/test_win_prob.py

51-51: Line too long (124 > 120)

(E501)


279-279: Line too long (130 > 120)

(E501)


337-337: Line too long (134 > 120)

(E501)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: hoprd / docker
  • GitHub Check: hopli / docker
  • GitHub Check: Docs / Rust docs
  • GitHub Check: tests-smoke (websocket_api)
  • GitHub Check: tests-unit
  • GitHub Check: tests-unit-nightly
  • GitHub Check: Linter
  • GitHub Check: zizmor
  • GitHub Check: Cargo Audit
🔇 Additional comments (14)
tests/conftest.py (1)

80-85: Great implementation of the dynamic port allocation!

The base_port fixture correctly finds an available port block and yields it for tests. This is a key component for enabling parallel test execution.

tests/test_win_prob.py (7)

23-24: Good approach for generating dynamic endpoint URLs

Creating a function to generate the Anvil endpoint URL from the base port is a clean approach that promotes reusability.


27-44: Properly refactored to use dynamic port allocation

The function has been correctly updated to accept and use the base_port parameter, which is essential for parallel test execution.


51-66: Test successfully updated to use dynamic ports

The test method now correctly accepts and uses the base_port parameter, and propagates it to the appropriate functions.

🧰 Tools
🪛 Ruff (0.8.2)

51-51: Line too long (124 > 120)

(E501)


81-128: Consistent implementation across tests

The test method has been updated to use the base_port parameter in all necessary places, maintaining the existing test logic.


142-195: Test method properly parameterized for dynamic port allocation

The test logic remains unchanged while successfully incorporating the dynamic port allocation mechanism.


208-265: Consistent port parameterization implementation

The test method correctly accepts and uses the base_port parameter to dynamically generate endpoint URLs.


279-324: Parallel test execution enabled through proper port parameterization

This test has been successfully updated to support the PR's goal of enabling parallel test execution.

🧰 Tools
🪛 Ruff (0.8.2)

279-279: Line too long (130 > 120)

(E501)


308-308: Line too long (122 > 120)

(E501)

tests/test_hopli.py (6)

28-29: Good implementation of the endpoint generation function

The function correctly generates the Anvil endpoint URL from the base port parameter.


36-46: Function properly refactored to use dynamic ports

The run_cast_cmd function now correctly accepts and uses the base_port parameter for endpoint generation.


49-76: Consistent implementation of port parameterization

The faucet function has been properly updated to support dynamic port allocation.


79-104: All helper functions consistently updated for dynamic port support

The management functions have been properly updated to accept and use the base_port parameter, which is necessary for parallel test execution.

Also applies to: 106-133, 135-161


234-280: All utility functions updated to support dynamic ports

The safe module, migration, and win probability functions have been consistently updated to use dynamic port allocation.

Also applies to: 282-316, 318-341, 344-363


371-394: Tests properly updated to use dynamic port allocation

All test methods in the TestHopliWithSwarm class now correctly accept and use the base_port parameter, enabling parallel test execution.

Also applies to: 398-438, 441-446, 448-480, 482-508

@tolbrino tolbrino force-pushed the tb/202505-smoke-tests-parallel branch from aa7b05b to fee609a Compare May 13, 2025 13:23
@github-actions github-actions bot added the toolchain Developer and product happiness label May 13, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
sdk/python/localcluster/main_process.py (1)

100-108: ⚠️ Potential issue

Symmetrical clean-up on TimeoutError missing.

On a timeout you return cluster, anvil but do not tear them down as you do in the RuntimeError branch. This leaves orphaned processes and open ports, defeating the whole "parallel test" goal.

 except asyncio.TimeoutError as e:
     logging.error(f"Timeout error: {e}")
-    return cluster, anvil
+    cluster.clean_up()
+    anvil.kill()
+    raise
🧹 Nitpick comments (3)
sdk/python/localcluster/main_process.py (1)

74-80: Consider documenting alternative approaches for the FIXME.

While commenting out code is appropriate during refactoring, the FIXME lacks a clear timeline or potential solutions.

-# FIXME: This breaks the random base port approach since nodes will make announcements using different port.
-# Skipping for now until a better approach is found.
+# FIXME(port-refactor): The random base port approach breaks node announcements which use hardcoded ports.
+# Potential solutions:
+# 1. Pass port configuration through node environment
+# 2. Store port information in a shared configuration
+# Skipping for now until one of these approaches is implemented.
tests/conftest.py (2)

89-101: Improved error handling in the swarm7 fixture.

The refactored swarm7 fixture properly propagates the base_port and adds explicit error handling for cluster startup failures.

There's an unnecessary f-string without placeholders:

-        pytest.fail(f"Failed to bring up the cluster")
+        pytest.fail("Failed to bring up the cluster")
🧰 Tools
🪛 Ruff (0.8.2)

101-101: f-string without any placeholders

Remove extraneous f prefix

(F541)


137-202: Well-designed port allocation algorithm.

The find_available_port_block function is well-documented and implements a good algorithm for finding available port blocks, with proper randomization to avoid conflicts.

However, there's an unused variable:

-        # Store open sockets temporarily to prevent others from taking the ports while we're checking
-        temp_sockets = []
+        # Note: We're not keeping sockets open between checks, which could 
+        # potentially allow port races, but simplifies the code
🧰 Tools
🪛 Ruff (0.8.2)

177-177: Local variable temp_sockets is assigned to but never used

Remove assignment to unused variable temp_sockets

(F841)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa7b05b and fee609a.

📒 Files selected for processing (13)
  • .github/workflows/tests.yaml (1 hunks)
  • sdk/python/localcluster/anvil.py (4 hunks)
  • sdk/python/localcluster/cluster.py (3 hunks)
  • sdk/python/localcluster/constants.py (1 hunks)
  • sdk/python/localcluster/main_process.py (5 hunks)
  • sdk/python/localcluster/node.py (5 hunks)
  • sdk/python/localcluster/snapshot.py (2 hunks)
  • tests/conftest.py (3 hunks)
  • tests/test_hopli.py (18 hunks)
  • tests/test_integration.py (0 hunks)
  • tests/test_redeeming.py (0 hunks)
  • tests/test_stress.py (0 hunks)
  • tests/test_win_prob.py (19 hunks)
💤 Files with no reviewable changes (3)
  • tests/test_redeeming.py
  • tests/test_stress.py
  • tests/test_integration.py
🚧 Files skipped from review as they are similar to previous changes (7)
  • sdk/python/localcluster/constants.py
  • sdk/python/localcluster/anvil.py
  • sdk/python/localcluster/snapshot.py
  • sdk/python/localcluster/node.py
  • sdk/python/localcluster/cluster.py
  • tests/test_win_prob.py
  • tests/test_hopli.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
sdk/python/localcluster/main_process.py (5)
tests/conftest.py (1)
  • base_port (80-85)
sdk/python/localcluster/cluster.py (3)
  • Cluster (22-203)
  • shared_bringup (49-104)
  • clean_up (40-42)
sdk/python/localcluster/anvil.py (2)
  • Anvil (15-69)
  • kill (67-69)
sdk/python/localcluster/snapshot.py (1)
  • Snapshot (17-96)
sdk/python/localcluster/node.py (1)
  • clean_up (244-245)
🪛 actionlint (1.7.4)
.github/workflows/tests.yaml

152-152: label "self-hosted-hoprnet-bigger" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2022", "windows-2019", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-22.04", "ubuntu-20.04", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "macos-12-xl", "macos-12-xlarge", "macos-12-large", "macos-12", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)


234-234: label "self-hosted-hoprnet-bigger" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2022", "windows-2019", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-22.04", "ubuntu-20.04", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "macos-12-xl", "macos-12-xlarge", "macos-12-large", "macos-12", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)

🪛 Ruff (0.8.2)
tests/conftest.py

101-101: f-string without any placeholders

Remove extraneous f prefix

(F541)


177-177: Local variable temp_sockets is assigned to but never used

Remove assignment to unused variable temp_sockets

(F841)

⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: hoprd / docker
  • GitHub Check: Docs / Rust docs
  • GitHub Check: tests-unit-nightly
  • GitHub Check: tests-smoke-hopli
  • GitHub Check: tests-unit
  • GitHub Check: Linter
🔇 Additional comments (7)
.github/workflows/tests.yaml (3)

151-232: Excellent separation of the hopli test into its own job.

The new tests-smoke-hopli job correctly isolates the hopli test suite with appropriate timeout settings (10 minutes instead of the previous 60), enabling parallel test execution with proper resource management. This change aligns perfectly with the PR objective of allowing smoke tests to run in parallel.

🧰 Tools
🪛 actionlint (1.7.4)

152-152: label "self-hosted-hoprnet-bigger" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2022", "windows-2019", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-22.04", "ubuntu-20.04", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "macos-12-xl", "macos-12-xlarge", "macos-12-large", "macos-12", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)


233-237: Good job adding dependency on the hopli job.

Making the main smoke tests job depend on the new hopli job ensures proper sequencing while still allowing potential parallelization with other CI jobs.

🧰 Tools
🪛 actionlint (1.7.4)

234-234: label "self-hosted-hoprnet-bigger" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2022", "windows-2019", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-22.04", "ubuntu-20.04", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "macos-12-xl", "macos-12-xlarge", "macos-12-large", "macos-12", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)


237-237: Improved timeout management.

Reducing the timeout from 60 to 30 minutes is appropriate now that the potentially time-consuming hopli tests have been moved to their own job.

sdk/python/localcluster/main_process.py (3)

17-17: Good constant renaming for clarity.

Changing from PORT_BASE to BASE_PORT in the import makes the naming convention more consistent.


32-32: Good addition of dynamic port parameter.

Adding the base_port parameter with a default value of BASE_PORT preserves backward compatibility while enabling dynamic port allocation.


43-50: Proper propagation of base_port parameter.

The changes correctly propagate the base_port parameter to all relevant components: Cluster, Anvil, and Snapshot.

tests/conftest.py (1)

80-85: Good implementation of dynamic port allocation.

The new base_port fixture correctly implements dynamic port allocation for test parallelization, with appropriate error handling when no port block is available.

@tolbrino tolbrino requested a review from a team May 13, 2025 13:56
@tolbrino tolbrino enabled auto-merge May 13, 2025 13:56
Copy link
Contributor

@Teebor-Choka Teebor-Choka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I can still see some failures.

@tolbrino tolbrino disabled auto-merge May 13, 2025 18:30
@tolbrino tolbrino merged commit f728f1e into master May 13, 2025
33 of 41 checks passed
@tolbrino tolbrino deleted the tb/202505-smoke-tests-parallel branch May 13, 2025 18:30
@coderabbitai coderabbitai bot mentioned this pull request May 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Related to tests toolchain Developer and product happiness
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants