Skip to content

Add a instrumentation build for the tokio runtime #6513

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Nov 4, 2024

Conversation

Teebor-Choka
Copy link
Contributor

@Teebor-Choka Teebor-Choka commented Sep 26, 2024

Introduce a new profiling build that allows tokio runtime profiling:

  • Minor fix to port config for local_interconnected_cluster
  • Add documentation on how to profile the tokio executor

Notes

Relates to #5917 #6473

Copy link
Contributor

coderabbitai bot commented Sep 26, 2024

Warning

Rate limit exceeded

@Teebor-Choka has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 15 minutes and 6 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between e23721b and c98e01f.

📝 Walkthrough
📝 Walkthrough
📝 Walkthrough

Walkthrough

The pull request introduces several changes across multiple files, primarily focusing on dependency management in Cargo.toml, enhancements to channel operations in db/sql/src/channels.rs, and modifications to the logger initialization in hoprd/hoprd/src/main.rs. Notable updates include the addition of the console-subscriber dependency and modifications to the tokio dependency's features. The setup-local-cluster.sh script has been updated for improved configuration and error handling, while the test setup in tests/node.py has been adjusted to include a new environment variable.

Changes

File Change Summary
Cargo.toml Added dependency console-subscriber = "0.4.0"; updated tokio to include feature "tracing".
db/sql/src/channels.rs Refined methods in HoprDbChannelOperations; added caching logic and improved cache invalidation.
hoprd/hoprd/Cargo.toml Added feature prof and new optional dependency console-subscriber.
hoprd/hoprd/src/main.rs Modified init_logger to handle profiling features conditionally.
scripts/setup-local-cluster.sh Swapped node_api_base_port and node_p2p_base_port; added TOKIO_CONSOLE_BIND variable.
tests/node.py Updated setup method to include TOKIO_CONSOLE_BIND in the environment configuration.

Assessment against linked issues

Objective Addressed Explanation
Use a profiler to inspect the memory usage (#6473)

Possibly related PRs

Suggested labels

crate:core-network, crate:hoprd-api, crate:hopr-lib

Suggested reviewers

  • NumberFour8
  • tolbrino

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the toolchain Developer and product happiness label Sep 26, 2024
@Teebor-Choka Teebor-Choka self-assigned this Sep 27, 2024
@Teebor-Choka Teebor-Choka force-pushed the kauki/bug/oom-issues-on-channel-endpoint branch 2 times, most recently from 7003011 to 4c508f8 Compare October 4, 2024 13:01
@github-actions github-actions bot added dependencies Pull requests that update a dependency file testing Related to tests crate:hoprd labels Oct 4, 2024
@Teebor-Choka Teebor-Choka force-pushed the kauki/bug/oom-issues-on-channel-endpoint branch 2 times, most recently from c4adb11 to 5ead402 Compare October 30, 2024 16:43
@Teebor-Choka Teebor-Choka marked this pull request as ready for review October 30, 2024 16:43
@Teebor-Choka
Copy link
Contributor Author

The OOM problem does not seem to be present in the latest branch, pending @jeandemeusy tests.

@Teebor-Choka Teebor-Choka requested a review from a team October 30, 2024 16:44
@Teebor-Choka Teebor-Choka added this to the 2.2.0-rc.1 milestone Oct 30, 2024
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Outside diff range comments (2)
db/sql/src/channels.rs (1)

Line range hint 1-301: Consider adding memory profiling for cache analysis.

Given the OOM issues with long-running nodes (issue #6473), I recommend:

  1. Add metrics for cache size and hit/miss rates
  2. Implement memory profiling to track cache memory usage over time
  3. Consider using tokio-console (already added as a dependency) to monitor async resource usage

Example metrics to consider:

  • Cache size (number of entries)
  • Cache memory usage
  • Cache hit/miss rates
  • Cache eviction rates
  • Average time entries spend in cache

Would you like me to provide an example implementation for these metrics?

hoprd/hoprd/src/main.rs (1)

Line range hint 61-88: Consider additional memory optimization strategies.

While adding profiling capabilities is a good first step for diagnosing the OOM issues, consider implementing these additional strategies once the profiling data is analyzed:

  1. Implement proper resource cleanup in critical paths
  2. Add memory limits and monitoring for early detection of leaks
  3. Consider connection pooling and caching strategies for channel queries
  4. Set up memory usage alerts before OOM occurs

This will help create a more robust solution beyond just diagnostics.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between 6fb598e and 5ead402.

⛔ Files ignored due to path filters (154)
  • Cargo.lock is excluded by !**/*.lock
  • vendor/cargo/console-api-0.8.0/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/CHANGELOG.md is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/LICENSE is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/README.md is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/async_ops.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/common.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/google/protobuf/duration.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/google/protobuf/timestamp.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/instrument.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/resources.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/tasks.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/trace.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/async_ops.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/common.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/google.protobuf.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.async_ops.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.common.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.instrument.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.resources.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.tasks.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.trace.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/instrument.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/resources.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/tasks.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/trace.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/tests/bootstrap.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/CHANGELOG.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/Cargo.lock is excluded by !**/*.lock, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/LICENSE is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/README.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/app.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/barrier.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/dump.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/README.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/README.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/buf.gen.yaml is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/index.html is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/package-lock.json is excluded by !**/package-lock.json, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/package.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/App.css is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/App.tsx is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/async_ops_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/common_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/google/protobuf/duration_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/google/protobuf/timestamp_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/instrument_connect.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/instrument_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/resources_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/tasks_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/trace_connect.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/trace_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/index.css is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/main.tsx is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/vite-env.d.ts is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/tsconfig.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/tsconfig.node.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/vite.config.ts is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/main.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/local.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/long_scheduled.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/long_sleep.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/mutex.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/rwlock.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/semaphore.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/uds.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/aggregator/id_data.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/aggregator/mod.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/aggregator/shrink.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/attribute.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/builder.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/callsites.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/record.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/stack.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/stats.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/sync.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/visitors.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/framework.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/poll.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/spawn.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/mod.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/state.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/subscriber.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/task.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/wake.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/CHANGELOG.md is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/Cargo.lock is excluded by !**/*.lock, !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/LICENSE-APACHE is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/LICENSE-MIT is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/README.md is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/clone.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/interval_log.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/quantiles.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/record.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/serialization.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/clippy.toml is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/examples/cli.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/core/counter.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/core/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/errors/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/all.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/linear.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/log.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/quantile.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/recorded.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/benchmarks.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/deserializer.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/interval_log/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/interval_log/tests.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/rand_varint.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/tests.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/v2_deflate_serializer.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/v2_serializer.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/sync/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/helpers.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/index_calculation.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/init.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/subtract.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/tests.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/value_calculation.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/auto_resize.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/seq-nums.hist is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/seq-nums.histz is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/seq-nums.txt is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/tagged-Log.logV2.hlog is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data_access.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/histogram.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/interval_log.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/iterators.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/quantile.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/serialization.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/sync.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/LICENSE is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/README.md is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/any.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/compiler.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/datetime.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/duration.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/protobuf.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/timestamp.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/type_url.rs is excluded by !vendor/**
📒 Files selected for processing (6)
  • Cargo.toml (2 hunks)
  • db/sql/src/channels.rs (2 hunks)
  • hoprd/hoprd/Cargo.toml (2 hunks)
  • hoprd/hoprd/src/main.rs (2 hunks)
  • scripts/setup-local-cluster.sh (2 hunks)
  • tests/node.py (1 hunks)
🔇 Additional comments (8)
hoprd/hoprd/Cargo.toml (2)

79-80: 🧹 Nitpick (assertive)

Ensure profiling tools don't affect production builds.

The console-subscriber is correctly marked as optional, but we should verify that it won't be included in production builds.

Let's verify the production build configuration:

Consider adding a release profile in Cargo.toml to explicitly exclude development features:

[profile.release]
debug = false
debug-assertions = false
✅ Verification successful

Profiling tools are correctly configured for production builds

The verification shows that:

  • console-subscriber is correctly marked as optional and is only included when the prof feature is explicitly enabled
  • The prof feature is not part of the default features list
  • The release profile in Cargo.toml has appropriate production settings including panic = "abort", strip = true, and optimizations

This means profiling tools won't be included in production builds unless explicitly requested via the prof feature flag.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if profiling tools are excluded from release builds
# Look for any release profile configurations
rg -A 10 '\[profile\.release\]'

# Check if prof feature is enabled in any release scripts
rg --glob '*.{sh,yaml,toml}' 'cargo.*--features.*prof.*--release'

Length of output: 16628


Script:

#!/bin/bash
# Check if console-subscriber is included in any default features or feature groups
rg -A 5 'console-subscriber' hoprd/hoprd/Cargo.toml

# Check if there are any feature flags that might enable console-subscriber
rg -A 10 '\[features\]' hoprd/hoprd/Cargo.toml

# Check if there are any workspace-level feature configurations
rg -A 10 '\[workspace\.dependencies\]' Cargo.toml

Length of output: 1206


48-48: LGTM! Good addition for debugging the OOM issue.

Adding the profiling feature is a good approach to investigate the memory leak. The console-subscriber will help track tokio runtime behavior and identify potential memory issues.

Let's verify the tokio version and features:

✅ Verification successful

LGTM! The profiling feature is properly configured for debugging OOM issues.

The verification confirms:

  • Tokio 1.40.0 is used with required features for profiling
  • console-subscriber 0.4.0 is properly configured as an optional dependency
  • The 'prof' feature is correctly set up with the necessary tokio_unstable requirement documented

The setup will help diagnose memory leaks by enabling runtime profiling when built with the 'prof' feature and RUSTFLAGS="--cfg tokio_unstable".

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify tokio configuration in workspace
# Check if tokio is properly configured for profiling
rg -A 5 'tokio.*=.*workspace.*=.*true' 

Length of output: 16628


Script:

#!/bin/bash
# Check tokio dependency and features in workspace
rg -A 5 'tokio.*=.*\{.*version' 

# Check if tokio_unstable is set in any config or environment files
rg -g '*.{toml,sh,yaml,json}' 'tokio.*unstable'

# Check console-subscriber dependency
rg -A 3 'console-subscriber.*=' 

Length of output: 26843

tests/node.py (1)

144-144: Verify port availability for tokio-console.

The current port calculation (p2p_port + 100) might lead to conflicts in large test setups.

✅ Verification successful

Let me analyze the results and generate a final response.

Port calculation for tokio-console appears safe.

The search results show that:

  1. The p2p_port+100 calculation is only used once in tests/node.py for tokio-console binding
  2. No other test components use this port range or similar port calculations
  3. The only other test file using ports (tests/test_websocket_api.py) doesn't overlap with this range
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential port conflicts in the test setup

# Test: Search for other instances of port calculation that might conflict
rg -l 'p2p_port.*\+.*100' .

# Test: Check if the port range is used elsewhere
rg -l '\b[0-9]+100\b' .

# Test: Look for other port assignments that might overlap
ast-grep --pattern 'port = $_ + $_'

Length of output: 18926

Cargo.toml (1)

165-165: 🧹 Nitpick (assertive)

Verify integration of tracing features.

The addition of the "tracing" feature to tokio is necessary for the console-subscriber to work. Let's verify the proper integration in the main application.

Remember to:

  1. Initialize the console subscriber before any other tracing subscribers
  2. Ensure it's only enabled in development/debugging builds
  3. Consider adding configuration options to enable/disable the console at runtime
✅ Verification successful

Let me analyze the verification results. The search results show:

  1. The tracing feature is properly integrated in the codebase:
  • In hoprd/src/main.rs, hopr-socks-server/src/main.rs and hopli/src/main.rs, there are proper tracing subscriber configurations with format layers that include thread IDs, target and level information.
  1. The console subscriber is properly initialized:
  • The codebase follows the recommended pattern of setting up a registry with telemetry and format layers.
  • The tracing subscriber is set as the global default with proper error handling.
  1. The integration matches the recommended patterns from the console-subscriber crate documentation.

Tracing features are properly integrated and configured.

The addition of the "tracing" feature to tokio is correctly utilized in the codebase with proper initialization of the console subscriber and tracing layers. The implementation follows recommended patterns and includes necessary configuration for logging and telemetry.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the proper integration of console-subscriber and tracing

# Check if the console subscriber is properly initialized in the main application
rg -A 5 "console_subscriber::init" 

# Check for proper tracing initialization
rg -A 5 "tracing::subscriber::set_global_default"

# Look for any existing tracing setup that might conflict
rg -A 5 "tracing_subscriber::fmt"

Length of output: 172532

scripts/setup-local-cluster.sh (1)

92-93: 🧹 Nitpick (assertive)

Update usage function to reflect new port values.

The port values have been swapped, but the usage function hasn't been updated to reflect these new default values. This could lead to confusion for users referring to the help documentation.

Let's verify the port configuration is consistent throughout the codebase:

db/sql/src/channels.rs (1)

Line range hint 292-301: Verify cache implementation to prevent memory leaks.

Given the OOM issues mentioned in issue #6473, it's crucial to verify the cache implementation details. Please ensure that:

  1. The cache has a maximum size limit
  2. Cache entries have an expiration policy
  3. Cache eviction strategy is properly configured

Let's check the cache implementation:

hoprd/hoprd/src/main.rs (2)

70-71: LGTM! Clean separation of concerns.

The non-profiling configuration maintains efficiency while allowing for easy profiling activation when needed.

Also applies to: 88-88


61-68: LGTM! Consider performance implications in production.

The addition of tokio runtime tracing with console-subscriber is a good approach for diagnosing the OOM issues. This aligns well with the issue's suggestion to use profiling for analyzing memory usage.

Let's verify the feature flag configuration:

✅ Verification successful

Feature flag and dependency configuration verified correctly

The profiling setup is properly configured:

  • The prof feature is correctly defined in hoprd/hoprd/Cargo.toml with console-subscriber as an optional dependency
  • The feature flag usage in main.rs matches the configuration
  • The comment in Cargo.toml correctly notes the tokio_unstable requirement
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the prof feature is properly configured
# Test: Search for prof feature definition in Cargo.toml files
rg -l 'prof.*=.*\[.*"console-subscriber"' -g 'Cargo.toml'

Length of output: 57


Script:

#!/bin/bash
# Let's try a broader search for the prof feature and console-subscriber
echo "=== Searching for prof feature in Cargo.toml files ==="
rg "prof.*=.*\[" -g 'Cargo.toml'

echo -e "\n=== Searching for console-subscriber dependency ==="
rg "console-subscriber" -g 'Cargo.toml'

echo -e "\n=== Checking feature flags in main.rs ==="
rg "#\[cfg\(feature.*prof" hoprd/hoprd/src/main.rs -C 2

Length of output: 1332

@Teebor-Choka Teebor-Choka force-pushed the kauki/bug/oom-issues-on-channel-endpoint branch from ef699f7 to 9a16c69 Compare October 31, 2024 12:04
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Outside diff range comments (5)
tests/node.py (2)

Line range hint 127-143: Enhance logging configuration for memory debugging.

Consider adding memory-related components to RUST_LOG for better visibility into memory issues:

             "RUST_LOG": ",".join(
                 [
                     log_level,
+                    "hoprd_memory=trace",      # Memory management logs
+                    "hoprd_allocator=debug",   # Allocation logs
                     "libp2p_swarm=info",
                     "libp2p_mplex=info",
                     "multistream_select=info",
                     "isahc=error",
                     "sea_orm=warn",
                     "sqlx=warn",
                     "hyper_util=warn",
                     "libp2p_tcp=info",
                     "libp2p_dns=info",
                     "hickory_resolver=warn",
+                    "tokio_runtime=debug",     # Runtime memory logs
                 ]
             ),

Line range hint 19-41: Add memory monitoring capabilities to Node class.

To better track and debug memory issues, consider adding memory monitoring capabilities:

     def __init__(
         self,
         id: int,
         api_token: str,
         host_addr: str,
         network: str,
         cfg_file: str,
     ):
         # initialized
         self.id = id
         self.host_addr: str = host_addr
         self.api_token: str = api_token
         self.network: str = network
 
         # optional
         self.cfg_file: str = cfg_file
 
         # generated
         self.safe_address: str = None
         self.module_address: str = None
         self.proc: Popen = None
+        # memory monitoring
+        self.memory_stats: list = []
+        self.memory_threshold_mb: int = int(os.getenv("HOPRD_MEMORY_THRESHOLD", "1800"))
+        self.memory_monitor_interval: int = 60  # seconds

Add methods for memory monitoring:

def start_memory_monitoring(self):
    """Start monitoring memory usage."""
    def monitor():
        import psutil
        while self.proc and self.proc.poll() is None:
            try:
                process = psutil.Process(self.proc.pid)
                memory_mb = process.memory_info().rss / 1024 / 1024
                self.memory_stats.append((time.time(), memory_mb))
                if memory_mb > self.memory_threshold_mb:
                    logging.warning(f"Node {self.id} memory usage ({memory_mb:.2f}MB) exceeded threshold")
                time.sleep(self.memory_monitor_interval)
            except psutil.NoSuchProcess:
                break
    
    import threading
    threading.Thread(target=monitor, daemon=True).start()

This enhancement will:

  • Track memory usage over time
  • Alert when memory exceeds thresholds
  • Help identify memory growth patterns
  • Provide data for debugging OOM issues
db/sql/src/channels.rs (1)

Line range hint 303-401: Add tests for caching behavior and memory usage.

While the existing tests cover basic operations well, consider adding tests for:

  1. Cache hit/miss scenarios
  2. Cache invalidation correctness
  3. Memory usage patterns under load
  4. Concurrent access patterns

This would help verify that the OOM fixes are working as intended.

Would you like me to help generate these additional test cases?

hoprd/hoprd/src/main.rs (2)

Line range hint 236-237: Consider implementing channel monitoring and backpressure.

The websocket event channel's fixed capacity and overflow behavior might not be sufficient to prevent memory issues during high load.

Consider:

  1. Adding metrics to monitor channel capacity and overflow frequency
  2. Implementing backpressure mechanisms when approaching capacity
  3. Adding configurable cleanup policies for old messages

Example monitoring implementation:

let (mut ws_events_tx, ws_events_rx) = {
    let (tx, rx) = async_broadcast::broadcast::<ApplicationData>(WEBSOCKET_EVENT_BROADCAST_CAPACITY);
    
    // Monitor channel capacity
    let tx_clone = tx.clone();
    spawn(async move {
        let mut interval = tokio::time::interval(Duration::from_secs(60));
        loop {
            interval.tick().await;
            let capacity = tx_clone.capacity();
            let receiver_count = tx_clone.receiver_count();
            
            info!(
                capacity_used_percent = format!("{:.2}", (WEBSOCKET_EVENT_BROADCAST_CAPACITY - capacity) as f64 / WEBSOCKET_EVENT_BROADCAST_CAPACITY as f64 * 100.0),
                active_receivers = receiver_count,
                "Websocket event channel status"
            );
            
            #[cfg(feature = "prometheus")]
            {
                METRIC_WS_CHANNEL_CAPACITY.set((WEBSOCKET_EVENT_BROADCAST_CAPACITY - capacity) as i64);
                METRIC_WS_RECEIVER_COUNT.set(receiver_count as i64);
            }
        }
    });
    
    (tx, rx)
};

Line range hint 292-341: Implement message age tracking and cleanup.

The message processing loop could potentially contribute to memory growth over time without proper cleanup mechanisms.

Consider implementing message age tracking and periodic cleanup:

 match hopr_lib::rlp::decode(&data.plain_text) {
     Ok((msg, sent)) => {
         let latency = recv_at.as_unix_timestamp().saturating_sub(sent);
+        let message_age = SystemTime::now()
+            .duration_since(SystemTime::UNIX_EPOCH)
+            .unwrap()
+            .as_secs();
 
         trace!(
             tag = ?data.application_tag,
             latency_in_ms = latency.as_millis(),
+            message_age_secs = message_age,
             receiged_at = DateTime::<Utc>::from(recv_at).to_rfc3339(),
             "received message"
         );
 
         #[cfg(all(feature = "prometheus", not(test)))]
-        METRIC_MESSAGE_LATENCY.observe(latency.as_secs_f64());
+        {
+            METRIC_MESSAGE_LATENCY.observe(latency.as_secs_f64());
+            METRIC_MESSAGE_AGE.observe(message_age as f64);
+        }
 
         if cfg.api.enable && ws_events_tx.receiver_count() > 0 {
+            // Implement backpressure when channel is near capacity
+            if ws_events_tx.capacity() < WEBSOCKET_EVENT_BROADCAST_CAPACITY / 10 {
+                warn!(
+                    "Websocket event channel near capacity, implementing backpressure"
+                );
+                tokio::time::sleep(Duration::from_millis(100)).await;
+            }
+
             if let Err(e) = ws_events_tx.try_broadcast(ApplicationData {
                 application_tag: data.application_tag,
                 plain_text: msg.clone(),
             }) {
                 error!(error = %e, "Failed to notify websockets about a new message");
             }
         }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between 5ead402 and 9a16c69.

⛔ Files ignored due to path filters (154)
  • Cargo.lock is excluded by !**/*.lock
  • vendor/cargo/console-api-0.8.0/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/CHANGELOG.md is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/LICENSE is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/README.md is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/async_ops.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/common.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/google/protobuf/duration.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/google/protobuf/timestamp.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/instrument.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/resources.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/tasks.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/trace.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/async_ops.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/common.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/google.protobuf.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.async_ops.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.common.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.instrument.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.resources.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.tasks.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.trace.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/instrument.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/resources.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/tasks.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/trace.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/tests/bootstrap.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/CHANGELOG.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/Cargo.lock is excluded by !**/*.lock, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/LICENSE is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/README.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/app.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/barrier.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/dump.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/README.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/README.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/buf.gen.yaml is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/index.html is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/package-lock.json is excluded by !**/package-lock.json, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/package.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/App.css is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/App.tsx is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/async_ops_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/common_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/google/protobuf/duration_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/google/protobuf/timestamp_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/instrument_connect.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/instrument_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/resources_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/tasks_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/trace_connect.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/trace_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/index.css is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/main.tsx is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/vite-env.d.ts is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/tsconfig.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/tsconfig.node.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/vite.config.ts is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/main.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/local.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/long_scheduled.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/long_sleep.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/mutex.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/rwlock.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/semaphore.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/uds.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/aggregator/id_data.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/aggregator/mod.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/aggregator/shrink.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/attribute.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/builder.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/callsites.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/record.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/stack.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/stats.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/sync.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/visitors.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/framework.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/poll.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/spawn.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/mod.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/state.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/subscriber.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/task.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/wake.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/CHANGELOG.md is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/Cargo.lock is excluded by !**/*.lock, !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/LICENSE-APACHE is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/LICENSE-MIT is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/README.md is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/clone.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/interval_log.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/quantiles.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/record.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/serialization.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/clippy.toml is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/examples/cli.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/core/counter.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/core/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/errors/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/all.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/linear.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/log.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/quantile.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/recorded.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/benchmarks.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/deserializer.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/interval_log/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/interval_log/tests.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/rand_varint.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/tests.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/v2_deflate_serializer.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/v2_serializer.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/sync/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/helpers.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/index_calculation.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/init.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/subtract.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/tests.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/value_calculation.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/auto_resize.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/seq-nums.hist is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/seq-nums.histz is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/seq-nums.txt is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/tagged-Log.logV2.hlog is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data_access.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/histogram.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/interval_log.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/iterators.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/quantile.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/serialization.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/sync.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/LICENSE is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/README.md is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/any.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/compiler.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/datetime.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/duration.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/protobuf.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/timestamp.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/type_url.rs is excluded by !vendor/**
📒 Files selected for processing (6)
  • Cargo.toml (2 hunks)
  • db/sql/src/channels.rs (2 hunks)
  • hoprd/hoprd/Cargo.toml (2 hunks)
  • hoprd/hoprd/src/main.rs (2 hunks)
  • scripts/setup-local-cluster.sh (2 hunks)
  • tests/node.py (1 hunks)
🔇 Additional comments (7)
hoprd/hoprd/Cargo.toml (1)

48-51: LGTM! Well-structured profiling setup for debugging OOM issues.

The addition of the prof feature with console-subscriber is a good approach for investigating the memory leak issues mentioned in #6473.

Also applies to: 82-82

Cargo.toml (1)

165-165: Good addition of tokio tracing feature!

The addition of the "tracing" feature to tokio is essential for proper memory profiling and complements the console-subscriber integration. This will help diagnose the OOM issues by providing detailed insights into async task behavior and memory patterns.

Let's verify that all necessary tracing features are enabled:

✅ Verification successful

Based on the shell script results, I can now provide a final response:

All necessary tracing features are properly configured

The verification shows that:

  1. The tokio "tracing" feature is correctly enabled in Cargo.toml:
tokio = { version = "1.40.0", features = ["rt-multi-thread", "macros", "tracing"] }
  1. The console-subscriber is properly added as a dependency:
console-subscriber = "0.4.0"

This configuration will allow proper memory profiling and debugging of OOM issues since:

  • The tokio tracing feature enables instrumentation of async tasks and runtime events
  • The console-subscriber integration will allow collecting and analyzing memory usage patterns
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify tracing-related dependencies and features

# Check for tracing-related dependencies
echo "Checking tracing dependencies..."
rg '"tracing"' -A 5

# Check if tokio console features are properly configured
echo "Checking tokio console configuration..."
rg 'console-subscriber|tokio.*tracing' -A 2

Length of output: 423755

scripts/setup-local-cluster.sh (1)

157-157: LGTM! Tokio Console integration for OOM investigation.

The addition of Tokio Console binding is a good step towards investigating the OOM issue mentioned in #6473. The port offset of 100 prevents conflicts with other services.

db/sql/src/channels.rs (3)

Line range hint 219-234: Good use of stream processing for memory efficiency.

The implementation of get_all_channels uses streaming to process results, which is memory-efficient for large result sets. This helps prevent OOM issues when dealing with many channels.


Line range hint 108-111: Improve cache invalidation robustness.

The current cache invalidation approach has two potential issues:

  1. Race condition: Gap between DB update and cache invalidation could lead to stale data
  2. Unhandled errors: Cache invalidation failures are silently ignored

Consider:

  1. Using a transaction-like pattern for atomic DB+cache updates
  2. Proper error handling for cache operations

Let's check for similar patterns in the codebase:

#!/bin/bash
# Search for cache invalidation patterns
rg "invalidate.*await" --type rust

Also applies to: 267-271


Line range hint 146-166: Consider implementing cache size limits to prevent OOM.

Given the PR's objective to fix OOM issues, the unbounded channel cache could potentially contribute to memory growth. Consider implementing:

  1. Maximum cache size limit
  2. Eviction policy (e.g., LRU)
  3. Cache entry TTL

This would help prevent unconstrained memory usage while maintaining the performance benefits of caching.

Let's verify the cache implementation:

hoprd/hoprd/src/main.rs (1)

Line range hint 292-341: Verify memory usage patterns with the new profiling feature.

Now that profiling is enabled, we should verify the effectiveness of the memory leak fixes.

Let's analyze the memory usage patterns:

@Teebor-Choka Teebor-Choka changed the title Fix OOM issue on /account/channel querying Add a profiling build for the tokio runtime Oct 31, 2024
@Teebor-Choka Teebor-Choka force-pushed the kauki/bug/oom-issues-on-channel-endpoint branch from b36c76c to e23721b Compare October 31, 2024 16:35
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range comments (1)
db/sql/src/channels.rs (1)

Line range hint 146-171: LGTM! Consider adding documentation for the caching behavior.

The caching implementation in get_channel_by_parties looks solid with proper invalidation in write operations. Consider adding documentation about:

  • When to use caching (performance implications)
  • Cache invalidation behavior
  • Memory usage considerations

Add documentation like:

/// Retrieves the channel by source and destination.
/// 
/// # Arguments
/// * `use_cache` - When true, attempts to fetch from cache before querying the database.
///                 Use this for performance-critical paths where eventual consistency is acceptable.
///                 Cache is automatically invalidated on channel updates.
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between 9a16c69 and e23721b.

⛔ Files ignored due to path filters (154)
  • Cargo.lock is excluded by !**/*.lock
  • vendor/cargo/console-api-0.8.0/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/CHANGELOG.md is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/LICENSE is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/README.md is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/async_ops.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/common.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/google/protobuf/duration.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/google/protobuf/timestamp.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/instrument.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/resources.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/tasks.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/proto/trace.proto is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/async_ops.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/common.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/google.protobuf.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.async_ops.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.common.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.instrument.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.resources.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.tasks.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/generated/rs.tokio.console.trace.rs is excluded by !**/generated/**, !vendor/**
  • vendor/cargo/console-api-0.8.0/src/instrument.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/resources.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/tasks.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/src/trace.rs is excluded by !vendor/**
  • vendor/cargo/console-api-0.8.0/tests/bootstrap.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/CHANGELOG.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/Cargo.lock is excluded by !**/*.lock, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/LICENSE is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/README.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/app.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/barrier.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/dump.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/README.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/README.md is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/buf.gen.yaml is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/index.html is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/package-lock.json is excluded by !**/package-lock.json, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/package.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/App.css is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/App.tsx is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/async_ops_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/common_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/google/protobuf/duration_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/google/protobuf/timestamp_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/instrument_connect.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/instrument_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/resources_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/tasks_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/trace_connect.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/gen/trace_pb.ts is excluded by !**/gen/**, !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/index.css is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/main.tsx is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/src/vite-env.d.ts is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/tsconfig.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/tsconfig.node.json is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/app/vite.config.ts is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/grpc_web/main.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/local.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/long_scheduled.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/long_sleep.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/mutex.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/rwlock.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/semaphore.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/examples/uds.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/aggregator/id_data.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/aggregator/mod.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/aggregator/shrink.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/attribute.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/builder.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/callsites.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/record.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/stack.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/stats.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/sync.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/src/visitors.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/framework.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/poll.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/spawn.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/mod.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/state.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/subscriber.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/support/task.rs is excluded by !vendor/**
  • vendor/cargo/console-subscriber-0.4.0/tests/wake.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/CHANGELOG.md is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/Cargo.lock is excluded by !**/*.lock, !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/LICENSE-APACHE is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/LICENSE-MIT is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/README.md is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/clone.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/interval_log.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/quantiles.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/record.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/benches/serialization.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/clippy.toml is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/examples/cli.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/core/counter.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/core/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/errors/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/all.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/linear.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/log.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/quantile.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/iterators/recorded.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/benchmarks.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/deserializer.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/interval_log/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/interval_log/tests.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/rand_varint.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/tests.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/v2_deflate_serializer.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/serialization/v2_serializer.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/sync/mod.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/helpers.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/index_calculation.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/init.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/subtract.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/tests.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/src/tests/value_calculation.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/auto_resize.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/seq-nums.hist is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/seq-nums.histz is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/seq-nums.txt is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data/tagged-Log.logV2.hlog is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/data_access.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/histogram.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/interval_log.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/iterators.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/quantile.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/serialization.rs is excluded by !vendor/**
  • vendor/cargo/hdrhistogram-7.5.4/tests/sync.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/.cargo-checksum.json is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/Cargo.toml is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/LICENSE is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/README.md is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/any.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/compiler.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/datetime.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/duration.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/lib.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/protobuf.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/timestamp.rs is excluded by !vendor/**
  • vendor/cargo/prost-types-0.13.1/src/type_url.rs is excluded by !vendor/**
📒 Files selected for processing (6)
  • Cargo.toml (2 hunks)
  • db/sql/src/channels.rs (2 hunks)
  • hoprd/hoprd/Cargo.toml (2 hunks)
  • hoprd/hoprd/src/main.rs (2 hunks)
  • scripts/setup-local-cluster.sh (2 hunks)
  • tests/node.py (1 hunks)
🔇 Additional comments (10)
hoprd/hoprd/Cargo.toml (1)

51-53: LGTM! Profiling feature implementation looks good.

The profiling feature is correctly implemented with the required dependency and build configuration.

tests/node.py (2)

144-144: 🛠️ Refactor suggestion

Enhance memory debugging for OOM investigation.

Given the OOM issues described in #6473, consider adding comprehensive memory monitoring capabilities alongside the tokio console:

             "OTEL_SERVICE_NAME": f"hoprd-{self.p2p_port}",
             "TOKIO_CONSOLE_BIND": f"localhost:{self.p2p_port+100}",
+            # Memory limits and monitoring
+            "HOPRD_MEMORY_LIMIT_MB": os.getenv("HOPRD_MEMORY_LIMIT_MB", "2048"),
+            "HOPRD_MEMORY_STATS": "true",
+            "HOPRD_MEMORY_STATS_INTERVAL_SECS": "300",  # 5 minutes
+            # Heap profiling with jemalloc
+            "JEMALLOC_SYS_WITH_MALLOC_CONF": "prof:true,lg_prof_interval:28",
+            "HOPRD_HEAP_PROFILE_DIR": str(dir.joinpath(f"heap_profile_{self.p2p_port}")),
+            # Runtime metrics
+            "HOPRD_METRICS_ENABLED": "true",
+            "HOPRD_METRICS_PORT": str(self.p2p_port + 200)

These additions will help:

  1. Test memory limits to reproduce OOM scenarios
  2. Profile heap allocations to identify memory leaks
  3. Monitor runtime metrics for memory usage patterns
  4. Generate memory stats for long-term analysis

This complements the tokio console profiling by providing a complete suite of tools for investigating the memory issues reported in production nodes.

Likely invalid or redundant comment.


144-144: LGTM: Tokio console binding added for runtime profiling.

The addition of TOKIO_CONSOLE_BIND environment variable is a good step towards enabling runtime profiling capabilities.

Let's verify port availability in the test environment:

✅ Verification successful

Port configuration for Tokio console is safe from conflicts

The port allocation scheme shows no conflicts:

  • Test nodes: p2p_port = port_base + (id * 10) + 2
  • Local cluster: p2p_port = node_p2p_base_port + node_id
  • Tokio console ports: p2p_port + 100 and api_port + 100

The 100-port offset provides sufficient spacing from other services, and the port allocation patterns ensure no overlap between nodes.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the tokio console ports might conflict with other services
# Test: Search for other port configurations in the range of p2p_port + 100
rg -g '!target/**' -g '*.{py,rs,sh,toml}' 'port.*\+.*100'

Length of output: 227


Script:

#!/bin/bash
# Let's check the p2p_port and api_port configurations to understand potential overlaps
rg -g '!target/**' -g '*.{py,rs,sh,toml}' '(p2p_port|api_port)\s*=.*\d+'

Length of output: 683

Cargo.toml (1)

165-169: Verify tokio profiling configuration.

The addition of the "tracing" feature to tokio is essential for the profiling build. To ensure this works effectively with the console-subscriber:

  1. Verify that the profiling feature flag is properly configured in the crate's feature declarations
  2. Ensure that the build system is set up to use RUSTFLAGS="--cfg tokio_unstable" when profiling is enabled
scripts/setup-local-cluster.sh (2)

92-93: Port configuration changes look good.

The port swap between API and P2P services is properly implemented and maintains consistency throughout the script.


157-157: Verify port availability for Tokio Console.

The Tokio Console binding is correctly implemented with a port offset to prevent conflicts. However, we should ensure these console ports are available.

Let's verify that the script checks for port availability:

Consider adding port availability checks for the console ports (19191-19195) to prevent binding failures.

✅ Verification successful

Let me verify if the Tokio Console ports are properly checked. I'll search for the port calculation and its usage.


Let me check the base port values to understand the complete port allocation scheme.


Port availability checks are already implemented correctly

The script already has comprehensive port availability checks:

  • API ports start at 19091 and are verified using ensure_port_is_free
  • Tokio Console ports are dynamically calculated as api_port + 100, which means they start at 19191
  • The script checks all required ports (8545, 13301-13305, 19091-19095) before starting the nodes

Since the base ports are properly defined and all required ports are verified before use, no additional port availability checks are needed.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify port availability checks for Tokio Console
# Test: Search for port availability checks. Expect: Should include console ports.
rg "ensure_port_is_free" -A 10

Length of output: 1758


Script:

#!/bin/bash
# Search for api_port definition and its usage in port calculations
rg "api_port" -A 5 scripts/setup-local-cluster.sh

Length of output: 2098


Script:

#!/bin/bash
# Search for base port definitions
rg "node_api_base_port|node_p2p_base_port" scripts/setup-local-cluster.sh

Length of output: 312

db/sql/src/channels.rs (2)

Line range hint 292-301: Skipping comment about unused imports.

This issue was already identified in previous reviews.


Line range hint 146-171: Consider implementing cache size management.

Given the context of memory leak investigations (issue #6473), it would be prudent to implement cache size management to prevent unbounded memory growth in long-running nodes. Consider:

  • Implementing a maximum cache size
  • Adding cache entry eviction policies
  • Monitoring cache size metrics

Let's check for cache size management implementations:

hoprd/hoprd/src/main.rs (2)

88-88: LGTM: Registry configuration is correct.

The registry configuration properly integrates the format layer while maintaining compatibility with both profiling and non-profiling builds.


61-72: Verify profiling feature configuration.

Let's ensure the profiling feature is properly configured in the codebase.

✅ Verification successful

Let me check the specific configuration details in the Cargo.toml files and the profiling feature usage.


Profiling configuration is properly set up

The profiling feature is correctly configured:

  • The prof feature is properly defined in hoprd/hoprd/Cargo.toml with required console-subscriber dependency
  • Clear build instructions are documented in the Cargo.toml comments
  • The feature flag is correctly used in main.rs to conditionally enable tracing with console subscriber
  • The implementation properly sets up detailed tokio tracing when profiling is enabled
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify profiling feature configuration

# Test 1: Check if prof feature is defined in Cargo.toml
echo "Checking prof feature definition..."
rg -l 'features.*=.*\[.*prof.*\]' -g 'Cargo.toml'

# Test 2: Verify console-subscriber dependency
echo "Checking console-subscriber dependency..."
rg -l 'console-subscriber.*=.*' -g 'Cargo.toml'

# Test 3: Look for other profiling-related code
echo "Checking for other profiling-related code..."
rg -l '#\[cfg\(feature.*=.*"prof"\)\]'

Length of output: 507


Script:

#!/bin/bash
# Check the prof feature configuration in hoprd Cargo.toml
echo "=== hoprd Cargo.toml prof configuration ==="
rg 'prof' -A 5 -B 5 hoprd/hoprd/Cargo.toml

# Check console-subscriber dependency configuration
echo -e "\n=== Console subscriber dependency configuration ==="
rg 'console-subscriber' -A 2 -B 2 hoprd/hoprd/Cargo.toml

# Check all prof feature usages in main.rs
echo -e "\n=== Prof feature usage in main.rs ==="
rg '#\[cfg.*prof' -A 5 -B 5 hoprd/hoprd/src/main.rs

Length of output: 1962

Copy link
Contributor

@NumberFour8 NumberFour8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document that in README.md how to do a profiling build and an example how to profile the created binary.

@Teebor-Choka Teebor-Choka changed the title Add a profiling build for the tokio runtime Add a instrumentation build for the tokio runtime Nov 4, 2024
@Teebor-Choka Teebor-Choka added this pull request to the merge queue Nov 4, 2024
Merged via the queue into master with commit d122e5b Nov 4, 2024
25 of 26 checks passed
@Teebor-Choka Teebor-Choka deleted the kauki/bug/oom-issues-on-channel-endpoint branch November 4, 2024 12:36
@coderabbitai coderabbitai bot mentioned this pull request Jan 14, 2025
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
crate:hoprd dependencies Pull requests that update a dependency file testing Related to tests toolchain Developer and product happiness
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants