indexer: Reduce on-chain rpc calls #6671
## Walkthrough
The pull request includes updates to several files across the project, primarily focusing on version increments in various `Cargo.toml` files and significant modifications to the functionality and error handling in the `chain/indexer` and `chain/rpc` modules. Key changes involve refactoring methods to enhance clarity and maintainability, updating return types for log retrieval in the database API, and adjusting logging practices for better error reporting. Additionally, the default timeout for HTTP requests has been increased, and environment variables for Rust builds have been updated.
## Changes
| File | Change Summary |
|----------------------------------|-----------------------------------------------------------------------------------------------------------------|
| `chain/indexer/Cargo.toml` | Version updated from "0.5.2" to "0.5.3". |
| `chain/indexer/src/block.rs` | Refactored `start` and `process_block_by_id` methods; updated `print_indexer_state` signature; added `update_chain_head` method; improved error handling and logging. |
| `chain/rpc/Cargo.toml` | Version updated from "0.3.0" to "0.3.1". |
| `chain/rpc/src/client.rs` | Enhanced logging in `JsonRpcProviderClient`; refined error handling and retry logic. |
| `chain/rpc/src/indexer.rs` | Updated `try_stream_logs` method for error handling; refined logging and control flow. |
| `chain/rpc/src/lib.rs` | Default timeout for HTTP POST requests updated from 5 seconds to 30 seconds. |
| `chain/rpc/src/rpc.rs` | Added `node_safe` field to `RpcOperations`; updated constructor to initialize `node_safe`. |
| `db/api/Cargo.toml` | Version updated from "0.5.1" to "0.5.2". |
| `db/api/src/logs.rs` | Updated return types for `get_logs` and `get_logs_block_numbers` methods from `BoxStream` to `Vec`. |
| `db/sql/Cargo.toml` | Version updated from "0.8.2" to "0.8.3". |
| `db/sql/src/logs.rs` | Updated return types for `get_logs` and `get_logs_block_numbers`; improved error handling in `store_logs`. |
| `nix/rust-package.nix` | Added `RUST_MIN_STACK` environment variable with value "16777216". |
| `nix/shell.nix` | Updated `RUST_MIN_STACK` from "8388608" to "16777216". |
## Assessment against linked issues
| Objective | Addressed | Explanation |
|---------------------------------------------------------|-----------|-------------------------------------------|
| Ensure WAL file size is lower than the main log file (6652) | ❓ | Changes to logging and error handling may indirectly address this, but it's unclear if they directly impact WAL file size. |
## Possibly related PRs
- **#6641**: Improves logging in indexer block processing, relevant to changes in `chain/indexer/src/block.rs`.
- **#6645**: Addresses Fast-Sync behavior and CLI parameters, related to synchronization handling in `chain/indexer/src/block.rs`.
- **#6646**: Enhances logging to reduce clutter in the indexer process, aligning with logging improvements in the main PR.
- **#6644**: Fixes bugs in log processing and Fast-Sync, directly related to indexing improvements in the main PR.
- **#6649**: Improves some logs DB queries, relevant to changes in the indexing process and logging mechanisms.
## Suggested labels
`bug`, `improvement`, `crate:hoprd-api`
## Suggested reviewers
- Teebor-Choka
- NumberFour8
Force-pushed from `5f87a7a` to `216a0d8`
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (13)
chain/rpc/src/indexer.rs (1)

Line range hint 183-190: Avoid panicking on unrecoverable RPC errors

The code panics when it encounters unrecoverable RPC errors after exceeding the maximum number of retries:

```rust
if count_failures < MAX_LOOP_FAILURES {
    // Retry logic
} else {
    panic!("!!! Cannot advance the chain indexing due to unrecoverable RPC errors. The RPC provider does not seem to be working correctly. The last encountered error was: {e}");
}
```

Panicking can crash the entire application, which is undesirable in production. Instead, consider returning an error and allowing the calling code to handle it appropriately. A proper error-handling mechanism makes the application more robust.

Apply this diff to return an error instead of panicking:

```diff
 if count_failures < MAX_LOOP_FAILURES {
     // Retry logic
 } else {
-    panic!("!!! Cannot advance the chain indexing due to unrecoverable RPC errors. The RPC provider does not seem to be working correctly. The last encountered error was: {e}");
+    error!("Cannot advance the chain indexing due to unrecoverable RPC errors. The RPC provider does not seem to be working correctly. The last encountered error was: {e}");
+    return Err(RpcError::UnrecoverableError(format!("Unrecoverable RPC error after {} attempts: {}", MAX_LOOP_FAILURES, e)));
 }
```

db/sql/src/logs.rs (4)
2-2: Ensure `stream` is still required after changes

The `stream` module from `futures` is imported, but with the removal of streaming in methods like `get_logs`, it may no longer be necessary. Consider removing it if it is unused to clean up dependencies.
122-122: Consider implementing pagination to handle large datasets

The `get_logs` method now returns a `Vec<SerializableLog>`, which collects all logs into memory. While this simplifies the code and avoids issues with dangling connections, it may lead to high memory usage when dealing with a large number of logs. Implementing pagination or a limit on the number of logs retrieved could help manage memory usage efficiently.
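One way to bound memory is to fetch logs in fixed-size block-range pages rather than one unbounded query. A minimal sketch of the page computation (the `block_pages` helper and `page_size` knob are hypothetical, not existing APIs in this codebase):

```rust
/// Split an inclusive block range into fixed-size pages so logs can be
/// fetched in bounded batches instead of a single unbounded `Vec`.
/// `page_size` is a hypothetical tuning knob, not an existing config value.
fn block_pages(from: u64, to: u64, page_size: u64) -> Vec<(u64, u64)> {
    assert!(page_size > 0, "page_size must be positive");
    let mut pages = Vec::new();
    let mut start = from;
    while start <= to {
        let end = to.min(start + page_size - 1);
        pages.push((start, end));
        start = end + 1;
    }
    pages
}

fn main() {
    // Each page could then drive one bounded query, e.g. fetching only the
    // logs between `start` and `end` before moving to the next page.
    let pages = block_pages(100, 350, 100);
    assert_eq!(pages, vec![(100, 199), (200, 299), (300, 350)]);
}
```

The same idea applies to `get_logs_block_numbers`: iterating pages keeps each result set small while still covering the full range.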
Line range hint 173-197: Manage memory usage in `get_logs_block_numbers` with pagination

Returning all block numbers as a `Vec<u64>` can consume a significant amount of memory if the dataset is large. Consider implementing pagination or limiting the number of block numbers retrieved to handle large datasets more efficiently.
Line range hint 413-491: Refactor tests to avoid potential performance issues with large datasets

The tests `test_store_single_log`, `test_store_multiple_logs`, and others call `db.get_logs(None, None).await.unwrap();`, retrieving all logs from the database. As the number of logs grows, this could lead to increased test execution time and memory usage. Consider fetching only the necessary logs or using specific filters to keep the tests efficient.

chain/indexer/src/block.rs (3)
167-168: Optimize log processing during fast sync

The current implementation processes each block individually during fast sync, which may lead to performance issues if there are a large number of blocks to process. Consider processing logs in batches or using asynchronous tasks to improve performance and reduce the total sync time.
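A batching approach can be sketched with plain threads; `process_block` and `fast_sync` below are illustrative placeholders, not the indexer's real API:

```rust
use std::thread;

// Stand-in for the real per-block log processing.
fn process_block(id: u64) -> u64 {
    id * 2 // placeholder work
}

/// Sketch: during fast sync, process blocks in fixed-size batches with one
/// worker thread per block in the batch, instead of strictly one at a time.
fn fast_sync(blocks: Vec<u64>, batch_size: usize) -> Vec<u64> {
    let mut results = Vec::new();
    for batch in blocks.chunks(batch_size) {
        // Spawn the whole batch, then join in order so results stay ordered.
        let handles: Vec<_> = batch
            .iter()
            .copied()
            .map(|id| thread::spawn(move || process_block(id)))
            .collect();
        for h in handles {
            results.push(h.join().unwrap());
        }
    }
    results
}

fn main() {
    let out = fast_sync((1..=6).collect(), 3);
    assert_eq!(out, vec![2, 4, 6, 8, 10, 12]); // order preserved per batch
}
```

In the real async indexer the same shape would use tasks rather than threads, with the batch size bounding both concurrency and memory.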
202-202: Unused return value from `update_chain_head`

The `update_chain_head` method returns a `u64`, but its return value is not used here. If the returned value is unnecessary, consider modifying the method to not return a value, or capture the return value if it is needed elsewhere.
358-358: Handle potential errors from `print_indexer_state`

After updating log checksums, the call to `Self::print_indexer_state(db, block_id).await` does not handle any potential errors that might occur within the function. Ensure that any errors are appropriately handled or logged to prevent silent failures.

nix/shell.nix (1)
84-84: Document the specific reason for the stack size increase

While the comment indicates this is required for tests and compilation, it would be helpful to document the specific scenarios or components that necessitate this increased stack size. This information would be valuable for future maintenance and optimization efforts.

Consider expanding the comment:

```diff
-RUST_MIN_STACK = "16777216"; # 16MB required to run the tests and compilation
+RUST_MIN_STACK = "16777216"; # 16MB required due to [specific component/test] stack requirements during [specific operation]
```

db/api/src/logs.rs (1)
53-58: LGTM! Good architectural improvement.

The change from `BoxStream` to `Vec` return types simplifies the API and helps prevent dangling database connections. This aligns well with the PR objectives and should help address the WAL file size issues by ensuring transactions complete properly.

chain/rpc/src/lib.rs (1)
251-252: Consider making the timeout configurable per operation type.

While 30 seconds is suitable for `eth_getLogs`, other RPC operations might benefit from different timeout values. Consider introducing operation-specific timeout configurations.

Example approach:

```diff
 pub struct HttpPostRequestorConfig {
     /// Timeout for HTTP POST request
     ///
     /// Defaults to 30 seconds.
     #[default(Duration::from_secs(30))]
     pub http_request_timeout: Duration,
+
+    /// Operation-specific timeout overrides
+    #[default(None)]
+    pub operation_timeouts: Option<HashMap<String, Duration>>,
 }
```

chain/rpc/src/rpc.rs (1)
Line range hint 449-489: Consider making test timeouts configurable

The test uses fixed sleep durations, which might be insufficient on slower networks or under heavy load. Consider making these timeouts configurable or calculating them based on network conditions.

```diff
-sleep((1 + cfg.finality) * expected_block_time).await;
+let wait_duration = std::cmp::max(
+    Duration::from_secs(5),
+    (1 + cfg.finality) * expected_block_time
+);
+sleep(wait_duration).await;
```

chain/rpc/src/client.rs (1)
25-25: LGTM! Enhanced logging improves error traceability.

The logging improvements provide better visibility into retry mechanisms and error scenarios by adding structured fields and descriptive messages.

Consider using consistent field names across log messages, for example `error = ?e` vs `error = ?err`.

Also applies to: 177-195, 202-239
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
⛔ Files ignored due to path filters (1)

- `Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (13)

- chain/indexer/Cargo.toml (1 hunks)
- chain/indexer/src/block.rs (7 hunks)
- chain/rpc/Cargo.toml (1 hunks)
- chain/rpc/src/client.rs (6 hunks)
- chain/rpc/src/indexer.rs (4 hunks)
- chain/rpc/src/lib.rs (1 hunks)
- chain/rpc/src/rpc.rs (1 hunks)
- db/api/Cargo.toml (1 hunks)
- db/api/src/logs.rs (2 hunks)
- db/sql/Cargo.toml (1 hunks)
- db/sql/src/logs.rs (12 hunks)
- nix/rust-package.nix (1 hunks)
- nix/shell.nix (1 hunks)
✅ Files skipped from review due to trivial changes (4)
- chain/indexer/Cargo.toml
- chain/rpc/Cargo.toml
- db/api/Cargo.toml
- db/sql/Cargo.toml
🔇 Additional comments (12)
chain/rpc/src/indexer.rs (1)
14-14: Importing necessary traits for stream handling

The addition of `use futures::{Stream, StreamExt};` imports the traits needed for the stream operations used in the code below.
db/sql/src/logs.rs (1)
770-772: Verify that all logs have updated checksums

In `test_update_logs_checksums`, after updating the checksums, it is important to ensure that all logs now have a checksum value. This verification step confirms that `update_logs_checksums` is functioning as expected.

Run the following script to verify that all logs have non-null checksums:
chain/indexer/src/block.rs (2)
386-387: Confirm consistent usage of the updated `print_indexer_state` method

The `print_indexer_state` method now requires a `block_number` parameter. Verify that all calls to this method within the codebase have been updated to include the new parameter to prevent compilation errors or unexpected behavior.
472-483: Prevent potential division by zero in sync progress calculation

In the `calculate_sync_process` method, there is a possibility of dividing by zero if `block_difference` remains zero after updating the chain head. Ensure that you check for this condition and handle it to prevent runtime errors.

The current code:

```rust
let progress = if block_difference == 0 {
    // Before we call the sync complete, we check the chain again.
    head = Self::update_chain_head(rpc, chain_head.clone()).await;
    block_difference = head.saturating_sub(next_block_to_process);
    if block_difference == 0 {
        1_f64
    } else {
        (current_block - next_block_to_process) as f64 / block_difference as f64
    }
} else {
    (current_block - next_block_to_process) as f64 / block_difference as f64
};
```

Suggested addition:

```diff
+// Ensure block_difference is not zero before division
+if block_difference == 0 {
+    progress = 1_f64;
+}
```
Likely invalid or redundant comment.
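For reference, the guard can be expressed as a pure function in which the denominator is checked before any division can happen; the helper below is illustrative (names follow the snippet above, but `sync_progress` itself is not part of the codebase):

```rust
/// Sketch: sync progress in [0.0, 1.0]; returns 1.0 when nothing is left to
/// process, so the division can never see a zero denominator.
fn sync_progress(current_block: u64, next_block_to_process: u64, head: u64) -> f64 {
    let block_difference = head.saturating_sub(next_block_to_process);
    if block_difference == 0 {
        1_f64 // fully synced: no division performed
    } else {
        current_block.saturating_sub(next_block_to_process) as f64 / block_difference as f64
    }
}

fn main() {
    assert_eq!(sync_progress(150, 100, 200), 0.5);
    assert_eq!(sync_progress(100, 100, 100), 1.0);
}
```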
nix/shell.nix (1)
84-84: Verify stack size requirements across environments

The increase in stack size from 8MB to 16MB is significant. While this change may be necessary for tests and compilation, it is important to verify that this requirement is consistent across different environments and use cases.

Let's verify the stack usage:
✅ Verification successful
Based on the search results, I can now provide a final response regarding the stack size increase:
Stack size increase appears justified due to deep recursion and large data structures

The codebase shows evidence of operations that can require significant stack space:

- Complex data processing with nested structs and large buffers (e.g. `Vec::with_capacity` allocations up to 64KB)
- Deep async/await chains in network and transport layers
- Recursive operations in XML processing and data encoding/decoding
- Heavy use of compression/decompression with zstd, which can involve deep call stacks
- Large struct definitions with nested fields that consume stack space

The increase from 8MB to 16MB stack size is reasonable given these characteristics, particularly for test execution, which may exercise multiple code paths simultaneously.
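For context, `RUST_MIN_STACK` controls the default stack size of threads spawned via `std::thread`; the same effect can be requested for a single thread with `Builder::stack_size`, which makes the setting easy to reproduce in isolation:

```rust
use std::thread;

fn main() {
    // RUST_MIN_STACK=16777216 raises the default stack for spawned threads;
    // the equivalent 16 MiB can be requested explicitly per thread:
    let handle = thread::Builder::new()
        .stack_size(16 * 1024 * 1024)
        .spawn(|| {
            // A moderately deep recursion that fits comfortably in 16 MiB.
            fn depth(n: u32) -> u32 {
                if n == 0 { 0 } else { 1 + depth(n - 1) }
            }
            depth(10_000)
        })
        .expect("failed to spawn thread");
    assert_eq!(handle.join().unwrap(), 10_000);
}
```

Note that `RUST_MIN_STACK` does not affect the main thread's stack, which is one reason stack-heavy tests (which run on spawned threads) are the components most sensitive to this setting.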
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Description: Search for deep recursion, large stack frames, or complex data structures
# that might necessitate the increased stack size

# Look for recursive function calls
echo "Searching for recursive functions..."
ast-grep --pattern 'fn $NAME($$$) {
  $$$
  $NAME($$$)
  $$$
}'

# Look for large struct definitions that might create large stack frames
echo "Searching for large struct definitions..."
rg -A 20 'struct \w+\s*\{'
```

Length of output: 287

Script:

```shell
#!/bin/bash
# Let's try a different approach to find potential stack-heavy operations

# Look for recursive functions in Rust files only
echo "Searching for recursive functions in Rust files..."
rg -t rust "fn\s+(\w+)[^{]+\{[^}]*\1\s*\(" -A 5

# Look for large struct definitions in Rust files
echo "Searching for large struct definitions in Rust files..."
rg -t rust "struct\s+\w+\s*\{" -A 10

# Check test files that might require more stack
echo "Searching for test functions..."
rg -t rust "#\[test\]" -A 5

# Look for async recursive functions that might stack up
echo "Searching for async recursive functions..."
rg -t rust "async\s+fn\s+(\w+)[^{]+\{[^}]*\1\s*\(" -A 5
```

Length of output: 158778

Script:

```shell
#!/bin/bash
# Let's look for evidence of stack-heavy operations and large data structures

# Search for recursive functions that might need stack space
echo "Searching for recursive functions..."
rg -t rust "fn\s+\w+.*\{[\s\S]*?\bself\b.*\}" -A 5

# Look for large arrays or buffers that might need stack space
echo "Searching for large static arrays or buffers..."
rg -t rust "(static|const).*\[[0-9]+\]" -A 2

# Look for test functions that might require more stack
echo "Searching for test functions with large allocations..."
rg -t rust "#\[test\].*fn.*\{[\s\S]*?(vec!\[|Vec::with_capacity)" -A 10

# Look for evidence of deep call stacks in async code
echo "Searching for nested async operations..."
rg -t rust "async\s+fn.*\{[\s\S]*?\.await.*\.await" -A 5
```

Length of output: 101187

Script:

```shell
#!/bin/bash
# Let's look for evidence of deep recursion and large memory usage

# Search for recursive functions in test files
echo "Searching for recursive test functions..."
rg -t rust "fn\s+\w+.*\{.*\bself\b.*\}" tests/ -A 5

# Look for large memory allocations
echo "Searching for large memory allocations..."
rg -t rust "(Vec::with_capacity|alloc::alloc|Box::new)" -A 3

# Look for deep async call chains
echo "Searching for deep async call chains..."
rg -t rust "async\s+fn.*\{.*\.await.*\.await.*\.await" -A 5

# Look for test files that might be memory intensive
echo "Searching test files for memory usage..."
rg -t rust "#\[test\]" tests/ -A 10
```

Length of output: 104283
db/api/src/logs.rs (1)
81-86: Verify memory usage with large result sets

While the change to `Vec` return types is good for connection management, we should verify that memory usage remains reasonable when retrieving large sets of logs or block numbers.
nix/rust-package.nix (1)
84-84: Verify if the 16MB stack size is sufficient for all scenarios

The addition of `RUST_MIN_STACK` with 16MB is well-documented and consistent with other configuration files. However, we should verify that this value is sufficient for all use cases, especially during intensive operations like fast-sync.

Let's check for potential stack-related issues in the codebase:
✅ Verification successful
16MB stack size is appropriate for the codebase requirements
The stack size of 16MB is well-justified based on the codebase analysis:
- The same stack size (16777216 bytes) is consistently configured in both `nix/shell.nix` and `nix/rust-package.nix`
- No evidence of deeply recursive functions or stack-intensive patterns was found in the Rust code
- The test files present in the codebase don't show patterns of heavy stack usage
- The comment accurately describes its purpose for "tests and compilation"
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Description: Search for deep recursion, large arrays/collections, or heavy stack usage patterns

# Check for recursive functions that might need larger stack
echo "Checking for recursive functions..."
ast-grep --pattern 'fn $NAME($$$) {
  $$$
  $NAME($$$)
  $$$
}'

# Check for large static arrays or collections
echo "Checking for large static arrays..."
rg -A 2 '\[\s*[^]]+\s*;\s*\d{4,}\s*\]'

# Look for stack overflow related issues or discussions
echo "Checking for stack-related comments/issues..."
rg -i "stack.*(overflow|size|limit|exceed)"
```

Length of output: 285

Script:

```shell
#!/bin/bash
# Let's try a different approach to check for potential stack-intensive patterns

# Look for deep recursion patterns in Rust files only
echo "Checking recursive patterns in Rust files..."
rg -t rust "fn\s+\w+.*\{" -A 10 | rg "(?s)fn\s+(\w+).*\{\n.*\b\1\b.*\("

# Check for stack-related configurations or comments in Rust
echo "Checking stack-related configurations..."
rg -t rust -i "(stack|heap).*(size|limit|overflow|exceed)"

# Look for large data structures or collections
echo "Checking for large data structures..."
rg -t rust "(Vec::with_capacity|vec!\[|HashMap::with_capacity)\s*\(\s*\d{4,}\s*\)"

# Check for explicit stack size configurations in build scripts
echo "Checking build configurations..."
fd -e toml -e nix -e sh | xargs rg -i "stack.*(size|limit)"
```

Length of output: 101145

Script:

```shell
#!/bin/bash
# Let's check for stack-intensive operations and actual stack usage in Rust code

# Look for recursive functions and deep call stacks
echo "Checking for recursive patterns in Rust code..."
rg -t rust "fn.*\{" -A 5 | rg "(?m)^\s*(self\.)?\w+\s*\(\s*.*\)" -A 5

# Check for stack-related configurations in build files
echo "Checking stack configurations in build files..."
rg -t nix -t toml "stack.*size|RUST_MIN_STACK"

# Look for test files that might stress stack usage
echo "Checking test files for stack usage patterns..."
fd -e rs -t f "test" | xargs rg -l "test.*fn"
```

Length of output: 1398
chain/rpc/src/lib.rs (1)
251-252: LGTM! Verify the timeout is sufficient for all use cases.

The increase from 5s to 30s is reasonable for blockchain RPC calls, especially for `eth_getLogs` operations that process large block ranges. This change should help reduce HTTP request failures during indexing and fast-sync operations.
Let's verify the impact of this timeout change:
✅ Verification successful
Timeout increase from 5s to 30s is appropriate and well-handled
The timeout increase is well-justified and properly implemented:
- The default HTTP request timeout is used consistently across RPC client implementations
- Error handling for timeouts is in place, including retry mechanisms with configurable backoff
- The codebase has proper handling for transport and connection timeouts
- The 30-second timeout aligns with the RPC client's max backoff setting, providing a balanced approach for long-running operations like `eth_getLogs`
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Description: Check for timeout-related error handling and configurations

# Look for timeout-related error handling
rg -A 3 "timeout|timed out"

# Check for RPC-related configurations that might need timeout adjustments
rg -A 3 "request_timeout|http.*timeout"
```

Length of output: 103008

Script:

```shell
#!/bin/bash
# Search for RPC timeout configurations and related code
rg -A 3 "Duration::from_secs\(30\)|Duration::from_secs\(5\)" chain/rpc/src/lib.rs

# Look for RPC-related error handling and timeout usage
rg -A 3 "eth_getLogs|timeout.*eth" chain/rpc/src/lib.rs
```

Length of output: 445

Script:

```shell
#!/bin/bash
# Search for HTTP timeout usage in RPC client implementations
rg -A 5 "http.*client.*timeout|request.*timeout" chain/

# Look for any RPC error handling related to timeouts
rg -A 3 "error.*timeout|timeout.*error" chain/
```

Length of output: 4400
chain/rpc/src/rpc.rs (3)
20-23: LGTM: Import changes are well-structured

The added imports provide essential types and modules for blockchain operations and are properly organized.
Line range hint 141-146: Verify the impact of finality subtraction on indexing performance

The `get_block_number` method subtracts the finality value from the current block number. While this provides safety against chain reorganizations, it could impact the indexer's ability to process recent blocks quickly.
✅ Verification successful
Based on the search results, the `get_block_number` implementation is used in several contexts, particularly in the indexing process. The key findings are:

1. The finality subtraction is used in the indexer to ensure it only processes finalized blocks, as seen in `chain/rpc/src/indexer.rs`, which handles cases when blocks are not yet finalized.
2. The implementation has built-in safeguards:
   - It checks if requested blocks are beyond the latest block
   - It handles cases where blocks are not yet finalized
   - It includes retry mechanisms and appropriate error handling
3. The indexer specifically uses this finality-adjusted block number to ensure it only processes finalized blocks, which is a common practice to prevent reorg issues.
Finality subtraction is a valid safety measure and does not negatively impact indexing performance
The implementation correctly handles the finality window to ensure data consistency while still allowing the indexer to process blocks as soon as they are finalized. This is actually a performance optimization as it prevents the need for reprocessing blocks in case of chain reorganizations.
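The finality adjustment itself reduces to a saturating subtraction; a minimal sketch (the `finalized_block_number` helper is illustrative, not the crate's actual method):

```rust
/// Sketch: report the latest block minus the finality window so the indexer
/// only sees blocks that are unlikely to be reorganized away.
fn finalized_block_number(latest: u64, finality: u64) -> u64 {
    latest.saturating_sub(finality)
}

fn main() {
    assert_eq!(finalized_block_number(1_000, 12), 988);
    // Early in the chain there are fewer blocks than the finality window;
    // saturating_sub clamps at genesis instead of underflowing.
    assert_eq!(finalized_block_number(5, 12), 0);
}
```

The trade-off named above follows directly: the indexer always lags the chain head by `finality` blocks, in exchange for never having to roll back reorged data.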
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Check for any other consumers of get_block_number that might be affected
rg -A 3 "get_block_number|latest.*block"
```

Length of output: 34780
Line range hint 1-1: Verify HTTP timeout configuration location

The PR objectives mention increasing the HTTP timeout from 5s to 30s for `eth_getLogs` calls, but the configuration is not visible in this file.
✅ Verification successful
Let's examine the relevant files more closely, particularly focusing on the ethers provider configuration since it's related to Ethereum RPC calls.
Based on the search results, I can now provide a final response:
HTTP timeout configuration is correctly implemented
The PR's objective to increase HTTP timeout from 5s to 30s is already properly configured in the codebase:
- The HTTP request timeout is set to 30 seconds by default in `chain/rpc/src/lib.rs`:

```rust
#[default(Duration::from_secs(30))]
pub http_request_timeout: Duration,
```

- This timeout is applied in multiple places:
  - In the HTTP client configuration (`chain/rpc/src/client.rs`)
  - For `eth_getLogs` calls in the indexer (`chain/rpc/src/indexer.rs`)
  - For RPC operations through the `HttpPostRequestorConfig`
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Search for HTTP timeout configuration
rg -l "timeout.*=.*30|DEFAULT_TIMEOUT"
```

Length of output: 1731

Script:

```shell
#!/bin/bash
# Check ethers provider configuration for timeout settings
rg -A 5 -B 5 "timeout.*=.*30|DEFAULT_TIMEOUT" vendor/cargo/ethers-providers-2.0.14/src/rpc/transports/retry.rs
```

Length of output: 557

Script:

```shell
#!/bin/bash
# Look for HTTP client configuration in the chain crate
rg -A 5 -B 5 "timeout|eth_getLogs" chain/
```

Length of output: 22565

Script:

```shell
#!/bin/bash
# Search for configuration structs or settings
ast-grep --pattern 'struct $name {
  $$$
  timeout: $_,
  $$$
}'
```

Length of output: 205
chain/rpc/src/client.rs (1)
423-433: LGTM! Improved error handling with better context.
The changes enhance error handling by:
- Adding detailed error context to logs
- Including elapsed time information
- Providing better visibility into retry attempts
This aligns well with the PR objectives to improve reliability and reduce RPC call failures.
Also applies to: 458-458
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (11)
db/sql/src/logs.rs (5)
Line range hint 122-152: Potential memory consumption issue due to collecting all logs into a vector

The change from streaming logs to collecting them into a `Vec<SerializableLog>` can lead to high memory usage if the number of logs is large. This could cause the application to consume excessive memory or even crash.

Consider implementing pagination or retaining the streaming approach to handle large datasets efficiently.
186-197: Simplify error mapping and improve clarity

The error handling in this block can be simplified by using the `map_err` method directly, which improves readability.

Apply this diff to simplify error handling:

```diff
 .await
 .map(|res| {
     res.into_iter()
         .map(|b| U256::from_be_bytes(b.block_number).as_u64())
         .collect()
 })
-.map_err(|e| {
-    error!("Failed to get logs block numbers from db: {:?}", e);
-    DbError::from(DbSqlError::from(e))
-})
+.map_err(|e| {
+    error!("Failed to get logs block numbers from db: {:?}", e);
+    e.into()
+})
```
413-415: Use the `assert_eq!` macro for consistency in tests

In test functions, it is best practice to use the `assert_eq!` macro for comparing expected and actual values.

Apply this diff to use `assert_eq!`:

```diff
 let logs = db.get_logs(None, None).await.unwrap();
-assert!(logs.len() == 1);
+assert_eq!(logs.len(), 1);
 assert_eq!(logs[0], log);
```
Line range hint 567-576: Double-check the logic within the while loop in the test

The conditions within the while loop may lead to an infinite loop if `next_block` is not incremented correctly. Ensure that `next_block` is updated appropriately to prevent infinite loops in tests.

Apply this diff to adjust the increment of `next_block`:

```diff
 while next_block <= start_block + blocks {
     let ordered_logs = db.get_logs(Some(next_block), Some(block_fetch_interval)).await.unwrap();
     assert!(ordered_logs.len() > 0);
     ordered_logs.iter().reduce(|prev_log, curr_log| {
         // assertions
         curr_log
     });
-    next_block = next_block + block_fetch_interval;
+    next_block += block_fetch_interval + 1;
 }
```
770-772: Ensure all logs have checksums after updating

After calling `update_logs_checksums`, all logs should have a checksum. The test currently checks this, but consider adding an explicit assertion for clarity.

Apply this diff to add an explicit assertion:
db.get_logs(None, None).await.unwrap().into_iter().for_each(|log| { assert!(log.checksum.is_some()); + assert!(!log.checksum.unwrap().is_empty()); });
nix/shell.nix (1)
84-84
: Document the specific reason for stack size increase.While the comment indicates it's required for tests and compilation, it would be helpful to document the specific scenarios or test cases that require this increased stack size. This information would be valuable for future maintenance and optimization efforts.
Consider expanding the comment with more details:
- RUST_MIN_STACK = "16777216"; # 16MB required to run the tests and compilation + RUST_MIN_STACK = "16777216"; # 16MB required due to [specific test/scenario] in [specific module]db/api/src/logs.rs (1)
81-86: Document memory considerations for vector-based returns

Since we're now loading all results into memory at once instead of streaming them, it would be helpful to document any memory usage considerations or typical result set sizes.

Add a note to the documentation:
Add a note to the documentation:
/// # Returns /// - /// A `Result` containing a `Vec` of block numbers if the operation succeeds or an error if it fails. + /// A `Result` containing a `Vec` of block numbers if the operation succeeds or an error if it fails. + /// Note: This method loads all results into memory at once. For large result sets, consider using + /// block_number and block_offset parameters to paginate results. async fn get_logs_block_numbers<'a>(nix/rust-package.nix (1)
84-84: Document the stack size requirement

Please add documentation explaining:
- Which specific tests or compilation tasks require the 16MB stack size
- The rationale behind choosing this specific value
- Any known impacts on resource-constrained systems
This will help future maintainers understand the requirement and evaluate if adjustments are needed.
chain/rpc/src/rpc.rs (2)
Line range hint 34-82: Consider increasing the `tx_polling_interval` default value

The default `tx_polling_interval` of 7 seconds might be too aggressive and could lead to unnecessary RPC calls. Consider increasing it to match the increased HTTP timeout of 30 seconds mentioned in the PR objectives.
/// Interval for polling on TX submission /// - /// Defaults to 7 seconds. - #[default(Duration::from_secs(7))] + /// Defaults to 15 seconds. + #[default(Duration::from_secs(15))] pub tx_polling_interval: Duration,
Line range hint 419-424: Consider using a configurable timeout in `wait_until_tx`

The `wait_until_tx` function uses a fixed sleep duration. Consider using the `expected_block_time` from the configuration to calculate an appropriate timeout based on the finality requirements.
- sleep(timeout).await; + // Wait for finality * expected_block_time + buffer + let wait_time = timeout.max(cfg.finality as u32 * cfg.expected_block_time + Duration::from_secs(5)); + sleep(wait_time).await;chain/rpc/src/client.rs (1)
Line range hint 177-184: Consider adding more context to the retry queue size check

The debug log for the retry queue size could be more informative by including the maximum allowed size for better context.
- debug!(size = retry_queue_size, "checking retryable error retry queue size"); + debug!( + size = retry_queue_size, + max_size = self.max_retry_queue_size, + "checking retryable error retry queue size" + );
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
chain/indexer/Cargo.toml (1 hunks)
chain/indexer/src/block.rs (7 hunks)
chain/rpc/Cargo.toml (1 hunks)
chain/rpc/src/client.rs (6 hunks)
chain/rpc/src/indexer.rs (4 hunks)
chain/rpc/src/lib.rs (1 hunks)
chain/rpc/src/rpc.rs (1 hunks)
db/api/Cargo.toml (1 hunks)
db/api/src/logs.rs (2 hunks)
db/sql/Cargo.toml (1 hunks)
db/sql/src/logs.rs (12 hunks)
nix/rust-package.nix (1 hunks)
nix/shell.nix (1 hunks)
✅ Files skipped from review due to trivial changes (4)
- chain/indexer/Cargo.toml
- chain/rpc/Cargo.toml
- db/api/Cargo.toml
- db/sql/Cargo.toml
🔇 Additional comments (20)
chain/rpc/src/indexer.rs (3)
14-14
: Confirm Necessity of New Imports
Please verify that both Stream and StreamExt from the futures crate are required for the updated asynchronous stream processing. If they are essential for handling the stream operations introduced in this file, then this addition is appropriate.
151-151
: Initialization of current_block_log
The initialization of current_block_log with block_id: from_block and default values for other fields is appropriate for starting the log aggregation process.
154-155
: Efficient Retrieval of Logs
Using retrieved_logs.next().await and matching on Some(Ok(log)) effectively processes each log asynchronously. This approach is suitable for handling streamed log data.
db/sql/src/logs.rs (4)
491-493
: Clarify expectation in test for duplicate log insertion
The test expects an error due to duplicate log insertion. Ensure that the error is the expected one (e.g., a specific duplicate entry error) to make the test more robust.
Would you like to verify the exact error returned during duplicate insertion?
Line range hint 454-458
: Handle potential errors when retrieving logs
In this test, ensure that retrieving log_2_retrieved handles potential errors properly.
Run the following script to check for error handling in test cases:
711-714
: Ensure correct retrieval of block numbers
In the test test_get_logs_block_numbers, confirm that the function returns the correct block numbers, especially when dealing with different block offsets.
Run the following script to verify the block numbers retrieved:
✅ Verification successful
Test correctly verifies block number retrieval with offset
The test case is properly implemented and verifies the correct behavior:
- It sets up two logs with block numbers 1 and 2
- When querying with Some(1), Some(0), it correctly expects and asserts only block number 1 is returned
- The implementation in get_logs_block_numbers correctly handles the block range: min_block_number = 1 (from Some(1)), max_block_number = 2 (from min_block_number + offset + 1)
- The SQL query uses proper filters: >= min_block_number and < max_block_number
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the block numbers returned by get_logs_block_numbers.
# Expect: Block numbers [1]
grep -rn 'get_logs_block_numbers' src/ | grep 'Some(1), Some(0)'
Length of output: 102
Script:
#!/bin/bash
# Let's first locate the correct file path since we're in the repository root
fd "logs.rs$"
# Then search for the function definition and its usage
rg -n "get_logs_block_numbers" -A 5
Length of output: 2227
Script:
#!/bin/bash
# Let's examine the implementation and test context more thoroughly
rg -B 10 -A 20 "async fn get_logs_block_numbers" db/sql/src/logs.rs
Length of output: 1277
Script:
#!/bin/bash
# Let's examine the full test context to understand the test setup and assertions
rg -B 20 -A 20 "test_get_logs_block_numbers" db/sql/src/logs.rs
Length of output: 1542
Script:
#!/bin/bash
# Let's see the complete test case including all assertions
rg -B 50 -A 10 "assert_eq!\(block_numbers\[0\], 1\)" db/sql/src/logs.rs
Length of output: 2224
638-641
: Check for off-by-one error in block_offset handling
When calling db.get_logs(Some(1), Some(0)), verify that the function correctly interprets the block offset and retrieves the expected logs.
Run the following script to confirm the behavior:
✅ Verification successful
No off-by-one error in block_offset handling
The implementation correctly handles the block offset. Here's why:
- For get_logs(Some(1), Some(0)), the function calculates: min_block_number = 1 and max_block_number = Some(1 + 0 + 1) = Some(2)
- This creates a range [1, 2) which correctly retrieves only logs from block 1
- The test assertion confirms this behavior by expecting exactly one log entry
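The half-open range semantics verified above can be sketched in isolation. The helper below is a hypothetical stand-in mirroring the described behavior, not the actual get_logs implementation:

```rust
// Illustrative sketch of the block range derived from (block_number, block_offset).
// `block_range` is a hypothetical helper, not part of the real crate API.
fn block_range(block_number: Option<u64>, block_offset: Option<u64>) -> (u64, Option<u64>) {
    let min = block_number.unwrap_or(0);
    // Upper bound is exclusive: logs are taken from [min, min + offset + 1)
    let max = block_offset.map(|offset| min + offset + 1);
    (min, max)
}

fn main() {
    // get_logs(Some(1), Some(0)) -> range [1, 2): only block 1 is covered
    assert_eq!(block_range(Some(1), Some(0)), (1, Some(2)));
    // No offset -> open-ended upper bound
    assert_eq!(block_range(Some(5), None), (5, None));
    println!("ok");
}
```

This makes the "no off-by-one" conclusion easy to check by eye: the `+ 1` turns an inclusive offset into an exclusive upper bound.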
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that block_offset is handled correctly.
# Expect: The function retrieves logs for block_number 1 only.
grep -rn 'get_logs' src/ | grep 'Some(1), Some(0)'
Length of output: 88
Script:
#!/bin/bash
# Let's first locate the correct file path and then analyze the get_logs implementation
fd logs.rs
# After finding the file, let's examine the get_logs implementation
ast-grep --pattern 'fn get_logs($$$) {
$$$
}'
Length of output: 516
Script:
#!/bin/bash
# Let's examine the get_logs implementation in the specific file
rg -A 20 "fn get_logs" "db/sql/src/logs.rs"
# Also check the test context around the specific test case
rg -B 5 -A 5 "get_logs\(Some\(1\), Some\(0\)\)" "db/sql/src/logs.rs"
Length of output: 2707
chain/indexer/src/block.rs (5)
167-168
: Verify that logs exist after clearing index database during fast sync
After clearing the index database and resetting logs to unprocessed, get_logs_block_numbers may return an empty list, resulting in no blocks being processed in the loop. Please verify whether logs are expected to be present at this point, or if additional initialization is required for the fast sync process to function correctly.
200-203
: Initialize chain head before processing event stream
Initializing the chain head at startup ensures accurate synchronization progress calculation.
318-325
: Verify the source of block number mismatches in logs
While the code logs an error when a log's block number doesn't match the expected block_id, it's important to identify why such mismatches might occur. Please verify whether this situation can happen under normal circumstances, and if additional handling or data validation is required to prevent or manage this mismatch.
358-358
: Invoke print_indexer_state after updating log checksums
Calling print_indexer_state here provides updated indexer state information after processing each block.
Line range hint 467-493
: Refined synchronization progress calculation
The modifications enhance the synchronization progress reporting by updating the chain head when necessary and preventing redundant RPC calls. This ensures more accurate and efficient progress tracking.
nix/shell.nix (1)
84-84
: Verify if 16MB is the minimum required stack size.
While increasing the stack size might resolve immediate issues, it's important to verify if this is the minimum required increase. Large stack requirements could indicate potential optimization opportunities in the codebase.
Let's check for evidence of stack-related issues:
db/api/src/logs.rs (1)
53-58
: LGTM! Verify memory implications of vector-based returns.
The change from streaming to vector-based returns aligns with the PR objectives to prevent issues with dangling database connections. This should help address the WAL file size growth issue by ensuring transactions complete properly.
Let's verify the typical size of log result sets to ensure memory usage remains reasonable:
✅ Verification successful
Memory usage is safely bounded by block range parameters
The implementation shows that memory usage is naturally constrained by the block_number and block_offset parameters, which limit the range of blocks from which logs are retrieved. The query filters logs using:
- A minimum block number: block_number.unwrap_or(0)
- A maximum block number: min_block_number + block_offset + 1
This design allows callers to control result set sizes by specifying appropriate block ranges, making the vector-based return type safe to use.
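To illustrate how callers can bound memory under this design, here is a hedged sketch of a pagination loop over a vector-based API. The `get_logs` below is a simplified stand-in with the described range semantics, operating on an in-memory slice instead of the real database handle:

```rust
// Hypothetical stand-in for the vector-based get_logs, filtering block numbers
// into the half-open range [min, min + offset + 1).
fn get_logs(block_number: Option<u64>, block_offset: Option<u64>, all: &[u64]) -> Vec<u64> {
    let min = block_number.unwrap_or(0);
    let max = block_offset.map(|o| min + o + 1).unwrap_or(u64::MAX);
    all.iter().copied().filter(|b| *b >= min && *b < max).collect()
}

fn main() {
    let stored: Vec<u64> = (0..10).collect(); // logs at blocks 0..9
    let batch = 3u64;
    let mut seen = Vec::new();
    let mut from = 0u64;
    while from < 10 {
        // Each call materializes only [from, from + batch), keeping result sets small
        let chunk = get_logs(Some(from), Some(batch - 1), &stored);
        seen.extend(chunk);
        from += batch;
    }
    assert_eq!(seen, stored);
    println!("ok");
}
```

Each iteration holds at most one batch in memory, which is the mitigation the doc-comment suggestion above recommends for large result sets.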
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check typical result set sizes for get_logs queries
# Look for query implementations and any size limits
# Search for get_logs implementations
ast-grep --pattern 'async fn get_logs($$$) { $$$}'
# Look for any LIMIT clauses in SQL queries
rg -i 'select.*from.*logs.*limit'
# Search for any pagination or batch size constants
rg -i '(page_size|batch_size|limit)'
Length of output: 191
Script:
#!/bin/bash
# Let's try a different approach to find implementations and size limits
# Find SQL implementation files
fd "logs.rs" --type f
# Look for implementations with different patterns
rg -A 10 "impl.*Logs"
# Search for query implementations with potential LIMIT clauses
rg -A 10 "get_logs.*async"
# Look for any constants or configuration related to batch sizes
rg "(PAGE_SIZE|BATCH_SIZE|MAX_LOGS)"
Length of output: 71605
Script:
#!/bin/bash
# Let's search for the SQL implementation and any query size limits
# Focus on the specific implementation files we found
# Check the SQL implementation
rg -A 20 "impl.*Logs.*for" db/sql/src/logs.rs
# Look for any query size limits or pagination in the implementation
rg -A 10 "get_logs" db/sql/src/logs.rs
# Check for any configuration or constants that might limit result sizes
rg "(limit|offset|batch|page)" db/sql/src/logs.rs
Length of output: 6324
nix/rust-package.nix (1)
84-84
: Verify the minimum stack size requirement
The stack size has been set to 16MB which is a significant allocation. While the comment indicates this is required for tests and compilation, we should verify:
- Which specific tests or compilation tasks require this stack size
- Whether this is the minimum required size or if it can be optimized
✅ Verification successful
Stack size requirement of 16MB is justified
The codebase shows evidence that the 16MB stack size requirement is reasonable:
- The stack size is used for both tests and compilation, which involve:
  - Deep recursion in test files, particularly in vendor dependencies like toml_edit, syn, and ciborium that have recursion limits and stack overflow prevention mechanisms
  - Complex parsing operations in dependencies like protobuf, prost, and der-parser that require deep call stacks
  - Large data structure handling in test files across multiple vendor crates
- The value is explicitly documented in the Nix configuration and appears to be a carefully chosen value to prevent stack overflows while keeping memory usage reasonable.
- No evidence was found of this being excessive; similar stack sizes are used in other parts of the codebase for comparable operations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for deep recursion or stack-intensive operations
# that might require increased stack size
# Look for recursive function calls
echo "Searching for recursive functions..."
ast-grep --pattern 'fn $NAME($$$) {
$$$
$NAME($$$)
$$$
}'
# Look for deep async/await chains
echo "Searching for deep async/await chains..."
rg -A 5 '\.await'
# Look for test functions with complex operations
echo "Searching for complex test functions..."
ast-grep --pattern '#[test]
fn $NAME($$$) {
$$$
}'
Length of output: 287
Script:
#!/bin/bash
# Let's try a different approach to find stack-intensive operations
# Look for test files and their content
echo "Searching for test files..."
fd -e rs "test" -x rg -l "^#\[test\]" {} \;
# Look for large data structures and complex types
echo "Searching for large structs/enums..."
rg -B 2 "^(struct|enum) \w+ \{" --type rust
# Look for deep nested async blocks
echo "Searching for nested async blocks..."
rg "async .*\{" --type rust -A 2
# Look for recursive type definitions
echo "Searching for recursive types..."
rg "type \w+(\s*=\s*.*self|where.*self)" --type rust
# Search for stack overflow related issues in git history
echo "Searching for stack overflow mentions in commits..."
git log --grep="stack" --grep="overflow" --oneline
Length of output: 101147
Script:
#!/bin/bash
# Let's search for test files and test functions that might require large stack space
# Look for test files with complex operations or large data structures
echo "Searching for test files with potentially stack-intensive operations..."
rg -B 2 -A 2 "(#\[test\].*fn.*\{|struct.*\{|enum.*\{)" --type rust tests/
# Look for recursive function calls in test files
echo "Searching for recursive calls in test files..."
rg -l "fn.*\{" --type rust tests/ | xargs rg -A 5 "self\.\w+\(|rec\w*\("
# Look for test files that might deal with large data structures
echo "Searching for test files with large data structures..."
rg -B 2 -A 2 "Vec<|HashMap<|BTreeMap<" --type rust tests/
# Look for any explicit stack size configurations
echo "Searching for other stack size configurations..."
rg -i "stack.*size|RUST_MIN_STACK" --type nix --type rust
Length of output: 102886
Script:
#!/bin/bash
# Let's look for evidence of stack-intensive operations in the codebase
# Search for test files that might require large stack space
echo "Searching for test files with deep recursion or large allocations..."
rg -A 5 "(fn.*test.*\{|#\[test\])" tests/ | rg -B 2 -A 5 "(Vec::with_capacity|recursion|stack|alloc)"
# Look for any stack overflow or stack size related issues in git history
echo "Searching for stack-related issues in git history..."
git log --oneline -S"stack" --pickaxe-regex -p
# Look for any stack size configurations in build files
echo "Searching for stack size configurations in build files..."
rg -i "(RUST_MIN_STACK|stack.*size)" --type make --type nix --type toml
# Look for any stack overflow prevention mechanisms
echo "Searching for stack overflow prevention code..."
rg "(recursion.*limit|stack.*overflow|STACK_SIZE)" --type rust
Length of output: 33453
chain/rpc/src/lib.rs (1)
251-252
: LGTM! Increasing the timeout should help with failing RPC calls.
The increase from 5s to 30s aligns with the PR objective to address failing eth_getLogs calls. This change should help reduce transaction failures and improve the stability of the indexing process.
Let's verify the timeout usage across the codebase:
Monitoring Recommendation:
Consider monitoring:
- RPC request completion times
- Number of failed requests
- Memory usage patterns
- Transaction confirmation rates
✅ Verification successful
LGTM! The timeout change is properly propagated and consistent with other timeouts in the codebase.
The search results confirm that:
- The 30-second timeout is properly propagated to HTTP clients (both Surf and Reqwest implementations)
- The value aligns with other timeout configurations in the codebase:
  - Indexer operations also use 30-second timeouts
  - The retry policy's max backoff is set to 30 seconds
- The change is properly documented in the config struct
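To make the interaction between the 30-second values concrete, here is a hedged sketch of a capped exponential backoff. The constants and function name are illustrative; they are not the crate's actual retry policy implementation:

```rust
use std::time::Duration;

// Illustrative cap matching the 30-second max backoff mentioned above.
const MAX_BACKOFF: Duration = Duration::from_secs(30);

// Hypothetical helper: doubles the delay per attempt, saturating at the cap.
fn backoff_for_attempt(attempt: u32) -> Duration {
    // 1s, 2s, 4s, ... with the shift clamped to avoid overflow
    let secs = 1u64 << attempt.min(10);
    Duration::from_secs(secs).min(MAX_BACKOFF)
}

fn main() {
    assert_eq!(backoff_for_attempt(0), Duration::from_secs(1));
    assert_eq!(backoff_for_attempt(4), Duration::from_secs(16));
    // From attempt 5 onward the 30s cap applies (2^5 = 32 > 30)
    assert_eq!(backoff_for_attempt(5), MAX_BACKOFF);
    println!("ok");
}
```

With the HTTP timeout also at 30 seconds, a single retried request can never back off longer than one request is allowed to run, which keeps the two settings consistent.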
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for other timeout configurations that might need adjustment
# and verify that the new timeout is properly propagated to RPC clients
# Search for other timeout configurations
rg -i "timeout.*sec|duration.*sec"
# Search for HttpPostRequestorConfig usage to ensure proper propagation
ast-grep --pattern 'HttpPostRequestorConfig { $$ }'
Length of output: 103039
Script:
#!/bin/bash
# Let's search for HttpPostRequestorConfig usage and timeout configurations in the codebase
# to verify the impact of the timeout change
# Search for HttpPostRequestorConfig usage
rg -A 5 "HttpPostRequestorConfig"
# Search for timeout-related configurations in the RPC module
rg -A 5 "timeout|duration" chain/rpc/
Length of output: 11275
chain/rpc/src/rpc.rs (2)
Line range hint 142-153
: LGTM: Efficient block number retrieval with finality handling
The implementation correctly handles finality by subtracting the finality value from the latest block number, ensuring consistent chain state.
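The finality adjustment described here can be sketched minimally. The function below is an illustrative stand-in, not the actual RpcOperations API:

```rust
// Hypothetical sketch: the "safe" block number is the latest head minus the
// configured finality depth.
fn finalized_block_number(latest: u64, finality: u64) -> u64 {
    // saturating_sub guards the early-chain case where latest < finality
    latest.saturating_sub(finality)
}

fn main() {
    assert_eq!(finalized_block_number(1000, 8), 992);
    // Near genesis, clamp to block 0 instead of underflowing
    assert_eq!(finalized_block_number(5, 8), 0);
    println!("ok");
}
```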
Line range hint 297-375
: LGTM: Efficient use of multicall for status checks
The check_node_safe_module_status method efficiently batches multiple RPC calls into a single request using multicall, which aligns well with the PR's objective of reducing RPC calls.
chain/rpc/src/client.rs (2)
25-25
: LGTM! Improved logging levels and context.
The changes appropriately elevate log levels for errors and retries, making it easier to track and debug RPC request failures. The additional context in error messages will help with troubleshooting.
Also applies to: 423-433, 458-458
Line range hint 752-1052
: LGTM! Comprehensive test coverage.
The test suite thoroughly covers:
- Success and failure scenarios
- Retry mechanisms with different policies
- Edge cases for error handling
- Snapshot functionality for testing
This ensures the reliability of the RPC client implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So after all it looks like the sea_orm Streaming interface was more suitable for Postgres than Sqlite.
Ideally, the API would stay streaming, as it was before... and the sqlite feature-gated implementation would have its own specialized implementation, where everything would be held as a vector loaded by a single query.
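The feature-gating idea in this comment can be sketched roughly as follows: keep a stream-like interface at the trait boundary, while the sqlite-backed implementation satisfies it from rows already materialized by a single query. All names here are illustrative (the real code would use sea_orm and futures::Stream rather than std iterators):

```rust
// Hypothetical log record and source trait, standing in for the real DB API.
struct Log {
    block_number: u64,
}

trait LogSource {
    // Stream-like boundary: callers consume items one at a time.
    fn logs(&self) -> Box<dyn Iterator<Item = Log> + '_>;
}

// "sqlite" flavour: results pre-loaded into a Vec by one query, then yielded
// lazily, so the connection is released before iteration begins.
struct VecBacked {
    rows: Vec<Log>,
}

impl LogSource for VecBacked {
    fn logs(&self) -> Box<dyn Iterator<Item = Log> + '_> {
        Box::new(self.rows.iter().map(|l| Log { block_number: l.block_number }))
    }
}

fn main() {
    let src = VecBacked {
        rows: vec![Log { block_number: 1 }, Log { block_number: 2 }],
    };
    let blocks: Vec<u64> = src.logs().map(|l| l.block_number).collect();
    assert_eq!(blocks, vec![1, 2]);
    println!("ok");
}
```

The point of the shape is that callers never learn whether the backend streamed or pre-loaded, so a Postgres implementation could keep true streaming behind the same trait.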
This is one related to the changes here.
Co-authored-by: Tibor <9529609+Teebor-Choka@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
chain/indexer/src/block.rs (2)
200-204
: Enhance comment clarity for chain head update
The comment could be more descriptive about why this initial update is necessary for sync progress calculation.
- // update the chain head once at startup to get a reference for initial synching
- // progress calculation
+ // Update the chain head once at startup to establish a baseline reference point
+ // for calculating initial synchronization progress. This ensures accurate progress
+ // reporting from the beginning of the indexing process.
392-413
: Optimize logs retrieval for state printing
Since we only need the last log for checksum and count, we could optimize the query to avoid retrieving all logs.
Consider adding a method to get just the last log and count for the block:
async fn get_last_log_and_count(block_number: u64) -> Result<(Option<SerializableLog>, usize)>
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (1)
chain/indexer/src/block.rs
(8 hunks)
🔇 Additional comments (5)
chain/indexer/src/block.rs (5)
167-168
: LGTM! Improved readability with for loop
The change from while let to for loop simplifies the code while maintaining the same functionality.
313-326
: LGTM! Improved log processing with for loop
The change to use a for loop improves readability.
416-428
: LGTM with existing panic handling concerns
The implementation is clean but note the existing comments about panic handling.
835-836
: LGTM! Test updated to match implementation
The test correctly accounts for the additional chain head check during sync progress calculation.
Line range hint 468-494
: Verify edge case handling in sync progress calculation
The progress calculation looks good and aligns with the goal of reducing RPC calls. However, verify the edge case where the chain head moves significantly between the two head checks.
✅ Verification successful
Edge case in sync progress calculation is properly handled
The code handles the edge case of rapid head movement appropriately through:
- Double-checking the chain head before marking as synced
- Using saturating_sub to prevent underflow if head moves backwards
- Recalculating block difference after the second head check
- Properly handling progress calculation in both scenarios (zero and non-zero difference)
The implementation is safe and correctly manages the sync state transition, with appropriate error handling and atomic operations for thread safety.
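The saturating_sub protection described above can be demonstrated in a few lines. This is an illustrative reduction of the progress calculation, not the indexer's actual code:

```rust
// Hypothetical helper: how many blocks remain between the current position
// and the observed chain head.
fn blocks_remaining(head: u64, current: u64) -> u64 {
    // If `current` has already passed `head` (e.g. the head was re-checked
    // against a lagging RPC node), saturating_sub yields 0 instead of
    // underflowing to u64::MAX.
    head.saturating_sub(current)
}

fn main() {
    assert_eq!(blocks_remaining(100, 90), 10);
    // Head "moved backwards" relative to our position: treat as fully synced
    assert_eq!(blocks_remaining(90, 100), 0);
    println!("ok");
}
```

A plain `head - current` here would panic in debug builds and wrap in release builds, which is exactly the edge case the double head check can trigger.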
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for tests covering rapid head movement during sync
rg -A 5 "test.*sync.*progress"
ast-grep --pattern 'fn $_(){ $$$head$$$progress$$$}'
Length of output: 226
Script:
#!/bin/bash
# Let's search for test files related to indexer and block processing
fd "test.*\.rs" chain/indexer/src/
# Then look for sync progress related code in indexer tests
rg -l "sync.*progress|progress.*sync" chain/indexer/src/
# Search for mock implementations that might simulate head movement
ast-grep --pattern 'mock!($$$)'
Length of output: 3054
Script:
#!/bin/bash
# Let's examine the block processing logic more closely
rg -A 10 "update_chain_head" chain/indexer/src/block.rs
# Look for error handling around head updates
rg -A 5 "Error.*head|head.*Error" chain/indexer/src/block.rs
# Check how the sync state is managed
rg -A 5 "is_synced" chain/indexer/src/block.rs
Length of output: 3372
These changes make the entire indexing, both fast-sync and rpc, more robust and faster.
Addresses failing eth_getLogs calls that were leading nodes into situations where many HTTP requests would be failing.
Fixes #6652
Refs #6664
Refs #6665