fix: now farmer piece cache sync percentage base on total cache size #3654
Conversation
However, I noticed during testing that the byte sizes sometimes differ (e.g., 9.9 GiB or 9.8 GiB). I haven't looked into this issue closely, but perhaps it's caused by precision loss? From my perspective, I think it might be more intuitive to directly use the number of pieces here.
> However, I noticed during testing that the byte sizes sometimes differ (e.g., 9.9 GiB or 9.8 GiB). I haven't looked into this issue closely, but perhaps it's caused by precision loss?
A slight variation in cache size seems like a low priority, but it's worth opening a ticket in case it's caused by a more serious bug.
Can you see any places where that precision loss could be happening? All the calculations we're doing are in integers, and the piece cache size calculations are deterministic.
The `byte_size` formatting uses `f64` (https://docs.rs/bytesize/1.3.3/src/bytesize/lib.rs.html#193), but it is simple code that should be deterministic, as long as the total piece cache capacity is the same. (Which is something we should confirm.)
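To illustrate, here is a minimal sketch (not the actual `bytesize` crate source) of formatting a byte count in GiB via `f64` division. For a fixed integer input the output is fully deterministic, so seeing 9.9 GiB one run and 9.8 GiB the next would imply the underlying byte counts themselves differed:

```rust
const GIB: u64 = 1024 * 1024 * 1024;

// Format a byte count with one decimal place in GiB, roughly mirroring
// the f64 division that bytesize does internally.
fn format_gib(bytes: u64) -> String {
    format!("{:.1} GiB", bytes as f64 / GIB as f64)
}

fn main() {
    let capacity = 10_600_000_000u64; // illustrative total cache size in bytes
    // Same integer input, same string, every time:
    assert_eq!(format_gib(capacity), format_gib(capacity));
    println!("{}", format_gib(capacity)); // prints "9.9 GiB"
}
```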
Let's treat this as a separate bug? We can add debug logs for it in this PR.
If the change was happening in the number of pieces downloaded, or to download, it could be a race condition. But there's no concurrency on the total cache capacity.
Let's open another ticket and post full debug/trace logs there?
> From my perspective, I think it might be more intuitive to directly use the number of pieces here.
Most users don't know how big a piece is, but logging gigabytes will give them a rough idea of how many pieces still need to be downloaded.
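A rough sketch of that conversion for log output, where `PIECE_SIZE` is an illustrative placeholder and not the protocol's actual piece size:

```rust
// Convert a remaining-piece count into an approximate byte figure for
// user-facing logs. PIECE_SIZE is an assumed ~1 MiB per piece, purely
// for illustration.
const PIECE_SIZE: u64 = 1024 * 1024;

fn remaining_gib(pieces: u64) -> f64 {
    (pieces * PIECE_SIZE) as f64 / (1024u64.pow(3)) as f64
}

fn main() {
    // 5120 one-MiB pieces correspond to exactly 5 GiB under this assumption.
    println!("{:.1} GiB left to download", remaining_gib(5120));
}
```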
```diff
     count = %piece_indices_to_store.len(),
     "Identified piece indices that should be cached",
 );

-let pieces_to_download_total = piece_indices_to_store.len();
+let pieces_to_download_total = piece_indices_to_store.len() + stored_count;
```
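The effect of this change can be sketched as follows: counting already-stored pieces in both the numerator and the denominator keeps the percentage monotonic across restarts (names here are illustrative, not the farmer's actual variables):

```rust
// Progress as a percentage, counting pieces that were already stored
// before this run in both the "done" and "total" figures.
fn progress_percent(downloaded: usize, to_store: usize, stored_count: usize) -> f64 {
    let total = to_store + stored_count; // pieces_to_download_total after the fix
    let done = downloaded + stored_count;
    if total == 0 {
        100.0
    } else {
        100.0 * done as f64 / total as f64
    }
}

fn main() {
    // First run: 600 of 1000 pieces downloaded, nothing stored beforehand.
    assert_eq!(progress_percent(600, 1000, 0), 60.0);
    // After a restart, 600 pieces are already on disk and 400 remain;
    // progress resumes at 60% instead of resetting to 0%.
    assert_eq!(progress_percent(0, 400, 600), 60.0);
}
```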
Try replacing this with `caches.total_capacity()`, because that's a simpler calculation.
And debug log the total capacity above, to help diagnose the variable maximum capacity bug.
Using `caches.total_capacity()` here might not be suitable. For the vast majority of farmers, the `total_capacity` will likely never be filled. I think it's fine to use for display purposes, but using it to calculate a percentage for synchronization progress could be misleading.
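The concern can be sketched with made-up numbers: if fewer pieces exist on the network than the cache can hold, a percentage computed against total capacity never reaches 100%, even once sync is actually complete:

```rust
// Illustrative only: progress against two different denominators.
fn percent_of(done: u64, denom: u64) -> f64 {
    100.0 * done as f64 / denom as f64
}

fn main() {
    let total_capacity = 100_000; // pieces the cache could hold
    let pieces_available = 60_000; // pieces that actually exist to cache
    let downloaded = 60_000; // sync is complete
    assert_eq!(percent_of(downloaded, pieces_available), 100.0);
    // Against capacity, a fully synced farmer looks stuck at 60%:
    assert_eq!(percent_of(downloaded, total_capacity), 60.0);
}
```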
> For the vast majority of farmers, the `total_capacity` will likely never be filled.

Well, that's one source of variability: if the attached node is synced to a different segment index, then the total available number of pieces will change, so the target number of pieces might also change.
Please also add a debug log of the total cache capacity.
> Please also add a debug log of the total cache capacity.

Currently, `total_capacity` is `PieceCache::max_num_elements`, which is the max number of pieces in this cache. So, should I display this number directly? Or should I display the `bytesize` similar to your logic below?
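If such a log were added, one way to cover both options is to render the piece-count capacity alongside an approximate byte size. This is only a sketch: `PIECE_SIZE` is an assumed placeholder, not the real piece size, and the function name is hypothetical:

```rust
// Render a capacity held as a piece count (like max_num_elements) as
// both a count and an approximate byte size for a debug log.
// PIECE_SIZE is an illustrative ~1 MiB per piece, not the real value.
const PIECE_SIZE: u64 = 1024 * 1024;

fn capacity_display(max_num_elements: u64) -> String {
    let bytes = max_num_elements * PIECE_SIZE;
    format!(
        "{} pieces (~{:.1} GiB)",
        max_num_elements,
        bytes as f64 / (1024u64.pow(3)) as f64
    )
}

fn main() {
    println!("{}", capacity_display(10_240)); // "10240 pieces (~10.0 GiB)"
}
```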
This PR is about to merge, and it's not that important, so let's skip it for now.
But if we add it in future, matching the other numbers in the debug log is more useful. So if it went in the "Identified piece indices that should be cached" debug log, it would be a piece count.
Thank you for your review.
I'll try to reproduce the `byte_size` issue separately with a unit test. If it warrants further discussion, I'll open a separate issue for it. I agree that this is a low-priority issue.
Oh, I've found the reason. It's caused by another project on my server. Please ignore this issue.
LGTM
Force-pushed from 38a715a to 83f36b9
Thanks!
Thank you again for this change, running a farmer feels much better with logs like:
Previously that would have shown as:
Which doesn't give much sense of true farmer progress.
Closes #3614
Thanks to the previous refactoring, it's now easy to retrieve `stored_pieces`.
Added a helper function to retrieve `stored_pieces`.
As you can see from the log changes, a restart no longer resets the percentage calculation but instead continues from the previous count; additionally, `stored_count` is now printed in the debug log.
(For testing purposes, I set `INTERMEDIATE_CACHE_UPDATE_INTERVAL` to 10 for convenience; under normal circumstances, the printing interval is much longer.)
Code contributor checklist: