Skip to content

Conversation

tdb3
Copy link
Contributor

@tdb3 tdb3 commented Aug 25, 2024

Currently, after starting a scan with scanblocks the user must wait until the scan is complete to see the associated/relevant block hashes.
This updates the scanblocks status result object to provide the relevant_blocks found so far.
This enables earlier subsequent lookup within matching blocks (e.g. to run getdescriptoractivity in PR #30708)

Below is an example tested on signet by starting and obtaining status of a scan for an address that has received many coinbase outputs (so there are matches over many blocks, showing the increase in relevant_blocks results in scanblocks status over the life of the scan). Also tested on mainnet (see #30713 (review)).

build/src/bitcoin-cli -signet scanblocks start '["addr(tb1qsygnet2jdqm8n2p7wmmklp9yel3k7agpnycv4f)"]'
build/src/bitcoin-cli -signet scanblocks status
{
  "progress": 14,
  "current_height": 30002,
  "relevant_blocks": [
    "0000012aee246273622a8fb3546454a1d6c11cb468dfb1cd536a3058b0590cba",
    "0000002e043f797a967fca25c18fd2b0f66ddce76b443e540327d2e19928d0aa",
...
    "000000f7626bb9f7917f48d0c63a9c30a4692232e5cd96366192a62c7d1a89a4",
    "000000db94d6817657bd9184ad5f6232981cfe60177b7395485260a339b32571"
  ]
}

or (to show snapshots over time)

for i in `seq 1 20`; do sleep 1; src/bitcoin-cli -signet scanblocks status > test$i.txt; done

-deprecatedrpc seems like overkill for the addition of a key (not a data type change). Release notes have been added (affecting user interface).

If testing, be sure to enable the block filter index feature (e.g. adding blockfilterindex=1 to bitcoin.conf), and allow time for the index to build:

2024-09-11T12:17:49Z Config file arg: signet="1"
2024-09-11T12:17:49Z Config file arg: [signet] blockfilterindex="1"
2024-09-11T12:17:51Z Syncing basic block filter index with block chain from height 0
...
2024-09-11T12:19:41Z basic block filter index is enabled at height 212758
2024-09-11T12:19:41Z basic block filter index thread exit

@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 25, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30713.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK ryanofsky
Concept ACK jamesob
Stale ACK pablomartin4btc

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

@tdb3 tdb3 changed the title rpc: add revelant_blocks to scanblocks status rpc: add revelant_blocks to scanblocks status Aug 25, 2024
@tdb3 tdb3 force-pushed the relevant_blocks_in_scanblocks_status branch from 5891441 to 6bac76f Compare August 25, 2024 13:38
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Debug: https://github.com/bitcoin/bitcoin/runs/29219972894

Hints

Make sure to run all tests locally, according to the documentation.

The failure may happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@jamesob
Copy link
Contributor

jamesob commented Aug 25, 2024

Concept ACK. This'd be useful, especially in conjunction with #30708.

@@ -2550,7 +2563,8 @@ static RPCHelpMan scanblocks()

ret.pushKV("from_height", start_block_height);
ret.pushKV("to_height", start_index->nHeight); // start_index is always the last scanned block here
ret.pushKV("relevant_blocks", std::move(blocks));
LOCK(cs_relevant_blocks);
ret.pushKV("relevant_blocks", g_relevant_blocks);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since status stops working when the scan completes, it seems like we should clear g_relevant_blocks here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you. Will update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That'd introduce a race, where the relevant blocks are cleared before the reserver is destructed (and thus the status call may return an empty list where previously it didn't for the same call)

Not sure if it matters, or worthy to address.

Copy link
Contributor Author

@tdb3 tdb3 Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While leaving scanblocks() without resetting g_relevant_blocks isn't ideal, resetting g_relevant_blocks just before BlockFiltersScanReserver in the start branch seemed to be a good way to prevent status from accidentally seeing an empty g_relevant_blocks (at least in common cases). Please let me know if I'm overlooking a likely concurrency case, and it can be adjusted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems trivial to fix properly? luke-jr@5698131

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

included, thanks

Copy link
Member

@pablomartin4btc pablomartin4btc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK (coming from #30708).

@tdb3 tdb3 force-pushed the relevant_blocks_in_scanblocks_status branch from 6bac76f to bdda0c1 Compare September 9, 2024 16:41
@tdb3 tdb3 marked this pull request as ready for review September 9, 2024 21:47
Copy link
Member

@pablomartin4btc pablomartin4btc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tACK 39ee30c

Release notes has been added and g_relevant_blocks reset has been placed into the start conditional section since my last review.

Tested on mainnet.
/build/src/bitcoin-cli scanblocks start     '["addr(bc1p0cp0vyag6snlta2l7c4am3rue7eef9f72l7uhx52m4v27vfydx9s8tfs7t)"]' 850263
{
  "from_height": 850263,
  "to_height": 860861,
  "relevant_blocks": [
    "00000000000000000002b3c2a395905df9d12ce30d2237f064cdafacb81eb33d",
    "00000000000000000002bc5cc78f5b0913a5230a8f4b0d5060bc9a60900a5a88",
    "00000000000000000001c5291ed6a40c06d3db5c8fb738567654b24a14b24ecb"
  ],
  "completed": true
}

./build/src/bitcoin-cli scanblocks status
{
  "progress": 0,
  "current_height": 850263,
  "relevant_blocks": [
    "00000000000000000002b3c2a395905df9d12ce30d2237f064cdafacb81eb33d"
  ]
}

For those not used to scanblocks I'd add a little note somewhere in the description that in order to testing this change the block filter index feature should be enabled, by either passing -blockfilterindex as an argument on the node start up or putting it in the bitcoin.conf file (same for #30708), even it's specified in the help or an error would lead to that, it'd make it easier and straightforward. Also that if the feature was not enabled before it could take up to a few hours to create the indexes.

Out of the scope of this PR, having played around start, status, abort subcommands, I'd add some more context in the results of the first 2. Perhaps as "good first issue".
  • for status subcommand, instead of returning no results when there are no scans in progress:

    ./build/src/bitcoin-cli -datadir=${AU_DATADIR} scanblocks status
    {
      "status": "No scan in progress"
    }
    
  • for start subcommand, when the abort has been triggered and was no completion:

    /build/src/bitcoin-cli -datadir=${AU_DATADIR} scanblocks start  '["addr(bc1p0cp0vyag6snlta2l7c4am3rue7eef9f72l7uhx52m4v27vfydx9s8tfs7t)"]' 750000
    {
      "from_height": 750000,
      "to_height": 780002,
      "relevant_blocks": [
     ],
      "completed": false,
      "status": "Abort call performed"
    }
    

@DrahtBot DrahtBot requested a review from jamesob September 11, 2024 11:23
@tdb3
Copy link
Contributor Author

tdb3 commented Sep 11, 2024

Thanks for reviewing.

Out of the scope of this PR, having played around start, status, abort subcommands, I'd add some more context in the results of the first 2. Perhaps as "good first issue".

  • for status subcommand, instead of returning no results when there are no scans in progress:
    ./build/src/bitcoin-cli -datadir=${AU_DATADIR} scanblocks status
    {
      "status": "No scan in progress"
    }
    
  • for start subcommand, when the abort has been triggered and was no completion:
    /build/src/bitcoin-cli -datadir=${AU_DATADIR} scanblocks start  '["addr(bc1p0cp0vyag6snlta2l7c4am3rue7eef9f72l7uhx52m4v27vfydx9s8tfs7t)"]' 750000
    {
      "from_height": 750000,
      "to_height": 780002,
      "relevant_blocks": [
     ],
      "completed": false,
      "status": "Abort call performed"
    }
    

These are good ideas for follow-up PRs.

For those not used to scanblocks I'd add a little note somewhere in the description that in order to testing this change the block filter index feature should be enabled

Good idea. Description updated.

@pablomartin4btc
Copy link
Member

Good idea. Description updated.

Thanks! It looks good.

Forgot to mention that it would be nice to add some test checking relevant_blocks is returned by scanblocks status in test/functional/rpc_scanblocks.py if possible.

@tdb3
Copy link
Contributor Author

tdb3 commented Sep 11, 2024

Forgot to mention that it would be nice to add some test checking relevant_blocks is returned by scanblocks status in test/functional/rpc_scanblocks.py if possible.

It would be. I took a look at rpc_sccanblocks when building the PR.
The existing test for status is pretty minimal, but that's probably because it might be difficult to get the scan to run long enough to get status.

# test accessing the status (must be empty)
assert_equal(node.scanblocks("status"), None)

Any ideas on how to slow down scanblocks() enough to allow a status call to consistently return status of a scan in progress? When testing locally, I had inserted sleep_for() statements into the start path of scanblocks() to simulate a long scan. The only natural way I've thought of doing this is to build a purposefully complex/bloated chain, but even then, it wouldn't necessarily be consistent on different hardware (faster machines might have the test fail, and slower machines might make the test run unnecessarily long), and it's pretty bad practice.

@pablomartin4btc
Copy link
Member

pablomartin4btc commented Sep 12, 2024

The existing test for status is pretty minimal,

Exactly.

Any ideas on how to slow down scanblocks() enough to allow a status call to consistently return status of a scan in progress?

I've been playing a bit around the invalidateblock as I had present in my mind the recent issue from the assume-utxo testing but that wouldn't make the trick either. I went thru the original PRs of scanblocks (#23549 and #20664) but it seems there were no questions around this 'status' call test either. Perhaps @furszy, that worked on a speedup of the rpc command, has an idea if it's possible (and if worth it).

Copy link
Contributor

@naiyoma naiyoma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested changes on mainnet

Started scanning
./build/src/bitcoin-cli scanblocks start '["addr(bc1qgqrn8gratlc59gn5v02nrz0rg8anjjaxt3efh8)"]'

status check
./build/src/bitcoin-cli scanblocks status

{
  "progress": 98,
  "current_height": 570056,
  "relevant_blocks": [
    "0000000000005ecf9f819ea03abd8f2a3f0b2cad5e5124bdabd0d7a770fd9067",
    "000000000000000479f286761e91b2c983ad63e19b0d53cffc67e731a7558d85"
  ]
}

scan completed

./build/src/bitcoin-cli scanblocks start '["addr(bc1qgqrn8gratlc59gn5v02nrz0rg8anjjaxt3efh8)"]' 
{
  "from_height": 0,
  "to_height": 577394,
  "relevant_blocks": [
    "0000000000005ecf9f819ea03abd8f2a3f0b2cad5e5124bdabd0d7a770fd9067",
    "000000000000000479f286761e91b2c983ad63e19b0d53cffc67e731a7558d85"
  ],
  "completed": true
}

`

@furszy
Copy link
Member

furszy commented Oct 15, 2024

Perhaps @furszy, that #26780 on a #23549 (review) of the rpc command, has an idea if it's possible (and if worth it).

Not sure if it worth it but could craft an unit test that calls to the RPC command. Subclassing and setting the block manager class so the file opening method "stops" at a certain point. It would mimic the delay.

luke-jr pushed a commit to luke-jr/bitcoin that referenced this pull request Nov 15, 2024
Provides relevant block hashes to status output.
Enables a user to start looking through matching
blocks without having to wait on scan completion.

Github-Pull: bitcoin#30713
Rebased-From: bdda0c1
luke-jr pushed a commit to luke-jr/bitcoin that referenced this pull request Nov 15, 2024
@luke-jr
Copy link
Member

luke-jr commented Nov 15, 2024

Any reason not to scope the new variables? luke-jr@851d354

@tdb3
Copy link
Contributor Author

tdb3 commented Nov 21, 2024

Thanks. Planning to touch this up soon.

tdb3 and others added 2 commits November 28, 2024 20:46
Provides relevant block hashes to status output.
Enables a user to start looking through matching
blocks without having to wait on scan completion.

Co-Authored-By: Luke Dashjr <luke-jr+git@utopios.org>
@tdb3 tdb3 force-pushed the relevant_blocks_in_scanblocks_status branch from 39ee30c to 5b2d021 Compare November 30, 2024 17:57
@tdb3
Copy link
Contributor Author

tdb3 commented Nov 30, 2024

Any reason not to scope the new variables? luke-jr@851d354

moved, thanks

@tdb3
Copy link
Contributor Author

tdb3 commented Nov 30, 2024

Updated with suggested changes

@jamesob
Copy link
Contributor

jamesob commented Dec 6, 2024

Want to reemphasize that I think this is a good change worth doing. I have some questions about using static vars rather than housing running state in a BlockFiltersScanReserver instance, but that's a pretty small implementation detail. Will review/suggest changes more formally soon.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 5b2d021. Seems like a nice feature, and the implementation is pretty simple. I think there is a minor regression if scanblocks start is called from multiple threads, and suggested a fix below (#30713 (comment)).

re: #30713 (comment)

Any ideas on how to slow down scanblocks() enough to allow a status call to consistently return status of a scan in progress?

Will suggest a very general solution that would probably be overkill here, but I think we could have a setmockcondition RPC that acts like the existing setmocktime RPC and lets tests toggle global state that could be useful for testing. In this case I imagine a test could call something like setmockcondition({"scanblocks_pause_at_height": 50}) to make the scanning pause at a certain height and setmockcondition({"scanblocks_pause_at_height": null}) to make it resume, and we could have some helpful functions that make this not too intrusive to implement.

Comment on lines +2455 to +2456
static GlobalMutex cs_relevant_blocks;
static UniValue relevant_blocks GUARDED_BY(cs_relevant_blocks);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "rpc: add relevant_blocks to scanblocks status" (4c9dc4b)

It would be good to declare these next to g_scanfilter_ variables so all global state is declared in one place, and so other functions besides scanblocks can see this information (other RPCs and maybe tests should be able to access it). Would suggest renaming relevant_blocks to g_scanfilter_relevant_blocks and cs_relevant_blocks to g_scanfilter_relevant_blocks_mutex. (The "critical section" naming is also a little archaic.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will include in next update.

ret.pushKV("completed", completed);
reserver.release(); // ensure this is before cs_relevant_blocks is released, so status doesn't try to use moved relevant_blocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "rpc: add relevant_blocks to scanblocks status" (4c9dc4b)

Think this is missing a word, suggest "ensure this is called before"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will include

@@ -2461,6 +2477,11 @@ static RPCHelpMan scanblocks()
g_scanfilter_should_abort_scan = true;
return true;
} else if (request.params[0].get_str() == "start") {
{
LOCK(cs_relevant_blocks);
relevant_blocks = UniValue(UniValue::VARR);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "rpc: add relevant_blocks to scanblocks status" (4c9dc4b)

Clearing relevant_blocks before checking g_scanfilter_in_progress seems not very nice, because it means if scanblocks start is called while another scan is in progress, instead of just returning an "already in progress" error, it will do that but also partially erase the state of the in-progress scan.

This behavior is not just a limitation of the new feature, but also a regression, because it messes up the "relevant_blocks" return value of the other RPC call when previously it would be accurate.

I think it should be straightforward to fix this by moving this new code below the reserve() call below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Verified locally that truncation occurs. Fixed for next update.

UniValue ret(UniValue::VOBJ);
if (request.params[0].get_str() == "status") {
BlockFiltersScanReserver reserver;
LOCK(cs_relevant_blocks);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "rpc: add relevant_blocks to scanblocks status" (4c9dc4b)

Lock order here seems a little subtle. Would be good to have a comment saying it is important to lock this before calling reserve so if a scan is currently happening but is about to finish, an accurate "relevant_blocks" value can be returned before the scanning thread clears it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Comment will be added.

@tdb3 tdb3 changed the title rpc: add revelant_blocks to scanblocks status rpc: add relevant_blocks to scanblocks status Jan 7, 2025
@tdb3
Copy link
Contributor Author

tdb3 commented Jan 11, 2025

I think we could have a setmockcondition RPC that acts like the existing setmocktime RPC and lets tests toggle global state that could be useful for testing.

Great idea. Playing around with it locally. Perhaps it could eventually absorb setmocktime as well.

@tdb3 tdb3 marked this pull request as draft February 1, 2025 01:05
@tdb3
Copy link
Contributor Author

tdb3 commented Feb 1, 2025

Will revisit later

luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Feb 22, 2025
Provides relevant block hashes to status output.
Enables a user to start looking through matching
blocks without having to wait on scan completion.

Co-Authored-By: Luke Dashjr <luke-jr+git@utopios.org>

Github-Pull: bitcoin#30713
Rebased-From: 4c9dc4b
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Feb 22, 2025
fanquake added a commit that referenced this pull request Mar 21, 2025
c6eca6f doc: add guidance for RPC to developer notes (tdb3)

Pull request description:

  Adds guidance statements to the RPC interface section of the developer notes with examples of when to implement `-deprecatedrpc=`.

  Wanted to increase awareness of preferred RPC implementation approaches for newer contributors.

  This implements some of what's discussed in #29912 (comment)

  Opinions may differ, so please don't be shy.  We want to make RPC as solid/safe as possible.

  Examples of discussions where guidelines/context might have added value:
  #30212 (comment)
  #29845 (comment)
  #30381 (review)
  #29954 (comment)
  #30410 (review)
  #30713
  #30381
  #29060 (review)

ACKs for top commit:
  l0rinc:
    ACK c6eca6f
  fjahr:
    ACK c6eca6f
  maflcko:
    lgtm ACK c6eca6f
  jonatack:
    ACK c6eca6f

Tree-SHA512: 01a98a8dc0eb91762b225d3278cdb4a5e380ceb7486fd096b4ad9122bed859cea8584d8996d3dce51272fdb792f4a793a1bd1c5445efeb87f0a30f9b6e59a790
@tdb3 tdb3 closed this by deleting the head repository Jul 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants