
Conversation

jonasschnelli
Contributor

There are two major parts in this PR.

A) wallet: introduce a per-balance-type cache

In addition to the per-CWalletTx caches (debit, credit, etc.), this adds an atomic cache for each balance type (available, immature, watch-only, etc.).
If the balance is cached, no lock is acquired when calling Get*Balance().
As always with caches, the tricky part is knowing where to invalidate them (which is why there are some calls to MarkBalancesDirty()).
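For illustration, a minimal sketch of the cache members this part describes, using the names that appear in the diffs below (simplified, not the PR's exact code):

```cpp
// Sketch of the cache state inside CWallet.
// std::atomic lets Get*Balance() read a cached value without taking
// cs_main/cs_wallet; MarkBalancesDirty() is called wherever the wallet
// contents may have changed, so the next Get*Balance() recomputes.
mutable std::atomic<bool> fBalancesDirty{true};
mutable std::atomic<CAmount> nBalanceCache{0};

void MarkBalancesDirty() { fBalancesDirty = true; }
```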

B) Signal for balance changes

This PR exposes a new wallet signal that notifies the GUI when a balance change has happened (replacing the 250ms polling timer).
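A rough sketch of how the signalling side could be wired up (using the BalancesChanged name adopted later in the review; the WalletModel hookup is an assumed illustration, not the PR's exact code):

```cpp
// Wallet side: a boost::signals2 signal fired whenever balances may have changed.
boost::signals2::signal<void ()> BalancesChanged;

void CWallet::MarkBalancesDirty()
{
    fBalancesDirty = true;
    BalancesChanged(); // push notification instead of relying on GUI polling
}

// GUI side (WalletModel): forward the core signal onto the Qt event loop so the
// actual balance refresh runs on the GUI thread.
void WalletModel::subscribeToCoreSignals()
{
    wallet->BalancesChanged.connect([this]() {
        QMetaObject::invokeMethod(this, "updateBalance", Qt::QueuedConnection);
    });
}
```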

@laanwj
Member

laanwj commented Apr 21, 2017

Concept ACK, nice!


CAmount CWallet::GetBalance() const
{
    CAmount nTotal = 0;
    if (!fBalancesDirty) {
        return nBalanceCache;
Contributor

I think you need to acquire the cs_wallet lock here; otherwise this could return a partial sum if GetBalance got called from two threads at the same time and one thread returned nBalanceCache while the other thread was in the middle of the nBalanceCache += loop.

Alternatively, you could bring back the nTotal local variable and just set nBalanceCache = nTotal atomically.
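For clarity, a sketch of the suggested fix: accumulate into a local variable under the locks and publish the finished sum with a single atomic store (simplified; not the exact code that was later pushed):

```cpp
CAmount CWallet::GetBalance() const
{
    if (!fBalancesDirty) {
        return nBalanceCache; // reads only a fully-written value
    }

    CAmount nTotal = 0; // local accumulator, invisible to other threads
    {
        LOCK2(cs_main, cs_wallet);
        for (const auto& entry : mapWallet) {
            const CWalletTx* pcoin = &entry.second;
            if (pcoin->IsTrusted())
                nTotal += pcoin->GetAvailableCredit();
        }
    }
    nBalanceCache = nTotal;   // single atomic store of the finished sum
    fBalancesDirty = false;   // only cleared once the cache holds a valid value
    return nTotal;
}
```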

Contributor Author

Oh.. good point.
I'll re-add the temp variable then.

        return nBalanceCache;
    }

    nBalanceCache = 0;
Contributor

I think you need to move this line after the LOCK2(cs_main, cs_wallet); line. Otherwise if this method is called from two different threads simultaneously, this line could zero out a balance which is in the middle of being added up by the other thread.

Or alternatively, delete this line and set nBalanceCache = nTotal atomically after the loop.

std::atomic<bool> fBalancesDirty;

/** Notification for balance changes */
boost::signals2::signal<void ()> BalancesDidChange;
Contributor

Maybe call it BalancesChanged instead of BalancesDidChange to be consistent with the naming of other signals (StatusChanged, AddressBookChanged, etc).

Contributor Author

Yes, I can do that. I kinda like the signal naming convention from Apple (XYWillChange, XYDidChange, etc.) because it allows the listener to know, without reading too much code, whether the event was synchronous or has triggered a process with a later update. But let's not overdo it here; I'll change it.

@jonasschnelli
Contributor Author

Force-pushed fixes for the issues found by @ryanofsky.

Contributor

@ryanofsky left a comment

See other comments, but it looks like the caching here is broken. The Qt changes seem fine, but I think they would be easier to understand as 1 commit instead of 3.

@@ -780,6 +780,7 @@ class CWallet : public CCryptoKeyStore, public CValidationInterface
nRelockTime = 0;
fAbortRescan = false;
fScanningWallet = false;
fBalancesDirty = true;
Contributor

In commit "Add atomic cache for balances"

The whole caching portion of this PR actually seems broken, because fBalancesDirty is initialized to true but never set to false anywhere. I think this can be fixed, but maybe the simplest thing to do would be to drop the caching and just keep the signalling part of this PR.

Contributor Author

Oh, yes indeed. I have now switched to a per-balance-type cache-invalidation atomic and set it to false once the cache has been built.
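For illustration, one shape such per-balance-type invalidation could take (the flag names here are hypothetical, not necessarily the PR's):

```cpp
// One dirty flag per cached balance type (hypothetical names, for illustration).
mutable std::atomic<bool> fAvailableBalanceDirty{true};
mutable std::atomic<bool> fUnconfirmedBalanceDirty{true};
mutable std::atomic<bool> fImmatureBalanceDirty{true};
mutable std::atomic<bool> fWatchOnlyBalanceDirty{true};

void CWallet::MarkBalancesDirty()
{
    // Invalidate every cached balance; each Get*Balance() clears its own flag
    // again once it has rebuilt its cache under the wallet locks.
    fAvailableBalanceDirty = true;
    fUnconfirmedBalanceDirty = true;
    fImmatureBalanceDirty = true;
    fWatchOnlyBalanceDirty = true;
}
```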

@@ -137,12 +136,6 @@ void WalletModel::updateBalance()
}
}

void WalletModel::updateTransaction()
Contributor

In commit "[Qt] remove unused polling code"

I think it would be good to squash this commit into "[Qt] use the BalancesChanged signal instead of a 250ms poll timer." It's confusing to see code that checks the fForceCheckBalanceChanged variable and some code that sets it removed, while other code that sets it sticks around.

Contributor Author

Good point. Squashed those two commits together.

@jonasschnelli force-pushed the 2017/04/gui_rm_locks branch from e63ab9b to a690378 on May 4, 2017 07:15
@jonasschnelli
Contributor Author

Force-pushed fixes for @ryanofsky's points.

Contributor

@ryanofsky left a comment

utACK a6903788aa105bd26dc6bd5a2784db5109467848

@@ -1914,13 +1935,18 @@ CAmount CWallet::GetBalance() const
if (pcoin->IsTrusted())
nTotal += pcoin->GetAvailableCredit();
}
nBalanceCache = nTotal;
fBalancesDirty = false;
Contributor

In commit "Add atomic cache for balances"

Maybe rename fBalancesDirty to fBalanceDirty since this only applies to the balance returned from CWallet::GetBalance and not the other balances.

@jonasschnelli force-pushed the 2017/04/gui_rm_locks branch from a690378 to 84904a9 on May 4, 2017 18:41
@jonasschnelli
Contributor Author

Force-pushed a fix for @ryanofsky's nit (and also added the "other balances" atomics to SetNull()).

@ryanofsky
Contributor

ACK 84904a9d8ac5e98558b366bc69338d4ad07dc2f7.

Changes since last review were those Jonas mentioned. Seeing the missing initializations in SetNull() was a little disconcerting, because I would have expected tests to catch errors they would cause. Actually one test did fail (wallet_tests/rescan) but I guess I didn't notice it.

@jonasschnelli force-pushed the 2017/04/gui_rm_locks branch from 84904a9 to 9ca1140 on June 14, 2017 06:38
@jonasschnelli
Contributor Author

Needed a trivial rebase (BOOST_FOREACH -> for change).

Contributor

@kallewoof left a comment

utACK 9ca1140a19d3d39b86d7a6d6dc7ed907fcea2a88 (my QT skills are limited though)

}
MarkBalancesDirty();

LOCK(cs_wallet);
Contributor

Am I mistaken in thinking that the lock on cs_wallet will actually happen before the call to MarkBalancesDirty() due to scope? If so, I would swap these lines for clarity.

Contributor Author

@jonasschnelli Jun 14, 2017

I guess I should move the LOCK() up to the top. Because it's RAII, I think it doesn't matter in this case (no extra block); the lock should take effect when we enter the function/scope (before MarkBalancesDirty()). But maybe I'm wrong.

@sipa
Member

sipa commented Jun 14, 2017

@kallewoof No, the lock on cs_wallet will happen after MarkBalancesDirty
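For reference: LOCK() constructs a RAII guard, so the mutex is acquired at the line where the macro appears and released at the end of the enclosing scope; nothing is locked at scope entry. A minimal illustration:

```cpp
void CWallet::SomeWalletMethod() // hypothetical method, for illustration only
{
    MarkBalancesDirty(); // runs first; cs_wallet is NOT held yet

    LOCK(cs_wallet);     // guard constructed here: this is where the lock is taken
    // ... cs_wallet stays held from this point until the closing brace (RAII release)
}
```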

@sipa
Member

sipa commented Jun 14, 2017

But maybe I'm wrong

You're wrong.

@jonasschnelli force-pushed the 2017/04/gui_rm_locks branch from 9ca1140 to 187b3ee on June 14, 2017 07:31
@jonasschnelli
Contributor Author

But maybe I'm wrong

You're wrong.

Okay. Then I'd better be quiet.
Moved up the LOCK().

Contributor

@ryanofsky left a comment

utACK 187b3ee. Only change since last review was acquiring wallet lock before the one markdirty call.

@laanwj
Member

laanwj commented Jun 15, 2017

I'm trying to understand whether this will introduce a hang in the GUI thread in some cases.

Right now, the recurrent poll timer (not a thread! it's a Qt timer which runs in the GUI thread) updates the balance if it's possible to get the locks using TRY_LOCK. Then it gets them, once, and processes all balances.

After this change the locks are no longer TRY, so the BalancesChanged signal will always cause the GUI thread to take the wallet locks (through CWallet::Get*Balance()). Not once, but up to once per type of balance, if they all turn out to be dirty (which is likely).
Taking the cs_main and cs_wallet locks up to 6 times can take ages when the cs_main lock is contended (during initial sync, catching up). This all happens in the GUI thread (at the receiving end of the updateBalance signal).

Maybe I'm misunderstanding the code, but if I'm right about this, I'm not convinced that this is an improvement in user experience.
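For context, the polling that laanwj describes looks roughly like this in WalletModel on master (a simplified sketch of the existing pattern, not an exact copy):

```cpp
void WalletModel::pollBalanceChanged()
{
    // Fired every 250ms by a QTimer on the GUI thread. Both locks are taken
    // non-blockingly; if either is contended (e.g. during initial sync), the
    // update is skipped until the next tick instead of freezing the GUI.
    TRY_LOCK(cs_main, lockMain);
    if (!lockMain) return;
    TRY_LOCK(wallet->cs_wallet, lockWallet);
    if (!lockWallet) return;

    if (fForceCheckBalanceChanged || chainActive.Height() != cachedNumBlocks) {
        fForceCheckBalanceChanged = false;
        cachedNumBlocks = chainActive.Height();
        checkBalanceChanged(); // recomputes all balances and emits balanceChanged
    }
}
```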

@jonasschnelli
Contributor Author

Some of the current heavy "freezes" are happening because of the balance poll thread... it seems to be a significant CPU eater.
Right now, we TRY_LOCK every 250ms (which often locks successfully and then iterates, multiple times, through the complete mapWallet).

[Screenshot: profiler output, 2017-06-16 14:00]

@jonasschnelli
Contributor Author

Here is the complete profiling (on master):

It seems that the pollBalanceThread took 50% of the total CPU time for the measured run
(30 seconds, opening the debug window and calling sendtoaddress 10-20 times).

[Screenshot: full profiling run on master, 2017-06-16 14:09]

@jonasschnelli
Contributor Author

I tested again with this PR and the responsiveness is much better. @laanwj's points make sense to me.

Could it be that the TRY_LOCK every 250ms results in getting the LOCK very often and then doing the heavy balance calculation on the main thread, because QTimer runs on the main/GUI thread?

@ryanofsky
Contributor

Could it be that the TRY_LOCK every 250ms results in getting the LOCK very often and then doing the heavy balance calculation on the main thread, because QTimer runs on the main/GUI thread?

That's what I would expect. Have you thought about keeping the cache but bringing back the polling and try lock to address laanwj's point?

@jonasschnelli force-pushed the 2017/04/gui_rm_locks branch from 187b3ee to 8688671 on June 16, 2017 12:56
@jonasschnelli
Contributor Author

@ryanofsky I profiled this PR with only the first commit (atomic caches, keep the polling).
The results are better, but not as good as with the signal/non-polling approach:

Test run: Startup, call sendtoaddress(getnewaddress(), 1) 20 times manually

Master with only the first commit (pure atomic caches):

The polling thread still registers as significant in my profiler, at ~33% (very high IMO):
[Screenshot: profiler output with only the caching commit, 2017-06-16 14:58]

Master with this PR

The balance updates no longer seem to be significant:
[Screenshot: profiler output with the full PR, 2017-06-16 15:04]

@ryanofsky
Contributor

Unclear to me if @laanwj's concern about losing the try lock is sufficiently addressed with the new testing and discussion at https://botbot.me/freenode/bitcoin-core-dev/msg/87346236/. If it is, maybe this PR is close to ready for merge (so far two utACKs from kallewoof and me). If not, it seems like there is some advantage (50% -> 30% poll thread cpu decrease) and no downside to just merging the first commit for now?

@jonasschnelli
Contributor Author

I guess it would be great if someone with (g)perf experience on Linux/Ubuntu and maybe on Windows could benchmark this PR against master.

@ryanofsky
Contributor

Linux and Windows performance tests seem like they would be a good sanity check on the numbers you collected (50% CPU usage spent on balance computation on master, 33% with the caching commit, and ~0% with the full PR), but is there some reason to think this behavior would be platform-dependent?

I guess the main thing I'm not clear on is whether the results you collected (assuming they do hold up on other platforms) address laanwj's concern above, or if there is some kind of different testing that needs to be done, or some other safeguard that should be added to prevent the possibility of new hangs.

@promag
Contributor

promag commented Jul 20, 2017

@jonasschnelli if still relevant I can do it.

@jonasschnelli
Contributor Author

@promag: A benchmark would be very good to have.
Ideally you compare master against this PR and against only the first commit in this PR (3 versions).
Thanks!

@jonasschnelli
Contributor Author

Closing for now...
