Skip to content

Conversation

promag
Copy link
Contributor

@promag promag commented Apr 29, 2020

This PR fixes an early relocking race condition from #18811 where old relock callback runs after new wallet unlock code and nRelockTime update but before rpcRunLater call, causing early relock and incorrect nRelockTime time

Issue introduced in #18487.
Fixes #18811.

@promag promag force-pushed the 2020-04-rpc-wallet-relock branch from 69e21bb to 5a0486c Compare April 29, 2020 10:48
@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 29, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@maflcko
Copy link
Member

maflcko commented Apr 29, 2020

Would be nice to write a test for this

@promag
Copy link
Contributor Author

promag commented Apr 29, 2020

Thanks for reviewing @ryanofsky. I'll address that case.

@MarcoFalke probably a unit test no?

@maflcko
Copy link
Member

maflcko commented Apr 29, 2020

Yes, I haven't looked at the code changes, but I assume some kind of unit test makes sense.

@maflcko
Copy link
Member

maflcko commented Apr 29, 2020

Also, this needs backport, right?

@maflcko maflcko added this to the 0.20.0 milestone Apr 29, 2020
@promag promag force-pushed the 2020-04-rpc-wallet-relock branch from 5a0486c to b0f1e19 Compare April 29, 2020 20:34
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK b0f1e19. I think all the problems we've seen so far are avoided:

  1. Original deadlock from #18487 where rpcRunLater blocks waiting for old relock callback in timer thread which is deadlocked waiting for cs_wallet
  2. Early relocking race condition from #18811 where old relock callback runs after new wallet unlock code and nRelockTime update but before rpcRunLater call, causing early relock and incorrect nRelockTime time
  3. Failure to relock race condition in 5a0486c where two relock callbacks could each cancel the other out without either of them relocking

Everything would be simpler if rpcRunLater would just activate the new callback after the old callback if the old callback happened to be running, instead of blocking waiting for the old callback to finish. Then walletpassphrase could go back to using cs_wallet and multiple mutexes wouldn't be required. But I think the approach in this PR should work too and is probably a smaller change on top of existing code.

Only additional suggestion would be to write a PR description that summarizes what this is fixing without having to read the other thread. A unit test could be a good idea, but might be difficult to write, so I don't think it would be crazy to merge this without a new test if it's difficult to come up with one. Especially since the bug this is fixing was caught by a python test.

@promag
Copy link
Contributor Author

promag commented Apr 29, 2020

A unit test could be a good idea, but might be difficult to write

FWIW I'm still trying this.

@promag
Copy link
Contributor Author

promag commented Apr 29, 2020

instead of blocking waiting for the old callback to finish.

BTW If we bump minimum libevent then we could use this https://github.com/libevent/libevent/blob/29cc8386a2f7911eaa9336692a2c5544d8b4734f/whatsnew-2.1.txt#L248-L265

@maflcko
Copy link
Member

maflcko commented Apr 29, 2020

Bumping libevent should be discussed separately. See also #18676

@ryanofsky
Copy link
Contributor

Also, I haven't looked but I doubt there's a need for a special libevent feature to improve rpcRunLater. We should just need an extra bit of code at the end of the existing libevent callback to register a new callback if there's one pending

@promag
Copy link
Contributor Author

promag commented May 3, 2020

FWIW I'm still trying this.

And I quit. Tried with unit test but it required to start HTTP/RPC server because of libevent and timer stuff. Then tried funcional test but can't figure how to hit the problem.

@ryanofsky I'll borrow your comment above to improve PR description.

@ryanofsky
Copy link
Contributor

This still looks good but it might be simpler to squash second and third commits to avoid having a bug in the second commit which is fixed by the third commit.

This would simplify the PR description so you could drop the paragraph "Failure to relock race condition in 59a50e7" about the temporary bug.

@promag promag force-pushed the 2020-04-rpc-wallet-relock branch from b0f1e19 to 9f59dde Compare May 7, 2020 00:42
@promag
Copy link
Contributor Author

promag commented May 7, 2020

Done.

@ryanofsky
Copy link
Contributor

ryanofsky commented May 7, 2020

Code review ACK 9f59dde. No changes since last review except squashing commits.

Two other thoughts:

  • It might be nice to add a comment for RPCRunLater that mentions the risk of deadlocks:

    /** 
     * Run func nSeconds from now. 
     * Overrides previous timer <name> (if any).
     *
     * If previous timer has started executing, this will block waiting for the call
     * to return. Therefore, to avoid deadlocks, RPCRunLater can't be called while
     * holding locks that a previous timer might be waiting for.
     *
     * @todo Make implementation nonblocking to avoid risk of deadlocks and simplify
     * usage.
     */ 
    void RPCRunLater(const std::string& name, std::function<void()> func, int64_t nSeconds);
  • This is an idea for a new PR, but maybe there should be an RPC function to return a timer's scheduled time so we could drop CWallet::nRelockTime

@maflcko
Copy link
Member

maflcko commented May 11, 2020

ACK 9f59dde

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 9f59dde

I like @ryanofsky's suggestions in #18814 (comment).

@fanquake fanquake merged commit a33901c into bitcoin:master May 13, 2020
@promag promag deleted the 2020-04-rpc-wallet-relock branch May 13, 2020 09:55
@promag
Copy link
Contributor Author

promag commented May 13, 2020

@ryanofsky honestly I'd suggest to avoid using it. Instead of timeout what I think would be best is to exclusively "bind" the unlock to the current RPC connection. In the GUI we could have a timer. I don't see other uses of the timer in the server.

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 14, 2020
9f59dde rpc: Relock wallet only if most recent callback (João Barbosa)
a2e6db5 rpc: Add mutex to guard deadlineTimers (João Barbosa)

Pull request description:

  This PR fixes an early relocking race condition from bitcoin#18811 where old relock callback runs after new wallet unlock code and nRelockTime update but before rpcRunLater call, causing early relock and incorrect nRelockTime time

  Issue introduced in bitcoin#18487.
  Fixes bitcoin#18811.

ACKs for top commit:
  MarcoFalke:
    ACK 9f59dde
  ryanofsky:
    Code review ACK 9f59dde. No changes since last review except squashing commits.
  jonatack:
    ACK 9f59dde

Tree-SHA512: 2f7fc03e5ab6037337f2d82dfad432495cc337c77d07c968ee2355105db6292f24543c03456f5402e0e759577a4327758f9372f7ea29de6d56dc3695fda9b379
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request May 14, 2020
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request May 14, 2020
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request May 15, 2020
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request May 15, 2020
maflcko pushed a commit that referenced this pull request May 15, 2020
245c862 test: disable script fuzz tests (fanquake)
9a8fb4c fuzz: Remove enumeration of expected deserialization exceptions in ProcessMessage(...) fuzzer (practicalswift)
6161c94 [net processing] Only send a getheaders for one block in an INV (John Newbery)
cf2a6e2 test: Remove const to work around compiler error on xenial (Wladimir J. van der Laan)
cc7d344 miner: Avoid stack-use-after-return in validationinterface (MarcoFalke)
37a6207 test: Add unregister_validation_interface_race test (MarcoFalke)
ff4dc20 gui: Fix manual coin control with multiple wallets loaded (João Barbosa)
ed0afe8 test: Add test for conflicted wallet tx notifications (Russell Yanofsky)
251e321 rpc: Relock wallet only if most recent callback (João Barbosa)
ca4dac4 rpc: Add mutex to guard deadlineTimers (João Barbosa)
a3fe458 [docs] Improve commenting in ProcessGetData() (John Newbery)
011532e [test] test that an invalid GETDATA doesn't prevent processing of future messages (Amiti Uttarwar)
1e73d72 [net processing] ignore unknown INV types in GETDATA messages (Amiti Uttarwar)
fb82173 [net processing] ignore tx GETDATA from blocks-only peers (Amiti Uttarwar)
315ae14 gui: Fix itemWalletAddress leak when not tree mode (João Barbosa)

Pull request description:

  Backports the following PRs to the 0.20 branch:

  * #18578: gui: Fix leak in CoinControlDialog::updateView
  * #18808: [net processing] Drop unknown types in getdata
  * #18814: rpc: Relock wallet only if most recent callback
  * #18878: test: Add test for conflicted wallet tx notifications
  * #18894: gui: Fix manual coin control with multiple wallets loaded
  * #18742: miner: Avoid stack-use-after-return in validationinterface
  * #18962: net processing: Only send a getheaders for one block in an INV
  * #18975: test: Remove const to work around compiler error on xenial

ACKs for top commit:
  promag:
    Tested ACK 245c862 coin control with multiple wallets.
  laanwj:
    ACK 245c862
  MarcoFalke:
    ACK 245c862 solved the conflicts myself as a sanity check. Did not re-review 🍷

Tree-SHA512: 285e5a5fad5bbeba6032742c65dc68836e8eccfcceda9e69fec4ddd162a3f61679a96f9bbe3d434267835af67c21ac4c05accf6f63e827c2eb47203c6daafe31
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 6, 2021
Summary:
```
This PR fixes an early relocking race condition from #18811 where old
relock callback runs after new wallet unlock code and nRelockTime update
but before rpcRunLater call, causing early relock and incorrect
nRelockTime time
```

Backport of core [[bitcoin/bitcoin#18814 | PR18814]].

Fix spurious wallet_descriptor.py failure like
https://build.bitcoinabc.org/viewLog.html?tab=buildLog&logTab=tree&filter=debug&expand=all&buildId=167449&_focus=3291&guest=1.

Test Plan:
With Clang:
  ninja all check-all

Reviewers: #bitcoin_abc, PiRK

Reviewed By: #bitcoin_abc, PiRK

Differential Revision: https://reviews.bitcoinabc.org/D8803
backpacker69 referenced this pull request in peercoin/peercoin Mar 28, 2021
Github-Pull: #18814
Rebased-From: a2e6db5
backpacker69 referenced this pull request in peercoin/peercoin Mar 28, 2021
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

wallet_descriptor intermittent error "Please enter the wallet passphrase with walletpassphrase first. (-13)"
6 participants