Skip to content

Conversation

0xB10C
Copy link
Contributor

@0xB10C 0xB10C commented Feb 16, 2022

This adds functional tests for the USDT tracepoints added in #22006 and #22902. This partially fixes #23296. The tests are probably skipped on most systems as these tests require:

  • a Linux system with a kernel that supports BPF (and available kernel headers)
  • that Bitcoin Core is compiled with tracepoints for USDT support (default when compiled with depends)
  • bcc installed
  • the tests are run with a privileged user that is able to e.g. do BPF syscalls and load BPF maps

The tests are not yet run in our CI as the CirrusCI containers lack the required permissions (see #23296 (comment)). Running the tests in a VM in the CI could work, but I haven't experimented with this yet. The priority was to get the actual tests done first to ensure the tracepoints work as intended for the v23.0 release. Running the tracepoint tests in the CI is planned as the next step to finish #23296.

The tests can, however, be run against e.g. release candidates by hand. Additionally, they provide a starting point for tests for future tracepoints. PRs adding new tracepoint should include tests. This makes reviewing these PRs easier.

The tests require privileges to execute BPF sycalls (CAP_SYS_ADMIN before Linux kernel 5.8 and CAP_BPF and CAP_PERFMON on 5.8+) and permissions to /sys/kernel/debug/tracing/. It's currently recommended to run the tests in a virtual machine (or on a VPS) where it's sensible to use the root user to gain these privileges. Never run python scripts you haven't carefully reviewed with root permissions! It's unclear if a non-root user can even gain the required privileges. This needs more experimenting.

The goal here is to test the tracepoint interface to make sure the documented interface does not break by accident. The tracepoints expose implementation details. This means we also need to rely on implementation details of Bitcoin Core in these functional tests to trigger the tracepoints. An example is the test of the utxocache:flush tracepoint: On Bitcoin Core shutdown, the UTXO cache is flushed twice. The corresponding tracepoint test expects two flushes, too - if not, the test fails. Changing implementation details could cause these tests to fail and the tracepoint API to break. However, we purposefully treat the tracepoints only as semi-stable. The tracepoints should not block refactors or changes to other internals.

@laanwj
Copy link
Member

laanwj commented Feb 18, 2022

Nice! Concept ACK.

@arnabsen1729
Copy link
Contributor

arnabsen1729 commented Feb 18, 2022

I tested it on my machine:

  • First I ran the tests on master. Only interface_usdt_utxocache.py failed. After I pulled the commits from follow-up tracing: utxocache tracepoints follow up for #22902 #23907, the test was successful.
  • the tests for net and validation were successful in both scenarios.
  • running tests without root privileges showed the message no permissions to use BPF and they were skipped.

Tested with bpftrace v0.12.0

$ bpftrace --version
bpftrace v0.12.0

For testing the USDT tracepoint API in the functional tests we
require:
 - that we are on a Linux system*
 - that Bitcoin Core is compiled with tracepoints
 - that bcc and the the Python bcc module [0] is installed
 - that we run the tests with the required permissions**
otherwise we skip the tests.

*:  We currently only support tracepoints on Linux. Tracepoints are
    not compiled on other platforms.
**: Currently, we check for root permissions via getuid == 0. It's
    unclear if it's even possible to run the tests a non-root user
    with e.g. CAP_BPF, CAP_PERFMON, and access to /sys/kernel/debug/
    tracing/. Anyone running these tests as root should carefully
    review them first and then run them in a disposable VM.

[0]: https://github.com/iovisor/bcc/blob/master/INSTALL.md
This adds tests for the net:inbound_message and net:outbound_message
tracepoint interface.
This adds tests for the
- utxocache:flush
- utxocache:uncache
- utxocache:add
- utxocache:spent
tracepoint interfaces.
This adds a test for the validation:block_connected tracepoint.
@0xB10C 0xB10C force-pushed the 2022-02-tracepoint-tests-a branch from db3a05f to 76c60d7 Compare February 20, 2022 13:59
@0xB10C
Copy link
Contributor Author

0xB10C commented Feb 20, 2022

#23907 is merged. Rebased and ready for review.

@0xB10C 0xB10C marked this pull request as ready for review February 20, 2022 14:01
@fanquake fanquake requested review from jb55 and laanwj February 20, 2022 15:00
@theStack
Copy link
Contributor

Concept ACK

@jb55
Copy link
Contributor

jb55 commented Feb 25, 2022

Tests pass on my end, but oddly enough when I run bpftrace contrib/tracing/log_utxos.bt I get a segfault. anyone else seeing this?

$ bpftrace --version
bpftrace v0.14.1
Thread 1 "bpftrace" received signal SIGSEGV, Segmentation fault.
0x00007fffefc816e4 in llvm::PointerType::get(llvm::Type*, unsigned int) () from /nix/store/sfydnaab0wxn2qm3pkaab5x1fcagzxpf-llvm-11.1.0-lib/lib/libLLVM-11.so
(gdb) bt
#0  0x00007fffefc816e4 in llvm::PointerType::get(llvm::Type*, unsigned int) ()
   from /nix/store/sfydnaab0wxn2qm3pkaab5x1fcagzxpf-llvm-11.1.0-lib/lib/libLLVM-11.so
#1  0x000000000055fc37 in bpftrace::ast::CodegenLLVM::createFormatStringCall(bpftrace::ast::Call&, int&, std::vector<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<bpftrace::Field, std::allocator<bpftrace::Field> > >, std::allocator<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<bpftrace::Field, std::allocator<bpftrace::Field> > > > >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bpftrace::AsyncAction) ()
#2  0x0000000000569c87 in bpftrace::ast::CodegenLLVM::visit(bpftrace::ast::Call&) ()
#3  0x0000000000558dd6 in bpftrace::ast::CodegenLLVM::accept(bpftrace::ast::Node*) ()
#4  0x0000000000558ee7 in bpftrace::ast::CodegenLLVM::visit(bpftrace::ast::ExprStatement&) ()
#5  0x0000000000558dd6 in bpftrace::ast::CodegenLLVM::accept(bpftrace::ast::Node*) ()
#6  0x000000000056edce in bpftrace::ast::CodegenLLVM::generateProbe(bpftrace::ast::Probe&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, llvm::FunctionType*, bool, std::optional<int>, bool) ()
#7  0x000000000056fbaf in bpftrace::ast::CodegenLLVM::visit(bpftrace::ast::Probe&) ()
#8  0x0000000000558dd6 in bpftrace::ast::CodegenLLVM::accept(bpftrace::ast::Node*) ()
#9  0x0000000000558f6e in bpftrace::ast::CodegenLLVM::visit(bpftrace::ast::Program&) ()
#10 0x0000000000558dd6 in bpftrace::ast::CodegenLLVM::accept(bpftrace::ast::Node*) ()
#11 0x0000000000558feb in bpftrace::ast::CodegenLLVM::generate_ir() ()
#12 0x000000000043d622 in main ()

@0xB10C
Copy link
Contributor Author

0xB10C commented Feb 25, 2022

@jb55 which version of bpftrace are you running? I've got the segfault on 0.13 and 0.14 too, it works fine with 0.12. The issue seems to be related to the printf in the unroll followed by another printf.

This is kinda off topic here, but there have been quite a few problems with bpftrace on different versions. People reported on IRC that the example scripts don't work with their bpftrace version. Have been thinking about dropping the examples until bpftrace is more stable.

@jb55
Copy link
Contributor

jb55 commented Feb 25, 2022

tACK 76c60d7

@laanwj
Copy link
Member

laanwj commented Mar 30, 2022

It's unclear if a non-root user can even gain the required privileges. This needs more experimenting.

I managed to get the required privileges as a user, and get the tests to run (don't forget to patch test/functional/test_framework/test_framework.py to remove the euid check):

# chmod 755 /sys/kernel/debug # might want to assign a special group or such
# chmod 755 /sys/kernel/debug/tracing
# chmod 666 /sys/kernel/debug/tracing/uprobe_events
# echo "3" > /proc/sys/kernel/perf_event_paranoid # 3 or lower will do, 4 won't
# setpriv --ambient-caps +cap_38,+cap_39 --inh-caps +cap_38,+cap_39 --init-groups --reuid=1000 --regid=1000 bash
$ getpcaps $$
310068: cap_perfmon,cap_bpf=eip
$ cd …/bitcoin
$ test/functional/interface_usdt_net.py
(passes)
$ test/functional/interface_usdt_validation.py
(passes)
$ test/functional/interface_usdt_utxocache.py
(fails on some probably unrelated assertion)

It's apparently also possible to assign fixed capabilities to users through /etc/security/capability.confin some environments. I haven't tried this yet.

BTW: test_runner.py fails on all the new tests for me. Although the test itself passes, warnings are printed to stderr, I think causing it to be marked as failed:

stderr:
In file included from <built-in>:2:                                                                       
In file included from /virtual/include/bcc/bpf.h:12:
In file included from include/linux/types.h:6:
In file included from include/uapi/linux/types.h:14:
In file included from include/uapi/linux/posix_types.h:5:
In file included from include/linux/stddef.h:5:
In file included from include/uapi/linux/stddef.h:2:
In file included from include/linux/compiler_types.h:80:
include/linux/compiler-clang.h:41:9: warning: '__HAVE_BUILTIN_BSWAP32__' macro redefined [-Wmacro-redefine
d]
#define __HAVE_BUILTIN_BSWAP32__
        ^
<command line>:4:9: note: previous definition is here 
#define __HAVE_BUILTIN_BSWAP32__ 1
        ^
…

@laanwj
Copy link
Member

laanwj commented Mar 30, 2022

Tested ACK 76c60d7

Will open a PR to fix the RPC assertion in interface_usdt_utxocache.py. It's unrelated.

laanwj added a commit to laanwj/bitcoin that referenced this pull request Mar 30, 2022
When `getrawtransaction` is successfully used on a coinbase transaction,
there is an assertion error. This is very unlikely but happens in the
test in bitcoin#24358.

This does the following:

- Add missing "coinbase" documentation.

- Synchronize documentation between `getrawtransaction` and
  `decoderawtransaction`, the two users of `TxToUniv` that have detailed
  documentation. `decodepsbt` also uses it but fortunately elides this block.

- Change "vout[].amount" to `STR_AMOUNT` for consistency.

- Add maintainer comment to keep the two places synchronized. It might
  be possible to get smarter with deduplication, but there are some
  extra fields that prevent the obvious way.
"""Skip the running test if we don't have permissions to do BPF syscalls and load BPF maps."""
# check for 'root' permissions
if os.geteuid() != 0:
raise SkipTest("no permissions to use BPF (please review the tests carefully before running them with higher privileges)")
Copy link
Member

@laanwj laanwj Mar 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As alternative to / in addition to os.geteuid() != 0 for non-root use we'll probably want to check CapEff in /proc/<pid>/status for (1 << CAP_PERFMON) | (1 << CAP_BPF).

CAP_PERFMON = 38
CAP_BPF = 39

Not necessarily in this PR though. I think it's fine to merge as-is and do that later.

@jamesob
Copy link
Contributor

jamesob commented Mar 30, 2022

Concept ACK. Nice functional demonstration of USDT usage, too. I'll test this soon.

maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Mar 31, 2022
…saction`

71038a1 rpc: Fix documentation assertion for `getrawtransaction` (laanwj)

Pull request description:

  When `getrawtransaction` is successfully used on a coinbase transaction, there is an assertion error. This is very unlikely but happens in the `interface_usdt_utxocache.py` test in bitcoin#24358.

  This does the following:

  - Add missing "coinbase" documentation.

  - Synchronize documentation between `getrawtransaction` and  `decoderawtransaction`, the two users of `TxToUniv` that have detailed   documentation. `decodepsbt` and `getblock` also uses it but fortunately elides this block.

  - Change "vout[].amount" to `STR_AMOUNT` for consistency.

  - Add maintainer comment to keep the two places synchronized. It might  be possible to get smarter with deduplication, but there are some  extra fields that prevent the obvious way.

ACKs for top commit:
  jonatack:
    ACK 71038a1

Tree-SHA512: 962236130455d805190ff9a5c971e4e25c17db35614a90ce340264ec953b0ad7fb814eb33ae430b5073955a8a350f72bdd67ba93e35f9c70e5175b836a767a35
@laanwj laanwj merged commit 6c9460e into bitcoin:master Apr 6, 2022
@laanwj
Copy link
Member

laanwj commented Apr 6, 2022

Concept ACK. Nice functional demonstration of USDT usage, too. I'll test this soon.

@jamesob I'm going ahead and merge this as I think it's ready. I hope you'll still get around to testing it though.

@0xB10C 0xB10C deleted the 2022-02-tracepoint-tests-a branch April 6, 2022 11:20
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Apr 6, 2022
luke-jr pushed a commit to luke-jr/bitcoin that referenced this pull request May 21, 2022
When `getrawtransaction` is successfully used on a coinbase transaction,
there is an assertion error. This is very unlikely but happens in the
test in bitcoin#24358.

This does the following:

- Add missing "coinbase" documentation.

- Synchronize documentation between `getrawtransaction` and
  `decoderawtransaction`, the two users of `TxToUniv` that have detailed
  documentation. `decodepsbt` also uses it but fortunately elides this block.

- Change "vout[].amount" to `STR_AMOUNT` for consistency.

- Add maintainer comment to keep the two places synchronized. It might
  be possible to get smarter with deduplication, but there are some
  extra fields that prevent the obvious way.

Github-Pull: bitcoin#24716
Rebased-From: 71038a1
maflcko pushed a commit to bitcoin-core/gui that referenced this pull request Aug 1, 2022
cc7335e ci: run USDT interface test in a VM (0xb10c)
dba6f82 test: adopt USDT utxocache interface tests (0xb10c)
220a5a2 test: hook into PID in tracing tests (0xb10c)

Pull request description:

  Changes a CI task that runs test the previously not run `test/functional/interface_usdt_*.py` functional tests (added in bitcoin/bitcoin#24358).

  This task is run as CirussCI `compute_engine_instance` VM as hooking into the tracepoints is not possible in CirrusCI docker containers (bitcoin/bitcoin#23296 (comment)). We use an unoffical PPA and untrusted  `bpfcc-tools` package in the CI as the Ubuntu jammy and Debian bullseye packages are outdated. We hope use an official package when new Ubuntu/Debian releases are available for the use with Google Compute Engine.

  We make sure to hook into `bitcoind` binaries in USDT interface tests via their PID, instead of their path. This makes sure multiple functional tests running in parallel don't interfere with each other.

  The utxocache USDT interface tests is adopted to a change of the functional test framework that wasn't detected as the tests weren't run in the CI. As the tracepoints expose internals, it can happen that we need to adopt the interface test when internals change. This is a bit awkward, and if it happens to frequently, we should consider generalizing the tests a bit more. For now it's fine, I think.

  See the individual commit messages for more details on the changes.

  Fixes bitcoin/bitcoin#24782
  Fixes bitcoin/bitcoin#23296

  I'd like to hear from reviewers:
  - Are we OK with using the [`hadret/bpfcc`](https://launchpad.net/~hadret/+archive/ubuntu/bpfcc) PPA for now? There is a clear plan when to drop it and as is currently, it could only impact the newly added VM task.
  - ~~Adding a new task increases CI runtime and costs. Should an existing `container` CI task be ported to a VM and reused instead?~~ Yes, see bitcoin/bitcoin#25528 (comment)

ACKs for top commit:
  MarcoFalke:
    cr ACK cc7335e

Tree-SHA512: b7fddccc0a77d82371229d048abe0bf2c4ccaa45906497ef3040cf99e7f05561890aef4c253c40e4afc96bb838c9787fae81c8454c6fd9db583276e005a4ccb3
Fabcien added a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Dec 5, 2022
Summary:
```
This adds functional tests for the USDT tracepoints added in #22006 and #22902. This partially fixes #23296. The tests are probably skipped on most systems as these tests require:

    a Linux system with a kernel that supports BPF (and available kernel headers)
    that Bitcoin Core is compiled with tracepoints for USDT support (default when compiled with depends)
    bcc installed
    the tests are run with a privileged user that is able to e.g. do BPF syscalls and load BPF maps

```

Backport of [[bitcoin/bitcoin#24358 | core#24358]] and [[bitcoin/bitcoin#25528 | core#25528]] for the test fixes.

Depends on D12737.

Note that:
 - We don't run it on CI (yet)
 - We need to prune more blocks to trigger the expected flush event as
   our block files are smaller
 - The 3 tests succeed on my machine, but the runner returns an error
   because I got compilation warnings that print to stderr

Test Plan:
With tracing enabled:
  sudo ./test/functional/test_runner.py interface_usdt_net
  sudo ./test/functional/test_runner.py interface_usdt_utxocache
  sudo ./test/functional/test_runner.py interface_usdt_validation

  ninja check-functional
Check the tests are skipped as expected when not run as root.

Reviewers: #bitcoin_abc, PiRK

Reviewed By: #bitcoin_abc, PiRK

Subscribers: PiRK

Differential Revision: https://reviews.bitcoinabc.org/D12739
@bitcoin bitcoin locked and limited conversation to collaborators Apr 6, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tracing: tests for USDT tracepoints
7 participants