Skip to content

Conversation

0xB10C
Copy link
Contributor

@0xB10C 0xB10C commented Jul 2, 2022

Changes a CI task that runs test the previously not run test/functional/interface_usdt_*.py functional tests (added in #24358).

This task is run as CirussCI compute_engine_instance VM as hooking into the tracepoints is not possible in CirrusCI docker containers (#23296 (comment)). We use an unoffical PPA and untrusted bpfcc-tools package in the CI as the Ubuntu jammy and Debian bullseye packages are outdated. We hope use an official package when new Ubuntu/Debian releases are available for the use with Google Compute Engine.

We make sure to hook into bitcoind binaries in USDT interface tests via their PID, instead of their path. This makes sure multiple functional tests running in parallel don't interfere with each other.

The utxocache USDT interface tests is adopted to a change of the functional test framework that wasn't detected as the tests weren't run in the CI. As the tracepoints expose internals, it can happen that we need to adopt the interface test when internals change. This is a bit awkward, and if it happens to frequently, we should consider generalizing the tests a bit more. For now it's fine, I think.

See the individual commit messages for more details on the changes.

Fixes #24782
Fixes #23296

I'd like to hear from reviewers:

  • Are we OK with using the hadret/bpfcc PPA for now? There is a clear plan when to drop it and as is currently, it could only impact the newly added VM task.
  • Adding a new task increases CI runtime and costs. Should an existing container CI task be ported to a VM and reused instead? Yes, see ci: run USDT interface tests in the CI #25528 (comment)

@fanquake fanquake added the Tests label Jul 2, 2022
0xB10C added 2 commits July 2, 2022 14:37
This makes sure to NOT hook into other bitcoind binaries run in
paralell in the test framework. We only want to trace the intended
binary.

In interface_usdt_utxocache.py:
While testing the utxocache flush with pruning, bitcoind is
restarted and we need to hook into the new PID again.
The USDT interface exposes process internals via the tracepoints. This
means, the USDT interface tests somewhat awardly depend on these
internals. If internals change, the tests have to adopt to that change.
Previously, the USDT interface tests weren't run in the CI so changes
could break the USDT interface tests without being noticed (e.g.
bitcoin#25486).

In fa13375 a 'self.rescan_utxos()' call
was added in the 'generate()' function of the test framework.
'rescan_utxos()' causes the UTXO cache to be flushed. In the USDT
interface tests for the 'utxocache:flush' trancepoint, 'generate()' is
used. As the utxo cache is now flushed more often, the number of flushes
the tests expectes need to be adopted. Also, the utxo cache has now a
different size when being flushed.

The utxocache tracepoint is tested by shutting the node down and
pruning blocks, to test the 'for_prune' argument.

Changes:
- A list 'expected_flushes' is now used which contains 'mode',
  'for_prune', and 'size' for each expected flush.
- When a flush happens, the expected-flush is removed from the list.
  This list is checked to be empty (unchanged).
- Previously, shutting down caused these two flushes:
    UTXOCacheFlush(duration=*, mode=ALWAYS, size=104, memory=*, for_prune=False)
    UTXOCacheFlush(duration=*, mode=ALWAYS, size=0, memory=*, for_prune=False)
  now it causes these flushes:
    UTXOCacheFlush(duration=*, mode=ALWAYS, size=2, memory=*, for_prune=False)
    UTXOCacheFlush(duration=*, mode=ALWAYS, size=0, memory=*, for_prune=False)
  The 104 UTXOs flushed previously were mainly coinbase UTXOs generated
  in previous tests and the test setup. These are now already flushed.
- In the 'for_prune' test we previously hooked into the tracepoint
  before mining blocks. This changed to only get notified about the
  tracepoint being triggered for the prune. Here, the utxo cache is
  empty already as it has just been flushed in 'generate()'.
  old:
    UTXOCacheFlush(duration=*, mode=NONE, size=350, memory=*, for_prune=True)
  new:
    UTXOCacheFlush(duration=*, mode=NONE, size=0, memory=*, for_prune=True)
@0xB10C 0xB10C force-pushed the 2022-06-usdt-ci-tests branch from 4c88ee2 to a221a5b Compare July 2, 2022 12:38
@maflcko
Copy link
Member

maflcko commented Jul 4, 2022

I tried to experiment with this, but I ran into: net.cpp:2980:5: error: must specify at least one argument for '...' parameter of variadic macro [-Werror,-Wgnu-zero-variadic-macro-arguments]

See https://cirrus-ci.com/task/6300898480619520?logs=ci#L2025

With the only change being to switch to debian:bookworm, see maflcko/bitcoin-core-with-ci@803f365#diff-b37d362747ecdb36dbfcbe2b8aab779f06404eee471ad835b58355e353d4fe79R9

Any ideas?

@0xB10C
Copy link
Contributor Author

0xB10C commented Jul 4, 2022

With the only change being to switch to debian:bookworm, see MarcoFalke/bitcoin-core-with-ci@803f365#diff-b37d362747ecdb36dbfcbe2b8aab779f06404eee471ad835b58355e353d4fe79R9

Any ideas?

Interesting. This seems to be unrelated to the tests. More the tracepoint macros in general. Probably a problem with the Systemtap package (systemtap-sdt-dev) in bookworm? Per https://packages.debian.org/bookworm/systemtap-sdt-dev it's version 4.7, which is also what we use in depends since #25360. Does a depends build on bookworm fail too?

I'll also do some poking.

edit: the variadic marco arg was indeed not present in 4.5 (which we used previously): https://sourceware.org/git/?p=systemtap.git;a=commitdiff;h=ecab2afea46099b4e7dfd551462689224afdbe3a. I think it also has something to do with clang, -pedantic and C++20. See https://stackoverflow.com/questions/67912343/how-to-resolve-must-specify-at-least-one-argument-for-parameter-of-variad

@0xB10C
Copy link
Contributor Author

0xB10C commented Jul 4, 2022

Tried running a depends build on bookworm - but no success so far: https://cirrus-ci.com/task/5255417889554432?logs=ci#L570

0xB10C@5641883

@maflcko
Copy link
Member

maflcko commented Jul 5, 2022

In any case it would fail because it wouldn't be possible to install the kernel headers in docker: https://cirrus-ci.com/task/5345925198512128?logs=ci#L3567 ?

@0xB10C
Copy link
Contributor Author

0xB10C commented Jul 7, 2022

One way to get the hosts kernel headers into the docker container is to mount them. I've mounted /lib/modules/5.15.0-1010-gcp/build however, there are quite a few relative symlinks which cause problems with mounts. To make them available, I've mounted /usr/src/linux-gcp-headers-5.15.0-1010 and /usr/src/linux-headers-5.15.0-1010-gcp but this doesn't seem to be enough. Not sure if it's worth the effort. See 0xB10C@6ac8e7e and https://cirrus-ci.com/task/6730748215427072?logs=ci#L2762.

<built-in>:1:10: fatal error: './include/linux/kconfig.h' file not found
#include "./include/linux/kconfig.h"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

which is in /lib/modules/5.15.0-1010-gcp/build/include/linux/kconfig.h.

@maflcko
Copy link
Member

maflcko commented Jul 7, 2022

Makes sense. That doesn't seem to be worth the effort.

An alternative to using a private ppa would be to bump the OS version in /etc/apt/sources.list before installing the package (See for example google/oss-fuzz@0da6949 ), though that'd just seem too brittle.

To preserve CI resources, I am thinking about combining this with another task. Since a wallet would be good to also cover the coinselection traces, and a recent-ish OS is required, we could re-use one of the sanitizer builds or the i686 build?

@maflcko
Copy link
Member

maflcko commented Jul 7, 2022

It worked. This commit should be passing: https://cirrus-ci.com/build/4705943798677504

@maflcko
Copy link
Member

maflcko commented Jul 8, 2022

Sorry, I lied. This one should pass: https://cirrus-ci.com/build/4898814673813504

@0xB10C 0xB10C force-pushed the 2022-06-usdt-ci-tests branch 2 times, most recently from d97474f to cc0b8bb Compare July 8, 2022 17:40
Our CI tasks are run by CirrusCI in Docker containers in a Google
Compute Engine based Kubernetes environment. These containers have
limited capabilities - especially CAP_SYS_ADMIN is missing. See
bitcoin#23296 (comment)

We need elevated privileges to hook into the USDT tracepoints. We use a
CirrusCI "compute_engine_instance" (a VM, not a container) where we have
the required privileges. The ubunut-mininmal-2204-lts was choosen with
debian-11 being an alternative. Both pack an outdated 'bpfcc-tools'
package (v0.18.0) from 2020. This version prints warnings to stderr
during BPF bytecode compilation, which causes our functional test runner
to fail. This is fixed in newer verison.

Until debian-12 or a newer Ubuntu release is avaliable as image in GCE
(https://cloud.google.com/compute/docs/images/os-details), we use a
third-party and untrusted PPA that releases up-to-date versions of the
package.

The official iovisor (authors of BCC) PPA is outdated too. An
alternative would be to compile BCC from source in the CI.

Co-authored-by: MacroFake <falke.marco@gmail.com>
@0xB10C 0xB10C force-pushed the 2022-06-usdt-ci-tests branch from cc0b8bb to cc7335e Compare July 8, 2022 17:42
@0xB10C
Copy link
Contributor Author

0xB10C commented Jul 9, 2022

As discussed offline with MarcoFake, we're using the [ASan + LSan + UBSan + integer, no depends, USDT] [jammy] task to run the USDT tests. The i686 didn't work - probably because the tracing isn't really supported on 32bit.

@maflcko
Copy link
Member

maflcko commented Jul 11, 2022

cr ACK cc7335e

# Images can be found here: https://cloud.google.com/compute/docs/images/os-details
compute_engine_instance:
image_project: ubuntu-os-cloud
image: family/ubuntu-2204-lts # when upgrading, check if we can drop "ADD_UNTRUSTED_BPFCC_PPA"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to the future: I was able to change this to family/ubuntu-2110, but the gce docs only include lts version. So if 2210 has the updated package, we can probably switch to that: https://packages.ubuntu.com/kinetic/bpfcc-tools

@theStack
Copy link
Contributor

Concept ACK

@maflcko maflcko merged commit eeb5a94 into bitcoin:master Aug 1, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Aug 1, 2022
@0xB10C 0xB10C deleted the 2022-06-usdt-ci-tests branch September 14, 2022 11:35
# packages. Meanwhile, use an untrusted PPA to install an up-to-date version of the bpfcc-tools
# package.
# TODO: drop this once we can use newer images in GCE
CI_EXEC add-apt-repository ppa:hadret/bpfcc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using of add-apt-repository requires some packages been installed.

0xB10C added a commit to 0xB10C/bitcoin that referenced this pull request Nov 28, 2022
As mentioned in bitcoin#26571, the task running the USDT interface tests
fail when run in docker. cc7335e
in bitcoin#25528 added that the tests are run in a **VM** in Cirrus CI.
Running them locally in docker containers might not work:

- We use [bcc] as tracing toolkit which requires the kernel headers
  to compile the BPF bytecode. As docker containers use the hosts
  kernel and don't run their own, there is a potential for mismatches
  between kernel headers available in the container and the host
  kernel. This results in a failure loading the BPF byte code.
- Privilges are required to load the BPF byte code into the kernel.
  Normally, the docker containers aren't run with these.
- We currently use an untrusted third-party PPA to install the
  bpfcc-tools package on Ubuntu 22.04. Using this on a local dev
  system could be a security risk.

To not hinder the ASan + LSan + UBSan part of the CI task, the USDT
tests are disabled on non-CirrusCI runs.

[bcc]: https://github.com/iovisor/bcc
maflcko pushed a commit that referenced this pull request Dec 2, 2022
2811f40 ci: only run USDT interface tests on CirrusCI (0xb10c)

Pull request description:

  As mentioned in #26571, the task running the USDT interface tests fail when run in docker. cc7335e in #25528 added that the tests are run in a **VM** in Cirrus CI. Running them locally in docker containers might not work:

  - We use [bcc] as tracing toolkit which requires the kernel headers to compile the BPF bytecode. As docker containers use the hosts kernel and don't run their own, there is a potential for mismatches between kernel headers available in the container and the host kernel. This results in a failure loading the BPF byte code.
  - Privilges are required to load the BPF byte code into the kernel. Normally, the docker containers aren't run with these.
  - We currently use an untrusted third-party PPA to install the bpfcc-tools package on Ubuntu 22.04. Using this on a local dev system could be a security risk.

  To not hinder the ASan + LSan + UBSan part of the CI task, the USDT tests are disabled on non-CirrusCI runs.

  [bcc]: https://github.com/iovisor/bcc

Top commit has no ACKs.

Tree-SHA512: 7c159b59630b36d5ed927b501fa0ad6f54ff3d98d0902f975cc8122b4147a7f828175807d40a470a85f0f3b6eeb79307d3465a287dbd2f3d75b803769ad0b6e7
Fabcien added a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Dec 5, 2022
Summary:
```
This adds functional tests for the USDT tracepoints added in #22006 and #22902. This partially fixes #23296. The tests are probably skipped on most systems as these tests require:

    a Linux system with a kernel that supports BPF (and available kernel headers)
    that Bitcoin Core is compiled with tracepoints for USDT support (default when compiled with depends)
    bcc installed
    the tests are run with a privileged user that is able to e.g. do BPF syscalls and load BPF maps

```

Backport of [[bitcoin/bitcoin#24358 | core#24358]] and [[bitcoin/bitcoin#25528 | core#25528]] for the test fixes.

Depends on D12737.

Note that:
 - We don't run it on CI (yet)
 - We need to prune more blocks to trigger the expected flush event as
   our block files are smaller
 - The 3 tests succeed on my machine, but the runner returns an error
   because I got compilation warnings that print to stderr

Test Plan:
With tracing enabled:
  sudo ./test/functional/test_runner.py interface_usdt_net
  sudo ./test/functional/test_runner.py interface_usdt_utxocache
  sudo ./test/functional/test_runner.py interface_usdt_validation

  ninja check-functional
Check the tests are skipped as expected when not run as root.

Reviewers: #bitcoin_abc, PiRK

Reviewed By: #bitcoin_abc, PiRK

Subscribers: PiRK

Differential Revision: https://reviews.bitcoinabc.org/D12739
@@ -15,5 +15,5 @@ export RUN_FUNCTIONAL_TESTS=false
export RUN_FUZZ_TESTS=false
export RUN_TIDY=true
export GOAL="install"
export BITCOIN_CONFIG="CC=clang CXX=clang++ --with-incompatible-bdb --disable-hardening CFLAGS='-O0 -g0' CXXFLAGS='-O0 -g0'"
export BITCOIN_CONFIG="CC=clang CXX=clang++ --enable-c++20 --with-incompatible-bdb --disable-hardening CFLAGS='-O0 -g0' CXXFLAGS='-O0 -g0'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@0xB10C @MarcoFalke

What was the point to add --enable-c++20 here?

Asking in context of #26763 (comment).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't added in this pull, see the diff in ci/test/00_setup_env_native_asan.sh.

It was added to check that compiling with it works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could another CI task use it instead of the "tidy" one? (fwiw, we have a native Windows build using C++20)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, any Linux task that builds all targets (including fuzz and gui) should do. For example, multiprocess, asan, or tsan.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't added in this pull, see the diff in ci/test/00_setup_env_native_asan.sh.

Suggesting to revert the move of --enable-c++20 between CI tasks in #26770.

maflcko pushed a commit to bitcoin-core/gui that referenced this pull request Dec 29, 2022
…sk back to "ASan..." one

afc6052 ci: Move `--enable-c++20` from "tidy" task back to "ASan..." one (Hennadii Stepanov)

Pull request description:

  This PR reverts cc7335e from bitcoin/bitcoin#25528 partially.

  C++20 has introduced some new headers, and it is premature to consider them when using the IWYU tool.

  Required for bitcoin/bitcoin#26763 and bitcoin/bitcoin#26766.

  Related discussions:
  - bitcoin/bitcoin#25528 (comment)
  - bitcoin/bitcoin#26763 (comment)

ACKs for top commit:
  MarcoFalke:
    review only ACK afc6052

Tree-SHA512: 9c1d3317d0f7a94d46d1e442da1a91669cd9ed1e62a41579edc96e1f5447551b536d32eeb222caf277e85228482753be545e0a874208cb24f9a8491fce3b6d9f
@bitcoin bitcoin locked and limited conversation to collaborators Dec 29, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI, tracing: run tracepoint interface tests in the CI tracing: tests for USDT tracepoints
5 participants