ci: run USDT interface tests in the CI #25528
This makes sure NOT to hook into other bitcoind binaries run in parallel in the test framework. We only want to trace the intended binary. In interface_usdt_utxocache.py: while testing the utxocache flush with pruning, bitcoind is restarted and we need to hook into the new PID again.

The USDT interface exposes process internals via the tracepoints. This means the USDT interface tests somewhat awkwardly depend on these internals. If internals change, the tests have to adapt to that change. Previously, the USDT interface tests weren't run in the CI, so changes could break the USDT interface tests without being noticed (e.g. bitcoin#25486).

In fa13375 a 'self.rescan_utxos()' call was added in the 'generate()' function of the test framework. 'rescan_utxos()' causes the UTXO cache to be flushed. 'generate()' is used in the USDT interface tests for the 'utxocache:flush' tracepoint. As the UTXO cache is now flushed more often, the number of flushes the tests expect needs to be adapted. Also, the UTXO cache now has a different size when being flushed. The utxocache tracepoint is tested by shutting the node down and pruning blocks, to test the 'for_prune' argument.

Changes:
- A list 'expected_flushes' is now used which contains 'mode', 'for_prune', and 'size' for each expected flush.
- When a flush happens, the expected flush is removed from the list. This list is checked to be empty (unchanged).
- Previously, shutting down caused these two flushes:
  UTXOCacheFlush(duration=*, mode=ALWAYS, size=104, memory=*, for_prune=False)
  UTXOCacheFlush(duration=*, mode=ALWAYS, size=0, memory=*, for_prune=False)
  Now it causes these flushes:
  UTXOCacheFlush(duration=*, mode=ALWAYS, size=2, memory=*, for_prune=False)
  UTXOCacheFlush(duration=*, mode=ALWAYS, size=0, memory=*, for_prune=False)
  The 104 UTXOs flushed previously were mainly coinbase UTXOs generated in previous tests and the test setup. These are now already flushed.
- In the 'for_prune' test we previously hooked into the tracepoint before mining blocks. This changed to only get notified about the tracepoint being triggered for the prune. Here, the UTXO cache is already empty, as it has just been flushed in 'generate()'.
  old: UTXOCacheFlush(duration=*, mode=NONE, size=350, memory=*, for_prune=True)
  new: UTXOCacheFlush(duration=*, mode=NONE, size=0, memory=*, for_prune=True)
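The expected-flush bookkeeping described above can be sketched as follows. This is a simplified illustration, not the actual test code: the event dicts and the handler are hypothetical stand-ins for the data a BCC callback would read from the 'utxocache:flush' tracepoint.

```python
# Simplified sketch of the expected-flush bookkeeping. Each entry records
# the 'mode', 'for_prune', and 'size' of a flush the test expects to see.
expected_flushes = [
    {"mode": "ALWAYS", "for_prune": False, "size": 2},
    {"mode": "ALWAYS", "for_prune": False, "size": 0},
]

def handle_flush_event(event, expected):
    """Tick off the first expected flush matching this event.

    Returns True if the event was expected, False otherwise."""
    for exp in expected:
        if all(event[key] == exp[key] for key in ("mode", "for_prune", "size")):
            expected.remove(exp)
            return True
    return False

# Simulate the two flushes seen during shutdown.
for event in [{"mode": "ALWAYS", "for_prune": False, "size": 2},
              {"mode": "ALWAYS", "for_prune": False, "size": 0}]:
    assert handle_flush_event(event, expected_flushes)

# As in the test, the list must be empty once all expected flushes happened.
assert expected_flushes == []
```

Checking that the list ends up empty catches both missed and unexpected flushes in one place.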
I tried to experiment with this, but I ran into an error, see https://cirrus-ci.com/task/6300898480619520?logs=ci#L2025. The only change was switching to debian:bookworm, see maflcko/bitcoin-core-with-ci@803f365#diff-b37d362747ecdb36dbfcbe2b8aab779f06404eee471ad835b58355e353d4fe79R9. Any ideas?

Interesting. This seems to be unrelated to the tests and more about the tracepoint macros in general. Probably a problem with the Systemtap package. I'll also do some poking. edit: the variadic macro arg was indeed not present in 4.5 (which we used previously): https://sourceware.org/git/?p=systemtap.git;a=commitdiff;h=ecab2afea46099b4e7dfd551462689224afdbe3a. I think it also has something to do with clang.
Tried running a depends build on bookworm, but no success so far: https://cirrus-ci.com/task/5255417889554432?logs=ci#L570

In any case it would fail because it wouldn't be possible to install the kernel headers in docker: https://cirrus-ci.com/task/5345925198512128?logs=ci#L3567 ?
One way to get the host's kernel headers into the docker container is to mount them. I've mounted
which is in
Makes sense. That doesn't seem to be worth the effort. An alternative to using a private PPA would be to bump the OS version in

To preserve CI resources, I am thinking about combining this with another task. Since a wallet would be good (to also cover the coin selection traces) and a recent-ish OS is required, we could re-use one of the sanitizer builds or the i686 build?

It worked. This commit should be passing: https://cirrus-ci.com/build/4705943798677504

Sorry, I lied. This one should pass: https://cirrus-ci.com/build/4898814673813504
Our CI tasks are run by CirrusCI in Docker containers in a Google Compute Engine based Kubernetes environment. These containers have limited capabilities - especially CAP_SYS_ADMIN is missing. See bitcoin#23296 (comment). We need elevated privileges to hook into the USDT tracepoints. We use a CirrusCI "compute_engine_instance" (a VM, not a container) where we have the required privileges.

The ubuntu-minimal-2204-lts image was chosen, with debian-11 being an alternative. Both pack an outdated 'bpfcc-tools' package (v0.18.0) from 2020. This version prints warnings to stderr during BPF bytecode compilation, which causes our functional test runner to fail. This is fixed in newer versions. Until debian-12 or a newer Ubuntu release is available as an image in GCE (https://cloud.google.com/compute/docs/images/os-details), we use a third-party and untrusted PPA that releases up-to-date versions of the package. The official iovisor (authors of BCC) PPA is outdated too. An alternative would be to compile BCC from source in the CI.

Co-authored-by: MacroFake <falke.marco@gmail.com>
As discussed offline with MarcoFake, we're using the

cr ACK cc7335e
# Images can be found here: https://cloud.google.com/compute/docs/images/os-details
compute_engine_instance:
  image_project: ubuntu-os-cloud
  image: family/ubuntu-2204-lts  # when upgrading, check if we can drop "ADD_UNTRUSTED_BPFCC_PPA"
note to the future: I was able to change this to family/ubuntu-2110, but the GCE docs only include LTS versions. So if 2210 has the updated package, we can probably switch to that: https://packages.ubuntu.com/kinetic/bpfcc-tools
Concept ACK
# packages. Meanwhile, use an untrusted PPA to install an up-to-date version of the bpfcc-tools
# package.
# TODO: drop this once we can use newer images in GCE
CI_EXEC add-apt-repository ppa:hadret/bpfcc
Using add-apt-repository requires some packages to be installed.
As mentioned in bitcoin#26571, the task running the USDT interface tests fails when run in docker. cc7335e in bitcoin#25528 added that the tests are run in a **VM** in Cirrus CI. Running them locally in docker containers might not work:

- We use [bcc] as tracing toolkit, which requires the kernel headers to compile the BPF bytecode. As docker containers use the host's kernel and don't run their own, there is a potential for mismatches between the kernel headers available in the container and the host kernel. This results in a failure loading the BPF bytecode.
- Privileges are required to load the BPF bytecode into the kernel. Normally, the docker containers aren't run with these.
- We currently use an untrusted third-party PPA to install the bpfcc-tools package on Ubuntu 22.04. Using this on a local dev system could be a security risk.

To not hinder the ASan + LSan + UBSan part of the CI task, the USDT tests are disabled on non-CirrusCI runs.

[bcc]: https://github.com/iovisor/bcc
2811f40 ci: only run USDT interface tests on CirrusCI (0xb10c)

Pull request description: As mentioned in #26571, the task running the USDT interface tests fails when run in docker. cc7335e in #25528 added that the tests are run in a **VM** in Cirrus CI. Running them locally in docker containers might not work: - We use [bcc] as tracing toolkit, which requires the kernel headers to compile the BPF bytecode. As docker containers use the host's kernel and don't run their own, there is a potential for mismatches between the kernel headers available in the container and the host kernel. This results in a failure loading the BPF bytecode. - Privileges are required to load the BPF bytecode into the kernel. Normally, the docker containers aren't run with these. - We currently use an untrusted third-party PPA to install the bpfcc-tools package on Ubuntu 22.04. Using this on a local dev system could be a security risk. To not hinder the ASan + LSan + UBSan part of the CI task, the USDT tests are disabled on non-CirrusCI runs. [bcc]: https://github.com/iovisor/bcc

Top commit has no ACKs.

Tree-SHA512: 7c159b59630b36d5ed927b501fa0ad6f54ff3d98d0902f975cc8122b4147a7f828175807d40a470a85f0f3b6eeb79307d3465a287dbd2f3d75b803769ad0b6e7
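The "disabled on non-CirrusCI runs" gating can be sketched as below. This is a hedged illustration, not necessarily the exact check used: Cirrus CI does set the CIRRUS_CI environment variable in its tasks, but the helper name and its use here are hypothetical.

```python
# Hypothetical sketch: only enable the USDT interface tests when the CI
# run is on Cirrus CI (where the task runs in a privileged VM), and skip
# them on local docker runs where BPF loading would likely fail.
import os

def usdt_tests_enabled(environ=None):
    """Return True only when running under Cirrus CI."""
    if environ is None:
        environ = os.environ
    return environ.get("CIRRUS_CI") == "true"

assert usdt_tests_enabled({"CIRRUS_CI": "true"})
assert not usdt_tests_enabled({})  # local docker run: USDT tests are skipped
```

Keeping the check environment-based means the same CI script can serve both the VM task and local container runs without extra flags.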
Summary:
```
This adds functional tests for the USDT tracepoints added in #22006 and #22902. This partially fixes #23296. The tests are probably skipped on most systems as these tests require:
- a Linux system with a kernel that supports BPF (and available kernel headers)
- that Bitcoin Core is compiled with tracepoints for USDT support (default when compiled with depends)
- bcc installed
- that the tests are run with a privileged user that is able to e.g. do BPF syscalls and load BPF maps
```
Backport of [[bitcoin/bitcoin#24358 | core#24358]] and [[bitcoin/bitcoin#25528 | core#25528]] for the test fixes. Depends on D12737.

Note that:
- We don't run it on CI (yet)
- We need to prune more blocks to trigger the expected flush event as our block files are smaller
- The 3 tests succeed on my machine, but the runner returns an error because compilation warnings are printed to stderr

Test Plan: With tracing enabled:
sudo ./test/functional/test_runner.py interface_usdt_net
sudo ./test/functional/test_runner.py interface_usdt_utxocache
sudo ./test/functional/test_runner.py interface_usdt_validation
ninja check-functional
Check the tests are skipped as expected when not run as root.

Reviewers: #bitcoin_abc, PiRK
Reviewed By: #bitcoin_abc, PiRK
Subscribers: PiRK
Differential Revision: https://reviews.bitcoinabc.org/D12739
@@ -15,5 +15,5 @@ export RUN_FUNCTIONAL_TESTS=false
 export RUN_FUZZ_TESTS=false
 export RUN_TIDY=true
 export GOAL="install"
-export BITCOIN_CONFIG="CC=clang CXX=clang++ --with-incompatible-bdb --disable-hardening CFLAGS='-O0 -g0' CXXFLAGS='-O0 -g0'"
+export BITCOIN_CONFIG="CC=clang CXX=clang++ --enable-c++20 --with-incompatible-bdb --disable-hardening CFLAGS='-O0 -g0' CXXFLAGS='-O0 -g0'"
What was the point of adding --enable-c++20 here? Asking in the context of #26763 (comment).
It wasn't added in this pull, see the diff in ci/test/00_setup_env_native_asan.sh. It was added to check that compiling with it works.
Could another CI task use it instead of the "tidy" one? (fwiw, we have a native Windows build using C++20)
Sure, any Linux task that builds all targets (including fuzz and gui) should do. For example, multiprocess, asan, or tsan.
It wasn't added in this pull, see the diff in ci/test/00_setup_env_native_asan.sh.

Suggesting to revert the move of --enable-c++20 between CI tasks in #26770.
afc6052 ci: Move `--enable-c++20` from "tidy" task back to "ASan..." one (Hennadii Stepanov)

Pull request description: This PR partially reverts cc7335e from bitcoin/bitcoin#25528. C++20 has introduced some new headers, and it is premature to consider them when using the IWYU tool. Required for bitcoin/bitcoin#26763 and bitcoin/bitcoin#26766.

Related discussions:
- bitcoin/bitcoin#25528 (comment)
- bitcoin/bitcoin#26763 (comment)

ACKs for top commit: MarcoFalke: review only ACK afc6052

Tree-SHA512: 9c1d3317d0f7a94d46d1e442da1a91669cd9ed1e62a41579edc96e1f5447551b536d32eeb222caf277e85228482753be545e0a874208cb24f9a8491fce3b6d9f
This changes a CI task to run the previously not-run test/functional/interface_usdt_*.py functional tests (added in #24358).

This task is run as a CirrusCI compute_engine_instance VM, as hooking into the tracepoints is not possible in CirrusCI docker containers (#23296 (comment)). We use an unofficial and untrusted PPA for the bpfcc-tools package in the CI, as the Ubuntu jammy and Debian bullseye packages are outdated. We hope to use an official package when new Ubuntu/Debian releases are available for use with Google Compute Engine.

We make sure to hook into bitcoind binaries in the USDT interface tests via their PID, instead of their path. This makes sure multiple functional tests running in parallel don't interfere with each other.

The utxocache USDT interface test is adapted to a change of the functional test framework that wasn't detected as the tests weren't run in the CI. As the tracepoints expose internals, it can happen that we need to adapt the interface tests when internals change. This is a bit awkward, and if it happens too frequently, we should consider generalizing the tests a bit more. For now it's fine, I think.
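The PID-based hooking can be sketched as below. The helper name and return shape are hypothetical; with bcc, the resulting keyword arguments would correspond to constructing a `bcc.USDT` context for one specific process.

```python
# Illustrative helper (hypothetical, not the actual test code): build the
# arguments for attaching a USDT tracing context. Attaching by PID traces
# only the one intended bitcoind process; attaching by binary path would
# also pick up events from other bitcoind processes started in parallel
# by the test framework. After a restart (as in the pruning test), this
# must be called again with the node's new PID.
def usdt_context_kwargs(pid=None, path=None):
    if pid is not None:
        return {"pid": pid}    # trace only this process
    return {"path": path}      # would match every process running the binary

assert usdt_context_kwargs(pid=4242) == {"pid": 4242}
assert usdt_context_kwargs(path="/usr/local/bin/bitcoind") == {
    "path": "/usr/local/bin/bitcoind"}
```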
See the individual commit messages for more details on the changes.
Fixes #24782
Fixes #23296
I'd like to hear from reviewers:
- Are reviewers OK with using the hadret/bpfcc PPA for now? There is a clear plan for when to drop it, and as is, it could only impact the newly added VM task.
- Adding a new task increases CI runtime and costs. Should an existing container CI task be ported to a VM and reused instead? Yes, see ci: run USDT interface tests in the CI #25528 (comment)