-
Notifications
You must be signed in to change notification settings - Fork 37.7k
ci: Move ASan USDT to persistent_worker #28161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
fa764eb
to
51be0b9
Compare
faf6c47
to
fae3967
Compare
cc @0xB10C about the second commit |
ACK, looks like the tests are run and pass.
|
fae3967
to
fa32374
Compare
(reworked the second commit to only use a single |
fa32374
to
fa707c8
Compare
fa707c8
to
fa74c87
Compare
concept ACK I read the linked cirrus guide about persistent workers, everything makes sense with these changes. Two questions:
|
Otherwise the task will throw in skip_if_no_python_bcc. Also, adjust CI_CONTAINER_CAP for all needed permissions.
fa74c87
to
fa47439
Compare
I did that to detect silent merge conflicts without using Cirrus CI resources. (A machine is running this task in a loop on all pull requests, constantly rebased on
Currently it is running on someone's leftover machine, but someone will spin up a proper cluster soon (either before merge or shortly after). I've pushed docs on the requirements of the workers. Let me know if I should document anything else. The other steps are taken from the Cirrus CI docs and are something like:
|
Concept ACK on moving CI tasks outside from Cirrus CI as a response to their new limits. Historically, the Bitcoin Core project started to use self-hosted CI in April 2021:
At the same time, there were some usage of the bitcoinbuilds.org infrastructure, which was completely independent of any CI provider. During that time, Cirrus CI seemed to be a suitable choice, and we stopped using bitcoinbuilds.org sometime around the end of 2021 or the beginning of 2022.
This feature is really great.
Tbh, I'd feel much more confident and comfortable knowing that some sponsors publicly pledged to cover all related costs. |
Ok, I'll try to encourage the sponsor to publicly identify themselves, but I wonder what you worry about? If you are worried about the sponsor walking away, I can assure you that there are currently three (3) willing sponsors in line. Also, while it isn't free, running Linux is extremely cheap compared to Windows/macOS. If you have any alternative ideas, it would be good to share them. The only thing I could come up with would be to use GitHub Actions CI ubuntu-jammy and do a upgrade to Ubuntu Lunar on every CI run. You'd have to increase the task runtime to 3 hours, but that seems unlikely to work and a worse outcome either way, so I am not going to pursue that. |
Sorry for being misunderstood. I don't want to force current sponsors to any additional actions.
Sustainability.
I don't think so. In June, total Linux tasks cost was the biggest one followed by Windows.
This should definitely be avoided. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fa47439, I have reviewed the code and it looks OK.
enable_bpfcc_script: | ||
# In the image build step, no external environment variables are available, | ||
# so any settings will need to be written to the settings env file: | ||
- sed -i "s|\${CIRRUS_CI}|true|g" ./ci/test/00_setup_env_native_asan.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might it be more straightforward to define BPFCC_PACKAGE
in ci\test_imagefile
?
Running FILE_ENV="./ci/test/00_setup_env_native_asan.sh" ./ci/test_run_all.sh
locally skips interface_usdt_*.py
tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal of this pull request is not to change any behavior locally. If you are interested in changing the behavior, a separate issue or pull request would be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yeah I see this locally too (on master), even though --cap-add SYS_PTRACE
is in the docker command and USDT tracing
is set by configure, what's missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's missing?
These are the checks to determine if we want to skip the USDT test:
bitcoin/test/functional/interface_usdt_net.py
Lines 87 to 91 in 2fa60f0
def skip_test_if_missing_module(self): | |
self.skip_if_platform_not_linux() | |
self.skip_if_no_bitcoind_tracepoints() | |
self.skip_if_no_python_bcc() | |
self.skip_if_no_bpf_permissions() |
I haven't looked into this in detail. With which user are the test run in docker locally? Is --cap-add SYS_PTRACE
enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is
--cap-add SYS_PTRACE
enough?
No it is not. You can see the second commit of this pull request to see what is needed: fa47439
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I set CIRRUS_CI to true locally I end up failing installing the kernel headers. I see this error running this branch locally on both Ubuntu and macOS hosts. But the "generic" version there is coming from inside the container right (uname --kernel-release
)? So I'm confused why this works on actual CI but not locally.
#8 7.217 E: Unable to locate package linux-headers-5.15.0-78-generic
#8 7.217 E: Couldn't find any package by glob 'linux-headers-5.15.0-78-generic'
#8 7.217 E: Couldn't find any package by regex 'linux-headers-5.15.0-78-generic'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you want to run the exact same env like the Cirrus worker does, you will also need the exact same machine:
A "machine running the Linux kernel shipped with Ubuntu Lunar 23.04."
Thus, other Linuxes and macOS are not supported, unless it happens to work for some reason.
You can check the version locally with cat /etc/os-release
.
However, be warned that when you set CIRRUS_CI to true locally, you'll pass --privileged -v /sys/kernel:/sys/kernel:rw
to docker, which allows the CI system more privileges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah 🤦♂️ the comment you added in response to my first review. Thanks!
You are quoting Linux costs that Cirrus CI "sells". The goal of this pull request is to use Linux "sold" by someone else. (There are many professional companies that offer Linux boxes. If you are unsure, you can use your favourite search engine to get typical competitive pricing on Linux boxes.) |
Yes. We have to cope with new Cirrus's offers.
Using someone's else resource equals to not using Cirrus provided resources.
Thank you. I am aware of it :) |
There may be a language barrier here, so I am wondering if anything is left to be done here or if there are any outstanding suggestions or questions. Just to clarify, that this pull does not aim to change any behavior of the CI system. The second commit is a mandatory "refactor" needed for the persistent worker on CIRRUS_CI to work. |
I'm OK with this PR in its current state. |
enable_bpfcc_script: | ||
# In the image build step, no external environment variables are available, | ||
# so any settings will need to be written to the settings env file: | ||
- sed -i "s|\${CIRRUS_CI}|true|g" ./ci/test/00_setup_env_native_asan.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sed
🥲
To run the USDT functional tests, the ASan task currently requires the container host to run the Ubuntu Lunar Linux kernel (or later). Cirrus CI is the only provider that allows to spin up full VMs with Ubuntu Lunar, however they will start to charge for all tasks (See slightly related discussion in #28098).
Since it is cheaper and recommended by Cirrus CI to just run a persistent worker, do that.
Also, using a persistent worker allows to make use of the docker image cache.