Skip to content

Conversation

maflcko
Copy link
Member

@maflcko maflcko commented Jul 26, 2023

To run the USDT functional tests, the ASan task currently requires the container host to run the Ubuntu Lunar Linux kernel (or later). Cirrus CI is the only provider that allows to spin up full VMs with Ubuntu Lunar, however they will start to charge for all tasks (See slightly related discussion in #28098).

Since it is cheaper and recommended by Cirrus CI to just run a persistent worker, do that.

Also, using a persistent worker allows to make use of the docker image cache.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 26, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK hebasto
Concept ACK pinheadmz
Stale ACK 0xB10C

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #27976 (ci: Start with clean env by MarcoFalke)
  • #27793 (ci: label docker images and prune dangling images selectively by stickies-v)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot DrahtBot added the Tests label Jul 26, 2023
@maflcko maflcko force-pushed the 2307-ci-worker-001- branch from fa764eb to 51be0b9 Compare July 26, 2023 10:14
@maflcko maflcko force-pushed the 2307-ci-worker-001- branch 6 times, most recently from faf6c47 to fae3967 Compare July 26, 2023 14:05
@maflcko maflcko marked this pull request as ready for review July 26, 2023 14:43
@maflcko
Copy link
Member Author

maflcko commented Jul 26, 2023

cc @0xB10C about the second commit

@0xB10C
Copy link
Contributor

0xB10C commented Jul 26, 2023

ACK, looks like the tests are run and pass.

interface_usdt_coinselection.py                        | ✓ Passed  | 17 s
interface_usdt_mempool.py                              | ✓ Passed  | 37 s
interface_usdt_net.py                                  | ✓ Passed  | 14 s
interface_usdt_utxocache.py                            | ✓ Passed  | 35 s
interface_usdt_validation.py                           | ✓ Passed  | 8 s

@maflcko
Copy link
Member Author

maflcko commented Jul 27, 2023

(reworked the second commit to only use a single sed command, and add documentation for why it is needed)

@pinheadmz
Copy link
Member

concept ACK

I read the linked cirrus guide about persistent workers, everything makes sense with these changes. Two questions:

  1. Is the persistent worker hosted by you? Are there any documented details about it?
  2. Why does the previous releases task need persistent worker?

MarcoFalke added 2 commits August 1, 2023 17:33
Otherwise the task will throw in skip_if_no_python_bcc.

Also, adjust CI_CONTAINER_CAP for all needed permissions.
@maflcko maflcko force-pushed the 2307-ci-worker-001- branch from fa74c87 to fa47439 Compare August 1, 2023 15:35
@maflcko
Copy link
Member Author

maflcko commented Aug 1, 2023

Why does the previous releases task need persistent worker?

I did that to detect silent merge conflicts without using Cirrus CI resources. (A machine is running this task in a loop on all pull requests, constantly rebased on master). Unrelated: See https://github.com/MarcoFalke/DrahtBot/blob/main/rerun_ci/src/main.rs#L30 for the details on how to re-run.

Is the persistent worker hosted by you? Are there any documented details about it?

Currently it is running on someone's leftover machine, but someone will spin up a proper cluster soon (either before merge or shortly after). I've pushed docs on the requirements of the workers. Let me know if I should document anything else. The other steps are taken from the Cirrus CI docs and are something like:

  • Set up cirrus curl -L -o cirrus https://github.com/cirruslabs/cirrus-cli/releases/latest/download/cirrus-$(uname | tr '[:upper:]' '[:lower:]')-amd64 && mv cirrus /usr/local/bin/cirrus && chmod +x /usr/local/bin/cirrus
  • Start screen
  • Connect the worker via cirrus worker run --labels type=lunar --token todo_get_the_correct_token

@hebasto
Copy link
Member

hebasto commented Aug 2, 2023

Concept ACK on moving CI tasks outside from Cirrus CI as a response to their new limits.

Historically, the Bitcoin Core project started to use self-hosted CI in April 2021:

For testing purposes, a single task will be transformed to run on the DrahtBot infrastructure. If all goes well, the other tasks can be moved, too.

At the same time, there were some usage of the bitcoinbuilds.org infrastructure, which was completely independent of any CI provider. During that time, Cirrus CI seemed to be a suitable choice, and we stopped using bitcoinbuilds.org sometime around the end of 2021 or the beginning of 2022.

Also, using a persistent worker allows to make use of the docker image cache.

This feature is really great.

Since it is cheaper...

Tbh, I'd feel much more confident and comfortable knowing that some sponsors publicly pledged to cover all related costs.

@maflcko
Copy link
Member Author

maflcko commented Aug 2, 2023

Tbh, I'd feel much more confident and comfortable knowing that some sponsors publicly pledged to cover all related costs.

Ok, I'll try to encourage the sponsor to publicly identify themselves, but I wonder what you worry about? If you are worried about the sponsor walking away, I can assure you that there are currently three (3) willing sponsors in line. Also, while it isn't free, running Linux is extremely cheap compared to Windows/macOS.

If you have any alternative ideas, it would be good to share them. The only thing I could come up with would be to use GitHub Actions CI ubuntu-jammy and do a upgrade to Ubuntu Lunar on every CI run. You'd have to increase the task runtime to 3 hours, but that seems unlikely to work and a worse outcome either way, so I am not going to pursue that.

@hebasto
Copy link
Member

hebasto commented Aug 2, 2023

Tbh, I'd feel much more confident and comfortable knowing that some sponsors publicly pledged to cover all related costs.

Ok, I'll try to encourage the sponsor to publicly identify themselves

Sorry for being misunderstood. I don't want to force current sponsors to any additional actions.

... but I wonder what you worry about?

Sustainability.

Also, while it isn't free, running Linux is extremely cheap compared to Windows/macOS.

I don't think so. In June, total Linux tasks cost was the biggest one followed by Windows.

The only thing I could come up with would be to use GitHub Actions CI ubuntu-jammy and do a upgrade to Ubuntu Lunar on every CI run. You'd have to increase the task runtime to 3 hours, but that seems unlikely to work and a worse outcome either way, so I am not going to pursue that.

This should definitely be avoided.

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fa47439, I have reviewed the code and it looks OK.

Comment on lines +276 to +279
enable_bpfcc_script:
# In the image build step, no external environment variables are available,
# so any settings will need to be written to the settings env file:
- sed -i "s|\${CIRRUS_CI}|true|g" ./ci/test/00_setup_env_native_asan.sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might it be more straightforward to define BPFCC_PACKAGE in ci\test_imagefile?

Running FILE_ENV="./ci/test/00_setup_env_native_asan.sh" ./ci/test_run_all.sh locally skips interface_usdt_*.py tests.

Copy link
Member Author

@maflcko maflcko Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of this pull request is not to change any behavior locally. If you are interested in changing the behavior, a separate issue or pull request would be good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeah I see this locally too (on master), even though --cap-add SYS_PTRACE is in the docker command and USDT tracing is set by configure, what's missing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's missing?

These are the checks to determine if we want to skip the USDT test:

def skip_test_if_missing_module(self):
self.skip_if_platform_not_linux()
self.skip_if_no_bitcoind_tracepoints()
self.skip_if_no_python_bcc()
self.skip_if_no_bpf_permissions()

I haven't looked into this in detail. With which user are the test run in docker locally? Is --cap-add SYS_PTRACE enough?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is --cap-add SYS_PTRACE enough?

No it is not. You can see the second commit of this pull request to see what is needed: fa47439

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I set CIRRUS_CI to true locally I end up failing installing the kernel headers. I see this error running this branch locally on both Ubuntu and macOS hosts. But the "generic" version there is coming from inside the container right (uname --kernel-release)? So I'm confused why this works on actual CI but not locally.

#8 7.217 E: Unable to locate package linux-headers-5.15.0-78-generic
#8 7.217 E: Couldn't find any package by glob 'linux-headers-5.15.0-78-generic'
#8 7.217 E: Couldn't find any package by regex 'linux-headers-5.15.0-78-generic'

Copy link
Member Author

@maflcko maflcko Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you want to run the exact same env like the Cirrus worker does, you will also need the exact same machine:

A "machine running the Linux kernel shipped with Ubuntu Lunar 23.04."

Thus, other Linuxes and macOS are not supported, unless it happens to work for some reason.

You can check the version locally with cat /etc/os-release.

However, be warned that when you set CIRRUS_CI to true locally, you'll pass --privileged -v /sys/kernel:/sys/kernel:rw to docker, which allows the CI system more privileges.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah 🤦‍♂️ the comment you added in response to my first review. Thanks!

@DrahtBot DrahtBot requested a review from 0xB10C August 2, 2023 11:45
@maflcko
Copy link
Member Author

maflcko commented Aug 2, 2023

Also, while it isn't free, running Linux is extremely cheap compared to Windows/macOS.

I don't think so. In June, total Linux tasks cost was the biggest one followed by Windows.

You are quoting Linux costs that Cirrus CI "sells". The goal of this pull request is to use Linux "sold" by someone else. (There are many professional companies that offer Linux boxes. If you are unsure, you can use your favourite search engine to get typical competitive pricing on Linux boxes.)

@hebasto
Copy link
Member

hebasto commented Aug 2, 2023

Also, while it isn't free, running Linux is extremely cheap compared to Windows/macOS.

I don't think so. In June, total Linux tasks cost was the biggest one followed by Windows.

You are quoting Linux costs that Cirrus CI "sells".

Yes. We have to cope with new Cirrus's offers.

The goal of this pull request is to use Linux "sold" by someone else.

Using someone's else resource equals to not using Cirrus provided resources.

(There are many professional companies that offer Linux boxes. If you are unsure, you can use your favourite search engine to get typical competitive pricing on Linux boxes.)

Thank you. I am aware of it :)

@maflcko
Copy link
Member Author

maflcko commented Aug 3, 2023

There may be a language barrier here, so I am wondering if anything is left to be done here or if there are any outstanding suggestions or questions. Just to clarify, that this pull does not aim to change any behavior of the CI system. The second commit is a mandatory "refactor" needed for the persistent worker on CIRRUS_CI to work.

@hebasto
Copy link
Member

hebasto commented Aug 3, 2023

There may be a language barrier here, so I am wondering if anything is left to be done here or if there are any outstanding suggestions or questions.

I'm OK with this PR in its current state.

@maflcko maflcko requested a review from fanquake August 3, 2023 13:22
enable_bpfcc_script:
# In the image build step, no external environment variables are available,
# so any settings will need to be written to the settings env file:
- sed -i "s|\${CIRRUS_CI}|true|g" ./ci/test/00_setup_env_native_asan.sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sed

🥲

@fanquake fanquake merged commit a4ca497 into bitcoin:master Aug 3, 2023
@maflcko maflcko deleted the 2307-ci-worker-001- branch August 4, 2023 07:42
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Aug 9, 2023
@bitcoin bitcoin locked and limited conversation to collaborators Aug 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants