Skip to content

Conversation

msune
Copy link
Member

@msune msune commented Jul 1, 2025

This patchset adds scapy support in two different ways to (hopefully) improve developer experience when writing and debugging BPF code, and when writing BPF unit tests.

Note: this patchset does NOT introduce a new testing framework.

hexdump()

lib/hexdump.h is a new header that defines hexdump() and HEXDUMP() primitives to printk() (up to) the first 64 128b bytes of a packet. It is meant to be used solely for debugging purposes of BPF code.

tools/log2scapy.py can then be used to parse the trace_pipe output and:

  • Dump the packet contents, using scapy's .show() routine: e.g.
<...>-473620  [003] b..11 955593.331365: bpf_trace_printk: /home/msuneclo/dev/cilium2/bpf/bpf_host.c:1024 l2_announce: 
###[ Ethernet ]### 
  dst       = ff:ff:ff:ff:ff:ff
  src       = de:ad:be:ef:de:ef
  type      = ARP
###[ ARP ]### 
     hwtype    = Ethernet (10Mb)
     ptype     = IPv4
     hwlen     = 6
     plen      = 4
     op        = who-has
     hwsrc     = de:ad:be:ef:de:ef
     psrc      = 110.0.11.1
     hwdst     = ff:ff:ff:ff:ff:ff
     pdst      = 172.16.10.1

  • Open a scapy interactive shell with the parsed packets (e.g. to manipulate them or inject them somewhere)
  • Generate a PCAP file with the parsed packets.

Short demo

hexdump_scapy2.mp4

Other comments:

  • Never use in production code, it's just a debugging tool
  • Mind complexity might increase when hexdump()/HEXDUMP() is used

Scapy in BPF unit tests

This patchset also adds scapy support to BPF unit tests. The support is additive, and can be used to:

  • Easily generate packets using Scapy packet definitions (BUF_DECL())
  • Easily compare expected vs actual packet received by the unit test, and show the two if they mismatch, e.g.:
--- FAIL: TestBPF (12.09s)
    --- FAIL: TestBPF/tc_l2_announcement.o (0.14s)
        --- FAIL: TestBPF/tc_l2_announcement.o/1_happy_path (0.00s)
            bpf_test.go:455: tc_l2_announcement.c:156: CTX and buffer 'EXPECTED_ARP_REP' mismatch tc_l2_announcement.c:156
        bpf_test.go:239: Local container found ifindex 42 seclabel 0
FAIL
FAIL	github.com/cilium/cilium/bpf/tests/bpftest	12.110s
FAIL
=== Start tc_l2_announcement.c:156 'arp_rep_ok' ===
--- Expected ---
###[ Ethernet ]### 
  dst       = de:ad:be:ef:de:ef
  src       = 13:37:13:37:13:37
  type      = ARP
###[ ARP ]### 
     hwtype    = Ethernet (10Mb)
     ptype     = IPv4
     hwlen     = 6
     plen      = 4
     op        = is-at
     hwsrc     = 13:37:13:37:13:37
     psrc      = 172.16.10.1
     hwdst     = 13:37:13:37:13:37
     pdst      = 110.0.11.1

--- Got ---
###[ Ethernet ]### 
  dst       = de:ad:be:ef:de:ef
  src       = 13:37:13:37:13:37
  type      = ARP
###[ ARP ]### 
     hwtype    = Ethernet (10Mb)
     ptype     = IPv4
     hwlen     = 6
     plen      = 4
     op        = is-at
     hwsrc     = 13:37:13:37:13:37
     psrc      = 172.16.10.1
     hwdst     = de:ad:be:ef:de:ef
     pdst      = 110.0.11.1

===  End  tc_l2_announcement.c:156 'arp_rep_ok' ===
make: *** [Makefile:91: run] Error 1

You can see a test ported to scapy infra here

Short demo

bpf_unit_scapy_diff.mp4

Other comments

  • This patchset does NOT remove (nor intend) the existing pktgen infra, nor the current assert support. It is meant to be an easier way to write some tests. I expect current asserts will still be used in some cases, but most of the tests can be simplified and benefit from scapy syntax.
  • Some TODOs / things to improve can be found here
  • Using trace_pipe for sending failed assert data back to the unit test go prog is... hacky. It might be worth looking into using a BPF map for this.

Final remarks

I'd like to get thoughts from the community on this to continue investing in it.

While we all get "used" to write BPF and BPF tests, IMO s/t like this would have saved me some precious dev time.

  • checkpatch.pl errors: some style errors are a result of defining scapy pkts within BPF code. This could be sorted out by TODO3, which would mean moving the packet definition itself to a python file (that would be loaded by scapy/parser.py). This might be cleaner. Fixed in latest commit.

Something like:

diff --git a/bpf/tests/tc_l2_announcement.c b/bpf/tests/tc_l2_announcement.c
index 82e3601415..4ef0570717 100644
--- a/bpf/tests/tc_l2_announcement.c
+++ b/bpf/tests/tc_l2_announcement.c
@@ -46,7 +46,7 @@ static __always_inline int build_packet(struct __ctx_buff *ctx)
 {
        struct pktgen builder;
        pktgen__init(&builder, ctx);
-       BUF_DECL(ARP_REQ, Ether(dst=mac_bcast, src=mac_one)/ARP(op="who-has", psrc=v4_ext_one, pdst=v4_svc_one, hwsrc=mac_one, hwdst=mac_bcast));
+       BUF_DECL(ARP_REQ, arp_req);
        BUILDER_PUSH_BUF(builder, ARP_REQ);
        pktgen__finish(&builder);
 
diff --git a/bpf/tests/tc_l2_announcement.py b/bpf/tests/tc_l2_announcement.py
new file mode 100644
index 0000000000..1a0e287f0d
--- /dev/null
+++ b/bpf/tests/tc_l2_announcement.py
@@ -0,0 +1 @@
+arp_req = Ether(dst=mac_bcast, src=mac_one)/ARP(op="who-has", psrc=v4_ext_one, pdst=v4_svc_one, hwsrc=mac_one, hwdst=mac_bcast)


  • As discussed with @smagnani96, we might want to add Python lintering

@msune msune requested a review from a team as a code owner July 1, 2025 09:28
@msune msune requested a review from ysksuzuki July 1, 2025 09:28
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 1, 2025
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Jul 1, 2025
@joestringer joestringer added the release-note/ci This PR makes changes to the CI. label Jul 16, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 16, 2025
@joestringer
Copy link
Member

@cilium/sig-datapath looking for your feedback on this one.

@smagnani96
Copy link
Contributor

FWIW, I'll also provide my feedback.
Generally speaking, I like the idea of adding an alternative mechanism for BPF unit tests, which doesn't necessarily have to replace the existing one. I like the proposed infra, it is clear to understand and surprisingly simpler than I thought, nice!
Few concerns and inputs:

  1. What about the community expertise/willing to review a different programming language? We could use gopacket, but for sure the result would be way more redundant than a 1-line Scapy definition...
  2. Agree with all TODOs:
    • TODO1: sometimes we need to simulate L3 packets, so that's needed
    • TODO2: ok
    • TODO3: ok, but it would make having a separate .py file for packet definitions harder (unless indexing a map with test name + index). In any case, having a separate .py file for scapy packet definitions would benefit from linters/suggestions/completions, other than making checkpatch happy.
    • TODO4: definitively agree, I'd prefer a ringbuffer with custom testing events rather than trace pipe, more control and no need to worry if the pipe's already opened by a different process.
  3. What's the maximum HD_MAX_BYTES before we hit a bpf complexity issue while dumping the hex? We might be interested also to overlay packets, and in that case at least 2*HD_MAX_BYTES is required.

I'll update my comment in case something else comes to my mind 🤔

@msune
Copy link
Member Author

msune commented Jul 17, 2025

@smagnani96 thx for the feedback, appreciated.

I like the proposed infra, it is clear to understand and surprisingly simpler than I thought, nice!

Thx. I think it can be polished slightly.

  1. Agree with all TODOs:
    TODO1: sometimes we need to simulate L3 packets, so that's needed

Actually, it might just work today with any sort of buf. The macro comparing already takes an offset, so that should work also out of the box. I will double-check.

TODO3: ok, but it would make having a separate .py file for packet definitions harder (unless indexing a map with test name + index). In any case, having a separate .py file for scapy packet definitions would benefit from linters/suggestions/completions, other than making checkpatch happy.

Not sure I understood your comment; you would have a separate .py file per unit test. In the example:

tc_l2_announcement.py

That would be for v4 L2 announcement only (tc_l2_announcement.c)

  1. What's the maximum HD_MAX_BYTES before we hit a bpf complexity issue while dumping the hex? We might be interested also to overlay packets, and in that case at least 2*HD_MAX_BYTES is required.

Bumping HD_MAX_BYTES to 128 might work for most of the programs, I didn't check.

We could could also only conditionally define HD_MAX_BYTES in hexdump.h so that you can define it before the inclusion (but then you have issues with multiple inclusions).

@smagnani96
Copy link
Contributor

Agree with all TODOs:

  • TODO1: sometimes we need to simulate L3 packets, so that's needed

Actually, it might just work today with any sort of buf. The macro comparing already takes an offset, so that should work also out of the box. I will double-check.

Yep, I was referring more of when dumping the packet in trace_diff_pkts.py https://github.com/msune/cilium/blob/63eecfb98263fb8ce53d5b0ecf38ad10a0a84e5b/bpf/tests/scapy/trace_diff_pkts.py#L36-L37:

        expected = Ether(bytes.fromhex(data["expected"]))
        got = Ether(bytes.fromhex(data["got"]))

Not sure I understood your comment; you would have a separate .py file per unit test. In the example:

Ha! Ok got it. I thought you wanted to make it even more granular, like specifying the buffer only locally to the CHECK function rather than locally to the whole test file.

We could could also only conditionally define HD_MAX_BYTES in hexdump.h so that you can define it before the inclusion (but then you have issues with multiple inclusions).

Or pass it as a const to the hexdump function, while also providing some pre-defined values (similarly as we do in the datapath for trace_payload_len / trace_payload_len_overlay)

@msune
Copy link
Member Author

msune commented Jul 17, 2025

Yep, I was referring more of when dumping the packet in trace_diff_pkts.py https://github.com/msune/cilium/blob/63eecfb98263fb8ce53d5b0ecf38ad10a0a84e5b/bpf/tests/scapy/trace_diff_pkts.py#L36-L37:

Yep, you are right, that's the part that is missing (I forgot). Let me think about options, but it might be as easy as to add the first expected Layer as part of the macro.

Ha! Ok got it. I thought you wanted to make it even more granular, like specifying the buffer only locally to the CHECK function rather than locally to the whole test file.

To be honest, today you can define them in pkt_constants.py and it works:

# Note: these replicate pktgen.h values
# TODO: it would be ideal to have a single source of truth for these values

...

tcp_svc_three = 53

default_data = "Should not change!!"

# Unit tests packet definitions below!

arp_req = Ether(dst=mac_bcast, src=mac_one)/ARP(op="who-has", psrc=v4_ext_one, pdst=v4_svc_one, hwsrc=mac_one, hwdst=mac_bcast)

scapy/parser.py imports this file, so these symbols are available on all unit tests.

TODO3 suggests to create one of these python "envs" with the packet definitions per unit test so that there are no possible conflicts. It's really a detail, because we could also continue to use a single pkt_constants.py (or use a better name for it), and simply prepend a unit test prefix to it, so that bpf/tests is not cluttered with python files. Either works for me.

Or pass it as a const to the hexdump function, while also providing some pre-defined values (similarly as we do in the datapath for trace_payload_len / trace_payload_len_overlay)

Let me look into it.

@borkmann
Copy link
Member

borkmann commented Jul 21, 2025

[...]

Ha! Ok got it. I thought you wanted to make it even more granular, like specifying the buffer only locally to the CHECK function rather than locally to the whole test file.

To be honest, today you can define them in pkt_constants.py and it works:

# Note: these replicate pktgen.h values
# TODO: it would be ideal to have a single source of truth for these values

...

tcp_svc_three = 53

default_data = "Should not change!!"

# Unit tests packet definitions below!

arp_req = Ether(dst=mac_bcast, src=mac_one)/ARP(op="who-has", psrc=v4_ext_one, pdst=v4_svc_one, hwsrc=mac_one, hwdst=mac_bcast)

scapy/parser.py imports this file, so these symbols are available on all unit tests.

TODO3 suggests to create one of these python "envs" with the packet definitions per unit test so that there are no possible conflicts. It's really a detail, because we could also continue to use a single pkt_constants.py (or use a better name for it), and simply prepend a unit test prefix to it, so that bpf/tests is not cluttered with python files. Either works for me.

Or pass it as a const to the hexdump function, while also providing some pre-defined values (similarly as we do in the datapath for trace_payload_len / trace_payload_len_overlay)

Overall I like the series! After the base is in, I presume other devs who want to use it won't have to touch the python parts, right?

I'd think we should probably convert all the packet builders with this work to simplify the code.

What would be great to improve imho is the macro namings, e.g. things like SCAPY_ASSERT_PKT_BUF, SCAPY_ASSERT_PKT_BUF_OFF, SCAPY_PKT_BUILDER.. maybe we can simplify the names a bit and add documentation for them with a small example. Right now one has to read the macros to understand what they are doing.

Small nit also wrt API: SCAPY_ASSERT_PKT_BUF_OFF has CTX as an arg, in the do {} while body we also use ctx. As I understand both are the same, but in some cases we use CTX and in others ctx? This feels prone to unexpected bugs, rather use (CTX) everywhere.

@msune
Copy link
Member Author

msune commented Jul 21, 2025

Overall I like the series! After the base is in, I presume other devs who want to use it won't have to touch the python parts, right?

Correct. The expectation is that no one has to write any Python code other than their packet definitions, e.g.:

arp_req = Ether(dst=mac_bcast, src=mac_one)/ARP(op="who-has", psrc=v4_ext_one, pdst=v4_svc_one, hwsrc=mac_one, hwdst=mac_bcast)

What would be great to improve imho is the macro namings, e.g. things like SCAPY_ASSERT_PKT_BUF, SCAPY_ASSERT_PKT_BUF_OFF, SCAPY_PKT_BUILDER.. maybe we can simplify the names

Fully agree. Not good at names, so any suggestions welcome.

a bit and add documentation for them with a small example. Right now one has to read the macros to understand what they are doing

I will certainly do that if we are in agreement it makes sense to be merged.

Small nit also wrt API: SCAPY_ASSERT_PKT_BUF_OFF has CTX as an arg, in the do {} while body we also use ctx. As I understand both are the same, but in some cases we use CTX and in others ctx? This feels prone to unexpected bugs, rather use (CTX) everywhere.

Good catch. Totally unintentional, should be CTX

@smagnani96
Copy link
Contributor

Another side note -- no need to address in the first PR.
Sometimes in tests we do modify some ctx layer info within the same CHECK instead of writing a whole new test PKTGEN+CHECK. An example could be found here.
I'm wondering if it can make sense to add helpers to manipulate the ctx data on-the-fly, to avoid having to unpack and check pointers in C to modify some field (ex L4 ports).

@maintainer-s-little-helper
Copy link

Commits 75ff49d, 8f3ab89 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Jul 22, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Jul 22, 2025
@msune msune requested review from a team as code owners July 23, 2025 09:43
@msune msune requested a review from Artyop July 23, 2025 09:43
@msune msune changed the title [RFC] bpf: scapy support (dev. experience) bpf: scapy support (dev. experience) Jul 23, 2025
@msune msune marked this pull request as draft July 23, 2025 09:44
@smagnani96
Copy link
Contributor

Adding a few extra bits for future enhancements.
I often find myself running specific tests only, via:

$ for f in classifiers_l3_dev classifiers_l2_dev; do make -C bpf/tests $f.o && go test ./bpf/tests/bpftest -bpf-test-path $pwd/bpf/tests -exec "sudo -E" -test $f -v; done

This will run only classifiers_l3_dev.c,classifiers_l2_dev.c, no need to build and run the whole tests.
However, that wouldn't be possible for bpf tests leveraging Scapy buffers: our Go bpftests doesn't automatically setup the needed infra as in the Makefile (trace pipe + parsing output).

I think one of these two can be added in a 2nd moment (I can even do it, preferring 1):

  1. add a single-target run in the Makefile
  2. make our GO bpftest to automatically setup the infra

msune added 4 commits August 4, 2025 10:48
This commit mounts /sys when invoking contrib/scripts/builder.sh,
so that BPF unit tests with the scapy support can read from the
trace_pipe.

It also changes builder.sh to receive quoted DOCKER_ARGS.

Signed-off-by: Marc Suñé <marc.sune@isovalent.com>
This commit refactors `tc_l2_announcement.c` to make use of the new
scapy test infrastructure (showcase).

Signed-off-by: Marc Suñé <marc.sune@isovalent.com>
Add SPDX Apache license identifier in scapy-related Python scripts.

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
Signed-off-by: Marc Suñé <marc.sune@isovalent.com>
Add a small guide on how to write BPF unit tests with Scapy along
with a short intro to Scapy.

Signed-off-by: Marc Suñé <marc.sune@isovalent.com>
@msune msune force-pushed the bpf_scapy branch 2 times, most recently from 5a62c60 to fa864d6 Compare August 4, 2025 09:02
@msune
Copy link
Member Author

msune commented Aug 4, 2025

Ready to merge:

qmonnet added a commit to qmonnet/cilium that referenced this pull request Aug 4, 2025
It looks like script install-builder-deps.sh has never been used for
generating builder images. Ever since it was introduced in commit
7724ceb ("build: New builder image supporting cross-compilation"),
it's never been called anywhere in the repository. Remove it to avoid
confusion as in cilium#40838.

Link: cilium#40294 (comment)
Signed-off-by: Quentin Monnet <qmo@qmon.net>
@qmonnet
Copy link
Member

qmonnet commented Aug 4, 2025

/test

Checkpatch's failure is due to a warning about the use of trace_printk(), safe to ignore in the current case.

@qmonnet
Copy link
Member

qmonnet commented Aug 4, 2025

@smagnani96 I see you self-requested a review a few days ago, do you want to take another look or are you good with this PR?

@qmonnet qmonnet removed the request for review from ysksuzuki August 4, 2025 10:43
Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like my own review was missing :)

@smagnani96
Copy link
Contributor

@smagnani96 I see you self-requested a review a few days ago, do you want to take another look or are you good with this PR?

I'm reviewing this right now, was waiting for the force-pushes 🙏🏼

Copy link
Contributor

@smagnani96 smagnani96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Would probably create (if not done already) an GH issue to track possible TODOs and enhancements to this (e.g., the cocci script).

Amazing Marc! 🚀

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 4, 2025
@msune
Copy link
Member Author

msune commented Aug 4, 2025

LGTM! Would probably create (if not done already) an GH issue to track possible TODOs and enhancements to this (e.g., the cocci script).

Will do once merged, thx!

@qmonnet qmonnet added this pull request to the merge queue Aug 4, 2025
Merged via the queue into cilium:main with commit 021146c Aug 4, 2025
67 of 68 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Aug 4, 2025
It looks like script install-builder-deps.sh has never been used for
generating builder images. Ever since it was introduced in commit
7724ceb ("build: New builder image supporting cross-compilation"),
it's never been called anywhere in the repository. Remove it to avoid
confusion as in #40838.

Link: #40294 (comment)
Signed-off-by: Quentin Monnet <qmo@qmon.net>
rabelmervin pushed a commit to rabelmervin/cilium that referenced this pull request Aug 18, 2025
Add python3-scapy and python3-jinja2 packages to install-deps script.
This is requirement coming from the BPF unit test scapy support PR
(cilium#40294).

Signed-off-by: Marc Suñé <marc.sune@isovalent.com>
rabelmervin pushed a commit to rabelmervin/cilium that referenced this pull request Aug 18, 2025
Commit bda1e68 added the two dependencies as described in the log
in `images/builder/install-builder-deps.sh`. It turns out this file
is not used (at all) and hence a separate PR has been opened to remove
it entirely.

> Add python3-scapy and python3-jinja2 packages to install-deps script.
> This is requirement coming from the BPF unit test scapy support PR
> (cilium#40294).

This commit adds it as part of the builder `Dockerfile` so that scapy
and jinja2 are.. there :)

Signed-off-by: Marc Suñé <marc.sune@isovalent.com>
rabelmervin pushed a commit to rabelmervin/cilium that referenced this pull request Aug 18, 2025
It looks like script install-builder-deps.sh has never been used for
generating builder images. Ever since it was introduced in commit
7724ceb ("build: New builder image supporting cross-compilation"),
it's never been called anywhere in the repository. Remove it to avoid
confusion as in cilium#40838.

Link: cilium#40294 (comment)
Signed-off-by: Quentin Monnet <qmo@qmon.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/ci This PR makes changes to the CI.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants