
Conversation

@dylandreimerink dylandreimerink commented Nov 1, 2023

The proposal for a new version of the eBPF instruction set has been accepted and merged into both LLVM 18 and Linux kernel v6.6, so we might be seeing these new instructions in the near future.

This PR adds support for these new instructions. Before this PR we could still parse the instructions but did not recognize the variations added:

myfunc:
         0: SwapLE dst: r1 imm: 16
         1: SwapLE dst: r2 imm: 32
         2: SwapLE dst: r3 imm: 64
         3: LdXMode(128)B 
         4: LdXMode(128)H 
         5: LdXMode(128)W 
         6: LdXMode(128)B 
         7: LdXMode(128)H 
         8: MovReg dst: r1 src: r4
         9: MovReg dst: r2 src: r5
        10: MovReg dst: r3 src: r6
        11: Mov32Reg dst: r1 src: r3
        12: Mov32Reg dst: r2 src: r4
        13: InvalidJumpOp32Imm dst: r0 off: 0 imm: 4
        14: DivReg dst: r1 src: r3
        15: ModReg dst: r2 src: r4
        16: Div32Reg dst: r1 src: r3
        17: Mod32Reg dst: r2 src: r4

After this PR we correctly understand the new instructions:

myfunc:
        0: BSwap16 dst: r1 
        1: BSwap32 dst: r2 
        2: BSwap64 dst: r3 
        3: LdXMemSXB dst: r1 src: r4 off: 0 imm: 0
        4: LdXMemSXH dst: r2 src: r5 off: 4 imm: 0
        5: LdXMemSXW dst: r3 src: r6 off: 8 imm: 0
        6: LdXMemSXB dst: r1 src: r4 off: 0 imm: 0
        7: LdXMemSXH dst: r2 src: r5 off: 4 imm: 0
        8: SMov8Reg dst: r1 src: r4
        9: SMov16Reg dst: r2 src: r5
        10: SMov32Reg dst: r3 src: r6
        11: SMov8Reg32 dst: r1 src: r3
        12: SMov16Reg32 dst: r2 src: r4
        13: Ja32 imm: 4
        14: SDivReg dst: r1 src: r3
        15: SModReg dst: r2 src: r4
        16: SDivReg32 dst: r1 src: r3
        17: SModReg32 dst: r2 src: r4

Fixes: #1188

@dylandreimerink dylandreimerink requested a review from lmb November 1, 2023 08:01
@dylandreimerink dylandreimerink force-pushed the feature/isa-v4 branch 2 times, most recently from 658f785 to 4944213 on November 1, 2023 08:11

@lmb lmb left a comment

Thanks for putting this together. I think that my main concern with this approach is that it introduces so many special cases. A lot of the getters on OpCode don't actually work correctly for extended opcodes, something we've not tackled with this at all (documentation- or code-wise).

So far OpCode is a 1:1 representation of the BPF wire format. This made sense since not all of the space was allocated, which is important for InvalidOpCode to function correctly. Now that pretty much all of the space is gone, there is another option we can try: divorce the representation of OpCode from the wire format. I'm thinking something along these lines:

diff --git a/asm/opcode.go b/asm/opcode.go
index 845c5521..04d1d6cf 100644
--- a/asm/opcode.go
+++ b/asm/opcode.go
@@ -66,18 +66,33 @@ func (cls Class) isJumpOrALU() bool {
 	return cls.IsJump() || cls.IsALU()
 }
 
-// OpCode is a packed eBPF opcode.
+// OpCode represents a single operation.
 //
-// Its encoding is defined by a Class value:
+// It is not a 1:1 mapping to real eBPF opcodes.
 //
-//	msb      lsb
-//	+----+-+---+
-//	| ???? |CLS|
-//	+----+-+---+
-type OpCode uint8
+// The encoding varies based on a 3-bit Class:
+//
+//	7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+//	           ???           | CLS
+//
+// For ALUClass and ALU64Class:
+//
+//	7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+//	          OPC          |S| CLS
+//
+// For LdClass, LdXClass, StClass and StXClass:
+//
+//	7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+//	       0       | MDE |SIZ| CLS
+//
+// For JumpClass, Jump32Class:
+//
+//	7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+//	       0       |  OPC  |S| CLS
+type OpCode uint16
 
 // InvalidOpCode is returned by setters on OpCode
-const InvalidOpCode OpCode = 0xff
+const InvalidOpCode OpCode = 0xffff
 
 // rawInstructions returns the number of BPF instructions required
 // to encode this opcode.

Basically, make OpCode wider so that we can represent all the logical opcodes. For simplicity we'd also change the various *Op to be uint16. We'd then handle converting to and from OpCode in Instruction.Marshal and Unmarshal, similar to what we do for dword loads. This way ExtendedALUOp is not required anymore and SMod / SDiv can be ALUOp. Finally we can also guarantee that InvalidOpCode really is invalid.
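
To make the translation concrete, here is a minimal self-contained sketch of the idea; the `signedBit` layout and helper names are illustrative assumptions, not the final implementation:

```go
package main

import "fmt"

// OpCode is a logical opcode, wider than the 8-bit wire encoding.
// Sketch only: bit 8 marks the signed div/mod variants, which the
// kernel encodes via offset == 1 rather than a distinct opcode.
type OpCode uint16

const (
	wireMask  OpCode = 0x00ff // bits shared with the wire format
	signedBit OpCode = 1 << 8 // logical-only bit, never hits the wire
)

// unmarshalOpCode folds the wire opcode and the offset-based variant
// selector into one logical OpCode. Real code would also verify that
// the opcode is actually a div/mod before consuming the offset.
func unmarshalOpCode(wire byte, offset int16) OpCode {
	op := OpCode(wire)
	if offset == 1 { // signed div/mod use offset == 1 on the wire
		op |= signedBit
	}
	return op
}

// marshalOpCode reverses the translation back to the wire format.
func marshalOpCode(op OpCode) (wire byte, offset int16) {
	if op&signedBit != 0 {
		offset = 1
	}
	return byte(op & wireMask), offset
}

func main() {
	// 0x3f is BPF_ALU64|BPF_X|BPF_DIV; offset 1 turns it into SDIV.
	op := unmarshalOpCode(0x3f, 1)
	wire, off := marshalOpCode(op)
	fmt.Printf("logical %#x -> wire %#x offset %d\n", op, wire, off)
}
```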

This is basically free at runtime since we don't increase the size of Instruction. I even think that we can arrange it so that existing users won't break (besides having to throw in a uint16 cast maybe).

What do you think?

@dylandreimerink dylandreimerink (Member, Author) commented:

> Basically, make OpCode wider so that we can represent all the logical opcodes. For simplicity we'd also change the various *Op to be uint16. We'd then handle converting to and from OpCode in Instruction.Marshal and Unmarshal, similar to what we do for dword loads. This way ExtendedALUOp is not required anymore and SMod / SDiv can be ALUOp. Finally we can also guarantee that InvalidOpCode really is invalid.
>
> This is basically free at runtime since we don't increase the size of Instruction. I even think that we can arrange it so that existing users won't break (besides having to throw in a uint16 cast maybe).
>
> What do you think?

Yes, that feels less hacky than what I have so far 👍

@dylandreimerink dylandreimerink force-pushed the feature/isa-v4 branch 3 times, most recently from f42bdd7 to a25a9bb on November 6, 2023 10:09
@dylandreimerink dylandreimerink requested a review from lmb November 6, 2023 10:16
ISAv4 added signed division and modulo instructions. These instructions
have the same opcode as the unsigned division and modulo instructions,
but with the offset set to 1. This messes with the way we have been
defining instructions so far.

To work around this we add a new `ExtendedALUOp` type with the SDiv and
SMod instructions. This type has the same methods as the normal `ALUOp`,
but they generate the signed instructions.

The printing logic has been updated to take the offset into account, so
the output appears symmetric in most ways. The exception is that signed
operations still result in `ins.op.ALUOp() == Div/Mod`, while you would
use `SDiv/SMod` to generate the correct instruction.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
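
Per the review discussion above, after the redesign SDiv/SMod can live as plain `ALUOp` values. Assuming that shape (and the existing `ALUOp.Reg` convention), emitting the signed operations might look like this:

```go
package main

import (
	"fmt"

	"github.com/cilium/ebpf/asm"
)

func main() {
	// Assumes asm.SDiv/asm.SMod exist as ALUOp values after the
	// redesign discussed above; not verified against the final API.
	insns := asm.Instructions{
		asm.SDiv.Reg(asm.R1, asm.R3), // r1 = (s64)r1 / (s64)r3
		asm.SMod.Reg(asm.R2, asm.R4), // r2 = (s64)r2 % (s64)r4
		asm.Return(),
	}
	fmt.Println(insns)
}
```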
This commit adds the BSwap instruction; it can be emitted with the
`BSwap` function, similar to `ToHost`. BSwap is generated by using
the ALU64 class instead of the plain ALU class.

The opcode and instruction string generation have been updated to
recognize this and output the correct instruction when printing.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
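
A usage sketch, assuming `BSwap` takes a destination register and a size as the commit message suggests:

```go
package main

import (
	"fmt"

	"github.com/cilium/ebpf/asm"
)

func main() {
	// BSwap(dst, size) per the commit message above: an unconditional
	// byte swap, encoded with the ALU64 class instead of plain ALU.
	insns := asm.Instructions{
		asm.BSwap(asm.R1, asm.Half),  // swap the low 16 bits of r1
		asm.BSwap(asm.R2, asm.Word),  // swap the low 32 bits of r2
		asm.BSwap(asm.R3, asm.DWord), // swap all 64 bits of r3
		asm.Return(),
	}
	fmt.Println(insns)
}
```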
ISAv4 adds sign-extended move instructions, in both 64-bit and 32-bit
variants. This commit adds support for these instructions. These
instructions take two registers and a size, so they don't fit the
existing `Reg/Imm` methods. Additionally, they have the same opcode
as their normal variants, just a non-zero offset.

So two new functions were added to generate these instructions. The
instruction string generation has also been updated to mark
sign-extended mov instructions (e.g. `SMov8Reg`).

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
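
The commit doesn't name the two helpers, so `MovSX` and `MovSX32` below are hypothetical placeholders for the 64-bit and 32-bit sign-extending moves:

```go
package main

import (
	"fmt"

	"github.com/cilium/ebpf/asm"
)

func main() {
	// MovSX/MovSX32 are hypothetical names; per the commit message the
	// helpers take two registers plus the source width to extend from.
	insns := asm.Instructions{
		asm.MovSX(asm.R1, asm.R4, asm.Byte),   // r1 = (s64)(s8)r4
		asm.MovSX32(asm.R1, asm.R3, asm.Half), // w1 = (s32)(s16)w3
		asm.Return(),
	}
	fmt.Println(insns)
}
```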
ISAv4 adds a new long jump instruction, which can jump to offsets
between [-2^31, 2^31 - 1] instead of the normal [-32768, 32767].
This new instruction is BPF_JMP32 + BPF_JA, which was previously
not allowed. It uses the IMM/Constant field to store the jump offset,
like calls do.

This commit adds a function to emit the instruction, updates the string
generation to recognize the new instruction, and changes the
reference-based offset patching logic to patch this instruction
correctly.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
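
A hedged sketch of the raw encoding: the JMP32 class with the JA op, and the jump distance carried in the Constant (imm) field instead of Offset. The emit helper this commit adds may be named differently:

```go
package main

import (
	"fmt"

	"github.com/cilium/ebpf/asm"
)

func main() {
	// Long jump: BPF_JMP32|BPF_JA with the distance in imm, which
	// extends the range from +-32k instructions to roughly +-2^31.
	// Constructed by hand here for illustration.
	longJump := asm.Instruction{
		OpCode:   asm.OpCode(asm.Jump32Class).SetJumpOp(asm.Ja),
		Constant: 4, // jump four instructions forward
	}
	fmt.Println(longJump)
}
```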

@lmb lmb left a comment

Two questions, otherwise LGTM.

@dylandreimerink dylandreimerink force-pushed the feature/isa-v4 branch 2 times, most recently from cc972c1 to f625a97 on November 8, 2023 09:50
ISAv4 adds a new instruction that loads a value from memory and sign
extends it, using a new mode. This commit adds a new function to emit
this instruction and the logic to print it nicely.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
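
A usage sketch, assuming the new helper mirrors the existing `LoadMem` signature; the exact name is not confirmed here:

```go
package main

import (
	"fmt"

	"github.com/cilium/ebpf/asm"
)

func main() {
	// Assumed helper, mirroring asm.LoadMem(dst, base, offset, size):
	// load one byte from r4+0 and sign-extend it into r1.
	ins := asm.LoadMemSX(asm.R1, asm.R4, 0, asm.Byte)
	fmt.Println(ins)
}
```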

@lmb lmb left a comment

Renamed LoadSXMemOp to LoadMemSXOp

@lmb lmb merged commit e7260f4 into cilium:main Nov 8, 2023

lmb commented Nov 8, 2023

Thanks!


lmb commented Nov 8, 2023

I managed to force-push an old version, I think x( Fixes incoming.

dylandreimerink added a commit that referenced this pull request Apr 17, 2025
As it stands, we only had proper handling of atomic add operations:
`lock *(u32 *)(r1 + 0x1) += w2` was the only instruction that was
properly handled. Atomic operations all share the same opcode and are
differentiated by the imm value. So far we have not been looking at
the imm value, so all atomic operations look like atomic adds to us.

This commit adds decoding for all current atomic operations. We handle
them similarly to how we handle ISAv4 instructions, which use the
offset to further specify the instruction.

In #1193 we expanded the opcode from a u8 to a u16, which is bigger
than the actual wire size. This allowed us to represent functionally
different opcodes in Go, even though the kernel distinguishes them via
other bits of the instruction. So our opcode in Go is not identical to
the opcode in the kernel; we translate during marshaling and
unmarshaling.

So far, we have only needed a few additional bits, but the atomic ops
need 9 bits of imm to fully encode all possibilities. Since 9 + 8 > 16,
we have to grow the opcode to 32 bits.

During unmarshaling, we simply take the lower 9 bits of the imm, shift
them left by 16 bits, and OR them into the opcode. During marshaling
this process is reversed.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
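
A self-contained sketch of the packing described above; the shift and mask follow the description, while the type and function names are illustrative:

```go
package main

import "fmt"

// OpCode grew to 32 bits: the low 16 bits hold the logical opcode
// introduced in #1193, and 9 more bits hold the imm bits that select
// the atomic operation.
type OpCode uint32

const atomicMask int32 = 0x1ff // lower 9 imm bits select the atomic op

// unmarshalAtomic folds the atomic selector from imm into the logical
// opcode: take the lower 9 bits of imm, shift left by 16, OR them in.
func unmarshalAtomic(logical uint16, imm int32) OpCode {
	return OpCode(logical) | OpCode(imm&atomicMask)<<16
}

// marshalAtomic reverses the process for the wire format.
func marshalAtomic(op OpCode) (logical uint16, imm int32) {
	return uint16(op), int32(op>>16) & atomicMask
}

func main() {
	// 0xc3 is BPF_STX|BPF_ATOMIC|BPF_W; imm 0xe1 selects BPF_XCHG.
	op := unmarshalAtomic(0xc3, 0xe1)
	logical, imm := marshalAtomic(op)
	fmt.Printf("opcode %#x -> wire %#x imm %#x\n", op, logical, imm)
}
```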
dylandreimerink added a commit that referenced this pull request Apr 17, 2025
dylandreimerink added a commit that referenced this pull request Apr 22, 2025
dylandreimerink added a commit that referenced this pull request Apr 24, 2025
dylandreimerink added a commit that referenced this pull request Apr 24, 2025
dylandreimerink added a commit that referenced this pull request Apr 29, 2025
lmb added a commit that referenced this pull request Apr 30, 2025