runtime-rs: Introduce PCIe Port devices in runtime-rs for qemu-rs #10578
6d29a59
to
4b1b6d2
Compare
There are per-hypervisor limits on how many root/switch ports can be created. We need to make sure that we do not exceed those limits, otherwise we get cryptic memory errors. For QEMU the limit is 16 root ports or 16 switch ports: there is 64KB of IO memory in total, and each root or switch port takes up 4K of IO memory by default.
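To make the arithmetic in this comment concrete, here is a minimal sketch of a pre-flight check, assuming the 64KB total and the 4K per-port default quoted above. The constant names and the `check_port_count` helper are hypothetical and are not part of runtime-rs.

```rust
/// Total IO space assumed to be available to PCIe ports (64KB, as quoted above).
const TOTAL_IO_SPACE: u64 = 64 * 1024;
/// Default IO window assumed to be reserved by each root or switch port (4K).
const IO_PER_PORT: u64 = 4 * 1024;

/// Number of ports that fit into the IO space: 64KB / 4K = 16.
pub const MAX_PORTS: u64 = TOTAL_IO_SPACE / IO_PER_PORT;

/// Hypothetical validation helper: rejects a request that would exceed the
/// limit, so the user sees a readable error instead of a cryptic memory error.
pub fn check_port_count(kind: &str, requested: u64) -> Result<(), String> {
    if requested > MAX_PORTS {
        return Err(format!(
            "requested {} {} ports, but at most {} are supported",
            requested, kind, MAX_PORTS
        ));
    }
    Ok(())
}
```

A caller could run `check_port_count("root", n)` while validating the configuration, before any port device is created.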
@BbolroC One outstanding thing to clarify is whether we can attach VFIO-AP devices to a root-port. If so, we can drop the support for legacy bridges altogether.
Yes, sir. Without KEP-4113 support, we would need to introduce a lot of related code to do this work, which seems hacky to me. I have opened another PR for getting the BAR max memory, and I am looking forward to your input @zvonkok @BbolroC
12ffcc8
to
31a665b
Compare
I have tested it with the related PRs and got the switch port output below: configuration
get the root ports
the shell command configuration
|
31a665b
to
87b9ebb
Compare
Please take a look at the following:
Based on my understanding so far, there is no PCI support in s390x QEMU (so no concept like a |
|
@Apokleos For the switch-port use-case I need to post a QEMU patch. Can you check if the BARs are correctly mapped?
|
Thx Choi @BbolroC, I'd like to know whether there is any work I can do in my PR that would help your upcoming work. If you have any ideas, please let me know.
Sure. The upcoming updated PR will remove this option.
@Apokleos I am wondering how we're dealing with different archs. In the go-runtime we have different implementations for different architectures, like x86 and s390x. How are we dealing with this in runtime-rs?
/// Represents an available node in the PCIe topology.
#[derive(Clone, Debug)]
pub enum AvailabledNode {
Should this be AvailableNode?
Thx Pavel. I think it should not be here for now. I need to remove this data structure, as nothing references it in this PR. I will add it back in a related PR (vfio device hotplug).
Thanks @Apokleos for this huge piece of work in a well-formatted PR! :-) I'm approving but frankly I'd prefer if you could deal with some of the remaining suggestions by @gkurz and @BbolroC either by incorporating them and resolving the pertinent conversations, or by documenting your reasons why not to incorporate them.
For clarity, I've resolved those of my conversations that you've dealt with.
Thx Pavel @pmores. |
5fd6ef1
to
bf0dbc3
Compare
I'd like to create an issue to do this work
I think we should address it in another PR. Ahead of that, I will open an issue to track it.
Thanks for the update. I am fine with merging this. 👍 |
Hey @Apokleos , I'd still be in favour of mentioning the root/switch port counts limitation in the config file template but I'm fine merging this with or without that change. Thanks! |
@Apokleos What happens if a VFIO device is hot-plugged to a root-port, the workload fails to start, and the container is restarted? Is the root-port freed up? Can it be reused once the container restarts?
@zvonkok I don't think this PR deals with hotplugging at all. |
Thx Pavel @pmores Secondly, it's aimed at a simpler PCIe topology that could still support advanced features for devices like GPUs. That is the reason for the limitation that only one port device type (RootPort or SwitchPort) exists in a pod/VM. I also spent some time considering if there were scenarios requiring both a dedicated However, I'm certainly open to making adjustments in future implementations if such scenarios do arise.
Thx @zvonkok This PR does not involve vfio device hotplugging; the following one will. But IMO, it's a point worth discussing. As I understand it, continuously hot-plugging devices into and out of a VM isn't very reliable at the moment, so we might need to have a deep discussion about it based on the PR #10362.
(1) Introduce new field `pcie_switch_port` for switch ports. (2) Add related checks in the VMMs (dragonball, qemu). Fixes kata-containers#10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Support setting switch ports with an annotation or configuration.toml. Fixes kata-containers#10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
This commit introduces an implementation for managing PCIe topologies, focusing on the relationship between Root Ports and Switch Ports. The design supports two strategies for generating Switch Ports. Taking a requirement of 4 switch ports as an example, the two possible solutions are:
(1) Single Root Port + Single PCIe Switch: uses 1 Root Port and 1 Switch with 4 Downstream Ports.
(2) Multiple Root Ports + Multiple PCIe Switches: uses 2 Root Ports and 2 Switches, each with 2 Downstream Ports.
The recommended strategy is Option 1 due to its simplicity, efficiency, and scalability. The implementation includes data structures (PcieTopology, RootPort, PcieSwitch, SwitchPort) and operations (add_pcie_root_port, add_switch_to_root_port, add_switch_port_to_switch) to manage the topology effectively. Fixes kata-containers#10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
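The relationship described in this commit message can be sketched roughly as follows. This is an illustration only: the type and method names are taken from the message, but every field, signature, and error type here is an assumption, as are the `switch_port_count` and `single_switch_topology` helpers; the actual code in the PR may look quite different.

```rust
use std::collections::BTreeMap;

/// A downstream port of a PCIe switch (the `id` field is an assumption).
#[derive(Debug, Default)]
pub struct SwitchPort {
    pub id: u32,
}

/// A PCIe switch holding its downstream ports.
#[derive(Debug, Default)]
pub struct PcieSwitch {
    pub ports: Vec<SwitchPort>,
}

/// A root port that may have at most one switch attached in this sketch.
#[derive(Debug, Default)]
pub struct RootPort {
    pub switch: Option<PcieSwitch>,
}

/// The whole topology, keyed by root port id.
#[derive(Debug, Default)]
pub struct PcieTopology {
    pub root_ports: BTreeMap<u32, RootPort>,
}

impl PcieTopology {
    pub fn add_pcie_root_port(&mut self, id: u32) -> Result<(), String> {
        if self.root_ports.contains_key(&id) {
            return Err(format!("root port {} already exists", id));
        }
        self.root_ports.insert(id, RootPort::default());
        Ok(())
    }

    pub fn add_switch_to_root_port(&mut self, root_id: u32) -> Result<(), String> {
        let root = self
            .root_ports
            .get_mut(&root_id)
            .ok_or_else(|| format!("no root port {}", root_id))?;
        if root.switch.is_some() {
            return Err(format!("root port {} already has a switch", root_id));
        }
        root.switch = Some(PcieSwitch::default());
        Ok(())
    }

    pub fn add_switch_port_to_switch(&mut self, root_id: u32, port_id: u32) -> Result<(), String> {
        let switch = self
            .root_ports
            .get_mut(&root_id)
            .and_then(|root| root.switch.as_mut())
            .ok_or_else(|| format!("no switch under root port {}", root_id))?;
        switch.ports.push(SwitchPort { id: port_id });
        Ok(())
    }

    /// Hypothetical helper: total number of switch ports in the topology.
    pub fn switch_port_count(&self) -> usize {
        self.root_ports
            .values()
            .filter_map(|root| root.switch.as_ref())
            .map(|switch| switch.ports.len())
            .sum()
    }
}

/// Hypothetical helper building Option 1: one root port, one switch, `n` downstream ports.
pub fn single_switch_topology(n: u32) -> Result<PcieTopology, String> {
    let mut topology = PcieTopology::default();
    topology.add_pcie_root_port(0)?;
    topology.add_switch_to_root_port(0)?;
    for port_id in 0..n {
        topology.add_switch_port_to_switch(0, port_id)?;
    }
    Ok(topology)
}
```

With this shape, `single_switch_topology(4)` yields the layout of Option 1, while Option 2 would call `add_pcie_root_port` and `add_switch_to_root_port` twice and spread the ports over both switches.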
PortDevice is for handling root ports or switch ports in the PCIe topology. It makes it easy to pass the root port/switch port information when creating a VM that requires PCIe devices. Fixes kata-containers#10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Fixes kata-containers#10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
A new resource type `PortDevice` is introduced which is dedicated for handling root ports/switch ports during sandbox creation(VM). Fixes kata-containers#10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Prepare pcie port devices before starting VM with the help of device manager and PCIe Topology. Fixes kata-containers#10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Some data structures and methods are introduced to help handle vfio devices. The methods add_pcie_root_ports and add_pcie_switch_ports follow the runtime's related implementations for vfio devices. Fixes kata-containers#10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Extract PortDevice relevant information, and then invoke different processing methods based on the device type. Fixes kata-containers#10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
bf0dbc3
to
0753352
Compare
To properly support vfio devices on PCIe root ports or switch ports in qemu-rs, we introduce port devices containing RootPort and SwitchPort, mostly aligned with the implementation in kata-runtime.