sampleyang
Contributor

Some NVIDIA GPUs have a PCI address whose domain is 0001,
but the runtime currently assumes the default domain 0000 in front of the BDF (bus:device.function).
This causes address errors during device initialization and address conflicts during device registration.
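
To make the problem concrete, here is a minimal sketch of parsing a PCI address that keeps the domain instead of assuming 0000. The `PciAddress` type and `parse_pci_address` function are hypothetical names for illustration; they are not taken from the runtime-rs code in this PR.

```rust
/// Parsed PCI address in `domain:bus:device.function` form.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq, Eq)]
struct PciAddress {
    domain: u16,
    bus: u8,
    device: u8,
    function: u8,
}

/// Parse either a short `bus:device.function` string or a full
/// `domain:bus:device.function` string. A missing domain defaults to 0.
fn parse_pci_address(addr: &str) -> Option<PciAddress> {
    let parts: Vec<&str> = addr.split(':').collect();
    let (domain, bus, dev_fn) = match parts.as_slice() {
        // Short form: no domain given, fall back to 0000.
        [bus, dev_fn] => ("0000", *bus, *dev_fn),
        // Full form: keep the domain exactly as written.
        [domain, bus, dev_fn] => (*domain, *bus, *dev_fn),
        _ => return None,
    };
    let (device, function) = dev_fn.split_once('.')?;
    let device = u8::from_str_radix(device, 16).ok()?;
    let function = u8::from_str_radix(function, 16).ok()?;
    // PCI allows 32 devices per bus and 8 functions per device.
    if device > 0x1f || function > 0x7 {
        return None;
    }
    Some(PciAddress {
        domain: u16::from_str_radix(domain, 16).ok()?,
        bus: u8::from_str_radix(bus, 16).ok()?,
        device,
        function,
    })
}
```

With this sketch, `parse_pci_address("0001:41:00.0")` and `parse_pci_address("41:00.0")` yield values that differ only in `domain`, so two devices sharing a BDF in different domains are not mistaken for one another.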

@katacontainersbot katacontainersbot added the size/small Small and simple task label May 9, 2025
@Apokleos Apokleos changed the title from "runtime: fix vfio pci address domain 0001 problem" to "runtime-rs: fix vfio pci address domain 0001 problem" May 9, 2025
@Apokleos Apokleos requested review from lifupan and Apokleos May 9, 2025 09:55
@zvonkok zvonkok self-assigned this May 9, 2025
@zvonkok
Contributor

zvonkok commented May 9, 2025

For NUMA I need similar functionality; let me cross-check with my POC and see if we can find a common denominator.

@zvonkok zvonkok added the area/gpu Issues specific to GPU/PCIe label May 9, 2025
@zvonkok
Contributor

zvonkok commented May 9, 2025

Was this tested with Dragonball?

@Apokleos
Contributor

Apokleos commented May 9, 2025

Was this tested with Dragonball?

As far as I know, it might be CLH (Cloud Hypervisor).

@Apokleos Apokleos requested a review from zvonkok May 9, 2025 15:12
@sampleyang
Contributor Author

Was this tested with Dragonball?

As far as I know, it might be CLH (Cloud Hypervisor).

@zvonkok @Apokleos Yes, it was tested with cloud-hypervisor.

Contributor

@Apokleos Apokleos left a comment

Thx @sampleyang LGTM

@lifupan
Member

lifupan commented May 22, 2025

@sampleyang

Please run rustfmt on your code.

@sampleyang
Contributor Author

@sampleyang

Please run rustfmt on your code.

done

Comment on lines 697 to 699
let Some(domain_part) = parts.first() else {
    return None;
};
Member

Clippy is not happy with this construct. Would you mind running make check under src/runtime-rs to fix the clippy warnings? 🙂

Contributor Author

done
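
The review thread does not say which lint fired. A likely candidate (an assumption, not confirmed by the source) is Clippy's `question_mark` lint, which suggests replacing a `let ... else { return None; }` with the `?` operator. A minimal sketch, using a hypothetical `domain_of` helper rather than the actual function from the PR:

```rust
/// Hypothetical helper: return the first (domain) field of an
/// already-split PCI address, or `None` if the slice is empty.
fn domain_of<'a>(parts: &[&'a str]) -> Option<&'a str> {
    // Same behavior as:
    //   let Some(domain_part) = parts.first() else { return None; };
    // but written with `?`, which returns `None` early when `first()` is `None`.
    let domain_part = parts.first()?;
    Some(*domain_part)
}
```

Both forms behave identically; the `?` form is just shorter, which is why a linter would prefer it.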

Some NVIDIA GPUs have a PCI address whose domain is 0001,
but the runtime currently assumes the default domain 0000 in front of the BDF,
which causes address errors during device initialization
and address conflicts during device registration.

Fixes kata-containers#11252

Signed-off-by: yangsong <yunya.ys@antgroup.com>
@lifupan lifupan merged commit e9b4512 into kata-containers:main May 23, 2025
329 of 351 checks passed
Labels
area/gpu Issues specific to GPU/PCIe ok-to-test size/small Small and simple task
6 participants