Skip to content

Conversation

Apokleos
Copy link
Contributor

@Apokleos Apokleos commented May 7, 2025

runtime-rs: Propagate k8s configs correctly when sharedfs is disabled

In Kubernetes (k8s), while Kata Pods often use virtiofs for injecting
Service Accounts, Secrets, and ConfigMaps, security-sensitive
environments like CoCo disable host-guest sharing. Consequently, when
SharedFs is disabled, we propagate these configurations into the guest
via file copy and bind mount for correct container access.

Fixes #11237

Signed-off-by: alex.lyn alex.lyn@antgroup.com

@Apokleos Apokleos self-assigned this May 7, 2025
@Apokleos Apokleos requested a review from justxuewei May 7, 2025 08:40
@katacontainersbot katacontainersbot added the size/large Task of significant size label May 7, 2025
@Apokleos Apokleos changed the title Copydir runtime-rs: Propagate k8s configs correctly when sharedfs is disabled May 7, 2025
@stevenhorsman
Copy link
Member

Hi @Apokleos - thanks for this update. Just to check I understand the scope of this

I'm just trying to confirm how much of the gap this closes, if the watcher is a separate PR, then that's not a problem for me. Thanks!

@Apokleos
Copy link
Contributor Author

Apokleos commented May 7, 2025

Hi @Apokleos - thanks for this update. Just to check I understand the scope of this

  • this PR covers doing the initial recursive copy of directories when shared_fs = none

yes

  • It doesn't set up a watcher to continue to propogate those changes to the guest if they change on the host side?

Yes, you're right. And I'm also thinking this point to do watch on the files/dir changes, as you have pointed it out, I believe I have to make it work with one more commit which belongs to this PR.

I think I'm working on the same functions as the code you list. More comments will be added to make it more clear.

I'm just trying to confirm how much of the gap this closes, if the watcher is a separate PR, then that's not a problem for me. Thanks!

Appreciated Steven @stevenhorsman I tend to introduce one more commits to make it closer to runtime-go.

@stevenhorsman
Copy link
Member

Appreciated Steven @stevenhorsman I tend to introduce one more commits to make it closer to runtime-go.

Great - thanks for the clarification. I think this PR is still good, so am happy to see it go in, I just wanted to check I'd understand the limits/scope of it.

@katacontainersbot katacontainersbot added size/huge Largest and most complex task (probably needs breaking into small pieces) and removed size/large Task of significant size labels May 9, 2025
@Apokleos
Copy link
Contributor Author

Appreciated Steven @stevenhorsman I tend to introduce one more commits to make it closer to runtime-go.

Great - thanks for the clarification. I think this PR is still good, so am happy to see it go in, I just wanted to check I'd understand the limits/scope of it.

Hi @stevenhorsman PR updated, PTAL! Thx.

// multiple times in a short period; we only execute the last one.
if let Some(t) = last_event_time {
if Instant::now().duration_since(t) > DEBOUNCE_TIME && *need_sync.lock().await {
info!(sl!(), "debonce handle copyfile {:?} -> {:?}", &src, &dst);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: debonce -> debounce

{
error!(
sl!(),
"debonce handle copyfile {:?} -> {:?} failed with error: {:?}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@stevenhorsman
Copy link
Member

Hi @stevenhorsman PR updated, PTAL! Thx.

Thanks for the update. Is it just the identification of the Kubernetes files we want to propogate left now for this feature?

@Apokleos Apokleos force-pushed the copydir branch 2 times, most recently from 92a1015 to 371dc34 Compare May 15, 2025 09:43
@Apokleos
Copy link
Contributor Author

Hi @stevenhorsman PR updated, PTAL! Thx.

Thanks for the update. Is it just the identification of the Kubernetes files we want to propogate left now for this feature?

Appreciate @stevenhorsman I have updated it with volume copy whitelist check added to restrict directory copying limited in /var/lib/kubelet/pods/<uid>/volumes/{kubernetes.io~configmap, kubernetes.io~secret, kubernetes.io~downward-api, kubernetes.io~projected}

/// This function is used to check whether a given volume is in the allowed copy whitelist.
/// More specifically, it determines whether the volume's path is located under a predefined
/// list of allowed copy directories.
pub(crate) fn is_whitelisted_copy_volume(source_path: &PathBuf) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To try and comply with the Inclusive Naming Initiative which the CNCF created, could we rename whitelist to allowlist here? https://inclusivenaming.org/word-lists/tier-1/whitelist/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! definitly agree

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, PTAL

Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good to me, I don't think we have a way to test that here, but that's true for the go-runtime as well (until we implement #8015 that I've just remembered), so I don't think that should be a blocker. Thanks for the changes @Apokleos

Apokleos added 6 commits May 20, 2025 16:55
Add async directory traversal using BFS algorithm:
(1) Support file type handling:
Regular files (S_IFREG) with content streaming;
Directories (S_IFDIR) with mode preservation;
Symbolic links (S_IFLNK) with target recreation;
(2) Maintain POSIX metadata:
UID/GID preservation,File mode bits, and Directory permissions
(3) Implement async I/O operations for:
Directory enumeration, file reading, symlink target resolution

Fixes kata-containers#11237

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
In Kubernetes (k8s), while Kata Pods often use virtiofs for injecting
Service Accounts, Secrets, and ConfigMaps, security-sensitive
environments like CoCo disable host-guest sharing. Consequently, when
SharedFs is disabled, we propagate these configurations into the guest
via file copy and bind mount for correct container access.

Fixes kata-containers#11237

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Introduce event-driven file sync mechanism between host and guest when
sharedfs is disabled, which will help monitor the host path in time and
do sync files changes:

1. Introduce FsWatcher to monitor directory changes via inotify;
2. Support recursive watching with configurable filters;
3. Add debounce logic (default 500ms cooldown) to handle burst events;
4. Trigger `copy_dir_recursively` on stable state;
5. Handle CREATE/MODIFY/DELETE/MOVED/CLOSE_WRITE events;

Fixes kata-containers#11237

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
When synchronizing file changes on the host, a "symlink AlreadyExists"
issue occurs, primarily due to improper handling of symbolic links
(symlinks). Additionally, there are other related problems.

This patch will try to address these problems.
(1) Handle symlink target existence (files, dirs, symlinks) during host file
    sync. Use appropriate removal methods (unlink, remove_file, remove_dir_all).
(2) Enhance temporary file handling for safer operations and implement truncate
    only at offset 0 for resume support.
(3) Set permissions and ownership for parent directories.
(4) Check and clean target path for regular files before rename.

Fixes kata-containers#11237

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
For security reasons, we have restricted directory copying.

Introduces the `is_allowlisted_copy_volume` function to verify
if a given volume path is present in an allowed copy directory.
This enhances security by ensuring only permitted volumes are
copied

Currently, only directories under the path
`/var/lib/kubelet/pods/<uid>/volumes/{kubernetes.io~configmap,
kubernetes.io~secret, kubernetes.io~downward-api,
kubernetes.io~projected}` are allowed to be copied into the
guest. Copying of other directories will be prohibited.

Fixes kata-containers#11237

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Copy link
Member

@fidencio fidencio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's get this one in.
thanks @Apokleos!

@fidencio fidencio merged commit d3f81ec into kata-containers:main May 27, 2025
331 of 351 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-test size/huge Largest and most complex task (probably needs breaking into small pieces)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

runtime-rs: Propagate secrets, config maps, and service accounts correctly when sharedfs is disabled
4 participants