Skip to content

[daemon] trigger on_restart for a running VM after daemon boot #4007

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 2, 2025

Conversation

xmkg
Copy link
Member

@xmkg xmkg commented Mar 26, 2025

on_restart re-initializes some daemon-side resources needed for the VM functionality, such as non-backend mounts like sshfs. this patch ensures that on_restart is executed for the VM's that are already running, or starting.

Closes: #4005

MULTI-1908

@xmkg xmkg requested a review from georgeliao March 26, 2025 18:15
@xmkg xmkg force-pushed the bugfix/resync-running-vms-on-daemon-restart branch from 8b170e9 to 096cb66 Compare March 26, 2025 18:19
Copy link

codecov bot commented Mar 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.55%. Comparing base (392ba43) to head (9f477bb).
⚠️ Report is 3410 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4007      +/-   ##
==========================================
+ Coverage   89.52%   89.55%   +0.03%     
==========================================
  Files         260      260              
  Lines       14749    14757       +8     
==========================================
+ Hits        13204    13216      +12     
+ Misses       1545     1541       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@xmkg xmkg force-pushed the bugfix/resync-running-vms-on-daemon-restart branch 3 times, most recently from b8909d3 to e9f9a95 Compare March 28, 2025 22:06
@ricab
Copy link
Collaborator

ricab commented Apr 4, 2025

Hey @xmkg, is this still valid and is it ready for review?

@xmkg
Copy link
Member Author

xmkg commented Apr 4, 2025

Hey @xmkg, is this still valid and is it ready for review?

Hey @ricab, yes it's still valid but I need to take a look at the private side of the things. Feel free to review this one.

@ricab ricab requested a review from Sploder12 April 4, 2025 15:16
Copy link
Contributor

@Sploder12 Sploder12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Just a couple questions and a nitpick.

@xmkg
Copy link
Member Author

xmkg commented Apr 7, 2025

The MP #4010 has to be merged before this one.

@georgeliao
Copy link
Contributor

georgeliao commented Apr 29, 2025

Hi @xmkg
Thanks for the work. Basically, the change is about adding on_restart(name) call to e_state::running and e_state::starting cases. The unit tests also nicely covered different branches. However, I have a wondering about the real scenario for the new branch. The virtual machine class destructor always suspend the vm if it is running. So upon the server restart, the original branch is called if the original vm state is running. On the other side, if the original vm state is stopped or suspended, the daemon start should restore them to stop or suspended state. After that, a multipass start <vm_name> is needed to get them running, and the Daemon::startfunction call probably contains the steps in the on_restart. Therefore, I am just wondering in which scenario the new branch is needed.

@xmkg
Copy link
Member Author

xmkg commented Apr 30, 2025

However, I have a wondering about the real scenario for the new branch. The virtual machine class always suspend the vm if it is running.

The scenario in which this is needed is a daemon crash, where things are not nicely being wrapped up as they would in a graceful exit. The current implementation of the "on_restart" function initializes the daemon-side resources needed by a VM to function properly, e.g., a mount's daemon-side counterpart, as in SSHFS.

So, assume that we have an SSHFS-mounted VM in a running state, and the daemon crashes. When the daemon is restarted, the spec's state would be running, and the VM's state might be running or something else, depending on the backend. For LXC, the VM's state will be running since the VM process is not a subprocess of the daemon itself and it's externally managed. So for the LXC scenario, spec.state == running and vm.state == running, which will in turn cause the on_restart function not to be called, and the resources associated with the invocation of the on_restart won't be initialized -- in our scenario, that would mean the SSHFS server is not being spawned for the VM, therefore the mount won't work.

The patch fixes that by triggering the on_restart function for the vm.state == running and vm.state == starting. These two states won't require a vm->start() call, so it's being handled separately.

@georgeliao
Copy link
Contributor

georgeliao commented Apr 30, 2025

Thanks for the explanation. The code change does help in this corner case. Actually, this particular corner case further reminded us the necessity of the encapsulated vm which is the refactor we really need to do.
Apart from that, can we make a small change to the mock type of MockPlatform (test_daemon.cpp line 137) to NiceMock so we can get rid of the warnings from the three newly added unit tests.

georgeliao
georgeliao previously approved these changes Apr 30, 2025
@georgeliao
Copy link
Contributor

We also need to do a rebase before we can start the merge.

xmkg added 2 commits April 30, 2025 17:58
on_restart re-initializes some daemon-side resources needed for
the VM functionality, such as non-backend mounts like sshfs. this
patch ensures that on_restart is executed for the VM's that are
already running, or starting.

Closes: #4005

Signed-off-by: Mustafa Kemal Gilor <mustafa.gilor@canonical.com>
- Added unit tests for the on_restart() trigger logic
- Added unit test for the spec.deleted state set block

Signed-off-by: Mustafa Kemal Gilor <mustafa.gilor@canonical.com>
@xmkg xmkg force-pushed the bugfix/resync-running-vms-on-daemon-restart branch from e9f9a95 to 9d182f6 Compare April 30, 2025 15:13
xmkg added 3 commits April 30, 2025 18:38
Signed-off-by: Mustafa Kemal Gilor <mustafa.gilor@canonical.com>
Signed-off-by: Mustafa Kemal Gilor <mustafa.gilor@canonical.com>
Signed-off-by: Mustafa Kemal Gilor <mustafa.gilor@canonical.com>
@xmkg
Copy link
Member Author

xmkg commented Apr 30, 2025

Apart from, can we make a small change to the mock type of MockPlatform (test_daemon.cpp line 137) to NiceMock so we can get rid of the warnings from the three newly added unit tests.

Done, should be good to go.

@xmkg xmkg requested review from Sploder12 and georgeliao May 2, 2025 08:58
@xmkg xmkg added this pull request to the merge queue May 2, 2025
Merged via the queue into main with commit b9bd173 May 2, 2025
15 checks passed
@xmkg xmkg deleted the bugfix/resync-running-vms-on-daemon-restart branch May 2, 2025 14:36
@ricab ricab added this to the 1.16.0 milestone Jun 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mounts won't get activated when daemon restarts for persistent backends
4 participants