Skip to content

Conversation

bensmrs
Copy link
Contributor

@bensmrs bensmrs commented Jul 16, 2025

This PR is not 100% ready. I’d like to discuss 2 things:

  • Whether the attached/detached logic can stay as is, or should only concern CD-ROM drives (cf. TODO);
  • Whether the attached/detached logic should be emulated like a full/empty CD-ROM tray, or (as it’s currently done in this PR) by simply removing/re-adding another CD-ROM drive.

Emulating an empty tray will require a bit more work, but can look a bit fancier to the user, actually seeing that the device is here but not the blockdev.

Closes #2093

Signed-off-by: Benjamin Somers <benjamin.somers@imt-atlantique.fr>
@bensmrs bensmrs requested a review from stgraber as a code owner July 16, 2025 20:32
@github-actions github-actions bot added Documentation Documentation needs updating API Changes to the REST API labels Jul 16, 2025
@bensmrs bensmrs changed the title Eject disk Implement disk ejection Jul 16, 2025
@stgraber
Copy link
Member

Error: internal/server/instance/drivers/driver_qemu.go:5864:8: ineffectual assignment to err (ineffassign)
		dev, err := d.deviceLoad(d, diskName, config)
		     ^
Error: internal/server/instance/drivers/driver_qemu.go:8831:9: ineffectual assignment to err (ineffassign)
			dev, err := d.deviceLoad(d, mount.DevName, config)
			     ^

@stgraber
Copy link
Member

  1. It'd be nice if we could restrict that to ISO as other drives don't really have an eject API anyways. The validate function already loads the volume details so we should have access to both the source key and volume info which is what we need to check on the two cases where we'd expose a cd-rom drive instead of a disk.

  2. I think that's fine and we can adjust later if that's a problem somehow.

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 16, 2025

Great, I’ll tidy everything up and smooth out the corners, probably tomorrow.

@stgraber
Copy link
Member

For 2), we'd want to make sure we don't break live migration. That is, if we have a running VM and we eject the disk, the target VM needs to not have the disk attached when it's put together to receive the migration stream. Otherwise the hardware won't line up and migration will fail.

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 16, 2025

For 2), we'd want to make sure we don't break live migration. That is, if we have a running VM and we eject the disk, the target VM needs to not have the disk attached when it's put together to receive the migration stream. Otherwise the hardware won't line up and migration will fail.

I think the current logic is safe in that regard, but I’ll test.

bensmrs added 3 commits July 17, 2025 12:29
Signed-off-by: Benjamin Somers <benjamin.somers@imt-atlantique.fr>
Signed-off-by: Benjamin Somers <benjamin.somers@imt-atlantique.fr>
Signed-off-by: Benjamin Somers <benjamin.somers@imt-atlantique.fr>
@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 17, 2025

On to testing with migration now

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 17, 2025

A few things.

  • Disks are sometimes ejected during boot time, I don’t really know why, and it triggers the ejection mechanism;
  • Migration works ok in my tests;
  • Ejecting a disk that has just been added leads to an 'Unable to eject media "dev-incus_XXX": Couldn''t find device "XXX"' error.

Gotta investigate a little more, although I take ideas :)

Edit: for (iii), looks like expandedDevices is outdated, that’s why the device cannot be found. Anything I can do (calling expandConfig seems overkill)?

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 18, 2025

Should be good now, pointers for (i) would be appreciated if you have any.

@stgraber
Copy link
Member

Should be good now, pointers for (i) would be appreciated if you have any.

I don't know what would trigger that.
Do you have a detailed QMP event log to see if it's related to some other action going on?

Depending on how early that is, maybe we can add a QMP status check of some kind to only handle eject if the VM is fully running.

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 18, 2025

Here’s an excerpt until the DEVICE_TRAY_MOVED event. (the CD-ROM drive is called agent)

[2025-07-18T16:07:13Z] QUERY: {"execute":"qom-get","arguments":{"path":"/machine","property":"type"},"id":2}
[2025-07-18T16:07:13Z] REPLY: {"return": "pc-q35-9.2-machine", "id": 2}

[2025-07-18T16:07:13Z] QUERY: {"execute":"query-cpus-fast","id":4}
[2025-07-18T16:07:13Z] REPLY: {"return": [{"thread-id": 76615, "props": {"core-id": 0, "thread-id": 0, "node-id": 0, "socket-id": 0}, "qom-path": "/machine/unattached/device[0]", "cpu-index": 0, "target": "x86_64"}], "id": 4}

[2025-07-18T16:07:13Z] QUERY: {"execute":"netdev_add","arguments":{"fds":"/dev/net/tun.0:/dev/net/tun.1","id":"incus_eth0","type":"tap","vhost":true,"vhostfds":"/dev/vhost-net.0:/dev/vhost-net.1"},"id":9}
[2025-07-18T16:07:13Z] REPLY: {"return": {}, "id": 9}

[2025-07-18T16:07:13Z] QUERY: {"execute":"device_add","arguments":{"addr":"00.0","bootindex":1,"bus":"qemu_pcie4","driver":"virtio-net-pci","id":"dev-incus_eth0","mac":"10:66:6a:aa:03:cc","mq":true,"netdev":"incus_eth0","vectors":6},"id":10}
[2025-07-18T16:07:13Z] REPLY: {"return": {}, "id": 10}

[2025-07-18T16:07:13Z] QUERY: {"execute":"blockdev-add","arguments":{"aio":"threads","cache":{"direct":false,"no-flush":false},"discard":"unmap","driver":"file","filename":"/dev/fdset/0","locking":"off","node-name":"incus_root","read-only":false},"id":12}
[2025-07-18T16:07:13Z] REPLY: {"return": {}, "id": 12}

[2025-07-18T16:07:13Z] QUERY: {"execute":"device_add","arguments":{"bootindex":0,"bus":"qemu_scsi.0","channel":0,"device_id":"incus_root","drive":"incus_root","driver":"scsi-hd","id":"dev-incus_root","lun":1,"serial":"incus_root"},"id":13}
[2025-07-18T16:07:13Z] REPLY: {"return": {}, "id": 13}

[2025-07-18T16:07:13Z] QUERY: {"execute":"system_reset","id":14}
[2025-07-18T16:07:13Z] Event: {"timestamp": {"seconds": 1752854833, "microseconds": 413035}, "event": "RESET", "data": {"guest": false, "reason": "host-qmp-system-reset"}}

[2025-07-18T16:07:13Z] REPLY: {"return": {}, "id": 14}

[2025-07-18T16:07:13Z] QUERY: {"execute":"set-action","arguments":{"panic":"pause","reboot":"shutdown","shutdown":"poweroff"},"id":16}
[2025-07-18T16:07:13Z] REPLY: {"return": {}, "id": 16}

[2025-07-18T16:07:13Z] QUERY: {"execute":"cont","id":17}
[2025-07-18T16:07:13Z] Event: {"timestamp": {"seconds": 1752854833, "microseconds": 414075}, "event": "RESUME"}

[2025-07-18T16:07:13Z] REPLY: {"return": {}, "id": 17}

[2025-07-18T16:07:14Z] Event: {"timestamp": {"seconds": 1752854834, "microseconds": 838234}, "event": "RTC_CHANGE", "data": {"offset": 0, "qom-path": "/machine/unattached/device[3]/rtc"}}

[2025-07-18T16:07:15Z] Event: {"timestamp": {"seconds": 1752854835, "microseconds": 812412}, "event": "VSERPORT_CHANGE", "data": {"open": true, "id": "qemu_serial"}}

[2025-07-18T16:07:19Z] Event: {"timestamp": {"seconds": 1752854839, "microseconds": 735904}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "qemu_serial"}}

[2025-07-18T16:07:26Z] QUERY: {"execute":"query-status","id":22}
[2025-07-18T16:07:26Z] REPLY: {"return": {"status": "running", "running": true}, "id": 22}

[2025-07-18T16:07:26Z] QUERY: {"execute":"query-status","id":23}
[2025-07-18T16:07:26Z] REPLY: {"return": {"status": "running", "running": true}, "id": 23}

[2025-07-18T16:07:26Z] Event: {"timestamp": {"seconds": 1752854846, "microseconds": 964163}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "dev-incus_eth0", "path": "/machine/peripheral/dev-incus_eth0/virtio-backend"}}

[2025-07-18T16:07:28Z] QUERY: {"execute":"query-status","id":25}
[2025-07-18T16:07:28Z] REPLY: {"return": {"status": "running", "running": true}, "id": 25}

[2025-07-18T16:07:29Z] Event: {"timestamp": {"seconds": 1752854849, "microseconds": 78887}, "event": "VSERPORT_CHANGE", "data": {"open": true, "id": "qemu_serial"}}

[2025-07-18T16:07:30Z] Event: {"timestamp": {"seconds": 1752854849, "microseconds": 230101}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "qemu_serial"}}

[2025-07-18T16:07:30Z] QUERY: {"execute":"query-status","id":28}
[2025-07-18T16:07:30Z] REPLY: {"return": {"status": "running", "running": true}, "id": 28}

[2025-07-18T16:07:32Z] QUERY: {"execute":"query-status","id":29}
[2025-07-18T16:07:32Z] REPLY: {"return": {"status": "running", "running": true}, "id": 29}

[2025-07-18T16:07:32Z] QUERY: {"execute":"query-status","id":30}
[2025-07-18T16:07:32Z] REPLY: {"return": {"status": "running", "running": true}, "id": 30}

[2025-07-18T16:07:34Z] Event: {"timestamp": {"seconds": 1752854854, "microseconds": 83044}, "event": "VSERPORT_CHANGE", "data": {"open": true, "id": "qemu_serial"}}

[2025-07-18T16:07:35Z] Event: {"timestamp": {"seconds": 1752854854, "microseconds": 83180}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "qemu_serial"}}

[2025-07-18T16:07:35Z] QUERY: {"execute":"query-status","id":33}
[2025-07-18T16:07:35Z] REPLY: {"return": {"status": "running", "running": true}, "id": 33}

[2025-07-18T16:07:39Z] Event: {"timestamp": {"seconds": 1752854859, "microseconds": 82702}, "event": "VSERPORT_CHANGE", "data": {"open": true, "id": "qemu_serial"}}

[2025-07-18T16:07:39Z] QUERY: {"execute":"query-status","id":35}
[2025-07-18T16:07:39Z] REPLY: {"return": {"status": "running", "running": true}, "id": 35}

[2025-07-18T16:07:39Z] QUERY: {"execute":"blockdev-add","arguments":{"aio":"threads","cache":{"direct":false,"no-flush":false},"discard":"unmap","driver":"file","filename":"/dev/fdset/1","locking":"off","node-name":"incus_agent","read-only":false},"id":37}
[2025-07-18T16:07:39Z] REPLY: {"return": {}, "id": 37}

[2025-07-18T16:07:39Z] QUERY: {"execute":"device_add","arguments":{"bus":"qemu_scsi.0","channel":0,"device_id":"incus_agent","drive":"incus_agent","driver":"scsi-cd","id":"dev-incus_agent","lun":1,"serial":"incus_agent"},"id":38}
[2025-07-18T16:07:39Z] REPLY: {"return": {}, "id": 38}

[2025-07-18T16:07:39Z] QUERY: {"execute":"query-status","id":39}
[2025-07-18T16:07:39Z] REPLY: {"return": {"status": "running", "running": true}, "id": 39}

[2025-07-18T16:07:40Z] Event: {"timestamp": {"seconds": 1752854859, "microseconds": 82925}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "qemu_serial"}}

[2025-07-18T16:07:42Z] QUERY: {"execute":"query-status","id":41}
[2025-07-18T16:07:42Z] REPLY: {"return": {"status": "running", "running": true}, "id": 41}

[2025-07-18T16:07:42Z] QUERY: {"execute":"query-status","id":42}
[2025-07-18T16:07:42Z] REPLY: {"return": {"status": "running", "running": true}, "id": 42}

[2025-07-18T16:07:44Z] Event: {"timestamp": {"seconds": 1752854864, "microseconds": 82521}, "event": "VSERPORT_CHANGE", "data": {"open": true, "id": "qemu_serial"}}

[2025-07-18T16:07:45Z] Event: {"timestamp": {"seconds": 1752854864, "microseconds": 82693}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "qemu_serial"}}

[2025-07-18T16:07:45Z] QUERY: {"execute":"query-status","id":45}
[2025-07-18T16:07:45Z] REPLY: {"return": {"status": "running", "running": true}, "id": 45}

[2025-07-18T16:07:45Z] QUERY: {"execute":"query-status","id":46}
[2025-07-18T16:07:45Z] REPLY: {"return": {"status": "running", "running": true}, "id": 46}

[2025-07-18T16:07:47Z] Event: {"timestamp": {"seconds": 1752854867, "microseconds": 823243}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "", "tray-open": true, "id": "dev-incus_agent"}}

@stgraber
Copy link
Member

Oh, if it's an agent drive, then it's totally expected.

Our init script ejects the drive after copying the data as we don't want the drive to be visible to unprivileged users on the system.

That may actually be a slight issue there as we want that particular drive to come back on every boot. So we should actually check and only remember the ejected state if we're not dealing with one of our internal drives (source=agent:config, source=cloud-init:config)

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 19, 2025

Oooooh good thing I didn’t have another ISO on hand, because we’d have missed that behavior. I’ll add a safeguard to the code to properly handle automatically ejected drives.
However, when now testing on a non-agent drive, I’m always getting database errors:

location: incus-dev-2
metadata:
  context:
    driver: dir
    pool: default
    project: default
    volName: TinyCore
  level: debug
  message: UnmountCustomVolume started
timestamp: "2025-07-19T00:18:15.227168269Z"
type: logging


location: incus-dev-2
metadata:
  context:
    err: 'sql: transaction has already been committed or rolled back'
  level: debug
  message: Database error
timestamp: "2025-07-19T00:18:15.228699851Z"
type: logging


location: incus-dev-2
metadata:
  context:
    driver: dir
    pool: default
    project: default
    volName: TinyCore
  level: debug
  message: UnmountCustomVolume finished
timestamp: "2025-07-19T00:18:15.228666318Z"
type: logging


location: incus-dev-2
metadata:
  context: {}
  level: warning
  message: 'Failed to rollback transaction after error (Replace Device for Instance
    failed: Delete entry for "instances_device" failed: context deadline exceeded):
    sql: transaction has already been committed or rolled back'
timestamp: "2025-07-19T00:18:15.228780261Z"
type: logging


location: incus-dev-2
metadata:
  context:
    err: 'Replace Device for Instance failed: Delete entry for "instances_device"
      failed: context deadline exceeded'
    member: "2"
  level: debug
  message: Transaction timed out, will be retried
timestamp: "2025-07-19T00:18:15.228792474Z"
type: logging

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 19, 2025

After having a look at it, it seems that the agent auto-ejection only happens when the device name is agent. There’s a possibility that I also completely messed up my testing environment, but I don’t recall tinkering with the DB in unsafe ways…

Edit: I can’t understand why the transactions fail; I’d really appreciate an additional pair of eyes :)

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 20, 2025

Oh, if it's an agent drive, then it's totally expected.

Here, “agent drive” means “drive called agent”.

That may actually be a slight issue there as we want that particular drive to come back on every boot. So we should actually check and only remember the ejected state if we're not dealing with one of our internal drives (source=agent:config, source=cloud-init:config)

I think that in this particular case, it’s actually the drive name that matters. I’m trying to identify the code paths that handle that.

EDIT: maybe we should just prevent people from creating a device named agent? It’s obviously a reserved name in the QEMU driver…

bensmrs added 3 commits July 20, 2025 15:06
Signed-off-by: Benjamin Somers <benjamin.somers@imt-atlantique.fr>
Signed-off-by: Benjamin Somers <benjamin.somers@imt-atlantique.fr>
Signed-off-by: Benjamin Somers <benjamin.somers@imt-atlantique.fr>
Signed-off-by: Benjamin Somers <benjamin.somers@imt-atlantique.fr>
@stgraber
Copy link
Member

Oh, if it's an agent drive, then it's totally expected.

Here, “agent drive” means “drive called agent”.

That may actually be a slight issue there as we want that particular drive to come back on every boot. So we should actually check and only remember the ejected state if we're not dealing with one of our internal drives (source=agent:config, source=cloud-init:config)

I think that in this particular case, it’s actually the drive name that matters. I’m trying to identify the code paths that handle that.

EDIT: maybe we should just prevent people from creating a device named agent? It’s obviously a reserved name in the QEMU driver…

Actually this logic seems a bit wrong to me. Rather than relying on the device name, we should rely on some metadata in the image itself (ISO). I'm looking at making a small change to do that. We rely don't want to special case specific device names.

@stgraber
Copy link
Member

@bensmrs #2295

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 20, 2025

By looking again at it, it seems that some of my assumptions were wrong. I think my TrimPrefix call is a bit unsafe, and I should first check that the qemuDeviceIDPrefix prefix is actually there in detachDisk, making the function a no-op for this kind of internal drives.

@stgraber
Copy link
Member

@bensmrs you could use SplitN(x, 2) on it and then check that you indeed got two chunks out of that. That'd give you both the value you need and perform the prefix check in one go.

@stgraber
Copy link
Member

Otherwise everything looks good in here. The other PR I sent will avoid the weird device name special casing we currently have so we should be good to go once you tweak that TrimPrefix call.

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 20, 2025

I did a test checking if the prefix is there… and the prefix is indeed there. I’m a bit confused. (I’ll try cherry-picking your commit)

@stgraber
Copy link
Member

I did a test checking if the prefix is there… and the prefix is indeed there. I’m a bit confused. (I’ll try cherry-picking your commit)

What's special with those devices is their content, that is source=agent:config or source=cloud-init:config have their content (ISO) generated by Incus on instance startup.

The rest of their handling is normal, they are not treated as internal devices, they show up as normal user defined disks (because they are) and use the name the user provided.

But because they are generated at boot time and read either by the Incus agent load or by cloud-init, both of which may want to eject the drive to hide it from the system, yet have it again on next boot, we should find a way to process the eject request but NOT persist it through the device option.

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 20, 2025

My point is, even without mounting an agent disk, if the device is called agent, it gets ejected.

@stgraber
Copy link
Member

Ah, that's pretty odd behavior from QEMU

@stgraber
Copy link
Member

I'm really not convinced that this is a QEMU issue though, the timing you're showing looks a lot like: https://github.com/lxc/incus/blob/main/internal/server/instance/drivers/agent-loader/incus-agent-setup#L50

My other PR won't fix that immediately. I'm changing the script inside of Incus, but we'll need to push it through distrobuilder too and then have that roll out to all our images before the change is actually effectively.

You can try manually applying it to your incus-agent-setup script in your VM and see if that sorts it out.

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 20, 2025

My other PR won't fix that immediately. I'm changing the script inside of Incus, but we'll need to push it through distrobuilder too and then have that roll out to all our images before the change is actually effectively.

Completely overlooked that. Seems to work!
I’m doing a few additional checks.

@bensmrs
Copy link
Contributor Author

bensmrs commented Jul 20, 2025

Ok everything works, without any need to patch the TrimPrefix side.
I’m not sure it’s worth patching, as it would really only cause problems with devices added through raw keys, and I don’t recall us handling with any care these situations.
I propose to merge this PR as is.

@stgraber stgraber enabled auto-merge July 21, 2025 09:39
@stgraber stgraber merged commit 2af5cc1 into lxc:main Jul 21, 2025
137 of 140 checks passed
@bensmrs bensmrs deleted the eject-disk branch July 29, 2025 09:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Changes to the REST API Documentation Documentation needs updating
Development

Successfully merging this pull request may close these issues.

Allow eject commands to remove drives from an Incus instance
2 participants