This repository was archived by the owner on Jan 1, 2021. It is now read-only.

Conversation

tianon
Contributor

@tianon tianon commented Nov 12, 2018

Closes #1340

@StefanScherer

@tianon I've tried to build it locally, but I get the same error as Travis https://travis-ci.org/boot2docker/boot2docker/builds/454166287#L1431

@StefanScherer

I've built a boot2docker.iso with that vmxnet3 kernel module and tried it with VMware Fusion 11.0.1 with docker-machine 0.15.0 from Docker for Mac 18.06.1.

$ docker-machine create -d vmwarefusion --vmwarefusion-boot2docker-url=./boot2docker.iso test
Running pre-create checks...
(test) Boot2Docker URL was explicitly set to "./boot2docker.iso" at create time, so Docker Machine cannot upgrade this machine to the latest version.
Creating machine...
(test) Boot2Docker URL was explicitly set to "./boot2docker.iso" at create time, so Docker Machine cannot upgrade this machine to the latest version.
(test) Downloading /Users/stefan/.docker/machine/cache/boot2docker.iso from ./boot2docker.iso...
(test) Creating SSH key...
(test) Creating VM...
(test) Creating disk '/Users/stefan/.docker/machine/machines/test/test.vmdk'
(test) Virtual disk creation successful.
(test) Starting test...
(test) Waiting for VM to come online...
Error creating machine: Error in driver during machine creation: exit status 255

and with the debug output:

docker-machine -D create -d vmwarefusion --vmwarefusion-boot2docker-url=./boot2docker.iso test2
...
(test2) DBG | Trying to find IP address in leases file: /var/db/vmware/vmnet-dhcpd-vmnet1.leases
(test2) DBG | Trying to find IP address in leases file: /var/db/vmware/vmnet-dhcpd-vmnet2.leases
(test2) DBG | Trying to find IP address in leases file: /var/db/vmware/vmnet-dhcpd-vmnet3.leases
(test2) DBG | Trying to find IP address in leases file: /var/db/vmware/vmnet-dhcpd-vmnet8.leases
(test2) DBG | IP found in DHCP lease table: 192.168.65.132
(test2) DBG | Got an ip: 192.168.65.132
(test2) DBG | Creating Tar key bundle...
(test2) DBG | executing: /Applications/VMware Fusion.app/Contents/Public/vmrun -gu docker -gp tcuser directoryExistsInGuest /Users/stefan/.docker/machine/machines/test2/test2.vmx /var/lib/boot2docker
(test2) DBG | executing: /Applications/VMware Fusion.app/Contents/Public/vmrun -gu docker -gp tcuser CopyFileFromHostToGuest /Users/stefan/.docker/machine/machines/test2/test2.vmx /Users/stefan/.docker/machine/machines/test2/userdata.tar /home/docker/userdata.tar
(test2) DBG | executing: /Applications/VMware Fusion.app/Contents/Public/vmrun -gu docker -gp tcuser runScriptInGuest /Users/stefan/.docker/machine/machines/test2/test2.vmx /bin/sh sudo sh -c "tar xvf /home/docker/userdata.tar -C /home/docker > /var/log/userdata.log 2>&1 && chown -R docker:staff /home/docker"
(test2) DBG | executing: /Applications/VMware Fusion.app/Contents/Public/vmrun -gu docker -gp tcuser runScriptInGuest /Users/stefan/.docker/machine/machines/test2/test2.vmx /bin/sh sudo /bin/mv /home/docker/userdata.tar /var/lib/boot2docker/userdata.tar
(test2) DBG | executing: /Applications/VMware Fusion.app/Contents/Public/vmrun -gu docker -gp tcuser enableSharedFolders /Users/stefan/.docker/machine/machines/test2/test2.vmx
Error creating machine: Error in driver during machine creation: exit status 255
notifying bugsnag: [Error creating machine: Error in driver during machine creation: exit status 255]

I can ssh into the docker machine

$ docker-machine ssh test2
   ( '>')
  /) TC (\   Core is distributed with ABSOLUTELY NO WARRANTY.
 (/-_--_-\)           www.tinycorelinux.net

docker@boot2docker:~$ lsmod
Module                  Size  Used by    Tainted: G  
ipt_MASQUERADE         16384  1 
nf_nat_masquerade_ipv4    16384  1 ipt_MASQUERADE
xfrm_user              32768  1 
iptable_nat            16384  1 
nf_nat_ipv4            16384  1 iptable_nat
xt_addrtype            16384  2 
iptable_filter         16384  1 
xt_conntrack           16384  1 
nf_nat                 24576  2 nf_nat_masquerade_ipv4,nf_nat_ipv4
br_netfilter           20480  0 
bridge                106496  1 br_netfilter
stp                    16384  1 bridge
llc                    16384  2 bridge,stp
overlay                61440  0 
vmxnet3                45056  0 
docker@boot2docker:~$ df -h
Filesystem                Size      Used Available Use% Mounted on
tmpfs                   890.4M    225.4M    665.0M  25% /
tmpfs                   494.7M         0    494.7M   0% /dev/shm
cgroup                  494.7M         0    494.7M   0% /sys/fs/cgroup
tmpfs                   890.4M    225.4M    665.0M  25% /var/lib/docker

Inside the docker machine I can pull and run images, but the VMware shared folder doesn't seem to work.

I have also tried to build an 18.06.1 docker machine and have slight problems there as well, at least with shared volumes. I'll try the 18.09.0 boot2docker.iso tomorrow on another Mac with VMware Fusion 10.x.x and see if it works there.

@tianon
Contributor Author

tianon commented Nov 12, 2018

Hmm, I wonder if we need CONFIG_VMWARE_VMCI for the shared folders -- mind adding CONFIG_VMWARE_VMCI=m to this files/kernel-config.d/vmware file too and doing a re-test?
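
A quick way to confirm that a rebuilt ISO actually loads the modules in question is to look for them in the guest's lsmod output. The following is a minimal sketch, not part of this PR: `has_module` is a hypothetical helper, and it assumes that CONFIG_VMWARE_VMCI=m produces a module named `vmw_vmci`, as it does in mainline kernels.

```shell
#!/bin/sh
# has_module NAME: read `lsmod`-style output on stdin and succeed if NAME
# appears in the first column. The header line is skipped.
# (Hypothetical helper for illustration only.)
has_module() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}
```

It could be used as `docker-machine ssh test lsmod | has_module vmw_vmci && echo "vmci loaded"`, where `test` is the machine name used earlier in this thread.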

@MartinSGill

Tested it in Workstation Pro 15 on Windows 10.

vmware driver

dm create -d vmware --vmware-boot2docker-url=https://github.com/StefanScherer/boot2docker/releases/download/v18.09.0-vmxnet3/boot2docker.iso vmware
Running pre-create checks...
(vmware) Boot2Docker URL was explicitly set to "https://github.com/StefanScherer/boot2docker/releases/download/v18.09.0-vmxnet3/boot2docker.iso" at create time, so Docker Machine cannot upgrade this machine to the latest version.
Creating machine...
(vmware) Boot2Docker URL was explicitly set to "https://github.com/StefanScherer/boot2docker/releases/download/v18.09.0-vmxnet3/boot2docker.iso" at create time, so Docker Machine cannot upgrade this machine to the latest version.
(vmware) Downloading C:\Users\mgill\.docker\machine\cache\boot2docker.iso from https://github.com/StefanScherer/boot2docker/releases/download/v18.09.0-vmxnet3/boot2docker.iso...
(vmware) 0%....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%
(vmware) Creating SSH key...
(vmware) Creating VM...
(vmware) Starting vmware...
(vmware) Waiting for VM to come online...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with boot2docker...
Error creating machine: Error running provisioning: Unable to verify the Docker daemon is listening: Maximum number of retries (10) exceeded

vmwareworkstation driver

dm create -d vmwareworkstation --vmwareworkstation-boot2docker-url=https://github.com/StefanScherer/boot2docker/releases/download/v18.09.0-vmxnet3/boot2docker.iso vmwarews
Running pre-create checks...
(vmwarews) Boot2Docker URL was explicitly set to "https://github.com/StefanScherer/boot2docker/releases/download/v18.09.0-vmxnet3/boot2docker.iso" at create time, so Docker Machine cannot upgrade this machine to the latest version.
Creating machine...
(vmwarews) Boot2Docker URL was explicitly set to "https://github.com/StefanScherer/boot2docker/releases/download/v18.09.0-vmxnet3/boot2docker.iso" at create time, so Docker Machine cannot upgrade this machine to the latest version.
(vmwarews) Downloading C:\Users\mgill\.docker\machine\cache\boot2docker.iso from https://github.com/StefanScherer/boot2docker/releases/download/v18.09.0-vmxnet3/boot2docker.iso...
(vmwarews) 0%....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%
(vmwarews) Creating SSH key...
(vmwarews) Creating VM...
(vmwarews) Creating disk 'C:\Users\mgill\.docker\machine\machines\vmwarews\vmwarews.vmdk'
(vmwarews) Virtual disk creation successful.
(vmwarews) Starting vmwarews...
(vmwarews) Waiting for VM to come online...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with boot2docker...
Error creating machine: Error running provisioning: Unable to verify the Docker daemon is listening: Maximum number of retries (10) exceeded

Notes

Logging into the VM directly I see that eth0 is present and has the expected IP.
Docker appears to be running
Like Stefan, I can ssh into the machine:

dm ssh vmwarews
   ( '>')
  /) TC (\   Core is distributed with ABSOLUTELY NO WARRANTY.
 (/-_--_-\)           www.tinycorelinux.net

docker@vmwarews:~$ docker system info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 18.09.0
Storage Driver: overlay2
 Backing Filesystem: tmpfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.14.79-boot2docker
Operating System: Boot2Docker 18.09.0 (TCL 8.2.1)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 989.3MiB
Name: vmwarews
ID: M7A7:EE7A:TD5T:AS7V:VZ6F:XNG7:U2GS:XDR5:XP4R:2INP:SRP4:YCOS
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

@tianon tianon force-pushed the vmxnet3 branch 3 times, most recently from a40e9a7 to 648b287 Compare November 17, 2018 00:24
@tianon
Contributor Author

tianon commented Nov 17, 2018

@StefanScherer any chance I could convince you to rebuild/retest the latest here? (the gpg build-time errors should be fixed as well as IPVS bugs with Swarm Mode and the added VMCI bits that I think should fix more of open-vm-tools 😅)

@StefanScherer

@tianon I've built the boot2docker.iso with both kernel modules, but I still see a problem creating a machine.

$ docker-machine -D create -d vmwarefusion --vmwarefusion-boot2docker-url=./boot2docker.iso test
...
(test) DBG | executing: /Applications/VMware Fusion.app/Contents/Public/vmrun -gu docker -gp tcuser enableSharedFolders /Users/stefan/.docker/machine/machines/test/test.vmx
Error creating machine: Error in driver during machine creation: exit status 255
notifying bugsnag: [Error creating machine: Error in driver during machine creation: exit status 255]

Then I cross-checked with the boot2docker.iso 18.06.1 and I get this error there too. I also used the community-driven vmware driver from the https://github.com/machine-drivers/docker-machine-driver-vmware repo, and with it I can create a machine with 18.06.1, but shared folders are still missing. See machine-drivers/docker-machine-driver-vmware#15 for details and my gists, but I think this is a general issue when using VMware Fusion 11. I'll test with another Mac that is still on VMware Fusion 10 later on. It's hard to fight two problems at the same time 😅

@StefanScherer

StefanScherer commented Nov 17, 2018

After dozens of tries with different versions of macOS, Fusion, docker-machine binaries, and boot2docker.iso files, here is what I found.

  • The docker-machine 0.13.0 binary for macOS seems to be the last one that works with the vmwarefusion driver. With it I can create docker machines up to 18.06.1 on nearly every macOS version I have (10.12, 10.13.2, 10.14.1; only 10.14 didn't work) and with VMware Fusion 8.5.8, 10.1.4, and 11.0.1. See successful gist.
  • From docker-machine 0.14.0 on (also tested with 0.15.0 and 0.16.0) I get errors at the vmrun enableSharedFolders call with the same 18.06.1 ISO file that I used with 0.13.0. See erroneous gist.

The modified boot2docker.iso of this PR with vmxnet3 and the vmci modules can enable the shared folders without an error, but there is still no /Users folder mounted from the host in the VM.
And then there is another error where the docker-machine 0.13.0 binary aborts after 10 retries in a netstat -tln loop looking for the Docker engine port 2376. The full debug output is in this gist.
The lines 152-154 of the gist may be of interest.

A quick check of the docker machine for the shared folder can be done with

$ docker-machine ssh test ls /Users
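
Note that ls can also succeed on an empty /Users directory that exists but has nothing mounted on it. A stricter check is to look for the mount point in the guest's mount table. This is a minimal sketch under that assumption; `is_mount_point` is a hypothetical helper, not something docker-machine provides.

```shell
#!/bin/sh
# is_mount_point PATH: read /proc/mounts-style lines on stdin
# (device, mount point, fs type, options, ...) and succeed if PATH
# appears as a mount point in the second column.
# (Hypothetical helper for illustration only.)
is_mount_point() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }'
}
```

For example: `docker-machine ssh test cat /proc/mounts | is_mount_point /Users && echo "shared folder mounted"`.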

@lunetics

lunetics commented Nov 18, 2018

I'm using the vmwarevsphere driver and get the same error as @MartinSGill.
With the patched vmxnet3 version from @StefanScherer it gets to the provisioning step; before, it froze at Waiting for VMware Tools to come online...

Running pre-create checks...
Creating machine...
(testdockermachine) Boot2Docker URL was explicitly set to "https://github.com/StefanScherer/boot2docker/releases/download/v18.09.0-vmxnet3-vmci/boot2docker.iso" at create time, so Docker Machine cannot upgrade this machine to the latest version.
(testdockermachine) Downloading /root/.docker/machine/cache/boot2docker.iso from https://github.com/StefanScherer/boot2docker/releases/download/v18.09.0-vmxnet3-vmci/boot2docker.iso...
(testdockermachine) 0%....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%
(testdockermachine) Generating SSH Keypair...
(testdockermachine) Creating VM...
(testdockermachine) Uploading Boot2docker ISO ...
(testdockermachine) adding network: VM Network
(testdockermachine) Reconfiguring VM
(testdockermachine) Waiting for VMware Tools to come online...
(testdockermachine) Provisioning certs and ssh keys...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with boot2docker...
Error creating machine: Error running provisioning: Unable to verify the Docker daemon is listening: Maximum number of retries (10) exceeded

eval "$(docker-machine env testdockermachine)"

Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "10.0.23.57:2376": dial tcp 10.0.23.57:2376: connect: connection refused
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which might stop running containers.

@lunetics

Addendum: testing with boot2docker v18.06.1-ce has worked fine so far; netstat -tln gives the correct output, including the Docker port:

netstat -tln
SSH cmd err, output: <nil>: Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      
tcp        0      0 :::2376                 :::*                    LISTEN      
tcp        0      0 :::22                   :::*                    LISTEN      

@StefanScherer

Thanks @lunetics, this seems to be the same problem. Before docker-machine inserts the TLS certs, it expects the Docker engine to be running and listening on port 2376.
The docker-machine provisioner checks up to 10 times, via WaitForDocker and checkDaemonUp, whether port 2376 is listening. For boot2docker running in VMware this is not the case, and docker-machine then gives up.
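
The check described here can be approximated from the host for debugging. The sketch below is not docker-machine's Go implementation; `port_listening` and `wait_for_port` are hypothetical helpers, and the retry count and delay are parameters chosen by the caller.

```shell
#!/bin/sh
# port_listening PORT: read `netstat -tln` output on stdin and succeed if a
# socket in LISTEN state is bound to PORT on any local address.
port_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            # Local address is field 4, e.g. 0.0.0.0:22 or :::2376;
            # the port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }'
}

# wait_for_port PORT TRIES DELAY CMD...: run CMD (which should print netstat
# output) up to TRIES times, sleeping DELAY seconds after each failed attempt.
wait_for_port() {
    port=$1; tries=$2; delay=$3; shift 3
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" | port_listening "$port"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "Maximum number of retries ($tries) exceeded" >&2
    return 1
}
```

For example, `wait_for_port 2376 10 3 docker-machine ssh test netstat -tln` polls the guest ten times, three seconds apart; the three-second delay is an arbitrary choice for this sketch.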

For some reason, with the virtualbox driver there are TLS certs even in the 18.09.0 ISO, and Docker starts listening on port 2376.
With vmwarefusion the tls folder stays empty, and therefore checkDaemonUp never succeeds.

Previous versions of boot2docker.iso automatically created some certs, so the Docker engine was always running and listening on port 2376 (with both vmwarefusion and virtualbox) and provisioning could continue.

I'm currently stuck at the point where newer docker-machine binaries abort at vmrun enableSharedFolders, whereas version 0.13.0 worked without an error. I can't see where the difference is.

(test2) DBG | executing: /Applications/VMware Fusion.app/Contents/Public/vmrun -gu docker -gp tcuser enableSharedFolders /Users/stefan/.docker/machine/machines/test2/test2.vmx
Error creating machine: Error in driver during machine creation: exit status 255
notifying bugsnag: [Error creating machine: Error in driver during machine creation: exit status 255]

The docker-machine binary places a userdata.tar file in /var/lib/boot2docker, but it contains only the SSH key pair and not yet the TLS certs. Maybe this behaviour differs from the virtualbox driver?

$ tar tvf userdata.tar 
----------  0 0      0          36 Jan  1  1970 boot2docker, this is vmware speaking
drwx------  0 0      0           0 Jan  1  1970 .ssh
-rw-r--r--  0 0      0         381 Jan  1  1970 .ssh/authorized_keys
-rw-r--r--  0 0      0         381 Jan  1  1970 .ssh/authorized_keys2

@StefanScherer

OK, I learned a bit from reading the source. docker-machine 0.13.0 simply didn't check the return code of vmrun enableSharedFolders, so with the 18.09.0 ISO it does not abort here, but no shared folder is mounted. We have to work out how to make the vmtools work and talk to the VMware hypervisor.

And there is a timing issue with the userdata.tar file when it gets copied into the VM and when the /etc/init.d/docker script checks for it to create some dummy certs to start docker listening on 2376 for further provisioning.
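
The behavioural difference between the two docker-machine generations can be illustrated in shell. This is only a sketch of the two error-handling styles, not the driver's Go code; `run_tolerant` and `run_strict` are hypothetical names.

```shell
#!/bin/sh
# run_tolerant CMD...: run CMD and carry on even if it fails, printing a
# warning. This mirrors the 0.13.0 behaviour of not acting on the exit code.
run_tolerant() {
    "$@" || echo "warning: '$1' failed with status $?, continuing" >&2
    return 0
}

# run_strict CMD...: run CMD and propagate its exit status, so the caller
# aborts on failure. This mirrors the behaviour seen from 0.14.0 on.
run_strict() {
    "$@" || {
        status=$?
        echo "error: '$1' failed with status $status" >&2
        return "$status"
    }
}
```

With `run_tolerant vmrun ... enableSharedFolders ...` machine creation would proceed past a failing call, while `run_strict` stops there with the same status that vmrun returned.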

@StefanScherer

On macOS, running the mount commands manually shows me that the first two always report an error; that's why docker-machine 0.14.0 aborts here.

$ vmrun -gu docker -gp tcuser enableSharedFolders /Users/stefan/Virtual\ Machines.localized/b2d-18.06.1.vmwarevm/b2d-18.06.1.vmx 
Error: There was an error mounting the Shared Folders file system inside the guest operating system
~/code/docker-windows-box/docker-machine-fusion on master
$ vmrun -gu docker -gp tcuser addSharedFolder /Users/stefan/Virtual\ Machines.localized/b2d-18.06.1.vmwarevm/b2d-18.06.1.vmx Users /Users
Error: There was an error mounting the Shared Folders file system inside the guest operating system

The third command succeeds and the shared folder is mounted in the vm when running the boot2docker version 18.06.1.

$ vmrun -gu docker -gp tcuser runScriptInGuest /Users/stefan/Virtual\ Machines.localized/b2d-18.06.1.vmwarevm/b2d-18.06.1.vmx '/bin/sh' '[ ! -d /Users ]&& sudo mkdir /Users; sudo mount --bind /mnt/hgfs//Users /Users || [ -f /usr/local/bin/vmhgfs-fuse ]&& sudo /usr/local/bin/vmhgfs-fuse -o allow_other .host:/Users /Users || sudo mount -t vmhgfs -o uid=$(id -u),gid=$(id -g) .host:/Users /Users'

Here VMware Fusion 11.0.1 seems to behave differently from VMware Workstation 15.0.1, where I could create 18.06.1 docker machines with any docker-machine.exe version. Strange.

What I can see in VMware Fusion is that the share has been added to the .vmx file, so maybe we just can skip the error checks of the first two commands (again). 😅

Running the third command with the boot2docker 18.09.0 + patches ISO I get an error:

$ vmrun -gu docker -gp tcuser runScriptInGuest /Users/stefan/Virtual\ Machines.localized/b2d-18.09.0-patch.vmwarevm/b2d-18.09.0-patch.vmx '/bin/sh' '[ ! -d /Users ]&& sudo mkdir /Users; sudo mount --bind /mnt/hgfs//Users /Users || [ -f /usr/local/bin/vmhgfs-fuse ]&& sudo /usr/local/bin/vmhgfs-fuse -o allow_other .host:/Users /Users || sudo mount -t vmhgfs -o uid=$(id -u),gid=$(id -g) .host:/Users /Users'
Guest program exited with non-zero exit code: 1

When I run the commands manually in the 18.09.0 + patches VM, there is a difference from the 18.06.1 VM:

$ sudo mkdir /Users

$ sudo mount --bind /mnt/hgfs//Users /Users
Error: cannot mount filesystem: No such device

$ sudo /usr/local/bin/vmhgfs-fuse -o allow_other .host:/Users
fuse: device not found, try 'modprobe fuse' first
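
For debugging, the one-liner is easier to follow when written as explicit fallbacks. In sh, `&&` and `||` have equal precedence and group left to right, so `A || B && C || D` is evaluated as `((A || B) && C) || D`, which may not short-circuit the way the one-liner reads. The sketch below assumes the intended order is: bind mount, then vmhgfs-fuse if it is installed, then the vmhgfs kernel filesystem. `first_success` and the three function names are hypothetical; the commands inside them are the ones from the one-liner.

```shell
#!/bin/sh
# first_success NAME...: call each named function in order, stop at the first
# one that succeeds, and print its name. Returns 1 if every attempt fails.
# (Hypothetical helper, not part of docker-machine.)
first_success() {
    for attempt in "$@"; do
        if "$attempt"; then
            echo "$attempt"
            return 0
        fi
    done
    return 1
}

# The three mount strategies from the one-liner, one per function.
bind_hgfs() {
    sudo mount --bind /mnt/hgfs//Users /Users
}
fuse_hgfs() {
    [ -f /usr/local/bin/vmhgfs-fuse ] &&
        sudo /usr/local/bin/vmhgfs-fuse -o allow_other .host:/Users /Users
}
kernel_hgfs() {
    sudo mount -t vmhgfs -o "uid=$(id -u),gid=$(id -g)" .host:/Users /Users
}
```

Inside the guest this would be run as `[ -d /Users ] || sudo mkdir /Users` followed by `first_success bind_hgfs fuse_hgfs kernel_hgfs`, which prints the name of the strategy that worked, or returns 1 if none did.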

@tianon
Contributor Author

tianon commented Nov 19, 2018

Ok, this is updated now. I have personally tested this via:

Shared folders are working, but not mounting at /Users which is odd -- they're going to /hosthome (so /hosthome/tianon shows my home directory contents successfully 👍).

@tianon
Contributor Author

tianon commented Nov 19, 2018

The following was in the logs:
(default) DBG | executing: /Applications/VMware Fusion.app/Contents/Public/vmrun -gu docker -gp tcuser runScriptInGuest /Users/tianon/.docker/machine/machines/default/default.vmx /bin/sh [ ! -d /hosthome ]&& sudo mkdir /hosthome; sudo mount --bind /mnt/hgfs//hosthome /hosthome || [ -f /usr/local/bin/vmhgfs-fuse ]&& sudo /usr/local/bin/vmhgfs-fuse -o allow_other .host:/Users /hosthome || sudo mount -t vmhgfs -o uid=$(id -u),gid=$(id -g) .host:/Users /hosthome

(Which explains /hosthome.)

@tianon tianon merged commit e0e2a22 into boot2docker:master Nov 20, 2018
@tianon tianon deleted the vmxnet3 branch November 20, 2018 00:21
@StefanScherer

Thanks @tianon I can reproduce your results. 🎉

  • With docker-machine 0.16.0 and the built-in vmwarefusion driver it still aborts due to the error checks at vmrun enableSharedFolders.
  • With docker-machine 0.16.0 and the additional vmware driver 0.1.0 I can create a Docker machine, and the shared folder is mounted at /hosthome instead of /Users.

@lunetics

@StefanScherer, should we open another ticket for the wrong or missing netstat output (used to detect whether Docker is running)?

@tianon
Contributor Author

tianon commented Nov 20, 2018 via email
