Frequently Asked Questions

Container Startup Issues

If your container is not starting, or not behaving as you would expect, the first thing to do is to look at the console logs generated by the container, using the lxc console --show-log CONTAINERNAME command.

In this example, we will investigate a RHEL 7 system in which systemd cannot start.

# lxc console --show-log systemd
Console log:

Failed to insert module 'autofs4'
Failed to insert module 'unix'
Failed to mount sysfs at /sys: Operation not permitted
Failed to mount proc at /proc: Operation not permitted
[!!!!!!] Failed to mount API filesystems, freezing.

The errors here say that /sys and /proc cannot be mounted, which is expected in an unprivileged container. However, LXD mounts these filesystems automatically if it can.

The container requirements specify that every container must come with empty /dev, /proc, and /sys directories, and that /sbin/init must exist. If those directories don’t exist, LXD will be unable to mount onto them, and systemd will then try to mount them itself. As this is an unprivileged container, systemd does not have the ability to do this, and it then freezes.
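The requirements above can be checked against an unpacked root filesystem before an image is built from it. The following is a minimal sketch: check_rootfs is a hypothetical helper name, not part of LXD, and the path passed to it is assumed to be a directory holding the extracted template. It only checks that the entries exist, not that the directories are empty.

```shell
# check_rootfs: hypothetical helper, not an LXD command.
# Prints one line per missing requirement; returns 1 if anything is missing.
check_rootfs() {
    rootfs="$1"
    missing=0

    # The mount points LXD needs to find inside the container.
    for dir in dev proc sys; do
        if [ ! -d "$rootfs/$dir" ]; then
            echo "missing directory: /$dir"
            missing=1
        fi
    done

    # /sbin/init is often a symlink. An absolute link target may not
    # resolve from outside the container, so accept a symlink as well.
    if [ ! -e "$rootfs/sbin/init" ] && [ ! -L "$rootfs/sbin/init" ]; then
        echo "missing file: /sbin/init"
        missing=1
    fi

    return "$missing"
}
```

For example, check_rootfs /tmp/rhel7-rootfs (an illustrative path) would list whichever of the four entries the template lacks.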

To see the environment as it is before anything is changed, you can explicitly override the init in a container using the raw.lxc configuration key. This is equivalent to setting init=/bin/bash on the Linux kernel command line.

lxc config set systemd raw.lxc 'lxc.init.cmd = /bin/bash'

Here is what it looks like:

root@lxc-01:~# lxc config set systemd raw.lxc 'lxc.init.cmd = /bin/bash'
root@lxc-01:~# lxc start systemd
root@lxc-01:~# lxc console --show-log systemd

Console log:

[root@systemd /]#
root@lxc-01:~#

Now that the container has started, you can look in it and see that things are not running as well as expected.

root@lxc-01:~# lxc exec systemd bash
[root@systemd ~]# ls
[root@systemd ~]# mount
mount: failed to read mtab: No such file or directory
[root@systemd ~]# cd /
[root@systemd /]# ls /proc/
sys
[root@systemd /]# exit

Because LXD tries to auto-heal, it created some of the directories when it was starting up. Shutting down and restarting the container will fix the problem, but the original cause is still there: the template does not contain the required files.
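Once the investigation is finished, the init override needs to be removed again, otherwise the container will keep booting into bash. A minimal sketch of that cleanup follows. restore_default_init is a hypothetical helper name, and the sketch assumes that raw.lxc holds nothing except the lxc.init.cmd line set earlier, because unsetting the key clears all of it.

```shell
# restore_default_init: hypothetical helper, not an LXD command.
restore_default_init() {
    name="$1"

    # bash running as init may not respond to a clean shutdown request,
    # so force the stop.
    lxc stop "$name" --force || return 1

    # Clears the whole raw.lxc key (assumed to contain only the
    # lxc.init.cmd override).
    lxc config unset "$name" raw.lxc || return 1

    # Boot again with the image's normal init.
    lxc start "$name"
}
```

For the container used in this example, the call would be restore_default_init systemd.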

Networking Issues

In a larger production environment, it is common to have multiple VLANs and to have LXD clients attached directly to those VLANs. Be aware that if you are using netplan and systemd-networkd, you will encounter some bugs that could cause catastrophic issues.

Do not use systemd-networkd with netplan and bridges based on vlans

At time of writing (2019-03-05), netplan cannot assign a random MAC address to a bridge attached to a VLAN. It always picks the same MAC address, which causes layer 2 issues when you have more than one machine on the same network segment. It also has difficulty creating multiple bridges. Make sure you use NetworkManager as the renderer instead. An example config is below, with a management address of 10.61.0.25, and VLAN 102 being used for client traffic.

network:
  version: 2
  renderer: NetworkManager
  ethernets:
    eth0:
      dhcp4: no
      accept-ra: no
      # This is the 'Management Address'
      addresses: [ 10.61.0.25/24 ]
      gateway4: 10.61.0.1
      nameservers:
        addresses: [ 1.1.1.1, 8.8.8.8 ]
    eth1:
      dhcp4: no
      accept-ra: no
      # A bogus IP address is required to ensure the link state is up
      addresses: [ 10.254.254.25/32 ]

  vlans:
    vlan102:
      accept-ra: no
      dhcp4: no
      id: 102
      link: eth1

  bridges:
    br102:
      accept-ra: no
      dhcp4: no
      interfaces: [ "vlan102" ]
      # A bogus IP address is required to ensure the link state is up
      addresses: [ 10.254.102.25/32 ]
      parameters:
        stp: false

Things to note

eth0 is the Management interface, with the default gateway.
vlan102 uses eth1.
br102 uses vlan102, and has a bogus /32 IP address assigned to it.

The other important thing is to set stp: false, otherwise the bridge will sit in learning state for up to 10 seconds, which is longer than most DHCP requests last. As there is no possibility of cross-connecting and causing loops, this is safe to do.
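After the configuration has been applied, it can be worth confirming that STP really is off on the bridge. The sketch below reads the stp_state file that the Linux bridge driver exposes in sysfs, where 0 means STP is disabled. bridge_stp_state is a hypothetical helper name, and the optional second argument (an alternative sysfs root) exists only to make the function easy to test.

```shell
# bridge_stp_state: hypothetical helper.
# Prints "disabled", "enabled", or "unknown" for the given bridge.
bridge_stp_state() {
    bridge="$1"
    sysfs="${2:-/sys}"   # alternative sysfs root, for testing only
    state_file="$sysfs/class/net/$bridge/bridge/stp_state"

    if [ ! -r "$state_file" ]; then
        # Not a bridge, or the interface does not exist.
        echo "unknown"
        return 1
    fi

    # 0 = STP off; any other value means some form of STP is active.
    case "$(cat "$state_file")" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;
    esac
}
```

With the example config, bridge_stp_state br102 should print "disabled".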

Beware of ‘port security’

Many switches do not allow MAC address changes, and will either drop traffic with an incorrect MAC or disable the port entirely. If you can ping an LXD instance from the host, but are not able to ping it from a different host, this could be the cause. To diagnose this, run tcpdump on the uplink (in this case, eth1). You will see either ‘ARP Who has xx.xx.xx.xx tell yy.yy.yy.yy’ requests, with your replies being sent but never acknowledged, or ICMP packets going in and out successfully but never being received by the other host.
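One possible way to run that capture is sketched below. watch_uplink is a hypothetical wrapper name, and the instance address in the usage note is made up for illustration. The capture is limited to ARP and ICMP traffic involving the instance, and the -e option prints the MAC addresses on each frame, so you can see which source MAC the replies leave with.

```shell
# watch_uplink: hypothetical wrapper around tcpdump (needs root).
#   $1 = uplink interface, $2 = IP address of the LXD instance
watch_uplink() {
    iface="$1"
    ip="$2"
    # -n  do not resolve names
    # -e  print the link-level header, including source and destination MAC
    tcpdump -n -e -i "$iface" "host $ip and (arp or icmp)"
}
```

For example, watch_uplink eth1 10.61.102.50 while pinging the instance from another host.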

Do not run privileged containers unless necessary

A privileged container can do things that affect the entire host. For example, it can use things in /sys to reset the network card, which will reset it for the entire host, causing network blips. Almost everything can be run in an unprivileged container. For things that require unusual privileges, such as mounting NFS filesystems inside the container, you may need to use bind mounts instead.
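As a rough sketch of both points, the helpers below check whether a container has security.privileged set, and pass a directory that is already mounted on the host into a container as a disk device. The function names, device name, and paths are hypothetical. Note that a plain lxc config get only reports configuration set on the instance itself, so a value inherited from a profile would not show up in this check.

```shell
# is_privileged: hypothetical helper. Succeeds if the container's own
# config sets security.privileged to a true value.
is_privileged() {
    value=$(lxc config get "$1" security.privileged | tr '[:upper:]' '[:lower:]')
    case "$value" in
        true|1|yes|on) return 0 ;;
        *) return 1 ;;
    esac
}

# share_host_dir: hypothetical helper. Bind-mounts a host directory
# (for example an NFS share mounted on the host) into the container.
#   $1 = container, $2 = device name, $3 = host path, $4 = path in container
share_host_dir() {
    lxc config device add "$1" "$2" disk source="$3" path="$4"
}
```

For example, share_host_dir c1 nfsdata /mnt/nfs /mnt/data would expose the host's /mnt/nfs as /mnt/data inside a container named c1, without the container needing the privileges to mount NFS itself.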