Following on from the fairly straightforward process of building my new Raspberry Pi server, I thought I’d document the software stack too.

My previous, Intel, server had several LXC containers dedicated to different tasks. I sometimes give other people access to my home server, and I run some pieces of software which have conflicting or numerous dependencies, and my obsessive tendency makes me want to keep them apart.

Examples include an X desktop environment, and anything of significant size written in Ruby or Python, which tends to sprout -dev dependencies so they can be built by gem or pip. virtualenv and rbenv don’t help you manage these external dependencies. Most people might consider a whole separate Debian install overkill; if you do too, then this page is probably not for you.

Given that I only had 512MB of RAM to play with on the Pi, and not the 2GB I had on the Intel machine, I didn’t want to add the overhead of libvirt and its associated daemons. I also didn’t want to run through the whole init process when all I really want is a couple of specific daemons and a separate network environment. Finally, I’m not bothered about limiting memory usage, and the memory cgroup driver has enough overhead to be disabled by default in Debian. I’ve not done any benchmarks, but I think the little ARMv6 in the Pi needs all the help it can get.

At $WORK we’re using the libvirt LXC stack rather than the Debian/Ubuntu LXC package, because we had existing libvirt experience from KVM. Sadly, getting libvirt LXC working the way that we wanted required a fair bit of reading the source code. Luckily, that started to open my eyes to how easy this LXC stuff really was. I hacked the bare minimum together one evening:

Just enough to get a separate process namespace going. A few nights later I had a container with networking and a getty, and I was closer to done than I thought at the time. I’d started to go down the rabbit-hole of getty and terminal control and select, when really all I wanted was a machine that was network accessible enough to run SSH.

I needed to do a bit more work around filesystem isolation: after calling pivot_root you still need to unmount all of the old filesystems, in reverse-length order. That is, the deepest mounts before the shallow ones. I straight-up lifted some code from libvirt to do this, which is why my whole program is now LGPL licensed and blessed with an IBM/RedHat/LGPL copyright statement which almost doubles its length.
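To illustrate the ordering idea only (this is not the code lifted from libvirt), here is a minimal sketch that sorts a list of mount points longest path first. A child mount's path is always longer than its parent's, so walking the sorted list and unmounting each entry never hits a busy parent. The function names and the example paths are made up for the illustration.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator: longer paths sort first, so a mount such as
 * "/.old/proc/sys" is handled before its parent "/.old/proc".
 * Ties are broken by reverse strcmp to keep the order deterministic. */
static int by_length_desc(const void *a, const void *b)
{
    const char *pa = *(const char *const *)a;
    const char *pb = *(const char *const *)b;
    size_t la = strlen(pa), lb = strlen(pb);

    if (la != lb)
        return (la < lb) ? 1 : -1;
    return strcmp(pb, pa);
}

/* Reorder an array of mount-point paths so the deepest come first.
 * The caller would then unmount each entry in turn. */
void sort_deepest_first(const char **paths, size_t count)
{
    qsort(paths, count, sizeof(paths[0]), by_length_desc);
}
```

In a real program the list would come from the mount table of the old root, and each sorted entry would be passed to umount.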

The full C source is available here.

The two main parts of interest operate in two different threads, one of which is in the “real world” of the bare OS, and one of which is executing within an LXC context.

snprintf(buf, BUFSIZ, "ip link add name %s type veth peer name slave", v[1]);
system(buf);
snprintf(buf, BUFSIZ, "ip link set dev %s up", v[1]);
system(buf);
snprintf(buf, BUFSIZ, "brctl addif br1 %s", v[1]);
system(buf);
sleep(5); // Give the bridge a chance to spin up
pid = clone(child, stacktop, cflags, pointer);
snprintf(buf, BUFSIZ, "ip link set slave netns %d", pid);
system(buf);

Briefly, on the outside, I create a veth pair: two virtual Ethernet devices linked by an imaginary piece of string. One is named after the container, the other is always called “slave”. Then I set the outside end “up”, and add it to a bridge (hardcoded to suit my needs). Despite disabling spanning-tree protocol, I still had better results sleeping for a few seconds here. Then I call clone with the magic LXC flags and get a new thread in a new namespace. Finally, I move the “slave” NIC into the new networking namespace, from the outside, and exit.
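Because the container name is pasted straight into commands run through system(), it is worth checking it first. The following is a sketch of one way to do that, not something the original program does as far as this post shows: name_is_safe and build_veth_cmd are hypothetical helpers, and the allowed character set is my own assumption. The 15-character limit comes from the kernel's interface name limit (IFNAMSIZ is 16, including the terminating NUL).

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Linux interface names are limited to IFNAMSIZ - 1 = 15 characters. */
#define MAX_IFNAME 15

/* Accept only [A-Za-z0-9_-] with no leading '-', so the name is harmless
 * inside a shell command and short enough to be an interface name. */
int name_is_safe(const char *name)
{
    size_t len = strlen(name);

    if (len == 0 || len > MAX_IFNAME || name[0] == '-')
        return 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        if (!isalnum(c) && c != '_' && c != '-')
            return 0;
    }
    return 1;
}

/* Format the veth-creation command. Returns 0 on success, -1 if the name
 * is rejected or the command would not fit in the buffer. */
int build_veth_cmd(char *buf, size_t size, const char *name)
{
    if (!name_is_safe(name))
        return -1;
    int n = snprintf(buf, size,
                     "ip link add name %s type veth peer name slave", name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The caller would only pass buf to system() when build_veth_cmd returns 0, and could check the return value of system() too, so that a failed ip command stops the container from starting half-configured.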

snprintf(buf, BUFSIZ, "/lxc/%s/.old", n);
mkdir(buf, 0700);
snprintf(buf2, BUFSIZ, "/lxc/%s", n);
pivot_root(buf2, buf);
chdir("/");
mount("/proc", "/proc", "proc", 0, NULL);
close(2);
close(1);
close(0);
unmount_old();
mount("devpts", "/dev/pts", "devpts", 0, "newinstance,ptmxmode=0666,mode=0620,gid=5");
execl("/sbin/init", "/sbin/init", (char *)NULL);
perror(NULL); // Only reached if execl fails
return 0;

In the container, I create somewhere to move the existing root to, within the new root, which I’ve hardcoded as “/lxc/$containername”. Once we’ve pivoted, we change directory to “/” because our current working directory is undefined, mount /proc because various things expect it, close our file descriptors (which may relate to a terminal on the outside, which would be bad), unmount the old filesystems, create a new devpts filesystem instance and finally, execute init, letting it take over our process. We never actually hit the perror or return unless the execl call fails.
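One detail the excerpt glosses over: glibc's headers don't declare a pivot_root function, so a program has to reach the system call through syscall(2) or supply its own declaration. How the full source handles that isn't shown here. The sketch below is one possible shape, with do_pivot_root and enter_new_root as hypothetical names, and with the return values checked rather than ignored.

```c
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper: invoke the pivot_root system call directly.
 * Returns 0 on success, or -1 with errno set. */
int do_pivot_root(const char *new_root, const char *put_old)
{
    return (int)syscall(SYS_pivot_root, new_root, put_old);
}

/* Pivot, then move to the new "/" so the working directory is
 * well defined. Returns 0 on success, -1 on the first failure. */
int enter_new_root(const char *new_root, const char *put_old)
{
    if (do_pivot_root(new_root, put_old) != 0)
        return -1;
    return chdir("/");
}
```

With the paths from the listing above, the call would be enter_new_root("/lxc/one", "/lxc/one/.old"), and a -1 result is a good reason to bail out before mounting anything or executing init.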

Addendum: I forgot to mention, you’ll need to add some missing functionality into the Raspbian kernel, primarily veth device support. I followed these steps from Yohei Kuga, though I think CONFIG_VETH was all I actually had to change.

--- .config.old 2013-05-18 11:20:10.000000000 +0000
+++ .config 2013-05-18 15:11:11.000000000 +0000
@@ -1211,7 +1211,7 @@
 # CONFIG_NETPOLL_TRAP is not set
 CONFIG_NET_POLL_CONTROLLER=y
 CONFIG_TUN=m
-# CONFIG_VETH is not set
+CONFIG_VETH=m

 #
 # CAIF transport drivers

Pro tip: Do not do your git checkout of the very large Linux kernel tree on the Pi; if at all possible, check it out on a more powerful machine and then rsync the files across. That said, if you do it from a Mac OS X install which has a case-insensitive filesystem, you’ll probably need to git reset --hard to fix the mangled filenames, which will take a surprisingly long time.

Also: budget about a day for the kernel compile.

Now we need somewhere to pivot to. I use a USB-connected HDD with LVM, so I create a new filesystem, mount it where I want it, and put an empty Debian on there:

lvcreate -L 10G -n one WDG
mkfs.ext4 /dev/WDG/one
mkdir -p /lxc/one
mount /dev/WDG/one /lxc/one
debootstrap wheezy /lxc/one --include=ssh

Time passes.

Now you’ll need to do some initial configuration:

echo one > /lxc/one/etc/hostname
cat > /lxc/one/etc/hosts <<EOF
127.0.0.1 localhost localhost.localdomain
127.0.1.1 one one.insom.me.uk
EOF
cat > /lxc/one/etc/inittab <<EOF
rc::bootwait:/etc/rc
id:2:initdefault:
ssh:2:respawn:/usr/sbin/sshd -D
EOF
cat > /lxc/one/etc/rc <<EOF
#!/bin/sh
/bin/hostname one
/sbin/ip link set dev lo up
/sbin/ip link set dev slave up
/sbin/ip addr add 192.168.1.3/24 dev slave
/sbin/ip route add default via 192.168.1.1
/sbin/ip -6 addr add 2a01:348:2e0:cfff::3/64 dev slave
/sbin/ip -6 route add default via 2a01:348:2e0:cfff::1
EOF
chmod a+x /lxc/one/etc/rc

That’ll set up a basic network configuration and SSH on boot. You won’t be able to log in as root because it doesn’t have a password, so let’s set one by running passwd in the new filesystem:

chroot /lxc/one passwd root

If you followed the above, you should even be able to run your new container. Compile the C attached to the main gist and run it as root with “one” as its first parameter, and enjoy the fruits of your effort:

# ./lxc-run one
# ps axwf
...
15645 ?        Ss     0:00 init [2]
15660 ?        Ss     0:00  \_ /usr/sbin/sshd -D
...
# ssh root@192.168.1.3
Warning: Permanently added '192.168.1.3' (ECDSA) to the list of known hosts.
root@192.168.1.3's password:
Linux pi 3.6.11+ #2 PREEMPT Mon May 20 14:05:58 UTC 2013 armv6l

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
root@one:~# ps axwf
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:00 init [2]
   12 ?        Ss     0:00 /usr/sbin/sshd -D
   13 ?        Ss     0:00  \_ sshd: root@pts/2
   14 pts/2    Ss     0:00      \_ -bash
   20 pts/2    R+     0:00          \_ ps axwf

(We’ve upped the pid count a bit because we executed all of those commands in /etc/rc, remember? That’s why sshd isn’t pid #2).