Following on from the fairly straightforward process of building my new Raspberry Pi server, I thought I’d document the software stack too.
My previous, Intel, server had several LXC containers dedicated to different tasks. I sometimes give other people access to my home server, and I run some pieces of software which have conflicting or numerous dependencies, and my obsessive tendency makes me want to keep them apart.
Examples include an X desktop environment, and anything of significant size written in Ruby or Python, which tends to sprout -dev dependencies so that it can be built. Tools like rbenv don’t help you manage these external dependencies. Most people might consider a whole separate Debian install overkill; if you do too, then this page is probably not for you.
Given that I only had 512MB of RAM to play with on the Pi, and not the 2GB I had on the Intel machine, I didn’t want to add the overhead of libvirt and its associated daemons. I also didn’t want to run through the whole init process when all I really want is a couple of specific daemons and a separate network environment. Finally, I’m not bothered about limiting memory usage, and the memory cgroup driver has enough overhead to be disabled by default in Debian. I’ve not done any benchmarks, but I think the little ARMv6 in the Pi needs all the help it can get.
At $WORK we’re using the libvirt LXC stack rather than the Debian/Ubuntu LXC package, because we had existing libvirt experience from KVM. Sadly, getting libvirt LXC working the way that we wanted required a fair bit of reading the source code. Luckily, that started to open my eyes to how easy this LXC stuff really was. I hacked the bare minimum together one evening:
gist.github.com/insom/5528775 Getting some C on so I can bring up little LXC containers without libvirt or the Debian LXC package.
— Aaron Brady (@insom) May 6, 2013
Just enough to get a separate process namespace going. A few nights later I had a container with networking and a getty, and I was closer to done than I thought at the time. I’d started to go down the rabbit-hole of getty and terminal control and select, when really all I wanted was a machine that was network accessible enough to run SSH.
I needed to do a bit more work around filesystem isolation: after calling pivot_root you still need to unmount all of the old filesystems in reverse-length order, that is, the deepest mounts before the shallow ones. I straight-up lifted some code from libvirt to do this, which is why my whole programme is now LGPL licensed and blessed with an IBM/RedHat/LGPL copyright statement which almost doubles its length.
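To make the ordering concrete, here is a minimal sketch of the "deepest first" idea. It is not the code lifted from libvirt; the function names are hypothetical, and it assumes you already have the list of mount-point paths in an array (for example, read from /proc/mounts).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: longer paths first, ties broken alphabetically so the
 * result is deterministic. A child mount's path is always longer than its
 * parent's, so longest-first puts children before parents. */
static int by_length_desc(const void *a, const void *b)
{
    const char *pa = *(const char *const *)a;
    const char *pb = *(const char *const *)b;
    size_t la = strlen(pa), lb = strlen(pb);

    if (la != lb)
        return la > lb ? -1 : 1;
    return strcmp(pa, pb);
}

/* Sort an array of mount-point paths in place, deepest first.
 * (Hypothetical helper; the caller would then umount each entry in order.) */
static void sort_for_unmount(const char **paths, size_t n)
{
    assert(paths != NULL || n == 0);
    qsort(paths, n, sizeof *paths, by_length_desc);
}
```

Given "/.old", "/.old/sys" and "/.old/sys/fs/cgroup", this puts the cgroup mount first and "/.old" last, which is the order in which they can actually be unmounted.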
The full C source is available here.
The two main parts of interest operate in two different threads, one of which is in the “real world” of the bare OS, and one of which is executing within an LXC context.
    snprintf(buf, BUFSIZ, "ip link add name %s type veth peer name slave", v);
    system(buf);
    snprintf(buf, BUFSIZ, "ip link set dev %s up", v);
    system(buf);
    snprintf(buf, BUFSIZ, "brctl addif br1 %s", v);
    system(buf);
    sleep(5); // Give the bridge a chance to spin up
    pid = clone(child, stacktop, cflags, pointer);
    snprintf(buf, BUFSIZ, "ip link set slave netns %d", pid);
    system(buf);
Briefly, on the outside, I create a veth pair: two virtual Ethernet devices linked by an imaginary piece of string. One is named after the container, the other is always called “slave”. Then I set the outer end “up” and add it to a bridge (hardcoded to suit my needs). Despite disabling spanning-tree protocol, I still had better results sleeping for a few seconds here. Then I call clone with the magic LXC flags and get a new thread in a new namespace. Finally, I move the “slave” NIC into the new networking namespace, from the outside, and exit.
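Because the container name is pasted straight into a shell command and also becomes an interface name, it is worth checking before calling system. The following is only a sketch of one way to do that, not part of the original programme: valid_container_name and veth_add_cmd are hypothetical helpers, and the allowed character set is my own assumption. The 15-character limit comes from Linux's IFNAMSIZ of 16, which includes the terminating NUL.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define IFNAME_MAX 15 /* IFNAMSIZ (16) minus the terminating NUL */

/* Return 1 if name is short enough to be an interface name and contains
 * only letters, digits, '-' and '_' (not starting with '-'), else 0.
 * The character set is an assumption chosen to keep the shell happy. */
static int valid_container_name(const char *name)
{
    size_t len = strlen(name);

    if (len == 0 || len > IFNAME_MAX || name[0] == '-')
        return 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        if (!isalnum(c) && c != '-' && c != '_')
            return 0;
    }
    return 1;
}

/* Format the veth-creation command into buf. Returns 0 on success, or -1
 * if the name is rejected or the command would not fit in the buffer. */
static int veth_add_cmd(char *buf, size_t size, const char *name)
{
    int n;

    assert(buf != NULL && size > 0);
    if (!valid_container_name(name))
        return -1;
    n = snprintf(buf, size,
                 "ip link add name %s type veth peer name slave", name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With this in place, a name like "one" produces the same command string as before, while something like "one; reboot" is refused before it ever reaches the shell.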
    snprintf(buf, BUFSIZ, "/lxc/%s/.old", n);
    mkdir(buf, 0700);
    snprintf(buf2, BUFSIZ, "/lxc/%s", n);
    pivot_root(buf2, buf);
    chdir("/");
    mount("/proc", "/proc", "proc", 0, NULL);
    close(2);
    close(1);
    close(0);
    unmount_old();
    mount("devpts", "/dev/pts", "devpts", 0, "newinstance,ptmxmode=0666,mode=0620,gid=5");
    execl("/sbin/init", "/sbin/init", (char *)NULL);
    perror(NULL);
    return 0;
In the container, I create somewhere to move the existing root to, within the new root, which I’ve hardcoded as “/lxc/$containername”. Once we’ve pivoted, we change directory to “/” because our current working directory is undefined, mount /proc because various things expect it, close our file descriptors (which may relate to a terminal on the outside, which would be bad), unmount the old filesystems, create a new devpts filesystem instance and finally execute init, letting it take over our process. We never actually hit the return unless the execl call fails.
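The "only returns on failure" behaviour of execl is easy to see in isolation. This is a small sketch rather than anything from the original source; exec_or_errno is a hypothetical wrapper name.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Try to replace the current process with the program at path.
 * On success this never returns, because the calling programme no longer
 * exists. On failure it returns the errno value set by execl. */
static int exec_or_errno(const char *path)
{
    assert(path != NULL);
    execl(path, path, (char *)NULL);
    return errno; /* only reached if execl failed */
}
```

Calling it with a path that does not exist hands back ENOENT, which is the kind of failure the perror call in the container code would report.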
Addendum: I forgot to mention, you’ll need to add some missing functionality into the Raspbian kernel, primarily veth device support. I followed these steps from Yohei Kuga, though I think CONFIG_VETH was all I actually had to change.
    --- .config.old 2013-05-18 11:20:10.000000000 +0000
    +++ .config 2013-05-18 15:11:11.000000000 +0000
    @@ -1211,7 +1211,7 @@
     # CONFIG_NETPOLL_TRAP is not set
     CONFIG_NET_POLL_CONTROLLER=y
     CONFIG_TUN=m
    -# CONFIG_VETH is not set
    +CONFIG_VETH=m
     #
     # CAIF transport drivers
Pro tip: Do not do your git checkout of the very large Linux kernel tree on the Pi; if at all possible, check it out on a more powerful machine and then rsync the files across. That said, if you do it from a Mac OS X install which has a case-insensitive filesystem, you’ll probably need to git reset --hard to fix the mangled filenames, which will take a surprisingly long time.
Also: budget about a day for the kernel compile.
Now we need somewhere to pivot to. I use a USB-connected HDD with LVM, so I create a new filesystem, mount it where I want it, and put an empty Debian on there:
    lvcreate -L 10G -n one WDG
    mkfs.ext4 /dev/WDG/one
    mkdir -p /lxc/one
    mount /dev/WDG/one /lxc/one
    debootstrap wheezy /lxc/one --include=ssh
Now you’ll need to do some initial configuration:
    echo one > /lxc/one/etc/hostname
    cat > /lxc/one/etc/hosts <<EOF
    127.0.0.1 localhost localhost.localdomain
    127.0.1.1 one one.insom.me.uk
    EOF
    cat > /lxc/one/etc/inittab <<EOF
    rc::bootwait:/etc/rc
    id:2:initdefault:
    ssh:2:respawn:/usr/sbin/sshd -D
    EOF
    cat > /lxc/one/etc/rc <<EOF
    #!/bin/sh
    /bin/hostname one
    /sbin/ip link set dev lo up
    /sbin/ip link set dev slave up
    /sbin/ip addr add 192.168.1.3/24 dev slave
    /sbin/ip route add default via 192.168.1.1
    /sbin/ip -6 addr add 2a01:348:2e0:cfff::3/64 dev slave
    /sbin/ip -6 route add default via 2a01:348:2e0:cfff::1
    EOF
    chmod a+x /lxc/one/etc/rc
That’ll set up a basic network configuration and SSH on boot. You won’t be able to log in as root because it doesn’t have a password, so let’s set one by running passwd in the new filesystem:

    chroot /lxc/one passwd root
If you followed the above, you should even be able to run your new container. Compile the C attached to the main gist and run it as root with “one” as its first parameter, and enjoy the fruits of your effort:
    # ./lxc-run one
    # ps axwf
    ...
    15645 ?     Ss 0:00 init
    15660 ?     Ss 0:00  \_ /usr/sbin/sshd -D
    ...
    # ssh root@192.168.1.3
    Warning: Permanently added '192.168.1.3' (ECDSA) to the list of known hosts.
    root@192.168.1.3's password:
    Linux pi 3.6.11+ #2 PREEMPT Mon May 20 14:05:58 UTC 2013 armv6l

    The programs included with the Debian GNU/Linux system are free software;
    the exact distribution terms for each program are described in the
    individual files in /usr/share/doc/*/copyright.

    Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
    permitted by applicable law.
    root@one:~# ps axwf
    PID TTY   STAT TIME COMMAND
      1 ?     Ss   0:00 init
     12 ?     Ss   0:00 /usr/sbin/sshd -D
     13 ?     Ss   0:00  \_ sshd: root@pts/2
     14 pts/2 Ss   0:00      \_ -bash
     20 pts/2 R+   0:00          \_ ps axwf
(We’ve upped the pid count a bit because we executed all of those commands in /etc/rc, remember? That’s why sshd isn’t pid #2.)