Now that I’m back in blogging mode, I figured a quick rundown of how we (Jake and I) got this compile cluster up and running was in order.
We started with Gentoo Linux as our base and five 400 MHz Celeron machines to power this beast… One became the master node (Mutha), leaving four machines dedicated to compiling. I looked at using LiveCDs for this but, because the nodes’ only real use at this point is running distccd and memory is at a premium, an nfsroot setup seemed to be the best option. Now there are plenty of online docs with a wealth of information, but their setup methods seem outdated and/or a little odd IMHO.
In addition to a pretty normal base install on the master node (Mutha), we also emerged dhcp, hpa-tftpd and ntpd, and used syslog-ng for our logger due to its network logging capabilities. I’ve put all our config files here and will be referring to specific files throughout this post.
Our network cards don’t have any kind of PXE boot option, so we got a CD ISO for them from the Etherboot folks. This allowed us to set up the needed combination of dhcp (/etc/conf.d/dhcp, /chroot/dhcp/etc/dhcp/dhcpd.conf) and tftp (/etc/conf.d/in.tftpd) for the remote monolithic kernel download. The only thing of note here is the use of ‘use-host-decl-names on;’ in dhcpd.conf to “push” the client’s host name at lease time, since our writeable directories are mounted on a per-host basis.
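To make the dhcp side a bit more concrete, here is a minimal sketch of what such a dhcpd.conf could look like. It is not our actual file: the subnet, addresses, MAC address and image file name are all made up, and the node name pig_01 is only inferred from the per-node directory naming.

```
# dhcpd.conf fragment (illustrative; all addresses, MACs and file names are assumptions)
use-host-decl-names on;               # send each host declaration's name as the client's host name

subnet 192.168.1.0 netmask 255.255.255.0 {
    option routers 192.168.1.1;
    next-server 192.168.1.1;          # TFTP server, i.e. the master node
    filename "/kernel.nbi";           # hypothetical Etherboot-tagged kernel image
    option root-path "192.168.1.1:/base";
}

host pig_01 {
    hardware ethernet 00:00:00:00:00:01;
    fixed-address 192.168.1.11;
}
```

With one host block per node, each client learns its own name along with its lease, which is what lets the boot scripts pick the right per-host directories later on.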
For our filesystem layout, we went with a common root directory exported read-only from Mutha. Each node got its own directory with etc, var and tmp subdirectories exported read/write (/etc/exports). Now the default Gentoo init scripts (v 1.9.0) will fail, since the nfsmount script is called so late in the boot sequence, so we had to massage things a bit to get everything mounted in its proper order and place.
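As a rough illustration of that layout, an /etc/exports along these lines would do the job. This is a sketch under assumptions, not a copy of our file: the /base and /nodes paths follow the directory names used in this post, while the subnet, the client address and the choice to export each node's directory as a whole (rather than each subdirectory separately) are guesses.

```
# /etc/exports fragment (illustrative; addresses and options are assumptions)
# shared root, read-only for every node
/base            192.168.1.0/255.255.255.0(ro,no_root_squash,sync)
# per-node writable tree holding etc, var and tmp
/nodes/pig_01    192.168.1.11(rw,no_root_squash,sync)
```

Here no_root_squash lets the init scripts on the clients act as root on the exported files.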
First of all, we added a ‘RC_NFSROOT’ variable to /base/etc/conf.d/rc, then we edited /base/etc/init.d/checkroot to bypass the root filesystem check and mount our machine-specific directories if our variable is set.
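The logic of that change can be sketched in a few lines of shell. This is only an illustration of the idea, not the real checkroot edit: NFS_SERVER, NODE_BASE and the node_mount_cmds helper are hypothetical names, the mount options are a guess, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only: the idea behind an RC_NFSROOT switch, not the real checkroot edit.
# RC_NFSROOT is the variable from conf.d/rc; NFS_SERVER and NODE_BASE are assumed names.
RC_NFSROOT="${RC_NFSROOT:-no}"
NFS_SERVER="${NFS_SERVER:-mutha}"
NODE_BASE="${NODE_BASE:-/nodes}"

# Print one mount command per writable directory for the given node.
node_mount_cmds() {
    node="$1"
    for dir in etc var tmp; do
        echo "mount -t nfs -o rw,nolock ${NFS_SERVER}:${NODE_BASE}/${node}/${dir} /${dir}"
    done
}

if [ "${RC_NFSROOT}" = "yes" ]; then
    # Root is a read-only NFS export, so there is nothing to fsck;
    # go straight to the per-host mounts (printed here, not executed).
    node_mount_cmds "$(uname -n)"
else
    echo "RC_NFSROOT not set: the normal root filesystem check would run here"
fi
```

The node name comes from the host name, which is why having dhcp hand out host names at lease time matters for this setup.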
Our syslog-ng setup was also pretty straightforward. The master node also serves as the logging machine (/etc/syslog-ng/syslog-ng.conf) and logs each of the client machines (/nodes/pig_**/etc/syslog-ng.conf) into its own file.
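For reference, a bare-bones version of that arrangement might look like the following. It is a sketch, not our real configuration: the source and destination names, the log directory and the master's address are all assumptions.

```
# On the master: accept syslog over UDP and write one file per sending host
source s_net { udp(ip(0.0.0.0) port(514)); };
destination d_nodes { file("/var/log/nodes/$HOST.log"); };
log { source(s_net); destination(d_nodes); };

# On each node: forward local messages to the master (address is made up)
source s_local { unix-stream("/dev/log"); internal(); };
destination d_mutha { udp("192.168.1.1" port(514)); };
log { source(s_local); destination(d_mutha); };
```

The $HOST macro in the file name is what splits the incoming messages into a separate log per client.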
Before copying over our node directories, we did a chroot into our base directory, made sure distccd, ntpd and syslog-ng were all set to start in the default runlevel, and set up eth0 to use dhcp.
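Roughly, the runlevel part of that step boils down to three rc-update calls inside the chroot. The helper below is a hypothetical sketch (the run and enable_node_services functions and the BASE and RUN variables are made up for illustration); it prints the commands by default and only executes them when RUN=1 is set.

```shell
#!/bin/sh
# Sketch only. BASE is the shared root on the master; RUN=1 executes instead of printing.
BASE="${BASE:-/base}"

# Print the command, or run it for real when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Add the services every node needs to the default runlevel of the shared root.
enable_node_services() {
    for svc in distccd ntpd syslog-ng; do
        run chroot "${BASE}" rc-update add "${svc}" default
    done
}

enable_node_services
```

On a baselayout of that era, eth0 would typically be pointed at dhcp with a line such as iface_eth0="dhcp" in the base directory's etc/conf.d/net; treat that as an assumption and check the comments in conf.d/net for your version.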
That’s pretty much it in a nutshell… Most packages compile on here slightly faster than on my 1.8 GHz P4 work machine, so performance seems to be right on target. When OSSI gets moved into the new office space, we will be moving this cluster into their facilities and hopefully adding 10 to 15 more 500-600 MHz machines.