Special Links for Beowulf Builders


This is a sample dhcpd.conf. It contains the relevant fragments for installing and net-configuring beowulf cluster nodes, although of course the same fragments make it equally easy to install workstations.
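As a rough illustration of what such fragments tend to look like, here is a minimal Python sketch that prints one ISC dhcpd `host` declaration per node. It is not taken from the linked sample: the node names, MAC addresses, IP addresses, and the boot server address 192.168.1.1 are all made-up placeholders, and `pxelinux.0` is assumed to be the boot loader served over tftp.

```python
def host_stanza(name, mac, ip, next_server="192.168.1.1", filename="pxelinux.0"):
    """Render one dhcpd.conf host declaration for a PXE-booting node.

    All argument values used below are hypothetical examples.
    """
    return (
        f"host {name} {{\n"
        f"    hardware ethernet {mac};\n"   # MAC address of the node's NIC
        f"    fixed-address {ip};\n"        # IP address handed to this node
        f"    next-server {next_server};\n" # tftp server holding the boot files
        f"    filename \"{filename}\";\n"   # boot loader the node should fetch
        f"}}\n"
    )

# Hypothetical node table: (hostname, MAC address, IP address).
nodes = [
    ("b01", "00:11:22:33:44:01", "192.168.1.101"),
    ("b02", "00:11:22:33:44:02", "192.168.1.102"),
]

dhcp_fragment = "\n".join(host_stanza(*node) for node in nodes)
print(dhcp_fragment)
```

Generating the stanzas from a single node table keeps names, MAC addresses, and IP addresses in one place, which matters more as the node count grows.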


This is a sample PXE (Preboot eXecution Environment) server configuration file. Also needed is a pxelinux configuration file that basically tells nodes where to find an install kernel and the right kickstart file. Some details of the tftpboot environment (where to locate various components) are also sketched out.
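To make the "where to find an install kernel and the right kickstart file" step a little more concrete, here is a hedged Python sketch. It lists the file names pxelinux tries under `pxelinux.cfg/` for a given node IP address (the address in uppercase hexadecimal, shortened one digit at a time, then `default`; lookups by UUID or MAC address that pxelinux may try first are left out), and it renders a simple configuration entry. The kernel and initrd names and the kickstart URL are placeholders, not values from the linked files.

```python
import ipaddress

def pxelinux_search_order(ip):
    """Return the IP-based config file names pxelinux tries, most specific first."""
    hex_ip = f"{int(ipaddress.IPv4Address(ip)):08X}"  # e.g. 192.168.1.101 -> C0A80165
    names = [hex_ip[:n] for n in range(8, 0, -1)]     # strip one hex digit per step
    names.append("default")                           # final fallback
    return names

def pxelinux_entry(label, kernel, initrd, ks_url):
    """Render a one-label pxelinux config that boots an installer with a kickstart file."""
    return (
        f"default {label}\n"
        f"label {label}\n"
        f"    kernel {kernel}\n"
        f"    append initrd={initrd} ks={ks_url}\n"
    )

search_order = pxelinux_search_order("192.168.1.101")
print(search_order)

# Hypothetical file names and kickstart location.
config_text = pxelinux_entry(
    "install", "vmlinuz", "initrd.img", "http://192.168.1.1/ks/node.cfg"
)
print(config_text)
```

A file named for a shortened hex prefix applies to every node whose address starts with that prefix, so a whole subnet of nodes can share one install configuration.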


An example kickstart file for a "cluster node". Again, details of the layout of the source directory are briefly discussed.


An example yum configuration file to help automate ongoing software update management. yum stands for "Yellowdog Updater, Modified". Yum fully automates the process of installing, removing, updating, listing, and otherwise maintaining a list of packages installed on a server from one or more repositories.

We use Red Hat, but one beauty of linux and open source is the implicit genetic optimization algorithm associated with the free exchange of ideas, the free opportunity to try new things and different paths to achieve the same goal. The pathway outlined above is just one such pathway to zero-marginal cost system installation and maintenance in linux. There are many more, each with its own merits, degree of functionality, and cost scalability advantages. A few of these are linked and briefly described below.

FAI (Fully Automated Installation).

FAI is an automated system to install a Debian GNU/Linux operating system on a Linux Cluster. The manual of FAI includes a chapter on how to build a Beowulf cluster using FAI.

Not to be outdone, Mandrakesoft has a high performance computing initiative designed to install a full suite of High Performance Computing packages onto a cluster via an automated point-and-click GUI. The suite is thoughtfully done. In addition to lots of more or less "standard" clustering tools (sshd, maui, ganglia, and more), it has some scripts of a sort that have long been discussed on the list. One is a "discovery" script that gathers MAC addresses from a cluster on a preinstall boot: turn on each PXE-configured node, wait for the PXE/DHCP server to see the MAC addresses, then run the script to glean the addresses out of the DHCP logs and -- presumably -- set up everything else required to boot the nodes into installation or operation in the future.
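The general idea of gleaning MAC addresses from DHCP logs can be sketched in a few lines of Python. This is not Mandrakesoft's script, only an illustration of the technique; it assumes syslog lines in the style ISC dhcpd writes ("DHCPDISCOVER from <MAC> via <interface>"), and the sample log lines are invented.

```python
import re

# Matches the MAC address in an assumed ISC dhcpd style DHCPDISCOVER log line.
DISCOVER_RE = re.compile(
    r"DHCPDISCOVER from ((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})"
)

def discover_macs(log_lines):
    """Return the unique MAC addresses seen in DHCPDISCOVER lines, in first-seen order."""
    seen = []
    for line in log_lines:
        match = DISCOVER_RE.search(line)
        if match:
            mac = match.group(1).lower()
            if mac not in seen:  # nodes usually retry, so skip repeats
                seen.append(mac)
    return seen

# Invented sample log; real logs would be read from the DHCP server's syslog file.
sample_log = [
    "Jan  1 12:00:01 head dhcpd: DHCPDISCOVER from 00:11:22:33:44:01 via eth0",
    "Jan  1 12:00:02 head dhcpd: DHCPDISCOVER from 00:11:22:33:44:01 via eth0",
    "Jan  1 12:00:09 head dhcpd: DHCPDISCOVER from 00:11:22:33:44:02 via eth0",
    "Jan  1 12:00:10 head dhcpd: DHCPOFFER on 192.168.1.101 to 00:11:22:33:44:02 via eth0",
]

macs = discover_macs(sample_log)
print(macs)
```

Keeping first-seen order means that powering the nodes on one at a time, in rack order, yields a MAC list in the same order, which makes it straightforward to assign node names and addresses afterwards.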

Scyld is a company founded by some of the creators of the original beowulf, notably Gordon Bell Prize winner Don Becker (also known for writing virtually all of the original Linux ethernet device drivers). The product they sell is a heavily specialized and customized linux distribution that is a beowulf in a box. It has its own specialized methodologies for head node and worker node installation that are extremely efficient and scalable by design. It creates a true Beowulf with a single unified process id space and single node control over the running parallelized task, as opposed to the more generic "cluster" produced by the approaches above (where the cluster nodes are essentially workstations, specialized in simple ways to a greater or lesser degree).


Clustermatic is based on Erik Hendriks' bproc program (also the basis of the Scyld approach) and creates a simple, scalable, "true Beowulf" type cluster with a single unified process id space so that jobs are run from a single front end "head node". The actual booting of the cluster/node image is accomplished with a two-step boot process, the first step of which can be initiated by any mix of LinuxBIOS, DHCP/PXE, a boot floppy, or a boot CD. It can be configured to run diskless or from a local image. Erik Hendriks is also an ex-member of the NASA Goddard beowulf team, and the clustermatic solution is in some sense an open source non-commercial beowulf in a box, where Scyld is open source but commercial. Clustermatic is highly scalable, as is demonstrated by its use on Pink, a 2048 processor linux cluster running at Los Alamos National Laboratory.

(CD-based) Diskless Linux

Diskless Linux (using a CD) is another highly scalable approach to building nodes. It has the advantage of saving money on node hardware: you don't need a hard disk at all. A disk typically costs on the order of $100, costs another $15 a year to run (at an assumed 15W average power consumption), and is one of the node components most likely to fail, thereby costing hours of administrator time identifying and replacing the broken component. It has disadvantages in terms of speed and core memory consumed, it can be considerably more time-intensive to upgrade and keep current, and of course you need a CD drive in each system and at least one burner.
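The $15 a year figure can be checked with a little arithmetic. The short Python sketch below works out the yearly energy used by a 15W disk and the electricity price that the $15 estimate implies. The 64-node cluster size is a made-up example, and cooling costs are ignored.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years

def annual_energy_kwh(watts):
    """Energy used in one year by a device drawing `watts` continuously."""
    return watts * HOURS_PER_YEAR / 1000.0

def annual_cost(watts, dollars_per_kwh):
    """Yearly electricity cost for a continuous load at a given price per kWh."""
    return annual_energy_kwh(watts) * dollars_per_kwh

disk_kwh = annual_energy_kwh(15)   # about 131.4 kWh per disk per year
implied_rate = 15.0 / disk_kwh     # price per kWh implied by $15 a year

# Hypothetical 64-node cluster with one disk per node, at the implied rate.
cluster_cost = 64 * annual_cost(15, implied_rate)

print(f"{disk_kwh:.1f} kWh/year per disk")
print(f"implied price: ${implied_rate:.3f} per kWh")
print(f"64 disks: ${cluster_cost:.0f} per year")
```

So the estimate corresponds to electricity at a bit over 11 cents per kWh, and the running cost scales linearly with the number of disks removed.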

Fully Diskless Linux

This article describes a different way of building completely diskless systems that is very similar to the general approach outlined above, except that instead of running just the install program (anaconda) after booting, the system boots up all of the way and mounts all of its key components (/, /usr, /home) from a remote server. As is the case for general node installation, it can easily be initiated either from a suitable boot floppy or, on a system with a PXE-equipped ethernet device, with no disks of any sort at all. The hardware and maintenance cost savings in the latter case are significant, although the additional load on the network may or may not be acceptable for all cluster designs or parallel applications.


Aduva sells a completely commercial (non-open source) toolset that can be used to manage mixed, nasty, heterogeneous distributions in a top-down fashion on a LAN. It would also "work" to manage a cluster in a manner similar to yum. I would tend to recommend against it in most cluster or LAN environments, as the primary value it adds is the ability to cope with and centrally manage versioning on what amount to bad cluster designs, and there are better ways to do that (don't use a bad cluster design -- use e.g. kickstart/yum, scyld, clustermatic). However, in certain relatively poorly managed corporate environments it might make sense, so I'll include the link -- it does give a single decent administrator at least some of the means to manage and control a nasty environment with lots of mediocre administrators.

Acknowledgements: The Duke institutional linux installation server, install.dulug.duke.edu, is a part of DULUG, the Duke University Linux Group. It was created and is maintained by Seth K. Vidal, who is the author and primary maintainer of the yum tool described above, and who is largely responsible for the incredible institutional-level scaling in Linux at Duke. Seth is the physics department's system administrator and a Linux Genius. He is helped by Icon Riabitsev, who has also made many contributions to the enterprise-level scaling of this approach, by Mike Stenner of the Duke physics department, and by many, many members of DULUG and the yum mailing/development list.

I am also grateful to Thomas Lange for the FAI link, and to Jim Phillips for reminding me to include a link to clustermatic and Erik Hendriks' bproc (sorry, Erik).

It is wise to be reasonably distribution and methodology agnostic, even as it is natural to promote the distribution and methodology one is most familiar with. That way everybody benefits -- choices are maximized, clever ideas (openly presented) can be stolen, stupid ideas (however clever they seemed at the time) can be abandoned, and linux can continue to evolve as one of the most powerful creations of the human spirit.

P.S. Icon, alas, is planning to move to Canada in a few years for a variety of reasons (mostly immigration/visa reasons). This is our great and tragic loss, but Canada's gain! Anybody in Canada wanting to pick up a great Linux administrator (and web developer) totally expert in managing systems at the enterprise level with this approach is encouraged to contact him and offer him lots of money. He won't be available until roughly 2005, though. I hope. I suppose it depends on how much money you're willing to offer...

This page was written and is maintained by Robert G. Brown rgb@phy.duke.edu