Special Links for Beowulf Builders
- dhcpd.conf
- This is a sample dhcpd.conf. It contains the relevant fragments
for installing and net-configuring beowulf cluster nodes, although of
course it is equally easy to install workstations.
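For concreteness, a minimal sketch of such a fragment is given here. The
subnet, addresses, and host name are illustrative assumptions, not the
contents of the linked file:

    allow booting;
    allow bootp;

    subnet 192.168.1.0 netmask 255.255.255.0 {
        option routers 192.168.1.1;
        option domain-name-servers 192.168.1.1;
        next-server 192.168.1.1;      # tftp server that holds pxelinux.0
        filename "pxelinux.0";        # PXE boot loader handed to the node

        host b01 {
            hardware ethernet 00:11:22:33:44:55;  # node's MAC address
            fixed-address 192.168.1.101;
        }
    }

One host stanza per node ties each MAC address to a fixed IP address, so the
same file both answers PXE boot requests and net-configures the installed
node.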
- pxe-server.conf
- This is a sample PXE (Preboot eXecution Environment) server
configuration file. Also needed is a pxelinux
configuration file that basically tells nodes where to find an
install kernel and the right kickstart file. Some details of the
tftpboot environment (where to locate various components) are also
sketched out.
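A pxelinux configuration file is quite short. The following is a hedged
sketch of a pxelinux.cfg/default (the kernel, initrd, and kickstart URL are
assumptions) that points nodes at an install kernel and a kickstart file:

    default ks
    prompt 0
    timeout 10

    label ks
        kernel vmlinuz              # install kernel, relative to the tftp root
        append initrd=initrd.img ks=http://install.server/beowulf.ks

Here pxelinux.0, vmlinuz, and initrd.img all live under the tftpboot
directory exported by the tftp daemon.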
- beowulf.ks
- An example kickstart file for a "cluster node". Again, details of the
layout of the source directory are briefly discussed.
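A node kickstart file is essentially a list of answers to the questions the
installer would otherwise ask interactively. A hedged sketch (the install
URL, partitioning, and package selection below are assumptions, not the
contents of beowulf.ks):

    install
    url --url http://install.server/redhat/i386
    lang en_US
    keyboard us
    network --bootproto dhcp
    timezone America/New_York
    clearpart --all --initlabel
    part /    --size 4096
    part swap --size 512
    rootpw changeme        # use an encrypted password (--iscrypted) in practice
    reboot

    %packages
    @ Base
    openssh-server

    %post
    # site-specific setup (copy in yum.conf, ssh keys, etc.) goes here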
- yum.conf
-
An example yum
configuration file to help automate ongoing software update management.
yum stands for "Yellowdog Updater, Modified". Yum fully automates the
process of installing, removing, updating, listing, and otherwise
maintaining a list of packages installed on a server from one or more
repositories.
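A yum.conf for this purpose is little more than a list of repositories. A
minimal sketch (the repository URLs are assumptions):

    [main]
    cachedir=/var/cache/yum
    logfile=/var/log/yum.log

    [base]
    name=Red Hat Linux base packages
    baseurl=http://install.server/yum/base

    [updates]
    name=Red Hat Linux updates
    baseurl=http://install.server/yum/updates

With a nightly "yum update" run from cron on every node, the whole cluster
tracks the repositories with no per-node hand work at all.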
We use Red Hat, but one beauty of linux and open source is the
implicit genetic optimization algorithm associated with the free
exchange of ideas, the free opportunity to try new things and different
paths to achieve the same goal. The pathway outlined above is just
one such pathway to zero-marginal-cost system installation and
maintenance in linux. There are many more, each with its own
merits, degree of functionality, and cost and scalability advantages. A few
of these are linked and briefly described below.
- FAI (Fully
Automatic Installation)
-
FAI is an automated system for installing a Debian GNU/Linux
operating system on a Linux cluster. The FAI manual includes a
chapter
on how to build a Beowulf cluster using FAI.
- CLIC
- Not to be outdone, MandrakeSoft has a high performance computing
initiative designed to install a full suite of High Performance
Computing packages onto a cluster via an automated point-and-click
GUI. The suite is thoughtfully done. In addition to lots of
more or less "standard" clustering tools (sshd, maui, ganglia, and more)
it has some scripts that have long been discussed on the list -- a
"discovery" script that gathers MAC addresses from a cluster on a
preinstall boot (turn on each PXE-configured node, wait
for the PXE/DHCP server to gather the MAC addresses, run the script to
glean the addresses out of the DHCP logs, and -- presumably -- set up
everything else required to boot the nodes into installation or
operation in the future), and more.
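That gleaning step is not magic. A rough equivalent in shell (not the CLIC
script itself -- the log path varies by distribution and the pattern is an
assumption):

    # Collect the MAC addresses of nodes that have asked the DHCP server for
    # an address, as logged by dhcpd via syslog:
    grep DHCPDISCOVER /var/log/messages \
        | grep -o '[0-9a-f:]\{17\}' \
        | sort -u > node-macs.txt

The resulting list can then be dropped into host stanzas in dhcpd.conf like
the one sketched above.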
- Scyld
-
Scyld is a company founded by some of the creators of the
original beowulf, notably Gordon Bell Prize-winner Don Becker (also
known for writing virtually all of the original Linux ethernet device
drivers). The product they sell is a heavily specialized and customized
linux distribution that is a beowulf in a box. It has its own
specialized methodologies for head node and worker node installation
that are extremely efficient and scalable by design. It creates a
true Beowulf with a single unified process id space and single
node control over the running parallelized task, as opposed to the more
generic "cluster" produced by the approaches above (where the cluster
nodes are essentially workstations, specialized in simple ways to a
greater or lesser degree).
- Clustermatic
-
Clustermatic is based on Erik Hendriks' bproc program (also the basis
of the Scyld approach) and creates a simple, scalable, "true Beowulf"
type cluster with a single unified process id space so that jobs are run
from a single front end "head node". The actual booting of the
cluster/node image is accomplished with a two-step boot process, the
first of which can be initiated by any mix of LinuxBIOS, DHCP/PXE, a boot floppy,
or a boot CD. It can be configured to run diskless or from a local
image. Erik Hendriks is also an ex-member of the NASA Goddard beowulf
team, and the clustermatic solution is in some sense an open source,
non-commercial beowulf in a box where Scyld is open source
but commercial. Clustermatic is highly scalable, as is demonstrated by
its use on Pink, a 2048-processor linux cluster running at Los Alamos
National Labs.
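The single unified process id space is easiest to appreciate from the head
node's command line. A hedged sketch of bproc-style usage (the node numbers
are assumptions):

    bpstat              # list the slave nodes and their up/down status
    bpsh 0 hostname     # run a command on node 0; it appears in the head
                        #   node's process table like any local process
    bpsh -a uptime      # run a command on every node that is currently up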
- (CD-based)
Diskless Linux
-
Diskless Linux (using a CD) is another highly scalable approach
to building nodes. It has the advantage of saving money on node hardware:
you don't need a disk at all, which typically costs on the order of $100,
costs (at an assumed 15W average power consumption rate) another $15 a
year to run, and is one of the node components most likely to fail and
thereby cost hours of administrator time identifying and
replacing the broken component. It has disadvantages in terms of speed
and core memory consumed, it can be considerably more time-intensive to
upgrade and keep current, and of course, you need a CD drive in each
system and at least one burner.
- Fully
Diskless Linux
-
This article describes a different way of building completely
diskless systems that is very similar to the general approach
outlined above, except that instead of running just the install program
(anaconda) after booting, the system boots up all of the way and mounts
all of its key components (/, /usr, /home) from a remote server. As is
the case for general node installation, it can easily be initiated
either from a suitable boot floppy, or on a system with a PXE-equipped
ethernet device, with no disks of any sort at all. The hardware
and maintenance cost savings in the latter case are significant, although
the additional load on the network may or may not be acceptable for all
cluster designs or parallel applications.
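A hedged sketch of what this looks like in practice (the paths and addresses
are assumptions, not the article's examples): the server exports a per-node
root plus shared trees, and the node's kernel is told to mount its root over
NFS.

    # /etc/exports on the server:
    /export/nodes/b01   192.168.1.101(rw,no_root_squash,sync)
    /export/usr         192.168.1.0/255.255.255.0(ro,sync)
    /export/home        192.168.1.0/255.255.255.0(rw,sync)

    # pxelinux append line for a diskless node (kernel built with NFS-root
    # support):
    append initrd=initrd.img root=/dev/nfs nfsroot=192.168.1.1:/export/nodes/b01 ip=dhcp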
- Aduva
-
Aduva sells a completely commercial (non-open source) toolset
that can be used to manage mixed, nasty, heterogeneous distributions in
a top-down fashion on a LAN. It would also "work" to manage a cluster in
a manner similar to yum. I would tend to recommend against it in
most cluster or LAN environments, as the primary value it adds is the
ability to cope with and centrally manage versioning on what amount to
bad cluster designs, and there are better ways to do that (e.g. don't
use a bad cluster design -- use e.g. kickstart/yum, scyld,
clustermatic). However, in certain relatively poorly managed corporate
environments it might make sense, so I'll include the link -- it does
give a single decent administrator at least some of the means to manage
and control a nasty environment with lots of mediocre administrators.
Acknowledgements: The Duke institutional linux installation
server, install.dulug.duke.edu, is a part of DULUG, the Duke University Linux
Users Group. It was created and is maintained by Seth K. Vidal, who is the author
and primary maintainer of the yum tool described above, and who is
largely responsible for the incredible institutional-level scaling in
Linux at Duke. Seth is the physics department's system administrator
and a Linux Genius. He is helped by Icon Riabitsev, who has also made
many contributions to the enterprise-level scaling of this approach, by
Mike Stenner of the Duke physics department, and by many, many
members of DULUG and the yum mailing/development list.
I am also grateful to Thomas Lange for the FAI
link, and to Jim Phillips for
reminding me to include a link to clustermatic and Erik Hendriks' bproc
(sorry, Erik).
It is wise to be reasonably distribution and methodology
agnostic, even as it is natural to promote the distribution and
methodology one is most familiar with. That way everybody benefits --
choices are maximized, clever ideas (openly presented) can be stolen,
stupid ideas (however clever they seemed at the time) abandoned, and
linux can continue to evolve as one of the most powerful creations of
the human spirit.
P.S. Icon, alas, is planning to move to Canada in a few years for a
variety of reasons (mostly immigration/visa reasons). This is our great
and tragic loss, but Canada's gain! Anybody in Canada wanting to pick
up a great Linux administrator (and web developer) totally expert
in managing systems at the enterprise level with this approach is
encouraged to contact him and offer him lots of money. He won't be
available until roughly 2005, though. I hope. I suppose it depends on
how much money you're willing to offer...
This page was written and is maintained by Robert G. Brown
rgb@phy.duke.edu