Clusters are presented in chronological order according to the way
they were purchased and installed, which roughly corresponds to CPU
clock speed as well. Aggregate MHz is used as a weak measure of
total cluster capacity, as it is generally the strongest indicator of
CPU-bound performance. Each cluster name link connects to a short blurb
on the cluster itself below or (in the case of the original Brahma link)
to a page on the history of cluster computing in the Duke Physics
department.
It is interesting to note the steady effects of Moore's Law and the
department's increasing investment in cluster resources.
The aggregate cycle capacity of the physics department cluster
(including various desktop "nodes" that can add to the
racked/shelved/named cluster capacity) is thus in the ballpark of 300
GHz. Naturally, performance on particular codes as a function of CPU
clock varies significantly across the various architectures, but by any
measure this is a lot of compute power.
(These numbers were last updated as of April 2003 and are subject
to change as clusters are retired and new cluster generations are
added.)
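As a concrete (if rough) illustration of the aggregate-MHz bookkeeping,
here is a minimal sketch that tallies cycles for the racked/shelved
clusters blurbed below, using the node counts and clock speeds quoted
in their descriptions. It omits the desktop nodes, the Alpha-based QCD
mini-cluster, and anything else not described here, so the figure it
prints is only a portion of the departmental ballpark quoted above.

    # Rough tally of aggregate cycles ("aggregate MHz") for the clusters
    # blurbed below.  Counts and clocks come from their descriptions;
    # desktop nodes, the Alpha QCD mini-cluster, and anything else not
    # described here are omitted, so this is only part of the total.
    clusters = {
        # name: (systems, CPUs per system, clock in MHz)
        "brahma 2 (b-nodes)": (16, 2, 400),
        "brahma 3 (b-nodes)": (8, 2, 933),
        "ganesh (g-nodes)":   (16, 1, 1300),  # the server actually runs at 1333 MHz
        "CHAMP (c-nodes)":    (23, 2, 1533),
        "CHAMP (p-nodes)":    (4, 1, 2000),
    }

    total_mhz = 0
    for name, (systems, cpus, mhz) in clusters.items():
        cluster_mhz = systems * cpus * mhz
        total_mhz += cluster_mhz
        print(f"{name:20s} {cluster_mhz / 1000.0:6.1f} GHz aggregate")

    print(f"{'sum of the above':20s} {total_mhz / 1000.0:6.1f} GHz aggregate")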
Brahma (second generation)
Brahma 2 was generously donated by Intel as part of an Intel equipment
grant to the University. In the first phase of the grant, the original,
aging Brahma cluster was augmented by 16 dual-processor 400 MHz PIII
systems, which were bleeding-edge machines at the time. The systems
themselves were Dell PowerEdge 2300 servers. These were not the most
convenient form factor for cluster nodes, as the cases were designed
with departmental or corporate server requirements in mind and were big
and heavy, with features (like a snap-in hard disk bus) that, however
lovely, were overkill for compute nodes with little need for local
disk.
They were fast for their time, though, and were put into immediate
service. During their first three years of service these
nodes were kept in nearly continuous use at 100% of capacity -- the
cumulative duty cycle of the nodes was easily 90% or greater -- working
on problems in condensed matter theory (Brown and Ciftan) and nuclear
theory (Mueller). Many papers were published in Physical Review
and elsewhere on the basis of the work completed on these systems.
In addition, Robert G. Brown helped organize the Extreme Linux track
of the 1999 Linux Expo, and used several of the PowerEdge 2300s to
construct a small cluster that was one of several demonstrated at the
Expo.
As time has passed and Moore's Law has inexorably advanced, the usage of
these second-generation brahma nodes has somewhat diminished, but they
are still doing quite a bit of valuable work and will likely remain in
service for another year or more, as long as the hardware itself holds
out. It is very likely that they will be honorably retired by early
2005, if not before, as systems current at that point will be able to do
more work, faster, for what the electricity and cooling alone cost to
run the older nodes.
Brahma (third generation)
Brahma 3 consists of the final eight systems, this time dual 933 MHz
PIIIs, donated to the department by Intel as part of the last phase of
the Intel equipment grant to the University. The systems were placed
into almost immediate service and are still in very heavy use today,
primarily working on problems in nuclear theory and condensed matter
theory.
These systems, like Brahma 2, are shelf-mounted tower units from
Dell, but this time they are in a more or less standard mid-size tower
case and hence are much more convenient to shelve and physically move
when required.
QCD
The QCD mini-cluster was obtained by Shailesh Chandrasekharan with
start-up funds. It is our only DEC/Compaq Alpha cluster -- although the
Alpha performs better on numerical code, relative to its CPU clock, than
the Intel and Athlon CPUs, that performance proved to be "expensive" in
many ways. The Alpha required significantly more systems administration
effort to install and maintain a Linux distribution, it runs quite hot
(and hence is expensive to operate), and it cost more on a per-FLOP
basis than Intel or Athlon alternatives that were also more
cost-effective to install and operate. Finally, the Alpha architecture
was not helped by the constant travails of Digital, then Compaq -- Alpha
as a CPU architecture seems to have no future.
Consequently, we have more or less abandoned it, although naturally
we continue to operate the QCD mini-cluster until we can sensibly retire
it.
Ganesh
Ganesh was the department's first Athlon cluster, consisting of
fifteen 1300 MHz Athlon client nodes in mid-sized towers plus a 1333 MHz
server, also in a mid-sized tower (for 16 nodes total). It was also the
first mini-cluster not named or considered a part of brahma, as it was
purchased by Brown and Ciftan to work on specific problems in condensed
matter physics. This cluster cost approximately $15,000 including its
switch, wiring, and shelving, making it an extremely cost-effective
cluster for the time.
We had long since adopted the sensible practice of naming brahma cluster
nodes by a simple schema such as b1, b2, b3, but (as one notes above)
had failed to rename the successive generations with a different name
(and hence letter). This proved to be a modest mistake, as one had to
"remember" which nodes were the faster b-nodes and which nodes were the
slower b-nodes. Naming the g-nodes g00, g01, ... eliminated this
problem for the new mini-cluster -- the name prefix letter uniquely
determined architecture, processor/memory generation and configuration,
and (in this case) cluster ownership.
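As a small illustration, here is a sketch of that naming convention for
the g-nodes; the two-digit, zero-padded indices follow the scheme just
described, while the address range shown is purely hypothetical.

    # Illustrative only: generate hostnames in the g-node naming schema.
    # The g prefix and zero-padded index follow the convention described
    # above; the 192.168.1.x addresses are a hypothetical example range.
    for i in range(16):                  # ganesh has 16 nodes in all
        hostname = f"g{i:02d}"           # g00, g01, ..., g15
        print(f"192.168.1.{100 + i}\t{hostname}")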
The cluster remains in active use, still working on problems in
critical phenomena for the group of Brown and Ciftan and being shared
with other brahma users during the brief times it would otherwise sit
idle due to a pause in that group's computational schedule.
Ganesh is the last cluster purchased in a shelf-mount/tower-unit form
factor for the physics department. This is largely because it became
apparent that physical space was about to become an important issue as
more and more groups in the department started cluster projects of their
own and the total number of nodes and processors started to
skyrocket.
To accommodate the new clusters (and the many older nodes of the
existing clusters, which were not terribly small) the University funded
the extensive renovation of a new cluster/server room for the department.
This space is adequate to hold hundreds of rackmount nodes and is not
quite half full, but would be unable to hold even the number of CPUs it
holds already in a shelf/tower form factor. Consequently, all the newer
cluster nodes below are rackmounted, and older shelf-mounted cluster
nodes will be phased out and retired over time and replaced with
rackmount clusters as funding and opportunity permit.
CHAMP
The tremendous success obtained by the nuclear theory groups of
Mueller, Bass, and later Chandrasekharan using the various brahma nodes
inspired them to seek DOE support for a larger cluster dedicated to
nuclear theory. Grant proposals were submitted and funded, and CHAMP
(the Computer-cluster for Hadronic and Many-Body Physics) was
born.
CHAMP consists of 64 processors, but to simplify and optimize access
the nodes are named according to system architecture. There are
23 c-nodes, which are rackmount dual Athlon 1800+ systems (1533 MHz
clock), for a total of 46 processors. There are four p-nodes (single-CPU
2.0 GHz P4 systems) for four more processors. Finally, seven of the
dual 933 MHz PIII systems in the brahma 3 cluster have been dedicated to
the nuclear theory work as part of CHAMP, contributing 14 more
processors and bringing the total to 64 (46 + 4 + 14).
In actuality, there are even more, because of resource sharing
between groups and because of small clusters such as the QCD
mini-cluster above, and even single systems that have been purchased
with faculty startup funds or by the University. This flexibility in
cluster resource allocation within our department has proven to be very
useful when various groups have arrived at a "crunch time" where they
are preparing a time-sensitive draft of a paper or are getting ready to
depart for a conference and need to rapidly complete some last-minute
computations. It is a good example of how a well-organized cluster
group (like Duke Physics' brahma) can enhance the research
potential of all the participants in unexpected ways.
At this point CHAMP is being very heavily used indeed by Bass,
Chandrasekharan, and Mueller and their various postdocs and
students.