Beowulf Resources and Links

This is the official home page for the Duke University Physics Department's Brahma Beowulf Project. Please feel free to explore this website. There are a number of things on the site itself that may be of use or interest to individuals interested in beowulf-style cluster computing.

This site is maintained by rgb. It and all works linked thereupon authored by Robert G. Brown are Copyright 2003 (or as indicated in the document) and made available through a modified Open Publication License unless superseded by another license directly associated with the document. (Current site version 2.2-1)



Links to Vendors Supplying the Beowulf/Cluster Market


The following vendors are for the most part not really "endorsed", and in no case am I responsible if you use one of them and have a poor experience.

Some of these vendors I do have direct experience with. Others have just contacted me and asked to be added to the list, and I've done so (without even charging them, which is probably slightly insane on my part). Still others have contacted me with rich bribe offers such as tee-shirts, penguins, and coffee mugs or tiny screwdrivers, causing me to fill them in below with a smile.

Clearly I'm a bit of a slut, but I've still got my pride and standards. Truthfully, I'll list almost anyone if they're really and truly quasi-dedicated linux/beowulfish vendors, but (for all of you who read this)...

...USE THEM AT YOUR OWN RISK. I make no warranty whatsoever about any given vendor being suitable, cheap, or reliable even if I've used them before. I could be mistaken, after all, or I could be lying. The latter isn't horribly unreasonable, if the vendor has bribed me heavily with tee shirts, coffee mugs, free palm pilots, or (always useful) money to tell you how great they are. So you shouldn't fully trust even an open endorsement. Who knows how many stuffed penguins stand behind it?

One last remark. If you have used a vendor on this list or do use a vendor on this list and have a bad experience, feel free to let me know and I may add a reference to that to their listing or even remove their listing altogether.


Dell
Dell provided the Intel Grant hardware that we use for Brahma 2 and Brahma 3. In recent years Dell has reduced prices so that they are very cost-competitive with even Intrex (our local vendor) and their hardware is at least as reliable as anything else available. Dell has very nice onsite service deals (important to control human costs in large clusters) and offers rackmount systems that make good nodes. One weakness is that Dell tends to be pretty inflexible about their offerings -- configurator way or the highway.
Intrex
Intrex is the local vendor "down the road" that provides very reasonable systems and parts (literally) over the counter, at least to folks in North Carolina. Driven strictly by demand, they've started to put together very reasonably priced rackmount systems, where they'll basically build you exactly what you want at OTC prices. I recently (10/00) got a quote from them for rackmount 2U nodes -- 800 MHz PIII's with 256 MB PC133 ECC, a 10 GB HD and floppy, and a NIC per node for just about exactly $1K/node. This would be even cheaper now as SDRAM drops towards $0.50/MB from over a dollar at the time of the quote. Their "integration charge" per node is basically nothing -- maybe $50 -- as they make their primary profit on the volume of the sale, just as they should.

I like these guys and do business with them pretty regularly. They are the sole supplier for the systems in my home beowulf because they are quite cheap and literally two miles away from my house. This allows me to get service "instantly". For example (true story), 12/4/00 I was rebuilding my home beowulf (putting it on a nifty heavy duty steel and particle board shelf unit, actually, to which I had attached rolling casters -- very cool and cheap!) and naturally my primary server/desktop (a dual Celeron on the Abit BP6) refused to come back up -- no power at all. I suspected a blown power supply, but didn't want to mess with the swap game at home.

So, I simply picked up the unit and carried it to the car, and trundled off to their South Square store. I plopped it down on the counter and said "I think the power supply is blown". Twenty five minutes later I was out with a new power supply installed at no charge and no questions asked, as the system was still less than a year old (by around three weeks:-). Of course I also grabbed some newly cheap SDRAM and even popped for a 30 GB disk for all my MP3's while I was there...

This kind of service is why I buy from local vendors if/when I can find a decent one. Intrex is exceptional even in this regard -- their systems come with a lifetime labor warranty, and that's the expensive part of buying ANY system -- hardware (replacement or otherwise) is cheap, but the time required to swap power supplies, motherboards, and so forth around to identify and replace a bad part is very, very expensive. To me -- it is my time. I'll therefore make an exception for Intrex and "endorse" them even though they don't give me coffee mugs or mouse pads for free. Their prices are low enough I can afford to buy my own with what I save.

Penguin Computing
We have thus far gotten precisely one (1U) penguin HPC compute node for testing purposes, and our sysadmin, at least, likes it very much. Seems well engineered, and not horribly overpriced. I will try to update this as we test the node more. If nothing else, you have to love the name and the cute little logo!
PsychoSoftPC
Catchy name, that. These are serious computing freaks from the look of it, who sell turnkey clusters (called the "Psychlone Cluster"), some sort of 64 bit cluster apparently used by clients in rendering among other things. I'm including them because they asked to be included, because they are a vendor of linux-based supercomputers which is what this is all about, and because they have a cool drop-in logo I can paste in here:

Psychsoftpc
the source for high end PCs

Oh, and maybe I can get a cool T-shirt out of them for the link. Take it for what it is worth...

RLX
RLX makes and sells a variety of "single blade computers" -- computers that fit onto a card that can be slotted into a special 6U chassis. They sell a variety of blades -- from ultra-low power blades with transmeta processors to high power P4 blades -- that can be installed in densities ranging from 24 per 6U to 10(x2) per 6U (the latter is 10 dual Xeon P4 blades, or 20 processors per 6U).

Blade solutions generally permit one to achieve the highest possible CPU densities. Transmeta clusters run cool, although they are also relatively slow. However, blades are expensive, and hence most likely will be of interest to people with a highly nonlinear cost profile in their physical infrastructure -- extensive renovation required to house, power, and cool a more traditional cluster with (otherwise) better cost benefit. Still, worth looking over, and prices can always change.
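To put the densities quoted above in per-rack terms, here is a minimal sketch. The chassis figures (24 blades per 6U, or 10 dual-processor blades per 6U) come from the text; the 42U rack height, the assumption of one processor per low-power blade, and the function name are mine, purely for illustration.

```python
def processors_per_rack(blades_per_chassis, cpus_per_blade, chassis_u=6, rack_u=42):
    """Processors that fit in one rack filled with blade chassis.

    rack_u=42 is an assumed rack height; chassis_u=6 is the 6U chassis from the text.
    """
    chassis_count = rack_u // chassis_u  # whole chassis only
    return chassis_count * blades_per_chassis * cpus_per_blade


# Low-power blades: 24 per chassis, assumed one processor each.
low_power = processors_per_rack(24, 1)
# Dual Xeon blades: 10 per chassis, two processors each.
dual_xeon = processors_per_rack(10, 2)
print(low_power, dual_xeon)
```

Real racks also need room for switches and power distribution, so treat these as upper bounds.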

Raritan MasterConsole
This is a keyboard, video, mouse switch we bought with/for the original brahma. Can manage at least 64 nodes from one monitor, but the switches aren't cheap. Very nice. But not cheap.
Belkin
This is another keyboard, video, mouse switch maker (and a ton of other stuff). I use a four port Belkin in my home compute cluster, Eden (guess the naming theme that I use at home:-). It was fairly cheap (around $100), and works perfectly. I've used really cheap KVM's (basically rotary serial switches with a knob and everything) and they aren't worth even the very little money they cost. Get a "real" KVM or don't bother.
Scientific Applications on Linux Site
The name says it all. This is a somewhat "weird" site (in my opinion, anyway) with a strange mix of commercial and non-commercial offerings, but it is a useful place to look for advanced stuff for linux, some of which you might have to pay for (which is why it is on the vendors page instead of the links page).
Paralogic
Paralogic is a turnkey beowulf vendor. They are as much software and service as they are hardware, although they'll cheerfully sell you the hardware as well. Doug Eadline (the president of the company) is a buddy of mine and has given me tee shirts (full disclosure, here) and has also given back considerably to the linux community (as one of the authors of the linux beowulf HOWTO, perennial sponsor of free booze at Extreme Linux and HPC events, and so forth).
High Performance Technologies, Inc
Another turnkey beowulf vendor. I'd say that it is one of the highest end of the "true" beowulf vendors, working with bleeding edge technology to, as they put it, "build Linux clusters to meet the requirement of people that are accustomed to using high performance computing (HPC) systems or supercomputers". A good group to talk to if you are trying to achieve performance competitive with the best of the big iron that scales to hundreds of processors.

ASL Computers

ASL is a vendor of linux (only!) boxes of all sorts, from desktops through large rackmount servers. They sell both 1U and 2U units and I imagine will do systems integration sorts of things for you. I don't know if they consider themselves a true "turnkey beowulf vendor" (I'd generally reserve this term for companies with a true beowulf expert on staff and on the beowulf list who can help with more than just the hardware configuration and initial software install) but they can certainly sell you cluster components with linux preloaded. If they have or establish an outsourcing relationship with e.g. Scyld, they might even qualify without an on-staff expert...

These guys get some extra kudos from me as they've been extra nice to me over several years. No tee shirt(s), but they've given me the chance to work with certain bleeding edge hardware for free before it even "existed". Very nice folks, good quality hardware.

Atipa Computers/DCG Computers
A turnkey linux/beowulf vendor. They do (well, probably "did" at this point) high end alpha clusters as well as pretty much whatever you like in the way of linux workstations or beowulfish clusters. Wow! They sent me tee shirts (the note below must be working:-)! They must be decent guys, and my kids will henceforth advertise their stuff in school as much as this site does on the web. I henceforth endorse them as somebody to talk to if you are looking for turnkey quotes.

Linux NetworX

Formerly known as Alta Tech, LN is yet another linux/beowulf turnkey vendor. They sent me some very cool tee shirts with these centipedes on the front. They are worth a call if you are looking for quotes on a ready-to-run cluster system at pretty much any level of performance (up to the very high end). They also have some very pretty node-mounting boxen (not exactly racks -- custom boxes -- which is what I first knew them for as Alta Tech). Here is a synopsis of their business that they contributed (note that I did not write this, they did:-):

"Linux NetworX brings its powerful cluster technology to those demanding high-availability and high-performance systems. With the use of cluster computer technology, Linux NetworX provides solutions for companies with high-computing needs including Internet servers, research, industry, government and other technological fields. Through innovative hardware, complete cluster management software, service and support, Linux NetworX provides end-to-end clustering solutions. To date, the company has built some of the largest cluster systems in the world and has developed unique hardware and versatile software to facilitate overall system management. Linux NetworX has offices in Utah, New York, Calif. and Texas and worldwide distributors."

"About ClusterWorX: ClusterWorX allows users of Linux NetworX clusters to control the cluster as a single system, and provides remote monitoring and management capabilities. ClusterWorX can be accessed through an easy-to-use graphical user interface (GUI), command line and HTML. Other management tools include remote access, disk cloning and serial access to nodes, including remote monitoring and resetting of individual nodes without affecting the uptime of the entire system. Disk cloning is a valuable feature for large cluster systems because it allows software and other updates to be installed on one node and automatically distributed to the entire system. All features are architecture independent."

Microway
Designs and manufactures very high end, custom Beowulf clusters based on Alpha, Intel and Athlon processors. Here is their cluster synopsis (note that THEY wrote this, not me):

"Connectivity is provided using ethernet, Myrinet or Dolphin interconnect technology. The company was formed in 1982 by Stephen Fried, physicist and coinventor of the HF chemical laser (Star Wars). Microway has provided state of the art products for high speed numeric processing to the university and government marketplaces for over 18 years. Microway is API's largest US customer for UP2000 and 21264DP Alpha motherboards and processors. The company designs proprietary rack chassis for maximum nodes per cubic foot, and has a fine reputation for delivering fully configured clusters that work! Microway has configured Linux based systems since the early days of the Red Hat releases. Today we specialize in large Beowulf clusters for customers with HTPC applications from chemical engineering, mechanical engineering, CFD, molecular modelling, simulations, biogenetic research to designing America's cup racing keels, and jet engines. Our customers also include ISPs and other ecommerce companies."

For what it is worth, I do remember Microway from years of reading PC Magazine and drooling over their co-processor (i860 and other) boards and fancy matching compilers. Although I have no clue as to where they stand in the grand scheme of cost-beneficial turnkey beowulf systems (so you'll have to visit their website and talk to them and find out for yourself, which you'd do anyway with ALL of these vendors if you had any sense) I will say that in one sense, they were a direct philosophical predecessor of beowulfery. They made an attempt to achieve the same goal of providing commodity supercomputing at an affordable price in a readily available platform. Of course, they tried to achieve it by putting 2-4 very high end processors on a single ISA bus card with very much custom and proprietary high end compilers and a unique architecture, but there is nothing wrong with that, especially in the context in which they were working. For a long time, they probably led the world in cost/benefit measured in minimum commercial cost/FLOP in a single platform. And they are sending me a T shirt, so they must be decent folks.

Numerical Algorithms Group
One of the things Microway sold in the old days was libraries from the Numerical Algorithms Group (NAG). Here is a communication from NAG:

NAG is a software company that specializes in reusable mathematical and statistical software components. We have been in business for 30 years and have been involved in many projects with the Department of Energy, Department of Defense as well as being able to help many NPACI and Alliance partners. (There are about 40 site licenses within the NPACI and Alliance members which makes collaboration and sharing of software containing NAG very easy.) NAG recently released a new version of the NAG Parallel Library that utilizes MPI for Beowulf computers. The product works with either PGI's compiler or the Gnu product.

All of NAG's products are available for "test drives". NAG will also be coming out with a version of our SMP Library for Intel/Linux next year.

Best regards, Tony Nilles
VP Sales and Marketing
NAG - The World Leader in Numerical Software Components
PH: 630-971-2337 x 207
Fax: 630-971-2706
www.nag.com

Note that NAG has promised to "try to find something different from the usual t-shirts (since your kids seem to be well stocked!)" Clearly creative and intelligent thinkers!

HiPERiSM Consulting

From my friend George Delic (whom I met at a recent HPC conference at Wake Tech CC):

HiPERiSM Consulting, LLC, offers expertise, products, and services in:

  • High performance computing
  • Software engineering
  • Data visualization
  • System integration for Linux clusters
  • Air Quality Modeling with Linux clusters

If I recall correctly, George and I talked about performance monitoring and tuning tools for serious MPI code at the conference, so if you have a serious MPI application where tuning it up is of value to you, give HiPERiSM a look and see if they can help.

Ventura Tech

Ventura assembles "server class memory" with lots of certification and guarantees, pretested for various popular high end motherboards (such as those that might well be selected for a high performance beowulf cluster). Here is their synopsis (from Sam Lewis):

"Many will say that `memory is memory'. While that used to be the case, memory today is anything but standard. With the onslaught of new technologies, it has been a challenge for integrators and OEMs to maintain true compatibility. To that end, Ventura Technology Group (VTG) is a tremendous resource, since we design and manufacture the highest quality server class memory. While our products are compatible among a wide range of platforms (SUN, IBM, HP, APPLE, COMPAQ, DELL, HP, IBM, INTEL, TYAN), our focus will always be on higher density modules (128MB, 256MB, 512MB, 1GB). Rounding out vital service offerings, VTG stretches ahead of the competition in the following areas:

* Every module design is manufactured to exceed OEM specifications
* Certifications with major OEM's (Tyan, Intel, Supermicro) insures constant compatibility
* All memory technologies are manufactured in our own ISO9001 certified facility
* All memory modules are manufactured using top grade DRAM and SDRAM (Samsung, Micron, NEC, etc...)
* International and Domestic supplier relationships insure competitive pricing and consistent supply
* Lifetime warranty with the benefit of cross shipment where needed

Along with all of the above benefits, VTG continually provides competitive prices."

Aduva
Aduva sells a completely commercial (non-open source) toolset that can be used to manage mixed, nasty, heterogeneous distributions in a topdown fashion on a LAN. It would also "work" to manage a cluster in a manner similar to yum. I would tend to recommend against it in most cluster or LAN environments, as the primary value it adds is the ability to cope with and centrally manage versioning on what amount to bad cluster designs, and there are better ways to do that (e.g. don't use a bad cluster design -- use e.g. kickstart/yum, scyld, clustermatic). However, in certain relatively poorly managed corporate environments it might make sense, so I'll include the link -- it does give a single decent administrator at least some of the means to manage and control a nasty environment with lots of mediocre administrators.
Mountain View Data
Mountain View Data is a company that focuses on data delivery systems in high availability clusters, primarily in the commercial world. They make tools like NAS operating systems, real-time backup solutions, and a "snapshot" tool for capturing an instant image of critical data (perhaps during a crash). However, they asked to be added because of their Power Cockpit tool. This tool, like Aduva's above, seems to enable systems installation and image management in a cluster (or client/server LAN) environment. To quote George Sun in his request for inclusion on this page:


    Powercockpit can deploy and update complete OS and solution stack(apps) to bare metal servers on a complete blank hard drive from a custom repository of saved images to multiple hardware configurations instantly using multicasting. Also, re-purpose servers on the fly throughout your local and remote clusters. Don't waste time on tedious cluster management. Powercockpit is more user friendly and robust than using Kickstart or System Imager. Although they are useful solutions they lack: GUI, cluster mgmt support, no rpm control, no cluster status ability, no multicasting and require tedious hands-on support and effort.

The primary strength of this tool appears to be how it encapsulates a lot of expertise and hides it behind a GUI. As was the case with Aduva's product, I suspect that this product will find a warm reception in a relatively undermanaged corporate environment, but that most University and Government shops will prefer to work with lower level, fully open source tools and "roll their own" cluster, so to speak. Still, worth looking over. They offer a free 30 day demo at the link above, so it is easy to try it out.

ibutton
An ibutton is a tiny sensor package that can be read electronically via a "1 wire interface" (basically a single twisted pair with signal and ground). It can hold lots of things -- a crypt key in ROM, a thermal sensor, account information -- and be detached from its read/write station and carried around. It can be powered by a battery for remote operation or vamp power off of the signal line. They therefore have, in principle, lots of possible uses in the cluster or LAN environment -- a portable absolutely unique digital ID that can be used as a login or key for various resources, as thermal sensors in a cluster room, and more. There is, however, a catch. They don't tend to come with the wiring already done so you can just plug them into a serial port or parallel port (they'll run off of either one) and forget them, with GPL driver software all wrapped in a tarball and ready to fly. So you MIGHT have to do some interface wiring (they do have a handy technical spec sheet to guide you) and build the not QUITE so plug'n'play gcc/userial-based application to be able to read from them. Still, there are clearly beowulf-list people who have done so as ibuttons are sometimes mentioned, so they belong here.
Sensorsoft
For those who prefer their solutions cooked (and will pay a bit extra for them) there is Sensorsoft, who makes server room monitoring hardware with both interface and software. This is a bit more plug-n-play including linux and bsd software. The bad news is that Sensorsoft is yet another hardware company that thinks it is "selling software" for its own devices, so it has silly licensing restrictions on its non-open source drivers (as far as I can tell without buying one, anyway). However, they do have a thermal sensor only for less than $100 that has an RJ45 interface (and an RJ45-RS232 cable), and they do provide technical data that should make it easy to write the drivers required to read from it. In fact, they are almost certainly the same drivers used in the ibutton, as all these guys are almost certainly using the same thermal sensor chip (from Dallas Semiconductor). I actually have four of these chips at home, together with a recipe for building my own RS232 "sensor", but haven't had the time to build one yet. That's the REAL hobbyist way...
Pico Technologies
Yes, there is one more possible solution to thermal and environmental monitoring at a fairly cheap price: Pico Tech. Pico makes a leetle jobbie that plugs into either a USB port or serial port, snitches power from it, and drives a variety of plug-in sensors from it. They have thermal, they have humidity, they have combined, they have general purpose ADC and scope products (which can drive fairly arbitrary voltage-out detectors, up to and including oscilloscope-type stuff).

Truthfully, this looks like a good place to bookmark for its possible applications in e.g. physics labs as well as in HPC cluster rooms. They have linux-ready open source C drivers for reading the devices, and apparently porting their output into various forms is pretty straightforward. Prices (when I looked -- you should obviously check again) seemed to be less than $200 to equip a three-sensor plug that should be able to monitor an entire server room -- chilled air temp at the output ducts, heated air temp at the return duct, ambient air out where the air is mixed. Alternative configurations add humidity sensor information or monitor e.g. door openings and closings with a standard door sensor.

One of these, a TV card, and an X10, and for less than $300 you have the equivalent of a far more costly Netbotz appliance, presuming that you have a node or system in your server room with a bus slot and a tiny bit of attention to spare polling the device(s).

Kill A Watt
Actually, this is a link to EFI.org, one of many folks that resell the kill-a-watt power sensor. The kill-a-watt is an awesome device and (in my opinion) essential to cluster engineering, where power management is of paramount importance. Useful at home or the office, too. This cheap little device (about $45 shipped from EFI) gives you instant readouts of line voltage, line current, and rms power drawn by whatever you plug into it. It can even monitor long term energy usage (kilowatt-hours) for things powered through it (up to 1875 VA). I wouldn't actually recommend running a cluster through it all the time, but plugging in those prototype nodes to get a clear picture of their power draw under various loads and conditions during the design phase, ah, that's worth its weight in gold!
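As an illustration of what one might do with the numbers a kill-a-watt gives you, here is a minimal sketch that turns a measured per-node draw into a nodes-per-circuit estimate. Everything in it is an assumption for the sake of example: the 120 V / 20 A circuit, the 80% continuous-load derating, the power factor values, and the function name. Check your own wiring and local electrical code before trusting any of it.

```python
def nodes_per_circuit(node_watts, circuit_volts=120.0, breaker_amps=20.0,
                      derating=0.8, power_factor=1.0):
    """Estimate how many nodes fit on one circuit.

    node_watts   -- measured real power per node under load (from the meter)
    derating     -- assumed fraction of breaker capacity usable continuously
    power_factor -- assumed ratio of real power (W) to apparent power (VA)
    """
    usable_va = circuit_volts * breaker_amps * derating
    node_va = node_watts / power_factor  # circuits are limited by VA, not W
    return int(usable_va // node_va)


# Hypothetical node measured at 150 W under load.
print(nodes_per_circuit(150))                    # power factor of 1 assumed
print(nodes_per_circuit(150, power_factor=0.7))  # a poorer power factor
```

The point of the second call is that the meter's VA reading, not just its watt reading, sets how many nodes a circuit can carry.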
Mirus International
Mirus International makes "harmonic mitigating transformers" for use in computer server or cluster rooms. The switching power supplies of computers have an unusual property -- they draw current only in the middle third of each half-cycle of the voltage. This is no problem when running one or two systems on a circuit, but is a big problem when running a circuit close to capacity. There is a FAQ on the Mirus website that clearly explains everything. This is a must read for cluster architects -- even if you opt not to use a mitigating transformer, the problems described in the FAQ really do occur and can cost you a significant amount of downtime, cause hardware failures, and turn you bald and wrinkled before your time. A glimpse at my picture should convince you -- it happened to us.
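To get a feel for why drawing current only in the middle third of each half-cycle matters, here is a rough sketch that estimates the harmonic content of such a waveform. It assumes an idealized flat-topped pulse (real supplies draw a rounded pulse, so real numbers will differ), and the function names are mine. It is a back-of-the-envelope illustration, not a substitute for the FAQ.

```python
import math


def pulse_current(theta):
    """Idealized supply current: a flat pulse in the middle third of each half-cycle."""
    theta = theta % (2 * math.pi)
    if math.pi / 3 <= theta < 2 * math.pi / 3:
        return 1.0   # positive half-cycle
    if 4 * math.pi / 3 <= theta < 5 * math.pi / 3:
        return -1.0  # negative half-cycle
    return 0.0


def harmonic_amplitude(n, samples=36000):
    """Magnitude of the n-th harmonic, from a midpoint-rule Fourier integral."""
    step = 2 * math.pi / samples
    cos_sum = sin_sum = 0.0
    for k in range(samples):
        t = (k + 0.5) * step
        i = pulse_current(t)
        cos_sum += i * math.cos(n * t) * step
        sin_sum += i * math.sin(n * t) * step
    return math.hypot(cos_sum, sin_sum) / math.pi


def harmonic_ratios(orders=(3, 5, 7)):
    """Each harmonic's amplitude relative to the fundamental."""
    fundamental = harmonic_amplitude(1)
    return {n: harmonic_amplitude(n) / fundamental for n in orders}


for order, ratio in harmonic_ratios().items():
    print(f"harmonic {order}: {ratio:.2f} of the fundamental")
```

For this idealized pulse the third harmonic comes out at roughly two thirds of the fundamental, which is a long way from the clean sine wave the wiring was sized for.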
Direct TextBook
To quote from their website: "Direct Textbook is the fastest textbook and book price comparison site on the internet. We help you find the lowest prices on books before you buy! Our price search software compares dozens of discount book stores, used books stores, wholesale books stores and other online book stores to find the lowest price available." This is relevant to beowulfery as they focus on texts and technical books. In fact, Sterling's book "Beowulf Cluster Computing with Linux" was the first hit on a search on the word "beowulf", with "How to Build a Beowulf" by Sterling, Salmon, Becker and Savarese the third hit. Complete with instant price checks -- you can select the "best" online vendor to buy from. Not bad, actually.

Vendors! Your product could appear in this list if you meet my demanding criteria! To wit:

1. The request must be accompanied by unmarked bills in a brown paper envelope, or tee shirts, toys, or other good geek gelt as a bribe.
2. It must be directly and clearly linux based and linux supported. WinXX-associated vendors need not even ask, unless they also have a linux product. I don' be doin' Windows...
3. It needs to have something to do with beowulfery or high performance cluster computing, as this is a beowulf site. Turnkey beowulf vendors, support/service vendors, distribution vendors, hardware vendors, network vendors, all are welcome.
4. OK, I was kidding. You don't really have to give me anything to get it listed, especially if I find your product intriguing or think that having it represented here might be beneficial to somebody trying to engineer a beowulf solution. On the other hand, modest bribes certainly help to motivate me to take the time to make the entry, and if you want a "real" endorsement I really do have to have used the product one way or the other.

Feel free to send requests for inclusion to my email address below and to send non-monetary bribes to Robert G. Brown at 3209 Annandale Road, Durham NC 27705, USA. Monetary bribes will have to wait until I get my bank account in the Caymans opened...;-)


This page is maintained by Robert G. Brown: rgb@phy.duke.edu