To put my professional and topical acknowledgements first, where they belong: This book is dedicated to my many virtual friends on the beowulf list and elsewhere in the Linux world. I literally could not have written it without their help as there are a number of things that I write about below that I've never really done myself. I'm relying on the many reports of their experiences, good and bad, that have been added to the archives of the beowulf list over the years. I'm also (in the best ``open source'' tradition) relying on their feedback to correct any errors I may have made in writing this book.
Any errors that may have occurred are not their fault, of course; they are entirely mine.
I have tried in this book to create the basis for a reasonable understanding of the beowulf-class linux-based parallel (super)compute cluster. This book makes no pretence of being a text on computer science - it is intended for readers ranging from clever high school students with a few old x86 boxes and an ethernet hub to play with to senior systems programmers interested in engineering a world-class beowulf (with plenty of room in between for pointy-haired bosses, linux neophytes, hobbyists, and serious entrepreneurs). It is also deliberately light-hearted. I intend the text to be readable and fun as opposed to heavy and detailed.
This shouldn't seriously detract from its utility. Half of the fun (or profit) of beowulfery comes from the process of discovery, where you take the relatively simple idea of a beowulf and a few tools and craft the best possible solution to your problem(s) for far less than you could purchase the solution commercially on ``big iron'' supercomputers. I have a small beowulf at home, and so can you (for as little as two or three thousand dollars). I also have a larger beowulf at work (the Duke University Physics Department), and so can you (for a current cost that ranges between perhaps $500 and $8500 per brand new node, depending on just what and how much of it you get).
There is good reason to get involved in the beowulf game. One day beowulfs will play the best virtual reality games, allow you or your kids to make movies like ``Toy Story'' in the privacy of your own home (with a toolbox of predefined objects and characters so all you have to do is provide an upper-level script), solve some of the most puzzling problems of the universe, model chemicals in rational drug design packages, simulate nuclear explosions to allow advanced weapons to be designed without testing, permit fabulous optimizations to be performed with advanced genetic algorithms, and...
Well, actually beowulfs or beowulf-like architectures are already doing most of these things in one place or another, and far more besides. The beowulf design is one of the best designs for doing parallel work because of three things: it is extensible, it is scalable, and it is built out of cheap commodity parts (where the word ``cheap'' has to be understood as ``compared to the alternatives''). For that reason beowulf-style cluster computing (as a phenomenon) has grown from a handful of places and people five to ten years ago to literally cover the earth today. Its growth continues unabated, driven by one of the strongest of human urges.
At this point, there are beowulfs or beowulf-like clusters in place in universities and government research centers on all the continents of the world. In many cases, the beowulf route is the only way these universities or research centers can acquire the computational power they need at an affordable cost. If I can build a beowulf at home, so can the physics students and faculty in a physics department in Venezuela, or Thailand, and for about the same cost. It is one way, as my friends overseas have pointed out in off-list conversations, that entrepreneurs and businesses in small countries can compete with even the biggest companies with the biggest computational facilities in the United States, in spite of ruinous and misguided export restrictions that prevent the international sale of most high powered computer systems except to carefully selected (in all the senses of the word) countries and facilities. Yes, beowulfs are also about freedom and opportunity.
Up to now, beowulfs have been relatively rare in the corporate environment, at least in the United States. It is not that corporations are unaware of the power of linux-based cluster computer environments - a lot of the web-farms in use today harness this power with equal benefits in terms of low cost and scalability. It's just that (as the book should make clear) parallelizing a big program is a complex task, which inhibits the development of commercial parallel applications.
However, in my ever-so-humble opinion, this period is about to come to an end, for several reasons. First of all, the beowulf design has shaken down into a ``recipe'' that will work for a wide range of applications. This recipe is extremely cheap and simple to implement, which allows software developers to actually design software for a ``known'' target architecture. In the past this was impossible, because so many beowulfish clusters were ``one of a kind'' designs.
Second, advances on the hardware front, especially in the related realms of modular computers and high speed networks, promise a new generation of mass-market-commodity off-the-shelf components that can be assembled into ``parallel supercomputers'' with just the right software glue. It should come as no surprise that linux is a prime choice for the operating system to run on a lot of those small systems, since it has the right kind of modular architecture to run well with limited resources and ``grow'' as resources are added or connected. Is it any surprise that Linus Torvalds (the central and original inventor/creator of linux, which by now is being written and advanced by a small army of the world's best systems programmers) is working for Transmeta, a company that makes processors for ultralight mobile network-ready computers? I think not. Transmeta is also not the only company working on small, fast, modular computer units.
Similarly, considerable energy is being expended working on advanced networks that may eventually form the communications channels for such a modular approach to computing. Really significant advances are a bit slower here (as the problems to overcome are not easy to solve) but networking has already reached the critical cost/benefit point to enable ``recipe'' beowulfs to be built for very little money. The eight port 100BT switch I use in my beowulf at home cost me a bit over $200 six months ago. Now it can be purchased for less than $150 - networking nodes together can be accomplished for as little as $40-50/node.
It is strictly downhill (in price/performance) from here. The next few years will see ever cheaper, ever faster network devices, accompanied by new kinds of system interconnection devices that will permit nodes to be connected at speeds ten or more times those of today for that same $40-50/node. Processors, on the other hand, cost a lot more to build as their speed is cranked up, and the effect of the higher speed is less and less visible to ordinary computer users.
Beowulfery represents a different (and less expensive) way to achieve and surpass the same speed. I believe that this will lead to a fundamental redesign of the personal computer. Just as computers now have an expansion bus that allows various peripheral devices to be attached, future ``computers'' may well have an expansion bus that allows additional ``compute nodes'' to be attached - computers themselves may be mini-beowulfs that run software developed using the principles discussed in this book. This sort of thing has actually been tried a number of times in the past, but the systems in question have lacked proper software support (especially at the operating system level) and the appropriate hardware support as well. The next time the idea resurfaces, though, this will likely not be the case. Watch for it on your IPO screens.
Even if this particular vision fails to come to pass, though, the future of the beowulf-style (super)compute cluster is assured. As the book should make clear, there is simply no way to do any better than a beowulf design for many, many kinds of tasks (where better means better in price/performance, not raw performance at any price). If you have a problem that involves a lot of computation and don't have a lot of money, it's the only game in town.
To conclude with my more mundane acknowledgements, I'd also like to thank my lovely wife, Susan Isbey, and my three boys, Patrick, William and Sam, for their patience with my supercomputing ``hobby''. It isn't every home where the boys are forbidden to reboot a system for three days because it is running a calculation. Also, I can sometimes be a bit cranky after staying up all night writing or working (something I do quite a lot), and they've had to live with this. Thanks, guys.