
IPC's, Granularity and Barriers

OK, by now you should be getting the hang of things. A beowulf is a parallel supercomputer built out of COTS nodes interconnected by a COTS network of some kind. One can build a beowulf to speed up a piece of parallelized code (in the classic Amdahlian sense) so it finishes faster. One can build a beowulf to be able to do a task at all by assembling more resources than one can either afford any other way or than are currently available in a system at any price. One can build a beowulf to speed up a code in an exotic way (by providing a faster extended virtual memory space, for example).

In the previous chapter we discussed all sorts of ways the basic bottlenecks between the CPU and memory subsystems (within a node, by assumption) can affect program speed, trying to provide a semi-quantitative understanding so that you can at least do the back of the envelope calculations required to compare the cost-benefit of various alternative ways of accomplishing a task. In this chapter we'll focus on the sine qua non bottleneck of beowulfery, the network.

There is so much to learn about networking and how it relates to serious beowulfery that it is hard to know just how much to put into an introductory book like this. To invert the point, there is such a wide range of ignorance about networking out there that I could easily be speaking to someone who doesn't know AppleTalk from Ethernet, has never heard of the ISO or OSI, for whom TCP and IP are a mystery, and who thinks that a router is a device for cutting interesting curves in a piece of wood.

If this is you: Sorry, chum, you won't learn about these things here, or at least you won't learn much (certainly not enough to assemble a functional linux network). What can I say - there are whole books that focus on just setting up and running a network, and I cannot compress all that into a chapter and have time to say anything at all about networking in the fairly strict context of beowulfery.

So, even though a network is key to a beowulf, I'm going to assume that in fact you do know what the following are:

As you can see, I am omitting all sorts of useful and important things. You won't learn about netmasks, broadcasts, how to configure a NIC, or any of that from me. However, I will direct you to the /usr/doc/HOWTO directory (in most linux distributions), which has explicit step-by-step instructions for setting up all sorts of things, including the network. Don't forget about Linux Headquarters (http://www.linuxhq.com/) either, which has links to all the HOWTOs and other documentation. There are a bunch of key learning documents on my own personal website, including http://www.phy.duke.edu/~rgb/security/local.guide, which is "the" classic 1988 Rutgers white paper by Charles Hedrick describing all sorts of networking concepts. Finally, there are a whole bunch of useful URLs on the Brahma website (http://www.phy.duke.edu/brahma) which might be of interest to the neophyte.

So, from here on I'm going to assume that you can design and set up a simple ethernet-based IP subnet without having your hand held. We'll still address some of this sort of thing in the next chapter, but for now we'll focus on the technical details (especially things like latency, the effect of packet size on bandwidth, problems, and solutions) and not on truly introductory things.
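
To give a rough feel for why latency and packet size get singled out, here is a minimal back-of-the-envelope sketch. It assumes the simplest possible model, in which sending a message costs a fixed latency plus the message size divided by the peak bandwidth. The function names and the numbers (100 microseconds of latency, a 100 Mbit/s link) are illustrative assumptions, not measurements of any particular network.

```python
# Toy model (an assumption, not a measurement): the time to deliver a
# message is a fixed per-message latency plus size / peak bandwidth.

def transfer_time(size_bytes, latency_s, peak_bytes_per_s):
    """Seconds needed to deliver one message of size_bytes."""
    return latency_s + size_bytes / peak_bytes_per_s


def effective_bandwidth(size_bytes, latency_s, peak_bytes_per_s):
    """Bytes per second actually achieved for messages of this size."""
    return size_bytes / transfer_time(size_bytes, latency_s, peak_bytes_per_s)


if __name__ == "__main__":
    latency = 100e-6       # hypothetical 100 microsecond latency
    peak = 100e6 / 8       # hypothetical 100 Mbit/s link, in bytes/s
    for size in (64, 1250, 1500, 65536):
        bw = effective_bandwidth(size, latency, peak)
        print(f"{size:6d} bytes: {bw / peak:6.1%} of peak bandwidth")
```

In this toy model small messages are dominated by latency and see only a small fraction of the peak bandwidth, while large messages approach it. The crossover sits at the message size equal to latency times peak bandwidth (1250 bytes for the numbers assumed here), where exactly half the peak is achieved.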



Robert G. Brown 2004-05-24