In this section we will discuss the advantages of cluster
decentralization (or rather, centralization at a department-local level)
in more detail, doing a cost-benefit analysis (CBA) of local
management-local siting, local management-remote siting, and remote
management-remote siting for a variety of typical cluster environments.
The numbers presented in this CBA are ``best guess'' approximations
and should be refined with actual figures where available.
It is difficult to discuss cluster computing at any scale in
completely general terms. On the beowulf list, ``your mileage may
vary'' (YMMV) and ``it depends on what you are doing'' are the standard
warning and answer to nearly any complex question. A cluster that works
optimally (in the CBA sense) for one computation won't work at all for a
different computation. For that reason, we need to differentiate
clusters, and cluster problems, at a very early point in the discussion
into two very generic classes:
- Problems (and clusters) that are very sensitive to cluster
architecture and design. Typically these are problems with a relatively
large communication-to-computation ratio, although the class also
includes problems with any sort of ``unusual'' bottleneck or
requirement (very large memory footprint, specialized network, very
large or specialized storage requirements).
- Problems (and clusters) that are not particularly sensitive
to the details of cluster architecture and design and that do not
have any special bottlenecks or requirements.
Silly as this distinction may be, it is a crucial one. Problems and
clusters that fit in the former group for all practical purposes must be engineered and operated on a per-problem, per-cluster basis by
the group that uses the cluster. At this point in time the University
simply cannot provide meaningful support for this sort of cluster
computing at the institutional level. As time passes and the cluster
support described in this document is (hopefully successfully)
implemented that may change. At this time, however, it would be a
capital mistake for the University to even consider anything but a local
management model for this sort of cluster.
It is at least possible to describe some fairly ``generic problems''
that fit the latter description, and to describe a ``standard cluster''
architecture that should do just fine to solve them. Remote,
centralized cluster management makes the most sense when the cluster has
a very ``vanilla'' architecture that will work successfully on a wide
range of relatively simple cluster problems. We will therefore focus
most of our attention on problems of this sort.
To make the discussion concrete, let us consider an ``embarrassingly
parallel'' application such as a Monte Carlo computation consisting of
many fully independent sub-computations. We will presume that only a
small amount of data is required to initiate a sub-computation, which
runs for a long time on a single CPU and then returns a small amount of
data that represents the result. Such a computation runs efficiently in
parallel on any number of processors, requires little in the way of
network speed or local storage, and doesn't globally fail if a single
node goes down in the middle of its sub-computation.
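Such a computation can be sketched in a few lines. The example below is purely illustrative (not from the original text): it uses a Monte Carlo estimate of pi as the sub-computation kernel, with Python's multiprocessing pool standing in for the cluster's nodes. Note how each task takes only a tiny input (a seed and a sample count) and returns only a tiny result.

```python
import random
from multiprocessing import Pool

def subcomputation(args):
    """One fully independent sub-computation: count random points in
    the unit square that fall inside the unit circle. The input (seed,
    sample count) and the result (a single integer) are both tiny."""
    seed, n = args
    rng = random.Random(seed)
    return sum(1 for _ in range(n)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)

if __name__ == "__main__":
    n_per_task, tasks = 50_000, 8
    # Each task could equally well run on a separate cluster node;
    # losing one task loses only that task's samples, not the run.
    with Pool() as pool:
        hits = pool.map(subcomputation,
                        [(s, n_per_task) for s in range(tasks)])
    pi_est = 4.0 * sum(hits) / (n_per_task * tasks)
    print(f"pi ~ {pi_est:.3f}")
```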
In addition, we will consider a more challenging but still fundamentally
simple problem such as a ``coarse grained'' lattice decomposition of
some sort. Each node works on a part of some large space (lattice). To
advance the computation many of the nodes have to communicate results
between nodes before they can proceed, and if a single node goes down in
mid-computation the entire computation dies and must be started over
from the beginning. However, each node still does a lot of
computation for a little bit of communications, and the computation can
thus be scaled up to many nodes with a very generic network
architecture. Also, the computation has no particularly special
requirements in terms of local storage or memory and can easily fit on a
fairly standard node design. However, it does generate a fairly large
set of results, output continuously throughout the computation.
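To make the communication pattern concrete, here is a purely serial sketch of such a decomposition, with the node count, lattice size, and update rule chosen only for illustration. Each simulated ``node'' owns a slab of a 1-D lattice, and only the two boundary values per neighbor pair cross the (simulated) network each step, while every interior site is updated locally: a large computation-to-communication ratio.

```python
# Illustrative parameters: 4 "nodes", each owning a 1000-site slab.
N_NODES, SLAB = 4, 1000
slabs = [[float(i) for i in range(SLAB)] for _ in range(N_NODES)]

def step(slabs):
    # "Communication": exchange one boundary value per neighbor pair
    # (fixed 0.0 boundaries at the lattice edges, for illustration).
    left_ghost = [slabs[k - 1][-1] if k > 0 else 0.0
                  for k in range(N_NODES)]
    right_ghost = [slabs[k + 1][0] if k < N_NODES - 1 else 0.0
                   for k in range(N_NODES)]
    # "Computation": relax every interior site against its neighbors.
    new = []
    for k, s in enumerate(slabs):
        padded = [left_ghost[k]] + s + [right_ghost[k]]
        new.append([(padded[i - 1] + padded[i + 1]) / 2.0
                    for i in range(1, SLAB + 1)])
    return new

for _ in range(10):
    slabs = step(slabs)
```

Per step, each node performs of order SLAB arithmetic operations but communicates only two numbers, which is why such a computation scales well even over a very generic network.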
Both of these computations will run efficiently on a very generic
architecture. Let us now analyze the costs of the different ways of
siting the hardware and managing it.
A cluster supercomputer of any design is at heart a client/server LAN.
Some of the costs of installing and managing a LAN scale with the number
of servers. Others are fixed costs that don't scale at all. Still
others scale with the number of clients, or the number of users. As is
the case with any such LAN, primary costs for LAN construction,
maintenance, and administration include items such as:
- Account management - creation, destruction, modification of
fundamental access and groups privileges for all users of the system.
Typically scales with the number of users independent of the number of
clients, sometimes scales with the number of servers as well.
- Disk management - creation of shared server disk resources, their
secure, authenticated exportation to LAN client systems, backup,
retrieval. Scales with number of servers with a very weak dependence on
number of clients.
- Network management - all aspects of managing both clients and
servers on the network. Scales with number of clients plus number of
servers.
- System installation - all aspects of installing servers and
clients, depends strongly on operating system. In package-based linux,
small cost that scales with number and kind of packages installed to get
started, then scales with number of servers and (with an independent
scale factor) number of clients.
- Software management - should be a nearly fixed cost, absorbed mostly
into system installation and thereafter fully automated. Even so, there
is at least a per-package fixed cost for setting up additions,
modifications, and updates to a ``standard'' list of software.
- Security - ensuring the integrity of all data and resource
utilization. A large fixed cost associated with the entire LAN itself,
with per server and per client costs (larger for the servers) and per
user costs. Similar to, and related to, systems management.
- Systems management - monitoring the status of all LAN elements,
identifying and fixing problems, reconfiguration, and more. A large
fixed cost associated with the entire LAN itself, with additional per
server and per client costs (larger for the servers). Similar to, and
related to, security.
- User support - dealing with the myriad of user problems that
occur, teaching, hand-holding and more. A large variable cost that
scales with the number and competence of the users, the competence of
the systems staff, the quality of the LAN hardware and design, the
number of systems in the LAN, the number of tools in common use in the
LAN and much, much more.
- Hardware support - repairing, replacing, disposing of all
hardware as it ages out, arranging replacements for critical components
in a proactive way, troubleshooting, and so forth. Scales with the
amount of hardware, its quality, the load placed on it by all sources of
hardware stress (users, programs, physical environment).
- Administration - paperwork and job related work of all flavors.
A highly variable cost managed in different ways by different
organizations. Scales at least weakly with number of systems and number
of users both.
These are all services that must be provided and costs that must be paid
for any LAN, including the specialized LANs we call a compute
cluster or beowulf.
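The functional form of these costs can be sketched as a toy model. All dollar coefficients below are invented placeholders, not figures from the text; the point is only the shape of the total: a fixed LAN-wide term plus per-server, per-client, and per-user terms.

```python
def annual_admin_cost(servers, clients, users,
                      fixed=5000.0,       # LAN-wide: security, network, admin
                      per_server=2000.0,  # disk, installation, hardware
                      per_client=200.0,   # installation, monitoring
                      per_user=100.0):    # accounts, user support
    """Toy LAN/cluster administration cost model (placeholder numbers)."""
    return (fixed + per_server * servers
            + per_client * clients + per_user * users)

# A hypothetical 1-server, 32-node cluster with 10 users:
print(annual_admin_cost(1, 32, 10))  # prints 14400.0
```

The per-client coefficient being much smaller than the per-server one reflects the point made above: in a package-based install, adding nodes is cheap once the servers and the standard software list are in place.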
In addition, there are certain physical infrastructure costs associated
with a LAN that must be tallied. These are not human or management
costs (detailed above) but are nonetheless far from negligible.
- Power. Clients, servers, and network components are all
electronic and consume electrical power. In very rough numbers, it costs
$75 to provide 100 watts of electrical power twenty-four hours a
day for one year at $0.08 per kilowatt-hour. In addition, any location
that must house more components than its existing electrical supply can
feed will require remodeling and rewiring to achieve the required
supply density. This cost scales with the number of components of any
given power consumption, or with total power consumed.
- Cooling. All the power consumed by any LAN component must be
removed from the environment in a steady-state way or it will build up
as heat, damaging components and risking fires. Cooling occurs by many
physical mechanisms in any environment, including natural mechanisms,
and the natural mechanisms vary in efficacy with e.g. the outside
temperature, humidity, airflow, and details of the components' physical
location. We will assume (again in very rough numbers) that an
electronic component consuming 100 watts of electrical power
(all of which continuously appears in the immediate environment of
the component as heat) will require roughly 33 watts of power, on
average, to remove that heat. That is, $25 per 100-watt component, per
year. In addition, any location that must house more components than
its existing cooling capacity can handle will require remodeling to
achieve the required capacity. This cost scales roughly with the total
power consumed by all components.
- Physical space. It is especially difficult to estimate the cost
of space in a LAN environment. Every workstation location requires at
least desk space for e.g. system unit, monitor, keyboard and mouse in an
office/workspace environment. Servers and cluster nodes require space
that is more typically fully dedicated to computers and provided with
ample power and cooling. In that space, components can reach very high
densities. The cost of the dedicated space may be ``high'' where it
displaces humans or requires extensive remodeling (amortized, of course,
over the lifetime of the space), it may be irrelevant (in new
construction), it may be ``low'' when finding the space is a matter of
cleaning out an unused supply room with plenty of power and cooling
capacity relative to what you plan to put into it. There are additional
nonlinearities in that small spaces may cost more or less, per
component, than big spaces.
- Global network infrastructure. Access to the LAN backbone, and
LAN access to the campus WAN backbone. The former scales roughly with
the number of LAN environments or networked components, the latter is a
fixed cost per LAN.
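The rough power and cooling figures quoted above can be checked with a few lines of arithmetic, using the $0.08/kWh rate from the text (the text's $75 and $25 are rounded up from these values):

```python
RATE = 0.08               # dollars per kilowatt-hour (from the text)
HOURS_PER_YEAR = 24 * 365 # 8760 hours of continuous operation

def annual_cost(watts, rate=RATE):
    """Dollars per year to run a load of `watts` continuously."""
    return watts / 1000.0 * HOURS_PER_YEAR * rate

power_cost = annual_cost(100)  # ~$70/yr for a 100 W component
cooling_cost = annual_cost(33) # ~$23/yr to remove that component's heat
print(round(power_cost, 2), round(cooling_cost, 2))
```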
With these costs in hand at least by name, we are finally in a position
to consider and compare the various location/management schemes.
Robert G. Brown
2003-06-02