
The Standard Node Approach

The University already (re)sells computer hardware via the Duke computer store. When cluster nodes are purchased from a cluster integrator or a value-added reseller, one buys both the hardware and the ``integration'': basically, the charge for pre-installing Linux in a suitable cluster configuration and building one or more server or head nodes.

As one can see from the description above, in most cases the ``integrated'' clusters thus purchased require further effort to actually integrate into a LAN environment. Accounts need to be installed, disk resources shared across the LAN need to be managed, users need to be supported, and specialized software not pre-installed by the integrator needs to be installed. One is then left with a dilemma in long-term maintenance: one particular snapshot of some Linux distribution is installed on the hardware, and one has to work quite hard to arrange for this distribution to be updated or augmented. All this work is a considerable added cost on top of the integration fee charged by the cluster integrator. It is actually considerably more costly to the University to install a pre-configured ``integrated'' cluster from a vendor, and then perform the work required to insert the cluster into a preexisting LAN environment, than it is to just reinstall the cluster nodes themselves from kickstart!

In the previous section, we saw that Red Hat, a campus-wide installation archive, kickstart and yum enable anyone to install a ``cluster node'' once these tools are customized for their particular LAN environment. In the case of a University-run centralized cluster using the acpub LAN environment, this work amounts to precisely the same ``integration'' sold by many vendors, except that it is customized to fit the actual LAN environment and account scheme. These costs are predictable (to a point) and scale nearly perfectly once the initial development and deployment cost is paid.

A suitable model for cost recovery for the University is thus to ``resell'' integrated compute nodes that can be inserted into the publicly run clusters! Using a mix of local vendors such as Intrex and web-based vendors such as Dell and MicroWarehouse (ideally with pre-negotiated special pricing) to provide the actual cluster nodes, required hardware services and extended warranties, the University should be able to easily match the margins of any commercial integrator and achieve true and full integration using the schema described above in fair detail.

An example of the cost scale and cost recovery for a University-resold compute node might be (noting that these prices are approximations based on considerable experience purchasing more or less standard nodes):

for a total price to the end-user of $1200/node exclusive of network switch capacity, cabling, and miscellaneous hardware. The ``integration fee'' would cover node installation (one FTE hour) and an expected two FTE hours of node-level attention, on average, during the node's three-year expected lifetime, at roughly $35/hour. It presumes that the hardware service fee eliminates ``all'' the cost of hardware service to local systems staff beyond perhaps diagnosing the problem.
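
The labor portion of the integration fee described above can be checked with a few lines of arithmetic. The sketch below, in Python, uses only the figures quoted in the text (one FTE hour for installation, two for node-level attention, roughly $35/hour, a three-year lifetime); the function and variable names are illustrative assumptions, not part of any University costing tool.

```python
# Figures taken from the text; all names here are hypothetical.
HOURLY_RATE = 35.0       # approximate dollars per FTE hour
INSTALL_HOURS = 1.0      # one FTE hour to install a node
ATTENTION_HOURS = 2.0    # expected node-level attention over its lifetime
LIFETIME_YEARS = 3       # expected node lifetime


def integration_labor_cost(install_hours, attention_hours, hourly_rate):
    """Return the staff labor cost (in dollars) attributed to one node."""
    return (install_hours + attention_hours) * hourly_rate


labor = integration_labor_cost(INSTALL_HOURS, ATTENTION_HOURS, HOURLY_RATE)
labor_per_year = labor / LIFETIME_YEARS

print(f"Labor per node over its lifetime: ${labor:.2f}")
print(f"Labor per node per year:          ${labor_per_year:.2f}")
```

With these figures the labor comes to about $105 per node over the node's lifetime, or about $35 per node per year.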

This estimate will need to be adjusted to reflect reality on the basis of experience as the project proceeds. For example, nodes may cost only $1000 with extended service, or may cost $1500 in a higher memory configuration. Both single CPU configurations (presumed in the $1000 price tag) and dual CPU configurations (which would likely cost about $1600 for equivalent memory per CPU and hence provide slightly better cost scaling for certain classes of problems) are likely to be options. It may be that $100 per node proves inadequate to recover the actual installation and management costs, on average.
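
To make the single versus dual CPU comparison concrete, the following Python sketch computes cost per CPU from the approximate prices quoted above ($1000 for a single CPU node, about $1600 for a dual CPU node with equivalent memory per CPU). The names are hypothetical and the prices are the text's rough estimates, so the result is indicative only; it also assumes the workload can keep both CPUs busy.

```python
def cost_per_cpu(node_price, cpus_per_node):
    """Return the hardware cost per CPU (in dollars) for one node."""
    return node_price / cpus_per_node


# Approximate node prices quoted in the text.
single = cost_per_cpu(1000.0, 1)   # single CPU node
dual = cost_per_cpu(1600.0, 2)     # dual CPU node, same memory per CPU

# Fraction saved per CPU by choosing the dual CPU configuration.
savings_fraction = (single - dual) / single

print(f"Single CPU node: ${single:.0f} per CPU")
print(f"Dual CPU node:   ${dual:.0f} per CPU")
print(f"Saving per CPU:  {savings_fraction:.0%}")
```

Under these assumed prices the dual CPU node works out to $800 per CPU against $1000 per CPU for the single CPU node.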

These costs for an integrated node, resold on a break-even or win-a-bit basis, are highly comparable to the costs one would obtain on the open market, providing University researchers with an attractive alternative to doing it themselves or to having it done by a profit-seeking outside integrator. This is essential for any sort of centralized cluster effort to succeed. The University will not attract researchers and grant money to populate University-run clusters if the researchers perceive the cluster nodes thus obtained as being significantly more expensive than the market value of similar integrated nodes. Why should they buy a turnkey cluster ``from Duke'' if they can buy a turnkey cluster from any of a dozen vendors?

True, those turnkey clusters are far from ideal and actually have a number of hidden integration costs when inserted into an actual LAN environment. However, research groups will accept those risks and hidden costs if it means that they can buy significantly more nodes and hence get significantly more work done, and will worry about dealing with the difficulties once the nodes are in hand.

An extremely important feature of this ``integrated node'' approach is that there should be basically no difficulty in justifying the cost of integrated (turnkey) cluster nodes purchased from the University, any more than there is for integrated cluster nodes purchased from an outside vendor, as long as those nodes are cost-competitive.

A final important feature of this approach is that the purchaser retains ``ownership'' of the nodes. The nodes are basically prepaid for University-level management in one of the various University cluster sites, but they belong to their purchaser. The purchaser can dictate to what extent they participate in any sort of resource sharing program. The purchaser can, if and when they have some alternative way of siting and managing the nodes, recover and move their nodes (at their own expense). This means that racks populated by cluster nodes belonging to several groups, who may well choose to share them, will not violate any laws or outrage the sensibilities of the groups that purchase those nodes.


Robert G. Brown 2003-06-02