Beowulf Papers
Daniel Ridge, Donald Becker, Phillip Merkey, Thomas Sterling
Beowulf: Harnessing the Power of Parallelism in a Pile-of-PCs
Proceedings, IEEE Aerospace, 1997
Abstract
The rapid increase in performance of mass market commodity
microprocessors and significant disparity in pricing between
PCs and scientific workstations have provided an opportunity
for substantial gains in performance to cost by harnessing
PC technology in parallel ensembles to provide high end
capability for scientific and engineering applications. The
Beowulf project is a NASA initiative sponsored by the
HPCC program to explore the potential of Pile-of-PCs and
to develop the necessary methodologies to apply these
low cost system configurations to NASA computational
requirements in the Earth and space sciences. Recently, a
16 processor Beowulf costing less than $50,000 sustained
1.25 Gigaflops on a gravitational N-body simulation of 10
million particles with a Tree code algorithm using standard
commodity hardware and software components. This paper
describes the technologies and methodologies employed to
achieve this breakthrough. Both the opportunities afforded by
this approach and the challenges confronting its application
to real-world problems are discussed in the framework of
hardware and software systems as well as the results from
benchmarking experiments. Finally, near term technology
trends and future directions of the Pile-of-PCs concept are
considered.
PostScript
Chance Reschke, Thomas Sterling, Daniel Ridge, Daniel Savarese, Donald
Becker, Phillip Merkey
A Design Study of Alternative Network Topologies for the Beowulf
Parallel Workstation
Proceedings, High Performance and Distributed Computing, 1996
Abstract
Coupling PC-based commodity technology with distributed
computing methodologies provides an important advance in
the development of single-user dedicated systems. Beowulf
is a class of experimental parallel workstations developed
to evaluate and characterize the design space of this
new operating point in price-performance. A key factor
determining the realizable performance under real-world
workloads is the means devised for interprocessor
communications. A study has been performed to characterize
the design parameters of a family of interconnect
topologies feasible with low cost mass market network
technologies. Findings are presented which compare the
advantage of complex segmented topologies over earlier
parallel "channel bonded" schemes. Behavior sensitivities
to packet size and traffic density are determined. It
is shown that under many circumstances the more complex
topologies result in better performance, and under favorable
circumstances software routing techniques experience little
performance degradation when compared to more expensive
hardware switch mechanisms.
HTML
PostScript
Thomas Sterling, Donald J. Becker, Daniel Savarese, Michael R. Berry, Chance Res
Achieving a Balanced Low-Cost Architecture for Mass Storage
Management through Multiple Fast Ethernet Channels on the Beowulf Parallel
Workstation
Proceedings, International Parallel Processing Symposium, 1996
Abstract
Network-of-Workstations (NOW) systems seek to leverage commercial
workstation technology to produce high performance computing
systems at costs appreciably lower than parallel computers
specifically designed for that purpose. The capabilities
of technologies emerging from the PC commodity mass market
are rapidly evolving to converge with those of workstations
while at significantly lower cost. A new operating point
in the price-performance design space of parallel system
architecture may be derived through parallelism of PC
subsystems. The Pile-of-PCs, PopC (pronounced "pop-see"),
approach is being explored through the Beowulf Parallel
Workstation developed to provide order-of-magnitude
increases in disk capacity and bandwidth for a single user
environment at costs commensurate with conventional high-end
workstations. This paper explores a critical aspect of the
architecture trade-off space for Beowulf associated with the
balance of parallel disk throughput and internal network
bandwidth. The findings presented demonstrate that parallel
channels of commodity 100 Mbps Ethernet are both necessary
and sufficient to support the data rates of multiple
concurrent file transfers on a sixteen processor Beowulf
parallel workstation.
HTML
PostScript
Donald J. Becker, Thomas Sterling, Daniel Savarese, Bruce Fryxell,
Kevin Olson
Communication Overhead for Space Science Applications on the
Beowulf Parallel Workstation
Proceedings, High Performance and Distributed Computing, 1995
Abstract
The Beowulf parallel workstation combines 16 PC-compatible processing
subsystems and disk drives using dual Ethernet networks to provide a
single-user environment with 1 Gops peak performance, half a Gbyte of disk
storage, and up to 8 times the disk I/O bandwidth of conventional workstations.
The Beowulf architecture establishes a new operating point in price-performance
for single-user environments requiring high disk capacity and bandwidth. The
Beowulf research project is investigating the feasibility of exploiting mass
market commodity computing elements in support of Earth and space science
requirements for large data-set browsing and visualization, simulation of
natural physical processes, and assimilation of remote sensing data. This paper
reports the findings from a series of experiments for characterizing the
Beowulf dual channel communication overhead. It is shown that dual networks can
sustain 70% greater throughput than a single network alone but that bandwidth
achieved is more highly sensitive to message size than to the number
of messages at peak demand. While overhead is shown to be high for
global synchronization, its overall impact on scalability of real
world applications for computational fluid dynamics and N-body
gravitational simulation is shown to be modest.
HTML
PostScript
Donald J. Becker, Thomas Sterling, Daniel Savarese, John E. Dorband,
Udaya A. Ranawake, Charles V. Packer
Beowulf: A Parallel Workstation for Scientific
Computation
Proceedings, International Conference on Parallel Processing, 1995
Abstract
Network-of-Workstations technology is applied to the
challenge of implementing very high performance workstations
for Earth and space science applications. The Beowulf
parallel workstation employs 16 PC-based processing modules
integrated with multiple Ethernet networks. Large disk
capacity and high disk-to-memory bandwidth are achieved
through the use of a hard disk and controller for each
processing module supporting up to 16-way concurrent
accesses. The paper presents results from a series of
experiments that measure the scaling characteristics
of Beowulf in terms of communication bandwidth, file
transfer rates, and processing performance. The evaluation
includes a computational fluid dynamics code and an N-body
gravitational simulation program. It is shown that the
Beowulf architecture provides a new operating point in
performance to cost for high performance workstations,
especially for file transfers under favorable conditions.
HTML
PostScript