Stream Results

The following are the results of running stream on the dual Athlon. The first test runs a single instance of stream in its default configuration (using only one of the two CPUs). The second runs two instances of stream at the same time so that they contend for the memory subsystem.
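
The key detail in the second test is that both instances must overlap in time, otherwise they never compete for memory. A minimal sketch of one way to arrange that is shown here; the helper name run_concurrently and the ./stream path are assumptions for illustration, not the commands used to produce the listings on this page.

```python
import subprocess


def run_concurrently(cmd, copies=2):
    """Start `copies` instances of `cmd` at once and return each one's stdout.

    Hypothetical helper: every process is launched before any of them is
    waited on, so the instances run at the same time and share the memory bus.
    """
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for _ in range(copies)
    ]
    # Collect output only after all instances have been started.
    # This assumes each instance prints a small amount of text (as stream does),
    # so no process stalls on a full pipe while waiting to be read.
    return [p.communicate()[0] for p in procs]


# Example call, assuming the benchmark binary was built as ./stream:
#   first_output, second_output = run_concurrently(["./stream"])
```

Launching both from a shell with a trailing & on the first command would work just as well; the point is only that neither instance finishes before the other starts.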

In summary, the single test shows a bandwidth of roughly 550-830 MB/sec, depending on the operation. In the dual test this drops to roughly 350-410 MB/sec per process. Summing the two processes gives about 715-825 MB/sec, which suggests an aggregate bandwidth in the 800 MB/sec ballpark.
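
The aggregate figure comes from adding the per-process rates of the two concurrent runs. The short sketch below does that arithmetic with the rates copied from the dual-stream listings on this page; the variable and function names are made up for illustration.

```python
# Best rates in MB/s, copied from the two dual-stream listings on this page.
first = {"Copy": 385.8395, "Scale": 356.0785, "Add": 412.1871, "Triad": 406.0568}
second = {"Copy": 386.2303, "Scale": 358.8172, "Add": 411.2542, "Triad": 406.9868}


def aggregate(a, b):
    """Sum the per-process rates kernel by kernel."""
    return {kernel: a[kernel] + b[kernel] for kernel in a}


totals = aggregate(first, second)
for kernel, rate in totals.items():
    print(f"{kernel:<6} {rate:8.1f} MB/s")
```

The sums fall between about 715 MB/sec (Scale) and 823 MB/sec (Add), which is where the 800 MB/sec estimate comes from.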


Single Stream

-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
Using wall clock timer gettimeofday() for timing.
-------------------------------------------------------------
Array size = 1000000, Offset = 0
Total memory required = 22.9 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of at least 17700 microseconds.
   (= 17700 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         690.8469       0.0232       0.0232       0.0234
Scale:        547.7939       0.0293       0.0292       0.0293
Add:          833.5918       0.0288       0.0288       0.0288
Triad:        693.8612       0.0346       0.0346       0.0347
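
As a sanity check on how to read the table: stream derives each rate from the bytes moved per pass divided by the best (minimum) time, counting two arrays for Copy and Scale and three for Add and Triad. The sketch below redoes that arithmetic from the printed values. Because the times are printed to only four decimal places, the recomputed rates match the table to within a fraction of a percent rather than exactly. The names used are illustrative.

```python
N = 1_000_000   # array size from the listing
WORD = 8        # bytes per DOUBLE PRECISION word from the listing

# Arrays touched per element by each kernel (reads plus writes).
ARRAYS = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# "Min time" column of the single-stream table, in seconds.
MIN_TIME = {"Copy": 0.0232, "Scale": 0.0292, "Add": 0.0288, "Triad": 0.0346}


def rate_mb_per_s(kernel, seconds):
    """Bytes moved in one pass divided by elapsed time, in units of 10^6 bytes/s."""
    return ARRAYS[kernel] * WORD * N / seconds / 1e6


for kernel, seconds in MIN_TIME.items():
    print(f"{kernel:<6} {rate_mb_per_s(kernel, seconds):8.1f} MB/s")

# The three arrays together occupy 3 * 8 * N bytes, reported in units of 2^20 bytes.
total_mb = 3 * WORD * N / 2**20
print(f"Total memory: {total_mb:.1f} MB")
```

The same calculation applies to the dual-stream tables, where the longer minimum times account for the lower per-process rates.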

Dual Streams (two at a time)

-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
Using wall clock timer gettimeofday() for timing.
-------------------------------------------------------------
Array size = 1000000, Offset = 0
Total memory required = 22.9 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of at least 25056 microseconds.
   (= 25056 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         385.8395       0.0418       0.0415       0.0419
Scale:        356.0785       0.0451       0.0449       0.0453
Add:          412.1871       0.0585       0.0582       0.0589
Triad:        406.0568       0.0595       0.0591       0.0600

-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
Using wall clock timer gettimeofday() for timing.
-------------------------------------------------------------
Array size = 1000000, Offset = 0
Total memory required = 22.9 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of at least 25370 microseconds.
   (= 25370 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         386.2303       0.0418       0.0414       0.0420
Scale:        358.8172       0.0448       0.0446       0.0450
Add:          411.2542       0.0586       0.0584       0.0589
Triad:        406.9868       0.0594       0.0590       0.0599