LMbench Results

LMbench is one of the best-known and most useful suites of microbenchmarks. It is also "old" in the sense that it has been around a long time and works pretty well, so it can provide useful comparative results across many generations of system evolution and a variety of architectures. These and many of its other virtues are extolled on its website and in its FAQ, linked above.

The following is a fairly standard lmbench "results" summary generated by the latest version, lmbench-2beta. Note that it was impossible or irrelevant to test disk or network on this particular system at this particular time, so the most useful results in the collection below are likely those associated with CPU and memory subsystem performance.

The greatest weakness of this particular suite is that it isn't really designed to test SMP performance boundaries (yet; I believe Carl Staelin is working on it). Because of the way the timing harness works, it may or may not make any sense at all to simply run a microbenchmark twice at the same time, as one frequently does with "coarse grained" performance measures that take it for granted that, e.g., system interrupts will interlace with the timed intervals.
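To make the timing-harness concern concrete, here is a minimal sketch of the general shape of a microbenchmark timing loop. It is NOT lmbench's actual harness; the function name time_per_call, the repetition counts, and the use of os.getpid as a stand-in for a cheap system call are all assumptions made for illustration.

```python
import os
import time


def time_per_call(op, calls=100_000, trials=5):
    """Return the best observed wall-clock time per call of op(), in microseconds."""
    best = float("inf")
    for _ in range(trials):
        start = time.perf_counter()
        for _ in range(calls):
            op()
        elapsed = time.perf_counter() - start
        # Keep the minimum: slower trials most likely include unrelated system activity.
        best = min(best, elapsed / calls)
    return best * 1e6


if __name__ == "__main__":
    # os.getpid is only a stand-in for a cheap system call (an assumption of this sketch).
    print(f"{time_per_call(os.getpid):.3f} microseconds per call")
```

A loop like this measures elapsed wall-clock time, so whatever else runs during the timed interval (a second copy of the benchmark, interrupt handling) is folded into the result. Interpreter overhead dominates in Python, so the numbers it prints are not comparable to the lmbench figures below.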

Still, it is hard to come up with a single set of microbenchmark numbers better than what lmbench provides to guide systems and beowulf engineering. To give you SOME local framework for these numbers, I provide a comparison to a 933 MHz single-CPU PIII with 128 MB of PC133 SDRAM (the host "ganesh" in the tables below).


LMbench Results for the Dual Athlon


                 L M B E N C H  2 . 0   S U M M A R Y
                 ------------------------------------
                 (Alpha software, do not distribute)

Basic system parameters
----------------------------------------------------
Host                 OS Description              Mhz
                                                    
--------- ------------- ----------------------- ----
dual.asla Linux 2.4.2-a       i686-pc-linux-gnu 1197
ganesh    Linux 2.2.16-       i686-pc-linux-gnu  933

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host                 OS  Mhz null null      open selct sig  sig  fork exec sh  
                             call  I/O stat clos TCP   inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
dual.asla Linux 2.4.2-a 1197 0.25 0.40 2.15 3.48    14 0.64 2.00  257  959 6481
ganesh    Linux 2.2.16-  933 0.33 0.48 1.75 2.56    18 0.95 1.23  122  901 5048

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host                 OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
dual.asla Linux 2.4.2-a 1.740 3.3900     14 6.0500    176      35     173
ganesh    Linux 2.2.16- 0.640 3.7400     10 5.2900    113      25     125

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
dual.asla Linux 2.4.2-a 1.740 7.118   19    18          22         65
ganesh    Linux 2.2.16- 0.640 3.510 7.64    16    34    24    49   86

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host                 OS   0K File      10K File      Mmap    Prot    Page	
                        Create Delete Create Delete  Latency Fault   Fault 
--------- ------------- ------ ------ ------ ------  ------- -----   ----- 
dual.asla Linux 2.4.2-a 3.5448 0.7863 7.9968 1.6108      152 0.452 2.00000
ganesh    Linux 2.2.16- 7.3508 0.7005    152 1.2181     2714 0.802     463

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host                OS  Pipe AF    TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
dual.asla Linux 2.4.2-a  368  129  223    257    396    267    267  384   455
ganesh    Linux 2.2.16-  877  537   91    330    481    151    140  481   210

Memory latencies in nanoseconds - smaller is better
    (WARNING - may not be correct, check graphs)
---------------------------------------------------
Host                 OS   Mhz  L1 $   L2 $    Main mem    Guesses
--------- -------------  ---- ----- ------    --------    -------
dual.asla Linux 2.4.2-a  1197 2.506     16    206
ganesh    Linux 2.2.16-   933 3.229 7.5810    129
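
Because the two hosts run at different clock rates, it can help to restate the memory latencies in CPU clock cycles rather than nanoseconds. The short sketch below does that arithmetic with the values copied from the table above (cycles = nanoseconds x MHz / 1000). The names LATENCIES, ns_to_cycles and cycles_table are mine, introduced only for this illustration, and the table's own warning that the latencies may not be correct still applies.

```python
# Values copied from the memory latency table above:
# host -> (clock in MHz, L1 ns, L2 ns, main memory ns)
LATENCIES = {
    "dual.asla": (1197, 2.506, 16.0, 206.0),
    "ganesh": (933, 3.229, 7.581, 129.0),
}


def ns_to_cycles(ns, mhz):
    """Convert a latency in nanoseconds to CPU clock cycles at the given clock rate."""
    return ns * mhz / 1000.0


def cycles_table(latencies):
    """Map each host to its (L1, L2, main memory) latencies in clock cycles."""
    return {
        host: tuple(round(ns_to_cycles(ns, mhz), 1) for ns in levels)
        for host, (mhz, *levels) in latencies.items()
    }


if __name__ == "__main__":
    for host, (l1, l2, mem) in cycles_table(LATENCIES).items():
        print(f"{host:10s} L1 {l1:6.1f}  L2 {l2:6.1f}  main {mem:6.1f}  (cycles)")
```

Taken at face value, both CPUs come out at about 3 cycles for L1, so the L1 difference in nanoseconds is essentially the clock rate. L2 works out to roughly 19 cycles on the Athlon against 7 on the PIII, and main memory to roughly 247 cycles against 120.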