From staelin@exch.hpl.hp.com Wed May  2 10:17:31 2001
Date: Wed, 2 May 2001 06:21:36 -0700 
From: "Staelin, Carl" <staelin@exch.hpl.hp.com>
To: Larry McVoy <lm@bitmover.com>
Cc: lmbench-users@bitmover.com
Subject: RE: cache info

Larry,

I think there are still some nits in cache's
algorithm/method, but for the most part the
basic memory latency code seems pretty stable
now.

This table is an attempt to summarize the
current state of the lmbench3 "cache" benchmark.
The "known" column lists the documented sizes of
the data (D) or unified instruction/data (I/D)
caches at each level.  The "cache-out" column
lists the sizes reported by cache.

machine    CPU          known        cache-out
---------------------------------------------------
aix[1]     140/43p      32K/1M       32K/768K
alpha[2]   ev56         8K/96K/1M    8K/1M
freebsd    ???          ???          16K/128K
freebsd3   ???          ???          32K/384K
freebsd4   ???          ???          32K
hp         780          1M           1M
hpli69[3]  450Mhz PIII  16K/512K     16K/512K
openbsd    ???          ???          16K/128K
sgi[4]     IP22         16K/1M       16K/1M
sparc[5]   Ultra IIi    16K/512K?    16K/256K
sun        Ultra1       ???          16K/512K
sunx86     450Xeon      16K/512K     16K/512K

Note: According to [5], the TI UltraSparc IIi
   could have been shipped with 256K, 512K, or
   more L2 cache.  So, it is possible that sparc
   only has 256K of cache.  The true cache size
   needs to be double-checked.

As you can see, I don't actually know the real
cache sizes for several of the machines, and
for those machines where the size is known,
"cache" sometimes reports the wrong number.

Known problems:

   aix: the true cache size is 1M, yet cache only
      finds 768K.  The latency curves have a large
      spike between 655K and 768K, and cache rounds
      up to the next "sensible" cache size.  
   alpha: cache simply misses the L2 cache, and
      only reports L1 and L3.
   sparc: looking at the latency graphs, I actually
      believe that cache may be correct.  There is
      a really strong spike around 256K.
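
The rounding described for aix can be sketched as
follows.  This is a minimal illustration, not the
code in cache: the helper name round_up_sensible
is hypothetical, and so is the assumption that a
"sensible" size is 2^k or 3 * 2^k bytes (a family
that includes 768K and 384K from the table).

```c
#include <stddef.h>

/*
 * Hypothetical helper (not lmbench code): round a measured size up to
 * the next "sensible" cache size, assumed here to be 2^k or 3 * 2^k
 * bytes.  Overflow for sizes near SIZE_MAX is not handled.
 */
size_t round_up_sensible(size_t bytes)
{
    size_t p = 1;

    if (bytes <= 1)
        return 1;

    /* After the loop, p is the largest power of two <= bytes. */
    while (p <= bytes / 2)
        p *= 2;

    if (bytes == p)
        return p;               /* already a power of two */
    if (bytes <= p + p / 2)
        return p + p / 2;       /* 1.5 * 2^k, i.e. 3 * 2^(k-1) */
    return 2 * p;               /* next power of two */
}
```

Under that assumption, a boundary measured at 655K
rounds up to 768K, which matches what cache reports
for aix instead of the true 1M.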

Suspected problems:

   freebsd3: does this machine really have a
      384K L2 cache?  This doesn't really make
      sense.
   freebsd4: does this machine really only have
      a single 32K L1 cache?

I have attached a file, cache.out, which includes
the raw memory latency data and inferred cache
parameters for each of the machines in the above
table.  Note that the memory latencies may differ
from those reported by lat_mem_rd, because cache
is insensitive to the cache line size and because
it attempts to find pages which do not collide in
the cache (a sort of user-land, after-the-fact
page coloring).
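
One simple way to infer cache boundaries from
latency-versus-size data of the kind described
above is to look for jumps between adjacent
working-set sizes.  The sketch below is only an
illustration and is not the algorithm cache uses;
the function name find_cache_boundaries, its
parameters, and the ratio threshold are all
assumptions.

```c
#include <stddef.h>

/*
 * Hypothetical sketch (not the algorithm used by "cache"): scan a
 * latency curve ordered by increasing working-set size and record the
 * last size before each jump in latency.
 *
 *   sizes[i]   working-set size in bytes for sample i
 *   lat_ns[i]  measured load latency in nanoseconds for sample i
 *   ratio      assumed threshold: a jump is lat_ns[i] > ratio * lat_ns[i-1]
 *
 * Returns the number of boundaries written to out[] (at most max_out).
 */
size_t find_cache_boundaries(const size_t *sizes, const double *lat_ns,
                             size_t n, double ratio,
                             size_t *out, size_t max_out)
{
    size_t found = 0;
    size_t i;

    for (i = 1; i < n && found < max_out; i++) {
        if (lat_ns[i] > ratio * lat_ns[i - 1])
            out[found++] = sizes[i - 1];
    }
    return found;
}
```

A detector this simple only compares neighbouring
samples, so a latency rise that is spread over
several sizes can be missed or reported at the
wrong size.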

At this point, the key thing I need to make
progress on cache is more "ground-truthed"
hosts, where I can compare cache's results
with known results.  Right now I have cache
information I trust for: aix, alpha, hp,
hpli69, sgi, and sunx86.  I would like to
have information for: freebsd, freebsd3, 
freebsd4, openbsd, sparc, and sun.  (I need
the numbers for sparc to be double-checked.)

Cheers,

Carl


references
----------

[1] AIX 140-43p, http://www.dataweld.com/RS600043P140.htm
[2] Alpha ev56, http://www.cesgroup.com/prism/n7/alpha2.htm
[3] Pentium III 450MHz, http://www.geek.com/procspec/intel/pentium3consumer.htm
[4] SGI IP22, data from utility "hinv"
[5] TI UltraSparc IIi, http://www.sun.ca/microelectronics/whitepapers/UltraSPARC-IIi/03.html

_________________________________________________
[(hp)]	Carl Staelin
	Senior Research Scientist
	Hewlett-Packard Laboratories
	Technion City
	Haifa, 32000
	ISRAEL
	+972(4)823-1237x221	+972(4)822-0407 fax
	staelin@hpl.hp.com
_______http://www.hpl.hp.com/personal/Carl_Staelin_______


> -----Original Message-----
> From: Larry McVoy [mailto:lm@bitmover.com]
> Sent: Wednesday, October 18, 2000 8:16 PM
> To: staelin@hpl.hp.com
> Subject: cache info
> 
> 
> redhat5.2: 128K, I think unified, it's a celeron
> build62, redhat62, suse64, openbsd, freebsd4 are all celerons
> freebsd3: AMD K6, either 32K or 64K, I think 32.
> 
> ppc is a PowerPC with 512K unified
> alpha is an EV56, dunno, search on google, I think it might 
> be 256 or 512
> sparc is a Sun Ultra5, TI UltraSparc IIi, looks like 16K 
> onchip, 512K off
> aix - don't know it's a 140/43p but I think that is a widely 
> used model number
> sgi - 16K / 1MB
> disks - AMD K7, 512K, runs at 1/2 or 1/3 clock which is 750Mhz (by the
> way, we're getting a Ghz K7 soon).
> sunx86 - 2x450Mhz Xeon, I think 512 but may be 256.


    [ Part 2, Application/OCTET-STREAM (Name: "cache.out")  20KB. ]

