DieHarder: A Random Number Test Suite

One suggestion I got on the GSL list was to call this suite "dieharder" as it includes and extends George Marsaglia's "Diehard" suite of tests of random number generators. This actually sounds pretty good as a tribute to Diehard's enduring usefulness as a test suite. Only now, many years after it was introduced, are computers getting to be fast enough that extensions to diehard are required to push the existing pseudorandom number generators to failure.

rand_rate Version 0.5.3

Click here to see a list of all rand_rate files on this repository.

This is the current snapshot of the rand_rate random number tester. It encapsulates all of the Gnu Scientific Library random number generators as well as /dev/random and /dev/urandom in a single harness that can time them and subject them to various tests for randomness. These tests are variously drawn from Marsaglia's "Diehard" battery of random number tests, the NIST STS tests, useful e.g. binning or spectral tests that I've found doing research into random number tests or tests that I myself have made up or that are improvements on tests from other sources.

The primary point of this tester is to make it easy to time and test (pseudo)random number generators OR hardware or other generators. Three examples are provided of wrapping a random number generator and inserting it so that it is can be called via the GSL interface. An interface is planned that would allow random numbers to be read in from a file, allowing the tool to be used with any generator (wrappable or not) that can generate a table of random numbers.

Another important motivation for writing rand_rate is to have the entire test code and documentation be fully Gnu Public Licensed and hence openly available for adaptation, testing, modification, including the addition of new tests. The primary objections I have towards diehard and sts are not that they are or are not adequate and complete; it is that the code is obscure and not explicitly published for reuse. It is my hope that by providing this tool in autodocumenting source, it makes it easy to add new tests, critique older tests, and improve the suite in general.

NOTE WELL

If you compile the test and run it as

rand_rate -t -1

it should run the entire suite on the default generator. Choose alternative tests with -r number where rand_rate -t 0 will list all possible numbers known to the current test (mostly from the GSL).

The following are the tests so far implemented and (possibly) functional:

List of Tests/Comments

I'm mostly through a rewrite of the code that is putting it all in a common format where a shell calls and samples the actual test, after setting any test-specific parameters and spitting out a more or less standard form description of the test (which can be surpressed by the -q flag). A few diehard tests remain to be converted, and rgb_persist cannot be converted as it doesn't produce a p-value, it just fails or doesn't fail. Once I finish converting the existing tests it is back to adding more diehard tests, as my current goal is to get all of diehard implemented this year (and more selected tests out of sts).

Thoughts for the Future

I think that it would be "interesting" to apply this exact framework to the GSL distribution generators as a test of random numbers TOTALLY within the GSL -- if one simply (e.g.) samples a poissonian using randu as your rng and compare the result to the poissonian probability density function and use it to compute Pearson's chisq and a p-value (vector) and eventually a KS p-value, does that prove to be as sensitive a test as the various (far more complex) STS and Diehard tests? An interesting question, if nothing else.

For example, the birthday test seems like a difficult way to generate a Poissonian compared to just "doing it" from an rng. Is there any evidence that it has greater sensitivity to various failures of the rng than a test that didn't have to achieve an asymtotic limit (large bit windows, many birthdays, many samples) in order to become Poissonian in the first place? Similarly, does the sts runs test have anything to recommend it relative to just counting windowed integer values returned by the rng (integer values extracted from the actual bits viewed through a "port" if fixed width with cyclic/periodic wraparound of the source integer/bitstring)? I doubt it, but sensitivity tests require a lot of time and energy to conduct. When the tool is finished, I'll conduct them.

Anyway, I hope that even during development, you find rand_rate useful. If you have any interest in participating in its development, be sure to let me know, especially if you have decent coding skills and a basic knowledge of statistics. I have documents to help with the latter, if you have the programming skills and want to LEARN statistics.