Random Number Generators

A basic introduction to random number generation algorithms

Excerpts from the book 'Algorithms, Data Structures and Problem Solving with C++' by Mark Allen Weiss

How are random numbers generated? True randomness is impossible to do on a computer since any number obtained will depend on the algorithm used to generate them and thus cannot possibly be random. Generally, it is sufficient to produce pseudorandom numbers, which are numbers that appear to be random. When we say they appear to be random, we mean that pseudorandom numbers should satisfy many of the properties that random numbers do. This is much easier said than done though.

Suppose we only need to simulate a coin flip. One way to do this is to examine the system clock. Presumably the clock maintains the number of seconds in the current time; if the number is even, we can return 0(for heads), and if it's odd, we can return 1 (for tails). The problem is that this strategy does not work well if we need a sequence of random numbers. One second is a long time, and the clock might not change at all while the program is running. We would expect to generate all 0s or all 1s, which is hardly a random-looking sequence. Even if the time was recorded in units of microseconds, and the program was running by itself, the sequence of numbers that would be generated would be far from random because the time between the calls to the generator would be essentially indentical on every program invocation.

What we really need is a sequence of pseudorandom numbers, a sequence with the same properties as a random sequence. Suppose we want random numbers between 0 and 999, uniformly distributed. In a uniform distribution, all numbers in the specified range are equally likely to occur. Other distributions are also widely used. We will see that most distributions can be derived from the uniform distribution, so that is the one we consider first. The following properties hold if the sequence 0...999 is a true uniform distribution:

  • The first number to appear is equally likely to be 0,1,2...999.
  • The ith number to appear is equally likely to be 0,1,2...999.
  • The average of all the generated numbers is 499.5.

Theese properties are not particularly restrictive. For instance we could generate the first number by exmining the clock that was accurate to one millisecond and then using the number of milliseconds. We can generate subsequent numbers by adding one to the previous number. Clearly, after 1000 numbers are generated, all the properties above hold. However, stronger properties do not. Some stronger properties that would hold for uniformly distributed random numbers include:

  • The sum of two consecutive numbers is equally likely to be even or odd.
  • If 1000 numbers are randomly generated, some will be duplicated. Roughly 368 numbers will never appear.

Our numbers do not satisfy theese properties. Consecutive numbers always sum to an odd number, and our sequence is duplicate-free. We say that our simple pseudorandom number generator has failed two statistical tests. All pseudorandom number generators fail some statistical tests, but the better generators fail fewer tests than the bad ones.

Here we describe the simplest uniform generator that passes a reasonable number of statistical tests. By no means is it the best generator, but it is suitable for use as a reasonable generator in applications where a good approximation to a random sequence is acceptable. The method we use is the linear congruential generator, which was first described in 1951. Numbers X1,X2, ... are generated satisfying

Xi + 1 = AXi(mod M)

This equation state that we can get the (i+1)th number by multiplying the ith number by some constant A and computing the remainder when the result is divided by M. In C/C++ we would do something like:

X[ i+1 ] = A * X[ i ] % M;

The constants A and M are specified below. Notice that all generated numbers will be smaller than M. Some value X0 must be given to start the sequence. This value is known as the seed. If X0=0, then the sequence is not random because it generates all zeroes, but if A and M are carefully chosen, then any other seed satisfying 1 =< X0 < M is equally valid. If M is a prime, then Xi is never 0. For example, if M = 11,A=7, and the seed X0=1, then the numbers generated are

7,5,3,10,4,6,9,8,1,7,5,2, ...

When we generate a number a second time, we have a repeating sequence. In our case the sequence repeats after M-1 = 10 numbers; this is known as the period of the sequence. The period obtained with this choice of A is clearly as good as possible, since all nonzero numbers smaller than M are generated. (We must have a repeated number generated on the 11th iteration).

I believe this is a fairly detailed and extensive introduction to pseudorandom number generation which can be applicable in both assembler and C with ease. If you have made your own generator, and think it's better than this one, test it against all of the above properties, and if it holds, you can consider yourself lucky. ;-)

Mattias Nilsson

Last change: 16.01.2001
For questions referring to this page mail to: Mattias Nilsson