C# in Depth

Cover of C# in Depth
Order now (3rd edition)

Random numbers

When you see the word "random" in a question title on Stack Overflow you can almost guarantee it will be the same fundamental problem as countless similar questions. This article takes a look at why randomness causes so many problems, and how to address them.

The problem

The Stack Overflow (or newsgroup, or mailing list etc) question usually goes something like this:

I'm using Random.Next to generate random numbers, but it keeps giving me the same number. It changes from run to run, but each time it will generate the same number lots of times.

This is caused by code like this:

// Bad code! Do not use!
for (int i = 0; i < 100; i++)
{
    Console.WriteLine(GenerateDigit());
}
...
static int GenerateDigit()
{
    Random rng = new Random();
    // Assume there'd be more logic here really
    return rng.Next(10);
}

So, what's going wrong here?

The explanation

The Random class is not a true random number generator. It's a pseudo-random number generator. Any instance of Random has a certain amount of state, and when you call Next (or NextDouble or NextBytes) it will use that state to return you some data which appears to be random, mutating its internal state accordingly so that on the next call you will get another apparently-random number.

All of this is deterministic. If you start off an instance of Random with the same initial state (which can be provided via a seed) and make the same sequence of method calls on it, you'll get the same results.

So what was wrong in our example code? We were using a new instance of Random on each iteration of the loop. The parameterless constructor for Random takes the current date and time as the seed - and you can generally execute a fair amount of code before the internal timer works out that the current date and time has changed. Therefore we're using the same seed repeatedly - and getting the same results repeatedly.

What can we do about it?

There are lots of solutions to this problem - some of which are better than others. Let's get one of them out of the way first, as it's different to all the rest.

Use a cryptographic random number generator

.NET has a RandomNumberGenerator class which is the abstract class from which all cryptographic random number generators should be derived. The framework itself ships with one such derived class: RNGCryptoServiceProvider. The idea of a cryptographic random number generator is that even if it may still be a pseudo-random generator, it works very hard to be unpredictable. The built-in implementation takes several sources of entropy which effectively represent "noise" in your computer, and would be very hard to predict. It may use this noise not just to compute a seed, but also when generating the next numbers, so that even if you know the current state, that may not be enough to predict the next results (or the ones already generated). This depends on the exact implementation though. Windows can also make use of specialized hardware sources of randomness (such as a piece of hardware which watches a radioactive isotope for decay) to make the random number generator even more secure.

Compare this with Random. If you see (say) 10 results of calling Random.Next(100) and have a fair amount of computing resource to put into the task, you may well be able to work out the initial seed and predict what the next result would be... as well as what earlier results were, potentially. That's a disastrous state of affairs if the random numbers are being used for security or financial purposes. Cryptographic random number generators are generally slower than Random, but do a much better job of giving numbers which are hard to predict and independent.

In many cases performance of the random number generator isn't an issue - but having a decent API is. RandomNumberGenerator is basically designed to generate random bytes - and that's all. Compare this with the API of Random, which lets you ask for a random integer, or a random double, or a random set of bytes. I usually find that I need an integer in a range - and getting that reliably and uniformly from a random array of bytes is non-trivial. It's far from impossible, but at the very least you'd probably want an adapter class on top of RandomNumberGenerator. In many cases, the pseudo-randomness of Random is acceptable - if you can avoid the pitfalls described earlier. Let's look at how we can do that.

Use one instance of Random repeatedly

The core of the fix for the "lots of repeated numbers" is to use the same instance of Random repeatedly. This sounds pretty simple... for example, we can change our original code like this:

// Somewhat better code...
Random rng = new Random();
for (int i = 0; i < 100; i++)
{
    Console.WriteLine(GenerateDigit(rng));
}
...
static int GenerateDigit(Random rng)
{
    // Assume there'd be more logic here really
    return rng.Next(10);
}

Now our loop will print different digits... but we're not done yet. What happens if you call this code multiple times in quick succession? We might still create two instances of Random with the same seed... although each string of numbers would contain different digits, we could easily get the same string of digits twice.

There are two ways of avoiding this problem. One option is to use a static field to maintain a single instance of Random which everything uses. Alternatively, we could push the instantiation higher, of course - eventually reaching the top of the program, which only ever instantiates a single element of Random, and passes it down everywhere. That's a nice idea (and it expresses the dependency nicely), but it won't quite work... at least, it could cause problems if your code uses multiple threads.

Thread safety

Random isn't thread-safe. It's a real pain, given how we'd ideally like to use a single instance everywhere in our program. But no, if you use the same instance at the same time from multiple threads, it's quite possible to end up with a state of all zeroes internally, at which point the instance becomes useless.

Again, there are two approaches here. One is to still use one instance, but also use a lock which every caller has to remember to acquire while they're using the random number generator. That can be simplified by using a wrapper which does the locking for you, but in a heavily multithreaded system you'll still potentially waste a lot of time waiting for locks.

The other approach - and the one we'll pick up here - is to have one instance per thread. We need to make sure that when we create instances we don't reuse the same seed (so we can't just call the parameterless constructor, for example), but other than that it's relatively straightforward.

A safe provider

Fortunately, the new ThreadLocal<T> class in .NET 4 makes it very easy to write providers which need to have a single instance per thread. You simply give the ThreadLocal<T> constructor a delegate to call to obtain the initial value, and you're away. In my case I've chosen to use a single seed variable, initialized using Environment.TickCount (just like the parameterless Random constructor) which is then incremented every time we need a new random number generator - which is once per thread.

The whole class is static, with just a single public method: GetThreadRandom. It's a method rather than a property mostly for the sake of convenience: rather than make any classes which need random numbers depend on Random itself, they'll depend on Func<Random>. If the type is only designed to operate in a single thread, it can call the delegate to obtain a single instance of Random and reuse it; if it could be used from multiple threads it can call the delegate each time it needs a random number generator. This will only create as many instances as there are threads, and each will start with a different seed. When passing in the dependency, we can then use a method conversion: new TypeThatNeedsRandom(RandomProvider.GetThreadRandom). Here's the code:

using System;
using System.Threading;
 
public static class RandomProvider
{    
    private static int seed = Environment.TickCount;
    
    private static ThreadLocal<Random> randomWrapper = new ThreadLocal<Random>(() =>
        new Random(Interlocked.Increment(ref seed))
    );

    public static Random GetThreadRandom()
    {
        return randomWrapper.Value;
    }
}

Simple, isn't it? That's because all it's concerned with is providing the right instance of Random. It doesn't care what methods you call on the instance after it's been fetched. Code can still abuse this class of course, by stashing one Random reference and reusing it from multiple threads, but it's also easy to do the right thing.

Problems with interface design

One problem remains: this still isn't very secure. As I mentioned earlier, there's a more secure version in RandomNumberGenerator, with the most commonly used derived class being RNGCryptoServiceProvider. However, the API to that really is hard to use in common cases.

It would be very pleasant indeed if the framework providers had separated the "source of randomness" concept from the "I want to get a random value in a simple way" concept. Then we could have used a simple API backed by either a secure or insecure source of randomness, depending on our requirements. Unfortunately, 'twas not to be. Maybe in a future iteration... or maybe a third party will come up with an adapter instead. (It's beyond me, unfortunately... doing this kind of thing properly is remarkably difficult.) You can nearly get away with deriving from Random and overriding Sample and NextBytes... but it's not clear exactly how they need to work, and even Sample can be tricky. Maybe next time...