Random numbers
When you see the word "random" in a question title on Stack Overflow
you can almost guarantee it will be the same fundamental problem as countless similar
questions. This article takes a look at why randomness causes so many problems, and how
to address them.
The problem
The Stack Overflow (or newsgroup, or mailing list etc) question usually goes something like this:
I'm using Random.Next to generate random numbers, but it keeps
giving me the same number. It changes from run to run, but each time it will
generate the same number lots of times.
This is caused by code like this:
for (int i = 0; i < 100; i++)
{
Console.WriteLine(GenerateDigit());
}
...
static int GenerateDigit()
{
Random rng = new Random();
return rng.Next(10);
}
So, what's going wrong here?
The explanation
The Random class is not a true random number generator. It's a pseudo-random number generator.
Any instance of Random has a certain amount of state, and when you call Next
(or NextDouble or NextBytes) it will use that state to return you some
data which appears to be random, mutating its internal state accordingly so that on the next call
you will get another apparently-random number.
All of this is deterministic. If you start off an instance of Random with the same
initial state (which can be provided via a seed) and make the same sequence of method calls on it,
you'll get the same results.
So what was wrong in our example code? We were using a new instance of Random on
each iteration of the loop. The parameterless constructor for Random
takes the current date and time as the seed - and you can generally execute a fair amount of code
before the internal timer works out that the current date and time has changed. Therefore
we're using the same seed repeatedly - and getting the same results repeatedly.
What can we do about it?
There are lots of solutions to this problem - some of which are better than others.
Let's get one of them out of the way first, as it's different to all the rest.
Use a cryptographic random number generator
.NET has a RandomNumberGenerator class which is the abstract class from
which all cryptographic random number generators should be derived. The framework
itself ships with one such derived class: RNGCryptoServiceProvider.
The idea of a cryptographic random number generator is that even if it may still be
a pseudo-random generator, it works very hard to be unpredictable. The built-in implementation
takes several sources of entropy which effectively represent "noise" in your computer, and would
be very hard to predict. It may use this noise not just to compute a seed, but also when generating
the next numbers, so that even if you know the current state, that may not be enough to predict the
next results (or the ones already generated). This depends on the exact implementation though.
Windows can also make use of specialized hardware sources of randomness (such as a piece of hardware which watches
a radioactive isotope for decay) to make the random number generator even more secure.
Compare this with Random. If you see (say)
10 results of calling Random.Next(100) and have a fair amount of computing
resource to put into the task, you may well be able to work out the initial seed and
predict what the next result would be... as well as what earlier results were, potentially.
That's a disastrous state of affairs if the random numbers are being used for security or financial purposes. Cryptographic
random number generators are generally slower than Random, but do a much
better job of giving numbers which are hard to predict and independent.
In many cases performance of the random number generator isn't an issue - but having a
decent API is. RandomNumberGenerator is basically designed to generate
random bytes - and that's all. Compare this with the API of Random, which
lets you ask for a random integer, or a random double, or a random set of
bytes. I usually find that I need an integer in a range - and getting that reliably
and uniformly from a random array of bytes is non-trivial. It's far from impossible,
but at the very least you'd probably want an adapter class on top of
RandomNumberGenerator. In many cases, the pseudo-randomness of
Random is acceptable - if you can avoid the pitfalls described earlier.
Let's look at how we can do that.
Use one instance of Random repeatedly
The core of the fix for the "lots of repeated numbers" is to use the same instance
of Random repeatedly. This sounds pretty simple... for example, we can change
our original code like this:
Random rng = new Random();
for (int i = 0; i < 100; i++)
{
Console.WriteLine(GenerateDigit(rng));
}
...
static int GenerateDigit(Random rng)
{
return rng.Next(10);
}
Now our loop will print different digits... but we're not done yet.
What happens if you call this code multiple times in quick succession? We
might still create two instances of Random with the same seed... although each
string of numbers would contain different digits, we could easily get the same
string of digits twice.
There are two ways of avoiding this problem. One option is to use a static field to maintain
a single instance of Random which everything uses. Alternatively, we could push the instantiation higher, of course - eventually reaching the top
of the program, which only ever instantiates a single element of Random, and passes it
down everywhere. That's a nice idea (and it expresses the dependency nicely), but it won't quite work... at least, it could cause problems if
your code uses multiple threads.
Thread safety
Random isn't thread-safe. It's a real pain, given how we'd ideally like to use a single
instance everywhere in our program. But no, if you use the same instance at the same time
from multiple threads, it's quite possible to end up with a state of all zeroes internally,
at which point the instance becomes useless.
Again, there are two approaches here. One is to still use one instance, but also
use a lock which every caller has to remember to acquire while they're using the random
number generator. That can be simplified by using a wrapper which does the locking for you,
but in a heavily multithreaded system you'll still potentially waste a lot of time waiting
for locks.
The other approach - and the one we'll pick up here - is to have one instance per thread.
We need to make sure that when we create instances we don't reuse the same seed
(so we can't just call the parameterless constructor, for example), but other
than that it's relatively straightforward.
A safe provider
Fortunately, the new ThreadLocal<T>
class in .NET 4 makes it very easy to write providers which need to have a single instance per thread. You simply give the
ThreadLocal<T> constructor a delegate to call to obtain the initial value, and you're away.
In my case I've chosen to use a single seed variable, initialized using Environment.TickCount (just like the
parameterless Random constructor) which is then incremented every time we need a new random number generator -
which is once per thread.
The whole class is static, with just a single public method: GetThreadRandom. It's a method
rather than a property mostly for the sake of convenience: rather than make any classes which need random numbers
depend on Random itself, they'll depend on Func<Random>. If the type is only designed to
operate in a single thread, it can call the delegate to obtain a single instance of Random and reuse it;
if it could be used from multiple threads it can call the delegate each time it needs a random number generator. This will only
create as many instances as there are threads, and each will start with a different seed. When passing in the dependency,
we can then use a method conversion: new TypeThatNeedsRandom(RandomProvider.GetThreadRandom). Here's the code:
using System;
using System.Threading;
public static class RandomProvider
{
private static int seed = Environment.TickCount;
private static ThreadLocal<Random> randomWrapper = new ThreadLocal<Random>(() =>
new Random(Interlocked.Increment(ref seed))
);
public static Random GetThreadRandom()
{
return randomWrapper.Value;
}
}
Simple, isn't it? That's because all it's concerned with is providing the right instance of Random. It doesn't care
what methods you call on the instance after it's been fetched. Code can still abuse this class of course, by stashing one Random reference
and reusing it from multiple threads, but it's also easy to do the right thing.
Problems with interface design
One problem remains: this still isn't very secure. As I mentioned earlier, there's a more secure version in RandomNumberGenerator,
with the most commonly used derived class being
RNGCryptoServiceProvider.
However, the API to that really is hard to use in common cases.
It would be very pleasant indeed if the framework providers had separated the "source of randomness" concept from the
"I want to get a random value in a simple way" concept. Then we could have used a simple API backed by either a secure or insecure
source of randomness, depending on our requirements. Unfortunately, 'twas not to be. Maybe in a future iteration... or maybe a third party will come up with
an adapter instead. (It's beyond me, unfortunately... doing this kind of thing properly is remarkably difficult.) You can nearly get away with
deriving from Random and overriding Sample and NextBytes... but it's not clear exactly how they need to work, and
even Sample can be tricky. Maybe next time...