Random numbers
When you see the word "random" in a question title on Stack Overflow you can almost guarantee it will be the same fundamental problem as countless similar questions. This article takes a look at why randomness causes so many problems, and how to address them.
The problem
The Stack Overflow (or newsgroup, or mailing list etc) question usually goes something like this:
I'm using Random.Next
to generate random numbers, but it keeps
giving me the same number. It changes from run to run, but each time it will
generate the same number lots of times.
This is caused by code like this:
// Bad code! Do not use!
for (int i = 0; i < 100; i++)
{
Console.WriteLine(GenerateDigit());
}
...
static int GenerateDigit()
{
Random rng = new Random();
// Assume there'd be more logic here really
return rng.Next(10);
}
So, what's going wrong here?
The explanation
The Random
class is not a true random number generator. It's a pseudo-random number generator.
Any instance of Random
has a certain amount of state, and when you call Next
(or NextDouble
or NextBytes
) it will use that state to return you some
data which appears to be random, mutating its internal state accordingly so that on the next call
you will get another apparently-random number.
All of this is deterministic. If you start off an instance of Random
with the same
initial state (which can be provided via a seed) and make the same sequence of method calls on it,
you'll get the same results.
So what was wrong in our example code? We were using a new instance of Random
on
each iteration of the loop. The parameterless constructor for Random
takes the current date and time as the seed - and you can generally execute a fair amount of code
before the internal timer works out that the current date and time has changed. Therefore
we're using the same seed repeatedly - and getting the same results repeatedly.
What can we do about it?
There are lots of solutions to this problem - some of which are better than others. Let's get one of them out of the way first, as it's different to all the rest.
Use a cryptographic random number generator
.NET has a RandomNumberGenerator
class which is the abstract class from
which all cryptographic random number generators should be derived. The framework
itself ships with one such derived class: RNGCryptoServiceProvider
.
The idea of a cryptographic random number generator is that even if it may still be
a pseudo-random generator, it works very hard to be unpredictable. The built-in implementation
takes several sources of entropy which effectively represent "noise" in your computer, and would
be very hard to predict. It may use this noise not just to compute a seed, but also when generating
the next numbers, so that even if you know the current state, that may not be enough to predict the
next results (or the ones already generated). This depends on the exact implementation though.
Windows can also make use of specialized hardware sources of randomness (such as a piece of hardware which watches
a radioactive isotope for decay) to make the random number generator even more secure.
Compare this with Random
. If you see (say)
10 results of calling Random.Next(100)
and have a fair amount of computing
resource to put into the task, you may well be able to work out the initial seed and
predict what the next result would be... as well as what earlier results were, potentially.
That's a disastrous state of affairs if the random numbers are being used for security or financial purposes. Cryptographic
random number generators are generally slower than Random
, but do a much
better job of giving numbers which are hard to predict and independent.
In many cases performance of the random number generator isn't an issue - but having a
decent API is. RandomNumberGenerator
is basically designed to generate
random bytes - and that's all. Compare this with the API of Random
, which
lets you ask for a random integer, or a random double
, or a random set of
bytes. I usually find that I need an integer in a range - and getting that reliably
and uniformly from a random array of bytes is non-trivial. It's far from impossible,
but at the very least you'd probably want an adapter class on top of
RandomNumberGenerator
. In many cases, the pseudo-randomness of
Random
is acceptable - if you can avoid the pitfalls described earlier.
Let's look at how we can do that.
Use one instance of Random
repeatedly
The core of the fix for the "lots of repeated numbers" is to use the same instance
of Random
repeatedly. This sounds pretty simple... for example, we can change
our original code like this:
// Somewhat better code...
Random rng = new Random();
for (int i = 0; i < 100; i++)
{
Console.WriteLine(GenerateDigit(rng));
}
...
static int GenerateDigit(Random rng)
{
// Assume there'd be more logic here really
return rng.Next(10);
}
Now our loop will print different digits... but we're not done yet.
What happens if you call this code multiple times in quick succession? We
might still create two instances of Random
with the same seed... although each
string of numbers would contain different digits, we could easily get the same
string of digits twice.
There are two ways of avoiding this problem. One option is to use a static field to maintain
a single instance of Random
which everything uses. Alternatively, we could push the instantiation higher, of course - eventually reaching the top
of the program, which only ever instantiates a single element of Random
, and passes it
down everywhere. That's a nice idea (and it expresses the dependency nicely), but it won't quite work... at least, it could cause problems if
your code uses multiple threads.
Thread safety
Random
isn't thread-safe. It's a real pain, given how we'd ideally like to use a single
instance everywhere in our program. But no, if you use the same instance at the same time
from multiple threads, it's quite possible to end up with a state of all zeroes internally,
at which point the instance becomes useless.
The best approach in my view is to still use one instance, but also use a lock which every caller has to remember to acquire while they're using the random number generator. That can be simplified by using a wrapper which does the locking for you, but in a heavily multithreaded system you'll still potentially waste a lot of time waiting for locks.
I used to recommend an approach where each thread had its own random number generator,
using thread-local variables. Unfortunately, with System.Random
at least, this
leads to "patterns" in the generated numbers - and you really don't want to see patterns.
The simplest approach was to keep a master seed (generated based on time once) and then increment
that seed each time a new random number generator was needed. A slightly better approach used
a single master random number generator to generate seeds - but neither of these ends up being
terribly effective. I suspect that with a better random number generator, the latter approach would
be okay, but if you're using System.Random
it's better to just take the hit of locking.
See this Stack Overflow question for a
demonstration of the problems involved.
Problems with interface design
One problem remains: this still isn't very secure. As I mentioned earlier, there's a more secure version in RandomNumberGenerator
,
with the most commonly used derived class being
RNGCryptoServiceProvider
.
However, the API to that really is hard to use in common cases.
It would be very pleasant indeed if the framework providers had separated the "source of randomness" concept from the
"I want to get a random value in a simple way" concept. Then we could have used a simple API backed by either a secure or insecure
source of randomness, depending on our requirements. Unfortunately, 'twas not to be. Maybe in a future iteration... or maybe a third party will come up with
an adapter instead. (It's beyond me, unfortunately... doing this kind of thing properly is remarkably difficult.) You can nearly get away with
deriving from Random
and overriding Sample
and NextBytes
... but it's not clear exactly how they need to work, and
even Sample
can be tricky. Maybe next time...