Episodes 299 and 301 of the SecurityNow podcast deal with the problem of how to get randomness out of a computer. As usual, Steve Gibson does a good job of explaining things, but I felt that there was some more that needed to be said about computers and randomness, as well as the related ideas of predictability, observability, repeatability, and determinism. I have worked and wrangled with these concepts for almost 15 years now, from my research into timing prediction for embedded processors to my current work with the repeatable and reversible Simics simulator.
Let’s start from the top.
When Steve said that computers are deterministic, I jumped. To me, a computer is anything but deterministic. The idea that rerunning a program does the same thing is an ideal state that you can rarely reach, and having an infrastructure like Simics that helps you achieve this is huge win for debugging.
Listening closely, what I think Steve really said is that an algorithm like a random number generator is deterministic. If you know its initial state, it will always compute the same result. That is indeed true for code that just converts an input into an output, and does no communication and is not dependent on time or timing. My experience in random and nondeterministic behavior comes from programs that feature multiple threads and often multiple processes, and plenty of asynchronous activity going on. So, same word, different contexts.
However, Steve also several times talk about computers as being deterministic predictable machines. I think that characterizing today’s computers as being deterministic is untrue. I would rather say that with multiple cores and multiple chips and timing variations all over the place, a computer has become fundamentally nondeterministic and non-repeatable, since there are so many little things going on where a nanosecond difference in time can cause behavior to diverge incredibly quickly. There is a nice paper from 2003 about the divergent behavior from minor differences, “Variability in Architectural Simulations of Multi-threaded Workloads“, by Alaa R. Alameldeen and David A. Wood.
The HAVEGE program I wrote about a while back is essentially an attempt to harness the fundamental unpredictability of modern hardware timing. Nice idea, which at least in theory fulfills the more important property for security of being unobservable. Security doesn’t really need “real” randomness, all you need is something that an attacker cannot predict or observe. The classic Netscape SSL lack-of-randomness in the random seed issue from 1996 is the best illustration of this. Certain things about a target can be inferred or observed, but the low-level hardware timing is not one of them, at least not for an x86 or high-end ARM class machine.
The solution that Steve prefers are the Yarrow and Fortuna algorithms that collect randomness from the environment of a computer and uses that as a seed to a normal random number generator, creating lots of useful random data from a fairly small seed. This is the same idea as HAVEGE, but with a different entropy source. In both cases the basic idea seems sound and reasonable, but I kind of hoped that Steve would know of some way to evaluate the quality of the entropy pool generated from hardware events.
Steve mentioned the NIST randomness test that was used to test HAVEGE. It is certainly an aggressive test, but as my testing showed, it only demonstrates that a random number generator is random in the data produced. It does not show that it is unpredictable, and it does not measure the benefit gained from using unobservable local events in hardware as the source of entropy. You need something else, like comparing repeated collections of randomness over time from the same system, to build confidence in unobservable and unpredictable randomness.
With a computer, you do have such a thing as repeatable, deterministic, and thus predictable randomness. In a modern desktop or server computer, you also have tons of totally unpredictable non-repeatable non-usefully-observable randomness in the low-level hardware timing and concurrent behavior of independent hardware units. Too bad it seems hard to prove this by measurement.
For yet more randomness discussion, especially randomness in embedded systems, I recommend the blog post by Joachim Strömbergsson. (it is in Swedish).