I’m no expert in the subject of PRNGs, but I find it quite interesting. Developers usually don’t give much thought to this and just use whatever rand()
, random()
, Math.random()
function is available in their language of choice. And that’s fine for most purposes, but there are situations where (pseudo) randomness matters, and different algorithms may have very different quality and properties. For instance, PHP users may have noticed that there are two functions to get a random number: rand()
and mt_rand()
. In this case, the latter, based on the Mersenne Twister algorithm, was introduced as a better alternative to the former (which uses the underlying system implementation).
While googling for different generators and their properties, I recently stumbled upon a website that describes a new algorithm and shows some comparisons between that novel approach and many other popular ones (including LCGs, Mersenne Twister, etc) on different quality metrics (statistical quality, prediction difficulty, time and space efficiency, and more). It’s called PCG, (Permuted Congruential Generator), and, judging by the comparisons shown there, it beats the crap out of most other generators!
What makes it so good? Well, as the site states, the PCG family combines properties not previously seen together in a single algorithm, including: small code size, low memory and CPU usage, powerful features (jump ahead, distance), low predictability, excellent performance in statistical tests, and a permissive open source license (the Apache license).
How does it achieve all that? Its power apparently comes from the combination of an LCG (Linear Congruential Generator) as its state-transition function (the function that dictates how the internal state changes every time a new number is returned) and a new technique called permutation functions on tuples as its output function (the one that turns the internal state into the actual random number returned).
LCG is one of the simplest and fastest algorithms known, and has some desirable features, but used on its own it has horrible statistical performance (among other drawbacks) . It seems that by combining it with a clever output function, though, an excellent generation scheme is achieved.
The website is full of interesting information and explanations, and you can download the full technical paper and the reference C/C++ implementations.