This post is related to the ongoing Polymath 4 project [Blog, Wiki] which looks at the problem of deterministically generating primes of a given size. Many interesting approaches have been discussed. One that I found particularly interesting was Albert Asterias‘s derandomization approach based on Nisan-Wigderson Pseudorandom Generators. I will not explain Asterias’s idea here, but it should be clear how it inspired this post.
As the title suggests, this post is not about the problem of designing pseudorandom generators for the purpose of finding primes. This post is about the completely opposite question:
How hard is it for a pseudorandom generator to avoid producing primes?
I will show that if a pseudorandom generator fails to generate primes in reasonable quantity then this must be partly explained by some specific types of irregularities in the distribution of primes. It is not at all surprising that such irregularities prevent some pseudorandom generators from generating primes. However, it is not immediately clear that all failures lead to such irregularities. A heuristic analysis of these irregularities suggests that it is quite difficult for a reasonable pseudorandom generator to avoid producing primes, but I also argue that these irregularities cannot explain all ways that a pseudorandom generator can fail to produce primes.
The irregularities in question are of two specific types. Let be the binary representation of some -bit number For I define and as follows:
- is the (possibly negative) excess of -bit primes whose lower order bits are versus those -bit primes whose lower order bits are
- is the (possibly negative) excess of -bit primes whose higher order bits are versus those -bit primes whose higher order bits are
Thus measures the difference in the number of -bit primes in two adjacent intervals of length and dually measures the difference in the number of -bit primes in two “adjacent” arithmetic progressions with step Since this is unusual in certain contexts, let me warn you that the sign of the difference is important.
Irregularity Lemma. Let be a random variable that takes values in the set of -bit integers. Suppose that is rather bad at generating primes, more precisely suppose that
for some where is uniformly distributed in Then
- must hold for some
- must hold for some
Before I turn to the proof of the Irregularity Lemma, I’ll discuss what the conclusion of the lemma says.
Understanding the Irregularities
The condition is perhaps most striking. The arithmetic progression contains only one prime, whereas the arithmetic progression contains primes. Therefore,
So only when is even more often than it is odd, which is clearly a bad idea for generating primes. (Though the special case when is instructive.)
The condition is very different. When is even, then and so the even contribution to the expectation is negligible. When is odd, then is the excess of the number of -bit primes that are versus the number of -bit primes that are or vice versa. For large enough we expect to have roughly the same number of primes in these two congruence classes. Indeed, the Siegel-Walfisz Theorem guarantees that which means that this type of irregularity cannot possibly occur for large enough In fact, assuming the Generalized Riemann Hypothesis, we have for and so the condition cannot hold for
When is close enough to the expectation can be large enough. However, this situation can still have a very unusual feel to it. The extreme case is enlightening. Let be the random variable that differs from only in the most significant bit (i.e., ). Then
So a large means that while is not often prime, the closely related random variable is prime more often. In general, for large means that while may not be prime, it is close to many primes in the dyadic metric.
The situation with is similar, but has a slightly different feel. Again, when is close to the inequality is unlikely since we expect two intervals of length to contain roughly the same number of primes. Indeed, assuming the Riemann Hypothesis, we have the bound which means that cannot hold for
When is small enough, the expectation can be large enough. Again, this situation can have a very unusual feel. For example, a large value of means that is often prime. In general, for small means that while may not be prime, it is close to many primes (in the usual metric).
When neither the inequality nor the inequality is very surprising. However, it is unusual for both inequalities to hold simultaneously. Indeed, arithmetic progressions with step and intervals of length are somewhat orthogonal to each other so it would seem that the two types of irregularities would tend to interfere and cancel each other out. Such a should take a somewhat sparse set of values. This is not very restrictive, so this is likely where most irregularities can be found.
Seeing that many of these irregularities are either impossible, unlikely, or very visible, it is tempting to conclude that any reasonable looking pseudorandom generator will inevitably generate a decent amount of primes. This is overly optimistic as the irregularities are relatively blind to some more global properties that can prevent from generating primes. For example, if is such that
(In other words, has equally many bits in even position as it has bits in odd positions.) Then is always a multiple of However, it is not at all clear how to explain this global property in terms of irregularities as described above.
Proof of the Irregularity Lemma
I will only prove the first conclusion that
for some The proof of the second conclusion is symmetric.
where are (usually not independent nor uniform) binary random variables. Similarly, let’s write
where are independent uniform binary random variables.
Consider the hybrid random variables
for (so and ). Since
there must be an such that
Note that equals
where is the random variable
(So differs from only in the -th bit.) Since is independent of we have
and, from we obtain
On the other hand,