The world of primes is one which is extremely diverse, ranging from simple ideas and their verifications, to more diverse, empirical observations, many of which are yet to be proved as valid, requiring techniques far beyond the scope of all but the most astute mathematicians.

It is important to first recall what a prime number is, as we will revisit these properties constantly throughout this read. We say a prime number is an integer p>1, such that the only factors of p are itself and 1. Obvious examples of such numbers are: 2, 3, 5, 7, 11, 13, …

How many primes are there? Infinitely many, or is there a largest prime number? It seems obvious that primes are endless, based off large amounts of empirical data collated over that past few centuries. Yet this isn’t enough, mathematically speaking. There is nothing to say that, theoretically speaking, there couldn’t be a largest prime number. What happens if we assume there to be such a prime? Does this lead to some fundamental contradiction which would render this result invalid?

Infinite Primes

Let us assume, for the sake of contradiction, that there are n primes: p_1, p_2,..., p_n, where p_1>p_2>...>p_n. Let us now consider the following integer N=p_1p_2p_3...p_n + 1. N can either be prime, or a product of other primes. N \neq p_1p_2p_3...p_n, and this means that N cannot be prime because then we would be left with n+1 primes, contradicting our previous assumption. Therefore, N must have a prime factorization (a representation of N as a product of primes). However, our previous assumption restricts the possible primes to the list we assumed to be complete at the start of our proof. Therefore, our prime factorization can only consist of p_1, p_2,..., p_n. It is clear to see that none of these divide N: they all leave a remainder of 1. Therefore, such a prime factorization cannot exist. Hence, such a number N cannot exist. Hence, there is no such finite list of primes, and in turn no largest prime number. This style is proof is seen in many parts of mathematics, such a proving the irrationality of \sqrt{2}, among other numbers. Perhaps the simplest example of this type of proof is proving that there is no greatest positive integer, since if N were the largest number, then N+1, a positive integer and greater than N, contradicts the previous statement. This is known as `Euclid’s classical proof’ and was found as early as 300BCE.

Credits: Garrett Coakley, Flickr

However, this is not the only one that we know, in fact, there are a large number of intriguing proofs that have been discovered in the past, such as Furstenberg’s topological proof, as well as proof by using Fermat numbers; we will visit the latter shortly.

General Form

Once it was clear that there were an infinite number of primes, the next task was to find a general form for all the primes, or at least a subset of the primes. I now invite the reader to try and find one for themselves. After a while it becomes clear that this is a near impossible task, although possible, but requiring techniques far beyond the scope of many. Some notable attempts have been made, and we will discuss them now. n^2-n+41 is one such formula, producing primes for n=1,2,...,40, but failing for n=41, as 41^2-41+41=41^2 \notin \mathbb{P}. Another formula is n^2-79n+1601, producing primes for all positive integers up to 80. Another, more well-known attempt is that of Fermat, who proposed the following general form for a subset of primes: 2^{2^n} + 1. This was unfortunately disproved by Euler, who found that 2^{2^5}+1 = 641 \cdot 6,700,417. Fermat numbers do contribute to another proof for the infinitude of primes, and I will explain the proof now. Let us represent a Fermat number as F_n, with the general formula F_n = 2^{2^n}+1. It will be firstly useful to prove the following:\prod_{i=1}^{n-1}F_i = F_0F_1F_2...F_{n-1} = F_n-2, as it reveals a very important property about Fermat numbers which we will then exploit for the purposes of the proof. We will use proof by induction, and I advise some readers who are not familiar with this concept to read about it, as it is an extremely powerful concept in mathematics. We will first prove this for n=1:

F_1-2 = 2^{2^1}+1-2 = F_0 = \prod_{i=0}^{1-1}F_i

This was the initial statement for n=1 in the first place. This result provides us with a basis for our proof by induction. We know follow this with the inductive step. Let us assume the statement is true for n=k, i.e.:

\prod_{i=0}^{k-1}F_i = F_k-2

We are hence required to prove that the statement is true for n=k+1, i.e.:

\prod_{i=0}^{k}F_i = F_{k+1}-2

Taking the left-hand side, we find:

\prod_{i=0}^{k}F_i = \prod_{i=0}^{k-1}F_i \cdot F_k = F_k(F_k-2) \\ F_k(F_k-2) = (2^{2^k}+1)^2 -2(2^{2^k}+1) \\ (2^{2^k})^2+2\cdot1\cdot2^{2^k}+1-2\cdot2^{2^k}-2 = 2^{2^{k+1}}+2^{2^{k+1}}+1-2^{2^{k+1}}-2 \\ 2^{2^{k+1}}+1-2 = f_{k+1}-2 \\

Thus we arrive at the right-hand side as stated before. Hence the statement is true for n=k+1, and by the principle of mathematical induction, having proved the base case and the inductive step, the statement is true for all positive integers. But what significance does this have? If the statement is true, it means that any F_i, 0 \leq i \leq n divides F_n with remainder 2 (For those familiar with modular arithmetic, F_n \equiv 2 (modulo F_i).

This means that the highest common factor of any two Fermat numbers must be a factor of 2, either 1 or 2 (I shall leave the reader to deduce why that is). But since all Fermat numbers are odd, the highest common factor of any two Fermat numbers is 1, and all Fermat numbers are said to be coprime. This leads to all Fermat numbers having prime factorisations with no prime factors in common. And since there are infinitely many Fermat numbers, there are infinitely many primes that are required to make up the prime factorisations of these infinitely many Fermat numbers. And therefore, there are an infinite number of primes.

Another significant result with primes is the fact that any arithmetic progression, with a coprime initial value and common difference, produces an infinite number of primes. While special cases were easy to prove using a Euclidean style, the general theorem could not be proved for quite some time and required advanced calculus and function theory. The proof, by Peter Gustav Lejeune Dirichlet (1859), is still seen as one of the greatest revelations in mathematics and requires a large amount of mathematical knowledge to understand the proof.

Observations and 6k+5 Proof

We shall still see the proof for the infinitude of primes of form 6k+5, which is very similar to that of the general infinitude of primes. Before we must consider two observations, which will prove useful in our final proof. Firstly, we must recognise that numbers can be of form 6k, 6k+1, 6k+2,...,6k+2, where k is an integer. Apart from 6k+1 and 6k+5, all the other forms can be factorised. For example, 6k+3 \equiv 3(2k+1). Hence, primes are either of form 6k+1 or 6k+5. Secondly, it is vital to observe that the product of any two numbers of form 6k+1 results in another integer of form 6k+1This can be observed in the following general expression for this product:

(6a+1)(6b+1) = 36ab + 6a + 6b + 1 = 6(6ab+a+b)+1

Now that we have noted these two observations, we can proceed with the proof. Let us say that, for the sake of contradiction, there are a finite number of primes of the form 6k+5, namely q_1, q_2,...,q_n. Now consider the following integer N=6(q_1q_2q_3...q_n)-1 = 6(q_1q_2q_3...q_n)+5.

If N is prime it contradicts the idea that there are only n primes of form 6k+5, since N is also of this form, and N \neq q_i, 1 \leq i \leq n. Therefore N  must be a product of primes. It is clear that no prime factors of N of form 6k+5 since we can see from the expression for N that all leave remainder -1. Therefore, the prime factorisation can only consist of primes of form 6k+1. However, this requires N to be of form 6k+1, from our second expression, which is not as seen by the expression. Hence such an integer cannot exist. And thus, there are an infinite number of primes of the form 6k+1. I invite the reader to attempt a similar proof for primes of form 4k+3.

Distribution of the Primes

We now have an idea of the expanse of prime numbers, but we must still persevere to achieve an understanding of the distribution of primes. I will first introduce the reader to possibly unknown terminology: \pi (n) = number of primes up to and including n. For example, \pi (1) = 0, \pi (12) = 5, \pi (19) = 8, etc.

Many conjectures and proofs are based on masses and masses of empirical data compiled over a few centuries. One technique used to do so is called `The Sieve of Eratosthenes’. It is used in order to compile a list of primes up to and including an integer N. The process starts by listing all positive integer less than or equal to N, and then striking out all numbers divisible by 2, 3, 5, etc. until only the prime numbers are left. This simple technique has been used extensively, for integers as large as 10^7, and can be used to give significant evidence for many conjectures.

In fact, Gauss was able to observe that prime density, the number of primes per number of positive integers, tends toward the reciprocal of the natural log of the number of integers. This can be seen in the following:

n \frac{\pi(n)}{n} \frac{1}{ln(n)}
10^3 0.1680.145
10^6 0.078498 0.072382
10^9 0.0508474780.0482549424

Gauss noticed the following expression: \frac{\pi(n)}{n} = \frac{1}{ln(n)}, which seemed to improve as n \rightarrow \infty. More strikingly, the number of primes less than or equal to n \approx \frac{n}{ln(n)}. Gauss only ever knew this to be an empirical observation; it was after he died that mathematical techniques were introduced which could prove this approximation to be true, such as the Reimann zeta function. This approximation allows us to see the manner in which primes become less dense as numbers become bigger.

Plot of the first 500 Prime numbers using Polar coordinates.
Credits: Tom Saunders, Flickr

The proofs and techniques covered here provide the basis for an introduction to primes, and much harder theory involving primes, that I have not covered, still remains. I will list them below:

  • Wilson’s Theorem
  • Fermat’s Last Theorem
  • Mill’s Formula
  • Wright’s Formula
  • Plouffe’s Formula
  • Erdos’ Proof

About the author