Lagrange’s Four Square Theorem (Lagrange, 1770) is the well-known result that every positive integer can be written as the sum of four integer squares. This was strengthened by Legendre’s 1797-1798 proof of the similar-sounding Three Squares Theorem, which proved that an integer is a sum of of three squares if and only if it is not of the form
Since either or is a sum of three squares (by Legendre), we see at once that by adding in a fourth square (0 or 1, respectively) we may recover Lagrange’s result. (Of course, the proof of Legendre’s theorem is far more involved.)
For sums of two squares the chronology is a bit murkier. Fermat (being Fermat) claimed a proof of the classification of these numbers in 1640, but there’s no evidence that Fermat ever had a proof to back up his claims. Thus the following classification is credited to Euler, who worked on the problem between 1747 and 1749.
Theorem (Euler): A positive integer is expressible as a sum of two squares if and only the square-free part of has no prime factor congruent to 3 mod 4.
Lagrange’s Four Square Theorem obviously implies that any integer can be written as a sum of five squares, or six squares, etc. As for the classification of integers expressible as a sum of one square – consider it an exercise for the reader.
— DENSITY —
In this post I’d like to describe these theorems in a different light, through the language of density. Unfortunately, since the integers have no uniform measure, we’ll have to be a bit careful with what we mean by this.
To state our questions clearly we introduce some notation from additive number theory. If and are sets, let denote the sumset of and ; that is, the set of sums
Let be the set of squares. Then Euler’s, Legendre’s, and Lagrange’s theorems describe the sets , , and (resp.), which we denote , , and (resp.) for sanity’s sake. Since 0 is square, we see that
which implies that the densities of these sets (if they exist) should be non-decreasing. Moreover, if denotes the familiar asymptotic upper density of a set of integers ,
(when the limit exists), then clearly and .
The classification of sums of three squares by Legendre makes calculation of straightforward as well. Upon consideration of the 2-factor in , we see that the sets
are disjoint, and can therefore compute the density of to be
This bounds the density in the interval , but it’s not hard to improve upon this upper bound. For example, the observation that
serves to bound .
To compute the actual value of we introduce for convenience the Dirichlet density of a set , defined by
This density can be thought of as a regularized upper density, in the sense that is less sensitive to changes in the set and allows us to understand when fails to converge. To be specific, exists whenever does, and when the latter exists the two are equal.
While not a viewpoint that we’ll need here, we point out that both and can be regarded as limits of distributions on the positive integers. The upper density is a limit of uniform distributions on , while the Dirichlet density is a limit (as ) of distributions in which the integer is chosen with probability proportional to .
The Dirichlet density is generally preferable to the upper density when the set you are studying has underlying multiplicative structure. In light of Euler’s theorem on the sums of two squares, this is certainly the case for .
— COMPUTING THE DENSITY OF —
Here we see how well the Dirichlet density suits this problem, as the Riemann zeta functions cancel out in the definition of and leave us with
To motivate our next step, it makes sense to pause and develop heuristics for what we expect the density of to be. Since each integer can be written in the form with square-free, the heuristics that
- Large integers have large square-free part.
- Primes are equidistributed among the residues 1 and 3 mod 4.
- Primes in the square-free part of congruent to 1 and 3 mod 4 occur in equal proportion (as suggested by heuristic (2)).
Together, these heuristics suggest that an `average’ large integer will have a large number of factors in its square-free part congruent to 3 mod 4. Consequently, they suggest that large integers lie in with probability 0.
We therefore claim that , for which it suffices to show that the sum of logarithms in line (1) converges to as . Since
we need only prove that the harmonic series of primes congruent to 3 mod 4 diverges. Yet this is a special case of the strong form of Dirichlet’s Theorem (below), which proves our claim that once and for all:
Theorem (Dirichlet, 1837): Fix positive, coprime integers . Then the harmonic series of prime numbers congruent to diverges. (In particular, for infinitely many primes .)
Dirichlet’s Theorem predates the Prime Number Theorem (PNT) by 59 years and is often regarded as the genesis of modern analytic number theory. And, unlike the PNT, Dirichlet’s Theorem avoids many of the technical details that obscure the role of L-functions in its proof.
To keep this post self-contained and finite I will not include a full proof of Dirichlet’s Theorem. Rather, I will illustrate Dirichlet’s Theorem in the simple case , which avoids some of the (worthwhile!) details of the original proof and reads like a generalization of Euler’s proof that the harmonic series of primes diverges. (The case we really want, , can be proven with a tweak to our method and is found in the Exercises.)
— DIRICHLET’S THEOREM IN A SIMPLE CASE —
Theorem: The harmonic sum of primes congruent to 1 mod 4 diverges. In particular, there are infinitely many primes congruent to 1 mod 4.
Proof: We introduce the Dedekind zeta function of a number field , defined by the series
as in Euler’s factorization of the Riemann zeta function. The classification of prime ideals in (and their norms) can be recognized in terms of congruence conditions as a consequence of Euler’s description of primes in . To summarize this classification, we have that each prime divides a unique prime ideal , and
- if , then factors into a product of two Gaussian primes of norm ;
- if , then remains prime in and has norm ;
- the prime 2 factors as .
It follows that
Since is a principal ideal domain (PID), there exists a correspondence between Gaussian integers and ideals in that overcounts with multiplicity four, the number of units in . Thus the number of ideals in of norm grows like , which proves that diverges as .
On the other hand, the 2-factor and the second infinite product in (2) converge as . Thus
which implies that the harmonic series of primes congruent to diverges, by taking logarithms.
— EXERCISES —
Exercise 1: What is the average number of ways an integer can be written as the sum of two squares? What about three squares or four squares?
Exercise 2: Let denote the non-trivial character of modulus 4, ie. the character for which . Prove that , where is the Dirichlet L-function attached to . Prove that . (This is more in the flavor of Dirichlet’s original proof.)
Exercise 3: Our proof that diverges as uses that is a PID. Give an analytic continuation of to and apply Landau’s theorem (see section 5) to conclude that must have a pole at . Hint: bound the partial sums of the coefficients of .
Exercise 4: Let and show that the Euler product expression
holds in the half-plane . Show that this implies that the last product above diverges as , and use this to prove that the harmonic sum of primes congruent to 3 mod 4 diverges.