Sums of Squares and Density

nested_radii_2Lagrange’s Four Square Theorem (Lagrange, 1770) is the well-known result that every positive integer can be written as the sum of four integer squares. This was strengthened by Legendre’s 1797-1798 proof of the similar-sounding Three Squares Theorem, which proved that an integer n is a sum of of three squares if and only if it is not of the form

n = 4^k(8m+7).

Since either n or n-1 is a sum of three squares (by Legendre), we see at once that by adding in a fourth square (0 or 1, respectively) we may recover Lagrange’s result. (Of course, the proof of Legendre’s theorem is far more involved.)

For sums of two squares the chronology is a bit murkier. Fermat (being Fermat) claimed a proof of the classification of these numbers in 1640, but there’s no evidence that Fermat ever had a proof to back up his claims. Thus the following classification is credited to Euler, who worked on the problem between 1747 and 1749.

Theorem (Euler): A positive integer m is expressible as a sum of two squares if and only the square-free part of m has no prime factor congruent to 3 mod 4.

Lagrange’s Four Square Theorem obviously implies that any integer can be written as a sum of five squares, or six squares, etc. As for the classification of integers expressible as a sum of one square – consider it an exercise for the reader.


In this post I’d like to describe these theorems in a different light, through the language of density. Unfortunately, since the integers have no uniform measure, we’ll have to be a bit careful with what we mean by this.

To state our questions clearly we introduce some notation from additive number theory. If A and B are sets, let A \oplus B denote the sumset of A and B; that is, the set of sums

A \oplus B := \{a+b \mid a \in A, b \in B\}.

Let S be the set of squares. Then Euler’s, Legendre’s, and Lagrange’s theorems describe the sets S\oplus S, S \oplus S \oplus S, and S \oplus S \oplus S \oplus S (resp.), which we denote S_2, S_3, and S_4 (resp.) for sanity’s sake. Since 0 is square, we see that

S \subset S_2\subset S_3 \subset S_4 = \mathbb{Z}_{\geq 0},

which implies that the densities of these sets (if they exist) should be non-decreasing. Moreover, if \sigma(A) denotes the familiar asymptotic upper density of a set of integers A,

\displaystyle\sigma(A):= \lim_{n \to \infty} \frac{\#(A \cap [1,n])}{n},

(when the limit exists), then clearly \sigma(S)=0 and \sigma(S_4)=1.

The classification of sums of three squares by Legendre makes calculation of \sigma(S_3) straightforward as well. Upon consideration of the 2-factor in n=4^k(8m+7), we see that the sets

T_k = \{4^k(8m+7) \mid m \geq 0\}

are disjoint, and can therefore compute the density of S_3 to be

\displaystyle\sigma(S_3) = 1-\sum_{k \geq 0} \sigma(T_k)=1-\sum_{k\geq 0} \frac{1}{8 \cdot 4^k} = 1-\frac{1/8}{1-1/4}=\frac{5}{6}.

This bounds the density \sigma(S_2) in the interval [0,5/6], but it’s not hard to improve upon this upper bound. For example, the observation that

S_2 \subset \{ m \in \mathbb{Z}_{\geq 0} \mid m \equiv 0,1,2,4,5\!\!\mod 8\}

serves to bound \sigma(S_2) \leq 5/8.

To compute the actual value of \sigma(S_2) we introduce for convenience the Dirichlet density of a set A, defined by

\displaystyle D(A) = \lim_{s \to 1^+} \frac{1}{\zeta(s)} \sum_{a \in A} \frac{1}{a^s}.

This density can be thought of as a regularized upper density, in the sense that D(A) is less sensitive to changes in the set A and allows us to understand A when \sigma(A) fails to converge. To be specific, D(A) exists whenever \sigma(A) does, and when the latter exists the two are equal.

While not a viewpoint that we’ll need here, we point out that both \sigma and D can be regarded as limits of distributions on the positive integers. The upper density \sigma is a limit of uniform distributions on [1,N], while the Dirichlet density D is a limit (as s \to 1^+) of distributions in which the integer x is chosen with probability proportional to x^{-s}.

The Dirichlet density is generally preferable to the upper density when the set you are studying has underlying multiplicative structure. In light of Euler’s theorem on the sums of two squares, this is certainly the case for S_2.


Let \chi_2 denote the characteristic function of S_2. From Euler’s theorem we recognize \chi_2 as a multiplicative (although not completely multiplicative!) function and therefore have

\displaystyle\sum_{a \in S_2} \frac{1}{a^s} = \sum_{n=1}^\infty \frac{\chi_2(n)}{n^s} = \prod_p \sum_{k \geq 0} \frac{\chi_2(p^k)}{p^{ks}} = \prod_{p \equiv1,2(4)} \!\!\bigg(\sum_{k \geq 0} \frac{1}{p^{ks}}\bigg)\prod_{p \equiv 3(4)}\!\! \bigg( \sum_{k \geq 0} \frac{1}{p^{2ks}}\bigg)

=\displaystyle \prod_{p\equiv 1,2(4)} \frac{1}{1-p^{-s}} \prod_{p \equiv 3(4)} \frac{1}{(1-p^{-s})(1+p^{-s})} = \zeta(s) \!\!\prod_{p \equiv 3(4)} \frac{1}{1+p^{-s}}.

Here we see how well the Dirichlet density suits this problem, as the Riemann zeta functions cancel out in the definition of D(S_2) and leave us with

\displaystyle D(S_2) = \lim_{s \to 1^+} \prod_{p \equiv 3(4)} \frac{1}{1+p^{-s}} = \lim_{s\to 1^+} \exp \sum_{p \equiv 3(4)} \log\left(\frac{1}{1+p^{-s}}\right).       (1)

To motivate our next step, it makes sense to pause and develop heuristics for what we expect the density of S_2 to be. Since each integer N can be written in the form N= n_1 n_2^2 with n_1 square-free, the heuristics that

  1. Large integers N have large square-free part.
  2. Primes are equidistributed among the residues 1 and 3 mod 4.
  3. Primes in the square-free part of N congruent to 1 and 3 mod 4 occur in equal proportion (as suggested by heuristic (2)).

Together, these heuristics suggest that an `average’ large integer N will have a large number of factors in its square-free part congruent to 3 mod 4. Consequently, they suggest that large integers lie in S_2 with probability 0.

We therefore claim that D(S_2) =0, for which it suffices to show that the sum of logarithms in line (1) converges to -\infty as s \to 1^+. Since

\displaystyle\sum_{p \equiv 3(4)} \log\left(\frac{1}{1+p^{-s}}\right) = -\sum_{p \equiv 3(4)} \frac{1}{p^s}+O\left(\sum_p \frac{1}{p^{2}}\right),

we need only prove that the harmonic series of primes congruent to 3 mod 4 diverges. Yet this is a special case of the strong form of Dirichlet’s Theorem (below), which proves our claim that D(S_2)=0 once and for all:

Theorem (Dirichlet, 1837): Fix positive, coprime integers a, m. Then the harmonic series of prime numbers congruent to a \!\! \mod m diverges. (In particular, p \equiv a \!\! \mod m for infinitely many primes p.)

Dirichlet’s Theorem predates the Prime Number Theorem (PNT) by 59 years and is often regarded as the genesis of modern analytic number theory. And, unlike the PNT, Dirichlet’s Theorem avoids many of the technical details that obscure the role of L-functions in its proof.

To keep this post self-contained and finite I will not include a full proof of Dirichlet’s Theorem. Rather, I will illustrate Dirichlet’s Theorem in the simple case p\equiv 1 \!\! \mod 4, which avoids some of the (worthwhile!) details of the original proof and reads like a generalization of Euler’s proof that the harmonic series of primes diverges. (The case we really want, p \equiv 3 \!\!\mod 4, can be proven with a tweak to our method and is found in the Exercises.)


Theorem: The harmonic sum of primes congruent to 1 mod 4 diverges. In particular, there are infinitely many primes congruent to 1 mod 4.

Proof: We introduce the Dedekind zeta function \zeta_k(s) of a number field \mathcal{O}_k, defined by the series

\displaystyle\zeta_k(s) = \sum_{\mathfrak{a} \subset \mathcal{O}_k} \frac{1}{N(\mathfrak{a})^s},

where the sum exhausts the non-zero ideals \mathfrak{a} of \mathcal{O}_k and N(\mathfrak{a}) denotes the norm of \mathfrak{a}; that is, the index [\mathcal{O}_k : \mathfrak{a}]. In the case k = \mathbb{Q}[i] we have \mathcal{O}_k = \mathbb{Z}[i], the Gaussian integers, and unique factorization of ideals gives

\displaystyle\zeta_k(s) = \prod_{\mathfrak{p} \text{ prime}} \left(1 - \frac{1}{N(\mathfrak{p})^s}\right)^{-1}

as in Euler’s factorization of the Riemann zeta function. The classification of prime ideals \mathfrak{p} in \mathbb{Z}[i] (and their norms) can be recognized in terms of congruence conditions as a consequence of Euler’s description of primes in S_2. To summarize this classification, we have that each prime \mathfrak{p} divides a unique prime ideal (p) \subset \mathbb{Z}, and

  • if p \equiv 1 \!\! \mod 4, then (p) = (\alpha)(\overline{\alpha}) factors into a product of two Gaussian primes of norm p;
  • if p \equiv 3 \!\! \mod 4, then (p) remains prime in \mathbb{Z}[i] and has norm p^2;
  • the prime 2 factors as (2) = (1+i)^2.

It follows that

\displaystyle\zeta_k(s) = (1- 2^{-s})^{-1} \prod_{p \equiv 1(4)} \left(1 - \frac{1}{p^s}\right)^{-2} \prod_{p \equiv 3(4)} \left(1- \frac{1}{p^{2s}}\right)^{-1}.      (2)

Since \mathbb{Z}[i] is a principal ideal domain (PID), there exists a correspondence between Gaussian integers and ideals in \mathbb{Z}[i] that overcounts with multiplicity four, the number of units in \mathbb{Z}[i]. Thus the number of ideals in \mathbb{Z}[i] of norm \leq X grows like (\pi/4) X, which proves that \zeta_k(s) diverges as s \to 1^+.

On the other hand, the 2-factor and the second infinite product in (2) converge as s \to 1^+. Thus

\displaystyle\lim_{s \to 1^+} \prod_{p \equiv 1(4)} \left(1- \frac{1}{p^s} \right)^{-2} = \infty,

which implies that the harmonic series of primes congruent to 1\!\!\mod 4 diverges, by taking logarithms. \square


Exercise 1: What is the average number of ways an integer can be written as the sum of two squares?  What about three squares or four squares?

Exercise 2: Let \chi(n) denote the non-trivial character of modulus 4, ie. the character for which \chi(-1)=-1. Prove that \zeta_k(s) = \zeta(s)L(s,\chi), where L(s,\chi) is the Dirichlet L-function attached to \chi. Prove that L(1,\chi) \neq 0. (This is more in the flavor of Dirichlet’s original proof.)

Exercise 3: Our proof that \zeta_k(s) diverges as s \to 1^+ uses that \mathbb{Z}[i] is a PID. Give an analytic continuation of L(s,\chi) to \Re s >0 and apply Landau’s theorem (see section 5)  to conclude that \zeta_k(s) must have a pole at s=1. Hint: bound the partial sums of the coefficients of L(s,\chi).

Exercise 4: Let k = \mathbb{Q}[i] and show that the Euler product expression

\displaystyle\zeta_k(s) = (1-2^{-s})^{-1} L(s,\chi)^2 \prod_{p \equiv 3(4)} \left(1-\frac{1}{p^{2s}}\right)\prod_{p\equiv 3(4)} \left(1-\frac{1}{p^s}\right)^{-2}

holds in the half-plane \Re s > 1. Show that this implies that the last product above diverges as s \to 1^+, and use this to prove that the harmonic sum of primes congruent to 3 mod 4 diverges.

2 responses to “Sums of Squares and Density

  1. Pingback: Lattice Points in High-Dimensional Spheres | a. w. walker·

  2. I am now not positive where you are getting your information, but good topic. I needs to spend a while finding out more or figuring out more. Thank you for fantastic info I was looking for this info for my mission.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s