Stirling’s approximation is a useful approximation for large factorials, which states that the $n$th factorial is well-approximated by the formula
$$n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}$$
as $n \to \infty$. Stirling’s approximation was first proven within correspondence between Abraham de Moivre and James Stirling in the 1720s; de Moivre derived everything but the leading constant, which Stirling eventually supplied (without proof; it’s not known how Stirling guessed it).
Given the rapid growth of the factorial, it is often more convenient to take logarithms and work with those instead. Doing so, Stirling’s approximation assumes the form
$$\log n! = n\log n - n + \tfrac{1}{2}\log(2\pi n) + o(1).$$
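As a quick sanity check on the logarithmic form, here is a minimal Python sketch comparing it against the standard library's `math.lgamma`. The helper name `stirling_log_factorial` is made up for this illustration. The last printed column rescales the error by $n$; it appears to settle near $1/12$, which is consistent with the first correction term of Stirling's series.

```python
import math

def stirling_log_factorial(n: int) -> float:
    """Leading terms of Stirling's approximation for log(n!)."""
    return n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)

for n in (5, 10, 100, 1000):
    exact = math.lgamma(n + 1)  # log(n!) = log Gamma(n + 1)
    approx = stirling_log_factorial(n)
    # Print the exact value, the approximation, and the error rescaled by n.
    print(f"n={n:5d}  exact={exact:.6f}  approx={approx:.6f}  "
          f"n*error={n * (exact - approx):.6f}")
```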
Most proofs of Stirling’s approximation work with the formulation above. One typical (modern-ish) proof begins by studying the integral approximation
$$\log n! = \sum_{k=1}^{n} \log k = \int_1^n \log x \, dx + O(\log n) = n\log n - n + O(\log n).$$
This version (as written) is too imprecise to recover Stirling’s approximation, but the error term can be improved massively by introducing more terms using the Euler–Maclaurin formula (ca. 1735). This is the proof sketched on Wikipedia, for example.
The precise statement of the Euler–Maclaurin formula makes reference to the Bernoulli numbers $B_n$. These show up all over combinatorics, but as a multiplicative number theorist I will always think about Bernoulli numbers in the context of the Riemann zeta function. Specifically, one has $\zeta(-n) = -\frac{B_{n+1}}{n+1}$ for positive integers $n$, so the Bernoulli numbers relate to special values of the zeta function.
This point of view inspired me to derive Stirling’s approximation (and the additional terms making up Stirling’s series) in a way which makes the role of the zeta function obvious. For convenience, we’ll phrase everything in terms of the gamma function; this affects the shape of our formula in a small and readily-understandable way. Without further ado, here’s the proof:
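Proof: We begin with Weierstrass’ infinite product for the gamma function (ca. 1854),
$$\Gamma(z) = \frac{e^{-\gamma z}}{z} \prod_{n=1}^{\infty} \left(1 + \frac{z}{n}\right)^{-1} e^{z/n},$$
in which $\gamma$ is the Euler–Mascheroni constant. (Euler gave an infinite product for the gamma function back in 1729 which might also work for this proof technique.) Taking a principal branch of the logarithm, we produce
$$\log\Gamma(z) = -\gamma z - \log z + \sum_{n=1}^{\infty} \left( \frac{z}{n} - \log\left(1 + \frac{z}{n}\right) \right).$$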
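Our next ingredient is a contour integral representation of the logarithm; namely,
$$\log(1+x) = \frac{1}{2\pi i} \int_{(c)} \frac{\pi\, x^{s}}{s \sin(\pi s)} \, ds,$$
in which the line of integration is the vertical line with real part $c \in (0,1)$. This formula can be proven by Cauchy’s residue formula (and analytic continuation in $x$). We note as well an interpretation via Mellin transforms: this formula states that the Mellin transform of $\log(1+x)$ is $\frac{\pi}{s \sin(\pi s)}$ (up to the substitution $s \mapsto -s$, under which this function is even). (See here for a table listing this and other Mellin pairs.)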
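The contour integral for $\log(1+x)$ can be checked numerically. The sketch below is only an illustration, under the assumption that the integrand is $\pi x^{s}/(s\sin(\pi s))$ on the line $\operatorname{Re} s = c$ with $0 < c < 1$ and that $x > 0$ is real. The function name `log1p_via_contour` and the truncation parameters `T` and `steps` are choices made here, not part of the proof. Because the integrand decays like $e^{-\pi |t|}$ along the line, a truncated trapezoid rule should already be very accurate.

```python
import cmath
import math

def log1p_via_contour(x: float, c: float = 0.5, T: float = 20.0,
                      steps: int = 4000) -> float:
    """Approximate (1 / (2 pi i)) * integral over Re(s) = c of
    pi * x**s / (s * sin(pi * s)) ds, for real x > 0 and 0 < c < 1."""
    h = 2 * T / steps
    log_x = math.log(x)
    total = 0j
    for k in range(steps + 1):
        t = -T + k * h
        s = complex(c, t)
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end weights
        total += weight * math.pi * cmath.exp(s * log_x) / (s * cmath.sin(math.pi * s))
    # On the line s = c + i t we have ds = i dt, so 1 / (2 pi i) becomes 1 / (2 pi).
    return (total * h / (2 * math.pi)).real

for x in (0.5, 1.0, 3.0):
    print(x, log1p_via_contour(x), math.log1p(x))
```

The value $x = 3$ lies outside the disc $|x| < 1$ where the residue expansion converges, so agreement there illustrates the analytic continuation in $x$.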
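In any case, by combining our formulas and shifting the contour to the right (past the simple pole at $s=1$, whose residue accounts for the term $z/n$), we produce
$$\log\Gamma(z) = -\gamma z - \log z - \sum_{n=1}^{\infty} \frac{1}{2\pi i} \int_{(3/2)} \frac{\pi}{s \sin(\pi s)} \left(\frac{z}{n}\right)^{s} ds.$$
Absolute convergence of the contour integral justifies an interchange of sum and integral and allows us to recognize a zeta function:
$$\log\Gamma(z) = -\gamma z - \log z - \frac{1}{2\pi i} \int_{(3/2)} \frac{\pi\, \zeta(s)}{s \sin(\pi s)}\, z^{s} \, ds.$$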
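We can now shift the contour of integration back to the left, extracting residues as we go. The residue of the integrand at the double pole at $s=1$ is $-z\log z + z - \gamma z$, hence
$$\log\Gamma(z) = z\log z - z - \log z - \frac{1}{2\pi i} \int_{(1/2)} \frac{\pi\, \zeta(s)}{s \sin(\pi s)}\, z^{s} \, ds.$$
(From here on out, we identify the Riemann zeta function with its meromorphic continuation.) Shifting farther left passes a second double pole at $s=0$, which has residue $-\tfrac{1}{2}\log z - \tfrac{1}{2}\log(2\pi)$, hence
$$\log\Gamma(z) = \left(z - \tfrac{1}{2}\right)\log z - z + \tfrac{1}{2}\log(2\pi) - \frac{1}{2\pi i} \int_{(-1/2)} \frac{\pi\, \zeta(s)}{s \sin(\pi s)}\, z^{s} \, ds.$$
A bound in absolute value shows that the integral at right in the line above is $O(|z|^{-1/2})$ and therefore small enough to recover Stirling’s approximation.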
With the proof complete, I offer two final remarks:
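1. There’s nothing stopping us from shifting the line of integration in our formula even farther left. Doing so passes simple poles at the negative odd integers, which we can extract as residue terms in our formula. (The poles at negative even integers are cancelled by trivial zeros of the zeta function.) This yields, for each fixed $K \geq 1$,
$$\log\Gamma(z) = \left(z - \tfrac{1}{2}\right)\log z - z + \tfrac{1}{2}\log(2\pi) - \sum_{k=1}^{K} \frac{\zeta(1-2k)}{(2k-1)\, z^{2k-1}} + O_K\!\left(|z|^{-2K-\frac{1}{2}}\right),$$
which is equivalent to Stirling’s series.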
2. In one of Keith Conrad’s online notes, he subdivides proofs of Stirling’s approximation according to the origin of the mysterious factor $\sqrt{2\pi}$. Conrad lists references in which $\sqrt{2\pi}$ is obtained through (a) infinite products (such as the Wallis product), (b) the Gaussian integral, and (c) reflection formulas for the gamma function. In the proof we give here, we obtain $\sqrt{2\pi}$ from the series expansion
$$\zeta(s) = -\tfrac{1}{2} - \tfrac{1}{2}\log(2\pi)\, s + O(s^2)$$
in the limit as $s \to 0$. Because the functional equation of zeta invokes gamma functions, you could argue that this proof falls in camp (c). However, the proof here is quite different from that given in Conrad’s example for (c) (which was taken from section 5.2.5 in Ahlfors’ book Complex Analysis). Both proofs begin with the Weierstrass product, but their use of contour integration is quite different. The proof given here is also shorter.
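To illustrate the first remark, the following sketch evaluates a few terms of Stirling's series for $\log\Gamma(z)$ written in terms of zeta values, assuming each correction term has the form $-\zeta(1-2k)/\big((2k-1)z^{2k-1}\big)$. The table of special values and the function name `log_gamma_series` are introduced here only for this illustration; the result is compared against `math.lgamma`.

```python
import math
from fractions import Fraction

# Known special values zeta(-m) at the first few negative odd integers.
ZETA_AT_NEGATIVE_ODD = {
    -1: Fraction(-1, 12),
    -3: Fraction(1, 120),
    -5: Fraction(-1, 252),
    -7: Fraction(1, 240),
}

def log_gamma_series(z: float, terms: int = 4) -> float:
    """Stirling's main term plus `terms` zeta-valued corrections (at most 4)."""
    main = (z - 0.5) * math.log(z) - z + 0.5 * math.log(2 * math.pi)
    correction = 0.0
    for k in range(1, terms + 1):
        m = 2 * k - 1  # the simple pole at s = -m contributes this term
        correction -= float(ZETA_AT_NEGATIVE_ODD[-m]) / (m * z**m)
    return main + correction

z = 10.0
for terms in range(5):
    error = log_gamma_series(z, terms) - math.lgamma(z)
    print(f"terms={terms}  error={error:.3e}")
```

With `terms=1` the correction is $1/(12z)$, the familiar first term of Stirling's series.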