The Orders of Simple Groups

In 1832, Galois introduced the concept of normal subgroups, and proved that the groups $A_n$ (for $n>4$ ) and $\mathrm{PSL}_2(\mathbb{F}_q)$ (for $q>4$ ) were simple, i.e. admit no non-trivial (proper) normal subgroups. In 1892, Hölder asked for a classification of all finite simple groups, and the final classification of finite simple groups in 2004 by Aschbacher and Smith’s resolution of the quasithin case thus resolves an open problem over a century in the making. (Note: the great majority of the classification theorem concluded in the 1980’s. A computational error was identified and resolved in 2008.)

One immediate consequence of the classification of finite simple groups (CFSG) is that the set S given by

$S:=\{n \in \mathbb{N} : \text{there exists a simple group of order } n\}$

has natural density 0. Yet it seems unlikely that this result requires the tens of thousands of pages currently involved in the proof of the CFSG, and for the rest of this article we seek to bound the density of S using arithmetic methods.

— PART I —

Before we begin, let’s contrast this type of problem with a more familiar variant (an infamous problem for students learning group theory):

Exercise: Classify all simple groups of order less than 200.

In solving such a problem, the student will undoubtedly grow to appreciate the utility of the Sylow theorems, the class equation, Burnside’s p-q theorem, and a whole laundry list of results stemming from group actions. On the other hand, the student will eventually realize that no single technique performs as if a silver bullet: without fail, some cases require an extraordinary level of individual attention (e.g. the non-simplicity of groups of order 144).

Matters only become worse as the order of our groups tends to infinity. One may prove, for example, that the set of integers having at most two distinct prime factors has natural density 0; in particular, the utility of Burnside’s p-q theorem becomes vanishingly small as our groups become large. In order to bound the density of S, we must first identify criteria that hold on sets of positive density. Our first such follows from the Sylow theorems:

Proposition: Let G be a group of order n, with n composite. If n admits a prime factor p such that $p \geq \sqrt{n}$ , then G is not simple.

Proof: Given such a prime p, let $n_p$ denote the number of Sylow p-subgroups of G. Then $n_p$ divides $n/p$ , and so $n_p \leq p$ . On the other hand, we must have $n_p \equiv 1 \mod p$ , whence $n_p=1$ and the (unique) Sylow p-subgroup, P, of G is normal. As $n \neq p$ , P is proper (and non-trivial), so G is not simple. $\square$

For any integer set A, we define $A[n] := A \cap [1,\ldots,n]$ . We define

$LP:=\{ n \in \mathbb{N} : n \text{ has a prime factor exceeding } \sqrt{n}\}$

as the set of integers with the large-prime property. Thus, for instance, all primes and semiprimes lie in LP. Yet LP is much larger than either of these sets, as a consequence of the following Theorem:

Theorem 1: The set LP has natural density $\log 2 \approx 0.693147$ .

Proof: For any prime p, fix $k \leq p$ then $n:= kp$ lies in LP. Moreover, it is clear that all such elements of LP may be represented (uniquely) in this form. Then

$LP[n]=\displaystyle\bigcup_{\substack{kp\leq n \\ k \leq p}} \{kp\} =\bigcup_p \bigcup_{\substack{j\leq p \\ k\leq n/p}} \{kp\}=\bigcup_k \bigcup_{p \in [k,n/k]} \!\!\{kp\},$

obtained by “transposing” the order in which $LP[n]$ is counted. It follows that

$\# LP[n] = \displaystyle\sum_{k \leq \sqrt{n}} \left(\pi(n/k)-\pi(k-1)\right).$

We’ll address each contribution to the right-hand side separately. First off, known bounds on the error term in $\pi(n)$ give the asymptotic

$\epsilon_n:=\displaystyle\sum_{k \leq \sqrt{n}} \pi(k-1) \sim\displaystyle\sum_{k \leq \sqrt{n}} \frac{k}{\log k}\sim \frac{k^2}{2\log k} \bigg\vert^{\sqrt{n}} = \frac{n}{\log n}.$

Our other term will require more work; as before, we have

$\sigma_n:=\displaystyle\sum_{k \leq \sqrt{n}} \pi(n/k) \sim \sum_{k \leq \sqrt{n}} \frac{n/k}{\log n/k}=n\sum_{k \leq \sqrt{n}} \frac{1}{k\log n/k}.$

Now, let $S_i =[n^{i/(2m)}, n^{(i+1)/(2m)}]$ , for $i =0,\ldots, m-1$ . For $k \in S_i$ , we have $\log k > \frac{i}{2m} \log n$ , whence $\log n/k < (1-\frac{i}{2m})\log n$ . Thus

$\displaystyle\sum_{k \leq \sqrt{n}} \frac{1}{k\log n/k} >\sum_{i=0}^{m-1} \sum_{k \in S_i} \frac{1}{k(1-\frac{i}{2m})\log n}$

$=\displaystyle\frac{1}{\log n} \sum_{i=0}^{m-1} \bigg(1-\frac{i}{2m}\bigg)^{-1}\sum_{k \in S_i} \frac{1}{k}\sim \sum_{i=0}^{m-1}\bigg( 1-\frac{i}{2m}\bigg)^{-1}\frac{1}{2m}.$

Now taking $m \to \infty$ , we can evaluate this limit as a Riemann sum:

$\displaystyle \lim_{m \to \infty}\frac{1}{m}\sum_{i=0}^{m-1}\frac{1}{2-i/m} = \int_0^1 \frac{dt}{2-t} = \log 2,$

from which it follows that $\sigma_n\gtrsim n \log 2$ . In like fashion, we may show that $\sigma_n \lesssim n \log 2$ , using the upper bound $\log k <\frac{i+1}{2m} \log n$ on $S_i$ (and another Riemann sum for the same integral). Thus $\sigma_n \sim n \log 2$ , and it follows that

$\# LP[n] =\sigma_n+\epsilon_n \sim n \log 2,$

since the contribution of $\epsilon_n$ presents as an error term. So $LP[n]$ has natural density $\log 2$ , as desired. $\square$

As an application, we have the following:

Corollary 1: The natural density of S is at most $(1-\log 2)\approx 0.306853$ .

Proof: Let P denote the set of primes. Our result follows from the inclusion $(LP \smallsetminus P) \subset S^c$ and the fact that P has density 0. $\square$

As we’ll need this result later, we note that a stronger variant of Theorem 1 holds, which concerns the intersection of LP with certain arithmetic progressions. The proof is left to the reader.

Theorem 1b: Let $\ell \in \mathbb{N}$ . Then $LP \cap \ell\mathbb{N}$ has natural density $(\log 2)/\ell$ .

Proof: Left to the reader. $\square$

— PART II —

To make any further progress on the problem at hand, we return briefly to the theory of finite groups. A normal complement to a subgroup $H \leq G$ is a subgroup $K \leq G$ such that $HK=G$ and $H \cap K = \{e\}$ . The following theorem (often called the Burnside Normal Complement Theorem) gives a criterion for the existence of p-complements, i.e. normal complements to a Sylow p-subgroup.

Theorem (Burnside): Let G be a finite group and let P be a Sylow subgroup of G. If $P \leq Z(N_G(P)),$ then P has a normal complement in G.

Theorem 2: Let n be composite, with minimal prime factor p. Let G be a group of order n. If $p^2$ does not divide n, then G is not simple.

Proof: Let P be a Sylow p-subgroup of G. Then $N_G(P)/C_G(P)$ is isomorphic to some subgroup of $\mathrm{Aut}(P)$ by the N/C-theorem. In particular, $\# N_G(P)/\# C_G(P)$ divides $\# \mathrm{Aut}(P)$ , which has order $p-1$ as P is cyclic. Thus $\#N_G(P)/\#C_G(P)$ divides $(n,p-1)$ , and the minimality of p forces $N_G(P)=C_G(P)$ . As P is abelian we have

$P \leq C_G(P) \leq Z(C_G(P)) = Z(N_G(P))$ ,

whence Burnside’s theorem gives rise to a (proper) normal complement to P. In particular, G is not simple. $\square$

Our interest in Theorem 2 stems from the following Corollary:

Corollary 2: Let n be composite and square-free. If G has order n, then G is not simple.

We recall that the density of the set of square-free integers, henceforth denoted SF, is given by $6/\pi^2 \approx 0.607927$ . In particular, this Corollary (on its own) implies a weaker bound on the density of S than that which was given in Corollary 1.

Theorem 3: Membership in LP and SF is asymptotically independent. In particular, $LP \cup SF$ has density

$\displaystyle\delta:= \log 2+\frac{6}{\pi^2}-\frac{6\log 2}{\pi^2} \approx 0.879691,$

and it follows that S has density at most $1-\delta \approx 0.120308$ .

Proof: Theorem 1b implies that membership in LP and $\ell^2\mathbb{N}$ is asymptotically independent. As membership in the sets $\ell^2\mathbb{N}$ is jointly independent (over any finite sub-collection), membership in LP is independent of membership (with multiplicity) in the multi-set

$\bigg(\bigcup_{\mu(\ell)=1} \ell^2\mathbb{N}\bigg) \smallsetminus \bigg(\bigcup_{\mu(\ell)=-1}\ell^2\mathbb{N}\bigg),$

in which $\mu$ denotes the Möbius function. But this multi-set is precisely SF, by the inclusion-exclusion principle for multi-sets, and it follows that membership in LP and SF is asymptotically independent. Consequently, the density of $LP \cap SF$ is $\log 2 \cdot 6/\pi^2$ , and the remainder of Theorem 3 follows. $\square$

Remark: It may not be intuitively obvious that “LP-ness” and “SF-ness” ought to be independent events (asymptotically). Morally, we may view this as a consequence of the fact that SF-ness is primarily influenced by small primes, while LP-ness concerns the presence of large prime factors.

— PART III —

While we won’t attempt to further tighten the bound on the density of S using elementary methods, it is a natural question to ask how our bounds change upon the adoption of more and more complicated results leading up to the CFSG. One such result is the Feit-Thompson theorem (a.k.a. the Odd Order theorem), which implies that all groups of odd order are solvable. In particular, the Feit-Thompson theorem implies that non-cyclic, simple groups must have even order. More is true: if G is simple of even order $n > 2$ , then Theorem 2 implies that 4 divides the order of G.

Corollary: Assuming the Feit-Thompson theorem, the density of S is at most $(1-\log 2)/4 \approx 0.076713$ .

Exercise: For $\alpha \in [0,1]$ , we define $LP_\alpha$ as

$LP_\alpha := \{n \in \mathbb{N} : n \text{ has a prime factor exceeding } n^\alpha\}.$

Thus $LP = LP_{1/2}$ . For $\alpha \geq 1/2$ , prove that $LP_\alpha$ has density $- \log \alpha$ . In particular, if $\rho(n)$ denotes the largest prime factor of n, then $\log \rho(n)$ has a median value of $1/\sqrt{e} \approx 0.606531$ .

The following Exercises concern improvements to our various estimates on the density of S by means of borrowed results and minor refinements to already-seen techniques.

Exercise: SF is properly contained in the set of n for which Theorem 2 applies. To be specific, let $\mathrm{lpf}(n)$ denote the least prime factor of n. Define

$B:= \{n \in \mathbb{N} : \mathrm{lpf}(n)^2 \mid n\}.$

Prove that the density of B is given by

$\displaystyle\sum_{i=0}^\infty \frac{1}{p_{i+1}^2} \cdot \frac{\phi(i\#)}{i\#}\approx 0.330098,$

in which $p_i$ denotes the i-th prime, $i\#$ denotes the i-th primorial, and $\phi$ denotes Euler’s totient. Prove that membership in LP and membership in B is asymptotically independent, and use this to show that S has natural density $\lessapprox 0.101292$ (cf. Theorem 3).

Exercise: Let G be a non-abelian simple group of order n. Under the assumption that either $\mathrm{lpf}(n)^3$ or $12$ divides n (this is Theorem 6.26 in Rose’s “A Course on Finite Groups” and represents a generalization of Theorem 2), show that the set of numbers which are simultaneously cube-free and indivisble by 12 has natural density $87/(91\zeta(3)) \approx 0.795340$ . Prove that cube-freedness is independent of LP-ness, and use this fact to prove that S has natural density at most