In his excellent and excellently-named expository piece A Tale of Two Sieves, Carl Pomerance tells the history of two factoring algorithms, the quadratic and number field sieves. This story begins with a factorization technique called factorization via congruence of squares, which attempts to factor n by constructing a non-trivial congruence of the form x² ≡ y² (mod n) with x ≢ ±y (mod n).
If such a congruence is known, then n divides x² − y² = (x − y)(x + y), and the prime factors of n distribute between x − y and x + y. If we’re lucky (or at least, not unlucky), each of x − y and x + y gets at least one prime factor of n, and we can detect such a factor by computing gcd(x − y, n).
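In code, the extraction step is a one-liner around the GCD. Here is a minimal sketch in Python; the function name and the toy modulus 91 are my own illustrations, not anything from Pomerance's article:

```python
from math import gcd

def factor_from_congruence(x, y, n):
    """Given x^2 ≡ y^2 (mod n), try to extract a non-trivial factor of n."""
    assert (x * x - y * y) % n == 0
    d = gcd(x - y, n)
    # d is trivial (1 or n) exactly when x ≡ ±y (mod n)
    return d if 1 < d < n else None

# 10^2 = 100 ≡ 9 = 3^2 (mod 91), and 10 ≢ ±3 (mod 91)
print(factor_from_congruence(10, 3, 91))  # → 7, and indeed 91 = 7 · 13
```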
This is a great start, but it doesn’t tell us how to produce these congruences of squares in the first place. Kraitchik’s idea back in the 1920s was to find them by multiplying together non-square quadratic residues with “simple” prime factorizations. We illustrate with an example (stolen from Pomerance):
Example 1. Let’s factor the integer 2041. If we wanted to apply Fermat’s Factorization Method (factorization using differences of squares), we’d consider the values of x² − 2041, starting at x = 46 (the first integer past √2041), and look for squares. We produce the list

75, 168, 263, 360, 459, 560, 663, 768, 875, 984, …

These are quadratic residues (by construction), but none of these are squares, so Fermat would keep on looking. But what Kraitchik observes is that some of these outputs factor entirely into small primes:

75 = 3 · 5², 168 = 2³ · 3 · 7, 360 = 2³ · 3² · 5, 560 = 2⁴ · 5 · 7.

By looking at prime factorizations, it’s not hard to find a combination of these outputs that gives a perfect square. For example, 75 · 168 · 360 · 560 = 2¹⁰ · 3⁴ · 5⁴ · 7² = 50400². (There are a few others, too.) But, let’s not forget that these numbers came from values of x² − 2041. In this example, we used the x-values 46, 47, 49, and 51. Thus

(46 · 47 · 49 · 51)² ≡ 75 · 168 · 360 · 560 (mod 2041).

Since 75 · 168 · 360 · 560 = 50400², this implies (46 · 47 · 49 · 51)² ≡ 50400² (mod 2041). Then 46 · 47 · 49 · 51 ≡ 311 and 50400 ≡ 1416 (mod 2041), and by computing

gcd(1416 − 311, 2041) = gcd(1105, 2041) = 13,

we get a non-trivial factor of 2041. The full factorization is 2041 = 13 · 157. //
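The whole procedure can be automated. Below is a naïve Python sketch of Kraitchik's Method (all names are mine): it collects smooth values of x² − n by trial division, then brute-forces subsets for a square product, rather than using the linear algebra a serious implementation would. Run on Pomerance's n = 2041 with the primes up to 7, it recovers a factor:

```python
from math import gcd, isqrt
from itertools import combinations

def is_smooth_over(m, primes):
    """Return True if m factors completely over the given primes."""
    for p in primes:
        while m % p == 0:
            m //= p
    return m == 1

def kraitchik(n, primes, how_many=10):
    # Collect smooth values of Q(x) = x^2 - n, starting just above sqrt(n).
    smooth = []  # pairs (x, Q(x))
    x = isqrt(n) + 1
    while len(smooth) < how_many:
        q = x * x - n
        if is_smooth_over(q, primes):
            smooth.append((x, q))
        x += 1
    # Brute-force subsets whose Q-product is a perfect square.
    for r in range(2, len(smooth) + 1):
        for combo in combinations(smooth, r):
            prod = 1
            for _, q in combo:
                prod *= q
            root = isqrt(prod)
            if root * root == prod:  # found a congruence of squares
                xs = 1
                for xv, _ in combo:
                    xs = xs * xv % n
                d = gcd((xs - root) % n, n)
                if 1 < d < n:
                    return d
    return None

print(kraitchik(2041, [2, 3, 5, 7]))  # → 157, so 2041 = 13 · 157
```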
The purpose of this article isn’t to rehash A Tale of Two Sieves. Rather, I’d like to explore the bits and pieces that make Pomerance’s example work so nicely. For contrast, here’s a bad example:
Example 2. Let’s factor an unluckily chosen n. We define Q(x) = x² − n and compute Q(x) at the first three candidate values of x, producing a prime, a number with an ungainly large prime factor, and a perfect square. In this case, Kraitchik’s Method reduces to the method of Fermat; it would have been easier to test each Q(x) for squareness directly. //
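For comparison, Fermat's Factorization Method itself takes only a few lines. A Python sketch (the function name is mine), run here on n = 2041, the integer from Pomerance's example:

```python
from math import isqrt

def fermat(n):
    """Fermat's method: walk x upward from ceil(sqrt(n)) until x^2 - n is a square."""
    x = isqrt(n)
    if x * x < n:
        x += 1
    while True:
        y2 = x * x - n
        y = isqrt(y2)
        if y * y == y2:
            return x - y, x + y  # n = (x - y)(x + y)
        x += 1

print(fermat(2041))  # → (13, 157), found at x = 85
```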
But that’s not the only way things can go wrong. Here’s a failure of a different sort:
Example 3. Let’s factor a more stubborn n. We define Q(x) = x² − n and compute values of Q(x) starting at x = ⌈√n⌉. In line with Pomerance’s example, we want to focus on x-values for which Q(x) admits no prime factors larger than a given bound B; such integers are called B-smooth. The smallest value we test for smoothness is Q(⌈√n⌉); after this, we compute the next several values of Q(x), and so forth. Each has a prime factor larger than B, so we choose to discard them all.
Around now, the numbers start getting large. For the sake of efficiency, we might stop factoring them completely. Instead, since we’re looking for B-smooth outputs, we can perform trial division by the primes up to B and give up on any Q(x) that fails to factor fully. But herein lies the problem: cutting off trial division early means that a value of Q(x) which is a perfect square, but not B-smooth, gets discarded, and we’d keep on going long after Fermat’s Factorization Method terminates.
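The smoothness test by trial division might look like the following sketch (the function name is mine). It divides by every integer up to B, which is safe because a composite divisor's prime factors are removed before the composite itself is reached:

```python
def is_smooth(m, B):
    """Return True iff every prime factor of m is at most B."""
    d = 2
    while d * d <= m and d <= B:
        while m % d == 0:
            m //= d
        d += 1
    # At this point m is 1, a prime, or a product of primes all exceeding B.
    return m <= B

print(is_smooth(560, 7), is_smooth(663, 7))  # → True False (560 = 2^4·5·7, 663 = 3·13·17)
```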
This spirals. Searching even two million values of x turns up none which are B-smooth. Are there tricks we could use to salvage this example? Of course. We could increase the smoothness bound. We could switch to working with x² mod n when the values of x² − n become too large. Still, these are things we’d hate to have to do when presenting an early example. //
Given all this, just how special is Pomerance’s example n = 2041? Is it misleadingly (or in some sense even optimally) nice, or is it just one of many reasonable n to consider? To answer this, let’s fix some properties that a “nice” introductory example should have:
- n should have no small prime factors. I’ll be the first to admit that any particular cutoff is somewhat arbitrary, but 2041 = 13 · 157 passes this test (barely), and we certainly want to exclude integers which fall too easily to trial division. (At the bare minimum, things get strange if n is even.)
- n should be a semiprime. This goes hand in hand with the previous condition, since any number with three distinct prime factors, none of them small, is already quite large. Moreover, it seems natural to look at semiprimes because one of the main “industrial applications” of factoring is the breaking of RSA keys.
- The example shouldn’t devolve into Fermat’s Method. Factorization based on congruences of squares is often pitched as an improvement to Fermat’s Method. It’s therefore silly to work through an example that reverts back to this simpler technique. (See Example 2.)
- The example shouldn’t require an overly large smoothness bound. Here, large is a bit subjective. Pomerance notes that the optimal smoothness bound for large n is roughly B = exp((1/2)√(log n · log log n)).

Being charitable, we’ll call an example “bad” if the values of Q(x) appear to escape to infinity before a square-producing subset of 5B-smooth numbers is found, where B is as above.
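For concreteness, the bound above is easy to evaluate; a quick Python sketch (the function name is mine, and the logs are natural):

```python
from math import exp, log, sqrt

def optimal_bound(n):
    """Asymptotically optimal smoothness bound exp((1/2) sqrt(log n · log log n))."""
    return exp(0.5 * sqrt(log(n) * log(log(n))))

print(optimal_bound(2041))    # ≈ 7.15: single-digit primes suffice at this size
print(optimal_bound(10**60))  # the bound grows slowly (subexponentially) with n
```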
These conditions create a fairly sparse list of potential n. Pomerance’s example remains in the list, as do smaller examples:
Example 4. Let’s factor one of the smaller admissible n using Kraitchik’s Method. We form Q(x) = x² − n and compute its first two values. Their product is already a perfect square, and the resulting congruence of squares, followed by a GCD computation, gives a non-trivial factor of n. //
I suppose the funny thing about this example is that it has succeeded a bit too well. To better highlight Kraitchik’s Method, let’s also restrict to the cases where the first two outputs of Q(x) don’t multiply to a perfect square. This eliminates a couple of candidates; overall, only a slight change.
Example 5. Let’s factor the smallest n that remains admissible using Kraitchik’s Method. As before, we form Q(x) = x² − n and compute a short list of values. A subset of the smooth outputs multiplies to a perfect square, and a GCD computation then gives a non-trivial factor of n. //
This is the first of our examples that feels like a solid introduction to Kraitchik’s Method. We have some smooth outputs, some not-so-smooth outputs, and enough terms to get you thinking about how you might find a square-producing subset in a more general case.
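In general, the square-producing subset is found with linear algebra over GF(2): record each smooth output's prime-exponent vector mod 2, and look for a linear dependency. Here is a Python sketch (the function names are mine), fed the opening residues from Pomerance's example:

```python
def parity_vector(m, primes):
    """Bitmask of m's prime exponents mod 2 over the factor base; None if not smooth."""
    vec = 0
    for i, p in enumerate(primes):
        while m % p == 0:
            vec ^= 1 << i
            m //= p
    return vec if m == 1 else None

def square_subset(values, primes):
    """Find a sublist of `values` whose product is a perfect square,
    by incremental Gaussian elimination over GF(2)."""
    basis = {}  # lowest set bit -> (reduced vector, subset bitmask)
    for j, v in enumerate(values):
        vec = parity_vector(v, primes)
        if vec is None:
            continue  # not smooth over the factor base; discard
        mask = 1 << j
        while vec:
            pivot = vec & -vec
            if pivot not in basis:
                basis[pivot] = (vec, mask)
                break
            bvec, bmask = basis[pivot]
            vec, mask = vec ^ bvec, mask ^ bmask
        if vec == 0:  # dependency found: these residues multiply to a square
            return [values[k] for k in range(len(values)) if mask >> k & 1]
    return None

subset = square_subset([75, 168, 263, 360, 459, 560], [2, 3, 5, 7])
print(subset)  # → [75, 168, 360, 560], whose product is 50400^2
```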
Pomerance does a few other computations with 2041 which may explain his particular choice. It demonstrates, for example, that the first subset of quadratic residues which produces a perfect square might require more than two terms. The case n = 2041 requires four terms: 75, 168, 360, and 560, although relations with fewer terms appear if one continues the computation.
If one wants to arrange for the first square-producing subset to use more than two terms, the list of admissible n shrinks further. The full list of such n is given below:
Example 6. Let’s factor one of these n using Kraitchik’s Method. As always, we form Q(x) = x² − n and compute some values of Q(x). In this case, quite a few are insufficiently smooth. By combining the useful ones, we eventually find a congruence of squares, but it is not useful: it has the form x ≡ ±y (mod n), so the GCDs it produces are trivial. We’d need to keep going. Unfortunately, the next smooth value we find is itself a perfect square, so this example (secretly) devolves into Fermat’s Method. //
Example 7. Let’s factor the next candidate n using Kraitchik’s Method. As always, we form Q(x) = x² − n and compute some values of Q(x). Omitting the insufficiently smooth ones, we find a subset of outputs whose product is a perfect square, which is enough to find a non-trivial factor of n. At long last, we’ve produced a truly beautiful example. //
In the previous example, we defined “sufficiently smooth” as 5B-smooth. We are, after all, being five times more lenient in our definition of smooth than what is asymptotically optimal. Tightening this bound from 5B to B disqualifies our previous n and produces a new minimal example:
Example 8. Let’s factor this new minimal n using Kraitchik’s Method. As always, we form Q(x) = x² − n and compute some values of Q(x). Omitting the insufficiently smooth ones, we find a square-producing subset and, with it, a non-trivial factor of n. This example matches the previous one in complexity but requires less trial division to perform. //
As a parting gift, here’s the full list of n which are admissible under our many conditions and the stricter smoothness bound B:
This list contains Pomerance’s n = 2041 and others. I imagine that most, if not all, would be fine “first examples” in the study of Kraitchik’s Method. (Although I have not verified that they avoid the pitfalls of Example 6.)