Finding prime monomials

(Assumed audience: folks with a working knowledge of Swift and modular arithmetic.)

Anish nerd sniped me the other day — and while I usually tuck these kinds of posts away in my technical notebook, the problem’s solution threaded so many wonderful topics that it warranted a longer-form entry.

The problem statement is

A prime number is a natural number greater than one whose only factors are one and itself. An m-digit number is monomial if it uses each of the numbers in the interval $\left[1, m\right]$ exactly once. For example, 2314 is monomial as it is four digits long and uses digits one through four exactly once.

Write a program that finds the largest monomial prime number in under a second (the faster the better).

Woof. TIL there’s overloaded meaning to “monomial.1” I’ve usually seen it used in the single-term polynomial sense. Let’s sketch out an approach in Swift, assuming we have some helper methods like BinaryInteger.isPrime and Collection.permutations in hand and aren’t allowed to parallelize the search.

(Gist permalink.)

First, let’s zoom in on isPrime. While an Int-constrained extension could’ve fit the bill since our monomial upper bound — 987654321 — is well below Int.max, I figured we could preserve generality by allowing UInt and other BinaryInteger conformances into the mix.

Primality testing is a whole field in and of itself, so let’s start with a (slightly faster) division-based approach and pull in swift-algorithms for some sequence- and collection-related helpers.

(Gist permalink.)

It’s worth pausing on the $6k \pm 1$ optimization because we could’ve simply let divisors be (5...) and called it a day (skipping over 4 for the lower bound since the multiple of two check would’ve covered that divisor). “All primes must be in the form $6k \pm 1$” is really a statement about the possible congruence classes (remainders) of primes modulo $6$. That is, a prime, $p$, must satisfy $p \equiv \pm 1 \;(\bmod\; 6)$. To see why, let’s consider what it would mean if $p$ was congruent to 0, 2, 3, or 4 (the remaining classes):

  • $\equiv 0 \Rightarrow 6 \mid p, \bot$;
  • $\equiv 2 \Rightarrow \exists k \in \mathbb{Z} \textrm{ such that } p = 6k + 2$
    $= 2(3k + 1) \Rightarrow 2 \mid p, \bot$;
  • $\equiv 3 \Rightarrow \exists k \in \mathbb{Z} \textrm{ such that } p = 6k + 3$
    $= 3(2k + 1) \Rightarrow 3 \mid p, \bot$;
  • $\equiv 4 \Rightarrow \exists k \in \mathbb{Z} \textrm{ such that } p = 6k + 4$
    $= 2(3k + 2) \Rightarrow 2 \mid p, \bot$;

…they all lead to contradictions (!). This lets us trim down the (5...) divisor search space to $\left[6 \times 1 - 1, 6 \times 1 + 1, 6 \times 2 - 1, 6 \times 2 + 1, …\right]$. And to achieve this, we start with ‌6 * 1 - 1 and alternating-ly — pretend that’s a word hah — add 2 and 4 by cycleing those offsets with reductions2 to get a running total.

Now to fill in the the Collection.permutations assumption. Prior art for implementing such a method seems to be Jeffrey Johnson’s SEPA permutation algorithm and after some searching around, I noticed it’s thankfully included in swift-algorithms — phew. So, with that other base covered by the Algorithms import for isPrime, we can give our original monomial search scaffolding a speed test with swift-benchmark (also asserting that a consistent largestPrimeMonomial value is set).

(Gist permalink.)

Hmmm, the approach clocks in at an average of 1.005 seconds across 25 runs, just above the question’s requirements. Sans squeezing out performance by dipping below Strings to lower-level representations (since we’re squarely in ASCII territory) or beefing up isPrime, let’s see if we can trim the ms we consider in the first place.

An invariant we can lean on is the monomial-ness. Since each digit in $\left[1, m\right]$ is used exactly once, maybe the sum of the digits involved can reveal their hand and allow us to rule out specific $m$ lengths entirely?

Any base-ten number, $a$, can be expanded out as follows (with $a_i$ representing the digit at the $i$th position):

\[\begin{aligned} a &= {a_n \cdot 10^n + … + a_2 \cdot 10^2 + a_1 \cdot 10^1 + a_0 \cdot 10^0} \\ &= {a_n \cdot (99…99) + … + a_2 \cdot 99 + a_1 \cdot 9 + a_0 \cdot 0 + \sum_{i=0}^n a_i} \\ &= {9(a_n \cdot (11…11) + … + a_2 \cdot 11 + a_1 \cdot 1 + a_0 \cdot 0) + \sum_{i=0}^n a_i} \\ &\Rightarrow a \equiv \sum_{i=0}^n a_i \;(\bmod\; 9) \end{aligned}\]

Or in prose, base-ten numbers are congruent to the sum of their digits modulo 9. This is huge — I smiled like a goof when I saw the proof sketched out in a Mathematics Stack Exchange answer.

Back in Swift land, we defined ms to be (3...9).reversed(). With the above congruence in hand, we can start knocking out candidate $m$ values (assume $M_i$ is the set of all monomials of length $i$ and $m_k$ is the digit in the $k$th position of the monomial).

\[\begin{aligned} i = 3 & \Rightarrow {\forall m \in M_3, \sum_{k=1}^3 m_k = \sum_{k=1}^3 k = 6 \equiv m \;(\bmod\; 9)} \\ & \Rightarrow 3 \mid m \textrm{ (by a similar argument to the one used in \texttt{isPrime})}\\ & \Rightarrow m \textrm{ can’t be prime.} \end{aligned}\]

Repeating this, we can rule out all but 4 and 7 as $m$-lengths possibly containing a prime. Or, in Swift, we can filter out members of ms with digit sums that are congruent to 0, 3, or 6 modulo 9 — the classes that aren’t relatively prime with 9 itself, since that’d imply that monomials of that length contain no primes.

Adding that change in and re-running the suite speeds things up…dramatically.

(Gist permalink.)

Woot! There we go. A ~100% decrease in runtime. This is consistent with the 7_652_413 largestPrimeMonomial value we asserted on earlier, since its $m$ length is in one of the two possible remaining congruence classes that could contain primes and, since we check it before 4, we’re able to break outerLoop early.

So, long story short, 7,652,413 is the largest monomial prime3!

We could call it a day, but, it turns out that there’s something tangental, yet deeper, going on when we were able to rule out all congruence classes sans $\pm 1$ modulo 6 in isPrime and sans 4, 7 modulo 9 from ms. That result is (sending with Lasers effect)…

Dirichlet’s theorem

I don’t quite have the background needed to understand the proof of Dirichlet’s theorem — shakes fist at younger Jasdev’s decision not to take complex analysis4 — still, I can provide an introduction (and also link out 3Blue1Brown’s wonderful treatment of the topic).

Earlier we noted that primes must be $\pm 1$ modulo 6 and 4 or 7 modulo 9 or as you might’ve guessed, primes must fall into congruence classes that are coprime to the modulus at hand. Because if they weren’t, that’d imply the prime shares a divisor with the modulus, contradicting it being a prime in the first place. Dirichlet’s theorem takes this observation a further and proves that infinitely many primes fall into each of these congruence classes. Or in probability terms, the probability that a given prime, $p$, is congruent to some $a$ modulo $n$ where $a$ and $n$ are coprime is $\frac {1}{\varphi(d)}$, where $\varphi(d)$ is Euler’s totient function, which counts the positive integers up to $d$ that are relatively prime to it. Pairing this with the infinitude of primes implies that the primes are uniformly distributed across those congruence classes modulo $d$ (!).

(…but really, go check out the timestamped 3Blue1Brown link above for a visual overview because text doesn’t quite do it justice.)

I’ll pause here. Hopefully once I’ve learned enough complex analysis to understand the theorem’s proof, I’ll try to bring it down from orbit and into another post.

Until then.


Special thanks to Nate, Soroush, and Davide for workshopping an early version of BinaryInteger.isPrime. Also to Daniel for his work on Algorithms.Sequence.reductions(_:_:) and another round of thanks to Nate for .Collection.cycled(times:) and .Collection.permutations(ofCount:) in the same package.

Footnotes

  1. Turns out this question mirrors Project Euler #41, where the problem is phrased as “finding the largest pandigital prime.” 

  2. swift-algorithm’s Sequence.reductions(_:_:) carries the baton from an early revision of SE-0045 that included it under the scan naming, but was ultimately nixed because of low utility. 

  3. Other variants of this problem expand the digit range from (1...m) to (0...m). I’ll leave this variant as an exercise for the dedicated reader. 

  4. Timely enough, Rochard Borcherds started a series on the topic