Singmaster's conjecture in the interior of Pascal's triangle

Singmaster's conjecture asserts that every natural number greater than one occurs at most a bounded number of times in Pascal's triangle; that is, for any natural number $t \geq 2$, the number of solutions to the equation $\binom{n}{m} = t$ for natural numbers $1 \leq m<n$ is bounded. In this paper we establish this result in the interior region $\exp(\log^{2/3+\varepsilon} n) \leq m \leq n-\exp(\log^{2/3 + \varepsilon} n)$ for any fixed $\varepsilon>0$. Indeed, when $t$ is sufficiently large depending on $\varepsilon$, we show that there are at most four solutions (or at most two in either half of Pascal's triangle) in this region. We also establish analogous results for the equation $(n)_m = t$, where $(n)_m := n(n-1)\ldots(n-m+1)$ denotes the falling factorial.

For the purposes of attacking this conjecture, we may of course assume $t$ to be larger than any given absolute constant, which we shall implicitly do in the sequel. In particular we can assume that the iterated logarithms $\log_2 t := \log\log t$ and $\log_3 t := \log\log\log t$ are well-defined and positive.
In view of the symmetry (1.3) $\binom{n}{m} = \binom{n}{n-m}$, one may restrict attention to the left half (1.4) $1 \leq m \leq n/2$ of Pascal's triangle, in which $t = \binom{n}{m} \geq \binom{2m}{m} \geq 2^m$ and hence (1.5) $m \ll \log t$. (Our conventions for asymptotic notation are set out in Section 1.5.) Since $n \mapsto \binom{n}{m}$ is an increasing function of $n$ for fixed $m \geq 1$, $n$ is uniquely determined by $m$ and $t$. Thus by (1.5) we have at most $O(\log t)$ solutions to the equation $\binom{n}{m} = t$, a fact already observed in the original paper [22] of Singmaster. This bound was improved to $O(\log t/\log_2 t)$ by Abbott, Erdős, and Hansen [1], to $O(\log t \log_3 t/\log_2^2 t)$ by Kane [14], and finally to $O(\log t \log_3 t/\log_2^3 t)$ in a followup work of Kane [15]. This remains the best known unconditional bound for the total number of solutions, although it was observed in [1] that the improved bound $O_\varepsilon(\log^{2/3+\varepsilon} t)$ was available for any $\varepsilon > 0$ assuming the conjecture of Cramér [9].
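The counting argument above (for each $m$ at most one $n$, and only $O(\log t)$ admissible values of $m$ in the left half) can be turned into a small search. The following is a minimal sketch, not taken from the paper; the function name `solutions` is hypothetical, and it enumerates all pairs $(n, m)$ with $1 \leq m < n$ and $\binom{n}{m} = t$ by a binary search in $n$ for each $m$.

```python
from math import comb

def solutions(t):
    """Return all (n, m) with 1 <= m < n and C(n, m) == t, for an integer t >= 2.

    Hypothetical helper for illustration: for each m in the left half
    (m <= n/2) the map n -> C(n, m) is increasing, so a binary search in n
    finds the unique candidate; the right half follows by symmetry.
    """
    found = []
    m = 1
    # In the left half C(n, m) >= C(2m, m), so m is bounded in terms of t.
    while comb(2 * m, m) <= t:
        lo, hi = 2 * m, max(2 * m, t)  # C(n, m) >= n, so n <= t
        while lo < hi:
            mid = (lo + hi) // 2
            if comb(mid, m) < t:
                lo = mid + 1
            else:
                hi = mid
        if comb(lo, m) == t:
            found.append((lo, m))
            if lo != 2 * m:  # mirror image under m -> n - m
                found.append((lo, lo - m))
        m += 1
    return sorted(found)
```

For instance, `solutions(3003)` returns eight pairs, which is the largest multiplicity known for any entry of Pascal's triangle greater than one.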
From the elementary inequalities $\frac{(n-m)^m}{m!} < \binom{n}{m} \leq \frac{n^m}{m!}$ and some rearranging we see that any solution to $\binom{n}{m} = t$ obeys the bounds $(t\,m!)^{1/m} \leq n < (t\,m!)^{1/m} + m$.
Applying Stirling's approximation (2.4) (and also $n \geq m$) we can thus obtain the order of magnitude of $n$ as a function of $m$ and $t$: (1.6) $n \asymp m\,t^{1/m}$, or equivalently (1.7) $n \asymp m \exp\left(\frac{\log t}{m}\right)$.
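As a sanity check on these size estimates, the two-sided bound on $n$ can be tested in exact integer arithmetic, and the ratio $n / (m\,t^{1/m})$ can be observed to stay between absolute constants. This is only an illustrative sketch; the helper names `check_bounds` and `ratio` are hypothetical.

```python
from math import comb, exp, factorial, log

def check_bounds(n, m):
    """Check (t m!)^(1/m) <= n < (t m!)^(1/m) + m for t = C(n, m), exactly.

    The two inequalities are tested in the equivalent integer forms
    n^m >= t m!  and  (n - m)^m < t m!.
    """
    tm = comb(n, m) * factorial(m)
    return n ** m >= tm and (n - m) ** m < tm

def ratio(n, m):
    """Return n / (m * t^(1/m)) for t = C(n, m), computed via logarithms."""
    t = comb(n, m)
    return n / (m * exp(log(t) / m))

if __name__ == "__main__":
    for n, m in [(10, 3), (78, 2), (15, 5), (200, 100)]:
        print(n, m, check_bounds(n, m), round(ratio(n, m), 3))
```

The ratio is exactly 1 when $m = 1$ and drifts toward values between $1/e$ and $2$ otherwise, consistent with a two-sided bound with absolute constants.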
In particular we see that n grows extremely rapidly when the ratio m/ log t becomes small. This makes the difficulty of the problem increase as m/ log t approaches zero, and indeed treating the case of small values of m/ log t is the main obstruction to making further progress on bounding the total number of solutions. We will not explicitly use this estimate here.
In this paper we study the opposite regime in which $m/\log t$ is relatively large, or equivalently (by (1.7)) $n$ and $m$ are somewhat comparable (in the doubly logarithmic sense $\log_2 n \asymp \log_2 m$). More precisely, we have the following result:

Theorem 1.3 (Singmaster's conjecture in the interior of Pascal's triangle). Let $0 < \varepsilon < 1$, and assume that $t$ is sufficiently large depending on $\varepsilon$. Then there are at most two solutions to (1.1) in the region $\exp(\log^{2/3+\varepsilon} n) \leq m \leq n/2$. By (1.3), we thus have at most four solutions to (1.1) in the region $\exp(\log^{2/3+\varepsilon} n) \leq m \leq n-\exp(\log^{2/3+\varepsilon} n)$. Furthermore, in the smaller region $\exp(\log^{2/3+\varepsilon} n) \leq m \leq n/\exp(\log^{1-\varepsilon'} n)$ there is at most one solution, whenever $0 < \varepsilon' < \frac{\varepsilon}{2/3+\varepsilon}$ and $t$ is sufficiently large depending on both $\varepsilon$ and $\varepsilon'$.
Remark 1.5. In view of Theorem 1.3, we now see that to prove Conjecture 1.1, we may restrict attention without loss of generality to the region $2 \leq m \leq \exp(\log^{2/3+\varepsilon} n)$ for any fixed $\varepsilon > 0$, or equivalently (by (1.7)) to $2 \leq m \leq \frac{\log t}{\log_2^{3/2-\varepsilon} t}$ for any fixed $\varepsilon > 0$. It follows from the conjecture of de Weger [11] mentioned in Remark 1.4 that for $t$ sufficiently large there is at most one solution in this region, that is to say all but a finite number of binomial coefficients $\binom{n}{m}$ for $2 \leq m \leq \exp(\log^{2/3+\varepsilon} n)$ are distinct. In this direction, the number of solutions to the equation $\binom{n}{m} = \binom{n'}{m'}$ for fixed $2 \leq m < m'$ has been shown (via Siegel's theorem on integral points) to be finite in [4] (see also the earlier result [16] treating the case $(m, m') = (2, p)$ for an odd prime $p$). This implies that there are no collisions in the regime $2 \leq m \leq w(n)$ if $w$ is a function of $n$ that goes to infinity sufficiently slowly as $n \to \infty$. Unfortunately, due to the reliance on Siegel's theorem, the function $w$ given by these arguments is completely ineffective.

Remark 1.6. For some previous bounds of this type, in [1] it was shown that the number of solutions to (1.1) in the range $n^{5/6} \leq m \leq n/2$ was $O(\log^{3/4} t)$, while the arguments in [14, §7], after some manipulation, show that the number of solutions to (1.1) in the range $\exp(\log^{1/2+\varepsilon} n) \leq m \leq n^{5/6}$ is $O_\varepsilon(\log t/\log_2^3 t)$.

Remark 1.7. The implied quantitative bounds in the hypothesis "$t$ is sufficiently large depending on $\varepsilon$" are effective; however, we have made no attempt whatsoever to optimize them in this paper, and in their current form they are likely to be too large to be of use in numerical verification of Singmaster's conjecture.

We exclude the cases $m = 0$, $m = n$ since $(n)_0 = 1$ and $(n)_n = (n)_{n-1} = n!$.
In [1, Theorem 4] a bound was established, for any $t \geq 2$, on the number of integer solutions $(m, n)$ to (1.8) $(n)_m = t$ with $1 \leq m < n$. We do not directly improve upon this bound here, but can obtain an analogue of Theorem 1.3:

Theorem 1.8 (Falling factorial multiplicity in the interior). Let $0 < \varepsilon < 1$, and assume that $t$ is sufficiently large depending on $\varepsilon$. Then there are at most two integer solutions to (1.8) in the region $\exp(\log^{2/3+\varepsilon} n) \leq m < n$.
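To see that two solutions to $(n)_m = t$ can genuinely occur with $m$ close to $n$, one can check by hand that $(a^2-a)_{a^2-2a} = (a^2-a-1)_{a^2-2a+1}$ for every integer $a > 2$, since both sides reduce to $(a^2-a-1)!/(a-2)!$. This particular family is supplied here as an illustration and is not claimed to be the one the authors had in mind; the helper name `collision` is hypothetical.

```python
from math import perm

def collision(a):
    """Return the two falling factorials (n)_m from the illustrative family.

    math.perm(n, m) computes n! / (n - m)!, i.e. the falling factorial (n)_m.
    """
    n1, m1 = a * a - a, a * a - 2 * a
    n2, m2 = a * a - a - 1, a * a - 2 * a + 1
    return perm(n1, m1), perm(n2, m2)

if __name__ == "__main__":
    for a in range(3, 8):
        left, right = collision(a)
        print(a, left == right)
```

For $a = 3$ this is the pair $(6)_3 = 6 \cdot 5 \cdot 4$ and $(5)_4 = 5 \cdot 4 \cdot 3 \cdot 2$, both equal to 120.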
We establish this result in Section 5. Note that the bound of two is best possible, as can be seen from an infinite family of solutions indexed by the integers $a > 2$, and from more general families of the same kind.

1.2. Strategy of proof. Theorem 1.3 is a consequence of two Propositions that we now describe. The proof of Theorem 1.8 follows a similar pattern, and we refer the reader to Section 5 for details.

Proposition 1.9 (Distance estimate). Let $\varepsilon > 0$. Suppose we have two solutions $(n, m)$, $(n', m')$ to (1.1) in the left half (1.4) of Pascal's triangle. Then one has $m - m' \ll_\varepsilon \exp(\log^{2/3+\varepsilon}(n + n'))$. Furthermore, if $m, m' \geq \exp(\log^{2/3+\varepsilon}(n + n'))$ then we additionally have $n - n' \ll_\varepsilon \exp(\log^{2/3+\varepsilon}(n + n'))$.
Note how this proposition is consistent with the example in Remark 1.4. We shall discuss the proof of Proposition 1.9 in Section 1.3. For the application to Theorem 1.3, Proposition 1.9 localizes all solutions to (1.1) to a region of small diameter. To conclude Theorem 1.3, we can now proceed by adapting the Taylor expansion arguments of Kane [14], [15], in which one views $n$ as an analytic function of $m$ (keeping $t$ fixed) and exploits the non-vanishing of certain derivatives of this function; see Section 2. This is what the proposition below accomplishes. In fact in our analysis only two derivatives of this function are needed (i.e., we only need to exploit the convexity properties of $n$ as a function of $m$).

Proposition 1.10 (Kane-type estimate). Let $\varepsilon > 0$. Suppose that $(n, m)$ is a solution to (1.1) in the left half (1.4) of Pascal's triangle. Then there exists at most one other solution $(n', m') \neq (n, m)$ to (1.1) with $m' < m$, $n' > n$ and

With these two Propositions at hand it is easy to deduce Theorem 1.3.

Similarly for $n'$. Applying Proposition 1.9 (with $\varepsilon$ replaced by a sufficiently small quantity), we conclude that (1.10) $m - m', n - n' \ll_\varepsilon \exp(O(\log_2^{1-\varepsilon'} t))$ whenever $1 - \varepsilon' > \frac{2/3}{2/3+\varepsilon}$, or equivalently $\varepsilon' < \frac{\varepsilon}{2/3+\varepsilon}$. The result now follows from Proposition 1.10.
Remark 1.11. The above arguments showed that for $t$ sufficiently large depending on $\varepsilon$, there were at most four solutions to (1.1) in the region $\exp(\log^{2/3+\varepsilon} n) \leq m \leq n - \exp(\log^{2/3+\varepsilon} n)$. A modification of the argument also shows that there cannot be exactly three such solutions. For if this were the case, we see from (1.3) that there must be a solution $(n, m)$ with $n = 2m$, so that $m \asymp \log t$ by Stirling's approximation. For all other solutions $(n', m')$ to (1.1) we have $n' \geq n + 1$, hence

1.3. Proof methods. We now discuss the method of proof of Proposition 1.9, which is our main new contribution. In contrast to the "Archimedean" arguments of Kane (such as Proposition 1.10) that use real and complex analysis of the binomial coefficients $\binom{n}{m}$, the proof of Proposition 1.9 relies more on "non-Archimedean" arguments, based on evaluating the $p$-adic valuations $v_p\binom{n}{m}$ for various primes $p$, defined as the number of times $p$ divides $\binom{n}{m}$. From the classical Legendre formula (1.11) $v_p(n!) = \sum_{j=1}^{\infty} \lfloor n/p^j \rfloor$ we have (1.12) $v_p\binom{n}{m} = \sum_{j=1}^{\infty} \left( \{m/p^j\} + \{(n-m)/p^j\} - \{n/p^j\} \right)$, where $\{x\} := x - \lfloor x \rfloor$ denotes the fractional part of $x$. Note that the summands here vanish whenever $p^j > n$. From this identity we see that if $(n, m)$, $(n', m')$ are two solutions to (1.1) then we must have (1.13) $\sum_{j=1}^{\infty} \left( \{m/p^j\} + \{(n-m)/p^j\} - \{n/p^j\} \right) = \sum_{j=1}^{\infty} \left( \{m'/p^j\} + \{(n'-m')/p^j\} - \{n'/p^j\} \right)$ for all primes $p$. Our strategy will be to apply this equation with $p$ set equal to a random prime drawn uniformly amongst all primes in the interval $[P, P + P\log^{-100} P]$, where the scale $P$ is something like $\exp(\log^{2/3+\varepsilon/2}(n + n'))$, and inspect the distribution of the resulting random variables on the left and right-hand sides of (1.13) in order to obtain a contradiction when $m, m'$ or $n, n'$ are sufficiently well separated. In order to do this we need some information concerning the equidistribution of fractional parts such as $\{n/p^j\}$. This will be provided by the following estimate, proven in Section 4. There and later the letter $p$ always denotes a prime.

Proposition 1.12 (Equidistribution estimate). Let $\varepsilon > 0$ and $P \geq 2$ and let $I$ be an interval contained in $[P, 2P]$.
Let $M, N$ be real numbers with $M, N = O(\exp(\log^{3/2-\varepsilon} P))$, and let $j$ be a natural number.
(i) For all $A > 0$, (ii) Let $W : \mathbb{R}^2 \to \mathbb{C}$ be a smooth $\mathbb{Z}^2$-periodic function. Then, for all $A > 0$,

One can generalize this proposition to control the joint equidistribution of any bounded number of expressions of the form $\{n/p^j\}$, but for our applications it will suffice to understand the equidistribution of pairs $\{N/p\}$, $\{M/p^j\}$. When it comes to the proof of Proposition 1.12, the first step is to use Fourier expansion to reduce part (ii) of the proposition to part (i). For part (i), the case where $\frac{|N|}{P} + \frac{|M|}{P^j}$ is small (say $\leq \log^{O(A)} P$) is easily handled using the prime number theorem with classical error term. In the regime where $\frac{|N|}{P} + \frac{|M|}{P^j}$ is large, we use Vaughan's identity to decompose the sum in (i) into type I and II sums, and assert that these exhibit cancellation; the type I and II bounds are given in (4.9) and (4.11).
Both type I and type II sums can be handled using Vinogradov's bound for sums of the form $\sum_{n \in I} e(f(n))$ with $f$ smooth, although we need to first cut from $I$ small intervals around zeros of the first $\log P$ derivatives of $N/t + M/t^j$. This way we obtain that the sum in (i) exhibits cancellation. It is here that the restriction $N, M = O(\exp(\log^{3/2-\varepsilon} P))$ arises; even under the Riemann hypothesis we do not know how to relax this requirement.
Once the equidistribution estimate, Proposition 1.12, is established, the analysis of the distribution of both sides of (1.13) is relatively straightforward, as long as the scale $P$ is chosen so that the powers $P^j$ do not lie close to various integer combinations of $m, n, m', n'$. However, there are some delicate cases when two of the numbers $n, m, n-m, n', m', n'-m'$ are "commensurable" in the sense that one of them is close to a rational multiple of the other, where the rational multiplier has small height. Commensurable integers are also known to generate some exceptional examples of integer factorial ratios [6], [7], [25]. Fortunately, we can handle these cases in our context by an analysis of covariances between various fractional parts $\{n_1/p\}$, $\{n_2/p\}$, in particular taking advantage of the fact that these covariances are non-negative up to small errors, and small unless $n_1, n_2$ are very highly commensurable.
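The non-Archimedean input to this strategy is the expression of $v_p\binom{n}{m}$ as a sum of fractional parts $\{m/p^j\} + \{(n-m)/p^j\} - \{n/p^j\}$ over $j \geq 1$, which follows from Legendre's formula. The following minimal sketch (with hypothetical helper names) compares that sum, evaluated in exact rational arithmetic, against the valuation computed directly from the binomial coefficient.

```python
from fractions import Fraction
from math import comb, floor

def vp(x, p):
    """p-adic valuation of a positive integer x, by repeated division."""
    k = 0
    while x % p == 0:
        x //= p
        k += 1
    return k

def frac(x):
    """Fractional part {x} = x - floor(x) of a Fraction."""
    return x - floor(x)

def vp_binom_fractional(n, m, p):
    """Sum over j of {m/p^j} + {(n-m)/p^j} - {n/p^j}; terms with p^j > n vanish."""
    total = Fraction(0)
    pj = p
    while pj <= n:
        total += frac(Fraction(m, pj)) + frac(Fraction(n - m, pj)) - frac(Fraction(n, pj))
        pj *= p
    return total

if __name__ == "__main__":
    n, m, p = 78, 2, 3
    print(vp(comb(n, m), p), vp_binom_fractional(n, m, p))
```

Two pairs $(n, m)$ and $(n', m')$ with the same binomial coefficient must give the same value of this sum for every prime $p$, which is the constraint the random-prime argument exploits.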
We use $X \ll Y$, $X = O(Y)$, or $Y \gg X$ to denote the estimate $|X| \leq CY$ for some constant $C$. If we wish to permit this constant to depend on one or more parameters we shall indicate this by appropriate subscripts, thus for instance $O_{\varepsilon,A}(Y)$ denotes a quantity bounded in magnitude by $C_{\varepsilon,A} Y$ for some quantity $C_{\varepsilon,A}$ depending only on $\varepsilon, A$. We write $X \asymp Y$ for $X \ll Y \ll X$. We use $1_E$ to denote the indicator of an event $E$, thus $1_E$ equals 1 when $E$ is true and 0 otherwise.
We let $e$ denote the standard real character $e(x) := e^{2\pi i x}$.

Derivative estimates
We generalize the binomial coefficient $\binom{n}{m}$ to real $0 \leq m \leq n$ by the formula $\binom{n}{m} := \frac{\Gamma(n+1)}{\Gamma(m+1)\Gamma(n-m+1)}$, where $\Gamma$ is the Gamma function (with $\gamma$ the Euler-Mascheroni constant). This is of course consistent with the usual definition of the binomial coefficient. Observe that the digamma function $\psi := \Gamma'/\Gamma$ is increasing on the positive reals, with $\psi'$ positive and decreasing, and $\psi''$ negative. For future reference we also observe the standard asymptotics and the Stirling approximation, valid for any $x \geq 1$; see e.g., [2, §6.1, 6.3, 6.4]. One could also extend these functions meromorphically to the entire complex plane, but we will not need to do so here.

From the increasing nature of $\psi$ we see that $n \mapsto \binom{n}{m}$ is strictly increasing on $[m, +\infty)$ for fixed real $m > 0$, and from Stirling's approximation (2.4) we see that it goes to infinity as $n \to \infty$. Thus for given $t > 1$, we see from the inverse function theorem that there exists a unique smooth function $f_t : (0, +\infty) \to \mathbb{R}$ with $f_t(m) \geq m$ such that (2.5) $\binom{f_t(m)}{m} = t$ for all $m > 0$. In particular, the equation (1.1) holds for given integers $1 \leq m \leq n$ and $t \geq 2$ if and only if $n = f_t(m)$. This function $f_t$ was analyzed by Kane [14], who among other things was able to extend $f_t$ holomorphically to a certain sector, which then allowed him to estimate high derivatives of this function. However, for our analysis we will only need to control the first few derivatives of $f_t$, which can be estimated by hand:

In particular, $f_t$ is convex and decreasing in this regime.
The bound (2.6) can be viewed as a generalization of (1.6) to non-integer values of n, m, t.
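The implicit function $f_t$, defined by requiring the real-variable binomial coefficient $\Gamma(n+1)/(\Gamma(m+1)\Gamma(n-m+1))$ to equal $t$ at $n = f_t(m)$, can be evaluated numerically by bisection on the logarithm. The sketch below is an illustration under stated assumptions rather than anything used in the paper: the names `log_binom` and `f` are hypothetical, and double-precision `lgamma` limits accuracy when $n$ is astronomically larger than $m$.

```python
from math import lgamma, log

def log_binom(n, m):
    """log of the generalized binomial coefficient for real 0 <= m <= n."""
    return lgamma(n + 1) - lgamma(m + 1) - lgamma(n - m + 1)

def f(t, m, iters=200):
    """Approximate the unique n >= m with binom(n, m) = t, for t > 1 and m > 0.

    Uses that n -> binom(n, m) increases from 1 at n = m to infinity,
    so bisection on log_binom(n, m) - log(t) converges.
    """
    target = log(t)
    lo, hi = float(m), max(2.0 * m, 2.0)
    while log_binom(hi, m) < target:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if log_binom(mid, m) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(f(120, 3), f(3003, 5))        # close to 10 and 15
    print([round(f(10**12, m), 2) for m in range(3, 16)])
```

Integer solutions of the binomial equation correspond to integer points on the graph of $f_t$; the printed list for $t = 10^{12}$ illustrates the decreasing behaviour of $f_t$ in the range where $f_t(m) > 2m$.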
Proof. Taking logarithms in (2.5) we have Writing $n = f_t(m) \geq 2m$, we thus see from the mean value theorem that which implies that $n \asymp n - \theta m \asymp \exp\left(\frac{1}{m}(\log t + \log \Gamma(m + 1))\right)$ and the claim (2.6) then follows from Stirling's approximation (2.4). If we differentiate (2.9) we obtain In particular we obtain the first derivative formula From (2.2) and the mean value theorem we have We conclude that $-f_t'(m) \asymp \frac{n - 2m}{m} \log \frac{n}{m}$ and the claim (2.7) follows from (2.6).
From (2.1), (2.3), and the mean value theorem the first term is positive and comparable to m n 2 log 2 n m ; similarly, from (2.1), (2.2), and (2.12) the second term is positive and bounded above by O( m n 2 log n m ). The claim follows. To apply these derivative bounds, we use the following lemma that implicitly appears in [14], [15]: Lemma 2.2 (Small non-zero derivative implies few integer values). Let k ≥ 1 be a natural number, and suppose that f : I → R is a smooth function on an interval I of some length |I| such that one has the derivative bound for all x ∈ I. Then there are at most k integers m ∈ I for which f (m) is also an integer.
As an application of these bounds, we can locally control the number of solutions to (1.1) in the region $n^{1/2+\varepsilon} \leq m \leq n/2$, thus giving a version of Theorem 1.3 in a small interval:

Proof. From (1.7) and the hypothesis $n^{1/2+\varepsilon} \leq m \leq n/2$ we have for all $x \in I$. Since $m \geq n^{1/2+\varepsilon}$ and $t$ is sufficiently large depending on $\varepsilon$, $m$ is also sufficiently large depending on $\varepsilon$, and we have for all $x \in I$. Applying Lemma 2.2, there are at most two integers $m' \in I$ with $f_t(m')$ an integer. Since $m$ is already one of these integers, the claim follows.
The same method, using higher derivative estimates on $f_t$, also gives similar results (with weaker bounds on the number of solutions) for $m < n^{1/2+\varepsilon}$; see [14], [15]. However, we will only need to apply this method in the $m \geq n^{1/2+\varepsilon}$ regime here.
We are now ready to prove Proposition 1.10.
Proof of Proposition 1.10. Let $\varepsilon > 0$, let $t$ be sufficiently large depending on $\varepsilon$, and let $(n, m)$ be a solution to (1.1) in the region For brevity we allow all implied constants in the following arguments to depend on $\varepsilon$. Suppose $(n', m')$ is another solution in this region with $m' < m$, $n' > n$ and From (2.7) and convexity (and the bounds $m \ll \log t$ and $m - m' \geq 1$) we have From (1.7) we have $n \gg \log t$, hence $\log_2^{1-\varepsilon} t \ll \log^{1-\varepsilon} n$, and so for some constant $C > 0$, $m \geq n/\exp(C \log^{1-\varepsilon} n) \geq n^{9/10}$ (shrinking $\varepsilon$ slightly if necessary) if $t$ is sufficiently large depending on $\varepsilon$. The result now follows from Corollary 2.3.
It remains to establish Proposition 1.9. This will be the objective of the next two sections of the paper.

The distance bound
In this section we assume Proposition 1.12 and use it to establish Proposition 1.9. Throughout this section $0 < \varepsilon < 1$ will be fixed; we can assume it to be small. We may assume that $t$ is sufficiently large depending on $\varepsilon$, as the claim is trivial otherwise. We may assume that $m' < m$, hence also $n' > n$. We assume for the sake of contradiction that at least one of the claims (3.1), (3.2) is true, as the claim is trivial otherwise. This allows us to select a "good" scale: $|am + a'm' + bn + b'n'| \leq P^j/\log^{1000} P$ or $|am + a'm' + bn + b'n'| \geq P^j \log^{1000} P$.
(iii) (Separation) At least one of the statements $m - m' \geq P \log^{100} P$ and $m, m', n' - n \geq P \log^{100} P$ is true.
Proof. We restrict $P$ to be a power of two in the range $\exp(\log^{2/3+\varepsilon/2} n') \leq P \leq \exp(2\log^{2/3+\varepsilon/2} n')$; such a choice will automatically obey (i) since $n' > n > m > m'$ and (iii) since we assumed that either (3.1) or (3.2) holds. There are $\asymp \log^{2/3+\varepsilon/2} n'$ choices for $P$. Some of these will not obey (ii), but we can control the number of exceptions as follows. Firstly, observe that the conclusion (3.3) will hold unless $j = O(\log^{1/3} n')$, so we may restrict attention to this range of $j$. The number of possible tuples $(a, a', b, b', j)$ is then $O(\log^{4/100} P \log^{1/3} n')$. For each such tuple, we see from the restriction on $P$ that the number of $P$ with $P^j/\log^{1000} P < |am + a'm' + bn + b'n'| < P^j \log^{1000} P$ is at most $O(\log_2 n')$ (since $am + a'm' + bn + b'n'$ is of size $O((n')^2)$, say). Thus we see that the total number of $P$ which fail to obey (ii) is at most $O(\log^{4/100} P \log^{1/3} n' \log_2 n')$, which is negligible compared to the total number of choices, which is $\gg \log^{2/3+\varepsilon/2} n'$. Thus we can find a choice of $P$ which obeys all of (i), (ii), and (iii), giving the claim.
Henceforth we fix a scale $P$ obeying the properties in Lemma 3.1. We now introduce a relation $\approx$ on the reals by declaring $x \approx y$ if $|x - y| \leq P/\log^{1000} P$. Thus, by Lemma 3.1(ii), if $am + a'm' + bn + b'n' \not\approx 0$ for $a, a', b, b'$ as in Lemma 3.1(ii) then $|am + a'm' + bn + b'n'| \geq P \log^{1000} P$. Also, from Lemma 3.1(iii), at least one of the statements $m \not\approx m'$ and $m, m', n - n' \not\approx 0$ is true.
We introduce a random variable $p$, which is drawn uniformly from the primes in the interval $I := [P, P + P\log^{-100} P]$ (note that there is at least one such prime thanks to the prime number theorem). From (1.13) we surely have We can restrict attention to those $j$ with $j \leq \log^{1/2} P$, since the summands vanish otherwise. For any real number $N$, we may take covariances of both sides of this identity with the random variable $\{N/p\}$ to conclude that (3.4) for any real number $N$, where the covariances $c_j(N, M)$ are defined as We now compute these covariances:

The term $\frac{1}{12ab}$ appearing in Proposition 3.2(iii) is also the covariance between $\{nx\}$ and $\{mx\}$ for $x$ drawn randomly from the unit interval whenever $n, m$ are natural numbers with $an = bm$ for some coprime $a, b$; see [24, Section 2]. Indeed, both assertions are proven by the same Fourier-analytic argument, and Proposition 3.2 endows the linear span of the six functions $\{N/p\}$ for $N \in \{m, n, m - n, m', n', n' - m'\}$ with an inner product closely related to the norm $N()$ studied in [24], the structure of which is the key to obtaining a contradiction from our separation hypotheses on $n - n'$, $m - m'$.
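The continuous model mentioned here, in which $\{nx\}$ and $\{mx\}$ have covariance $\frac{1}{12ab}$ for $x$ uniform on the unit interval and $an = bm$ with $a, b$ coprime, can be verified exactly, since the product $\{nx\}\{mx\}$ is a quadratic polynomial between consecutive multiples of $1/n$ and $1/m$. The sketch below is an illustration only; the function name `cov_frac_parts` is hypothetical.

```python
from fractions import Fraction
from math import floor, gcd

def cov_frac_parts(n, m):
    """Exact covariance of {n x} and {m x} for x uniform on [0, 1].

    On each interval [u, v] between consecutive breakpoints we have
    {n x} = n x - i and {m x} = m x - j with i = floor(n u), j = floor(m u),
    so the product integrates in closed form.
    """
    pts = sorted({Fraction(i, n) for i in range(n + 1)}
                 | {Fraction(j, m) for j in range(m + 1)})
    total = Fraction(0)
    for u, v in zip(pts, pts[1:]):
        i, j = floor(n * u), floor(m * u)
        total += (Fraction(n * m, 3) * (v ** 3 - u ** 3)
                  - Fraction(n * j + m * i, 2) * (v ** 2 - u ** 2)
                  + i * j * (v - u))
    return total - Fraction(1, 4)  # each fractional part has mean 1/2

if __name__ == "__main__":
    for n, m in [(1, 1), (1, 2), (2, 3), (4, 6), (5, 7)]:
        g = gcd(n, m)
        a, b = m // g, n // g          # coprime with a*n == b*m
        print(n, m, cov_frac_parts(n, m), Fraction(1, 12 * a * b))
```

The covariance is always positive and decays as the reduced ratio $n/m$ becomes more complicated, which mirrors the role played by highly commensurable pairs in the case analysis.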
Proof of Proposition 3.2 assuming Proposition 1.12. We first dispose of the easy case (ii). If $N \approx 0$, then $\{N/p\} \leq \log^{-1000} P$, and the claim follows from the triangle inequality; similarly if $M \approx 0$, or indeed if $M \leq P^j/\log^{1000} P$. Hence by Lemma 3.1(ii), we may from now on assume that $N \geq P \log^{1000} P$ and $M \geq P^j \log^{1000} P$.
To handle the remaining cases we use the truncated Fourier expansion that holds for any N 0 ≥ 1 (see e.g. [12, Formula (4.18)]). Our primary tool is Proposition 1.12. Note that, for t ∈ I, log t = log P + O(log −99 P ), so that together with the prime number theorem Proposition 1.12 implies that for any smooth Z 2 -periodic W : R 2 → C and that, for any M , N = O(exp(log 3/2−ε/2 P )), Applying (3.6) with W a suitable cutoff localized to the region {(x, y) : dist(x, Z) ≤ 2N Since N ≥ P log 1000 P , the first term on the right-hand side can be computed to be and a similar argument gives To prepare for the proofs of parts (i), (iii) and (iv), let us first show that, for 1 ≤ j ≤ log 1/2 P , we have We use the Fourier expansion (3.5) with N 0 = log 20 P . Averaging over p ∈ I and applying (3.9) to handle the first error term, we see that Ee m M p j + O(log −10 P ).
By the triangle inequality and (3.7), it suffices to show that, for every non-zero integer m = O(log 20 P ), Recalling that M ≥ P j log 1000 P , this estimate follows from a standard integration by parts (see e.g. [12,Lemma 8.9]). Similarly Furthermore, using similarly (3.5), (3.8), (3.9) and (3.7), we see that, whenever 1 ≤ N 0 ≤ log 20 P , (3.12) Now we are ready to prove (i), (iii), and (iv). Let us start with (i). In light of (3.10), (3.11) and (3.12) with N 0 = log 20 P , it suffices to show that (say), where a := nN/P and b := mM/P j . By hypothesis, we have |a|, |b| ≥ log 1000 P . Since 2 ≤ j ≤ log 1/2 P , the derivative a + jbs j−1 of the phase as + bs j is at least log 200 P outside of an interval of length at most O(log −200 P ), and (3.13) now follows from a standard integration by parts (see e.g. [12,Lemma 8.9]). This concludes the proof of (i).
Let us now turn to (iv). In light of (3.10), (3.11) and (3.12) with N 0 = log 1/500 P , it suffices to show that 1 |I| I e nN + mM t dt log −1/500 P whenever n, m = O(log 1/500 P ) are non-zero integers. From the hypothesis (iv) and Lemma 3.1(ii) (after factoring out any common multiple of n and m), we have |nN + mM | ≥ P log 1000 P . The claim (iv) now follows from integration by parts.
Finally we show (iii). In light of (3.10), (3.11) and (3.12) with N 0 = log 1/500 P , it suffices to show that − 0<|n|,|m|≤log 1/500 P 1 4π 2 mn Let us first consider those n, m = O(log 1/500 P ) for which nN + mM ≈ 0. By Lemma 3.1(ii) |nN + mM | ≥ P log 1000 P and similarly to case (iv), the contribution of such pairs (n, m) is acceptable. Consider now the case nN ≈ −mM for some non-zero integers n, m = O(log 1/500 P ). By assumption also aN ≈ bM for some co-prime positive integers a, b ≤ log 1/100 P . and hence by Lemma 3.1(ii) −amM ≈ bnM which contradicts the assumption M ≈ 0 unless (n, m) is a multiple of (a, −b). On the other hand if (n, m) is a multiple of (a, −b), then nN ≈ −mM by Lemma 3.1(ii).
Thus it remains to show that 0<|k|≤ log 1/500 P  We can now arrive at the desired contradiction by some case analysis (reminiscent of that in [24,25]) using the remaining portions of Proposition 3.2, as follows.
Case m ≈ 0. Applying (3.14) with N = m, we conclude from Proposition 3.2(ii) that Case m ≈ m and m ≈ 0. We apply (3.14) with N = m to conclude that On the other hand, if such co-prime integers exist, then am ≈ bn if and only if (a−b)m ≈ b(n − m ) and necessarily a > b, so that by Proposition 3.2(iii) we have in this case Since Proposition 3.2(iii) also gives c 1 (m , m ) ≥ 1/12 + O(δ), combining with (3.16) we obtain that On the other hand, since m ≈ m, we also have m ≈ n − m since n − m ≥ m > m . By Proposition 3.2(iii), (iv), we have which can be improved to (3.19) and we again contradict (3.18).
Case m ≈ m and m ≈ 0. By Lemma 3.1(iii), we must have n ≈ n . We apply (3.14) for N = n to obtain Since m ≈ m , we have by Proposition 3.2(iii), (iv) (using also Lemma 3.1(ii)) that c 1 (n, m) = c 1 (n, m )+O(δ). Proposition 3.2(iii) also gives c 1 (n, n) = 1/12+O(δ). Plugging these into (3.20) and rearranging, we obtain Since n ≈ n and m ≈ 0, we see from Proposition 3.2(iii), (iv) that which can be improved to Hence we can assume that 2(n − m) ≈ n and n ≈ 2n. But using m ≈ m and Lemma 3.1(ii) this implies that 2(n − m ) ≈ 3n, so that by (3.21) and Proposition 3.2(iii) we obtain c 1 (n, n − m) + c 1 (n, n ) = 1 12 + c 1 (n, n − m ) + O(δ) = 1 12 contradicting (3.22). for almost all real numbers x and some integers 1 ≤ m ≤ n/2, 1 ≤ m ≤ n /2 unless one has both m = m and n = n (this type of connection goes back to Landau [17, p. 116]). This latter fact is easily established by inspecting the jump discontinuities of both sides of (3.24), but it is also possible to establish it by computing the covariances of both sides of (3.24) with {N x} for various choices of N , and the arguments above can be viewed as an adaptation of this latter method.

Equidistribution
In this section we prove Proposition 1.12. Fix $\varepsilon, A$. We may assume that $P$ is sufficiently large depending on $\varepsilon, A$, as the claim is trivial otherwise. If we have $P^j \geq M \log^A P$ then we can replace in both parts of the proposition $\frac{M}{P^j}$ by $0$ with negligible error, so we may assume that either $M = 0$ or $P^j < M \log^A P$. In either event we may thus assume that $j \leq \log^{1/2} P$. Next, by partitioning $I$ into at most $\log^{100} P$ intervals of length at most $P \log^{-100} P$ and using the triangle inequality, it suffices (after suitable adjustment of $P$, $A$) to assume that $I \subset [P, P + P\log^{-100} P]$. In particular we have for all $t \in I$.

Let us first reduce Proposition 1.12(ii) to Proposition 1.12(i). We perform a Fourier expansion $W(x, y) = \sum_{n,m \in \mathbb{Z}} c_{n,m}\, e(nx + my)$

where by integration by parts the Fourier coefficients
By the triangle inequality, the contributions of those frequencies $n, m$ with $|n| + |m| \geq \log^{2A} P$ are then acceptable. By a further application of the triangle inequality, Proposition 1.12(ii) follows from showing that whenever $n, m$ are integers with $|n| + |m| \leq \log^{2A} P$. But this follows from Proposition 1.12(i) by adjusting the values of $\varepsilon, A, M, N$ suitably.

The proof of part (i) will use the standard tools of Vaughan's identity and Vinogradov's exponential sum estimates. We state a suitable form of the latter tool here:

Lemma 4.1 (Vinogradov's exponential sum estimate). Let $X \geq 2$, $F \geq X^4$, and $\alpha \geq 1$. Let $I \subset [X, 2X]$ be an interval. Let $f(x)$ be a smooth function on $I$ satisfying for all $t \in I$ for all integers $1 \leq r \leq 10 \log F/(\log X) + 1$. Assume further that where the implied constant is absolute.
Proof. This is essentially [12,Theorem 8.25] with minor modifications (the modification needed is that we only assume (4.2) for r in a certain range, not all integers r ≥ 1.).
Let R := 10 log F/(log X) , and as in [12, p. 217], let Let S f (I) denote the sum in (4.4). By Taylor's formula, for any q ≥ 1 we have We take V = X 1/4 in which case by (4.2) the error term is The term in the parenthesis is ≤ F X 3/4 F −10/4 ≤ 1. Using also (4.3) we see that (4.5) is X 1/2 which is in particular smaller than the right-hand side of (4.4). The sum q∈Q e(F n (q)) is precisely the one estimated in [12, pp. 217-225]. The only assumption needed of f in that argument is (4.2), and the only restriction on F and X there is F ≥ X 4 . Hence, we conclude that the lemma holds by following the analysis there verbatim.
We now apply this estimate to obtain an estimate for an exponential sum over integers.
Proposition 4.2 (Exponential sums over integers). Let $\varepsilon > 0$, $A \geq 1$, $X \geq 2$, $2 \leq j \ll \log^{1/2} X$, and let $N, M$ be real numbers with $N, M \ll \exp(O(\log^{3/2-\varepsilon} X))$. Let $I$ be an interval in $[X, X + X\log^{-100} X]$. Then for some absolute constant $c > 0$, where

Proof. We may assume without loss of generality that $A$ is sufficiently large, and $X$ is sufficiently large depending on $\varepsilon, A$. By hypothesis we have $F \ll \exp(O(\log^{3/2-\varepsilon} X))$. We may assume that $F \geq \log^{CA} X$ for a large absolute constant $C$, since the claim is trivial otherwise.
Let f : I → R denote the phase function Then for any r ≥ 1 and t ∈ I we have we conclude that M r = exp(O(r 2 log 2 X))M and |N | X + |M r | X j = exp(O(r 2 log 2 X))F. If |M r | ≤ |N |X j−1 /4 then from the triangle inequality and (4.1) we have Consider then the case |M r | > |N |X j−1 /4. We have the upper bound for all t ∈ I from the triangle inequality. Furthermore, since the function t → −1/t j−1 has derivative j/X j on I, we also have, for all t outside of an interval of length O(X log −2A X), the lower bound If we set α := log 4A X and A is sufficiently large, then we conclude from (4.7) and the bounds above that the estimate (4.2) holds for all 1 ≤ r ≤ log X and all t ∈ I outside the union of O(log X) intervals of length O(X log −2A X). The contribution of these exceptional intervals to (4.6) is negligible, and removing them splits I up into at most O(log X) subintervals, so by the triangle inequality it suffices to show that n∈I e N n + M n j ε,A X log −2A X for any subinterval I with the property that (4.2) holds for all t ∈ I and 1 ≤ r ≤ log X.
If F ≥ X 4 , we may apply Lemma 4.1 to conclude that for some absolute constant c > 0, and the claim follows. If instead F < X 4 , we can apply the Weyl inequality [12,Theorem 8.4] with k = 5 to conclude that for some absolute constant c > 0; since F ≥ log CA X, we obtain the claim by taking C large enough. Now we prove Proposition 1.12(i). We may assume without loss of generality that j ≥ 2, since for j = 1 we can absorb the M terms into the N term (and add a dummy term with M = 0 and j = 2, say). By summation by parts (see e.g. [19,Lemma 2.2]), and adjusting A as necessary, it suffices to show that for all intervals I ⊂ [P, P + P log −100 P ]. This is equivalent to where Λ is the von Mangoldt function, since the contribution of the prime powers is negligible. We introduce the quantity If F ≤ log CA P for some large absolute constant C > 0, then the total variation of the phase t → N t + M t j is O(log CA P ), and the claim readily follows from a further summation by parts (see e.g. [19,Lemma 2.2]) and the prime number theorem (with classical error term). Thus we may assume that (4.8) F > log CA P.
In this case, a change of variables t = P/s gives The derivative of the phase here is N/P + js j−1 M/P j which, once C is large enough, is ≥ log 10A P for all s ∈ P/I apart from an interval of length at most O(log −10A P ). Hence by partial integration we get that if C is large enough, so it remains to establish the bound n∈I e N n + M n j Λ(n) P log −10A P under the hypothesis (4.8). By Vaughan's identity in the form of [12,Proposition 13.4] (with y = z = P 1/3 ), followed by a shorter-than-dyadic decomposition, we can write Λ(n) = r≤R (α r * 1(n) + α r * log(n) + β r * γ r (n)) for n ∈ [P, 2P ], where * denotes Dirichlet convolution, and |α r (n)|, |α r (n)|, |β r (n)|, |γ r (n)| log P, 1 ≤ M r P 2/3 ; (the bound for the coefficients arising from Vaughan's identiy is log P since 1 * Λ = log). By the triangle inequality, it thus suffices to establish the Type I estimates for some constant c > 0, and the claim now follows from (4.8). Now we establish (4.11). We can assume that K r N r P , as the sum vanishes otherwise. By the triangle inequality, the left-hand side is bounded by log P By Proposition 4.2, we have for some absolute constant 0 < c < 1. Bounding γ r (n)γ r (n ) log 2 P and noting that for all n ∈ [N r , (1 + log −100 P )N r ], we obtain the claim (4.12) from (4.8). This completes the proof of Proposition 1.12.

Multiplicity of the falling factorial
In this section we establish Theorem 1.8. We first observe that if $1 \leq m \leq n$ solves (1.8) for some sufficiently large $t$, then by Stirling's formula. Hence we have an analogue of (1.5): and we obtain an analogue (5.3) $n \asymp t^{1/m} = \exp\left(\frac{\log t}{m}\right)$ of (1.6), (1.7). Next, we obtain the following analogue of Proposition 1.9. Furthermore, if (5.5) $\exp(\log^{2/3+\varepsilon}(n + n')) \leq m, m' \leq (n + n')^{2/3}$ for some $\varepsilon > 0$, then we additionally have for any $A > 0$.
Proof. We begin with (5.4). We follow the arguments from [1, Proof of Theorem 4]. Taking 2-valuations v 2 of both sides of (1.8) and using (1.11) we have The summands here vanish unless j ≤ log(n + n ). Writing x = x + O(1), we conclude that and (5.4) follows. Now we prove (5.6). Fix A, ε > 0. We may assume without loss of generality that m < m, so that n > n by (1.8). We may also assume t is sufficiently large depending on A, ε, as the claim is trivial otherwise; from (5.5) this also implies that m, m , n, n are sufficiently large depending on A, ε. Henceforth all implied constants are permitted to depend on A, ε. By  In particular m m and, combining (5.3) with (5.8) and (5.7), also n n t 1/m−1/m n . Hence from (5.5) we see that (5.9) n, n m 3/2 .
Also we have We perform a Fourier expansion where by integration by parts the Fourier coefficients obey the bounds Thus (5.14) can be rewritten as Now we adapt the analysis from Section 2. We extend the falling factorial $(n)_m$ to real $n \geq m \geq 0$ by the formula $(n)_m := \frac{\Gamma(n+1)}{\Gamma(n-m+1)}$.
From the increasing nature of the digamma function $\psi$ we see that for fixed $m$, $(n)_m$ increases from $\Gamma(m+2)$ to infinity as $n$ goes from $m+1$ to infinity. Applying the inverse function theorem, we conclude that for any sufficiently large $t$ there is a unique smooth function $g_t : \{m > 0 : \Gamma(m+2) \leq t\} \to \mathbb{R}$ such that for any $m > 0$ with $\Gamma(m+2) \leq t$, one has $g_t(m) \geq m$ and $(g_t(m))_m = t$. Indeed, one could simply set $g_t(m) := f_{t/\Gamma(m+1)}(m)$, where $f_t$ is the function studied in Section 2.
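The definition of $g_t$ can be explored numerically. The following is a minimal sketch, assuming double-precision arithmetic is adequate for the values of $t$ and $m$ of interest; the helper names `log_falling_factorial` and `g` are hypothetical and do not appear in the text. It extends $(n)_m$ to real arguments through the log-gamma function and inverts it in $n$ by bisection, which is justified by the monotonicity in $n$ noted above.

```python
import math


def log_falling_factorial(n: float, m: float) -> float:
    """Return log (n)_m = log Gamma(n+1) - log Gamma(n-m+1) for real n >= m >= 0."""
    return math.lgamma(n + 1.0) - math.lgamma(n - m + 1.0)


def g(t: float, m: float, iterations: int = 200) -> float:
    """Numerically solve (x)_m = t for x >= m + 1 (illustrative helper only).

    Assumes m > 0 and Gamma(m+2) <= t, matching the domain of g_t in the text.
    """
    target = math.log(t)
    lo = m + 1.0
    if log_falling_factorial(lo, m) > target:
        raise ValueError("need Gamma(m + 2) <= t")
    # Double the upper endpoint until it brackets the solution.
    hi = 2.0 * lo
    while log_falling_factorial(hi, m) < target:
        hi *= 2.0
    # Bisection is valid because (x)_m is increasing in x for fixed m.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if log_falling_factorial(mid, m) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, `g(720.0, 3.0)` should return a value very close to $10$, reflecting $10 \cdot 9 \cdot 8 = 720$.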
We have an analogue of Proposition 2.1: Proposition 5.2 (Estimates on the first few derivatives). Let $C > 1$, and let $t, m$ be sufficiently large depending on $C$ with $\Gamma(m+2) \leq t$. Then In the range $m \leq g_t(m)/2$, we have $\frac{\log t}{m^2}$ and in the range $m \leq g_t(m) - C \log^2 g_t(m)$, one has If we differentiate (5.22) we obtain (5.23) $g_t'(m) \psi(g_t(m) + 1) - (g_t'(m) - 1) \psi(g_t(m) - m + 1) = 0$.
Since $n/2 \geq n - m \geq C \log^2 n$, we have $(n - m) \log^3 \frac{n}{n-m} \gg C \log^5 n$ (as can be seen by checking the cases $n - m \leq \sqrt{n}$ and $n - m > \sqrt{n}$ separately), and the claim follows. Now we can establish Theorem 1.8. Let $C > 0$ be a large absolute constant, let $\varepsilon > 0$, and suppose that $t$ is sufficiently large depending on $\varepsilon, C$. Let $(n, m)$ be the integer solution to (1.8) in the region $\exp(\log^{2/3+\varepsilon} n) \leq m \leq n - 1$ with a maximal value of $m$; we may assume that such a solution exists, since we are done otherwise. If $(n', m')$ is any other solution in this region, then $m' < m$ and $n < n'$. Note that $n, n', m, m'$ are sufficiently large depending on $\varepsilon, C$. First suppose that $m \leq n^{1/2} \log^{10} n$. Here we will exploit the fact that $n$ grows rapidly as $m$ decreases. From Proposition 5. On the other hand, from (5.20) and the mean value theorem we have $n' - n = g_t(m') - g_t(m) \gg n \frac{\log t}{m^2} (m - m') \geq \frac{n}{m}$ thanks to (5.1) and the trivial bound $m - m' \geq 1$. Thus we have $\frac{n}{m} \ll \frac{m}{\log^{100} n}$ but this contradicts the hypothesis $m \leq n^{1/2} \log^{10} n$. Now suppose we are in the regime $n^{1/2} \log^{10} n < m \leq n - C \log^2 n$.
Here we will take advantage of the convexity properties of $g_t$. From The right-hand side is at most $\exp(O(\log_2 t \log_3 t))$. This implies that $n' - n \ll \log_3 t$, since otherwise the left-hand side would be, for any $C \geq 1$, $\gg \left(\frac{n}{n - m + 1 + C \log_3 t}\right)^{C \log_3 t} \gg \exp\left(\frac{C}{2} \log_3 t \log_2 t\right)$ which contradicts the bound for the right-hand side when $C$ is sufficiently large.
In particular we have from the triangle inequality that $n - m, n' - m' \ll C \log_2^2 t$. Making the change of variables $\ell := n - m$, it now suffices to show that there are at most two integer solutions to the equation (5.28) $(n)_{n-\ell} = t$ in the regime $1 \leq \ell \ll C \log_2^2 t$. We write this equation (5.28) as $n! = t \ell!$ or equivalently $n = h_t(\ell)$ where $h_t(x) := \Gamma^{-1}(t \Gamma(x+1)) - 1$, and $\Gamma^{-1} : [1, +\infty) \to [2, +\infty)$ is the inverse of the gamma function. Here we will exploit the very slowly varying nature of $h_t$. From Stirling's formula we have $h_t(x) \asymp \frac{\log t}{\log_2 t}$ whenever $1 \leq x \ll C \log_2^2 t$. Taking the logarithmic derivative of the equation $\Gamma(h_t(x) + 1) = t \Gamma(x+1)$ we have $h_t'(x) \psi(h_t(x) + 1) = \psi(x+1)$. Hence by (2.1) $h_t'(x) \asymp \frac{\log x}{\log h_t(x)} \ll \frac{\log_3 t}{\log_2 t}$ in the regime $1 \leq x \ll C \log_2^2 t$. In particular, for two solutions $(n, \ell), (n', \ell')$ to (5.28) in this regime we have (5.29) $n' - n \ll \frac{\log_3 t}{\log_2 t} |\ell' - \ell|$.
For fixed $n$ there is at most one $\ell \geq 1$ solving (5.28). We conclude that for two distinct solutions $(n, \ell), (n', \ell')$ to (5.28) in this regime, we have $|n - n'| \geq 1$, and hence the separation $|\ell - \ell'| \gg \frac{\log_2 t}{\log_3 t}$.
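The uniqueness of $\ell$ for fixed $n$ can be seen in a small brute-force search. The sketch below is purely illustrative: the function name `solutions` and the cutoff parameter `max_n` are hypothetical, and the small value of $t$ used in the example lies far outside the asymptotic regime treated in the text. It assumes $t \geq 2$ and enumerates pairs $(n, \ell)$ with $1 \leq \ell < n \leq$ `max_n` and $n! = t\,\ell!$.

```python
def solutions(t: int, max_n: int) -> list[tuple[int, int]]:
    """List all (n, ell) with 1 <= ell < n <= max_n and n!/ell! == t.

    Assumes t >= 2. For each n the partial product n(n-1)...(ell+1) is
    strictly increasing as ell decreases, so at most one ell can match.
    """
    found = []
    for n in range(2, max_n + 1):
        prod, ell = 1, n
        # Multiply in the factors n, n-1, ... until the product reaches t
        # or ell has dropped to 1.
        while ell >= 2 and prod < t:
            prod *= ell
            ell -= 1
        if prod == t:
            found.append((n, ell))
    return found
```

With $t = 720$ and `max_n = 720` the search finds the pairs $(6, 1)$, $(10, 7)$ and $(720, 719)$, corresponding to $6!/1!$, $10 \cdot 9 \cdot 8$ and $720$ itself, with at most one $\ell$ for each $n$.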
Now suppose we have three solutions $(n_1, \ell_1), (n_2, \ell_2), (n_3, \ell_3)$ to (5.28) in this regime. We can order $\ell_1 < \ell_2 < \ell_3$, so that $n_1 < n_2 < n_3$. From the preceding discussion we have $\frac{\log_2 t}{\log_3 t} \ll \ell_2 - \ell_1, \ell_3 - \ell_2 \ll C \log_2^2 t$ and $1 \leq n_2 - n_1, n_3 - n_2 \ll C \log_2 t \log_3 t$. If $2^j$ is a power of 2 that divides an integer in $(n_1, n_2]$ as well as an integer in $(n_2, n_3]$, then we must therefore have $2^j \ll C \log_2 t \log_3 t$, so that $j \ll \log_3 t$. Thus, there must exist $i = 1, 2$ such that the interval $(n_i, n_{i+1}]$ only contains multiples of $2^j$ when $j \ll \log_3 t$. Fix this $i$. Taking 2-adic valuations of (5.28) using (1.11) we have $\sum_{j=1}^\infty \left\lfloor \frac{n_i}{2^j} \right\rfloor = v_2(t) + \sum_{j=1}^\infty \left\lfloor \frac{\ell_i}{2^j} \right\rfloor$
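The displayed identity is an instance of Legendre's formula $v_2(n!) = \sum_{j \geq 1} \lfloor n/2^j \rfloor$, which is presumably what (1.11) records; that identification is an assumption here, since (1.11) is not reproduced in this section. A minimal sketch, with hypothetical helper names `v2` and `v2_factorial`, checks the identity on a small solution of $n! = t\,\ell!$.

```python
def v2(x: int) -> int:
    """Return the 2-adic valuation of a positive integer x."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    count = 0
    while x % 2 == 0:
        x //= 2
        count += 1
    return count


def v2_factorial(n: int) -> int:
    """Return v_2(n!) via Legendre's formula: the sum of floor(n / 2^j) over j >= 1."""
    total, power = 0, 2
    while power <= n:
        total += n // power
        power *= 2
    return total
```

For the small solution $10! = 720 \cdot 7!$ one has `v2_factorial(10) == 8`, `v2(720) == 4` and `v2_factorial(7) == 4`, in agreement with the identity.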