Asymptotics of counting integers by prime signature


The prime counting function $\pi(n)$ which counts the number of primes up to $n$ is well-known, and it's also fairly well-known that using a well-optimised implementation of the Meissel-Lehmer algorithm it can be calculated in $\tilde O(n^{2/3})$ time. What about numbers of other forms?

To be specific, define the prime signature $\textrm{sig}(x)$ as the multiset of the non-zero exponents in the prime factorisation of $x$, and define $\pi_s(n)$ as the number of integers up to $n$ with prime signature $s$ (so that the prime counting function is $\pi_{\{1\}}$). What can we say about how quickly we can calculate $\pi_s(n)$?

Why should this post be closed?


1 answer


Let $\alpha$ be the smallest exponent such that we know how to calculate $\pi(n)$ in time $\tilde O(n^\alpha)$. Courtesy of Deléglise and Rivat, who removed the $+ \epsilon$ from Lagarias, Miller and Odlyzko's bound, we know that $\alpha \le \tfrac23$, but I'm going to work in terms of $\alpha$ because (a) I'm not aware that it has been lower-bounded; (b) it saves having to update this post if anyone finds a better upper bound; (c) it makes the algebra simpler.

Obviously if the signature has only one element then $\pi_{\{a\}}(n) = \pi(\lfloor n^{1/a}\rfloor)$ and can be calculated in $\tilde O(n^{\alpha/a})$ time.

Otherwise let $s = \{a_1, a_2, \ldots, a_k\}$; we sum over all primes $p$ the value of $\pi_{\{a_2, \ldots, a_k\}}(\lfloor n / p^{a_1}\rfloor)$, exclude numbers which don't have distinct primes by subtracting $\pi_{s'}(n)$ for each distinct $s'$ formed by merging $a_1$ with some other $a_i$, and correct for double-counting of permutations of the primes by dividing by the frequency of $a_1$ in the signature.

The core sum here is $S = \sum_{p \in \mathbb{P}} \pi_{\{a_2, \ldots, a_k\}}(\lfloor n / p^{a_1}\rfloor)$. (If that isn't obvious, you can take the following as a template for a fixed value of $k$ and then induct over $k$). Using standard tricks, and leaving $t$ for the time being as a variable, \begin{eqnarray} S &=& \sum_{p \in \mathbb{P}} \left[\left\lfloor \tfrac{n}{p^{a_1}} \right\rfloor > \left\lfloor \tfrac{n}{t} \right\rfloor \right] \pi_{\ldots}\left(\left\lfloor \tfrac{n}{p^{a_1}} \right\rfloor\right) + \sum_{j=1}^{\lfloor n/t \rfloor} \pi_{\ldots}(j) \sum_{p \in \mathbb{P}} \left[j = \left\lfloor \tfrac{n}{p^{a_1}} \right\rfloor \right] \\ &=& \sum_{p \in \mathbb{P}} \left[\left\lfloor \tfrac{n}{p^{a_1}} \right\rfloor > \left\lfloor \tfrac{n}{t} \right\rfloor \right] \pi_{\ldots}\left(\left\lfloor \tfrac{n}{p^{a_1}} \right\rfloor\right) + \sum_{j=1}^{\lfloor n/t \rfloor} \pi_{\ldots}(j) \left(\pi(\lfloor \tfrac{n}{j} \rfloor^{1/a_1}) - \pi(\lfloor \tfrac{n}{j+1} \rfloor^{1/a_1})\right) \\ \end{eqnarray}

If we let $t = n^b$ and assume that evaluating $\pi_{\{a_2, \ldots, a_k\}}(m)$ is in $\tilde O(m^c)$ then our main sum takes $\tilde O(t^{1/a_1})$ to sieve primes, $\tilde O(\sum_{p \in \mathbb{P}, p^{a_1} < t} (n / p^{a_1})^c)$ for recursive calls in the first sum, $\tilde O(\sum_{j=1}^{n/t} j^c)$ for recursive calls in the second sum, and $\tilde O(\sum_{j=1}^{n/t} (n/j)^{\alpha/a_1})$ for calls to $\pi$ in the second sum. Since the density of primes is within a logarithmic factor of constant, the bound for recursive calls in the first sum doesn't lose anything by being simplified to $\tilde O(\sum_{p=1}^{t^{1/a_1}} (n / p^{a_1})^c)$.

Thus our asymptotic cost is $\tilde O(n^{\beta})$ where \begin{eqnarray} \beta &=& \max(\tfrac{b}{a_1}, c + \tfrac{b}{a_1}(1 - a_1 c), (1-b)(1+c), \tfrac{\alpha}{a_1} + (1-b)(1 - \tfrac{\alpha}{a_1})) \\ &=& \max(\tfrac{b}{a_1}, c(1 - b) + \tfrac{b}{a_1}, (1-b)(1+c), 1 - b + \tfrac{\alpha b}{a_1}) \\ % &=& \max(c(1 - b) + \tfrac{b}{a_1}, (1-b)(1+c), 1 - b + \tfrac{\alpha b}{a_1}) \\ &=& \max(c + b(\tfrac{1}{a_1} - c), (1+c) - (1+c)b, 1 + b(\tfrac{\alpha}{a_1} - 1)) \end{eqnarray} and we want to choose $b$ to minimise it. If $a_1 c \ge 1$ (i.e. if $\tfrac{1}{a_1} - c \le 0$) then all of the lines have non-positive gradients, so we minimise them by setting $b = 1$ and we get an overall cost of $\beta = \max(\tfrac{1}{a_1}, 0, \tfrac{\alpha}{a_1}) = \tfrac{1}{a_1}$.

Otherwise the minimum value of $\beta$ will be found at an intersection of one of the falling lines with the rising one; case analysis reveals that if $c < \alpha$ we want $b = \tfrac{(1 - c)a_1}{(1 - c)a_1 + 1 - \alpha}$ for $\beta = \tfrac{1 - \alpha c}{(1 - c)a_1 + 1 - \alpha}$; otherwise we want $b = \tfrac{a_1}{a_1 + 1}$ for $\beta = \tfrac{c+1}{a_1 + 1}$.

Observe that the very best value of $\beta$ is obtained when $c=0$ and is $\beta_0 = \tfrac{1}{a_1 + 1 - \alpha}$. Thus $\tfrac{1}{a_1 + 1} < \beta \le \tfrac{1}{a_1}$, and so if we order the prime signature descending then we can evaluate $\pi_s(n)$ in time $\tilde O(n^{1/\max(s)})$.


You may want to split the long formulas to multiple lines. ‭tommi‭ 29 days ago

Sign up to answer this question »