Some problems for the new year

Part new year resolution and part a birthday present to myself (and those audience members interested), I’ve decided to write up some problems I’ve been thinking about but either don’t have the time or the techniques/knowledge to tackle at the present time. Hopefully they will keep me motivated into 2016, as well as anyone else who’s interested in them. In no particular order:

1) Stewart’s Conjecture: I have already discussed this problem in two earlier posts (here and here). The conjecture is due to my advisor, Professor Cameron Stewart, in a paper from 1991. It asserts that there exists a positive number c such that for every binary form F(x,y) with integer coefficients, degree d \geq 3, and non-zero discriminant, there exists a positive number r_F, depending on F, such that for all integers h with |h| \geq r_F, the equation F(x,y) = h has at most c solutions. In particular, the value of c depends on neither F nor d. A weaker version of this conjecture asserts the existence of a positive number c_d for every degree d \geq 3 for which the above holds.

I suspect that Chabauty’s method, applied to the estimation of integer points on hyperelliptic curves, is close to being able to solve this problem; see this paper by Balakrishnan, Besser, and Muller. However, there may be other tools that work without involving a corresponding curve. That said, since a positive answer to Stewart’s conjecture would have a significant impact on the theory of rational points on hyperelliptic curves, it seems that the two problems are intrinsically intertwined.

2) Asymptotic Chen’s Theorem: This is related to a problem I’ve been thinking about lately. Chen’s theorem asserts that every sufficiently large even integer N can be written as the sum of a prime and a number which is the product of at most two primes. However, this simple statement hides the nature of the proof. The proof essentially depends on two parts, and (as far as I know) has not been improved on a fundamental level since Chen. The first is the very general Jurkat-Richert theorem, which can handle quite general sequences. Its input is some type of Bombieri-Vinogradov theorem, i.e., some type of positive level of distribution. It essentially churns out semi-primes of some order given a particular level of distribution. We will phrase the result slightly differently, in terms of the twin prime conjecture. Goldbach’s conjecture is quite related, and Chen actually proved the analogous statement for both the twin prime problem and Goldbach’s conjecture. Bombieri-Vinogradov provides the level 1/2, and with this level, the Jurkat-Richert theorem immediately yields that there exist infinitely many primes p such that p+2 is the product of at most three primes. Using this basic sieve mechanism and the Bombieri-Vinogradov theorem, it is impossible to breach the ‘three prime’ barrier. A higher level of distribution would do the trick, but so far, Bombieri-Vinogradov has not been improved in general (although Yitang Zhang‘s seminal work on bounded gaps between primes does provide an improvement in a special case). Thus, we require the second piece of the proof of Chen’s theorem, the most novel part of his proof. He was able to show that there aren’t too many primes p such that p+2 has exactly three prime factors, so few that the difference in number between those primes p where p+2 has at most three prime factors and those with exactly three prime factors can be detected. 
However, the estimation of these two quantities using sieves (Chen’s theorem does not introduce any technology that’s not directly related to sieves) produces terms with the same order of magnitude, so Chen’s approach destroys any hope of establishing an asymptotic formula for the number of primes p for which p+2 is the product of at most two primes. It would be a significant achievement to prove such an asymptotic formula, because it would mean that either the underlying sieve mechanism has been significantly improved, or some other non-sieve technology has been brought in successfully to tackle the problem. In either case, it would be quite the thing to behold.
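
To get a feel for the quantity whose asymptotics are in question, here is a minimal Python sketch that counts, by brute force, the primes p \leq x for which p+2 has at most two prime factors (counted with multiplicity). The function names are my own, and trial division is only sensible for small x; this illustrates the counting function, not any sieve argument.

```python
def big_omega(n):
    """Number of prime factors of n >= 1, counted with multiplicity."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:
        count += 1  # whatever remains is a single prime factor
    return count


def count_chen_type_primes(x):
    """Count primes p <= x such that p + 2 has at most two prime factors."""
    return sum(
        1
        for p in range(2, x + 1)
        if big_omega(p) == 1 and big_omega(p + 2) <= 2
    )


if __name__ == "__main__":
    for x in (100, 1000, 10000):
        print(x, count_chen_type_primes(x))
```

Here big_omega(p) == 1 is just a primality test for p \geq 2, so the same helper handles both p and p+2.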

3) An interpolation between geometrically irreducible forms and decomposable forms: A celebrated theorem of Axel Thue states that for any binary form F(x,y) with integer coefficients, degree d \geq 3, and non-zero discriminant, and for any non-zero integer h, the equation F(x,y) = h has only finitely many solutions in integers x,y. Thue’s theorem is ineffective, meaning that the proof gives no upper bound for the size of the solutions, and hence no way to actually find them all; one only knows that there are finitely many. Thue’s theorem has been refined by many authors over the past century, with some of the sharpest results known today due to my advisor Cam Stewart and Shabnam Akhtari.
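
As a small illustration of what a Thue equation looks like in practice, the following Python sketch searches a box for integer solutions of F(x,y) = h. The helper names and the choice F(x,y) = x^3 + y^3, h = 1729 are mine, picked only for the example. Because of the ineffectivity just mentioned, a box search on its own never proves that the list is complete; for this particular form one can check completeness by hand using the factorization x^3 + y^3 = (x+y)(x^2 - xy + y^2).

```python
def thue_solutions_in_box(F, h, bound):
    """Return all integer pairs (x, y) with |x|, |y| <= bound and F(x, y) == h."""
    return [
        (x, y)
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if F(x, y) == h
    ]


def sum_of_cubes(x, y):
    """Binary cubic form x^3 + y^3, which has non-zero discriminant."""
    return x ** 3 + y ** 3


if __name__ == "__main__":
    print(thue_solutions_in_box(sum_of_cubes, 1729, 50))
```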

If one wishes to generalize Thue’s theorem to higher dimensions, then there are two obvious candidates. The more obvious one is to consider general homogeneous polynomials F(x_1, \cdots, x_n) in many variables. However, in this case Thue’s techniques do not generalize in an obvious way. Thue’s original argument reduced the problem to a diophantine approximation problem, i.e., to show that there are only finitely many rational numbers which are `very close’ to a given root of F. This exploits the fact that all binary forms can be factored into linear forms, a feature which is absent for general homogeneous polynomials in n \geq 3 variables. Thus, one needs to narrow the scope and instead consider decomposable forms, meaning homogeneous polynomials F(x_1, \cdots, x_n) which can be factored into linear forms over \mathbb{C}, say. To this end, significant progress has been made. Most notably, Schmidt’s subspace theorem was motivated by this precise question. Schmidt, Evertse, and several others have worked over the years to establish results which are quite close to the case of Thue equations, though significant gaps remain, but that’s a separate issue and we omit further discussion.

The question I have is whether there is a way to close the gap between what can be proved about decomposable forms and what can be proved about general forms. The forms which are the most different from decomposable forms, which are essentially as degenerate as possible geometrically, are the ones that are the least degenerate; i.e., the geometrically irreducible forms. These are the forms that cannot be factored at all. Specifically, their lack of a factorization is not because their factorability is hidden by some arithmetic or algebraic obstruction, but because they are geometrically irreducible. Precisely, geometrically irreducible forms are those forms F(x_1, \cdots, x_n) which do not have factors of positive degree even over an algebraically closed field, say \mathbb{C}. For decomposable forms, a necessary condition is that the degree d exceeds the number of variables n, much like the condition d \geq 3 in the case of Thue’s theorem. However, absent from the case n = 2 is the possibility that there are forms of degree exceeding one which behave `almost’ like linear forms, in a concrete sense. By this I mean we can show that as long as basic local conditions are satisfied, the form represents all integers. This has been shown to be the case for forms whose degree is very small compared to the number of variables; the first such result is due to Birch, and has been improved steadily since then. Thus the interpolation I am wondering about is the following: let F(x_1, \cdots, x_n) be a homogeneous polynomial with integer coefficients and degree d \geq n+1, with no repeated factors of positive degree. Suppose that F factors, over \mathbb{C}, into forms of very small degree, say d' \ll \log n. Can we hope to establish finiteness results like we can for decomposable forms? This seems like a very interesting question.

If you are interested in any of these problems or if you have an idea as to how to approach any of them, please let me know!

Large sieve inequality

I am currently reading the book Opera de Cribro by John Friedlander and Henryk Iwaniec, and in particular studying the large sieve. One important thing to remember is that the “large sieve” is not really a sieve in the conventional sense. A ‘sieve’ typically refers to a choice of sieve weights; for example, a combinatorial sieve is usually some way of defining sieve weights \lambda_d in such a way that \lambda_d = \mu(d) for some positive integers d, while \lambda_d = 0 for the others. The large sieve does not involve a choice of sieve weights and, indeed, is usually independent of such choices (at least in its distilled form, the Bombieri-Vinogradov theorem).

The large sieve is actually just an inequality, which is not strictly number-theoretical. In fact, it applies equally well to any “well-spaced” points on the unit circle. The full force of this philosophy has recently been brought to bear on the Vinogradov Mean Value Theorem, in this paper. We write it in its most general form. We will adopt the convention e(x) = e^{2 \pi i x} and for a given sequence (a_n) of complex numbers, define S(\alpha) = \displaystyle \sum_{M < n \leq M+N} a_n e(\alpha n). Now suppose that \alpha_1, \cdots, \alpha_r are well-spaced real numbers with respect to some parameter \delta, meaning that for k \ne l, the number \alpha_k - \alpha_l is at least \delta away from an integer. We will write the distance of a real number \beta from an integer as \lVert \beta \rVert. In other words, we insist that if k \ne l, then \lVert \alpha_k - \alpha_l \rVert \geq \delta.

Moreover, it is clear that we can have at most \delta^{-1} many \alpha_j’s. From the Cauchy-Schwarz inequality, we see that \lvert S(\alpha) \rvert^2 \leq N \sum_{M < n \leq M+N} |a_n|^2. Therefore, any upper bound for the term

\displaystyle \sum_{j=1}^{r} \lvert S(\alpha_j)\rvert^2

must involve both N and \delta^{-1}. The remarkable thing is that this is enough! Indeed, Selberg proved the following sharp form of the large sieve inequality:

\displaystyle \sum_{j=1}^{r} \lvert S(\alpha_j) \rvert^2 \leq (N + \delta^{-1} -1) \sum_{M < n \leq M+N} |a_n|^2.

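This has the following striking number-theoretic interpretation. Consider all the rational numbers a/q with \gcd(a,q) = 1 and 1 \leq q \leq Q. Observe that any two such rationals a/q and a'/q' which are distinct modulo 1 differ by at least 1/Q^2 modulo 1, since \lVert a/q - a'/q' \rVert is a non-zero multiple of 1/(qq') and qq' \leq Q^2. In other words, these rationals are well-spaced with \delta = Q^{-2}. Then the large sieve inequality gives the following: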

\displaystyle \sum_{q \leq Q} \sum_{\substack{a \pmod{q} \\ \gcd(a,q) = 1}} \left \lvert S \left(\frac{a}{q}\right) \right \rvert^2 \leq \left(Q^2 + N - 1\right) \sum_n |a_n|^2.
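
For readers who like to see numbers, here is a minimal Python sketch that checks both claims on a small example: that the reduced fractions a/q with q \leq Q are Q^{-2}-spaced modulo 1, and that the two sides of the inequality above compare as stated. The function names and the sample sequence are my own choices, and a numerical check like this is of course only a sanity test, not a proof.

```python
import cmath
from fractions import Fraction
from math import gcd


def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def farey_points(Q):
    """Reduced fractions a/q with 1 <= a <= q <= Q (one representative mod 1)."""
    return [
        Fraction(a, q)
        for q in range(1, Q + 1)
        for a in range(1, q + 1)
        if gcd(a, q) == 1
    ]


def min_circular_gap(points):
    """Smallest distance modulo 1 between two distinct points (needs >= 2 points)."""
    gaps = []
    for i, s in enumerate(points):
        for t in points[i + 1:]:
            d = abs(s - t) % 1
            gaps.append(min(d, 1 - d))
    return min(gaps)


def large_sieve_sides(a, M, Q):
    """Return (lhs, rhs) of the large sieve inequality for a_n, M < n <= M + len(a)."""
    N = len(a)

    def S(alpha):
        return sum(a_n * e(float(alpha) * n) for n, a_n in enumerate(a, start=M + 1))

    lhs = sum(abs(S(alpha)) ** 2 for alpha in farey_points(Q))
    rhs = (Q * Q + N - 1) * sum(abs(a_n) ** 2 for a_n in a)
    return lhs, rhs


if __name__ == "__main__":
    Q = 6
    a = [(-1) ** n * (n % 4 + 1) for n in range(30)]  # arbitrary sample sequence
    print(min_circular_gap(farey_points(Q)), Fraction(1, Q * Q))
    print(large_sieve_sides(a, M=0, Q=Q))
```

Exact rational arithmetic is used for the spacing check so that the comparison with 1/Q^2 is not affected by rounding; the exponential sums themselves are computed in floating point.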

There are striking consequences to this inequality, including the famous theorem of Linnik.

Notes on the Oxford IUT workshop by Brian Conrad


Brian Conrad is a math professor at Stanford and was one of the participants at the Oxford workshop on Mochizuki’s work on the ABC Conjecture. He is an expert in arithmetic geometry, a subfield of number theory which provides geometric formulations of the ABC Conjecture (the viewpoint studied in Mochizuki’s work).

Since he was asked by a variety of people for his thoughts about the workshop, Brian wrote the following summary. He hopes that a non-specialist may also learn something from these notes concerning the present situation. Forthcoming articles in Nature and Quanta on the workshop will be aimed at the general public. This writeup has the following structure:

  1. Background
  2. What has delayed wider understanding of the ideas?
  3. What is Inter-universal Teichmuller Theory (IUTT = IUT)?
  4. What happened at the conference?
  5. Audience frustration
  6. Concluding thoughts
  7. Technical appendix

1.  Background

The ABC Conjecture is one of the outstanding conjectures in number…


Solution to why that nonic is solvable

Previously, we claimed that for any quadruple a,b,c,d of rational integers, not all zero, the nonic polynomial

F(x) = x^9 + a x^8 + b x^7 + c x^6 + d x^5 - (126 - 56 a + 21 b - 6 c + d)x^4

- (84 - 28 a + 7 b - c)x^3 - (36 - 8a + c)x^2 - (9 - a)x - 1

is solvable, meaning that it is possible to express the roots of F explicitly by radicals. By Galois theory, this is equivalent to the assertion that the Galois group of the splitting field of F over \mathbb{Q} is a solvable group.

To do this, we need the following fact, which was proved by Bhargava and Yang in this paper as Theorem 4. The statement of their theorem is correct, and the proof is mostly correct, but there is a minor issue. The problem is that the stabilizer of F under \text{GL}_2(\mathbb{C}), which we will denote by \text{Aut}_\mathbb{C} F, need not be realizable as a subgroup of the Galois group of F, which we will denote by \text{Gal} (F). However, the argument they gave for the commuting action between elements of \text{Aut}_\mathbb{C} F and \text{Gal}(F) is correct.

We now consider a binary form F of the shape given above. We see that both \text{Aut}_\mathbb{C} F and \text{Gal}(F) act on the roots of F and can therefore be embedded via their action on the roots of F into S_9, the symmetric group on nine letters. If we restrict to \text{GL}_2(\mathbb{Q}) action and denote by \text{Aut} F = \text{Aut}_\mathbb{Q} F, then it follows from Galois theory that \text{Gal} (F) must be a subgroup of the centralizer of any element in \text{Aut} F in S_9.

We then check that, miraculously, \text{Aut} F always contains the following element of order 3:

U = \begin{pmatrix} 0 & 1 \\ -1 & -1 \end{pmatrix}.

We check that the only complex numbers fixed by this element are the roots of x^2 + x + 1. Therefore, if F is irreducible, then no root of F can be fixed by U. Relabelling the roots if necessary, we can assume that U can be realized in S_9 as

U = (123)(456)(789).

The centralizer C(U) of U is the stabilizer of U under the action by conjugation of S_9 on itself. The orbit of U is precisely the set of elements in S_9 of the same cycle type, therefore the orbit contains

\displaystyle \binom{9}{3}(2!) \binom{6}{3}(2!) (2!) \frac{1}{3!} = 2240.

By the orbit-stabilizer theorem, it follows that C(U) contains 9!/2240 = 162 = 2 \times 3^4 elements. Since \text{Gal} (F) is contained in C(U), its order divides 162, and so it is solvable by Burnside’s theorem, which asserts that any finite group whose order is divisible by at most two distinct primes is solvable.
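
The counting in the last step is easy to double-check by machine. The following Python sketch (function names are mine) uses the standard formula that a permutation with m_k cycles of length k has centralizer of order \prod_k k^{m_k} m_k!, and also brute-forces the analogous count in S_6, where enumerating all permutations is cheap.

```python
from itertools import permutations
from math import comb, factorial


def centralizer_order(cycle_type):
    """Order of the centralizer in S_n of a permutation with the given cycle
    lengths (fixed points must be listed as cycles of length 1)."""
    order = 1
    for k in set(cycle_type):
        m = cycle_type.count(k)
        order *= k ** m * factorial(m)
    return order


def brute_force_centralizer_order(sigma):
    """Count permutations tau of {0, ..., n-1} with tau*sigma == sigma*tau,
    where sigma is given as a tuple (sigma[i] is the image of i)."""
    n = len(sigma)
    return sum(
        1
        for tau in permutations(range(n))
        if all(tau[sigma[i]] == sigma[tau[i]] for i in range(n))
    )


if __name__ == "__main__":
    c = centralizer_order((3, 3, 3))
    print(c, factorial(9) // c)  # centralizer order and conjugacy class size in S_9
    # Same count via the binomial expression used in the text.
    print(comb(9, 3) * 2 * comb(6, 3) * 2 * 2 // factorial(3))
    # Sanity check of the formula in S_6 for (0 1 2)(3 4 5).
    print(brute_force_centralizer_order((1, 2, 0, 4, 5, 3)), centralizer_order((3, 3)))
```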


Mathematicians have always been willing to accept new ideas

In a recent publication (see here) of a popular internet comic strip that I like, the author poked fun at the supposed notion that mathematicians are intransigent and stubborn, failing to accept new ideas in a timely fashion (this is not merely an outside opinion; there are some insiders who feel the same way… quite strongly in fact. See Doron Zeilberger’s opinion page for instance). However, as someone who is about to get a PhD in mathematics and who is an amateur mathematics historian, I would like to voice my polite disagreement with Mr. Weinersmith’s premise.

This is the message I posted on the comic’s Facebook page:

“As someone about to get a PHD in mathematics, I can attest that the basic premise of this comic is wrong. Mathematicians have always been much quicker to accept new advances and shifts in paradigm faster than their contemporaries in other fields. The only times when acceptance of new results, even paradigm shifting ones, were slow to be accepted by the mathematical community are those where the result was poorly written or poorly presented (for example, Cantor’s work on cardinality, Brun’s sieve theory, Ramanujan’s work before Hardy, Heegner’s solution of Gauss’s class number one problem, and most recently and still unresolved: Shinichi Mochizuki’s purported proof of the abc conjecture)

Edit: to give a positive example, consider the proofs of Fermat’s Last Theorem and more recently, the Poincare conjecture. These two are considered two of the most difficult mathematical problems in history, and when their solutions were presented, it took only a few years for the mathematical community to verify and accept their correctness. Even more recently, Yitang Zhang’s manuscript containing the proof of the existence of infinitely many primes which are within a bounded distance from each other was accepted in JUST THREE WEEKS by one of mathematics’ top journals, even though Zhang was at the time completely unknown and in particular was not known to have done any work in number theory.”

I would like to elaborate even further on my comments. Not only are mathematicians not intransigent as suggested in the comic, they are likely to be the group in academia which is the most willing to share their ideas and accept other people’s ideas (this is a broad stroke; there are certainly many people who arbitrarily dismiss people’s work, as anyone who has faced a grouchy referee when submitting a paper can attest). The lightning fast acceptance of the two big advances on the bounded gap problem should serve as a testament to this. Both of the main players, Yitang Zhang and James Maynard, were at the time more or less completely unknown. Their ‘lowly’ status did not prevent their work from being recognized, almost instantly in fact, by some of the biggest experts in the field (including Andrew Granville and Terence Tao). This seems unlikely in many other areas, especially as one gets further away from pure science.

This is not to say that mathematicians are just more progressive and forward-thinking in general. Social attitudes among mathematicians, while probably better than those of the general population, are certainly not stellar, as a recent paper by Greg Martin points out.

A solvable nonic polynomial

Continuing from our demonstration that a certain sextic polynomial has an explicit factorization, even though sextic polynomials are not solvable in general, we go on to describe how a class of degree 9 polynomials is solvable. Let a,b,c,d be rational integers, not all zero, and consider the nonic polynomial

F(x) = x^9 + a x^8 + b x^7 + c x^6 + d x^5 - (126 - 56 a + 21 b - 6 c + d)x^4

- (84 - 28 a + 7 b - c)x^3 - (36 - 8a + c)x^2 - (9 - a)x - 1.

The claim is that all such polynomials are in fact solvable!

I will reveal the argument a little later, but it’ll be interesting to see what kind of arguments readers can come up with.