Counter-intuitions of divergent series

A common mistake when it comes to understanding series, motivated by highschool-esque and contest mathematics type thought process, is to think that the behaviour of the series is dominated by the first (finitely many) terms. For example, we think of the series 1 + 2 + 3 + \cdots as ‘divergent’ because the first few terms seem to add up really quickly.

This intuition is challenged with the classic `smallest’ (note: there is no such thing as an actual smallest divergent series) divergent series: the harmonic series

\displaystyle \sum_{n=1}^\infty \frac{1}{n}.

It is challenging because it just `barely’ diverges. If we replace the exponent 1 of n by anything bigger, we will end up with a convergent series. Further, it doesn’t look divergent because the first few terms don’t add up very quickly.

The following really challenges our intuition: if we delete every positive integer n that contains the number 9 in their usual decimal representation, and then add up their reciprocals, we get something convergent! This seems ludicrous because of we try to add up the first 100 positive integers, sans those that need to be omitted, it seems to barely make a difference. However, if we think of the `later’ parts of the series, the argument becomes clear. For each positive integer n, let us consider the integers 10^n \leq k < 10^{n+1}. There are 8 admissible choices (namely 1,2,3,4,5,6,7,8) for the first digit, and 9 choices for each subsequent digit, for a total of 8 \cdot 9^n choices. Each denominator is of size at least 10^n, hence

\displaystyle \sum_{\substack{10^n \leq k < 10^{n+1} \\ k \text{ does not contain } 9 \text{ in its decimal representation}}} \frac{1}{k} < \frac{8 \cdot 9^n}{10^n}.

Let \sideset{}{'}\sum denote summation over positive integers that do not contain 9 in their decimal representation. Then, it follows that

\displaystyle \sideset{}{'} \sum_{m \geq 10} \frac{1}{m} \leq \sum_{n=1}^\infty \sideset{}{'} \sum_{10^n \leq k < 10^{n+1}} \frac{1}{k} \leq \sum_{n=1}^\infty \frac{8 \cdot 9^n}{10^n} < \infty.

Therefore, it is the tail of the series that determines its behaviour with respect to convergence, not the front part! This is a particularly vexing thing to emphasize to students who are trained to think in the fashion (and only in the fashion) “do a few computations, make a conjecture, then prove conjecture using contradiction or induction”.

“Fuel” and “Engines” in mathematics

Note: First blog post in a while, been a crazy few months.

There are many analogies mathematicians use to try to explain the abstract realm they work in to laymen of various levels. For example, Freeman Dyson had tried to categorize mathematicians as “frogs” or “birds” and Tim Gowers tried to categorize them as “problem solvers” or “theory crafters”. In a post on MathOverflow, I tried to categorize certain theorems as ‘workhorses’.

Another useful way I recently thought of to think about various theorems and technique in mathematics is to think of them as “fuel” or “engines”. Fuel and engines are both parts of “machines”, which comprise theories and programs in mathematics. Fuel is very versatile; an improvement in fuel will almost certainly mean that all engines which can use said fuel can operate better. Engines are more specific; they are designed to perform a specific function. A car engine for example is useless if you try to make it chop vegetables, unless you design an entire system which chops vegetables while being powered by a car engine.

This is a good way to explain just how the approaches Yitang Zhang and James Maynard differ. One can think of the original sieve mechanism devised by Goldston, Pintz, and Yildirim, now known as the GPY-sieve. In a sense they constructed a ‘machine’ which allows us to turn churn out small gaps between primes. The engine of the machine is the GPY sieve mechanism; while the fuel (or the raw input) is the Bombieri-Vinogradov theorem. The machine works quite efficiently in that it produces gaps that are arbitrarily small compared to the average gap; but the engine just can’t put out enough power to establish bounded gaps between primes with the current quality of the fuel. This situation was thought to be as far as one can go back in 2013.

Yitang Zhang came along and showed that while it is not possible to develop an improved ‘all purpose’ fuel, i.e. an improvement on the level of distribution present in the Bombieri-Vinogradov theorem, one can devise a specialized and more efficient fuel that works specifically for the GPY engine. This extra boost in power was sufficient to establish bounded gaps.

Maynard, however, showed that one can rip out the GPY engine and replace it with a much more efficient and powerful engine that requires much less fuel and can output much more power. This is Maynard’s multi-dimensional sieve; something thought to be impossible previously.

The problem is that Maynard’s engine doesn’t run on Zhang’s fuel; because the latter is specialized for the GYP engine. Modifying the Maynard engine so that it runs on the Zhang fuel would be the next step forward towards the bounded gap problem.

Bhargavaology Learning Seminar First Talk – 12 Jan 2015

I am doing a learning seminar on ‘Bhargavaology’, which is the collection of techniques and results surrounding the recent work of Manjul Bhargava, a Fields Medalist in 2014. The introductory talk is scheduled on Monday, Jan 12 in MC 5413, from 2:30 PM to 3:30 PM. The abstract is indicated below:

Title: Introduction to Bhargavaology
Speaker: Stanley Yao Xiao
Abstract: In this introductory talk I aim to give a sampling of the vast web of important theorems related to Manjul Bhargava, one of the four Fields Medalists of 2014, and describe their importance in the context of modern number theory. There are no technical details in the talk, making it very accessible.

A cute problem that illustrates the power of integers

The following problem was introduced to me as a ‘homework problem’ for the child of a certain genius-level mathematician, which is likely apocryphal. Nevertheless, the problem itself is interesting.

Problem: Three farmers brought 10, 16, and 26 chickens to the market to sell. To avoid a price war, they agreed to sell each chicken for the same price. By midday, none of them had sold all of their chickens. They then agreed to lower their prices by the same amount in order to unload all of their chickens. By the end of the day, each farmer sold all of the chickens they brought to the market and each made 35 dollars. What is the difference of the morning price and the afternoon price?

This question seems like it does not provide enough information. Indeed it does not, if one fails to consider that certain quantities (namely the amount of chickens) are integers. To wit, let us establish the solution as follows.

Let x,y,z respectively denote the number of chickens sold by farmer 1, 2, and 3 in the morning. Note that x,y,z are necessarily integers. Let C denote the price of a chicken in the morning, and C' denote the price of a chicken in the afternoon. Let D = C'/(C - C'). We then obtain the following equations:

Cx + (10 - x)C' = 35,

Cy + (16 - y)C' = 35, and

Cz + (26-z)C' = 35.

Subtracting the third equation from the first and second and re-arranging, we obtain

x - z = 16D, y - z = 10D.

Now the fact that x,y,z are positive integers really come in handy. In particular, we must have 16D, 10D are both integers. The first condition implies that the denominator of D must be a power of 2, and the second implies that it must in fact be exactly 2. Now we write D = G/2, where G is a positive odd integer. Now we examine the equation

x = 8G + z.

Note that since Farmer 1 did not sell all of his chickens in the morning, we must have

1 \leq x \leq 9.

However, G \geq 1 and z \geq 1, whence 8G + z \geq 9, and so x = 9 is the only possibility. Thus, z = 1 and G = 1. This shows that

y = 5G + 1 = 6.

With this information in hand, we go back to examine the equation

9C + C' = 35, 6C + 10C' = 35.

Substituting, we see that C = 15/4, C' = 5/4. So it would appear that C - C' = 15/4 - 5/4 = 5/2.

But wait! We had only shown that the values of C, C' above satisfy two of the three required equations, so for due diligence we must examine the final equation

C + 25C' = 35.


15/4 + 25(5/4) = 140/4 = 35,

so indeed C = 15/4, C' = 5/4 is an acceptable solution.

Note that the solution really required the fact that x,y,z were positive integers and constrained in a specific way. Without these (seemingly unimportant) tidbits of information, the problem would be intractable.

An innocent linear algebra problem and an application of the probabilistic method

The following question was posed to me by a friend:

Suppose that A \subset M_{n \times n}(\mathbb{R}) is a symmetric matrix with non-negative trace, and B \subset M_{n \times n}(\mathbb{R}) is a symmetric matrix with negative trace. Prove that there exists a vector \textbf{w} \in \mathbb{R}^n such that \textbf{w}^T A \textbf{w} \geq 0 and \textbf{w}^T B \textbf{w} < 0.

After fruitlessly trying to give an explicit construction and trying to work out an inductive argument, the hint provided by my friend proved to be ingenious: one should consider a probabilistic argument.

In the end, one can argue that the probabilistic method is not really needed as the argument used below is equivalent to a counting argument. Nevertheless, we give the following proof.

First, we may suppose that A is diagonalized, as the matrix P used to diagonalize A when applied to B gives PBP^{-1}, which remains a symmetric matrix with negative trace as trace is preserved under similarity. Now consider the random vector \textbf{x} = (x_1, \cdots, x_n) where the x_i are independent Bernoulli random variables with mean zero and variance 1. It is immediate that for any such vector \textbf{x}, we have

\textbf{x} A \textbf{x}^T \geq 0.

Now, we compute the expectation of \textbf{x} B \textbf{x}^T. By the linearity of expectation and the fact that the x_i‘s are independent, we have

\displaystyle E(\textbf{x} B \textbf{x}^T) = \sum_{j=1}^n b_{jj} \text{Var}(x_j) = \text{Tr}(B) < 0.

Hence there exists one particular vector \textbf{x}' for which $\textbf{x}’ B (\textbf{x}’)^T < 0$, and we are done.

As remarked earlier, this argument is equivalent to considering the set of vectors \textbf{x} = (\pm 1, \cdots, \pm1) and adding all of them and concluding that at least one of them must give the desired result.