An elementary statistics problem from Hearthstone

In Hearthstone, a popular online card game, there is a minion called C’Thun which, when played, deals damage equal to its attack value, one point at a time, to randomly chosen enemy characters. This prompted a popular streamer named Trump (not the insufferable presidential candidate) to ask the following question on his channel: when playing a C’Thun with 12 attack (hence, it will deal 12 damage, one point at a time), what is the probability that it kills a 6-health minion, with no other minions on the board?

For those who don’t know how Hearthstone works, one can think of the question as follows: suppose you flip a coin 12 times in a row. Each time you flip the coin, you record whether you got heads or tails. As long as you have fewer than 6 heads, the coin is fair and you have equal probability of getting heads or tails on your next flip. However, as soon as you hit 6 heads, the coin becomes unfair and you can only get tails afterwards. The question that Trump (real name Jeffrey Shih) asks is then equivalent to: what is the probability that you get 6 heads?

To calculate this probability, we can ask the reverse question: what is the probability that we don’t get six heads? Until the sixth head appears, the process is identical to flipping an always-fair coin, so the probability of never reaching 6 heads is the same as the probability of getting at most five heads with a fair coin. The probability of getting at most five heads when flipping a fair coin 12 times is given by

\displaystyle 2^{-12} \sum_{k=0}^5 \binom{12}{k}.

The well-known binomial theorem tells us that

\displaystyle 2^{-12} \sum_{k=0}^{12} \binom{12}{k} = 1.

Combined with the symmetry \binom{12}{k} = \binom{12}{12-k}, this shows that the probability of getting at most 5 heads is equal to

\displaystyle \frac{1}{2} - 2^{-13} \binom{12}{6} = \frac{793}{2048}.
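Explicitly, the sum evaluates as

\displaystyle \sum_{k=0}^{5} \binom{12}{k} = \frac{2^{12} - \binom{12}{6}}{2} = \frac{4096 - 924}{2} = 1586, \qquad \frac{1586}{4096} = \frac{793}{2048}.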

Therefore, the probability of getting 6 heads in the original game is equal to

\displaystyle 1 - \frac{793}{2048} = \frac{1255}{2048},

or roughly 61.28\%. Thus a 12-attack C’Thun is more likely than not to kill a 6-health minion, rather than sending too many shots to the face.
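As a sanity check, the whole argument can be verified in a few lines of Python (the function name `kill_prob` is my own, not from the post): we compute the exact answer via the fair-coin reduction above, and independently by dynamic programming over the actual process, in which the minion’s death is an absorbing state.

```python
from fractions import Fraction
from math import comb

# Exact answer via the fair-coin argument:
# P(kill) = 1 - P(at most 5 heads in 12 fair flips).
p_kill = 1 - sum(comb(12, k) for k in range(6)) * Fraction(1, 2 ** 12)

def kill_prob(shots, health):
    """Probability the minion dies, modelling the real process:
    each shot is 50/50 until the minion is dead, after which every
    remaining shot goes to the face (the absorbing state)."""
    dist = [Fraction(0)] * (health + 1)  # dist[h] = P(minion has taken h damage)
    dist[0] = Fraction(1)
    for _ in range(shots):
        new = [Fraction(0)] * (health + 1)
        for h, p in enumerate(dist):
            if h == health:
                new[h] += p          # minion already dead: nothing changes
            else:
                new[h] += p / 2      # shot goes to the face
                new[h + 1] += p / 2  # shot hits the minion
        dist = new
    return dist[health]

assert kill_prob(12, 6) == p_kill == Fraction(1255, 2048)
print(p_kill, float(p_kill))  # 1255/2048 ≈ 0.6128
```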

Some thoughts after reading an article on Peter Scholze

Recently the following article appeared on my Facebook feed: https://www.quantamagazine.org/20160628-peter-scholze-arithmetic-geometry-profile/

Aside from the usual bland mix of praises sung for an archetypal ‘genius’, the article does contain some genuine insights. The most striking of these is Scholze’s description of how he learned the proof of Fermat’s Last Theorem by Sir Andrew Wiles: he “worked backward, figuring out what he needed to learn to make sense of the proof”. Later he also says things like “I never really learned the basic things like linear algebra, actually – I only assimilated it through learning some other stuff.”

If you have experience learning mathematics at the senior undergraduate or post-graduate level, you will likely find that Scholze’s experiences are orthogonal to your own. We spend an inordinate amount of time learning the ‘basics’ of various things, very much including linear algebra, in order to do research… or so we are told. If you have passed the part of your career where you take more courses than you self-learn, then you have likely reached the epiphany that it is usually not efficient to learn everything there is to know about a subject before actually doing work on it.

Some of you have had advisors telling you things like “read these five books (each 300+ pages) before you attempt any research work in the area”. Sometimes this advice comes from highly proficient researchers, which seems odd: if the above ethos is ubiquitous among researchers, why tell your students to do something totally different and entirely more dreadful? I am not sure what the right answer is, but probably part of the reason is a misguided attempt to make research ‘easier’ for students. Perhaps many advisors recall the struggle of trying to understand ‘simple’ phenomena they encountered in their own research careers, struggles which would have been trivial to overcome if someone had just told them to read a book, or if they had been better prepared. Perhaps they wish to save their students some time by pointing out the shortcut. However, the struggle to understand phenomena on your own is part of what makes research rewarding, and more importantly, it is critical in forging a mind suited to making discoveries.

Of course, I am but a pebble to the avalanche that is Peter Scholze, so my advice may not be worth much. Nevertheless, I feel like I should say this to all prospective and current graduate students: be bold, and give every difficult paper in your field a read. Don’t be intimidated by them. If you don’t understand something, google it until you find what you need to learn the language of the subject. Don’t feel like you need to understand all of Hartshorne before you can read any research papers related to algebraic geometry. Your future self will thank you for this.

A question for students you dislike

At the recent CNTA conference, Professor Joe Silverman gave an explicit homomorphism of \text{GL}_3(\mathbb{R}) into \text{GL}_6(\mathbb{R}), which he jokingly called a great thing to ask undergraduates to work out explicitly… if you don’t like them very much. In a similar vein, here is a question that one might ask undergraduates one doesn’t particularly like:

Let \mathbf{a} = (\alpha, \beta, \gamma) be a triple of co-prime integers which are not all zero and such that \alpha \gamma > 1. Prove that the ternary quadratic form given by:

K_{\mathbf{a}} (F) = \dfrac{1}{8\alpha^3} \bigg( 72 \beta^2 \gamma A^2 + 9 \alpha(\beta^2 + 4 \alpha \gamma) B^2 + 8 \alpha^3 C^2 - 18\beta (\beta^2 + 4 \alpha \gamma)AB
+ 12 \alpha (3 \beta^2 - 4 \alpha \gamma)AC - 24 \alpha^2 \beta BC \bigg)

takes on integer values whenever the triplet (A,B,C) lies in the lattice defined by the congruence conditions

4 \beta \gamma x - (2 \beta^2 + \alpha \gamma) y + 2 \alpha \beta z \equiv 0 \pmod{\alpha^2}

and

\gamma(2 \beta^2 + \alpha \gamma)x - \beta(\beta^2 + \alpha \gamma)y + \alpha \beta^2 z \equiv 0 \pmod{\alpha^3}.

I will reveal the solution in due time (it does not involve explicit computation of congruences), but if you come up with a solution, let me know!


Manjul Bhargava’s advice to mathematics graduate students

Last night, I had the honour of attending a panel discussion featuring the eminent mathematician Manjul Bhargava. During the panel, the moderator, Professor Kumar Murty, asked the prolific Fields Medalist to give some advice to graduate students in the audience who may be struggling with their research. Professor Bhargava’s response, paraphrased, was essentially the following: always work on several problems at a time (at least three), of varying difficulty. There should be a problem that is quite difficult, on which any progress would be a major breakthrough; one of moderate difficulty; and an ‘easy’ problem on which you know you can eventually make progress. Further, never dwell on any one problem for too long; instead, rotate between the problems to change your mindset. Sometimes when you approach a problem with a fresh perspective you will gain an insight that would have been impossible had you stared at the same problem continuously, since you would subconsciously keep trying to apply the same techniques.

I thought Professor Bhargava’s advice was very helpful to the graduate students in the audience. It is something I started doing a few years ago, though not consciously. Hopefully heeding this advice will be helpful to your work as well.

On binary forms II

The following joint paper with my advisor, Professor C.L. Stewart, has been posted on the arXiv. In this follow-up post, I would like to describe in some detail how to establish an asymptotic formula for the number of integers in an interval which are representable by a fixed binary form F of degree d, with integer coefficients and non-zero discriminant.

There are essentially three ingredients that go into the proof, each established decades apart. The first essential piece of the puzzle was established by Kurt Mahler in the 1930s. He showed that if we examine the integer points in the region \{(x,y) \in \mathbb{R}^2 : |F(x,y)| \leq Z\}, then the number of such points is closely approximated by the area of the region. Since the region expands homogeneously in Z, the area itself is well-approximated by scaling the ‘fundamental region’ \{(x,y) \in \mathbb{R}^2 : |F(x,y)| \leq 1\}. Indeed, let A_F denote the area of this fundamental region and let N_F(Z) denote the number of integer pairs (x,y) \in \mathbb{Z}^2 such that |F(x,y)| \leq Z. Then Mahler proved that

\displaystyle N_F(Z) \sim A_F Z^{\frac{2}{d}}.

More precisely, he obtained a very good error term: when the degree d of F satisfies d \geq 3, we have

\displaystyle N_F(Z) = A_F Z^{\frac{2}{d}} + O_F\left(Z^{\frac{1}{d-1}} \right).
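As a quick numerical illustration (my own example, not from the papers discussed), one can test Mahler’s estimate on the definite quartic form F(x,y) = x^4 + y^4, whose fundamental region has an area given by the standard superellipse formula:

```python
from math import gamma

# Sanity check of Mahler's estimate N_F(Z) ~ A_F * Z^(2/d) for the
# definite quartic form F(x, y) = x^4 + y^4, so d = 4 here.

# Area of the fundamental region |x|^4 + |y|^4 <= 1, via the standard
# superellipse area formula 4 * Gamma(1 + 1/p)^2 / Gamma(1 + 2/p), p = 4.
A_F = 4 * gamma(1.25) ** 2 / gamma(1.5)

def N_F(Z):
    """Count integer pairs (x, y) with x^4 + y^4 <= Z by brute force."""
    count = 0
    B = int(Z ** 0.25) + 1
    for x in range(-B, B + 1):
        r = Z - x ** 4
        if r < 0:
            continue
        y = int(r ** 0.25) + 1
        while y ** 4 > r:  # correct any floating-point overshoot
            y -= 1
        count += 2 * y + 1  # y ranges over -y_max, ..., y_max
    return count

Z = 10 ** 8
print(N_F(Z), A_F * Z ** 0.5)  # the two quantities agree closely
```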

The question then becomes: is there some way to remove the redundancies in Mahler’s theorem? For example, if F has even degree, then F(x,y) = F(-x,-y) for all (x,y) \in \mathbb{R}^2, so the pairs (x,y), (-x,-y) \in \mathbb{Z}^2 represent the same integer. Is this the only way that this can happen? Unfortunately, the answer is no. For example, consider the binary form F_n = x^n + (x - y)(2x - y) \cdots (nx-y). Then the points (1,1), \cdots, (1,n) all represent 1, since at (1,k) with 1 \leq k \leq n one factor of the product vanishes; and this construction works for any positive integer n. Therefore, there does not appear to be a simple way to count the multiplicities of points representing the same integer in Mahler’s theorem.
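The vanishing is easy to verify by machine; here is a quick check (the function name `F` is my shorthand for F_n):

```python
from math import prod

# The form F_n(x, y) = x^n + (x - y)(2x - y) ... (nx - y) from above.
# At (1, k) with 1 <= k <= n the product contains the factor k*1 - k = 0,
# so F_n(1, k) = 1: the points (1, 1), ..., (1, n) all represent 1.
def F(n, x, y):
    return x ** n + prod(i * x - y for i in range(1, n + 1))

n = 7
values = [F(n, 1, k) for k in range(1, n + 1)]
print(values)  # [1, 1, 1, 1, 1, 1, 1]
```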

While examples like the above exist, perhaps they occur sufficiently rarely as to be negligible. For instance, suppose only O\left(Z^{2/d - \delta}\right) of the points counted by N_F(Z) are such that there exist many ‘essentially different’ (precise definition to come) points representing the same integer, and suppose that even in the worst case there are at most, say, O\left(Z^{\frac{\delta}{2}}\right) essentially different pairs. Then in total, the contribution from these bad points to N_F(Z) is only O\left(Z^{2/d - \delta}\right) \cdot O\left(Z^{\frac{\delta}{2}}\right) = O\left(Z^{\frac{2}{d} - \frac{\delta}{2}}\right), which is fine.

We shall now make some definitions. We say that an integer h is essentially represented by F if, whenever two integer pairs (x_1, y_1), (x_2, y_2) satisfy F(x_1, y_1) = F(x_2, y_2) = h, there exists an element

\displaystyle T = \begin{pmatrix} t_1 & t_2 \\ t_3 & t_4 \end{pmatrix} \in \text{GL}_2(\mathbb{Q}) such that

\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = T \begin{pmatrix} x_2 \\ y_2 \end{pmatrix}

and such that

F_T(x,y) = F(t_1 x + t_2 y, t_3 x + t_4 y) = F(x,y)

for all (x,y) \in \mathbb{C}^2. Otherwise, we say that h is not essentially represented.
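To make the definition concrete, here is a toy example of my own (not from the paper): for the diagonal form F(x,y) = x^4 + y^4, swapping the two variables is a rational automorphism, so the two representations 17 = F(1,2) = F(2,1) are related by such a T.

```python
# The diagonal quartic form and the coordinate-swap matrix T = [[0,1],[1,0]].
def F(x, y):
    return x ** 4 + y ** 4

t1, t2, t3, t4 = 0, 1, 1, 0

def F_T(x, y):
    # F_T(x, y) = F(t1*x + t2*y, t3*x + t4*y), as in the definition above
    return F(t1 * x + t2 * y, t3 * x + t4 * y)

# F_T and F are both polynomials of degree 4, so agreement on a grid of
# sample points confirms they are identical as polynomials.
assert all(F_T(x, y) == F(x, y) for x in range(-4, 5) for y in range(-4, 5))
assert F(1, 2) == F(2, 1) == 17  # the two representations of 17
```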

Now let R_F(Z) denote the number of integers up to Z which are representable by F, and let R_F^{(1)}(Z) and R_F^{(2)}(Z) count the essentially and non-essentially represented integers, respectively. If we can show that R_F(Z) \sim R_F^{(1)}(Z), then we are basically done. This amounts to showing that R_F^{(2)}(Z) is small compared to Z^{\frac{2}{d}}.

Christopher Hooley proved this for both the ‘easy cubic case’ and the ‘hard cubic case’. However, it was D.R. Heath-Brown who showed that R_F^{(2)}(Z) is always small compared to R_F^{(1)}(Z). This paved the way to our eventual success on this problem.

It remains to account for the interaction between R_F^{(1)}(Z) and those T \in \text{GL}_2(\mathbb{Q}) which fix F. These elements are called the rational automorphisms of F, and we denote them by \text{Aut} F = \text{Aut}_\mathbb{Q} F. Our most novel contribution to this topic is an exact accounting of the interaction between \text{Aut} F and R_F^{(1)}(Z), via the so-called ‘redundancy lemmas’. This will be discussed at a future time.