My thoughts on what makes a good thesis problem

The single most important decision a PhD student in (pure) mathematics makes is undoubtedly the choice of their thesis research. Usually, to earn a PhD one must prove at least one new result; we will refer to this achievement as ‘solving a problem’. Thus, we may adopt the ‘problem solver’ (see here) view that a PhD is centered around solving a thesis problem.

I have made the following comments in person to many people, but I thought I would record them here as well.

When selecting a thesis problem, there are many factors to consider, and some of the pitfalls are quite damaging. Firstly, mathematics is, contrary to popular belief, a social endeavor: mathematicians do not work in isolation. As the recent controversy involving Shinichi Mochizuki and his purported proof of the abc conjecture has shown, a problem is not solved, and a result is not proven, until there is consensus among the mathematical community that it is. Progress or results that the mathematical community never engages with basically do not exist, much like the old question about trees falling in the woods. Thus one must ensure that one’s work is in sync with the current mathematical culture. One should therefore not choose a problem so far off on the fringes that few people, if anybody, are aware of its existence, and fewer still have any vested interest in whether it is resolved.

However, one should also not pick something that is too hot, or too attractive relative to its difficulty. Perhaps the best example of this is the bounded gaps between primes problem. Prior to May 2013, the problem was considered intractable, and few people dedicated any serious time to it. But when Yitang Zhang announced that he had proved that there exist infinitely many pairs of consecutive primes that differ by no more than 70 million, he sent shock waves around the mathematical world, and soon there was a flurry of results improving upon Zhang’s work in various ways. It would have been an extraordinarily bad idea to choose a problem directly related to Zhang’s work as a thesis problem: there were literally dozens of top-level experts around the world working on the topic, and the probability of being scooped was very high. Thus, one should pick a problem that is either not an obvious one to ask but is nonetheless interesting, or one that is interesting but not so interesting that it attracts the attention of many experts.

The final, and perhaps most subtle, point is that one should not pick a problem that invites an ad hoc solution which does not generalize. The best example of this is perhaps the so-called ‘elementary’ proof of the prime number theorem. While it was extraordinary that the prime number theorem, originally proved using arguments from complex analysis, could be proved in an elementary way, the proof fell drastically short of expectations. In particular, it was widely expected that an elementary proof would shed new light on the distribution of the primes and perhaps even pave the way to proving the Riemann hypothesis. It did no such thing, and to this day the technique remains able to prove little beyond the prime number theorem itself. One does not want a thesis problem like this. Ideally, during the course of solving one’s thesis problem, one discovers that the same techniques and machinery apply to a plethora of other problems, allowing one to produce a consistent stream of papers afterward and establish a solid research program. Whether this will be the case is extremely difficult to predict before one actually attempts to solve the problem seriously.

A new PhD student usually has a limited understanding of their subject area and a limited perspective on the landscape of their research domain, so it is extremely important to find an advisor who does have the right expertise and perspective. Even then, selecting a thesis problem, one you will be expected to work on for several years, is a very delicate process.

Some statistics for Hearthstone

This took a bit longer than I would care to admit, but I have finally computed the expected gain of stars per game. The answer depends a little bit on information available and interpretation.

Let p denote the probability of you winning a game, which is assumed constant. If you are on a win streak, meaning you have won the previous two games, then a win is worth two stars, so the expected gain from your next game is 2p - (1-p) = 3p-1. The probability that you won your previous two games is p^2, so this case contributes p^2(3p-1) to the total expected value. Otherwise, the expected gain is p - (1-p) = 2p-1; this case has probability 1 - p^2, so it contributes (1-p^2)(2p-1). Summing, we obtain that the total expected gain of stars per game is

3p^3 - p^2 + 2p - 1 - 2p^3 + p^2 = p^3 + 2p - 1.

Thus, if you win half the time (so that p = 1/2), you should expect to gain roughly one star every 8 games or so.
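As a sanity check on this derivation, here is a small simulation of the model above (a win on a two-game streak earns 2 stars, any other win earns 1, a loss costs 1); the function names and parameters are my own choices for illustration.

```python
import random

def expected_gain(p):
    """Closed form derived above: p^3 + 2p - 1."""
    return p**3 + 2*p - 1

def simulate(p, games=200_000, seed=1):
    """Monte Carlo estimate of the expected star gain per game."""
    rng = random.Random(seed)
    stars, streak = 0, 0
    for _ in range(games):
        if rng.random() < p:
            # a win while the previous two games were wins earns 2 stars
            stars += 2 if streak >= 2 else 1
            streak += 1
        else:
            stars -= 1
            streak = 0
    return stars / games

p = 0.5
print(expected_gain(p))   # 0.125, i.e. one star every 8 games
print(simulate(p))        # should land close to 0.125
```

The simulated average agrees with the closed form, which is some reassurance that the streak probability p^2 was accounted for correctly.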

Another Hearthstone-related problem, which I worked out with my officemate, is to ask for the probability of getting the card you need on turn 1, assuming you are going first, run two copies of the card in your 30-card deck, and are willing to discard all other cards in your initial draw in order to dig for it. First, the probability that at least one copy appears in the initial draw of three cards is

\displaystyle \frac{\binom{2}{1} \binom{28}{2} + \binom{2}{2} \binom{28}{1}}{\binom{30}{3}}.

If you did not get either copy of the card that you need, then you may discard all three cards and try again, knowing that the three cards you tossed cannot come back in the redraw. The probability of succeeding at this stage is the product of the probability of missing in the initial draw and the probability of getting at least one copy in the second draw of three cards from the remaining 27:

\displaystyle \frac{\binom{28}{3}}{\binom{30}{3}} \cdot \frac{\binom{2}{1} \binom{25}{2} + \binom{2}{2} \binom{25}{1}}{\binom{27}{3}} .

Adding these, we get some pretty neat cancellations: the second term reduces to 25/145, and the sum simplifies to p = 53/145, or about 0.366.

However, we have not yet accounted for the random card that you top-deck at the start of turn 1. The only remaining case to consider is when you failed to get the card you need even after tossing your entire initial draw. Since the mulliganed cards are shuffled back into the deck, the deck now holds 27 cards including both copies, so the probability of top-decking the card you need is

\displaystyle \frac{\binom{28}{3}}{\binom{30}{3}} \cdot \frac{\binom{25}{3}}{\binom{27}{3}} \cdot \frac{2}{27},

which is 184/3915, or about 0.047. Adding, the total probability is 323/783, or about 0.4125, slightly more than two in five.
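These binomial computations are easy to fumble by hand, so here is a short script that redoes each stage exactly with Python’s fractions module, under the same model: two copies in a 30-card deck, a 3-card opening hand, tossed cards excluded from the redraw, and the tossed cards shuffled back before the turn-1 top deck.

```python
from fractions import Fraction
from math import comb

def C(n, k):
    """Binomial coefficient as an exact rational."""
    return Fraction(comb(n, k))

# P(at least one copy in the opening three cards)
p_first = 1 - C(28, 3) / C(30, 3)

# P(miss the opening draw, then hit in the redraw of 3 from the other 27)
p_second = (C(28, 3) / C(30, 3)) * (1 - C(25, 3) / C(27, 3))

p_mulligan = p_first + p_second          # = 53/145 ≈ 0.3655
print(p_mulligan, float(p_mulligan))

# P(miss both draws, then top-deck a copy from the 27 cards left in the deck)
p_topdeck = (C(28, 3) / C(30, 3)) * (C(25, 3) / C(27, 3)) * Fraction(2, 27)

p_total = p_mulligan + p_topdeck         # = 323/783 ≈ 0.4125
print(p_total, float(p_total))
```

Using exact rationals rather than floats makes the cancellations visible: the 117s cancel in the second stage, leaving denominators of 145 throughout.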

A beautiful answer to why pure mathematics matters

Credit goes to Khalid Bou-Rabee, who posted this on MathOverflow, and Benson Farb, to whom the quote belongs.

“Since I am a pure mathematician, Dean Hefley suggested as a possible topic for this talk: “Why the square root of negative 1 is necessary”. I could take up this challenge of justifying pure science on its vast applicability; indeed the square root of negative 1, the basic “imaginary number”, underlies a huge swath of modern technology, from the design of circuits, airplanes and skyscrapers, to the construction of economic and financial models, to robotics. I have decided, however, to take the opposite point of view. I want to defend the value of basic science for its own sake…

…the purpose of pure mathematics, of basic science, is not the quick harvest. It is nothing less than an attempt to bring human thought and understanding to a higher level. It is an attempt to change not just what we think about the world, but how we think about it. The importance of this for human evolution is incalculable. As British physicist JJ Thomson said: “Research in applied science leads to reforms, research in pure science leads to revolutions.”

- Benson Farb, 2012

Some gems from analytic number theory

I am currently attending the Counting Arithmetic Objects summer school at the Centre de Recherches Mathématiques in Montreal. During the second day of lectures I heard a remarkable talk by Professor Andrew Granville. He spoke with outstanding clarity on the basic aims of analytic number theory, and I wish to record some of the epiphanies I had during the talk.

Without a doubt, one of the most important identities in analytic number theory is the following:

\int_0^1 e^{2\pi i n t}dt = \begin{cases} 1, & \text{ if } n = 0 \\    0, & \text{ if } n \ne 0 \end{cases}

The key insight is that the above identity serves as an “indicator function” for whether an integer is zero or not. Indeed, this basic observation is behind the solutions to Waring’s problem and the proof of Vinogradov’s theorem on sums of three primes. This “smooth” indicator function allows the use of methods from analysis to deal with arithmetic problems.
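The identity is also easy to see numerically. The sketch below approximates the integral with a midpoint rule (the step count is an arbitrary choice of mine); for nonzero integers n the positive and negative arcs of e^{2πint} cancel almost perfectly.

```python
import cmath

def fourier_integral(n, steps=1000):
    """Midpoint-rule approximation of the integral of e^{2*pi*i*n*t} over [0, 1]."""
    h = 1.0 / steps
    return h * sum(cmath.exp(2j * cmath.pi * n * (k + 0.5) * h)
                   for k in range(steps))

for n in (0, 1, 2, -3):
    # |integral| is 1 for n = 0 and essentially 0 otherwise
    print(n, abs(fourier_integral(n)))
```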

A related identity is Perron’s formula. Here again we start with an “indicator” function, defined for c > 0

\frac{1}{2\pi i} \int_ {c-i\infty}^{c+i\infty} \frac{y^s}{s} ds = \begin{cases} 1, & \text{ if } y > 1 \\ 0, & \text{ if } y < 1 \end{cases}
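One can watch this indicator emerge numerically by truncating the contour at height T. The sketch below uses a plain Riemann sum along the line Re s = c; the parameters c, T, and the step count are arbitrary choices of mine, and the convergence as T grows is slow, on the order of y^c/(T log y).

```python
import cmath

def perron_indicator(y, c=1.0, T=400.0, steps=80_000):
    """Midpoint Riemann sum for (1/2*pi*i) * integral of y^s / s
    along the truncated vertical line s = c + it, -T <= t <= T."""
    h = 2 * T / steps
    total = 0j
    for k in range(steps):
        s = complex(c, -T + (k + 0.5) * h)
        total += y ** s / s
    # ds = i dt, so the i cancels against the 2*pi*i in the denominator
    return (total * h / (2 * cmath.pi)).real

print(perron_indicator(2.0))    # close to 1, since y > 1
print(perron_indicator(0.5))    # close to 0, since y < 1
```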

This allows one to handle partial sums of the shape

\displaystyle \sum_{n \leq x} a_n.

To see this, we start with the observation that for each n in the range of summation we have x/n > 1 (provided x is not an integer, so that n < x strictly). Hence we have

\displaystyle \sum_{n \leq x} a_n = \sum_{n \leq x} a_n \frac{1}{2 \pi i } \int_{c - i\infty}^{c + i\infty} \left(\frac{x}{n} \right)^s \frac{1}{s} ds.

The outer sum is finite, but we would not change its value at all if we extended it over all positive integers n, since for n > x the indicator integral vanishes. For well-behaved a_n‘s, and taking c sufficiently large to ensure absolute convergence (so that summation and integration may be interchanged), we have

\displaystyle \sum_{n \geq 1} a_n \frac{1}{2 \pi i} \int_{c - i\infty}^{c + i\infty} \frac{1}{s} \left(\frac{x}{n}\right)^s ds = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \left(\sum_{n\geq1} \frac{a_n}{n^s}\right) \frac{x^s}{s} ds.

We note that g(s)= \displaystyle \sum_{n \geq 1} \frac{a_n}{n^s} is the Dirichlet series associated to the a_n‘s. Summarizing, we obtain the equation

\displaystyle \sum_{n \leq x} a_n = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} g(s) \frac{x^s}{s}ds.

Now we can apply this to the familiar von Mangoldt function \Lambda(n), defined by

\displaystyle \Lambda(n) = \begin{cases} \log p, & \text{ if } n = p^m \\ 0, & \text{ otherwise.} \end{cases}

where, as usual, p denotes a prime and m \geq 1 an integer. Recalling that the Dirichlet series of \Lambda(n) is \displaystyle \sum_{n \geq 1} \frac{\Lambda(n)}{n^s} = -\frac{\zeta'(s)}{\zeta(s)}, Perron’s formula gives

\displaystyle \sum_{n \leq x} \Lambda(n) = \sum_{p^m \leq x} \log p = \frac{1}{2 \pi i} \int_{c - i\infty}^{c + i\infty} - \frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s}ds

where \zeta(s) denotes the Riemann zeta function. Evaluating the integral on the right using residue theory, and noting that the integrand has poles at s = 1 (the pole of \zeta), at s = 0, and at the zeros of \zeta, we obtain

\displaystyle \sum_{p^m \leq x} \log p = x - \sum_{\rho: \zeta(\rho) =0} \frac{x^\rho}{\rho} - \frac{\zeta'(0)}{\zeta(0)}.

This is likely one of the most striking equations in mathematics: the left-hand side is a discrete sum, while the right-hand side involves the values of a meromorphic function. Somehow, the zeros of the zeta function “know” where the primes are.
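To see the main term x at work, one can compute the left-hand side directly with a sieve and compare it with x. This is only a rough check (the sum over zeros supplies the fluctuations), and the helper name is my own.

```python
from math import log

def psi(x):
    """Chebyshev's psi function: the sum of log p over prime powers p^m <= x."""
    sieve = [True] * (x + 1)
    total = 0.0
    for p in range(2, x + 1):
        if sieve[p]:                          # p is prime
            for mult in range(2 * p, x + 1, p):
                sieve[mult] = False           # cross out proper multiples
            q = p
            while q <= x:                     # add log p for p, p^2, p^3, ...
                total += log(p)
                q *= p
    return total

for x in (100, 1000, 10_000):
    # the ratio psi(x)/x drifts toward 1, as the explicit formula predicts
    print(x, round(psi(x), 2), round(psi(x) / x, 4))
```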

Problems with democracies having different parties

The terms ‘democracy’ and ‘multi-party state’ have become almost synonymous. China, for example, is notorious for being a ‘single-party state’, which is a (slightly) more polite way of saying ‘authoritarian regime’ or ‘dictatorship’. Since China’s political system is ubiquitously painted as either bad or outright atrocious, the implication is that a multi-party system must be the correct answer. Indeed, almost all democracies have multiple parties, including the USA, Canada, Britain, Australia, Israel, and Japan. However, the notion that the ‘party’ is the defining attribute of a democracy is deeply problematic.

On the eve of the Ontario provincial election here in Canada (in which I cannot participate, because I missed the deadline to mail in my vote and am currently out of province), we are faced with possibly the lowest voter turnout in our entire history. This is a grim realization, since the current record, an abysmal 48%, was set just three years ago in the last Ontario election. To quote this Maclean’s article, would-be voters “changed the channel” a long time ago. A fundamental aspect of democratic politics is revealed and amplified by the advance of technology: the most important priority for any politician is to get elected, and the most important priority of any political party is to form a government. Ideally, the best way to achieve these priorities would be to be good at your job, and for your party to be the one that best serves the interests of the citizens. Unfortunately, with the help of modern technology and statistical methods, this is no longer the case. As the Eric Cantor case painfully underlines, often the most effective way to stay elected and stay in power is to pander to the loudest, most driven minorities in the electorate, at the expense of alienating everyone else.

The fundamental problem is that political parties, small in number, cannot collectively represent every possible person. This is further exacerbated by the unspoken dogma that political parties cannot agree with each other on anything. It may be perfectly reasonable for a person to be simultaneously anti-abortion and pro-immigration reform, but in the USA such a person would necessarily have to choose between the two issues, since Republicans are anti-abortion and against immigration reform, while Democrats are pro-choice and pro-immigration reform. In this case, whoever wins, that person loses.

The tendency for politicians and political parties to serve their own interests even when these do not align with the best interests of the population seems like a major hurdle that no current democracy is equipped to address.

Another problem with political parties is their tendency to be slow to evolve ideologically. For instance, the very ‘left, center, right’ model of political thinking may be extremely outdated. Even at a more refined level, it is possible that political parties and their ‘core’ ideologies will simply become irrelevant given sufficient time, yet the parties do not die, because they are designed to self-preserve. The very fact that both Canada and the US have had basically the same political parties for the last 150 years is perplexing. In some sense, our politics are stuck in the 19th century.

Why peddle your political system or ideals if you are so discontent with them?

Together, these three links indicate that Americans are extraordinarily unsatisfied with their elected legislators and, at best, generally mildly satisfied with their elected Commander-in-Chief. Even in the best of times, no more than three quarters of Americans are satisfied with their president at any given moment, and usually the approval rating hovers around 50%. More importantly, it seems that the vaunted constitution and laws of the United States have failed to protect its status as a democratic republic, according to the third link.

So why do Americans direct so much of their vitriol at supposedly ‘oppressive’ regimes that probably enjoy much higher approval ratings? Perhaps the American stance would be better received if they were at least happy with their own system (and a much more convincing argument could be made if their system actually worked the way it is supposed to, which it does not).