## Proofs that require fundamentally new ways of thinking

171

I do not know exactly how to characterize the class of proofs that interests me, so let me give some examples and say why I would be interested in more. Perhaps what the examples have in common is that a powerful and unexpected technique is introduced that comes to seem very natural once you are used to it.

Example 1. Euler's proof that there are infinitely many primes.

If you haven't seen anything like it before, the idea that you could use analysis to prove that there are infinitely many primes is completely unexpected. Once you've seen how it works, that's a different matter, and you are ready to contemplate trying to do all sorts of other things by developing the method.
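To spell out the step from analysis to arithmetic, here is a sketch of the argument in one standard modern form (the notation $P$ and $S(P)$ is introduced here for illustration and is not Euler's). For a finite set $P$ of primes, expanding each factor as a geometric series and using unique factorization gives

$$\prod_{p \in P} \left(1 - \frac{1}{p}\right)^{-1} = \prod_{p \in P} \sum_{k \ge 0} p^{-k} = \sum_{n \in S(P)} \frac{1}{n},$$

where $S(P)$ is the set of positive integers all of whose prime factors lie in $P$. If $P$ contained every prime, then $S(P)$ would be the set of all positive integers, and the finite product on the left would equal the divergent harmonic series $\sum_{n \ge 1} 1/n$, a contradiction.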

Example 2. The use of complex analysis to establish the prime number theorem.

Even when you've seen Euler's argument, it still takes a leap to look at the complex numbers. (I'm not saying it can't be made to seem natural: with the help of Fourier analysis it can. Nevertheless, it is a good example of the introduction of a whole new way of thinking about certain questions.)

Example 3. Variational methods.

You can pick your favourite problem here: one good one is determining the shape of a heavy chain in equilibrium.
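As a sketch of how the method runs in this case (with units chosen so that the weight per unit length is 1, and with the constants $\mu$, $a$, $x_0$ introduced here for illustration): the chain minimizes the potential energy $\int y\,ds$ subject to the fixed length $\int ds = L$, so one looks for stationary curves of

$$J[y] = \int (y + \mu)\sqrt{1 + y'^2}\,dx .$$

Since the integrand $F$ does not depend explicitly on $x$, the Euler-Lagrange equation has the first integral $F - y' F_{y'} = a$, that is,

$$\frac{y + \mu}{\sqrt{1 + y'^2}} = a, \qquad \text{which is solved by} \qquad y + \mu = a \cosh\!\left(\frac{x - x_0}{a}\right),$$

the catenary.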

Example 4. Erdős's lower bound for Ramsey numbers.

One of the very first results (Shannon's bound for the size of a separated subset of the discrete cube being another very early one) in probabilistic combinatorics.
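For concreteness, the counting at the heart of the argument can be sketched as follows. Colour each edge of the complete graph on $n$ vertices red or blue independently with probability $1/2$; the expected number of monochromatic copies of $K_k$ is $\binom{n}{k} 2^{1-\binom{k}{2}}$, and if this is less than 1 then some colouring has none, so $R(k,k) > n$. The function name `erdos_lower_bound` below is my own, and the code only evaluates this inequality exactly; it does not find a colouring.

```python
from math import comb


def erdos_lower_bound(k: int) -> int:
    """Largest n with C(n, k) * 2^(1 - C(k, 2)) < 1, so that R(k, k) > n."""
    if k < 3:
        raise ValueError("the counting argument needs k >= 3")
    threshold = 2 ** comb(k, 2)
    n = k
    # Exact integer form of the inequality: 2 * C(n, k) < 2^C(k, 2).
    # C(n, k) increases with n, so we can stop at the first failure.
    while 2 * comb(n + 1, k) < threshold:
        n += 1
    return n


if __name__ == "__main__":
    for k in (3, 4, 10):
        print(k, erdos_lower_bound(k))
```

For small $k$ the resulting bound is far from the truth (it gives only $R(3,3) > 3$, whereas $R(3,3) = 6$); the point of the method is the exponential growth in $k$.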

Example 5. Roth's proof that a dense set of integers contains an arithmetic progression of length 3.

Historically this was by no means the first use of Fourier analysis in number theory. But it was the first application of Fourier analysis to number theory that I personally properly understood, and that completely changed my outlook on mathematics. So I count it as an example (because there exists a plausible fictional history of mathematics where it was the first use of Fourier analysis in number theory).

Example 6. Use of homotopy/homology to prove fixed-point theorems.

Once again, if you mount a direct attack on, say, the Brouwer fixed point theorem, you probably won't invent homology or homotopy (though you might do if you then spent a long time reflecting on your proof).

The reason these proofs interest me is that they are the kinds of arguments where it is tempting to say that human intelligence was necessary for them to have been discovered. It would probably be possible in principle, if technically difficult, to teach a computer how to apply standard techniques, the familiar argument goes, but it takes a human to invent those techniques in the first place.

Now I don't buy that argument. I think that it is possible in principle, though technically difficult, for a computer to come up with radically new techniques. Indeed, I think I can give reasonably good Just So Stories for some of the examples above. So I'm looking for more examples. The best examples would be ones where a technique just seems to spring from nowhere -- ones where you're tempted to say, "A computer could never have come up with that."

Edit: I agree with the first two comments below, and was slightly worried about that when I posted the question. Let me have a go at it though. The difficulty with, say, proving Fermat's last theorem was of course partly that a new insight was needed. But that wasn't the only difficulty at all. Indeed, in that case a succession of new insights was needed, and not just that but a knowledge of all the different already existing ingredients that had to be put together. So I suppose what I'm after is problems where essentially the only difficulty is the need for the clever and unexpected idea. I.e., I'm looking for problems that are very good challenge problems for working out how a computer might do mathematics. In particular, I want the main difficulty to be fundamental (coming up with a new idea) and not technical (having to know a lot, having to do difficult but not radically new calculations, etc.). Also, it's not quite fair to say that the solution of an arbitrary hard problem fits the bill. For example, my impression (which could be wrong, but that doesn't affect the general point I'm making) is that the recent breakthrough by Nets Katz and Larry Guth in which they solved the Erdős distinct distances problem was a very clever realization that techniques that were already out there could be combined to solve the problem. One could imagine a computer finding the proof by being patient enough to look at lots of different combinations of techniques until it found one that worked. Now their realization itself was amazing and probably opens up new possibilities, but there is a sense in which their breakthrough was not a good example of what I am asking for.

While I'm at it, here's another attempt to make the question more precise. Many many new proofs are variants of old proofs. These variants are often hard to come by, but at least one starts out with the feeling that there is something out there that's worth searching for. So that doesn't really constitute an entirely new way of thinking. (An example close to my heart: the Polymath proof of the density Hales-Jewett theorem was a bit like that. It was a new and surprising argument, but one could see exactly how it was found since it was modelled on a proof of a related theorem. So that is a counterexample to Kevin's assertion that any solution of a hard problem fits the bill.) I am looking for proofs that seem to come out of nowhere and seem not to be modelled on anything.

Further edit. I'm not so keen on random massive breakthroughs. So perhaps I should narrow it down further -- to proofs that are easy to understand and remember once seen, but seemingly hard to come up with in the first place.

Perhaps you could make the requirements a bit more precise. The most obvious examples that come to mind from number theory are proofs that are ingenious but also very involved, arising from a rather elaborate tradition, like Wiles' proof of Fermat's last theorem, Faltings' proof of the Mordell conjecture, or Ngo's proof of the fundamental lemma. But somehow, I'm guessing that such complicated replies are not what you have in mind. – Minhyong Kim – 2010-12-09T15:18:28.510

@Minhyong: right! All of these proofs involved fundamental new insights, but probably the proof of an arbitrary statement that was known to be hard (in the sense that "the usual methods don't seem to work") and was then proved anyway ("because a new method was discovered") seems to fit the bill... – Kevin Buzzard – 2010-12-09T15:30:59.217

Of course, there was apparently a surprising and simple insight involved in the proof of FLT, namely Frey's idea that a solution triple would give rise to a rather exotic elliptic curve. It seems to have been this insight that brought a previously eccentric seeming problem at least potentially within the reach of the powerful and elaborate tradition referred to. So perhaps that was a new way of thinking at least about what ideas were involved in FLT. – roy smith – 2010-12-09T16:21:30.547

Never mind the application of Fourier analysis to number theory -- how about the invention of Fourier analysis itself, to study the heat equation! More recently, if you count the application of complex analysis to prove the prime number theorem, then you might also count the application of model theory to prove results in arithmetic geometry (e.g. Hrushovski's proof of Mordell-Lang for function fields). – D. Savitt – 2010-12-09T16:42:04.740

In response to edit: On the other hand, I think those big theorems are still reasonable instances of proofs that are difficult to imagine for a computer! Incidentally, regarding your example 2, it seems to me Dirichlet's theorem on primes in arithmetic progressions might be a better example in the same vein. – Minhyong Kim – 2010-12-09T17:34:23.603

I agree that they are difficult, but in a sense what I am looking for is problems that isolate as well as possible whatever it is that humans are supposedly better at than computers. Those big problems are too large and multifaceted to serve that purpose. You could say that I am looking for "first non-trivial examples" rather than just massively hard examples. – gowers – 2010-12-09T18:04:51.283

I'm still a bit confused about your motivation. Are you trying to understand why you think certain proofs are hard to be generated by computers? Or are you interested in this list for its own sake? Or something else? – Jack Lemon – 2010-12-09T18:11:06.103

What about Hilbert's approach to the "fundamental problem of invariant theory"? I.e. the one that supposedly provoked the remark "This is not mathematics, but theology". – Jon Bannon – 2010-12-09T19:02:24.403

@Luke: My motivation is that I believe that computers ought to be able to do mathematics. To explore that view, it is very helpful to look at problems of this type, since either one will be able to explain how certain ideas that seem to come from nowhere can in fact be generated in an explicable way, or one will end up with a more precise understanding of the difficulties involved. Of course, I'm hoping for the former. – gowers – 2010-12-09T19:09:24.090

While this is a common belief even for strong proponents of computerized mathematics, it is not clear if these types of ideas/proofs would be harder for computer systems (fully automatic or interactive). For example, the "probabilistic method" had major impact and led to surprising proofs/concepts in different areas at different times. So the idea: "Use a probabilistic argument to prove the existence of the required objects" or "Add a probabilistic ingredient to this notion" could have been offered (and still can be offered) rather automatically. – Gil Kalai – 2010-12-10T12:42:40.643

I see another conceptual difficulty with the spirit behind the question: Suppose we have to compare two proofs for two a priori equally important theorems. The first proof is based on a fundamentally new way of thinking (whatever it means, but let's assume that it is meaningful). In the second proof the proof of Lemma 12.7 is based on a fundamentally new way of thinking. How do we compare these two scenarios? – Gil Kalai – 2010-12-10T15:13:06.513

My feeling is that when someone says "X is fundamentally new" (for various values of X) in reference to some mathematics, this usually demands as a prerequisite that one has a pretty narrow perspective on the kinds of thinking that came beforehand in order to believe the statement. This doesn't take anything away from novel mathematics, it's just that "fundamentally new" is almost always too hyperbolic an expression for the mathematics it describes. I imagine the main reason mathematicians use such hyperbolic terminology is that hype draws people's attention, and that helps ideas propagate. – Ryan Budney – 2010-12-11T05:17:37.217

@Ryan, as I hope my remarks make clear, I completely agree. In other words, I hope that by asking for fundamental newness I have set an impossible challenge. Maybe I could refine the question further: I am looking for proofs that appear to be so different from what went before that they require some special and characteristically human "genius" to be discovered. – gowers – 2010-12-11T07:16:52.323

@Gil: I agree that the two scenarios exist. I'm not sure I see the need to compare them. – gowers – 2010-12-11T07:17:36.250

A comparison of the two scenarios is relevant for trying to understand what computers can do. Even if we agree that "fundamentally new (=FN)" arguments are the hardest element to automate, it still seems harder (for a computer and perhaps also for a human) to find an FN argument at an unknown place down the proof than right at the beginning. – Gil Kalai – 2010-12-11T08:04:50.463

One more (rather obvious) remark is that sometimes a "fundamentally new way of thinking" in general, and in particular one that leads to proofs, emerges gradually from a large body of work by many people. – Gil Kalai – 2010-12-13T19:24:41.483

It is surprising how (successful) fundamentally new ways of thinking are clustered. Cantor's idea is FN and yet very closely related to the ancient liar paradox; in this cluster are also Russell's proof that his set is not a set and Goedel's theorem. Then there is the FN idea of non-constructive methods and, in particular, probabilistic proofs. Homology is an FN method for classifying topological spaces and proving fixed point theorems, and then (Emerton's answer) through fixed point theorems it enters number theory; there is also the mysterious FN method of classifying representations based on actions on homology. – Gil Kalai – 2010-12-14T11:32:58.563

It seems to me that this question has been around a long time and is unlikely to garner new answers of high quality. It also seems unlikely most would even read new answers. Furthermore, nowadays I imagine a question like this would be closed as too broad, and if we close this then we'll discourage questions like it in the future. So I'm voting to close. – David White – 2013-10-13T18:52:12.383

Wow. So you can get 147 votes and still be closed as off-topic. Doesn't the fact that 147 math researchers liked it, alone, attest to its relevance? ...anyway surprised nobody mentioned Cantor's cardinality proofs. Before that, proofs by contradiction were not considered valid. – j0equ1nn – 2015-07-24T23:27:32.063

104

Grothendieck's insight into how to deal with the problem that whatever topology you define on varieties over finite fields, you never seem to get enough open sets. You simply have to re-define what is meant by a topology, allowing open sets not to be subsets of your space but to be covers.

I think this fits the bill of "seem very natural once you are used to it", but it was an amazing insight, and totally fundamental in the proof of the Weil conjectures.

Obviously, I agree that this was fundamental. But since we're speaking only about Grothendieck topologies and not the eventual proof of the Weil conjectures, there could be a curious sense in which this idea might be particularly natural to computers. Imagine encoding a category as objects and morphisms, which I'm told is quite a reasonable procedure in computer science. You'll recall then that it's somewhat hard to define a subobject. – Minhyong Kim – 2010-12-10T01:20:28.333

That is, it's easier to refer directly to arrows $A\rightarrow B$ between two of the symbols rather than equivalence classes of them. In this framework, a computer might easily ask itself why any reasonable collection of arrows might not do for a topology. Grothendieck topologies seem to embody exactly the kind of combinatorial and symbolic thinking about open sets that's natural to computers, but hard for humans. We are quite attached to the internal 'physical' characteristics of the open sets, good for some insights, bad for others. – Minhyong Kim – 2010-12-10T01:26:23.357

According to the Grothendieck-Serre correspondence, I think it is more appropriate to say it is an insight due to both of them. – temp – 2012-06-03T19:49:29.113

80

Do Cantor's diagonal arguments fit here? (Never mind whether someone did some of them before Cantor; that's a separate question.)
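For readers who have not met it, the core move can be illustrated on a finite table; this is only a toy sketch (the function name `diagonal_complement` is my own), since the real argument concerns infinite sequences. Given any list of 0/1 sequences, flipping the diagonal entries produces a sequence that differs from the $i$-th row in the $i$-th position, and so cannot appear anywhere in the list.

```python
from typing import List


def diagonal_complement(rows: List[List[int]]) -> List[int]:
    """Return a 0/1 sequence that differs from rows[i] in position i."""
    n = len(rows)
    if any(len(row) < n for row in rows):
        raise ValueError("each row needs at least len(rows) entries")
    # Flip the i-th entry of the i-th row.
    return [1 - rows[i][i] for i in range(n)]


if __name__ == "__main__":
    table = [
        [0, 1, 0, 1],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 0, 1, 0],
    ]
    d = diagonal_complement(table)
    print(d)
    # The new sequence is not equal to any row of the table.
    print(all(d != row[: len(d)] for row in table))
```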

I vote yes. Look how hard Liouville had to work to find the first examples of transcendental numbers, and how easy Cantor made it to show that there are scads of them. – Gerry Myerson – 2010-12-09T22:07:51.960

While Cantor's argument is amazing, and certainly produces scads, Liouville didn't have to work that hard; his approach is also very natural, and doesn't rely on much more than the pigeon-hole principle. – Emerton – 2010-12-10T03:15:15.957

Cantor's whole idea of casting mathematics in the language of set theory is now so pervasive we don't even think about it. It dominated our subject until the category point of view. So to me these are the two most basic insights, viewing mathematics in terms of sets, and then in terms of maps. Etale topologies are just one example of viewing maps as the basic concept. – roy smith – 2010-12-13T17:55:46.807

@RoySmith : Was Cantor really the one who proposed to encode all of mathematics within set theory? Certainly he developed cardinals and ordinals and lots of theorems about them, and proposed the continuum hypothesis, and applied set theory to trigonometric series, but I'm not sure he was the one who proposed to make everything into set theory. – Michael Hardy – 2012-08-11T21:51:46.500

Anyway, this was perhaps the most radical change of the way of thinking in the whole history of mathematics. – Alexandre Eremenko – 2013-07-03T09:28:00.083

Is there any one who used diagonal arguments before Cantor? – Fawzy Hegab – 2017-01-03T16:55:25.567

@MathsLover : I don't know of any use of such arguments before Cantor; I just wanted to make clear that that's not relevant to this answer. – Michael Hardy – 2017-01-03T18:58:48.147

@FawzyHegab Yes, Paul du Bois-Reymond. See here. – Andrés E. Caicedo – 2017-01-22T20:27:09.327

77

Although this has already been said elsewhere on MathOverflow, I think it's worth repeating that Gromov is someone who has arguably introduced more radical thoughts into mathematics than anyone else. Examples involving groups with polynomial growth and holomorphic curves have already been cited in other answers to this question. I have two other obvious ones but there are many more.

I don't remember where I first learned about convergence of Riemannian manifolds, but I had to laugh because there's no way I would have ever conceived of such a notion. To be fair, all of the groundwork for this was laid out in Cheeger's thesis, but it was Gromov who reformulated everything as a convergence theorem and recognized its power.

Another time Gromov made me laugh was when I was reading what little I could understand of his book Partial Differential Relations. This book is probably full of radical ideas that I don't understand. The one I did was his approach to solving the linearized isometric embedding equation. His radical, absurd, but elementary idea was that if the system is sufficiently underdetermined, then the linear partial differential operator could be inverted by another linear partial differential operator. Both the statement and proof are for me the funniest in mathematics. Most of us view solving PDE's as something that requires hard work, involving analysis and estimates, and Gromov manages to do it using only elementary linear algebra. This then allows him to establish the existence of isometric embedding of Riemannian manifolds in a wide variety of settings.

Is that book completely rigorous? I have been told that his papers aren't, from an analytic standpoint. – Koushik – 2013-12-03T06:35:30.530

Partial Differential Relations? I don't believe the h-principle requires much analysis. In the sections on isometric embeddings, he does state and give a complete analytic proof of the version of the Nash-Moser implicit function theorem he needs. – Deane Yang – 2013-12-09T21:59:50.580

73

Generating functions seem old hat to those who have worked with them, but I think their early use could be another example. If you did not have that tool handy, could you create it?

Similarly, any technique that has been developed and is now widely used is made to look natural after years of refining and changing the collected perspective, but might it not have seemed quite revolutionary when first introduced? Perhaps the question should also be about such techniques.

Gerhard "Old Wheels Made New Again" Paseman, 2010.12.09

I'd like to add to generating functions the idea that you can use singularity analysis to determine the coefficient growth. But I don't know how unexpected this was when first used... – Martin Rubey – 2010-12-09T17:24:10.290

"any technique that has been developed and is now widely used is made to look natural after years of refining and changing the collected perspective, but might it not have seemed quite revolutionary when first introduced?" It surely was, and it is exactly why it is widely used now: it allowed a lot of things that were impossible previously and we are still trying to figure out how much is "a lot". Also note that shaping an idea and recognizing its power is a long process, so "unexpected" means that 20 years ago nobody would have thought of that, not that it shocked everyone on one day. – fedja – 2010-12-17T12:53:14.790

By the way, who invented/discovered generating functions? (And thus discovered what is called Fourier analysis now:-) Was this de Moivre? – Alexandre Eremenko – 2013-07-03T09:26:12.363

Euler's proof that the number of partitions into odd parts is equal to the number of partitions into distinct parts might be the first useful application of generating functions. Though, you might as well say that the binomial theorem is really just giving the generating function for the binomial coefficients. – Cheyne H – 2013-12-01T05:41:38.157
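As a small illustration of the partition identity just mentioned: it amounts to $\prod_{k \ge 1} (1 + x^k) = \prod_{k \ge 1} \frac{1 - x^{2k}}{1 - x^k} = \prod_{k \ge 1} \frac{1}{1 - x^{2k-1}}$, and one can check numerically that the two sides have the same coefficients. The sketch below (function names are my own) expands both products as truncated power series; it is a sanity check, not a proof.

```python
from typing import List


def distinct_part_counts(limit: int) -> List[int]:
    """Coefficients of prod_{k>=1} (1 + x^k) up to x^limit."""
    coeffs = [1] + [0] * limit
    for k in range(1, limit + 1):
        # Multiply by (1 + x^k); go downwards so each part is used at most once.
        for n in range(limit, k - 1, -1):
            coeffs[n] += coeffs[n - k]
    return coeffs


def odd_part_counts(limit: int) -> List[int]:
    """Coefficients of prod_{k odd} 1 / (1 - x^k) up to x^limit."""
    coeffs = [1] + [0] * limit
    for k in range(1, limit + 1, 2):
        # Multiply by 1 / (1 - x^k); go upwards so a part may repeat.
        for n in range(k, limit + 1):
            coeffs[n] += coeffs[n - k]
    return coeffs


if __name__ == "__main__":
    print(distinct_part_counts(10))
    print(odd_part_counts(10))
```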

Arguably Clifford Truesdell invented them in his unified theory of special functions, in 1948. He generalized Euler's result at the end as a special case of generating functions. – Guido Jorg – 2014-10-06T20:59:15.740

72

My favorite example from algebraic topology is Rene Thom's work on cobordism theory. The problem of classifying manifolds up to cobordism looks totally intractable at first glance. In low dimensions ($0,1,2$), it is easy, because manifolds of these dimensions are completely known. With hard manual labor, one can maybe treat dimensions 3 and 4. But in higher dimensions, there is no chance to proceed by geometric methods.

Thom came up with a geometric construction (generalizing earlier work by Pontrjagin), which is at the same time easy to understand and ingenious. Embed the manifold into a sphere, collapse everything outside a tubular neighborhood to a point and use the Gauss map of the normal bundle... What this construction does is to translate the geometric problem into a homotopy problem, which looks totally unrelated at first sight.

The homotopy problem is still difficult, but thanks to work by Serre, Cartan, Steenrod, Borel, Eilenberg and others, Thom had enough heavy guns at hand to get fairly complete results.

Thom's work led to an explosion of differential topology, leading to Hirzebruch's signature theorem, the Hirzebruch-Riemann-Roch theorem, Atiyah-Singer, the Milnor-Kervaire classification of exotic spheres, and so on, up to Madsen-Weiss' work on mapping class groups.

69

The method of forcing certainly fits here. Before, set theorists expected that independence results would be obtained by building non-standard, ill-founded models, and model theoretic methods would be key to achieve this. Cohen's method begins with a transitive model and builds another transitive one, and the construction is very different from all the techniques being tried before.

This was completely unexpected. Of course, in hindsight, we see that there are similar approaches in recursion theory and elsewhere happening before or at the same time.

But it was the fact that nobody could imagine you would be able to obtain transitive models that mostly had us stuck.

I took the last set theory course that Cohen taught, and this isn't how he presented his insight at all (though his book takes this approach). The central problem is "how do I prove that non-constructible [sub]sets [of N] are possible without access to one?", and his solution is "don't use a set; use an adaptive oracle".

Once that idea is present, the general method falls right into place. The oracle's set of states can be any partial order, generic filters fall right out, names are clearly necessary, everything else is technical. The hardest part is believing it will actually work. – Chad Groft – 2010-12-14T02:07:18.343

@Chad : Very interesting! Curious that his description is so "recursion-theoretic." Do you remember when this course was? – Andrés E. Caicedo – 2010-12-14T02:48:15.543

I don't agree with Chad Groft that "generic filters fall right out". I believe that Boolean valued models are natural and also using ultrafilters in order to turn them into 2-valued models, but using generic filters to get actual ZFC models is on a different level. Also, the use of partial orders instead of Boolean algebras seems slightly unintuitive, even though it is more practical. – Stefan Geschke – 2014-02-15T07:53:47.723

@ChadGroft, could you please clarify that further, or refer to some papers/articles explaining/motivating forcing from this view? – Fawzy Hegab – 2017-01-03T20:03:24.377

65

I don't know who deserves credit for this, but I was stunned by the concept of viewing complicated objects like functions simply as points in a vector space. With that view one solves and analyzes PDEs or integral equations in Lebesgue or Sobolev spaces.

I think one can credit this point of view to Fréchet, who introduced metric spaces for applications to functional analysis. – Qiaochu Yuan – 2011-01-18T16:26:07.290

42

What about Euler's solution to the Königsberg bridge problem? It's certainly not difficult, but I think (not that I really know anything about the history) it was quite novel at the time.

@Kimball: It was so novel that Euler didn't even think the problem or its solution were mathematical. See the extract from a letter of Euler on the page http://en.wikipedia.org/wiki/Carl_Gottlieb_Ehler.

@KConrad: Interesting. Thanks for pointing that out. – Kimball – 2012-01-12T02:09:56.293
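For readers who want to see how little is needed once the right viewpoint is found, here is a minimal sketch of the parity argument in the form it is usually presented today. The vertex labels (`N` and `S` for the two river banks, `K` and `E` for the two islands) and the function names are my own. A connected multigraph has a walk using every edge exactly once only if it has zero or two vertices of odd degree.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]

# The seven bridges as edges of a multigraph on four land masses.
BRIDGES: List[Edge] = [
    ("N", "K"), ("N", "K"),
    ("S", "K"), ("S", "K"),
    ("N", "E"), ("S", "E"), ("K", "E"),
]


def odd_degree_vertices(edges: Iterable[Edge]) -> List[str]:
    """Vertices met by an odd number of edge ends."""
    degree: Counter = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges: Iterable[Edge]) -> bool:
    """Parity test only; assumes the multigraph is connected."""
    return len(odd_degree_vertices(list(edges))) in (0, 2)


if __name__ == "__main__":
    print(odd_degree_vertices(BRIDGES))
    print(has_euler_walk(BRIDGES))
```

All four land masses have odd degree (5, 3, 3, 3), so no such walk exists.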

37

Technically, the following are not proofs, or even theorems, but I think they count as insights that have the quality that it's hard to imagine computers coming up with them. First, there's:

Mathematics can be formalized.

Along the same lines, there's:

Computability can be formalized.

If you insist on examples of proofs then maybe I'd be forced to cite the proof of Goedel's incompleteness theorem or of the undecidability of the halting problem, but to me the most difficult step in these achievements was the initial daring idea that one could even formulate a mathematically satisfactory definition of something as amorphous as "mathematics" or "computability." For example, one might argue that the key step in Turing's proof was diagonalization, but in fact diagonalization was a major reason that Goedel thought one couldn't come up with an "absolute" definition of computability.

Nowadays we are so used to thinking of mathematics as something that can be put on a uniform axiomatic foundation, and of computers as a part of the landscape, that we can forget how radical these insights were. In fact, I might argue that your entire question presupposes them. Would computers have come up with these insights if humans had not imagined that computers were possible and built them in the first place? Less facetiously, the idea that mathematics is a formally defined space in which a machine can search systematically clearly presupposes that mathematics can be formalized.

More generally, I'm wondering if you should expand your question to include concepts (or definitions) and not just proofs?

Edit. Just in case it wasn't clear, I believe that the above insights have fundamentally changed mathematicians' conception of what mathematics is, and as such I would argue that they are stronger examples of what you asked for than any specific proof of a specific theorem can be.

Whether the formalization of computability by Turing and various others in the '30s is the "right" one is a philosophical question, which maybe nobody knows how to think about. – Michael Hardy – 2010-12-10T05:05:04.953

There's something amusing about the idea of a computer coming up with the idea that computability can be formalized. – ndkrempel – 2010-12-10T13:58:01.487

This is an example that has bothered me in the past, and I have to admit that I don't have a good answer to it. The ability to introspect seems to be very important to mathematicians, and it's far from clear how a computer would do it. One could perhaps imagine a separate part of the program that looks at what the main part does, but it too would need to introspect. Perhaps this infinite regress is necessary for Godelian reasons but perhaps in practice mathematicians just use a bounded number of levels of navel contemplation. – gowers – 2010-12-10T14:29:26.240

I think some people deny the existence of such "levels" of introspection. – Michael Hardy – 2010-12-10T22:39:16.350

Conversely, this type of introspection and formalisation is much less effective outside of mathematics (Weinberg has called this the "unreasonable ineffectiveness of philosophy".) Attempts to axiomatise science, the humanities, etc., for instance, usually end up collapsing under the weight of their own artificiality (with some key exceptions in physics, notably relativity and quantum mechanics). The fact that mathematics is almost the sole discipline that actually benefits from formalisation is indeed an interesting insight in my opinion. – Terry Tao – 2010-12-11T16:59:56.867

But if you axiomatize some portion of some other science, doesn't that axiomatization constitute mathematics? So it seems almost tautologous to say that only mathematics "benefits" from formalization. – Michael Hardy – 2010-12-11T17:11:04.507

Well, the mathematics only comes in at the metalevel rather than at the field itself. Another example would be software engineering; this is a discipline that benefits tremendously from the presence of formal computer languages, which can then be studied mathematically from a computer science standpoint, but the software itself need not have any mathematical content. – Terry Tao – 2010-12-11T17:19:30.173

(and more relevantly, computer programmers need not be doing any mathematics in order to create that software.) – Terry Tao – 2010-12-11T17:26:39.073

35

Not sure whether to credit Abel or Galois with the "fundamental new way of thinking" here, but the proof that certain polynomial equations are not solvable in radicals required quite the reformulation of thinking. (I'm leaning towards crediting Galois with the brain rewiring reward.)

P.S. Is it really the case that no one else posted this, or is my "find" bar not working properly?

"Use of group theory to prove insolvability of 5th degree equation" is part of an earlier answer. – Gerry Myerson – 2010-12-12T11:14:20.320

Ah, I was searching for "Galois" and "Abel-Ruffini." Whoops... – Dylan Wilson – 2010-12-12T21:03:59.703

I recently (re)read the sections of Dummit-Foote on Galois theory. I think it was stated therein that Abel found the first proof of the insolvability of the quintic, but it was Galois who found the general method involving group theory for proving the solvability or insolvability of any polynomial. I don't recall if Dummit-Foote elaborates on Abel's method, but presumably it doesn't generalize like Galois's method does. – Kevin H. Lin – 2010-12-17T10:54:11.083

34

The use of spectral sequences to prove theorems about homotopy groups. For instance, until Serre's mod C theory, nobody knew that the homotopy groups of spheres were even finitely generated.

I think the whole mod C theory also answers the question, even without connecting it to spectral sequences. A computer would probably not think to develop mod C theory as a way to study homotopy groups of spheres, since there is no a priori connection between the two – David White – 2011-05-18T12:54:36.633

33

Another example from logic is Gentzen's consistency proof for Peano arithmetic by transfinite induction up to $\varepsilon_0$, which I think was completely unexpected, and unprecedented.

29

It seems that certain problems seem to induce this sort of new thinking (cf. my article "What is good mathematics?"). You mentioned the Fourier-analytic proof of Roth's theorem; but in fact many of the proofs of Roth's theorem (or Szemeredi's theorem) seem to qualify, starting with Furstenberg's amazing realisation that this problem in combinatorial number theory was equivalent to one in ergodic theory, and that the structural theory of the latter could then be used to attack the former. Or the Ruzsa-Szemeredi observation (made somewhat implicitly at the time) that Roth's theorem follows from a result in graph theory (the triangle removal lemma) which, in some ways, was "easier" to prove than the result that it implied despite (or perhaps, because of) the fact that it "forgot" most of the structure of the problem. And in this regard, I can't resist mentioning Ben Green's brilliant observation (inspired, I believe, by some earlier work of Ramaré and Ruzsa) that for the purposes of finding arithmetic progressions, the primes should not be studied directly, but instead should be viewed primarily [pun not intended] as a generic dense subset of a larger set of almost primes, for which much more is known, thanks to sieve theory...

Another problem that seems to generate radically new thinking every few years is the Kakeya problem. Originally a problem in geometric measure theory, the work of Bourgain and Wolff in the early 90s showed that the combinatorial incidence geometry viewpoint could lead to substantial progress. When this stalled, Bourgain (inspired by your own work) introduced the additive combinatorics viewpoint, re-interpreting line segments as arithmetic progressions. Meanwhile, Wolff created the finite field model of the Kakeya problem, which among other things led to the sum-product theorem and many further developments that would not have been possible without this viewpoint. In particular, this finite field version enabled Dvir to introduce the polynomial method which had been applied to some other combinatorial problems, but whose application to the finite field Kakeya problem was hugely shocking. (Actually, Dvir's argument is a great example of "new thinking" being the key stumbling block. Five years earlier, Gerd Mockenhaupt and I managed to stumble upon half of Dvir's argument, showing that a Kakeya set in finite fields could not be contained in a low-degree algebraic variety. If we had known enough about the polynomial method to make the realisation that the exact same argument also showed that a Kakeya set could not have been contained in a high-degree algebraic variety either, we would have come extremely close to recovering Dvir's result; but our thinking was not primed in this direction.) Meanwhile, Carbery, Bennett, and I discovered that heat flow methods, of all things, could be applied to solve a variant of the Euclidean Kakeya problem (though this method did appear in literature on other analytic problems, and we viewed it as the continuous version of the discrete induction-on-scales strategy of Bourgain and Wolff).
Most recently is the work of Guth, who broke through the conventional wisdom that Dvir's polynomial argument was not generalisable to the Euclidean case by making the crucial observation that algebraic topology (such as the ham sandwich theorem) served as the continuous generalisation of the discrete polynomial method, leading among other things to the recent result of Guth and Katz you mentioned earlier.

EDIT: Another example is the recent establishment of universality for eigenvalue spacings for Wigner matrices. Prior to this work, most of the rigorous literature on eigenvalue spacings relied crucially on explicit formulae for the joint eigenvalue distribution, which were only tractable in the case of highly invariant ensembles such as GUE, although there was a key paper of Johansson extending this analysis to a significantly wider class of ensembles, namely the sum of GUE with an arbitrary independent random (or deterministic) matrix. To make progress, one had to go beyond the explicit formula paradigm and find some way to compare the distribution of a general ensemble with that of a special ensemble such as GUE. We now have two basic ways to do this, the local relaxation flow method of Erdos, Schlein, Yau, and the four moment theorem method of Van Vu and myself, both based on deforming a general ensemble into a special ensemble and controlling the effect on the spectral statistics via this deformation (though the two deformations we use are very different, and in fact complement each other nicely). Again, both arguments have precedents in earlier literature (for instance, our argument was heavily inspired by Lindeberg's classic proof of the central limit theorem) but as far as I know it had not been thought to apply them to the universality problem before.

But, Terry, are the adjectives "radical" or "fundamentally new" really justified in the description of any of these examples? and of our business as a whole? – Gil Kalai – 2010-12-10T13:30:05.407

I am not sure Guth's work "broke through" any conventional wisdom. Suppose that you have a famous problem A and a new related problem B and you believe that 1) Problem A is very hard, 2) Progress for problems A and B is very related. Now, Problem B is easily settled using method C. This is in some tension with your earlier beliefs so you need to update them. So the conventional wisdom (or Bayesian thinking) will lead you to think that: 1) Method C may be useful for problem A; 2) Maybe problem A and B are not as closely related as we believed, 3) Maybe problem A is not as hard as we believed. – Gil Kalai – 2010-12-11T12:03:17.500

Gil, many people (including myself) tried options (1) and (3) (with C equal to algebraic geometry, and more precisely the polynomial method), but for continuous problems (such as incidences between balls and tubes, which is basically what Kakeya is) it failed dramatically. Guth's breakthrough was to observe that A should be attacked instead using method D (algebraic topology, and more precisely the polynomial Ham sandwich theorem). [Cont] – Terry Tao – 2010-12-11T16:47:15.033

In other words, his contribution was that D:A=C:B (algebraic topology is to continuous incidence geometry as algebraic geometry is to discrete incidence geometry), which was definitely a very different way of thinking about these four concepts that was totally absent in previous work. (After Guth's work, it is now "obvious" in retrospect, of course.) – Terry Tao – 2010-12-11T16:48:34.923

Perhaps what this example shows is that a computer trying to generate mathematical progress has to look at more than just the 1-skeleton of mathematics (B is solved by C; A is close to B; hence A might be solved by C) but also at the 2-skeleton (B is solved by C; D is to A as C is to B; hence A might be solved by D) or possibly even higher order skeletons. It seems unlikely though that these possibilities can be searched through systematically in polynomial time, without the speedups afforded by human insight... – Terry Tao – 2010-12-11T17:12:05.640

That's a very interesting account, Terry. (The account still agrees with 1 & 2 & 3 being reasonable reactions to Dvir's discovery and 1 & 2 perhaps correct-in-hindsight.) Regarding the general issue, I think that talking about what computers can or cannot do in the context of this question is vastly premature. Forgetting about computers, we can still discuss what is fundamentally new (FN) and what is not, but this I also find rather uncomfortable for various reasons [cont]: – Gil Kalai – 2010-12-11T19:38:58.417

a) it is very difficult; understanding the flow of ideas and influences is extremely hard; b) I do not see how it can be useful. Let's say that I understand why Dvir's and Guth's results are FN while Guth and Katz is not FN. This seems as useful for achieving something similar to their achievements as a detailed understanding of the choices of last week's lottery winner. c) (when we consider recent discoveries) It is also personal, and I am not sure we have good tools to discuss such matters avoiding it being loaded. – Gil Kalai – 2010-12-11T19:40:55.830

[Cont:] There is another difficulty which is not so much about "new" but about "fundamentally". a') Putting a metric on ideas and telling what is really FN seems also difficult. I like to think about mathematics as a fractal-like beast and in this thinking you can have an idea which is FN in one scale which is almost identical to another idea looked at another scale. – Gil Kalai – 2010-12-13T16:56:26.810

24

I think that Eichler and Shimura's proof of the Ramanujan--Petersson conjecture for weight two modular forms provides an example. Recall that this conjecture is a purely analytic statement: namely that if $f$ is a weight two cuspform on some congruence subgroup of $SL_2(\mathbb Z)$, which is an eigenform for the Hecke operator $T_p$ ($p$ a prime not dividing the level of the congruence subgroup in question) with eigenvalue $\lambda_p$, then $| \lambda_p | \leq 2 p^{1/2}.$ Unfortunately, no purely analytic proof of this result is known. (Indeed, if one shifts one's focus from holomorphic modular forms to Maass forms, then the corresponding conjecture remains open.)

What Eichler and Shimura realized is that, somewhat miraculously, $\lambda_p$ admits an alternative characterization in terms of counting solutions to certain congruences modulo $p$, and that estimates there due to Hasse and Weil (generalizing earlier estimates of Gauss and others) can be applied to show the desired inequality.
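To make the "counting solutions to congruences" statement concrete, here is a minimal sketch for one standard instance that is not taken from the answer itself: the weight two cusp form of level 11, $q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$, and the curve $y^2 + y = x^3 - x^2$. The choice of form and curve, and all function names, are assumptions made for illustration. The script compares the $p$-th Fourier coefficient $\lambda_p$ with $p + 1 - N_p$, where $N_p$ is the number of points on the curve mod $p$ (including the point at infinity), and checks $|\lambda_p| \leq 2p^{1/2}$ for a few primes $p \neq 11$.

```python
def level11_coefficients(limit):
    """Coefficients a[0..limit] of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2."""
    # series[k] holds the coefficient of q^k in the product WITHOUT the leading q,
    # truncated at degree limit - 1.
    series = [0] * limit
    series[0] = 1
    for n in range(1, limit):
        for m in (n, 11 * n):
            if m >= limit:
                continue
            for _ in range(2):  # each factor appears squared
                # Multiply in place by (1 - q^m); going downwards keeps old values intact.
                for k in range(limit - 1, m - 1, -1):
                    series[k] -= series[k - m]
    return [0] + series  # shift by the leading q


def count_points(p):
    """Number of points on y^2 + y = x^3 - x^2 over F_p, plus the point at infinity."""
    solutions_in_y = [0] * p
    for y in range(p):
        solutions_in_y[(y * y + y) % p] += 1
    affine = sum(solutions_in_y[(x ** 3 - x * x) % p] for x in range(p))
    return affine + 1


def compare(primes):
    """Return (p, lambda_p, p + 1 - N_p) for each prime p in the list."""
    a = level11_coefficients(max(primes))
    return [(p, a[p], p + 1 - count_points(p)) for p in primes]


if __name__ == "__main__":
    for p, lam, geometric in compare([2, 3, 5, 7, 13, 17, 19, 23, 29]):
        print(p, lam, geometric, lam * lam <= 4 * p)
```

The two middle columns are computed by unrelated means (a power series product and a brute-force count), which is what makes their agreement striking.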

This argument was pushed much further by Deligne, who handled the general case of weight $k$ modular forms (for which the analogous inequality is $| \lambda_p | \leq 2 p^{(k-1)/2}$), using etale cohomology of varieties in characteristic $p$ (which is something of a subtle and more technically refined analogue of the notion of a congruence mod $p$). (Ramanujan's original conjecture was for the unique cuspform of weight 12 and level 1.)

The idea that there are relationships (some known, others conjectural) between automorphic forms and algebraic geometry over finite fields and number fields has now become part of the received wisdom of algebraic number theorists, and lies at the heart of the Langlands program. (And, of course, at the heart of the proof of FLT.) Thus the striking idea of Eichler and Shimura has now become a basic tenet of a whole field of mathematics.

Note: Tim in his question, and in some comments, has said that he wants "first non-trivial instances" rather than difficult arguments that involve a whole range of ideas and techniques. In his comment to Terry Tao's answer regarding Perelman, he notes that long, difficult proofs might well include within them instances of such examples. Thus I am offering this example as perhaps a "first non-trivial instance" of the kind of insights that are involved in proving results like Sato--Tate, FLT, and so on.

22

Gromov's use of J-holomorphic curves in symplectic topology (he reinterpreted holomorphic functions in the sense of Vekua) as well as the invention of Floer homology (in order to deal with the Arnol'd conjecture).

22

I'm a little surprised no one has cited Thurston's impact on low-dimensional topology and geometry. I'm far from an expert, so I'm reluctant to say much about this. But I have the impression that Thurston revolutionized the whole enterprise by taking known results and expressing them from a completely new perspective that led naturally to both new theorems and a lot of new conjectures. Perhaps Thurston himself or someone else could say something, preferably in a separate answer so I can delete mine.

21

Donaldson's idea of using global analysis to get more insight about the topology of manifolds. Nowadays it is clear to us that (non-linear) moduli spaces give something new, and more than linear (abelian) Hodge theory, for example, but I think at that time this was really new.

I fully agree with this. I was a graduate student at Harvard, when Atiyah came and described Donaldson's thesis (Donaldson got his Ph.D. the same year as me). Before that, we all thought we were trying to understand Yang-Mills, because it connected geometric analysis to physics and not because we thought it would prove topological theorems. As I recall it, Atiyah said that when Donaldson first proposed what he wanted to do, Atiyah was skeptical and tried to convince Donaldson to work on something less risky. – Deane Yang – 2010-12-11T05:10:21.687

18

I don't know how good an example this is. The Lefschetz fixed point theorem tells you that you can count (appropriately weighted) fixed points of a continuous function $f : X \to X$ from a compact triangulable space to itself by looking at the traces of the induced action of $f$ on cohomology. This is a powerful tool (for example it more-or-less has the Poincare-Hopf theorem as a special case).

Weil noticed that the number of points of a variety $V$ over $\mathbb{F}_{q^n}$ is the number of fixed points of the $n^{th}$ power of the Frobenius map $f$ acting on the points of $V$ over $\overline{\mathbb{F}_q}$ and, consequently, that it might be possible to describe the local zeta function of $V$ if one could write down the induced action of $f$ on some cohomology theory for varieties over finite fields. This led to the Weil conjectures, the discovery of $\ell$-adic cohomology, etc. I think this is a pretty good candidate for a powerful but unexpected technique.

18

Gromov's proof that finitely generated groups with polynomial growth are virtually nilpotent. The ingenious step is to consider a scaling limit of the usual metric on the Cayley graph of the finitely generated group.

Of course the details are messy and to get the final conclusion one has to rely on a lot of deep results on the structure of topological groups. However, already the initial idea is breathtaking.

17

Quillen's construction of the cotangent complex used homotopical algebra to find the correct higher-categorical object without explicitly building a higher category. This may sound newfangled and modern, but if you read Grothendieck's book on the cotangent complex, his explicit higher-categorical construction was only able to build a cotangent complex that had its (co)homology truncated to degree 2. Strangely enough, by the time Grothendieck's book was published, it was already obsolete, as he notes in the preface (he says something about how new work of Quillen (and independently André) had made his construction (which is substantially more complicated) essentially obsolete).

Which book of Grothendieck's are you referring to? Do you, perhaps, mean Illusie's book? – Dylan Wilson – 2013-06-27T17:19:31.420

@DylanWilson No, it's a little-known book of which I have forgotten the title. I found it in the Michigan library, mimeographed on only single-sided pages. It was a most bizarre book. – Harry Gindi – 2017-08-22T19:39:48.283

@HarryGindi: Was it this one? https://link.springer.com/book/10.1007%2FBFb0082437 – Andy Putman – 2017-08-23T02:31:24.227

@AndyPutman That's the one! Grothendieck basically admits in the introduction that it's really only of historical interest following the work of Quillen. André also gave a more traditional alternative presentation not relying directly on model categories, if I remember correctly. I have a really low quality scan of that book. "Homologie des Algebres Commutatives" - Michel André – Harry Gindi – 2017-08-23T16:51:18.710

17

Topological methods in combinatorics (started by Lovász's proof of the Kneser conjecture, I guess).

16

Emil Artin's solution of Hilbert's 17th problem which asked whether every positive polynomial in any number of variables is a sum of squares of rational functions.

Artin's proof goes roughly as follows. If $p \in \mathbb R[x_1,\dots,x_n]$ is not a sum of squares of rational functions, then there is some real-closed extension $L$ of the field of rational functions in which $p$ is negative with respect to some total ordering (compatible with the field operations), i.e. there exists an $L$-point of $\mathbb R[x_1,\dots,x_n]$ at which $p$ is negative. However, using a model theoretic argument, since $\mathbb R$ is also a real-closed field with a total ordering, there also has to be a real point such that $p<0$, i.e. there exists $x \in \mathbb R^n$ such that $p(x)< 0$. Hence, if $p$ is everywhere positive, then it is a sum of squares of rational functions.

The ingenious part is the use of a model theoretic argument and the bravery to consider a totally ordered real-closed extension of the field of rational functions.

I'm pretty sure the model-theoretic approach is due to model theorist Abraham Robinson. Of course completeness and decidability of Th(R,+,*,0,1,<) goes back to Tarski. – Pietro KC – 2010-12-13T00:34:51.630

I think that Ax-Grothendieck can be lumped with this in a sort of "unexpected model-theoretic arguments" category. – dvitek – 2010-12-17T13:21:24.877

I'd generalize this answer to include the observation that transfinite induction (or the axiom of choice) can simplify proofs of statements that don't actually require them. This is similar to how probabilistic arguments can sometimes be simpler than constructions. Here's an example statement for which all three kinds of proof exist: there exists a set $A \subseteq [0,1]^2$ that is dense everywhere on the unit square $[0,1]^2$, but for every $x$, $A$ contains only finitely many points of the form $(x, y)$ or $(y, x)$. – Zsbán Ambrus – 2010-12-19T16:57:14.777

15

Heegner's solution to the Gauss class number 1 problem for imaginary quadratic fields, by noting that when the class number is 1 then a certain elliptic curve is defined over Q and certain modular functions take integer values at certain quadratic irrationalities, and then finding all the solutions to Diophantine equations that result, seems to me equally beautiful and unexpected. Maybe its unexpectedness kept people from believing it for a long time.

14

I find Shannon's use of random codes to understand channel capacity very striking. It seems to be very difficult to explicitly construct a code which achieves the channel capacity - but picking one at random works very well, provided one chooses the right underlying measure. Furthermore, this technique works very well for many related problems. I don't know the details of your Example 4 (Erdos and Ramsey numbers), but I expect this is probably closely related.
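Both arguments rest on the same counting move: pick the object at random and bound the expected number of bad events. As a hedged illustration of that step for Example 4 (the answer above does not work it out, and the function names are made up for this sketch), the following computes the expected number of monochromatic $k$-cliques in a uniformly random 2-colouring of the edges of $K_n$, namely $\binom{n}{k}2^{1-\binom{k}{2}}$. Whenever this is below 1, some colouring has no monochromatic $k$-clique, so $R(k,k) > n$.

```python
from fractions import Fraction
from math import comb, isqrt


def expected_monochromatic_cliques(n, k):
    """Expected number of monochromatic k-cliques in a random 2-colouring of K_n."""
    # Each of the comb(n, k) vertex sets is monochromatic with probability 2 / 2^comb(k, 2).
    return Fraction(2 * comb(n, k), 2 ** comb(k, 2))


def ramsey_lower_bound_witness(k):
    """Return n = floor(2^(k/2)) together with the expectation for that n."""
    n = isqrt(2 ** k)  # exact integer floor of 2^(k/2)
    return n, expected_monochromatic_cliques(n, k)


if __name__ == "__main__":
    for k in range(3, 13):
        n, expectation = ramsey_lower_bound_witness(k)
        print(k, n, float(expectation), expectation < 1)
```

Exact rational arithmetic is used so that the comparison with 1 is not affected by rounding.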

13

Sometimes mathematics is not only about the methods of the proof, it is about the statement of the proof. E.g., it is hard to imagine a theorem-searching algorithm ever finding a proof of the results in Shannon's 1948 Mathematical Theory of Communication, without that algorithm first "imagining" (by some unspecified process) that there could BE a theory of communication.

Even so celebrated a mathematician as J. L. Doob at first had trouble grasping that Shannon's reasoning was mathematical in nature, writing in his AMS review (MR0026286):

[Shannon's] discussion is suggestive throughout, rather than mathematical, and it is not always clear that the author's mathematical intentions are honorable.

The decision of which mathematical intentions are to be accepted as "honorable" (in Doob's phrase) is perhaps very difficult to formalize.

One finds this same idea expressed in von Neumann's 1948 essay The Mathematician:

Some of the best inspirations of modern mathematics (I believe, the best ones) clearly originated in the natural sciences. ... As any mathematical discipline travels far from its empirical source, or still more, if it is a second or third generation only indirectly inspired by ideas coming from "reality", it is beset by very grave dangers. It becomes more and more purely aestheticizing, more and more l'art pour l'art. ... Whenever this stage is reached, the only remedy seems to me to be the rejuvenating return to the source: the reinjection of more or less directly empirical ideas.

One encounters this theme of inspiration from reality over and over in von Neumann's own work. How could a computer conceive theorems in game theory ... without having empirically played games? How could a computer conceive the theory of shock waves ... without having empirically encountered the intimate union of dynamics and thermodynamics that makes shock wave theory possible? How could a computer conceive theorems relating to computational complexity ... without having empirically grappled with complex computations?

The point is straight from Wittgenstein and E. O. Wilson: in order to conceive mathematical theorems that are interesting to humans, a computer would have to live a life similar to an ordinary human life, as a source of inspiration.

How could a computer conceive theorems relating to computational complexity ... without having empirically grappled with complex computations? But it had! – timur – 2013-08-15T06:37:05.600

12

Shigefumi Mori's proof of Hartshorne's conjecture (the projective spaces are the only smooth projective varieties with ample tangent bundles). In his proof, Mori developed many new techniques (e.g. the bend-and-break lemma), which later became fundamental in birational geometry.

11

And how about Perelman's proof of Poincare's conjecture?

@ anonymous: how does Perelman's proof require a fundamental new way of thinking? I haven't read it because I know I won't understand it, but I'm just curious why Perelman's approach is so original. Could you please explain that? – Max Muller – 2010-12-09T16:28:20.060

It would be a better example if the proof were easier to understand ... – gowers – 2010-12-09T16:38:19.553

I think there are at least two aspects of the Perelman-Hamilton theory that fit the bill. One is Hamilton's original realisation that Ricci flow could be used to at least partially resolve the Poincare conjecture (in the case of 3-manifolds that admit a metric with positive Ricci curvature). There was some precedent for using PDE flow methods to attack geometric problems, but I think this was the first serious attempt to attack the manifestly topological Poincare conjecture in that fashion, and was somewhat contrary to the conventional wisdom towards Poincare at the time. [cont.] – Terry Tao – 2010-12-09T17:30:28.557

The other example is when Perelman needed a monotone quantity in order to analyse singularities of the Ricci flow. Here he had this amazing idea to interpret the parabolic Ricci flow as an infinite-dimensional limit of the elliptic Einstein equation, so that monotone quantities from the elliptic theory (specifically, the Bishop-Gromov inequality) could be transported to the parabolic setting. This is a profoundly different perspective on Ricci flow (though there was some precedent in the earlier work of Chow) and it seems unlikely that this quantity would have been discovered otherwise. – Terry Tao – 2010-12-09T17:32:44.867

Terry's answer illustrates a principle relevant to this question: even if a proof as a whole is too complex to count as a good example, there are quite likely to be steps of the proof that are excellent examples. – gowers – 2010-12-09T20:54:41.907

Regarding Terry Tao's Dec 9 '10 at 17:32 comment: (1) He is referring to joint work of Sun-Chin Chu, answering a conjecture of Hamilton that his Harnack estimate is the same as the positivity of some type of curvature. (2) In my opinion, a direct precedent for Perelman's work is Li and Yau's differential Harnack estimate for the heat equation, which also motivated Hamilton's estimate. (3) What's striking about Perelman's work is: (i) the profound synthesis of geometry and analysis, to the point where they are nearly indistinguishable; (ii) the high degree of subtlety and complexity of the arguments. – Bennett Chow – 2013-11-19T01:55:27.357

11

Lobachevsky and Bolyai certainly introduced a fundamentally new way of thinking, though I'm not sure it fits the criterion of being a proof of something - perhaps a proof that a lot of effort had been wasted in trying to prove the parallel postulate.

11

The use of ideals in rings, rather than elements (in terms of factorization, etc...).

This was followed by another revolutionary idea: using radical (Jacobson radical, etc...) instead of simple properties on elements.

10

Morse theory is another good example. Indeed it is the inspiration for Floer theory, which has already been mentioned.

Atiyah-Bott's paper "Yang-Mills equations on a Riemann surface" and Hitchin's paper "Self-duality equations on a Riemann surface" both contain rather striking applications of Morse theory. The former paper contains for example many computations about cohomology rings of moduli spaces of holomorphic vector bundles over Riemann surfaces; the latter paper proves for instance that moduli spaces of Higgs bundles over Riemann surfaces are hyperkähler.

Note that these moduli spaces are algebraic varieties and can be (and are) studied purely from the viewpoint of algebraic geometry. But if we look at things from an analytic point of view, and we realize these moduli spaces as quotients of infinite dimensional spaces by infinite dimensional groups, and we use the tools of analysis and Morse theory, as well as ideas from physics(!!!), then we can discover perhaps more about these spaces than if we viewed them just algebraically, as simply being algebraic varieties.

I recently learned that you can bound the sum of Betti numbers of a real algebraic set using Morse theory. In particular you can get sharp bounds on the number of connected components that way. This was proved by Milnor and Thom. This turned out to be useful for some application of mine in Markov chain mixing. – John Jiang – 2012-01-04T20:29:14.427

8

I am always impressed by proofs that reach outside the obvious tool-kit. For example, the proof that the dimensions of the irreducible representations of a finite group divide the order of the group relies on the fact that the character values are algebraic integers. In particular, given a finite group $G$ and an irreducible character $\chi$ of dimension $n,$ $$\frac{1}{n} \sum_{s \in G} \chi(s^{-1})\chi(s) = \frac{|G|}{n}.$$ However, since $\frac{|G|}{n}$ is an algebraic integer (it is the image of an algebra homomorphism) lying in $\mathbb{Q},$ it in fact lies in $\mathbb{Z}.$
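As a small sanity check of the displayed identity (not of the algebraic-integer argument itself), here is a sketch for $G = S_3$. The character table is hard-coded and the names are illustrative assumptions; since every element of $S_3$ is conjugate to its inverse, $\chi(s^{-1}) = \chi(s)$ and the sum can be taken class by class.

```python
from fractions import Fraction

# Conjugacy classes of S_3: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]
GROUP_ORDER = sum(CLASS_SIZES)

# Irreducible characters of S_3, listed by their values on the classes above.
CHARACTERS = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}


def normalised_sum(chi):
    """(1/n) * sum over s in G of chi(s^-1) * chi(s), with n = chi(identity)."""
    n = chi[0]
    total = sum(size * value * value for size, value in zip(CLASS_SIZES, chi))
    return Fraction(total, n)


if __name__ == "__main__":
    for name, chi in CHARACTERS.items():
        value = normalised_sum(chi)
        print(name, value, value == Fraction(GROUP_ORDER, chi[0]))
```

In each case the result equals $|G|/n$ and is an integer, as the argument predicts.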

7

Use of Lagrange's theorem (group theory) to prove Fermat's small theorem?

Use of fixed point methods (and completeness) to prove existence of solutions to differential equations?

Use of field theory to prove the impossibility of the trisection of the angle?

Use of group theory to prove insolvability of 5th degree equation?

Fermat's small theorem is in its nature group-theoretic. I don't see anything surprising in proving it via Lagrange. Application of fixed point methods to differential equations is applying analysis to analysis. It only asks one to realize that differential and integral operators are maps... :-) But I think the 3rd and 4th points of your answer definitely suit the original question. – efq – 2010-12-09T19:00:49.980

I disagree that Fermat's little theorem is group-theoretic. Certainly one can think of it in those terms, but it has a natural generalization to what I call the "necklace congruences" (I don't know if there is an established term for them) which is essentially number-theoretic: http://qchu.wordpress.com/2009/08/23/newtons-sums-necklace-congruences-and-zeta-functions/

– Qiaochu Yuan – 2010-12-09T23:25:34.550

Many things have generalizations outside of their initial domains. While generalizations certainly contribute by providing new views on the original result, they don't necessarily (=often, but not always) show certain contexts to be more natural than others. Ofc, FlT ($\neq$ FLT) has number-theoretic nature. As simple as it might be, modular arithmetic is IMHO one of the immediate reasons why to think of FlT as number-theoretic, but this does not exclude FlT from also being group-theoretic. (cont.) – efq – 2010-12-10T00:46:05.103

I think FlT is group-theoretic because among all structures, underlying various proofs of FlT, the group one seems to be the most "simple" one, meaning the structure, that underlies the proof frame, arises from only one operation - group multiplication. I guess, it is a matter of taste what is "natural". You think, the group-theoretic approach to FlT is an unnatural or less natural one? – efq – 2010-12-10T00:49:22.280

6

Turing's solution of Hilbert's Entscheidungsproblem. The new idea was to invent the Turing machine and "virtualization" (the universal Turing machine).

6

I am surprised that no one mentioned Hilbert's proof of Hilbert's Basis Theorem yet. It says that every ideal in $\mathbb{C}[x_1,\ldots,x_n]$ is finitely generated - the proof is nonconstructive in the sense that it does not give an explicit set of generators of an ideal. When P. Gordan (a leading algebraist at that time) first saw Hilbert's proof, he said, "This is not Mathematics, but theology!"

However, in 1899, Gordan published a simplified proof of Hilbert's theorem and commented with "I have convinced myself that theology also has its advantages."

6

Some more proofs that startled me (in a random order):

Liouville's theorem to prove that the Weierstrass P-function satisfies the differential equation you know.

Complex methods to establish the addition law on an elliptic curve.

Cauchy's formula (for P'/P) to prove that C is algebraically closed.

Pigeon hole principle to prove existence of solutions to Fermat-Pell's equation

Kronecker's solution to the same equation, using L-functions.

Minkowski's lemma (a convex compact, symmetric, of volume 2^n contains a non trivial integer point) and its use to prove Dirichlet's theorem on the structure of units in number fields.

Fourier transform to prove (versions of) the central limit theorem.

Multiplicativity of Ramanujan's tau function via Hecke operators.

Poisson formula and its use (for example, for the functional equation of Riemann's zeta function, or for computing the volume of SL_n(R)/SL_n(Z), or values of zeta at even positive integers).

On these kinds of question it is (I think) preferred if people leave one or two examples per answer, rather than a barrage. Moreover, in what sense did your examples require "fundamentally new ways of thinking" rather than being distillations of ideas already in the air, or examples you find cool? – Yemon Choi – 2010-12-10T06:56:45.627

6

How about Rabinowitsch's proof of the Nullstellensatz?

I was about to go post about the 3-5 switch in the proof of Fermat's Last Theorem when I read this answer. It made me realize both are tricks rather than "proofs which require a fundamentally new way of thinking." Indeed, I think a computer could easily come up with the idea of introducing a new variable to simplify what needs to be proven (Rabinowitsch) or of switching tactics to deal with one case separately (Wiles). I'd go so far as to say computers are much better than humans at this kind of equational reasoning. – David White – 2011-05-19T20:01:45.630

6

How about Bolzano's 1817 proof of the intermediate value theorem?

In English here: Russ, S. B. "A Translation of Bolzano's Paper on the Intermediate Value Theorem." Hist. Math. 7, 156-185, 1980.

Or in the original here: Bernard Bolzano (1817). Purely analytic proof of the theorem that between any two values which give results of opposite sign, there lies at least one real root of the equation. In Abhandlungen der königlichen böhmischen Gesellschaft der Wissenschaften Vol. V, pp.225-48.

Not fully rigorous, according to today's standards, but perhaps his method of proof could be considered a breakthrough nonetheless.

More specifically, Bolzano was first to recognize that a completeness property of the real numbers was needed, and he proposed the principle that any bounded set of real numbers has a least upper bound. – John Stillwell – 2011-08-29T16:51:50.370

6

Novikov's proof of the topological invariance of rational Pontryagin classes, for which he was awarded the 1970 Fields Medal. Fundamentally new (complicating a fundamental group to simplify geometry), and also fundamentally important. Here is what Sir Michael Atiyah had to say (as cited in the introduction to Ranicki's Higher Dimensional Knot Theory):

Undoubtedly the most important single result of Novikov, and one which combines in a remarkable degree both algebraic and geometric methods, is his famous proof of the topological invariance of (rational) Pontryagin classes of a differentiable manifold... As is well-known many topological problems are very much easier if one is dealing with simply-connected spaces. Topologists are very happy when they can get rid of the fundamental group and its algebraic complications. Not so Novikov! Although the theorem above involves only simply-connected spaces, a key step in its proof consists in perversely introducing a fundamental group, rather in the way that (on a much more elementary level) puncturing the plane makes it non-simply-connected. This bold move has the effect of simplifying the geometry at the expense of complicating the algebra, but the complication is just manageable and the trick works beautifully. It is a real master stroke and completely unprecedented.

5

I have two favorite examples.

A. H. Weyl's 1916 proof of the equidistribution (in $[0,1]$) of the sequence $x_n=n\alpha \bmod \mathbb{Z}$, $\alpha$ irrational. He formulates an even more complicated question, namely to prove that

$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^n f(x_k)=\int_0^1 f(x) dx,$$

for any Riemann integrable $f$. (The uniform distribution follows by setting $f=$ the characteristic function of an interval.) To prove the more complicated statement he observes that the space $X$ of $f$'s satisfying the above equality is a vector space and that it is closed with respect to a natural topology. He then observes that $X$ contains all the trigonometric polynomials (trivial computation) and thus $X$ must contain all the functions that can be approximated by trig polynomials. This implies that $X$ contains all the Riemann integrable functions. This is a soft-touch proof, with no brute force computation, using ideas of functional analysis at a time when the ideas of functional analysis were not part of the mathematical arsenal.

B. Forty years later A. Grothendieck gave a beautiful proof of the (then) recently discovered Riemann-Roch-Hirzebruch formula. He formulated a more complicated problem, observed that the more complicated problem has a rich structure encoded in the object he invented, now called the $K$-theory of coherent sheaves, and then used functoriality to show that to prove the most general case it suffices to prove it for two special classes of examples.
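The "trivial computation" in example A can be checked numerically. The sketch below is only an illustration, not Weyl's argument: it takes $\alpha=\sqrt{2}$ (an arbitrary choice), averages $e^{2\pi i m x_k}$ over the first $n$ terms, and compares the result with the bound $2/(n\,|1-e^{2\pi i m\alpha}|)$ obtained by summing the geometric series. The function names are made up for this example.

```python
import cmath
import math


def weyl_average(m: int, alpha: float, n: int) -> complex:
    """Average of exp(2*pi*i*m*x_k) over x_k = k*alpha mod 1, for k = 1..n."""
    total = sum(
        cmath.exp(2j * math.pi * m * ((k * alpha) % 1.0)) for k in range(1, n + 1)
    )
    return total / n


def geometric_bound(m: int, alpha: float, n: int) -> float:
    """Bound on |weyl_average| from summing the geometric series (needs m*alpha not an integer)."""
    ratio = cmath.exp(2j * math.pi * m * alpha)
    return 2.0 / (n * abs(1 - ratio))


def interval_frequency(a: float, b: float, alpha: float, n: int) -> float:
    """Fraction of x_1..x_n that land in the interval [a, b)."""
    hits = sum(1 for k in range(1, n + 1) if a <= (k * alpha) % 1.0 < b)
    return hits / n


alpha = math.sqrt(2)  # assumed irrational rotation number, chosen for illustration
n = 10_000
for m in (1, 2, 3):
    print(m, abs(weyl_average(m, alpha, n)), geometric_bound(m, alpha, n))
print(interval_frequency(0.2, 0.5, alpha, n))  # should be close to the length 0.3
```

For each nonzero frequency $m$ the average tends to $0=\int_0^1 e^{2\pi i m x}\,dx$ at rate $O(1/n)$, which is the whole content of the computation; the passage to Riemann integrable $f$ is the soft approximation step described above.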

5

"unexpected technique"

Sometimes the result itself is unexpected. Cantor's diagonal proof (and other counterexamples), Gödel's incompleteness, Banach-Tarski and nonmeasurable sets, independence results generally.

I think you want cases where the result is anticipated, but the technique seems unrelated?

7 I would count Cantor's proof of the existence of transcendental numbers. – gowers – 2010-12-09T16:39:19.753

3

However, I think I have come up with a reasonably plausible account of how somebody could have thought of it: http://gowers.wordpress.com/2010/12/09/finding-cantors-proof-that-there-are-transcendental-numbers/

– gowers – 2010-12-09T23:00:30.140

5

The Lebesgue integral seems to have been a fundamentally new way of thinking about the integral. It's hard to prove the convergence theorems if you have the Riemann integral in mind. I suppose there are probably many instances where you can give a computer a very ineffective definition of something and ask that it prove theorems. Ask it to prove anything about the primes where you start with the converse of Wilson's theorem as the definition of a prime. Can the computer figure out that its definition is terrible? Can it figure out what a prime really "is"?
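To make the point about an "ineffective definition" concrete, here is a small sketch comparing a Wilson-style definition of primality with ordinary trial division. Both function names are hypothetical, and the Wilson test is written as the two-way criterion $(n-1)! \equiv -1 \pmod n$ for $n>1$.

```python
from math import factorial


def is_prime_wilson(n: int) -> bool:
    """Wilson-style definition: n > 1 is prime iff (n-1)! is congruent to -1 mod n."""
    return n > 1 and factorial(n - 1) % n == n - 1


def is_prime_trial(n: int) -> bool:
    """Usual notion: n > 1 has no divisor d with 1 < d and d*d <= n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# The two definitions agree, but the first one hides divisibility entirely
# and needs a factorial with thousands of digits already for modest n.
agree = all(is_prime_wilson(n) == is_prime_trial(n) for n in range(200))
print(agree)
print(len(str(factorial(1008))))  # number of digits needed to test n = 1009
```

The two functions compute the same set, yet nothing in the first suggests factorization, divisors, or sieving, which is the sense in which a prover handed that definition would have to rediscover what a prime "is".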

4

1. Minkowski's geometric methods in algebraic number theory.
2. Kolmogorov's application of Shannon's entropy to classify Bernoulli dynamical systems.
3. Applications of low-dimensional geometric topology (the fundamental group) to abstract group theory.
4. Archimedes' applications of mechanics to geometry (he considered them to be but heuristics, and provided additional "purely geometric" proofs for his results, but it was, as we know now, profound mathematics, something like affine geometry, with linear coefficients of points, adding to $1$, more or less being normalized weights).
5. Banach's method of applying the Baire category theorem (Baire property).

(I have one more recent example too, which took specialists by surprise, but 5 is a nice number.)

I think it's preferred to leave one example per answer, for big-list questions such as these. You can always leave more than one answer – Yemon Choi – 2013-07-03T06:57:21.087

4

Would the "quantum method" fit the bill here?

"Quantum Proofs for Classical Theorems" Andrew Drucker, Ronald de Wolf

"Erdös and the Quantum Method" Richard Lipton

4

Lovasz's proof of cancellation in certain classes of finite structures still bewilders me; I can only imagine that he found the proof first and then came up with the theorem afterwards. The basic idea is to look at homomorphisms between a given structure and a sequence of other structures. A comparison of two such sequences involving structures of the form AxC and BxC can be taken to a comparison between A and B. The condition that there exists a one-element substructure is used to show a certain nontriviality of the comparison, and a few more details result in showing A is isomorphic to B if(f) AxC is isomorphic to BxC.

I should have asked Lovasz how he came up with the proof; I am confident that most people would not be able to come close to the method independently if they were only given the theorem statement. (Not to mention the analogous statement of unique nth roots in the same class.)

Could you please provide a pointer? – slimton – 2010-12-11T08:30:12.403

One reference is chapter 5 of Algebras, Lattices, Varieties by McKenzie, McNulty, and Taylor. The original paper by Lovasz appeared sometime near 1970 and is available online after searching the web for Lovasz and cancellation. Gerhard "Ask Me About System Design" Paseman, 2010.12.11 – Gerhard Paseman – 2010-12-11T09:07:27.817

At least as far as MSN is concerned, there are still two such papers, On the cancellation law among finite relational structures (1971) and Direct product in locally finite categories (1972), the second of which looks like it might be a generalisation of the first.

– LSpice – 2017-07-18T01:48:18.950

4

Hochster and Huneke's tight closure theory, used to prove various theorems in commutative algebra (Cohen-Macaulayness of rings of invariants, existence of big Cohen-Macaulay algebras)?

4

Proving that subgroups of free groups are free requires the knowledge of topology, a completely different field which a priori does not have anything to do with groups.

7 Though the topological proofs are beautiful and slick, you don't need topology to prove this fact (and the original proofs were completely algebraic; see Lyndon and Schupp's book on combinatorial group theory for nice accounts of Nielsen's original proof as well as the later Reidemeister-Schreier approach). – Andy Putman – 2010-12-18T06:54:46.183

4

The Ax-Kochen theorem about zeros of forms over the $p$-adics which was proved using model theory.

4

Here are two more candidates for new ways of thinking in proofs, but I am not sure about the historical picture. One is Brun's sieve, which led to new results in number theory. The other is Kummer's method, which led to proofs of many cases of FLT. (Frey's new way of thinking regarding FLT was already mentioned in a comment by Roy Smith.)

4

I work in automated theorem proving. I certainly agree, in principle, that there are no proofs that are inherently beyond the ability of a computer to solve, but I also think that there are fundamental methodological problems in addressing the problem as posed.

The problem is to come up with a solution that would not be regarded as 'cheating', i.e., somehow building the solution into the automated prover to start with. New proof methods can be captured by what are called 'tactics', i.e., programs that guide a prover through a proof. Clearly, it would not be satisfactory to analyse the original proof, extract a tactic from it (even a generic one) that captures the novel proof structure and then demonstrate that the enhanced prover could now 'discover' the novel proof. Rather, we want the prover to invent the new tactic itself, perhaps by some analysis of the conjecture to be proved, and then apply it. So we need an automated prover that learns. But anticipating what kind of tactic we want to be learnt may well influence the design of the learning mechanism. We've now kicked the 'cheating' problem up a level.

Methodologically, what we want is a large class of problems of this form. Some we can use for development of the learning mechanism, and some we can use to test it. Success on a previously unseen test set would demonstrate the generality of the learning mechanism and, hence, the absence of cheating. Unfortunately, these challenges are usually posed as 'can a prover prove this theorem' rather than 'can it solve this wide range of theorems, each requiring a different form of novelty'. Clearly, this latter form of the question is hugely challenging and we're unlikely to see it solved in the foreseeable future.

3

Nicolas Monod's ingenious two-page counterexample to the von Neumann conjecture, inspired by Mary Shelley's novel Frankenstein: http://arxiv.org/abs/1209.5229

+1 for mentioning Monod's example, -1 for completely gratuitous mention of the book (what Gothic themes and social commentary are supposed to have inspired piecewise affine homeomorphism groups?) – Yemon Choi – 2013-07-03T06:56:05.753

3

I would like to propose the theorem of J.H.C. Whitehead that if $X$ is a path connected space, and $Y$ is formed from $X$ by attaching $2$-cells, i.e. $Y=X \cup_{f_i}e^2_i$ for a family of maps $f_i: S^1 \to X$, then the crossed module $\partial: \pi_2(Y,X,x) \to \pi_1(X,x) \;$ is the free crossed module on the characteristic maps of the $2$-cells.

The proof spreads over three of his papers.

1. On adding relations to homotopy groups. Ann. of Math. (2) 42 (1941) 409--428.

2. Note on a previous paper entitled "On adding relations to homotopy groups". Ann. of Math. (2) 47 (1946) 806--810.

3. Combinatorial homotopy. II. Bull. Amer. Math. Soc. 55 (1949) 453--496.

The essential geometric content of the proof uses transversality and knot theory, and was in his paper 1. The definition of crossed module was given in his paper 2. Finally the definition of free crossed module was given in his paper 3, together with an outline of the proof, referring back to paper 1. You can find my own exposition of the proof here. The referee wrote that: "The theorem is not new. The proof is not new. But the paper should be published since these papers of Whitehead are notoriously obscure." I explained the proof once to Terry Wall, and he said it was a good 1960's type proof! What my paper does is repackage Whitehead's proof for a modern audience, and with pictures and consistent notation.

It seems to me pretty good to give the essence of a proof years before you have the right definitions for the theorem!

The notion of crossed module has over recent years become more widespread, partly because of its relation to $2$-groupoids and double groupoids. This is discussed a little in a seminar I gave in Chicago last year. See also the Wikipedia entry and that from the nlab.

@darij: link corrected. Thanks. – Ronnie Brown – 2013-10-13T21:29:09.030

3

Use of the Hardy–Littlewood circle method towards Waring's problem.

I've heard it sometimes referred to as the Hardy-Ramanujan method. Bruce Berndt writes in Ramanujan: essays and surveys (p. 148): We also remark that a forerunner of the Hardy-Ramanujan "circle method" can be found in [Ramanujan's] notebooks. – Chandan Singh Dalawat – 2010-12-13T10:49:29.687

http://en.wikipedia.org/wiki/Hardy%E2%80%93Littlewood_circle_method

"The initial germ of the idea is usually attributed to the work of Hardy with Srinivasa Ramanujan a few years earlier, in 1916 and 1917, on the asymptotics of the partition function."

– Unknown – 2010-12-13T12:50:52.377

3

There are two ways to prove the compactness theorem for propositional logic - either using the completeness theorem and going from semantic entailment to syntactic proof, or by a topological argument in Stone spaces. The latter, I feel, is an unexpected way of doing it - but I don't know the history of the subject so I'm probably not qualified to comment whether it was fundamentally new or not. Certainly in light of Stone's representation theorem, it seems unsurprising that there could be a topological proof of a theorem in logic, and as I understand it this connection is further investigated in topos theory?

3 The topology on Stone spaces is precisely the Zariski topology on the spectrum of a Boolean ring. It is one of three historical sources of the idea that commutative rings can be thought of as topological spaces, along with various work in algebraic geometry and Gelfand's work on C*-algebras. I have never been very clear on the historical relationship between the three. – Qiaochu Yuan – 2011-01-18T16:27:59.163

3

I think of the concepts of Archimedes that stand at the birth of infinitesimal calculus, such as the definition of the length of a circle (hence the concept of $\pi$) and how to calculate the area of a circle from $\pi$.

With all my admiration for Archimedes, a lot of credit should go to Eudoxus. – Włodzimierz Holsztyński – 2013-07-03T05:22:12.097

2

Bourgain, following Gowers's ideas on the Balog-Szemerédi theorem, gave what was then the sharpest bound for $n>8$ on the Minkowski dimension in the Kakeya problem. The work brought ideas from arithmetic combinatorics into harmonic analysis, and Charles Fefferman commented that it leaves one wondering where on Mars it came from.

2

Malliavin's proof of Hörmander's theorem is very interesting in the sense that one of the basic ingredients in the language of the proof is a derivative operator with respect to a Gaussian process acting on a Hilbert space. The adjoint of the derivative operator is known as the divergence operator, and with these two definitions one can establish the so-called "Malliavin calculus", which has been used to recover classical probabilistic results as well as to give new insight into current research in stochastic processes, such as developing a stochastic calculus with respect to fractional Brownian motion. What makes his proof more interesting is that Malliavin was trained in geometry and at times used the language of probability only in a somewhat marginal sense. A lot of his ideas are very geometric in nature, which can be seen for example in his very dense book: P. Malliavin: Stochastic Analysis. Grundlehren der Mathematischen Wissenschaften, 313. Springer-Verlag, Berlin, 1997.

2

How about Goodwillie Calculus? I'm not an expert in this field, but it seems to capture a lot of very deep ideas in stable homotopy theory and in category theory more generally. Here is a stub which includes some of the traditional concepts you can get back from Goodwillie Calculus: http://ncatlab.org/nlab/show/Goodwillie+calculus

Here are some lecture notes which go over the Goodwillie calculus and use it to derive the James splitting $\Sigma^\infty \Omega \Sigma X$ and the Snaith splittings of $\Sigma^\infty \Omega^n \Sigma^n X$ in a new way (this is an example of the "proof" the question is asking for): http://noether.uoregon.edu/~sadofsky/gctt/goodwillie.pdf

Finally, I recently saw an amazing talk given by Mark Behrens which used the Goodwillie Calculus to lift differentials in a particular spectral sequence to differentials in the EHP Spectral Sequence, meaning this abstract machinery could also lead to powerful new computational tools. This is discussed in a recent paper: http://www-math.mit.edu/~mbehrens/papers/GoodEHPmem.pdf

2

Barwise compactness and $\alpha$-recursion theory. The idea is that many properties of the following notions are captured by how they are defined in terms of $V_\omega$:

(1) Finite sets are elements of $V_{\omega}$.

(2) Computable sets are $\Delta_1$ definable over $V_{\omega}$.

(3) Computably enumerable sets are $\Sigma_1$ definable over $V_{\omega}$.

(4) First order logic is $L_{\infty, \omega} \cap V_\omega$.

Then, if we replace $V_\omega$ by a different countable admissible set $A$, many of the results relating these classes have analogs. E.g. Barwise compactness, completeness, the existence of an $A$-Turing jump, ...

2

Lovász's proof of the Shannon capacity of the pentagon (the only proof known). It introduces semidefinite optimization, and it geometrizes graph theory and brings analytic techniques into it. Descartes introduced the coordinate space approach to geometric problems. In the same spirit, Lovász's proof introduces a coordinate space approach to graph theory problems.
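The coordinate idea can be seen in the well-known "umbrella" construction for the pentagon: five unit vectors in $\mathbb{R}^3$, one per vertex, with non-adjacent vertices getting orthogonal vectors. The sketch below is only a numerical check under that assumption, with hypothetical helper names. It does not solve a semidefinite program. It verifies the orthogonality, reads off the upper bound $1/\langle c,u_i\rangle^2=\sqrt5$ for the handle $c=(0,0,1)$, and confirms the matching independent set of size 5 in the strong product of the pentagon with itself.

```python
import math
from itertools import combinations


def umbrella_vectors():
    """Five unit vectors around the z-axis, opened so that u_i and u_{i+2} are orthogonal."""
    c = math.cos(math.pi / 5)
    cos_t = math.sqrt(c / (1 + c))  # cosine of the angle between each rib and the handle
    sin_t = math.sqrt(1 / (1 + c))
    return [
        (sin_t * math.cos(2 * math.pi * i / 5), sin_t * math.sin(2 * math.pi * i / 5), cos_t)
        for i in range(5)
    ]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def adjacent_or_equal(a: int, b: int) -> bool:
    """Vertices of the 5-cycle that are equal or joined by an edge."""
    return (a - b) % 5 in (0, 1, 4)


vectors = umbrella_vectors()
handle = (0.0, 0.0, 1.0)

# Non-adjacent pentagon vertices are exactly the pairs (i, i+2 mod 5).
max_gap = max(abs(dot(vectors[i], vectors[(i + 2) % 5])) for i in range(5))
upper = max(1 / dot(handle, u) ** 2 for u in vectors)

# Five pairwise non-adjacent vertices in the strong product of two pentagons.
independent = [(i, 2 * i % 5) for i in range(5)]
is_independent = all(
    not (adjacent_or_equal(p[0], q[0]) and adjacent_or_equal(p[1], q[1]))
    for p, q in combinations(independent, 2)
)

print(max_gap, upper, math.sqrt(5), is_independent)
```

The independent set gives capacity at least $\sqrt5$, and the umbrella gives at most $\sqrt5$, so the two halves of the check bracket the value from both sides.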

1

I think Furstenberg's proof of the infinitude of primes, taken by itself, could be considered in the spirit of the original question, even though its mathematical value is questionable.

3 A plea in advance - can we not have this discussion again? – HJRW – 2010-12-09T19:16:46.887

What are you talking about? – efq – 2010-12-09T19:19:52.353

– John Stillwell – 2010-12-09T19:27:10.443

@ ex falso quodlibet: I think I know what Henry Wilton is talking about. On at least one question on MO I saw multiple users talking about Furstenberg's proof. Some view it as a very original idea, as it claims to prove the infinitude of the primes by means of topological methods. Others think that it isn't exactly a topological proof. They think it's more of an analytical proof, which incorporates ideas from calculus and analysis, arithmetic progressions in particular. – Max Muller – 2010-12-09T19:35:49.503

19 And others think it's the usual proof in disguise :-) – Robin Chapman – 2010-12-09T19:44:39.720

1 Well, ok, I was not aware of all these discussions (I have just found some more, lol). I agree though that its mathematical value is doubtful as it seems to be just a reformulation of Euclid's proof, yet I find it amusing that such a reformulation is possible... No more discussion about it. Period. :-) – efq – 2010-12-09T21:18:06.633

2 @ Max Muller: In the spirit of accuracy (but without wanting to move towards opening up any cans of worms) I believe it is the case that Furstenberg's is not a repackaging of analytic ideas, but rather of Euclid's original argument. – Nick Salter – 2010-12-09T21:43:05.807

1 @all: Please consider Henry Wilton's plea! Thanks! :-) – efq – 2010-12-09T22:17:03.033

0

I'm surprised that nobody has mentioned the ancient Greek proof of the irrationality of $\sqrt{2}$, which certainly amazed its contemporaries!

0

Though I am not going to answer in exactly the way required, I believe occasions where new insights helped to lend support to a great unsolved problem are worth noting. For instance, consider a case from random matrix theory: the statistical interplay between the distribution of the zeros of the Riemann zeta function and the eigenvalues of a random Hermitian matrix, which has provided a basis for the Hilbert–Pólya conjecture.

3 I think it's fair to say that this interplay has led to a lot of conjectures but no proofs; whether random matrix theory can actually be used to prove theorems about L-functions remains to be seen... – David Hansen – 2010-12-12T09:28:25.750

I agree! The conjectural relationships and Odlyzko's computations thrived along this new avenue and this was what I wanted to emphasize. – Unknown – 2010-12-12T09:50:07.187

0

The first formal proofs using limits. (the oldest ones I know are in Newton's Principia)

Newton didn't consider them formal--so much so that he switched back to the ancient Greek method. – Włodzimierz Holsztyński – 2013-07-03T05:39:04.287

-1

Hilbert's proof of the Hilbert-Waring theorem.