I would have preferred not to comment seriously on Mochizuki's work before much more thought had gone into the very basics, but judging from the internet activity, there appears to be much interest in this subject, especially from young people. It would obviously be very nice if they were to engage with this circle of ideas, regardless of the eventual status of the main result of interest. That is to say, the current sense of urgency to understand something seems generally a good thing. So I thought I'd give the flimsiest bit of introduction imaginable at this stage. On the other hand, as with many of my answers, there's the danger I'm just regurgitating common knowledge in a long-winded fashion, in which case, I apologize.

For anyone who wants to really get going, I recommend as a starting point some familiarity with two papers, 'The Hodge-Arakelov theory of elliptic curves (HAT)' and 'The Galois-theoretic Kodaira-Spencer morphism of an elliptic curve (GTKS).' [It has been noted here and there that the 'Survey of Hodge Arakelov Theory I,II' papers might be reasonable alternatives.][I've just examined them again, and they really might be the better way to begin.] These papers depart rather little from familiar language, are essential prerequisites for the current series on IUTT, and will take you a long way towards a grasp at least of the motivation behind Mochizuki's imposing collected works. This was the impression I had from conversations six years ago, and then Mochizuki himself just pointed me to page 10 of IUTT I, where exactly this is explained. The goal of the present answer is
to decipher just a little bit those few paragraphs.

The beginning of the investigation is indeed the function field case (over $\mathbb{C}$, for simplicity), where one is given a family
$$f:E \rightarrow B$$
of elliptic curves over a compact base, best assumed to be semi-stable and non-isotrivial.
There is an exact sequence
$$0\rightarrow \omega_E \rightarrow H^1_{DR}(E) \rightarrow H^1(O_E)\rightarrow0,$$
which is moved by the logarithmic Gauss-Manin connection of the family.
(I hope I will be forgiven for using standard and non-optimal notation
without explanation in this note.) That is to say, if $S\subset B$ is the finite set of images of the bad fibers, there is a log connection
$$H^1_{DR}(E) \rightarrow H^1_{DR}(E) \otimes \Omega_B(S),$$
which *does not preserve* $\omega_E$. This fact is crucial, since it leads to an
$O_B$-linear Kodaira-Spencer map $$KS:\omega_E \rightarrow H^1(O_E)\otimes \Omega_B(S),$$ and thence
to a non-trivial map
$$\omega_E^2\rightarrow \Omega_B(S).$$
From this, one easily deduces Szpiro's inequality:
$$\deg (\omega_E) \leq (1/2)( 2g_B-2+|S|).$$
At the most simple-minded level, one could say that Mochizuki's programme has been concerned with
replicating this argument over a number field $F$. Since it has to do with differentiation on $B$, which eventually turns into $O_F$, some philosophical connection to $\mathbb{F}_1$-theory
begins to appear. I will carry on using the same notation as above, except now $B=Spec(O_F)$.
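
For readers who want the function field deduction spelled out, here is a sketch. It assumes, as is standard, that $\omega_E$ and $H^1(O_E)$ are line bundles on the curve $B$ that are dual to each other, and it writes $\nabla$ for the log connection (a symbol I did not use above). The KS map is the composite
$$KS:\ \omega_E \hookrightarrow H^1_{DR}(E) \xrightarrow{\ \nabla\ } H^1_{DR}(E)\otimes \Omega_B(S) \twoheadrightarrow H^1(O_E)\otimes \Omega_B(S),$$
and it is $O_B$-linear because the Leibniz term dies in the quotient: for a local function $f$ and a local section $s$ of $\omega_E$,
$$\nabla(fs)=f\,\nabla(s)+s\otimes df,\qquad s\otimes df\in \omega_E\otimes \Omega_B(S).$$
Using $H^1(O_E)\simeq \omega_E^{-1}$, the map $KS$ becomes the map $\omega_E^2\rightarrow \Omega_B(S)$. A non-zero map of line bundles on a compact curve cannot lower the degree, so
$$2\deg(\omega_E)\leq \deg \Omega_B(S)=2g_B-2+|S|,$$
which is the inequality above.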

A large part of HAT is exactly concerned with the set-up necessary to implement this idea, where, roughly speaking, the Galois action has to play the role of the GM connection.
Obviously, $G_F$ doesn't act on $H^1_{DR}(E)$. But it does act on $H^1_{et}(\bar{E})$ with
various coefficients. The comparison between these two structures is the subject
of $p$-adic Hodge theory, which sadly works only over local fields rather than a global one. But Mochizuki noted long ago that something like $p$-adic Hodge theory should be a key ingredient because over $\mathbb{C}$, the comparison isomorphism
$$H^1_{DR}(E)\simeq H^1(E(\mathbb{C}), \mathbb{Z})\otimes_{\mathbb{Z}} O_B$$
allows us to completely recover the GM connection by the condition that
the topological cohomology generates the flat sections.
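
In formulas (a sketch only, with $\nabla$ denoting the GM connection and $\gamma$ a local section of the topological cohomology, neither of which is notation used above, and working locally on $B$ away from $S$), the condition reads
$$\nabla(\gamma\otimes f)=\gamma\otimes df,$$
so $\nabla$ is determined on all of $H^1_{DR}(E)$ as soon as the lattice of flat sections is known.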

In order to get a global arithmetic analogue, Mochizuki has to formulate a *discrete non-linear* version of the comparison isomorphism. What is non-linear? This is the replacement of $H^1_{DR}$ by the universal extension $$E^{\dagger}\rightarrow E,$$
(the moduli space of line bundles with flat connection on $E$)
whose tangent space is $H^1_{DR}$ (considerations of this nature already come up in usual $p$-adic Hodge theory). What is discrete is the étale cohomology, which will just be $E[\ell]$ with global Galois action, where $\ell$ can eventually be large, on the order of the height of $E$ (that is $\deg (\omega_E)$). The comparison isomorphism in this context takes the following form:
$$\Xi: A_{DR}=\Gamma(E^{\dagger}, L)^{<\ell}\simeq L|E[\ell]\simeq (L|e_{E})\otimes O_{E[\ell]}.$$
(I apologize for using the notation $A_{DR}$ for the space that Mochizuki denotes by
a calligraphic $H$. I can't seem to write calligraphic characters here.)
Here, $L$ is a suitably chosen line bundle of degree $\ell$ on $E$,
which can then be pulled back
to $E^{\dagger}$.
The inequality refers to the polynomial degree in the fiber direction of
$E^{\dagger} \rightarrow E$. The isomorphism is effected via evaluation of sections at
$$E^{\dagger}[\ell]\simeq E[\ell].$$
Finally, $$ L|E[\ell]\simeq (L|e_{E})\otimes O_{E[\ell]}$$ comes from Mumford's theory of theta functions. The interpretation of the statement is that it gives an isomorphism between the space of functions of some bounded fiber degree on non-linear De Rham cohomology and the space of functions on discrete étale cohomology. This kind of statement is entirely due to Mochizuki. One sometimes speaks of $p$-adic Hodge theory with finite coefficients, but that refers to a theory that is not only local, but deals with linear De Rham cohomology with finite coefficients.
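
As a sanity check on the shape of $\Xi$, one can count ranks on both sides. This is a sketch under an assumption not stated above, so treat it as a gloss: each graded piece of the fiber-degree filtration on $\Gamma(E^{\dagger}, L)$ has the same rank as $\Gamma(E, L)$. Since $L$ has degree $\ell$ on the elliptic curve $E$, Riemann-Roch gives $\dim \Gamma(E,L)=\ell$, and there are $\ell$ graded pieces, in fiber degrees $0,\dots,\ell-1$. Hence
$$\operatorname{rk}\,\Gamma(E^{\dagger}, L)^{<\ell}=\ell\cdot \ell=\ell^2=|E[\ell]|=\operatorname{rk}\,(L|E[\ell]),$$
so the two sides at least have the same size.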

Now for some corrections: As stated, the isomorphism is not true, and must be modified at the places of bad reduction, the places dividing $\ell$, and the infinite places.
This correction takes up a substantial portion of the HAT paper. That is, the isomorphism is generically true over $B$, but to make it true everywhere, the integral structures must be modified in subtle and highly interesting ways, while one must consider also a comparison of metrics, since these will obviously figure in an arithmetic analogue of Szpiro's conjecture. The correction at the finite bad places can be interpreted via coordinates near infinity on the moduli stack of elliptic curves as the subtle phenomenon that Mochizuki refers to as 'Gaussian poles' (in the coordinate $q$). Since this is a superficial introduction, suffice it to say for now that these Gaussian poles end up being a major obstruction in this portion of Mochizuki's theory.

In spite of this, it is worthwhile giving at least a small flavor of Mochizuki's Galois-theoretic KS map. The point is that $A_{DR}$ has a Hodge filtration defined by

$F^rA_{DR}= \Gamma(E^{\dagger}, L)^{ < r} $

(the direction is unconventional), and
*this is moved around by the Galois action induced
by the comparison isomorphism.* So one gets thereby a map
$$G_F\rightarrow Fil (A_{DR})$$
into some space of filtrations on $A_{DR}$.
This is, in essence, the Galois-theoretic KS map. That is, if we consider the equivalence over $\mathbb{C}$ of $\pi_1$-actions
and connections, the usual KS map measures the extent to which the GM connection moves around the Hodge filtration. Here, we are measuring the same kind of motion for the $G_F$-action.

This is already very nice, but now comes a very important variant, essential for understanding the motivation behind the IUTT papers. In the paper GTKS, Mochizuki modified this map, producing instead a 'Lagrangian' version. That is, he assumed the existence of a Lagrangian Galois-stable subspace $G^{\mu}\subset E[\ell]$ giving rise to another isomorphism
$$\Xi^{Lag}:A_{DR}^{H}\simeq L\otimes O_{G^{\mu}},$$
where $H$ is a Lagrangian complement to $G^{\mu}$, which I believe does not itself need to
be Galois stable. $H$ is acting on the space of sections, again via Mumford's theory.
This can be used to get another KS morphism to filtrations on $A_{DR}^{H}$. But the key point is that

*$\Xi^{Lag}$, in contrast to $\Xi$, is free of the Gaussian poles*

via an argument I can't quite remember (if I ever knew).

At this point, it might be reasonable to see if $\Xi^{Lag}$ contributes towards a version
of Szpiro's inequality (after much work and interpretation), except for one small problem. A subspace like $G^{\mu}$ has no
reason to exist in general.
This is why GTKS is mostly about the universal elliptic curve over a formal completion near $\infty$ on the moduli stack of elliptic curves, where such a space does exist.
What Mochizuki explains on IUTT page 10 is exactly that
the scheme-theoretic motivation for IUG was to enable the move to a single elliptic curve over $B=Spec(O_F)$, via the intermediate case of an elliptic curve 'in general position'.

To repeat:

*A good 'nonsingular' theory of the KS map over number fields requires a global Galois
invariant Lagrangian subspace $G^{\mu}\subset E[\ell]$.*

One naive thought might just be to change base to the field generated by the $\ell$-torsion, except one would then lose the Galois action one was hoping to use. (Remember that Szpiro's inequality is supposed to come from *moving* the Hodge filtration inside De Rham cohomology.) On the other hand, such a subspace does often exist *locally*, for example, at a place of bad reduction. So one might ask if there is a way to globally extend such local subspaces.

It seems to me that this is one of the key things going on in the IUTT papers I-IV.
As he says in loc. cit., he works with various categories of collections of local objects that *simulate* global objects. It is crucial in this process that many of the usual
scheme-theoretic objects, local or global, are encoded as suitable categories with a rich and precise combinatorial structure.
The details here get very complicated, the encoding of a scheme into
an associated Galois category of finite étale covers being merely
the trivial case. For example, when one would like to encode the
Archimedean data coming from an arithmetic scheme (which again, will clearly be
necessary for Szpiro's conjecture), the attempt to come up with a category of
about the same order of complexity as a Galois category gives rise to the
notion of a *Frobenioid*. Since these play quite a central role in Mochizuki's theory,
I will quote briefly from his first Frobenioid paper:

'Frobenioids provide a single framework [cf. the notion of a "Galois category";
the role of monoids in log geometry] that allows one to capture the essential aspects of
both the Galois and the divisor theory of number fields, on the one hand, and function
fields, on the other, in such a way that one may continue to work with, for instance,
global degrees of arithmetic line bundles on a number field, but which also exhibits the new
phenomenon [not present in the classical theory of number fields] of a "Frobenius
endomorphism" of the Frobenioid associated to a number field.'

I believe the Frobenioid associated to a number field is something close to the
finite étale covers of $Spec(O_F)$ (equipped with some log structure) together with metrized line bundles on them, although it's
probably more complicated. The Frobenius endomorphism for a prime $p$ is then something like
the functor that just raises line bundles to the $p$-th power.
This is a functor that would come from a map of schemes if we were
working in characteristic $p$, but obviously not in characteristic zero.
But this is part of the reason to start encoding in categories:

*We get more morphisms and equivalences.*
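
To make the comparison with characteristic $p$ concrete, here is a small sketch; the names $\operatorname{Fr}$ and $\Phi_p$ are introduced only for illustration and are not Mochizuki's notation. For the absolute Frobenius $\operatorname{Fr}$ of a scheme in characteristic $p$ and a line bundle $L$ on it, one has
$$\operatorname{Fr}^*L\simeq L^{\otimes p},$$
so the functor $\Phi_p: L\mapsto L^{\otimes p}$ is what pullback by Frobenius does to line bundles. On degrees it acts by
$$\deg \Phi_p(L)=p\,\deg L,$$
and this formula continues to make sense in characteristic zero, even though no map of schemes induces $\Phi_p$ there.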

Some of you will notice at this point the analogy to
developments in algebraic geometry where varieties are encoded in categories,
such as the derived category of coherent sheaves. There as well, one has reconstruction
theorems of the Orlov type, as well as the phenomenon of non-geometric morphisms
of the categories (say actions of braid groups). Non-geometric morphisms
appear to be very important in Mochizuki's theory, such as the Frobenius above,
which allows us to simulate characteristic $p$ geometry in characteristic
zero. Another important illustrative example is a
non-geometric isomorphism between Galois groups of local fields (which can't exist
for global fields because of the Neukirch-Uchida theorem).
In fact, I think Mochizuki was rather fond of Ihara's comment that the positive
proof of the anabelian conjecture was somewhat of a disappointment, since
it destroys the possibility that encoding curves into their fundamental
groups will give rise to a richer category. Anyways, I believe the importance
of non-geometric maps of categories encoding rather conventional objects
is that

*they allow us to glue together several standard
categories in nonstandard ways.*

Obviously, to play this game well,
some things need to be encoded in rigid ways, while others should
have more flexible encodings.

For a very simple example that gives just a bare glimpse of the general theory, you might consider a category of
pairs $$(G,F),$$ where $G$ is a profinite topological group
of a certain type and $F$ is a filtration on $G$.
It's possible to write down explicit conditions that ensure that
$G$ is the Galois group of a local field and $F$ is its ramification filtration
in the upper numbering (actually, now I think about it, I'm not sure about 'explicit conditions' for the filtration part, but anyways). Furthermore, it is a theorem of Mochizuki
and Abrashkin that the functor that takes a local field to the corresponding
pair is fully faithful. So now, you can consider triples
$$(G,F_1, F_2),$$
where $G$ is a group and the $F_i$ are *two* filtrations of the right type.
If $F_1=F_2$, then this 'is' just a local field. But now you can have
objects with $F_1\neq F_2$, that correspond to strange amalgams of
two local fields.

As another example, one might take
a usual global object, such as $$ (E, O_F, E[\ell], V)$$ (where $V$
denotes a collection of valuations of $F(E[\ell])$ that restrict bijectively to
the valuations $V_0$ of $F$), and associate to it a collection of local categories
indexed by $V_0$ (something like Frobenioids corresponding to the $E_v$ for $v\in V_0$). One can then try to glue them together
in non-standard ways along sub-categories, after performing a number of non-standard transformations. My rough impression at the moment is that
the 'Hodge theatres' arise in this fashion. [This is undoubtedly a gross oversimplification, which I will correct
in later amendments.] You might further imagine that some
construction of this sort will eventually retain the data necessary to get the height of
$E$, but also have data corresponding to the $G^{\mu}$, necessary for the Lagrangian KS map.
In any case, I hope you can appreciate that a good deal of 'dismantling' and 'reconstructing,' what Mochizuki calls *surgery*, will be necessary.

I can't emphasize enough times that much of what I write is based on
faulty memory and guesswork. At best, it is superficial, while at worst,
it is (not even) wrong. [~~In particular, I am no longer sure that the GTKS map is used in an entirely direct fashion.~~]
I have not yet done anything with the current papers other than give them a cursory glance.
If I figure out more in the coming weeks, I will make corrections.
But in the meanwhile, I do hope what I wrote here is mostly more helpful than misleading.

Allow me to make one remark about set theory, about which I know next to nothing.
Even with more straightforward papers in arithmetic geometry, the question sometimes arises about Grothendieck's universe axiom, mostly because universes appear to be used in SGA4. Usually, number-theorists (like me) neither understand, nor care about such foundational matters, and questions about them are normally
met with a shrug. The conventional wisdom of course is that any of the usual
theorems and proofs involving Grothendieck cohomology theories or topoi do
not actually rely on the existence of universes, except general laziness allows us
to insert some reference that eventually follows a trail back to SGA4.
However, this doesn't seem to be the case with
Mochizuki's paper. That is, universes and
interactions between them seem to be important actors rather than conveniences.
How this is really brought about, and whether more than the universe axiom is necessary for the arguments, I really don't understand enough yet to say.
In any case, for a number-theorist or an algebraic geometer, I would guess it's still prudent to acquire a reasonable feel for the
'usual' background and motivation (that is, HAT, GTKS, and anabelian things) before worrying too much about deeper issues of set theory.

15 I'm afraid we do not permit the word "behooves." – Will Jagy – 2012-09-07T01:17:18.503

4 There is some expository material on Mochizuki's website. Did you try to read it already? If not, please, do so first. For lack of any indication of having done so, vote close (as by FAQs 'homework' ought to be done before asking). – None – 2012-09-07T01:17:35.427

7 The suggestion that what's on my blog constitutes even "an extremely vague glimpse" of what Mochizuki is trying to get at is false advertising of the most extravagant kind! – JSE – 2012-09-07T01:51:46.457

Correction: "an enthusiastic report". Sorry, Jordan! – James D. Taylor – 2012-09-07T01:55:39.737

24 @quid: the expositions I've seen (such as http://www.kurims.kyoto-u.ac.jp/~motizuki/2010-10-abstract.pdf) are mostly teasers to make people read more. My question is about the sketch underlying the proof of the ABC conjecture, which I don't see evident there. If you have an exposition that you would recommend, I suggest that you write it as an answer. – James D. Taylor – 2012-09-07T02:00:17.460

5 Well, then, read more! And if you do not care enough or lack the appropriate background to do so, I do not see why you need to know this so urgently. If experts become optimistic and understand the thing well enough, expositions will be all around. Just wait – None – 2012-09-07T02:13:42.620

6 @James: Have you looked at Remark 1.10.1 in IUTeich Theory IV? He actually states that the computations in the proof of the previous theorem, which seems to be the main theorem from which ABC is derived, were known to him as early as 2000, and actually compares this to the proof of the Weil Conjectures. So it might be worth trying to study the proof of Theorem 1.10 (obviously much easier said than done). – Kevin Ventullo – 2012-09-07T02:15:06.777

21 META http://tea.mathoverflow.net/discussion/1438/mochizuki-proof-of-abc – Will Jagy – 2012-09-07T02:18:53.840

71 @quid: you're being stubborn. Is it not legitimate to ask questions about mathematics that is available but difficult to read and understand? @Kevin: thanks! – James D. Taylor – 2012-09-07T02:21:01.920

2 I reply on meta. – None – 2012-09-07T02:32:38.630

20 @James Taylor I have not made any serious attempt to read the papers. However, I can point you to two things which I think are relevant, based on hints from the introductions. The first is the very easy proof of function field ABC, which turns into an analysis of the possible branching behavior of maps $\mathbb{CP}^1 \to \mathbb{CP}^1$. For number field ABC, the source $\mathbb{CP}^1$ should turn into $\mathrm{Spec}(\mathbb{Z})$ and the target should still be $\mathbb{P}^1$ (continued). – David E Speyer – 2012-09-07T06:57:40.243

21 The second is that I think Mochizuki is thinking of the target $\mathbb{P}^1$ as the $j$-line, so that maps from $\mathrm{Spec} \mathbb{Z}$ to it correspond (roughly) to elliptic curves over $\mathbb{Q}$. This is very analogous to the way that introducing an elliptic curve made FLT provable. Are these things you already understand, or would it be useful for me to write them up in more detail? Again, this is all with the caveat that I haven't looked at anything beyond the introductions, and I understand only a little bit of them. – David E Speyer – 2012-09-07T07:00:34.173

4 @David: The link between ABC and Szpiro's Conjecture (which is the content of the application of the Frey-like construction) long predates Mochizuki's work, and the "function field case" of ABC seems to have nothing to do with the ideas relevant in Mochizuki's work in the number field case much as the "function field case" of FLT is totally irrelevant to the actual proof of FLT. So although each aspect is very interesting for someone who has never heard of the ABC Conjecture, neither of them sheds light on anything that has happened since the time Mochizuki began his work on these matters. – grp – 2012-09-07T12:14:01.377

@grp I absolutely agree that everything I am talking about is 20 years old, and it cannot be "what is new" in Mochizuki. I do not know what is new; someone else would need to write that and I hope they do. That's why I asked whether this sort of context is useful. I'll try to get together a reply re whether the function field analogy is relevant at some point later. – David E Speyer – 2012-09-07T12:37:16.620

15 It would seem that only Mochizuki could actually give a correct answer. Anything else would be speculation. Therefore, IMHO this should be a CW question since it cannot have a single correct answer (barring Mochizuki responding). That grp's popular answer was made CW by grp further substantiates this. – Benjamin Steinberg – 2012-09-07T14:26:21.743

3 David, your comments are precisely the type of answer I'm looking for. It is okay that it's 20 years old. – James D. Taylor – 2012-09-07T15:53:16.083

6 I strongly agree with Benjamin's comment. I share the concerns of quid, grp, and others, that but for Mochizuki, no one is currently able to answer definitively, and therefore CW it should be (if it even remains open). – Todd Trimble – 2012-09-07T18:27:08.147

2 James, I think your question is a reasonable one to ask. However "reasonable" and "appropriate for MathOverflow" are not the same thing. If you wish, we can discuss this further on meta.mathoverflow. Gerhard "Ask Me About Appropriate Asking" Paseman, 2012.09.07 – Gerhard Paseman – 2012-09-07T20:21:58.407

16 Not being active on MO anymore, I don't much care if this question survives or not, but I am very interested in understanding more about Mochizuki's argument, so to the extent that insights appear here, I'll happily take advantage of them, and I'm sure others will too. My own sense was that Mochizuki's program has been motivated by trying to get around Faltings's "no go" theorem on arithmetic KS, by constructing a new, non-linear (or perhaps anabelian) interpretation of Hodge theory (both classical and p-adic) and related ideas, leading to a construction of some sort of arithmetic KS map, ... – Emerton – 2012-09-07T22:24:46.530

8 ... which ultimately allows him to (in some vague sense, at least) mimic the function field argument. But perhaps this intuition is off, and in any case, it hasn't helped me much in penetrating what he is actually doing. That is going to take hard work! – Emerton – 2012-09-07T22:26:14.657

9 Dear James, Just to echo grp's answer, it is hard to overstate the extent to which most of the number theory community has not engaged with Mochizuki's work before now, and now people are desperately trying to catch up. It will take time before anyone can explain what is really going on. Regards, – Emerton – 2012-09-07T22:31:41.290

8 As discussed on the meta page, I just substantially edited the question to remove extraneous (and tendentious) material. – Andy Putman – 2012-09-08T18:01:00.857

12 I really don't get why anyone thought to close this question. Given that Mochizuki thought his methods might be able to prove the ABC conjecture years before he came up with his (supposed) proof, it seems reasonable to think he might have an intuitive idea of a proof in his mind, and then the years of development of IUTeich were a means of putting those intuitive ideas into rigorous mathematical reality. – David Corwin – 2012-09-09T07:33:32.850

6 While this could hypothetically be a question that only Mochizuki can answer, it could instead be that there is a sketch of an "intuitive proof" known to experts for years before Mochizuki's work. In that scenario, no one knew how to put those intuitive (even wishful) ideas into a rigorous foundation, and Mochizuki developed his theory in part in order to create that foundation. But that proof sketch would be both obscure enough and well-known enough to put on MO. – David Corwin – 2012-09-09T07:33:42.787

2 what-is-the-insight-of-quillens-proof-that-all-projective-modules-over-a-polynom http://mathoverflow.net/questions/19584/what-is-the-insight-of-quillens-proof-that-all-projective-modules-over-a-polynom – Alexander Chervov – 2012-09-12T11:02:40.177

1 As a striking example of the increasing prevalence of the notion of naturality in contemporary mathematics, Mochizuki’s four preprints employ the word "natural" and its derivatives on more than six hundred separate occasions (for details and related mathematical quotations, see this post on Gödel's Lost Letter and P=NP). – John Sidles – 2012-09-13T20:24:20.033

3 John, the word "natural" has a precise mathematical definition. It does not mean "natural" like the natural world, which mathematically (and perhaps philosophically) is completely not well-defined. Mathematics cannot be done without precision, and the world is completely imprecise. – Asaf Karagila – 2012-09-16T15:52:06.780

1 hi all fyi ABC also has significant implications in CS theory see eg http://cstheory.stackexchange.com/questions/12504/implications-of-proof-of-abc-conjecture-for-cs-theory, & just want to thank the math community & mathoverflow for keeping this question open to see extended engagement and analysis of the proof by professionals in the field, & hope mathoverflow will be open to further on-topic questions on the subj to facilitate further serious analysis. – vzn – 2012-09-28T15:41:37.033

"The proof must be correct for if it was not he wouldn't have the ideas to invent them" just joking – Koushik – 2013-10-02T13:33:56.650

more on the behind-the-scenes efforts to verify the proof: paradox of proof by chen on mochizuki attack – vzn – 2013-12-01T20:59:35.913

2 what is the status of this in 2015? (in december, mochizuki posted a progress report on the verification of universal teichmüller theory / his alleged proof: http://www.kurims.kyoto-u.ac.jp/~motizuki/IUTeich%20Verification%20Report%202014-12.pdf ) – Trent – 2015-01-15T01:49:02.077