Are there examples of originally widely accepted proofs that were later discovered to be wrong by attempting to formalize them using a proof assistant (e.g. Coq, Agda, Lean, Isabelle, HOL, Metamath, Mizar)?
- The proofs formalized so far number in the hundreds, and the vast majority of them are elementary (nineteenth-century mathematics). So I think what you describe is very unlikely to happen. This may change dramatically in the future, though. – Alex Gavrilov, Jan 21, 2018 at 4:26
- related: cstheory.stackexchange.com/questions/37299/… – Carlo Beenakker, Jan 21, 2018 at 8:04
- Proofs can be checked by formalization, but not shown to be wrong. At worst, the (human) formalizer can give up and admit not knowing how to construct the formalized proof. On the other hand, a proof can be shown to be wrong by giving a counterexample (not just for the main result, but also for any part of the proof or sub-argument). This is in any case how proofs are scrutinized, and it does not require formalization. – Igor Khavkine, Jan 21, 2018 at 9:12
- The Coq-combi project proves the Littlewood-Richardson rule imitating the "Plactic Monoid" chapter of Lothaire's Algebraic Combinatorics. But the latter chapter is known to contain minor gaps and inaccuracies, so they have to have been fixed somehow in Coq-combi. Not sure if it counts... – darij grinberg, Jan 21, 2018 at 17:25
- @Igor Khavkine: How is giving a counterexample for a sub-argument of a proof different from demonstrating the proof wrong? And why do counterexamples need formalization any less than proofs? They might be inconsistent. – Bernhard Stadler, Jan 22, 2018 at 18:11
6 Answers
Since this question was asked in January there have been some developments. I would like to argue that the scenario raised in the question has now actually happened. Indeed, when formalising Vladimir Shchur's work on bounds in the Morse Lemma for Gromov-hyperbolic spaces, Sébastien Gouëzel found an inequality that had been transposed at some point, causing the proof (which had been published in 2013 in J. Funct. Anal., a good journal) to collapse. Gouëzel then worked with Shchur to find a new and correct (and in places far more complex) argument, which they wrote up as a joint paper.
http://www.math.sciences.univ-nantes.fr/~gouezel/articles/morse_lemma.pdf
The details are in the introduction. Anyone who reads it will see that this is not a "mistake" in the literature in the weak sense defined by Manuel Eberl's very clear answer below: this was an actual error, and it was discovered because of a formalization process.
- I have marked this as the accepted answer now. For future reference, since this answer references the previous accepted answer, I'll note here that the previous accepted answer was the one written by Manuel Eberl. – Oct 26, 2018 at 5:46
- I edited my answer to remove the reference to "the accepted answer" and now reference Eberl's answer explicitly; hopefully this lessens the confusion even more. – Oct 31, 2018 at 11:51
- The Gouëzel-Shchur paper has now been published in the Journal of Functional Analysis 277 (2019), 1258-1268. – Jul 24, 2019 at 19:51
First of all, to explain my perspective: I'm a PhD student working in the formalisation of mathematics with Isabelle/HOL, and I've been working with that system for about 7 years. I was introduced to it in a lecture when I was an undergraduate and I got hooked immediately. I do think it is useful, but I don't do it because of that. I do it because it's fun.
Anyway, your question is a bit tricky to answer because it depends a lot on what you mean by a ‘wrong proof’ and by ‘shown wrong by formalizing them’. A lot of the time, it's something of a grey area.
Normally, one needs a very thorough understanding of the paper proof in order to formalise it, and one has to devise a way of expressing the argument formally. Conceptual problems with the paper proof often become apparent at this stage already, before the theorem prover as such is even involved.
Secondly, of course, if you formalise something like the Prime Number Theorem or Cauchy's Integral Theorem, you're probably not going to find out that it's all wrong and everything collapses. But you might well find problems in particular proofs in textbooks.
I do find a lot of ‘mistakes’ in proofs, including textbook proofs and published papers. Most of these mistakes are easily fixed and most mathematicians would likely brush them off as insignificant. Some take me a few days to figure out. Some actually require changing definitions, adding assumptions, or altering the statement of the theorem.
Most ‘mistakes’ are something like this:
- surprisingly non-trivial arguments being declared trivial or left to the reader
- going very quickly and vaguely over a part of the proof that is perceived as uninteresting, thereby missing a subtle flaw that would have become apparent if one had done it more thoroughly
- missing cases that have probably been overlooked by the author
- arithmetic mistakes (my favourite being multiplying an inequality by a constant without checking that the constant is non-negative; see the sketch after this list)
- missing assumptions that are still implicitly used
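To make that arithmetic pitfall concrete, here is a minimal sketch in Lean 4 with Mathlib (my own illustration, not taken from any of the papers discussed here). The Mathlib lemma `mul_le_mul_of_nonneg_left` forces you to supply the non-negativity hypothesis, and the statement without it is simply false:

```lean
import Mathlib

-- Multiplying an inequality by a constant: the library insists on 0 ≤ c.
example (a b c : ℝ) (h : a ≤ b) (hc : 0 ≤ c) : c * a ≤ c * b :=
  mul_le_mul_of_nonneg_left h hc

-- Dropping the non-negativity hypothesis makes the claim false:
-- c = -1 flips the inequality.
example : ¬ ∀ a b c : ℝ, a ≤ b → c * a ≤ c * b := by
  intro h
  have := h 0 1 (-1) (by norm_num)  -- would give 0 ≤ -1
  norm_num at this
```

On paper, the `0 ≤ c` obligation is exactly the kind of side condition that gets waved through; in the prover, the first example does not typecheck until it is discharged.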
Let me give you a few examples (I won't mention the exact authors; my intent is not to shame anybody for making these mistakes, only to show that this is quite common):
1. I recently had a case where a theorem from a widely-used textbook from the 70s was simply wrong, which I realised when I wanted to find out how to formalise it. I am not an expert in that field, but apparently the people who work in that field know that it is wrong.
2. One of the first algorithms (working on non-deterministic automata) that I formalised apparently assumed that the automaton is total (i.e. it has an outgoing transition for every letter of the alphabet from every state). In my opinion, that should absolutely have been mentioned in the paper, but one could possibly argue that it was just implicit in the authors' idea of an automaton.
3. A colleague of mine found a problem with the proof of a lemma used in the correctness proof of a complicated automata algorithm that had been used in state-of-the-art software for years. Somebody else later showed that not only does the lemma not hold, but the algorithm itself is also incorrect. As far as I know, nobody has been able to fix the algorithm yet.
4. In one instance, I had formalised a kind of program-transformation technique from a published paper. The authors then extended that paper to a more detailed journal version and added some new transformation rules. One of them dealt with multiplication by a constant, but they did not realise that multiplication by 0 is a special case that makes their rule unsound. (See the sketch after this list.)
5. I worked on formalising a new result that had just been published in a journal and found that one part of the proof, which the authors did not explain in much detail due to page limits, had a subtle problem that became apparent only when I had already formalised everything in Isabelle and got stuck at this part. The authors immediately admitted that this was a problem that could not be fixed in any apparent way except by adding an additional, somewhat technical assumption to the entire argument. However, they later managed to prove a stronger result that subsumes this one, although its proof was much more involved. (More details on this at the end of this answer.)
6. I don't remember the exact details about the Kepler conjecture that somebody mentioned before, but off the top of my head, I seem to recall that several smaller problems were found in the original program code, and Nipkow found one problem that actually caused Hales to revise a part of the proof completely.
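For flavour, here is what the unsoundness in example 4 looks like in miniature, again as a Lean/Mathlib sketch. The cancellation rule below is a hypothetical one of my own, not the rule from the paper in question; the point is only that cancelling a constant factor is sound precisely when the constant is non-zero, and that the prover will not let you forget the side condition:

```lean
import Mathlib

-- Sound version: cancelling a constant factor requires c ≠ 0.
example (c x y : ℝ) (hc : c ≠ 0) (h : c * x = c * y) : x = y :=
  mul_left_cancel₀ hc h

-- Without the side condition the rule is unsound: c = 0 equates everything.
example : ¬ ∀ c x y : ℝ, c * x = c * y → x = y := by
  intro h
  have := h 0 0 1 (by norm_num)  -- 0 * 0 = 0 * 1, yet 0 ≠ 1
  norm_num at this
```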
As a theorem-proving guy, my reaction to all of this is ‘This shows that formalising stuff in a theorem prover is worthwhile’. I am aware that mathematicians often have a different perspective. It is not an uncommon view to say that the so-called ‘mistakes’ I mentioned above are insignificant; that someone would have found them eventually even without a theorem prover; that the theorems were still correct (in some sense) and it was only the proofs that had some minor problems.
However, I disagree with that. I want my proofs to be as rigorous as possible. I want to be sure I didn't miss any assumptions. And I think that things like Kepler's conjecture show that there are instances where it is just infeasible for humans to check the correctness of a proof with a reasonable amount of certainty.
EDIT: As requested, some more details on point 5.
The paper in question is "The impossibility of extending random dictatorship to weak preferences" (the authors also published a corrigendum). Its purpose is to show that no Social Decision Scheme (SDS) for at least 4 agents and alternatives exists that is an extension of Random Dictatorship (RD) and fulfils four nice properties.
It works by first showing that no such SDS exists for exactly 4 agents and 4 alternatives, and then showing that an SDS for more than 4 agents/alternatives could be turned into one for exactly 4/4 while preserving the nice properties, so that none can exist for more than 4 either. Typically, in this kind of proof, the base case is the difficult one and lifting it to a larger number of agents/alternatives is pretty trivial. In this case, however, the property "the SDS is an extension of RD" does not survive the lifting to more agents/alternatives, which completely breaks that step. I myself only noticed this when I had already typed most of the proof into Isabelle and it just didn't go through.
The proof of the base case was based on considering 12 particular preference profiles and, as you can see, was relatively short. The authors later found a proof of the same statement without the RD-extension assumption, but that one needed 47 preference profiles and was much longer. I formalised it in Isabelle without any problems (see my BSc thesis).
- Excellent answer! Your example #5 is interesting enough that I wonder if you could write to the authors and ask if they mind having their identities revealed here. The purpose of course is not to shame them (we have all made mistakes) but to provide some concrete documentation about the value of formalization. From this point of view, #5 is much more interesting than #1 to #4 in my opinion (and of course #6 is already documented). – Jan 24, 2018 at 16:07
- +1. You should probably inform the authors about all of the errors you found, at least when the research is reasonably new. (Though #4 might be a matter of mismatched definitions of $\mathbb{N}$.) – Jan 24, 2018 at 16:36
- I usually do. Often, I never get any sort of reply, but in many cases the original authors are very helpful. #4 was something with real numbers and indeed a mistake. I think the authors were aware that 0 was a special case, but they thought it would still work out in the end. – Jan 24, 2018 at 16:41
- I updated my answer and included some more information about #5. – Jan 25, 2018 at 14:29
- Awesome answer, thanks for sharing your experience and knowledge. – Jan 27, 2018 at 13:57
This question was raised on the Foundations of Mathematics mailing list back in 2014, and the short answer is no, there are no examples of this. [EDIT: Although this may have been true at the time I originally wrote this, it is no longer true, as other answers amply demonstrate. But I think that this answer is worth leaving here nonetheless.]
The longer answer is that the process of formalizing any non-trivial mathematical argument is likely to reveal some minor gaps, such as degenerate cases for which the main argument doesn't quite work as stated. If you're sufficiently pedantic, then you might claim that in such cases, the original proof is "wrong," but I suspect that this is not the sort of thing you're asking for.
The Flyspeck project did turn up one gap in the original proof of the Kepler conjecture that was large enough that the authors felt the need to write a few pages of human explanation about it. There is also an interesting paper by Fleuriot and Paulson where they undertook to formalize Newton's Propositio Kepleriana with Isabelle using nonstandard analysis to implement Newton's use of infinitesimals. There was one step where Fleuriot and Paulson could not find a plausible way to imitate Newton's reasoning exactly and found themselves having to use a different argument. Again, it is debatable whether this means that Newton's proof was "wrong."
- @AsafKaragila: As we know from another MO question, there are cases where a wrong result is widely accepted for some time and even used by others, but so far we lack convincing examples where a proof assistant plays an essential role in the discovery of the error. This may change in the future if proof assistants become sufficiently easy to use (but by the time that day arrives, it may no longer be easy to decide whether to attribute the bug report to the human, or the proof assistant, or both working together). – Jan 21, 2018 at 20:12
- "If you're sufficiently pedantic, then you might claim that in such cases, the original proof is 'wrong,' but I suspect that this is not the sort of thing you're asking for." No, that's not being pedantic. That's just knowing what 'wrong' means. If a proof doesn't cover every case it claims to cover, then it's not a proof of what it claims to be. It's wrong. It might be a proof that non-X, non-Y, non-Z somethings have property P, but it's not a proof that all somethings have property P. – Jan 23, 2018 at 21:12
- @MeiZhang: Thanks for clarifying. With this definition of "wrong proof" the Flyspeck example certainly counts. But let me point out that in this sense, most published proofs are wrong, or at least it's not clear whether they're wrong. E.g., it is common for parts of a proof to be "left to the reader." Is this a "gap"? If I pick up a random math journal not in my field, I will probably not understand any of the proofs. Are these "gaps"? Or just my lack of training? Human proofs always assume the reader will do a lot of work to fill in missing details. It's not clear-cut what a real "gap" is. – Jan 24, 2018 at 2:36
- @MilesRout, you write: "No, that's not being pedantic. That's just knowing what 'wrong' means." I notice from your profile that you identify yourself as a software developer, not a mathematician. With respect, you are wrong about the mathematician's notion of "wrong". The distinction between missing pedantic details and a genuinely wrong argument is central to our ability to make advances in mathematics: finding the crux of an argument and concentrating on it is how we know where to focus our energy to find (or check) a proof. – HJRW, Jan 24, 2018 at 9:25
- @MilesRout, to take a step back for a moment, your position here looks to a mathematician a lot like a power-grab on the part of the theorem-proving community. You implied that proofs that require minor corrections to be formalized are "wrong, wrong, wrong". This suggested to me that if a mathematician proves a result, and a theorem-prover makes some minor corrections when formalizing the result, then you think the mathematician was "wrong" and the theorem-prover gave the correct proof. I hope the two communities can work together much better than that. – HJRW, Jan 27, 2018 at 12:50
Not an example of a result being shown wrong by using a proof assistant, but an inconsistency was discovered in the premises of Gödel's ontological argument by a higher-order theorem prover: https://www.ijcai.org/Proceedings/16/Papers/137.pdf
- Very interesting! I had not heard about this development. – Jan 21, 2018 at 19:51
- Yes, interesting indeed. Thanks for the reference. – Jan 22, 2018 at 16:29
- Some context may be helpful here. Gödel did not publish his own proof, but shared it with Scott, who then publicized it. Scott's version is perfectly fine, but Gödel's original unpublished version was missing a clause. The conventional wisdom is that Gödel knew the clause was needed but inadvertently omitted it in the draft he showed Scott. In any case, it has long been known that the clause is needed for the argument to go through. But what had not been noticed until the formalization effort is that omitting the clause actually leads to a contradiction. – Aug 16 at 15:44
I learned of the following stellar example from Lawrence Paulson.
Anthony Bordg, Yijun He, and Hanna Lachnitt have been involved in an ongoing effort to formalize quantum algorithms and results in quantum information theory in Isabelle/HOL. You can read about their efforts here.
In the course of their project, they naturally found themselves examining one of the seminal papers in the subject, Quantum games and quantum strategies, by J. Eisert, M. Wilkens, and M. Lewenstein. As of this writing, Google Scholar claims that this paper has nearly a thousand citations. Bordg, He, and Lachnitt found that there was a fundamental and unfixable error in one of the main results of the paper. They explain the details in an arXiv preprint.
Although it may also not fully count as an example discovered via a proof assistant involving fully formalized mathematics, the classification of the maximal subgroups of all finite classical groups of dimension $\le 12$ [Bray, John N.; Holt, Derek F.; Roney-Dougal, Colva M., The maximal subgroups of the low-dimensional finite classical groups, London Mathematical Society Lecture Note Series 407, Cambridge University Press (2013), ZBL1303.20053] involved large pieces of Magma code, by means of which the authors were able to correct errors and fill gaps in the earlier theoretical results of [Kleidman, Peter; Liebeck, Martin, The subgroup structure of the finite classical groups, London Mathematical Society Lecture Note Series 129, Cambridge University Press (1990), ZBL0697.20004].