
This question is cross-posted from academia.stackexchange.com, where it was closed with the advice to post it on MO.


Kevin Buzzard's slides (PDF version) at a recent conference have really unsettled me.

In them, he mentions several examples from areas one would imagine to be very rigorous (e.g., algebraic geometry) where top journals like Annals and Inventiones have published, and never retracted, papers which are now known to be wrong. He also mentions papers relying on unpublished results, taken on trust that those who announced them indeed have a proof.

He writes about his own work:

[...] maybe some of my work in the p-adic Langlands philosophy relies on stuff that is wrong. Or maybe, perhaps less drastically, on stuff which is actually correct, but for which humanity does not actually have a complete proof. If our research is not reproducible, is it science? If my work in pure mathematics is neither useful nor 100 percent guaranteed to be correct, it is surely a waste of time.

He says that as a result he has switched to formalizing proofs completely, using a proof assistant such as Lean, which guarantees correctness and hence reusability forever.
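
To give a concrete flavour of what this means, here is a toy machine-checked proof in Lean 4 syntax (my own illustrative example, not one from the slides): once the kernel accepts it, every step is justified from the definitions, with nothing left to the reader.

```lean
-- A toy machine-checked proof: commutativity of addition on the
-- natural numbers, by induction on the second argument. The Lean
-- kernel checks every step against the axioms and prior lemmas.
theorem add_comm' (m n : Nat) : m + n = n + m := by
  induction n with
  | zero => rw [Nat.add_zero, Nat.zero_add]
  | succ k ih => rw [Nat.add_succ, Nat.succ_add, ih]
```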

Just how widespread is the issue? Are most areas safe, or contaminated? For example, is there some way to track the not-retracted-but-wrong papers?


The answer I accepted on academia.stackexchange before the question was closed gives a useful general-purpose method, but I'd really appreciate more detailed area-specific answers. For example, what fraction of your own papers do you expect to rely on a statement "for which humanity does not actually have a complete proof"?

Comments:

  • "If my work in pure mathematics is neither useful nor 100 percent guaranteed to be correct, it is surely a waste of time." There would appear to be a gap in this argument, unless the author has a general proof that "not useful" plus "not 100% guaranteed to be correct" implies "waste of time", in which case there are myriad special cases outside of mathematics, many of them quite surprising and unsettling. – Jan 31, 2020 at 17:50
  • Just to be clear, this depends on precisely what you mean by "wrong", doesn't it? You have not defined this term yet. I imagine that there are many ways to be wrong, not all of the same kind. For example, does a trivial typographical error count? Grammatical errors? etc. Complete proof? What is that? – Somos, Jan 31, 2020 at 17:58
  • "What fraction of your own papers do you expect to rely on a statement 'for which humanity does not actually have a complete proof'?" Zero, but the catch is that at least twice I insisted on claiming the proof of the reduction of A to B (B being published by other people and universally accepted as valid) instead of claiming A itself (not that my co-authors were very happy about it, but I find it a good practice if one has trouble going through the whole proof tree of B oneself). On the other hand, I'm guilty of publishing at least 2 papers in which some gap was later found (and closed). – fedja, Jan 31, 2020 at 22:17
  • A complete proof from what axioms? fedja just described the perfectly acceptable practice of working in a high-level system where you just take some B on faith, a.k.a. make it part of your axioms. (Likewise, people in Paris used to speak of "axiom Valiron": everything in this book is true.) Of course, the risk in doing this is that your resulting axiom system might be inconsistent. But then so might ZFC, right? (1/2) – Jan 31, 2020 at 23:22
  • So I see "relying on stuff for which humanity does not actually have a complete proof" (a contradiction might be found) as hugely different from "relying on stuff that is wrong" (a contradiction has been found). (2/2) – Jan 31, 2020 at 23:22

2 Answers


"Are most areas safe, or contaminated?"

Most areas are fine. Probably all important areas are fine. Mathematics is fine. The important stuff is 99.99999% likely to be fine because it has been carefully checked. The experts know what is wrong, and the experts are checking the important stuff. The system works. The system has worked for centuries and continues to work.

My talk presents an intentionally highly biased viewpoint, designed to get people talking. It was given in a maths department, so I was kind of trolling mathematicians. I think that formal proof verification systems have the potential to offer a lot to mathematicians, and I am very happy to get people talking about them by any means necessary. On the other hand, when I am talking to the formal proofs people I put on my mathematician's hat and emphasize the paragraph above, saying that we have a human mathematical community which knows what it is doing better than any computer, and that this is why it would be a complete waste of time to formalise a proof of Fermat's Last Theorem -- we all know it's true anyway, because Wiles and Taylor proved it, and since then we have generalised the key ideas out of the park.

It is true that there are holes in some proofs, and there are plenty of false lemmas in papers. But mathematics is robust in this extraordinary way. More than once in my life I have said to the author of a paper "this proof doesn't work", and the response was "oh, I have 3 other proofs; one is bound to work" -- and they were right. Working out what is true is the hard, fun, and interesting part. Mathematicians know well that conjectures are important. But writing down the details of an argument is a lot more boring than being imaginative and figuring out how the mathematical world works, and humans generally do a poorer job of it than they could. I am concerned that this will impede progress in the future, when computers start to learn to read maths papers (this will happen at some point, I guess; goodness knows when).

Another thing which I did not stress at all in the Pittsburgh talk but should definitely be mentioned is that although formal proof verification systems are far better when it comes to reliability of proofs, they have a bunch of other problems instead. Formal proofs need to be maintained, it takes gigantic libraries even to do the most basic things (check out Lean's definition of a manifold, for example), different systems are incompatible and systems die out. Furthermore, formal proof verification systems currently have essentially nothing to offer the working mathematician who understands the principles behind their area and knows why the major results in it are true. These are all counterpoints which I didn't talk about at all.
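
To illustrate the library problem on a much smaller example than a manifold, here is a from-scratch sketch (emphatically not mathlib's actual definitions) of what merely stating "G is a group" involves: all the data and axioms must be bundled explicitly before a single theorem can be proved, and a manifold sits on top of many more such layers (topological spaces, charts, atlases, and so on).

```lean
-- A from-scratch sketch (NOT mathlib's actual definition) of a group:
-- even this "basic" object must carry its data and axioms explicitly.
structure MyGroup (G : Type) where
  mul       : G → G → G
  one       : G
  inv       : G → G
  mul_assoc : ∀ a b c : G, mul (mul a b) c = mul a (mul b c)
  one_mul   : ∀ a : G, mul one a = a
  inv_mul   : ∀ a : G, mul (inv a) a = a

-- Every further concept is built on top and needs its own lemmas
-- before anything a working mathematician would call interesting.
def isHom {G H : Type} (gG : MyGroup G) (gH : MyGroup H) (f : G → H) : Prop :=
  ∀ a b : G, f (gG.mul a b) = gH.mul (f a) (f b)
```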

In the future we will find a happy medium, where computers can be used to help humans do mathematics. I am hoping that Tom Hales' Formal Abstracts project will one day start to offer mathematicians something which they actually want (e.g. good search for proofs, or some kind of useful database which actually helps us in practice).
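
For a flavour of the idea (this is only my sketch, not the project's actual format): a formal abstract records a fully precise, machine-searchable statement while deferring the proof. Fermat's Last Theorem over the natural numbers is the standard toy example of such a statement.

```lean
-- A statement-only "formal abstract" in Lean 4 syntax: Fermat's Last
-- Theorem over Nat. The `sorry` records that no machine-checked proof
-- is attached; the statement itself is still precise and searchable.
theorem fermat_last_theorem (x y z n : Nat)
    (hn : 2 < n) (h : x ^ n + y ^ n = z ^ n) : x * y * z = 0 :=
  sorry
```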

But until then I think we should remember that there's a distinction between "results for which humanity hasn't written down the proof very well, but the experts know how to fill in all of the holes" and "important results which humanity believes but which have not actually been proved".

I guess one thing that worries me is that perhaps there are areas which are currently fashionable and have holes in them; those areas will become less fashionable, the experts will leave and slowly die out, and then all of a sudden someone will discover a hole which nobody currently alive knows how to fill, even though the experts could once have filled it.

Comments:

  • It is nice to have a reply from the actual author here. – Feb 1, 2020 at 22:36
  • @TonyK: what are you talking about? There obviously aren't 10000000 "important stuff" research papers, so I think we need to read Kevin's "99.99999%" as hyperbole. – Feb 2, 2020 at 22:39
  • @TonyK: Oh please. – Feb 2, 2020 at 23:59
  • You're probably right. But you know what the real problem is? It's not that there are probably faulty important papers out there, but rather that we have no idea how many or which ones. – Feb 3, 2020 at 7:13
  • "... experts will leave the area and slowly die out, and then all of a sudden someone will discover a hole which nobody currently alive knows how to fill, even though it might have been the case that experts could once do it." A natural solution to this problem is to incentivize grad students and advanced undergrads to write papers filling in these gaps, e.g. for the journal of "Trivial Remarks" or "Not Research Level Questions" (or perhaps less self-deprecating venues). MO rep or similar doesn't create this incentive because it doesn't count towards publications. – Areaperson, Feb 3, 2020 at 20:28

As Kevin Buzzard himself admits in his answer, he somewhat exaggerated his point for effect.

However, I'd submit that if you were unsettled by his talk, then that's a good thing. I don't think that the proper reaction is to look for reassurance that mathematics really is fine, or that the problems are restricted to some easily quarantined corner.

Rather, I think the proper reaction is to strive for a more accurate view of the true state of the mathematical literature, and refuse to settle for comforting myths that aren't based on reality. Some of the literature is rock-solid and can stand on its own, much more of it is rock-solid provided you have access to the relevant experts, and some of it is gappy but we don't really care. On the other hand, some small percentage of it is gappy or wrong and we do care, but social norms within the mathematical community have caused us to downplay the problems. This last category is important. It is a small percentage, but from a scholarly point of view it is a serious problem, and we should all be aware of it and willing to acknowledge it. If, every time someone brings it up, we try to get them to shut up by repeating some "propaganda" that makes us feel good about mathematics, then we are not solving the problem but perpetuating it.

Some related concerns were raised over 25 years ago by Jaffe and Quinn in their article on Theoretical Mathematics. This generated considerable discussion at the time. Let me quote the first paragraph of Atiyah's response.

I find myself agreeing with much of the detail of the Jaffe–Quinn argument, especially the importance of distinguishing between results based on rigorous proofs and those which have a heuristic basis. Overall, however, I rebel against their general tone and attitude which appears too authoritarian.

My takeaway from this is that Jaffe and Quinn made many valid points, but because this is a sensitive subject, dealing with societal norms, we have to be very careful how we approach it. Given the way that the mathematical community currently works, saying that someone's work has gaps and/or mistakes is often taken to be a personal insult. I think that if, as a community, we were more honest about the fact that proofs are not simply right or wrong, complete or incomplete, but that there is a continuum between the two extremes, then we might be able to patch problems that arise more efficiently, because we wouldn't have to walk on eggshells.

Comments:

  • I have had some push-back against the rather gung-ho approach I have been taking to highlight the problems I perceive with the way mathematics is currently being done, and I think Tim's post above does a very good job of highlighting the same issues but in a more balanced way. Thanks. Some of the issues I highlight in my talk are huge gaps in the literature which the experts have absolutely no doubt can be filled. I personally am uncomfortable about this but not everyone is. However there are other gaps, he said, treading on eggshells, where there might be more of a problem. – Feb 7, 2020 at 19:08
  • @KevinBuzzard: I do like the way you point out that everyone eventually dies and that therefore gaps which can be filled only by a tiny number of experts are at risk of becoming permanent gaps. I share your feeling that these sorts of gaps are worrisome. Maybe something can be done about them without ruffling too many feathers, because there's no implicit accusation of an incomplete argument, just a call that the experts perform "community service." – Feb 9, 2020 at 20:19
  • The classification of finite simple groups could serve as a model. Although you express concern that even the current plan might not be completed successfully, at least it's a huge improvement over the situation as it stood in, say, 1985. Credit should be given to the experts who acknowledged the need for a proper writeup and who dedicated an enormous amount of time and effort to the project. – Feb 9, 2020 at 20:22
