103
$\begingroup$

There are many mathematical statements that, despite being supported by a massive amount of data, are currently unproven. A well-known example is the Goldbach conjecture, which has been shown to hold for all even integers up to $10^{18}$, but which is still, indeed, a conjecture.

This question asks about examples of mathematical statements of the opposite kind, that is, statements that have been proved true (thus, theorems) but that have almost no data supporting them or, in other words, that are essentially impossible to guess by empirical observation.

A first example is the Erdős–Kac theorem, which, informally, says that an appropriate normalization of the number of distinct prime factors of a positive integer converges to the standard normal distribution. However, the convergence is so slow that testing it numerically is hopeless, especially because it would require factorizing many extremely large numbers.
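
For concreteness, here is a minimal Python sketch (an illustration of mine, not part of the original statement) that sieves $\omega(n)$, the number of distinct prime factors, up to $10^6$ and standardizes it as in the theorem; even at this range the normal approximation is visibly poor, since $\log\log 10^6 \approx 2.6$.

```
import math

N = 10**6
omega = [0] * (N + 1)              # omega[n] = number of distinct prime factors of n
for p in range(2, N + 1):
    if omega[p] == 0:              # no smaller prime divides p, so p is prime
        for m in range(p, N + 1, p):
            omega[m] += 1

mu = math.log(math.log(N))         # about 2.63: the centering in the Erdos-Kac normalization
sigma = math.sqrt(mu)
z = [(omega[n] - mu) / sigma for n in range(2, N + 1)]

# In the limit these standardized values behave like a standard normal sample,
# so the three quantities below should approach 0, 0.5 and about 0.84;
# at N = 10^6 they are still visibly off.
print(sum(z) / len(z))
print(sum(1 for v in z if v <= 0) / len(z))
print(sum(1 for v in z if v <= 1) / len(z))
```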

Examples should be theorems for which a concept of "empirical observation" makes sense. Therefore, for instance, theorems dealing with uncomputable structures are (trivially) excluded.

$\endgroup$
11
  • 19
    $\begingroup$ Many asymptotic results in analytic number theory are of this form, e.g., en.wikipedia.org/wiki/Skewes%27s_number and (the negation of) en.wikipedia.org/wiki/Mertens_conjecture $\endgroup$
    – Terry Tao
    Dec 29, 2021 at 19:29
  • 5
    $\begingroup$ Here is a similar question. mathoverflow.net/questions/15444/… $\endgroup$ Dec 29, 2021 at 19:54
  • 5
    $\begingroup$ @AlexandreEremenko Seems like Banach-Tarski is covered by the clause, "theorems dealing with uncomputable structures are (trivially) excluded." $\endgroup$ Dec 30, 2021 at 1:27
  • 4
    $\begingroup$ Another, more mundane instance of @TerryTao's comment that many analytic number theory results are only manifest for large numbers (e.g., when $\log\log T$ is large), is the "eventual" regularity of the behavior of the argument of $\zeta(1+it)$. This was already known, and proven, in Titchmarsh. $\endgroup$ Dec 30, 2021 at 3:03
  • 15
    $\begingroup$ A naive, outdated (and probably stupid) example: I suppose that Oresme was extremely surprised when he proved that the harmonic series diverges. $\endgroup$
    – efs
    Dec 30, 2021 at 14:25

17 Answers

81
$\begingroup$

One of the most interesting examples that happened recently is the Katz-Sarnak conjecture asserting that the average rank of elliptic curves (ordered by some reasonable height) defined over $\mathbb{Q}$ is equal to $1/2$. This is a generalization of a more specific conjecture due to Goldfeld which asserts that the average rank of quadratic twists of a given elliptic curve is equal to $1/2$. Goldfeld's conjecture dates back to the 1970's and the Katz-Sarnak conjecture was made in the 1990's.

This is a story that was told by Manjul Bhargava during one of his lectures at Oxford where I was in the audience. He said that, as a graduate student at Princeton in the 90's, he had heard Sarnak lecture on his conjecture with Katz. Curious, Bhargava then looked up existing data on the average rank of elliptic curves and found that of the tens of thousands of elliptic curves that were tabulated, the average rank was quite large (exceeding two), and appeared to be increasing. The young Bhargava printed out the results and showed Sarnak the next day, stating that the data clearly does not support his conjecture. According to Bhargava, without batting an eye Sarnak said "the data is misleading; eventually the average rank will plateau and start going down towards $1/2$ when more curves are considered". Bhargava was apparently unconvinced.

The story of course eventually leads to the groundbreaking work of Bhargava and Arul Shankar in the last decade on the average rank of elliptic curves. In three spectacular papers (here, here, and here) they proved that the average rank of elliptic curves, ordered by the "naive height" defined for an elliptic curve in short Weierstrass form $E_{A,B} : y^2 = x^3 + Ax + B$, $A,B \in \mathbb{Z}$, as $H(E_{A,B}) = \max\{4|A|^3, 27B^2\}$, is at most $1.5$, $1.17$, and $0.885$ respectively. The lecture mentioned above was given in 2016. At the time, the existing data had not yet shown the average rank dipping below $1$, even though the best theoretical result gives an upper bound of $0.885$. Bhargava ended his talk by displaying a brand new chart, compiled from the work of his students and collaborators, showing that the average rank seems to dip below $0.9$, though still above the theoretical bound.

$\endgroup$
3
  • 3
    $\begingroup$ Here is some historical evidence that the heuristic question was discussed, but wasn't settled, by 2006. $\endgroup$ Dec 29, 2021 at 23:05
  • 4
    $\begingroup$ So how did anybody guess the conjecture in the first place? $\endgroup$
    – lalala
    Jan 1, 2022 at 10:26
  • 10
    $\begingroup$ @lalala one would arrive at that conclusion by considering the so-called parity conjecture and the minimalist conjecture. The former asserts that the parity of the rank of elliptic curves should not be biased towards odd or even, so half the time the rank should be even and the other half of the time it should be odd. The minimalist conjecture asserts that ranks are as low as they need to be, which means that half the time they should be 0 and the other half of the time it should be 1. Combining these assertions gives that the rank should be 1/2 on average. $\endgroup$ Jan 1, 2022 at 20:10
81
$\begingroup$

Bootstrap percolation is a two-dimensional two-state cellular automaton with a von Neumann 5-cell ("plus") neighborhood, in which a "white" cell becomes "black" if it has at least 2 black neighbors, and a black cell always remains black.

Picture from Wolfram MathWorld

The question is: on a $L \times L$ square, how dense should the (uniformly random) initial population be to make the whole square black? The result is determined by the critical value $\lambda=\pi^2 /18 \approx 0.548$. When the initial density $p$ satisfies $p \log(L) > \lambda$, the final population will be entirely black with high probability, and if $p \log(L) < \lambda$, the final population will contain white cells with high probability.

Simulations in this paper used $L=28800$ and suggested a critical value of $\lambda = 0.245 \pm 0.015$. According to this webpage, to get close to the limiting value, one needs $L$ at least $10^{20}$, certainly well beyond the range of any computer!
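
For illustration, here is a minimal numpy sketch (mine, not from the paper) of the two-neighbour rule on modest grids; the apparent threshold at such small $L$ is far below $\pi^2/18$, in line with the simulations quoted above.

```
import numpy as np

def fills(L, p, rng):
    """Bootstrap percolation on an L x L grid with initial density p.
    Returns True if every cell is eventually black."""
    black = rng.random((L, L)) < p
    while True:
        padded = np.pad(black, 1)                      # non-wrapping boundary
        neighbours = (padded[:-2, 1:-1].astype(int) + padded[2:, 1:-1]
                      + padded[1:-1, :-2] + padded[1:-1, 2:])
        updated = black | (neighbours >= 2)
        if np.array_equal(updated, black):
            return bool(black.all())
        black = updated

rng = np.random.default_rng(0)
L = 200
for p in (0.03, 0.05, 0.07):
    hits = sum(fills(L, p, rng) for _ in range(20))
    print(f"p = {p}: filled {hits}/20 times, p*log(L) = {p * np.log(L):.3f}")
```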

$\endgroup$
2
  • 15
    $\begingroup$ This automaton is also how water works in Minecraft, except for subtleties about the third dimension. (Very little of the physics in Minecraft is "realistic" and water is no exception.) This is also the setting of the famous "Infected Squares Puzzle", which asks for a proof that an $n \times n$ board requires at least $n$ initially black ("infected") squares in order to infect the whole board. $\endgroup$ Jan 3, 2022 at 13:36
  • 2
    $\begingroup$ (The infected squares puzzle is among my favorite puzzles. It has a very simple answer, but it took me, at least, a while to come up with it) $\endgroup$ Jan 3, 2022 at 14:23
63
$\begingroup$

Letting $\pi$ be the prime counting function and $\mathrm{Li}$ the logarithmic integral, Littlewood proved in his 1914 article "Sur la distribution des nombres premiers" that the difference $\pi(x)-\mathrm{Li}(x)$ changes sign infinitely many times; however, according to Wolfram MathWorld: Skewes Number, Kotnik showed that the smallest $x$ for which this happens is greater than $10^{14}$.
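
As a quick illustration (assuming sympy and mpmath), one can compute $\mathrm{Li}(x)-\pi(x)$ over any feasible range and see only positive values, despite Littlewood's theorem:

```
from sympy import primepi
from mpmath import li

# Li(x) - pi(x) is positive throughout any range we can reach by computation,
# even though Littlewood proved it changes sign infinitely often.
for k in range(4, 9):
    x = 10**k
    diff = li(x, offset=True) - int(primepi(x))   # offset=True gives Li(x) = li(x) - li(2)
    print(f"x = 10^{k}: Li(x) - pi(x) = {float(diff):.1f}")
```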

$\endgroup$
6
  • 7
    $\begingroup$ This has long been my canonical example of the phenomenon the question asks about. $\endgroup$
    – Buzz
    Dec 30, 2021 at 19:54
  • 9
    $\begingroup$ If you use \operatorname{Li} instead of \mathrm{Li} then you get proper horizontal spacing in things like $3\operatorname{Li}(x)$ (where you would otherwise see $3\mathrm{Li}(x)$) and that does not just mean that some additional spacing is there; rather, the spacing depends on the context. Nothing like that happens with \mathrm. If you need to refer to this function 50 times in one paper, then in LaTeX, put \newcommand{\Li}{\operatorname{Li}} before the \begin{document} line, and then within the document, just type $3\Li(x)$. $\qquad$ $\endgroup$ Dec 30, 2021 at 21:00
  • 2
    $\begingroup$ Michael Hardy Thank you for the tip! $\endgroup$
    – Vanni
    Dec 31, 2021 at 14:35
  • 1
    $\begingroup$ @MichaelHardy Is that equivalent to \DeclareMathOperator\Li{Li}? $\endgroup$
    – Z. M
    Jan 3, 2022 at 20:04
  • 2
    $\begingroup$ @Z.M Almost but not quite. Those behave differently in some contexts, but right now I don't remember the details. $\endgroup$ Jan 4, 2022 at 3:59
33
$\begingroup$

Recent breakthrough work of Ben Green, together with an improvement by Zach Hunter, implies that there is a red-blue coloring of $[1,n]$ with no red $3$-term progression and no blue $k$-term progression for $$n = k ^{ \frac{ c \log k }{\log \log k }}.$$

On the other hand, empirical observation suggests this is only possible for $n \approx k^2$.

Moreover the proof is, as far as I know, constructive, i.e. it gives a randomized algorithm to produce an example with high probability. It's just that this algorithm needs relatively large $n$ to do better than $k^2$.
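
To see where the $k^2$ observation comes from, here is a small backtracking sketch (mine, unrelated to the construction in the papers) that finds the largest $n$ admitting such a coloring for very small $k$; the values it produces are, by definition, one less than the off-diagonal van der Waerden numbers $w(3,k)$, which for small $k$ grow roughly like $k^2$.

```
def extend(colors, n, k):
    """Extend a partial red/blue coloring of {1,...,len(colors)} to {1,...,n},
    avoiding red 3-term APs and blue k-term APs; colors[j-1] is the color of j."""
    pos = len(colors) + 1
    if pos > n:
        return colors
    for colour, length in (("R", 3), ("B", k)):
        # the only progressions a new element can complete are those ending at pos
        bad = any(
            all(colors[pos - i * d - 1] == colour for i in range(1, length))
            for d in range(1, (pos - 1) // (length - 1) + 1)
        )
        if not bad:
            result = extend(colors + [colour], n, k)
            if result is not None:
                return result
    return None

for k in (3, 4, 5):
    n = 1
    while extend([], n, k) is not None:
        n += 1
    print(f"k = {k}: largest n with a valid coloring is {n - 1}")
```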

$\endgroup$
5
  • 44
    $\begingroup$ Hey that’s me! Yes I would say the proof is quite constructive (in theory). However I have done some coding to try and explicitly construct such colorings, without much luck. So in practice we may need $n$ to be very large for these strategies to beat $k^2$ whp. $\endgroup$ Dec 30, 2021 at 5:45
  • $\begingroup$ It's an interesting construction: they fill the surface of a high-dimensional torus (e.g. a box with opposite sides identified) with spherical shells (high-dimensional annuli) of very small thickness and draw a line along the torus, sampling evenly spaced points (on a shell -> red, off -> blue). This may or may not work depending on the arrangement of the shells, but careful analysis of the bounds shows that, for a random arrangement, there's a nonzero probability of it working in sufficiently high dimension. (@ZachHunter Is this accurate?) $\endgroup$ Jan 3, 2022 at 14:02
  • 1
    $\begingroup$ @AkivaWeinberger, that's essentially correct. A pedantic note is that you have swapped "red" and "blue". But yes the construction is carried in the way you described. I'll clarify that as $k\to \infty$, the constructions succeed with probability approaching one which is a bit stronger than nonzero probability. $\endgroup$ Jan 3, 2022 at 14:55
  • 1
    $\begingroup$ (I think it was Will Sawin - the OP - who swapped them, actually...!) $\endgroup$ Jan 3, 2022 at 15:01
  • $\begingroup$ Ah yes, I somehow missed that, haha. :) $\endgroup$ Jan 3, 2022 at 15:01
30
$\begingroup$

Goodstein's theorem. For a nonnegative integer $n$, the hereditary base $b$ representation of $n$ is found by writing $n$ in base $b$, then writing the exponents in base $b$, recursively, until the process terminates when all exponents are less than $b$. For example, the hereditary base $3$ representation of $59056$ is $3^{3^2 + 1} + 2 \cdot 3 + 1$.

Now for a positive integer $k$, we can define the Goodstein sequence $G_k$ by $G_k(0) = k$, and obtain $G_k(n+1)$ from $G_k(n)$ by writing $G_k(n)$ in hereditary base $n+2$, replacing each instance of $n+2$ with $n+3$, and then subtracting one.

Goodstein's theorem states that these sequences always reach $0$. However, for $k \ge 4$, it takes an extremely large number of iterations for this to occur, or even for the terms to start decreasing. (It is claimed on Wikipedia that, for $k=4$, the terms increase until base $3 \cdot 2^{402\ 653\ 209}$.) This makes the theorem effectively impossible to discover by empirical observation.
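
Here is a short Python sketch (illustrative only) of the base-bumping step and the resulting sequences; $G_3$ dies out almost immediately, while $G_4$ visibly explodes.

```
def bump(n, b):
    """Write n in hereditary base b, replace every b by b+1, and evaluate."""
    if n < b:
        return n
    total, power = 0, 0
    while n:
        n, digit = divmod(n, b)
        if digit:
            total += digit * (b + 1) ** bump(power, b)
        power += 1
    return total

def goodstein(k, steps):
    """The first `steps` terms of the Goodstein sequence starting at k."""
    seq, base = [k], 2
    while len(seq) < steps and seq[-1] > 0:
        seq.append(bump(seq[-1], base) - 1)
        base += 1
    return seq

print(goodstein(3, 10))   # [3, 3, 3, 2, 1, 0] -- reaches 0 almost immediately
print(goodstein(4, 8))    # [4, 26, 41, 60, 83, 109, 139, 173] -- and keeps growing
```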

$\endgroup$
3
  • 2
    $\begingroup$ This is a great example, especially when combined with the incredible counterintuitiveness of the theorem. $\endgroup$ Dec 30, 2021 at 19:53
  • 4
    $\begingroup$ This is a good example of an answer to the question that arises from a fast-growing function, as I mentioned in another answer. The "Goodstein function" that sends $n$ to the number of iterations required by $n$ grows so fast that PA (first-order Peano arithmetic) cannot prove that it is a total function. $\endgroup$ Dec 30, 2021 at 22:15
  • 2
    $\begingroup$ That number, $3\cdot2^{\rm whatever}$, is calculated in this video: youtu.be/hm3iOoTQCLA $\endgroup$ Jan 3, 2022 at 14:07
17
$\begingroup$

Probably Shannon's theorem(s) about the existence of good (error-correcting) codes is an example: "most"/"random" codes achieve close to channel capacity, but finding explicit descriptions (and economical decoding algorithms) has been a major research problem in the intervening decades.

This example seems especially piquant to me, because we want a code that is "random" enough to be good, but "secretly" structured in human/algorithmic terms that we can efficiently compute with it.

$\endgroup$
4
  • 1
    $\begingroup$ Re " "most"/"random" codes achieve close to channel capacity ": doesn't that mean that data supports Shannon's theorems? $\endgroup$
    – BubbleZ
    Dec 30, 2021 at 10:09
  • 5
    $\begingroup$ @BubbleZ The catch is that most codes don't have an efficient decoding algorithm. I guess that one could imagine an alternative universe in which, prior to proving Shannon's theorem, people performed an empirical investigation by generating random codes and laboriously decoding them to discover that they have close to the channel capacity. But it's hard to imagine people executing exponential time decoding algorithms in the 1940s in order to test a hypothesis that they have no a priori reason to believe is true. $\endgroup$ Dec 30, 2021 at 14:16
  • 11
    $\begingroup$ In 2009 polar codes were shown to achieve 100% channel capacity. These are now being implemented in 5G cellular networks. $\endgroup$ Dec 31, 2021 at 18:34
  • $\begingroup$ @PhilFreedenberg Wow! I had not been keeping track of such developments for several years now...! Amazing. $\endgroup$ Dec 31, 2021 at 18:37
15
$\begingroup$

So-called non-constructive proofs in combinatorics typically yield existence proofs of certain combinatorial objects without providing any efficient algorithm for finding them. These proofs may rely on counting arguments or on algebraic or topological existence results that have no known efficient algorithm. Paradoxically, the proofs sometimes guarantee the existence of exponentially many objects, yet fail to provide an efficient method of finding even one. Howard Karloff has described such situations as "finding hay in a haystack."

You can find many examples in this paper by Noga Alon, such as the following result of J. Spencer.

Let $v_1,\ldots,v_n$ be $n$ vectors in $\mathbb{R}^n$, and suppose that the $\ell^\infty$-norm of each $v_i$ is at most $1$. Then there are $\epsilon_1,\ldots,\epsilon_n \in \{-1,1\}$ such that the $\ell^\infty$-norm of the sum $\sum_{i=1}^n \epsilon_i v_i$ is at most $6\sqrt{n}$.

I believe that if you were to investigate this question empirically by generating examples and computationally searching for suitable signs, you would soon encounter instances for which you would be unable (in practice) to achieve the bound guaranteed by Spencer's theorem. Alon speculates that there should exist an efficient algorithm, but as of the time he wrote the paper, no such algorithm was known. [EDIT: This is probably not the best example from Alon's paper to illustrate the point I'm making; see comments below. The Lovász Local Lemma gives better examples; e.g., a $k$-uniform hypergraph in which each edge shares vertices with at most $\sim 2^k\!/e$ other edges admits a 2-coloring with no monochromatic edge, but I doubt that this theorem would have been found empirically. There is now a polynomial time algorithm for finding the coloring, but it is highly sophisticated, and the search for such an algorithm was motivated by the theorem, rather than the other way around. See also this blog post by David Eppstein which summarizes a talk by Noga Alon on this topic.]
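
For what it's worth, here is a numpy sketch of the naive strategy discussed in the comments below (random signs followed by greedy sign flips); as noted there, for moderate $n$ this already lands well under $6\sqrt{n}$, which is part of why this particular example is not ideal.

```
import numpy as np

rng = np.random.default_rng(1)
n = 400
V = rng.uniform(-1, 1, size=(n, n))     # rows are the vectors v_1, ..., v_n
eps = rng.choice([-1, 1], size=n)
s = eps @ V                             # the signed sum, a vector of length n

improved = True
while improved:                         # greedy local search on the sup-norm
    improved = False
    for i in range(n):
        candidate = s - 2 * eps[i] * V[i]
        if np.abs(candidate).max() < np.abs(s).max():
            s, eps[i] = candidate, -eps[i]
            improved = True

print("sup-norm achieved:", np.abs(s).max())
print("Spencer's bound 6*sqrt(n):", 6 * np.sqrt(n))
```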

In a similar vein, Keevash's spectacular proof of the existence of designs allows us to write down parameters for combinatorial designs which we now know must exist, even though we can't write down explicit examples. Empirical search for such designs would of course turn up empty.

$\endgroup$
5
  • $\begingroup$ I actually disagree. In many similar ones (e.g. Ramanujan graphs) there are nonconstructive existence results but, separately, it's IIRC relatively easy to find examples by generating objects at random and throwing out the ones that don't have the desired property, because a high proportion empirically have that property. $\endgroup$
    – Will Sawin
    Dec 30, 2021 at 0:27
  • $\begingroup$ For this example, if you choose the $\epsilon_i$ at random, the coordinates of the sum will be approximately Gaussian-distributed, so you're looking for examples of $n$ samples without $6$-sigma deviations, which should be easy to find for $n$ at most a million or so. $\endgroup$
    – Will Sawin
    Dec 30, 2021 at 0:28
  • $\begingroup$ @WillSawin It's certainly not true in general that random generation works for the examples in Alon's paper. For example the Lovasz Local Lemma usually gives you exponentially rare objects. But maybe you are right about Spencer's result in particular; I admit I didn't think too hard about whether there is an easy randomized algorithm that appears to work in practice but doesn't have a provable guarantee. $\endgroup$ Dec 30, 2021 at 0:31
  • $\begingroup$ @WillSawin I peeked at Spencer's paper. It seems that the naive idea of choosing random signs easily gives a provable guarantee with an extra factor of $\sqrt{\ln n}$. Maybe in practice the algorithm works better than that. I guess I should have picked a Lovasz Local Lemma example or something to better make my point. $\endgroup$ Dec 30, 2021 at 0:39
  • 1
    $\begingroup$ Part of my point is just that "small numbers can be deceiving" works both ways - it takes a while before $\sqrt{\log n}$ becomes bigger than $6$! (If the randomized algorithm just barely fails one can do a local search - e.g. look for one $i$ such that flipping $\epsilon_i$ lowers the $\ell^{\infty}$-norm, and then iterate until reaching a local minimum. This is even harder to analyze rigorously but often effective in practice.) I don't deny that surely some example in that paper is hard to find in practice. $\endgroup$
    – Will Sawin
    Dec 30, 2021 at 2:20
13
$\begingroup$

Fast-growing functions provide one source of examples. Here is one that Harvey Friedman has described in his paper on Long finite sequences (J. Combin. Theory Ser. A 95 (2001), 102–144; see also his lecture notes on Enormous Integers).

Say that a ternary sequence $x_1, x_2,\ldots, x_n$ is block subsequence free if for no $i < j \le n/2$ is $x_i,\ldots,x_{2i}$ a subsequence of $x_j,\ldots, x_{2j}$. Are there arbitrarily long block subsequence free ternary sequences?

The answer is no. But to write down the maximum length of such a sequence, you need the Ackermann function (or some similar notation). That is, if you empirically try to write down a long block subsequence free ternary sequence, all the evidence will suggest that you can continue forever.
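
Here is a small Python sketch of that experiment (mine); over a binary alphabet an exhaustive search confirms that the longest block subsequence free word has length $11$ (cf. the comments).

```
from itertools import product

def is_subseq(a, b):
    it = iter(b)
    return all(c in it for c in a)

def block_subseq_free(x):
    """For no i < j <= n/2 is x_i..x_{2i} a subsequence of x_j..x_{2j} (1-indexed)."""
    n = len(x)
    return not any(
        is_subseq(x[i - 1:2 * i], x[j - 1:2 * j])
        for j in range(2, n // 2 + 1)
        for i in range(1, j)
    )

# Exhaustive search over the binary alphabet: the maximum length is 11.
longest = max(n for n in range(1, 15)
              if any(block_subseq_free(w) for w in product("12", repeat=n)))
print(longest)
```

Replacing "12" by "123" (and raising the range as far as patience allows), the search keeps finding valid words at every length it can reach, which is exactly the misleading empirical picture described above.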

$\endgroup$
3
  • $\begingroup$ TREE(3) and the hydra game are probably the two most famous ones. The former is so unimaginably large we don't have good notation to describe it (it's much larger than, say, Graham's number) $\endgroup$
    – BlueRaja
    Dec 30, 2021 at 4:53
  • $\begingroup$ @BlueRaja Yes. But I like block subsequences because they're easy to grasp. Replace ternary with binary and you can quickly figure out that the longest block subsequence free binary sequence has length eleven. But then when you switch to ternary, it's not so clear what's happening. You can write a simple computer program to create them, and it looks like you can keep going forever... $\endgroup$ Dec 30, 2021 at 5:11
  • 1
    $\begingroup$ I see now that this example is given as an answer to a related MO question on eventual counterexamples. $\endgroup$ Dec 30, 2021 at 14:35
9
$\begingroup$

In 1912 Sergei Bernstein studied the best uniform approximation of $|x|$ on $[-1,1]$ by polynomials of degree $2n$. Denoting by $E_{2n}$ the error of the best approximation, he proved the existence of the limit $$\beta=\lim_{n\to\infty}2nE_{2n}.$$ This number is nowadays called the Bernstein constant. His own numerical estimation gave $0.278<\beta<0.286$; noting that these estimates have an average of $0.282$, which is very close to $(2\sqrt{\pi})^{-1}$, Bernstein conjectured that $\beta=(2\sqrt{\pi})^{-1}$.

This was disproved in a 1985 paper by Richard Varga and Amos Carpenter, who computed $\beta$ to fifty decimal places, and proved bounds tight enough to determine the first four digits as $0.2801$. Their result disagrees with Bernstein's conjecture already in the third digit. Nothing is known about the arithmetic nature of $\beta$.

$\endgroup$
2
  • 10
    $\begingroup$ What is the unguessable proven statement here? $\endgroup$
    – Buzz
    Dec 30, 2021 at 20:28
  • 1
    $\begingroup$ This is exactly the opposite case to what the question is asking. You explicitly state that empirical observation has been repeatedly used to guess or disprove guesses towards an eventual theorem. $\endgroup$
    – Brady Gilg
    Jan 4, 2022 at 19:07
7
$\begingroup$

Other answers mention some advanced math theorems that are mostly beyond my understanding. But the fact that the sum of the reciprocals of the prime numbers diverges still baffles me. I have seen no intuition or empirical data to support it (which might be due to my limited knowledge). Euler's proof, and a few other ones that I have read, were based solely on abstract mathematical facts, in contrast to the divergence of the harmonic series, which has some nice intuitive proofs.
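
In the spirit of Will Sawin's comment below, here is a tiny sketch (assuming sympy) comparing $\sum_{p\le N} 1/p$ with $\log\log N$ plus the Meissel–Mertens constant; the agreement is good, but the quantity being tracked barely moves.

```
from math import log
from sympy import primerange

M = 0.2614972128          # Meissel-Mertens constant
for N in (10**3, 10**5, 10**7):
    s = sum(1 / p for p in primerange(2, N))
    print(f"N = 10^{len(str(N)) - 1}: sum 1/p = {s:.4f}, loglog N + M = {log(log(N)) + M:.4f}")
```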

$\endgroup$
13
  • 15
    $\begingroup$ It's known that $\sum_{p \le n} \frac{1}{p} \sim \log\log n$, so your answer is appealing to the well-known "The function $\log\log n$ goes to infinity but has never been observed to do so". $\endgroup$ Dec 30, 2021 at 9:27
  • 4
    $\begingroup$ re: "It's known that $\sum_{p \le n} \frac{1}{p} \sim \log\log n$" that's exactly the problem! How would someone know this? (hope you get what I'm saying) $\endgroup$
    – polfosol
    Dec 30, 2021 at 9:38
  • 4
    $\begingroup$ I would argue that the fact can be supported by empirical data. 1. The primes are "randomly distributed" with density $1/\log(n)$ (grounded by empirical data) 2. The sum of the reciprocals of such a random sequence diverges almost surely (rigorous). 3. Thus the sum of the reciprocals of the primes diverges. $\endgroup$ Dec 30, 2021 at 11:50
  • 6
    $\begingroup$ Both LeechLattice's method and Sam Nead's closely related one are indirect, but there is an easy direct empirical argument: Just sum $1/p$ for $p$ up to $n$, subtract $\log \log n$, and look at its behavior! Two graphs of this, pretty convincing to me, are contained in this wikipedia page: en.wikipedia.org/wiki/Meissel%E2%80%93Mertens_constant $\endgroup$
    – Will Sawin
    Dec 30, 2021 at 15:26
  • 7
    $\begingroup$ I attended a talk many years ago where Matiyasevich mentioned the divergence of $\sum p^{-1}$ and added, "but the sum of the reciprocals of all the known primes does not exceed five – and never will!" (or words to that effect) $\endgroup$ Dec 31, 2021 at 2:00
6
$\begingroup$

The following approximation to $\pi$, from "Strange Series and High Precision Fraud" by Borwein and Borwein, provides another example: $$ p:=\left(\frac1{10^5} \sum_{n=-\infty}^\infty e^{-\frac{n^2}{10^{10}}} \right)^2 \approx \pi,$$ where the left and the right side agree to at least 42 billion decimal places.

Certainly, the theorem "$p \neq \pi$" could not have been found empirically!
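
Indeed, in ordinary double precision the two sides are indistinguishable, as a quick numpy check shows (the terms with $|n| > 10^6$ are far below machine precision):

```
import numpy as np

n = np.arange(-10**6, 10**6 + 1, dtype=np.float64)
s = np.exp(-n**2 / 1e10).sum()
p = (s / 1e5)**2
print(p, np.pi, p - np.pi)    # p agrees with pi to machine precision
```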

$\endgroup$
4
  • 6
    $\begingroup$ What is the proof of the inequality? Example 6.1 of that paper (doi.org/10.2307/2324993) proves $|\sqrt{\pi}-10^{-5}\sum\cdots|<10^{-4.2\cdot10^{10}}$, but it'd be nice to see an argument for a positive lower bound. $\endgroup$
    – user44143
    Jan 6, 2022 at 18:52
  • 1
    $\begingroup$ Does this have anything to do with the number $10$, or could we replace $10$ by $N$ and $5$ by $N/2$ and get closer and closer to $\pi$ as $N$ grows large? $\endgroup$ Apr 21, 2022 at 13:36
  • $\begingroup$ @darijgrinberg Yes. (The statement in the paper is a little bit different, but the idea is the same.) $\endgroup$
    – rimu
    Apr 21, 2022 at 17:54
  • $\begingroup$ @darijgrinberg I think this formula is basically the Riemann sum for the integral $\int_{-\infty}^{+\infty}e^{-t^2}\,dt$ (with intervals of length $10^{-5}$), or rather its square. $\endgroup$
    – richrow
    Jul 19 at 8:53
4
$\begingroup$

Going back to the 18th century, when L. Euler established $$\sum_{n\ge1}\frac1{n^2}=\frac{\pi^2}6,$$ he certainly had no guess about the value from empirical observation (= partial sums).
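
As the comments point out, the raw partial sums converge too slowly to reveal $\pi^2/6$, and Euler needed acceleration. Here is a small sketch of both, using the first terms of the tail expansion $\sum_{k>N} 1/k^2 = 1/N - 1/(2N^2) + \cdots$ as a crude stand-in for Euler–Maclaurin acceleration:

```
from math import pi

target = pi**2 / 6
for N in (10, 100, 1000, 10**4):
    partial = sum(1 / k**2 for k in range(1, N + 1))
    corrected = partial + 1 / N - 1 / (2 * N**2)   # first terms of the tail expansion
    print(f"N = {N}: error of partial sum {target - partial:.2e}, "
          f"after correction {target - corrected:.2e}")
```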

$\endgroup$
3
  • 4
    $\begingroup$ Is that true? I seem to remember reading somewhere that Euler computed this sum to high accuracy and recognized it as $\pi^2 / 6$. In fact, I remember reading that it was precisely because of this empirical observation that he thought to consider factorizations of $\sin(x)$! Of course, you're right that the partial sums converge quite slowly, and Euler needed to discover Euler-Maclaurin summation in order to get the accuracy he wanted! See, for instance, this historical paper. $\endgroup$ Jan 3, 2022 at 10:36
  • 1
    $\begingroup$ @HallaSurvivor. Very interesting ! $\endgroup$ Jan 3, 2022 at 11:25
  • 3
    $\begingroup$ I remember reading the same: Euler developed a convergence acceleration technique that allowed him to compute the sum to high accuracy much more quickly than the naïve method, and it was only after having computed it numerically that he had a guess at what the answer was. (Incidentally, Euler's proof, while brilliant, is not my favorite proof: I'm rather fond of Johan Wästlund's geometric proof presented here) $\endgroup$ Jan 3, 2022 at 14:19
3
$\begingroup$

A slightly old question, but I've only just seen it...

Lord Brouncker proved in 1654 that $$ \dfrac{1}{4}\pi = \cfrac{1}{1+\cfrac{1^2}{3+\cfrac{2^2}{5+\cfrac{3^2}{7+\dotsb}}}} $$ This has very slow convergence. Apparently people didn't believe it initially. It takes nearly 50 terms for five decimal places of accuracy and nearly 120 for six! It converges in an oscillatory manner and the jumps are quite large for quite a long time.

Of course, 120 terms isn't that many for modern computers and there would be no doubt about it these days. But, 120 is a lot to do by hand in the 1650s!

$\endgroup$
1
  • $\begingroup$ I disagree. Even in that time one could have worked out that $n^2/(2n+1+(n+1)^2/...)$ $\approx (\sqrt{2}-1)(n-1/2)$. Then already the fourth term is $\approx 3.14154/4$. $\endgroup$ Jul 18 at 7:15
3
$\begingroup$

It seems somewhat miraculous that I would hear another talk, given at the same location but about 7 years later, describing another seemingly impossible-to-make conjecture. A. R. Miller had conjectured (On parity and characters of symmetric groups) that, for any fixed prime power $q$, the proportion of entries $\chi_\lambda(\mu)$ of the character table of $S_n$ (where $\lambda$ ranges over integer partitions of $n$ and $\mu$ over cycle types of the symmetric group $S_n$) that are divisible by $q$ tends to $1$ as $n \rightarrow \infty$. According to the speaker, Kannan Soundararajan, Miller had made the conjecture based only on knowledge of the character tables of $S_n$ with $n \leq 76$, and in that range the proportion of the $\chi_\lambda(\mu)$ divisible by $5$, say, was only about 50%. Nevertheless, Peluse and Soundararajan proved Miller's conjecture here: Divisibility of character values of the symmetric group by prime powers

$\endgroup$
2
$\begingroup$

Khinchin's constant is a good one: https://en.m.wikipedia.org/wiki/Khinchin%27s_constant

It says that for almost all real numbers, the geometric mean of the first $n$ coefficients of their continued fraction expansion converges as $n$ goes to infinity. Not only does it converge, it converges almost always to the same value, Khinchin's constant, independent of the number you started with.

I'm not up to date on any new results on this, but it seems that this property has only been proven to hold for numbers that were constructed specifically so that it holds. Numerical evidence suggests that $\pi$, $\gamma$, and Khinchin's constant itself satisfy this property.
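
For instance (assuming mpmath), one can generate a few hundred partial quotients of $\pi$ at high precision and take their geometric mean; this is the kind of numerical evidence meant above. Khinchin's constant is $2.685452\ldots$

```
from mpmath import mp, floor, exp, log, pi

mp.dps = 500                      # high precision: each quotient consumes roughly one digit
x = +pi
quotients = []
for _ in range(300):
    a = int(floor(x))
    quotients.append(a)
    x = 1 / (x - a)

geometric_mean = exp(sum(log(a) for a in quotients[1:]) / (len(quotients) - 1))  # skip a_0 = 3
print(geometric_mean)             # compare with Khinchin's constant 2.685452...
```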

$\endgroup$
5
  • 6
    $\begingroup$ If numerical evidence suggests that some numbers have this property, why would it be hard to guess by empirical observation? Isn't empirical observation more-or-less the same thing as looking at numerical evidence? $\endgroup$
    – Will Sawin
    Jan 2, 2022 at 23:34
  • 5
    $\begingroup$ I never have understood why this was considered especially interesting. For almost all real numbers, the limit of the arithmetic mean of their first n decimal digits approaches 4.5, but that statement is rightfully considered boring. I don't see the difference. $\endgroup$
    – Brady Gilg
    Jan 4, 2022 at 19:03
  • $\begingroup$ @BradyGilg The analogy feels very naive; the arithmetic mean of the digits in a base representation of a number isn't super interesting. In fact I would say the two "theorems" are saying opposite things: the decimal example is basically saying "there is no particular structure to decimals". It would be very head-scratching if the arithmetic mean were 2.37..., i.e., if on average there were a tendency for smaller digits to appear (if true). A priori, is it obvious to you that "100" is significantly less likely than "1" to appear in a continued fraction? Why is there such a "bias" towards small numbers? etc. $\endgroup$ Nov 8 at 0:52
  • $\begingroup$ Basically, if you have just learned about continued fractions, why wouldn't you think the geometric mean diverges almost everywhere (naively extending the "decimal logic", that is what you would expect)? It is trivial to construct uncountably many such examples. In fact it feels rather "unnatural" to construct examples with small geometric mean: there are infinitely many natural numbers $\geq 3$ (which is bigger than $K$), so why do the vast majority of numbers have continued fraction geometric mean less than 3? $\endgroup$ Nov 8 at 1:02
  • $\begingroup$ A bit further: if you were to construct continued fractions by choosing terms from $\{1,...,9,10\}$ uniformly at random, the result would almost always lie in the measure-zero set excluded by the theorem, in part because this "random construction" generally produces numbers with a large geometric mean compared to random real numbers! Basically, I think "look, the decimal arithmetic mean converges" is a very misleading comparison (one I have seen a number of times) that isn't much deeper than "look, two different kinds of means converge". $\endgroup$ Nov 8 at 2:05
1
$\begingroup$

The growth rate of $n\mapsto\mathrm{TREE}(n)$ seems to be impossible to guess from experiments or observation.

$\endgroup$
2
  • $\begingroup$ Isn't this just because $\operatorname{TREE}(n)$ isn't possible to compute by hand for $n>2$? $\endgroup$
    – Will Sawin
    Dec 30, 2021 at 22:01
  • 7
    $\begingroup$ It would probably be better to rephrase this example along the lines of the answer based on Goodstein's theorem or the answer based on block subsequences. Something like, can one construct an arbitrarily long sequence of trees such that for no $i<j$ is $T_i \le T_j$? Empirically, it looks like you can, but in fact you can't. $\endgroup$ Dec 30, 2021 at 22:22
1
$\begingroup$

Pretty much every proof in computability theory that uses a priority construction could be said to have this property, since, if you work through it, virtually every substantial step operates on some initial segment that is far too long to check in practice (e.g. you will be trying to preserve an initial segment long enough to let the first $n$ computable functionals converge, if they ever do so at any point in the construction, with the interesting $n$ being pretty big... not quite busy beaver territory, but that kind of thing).

Note that computability results are usually the kind of results for which one should, in principle, be able to get empirical support (e.g., they claim that all naturals have or fail some checkable property, or that there is an $e$ such that every $s$ has some checkable property, or, most often, a $\Pi^0_3$ property).

For instance, the claim that there is an incomparable pair of r.e. degrees is a claim of the form $\exists e, i\ \forall j\ \exists n$, but since all the important action happens at hugely large values of $n$, you couldn't hope to gain meaningful direct empirical support even given the values of $e, i$ that work. I'm sure there is a better example of a $\Pi^0_2$ claim proved via a priority argument, but it's not popping out at me now.

$\endgroup$
1
  • 2
    $\begingroup$ And when I say the numbers involved are unmanageably large, I don't mean any of that bullshit about small rates of growth like $e^{e^n}$. I mean properly fast growing, like the busy beaver function. Checking that the witnesses $e, i$ above work for the first 50 values of $j$ is going to require looking at values of $n$ with more digits than the number of atoms in the observable universe. $\endgroup$ Jan 1, 2022 at 15:55
