LeanDojo: Theorem Proving in Lean Using LLMs(leandojo.org) |
LeanDojo: Theorem Proving in Lean Using LLMs(leandojo.org) |
https://x.com/VictorTaelin/status/1811167900780175423
Can recommend taking a look at their recorded Twitch stream to see it in action.
Also, it's the most complicated pure reasoning task you can build. So working on theorem-proving AI may help in reasoning and reliability.
This can be applied on something like: Is your extension on distributed system protocol (like Paxos) correct? Does my novel security system hold certain properties? etc etc...
... basically when you need to be sure certain properties of your system hold.
You can verify only the critical parts, as in a data structure or algorithm, or you can verify higher-level parts of a system. All you're doing with math is thinking (out loud) and writing a proof is constructing a convincing argument. If you need to be certain that a thread doesn't leak addresses in shared memory to other threads then you ought to take the time to think through how you're going to achieve that and prove that your solution works.
10+ years out from a Finance PhD where I ended up using numerical methods because I really didn't have the math skills to prove closed form solutions.
Would love to know if, starting with a stochastic differential equation, how far your can go re: applying Ito's lemma and working through the math to get to a closed form solution (using Lean).
It the main advantage of Lean (ignoring LLM assistance) that you build up declarative code that, as you incrementally work on the proof, guarantees that the proof-so-far is correct?
So you're still "working through the math" but rather than un-executable math notation written on a pad, you have "by induction" a guaranteed valid argument up to the point you are at in the proof?
Just trying to build a mental model of Lean > pen and pad.
https://github.com/lecopivo/SciLean
To your last point, the idea is that numerical approximations can be introduced (and introduction will ask for proofs of validity! but you can ignore "the proving" via `sorry`) to go from un-executable math notation (in Lean4) to executable!
Whether the proof goes through doesn't affect the final executable.
Still not convinced that LLMs do anything else than rearranging other people's work.
Effects can already be seen: The Washington Post used to display articles when found via Google, now you get a paywall. And I can no longer criticize them for it.
I’m not convinced that most people do anything else than rearrange other people’s work.
It's amazing how useful and powerful that is in certain contexts.
Privately, I think Lean could be incredibly powerful if baked deep into an ML's kernel/structure/training process. I think an AI proof of something like the Riemann Hypothesis may well be possible if you get enough resources behind it.
I read the article. It doesn't say anything about generating new proofs to train on. It only mentions scraping Github for lean theorems+proofs.
Not sure of the feasibility of implementing lean as a GPU kernel, if you meant that. Also I'm not sure it makes sense to run Lean inside the (mostly matmul) training process. Now to use it to prepare some training data, it seems more realistic. But that seems to be what AlphaProof tries to do in the reinforcement step, if I'm not mistaken.
The problem is mostly that it's fairly intensive to code an efficient RL trainer for this, and even then it's expensive to run the training.
Of course, what that really means and how you'd go about adapting your CUDA code / hardware, would have to be researched.
This would give LeanDojo a lot more training data, and hopefully give us an open source proof assistant at IMO Silver level.
For example, the IUT "proof" of the abc conjecture used completely novel systems to come to its conclusions, to the point that it took top number theorists a few years to even parse what it was saying, and only then could they find the faults in the proof.
I think the IUT proof would be a great adversarial example for any of these systems; if you can find the problem in the proof, you know you've hit on something pretty powerful.
Is it still open whether there is a problem?
Last I heard about this, there was one guy saying there's problems and the author dismissing that as "they just didn't understand it" without showing much interest in explaining it better...
Humans have discovered independence proofs, e.g. Paul Cohen’s 1963 proof that the continuum hypothesis is independent of ZFC. I can’t see any reason in principle why a computer couldn’t do the same.
If the Riemann hypothesis is independent of ZFC, and there exists a proof of that independence which is of tractable length, then in principle if a human could discover it, why couldn’t a sufficiently advanced computer system?
Of course, it may turn out either that (a) Riemann hypothesis isn’t independent of ZFC (what most mathematicians think), or (b) it is independent but no proof exists, or (c) the shortest proof is so astronomically long nobody will ever be able to know it
> The incompleteness theorem for example is a meta-mathematical statement about the limits of axiomatic systems that can not be discovered with axiomatic systems alone.
We have proofs of Gödel‘s theorems. I see no reason in principle why a (sufficiently powerful) automated theorem prover couldn’t discover those proofs for itself. And maybe even one day discover proofs of novel theorems in the same vein
I can probably cook some highly artificial ones up if I try, but maybe there's an interesting one out there!
Not aware of any myself, no. (But I’m far from an expert on this topic.)
It just occurred to me as a logical possibility.
I imagine that would be the main use case for heuristic solvers like this one - helping mathematicians fill in the blanks in proofs for stuff that's not too tricky but annoying to do by hand. Rather than for discovering novel, unknown concepts by itself (although I'm with the OP, don't see why this is impossible a priori).
Intelligent systems (once eventually devised) will use computation machines as the substrate to implement intelligence in a similar way as the human intelligence uses a biological substrate to perform gazillions of individually unintelligent computations.
Don't confuse the substrate for the system
Riemann is so widely believed to be true that there are entire branches of mathematics dedicated to seeing what cool things you can learn about primes/combinatorics etc by taking Riemann to be true as an assumption.
In any event, I'm dropping out of this thread since I don't have much else to say on this and it often leads to unnecessary theorycrafting with people who haven't done the prerequisite reading on the relevant matters.
Humans reason about transcendental induction using finite time and finite resources-the human brain (as far as we know) is a finite entity. So if we can reason about the transfinite using the finite, why can’t computers? Of course they can’t do so by directly reasoning in an infinite way, but humans don’t do that, so why think computers must?
I hear your point that humans reason with heuristics that are "outside" of the underlying formal system, but I don't know of a single case where the resulting theorem could not be formalised in some way (after all, this is why ZFC+ was such a big deal foundationally). Similarly, an AI will have its own set of learned heuristics that lead it to more rigorous results.
Also agree about minds and computers and such, but personally I don't think it has much bearing on what computers are capable of mathematically.
Anyway, cheers. Doesn't sound like we disagree about much.
I understand OP's point of diagonalization proof being impossible to prove on a computer. (Did I get this right?)
In mathematics we often use language to talk about a hypothetical function without actually implementing it in any specific programming language. Formal proofs do exactly the same thing in their own formal language.
Although in the case of Cantor's diagonal argument, I don't know in what sense any function involved in that proof would even fail to be implementable in a specific programming language,. Let's say I encode each real number x such that 0 <= x < 1 in my program as a function which takes a positive integer n and returns the n'th digit of x after the decimal point. In Python syntax:
from typing import Callable
# PosInt, Zero and One aren't built-in Python types but just assume they are
Number = Callable[[PosInt], Zero | One]
Sequence = Callable[[PosInt], Number]
The function involved in the diagonalisation argument can then be implemented as follows: def diagonalize(sequence: Sequence) -> Number:
def diagonal(n: PosInt) -> Zero | One:
if sequence(n)(n) == 0:
return 1
else:
return 0
return diagonal
The argument consists of the the observation that whatever sequence you pass as an argument into "diagonalize", the returned number will not be present in the sequence since for every positive integer n, the n'th digit of the returned number will be different from the n'th digit of the n'th number in the sequence, and hence the returned number is distinct from the n'th number in the sequence. Since this holds for every positive integer n, we can conclude that the returned number is distinct from every number in the sequence.This is just a simple logical argument---it wouldn't be too hard to write it down explicitly as a natural deduction tree where each step is explicitly inferred from previous one using rules like modus ponens, but I'm not going to bother doing that here.