How close are computers to automatically constructing mathematical reasoning?

Artificial Intelligence Toolkit defines the shape of the next generation of automatic theorem provers, and with it the relationship between mathematics and machines.







It is said that in 1970 the now deceased mathematician Paul Cohen , the only winner of the Fields Medal for his work on mathematical logic , has made baseless prediction which still continues as a delight or irritate the mathematicians: "someday will replace mathematicians future computers ". Cohen, known for his daring methods in working with set theory, predicted that all mathematics could be automated, including writing proofs.



Proof is a step-by-step logical reasoning that confirms the truth of a hypothesis or mathematical assumption. After a proof appears, a hypothesis becomes a theorem. It both confirms the correctness of the statement, and explains why it is true. But the proof is strange. It is abstract and not tied to material experience. “They are the result of insane contact between a fictional, non-physical world and creatures that emerged as a result of biological evolution,” said cognitive scientist Simon Dedeo of Carnegie Mellon University, who studies mathematical certainty through the analysis of evidence structures. "Evolution did not prepare us for this."



Computers are good for bulk computing, but the proof requires something different. Hypotheses arise from inductive reasoning - a special intuition associated with an interesting problem - and proofs usually follow deductive, step-by-step logic. They often require complex creative thinking as well as the hard work of filling in the voids, and machines can't handle this combination of skills.



Computerized theorem provers fall into two categories. Automated theorem provers (ATPs) usually use brute force methods to grind huge heaps of numbers. Interactive theorem provers (ITP) serve as human assistants, and are able to check the accuracy of arguments, as well as look for errors in existing proofs. However, even if you combine these two strategies (as do more modern provers), an automatic reasoning system will not emerge from them.





Cognitive scientist Simon Dedeo of Carnegie Mellon University helped demonstrate that humans and machines create mathematical proof in a similar way.



In addition, these tools are not widely welcomed and most mathematicians do not use or approve of them. “This is a controversial topic for mathematicians,” Dedeo said. "Most of them don't like the idea."



One of the open hard problems in this area is the question of how much of the proof creation process can be automated. Will the system be able to generate an interesting hypothesis and prove it in a way that is understandable to people? A set of recent breakthroughs made by laboratories around the world offer artificial intelligence (AI) ways to answer this question. Joseph Urban of the Czech Institute for Informatics, Robotics and Cybernetics in Prague, is exploring various approaches that use machine learning to increase the effectiveness of existing provers. In July, his groupshowed a set of original hypotheses and evidence created and validated by machines. In June, a group at Google Research led by Christian Szegedi published the results of attempts to use the strengths of natural language processing systems to make computer evidence more similar in structure and explanation to human ones.



Some mathematicians see theorem provers as tools that can revolutionize the way students learn to write proofs. Others say that getting computers to write proofs for advanced mathematics is unnecessary, if not impossible. However, a system that can predict a useful hypothesis and prove a new theorem can achieve something new - a kind of machine version of understanding, Szegedi said. And this indicates the possibility of automatic reasoning.



Useful machines



Mathematicians, logicians, and philosophers have long debated what part of proof-making is human in nature, and the debate about the mechanization of mathematics continues today - especially where computer science merges with pure mathematics.



For computer scientists, theorem provers are not controversial. They provide a clear way of confirming that the program is working, and arguments about intuition and creativity are less important than finding effective ways to solve problems. For example, Adam Chlipala , a computer scientist at the Massachusetts Institute of Technology, has developed theorem proving tools that generatecryptographic algorithms that protect transactions on the Internet - despite the fact that usually people come up with such algorithms. His group code is already used in most of the communications in the Google Chrome browser.





Emily Riel of Johns Hopkins University uses theorem provers to train students and computer assistants.



“You can take any mathematical statement and code it with one tool, and then combine all the arguments, and get proof of safety,” said Chlipala.



In mathematics, theorem provers helped produce complex and computationally intensive proofs that would otherwise have taken thousands of mathematical man-years. A striking example is Kepler's hypothesisabout the densest packing of balls in three-dimensional space (historically, these were oranges or cannonballs). In 1998, Thomas Hales and his student, Sam Ferguson, completed this proof using a variety of computerized mathematical techniques. The result was so cumbersome - the proof took 3 GB - that 12 mathematicians analyzed it for several years before announcing that they were 99% sure of its truth.



Kepler's hypothesis is not the only famous problem solved by machines. With the four-color theorem, claiming that four colors are always enough to paint any two-dimensional map in which there are no two touching areas of the same color, was figured out in 1977 using a computer program that processed five-color maps, and showed that they can all be turned into four-color. In 2016, three mathematicians used a computer program to prove the long- standing Boolean problem of Pythagorean triplets , but the first version of the proof was 200 TB. If you have a fast enough Internet connection, you can download it in three weeks.



Mixed feelings



Such examples are often touted as successes, but they also add their own flavor to controversy. The computer code that proved the four-color theorem more than 40 years ago could not be verified by humans. "Mathematicians have since debated whether this is proof or not," said mathematician Michael Harris of Columbia University.





Many mathematicians, along with Michael Harris of Columbia University, question the need to create computerized theorem provers - and that the latter would make mathematicians unnecessary.



Another discontent of mathematicians is connected with the fact that if they want to use theorem provers, first they need to learn how to program, and then figure out how to express their problem in a computer-understandable language - and all this distracts from doing mathematics. “By the time I reformulate the question in a way suitable for this technology, I will solve this problem myself,” Harris said.



Many people just don't see the need for theorem solvers. "They have their own system, pencil and paper, and it works," said Kevin Buzzard, a mathematician at Imperial College London, who changed the direction of research three years ago from pure mathematics to theorem provers and formal proofs. “Computers do amazing calculations for us, but they never solved a difficult problem on their own,” he said. "And until that happens, mathematicians won't buy it."



But Buzzard and others think they might still need to take a closer look at technology. For example, “computer evidence may not be as foreign as we think,” Dedeo said. Recently, with Scott Viteri, a computer scientist at Stanford, he reverse-engineered several well-known canonical proofs (including some of BeginningsEuclid) and dozens of proofs generated by a computer program to prove Coq theorems looking for similarities. They found that the branching structure of machine proofs was remarkably similar to that of human-made proofs. This common property, he said, could help researchers find a way make the helper programs explain.



"Machine proofs may not be as cryptic as they seem," Dedeo said.



Others say theorem provers can be useful tools for teaching both computer science and mathematics. At Johns Hopkins University mathematician Emily Rielhas developed courses in which students write proofs using theorem provers. “It makes them very organized and think clearly,” she said. "Students writing a proof for the first time may not immediately understand what is required of them or grasp the logical structure."



Riel also says that he has been using theorem provers more often in his work lately. “They don’t have to be used all the time, and they will never replace scribbles on a piece of paper,” she said, “but using computer assistants for evidence has changed my understanding of how to write evidence.”



Theorem Provers also offer a way to keep mathematics fair. In 1999 Soviet, Russian and American mathematicianVladimir Alexandrovich Voevodsky , discovered an error in one of his proofs. From then until his death in 2017, he actively promoted the use of computers to verify evidence. Hales said that he and Ferguson found hundreds of errors in their original proofs by testing them with computers. Even the very first theorems in Euclid's Elements are not ideal. If a machine can help mathematicians avoid such mistakes, why not take advantage of it? Harris offered a practical objection to this proposal, however, it is not known how reasonable: if mathematicians have to spend time formalizing mathematics for a computer to understand it, they will not be able to spend that time on new mathematics.



However, Timati Gowers, a mathematician and a Cambridge Fields Prize-winning mathematician, wants to go even further: He envisions how theorem provers will replace human reviewers in major journals in the future. "I see how this can become standard practice - if you want your work to be accepted, you have to run it through an automated reviewer."



Conversation with computers



Before computers can test or develop evidence, researchers first need to overcome a significant hurdle: the communication barrier between the languages ​​of humans and computers.



Today's theorem provers have been designed with no regard for mathematician-friendliness. The first type, ATP, was commonly used to test the truth of a statement, often by testing all possible options. Ask the ATP if it is possible to travel from Miami to Seattle, and he will probably go through all the cities to which the roads from Miami lead, and in the end he will find a city that leads to Seattle.





Timati Gowers of Cambridge University believes theorem provers will someday replace human reviewers



Using ATP, the programmer can code all the rules, or axioms, and then ask the question whether a particular hypothesis follows those rules. And then the computer does all the work. “You just enter the hypothesis you want to prove and hope to get an answer,” said Daniel Huang, a computer scientist who recently left UC Berkeley to start a startup.



But there is a problem: ATP does not explain its work. All calculations take place inside the machine, and for a person they look like a long sequence of zeros and ones. Huang said that it was impossible to look at the evidence and verify the reasoning, since it all looks like a bunch of random data. “No one can look at such evidence and say: Everything is clear,” he said.



The second category, the ITP, has huge datasets containing tens of thousands of theorems and proofs with which they can check the accuracy of a proof. Unlike ATPs, which work inside a black box that simply issues responses, ITPs require interaction and sometimes direction from a person, so they are not so unapproachable. “A person can sit down and figure out what techniques are being used to prove,” Huang said. Such evidence was studied by Dedeo and Viteri.



In recent years, ITPs have become increasingly popular. In 2017, the trinity who proved the Boolean problem of Pythagorean triples used an ITP called Coq to create and test a formal version of their proof. In 2005, Georges Gontier of Microsoft Research Cambridge used Coq to formalize the four-color theorem. Hales also used ITPs called HOL Light and Isabelle to formally prove Kepler's conjecture (HOL stands for higher-order logic).



Today, the forefront of this field is trying to combine learning with reasoning. ATP is often combined with ITP to integrate machine learning to improve the performance of both techniques. Experts believe that ATP / ITP programs can use deductive reasoning and even exchange mathematical ideas in the same way as humans do, or at least in a similar way.



Limits of reasoning



Joseph Urban believes that such a combined approach can marry deductive and inductive reasoning, which is necessary to obtain evidence. His group created theorem provers powered by machine learning, allowing computers to learn from experience on their own. Over the past few years, they've been exploring the power of neural networks - layers of computing units that help machines process information in a way roughly similar to how neurons in our brains work. In July, their group reported new hypotheses generated by a neural network trained on theorem proving.



In part, Urban was inspired by the work of Andrei Karpaty, who several years ago trained a neural network to produce mathematical nonsense, which looked convincing to non-professionals. But Urban didn't need the nonsense - he and the group developed their own tool that searches for proofs, having trained on millions of theorems. They used the network to generate new hypotheses and test their validity with an ATP program called E.



The network has issued over 50,000 new formulas, although tens of thousands of them have been repeated. “It seems that we cannot yet prove more interesting hypotheses,” said Urban.



Google Research's Szegedi sees the problem of automatic reasoning in computer evidence as part of a much broader field: natural language processing, which includes recognizing patterns in the use of words and sentences. Pattern recognition is also a core idea of ​​computer vision, which Szegedi previously worked on at Google. Like other groups, his team wants to create theorem provers that can search for useful proofs and explain them.



Inspired by the rapid development of AI tools like AlphaZero - DeepMind's software that can beat humans at chess, go and shogi - the Szegedi group wants to use the latest advances in language recognition to record evidence. He said that language models can demonstrate surprisingly accurate mathematical reasoning.



His group at Google Research recently described a way to use language models - which neural networks often use - to generate new evidence. After training the model to recognize the tree structure of proven theorems, they launched a free experiment, simply asking neural networks to generate and prove theorems without supervision. Of the thousands of hypotheses generated, 13% turned out to be provable and new (not repeating other theorems in the database). He said that such an experiment says that neural networks can learn, in some sense, to understand what the evidence looks like.



“Neural networks are capable of developing an artificial semblance of intuition,” Szegedi said.



Of course, it is still not clear whether these attempts will fulfill Cohen's prophecy 40 years ago. Gowers said he believes computers can outpace mathematicians in reasoning by 2099. At first, he says, mathematicians will enjoy a golden age, “when they do interesting things and computers are boring. But I think it won't last very long. "



After all, if machines continue to develop more and more, and have access to a huge amount of data, they must learn to do very well and interesting things. “They will learn to make their own requests,” Gowers said.



Harris disagrees. He doesn't think computer proofs are necessary, or that they will end up "making human mathematicians unnecessary." If computer scientists, he says, can ever program synthetic intuition, it still won't compete with human intuition. "Even if computers understand, they won't understand in the human sense."



All Articles