This webpage supports the "Machine-Learning Coalgebraic Proofs" (ML-CAP) research grant, funded by EPSRC in 2012-2014.

The project was funded by the EPSRC "First Grant" scheme between 1 March 2012 and 28 February 2014. It has now officially ended, and below is the project page as it evolved during the two years of the grant. This page is no longer updated. However, the grant gave rise to three longer-term research trends, each of which now has its own webpage and is very much alive, healthy, and growing:

  1. Coalgebraic Logic Programming continues its development; its live webpage (with updated papers and software) is here.
  2. The ML4PG tool for data-mining Coq and SSReflect proofs is still being developed and has become part of the standard Proof General distribution; its own live webpage can be found here.
  3. A spin-off of the above, ACL2(ml), is currently under development and industrial dissemination; it too has its own webpage.
    The ML-CAP project was about finding statistically significant patterns in computer proofs, and using these patterns to aid proof automation. Coalgebraic Logic Programming was the first setting in which the techniques were developed, but later we switched to working with Coq.

      The work split into two lines of research, marked 1*. and 2*. below.

      1*. Machine-learning for higher-order interactive proofs

      For the ML4PG web page, see here.

      Together with Jonathan Heras, we are working on integrating statistical machine learning into Proof General, a generic interface for interactive theorem provers. We call the result ML4PG: Machine Learning for Proof General. The tool provides users with various proof hints during proof development. ML4PG now has its own web page, where you can find the software, related documentation, and installation instructions. Follow this link.
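      The statistical core of such a hint mechanism can be illustrated with a small sketch (this is an illustration, not ML4PG's actual code or feature set): completed proofs are summarised as fixed-length numeric feature vectors, and when a user is developing a new proof, the statistically closest completed proofs are suggested as hints. The lemma names and feature values below are hypothetical.

```python
import math

# Hypothetical proof features: each completed proof is summarised as a
# fixed-length numeric vector (e.g. number of goals, number of tactics,
# depth of the proof tree, auxiliary lemmas used).  These names and
# values are illustrative assumptions, not ML4PG's real extraction.
completed_proofs = {
    "nat_add_comm":   [3, 5, 2, 1],
    "nat_mul_comm":   [3, 6, 2, 1],
    "list_app_nil":   [1, 2, 1, 0],
    "stream_eq_refl": [4, 8, 3, 2],
}

def distance(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def suggest_hints(current_features, k=2):
    """Return the k completed proofs statistically closest to the
    proof currently being developed."""
    ranked = sorted(completed_proofs,
                    key=lambda name: distance(completed_proofs[name],
                                              current_features))
    return ranked[:k]

# A proof in progress whose features resemble the arithmetic lemmas:
print(suggest_hints([3, 5, 2, 1]))  # -> ['nat_add_comm', 'nat_mul_comm']
```

      Real proof-pattern tools replace the nearest-neighbour lookup above with proper clustering over many features, but the principle is the same: similarity in feature space stands in for similarity of proof strategy.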

      2*. Extensions and implementation of Coalgebraic Logic Programming (CoALP):

      CoALP is a dialect of first-order Horn-clause logic that features lazy guarded corecursion and parallelism. CoALP arose from the coalgebraic semantics for logic programming that we developed with John Power between 2008 and 2012.
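      The flavour of lazy guarded corecursion can be conveyed by a small functional sketch (Python generators stand in for CoALP's lazy derivations; this is not CoALP syntax). A coinductive clause such as `zeros(scons(0, S)) :- zeros(S)` has an infinite answer, so an eager derivation would loop forever; a guarded, lazy unfolding emits one constructor per step, making the derivation productive and any finite prefix of the answer observable.

```python
import itertools

# The coinductive program  zeros(scons(0, S)) :- zeros(S)  has the
# infinite answer S = scons(0, scons(0, ...)).  A lazy, guarded
# unfolding produces one constructor per resolution step, so the
# derivation is productive even though it never terminates.
def zeros():
    """Lazily unfold the infinite stream of zeros."""
    while True:
        yield 0  # the 'guard': exactly one constructor per step

def take(n, stream):
    """Observe a finite prefix of an infinite stream."""
    return list(itertools.islice(stream, n))

print(take(5, zeros()))  # -> [0, 0, 0, 0, 0]
```

      Guardedness is what makes the laziness sound: because every corecursive call sits under a constructor, each observation is guaranteed to terminate after finitely many unfolding steps.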

      In 2012, we were joined by Martin Schmidt, who helped to implement CoALP, first in Prolog and then in the parallel language Go. CoALP now has its own mini-page, from which you can view all related publications and slides, and download the prototypes.

      For the CoALP page, see here.
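      The parallelism CoALP exploits can also be sketched briefly (an illustration only, using Python threads rather than Go goroutines): in a clause such as `p(X) :- q(X), r(X).`, once `X` is instantiated the two subgoals share no unbound variables, so they can be resolved independently and in parallel. The predicates below are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical subgoal solvers for the clause  p(X) :- q(X), r(X).
# After X is instantiated, q(X) and r(X) are independent subgoals.
def solve_q(x):
    return x % 2 == 0   # illustrative: q holds for even numbers

def solve_r(x):
    return x > 0        # illustrative: r holds for positive numbers

def solve_p(x):
    """Resolve the independent subgoals in parallel, then conjoin."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        q_future = pool.submit(solve_q, x)
        r_future = pool.submit(solve_r, x)
        return q_future.result() and r_future.result()

print(solve_p(4), solve_p(3))  # -> True False
```

      This is and-parallelism in miniature; the guardedness of CoALP derivations is what keeps such parallel resolution well-behaved on coinductive programs.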

      Results:

      1. J.Heras and E.Komendantskaya, Recycling Proof Patterns in Coq: Case Studies. To be published in Mathematics in Computer Science, 2014.
      2. E.Komendantskaya, M.Schmidt and J.Heras. Exploiting Parallelism in Coalgebraic Logic Programming. Accepted to ENTCS, 2014. 26 pages. CoALP's mini-page.
      3. E.Komendantskaya, J.Power and M.Schmidt. Coalgebraic Logic Programming: from Semantics to Implementation. 2014, 40 pages. Accepted for publication at Journal of Logic and Computation. CoALP's mini-page.
      4. J.Heras, E.Komendantskaya, M.Johansson, and E.Maclean. Proof-Pattern Recognition and Lemma Discovery in ACL2. LPAR'13, Stellenbosch, South Africa, 15th-19th December 2013. LNCS 8312, pp. 389-406, 2013.
      5. E.Komendantskaya, J.Heras and G.Grov. Machine-Learning for Proof General: interfacing interfaces. Post-proceedings of User Interfaces for Theorem Provers (UITP 2012), EPTCS, vol. 118, pp. 15-41, 2013. ML4PG has its own page: here.
      6. J.Heras and E.Komendantskaya. ML4PG in Computer Algebra Verification. Systems & Projects session of CICM 2013. Lecture Notes in Computer Science, vol. 7961, pp. 354-358, 2013. Arxiv version.
      7. J.Heras and E.Komendantskaya. Statistical Proof Pattern Recognition: Automated or Interactive? (slides), the 20th Automated Reasoning Workshop (ARW'13), 11-12 April 2013.
      8. E.Komendantskaya, J.Power and M.Schmidt. Coalgebraic Logic Programming: implicit versus explicit resource handling. Workshop on Coinductive Logic Programming at ICLP'12, Budapest, September 2012. Slides are available here.
      9. E.Komendantskaya and K.Lichota. Neural Networks for Proof-Pattern Recognition. International Conference on Artificial Neural Networks, ICANN'12, September 2012, Lausanne, Switzerland. Lecture Notes in Computer Science, vol. 7553, pp. 427-434, 2012. The supporting documentation and data sets are available below.
      10. G.Grov, E.Komendantskaya and A.Bundy. A Statistical Relational Learning Challenge - extracting proof strategies from exemplar proofs. ICML'12 workshop on Statistical Relational Learning, Edinburgh, 30 July 2012.
      11. E.Komendantskaya. Machine-Learning Coalgebraic Proofs. Post-proceedings of the International Conference on Inductive Logic Programming'11, Windsor, UK. Imperial College Tech. Report.

      Agenda and motivation:

      1. J.Heras and E.Komendantskaya. Statistical Proof Pattern Recognition: Automated or Interactive? (slides), the 20th Automated Reasoning Workshop (ARW'13), 11-12 April 2013.
      2. K.Komendantskaya and J.Heras. ML4PG: Machine Learning for Proof General: interfacing interfaces. A talk given at the Universities of Strathclyde (30 November 2012) and Edinburgh (4 December 2012).
      3. G.Grov, K.Komendantskaya and A.Bundy. "A statistical relational learning challenge: extracting proof strategies from exemplar proofs". Statistical Relational Learning-2012, University of Edinburgh, workshop at ICML'12.
      4. K.Komendantskaya. "Machine-Learning for the Working Logician". Dagstuhl Workshop "AI meets Formal Software Development", July 2012. Slides.
      5. K.Komendantskaya. "Machine-Learning Coalgebraic Proofs". Inductive Logic Programming-2011, Imperial College, London.

      Earlier results on ML-CAP proof-learning techniques:

      1. E.Komendantskaya and K.Lichota. Neural Networks for Proof-Pattern Recognition (pre-print). International Conference on Artificial Neural Networks, ICANN'12, September 2012, Lausanne, Switzerland.
      2. Software: Package for automated proof-feature extraction, by K.Lichota and K.Komendantskaya.
      3. Extended technical report about machine-learning coalgebraic trees "Automated Proof Pattern Recognition".
      4. Data sets supporting pattern-recognition and classification experiments, as described in the Technical report "Automated Proof Pattern Recognition":
          ------------------------------
        1. Figure 13 of the draft report.
        2. Classification Problem 1. ListNat. Data set. Target.

          Classification Problem 1. Stream. Data set. Target.

          Classification Problem 2.1. ListNat. Data set. Target.

          Classification Problem 2.1. ListNat - Extended data set. Data set. Target.

          Classification Problem 2.1. Stream. Data set. Target.

          Classification Problem 2.2. Stream. Data set. Target.

          Classification Problem 3. ListNat. Data set. Target.

          Classification Problem 3. ListNat - Extended data set. Data set. Target.

          Classification Problem 4. Stream. Data set. Target.

          Classification Problem 5. ListNat. Data set. Target.

          Classification Problem 5. Stream. Data set. Target.

          ------------------------------

        3. Figures 55-56 of the Draft Report.
        4. Classification Problem 1. Initial Training on ListNat, switching to Stream.

          1. Initial accuracy test. Data set. Target.
          2. First adaptation attempt - Test 1. Training set. Target.
          3. Retraining and second adaptation attempt - Test 2.

            New training set Stream and its target.

            Test on a new Stream data set and its target.

          4. Retraining and third adaptation attempt - Test 3.
          5. New training set and its target.

            Test on a new Stream data set and its target.

            Another test on a new Stream data set and its target.

          6. Mixed database training: data set of examples and its target.

          Classification Problem 1. Initial Training on Stream, switching to ListNat.

          1. Initial accuracy test. Data set. Target.
          2. First adaptation attempt - Test 1. Training set. Target.
          3. Retraining and second adaptation attempt - Test 2.

            New training set Stream and its target.

            Test on a new Stream data set and its target.

          4. Retraining and third adaptation attempt - Test 3.
          5. New training set and its target.

            Test on a new Stream data set and its target.

          6. Mixed database training: data set of examples and its target.

          Classification Problem 5. Initial Training on ListNat, switching to Stream.

          1. Initial accuracy test. Data set. Target.
          2. First adaptation attempt - Test 1. Training set. Target.
          3. Retraining and second adaptation attempt - Test 2.

            New training set Stream and its target.

            Test on a new Stream data set and its target.

          4. Retraining and third adaptation attempt - Test 3.
          5. New training set and its target.

            Test on a new Stream data set and its target.

          6. Mixed database training: data set of examples and its target.

          Classification Problem 5. Initial Training on Stream, switching to ListNat.

          1. Initial accuracy test. Data set. Target.
          2. First adaptation attempt - Test 1. Training set. Target.
          3. Retraining and second adaptation attempt - Test 2.

            New training set ListNat and its target.

            Test on a new ListNat data set and its target.

          4. Retraining and third adaptation attempt - Test 3.
          5. New training set and its target.

            Test on a new ListNat data set and its target.

          6. Mixed database training: data set of examples and its target.

          ---------------------------------------
        5. Figure 59 of the Draft Report.
          1. Classification Problem 1. Merged matrices. Training Data Set and Target
          2. Classification Problem 5. Merged matrices. Training Data Set and Target
          3. Classification Problem 1. Listream. Training Data Set and Target
          4. Classification Problem 5. Listream. Training Data Set and Target
          5. Classification Problem 1. Merged matrices training and testing on Listream. Training Data Set and Target
          6. Classification Problem 5. Merged matrices training and testing on Listream. Training Data Set and Target