Machine-independent Evaluation of Theorem-proving Strategies

Author

  • Maria Paola Bonacina
Abstract

The evaluation of theorem-proving strategies has traditionally been done empirically: a strategy is implemented in a theorem prover, the prover is applied to a number of theorems, and the running times are reported and compared with those of other systems. In recent years, a growing effort has been devoted to making the evaluation of theorem provers more systematic. The need for a standard collection of theorem-proving problems (e.g., the TPTP library [9]) and a standard set of empirical measures has been recognized (e.g., [8]). While benchmarking of theorem provers is necessary, and progress in the methodology of empirical evaluation is important for the field, the problem of strategy evaluation remains open. A theorem prover is made of many components in addition to the strategy, including data structures, indexing techniques, and service algorithms such as those for unification or term replacement. The performance of a theorem prover depends on all these components and on the overall engineering of the system. It is very difficult to establish quantitatively how different features contribute to the observed performance. Therefore, empirical evaluation is evaluation of theorem-proving systems, not of theorem-proving strategies. Evaluating strategies independently of implementation requires the development of a theory of "strategy analysis," comparable to algorithm analysis, and with potentially similar beneficial consequences, not only for theorem proving, but also for logic programming and all applications of deduction. The idea of "strategy analysis" is new. Most of the work on search in artificial intelligence concentrates on the design of heuristics (e.g., [5]).
Most of the research in complexity related to theorem proving studies the complexity of propositional proofs as part of the quest for NP ≠ co-NP (e.g., see [10] for a survey), or works with complexity measures based on the Herbrand theorem to determine lower bounds for sets of clauses, not upper bounds for strategies (e.g., [2, 4, 7]). In resolution theorem proving, the classical source for the modelling of search is [3], which was not concerned with evaluating the complexity of the strategies. The primary objective of strategy analysis is to study the complexity of searching for a proof. An approach to this problem was proposed in [6]. It applies classical techniques from algorithm analysis to derive worst-case upper bounds ...

Similar resources

BliStr: The Blind Strategymaker

BliStr is a system that automatically develops strategies for E prover on a large set of problems. The main idea is to interleave (i) iterated low-timelimit local search for new strategies on small sets of similar easy problems with (ii) higher-timelimit evaluation of the new strategies on all problems. The accumulated results of the global higher-timelimit runs are used to define and evolve t...

Proving Theorems about Java and the JVM with ACL2

We describe a methodology for proving theorems mechanically about Java methods. The theorem prover used is the ACL2 system, an industrial-strength version of the Boyer-Moore theorem prover. An operational semantics for a substantial subset of the Java Virtual Machine (JVM) has been defined in ACL2. Theorems are proved about Java methods and classes by compiling them with javac and then proving ...

Learning Intelligent Theorem Proving from Large Formal Corpora

The talk will discuss several AI methods used to learn proving of conjectures over large formal mathematical corpora. This includes (i) machine-learning methods that learn from previous proofs how to suggest the most relevant lemmas for proving the next conjectures, (ii) methods that guide low-level proof-search algorithms based on previous proof traces, and (iii) methods that automatically inv...

A short introduction to two approaches in formal verification of security protocols: model checking and theorem proving

In this paper, we briefly review two formal approaches to the verification of security protocols: model checking and theorem proving. Model checking is based on studying the behavior of protocols via generating all different behaviors of a protocol and checking whether the desired goals are satisfied in all instances or not. We investigate Scyther operational semantics as an example of this...

HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving

Large computer-understandable proofs consist of millions of intermediate logical steps. The vast majority of such steps originate from manually selected and manually guided heuristics applied to intermediate goals. So far, machine learning has generally not been used to filter or generate these steps. In this paper, we introduce a new dataset based on Higher-Order Logic (HOL) proofs, for the pu...

Publication date: 1997