Learning Context-Free Grammars with a Simplicity Bias

Authors

  • Pat Langley
  • Sean Stromsten
Abstract

We examine the role of simplicity in directing the induction of context-free grammars from sample sentences. We present a rational reconstruction of Wolff's SNPR, the Grids system, which incorporates a bias toward grammars that minimize description length. The algorithm alternates between merging existing nonterminal symbols and creating new symbols, using a beam search to move from complex to simpler grammars. Experiments suggest that this approach can induce accurate grammars and that it scales reasonably to more difficult domains.
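The merging step mentioned in the abstract can be illustrated with a minimal sketch. It is not the Grids implementation: the grammar encoding, the `size` measure (a crude stand-in for description length that ignores the cost of encoding the sample sentences), and the names `merge`, `best_merge`, and `toy` are all assumptions made for illustration. The search shown is greedy, which corresponds to a beam of width 1, and the symbol-creation step is omitted.

```python
from itertools import combinations

# Assumed encoding: nonterminal -> set of right-hand sides (tuples of symbols).
toy = {
    "S": {("NP1", "VP"), ("NP2", "VP")},
    "NP1": {("the", "dog")},
    "NP2": {("the", "cat")},
    "VP": {("barks",)},
}

def size(grammar):
    """Crude description-length proxy: one unit per head plus one per RHS symbol."""
    return sum(1 + len(rhs) for rules in grammar.values() for rhs in rules)

def merge(grammar, keep, drop):
    """Rename nonterminal `drop` to `keep` everywhere; identical rules collapse."""
    merged = {}
    for head, rules in grammar.items():
        new_head = keep if head == drop else head
        for rhs in rules:
            new_rhs = tuple(keep if s == drop else s for s in rhs)
            merged.setdefault(new_head, set()).add(new_rhs)
    return merged

def best_merge(grammar):
    """Greedy step (beam width 1): the smallest grammar reachable by one merge."""
    pairs = combinations(sorted(grammar), 2)
    candidates = [merge(grammar, a, b) for a, b in pairs]
    return min(candidates, key=size, default=grammar)

simpler = best_merge(toy)
print(size(toy), size(simpler))
```

On this toy grammar the only merge that lowers the proxy is the one that unifies `NP1` and `NP2`, because the two `S` rules then become identical and collapse into one. A wider beam would keep several candidate grammars at each step instead of only the minimum.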

Similar resources

Simplicity and Representation Change in Grammar Induction

In this paper we examine the role of a bias toward simplicity in directing the process of representation change. We focus on the task of inducing context-free grammars from sample sentences, and we present a rational reconstruction of Wolff's SNPR, the Grids system, that incorporates the simplicity bias. The basic induction method alternates between merging existing nonterminal symbols and cre...

Learning context-free grammars to extract relations from text

In this paper we propose a novel relation extraction method, based on grammatical inference. Following a semisupervised learning approach, the text that connects named entities in an annotated corpus is used to infer a context free grammar. The grammar learning algorithm is able to infer grammars from positive examples only, controlling overgeneralisation through minimum description length. Eva...

Solving Trigonometric Identities with Tree Adjunct Grammar Guided Genetic Programming

Genetic programming (GP) may be seen as a machine learning method, which induces a population of computer programs by evolutionary means (Banzhaf et al. 1998). Genetic programming has been used successfully in generating computer programs for solving a number of problems in a wide range of areas. In (Hoai and McKay 2001), we proposed a framework for a grammar-guided genetic programming system c...

Learning restricted probabilistic link grammars

We describe a language model employing a new headed-disjuncts formulation of Lafferty et al.’s (1992) probabilistic link grammar, together with (1) an EM training method for estimating the probabilities, and (2) a procedure for learning some simple lexicalized grammar structures. The model in its simplest form is a generalization of n-gram models, but in its general form possesses context-free exp...

Inducing Tree-Substitution Grammars

Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring that it can be readily learned from data. The majority of existing work on grammar induction has...

Publication date: 2000