Efficient String Matching and Easy Bottom-Up Parsing
Abstract
The paper consists of two parts. In the first, we compare three string matching algorithms: the Dömölki algorithm, also known as the SHIFT-OR algorithm, O(n × m); a table-driven algorithm, O((n + m) × lg m); and Aho-Corasick, O(m + n). The table-driven algorithm passes through the same states as the Dömölki algorithm but encodes them compactly. It can also be regarded as the Aho-Corasick algorithm with ε-transitions eliminated. We argue that the table-driven algorithm is the best solution for matching multiple patterns of reasonable size. The second part of the paper deals with bottom-up syntax analysis. We show that backward deterministic syntax analysis can be implemented by extending a string matching automaton with a stack, or with two stacks if we want to go beyond context-free grammars. Implementing a parser of this type is as easy as writing a recursive descent parser; we need to supply only the transition table, which can be easily derived from the grammar. Finally, we discuss some compiler engineering details.
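To make the first algorithm named in the abstract concrete, here is a minimal sketch of the textbook single-pattern Shift-Or (bit-parallel) matcher. It is not taken from the paper: the function name `shift_or_search`, the use of Python integers as bit vectors, and the restriction to a single pattern are all assumptions made for illustration.

```python
def shift_or_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text.

    Textbook Shift-Or sketch (hypothetical helper, not the paper's code).
    Bit i of `state` is 0 exactly when pattern[0..i] matches the text
    ending at the current position.
    """
    m = len(pattern)
    if m == 0:
        return []
    all_ones = (1 << m) - 1

    # masks[c] has bit i cleared iff pattern[i] == c.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, all_ones) & ~(1 << i)

    accept = 1 << (m - 1)   # bit that signals a full match when it is 0
    state = all_ones        # no prefix matched yet
    hits = []
    for pos, ch in enumerate(text):
        # Shift in a 0 (a new match may start here), then OR in the mask
        # so that only prefixes extended by ch keep their 0 bit.
        state = ((state << 1) | masks.get(ch, all_ones)) & all_ones
        if state & accept == 0:
            hits.append(pos - m + 1)
    return hits
```

Each text character costs one shift and one OR over an m-bit vector, so the work per character grows with the pattern length unless the vector fits in a machine word; this is one reading of the O(n × m) bound quoted in the abstract.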
Similar resources
Bottom-Up Parsing Extending Context-Freeness in a Process Grammar Processor
A new approach to bottom-up parsing that extends Augmented Context-Free Grammar to a Process Grammar is formally presented. A Process Grammar (PG) defines a set of rules suited for bottom-up parsing and conceived as processes that are applied by a PG Processor. The matching phase is a crucial step for process application, and a parsing structure for efficient matching is also presented. The PG...
A Parsing Algorithm for Unification Grammar
We describe a table-driven parser for unification grammar that combines bottom-up construction of phrases with top-down filtering. This algorithm works on a class of grammars called depth-bounded grammars, and it is guaranteed to halt for any input string. Unlike many unification parsers, our algorithm works directly on a unification grammar--it does not require that we divide the grammar into ...
Efficient Retargetable Code Generation Using Bottom-up Tree Pattern Matching
Instruction selection is the primary task in automatic code generation. This paper proposes a practical system for performing optimal instruction selection based on tree pattern matching for expression trees. A significant feature of the system is its ability to perform code generation without requiring cost analysis at code generation time. The target machine instructions are specified as attr...
A Bottom-up Parser where Entire Operation is Conducted in the Letter String Region
This paper treats a natural language parser of the bottom-up type. The distinguishing characteristic of the parser is that the data it handles keep the form of a letter string throughout the entire parsing operation. Letter strings including parentheses express the partial trees generated in the course of parsing. This representation avoids the list expressions usually used to represent trees. Key-Words: parser, b...
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a) probabilities of successive prefixes being generated by the grammar; b) probabilities of substrings being generated by the nonterminals, including the entire string being generated by the grammar; c) most likely (...
Publication date: 2007