Morphologically rich Urdu grammar parsing using Earley algorithm

Author

  • Qaiser Abbas
Abstract

This work presents the development and evaluation of an extended Urdu parser. It discusses issues that arise with this parser and describes the changes made to the Earley algorithm to obtain accurate and relevant results. The parser uses a morphologically rich context-free grammar extracted from a linguistically rich Urdu treebank. With sufficient encoded information, this grammar is comparable with state-of-the-art parsing requirements for the morphologically rich Urdu language. Together, the extended parsing model and the linguistically rich extracted grammar yield better evaluation results in the Urdu/Hindi parsing domain. The parser achieves an F-score of 87%, which outperforms existing treebank-based parsing work on Urdu/Hindi.

Similar resources

Exploiting Language Variants Via Grammar Parsing Having Morphologically Rich Information

In this paper, the development and evaluation of the Urdu parser are presented, along with a comparison of existing resources for the language variants Urdu/Hindi. This parser was given a linguistically rich grammar extracted from a treebank. This context-free grammar with sufficient encoded information is comparable with the state-of-the-art parsing requirements for morphologically rich and cl...

Building Computational Resources: The URDU.KON-TB Treebank and the Urdu Parser

This work presents the development of the URDU.KON-TB treebank, its annotation evaluation and guidelines, and the construction of the Urdu parser for Urdu, a South Asian language. Urdu is a comparatively under-resourced language, and the development of a reliable treebank and parser will have a significant impact on the state of the art in automatic Urdu language processing. The work includes the ...

Best parse parsing with Earley's and Inside algorithms on probabilistic RTN

Inside parsing is a best parse parsing method based on the Inside algorithm that is often used in estimating probabilistic parameters of stochastic context free grammars. It gives a best parse in O(AfG) time where N is the input size and G is the grammar size. Earley algorithm can be made to return best parses with the same complexity in N. By way of experiments, we show that Inside parsing can...

An Earley Parsing Algorithm for Range Concatenation Grammars

We present a CYK and an Earley-style algorithm for parsing Range Concatenation Grammar (RCG), using the deductive parsing framework. The characteristic property of the Earley parser is that we use a technique of range boundary constraint propagation to compute the yields of non-terminals as late as possible. Experiments show that, compared to previous approaches, the constraint propagation help...

Practical Earley Parsing

Earley’s parsing algorithm is a general algorithm, able to handle any context-free grammar. As with most parsing algorithms, however, the presence of grammar rules having empty right-hand sides complicates matters. By analyzing why Earley’s algorithm struggles with these grammar rules, we have devised a simple solution to the problem. Our empty-rule solution leads to a new type of finite automa...

Journal:
  • Natural Language Engineering

Volume 22

Publication date: 2016