An empirical study of smoothing techniques for language modeling

Related resources

An Empirical Study of Smoothing Techniques for Language Modeling

We present an extensive empirical comparison of several smoothing techniques in the domain of language modeling, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate for the first time how factors such as training data size, corpus (e.g., Brown versus Wall Street Journal), and n-gram order (bigram versus trigram) affect the relative pe...

An Empirical Study of Smoothing Techniques for Language Modeling

We present an extensive empirical comparison of several smoothing techniques in the domain of language modeling, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate for the first time how factors such as training data size, corpus (e.g., Brown versus Wall Street Journal), and n-gram order (bigram versus trigram) affect the relative perfo...

Smoothing Techniques for Tree-k-Grammar-Based Natural Language Modeling

In previous work, a new probabilistic context-free grammar (PCFG) model for natural language parsing, derived from a treebank corpus, was introduced. The model estimates the probabilities according to a generalized k-grammar scheme for trees. It allows for faster parsing, considerably decreases the perplexity of the test samples, and tends to give more structured and refined parses. Howeve...

An Empirical Study of Predictive Modeling Techniques of Software Quality

The primary goal of software quality engineering is to apply various techniques and processes to produce a high-quality software product. One strategy is to apply data mining techniques to software metrics and defect data collected during the software development process in order to identify potentially low-quality program modules. In this paper, we investigate the use of feature selection in the conte...

Long Distance Dependency in Language Modeling: An Empirical Study

This paper presents an extensive empirical study of two language modeling techniques, linguistically motivated word skipping and predictive clustering, both of which are used to capture long-distance word dependencies that are beyond the scope of a word trigram model. We compare the techniques to others that were proposed previously for the same purpose. We evaluate the resulting models on th...

Journal

Journal title: Computer Speech & Language

Year: 1999

ISSN: 0885-2308

DOI: 10.1006/csla.1999.0128