Search results for: statistical language model
Number of results: 2689345
An automatic word-classification system has been designed that uses word unigram and bigram frequency statistics to implement a binary top-down form of word clustering and employs an average class mutual information metric. Words are represented as structural tags: n-bit numbers whose most significant bit patterns incorporate class information. The classification system has revealed some...
A neural probabilistic language model (NPLM) offers a way to achieve better perplexity than n-gram language models and their smoothed variants. This paper investigates an application area in bilingual NLP, specifically Statistical Machine Translation (SMT). We focus on the perspective that an NPLM has the potential to complement potentially ‘huge’ monolingual resour...
In this paper we study the word-reordering problem in the decoding part of statistical machine translation, but independently of the target-language generation process. In this model, a permuted sentence is given and the goal is to recover the correct order. We introduce a greedy algorithm called Local-(k, l)-Step, and show that it performs better than the DP-based algorithm. Our word-reorder...
Locating bugs is challenging but is one of the most important activities in the software development and maintenance phases, because no fixed rules identify all types of bugs. Existing automatic bug localization tools use various heuristics based on test coverage, pre-determined buggy patterns, or textual similarity with the bug report to rank suspicious program elements. However, since thes...
With the development of computer technology and the appearance of huge training text corpora, the performance of language models has improved considerably in recent years. But their intrinsic sparse data problem still exists. This paper investigates several smoothing methods applied to Chinese continuous speech recognition. We compare the performance of different methods, particularly in the situation...
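The sparse data problem mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which smoothing methods the paper compares, so add-k (Laplace) smoothing over a bigram model is used here purely as an assumed example; the function names `train_bigram_counts` and `add_k_prob` and the toy corpus are hypothetical.

```python
from collections import Counter


def train_bigram_counts(sentences):
    """Count bigram contexts and bigrams over tokenized sentences.

    Each sentence is padded with <s> and </s> boundary markers.
    """
    contexts = Counter()  # how often each token appears as a history
    bigrams = Counter()   # how often each (history, word) pair appears
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        contexts.update(padded[:-1])
        bigrams.update(zip(padded[:-1], padded[1:]))
    # Words that can be predicted: every token plus the end marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return contexts, bigrams, vocab


def add_k_prob(prev, word, contexts, bigrams, vocab_size, k=1.0):
    """Add-k smoothed estimate of P(word | prev).

    Unseen bigrams get a small non-zero probability instead of zero.
    """
    return (bigrams[(prev, word)] + k) / (contexts[prev] + k * vocab_size)


# Toy corpus (illustrative data only, not from the paper).
corpus = [["a", "b"], ["a", "c"]]
contexts, bigrams, vocab = train_bigram_counts(corpus)
p_seen = add_k_prob("a", "b", contexts, bigrams, len(vocab))
p_unseen = add_k_prob("a", "a", contexts, bigrams, len(vocab))
```

With an unsmoothed maximum-likelihood estimate, the unseen bigram ("a", "a") would receive probability zero; add-k smoothing moves some probability mass to it while the distribution over the vocabulary still sums to one for each history.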
The language model is an essential part of statistical machine translation, but traditional n-gram language models can capture only a limited local context in the translated sentence and thus lack the global information needed for prediction. This paper describes a novel topic-triggered language model, which takes into account the topical context by estimating the n-gram probability under the given topics...
This paper presents a newly formalized probabilistic LR language model. Our model inherits its essential features from Briscoe and Carroll's generalized probabilistic LR (PLR) model [3], which obtains context-sensitivity by assigning a probability to each LR parsing action according to its left and right context. However, our model is simpler while maintaining a higher degree of context-sensiti...
The paper describes a novel approach to Multi-Engine Machine Translation (MEMT). We build statistical models of translation performance and use them to guide the combination and selection of outputs from multiple MT engines. We empirically demonstrate that the MEMT system based on these models outperforms any of its component engines.
In this paper we are concerned with the practical issues of working with data sets common to finance, statistics, and other related fields. pandas is a new library which aims to facilitate working with these data sets and to provide a set of fundamental building blocks for implementing statistical models. We will discuss specific design issues encountered in the course of developing pandas with...
Machine transliteration is an automatic method for translating source language words into phonetically equivalent target language ones. Many previous methods were devoted to transliterating words by tracing only the phonological phenomena of the source language, and the results showed good performance. However, many names originate not only from the source language but also from non-sour...