Search results for: statistical language model
Number of results: 2,689,345
In this paper, we propose a new language model based on dependent word sequences organized in a multi-level hierarchy. We call this model MC n, where n is the maximum number of words in a sequence and the second parameter is the maximum number of levels. The originality of this model lies in its capacity to take dependent variable-length sequences into account for very large vocabularies. In order to discover the variab...
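The abstract above is truncated before the sequence-discovery procedure is described. As a purely illustrative sketch of what "discovering variable-length word sequences" could look like, the snippet below counts all word sequences up to a maximum length and keeps the frequent ones; the function name, thresholds, and toy corpus are assumptions, not the paper's MC n algorithm.

```python
# Hypothetical sketch: discover frequent variable-length word sequences
# (up to max_len words) from a corpus by frequency counting.
from collections import Counter

def discover_sequences(sentences, max_len=4, min_count=2):
    """Count every word sequence of length 1..max_len and keep frequent ones."""
    counts = Counter()
    for sent in sentences:
        words = sent.split()
        for length in range(1, max_len + 1):
            for i in range(len(words) - length + 1):
                counts[tuple(words[i:i + length])] += 1
    return {seq: c for seq, c in counts.items() if c >= min_count}

corpus = [
    "the language model assigns a probability",
    "the language model is trained on text",
    "a statistical language model assigns a probability to text",
]
for seq, c in sorted(discover_sequences(corpus).items(), key=lambda x: -x[1]):
    print(" ".join(seq), c)
```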
Reordering is a challenge for machine translation (MT) systems. In MT, the widely used approach is to apply a word-based language model (LM), which treats the constituent units of a sentence as words. In speech recognition (SR), some phrase-based LMs have been proposed. However, those LMs are not necessarily suitable or optimal for reordering. We propose two phrase-based LMs which consider the c...
A neural probabilistic language model (NPLM) offers a way to achieve better perplexity than an n-gram language model and its smoothed variants. This paper investigates its application in bilingual NLP, specifically Statistical Machine Translation (SMT). We focus on the perspective that an NPLM has the potential to complement potentially ‘huge’ monolingual resour...
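For context, a Bengio-style NPLM predicts the next word by embedding the context words, passing them through a hidden layer, and taking a softmax over the vocabulary. The sketch below shows only that forward pass with random, untrained weights; the toy vocabulary and dimensions are assumptions for illustration.

```python
# Minimal NPLM forward pass: embeddings -> tanh hidden layer -> softmax.
# Weights are random here; a real model would be trained with SGD.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["<s>", "the", "language", "model", "is", "trained", "</s>"]
V, d, h, context = len(vocab), 16, 32, 2        # vocab, embedding, hidden, n-1

C = rng.normal(0, 0.1, (V, d))                  # word embedding matrix
H = rng.normal(0, 0.1, (context * d, h))        # input-to-hidden weights
U = rng.normal(0, 0.1, (h, V))                  # hidden-to-output weights

def next_word_probs(context_ids):
    x = np.concatenate([C[i] for i in context_ids])  # concatenated embeddings
    a = np.tanh(x @ H)                               # hidden activations
    logits = a @ U
    e = np.exp(logits - logits.max())                # numerically stable softmax
    return e / e.sum()

w2i = {w: i for i, w in enumerate(vocab)}
p = next_word_probs([w2i["<s>"], w2i["the"]])
print("P(next = 'language') =", p[w2i["language"]])
```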
This paper describes a thesaurus-based class n-gram model for broadcast news transcription. The most important issue for class n-gram models is how to develop the word classification. We construct a word classification mapping based on a thesaurus so as to maximize the average mutual information function on a training corpus. To examine the effectiveness of the new method, we compare i...
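In a class n-gram model, the word bigram probability is factored through word classes: P(w_i | w_{i-1}) ≈ P(w_i | c(w_i)) · P(c(w_i) | c(w_{i-1})). The sketch below illustrates that factorization only; the hand-made word-to-class map and toy corpus are assumptions, whereas the paper derives the classification from a thesaurus by maximizing average mutual information.

```python
# Class bigram probability: emission P(w|c) times class transition P(c|c_prev).
from collections import Counter

word2class = {"monday": "DAY", "tuesday": "DAY", "rain": "WEATHER",
              "snow": "WEATHER", "on": "FUNC", "expect": "FUNC"}

corpus = "expect rain on monday expect snow on tuesday".split()

class_uni = Counter(word2class[w] for w in corpus)
class_bi = Counter((word2class[a], word2class[b]) for a, b in zip(corpus, corpus[1:]))
word_counts = Counter(corpus)

def p_class_bigram(prev_word, word):
    c_prev, c = word2class[prev_word], word2class[word]
    p_w_given_c = word_counts[word] / class_uni[c]               # emission
    prev_total = sum(n for (a, _), n in class_bi.items() if a == c_prev)
    p_c_given_cprev = class_bi[(c_prev, c)] / prev_total         # transition
    return p_w_given_c * p_c_given_cprev

print(p_class_bigram("on", "monday"))   # P(monday | on) under the class model
```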
A novel variation of the modified Kneser-Ney model using monomial discounting is presented and integrated into the Moses statistical machine translation toolkit. The language model is trained on a large training set as usual, but its new discount parameters are tuned on the small development set. An in-domain and cross-domain evaluation of the language model is performed based on perplexity, in whi...
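As a baseline point of reference, the sketch below shows a standard interpolated, absolutely discounted bigram in the Kneser-Ney family, where a single discount D plays the role of the tunable parameters the abstract says are fit on a development set. The exact monomial discounting form is specific to the paper and is not reproduced here; the toy corpus and fixed D are assumptions.

```python
# Interpolated Kneser-Ney-style bigram with a single absolute discount D.
from collections import Counter

train = "the cat sat on the mat the cat ate".split()
bigrams = Counter(zip(train, train[1:]))
unigrams = Counter(train)

# Continuation counts: in how many distinct contexts does each word appear?
continuation = Counter(w for (_, w) in bigrams)
total_bigram_types = len(bigrams)

def p_kn(prev, word, D=0.75):
    c_bi = bigrams[(prev, word)]
    c_prev = unigrams[prev]
    # Back-off weight: probability mass freed by discounting all bigrams
    # that start with `prev`, redistributed via the continuation distribution.
    lam = D * sum(1 for (a, _) in bigrams if a == prev) / c_prev
    p_cont = continuation[word] / total_bigram_types
    return max(c_bi - D, 0) / c_prev + lam * p_cont

print(p_kn("the", "cat"))
```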
This paper presents a novel method to segment and decode DNA sequences based on an n-gram statistical language model. First, by analyzing the genomes of 12 model species, we find that most DNA “words” are 12 to 15 bp long. The language entropy of DNA sequences is bounded at about 1.5674 bits. After building an n-gram biological language model, we design an unsupervised ‘probability approach...
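The abstract is cut off before the segmentation procedure is given. One common way to segment a string under a language model is dynamic programming over all splits, choosing the segmentation with the highest probability; the sketch below illustrates that idea with a toy inventory of DNA "words" and made-up probabilities, which are assumptions rather than the paper's learned model.

```python
# Viterbi-style DP segmentation: best[i] = best log-prob of segmenting seq[:i].
import math

word_logprob = {                       # hypothetical DNA "words"
    "ATG": math.log(0.20), "GATTACA": math.log(0.05),
    "CG": math.log(0.10), "TTAG": math.log(0.08),
    "A": math.log(0.01), "T": math.log(0.01),
    "G": math.log(0.01), "C": math.log(0.01),
}
max_word_len = max(len(w) for w in word_logprob)

def segment(seq):
    best = [0.0] + [float("-inf")] * len(seq)
    back = [0] * (len(seq) + 1)
    for i in range(1, len(seq) + 1):
        for j in range(max(0, i - max_word_len), i):
            w = seq[j:i]
            if w in word_logprob and best[j] + word_logprob[w] > best[i]:
                best[i], back[i] = best[j] + word_logprob[w], j
    words, i = [], len(seq)
    while i > 0:
        words.append(seq[back[i]:i])
        i = back[i]
    return words[::-1]

print(segment("ATGGATTACACGTTAG"))   # -> ['ATG', 'GATTACA', 'CG', 'TTAG']
```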
The language model (LM) is a critical component in most statistical machine translation (SMT) systems, serving to establish a probability distribution over the hypothesis space. Most SMT systems use a static LM, independent of the source language input. While previous work has shown that adapting LMs based on the input improves SMT performance, none of the techniques has thus far been shown to ...
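One simple form of input-dependent LM adaptation is to interpolate a static background model with a small model estimated from text relevant to the current source sentence. The sketch below shows that interpolation with toy unigram models and a fixed weight; all of it is illustrative and is not claimed to be the technique the paper proposes.

```python
# Interpolate a static background LM with an input-adapted component.
from collections import Counter

def unigram_model(tokens):
    counts = Counter(tokens)
    total = sum(counts.values())
    return lambda w: counts[w] / total if total else 0.0

background = unigram_model("the bank approved the loan yesterday".split())
input_adapted = unigram_model("the river bank was flooded".split())

def p_adapted(word, lam=0.7):
    # lam weights the static LM; (1 - lam) weights the input-adapted component
    return lam * background(word) + (1 - lam) * input_adapted(word)

print(p_adapted("bank"), p_adapted("river"))
```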
Culture is an inseparable part of a language. In other words, mastering a language and being able to communicate through it inevitably entails integrating with the culture of its speakers, which is a reflection of people's identity. The aim of the present study was to design a model of Iranian cultural identity. Initially, to select a homogeneous sample of learners at the adva...
We consider phrase-based language models (LMs), which generalize the commonly used word-level models. A similar concept of phrase-based LMs appears in speech recognition, but it is rather specialized and thus less suitable for machine translation (MT). In contrast to the dependency LM, we first introduce exhaustive phrase-based LMs tailored for MT use. Preliminary experimental results show that...
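To make the "exhaustive" idea concrete, one reading is that a sentence is scored by summing over all segmentations into known phrases, with each phrase treated as a single LM unit. The sketch below does exactly that with toy phrase probabilities; the inventory and values are assumptions, not the paper's model.

```python
# Sum sentence probability over every segmentation into known phrases.
phrase_prob = {
    ("new", "york"): 0.02, ("in",): 0.05, ("i", "live"): 0.01,
    ("i",): 0.04, ("live",): 0.01, ("new",): 0.005, ("york",): 0.003,
}
max_phrase_len = max(len(p) for p in phrase_prob)

def sentence_prob(words, start=0):
    """Sum of products of phrase probabilities over every segmentation."""
    if start == len(words):
        return 1.0
    total = 0.0
    for end in range(start + 1, min(len(words), start + max_phrase_len) + 1):
        phrase = tuple(words[start:end])
        if phrase in phrase_prob:
            total += phrase_prob[phrase] * sentence_prob(words, end)
    return total

print(sentence_prob("i live in new york".split()))
```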