Search results for: persian parallel corpus

Number of results: 300662

2012
Kavosh Asadi Atui Heshaam Faili Kaveh Assadi Atuie

This paper presents a novel method to extract the collocations of the Persian language using a parallel corpus. The method is applicable whenever a parallel corpus exists between the target language and any high-resource language. Without the need for an accurate parser for the target side, it aims to parse the sentences to capture long-distance collocations and to generate more precise results. A traini...

Journal: International Journal of Information Science and Management
Mohammad Bagher Dastgheib (Ph.D. candidate, Department of Computer Science and Engineering, Shiraz University, Shiraz, Iran); Seyed Mostafa Fakhrahmad (Department of Computer Science and Engineering, Shiraz University, Shiraz, Iran); Mansour Zolghadri Jahromi (Department of Computer Science and Engineering, Shiraz University, Shiraz, Iran)

A bilingual corpus is considered a very important knowledge source and an inevitable requirement for many natural language processing (NLP) applications in which two languages are involved. For some languages, such as Persian, the lack of such resources is much more significant. Several applications, including statistical and example-based machine translation, need bilingual corpora, in which lar...

2006
Mohammad Bahrani Hossein Sameti Nazila Hafezi H. Movassagh

In this paper, building statistical language models for the Persian language using a corpus and incorporating them in a Persian continuous speech recognition (CSR) system are described. We used the Persian Text Corpus for building the language models. First, we preprocessed the texts of the corpus by correcting the different orthography of words. Also, the number of POS tags was decreased by clustering POS tag...

2010
Mohammad Taher Pilevar Heshaam Faili

In this paper, an attempt to develop a phrase-based statistical machine translation between English and Persian languages (PersianSMT) is described. Creation of the largest English-Persian parallel corpus yet presented by the use of movie subtitles is a part of this work. Two major goals are followed here: the first one is to show the main problems observed in the output of the PersianSMT syste...

2011
Morteza Okhovvat Behrouz Minaei-Bidgoli

One of the important tasks in language processing is part-of-speech tagging. Despite this importance, and although numerous models have been presented for different languages, few works have addressed the Persian language. In this paper, a part-of-speech tagging system for a Persian corpus using a hidden Markov model is proposed. To achieve this goal, the main aspects of Pers...

Thesis: 1375

The significance of the study of deixis was then mentioned. The purpose of the present study from the outset was to provide a comprehensive overview of all kinds of deixis in Persian, describing and defining each in turn while considering them structurally and semantically. Chapter two consisted of two main parts. A review of the English studies in this respect, besides presenting Persian liter...

2015
Khadijeh Khoshnavataher Vahid Zarrabi Salar Mohtaj Habibollah Asghari

The task of text alignment corpus construction at the PAN 2015 competition consists of preparing a plagiarism corpus so that it can provide various obfuscation types and versatile obfuscation degrees. Meanwhile, its format and metadata structure should follow previous PAN plagiarism corpora. In this paper, we describe our approach for the construction of a monolingual Persian plagiarism corpus that can...

2010
Mahdi Mohseni Behrouz Minaei-Bidgoli

This paper describes a method based on morphological analysis of words for a Persian Part-Of-Speech (POS) tagging system. This is a main part of a process for expanding a large Persian corpus called Peykare (or Textual Corpus of Persian Language). Peykare is arranged into two parts: annotated and unannotated. We use the annotated part in order to create an automatic morphological analyze...

2015
Francisco M. Rangel Pardo Fabio Celli Paolo Rosso Martin Potthast Benno Stein Walter Daelemans

In this paper we describe and evaluate the corpora submitted to the PAN 2015 shared task on plagiarism detection for text alignment. We received mono- and cross-language corpora in the following languages (pairs): English, Persian, Chinese, and Urdu-English, English-Persian. We present an independent section for each submitted corpus including statistics, discussion of the obfuscation techniques ...

2015
Marc Franco-Salvador Imene Bensalem Enrique Flores Parth Gupta Paolo Rosso

In this paper we describe and evaluate the corpora submitted to the PAN 2015 shared task on plagiarism detection for text alignment. We received mono- and cross-language corpora in the following languages (pairs): English, Persian, Chinese, and Urdu-English, English-Persian. We present an independent section for each submitted corpus including statistics, discussion of the obfuscation techniques ...
