Mizan English-Persian Parallel Corpus

Between Comparable and Parallel: English-Czech Corpus from Wikipedia

2016

Adéla Stromajerová, Vít Baisa, Marek Blahus

We describe the process of creating a parallel corpus from Czech and English Wikipedias using methods which are language independent. The corpus consists of Czech and English Wikipedia articles, the Czech ones being translations of the English ones, is aligned on sentence level and is accessible in Sketch Engine corpus manager.

Align Me: A framework to generate Parallel Corpus Using OCRs and Bilingual Dictionaries

2016

Priyam Bakliwal, Devadath V. V, C. V. Jawahar

Multilingual processing tasks like statistical machine translation and cross language information retrieval rely mainly on availability of accurate parallel corpora. Manual construction of such corpus can be extremely expensive and time consuming. In this paper we present a simple yet efficient method to generate huge amount of reasonably accurate parallel corpus with minimal user efforts. We u...

A Contrastive Study of Request Speech Act in English and Persian Novels: Natural Semantic Metalanguage Approach

Journal: International Journal of Foreign Language Teaching and Research 2019

Laya Heidari Darani, Ziba Amini

The Natural Semantic Metalanguage (NSM) Approach claims that there are some universalities in all languages. Speech acts seem to be present in all languages, but considering this approach, research has not indicated whether request speech act differs from one language to another. Thus, this study intended to investigate whether request strategies are used differently in English and Persian roma...

Acquisition of English Determinative Descriptions by Persian EFL Learners

Thesis: Ministry of Science, Research and Technology - Yazd University - Faculty of Foreign Languages, 1392 SH (2013/14)

راضیه عالیشوندی, محمد جواد رضایی, علی اکبر جباری

Abstract: Since Huebner's (1985) pioneering study, there have been many studies on the (mis)use and non-use of articles by L2 learners from article-less and article languages. The present study investigated how Persian L2 learners of English produce and interpret English definite descriptions and demonstrative descriptions. It was assumed that definite and demonstrative descriptions share the same cen...

Extracting Semantic Transfer Rules from Parallel Corpora with SMT Phrase Aligners

2012

Petter Haugereid, Francis Bond

This paper presents two procedures for extracting transfer rules from parallel corpora for use in a rule-based Japanese-English MT system. First a “shallow” method where the parallel corpus is lemmatized before it is aligned by a phrase aligner, and then a “deep” method where the parallel corpus is parsed by deep parsers before the resulting predicates are aligned by phrase aligners. In both pr...

Catalan-English statistical machine translation without a parallel corpus

2006

Adrià de Gispert, José B. Mariño

This paper presents a full experiment on large-vocabulary Catalan-English statistical machine translation without an English-Catalan parallel corpus, in the context of the debates of the European Parliament. For this, we make use of an English-Spanish European Parliament Proceedings parallel corpus and a Spanish-Catalan general newspaper parallel corpus, both of which of more than 30 M words. G...

Catalan-English Statistical Machine Translation without Parallel Corpus: Bridging through Spanish

2006

Adrià de Gispert, José B. Mariño

This paper presents a full experiment on large-vocabulary Catalan-English statistical machine translation without an English-Catalan parallel corpus, in the context of the debates of the European Parliament. For this, we make use of an English-Spanish European Parliament Proceedings parallel corpus and a Spanish-Catalan general newspaper parallel corpus, both of which of more than 30 M words. G...

European Union Language Resources in Sketch Engine

2016

Vít Baisa, Jan Michelfeit, Marek Medved, Milos Jakubícek

Several parallel corpora built from European Union language resources are presented here. They were processed by state-of-the-art tools and made available for researchers in the Sketch Engine corpus management system. A completely new resource is introduced: EUR-Lex corpus, being one of the largest parallel corpus available at the moment, containing 840 million tokens of English and having the ...

Knowledge based Approach for English-Malayalam Parallel Corpus Generation

Journal: Indian Journal of Science and Technology 2016
