Mizan English-Persian parallel corpus

Temporal Relation Classification in Persian and English contexts

2013

Mahbaneh Eshaghzadeh Torbati, Gholamreza Ghassem-Sani, Seyed Abolghasem Mirroshandel, Yadollah Yaghoobzadeh, Negin Karimi Hosseini

This paper introduces the first pattern-based Persian Temporal Relation Classifier (PTRC), which identifies the type of temporal relation between pairs of events in Persian texts. The proposed system uses support vector machines (SVMs) equipped with combinations of simple, convolution tree, and string subsequence kernels (SSK). In order to evaluate the algorithm, we have developed a Persian TimeBan...


English and Persian Sport Newspaper Headlines: A comparative study of linguistic means

Journal: International Journal of Foreign Language Teaching and Research 2016

Mohammad Alipour, Nastaran Monjezi

Rhetorical figures are common in specialized languages such as the language of newspaper headlines. The present study attempted to conduct a contrastive analysis of the English and Persian sport newspaper headlines related to the 2014 FIFA World Cup. Toward this end, a corpus consisting of 400 English and 400 Persian headlines published between 12 June and 13 July 2014 was c...


Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners

Journal: International Journal of Foreign Language Teaching and Research 2020

Ghafour Rezaie Golandoz, Hossein Khazaee, Parviz Birjandi, Parviz Maftoon

Hedges, as tools to express tentativeness and doubt, have been studied in many research papers in the Iranian EFL research setting. However, their use in a learner corpus portraying Iranian learner English needs more research attention. With this end in view, this study aimed at investigating how Iranian EFL learners who have majored in English-related fields in Iran deployed hed...


The effect of morphemes on dependency parsing of Persian

Journal: Signal and Data Processing 2015

Mojtaba Khallash, Behrouz Minaei-Bidgoli

Data-driven systems can be adapted to different languages and domains easily. Applying this trend to dependency parsing led to the introduction of data-driven approaches. The only prerequisite for data-driven approaches is the existence of appropriate corpora that contain sentences and their associated dependency trees. Despite obtaining highly accurate results for the dependency parsing task in English langu...


Culture and translation: the case of English and Persian languages

Journal: Cadernos de Tradução 2022

Translation is not merely a matter of linguistics. The major goal of the present paper is to investigate the relationship between ‘culture’ and ‘translation’. To this end, the researcher drew on a corpus from the English and Persian languages. The findings indicated that although different languages, like Persian, employ different linguistic forms, this variety cannot be considered a real challenge. Since during the process of translation, sou...


ASPEC: Asian Scientific Paper Excerpt Corpus

2016

Toshiaki Nakazawa, Manabu Yaguchi, Kiyotaka Uchimoto, Masao Utiyama, Eiichiro Sumita, Sadao Kurohashi, Hitoshi Isahara

In this paper, we describe the details of ASPEC (Asian Scientific Paper Excerpt Corpus), the first large-scale parallel corpus in the scientific paper domain. ASPEC was constructed in the Japanese-Chinese machine translation project conducted between 2006 and 2010 using the Special Coordination Funds for Promoting Science and Technology. It consists of a Japanese-English scientific pape...


Spanish Language Processing at University of Maryland: Building Infrastructure for Multilingual Applications

2001

Clara Cabezas, Bonnie Dorr, Philip Resnik

We describe here our construction of lexical resources, tool creation, building of an aligned parallel corpus, and an approach to automatic treebank creation that we have been developing using Spanish data, based on projection of English syntactic dependency information across a parallel corpus.


A Probabilistic Translation Method for Dictionary-based Cross-lingual Information Retrieval in Agglutinative Languages

Journal: CoRR 2014

Javid Dadashkarimi, Azadeh Shakery, Heshaam Faili

Translation ambiguity, out-of-vocabulary words, and missing translations in bilingual dictionaries make dictionary-based Cross-language Information Retrieval (CLIR) a challenging task. Moreover, in agglutinative languages that lack reliable stemmers, the absence of various lexical formations from bilingual dictionaries degrades CLIR performance. This paper aims to introduce a probabilistic tr...


Assembling a parallel corpus from RSS news feeds

2005

John Fry

We describe our use of RSS news feeds to quickly assemble a parallel English-Japanese corpus. Our method is simpler than other web mining approaches, and it produces a parallel corpus whose quality, quantity, and rate of growth are stable and predictable.


A Persian Treebank with Stanford Typed Dependencies

2014

Mojgan Seraji, Carina Jahani, Beáta Megyesi, Joakim Nivre

We present the Uppsala Persian Dependency Treebank (UPDT) with a syntactic annotation scheme based on Stanford Typed Dependencies. The treebank consists of 6,000 sentences and 151,671 tokens with an average sentence length of 25 words. The data is from different genres, including newspaper articles and fiction, as well as technical descriptions and texts about culture and art, taken from the op...
