mizan english persian parallel corpus

Search results for: mizan english persian parallel corpus

Number of results: 413519

A New English/Arabic Parallel Corpus for Phishing Emails

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing 2023

Phishing involves malicious activity whereby phishers, in the disguise of legitimate entities, obtain illegitimate access to victims’ personal and private information, usually through emails. Currently, phishing attack threats are being handled effectively with the use of the latest email detection solutions. Most current systems assume emails to be in English, though the use of other languages is growing. In particular, Arabic is a wi...

Supporting Large English-Hindi Parallel Corpus using Word Alignment

Journal: International Journal of Computer Applications 2012

Construction of Mizo – English Parallel Corpus for Machine Translation

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing 2023

A parallel corpus is a key component of statistical and Neural Machine Translation (NMT). While most research focuses on machine translation, corpus creation studies are limited for many languages, and no paper on a Mizo–English corpus exists yet. A high-quality parallel corpus is required for Natural Language Processing (NLP) activities including Chatbots, Transliteration, and Cross-Language Information Retrieval. This work aims to inve...

Lexical Cohesion in English and Persian Abstracts

Journal: Iranian Journal of Applied Language Studies 2012

Fatemeh Seddigh Nasrin Shokrpour Reza Kafipour

This study compares and contrasts lexical cohesion in English and Persian abstracts of Iranian medical students’ theses to appreciate textualization processes in the two languages. For this purpose, one hundred English and Persian abstracts were selected randomly and analyzed based on Seddigh and Yarmohamadi’s (1996) lexical cohesion framework, a version of Halliday and Hasan’s (1976) and Halli...

Mizan: Optimizing Graph Mining in Large Parallel Systems

2012

Zuhair Khayyat Karim Awara Hani Jamjoom Panos Kalnis

Extracting information from graphs, from finding shortest paths to complex graph mining, is essential for many applications. Due to the sheer size of modern graphs (e.g., social networks), processing must be done on large parallel computing infrastructures (e.g., the cloud). Earlier approaches relied on the MapReduce framework, which proved inadequate for graph algorithms. More recently, th...

Using Word Alignment to Extend Multilingual Medical Terminologies

2006

Louise Deléger Magnus Merkel Pierre Zweigenbaum

Medical terminologies such as those provided in the UMLS are never exhaustive and there is a constant need to enrich them, especially in terms of multilinguality. We present a methodology to acquire new French translations of English medical terms based on word alignment in a parallel corpus — i.e. pairing of corresponding words. We automatically collected a 27.7-million-word parallel, English-...

A Japanese-English Patent Parallel Corpus

2007

Masao Utiyama Hitoshi Isahara

We describe a Japanese-English patent parallel corpus created from the Japanese and US patent data provided for the NTCIR-6 patent retrieval task. The corpus contains about 2 million sentence pairs that were aligned automatically. This is the largest Japanese-English parallel corpus, which will be available to the public after the 7th NTCIR workshop meeting. We estimated that about 97% of the s...

Three Issues in Cross-Language Frame Information Transfer

2009

Sara Tonelli Emanuele Pianta

In this paper we address the task of transferring FrameNet annotations from an English corpus to an aligned Italian corpus. Experiments were carried out on an English-Italian bitext extracted from the Europarl corpus and on a set of selected sentences from the English FrameNet corpus that have been manually translated into Italian. Our research activity is aimed at answering the following three...

Bilingual Sentence Alignment Based on Punctuation Marks

2003

Kevin C. Yeh

We present a new approach to aligning English and Chinese sentences in parallel corpora based solely on punctuation. Although the length-based approach produces high accuracy rates of sentence alignment for clean parallel corpora written in two Western languages such as French-English and German-English, it does not fare as well for parallel corpora that are noisy or written in two distant lan...

A Contrastive Study of Persian and English Written Discourse: Ellipsis in Realistic Novels

Journal: زبانشناسی کاربردی 2008

Esmail Faghih Sepideh Rahimpour

This study aspires to examine the concept of ellipsis by comparing and contrasting English and Persian written texts. For this purpose, three Persian novels and three English ones were selected. These novels were analyzed carefully; they were compared and contrasted for types and amount of ellipsis used, through a Chi-square analysis. The results of the data analysis revealed that various t...
