Search results for: Farsi

Number of results: 1830

Journal: CoRR 2014
Behrang Q. Zadeh Saeed Rahimi Mehdi Safaee Ghalati

Farsi, also known as Persian, is the official language of Iran and Tajikistan and one of the two main languages spoken in Afghanistan. Farsi enjoys a unified Arabic script as its writing system. In this paper we briefly introduce the writing standards of Farsi and highlight problems one would face when analyzing Farsi electronic texts, especially during development of Farsi corpora regarding to...

2015
Peyman Passban Andy Way Qun Liu

Statistical machine translation (SMT) suffers from various problems which are exacerbated where training data is in short supply. In this paper we address the data sparsity problem in the Farsi (Persian) language and introduce a new parallel corpus, TEP++. Compared to previous results the new dataset is more efficient for Farsi SMT engines and yields better output. In our experiments using TEP+...

2008
Chia-Lin Kao Shirin Saleem Rohit Prasad Fred Choi Premkumar Natarajan David Stallard Kriste Krstovski Matin Kamali

Significant advances have been achieved in Speech-to-Speech (S2S) translation systems in recent years. However, rapid configuration of S2S systems for low-resource language pairs and domains remains a challenging problem due to lack of human translated bilingual training data. In this paper, we report on an effort to port our existing English/Iraqi S2S system to the English/Farsi language pair ...

2004
Mortaza Kokabi

Zarnegar (gold writer) is a word processor widely used by publishers of both scholarly journals and books in Iran. Although it is gradually being replaced by Word for Windows, which is much more powerful than Zarnegar, the process seems to be slow and most Iranian publishers still prefer to receive manuscripts in Zarnegar rather than Word. There are many reasons for this preference: Word, though having man...

2009
H. Izakian

Nowadays, OCR systems have several applications and are increasingly employed in daily life. Much research has been done on the identification of Latin, Japanese, and Chinese characters. However, very little investigation has been performed on Farsi/Arabic character recognition. The reason is probably the difficulty and complexity of identifying those characters compared to th...

2004
E. Darrudi M. R. Hejazi F. Oroumchian

The development of Language Engineering (LE) and Information Retrieval (IR) applications requires the availability of sizeable, reliable and representative corpora. This paper describes how we have constructed a well-structured 345 MB tagged corpus of news, and presents some useful statistics of this corpus based upon the characteristics of the Farsi language. It also goes into particular detail on...

2006
Behrang Q. Zadeh Saeed Rahimi

Farsi, also known as Persian, is the official language of Iran and Tajikistan and one of the two main languages spoken in Afghanistan. It is an Indo-European agglutinating language, written in Arabic script. This paper presents the first step in creating a Farsi basic language resources kit. This step comprises the specifications for morphosyntactic encoding, which is based on the EAGLES/MULTEXT mod...

2011
Zahra Bahmani Reza Azmi

A retrieval method for Farsi/Arabic documents that requires no explicit recognition is proposed in this paper. The system can be used on mixed Farsi/Arabic and English text. The method consists of preprocessing, word and sub_word extraction, detection and cancellation of sub_letter connectors, annotation of sub_letters by shape coding, classification of sub_letters using a decision tree, and use of RBF neural ne...

2013
Yaghoub POURASAD Houshang HASSIBI Azam GHORBANI

In this paper, a word spotting approach for Farsi printed document images has been presented. The main idea of the paper is the font recognition of Farsi document images and query word modification according to the document image’s font before searching. This operation increases the similarity between the query word image and its instances in the document image; therefore, the performance of th...

Journal: Pattern Recognition 1981
Behrooz Parhami M. Taraghi

The automatic recognition of printed Farsi (Persian) texts is complicated by several properties of the Farsi script: (a) connectivity of symbols, (b) similarity of groups of symbols, (c) highly variable widths, (d) subword overlap, and (e) line overlap. In this paper, a technique for the automatic recognition of printed Farsi texts is presented and its steps are discussed as follows: (1) digi...
