On Segmentation Methods for Handwritten Arabic Documents

Authors

  • Fadoua BOUAFIF SAMOUD
  • Samia SNOUSSI MADDOURI
  • Noureddine ELLOUZE
Abstract

In the literature, two methods are most commonly used to extract zones from a document: the first is based on Mathematical Morphology (MM), and the second on the Hough Transform (HT). The main contribution of this paper is the application of these methods to extract the handwritten components of complex documents. The second contribution is a hybrid method that combines the HT and MM. The third contribution is the use of the three resulting methods to automatically extract the handwritten components of CENPARMI bank checks: the numerical amount, the literal amount, and the date zone. We also present a scheme for automatic evaluation of the results, based on a labeling tool for the different parts of the documents used. With the hybrid HT-MM method, we achieve correct extraction rates of 98.27% for the numerical amount, 91.82% for the literal amount, and 99.63% for the date.
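As a rough illustration of the morphology-based idea mentioned in the abstract, the sketch below dilates a binary image with a horizontal line structuring element so that nearby strokes merge, then reports the bounding box of each connected region as a candidate zone. This is not the authors' implementation: the function names (`dilate_horizontal`, `component_boxes`, `extract_zones`), the choice of structuring element, and the `radius` parameter are all assumptions made for the example.

```python
from collections import deque


def dilate_horizontal(img, radius):
    """Binary dilation with a horizontal line element of width 2*radius + 1."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                # Spread each foreground pixel to its horizontal neighbours.
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    out[y][xx] = 1
    return out


def component_boxes(img):
    """Bounding boxes (top, left, bottom, right) of 4-connected regions."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue = deque([(y, x)])
                top, left, bottom, right = y, x, y, x
                while queue:
                    cy, cx = queue.popleft()
                    top, bottom = min(top, cy), max(bottom, cy)
                    left, right = min(left, cx), max(right, cx)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes


def extract_zones(img, radius=1):
    """Hypothetical helper: merge nearby strokes, then box each merged region."""
    return component_boxes(dilate_horizontal(img, radius))


# Toy binary "page": 1 = ink, 0 = background (illustrative data only).
page = [
    [1, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0],
]
zones = extract_zones(page, radius=1)
```

On the toy page, the five isolated ink pixels merge into three regions once dilated. In practice the structuring element size would need tuning to the stroke spacing of the documents, and the boxes describe the dilated regions rather than the original ink.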


Similar resources

Connected Component Based Word Spotting on Persian Handwritten image documents

Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...


Segmentation-Based And Segmentation-Free Methods for Spotting Handwritten Arabic Words

Given a set of handwritten documents, a common goal is to search for a relevant subset. Attempting to find a query word or image in such a set of documents is called word spotting. Spotting handwritten words in documents written in the Latin alphabet, and more recently in Arabic, has received considerable attention. One issue is generating candidate word regions on a page. Attempting to definit...


Holistic Approach for Classifying and Retrieving Personal Arabic Handwritten Documents

This paper presents a novel holistic technique for classifying and retrieving Arabic handwritten text documents. The retrieval of Arabic handwritten documents is performed in several steps. First, the Arabic handwritten document images are segmented into words, and then each word is segmented into its connected parts. Second, several features are extracted from these connected parts and then co...


Component-based Segmentation of Words from Handwritten Arabic Text

Efficient preprocessing is essential for automatic recognition of handwritten documents. In this paper, techniques for segmenting words in handwritten Arabic text are presented. First, connected components (CCs) are extracted, and the distances among different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for words se...


Segmentation of Handwritten and Printed Arabic Documents

In this paper, we propose a new text line segmentation method for handwritten and typewritten Arabic document images that uses the Outer Isothetic Cover (OIC) algorithm of a digital object. In the first step, we use this method to segment the composed document into text blocks. In the second step, we extract the text lines from each text block. Finally, each text line is segmented into words or into...


Segmentation-free Word Spotting for Handwritten Arabic Documents

In this paper we present an unsupervised segmentation-free method for spotting and searching queries in document images, especially those in handwritten Arabic. Histograms of Oriented Gradients (HOGs) are used as the feature vectors to represent the query and document images. Then, we compress the descriptors with the product quantization method. Finally, a better representati...



Publication date: 2009