The Segmentation and Identification of Handwriting in Noisy Document Images

Authors

  • Yefeng Zheng
  • Huiping Li
  • David S. Doermann
Abstract

In this paper we present an approach to segmenting and identifying handwritten annotations in noisy document images. In many types of documents, such as correspondence, handwritten annotations are commonly added as part of a note, correction, clarification, or instruction, and initials or a signature may appear as an authentication mark. It is important to be able to segment and identify such handwriting so that we can 1) locate, interpret, and retrieve it efficiently in large document databases, and 2) apply different algorithms to printed text, handwritten text, and signatures. Our approach consists of two processes: 1) a segmentation process, which divides the text into regions at an appropriate level (character, word, line, or zone), and 2) a classification process, which identifies the segmented regions that are handwritten. To determine the smallest region size at which classification can be performed reliably, we conducted experiments at the character, word, and zone levels. We found that reliable results can be achieved only at the word level, with a classification accuracy of 97.3%. The identified handwritten text is further grouped into zones, and verification at the zone level is used to reduce false alarms. Experiments show that our approach is promising and robust.
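
The two processes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which features or classifier were used, so `is_handwritten` is a hypothetical callback supplied by the caller, and the function names, the 8-connectivity choice, and the `max_gap` merging rule are all assumptions made for illustration. Zone-level grouping and verification are omitted.

```python
from collections import deque


def connected_components(img):
    """Return bounding boxes (top, left, bottom, right) of 8-connected foreground blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if not img[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def group_into_words(boxes, max_gap=1):
    """Merge boxes whose rows overlap and whose horizontal gap is at most max_gap (assumed rule)."""
    words = []
    for box in sorted(boxes, key=lambda b: b[1]):  # left to right
        for i, word in enumerate(words):
            rows_overlap = box[0] <= word[2] and word[0] <= box[2]
            gap = box[1] - word[3] - 1  # blank columns between word and box
            if rows_overlap and gap <= max_gap:
                words[i] = (min(word[0], box[0]), min(word[1], box[1]),
                            max(word[2], box[2]), max(word[3], box[3]))
                break
        else:
            words.append(box)
    return words


def identify_handwriting(img, is_handwritten, max_gap=1):
    """Segment img into word-level boxes, then keep those the classifier flags."""
    words = group_into_words(connected_components(img), max_gap)
    return [box for box in words if is_handwritten(img, box)]


# Toy stand-in for a real classifier: flags boxes at least 3 pixels tall.
def toy_rule(img, box):
    return box[2] - box[0] + 1 >= 3


demo = [
    [1, 0, 1, 0, 0, 0, 0, 1],
    [1, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 1],
]
result = identify_handwriting(demo, toy_rule)
```

In this toy image the two short strokes on the left merge into one word-level box and the tall stroke on the right forms another, so only the tall box passes `toy_rule`. A real system would replace `toy_rule` with a trained classifier operating on features extracted from each word region.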

Similar resources

Text Identification in Noisy Document Images Using Markov Random Field

In this paper we address the problem of the identification of text from noisy documents. We segment and identify handwriting from machine printed text because 1) handwriting in a document often indicates corrections, additions or other supplemental information that should be treated differently from the main or body content, and 2) the segmentation and recognition techniques for machine printed...

Document Analysis And Classification Based On Passing Window

In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction, and document classification. A document image is enhanced in the preprocessing stage by removing noise, binarizing, and detecting and correcting image skew. In document segmentation, an algorith...

Minimizing Loss of Information at Competitive PLIP Algorithms for Image Segmentation with Noisy Back Ground

In this paper, two training systems for selecting PLIP parameters have been demonstrated. The first compares the MSE of a high precision result to that of a lower precision approximation in order to minimize loss of information. The second uses EMEE scores to maximize visual appeal and further reduce information loss. It was shown that, in the general case of basic addition, subtraction, or mul...

Persian Printed Document Analysis and Page Segmentation

This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution page segmentation, a pyramidal image structure is constructed for multiscale analysis, and the document image is segmented into a set of regions. In the high-resolution page segmentation, each region is segmented into homogeneous regions by connected component analysis and identifyi...

Plant Classification in Images of Natural Scenes Using Segmentations Fusion

This paper presents a novel approach to automatically classifying and identifying tree leaves using image segmentation fusion. With the development of mobile devices and remote access, automatic plant identification in images taken in natural scenes has received much attention. Image segmentation plays a key role in most plant identification methods, especially in images with complex backgrounds. Wher...

Publication date: 2002