Document Image Decoding Using Markov Source Models

نویسندگان

  • Gary E. Kopec
  • Philip A. Chou
چکیده

This paper describes a communication theory approach to document image recognition, patterned after the use of hidden Markov models in speech recognition. In general, a document recognition problem is viewed as consisting of three elements— an image generator, a noisy channel and an image decoder. A document image generator is a Markov source (stochastic finite-state automaton) that combines a message source with an imager. The message source produces a string of symbols, or text, that contains the information to be transmitted. The imager is modeled as a finite-state transducer that converts the one-dimensional message string into an ideal two-dimensional bitmap. The channel transforms the ideal image into a noisy observed image. The decoder estimates the message, given the observed image, by finding the a posteriori most probable path through the combined source and channel models using a Viterbi-like dynamic programming algorithm. The proposed approach is illustrated on the problem of decoding scanned telephone yellow pages to extract names and numbers from the listings. A finite-state model for yellow page columns was constructed and used to decode a database of scanned column images containing about 1100 individual listings. Overall, 99.5% of the listings were correctly recognized, with character classification rates of 98% and 99.6%, respectively, for the names and numbers. Keywords—document recognition, text recognition, image decoding, stochastic grammars, Markov sources

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Adding linguistic constraints to document image decoding: comparing the iterated complete path and stack algorithms

Beginning with an observed document image and a model of how the image has been degraded, Document Image Decoding recognizes printed text by attempting to find a most probable path through a hypothesized Markov source. The incorporation of linguistic constraints, which are expressed by a sequential predictive probabilistic language model, can improve recognition accuracy significantly in the ca...

متن کامل

Stochastic attribute grammar model of document production and its use in document image decoding

Document Image Decoding (DID) refers to the process of document recognition within a communication theory framework. In this framework, a logical document structure is a message communicated by encoding the structure as an ideal image, transmitting the ideal image through a noisy channel, and decoding the degraded image into a logical structure as close to the original message as possible, on a...

متن کامل

Markov source model for printed music decoding

This paper describes a Markov source model for a simple subset of printed music notation. The model is based on the Adobe Sonata music symbol set and a message language of our own design. Chord imaging is the most complex part of the model. Much of the complexity follows from a rule of music typography that requires the noteheads for adjacent pitches to be placed on opposite sides of the chord ...

متن کامل

IMAGE SEGMENTATION USING GAUSSIAN MIXTURE MODEL

  Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models have key role in probabilistic data analysis. In this paper, we have learned Gaussian mixture model to the pixels of an image. The parameters of the model have estimated by EM-algorithm.   In addition pixel labeling corresponded to each pixel of true image is made by Bayes rule. In fact, ...

متن کامل

­­Image Segmentation using Gaussian Mixture Model

Abstract: Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models have key role in probabilistic data analysis. In this paper, we used Gaussian mixture model to the pixels of an image. The parameters of the model were estimated by EM-algorithm.   In addition pixel labeling corresponded to each pixel of true image was made by Bayes rule. In fact,...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • IEEE Trans. Pattern Anal. Mach. Intell.

دوره 16  شماره 

صفحات  -

تاریخ انتشار 1994