A survey on optical character recognition for Bangla and Devanagari scripts

Authors

  • SOUMEN BAG
  • GAURAV HARIT
Abstract

The past few decades have witnessed intensive research on optical character recognition (OCR) for Roman, Chinese, and Japanese scripts. A lot of work has also been reported on OCR for various Indian scripts, such as Devanagari, Bangla, Oriya, Tamil, Telugu, Malayalam, Kannada, Gurmukhi, and Gujarati. In this paper, we present a review of OCR work on Indian scripts, mainly on Bangla and Devanagari, the two most popular scripts in India. We summarize most of the published papers on this topic and analyse the various methodologies and their reported results. Future directions of research in OCR for Indian scripts are also given.

Similar resources

A Survey of Feature Extraction and Classification Techniques Used In Character Recognition for Indian Scripts

The Constitution of India, under its Eighth Schedule, has recognized Hindi (in Devanagari script) and English as official languages of the Union Government, along with 22 other languages as Scheduled Languages, and has given status and official encouragement to these Scheduled Languages. Most optical recognition research has been done on scripts such as Devanagari, Telugu, and Bangla. Develo...

Zone-based Keyword Spotting in Bangla and Devanagari Documents

In this paper we present a word spotting system in text lines for offline Indic scripts such as Bangla (Bengali) and Devanagari. Recently, it was shown that a zone-wise recognition method improves word recognition performance over conventional full-word recognition systems in Indic scripts [29]. Inspired by this idea, we consider the zone segmentation approach and use middle zone information ...

Handwritten Segmentation in Bangla Script: A Review of Offline Techniques

Offline handwritten segmentation in Bangla is an interesting area of research, as segmentation has long been one of the most critical areas of the optical character recognition process. Through this operation, an image of a sequence of characters, which may be connected in some cases, is decomposed into sub-images of individual alphabetic symbols. In this paper, segmentation of cursive handwritten s...

A Survey on Script Segmentation for Bangla OCR

Script segmentation is an important primary task for any Optical Character Recognition (OCR) software. It is especially important in off-line OCR for printed characters. Through script segmentation, a large image of a written document is fragmented into a number of small pieces, which are then used for pattern matching to determine the expected sequence of characters. In the impl...

Recognition of Isolated Multi-Oriented Handwritten/Printed Characters using a Novel Convex-Hull Based Alignment Technique

Handwritten character recognition is one of the difficult tasks of pattern recognition because of diverse writing styles. The problem becomes more severe if the characters are written in a cursive fashion with varying orientations. A document image may also contain printed characters of different shapes, fonts, and sizes. In the current work, we have presented a novel convex hull based alignme...

Publication date: 2013