Bangla/English Script Identification Based on Analysis of Connected Component Profiles
Abstract
Script identification is required for a multilingual OCR system. In this paper, we present a novel and efficient technique for Bangla/English script identification, with application to the destination address blocks of Bangladesh envelope images. The proposed approach is based on the analysis of connected component profiles extracted from the destination address block images; it does not rely on the information provided by individual characters and requires no character or line segmentation. Experimental results demonstrate that the proposed technique is capable of identifying Bangla and English scripts in real Bangladesh postal images.
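As a rough illustration of what a "connected component profile" could look like, the sketch below labels 8-connected foreground regions in a small binary image and computes a top profile for each one (the distance from the top of the component's bounding box to its first foreground pixel, column by column). The abstract does not say which profiles or features the authors actually use, so the function names `connected_components` and `top_profile`, the choice of 8-connectivity, and the top profile itself are assumptions made for illustration only.

```python
from collections import deque


def connected_components(img):
    """Return the 8-connected foreground (value 1) regions as lists of (row, col)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(pixels)
    return comps


def top_profile(pixels):
    """Per column of the bounding box: rows between the box top and the first pixel."""
    top = min(y for y, _ in pixels)
    left = min(x for _, x in pixels)
    right = max(x for _, x in pixels)
    profile = [None] * (right - left + 1)
    for y, x in pixels:
        depth, col = y - top, x - left
        if profile[col] is None or depth < profile[col]:
            profile[col] = depth
    return profile


# Toy image: a shape with a flat top stroke on the left, a sloped shape on the right.
img = [
    [1, 1, 1, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
]
comps = connected_components(img)
print([top_profile(p) for p in comps])  # [[0, 0, 0, 0], [2, 0]]
```

A profile that is zero across most of its width indicates a flat upper stroke; whether such a cue separates the two scripts on real address blocks would have to be checked against data, and it is not claimed here to be the paper's decision rule.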
Related resources
Multi-script Off-line Signature Verification: A Two Stage Approach
Signature identification and verification are of great importance in authentication systems. The purpose of this paper is to introduce an experimental contribution in the direction of multi-script off-line signature identification and verification using a novel technique involving off-line English, Hindi (Devnagari) and Bangla (Bengali) signatures. In the first evaluation stage of the proposed ...
An improved offline handwritten character segmentation algorithm for Bangla script
Effective segmentation of offline word images of unconstrained handwritten Bangla script is a challenging problem in Optical Character Recognition (OCR) applications. The presence of a continuous horizontal line called ‘Matra’ is an important feature of this script. However, in unconstrained cursive handwriting, the Matra can be wavy or discontinuous, which makes the problem of segmentation diffic...
Word level Script Identification from Bangla and Devanagri Handwritten Texts mixed with Roman Script
India is a multi-lingual country where Roman script is often used alongside different Indic scripts in a text document. To develop a script specific handwritten Optical Character Recognition (OCR) system, it is therefore necessary to identify the scripts of handwritten text correctly. In this paper, we present a system, which automatically separates the scripts of handwritten words from a docum...
Convolution Based Technique for Indic Script Identification from Handwritten Document Images
Determining the script type of a document image is a complex real-life problem for a multi-script country like India, where 23 official languages (including English) are present and 13 different scripts are used to write them. Including English and Roman, those counts become 23 and 13 respectively. The problem becomes more challenging when handwritten documents are considered. In this paper an app...
Script Identification from Bilingual Gujarati-English Documents
In a multi-lingual country like India, English words are often interspersed within the Indian regional languages in most official papers, school textbooks, and magazines. A bilingual Optical Character Recognition (OCR) system is therefore needed that can recognize these bilingual documents and store them for future use. In this paper the authors present an OCR system developed for the script ...
Publication date: 2006