Line and Ligature Segmentation of Urdu Nastaleeq Text



Similar resources

Line and Ligature Segmentation in Printed Urdu Document Images

This paper presents a technique for segmentation of printed Urdu text images into lines and ligatures, a key pre-processing step in Urdu Optical Character Recognition (OCR) systems. Unlike classical projection profile based line segmentation methods, the proposed scheme successfully segments overlapping and touching lines. Once the lines are segmented, ligatures are extracted from each text lin...


Arabic & Urdu Text Segmentation Challenges & Techniques

Text segmentation is a critical step in an OCR system for any language, because OCR accuracy depends on correctly segmented characters. Segmentation divides text images into their constituent parts (i.e. lines, components or words, and individual characters). As Urdu and Arabic are highly cursive and context sensitive in nature and have improper space between words therefore,...


Segmentation-free optical character recognition for printed Urdu text

This paper presents a segmentation-free optical character recognition system for printed Urdu Nastaliq font using ligatures as units of recognition. The proposed technique relies on statistical features and employs Hidden Markov Models for classification. A total of 1525 unique high-frequency Urdu ligatures from the standard Urdu Printed Text Images (UPTI) database are considered in our study. ...


Font Size Independent OCR for Noori Nastaleeq

This paper presents a technique for font-size-independent OCR of Noori Nastaleeq. Most existing OCRs for Noori Nastaleeq support only a single font size. Urdu government documents, newspapers, magazines and books written in the Noori Nastaleeq font style have a varying range of font sizes. The technique presented in this paper gives support for font size independence for Noori Nastaleeq O...


Urdu Word Segmentation

Word segmentation is the foremost obligatory task in almost all NLP applications, where the initial phase requires tokenization of the input into words. Urdu is among the Asian languages that face a word segmentation challenge. However, unlike other Asian languages, word segmentation in Urdu has not only space omission errors but also space insertion errors. This paper discusses how orthographic...



Journal

Journal title: IEEE Access

Year: 2017

ISSN: 2169-3536

DOI: 10.1109/access.2017.2703155