Search results for: document image

Number of results: 522080

2016
Navya Prakash

Information extraction is the task of extracting structured data from a degraded document. It includes extracting data such as text, images, or graphics from sources such as images, videos, or documents. Text detection and extraction from degraded documents finds application in a wide range of studies. In this paper, an Optical Character Recognition less (OCR-less) method of obtaining an origi...

2017
Mao Tan Siping Yuan Yongxin Su

The rapid increase in digitized documents has given rise to a high demand for document image retrieval. While conventional document image retrieval approaches depend on complex OCR-based text recognition and text similarity detection, this paper proposes a new content-based approach, in which more attention is paid to feature extraction and fusion. In the proposed approach, multiple features of document i...

Journal: :CoRR 2015
Abdeslam El Harraj Naoufal Raissouni

Digital camera and mobile document image acquisition are new trends arising in the world of Optical Character Recognition and text detection. In some cases, such a process introduces many distortions and produces poorly scanned text or text-photo images and natural images, leading to unreliable OCR digitization. In this paper, we present a novel nonparametric and unsupervised method to compens...

1996
Tapas Kanungo Robert M. Haralick

Character groundtruth for scanned document images is crucial for evaluating the performance of OCR systems, training OCR algorithms, and validating document degradation models. Unfortunately, manual collection of accurate groundtruth for characters in a real (scanned) document image is not possible because (i) accuracy in delineating groundtruth character bounding boxes is not high enough, (ii)...

1995
Debashish Niyogi Sargur N. Srihari

The analysis of a document image to derive a symbolic description of its structure and contents involves using spatial domain knowledge to classify the different printed blocks (e.g., text paragraphs), group them into logical units (e.g., newspaper stories), and determine the reading order of the text blocks within each unit. These steps describe the conversion of the physical structure of a do...

1990
Rangachar Kasturi Senthil Siva Lawrence O'Gorman

An overview is presented of algorithms and techniques for document image analysis with an emphasis on those for graphics recognition and interpretation. The techniques are derived from the fields of image processing, pattern recognition, and machine vision. The objective in document image analysis is to recognize page contents including layout, text, and figures. Although optical character reco...

2014
Manoj Kumar Shukla Haider Banka S. N. Srihari C. Y. Suen R. Legault C. Nadal M. Cheriet

The working of any optical character recognition (OCR) system depends largely on the printing and paper of the input document image. A number of OCR techniques are available and claim accurate recognition of printed document images in Indian and foreign scripts. A few reports have been found on the recognition of degraded Indian-language documents. The degradation in any scanned printed ...

2001
Yue Lu Chew Lim Tan Weihua Huang Liying Fan

An approach to word image matching based on weighted Hausdorff distance (WHD) is proposed in this paper to facilitate the detection and location of user-specified words in document images. Preprocessing such as eliminating the space between adjacent characters in the word images and scale normalization is first done before the WHD is utilized to measure the distance between the template ...

2014
M. Tamilselvi

Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intra variation between the document background and the foreground text of different document images. In this paper, we propose a novel document image binarization technique that add...

Journal: :CoRR 2017
Ram Krishna Pandey A. G. Ramakrishnan

Recognition of document images has important applications in restoring old and classical texts. The problem involves quality improvement before passing the image to a properly trained OCR to get accurate recognition of the text. Image enhancement and quality improvement constitute important steps, as subsequent recognition depends upon the quality of the input image. There are scenarios when high ...
