Search results for: document image

Number of results: 522080

2001
Gaurav Harit Santanu Chaudhury P. Gupta N. Vohra Shiv Dutt Joshi

This paper presents a new model-based document image segmentation scheme that uses XML-DTDs (eXtensible Mark-up Language-Document Type Definition). Given a document image, the algorithm can select the appropriate model. A new wavelet-based tool has been designed for distinguishing text from non-text regions and for characterizing font sizes. Our model based analysis scheme makes...

1994
Felix Hull

Optical Character Recognition (OCR) document. WARNING! Spelling errors might subsist. In order to access to the original document in image form, click on "Original" button on 1st page.

2005
Eugen Barbu Pierre Héroux Sébastien Adam Éric Trupin

Document image classification is an important step in document image analysis. Based on classification results, we can tackle other tasks such as indexing, understanding, or navigation in document collections. Using a document representation and an unsupervised classification method, we can group documents that, from the user's point of view, constitute valid clusters. The semantic gap between a do...

2015
Umesh D. Dixit M. S. Shirdhonkar

The digitization of documents and their availability over the network demand solutions for content-based document image analysis, indexing, searching and retrieval. The signature, logo and layout of a document present convincing evidence and provide an important form of indexing for effective document image retrieval in a variety of applications. This paper describes methods and techniques de...

1998
Tapas Kanungo Robert M. Haralick

Character groundtruth for real, scanned document images is extremely useful for evaluating the performance of OCR systems, training OCR algorithms, and validating document degradation models. Unfortunately, manual collection of accurate groundtruth for characters in a real (scanned) document image is not possible because (i) accuracy in delineating groundtruth character bounding boxes is not hi...

Journal: Trudy Instituta sistemnogo programmirovaniâ 2022

The article proposes a new method for automatic data annotation for solving the problem of document image segmentation using deep object detection neural networks. The format of marked PDF files is considered as the initial markup. The peculiarity of this format is that it includes hidden marks that describe the logical and physical structure of the document. To extract them, a tool has been developed that simulates the operation of a stack-based printing m...

1999
Richard Rogers Jisheng Liang Robert M. Haralick

This paper describes the Document Image Understanding Toolbox currently under development at the University of Washington’s Intelligent Systems Laboratory. The Toolbox provides a common data structure and a variety of document image analysis and understanding algorithms from which Toolbox users can construct document image processing systems. Algorithms for font attribute recognition based on...
