Search results for: document image
Number of results: 522080
This paper presents a new model-based document image segmentation scheme that uses XML-DTDs (eXtensible Mark-up Language-Document Type Definition). Given a document image, the algorithm has the ability to select the appropriate model. A new wavelet-based tool has been designed for distinguishing text from non-text regions and characterizing font sizes. Our model-based analysis scheme makes...
Optical Character Recognition (OCR) document. WARNING! Spelling errors might subsist. In order to access to the original document in image form, click on "Original" button on 1st page...
Document image classification is an important step in document image analysis. Based on classification results we can tackle other tasks such as indexing, understanding, or navigation in document collections. Using a document representation and an unsupervised classification method, we can group documents that, from the user's point of view, constitute valid clusters. The semantic gap between a do...
The digitization of documents and their availability over the network demand solutions for content-based document image analysis, indexing, searching, and retrieval. The signature, logo, and layout of a document present convincing evidence and provide an important form of indexing for effective document image retrieval in a variety of applications. This paper describes methods and techniques de...
Character groundtruth for real, scanned document images is extremely useful for evaluating the performance of OCR systems, training OCR algorithms, and validating document degradation models. Unfortunately, manual collection of accurate groundtruth for characters in a real (scanned) document image is not possible because (i) accuracy in delineating groundtruth character bounding boxes is not hi...
The article proposes a new method for automatic data annotation for solving the problem of document image segmentation using deep object detection neural networks. The format of marked PDF files is considered as the initial markup. The peculiarity of this format is that it includes hidden marks that describe the logical and physical structure of the document. To extract them, a tool has been developed that simulates the operation of a stack-based printing m...
This paper describes the Document Image Understanding Toolbox currently under development at the University of Washington's Intelligent Systems Laboratory. The Toolbox provides a common data structure and a variety of document image analysis and understanding algorithms from which Toolbox users can construct document image processing systems. Algorithms for font attribute recognition based on...