Search results for: keywords page segmentation

Number of results: 2095324

Journal: Neurocomputing 2015
Francisco Alvaro, Francisco Cruz Fernandez, Joan-Andreu Sánchez, Oriol Ramos Terrades, José-Miguel Benedí

In this paper we define a bidimensional extension of Stochastic Context-Free Grammars for structure detection and segmentation of images of documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the document segmentation is obtained as the most likely hypothesis according to a stochastic grammar. We used a dataset of his...

Journal: IJDLS 2011
Sekhar Mandal, Amit Kumar Das, Partha Bhowmick, Bhabatosh Chanda

This paper presents a unified algorithm for segmentation and identification of various tabular structures from document page images. Such tabular structures include conventional tables and displayed mathzones, as well as Table of

1997
Judith Hochberg, Michael Cannon, Patrick Kelly, James White

This paper explores the use of script identification vectors in the analysis of multilingual document images. A script identification vector is calculated for each connected component in a document. The vector expresses the closest distance between the component and templates developed for each of thirteen scripts, including Arabic, Chinese, Cyrillic, and Roman. We calculate the first three pri...
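As a rough illustration of the vector described in this abstract, the following sketch computes, for one connected component, the closest distance to the templates of each script. The Hamming distance on small binary bitmaps, the `templates` dictionary, the script names, and the function names are all assumptions made for illustration; the abstract does not state the distance measure or the template format.

```python
from typing import Dict, List, Sequence

Bitmap = Sequence[Sequence[int]]  # rows of 0/1 pixels, all the same size


def hamming_distance(a: Bitmap, b: Bitmap) -> int:
    """Count pixels that differ between two equally sized binary bitmaps."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def script_identification_vector(
    component: Bitmap, templates: Dict[str, List[Bitmap]]
) -> Dict[str, int]:
    """Map each script name to the closest distance between the component
    and any template of that script (hypothetical template format)."""
    return {
        script: min(hamming_distance(component, t) for t in script_templates)
        for script, script_templates in templates.items()
    }


# Toy 2x2 templates for two hypothetical scripts (assumed data, not from the paper).
templates = {
    "script_a": [[[1, 0], [0, 1]], [[1, 1], [0, 0]]],
    "script_b": [[[0, 0], [0, 0]]],
}
component = [[1, 0], [0, 1]]
vector = script_identification_vector(component, templates)
print(vector)  # {'script_a': 0, 'script_b': 2}
```

The vector has one entry per script, so a component that matches a template exactly gets a distance of 0 for that script.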

2015
Robert Kreuzer, Jurriaan Hage, A. J. Feelders

This paper explores the effectiveness of different semantic web page segmentation algorithms on modern websites. We compare three known algorithms each serving as an example of a particular approach to the problem, and one self-developed algorithm, WebTerrain, that combines two of the approaches. With our testing framework we have compared the performance of four algorithms for a large benchmar...

2006
Stéphane Nicolas, Thierry Paquet, Laurent Heutte

We consider in this paper the problem of complex handwritten page segmentation such as novelist drafts or authorial manuscripts. We propose to use stochastic and contextual models in order to cope with local spatial variability, and to take into account some prior knowledge about the global structure of the document image. The models we propose to use are Markov Random Field models. Using this ...

2006
Faisal Shafait, Daniel Keysers, Thomas M. Breuel

This paper presents a quantitative comparison of six algorithms for page segmentation: X-Y cut, smearing, whitespace analysis, constrained text-line finding, Docstrum, and Voronoi-diagram-based. The evaluation is performed using a subset of the UW-III collection commonly used for evaluation, with a separate training set for parameter optimization. We compare the results using both default param...
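For readers unfamiliar with the first of the six methods named in this abstract, here is a minimal sketch of a recursive X-Y cut on a small binary image. It is a simplified textbook-style version, not the implementation evaluated in the paper; the `min_gap` parameter, the list-of-lists image format, and the function names are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, bottom, left, right), end-exclusive


def _trim(image: List[List[int]], box: Box) -> Optional[Box]:
    """Shrink a box to the bounding box of its ink pixels (None if empty)."""
    top, bottom, left, right = box
    rows = [r for r in range(top, bottom) if any(image[r][left:right])]
    cols = [c for c in range(left, right) if any(image[r][c] for r in range(top, bottom))]
    if not rows:
        return None
    return (rows[0], rows[-1] + 1, cols[0], cols[-1] + 1)


def _widest_gap(profile: List[int], min_gap: int) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the widest run of zeros in a projection profile
    that is followed by ink, or None if no run is at least min_gap long."""
    best = None
    run_start = None
    for i, value in enumerate(profile):
        if value == 0:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            width = i - run_start
            if width >= min_gap and (best is None or width > best[1] - best[0]):
                best = (run_start, i)
            run_start = None
    return best


def xy_cut(image: List[List[int]], box: Optional[Box] = None, min_gap: int = 1) -> List[Box]:
    """Recursively split the page at the widest empty row or column band."""
    if box is None:
        box = (0, len(image), 0, len(image[0]))
    box = _trim(image, box)
    if box is None:
        return []
    top, bottom, left, right = box
    row_profile = [sum(image[r][left:right]) for r in range(top, bottom)]
    col_profile = [sum(image[r][c] for r in range(top, bottom)) for c in range(left, right)]
    row_gap = _widest_gap(row_profile, min_gap)
    col_gap = _widest_gap(col_profile, min_gap)
    if row_gap is None and col_gap is None:
        return [box]  # no whitespace band left: this is a leaf block
    row_width = row_gap[1] - row_gap[0] if row_gap else 0
    col_width = col_gap[1] - col_gap[0] if col_gap else 0
    if row_width >= col_width:  # horizontal cut (preferred on ties)
        start, end = row_gap
        first = (top, top + start, left, right)
        second = (top + end, bottom, left, right)
    else:  # vertical cut
        start, end = col_gap
        first = (top, bottom, left, left + start)
        second = (top, bottom, left + end, right)
    return xy_cut(image, first, min_gap) + xy_cut(image, second, min_gap)


# Toy page: two blocks side by side above one full-width line (1 = ink).
image = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
blocks = xy_cut(image)
print(blocks)  # [(0, 2, 0, 2), (0, 2, 3, 4), (3, 4, 0, 4)]
```

Raising `min_gap` in this sketch makes the cut ignore narrow whitespace bands, which is the kind of parameter a comparison like the one above would need to tune.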

2011
Aya Ishino, Hidetsugu Nanba, Toshiyuki Takezawa

Content-targeted advertising systems are becoming an increasingly important part of the funding for free web services. These programs automatically find relevant keywords on a web page, and then display ads based on those keywords. We propose a method for providing links to ads for travel products (which we call ad links) automatically. We extract keywords from citing areas of travel informatio...

2003
Stefano Baldi, Simone Marinai, Giovanni Soda

In this paper we describe a method for the expansion of training sets made by XY trees representing page layout. This approach is appropriate when dealing with page classification based on MXY tree page representations. The basic idea is the use of tree grammars to model the variations in the tree which are caused by segmentation algorithms. A set of general grammatical rules are defined and us...

2016
Charmi Patel, Hiteishi Diwanji, Shuang Lin, Jie Chen, Zhendong Niu, Dandan Song, Fei Sun, Lejian Liao

A web page contains a large amount of information, including additional content such as hyperlinks, headers and footers, navigation panels, and advertisements, which can complicate content extraction. Page segmentation is used to detect noisy content blocks by detecting malicious URLs in web pages. The main aim of this research is to detect malicious URLs during content extraction by checking ...

2014
Mrs. Anupama, Mr. Prasanna

This paper aims to design and develop a suitable tool for identifying defects in glass bottles through visual inspection based on a segmentation algorithm. Defects are identified in three stages, namely image acquisition, pre-processing and filtering, and segmentation. In the image acquisition stage, samples of real-time images are taken and converted into monochrome images. In the Pre-pr...
