web documents

Semantic Navigator: Use of Semantic Data in Web Navigation

2011

Jan Michelfeit Tomáš Knap

Semantic web search engines can take advantage of machine-understandable data published on the Web to provide more precise search results and advanced query capabilities. Semantic data embedded in web documents (serialized as RDFa or microformats, for example) can be used in conjunction with a semantic web search engine to provide a better web navigation experience. We present Semantic Navigator...

A New Model for Phrase Search Based on Weighted Minimum Displacement

Journal: Signal and Data Processing 2019

Javad Paksima

Finding high-quality web pages is one of the most important tasks of search engines. The relevance between the documents found and the query searched depends on the user observation and increases the complexity of ranking algorithms. The other issue is that users often explore just the first 10 to 20 results while millions of pages related to a query may exist. So search engines have to use sui...

Automated Delivery of Web Documents Through a Caching Infrastructure

2003

Pablo Rodriguez Ernst W. Biersack Keith W. Ross

The dramatic growth of the Internet and of Web traffic calls for scalable solutions to accessing Web documents. To this purpose, various caching schemes have been proposed and caching has been widely deployed. Since most Web documents change very rarely, the issue of consistency, i.e. how to assure access to the most recent version of a Web document, has not received much attention. However...

An Overview of Similarity Measures for Clustering XML Documents

2006

Giovanna Guerrini Marco Mesiti Ismael Sanz

The large amount and heterogeneity of XML documents on the Web require the development of clustering techniques to group together similar documents. Documents can be grouped together according to their content, their structure, and links inside and among documents. For instance, grouping together documents with similar structures has interesting applications in the context of information extrac...

A novel algorithm for enhancing search results by detecting dissimilar patterns based on correlation method

Journal: Int. Arab J. Inf. Technol. 2017

Poonkuzhali Sugumaran Kishore Ravi Thirumurugan Shanmugam

The dynamic collection and voluminous growth of information on the web pose great challenges for retrieving relevant information. Though most researchers have focused their work on information retrieval and web mining, their focus is still only on retrieving similar patterns, leaving out dissimilar patterns, which are likely to contain the outlying data. So this paper concent...

Writing Web Documents about Films

1998

Wayne H. Wolf

This paper describes our experience in building and using a Web-based video library designed for educational use. The CAETI Internet Multimedia Library’s initial audience is K-12 schools; most of the content of our library comes from news and politics-related historical footage. The video library is a good tool not just for content but also for acquiring visual literacy. Politic...

Clustering Template Based Web Documents

2008

Thomas Gottron

More and more documents on the World Wide Web are based on templates. On a technical level this causes those documents to have a quite similar source code and DOM tree structure. Grouping together documents which are based on the same template is an important task for applications that analyse the template structure and need clean training data. This paper develops and compares several distance m...

Embedding Knowledge in Web Documents

Journal: Computer Networks 1999

Philippe Martin Peter W. Eklund

The paper argues for the use of general and intuitive knowledge representation languages (and simpler notational variants, e.g. subsets of natural languages) for indexing the content of Web documents and representing knowledge within them. We believe that these languages have advantages over metadata languages based on the Extensible Mark-up Language (XML). Indeed, the retrieval of precise info...

Syntactic Similarity of Web Documents

2003

Álvaro R. Pereira Nivio Ziviani

This paper presents and compares two methods for evaluating the syntactic similarity between documents. The first method uses the Patricia tree, constructed from the original document, and the similarity is computed by searching the text of each candidate document in the tree. The second method uses the concept of shingles to obtain the similarity measure for every document pair, and each shingle from t...

Semantic Summarization Of Web Documents

2013

Shubhangi V Ingale

Document summarization techniques automatically extract information from different sources. The main purpose of this paper is to summarize documents retrieved from the Internet. The proposal is to capture a document from the Internet, store it in a database, extract it, and use natural language in order to retrieve similar information. An overview of the system and some prelim...
