Search results for: web page classification

Number of results: 749611

2014
Tarek Amr Abdallah Beatriz de la Iglesia

There are some situations these days in which it is important to have an efficient and reliable classification of a web-page from the information contained in the Uniform Resource Locator (URL) only, without the need to visit the page itself. For example, a social media website may need to quickly identify status updates linking to malicious websites to block them. The URL is very concise, and ...

2000
Nuanwan Soonthornphisaj Boonserm Kijsirikul

The paper presents a generalization of the Iterative Cross-Training (ICT) algorithm, which was previously applied to Thai Web page identification [1]. The main concept of ICT is to iteratively train two sub-classifiers by using unlabeled examples in a crossing manner. In this paper, we extend the algorithm in order to classify Web pages into course or non-course ones, which is a more challenging prob...

Journal: Engineering Letters 2007
Jean-Pierre Norguet Benjamin Tshibasu-Kabeya Gianluca Bontempi Esteban Zimányi

With the emergence of the World Wide Web, analyzing and improving Web communication has become essential to adapt Web content to visitors' expectations. Web communication analysis is traditionally performed by Web analytics software, which produces long lists of page-based audience metrics. These results suffer from page synonymy, page polysemy, page temporality, and page volatility. In ...

2014
Shraddha Sarode Jayant Gadge Jiawei Han Micheline Kamber Jian Pei Ming Mao Yefei Peng Michael Spring Xiaoguang Qi Brian D. Davison Tom M. Mitchell Juan Zhang Yi Niu Huabei Nie Rung-Ching Chen Xiaoyue Wang Zhen Hua Rujiang Bai G. S. Tomar Shekhar Verma Ashish Jha Selma Ayse Özel

Dimensionality refers to the number of terms in a web page. When classifying web pages, high dimensionality causes problems. The main objective of reducing the dimensionality of web pages is to improve the performance of the classifier. Processing time and accuracy are two parameters that influence the performance of a classifier. To reduce the processing time, less informative and redundant te...

Journal: Inf. Sci. 2004
Ali Selamat Sigeru Omatu

Automatic categorization is the only viable method to deal with the scaling problem of the World Wide Web (WWW). In this paper, we propose a news web page classification method (WPCM). The WPCM uses a neural network with inputs obtained by both the principal components and class profile-based features. Each news web page is represented by the term-weighting scheme. As the number of unique words...

2002
Heiner Stuckenschmidt Jens Hartmann Frank van Harmelen

Content-related metadata plays an important role in the effort to develop intelligent web applications. One of the most established forms of providing content-related metadata is the assignment of web pages to content categories. We describe the Spectacle system for classifying individual web pages on the basis of their syntactic structure. This classification requires the specification of cla...

Journal: International Journal of Applied Information Systems 2014

Journal: J. Web Eng. 2004
Michael A. Shepherd Carolyn R. Watters Alistair Kennedy

The research reported in this paper is part of a larger project on the automatic classification of web pages by their genres. The long-term goal is the incorporation of web page genre into the search process to improve the quality of the search results. In this phase, a neural net classifier was trained to distinguish home pages from non-home pages and to classify those home pages as personal h...

2014
Priyank Thakkar Samir Kariya

Clustering is the unsupervised classification of patterns (data items, observations, or feature vectors) into groups (clusters). The clustering problem has been addressed by researchers in many disciplines and in different contexts. Due to the escalating amount of data available online, the World Wide Web has become one of the most precious resources for information retrieval and knowledge discoveri...
