Ignoring Irrelevant Pages in Weighted PageRank Algorithm using Text Content of the Target Page

Authors

Sunil Kumar

Niraj Singhal

Abstract

The web is expanding day by day, and people generally rely on search engines to explore it. This growth has created many challenges for information retrieval. The quality of the extracted information is one of the major issues to be addressed, and current information retrieval approaches need to be modified to meet such challenges. In query-based searching, search engines return a list of web documents containing both relevant and irrelevant pages, and sometimes rank irrelevant pages higher than relevant ones. This paper presents a novel approach to ignoring irrelevant pages in the Weighted PageRank algorithm using the text content of the target pages.

General Terms: Web page ranking for information retrieval
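The abstract does not spell out the method, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the commonly cited Weighted PageRank update, in which each link is weighted by the target's share of in-links and out-links among the pages its source points to, and it assumes a deliberately simple relevance test (the target page's text must share at least one term with the query). The names `is_relevant`, `weighted_pagerank`, `links`, and `texts`, as well as the damping factor 0.85, are illustrative assumptions.

```python
from typing import Dict, List, Set


def is_relevant(text: str, query_terms: Set[str]) -> bool:
    """Assumed relevance test: the page text shares at least one term with the query."""
    return bool(set(text.lower().split()) & query_terms)


def weighted_pagerank(
    links: Dict[str, List[str]],
    texts: Dict[str, str],
    query: str,
    d: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Weighted PageRank over the pages that pass the (assumed) text relevance test."""
    terms = set(query.lower().split())

    # Ignore pages whose text content is judged irrelevant to the query.
    keep = {p for p in links if is_relevant(texts.get(p, ""), terms)}

    # Out-links restricted to kept pages, without self-links or duplicates.
    out = {p: {q for q in links[p] if q in keep and q != p} for p in keep}

    in_deg = {p: 0 for p in keep}
    for p in keep:
        for q in out[p]:
            in_deg[q] += 1
    out_deg = {p: len(out[p]) for p in keep}

    rank = {p: 1.0 for p in keep}
    for _ in range(iterations):
        new_rank = {p: 1.0 - d for p in keep}
        for v in keep:
            # Totals over all pages that v links to (the reference pages of v).
            total_in = sum(in_deg[p] for p in out[v])
            total_out = sum(out_deg[p] for p in out[v])
            for u in out[v]:
                w_in = in_deg[u] / total_in if total_in else 0.0
                w_out = out_deg[u] / total_out if total_out else 0.0
                new_rank[u] += d * rank[v] * w_in * w_out
        rank = new_rank
    return rank


# Toy graph: "spam" links into the graph but its text matches no query term.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "spam": ["a", "c"]}
texts = {
    "a": "pagerank link analysis",
    "b": "web page ranking",
    "c": "ranking web search",
    "spam": "cheap watches",
}
ranks = weighted_pagerank(links, texts, "web ranking pagerank")
```

In this toy run the page `spam` is dropped before ranking, so its links contribute nothing to the scores of `a` and `c`. A real system would likely replace the term-overlap test with a stronger text similarity measure; the paper's full text would be needed to see which one the authors use.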

Similar resources

Associated Pagerank: A Content Relevance Weighted Pagerank Algorithm

The Pagerank algorithm is a link analysis approach to evaluating the importance of web pages, and in recent years many techniques have been proposed to improve the traditional Pagerank algorithm and protect it from the biases of link spamming. A key challenge for link analysis is to identify the relevance between the original page and the linked page. The importance scores of web pages should rely on the quality ...

Data Extraction using Content-Based Handles

In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. Our extraction algorithm is inspired by the way a human user scans the page content for specific data. In particular, we use text fea...

Weighted Page Rank Algorithm Based on Number of Visits of Links of Web Page

The World Wide Web consists of billions of web pages, and a huge amount of information is available within them. To retrieve the required information from the World Wide Web, search engines perform a number of tasks based on their respective architectures. When a user submits a query to a search engine, it generally returns a large number of pages in response to the user’s query. To support the users to navigate...

A Score based Web Page Ranking Algorithm

With the explosive growth of information on the Web, users face difficulties in finding their desired information. A search engine helps the user by retrieving useful information from this huge collection based on his/her search query and presents a list of relevant web pages as a search result. However, without proper ranking of pages in the result through the relevancy of pages to the search...

Weighted PageRank using the Rank Improvement

With so much information available on the WWW, users easily get lost in its rich hyperstructure. It has become increasingly necessary for users to utilize automated tools in order to find, extract, filter and evaluate the desired information and resources. A modern information retrieval system matches the terms of a user's query with the documents in its index and returns a large number of documents of Web pages generally...

Publication date: 2014
