Effective Learning to Rank Persian Web Content

Author

Amir Hosein Keyhanipour, Assistant Professor, Computer Engineering Department, Faculty of Engineering, College of Farabi, University of Tehran, Iran.

Abstract:

Persian is one of the most widely used languages on the Web, so the Persian Web holds valuable information that needs to be retrieved effectively. As with other languages, ranking algorithms for Persian Web content face several challenges, such as limited applicability in real-world settings and the lack of user modeling. CF-Rank, a recently proposed learning to rank algorithm, addresses these issues through classifier fusion. CF-Rank generates a small set of click-through features that provide a compact representation of a given primitive dataset. By building primitive classifiers on each category of click-through features and aggregating their decisions with information fusion techniques, CF-Rank has proven successful as a ranking algorithm on English datasets. In this paper, CF-Rank is customized for Persian Web content. Evaluation on the dotIR dataset indicates that the customized CF-Rank outperforms the baseline rankings. The improvement is most noticeable at the top of the ranked lists, which are the positions Web users view most often. Compared with the best-performing baseline algorithm on the dotIR dataset, CF-Rank improves NDCG@1 by 30 percent and MAP by 16.5 percent.
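The two evaluation criteria named in the abstract, NDCG@1 and MAP, can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: the exponential-gain form of DCG, an ideal ordering computed from the labels of the ranked list itself, and average precision normalized by the number of relevant documents that appear in the list. The function names and the toy relevance labels are hypothetical and are not taken from the paper or from the dotIR dataset.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded relevance labels."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def average_precision(relevances):
    """Average of precision values at each rank holding a relevant document."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(ranked_lists):
    """Mean of average precision over the ranked lists of several queries."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


# Toy labels in ranked order for two hypothetical queries.
query_a = [2, 0, 1]
query_b = [0, 1]
ndcg1_a = ndcg_at_k(query_a, 1)
ndcg1_b = ndcg_at_k(query_b, 1)
map_score = mean_average_precision([query_a, query_b])
```

NDCG@1 depends only on the document ranked first, which is why it is a natural measure for the top-of-list improvement the abstract emphasizes, whereas MAP summarizes quality over the whole ranked list.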

Similar resources

Is learning to rank effective for Web search?

LETOR, the benchmark collection for learning to rank, supports comparative studies of different approaches in experimental research. Since the collection is constructed mainly from TREC datasets, queries and documents in LETOR differ from real Web search scenarios in some respects, such as incomplete link information, a limited range of document domains, and a lack of user click information. Hence th...

Page content rank: an approach to the web content mining

Methods of web data mining can be divided into several categories according to the kind of information mined and the goals each category sets: Web Structure Mining (WSM), Web Usage Mining (WUM), and Web Content Mining (WCM). The objective of this paper is to propose a new WCM method of page relevance ranking based on exploration of the page content. The method, we call it Page Content Rank...

Semantic Web to E-Learning Content

T. Sheeba, S. Hameetha Begum, and M. Justin Bernard. E-Learning is the use of technology to enable people to learn anytime and anywhere. Semantic Web incorporates efforts to build an efficient web that enhances content with formal semantics, which enables bett...

Learning to Surface Deep Web Content

We propose a novel deep web crawling framework based on reinforcement learning. The crawler is regarded as an agent and deep web database as the environment. The agent perceives its current state and submits a selected action (query) to the environment according to Q-value. Based on the framework we develop an adaptive crawling method. Experimental results show that it outperforms the state of ...

Online Learning to Rank for Content-Based Image Retrieval

A major challenge in Content-Based Image Retrieval (CBIR) is to bridge the semantic gap between low-level image contents and high-level semantic concepts. Although researchers have investigated a variety of retrieval techniques using different types of features and distance functions, no single best retrieval solution can fully tackle this challenge. In a real-world CBIR task, it is often highl...

Learning to rank related entities in Web search

Entity ranking is a recent paradigm that refers to retrieving and ranking related objects and entities from different structured sources in various scenarios. Entities typically have associated categories and relationships with other entities. In this work, we present an extensive analysis of Web-scale entity ranking, based on machine learned ranking models using an ensemble of pair-wise prefer...

Journal title

Information Technology Management (مدیریت فناوری اطلاعات)

Volume 11, Issue 2

Pages 111-128

Publication date 2019-06-01

Keywords

Learning to rank, Persian language, CF-Rank algorithm, dotIR dataset, Information fusion
