Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening
Abstract
We consider the problem of statistical inference for ranking data, specifically rank aggregation, under the assumption that samples are incomplete in the sense of not comprising all choice alternatives. In contrast to most existing methods, we explicitly model the process of turning a full ranking into an incomplete one, which we call the coarsening process. To this end, we propose the concept of rank-dependent coarsening, which assumes that incomplete rankings are produced by projecting a full ranking to a random subset of ranks. For a concrete instantiation of our model, in which full rankings are drawn from a Plackett-Luce distribution and observations take the form of pairwise preferences, we study the performance of various rank aggregation methods. In addition to predictive accuracy in the finite sample setting, we address the theoretical question of consistency, by which we mean the ability to recover a target ranking when the sample size goes to infinity, despite a potential bias in the observations caused by the (unknown) coarsening.
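The concrete instantiation described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: the function names, the uniform choice of rank pairs, and the win-count aggregation are assumptions made for illustration only. A full ranking is drawn from a Plackett-Luce model, projected onto a random pair of rank positions (the selection probability depends on the positions only, never on the items occupying them), and the resulting pairwise preferences are aggregated by counting wins.

```python
import random
from itertools import combinations


def sample_plackett_luce(weights, rng):
    """Draw a full ranking (best item first) from a Plackett-Luce model."""
    remaining = list(range(len(weights)))
    ranking = []
    while remaining:
        # The next position goes to an item with probability proportional
        # to its weight among the items not yet placed.
        chosen = rng.choices(remaining, weights=[weights[i] for i in remaining])[0]
        ranking.append(chosen)
        remaining.remove(chosen)
    return ranking


def coarsen_to_pair(ranking, rank_pair_probs, rng):
    """Project a full ranking onto a random pair of rank positions.

    rank_pair_probs (hypothetical name) maps (r, s), r < s, 0-based positions,
    to a selection probability that depends on the ranks only.
    """
    pairs = list(rank_pair_probs)
    r, s = rng.choices(pairs, weights=[rank_pair_probs[p] for p in pairs])[0]
    return ranking[r], ranking[s]  # (preferred item, less preferred item)


def aggregate_by_wins(observations, n_items):
    """Simple stand-in aggregator: order items by their number of pairwise wins."""
    wins = [0] * n_items
    for winner, _loser in observations:
        wins[winner] += 1
    return sorted(range(n_items), key=lambda i: -wins[i])


def simulate(weights, rank_pair_probs, n_samples, seed=0):
    """Generate coarsened pairwise observations and aggregate them."""
    rng = random.Random(seed)
    observations = [
        coarsen_to_pair(sample_plackett_luce(weights, rng), rank_pair_probs, rng)
        for _ in range(n_samples)
    ]
    return aggregate_by_wins(observations, len(weights))


if __name__ == "__main__":
    weights = [8.0, 4.0, 2.0, 1.0]  # assumed Plackett-Luce parameters
    # Assumption: every pair of rank positions is equally likely to be observed.
    uniform = {pair: 1.0 for pair in combinations(range(len(weights)), 2)}
    print(simulate(weights, uniform, n_samples=5000))
```

Replacing the uniform dictionary with unequal probabilities (for example, favouring the top two positions) gives a way to explore how a particular aggregation method behaves when the coarsening skews which comparisons are observed.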
Similar resources
MRA-based Statistical Learning from Incomplete Rankings
Statistical analysis of rank data describing preferences over small and variable subsets of a potentially large ensemble of items {1, . . . , n} is a very challenging problem. It is motivated by a wide variety of modern applications, such as recommender systems or search engines. However, very few inference methods have been documented in the literature to learn a ranking model from such incomp...
The Use of Fuzzy, Neural Network, and Adaptive Neuro-Fuzzy Inference System (ANFIS) to Rank Financial Information Transparency
Ranking of a company's financial information is one of the most important tools for identifying strengths and weaknesses and identifying opportunities and threats outside the company. In this study, it is attempted to examine the financial statements of companies to rank and explain the transparency of financial information of 198 companies during 2009-2017 using artificial intelligence and neu...
Marginal Analysis of A Population-Based Genetic Association Study of Quantitative Traits with Incomplete Longitudinal Data
A common study to investigate gene-environment interaction is designed to be longitudinal and population-based. Data arising from longitudinal association studies often contain missing responses. Naive analysis without taking missingness into account may produce invalid inference, especially when the missing data mechanism depends on the response process. To address this issue in the ana...
A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining
Data envelopment analysis (DEA) is a relatively new data-oriented approach for evaluating the performance of a set of peer entities called decision-making units (DMUs) that convert multiple inputs into multiple outputs. Within a relatively short period, DEA has become a strong quantitative and analytical tool for measuring and evaluating performance. In an article written by Toloo et al. (2009...
The Presentation of an Approach of Evaluation and Ranking in Data Envelopment Analysis with Interval Data: a Case Study in the Evaluation and Ranking of Iran’s Provinces in the Health and Treatment Sector
Today, in every society the health and treatment sector is among the most important service sectors. Therefore, it is crucial that its performance be evaluated and examined. Although researchers have proposed many different approaches to evaluate and rank health sectors, no precise approach for evaluating and ranking has been reported up to now. Assessing the coefficient of variatio...
Publication date: 2017