Learning from Multiple Sources of Inaccurate Data
Similar resources
Learning from Multiple Sources of Inaccurate Data
Most theoretical models of inductive inference make the idealized assumption that the data available to a learner is from a single and accurate source. The subject of inaccuracies in data emanating from a single source has been addressed by several authors. The present paper argues in favor of a more realistic learning model in which data emanates from multiple sources, some or all of which may...
Probabilistic Models to Reconcile Complex Data from Inaccurate Data Sources
There is a large amount of data that is published on the Web and several techniques have been developed to extract and integrate data from Web sources. However, Web data are inherently imprecise and uncertain and even if novel approaches to deal with the uncertain data have been recently proposed, they assume that the data are provided with an associated uncertain degree. This paper addresses t...
Learning from Multiple Sources
We consider the problem of learning accurate models from multiple sources of “nearby” data. Given distinct samples from multiple data sources and estimates of the dissimilarities between these sources, we provide a general theory of which samples should be used to learn models for each source. This theory is applicable in a broad decision-theoretic learning framework, and yields results for cla...
Learning Conditional Latent Structures from Multiple Data Sources
Data usually present in heterogeneous sources. When dealing with multiple data sources, existing models often treat them independently and thus can not explicitly model the correlation structures among data sources. To address this problem, we propose a full Bayesian nonparametric approach to model correlation structures among multiple and heterogeneous datasets. The proposed framework, first, ...
Certifying Data from Multiple Sources
Data integrity can be problematic when integrating and organizing information from many sources. In this paper we describe efficient mechanisms that enable a group of data owners to contribute data sets to an untrusted third-party publisher, who then answers users’ queries. Each owner gets a proof from the publisher that his data is properly represented, and each user gets a proof that the answ...
Journal
Journal title: SIAM Journal on Computing
Year: 1997
ISSN: 0097-5397,1095-7111
DOI: 10.1137/s0097539792239461