Search results for: data cleaning

Number of results: 2424654

2008
Joseph M. Hellerstein

Data collection has become a ubiquitous function of large organizations – not only for record keeping, but to support a variety of data analysis tasks that are critical to the organizational mission. Data analysis typically drives decision-making processes and efficiency optimizations, and in an increasing number of settings is the raison d’être of entire agencies or firms. Despite the importan...

2006
Shawn R. Jeffery Gustavo Alonso Michael J. Franklin Wei Hong Jennifer Widom

Pervasive applications rely on data captured from the physical world through sensor devices. Data provided by these devices, however, tend to be unreliable. The data must, therefore, be cleaned before an application can make use of them, leading to additional complexity for application development and deployment. Here we present Extensible Sensor stream Processing (ESP), a framework for buildin...

Journal: IJKBO 2011
Payal Pahwa Rajiv Arora Garima Thakur

The quality of real-world data fed into a data warehouse is a major concern today. Because the data comes from a variety of sources, it must be checked for errors and anomalies before it is loaded into the data warehouse. The source data may contain exact or approximate duplicate records. The presence of incorrect or inconsistent data can significantly distor...

Journal: PVLDB 2012
Tamraparni Dasu Ji Meng Loh

We introduce the notion of statistical distortion as an essential metric for measuring the effectiveness of data cleaning strategies. We use this metric to propose a widely applicable yet scalable experimental framework for evaluating data cleaning strategies along three dimensions: glitch improvement, statistical distortion and cost-related criteria. Existing metrics focus on glitch improvemen...

Journal: TELKOMNIKA (Telecommunication Computing Electronics and Control) 2018

2000
Helena Galhardas Daniela Florescu Dennis Shasha Eric Simon

We propose an extensible data cleaning tool, named AJAX, that supports the specification and efficient execution of complex data cleaning programs.

Journal: CoRR 2015
DongPing Fang Elizabeth Oberlin Wei Ding Samuel P. Kounaves

Data quality is fundamentally important to ensure the reliability of data for stakeholders to make decisions. In real world applications, such as scientific exploration of extreme environments, it is unrealistic to require raw data collected to be perfect. As data miners, when it is infeasible to physically know the why and the how in order to clean up the data, we propose to seek the intrinsic...

Background: Social support and school play a pivotal role in the development of oral health-related behaviors among students. This study was conducted to determine the relationship between stages of dental cleaning behavior change, based on the Transtheoretical Model, and school role and social support in Iranian students. Materials and Methods: In a cross-sectional study, 525 male and female student...

Journal: PVLDB 2008
Wenfei Fan Floris Geerts Xibei Jia

Integrity constraints, a.k.a. data dependencies, are widely used for improving the quality of schema. Recently, constraints have enjoyed a revival for improving the quality of data. The tutorial aims to provide an overview of recent advances in constraint-based data cleaning.

Journal: Proceedings of the VLDB Endowment 2008
