Search results for: data cleaning

Number of results: 2424654

Journal: Journal of Integrative Bioinformatics 2006

Journal: Business Systems Research Journal 2018

Journal: BCP Business & Management 2022

Contemporarily, information and data have become the cornerstone of every company’s decision-making. In a vast flow of information, choosing the right data is the first step in developing successful predictions. After determining the requirements, analysis purpose, and prediction direction, outlier processing, missing value processing, and repeated value processing are usually encountered. This paper introduces the limitations, advantages di...

Journal: SAS Global Forum 2014
Lauren Parlett

Cross-visit checks are a vital part of data cleaning for longitudinal studies. The nature of longitudinal studies encourages repeatedly collecting the same information. Sometimes, these variables are expected to remain static, go away, increase, or decrease over time. This presentation reviews the naïve and the better approaches at handling one-variable and two-variable consistency checks. For ...

2016
Zhongqi Lu

The soundness of training data is important to the performance of a learning model. However, in recommender systems, the training data are usually noisy because of the random nature of users’ behaviors and the sparseness of users’ feedback on the recommendations. In this work, we propose a noise elimination model to preprocess the training data in recommender systems....

2012
Ajumobi Udechukwu Christie Ezeife Ken Barker

Many organizations collect large amounts of data to support their business and decision-making processes. The data originate from a variety of sources that may have inherent data-quality problems. These problems become more pronounced when heterogeneous data sources are integrated (for example, in data warehouses). A major problem that arises from integrating different databases is the existenc...

2012
Piyasak Jeatrakul Kok Wai Wong Chun Che Fung

Data cleaning is a pre-processing technique used in most data mining problems. The purpose of data cleaning is to remove noise, inconsistent data, and errors in order to obtain a better and more representative data set for developing a reliable prediction model. In most prediction models, unclean data could sometimes affect the ...

2011
Helena Galhardas Antónia Lopes Emanuel Santos

Data cleaning and ETL processes are usually modeled as graphs of data transformations. The involvement of the users responsible for executing these graphs over real data is important to tune data transformations and to manually correct data items that cannot be treated automatically. In this paper, in order to better support the user involvement in data cleaning processes, we equip a data clean...

2003
Jeremy Kubica Andrew W. Moore

Real-world data is never as perfect as we would like it to be and can often suffer from corruptions that may impact interpretations of the data, models created from the data, and decisions made based on the data. One approach to this problem is to identify and remove records that contain corruptions. Unfortunately, if only certain fields in a record have been corrupted, then usable, uncorrupted ...
