Search results for: data cleaning

Number of results: 2424654

Journal: Mechanisms of Ageing and Development 2015
Jennifer Baur Maria Moreno-Villanueva Tobias Kötter Thilo Sindlinger Alexander Bürkle Michael R. Berthold Michael Junk

Databases are organized collections of data and are necessary to investigate a wide spectrum of research questions. For data evaluation, analyzers should be aware of possible data quality problems that can compromise the validity of results. Therefore, data cleaning is an essential part of the data management process, which deals with the identification and correction of errors in order to improve data qu...

Journal: JNW 2013
Jiyun Li Junping Wang Hongxing Pei

Data mining or data analysis in biomedicine is different from other research fields, because biomedical data are heterogeneous and come from different sources. Data from different medical sources are voluminous; each source may have a different data structure or data schema, and the data quality also differs. Moreover, each physician may have their own interpretation with...

Journal: PLoS Medicine 2005
Jan Van den Broeck Solveig Argeseanu Cunningham Roger Eeckels Kobus Herbst

In clinical epidemiological research, errors occur in spite of careful study design, conduct, and implementation of error-prevention strategies. Data cleaning intends to identify and correct these errors or at least to minimize their impact on study results. Little guidance is currently available in the peer-reviewed literature on how to set up and carry out cleaning efforts in an efficient a...

2018
Giansalvatore Mecca Paolo Papotti Donatello Santoro

Schema mapping management is an important research area in data transformation, integration, and cleaning systems. The reasons for its success can be found in the declarative nature of its building block (thus enabling clean semantics and easy to use design tools) paired with the efficiency and modularity in the deployment step. In this chapter we cover the evolution of schema-mappings through ...

2016
Dina Sukhobok Nikolay Nikolov Antoine Pultier Xianglin Ye Arne-Jørgen Berre Rick Moynihan Bill Roberts Brian Elvesæter Mahasivam Nivethika Dumitru Roman

Over the past several years the amount of published open data has increased significantly. The majority of this is tabular data, which requires powerful and flexible approaches for data cleaning and preparation in order to convert it into Linked Data. This paper introduces Grafterizer, a software framework developed to support data workers and data developers in the process of converting raw ta...

2005
Rupesh Kumar Montakarn Chaikumarn Shrawan Kumar

Methods: In this study, the cleaning process was studied and analyzed with special reference to cleaning tools. A group of 13 professional cleaners participated in this study. While they performed their normal tasks, their oxygen consumption, heart rate, rating of perceived exertion and postural data were obtained. The perceived exertion during the cleaning task using the “redesigned cleaning tool” w...

2004
Laura Irina Rusu J. Wenny Rahayu David Taniar

One of the most important aspects in building an XML data warehouse is the data cleaning and integration process. This paper presents a detailed methodology for cleaning and integrating data, especially useful for general situations when different-source documents are involved. Both situations whereby the XML documents have an associated XML Schema or they are just independent XML documents are con...

2012
Lin Li

Acknowledgement; Publications from the PhD work ...

Journal: JCP 2010
Kazi Shah Nawaz Ripon Ashiqur Rahman G. M. Atiqur Rahaman

Data mining algorithms generally assume that data will be clean and consistent. However, in practice, this is not always the case, and for this reason the detection and elimination of duplicate records is an important part of data cleaning. The presence of similar-duplicate records causes over-representation of data. If the database contains different representations of the same data, the resul...

2010
Meredith Nahm M. Nahm

ion                      70   647   960   5,019   1,018   510   818
Optical                   2    81   207   1,106     338     4   220
Single entry              4    26    80     650     150    21    36
Double entry              4    15    16      33      10     6    24
No batch data cleaning    2   270   648   5,019     946   200   475
Batch data cleaning       2    36   306   1,351     428    23   287
