Search results for: record matching

Number of results: 200532

2016
Pawel Grzebala Michelle Cheatham

Grzebala, Pawel. M.S.C.E. Department of Computer Science and Engineering, Wright State University, 2016. Private Record Linkage: A Comparison of Selected Techniques for Name Matching. The rise of Big Data Analytics has shown the utility of analyzing all aspects of a problem by bringing together disparate data sets. Efficient and accurate private record linkage algorithms are necessary to achiev...

2000
Vassilios S. Verykios Mohamed G. Elfeky Ahmed K. Elmagarmid Munir Cochinwala Siddhartha R. Dalal

The role of data resources in today's business environment is multi-faceted. Primarily, they support the operational needs of an organization or a company. Secondarily, they can be used for decision support and management. The quality of the data, used to support the operational needs, is usually below the quality required for decision support and management. Recent advances in information syst...

Journal: CoRR 2017
Yves van Gennip Blake Hunter Anna Ma Daniel Moyer Ryan de Vera Andrea L. Bertozzi

We consider the problem of duplicate detection: given a large data set in which each entry has multiple attributes, detect which distinct entries refer to the same real world entity. Our method consists of three main steps: creating a similarity score between entries, grouping entries together into ‘unique entities’, and refining the groups. We compare various methods for creating similarity sc...
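
The first two steps named in this abstract (scoring similarity between entries, then grouping entries into entities) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the token-set Jaccard score, the 0.5 threshold, the connected-components grouping, and all function names are choices made here for the example, and the refinement step is left out because the visible part of the abstract does not describe it.

```python
def jaccard(a, b):
    """Token-set Jaccard similarity between two strings (assumed score)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # two empty values are treated as identical
    return len(sa & sb) / len(sa | sb)


def entry_similarity(e1, e2):
    """Average the attribute-wise Jaccard scores of two entries.

    Entries are assumed to be non-empty tuples of strings of equal length.
    """
    scores = [jaccard(x, y) for x, y in zip(e1, e2)]
    return sum(scores) / len(scores)


def group_entries(entries, threshold=0.5):
    """Group entries whose similarity reaches the (hypothetical) threshold.

    Groups are the connected components of the "similar enough" graph,
    tracked with a small union-find structure.
    """
    parent = list(range(len(entries)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(entries)):
        for j in range(i + 1, len(entries)):
            if entry_similarity(entries[i], entries[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(entries)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


entries = [
    ("John Smith", "12 Main St"),
    ("John Smith", "12 Main Street"),
    ("Mary Jones", "5 Oak Ave"),
]
print(group_entries(entries))
```

The all-pairs loop is quadratic in the number of entries, so on a large data set it would normally be combined with some form of candidate pruning.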

Journal: Studies in health technology and informatics 2016
Simon K. Poon Josiah Poon Mary K. Lam Qinglan Yin Daniel Man-yuen Sze Justin C. Y. Wu Vincent C. T. Mok Jessica Y. L. Ching Kam-Leung Chan William H. N. Cheung Alexander Y. Lau

OBJECTIVES: To develop and test an optimal ensemble configuration of two complementary probabilistic data matching techniques, namely Fellegi-Sunter (FS) and Jaro-Winkler (JW), with the goal of improving record matching accuracy. METHODS: Experiments and comparative analyses were carried out to compare matching performance amongst the ensemble configurations combining FS and JW against the two t...

2003
Martin Buechi Andrew Borthwick Adam Winkel Arthur Goldberg

We introduce ClueMaker, the first language designed specifically for approximate record matching. Clues written in ClueMaker predict whether two records denote the same thing based on the values of the records’ attributes. For example, a clue may predict match if the records have identical values for the first name attribute. The values of the clues can then be used as input to a machine-learni...

2006
Matthew Michelson Craig A. Knoblock

Record linkage is the process of matching records across data sets that refer to the same entity. One issue within record linkage is determining which record pairs to consider, since a detailed comparison between all of the records is impractical. Blocking addresses this issue by generating candidate matches as a preprocessing step for record linkage. For example, in a person matching problem, ...
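
The idea of blocking described in this abstract, generating candidate pairs cheaply so that detailed comparison runs only within small groups, can be sketched as below. This is a minimal illustration, not the authors' scheme: the blocking key (first letter of the surname plus zip code), the field names `last_name` and `zip`, and the function names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    """Hypothetical blocking key: surname initial plus zip code."""
    return (record["last_name"][:1].lower(), record["zip"])


def candidate_pairs(records, key=blocking_key):
    """Return index pairs of records that share a blocking key.

    Only these pairs would be passed on to the detailed (expensive)
    comparison step; records in different blocks are never compared.
    """
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[key(rec)].append(idx)

    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(ids, 2))
    return pairs


records = [
    {"last_name": "Smith", "zip": "45435"},
    {"last_name": "Smyth", "zip": "45435"},
    {"last_name": "Jones", "zip": "45435"},
    {"last_name": "Smith", "zip": "10001"},
]
print(candidate_pairs(records))
```

With four records there are six possible pairs, but this key leaves a single candidate pair. The trade-off is that a true match whose key fields differ (for example, a mistyped zip code) is never compared at all.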

1994
William E. Winkler

Record linkage, or computer matching, is needed for the creation and maintenance of name and address lists that support operations for and evaluations of a Year 2000 Census. This paper describes three advances. The first is an enhanced method of string comparison for dealing with typographical variations and scanning errors. It improves upon string comparators in computer science. The second is...

2004
Tiziana Catarci Diego Milano Monica Scannapieco

Data quality improvement is becoming an increasingly important issue. In contexts where data are replicated among different sources, data quality improvement is possible through extensive data comparisons: whereas copies of the same data are different because of data errors, comparisons help to reconcile such copies. Best quality copies can be selected or constructed in order to correct other copie...

Journal: JDIM 2005
Diego Milano Monica Scannapieco Tiziana Catarci

Data quality improvement is becoming an increasingly important issue. In contexts where data are replicated among different sources, data quality improvement is possible through extensive data comparisons: whereas copies of the same data are different because of data errors, comparisons help to reconcile such copies. Record matching algorithms can support the task of linking different copies of the...

1999
William E. Winkler

This paper provides an overview of methods and systems developed for record linkage. Modern record linkage begins with the pioneering work of Newcombe and is especially based on the formal mathematical model of Fellegi and Sunter. In their seminal work, Fellegi and Sunter introduced many powerful ideas for estimating record linkage parameters and other ideas that still influence record linkage ...
