Fast Bayesian Record Linkage With Record-Specific Disagreement Parameters
Related articles
Hierarchical Bayesian Record Linkage Theory
In record linkage, or exact file matching, one compares two or more files on a single population for purposes of unduplication or production of an enhanced, merged database. Record linkage has many applications, including in population enumeration efforts, to create databases for epidemiological investigations, and to improve survey sample frames. Latent class and mixture models have been used ...
Validating Distance-Based Record Linkage with Probabilistic Record Linkage
This work compares two alternative methods for record linkage: distance based and probabilistic record linkage. It compares the performance of both approaches when data is categorical. To that end, a distance over ordinal and nominal scales is defined. The paper shows that, for categorical data, distance-based and probabilistic-based record linkage lead to similar results in relation to the num...
An Experiment in naïve Bayesian Record Linkage
Sharing data can represent a risk of disclosing sensitive information about the individuals whom the data sets concern. Computationally complex techniques can be used by a so-called ‘data intruder’ to link such data and discover information about targeted individuals. Heuristic approaches to limiting this risk are aimed towards the more casual intruder. A knowledgeable intruder, armed with data...
Implementing a Bayesian Approach to Record Linkage
The Census Coverage Measurement survey-based program estimated household population coverage of the 2010 Decennial Census. Calculating coverage estimates required linking survey person data to census enumerations. For record linkage research, we applied a Bayesian Latent Class Models approach to both 2010 coverage survey data and simulated household data. This paper presents our use of Base SAS...
Methods for Record Linkage and Bayesian Networks
Although terminology differs, there is considerable overlap between record linkage methods based on the Fellegi-Sunter model (JASA 1969) and Bayesian networks used in machine learning (Mitchell 1997). Both are based on formal probabilistic models that can be shown to be equivalent in many situations (Winkler 2000). When no missing data are present in identifying fields and training data are ava...
Journal
Journal title: Journal of Business & Economic Statistics
Year: 2021
ISSN: 0735-0015,1537-2707
DOI: 10.1080/07350015.2021.1934478