Adaptive Temporal Entity Resolution on Dynamic Databases
نویسندگان
چکیده
Entity resolution is the process of matching records that refer to the same entities from one or several databases in situations where the records to be matched do not include unique entity identifiers. Matching therefore has to rely upon partially identifying information, such as names and addresses. Traditionally, entity resolution has been applied in batch-mode and on static databases. However, increasingly organisations are challenged by the task of having a stream of query records that need to be matched to a database of known entities. As these query records are matched, they are inserted into the database as either representing a new entity, or as the latest embodiment of an existing entity. We investigate how temporal and dynamic aspects, such as time differences between query and database records and changes in database content, affect matching quality. We propose an approach that adaptively adjusts similarities between records depending upon the values of the records’ attributes and the time differences between records. We evaluate our approach on synthetic data and a large real US voter database, with results showing that our approach can outperform static matching approaches.
منابع مشابه
Dynamic Similarity-Aware Inverted Indexing for Real-Time Entity Resolution
Entity resolution is the process of identifying groups of records in a single or multiple data sources that represent the same real-world entity. It is an important tool in data de-duplication, in linking records across databases, and in matching query records against a database of existing entities. Most existing entity resolution techniques complete the resolution process offline and on stati...
متن کاملImproving adaptive resolution of analog to digital converters using least squares mean method
This paper presents an adaptive digital resolution improvement method for extrapolating and recursive analog-to-digital converters (ADCs). The presented adaptively enhanced ADC (AE-ADC) digitally estimates the digital equivalent of the input signal by utilizing an adaptive digital filter (ADF). The least mean squares (LMS) algorithm also determines the coefficients of the ADF block. In this sch...
متن کاملOnline Collective Entity Resolution
Entity resolution is a critical component of data integration where the goal is to reconcile database references corresponding to the same real-world entities. Given the abundance of publicly available databases that have unresolved entities, we motivate the problem of quick and accurate resolution for answering queries over such ‘unclean’ databases. Since collective entity resolution approache...
متن کاملExtending the Radar Dynamic Range using Adaptive Pulse Compression
The matched filter in the radar receiver is only adapted to the transmitted signal version and its output will be wasted due to non-matching with the received signal from the environment. The sidelobes amplitude of the matched filter output in pulse compression radars are dependent on the transmitted coded waveforms that extended as much as the length of the code on both sides of the target loc...
متن کاملThe Effect of Transitive Closure on the Calibration of Logistic Regression for Entity Resolution
This paper describes a series of experiments in using logistic regression machine learning as a method for entity resolution. From these experiments the authors concluded that when a supervised ML algorithm is trained to classify a pair of entity references as linked or not linked pair, the evaluation of the model’s performance should take into account the transitive closure of its pairwise lin...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013