Entity Resolution for Multiple Sources with Extended Approach

Authors

Abstract

Entity Resolution is a technique for finding similar records that may refer to the same entity in one or many sources. It is mainly used in data integration and data cleaning in the presence of Big Data. It not only helps organisations keep their data clean, but also provides a unified view of that data for later analysis. However, no single solution fits all duplication issues, because the data itself is heterogeneous and varied. This paper seeks answers on the usefulness of combining different matching approaches, on token blocking versus standard blocking, and on how the approaches run on data from other domains, by examining how well they perform in different scenarios. To that end, the paper outlines the details of the setups and the experiments to execute. A detailed evaluation demonstrates the effectiveness of the approaches on multiple datasets.
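The contrast between token blocking and standard blocking mentioned in the abstract can be illustrated with a minimal sketch. It uses the common textbook definitions of the two techniques and is not necessarily the setup used in the paper. The toy records, the choice of `city` as the blocking key, and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records; not taken from the paper's datasets.
RECORDS = {
    1: {"name": "John Smith", "city": "Berlin"},
    2: {"name": "Smith John", "city": "Berlin"},
    3: {"name": "Jon Smith", "city": "Munich"},
    4: {"name": "Alice Wong", "city": "Berlin"},
}


def standard_blocking(records, key_attr):
    """Group records that share the same normalised value of one key attribute."""
    blocks = defaultdict(set)
    for rid, rec in records.items():
        blocks[rec[key_attr].strip().lower()].add(rid)
    return blocks


def token_blocking(records):
    """Create one block per token that occurs in any attribute value."""
    blocks = defaultdict(set)
    for rid, rec in records.items():
        for value in rec.values():
            for token in value.lower().split():
                blocks[token].add(rid)
    return blocks


def candidate_pairs(blocks):
    """Collect every distinct pair of record ids that co-occur in a block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


standard_pairs = candidate_pairs(standard_blocking(RECORDS, "city"))
token_pairs = candidate_pairs(token_blocking(RECORDS))

print("standard blocking:", sorted(standard_pairs))
print("token blocking:   ", sorted(token_pairs))
```

In this toy setting, standard blocking on `city` never compares record 3 with records 1 and 2, although all three share the token "smith". Token blocking does produce those pairs, at the cost of a larger candidate set.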


Similar resources

Evaluating Entity Resolution Results (Extended version)

Entity Resolution (ER) is the process of identifying groups of records that refer to the same real-world entity. Various measures (e.g., pairwise F1, cluster F1) have been used for evaluating ER results. However, ER measures tend to be chosen in an ad-hoc fashion without careful thought as to what defines a good result for the specific application at hand. In this paper, our contributions are t...

Contextual Entity Resolution Approach for Genealogical Data

Due to the huge amount of inaccurate information and the different types of ambiguity in the available digitized genealogical data, applying Entity Resolution techniques to determine which records refer to the same entity should be considered the first, and still a very important, step in the analysis of this type of data. Traditional methods use a standard string similarity measure to calculate the s...

Efficient Entity Matching over Multiple Data Sources with MapReduce

The execution of data-intensive tasks such as entity matching on large data sources has become a common demand in the era of Big Data. To face this challenge, cloud computing has proven to be a powerful ally for efficiently parallelizing the execution of such tasks. In this work we investigate how to efficiently perform entity matching over multiple large data sources using the MapReduce programming mo...

An Incremental Approach to Entity Resolution

We present a query-time entity resolution process that works in a highly parallel fashion. We use the application MobEx to showcase our process, which consists of a mobile client and a server, where the server takes the role of a mediator and carries out the resolution. Results are propagated to the client as early as possible. Resolution results that are produced later in the process are sent ...

Progressive Approach to Relational Entity Resolution

This paper proposes a progressive approach to entity resolution (ER) that allows users to explore a trade-off between the resolution cost and the achieved quality of the resolved data. In particular, our approach aims to produce the highest quality result given a constraint on the resolution budget, specified by the user. Our proposed method monitors and dynamically reassesses the resolution pr...

Journal

Journal title: Communications in Computer and Information Science

Year: 2023

ISSN: 1865-0937, 1865-0929

DOI: https://doi.org/10.1007/978-3-031-26438-2_40