Search results for: record matching
Number of results: 200532
Record linkage is the problem of identifying similar records across different data sources. The similarity between two records is defined by domain-specific similarity functions over several attributes. In this paper, a novel approach is proposed that uses two-level matching based on double embedding. First, records are embedded into a metric space of dimension K, then they are embedded...
In this report we describe an activity of information integration performed on databases with patent data and company indicators. Depending on the application area, this kind of activity is known as record linkage, duplicate detection, record matching, reference reconciliation or other domain-specific terms. In particular, we present a detailed case study on company name matching. We show how t...
The bipartite record linkage task consists of merging two disparate datafiles containing information on two overlapping sets of entities. This is non-trivial in the absence of unique identifiers and it is important for a wide variety of applications given that it needs to be solved whenever we have to combine information from different sources. Most statistical techniques currently used for rec...
Many problems arise when linking medical records from multiple databases. Matching these data to other data is problematic because even small errors, such as data entry errors, differing text formats, and missing data, can defeat exact-match algorithms. Evidence from previous studies suggests that approximate field matching represents a solution to the problem by identifying equivalen...
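As a rough illustration of approximate field matching, the sketch below compares two field values by normalized edit distance instead of exact equality. The abstract is truncated, so the functions `levenshtein`, `field_similarity`, and `fields_match` and the 0.8 threshold are hypothetical choices for illustration, not the method that paper evaluates.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize_field(value: str) -> str:
    """Lower-case and collapse whitespace so formatting differences are ignored."""
    return " ".join(value.lower().split())


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical after normalization."""
    a, b = normalize_field(a), normalize_field(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def fields_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Assumed decision rule: treat fields as equivalent above the threshold."""
    return field_similarity(a, b) >= threshold


print(fields_match("Jonathan Smith", "jonathon  smith"))  # one substitution apart
print(fields_match("Smith", "Jones"))
```

An exact-match comparison would reject the first pair because of case, spacing, and a single typo; the threshold makes that tolerance explicit and tunable.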
Record linkage, sometimes referred to as information retrieval (Frakes and Baeza-Yates, 1992), is needed for the creation, unduplication, and maintenance of name and address lists. This paper describes string comparators and their effect in a production matching system. Because many lists have typographical errors in more than 20 percent of first names and also in last names, effective methods f...
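For readers unfamiliar with string comparators, here is a minimal sketch of one widely used in name matching, the Jaro-Winkler comparator. The truncated abstract does not say which comparators the paper evaluates, so this is an assumed example; the function names and the conventional prefix scale of 0.1 are illustrative choices.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1] based on matching characters and transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Walk both matched sequences in order and count positions that disagree.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Boost the Jaro score for strings sharing a prefix of up to 4 characters."""
    score = jaro(s1, s2)
    prefix = 0
    for c1, c2 in zip(s1[:4], s2[:4]):
        if c1 != c2:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1.0 - score)


print(round(jaro("MARTHA", "MARHTA"), 4))          # 0.9444
print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
```

The prefix boost reflects the observation that typographical errors in names are less common at the start of the string, which is why this comparator tends to suit first and last names.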
ties. With each patient, every treatment event generates two pieces of information: a clinical record (notes, orders, and results generated by and for practitioners) and a billing record. The two are processed separately (most often in parallel; at times, serially). Integrating them yields the patient record. Current technology is based on the unintegrated historic record. Only when the patient...
Recent work on data quality has primarily focused on data repairing algorithms for improving data consistency and record matching methods for data deduplication. This paper accentuates several other challenging issues that are essential to developing data cleaning systems, namely, error correction with performance guarantees, unification of data repairing and record matching, relative informati...
Due to the frequency of spelling and typographical errors in practical applications, record linkage algorithms have to use string similarity functions. In many legal contexts, identifiers such as names have to be encrypted before a record linkage can be attempted. Therefore, algorithms for computing string similarity functions with encrypted identifiers are essential for approximating string ma...
Record linkage, sometimes referred to as information retrieval (Frakes and Baeza-Yates 1992), is needed for the creation, unduplication, and maintenance of name and address lists. This paper describes string comparators and their effect in a production matching system. Because many lists have typographical errors in more than 20% of first names and also in last names, effective methods for deal...
Record matching in data engineering refers to searching for data records originating from the same entities across different data sources. Solutions for record matching usually employ learning algorithms to train a classifier that labels record pairs as either matches or non-matches. In practice, the number of non-matches typically far exceeds the number of matches. This problem is so-called imb...
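To make the scale of that imbalance concrete, the sketch below counts candidate pairs for two sources and derives inverse-frequency class weights, one common way to compensate when training a pair classifier. The abstract is truncated, so this is not that paper's method; the function names and the example sizes (1,000 records per source, 800 true matches) are assumptions for illustration.

```python
def pair_counts(n_a: int, n_b: int, n_true_matches: int) -> tuple[int, int]:
    """Return (matches, non_matches) when every record in A is paired with every record in B."""
    total_pairs = n_a * n_b
    return n_true_matches, total_pairs - n_true_matches


def balanced_class_weights(n_match: int, n_nonmatch: int) -> dict[str, float]:
    """Inverse-frequency weights: total / (number_of_classes * class_count)."""
    if n_match <= 0 or n_nonmatch <= 0:
        raise ValueError("both classes need at least one example")
    total = n_match + n_nonmatch
    return {
        "match": total / (2 * n_match),
        "nonmatch": total / (2 * n_nonmatch),
    }


# Assumed example: two sources of 1,000 records each, 800 of which truly correspond.
matches, non_matches = pair_counts(1000, 1000, 800)
print(matches, non_matches)            # 800 999200
print(non_matches // matches)          # 1249 non-matches per match
print(balanced_class_weights(matches, non_matches)["match"])  # 625.0
```

With these assumed sizes, a classifier that labels every pair a non-match would be more than 99.9% accurate while finding no matches at all, which is why plain accuracy is a poor guide here.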