Search results for: edit distance
Number of results: 242096
In this empirical study, I compare various tree distance measures – originally developed in computational biology for the purpose of tree comparison – for the purpose of parser evaluation. I will control for the parser setting by comparing the automatically generated parse trees from the state-of-the-art parser (Charniak, 2000) with the gold-standard parse trees. The article describes two differ...
In many database applications involving string data, it is common to have near neighbor queries (asking for strings that are similar to a query string) or nearest neighbor queries (asking for strings that are most similar to a query string). The similarity between strings is defined in terms of a distance function determined by the application domain. The most popular string distance measures a...
Edit distance, also known as Levenshtein distance, is a very useful tool to measure the similarity between two strings. It has been widely used in many applications such as natural language processing and bioinformatics. In this paper, we introduce a new type of fuzzy public key encryption called Edit Distance-based Encryption (EDE). In EDE, the encryptor can specify an alphabet string and a th...
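As an illustration of the measure this abstract refers to, here is a minimal sketch of the standard dynamic-programming computation of Levenshtein distance with unit costs for insertion, deletion, and substitution. The function name `levenshtein` is chosen for this example only; it is not taken from the paper, and the sketch says nothing about how EDE itself is constructed.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance between strings a and b (illustrative sketch)."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; start with the empty prefix of a.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory use grows with the length of `b` rather than with the product of both lengths.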
Many natural language processing (NLP) applications require the computation of similarities between pairs of syntactic or semantic trees. Many researchers have used tree edit distance for this task, but this technique suffers from the drawback that it deals with single node operations only. We have extended the standard tree edit distance algorithm to deal with subtree transformation operations...
Graph kernels make it possible to define metrics on graph space and thus constitute an efficient tool for combining the advantages of the structural and statistical pattern recognition fields. Within the chemoinformatics framework, kernels are usually defined by comparing the number of occurrences of patterns extracted from two different graphs. Such a graph kernel construction scheme neglects the fact that similar but n...
In this paper we present our structured information retrieval model based on subgraph similarity. Our approach combines a content propagation technique, which handles sibling relationships, with a document-query matching process on structure. The latter is based on tree edit distance (TED), which is the minimum set of insert, delete, and replace operations needed to turn one tree into another. As the effe...
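The definition of tree edit distance given in this abstract can be sketched for small ordered, labeled trees with a memoized forest recursion. This is an illustrative sketch under stated assumptions, not the retrieval model from the paper: the tree encoding `(label, children_tuple)`, the function name `tree_edit_distance`, and the unit costs are all choices made for this example, and the plain recursion is not one of the optimized algorithms such as Zhang-Shasha.

```python
from functools import lru_cache


def tree_edit_distance(t1, t2):
    """Unit-cost edit distance between two ordered, labeled trees.

    A tree is a pair (label, children), where children is a tuple of trees.
    """

    @lru_cache(maxsize=None)
    def size(forest):
        # Number of nodes in a forest (a tuple of trees).
        return sum(1 + size(children) for _, children in forest)

    @lru_cache(maxsize=None)
    def dist(f, g):
        if not f:
            return size(g)  # insert every node of g
        if not g:
            return size(f)  # delete every node of f
        (lv, cv), (lw, cw) = f[-1], g[-1]  # rightmost trees of each forest
        return min(
            dist(f[:-1] + cv, g) + 1,  # delete the root of f's rightmost tree
            dist(f, g[:-1] + cw) + 1,  # insert the root of g's rightmost tree
            # match the two roots, relabeling if the labels differ
            dist(f[:-1], g[:-1]) + dist(cv, cw) + (lv != lw),
        )

    return dist((t1,), (t2,))


leaf = lambda label: (label, ())
t1 = ("f", (("d", (leaf("a"), ("c", (leaf("b"),)))), leaf("e")))
t2 = ("f", (("c", (("d", (leaf("a"), leaf("b"))),)), leaf("e")))
print(tree_edit_distance(t1, t2))  # 2: delete "c" in t1, then insert "c" above "d"
```

Deleting a node promotes its children into its place, which is why the delete case splices `cv` back into the forest; insertion is the mirror image on the other tree.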
We propose a new approach to characterizing the timeline of a text: temporal dependency structures, where all the events of a narrative are linked via partial ordering relations like BEFORE, AFTER, OVERLAP and IDENTITY. We annotate a corpus of children’s stories with temporal dependency trees, achieving agreement (Krippendorff’s Alpha) of 0.856 on the event words, 0.822 on the links between eve...
Based on the core approach of the tree edit distance algorithm, the system's central module is designed to target the scope of TE – semantic variability. The main idea is to transform the hypothesis, making use of extensive semantic knowledge from sources such as DIRT, WordNet, Wikipedia, and an acronyms database. Additionally, we built a system to acquire the extra background knowledge needed and applied c...
Pattern matching for intelligence organizations is a challenging problem. The data sets are large and noisy, and there is a flexible and constantly changing notion of what constitutes a match. We are developing the Link Analysis Workbench (LAW) to assist an expert user in the intelligence community in creating and maintaining patterns, matching those patterns against a large collection of relat...
In this paper we study the geometry of graph spaces endowed with a special class of graph edit distances. The focus is on geometrical results useful for statistical pattern recognition. The main result is the Graph Representation Theorem. It states that a graph is a point in some geometrical space, called orbit space. Orbit spaces are well investigated and easier to explore than the original gr...