Search results for: edit distance

Number of results: 242096

2009
David W. Pearson Jean-Christophe Janodet

The Levenshtein or edit distance was developed as a metric for calculating distances between character strings. We are looking at weighting the different edit operations (insertion, deletion, substitution) to obtain different types of classifications of sets of strings. As a more general and less constrained approach we introduce topological notions and in particular uniformities.
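
To illustrate the idea of weighting the edit operations, here is a minimal sketch of a weighted edit distance computed by dynamic programming. It is not taken from the paper: the function name `weighted_edit_distance` and the parameters `w_ins`, `w_del`, and `w_sub` are hypothetical, and constant per-operation weights are assumed.

```python
def weighted_edit_distance(a, b, w_ins=1.0, w_del=1.0, w_sub=1.0):
    """Minimum total cost of turning string a into string b.

    Assumption (not from the abstract): each operation has one constant
    weight, independent of the characters involved.
    """
    # prev[j] holds the cost of turning the empty prefix of a into b[:j],
    # which takes j insertions.
    prev = [j * w_ins for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        curr = [i * w_del]
        for j, cb in enumerate(b, start=1):
            sub_cost = 0.0 if ca == cb else w_sub
            curr.append(min(
                prev[j] + w_del,         # delete ca
                curr[j - 1] + w_ins,     # insert cb
                prev[j - 1] + sub_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Unit weights give the classical Levenshtein distance.
print(weighted_edit_distance("kitten", "sitting"))
# Making substitution expensive changes which edit scripts are optimal.
print(weighted_edit_distance("kitten", "sitting", w_sub=3.0))
```

With unit weights this reduces to the classical Levenshtein distance. Raising `w_sub` above `w_ins + w_del` makes a deletion followed by an insertion cheaper than a substitution, which is one way different weightings can lead to different groupings of strings.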

2016
Weiyue Wang Jan-Thorsten Peter Hendrik Rosendahl Hermann Ney

Recently, the capability of character-level evaluation measures for machine translation output has been confirmed by several metrics. This work proposes translation edit rate on character level (CharacTER), which calculates the character level edit distance while performing the shift edit on word level. The novel metric shows high system-level correlation with human rankings, especially for mor...

2007
P. Fihl

The number of potential applications has made automatic recognition of human actions a very active research area. Different approaches have been followed based on trajectories through some state space. In this paper we also model an action as a trajectory through a state space, but we represent the actions as a sequence of temporal isolated instances, denoted primitives. These primitives are ea...

1995
Marie-France Sagot Vincent Escalier Alain Viari Henri Soldano

We present in this paper an algorithm that locates similar words common to a set of strings defined over an alphabet, where the similarity is stated in terms of a Levenshtein edit distance. The comparison of the words in the strings is realized by using a reference object called a model, which is a word over the same alphabet. This allows us to perform a multiple comparison of the strings as opposed to pairwise c...

2007
Wilbert Heeringa Brian Joseph

In this paper we use the Reeks Nederlandse Dialectatlassen as a source for the reconstruction of a ‘proto-language’ of Dutch dialects. We used 360 dialects from locations in the Netherlands, the northern part of Belgium and French-Flanders. The density of dialect locations is about the same everywhere. For each dialect we reconstructed 85 words. For the reconstruction of vowels we used knowledg...

2012
Jaume Gibert Ernest Valveny Horst Bunke Alicia Fornés

Graph embeddings in vector spaces aim at assigning a pattern vector to every graph so that the problems of graph classification and clustering can be solved by using data processing algorithms originally developed for statistical feature vectors. An important requirement graph features should fulfil is that they reproduce as much as possible the properties among objects in the graph domain. In ...

2012
Cyril Laitang Karen Pinel-Sauvagnat Mohand Boughanem

Structured information retrieval (SIR) on XML documents makes it possible to retrieve focused parts of documents that match the user's needs. These needs can be expressed through content and structured queries which, like XML documents, can be represented as trees. Our approach uses these trees through tree edit distance to estimate the relevance of XML elements. Tree edit distance is the minimum set o...

2006
Pavel Makagonov Alejandro Ruiz Figueroa Alexander F. Gelbukh

We propose a method for semi-automatic construction of an ontology of a given branch of science for measuring its evolution in time. The method relies on a collection of documents in the given thematic domain. We observe that the words of different levels of abstraction are located within different parts of a document: say, the title or abstract contains more general words than the body of the ...

2002
Andrea Torsello Edwin R. Hancock

In this paper we investigate how to construct a shape space for sets of shock trees. To do this we construct a super-tree to span the union of the set of shock trees. This super-tree is constructed so that it both minimizes the total tree edit distance and preserves edge consistency constraints. Each node of the super-tree corresponds to a dimension of the pattern space. Individual such trees a...

Journal: PVLDB 2015
Yu Tang Yilun Cai Nikos Mamoulis

Given a large collection of tree-structured objects (e.g., XML documents), the similarity join finds the pairs of objects that are similar to each other, based on a similarity threshold and a tree edit distance measure. The state-of-the-art similarity join methods compare simpler approximations of the objects (e.g., strings), in order to prune pairs that cannot be part of the similarity join res...
