Search results for: Levenshtein algorithm

Number of results: 22948

2007
Maurizio Serva Filippo Petroni

The evolution of languages closely resembles the evolution of haploid organisms. This similarity has been recently exploited [1, 2] to construct language trees. The key point is the definition of a distance among all pairs of languages which is analogous to a genetic distance. Many methods have been proposed to define these distances; one of these, used by glottochronology, computes distance ...

Journal: :CoRR 2009
Maurizio Serva

Languages evolve in time according to a process in which reproduction, mutation and extinction are all possible. This is very similar to haploid evolution for asexual organisms or for mtDNA of complex ones. Exploiting this similarity it is possible, in principle, to verify hypotheses concerning their relationship. The key point is the definition of the distance among pairs of languages in analo...

Since the concept of volunteered geographic information emerged, the quality of this information has been identified as its biggest problem. Various studies have therefore examined the quality of volunteered data and tried to estimate it. These studies, however, have paid less attention to attribute (descriptive) accuracy than to other quality elements, whereas this element, in various spatial analyses and different applications of volunteered information...

Journal: :CoRR 2009
Martin Klein Michael L. Nelson

Inaccessible web pages are part of the browsing experience. The content of these pages however is often not completely lost but rather missing. Lexical signatures (LS) generated from the web pages’ textual content have been shown to be suitable as search engine queries when trying to discover a (missing) web page. Since LSs are expensive to generate, we investigate the potential of web pages’ t...

Journal: :IEEE Trans. Pattern Anal. Mach. Intell. 1997
Eric Sven Ristad Peter N. Yianilos

In many applications, it is necessary to determine the similarity of two strings. A widely-used notion of string similarity is the edit distance: the minimum number of insertions, deletions, and substitutions required to transform one string into the other. In this report, we provide a stochastic model for string edit distance. Our stochastic model allows us to learn the optimal string edit dis...
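
The edit distance defined in this abstract can be sketched with the standard dynamic-programming recurrence. This is only a unit-cost illustration, not the stochastic model the paper proposes; the function name `levenshtein` is an assumption chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b (unit costs)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b; initially the prefix of a is empty.
    previous = list(range(len(b) + 1))

    for i, char_a in enumerate(a, start=1):
        # Turning the first i characters of a into "" takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current

    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Only two rows of the table are kept at a time, so memory grows with the shorter string while running time stays proportional to the product of the two lengths.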

2013
Keiko Taguchi Andrew Finch Seiichi Yamamoto Eiichiro Sumita

We propose a method for inducing romanization systems directly from a bilingual alignment at the grapheme level. First, transliteration word pairs are aligned using a non-parametric Bayesian approach, and then for each grapheme sequence to be romanized, a particular romanization is selected according to a user-specified criterion. We apply our approach to the task of transliteration mining, and...

2016
Meysam Asgari Allison Sliter Jan P. H. van Santen

In this paper, we propose an automatic scoring approach for assessing the language deficit in a sentence repetition task used to evaluate children with language disorders. From ASR-transcribed sentences, we extract sentence similarity measures, including WER and Levenshtein distance, and use them as the input features in a regression model to predict the reference scores manually rated by exper...

Journal: :Psychonomic bulletin & review 2008
Tal Yarkoni David Balota Melvin Yap

Visual word recognition studies commonly measure the orthographic similarity of words using Coltheart's orthographic neighborhood size metric (ON). Although ON reliably predicts behavioral variability in many lexical tasks, its utility is inherently limited by its relatively restrictive definition. In the present article, we introduce a new measure of orthographic similarity generated using a s...

2007
Andreas W. Hauser Klaus U. Schulz

While today's orthography is very strict and seldom changes, this has not always been true. In historical texts, the spelling of words often not only varies from today's but in some periods even varies from use to use in a single text. Information retrieval on historical corpora can deal with these variations using fuzzy matching techniques based on Levenshtein distance with stochastic weights. In pa...

Journal: :IEEE Trans. Information Theory 1996
Peter Boyvalenkov Danyo Danev Silvia P. Boumova

We use linear programming techniques to obtain new upper bounds on the maximal squared minimum distance of spherical codes with fixed cardinality. Functions $Q_j(n, s)$ are introduced with the property that $Q_j(n, s) < 0$ for some $j > m$ iff the Levenshtein bound $L_m(n, s)$ on $A(n, s) = \max\{|W| : W \text{ is an } (n, |W|, s) \text{ code}\}$ can be improved by a polynomial of degree at least $m+1$. General conditions on t...
