Levenshtein algorithm

Search results for: Levenshtein algorithm

Number of results: 22948

Indo-European languages tree by Levenshtein distance

2007

Maurizio Serva Filippo Petroni

The evolution of languages closely resembles the evolution of haploid organisms. This similarity has been recently exploited [1, 2] to construct language trees. The key point is the definition of a distance among all pairs of languages which is the analogue of a genetic distance. Many methods have been proposed to define these distances; one of these, used by glottochronology, computes distance ...

Automated languages phylogeny from Levenshtein distance

Journal: CoRR 2009

Maurizio Serva

Languages evolve in time according to a process in which reproduction, mutation and extinction are all possible. This is very similar to haploid evolution for asexual organisms or for mtDNA of complex ones. Exploiting this similarity it is possible, in principle, to verify hypotheses concerning their relationship. The key point is the definition of the distance among pairs of languages in analo...

Assessing the attribute accuracy of features in volunteered geographic information

Journal: علوم و فنون نقشه برداری (Surveying Science and Technology) 2016

Ali Asghar Alesheikh Behzad Vahedi Torghabeh

Since the concept of volunteered geographic information (VGI) first emerged, the quality of this information has been identified as its biggest problem. Various studies have therefore examined the quality of volunteered data and tried to estimate it. These studies, however, have paid less attention to attribute accuracy than to the other quality elements, even though this element, in various spatial analyses and different applications of volunteered information...

Investigating the Change of Web Pages' Titles Over Time

Journal: CoRR 2009

Martin Klein Michael L. Nelson

Inaccessible web pages are part of the browsing experience. The content of these pages however is often not completely lost but rather missing. Lexical signatures (LS) generated from the web pages’ textual content have been shown to be suitable as search engine queries when trying to discover a (missing) web page. Since LSs are expensive to generate, we investigate the potential of web pages’ t...

Learning String Edit Distance

Journal: IEEE Trans. Pattern Anal. Mach. Intell. 1997

Eric Sven Ristad Peter N. Yianilos

In many applications, it is necessary to determine the similarity of two strings. A widely-used notion of string similarity is the edit distance: the minimum number of insertions, deletions, and substitutions required to transform one string into the other. In this report, we provide a stochastic model for string edit distance. Our stochastic model allows us to learn the optimal string edit dis...
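
The edit distance defined in this abstract can be illustrated with a short sketch. The code below is the classic unit-cost dynamic program, not the stochastic model the paper proposes; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to transform string a into string b (unit costs assumed)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory grows with the length of the shorter string while the running time remains proportional to the product of the two lengths.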

Inducing Romanization Systems

2013

Keiko Taguchi Andrew Finch Seiichi Yamamoto Eiichiro Sumita

We propose a method for inducing romanization systems directly from a bilingual alignment at the grapheme level. First, transliteration word pairs are aligned using a non-parametric Bayesian approach, and then for each grapheme sequence to be romanized, a particular romanization is selected according to a user-specified criterion. We apply our approach to the task of transliteration mining, and...

Automatic Scoring of a Sentence Repetition Task from Voice Recordings

2016

Meysam Asgari Allison Sliter Jan P. H. van Santen

In this paper, we propose an automatic scoring approach for assessing the language deficit in a sentence repetition task used to evaluate children with language disorders. From ASR-transcribed sentences, we extract sentence similarity measures, including WER and Levenshtein distance, and use them as the input features in a regression model to predict the reference scores manually rated by exper...
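
As a rough illustration of the two similarity measures named in this abstract, the sketch below computes a word error rate (WER) and a character-level Levenshtein distance between a reference sentence and a transcript. The function names, the whitespace tokenization, and the feature dictionary are assumptions made here; the abstract does not describe the authors' actual feature extraction or regression model.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Unit-cost edit distance between two sequences (strings or word lists)."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i  # delete every element of ref[:i]
    for j in range(cols):
        table[0][j] = j  # insert every element of hyp[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,         # deletion
                table[i][j - 1] + 1,         # insertion
                table[i - 1][j - 1] + cost,  # substitution or match
            )
    return table[-1][-1]


def similarity_features(reference: str, transcript: str) -> dict:
    """Hypothetical feature extractor: WER over whitespace tokens plus
    character-level Levenshtein distance."""
    ref_words = reference.split()
    hyp_words = transcript.split()
    if not ref_words:
        raise ValueError("reference sentence must contain at least one word")
    return {
        "wer": edit_distance(ref_words, hyp_words) / len(ref_words),
        "char_levenshtein": edit_distance(reference, transcript),
    }


features = similarity_features("the dog chased the cat", "the dog chase cat")
print(features)
```

WER here is the word-level edit distance divided by the number of reference words, so it can exceed 1 when the transcript contains many inserted words.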

Moving beyond Coltheart's N: a new measure of orthographic similarity.

Journal: Psychonomic Bulletin & Review 2008

Tal Yarkoni David Balota Melvin Yap

Visual word recognition studies commonly measure the orthographic similarity of words using Coltheart's orthographic neighborhood size metric (ON). Although ON reliably predicts behavioral variability in many lexical tasks, its utility is inherently limited by its relatively restrictive definition. In the present article, we introduce a new measure of orthographic similarity generated using a s...

Unsupervised Learning of Edit Distance Weights for Retrieving Historical Spelling Variations

2007

Andreas W. Hauser Klaus U. Schulz

While today's orthography is very strict and seldom changes, this has not always been true. In historical texts the spelling of words often not only varies from today's but in some periods even varies from use to use in a single text. Information retrieval on historical corpora can deal with these variations using fuzzy matching techniques based on Levenshtein distance with stochastic weights. In pa...

Upper bounds on the minimum distance of spherical codes

Journal: IEEE Trans. Information Theory 1996

Peter Boyvalenkov Danyo Danev Silvia P. Boumova

We use linear programming techniques to obtain new upper bounds on the maximal squared minimum distance of spherical codes with fixed cardinality. Functions Q_j(n, s) are introduced with the property that Q_j(n, s) < 0 for some j > m iff the Levenshtein bound L_m(n, s) on A(n, s) = max{|W| : W is an (n, |W|, s) code} can be improved by a polynomial of degree at least m+1. General conditions on t...
