Search results for: text length

Number of results: 467834

2006
Mohamed Abdel Fattah Fuji Ren Shingo Kuroiwa

In this paper, we present a new approach to align sentences in bilingual parallel corpora based on a probabilistic neural network (P-NNT) classifier. A feature parameter vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually aligned training data was used to train the probabili...

Journal: CoRR 2007
Alexander Tiskin

Computation on compressed strings is one of the key approaches to processing massive data sets. We consider local subsequence recognition problems on strings compressed by straight-line programs (SLP), which is closely related to Lempel–Ziv compression. For an SLP-compressed text of length m̄, and an uncompressed pattern of length n, Cégielski et al. gave an algorithm for local subsequence recogn...

2007
Aminul Islam Diana Inkpen Iluju Kiringa

In this paper, we formulate a generalized method of automatic word segmentation. The method uses corpus type frequency information to choose the type with maximum length and frequency from “desegmented” text. It also uses a modified forward-backward matching technique using maximum length frequency and entropy rate if any non-matching portions of the text exist. The method is also extendible to...

2004
Robert West

The present paper deals with the subject of approximate string matching and demonstrates how Chang and Lawler [CL94] conceived a new sublinear time algorithm out of ideas that had previously been known. The problem is to find all locations in a text of length n over a b-letter alphabet where a pattern of length m occurs with up to k differences (substitutions, insertions, deletions). The algori...

Journal: J. Discrete Algorithms 2004
Amihood Amir Ayelet Butman Moshe Lewenstein Ely Porat Dekel Tsur

Real Scaled Matching is the problem of finding all locations in the text where the pattern, proportionally enlarged according to an arbitrary real-sized scale, appears. Real scaled matching is an important problem that was originally inspired by Computer Vision. In this paper, we present a new, more precise and realistic, definition for one dimensional real scaled matching, and an efficient alg...

Journal: J. Discrete Algorithms 2003
Maxime Crochemore Christophe Hancart Thierry Lecroq

String matching is the problem of finding all the occurrences of a pattern in a text. We present a new method to compute the combinatorial shift function (“matching shift”) of the well-known Boyer–Moore string matching algorithm. This method implies the computation of the length of the longest suffixes of the pattern ending at each position in this pattern. These values constituted an extra-pre...

2017
Chris van der Lee Antal van den Bosch

We present a method to discriminate between texts written in either the Netherlandic or the Flemish variant of the Dutch language. The method draws on a feature bundle representing text statistics, syntactic features, and word n-grams. Text statistics include average word length and sentence length, while syntactic features include ratios of function words and part-of-speech n-grams. The effecti...

Journal: SIAM J. Comput. 1979
Andrew Chi-Chih Yao

We study the average-case complexity of finding all occurrences of a given pattern $\alpha$ in an input text string. Over an alphabet of $q$ symbols, let $c(\alpha, n)$ be the minimum average number of characters that need to be examined in a random text string of length $n$. We prove that, for large $m$, almost all patterns $\alpha$ of length $m$ satisfy $c(\alpha, n) = \Theta\left(\left\lceil \log_q\left(\frac{n-m}{\ln m} + 2\right) \right\rceil\right)$ if $m \le n \le 2m$, and $c(\alpha, n) = \Theta\left(\frac{\lceil \log_q m \rceil}{m}\, n\right)$ if $n > 2m$. This in...

2017
Gonzalo Navarro

The Block Tree is a recently proposed data structure that reaches compression close to Lempel-Ziv while supporting efficient direct access to text substrings. In this paper we show how a self-index can be built on top of a Block Tree so that it provides efficient pattern searches while using space proportional to that of the original data structure. More precisely, if a Lempel-Ziv parse cuts a t...
