Gapped Indexing for Consecutive Occurrences

Abstract

The classic string indexing problem is to preprocess a string $S$ into a compact data structure that supports efficient pattern matching queries. Typical queries include existential queries (decide if the pattern occurs in $S$), reporting queries (return all positions where the pattern occurs), and counting queries (return the number of occurrences of the pattern). In this paper we consider a variant of string indexing, where the goal is to compactly represent the string such that, given two patterns $P_1$ and $P_2$ and a gap range $[\alpha, \beta]$, we can quickly find the consecutive occurrences of $P_1$ and $P_2$ with distance in $[\alpha, \beta]$, i.e., pairs of subsequent occurrences with distance within the range. We present data structures that use linear space and query time $\widetilde{O}(|P_1|+|P_2|+n^{2/3})$ for existence and counting and $\widetilde{O}(|P_1|+|P_2|+n^{2/3}\mathrm{occ}^{1/3})$ for reporting. We complement this with a conditional lower bound based on the set intersection problem, showing that any solution using $\widetilde{O}(n)$ space must use $\widetilde{\Omega}(|P_1| + |P_2| + \sqrt{n})$ query time. To obtain our results we develop new techniques and ideas of independent interest, including a new suffix tree decomposition and hardness of a variant of the set intersection problem.
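To make the query concrete, here is a minimal brute-force sketch in Python. It is not the data structure from the paper: it scans the string at query time, so it only illustrates what a reporting query returns. It assumes that a pair $(i, j)$ counts as a consecutive occurrence when $i$ is an occurrence of $P_1$, $j > i$ is an occurrence of $P_2$, and no occurrence of either pattern starts strictly between them. Positions are 0-indexed, and the function names are hypothetical.

```python
from bisect import bisect_right


def occurrences(s, p):
    """Return all start positions of p in s (overlaps allowed), in increasing order."""
    out = []
    i = s.find(p)
    while i != -1:
        out.append(i)
        i = s.find(p, i + 1)
    return out


def consecutive_gapped(s, p1, p2, alpha, beta):
    """Report pairs (i, j) of consecutive occurrences of p1 and p2 with alpha <= j - i <= beta.

    Assumed definition: i is an occurrence of p1, j > i is an occurrence of p2,
    and no occurrence of p1 or p2 starts strictly between i and j.
    """
    occ1 = occurrences(s, p1)
    occ2 = occurrences(s, p2)
    pairs = []
    for j in occ2:
        # Last occurrence of p1 that starts strictly before j.
        k = bisect_right(occ1, j - 1) - 1
        if k < 0:
            continue
        i = occ1[k]
        # First occurrence of p2 after i must be j itself,
        # otherwise another p2 occurrence lies in between.
        m = bisect_right(occ2, i)
        if occ2[m] != j:
            continue
        if alpha <= j - i <= beta:
            pairs.append((i, j))
    return pairs
```

For example, in `"axxbxaxb"` the consecutive occurrences of `"a"` and `"b"` are `(0, 3)` and `(5, 7)`, so a query with gap range `[1, 2]` keeps only the second pair. An existence or counting query follows by testing whether the returned list is non-empty or taking its length.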


Related resources

On the Gapped Consecutive-Ones Property

Motivated by problems of comparative genomics and paleogenomics, we introduce the Gapped Consecutive-Ones Property Problem (k,δ)-C1P: given a binary matrix M and two integers k and δ, can the columns of M be permuted such that each row contains at most k sequences of 1’s and no two consecutive sequences of 1’s are separated by a gap of more than δ 0’s. The classical C1P problem, which is known ...


Hardness Results for the Gapped Consecutive-Ones Property

Motivated by problems of comparative genomics and paleogenomics, in [6] the authors introduced the Gapped Consecutive-Ones Property Problem (k, δ)-C1P: given a binary matrix M and two integers k and δ, can the columns of M be permuted such that each row contains at most k blocks of ones and no two consecutive blocks of ones are separated by a gap of more than δ zeros. The classical C1P problem,...


Indexing Gapped-Factors Using a Tree

We present a data structure to index a specific kind of factors (that is, substrings) called gapped-factors. A gapped-factor is a factor containing a gap that is ignored during indexing. The data structure is based on the suffix tree and indexes all the gapped-factors of a text with a fixed gap size, and only those. The construction of this data structure is done online in ...


Hardness results on the gapped consecutive-ones property problem

Motivated by problems of comparative genomics and paleogenomics, in [6] the authors introduced the Gapped Consecutive-Ones Property Problem (k, δ)-C1P: given a binary matrix M and two integers k and δ, can the columns of M be permuted such that each row contains at most k blocks of ones and no two consecutive blocks of ones are separated by a gap of more than δ zeros. The classical C1P problem,...


Reporting Consecutive Substring Occurrences Under Bounded Gap Constraints

We study the problem of indexing a text T[1..n] such that whenever a pattern P[1..p] and an interval [α, β] come as a query, we can report all pairs (i, j) of consecutive occurrences of P in T with α ≤ j − i ≤ β. We present an O(n log n) space data structure with optimal O(p + k) query time, where k is the output size.
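The single-pattern query described above can be illustrated with a short brute-force sketch in Python. It scans the text at query time rather than using the O(n log n) space index from the abstract, so it shows only what is reported, not how to reach O(p + k) time. Positions are 0-indexed here (the abstract uses 1-indexed T[1..n]), and the function name is hypothetical.

```python
def consecutive_pairs(t, p, alpha, beta):
    """Report pairs (i, j) of consecutive occurrences of p in t with alpha <= j - i <= beta."""
    # Collect all start positions of p in increasing order (overlaps allowed).
    occ = []
    i = t.find(p)
    while i != -1:
        occ.append(i)
        i = t.find(p, i + 1)
    # Consecutive occurrences are neighbours in the sorted occurrence list.
    return [(i, j) for i, j in zip(occ, occ[1:]) if alpha <= j - i <= beta]
```

In `"abracadabra"` the pattern `"a"` occurs at positions 0, 3, 5, 7 and 10, so a query with interval `[2, 2]` reports `(3, 5)` and `(5, 7)`.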



Journal

Journal title: Algorithmica

Year: 2022

ISSN: 1432-0541, 0178-4617

DOI: https://doi.org/10.1007/s00453-022-01051-6