Gapped Indexing for Consecutive Occurrences
Abstract
The classic string indexing problem is to preprocess a string $$S$$ into a compact data structure that supports efficient pattern matching queries. Typical queries include existential queries (decide if the pattern occurs in $$S$$), reporting queries (return all positions where the pattern occurs), and counting queries (return the number of occurrences of the pattern). In this paper we consider a variant of string indexing, where the goal is to compactly represent the string such that given two patterns $$P_1$$ and $$P_2$$ and a gap range $$[\alpha, \beta]$$ we can quickly find the consecutive occurrences of $$P_1$$ and $$P_2$$ with distance in $$[\alpha, \beta]$$, i.e., pairs of subsequent occurrences with distance within the range. We present data structures that use linear space and query time $$\widetilde{O}(|P_1|+|P_2|+n^{2/3})$$ for existence and counting and $$\widetilde{O}(|P_1|+|P_2|+n^{2/3}\,\mathrm{occ}^{1/3})$$ for reporting. We complement this with a conditional lower bound based on the set intersection problem showing that any solution using $$\widetilde{O}(n)$$ space must use $$\widetilde{\Omega}(|P_1| + |P_2| + \sqrt{n})$$ query time. To obtain our results we develop new techniques and ideas of independent interest including a new suffix tree decomposition and hardness of a variant of the set intersection problem.
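To make the query concrete, here is a minimal brute-force sketch in Python. It is not the paper's data structure; it scans the whole text on every query and only serves as a reference for what a correct answer looks like. The function names are hypothetical. It assumes 0-based positions, that distance means the difference of start positions, and that a pair (i, j) is consecutive when i is an occurrence of $$P_1$$, j > i is an occurrence of $$P_2$$, and no occurrence of either pattern starts strictly between them.

```python
from bisect import bisect_right


def occurrences(text, pattern):
    """Return all 0-based start positions of pattern in text (overlaps included)."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def consecutive_gapped(text, p1, p2, alpha, beta):
    """Naive reference: consecutive (P1, P2) occurrences with alpha <= j - i <= beta.

    Assumption (not taken from the page): (i, j) is consecutive when no
    occurrence of p1 or p2 starts strictly between i and j.
    """
    occ1 = occurrences(text, p1)
    occ2 = occurrences(text, p2)
    pairs = []
    for idx, i in enumerate(occ1):
        k = bisect_right(occ2, i)          # first occurrence of p2 after i
        if k == len(occ2):
            break                          # no p2 occurrence to the right
        j = occ2[k]
        # Skip i if another p1 occurrence starts strictly between i and j.
        if idx + 1 < len(occ1) and occ1[idx + 1] < j:
            continue
        if alpha <= j - i <= beta:
            pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    text = "a-b--a---b-a-a-b"
    print(consecutive_gapped(text, "a", "b", 3, 5))
```

In the sample text, the occurrence of `a` at position 11 is not part of any pair because another `a` at position 13 precedes the next `b`. An index as described in the abstract would answer the same query without rescanning the text.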
Similar resources
On the Gapped Consecutive-Ones Property
Motivated by problems of comparative genomics and paleogenomics, we introduce the Gapped Consecutive-Ones Property Problem (k,δ)-C1P: given a binary matrix M and two integers k and δ, can the columns of M be permuted such that each row contains at most k sequences of 1’s and no two consecutive sequences of 1’s are separated by a gap of more than δ 0’s. The classical C1P problem, which is known ...
Hardness Results for the Gapped Consecutive-Ones Property
Motivated by problems of comparative genomics and paleogenomics, in [6] the authors introduced the Gapped Consecutive-Ones Property Problem (k, δ)-C1P: given a binary matrix M and two integers k and δ, can the columns of M be permuted such that each row contains at most k blocks of ones and no two consecutive blocks of ones are separated by a gap of more than δ zeros. The classical C1P problem,...
Indexing Gapped-Factors Using a Tree
We present a data structure to index a specific kind of factors, that is, of substrings, called gapped-factors. A gapped-factor is a factor containing a gap that is ignored during indexing. The data structure presented is based on the suffix tree and indexes all the gapped-factors of a text with a fixed size of gap, and only those. The construction of this data structure is done online in ...
Hardness results on the gapped consecutive-ones property problem
Motivated by problems of comparative genomics and paleogenomics, in [6] the authors introduced the Gapped Consecutive-Ones Property Problem (k, δ)-C1P: given a binary matrix M and two integers k and δ, can the columns of M be permuted such that each row contains at most k blocks of ones and no two consecutive blocks of ones are separated by a gap of more than δ zeros. The classical C1P problem,...
Reporting Consecutive Substring Occurrences Under Bounded Gap Constraints
We study the problem of indexing a text T[1..n] such that whenever a pattern P[1..p] and an interval [α, β] come as a query, we can report all pairs (i, j) of consecutive occurrences of P in T with α ≤ j − i ≤ β. We present an O(n log n) space data structure with optimal O(p + k) query time, where k is the output size.
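The single-pattern query described above can be illustrated with a short brute-force sketch in Python. It is not the O(n log n)-space index from that paper, only a linear-scan reference for the expected output. The function name is hypothetical, and positions are 0-based here, whereas the abstract indexes T from 1.

```python
def consecutive_pairs(text, pattern, alpha, beta):
    """Naive reference: pairs (i, j) of consecutive occurrences of pattern
    in text (0-based start positions) with alpha <= j - i <= beta."""
    occ = []
    start = text.find(pattern)
    while start != -1:
        occ.append(start)
        start = text.find(pattern, start + 1)
    # Consecutive occurrences are neighbours in the sorted occurrence list.
    return [(i, j) for i, j in zip(occ, occ[1:]) if alpha <= j - i <= beta]


if __name__ == "__main__":
    print(consecutive_pairs("abab..ab....ab", "ab", 3, 6))
```

Since the pairs are neighbours in the sorted list of occurrences, a text with m occurrences of the pattern has at most m − 1 candidate pairs, and the gap interval filters those.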
Journal
Journal title: Algorithmica
Year: 2022
ISSN: 1432-0541, 0178-4617
DOI: https://doi.org/10.1007/s00453-022-01051-6