Parallel Suffix Sorting

Authors

  • Natsuhiko Futamura
  • Srinivas Aluru
  • Stefan Kurtz
Abstract

We present a parallel algorithm for lexicographically sorting the suffixes of a string. Suffix sorting has applications in string processing, data compression, and computational biology. The lexicographically ordered list of the suffixes of a string, stored in an array, is known as the suffix array, an important data structure in string processing and computational biology. Our focus is on deriving a practical implementation that works well for typical inputs, rather than on achieving the best possible asymptotic running time for artificial, worst-case inputs. We experimentally evaluated our algorithm on an IBM SP-2 using the genomes of several organisms. Our experiments show that the algorithm delivers good, scalable performance.
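To make the definition of a suffix array concrete, here is a minimal sequential sketch in Python. It is not the parallel algorithm described in the abstract: it simply sorts suffix start positions by comparing the suffixes directly, which is easy to read but slow for long inputs. The function name `suffix_array` is chosen here for illustration only.

```python
def suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`,
    ordered so that the suffixes are in lexicographic order.

    Naive approach: each comparison may scan a whole suffix, so this
    is only suitable for small illustrative inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


if __name__ == "__main__":
    text = "banana"
    sa = suffix_array(text)
    # List each suffix next to its start position, in sorted order.
    for start in sa:
        print(start, text[start:])
```

For the string `banana`, the sorted suffixes are `a`, `ana`, `anana`, `banana`, `na`, `nana`, so the resulting array of start positions is `[5, 3, 1, 0, 4, 2]`.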


Related resources

Simple Linear Work Suffix Array Construction

A suffix array represents the suffixes of a string in sorted order. Being a simpler and more compact alternative to suffix trees, it is an important tool for full text indexing and other string processing tasks. We introduce the skew algorithm for suffix array construction over integer alphabets that can be implemented to run in linear time using integer sorting as its only nontrivial subroutin...


Speeding up Index Construction with GPU for DNA Data Sequences

Advances in technology have produced terabytes of biological data in the scientific community, including DNA sequences. String matching algorithms traditionally used to match DNA sequences now take much longer to execute because of the large size of DNA data and the small alphabet size. To overcome this problem, the indexing methods such as suffix arrays or ...


Linear-time Suffix Sorting - A New Approach for Suffix Array Construction

This thesis presents a new approach for linear-time suffix sorting. It introduces a new sorting principle that can be used to build the first non-recursive linear-time suffix array construction algorithm, named GSACA. Although GSACA cannot match the performance of state-of-the-art suffix array construction algorithms, the algorithm introduces a lot of new ideas for suffix array constructi...


An Algorithm for Suffix Sorting and Its Applications

The suffix tree is a data structure that has found applications in various important problems, such as genetic sequencing, pattern matching and computational biology. Its derivative data structure, the suffix array, is another representation with the added advantage of a small memory footprint. We propose a simple O(n log n) time divide-and-conquer sort-and-merge algorithm for solving the suffix...


Direct Suffix Sorting and Its Applications




Publication date: 2001