DNA Sequencing - Tabu and Scatter Search Combined

Authors

  • Jacek Blazewicz
  • Fred Glover
  • Marta Kasprzak
Abstract

1. Biochemical Preliminaries and Problem Formulation

DNA sequencing is one of the most important problems in computational molecular biology. The goal is to determine the sequence of nucleotides of a DNA fragment. Such a fragment is usually written as a sequence of the letters A, C, G, and T, representing the four nucleotides composing the fragment, i.e., adenine, cytosine, guanine, and thymine, respectively. A short sequence of nucleotides is called an oligonucleotide. The sequencing process uses as input data a set of oligonucleotides of equal length, which are subsequences of one strand of the examined DNA fragment and are derived from a hybridization experiment. Next, the original sequence of a known length is reconstructed, taking advantage of the fact that the oligonucleotides overlap one another.

In the hybridization experiment (Bains and Smith 1988, Lysov et al. 1988, Southern 1988, Drmanac et al. 1989), a complete oligonucleotide library is compared with many copies of one strand of the examined DNA fragment. The library consists of all 4^l short single-stranded DNA fragments of length l. In order to use the library, its fragments are constructed in a special way on a DNA chip (Southern 1988, Fodor et al. 1991, Pease et al. 1994), where each element of the library has unique coordinates on the chip. During the hybridization reaction, copies of the longer DNA fragment join to oligonucleotides from the library at their complementary locations. Then, as a result of reading a fluorescent image of the chip, one obtains a set of oligonucleotides that are subfragments of the examined DNA fragment. This set is named the spectrum.

If the hybridization experiment were executed without errors, then the spectrum would be ideal, i.e., it would contain exactly the subsequences of length l of the original sequence of known length n.
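As a rough illustration of these definitions, the following Python sketch (not taken from the paper) builds a complete library, derives the ideal spectrum of a short sequence, and rebuilds the sequence from that spectrum. The function names full_library, ideal_spectrum and reconstruct are hypothetical, and the sequence ACTCTGG with l = 3 is used only as a small test case. The sketch assumes an error-free spectrum in which consecutive elements overlap on l - 1 nucleotides, and its backtracking search is a naive, exponential-time stand-in, not one of the cited exact methods and not the tabu and scatter search approach of the paper.

```python
from itertools import product

ALPHABET = "ACGT"


def full_library(l):
    """All 4**l oligonucleotides of length l over the alphabet A, C, G, T."""
    return ["".join(p) for p in product(ALPHABET, repeat=l)]


def ideal_spectrum(sequence, l):
    """Set of all length-l subsequences (substrings) of the sequence.

    Models a hybridization experiment performed without errors.
    """
    return {sequence[i:i + l] for i in range(len(sequence) - l + 1)}


def reconstruct(spectrum):
    """Naive backtracking: order all spectrum elements so that each
    element's last l-1 letters equal the next element's first l-1 letters.

    Returns one reconstructed sequence, or None if no ordering exists.
    Illustration only: the running time is exponential in the worst case.
    """
    elements = sorted(spectrum)

    def extend(path, used):
        if len(path) == len(elements):
            # Merge overlapping elements: first element plus the last
            # letter of every following element.
            return path[0] + "".join(e[-1] for e in path[1:])
        for e in elements:
            if e not in used and path[-1][1:] == e[:-1]:
                result = extend(path + [e], used | {e})
                if result is not None:
                    return result
        return None

    for start in elements:
        result = extend([start], {start})
        if result is not None:
            return result
    return None


if __name__ == "__main__":
    spectrum = ideal_spectrum("ACTCTGG", 3)
    print(len(full_library(3)), sorted(spectrum), reconstruct(spectrum))
```

For realistic spectra, which contain positive and negative errors, this simple overlap test is no longer sufficient; that harder setting is what motivates heuristic search.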
In this case, the spectrum consists of n - l + 1 elements, and to reconstruct the original sequence one must find an order of spectrum elements such that neighboring elements always overlap on l - 1 nucleotides (see Example 1). There are several exact methods for solving the DNA sequencing problem with the ideal spectrum, described for example in Bains and Smith (1988), Lysov et al. (1988), or Drmanac et al. (1989), but only the one proposed in Pevzner (1989) works in polynomial time.

Example 1. Suppose the original sequence to be found is ACTCTGG, n = 7. In the hybridization experiment one can use, for example, the complete library of oligonucleotides of length l = 3, composed of the following 4^3 = 64 oligonucleotides: {AAA, AAC, AAG, AAT, ACA, ..., TTG, TTT}. As a result of the experiment performed without errors one obtains the ideal spectrum for this sequence, containing all


Similar resources

Evolutionary Approaches to DNA Sequencing with Errors

In the paper, two evolutionary approaches to the general DNA sequencing problem, assuming both negative and positive errors in the spectrum, are compared. The older of them is based on the idea of genetic approach and is enhanced by a greedy algorithm. The newly proposed algorithm combines the tabu search and the scatter search methods. After conducting experiments with random and coding DNA se...


A Scatter Search Variant to Solve max-SAT Problems

In this work, a new scatter search-based approach is studied for the NP-hard satisfiability problems, in particular for their optimization version, namely max-SAT. The paper describes a scatter search algorithm enhanced with a tabu search component combined with a uniform crossover operator. The latter is used to identify promising search regions while tabu search performs an intensified search of...


Sequencing by hybridization: an enhanced crossover operator for a hybrid genetic algorithm

This paper presents a genetic algorithm for an important computational biology problem. The problem appears in the computational part of a new proposal for DNA sequencing denominated sequencing by hybridization. The general usage of this method for real sequencing purposes depends mainly on the development of good algorithmic procedures for solving its computational phase. The proposed genetic ...


Hybrid scatter tabu search for unconstrained global optimization

The problem of finding a global optimum of an unconstrained multimodal function has been the subject of intensive study in recent years, giving rise to valuable advances in solution methods. We examine this problem within the framework of adaptive memory programming (AMP), focusing particularly on AMP strategies that derive from an integration of Scatter Search and Tabu Search. Computational co...


A heuristic managing errors for DNA sequencing

MOTIVATION: A new heuristic algorithm for solving the DNA sequencing by hybridization problem with positive and negative errors. RESULTS: A heuristic algorithm providing better solutions than algorithms known from the literature based on the tabu search method.


Simulation/Optimization Using “Real-World” Applications

This tutorial will focus on several new real-world applications that have been developed using an integrated set of methods, including Tabu Search, Scatter Search, Mixed Integer Programming, and Neural Networks, combined with simulation. Applications include project portfolio optimization and customer relationship management.



Journal:
  • INFORMS Journal on Computing

Volume 16, Issue (not given)

Pages -

Publication date: 2004