A computer program for the estimation of protein and nucleic acid sequence diversity in random point mutagenesis libraries

Authors

  • Michael J. Volles
  • Peter T. Lansbury
Abstract

A computer program for the generation and analysis of in silico random point mutagenesis libraries is described. The program operates by mutagenizing an input nucleic acid sequence according to mutation parameters specified by the user for each sequence position and type of point mutation. The program can mimic almost any type of random mutagenesis library, including those produced via error-prone PCR (ep-PCR), mutator Escherichia coli strains, chemical mutagenesis, and doped or random oligonucleotide synthesis. The program analyzes the generated nucleic acid sequences and/or the associated protein library to produce several estimates of library diversity (number of unique sequences, point mutations, and single point mutants) and the rate of saturation of these diversities during experimental screening or selection of clones. This information allows one to select the optimal screen size for a given mutagenesis library, necessary to efficiently obtain a certain coverage of the sequence-space. The program also reports the abundance of each specific protein mutation at each sequence position, which is useful as a measure of the level and type of mutation bias in the library. Alternatively, one can use the program to evaluate the relative merits of preexisting libraries, or to examine various hypothetical mutation schemes to determine the optimal method for creating a library that serves the screen/selection of interest. Simulated libraries of at least 10^9 sequences are accessible by the numerical algorithm with currently available personal computers; an analytical algorithm is also available which can rapidly calculate a subset of the numerical statistics in libraries of arbitrarily large size. A multi-type double-strand stochastic model of ep-PCR is developed in an appendix to demonstrate the applicability of the algorithm to amplifying mutagenesis procedures. Estimators of DNA polymerase mutation-type-specific error rates are derived using the model. Analyses of an alpha-synuclein ep-PCR library and NNS synthetic oligonucleotide libraries are given as examples.
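
The numerical approach summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program: it assumes a single uniform substitution rate applied at every position, whereas the described program takes separate parameters for each position and mutation type. The function names (`mutate`, `simulate_library`, `unique_after_screening`), the parent sequence, and the rate are all hypothetical choices made for illustration.

```python
import random

BASES = "ACGT"


def mutate(seq, rate, rng):
    """Return a copy of seq with each base independently substituted
    with probability `rate` (assumption: one uniform rate, and the new
    base is drawn uniformly from the three other bases)."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_library(seq, rate, size, seed=0):
    """Generate `size` in silico clones from the parent sequence."""
    rng = random.Random(seed)
    return [mutate(seq, rate, rng) for _ in range(size)]


def unique_after_screening(library, screen_sizes):
    """For each screen size n, count the distinct sequences found
    among the first n clones of the library."""
    targets = set(screen_sizes)
    seen = set()
    counts = {}
    for i, clone in enumerate(library, start=1):
        seen.add(clone)
        if i in targets:
            counts[i] = len(seen)
    return counts


if __name__ == "__main__":
    parent = "ATGGCTAAAGGTTTC"  # arbitrary illustrative sequence
    library = simulate_library(parent, rate=0.05, size=10000, seed=1)
    for n, unique in unique_after_screening(library, [100, 1000, 10000]).items():
        print(f"screened {n} clones: {unique} unique sequences")
```

Plotting the unique-sequence count against the number of clones screened gives a saturation curve of the kind the abstract describes, from which a screen size can be chosen for a desired coverage.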


Similar resources

Study of pH influence on the stability of 175th codon of P53 genes by computational and modeling methods

The P53 tumor suppressor gene, also known as the “genome guardian”, is mutated in more than half of all kinds of cancer. In this study we investigated the influence of environmental pH on P53 gene mutation at a specific sequence that is prone to mutagenesis. The most probable cancerous mutations occur as point mutations in exons 5-8 of the P53 gene. The 175th codon of P53 is the third most mutated ...

GLUE-IT and PEDEL-AA: new programmes for analyzing protein diversity in randomized libraries

There are many methods for introducing random mutations into nucleic acid sequences. Previously, we described a suite of programmes for estimating the completeness and diversity of randomized DNA libraries generated by a number of these protocols. Our programmes suggested some empirical guidelines for library design; however, no information was provided regarding library diversity at the protei...

A Single Point Mutation within the Coding Sequence of Cholera Toxin B Subunit Will Increase Its Expression Yield

Background: Cholera toxin B subunit (CTB) has been extensively considered as an immunogenic and adjuvant protein, but its yield of expression is not satisfactory in many studies. The aim of this study was to compare the expression of native and mutant recombinant CTB (rCTB) in pQE vector. Methods: ctxB fragment from Vibrio cholerae O1 ATCC14035 containing the substitution of mutant ctxB for ami...

Improving the Performance of Bayesian Estimation Methods in Estimations of Shift Point and Comparison with MLE Approach

A Bayesian analysis is used to detect a change-point in a sequence of independent random variables from exponential distributions. In this paper, we try to estimate the change point that occurs in any sequence of independent exponential observations. The Bayes estimators are derived for change point, the rate of exponential distribution before shift and the rate of exponential distribution after s...

Introduction of restriction enzyme sites in protein-coding DNA sequences by site-specific mutagenesis not affecting the amino acid sequence: a computer program

Structure/function relationship studies of proteins are greatly facilitated by recombinant DNA technology which allows specific amino acid mutations to be made at the DNA sequence level by site-specific mutagenesis employing synthetic oligonucleotides. This technique has been successfully used to alter one or two amino acids in a protein. Replacement of existing DNA sequence coding for several ...

Journal title:
  • Nucleic Acids Research

Volume 33  Issue 

Pages  -

Publication date 2005