PairsDB atlas of protein sequence space
Authors
Abstract
Sequence similarity/database searching is a cornerstone of molecular biology. PairsDB is a database intended to make exploring protein sequences and their similarity relationships quick and easy. Behind PairsDB is a comprehensive collection of protein sequences and the BLAST and PSI-BLAST alignments between them. Instead of running BLAST or PSI-BLAST individually on each request, results are retrieved instantaneously from a database of pre-computed alignments. Filtering options allow you to find a set of sequences satisfying a set of criteria, for example, all human proteins with solved structure and without transmembrane segments. PairsDB is continually updated and covers all sequences in UniProt. The data is stored in a MySQL relational database. Data files will be made available for download at ftp://nic.funet.fi/pub/sci/molbio. PairsDB can also be accessed interactively at http://pairsdb.csc.fi. PairsDB data is a valuable platform for building various downstream automated analysis pipelines. For example, the graph of all-against-all similarity relationships is the starting point for clustering protein families, delineating domains, improving alignment accuracy by consistency measures, and defining orthologous genes. Moreover, query-anchored stacked sequence alignments, profiles and consensus sequences are useful in studies of sequence conservation patterns for clues about possible functional sites.
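The lookup-and-filter idea described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the table names (`sequences`, `pairs`), their columns, the toy rows, and the helper functions are hypothetical and do not reflect the actual PairsDB schema. Python's built-in `sqlite3` module stands in for MySQL so the example runs without a server. The last function applies naive single-linkage clustering (connected components) to the similarity graph; this is only a simple stand-in for the family-clustering step the abstract mentions, not the authors' method.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sequences (
            id TEXT PRIMARY KEY,
            organism TEXT,
            has_structure INTEGER,  -- 1 if a solved structure exists
            has_tm INTEGER          -- 1 if transmembrane segments are predicted
        );
        CREATE TABLE pairs (query TEXT, hit TEXT, evalue REAL);
        CREATE INDEX pairs_query ON pairs (query);
    """)
    # Toy data, invented for this example.
    conn.executemany("INSERT INTO sequences VALUES (?, ?, ?, ?)", [
        ("A", "Homo sapiens", 1, 0),
        ("B", "Homo sapiens", 1, 1),
        ("C", "Mus musculus", 0, 0),
        ("D", "Homo sapiens", 0, 0),
    ])
    conn.executemany("INSERT INTO pairs VALUES (?, ?, ?)", [
        ("A", "B", 1e-30), ("B", "A", 1e-30),
        ("A", "C", 1e-5), ("C", "A", 1e-5),
    ])
    return conn


def neighbours(conn, query_id, max_evalue=1e-3):
    """Look up stored hits for a query instead of re-running a search."""
    rows = conn.execute(
        "SELECT hit FROM pairs WHERE query = ? AND evalue <= ? "
        "ORDER BY evalue, hit",
        (query_id, max_evalue),
    )
    return [row[0] for row in rows]


def filter_sequences(conn, organism, has_structure, has_tm):
    """Return ids of sequences matching all three annotation criteria."""
    rows = conn.execute(
        "SELECT id FROM sequences "
        "WHERE organism = ? AND has_structure = ? AND has_tm = ? ORDER BY id",
        (organism, int(has_structure), int(has_tm)),
    )
    return [row[0] for row in rows]


def single_linkage_clusters(conn, max_evalue=1e-3):
    """Group sequences into connected components of the similarity graph."""
    parent = {row[0]: row[0] for row in conn.execute("SELECT id FROM sequences")}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = conn.execute(
        "SELECT query, hit FROM pairs WHERE evalue <= ?", (max_evalue,)
    ).fetchall()
    for query, hit in edges:
        parent[find(query)] = find(hit)

    clusters = {}
    for seq_id in parent:
        clusters.setdefault(find(seq_id), []).append(seq_id)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    db = build_demo_db()
    print(neighbours(db, "A"))
    print(filter_sequences(db, "Homo sapiens", True, False))
    print(single_linkage_clusters(db))
```

In this toy setup, tightening `max_evalue` splits clusters apart, which shows why the choice of similarity threshold matters for any downstream family definition built on the all-against-all graph.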
Similar resources
ADDA: a domain database with global coverage of the protein universe
We used the Automatic Domain Decomposition Algorithm (ADDA) to generate a database of protein domain families with complete coverage of all protein sequences. Sequences are split into domains and domains are grouped into protein domain families in a completely automated process. The current database contains domains for more than 1.5 million sequences in more than 40,000 domain families. In par...
Expression and Secretion of Human Granulocyte Macrophage-Colony Stimulating Factor Using Escherichia coli Enterotoxin I Signal Sequence
With the aim of secreting human granulocyte macrophage-colony stimulating factor (hGM-CSF) in Escherichia coli, hGM-CSF cDNA was fused in-frame next to the signal sequence of ST toxin (ST-I) of enterotoxigenic E. coli, containing 53 or 19 amino acids of signal peptide. The fused STsig::hGM-CSF coding fragments were inserted into a T7-based expression plasmid. The recombinant plasmids were ...
Design and construction of a clone expressing the anticoagulant drug desirudin (hirudin) extracellularly in Escherichia coli
Background and purpose: Hirudin is a 65-66 amino acid polypeptide that is secreted as an anticoagulant compound from the salivary glands of the medicinal leech. This drug is a very potent inhibitor of thrombin and is highly effective for the prevention of arterial and venous thrombosis. Therefore, it can compete with heparin. The aim of this study was to add a pelB signal peptide to pET-22b plasmid and to investi...
On the fine spectra of the generalized difference operator Delta_{uv} over the sequence space c0
The main purpose of this paper is to determine the fine spectrum of the generalized difference operator Delta_{uv} over the sequence space c0. These results are more general than the fine spectrum of the generalized difference operator Delta_{uv} of Srivastava and Kumar.
On the fine spectra of the Zweier matrix as an operator over the weighted sequence space $l_{p}(w)$
In the present paper, the fine spectrum of the Zweier matrix as an operator over the weighted sequence space $l_{p}(w)$ has been examined.
Journal title:
Volume 36, issue
Pages -
Publication date: 2008