Identifying foldable regions in protein sequence from the hydrophobic signal

نویسندگان

  • Chi N.I. Pang
  • Kuang Lin
  • Merridee A. Wouters
  • Jaap Heringa
  • Richard A. George
چکیده

Structural genomics initiatives aim to elucidate representative 3D structures for the majority of protein families over the next decade, but many obstacles must be overcome. The correct design of constructs is extremely important since many proteins will be too large or contain unstructured regions and will not be amenable to crystallization. It is therefore essential to identify regions in protein sequences that are likely to be suitable for structural study. Scooby-Domain is a fast and simple method to identify globular domains in protein sequences. Domains are compact units of protein structure and their correct delineation will aid structural elucidation through a divide-and-conquer approach. Scooby-Domain predictions are based on the observed lengths and hydrophobicities of domains from proteins with known tertiary structure. The prediction method employs an A*-search to identify sequence regions that form a globular structure and those that are unstructured. On a test set of 173 proteins with consensus CATH and SCOP domain definitions, Scooby-Domain has a sensitivity of 50% and an accuracy of 29%, which is better than current state-of-the-art methods. The method does not rely on homology searches and, therefore, can identify previously unknown domains.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Comprehensive Repertoire of Foldable Regions within Whole Genomes

In order to get a comprehensive repertoire of foldable domains within whole proteomes, including orphan domains, we developed a novel procedure, called SEG-HCA. From only the information of a single amino acid sequence, SEG-HCA automatically delineates segments possessing high densities in hydrophobic clusters, as defined by Hydrophobic Cluster Analysis (HCA). These hydrophobic clusters mainly ...

متن کامل

Cloning and molecular characterization of TaERF6, a gene encoding a bread wheat ethylene response factor

Ethylene response factor proteins are important for regulating gene expression under different stresses. Different isoforms for ERF have previously isolated from bread wheat (Triticum aestivum L.) and related genera and called from TaERF1 to TaERF5. We isolated, cloned and molecular characterized a novel one based on TdERF1, an isoform in durum wheat (Tri...

متن کامل

Expression and Secretion of Human Granulocyte Macrophage-Colony Stimulating Factor Using Escherichia coli Enterotoxin I Signal Sequence

With the aim of the secretion of human granulocyte macrophage-colony stimulating factor (hGM-CSF) in Escherichia coli, hGM-CSF cDNA was fused in-frame next to the signal sequence of ST toxin (ST-I) of exteroxigenic E. coli, containing 53 or 19 amino acids of signal peptide. The fused STsig::hGM-CSF coding fragments were inserted into a T7-based expression plasmid. The recombinant plasmids were ...

متن کامل

Cloning and characterization of MAP2191 gene, a mammalian cell entry antigen of Mycobacterium avium subspecies paratuberculosis

The aim of this study is to identify, clone and express a Mycobacterium avium subsp. paratuberculosis specific immunogenic antigen candidate, in order to develop better reagents for diagnosis and vaccines for the protection of the host. Therefore, MAP2191 gene (a member of MAPmce5 operon) from MAP, was isolated and characterized by Bioinformatics tools and <e...

متن کامل

تخمین مکان نواحی کدکننده پروتئین در توالی عددی DNA با استفاده پنجره با طول متغیر بر مبنای منحنی سه بعدی Z

In recent years, estimation of protein-coding regions in numerical deoxyribonucleic acid (DNA) sequences using signal processing tools has been a challenging issue in bioinformatics, owing to their 3-base periodicity. Several digital signal processing (DSP) tools have been applied in order to Identify the task and concentrated on assigning numerical values to the symbolic DNA sequence, then app...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 36  شماره 

صفحات  -

تاریخ انتشار 2008