Detection of protein secondary structures via the discrete wavelet transform. Journal Article

Overview

abstract

  • We subject the primary sequence of proteins gathered from the Structural Classification of Proteins (SCOP) database to a discrete wavelet transform (DWT) analysis to search for predictors of secondary structures. We use proteins with both alpha helices and beta sheets (the A/B and A+B databases from SCOP). The amino acids composing the protein are converted to their hydrophobicity values using three hydrophobicity scales. Results prove to be independent of the scale used. Using a DWT multiresolution decomposition, each protein is coarse grained, in effect creating snapshots of each protein at multiple scales. For each protein, a control data set is formed by generating random realizations that remove the positional information in the sequence but still contain the same amino acid frequencies. Regions of salient hydrophobicity in the protein sequence are identified by comparing the transforms of the original sequence with those of the control set, at each resolution. We find significant matching between regions of salient hydrophobicity and the locations of secondary structure along the amino acid chains. We calculate the sensitivity, specificity, and Matthews correlation to quantify the agreement between the wavelet-detected structures and the real protein. In addition, we are able to distinguish between the morphologically different subsets, A/B and A+B. We also construct a correlation function based on the DWT that correlates quasilocalized structures at lengths in wavelet space. Through a similar comparison to the control data sets, features in this space-scale correlation are identified that show correspondence to the typical lengths of the secondary structures.
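
The pipeline described in the abstract (hydrophobicity encoding, multiresolution DWT, shuffled controls, and agreement scoring) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not name the wavelet, the three hydrophobicity scales, or the significance criterion, so the Haar wavelet, the Kyte-Doolittle scale, the 95th-percentile threshold, and all function names here (`hydrophobicity`, `haar_details`, `salient_regions`, `agreement_scores`) are assumptions.

```python
import math
import numpy as np

# Kyte-Doolittle hydropathy values (assumed scale; the abstract does not name its three scales).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}

def hydrophobicity(seq):
    """Map a one-letter amino acid sequence to its hydrophobicity values."""
    return np.array([KD[a] for a in seq], dtype=float)

def haar_details(signal, levels):
    """Return Haar detail coefficients per level (level 0 is the finest scale)."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        if len(approx) % 2:                      # pad odd lengths by repeating the last value
            approx = np.append(approx, approx[-1])
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / math.sqrt(2))
        approx = (even + odd) / math.sqrt(2)     # coarse-grained sequence for the next level
    return details

def salient_regions(seq, levels=3, n_shuffles=200, quantile=0.95, seed=0):
    """Flag coefficients whose magnitude exceeds the shuffled-control quantile at each level."""
    rng = np.random.default_rng(seed)
    x = hydrophobicity(seq)
    real = haar_details(x, levels)
    # Controls: shuffling removes positional information but keeps amino acid frequencies.
    controls = [haar_details(rng.permutation(x), levels) for _ in range(n_shuffles)]
    masks = []
    for j in range(levels):
        null = np.abs(np.concatenate([c[j] for c in controls]))
        masks.append(np.abs(real[j]) > np.quantile(null, quantile))
    return masks

def agreement_scores(pred, truth):
    """Return (sensitivity, specificity, Matthews correlation) for two boolean masks."""
    pred, truth = np.asarray(pred, dtype=bool), np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))
    tn = int(np.sum(~pred & ~truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sens, spec, mcc
```

Calling `salient_regions(seq)` on a one-letter sequence returns one boolean mask per resolution level; a mask could then be mapped back to residue positions and compared with annotated helix and sheet locations via `agreement_scores`. Pooling the control coefficients per level is one simple choice of null distribution; a position-wise comparison would be an equally plausible reading of the abstract.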

publication date

  • November 1, 2009

has restriction

  • closed

Date in CU Experts

  • October 29, 2013 3:01 AM

Full Author List

  • Pando J; Sands L; Shaheen SE

author count

  • 3

citation count

  • 0

Other Profiles

International Standard Serial Number (ISSN)

  • 1539-3755

Electronic International Standard Serial Number (EISSN)

  • 1550-2376

Additional Document Info

start page

  • 051909

volume

  • 80

issue

  • 5 Pt 1