Protein threading by nonlinearly combining evolutionary and non-evolutionary information
Speaker: Jinbo Xu, Toyota Technological Institute at Chicago
Date: March 30 2009
Time: 11:30AM to 1:00PM
Location: Stata Center 32-G575
Host: Bonnie Berger & Peter Clote, MIT - BC
Contact: Patrice Macaluso, 617.253.3037, firstname.lastname@example.org
Relevant URL: http://www-math.mit.edu/compbiosem/
Proteins play fundamental roles in all biological processes. Like the complete sequencing of genomes, a complete description of protein structures is a fundamental step towards understanding biological life, and it is also highly relevant to the development of therapeutics and drugs. Computational methods, especially template-based modeling, can quickly generate crude but useful structure models at a large scale. The challenge of template-based modeling lies in recognizing the correct templates and generating accurate sequence-template alignments. Evolutionary information (i.e., sequence profiles) has proved very powerful in detecting remote homologs, as demonstrated by the state-of-the-art profile-based method HHpred. However, many proteins, even in the PDB, still lack good sequence profiles.
We present a new protein threading method for proteins without good sequence profiles that nonlinearly combines evolutionary and non-evolutionary information. In particular, we model protein threading with a probabilistic graphical model, Conditional (Markov) Random Fields, which guides sequence-template alignment using a nonlinear scoring function consisting of a collection of regression trees. Each regression tree estimates the log-likelihood of an alignment state from both evolutionary and non-evolutionary information.
Experimental results indicate that when evolutionary information is insufficient, the new method greatly outperforms HHpred in both alignment accuracy and model quality, and that non-evolutionary information is helpful for around half of the templates. The paradigm presented here for designing a nonlinear scoring function is very general: it can also be applied to protein sequence alignment and RNA alignment.
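To make the idea of a tree-based nonlinear alignment score concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the speaker's method: the abstract gives no model details, so the CRF is replaced by a plain Needleman-Wunsch-style dynamic program, the two "regression trees" are hand-written toy functions rather than trained ones, and the features (a single profile value and a secondary-structure letter per position), the gap penalty, and all names are hypothetical.

```python
from typing import List, Tuple

# One position = (toy profile value in [0, 1], secondary structure 'H'/'E'/'C').
# The profile value stands in for evolutionary information, the structure
# letter for non-evolutionary information. Both are made-up illustrations.
Position = Tuple[float, str]

GAP = -1.0  # hypothetical constant gap penalty


def tree_profile(s: Position, t: Position) -> float:
    """Toy tree that splits only on the evolutionary feature."""
    diff = abs(s[0] - t[0])
    if diff < 0.2:
        return 1.0
    return 0.2 if diff < 0.5 else -1.0


def tree_structure(s: Position, t: Position) -> float:
    """Toy tree that splits on both features, so the combination is nonlinear."""
    if s[1] == t[1]:
        # Structure agreement counts more when the profile signal is weak.
        return 1.5 if abs(s[0] - t[0]) >= 0.2 else 0.5
    return -0.5


TREES = [tree_profile, tree_structure]


def match_score(s: Position, t: Position) -> float:
    """Score of aligning two positions: the sum of all tree outputs."""
    return sum(tree(s, t) for tree in TREES)


def align(seq: List[Position], tpl: List[Position]):
    """Global alignment by dynamic programming; returns (score, aligned pairs)."""
    n, m = len(seq), len(tpl)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], back[i][0] = i * GAP, "U"
    for j in range(1, m + 1):
        score[0][j], back[0][j] = j * GAP, "L"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (score[i - 1][j - 1] + match_score(seq[i - 1], tpl[j - 1]), "M"),
                (score[i - 1][j] + GAP, "U"),  # sequence residue left unaligned
                (score[i][j - 1] + GAP, "L"),  # template residue left unaligned
            ]
            score[i][j], back[i][j] = max(candidates, key=lambda c: c[0])
    # Trace back from the bottom-right corner to recover the matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "M":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "U":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    seq = [(0.9, "H"), (0.5, "E"), (0.1, "C")]
    tpl = [(0.9, "H"), (0.95, "C"), (0.5, "E"), (0.1, "C")]
    print(align(seq, tpl))
```

In this toy setup the second tree is what makes the score nonlinear: the value of a structure match depends on how strong the profile signal is, which a weighted sum of the two features could not express.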
Part of the Bioinformatics Seminar Series, Spring 2009.