Clustering protein sequences with a novel metric transformed
from sequence similarity scores and sequence alignments with neural
networks

Speaker: Qicheng Ma, Novartis
Date: May 10 2004
Time: 11:30AM to 12:45PM
Location: 2-338
Host: Peter Clote (BC) & Bonnie Berger (MIT Math & CSAIL)
Contact: Kathleen Dickey, 617 253-3037, kvdickey@mit.edu
Relevant URL: http://www-math.mit.edu/compbiosem/
Abstract:
Clustering of protein sequences from different organisms has been used to identify orthologous
and paralogous protein sequences, to find protein sequences unique to an organism, and to
derive the phylogenetic profile for a cluster of protein sequences. These are some of the essential
components of a comparative genomics study of protein sequences across several genomes.
Algorithms used to cluster protein sequences can be either domain-based or family-based. All the
clustering methods start with an all-against-all pairwise protein sequence similarity search. The
domain-based clustering methods organize the protein sequence universe into domain clusters
where domains are the structural units of proteins, e.g., COG. Family-based clustering methods
group protein sequences into families, which contain a group of evolutionarily related proteins that
share similar domain architecture, e.g., PROTONET.
We propose a novel family-based clustering method to address two problems: how to detect
whether two aligned sequences have similar domain structures; and how to quantify transitive
homologies through intermediate sequences to detect remote homologies at the superfamily
level. These two problems are simultaneously solved by a new metric for clustering of protein
sequences.
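As a rough illustration of transitive homology through intermediate sequences, the sketch below links sequences whose pairwise similarity passes a cutoff and reports the connected components (single-linkage clustering). This is not the metric proposed in the talk, which the abstract does not specify. The function name, the toy scores, and the threshold are all hypothetical.

```python
from itertools import combinations


def single_linkage_clusters(ids, scores, threshold):
    """Group ids into connected components of the graph whose edges are
    the pairs with similarity >= threshold (generic single linkage)."""
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All-against-all pass over sequence pairs; missing pairs score 0.
    for a, b in combinations(ids, 2):
        s = scores.get((a, b), scores.get((b, a), 0.0))
        if s >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical similarity scores in [0, 1]; A and C are only weakly
# similar directly but are joined through the intermediate sequence B.
toy_scores = {("A", "B"): 0.9, ("B", "C"): 0.8, ("A", "C"): 0.1, ("C", "D"): 0.2}
clusters = single_linkage_clusters(["A", "B", "C", "D"], toy_scores, threshold=0.5)
print(clusters)
```

With these toy scores, A and C end up in the same cluster even though their direct score is below the cutoff, because both are similar to B. Plain single linkage applies a hard cutoff and does not check domain architecture, which are the two issues the abstract says the new metric addresses.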
Refreshments: 11 am in the Applied Mathematics Common Room at MIT's Building 2, Room 349
Talk: 11:30 am to 1 pm in the Applied Mathematics Conference Room Building 2, Room 338
The seminar is co-hosted by Professor Peter Clote of Boston College's Biology and Computer Science
Departments and MIT Professor of Applied Math Bonnie Berger. Professor Berger is also affiliated with CSAIL & HST.
See other events that are part of Bioinformatics Seminar Series Spring 2004