Abuse of the Mode and an Ensemble Alternative

Speaker: Charles (Chip) E. Lawrence, Brown University
Date: November 7, 2005
Time: 11:30AM to 1:00PM
Location: TOC lab- 32-G575
Host: P Clote/ BC & B Berger/ MIT
Contact: Kathleen V Dickey, 617-253-3037, kvdickey@mit.edu
Relevant URL: http://www-math.mit.edu/compbiosem/
Advances in data collection technologies have made increasingly large data sets available for analysis. While the emergence of such large data sets would seem to lead to ever more precise estimates of parameters, paradoxically just the opposite seems to be becoming increasingly common. This paradoxical circumstance has emerged because these technologies have simultaneously opened opportunities to draw inferences on previously unanswerable high-dimensional questions. For decades, optimization has been employed as the major tool of most inference procedures. It has been clearly recognized for some time now that the favorable properties of optimization-based inferences rest on an asymptotic foundation that requires the data to grow in comparison with the number of unknowns. Nevertheless, optimization very often continues as the method of choice even when these supporting conditions are not present. Genomics and computational molecular biology are among the more prominent fields experiencing this dual growth in data resources and inference expectations. In fact, prediction and inference of high-dimensional objects are now arguably the most important activities in these allied new biological fields, and the inspiration for this paper. RNA secondary structure prediction offers a very special lens through which to examine the untoward consequences of the reliance on the mode in high-dimensional inferences, because polynomial-time algorithms are available to comprehensively characterize the space of solutions, and a reference set of structures is available for the comparison of alternative prediction methods. Through this lens we will examine these untoward effects, consider their broader implications, and present an alternative “ensemble-based” inference procedure.
The seminar is co-hosted by Professor Peter Clote of Boston College's Biology and Computer Science Departments and MIT Professor of Applied Math Bonnie Berger. Professor Berger is also affiliated with CSAIL & HST.
This event is part of the Bioinformatics Seminar Series 2005/2006.