When less enables more: making models and methods for modern genomics

Speaker

Rob Patro
Stony Brook University

Host

Bonnie Berger
CSAIL and Mathematics
The plummeting cost of high-throughput sequencing and the astounding variety of available assays have created a scientific regime in which the bottleneck in many experiments is no longer our ability to acquire data, but rather the computational cost of analyzing it. Simultaneously, we have been building sequencing data archives that hold immense potential, but which remain largely inert due to our inability to efficiently index and query "raw" experimental data.

In this talk, I will discuss some of the methods that we have been developing to address these challenges as they arise in different contexts. I will highlight our recent work on fast, accurate, and bias-aware methods for transcript quantification, as implemented in our tool Salmon. I will discuss Mantis, our indexing approach that enables sequence search over large collections of raw, unassembled read data. Finally, I will describe Pufferfish, a new time- and space-efficient data structure for indexing and querying the colored, compacted de Bruijn graph.