Thesis Defense: Learning Reconfigurable Vision Models

Host: John Guttag (CSAIL, MIT)
Abstract: Deep-learning models are notorious for their high computational costs and substantial data requirements. Furthermore, non-expert users often lack the expertise needed to effectively tailor these models to their applications. In this talk, we tackle these challenges by amortizing the cost of training across models that solve similar learning tasks. Instead of training multiple models independently, we propose learning a single, reconfigurable model that effectively captures the spectrum of underlying problems.

First, we present UniverSeg, an in-context learning method for universal biomedical image segmentation. Given a query image and an example set of image-label pairs that defines a new segmentation task, our model produces accurate segmentations without additional training, outperforming several related methods on unseen segmentation tasks. Second, we demonstrate the effectiveness of hypernetworks for amortizing the cost of training multiple models. We characterize a hypernetwork training instability and propose a revised formulation that leads to faster convergence and more stable training. We then introduce Scale-Space Hypernetworks, a method for learning a continuum of CNNs with varying efficiency characteristics. This enables us to characterize an entire accuracy-efficiency Pareto curve of models by training a single hypernetwork, dramatically reducing training costs.
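To make the hypernetwork idea concrete, here is a minimal, illustrative sketch (not the thesis implementation; all names and sizes are assumptions): a small MLP maps a scalar hyperparameter, such as a scale or efficiency setting, to the full weight vector of a tiny target model, so a single set of hypernetwork parameters yields a continuum of target models.

```python
import numpy as np

# Illustrative hypernetwork sketch (assumed architecture and names,
# randomly initialized; no training loop shown).
rng = np.random.default_rng(0)

TARGET_IN, TARGET_OUT = 4, 2        # target model: one linear layer
HIDDEN = 8                          # hypernetwork hidden width
N_TARGET = TARGET_IN * TARGET_OUT   # number of weights to generate

# Hypernetwork parameters: a 1 -> HIDDEN -> N_TARGET MLP.
W1 = rng.normal(0.0, 0.5, (HIDDEN, 1))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.5, (N_TARGET, HIDDEN))
b2 = np.zeros(N_TARGET)

def hypernetwork(scale: float) -> np.ndarray:
    """Map a scalar hyperparameter to the target model's weight matrix."""
    h = np.tanh(W1 @ np.array([scale]) + b1)
    return (W2 @ h + b2).reshape(TARGET_OUT, TARGET_IN)

def target_forward(x: np.ndarray, scale: float) -> np.ndarray:
    """Run the target linear model with weights generated for `scale`."""
    return hypernetwork(scale) @ x

# Varying the scalar sweeps out a family of models without retraining:
x = rng.normal(size=TARGET_IN)
y_small = target_forward(x, 0.25)
y_large = target_forward(x, 1.00)
```

In the actual thesis setting the generated weights parameterize CNNs of varying cost, and the hypernetwork is trained jointly over a distribution of scale values, so evaluating the accuracy-efficiency trade-off requires only one training run.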

Committee: John Guttag (MIT), Adrian Dalca (MIT/HMS/MGH), Michael Carbin (MIT)