HCI Seminar - Hugo Flores García - Controllable and Expressive Generative Modeling for the Sound Arts

Speaker

Hugo Flores García
Northwestern University

Host

Anna Huang
Music and Theater Arts Section

Abstract
State-of-the-art generative audio models rely on text prompting as their primary form of user interaction. While text prompting can be a powerful supplement to more gestural interfaces, a sound is worth more than a thousand words: sonic structures such as a syncopated rhythm or the timbral morphology of a moving texture are hard to describe in text yet easy to convey through a sonic gesture. This talk presents two research projects that explore generative audio modeling with gestural and interactive control mechanisms: VampNet, which uses masked acoustic token modeling, and Sketch2Sound, which uses fine-grained, interpretable control signals.

Bio
Hugo Flores García (he/they) is a Honduran computer musician, improviser, programmer, and scientist. Hugo's creative practice spans improvised music for guitars, sound objects, and electronics; sound installations; bespoke digital musical instruments; and interactive art. He is a PhD candidate at Northwestern University, doing research at the intersection of applied machine learning, music, and human-computer interaction. Hugo's research centers on designing new instruments for creative expression, focusing on artist-centered machine learning interfaces for the sound arts.

This talk will also be streamed over Zoom: https://mit.zoom.us/j/93099356333.