HCI Seminar - Hugo Flores García - Controllable and Expressive Generative Modeling for the Sound Arts

Speaker

Hugo Flores García
Northwestern University

Host

Anna Huang
Music and Theater Arts Section

Abstract
State-of-the-art generative audio models rely on text prompting as their primary form of user interaction. While text prompting can be a powerful supplement to more gestural interfaces, a sound is worth more than a thousand words: sonic structures such as a syncopated rhythm or the timbral morphology of a moving texture are hard to describe in text yet easy to convey through a sonic gesture. This talk presents two research projects that explore generative audio modeling with gestural and interactive control mechanisms: VampNet, which uses masked acoustic token modeling, and Sketch2Sound, which uses fine-grained, interpretable control signals.

Bio
Hugo Flores García (he/they) is a Honduran computer musician, improviser, programmer, and scientist. Hugo's creative practice spans improvised music for guitars, sound objects, and electronics; sound installations; bespoke digital musical instruments; and interactive art. He is a PhD candidate at Northwestern University, doing research at the intersection of applied machine learning, music, and human-computer interaction. Hugo's research centers on designing new instruments for creative expression, focusing on artist-centered machine learning interfaces for the sound arts.

This talk will also be streamed over Zoom: https://mit.zoom.us/j/93099356333.