Understanding AI Systems at Scale: Applied Interpretability, Agent Robustness, and the Science of Model Behaviors

Host: MIT CSAIL, AI@MIT Reading Group

RSVP here: https://forms.gle/XpirPb17Q9HgoshP6

Join two Transluce researchers as they discuss their latest work and research vision. Transluce is a company building the public tech stack for understanding AI systems. Topics will include applied interpretability, scalable oversight, reinforcement learning, and discovering rare behaviors in language models.