Diffusion models have proven effective as a policy representation class for robots, but so far they have mainly been applied in the imitation learning setting to relatively simple tasks. What does it take to scale the approach to long-horizon tasks that also require very fine control?