We consider the problem of planning sequences of actions in systems with unknown dynamics, primarily in the context of robotics. To address it, we propose an efficient class of models called compressed linear expectation models. These models can be estimated directly from data sampled from the unknown dynamics, so they apply in a variety of settings with little prior knowledge. They also allow us to compute the value of a sequence of actions in a way that is differentiable with respect to those actions. Combining this property with gradient-based optimization yields a novel, efficient method for optimizing plans over models constructed from data.
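To make the overall pipeline concrete, here is a minimal sketch of the three steps described above: fit a model from sampled transitions, evaluate a plan in a differentiable way, and improve the plan by gradient descent. It is not the compressed linear expectation model itself. As a stand-in it assumes a scalar linear model `x_next ≈ a*x + b*u` fitted by ordinary least squares, a quadratic cost, and a hand-written backward pass for the gradient. All names (`fit_linear_model`, `plan_cost_and_grad`, `optimize_plan`, `true_step`) and all constants are hypothetical and chosen only for illustration.

```python
import random


def fit_linear_model(transitions):
    """Least-squares fit of x_next ~ a*x + b*u from (x, u, x_next) samples."""
    sxx = sum(x * x for x, u, y in transitions)
    sxu = sum(x * u for x, u, y in transitions)
    suu = sum(u * u for x, u, y in transitions)
    sxy = sum(x * y for x, u, y in transitions)
    suy = sum(u * y for x, u, y in transitions)
    # Solve the 2x2 normal equations in closed form.
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - sxu * suy) / det
    b = (sxx * suy - sxu * sxy) / det
    return a, b


def plan_cost_and_grad(a, b, x0, actions, goal, lam=0.01):
    """Cost of a plan under the fitted model, and its gradient w.r.t. the actions."""
    # Forward pass: roll the model out over the whole plan.
    xs = [x0]
    for u in actions:
        xs.append(a * xs[-1] + b * u)
    cost = sum((x - goal) ** 2 for x in xs[1:]) + lam * sum(u * u for u in actions)
    # Backward pass: adjoint holds d(cost)/d(x_{t+1}), including future effects.
    grad = [0.0] * len(actions)
    adjoint = 0.0
    for t in reversed(range(len(actions))):
        adjoint = 2.0 * (xs[t + 1] - goal) + a * adjoint
        grad[t] = 2.0 * lam * actions[t] + b * adjoint
    return cost, grad


def optimize_plan(a, b, x0, goal, horizon=10, step=0.02, iters=500):
    """Plain gradient descent on the action sequence (assumed step size)."""
    actions = [0.0] * horizon
    for _ in range(iters):
        _, grad = plan_cost_and_grad(a, b, x0, actions, goal)
        actions = [u - step * g for u, g in zip(actions, grad)]
    return actions


# Hypothetical "unknown" system, used here only to generate sample data.
rng = random.Random(0)


def true_step(x, u):
    return 0.9 * x + 0.5 * u + rng.gauss(0.0, 0.01)


transitions = []
for _ in range(200):
    x, u = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    transitions.append((x, u, true_step(x, u)))

a_hat, b_hat = fit_linear_model(transitions)
initial_cost, _ = plan_cost_and_grad(a_hat, b_hat, 0.0, [0.0] * 10, goal=1.0)
plan = optimize_plan(a_hat, b_hat, x0=0.0, goal=1.0)
final_cost, _ = plan_cost_and_grad(a_hat, b_hat, 0.0, plan, goal=1.0)

print(f"fitted a={a_hat:.3f}, b={b_hat:.3f}")
print(f"plan cost: {initial_cost:.3f} -> {final_cost:.3f}")
```

The planner only ever touches the fitted parameters `a_hat` and `b_hat`, never `true_step`, which mirrors the idea of optimizing plans over a model constructed from data. In a higher-dimensional setting the hand-written backward pass would typically be replaced by automatic differentiation.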