Efficient Φ-Regret Minimization with Linear and Low-Degree Swap Deviations in Extensive-Form Games

Speaker

Brian Zhang
Carnegie Mellon University
Abstract: Recent breakthrough results by Dagan, Daskalakis, Fishelson and Golowich [2023] and Peng and Rubinstein [2023] established an efficient algorithm attaining at most \epsilon swap regret over extensive-form strategy spaces of dimension N in N^{\tilde O(1/\epsilon)} rounds. On the other extreme, Farina and Pipis [2023] developed an efficient algorithm for minimizing the weaker notion of linear-swap regret in \mathsf{poly}(N)/\epsilon^2 rounds. In this talk, we will discuss several new results that extend and expand upon these recent works. The first result is an interpretation of linear-swap deviations in extensive-form games. We show a connection between linear deviations and a generalization of communication deviations in which the player can make queries to a "mediator" who replies with action recommendations, and, critically, the player is not constrained to match the timing of the game as would be the case for communication deviations. We coin this latter set the untimed communication (UTC) deviations. We show that the UTC deviations coincide precisely with the linear deviations, and therefore that any player minimizing UTC regret also minimizes linear-swap regret. Next, we take a step toward bridging the gap between linear and swap deviations. We introduce the set of k-mediator deviations, which generalize the untimed communication deviations to the case of having multiple mediators. We develop parameterized algorithms for minimizing the regret with respect to this set of deviations in N^{O(k)}/\epsilon^2 rounds. This closes the gap in the sense that k=1 recovers linear swap regret, while k=N recovers swap regret. Moreover, by relating k -mediator deviations to low-degree polynomials, we show that regret minimization against degree- k polynomial swap deviations is achievable in N^{O(kd)^3}/\epsilon^2 rounds, where d is the depth of the game, assuming constant branching factor. For a fixed degree k, this is polynomial for Bayesian games and quasipolynomial more broadly when d = \mathsf{polylog} N---the usual balancedness assumption on the game tree. This talk covers joint work with Ioannis Anagnostides, Gabriele Farina, and Tuomas Sandholm.