Matchings in MPC frameworks
Speaker
Slobodan Mitrovic
EPFL
Host
Akshay Degwekar, Pritish Kamath and Govind Ramnarayan
Abstract. The last decade has witnessed the success of a number of \emph{massive parallel computation} (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or Spark. These frameworks allow for much more local computation, compared to the classical PRAM models. In this context, we investigate the complexity of one of the most fundamental graph problems: \emph{Maximum Matching}. We show that if the memory per machine is $\Omega(n)$ (or even only $n/(\log n)^{O(\log \log n)}$), then for any fixed constant $\eps > 0$, one can compute a $(2+\eps)$-approximation to Maximum Matching in $O\left((\log \log n)^2\right)$ MPC rounds. This constitutes an exponential improvement over previous work -- both the one designed specifically for MPC and the one obtained via connections to PRAM or distributed algorithms -- which required $\Theta(\log n)$ rounds to achieve similar approximation guarantees.
To establish our result we need to deviate from the previous work in two important ways. Firstly, we use \emph{vertex--based} graph partitioning, instead of the edge--based approaches that were utilized so far. Secondly, we develop a technique of \emph{round compression}. This technique enables one to take a (distributed) algorithm that computes an $O(1)$-approximation of maximum matching in $O(\log n)$ \emph{independent} PRAM phases and implement a super-constant many of these phases in only a constant number of actual MPC rounds. These techniques turn out to be what one needs to truly take advantage of the MPC model, as compared to the PRAM model, even in the regime of (sub-)linear memory per machine.
Based on joint work with Artur Czumaj, Jakub ??cki, Aleksander M?dry, Krzysztof Onak and Piotr Sankowski. [https://arxiv.org/abs/1707.03478]
To establish our result we need to deviate from the previous work in two important ways. Firstly, we use \emph{vertex--based} graph partitioning, instead of the edge--based approaches that were utilized so far. Secondly, we develop a technique of \emph{round compression}. This technique enables one to take a (distributed) algorithm that computes an $O(1)$-approximation of maximum matching in $O(\log n)$ \emph{independent} PRAM phases and implement a super-constant many of these phases in only a constant number of actual MPC rounds. These techniques turn out to be what one needs to truly take advantage of the MPC model, as compared to the PRAM model, even in the regime of (sub-)linear memory per machine.
Based on joint work with Artur Czumaj, Jakub ??cki, Aleksander M?dry, Krzysztof Onak and Piotr Sankowski. [https://arxiv.org/abs/1707.03478]