[Thesis Defense] Efficient Systems for Large-Scale Graph Representation Learning
Speaker
Host
Abstract:
Training modern graph representation learning models on large-scale datasets suffers severe performance problems on current system architectures. These limitations stem primarily from the irregular structure of graph data, which inflates both data movement costs and computational overheads. Addressing these challenges requires full-stack optimization that exploits existing hardware capabilities. The thesis first shows how to mitigate the substantial system input/output (I/O) overhead by adapting the training algorithm to the commodity hardware architectures prevalent in data centers. Building on this foundation, the thesis introduces joint compiler optimizations for graph neural network training that streamline sampling and model computation, eliminate redundancy, and reduce complexity, thereby enabling efficient use of compute resources, particularly GPUs.