[Thesis Defense] Performant and Resilient Service Composition for Modern Cloud Applications

Speaker

Tianyu Li
MIT CSAIL

Host

Sam Madden
MIT CSAIL

Modern cloud applications are often distributed systems composed from vendor-provided building blocks (e.g., object storage services, container orchestration services). Consequently, distributed fault tolerance is a central concern for application correctness. Although each building block may be fault-tolerant on its own, the end-to-end application remains susceptible to failures, because the composition logic that orchestrates the blocks may itself fail. This talk explores resilient composition, a systematic way to assemble fault-tolerant components into resilient end-to-end distributed applications. We begin by presenting the fail-restart system model, which captures the unique fault-tolerance challenges that arise when composing services. Based on this model, we define Composable Resilient Steps (CReSt), an atomic programming abstraction that guarantees fault tolerance across the assembled application. We then detail efficient methods for implementing CReSt using a novel concurrency control mechanism and distributed protocols that allow optimistic, speculative execution ahead of slower fault-tolerance safeguards. Together, these contributions allow developers to assemble fault-tolerant distributed systems that are correct by construction and often more performant than existing solutions.

Thesis Committee: Samuel R. Madden, M. Frans Kaashoek, Badrish Chandramouli (Microsoft Research)