How to design the next 700 optimizing compilers
Host
Jonathan Ragan-Kelley
CSAIL
Massively parallel hardware combined with carefully optimized software has enabled the deep learning revolution. To deliver the efficiency and performance demanded by the next generations of AI applications, a zoo of highly specialized hardware devices has emerged. Writing software for these devices is highly complex, and only the largest companies can afford the massive investment in such short-lived software, which currently has to be rewritten and re-optimized for each new generation of hardware devices.
Current automatic optimizing compilers that turn high-level programs into low-level code often deliver disappointing performance. They disappoint by failing to exploit the increasingly specialized hardware features, but they equally disappoint by failing to perform crucial high-level optimizations in many important, but less mainstream, application domains.
In this talk, I will present our approach for designing the next generation of optimizing compilers. These compilers systematically optimize domain-specific applications for a diverse set of specialized hardware. They support a wide range of optimization use cases, ranging from automatic and AI-driven design space exploration techniques to precise control of optimizations by performance engineers, and gradual combinations of automation and control. Key to our design is that we embrace extensibility and composability of both computations and optimizations. Computations are represented by a pattern-based intermediate representation whose fundamental building blocks are flexible generic patterns. This intermediate representation is easily extensible with domain- and hardware-specific patterns. Optimizations are composed of simple rewrite rules, either in a purpose-built strategy language that allows precise control of optimization strategies, or in a semi-automatic technique using equality saturation. The compiler is easily extensible with domain- and hardware-specific optimization strategies, and experts can control the optimization process to various degrees.
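To make the idea of composing optimizations from simple rewrite rules more concrete, here is a minimal sketch in Python. It is an illustration only and not the RISE or ELEVATE API: the tuple encoding of expressions, the map-fusion rule, and the combinator names (seq, lchoice, try_, repeat, one, top_down, normalize) are all assumptions chosen for this example. A strategy is modelled as a function that returns a rewritten expression, or None when it does not apply.

```python
from typing import Callable, Optional, Tuple, Union

# Hypothetical encoding: an expression is a name or a tuple (head, arg, ...).
Expr = Union[str, Tuple["Expr", ...]]
# A strategy rewrites an expression, or returns None to signal failure.
Strategy = Callable[[Expr], Optional[Expr]]


def map_fusion(e: Expr) -> Optional[Expr]:
    """Rewrite rule: map(f, map(g, xs)) -> map(compose(f, g), xs)."""
    if isinstance(e, tuple) and len(e) == 3 and e[0] == "map":
        _, f, inner = e
        if isinstance(inner, tuple) and len(inner) == 3 and inner[0] == "map":
            _, g, xs = inner
            return ("map", ("compose", f, g), xs)
    return None


def identity(e: Expr) -> Optional[Expr]:
    """Strategy that always succeeds and changes nothing."""
    return e


def seq(s1: Strategy, s2: Strategy) -> Strategy:
    """Apply s1, then s2 to its result; fail if either fails."""
    def run(e: Expr) -> Optional[Expr]:
        r = s1(e)
        return None if r is None else s2(r)
    return run


def lchoice(s1: Strategy, s2: Strategy) -> Strategy:
    """Left-biased choice: try s1, fall back to s2 if s1 fails."""
    def run(e: Expr) -> Optional[Expr]:
        r = s1(e)
        return r if r is not None else s2(e)
    return run


def try_(s: Strategy) -> Strategy:
    """Apply s if possible, otherwise leave the expression unchanged."""
    return lchoice(s, identity)


def repeat(s: Strategy) -> Strategy:
    """Apply s until it fails; always succeeds with the last result."""
    def run(e: Expr) -> Optional[Expr]:
        r = s(e)
        while r is not None:
            e = r
            r = s(e)
        return e
    return run


def one(s: Strategy) -> Strategy:
    """Apply s to the first child on which it succeeds."""
    def run(e: Expr) -> Optional[Expr]:
        if not isinstance(e, tuple):
            return None
        for i, child in enumerate(e):
            r = s(child)
            if r is not None:
                return e[:i] + (r,) + e[i + 1:]
        return None
    return run


def top_down(s: Strategy) -> Strategy:
    """Apply s at the first position, searching from the root, where it succeeds."""
    def run(e: Expr) -> Optional[Expr]:
        return lchoice(s, one(top_down(s)))(e)
    return run


def normalize(s: Strategy) -> Strategy:
    """Apply s anywhere in the expression until it no longer applies."""
    return repeat(top_down(s))


program = ("reduce", "add", ("map", "f", ("map", "g", ("map", "h", "xs"))))
fused = normalize(map_fusion)(program)
```

In this sketch the rule itself knows nothing about where or how often it is applied; that decision lives entirely in the combinators. A performance engineer could use seq and top_down to apply one rule at one chosen location, while normalize applies it exhaustively, which is one way to read the range between precise control and automation described above.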
I aim to demonstrate that this generic and flexible design achieves high performance, comparable to existing domain-specific compilers on existing massively parallel hardware, and to show exciting future research directions that open up from our approach.
Bio:
Michel Steuwer (michel.steuwer.info) is a Lecturer in Compilers and Runtime Systems at the University of Edinburgh. His research aims to drastically simplify the programming of complex parallel hardware devices while achieving unprecedented performance and efficiency. With Lift and its successors RISE and ELEVATE (presented in this talk), he is pioneering research into compiler designs for performance-portable programming languages, allowing software to be written once in a high-level language and optimized for best performance on a diverse set of hardware devices. Michel's work has been recognized by the academic community with best paper awards, a SIGPLAN research highlight, and an upcoming CACM research highlight. His work has directly influenced domain-specific compiler design in industry, such as the recent MLIR compiler framework.