Distributed Sparse Computing in Python
Host
Julian Shun
MIT CSAIL
This is a hybrid meeting. The physical room is 32-D463 (Star). The Zoom registration link is https://mit.zoom.us/meeting/register/tJMtdOqgrTwsGNV_k0Nk6JjLMk-G6GFzTRyk.
******************IMPORTANT NOTE ABOUT ONLINE REGISTRATION******************
- The registration link for 2023 is the same as the link from 2022.
- Please save the Zoom link that you receive after you register. This link will stay the same for all subsequent Fast Code seminars.
- Zoom does not recognize a second registration, and will not send out the link a second time. The organizers will not be notified of any second registration.
- If you have any problems with registration, please contact lindalynch@csail.mit.edu by 12pm on the day of the seminar, so that we can try to resolve it before the seminar begins.
*********************************************************************
Abstract: The sparse module of the popular SciPy Python library is widely used across applications in scientific computing, data analysis and machine learning. The standard implementation of SciPy is restricted to a single CPU and cannot take advantage of modern distributed and accelerated computing resources. We introduce Legate Sparse, a system that transparently distributes and accelerates unmodified sparse matrix-based SciPy programs across clusters of CPUs and GPUs, and composes with cuNumeric, a distributed NumPy library. Legate Sparse uses a combination of static and dynamic techniques to efficiently compose independently written sparse and dense array programming libraries, providing a unified Python interface for distributed sparse and dense array computations. We show that Legate Sparse is competitive with single-GPU libraries like CuPy and achieves 65% of the performance of PETSc on up to 1280 CPU cores and 192 GPUs of the Summit supercomputer, while offering the productivity benefits of idiomatic SciPy and NumPy.
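For readers unfamiliar with the kind of program the abstract refers to, the following is a minimal sketch of an ordinary single-node SciPy sparse computation: building a CSR matrix, performing a sparse matrix-vector product, and running a conjugate gradient solve. It uses only standard NumPy and SciPy calls. The function names `poisson_1d` and `solve` and the 1-D Poisson test problem are illustrative assumptions, not taken from the talk, and the sketch does not show the Legate Sparse or cuNumeric imports, which are left to the project's own documentation.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg


def poisson_1d(n):
    """Build the n x n tridiagonal 1-D Poisson matrix in CSR format.

    Hypothetical helper for illustration; not part of SciPy or Legate Sparse.
    """
    main = 2.0 * np.ones(n)
    off = -1.0 * np.ones(n - 1)
    return sp.diags([off, main, off], offsets=[-1, 0, 1], format="csr")


def solve(n=64):
    """Solve A x = b with conjugate gradient, where b = A @ ones."""
    A = poisson_1d(n)
    x_true = np.ones(n)
    b = A @ x_true  # sparse matrix-vector product (SpMV)
    # cg returns the solution and a status flag (0 means it converged).
    x, info = cg(A, b, maxiter=10 * n)
    return A, b, x, info


if __name__ == "__main__":
    A, b, x, info = solve(64)
    print("converged:", info == 0)
    print("relative residual:", np.linalg.norm(A @ x - b) / np.linalg.norm(b))
```

Programs of this shape, which mix sparse operations from scipy.sparse with dense NumPy arrays, are the workloads the abstract describes distributing across CPUs and GPUs without source changes.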
Bio: Rohan Yadav is a fourth-year computer science Ph.D. student at Stanford University, advised by Alex Aiken and Fredrik Kjolstad. He is broadly interested in programming languages and computer systems, with a focus on systems for parallel and distributed computing.