Current multicores suffer from two main limitations: they can only exploit a fraction of the parallelism available in applications, and they are very hard to program. Our project tackles both problems with Swarm, a new programming model and multicore architecture. Swarm programs consist of very short tasks, as small as tens of instructions each. Hardware distributes and queues tasks for cores, reducing the overheads of fine-grain parallelism and allowing many more applications to be parallelized. Moreover, parallelism is implicit: instead of using locks, semaphores, or other error-prone explicit synchronization techniques, programmers simply define an order among tasks. Under the covers, Swarm hardware determines which order constraints are superfluous and elides them, running most tasks in parallel. As a result, Swarm programs are almost as simple as their sequential counterparts, yet they outperform the best parallel programs. Currently, Swarm scales to 256 cores and successfully parallelizes challenging applications that span graph analytics, databases, machine learning, genome sequencing, and discrete-event simulation.
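To make the idea of "defining an order among tasks" concrete, here is a minimal sketch in Python of single-source shortest paths written as short ordered tasks. It is only a sequential emulation: the `OrderedTaskQueue` class, its `enqueue` and `run` methods, and the use of an integer timestamp to express order are all assumptions made for illustration, not Swarm's actual interface. Nothing here models the hardware, which would run such tasks in parallel where the order constraints turn out to be superfluous.

```python
import heapq
from itertools import count


class OrderedTaskQueue:
    """Sequential stand-in for an ordered-task model (hypothetical API)."""

    def __init__(self):
        self._heap = []
        self._tiebreak = count()  # keeps heap entries comparable on equal timestamps

    def enqueue(self, timestamp, fn, *args):
        # A task is a function plus arguments, ordered by its timestamp.
        heapq.heappush(self._heap, (timestamp, next(self._tiebreak), fn, args))

    def run(self):
        # Run tasks strictly in timestamp order. This loop is sequential;
        # it only defines the order that a parallel system would have to respect.
        while self._heap:
            timestamp, _, fn, args = heapq.heappop(self._heap)
            fn(self, timestamp, *args)


def visit(queue, timestamp, graph, dist, node):
    """One short task: settle a node, then create a child task per neighbor."""
    if node in dist:
        return  # already settled by a task with an earlier timestamp
    dist[node] = timestamp
    for neighbor, weight in graph[node]:
        queue.enqueue(timestamp + weight, visit, graph, dist, neighbor)


def shortest_paths(graph, source):
    dist = {}
    queue = OrderedTaskQueue()
    queue.enqueue(0, visit, graph, dist, source)
    queue.run()
    return dist


# Small example graph: node -> list of (neighbor, edge weight).
graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 3)],
    "d": [],
}
print(shortest_paths(graph, "a"))
```

The code contains no locks or other explicit synchronization: the only constraint the programmer states is the timestamp attached to each task. Tasks that touch unrelated nodes do not actually depend on each other, which is the kind of superfluous order constraint the paragraph above describes the hardware as eliding.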
A Scalable Architecture for Ordered Parallelism
Unlocking Ordered Parallelism with the Swarm Architecture (IEEE Micro's Top Picks from the Computer Architecture Conferences, 2016)
Data-Centric Execution of Speculative Parallel Programs
Fractal: An Execution Model for Fine-Grain Nested Speculative Parallelism
SAM: Optimizing Multithreaded Cores for Speculative Parallelism