Professor Anant Agarwal has a tendency to think big. One recent piece of work has just been donated to the MIT Museum after being documented in the 2007 Guinness Book of World Records as the largest microphone array on the planet, and that was just one component of a larger project. But while Agarwal is capable of thinking on a grand scale, he is also preoccupied with the challenge of how to make large things small, scalable, and revolutionary. The two paradigms intersect in the thorny, potentially very fruitful problem of multicore processing.
Photo: Jason Dorfman, CSAIL photographer
Multicore computing is, at its most basic, the issue of how to fit multiple processing cores onto a single chip to produce a functional and efficient whole. There are many reasons that this is a valuable goal: the computer industry has moved towards multiple cores for increased performance, power efficiency and compute capacity in recent years. But rather than the two, four, or eight cores that are standard today, Professor Agarwal and his Carbon group are looking ahead to the hundreds and thousands of cores of the future.
There are many different factors that would hinder the cores of today from working in concert at that order of magnitude, not the least of which are communication and management of resources. The current bus-based system of communication between cores, which is perfectly reasonable when there are only two or four on a chip, begins to break down when applied to a group of thousands. Similarly, the centralized locks and shared global data structures of current operating systems work well for a few cores, but will quickly become untenable under the onslaught of requests from thousands of cores.
The Carbon group seeks to solve these issues with a pair of innovative technologies. The first is a project called ATAC, or All-To-All Computing. The project, a collaboration with Professor Lionel Kimerling's group in the Department of Materials Science, attempts to integrate on-chip optical interconnect with multicore computing. An optical waveguide connecting all of the cores on a chip lets every core both broadcast to and receive from its counterparts almost instantaneously. The group expects this global communication infrastructure to improve performance, energy efficiency and programmability. With the ability to send values to all cores at once, the researchers hope that programming will become easier, and that the approach may even lead to the genesis of new programming languages.
Next up is the bottleneck created when multiple cores try to simultaneously request services from the operating system. The current model calls for every core to try to secure a piece of the resource pie, and then wait in line behind the other cores that got there first (a bit like being in the hold queue on a centralized customer service phone call). This is a limitation the group proposes to solve with something called a Factored Operating System, or FOS. The internet-inspired model means that operating system services are broken up into sets of cooperating servers and assigned to dedicated cores throughout the chip. Applications, running on their own separate cores, can then obtain system resources from the nearest instance of a server managing that resource.
Agarwal likens it to a kind of technological form of localization. Because the resources are distributed, like small corner grocery stores in urban neighborhoods, it makes sense to simply go to the one that is closest to you, rather than descending en masse on one large supercenter. Decentralization of services also enables the multicore chip to be resilient to faults, which are likely to be commonplace in chips with thousands of cores. If a core running an OS service suffers a fault, further requests are redirected by a distributed name service to other live cores offering the same service in an internet-like fashion.
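The corner-store idea and the fault redirection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FOS code: the `ServerCore` and `NameService` names, the grid coordinates, and the use of Manhattan distance as the measure of "nearest" are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class ServerCore:
    """A core dedicated to one OS service (hypothetical model, not FOS's API)."""
    service: str
    x: int
    y: int
    alive: bool = True


class NameService:
    """Maps a service name to the nearest live core offering it."""

    def __init__(self, servers):
        self.servers = list(servers)

    def lookup(self, service, x, y):
        # Only live cores offering the requested service are candidates.
        candidates = [s for s in self.servers if s.service == service and s.alive]
        if not candidates:
            raise LookupError(f"no live server for {service!r}")
        # Assumption: "nearest" means smallest Manhattan distance on the chip grid.
        return min(candidates, key=lambda s: abs(s.x - x) + abs(s.y - y))


servers = [
    ServerCore("filesystem", 0, 0),
    ServerCore("filesystem", 7, 7),
    ServerCore("memory", 3, 4),
]
names = NameService(servers)

app_core = (1, 2)
first = names.lookup("filesystem", *app_core)     # the nearby "corner store"
first.alive = False                               # simulate a fault on that core
fallback = names.lookup("filesystem", *app_core)  # redirected to another live core
```

In this sketch the application on core (1, 2) is first served by the file-system core at (0, 0); once that core is marked as failed, the same lookup returns the instance at (7, 7) without the application changing how it asks.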
The entire thing is being tested on a new multicore simulator the group is building. Known as Graphite, it addresses a singular conundrum of this kind of experimental work: how do you do research on the behavior and limitations of hundreds and thousands of linked cores when the chips in question have yet to be built? The answer to that is a multiprocessor simulator that runs on many hundreds of computers, allowing each of them to model a portion of a massive multicore chip. This testbed allows the researchers to build software and run critical experiments that would otherwise not be possible. When complete, the Graphite simulator will be released as open-source software to benefit the entire computer architecture community.
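To make the idea of each computer modeling a portion of the chip concrete, here is a minimal Python sketch of dividing simulated cores among host machines. It is not Graphite's actual scheme: the `assign_tiles` function and the round-robin policy are assumptions chosen for simplicity, and the core and host counts are made up.

```python
def assign_tiles(num_tiles, num_hosts):
    """Spread simulated cores (tiles) across host machines round-robin.

    Hypothetical helper for illustration; a real simulator might instead
    group neighboring tiles on the same host to cut network traffic.
    """
    if num_hosts <= 0:
        raise ValueError("need at least one host machine")
    assignment = {host: [] for host in range(num_hosts)}
    for tile in range(num_tiles):
        assignment[tile % num_hosts].append(tile)
    return assignment


# Example: a 1,024-core target chip simulated on 8 host machines.
plan = assign_tiles(1024, 8)
tiles_per_host = {host: len(tiles) for host, tiles in plan.items()}
```

Here each of the 8 hosts ends up responsible for 128 of the 1,024 simulated cores, so no single machine has to model the whole chip.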
The distributed architecture of FOS is conducive not only to large multicores, but also to applications in the burgeoning field of cloud computing. In a cloud environment, FOS would provide a single-system image across a pool of cloud machines. This could enable applications running across multiple machines to share resources more easily. It would also allow resources within the pool to be allocated more efficiently and precisely. Researcher Jason Miller likens this to being able to reserve a single seat on a bus [represented here by a single core in a machine] instead of having to rent the entire vehicle.
In the vision Agarwal has of the final product at the chip architecture level, cores will have a nontraditional format, known as a tile, where each one contains a processor, a cache of its own, and an on-chip network switch. This fully distributed tile system, combined with the innovations of FOS and ATAC, may represent a real solution to the problem of multicore computing.
And with that solved, the possibilities (for autonomous vehicles, cloud computing, network security, earth and planetary modeling, protein mapping, even video games) are endless. As these projects take off, the size, scope, and scale of the work will be limited only by the researchers' imaginations, a familiar position for Agarwal, and in many ways exactly where he thrives.
July 9, 2009
Adwoa Gyimah-Brempong, CSAIL