The traditional method of keeping caches coherent on a shared-memory system, maintaining a directory of all sharers for each cache line, incurs a space cost linear in the number of sharers. The Tardis cache coherence protocol shrinks this cost to logarithmic in the number of sharers, paving the way for thousand-core cache-coherent systems. To demonstrate the versatility of this protocol, we seek to build T-1000, an ensemble of interconnected field-programmable gate arrays (FPGAs) configured to contain, in aggregate, over a thousand RISC-V processors. To maximize bandwidth, we will connect these FPGAs in a three-dimensional coherent mesh. With T-1000, we can explore new parallel processing algorithms, operating systems with thousands of active processes, and datacenter-scale computing within a single address space.
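To make the linear-versus-logarithmic gap concrete, here is a rough back-of-the-envelope sketch. It assumes a full-map directory that stores one presence bit per potential sharer, and it approximates the logarithmic scheme by the bits needed to name a single core. Both function names are hypothetical, and the sketch ignores any fixed-width fields, such as timestamps, that a real protocol would also store per line.

```python
def full_map_directory_bits(num_sharers: int) -> int:
    """Assumed full-map directory: one presence bit per potential sharer."""
    return num_sharers


def core_id_bits(num_sharers: int) -> int:
    """Bits needed to name one core among num_sharers (ceil(log2 n), at least 1)."""
    return max(1, (num_sharers - 1).bit_length())


if __name__ == "__main__":
    # Compare per-cache-line tracking cost as the core count grows.
    for n in (16, 64, 256, 1024):
        print(f"{n:5d} cores: full map {full_map_directory_bits(n):5d} bits, "
              f"core id {core_id_bits(n):2d} bits")
```

Under these assumptions, the full-map cost grows in step with the core count while the identifier width grows by only one bit each time the core count doubles.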
To contact us about this work, scroll down to the People section and open one of the group leads' pages, where you can reach out to them directly.