Thesis Defense: Practical Cryptographically Private and Verifiable Computation through Hardware-Software Co-Design

Speaker

MIT CSAIL

Host

Nikola Samardzic
MIT CSAIL
Fully Homomorphic Encryption (FHE) and Verifiable Computation (VC) enable offloading computation to servers, while ensuring data privacy and computational integrity, respectively. Despite their attractive security properties, FHE and VC are not widely adopted because (1) they are hard to use even for expert cryptographers and (2) they bring prohibitive performance overheads, about 10,000x over unencrypted computation.

Hardware acceleration is an attractive approach to bridge this performance gap, but it brings new challenges. These include operations on large vectors with complex dependencies that current vector processor architectures cannot handle, as well as extreme memory bandwidth demands. This thesis presents FHE and VC accelerator designs, as well as an FHE compiler and an FHE protocol (software) level change that enables better accelerator utilization.

F1 and CraterLake are FHE accelerators that improve performance over state-of-the-art by 10,000x. They accelerate FHE primitives, such as modular arithmetic, number-theoretic transforms, and structured permutations. This organization provides so much compute throughput that data movement becomes the key bottleneck. Thus, F1 and CraterLake are primarily designed to minimize data movement.

F1 and CraterLake's speedups bring with them new bottlenecks, mainly arithmetic efficiency. We present BitPacker, a new implementation of an FHE scheme that keeps encrypted data packed in fixed-size words, enabling near-full arithmetic efficiency in accelerators. BitPacker is the first redesign of an FHE scheme that targets accelerators. On a state-of-the-art accelerator, BitPacker improves performance by gmean 59% and by up to 3x, and reduces energy by gmean 61%.

We then design Fhelipe, a compiler that abstracts away FHE's implementation details and makes it easy for programmers to use, automatically mapping high-level tensor programs to CPUs, and F1 and CraterLake. Fhelipe translates high-level tensor programs into optimized FHE circuits that can then be executed on CraterLake, F1, or a CPU. Fhelipe produces compiled programs that match or exceed the performance of state-of-the-art manual implementations. It also outperforms prior FHE compilers by gmean 24.5x on a wide set of
benchmarks.

While FHE provides data privacy it does not verify that the correct computation is carried out. NoCap is a hardware accelerator that speeds up verifiable computation by 50x over scaled-up versions of state-of-the-art accelerators and 1,007x over CPU.