Predicting the number of clock cycles a processor takes to execute a block of assembly instructions in steady-state (the throughput) is important for both compiler designers and performance engineers.
However, building an analytical model to do so is especially complicated in modern x86-64 Complex Instruction Set Computer (CISC) machines with sophisticated processor microarchitectures in that it is tedious, error-prone, and must be performed from scratch for each processor generation.
Ithemal is the first tool that learns to predict the throughput of a set of instructions. It does so more accurately than state-of-the-art hand-written tools currently used in compiler backends and static machine code analyzers. In particular, Ithemal has less than half the error of state-of-the-art analytical models (LLVM's llvm-mca and Intel's IACA).
We live in the age of big data, but most of that data is “sparse.” Imagine, for instance, a massive table that mapped all of Amazon’s customers against all of its products, with a “1” for each product a given customer bought and a “0” otherwise. The table would be mostly zeroes.