We present a new parallel architecture that exploits fine-grain ordered parallelism, which is abundant but hard to mine with current software and hardware techniques. In this architecture, called OP, programs consist of fine-grain tasks, as small as tens of instructions each, with programmer-specified order constraints. OP executes tasks speculatively and out of order, and efficiently speculates thousands of tasks ahead of the earliest active task to uncover enough parallelism. Furthermore, OP sends tasks to run close to their data whenever possible, reducing data movement. We contribute several new techniques that allow OP to scale to large core counts and speculation windows, including a new execution model, speculation-aware hardware task management, selective aborts, and scalable ordered task commits.
We evaluate OP with challenging graph analytics, simulation, and database benchmarks. At 64 cores, OP achieves speedups of 32-64x over a single-core OP system, and even higher speedups over state-of-the-art parallel software algorithms.
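To make the idea of fine-grain tasks with programmer-specified order constraints more concrete, the following is a minimal sketch in Python. It is a sequential software emulation only, not OP's hardware mechanism or its actual programming interface: it assumes the order constraint is an integer timestamp, that tasks may create child tasks with equal or later timestamps, and it uses single-source shortest paths as an illustrative workload. The names `run_ordered`, `enqueue`, `sssp`, and `visit` are hypothetical. Running tasks speculatively and out of order, as the abstract describes, would have to produce the same result as this in-order loop.

```python
import heapq
import itertools


def run_ordered(initial_tasks):
    """Run (timestamp, fn, args) tasks in timestamp order (hypothetical helper).

    Each task receives its timestamp and an `enqueue` callback that it can
    use to create child tasks.
    """
    counter = itertools.count()  # tie-breaker so heapq never compares functions
    queue = []

    def enqueue(ts, fn, *args):
        heapq.heappush(queue, (ts, next(counter), fn, args))

    for ts, fn, args in initial_tasks:
        enqueue(ts, fn, *args)

    # Sequential emulation: always run the earliest pending task next.
    while queue:
        ts, _, fn, args = heapq.heappop(queue)
        fn(ts, enqueue, *args)


def sssp(graph, source):
    """Shortest-path distances from `source`, expressed as ordered tasks.

    Assumes non-negative edge weights, so children never precede parents.
    """
    dist = {}

    def visit(ts, enqueue, node):
        # A tiny task: the first visit to a node (in timestamp order) is final.
        if node in dist:
            return
        dist[node] = ts
        for neighbor, weight in graph.get(node, []):
            enqueue(ts + weight, visit, neighbor)

    run_ordered([(0, visit, (source,))])
    return dist


graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 1)],
    "d": [],
}
print(sssp(graph, "a"))  # {'a': 0, 'b': 1, 'c': 3, 'd': 4}
```

Each `visit` call does only a few operations, which illustrates why such tasks are costly to schedule in software and why, under the assumptions above, order constraints rather than coarse synchronization carry the program's dependences.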
Daniel Sanchez is the TIBCO Founders Assistant Professor of Electrical Engineering and Computer Science at MIT. His research interests include parallel computer systems, scalable and efficient memory hierarchies, architectural support for parallelization, and architectures with quality-of-service guarantees. He earned a Ph.D. in Electrical Engineering from Stanford University in 2012, and received the NSF CAREER award in 2015.