Making Data Specialization Easier for Accelerated Computing
101 AllenX
Abstract: In today's computing landscape, applications are increasingly data-intensive, and hardware is becoming ever more heterogeneous. Achieving high performance and efficiency requires customizing data formats, placements, and numeric types to align with both application needs and hardware capabilities. However, existing programming models and compilers provide limited support for such data-oriented optimizations, making them a significant challenge for most developers.
This talk discusses our ongoing research on evolvable data specialization, which employs a co-design methodology spanning programming models, compilers, and hardware acceleration. I'll first outline our work on automated search for low-bitwidth data types for LLMs (ICML'24). I'll then introduce UniSparse (OOPSLA'24), an intermediate language that offers a unified abstraction for customizing sparse matrix/tensor formats. Unlike existing frameworks, UniSparse separates the logical representation from the low-level memory layout, enabling concise format customizations through well-defined primitives. Lastly, I'll discuss Allo (PLDI'24), a composable programming model for building efficient dataflow accelerators. Allo decouples the specification of custom data types and data placement schemes from the algorithm description. Our experiments show that Allo significantly outperforms state-of-the-art HLS tools and accelerator design languages on commonly used benchmarks. Allo also enables fast generation of complex, model-specific LLM accelerators on FPGAs, delivering lower latency and higher energy efficiency than commercial GPUs.
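To give a flavor of the decoupling idea, here is a minimal illustrative sketch in plain Python (it is not the actual Allo or UniSparse API): the algorithm is written once against ordinary values, while the numeric specialization, in this case a simple symmetric 8-bit quantization, is chosen and applied separately.

```python
# Illustrative sketch only (plain Python, NOT the Allo/UniSparse API):
# the algorithm is written once, and the data specialization (a simple
# symmetric 8-bit integer quantization) is applied separately.

def matvec(A, x):
    """Algorithm: dense matrix-vector product, oblivious to data representation."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def quantize(values, bits=8):
    """Specialization: map floats to low-bitwidth integers with a shared scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [round(v / scale) for v in values], scale

A = [[0.5, -1.0], [2.0, 0.25]]
x = [1.0, 2.0]
exact = matvec(A, x)               # [-1.5, 2.5]

qx, sx = quantize(x)               # quantized input vector and its scale
qA = [quantize(row) for row in A]  # per-row quantization of the matrix

# The kernel runs on low-bitwidth integers; scales are applied afterwards.
approx = [sum(qa * qb for qa, qb in zip(qrow, qx)) * srow * sx
          for qrow, srow in qA]
```

Changing `bits` re-specializes the data without touching `matvec`, which captures the spirit of separating the algorithm from its data customization (the actual systems, of course, perform this at the compiler and hardware level).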
Bio: Zhiru Zhang is a Professor in the School of ECE at Cornell University. His current research investigates new algorithms, design methodologies, and automation tools for heterogeneous computing. Dr. Zhang is an IEEE Fellow and has been honored with the Intel Outstanding Researcher Award, AWS AI Amazon Research Award, Facebook Research Award, Google Faculty Research Award, DAC Under-40 Innovators Award, DARPA Young Faculty Award, IEEE CEDA Ernest S. Kuh Early Career Award, and NSF CAREER Award. He has also received multiple best paper awards from premier conferences and journals in the fields of computer systems and machine learning. Prior to joining Cornell, he co-founded AutoESL, a high-level synthesis start-up later acquired by Xilinx (now part of AMD). AutoESL's HLS tool evolved into Vivado HLS (now Vitis HLS), which is widely used for designing FPGA-based hardware accelerators.