Over the past decade, deep learning has had a transformative impact on a wide spectrum of domains, including speech, vision, and NLP, delivering highly accurate perception capabilities through the use of very deep and large models. In this talk, a suite of approximate computing techniques will be described that significantly reduce the computational burden of these deep learning tasks for both training and inference. Recent advancements will be presented that have successfully reduced the precision of training systems to 8 bits and the precision needed for inference down to 2 bits while fully preserving model accuracy, laying the groundwork for a precision/sparsity/analog roadmap that could guide the industry over the next decade. Future trends and challenges in these approximate computing systems will also be discussed.
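To give a flavor of the precision reduction the abstract refers to, the following is a minimal illustrative sketch of symmetric uniform quantization at an arbitrary bit-width. It is an assumption for exposition only, not the specific techniques developed in this work (which rely on considerably more sophisticated methods to preserve accuracy at 8-bit training and 2-bit inference).

```python
import numpy as np

def fake_quantize(x, bits):
    """Quantize-dequantize a tensor with symmetric uniform quantization.

    Maps values onto a grid of 2**(bits-1) - 1 positive levels (plus
    their negatives and zero), then returns them in the original scale,
    so the rounding error of the given bit-width can be inspected.
    """
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit, 1 for 2-bit
    scale = np.max(np.abs(x)) / levels    # per-tensor scaling factor
    q = np.clip(np.round(x / scale), -levels, levels)
    return q * scale

# Compare quantization error at the two precisions mentioned in the talk.
x = np.linspace(-1.0, 1.0, 101)
err8 = np.max(np.abs(x - fake_quantize(x, 8)))
err2 = np.max(np.abs(x - fake_quantize(x, 2)))
print(f"max error at 8 bits: {err8:.4f}, at 2 bits: {err2:.4f}")
```

The gap between the two error figures illustrates why naively dropping to 2 bits degrades accuracy, and hence why the algorithmic advances discussed in the talk are needed to preserve it.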
Kailash is a Distinguished Research Staff Member and senior manager at the IBM T. J. Watson Research Center whose work has had a transformative impact on both IBM and the external community. His research on artificial intelligence algorithms and specialized hardware has had a catalytic effect on the industry and is amongst the most highly cited work in the field (with >4000 total citations). His research breakthroughs have resulted in numerous patents (>80), invited talks at premier AI and hardware conferences, and multiple awards (both internal and external). He has a Ph.D. in Electrical Engineering from Stanford University, and his current research interests include deep learning algorithms, accelerator architectures, and approximate computing.