Optimizing Neural Network Design: Quantization, Network Compression and Pruning

Monday, February 27, 2017 - 4:30pm
Venue: 
Gates 104
Speaker: 
Prof. Sungjoo Yoo (Seoul National University)
Abstract / Description: 

Redundancy in neural networks offers opportunities for design optimization.

In this talk, we introduce our recent work on three optimization techniques:

  • First, we explain a quantization method based on weighted entropy that allows ResNet-101 to run with 6-bit weights and activations without accuracy loss (see the first sketch after this list).
  • Second, we report a case study of applying a low-rank approximation technique, namely Tucker decomposition, to CNNs running on smartphones (see the second sketch below).
  • Lastly, we introduce ZeNA, a zero-aware hardware accelerator, and a novel pruning method for activations (see the third sketch below).
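
A minimal sketch of low-bit quantization in Python, assuming a plain symmetric uniform grid. The talk's weighted-entropy method instead places quantization levels according to the importance of values, so this only illustrates the round-to-grid mechanics and the 6-bit setting mentioned above; the function name and tensor shapes are illustrative.

```python
import numpy as np

def uniform_quantize(w, n_bits=6):
    """Quantize a weight (or activation) tensor to n_bits on a
    symmetric uniform grid and return the de-quantized values.
    (Illustrative baseline, not the weighted-entropy scheme.)"""
    levels = 2 ** (n_bits - 1) - 1     # 31 signed levels at 6 bits
    scale = np.abs(w).max() / levels   # map the largest |w| to the top level
    q = np.clip(np.round(w / scale), -levels, levels)
    return q * scale

w = np.random.randn(64, 128).astype(np.float32)
w_q = uniform_quantize(w, n_bits=6)
print("max abs error:", float(np.abs(w - w_q).max()))
```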
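
A sketch of the Tucker-2 structure commonly used to compress convolution layers, written with PyTorch modules. Treating this as the exact setup of the talk's smartphone case study is an assumption, and the ranks r_in and r_out are placeholders; choosing them per layer is a separate step in practice.

```python
import torch.nn as nn

def tucker2_conv(in_ch, out_ch, k, r_in, r_out):
    """Tucker-2 replacement for one k x k convolution:
    a 1x1 projection down to r_in channels, a k x k core
    convolution to r_out channels, then a 1x1 expansion
    back to out_ch channels."""
    return nn.Sequential(
        nn.Conv2d(in_ch, r_in, kernel_size=1, bias=False),
        nn.Conv2d(r_in, r_out, kernel_size=k, padding=k // 2, bias=False),
        nn.Conv2d(r_out, out_ch, kernel_size=1, bias=False),
    )

# Weight count for a 256 -> 256 3x3 layer at placeholder ranks 64/64:
original = 256 * 256 * 3 * 3                      # 589,824 weights
factored = 256 * 64 + 64 * 64 * 3 * 3 + 64 * 256  # 69,632 weights
print(f"~{original / factored:.1f}x fewer weights")  # ~8.5x
```

The compression comes from moving the expensive k x k convolution into a low-rank channel space; the two 1x1 convolutions are cheap by comparison.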
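
ZeNA skips work in hardware, but the effect can be sketched in software: a dot product that issues multiply-accumulates only for nonzero activations. This is a software analogy for the zero-skipping idea, not ZeNA's actual microarchitecture, and the function below is hypothetical.

```python
import numpy as np

def zero_skipping_dot(acts, weights):
    """Dot product that multiplies only where the activation is
    nonzero, returning the result and the number of MACs issued.
    ReLU outputs are often mostly zero, so skipping zero operands
    saves the corresponding MAC cycles and energy in hardware."""
    nz = np.flatnonzero(acts)                 # indices of nonzero activations
    return float(acts[nz] @ weights[nz]), len(nz)

acts = np.maximum(np.random.randn(1024), 0.0)  # ReLU output, roughly half zeros
w = np.random.randn(1024)
y, macs = zero_skipping_dot(acts, w)
print(f"issued {macs} of {acts.size} MACs")
```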

Bio:

Sungjoo Yoo received his PhD from Seoul National University (SNU), Korea, in 2000. He was a researcher at TIMA Laboratory, France, from 2000 to 2004 and a principal engineer at Samsung Electronics from 2004 to 2008. He was an assistant and then associate professor at POSTECH, Korea, from 2008 to 2015. He joined SNU in 2015 and is now an associate professor there. His research interests include the optimization of neural networks, from algorithms to chip implementation.