Redundancy in neural networks offers opportunities for design optimization.
In this talk, we introduce our recent work on three optimization techniques:
- First, we explain a quantization method based on weighted entropy, which enables ResNet-101 to run with 6-bit weights and activations without accuracy loss.
- Second, we report a case study of applying a low-rank approximation technique, namely Tucker decomposition, to CNNs running on smartphones.
- Lastly, we introduce ZeNA, a zero-aware hardware accelerator, together with a novel pruning method for activations.
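To give a flavor of the second topic, the sketch below applies a truncated Tucker decomposition (via the standard HOSVD construction) to a 4D convolution kernel with NumPy. This is an illustrative example only, not the exact procedure used in the talk; the tensor shape, the chosen ranks, and the helper function names are assumptions.

```python
import numpy as np

def unfold(t, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def mode_dot(t, m, mode):
    """Multiply tensor `t` by matrix `m` along axis `mode`."""
    t = np.moveaxis(t, mode, 0)
    shape = t.shape
    res = m @ t.reshape(shape[0], -1)
    return np.moveaxis(res.reshape((m.shape[0],) + shape[1:]), 0, mode)

def tucker_hosvd(t, ranks):
    """Truncated Tucker decomposition via HOSVD.

    Factors are the leading left singular vectors of each mode-n
    unfolding; the core is the tensor contracted with their transposes.
    """
    factors = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(t, mode), full_matrices=False)
        factors.append(u[:, :r])
    core = t
    for mode, u in enumerate(factors):
        core = mode_dot(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Rebuild the (approximate) full tensor from core and factors."""
    t = core
    for mode, u in enumerate(factors):
        t = mode_dot(t, u, mode)
    return t

# Hypothetical conv kernel: (out_ch, in_ch, kH, kW), compressed along channels.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32, 3, 3))
core, factors = tucker_hosvd(w, ranks=(32, 16, 3, 3))
w_hat = reconstruct(core, factors)
rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
```

In a CNN, the compressed layer is then executed as a sequence of smaller convolutions (1x1 channel projections around a reduced-rank spatial convolution), trading a small accuracy drop, usually recovered by fine-tuning, for fewer parameters and multiply-accumulates.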
Sungjoo Yoo received his PhD from Seoul National University (SNU), Korea, in 2000. He was a researcher at TIMA Laboratory, France, from 2000 to 2004 and a principal engineer at Samsung Electronics from 2004 to 2008. He was an assistant and then associate professor at POSTECH, Korea, from 2008 to 2015. He joined SNU in 2015, where he is now an associate professor. His research interests include the optimization of neural networks, from algorithms to chip implementation.