Hardware Neural Network from Synaptic Devices to Neuromorphic Computing
Riding the wave of Moore's law, von Neumann computing architectures and their memory hierarchy have achieved great success over the past four decades. However, as transistor scaling approaches the end of the roadmap and power consumption becomes the performance bottleneck in system design, innovations at both the device and system levels are urgently needed to realize true human-level artificial intelligence. Many believe that neuromorphic computing based on artificial neural networks, which operate similarly to our brains, is a promising direction to pursue. However, present artificial neural networks are implemented using software algorithms or complex Si-based CMOS circuits, and thus cannot overcome the limitations of low density and high energy consumption.
To achieve true human-level artificial intelligence, a low-power two-terminal electronic synapse, a fundamental component of a hardware neural network that emulates the functions of a biological synapse, must be developed. Resistive-switching random access memory (RRAM) is now being actively developed as a low-power electronic synaptic device. It exhibits excellent scalability, a compact 4F² cell size, full CMOS compatibility, and ultralow pJ-level energy consumption per spike. RRAM-based hardware neural networks have also inspired intriguing applications, such as pattern recognition and auditory processing. In this talk, we will discuss the recent development of an RRAM-based synaptic device that shows promising potential for high-density 3D integration and low operating energy. The device is based on a non-filamentary switching mechanism that differs significantly from that of conventional filamentary RRAM. We will also discuss the implementation of an RRAM-based hardware neural network for neuromorphic computing and the importance of co-designing devices and neuromorphic algorithms.
End-to-End Hardware Accelerator for Deep Convolutional Neural Network
Deep convolutional neural networks (CNNs) have achieved state-of-the-art accuracy in recognition, detection, and other computer vision tasks. Dedicated CNN hardware will enable mobile devices to meet real-time demands. However, the design of CNN hardware faces the challenges of high computational complexity and data bandwidth, as well as large differences among CNN layers. In particular, the throughput of the convolutional layers is bounded by hardware resources, while the throughput of the fully connected layers is bounded by the available data bandwidth. Thus, a highly flexible design with efficient hardware is needed.
This talk will present our end-to-end CNN accelerator, which uses a filter kernel shared across all layers and an output-view strategy for maximum data reuse. The whole CNN architecture is modeled with a tile-based design to optimize hardware resources and I/O data bandwidth for the target CNN under the design constraints. The final design is generated based on the desired resources and is reconfigured at run time with layer-optimized parameters to achieve real-time end-to-end CNN acceleration.
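The contrast between compute-bound convolutional layers and bandwidth-bound fully connected layers can be illustrated with a rough back-of-the-envelope model. The sketch below is not the accelerator presented in the talk. The layer shapes, the 2-byte weight size, the peak rate of 100 GMAC/s, and the 1 GB/s memory bandwidth are all hypothetical values chosen for illustration, and the model counts only weight traffic, ignoring activation traffic and on-chip reuse.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    macs: int          # multiply-accumulate operations per inference
    weight_bytes: int  # weight bytes fetched from off-chip memory


def conv_layer(name, out_h, out_w, out_c, in_c, k, bytes_per_weight=2):
    # Each output pixel of each output channel needs in_c * k * k MACs,
    # but the same weights are reused at every output position.
    macs = out_h * out_w * out_c * in_c * k * k
    weights = out_c * in_c * k * k
    return Layer(name, macs, weights * bytes_per_weight)


def fc_layer(name, in_features, out_features, bytes_per_weight=2):
    # Every weight is used exactly once per inference: no weight reuse.
    macs = in_features * out_features
    return Layer(name, macs, macs * bytes_per_weight)


def bound(layer, peak_macs_per_s, bandwidth_bytes_per_s):
    # Whichever takes longer (computing or fetching weights) limits throughput.
    compute_time = layer.macs / peak_macs_per_s
    memory_time = layer.weight_bytes / bandwidth_bytes_per_s
    return "compute" if compute_time >= memory_time else "bandwidth"


# Hypothetical hardware budget (assumed values, not from the talk).
PEAK_MACS_PER_S = 100_000_000_000      # 100 GMAC/s
BANDWIDTH_BYTES_PER_S = 1_000_000_000  # 1 GB/s

# Hypothetical layer shapes, typical of image-classification CNNs.
conv = conv_layer("conv3x3", out_h=56, out_w=56, out_c=128, in_c=128, k=3)
fc = fc_layer("fc", in_features=4096, out_features=4096)

for layer in (conv, fc):
    print(layer.name, bound(layer, PEAK_MACS_PER_S, BANDWIDTH_BYTES_PER_S))
```

Under these assumptions the convolutional layer performs thousands of MACs per weight byte fetched, while the fully connected layer performs only one MAC per weight, which is why the two layer types stress different parts of the hardware.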
Dr. Hou received his B.S. and M.S. degrees in electronics engineering from National Chiao Tung University, Taiwan, in 1996 and 1998, respectively, and his Ph.D. degree in electrical and computer engineering from Cornell University in 2008. In 2000, he joined Taiwan Semiconductor Manufacturing Company (TSMC). From 2001 to 2003, he was also a TSMC assignee at International SEMATECH, Austin, TX. From 2008 to 2011, he was an Assistant Professor in the Department of Electronics Engineering, National Chiao Tung University, where he is currently a Full Professor and the director of the EECS International Graduate Program. Dr. Hou's recent research interests include the development of terabit nonvolatile resistive-switching memory (RRAM), electronic synaptic devices and neuromorphic computing systems, low-temperature all-oxide integrated circuits for flexible electronics, and heterogeneous integration of silicon electronics and low-dimensional nanomaterials. Dr. Hou has authored or co-authored more than 140 technical papers and holds 12 U.S. patents. He received the IEEE Electron Devices Society PhD Student Fellowship in 2007, the EDMA Outstanding Youth Award in 2012, and the NSC Ta-You Wu Memorial Award in 2013.
Tian-Sheuan Chang received the B.S., M.S., and Ph.D. degrees in electronic engineering from National Chiao-Tung University (NCTU), Hsinchu, Taiwan, in 1993, 1995, and 1999, respectively.
From 2000 to 2004, he was a Deputy Manager with Global Unichip Corporation, Hsinchu, Taiwan. In 2004, he joined the Department of Electronics Engineering, NCTU, where he is currently a Professor. In 2009, he was a visiting scholar at IMEC, Belgium. His current research interests include VLSI signal processing, video coding, biosignal processing, and neuromorphic computing.
Dr. Chang received the Excellent Young Electrical Engineer award from the Chinese Institute of Electrical Engineering in 2007 and the Outstanding Young Scholar award from the Taiwan IC Design Society in 2010. He has published over 160 technical papers and holds over 10 patents. He has been actively involved in many international conferences as an organizing committee or technical program committee member, and has served as an Editorial Board Member of IEEE Transactions on Circuits and Systems for Video Technology.