EE380 Computer Systems Colloquium: Petascale Deep Learning on a Single Chip

Topic: 
Petascale Deep Learning on a Single Chip
Wednesday, December 6, 2017 - 4:30pm
Venue: 
Gates B03
Speaker: 
Tapabrata Ghosh (Ingemini LLC)
Abstract / Description: 

Vathys.ai is a deep learning startup that has been developing a new deep learning processor architecture with the goal of massively improved energy efficiency and performance. The architecture is also designed to be highly scalable and amenable to next-generation DL models. Although deep learning processors appear to be the "hot topic" of the day in computer architecture, the majority (we argue all) of such designs incorrectly identify the bottleneck as computation and thus neglect the true culprits of inefficiency: data movement and miscellaneous control-flow overheads in the processor. This talk will cover many of the architectural strategies that the Vathys processor uses to reduce data movement and improve efficiency. The talk will also cover some circuit-level innovations and will include a quantitative and qualitative comparison to many DL processor designs, including the Google TPU, presenting numerical evidence for massive improvements compared to the TPU and other such processors.

ABOUT THE COLLOQUIUM:

See the Colloquium website, http://ee380.stanford.edu, for scheduled speakers, FAQ, and additional information. Stanford and SCPD students can enroll in EE380 for one unit of credit. Anyone is welcome to attend; talks are webcast live and archived for on-demand viewing over the web.

Bio:

Tapabrata ("Tapa") Ghosh is the co-founder and CEO of Vathys.ai, where he develops the core architecture of the Vathys deep learning processor. As a teenager, he developed a quadcopter drone with the intention of using deep learning to make collision-free drones, at the time a novel proposition. In attempting to do so, he experienced firsthand the computational requirements of deep learning in the real world. The experience led him to found Vathys with the goal of developing performant and power-efficient deep learning processors. He also has multiple patents and papers in the fields of deep learning, computer architecture, and semiconductors.