Vathys.ai is a deep learning startup developing a new deep learning processor architecture aimed at massively improved energy efficiency and performance. The architecture is also designed to be highly scalable and amenable to next-generation DL models. Although deep learning processors appear to be the "hot topic" of the day in computer architecture, the majority (we argue all) of such designs incorrectly identify computation as the bottleneck and thus neglect the true culprits of inefficiency: data movement and miscellaneous control-flow processor overheads. This talk will cover many of the architectural strategies that the Vathys processor uses to reduce data movement and improve efficiency. It will also cover some circuit-level innovations and include a quantitative and qualitative comparison with many DL processor designs, including the Google TPU, presenting numerical evidence for massive improvements over the TPU and other such processors.
ABOUT THE COLLOQUIUM:
See the Colloquium website, http://ee380.stanford.edu, for scheduled speakers, FAQ, and additional information. Stanford and SCPD students can enroll in EE380 for one unit of credit. Anyone is welcome to attend; talks are webcast live and archived for on-demand viewing over the web.