While neural networks had been used in speech recognition in the early 1990s, they did not outperform the traditional machine learning approaches until 2010, when Alex's team members at Microsoft Research demonstrated the superiority of deep neural networks (DNNs) for large-vocabulary speech recognition systems. The speech community rapidly adopted deep learning, followed by the image processing community and many other disciplines. In this talk I will give an introduction to speech recognition, go over the fundamentals of deep learning, explain what it took for the speech recognition field to adopt deep learning, and describe how that adoption has contributed to popularizing personal assistants like Siri.
ABOUT THE COLLOQUIUM:
See the Colloquium website, http://ee380.stanford.edu, for scheduled speakers, FAQ, and additional information. Stanford and SCPD students can enroll in EE380 for one unit of credit. Anyone is welcome to attend; talks are webcast live and archived for on-demand viewing over the web.
Alex Acero (PhD, Carnegie Mellon, 1990) is Sr. Director in the Siri team in charge of speech recognition, speech synthesis, and machine translation. Prior to joining Apple in 2013, he spent 20 years at Microsoft Research managing teams in speech, audio, multimedia, computer vision, natural language processing, machine translation, machine learning, and information retrieval. Dr. Acero is an IEEE Fellow and ISCA Fellow. Alex has served as President of the IEEE Signal Processing Society and is currently a member of the IEEE Board of Directors. He is the author of the textbook Spoken Language Processing. Dr. Acero has published over 250 technical papers and has over 150 US patents.