
Speech and audio compression in the neural era

Summary
Jan Skoglund, Google
Packard 202
Date: Nov 4

Zoom link;
password: 032264

 

Abstract:

In this talk, we'll discuss data compression of digital audio signals such as speech and music. After a general introduction to audio compression, we'll focus on speech coding, that is, the compression of speech. Modern AI and deep learning methods have proven remarkably successful in speech processing applications such as speech recognition and synthesis, and the talk will present some recent progress in low-rate speech coding using neural modeling techniques.

 

Bio:

Jan Skoglund leads a team at Google in San Francisco, CA, developing speech and audio signal processing components for capture, real-time communication, storage, and rendering. These components have been deployed in Google software products such as Meet and hardware products such as Chromebooks. After receiving his Ph.D. from Chalmers University of Technology, Sweden, in 1998, he worked on low-bit-rate speech coding at AT&T Labs-Research, Florham Park, NJ. He was with Global IP Solutions (GIPS), San Francisco, CA, from 2000 to 2011, working on speech and audio processing, such as compression, enhancement, and echo cancellation, tailored for packet-switched networks. GIPS' audio and video technology was deployed by many companies, e.g., IBM, Google, Yahoo, WebEx, Skype, and Samsung, and was open-sourced as WebRTC after a 2011 acquisition by Google. Since then, he has been on the Open Codecs team of Chrome at Google.