IT-Forum: Statistical Language Modeling in the Era of Abundant Data

Friday, January 9, 2015 - 1:00pm to 2:00pm
Packard 202
Ciprian Chelba (Google)
Abstract / Description: 

The talk presents an overview of statistical language modeling as applied to real-world problems: speech recognition, machine translation, spelling correction, and soft keyboards, to name a few prominent ones. We summarize the most successful estimation techniques and examine how they fare for applications with abundant data, e.g. voice search. We conclude by highlighting a few open problems: getting an accurate estimate for the entropy of text produced by a very specific source (e.g. a query stream); optimally leveraging data that is of different degrees of relevance to a given "domain"; and determining whether a bound exists on the size of a "good" model for a given source.

Ciprian Chelba is a Research Scientist with Google. Previously he worked as a Researcher in the Speech Technology Group at Microsoft Research. His research interests are in statistical modeling of natural language and speech. Recent projects include Google Audio Indexing; indexing, ranking and snippeting of speech content; language modeling for Google Search by Voice; and the Android IME predictive keyboard.