Deep learning is part of daily life, owing to its successes in computer vision and natural language processing. In those applications, the success of the model-free deep learning approach can be attributed to the lack of a (mathematical) generative model. In yet other applications, the data is generated by a simple model, the performance criterion is mathematically precise, and training/test samples are infinitely abundant, but the space of algorithmic choices is enormous (example: chess). Deep learning has recently shown strong promise in these problems too (example: AlphaZero). In this talk, we study two canonical problems of great scientific and engineering interest through the lens of deep learning.
The first is reliable communication over noisy media, where we revisit classical open problems in information theory: we show that creatively trained and architected neural networks can beat the state of the art on the AWGN channel with noisy feedback, with a 100-fold improvement in bit error rate.
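The channel setting here can be stated concretely. A minimal numpy sketch of the AWGN-with-noisy-feedback model follows; the noise levels and the naive BPSK decoder are illustrative assumptions, not the trained encoder/decoder from the talk, which would condition each transmission on the feedback signal.

```python
import numpy as np

rng = np.random.default_rng(0)
fwd_var, fb_var = 1.0, 0.01     # forward/feedback noise variances (assumed for illustration)

bits = rng.integers(0, 2, size=1000)
x = 2.0 * bits - 1.0            # BPSK symbols in {-1, +1}

# Forward AWGN channel: receiver sees y.
y = x + rng.normal(scale=np.sqrt(fwd_var), size=x.shape)

# Noisy feedback link: encoder sees a corrupted copy of the receiver's observation.
fb = y + rng.normal(scale=np.sqrt(fb_var), size=y.shape)

# A learned encoder would use `fb` to refine later transmissions;
# here we only hard-decode the single forward transmission.
decoded = (y > 0).astype(int)
ber = np.mean(decoded != bits)
```

This baseline ignores the feedback entirely; the gap between it and a feedback-exploiting code is what the neural schemes in the talk target.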
The second is optimization and classification problems on graphs, where the key algorithmic challenge is performance that scales to graphs of arbitrary size. Representing graphs as randomized nonlinear dynamical systems via recurrent neural networks, we show that creative adversarial training allows one to train on small graphs and test on much larger ones (100-1000x), with approximation ratios that rival the state of the art on a variety of optimization problems across the complexity-theoretic hardness spectrum.
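The size-generalization property comes from the fact that the recurrent update is shared across nodes, so the parameter count is independent of the graph size. A hedged numpy sketch of such a randomized nonlinear dynamical system on a graph (weights, dimensions, and the tanh update are illustrative choices, not the trained architecture from the talk):

```python
import numpy as np

def graph_rnn_states(adj, T=10, d=8, seed=0):
    """Iterate a randomized nonlinear dynamical system on a graph:
    every node repeatedly aggregates its neighbors' states through a
    shared recurrent map. A sketch of the idea, not the trained model."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    W = rng.normal(scale=1 / np.sqrt(d), size=(d, d))  # shared recurrent weights
    U = rng.normal(scale=1 / np.sqrt(d), size=(d, d))  # neighbor-aggregation weights
    h = rng.normal(size=(n, d))                        # random initial node states
    for _ in range(T):
        # Same update rule for every node, so it applies to any graph size.
        h = np.tanh(h @ W + adj @ h @ U)
    return h
```

Because `W` and `U` have shape `(d, d)` regardless of `n`, the same system runs unchanged on a 5-node or a 5000-node graph; a readout over the final states would produce the optimization decisions.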
Apart from its obvious practical value, this study of mathematically precise problems sheds light on the mysteries of deep learning methods: the choice of training examples, architectural design decisions, and loss functions/learning methodologies. Our (mostly) empirical research is conducted against the backdrop of a theoretical program of understanding gated neural networks (e.g., attention networks, GRUs, LSTMs); we present the first provably (globally) consistent algorithms to recover the parameters of a classical gated neural network architecture: the mixture of experts (MoE).
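For readers unfamiliar with the classical mixture of experts, its forward pass is simple: a softmax gate produces input-dependent mixing weights over a set of expert predictors. A minimal numpy sketch (linear experts and the shapes used here are illustrative assumptions; the consistency results in the talk concern recovering such gate and expert parameters from data):

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws):
    """Classical mixture of experts: a softmax gate mixes expert outputs.
    x: (n, d) inputs; gate_w: (d, k) gate weights; expert_ws: k weight
    vectors of shape (d,) for linear experts. Shapes are illustrative."""
    logits = x @ gate_w                                  # (n, k) gating scores
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    g = np.exp(logits)
    g = g / g.sum(axis=1, keepdims=True)                 # softmax gate probabilities
    expert_out = np.stack([x @ w for w in expert_ws], axis=1)  # (n, k) expert outputs
    return (g * expert_out).sum(axis=1)                  # (n,) gated mixture
```

The gating nonlinearity is what places the MoE in the same family as GRU/LSTM-style gated architectures, and what makes parameter recovery nontrivial.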
Pramod Viswanath received the Ph.D. degree in electrical engineering and computer science from the University of California at Berkeley in 2000. From 2000 to 2001, he was a member of the research staff at Flarion Technologies, NJ. Since 2001, he has been on the faculty of the University of Illinois at Urbana-Champaign in Electrical and Computer Engineering, where he is currently a professor.
He received the Eliahu Jury Award from the EECS Department of UC Berkeley in 2000, the Bernard Friedman Prize from the Mathematics Department of UC Berkeley in 2000, an NSF CAREER Award in 2002, the Xerox Faculty Research Award from the College of Engineering of UIUC in 2010, and the Best Paper Award at the SIGMETRICS conference in 2015. He is a coauthor, with David Tse, of the text Fundamentals of Wireless Communication, which has been used in over 60 institutions around the world. He is a coinventor of the opportunistic beamforming method and a codesigner of the Flash-OFDM communication algorithms used in all fourth-generation cellular systems.
He was an Associate Editor of the IEEE Transactions on Information Theory from 2006 to 2008, Technical Program Co-Chair of the Information Theory Workshop in 2010 and 2015, and Technical Program Co-Chair of the Allerton Conference in 2007 and 2008.
His current research interests are in machine learning and natural language processing. In the past he has worked on, and continues to hold interest in, information theory and its applications in various fields, including wireless communication and data privacy.