Over the past sixty years, intelligent machines have made great progress in playing games, tagging images in isolation, and, recently, making decisions for self-driving vehicles. Despite these advancements, they are still far from making decisions in social scenes and effectively assisting humans in public spaces such as terminals, malls, campuses, or any crowded urban environment. To overcome these limitations, I claim that we need to empower machines with social intelligence, i.e., the ability to get along well with others and facilitate mutual cooperation. This is crucial to design future generations of smart spaces that adapt to the behavior of humans for efficiency, or to develop autonomous machines that assist in crowded public spaces (e.g., delivery robots or self-navigating Segways).
In this talk, I will present my work towards socially-aware machines that can understand human social dynamics and learn to forecast them. First, I will highlight the machine vision techniques behind understanding the behavior of more than 100 million individuals captured by multi-modal cameras in urban spaces. I will show how to use sparsity-promoting priors to extract meaningful information about human behavior. Second, I will introduce a new deep learning method to forecast human social behavior. The causality behind human behavior is an interplay between observable and non-observable cues (e.g., intentions). For instance, when humans walk through crowded urban environments such as a busy train terminal, they obey a large number of (unwritten) common-sense rules and comply with social conventions. They typically avoid crossing through groups and keep a personal distance from their surroundings. I will present detailed insights on how to learn these interactions from millions of trajectories. I will describe a new recurrent neural network that can jointly reason about correlated sequences and forecast human trajectories in crowded scenes. It opens new avenues of research in learning the causalities behind the world we observe. I will conclude my talk by mentioning some ongoing work on applying these techniques to social robots and to future generations of smart hospitals.
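To make the idea of jointly reasoning about neighboring pedestrians concrete, here is a minimal sketch (not the speaker's actual implementation; the function name, grid size, and cell size are illustrative assumptions) of the kind of local occupancy grid a socially-aware recurrent model can pool over before predicting each person's next position:

```python
import numpy as np

def social_occupancy_grid(ego_pos, neighbor_pos, grid_size=4, cell=0.5):
    """Discretize neighbors' positions relative to the ego pedestrian into a
    grid_size x grid_size occupancy grid. This is a simplified stand-in for
    the social pooling used when forecasting trajectories in crowds; the
    parameters here are illustrative assumptions, not the talk's settings."""
    grid = np.zeros((grid_size, grid_size))
    half = grid_size * cell / 2.0  # half-width of the local neighborhood
    for px, py in neighbor_pos:
        dx, dy = px - ego_pos[0], py - ego_pos[1]
        if abs(dx) >= half or abs(dy) >= half:
            continue  # neighbor is outside the local neighborhood
        i = int((dx + half) // cell)  # grid row index
        j = int((dy + half) // cell)  # grid column index
        grid[i, j] += 1.0
    return grid

# Two nearby pedestrians fall inside the 2m x 2m neighborhood; the far one is ignored.
ego = (0.0, 0.0)
others = [(0.3, 0.3), (-0.6, 0.2), (5.0, 5.0)]
g = social_occupancy_grid(ego, others)
```

A recurrent network would consume such a grid (or a learned embedding of it) at every time step, alongside each pedestrian's own past positions, so that predictions for nearby people remain mutually consistent.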
More Information: http://web.stanford.edu/~alahi/
Alexandre Alahi is currently a research scientist at Stanford University; he received his PhD from EPFL in Switzerland (nominated for the EPFL PhD prize). His research enables machines to perceive the world and make decisions in the context of transportation problems and built environments at all scales. His work centers on understanding and predicting human social behavior with multi-modal data. He has worked on both the theoretical foundations and practical applications of socially-aware Artificial Intelligence. He was awarded the Swiss NSF early and advanced researcher grants for his work on predicting human social behavior. He won the CVPR 2012 Open Source Award for his work on a retina-inspired image descriptor, and the ICDSC 2009 Challenge Prize for his sparsity-driven algorithm that tracked more than 100 million pedestrians in train terminals. His research has been covered internationally by the BBC, Euronews, and The Wall Street Journal, as well as by national news outlets in the US and Switzerland. Finally, he co-founded the startup Visiosafe, won several startup competitions, and was selected among the Top 20 Swiss Venture Leaders in 2010.