Recent progress in recognizing visual objects and annotating images has been driven by super-rich models and massive datasets. However, machine vision models still have a very limited 'understanding' of images, which makes them brittle when generalizing to unseen examples. I will describe recent efforts to improve the robustness and accuracy of systems for annotating and retrieving images: first, by using structure in the space of images and fusing various types of information about image labels; and second, by matching structures in visual scenes to structures in their corresponding language descriptions or queries. We apply these approaches to billions of queries and images to improve search and annotation of public images and personal photos.
Gal Chechik is a professor at the Gonda Brain Research Center, Bar-Ilan University, Israel, and a senior research scientist at Google. His work focuses on learning in brains and in machines. Specifically, he studies the principles governing representation and adaptivity at multiple timescales in the brain, and algorithms for training computers to represent signals and learn from examples. Gal earned his PhD in 2004 from the Hebrew University of Jerusalem, developing machine learning and probabilistic methods to understand the auditory neural code. He then studied computational principles regulating molecular cellular pathways as a postdoctoral researcher in the computer science department at Stanford. In 2007, he joined Google Research as a senior research scientist, developing large-scale machine learning algorithms for machine perception. Since 2009, he has headed the computational neurobiology lab at BIU, where he was appointed an associate professor in 2013. He was awarded a Fulbright fellowship, a complexity scholarship and the Israeli national Alon fellowship.