Nikhil Garg (PhD '20) uses machine learning to measure 100 years of gender and ethnic stereotypes in the U.S.

Nikhil Garg, EE PhD '20: interdisciplinary research using machine learning
April 2018

Lead author Nikhil Garg (PhD candidate '20) demonstrates that word embeddings can be used as a powerful tool to quantify historical trends and social change. His research team developed metrics based on word embeddings to characterize how gender stereotypes and attitudes toward ethnic minorities in the United States evolved over the 20th and 21st centuries, starting in 1910. Their framework opens up a fruitful intersection between machine learning and quantitative social science.

Nikhil co-authored the paper with history Professor Londa Schiebinger, linguistics and computer science Professor Dan Jurafsky and biomedical data science Professor James Zou.

Their research shows that, over the past century, linguistic changes in gender and ethnic stereotypes correlated with major social movements and demographic changes in the U.S. Census data.

The researchers used word embeddings – an algorithmic technique that can map relationships and associations between words – to measure changes in gender and ethnic stereotypes over the past century in the United States. They analyzed large databases of American books, newspapers and other texts and looked at how those linguistic changes correlated with actual U.S. Census demographic data and major social shifts such as the women's movement in the 1960s and the increase in Asian immigration, according to the research.
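To make the idea concrete, here is a minimal sketch of how an embedding-based association score could be computed. It is an illustration only, not the researchers' published code: the function names (`group_vector`, `relative_distance`), the two-dimensional toy vectors and the word lists are all made up for this example, and real studies would use embeddings trained on large text collections.

```python
import numpy as np


def group_vector(embeddings, words):
    """Average the vectors for a list of group words and scale to unit length."""
    vecs = np.array([embeddings[w] for w in words])
    mean = vecs.mean(axis=0)
    return mean / np.linalg.norm(mean)


def relative_distance(embeddings, neutral_words, group_a, group_b):
    """Sum, over neutral words, of (distance to group A) minus (distance to group B).

    A negative score means the neutral words sit closer to group A;
    a positive score means they sit closer to group B.
    """
    a = group_vector(embeddings, group_a)
    b = group_vector(embeddings, group_b)
    total = 0.0
    for w in neutral_words:
        v = embeddings[w] / np.linalg.norm(embeddings[w])
        total += np.linalg.norm(v - a) - np.linalg.norm(v - b)
    return float(total)


# Hypothetical toy embeddings (assumption: invented values, not real data).
toy = {
    "she": np.array([1.0, 0.0]),
    "her": np.array([0.9, 0.1]),
    "he": np.array([0.0, 1.0]),
    "him": np.array([0.1, 0.9]),
    "nurse": np.array([0.8, 0.2]),
    "engineer": np.array([0.2, 0.8]),
}

women, men = ["she", "her"], ["he", "him"]
nurse_score = relative_distance(toy, ["nurse"], women, men)
engineer_score = relative_distance(toy, ["engineer"], women, men)
print("nurse:", nurse_score)
print("engineer:", engineer_score)
```

In this toy setup, "nurse" scores negative (closer to the women's word group) and "engineer" scores positive (closer to the men's). Computing such a score separately on embeddings trained on text from each decade would give a time series that could then be compared with census figures.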

"Word embeddings can be used as a microscope to study historical changes in stereotypes in our society," said James Zou, a courtesy professor of electrical engineering. "Our prior research has shown that embeddings effectively capture existing stereotypes and that those biases can be systematically removed. But we think that, instead of removing those stereotypes, we can also use embeddings as a historical lens for quantitative, linguistic and sociological analyses of biases."

"This type of research opens all kinds of doors to us," Schiebinger said. "It provides a new level of evidence that allow humanities scholars to go after questions about the evolution of stereotypes and biases at a scale that has never been done before."

"The starkness of the change in stereotypes stood out to me," Garg said. "When you study history, you learn about propaganda campaigns and these outdated views of foreign groups. But how much the literature produced at the time reflected those stereotypes was hard to appreciate." 

The new research illuminates the value of interdisciplinary teamwork between the humanities and the sciences, the researchers said.

"This led to a very interesting and fruitful collaboration," Schiebinger said, adding that members of the group are working on further research together. "It underscores the importance of humanists and computer scientists working together. There is a power to these new machine-learning methods in humanities research that is just being understood." 


Proceedings of the National Academy of Sciences, "Word embeddings quantify 100 years of gender and ethnic stereotypes," April 3, 2018.

Excerpted from Stanford News, "Stanford researchers use machine-learning algorithm to measure changes in gender, ethnic bias in U.S." April 3, 2018.