As part of the US House Intelligence Committee investigation into how Russia may have influenced the 2016 US election, Twitter released the screen names of nearly 3000 Twitter accounts tied to Russia's Internet Research Agency. These accounts were immediately suspended, removing the data from Twitter.com and Twitter's developer API. In this talk, we show how we can reconstruct a subset of the Twitter network of these Russian troll accounts and apply graph analytics to the data using the Neo4j graph database to uncover how these accounts were spreading fake news.
This case-study-style presentation will show how we collected and munged the data, taking advantage of the flexibility of the property graph. We'll dive into how NLP and graph algorithms like PageRank and community detection can be applied in the context of social media to make sense of the data. We'll show how Cypher, the query language for graphs, is used to work with graph data. And we'll show how visualization is used in combination with these algorithms to interpret the results of the analysis and to help share the story of the data. No familiarity with graphs or Neo4j is necessary, as we'll start with a brief overview of graph databases and Neo4j.
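To give a flavor of the graph algorithms mentioned above, here is a minimal sketch of PageRank run over a tiny retweet graph. It is written in plain Python for illustration only and is not the Neo4j implementation used in the talk. The account names, the `pagerank` helper, and the damping factor of 0.85 are all assumptions made for the example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        # Each node passes an equal share of its rank along every outgoing edge.
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Hypothetical toy data: each pair is (retweeting account, retweeted account).
retweets = [
    ("troll_a", "troll_hub"),
    ("troll_b", "troll_hub"),
    ("troll_c", "troll_hub"),
    ("troll_hub", "troll_a"),
]

scores = pagerank(retweets)
top_account = max(scores, key=scores.get)
print(top_account)
```

In this toy graph, the account that is retweeted by the most other accounts ends up with the highest score, which is the intuition behind using PageRank to find the most influential accounts in a retweet network.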