Differentially Private Community Detection over Stochastic Block Models

Mohamed Seif (Postdoctoral Research Associate, Princeton)
Packard 318

Nov

This event ended 477 days ago.

Fri, Nov 1 2024, 2pm

Abstract: Community detection is a fundamental problem in graph mining and machine learning, with numerous interesting applications in social networks, image segmentation, and biological networks. The goal of community detection over graphs is to recover underlying labels/attributes of users (e.g., political affiliation) given the connectivity between users (represented by the graph’s adjacency matrix). Significant progress has been made in understanding the fundamental limits of community detection when the graph is generated from a stochastic block model (SBM). Specifically, sharp information-theoretic limits and efficient algorithms have been obtained for SBMs as a function of p and q, which represent the intra-community and inter-community connection probabilities. This talk will give an overview of our recent work on differentially private community detection. In many problem scenarios, one may wish to perform clustering/community detection while protecting the privacy of sensitive information (such as connectivity between two users, which could be sensitive information). SBMs provide a fertile ground to design and study such privacy-preserving mechanisms while facilitating utility analysis. Focusing on the notion of edge-differential privacy (edge-DP), we seek to understand the fundamental tradeoffs between SBMs network connectivity (p, q), privacy budget, and computational efficiency for recovering community labels. To this end, I will discuss and present the associated information-theoretic tradeoffs for three broad classes of differentially private community recovery mechanisms: a. stability-based mechanisms, b. sampling-based mechanisms, and c. graph perturbation mechanisms. Our main findings are that stability and sampling-based mechanisms lead to a superior privacy-recovery tradeoff; however, this comes at the expense of higher computational complexity. On the other hand, albeit in low complexity, graph perturbation mechanisms require the privacy budget to scale logarithmically with the graph size for exact recovery. I will conclude with several open problems and challenges in this topic.

Bio: Mohamed Seif received his B.Sc. degree in electrical engineering from Alexandria University, Egypt, his M.Sc. degree in wireless technologies from Nile University, Egypt, and his Ph.D. from the University of Arizona. He is currently a Postdoctoral Research Associate at Princeton University. His research interests span a broad range of topics, mainly information and coding theory and its applications in machine learning, wireless communication, and data privacy.

IT-Forum