ISL Colloquium and IT-Forum present "A Notion of Entropy for Sparse Marked Graphs and its Applications in Graphical Data Compression"
Many modern data sources arising from social networks, biological data, etc. are best viewed as indexed by combinatorial structures such as graphs, rather than time series. The local weak limit theory for sparse graphs, also known as the objective method, due to Benjamini, Schramm, Aldous, Steele, Lyons and others, provides a framework which enables one to make sense of a stationary process indexed by graphs. The theory of time series is the engine driving an enormous range of applications in areas such as control theory, communications, information theory and signal processing. It is to be expected that a theory of stationary stochastic processes indexed by combinatorial structures, in particular graphs, would eventually have a similarly wide-ranging impact.
Employing the above framework, we introduce a notion of entropy for probability distributions on rooted graphs. This is a generalization of the notion of entropy introduced by Bordenave and Caputo to graphs which carry marks on their vertices and edges. Such marks can represent information on real-world data. For instance, in a social network graph where each node represents an individual and edges represent friendships, a vertex mark represents the type of an individual, while edge marks represent shared data between friends. The above notion of entropy can be considered as a natural counterpart for the Shannon entropy rate in the world of graphical data. We illustrate this by introducing a universal compression scheme for marked graphical data. Furthermore, we introduce an algorithm that can perform such a compression with low complexity.
This talk is based on joint work with Venkat Anantharam.