
ISL Colloquium: An Information Theory for Out-of-Order Information: Applications in DNA Data Storage and Beyond

Summary
Prof. Ilan Shomorony (UIUC)
Packard 202
Date(s): Mar 16
Content

Abstract: The recent development of DNA-based data storage prototypes has raised several questions about how to optimally encode information in these systems. A distinguishing feature of this new storage paradigm is that the stored information is read via “shotgun” sequencing technologies. This means that the channel output comprises many short fragments of the input observed out of order. Motivated by this, we study the capacity of a class of “shuffling channels” that capture this inherent need to reorder the observed channel output. We provide channel capacity results based on random coding arguments, discuss the impact of the shotgun sampling process on the storage capacity, and propose suboptimal but computationally efficient code constructions. We then extend our insights to other settings where lack of ordering is a key feature. In particular, we propose an information-theoretic framework for the problem of pairwise sequence alignment and, motivated by the problem of dataset matching, we study the fundamental limits of reference-based source reordering.

Bio: Ilan Shomorony is an assistant professor of Electrical and Computer Engineering at the University of Illinois, Urbana-Champaign (UIUC), where he is a member of the Coordinated Science Laboratory. He obtained his Ph.D. in ECE from Cornell University in 2014 and was a postdoctoral scholar at UC Berkeley through the NSF Center for Science of Information (CSoI) until 2017. Before joining UIUC, he spent a year working as a researcher and data scientist at Human Longevity Inc., a personal genomics company. He received the NSF CAREER Award in 2021. His research interests include information theory and computational biology.