ISL & IT Forum present "String reconstruction problems inspired by problems in multiomics data processing"
String reconstruction problems frequently arise in many areas of genomic data processing, molecular storage, and synthetic biology. In the most general setting, they may be described as follows: one is given one or more copies of a coded or uncoded string, and the copies are subsequently subjected to some form of (random) processing, such as fragmentation or repeated transmission through a noise-inducing channel. The goal of the reconstruction method is to recover the string, exactly or approximately, from the processed outputs. Examples of string reconstruction questions include reconstruction from noisy traces, reconstruction from substrings and k-decks, and reconstruction from compositional substring information. We review these and some related problems, and then describe coding methods that lead to strings that can be reconstructed exactly from their noisy traces, substrings and compositions. In particular, we focus on DNA profile codes, hybrid reconstruction from traces and uniquely reconstructable code designs. In the process, we introduce new questions in the areas of restricted de Bruijn graphs, counting of rational points in polytopes, and string replacement methods.
This is joint work with Ryan Gabrys, Han Mao Kiah, Srilakshmi Pattabiraman and Gregory Puleo.