IT-Forum

ISL & IT Forum present "Building a DNA information storage system from the bottom up"

Topic: 
Building a DNA information storage system from the bottom up
Abstract / Description: 

DNA has emerged as a compelling data storage medium due to its density, longevity, and eternal relevance compared to current memory technologies. However, the high price of synthesizing DNA (list price $3,500 per megabyte) remains a major bottleneck for adoption of this promising storage solution. In this talk, I will present our work towards breaking down this barrier using enzymatic DNA synthesis and a tailored codec for robust data retrieval. I will also touch upon some fundamental considerations when designing DNA information storage systems.

Date and Time: 
Friday, June 21, 2019 - 1:15pm
Venue: 
Packard 202

ISL & IT Forum present "Tensor networks and quantum Markov chain"

Topic: 
Tensor networks and quantum Markov chain
Abstract / Description: 

Tensor networks can often accurately approximate various systems that are of interest to physicists, but the computational cost of contracting the network is often prohibitively demanding. Quantum computers can resolve this bottleneck, but because the size of the computation scales with the size of the network, existing quantum computers may appear to be far too noisy for this task. We prove a nontrivial error bound on the outcome of the computation that does not scale with the size of the network. This is possible because certain tensor networks are secretly implementing a quantum analogue of (classical) Markovian dynamics. The long-time stability of the Markovian dynamics gives rise to a nontrivial error bound on the outcome of the computation. This suggests that there may be practical problems of interest that are amenable to relatively noisy quantum computers.

Date and Time: 
Friday, June 7, 2019 - 1:15pm
Venue: 
Packard 202

IT-Forum Student Talks

Topic: 
Student Talks
Abstract / Description: 

Jesse Gibson
Title: Information Theory in Next-Generation DNA Sequencing Technologies
The completion of the first human genome sequence in the early 2000s represented a major success for the scientific community. It also represented the return on a significant investment - billions of dollars and a little over a decade's worth of work. DNA sequencing costs have since fallen at a rate even faster than the exponential improvements predicted by Moore's Law in computing. Today, a complete human genome can be sequenced for closer to a few thousand dollars in about a week. A major factor in this improvement was the shift from the serial reads used in the Human Genome Project to massively parallel sequencing technologies. These new technologies, however, present new problems: how can we understand a genome given millions of short reads drawn randomly from the underlying sequence and potentially degraded by noise in the process? Bioinformatics boasts a plethora of algorithms that meet this challenge in practice, but it's not immediately obvious what 'optimal' analyses might look like. This talk will focus on the work of David Tse and others who have used the perspective of information theory to establish fundamental bounds on our ability to interpret next-generation sequencing output. In particular, I will focus on the problems of assembling an unknown sequence de novo and of variant calling when the population of molecules being sequenced is not homogeneous.

Tony Ginart
Title: Towards a Compression Theory for Metric Embeddings
Embedding matrices are widely used in diverse domains such as NLP, recommendation systems, information retrieval, and computer vision. For large-scale datasets, embedding matrices can consume large amounts of memory, and compression for these objects is desirable. However, the relevant distortion metrics applicable to embeddings make them significantly more compressible than classical sources considered in information theory. Furthermore, related results such as the Johnson-Lindenstrauss theorem are formulated in terms of reduction in dimension rather than codelength. Additionally, embeddings come with certain query-time constraints, such as in the sense of a succinct data structure. In this talk, we formulate the problem of metric embedding compression through an information-theoretic lens and review related literature in information theory, signal processing, and dimensionality reduction. We adapt pre-existing results to establish lower and upper bounds in some regimes. Finally, we cover some of the state-of-the-art compression algorithms used to compress embeddings in practice.

Date and Time: 
Tuesday, June 4, 2019 - 10:00am
Venue: 
Packard 214

ISL & IT Forum present "String reconstruction problems inspired by problems in multiomics data processing"

Topic: 
String reconstruction problems inspired by problems in multiomics data processing
Abstract / Description: 

String reconstruction problems frequently arise in many areas of genomic data processing, molecular storage, and synthetic biology. In the most general setting, they may be described as follows: one is given a single or multiple copies of a coded or uncoded string, and the copies are subsequently subjected to some form of (random) processing such as fragmentation or repeated transmission through a noise-inducing channel. The goal of the reconstruction method is to obtain an exact or approximate version of the string based on the processed outputs. Examples of string reconstruction questions include reconstruction from noisy traces, reconstruction from substrings and k-decks, and reconstruction from compositional substring information. We review the above and some related problems and then proceed to describe coding methods that lead to strings that can be reconstructed exactly from their noisy traces, substrings and compositions. In particular, we focus on DNA profile codes, hybrid reconstruction from traces and uniquely reconstructable code designs. In the process, we introduce new questions in the areas of restricted de Bruijn graphs, counting of rational points in polytopes, and string replacement methods.

This is joint work with Ryan Gabrys, Han Mao Kiah, Srilakshmi Pattabiraman and Gregory Puleo.

Date and Time: 
Friday, May 31, 2019 - 1:15pm
Venue: 
Packard 202

IT Forum & ISL Colloquium present "Extracting information from cells"

Topic: 
Extracting information from cells
Abstract / Description: 

Genomic approaches to studying the molecular biology of the cell have advanced considerably over the past few years, and it is now possible to make high-throughput measurements of molecules in individual cells. I will discuss some of the considerations that must be taken into account in designing experiments, and subsequently in extracting the maximum information from them (optimally).

Date and Time: 
Friday, May 24, 2019 - 1:15pm
Venue: 
Packard 202

IT Forum welcomes Jiantao Jiao (UC Berkeley)

Topic: 
Deconstructing Generative Adversarial Networks
Abstract / Description: 

We deconstruct Generative Adversarial Networks (GANs) into three fundamental problems to study: formulation, generalization, and optimization. We propose systematic principles to formulate the population goals of GANs (when infinite samples are available), and reveal and further develop connections between GANs and robust statistics. We provide principled methods to achieve the population formulations of GANs given finite samples with small generalization error, and demonstrate the intricacy of moving from infinite samples to finite samples in statistical error. We show through examples the importance of solving the inner maximization problem before the outer minimization problem, and demonstrate that embedding the knowledge of the solution of the inner maximization problem could make a locally unstable algorithm globally stable. Joint work with Banghua Zhu and David Tse.

Date and Time: 
Tuesday, May 21, 2019 - 10:00am
Venue: 
Packard 214

IT Forum presents "Information Theory, Power, and Competition"

Topic: 
Information Theory, Power, and Competition
Abstract / Description: 

In our current age of "Techlash," there is a daily stream of headlines demanding reforms to the major platforms on the Internet, including calls for antitrust action to structurally break up big tech companies or, at minimum, levy billions of dollars in fines against them. In this talk, we'll take a step back to look at lessons from the 1956 Bell Labs Consent Decree as a startling and relevant case for how FTC antitrust action sought to protect and promote innovation. We'll survey Bell Labs' legacy as the birthplace of modern computing, starting with Shannon's Mathematical Theory of Communication and Shockley's transistor, and examine how the thousands of inventions that were licensed from Bell Labs after the Consent Decree shaped Silicon Valley. We ask, can a page of history revise our current understanding of innovation and competition and reshape the goals of the next generation of startups?

Date and Time: 
Friday, May 10, 2019 - 1:15pm
Venue: 
Packard 202

#StanfordToo: A Conversation about Sexual Harassment in Our Academic Spaces

Topic: 
#StanfordToo: A Conversation about Sexual Harassment in Our Academic Spaces
Abstract / Description: 

Individuals of all genders are invited to be a part of:
#StanfordToo: A Conversation about Sexual Harassment in Our Academic Spaces, where we will feature real stories of harassment in academic STEM at Stanford in a conversation with Provost Drell, Dean Minor (SoM), and Dean Graham (SE3). We will have plenty of time for audience discussion on how we can take concrete action to dismantle this culture and actively work towards a more inclusive Stanford for everyone. While our emphasis is on STEM fields, we welcome and encourage participation from students, postdocs, staff, and faculty of all academic disciplines and backgrounds.

Date and Time: 
Friday, April 19, 2019 - 3:30pm
Venue: 
STLC 111

IEEE IT Society, Santa Clara Valley Chapter presents Irena Fischer-Hwang

Topic: 
The future of lossy image compression: what machines can learn from humans
Abstract / Description: 

The availability of massive public image datasets appears to have hardly been exploited in image compression. In this work, we present a novel framework for image compression based on human image generation and publicly available images as "side information." Our framework consists of one human who describes images using text instructions to another, who is tasked with reconstructing the original image to the first human's satisfaction. These image reconstructions were then rated by human scorers on the Amazon Mechanical Turk platform and compared to reconstructions obtained by existing image compressors. While this setup lacks certain components typical of traditional compressors, the insights gained from these experiments offer a new perspective on designing image compressors of the future.

 

The Santa Clara Valley chapters of the IEEE Information Theory and Signal Processing societies are co-sponsoring this event.

Date and Time: 
Wednesday, May 1, 2019 - 6:00pm
Venue: 
Packard 202
