IT-Forum

IT Forum presents "Sub-packetization of Minimum Storage Regenerating codes"

Topic: 
Sub-packetization of Minimum Storage Regenerating codes: A lower bound and a work-around
Abstract / Description: 

Modern cloud storage systems need to store vast amounts of data in a fault-tolerant manner, while also preserving data reliability and accessibility in the wake of frequent server failures. Traditional MDS (Maximum Distance Separable) codes provide the optimal trade-off between redundancy and the number of worst-case erasures tolerated. Minimum storage regenerating (MSR) codes are a special sub-class of MDS codes that provide mechanisms for exact regeneration of a single code-block by downloading the minimum amount of information from the remaining code-blocks. As a result, MSR codes are attractive for use in distributed storage systems to ensure node repairs with optimal repair bandwidth. However, all known constructions of MSR codes require large sub-packetization levels (a measure of the granularity into which a single vector codeword symbol must be divided for efficient repair). This restricts the applicability of MSR codes in practice.
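
For concreteness, the standard cut-set figures (notation not taken from the abstract): in an (n, k) MDS code where each node stores \ell symbols (\ell being the sub-packetization), conventional repair of one failed node reads k full nodes, i.e. k\ell symbols, whereas repair from d helper nodes needs only

    d \cdot \ell / (d - k + 1)

symbols in total, which equals (n-1)\ell/(n-k) when all d = n-1 surviving nodes participate. MSR codes are precisely the MDS codes that meet this bound.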

This talk will present a lower bound showing that exponentially large sub-packetization is inherent to MSR codes. We will also propose a natural relaxation of MSR codes that allows one to circumvent this lower bound, and present a general approach to construct MDS codes that significantly reduces the required sub-packetization level while incurring a slightly higher repair bandwidth than MSR codes.

The lower bound is joint work with Omar Alrabiah, and the constructions are joint work with Ankit Rawat, Itzhak Tamo, and Klim Efremenko.

Date and Time: 
Wednesday, February 20, 2019 - 2:00pm
Venue: 
Gates 463A

IT Forum presents "Adapting Maximum Likelihood Theory in Modern Applications"

Topic: 
Adapting Maximum Likelihood Theory in Modern Applications
Abstract / Description: 

Maximum likelihood estimation (MLE) is influential because it can be easily applied to generate optimal, statistically efficient procedures for broad classes of estimation problems. Nonetheless, the theory does not apply to modern settings --- such as problems with computational, communication, or privacy considerations --- where our estimators have resource constraints. In this talk, I will introduce a modern maximum likelihood theory that addresses these issues, focusing specifically on procedures that must be computationally efficient or privacy-preserving. To do so, I first derive analogues of Fisher information for these applications, which allow a precise characterization of the tradeoffs between statistical efficiency, privacy, and computation. To complete the development, I also describe a recipe that generates optimal statistical procedures (analogues of the MLE) in the new settings, showing how to achieve the new Fisher information lower bounds.
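
As background (standard facts, not taken from the abstract): for n i.i.d. samples from a density p_\theta, the classical Fisher information is

    I(\theta) = E_\theta[ (\partial_\theta \log p_\theta(X))^2 ],

and the Cramér-Rao bound states that any unbiased estimator satisfies Var(\hat{\theta}) \ge 1/(n I(\theta)), a limit the MLE attains asymptotically. The "analogues of Fisher information" in the talk play the role of I(\theta) when the estimator must additionally be computationally efficient or privacy-preserving.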

Date and Time: 
Friday, February 22, 2019 - 1:15pm
Venue: 
Packard 202

IT Forum presents "Student Evaluations, Quantifauxcation, and Gender Bias"

Topic: 
Student Evaluations, Quantifauxcation, and Gender Bias
Abstract / Description: 

Student evaluations of teaching (SET) are widely used in academic personnel decisions as a measure of teaching effectiveness. The way SET are used is statistically unsound--but worse, SET are biased and unreliable. Observational evidence shows that student ratings vary with instructor gender, ethnicity, and attractiveness; with course rigor, mathematical content, and format; and with students' grade expectations. Experiments show that the majority of student responses to some objective questions can be demonstrably false. A recent randomized experiment shows that giving students cookies increases SET scores. Randomized experiments show that SET are negatively associated with objective measures of teaching effectiveness and biased against female instructors by an amount that can cause more effective female instructors to get lower SET than less effective male instructors. Gender bias also affects how students rate objective aspects of teaching. It is not possible to adjust for the bias, because it depends on many factors, including course topic and student gender. Students are uniquely situated to observe some aspects of teaching and students' opinions matter. But for the purposes of evaluating and improving teaching quality, SET are biased, unreliable, and subject to strategic manipulation. Reliance on SET for employment decisions disadvantages protected groups and may violate federal law. For some administrators, risk mitigation might be a more persuasive argument than equity for ending reliance on SET in employment decisions: union arbitration and civil litigation over institutional use of SET are on the rise. Several major universities in the U.S. and Canada have already de-emphasized, substantially re-worked, or abandoned reliance on SET for personnel decisions.

Date and Time: 
Friday, February 8, 2019 - 1:15pm
Venue: 
Packard 202

IT Forum welcomes Paul Cuff, Renaissance Technologies

Topic: 
TBA
Abstract / Description: 

Soft covering is a phenomenon whereby an i.i.d. distribution on sequences of a given length is approximately produced from a very structured generation process. Specifically, a sequence is drawn uniformly at random from a "codebook" of sequences and then corrupted by memoryless noise (i.e. a discrete memoryless channel, or DMC). Among other things, soft covering means that the codebook is not recognizable in the resulting distribution. The soft covering phenomenon occurs when the codebook itself is constructed randomly, with a correspondence between the codebook distribution, the DMC, and the target i.i.d. distribution, and when the codebook is large enough. Mutual information is the minimum exponential rate for the codebook size. We show the exact exponential rate of convergence of the approximating distribution to the target, as measured by total variation distance, as a function of the excess codebook rate above mutual information. The proof involves a novel Poisson approximation step in the converse.
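
In standard notation (not taken from the abstract): given a codebook C = {x^n(1), ..., x^n(M)} and a DMC W, the induced output distribution is

    P_{Y^n}(y^n) = (1/M) \sum_{m=1}^{M} \prod_{i=1}^{n} W(y_i | x_i(m)),

and the soft covering (channel resolvability) statement is that when the codewords are drawn i.i.d. from P_X and the rate (1/n) \log M exceeds I(X;Y), the expected total variation distance between P_{Y^n} and the i.i.d. target Q_Y^{\otimes n} (Q_Y being the output distribution of W under input P_X) vanishes, and does so exponentially fast in n.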

Soft covering is a crucial tool for secrecy capacity proofs and has applications broadly in network information theory for analysis of encoder performance. Wyner invented this tool for the purpose of solving a problem he named "common information." The quantum analogue of Wyner's problem is "entanglement of purification," which is an important open problem with physical significance: What is the minimum entanglement needed to produce a desired quantum state spanning two locations? The literature on this problem identifies sufficient asymptotic rates for this question that are generally excessively high. I will make brief mention of how the soft covering principle might be utilized for a more efficient design in this quantum setting.

Date and Time: 
Friday, January 25, 2019 - 1:15pm
Venue: 
Packard 202

IT Forum presents "Understanding the limitations of AI: When Algorithms Fail"

Topic: 
Understanding the limitations of AI: When Algorithms Fail
Abstract / Description: 

Automated decision-making tools are currently used in high-stakes scenarios. From natural language processing tools used to automatically determine one's suitability for a job, to health diagnostic systems trained to determine a patient's outcome, machine learning models are used to make decisions that can have serious consequences on people's lives. In spite of the consequential nature of these use cases, vendors of such models are not required to perform specific tests showing the suitability of their models for a given task. Nor are they required to provide documentation describing the characteristics of their models, or to disclose the results of algorithmic audits to ensure that certain groups are not unfairly treated. I will show some examples examining the dire consequences of basing decisions entirely on machine-learning-based systems, and discuss recent work on auditing and exposing the gender and skin-tone bias found in commercial gender classification systems. I will end with the concept of an AI datasheet to standardize information for datasets and pre-trained models, in order to push the field as a whole towards transparency and accountability.

Date and Time: 
Friday, January 18, 2019 - 1:15pm
Venue: 
Packard 202

IT Forum & ISL present "Functional interfaces to compression (or: Down With Streams!)"

Topic: 
Functional interfaces to compression (or: Down With Streams!)
Abstract / Description: 

From a computer-science perspective, the world of compression can seem like an amazing country glimpsed through a narrow straw. The problem isn't the compression itself, but the typical interface: a stream of symbols (or audio samples, pixels, video frames, nucleotides, or Lidar points...) goes in, an opaque bitstream comes out, and on the other side, the bitstream is translated back into some approximation of the input. The coding and decoding modules maintain an internal state that evolves over time. In practice, these "streaming" interfaces with inaccessible mutable state have limited the kinds of applications that can be built.

In this talk, I'll discuss my group's experience with what can happen when we build applications around compression systems that expose a "functional" interface, one that makes state explicit and visible. We've found it's possible to achieve tighter couplings between coding and the rest of an application, improving performance and allowing compression algorithms to be used in settings where they were previously infeasible. In Lepton (NSDI 2017), we implemented a Huffman and a JPEG encoder in purely functional style, allowing the system to compress images in parallel across a distributed network filesystem with arbitrary block boundaries (e.g., in the middle of a Huffman symbol or JPEG block). This free-software system is in production at Dropbox and has compressed, by 23%, hundreds of petabytes of user files. ExCamera (NSDI 2017) uses a purely functional video codec to parallelize video encoding into thousands of tiny tasks, each handling a fraction of a second of video, much shorter than the interval between key frames, and executing with 4,000-way parallelism on AWS Lambda. Salsify (NSDI 2018) uses a purely functional video codec to explore execution paths of the encoder without committing to them, letting it match the capacity estimates from a transport protocol. This architecture outperforms "streaming"-oriented video applications -- Skype, FaceTime, Hangouts, WebRTC -- in delay and visual quality. I'll briefly discuss some of our ongoing work in trying to compress the communication between a pair of neural networks jointly trained to accomplish some goal, e.g. to support efficient evaluation when data and compute live in different places. In general, our findings suggest that while, in some contexts, improvements in codecs may have reached a point of diminishing returns, compression *systems* still have plenty of low-hanging fruit.
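
To make the interface distinction concrete, here is a toy sketch in Python (hypothetical, and not the actual Lepton, ExCamera, or Salsify code): a delta coder whose only state is the previous sample, written first as a conventional streaming object with hidden mutable state and then as a pure function that takes and returns the state explicitly.

    # Toy illustration only; not an actual codec API.

    class StreamingDeltaEncoder:
        """Streaming interface: the coder's state is internal and mutable."""
        def __init__(self):
            self._prev = 0                     # hidden state, updated on every call
        def feed(self, sample):
            delta = sample - self._prev        # emit the difference from the previous sample
            self._prev = sample
            return delta                       # the caller cannot snapshot or rewind _prev

    def encode_delta(prev, sample):
        """Functional interface: the state (prev) is an explicit argument and return value."""
        return sample, sample - prev           # (new_state, output)

    # With explicit state, a caller can fork from the same point in the stream,
    # e.g. to try a speculative encoding and then discard it:
    state = 0
    state, out1 = encode_delta(state, 10)      # out1 == 10
    alt_state, alt = encode_delta(state, 99)   # speculative branch, never committed
    state, out2 = encode_delta(state, 13)      # committed branch: out2 == 3

The same idea, applied to real codec state (probability models, reference frames), is what lets a system restart mid-file at arbitrary block boundaries or explore encoder execution paths without committing to them, as the abstract describes for Lepton and Salsify.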

Date and Time: 
Friday, January 11, 2019 - 1:15pm
Venue: 
Packard 202

John G. Linvill Distinguished Seminar on Electronic Systems Technology

Topic: 
Internet of Things and Internet of Energy for Connecting at Any Time and Any Place
Abstract / Description: 

In this presentation, I would like to discuss with you how to establish a sustainable and smart society through the Internet of Energy for connecting at any time and any place. I suspect that you have heard the phrase "Internet of Energy" less often. The meaning of this phrase is simple: because of a ubiquitous energy transmission system, you do not need to worry about a shortage of electric power. One of the most important items for establishing a sustainable society is [...]


"Inaugural Linvill Distinguished Seminar on Electronic Systems Technology," EE News, July 2018


Date and Time: 
Monday, January 14, 2019 - 4:30pm
Venue: 
Hewlett 200

CANCELLED! ISL & IT Forum present "Bayesian Suffix Trees: Learning and Using Discrete Time Series"

Topic: 
CANCELLED! Bayesian Suffix Trees: Learning and Using Discrete Time Series
Abstract / Description: 

CANCELLED!  We apologize for any inconvenience.

One of the main obstacles in the development of effective algorithms for inference and learning from discrete time series data is the difficulty encountered in the identification of useful temporal structure. We will discuss a class of novel methodological tools for effective Bayesian inference and model selection for general discrete time series, which offer promising results on both small and big data. Our starting point is the development of a rich class of Bayesian hierarchical models for variable-memory Markov chains. The particular prior structure we adopt makes it possible to design effective, linear-time algorithms that can compute most of the important features of the resulting posterior and predictive distributions without resorting to MCMC. We have applied the resulting tools to numerous application-specific tasks, including on-line prediction, segmentation, classification, anomaly detection, entropy estimation, and causality testing, on data sets from different areas of application, including data compression, neuroscience, finance, genetics, and animal communication. Results on both simulated and real data will be presented.
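
As a much simpler illustration of "temporal structure" in this sense (a rough sketch in Python, not the Bayesian hierarchical machinery described above): a variable-memory predictor that stores next-symbol counts for every suffix context up to a fixed depth and predicts from the longest context it has seen.

    # Rough sketch only; not the talk's Bayesian model-selection algorithm.
    from collections import defaultdict, Counter

    def fit_contexts(seq, max_depth=3):
        """Collect next-symbol counts for every suffix context of length <= max_depth."""
        counts = defaultdict(Counter)
        for t in range(len(seq)):
            for d in range(min(max_depth, t) + 1):
                ctx = tuple(seq[t - d:t])      # the d symbols preceding position t
                counts[ctx][seq[t]] += 1
        return counts

    def predict(counts, history, max_depth=3):
        """Predict the next-symbol distribution from the longest matching context."""
        for d in range(min(max_depth, len(history)), -1, -1):
            ctx = tuple(history[len(history) - d:])
            if ctx in counts:
                total = sum(counts[ctx].values())
                return {s: c / total for s, c in counts[ctx].items()}
        return {}

    counts = fit_contexts("abracadabra")
    print(predict(counts, "abra"))             # context ('b','r','a') was always followed by 'c'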

Date and Time: 
Wednesday, December 12, 2018 - 3:00pm
Venue: 
Packard 202

IT Forum presents "Perceptual Engineering"

Topic: 
Perceptual Engineering
Abstract / Description: 

The distance between the real and the digital is clearest at the interface layer. The ways that our bodies interact with the physical world are rich and elaborate, while digital interactions are far more limited. Through an increased level of direct and intuitive interaction, my work aims to raise computing devices from external systems that require deliberate usage to systems that are truly an extension of us, advancing both the state of research and human ability. My approach is to use the entire body for input and output, to allow for implicit and natural interactions. I call my concept "perceptual engineering," i.e., a method to alter the user's perception (or, more specifically, the input signals to their perception) and manipulate it in subtle ways: for example, modifying a user's sense of space, place, balance, and orientation, or manipulating their visual attention, all without the user's explicit input, in order to assist or guide their interactive experience in an effortless way.

I build devices and immersive systems that explore the use of cognitive illusions to manage attention, physiological signals for interaction, deep learning for automatic VR generation, embodiment for remote collaborative learning, tangible interaction for augmenting play, haptics for enhancing immersion, and vestibular stimulation to mitigate motion sickness in VR. My "perceptual engineering" approach has been shown to (1) support implicit and natural interactions with haptic feedback, (2) induce believable physical sensations of motion in VR, (3) provide a novel way to communicate with the user through proprioception and kinesthesia, and (4) serve as a platform to question the boundaries of our sense of agency and trust. For decades, interaction design has been driven to answer the question: how can new technologies allow users to interact with digital content in the most natural way? If we look at the evolution of computing over the last 50 years, interaction has gone from punch cards to mouse and keyboard to touch and voice. Similarly, devices have become smaller and closer to the user's body. With every transition, the things people can do have become more personal. The main question that drives my research is: what is the next logical step?

Date and Time: 
Friday, December 7, 2018 - 1:15pm
Venue: 
Packard 202
