Many spectral and polarimetric cameras implement complex spatial, temporal, and spectral re-mapping strategies to measure a signal within a given use case's specifications and error tolerances. This re-mapping creates a complex tradespace that is challenging to navigate, driven in part by the limited degrees of freedom available in inorganic detector technology. This presentation gives an overview of a new kind of organic detector and pixel architecture that enables single-pixel tandem detection of both spectrum and polarization. By exploiting organic detectors' semitransparency and intrinsic anisotropy, the detector minimizes spatial and temporal resolution tradeoffs while showcasing thin-film polarization control strategies.
Implicitly defined, continuous, differentiable signal representations parameterized by neural networks have emerged as a powerful paradigm, offering many possible benefits over conventional representations. However, current network architectures for such implicit neural representations are incapable of modeling signals with fine detail, and they fail to represent a signal's spatial and temporal derivatives, despite the fact that these are essential to many physical signals defined implicitly as the solution to partial differential equations. In this talk, we describe how sinusoidal representation networks, or SIRENs, are ideally suited for representing complex natural signals and their derivatives. Using SIRENs, we demonstrate the representation of images, wavefields, video, sound, and their derivatives. Further, we show how SIRENs can be leveraged to solve challenging boundary value problems, such as particular Eikonal equations (yielding signed distance functions), the Poisson equation, and the Helmholtz and wave equations. While SIRENs can be used to fit signals and their derivatives, we also introduce a new framework for solving integral equations using implicit neural representation networks. Our automatic integration framework, AutoInt, enables the calculation of any definite integral with two evaluations of a neural network. We apply our approach for efficient integration to the problem of neural volume rendering. Finally, we present a novel architecture and training procedure able to fit data such as gigapixel images or highly detailed 3D geometry, demonstrating that these neural representations are now ready to be used in large-scale scenarios.
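The core idea behind a SIREN is a multilayer perceptron whose activations are sines, which makes the network and all of its derivatives smooth. The following is a minimal NumPy sketch of such a network; the layer sizes, the frequency factor `w0 = 30`, and the variable names are illustrative choices rather than values taken from this abstract, though the initialization bounds follow the scheme described in the SIREN paper.

```python
import numpy as np

def siren_init(fan_in, fan_out, w0, is_first, rng):
    # SIREN-style initialization: uniform in [-1/fan_in, 1/fan_in] for the
    # first layer, and [-sqrt(6/fan_in)/w0, sqrt(6/fan_in)/w0] afterwards,
    # which keeps pre-activations in a well-behaved range.
    bound = 1.0 / fan_in if is_first else np.sqrt(6.0 / fan_in) / w0
    return rng.uniform(-bound, bound, size=(fan_in, fan_out))

def siren_forward(x, weights, biases, w0=30.0):
    # Each hidden layer applies sin(w0 * (x @ W + b)); the periodic
    # activation is what lets the representation capture fine detail
    # and well-defined derivatives of the fitted signal.
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.sin(w0 * (h @ W + b))
    return h @ weights[-1] + biases[-1]  # linear output layer

rng = np.random.default_rng(0)
dims = [2, 64, 64, 1]  # e.g. (x, y) pixel coordinates -> grayscale value
weights = [siren_init(dims[i], dims[i + 1], 30.0, i == 0, rng)
           for i in range(len(dims) - 1)]
biases = [np.zeros(d) for d in dims[1:]]

coords = rng.uniform(-1, 1, size=(5, 2))  # five query coordinates
out = siren_forward(coords, weights, biases)
print(out.shape)  # (5, 1)
```

In practice the weights would be trained (in the paper, with PyTorch) so that the network output matches the target signal at the queried coordinates; the sketch above only shows the forward pass and initialization.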
A fundamental limit to human vision is our ability to sense variations in light intensity over space and time. These limits have been formalized in the spatio-temporal contrast sensitivity function, which is now a foundation of vision science. This function has also proven foundational to much applied vision science, providing guidance on spatial and temporal resolution for modern imaging technology. The Pyramid of Visibility is a simplified model of the human spatio-temporal luminance contrast sensitivity function (Watson & Ahumada, 2016). It posits that log sensitivity is a linear function of spatial frequency, temporal frequency, and log mean luminance. It is valid only away from the spatio-temporal frequency origin. It has recently been extended to peripheral vision to define the Field of Contrast Sensitivity (Watson, 2018). Though very useful in a range of applications, the pyramid would benefit from an extension to the chromatic domain. In this talk I will describe our efforts to develop this extension. Among the issues we address are the choice of color space, the definition of color contrast, and how to combine sensitivities among luminance and chromatic pyramids.
Watson, A. B. (2018). "The Field of View, the Field of Resolution, and the Field of Contrast Sensitivity." Journal of Perceptual Imaging 1(1): 10505-1-10505-11.
Watson, A. B. and A. J. Ahumada (2016). "The pyramid of visibility." Electronic Imaging 2016(16): 1-6.
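The pyramid's linear form is simple enough to state directly: log sensitivity is modeled as a weighted sum of spatial frequency, temporal frequency, and log mean luminance. The sketch below illustrates that structure; the coefficient values are placeholders chosen for illustration, not fitted values from the cited papers.

```python
def log_sensitivity(spatial_freq, temporal_freq, log_luminance,
                    c0=2.0, c_w=-0.05, c_f=-0.03, c_l=0.3):
    # Pyramid of Visibility: log S is linear in spatial frequency (c/deg),
    # temporal frequency (Hz), and log mean luminance. Negative frequency
    # coefficients and a positive luminance coefficient give the model its
    # pyramid shape away from the spatio-temporal frequency origin.
    return c0 + c_w * spatial_freq + c_f * temporal_freq + c_l * log_luminance

# Sensitivity falls linearly (in log units) as either frequency rises:
s_low = log_sensitivity(spatial_freq=2, temporal_freq=1, log_luminance=2)
s_high = log_sensitivity(spatial_freq=20, temporal_freq=1, log_luminance=2)
print(s_low > s_high)  # True
```

The chromatic extension discussed in the talk would presumably fit separate pyramids of this form for chromatic channels and then combine them with the luminance pyramid, which is where the questions of color space and contrast definition arise.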
In this talk, I will show several recent results of my group on learning neural implicit 3D representations, departing from the traditional paradigm of representing 3D shapes explicitly using voxels, point clouds, or meshes. Implicit representations have a small memory footprint and allow for modeling arbitrary 3D topologies at (theoretically) arbitrary resolution in continuous function space. I will show the capabilities and limitations of these approaches in the context of reconstructing 3D geometry, texture, and motion. I will further demonstrate a technique for learning implicit 3D models using only 2D supervision through implicit differentiation of the level set constraint. Finally, I will demonstrate how implicit models can tackle large-scale reconstructions, and I will introduce GRAF and GIRAFFE, generative 3D models for neural radiance fields that are able to generate 3D-consistent photo-realistic renderings from unstructured and unposed image collections.
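The essence of an implicit shape representation is a function that can be queried at any continuous 3D point, rather than a discrete grid or mesh. The toy sketch below shows this querying interface; the "network" is replaced by an analytic unit sphere purely for illustration, where a learned model would be a neural network with the same signature.

```python
import numpy as np

def occupancy(points):
    # Implicit occupancy field f: R^3 -> [0, 1]. Values above 0.5 mean
    # "inside the shape". Here the shape is a unit sphere encoded by a
    # smooth sigmoid of the distance to the surface; a trained occupancy
    # network would replace this analytic rule.
    d = np.linalg.norm(points, axis=-1)
    return 1.0 / (1.0 + np.exp(8.0 * (d - 1.0)))

pts = np.array([[0.0, 0.0, 0.0],   # center of the sphere: inside
                [2.0, 0.0, 0.0]])  # well outside the sphere
occ = occupancy(pts)
print(occ[0] > 0.5, occ[1] < 0.5)  # True True
```

Because the field is defined everywhere, resolution is limited only by the capacity of the function, not by a voxel grid, and a mesh can be extracted afterwards at any desired resolution (e.g. via marching cubes on the 0.5 level set).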
Wouldn't it be fascinating to be in the same room as Abraham Lincoln, visit Thomas Edison in his laboratory, or step onto the streets of New York a hundred years ago? We explore this thought experiment by tracing ideas from science fiction through antique stereographs to the latest work in generative adversarial networks (GANs), stepping back in time to experience these historical people and places not in black and white, but much closer to how they really appeared. In the process, I'll present our latest work on Keystone Depth and Time Travel Rephotography.
Forensic DNA analysis has been critical in prosecuting crimes and overturning wrongful convictions. At the same time, other physical and digital forensic identification techniques, used to link a suspect to a crime scene, are plagued with problems of accuracy, reliability, and reproducibility. Flawed forensic science can have devastating consequences: the National Registry of Exonerations identified that flawed forensic techniques contribute to almost a quarter of wrongful convictions in the United States. I will describe our recent efforts to examine the reliability of two such photographic forensic identification techniques: (1) identification based on purported distinct patterns in clothing; and (2) identification based on measurements of height and weight.
Many descriptions of trench warfare in World War I characterize it as "months of boredom punctuated by moments of extreme terror" (Guy's Hospital Gazette (1914) and The New York Times Current History of the European War (1915), among others). Those responsible for public safety today can relate to this description. Critical decisions in public safety often have to be made with incomplete and uncertain information under varying degrees of stress, and life and death hang in the balance as a consequence of many of these decisions. The gravity of the task attracts a considerable set of requirements for documentation. Approximately 30% of a first responder's time is spent on documentation. A 911 call-taker works across multiple screens running different applications, entering the same data many times. Post-incident investigations require multiple sources of data to be integrated manually, and roughly 80% of the aggregated case files likely contain errors. Assisting in the monotonous, time-consuming, and error-prone task of filling out forms and documenting events can save lives. I'll talk about how automatic speech recognition, language understanding, object and activity recognition in video, and effective UX design can help with this problem.