Over the past two decades, the storage capacity and access bandwidth of main memory have improved tremendously, by 128x and 20x, respectively. These improvements are mainly due to the continuous technology scaling of DRAM (dynamic random-access memory), which has been used as the physical substrate for main memory. In stark contrast with capacity and bandwidth, DRAM latency has remained almost constant, reducing by only 1.3x in the same time frame. Therefore, long DRAM latency continues to be a critical performance bottleneck in modern systems. Increasing core counts, and the emergence of increasingly more data-intensive and latency-critical applications further stress the importance of providing low-latency memory accesses.
In this talk, we will identify three main problems that contribute significantly to long latency of DRAM accesses. To address these problems, we show that (1) augmenting DRAM chip architecture with simple and low-cost features, and (2) developing a better understanding of manufactured DRAM chips together leads to significant memory latency reduction. Our new proposals significantly improve both system performance and energy efficiency.