
Biomedical Data Science Seminar
The coronavirus will likely kill thousands of Americans. But what if I told you about a serious threat to American national security? This emergency comes from climate change and air pollution. To help
address this threat, we have developed an artificial neural network model that uses on-the-ground air-monitoring data and satellite-based measurements to estimate daily pollution levels in each
1-square-kilometer zone across the continental U.S. We have paired this information with health data contained in Medicare claims records from the last 12 years, which include 97% of
the population aged 65 or older. We also developed statistical methods for causal inference and computationally efficient algorithms for the analysis of over 550 million health records. The result?
This data science platform is telling us that federal limits on the nation's most widespread air pollutants are not stringent enough. Our research shows that short- and long-term exposure to air
pollution is killing thousands of senior citizens each year. This work also highlights the critical new role of data science in public health and the associated methodological challenges. For example, with
enormous amounts of data, the threat of unmeasured confounding bias is amplified, and causality is even harder to assess with observational studies. We will discuss these and other challenges.
Suggested Reading
- "Evaluating the impact of long-term exposure to fine particulate matter on mortality among the elderly," https://advances.sciencemag.org/content/6/29/eaba5692
- "Exposure to air pollution and COVID-19 mortality in the United States: A nationwide cross-sectional study," https://www.medrxiv.org/content/10.1101/2020.04.05.20054502v2
- "Inequalities in air pollution exposure are increasing in the United States," https://www.medrxiv.org/content/10.1101/2020.07.13.20152942v1