Data Science

This research covers all aspects of data and information, including how to collect, store, organize, search, and analyze it. Interest in information management has surged recently because huge volumes of data are now available from sources such as web query logs, Twitter posts, blogs, satellites, sensors, and medical devices. The interest stems not only from the volume but also from a paradigm shift in the way data is used. In the past, data was used to verify hypotheses; today, mining data for patterns and trends leads to new hypotheses. The more data available, the finer and more sophisticated these hypotheses can be. Research topics include:

  • Managing uncertain/approximate data and models;
  • Tracking data lineage;
  • Causality;
  • Automated data cleansing (e.g., entity resolution, graph alignment);
  • Scalable self-tuning optimization, machine learning, and data mining systems;
  • Algorithms for analysis of large, dynamic networks;
  • Next-generation distributed large-scale computing and simulation environments;
  • Domain-specific languages and optimization systems for big data analytics;
  • Scalable analysis tools for automatic bug detection;
  • Data visualization.