CIF: Small: Collaborative Research: A Unifying Approach for Identification of Sparse Interactions in Large Datasets

Sponsor: National Science Foundation

Award Number: 1320566

PI: Venkatesh Saligrama

Co-I/Co-PI:

Abstract:

More than 2.5 quintillion bytes of data are created daily in the form of sensor measurements, web posts and clicks, surveillance videos, purchase transactions, and health-care records. However, not all of the data collected is informative, and not all features are relevant to the outcomes of interest. While several researchers have focused on compressive sampling for minimum-error data reconstruction to improve data storage and acquisition, the objective of this research is broader: salient feature discovery. The key insight is sparsity, namely, that there is a tight coupling between a small set of relevant observations and the outcomes of interest. The research therefore focuses on identifying the sparse set of observations that are most relevant and essential to predicting the outcomes.

Th