The National Science Foundation (NSF) announced $17.7 million in funding for 12 Transdisciplinary Research in Principles of Data Science (TRIPODS) projects, which will bring together the statistics, mathematics and theoretical computer science communities to develop the foundations of data science. Conducted at 14 institutions in 11 states, these projects will promote long-term research and training activities in data science that transcend disciplinary boundaries.
“Data is accelerating the pace of scientific discovery and innovation,” said Jim Kurose, NSF assistant director for Computer and Information Science and Engineering (CISE). “These new TRIPODS projects will help build the theoretical foundations of data science that will enable continued data-driven discovery and breakthroughs across all fields of science and engineering.”
Technological advances and unprecedented access to computing infrastructure have resulted in an explosion of data from different sources. The availability of these data, in particular their volume and variety and the speed at which they are collected, is transforming research in all fields of science and engineering. Through Harnessing the Data Revolution, one of the “10 Big Ideas for Future NSF Investments,” the foundation seeks to support fundamental research in data-driven science and engineering; shape a cohesive, federated, national-scale approach to research data infrastructure; and develop a 21st-century data-capable workforce.
The TRIPODS awards will enable data-driven discovery through major investments in state-of-the-art mathematical and statistical tools, better data mining and machine learning approaches, enhanced visualization capabilities and more. These awards will build upon NSF’s long history of investments in foundational research, contribute key advances to the emerging data science discipline and support researchers in developing innovative educational pathways to train the next generation of data scientists.
“TRIPODS will accelerate the development of modern foundations of data science through a truly transdisciplinary collaboration between mathematicians, statisticians and theoretical computer scientists, while also creating opportunity for fundamental development to occur in finding solutions to important data science challenges in the domain sciences,” said Jim Ulvestad, NSF acting assistant director for Mathematical and Physical Sciences (MPS).
TRIPODS is a partnership between NSF’s CISE and MPS directorates. NSF’s Established Program to Stimulate Competitive Research (EPSCoR) also co-funded one of the projects.
A portfolio supporting another of NSF’s Big Ideas, Growing Convergent Research, contributed $1.1 million to the new TRIPODS awards, co-funding three of them. Convergence is the integration of knowledge, techniques and expertise from multiple fields to address scientific and societal challenges. To build an ecosystem that truly supports convergent science, NSF seeks to strategically invest in research projects and programs that are motivated by intellectual opportunities and important societal problems. The goal is that everyone, not just scientists and engineers, will benefit from the convergence of the physical sciences, biological sciences, computing, engineering and the social and behavioral sciences.
The TRIPODS Phase I awards announced today will support the development of small collaborative institutes. A future TRIPODS Phase II is planned to support a smaller number of larger institutes. Phase II will select awardees through a second competitive proposal process from among the Phase I institutes, as well as any new collaborative partners Phase I awardees bring on board.
The award titles, principal investigators and institutions for the TRIPODS Phase I projects are listed below:
- UA-TRIPODS: Building Theoretical Foundations for Data Sciences: Hao Zhang, University of Arizona
- Foundations of Model Driven Discovery from Massive Data: Jeffrey Brock, Brown University (Convergence and EPSCoR co-funding)
- Berkeley Institute on the Foundations of Data Analysis: Michael Mahoney, University of California, Berkeley
- TRIPODS: Towards a Unified Theory of Structure, Incompleteness and Uncertainty in Heterogeneous Graphs: Lise Getoor, University of California, Santa Cruz
- From Foundations to Practice of Data Science and Back: John Wright, Columbia University
- TRIPODS: Data Science for Improved Decision-Making: Learning in the Context of Uncertainty, Causality, Privacy, and Network Structures: Kilian Weinberger, Cornell University (Convergence co-funding)
- Transdisciplinary Research Institute for Advancing Data Science (TRIAD): Xiaoming Huo, Georgia Institute of Technology
- Collaborative Research: TRIPODS Institute for Optimization and Learning: Katya Scheinberg, Lehigh University; Han Liu, Northwestern University; Francesco Orabona, State University of New York at Stony Brook
- Institute for Foundations of Data Science (IFDS): Piotr Indyk, Massachusetts Institute of Technology
- Topology, Geometry, and Data Analysis (TGDA@OSU): Discovering Structure, Shape, and Dynamics in Data: Tamal Dey, The Ohio State University
- Algorithms for Data Science: Complexity, Scalability, and Robustness: Sham Kakade, University of Washington
- Institute for Foundations of Data Science: Stephen Wright, University of Wisconsin-Madison (Convergence co-funding)