Penn State is part of a $4 million multi-institution grant from the National Science Foundation aimed at identifying innovative solutions for societal, scientific and industry challenges through strategic data science partnerships. As part of the Northeast Big Data Innovation Hub, Penn State will continue to collaborate on big data projects that are too large and complex for individual organizations to manage independently.
Hosted by Columbia University’s Data Science Institute, the Northeast Hub was launched in 2015 through a $1.25 million National Science Foundation grant. The additional four years of funding will allow the Hub to strengthen its role in fostering regional networks of stakeholders and support big data projects that address the priorities of the northeastern United States, the nation and the world.
“The Northeast Big Data Hub provides a unique platform for research teams that leverage the expertise and resources of multiple institutions in the region to harness the power and potential of data to address pressing regional and national challenges,” said Vasant Honavar, professor and Edward Frymoyer Chair of Information Sciences and Technology, and co-principal investigator on the project.
The additional funding will support translational data science projects, such as improving education through big data and integrating health data from traditional and novel sources, as well as provide initial funding for the early exploration of new projects. It will also allow the Hub to continue its collaboration with six Big Data Spokes – multi-sector projects that convene additional members in support of Hub priorities.
“Accelerating science in many domains calls for the development of advanced artificial intelligence that can partner effectively with humans on all aspects of science – from formulating questions and hypotheses to designing and executing experiments, acquiring and analyzing data, to integrating results of studies with existing scientific knowledge,” said Honavar.
One such study is the NSF-funded Virtual Data Collaboratory, an effort led by researchers at Penn State and Rutgers to develop a federated data and computing infrastructure to support research that transcends institutional and disciplinary boundaries. This work will also enrich the initiatives of Penn State’s Institute for CyberScience, where Honavar serves as associate director, to accelerate scientific discovery and enable new forms of discovery.
“The scientific community needs access to technological resources that can scale up to meet the ever-increasing complex challenges that researchers take on every day,” said Jenni Evans, director of ICS and professor of meteorology and atmospheric science. “This project represents a major step in fostering both the leadership and partnerships necessary to help researchers as they explore the most intriguing scientific questions of our day, seeking solutions to the most pressing problems facing our world.”
Since its inception, the Northeast Big Data Hub has built more than 90 partnerships that bring together data science leaders and practitioners from academia, industry, government and nonprofit organizations. Through these partnerships, the Northeast Hub has shared resources, insights and knowledge across more than 200 organizations to explore data-driven solutions in four areas of focus (education, health, the rural/urban spectrum and science) that address four overarching themes: data literacy, data sharing, responsible data science, and privacy and security.
The Hub has convened 21 cross-sector workshops and fostered collaborations that led to six funded Big Data Spoke projects – including one focused on the integration of environmental factors and causal reasoning approaches for large-scale observational health research, an effort co-led by Penn State. It also has led four multi-sector, collaborative planning projects, all aimed at addressing regional and national challenges with data-driven innovation.
“The Big Data Hub has built an extensive network of data science experts and stakeholders from academia, industry and local government across the northeast,” said Jeannette M. Wing, the Hub’s principal investigator and Avanessians Director of the Data Science Institute at Columbia University. “The new NSF grant will allow us to expand this work in two ways: first, by addressing cross-cutting themes on data privacy and data ethics, to ensure positive social impact; and second, by coordinating with the three other regional hubs toward a national network of data science institutions.”
Honavar, who also serves as the director of Penn State’s Center for Big Data Analytics and Discovery Informatics, is part of the Northeast Hub’s Executive Committee. He also leads the science focus and co-leads an exploratory project aimed at incorporating real-time environmental data to make targeted recommendations to individuals at risk.
The Northeast Big Data Innovation Hub is one of four regional hubs that form a national big data innovation ecosystem.
“The Hub offers institutions like Penn State a platform to engage in ambitious data science research projects on a regional or national scale that require expertise and resources beyond those available at any single institution,” said Honavar. “While each Hub is shaped by the unique opportunities and challenges offered by the region it serves, they also share some of the priorities. This offers opportunities for partnerships between two or more Hubs on projects at a national scale.”
In addition to Honavar and Wing, the executive committee of the Northeast Hub includes René Bastón, Columbia University; James Hendler, Rensselaer Polytechnic Institute; and Andrew McCallum, University of Massachusetts at Amherst.