Too much of a good thing. That’s the situation many scientists face in this age of Big Data.
With sophisticated sensors and surveys gathering information all the time, everywhere, the pile of data available to researchers is growing at a dizzying pace, so fast that in many cases it has outstripped our ability to make sense of it.
Thanks to a new data center at Penn State, researchers can now analyze huge amounts of information and complex models that were grindingly slow or impossible to handle before. The 49,500-square-foot facility hosts 23,500 computer cores. A typical desktop computer has two cores.
“The data center enables us to provide world-class computation in an energy-efficient and economical way,” says meteorologist Jenni Evans, director of the Institute for CyberScience (ICS), which is responsible for the center’s research computing component. “Instead of having computer clusters all over campus, we put it all in this secure facility where researchers can share resources.”
One research group is gearing up to employ a computer cluster called the Cyber-Laboratory for Astronomy, Materials, and Physics (CyberLAMP). Led by astrophysicist Yuexing Li and funded by a $1 million grant from the National Science Foundation, the team “includes astronomers, physicists, materials scientists, and computer scientists, working on an incredible range of scales — from nanomaterials to planets in other galaxies,” says Evans.
The computing power of the data center will help astronomer Eric Ford better understand planet masses and orbits and predict where to look for planets that might be habitable.
“Lab experiments only go so far,” Ford says. “They let us measure the temperatures at which gas condenses into ice and determine which combinations of grains, pebbles, and rock fragments collide to merge into a larger body, as opposed to bouncing off each other or shattering into smaller particles. But we can’t create a solar system in a lab.”
What he and his team can do, with help from the data center, is integrate what they know about basic physics into computer models that simulate planetary system formation. “We can compute predictions of different models for planet formation and compare those to observations to test hypotheses for how planets are formed,” Ford says. “The CyberLAMP cluster will let us create much more sophisticated simulations with much greater realism.”
At the other end of the scale, fellow astronomer Doug Cowen will soon start using CyberLAMP to study neutrinos, the smallest sub-atomic particles known. Sometimes called “ghost particles,” neutrinos are everywhere in the universe, and understanding them can help scientists answer fundamental questions in physics. Cowen is a member of the international IceCube project that uses neutrino detectors embedded up to 2 kilometers deep in Antarctic ice.
“Cosmic rays crash into our atmosphere, and in those cosmic ray showers you have neutrinos,” he says. “Most of them keep going straight through the Earth, but rarely, some interact with matter in our detector and produce tiny amounts of light.”
The South Pole ice cap, where the crystal-clear ice allows even the tiniest flashes of light to be detected, serves as an ideal spot to record these rare interactions. The data center’s 100 graphics processing units (GPUs) will allow Cowen and his colleagues to rapidly analyze more neutrinos, in much finer detail, than has been possible before. “Right now, if we want to reconstruct one year’s worth of data, it takes a couple of months,” he says. “It will be an enormous benefit to shrink that down to a few days.”
The data center’s ability to integrate data and models from many sources is essential to a multi-institution project headed by Penn State and Stanford University. Funded by a $20 million U.S. Department of Energy grant, the Program on Coupled Human and Earth Systems aims to develop tools to assess how stresses in a natural system (such as a major hurricane or drought) or a human system (such as dramatic population growth) affect other systems, such as energy infrastructure, water supply, and food production.
“Current models for understanding the impacts of climate-related variability or other natural disasters deal with their effects on one or a few parts of other systems,” says economist Karen Fisher-Vanden, Penn State’s lead investigator on the project. “Our project emphasizes the interconnectedness of all the systems, and how stresses on one can reverberate through all the others. To do that by combining our standalone models from different fields would be a computational nightmare. Our goal is to create a state-of-the-art framework of computational tools that allows us to model the interconnected ‘system of systems,’ so sharing data across all those fields will be seamless.”
ICS director Evans expects the data center to play an ever-expanding role for Penn State researchers in coming years. “Sciences such as astronomy and meteorology have a history of using Big Data,” she says. “Now, new instrumentation and data availability are bringing new research areas, such as biology and political science, into the Big Data venue. The data center will continue to be integral as it provides an incredible leveraging of shared computing facilities.”
This story first appeared in the Fall 2017 issue of Research/Penn State magazine.