When scientists run experiments — whether physically smashing atoms at the Large Hadron Collider or virtually simulating future weather — the output is often a huge set of numbers incomprehensible to the ordinary human brain. To tame the data and put it into a form that our minds can understand, researchers use scientific visualization.
In its simplest form, scientific visualization can be a graph or chart. But, in cases where researchers need detailed information to draw insights — say to understand how a protein functions in cancer and to design a drug to combat it — scientific visualization can be quite complex.
The science of scientific visualization is quite complex, too. To give structure, order, color and form to multi-dimensional data requires powerful software. And, as datasets grow, scientific visualization increasingly requires advanced computing resources as well.
This is something that Paul Navratil at the Texas Advanced Computing Center (TACC) knows well. As manager of the Scalable Visualization Technologies group, he’s worked for the past decade as part of a team that helps scientists visualize the data that comes off some of the most powerful supercomputers in the world.
Modeling hurricanes in real-time as they barrel toward the Gulf Coast; assisting cardiac researchers to simulate blood flow through a congested heart; visualizing the formation of galaxies during the Dark Ages of the universe — Navratil has seen it all.
He’s also watched the landscape of scientific visualization change in response to evolving trends in hardware and software.
In fact, Navratil is actively involved in upending the supremacy of rasterization, a method that takes vector graphics and converts them into pixels, or dots, for display, printing or storage.
Vector graphics use points, lines, curves and polygons (all based on mathematical expressions) to represent images in computer graphics. But modern printers and displays need that information converted to dots in order to use it. Rasterization has been the dominant conversion technique, but Navratil and others are advancing “ray tracing,” an alternative visualization method. Ray tracing has a history as long as rasterization’s, and it has recently become advantageous thanks to new hardware and methods.
With support from the National Science Foundation (NSF), Navratil is leading an effort to design a new framework that would allow the tens of thousands of scientists and engineers who use the nation’s supercomputers to easily add ray tracing visualizations to their research, regardless of the type of computing system or hardware they are using.
Rasterization vs. ray tracing
Whereas rasterization works by projecting the 3-D model of an object, scene or person onto a flat surface, ray tracing simulates the photons of light as they bounce from a light source off an object and into our eyes, based on the laws of optics.
This physically realistic rendering has a number of benefits. It creates much more realistic reflections and shading, which helps our minds understand the spatial relationships between the parts of the visualization. And since the objects being rendered are described computationally, according to their specific material properties and shape, they are also much more scientifically accurate.
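The core operation behind this kind of rendering can be sketched in a few lines. The example below is an illustrative sketch only, not code from GraviT or GluRay: it follows a single ray from a viewpoint toward a sphere, finds where the ray strikes the surface, and computes a simple diffuse (Lambertian) brightness from the surface normal and a light direction. Note that it traces the ray from the eye into the scene, the usual practical shortcut, rather than from the light source. The function names and the choice of a sphere are assumptions made for illustration.

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest hit in front of it, or None.

    Simplification: a ray that starts inside the sphere is reported as a miss.
    """
    oc = tuple(o - c for o, c in zip(origin, center))
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * c  # quadratic coefficient a == 1 for a unit direction
    if disc < 0.0:
        return None  # the ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0.0 else None

def shade(origin, direction, center, radius, light_dir):
    """Diffuse brightness in [0, 1] at the hit point; 0.0 if the ray misses."""
    t = ray_sphere_hit(origin, direction, center, radius)
    if t is None:
        return 0.0
    hit = tuple(o + t * d for o, d in zip(origin, direction))
    normal = tuple((h - c) / radius for h, c in zip(hit, center))
    # light_dir is assumed to be a unit vector pointing from the surface to the light
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
```

A full ray tracer repeats this test for every pixel and every object, and follows further rays for reflections and shadows, which is where the computational cost comes from.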
Navratil uses the metaphor of a Wild West movie set to describe the difference.
“Rasterization looks realistic from the outside, but you can’t explore beyond the surface,” he explained. Ray tracing, on the other hand, is like a real street in a Western ghost town. “You can walk into the saloon and sit down at the bar.”
This may not be an important distinction for some applications, but for scientists trying to understand the deep mysteries of the universe, precise information is required.
New hardware architecture enables new capabilities
Computer processing speeds were once the bottleneck preventing individuals from using ray tracing routinely in their research. But as microprocessors have become faster, memory access and communication have become the primary obstacles.
The new software Navratil and his collaborators developed, GraviT (pronounced “gravity”), automatically recognizes the type of problem a researcher is working on and the configuration of the system he or she is using, and then appropriately distributes data from the simulation to multiple computer processors — potentially thousands of them — for visualization.
The process requires little knowledge or understanding of visualization by the researchers, so they can focus their efforts on their specific science questions and not the science of software engineering.
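To give a rough sense of what distributing simulation data across processors involves, the sketch below splits a dataset that has been divided into numbered blocks into contiguous, near-equal ranges, one per worker. It is a simplified illustration under stated assumptions, not GraviT's actual scheduler: the article does not describe GraviT's algorithm, and the function name `partition_blocks` and the block-based layout are hypothetical.

```python
def partition_blocks(num_blocks, num_workers):
    """Split block indices 0..num_blocks-1 into contiguous, near-equal ranges.

    Hypothetical helper for illustration; a real scheduler would also weigh
    memory limits, data locality and communication cost.
    """
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    base, extra = divmod(num_blocks, num_workers)
    ranges = []
    start = 0
    for worker in range(num_workers):
        # The first `extra` workers take one additional block each.
        size = base + (1 if worker < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges
```

For example, `partition_blocks(10, 3)` assigns blocks 0-3 to the first worker, 4-6 to the second and 7-9 to the third, so no worker holds more than one block beyond any other.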
The project is a collaboration among computer and computational scientists at TACC, the University of Oregon, the University of Utah, Intel Corporation and ParaView, a company that designs leading scientific visualization software. Hank Childs (Oregon) and Charles Hansen (Utah) serve as co-principal investigators for the project.
“Different ways of visualizing data have come and gone over the years based on the underlying hardware,” said Daniel S. Katz, a program director at NSF. “Software-based ray tracing is now viable again. To bring it into the future, so it works on current and future hardware, we need sustainable software. This work can be incorporated into different visualization packages and into the community of visualization tools.”
In designing the software, the research team looked ahead to a time in the near future when scientists working on supercomputers in the cloud will be creating simulations so big that they can’t easily be moved for rendering. (This is already the case for many of the researchers who use the nation’s supercomputers, and many believe it will be the norm in the future.)
Such simulations will require visualizing the data locally, even as the simulation is running — a process known as “in-situ visualization.”
In this scenario, simulation data is never written to disk and stored. Simulations are simply visualized as the data is processed. This idea breaks the age-old paradigm of separating modeling from visualization, which was typically done afterward as a post-processing step.
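The in-situ pattern described above can be sketched as a loop in which each time step is rendered directly from memory as soon as it is computed. The toy example below is an assumption-laden illustration, not code from GraviT or GluRay: the "simulation" is a one-dimensional diffusion of a single spike, and the "renderer" simply maps values to text characters. All function names are hypothetical.

```python
def simulate(steps, size=8):
    """Toy 1-D diffusion: yield the in-memory state after every time step."""
    state = [0.0] * size
    state[size // 2] = 1.0  # start with a single spike in the middle
    for _ in range(steps):
        # Each cell becomes the average of itself and its two neighbors
        # (edge cells reuse their own value for the missing neighbor).
        state = [
            (state[max(i - 1, 0)] + state[i] + state[min(i + 1, size - 1)]) / 3.0
            for i in range(size)
        ]
        yield state

def render_ascii(state, levels=" .:-=+*#"):
    """Map each value to a character; a stand-in for a real renderer."""
    peak = max(state) or 1.0  # avoid dividing by zero on an all-zero state
    return "".join(levels[int(v / peak * (len(levels) - 1))] for v in state)

def run_in_situ(steps):
    """Render each step as it is produced; no state is ever written to disk."""
    return [render_ascii(state) for state in simulate(steps)]
```

In the post-processing approach, by contrast, `simulate` would save every state to storage first, and a separate program would read the files back later to draw them.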
In Spring 2015, the researchers released the first component of the system, called GluRay, as an open source tool on GitHub. GluRay lets researchers visualize their research on distributed computers, regardless of the type of hardware or architecture the computer uses.
The team plans to release the beta version of GraviT in the Fall. GraviT extends GluRay by scheduling work across multiple nodes of a supercomputer, particularly when the total data is larger than available memory. GraviT also provides an advanced interface for application developers who want to use more ray tracing capabilities and improve their performance.
Helping scientists across disciplines
Working with test problems from teams of researchers in diverse fields, Navratil and company have already seen great gains using ray tracing on high-performance computers, facilitated by GluRay.
Geologists using the software to explore how water flows through limestone karsts in Florida experienced improved depth perception in their visualizations and consequently a better understanding of how the aquifer is recharged through thumb-sized holes in the limestone. Other researchers have used the software for astrophysics simulations and seismic analysis.
Beyond the improved visual fidelity that GraviT will provide, there’s another reason that Navratil and his team believe their research will prove useful to science. It turns out that many phenomena that scientists study look a lot like ray tracing.
“Whether it’s fluid flow or stellar magnetism, these problems involve tracing particles,” Navratil said. “For all of these problems, the solutions we’re developing will be a big help.”
- GraviT on GitHub: https://github.com/TACC/GraviT
- GraviT: A Scalable Ray Tracing Framework for Visualization: https://www.tacc.utexas.edu/research-development/tacc-software/gravit