Smashing the Trillion Zone Barrier
Visualization application successfully analyzes massive datasets
Figure 1: This volume rendering of supernova simulation data was generated by running the VisIt application on 32,000 processors on Franklin, a Cray XT4 supercomputer at NERSC.
As computational scientists are confronted with increasingly massive datasets from supercomputing simulations and experiments, one of the biggest challenges is having the right tools to gain scientific insight from the data. One common method is scientific visualization, which uses advanced software and computer graphics to transform abstract data into more readily comprehensible images. But the ever-growing size of scientific datasets strains modern visualization tools. As a result, there is strong motivation to explore the use of large, parallel resources, such as those at the U.S. Department of Energy’s (DOE) supercomputing centers, and take advantage of their vast processing power, I/O bandwidth and memory capacity.
A team of DOE researchers recently ran a series of experiments demonstrating that VisIt, a parallel visualization and analysis tool that was developed at Lawrence Livermore National Laboratory (LLNL) for the National Nuclear Security Administration, is up to the challenge. Running on six systems, including four of the world’s 12 most powerful supercomputers, VisIt achieved unprecedented levels of performance in these highly parallel environments, handling some of the largest datasets ever produced.
The team ran VisIt using 8,000 to 64,000 processing cores to visualize datasets ranging from 500 billion to 4 trillion zones, or grid points. Specifically, the team demonstrated, for the first time, that VisIt’s parallelism approach can take advantage of the growing number of cores powering advanced supercomputers.
When DOE established the Visualization and Analytics Center for Enabling Technologies (VACET) in 2006, the center joined the VisIt development effort, making further extensions for use on the large, complex datasets emerging from DOE’s Scientific Discovery through Advanced Computing (SciDAC) program. VACET is part of the SciDAC program and includes researchers from three national laboratories and two universities:
• Lawrence Berkeley National Laboratory (Berkeley Lab)
• Lawrence Livermore National Laboratory
• Oak Ridge National Laboratory (ORNL)
• University of California at Davis
• University of Utah
Figure 2: This isosurface rendering of the same supernova simulation data utilized in Figure 1 was created by running VisIt on JaguarPF, a Cray XT5 supercomputer at the Oak Ridge Leadership Computing Facility at ORNL.
The VACET team conducted the recent capability experiments in support of its mission to provide production-quality, parallel-capable visual data analysis software. These tests were a significant milestone for DOE’s visualization efforts, providing an important new capability for the larger scientific research community.
“The results show that visualization research and development efforts have produced technology that is today capable of ingesting and processing tomorrow’s datasets,” said Berkeley Lab’s E. Wes Bethel, who is co-leader of VACET. “These results are the largest-ever problem sizes and the largest degree of concurrency ever attempted within the DOE visualization research community.”
Other team members are Mark Howison and Prabhat from Berkeley Lab; Hank Childs, who began working on the project while at LLNL and has since joined Berkeley Lab; Brad Whitlock from LLNL; and Dave Pugmire and Sean Ahern from ORNL. All are also members of VACET.
The VACET team ran the experiments in April and May on six world-class supercomputers (latest TOP500 rankings noted):
• Franklin — a 38,128-core Cray XT4 located at the National Energy Research Scientific Computing Center (NERSC) at Berkeley Lab (No. 11)
• JaguarPF — a 149,504-core Cray XT5 at the Oak Ridge Leadership Computing Facility at ORNL (No. 2)
• Ranger — a 62,976-core x86_64 Linux system at the Texas Advanced Computing Center at the University of Texas at Austin (No. 8)
• Purple — a 12,288-core IBM Power5 at LLNL (No. 50)
• Juno — an 18,432-core x86_64 Linux system at LLNL (No. 19)
• Dawn — a 147,456-core BlueGene/P system at LLNL (No. 9)
To run these tests, the VACET team started with data from an astrophysics simulation and then scaled it up to create a sample scientific dataset at the desired dimensions. The team used this approach because the resulting data sizes reflect tomorrow’s problem sizes, and because the primary objective of these experiments was to better understand problems and limitations that might be encountered at extreme levels of concurrency and data size.
The test runs created three-dimensional grids ranging from 512 x 512 x 512 “zones,” or grid points, up to approximately 10,000 x 10,000 x 10,000 (1 trillion zones) and approximately 15,900 x 15,900 x 15,900 (4 trillion zones).
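The grid arithmetic above can be checked with a short sketch. The helper names below are hypothetical and chosen only for illustration. The memory estimate assumes a single 4-byte scalar value per zone, which is an assumption made here and not a figure from the experiments.

```python
def zone_count(n):
    """Number of zones in a cubic n x n x n grid."""
    return n ** 3


def side_for_zones(target):
    """Smallest integer n such that an n x n x n grid has at least `target` zones."""
    n = round(target ** (1 / 3))  # floating-point first guess
    while n ** 3 < target:        # correct the guess upward if it is too small
        n += 1
    while (n - 1) ** 3 >= target:  # correct it downward if it is too large
        n -= 1
    return n


def raw_size_terabytes(zones, bytes_per_zone=4):
    """Raw size of one scalar field in TB (10**12 bytes).

    bytes_per_zone=4 is an assumption (one single-precision value per zone).
    """
    return zones * bytes_per_zone / 10 ** 12


if __name__ == "__main__":
    for n in (512, 10_000, 15_900):
        zones = zone_count(n)
        print(f"{n:>6} cubed = {zones:,} zones, "
              f"about {raw_size_terabytes(zones):.2f} TB per scalar field")
    print("side needed for 4 trillion zones:", side_for_zones(4 * 10 ** 12))
```

Under that assumption, a 4-trillion-zone scalar field occupies roughly 16 TB before any derived data is computed, which illustrates why the memory and I/O capacity of large parallel systems matter for datasets of this size.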
“This level of grid resolution, while uncommon today, is anticipated to be commonplace in the near future,” said Ahern. “A primary objective for our SciDAC Center is to be well-prepared to tackle tomorrow’s scientific data understanding challenges.”
The experiments ran VisIt in parallel on 8,000 to 64,000 cores, depending on the size of the system. Data was loaded in parallel, with the application performing two common visualization tasks (isosurfacing and volume rendering) and producing an image. From these experiments, the team collected performance data that will help them both to identify potential bottlenecks and to optimize VisIt before the next major version is released for general production use at supercomputing centers later this year.
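To give a sense of the per-core workload at these scales, the following sketch divides a dataset evenly among cores. It is an illustration only: the article does not describe how VisIt partitions data, the function names are hypothetical, and the pairings of dataset size with core count in the usage section are assumed examples, not the configurations the team ran.

```python
def split_evenly(n_items, n_parts):
    """Split range(n_items) into n_parts contiguous (start, stop) ranges.

    Chunk sizes differ by at most one; the first `n_items % n_parts`
    chunks receive the extra item.
    """
    base, extra = divmod(n_items, n_parts)
    ranges = []
    start = 0
    for part in range(n_parts):
        size = base + (1 if part < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def zones_per_core(total_zones, n_cores):
    """Average number of zones each core handles under an even split."""
    return total_zones / n_cores


if __name__ == "__main__":
    # Hypothetical pairings of dataset size and core count, for illustration.
    for zones, cores in ((10 ** 12, 8_000), (4 * 10 ** 12, 64_000)):
        print(f"{zones:,} zones on {cores:,} cores: "
              f"{zones_per_core(zones, cores):,.0f} zones per core")

    # Small demonstration of the contiguous split itself.
    print(split_evenly(10, 3))
```

Even with tens of thousands of cores, each core in these assumed pairings would still be responsible for tens of millions of zones, which is why loading the data in parallel matters as much as the rendering itself.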
Another purpose of these runs was to prepare for establishing VisIt’s credentials as a “Joule code,” or a code that has demonstrated scalability at a large number of cores. DOE’s Office of Advanced Scientific Computing Research (ASCR) is establishing a set of such codes to serve as a metric for tracking code performance and scalability as supercomputers are built with tens and hundreds of thousands of processor cores. VisIt is the first and only visual data analysis code that is part of the ASCR Joule metric.
VisIt is currently heavily used on eight of the world’s top twelve supercomputers, and the software has been downloaded by more than 100,000 users.
• DOE’s Scientific Discovery through Advanced Computing Program (SciDAC)
• Kathy Yelick, Francesca Verdier, and Howard Walter, NERSC, Berkeley Lab
• Paul Navratil, Kelly Gaither, and Karl Schulz, Texas Advanced Computing Center, University of Texas, Austin
• James Hack, Doug Kothe, Arthur Bland, Ricky Kendall, Oak Ridge Leadership Computing Facility, ORNL
• David Fox, Debbie Santa Maria, Brian Carnes, Livermore Computing, LLNL.
Hank Childs is a computer systems engineer in the Visualization Group, Computational Research Division of Lawrence Berkeley National Laboratory and the architect of the VisIt project. He may be reached at [email protected].