HPC has traditionally advanced research that depends on simulation and modeling by enabling calculations with finer grids, more variables and multiple interacting phenomena. From time to time, HPC transforms the way certain problems are solved or opens up completely new applications of computing.
For example, in the early 1990s, Tom Prince, a professor of astronomy at Caltech, used a computationally intensive method to remove the effects of the atmosphere from observations made on the 200-inch optical telescope at the Palomar Observatory. The result for bright objects was almost the same as if the images had been taken by a telescope in outer space.
More recently, there have been many instances of HPC transforming the way research and development is carried out. Here I will briefly describe three examples, drawn from projects run on the systems at the Argonne Leadership Computing Facility (ALCF).
Exploring HPC and protein structure
The computational power of high-end systems, such as those at the ALCF, has inspired a new approach for determining protein structures. Proteins are large, complex molecules that drive virtually all cellular functions in living organisms. With the emergence of protein structure modeling tools, researchers have the ability to design proteins with targeted applications, such as treating diseases and catalyzing medically and industrially useful reactions.
For example, a research team from the University of Washington has been using Mira, the 10-petaflops Blue Gene/Q at the ALCF, to develop and apply new computational methods aimed at enhancing state-of-the-art protein structure prediction and design capabilities. The team's initial use of ALCF resources was to run ensembles of many small, independent jobs, a typical approach for such research. However, David Baker, the project Principal Investigator, was inspired by the computing power of the ALCF systems to develop an approach for determining NMR structures of proteins over 20 kDa in molecular weight, one that relies on computationally intensive methods but requires only backbone NMR data.
The method was tested on 11 proteins ranging from 15 to 40 kDa, seven of which were previously unsolved. The models obtained by this approach agreed well with models obtained using traditional NMR methods with larger restraint sets. The new approach allows routine determination of high-quality solution NMR structures for proteins up to 40 kDa that were previously out of reach, and should be broadly useful for tackling challenging problems in structural biology.
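For readers unfamiliar with the "ensembles of many small, independent jobs" workflow mentioned above, the following minimal Python sketch illustrates the pattern: many independent sampling runs are launched in parallel and only the best-scoring candidate models are kept. The sampling and scoring functions are placeholders of my own, not the Baker lab's actual protocol.

```python
# Minimal sketch of an ensemble of small, independent sampling jobs.
# The physics here is a stand-in; only the workflow pattern is the point.
import random
from multiprocessing import Pool

def sample_model(seed):
    """Placeholder for one independent job: build a candidate structure
    and return (score, seed). Lower scores are assumed to be better."""
    rng = random.Random(seed)
    score = rng.gauss(0.0, 1.0)   # stand-in for an energy-like score
    return score, seed

if __name__ == "__main__":
    N_JOBS = 10_000               # real campaigns run far more jobs, across many nodes
    with Pool() as pool:          # on an HPC system these would be MPI ranks or a job array
        results = pool.map(sample_model, range(N_JOBS))
    best = sorted(results)[:5]    # keep the lowest-scoring (best) candidate models
    print("best candidates (score, seed):", best)
```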
Transforming the way concrete is produced and how quality can be ensured
Researchers at NIST have been using ALCF resources to model concrete, with the goals of improving its quality, reducing the energy required to produce it, and cutting the carbon dioxide emissions associated with its production.
According to the World Business Council for Sustainable Development, worldwide cement manufacture is estimated to account for at least five percent of humanity's carbon dioxide emissions. A recent study by the NIST team, the University of Strasbourg, and Sika Corporation, carried out using ALCF resources, has led to a new way to predict concrete's flow properties from simple measurements.
Concrete begins as a thick pasty fluid containing innumerable particles in suspension that can, ideally, flow into a space of nearly any shape, where it hardens into a durable, rock-like state. Its initial flexibility combined with its eventual strength has made it the material of choice for building everything from the ancient Roman Coliseum to the foundations of countless modern bridges and skyscrapers.
Methods used by the building industry to predict the quality of concrete have proven error-prone; the particles can settle out of the suspension, for example, leading to structural problems after the concrete hardens. A significant amount of energy is also needed to create the cement that reacts with water to produce hardened concrete. This critical binding agent is manufactured at high temperatures in a kiln, a process that generates a great deal of carbon dioxide.
The NIST team took on the challenge of designing concrete that performs better on the job and doesn't demand so much energy to manufacture. To do so, they needed to learn more about how suspensions work, and that required complex math and physics, as well as a large amount of computing power to study how all the particles and fluid interact as they are mixed.
Through a DOE INCITE grant that provided more than 110 million processor hours on supercomputers at the ALCF, they were able to simulate how a suspension would change if one or more parameters were varied, such as the number of suspended particles or their size. Suspensions have a remarkable property: plotting viscosity versus shear rate for a suspension always generates a curve with the same shape as the corresponding curve for the suspending fluid alone, without added particles. What the team unexpectedly found was that the amount by which the curves had to be shifted could be predicted from the microscopic shear rates between neighboring particles. Experiments at the University of Strasbourg confirmed the simulation results, allowing the team to formulate a general theory of suspension properties.
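As a loose illustration of the "same-shaped curve" property described above, the sketch below models a suspension's viscosity-versus-shear-rate curve as a shifted copy of the suspending fluid's curve on log-log axes and fits only the two shift factors. The Cross-model constants and shift values are illustrative assumptions of mine, not numbers from the NIST study.

```python
# Sketch: if the suspension curve has the same shape as the fluid curve,
# it can be written as eta_s(rate) = A * eta_f(B * rate) for two shift factors A, B.
import numpy as np
from scipy.optimize import curve_fit

def fluid_viscosity(shear_rate, eta0=10.0, eta_inf=0.1, k=1.0, n=0.7):
    """Cross-type shear-thinning viscosity of the suspending fluid (assumed form)."""
    return eta_inf + (eta0 - eta_inf) / (1.0 + (k * shear_rate) ** n)

shear_rates = np.logspace(-2, 3, 50)       # applied (macroscopic) shear rates, 1/s

# Pretend "measured" suspension data: same curve shape, shifted up (A) and left (B).
A_true, B_true = 4.0, 2.5                  # hypothetical shift factors
suspension_visc = A_true * fluid_viscosity(B_true * shear_rates)

def shifted_fluid(shear_rate, A, B):
    """Fluid curve shifted by factors A (viscosity axis) and B (shear-rate axis)."""
    return A * fluid_viscosity(B * shear_rate)

# A good two-parameter fit means the two curves really do have the same shape.
(A_fit, B_fit), _ = curve_fit(shifted_fluid, shear_rates, suspension_visc, p0=[1.0, 1.0])
print(f"fitted shift factors: A={A_fit:.2f}, B={B_fit:.2f}")
```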
The results should help accelerate the design of a new generation of high-performance and eco-friendly cement-based materials by reducing the time and costs associated with R&D. NIST is also using this new knowledge to create Standard Reference Materials (SRMs) that industrial researchers can use to calibrate concrete rheometers for material development. Ultimately, this could help expand the use of alternative materials.
The NIST research team is also leveraging data from these large-scale simulations to create SRMs for concrete that will allow more accurate viscosity measurements. Combining computational results with theoretical work and physical experiments, NIST has already developed and released its first SRM, for calibrating rheometers (laboratory devices used to measure flow properties) for cement paste.
Work is also underway to develop additional SRMs that will enable accurate predictions of the flow of concrete, which is essential to exploring the use of new, more environmentally friendly ingredients for concrete mixtures. These SRMs will also help improve the workability of concrete by providing standardized measurements that allow builders to request a specific concrete formulation with reliable, repeatable results.
Tackling the Large Hadron Collider’s big data challenge
Argonne physicists are using Mira to perform the first simulations of Large Hadron Collider (LHC) experiments on a massively parallel supercomputer, developing a path forward for interpreting future LHC data. ALCF researchers helped the team optimize its code for the supercomputer, enabling it to simulate billions of particle collisions faster than ever before.
At CERN’s Large Hadron Collider (LHC), the world’s most powerful particle accelerator, scientists initiate millions of particle collisions every second in their quest to understand the fundamental structure of matter.
With each individual collision producing about a megabyte of data, the facility, located on the border of France and Switzerland, generates a colossal amount of data. Even after filtering out about 99 percent of it, scientists are left with around 30 petabytes (or 30 million gigabytes) each year to analyze for a wide range of physics experiments, including studies on the Higgs boson and dark matter.
Since 2002, LHC scientists have relied on the Worldwide LHC Computing Grid for all of their data processing and simulation needs. Linking thousands of computers and storage systems across 41 countries, this international distributed computing infrastructure, the largest computing grid in the world, allows LHC data to be accessed and analyzed in near real time by a community of more than 8,000 physicists. While this approach has been successful so far, the computing requirements of the LHC community are set to increase by a factor of 10 in the next few years. As a result, researchers like Argonne physicist Tom LeCompte are investigating the use of supercomputers as a possible tool for analyzing LHC-produced data.
LeCompte applied for and received computing time at the ALCF through the DOE ASCR Leadership Computing Challenge (ALCC). His project focuses on simulating ATLAS events that are difficult to simulate on the computing grid. In collaboration with ALCF staff, the team was able to scale its codes to run on the full Mira system and to run much faster. The code optimization work also enabled the team to routinely simulate millions of LHC collision events in parallel. By running those jobs on Mira, the project completed two years' worth of simulations in a matter of weeks, freeing the LHC computing grid to run other jobs.
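Simulating millions of collision events at once is an embarrassingly parallel workload, and the pattern can be sketched in a few lines of mpi4py. The event "simulation" below is a stand-in of my own, not the ATLAS production code; only the division of independent events across ranks is the point.

```python
# Sketch: each MPI rank simulates its own batch of collision events independently,
# so the job scales out to the full machine with almost no communication.
from mpi4py import MPI
import random

def simulate_event(seed):
    """Stand-in for a real event generator / detector simulation."""
    rng = random.Random(seed)
    return rng.randint(1, 100)          # pretend each event yields some particle count

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

EVENTS_PER_RANK = 1000                  # illustrative batch size
# Give every rank a disjoint block of event seeds so runs are reproducible.
my_seeds = range(rank * EVENTS_PER_RANK, (rank + 1) * EVENTS_PER_RANK)
my_results = [simulate_event(s) for s in my_seeds]

# Gather per-rank summaries on rank 0; full outputs would normally go to disk.
totals = comm.gather(sum(my_results), root=0)
if rank == 0:
    print(f"simulated {size * EVENTS_PER_RANK} events on {size} ranks, "
          f"total particles: {sum(totals)}")
```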
This marks the first time a massively parallel supercomputer has been used to perform leadership-scale simulations of LHC collision events. The effort has been a great success thus far, showing that supercomputers can help drive future discoveries at the LHC by accelerating the pace at which simulated data can be produced.
As supercomputers like Mira become better integrated into the LHC's workflow, LeCompte believes a much larger fraction of simulations could eventually be shifted to high-performance computers.
These three examples are but a small subset of the many ways in which HPC is an agent of change. We anticipate that the coming years will bring many more examples of HPC transforming the way R&D is carried out.
Paul Messina is Senior Computational Scientist and Argonne Fellow at Argonne National Laboratory.