Deep Learning Stretches Up to Scientific Supercomputers

Researchers developed 15-petaflop deep-learning software and ran it on Cori, a supercomputer at the National Energy Research Scientific Computing Center, a Department of Energy Office of Science user facility. (Credit: Lawrence Berkeley National Laboratory)

The Science

Machine learning, a form of artificial intelligence, enjoys unprecedented success in commercial applications. However, the use of machine learning in high performance computing for science has been limited. Why? Advanced machine learning tools weren’t designed for big data sets, like those used to study stars and planets. A team from Intel, the National Energy Research Scientific Computing Center (NERSC), and Stanford changed that situation. They developed the first 15-petaflop deep-learning software and demonstrated its ability to handle large data sets via test runs on the Cori supercomputer.

The Impact

Using machine learning techniques on supercomputers, scientists could extract insights from large, complex data sets. Powerful instruments, such as accelerators, produce massive data sets. The new software could allow the world’s largest supercomputers to apply deep learning to such data. The resulting insights could benefit Earth systems modeling, fusion energy, and astrophysics.

Summary

Machine learning techniques hold potential for enabling scientists to extract valuable insights from large, complex data sets being produced by accelerators, light sources, telescopes, and computer simulations. While these techniques have had great success in a variety of commercial applications, their use in high performance computing for science has been limited because existing tools were not designed to work with the terabyte- to petabyte-sized data sets found in many science domains.

To address this problem, a collaboration among Intel, the National Energy Research Scientific Computing Center, and Stanford University has been working on the challenges that arise when using deep learning techniques, a form of machine learning, on terabyte and petabyte data sets. The team developed the first 15-petaflop deep-learning software. They demonstrated its scalability for data-intensive applications by executing a number of training runs using large scientific data sets. The runs used physics- and climate-based data sets on Cori, a supercomputer located at the National Energy Research Scientific Computing Center. They achieved a peak rate between 11.73 and 15.07 petaflops (single-precision) and an average sustained performance of 11.41 to 13.47 petaflops. (A petaflop is a million billion calculations per second.)
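To give a sense of scale, the petaflop figures quoted above can be converted to raw calculations per second. The following is a minimal sketch in Python, not part of the team's software; the function name `petaflops_to_flops` and the dictionary labels are hypothetical and chosen only for illustration. The four numbers are the ones reported in this summary.

```python
# One petaflop is a million billion (10**15) calculations per second.
FLOPS_PER_PETAFLOP = 10**15


def petaflops_to_flops(petaflops: float) -> float:
    """Convert a rate in petaflops to calculations per second."""
    return petaflops * FLOPS_PER_PETAFLOP


# Figures as reported in the summary (single precision).
# The labels are illustrative only.
reported_petaflops = {
    "peak, low end": 11.73,
    "peak, high end": 15.07,
    "sustained, low end": 11.41,
    "sustained, high end": 13.47,
}

for label, value in reported_petaflops.items():
    rate = petaflops_to_flops(value)
    print(f"{label}: {value} petaflops = {rate:.3e} calculations per second")
```

At the high end of the reported peak range, this works out to roughly 15 million billion calculations per second.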

Funding

This research used resources at the National Energy Research Scientific Computing Center, a Department of Energy, Office of Science, Advanced Scientific Computing Research user facility.

Publications

T. Kurth, et al., “Deep learning at 15PF: Supervised and semi-supervised classification for scientific data.” SC ’17 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis article 7 (2017). [DOI: 10.1145/3126908.3126916]
