
A computer-automated design conception of Sandia National Laboratories’ Astra supercomputer, used to work out the floor layout of the supercomputer’s compute, cooling, network and data storage cabinets. Credit: Hewlett Packard Enterprise.
The first-ever Advanced RISC (Reduced Instruction Set Computing) Machine (ARM) supercomputer prototype is expected to be deployed later this summer by the U.S. Department of Energy’s (DOE) Sandia National Laboratories.
ARM processors require fewer transistors than those built on a complex instruction set computing architecture, which lowers costs while reducing power consumption and heat output.
ARM architecture is widely used in microprocessors for smaller applications, including automobile electronics, cellphones and other embedded devices. Until recently, however, ARM processors had not delivered the performance required to make them practical for high-performance computing.
The new machine, known as Astra, represents the first of a potential series of advanced architecture prototype platforms, which will be deployed as part of the Vanguard program.
The Vanguard program aims to evaluate the feasibility of emerging high-performance computing architectures as production platforms to support the National Nuclear Security Administration’s (NNSA) mission to maintain and enhance the safety, security and effectiveness of the U.S. nuclear stockpile.
James Laros, the Vanguard project lead, explained in an interview with R&D Magazine that Astra represents a significant advancement in the world of supercomputers.
“There are always a great number of challenges fielding the first of a kind architecture,” Laros said. “When we say first of a kind, it is very important that we do not field one-off instances of technology.
“I would use the word excitement that we are given this opportunity to impact our national security mission,” he added. “The system will primarily service the three DOE nuclear weapons laboratories, but the software development will benefit the entire HPC community.”
Laros said proving the viability of an ARM-based supercomputer will come with challenges.
“We would like to show that ARM is beneficial for our mission and that future platforms can leverage ARM for production platforms,” he said. “There are both hardware and software challenges. Our tall pole is porting and extracting performance for the NNSA integrated codes.
“Any new hardware architecture will have gaps and the software ecosystem will need development especially for large scale deployments,” Laros added.
Astra is based on the Cavium Inc. ThunderX2 64-bit ARM-v8 microprocessor and consists of 2,592 compute nodes, each a 28-core, dual-socket node. The system as a whole has a theoretical peak of more than 2.3 petaflops, the equivalent of 2.3 quadrillion floating-point operations per second.
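The peak figure can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below is illustrative only: the node count, socket count and core count come from the figures above, reading "28-core" as 28 cores per socket. The 2.0 GHz clock speed and the eight double-precision operations per core per cycle are assumptions not stated in this article.

```python
# Rough theoretical-peak estimate for an Astra-like system.
# Figures taken from the article:
NODES = 2592            # compute nodes
SOCKETS_PER_NODE = 2    # "dual socket"
CORES_PER_SOCKET = 28   # "28-core", read here as cores per socket

# ASSUMPTIONS, not stated in the article:
CLOCK_HZ = 2.0e9                # assumed 2.0 GHz core clock
FLOPS_PER_CYCLE_PER_CORE = 8    # assumed double-precision ops per core per cycle


def peak_flops(nodes, sockets, cores, clock_hz, flops_per_cycle):
    """Theoretical peak = total cores x clock rate x operations per cycle."""
    return nodes * sockets * cores * clock_hz * flops_per_cycle


total_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
system_peak = peak_flops(NODES, SOCKETS_PER_NODE, CORES_PER_SOCKET,
                         CLOCK_HZ, FLOPS_PER_CYCLE_PER_CORE)
node_peak = peak_flops(1, SOCKETS_PER_NODE, CORES_PER_SOCKET,
                       CLOCK_HZ, FLOPS_PER_CYCLE_PER_CORE)

print(f"Total cores:  {total_cores:,}")
print(f"System peak:  {system_peak / 1e15:.2f} petaflops")
print(f"Per-node peak: {node_peak / 1e9:.0f} gigaflops")
```

Under those assumptions the estimate lands just above 2.3 petaflops, in line with the quoted figure. A theoretical peak is an upper bound; real applications typically sustain only a fraction of it.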
A single Astra node is about 100 times faster than a modern ARM-based cellphone.
Astra will be installed at Sandia in an expanded part of the building that formerly housed the Red Storm supercomputer and will be deployed in partnership with Westwind Computer Products and Hewlett Packard Enterprise.
“The development of a scalable ARM platform based on the HPE Apollo 70 will become a key resource to expand the Arm high-performance computing ecosystem,” Steve Hull, president of Westwind Computer Products, said in a statement.