Oak Ridge National Laboratory’s Jaguar
supercomputer has completed the first phase of an upgrade that will keep it
among the most powerful scientific computing systems in the world.
Acceptance testing for the upgrade was
completed earlier this month. The testing suite included scientific
applications focused on molecular dynamics, high-temperature superconductivity,
nuclear fusion, and combustion.
Jaguar, manufactured by Cray Inc., is
operated by the Oak Ridge Leadership Computing Facility (OLCF). Even before
this month’s upgrade (February 2012) to 3.3 petaflops, it was the United States’
most powerful supercomputer, capable of 2,300 trillion calculations each
second, or 2.3 petaflops. An individual working at a rate of one calculation
per second would need more than 70 million years to match what Jaguar did in a single second.
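The 70-million-year comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name and the 365.25-day year are assumptions made for this example, not figures from the OLCF.

```python
# Rough check of the "more than 70 million years" comparison.
# Assumption: a year is taken as 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_at_one_per_second(calculations: float) -> float:
    """Years needed to perform `calculations` at one calculation per second."""
    return calculations / SECONDS_PER_YEAR


# Jaguar before the upgrade: 2.3 petaflops = 2.3e15 calculations each second.
jaguar_calcs_in_one_second = 2.3e15
years = years_at_one_per_second(jaguar_calcs_in_one_second)
print(f"{years / 1e6:.1f} million years")
```

Under these assumptions the result comes to roughly 73 million years, consistent with the "more than 70 million" figure.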
When the upgrade process is completed this
autumn, the system will be renamed Titan and will be capable of 10 to 20
petaflops. Users have had access to Jaguar throughout the upgrade process.
“During our upgrade, we have kept our
users on Jaguar every chance we get,” said Jack Wells, director of science
at the OLCF. “We have already seen the positive impact on applications,
for example in computational fluid dynamics, from the doubled memory.”
The project, funded by the U.S. Department of Energy (DOE) Office of Science
and concluded ahead of schedule, upgraded
Jaguar’s AMD Opteron cores to the newest 6200 series and increased their number
by a third, from 224,256 to 299,008. Two six-core Opteron processors were
removed from each of Jaguar’s 18,688 nodes and replaced with a single 16-core
processor. At the same time, the system’s interconnect was updated and its
memory was doubled to 600 terabytes.
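The core counts quoted above follow directly from the node count and the processors per node. The short sketch below reproduces them; the variable names are illustrative and not taken from any OLCF tooling.

```python
# Core-count arithmetic for the upgrade, using figures from the article.
NODES = 18_688

# Before: two six-core Opteron processors per node.
old_cores = NODES * 2 * 6
# After: a single 16-core Opteron processor per node.
new_cores = NODES * 1 * 16

# Fractional increase in core count.
increase = (new_cores - old_cores) / old_cores

print(f"before: {old_cores:,} cores")
print(f"after:  {new_cores:,} cores")
print(f"increase: {increase:.1%}")
```

This gives 224,256 cores before and 299,008 after, an increase of one third, matching the numbers reported for the project.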
In addition, 960 of Jaguar’s 18,688 compute
nodes now contain an NVIDIA graphics processing unit (GPU). The GPUs were
added to the system in anticipation of a much larger GPU installation later in
the year. The GPUs act as accelerators, giving researchers a substantial boost in
computing power while making the system far more energy efficient.
“Applications that were squeezing onto
our Cray XT5 nodes can now make full use of the 16-core processor. Doubling the
memory can have a dramatic impact on application workflow,” Wells said.
“The new Gemini interconnect is much
more scalable,” Wells added, “helping applications like molecular
dynamics that have demanding network communication requirements.”
GPUs will add a level of parallelism to the
system and allow Titan to reach 10 to 20 petaflops within the same space as
Jaguar and with essentially the same power requirements. While the Opteron
processors have 16 cores and are therefore able to carry out 16 computing tasks
simultaneously, the GPUs will be able to tackle hundreds of computing tasks at
the same time.
With nearly 1,000 GPUs now available,
researchers will have an opportunity to optimize their applications for the
accelerated Titan system.
“This is going to be an exciting year in Oak Ridge as our users take
advantage of our new XK6 architecture and get ready for the
new NVIDIA Kepler GPUs in the fall,” Wells said. “A lot of work by
many people is beginning to pay off.”