
Examples of supercomputers that have deployed Darshan and activated it automatically for all users. Clockwise from the top left: Mira at Argonne National Laboratory, Cori at Lawrence Berkeley National Laboratory, Shaheen II at King Abdullah University of Science and Technology in Saudi Arabia, and Blue Waters at the University of Illinois at Urbana-Champaign. Credit: Argonne National Laboratory
Supercomputers are built for speed and efficiency, handling massive volumes of data that would quickly overwhelm a typical household machine. But even high-performance computing (HPC) has room for improvement, especially because the amount of data researchers work with is expected to keep growing, as indicated by the U.S. Department of Energy’s 2015 High Energy Physics Exascale Requirements Review.
Darshan is a software tool designed to support this improvement. It lets researchers measure and assess the performance of their supercomputers’ input/output (I/O) processes, giving a clear picture of machine behavior that can help diagnose potential problems and determine the best uses of HPC and data management resources.
In the decade since the first version of Darshan was made available, the open-access tool has been applied to numerous major research projects and implemented at several supercomputing facilities, including four that have Darshan automatically enabled for all users on their systems. The continually updated software was developed at Argonne National Laboratory, which received an R&D 100 Award this past November for Darshan 3.1.5. The most recent update, Darshan 3.1.7, was released on Jan. 22.
Darshan 3.1.5 was a 2018 R&D 100 Award winner. All of the R&D 100 Awardees were announced at the R&D 100 Awards Gala held in Orlando, Florida on Nov. 16, 2018.
“Two major trends in supercomputing have changed how we manage data storage in the past decade. The first is that scientific applications are increasingly diverse (…) The second trend is that supercomputers themselves rely on deeper and more complex storage hierarchies to optimize performance and cost,” said Philip H. Carns, software engineer at Argonne’s Mathematics and Computer Science division and Darshan project lead, in an exclusive interview with R&D Magazine. “Darshan has evolved over time to keep pace with these trends. Most importantly, it has been modularized so that it can easily incorporate information from new data sources as supercomputer architectures change over time.”
Carns said one of the most notable aspects of the latest versions of Darshan is that they are “fully compatible” with the Summit supercomputer at Oak Ridge National Laboratory, the fastest supercomputer in the world, with a peak performance rate of 200 petaflops. Recent updates also improved regression testing for Cray supercomputers including Cori, at the National Energy Research Scientific Computing Center, and Theta, at Argonne’s own Leadership Computing Facility. The version recognized at the R&D 100 Awards also introduced a new “eXtended Tracing” (DXT) module for on-demand, in-depth tracing without the need to modify or recompile an application. This module was created in collaboration with Intel Corp.
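To illustrate what "on-demand" tracing without recompiling can look like in practice, here is a minimal Python sketch. It assumes a dynamically linked application, a Darshan 3.1.x runtime library at a placeholder path, and the `DXT_ENABLE_IO_TRACE` environment variable described in the Darshan runtime documentation. The helper names `darshan_dxt_env` and `run_traced` are hypothetical and are not part of Darshan itself.

```python
import os
import subprocess


def darshan_dxt_env(libdarshan_path, base_env=None):
    """Return a copy of the environment with Darshan preloaded and DXT enabled.

    libdarshan_path is a placeholder for wherever libdarshan.so is installed
    on a given system (an assumption, not a fixed location).
    """
    env = dict(os.environ if base_env is None else base_env)

    # Preloading the Darshan runtime library instruments a dynamically
    # linked application without modifying or recompiling it.
    existing = env.get("LD_PRELOAD")
    env["LD_PRELOAD"] = (
        libdarshan_path if not existing else f"{libdarshan_path}:{existing}"
    )

    # Setting this variable switches on the DXT module for the run.
    env["DXT_ENABLE_IO_TRACE"] = "1"
    return env


def run_traced(command, libdarshan_path):
    """Run a command (given as a list of arguments) with DXT tracing enabled."""
    return subprocess.run(
        command, env=darshan_dxt_env(libdarshan_path), check=True
    )


# Hypothetical usage; the application name and library path are placeholders:
# run_traced(["./my_app"], "/path/to/libdarshan.so")
```

On systems where Darshan is already enabled for all users, setting the DXT variable in the job environment may be all that is needed. Whether environment variables reach every process of a parallel job depends on the launcher, so treat this as a starting point rather than a recipe.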
Darshan’s modular design and open accessibility mean researchers can build onto the source code, crafting their own capabilities and sharing new features with the scientific community. The current version of Darshan includes additions made over the years by 25 contributors outside the main Argonne team. The tool is designed to be easily integrated and adapted for a wide range of applications.
“Darshan strives to be lightweight, scalable, and transparent. These properties enable Darshan to be deployed for use by all production applications on a large-scale supercomputer,” Carns said. “Its greatest impact, however, has been on users who are adapting their software for use on a new system for the first time.”
Projects that have used Darshan and boosted their efficiency include the University of Chicago’s High-Speed Combustion and Detonation project, which simulates the combustion and detonation of hydrogen-oxygen mixtures and achieved a 41-fold improvement in I/O performance; the multi-university VIPRA project, which is creating models of air passenger movement and behavior with the goal of reducing the spread of viral infections and saw a twofold improvement in I/O performance; and the development of the Advanced Telescope for High ENergy Astrophysics (ATHENA), an X-ray telescope planned for launch in 2028 by the European Space Agency, where execution time has been reduced by 40 percent.
Carns said Darshan will continue to evolve to meet the growing and changing needs of the research community.
“By far the biggest challenge is maintaining interoperability with the growing diversity of scientific applications,” he said. “We routinely encounter applications that use storage methods or storage access patterns that we did not anticipate in Darshan’s original design, and we must adapt accordingly.”