Euclid Cluster Delivers New Computing Capabilities
The Euclid cluster enables research projects to marshal the power of many computers at once, both to run large-scale computing jobs much faster and to move large datasets and files at high speed among the individual servers that make up the cluster. Euclid also serves as a showcase for state-of-the-art, open-source software and technologies that could benefit the UW research community.
Euclid is now the single largest high-performance computing (HPC) cluster at UW-Madison. Created by a partnership of several campus departments, it offers almost 2,100 processor cores in 261 servers that are closely coupled by a high-speed network. This affords a theoretical peak performance of about 19 teraflops, a level beyond that provided by about 1,000 typical desktop computers.
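To put those figures in perspective, the short Python sketch below works through the arithmetic. It uses only the approximate numbers quoted above; the value of 8 cores per server is an assumption inferred from "almost 2,100 cores in 261 servers" (261 x 8 = 2,088), and the function name is purely illustrative.

```python
# Back-of-the-envelope arithmetic using the approximate figures in the article.
SERVERS = 261
CORES_PER_SERVER = 8        # assumption: 261 * 8 = 2,088, i.e. "almost 2,100"
PEAK_TFLOPS = 19.0          # quoted theoretical peak, approximate
DESKTOP_EQUIVALENTS = 1000  # quoted comparison, approximate


def cluster_summary(servers=SERVERS,
                    cores_per_server=CORES_PER_SERVER,
                    peak_tflops=PEAK_TFLOPS,
                    desktops=DESKTOP_EQUIVALENTS):
    """Return derived per-core and per-desktop figures in gigaflops."""
    total_cores = servers * cores_per_server
    peak_gflops = peak_tflops * 1000  # 1 teraflop = 1,000 gigaflops
    return {
        "total_cores": total_cores,
        "gflops_per_core": peak_gflops / total_cores,
        "gflops_per_desktop": peak_gflops / desktops,
    }


summary = cluster_summary()
for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions, the quoted peak works out to roughly 9 gigaflops per core, and the desktop comparison implies about 19 gigaflops per desktop. Both are theoretical ceilings, not measured results.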
With Euclid, a researcher could use all of the cluster’s available power at once to run parallel jobs on large numbers of processors. High-speed transfer of large datasets and files among individual servers could exploit up to 10 gigabits per second of bandwidth, or 10 times the bandwidth of most interconnects now used on campus.
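As a rough illustration of what the bandwidth difference means in practice, the sketch below estimates the best-case time to move a dataset between two servers. The 100-gigabyte dataset is a hypothetical example, the 1 gigabit per second baseline is inferred from the tenfold comparison above, and the calculation ignores protocol overhead, disk speed and network contention.

```python
def transfer_seconds(size_gigabytes, link_gbps):
    """Best-case transfer time, assuming the link is fully used.

    size_gigabytes: dataset size in decimal gigabytes (hypothetical input)
    link_gbps:      link bandwidth in gigabits per second
    """
    size_gigabits = size_gigabytes * 8  # 8 bits per byte
    return size_gigabits / link_gbps


DATASET_GB = 100  # hypothetical dataset size, for illustration only

t_10gige = transfer_seconds(DATASET_GB, 10)  # Euclid's 10 Gb/s interconnect
t_1gige = transfer_seconds(DATASET_GB, 1)    # inferred 1 Gb/s baseline

print(f"10 Gb/s: {t_10gige:.0f} s, 1 Gb/s: {t_1gige:.0f} s")
```

Real transfers would take longer than these idealized figures, but the tenfold ratio between the two links is the point of the comparison.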
A key technology of the Euclid cluster is the connection among its servers. A high-bandwidth, low-latency 10 Gigabit Ethernet (10 GigE) interconnect tightly couples all 261 servers, which broadens the possibilities for parallel computing across a large number of cores. Euclid also makes it possible to benchmark 10 GigE networks, facilitating campus research in such areas as weather modeling, high-energy physics, bioinformatics and materials design.
A research group led by professor Manos Mavrikakis is the primary user of Euclid, applying the cluster’s massive computing power to problems on the frontiers of catalysis and materials science. Their research uses computational chemistry approaches to improve engineering practice in many areas. These include the design of novel catalysts for more efficient chemical processing, for generating energy, and for preventing pollution. The group is part of several international, multidisciplinary collaborations investigating the next generation of catalytic materials. Research performed in Mavrikakis’s group was recently published in Science, 329, 1633 (2010).
The UW Center for High Throughput Computing (CHTC) is exploring ways to amplify the investment in Euclid by offering the cluster to others on campus whenever it is not needed by the primary user group. CHTC, led by computer sciences professor Miron Livny, will provide access to Euclid via Condor, a distributed-computing technology that enables large numbers of scientists to share and manage computing resources.
The addition of the Euclid cluster extends Condor’s reach from high-throughput computing into high-performance computing. Todd Tannenbaum and Kenneth Hahn of CHTC lead the Condor component of the Euclid cluster, and Rahul Nabar of the Department of Chemical and Biological Engineering led the Euclid team in this effort. Benchmarking and further performance tuning are now in progress. Euclid also has instructional uses. Engineering students at UW-Madison, for example, will be able to use the cluster in HPC applications.
More than nine months of planning went into the detailed design and technology evaluation of the Euclid cluster. Contributing vendors included Dell, Cisco, Chelsio and APC. Funding for the project was provided by the Department of Chemical and Biological Engineering, the College of Engineering, CHTC, the Wisconsin Alumni Research Foundation, the UW-Madison Graduate School, and the Division of Information Technology (DoIT). Hideko Mills of DoIT, who also serves as manager of IT research infrastructure in the CIO’s office, played a key role in the design and commissioning of the Euclid cluster.
The Euclid cluster is hosted by the CHTC and housed within its data center.