CSC, CSCS, SARA, T-Platforms to Deliver Scalable Hybrid Prototype for PRACE
CSC — IT Center for Science Ltd., CSCS — Swiss National Supercomputing Centre, SARA — Dutch national centre for HPC & e-science support and T-Platforms have announced a partnership to deliver a next-generation supercomputer prototype system. The system is part of the future HPC technology assessment activities of the pan-European PRACE Research Infrastructure (PRACE RI).
The prototype will be used to evaluate power efficiency, manageability and novel programming environments. It will combine the latest innovations in packaging, cooling, interconnect and accelerator co-processing technology as well as OS and systems management solutions. The prototype will also be used to develop highly scalable parallel hybrid applications which efficiently leverage both the CPU and accelerator co-processors. The system will be large enough to enable the study of scaling properties of benchmarks and parallel scientific and technical applications.
The prototype hardware to be deployed at CSC is based on T-Platforms’ next-generation architecture codenamed T-REX. The full-scale system will be delivered in 2013 with a theoretical peak performance of up to 400 TFLOPS. The system contains 256 hybrid compute nodes in a hot-water-cooled rack, incorporating Intel processors as well as NVIDIA Tesla and Intel MIC accelerators. The system also includes an InfiniBand-based network with several software enhancements, such as adaptive routing, to improve scalability. The initial prototype will be delivered in Q3’12 and will include a smaller number of hybrid nodes based on the T-Platforms V-class architecture.
The prototype will run the T-Platforms Cluster Management Software Suite, which provides advanced capabilities for system deployment, monitoring and management. In particular, these capabilities will help establish the relation between application performance and electricity consumption and support the implementation of highly efficient power management techniques. The cooling solution will make it possible to build a generic framework for assessing energy efficiency along the whole chain, from the datacenter down to the application, in order to refine future datacenter designs for higher efficiency.
Various programming paradigms, such as CUDA+MPI, OpenMP with MIC, OpenCL, OpenACC, SHMEM and PGAS, will be evaluated using a set of benchmarks selected in collaboration with other PRACE partners.
CSC and T-Platforms will collaborate in porting the GPAW (DFT) and Elmer (FEM) applications to the system, with the goal of utilizing the accelerators efficiently, optimizing the internode communication between accelerators and improving checkpoint-restart performance. SARA will focus on evaluating power efficiency and developing energy-to-solution tools, as well as evaluating a different accelerator architecture using a number of relevant applications, such as the Crunch and Crunch2 (X-ray crystallography), Voxel 3d (orthopedics) and Rbflow (CFD) codes. CSCS will focus on exploring portable, directives-based programming approaches and direct accelerator-to-accelerator communication paradigms for scalable code development.
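
As an illustration of the energy-to-solution metric mentioned above, the following is a minimal sketch, not a description of the tools SARA will develop. It assumes that power draw is available as timestamped samples in watts and integrates them over the run with the trapezoidal rule. The function name `energy_to_solution` and the sample readings are hypothetical.

```python
def energy_to_solution(samples):
    """Return the energy in joules consumed over a run.

    `samples` is a list of (time_in_seconds, power_in_watts) pairs,
    sorted by time. The energy is approximated with the trapezoidal rule.
    """
    if len(samples) < 2:
        return 0.0
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 < t0:
            raise ValueError("samples must be sorted by time")
        # Area of one trapezoid: average power times elapsed time.
        energy += 0.5 * (p0 + p1) * (t1 - t0)
    return energy


# Hypothetical power readings for a 30-second run (illustrative only).
samples = [(0.0, 300.0), (10.0, 500.0), (20.0, 500.0), (30.0, 300.0)]
joules = energy_to_solution(samples)
print(f"{joules:.0f} J = {joules / 3.6e6:.6f} kWh")
```

Comparing this figure for the same application on different node types or under different power management settings is one way to relate application performance to electricity consumption.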