You may have seen recent news items regarding the Human Brain Project (HBP), a 10-year European neuroscience research initiative. Interactive computer simulation of brain models is central to its success. Cray was recently awarded a contract for the third and final phase of an R&D program (known in the European Union as a Pre-Commercial Procurement or PCP) to deliver a pilot system on which interactive simulation and analysis techniques will be developed and tested. The Cray work is being undertaken by the newly launched Cray EMEA Research Lab. This article discusses the ideas being developed and tested, ideas that we expect to be useful to many Cray users.
Step one is to manage the computer system in the same way other large pieces of experimental equipment are managed. Users book time slots on the experiment rather than submitting work to a queue. This can be achieved using a scheduler with an advance reservation system. An alternative that we have been working on for some time is to suspend the running jobs, either to memory or to fast swap space on a Cray DataWarp filesystem. A number of Cray sites already use these techniques to run repetitive production cycles at the same time each day. By pushing these ideas a little further we can open up the possibility of using Cray supercomputer systems interactively.
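To make the booking idea concrete, here is a minimal sketch of an advance-reservation calendar. It is not the interface of any real scheduler: the `Calendar` class, its `book` method and the use of whole hours as time units are all assumptions made for illustration, and the capacity check is deliberately conservative (it adds up every booking that overlaps the requested slot).

```python
from dataclasses import dataclass, field


@dataclass
class Calendar:
    """Hypothetical reservation calendar for a machine with a fixed node count."""
    total_nodes: int
    bookings: list = field(default_factory=list)  # (start, end, nodes, user)

    def book(self, user, start, end, nodes):
        """Reserve `nodes` for the half-open slot [start, end); return True on success."""
        if end <= start:
            raise ValueError("slot must have positive length")
        # Conservative estimate: sum the nodes of every booking overlapping the slot.
        busy = sum(b[2] for b in self.bookings if b[0] < end and start < b[1])
        if busy + nodes > self.total_nodes:
            return False
        self.bookings.append((start, end, nodes, user))
        return True


cal = Calendar(total_nodes=100)
first = cal.book("alice", start=9, end=11, nodes=60)   # fits on an empty machine
clash = cal.book("bob", start=10, end=12, nodes=60)    # overlaps alice, too many nodes
later = cal.book("bob", start=11, end=13, nodes=60)    # starts when alice's slot ends
```

The point of the sketch is the contract rather than the bookkeeping: a user knows in advance when the machine will be theirs, which is what makes an interactive session practical.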
Memory capacity is a limiting factor in brain simulations. Researchers currently envision data sets that will require tens of petabytes of main memory, which already exceeds the capacities of the largest supercomputers on the planet. This requirement will then increase to hundreds of petabytes for a full brain-scale simulation! In short, it will be impossible to store all this data in memory at one time. One solution is for users to interactively select the most interesting data to analyze and visualize, and only a small subset of results to store. The simulation codes use so-called “recording devices” to store a selection of results and pass them on for further analysis. The simulation is started with default recording settings. If the initial analysis reveals interesting behavior, then the experiment can be extended or repeated with detailed recording enabled for a subset of data objects.
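The two-pass pattern described above (coarse default recording, then a repeat with detailed recording on a subset) can be sketched in a few lines. This is a toy and not the recording API of any simulation code: the `Recorder` class, the `run` function and the random-walk "neurons" are all hypothetical stand-ins.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Recorder:
    """Hypothetical recording device: keeps samples only for the selected neuron ids."""
    targets: set
    every: int = 1                                  # record every n-th step
    samples: list = field(default_factory=list)     # (step, neuron_id, value)

    def record(self, step, state):
        if step % self.every == 0:
            for nid in self.targets:
                self.samples.append((step, nid, state[nid]))


def run(n_neurons, n_steps, recorder, seed=0):
    """Toy stand-in for a simulation: each neuron's state follows a random walk."""
    rng = random.Random(seed)
    state = [0.0] * n_neurons
    for step in range(n_steps):
        state = [v + rng.uniform(-1.0, 1.0) for v in state]
        recorder.record(step, state)
    return recorder.samples


# Default pass: sample three neurons every tenth step.
coarse = run(100, 50, Recorder(targets={0, 10, 20}, every=10))

# Pick the neuron with the largest excursion, then repeat the same
# experiment (same seed) recording that neuron at every step.
most_active = max(coarse, key=lambda s: abs(s[2]))[1]
detailed = run(100, 50, Recorder(targets={most_active}, every=1))
```

Only the recorded samples are ever retained, so the volume of stored data is set by the recording choices rather than by the size of the model.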
Step two is to provide developers with the ability to couple simulation, analysis and visualization applications into a single workflow. We are all accustomed to HPC workflows in which simulation jobs write their results, either periodically or at the end of the job, and post-processing jobs analyze the data generated. In a coupled workflow the simulation and analytics applications run simultaneously. An analysis job might run on dedicated resources or it might run on the same nodes as the simulation, feeding its results to a visualization system. Both techniques require a fast method for transferring data between applications and efficient methods of synchronization. The pilot system will have a pool of GPU nodes dedicated to analysis and visualization.
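As a small-scale illustration of a coupled workflow, the sketch below runs a toy simulation and a toy analysis concurrently in one process, connected by a bounded queue. The function names and the data are invented for the example, and a thread plus `queue.Queue` merely stands in for whatever transport a real system would use between applications.

```python
import queue
import threading


def simulate(out, n_steps):
    """Producer: emit one small block of values per time step."""
    for step in range(n_steps):
        out.put((step, [step * 0.5 + i for i in range(4)]))
    out.put(None)  # sentinel: no more data


def analyse(inp, results):
    """Consumer: reduce each block to its mean while the producer keeps running."""
    while True:
        item = inp.get()
        if item is None:
            break
        step, values = item
        results.append((step, sum(values) / len(values)))


# A bounded queue provides synchronization: the producer blocks
# when the consumer falls more than two blocks behind.
channel = queue.Queue(maxsize=2)
results = []
consumer = threading.Thread(target=analyse, args=(channel, results))
consumer.start()
simulate(channel, 5)
consumer.join()
```

The bounded queue covers both requirements named above in miniature: it moves data between the two components, and its size limit keeps them in step without any intermediate files.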
Many HPC workflows communicate through the filesystem, especially when large amounts of data need to be transferred between distinct applications in the workflow. This process can be accelerated by providing systems with a tier of nearby, shared, bandwidth-optimized storage. This storage is used for intermediate data, with only the final results (the output of the experiment) being written out to enterprise storage. Cray has developed an innovative, flash-based storage technology, DataWarp, which supports two distinct types of use: private and shared. The private use case will provide local high-bandwidth communication between simulation and analytics applications using memory or storage on every node. In the shared use case, a pool of flash-based storage servers provides a high-bandwidth filesystem through which large simulations and visualization jobs running on other nodes can communicate. Both approaches are being evaluated as part of the HBP pilot.
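When applications hand data to each other through a filesystem, the reader must never pick up a half-written file. One common pattern is to write under a temporary name and rename once the data is complete. The sketch below uses a temporary directory as a stand-in for the shared tier; it does not use any DataWarp interface, the file naming scheme is invented, and it assumes that rename is atomic on the filesystem in use.

```python
import json
import os
import tempfile


def publish(directory, step, payload):
    """Write to a '.part' file first, then rename so readers only see complete files."""
    final = os.path.join(directory, f"step_{step:06d}.json")
    tmp = final + ".part"
    with open(tmp, "w") as fh:
        json.dump(payload, fh)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, final)
    return final


def collect(directory):
    """Read every completed file in step order, ignoring '.part' files."""
    names = sorted(n for n in os.listdir(directory) if n.endswith(".json"))
    loaded = []
    for name in names:
        with open(os.path.join(directory, name)) as fh:
            loaded.append(json.load(fh))
    return loaded


with tempfile.TemporaryDirectory() as shared:
    for step in range(3):
        publish(shared, step, {"step": step, "spikes": step * 10})
    # Simulate a writer that is still in the middle of step 3.
    with open(os.path.join(shared, "step_000003.json.part"), "w") as fh:
        fh.write('{"step": 3')
    seen = collect(shared)
```

The consumer sees steps 0 to 2 and skips the unfinished step 3, which it would pick up on a later pass once the rename has happened.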
The third and final element of the HBP work is the ability to steer simulations as they run. This is more a way of thinking about how to perform simulation than a specific technology. The experiment is set up by constructing the brain network wiring in memory, a time-consuming process that results in more than a petabyte of data even with today’s relatively small models. This network is retained in memory while the scientist runs a sequence of virtual experiments or “what if” studies in quick succession, with the results of one run steering the next. Steering data can be fed back to the simulation through socket connections or through the filesystem.
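The steering loop can be illustrated with a toy in which the expensive set-up happens once and each cheap run decides the parameters of the next. Everything here is hypothetical: the "network" is just a list of firing thresholds, `run_experiment` is a stand-in for a simulation run, and the feedback happens in-process rather than over a socket or the filesystem.

```python
def build_network(n):
    """Stand-in for the costly wiring phase: one firing threshold per node."""
    return [i / n for i in range(n)]


def run_experiment(network, stimulus):
    """Toy run: fraction of nodes whose threshold lies below the stimulus."""
    return sum(1 for threshold in network if threshold < stimulus) / len(network)


def steer(network, target, lo=0.0, hi=1.0, runs=12):
    """Bisect on the stimulus, using each result to choose the next run."""
    history = []
    for _ in range(runs):
        stimulus = (lo + hi) / 2
        rate = run_experiment(network, stimulus)
        history.append((stimulus, rate))
        if rate < target:
            lo = stimulus   # too quiet: try a stronger stimulus next
        else:
            hi = stimulus   # target reached or exceeded: try a weaker one
    return history


network = build_network(1000)            # built once, kept in memory
history = steer(network, target=0.25)    # twelve "what if" runs on the same network
```

The network is constructed a single time and reused by all twelve runs, which is the property that makes a rapid sequence of "what if" studies affordable when construction dominates the cost.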
In addition to novel software, the HBP pilot system will preview a number of interesting memory and processor technologies. Check back at the Cray blog for updates on this and other advances at Cray.
If you are interested in learning more about brain simulation work, read the recent paper in the journal Cell by Henry Markram et al.: “Reconstruction and Simulation of Neocortical Microcircuitry.”
Duncan Roweth is a Senior Principal Engineer in the CTO Office at Cray.