Application Frameworks for High Performance and Grid Computing
Today’s approaches combine expertise in computer science with applied mathematics, engineering, science and the humanities
From engineering gas turbine blade cooling methods for more efficient engines, to determining the optimal location for oil wells in the Gulf of Mexico, advanced computational techniques are becoming increasingly important as we study and interpret the world. Grand Challenge problems, from predicting complex natural processes such as hurricane-induced storm surges to modeling the gravitational waves emitted by the collision of two black holes, cannot be tackled effectively by a single researcher, or even a single group. Today’s complex problems require interdisciplinary research groups to combine their expertise in computer science, applied mathematics, engineering, science and the humanities.
The scope and accuracy of modern computational approaches are limited, however, compared to what is required by such complex applications. Going beyond high-end computers, which are still too small to adequately compute many processes at the level of detail we need, a much more comprehensive approach involving large-scale computational power, networks, data and storage, as well as application and system software, is needed to tackle these problems.
In response to these challenges, Louisiana has funded various initiatives to build up large-scale computer resources, such as those housed at Louisiana State University’s new Center for Computation & Technology (CCT), and the Louisiana Optical Network Initiative (LONI). LONI is an advanced 40 Gbps optical network and distributed set of computer servers connecting the state’s research institutes to the National Lambda Rail (NLR), a major initiative of United States research universities and private sector technology companies to provide a national-scale infrastructure for research and experimentation in networking technologies and applications. Louisiana is also building a critical mass of faculty and staff with the spectrum of expertise required to advance complex application areas. For example, CCT is home to more than two dozen faculty from a dozen different LSU departments, and plans to recruit another 15 faculty to form complementary research groups in various areas of computation and technology. Such technical expertise in scientific computing has to be driven by the applications themselves. CCT includes research teams in numerical relativity and computational fluid dynamics (CFD) that work with the many other application areas at LSU, such as petroleum engineering, coastal modeling and computational chemistry.
Application frameworks for high-end computing
Many large-scale applications interface with supercomputers and high-end computing environments through advanced frameworks and toolkits, which can provide both generic tools for parallel computing and custom libraries to support different domain communities. One such programming environment, co-developed at the CCT and the Albert Einstein Institute in Potsdam, Germany, is the open source Cactus Framework. Cactus allows scientists and engineers to develop and reuse numerical software targeted for high-end parallel computing in a collaborative, portable environment. Using Cactus, scientists can develop their codes on laptop computers using C, C++ or Fortran 77/90, and then run them on virtually any supercomputer. Cactus also provides access to myriad computational tools, such as advanced numerical techniques, adaptive mesh refinement, parallel I/O, live and remote visualization, and remote steering. The various computational layers and functionalities, as well as different scientific modules, can be selected at run time via parameter choices.
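The idea of selecting modules at run time through parameters can be illustrated with a minimal Python sketch. This is not the Cactus API: the registry, the decorator and the module names "decay" and "shift" are all hypothetical, chosen only to show how a parameter value can decide which interchangeable piece of code a simulation runs.

```python
from typing import Callable, Dict, List

State = List[float]

# Hypothetical registry of interchangeable evolution modules (not the Cactus API).
EVOLVERS: Dict[str, Callable[[State], State]] = {}


def register(name: str) -> Callable[[Callable[[State], State]], Callable[[State], State]]:
    """Decorator that makes a module selectable by name."""
    def wrap(fn: Callable[[State], State]) -> Callable[[State], State]:
        EVOLVERS[name] = fn
        return fn
    return wrap


@register("decay")
def decay(state: State) -> State:
    """Toy module: halve every value."""
    return [0.5 * x for x in state]


@register("shift")
def shift(state: State) -> State:
    """Toy module: rotate values one position to the right."""
    return state[-1:] + state[:-1]


def run(params: Dict[str, str], state: State, steps: int) -> State:
    """Look up the module named in the parameters, then apply it repeatedly."""
    evolve = EVOLVERS[params["evolve"]]
    for _ in range(steps):
        state = evolve(state)
    return state


if __name__ == "__main__":
    # The same driver runs different modules depending only on the parameters.
    print(run({"evolve": "decay"}, [4.0, 2.0], steps=2))
    print(run({"evolve": "shift"}, [1.0, 2.0, 3.0], steps=1))
```

The driver never names a specific module, so under this assumed design a new module can be added without changing the code that runs the simulation.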
Cactus provides application groups with access to cutting edge technologies, such as grid computing, remote visualization and high speed parallel I/O. Just as important, however, working with a single modular framework such as Cactus helps to develop communities and enable code sharing and toolkit building. Such code sharing also happens between disciplines; at CCT the computational fluid dynamics and numerical relativity groups collaborate and share expertise in high-accuracy numerical methods and boundary conditions. Cactus is used by many different application domains, including numerical relativity and astrophysics, computational fluid dynamics, climate modeling, chemical engineering, and financial modeling. In particular, toolkits for building applications in relativity and CFD are openly available and used by groups around the world.
Grid and distributed computing
Beyond large-scale parallel clusters and supercomputers lies the emerging field of grid computing. Conventionally defined as coordinated resource sharing for complex problem solving, grids will provide scientists with distributed access not only to computer resources but also to high-speed networks, information repositories and data archives, and experimental and observational devices. Grids also provide a host of tools for service discovery, notification, and so forth. The term cyberinfrastructure has been coined to describe the software and hardware layers that will enable application domains to make good use of distributed resources and grids, leading to new, powerful scenarios for problem solving. Cyberinfrastructure includes middleware tools for application development; services for data, information and knowledge management; and secure interfaces for interaction, collaboration and visualization. At CCT, such cyberinfrastructure is being developed in several fields, including CFD, petroleum engineering, coastal ocean observation and prediction, numerical relativity, and computational chemistry. Cyberinfrastructure also is being used to develop generic tools for applications using the Cactus Framework.
The field of grid computing is rapidly maturing, with interfaces moving from proprietary protocols to new Web service standards. New services are being developed by research groups and industry, with varying degrees of functionality, reliability, security and usability. To allow for application development in such a dynamic field, the CCT uses the Grid Application Toolkit (GAT), a generic API that provides a standard programming interface for the grid. Just as MPI provides a stable interface across different MPI implementations, the GAT provides a simple, standardized interface to many different grid services. The CCT researches and develops the GAT along with more than 10 European partners from the GridLab project. The programming interface developed by the GridLab team has led to a working group in the grid standards and best practices body, the Global Grid Forum. This group is developing a more widely adopted interface, termed the Simple API for Grid Applications, or SAGA. An implementation of SAGA that will be compatible with the GAT is now being developed at CCT.
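The value of one stable interface in front of many changing services can be sketched in a few lines of Python. This is an illustration of the general adaptor pattern, not the GAT or SAGA API: every class and method name below is hypothetical, and the fallback from one adaptor to the next is an assumed design choice rather than documented behavior.

```python
from abc import ABC, abstractmethod
from typing import List


class JobAdaptor(ABC):
    """Hypothetical common interface that every backend must implement."""

    @abstractmethod
    def submit(self, executable: str) -> str:
        """Submit a job and return an identifier for it."""


class LocalAdaptor(JobAdaptor):
    """Stand-in for running the job on the local machine."""

    def __init__(self) -> None:
        self.count = 0

    def submit(self, executable: str) -> str:
        self.count += 1
        return f"local-{self.count}"


class WebServiceAdaptor(JobAdaptor):
    """Stand-in for a remote grid service that may be unreachable."""

    def __init__(self, available: bool = True) -> None:
        self.available = available
        self.count = 0

    def submit(self, executable: str) -> str:
        if not self.available:
            raise ConnectionError("service unavailable")
        self.count += 1
        return f"ws-{self.count}"


class GridClient:
    """What the application sees: one call, whichever service is behind it."""

    def __init__(self, adaptors: List[JobAdaptor]) -> None:
        self.adaptors = adaptors

    def submit(self, executable: str) -> str:
        # Try each adaptor in order and use the first one that succeeds.
        for adaptor in self.adaptors:
            try:
                return adaptor.submit(executable)
            except ConnectionError:
                continue
        raise RuntimeError("no adaptor could submit the job")


if __name__ == "__main__":
    client = GridClient([WebServiceAdaptor(available=False), LocalAdaptor()])
    print(client.submit("simulation"))
```

The application code calls only GridClient.submit, so in this sketch a service can be replaced or become unavailable without any change to the application.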
User interfaces
Scientists today typically have access to a large, diverse and geographically distributed array of computers. For example, the numerical relativists at CCT use machines locally at LSU, across the state in the new LONI network, across the country at national centers via a large NSF allocation, and in Europe and Korea. Web-based portals are being developed to track, monitor and manage a community’s resources, software codes, simulations and data. Visualizations of ongoing simulations can be computed while the simulations run and viewed by all collaborators in real time. The GridLab project also developed a popular, modular framework for building portals, called GridSphere. This new framework contains a range of portlets that are being extended and customized to provide access not only to grid capabilities, but also to community tools for different application areas.
Coastal and environmental modeling
An application area of particular interest in Louisiana, and one that ties all these elements together, is coastal and environmental modeling. The recent catastrophes in the southeast United States following the triad of Hurricanes Katrina, Rita and Wilma have highlighted the need not only for timely and accurate forecasts, but also for improved coordination and information transfer between domain experts, policy makers and emergency responders.
Coastal modeling involves many issues that are priority areas targeted by federal funding agencies, namely: multiscale models capable of covering large-scale global climate domains, including turbulence and other fine-scale features within regional models; and coupled models incorporating the interface between wind, ocean circulation and surge models, and various models of land and sea ecology. Coastal modeling motivates another emerging area of computer science: Dynamic Data Driven Application Systems, or DDDAS. This approach replaces conventional self-contained simulations with living applications that interact with and respond to real-time data from sensors, satellites and other simulations, not just reacting to their environment, but effecting needed change.
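The difference between a self-contained simulation and a data-driven one can be shown with a toy Python sketch. Everything here is an assumption made for illustration: the water-level model, the function names and the simple correction that moves the forecast part of the way toward each sensor reading. It covers only the part of DDDAS in which the application responds to incoming data, not the part in which the application steers the sensors.

```python
from typing import List, Optional, Sequence


def step_model(level: float, inflow: float) -> float:
    """Toy forecast model: the water level rises by the inflow each step."""
    return level + inflow


def assimilate(forecast: float, observation: Optional[float], gain: float = 0.5) -> float:
    """Move the forecast toward a sensor reading when one is available."""
    if observation is None:
        return forecast  # no data this step: keep the model's own forecast
    return forecast + gain * (observation - forecast)


def run(level: float, inflow: float, observations: Sequence[Optional[float]]) -> List[float]:
    """Advance the model one step per entry, correcting it with any observation."""
    history = []
    for obs in observations:
        level = assimilate(step_model(level, inflow), obs)
        history.append(level)
    return history


if __name__ == "__main__":
    # A reading of 3.0 arrives at the second step; None means no reading.
    print(run(level=0.0, inflow=1.0, observations=[None, 3.0, None]))
```

Without the observation the model would report 1.0, 2.0 and 3.0. With it, the second step is corrected to 2.5 and every later step starts from that corrected value.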
Progress in this complex field requires modular community frameworks, such as the Cactus Framework, with well-defined interfaces for coupling models and exploiting cutting-edge computer science technologies. It also requires tools such as the Grid Application Toolkit for building the cyberinfrastructure needed to interact with distributed data sources and deploy dynamic data-driven models across brokered resources. End users need secure community portals, such as those being built with the GridSphere framework, to deploy and track their simulations, discover data and services, and build custom views for policy makers and emergency responders.
Summary
Modern computational approaches to complex problems facing our communities require a comprehensive mix of high-end computing, network, sensor, data and visualization technologies and novel multiscale algorithms, as well as application scientists and engineers from multiple disciplines. Advanced system and application software tools, such as modeling, grid and portal development frameworks, are advancing these fields by integrating and simplifying the different technologies for application developers and users.
Gabrielle Allen is Associate Professor in Computer Science at LSU, and the Assistant Director for Computing Applications at the CCT. Edward Seidel is Director of the CCT at LSU and Floating Point Systems Professor in LSU’s Departments of Physics and Astronomy, and Computer Science. They may be contacted at [email protected].