LLNL, IBM and Red Hat to explore standardized HPC resource management interface

By Heather Hall | April 30, 2021

Pictured, from left, are LLNL postdoctoral researcher Dan Milroy and computer scientists Stephen Herbein and Dong H. Ahn

Lawrence Livermore National Laboratory (LLNL), IBM and Red Hat are combining forces to develop best practices for interfacing high-performance computing (HPC) schedulers and cloud orchestrators, an effort designed to prepare for emerging supercomputers that take advantage of cloud technologies.

Under a recently signed memorandum of understanding (MOU), researchers aim to enable next-generation workloads by integrating LLNL’s Flux scheduling framework with Red Hat OpenShift — a leading enterprise Kubernetes platform — to allow more traditional HPC jobs to utilize cloud and container technologies. A new standardized interface would help satisfy an increasing demand for compute-intensive jobs that combine HPC with cloud computing across a wide range of industry sectors, researchers said.

“Cloud systems are increasingly setting the directions of the broader computing ecosystem, and economics are a primary driver,” said Bronis R. de Supinski, chief technology officer of Livermore Computing at LLNL. “With the growing prevalence of cloud-based systems, we must align our HPC strategy with cloud technologies, particularly in terms of their software environments, to ensure the long-term sustainability and affordability of our mission-critical HPC systems.”

LLNL’s open-source Flux scheduling framework builds on the Lab’s extensive experience in HPC and allows new resource types, schedulers and services to be deployed as data centers continue to evolve, including the emergence of exascale computing. Its smart placement decisions and rich resource expression make it well suited to facilitate orchestration with tools like Red Hat OpenShift on large-scale HPC clusters, an arrangement LLNL researchers anticipate will become more commonplace in the years to come.
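
For readers unfamiliar with Flux, the following minimal sketch shows how a batch job can be described and submitted through Flux’s Python bindings. It assumes a flux-core installation and an active Flux instance; the command and resource counts are placeholders, not part of the LLNL/IBM/Red Hat work.

```python
# Minimal sketch: submitting a job through Flux's Python bindings.
# Assumes flux-core is installed and this runs inside a Flux instance
# (e.g., one started with "flux start"). Command and sizes are placeholders.
import flux
from flux.job import JobspecV1

handle = flux.Flux()  # connect to the enclosing Flux instance

# Describe the work: 4 tasks spread across 2 nodes, 1 core per task.
jobspec = JobspecV1.from_command(
    command=["hostname"],
    num_tasks=4,
    num_nodes=2,
    cores_per_task=1,
)

jobid = flux.job.submit(handle, jobspec)
print(f"Submitted Flux job {jobid}")
```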

“One of the trends we’ve been seeing at Livermore is the loose coupling of HPC applications and applications like machine learning and data analytics on the orchestrated side, but in the near future we expect to see a closer meshing of those two technologies,” said LLNL postdoctoral researcher Dan Milroy. “We think that unifying Flux with cloud orchestration frameworks like Red Hat OpenShift and Kubernetes is going to allow both HPC and cloud technologies to come together in the future, helping to scale workflows everywhere. I believe co-developing Flux with OpenShift is going to be really advantageous.”

Red Hat OpenShift is an open-source container platform based on the Kubernetes container orchestrator for enterprise app development and deployment. Kubernetes is an open-source system for automating deployment, scaling and management of containerized applications.
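
As a rough illustration of what handing a containerized batch workload to Kubernetes (and therefore to OpenShift) looks like, here is a minimal sketch using the official Kubernetes Python client. The image, names and namespace are placeholders, and a reachable cluster with a valid kubeconfig is assumed.

```python
# Minimal sketch: describing a containerized batch workload as a Kubernetes
# Job with the official Python client. Image, names and namespace are
# placeholders; a reachable cluster and valid kubeconfig are assumed.
from kubernetes import client, config

config.load_kube_config()

job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="hpc-style-batch-job"),
    spec=client.V1JobSpec(
        backoff_limit=0,
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="solver",
                        image="registry.example.com/solver:latest",  # placeholder
                        command=["./run_solver", "--steps", "1000"],
                    )
                ],
            )
        ),
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```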

Researchers want to further enhance Red Hat OpenShift and make it a common platform for a wide range of computing infrastructures, including large-scale HPC systems, enterprise systems and public cloud offerings, starting with commercial HPC workloads.

“We would love to see a platform like Red Hat OpenShift be able to run a wide range of workloads on a wide range of platforms, from supercomputers to clusters,” said IBM Research staff member Claudia Misale. “We see difficulties in the HPC world from having many different types of HPC software stacks, and container platforms like OpenShift can address these difficulties. We believe OpenShift can be the common denominator, like Red Hat Enterprise Linux has been a common denominator on HPC systems.”

The impetus for enabling Flux as a Kubernetes scheduler plug-in was a successful prototype developed under a Collaboration of Oak Ridge, Argonne and Livermore (CORAL) Center of Excellence project between LLNL and IBM aimed at understanding the formation of cancer. The plug-in enabled more sophisticated scheduling of Kubernetes workflows, convincing researchers they could integrate Flux with Red Hat OpenShift.
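
Kubernetes already provides the hook such a plug-in builds on: a pod can opt out of the default scheduler by naming an alternative in its spec. The sketch below shows that mechanism with the Python client; the scheduler name used here is hypothetical, not the name registered by the prototype.

```python
# Sketch: routing a pod to an out-of-tree scheduler via the pod spec's
# schedulerName field. "flux-k8s-scheduler" is a hypothetical name, not
# necessarily what the LLNL/IBM prototype registers.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    api_version="v1",
    kind="Pod",
    metadata=client.V1ObjectMeta(name="flux-scheduled-pod"),
    spec=client.V1PodSpec(
        scheduler_name="flux-k8s-scheduler",  # hypothetical custom scheduler
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="workload",
                image="registry.example.com/mpi-app:latest",  # placeholder
                command=["mpirun", "-n", "4", "./app"],
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```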

Because many HPC centers use their own schedulers, a primary goal is to "democratize" the Kubernetes interface for HPC users by pursuing an open interface that any HPC site or center could adopt to incorporate its existing scheduler.

“We’ve been seeing a steady trend toward data-centric computing, which includes the convergence of artificial intelligence/machine learning and HPC workloads,” said Chris Wright, senior vice president and chief technology officer, Red Hat. “The HPC community has long been on the leading edge of data analysis. Bringing their expertise in complex large-scale scheduling to a common cloud-native platform is a perfect expression of the power of open-source collaboration. This brings new scheduling capabilities to Red Hat OpenShift and Kubernetes and brings modern cloud-native AI/ML applications to the large labs.”

The researchers initially plan to integrate Flux to run within the Red Hat OpenShift environment and to use Flux as a driver through which other commonly used schedulers can interface with OpenShift and Kubernetes, eventually enabling the platform to support any HPC workload on any HPC machine.
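
The shape of that driver layer is exactly what the collaboration still has to define. Purely as an illustration of the idea, the sketch below frames a minimal scheduler-agnostic submission interface with backends such as Flux or Kubernetes behind it; every name here is hypothetical and does not reflect the interface the team will produce.

```python
# Purely illustrative: a hypothetical scheduler-agnostic interface of the
# kind a standardized HPC/cloud layer could expose. None of these names
# come from the LLNL/IBM/Red Hat effort.
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class JobRequest:
    """Minimal, scheduler-neutral description of a batch job."""
    command: list[str]
    nodes: int = 1
    tasks: int = 1
    env: dict[str, str] = field(default_factory=dict)


class Scheduler(ABC):
    """What a common interface might require every backend to provide."""

    @abstractmethod
    def submit(self, request: JobRequest) -> str:
        """Submit a job and return a backend-specific job ID."""

    @abstractmethod
    def status(self, job_id: str) -> str:
        """Return a coarse state such as 'pending', 'running' or 'done'."""


class FluxBackend(Scheduler):
    # Would translate JobRequest into a Flux jobspec and submit it via Flux.
    def submit(self, request: JobRequest) -> str: ...
    def status(self, job_id: str) -> str: ...


class KubernetesBackend(Scheduler):
    # Would translate JobRequest into a Job/Pod object via the Kubernetes API.
    def submit(self, request: JobRequest) -> str: ...
    def status(self, job_id: str) -> str: ...
```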

“This effort will make it easy for HPC workflows to leverage leading HPC schedulers like Flux to realize the full potential of emerging HPC and cloud environments,” said Dong H. Ahn, lead for LLNL’s Advanced Technology Development and Mitigation Next Generation Computing Enablement project.

The team has begun working on scheduling topology and anticipates defining an interface within the next six months. Future goals include exploring different integration models, such as co-location, and extending advanced management and configuration beyond the node.
