Research & Development World

Broad Institute to Enable Cloud-based Access to Genome Analysis Toolkit

By R&D Editors | April 6, 2016

Cambridge, MA - Broad Institute of MIT and Harvard is collaborating with Amazon Web Services (AWS), Cloudera, Google, IBM, Intel and Microsoft to enable cloud-based access to its Genome Analysis Toolkit (GATK) software package. Through these collaborations, the GATK Best Practices pipeline will be available to users of cloud service providers through a software-as-a-service (SaaS) mechanism, expanding access beyond traditional desktop solutions. Broad will also work with collaborators to drive the creation of the next generation of GATK based on the Apache Spark computing framework.

“By providing a cloud-hosted solution, we can greatly expand access and facilitate usage of these genome analysis tools,” said Eric Banks, senior director of Data Sciences and Data Engineering at Broad and a creator of the GATK software package. “There are currently more than 31,000 registered users of the Broad Institute’s GATK. The vast majority set up an extensive local compute and storage infrastructure to process the huge amount of information required to conduct genomic analyses. These collaborations will provide new options that can remove traditional barriers of scale while offering the same high level of data quality.”

This effort extends work that began with the June 2015 alpha offering of GATK on Google Cloud Platform to additional cloud providers. (For an update on that project, see the April 5 Google Research Blog post.)

“Since the alpha launch of Broad Institute’s GATK on Google Genomics last summer, there has been a tremendous amount of interest. We have run many thousands of samples through this pipeline for a variety of users. We’ve also optimized the pipeline to make it remarkably cost effective,” said David Glazer, director of Google Genomics. “Working with Broad Institute to build and launch this pipeline has provided a powerful demonstration of Google Cloud Platform’s ability to accelerate life science.”

“It is a pleasure to be working with Broad to offer GATK on Microsoft Azure,” said David Heckerman of Microsoft Genomics. “This will greatly facilitate research and clinical genomic analyses.”

“As genomic data increasingly plays a role in research and treatment, cloud-based access to powerful analytic tools like GATK will be critical to accelerate precision medicine,” said Steve Harvey, vice president of Watson Health and head of Watson for Genomics. “We are eager to support data-driven insights for clinicians and researchers through the Watson Health Cloud.”

Users should be able to access cloud-based GATK options beginning later this year. Pricing will vary depending on the provider. GATK will remain available for existing and new users to download and deploy on their local infrastructure; Broad Institute provides it free for academic research and for a licensing fee for commercial users.

Beyond the cloud to GATK4

These collaborations will also help the Broad Institute drive the development of GATK4, the next generation of GATK. GATK4 will utilize the Spark distributed computing framework to facilitate parallelism and in-memory computations, thus speeding up the methods. GATK4 will also extend the range of use cases supported by GATK to include cancer, structural variation, copy number variation and more.

Already, Cloudera, Intel, and Google have contributed to the development of GATK4.

“Cloudera’s early commitment to Spark drove us to be the first Hadoop vendor to ship, support, and offer Spark training in 2014. We are honored to apply our expertise to the downstream multi-omic analysis space, investing in Spark as a bioinformatics standard, and working with Broad to create the next generation of GATK,” said Shawn Dolley, industry leader of Life Sciences at Cloudera.

“Optimizing GATK for cloud-based access will accelerate the utilization of genomic data to fuel new insights into disease and treatment,” said Eric Dishman, Intel vice president, Health and Life Sciences. “To tackle one of the biggest big data challenges, Intel is also working closely with the Broad Institute in co-developing tools compatible with GATK to eliminate the barriers to more effective and widespread use of large scale genomic workloads.”

Copyright © 2025 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media