Research & Development World

Dealing with massive data from minuscule communities

By R&D Editors | August 2, 2012

It’s
relatively easy to collect massive amounts of data on microbes. But the
files are so large that it takes days to simply transmit them to other
researchers and months to analyze once they are received.

Researchers
at Michigan State University have developed a new computational
technique, featured in the current issue of the Proceedings of the
National Academy of Sciences, that relieves the logjam that these “big
data” issues create.

Microbial
communities living in soil or the ocean are quite complicated. Their
genomic data is easy enough to collect, but their data sets are so big
that they actually overwhelm today’s computers. C. Titus Brown, MSU
assistant professor in bioinformatics, demonstrates a general technique
that can be applied to most microbial communities.

The
interesting twist is that the team created a solution using small
computers, a novel approach considering most bioinformatics research
focuses on supercomputers, Brown said.

“To
thoroughly examine a gram of soil, we need to generate about 50
terabases of genomic sequence—about 1,000 times more data than generated
for the initial human genome project,” said Brown, who co-authored
the paper with Jim Tiedje, University Distinguished Professor of
microbiology and molecular genetics. “That would take about 50 laptops
to store that much data. Our paper shows the way to make it work on a
much smaller scale.”

Analyzing
DNA data using traditional computing methods is like trying to eat a
large pizza in a single bite. The huge influx of data bogs down
computers’ memory and causes them to choke. The new method employs a
filter that folds the pizza up compactly using a special data structure.
This allows computers to nibble at slices of the data and eventually
digest the entire sequence. This technique creates a 40-fold decrease in
memory requirements, allowing scientists to plow through reams of data
without using a supercomputer.
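
The article does not name the "special data structure." The paper's title points to probabilistic de Bruijn graphs, and one common way to build such a graph is with a Bloom filter: a fixed-size bit array that records which short DNA words (k-mers) have been seen, at the cost of occasional false positives. The Python sketch below illustrates that general idea only. It is not the researchers' code, and the class name, function names, hash choice, and sizes are all assumptions made for illustration.

```python
import hashlib


class KmerBloomFilter:
    """Illustrative Bloom filter: a fixed-size bit array recording seen k-mers.

    Hypothetical helper for this sketch, not taken from the published software.
    """

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # Memory use is fixed up front, no matter how many k-mers are added.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive several bit positions per k-mer by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(kmer)
        )


def kmers(sequence, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def successors(bloom, kmer):
    """Walk the implicit graph: k-mers that overlap this one by k-1 bases."""
    suffix = kmer[1:]
    return [suffix + base for base in "ACGT" if suffix + base in bloom]


# Reads are consumed one at a time; only the bit array stays in memory.
bloom = KmerBloomFilter(num_bits=1024, num_hashes=3)
for read in ["ACGTACGTGA"]:
    for kmer in kmers(read, 4):
        bloom.add(kmer)

print(successors(bloom, "ACGT"))
```

Because the graph's edges are never stored and are instead rediscovered by asking the filter about each of the four possible next bases, the memory footprint depends on the size of the bit array rather than on the volume of raw sequence. That is the sense in which a computer can "nibble" at the data a read at a time.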

Brown
and Tiedje will continue to pursue this line of research, and they are
encouraging others to improve upon it as well. The researchers made the
complete source code and the ancillary software available to the public
to encourage extension.

“We
want this program to continue to evolve and improve,” Brown said. “In
fact, it already has. Other researchers have taken our approach in a new
direction and made a better genome assembler.”

Scaling metagenome sequence assembly with probabilistic de Bruijn graphs

Source: Michigan State University
