Research & Development World

When Massive Data Never Becomes Big Data

By R&D Editors | June 18, 2014

The recent PRACE Days conference in Barcelona provided powerful reminders that massive data doesn’t always become big data, mainly because moving and storing massive data can cost massive money. PRACE is the Partnership for Advanced Computing in Europe, and the 2014 conference was the first to bring together scientific and industrial users of the PRACE supercomputers located in major European nations.

Tom Lange of Procter & Gamble, which has used high-performance computing (HPC) systems to help design consumer products including Pampers diapers and Pringles potato chips, noted that although P&G manufactures billions of Pampers a year, all with data sensors affixed, the data is deleted once the diapers exit the manufacturing line. “P&G doesn’t have any big data,” he stated, explaining that “businesses typically won’t pay for storing data that doesn’t make them money.” Someone has to make the economic argument for storing data, especially when it comes in petabyte-size chunks.

This revelation sparked a lively discussion about who decides which data gets saved when its future value is unknown. For businesses like P&G, the process is clear: if no one steps forward to pay for storage, the data probably won’t be saved. For massive data collected by governments, decisions can be more serendipitous, and comprehensive policies are often lacking today.

Because the HPC community confronts truly massive data, much of it meeting the “four Vs” definitional criteria for big data (volume, variety, velocity, and value), leading HPC users are already wrestling with issues that may not affect mainstream IT markets for a while. For example:

  • The Large Hadron Collider at CERN generates 1 PB of data per second when it’s operating, and the upcoming Square Kilometre Array telescope will disgorge 1 EB of data (1,000 PB) per day. In both cases, only a small fraction of the data will be stored; storing it all would break the budget.
  • China’s Internet-of-Things initiative is targeting a 10,000-fold data reduction to avoid having to confront 100 ZB of data from home sensors alone by 2030. A major strategy of the effort, spearheaded by the Chinese Academy of Sciences, is to create a single mega-sensor that will do the work of the 40-50 device sensors that would otherwise be expected to inhabit the average Chinese home at that time.
  • The HPC community has already figured out how to exploit the benefits of Hadoop while evading Hadoop’s requirement to store three copies of data. The workaround is to decouple Hadoop from the Hadoop Distributed File System (HDFS) and attach it instead to a fully POSIX-compliant parallel file system.
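The scale of these volumes is easier to appreciate with quick arithmetic. The sketch below converts the cited figures to petabytes; the per-petabyte storage price is a purely illustrative assumption, not a number from the article:

```python
# Unit conversions: 1 EB = 1,000 PB; 1 ZB = 1,000,000 PB.
PB_PER_EB = 1_000
PB_PER_ZB = 1_000_000

# SKA: 1 EB of data per day (article figure).
ska_pb_per_year = 1 * PB_PER_EB * 365  # 365,000 PB per year

# Hypothetical storage price of $100,000 per PB per year (assumption).
COST_PER_PB_YEAR = 100_000
ska_storage_cost = ska_pb_per_year * COST_PER_PB_YEAR  # ~$36.5 billion/year

# China IoT: 100 ZB of raw sensor data by 2030, cut 10,000-fold (article figures).
raw_pb = 100 * PB_PER_ZB      # 100,000,000 PB of raw data
reduced_pb = raw_pb / 10_000  # 10,000 PB (10 EB) after reduction

print(ska_pb_per_year, ska_storage_cost, reduced_pb)
```

Even at this deliberately modest assumed price, storing the full SKA stream would run to tens of billions of dollars a year, which is why only a small fraction is kept.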

Storage is not the only daunting expense associated with massive data. Moving data can also be very costly. Technical experts say that a single computation typically requires 1 picojoule of energy, but moving the result of that computation may cost as much as 100 picojoules. At an average $1 million per megawatt, skyrocketing energy costs have become the number two issue in the worldwide HPC community, right after the perennial desire for bigger budgets. Hence, curtailing data movement is a bull’s-eye target for cost-conscious HPC users in government, academia, and industry.
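Those per-operation figures can be turned into system-level power and cost with simple arithmetic. The sketch below assumes a hypothetical exaflop-class machine (10^18 operations per second); the machine size is an illustrative assumption, while the picojoule and dollar figures come from the paragraph above:

```python
PICOJOULE = 1e-12  # joules per picojoule

ops_per_sec = 1e18  # hypothetical exaflop-class system (assumption)
compute_pj = 1.0    # ~1 pJ per computation (article figure)
move_pj = 100.0     # up to 100 pJ to move each result (article figure)

# Power draw in megawatts for computing versus moving every result.
compute_mw = ops_per_sec * compute_pj * PICOJOULE / 1e6  # 1 MW for the math
move_mw = ops_per_sec * move_pj * PICOJOULE / 1e6        # 100 MW for movement

DOLLARS_PER_MW = 1_000_000  # "$1 million per megawatt" (article figure)
movement_bill = move_mw * DOLLARS_PER_MW  # $100 million

print(compute_mw, move_mw, movement_bill)
```

The hundredfold gap between computing a result and moving it is exactly why curtailing data movement has become such a prominent cost target.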

The HPC community was arguably the original home of big data and still operates at the leading edge of many big data developments, but costs for data movement and storage will continue to act as a brake on the growth of HPC big data. Even with this partial brake, IDC forecasts that the server market for high-performance data analysis (big data using HPC) will grow rapidly, at a 23.5% CAGR, to reach $2.7 billion in 2018, and that the related storage market will expand to about $1.6 billion in the same year.
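The forecast arithmetic can be checked directly. Assuming the 23.5% CAGR runs over the five years from 2013 to 2018 (the base year is an assumption; the article names only the endpoint), the implied growth multiplier and starting market size are:

```python
cagr = 0.235    # 23.5% compound annual growth rate (article figure)
years = 5       # assumed 2013-2018 horizon
endpoint_b = 2.7  # $2.7 billion server market in 2018 (article figure)

multiplier = (1 + cagr) ** years          # ~2.87x growth over the period
implied_base_b = endpoint_b / multiplier  # ~$0.94 billion implied starting size

print(round(multiplier, 2), round(implied_base_b, 2))
```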

Steve Conway is Research VP, IDC High Performance Computing Group.
