Report to the President: Every Federal Agency Needs a ‘Big Data’ Strategy
The Zettabyte Age is upon us: the Networking and Information Technology working group notes that the exponential growth of data volume is now ‘a fast-growing concern.’ A report from the President’s Council of Advisors on Science and Technology (PCAST), covering many areas of technology, has concluded that “every Federal agency needs to have a ‘Big Data’ strategy.”
The report, “Designing a Digital Future,” adds that “the collection, management and analysis of data is a fast-growing concern of NIT research” at a time when “data volumes are growing exponentially” due to the “proliferation of sensors and new data sources.”
The 119-page report is characterized by five broad themes, the first of which is the information explosion and big data. It also notes the rising importance of data mining, described as “the transformation of data into knowledge, and of knowledge into action.” Contributors to the report included the PCAST Federal Networking and Information Technology Research and Development (NITRD) Program Review Working Group.
The 14-member NITRD working group includes Stephen Brobst, chief technology officer for Teradata, a company focused on data warehousing. Brobst is recognized for his work with global business and government clients in identifying and developing opportunities for the strategic use of technology. He is also a frequent teacher and presenter at The Data Warehousing Institute events.
“We are entering the Zettabyte Age, and it is not hyperbole to say ‘every Federal agency needs to have a big data strategy.’ My colleagues and I in the Federal NITRD working group agree that this is an imperative, and leading commercial organizations are already refining such strategies,” Brobst said.
“However, it is vital that government agencies, in particular, act now — because the increasing Federal commitment to U.S. citizens is openness and transparency — which rely on more powerful data management systems that can handle many simultaneous complex queries, extensive dashboarding, and capacities of speed and workload optimization that together must analyze masses of data from a broadening variety of data sources,” Brobst said.
Brobst speaks from extensive experience as CTO of a company focused on data warehousing and analytics, many of whose customers manage “big data” on the scale of many petabytes. Several of these customers, including eBay, mine data from the global Web. Under Brobst’s thought leadership, Teradata was also the first appliance vendor to introduce commercial database platforms with 100-percent solid-state drive configurations, which process massive data volumes to deliver real-time analysis and decision making.
Big Data is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage and process the data within a tolerable elapsed time. Examples of big data sources include Web logs, sensor networks, social media, detailed data captured from telecommunications networks, astronomical observations, biological systems, military surveillance, medical records, photographic archives and video archives.
Big Data requires exceptional technologies to efficiently process large quantities of information. Advances in massively parallel processing (MPP) frameworks, such as shared-nothing relational databases, MapReduce programming frameworks, and cloud infrastructure, are essential for harnessing the value of Big Data.
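For readers unfamiliar with the MapReduce model mentioned above, the following is a minimal single-process sketch in Python. It does not come from the report or from any vendor product. The function names (map_phase, shuffle_phase, reduce_phase, map_reduce) and the sample log lines are hypothetical, chosen only to illustrate the pattern of mapping records to key-value pairs, grouping by key, and reducing each group.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit one (token, 1) pair per whitespace-separated token in a record."""
    return [(token.lower(), 1) for token in record.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def map_reduce(records):
    """Run the three phases in sequence on an in-memory list of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical Web-log fragments, used only for illustration.
    sample = ["GET /home", "GET /search", "POST /login"]
    print(map_reduce(sample))
```

In a real deployment the map and reduce steps would run in parallel across many machines, each holding its own slice of the data. This sketch runs them sequentially in one process purely to show the data flow.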
“Virtually all the objectives of government today require better, faster decisions backed by key facts and insights — which must often be discovered by analyzing mountains of data,” said Richard Winter, president of Winter, a consultancy that specializes in very large data solutions. “The stakes are now too high for the outdated approaches that got us to this point. Most agencies will need to deploy a range of analytic database tools, including highly parallel data warehouse platforms, MapReduce and other technologies. It is crucial that agencies use the right tool for a given requirement. Only a well-defined, forward looking analytic data strategy will position them to do that.”
About the NITRD Working Group
The Federal Networking and Information Technology Research and Development (NITRD) Program is the mechanism by which the Federal Government coordinates its unclassified research and development investments in Networking and Information Technology. In 2010, PCAST appointed a 14-member Working Group to lead an assessment to help ensure continued United States leadership in high-performance computing, science and engineering. The program is ongoing, and the Working Group continues to meet for these purposes.