The joke that you can tell a pioneer by the arrows in their back is as true in the computing world today as it was way back when, during the time of the American frontier. HPC has always embraced the leading — and bleeding — edge of technology and, as such, it acts as the trailbreaker and scout for enterprise and business customers. Along with identifying the painful “arrows,” known as “excessed hardware” to modern technology organizations, HPC has highlighted and matured the abilities of previously risky devices, like graphics processing units (GPUs), that enterprise customers now leverage to create competitive advantage and increase market share. Around the world, GPUs are now busily churning away making money for their owners. In other words, GPUs have moved beyond “devices with potential” to “production devices” that are used for profit generation.
So, who are the winners who have staked their claims and are capitalizing on rows of GPU-accelerated servers that are busily humming away making money? First, think of companies with business models that require lots of computing capability without having to move outlandish amounts of data. Such organizations fit the current sweet spot in the GPU technology landscape: lots of computational capability for comparatively little (a) investment and (b) power consumption relative to CPU implementations.
Shazam is one example that uses GPUs in production to rapidly search and identify songs from its 27 million-track database. Basically, the user captures a short audio segment that is uploaded to Shazam, after which the GPU-accelerated servers hum away to find the best match given the uncertainties of environmental noise, poor microphone response and the arbitrary location of the excerpt within the song. Along with being a heavy computational task, song recognition is also a near-real-time task, as users are not willing to wait very long for the small amount of information that identifies the song and gives them the opportunity to purchase it. Note the search workflow: small amount of data uploaded, computationally intensive interactive task, and small amount of return data.
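To make that workflow concrete, here is a minimal, hypothetical Python sketch of a fingerprint-and-match pipeline with the same general shape: hash spectrogram peaks from a short, noisy query and vote on the time offset that best aligns it with a stored track. It is a toy illustration built on NumPy and synthetic signals, not Shazam's actual algorithm or code; in a production service, the transforms and lookups for millions of simultaneous queries would be batched across GPU-accelerated servers.

```python
# Toy fingerprint-and-match sketch (illustrative only, not Shazam's algorithm).
import numpy as np
from collections import defaultdict

def fingerprints(signal, frame=1024, hop=512):
    """Yield (hash, frame_index) pairs; the crude hash is the loudest FFT bin per frame."""
    for i, start in enumerate(range(0, len(signal) - frame, hop)):
        spectrum = np.abs(np.fft.rfft(signal[start:start + frame]))
        yield int(np.argmax(spectrum)), i

def build_index(tracks):
    """Map hash -> list of (track_id, frame_index) over every stored track."""
    index = defaultdict(list)
    for track_id, signal in tracks.items():
        for h, t in fingerprints(signal):
            index[h].append((track_id, t))
    return index

def identify(query, index):
    """Return the stored track whose fingerprints best align with the query."""
    votes = defaultdict(int)                       # (track_id, offset) -> vote count
    for h, t_query in fingerprints(query):
        for track_id, t_track in index.get(h, ()):
            votes[(track_id, t_track - t_query)] += 1
    (track_id, _), _ = max(votes.items(), key=lambda kv: kv[1])
    return track_id

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 8000                                      # sample rate in Hz
    t = np.arange(10 * fs) / fs                    # ten seconds per synthetic "song"
    tracks = {"song_a": np.sin(2 * np.pi * (200 + 30 * t) * t),
              "song_b": np.sin(2 * np.pi * (400 + 10 * t) * t)}
    index = build_index(tracks)
    start = 64 * 512                               # excerpt aligned to the hop for this toy demo
    query = tracks["song_a"][start:start + 2 * fs] + 0.3 * rng.standard_normal(2 * fs)
    print(identify(query, index))                  # expected: song_a
```

The point of the sketch is the workflow shape: the uploaded query is tiny, the matching work is compute-heavy, and the answer that comes back is a single track identifier.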
Now, multiply the “small amount of data uploaded” by Shazam’s 300+ million users who perform around 10 million song searches each day. Further, consider that Shazam is adding around two million new users every week, and that the company’s database of songs doubled last year, to get a sense of the stability and scalability requirements needed to convert production cycles to cash.
“GPUs enable us to handle our tremendous processing needs at a substantial cost savings, delivering twice the performance per dollar compared to a CPU-based system,” said Jason Titus, chief technology officer of Shazam Entertainment. “We are adding millions of video and foreign language audio tracks to our existing services, and GPU accelerators give us a way to achieve scalable growth.”
Next, think of organizations that do need to move outlandish amounts of data. Such organizations fit within two rough categories:
1. those that process large amounts of data, but where the profit-based activity is still “relatively” low-bandwidth (e.g., a search model)
2. those where both the processing and profit are high-bandwidth
Search organizations like Baidu, China’s largest search and Web services company, fit the “low” user-bandwidth-relative-to-computation model. Internally, Baidu is remarkably data-intensive, as it has aggressively deployed deep learning-based products ranging from speech recognition and translation, optical character recognition (OCR), facial and object recognition, and content-based image retrieval to online ad optimization. The training of neural networks using the Farber mapping has been shown to exhibit near-linear scalability to tens of thousands of CPUs, GPUs and Intel Xeon Phi coprocessors. Baidu is able to exploit this near-linear scaling behavior to train large deep neural networks behind the scenes with massive amounts of data, performing complex recognition tasks that were not previously possible with earlier CPU-only technology. Users then see only the relatively lightweight computation as the trained neural networks are integrated into the Baidu search pipelines.
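The scaling behavior comes from the data-parallel structure of the training computation. The following is a minimal sketch of that pattern, using a toy logistic-regression example in NumPy rather than anyone's production code: each simulated "device" computes a gradient on its own shard of the minibatch, the gradients are averaged, and every device applies the same update.

```python
# Toy data-parallel training sketch: shard the minibatch, average the gradients.
import numpy as np

def local_gradient(w, x, y):
    """Logistic-regression gradient computed on one device's shard of the minibatch."""
    p = 1.0 / (1.0 + np.exp(-x @ w))               # predictions for this shard
    return x.T @ (p - y) / len(y)                  # average gradient over the shard

def train(x, y, n_devices=4, lr=0.5, steps=200):
    w = np.zeros(x.shape[1])
    for _ in range(steps):
        shards = zip(np.array_split(x, n_devices), np.array_split(y, n_devices))
        grads = [local_gradient(w, xs, ys) for xs, ys in shards]   # computed in parallel in practice
        w -= lr * np.mean(grads, axis=0)           # "all-reduce": average gradients, apply one update
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4000, 20))
    true_w = rng.standard_normal(20)
    y = (x @ true_w > 0).astype(float)             # linearly separable toy labels
    w = train(x, y)
    print(f"training accuracy: {np.mean((x @ w > 0) == y):.3f}")
```

Because each gradient depends only on its local shard of data, adding devices scales nearly linearly until the cost of exchanging and averaging the gradients begins to dominate.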
In addition to Baidu, many other companies leverage the fast and efficient prediction capabilities of trained neural networks to service large numbers of interactive and real-time pattern recognition tasks. Augmented reality games, self-driving cars, autonomous drones, robots that can better operate in your home, and a number of other wondrous new markets are currently opening because the HPC pioneers demonstrated that GPUs can support viable petascale- and, yes, even exascale-capable application workflows.
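On the serving side, the common pattern is sketched below under similarly simplified assumptions (a toy two-layer network with placeholder weights, not any company's model): incoming user requests are grouped into a single batch so that one GPU-friendly forward pass answers many interactive queries at once.

```python
# Toy batched-inference sketch: many queued requests, one forward pass.
import numpy as np

def forward(batch, w1, b1, w2, b2):
    """One forward pass of a toy two-layer network over an entire batch of requests."""
    hidden = np.maximum(0.0, batch @ w1 + b1)      # ReLU hidden layer
    logits = hidden @ w2 + b2
    return np.argmax(logits, axis=1)               # one predicted class per request

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w1, b1 = rng.standard_normal((128, 64)), np.zeros(64)     # placeholder weights
    w2, b2 = rng.standard_normal((64, 10)), np.zeros(10)
    requests = [rng.standard_normal(128) for _ in range(256)]  # queued user queries
    batch = np.stack(requests)                     # micro-batching step
    print(forward(batch, w1, b1, w2, b2)[:8])      # answers for the first eight requests
```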
Even the mundane page ranking of search results (mundane at least in the age of Google) now benefits from rows of GPU-accelerated servers happily chewing away on data. Yandex, Russia’s most popular search engine, is one of the first search engine companies to broadly use production GPUs to improve the quality of ranking Web search results. The company reports it can train its ranking models 20 times faster than on CPUs, delivering the most relevant search results to Yandex’s 81 million monthly Web search users.
It has been commonly bandied about in the HPC community that the largest GPU-accelerated supercomputers are owned by the oil and gas companies. The question of whether these machines are the largest in the world is moot, as it is clear that they are very large. The profit-making activity of these machines is to busily hum along solving high-resolution reservoir models, then display, or further process, the resulting huge amounts of computed data to help pinpoint oil and gas deposits of sufficient size and value to justify the substantial investment required to eventually bring them to our homes and cars. As such, these machines fit the GPU-accelerated, data-intensive, computation-intensive model.
For example, the three PF/s (as reported by LINPACK) Eni GPU-accelerated oil and gas supercomputer is clearly a data-intensive system that also contains 7.5 petabytes (7,500 terabytes) of storage. This is the second supercomputer that Eni has deployed using the latest GPU computing technology to spur advances in exploration geophysics and reservoir simulation. Using its own proprietary code, Eni reports it can produce high-resolution 3-D subsurface images from seismic data at more than five times the speed of conventional supercomputers, creating very high-quality data that reduces exploration risk.
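For a sense of what such machines spend their cycles on, here is a minimal sketch of the kind of kernel at the heart of seismic imaging: one finite-difference time step of the 2-D acoustic wave equation. It is a toy NumPy version under simplifying assumptions (a small 2-D grid, periodic boundaries via np.roll); production codes such as Eni's proprietary software run highly tuned 3-D kernels of this flavor on GPUs over terabytes of field data.

```python
# Toy acoustic wave-propagation step, the core stencil of seismic imaging codes.
import numpy as np

def wave_step(p_prev, p_curr, velocity, dt, dx):
    """Advance the pressure field one time step using a 5-point Laplacian stencil."""
    lap = (np.roll(p_curr, 1, 0) + np.roll(p_curr, -1, 0) +
           np.roll(p_curr, 1, 1) + np.roll(p_curr, -1, 1) - 4.0 * p_curr) / dx**2
    return 2.0 * p_curr - p_prev + (velocity * dt)**2 * lap   # second-order time update

if __name__ == "__main__":
    n, dx, dt = 256, 10.0, 0.001                   # grid spacing (m) and time step (s)
    velocity = np.full((n, n), 1500.0)             # water-like 1500 m/s medium
    velocity[n // 2:, :] = 3000.0                  # a faster rock layer below
    p_prev = np.zeros((n, n))
    p_curr = np.zeros((n, n))
    p_curr[n // 4, n // 2] = 1.0                   # impulsive "shot" source
    for _ in range(500):                           # propagate the wavefield
        p_prev, p_curr = p_curr, wave_step(p_prev, p_curr, velocity, dt, dx)
    print(f"wavefield energy: {np.sum(p_curr**2):.3e}")
```

The stencil touches every grid point at every time step, which is why these workloads are both data-intensive and computation-intensive.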
Of course, one of the most prevalent applications for GPUs lies in animation — be it on the movie screen, seen via an augmented reality headset, or in video games. According to Statista, global PC and console games revenue in 2014 is expected to be around $46.5 billion. Meanwhile, anyone who has watched a movie during the past few years has probably enjoyed the remarkably realistic animations generated by rows of machines, many of which are GPU-accelerated, that have been busily humming away rendering the movie frame-by-frame.
Happily for creative individuals, the animation industry is expanding into the masses as teraflop-per-second GPUs make a cottage industry of independent animation studios possible. Online advertising and YouTube provide distribution channels outside of Hollywood. Pixar, for example, is leveraging the opportunity of the crowd (as opposed to the cloud) by giving out free copies of RenderMan for non-commercial use. Expect augmented reality animations to become the gold rush of the next decade. Think of content distribution through channels like augmented reality headsets, but also through toys that can be made with 3-D printers.
As we follow the trickle-down of GPU technology from HPC “pioneers” through the enterprise and to small business “settlers,” the number of options continues to grow. Think of private cloud computing moving from big organizations to small- and medium-sized businesses. The cloud — coupled with virtualization — creates a wonderful opportunity to create and sell apps for the cloud app-stores of the future.
Meanwhile, HPC organizations like Oak Ridge National Laboratory continue their industrial outreach programs that provide time on leadership class supercomputers, such as Titan, for commercial/enterprise organizations — thus ensuring that the benefits of the HPC “pioneers” continue to trickle down to the enterprise and, eventually, small business.
Rob Farber is an independent HPC expert to startups and Fortune 100 companies, as well as government and academic organizations. He may be reached at [email protected].