Everyone has heard the adage that time is money. Business today moves at the speed of a phone call, a quick search on a cell phone, or a tweet. So, when time is money (and it can be a lot of money), why are businesses okay with waiting weeks or even months to get valuable information from their data?
The answer is that they likely aren’t okay with it, but they seem resigned to the status quo. The current data pipeline relies on legacy x86 architectures that are littered with bottlenecks, ranging from processing-power limitations, expensive transform-and-load and indexing operations, and relatively slow networking to complex software stacks. What should be a simple, quick action has become one bogged down by lengthy, complex processes. Solutions such as Hadoop and Spark have been created (and revised over and over again) to try to solve the problem; however, to get the performance required, organizations have to scale out to massive clusters, which introduces a new set of bottlenecks. The already-slow status quo is made even slower for businesses that have more than just a few megabytes of data to store and analyze.
These issues are only compounded by the inability to analyze both batch and streaming data at the same time on the same platform. With data being created faster and faster — tweets, photos, purchase information, and more — what was true 10 minutes ago may no longer be up to date, accurate, or valuable to the organization.
The speed of data isn’t the only complexity — the vast majority of data has some sort of human element and isn’t easily parsed by machines built to deal with binary 1s and 0s. Machines just don’t think the way humans do. IT is trying to compensate by using current architectures that rely on sprawling clusters to analyze data faster, but, unfortunately, this isn’t getting the job done. Massive increases in the volume of data and in the variety of data types have pushed traditional architectures to the breaking point. With the Internet of Things (IoT) phenomenon and the resulting prediction of another data explosion, the issue is only set to get worse.
But is real-time analysis of data really that much more valuable than the wait times we’ve grown accustomed to accepting? The answer is unequivocally yes, and it holds across many industries. Take the medical field, where genome sequencing plays a major role in finding causes of and cures for diseases. Being able to take a new genome sequence and immediately tell whether it matches something already known provides massive process improvements. And if researchers can get faster answers, that leads to improvements to the bottom line.
Healthcare is just one case — the financial industry requires real-time data analysis for fraud detection. Imagine a credit card company being able to instantly look at a cross-section of log files, geo-location information, and purchase history to deny fraudulent charges at checkout. Real-time analysis can also be a huge benefit in criminal forensics. If agencies could search thousands of fingerprints or DNA profiles instantly, they would be able to immediately find the answers they need to solve crimes.
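As a purely illustrative sketch — not a description of any real issuer’s system — the toy rules below combine two of the signals mentioned above, geo-location and purchase history, into the kind of yes/no decision that would have to complete in milliseconds at checkout. Every name, field, and threshold here is hypothetical:

```python
from datetime import datetime

def is_suspicious(charge, last_charge, avg_amount,
                  max_speed_kmh=900.0, amount_factor=10.0):
    """Toy fraud check (hypothetical rules and thresholds).

    charge / last_charge: dicts with 'time' (datetime), 'km_from_home'
    (distance in km from a reference point), and 'amount'.
    avg_amount: the cardholder's historical average purchase amount.
    """
    # Rule 1: the implied travel speed between two consecutive charges
    # exceeds what is physically plausible (roughly airliner speed here).
    hours = (charge["time"] - last_charge["time"]).total_seconds() / 3600.0
    distance = abs(charge["km_from_home"] - last_charge["km_from_home"])
    if hours > 0 and distance / hours > max_speed_kmh:
        return True
    # Rule 2: the amount is far above the cardholder's historical average.
    if charge["amount"] > amount_factor * avg_amount:
        return True
    return False

# Example: a charge 5,000 km away only 30 minutes after the previous one.
last = {"time": datetime(2024, 1, 1, 12, 0), "km_from_home": 0.0, "amount": 40.0}
far = {"time": datetime(2024, 1, 1, 12, 30), "km_from_home": 5000.0, "amount": 50.0}
print(is_suspicious(far, last, avg_amount=50.0))  # True
```

Production systems, of course, score far more signals with statistical or machine-learning models; the point of the sketch is that the check itself is cheap — the hard part is assembling current data fast enough to run it before the transaction completes.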
Driving growth, reducing costs, gaining competitive advantages, and unlocking answers that were previously out of reach are imperatives for today’s businesses, but to achieve them, organizations must rip delays out of the entire process. Delays cost money. Delays reduce the value of answers. Delays make management lose faith in big data initiatives.
The key to big data success is reducing mean time to decisions (MTTD). IoT, mobile and other trends only make the need to reduce MTTD even more critical, because time, delays, uncertainty and rigidity are the enemies of success — data does have an expiration date.
The speed of business today and the economics of big data implementations mean that industries reliant on unlocking value from big data can no longer survive without a sea change in technologies and processes. To contribute to success, companies need to reduce MTTD with faster-to-deploy, higher-performance, more agile platforms. It’s a fact of data analytics that organizations often don’t really know what they can get from their data until they start experimenting, so they need the right tools to experiment quickly rather than at a snail’s pace.
The bottom line is that faster access, analysis, answers, and value are critical to the success of data analytics initiatives. But before organizations can begin unlocking the true value of their data and reducing MTTD, they will need new approaches to big data analysis and entirely new processing architectures.
Pat McGarry is Vice President of Engineering with Ryft Systems.