Big data can become a company’s savior or a scam — depending on how well or poorly it is leveraged. Before it goes through the analytics process, data must score high in “data IQ,” which is a combination of data integrity and data quality. It is critical that leadership across functions recognize the importance of reliable data, because without a willingness throughout the enterprise to ensure its integrity and quality — in other words, its data IQ — efforts to leverage analytics throughout the organization will falter and may ultimately fail.
Although companies have traditionally used the terms “data integrity” and “data quality” interchangeably, they are distinct, complementary characteristics, and both are necessary to safeguard data in motion as it moves between systems. Data integrity checks are your umbrella protection to verify, balance, reconcile and track your data, while data quality takes it a step further to inspect individual fields for completeness, type/value conformance and consistency. A combination of the two leads to a higher data IQ, which equates to users trusting their data.
Not only is corporate data generated on a massive scale, but it also varies widely from source to source. And when data moves from system to system, there are plenty of opportunities for it not only to lack quality from the start, but also to lose its integrity along the way. With each handoff between systems, data can be lost, aggregated incorrectly or become inaccurate in the process.
In an organization, data drives the company forward by informing critical decisions, winning investors and attracting new customers. Improving data across an organization is a huge effort, and distinguishing between the key factors makes it easier to see which areas of the data will yield the greatest results when improved.
Unfortunately, many individuals responsible for data management aren’t familiar with how to identify data integrity and data quality, which is problematic for companies dependent upon accurate, reliable data. Many organizations don’t have controls that provide visibility into the health of data and, without enterprise visibility, it’s difficult — almost impossible — to understand your data IQ.
Like the foundation of a house, data integrity controls are the foundation of any organization. They detect when bad data is introduced by accident — or even on purpose by a person looking to commit fraud — and alert business stakeholders to the issue.
For example, think about when someone submits a large banking transaction online. Data integrity controls detect potential errors, such as documents that were accidentally sent to the bank twice, whether or not the amount of money on the document matches the amount of money the bank is actually moving, and whether or not the transaction was completed within the correct time frame.
Overall, these controls focus on automating four processes in order to streamline business operations:
- verification
- balancing
- reconciliation
- tracking (VBRT)
Data integrity controls verify that the data is complete and the files aren’t duplicated, among other things. They’ll also balance the data from report to report in real time. In addition, they can reconcile transactions across systems to ensure they are in sync, acting as a watchdog and flagging any issues. Lastly, data integrity controls track the timeliness of data in motion and any potential delays as it flows throughout the system.
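The four VBRT checks can be sketched in a few lines of Python. This is only an illustrative sketch, not any vendor's implementation: the record layout, the sample transactions, the function names and the one-hour delay window are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Hypothetical records: (transaction_id, amount_in_cents, timestamp).
source = [
    ("T1", 10_000, datetime(2015, 6, 1, 9, 0)),
    ("T2", 25_050, datetime(2015, 6, 1, 9, 5)),
    ("T2", 25_050, datetime(2015, 6, 1, 9, 5)),  # accidental duplicate submission
    ("T3", 7_500, datetime(2015, 6, 1, 9, 10)),
]
target = [
    ("T1", 10_000, datetime(2015, 6, 1, 9, 2)),
    ("T2", 25_050, datetime(2015, 6, 1, 11, 30)),  # arrived late
]

MAX_DELAY = timedelta(hours=1)  # assumed acceptable delay between systems


def verify(records):
    """Verify: return transaction IDs that appear more than once in a batch."""
    seen, duplicates = set(), set()
    for txn_id, _, _ in records:
        if txn_id in seen:
            duplicates.add(txn_id)
        seen.add(txn_id)
    return duplicates


def balance(src, tgt):
    """Balance: return source total minus target total, counting each ID once."""
    src_total = sum({txn_id: amount for txn_id, amount, _ in src}.values())
    tgt_total = sum({txn_id: amount for txn_id, amount, _ in tgt}.values())
    return src_total - tgt_total


def reconcile(src, tgt):
    """Reconcile: return IDs present in the source but missing from the target."""
    return {rec[0] for rec in src} - {rec[0] for rec in tgt}


def track(src, tgt, max_delay):
    """Track: return IDs that took longer than max_delay to reach the target."""
    sent = {txn_id: ts for txn_id, _, ts in src}
    return {
        txn_id
        for txn_id, _, ts in tgt
        if txn_id in sent and ts - sent[txn_id] > max_delay
    }
```

With the sample data above, `verify(source)` flags the duplicated T2, `balance(source, target)` reports a 7,500-cent shortfall, `reconcile(source, target)` shows that T3 never arrived, and `track(source, target, MAX_DELAY)` flags T2 as late. A production control would raise alerts to business stakeholders rather than simply return values.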
Business leaders typically look at data quality when it’s in its final resting place, such as a data warehouse or data lake. Unfortunately, by then every opportunity to prevent the bad data from propagating throughout your organization is gone.
This is where data quality controls come into play. While data integrity ensures the information is balanced and reconciled, data quality allows you to take a deeper look into the actual substance of the data. Is it trustworthy? Can you use it to confidently make business decisions?
Data integrity controls may look at VBRT, but data quality controls outline dimensions such as:
- completeness
- type conformance
- value conformance
- consistency
We call these the “Four Cs.” The combination of the four, along with initial data integrity controls, makes data accurate and reliable by providing a well-rounded approach to increasing your data IQ.
Think about when you submit healthcare claims or banking transactions for processing. Data quality controls will automatically identify any blank fields or empty values (even conditional blank fields). They will also make sure account numbers comply with vendor specifications. For example, all Visa credit card numbers start with the digit four, but unless that knowledge is set as a rule, any number could easily be accepted as accurate, no matter how meaningless it might be in the context of the requested entry.
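Completeness and conformance rules of this kind might look like the following Python sketch. The field names, the conditional rule (a card number is required only for card payments) and the function names are hypothetical, and the Visa check covers only the leading-digit rule mentioned above, not full card validation.

```python
from decimal import Decimal, InvalidOperation


def is_blank(value):
    """Treat None, empty strings and whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def missing_fields(record):
    """Completeness: list required fields that are blank.

    Includes one conditional rule (an assumption for this sketch):
    card_number is required only when payment_method is "card".
    """
    required = ["claim_id", "amount", "payment_method"]
    if record.get("payment_method") == "card":
        required.append("card_number")
    return [field for field in required if is_blank(record.get(field))]


def amount_is_decimal(record):
    """Type conformance: the amount must parse as a decimal number."""
    try:
        Decimal(str(record.get("amount")))
        return True
    except InvalidOperation:
        return False


def starts_like_visa(card_number):
    """Value conformance: digits only, with a leading 4."""
    digits = str(card_number).replace(" ", "")
    return digits.isdigit() and digits.startswith("4")


record = {
    "claim_id": "C-1",
    "amount": "12.50",
    "payment_method": "card",
    "card_number": "",
}
print(missing_fields(record))  # the conditionally required card_number is blank
```

Keeping each dimension in its own small function makes it easier to report which rule a record failed, rather than rejecting it with a single generic error.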
Lastly, data quality controls make sure that the values within each transaction or record are consistent with one another. For example, if a billing transaction has a field value for State equaling “Illinois,” then the Region field value for the same transaction should be “Midwest.”
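A cross-field consistency rule like the State and Region example can be expressed as a lookup. The sketch below is a minimal illustration: the function name is hypothetical, and the mapping holds only the single pair from the example above, where a real control would load a complete reference table.

```python
# Partial, illustrative lookup table; only the pair from the example is included.
STATE_TO_REGION = {"Illinois": "Midwest"}


def region_is_consistent(record):
    """Return True/False when the state is known, or None when it is not.

    Returning None for unknown states keeps "cannot be checked" separate
    from "checked and found inconsistent".
    """
    expected = STATE_TO_REGION.get(record.get("State"))
    if expected is None:
        return None
    return record.get("Region") == expected


good = {"State": "Illinois", "Region": "Midwest"}
bad = {"State": "Illinois", "Region": "Northeast"}
print(region_is_consistent(good), region_is_consistent(bad))
```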
Without data quality, organizations risk incomplete and inconsistent information, which obscures operational trends and metrics over time and reduces the trustworthiness of reports and compliance audits.
Basing decisions on faulty data trends can seriously harm a business, which is why combining data integrity and data quality is so important. Now that you understand the power of both, ask yourself: What are you doing to increase your data IQ?
Jeff Brown is Product Manager at Infogix.