We’ve all heard the famous saying, “we all make mistakes”. It is an especially pertinent statement for the business IT industry, where a study by Experian QAS recently revealed that 65 per cent of businesses consider human error to be the cause of most of their data quality issues.
Fortunately, most human-generated data issues are easy enough to address. In this short article, Greg Richards, Sales and Marketing Director of business analytics specialist Connexica, looks at how the most common data mistakes can be addressed, and how CIOs can do so in a way that improves business intelligence.
Analytics projects generally start by bringing data together from siloed data sources into the main system. Here, the quality of the data you put in directly affects the quality of the resulting analysis. That’s why it is always surprising when an organisation doesn’t take data quality seriously.
Poor data quality does not initially pose a significant problem from a functional perspective. The data sources can still be brought into the analytics system, reports can be built and conclusions will be drawn. However, what if those conclusions are wrong? The conclusions we draw from data are supposed to guide the organisation’s strategy, so if the data is inaccurate, that inaccuracy ultimately feeds into the decisions that managers and directors are making.
It is critical that chief information officers (CIOs) and IT managers set up data quality rules to remove data inaccuracies, so that only the best quality data feeds into reports.
Most common data quality issues are caused by human error, because a lot of the data across an organisation is still entered manually. Sadly, we all make mistakes every now and then, and even minor mistakes can lead to problems such as duplicate records, misspelt words and inconsistent data formats.
The fundamental way to address data quality issues is to ensure that they are corrected at source. In my experience of working with companies across all manner of industries, businesses often address data quality issues by making changes in the analytics system but fail to change the source databases. This becomes a problem when changing providers or procuring another system that uses the same data sources. Fortunately, there are several best practice tips CIOs can follow.
Better data
The three big actions are auditing the quality of data, implementing data matching and defining a master record.
Quality audits are an easy first step. The objective is to find data sources plagued by low-quality data and rectify the information at source. Poor-quality data sources typically use very few rules to govern data input and formatting, so applying a rules-based approach to the audit quickly reveals where inaccurate values are likely to appear.
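To make this concrete, the sketch below shows what a simple rules-based check might look like. The field names and rules are assumptions made purely for illustration; in practice the rules would come from the organisation’s own data definitions rather than anything prescribed here.

```python
import re

# Illustrative rules only: the field names and formats below are assumptions
# for this sketch, not taken from any particular system.
RULES = {
    "surname": lambda v: bool(v) and v == v.strip() and v.replace("-", "").isalpha(),
    "postcode": lambda v: bool(re.fullmatch(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}", v or "")),
    "account_number": lambda v: bool(re.fullmatch(r"\d{6,10}", v or "")),
}

def audit(records):
    """Count rule failures per field, flagging where inaccurate values are likely."""
    failures = {field: 0 for field in RULES}
    for record in records:
        for field, rule in RULES.items():
            if not rule(record.get(field, "")):
                failures[field] += 1
    return failures

sample = [
    {"surname": "Johnston", "postcode": "ST4 4ET", "account_number": "100719"},
    {"surname": "  smith ", "postcode": "unknown", "account_number": "N/A"},
]
print(audit(sample))  # {'surname': 1, 'postcode': 1, 'account_number': 1}
```

A source whose failure counts are high relative to its size is exactly the kind of “plagued” source worth correcting before it feeds the analytics system.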
Data matching is slightly more demanding, because unique identifiers need to be built within the data itself from uncommon values in the database. The idea is that by combining several uncommon values into a unique identifier, matching records can be found, duplicates reduced and data accuracy improved.
A typical example for local authorities is matching a citizen’s record. This can be done by taking the last three characters from each of several selected values, such as surname, council account number and postcode, and combining them into a single identifier. It is unlikely you will get many instances of “TON7194ET”.
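Assuming the field names below, which are illustrative rather than taken from any council system, a minimal sketch of building and using such a composite key might look like this:

```python
from collections import defaultdict

def match_key(record, fields=("surname", "account_number", "postcode")):
    """Combine the last three characters of each chosen field into one identifier."""
    return "".join(str(record[f]).replace(" ", "").upper()[-3:] for f in fields)

def find_duplicates(records):
    """Group records sharing a composite key as likely duplicates."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return {key: recs for key, recs in groups.items() if len(recs) > 1}

citizen = {"surname": "Johnston", "account_number": "100719", "postcode": "ST4 4ET"}
print(match_key(citizen))  # TON7194ET
```

The point of choosing uncommon values is that the combination rarely collides, so records sharing a key are very likely to describe the same citizen.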
More powerful analytics software also allows you to define a master record. This is a database that maintains up-to-date, accurate values that can be used to drive amendments to other data sources where there are duplicates and other invalid data values. Connexica developed its CXAIR business intelligence software to overcome this otherwise long-winded, manual process.
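I won’t pretend the snippet below is how CXAIR does it; it is simply a conceptual sketch of what “driving amendments from a master record” means, with the illustrative assumption that records can be joined on an account number field.

```python
def apply_master(master_records, source_records, key_field="account_number"):
    """Overwrite fields in a source with the authoritative master values."""
    master_by_key = {m[key_field]: m for m in master_records}
    corrected = []
    for record in source_records:
        authoritative = master_by_key.get(record.get(key_field))
        # Where a master entry exists, its values take precedence over what was
        # entered locally; otherwise the record passes through unchanged.
        corrected.append({**record, **authoritative} if authoritative else record)
    return corrected

master = [{"account_number": "100719", "surname": "Johnston", "postcode": "ST4 4ET"}]
source = [{"account_number": "100719", "surname": "Jhonston", "postcode": "st44et"}]
print(apply_master(master, source))  # surname and postcode corrected from the master
```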
While the staff who enter data are only human and prone to errors, these mistakes do not need to impact reporting. With the right software and rules in place, businesses can ensure that their systems catch the mistakes that slip through.