2010 called and wants its data profiling technologies back. According to a new study commissioned by SourceMedia Research and Paxata, nearly 70 percent of organizations still use Microsoft Excel as their main means of doing data preparation. The report, entitled The State of Data Quality in the Enterprise 2018, compiles the findings of the 2017 Data Quality Study. The study surveyed 290 executives and IT professionals at organizations with $100 million or more in annual revenue and found that they remain well behind the eight-ball in a number of data-related tasks.
One of the more troubling revelations in the report is that only 15 percent of organizations have deployed viable data quality models. It’s at least somewhat positive news that 40 percent are in the process of developing those frameworks. Another 38 percent have just begun thinking about data quality in any meaningful way. These figures aren’t that surprising given the reality that organizations are being bombarded with data from all directions. Many are simultaneously trying to balance data ingestion, storage, integration, preparation, and analytics.
The report outlines two main challenges when it comes to overall data quality. The first and most obvious is data variety, with enterprises attempting to ingest data from several sources at a time. Roughly 37 percent of enterprise data comes from external sources. The second is data complexity: though organizations are still using mostly structured data (64 percent), more than 20 percent report ingesting a nearly fifty-fifty split of structured and unstructured data.
We’ve highlighted before the time-consuming nature of data preparation, and even organizations with mature data quality frameworks struggle with this. In fact, low-level data preparation tasks are the greatest culprits, with 30 percent of time spent on ingestion and 42 percent spent on data profiling. The report found, unsurprisingly, that the more time organizations spent on overall data quality, the more likely they were to be pleased with the results. So it’s safe to assume this conundrum isn’t going away anytime soon.
It’s interesting that even those organizations with mature data quality models still utilize Excel to some degree, which raises a number of questions. At first glance it would seem as though many of these organizations just haven’t yet made the switch to dedicated data preparation tools. But this isn’t the case, as a number of the professionals polled indicated that Excel still plays a role in their data profiling activities. There’s also the fact that many of those polled aren’t using unstructured data to the degree that would warrant stand-alone data preparation.
Solutions Review is most interested in what the buyers think. Those polled who are currently in-market for data quality solutions have identified the capabilities they are most interested in acquiring in the future. They include support for interactivity with structured and unstructured data, ease of use for their non-technical users, and the ability to ingest and prepare large data volumes.
We highly encourage you to read the full report (registration required).