Dark Data 101: There’s Nothing “Dark” About It

By Mads C. Brink Hansen

With Halloween just around the corner, it’s time for spooky things that go bump in the dark of night. But “dark data” shouldn’t scare you. Sure, it may seem a little frightful to think about all the options we have when it comes to leveraging data these days, but there’s no reason to be afraid. You just need the right tools and strategy to set your team up for success. Here’s how to do just that:

Defining Dark Data

To begin, what is dark data anyhow? In short, there are two options companies have when searching for more data to analyze: dig into external data or search for insights that are hidden within the data that already exists in your company. The latter is known as dark data.

Here’s a practical application of dark data: let’s say you’re a manufacturing company that is humming along with your business processes. You have a solid BI and analytics strategy in place that delivers the daily reports and analyses you need to measure, monitor and make decisions. But then something happens. A quality issue has suddenly put you at risk of losing a valuable customer. Your existing analysis flags the problem: one of your production lines is less effective than the others. But your analysis doesn’t give you the information you need to figure out why.

Harnessing the unused data you already have is easier than diving into the world of big data. That’s not to say that working with internal data doesn’t involve some trial and error. This kind of experimentation is called “sandbox analytics,” and it is designed to help bring dark data into the light.

Dark Data is Less Scary with Sandbox Analytics

With sandbox analytics, a small group of BI users experiment with potentially useful data. Only if that data proves valuable is it distributed for greater use throughout the organization. The ability to play with big datasets and analyze them on top of what’s already in the data warehouse encourages employees to think strategically without the need to pull in IT.

A comprehensive data discovery tool will also let users bring in data that isn’t already available for analysis and mash it up with the data used in current reports and analyses. In the manufacturing example above, users can compare the time each employee on the underperforming production line clocks in and out each day with the data already used to analyze shifts across the company. The comparison shows that these employees alternate between day and evening shifts much more frequently than employees on other teams, which is the likely cause of the quality issue: frequent shift changes go hand in hand with lower employee satisfaction and lower production quality. This previously dark data can be moved into an existing data model now or at a later time, if so desired.
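As a rough illustration of this kind of mash-up, the Python sketch below counts how often each employee switches between day and evening shifts and averages that count per production line. Everything in it is hypothetical: the records, the line names, and the function names are invented for this example, and a real data discovery tool would pull the clock-in data from a time-tracking system instead.

```python
from collections import defaultdict

# Hypothetical clock-in records: (employee, production_line, day, shift).
# All values are invented for illustration only.
CLOCK_INS = [
    ("ann", "line_a", 1, "day"), ("ann", "line_a", 2, "evening"),
    ("ann", "line_a", 3, "day"), ("ann", "line_a", 4, "evening"),
    ("bob", "line_a", 1, "evening"), ("bob", "line_a", 2, "day"),
    ("bob", "line_a", 3, "evening"), ("bob", "line_a", 4, "evening"),
    ("cat", "line_b", 1, "day"), ("cat", "line_b", 2, "day"),
    ("cat", "line_b", 3, "day"), ("cat", "line_b", 4, "evening"),
    ("dan", "line_b", 1, "evening"), ("dan", "line_b", 2, "evening"),
    ("dan", "line_b", 3, "evening"), ("dan", "line_b", 4, "evening"),
]


def shift_switches_per_employee(records):
    """Count how many times each employee's shift differs from the previous day."""
    by_employee = defaultdict(list)
    for employee, line, day, shift in records:
        by_employee[(employee, line)].append((day, shift))

    switches = {}
    for key, entries in by_employee.items():
        entries.sort()  # order each employee's records by day
        shifts = [shift for _, shift in entries]
        switches[key] = sum(
            1 for prev, cur in zip(shifts, shifts[1:]) if prev != cur
        )
    return switches


def average_switches_per_line(records):
    """Average the per-employee switch counts for each production line."""
    counts_by_line = defaultdict(list)
    for (_, line), count in shift_switches_per_employee(records).items():
        counts_by_line[line].append(count)
    return {line: sum(c) / len(c) for line, c in counts_by_line.items()}


if __name__ == "__main__":
    for line, avg in sorted(average_switches_per_line(CLOCK_INS).items()):
        print(f"{line}: {avg:.1f} shift switches per employee")
```

In this made-up dataset, one line stands out with far more shift switches per employee than the other, which is the kind of signal a BI user would then set against the quality figures already in the data warehouse.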

Having identified the problem, the company can now fix it and monitor the progress with BI. It is also easier to monitor hours, shifts, and product quality across all teams, and new best practices can be applied across all production teams. Shining light on the right dark data can reveal more than you might originally think.

Dark Data and Your Analytics Strategy

As the example above shows, with the right tools, the process of uncovering internal dark data should require minimal hand-holding from the IT specialists within your company. When BI users are empowered to create an experimental environment on their own, there’s even potential for cost savings. Creating a full development cycle around what might be a bunk hypothesis is a waste of time and money. In “sandbox mode,” by contrast, if a hunch doesn’t prove useful, the hypothesis can be rejected and users can move on to the next one quickly, without tying up many resources.
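To make the idea of quickly rejecting a hunch concrete, here is a minimal sketch of a sandbox-style screen in Python. It assumes a handful of per-team figures (all invented here), computes a plain Pearson correlation between a candidate factor and the defect rate, and drops the hunch if the relationship is weak. The 0.5 cutoff and the function names are assumptions made for this example, and with so few data points this is a rough first filter, not a proper statistical test.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences.

    Assumes both sequences vary; a constant sequence would divide by zero.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def keep_hypothesis(factor, outcome, threshold=0.5):
    """Keep a hunch only if the correlation is at least `threshold` in size.

    The threshold is an arbitrary cutoff chosen for this illustration.
    """
    return abs(pearson(factor, outcome)) >= threshold


# Hypothetical per-team figures, invented for illustration only.
defect_rate = [1.0, 1.5, 2.0, 3.0]        # defects per 100 units
shift_switches = [0.5, 1.0, 1.5, 2.5]     # average weekly shift switches
lunch_minutes = [30, 45, 45, 30]          # length of lunch break

if __name__ == "__main__":
    for name, factor in [("shift switches", shift_switches),
                         ("lunch length", lunch_minutes)]:
        verdict = "keep" if keep_hypothesis(factor, defect_rate) else "reject"
        print(f"{name}: {verdict}")
```

A hunch that fails this kind of check costs a few minutes of a BI user's time, not a development cycle.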

Uncovering a little bit of dark data in a relatively short period of time adds tremendous value to the BI project as a whole. So, don’t be afraid of dark data, even if it is “Halloween season” – it can actually end up bringing light to your business.

Mads C. Brink Hansen is a BI Project Manager at TARGIT. Throughout his many years of working in the BI world, he has been involved in a wide range of BI projects, from small to large and from self-service to centrally implemented BI. He has served as developer, architect, and project manager. In addition to being a BI practitioner, he is an external lecturer in Business Analytics at Aarhus BSS, Aarhus University, focusing on bringing new developments into the field and proven practices into the classroom.

Timothy King

Editor, Data and Analytics at Solutions Review
Timothy is an enterprise technology writer and analyst at Solutions Review, covering Business Intelligence and Data Analytics, Data Integration and Data Management. He holds a Bachelor of Arts Degree in History from the University of Massachusetts Lowell. Timothy believes that data can allow us to predict things about our future, just as history has aided in the uncovering of our past.