Data made large by increasing volumes and velocities has imposed its will upon enterprise architectures and strained the integration process. Like author Robert M. Pirsig’s character in Zen and the Art of Motorcycle Maintenance, IT teams and business analysts are on a never-ending journey to uncover quality. In their case, the goal is data quality, something that is increasingly difficult to achieve given the disparate nature of modern data collection.
When we talk about disparate data, we’re describing data that is unalike in nature or distinctly different in kind or quality. This wouldn’t mean much in a vacuum. However, this data cannot be integrated by traditional methods because it does not share any common denominator. As a result, data in this form cannot meet business demands for analysis. This makes for a tedious (and that’s wording it lightly) process of trying to meld data together to store in a repository for later use.
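To make the "common denominator" problem concrete, here is a minimal sketch of blending two sources that describe the same customers in different formats. Everything in it is an assumption made for illustration: the sample CRM export, the event feed, the field names (`cust_id`, `customerId`), and the helper functions are hypothetical, and it only works because a shared key can be derived from both sources, which is often not the case with truly disparate data.

```python
import csv
import io
import json

# Hypothetical sample inputs: a CRM export (CSV) and a web event feed (JSON lines).
CRM_CSV = (
    "cust_id,full_name,signup\n"
    "0042,Ada Lovelace,2016-01-15\n"
    "0007,Alan Turing,2015-11-02\n"
)
EVENTS_JSONL = (
    '{"customerId": 42, "event": "login"}\n'
    '{"customerId": 7, "event": "purchase"}\n'
    '{"customerId": 99, "event": "login"}\n'
)


def normalize_crm(text):
    """Index CRM rows by an integer customer id (zero-padded strings become ints)."""
    rows = csv.DictReader(io.StringIO(text))
    return {int(r["cust_id"]): {"name": r["full_name"], "signup": r["signup"]} for r in rows}


def normalize_events(text):
    """Parse JSON-lines events into records that use the same integer key."""
    records = []
    for line in text.splitlines():
        if line.strip():
            rec = json.loads(line)
            records.append({"customer_id": int(rec["customerId"]), "event": rec["event"]})
    return records


def blend(crm, events):
    """Attach CRM attributes to each event; unmatched events keep name=None."""
    blended = []
    for e in events:
        customer = crm.get(e["customer_id"])
        blended.append({
            "customer_id": e["customer_id"],
            "event": e["event"],
            "name": customer["name"] if customer else None,
        })
    return blended


blended = blend(normalize_crm(CRM_CSV), normalize_events(EVENTS_JSONL))
```

Even in this toy case, the two sources disagree on the key's name and type, and one event has no matching customer at all. Multiply that by dozens of sources and formats and the tedium described above becomes clear.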
The crux of the disparate data issue is that enterprises and technology vendors have been slow to adapt to changing data collection trends. This plays a big role in why so many chief executives view data integration as a major hurdle to uncovering business insights. Digital organizations are never going back; new data types and formats will only continue to emerge as technology enables proliferation, creating the need for more agile ways of managing enterprise data.
Analyst house Gartner says it best in its new Magic Quadrant report for Business Intelligence and Analytics: “The market shift from IT-led to modern agile business-led analytics is now mainstream.” As a result, there is increasing need for tools that automate data discovery, data preparation, and the blending of disparate data. Solution providers are scrambling to fast-track offerings that match up with the new ways of doing business that lie outside the traditional data warehouse landscape.
The growing need to analyze streaming data also adds to the headache, and the need to blend existing IT infrastructures with Big Data systems, cloud computing platforms and other emerging technologies only exacerbates it. There is a real need for a broader integration strategy among organizations that wish to expand their data blending.
The emergence of the Data Lake certainly has its benefits, but it isn’t a ready-made solution to a growing conundrum. Deployment of Data Lakes has grown considerably in recent years, due in large part to the fact that heterogeneous data has found a home on their shores. However, what we wrote earlier this year still rings true: the future is not so much about how much data businesses can accumulate, but how they can best absorb it in forms that provide them the knowledge they seek.