Last month we asked the question: Is Data Integration dead? To expand, the Big Data market is undergoing a transformation. Data volumes and unstructured data sources are growing exponentially, and analytics vendors are increasingly building into their own platforms the integration capabilities that used to belong to pure-play integration providers. As a result, the traditional ETL (extract, transform, load) model of integration is being challenged by new velocities and volumes of data and by newer data management platforms such as Hadoop and Spark.
Much of the marketing from the most popular Big Data vendors suggests that this is precisely where things are headed. However, that message needs to be taken with a grain of salt, since solution providers are trying to sell business people on the notion that only the shiniest new integration tools can be relied upon to get data jobs done. As a result, many IT departments are wondering whether they need to make the jump from legacy ETL tools to newer, more open data platforms that can support a wider variety of data formats.
While there are certainly scenarios where traditional Data Integration tools no longer meet the requirements for dealing with the data organizations now generate, ETL is still very much alive. The crowd has spoken.
After sharing our question with business professionals in IT, we were surprised by what we found. The responses made it abundantly clear that traditional Data Integration is still an integral part of modern digital business. Few companies outside of the Facebooks and Apples of the world capture the variety of data that would make more modern approaches to integration worthwhile at this time.
The fact of the matter is, while modern data management platforms can do a whole lot, the majority of organizations are still just getting their feet wet when it comes to analytics. In addition, a large percentage of today’s companies are not yet capable of integrating data from multiple sources, and many don’t even collect data from more than one source. The data warehouse model is still relevant to far too many kinds of companies for them to abandon it. On that same note, the rise of Apache Hadoop and Spark has certainly thrown a wrench into the Big Data market, though many organizations are using these platforms only as storage layers rather than as outright replacements for legacy integration techniques.
The bottom line is that marketing pitches have damaged the perception of traditional Data Integration and ETL, but in practice these methods are tried, true, and still widely used. One software developer even responded to our question by saying, “When businesses make decisions on this untruth; everyone loses! If it’s too easy to be true, it probably is. Good old hard work (ETL) proves the most valuable (DW) even in today’s market.”
Maybe in the future we can revisit this topic, but in the near term, it looks like the practice of traditional ETL-based integration is still very much alive. For now, we’ll just have to continue suffering through these ETL images that someone keeps creating in the Windows 95 version of Paint.