Hadoop: The New Data Warehouse
The rapid adoption of Hadoop across the enterprise has created a shockwave that has put many Big Data and analytics professionals back on their heels. As Big Data technologies like our favorite yellow elephant have matured, a large majority of the IT vendors in this space have released Hadoop capabilities on top of their existing software to meet growing demand. Given the explosion of data volumes, types and sources over the last decade, it’s no surprise that Hadoop is being adopted at this rate. In the process, traditional data warehousing technologies are getting pushed to the back burner. Does this spell their demise?
While data warehouses traditionally rely on a single relational database that serves as a central store, Hadoop’s file system is designed to spread data across many machines, so it can handle volumes too large to fit on any one of them. Thus, Hadoop has become a natural fit for businesses that are expanding their data footprint, which describes nearly every company in every vertical.
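To make the "many machines" idea concrete, here is a minimal sketch of how a large file might be cut into fixed-size blocks and copied to several machines. It is a toy model, not Hadoop's actual placement logic: the function `place_blocks`, the node names, and the round-robin strategy are all hypothetical, and the 128 MB block size and three copies per block are assumed values chosen for illustration.

```python
from math import ceil

BLOCK_SIZE_MB = 128  # assumed block size, for illustration only
REPLICATION = 3      # assumed number of copies kept of each block


def place_blocks(file_size_mb, nodes,
                 block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Toy round-robin placement: returns {block_index: [node, ...]}."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    # A file is stored as a sequence of fixed-size blocks; the last may be partial.
    num_blocks = ceil(file_size_mb / block_size_mb)
    placement = {}
    for block in range(num_blocks):
        # Choose `replication` distinct nodes, starting at a rotating offset.
        placement[block] = [nodes[(block + r) % len(nodes)]
                            for r in range(replication)]
    return placement


# Hypothetical four-machine cluster storing a 1,000 MB file.
nodes = ["node-a", "node-b", "node-c", "node-d"]
layout = place_blocks(1000, nodes)
for block, holders in layout.items():
    print(block, holders)
```

No single machine has to hold the whole file, and losing one machine does not lose any block, because every block lives on more than one node. A real cluster weighs other factors when choosing where copies go, so treat this only as a picture of the principle.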
None of this is to say that Hadoop will become the market norm overnight. Many companies that currently use traditional data warehouses simply don’t yet see the cost benefit of changing over their entire data architecture. Others don’t have the upfront capital to start collecting every bit of data for analysis and will continue using legacy techniques for the time being. There is always a happy medium between rolling out emerging technology too slowly and reading too much into vendors who announce that a traditional framework has died overnight.
With that said, the Hadoop ecosystem remains in a constant state of evolution, and many companies are choosing hybrid approaches that span SQL and NoSQL. One thing is certain, though: traditional data warehousing is not the future of Big Data processing; it is the past. Technologies that exist expressly in traditional frameworks, such as ETL, will be rebranded and find new homes in emerging Big Data technologies. In time, Hadoop technologies will also grow tentacles that enable easier integration with legacy frameworks, so that enterprises can migrate to new solutions at their own pace.
While the data warehouse remains a staple in the enterprise as a springboard to deriving insights from Business Intelligence software, Hadoop is certainly pushing for its place among must-have technologies. The further our data collection spans, the faster Hadoop will become a technology that benefits not just the biggest data collectors, but the entire enterprise. It would be naive to call the data warehouse “dead”, just as it would be naive to say that Hadoop is the only way to process, integrate and manage data. But the time is coming when data collection will grow too far out of control for data warehousing to make even the slightest impact.