In a recent InfoWorld column entitled “3 Key Ways Hadoop is Evolving”, Serdar Yegulalp discusses how the big data platform is shifting, focusing on news that broke at the recently completed Strata + Hadoop World conference. According to the author, the big data landscape has changed a great deal, particularly over the last six months. Drawing on the sessions, speeches, and new technologies presented at the event, he outlines the most significant recent trends.
At Solutions Review, we take a vendor-neutral view of the big data landscape, so the column seemed worth summarizing. Here are the three key ways Hadoop is evolving, according to Yegulalp:
Spark is leading Hadoop
Spark’s popularity is soaring, helped by Cloudera’s support and, in large part, by the fact that it offers self-service data processing through a common API, regardless of where the data is stored. Arsalan Tavakoli-Shiraji, VP of Customer Engagement for Databricks (Spark’s chief commercial proponent), spoke about those who aim to gain business value from data: “[They] eagerly want data, whether they’re using SQL, R, or Python, but hate calling IT.” IBM’s VP of Product Development for IBM Analytics added that Hadoop data lakes often become dumping grounds that lack the business value Spark can provide.
Hadoop’s target audience is changing
In the beginning, the selling point for Hadoop was that it was a data repository. Now that this is a given, the emphasis is on having people with the skills to plug into it and get value out of it. Solutions Review didn’t have a presence at Strata + Hadoop World this year, but according to the author: “This time around, the words ‘data lake’ were barely mentioned in the keynotes, and only in a derogatory tone. Talk of ‘citizen data scientists,’ ‘using big data for good,’ and smart decision making with data was offered instead.”
The sentiment at the event seemed to be that self-service tools for data science on Hadoop offer more actual value than the ability to aggregate data from a variety of outside sources. The focus is now on free-form data science rather than free-form data storage.
Hadoop is a breeding ground for new tech
Although Hadoop is still extremely important and relevant, more attention is now being paid to the individual pieces that have emerged from it and are being used to create new products. According to the author, Spark is the clearest example, based on what it can already do and what it is projected to do in the future. Hadoop isn’t going anywhere anytime soon, but some of its pieces have shown the ability to stand on their own, meaning there might come a day when the bright yellow elephant isn’t quite the notable icon it is today. Hadoop has been so popular that many vendors are vying to deliver their own spin on the platform. Imitation is, after all, the sincerest form of flattery.
For more on the current state of Hadoop, ETL, Spark and Big Data, download a free copy of our 2016 Data Integration Buyers Guide.