Bernd Harzog’s 2016 Big Data Market Predictions: Part 2

By Tim King, Executive Editor at Solutions Review

Prediction 2

By Bernd Harzog

6. Relationships between sources of data are crucial: Consider some retail sales data, where we have sales by product, store and geography. The customer has an account with us, so we know some basic demographics like gender and age range. All of this data comes to us from the online business system. Now we need to understand whether the operation of the site affects revenue. This means that we need to combine end-user experience data, application performance data and IT Operations data with business data as all of these disparate streams arrive. Batch ETL cannot keep up with real-time analysis of this kind, because there is no time to run it. It needs to be replaced with real-time, continuous discovery of the relationships between streams of data from disparate sources as the data arrives.
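As a rough illustration of prediction 6, the Python sketch below relates two streams event by event as they arrive, instead of loading them in a batch job first. The event shapes, the `session_id` key and the `StreamCorrelator` class are assumptions made up for this example and do not describe any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SessionView:
    """Everything known so far about one user session (hypothetical shape)."""
    page_load_ms: list = field(default_factory=list)
    revenue: float = 0.0


class StreamCorrelator:
    """Relates events from different streams by a shared key as each one arrives."""

    def __init__(self):
        self.sessions = defaultdict(SessionView)

    def on_event(self, source, event):
        # Assumption: both streams carry a session ID that links them.
        view = self.sessions[event["session_id"]]
        if source == "performance":
            view.page_load_ms.append(event["page_load_ms"])
        elif source == "sales":
            view.revenue += event["amount"]
        return view


correlator = StreamCorrelator()

# Events interleaved in arrival order, as they might come off two separate feeds.
arrivals = [
    ("performance", {"session_id": "s1", "page_load_ms": 420}),
    ("performance", {"session_id": "s2", "page_load_ms": 3900}),
    ("sales", {"session_id": "s1", "amount": 59.90}),
    ("performance", {"session_id": "s1", "page_load_ms": 380}),
]
for source, event in arrivals:
    correlator.on_event(source, event)

for session_id, view in sorted(correlator.sessions.items()):
    print(session_id, view.page_load_ms, view.revenue)
```

The combined view of a session is up to date after every event, so a question like "did slow page loads cost us this sale?" can be asked without waiting for a nightly load.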

7. Statistical relationships are only valuable against deterministically related data: Much hope has been placed in “Big Data analytics” and “machine learning.” Both are powerful techniques, which are benefiting from enormous innovations created by very smart people doing world-class work in self-learning algorithms. To the earlier point of “garbage in, garbage out” – the best algorithms will produce mediocre or worthless results when applied against data where the items have no inherent relationship.
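A small Python sketch can make prediction 7 concrete. The numbers are invented for illustration only: six sessions, each with a page latency and an amount spent. The same spend values are correlated against latency twice, once paired with the correct session and once paired with the wrong rows, as would happen if the two sources were not reliably related before analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up data for six sessions; not taken from any real system.
latency_ms = [200, 400, 800, 1600, 3200, 6400]
spend_joined = [90, 80, 60, 50, 20, 10]      # each value paired with its own session
spend_misjoined = [60, 10, 90, 50, 80, 20]   # same values, attached to the wrong rows

print("correctly related:", round(pearson(latency_ms, spend_joined), 2))
print("wrongly related:  ", round(pearson(latency_ms, spend_misjoined), 2))
```

With the correct pairing the relationship between latency and spend is strong. With the wrong pairing the same algorithm and the same values show only a weak one, which is the "garbage in, garbage out" effect in miniature.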

8. The graph database is a crucial innovation: The graph database is an important part of the solution to this problem. LinkedIn can show you who you are connected to and show you your entire “social graph” (who is connected to whom). We need to connect our business, IT Operations and IoT data in the same manner.
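To show what connecting business and IT Operations data "in the same manner" might look like, here is a minimal sketch in plain Python. It is not a graph database and does not use any graph database API. The entity names and the "depends on" relationship are hypothetical, chosen only to show the kind of question a graph makes easy to answer.

```python
from collections import defaultdict, deque

# Hypothetical "depends on" relationships spanning business and IT Operations data.
DEPENDS_ON = [
    ("checkout-revenue", "checkout-service"),
    ("checkout-service", "vm-17"),
    ("search-service", "vm-22"),
    ("vm-17", "host-a"),
    ("vm-22", "host-a"),
    ("catalog-service", "vm-30"),
    ("vm-30", "host-b"),
]


def build_dependents(edges):
    """Invert the edges: map each component to the things that rely on it."""
    dependents = defaultdict(set)
    for node, dependency in edges:
        dependents[dependency].add(node)
    return dependents


def affected_by(start, dependents):
    """Breadth-first walk from a component to everything that relies on it."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for node in dependents.get(current, ()):
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return seen


dependents = build_dependents(DEPENDS_ON)
print(sorted(affected_by("host-a", dependents)))
# prints ['checkout-revenue', 'checkout-service', 'search-service', 'vm-17', 'vm-22']
```

A problem on `host-a` is traced through the virtual machines and services to the revenue stream that depends on them. A graph database would store the same relationships persistently and answer the query in its own query language.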


9. The Hadoop stack is not the only future of Big Data: The Hadoop stack has provided an exponential improvement upon the capabilities of the previous and now legacy data warehouse class of solutions. But when it comes to real-time stream processing, with the requirement to ingest data continuously and process it for immediate consumption, Hadoop is itself a legacy solution. The modern innovations in this area include Cassandra, InfluxDB, FiloDB, ScyllaDB, Spark, Kafka and various graph databases like Neo4j and Titan. In fact, there is even an acronym for a real-time Big Data stack, SMACK: Spark, Mesos, Akka, Cassandra and Kafka. If you want to run it at scale, then run SMACK HARD, where HARD stands for High Availability, Redundancy and Distributed.

10. Real-time Big Data will bind the entire online enterprise together: Every part of the modern online enterprise is producing valuable streams of data. Each stream constitutes “Big Data” on its own. Together, they constitute a real-time deluge of data that must be collectively related, processed and made useful within seconds of ingest. Successful online enterprises will use their ability to act upon real-time Big Data to gain an advantage over slower-moving rivals. In today’s world, it is not about the big vs. the small. It is about the fast vs. the slow.

Click here for part one of Harzog’s 2016 Big Data market predictions.

Bernd Harzog is the CEO and Founder of OpsDataStore Inc., where he is responsible for the strategy, execution and financing activities of the company. Before Bernd founded OpsDataStore, he was the CEO and founder of APM Experts, CEO of RTO Software, Inc., founding VP of Products at Netuitive, a general manager at XcelleNet, and a Research Director for the Gartner Group focusing upon the Windows Server Operating family of products. Connect with him on LinkedIn.

This article was written by Tim King on February 16, 2016

Tim King

Executive Editor

Tim is Solutions Review's Executive Editor and leads coverage on data management and analytics. A 2017 and 2018 Most Influential Business Journalist and 2021 "Who's Who" in Data Management, Tim is a recognized industry thought leader and changemaker. Story? Reach him via email at tking@solutionsreview dot com.
