Data Mesh Architecture Overview and Technology Introduction
This is part of Solutions Review’s Premium Content Series, a collection of contributed columns written by industry experts in maturing software categories. In this submission, Macrometa Co-Founder and CEO Chetan Venkatesh offers a data mesh architecture overview and advice when evaluating related technologies.
It’s almost cliché to discuss entering a “new age” in tech. It wasn’t that long ago that we were discussing big data, machine learning, and AI as novel ideas. As we learn and grow, the technologies we use inevitably change with our worldview. The latest sea change centers on data: not only how we view it, but how quickly we gain insights from it. When it comes to enterprises, data has to be fast, actionable, and highly available. A data mesh architecture embodies all three criteria.
The Age of Data Decentralization
Data comes from everywhere. We’re not entering a new age of data decentralization; we’re already living in it. At first, we thought the best way to get value from this data was to gather it and look for patterns: a grand data aggregation where all data comes together, is magically made sense of, and is then dispersed back from whence it came. The amount of data generated on any given day is enormous, and most of it is just “noise,” but we wouldn’t know that until it was analyzed. Data was a means to an end; the actual value came from the decisions we made because of the data. In some ways that’s still true. But we’ve learned a few things along the way:
- The value in data is fleeting
- Data is distributed
- Moving data is expensive
The value of data depends greatly on how quickly it’s processed. Distance adds latency, so by the time data is sent to some centralized cloud for analytics and the results are returned, it has already lost some of its value. In some use cases, this kind of delay isn’t an option; security, IoT, and real-time and event-driven applications come to mind. Organizations that can’t operate on the latest data are inherently at a disadvantage. Data is an asset that must be used quickly. In this new (but not really) distributed data age, data is a product that is valuable in itself. Enterprises that understand this concept are turning to a data mesh architecture.
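To put the point about distance in rough numbers, here is a back-of-the-envelope sketch. It assumes signals travel through fiber at about 200,000 km/s (roughly two thirds of the speed of light in a vacuum). The two distances are hypothetical, and real latency would be higher once routing, queuing, and processing time are added.

```python
# Back-of-the-envelope propagation delay; every figure here is an illustrative assumption.
SPEED_IN_FIBER_KM_PER_S = 200_000  # roughly two thirds of the speed of light in a vacuum


def round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip propagation delay over fiber, in milliseconds."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_S * 1000


nearby_edge_km = 50        # hypothetical edge location in the same metro area
distant_cloud_km = 4_000   # hypothetical centralized cloud region

edge_ms = round_trip_ms(nearby_edge_km)
cloud_ms = round_trip_ms(distant_cloud_km)
print(f"edge: {edge_ms:.1f} ms, cloud: {cloud_ms:.1f} ms")
# prints "edge: 0.5 ms, cloud: 40.0 ms"
```

Even this lower bound shows an 80x gap per round trip, before any analytics work has been done.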
A data mesh architecture is based on a distributed data model, which means that data is spread across multiple nodes in a network. This provides a number of advantages over traditional centralized data models. First, it allows data to be processed closer to where it’s generated, which can improve performance and reduce costs. Second, it makes the data more available for use by applications and services. And third, it enables data to be replicated and synchronized across multiple devices and locations, which enhances its availability and resilience.
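As a minimal sketch of processing data close to where it is generated, the example below has each node reduce its own raw readings to a small summary, so only the summaries need to travel. The node names, the readings, and the `summarize` helper are all hypothetical and not taken from any particular product.

```python
from statistics import mean

# Hypothetical raw sensor readings held at three edge nodes (illustrative data only).
readings_by_node = {
    "edge-eu": [21.0, 21.5, 22.0, 21.5],
    "edge-us": [18.0, 18.5, 19.0],
    "edge-ap": [27.0, 26.0, 28.0, 27.0, 27.0],
}


def summarize(readings):
    """Reduce raw readings to a small summary at the node that produced them."""
    return {"count": len(readings), "mean": mean(readings), "max": max(readings)}


# Each node summarizes locally; only these summaries would cross the network.
summaries = {node: summarize(values) for node, values in readings_by_node.items()}

# Combine the per-node summaries into a global mean, weighted by each node's count.
total_count = sum(s["count"] for s in summaries.values())
global_mean = sum(s["mean"] * s["count"] for s in summaries.values()) / total_count
global_max = max(s["max"] for s in summaries.values())

print(f"global mean: {global_mean:.2f}, global max: {global_max}")
```

The global figures match what a central system would compute from the raw readings, but the raw readings never leave their nodes.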
The centralized cloud is the architecture of big data, but fast data needs its own architecture, one that embraces distribution and decentralization. The added step of sending data to a central cloud for analysis isn’t necessary anymore; we have everything we need to do this at the edge.
By distributing data processing across multiple nodes, data can be processed in parallel, which can significantly improve performance. And by using techniques such as aggressive caching and prefetching, data can be made available even faster. As a result, a data mesh architecture is well-suited for demanding applications such as real-time analytics and machine learning.
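The combination of parallel processing and caching can be sketched with Python’s standard library. This is only an illustration under stated assumptions: threads stand in for independent nodes, the partitions and node names are hypothetical, and `functools.lru_cache` stands in for whatever caching layer a real platform provides. Prefetching is not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# Hypothetical partitions of an event stream, one per node (illustrative data only).
partitions = {
    "node-a": (3, 5, 8),
    "node-b": (13, 21),
    "node-c": (34, 55, 89),
}


@lru_cache(maxsize=128)
def partition_total(node: str) -> int:
    """Aggregate one partition; repeated requests are answered from the cache."""
    return sum(partitions[node])


def grand_total() -> int:
    # Threads stand in for independent nodes working on their partitions at once.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        return sum(pool.map(partition_total, partitions))


first = grand_total()    # computed from the partitions
second = grand_total()   # served from the cache
cache_hits = partition_total.cache_info().hits
print(first, second, cache_hits)
```

The second call returns the same total without recomputing any partition, which is the effect a cache is meant to have on repeated queries.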
Data-driven enterprises are starting to see data with new eyes: instead of fighting decentralization, they use it to their advantage. The beauty of a distributed data mesh is that it doesn’t demand the rip-and-replace mindset that dominated the move to the cloud so many years ago. You don’t replace data sources; you connect them. A true data mesh solution can operate across multi-cloud, hybrid cloud, on-prem, data center, and edge environments. A data warehouse isn’t the place for all data; it’s just a node on a larger network. It’s all about processing these data streams in the most efficient way possible, not forcing them into some legacy mold. It’s about using the right tool for the job.
What to Look for in a Data Mesh
Enterprises considering a data mesh architecture should keep these requirements in mind:
- Can you ingest and process data globally?
- Can you link existing databases, data warehouses, data lakes, and other stores?
- Can you access data where it resides or do you have to move it to gain insight?
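To make the last question concrete, here is a small sketch of querying data where it resides. Two in-memory SQLite databases stand in for independent sources. Each source runs the filter and aggregation itself, and only the small partial results are merged. The source names, the `orders` table, and its rows are hypothetical.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory SQLite database standing in for one independent data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


# Hypothetical sources: a warehouse and an edge store (names and rows are illustrative).
sources = {
    "warehouse": make_source([("eu", 120.0), ("us", 80.0), ("eu", 40.0)]),
    "edge-store": make_source([("eu", 10.0), ("ap", 55.0)]),
}


def total_for_region(region: str) -> float:
    """Run the filter and aggregation inside each source, then merge the small results."""
    total = 0.0
    for conn in sources.values():
        (partial,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE region = ?", (region,)
        ).fetchone()
        total += partial
    return total


print(total_for_region("eu"))  # prints 170.0
```

Only one number per source is transferred, rather than every matching row, which is the practical difference between accessing data in place and moving it to gain insight.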
For data-driven enterprises, a data mesh architecture provides a number of benefits that can help them get the most out of their data. With its distributed data model, fast data processing capabilities, and enhanced availability and resilience, a global data mesh can help organizations realize the full potential of real-time data.