Solutions Review’s Expert Insights Series is a collection of contributed articles written by industry experts in enterprise software categories. In this feature, Yugabyte Founder and CTO Karthik Ranganathan offers commentary on the evolution of operational databases.
As enterprises move applications and services to the cloud, they are migrating from big iron scale-up proprietary infrastructure to cloud native architectures. At first glance, traditional database management systems may appear cloud-ready for this critical change, but in reality they still face significant challenges around performance, availability, and simplicity when run at scale. Luckily, the evolution of operational databases has finally reached a point where it is ready to match cloud and modern application layers with a cloud native database. To understand how we got to where we are today, let’s walk through this database evolution and define what “cloud native” really means.
Moving from Monolithic to Distributed SQL – availability and sharding
The foundation of modern transactional databases starts with relational, or Structured Query Language (SQL) databases. These databases have set the standard for Online Transaction Processing (OLTP) for the past four decades. They have proved reliable, but are monolithic in nature, running on a single server. Increasing capacity requires vertical hardware scaling. This can be difficult and costly as it requires more specialized, expensive servers. Their method of copying data, via async or sync replication, can also result in data loss (with async) or downtime (with sync replication). SQL databases weren’t built to scale out to cloud architectures, so these problems are unsurprising.
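The replication tradeoff described above can be made concrete with a toy model. The following is a minimal sketch, not how any particular database implements replication: the `Primary` class, its `mode` flag, and the `ship` and `failover` methods are all hypothetical names introduced for illustration, and the model assumes a single primary with one replica.

```python
class Primary:
    """Toy primary with one replica (hypothetical, for illustration only)."""

    def __init__(self, mode):
        self.mode = mode          # "async" or "sync"
        self.log = []             # writes acknowledged to the client
        self.replica_log = []     # writes that have reached the replica
        self.replica_up = True

    def write(self, value):
        if self.mode == "sync":
            # Sync: the write only succeeds if the replica also has it,
            # so an unreachable replica means the write is rejected (downtime).
            if not self.replica_up:
                raise RuntimeError("replica unreachable: write rejected")
            self.log.append(value)
            self.replica_log.append(value)
        else:
            # Async: acknowledge immediately; replication happens later.
            self.log.append(value)
        return True

    def ship(self):
        """Async mode: copy acknowledged writes to the replica if reachable."""
        if self.replica_up:
            self.replica_log = list(self.log)

    def failover(self):
        """Promote the replica; return acknowledged writes it never received."""
        return [v for v in self.log if v not in self.replica_log]
```

For example, with `a = Primary("async")`, calling `a.write("t1")`, `a.ship()`, `a.write("t2")` and then `a.failover()` reports `"t2"` as lost, because it was acknowledged before it was replicated. In `"sync"` mode nothing acknowledged is ever lost, but setting `replica_up = False` makes every subsequent write fail.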
A significant step towards a viable solution came with sharded and distributed databases – first NoSQL, then NewSQL. However, NoSQL also has its limits. NoSQL databases emphasize scalability and high availability, but compromise on consistency, ACID transactions, and familiar relational data modeling. They are typically used for workloads with very simple data access patterns without requiring transactional guarantees. They are not ideal for business-critical OLTP or HTAP applications that require referential integrity, transactional capabilities and flexible data access patterns using features such as joins.
The tradeoffs between availability (NoSQL) and RDBMS capabilities (NewSQL) force organizations to make a choice. Many early NoSQL adopters (mostly large tech companies) are having second thoughts about pouring more money into that strategy. NewSQL databases are cloud hosted, but not fully cloud native. They represent a critical step in the evolution of databases, but often lead to operational complexity, difficult application development, and inconsistent customer experiences.
[Figure: Operational Database Evolution]
Solving NoSQL Challenges with Distributed SQL
The next stage in the evolution of databases is distributed SQL, which combines the strongest aspects of traditional relational database management systems (RDBMS) with key cloud native capabilities, including scale and availability, made popular by NoSQL databases.
Distributed SQL offers continuous availability, replicating data across nodes and keeping it available regardless of node, zone, region, or data center failures. It allows organizations to scale horizontally on demand without impacting performance and is strongly consistent across geographic zones. It offers developers familiar RDBMS features, allowing them to build data-driven applications, and can strengthen security by providing encryption at rest and in transit.
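One common way to stay available through a node or zone failure without giving up consistency is to require a majority of replicas for every operation. The sketch below assumes that approach and a three-zone layout; the `Cluster` class and its methods are hypothetical, and real systems add consensus protocols, leader election, and catch-up for recovering replicas, all of which are omitted here.

```python
class Cluster:
    """Toy replicated key-value store, one replica per zone (illustrative)."""

    def __init__(self, zones):
        self.up = {zone: True for zone in zones}
        self.data = {zone: {} for zone in zones}

    def _live(self):
        return [zone for zone, ok in self.up.items() if ok]

    def has_majority(self):
        # A strict majority of replicas must be reachable.
        return len(self._live()) > len(self.up) // 2

    def fail(self, zone):
        """Simulate losing a whole zone."""
        self.up[zone] = False

    def put(self, key, value):
        # Refuse the write without a majority rather than risk divergence.
        if not self.has_majority():
            return False
        for zone in self._live():
            self.data[zone][key] = value
        return True

    def get(self, key):
        if not self.has_majority():
            raise RuntimeError("no majority of replicas reachable")
        # Every live replica saw every committed write in this toy model.
        return self.data[self._live()[0]].get(key)
```

With three zones, losing one zone leaves two of three replicas, so reads and writes continue. Losing a second zone leaves no majority, and the cluster refuses writes instead of serving potentially inconsistent data.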
Although no database reference architecture is perfect for every application, a cloud native distributed SQL database solves many of the challenges that plague NoSQL and other approaches, providing a consistent, versatile, cloud native data layer.
Defining “Cloud Native” for Databases
Cloud native is one of the more recent phases of the operational database evolution. Cloud native is rapidly becoming common practice at the application layer, as enterprises move transactions, microservices, and other applications to dynamic, containerized environments. The same can’t be said of the data layer that supports those applications.
NoSQL and NewSQL databases can be hosted in the cloud and horizontally deployed on cloud infrastructure, but they fall short ‒ in one way or another ‒ of keeping the core database functionality uniform and reliable.
A true cloud native database has five defining characteristics:
- Extreme elasticity, with the ability to scale clusters up and down quickly and reliably, which lets the database use the virtually limitless resources in the cloud.
- Geo-redundancy and always-on availability to easily create multi-AZ and/or multi-region clusters, and expand or shrink availability zones or regions at any time while remaining resilient to both unplanned failures and planned upgrades. This is important because even the best public cloud services have suffered failures at the zone and region level.
- Simplified management, from hardware flexibility to seamless upgrades, which allows movement from one type of compute and storage to another for cost and performance reasons and makes it possible to leverage the latest technology advances in the cloud.
- Multi-cloud mobility to avoid cloud lock-in by moving to another cloud provider or co-existing with multiple providers.
- Data placement policies that define and enforce geo-specific data residency controls without impacting applications.
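To illustrate the last characteristic, a data placement policy can be thought of as a rule that maps each row to the region allowed to store it, applied by the data layer rather than by application code. The sketch below is a simplified assumption of how such a rule might look: the `RESIDENCY_POLICY` table, the region names, and the `region_for` and `place` functions are all hypothetical and do not correspond to any specific product's API.

```python
# Hypothetical policy: country code -> region where its rows must live.
RESIDENCY_POLICY = {
    "DE": "eu-central",
    "FR": "eu-central",
    "US": "us-east",
    "IN": "ap-south",
}


def region_for(row):
    """Return the region required for a row, based on its country."""
    country = row["country"]
    if country not in RESIDENCY_POLICY:
        # Fail loudly instead of silently placing data in a default region.
        raise ValueError(f"no residency policy for country {country!r}")
    return RESIDENCY_POLICY[country]


def place(rows):
    """Group row ids by the region the policy assigns them to."""
    placement = {}
    for row in rows:
        placement.setdefault(region_for(row), []).append(row["id"])
    return placement
```

Because the mapping lives in one place, changing where a country's data resides means updating the policy, not the application queries that read and write those rows.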
Today, business happens in the cloud. Enterprises will continue to expand operations beyond their own data centers and embrace the public cloud.
While a cloud approach helps organizations operate efficiently and effectively, enterprises need a multi-cloud model that gives them the same cloud native agility they get from microservices and containers. Modern, cloud native operational databases, like distributed SQL databases, fully support those operations, helping enterprises deliver high-value services much faster than legacy databases.
- Expert Commentary on the Evolution of Operational Databases - May 18, 2023