Searching for data management and database software can be a daunting (and expensive) process, one that requires long hours of research and deep pockets. The most popular enterprise database tools often provide more than what’s necessary for non-enterprise organizations, with advanced functionality relevant to only the most technically savvy users. Thankfully, there are a number of options we profile in this open source database list. Some of these solutions are offered by vendors looking to eventually sell you on their enterprise product, and others are maintained and operated by a community of developers looking to democratize the data management space.
In this article we examine free and open source database software, first with a brief overview of what to expect and then with short blurbs on each of the currently available options in the space. This is the most complete and up-to-date directory on the web.
Apache Derby is an Apache DB subproject. The open source relational database is written entirely in Java and is available under the Apache License, Version 2.0. Derby’s footprint is only 3.5 megabytes for the base engine and embedded JDBC driver, which makes it practical to embed in any Java-based solution. It also supports client/server mode through the Derby Network Client JDBC driver and Derby Network Server, and is easy to install, deploy, and use. The most recent patch release was made public in March 2019.
Apache Hive is an open source data warehouse built on top of the Apache Hadoop ecosystem. It was designed to facilitate data summarization, ad-hoc queries, and the analysis of extremely large data volumes stored in various databases and file systems that integrate with Hadoop. Hive offers an excellent package for applying structure to large amounts of unstructured data and performing batch SQL-like queries on it. It integrates with traditional data center solutions that use the JDBC/ODBC interface.
CUBRID is a free relational database management engine that features built-in enterprise functionality. Key features of the software include object-oriented relationships between database elements, data sharding, a native middleware broker, high-performance data caching, and customizable and extensible globalization support. CUBRID also provides a high level of SQL compatibility with MySQL and other open source databases, and uses different licenses for its server engine and for its tools and interfaces.
Firebird is a relational database that offers many ANSI SQL standard features. It runs on Linux, Windows, and a number of Unix platforms. The software provides concurrency, high performance, and language support for stored procedures and triggers. Firebird is a commercially independent project of C and C++ programmers, and has been used in production systems since 1981. Anyone can build a custom version of Firebird as long as the modifications are made available under the same IDPL licensing for others to use.
H2 Database Engine
H2 is a lightweight Java database that can be embedded in Java applications or run in client-server mode. The software includes an embedded web server with a browser-based console application, as well as command line tools to start and stop a server, back up and restore databases, and a command line shell tool. H2 supports a subset of the SQL standard, and its main programming APIs are SQL and JDBC. It also supports using the PostgreSQL ODBC driver.
HyperSQL (HSQLDB) is a relational database engine written in Java. It features a JDBC driver and conforms to ANSI SQL:2016. It provides a small but fast multi-threaded engine and server with memory and disk tables, LOBs, transaction isolation, multiversion concurrency, and ACID. Both embedded and server modes are available, and the package includes tools such as a minimal web server, command line and GUI management tools, as well as a number of demonstration examples. Advanced capabilities are headlined by user-defined SQL procedures and functions, schemas, and datetime intervals.
MariaDB is an open source and commercially supported fork of the MySQL relational database management system. It was developed by the original creators of MySQL and turns data into structured information in a wide array of applications. MariaDB features an expansive ecosystem of storage engines, plugins and many other tools. According to the official site, the latest version of MariaDB includes GIS and JSON functionality. The database is supported by Microsoft Azure and Amazon RDS (since 2015).
MongoDB is a cross-platform document-oriented database. It is classified as a NoSQL database program, and uses JSON-like documents with optional schemas. The software is developed by MongoDB Inc. and licensed under the Server Side Public License. Key features include ad hoc queries, indexing, and real-time aggregation, as well as a document model that maps to the objects in your application code. MongoDB provides drivers for more than 10 languages, and the community has built dozens more.
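To make the idea of a document model that maps to application objects more concrete, here is a minimal sketch in plain Python. It does not use MongoDB or any MongoDB driver; the `Customer` and `Address` classes and the `to_document` helper are hypothetical names chosen for illustration. It only shows how a nested object can become a single JSON-like document, rather than being split across several relational tables.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Address:
    # Hypothetical nested object; it is stored inside the parent document.
    city: str
    country: str


@dataclass
class Customer:
    # Hypothetical application object used only for this illustration.
    name: str
    addresses: list = field(default_factory=list)


def to_document(customer):
    """Convert an application object into a JSON-like dict (a 'document')."""
    # asdict() recurses into nested dataclasses, including those inside lists.
    return asdict(customer)


if __name__ == "__main__":
    ada = Customer("Ada", [Address("London", "UK")])
    # The nested address travels with the customer as one document.
    print(json.dumps(to_document(ada), indent=2))
```

A document database would store a structure like this as one record, which is why the shape of the stored data can mirror the shape of the objects in application code.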
MySQL Community Edition is the most popular open source relational database management system. It is available under the GPL and is supported by a large and active community of developers. It offers both SQL and NoSQL access for developing relational and NoSQL applications. It also provides a document store that features X Protocol, X DevAPI, and MySQL Shell. MySQL is available on more than 20 platforms and operating systems, including Linux, Unix, Mac, and Windows.
Neo4j is an open source graph database management system that is optimized for fast management, storage, and traversal of nodes and relationships. Neo4j provides real-time performance, and features a flexible schema, drivers for popular languages and frameworks, cloud connectivity, hot backups, and data import capabilities. Common use cases for this tool include software analytics, network management, matchmaking, scientific research, and project management.
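For readers new to graph databases, the following sketch illustrates what traversing nodes and relationships means. It is plain Python over an in-memory adjacency list, not Neo4j or its query language; the function name `reachable_within` and the sample data are assumptions made for this example.

```python
from collections import deque


def reachable_within(graph, start, max_hops):
    """Breadth-first traversal: map each reachable node to its hop count.

    `graph` is a dict mapping a node to a list of directly related nodes.
    Only nodes at most `max_hops` relationships away from `start` are returned.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not follow relationships beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    hops.pop(start)  # the starting node itself is not a result
    return hops


if __name__ == "__main__":
    # Hypothetical "knows" relationships, as in a matchmaking use case.
    knows = {"alice": ["bob"], "bob": ["carol"], "carol": ["dave"]}
    print(reachable_within(knows, "alice", 2))
```

A graph database stores relationships as first-class records, so this kind of hop-by-hop walk does not require the repeated joins a relational schema would need.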
OrientDB is a NoSQL database management system written in Java. It is a multi-model database that supports graph, document, key/value, and object models. Relationships are managed as in graph databases, with direct connections between records. OrientDB development relies on an open source community that is led by OrientDB LTD, and uses GitHub to manage the source code, contributors, and versioning. A Google Group and Stack Overflow provide free support to users around the globe.
PostgreSQL is an object-relational database system that uses and extends the SQL language. It comes with many features aimed at helping users build applications, protect data integrity, and build fault-tolerant environments. PostgreSQL conforms to 160 of the 179 mandatory features for SQL:2011 Core conformance, and supports a wide variety of data types. The software is highly extensible, and many of its features, such as indexes, have defined APIs so that you can build on them to solve unique challenges.
Splice Machine is an open source database management system that is powered by Hadoop and Spark. Key features of the software include ANSI SQL-99 coverage, ACID transactions with Snapshot Isolation semantics, in-place updates that scale from one row to millions, and secondary indexing in both unique and non-unique forms. One of Splice Machine’s building blocks is Apache HBase.
SQLite is a relational database management system contained in a C library. The software acts as a self-contained, serverless, transactional SQL database engine. It does not have a separate server process and reads and writes directly to ordinary disk files. The database file format is cross-platform, so a database can be freely copied between 32-bit and 64-bit systems. SQLite is a compact library: with all features enabled, it can be less than 600KiB, depending on the target platform and compiler optimization settings.
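The serverless design is easy to see in practice. The sketch below uses the `sqlite3` module from Python's standard library; the `roundtrip` function name and the `engines` table are assumptions made for this example. It writes a row to an ordinary file, closes the connection, and reads the row back, without starting or contacting any server process.

```python
import os
import sqlite3
import tempfile


def roundtrip(db_path):
    """Write one row to a SQLite file, then reopen the file and read it back."""
    conn = sqlite3.connect(db_path)  # creates the file if it does not exist
    try:
        with conn:  # commits the transaction on success
            conn.execute(
                "CREATE TABLE IF NOT EXISTS engines "
                "(name TEXT PRIMARY KEY, language TEXT)"
            )
            conn.execute(
                "INSERT OR REPLACE INTO engines VALUES (?, ?)", ("SQLite", "C")
            )
    finally:
        conn.close()

    # A fresh connection sees the data because it lives in the file itself.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT name, language FROM engines").fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        print(roundtrip(path))
        print("database file exists:", os.path.isfile(path))
```

Because the whole database is a single file, backing it up or moving it to another machine is a matter of copying that file.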
Titan is a scalable graph database designed for storing and querying graphs containing hundreds of billions of vertices and edges distributed across multi-machine clusters. It is a transactional database and can support thousands of concurrent users. Key features include data distribution and replication for performance and fault tolerance, multi-datacenter high availability and hot backups, and support for ACID and eventual consistency. Titan also offers support for various storage backends and global graph data analytics.
WebScaleSQL is a relational database management system created as a software branch of the production-ready community releases of MySQL. The open source software was created by a collaboration of engineers from several companies (Facebook, Google, LinkedIn, Twitter, Alibaba). Key features include an automated framework that runs and publishes the results of MySQL’s built-in test system, a suite of stress tests, and a prototype automated performance testing system.
If you’re looking for an enterprise-class database software solution, consult our freshly updated Data Management Buyer’s Guide.