The editors at Solutions Review have compiled this list of the best open-source data catalog tools to consider for your next project.
Searching for data cataloging software can be a daunting (and expensive) process, one that requires long hours of research and deep pockets. The most popular enterprise data catalog tools often provide more than what’s necessary for non-enterprise organizations, with advanced functionality that is relevant only to the most technically savvy users. Thankfully, there are a number of capable open-source data catalog tools available. Some of these solutions are offered by vendors looking to eventually sell you on their enterprise product, while others are maintained and operated by a community of developers looking to democratize the process.
In this article, we will examine the best open-source data catalog tools, first with a brief overview of what to expect and then with short blurbs about each of the currently available options in the space. This is the most complete and up-to-date directory on the web.
Developed by Lyft, Amundsen is an open-source data discovery and metadata engine for finding data and generating context that shows how it is being used. Depending on the use case, it can be used by analysts, data scientists, and data and software engineers. The product features a PageRank-inspired search algorithm that recommends results based on names, descriptions, tags, and querying/viewing activity on the table or dashboard. There’s also automated and curated metadata that describes tables and columns, other frequent users, when the table was last updated, preview data, and more.
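To make the idea of ranking by both text match and usage more concrete, here is a minimal sketch in Python. It is not Amundsen's actual algorithm or API: the `TableEntry` class, the field weights, and the `query_count` usage signal are all assumptions chosen for illustration. It only shows how a match on name, description, or tags might be boosted by how often a table is queried.

```python
import math
from dataclasses import dataclass, field


@dataclass
class TableEntry:
    """Hypothetical catalog record; not an Amundsen data model."""
    name: str
    description: str
    tags: list = field(default_factory=list)
    query_count: int = 0  # assumed usage signal (queries or views)


def score(entry: TableEntry, term: str) -> float:
    """Combine a simple text match with a usage boost (illustrative weights)."""
    term = term.lower()
    text_score = 0.0
    if term in entry.name.lower():
        text_score += 3.0  # name matches weigh the most
    if any(term == tag.lower() for tag in entry.tags):
        text_score += 2.0  # exact tag matches come next
    if term in entry.description.lower():
        text_score += 1.0  # description matches weigh the least
    if text_score == 0.0:
        return 0.0  # popular but irrelevant entries are never returned
    # Log-scaled popularity so heavily used tables rise without dominating.
    return text_score * (1.0 + math.log1p(entry.query_count))


def search(entries, term):
    """Return names of matching entries, highest score first."""
    scored = [(score(e, term), e) for e in entries]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [e.name for _, e in matches]


catalog = [
    TableEntry("rides_raw_backup", "Old backup of ride events", [], 2),
    TableEntry("rides_daily", "Daily ride totals", ["rides"], 500),
    TableEntry("drivers", "Driver profiles", ["core"], 900),
]
results = search(catalog, "ride")
```

In this sketch, two tables match the term equally well on text, so the one that is queried far more often is listed first, while the heavily used but unrelated table is left out entirely.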
Magda is a federated, open-source data catalog for cataloging, enrichment, searching, tracking, and prioritization. The tool lets users find useful data via data discovery features. Magda also offers metadata enhancement and authoring tools. It can quickly crawl external data sources, track changes, make automatic enhancements, and push notifications when changes occur. Magda touts an open architecture designed as a set of microservices, along with easy setup and upgrades.
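One common way to track changes across repeated crawls is to store a fingerprint of each dataset's metadata and compare it on the next pass. The sketch below illustrates that general technique only; it is not Magda's implementation, and the function names and record layout are hypothetical.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Hash a metadata record in a key-order-independent way."""
    canonical = json.dumps(record, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_changes(previous: dict, current: dict) -> dict:
    """Compare stored fingerprints against freshly crawled records.

    previous: dataset id -> fingerprint saved from the last crawl
    current:  dataset id -> metadata record from this crawl
    """
    changes = {"added": [], "modified": [], "removed": []}
    for dataset_id, record in current.items():
        if dataset_id not in previous:
            changes["added"].append(dataset_id)
        elif previous[dataset_id] != fingerprint(record):
            changes["modified"].append(dataset_id)
    changes["removed"] = [d for d in previous if d not in current]
    return changes


# Hypothetical crawl results from two consecutive runs.
last_crawl = {
    "ds-1": fingerprint({"title": "Rainfall", "format": "CSV"}),
    "ds-2": fingerprint({"title": "Roads", "format": "GeoJSON"}),
}
this_crawl = {
    "ds-1": {"format": "CSV", "title": "Rainfall"},       # same content, new key order
    "ds-2": {"title": "Roads v2", "format": "GeoJSON"},   # title changed
    "ds-3": {"title": "Schools", "format": "CSV"},        # newly discovered
}
changes = detect_changes(last_crawl, this_crawl)
```

A notification step could then be driven from the `changes` dictionary, for example by alerting subscribers of any dataset listed under `modified`.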