IT news and analysis outlet CRN recently released its 2021 (and ninth annual) Big Data 100, a ranking of prominent big data technology vendors that solution providers should be aware of. The list covers both established and emerging big data tools vendors and is broken down into five product categories: business analytics, database systems, data management and data integration software, big data platforms, and data science and machine learning tools.
CRN pre-published a list of The Coolest Data Management and Integration Tool Companies included in the overall list via an interactive slideshow. Though the Big Data 100 is aimed at highlighting software vendors for the purposes of solution provider partnering, Solutions Review is most interested in highlighting the vendors that offer unique products and platforms for enterprise organizations. As such, we’ve read through CRN’s complete rankings, available here, to analyze the trending data management companies we think matter most. For an even deeper breakdown of data management software, tools, vendors and platforms, consult our popular Buyer’s Guide.
Alation offers a platform for a broad range of data intelligence solutions including data search and discovery, data governance, data stewardship, analytics, and digital transformation. The product features a Behavioral Analysis Engine, inbuilt collaboration capabilities, and open interfaces. Alation also profiles data and monitors usage so that users have a reliable view of data accuracy. In addition, the platform provides insight into how users are creating and sharing information from raw data.
Ataccama offers an augmented data management platform that features data discovery and profiling, metadata management and a data catalog, data quality management, master and reference data management, and big data processing and integration. The product is fully integrated yet modular for any data, user, domain or deployment type. Ataccama also includes text analytics and machine learning, as well as data enrichment with external sources and data lake profiling.
The Collibra Platform is made up of five distinct solutions. Headlined by an enterprise data catalog called Collibra Catalog, the platform also includes Collibra Governance, Collibra Privacy & Risk, Collibra Lineage, and now Collibra Data Intelligence Cloud. Collibra documents an organization’s technical metadata and how it is used. It describes the structure of a piece of data, its relationship to other data, and its origin, format, and use. The solution serves as a searchable repository for users who need to understand how and where data is stored and how it can be used.
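To make the idea of a catalog entry concrete, here is a minimal sketch of a record that captures the structure, relationships, origin, format, and use of a data set, plus a simple keyword search over a small repository. It is purely illustrative: the `CatalogEntry` class, its field names, and the `search_catalog` function are hypothetical and do not reflect Collibra's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogEntry:
    """Hypothetical technical-metadata record for one data set."""
    name: str
    structure: Dict[str, str]          # column name -> data type
    origin: str                        # system the data comes from
    data_format: str                   # e.g. "table", "parquet", "csv"
    usage: str                         # free-text description of how it is used
    related_to: List[str] = field(default_factory=list)  # names of related data sets


def search_catalog(catalog: List[CatalogEntry], term: str) -> List[CatalogEntry]:
    """Return entries whose name, usage text, or column names contain the term."""
    needle = term.lower()
    return [
        entry for entry in catalog
        if needle in entry.name.lower()
        or needle in entry.usage.lower()
        or any(needle in column.lower() for column in entry.structure)
    ]


catalog = [
    CatalogEntry(
        name="customers",
        structure={"customer_id": "int", "email": "string"},
        origin="CRM export",
        data_format="table",
        usage="Marketing segmentation",
        related_to=["orders"],
    ),
    CatalogEntry(
        name="orders",
        structure={"order_id": "int", "customer_id": "int", "total": "decimal"},
        origin="E-commerce database",
        data_format="table",
        usage="Revenue reporting",
        related_to=["customers"],
    ),
]

matches = search_catalog(catalog, "email")
```

A production catalog would index far more than names and columns, but the shape of the record mirrors the attributes described above.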
Octopai is a centralized, cross-platform metadata management automation solution that enables data and analytics teams to discover and govern shared metadata. The product scans metadata by automatically gathering it from ETL tools, databases, and reporting tools. Metadata is stored and managed in a central repository, and a smart engine using hundreds of crawlers searches all metadata and presents results quickly. Octopai is best suited to use cases in business intelligence, governance, and data cataloging.
Tamr offers a machine learning-based data integration product called Unify. The solution allows organizations to connect to any tabular data and publish it anywhere. Users can map schemas with machine learning suggestions and normalize data formats using Spark and SQL. Tamr’s Master Records feature also provides a complete view of all entities via simple yes-or-no questions. The technology behind the company was originally developed by Dr. Michael Stonebraker and his colleagues, who published their research on the Data Tamer System for handling large-scale data curation in 2013.
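The general idea of suggested schema mappings can be sketched in a few lines. The example below is not Tamr's method: it swaps the machine learning model for plain string similarity from Python's standard `difflib` module, and the function names, column names, and the 0.6 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def normalize(name: str) -> str:
    """Lowercase a column name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def suggest_mappings(
    source_columns: List[str],
    target_columns: List[str],
    threshold: float = 0.6,  # assumed cutoff below which no suggestion is made
) -> Dict[str, Optional[str]]:
    """Suggest the most similar target column for each source column.

    Similarity is a simple character-based ratio, standing in for the
    learned model a real product would use.
    """
    suggestions: Dict[str, Optional[str]] = {}
    for src in source_columns:
        best_target, best_score = None, 0.0
        for tgt in target_columns:
            score = SequenceMatcher(None, normalize(src), normalize(tgt)).ratio()
            if score > best_score:
                best_target, best_score = tgt, score
        # Leave the column unmapped when nothing is similar enough.
        suggestions[src] = best_target if best_score >= threshold else None
    return suggestions


result = suggest_mappings(
    ["Cust_Name", "E-Mail", "zzz"],
    ["customer_name", "email", "phone"],
)
```

In a workflow like the one described, a person would then confirm or reject each suggestion, which is where yes-or-no feedback comes in.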
Zaloni Arena operationalizes data along the entire pipeline, from data source to consumer. The product automates repeatable data management tasks and processes and provides central management of all enterprise data sources whether on-prem, cloud, multi-cloud, or hybrid. Zaloni is compatible with all major Hadoop distributions, most data processing engines, and applicable deployment models.