The editors at Solutions Review highlight what’s changed since the last iteration of Gartner’s Magic Quadrant for Data Science and Machine Learning Platforms and provide an analysis of the new report.
Analyst house Gartner, Inc. has released its 2021 Magic Quadrant for Data Science and Machine Learning Platforms. The researcher defines a data science and machine learning platform as “a core product and supporting portfolio of coherently integrated products, components, libraries and frameworks (including proprietary, partner-sourced and open-source). Its primary users are data science professionals, including expert data scientists, citizen data scientists, data engineers, application developers and machine learning (ML) specialists.”
Data science and machine learning platforms can support tasks across the data science lifecycle, such as problem and business context understanding, data ingestion, data preparation, and data exploration. They can also include feature engineering, model creation and training, model testing, deployment, monitoring, maintenance, data and model governance, explainable AI, business value tracking, and collaboration tools. The expansion of capabilities among tools in this market makes it ‘more vibrant’ than ever, according to Gartner.
Buyers should be aware that this market offers a broad array of products for different requirements and preferences, typically dependent on the user persona. This Magic Quadrant is aimed especially at expert data scientists, citizen data scientists, supporting roles, LOB data science teams, and corporate data science teams. While the eye may begin analyzing this report in the Leaders quadrant, Gartner notes that it is best to evaluate your specific needs when assessing vendors. It adds: “A vendor in the Leaders quadrant, for example, might not be the best choice for you. Equally, a Niche Player might be the perfect choice.”
Data science and machine learning vendors remain focused on innovation and product differentiation, and the market is made up of established and thought-leading providers as well as startups with emerging value propositions. The proliferation of augmented analytics has drawn this market to a near collision with the horizontal Analytics and Business Intelligence space, adding another layer of complexity. Gartner concludes: “Traditionally these have been discrete markets with different buyers, but that situation is changing.”
While most vendors are aiming for a “sweet spot” with their platforms as it relates to adoption by expert and citizen data scientists, there are several key areas of differentiation. These include the user interface, augmented data science and machine learning, MLOps, performance and scalability, hybrid and multicloud support, and support for cutting-edge use cases and techniques.
Gartner adjusts its evaluation and inclusion criteria for Magic Quadrants as software markets evolve. As a result, Alibaba Cloud, Amazon Web Services (AWS), Cloudera, and Samsung SDS have been added to the 2021 report. No solution providers were removed from this edition.
In this Magic Quadrant, Gartner evaluates the strengths and weaknesses of 20 providers that it considers most significant in the marketplace, and provides readers with a graph (the Magic Quadrant) plotting the vendors based on their ability to execute and completeness of vision. The graph is divided into four quadrants: niche players, challengers, visionaries, and leaders. At Solutions Review, we read the report, available here, and pulled out the key takeaways.
SAS and IBM headline the Leaders quadrant for 2021. SAS remains the overall leader via its Visual Data Mining and Machine Learning product and thought leadership in composite AI, MLOps and decision intelligence. SAS touts an enterprise-grade solution and keen eye for market trends, as well as cloud-native architecture and integrations with popular open-source tools. IBM drastically improved its Completeness of Vision across Gartner’s horizontal axis to attain its 2021 position. IBM now offers a modern and complete product featuring multipersona support and a focus on responsible AI and governance.
Dataiku and TIBCO Software are clustered even closer together this year than last, though both remain major players. Dataiku is known to have an understanding of the citizen data scientist role, focuses on business value via performance metrics, and has what Gartner refers to as a “strong product roadmap” and vision. TIBCO Software offers a platform for Connected Intelligence and has a science and engineering focus. It has a leg up on the field when it comes to model deployment and production, and offers leading-edge functionality. TIBCO is also an excellent consideration for decentralized analytic professionals.
Databricks and MathWorks round out the Leaders quadrant for 2021, both maintaining positions in the bottom-third of the column. Databricks’ Unified Data Platform is available in multiple clouds and touts excellent scalability features. Its recent acquisition of Redash further expands the platform. Expert data scientists will find the most immediate value from the Databricks platform. MathWorks primarily services those in engineering and asset-centric organizations. The vendor offers advanced composite AI capabilities, deep domain expertise, and verifiable and reliable machine learning as well.
Alteryx is the only 2021 Challenger in this Magic Quadrant. The vendor is currently overhauling its focus, evident through the release of Alteryx Analytic Process Automation in May. It mainly serves clients in manufacturing, financial services, consumer packaged goods, retail, healthcare, and government. Alteryx touts ease of use for different user personas, a growing collection of partnerships, and strong customer sentiment regarding overall experience with the platform.
Microsoft, Google and AWS are competing for the top spot among cloud-based data science and machine learning vendors in the Visionaries column. Microsoft has the highest score for Ability to Execute in this class and offers capabilities for citizen and expert users. The mega-vendor also features strong MLOps functionality and security and governance for compute quota and cost management. Google currently offers its Cloud AI Platform but is soon set to release a unified product. Google touts thought leadership in ML research and responsible AI, and has recently made a major effort to reorganize its software release schedule.
Amazon’s data science and machine learning tools run through its SageMaker product line. AWS is natively integrated with many proprietary cloud data and analytics solutions as well. The mega-vendor offers perhaps the strongest performance and scalability of any product listed. Its data labeling and human-in-the-loop capabilities are also very popular. DataRobot retains its position between Microsoft and AWS just a stone’s throw away from the Leaders column. It features data science augmentation, prediction quality management via a Humble AI initiative, and high-touch customer service for providing business value.
KNIME, RapidMiner and H2O.ai form a cluster on the right-hand side of the Visionaries quadrant to round it out. The open-source KNIME Analytics Platform and commercial KNIME Server products bridge the gap between development and production so data scientists and end-users can better collaborate. KNIME touts more than 4,000 nodes for connecting to data, as well as a visual workflow that can be broken down into individual components. RapidMiner’s offerings reflect current trends like multipersona, collaboration, XAI, and model governance. New capabilities like FeatureMart and Feature Catalog are major value-adds.
H2O.ai earned the highest score for Completeness of Vision of any provider in this report. It is a thought leader in the automation and augmentation of data science and machine learning, especially for time-series analysis. H2O also offers several explainability capabilities for both modeling and feature engineering.
Alibaba Cloud, Cloudera, and Samsung SDS make their collective debuts as 2021 Niche Players. Alibaba is currently an Asia-only play for its retail, internet and data service customers. Cloudera Machine Learning is delivered as a service on top of the Cloudera Data Platform. The vendor mainly focuses on unifying machine learning workflows across data warehousing, data engineering, data science and machine learning, and operationalization. Cloudera’s strengths include the native use of Spark on Kubernetes, complex data processing, and metadata management for DataOps and MLOps. Like Alibaba Cloud, Samsung SDS is currently concentrated in the Asian market.
Domino offers end-to-end capabilities on-premises or in the cloud. It released Domino Model Monitor last June, which, according to Gartner, shows a “commitment to enterprise MLOps.” Domino is best suited to code-first data science teams. Anaconda’s recent innovations around model governance and scalability are noteworthy, while Anaconda Enterprise remains a trusted, flexible and recognized product. Anaconda enables the optimization of open-source tools as well. Altair is a strong consideration for service-centric industries, offers various simulation and high-performance computing tools, and touts excellent customer scores for deployment, service, and support.