Our editors curated this list of the biggest data science news items during the first half of 2021, as highlighted on Solutions Review.
Data science is one of the fastest-growing fields in America. Organizations are employing data scientists at a rapid rate to help them analyze increasingly large and complex data volumes. The proliferation of big data and the need to make sense of it all are driving that demand. As a result, new techniques, technologies, and theories are continually being developed to run advanced analysis, and they all require development and programming to ensure a path forward.
Part of Solutions Review’s ongoing analysis of the big data marketplace includes covering the data science news stories that have the greatest impact on enterprise technologists. This is a curated list of the most important data science news stories from the first half of 2021. For more on the space, including the newest product releases, funding rounds, and mergers and acquisitions, follow our popular news section.
The Biggest Data Science News Items During the First Half of 2021
Databricks raised $1 billion in Series G funding in response to the rapid adoption of its unified data platform, according to a press release. The capital injection, which follows a raise of $400 million in October 2019, puts Databricks at a $28 billion valuation. The round was led by new investor Franklin Templeton with participation from Amazon Web Services, CapitalG and Salesforce Ventures. The funding will enable Databricks to move ahead with additional product innovations and scale support for the lakehouse data architecture.
In a media statement, Databricks co-founder and CEO Ali Ghodsi said: “We see this investment and our continued rapid growth as further validation of our vision for a simple, open and unified data platform that can support all data-driven use cases, from BI to AI. Built on a modern lakehouse architecture in the cloud, Databricks helps organizations eliminate the cost and complexity that is inherent in legacy data architectures so that data teams can collaborate and innovate faster. This lakehouse paradigm is what’s fueling our growth, and it’s great to see how excited our investors are to be a part of it.”
OmniSci recently announced the launch of OmniSci Free, a full-featured version of its analytics platform available for use at no cost. OmniSci Free will enable users to utilize the full power of the OmniSci Analytics Platform, which includes OmniSciDB, OmniSci Render Engine, OmniSci Immerse, and the OmniSci Data Science Toolkit. The solution can be deployed on Linux-based servers and is generally adequate for datasets of up to 500 million records. Three concurrent users are permitted.
In a media statement on the news, OmniSci co-founder and CEO Todd Mostak said “Our mission from the beginning has been to make analytics instant, powerful, and effortless for everyone, and the launch of OmniSci Free is our latest step towards making our platform accessible to an even broader audience. While our open source database has delivered significant value to the community as an ultra-fast OLAP SQL engine, it has become increasingly clear that many use cases heavily benefit from access to the capabilities of our full platform, including its massively scalable visualization and data science capabilities.”
DataRobot recently announced the release of DataRobot 7, the latest version of its flagship AI and machine learning platform. The release is highlighted by MLOps Remote Model Challengers, which allow customers to challenge production models no matter where they are running and regardless of the framework or language in which they were built. DataRobot 7 also offers Choose Your Own Forecast Baseline, which lets users compare the output of their forecasting models with predictions from DataRobot Automated Time Series.
In a media statement, DataRobot SVP of Product Nenshad Bardoliwalla said “Through ongoing engagement with our customers, we’ve developed an intimate understanding of the challenges they face, as well as the opportunities they have, with AI. Our latest platform release has been specifically designed to help them seize the transformative power of AI and advance on their journeys to becoming AI-driven enterprises.”
Tableau announced the release of Tableau 2021.1, the latest version of the company’s flagship business intelligence and data analytics offering. The release is highlighted by the introduction of business science, a “new class of AI-powered analytics” that enables business users to take advantage of data science techniques. Business science is delivered via Einstein Discovery. Other key additions aim to simplify analytics at scale and expand the Tableau ecosystem to help different user personas understand their environment.
In a media statement about the news, Tableau Chief Product Officer Francois Ajenstat said “Data science has always been able to solve big problems but too often that power is limited to a few select people within an organization. To build truly data-driven organizations, we need to unlock the power of data for as many people as possible. Democratizing data science will help more people make smarter decisions faster.”
Dataiku recently announced the release of Dataiku 9, the latest version of the company’s flagship data science and machine learning platform. The release is highlighted by best practice guardrails to prevent common pitfalls, model assertions to capture and test known use cases, what-if analysis to interactively test model sensitivity, and a new model fairness report to augment existing bias detection methods when building responsible AI models. Dataiku raised $100 million in Series D funding last summer.
The release notes add “For business analysts engaged in data preparation tasks, the highly requested fuzzy join recipe makes it easy to join close-but-not-equal columns, an updated formula editor requires less time to learn, and updated date functions simplify time date preparation.” It also touts support for the Dash application framework.
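To make the fuzzy join idea concrete, here is a minimal Python sketch of joining two tables on close-but-not-equal keys. It is not Dataiku's implementation: the function names, the sample records, and the 0.8 similarity threshold are all hypothetical, and the matching relies on the standard library's difflib rather than anything in the Dataiku product.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def fuzzy_join(left, right, left_key, right_key, threshold=0.8):
    """Join each left row to its single best-matching right row.

    Rows whose best score falls below `threshold` are dropped, which
    mirrors an inner join. The threshold is an illustrative assumption.
    """
    joined = []
    for left_row in left:
        best_row, best_score = None, 0.0
        for right_row in right:
            score = similarity(left_row[left_key], right_row[right_key])
            if score > best_score:
                best_row, best_score = right_row, score
        if best_row is not None and best_score >= threshold:
            joined.append({**left_row, **best_row, "match_score": best_score})
    return joined


# Hypothetical sample data: the same companies spelled slightly differently.
customers = [
    {"customer": "Acme Corp."},
    {"customer": "Globex Inc"},
    {"customer": "Initech"},
]
accounts = [
    {"account": "ACME Corp", "region": "East"},
    {"account": "Globex, Inc.", "region": "West"},
]

matches = fuzzy_join(customers, accounts, "customer", "account")
for row in matches:
    print(row["customer"], "->", row["account"], row["region"])
```

In this sketch, "Initech" has no sufficiently similar account name, so it is left out of the result. A lower threshold would accept looser matches at the cost of more false joins.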
Domino Data Lab recently announced a series of new integrated solutions and product enhancements with NVIDIA, according to a press release. The technologies were unveiled at the NVIDIA GTC Conference. The announcement is highlighted by Domino’s availability for the NetApp ONTAP AI Integrated Solution, which upgrades data science productivity with software that streamlines the workflow while maximizing infrastructure utilization. Domino has been tested and validated to run on the packaged offering and is available via the NVIDIA Partner Network.
The new platform automatically creates and manages multi-node clusters and releases them when training is done. Domino currently supports ephemeral clusters using Apache Spark and Ray, and will be adding support for Dask in a product release later in the year. With Domino’s support for NVIDIA Multi-Instance GPU (MIG), administrators can also divide a single NVIDIA DGX A100 GPU into multiple instances or partitions to serve a variety of users. According to the announcement, this allows “7x the number of data scientists to run a Jupyter notebook attached to a single GPU versus without MIG.”
Explorium recently announced that it has secured $75 million in Series C funding, according to a press release on the company’s website. The funding is Explorium’s second round in the last nine months and brings the company’s total capital raised to more than $125 million since its founding in 2017. Explorium doubled its customer base during the last 16 months.
In a media statement on the news, Explorium CEO Maor Shlomo said “As we saw last year, machine learning models and tools for advanced analytics are only as good as the data behind them. And often that data is not sufficient. We’re addressing a business-critical need, guiding data scientists and business leaders to the signals that will help them make better predictions and achieve better business outcomes.”
Alteryx recently announced enhancements across its line of data science and analytics tools, as well as the release of Alteryx Machine Learning. The company broke the news at Alteryx Inspire Virtual, its annual user conference. Currently available in early access, Alteryx Machine Learning provides guided, explainable, and fully automated machine learning (AutoML). Key features include feature engineering and deep feature synthesis, automated insight generation, and an Education Mode that offers data science best practices.
In a media statement on the news, Alteryx Chief Product Officer Suresh Vittal said: “We are investing deeply in analytics and data science automation in the cloud, starting with Designer Cloud, Alteryx Machine Learning and AI introduced today. We remain focused on being the best at democratizing analytics so millions of people can leverage the power of data.”