
Sneak Peek: Databricks Data+AI Summit 2024
If you are familiar with my research and writing over the last ten years (at TDWI and Gartner), you know that much of my recent work involves data lakes and lakehouses, plus closely related best practices in data and analytics (D&A), such as data architecture and artificial intelligence (AI). If you’re looking for a conference that drills deeply into these topics, then you should look into the Databricks Data+AI Summit (DAIS), to be held at the Moscone Center in San Francisco, June 10-13, 2024. I’ve attended this summit a few times, and it is always educational and fun.
Why attend Databricks Data+AI Summit 2024 (DAIS)?
DAIS is essentially a user conference, produced by Databricks, one of the leading software providers for use cases in D&A. Hence, the conference is a no-brainer for Databricks’ current customers who need to maintain and improve their Databricks-based D&A solutions.
However, DAIS is more than just a user conference. The keynotes, user presentations, demos, and hands-on training courses at DAIS reach beyond Databricks products and services, to deliver educational and inspirational experiences for a wide range of general best practices and field trends in D&A. Hence, all technical data professionals and business people who depend on D&A can learn a lot – whether Databricks customers or not. Furthermore, attending DAIS can help organizations make an enlightened decision about whether to acquire Databricks products.
What can you learn at Databricks Data+AI Summit 2024?
Without the Right Data, AI Cannot Flourish
The reality and challenge facing D&A professionals today is that building a comprehensive and insightful analytic model depends on access to complete, high-quality data in great quantity. In fact, data and AI are so intrinsically linked that organizations desiring to become “AI-driven” now need a single architected platform that tightly integrates two broad classes of functionality:
- Data management functions – from collecting enterprise data to managing it in storage, via modern data governance, pipelining, orchestration, lineage, quality, and privacy
- Analytic functions – including data exploration, model design, machine learning, and agile CI/CD cycles to produce evolving LLMs
Databricks as a Data Intelligence Platform
For years, Databricks has been bringing data management and advanced analytic processing together, by integrating several open-source and proprietary technologies, including Apache Spark, Delta Lake, MLflow, Python, Presto, Trino, DBRX, and more, available via serverless cloud services. These have recently coalesced into Databricks’ concept of the Data Intelligence Platform, and you should expect DAIS to return to this concept repeatedly, along with the many components it requires.
This newly conceived Data Intelligence Platform is essentially a melding of:
- Databricks Lakehouse – a cornucopia of data management functionality
- DatabricksIQ – based on Databricks’ acquisition of the analytic tool MosaicML
The architecture of the resulting Data Intelligence Platform is unified by (among other things):
- Single governance layer – all data and AI functionality, across full data lifecycle
- Single query engine – satisfying ETL, SQL, ML, and BI requirements
A Reference Architecture for the Data Intelligence Platform
Source: https://www.databricks.com/product/data-intelligence-platform
Managing Data at Scale in an AI World
We have all seen the recent ascension of artificial intelligence (AI) – whether generative, predictive, or otherwise – such that AI is now much more usable, functional, and critical to business success than in prior generations. Hence, you should expect DAIS to hit AI hard from a data intelligence platform perspective, along with related techniques and products for machine learning (ML), large language models (LLMs), and retrieval-augmented generation (RAG).
Anyone leaving DAIS (or any vendor-sponsored data conference) should take home the vendor’s current and future strategies in two key areas where AI and data management are increasingly melding:
- The data intelligence platform’s ability to deliver AI-ready data, at scale – In other words, a data management platform must continue to deliver data that’s appropriate to business requirements and technical standards on a per use case basis. Yet, data platforms are now under pressure to deliver data and analytic processing at the scale, performance, source diversity, and multi-model tolerance that use cases in AI demand.
- The AI-driven functionality embedded within tools for data management and analytics – The point of extending D&A tools this way is to provide greater automation and intelligence, which can lift the productivity of data engineers and data scientists, plus give production systems smarter monitoring, automatic data remediation, and quicker reactions. Given the shrinking time span allotted for the design, development, and deployment of analytic solutions, AI-driven automation is becoming a survival strategy for CI/CD cycles in analytics.
A Cornucopia of Compelling Conference Content
DAIS is jam-packed, with over 500 sessions – yikes! This includes a giant, diverse collection of keynotes, presentations, user case studies, product demos, speaker panels, roundtables, vertical industry sessions, training courses, certification, social events, and networking opportunities.
For example, here’s a sampling of topics that DAIS sessions will cover. (Warning: This is a very long list!)
- Delta Lake Architectures and Best Practices
- Data Quality in Data Lakehouses
- Apache Parquet for Lakehouses
- Data Mesh for Data Lakes
- Cost Optimization for Data Lakehouses
- What is a Lakehouse?
- Data Engineering Methods for D&A
- Data prep for AI, GenAI, and other advanced forms of analytics
- MLOps and LLMOps
- Attaining AI/ML Maturity
- AutoML and MLflow
- Women in Data
- Vector Data and Search
- Managing and Analyzing Hybrid Data
- LLMs and Segment Anything Models
- LLMs and Hallucination Risk
- Retrieval-Augmented Generation (RAG)
- Unity Catalog and Data Governance
- Geospatial Data on Databricks
- Data and Applications Migrations to Databricks
- Business Value and ROI for AI Applications
- Data Pipelines and Streams on Databricks
- Data Warehousing with Databricks SQL
One of the compelling traits of DAIS is that it includes literally dozens of case study presentations given by actual Databricks-using organizations. This differs from many other “user conferences” where (ironically) there aren’t many user speakers. Here is a sampling of user organizations participating in case study sessions: (Again, the list is long!)
- General Motors, Lexmark, Sight Machine, GrabTaxi, JetBlue, Idaho National Laboratory, Corning, Unilever, CVS Health, Kroger, Northern Trust, Western Governors University, Fox, Kwik Trip, Ordnance Survey, VisitBritain, AccuWeather, Workday, Trek Bicycle, Royal Caribbean Group, Instacart, JPMorgan Chase, Northwestern Mutual, DoorDash, Shell, BlackBerry, Albertsons, AT&T, US Department of State, Bayer AG, Rolls-Royce, Chevron Phillips, Bloomberg, Rivian, Michelin, Starbucks, T-Mobile, Nike, Comcast, Bridgestone, Boeing, Capital One, McDonald’s, DraftKings, etc.
Another admirable trait of DAIS is the inclusion of several speaker panels and other sessions devoted to vertical industries, including: Public Sector; Energy; Financial Services; Manufacturing and Transportation; Retail and Consumer Goods; etc.
Finally, DAIS is organized into eight or so tracks, to help attendees find sessions and topics that suit their needs and interests. Tracks include:
- Data Lakehouse Architecture
- Data Engineering and Streaming
- Data Governance
- Generative AI
- Data Science and Machine Learning
- Data Strategy and Lakehouse Implementation
- Data Warehousing, Analytics, and BI
Conclusion and Recommendations
Again, there are 500+ sessions at the Databricks Data+AI Summit 2024 (DAIS), and they are distributed over four days. This means that no one can attend them all. Hence, you should get organized and create a list of “favorites” before arriving at DAIS. That way, you won’t miss sessions relevant to you.
You can peruse the gigantic agenda for DAIS 2024 online at: https://www.databricks.com/dataaisummit/agenda
And you can register for DAIS 2024 here: https://dataaisummit.databricks.com/flow/db/dais2024/landing/page/home
You can also download the DAIS 2024 app for Android or Apple phones from the usual app stores.
Finally, there is still time to register for DAIS 2024, coming up at the Moscone Center in San Francisco June 10-13, 2024. I encourage you to register online ASAP. And I hope to see you there!