Databricks has announced Databricks Ingest and a new Data Ingestion Network of partners, according to a press release on the company’s website. The feature and partner program are aimed at bringing data teams closer to a new data management paradigm that Databricks calls the lakehouse. A lakehouse, according to material on the Databricks blog, “combines the best elements of data lakes and data warehouses” to enable business intelligence and machine learning on all of an organization’s data.
Through this new framework, Databricks customers can now load data into the open-source Delta Lake through a network of partners that includes providers like Fivetran, Qlik, Infoworks, StreamSets, and Syncsort. These partners offer built-in integrations with Databricks Ingest for automated data loading. Azure Databricks customers are already using Databricks Ingest through its native integration with Azure Data Factory to ingest data from an array of sources.
Data teams can load data from a number of sources and applications like Salesforce, Marketo, Zendesk, SAP, and Google Analytics, as well as databases like Cassandra, Oracle, MySQL, and MongoDB. Supported file storage providers include Amazon S3, Azure Data Lake Storage, and Google Cloud Storage. Beyond the partners made public with this release, integrations with Informatica, Segment, and Talend will follow soon.
Auto-loading capabilities allow data to flow continuously into Delta Lake without the need to set up and maintain job triggers or schedules. As data from different sources appears in cloud storage, Databricks Ingest automatically pulls the new data into Delta Lake.
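The auto-loading idea can be illustrated with a small conceptual sketch in plain Python. It is not Databricks code and does not use any Databricks or Delta Lake API: the function `ingest_new_files`, the local `landing` directory standing in for cloud storage, the JSON checkpoint file, and the in-memory list standing in for a table are all assumptions made for illustration. It shows only the general pattern of remembering which files have been loaded so that each pass picks up just the new arrivals.

```python
import json
import tempfile
from pathlib import Path


def ingest_new_files(landing_dir: Path, checkpoint: Path, table: list) -> list:
    """Append rows from files not yet recorded in the checkpoint.

    Returns the names of the files loaded on this pass.
    All names here are hypothetical; this is not a Databricks API.
    """
    # Load the set of file names that earlier passes already ingested.
    seen = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()

    # Discover files that have appeared since the last pass.
    new_files = sorted(p for p in landing_dir.glob("*.json") if p.name not in seen)

    # "Load" each new file by appending its rows to the in-memory table.
    for path in new_files:
        table.extend(json.loads(path.read_text()))

    # Record progress so the same files are not loaded twice.
    seen.update(p.name for p in new_files)
    checkpoint.write_text(json.dumps(sorted(seen)))
    return [p.name for p in new_files]


def demo() -> tuple:
    """Simulate two files arriving over time, followed by an idle pass."""
    with tempfile.TemporaryDirectory() as tmp:
        landing = Path(tmp) / "landing"
        landing.mkdir()
        checkpoint = Path(tmp) / "checkpoint.json"
        table = []

        (landing / "a.json").write_text(json.dumps([{"id": 1}]))
        first = ingest_new_files(landing, checkpoint, table)

        (landing / "b.json").write_text(json.dumps([{"id": 2}]))
        second = ingest_new_files(landing, checkpoint, table)

        # Nothing new has arrived, so this pass loads nothing.
        third = ingest_new_files(landing, checkpoint, table)
        return first, second, third, table
```

In this sketch the caller still has to invoke `ingest_new_files` on each pass; the point of a managed auto-loading service, as described above, is that discovery and triggering are handled for the user.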
In a media statement, Databricks co-founder and CEO Ali Ghodsi said of the news: “The Lakehouse paradigm aspires to combine the reliability of data warehouses with the scale of data lakes to support every kind of use case. In order for this architecture to work well, it needs to be easy for every type of data to be pulled in. Databricks Ingest is an important step in making that possible.”
For more, read “What is a Lakehouse?” on the Databricks blog.