If you work in data science or IT, you’re probably already familiar with Apache Spark. Spark adoption grew rapidly in 2015, and in some use cases it has matched or even surpassed Hadoop as the open source Big Data framework of choice. Vendors are hopping on board as well: Talend, Altiscale and Pentaho have all added Spark support to their integration platforms in recent months.
With all of the highly technical chatter out there, it can be hard to understand what Spark can actually help your organization do. Thankfully, there’s LinkedIn’s SlideShare, a resource where users and companies can host webinars and presentations for public access. We combed through thousands of presentations on the site tagged with the Spark keyword and found a series of eight created by Databricks, the company founded by Spark’s original creators.
The slideshows, all presented by Databricks at Spark Summit EU 2015 in late October, cover a range of Spark topics, as you’ll see below:
The evolution of Spark: where is it being used, for what purpose, and by whom?
A technical overview of Spark’s DataFrame API, including its implementation:
An inside look at Spark’s development, both frontend and backend:
Databricks outlines emerging trends, common issues, and solutions:
How do users integrate common data science tools, like Python, with Spark?
What have users learned in migrating from data warehouses to Spark?
Databricks’ CEO discusses the impact Spark has had in the enterprise:
How do Spark clusters and R facilitate analysis of Big Data?
There you have it! A nice selection of Spark presentations to help you cut through all of the other information out there on the web. For more on Spark, stay tuned to Solutions Review.