Pentaho Takes Big Data Lead with Apache Spark Integration


Pentaho yesterday announced native integration of Pentaho Data Integration with Apache Spark, which allows users to create Spark jobs. Initiated and developed by Pentaho Labs, the integration is intended to increase productivity, reduce costs, and lower the skill level required as Spark is incorporated into new big data projects.

Spark is a powerful open-source processing engine built for speed, ease of use, and machine learning. Engineered for performance, it is a next-generation big data technology used to store, blend, and govern data at new levels of speed, scalability, and simplicity. Pentaho was able to innovate early with this emerging technology because its own platform was built on contemporary open source foundations.

Big data technologies are evolving at an almost immeasurable speed, and the folks at Pentaho Labs continue to leverage and drive innovation in integration and analytics to give users advanced big data deployments with little risk. The company adds: “Integration with Spark follows other labs efforts that have led to support for YARN and the Adaptive Big Data Layer. Following the native support of YARN alone, enterprise customers like RichRelevance, edo Interactive, and MultiPlan have been able to innovate and drive greater value from Hadoop.”

James Dixon, Chief Technology Officer at Pentaho, notes: “For two years, we experimented with possible use cases based on our big data blueprints and sizing the enterprise market opportunity for Spark. Our customers now benefit from that work with simplified, real-time analytic capabilities. Our open-source heritage and modern, extensible platform allows us to quickly evolve our capabilities, keeping our customers' big data technology options open, reducing risk and saving considerable development time while taking advantage of the latest innovations in popular big data stores.”

Pentaho Data Integration for Apache Spark is available now in Pentaho Labs, and will be generally available in June of this year. A webinar and an official press release accompany the announcement.

 

Timothy King

Editor, Data and Analytics at Solutions Review
Timothy leads Solutions Review's Business Intelligence, Data Integration and Data Management areas of focus. He is recognized as one of the top authorities in Big Data, and the number-one authority in enterprise middleware. Timothy has also been named one of the world's top-75 most influential business journalists by Richtopia.