Apache Software Foundation Announces Arrow, a Top-Level Project

By Tim King, Executive Editor at Solutions Review
Data Integration News

In a recent press release, the Apache Software Foundation announced a new top-level project: Apache Arrow. According to the foundation, Arrow is a high-performance cross-system data layer for columnar in-memory analytics. Arrow is expected to accelerate analytical workloads, in some cases by more than 100 times. In addition, the Big Data tool will enable multi-system workloads by eliminating cross-system communication overhead.

Arrow was initially seeded by code from another project, Apache Drill. It also builds on a number of open source collaborations and establishes a de facto standard for columnar in-memory processing and interchange. Code committers to Apache Arrow include developers from a variety of other Big Data projects, including Calcite, Cassandra, Drill, Hadoop, HBase, Impala, Phoenix, Spark and others.

Jacques Nadeau, Vice President of Apache Arrow and Vice President of Apache Drill, adds: “The Open Source community has joined forces on Apache Arrow. Developers from 13 major Open Source Big Data projects are already on board – by introducing a new era of columnar in-memory analytics, we anticipate the majority of the world’s data will be processed through Arrow within the next few years.”

In many workloads, 70 to 80 percent of CPU cycles are spent serializing and deserializing data. Apache Arrow addresses this problem by enabling data to be shared between systems and processes with no serialization, deserialization or memory copies. Arrow also supports complex data with dynamic schemas. One example is JSON data, which is commonly used in IoT workloads, modern applications and log files. Implementations are also available for a number of programming languages, including Java, C++ and Python, to allow greater interoperability among Big Data solutions.
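To make the contrast between serializing data and sharing it in place more concrete, here is a minimal sketch in Python that uses only the standard library. It is not Arrow's API or memory format; the sample records and the names `rows`, `temp_col`, `shared_temp` and `mean` are hypothetical and chosen purely for illustration. It shows row-oriented records being handed off through a JSON round trip, then the same values stored as typed columns and handed off as a view over the same memory.

```python
import json
from array import array

# Hypothetical row-oriented records, as they might arrive from a JSON log.
rows = [
    {"sensor": 1, "temp": 20.5},
    {"sensor": 2, "temp": 21.0},
    {"sensor": 3, "temp": 19.5},
]

# Hand-off by serialization: the sender encodes every value to text and
# the receiver parses the text and rebuilds every object.
payload = json.dumps(rows)
received = json.loads(payload)

# Columnar layout: one contiguous, typed buffer per field.
sensor_col = array("q", (r["sensor"] for r in rows))  # signed 64-bit integers
temp_col = array("d", (r["temp"] for r in rows))      # 64-bit floats

# Hand-off by sharing: a memoryview references the column's memory
# directly, so no bytes are encoded, parsed or copied.
shared_temp = memoryview(temp_col)


def mean(column):
    """Average a column; accepts an array or a memoryview over one."""
    return sum(column) / len(column)


# A write through the original column is visible through the view,
# which confirms that both names refer to the same underlying buffer.
temp_col[0] = 22.5
print(shared_temp[0], mean(shared_temp))
```

The view here is shared only within a single process. Sharing across processes and systems, as the article describes, additionally requires an agreed-upon memory layout, which is the role Arrow is intended to play.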

Parth Chandra, member of the Apache Arrow and Apache Drill Project Management Committees, notes: “Real-world use cases often include complex combinations of structured and rapidly growing complex data. Already tested with Apache Drill, the efficient in-memory columnar representation and processing in Arrow will enable users to enjoy the performance of columnar processing with the flexibility of JSON.”

You can see Apache Arrow in action at this year’s Strata + Hadoop World in San Jose, California, in March.

For Apache’s full press release, click here.


This article was written by Tim King on February 24, 2016

Tim is Solutions Review's Executive Editor covering the human impact of AI on the future of work and learning. He is also the Media Strategist behind Insight Jam (1M+ on YouTube) events and programming. A 2017 and 2018 Most Influential Business Journalist and 2021 "Who's Who" in multiple categories, Tim is a recognized thought leader in enterprise tech and AI.
