Bitwise, a data management consulting and services company, recently announced Hydrograph, its next-generation data integration tool, to address the need for ETL-like functionality on Hadoop in enterprises with Big Data workloads. The announcement was made at the Strata + Hadoop World event in New York City. The tool was built by practitioners who understand the current pain points of ETL on the Hadoop platform, and its main objective is to accelerate ETL development in the Big Data ecosystem.
Shahab Kamal, Executive Vice President of Client Management and Solution Engineering at Bitwise, described the new release: “With its massively parallel processing, Hadoop once held promise for transforming the ETL process on large datasets – but moving ETLs to Hadoop still requires heavy MapReduce coding, which defeats the purpose. In order to take advantage of the processing power of Hadoop, Hydrograph uses Cascading to generate MapReduce jobs that are converted from XMLs developed using a desktop based GUI by connecting the components with links.”
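The flow Kamal describes, in which a job is drawn as components joined by links, saved as XML, and then translated into an execution plan, can be pictured with a small sketch. The Python below is only an illustration under stated assumptions: the `job`, `component`, and `link` elements and their attributes are hypothetical and are not Hydrograph's actual XML schema, and the code does not call Cascading or Hadoop. It shows just one step that such a translation would plausibly need, which is deriving a run order for the components from the links between them.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict, deque

# Hypothetical job definition. The element and attribute names are
# illustrative only and are NOT Hydrograph's real XML schema.
JOB_XML = """
<job name="customer_load">
  <component id="read_customers" type="input"/>
  <component id="filter_active" type="filter"/>
  <component id="write_output" type="output"/>
  <link from="read_customers" to="filter_active"/>
  <link from="filter_active" to="write_output"/>
</job>
"""


def execution_order(job_xml):
    """Return component ids in an order that respects every link.

    Uses Kahn's algorithm for topological sorting: a component becomes
    ready once all of its upstream components have been scheduled.
    """
    root = ET.fromstring(job_xml)
    ids = [c.get("id") for c in root.findall("component")]

    downstream = defaultdict(list)            # component -> components fed by it
    pending_inputs = {cid: 0 for cid in ids}  # component -> unscheduled upstream count

    for link in root.findall("link"):
        src, dst = link.get("from"), link.get("to")
        if src not in pending_inputs or dst not in pending_inputs:
            raise ValueError(f"link references unknown component: {src} -> {dst}")
        downstream[src].append(dst)
        pending_inputs[dst] += 1

    # Start with components that have no incoming links (the sources).
    ready = deque(cid for cid in ids if pending_inputs[cid] == 0)
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for nxt in downstream[current]:
            pending_inputs[nxt] -= 1
            if pending_inputs[nxt] == 0:
                ready.append(nxt)

    # Any component left unscheduled is part of a cycle.
    if len(order) != len(ids):
        raise ValueError("links form a cycle; no valid execution order")
    return order


if __name__ == "__main__":
    print(execution_order(JOB_XML))
```

In a real tool, each ordered component would then be mapped to a stage of the underlying processing framework, which in Hydrograph's case is Cascading according to the quote above. The sketch stops at ordering and makes no claim about how Hydrograph performs that mapping.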
Bitwise partnered with business users to design Hydrograph to help them move from a traditional ETL development system to a more modern open source environment. The Bitwise team used Cascading as the foundation and built abstraction layers to simulate an ETL job, proving that an ETL-like tool on Hadoop was feasible. The result is a user interface with a look and feel similar to that of leading data integration solutions, but designed expressly for use within Big Data frameworks.
Bitwise is currently working on advanced debugging interfaces, execution-tracking visuals, and support for additional data sources and links. According to the company, Hydrograph will be available as an open source project on GitHub later in the year.