The following is an excerpt from Solutions Review’s 2017 Data Integration Buyer’s Guide.
Data Integration is a set of technical and business processes used to combine data from disparate sources and turn it into valuable business insight. It generally supports analytical processing by aligning, combining, and presenting data from each store to an end-user, and is usually executed in a data warehouse via integration software. ETL (extract, transform, load) is the most common form in practice, but other techniques, including replication and virtualization, can also be used.
To help you evaluate prospective integration software tools, these are the five questions we recommend asking yourself before making a choice. If you find these questions helpful, check out the 2017 Data Integration Buyer’s Guide, which features five more questions for the providers, and full-page profiles of the top solutions for the enterprise.
1. Why is a Data Integration tool necessary for my organization?
Will you require real-time data access and transfer? How much data will you need to move and how quickly? Can you afford some downtime on source/target systems, or do you need them running at all times? Base the answers on your technical and business needs, and write them down so that you can compare them with what specific vendors have to offer.
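A back-of-the-envelope calculation can make the "how much and how quickly" question concrete. The sketch below is only illustrative: the function name `transfer_hours`, the 70% usable-throughput default, and the sample figures are assumptions for the example, not numbers from the guide.

```python
def transfer_hours(data_gb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move `data_gb` gigabytes over a network link.

    `link_gbps` is the nominal link speed in gigabits per second.
    `efficiency` is the assumed fraction of that speed actually usable
    (a hypothetical default; measure your own environment).
    """
    if data_gb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_gb must be >= 0, link_gbps > 0, 0 < efficiency <= 1")
    gigabits = data_gb * 8                       # bytes to bits
    seconds = gigabits / (link_gbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Assumed scenario: 500 GB nightly load over a 1 Gbps link.
    print(f"{transfer_hours(500, 1.0):.1f} hours")
```

If the estimate does not fit inside the downtime you can afford, that points toward incremental or real-time approaches rather than bulk batch loads.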
2. What kind of data do I need to analyze?
Is the majority of your company’s data transactional in nature? Is it all structured? If so, a traditional or “legacy” tool may be right for you. If the bulk of your data streams to your storage framework in real time via embedded product sensors, social media, computer applications, and online customer feedback, then a solution that can integrate with the likes of Hadoop, Spark, and NoSQL is going to be the best fit for you. Be sure to take into account the types of data that run through your business and then match them with the appropriate provider.
3. What are my data sources and where are they located?
The basic elements of Data Integration revolve around moving data from sources (applications) to targets (data warehouses, Hadoop, etc.). Much of what is powering the Big Data movement is the massive volume of data being collected in the cloud through very large Software as a Service (SaaS) solutions like Salesforce.com. Some solutions listed in this guide specialize in integrating cloud application data with on-premise systems to ensure that your users can access complete, current, and accurate data. Make sure the vendors you are considering offer the kind of integration you require.
4. Cloud, on-premise, or both?
A hybrid approach to Data Integration is a growing trend in the enterprise market because it gives organizations the ability to execute integration in both on-premise and cloud environments. Organizations can move data to and from either environment, which adds agility to their integration infrastructure, helps them manage cloud data delivery, and addresses the need to share data between cloud environments. On-premise integration is certainly not dead, but a hybrid approach will set your organization up nicely for the future, even if cloud exposure is currently limited.
5. Are data quality and Master Data Management features a consideration?
It’s one thing to use an ETL tool to move data from one storage medium to another. It’s another thing entirely to replicate a data source for use elsewhere. Most importantly, though, you’ll want to make sure that the appropriate data gets moved to the right place so it can be analyzed in a fashion that will yield actionable insights. The most comprehensive Big Data solutions today include data quality and Master Data Management (MDM) capabilities in one form or another. With MDM as the umbrella, data governance and management tools are going to be essential to you if you capture data from a wide variety of sources.
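To illustrate why data quality matters during a move, here is a minimal sketch of an ETL step that checks records before loading them. Everything in it is an assumption made for the example: the field names (`customer_id`, `email`), the sample rows, the validation rules, and the use of an in-memory SQLite database as the target. It is a tiny stand-in for a data quality check, not a model of how any vendor's data quality or MDM product works.

```python
import sqlite3


def transform(rows):
    """Split source rows into clean records and rejects.

    A row is rejected if it has no customer id, has an email without "@",
    or repeats a customer id that was already accepted.
    """
    clean, rejected, seen = [], [], set()
    for row in rows:
        cid = (row.get("customer_id") or "").strip()
        email = (row.get("email") or "").strip().lower()
        if not cid or "@" not in email or cid in seen:
            rejected.append(row)
            continue
        seen.add(cid)
        clean.append((cid, email))
    return clean, rejected


def load(conn, records):
    """Load clean (customer_id, email) tuples into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer_id TEXT PRIMARY KEY, email TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?)", records)
    conn.commit()


# Hypothetical source data standing in for an application export.
source_rows = [
    {"customer_id": "C1", "email": " Ann@Example.com "},
    {"customer_id": "C1", "email": "ann@example.com"},   # duplicate id
    {"customer_id": "", "email": "bob@example.com"},     # missing id
    {"customer_id": "C2", "email": "not-an-email"},      # malformed email
    {"customer_id": "C3", "email": "cy@example.com"},
]

clean, rejected = transform(source_rows)
conn = sqlite3.connect(":memory:")  # in-memory target, for illustration only
load(conn, clean)
print(f"loaded {len(clean)} rows, rejected {len(rejected)}")
```

Keeping the rejects instead of silently dropping them gives whoever governs the data something to review, which is the kind of workflow dedicated data quality and MDM tools formalize at scale.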