Four Key Steps to Selecting Data Preparation Tools

By Jon Pilkington

Navigating today’s big data stores, data warehouses and data lakes to find and access the right information for business analytics projects, and then preparing and exporting it into one streamlined spreadsheet for analysis, is still a daunting task for many organizations. Far too many business users and data analysts spend countless hours manually extracting, blending, reconciling and preparing data from disparate systems. These manual processes are not only time-consuming and painstaking; they also tax productivity, are susceptible to human error, and take time away from what delivers the most organizational value – analysis that drives better business decisions.

Additionally, in many instances, despite users’ best efforts, valuable data is still locked in unstructured sources, or closely guarded by IT and analytics gatekeepers – resulting in critical decision-making based on incomplete or outdated information. And the culmination of these data access and preparation challenges can transform even the most common analytics project into a major headache.

Self-service data preparation

The good news is that self-service data preparation tools enable business users and analysts to overcome traditional data challenges by simplifying and automating data acquisition, cleansing and blending processes. In other words, the technology empowers all users to easily and rapidly access, manipulate, enrich, blend and reconcile disparate data from virtually any source – structured or unstructured. Data can then be prepared for analysis in a fraction of the time that it takes using spreadsheets and other manually intensive measures.

In terms of ROI, this means that users gain fast access to not only the right data, but all of the data that is crucial to getting a holistic view of the business, and they can spend the majority of their time on analysis that results in more accurate insights and better business decisions. And, while this technology gives users the autonomy they require to do their jobs, it also satisfies IT’s need for security and compliance by incorporating governance capabilities, such as data masking, data retention, data lineage and role-based permissions.

This all sounds great so far, right? But, finding the right self-service data preparation technology for your business is no easy task. Not to worry though! Following is a four-step guide to help you choose the solution best suited for your data and analytics needs.

Step 1: Assess the state of operational and analytical processes

By identifying your current manual processes for data access and preparation, you can pinpoint labor-intensive, redundant and inefficient tasks, along with areas of risk due to the potential for human error. To complete the assessment phase of the buying process, consider asking yourself questions such as:

  • Are you or your team creating reports, dashboards or other visualizations?
  • Are you searching for, collecting, blending or reconciling data?
  • What do these processes entail? How often are they done? And, how long do they take?
  • Is data readily available for each task? If data must be requested, how long does it take to receive? Is manual entry required?
  • What systems are accessed?
  • Are these processes centralized, or are different departments performing similar tasks in silos?
  • What are your current limitations to data preparation and analysis?

Once you have a holistic view of your current data preparation and analytics processes, along with their accompanying limitations and challenges, the next step is to define goals for improvement.

Step 2: Determine what’s needed

Determining clear goals and objectives for data preparation and analysis within your organization will help guide your choice of products, services and vendors. Goals could be stated in terms of system capabilities and features, or in terms of benefits your company expects to derive from its investment. For example, you may identify with some of the following common goals: improve analyst productivity, reduce time to complete processes, expedite IT responsiveness, enhance data quality and trust, eliminate manual processes, improve compliance and data governance, automate or reuse common processes, and enable more business analysts to produce data-driven information.

Step 3: Evaluate costs and return on investment (ROI)

It should come as no surprise that ROI is one of the most important factors to consider when investing in new technology solutions. To ensure you are getting the maximum bang for your buck when it comes to self-service data preparation, consider first calculating how much money is currently being spent on this task, compared to actual analysis. To determine this figure (approximately), take into account the average analyst salary at your organization, the number of analysts employed, the average number of hours an analyst works each year, and the percentage of analysts’ time spent on preparing data. Frighteningly, a study done by Blue Hill Research calculated that, on average, organizations are wasting $22,000 per analyst, per year when they do not introduce data preparation tools.
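As a rough illustration, the estimate described above can be sketched in a few lines of Python. The function name, the field names and every input figure below are hypothetical placeholders chosen for easy arithmetic; they are not taken from the Blue Hill Research study or from any real organization, so substitute your own numbers.

```python
from dataclasses import dataclass


@dataclass
class PrepCostEstimate:
    """Approximate yearly cost of manual data preparation (hypothetical model)."""
    prep_hours_per_analyst: float  # hours per year one analyst spends preparing data
    prep_cost_per_analyst: float   # salary cost of those hours for one analyst
    total_prep_cost: float         # the same cost across the whole analyst team


def estimate_prep_cost(avg_salary: float, num_analysts: int,
                       hours_per_year: float, prep_share: float) -> PrepCostEstimate:
    """Estimate what data preparation currently costs in analyst salary.

    prep_share is the fraction (0.0 to 1.0) of analyst time spent preparing data.
    """
    if not 0.0 <= prep_share <= 1.0:
        raise ValueError("prep_share must be between 0 and 1")
    if hours_per_year <= 0 or num_analysts < 0 or avg_salary < 0:
        raise ValueError("inputs must be non-negative, with hours_per_year > 0")

    hourly_rate = avg_salary / hours_per_year       # cost of one analyst hour
    prep_hours = hours_per_year * prep_share        # hours spent on preparation
    per_analyst = prep_hours * hourly_rate          # yearly prep cost, one analyst
    return PrepCostEstimate(prep_hours, per_analyst, per_analyst * num_analysts)


if __name__ == "__main__":
    # Placeholder inputs: 10 analysts, 80,000 average salary,
    # 2,000 working hours per year, half of their time spent on preparation.
    estimate = estimate_prep_cost(80_000, 10, 2_000, 0.5)
    print(f"Prep hours per analyst: {estimate.prep_hours_per_analyst:,.0f}")
    print(f"Prep cost per analyst:  {estimate.prep_cost_per_analyst:,.0f}")
    print(f"Total prep cost:        {estimate.total_prep_cost:,.0f}")
```

Comparing the total against the remaining salary spend (the share of time left for actual analysis) gives a baseline against which a tool's cost and expected time savings can be weighed.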

When calculating ROI, it’s also important to determine the initial set-up costs associated with purchasing a data preparation solution, and to consider the technology’s potential benefits, including improved analytics and visualizations, more accurate data and reports, more productive analysts, and faster, better business decisions.

Step 4: Research providers and outline questions to ask vendors

There are many effective ways to identify self-service data preparation providers, including asking peers and colleagues, running exhaustive online searches, hiring consultants and using analyst reports to narrow down the number of options. Whatever method you choose, assessing vendors through a detailed Request for Proposal (RFP) can be extremely helpful. By preparing an RFP, you can solicit feedback on your specific requirements in a common format, making it easier to compare different vendor solutions.

Preparing a comprehensive list of requirements for your data preparation solution can also help ensure you select a system with the right capabilities and features for your needs. The combination of RFP responses and product requirements will enable you to evaluate critical factors – such as deployment model, end-to-end capabilities, pricing, support for data sources, end-user roles, and integration with existing investments – and ultimately identify the most appropriate vendor offering for your company.

Purchasing made simple

In short, incorporating a self-service data preparation solution into your analytics and operational processes will help business users and data analysts quickly produce content that facilitates better business decisions, all while upholding governance practices. And with the above guidelines, we hope selecting the right solution for your company is as easy as data preparation becomes when this innovative technology is implemented.

Jon Pilkington is the chief product officer at Datawatch. Jon joined Datawatch from Sonian Systems, a public cloud email archiving vendor, where he served as vice president of marketing and product management. He’s also held positions at Metatomix, a real-time data integration firm, and Cognos, a business intelligence management software company. Jon has a B.S. in Management Information Systems from Bryant University and is the recipient of several industry awards, including the Massachusetts Technology Leadership Council 2008 “CXO of the Year.”

Timothy King

Editor, Data and Analytics at Solutions Review
Timothy leads Solutions Review's Business Intelligence, Data Integration and Data Management areas of focus. He is recognized as one of the top authorities in Big Data, and the number-one authority in enterprise middleware. Timothy has also been named one of the world's top-75 most influential business journalists by Richtopia.