Your Data Visualization Is (Probably) Lying to You
Business intelligence solutions matter because they help companies draw insights from the data they collect. A large part of forming those insights comes from how organizations see their data; that is, how they perceive what they are looking at. This is typically done through data visualization, whether the data appears as a graph, a chart, an Excel spreadsheet or some other format. Businesses rely heavily on this visualized data, so any errors or untruths could throw a business completely off base.
Enterprises need to be careful when analyzing visuals, even those that were created in-house. Individuals can always find a way to skew data to make it say what they want it to, so be sure to assess not only the data that makes up a visual, but the delivery mode of that data and how the visualization was put together.
Be careful of automatically buying the story the visual is attempting to sell. It’s really easy to take attractively formatted data for gospel. Truncating the Y-axis is a common visual manipulation. This may happen if the author of a graph wants to make it look like samples A and B are very different from one another: the Y-axis starts well above zero and increases by only small increments, so a small gap fills most of the chart. As a result, the data in the two samples can appear vastly different when it is not.
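To put a number on the truncation effect, here is a minimal sketch that compares how tall two bars are drawn relative to each other for a given axis baseline. The function name `apparent_ratio` and the sample values (50 and 52) are made up for illustration and do not come from any real dataset.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Return how many times taller bar b is drawn than bar a
    when the Y-axis starts at `baseline` instead of zero."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)


# Hypothetical samples: B is only 4% larger than A.
sample_a, sample_b = 50, 52

honest = apparent_ratio(sample_a, sample_b, baseline=0)
truncated = apparent_ratio(sample_a, sample_b, baseline=49)

print(f"Axis from 0:  B looks {honest:.2f}x as tall as A")
print(f"Axis from 49: B looks {truncated:.2f}x as tall as A")
```

With the axis starting at zero, the bars differ by 4 percent. With the axis starting at 49, bar B is drawn three times as tall as bar A, even though the underlying numbers have not changed.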
Be on the lookout too for ratio changes to the X or Y axes. For example, if the Y-axis is stretched or compressed, the steepness of the slope changes, allowing the graph’s author to make data look much more or much less impactful, depending on their particular stance. The effect is easy to see when two graphs that house the same data are compared side by side.
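The slope effect can be worked out with a little trigonometry. The sketch below estimates the angle at which the same change in the data is drawn when the Y-axis covers a narrow range versus a wide one. The function `apparent_angle_deg`, the plot dimensions and the sample numbers are all assumptions chosen for illustration.

```python
import math


def apparent_angle_deg(dx, dy, x_span, y_span, width=6.0, height=4.0):
    """Angle (degrees) at which a line segment is drawn on a chart.

    dx, dy         -- change in the data along each axis
    x_span, y_span -- total range covered by each axis
    width, height  -- physical size of the plot area (any unit)
    """
    run = dx / x_span * width     # horizontal distance on the page
    rise = dy / y_span * height   # vertical distance on the page
    return math.degrees(math.atan2(rise, run))


# Hypothetical data: a value rises by 5 units over 10 periods.
tight = apparent_angle_deg(dx=10, dy=5, x_span=10, y_span=10)
loose = apparent_angle_deg(dx=10, dy=5, x_span=10, y_span=100)

print(f"Y-axis spanning 10 units:  {tight:.1f} degrees")
print(f"Y-axis spanning 100 units: {loose:.1f} degrees")
```

The data is identical in both calls; only the range of the Y-axis differs, and that alone decides whether the line looks like a steep climb or a nearly flat one.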
Graphs that portray data as climbing up and to the right are typically viewed in a positive light by readers. This too can be manipulated. An author who wishes to hide not-so-good news by creating a deceptive chart can make it look like things are trending in a positive direction when they are not.
While advanced graphics look cool, these types of data visualizations are often created in order to deceive the reader by clouding their perception of the truth, especially in visuals that are three-dimensional. A similar effect can be achieved by inverting the numbers on an axis, so that values run in the opposite direction from what readers expect.
One of the more obvious ways data viz authors can manipulate their readers is to simply omit data. Visualizations of data are used to identify trends as they appear within a volume of data. By leaving out certain pieces of information that may not agree, visual creators can control the narrative. A good example of this is an author using only growth data in a sample, when there were also years of stagnation. True, the overall trend of the entity will look the same, but the chart will make it look better since the stagnant numbers won’t appear.
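A quick calculation shows how much omission can flatter the numbers. The following is a rough sketch using an invented revenue series with two growth years and two flat years; the helper names `yearly_growth` and `mean` and all figures are hypothetical.

```python
def yearly_growth(values):
    """Year-over-year growth rates for a list of yearly values."""
    return [(curr - prev) / prev for prev, curr in zip(values, values[1:])]


def mean(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)


# Hypothetical revenue by year: two growth years and two flat years.
revenue = [100, 110, 110, 110, 121]

all_years = yearly_growth(revenue)
growth_only = [g for g in all_years if g > 0]  # the cherry-picked subset

print(f"Average growth, all years:      {mean(all_years):.1%}")
print(f"Average growth, flat years cut: {mean(growth_only):.1%}")
```

In this made-up series, dropping the flat years doubles the average growth rate a reader would infer, while the line on the chart still points in the same direction.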
Charts often use logarithmic scales when comparing two things of very different size. For example, someone could compare a smaller department store to Wal-Mart by using a logarithmic scale on one of the axes, so that both fit on the same chart. However, it can become a problem when the scale is used for the purpose of hiding information that does not agree with the stance of the individual who created the visual.
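To see what a logarithmic axis does to a comparison like this, the sketch below computes where two values land along a linear axis and along a log axis. The sales figures and axis limits are invented for illustration and are not real figures for any retailer; `linear_position` and `log_position` are hypothetical helper names.

```python
import math


def linear_position(value, lo, hi):
    """Fraction of the axis length at which `value` sits on a linear axis."""
    return (value - lo) / (hi - lo)


def log_position(value, lo, hi):
    """Fraction of the axis length at which `value` sits on a log axis."""
    return (math.log10(value) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))


# Hypothetical annual sales: a small store and a retail giant.
small, giant = 1e7, 5e11
axis_lo, axis_hi = 1e6, 1e12

for name, pos in (("linear", linear_position), ("log", log_position)):
    print(f"{name:>6}: small at {pos(small, axis_lo, axis_hi):.4f}, "
          f"giant at {pos(giant, axis_lo, axis_hi):.4f}")
```

On the linear axis the small store is squashed against the bottom of the chart. On the log axis it sits about one sixth of the way up, and the giant sits less than six times as far along the axis even though its sales are 50,000 times larger. That compression is what makes the comparison readable, and it is also what lets a large gap be visually played down.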
Who knew data visualizations could be so complex? We analyze the data in these visuals all the time, but we often forget to analyze the analyst. What motives might the creator have? What do you think they want the data to say? Asking these questions before looking at the data might be even more important than the data inside the visual. In short, don’t forget to examine each axis, make sure the scaling makes sense, and beware of charts that look way cooler than they should.