Scalability in Business Intelligence: It’s Not What You Think


By Ellie Fields

The promise of analytics in the last decade has been to make data democratic and accessible to all. You don’t need special skills, the thinking goes, to ask and answer some basic questions about what you do every day.

There are a couple of critical assumptions underlying this thinking. The first is that anyone who does a job every day is smart enough to have some ideas and questions about that job. The second is that analyzing data is just another way of thinking about your world. You might investigate hunches using words and discussion, or you might investigate your world by analyzing data. If you think about it that way, you understand why organizations are starting to give analytical tools to everyone instead of a select few with the word “Analyst” in their title.

Now if you take this thinking one more step, you see an important implication: the democratization of data requires scale. If you want to invite more people to think with data, you need to enable them with the tools to do so. The democracy of Ancient Rome, with only a few elites able to vote, worked with simple systems. Enabling a modern democracy with millions of citizens requires much more infrastructure.

So what kinds of scalability should you consider when you roll out analytics broadly? Well, the answer probably isn’t what you think.

The obvious kind of scalability: system scalability

The first kind of scalability probably is the scalability you were thinking of. This is the ability for the system to support the required load of x number of users performing y number of queries against z data sources. The system must be able to support this load without crashing and with an acceptable performance level. This kind of scalability is obviously critical.
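To make the x-users-by-y-queries idea concrete, here is a rough back-of-the-envelope sketch. Every number and name in it (the user count, the query rate, the peak factor, the per-server throughput) is a hypothetical assumption for illustration, not a figure from any real deployment.

```python
import math

def peak_queries_per_second(users, queries_per_user_per_hour, peak_factor=3.0):
    """Estimate the peak query rate the system must sustain.

    peak_factor is an assumed multiplier for busy periods
    (for example, Monday morning dashboard refreshes).
    """
    average_qps = users * queries_per_user_per_hour / 3600
    return average_qps * peak_factor

def servers_needed(peak_qps, qps_per_server):
    """Round up to whole servers; qps_per_server is an assumed benchmark."""
    return math.ceil(peak_qps / qps_per_server)

# Hypothetical rollout: 2,000 users, 18 queries per user per hour.
peak = peak_queries_per_second(users=2000, queries_per_user_per_hour=18)
print(peak)                                    # 30.0
print(servers_needed(peak, qps_per_server=8))  # 4
```

A real sizing exercise would also account for the number and speed of the data sources behind those queries, which this sketch leaves out.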

This is the scalability that every analytics vendor devotes teams of people to improving. Improving this scalability usually involves better machines and better technology—it is a technical problem.

In recent years, cloud analytics systems have changed the cost-return curve of scalability by providing massive amounts of computing power that an organization can bring online with virtually no time investment. I expect the cloud to continue to improve the system scalability of data and analytics quite dramatically.

Adoption scalability: scaling to everyone

The second kind of scalability is the ability for a system to reach many different kinds of people and get them to adopt it. This kind of scalability problem encompasses two different challenges: training and usability.

Improving usability lowers the barrier for new users. As a system of data and analytics grows in an organization, it inevitably touches people who don’t have the time or the motivation to learn something new. The easier the system is to learn, the higher the ultimate adoption will be. Usability is, in part, a system design problem. Organizations can also address usability by making important data sources clear and available.

Another challenge to scaling the number of people on a system is training. This can mean training in the traditional sense of classes and videos, as well as a host of other ways to help people ramp up. We’ve seen large customers improve enterprise adoption by offering office hours or doctor sessions with experienced users. Contests, user groups and show-and-tell days can provide a social aspect to learning about analytics.

Data scalability: scaling for different kinds of data and data of any size

Another challenge for analytic systems is to scale to all data: from big data to small data, from cloud data to Excel spreadsheets. If analytics is simply another way of thinking about your business, then being able to analyze all the relevant data will certainly offer a more holistic and useful result.

Most organizations have a variety of data sources. In part, this is the result of strategic choices—say, creating a data lake in Hadoop and a vetted financial data source in a warehouse. You wouldn’t want these two things in the same place. But it’s also organic. Marketing starts to get some really interesting insights from social data, and now you’ve got a new data source. The recruiting team adopts a cloud solution for hiring management, and now there’s another.

Like system scalability, data scalability is a technical challenge. The system simply needs to be able to connect to different data sources of any size, and either provide analysis against the data where it sits or bring it into the analytical system. As you might expect, it’s usually better if you can connect to data in a simple and direct way, rather than having to kick off a lengthy programming project for each new data source.
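The two options described above, analyzing data where it sits or bringing it into the analytical system, can be sketched in a few lines. This is a minimal illustration that uses an in-memory SQLite database as a stand-in for a real data source; the `sales` table and its rows are made up for the example.

```python
import sqlite3

# Stand-in data source (hypothetical table and values).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 30.0)],
)

def totals_in_place(conn):
    """Analyze the data where it sits: the source does the aggregation."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows)

def totals_from_extract(conn):
    """Bring the data in: pull the raw rows, then aggregate locally."""
    extract = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in extract:
        totals[region] = totals.get(region, 0.0) + amount
    return totals

print(totals_in_place(conn) == totals_from_extract(conn))  # True
```

Both paths give the same answer. The trade-off is where the work happens: the first leans on the source system, while the second moves a copy of the data so later questions do not have to go back to the source.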

Analytic scalability: scaling for different types of questions

Finally, we have analytic scalability. This is the least talked about kind of scalability, but it’s in many ways the most important. If you can’t answer your questions, what are analytics for anyway?

Analytic scalability is the ability to use data to understand and solve a large variety of problems. And because problems come in many forms, analytics must be flexible enough to address problems in different ways. This might include the use of statistical tools and forecasting. It might include the ability to create different views of data and bring them together into dashboards, to help people see how different views of data relate.

Visual analytics has become a critical way for people to use data. A visual presentation of data allows people to quickly find patterns and outliers. Shifting perspectives, from a set of bar charts to a map, for example, can help people gain a deep understanding of what the data represents.

Analytic scalability is mostly a system problem: can the system offer the user the chance to represent data in the most useful way? Is it flexible enough for someone to ask and answer questions in a fluid, interactive way?

An analytics system that has system scalability, adoption scalability, data scalability and analytic scalability is one that is ultimately about more than business intelligence. It’s about enabling people to work together to better understand their world. When companies do this, they find they are more agile and smarter than their competitors. They also find that their people enjoy having the ability to follow their natural curiosity about the work they do every day.

Ellie Fields is the VP of Product Marketing at Tableau, where she’s responsible for new product launch, Tableau’s community, and Tableau Public. Her data geek credentials come from time served in tech and finance companies. She’s seen a lot of ugly data, beautiful data, and downright mean data. She’s a passionate believer that data used well can inform great decisions. She has an engineering degree from Rice University and an MBA from The Stanford Graduate School of Business. Connect with her on LinkedIn.

Timothy King
Editor, Data and Analytics at Solutions Review
Timothy leads Solutions Review's Business Intelligence, Data Integration and Data Management areas of focus. He is recognized as one of the top authorities in Big Data, and the number-one authority in enterprise middleware. Timothy has also been named one of the world's top-75 most influential business journalists by Richtopia.