(This is a business case study. It will be used to guide discussions during the session: “Data Visualization” at the Vendo Partner Conference in Barcelona on September 16th.)
In the book “How to Lie with Statistics,” author Darrell Huff introduces us to ways of misreading data visually that were common in 1954. A favorite example….
It’s 61 years later and we still face this and other types of data visualization problems.
How do we really see what’s going on? One of the main problems that we face is baselines. What is normal behavior? We have to see that clearly to be able to see the variations that need attention. What tools are we using to tell normal variations from extreme ones?
Since working with data is both an art and a science, the best way to start, if we really want to find something out, is with a question. A friend of ours is one of the leaders of MicroStrategy, a big data and analytics company that works with Facebook and Netflix. He says that you should pick three questions and put three people on them for six weeks. The tool has to match the job, and there are lots of tools to find answers and present them visually.
We found Google Analytics to be useful for creating hypotheses from looking at trends. Qlikview is relatively easy to code. The graphics aren’t up to 2015 standards, but you can make it answer a lot of different questions.
Here are some examples of us fumbling around to make data visual in ways that didn’t work. We want to show our partners how different traffic sources contribute to the value of a member. A pie chart doesn’t scale very well: you would need one for every deal, so you would need lots of them. More pies than grandma makes for Christmas. A series of columns is also good, but it doesn’t really get you what you need because you can’t see the effect of a drop or increase in a revenue source.
This one shows deals individually but it’s hard to see the effect of each revenue source.
Now we are working with stacked bar charts, which let us see many deals side by side along with the effect of the different revenue sources. The numbers on the horizontal axis at the bottom show the value of the deal once the cost of the traffic is removed.
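To make the stacked bar idea concrete, here is a minimal sketch of the arithmetic behind such a chart. The deal names, source names, and numbers are all made up for illustration, and the function name `stacked_segments` is our own; it is not taken from any of the tools mentioned here. Each revenue source becomes one segment of the bar, and the net value is the total revenue minus the traffic cost.

```python
def stacked_segments(deals):
    """Lay out one stacked bar per deal.

    `deals` maps a deal name to its revenue per source and its traffic cost.
    Returns, per deal, the (source, start, end) segments and the net value.
    """
    layout = {}
    for deal, info in deals.items():
        left = 0  # running total: where the next segment starts
        segments = []
        for source, revenue in info["sources"].items():
            segments.append((source, left, left + revenue))
            left += revenue
        # Net value = total revenue minus what the traffic cost
        layout[deal] = {
            "segments": segments,
            "net_value": left - info["traffic_cost"],
        }
    return layout


# Hypothetical example data, not real results
deals = {
    "Deal A": {"sources": {"source_1": 30, "source_2": 15, "source_3": 5},
               "traffic_cost": 20},
    "Deal B": {"sources": {"source_1": 20, "source_2": 25, "source_3": 10},
               "traffic_cost": 35},
}

layout = stacked_segments(deals)
for deal, bar in layout.items():
    print(deal, bar["segments"], "net:", bar["net_value"])
```

Because every segment starts where the previous one ends, a drop in one source shortens the whole bar, which is what makes the effect of each source visible across deals.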
Another big challenge is forecasting. How do you predict averages for early data? Well, curves help. You can see the peak and you can see it rise over time compared with your history. A rising and falling box helps, too, to see averages and standard deviations.
Standard deviation bands also help us know if there is an important change. If there is a lot of volatility then it’s just the nature of the business.
The bands tell us what to expect (high and low and average) and then we plot the data to see the results. We can easily see what is out of hand and needs to be dealt with immediately. The bands also adapt based on recent results.
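As a rough illustration of how such bands can be computed, here is a small sketch using a rolling window. The window length of 7, the width of 2 standard deviations, and the sample numbers are assumptions chosen for the example, not the settings we use in our own reports.

```python
from statistics import mean, stdev


def rolling_bands(values, window=7, k=2.0):
    """For each point after the first `window` points, build a band
    (index, low, average, high) from the `window` points before it."""
    bands = []
    for i in range(window, len(values)):
        recent = values[i - window:i]
        avg = mean(recent)
        sd = stdev(recent)
        bands.append((i, avg - k * sd, avg, avg + k * sd))
    return bands


def out_of_band(values, window=7, k=2.0):
    """Return the indexes of points that fall outside their band."""
    flagged = []
    for i, low, _avg, high in rolling_bands(values, window, k):
        if values[i] < low or values[i] > high:
            flagged.append(i)
    return flagged


# Hypothetical daily counts with one sudden drop
daily_counts = [100, 104, 98, 101, 97, 103, 99, 102, 100, 60, 101, 98]
flagged = out_of_band(daily_counts)
print(flagged)
```

Since each band is built only from the most recent points, it adapts as results come in: after the drop enters the window, the band widens, so the return to normal values is not flagged as a second alarm.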
If you are interested in exploring data visualization further, check out these two subreddits: Data is Beautiful, which has lots of examples and some AMAs, and Visualization, which is more focused on practitioners of data visualization.
Questions for discussion: How do I see the data I’m getting? What charts and graphs am I using to understand my businesses? What tools am I using? We’ll talk through data visualization tools we use in Google Analytics and other tools like Qlikview.