Stats Made Simple Part Three: Putting Your Results to the Test
In part 2 of this series, AFFINITY Bright Spark, Caspar Yuill, explained how we can use probability to improve customer experience. In this final installment, Caspar looks at the different testing protocols available, and demonstrates why it’s important to understand how to apply these in order to make informed decisions around your results.
In case you’ve forgotten, our last article on statistics left you on a cliffhanger: you, the hypothetical CMO of a company, had implemented a new fit-out to improve customer experience and increase sales. Like the smart data scientist you are, you changed one store but left another as your control. But the results were inconclusive – Store Two, the store you changed, took $1,087 in sales over a set period of time, while your control took $1,010 over the same period. The big question remains: is the intervention worth rolling out, especially with such a small difference in sales?
Numerical vs Categorical
Before you go about answering this question, there’s one last concept we need to explore related to the type of data being collected. Clearly, the amount of store sales in dollars is a different metric to, say, the gender of customers you’re attracting. You can think of store sales dollars as a numerical metric, which means it’s a type of data expressed in numbers (also called quantitative data). In other words, it’s any data you could put in a spreadsheet and add together, multiply, or divide.
On the other hand, data like gender is categorical data. This could also include data like the level of education, suburb, and the personality type of your test subjects. While you might be able to store this data in a spreadsheet, and you might even code categorical data using numbers for analysis, you can’t do maths on it.
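To make the distinction concrete, here is a minimal sketch in Python. The sale amounts and suburb labels are invented purely for illustration; the point is only that numerical data is summarised with arithmetic, while categorical data is summarised by counting.

```python
from collections import Counter
from statistics import mean

# Hypothetical records, invented purely for illustration.
sale_amounts = [100, 120, 110]                   # numerical: dollars per sale
customer_suburbs = ["North", "South", "North"]   # categorical: labels

# Numerical data supports arithmetic, so an average is meaningful.
average_sale = mean(sale_amounts)

# Categorical data is summarised by counting how often each label occurs.
suburb_counts = Counter(customer_suburbs)

print(average_sale)
print(suburb_counts.most_common())
```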
Answering Our Problem
Now if you think back to the first article in the series where we discussed normal distribution, you need to start by considering whether you’re comparing your sample to the population or to another sample. As you’re comparing it to another store, that means it’s to another sample.
Next up, you need to determine whether the data is numerical or categorical. As you’re comparing store sales, you’re looking at numerical data, right?
Finally, we consult our handy chart of statistical tests:
|Comparison|Numerical data|Categorical data|
|---|---|---|
|1 vs Population|1 Sample T Test|Binomial|
|Sample vs Sample|2 Sample T Test|Chi Square test|
|More than two samples|ANOVA (+ Post Hocs)|Chi Square test|
Input your two variables and you can see you need a two sample T Test (easily calculated in Excel or Google Sheets).
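If you prefer code to charts, the same lookup can be sketched as a small Python dictionary. The key names (`sample_vs_sample`, `numerical`, and so on) and the function `choose_test` are labels made up for this example; the test names come straight from the chart.

```python
# The chart of statistical tests as a lookup table.
# Keys are (comparison, data type); the key names are hypothetical labels.
TEST_CHART = {
    ("sample_vs_population", "numerical"): "1 Sample T Test",
    ("sample_vs_population", "categorical"): "Binomial",
    ("sample_vs_sample", "numerical"): "2 Sample T Test",
    ("sample_vs_sample", "categorical"): "Chi Square test",
    ("more_than_two_samples", "numerical"): "ANOVA (+ Post Hocs)",
    ("more_than_two_samples", "categorical"): "Chi Square test",
}


def choose_test(comparison, data_type):
    """Return the test named in the chart for the two answers given."""
    return TEST_CHART[(comparison, data_type)]


# Our store problem: one store against another, measured in sales dollars.
print(choose_test("sample_vs_sample", "numerical"))
```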
Working through a couple of other settings (which won’t be discussed in detail today), like distribution tails (we want a two-tailed test to look for changes in both directions), and type (in this case, a two-sample equal variance test, as the variance of our two data sets is equal), we get a p-value of 0.00000003%. In other words, if the new fit-out had made no real difference, there would be only a 0.00000003% likelihood of seeing a gap in sales this large purely by chance. Therefore, you can conclude the intervention was a success, and you have the numbers to prove it!
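For the curious, here is a rough sketch of what the spreadsheet is doing under the hood for a two-sample equal variance test. A T Test needs the individual observations, not just the totals, so the daily sales figures below are invented for illustration (they are chosen only so that they add up to $1,087 and $1,010). The function name `pooled_t_statistic` is also made up for this example. The sketch stops at the t statistic and degrees of freedom; turning those into a likelihood is the step a spreadsheet function such as `T.TEST` handles for you.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical daily sales, invented so the totals match the article.
store_two = [108, 110, 107, 109, 111, 108, 109, 108, 109, 108]  # sums to 1087
control = [101, 100, 102, 101, 100, 101, 102, 101, 101, 101]    # sums to 1010


def pooled_t_statistic(sample_a, sample_b):
    """Two-sample t statistic, assuming both samples share one variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


t_stat, degrees_of_freedom = pooled_t_statistic(store_two, control)
print(t_stat, degrees_of_freedom)
```

The larger the t statistic is in either direction, the less plausible it is that the two stores are really performing the same.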
Using Other Tests
We won’t go into the intricacies of the statistical tests here – for those interested, there are some excellent courses on Coursera that go into far more detail. However, we will talk about how you might use the other statistical tests depending on your data.
Say you changed only one store but wanted to compare the sales to all your other stores. This means your data is still numerical, but in that case, you’d want a 1 Sample T Test, as you’re comparing to a population.
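As a rough sketch, the one-sample version compares your test store's average against a single known figure for the whole chain. Both the daily sales and the chain-wide average of $101 per day below are assumptions invented for this example, as is the function name `one_sample_t_statistic`. As before, the sketch stops at the t statistic and leaves the final likelihood to your spreadsheet or stats package.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical daily sales for the one store you changed.
test_store = [108, 110, 107, 109, 111, 108, 109, 108, 109, 108]

# Hypothetical average daily sales across all your other stores.
chain_average = 101


def one_sample_t_statistic(sample, population_mean):
    """t statistic for one sample against a known population mean."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    return (mean(sample) - population_mean) / standard_error, n - 1


t_stat, degrees_of_freedom = one_sample_t_statistic(test_store, chain_average)
print(t_stat, degrees_of_freedom)
```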
But what would you do if you wanted to look at the impact of the differences in store fit-out on customer satisfaction? Well, you might collect data via a store survey and get a 1-5 rating of satisfaction across all stores, including a rating of your test store. In that case, you would consult the chart and select a binomial test.
Just don’t get tripped up by the store survey data ratings! While this data comes in the form of numbers, it’s still categorical. It puts the customer experience into categories, and it’s not easily operated on – for example, a customer experience rating of 1 is not five times worse than an experience of 5.
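Here is one way a binomial test on survey data might look. A binomial test needs two categories, so this sketch assumes you group ratings of 4 and 5 as "satisfied" and everything else as "not satisfied"; that grouping, the 60% chain-wide satisfaction rate, and the 45 out of 60 satisfied customers in the test store are all invented for illustration. The function names are made up too. The two-sided likelihood adds up every outcome that is no more likely than the one you observed.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_test_two_sided(successes, n, p_null):
    """Exact two-sided p-value: total probability of all outcomes that are
    no more likely than the observed one, if the true rate is p_null."""
    observed = binomial_pmf(successes, n, p_null)
    total = 0.0
    for k in range(n + 1):
        prob = binomial_pmf(k, n, p_null)
        # Small tolerance so floating-point ties are counted.
        if prob <= observed * (1 + 1e-7):
            total += prob
    return min(1.0, total)


# Hypothetical survey: 45 of 60 test-store customers rated 4 or 5,
# against an assumed 60% satisfaction rate across all stores.
p_value = binomial_test_two_sided(45, 60, 0.60)
print(p_value)
```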
If you wanted to trial a couple of different store fit-outs you would include Test Store 1, 2, and 3, and you would try to discover which one resulted in the greatest customer satisfaction, if any. In this scenario, you would have more than two samples being compared using categorical data, so you would pick a Chi-Square Test.
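A Chi-Square Test works on a table of counts. The sketch below assumes, for simplicity, that each survey response is grouped as "satisfied" or "not satisfied"; the counts for the three test stores are invented, and `chi_square_statistic` is a name made up for this example. It compares the counts you observed with the counts you would expect if fit-out made no difference to satisfaction.

```python
from math import exp


def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns are unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts: [satisfied, not satisfied] for Test Stores 1, 2, 3.
survey_counts = [
    [30, 30],
    [42, 18],
    [33, 27],
]

statistic, df = chi_square_statistic(survey_counts)

# With exactly 2 degrees of freedom the p-value has a simple closed form.
# For any other table shape, use a stats package instead.
p_value = exp(-statistic / 2) if df == 2 else None
print(statistic, df, p_value)
```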
Lastly, suppose you had the same experimental setup as above – three Test Stores, all with different fit-outs – and you wanted to see which one, if any, had an impact on store sales. In that case, you would have numerical data and three samples to compare. Your test of choice would be an ANOVA (Analysis of Variance) test to see whether any of the stores are significantly different from the others, then a post-hoc test, like Tukey’s test, to tell you which ones are and aren’t significantly different.
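As a rough sketch, the heart of an ANOVA is a single number, the F statistic, which compares how much the store averages differ from each other with how much sales bounce around within each store. The daily sales below are invented for illustration and `anova_f_statistic` is a name made up for this example. The sketch covers only the F statistic; the likelihood and the post-hoc step are left to a stats package.

```python
from statistics import mean


def anova_f_statistic(groups):
    """One-way ANOVA F statistic with its two degrees of freedom."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Variation between the group averages.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation within each group around its own average.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical daily sales for Test Stores 1, 2, and 3.
store_sales = [
    [101, 103, 102, 100, 104],
    [108, 110, 109, 107, 111],
    [102, 101, 104, 103, 100],
]

f_stat, df_between, df_within = anova_f_statistic(store_sales)
print(f_stat, df_between, df_within)
```

A large F statistic suggests at least one store's average sits apart from the rest, which is the cue to run the post-hoc test.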
So why does all this matter? When analysing test results, it’s important to choose the correct statistical test so you can trust your conclusions, especially if you’re using that analysis as the basis for big decisions, like whether to change the entire customer experience of a store!
That’s just about as simple as statistics gets. If this series has piqued your interest, by all means, dive right in. It’s easy to see how just by developing your intuition using these rudimentary statistical concepts, you can start thinking smarter and making better decisions. And of course, you can get used to the feeling of being right! (Just ask my colleagues ;-))
Want to learn more about how to use the power of statistics to make informed decisions throughout the marketing process? Drop us a line at email@example.com to chat further.