Patterns vs Statistical Significance | Are you judging marketing performance inaccurately?




In marketing, we often have to make sense of large amounts of data and report on our findings. However, it's important to be aware of the potential for bias and errors in our interpretation of this data. We may be influenced by preconceived notions or even fabricate conclusions without realising it. For example, we might see a pattern in a graph and assume that it represents a real relationship, without verifying whether the pattern is statistically significant. To avoid such pitfalls, it's crucial to approach data critically and objectively, and use statistical methods to ensure the accuracy of our reports to clients.


Why do we see patterns everywhere?

We have a natural tendency to see patterns in our environment because it helps us make sense of the world around us. It is an incredibly important part of human cognition.

Pattern recognition allows us to quickly and easily identify familiar objects, shapes, and structures, which can be useful for navigation, communication, and problem-solving. For example, if you see a pattern of stripes on an animal, you might immediately recognise it as a zebra, even if you have never seen that particular zebra before.

Additionally, pattern recognition can help us to predict and anticipate events based on past experiences. For example, if you see clouds forming in the sky, you might anticipate that it will rain soon.

The general term for seeing patterns or connections in random or meaningless data is apophenia.


Why does this happen?

People are naturally pattern-seeking because it helps speed up the process of recognising things. However, this makes us prone to two kinds of error. The first is the false positive, where we perceive a pattern that is not really there. The second is the false negative, where we dismiss a pattern that is real.

A few subcategories of Apophenia are worth noting as they can all impact how you perceive results.

Pareidolia

Pareidolia is apophenia triggered by sensory stimuli, most often visual ones. It can take the form of seeing faces in objects or landscapes, or hearing hidden messages in music played in reverse. More broadly, apophenia also covers interpreting random data as having significant meaning.

Gambler’s Fallacy

The gambler's fallacy is the belief that past events can influence future events that are independent of those past events. It is also known as the "Monte Carlo fallacy", after a famous run of roulette results at the Monte Carlo Casino in 1913, and it is often observed in gambling, where people may believe that a certain outcome is "due" to happen based on the outcomes of previous plays.

For example, if a coin is flipped and comes up heads several times in a row, a person experiencing the gambler's fallacy may believe that the next flip is more likely to be tails, even though the chance is still 50/50.
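The coin example can be checked with a quick simulation. This is a minimal sketch, assuming a fair coin; the function name, the seed and the number of flips are illustrative choices, not part of any real dataset.

```python
import random

def tails_rate_after_heads_streak(n_flips=200_000, streak=3, seed=42):
    """Estimate how often tails follows `streak` heads in a row."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True = heads
    # Collect every flip that comes straight after a run of heads.
    followers = [
        flips[i]
        for i in range(streak, n_flips)
        if all(flips[i - streak:i])
    ]
    tails = sum(1 for flip in followers if not flip)
    return tails / len(followers)

rate = tails_rate_after_heads_streak()
print(f"Share of tails after three heads in a row: {rate:.3f}")
```

The printed share should sit close to 0.5: a streak of heads does not make tails any more likely on the next flip.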

These are all examples of cognitive bias. However, in this article I will be focussing on the clustering illusion and confirmation bias, as I have seen them be consistently destructive in how people report data in marketing.


What are cognitive biases?

Verywell Mind describes cognitive biases as systematic errors in thinking that occur when people process and interpret information in the world around them, and that affect their decisions and judgments.

These biases can stem from the way memory works or from limits on attention, so they can influence you very subtly without you realising.

The idea of cognitive bias was first introduced by Amos Tversky and Daniel Kahneman in 1972. Since then, researchers have described a number of different types of biases that affect decision-making.

There are countless cognitive biases, and I believe that as marketers we need to understand them in order to minimise the risk of them impacting our performance.


What is Confirmation Bias?

Confirmation bias is the tendency for people to search for, interpret, favour, and recall information in a way that confirms their existing beliefs or hypotheses. This bias can affect the way people perceive new information and can lead them to dismiss evidence that contradicts their beliefs.

This form of apophenia can lead to overemphasising data that confirms a hypothesis and explaining away information that disproves it.

Flawed decisions due to confirmation bias have been found in various political, organisational, financial and scientific contexts. These biases contribute to overconfidence in personal beliefs and can maintain or strengthen beliefs in the face of contrary evidence.


What is the Clustering Illusion?

The clustering illusion is sometimes known as the ‘law of small numbers’.

The clustering illusion is a cognitive bias that occurs when people see patterns or clusters in random or random-looking data. This bias can lead people to perceive a relationship or connection between seemingly unrelated events even when no such relationship exists.

For example, a person who experiences the clustering illusion might see a pattern in a series of random numbers, such as thinking that a sequence of numbers like 5, 8, 9, 11, 12 is not random but instead forms a meaningful sequence. In this case, the person is seeing a pattern where none exists, which can lead to distorted or irrational beliefs.

This happens because we expect a series of random events to look more regular than it actually does. When ‘clusters’ appear in small samples or semi-random data, we then assume they cannot be the result of chance.
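A short simulation shows how easily streaks appear in purely random data. The sketch below assumes a made-up scenario in which each of 30 days is equally likely to be "above average" or "below average"; the helper name and the streak length of 5 are illustrative.

```python
import random

def longest_run(values):
    """Length of the longest run of identical consecutive values."""
    if not values:
        return 0
    best = current = 1
    for prev, cur in zip(values, values[1:]):
        current = current + 1 if cur == prev else 1
        best = max(best, current)
    return best

rng = random.Random(7)  # fixed seed so the run is repeatable
runs = []
for _ in range(2_000):
    # One simulated month: True = above-average day, False = below-average day
    days = [rng.random() < 0.5 for _ in range(30)]
    runs.append(longest_run(days))

share = sum(r >= 5 for r in runs) / len(runs)
print(f"Random months containing a streak of 5+ similar days: {share:.0%}")
```

A sizeable share of these random months contain a streak that would look like a trend on a chart, even though nothing but chance produced it.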


How these biases can impact our perceived results

Cognitive biases impact our perceived results by influencing the way we perceive, interpret, and remember information.

For example, if you have a strong belief that a certain marketing campaign is the best, confirmation bias may lead you to only pay attention to information that supports that belief and to overlook or dismiss evidence that suggests otherwise. As a result, you may make decisions that are not supported by the facts, which could lead to poor performance.

Similarly, with the clustering illusion, you might see a pattern in your sales data that suggests a particular trend is going to boost your sales, but in reality, search volume and demand are determined by a complex interplay of factors and are not predictable from random fluctuations in the data.


What can we do to improve our reporting?

It’s vital that when reviewing performance we are able to look at results objectively and critically to reduce the chance of being influenced by cognitive biases.

To confirm whether performance is genuinely better or worse, and not just appearing that way by chance, we need to use statistical methods to test our hypothesis.


What is statistical significance?

In statistics, the term "statistical significance" describes a result that would be unlikely to occur if chance alone were at work. Testing for it is exactly what we need to do to reduce the influence of cognitive biases and ensure our reporting is more accurate.

Statistical significance indicates how strongly the data argue against the idea that an observed relationship is just the result of random chance. It is typically determined by calculating a p-value, which is the probability of obtaining a result that is at least as extreme as the observed result, given that the null hypothesis is true. If the p-value is below a chosen threshold (usually 0.05), then the result is considered statistically significant: it would be unusual if only chance were at work, although this is evidence rather than proof that the relationship is real.
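To make the p-value definition concrete, here is a minimal sketch with invented numbers: suppose 60 of 100 surveyed customers prefer ad creative A over creative B, and we assume as a starting point that there is no real preference (a 50/50 split). The function name and the figures are assumptions for illustration only.

```python
from math import comb

def two_sided_p_value(successes, trials):
    """Exact p-value assuming each outcome is a 50/50 coin flip."""
    observed_gap = abs(successes - trials / 2)
    # Count every possible outcome at least as far from 50/50 as the one seen.
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= observed_gap
    )
    return extreme / 2 ** trials

p_value = two_sided_p_value(60, 100)
print(f"p-value for a 60/40 split out of 100: {p_value:.4f}")
print("Significant at 0.05" if p_value < 0.05 else "Not significant at 0.05")
```

A 60/40 split can look like a clear winner on a chart, so it is worth comparing the printed p-value with the 0.05 threshold before reporting it as one.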


What is a null hypothesis?

A null hypothesis proposes that there is no real difference or relationship between the things being compared. In other words, it is a statement that assumes that any observed relationship between variables is due to chance, and not because of any actual underlying relationship. The null hypothesis is usually denoted by the symbol H0.

For example, say you are conducting an experiment to test whether making certain UX (user experience) optimisations to a website will lead to an increase in the number of purchasers.

In this case, the null hypothesis (H0) might be that making these optimisations will not have any significant effect on the number of users who complete a purchase. This would be stated as follows:

H0: The UX optimisations will not lead to a significant increase in the number of users who complete a purchase.

The alternative hypothesis (H1), on the other hand, would be that making the UX optimisations will indeed lead to a significant increase in the number of users who complete a purchase. This would be stated as follows:

H1: The UX optimisations will lead to a significant increase in the number of users who complete a purchase.


How to calculate statistical significance?

There are many different statistical methods that could be used to compare the data in this scenario. Some possible methods might include:

  • A t-test, which is a statistical test used to compare the means of two groups (e.g. the number of users who complete a purchase before and after the UX optimisations are implemented) to determine whether there is a significant difference between them.
  • A chi-squared test, which is a statistical test used to compare the observed frequencies of events (e.g. the number of users who complete a purchase in each group) to the expected frequencies under the null hypothesis, in order to determine whether the observed data is significantly different from what would be expected by chance alone.
  • A regression analysis, which is a statistical method used to model the relationship between two or more variables (e.g. the number of users who complete a purchase and the implementation of UX optimisations) and determine the statistical significance of any observed relationship.
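As an illustration of the chi-squared option, the sketch below compares conversions between two groups of visitors. The visitor and purchase counts are made up, the function name is hypothetical, and the test is the plain Pearson version without a continuity correction.

```python
from math import erfc, sqrt

def chi_squared_2x2(conv_a, total_a, conv_b, total_b):
    """Pearson chi-squared test for a 2x2 table of conversions."""
    observed = [
        [conv_a, total_a - conv_a],  # group A: purchased, did not purchase
        [conv_b, total_b - conv_b],  # group B: purchased, did not purchase
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count we would expect if both groups converted at the same rate
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(chi-squared > x) equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical data: 120 of 2,400 visitors bought before, 156 of 2,400 after.
chi2, p_value = chi_squared_2x2(120, 2400, 156, 2400)
print(f"chi-squared = {chi2:.2f}, p-value = {p_value:.4f}")
```

A chi-squared test suits this kind of data because each visitor either purchases or does not, so we are comparing counts rather than averages.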

How to run a t-test experiment for UX optimisations

At this point it gets quite maths heavy. If you would prefer an easier and quicker way of determining results for this specific example I would recommend this website.

First, we need to calculate the means of the two groups:

Group 1 mean (μ1): sum of all values in group 1 / number of values in group 1

Group 2 mean (μ2): sum of all values in group 2 / number of values in group 2

Next, we need to calculate the pooled standard deviation of the two groups:

Pooled standard deviation (s): √[ ( (n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2) ]
where:

n1 = number of values in group 1
n2 = number of values in group 2
s1 = standard deviation of group 1
s2 = standard deviation of group 2

Finally, we can calculate the t-statistic as follows:
t-statistic: (μ1 - μ2) / (s * √[1/n1 + 1/n2])

Once we have calculated the t-statistic, we can use it, together with the degrees of freedom (n1 + n2 - 2), to determine the p-value from the t-distribution. The p-value is the probability of obtaining a result at least as extreme as the observed result, given that the null hypothesis is true. If the p-value is below a certain threshold (usually 0.05), then the result is considered statistically significant, and we can reject the null hypothesis in favour of the alternative hypothesis.
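The steps above can be sketched in a few lines of Python. The daily purchase figures are invented for illustration, and the function name is hypothetical; the calculation follows the pooled standard deviation and t-statistic formulas given earlier.

```python
from math import sqrt
from statistics import mean, stdev

def pooled_t_statistic(group_1, group_2):
    """Return the t-statistic and degrees of freedom for two groups."""
    n1, n2 = len(group_1), len(group_2)
    mu1, mu2 = mean(group_1), mean(group_2)
    s1, s2 = stdev(group_1), stdev(group_2)  # sample standard deviations
    # Pooled standard deviation of the two groups
    pooled_s = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t_stat = (mu1 - mu2) / (pooled_s * sqrt(1 / n1 + 1 / n2))
    return t_stat, n1 + n2 - 2

# Hypothetical daily purchases for a week after and a week before the change
after = [36, 34, 39, 31, 38, 35, 37]
before = [31, 28, 35, 30, 27, 33, 29]

t_stat, dof = pooled_t_statistic(after, before)
print(f"t-statistic = {t_stat:.2f} with {dof} degrees of freedom")
```

Turning the t-statistic into a p-value requires the t-distribution, which is not in Python's standard library. If SciPy is installed, `scipy.stats.ttest_ind(after, before, equal_var=True)` runs the same pooled test and returns the p-value directly.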


How can we use this in our reporting?

  • Conducting A/B tests to compare the effectiveness of different marketing strategies or tactics, such as different versions of a website or Google Ads campaign. Statistical tests can be used to determine whether the observed difference in performance between the two versions is statistically significant, and therefore likely to be real and not just due to random chance.
  • Analysing customer data to identify patterns and trends, such as the most common customer demographics or the most popular products. Statistical methods such as regression analysis can be used to model the relationships between different variables and identify which factors are most strongly associated with certain outcomes, such as customer purchases or website engagement.
  • Forecasting future performance, such as future sales or customer growth. Statistical methods such as time series analysis can be used to model historical data and make predictions about future trends based on past patterns.
  • Testing hypotheses about market trends or consumer behaviour. For example, a marketer might test the hypothesis that a particular marketing campaign will lead to an increase in sales, by collecting data on sales both before and after the campaign is launched and then using statistical methods to determine whether the observed increase is statistically significant.

Overall, statistical methods can provide valuable insights and help marketers make more informed decisions based on data.


