How Do You Declare the Winner of Your A/B Test?
When you declare a winner, the result has to be statistically valid. In other words, the difference has to be significant enough that you can confidently set a new course in whatever you do.
To understand the statistical significance of your A/B test, you have to keep three specific parameters in mind (see the sketch after this list):
Sample Size
Test Size
Confidence
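Reading these three together is what tells you whether a lift is real or just noise. As a rough illustration (not from the original post), here is a minimal Python sketch of a two-proportion z-test on conversion rates; the visitor and conversion counts and the 95% confidence threshold are assumptions chosen for the example.

```python
# A minimal sketch of checking whether an A/B result is statistically
# significant, using a two-proportion z-test. All numbers are hypothetical.
from math import sqrt, erfc

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return the z statistic and two-sided p-value for the difference
    in conversion rate between variant A and variant B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)                 # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))   # standard error of the difference
    z = (p_b - p_a) / se
    p_value = erfc(abs(z) / sqrt(2))                         # two-sided tail probability
    return z, p_value

# Hypothetical example: champion converts 500 of 10,000 visitors,
# challenger converts 570 of 10,000.
z, p = two_proportion_z_test(500, 10_000, 570, 10_000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
print("Winner declared" if p < 0.05 else "Inconclusive -- keep testing")
```

With a big enough sample and a big enough lift, the p-value drops below the threshold and you have conclusive evidence of a winner; otherwise the test stays inconclusive, no matter how much one variant "leads on the scoreboard."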
Make sure you’re testing something that can actually have an impact.
A smart, well-thought-out test is important: you want to learn something even if you fail.
Below is a lightly edited transcript of Episode 31 of the Inevitable Success Podcast.
Transcript:
Damian: Today’s episode is about different ways that you can test to improve your marketing program. For example, we’re big proponents of the champion/challenger methodology: basically, always having an incumbent winning approach to all of your marketing that you’re constantly challenging, and we always prefer to do this in a testable format. Now that said, sometimes the metrics that come back are not so clear, and sometimes you look at the wrong metrics. So today we want to go a little bit deeper into how you would determine whether you have a winner or not.
Stephen: Of course, and I’ve been saying this for a long time: test results are not baseball scores. In baseball, with the World Series going on right now, if you win by one run that’s fine. It’s a one-run game; maybe it was a pitcher’s duel. But in testing it’s not like that. It has to be statistically valid. In other words, there has to be a significant enough difference that you really set a new course in whatever you do.
Damian: So the takeaway is, just because you have a test that would have won a baseball game doesn’t mean that you actually have a winning idea.
Stephen: I call it conclusive evidence that you have a winner.
Damian: So I’ve totally experienced this myself. I’ve run probably hundreds of tests across all sorts of mediums at this point, and the most common result I have found, especially if the test is not aggressive enough, is an inconclusive one. It’s a very common result. Me personally, when I’m working on optimizing things, I love to go after bold, aggressive changes, and here’s why: the very fact that you’re testing means you’re already managing the risk of rolling out a bad idea.
Stephen: OK. Hopefully, it doesn’t stink too much.
Damian: Well, yeah, but you’re not necessarily going to roll it out to everybody; you can manage that too. But, you know, I love the idea of avoiding inconclusive tests. I either want something to work phenomenally or to prove quickly that I should never do that again. And I think when you look for things that can drive big changes, the odds of learning nothing and just spinning your wheels go down from there.
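Damian’s point about bold changes has a statistical side to it: the smaller the lift you’re chasing, the more traffic you need before the test can be conclusive. The sketch below is not from the episode; the 5% baseline conversion rate, 95% confidence level, and 80% power are assumed values, used only to show how the required sample size shrinks as the expected lift grows.

```python
# Rough sample-size estimate per variant for detecting a relative lift,
# assuming a 5% baseline conversion rate, 95% confidence, and 80% power.
from statistics import NormalDist

def sample_size_per_variant(p_base, lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant to detect a relative lift."""
    p_test = p_base * (1 + lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for 95% confidence
    z_beta = NormalDist().inv_cdf(power)            # ~0.84 for 80% power
    variance = p_base * (1 - p_base) + p_test * (1 - p_test)
    return ((z_alpha + z_beta) ** 2 * variance) / (p_test - p_base) ** 2

for lift in (0.02, 0.05, 0.10, 0.25):  # 2%, 5%, 10%, 25% relative lift
    n = sample_size_per_variant(0.05, lift)
    print(f"{lift:>4.0%} lift -> ~{n:,.0f} visitors per variant")
```

A timid 2% lift needs hundreds of thousands of visitors per variant to reach significance, while a bold 25% lift can be confirmed (or killed) with a few thousand, which is why aggressive tests are far less likely to end up inconclusive.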
Stephen: I think what you’re describing is what we call the scientific approach.
Damian: Yes.
Stephen: What is the scientific approach? We all know it, even if we don’t practice it. We took some science classes in school; it’s a social science. In the beginning there is the hypothesis: if we do this, this will happen; or if you give this drug to somebody, they’ll get better, or if not, that person drops out of the trial. It’s the same method, right? The biggest challenge in any analytics is to come up with a hypothesis. In other words, whatever you test, that idea should come from human beings.
Damian: Yeah, and to that end, if you design it really well,