7 Simple Best Practices for Digital A/B Tests

You might know that many companies run “tests” in their web and mobile experiences, where they pit one design, layout, or content set against another. It’s a quick way to see which one works best. For instance, you might have one page with a slick graphic at the top and another with text and a form at the top, to see which one gets the best interaction and form completion. At Cisco, we’ve been doing this at a component level for a while, and have begun to do these kinds of tests with whole pages on Cisco.com. These tests help us understand which of two experiences is easier and more straightforward to use, and then we can apply that knowledge to our page designs in general.

But it’s easy to over-complicate testing, which can sometimes lead to results that are hard to interpret, or to more data than can be readily analyzed. So, I asked our team on the Digital Support Experience to give us their best practices for how to plan page vs. page tests the right way. Here are some great tips from the team:

Identify your success metrics (“Overall Evaluation Criteria,” see below) at the beginning of your planning, so you can make clear the #1 thing you are optimizing for.
Establish baseline performance for the KPIs/success metrics before you start the test. (Understand where you’re starting from: how were things performing before the test?)
Before the experiment starts, state what will happen if the test a) succeeds, b) is inconclusive, c) fails.
Don’t do two new competing designs for A and B. A better model is to use the current design as “A” and a new design as “B” so you have a good control group.
Limit the number of variables that differ between the two versions.
Let the experiment run long enough to collect the volume of data on the key metrics that will yield statistical confidence in the result.
Validate setup with an “A/A” test. The purpose of this is to check that your “system” for branching the users isn’t itself adding in a bias. (If you do a split of incoming users and you are directing them to pages that are exactly the same and you don’t get equal results, then your A/B branching system is adding bias itself.)
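
To make the last two tips a little more concrete, here is a minimal sketch in Python of how you might estimate how much traffic a test needs and then check whether two groups really differ. It is an illustration under stated assumptions, not how our team necessarily does it: the metric is assumed to be a simple conversion rate (such as form completion), the comparison is a standard two-proportion z-test, and the function names and example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, expected_rate, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH group to detect the given change.

    Uses the usual normal-approximation formula for comparing two proportions.
    alpha and power are conventional defaults, not values from the article.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical numbers: 10% baseline completion, hoping to reach 12%.
    print("Visitors needed per group:", sample_size_per_group(0.10, 0.12))

    # A/A check: both groups saw the same page, so a small p-value here
    # would suggest the branching system itself is adding bias.
    z, p = two_proportion_z_test(100, 1000, 104, 1000)
    print(f"A/A check: z={z:.2f}, p={p:.3f}")
```

The same test function works for the real A/B comparison once the experiment has collected roughly the planned number of visitors per group. In an A/A run you are hoping for a large p-value, since both groups saw the same page.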

Overall Evaluation Criteria

An “Overall Evaluation Criterion” (OEC) forces you to ask the question: “What are you optimizing for?”

To do it right, only one OEC (think KPI) should be specified for your test. The reason for just one criterion is that multiple criteria can lead to muddy results. If version A improved KPI #1 but version B improved KPI #2, which one was better? Therefore, it’s crucial to specify one and only one KPI and optimize for that.
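
One way to hold yourself to a single OEC is to write the plan down before the test starts. The sketch below is only an illustration in Python: the `ExperimentPlan` class, its field names, and the example values are hypothetical, and it assumes that a higher value of the OEC is better and that you already have a p-value from whatever significance test you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentPlan:
    """A pre-registered test plan with exactly one success metric."""
    oec: str                 # the one KPI being optimized
    baseline: float          # how the KPI performed before the test
    if_succeeds: str         # what happens if B beats A
    if_inconclusive: str     # what happens if there is no clear winner
    if_fails: str            # what happens if B is worse than A
    alpha: float = 0.05      # significance threshold (an assumed default)

    def decide(self, rate_a: float, rate_b: float, p_value: float) -> str:
        """Return the action that was agreed on before the test started."""
        if p_value >= self.alpha:
            return self.if_inconclusive
        return self.if_succeeds if rate_b > rate_a else self.if_fails


# Hypothetical example values.
plan = ExperimentPlan(
    oec="form completion rate",
    baseline=0.10,
    if_succeeds="roll out design B",
    if_inconclusive="keep design A and run a usability study",
    if_fails="keep design A",
)

print(plan.decide(rate_a=0.10, rate_b=0.15, p_value=0.001))
```

Because the plan is frozen and has room for only one metric, nobody can quietly swap in a second KPI after the results come in.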

And, an Ultra-Tip

Perhaps the top tip of all is to improve your designs, content, and journeys based on the tests that you run. And, if you don’t understand why a particular experience is performing in a certain way, it’s probably time to supplement your A/B testing with some observational usability tests, so you can watch users interact with (and comment on) the experience they’re having. Test similar experiences on other sites, too, and you’ll get a fuller picture.

What Do You Think?

Have you been running A/B and other kinds of tests on your digital experiences? What tips would you offer to your colleagues?