5 Common A/B & Multivariate Testing Mistakes
1. Leaving Web Analytics Un-Optimized
Before you get started testing, you want to make sure your web analytics are set up properly, not only so you can do proper analysis before designing your test, but so you are tracking the right things during and after the test.
Make sure your "goal paths" (the common navigational paths your customers take to reach your conversion goal, e.g. checkout) are configured so you can view your funnel abandonment. Also ensure revenue is properly tracked. If possible, include your COGS (cost of goods sold) to determine profit per visitor.
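As a rough illustration of the profit-per-visitor idea, here is a minimal sketch. The function name and the figures are hypothetical, and it assumes you can export revenue, COGS and visitor counts for the same reporting period.

```python
def profit_per_visitor(revenue, cogs, visitors):
    """Return (revenue - COGS) / visitors, or 0.0 when there is no traffic."""
    if visitors == 0:
        return 0.0
    return (revenue - cogs) / visitors


# Hypothetical figures for one reporting period.
ppv = profit_per_visitor(revenue=12_000.0, cogs=7_500.0, visitors=9_000)
print(f"Profit per visitor: ${ppv:.2f}")
```

If COGS is not available in your analytics, the same calculation without the COGS term gives revenue per visitor instead.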
In Google Analytics, "Profiles" allow you to slice and dice your data by applying permanent segmentation rules, such as filtering out international traffic or restricting data to a sub-domain or store section of your website. Profile filters are not applied to historical data, so profiles must be set up before you start testing.
2. Not Understanding Customer Segments
Testing exists to improve your website's performance, but averages may hide the real issues on your site. Your home page may have an average bounce rate of 59%, but segmenting by visitor type might reveal that new visitors bounce at 75% and returning visitors at 34%. So instead of setting a goal to reduce overall bounce rate, your goal might change to reducing new visitor bounce rate. Likewise, there are differences between domestic and international visitors, and between email subscribers, affiliate referrals, paid search, natural search and comparison engine referrals. Your site may also have a mix of B2B and B2C offerings. If you only analyze your data in aggregate, you'll design the wrong tests and apply them to the wrong visitors.
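To see how an average can hide the segments behind it, here is a small sketch that reproduces the bounce rate figures above. The visit counts are invented so that the numbers work out (roughly 61% new visitors); your own traffic mix will differ.

```python
# Hypothetical home-page traffic per segment: (visits, bounces).
segments = {
    "new": (6_100, 4_575),        # 75% bounce rate
    "returning": (3_900, 1_326),  # 34% bounce rate
}


def bounce_rate(visits, bounces):
    """Share of visits that bounced; 0.0 when there were no visits."""
    return bounces / visits if visits else 0.0


# Bounce rate for each segment on its own.
per_segment = {name: bounce_rate(v, b) for name, (v, b) in segments.items()}

# Bounce rate for all traffic lumped together.
total_visits = sum(v for v, _ in segments.values())
total_bounces = sum(b for _, b in segments.values())
overall = bounce_rate(total_visits, total_bounces)

for name, rate in per_segment.items():
    print(f"{name:>10}: {rate:.0%}")
print(f"{'overall':>10}: {overall:.0%}")
```

The overall figure sits between the two segment rates and describes neither group of visitors well.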
Sadly, some testing tools, including Google Website Optimizer, do not let you segment visitors. There are workarounds - if your ecommerce platform uses targeted selling, you may be able to create custom pages that are only served to certain visitor segments, and then split-test those pages with your testing tool.
3. Applying the "Radical Redesign" Concept to Individual Variables
If you caught our recent article Choosing Between A/B or Multivariate Test Design, you'll recall that there are 2 approaches to A/B tests: univariate and radical redesign. Univariate tests variations of one variable, while radical redesigns throw a bunch of things at the wall at once. Radical redesigns allow you to identify which design achieves best results, so you don't end up micro-tweaking the design you currently have with multivariate tests when you've left better performing layouts on the table.
The downside to radical redesigns is that when you change more than one thing at a time, you're never sure which element is responsible for any observed improvement. The same problem can creep into a univariate split test: the single variable under test is changed in more than one way across versions, so what is run as a univariate test should really be a multivariate test.
For example, you may wish to test a large thumbnail image with a real baby modeling a sleeper vs. a small thumbnail with the garment lying flat. The thumbnail image looks like one variable, but the two versions actually differ on 2 variables: image size and presentation. To be truly valid, the test should include:
Large thumbnail, baby model
Large thumbnail, flat garment
Small thumbnail, baby model
Small thumbnail, flat garment
Therefore, it is actually a multivariate test. A true univariate test might compare small, medium and large images of the garment, each modeled by a baby.
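The four versions listed above are simply every combination of the two variables. A quick way to enumerate them, so that no combination is left out when a third variable is added, might look like the sketch below. The factor names are illustrative labels, not settings from any particular testing tool.

```python
from itertools import product

# Each hidden variable in the thumbnail test, with its levels.
factors = {
    "thumbnail size": ["large", "small"],
    "presentation": ["baby model", "flat garment"],
}

# Full set of versions: one per combination of levels.
cells = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for number, cell in enumerate(cells, start=1):
    print(number, cell)
```

Adding a level or a factor multiplies the number of versions, which in turn increases the traffic needed to test them all.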
4. Measuring the Wrong KPIs
It's easy to become myopic about conversion rate, but improved conversion rate is not the goal for every page. Few customers convert directly from the home page; rather, they click through to a deeper page. The goals for the home page might be reduced bounce rates, increased click-throughs, and repeat visits. Pay attention to the relationship between key performance indicators, or KPIs. Sometimes when one metric goes up, another goes down - you may celebrate a higher average order value while overall revenue has tanked. Track everything that is important.
You also want to measure revenue per visitor for most tests, especially when testing offers and prices. Conversion rate might go up, while profitability goes down!
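Here is a small sketch of why the two metrics should be read together. The variant names and numbers are made up for illustration: a discount offer lifts conversion rate but lowers revenue per visitor.

```python
# Hypothetical results for two versions of an offer test.
variants = {
    "control": {"visitors": 5_000, "orders": 150, "revenue": 15_000.0},
    "discount": {"visitors": 5_000, "orders": 200, "revenue": 14_000.0},
}


def summarize(v):
    """Compute the KPIs discussed above for one variant."""
    return {
        "conversion_rate": v["orders"] / v["visitors"],
        "revenue_per_visitor": v["revenue"] / v["visitors"],
        "average_order_value": v["revenue"] / v["orders"] if v["orders"] else 0.0,
    }


report = {name: summarize(v) for name, v in variants.items()}

for name, kpis in report.items():
    print(
        f"{name:>8}: conversion {kpis['conversion_rate']:.1%}, "
        f"revenue/visitor ${kpis['revenue_per_visitor']:.2f}, "
        f"avg order ${kpis['average_order_value']:.2f}"
    )
```

Judged on conversion rate alone, the discount version wins; judged on revenue per visitor, it loses.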
5. Running a Test Too Short
Declaring a winner too soon (before enough data is collected and the test reaches statistical significance) increases your chance of a false positive or negative. Evan Miller shares a great example of the danger of this on his blog.
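To make "statistical significance" concrete, the sketch below applies a standard two-proportion z-test to the same observed conversion rates at two sample sizes. This is one common way to check significance, not necessarily the calculation your testing tool uses, and the visitor and conversion counts are hypothetical.

```python
from math import erfc, sqrt


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates (pooled z-test)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / std_err
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# Same observed rates (4% vs. 6%), early in the test and after more traffic.
early_p = two_proportion_p_value(20, 500, 30, 500)
later_p = two_proportion_p_value(200, 5_000, 300, 5_000)

print(f"early: p = {early_p:.3f}")
print(f"later: p = {later_p:.6f}")
```

With 500 visitors per version the difference is not significant at the usual 0.05 level, while the same rates at 5,000 visitors per version are. Note that this test assumes the sample size was fixed in advance; checking it repeatedly and stopping at the first significant reading raises the false positive rate.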