The Search Marketers Guide to Creative Testing and Optimization, Part 3

Kye Mou
August 22, 2012

From selecting creative elements for testing to reaching statistical significance, this four-part blog series reviews basic and advanced tips for conducting a successful creative test. Last week, we discussed how to prioritize and test keyword tokens and leverage dynamic keyword insertion to increase creative relevancy. Today, in part three of the series, we’ll walk through how to prioritize tests based on return and why limiting test elements matters.

Prioritize Tests Based On Return

As paid search programs grow, it becomes increasingly challenging to implement and manage creative tests across all groups within an account. To optimize creative at scale, prioritize tests on the groups with the most potential to shift overall account performance. These groups hold a high share of the account's impressions, clicks or conversions.

Due to limited resources, our fictional retailer, PowPow Sports, decided to test creative in only two of the groups within its account. Group A received 10,000 impressions per week, while group B received 1,000. Each test resulted in an equal improvement in performance within its respective group. The table below shows the improvement in each group after creative testing, along with the potential performance of an untested group, Other.

[Table: Prioritize Creative Tests Based on Return]

This example simplifies a common challenge, where groups with little to no volume are prioritized over other, higher-volume groups. Though both groups benefited from a creative test, group A experienced a greater increase in clicks and conversions. Each test took the same amount of time to implement, but the test in group A produced a greater revenue return on the time invested. Prioritizing creative tests for high-volume groups offers the greatest potential for incremental improvements in overall account performance.
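The arithmetic behind this comparison can be sketched in a few lines of Python. The impression counts come from the example above; the 2% baseline CTR and the 10% relative lift are hypothetical values chosen only for illustration, since the original table's figures are not reproduced here.

```python
def incremental_clicks(weekly_impressions, baseline_ctr, relative_lift):
    """Extra weekly clicks gained when CTR improves by relative_lift."""
    return weekly_impressions * baseline_ctr * relative_lift

# Weekly impressions per group, taken from the PowPow Sports example.
groups = {"A": 10_000, "B": 1_000}

# Hypothetical assumptions: 2% baseline CTR and a 10% relative lift in both groups.
gains = {
    name: incremental_clicks(impressions, baseline_ctr=0.02, relative_lift=0.10)
    for name, impressions in groups.items()
}

for name, extra in gains.items():
    print(f"Group {name}: about {extra:.0f} extra clicks per week")
```

With an identical lift, the gain scales directly with volume, so group A's return is ten times group B's for the same implementation effort.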

Limit Test Elements

A new creative might be subject to one or many test elements. It can be triggered by a single set or multiple sets of keyword tokens. It might also share impressions with one or many other creative within the group. Without controlling these variables, it becomes difficult to reach statistical significance and to determine which factors contributed to a successful or unsuccessful creative test.

Limiting the number of elements within a creative test makes it easier to identify why one creative performed better than another. For example, assume that PowPow Sports is testing two new creative. Creative B tests a free shipping offer, while creative C tests several formatting and language elements at once. Even if creative C outperforms the original, it would be unclear which of its test elements contributed to that success. Testing one element at a time better isolates each element's individual impact on creative performance.

Good Test

Tests a single element in Description Line 2


[Images: "Shop PowPow Sports" ad, original and good test versions]

Bad Test

Tests too many elements across the entire creative


[Images: "Shop PowPow Sports" ad, original and bad test versions]

To promote an optimal creative testing environment, keep keyword lists concise when building out new campaigns and groups. Groups that contain a small set of highly granular keywords allow the creative within them to focus on a small set of tokens. Rather than having to test tokens to improve relevancy, creative within these groups can test compelling offers and calls-to-action that drive greater increases in CTR and conversion rate.

The rate at which a creative test reaches statistical significance depends on the number of creative within the group. Because impressions are shared among every creative in the group, testing a large number of creative requires a large number of impressions. For smaller, low-volume groups, this requirement becomes an issue. For a group that receives only 1,000 total monthly impressions, testing ten creative variations might take several months to reach statistical significance.
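To get a rough feel for this effect, the sketch below estimates how long a test might run as the number of creative grows. It is not a method described in this series: it uses a standard sample-size approximation for comparing two proportions, and it assumes impressions rotate evenly across creative, a 5% baseline CTR, a 10% CTR for the challenger, 95% confidence and 80% power. All of those values and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def impressions_per_creative(baseline_ctr, test_ctr, alpha=0.05, power=0.80):
    """Approximate impressions each creative needs to detect the CTR difference."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)
    pooled = (baseline_ctr + test_ctr) / 2
    numerator = (
        z_alpha * math.sqrt(2 * pooled * (1 - pooled))
        + z_power * math.sqrt(
            baseline_ctr * (1 - baseline_ctr) + test_ctr * (1 - test_ctr)
        )
    ) ** 2
    return math.ceil(numerator / (test_ctr - baseline_ctr) ** 2)

def months_to_significance(monthly_impressions, num_creative, baseline_ctr, test_ctr):
    """Months needed, assuming impressions rotate evenly across all creative."""
    per_creative_monthly = monthly_impressions / num_creative
    return impressions_per_creative(baseline_ctr, test_ctr) / per_creative_monthly

# Hypothetical group with 1,000 monthly impressions, as in the example above.
two = months_to_significance(1_000, 2, 0.05, 0.10)
ten = months_to_significance(1_000, 10, 0.05, 0.10)
print(f"2 creative: {two:.1f} months, 10 creative: {ten:.1f} months")
```

Under these assumptions, the ten-creative test takes five times as long as the two-creative test, because each creative receives one fifth of the impressions. Smaller CTR differences would lengthen both estimates considerably.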

For larger, high-volume groups, reaching statistical significance is less of a concern. However, the opportunity cost of running underperforming creative must be monitored much more closely. Underperforming creative within these groups accrue a high volume of impressions that would be better served to top-performing creative, so they should be paused once statistical significance is reached.
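One common way to decide when that point has been reached is a two-proportion z-test on click-through rates. The sketch below is an illustration under stated assumptions rather than the approach prescribed by this guide: the function names, the 5% significance threshold and the sample figures are all hypothetical.

```python
import math
from statistics import NormalDist

def ctr_p_value(clicks_a, imps_a, clicks_b, imps_b):
    """Two-sided p-value for the difference in CTR between two creative."""
    pooled = (clicks_a + clicks_b) / (imps_a + imps_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / imps_a + 1 / imps_b))
    if std_err == 0:
        return 1.0  # no clicks (or all clicks) on both: nothing to distinguish
    z = (clicks_a / imps_a - clicks_b / imps_b) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))

def should_pause(clicks_top, imps_top, clicks_low, imps_low, alpha=0.05):
    """True when the second creative has a significantly lower CTR than the first."""
    lower_ctr = clicks_low / imps_low < clicks_top / imps_top
    p_value = ctr_p_value(clicks_top, imps_top, clicks_low, imps_low)
    return lower_ctr and p_value < alpha

# Hypothetical high-volume group: 5.0% vs 4.0% CTR on 10,000 impressions each.
print(should_pause(500, 10_000, 400, 10_000))
# Hypothetical low-volume group: 5.0% vs 4.5% CTR on 1,000 impressions each.
print(should_pause(50, 1_000, 45, 1_000))
```

The same check can be applied to conversion rate by passing conversions and clicks in place of clicks and impressions.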

To Be Continued

Adhering to best practices and avoiding common pitfalls will help ensure that new iterations of creative will incrementally improve account performance. Though search marketers cannot guarantee that all creative tests will be successful, they can guarantee that all creative tests have been set up for success. In part four of this series, we’ll review three additional best practices for creative optimization.

Download The Search Marketers Guide to Creative Testing and Optimization for additional best practices and two case studies from BoostCTR on how they successfully test and optimize creative.
