Most outbound programmes run on assumptions. A subject line that felt strong in a brainstorm. An opener that seemed compelling when it was written. A call to action that made sense to the person who wrote it. Whether any of those assumptions are correct only becomes clear once real people start receiving the emails.
A/B testing is how we replace assumptions with data.
What we test and why
We test one variable at a time. Testing multiple things simultaneously makes it impossible to know which change caused which result. Isolating variables is what makes the data useful.
Subject lines are the first thing we test on any new sequence. They determine whether the email gets opened at all. Two subject lines can produce dramatically different open rates even when the email body is identical. We test at least two variants on every sequence from day one.
Openers are tested once we have stable open rate data. The opener is the first line of the email and the moment where a prospect decides whether to keep reading. A small change in how the opener is framed can have a significant impact on reply rates.
Angles and value propositions are tested when we want to understand which problem resonates most with a given ICP. Sometimes the same product can be positioned around efficiency, around revenue, or around risk reduction. Testing different angles tells us which one this specific audience responds to.
Call to action testing looks at how the ask is framed at the end of the email. A direct calendar link versus a question versus a soft permission ask. Different audiences respond differently to each format and the data tells us which one to use.
Sequence length and timing are tested over a longer window. Some audiences respond better to a tighter cadence. Others need more space between touches. We test timing adjustments once the core copy variables are settled.
How we run tests
For a test to produce meaningful data it needs a large enough sample and a long enough window. We do not call a winner after 20 sends.
Each variant runs until it reaches statistical significance, with a minimum of 100 sends per variant before we call a result. For smaller lists we run tests over a longer period rather than trying to compress them into a short window.
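The write-up above does not name a specific statistical test; a common way to check whether one variant's open rate is genuinely higher is a two-proportion z-test. The sketch below assumes that test, a 5% significance threshold, and the 100-send floor mentioned above. The counts are made up for the example.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(opens_a: int, sends_a: int, opens_b: int, sends_b: int) -> float:
    """Two-sided p-value for the difference in open rates between variants A and B,
    using a pooled two-proportion z-test."""
    rate_a = opens_a / sends_a
    rate_b = opens_b / sends_b
    pooled = (opens_a + opens_b) / (sends_a + sends_b)
    se = sqrt(pooled * (1 - pooled) * (1 / sends_a + 1 / sends_b))
    if se == 0:
        return 1.0  # identical results on both sides, no evidence of a difference
    z = (rate_a - rate_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

MIN_SENDS = 100   # per-variant floor before calling a result
ALPHA = 0.05      # assumed significance threshold

# Illustrative counts: variant A opened 64 of 150 sends, variant B opened 41 of 150.
p = two_proportion_p_value(64, 150, 41, 150)
if 150 >= MIN_SENDS and p < ALPHA:
    print(f"call the winner (p = {p:.3f})")
else:
    print(f"keep the test running (p = {p:.3f})")
```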
We run tests in parallel rather than sequentially where possible. This means both variants are live at the same time against similar contacts, which reduces the impact of timing and external factors on the results.
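One simple way to run variants in parallel against similar contacts is to shuffle the list and deal it out evenly across variants, so timing and list composition hit both versions equally. A sketch of that split, with hypothetical addresses and variant names:

```python
import random

def assign_variants(contacts: list[str], variants: list[str], seed: int = 7) -> dict[str, list[str]]:
    """Shuffle the contact list and deal it round-robin across variants so every
    variant runs in parallel against a comparable slice of the audience."""
    rng = random.Random(seed)       # fixed seed keeps the split reproducible
    shuffled = contacts[:]
    rng.shuffle(shuffled)
    groups: dict[str, list[str]] = {v: [] for v in variants}
    for i, contact in enumerate(shuffled):
        groups[variants[i % len(variants)]].append(contact)
    return groups

# Hypothetical prospect list split between two subject line variants.
groups = assign_variants(
    ["ana@acme.example", "ben@initech.example", "cara@globex.example", "dev@umbrella.example"],
    ["subject_a", "subject_b"],
)
print({variant: len(people) for variant, people in groups.items()})
```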
What happens with the results
When a test produces a clear winner we retire the losing variant and scale the winner across the relevant sequences. If the difference between variants is not statistically meaningful we document the finding, adjust the test parameters, and run again with a more distinct variable.
The results of every test get logged in your campaign record. Over time this builds a body of knowledge about what works for your specific audience that informs every new sequence we write.
How testing fits into the weekly cycle
Testing is not a separate phase that happens before the campaign is live. It runs continuously as part of the weekly optimisation cycle.
Every two weeks we review active test results, call winners where the data supports it, and set up the next round of tests based on what we learned. The campaign never stops improving.
FAQ
How long does it take to get meaningful test results?
It depends on your send volume. A campaign sending 200 emails a day will reach statistical significance on a subject line test within a week or two. A campaign sending 50 emails a day will take longer. We factor your volume into how we design the tests so we are always working with data that is actually reliable.
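To make the volume point concrete, here is a rough sample size calculation using a standard two-proportion formula. The open rates, significance level, and power are assumptions for the example, not figures from any campaign.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sends_per_variant(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sends needed per variant to detect the gap between open rates p1 and p2
    at the given significance level and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Illustrative figures: detecting a jump from a 30% to a 40% open rate.
n = sends_per_variant(0.30, 0.40)      # roughly 356 sends per variant
days_at_200 = n / (200 / 2)            # two variants share the daily volume -> ~4 days
days_at_50 = n / (50 / 2)              # -> ~14 days
print(n, round(days_at_200), round(days_at_50))
```

Under those assumptions, a 200-a-day campaign has enough data in a few days, while a 50-a-day campaign needs around two weeks for the same test, which is why volume drives the test schedule.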
Do you test every element of every email?
No. We prioritise the variables with the highest impact first. Subject lines and openers have the biggest effect on results so they get tested first. Once those are optimised we move to the next layer. Testing everything at once produces noise, not insight.
Can we suggest things to test?
Yes. If you have a hypothesis about a different angle, a different CTA, or a different framing you want to try, we build it into the test schedule. Some of the most useful tests come from ideas the client brings based on what they are hearing in sales conversations.