Smart Bidding optimizes against last-touch conversions. Lift testing measures causal conversions. The two numbers can disagree by 30 to 50 percent. This is how to run the test that closes the gap.
Incrementality testing in Google Ads is a randomized controlled experiment that measures the causal lift of your ads. One matched group is exposed to your campaign (treatment); another sees no impression at all (control). The difference in conversions between the two groups is incremental lift: the conversions your ads actually caused, not the ones that would have happened anyway.
That last clause is where most ad accounts lose money. Smart Bidding optimizes against last-touch conversions. Lift testing measures causal conversions. The two numbers can disagree by 30 to 50 percent. If you have never run a lift test on your account, your Target ROAS is almost certainly mis-calibrated, and the algorithm is happily scaling spend on traffic that would have converted on its own.
Google ships two native ways to run a lift test: user-level Conversion Lift and geo-based incrementality experiments (rebuilt in November 2025 with a $5,000 minimum spend, down from roughly $100,000). You can also run a self-managed geo holdout outside the platform when you want full control. This article covers when to pick which, how to design a test that produces an answer you can actually use, and how to feed the result back into Smart Bidding.
The thing being measured is causal lift on a defined conversion event over a defined window for a specific exposure. Nothing more. A lift test will not tell you whether your creative is better than your competitor's, will not separate the brand-halo effect from the direct-response effect unless you designed it to, and will not magically reconcile your MMM with your platform attribution. It answers one question cleanly: if these specific impressions had not happened, how many of these specific conversions would still have occurred.
The Google Ads implementation of Conversion Lift uses an intent-to-treat design. The control group does not just see different ads. The control group sees nothing from your campaign, which is enforced by the ghost-ad mechanism: Google runs the auction, your bid wins or loses normally, and when it wins for a control user the impression is simply withheld and logged as a ghost. That preserves the auction dynamics that would have existed and gives you a clean treatment-versus-control comparison.
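The treatment-versus-control comparison comes down to a few lines of arithmetic. The sketch below is illustrative, not Google's reporting logic: the `Arm` class, the `lift_metrics` function, and the sample numbers are all hypothetical. It assumes the control arm can be scaled linearly to the size of the treatment arm.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    users: int         # users assigned to this arm
    conversions: int   # conversions observed inside the conversion window
    value: float       # conversion value observed inside the window


def lift_metrics(treatment: Arm, control: Arm, spend: float) -> dict:
    """Scale the control arm to treatment size, then difference the two arms."""
    scale = treatment.users / control.users
    baseline_conv = control.conversions * scale   # what treatment would have done with no ads
    baseline_value = control.value * scale
    inc_conv = treatment.conversions - baseline_conv
    inc_value = treatment.value - baseline_value
    return {
        "incremental_conversions": inc_conv,
        "relative_lift": inc_conv / baseline_conv,
        "incremental_value": inc_value,
        # Cost per incremental conversion is undefined when lift is zero or negative.
        "icpa": spend / inc_conv if inc_conv > 0 else float("inf"),
        "iroas": inc_value / spend,
    }


# Made-up numbers: a 2:1 treatment/control split and $20,000 of spend.
treatment = Arm(users=100_000, conversions=1_200, value=96_000.0)
control = Arm(users=50_000, conversions=500, value=40_000.0)
metrics = lift_metrics(treatment, control, spend=20_000.0)
```

With these inputs the scaled control baseline is 1,000 conversions, so 200 of the 1,200 treatment conversions are incremental. A point estimate like this still needs a confidence interval before anyone acts on it.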
Three things people often confuse with incrementality and shouldn't:
Google offers two lift products inside the Ads UI and they answer different questions.
Conversion Lift (user-level). Conversion Lift "isn't available for all Google Ads accounts. To use Conversion Lift, contact your Google account representative," per the official help center. When you do get access, the experiment randomizes at the user level using the ghost-ad mechanism described above. Reports return Incremental Conversions, Relative Conversion Lift, Incremental Conversion Value, Incremental Cost Per Action, and Incremental Return on Ad Spend for studies with conversion values. The honest constraint: Conversion Lift requires meaningful user-level data, and post-cookie environments have made user-level studies harder to qualify for. Many mid-market advertisers will not be approved.
Geo-based experiments / incrementality experiments. This is the path Google rebuilt in November 2025. The update dropped the minimum spend from approximately $100,000 per experiment to $5,000, brought "up to 50% more conclusive" results, and added a redesigned interface with custom test-size controls and configurable confidence levels. Geo experiments work by holding out entire DMAs or regions: ads run as usual in treatment markets and are paused in control markets, and the difference in the conversion rate between matched geos is the lift. The reports return Incremental ROAS, Incremental Conversions, Incremental Conversion Value, and Incremental Cost.
| Method | Access | Min spend | Unit | Best for | Limitation |
|---|---|---|---|---|---|
| Conversion Lift (user-level) | On request via Google account rep | Not publicly stated (account-scale gated) | Users (ghost-ad mechanism) | Large accounts, well-tracked user-level conversions | Most mid-market accounts not eligible |
| Geo experiments (Google native) | Self-serve in Ads UI (Nov 2025 update) | $5,000 per experiment | DMAs / geos | Omnichannel businesses, mid-market, offline sales | Requires meaningful geo separation |
| Self-run geo holdout | Always available | No platform minimum, but need ~1K conversions/arm | Hand-matched DMAs or synthetic control | Custom hypotheses, multi-channel tests | Requires analyst time + matching effort |
When to pick which. User-level Conversion Lift gives you more precise answers when your account scale qualifies and your conversions are well-tracked at the user level. Geo-experiments work better for omnichannel businesses (offline sales, app installs, considered purchases) and for mid-market accounts that cannot get Conversion Lift access. Industry survey data Google cited with the November 2025 update: "80% of senior US marketing analytics professionals report incrementality experiment insights significantly impact revenue growth." That number maps to the audience this article is written for.
Six steps. Skip any of them and you will produce a number that looks like an answer but is not.
The honest range, synthesized from Haus, Fusepoint, and what we have observed on B6 accounts:
A lift result of zero (or negative) is a valid finding, not a failure of the experiment. "This campaign produces no measurable incremental revenue" is genuinely useful: redirect that budget. The Haus quote that captures this best: "Turning off a campaign would only decrease total sales by 30 percent of what Google attributes to it." Sit with that number for a second. For the related diagnostic when ROAS suddenly shifts after a budget move, see our ROAS dropped suddenly walkthrough. For the broader Performance Max diagnostic when lift comes back weak, see Performance Max not converting.
The list of ways a self-run lift test goes wrong is long enough that we run through it every time a team designs one.
Contamination. A user sees ads on one device and is in the control group on another. Cookie loss reassigns users mid-test. Treatment and control geos share a commuter zone (the classic Manhattan-Newark problem). Each contaminates the result, usually biasing it toward underestimating lift.
Underpowered tests. Two weeks, 200 conversions per arm, 30 percent reported lift, no significance. The test ran, the number exists, the number is meaningless. We see senior teams ship recommendations off underpowered tests more often than we should.
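A quick power check before launch catches most underpowered designs. The sketch below uses the standard normal-approximation formula for comparing two proportions. It assumes user-level randomization; geo tests have far fewer units and noisier outcomes, so treat the result there as a lower bound at best. The function name `users_per_arm` and the 2 percent baseline conversion rate are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def users_per_arm(base_rate: float, relative_lift: float,
                  alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm for a two-sided two-proportion z-test."""
    p_control = base_rate
    p_treat = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for significance
    z_power = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_treat - p_control) ** 2)


# Hypothetical scenario: 2% baseline conversion rate, 10% relative lift to detect.
n = users_per_arm(base_rate=0.02, relative_lift=0.10)
expected_control_conversions = n * 0.02
```

Halving the detectable lift roughly quadruples the required sample, which is why a short test on a small account rarely resolves anything but very large effects.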
Reading early. Considered purchases can convert on day 22, long after a 14-day lookback has closed. If you read the lift result before the full conversion window closes, you are reading half the story. The conversion window must close before analysis starts.
Seasonality contamination. Comparing a December treatment period against a November pre-period without adjustment will produce 40-percent "lift" that is just Q4 demand. Always include seasonal controls or run during a stable window.
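One common form of seasonal control is a difference-in-differences comparison: use the control geos' growth over the same period as the counterfactual for the treatment geos. The sketch below is a minimal version with made-up conversion counts. The function name `did_lift` is hypothetical, and the approach assumes treatment and control geos would have trended in parallel without the ads.

```python
def did_lift(treat_pre: float, treat_post: float,
             ctrl_pre: float, ctrl_post: float) -> float:
    """Relative lift after removing the growth that control geos also experienced."""
    ctrl_growth = ctrl_post / ctrl_pre          # seasonal growth with no ads running
    expected_post = treat_pre * ctrl_growth     # counterfactual for the treatment geos
    return (treat_post - expected_post) / expected_post


# Made-up conversion counts: November pre-period versus December test period.
treat_pre, treat_post = 1_000, 1_540
ctrl_pre, ctrl_post = 800, 1_120                # control geos grew 40% on Q4 demand alone

naive_lift = (treat_post - treat_pre) / treat_pre              # pre/post only: about 0.54
adjusted_lift = did_lift(treat_pre, treat_post, ctrl_pre, ctrl_post)  # about 0.10
```

In this example the naive pre/post read shows 54 percent "lift", while the seasonally adjusted read is about 10 percent. The 40-point gap is December demand, not ad effect.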
Letting Smart Bidding re-optimize mid-test. If you change tROAS, tCPA, budget, or audience signals during the test, the treatment is no longer stable. Either freeze the campaign settings or accept that the lift you measured is for the average of two different treatments.
Running lift with no challenger. A lift test on a brand campaign with no holdout geo and no creative variant measures nothing. We have seen this pitched as "we are running an incrementality test" three times this year. It was always a non-experiment.
This is the section most lift articles skip. The output of a lift test is not a slide for the QBR. It is a multiplier you apply to your bidding inputs.
The mechanic is simple. Smart Bidding optimizes against the conversions you send it. If your lift study shows 60 percent of last-click conversions are incremental, then 40 percent of the conversions Smart Bidding is optimizing against would have happened anyway. Multiply your conversion value feed (or your conversion count, if you bid on Target CPA) by the incrementality factor (0.6 in this example) before sending it to the bidding algorithm. The result: Smart Bidding starts targeting causal conversions instead of correlated ones.
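The adjustment itself can be sketched as a pre-processing step on the conversion feed. Everything named here is hypothetical: the `Conversion` record, the `adjust_values` helper, the campaign names, and the factors. How the adjusted values reach Google Ads (offline conversion import, a server-side tag, or another route) depends on your setup and is out of scope.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Conversion:
    click_id: str    # identifier used to match the conversion back to a click
    campaign: str
    value: float     # last-click attributed conversion value


def adjust_values(conversions: list[Conversion],
                  factors: dict[str, float],
                  default: float = 1.0) -> list[Conversion]:
    """Scale each conversion value by its campaign's measured incrementality factor."""
    return [
        replace(c, value=round(c.value * factors.get(c.campaign, default), 2))
        for c in conversions
    ]


# Hypothetical factors from a lift study: 60% of brand_search value was incremental.
factors = {"brand_search": 0.6}
feed = [
    Conversion("click-1", "brand_search", 120.0),
    Conversion("click-2", "prospecting", 80.0),   # untested campaign keeps its value
]
adjusted = adjust_values(feed, factors)
```

Campaigns with no lift study fall back to a factor of 1.0 here, which amounts to assuming they are fully incremental. A more conservative default is a judgment call, not something the lift test decides for you.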
The operational rules we use on B6 accounts:
This loop is the practical version of Google's recommendation to "combine incrementality testing with AI solutions" in the Think with Google framework. The frame matters. AI bidding is not the problem. The problem is feeding AI bidding a non-causal conversion signal and acting surprised when the algorithm optimizes against the wrong thing. See our AI-powered PPC optimization strategies for the full feedback-loop pattern, and the Quality Score guide for the related Smart Bidding signal-quality discussion.
What's the difference between incrementality testing and A/B testing? A/B testing compares two versions of a treatment (ad A versus ad B) and tells you which is better. Incrementality testing compares treatment against no-treatment and tells you whether the campaign should run at all. They answer different questions and are not interchangeable.
Does Google Ads have built-in incrementality testing? Yes, two tools. User-level Conversion Lift, available on request via your Google account representative. Geo-based incrementality experiments, rebuilt in November 2025 with a $5,000 minimum spend. Both run inside the Ads UI.
What's the minimum spend to run a lift test? Google's geo-based experiments require $5,000 per experiment as of late 2025. A self-run geo holdout outside the platform has no minimum, but realistically needs at least 1,000 conversions per arm to detect a 10 percent effect.
How long should a lift test run? Four weeks minimum for short-lookback e-commerce. Six to eight weeks for considered purchases or any account under 200 conversions per week. Always wait for the conversion window to close before reading the result.
Can incrementality testing prove Performance Max is working? It can prove whether PMax produces incremental conversions at the campaign level, but because PMax bundles Search, Shopping, Display, and YouTube, the result is a blended number. To isolate components, you need to layer in either a PMax-versus-no-PMax holdout or a channel-level diagnostic. For broader PMax diagnostics, see our Performance Max not converting playbook. For sudden ROAS shifts, see ROAS dropped suddenly. For the deeper Smart Bidding context, see the RSA best practices guide.
Smart Bidding will gladly scale a campaign that adds zero incremental revenue, because Smart Bidding cannot tell the difference between a conversion it caused and a conversion that would have happened in its absence. Lift testing is the only empirical bridge. Without it, you are tuning a multi-million-dollar bidding algorithm on a signal you have never validated.
The B6 stack treats this as the core measurement loop. Sage designs the lift test (treatment unit, control match, sample size, conversion window), runs the analysis when the window closes, and reports the causal numbers with confidence intervals. Vox translates the result into a budget reallocation proposal, telling you which campaigns deserve more spend, which deserve less, and where to redirect the cuts. Buzz retunes tROAS, tCPA, and the conversion value feed so Smart Bidding is targeting causal value. The whole loop runs quarterly, with no QBR deck required.
The cost structure: B6 at $199 a month on the Approval tier, versus Optmyzr at roughly $499 a month for recommendation-only insights, versus the typical agency that will quote $4-8K for one custom incrementality study. Connect your account at /chat and Sage will design a default geo-lift across your top three campaigns inside 5 minutes. Pricing tiers and what each agent does in each tier are at /pricing.