ML signal quality, asset-combination math, ad strength forensics, pinning trade-offs, and variant testing with real statistical significance windows.
Senior PPC managers running RSAs at scale should standardize on five practices: run at least 2 RSAs per ad group with "Good" or "Excellent" ad strength, fill all 15 headlines and 4 descriptions, pin only when compliance forces it (and pin 3+ assets per slot), pair every RSA with Smart Bidding (Target ROAS or Target CPA), and review the combinations report every 14 days. Google's own data (Google Ads Help, 2024) shows the 2nd RSA per ad group lifts conversions 6.6%, and the 3rd adds another 3.7%.
Below we walk through each one at Sara-level: what the doc says, what the ML actually rewards, and where the official advice quietly breaks under load. If you manage 10+ accounts, the gap between "filling out the form" and "feeding the ML the signal it needs" is where your year-over-year CTR delta lives.
RSAs are not a creative format. They are a signal-collection mechanism that Google renders as creative. The mental model matters because it tells you which knobs to turn (asset variety, combination space, significance windows) and which to leave alone (forcing a specific headline order because marketing wants it).
A responsive search ad is a single ad unit holding up to 15 headline variants and 4 description variants. Google's ML serves a 2-or-3 headline plus 1-or-2 description combination at auction time, picked per user signal. The unit competing in any given auction is one of hundreds of possible combinations.
Math first. Choosing 3 of 15 headlines gives 455 unordered triplets (2,730 if position order counts), before description pairing. Pair each triplet with one of 4 descriptions and you reach 1,820 possible served combinations per RSA, and more once two-description layouts are counted. A single hard pin on Headline 1 cuts the triplet count from 455 to 91.
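The arithmetic is easy to reproduce. The sketch below is a simplification that assumes a three-headline layout and exactly one description per served ad; the constant names are ours, not Google's.

```python
from math import comb, perm

HEADLINES = 15     # maximum headline assets per RSA
DESCRIPTIONS = 4   # maximum description assets per RSA
SLOTS = 3          # headline positions in the widest layout (assumption)

# Unordered triplets: which 3 headlines appear, ignoring position.
unordered_triplets = comb(HEADLINES, SLOTS)

# Ordered triplets: the same 3 headlines in a different order count separately.
ordered_triplets = perm(HEADLINES, SLOTS)

# Simplification: pair every unordered triplet with exactly one description.
with_one_description = unordered_triplets * DESCRIPTIONS

# One asset hard-pinned to Headline 1: only the other two slots still vary.
pinned_triplets = comb(HEADLINES - 1, SLOTS - 1)

print(unordered_triplets, ordered_triplets, with_one_description, pinned_triplets)
```

Counting two-headline and two-description layouts would push the totals higher, which is why we treat these figures as a floor.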
ML signal inputs are not exposed in the UI. The auction-time picker consumes query intent embedding, device class, time-of-day, audience signal, and a predicted CTR per candidate combination. The combinations report is the only post-hoc view of what got served and how it performed.
The combinations report is your forensic tool, not the Ads tab. It tells you which 8 combinations out of 1,800 ate 80% of your impressions and what their CTR delta looks like. We open it on day 14 of any new RSA. Before that, the data is too thin and you risk killing combinations that just had not been served yet.
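A minimal sketch of that concentration check, assuming you have exported the report into a mapping of combination label to impressions. The labels and numbers below are made up for illustration, and `top_share` is a hypothetical helper, not part of any Google tooling.

```python
def top_share(impressions_by_combo, share=0.80):
    """Return the fewest combinations (largest first) that cover `share` of impressions."""
    total = sum(impressions_by_combo.values())
    if total == 0:
        return []
    ranked = sorted(impressions_by_combo.items(), key=lambda kv: kv[1], reverse=True)
    picked, running = [], 0
    for combo, impressions in ranked:
        picked.append(combo)
        running += impressions
        if running / total >= share:
            break
    return picked

# Hypothetical export: combination label -> impressions
sample = {
    "H1+H4+H9 / D2": 5200,
    "H2+H4+H7 / D1": 2100,
    "H3+H5+H9 / D2": 900,
    "H1+H6+H8 / D3": 500,
    "H2+H3+H10 / D4": 300,
}
print(top_share(sample))
```

If two or three combinations already cover the threshold, that is your cue to look at what those headlines have in common before touching anything else.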
Use the full 15 headlines and 4 descriptions. Google's data: adding a 2nd RSA per ad group lifts conversions 6.6%, the 3rd RSA adds another 3.7%. Asset volume feeds ML signal quality. With fewer than 8 headlines and only 2 descriptions you are starving the system of signal.
Counter-nuance, where most accounts fail. 15 unique headlines is not "15 keyword variations of the same phrase". The ML treats near-duplicates as one signal. We bucket headlines: 5 keyword-led, 4 benefit-led ("free shipping", "30-day return"), 3 CTA-led ("Shop now"), 2 social proof ("Trusted by 12k"), 1 urgency ("Ends Sunday"). Otherwise the combination space collapses on its own.
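One way to police this before upload is a quick pre-flight check. The sketch below uses plain character similarity from Python's `difflib` as a stand-in for whatever Google actually measures; the 0.8 threshold and both helper names are our assumptions, and the target mix simply mirrors the bucket counts above.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations

# Bucket plan from the text: 5 + 4 + 3 + 2 + 1 = 15 headlines.
TARGET_MIX = {"keyword": 5, "benefit": 4, "cta": 3, "social_proof": 2, "urgency": 1}

def near_duplicates(headlines, threshold=0.8):
    """Return (a, b, ratio) for headline pairs at or above the similarity threshold."""
    pairs = []
    for a, b in combinations(headlines, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs

def mix_gaps(bucket_tags):
    """Return how many headlines each bucket is still missing versus TARGET_MIX."""
    counts = Counter(bucket_tags)
    return {bucket: need - counts[bucket]
            for bucket, need in TARGET_MIX.items() if counts[bucket] < need}

drafts = ["Buy Running Shoes", "Buy Running Shoes Now", "Free Shipping Today"]
print(near_duplicates(drafts))
print(mix_gaps(["keyword"] * 5 + ["benefit"] * 4 + ["cta"] * 3 + ["social_proof"] * 2))
```

String similarity misses paraphrases, so treat a clean result as a minimum bar rather than proof of variety.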
Character forensics. Headlines 30 chars, descriptions 90, path fields 15. Mobile truncation reality: descriptions clip around 60 characters on small screens, and the truncation is hard. Front-load the value prop in the first 50 chars.
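A small validator catches limit violations before the upload does. This sketch hard-codes the 30/90/15 limits and counts double-width characters as 2, per the rule in the FAQ further down. Mapping "double-width" to Unicode's East Asian Wide and Fullwidth classes is our assumption.

```python
import unicodedata

LIMITS = {"headline": 30, "description": 90, "path": 15}

def display_length(text):
    """Count characters, treating East Asian wide/fullwidth characters as 2 each."""
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1 for ch in text)

def check_asset(kind, text):
    """Return (fits, length) for an asset of kind 'headline', 'description', or 'path'."""
    length = display_length(text)
    return length <= LIMITS[kind], length

print(check_asset("headline", "Free Shipping on All Orders"))
print(check_asset("headline", "x" * 31))
```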
Diminishing returns kick in fast. Google's data says you hit roughly 90% of ML benefit at 8 headlines. Below 6, the system runs hungry and ad strength caps at "Average". Most operators stop at 10 because writing 5 more "good" headlines is hard. Generating them is the bottleneck, not the marginal performance.
Ad-strength-vs-Quality-Score note. Ad strength is its own creative score, while Quality Score is the auction-level score that includes expected CTR, ad relevance, and landing page experience. They overlap on the "ad relevance" component but they are not the same number. We have seen ad strength "Excellent" sit next to Quality Score 4 in the same row.
Pinning reduces the ML combination space. Pin only when legal or brand compliance requires it. When you do pin, pin 3 or more assets per slot, never a single asset. The reason is structural, not stylistic. A single pinned asset disables ML rotation for that slot entirely.
Three-tier framework we use across client SOPs:
| Pinning Policy | Combination Space | Ad Strength Ceiling | Use When |
|---|---|---|---|
| No pinning | ~1,800 combinations | Good to Excellent | Non-regulated industry, default |
| Pin 3 assets per slot | ~540 combinations | Typically Good | Brand-mandated tagline, controlled A/B |
| Pin 1 asset per slot | ~91 combinations | Average or worse | Pharma legal copy, financial disclaimers |
No pinning gives full combinatorial freedom: ML has maximum signal and ad strength typically lands at "Good" or better. This is the default for non-regulated industries.
Pin 3 assets per position lets the ML pick among 3 for that slot while everything else still mixes freely. Combination space drops to roughly 540. Ad strength typically stays "Good".
Pin 1 asset to a single position disables ML for that slot. Combination space drops to about 91. Ad strength caps at "Average" or worse. CTR typically drops 8-15% inside the first 14 days.
When pinning is legit: regulated industries (financial, healthcare, legal, where compliance requires disclaimers in Headline 1), brand-mandated taglines that cannot rotate (think pharma "ask your doctor"), or a controlled A/B test holding one creative element fixed while everything else rotates. The compliance case is real and we do not argue with it. The "marketing wants this headline in slot 1" case is where you push back.
When pinning hurts: trying to force the headline you think is best. The ML usually disagrees, because it has signals you do not. We have audited accounts where the top combination ranked the "obviously best" headline at position 3, served only on desktop, only in the afternoon, only against specific audience segments. Marketing had wanted it pinned to position 1. The data said no.
Safe pinning procedure when you must. Pin 3 assets per slot, hold for 7 days, pull the combinations report. If CTR has dropped more than 5% versus the unpinned baseline ad, unpin and replace those 3 assets with stronger variants. If ad strength dropped from "Good" to "Average", the ML is fighting you. Listen.
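The review step can be written down as a simple rule. This sketch reads the 5% threshold as a relative CTR drop, which is our interpretation, and returns flags rather than taking any action. `pin_review` and the flag names are hypothetical.

```python
STRENGTH_ORDER = ["Poor", "Average", "Good", "Excellent"]

def pin_review(baseline_ctr, pinned_ctr, strength_before, strength_after, max_drop=0.05):
    """Return warning flags for a pinned RSA after the hold period."""
    if baseline_ctr <= 0:
        raise ValueError("baseline_ctr must be positive")
    flags = []
    # Relative drop versus the unpinned baseline ad (assumed reading of "5%").
    relative_drop = (baseline_ctr - pinned_ctr) / baseline_ctr
    if relative_drop > max_drop:
        flags.append("ctr_drop")
    # Any downgrade in the ad strength label counts as a warning.
    if STRENGTH_ORDER.index(strength_after) < STRENGTH_ORDER.index(strength_before):
        flags.append("strength_drop")
    return flags

print(pin_review(0.040, 0.037, "Good", "Good"))
print(pin_review(0.040, 0.039, "Good", "Average"))
```

An empty list means the pin survived the week. Anything else is a prompt to unpin or swap in stronger assets.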
Ad strength is a creative completeness score, not a performance predictor. It rates the form-filling: did you provide enough headlines, are they unique, do they include keywords, do they follow popular ad text patterns. It correlates with performance only via the "more assets = more ML signal" mechanic, not because the score itself measures anything about conversion.
What ad strength measures: count of headlines, count of descriptions, headline uniqueness (string similarity), presence of ad-group keywords in headlines, and conformance to popular ad text patterns. A checklist completion score.
What ad strength does not measure: conversion rate, ROAS, brand alignment, actual relevance to your audience, landing page quality. Google's stat that advertisers improving ad strength from "Poor" to "Excellent" see 12% more conversions (Google Ads Help, 2024) is correlation, not causation. More headlines means more ML signal. The signal lifts performance. Ad strength is the receipt, not the engine.
Sara-level forensics. An "Excellent" RSA with 2.1% CTR can underperform a "Good" RSA with 3.4% CTR in the same ad group. We see this when the "Excellent" ad over-indexes on keyword stuffing (gaming the rubric) while the "Good" ad uses stronger benefit copy. The ad strength score rewarded the keyword-heavy ad. The auction rewarded the benefit-heavy ad.
Practical rule for managers running 200+ ad groups: target "Good" or better as a hygiene floor. Past that floor, optimize on CTR, conversion rate, and ROAS, not on chasing "Excellent". The lift past "Good" is small enough that your time is better spent on combinations report analysis. Past the hygiene floor, CTR and CPC issues often trace back to Quality Score and bid economics, not ad copy.
RSA testing is not classic A/B testing. Run 2 to 3 RSAs per ad group, let them split impressions naturally, hold for 14 to 21 days, then compare winners at statistical significance with at least 4,000 impressions per ad at 95% confidence. Anything shorter is noise. Anything narrower in scope and you are testing one combination against another, not one RSA against another.
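For the comparison itself, a pooled two-proportion z-test on CTR is one common choice. The sketch below is an illustration under that assumption, not a description of how Google Ads evaluates ads, and the click counts are invented.

```python
from math import erf, sqrt

def two_proportion_z(clicks_a, imps_a, clicks_b, imps_b):
    """Two-sided pooled z-test for a CTR difference; returns (z, p_value)."""
    p_a, p_b = clicks_a / imps_a, clicks_b / imps_b
    pooled = (clicks_a + clicks_b) / (imps_a + imps_b)
    se = sqrt(pooled * (1 - pooled) * (1 / imps_a + 1 / imps_b))
    if se == 0:
        return 0.0, 1.0  # no clicks (or all clicks) in both arms: nothing to test
    z = (p_b - p_a) / se
    # Standard normal CDF via erf, doubled for a two-sided p-value.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

z, p = two_proportion_z(200, 4000, 260, 4000)  # 5.0% vs 6.5% CTR
print(round(z, 2), p < 0.05)
```

A p-value under 0.05 corresponds to the 95% confidence bar. It says nothing about whether the winner will hold once the ML shifts its combination mix.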
Why classic A/B does not work for RSAs. The ML serves different combinations per impression. Your "Ad A vs Ad B" comparison is actually "A's top 8 combinations vs B's top 8". Variance is high. Rough rule, using a standard two-proportion test at p < 0.05 and 80% power: at a retail CTR of 5%, 4,000 impressions per ad only resolves a lift of about 1.5 pp. Detecting a 0.5 pp lift takes roughly 31,000 impressions per ad at a 5% baseline and roughly 14,000 at a B2B baseline of 2%.
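A standard two-proportion power calculation is one way to sanity-check any impression rule of thumb. The sketch below assumes a two-sided test and 80% power. The power level is our assumption, and `impressions_per_ad` is a hypothetical helper.

```python
from math import ceil
from statistics import NormalDist

def impressions_per_ad(baseline_ctr, lift_pp, alpha=0.05, power=0.80):
    """Impressions needed per ad to detect an absolute CTR lift given in percentage points."""
    p1 = baseline_ctr
    p2 = baseline_ctr + lift_pp / 100
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

print(impressions_per_ad(0.05, 0.5))  # retail-style baseline
print(impressions_per_ad(0.02, 0.5))  # B2B-style baseline
```

Smaller lifts push the requirement up quickly, so it is worth running the numbers for your own baseline CTR rather than relying on a single threshold.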
Sara-level methodology that holds up under audit:
Asset-level signal is where the real work happens. Alongside the combinations report, the asset-level view rates each headline individually. Bottom 20% headlines tagged "Low" are not necessarily bad copy. They are bad in this ad group, against this audience, paired with these other headlines. Replace them with variants targeted at the specific gap.
Variant testing was hours of weekly work per account before we automated it. Tooling that watches the combinations report and proposes targeted replacements is the leverage point. That is where Mira and Maximus come in. You can see the agent workflow live before activating anything on your account.
Mira reads the ad group keyword theme, audience signal, and top-converting search queries from the past 30 days, then generates 15 headline variants across 5 buckets in 12 to 30 seconds per ad group. Generation cadence runs every 14 days for ad groups where ad strength sits below "Good" or CTR sits below median. She does not regenerate ads already performing in the top quartile.
Maximus gates every Mira proposal through the approval workflow. On the Co-pilot tier, Maximus queues all variants for human review. On the Approval tier, he auto-pushes variants that pass safety rules (character limits, no banned terms, predicted ad strength "Good" or better) and queues edge cases. On the Autonomous tier, he pushes most variants but blocks: brand campaign creative changes, ads where current CTR sits in the top decile, and any variant that would change pinning policy. See the full B6 pricing tiers for autonomy-level details.
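To make the tier logic concrete, here is a toy routing function. It is a sketch of the policy as described in this paragraph, not B6's actual implementation. The `Variant` fields, the tier strings, and the choice to queue unsafe variants on the Autonomous tier are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    passes_safety: bool           # character limits, banned terms, predicted ad strength
    is_brand_campaign: bool       # creative change on a brand campaign
    current_ctr_top_decile: bool  # the live ad already sits in the top decile
    changes_pinning: bool         # variant would alter pinning policy

def route(variant, tier):
    """Return 'push', 'queue', or 'block' for a proposed variant."""
    if tier == "co-pilot":
        return "queue"  # everything goes to human review
    if tier == "approval":
        return "push" if variant.passes_safety else "queue"
    if tier == "autonomous":
        if (variant.is_brand_campaign or variant.current_ctr_top_decile
                or variant.changes_pinning):
            return "block"
        # Assumption: safety rules still apply on this tier.
        return "push" if variant.passes_safety else "queue"
    raise ValueError(f"unknown tier: {tier}")

safe = Variant(True, False, False, False)
print(route(safe, "co-pilot"), route(safe, "approval"), route(safe, "autonomous"))
```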
The frame is not "AI writes your creative". The frame is: AI generates candidates, the orchestrator enforces your safety policy at machine speed, you spend senior-manager attention on 3 edge cases per week. The bottleneck moves from "writing 15 unique headlines" to "reviewing 3 flagged variants".
Tracking quality is a precondition. Mira's cadence relies on accurate conversion signal, which is why we run conversion tracking diagnostics before activating variant generation. ML quality is downstream of measurement quality.
How many RSAs should I run per ad group? At least 2, maximum 3. Google caps the number at 3 RSAs per ad group. The conversion lift from RSA 2 is 6.6%, and from RSA 3 is another 3.7% on top of that (Google Ads Help, 2024). Running fewer than 2 means the ML cannot rotate at the ad-unit level, only at the combination level inside a single RSA.
What is the character limit for responsive search ads? Headlines are 30 characters each, descriptions are 90 characters each, and the two display URL path fields are 15 characters each. Total ad real estate caps near 300 characters when fully populated. Double-width characters (Japanese, Chinese, Korean) count as 2 characters each.
Should I pin headlines and descriptions? Only when compliance or brand legal requires it. Pinning collapses the ML combination space. If you must pin, pin 3 or more assets per slot, never 1. Pinning a single asset to a slot disables ML rotation for that slot and typically drops ad strength to "Average" or worse.
What is the difference between expanded text ads and responsive search ads? Expanded text ads (ETAs) were sunset by Google in June 2022. RSAs are the only text ad format for new creation since. ETAs were fixed creative with 3 headlines and 2 descriptions. RSAs hold up to 15 headlines and 4 descriptions that the ML rotates at auction.
What are the benefits of using broad match, Smart Bidding, and RSA together? This is Google's "power trio" since 2022. Broad match feeds the maximum query signal to the ML. Smart Bidding (Target ROAS or Target CPA) bids on conversion probability per impression. RSAs rotate creative per impression. The three together give the ML signal at three layers: query interpretation, bid optimization, and creative selection. Google's reported lift is roughly 20% more conversions versus exact match plus manual bidding plus static creative.
If you manage 10+ accounts and want to see what Mira would propose on your top 10 ad groups, connect Google Ads to B6 and let her run an audit. Output in roughly 90 seconds: per-ad-group ad strength gap, top 3 underperforming headlines per group, and a 15-headline variant set ready for Maximus to gate against your approval rules. Read-only access, no bid changes, no creative pushed without your sign-off.