Built-in alerts are too noisy or too late. The Account Anomaly Detector script is brittle. The stack that actually works in 2026: rolling baselines, severity tiers, and an agent that classifies before it pages.
If you manage 5 or more Google Ads accounts, you already know the pattern. Monday morning, you open one of the smaller clients, and CPA is up 73% over a 4 day window. You scroll back. Day 1 was fine. Day 2 was fine. Day 3 was when the GA4 container deploy went out and Enhanced Conversions stopped firing on 41% of checkout events. You missed it because nobody paged you. The built-in Google Ads notification panel surfaced two things in that window: a payment method expiring in 60 days, and an "auto-applied recommendation" suggesting a budget raise on the campaign that just stopped converting.
This is the central problem with Google Ads anomaly detection in 2026. The built-in alerts are too noisy or too late. They fire on the things Google chose to monitor (disapprovals, budget caps, policy issues), not on the things that matter to a portfolio manager (CPA drift, conversion-rate breakage, click bombing, bid-algo overreach during Smart Bidding learning). The Account Anomaly Detector script that Google publishes is closer to right, but it is brittle: same-day-of-week mean, hard percentage thresholds, one email a day. It catches the obvious and misses everything subtle.
What you actually need is a stack: rolling baseline math, severity classification, and routing that knows the difference between "wake Sara up" and "log for the weekly digest." This article is the comparison guide for that stack: built-in alerts vs the script vs commercial monitors (Go-Insights, Promonavigator's collection, Optmyzr) vs an agent-based approach where a dedicated reviewer (Aegis on B6) classifies severity before anything escalates.
The 30 second triage when something feels wrong: pull the last 28 days, check if spend is off baseline, check if conversions are off baseline, check if the ratio between them is off baseline. Two of three drifting in the same direction is real. One of three drifting alone is almost always either tracking or seasonality.
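The triage above can be sketched in a few lines. This is a minimal illustration, not a production rule: the 25% tolerance, the function names, and the "mixed" outcome for metrics drifting in opposite directions are all assumptions added for the example.

```python
from statistics import mean

def direction(history, today, tolerance=0.25):
    """Return +1 / -1 / 0 when today is above / below / inside the baseline band."""
    baseline = mean(history)
    if baseline == 0:
        return 0
    change = (today - baseline) / baseline
    if change > tolerance:
        return 1
    if change < -tolerance:
        return -1
    return 0

def triage(spend_hist, conv_hist, spend_today, conv_today, tolerance=0.25):
    """Check spend, conversions, and their ratio against a 28 day baseline."""
    # Daily cost per conversion; days with zero conversions are skipped.
    ratio_hist = [s / c for s, c in zip(spend_hist, conv_hist) if c > 0]
    signals = [
        direction(spend_hist, spend_today, tolerance),
        direction(conv_hist, conv_today, tolerance),
    ]
    if ratio_hist and conv_today > 0:
        signals.append(direction(ratio_hist, spend_today / conv_today, tolerance))
    else:
        signals.append(0)  # ratio undefined today, treat as no signal
    up, down = signals.count(1), signals.count(-1)
    if max(up, down) >= 2:
        return "likely real"
    if up + down == 1:
        return "suspect tracking or seasonality"
    if up + down == 0:
        return "within baseline"
    return "mixed, inspect manually"  # hypothetical fourth outcome
```

Calling `triage([100] * 28, [10] * 28, 200, 20)` reports a likely real change, because spend and conversions both moved up together while the ratio held.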
The word "anomaly" gets thrown around loosely. Let's be precise. An anomaly is a deviation from a rolling baseline that exceeds an expected range. Three pieces matter: the rolling baseline (not a static threshold), the deviation measure (z-score or percentage), and the expected range (the false-positive budget you accept).
A rolling baseline is the mean of the metric over a trailing window, typically 14 or 28 days, computed for the equivalent slice of time. The trailing 14 day same-hour mean is what you compare today's 10:00 AM spend against. Not yesterday's 10:00 AM, and not the static daily budget. Sara's Tuesday 2 PM should be compared to the 2 PM hour on each of the 14 prior days, not to the previous Tuesday alone or to today's account total.
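A same-hour rolling baseline might be computed roughly like this. The input shape (a dict keyed by hour-truncated `datetime`) and the function name are assumptions for the sketch; a real pipeline would read hourly stats from a report or a warehouse table.

```python
from datetime import datetime, timedelta
from statistics import mean

def same_hour_baseline(hourly, now, window_days=14):
    """Mean of the metric at this hour of day over the trailing window.

    hourly maps a datetime truncated to the hour to the metric value.
    Returns None when no history exists for the slot.
    """
    samples = []
    for d in range(1, window_days + 1):
        slot = now - timedelta(days=d)  # same hour, d days ago
        if slot in hourly:
            samples.append(hourly[slot])
    return mean(samples) if samples else None
```

Only the matching hour on each prior day contributes, so a noisy 9 PM does not distort the 2 PM baseline.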
The deviation measure determines what counts as significant. Two standard deviations from the rolling mean gives you a roughly 5% expected false-positive rate on normally distributed data. Three standard deviations drops that to under 1%. Percentage thresholds are simpler but worse: a 30% jump on a campaign that normally moves 5% day-over-day is a real anomaly, but a 30% jump on a campaign that already moves 25% day-over-day is just Tuesday.
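To make the contrast concrete, here is a small sketch that scores the same 30% jump against a quiet campaign and a volatile one. The sample histories are invented for illustration.

```python
from statistics import mean, stdev

def z_score(history, today):
    """Number of sample standard deviations today sits from the rolling mean."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return 0.0 if today == mu else float("inf")
    return (today - mu) / sigma

# Invented histories: both average 100, but with very different day-to-day swings.
quiet = [100, 105, 95, 102, 98, 103, 97]
volatile = [100, 125, 75, 120, 80, 130, 70]

quiet_z = z_score(quiet, 130)        # far beyond 3 sigma: a real anomaly
volatile_z = z_score(volatile, 130)  # inside 2 sigma: just Tuesday
```

A flat 30% threshold would fire on both campaigns; the z-score fires only on the quiet one.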
In practice, after running anomaly detection across a portfolio for a year, the categories of "real" alerts cluster:
There are five layers of anomaly detection in the Google Ads ecosystem. They serve different roles. None is a complete solution alone.
Google built-in alerts and recommendations. Surface-level. Disapprovals, payment issues, "your campaign is limited by budget," some auto-applied recommendations. Useful as a floor. Insufficient as a primary signal. In-account notifications and email notifications cover the operational basics (billing, disapprovals, suspension). For MCC users, manager-account notifications are a separate setup that has to be enabled per child account.
The Account Anomaly Detector script. Google's own Apps Script solution, published in the Ads Scripts docs. It compares today's running stats against the average of the same day of week across the prior 26 weeks. Adjustable thresholds per metric in a Google Sheet. Single email per alert per day. Good baseline. Two weaknesses: the same-day-of-week mean breaks badly if the account has structural changes inside the 26 week window (campaign restructure, new product launch, seasonality shift), and the per-metric percentage thresholds do not scale across a portfolio of accounts with different volatility profiles.
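The same-day-of-week comparison can be illustrated with a short Python sketch. This is not the script's own code (the published solution runs inside Google Ads Scripts), and it simplifies to daily totals; the function names and the dict-of-dates input are assumptions.

```python
from datetime import date, timedelta
from statistics import mean

def same_weekday_average(daily, today, weeks=26):
    """Average the metric over the same weekday in each of the prior weeks."""
    samples = []
    for w in range(1, weeks + 1):
        day = today - timedelta(weeks=w)
        if day in daily:  # skip days with no recorded data
            samples.append(daily[day])
    return mean(samples) if samples else None

def breaches(current, baseline, pct_threshold):
    """True when current deviates from baseline by more than pct_threshold (0.3 = 30%)."""
    if not baseline:
        return False
    return abs(current - baseline) / baseline > pct_threshold
```

The weakness is visible in the shape of the code: every one of the 26 samples carries equal weight, so a restructure 10 weeks ago leaves 16 stale weeks pulling on the average.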
The Campaign Anomaly Detector (CAD v2). Open-sourced by Google in 2022 and rewritten in 2023, available on GitHub. Monitors at account level and campaign level, supports configurable past windows and current windows, has a 30-minute execution timeout, and ships with an interactive Google Sheets configuration tab. Closer to what Sara wants. Still rule-based rather than statistical, and the multi-account version requires load balancing across script instances.
Commercial monitors. Go-Insights routes anomaly detection into Slack, Teams, and email with 24/7 monitoring on CPC, spend, impressions, and similar metrics. Promonavigator's anomaly script collection bundles 14 different anomaly tracking scripts ranging from low Quality Score detection to suspicious-click filtering (one of which flags campaigns exceeding "30% invalid clicks during the day"). Optmyzr has a similar alerts layer inside its rule engine. Useful when you want pre-built routing and do not want to maintain the script yourself.
Agent-based detection. The newest layer. An agent runs continuously across the account, computes the rolling baseline, classifies the deviation against learned patterns, and decides whether to escalate, block a related action, or absorb the signal as noise. On B6, this is what Aegis does. The classification step is what separates an agent from a script: a percentage threshold can fire, but it cannot tell you why, and it cannot block a Smart Bidding change that is about to compound the anomaly.
| Layer | Detection model | Routing | Best fit |
|---|---|---|---|
| Google built-in | Rule-based notifications, recommendations | In-account + email | 1-account operators, baseline floor |
| Account Anomaly Detector script | Same-day-of-week mean, % thresholds, 26 wk window | Single email per alert per day | Single account, low setup cost |
| CAD v2 (GitHub) | Configurable past vs current window thresholds | Sheet log + email | Multi-campaign account, technical owner |
| Commercial monitor | Mostly rule-based, pre-built integrations | Slack, Teams, email, webhook | Multi-account agency, no internal eng |
| Agent-based (Aegis on B6) | Statistical + rule overrides + classifier | Severity-tiered routing with action blocking | Multi-account portfolio, autonomy required |
The honest version of "what threshold should I use" is: it depends on the metric and the account volatility. Here is the working set we use on portfolios of 5 to 30 accounts, calibrated to roughly 5% false-positive rate on the alerts that fire.
Spend pacing needs the "sustained 60 minutes" clause to kill 80% of single-blip noise. If pacing is genuinely off, work through the "Google Ads not spending full budget" playbook to tell pacing problems apart from CPC problems. For CPA, ±25% week-over-week is a "look at it" signal, ±50% is escalation, and the full diagnostic sequence is in our "ROAS dropped suddenly" walkthrough.
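Both rules are simple to express. The sketch below assumes pacing is sampled every 5 minutes as a list of out-of-band flags (oldest first); the sampling interval, the function names, and the tier labels are illustrative choices.

```python
import math

def sustained(flags, interval_minutes=5, required_minutes=60):
    """True only if every reading in the last required_minutes was out of band."""
    needed = math.ceil(required_minutes / interval_minutes)
    return len(flags) >= needed and all(flags[-needed:])

def cpa_tier(cpa_this_week, cpa_last_week):
    """Map week-over-week CPA change to the tiers described above."""
    if cpa_last_week <= 0:
        return "no baseline"
    change = abs(cpa_this_week - cpa_last_week) / cpa_last_week
    if change >= 0.50:
        return "escalate"
    if change >= 0.25:
        return "look at it"
    return "ok"
```

A single in-band reading inside the hour resets the pacing alert, which is what filters out the one-off blips.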
A drop in conversion rate greater than 30% sustained over 24 hours triggers "tracking suspicion" first, not "performance investigation." Nine times out of ten the data is wrong, not the campaign. A 3x click volume spike inside one hour on a single campaign with flat impression growth is click bombing until proven otherwise. Invalid traffic monitoring should already be filtering this, but invalid traffic detection runs after the fact and refunds you; it does not prevent the spend. For CPC, the same standard-deviation logic as CTR applies, plus a rule that the spike must persist across a 6 hour window. CPC fluctuates inside Smart Bidding learning periods routinely. Most of those signals are noise. The "CPC too high" diagnostic covers the durable CPC pattern.
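A rough sketch of the two pattern rules follows. The 10% tolerance used to define "flat" impression growth is an assumption, as are the function names; the 30% and 3x figures come from the text.

```python
def conversion_drop_label(cvr_baseline, cvr_last_24h):
    """Label a sustained conversion-rate drop as a tracking problem first."""
    if cvr_baseline <= 0:
        return None
    drop = (cvr_baseline - cvr_last_24h) / cvr_baseline
    return "tracking suspicion" if drop > 0.30 else None

def looks_like_click_bombing(clicks_hour, clicks_baseline,
                             impr_hour, impr_baseline, flat_tolerance=0.10):
    """3x clicks in one hour while impressions stay roughly flat."""
    if clicks_baseline <= 0 or impr_baseline <= 0:
        return False
    click_ratio = clicks_hour / clicks_baseline
    impr_growth = (impr_hour - impr_baseline) / impr_baseline
    return click_ratio >= 3 and impr_growth <= flat_tolerance
```

If impressions tripled along with clicks, the second rule stays quiet, since that pattern points to a demand spike rather than click bombing.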
Two thresholds always cause arguments inside teams. The first is whether to use z-score or percentage. The answer for a Sara-sized portfolio is z-score for the alert math, percentage for the human-readable description in the alert payload. "Spend on Campaign X is 2.4σ above rolling baseline (currently $342 vs expected $185 to $230)" is what the rule engine evaluates. "Spend on Campaign X jumped 78%" is what shows up in the Slack message. The second is whether seasonality should be hand-coded or learned. Hand-coded wins for portfolios under 50 accounts. The hand-coded version is two lines in the rule engine: "between Nov 20 and Dec 26, widen the spend band by 40%."
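Here is one way the two decisions could fit together. It assumes "widen the spend band by 40%" means multiplying the z-score limit by 1.4 during the holiday window; that interpretation, the function names, and the message format are illustrative.

```python
from datetime import date

def band_multiplier(day):
    """Hand-coded seasonality: widen the band 40% from Nov 20 through Dec 26."""
    if date(day.year, 11, 20) <= day <= date(day.year, 12, 26):
        return 1.4
    return 1.0

def evaluate(campaign, value, baseline_mean, sigma, day, z_limit=2.0):
    """Rule runs on z-score; the returned message is written in percent."""
    if sigma <= 0 or baseline_mean == 0:
        return None
    z = (value - baseline_mean) / sigma
    if abs(z) <= z_limit * band_multiplier(day):
        return None  # inside the (possibly widened) band
    pct = (value - baseline_mean) / baseline_mean * 100
    verb = "jumped" if pct > 0 else "dropped"
    return f"Spend on {campaign} {verb} {abs(pct):.0f}% ({z:+.1f} sigma vs rolling baseline)"
```

The same reading that alerts in March passes silently in early December, because the limit moves from 2.0 to 2.8 standard deviations.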
A good alerting system has four severity tiers and explicit routing rules. The fastest way to burn out a PPC team is to page on every severity-2 event.
The mapping matters more than the math. A statistical model that fires 200 severity-2 alerts per week is worse than a dumb threshold that fires 4 severity-1 alerts per week, because the 200 alerts get muted and then the 4 real ones get muted with them.
Aegis is the risk-review and anomaly-detection agent in the B6 multi-agent stack. Its job is to sit between the other agents and the production Google Ads account, classify every proposed change and every observed metric deviation, and either pass, escalate, or block. Aegis is the lead defense layer.
The Aegis loop is rule-augmented statistical. The rolling baseline is computed across the trailing 28 day window per campaign per hour-of-day. Deviations beyond 2σ enter the classifier. The classifier has explicit overrides for known patterns: brand campaign actions are always severity 1, anything touching the conversion tag is always severity 1, anything inside a Smart Bidding learning window gets de-prioritized one tier because volatility there is expected. The output is a severity-tagged alert with a recommended next action.
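The override logic described here can be pictured as a small function. This is a hypothetical sketch of the rules as stated, not Aegis's actual implementation: the function names, the four-tier numbering (1 is most severe), and the choice to let "always severity 1" win over the learning-window rule are assumptions.

```python
LOWEST_TIER = 4  # assumed: four tiers, 1 = most severe

def enters_classifier(z, z_limit=2.0):
    """Only deviations beyond the 2 sigma band are classified at all."""
    return abs(z) > z_limit

def classify(base_severity, is_brand, touches_conversion_tag, in_learning_window):
    """Apply the explicit overrides on top of a statistically derived tier."""
    if is_brand or touches_conversion_tag:
        return 1  # known high-risk patterns are always severity 1
    if in_learning_window:
        # Volatility is expected while Smart Bidding learns: drop one tier.
        return min(base_severity + 1, LOWEST_TIER)
    return base_severity
```

Under these assumptions a brand-campaign event stays at severity 1 even inside a learning window, while an ordinary severity-2 deviation in the same window is logged as severity 3.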
In Sprint 5, on a real Goodevas It client account, Aegis raised a risk score of 82/100 on a proposed Buzz bid action. The action was a "logical" bid cut on the top performer in a brand campaign. Aegis blocked it because two anomalies fired in the same minute: the brand campaign pattern (always severity 1) and a tracking-suspicion flag (conversion rate had drifted in the prior 6 hours). Buzz's proposed change would have killed a chunk of the account's revenue while masking the underlying tracking issue. The user got a single notification with the severity classification, the math, and the recommended next action ("verify tracking before reconsidering bid change"). No paging at 2 AM. No 47-alert Slack flood.
The other mascots are part of the chain. Sage feeds Aegis the keyword-level and audience-level signals that statistical baselines need. Buzz is the agent whose proposals Aegis reviews most often, since bid changes are the most frequent action class. Echo writes the incident note in the weekly digest, so the team has a written audit trail of every severity 2 and 3 event without having to scroll Slack.
The pitch is not "AI does anomaly detection instead of you." The pitch is: Aegis classifies severity in under 100 milliseconds per event, you spend your attention on the 3 to 5 severity-1 events per month that actually need an operator decision. Everything else is logged and digested. See pricing tiers for how the agent layer is packaged, or open a free Buzz audit on one of your accounts to see Aegis classification in action on real data.
If you cannot or will not buy a commercial layer or move to an agent-based stack, the buildable version is six steps. We have shipped this for clients who wanted to keep the logic in-house. It is not glamorous and it works.
A small team can stand this up in a long weekend if BigQuery is already in the stack. The maintenance cost is the weekly false-positive review and the occasional threshold recalibration when an account changes structure.
What is the Account Anomaly Detector script in Google Ads? Google's first-party Apps Script for anomaly detection, documented in the Google Ads Scripts docs. It compares the current day's running stats (impressions, clicks, conversions, cost) against the average of the same day of week over the prior 26 weeks. Thresholds are configurable per metric in a Google Sheet. It sends a single email per alert per day. It is a fine baseline but brittle as a primary detection layer for a 10+ account portfolio.
How do I set up alerts for unusual activity in Google Ads? Three layers. (1) Turn on the built-in in-account notifications and email notifications for the operational basics (payment, disapproval, suspension). (2) Deploy the Account Anomaly Detector script or CAD v2 for performance-metric anomalies. (3) Layer a routing tool on top (a commercial monitor such as Go-Insights that posts to Slack, or an agent like Aegis) for severity classification and on-call paging.
What is a normal false-positive rate for ad-account alerts? Aim for under 15% on severity 1 alerts and under 30% on severity 2. Higher on severity 3 is acceptable because that tier is supposed to be wide. If severity 1 false positives exceed 30%, you have either threshold drift, structural change in an account (campaign restructure, product launch) that nobody told the rule engine about, or both.
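A weekly false-positive review can be as simple as tallying reviewed alerts per tier. The sketch below assumes each reviewed alert is recorded as a `(severity, was_real)` pair; the record shape and function names are illustrative, and the targets are the 15% and 30% figures above.

```python
from collections import defaultdict

TARGETS = {1: 0.15, 2: 0.30}  # maximum acceptable false-positive rate per tier

def false_positive_rates(reviewed):
    """reviewed is a list of (severity, was_real) pairs from the weekly review."""
    totals = defaultdict(int)
    false_hits = defaultdict(int)
    for severity, was_real in reviewed:
        totals[severity] += 1
        if not was_real:
            false_hits[severity] += 1
    return {sev: false_hits[sev] / totals[sev] for sev in totals}

def tiers_over_target(reviewed):
    """Severity tiers whose false-positive rate is at or above its target."""
    rates = false_positive_rates(reviewed)
    return sorted(sev for sev, rate in rates.items()
                  if sev in TARGETS and rate >= TARGETS[sev])
```

Any tier this returns is a candidate for threshold recalibration or a check for unannounced structural changes in the account.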
Is z-score better than percentage thresholds? For alerting math, yes. Z-score adapts to the natural variance of the campaign, so a noisy Performance Max campaign does not page you every Tuesday for routine 25% swings. For the human-readable alert payload, percentage is better because it is faster to parse. Use both: z-score for the rule, percentage for the message.
Does Google have built-in anomaly detection? Partially. The Recommendations page surfaces some performance opportunities and warnings. The Anomalies card in Google Ad Manager is a beta feature on the publisher side, not the advertiser side. In Google Ads proper, you get notifications and recommendations but no true rolling-baseline anomaly engine. That is the gap the Account Anomaly Detector script and the commercial layers exist to fill.
Three things to take away. The built-in Google Ads notification system is a floor, not a ceiling. The Account Anomaly Detector script is the cheapest meaningful upgrade and worth deploying even if you plan to layer something on top of it. Severity classification matters more than statistical sophistication: a dumb threshold with the right routing is more useful than a clever model that pages on everything.
If you manage 5 or more accounts and you have ever missed a real anomaly because the team was triaging false positives, the next step is to install a classifier that knows the difference. Run a free Buzz + Aegis audit on one of your accounts and see severity classification on real data. Read-only access, no changes made without your approval, takes 10 minutes.
Anomaly detection is not about the alert. It is about the gap between the moment a problem starts and the moment a human knows. Close the gap.