xuly.io
Engineering

How we detect KPI anomalies (with code)

EWMA, isolation forests, and the cost-aware hybrid we use in production.

Engineering · 2026-04-15 · 12 min read

When a customer's FTDs (first-time deposits) drop 40% overnight, they need to know today, not when they open the dashboard on Monday morning. Here's how we approach anomaly detection, what we tried, and what ended up in production.

The naive approach

The obvious first attempt is a static threshold: "alert if daily NGR < $X." This fails immediately for two reasons: (1) every customer has a different normal, so no single $X fits all of them, and (2) weekly seasonality means a routinely quiet day, such as Sunday, would trigger a false alarm every week.
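To make both failure modes concrete, here is a small sketch. The weekly pattern, the two customers, and the threshold of 80 are all invented for illustration and are not real customer data.

```python
# Hypothetical daily NGR for two customers over two weeks (Mon..Sun, twice).
WEEK_LARGE = [100, 105, 98, 102, 110, 90, 60]   # Sunday is routinely quiet
WEEK_SMALL = [20, 22, 19, 21, 23, 18, 12]       # same shape, smaller customer


def static_alerts(series, threshold):
    """Return the day indices where the value falls below a fixed threshold."""
    return [day for day, value in enumerate(series) if value < threshold]


large = WEEK_LARGE * 2
small = WEEK_SMALL * 2

# Both Sundays fire even though nothing is wrong.
print(static_alerts(large, threshold=80))        # [6, 13]
# The smaller customer is below the threshold every single day.
print(len(static_alerts(small, threshold=80)))   # 14
```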

Exponentially weighted moving averages

Our first real approach was to maintain a rolling EWMA per metric and z-score each new value against it. On its own this works well for stable metrics but can't handle weekly seasonality. We stacked a 7-day-lagged comparison on top, which fixed that: Monday is compared to last Monday, not to yesterday.

The math

μ_t = α · x_t + (1 − α) · μ_{t−1}    with α = 0.1 for smoothing
σ_t = square root of the EWMA of squared deviations from μ
alert if |x_t − μ_{t−7}| / σ_t > 3
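The three lines above could be sketched roughly as follows. This is a minimal illustration, not the production code: the function name, seeding μ with the first observation, reusing the same α for the variance, and measuring deviations from the previous mean are all assumptions. It also scores each point against σ from the previous step, so that a spike does not inflate its own denominator; the formula above writes σ_t.

```python
from math import sqrt


def ewma_alerts(series, alpha=0.1, lag=7, threshold=3.0):
    """Return indices t where |x_t - mu_{t-lag}| / sigma exceeds threshold.

    sigma is the square root of an EWMA of squared deviations, taken from
    step t-1 (an assumption made for this sketch).
    """
    mu = float(series[0])   # assumption: seed the mean with the first value
    var = 0.0
    mus = [mu]              # mus[t] holds mu_t
    alerts = []
    for t in range(1, len(series)):
        x = float(series[t])
        sigma = sqrt(var)
        # Only score once a lagged baseline exists and sigma is non-zero.
        if t >= lag and sigma > 0:
            z = abs(x - mus[t - lag]) / sigma
            if z > threshold:
                alerts.append(t)
        # Update the variance using the deviation from the previous mean,
        # then update the mean itself.
        var = alpha * (x - mu) ** 2 + (1 - alpha) * var
        mu = alpha * x + (1 - alpha) * mu
        mus.append(mu)
    return alerts


# Made-up daily FTD counts: stable around 100, then a 40% drop on the last day.
daily_ftds = [100, 102, 98, 101, 99, 103, 97, 100,
              102, 98, 101, 99, 103, 97, 100, 60]
print(ewma_alerts(daily_ftds))  # [15]
```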

Isolation forests for multivariate anomalies

Single-metric alerts miss correlated drops. If clicks are flat but conversions crater, that usually points to an attribution bug the univariate alert won't catch. We train a simple isolation forest per integration on the last 90 days of (clicks, signups, FTDs, deposits) rows and flag the rows in the bottom 1% of isolation scores.
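For readers who want to see the mechanics, here is a dependency-free sketch of an isolation forest over (clicks, signups, FTDs, deposits) rows. It is an illustration only: the post does not say which implementation is used, and in practice a library version such as scikit-learn's IsolationForest is the more likely choice. All function names, the tree count, the sample size, and the generated data are assumptions. Scores are negated so that lower means more anomalous, which matches the "bottom 1%" phrasing.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def _c(n):
    """Expected path length of an unsuccessful BST lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def _build(rows, depth, max_depth, rng):
    """Recursively split on a random feature at a random value."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    dim = rng.randrange(len(rows[0]))
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return ("node", dim, split,
            _build(left, depth + 1, max_depth, rng),
            _build(right, depth + 1, max_depth, rng))


def _path_length(row, node):
    depth = 0
    while node[0] == "node":
        _, dim, split, left, right = node
        node = left if row[dim] < split else right
        depth += 1
    # Leaves that still hold several rows get the expected extra depth.
    return depth + _c(node[1])


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(size))
    trees = [_build(rng.sample(rows, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, size


def isolation_scores(forest, rows):
    """Lower score = more anomalous (negated 2^(-mean_path / c(size)))."""
    trees, size = forest
    scores = []
    for row in rows:
        mean_path = sum(_path_length(row, t) for t in trees) / len(trees)
        scores.append(-(2.0 ** (-mean_path / _c(size))))
    return scores


def flag_bottom(scores, fraction=0.01):
    """Indices of the lowest-scoring rows (at least one row)."""
    k = max(1, int(len(scores) * fraction))
    return sorted(range(len(scores)), key=scores.__getitem__)[:k]


# Invented 90-day history: 89 ordinary days plus one day where clicks and
# signups are flat but FTDs and deposits crater.
data_rng = random.Random(42)
history = [(data_rng.gauss(1000, 50), data_rng.gauss(100, 10),
            data_rng.gauss(30, 3), data_rng.gauss(3000, 300))
           for _ in range(89)]
history.append((1000.0, 100.0, 2.0, 200.0))

forest = fit_forest(history)
flagged = flag_bottom(isolation_scores(forest, history))
print(flagged)  # the injected row (index 89) should be the one flagged
```

Note that a univariate check on clicks alone would see nothing unusual in the injected row; it only stands out when the four metrics are considered together.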

The cost problem

Running isolation forests at 5-minute cadence for 50,000 integrations gets expensive fast. Our hybrid: EWMA runs hot on every sync (cheap), isolation forest runs nightly (acceptable). For Business plans we upgrade to 30-minute isolation-forest cadence.
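A quick back-of-the-envelope calculation shows the scale involved. The 50,000 integrations and the cadences come from the text; the 5% share of Business-plan integrations is a purely hypothetical number chosen for illustration, and the function names are ours.

```python
MINUTES_PER_DAY = 24 * 60
INTEGRATIONS = 50_000  # figure from the text


def forest_runs_per_day(integrations, cadence_minutes):
    """Number of isolation-forest runs per day at a fixed cadence."""
    return integrations * (MINUTES_PER_DAY // cadence_minutes)


def hybrid_runs_per_day(integrations, business_share):
    """One nightly run for most integrations, 30-minute cadence for Business."""
    business = round(integrations * business_share)
    standard = integrations - business
    return standard + forest_runs_per_day(business, 30)


# Everything at 5-minute cadence: 288 runs per integration per day.
print(forest_runs_per_day(INTEGRATIONS, 5))      # 14400000
# Hybrid, assuming a hypothetical 5% of integrations are on Business plans.
print(hybrid_runs_per_day(INTEGRATIONS, 0.05))   # 167500
```

Under that assumed split, the hybrid needs roughly 1% of the forest runs that a uniform 5-minute cadence would.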

What we learned

  • Customers care more about recall than precision: a false alarm is annoying, but a missed crater is expensive.
  • Enrich every alert with context: "NGR dropped 40% vs 7-day baseline; telegram-push source accounts for most of it" is far more useful than "anomaly detected".
  • Let users snooze specific alerts. People know their business better than we do.
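
The enrichment and snooze ideas could be sketched like this. Everything here is hypothetical: the function names, the per-source breakdown as plain dictionaries, the (integration, metric) snooze key, and the sample numbers are assumptions made for illustration, not the production design.

```python
from datetime import datetime, timedelta


def describe_drop(metric, baseline_by_source, current_by_source):
    """Build a human-readable message naming the source behind most of a drop."""
    baseline = sum(baseline_by_source.values())
    current = sum(current_by_source.values())
    drop = baseline - current
    if baseline <= 0 or drop <= 0:
        return None  # nothing dropped, so nothing to explain
    sources = set(baseline_by_source) | set(current_by_source)
    deltas = {s: baseline_by_source.get(s, 0) - current_by_source.get(s, 0)
              for s in sources}
    top = max(deltas, key=deltas.get)
    pct = 100 * drop / baseline
    share = 100 * deltas[top] / drop
    return (f"{metric} dropped {pct:.0f}% vs 7-day baseline; "
            f"{top} accounts for {share:.0f}% of the drop")


def is_snoozed(snoozes, alert_key, now):
    """True if the user snoozed this alert and the snooze has not expired."""
    until = snoozes.get(alert_key)
    return until is not None and now < until


# Invented per-source NGR: 7-day baseline vs today.
baseline = {"telegram-push": 6000, "seo": 3000, "email": 1000}
today = {"telegram-push": 2400, "seo": 2800, "email": 800}
message = describe_drop("NGR", baseline, today)
print(message)
# NGR dropped 40% vs 7-day baseline; telegram-push accounts for 90% of the drop

now = datetime(2026, 4, 15, 9, 0)
snoozes = {("acme", "NGR"): now + timedelta(hours=24)}  # "acme" is a made-up integration
if not is_snoozed(snoozes, ("acme", "FTDs"), now):
    print("would send the FTDs alert")
```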