Customer Health Score Model
Simple, explainable score that predicts churn and expansion — and triggers the right plays.
Audience & situation
For CS/RevOps leaders who need a trustworthy health signal that both predicts risk/opportunity and directs action. Use this when your team debates gut-feel vs. dashboards, churn surprises keep happening, or expansion timing is unclear.
Introduction
Most health scores fail for one of two reasons: they are either a black-box data project nobody trusts, or a subjective traffic light that doesn’t change outcomes. A good score must be both predictive and explainable: predictive so leaders can forecast churn and expansion and allocate scarce time; explainable so CSMs and customers accept the signal and act on it.
The purpose is not to reduce reality to a single number; it is to collapse noisy signals—usage, outcomes, sponsorship, support, finance—into a coherent narrative that drives consistent plays. The moment a customer flips from green to amber, everyone should know why and what happens next: an exec call, a success plan, enablement, or a commercial checkpoint.
We design for simplicity: five to seven factors with clear definitions and weights; a 0–100 score that rolls into three buckets (Green/Amber/Red); and a weekly refresh with data latency that is understood. We avoid vanity signals and duplicate entry by pulling from systems of record (product analytics, ticketing, billing, CRM) and minimizing manual fields to the few that truly matter (e.g., executive sponsorship).
Calibration matters more than clever math. We start with a crisp hypothesis, back-test against history, pressure-test with CSMs, and then run a closed loop for a quarter: every churn or expansion event is compared to the account's score 30, 60, and 90 days prior. We tune thresholds and weights with evidence, not opinions.
Finally, we build governance so the score stays relevant: quarterly factor reviews, change logs, and a rule that for every new field added, one must be removed. If the model can’t be explained in two minutes to a CFO or a Head of Ops, it’s too complex. This playbook shows exactly how to build, launch, and run a score your org will trust.
What good looks like
- Explainable: 5–7 factors, each with a plain-English definition, weight, and data source.
- Predictive: ≥70% of last quarter's churned accounts were Red/Amber at least 30 days before they churned; ≥60% of expansions came from Green accounts.
- Actionable: each bucket maps to a playbook with owners, SLAs and checklists.
- Audited: change log; quarterly calibration; back-test results published.
- Low admin: 80–90% of fields auto-populated; minimal manual entries with clear owners.
Common pitfalls
- Black box: data science score no one can explain → low adoption.
- Lagging only: tickets and NPS dominate → risk detected too late.
- Overfitting: perfect back-test, poor real-world signal → keep it simple.
- Action gap: buckets without plays → drama, no outcomes.
- Manual bloat: CSMs asked to update 15 fields weekly → decay and distrust.
Playbook
1) Define factors & weights (keep to 5–7)
- Usage (30%): WAU/MAU vs. licensed seats; key feature adoption; trend vs. baseline.
- Outcomes (20%): 1–2 customer KPIs tied to value hypothesis; validation cadence.
- Sponsorship (20%): exec sponsor named; last EB touch; meeting cadence.
- Support (10%): time to resolve; P1/P2 in last 30–90 days; CSAT.
- Financial (10–20%): invoices on time; credit risk; contract risk flags. Use 20% if you drop the optional factor, so the weights still sum to 100.
- Optional (10%): integration health, admin engagement, product feedback severity.
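A minimal sketch of how the weights above might be stored and checked. It assumes Financial is set to 10% and the optional slot is used for integration health; the names `WEIGHTS` and `validate_weights` are illustrative, not part of any specific tool.

```python
# Hypothetical factor weights in percent. Financial is 10 here because
# the optional factor is included; use 20 if you drop it.
WEIGHTS = {
    "usage": 30,
    "outcomes": 20,
    "sponsorship": 20,
    "support": 10,
    "financial": 10,
    "integration_health": 10,  # the optional factor
}


def validate_weights(weights: dict[str, int]) -> None:
    """Raise ValueError unless there are 5-7 factors whose weights sum to 100."""
    if not 5 <= len(weights) <= 7:
        raise ValueError(f"expected 5-7 factors, got {len(weights)}")
    total = sum(weights.values())
    if total != 100:
        raise ValueError(f"weights sum to {total}, expected 100")


validate_weights(WEIGHTS)  # passes silently for the weights above
```

Running this check whenever the model spec changes is one way to keep the "one in, one out" rule honest.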
2) Specify each factor
- Exact field names, systems (product analytics, Zendesk, ERP, CRM) and refresh cadence.
- Transform raw data to 0–100 per factor (e.g., min–max scaling, caps, smoothing).
- Weighting: sum of weights = 100; document rationale.
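The transform step can be sketched as below. This is one possible reading of "min–max scaling, caps, smoothing": the floor and ceiling values, the smoothing factor, and the function names are all assumptions chosen for illustration.

```python
def scale_to_100(value: float, floor: float, ceiling: float) -> float:
    """Min-max scale a raw value to 0-100, capping anything outside [floor, ceiling]."""
    if ceiling <= floor:
        raise ValueError("ceiling must be greater than floor")
    capped = min(max(value, floor), ceiling)
    return 100.0 * (capped - floor) / (ceiling - floor)


def smooth(weekly_scores: list[float], alpha: float = 0.5) -> float:
    """Exponentially weighted average of weekly scores, oldest first."""
    if not weekly_scores:
        raise ValueError("need at least one score")
    smoothed = weekly_scores[0]
    for score in weekly_scores[1:]:
        smoothed = alpha * score + (1 - alpha) * smoothed
    return smoothed


# Hypothetical usage factor: weekly active users divided by licensed seats,
# where 0.2 or lower scores 0 and 0.7 or higher scores 100.
usage_score = scale_to_100(0.45, floor=0.2, ceiling=0.7)   # roughly 50
capped_score = scale_to_100(0.95, floor=0.2, ceiling=0.7)  # capped at 100.0
smoothed = smooth([40.0, 60.0, 80.0])                      # 65.0
```

Caps keep one outlier account from stretching the scale, and smoothing keeps a single odd week from moving the factor score too far.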
3) Scoring & buckets
- Health score = weighted average of factor scores → 0–100.
- Buckets: Green ≥80, Amber 60–79, Red <60 (start here; calibrate later).
- Set hysteresis (e.g., require the score to clear a threshold by 5 points before the bucket changes) to avoid thrash.
4) Calibrate
- Back-test 2–4 quarters of history; compute AUC or precision/recall if you have enough churn events to make them meaningful.
- Run a CSM roundtable: compare score vs. lived reality on 20 accounts; note mismatches.
- Adjust thresholds/weights once; freeze for pilot.
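The back-test can be as simple as the sketch below: for each account, take the bucket it was in 30 days before churn (or before renewal, if it stayed) and compute how much churn the Red/Amber flag caught. The account data and field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class AccountHistory:
    name: str
    bucket_30d_before: str  # bucket 30 days before churn or renewal
    churned: bool


def backtest(accounts: list[AccountHistory]) -> dict[str, float]:
    """Recall: share of churn that was flagged. Precision: share of flags that churned."""
    flagged = [a for a in accounts if a.bucket_30d_before in ("Red", "Amber")]
    churned = [a for a in accounts if a.churned]
    hits = [a for a in flagged if a.churned]
    return {
        "recall": len(hits) / len(churned) if churned else 0.0,
        "precision": len(hits) / len(flagged) if flagged else 0.0,
    }


# Hypothetical history: four churned accounts, four retained.
history = [
    AccountHistory("A", "Red", True),
    AccountHistory("B", "Amber", True),
    AccountHistory("C", "Red", True),
    AccountHistory("D", "Green", True),    # surprise churn (false negative)
    AccountHistory("E", "Amber", False),   # flagged but retained (false positive)
    AccountHistory("F", "Green", False),
    AccountHistory("G", "Green", False),
    AccountHistory("H", "Green", False),
]

result = backtest(history)  # {"recall": 0.75, "precision": 0.75}
```

Recall corresponds to the "churn captured ≥30 days ahead" target; precision shows how much CSM time the flag spends on accounts that would have renewed anyway.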
5) Action mapping
Red (high risk)
- Exec sponsor call in 48h; success plan within 5d; weekly check-ins.
- Run escalation runbook if tied to incidents.
Amber (watch)
- Training and feature enablement; mutual action plan (MAP) for adoption milestones.
- Validate value hypotheses with fresh benchmarks.
Green (growth)
- Run expansion discovery; schedule exec value review; attach pilot for next module.
6) Operate
- Refresh weekly; publish change log; tag accounts that crossed a bucket boundary.
- Dashboard tiles: distribution by bucket, transitions, signal-to-action SLA, win/loss by bucket.
- Quarterly calibration: compare bucket distribution vs. churn/NRR realized.
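Tagging boundary crossings during the weekly refresh could look like the sketch below, which compares two weekly snapshots of account buckets. The account names and the snapshot format (a plain dict of account to bucket) are assumptions.

```python
from collections import Counter


def bucket_transitions(last_week: dict[str, str],
                       this_week: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {account: (old_bucket, new_bucket)} for accounts whose bucket changed."""
    return {
        account: (last_week[account], new_bucket)
        for account, new_bucket in this_week.items()
        if account in last_week and last_week[account] != new_bucket
    }


# Hypothetical snapshots from two consecutive weekly refreshes.
last_week = {"acme": "Green", "globex": "Amber", "initech": "Red"}
this_week = {"acme": "Amber", "globex": "Amber", "initech": "Amber", "umbrella": "Green"}

crossed = bucket_transitions(last_week, this_week)
# {"acme": ("Green", "Amber"), "initech": ("Red", "Amber")}

transition_counts = Counter(crossed.values())  # feeds the "transitions" dashboard tile
```

New accounts (here `umbrella`) have no prior bucket, so this sketch does not count them as a transition.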
Artifacts
Model spec (1-pager)
- Factors, weights, definitions, formulas, data sources, refresh.
- Bucket thresholds + hysteresis rule.
Action library
- Plays per bucket with owners/SLAs.
- Templates: exec email, success plan, enablement plan.
Worked examples
Example A — Collaboration SaaS
Factors: WAU/Seats (35%), Team adoption (10%), Sponsor touch (20%), P1 in 90d (10%), Invoice status (15%), Admin logins (10%). Calibration: Red predicted 74% of churn ≥45d ahead. Actions: Red → exec value session + usage campaign; Green → pilot whiteboarding add-on.
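To make the arithmetic concrete, here is a sketch that applies the Example A weights to one account. The factor scores are hypothetical; only the weights come from the example. It also shows how many points each factor costs the account, which is what makes the score explainable.

```python
# Weights from Example A (percent).
WEIGHTS_A = {
    "wau_per_seat": 35,
    "team_adoption": 10,
    "sponsor_touch": 20,
    "p1_in_90d": 10,
    "invoice_status": 15,
    "admin_logins": 10,
}

# Hypothetical 0-100 factor scores for a single account.
scores = {
    "wau_per_seat": 40,
    "team_adoption": 50,
    "sponsor_touch": 30,
    "p1_in_90d": 100,
    "invoice_status": 100,
    "admin_logins": 60,
}

# Weighted average: (35*40 + 10*50 + 20*30 + 10*100 + 15*100 + 10*60) / 100
health = sum(WEIGHTS_A[f] * scores[f] for f in WEIGHTS_A) / 100  # 56.0, i.e. Red

# Points each factor takes off a perfect 100, to explain the score.
points_lost = {f: WEIGHTS_A[f] * (100 - scores[f]) / 100 for f in WEIGHTS_A}
biggest_drag = max(points_lost, key=points_lost.get)  # "wau_per_seat" (21 points)
```

For this account the story is low seat usage first and a stale sponsor relationship second, which lines up with the Red play of an exec value session plus a usage campaign.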
Example B — Payments platform
Factors: Processed volume vs. forecast (30%), Success KPI (chargeback rate) (20%), Sponsor cadence (20%), Support time to resolve (10%), DSO trend (20%). Result: Amber segment halved after playbook launch; 18% lift in expansion from Green customers.
Example C — Industrial IoT
Factors: Sensor uptime (25%), Work orders closed (20%), Safety incidents (15%), Exec sponsor (15%), Parts SLA (10%), Training completion (15%). Outcome: Two plants moved from Red→Amber within 30 days after targeted enablement; renewal risk averted.
Metrics
Leading: bucket transitions/week, time-to-action after boundary cross, percentage of accounts with validated outcomes, exec touch cadence on Red/Amber.
Lagging: churn captured by Red/Amber ≥30d ahead, expansion win rate by bucket, NRR uplift from Green focus, false-positive/negative rates.
Keep the loop tight: score → action → outcome → calibration.
Implementation checklist
- Publish model spec and action library; pick three gold-standard examples.
- Connect data sources (analytics, ticketing, billing); define refresh SLAs.
- Add CRM fields: Health score, Bucket, Boundary crossed at, Action started at.
- Create dashboards: bucket distribution, transitions, time-to-action, outcomes by bucket.
Measurement
Team level: % accounts with current score, action SLA compliance, transition coverage, forecast accuracy for churn/expansion.
Individual level: time to initiate action after boundary cross, success plan completeness for Red, enablement completion for Amber.
Team buy-in
- Co-design factors with CSMs; run a live calibration clinic on 20 accounts.
- Make actions visible and lightweight: checklists, not decks.
- Celebrate “save stories” where Red→Amber→Green transitions led to renewal or expansion.
Why it matters
- Predictable retention: fewer “surprise churns,” cleaner board narrative.
- Focused growth: expansion time spent where evidence is strongest.
- Org alignment: shared language across CS, Product, Finance and Sales.
Pair this with account planning and a disciplined sales process to convert Green signals into revenue.
Metrics & pitfalls
Watch
- Churn captured ≥30d ahead
- Time-to-action after bucket change
- False positive/negative rates
Avoid
- Black-box models
- Manual field bloat
- Plays without owners/SLAs
90-day rollout
Weeks 1–2 — Define & align
- Owners: CS Ops (lead), RevOps, Product Analytics, Finance.
- Lock factors, weights, and definitions; sign off on data sources and refresh.
Weeks 3–4 — Build & back-test
- Implement transforms; back-test on 2–4 quarters of history; publish findings; freeze thresholds for pilot.
Weeks 5–6 — Pilot
- Run on 50–100 accounts; daily boundary alerts; action SLAs enforced; collect feedback.
Weeks 7–8 — Tune & document
- Adjust weights/thresholds once; publish model spec and action library.
Weeks 9–10 — Roll out
- Enable dashboards for managers; add score to QBR template.
Weeks 11–12 — Bake into rhythm
- Monthly calibration review; change log discipline; quarterly factor review.
- Target state: ≥70% churn predicted ≥30d ahead; NRR +5–8 pts from Green focus.
Next steps
- Draft your 5–7 factors with weights and plain-English definitions.
- Back-test on 2–4 quarters; lock bucket thresholds for a 6-week pilot.
- Map plays to buckets and enforce SLAs in your CS cadence.
Sources & terms
Terms: WAU/MAU (weekly/monthly active users), EB (economic buyer), NRR (net revenue retention), hysteresis (a guard band around bucket thresholds that stops accounts flipping back and forth between buckets).