Self-Serve Free Queue — Strategy Overview

Eng + Support VP brief · turning a 9,000-case Free backlog into a self-resolving system

The problem & the shift

Problem: 9,000 of the 14,000 backlog cases are Self-Serve Free. These customers still buy add-ons, pay usage-based billing, and feed the upgrade funnel, so we can't drop support. But scaling headcount linearly with case volume is uneconomical.

Shift: Treat Free support as a demand-reduction + automated-resolution engineering problem, not a staffing problem. Push every contact down a cost gradient: prevent → self-heal → AI resolve → human exception. Each resolved case should make the next one cheaper (knowledge + signals compound).

The Demand Pyramid — push work to the cheapest layer

TIER 0 · PREVENT: Fix root causes so the case never happens. ~$0.10/case.

TIER 1 · SELF-HEAL: Billing Health Meter, 1-click fixes. ~$0.10/case.

TIER 2 · AI RESOLVE: Agent Lee reads Stripe & acts. ~$0.20/case.

TIER 3 · HUMANS: Refunds, Stripe bugs, disputes, legal. ~$8.00/case.

flowchart TD
    A["Incoming Free demand"] --> T0["TIER 0 — PREVENT
case never happens"]
    T0 -->|"residual"| T1["TIER 1 — SELF-HEAL
customer fixes in dashboard"]
    T1 -->|"unresolved"| T2["TIER 2 — AGENT LEE
AI diagnoses + acts on Stripe"]
    T2 -->|"exceptions only"| T3["TIER 3 — HUMANS
refunds over policy, Stripe bugs, disputes, legal"]
    T1 -.->|"always reachable"| T3
    classDef c0 fill:#e6f4ea,stroke:#34a853
    classDef c1 fill:#e8f0fe,stroke:#1a73e8
    classDef c2 fill:#fff4e5,stroke:#f9ab00
    classDef c3 fill:#fce8e6,stroke:#ea4335
    class T0 c0
    class T1 c1
    class T2 c2
    class T3 c3

Golden rule: the cheapest case is the one never raised. Cost per contact falls ~80× moving from Tier 3 (~$8.00) to Tier 0/1 (~$0.10) (Gartner basis).

Target architecture (control-plane view)

flowchart TD
    C["Customer / Dashboard"] --> R{"Entry Router
intent + auth + plan/risk"}
    R -->|"prevent class"| T0["Tier 0 — Prevent
Stripe webhooks, dunning,
proactive nudges, link auto-regen"]
    R -->|"self-serve eligible"| T1["Tier 1 — Self-Heal UI
Billing Health Meter
1-click fixes"]
    R -->|"needs resolution"| T2["Tier 2 — Agent Lee
LLM tool-calling + RAG
scoped Stripe actions"]
    T1 -->|unresolved / low confidence| T2
    T2 -->|"risk gate fails / low confidence"| T3["Tier 3 — Humans
refunds over policy, Stripe corruption,
disputes, fraud, legal"]
    T1 -.->|always reachable| T3
    T1 --> DIAG[Billing Diagnostics API]
    T0 -.-> DIAG
    T2 --> DIAG
    T2 --> POL[Policy / Guardrail Engine]
    T2 --> AUD[(Audit log)]
    T2 --> KCS[KCS knowledge store]
    DIAG --> STRIPE[(Stripe)]
    subgraph Shared services
      DIAG
      POL
      KCS
      AUD
      OBS[Observability: resolution rate, CES, drift]
    end
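
The Entry Router decision in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the actual router: the intent names, the `Contact` fields, and the 0.7 confidence threshold are all assumptions made for the example, and the auth and plan/risk inputs are left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PREVENT = 0
    SELF_HEAL = 1
    AGENT_LEE = 2
    HUMAN = 3


# Hypothetical intent classes; real ones would come from the Pareto ranking.
PREVENT_INTENTS = {"expired_payment_link"}
SELF_SERVE_INTENTS = {"payment_failed", "update_card"}
HUMAN_ONLY_INTENTS = {"dispute", "fraud", "legal"}


@dataclass
class Contact:
    intent: str
    intent_confidence: float  # 0.0 to 1.0, from an assumed intent classifier
    wants_human: bool = False


def route(contact: Contact, min_confidence: float = 0.7) -> Tier:
    """Pick the cheapest tier that can plausibly resolve the contact."""
    # Humans are always reachable, and exception classes skip automation.
    if contact.wants_human or contact.intent in HUMAN_ONLY_INTENTS:
        return Tier.HUMAN
    # An unclear intent goes to Agent Lee for diagnosis rather than a canned fix.
    if contact.intent_confidence < min_confidence:
        return Tier.AGENT_LEE
    if contact.intent in PREVENT_INTENTS:
        return Tier.PREVENT
    if contact.intent in SELF_SERVE_INTENTS:
        return Tier.SELF_HEAL
    # Anything unrecognised "needs resolution".
    return Tier.AGENT_LEE
```

For instance, `route(Contact("payment_failed", 0.95))` lands on `Tier.SELF_HEAL`, while the same intent at low confidence falls through to `Tier.AGENT_LEE`.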

Diagnostics API — the keystone

One internal service both the Self-Heal UI and Agent Lee call. It normalizes three sources into a single plain-language diagnosis — closing the #1 self-service failure mode (Gartner: 45% "company didn't understand my intent", 43% "no relevant content").

flowchart LR
    UI["Billing Health Meter
(Self-Heal UI)"] --> API
    LEE["Agent Lee
(AI resolver)"] --> API
    API["Billing Diagnostics API
normalizes sources into
one plain-language diagnosis"]
    API -->|"payment / invoice state"| STRIPE[("Stripe
payments · invoices · dunning")]
    API -->|"is the subscription there?"| SUB[("Subscription / Entitlements
source-of-truth — TBD")]
    API -->|"how much usage / what cost?"| OPE[("OPE — ClickHouse
billable usage + observability")]
    API --> OUT["Diagnosis output
status · root_cause · explanation
recommended_action · self_serve_eligible"]
    classDef verified fill:#e6f4ea,stroke:#34a853,color:#000;
    classDef tbd fill:#fff4e5,stroke:#f9ab00,color:#000;
    class STRIPE,OPE verified; class SUB tbd;

Open item: OPE (Ordered Parallel Execution) is a ClickHouse usage/metering layer — it answers "how much usage / what cost", not "does a subscription exist". The subscription source-of-truth is still TBD. (User-provided definition; wiki was unreachable to verify.)
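
To make the diagnosis output concrete, here is a minimal sketch of how the three source answers might fold into one record. The field names come from the diagram; the `diagnose` function, the status colours other than `red`, and the explanation table are hypothetical, and the three inputs stand in for whatever Stripe, the subscription source-of-truth, and OPE actually return.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Diagnosis:
    status: str               # e.g. "green" or "red"
    root_cause: str
    explanation: str          # plain-language text shown to the customer
    recommended_action: str
    self_serve_eligible: bool


# Hypothetical mapping from a payment failure code to copy and a 1-click action.
EXPLANATIONS = {
    "insufficient_funds": ("Card declined for insufficient funds.", "update_card_or_retry"),
    "expired_card": ("The card on file has expired.", "update_card"),
}


def diagnose(payment_error: Optional[str], has_subscription: bool,
             usage_cost: float) -> Diagnosis:
    """Normalize payment state, subscription state and usage into one diagnosis."""
    if not has_subscription:
        return Diagnosis("red", "no_subscription",
                         "No active subscription was found.", "contact_support", False)
    if payment_error is not None:
        # Unknown failure codes get a generic message and are not self-serve.
        explanation, action = EXPLANATIONS.get(
            payment_error, ("The last payment failed.", "contact_support"))
        return Diagnosis("red", payment_error, explanation, action,
                         payment_error in EXPLANATIONS)
    return Diagnosis("green", "none",
                     f"Billing is healthy; current usage cost is ${usage_cost:.2f}.",
                     "none", False)
```

Keeping the output this small is a deliberate choice in the sketch: both the Self-Heal UI and Agent Lee can render or act on the same five fields without knowing which source produced them.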

How it feels in practice

Flow 1 — Failed-payment self-heal (no case raised)

sequenceDiagram
    participant Stripe
    participant T0 as Tier 0 (webhook)
    participant Cust as Customer
    participant UI as Billing Health Meter
    participant Diag as Diagnostics API
    Stripe->>T0: invoice.payment_failed
    T0->>T0: classify failure, regenerate expired link
    T0->>Cust: proactive nudge + fix CTA
    Cust->>UI: opens dashboard
    UI->>Diag: get_billing_diagnosis(customer_id)
    Diag->>Stripe: read PaymentIntent.last_payment_error
    Diag-->>UI: {red, insufficient_funds, action}
    UI-->>Cust: "Card declined — update card / retry"
    Cust->>UI: 1-click retry
    UI->>Stripe: confirm payment (idempotent)
    Stripe-->>UI: success
    UI-->>Cust: resolved — zero human touch
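
The Tier 0 step in this flow (classify the failure, regenerate an expired link, send a nudge) might look something like the sketch below. The event here is a simplified dictionary, not the real Stripe webhook payload; `failure_code`, `link_expires_at`, the retryable set, and the 7-day link lifetime are all assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical classification: failures worth a retry vs. ones needing a new card.
RETRYABLE = {"insufficient_funds", "processing_error"}
LINK_TTL = timedelta(days=7)  # assumed lifetime of a regenerated payment link

_seen_event_ids = set()  # in-memory dedupe; a real service would persist this


def handle_payment_failed(event: dict, now: datetime) -> Optional[dict]:
    """Turn a simplified payment-failed event into a customer nudge.

    Returns None when the event id was already processed (duplicate delivery).
    """
    if event["id"] in _seen_event_ids:
        return None
    _seen_event_ids.add(event["id"])

    # Classify the failure into the call-to-action shown to the customer.
    cta = "retry_payment" if event["failure_code"] in RETRYABLE else "update_card"

    # Regenerate the payment link if the old one has expired.
    expires_at = event["link_expires_at"]
    regenerated = expires_at <= now
    if regenerated:
        expires_at = now + LINK_TTL

    return {
        "customer_id": event["customer_id"],
        "cta": cta,
        "link_regenerated": regenerated,
        "link_expires_at": expires_at,
    }
```

Deduplicating on the event id matters because webhook deliveries can repeat; without it, a customer could receive the same nudge twice.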

Flow 2 — Agent-issued refund with policy gate

sequenceDiagram
    participant Cust as Customer
    participant Lee as Agent Lee
    participant Pol as Policy Engine
    participant Stripe
    participant Aud as Audit Log
    participant Human
    Cust->>Lee: "I was charged, I want a refund"
    Lee->>Pol: issue_refund(charge_id, amount)
    alt within policy (<= cap & eligible)
        Pol-->>Lee: approved
        Lee->>Stripe: refund (idempotency key)
        Stripe-->>Lee: refunded
        Lee->>Aud: log actor=agent-lee, decision, outcome
        Lee-->>Cust: refund confirmed
    else exceeds policy
        Pol-->>Lee: denied (needs approval)
        Lee->>Human: escalate with full context
        Human-->>Cust: resolves (no re-auth)
    end
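
A rough sketch of the policy gate in this flow, under stated assumptions: the 5,000-cent cap is invented for the example, `send_refund` is a stand-in for the real payment call, and the eligibility check is reduced to "not more than the original charge". Unlike the diagram, the sketch also writes an audit entry on the denied path.

```python
import hashlib
from typing import Callable, List

REFUND_CAP_CENTS = 5_000  # hypothetical policy cap, not a figure from the brief


def refund_idempotency_key(charge_id: str, amount_cents: int) -> str:
    """Derive a stable key so a retried call cannot refund twice."""
    return hashlib.sha256(f"refund:{charge_id}:{amount_cents}".encode()).hexdigest()


def issue_refund(
    charge_id: str,
    amount_cents: int,
    charge_total_cents: int,
    send_refund: Callable[[str, int, str], None],
    audit_log: List[dict],
) -> str:
    """Refund within policy, otherwise escalate. Returns 'refunded' or 'escalated'."""
    eligible = 0 < amount_cents <= charge_total_cents
    approved = eligible and amount_cents <= REFUND_CAP_CENTS

    if approved:
        key = refund_idempotency_key(charge_id, amount_cents)
        send_refund(charge_id, amount_cents, key)  # stand-in for the payment call
        outcome = "refunded"
    else:
        outcome = "escalated"  # hand off to a human with full context

    # Every decision is recorded, whichever branch was taken.
    audit_log.append({
        "actor": "agent-lee",
        "charge_id": charge_id,
        "amount_cents": amount_cents,
        "decision": "approved" if approved else "denied",
        "outcome": outcome,
    })
    return outcome
```

Deriving the idempotency key from the charge and amount, rather than generating a random one per call, means a retry after a timeout reuses the same key.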

The plan — phased roadmap

gantt
    title Delivery roadmap (phased)
    dateFormat YYYY-MM-DD
    axisFormat %b
    section Phase 1 — Bridge (0–30d)
    Free Pod drains 9k backlog        :p1, 2026-07-01, 30d
    Pareto-rank intents (Salesforce)  :2026-07-01, 21d
    section Phase 2 — Prevent (30–90d)
    Billing Diagnostics API           :p2, after p1, 60d
    Tier 0 webhooks + link auto-regen :after p1, 45d
    section Phase 3 — Resolve (90–180d)
    Billing Health Meter UI           :p3, after p2, 90d
    Agent Lee on top-3 intents        :after p2, 90d
    section Phase 4 — Scale (180d+)
    Expand scope + upgrade prompts     :after p3, 90d
    Retire temporary Free Pod          :after p3, 45d

Phase 1 — Bridge (0–30d): stand up a time-boxed "Free Pod" with AI-assisted drafting to drain today's 9k backlog; Pareto-rank intents (expect ~5–8 intents to cover ~80% of volume).
Phase 2 — Prevent (30–90d): ship the Billing Diagnostics API + Tier 0 webhooks (failure-reason explainer, auto-link-regeneration, dunning nudges).
Phase 3 — Resolve (90–180d): launch the Billing Health Meter; put Agent Lee on the top-3 intents with read + scoped-write tools behind the Policy Engine.
Phase 4 — Scale (180d+): expand agent scope, add in-context upgrade prompts, retire the temporary pod as structural demand drops.

The business case

~84% cost reduction vs all-human

~$725k/yr projected savings (illustrative)

$0.10 vs $8.00 per contact, self-serve vs human

State	Mix	Volume (cases/mo)	Unit cost	Monthly cost
Today (all human)	100% human	9,000	$8.00	$72,000
Tier 0 prevent	30%	2,700	~$0	$0
Tier 1 self-heal	20%	1,800	$0.10	$180
Tier 2 AI	35%	3,150	$0.20	$630
Tier 3 human	15%	1,350	$8.00	$10,800
Future total	100%	9,000	~$1.29 blended	$11,610

Illustrative: plug in real monthly inflow + dunning data. There is also a separate recovered-revenue lever: fixing silent failed payments protects usage-based billing (UBB) and add-on revenue (e.g. ~$45k/mo if 25% of ~9,000 failed payments are recovered at $20 avg).
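
The headline figures follow directly from the table, and the arithmetic can be reproduced in a few lines so that real inflow data is easy to plug in. The mix and unit costs below are the illustrative values from the table; the variable names are ours.

```python
MONTHLY_VOLUME = 9_000
HUMAN_UNIT_COST = 8.00

# (share of volume, unit cost in dollars) per tier, taken from the table.
TIERS = {
    "Tier 0 prevent": (0.30, 0.00),
    "Tier 1 self-heal": (0.20, 0.10),
    "Tier 2 AI": (0.35, 0.20),
    "Tier 3 human": (0.15, 8.00),
}


def monthly_cost(volume: int, tiers: dict) -> float:
    """Sum volume x share x unit cost over all tiers."""
    return sum(volume * share * unit_cost for share, unit_cost in tiers.values())


today = MONTHLY_VOLUME * HUMAN_UNIT_COST        # all-human baseline
future = monthly_cost(MONTHLY_VOLUME, TIERS)    # tiered model
reduction = 1 - future / today                  # fraction of cost removed
annual_savings = (today - future) * 12
blended_unit_cost = future / MONTHLY_VOLUME

print(f"today ${today:,.0f}/mo, future ${future:,.0f}/mo")
print(f"reduction {reduction:.1%}, annual savings ${annual_savings:,.0f}")
print(f"blended unit cost ${blended_unit_cost:.2f}")
```

Tier 3 accounts for $10,800 of the $11,610 future total, so the model is most sensitive to the share of cases that still reach a human.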

Proof & guardrails

Proven at scale: Klarna's action-taking AI did the work of ~700 agents, matched human CSAT, cut repeat contacts by 25%, and cut resolution time from 11 to 2 min. Intercom Fin averages ~76% resolution across 12,000+ customers.

The one rule we can't break — never a dead end: any low-confidence/policy-fail/error path routes to a human with full context (Gartner: easy escalation → 74% still self-serve next time). This avoids the billing@ → auto-reply → dashboard loop that triggered a recent public complaint.
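
One way to enforce the no-dead-end rule in code is a single wrapper that every automated path runs through. The sketch below is illustrative only: `resolver` is any callable returning a dict, and the `policy_ok` and `confidence` keys and the 0.8 threshold are assumptions, not an agreed contract.

```python
from typing import Callable


def resolve_or_escalate(case: dict, resolver: Callable[[dict], dict],
                        min_confidence: float = 0.8) -> dict:
    """Run an automated resolver; hand off to a human on any failure mode."""
    # Context travels with the case so the human never has to re-ask.
    context = {"case": case, "attempts": []}

    def escalate(reason: str) -> dict:
        return {"handled_by": "human", "reason": reason, "context": context}

    try:
        result = resolver(case)
    except Exception as exc:  # any error is a hand-off, never a dead end
        context["attempts"].append({"error": repr(exc)})
        return escalate("error")

    context["attempts"].append(result)
    if not result.get("policy_ok", False):
        return escalate("policy_fail")
    if result.get("confidence", 0.0) < min_confidence:
        return escalate("low_confidence")
    return {"handled_by": "agent", "reason": "resolved", "context": context}
```

Because the hand-off carries the original case and every attempt, the human picks up where automation stopped.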