Algo incident response playbook¶

When automation misfires, speed matters — but sequence matters more. This playbook gives you a repeatable order: detect → contain → diagnose → recover.

[Figure: Algo incident response flow]

Severity levels (keep this simple)¶

Sev-1: potential account harm now (unexpected live orders, runaway loop)
Sev-2: strategy malfunction with limited blast radius
Sev-3: degraded behavior (noise, slippage drift, stale signals)
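
The three levels above can be encoded so that alerts and logs share one vocabulary. The following is a minimal sketch, not a prescribed implementation: the names `Severity` and `classify` and the three boolean flags are assumptions made for illustration, and the function is meant to be called only after something has already been flagged as an incident.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical encoding of the three levels; lower number = more urgent."""
    SEV1 = 1  # potential account harm now
    SEV2 = 2  # strategy malfunction with limited blast radius
    SEV3 = 3  # degraded behavior


def classify(unexpected_live_orders: bool,
             runaway_loop: bool,
             strategy_malfunction: bool) -> Severity:
    """Map observed flags to a severity, checking the worst case first."""
    if unexpected_live_orders or runaway_loop:
        return Severity.SEV1
    if strategy_malfunction:
        return Severity.SEV2
    # Anything else that was reported as an incident is treated as degraded behavior.
    return Severity.SEV3
```

Checking the most severe condition first means an incident that matches several descriptions is never under-classified.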

First 2 minutes (Sev-1)¶

Trigger kill switch (L3/L4 depending on blast radius).
Disable affected strategy.
Confirm no new orders are being sent.
Preserve evidence immediately:
- broker responses / rejects
- strategy settings snapshot
- timestamps and symbols
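
Evidence is easier to preserve under pressure if it goes into one timestamped record. The sketch below assumes you can already export broker responses and strategy settings as plain Python data; the function names `build_evidence_snapshot` and `save_snapshot` and the field names are hypothetical, not part of any platform's API.

```python
import json
from datetime import datetime, timezone


def build_evidence_snapshot(broker_responses, strategy_settings, symbols, now=None):
    """Bundle the three evidence items into a single record (hypothetical layout)."""
    captured_at = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "captured_at": captured_at,                      # when the snapshot was taken (UTC)
        "broker_responses": list(broker_responses),      # responses and rejects, as exported
        "strategy_settings": dict(strategy_settings),    # settings at the time of the incident
        "symbols": sorted(set(symbols)),                 # de-duplicated affected symbols
    }


def save_snapshot(snapshot, path):
    """Write the snapshot to disk as readable JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(snapshot, fh, indent=2, sort_keys=True)
```

Taking the snapshot before changing any settings keeps the record usable for the later diagnosis and postmortem.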

30-minute diagnosis block¶

Use this checklist:
- Was this signal logic, risk rule, execution transport, or broker state?
- Did config drift from approved baseline?
- Did this happen before (repeat signature)?
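
The config drift question can be answered mechanically if the approved baseline is stored as key/value settings. This is a sketch under that assumption; `config_drift`, the `MISSING` marker, and the setting names in the usage example are hypothetical.

```python
MISSING = "<missing>"  # marker for a key present on only one side (assumption)


def config_drift(baseline: dict, current: dict) -> dict:
    """Return {key: (baseline_value, current_value)} for every setting that differs."""
    drift = {}
    for key in sorted(set(baseline) | set(current)):
        old = baseline.get(key, MISSING)
        new = current.get(key, MISSING)
        if old != new:
            drift[key] = (old, new)
    return drift


# Usage: an empty result means the live config matches the approved baseline.
approved = {"max_trades_per_day": 5, "risk_per_trade_pct": 0.5}
live = {"max_trades_per_day": 50, "risk_per_trade_pct": 0.5, "extended_hours": True}
print(config_drift(approved, live))
```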

Quick triage matrix¶

Symptom	Probable class	First check
Unexpected symbols traded	Universe/symbol list issue	Strategy source + list mapping
Too many orders quickly	Missing cap / duplicated triggers	Max trades/day + dedupe
Orders rejected repeatedly	Broker/API session issue	API auth + connection state
“Late” behavior and poor fills	Execution quality drift	Slippage + spread regime
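
For the "too many orders quickly" row, the two first checks (a daily cap and dedupe) can live in one small guard in front of order submission. The sketch below is illustrative only: `OrderGuard`, `signal_id`, and `max_trades_per_day` are hypothetical names, and it assumes each trigger carries a stable identifier.

```python
class OrderGuard:
    """Blocks duplicated triggers and enforces a daily order cap (illustrative)."""

    def __init__(self, max_trades_per_day: int):
        self.max_trades_per_day = max_trades_per_day
        self._seen = set()   # signal ids already acted on today
        self._count = 0      # orders allowed so far today

    def allow(self, signal_id: str) -> bool:
        """Return True only if this trigger is new and the cap is not reached."""
        if signal_id in self._seen:
            return False  # duplicated trigger
        if self._count >= self.max_trades_per_day:
            return False  # daily cap reached
        self._seen.add(signal_id)
        self._count += 1
        return True

    def reset_day(self) -> None:
        """Clear state at the start of a new session."""
        self._seen.clear()
        self._count = 0
```

Because a blocked trigger is not recorded, a signal rejected by the cap can still be accepted after `reset_day()`.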

Recovery protocol¶

Patch one root cause at a time.
Run canary in paper first.
Re-enable with reduced risk caps.
Monitor first session manually.
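
Re-enabling with reduced risk caps can be as simple as scaling the approved caps by a fixed factor for the first session. The following sketch assumes caps are stored as numeric settings; `reduced_caps`, the default factor of 0.25, and the cap names in the comments are assumptions, not recommended values.

```python
import math


def reduced_caps(approved_caps: dict, factor: float = 0.25) -> dict:
    """Scale every numeric cap by `factor` for a cautious re-enable (illustrative)."""
    if not 0 < factor <= 1:
        raise ValueError("factor must be in (0, 1]")
    reduced = {}
    for name, value in approved_caps.items():
        scaled = value * factor
        if isinstance(value, int):
            # Integer caps (e.g. order counts) stay integers and never drop below 1.
            reduced[name] = max(1, math.floor(scaled))
        else:
            reduced[name] = scaled
    return reduced
```

Keeping the approved caps untouched and deriving the reduced set from them makes it easy to restore the baseline once the monitored session passes.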

Postmortem template (same day)¶

What happened (factual timeline)
Impact (orders, risk, downtime)
Root cause (single clearest statement)
Contributing factors
Permanent fixes (owner + due date)
Guardrail added (to prevent recurrence)
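
If postmortems are stored as structured records, completeness becomes checkable. This is a minimal sketch that mirrors the six sections above; the `Postmortem` class and its field names are hypothetical, and treating every section as required is an assumption of the sketch.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Postmortem:
    """One record per incident; field names mirror the template (hypothetical)."""
    what_happened: str = ""                                    # factual timeline
    impact: str = ""                                           # orders, risk, downtime
    root_cause: str = ""                                       # single clearest statement
    contributing_factors: list = field(default_factory=list)
    permanent_fixes: list = field(default_factory=list)        # e.g. (fix, owner, due date)
    guardrail_added: str = ""                                  # to prevent recurrence

    def missing_sections(self) -> list:
        """Names of sections that are still empty, in template order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_complete(self) -> bool:
        return not self.missing_sections()
```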

Reliability KPIs (lightweight)¶

Track these weekly:
- Incident count by severity
- Time to containment
- Repeat incident ratio
- % incidents with completed postmortem
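
These four numbers can be computed from a simple incident log. The sketch below assumes each incident is a dict with the keys `severity`, `minutes_to_containment`, `is_repeat`, and `postmortem_done`; those keys, the function name `weekly_kpis`, and the choice of the mean for time to containment are illustrative assumptions.

```python
from collections import Counter


def weekly_kpis(incidents: list) -> dict:
    """Compute the four KPIs for one week's incidents (illustrative field names)."""
    total = len(incidents)
    if total == 0:
        # No incidents: counts are empty and ratios are undefined.
        return {
            "count_by_severity": {},
            "mean_minutes_to_containment": None,
            "repeat_ratio": None,
            "postmortem_pct": None,
        }
    return {
        "count_by_severity": dict(Counter(i["severity"] for i in incidents)),
        "mean_minutes_to_containment":
            sum(i["minutes_to_containment"] for i in incidents) / total,
        "repeat_ratio": sum(1 for i in incidents if i["is_repeat"]) / total,
        "postmortem_pct": 100 * sum(1 for i in incidents if i["postmortem_done"]) / total,
    }
```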

FAQ¶

Do small accounts need incident response?¶

Yes. A small account has less capital to absorb an operational mistake, so the same error costs proportionally more.

Should every incident get a postmortem?¶

Sev-1 and Sev-2: yes. Sev-3: at least a short record.

Written by

David

Updated 2026-02-25

Mentor-style Trade Ideas tutorials focused on workflow, clarity, and repeatable process.