Risk invariants and execution safety in live 0DTE systems
Safety outranks correctness, which outranks profit. The live 0DTE systems encode that as hard invariants — single position, long-only, daily hard stops, forced EOD flatten — enforced by state machines and schema-level tests rather than comments, with production incidents converted into permanent regression tests.
The invariants, as written
The live systems share a priority order — safety > correctness > profit — and encode it as configuration-level invariants, verbatim from the configs:
- one_position_only: true, no_naked_short: true, forbid_add_to_loser: true, forbid_reverse_without_flat: true
- Daily drawdown ladder: soft stop at 1.5% of NAV (size cut), hard stop at 2% (day locked). Loss-streak guards: two consecutive losses halve size, three stop the day.
- max_trades_per_day capped low (2–5 across systems).
- Forced EOD flatten at 15:45 with force_cancel_all_before_eod_flatten: true, and session-reset logic so a flattened day cannot quietly reopen.
- One system runs a four-dimensional stop: option premium −35%, adverse SPX move 10 points, 8 minutes without 5 points of progress, and an IV-crush detector (premium −20% while SPX barely moves). Hard dimensions are checked every 100 ms, slow ones every 5 seconds.
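The drawdown ladder and loss-streak guards above can be sketched as one pure function. This is a minimal illustration, not the live systems' code: the names DayRisk and evaluate_day are hypothetical, the 0.5 size cut at the soft stop is an assumption (the configs only say "size cut"), and so is the choice to let the soft-stop cut and the two-loss cut compound.

```python
from dataclasses import dataclass

SOFT_STOP_PCT = 0.015         # soft stop at 1.5% of NAV (from the text)
HARD_STOP_PCT = 0.02          # hard stop at 2% of NAV (from the text)
SOFT_STOP_SIZE_FACTOR = 0.5   # ASSUMPTION: the text does not give the cut size


@dataclass(frozen=True)
class DayRisk:
    size_factor: float  # multiplier applied to the base position size
    locked: bool        # True once no further entries are allowed today
    reason: str         # why the factor/lock was chosen, for the audit log


def evaluate_day(start_nav: float, day_pnl: float, consecutive_losses: int) -> DayRisk:
    """Map today's P&L and loss streak to a size factor or a locked day."""
    drawdown = max(0.0, -day_pnl) / start_nav

    # Day-ending conditions are checked first so they always win.
    if drawdown >= HARD_STOP_PCT:
        return DayRisk(0.0, True, "hard_stop")
    if consecutive_losses >= 3:
        return DayRisk(0.0, True, "loss_streak_3")

    factor = 1.0
    reasons = []
    if drawdown >= SOFT_STOP_PCT:
        factor *= SOFT_STOP_SIZE_FACTOR
        reasons.append("soft_stop")
    if consecutive_losses == 2:
        factor *= 0.5  # two consecutive losses halve size (from the text)
        reasons.append("loss_streak_2")
    return DayRisk(factor, False, "+".join(reasons) or "ok")
```

Keeping the function free of side effects means every rung of the ladder can be asserted directly in a test, with no broker or clock involved.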
Enforcement is structural, not documentary: a four-state execution state machine whose can_enter() is only true from FLAT; a risk kernel that returns allowed + reason so a refused entry carries its own audit trail; and schema-level data tests (SPX bars must not have a volume column — which catches wrong-instrument joins at load time) inside a 300+ test suite.
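A minimal sketch of that structure, under stated assumptions: the four state names come from the incident descriptions in this article, but the transition table, the Decision type, and the check_entry signature are hypothetical stand-ins for whatever the live systems actually use.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    FLAT = auto()
    PENDING_ENTRY = auto()
    OPEN = auto()
    PENDING_EXIT = auto()


# ASSUMPTION: a plausible transition table; the real one may differ.
_ALLOWED = {
    State.FLAT: {State.PENDING_ENTRY},
    State.PENDING_ENTRY: {State.OPEN, State.FLAT},  # filled, or never filled
    State.OPEN: {State.PENDING_EXIT},
    State.PENDING_EXIT: {State.FLAT, State.OPEN},   # closed, or exit cancelled
}


class ExecutionStateMachine:
    def __init__(self) -> None:
        self.state = State.FLAT

    def can_enter(self) -> bool:
        # Single-position invariant: entries are legal only from FLAT.
        return self.state is State.FLAT

    def is_open(self) -> bool:
        return self.state is State.OPEN

    def transition(self, new_state: State) -> None:
        # Illegal moves raise instead of being silently ignored.
        if new_state not in _ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str  # logged with every refusal, so the audit trail is automatic


def check_entry(sm: ExecutionStateMachine, day_locked: bool,
                trades_today: int, max_trades_per_day: int) -> Decision:
    """Risk-kernel entry gate: every refusal names the rule that fired."""
    if day_locked:
        return Decision(False, "day_locked")
    if not sm.can_enter():
        return Decision(False, f"state_is_{sm.state.name}")
    if trades_today >= max_trades_per_day:
        return Decision(False, "max_trades_per_day")
    return Decision(True, "ok")
```

Returning a Decision rather than a bare boolean is the design point: the caller cannot refuse an entry without also producing the reason string that ends up in the log.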
Three production bugs, three permanent tests
- Zombie position. A fill was detected but the state machine was never transitioned out of PENDING_ENTRY. Result: a live position that every exit path silently refused to close, because exits require an OPEN state. The fix shipped with a regression test asserting the transition on every fill event.
- Phantom position. An IBKR modify-after-fill race (error 104) reports status='Cancelled' for an order that actually filled. Treating the event literally would book an entry at price 0.0 or, in the original bug, treat a cancellation as a fill. The handler now checks filled_qty > 0 and avg_fill > 0 before believing any Cancelled status.
- Orphaned contracts. Partial take-profit fills were routed by asking sm.is_open(), but after submitting an exit the machine is in PENDING_EXIT, so partial fills were misrouted as full closes, leaving unsold contracts in the account. The fix tracks exit kind explicitly instead of inferring it from state.
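The second and third fixes share one idea: route on facts recorded at submit time and on the fill numbers, not on a status string or on the machine's current state. The sketch below illustrates that idea only. OrderEvent, ExitKind, route_fill, and the returned handler labels are hypothetical names, not the systems' actual API or the IBKR client's types; the filled_qty > 0 and avg_fill > 0 check is the one quoted above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExitKind(Enum):
    # Recorded when the exit order is submitted, never inferred later.
    PARTIAL_TAKE_PROFIT = "partial_take_profit"
    FULL_CLOSE = "full_close"


@dataclass(frozen=True)
class OrderEvent:
    status: str       # broker-reported status, e.g. "Filled" or "Cancelled"
    filled_qty: int   # contracts actually filled
    avg_fill: float   # average fill price


def is_real_fill(ev: OrderEvent) -> bool:
    """Believe the fill numbers, not the status string."""
    return ev.filled_qty > 0 and ev.avg_fill > 0


def route_fill(ev: OrderEvent, exit_kind: Optional[ExitKind]) -> str:
    """Return a (hypothetical) handler label for an order event.

    exit_kind is None for entry orders and set explicitly for exits.
    """
    if not is_real_fill(ev):
        return "ignore"             # nothing filled: a genuine cancel
    if exit_kind is None:
        return "entry_fill"         # may arrive with status 'Cancelled'
    if exit_kind is ExitKind.PARTIAL_TAKE_PROFIT:
        return "partial_exit_fill"  # position stays open with fewer contracts
    return "full_exit_fill"
```

For example, a 'Cancelled' event that carries two filled contracts is routed as an entry fill, while the same status with zero quantity is ignored:

```python
route_fill(OrderEvent("Cancelled", 2, 1.35), None)   # "entry_fill"
route_fill(OrderEvent("Cancelled", 0, 0.0), None)    # "ignore"
```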
All three were found in live or paper sessions, none by the original unit tests — and each now exists as a named regression test that will fail loudly if the behavior regresses.
The lesson
An invariant that lives in a comment is a wish. The ones that hold in production are the ones a state machine, a type, or a test can refuse — and the cheapest time to add the next one is the same afternoon an incident shows you where it was missing.