fix: Red-team schema should document snapshot-isolation methodology for lm_eth fields (#1083)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
johba 2026-03-24 20:17:20 +00:00
parent 46998ac1bf
commit 7d58490dcd


@@ -265,6 +265,38 @@ Records one adversarial red-team run against a candidate optimizer.
| `attacks[].delta_bps` | integer | LM ETH change in basis points |
| `attacks[].insight` | string | Key finding from this strategy |
### Snapshot-Isolation Methodology
All red-team runs use **snapshot isolation** as the standard methodology. This
ensures that each attack is evaluated independently against the same initial
state, rather than against a cumulative balance modified by prior attacks.
**How it works:**
1. Before the first attack, the test runner records the initial `lm_eth_before`
value and takes an Anvil snapshot via `vm.snapshot()`.
2. Each attack executes against this snapshot: run the attack, measure
`lm_eth_after`, compute `delta_bps`, then revert to the snapshot via
`vm.revertTo()`.
3. The next attack begins from the same snapshotted chain state that the
previous attack started from, not from the state that attack left behind.
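
The loop described in the steps above can be sketched as a small simulation. This is a minimal illustration only: `ToyChain`, `make_drain`, `run_isolated`, and the `strategy` key are hypothetical names, and the in-memory dictionary stands in for real chain state rather than for Anvil or the `vm.snapshot()` / `vm.revertTo()` cheatcodes.

```python
import copy
from typing import Callable, Dict, List


class ToyChain:
    """Hypothetical in-memory stand-in for chain state (not Anvil)."""

    def __init__(self, lm_eth: int) -> None:
        self.state: Dict[str, int] = {"lm_eth": lm_eth}
        self._snapshots: List[Dict[str, int]] = []

    def snapshot(self) -> int:
        # Store a copy of the current state and return its id.
        self._snapshots.append(copy.deepcopy(self.state))
        return len(self._snapshots) - 1

    def revert_to(self, snapshot_id: int) -> None:
        # Restore the state captured by snapshot(); the snapshot is kept.
        self.state = copy.deepcopy(self._snapshots[snapshot_id])


def make_drain(amount: int) -> Callable[[ToyChain], None]:
    """Hypothetical attack that removes `amount` from the LM balance."""
    def attack(chain: ToyChain) -> None:
        chain.state["lm_eth"] -= amount
    return attack


def run_isolated(
    chain: ToyChain, attacks: Dict[str, Callable[[ToyChain], None]]
) -> List[Dict[str, object]]:
    lm_eth_before = chain.state["lm_eth"]  # step 1: record the initial value
    snap = chain.snapshot()                # step 1: take the snapshot
    results: List[Dict[str, object]] = []
    for name, attack in attacks.items():
        attack(chain)                      # step 2: run the attack
        results.append({
            "strategy": name,
            "lm_eth_before": lm_eth_before,
            "lm_eth_after": chain.state["lm_eth"],
        })
        chain.revert_to(snap)              # step 2: revert before the next attack
    return results


results = run_isolated(ToyChain(1000), {"a": make_drain(5), "b": make_drain(10)})
```

In this sketch the second attack measures its result against the initial balance of 1000, not against the 995 left behind by the first attack, which is the property the methodology is meant to guarantee.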
**Field semantics under snapshot isolation:**
| Field | Semantics |
|-------|-----------|
| `lm_eth_before` | LM total ETH at the shared initial snapshot — identical for every attack in the run |
| `lm_eth_after` | LM total ETH measured after this specific attack, before reverting |
| `attacks[].delta_bps` | Change relative to the shared `lm_eth_before`, not relative to any prior attack |
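
As a rough illustration of the `delta_bps` semantics in the table, the value could be derived from the two balances as below. The schema only states that `delta_bps` is an integer change in basis points, so the rounding rule used here (truncation toward zero) and the function name are assumptions, not the documented behaviour of the test runner.

```python
def delta_bps(lm_eth_before: int, lm_eth_after: int) -> int:
    """Change of LM ETH relative to the shared initial balance, in basis points.

    Assumption: fractional basis points are truncated toward zero.
    """
    if lm_eth_before <= 0:
        raise ValueError("lm_eth_before must be positive")
    scaled = (lm_eth_after - lm_eth_before) * 10_000
    magnitude = abs(scaled) // lm_eth_before  # integer math, safe for wei-sized values
    return magnitude if scaled >= 0 else -magnitude


# Every attack in a run is compared against the same lm_eth_before.
shared_before = 1_000 * 10**18
loss = delta_bps(shared_before, 995 * 10**18)
gain = delta_bps(shared_before, 1_002 * 10**18)
```

Integer arithmetic is used instead of floats because balances expressed in wei exceed the range in which floating-point values stay exact.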
**Key implications:**
- `lm_eth_before` and `lm_eth_after` reflect **per-attack state**, not
cumulative historical balance. Each attack sees the same starting ETH.
- Attack results are independent and order-insensitive — reordering attacks does
not change any individual `delta_bps` value.
- The top-level `lm_eth_after` and `eth_extracted` fields reflect the
worst-case single attack, not a sum of all attacks.
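
The last two implications can be sketched together. The helper below is hypothetical, and it assumes that "worst case" means the attack leaving the lowest `lm_eth_after` and that `eth_extracted` is the difference between `lm_eth_before` and that value; neither formula is spelled out in the schema.

```python
from typing import Dict


def summarize(lm_eth_before: int, after_by_attack: Dict[str, int]) -> Dict[str, int]:
    """Derive top-level fields from independent per-attack measurements.

    Assumption: the worst case is the single attack with the lowest
    lm_eth_after, and eth_extracted is measured for that attack alone.
    """
    worst_after = min(after_by_attack.values())
    return {
        "lm_eth_before": lm_eth_before,
        "lm_eth_after": worst_after,
        "eth_extracted": lm_eth_before - worst_after,
    }


measurements = {"sandwich": 995, "flash_loan": 990, "dust": 1000}
summary = summarize(1000, measurements)

# Reordering the attacks leaves the summary unchanged, because each
# measurement was taken against the same starting state.
reordered = dict(reversed(list(measurements.items())))
same_summary = summarize(1000, reordered)
```

Here `eth_extracted` is 10 (the single worst attack), not 15 (the sum of all losses), which matches the non-cumulative reading described above.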
---
## Schema: `holdout/YYYY-MM-DD-prNNN.json`