fix: Red-team schema should add candidate_commit field (#1066) (#1075)

Fixes #1066

## Changes

**`evidence/README.md`**
- Added `"candidate_commit": "abc1234"` to the red-team schema JSON example
- Added `candidate_commit | string | Git commit SHA of the optimizer under test` row to the field table
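A consumer of the evidence files could check for the new field along these lines. This is a minimal sketch, not part of this commit: the helper name `missing_fields` and the sample record are hypothetical, and the field list only covers the fields named in this change (the README schema may define more).

```python
import json

# Fields this commit names for the red-team evidence record.
# Assumption: the full README schema may list additional fields.
REQUIRED_FIELDS = (
    "date", "candidate", "candidate_commit", "optimizer_profile",
    "lm_eth_before", "lm_eth_after", "eth_extracted",
    "floor_held", "verdict", "attacks",
)


def missing_fields(record: dict) -> list:
    """Return the required field names that are absent from a record."""
    return [name for name in REQUIRED_FIELDS if name not in record]


# Hypothetical partial record, used only to exercise the helper.
sample = json.loads(
    '{"date": "2026-03-21", "candidate": "OptimizerV3", "candidate_commit": "abc1234"}'
)
print(missing_fields(sample))
```

An empty list means the record carries every field listed above, including `candidate_commit`.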

**`scripts/harb-evaluator/red-team.sh`**
- Captures `CANDIDATE_COMMIT` from `git rev-parse HEAD` at startup (alongside existing `CANDIDATE_NAME`/`OPTIMIZER_PROFILE`)
- Added a new step (9a-pre) that writes `evidence/red-team/YYYY-MM-DD.json` at the end of each run, including `candidate_commit` plus all other schema fields (`candidate`, `optimizer_profile`, `lm_eth_before`, `lm_eth_after`, `eth_extracted`, `floor_held`, `verdict`, `attacks`)
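The verdict fields written by step 9a-pre can be sketched in isolation as below. The function name `summarize_run` is hypothetical, and the sketch assumes the floor counts as broken when the ending LM balance is lower than the starting one; the script's actual comparison is not fully visible in this diff.

```python
def summarize_run(lm_eth_before: int, lm_eth_after: int) -> dict:
    """Derive floor_held, verdict, and eth_extracted from two wei balances."""
    # Assumption: a run "breaks" the floor when LM ETH decreased.
    broke = lm_eth_after < lm_eth_before
    return {
        "floor_held": not broke,
        "verdict": "floor_broken" if broke else "floor_held",
        # eth_extracted is reported as 0 whenever the floor held.
        "eth_extracted": lm_eth_before - lm_eth_after if broke else 0,
    }


# Balances taken from the README schema example (values in wei).
result = summarize_run(1000000000000000000000, 998500000000000000000)
print(result)
```

Python integers are arbitrary precision, so wei amounts beyond 64 bits subtract exactly.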

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/1075
Reviewed-by: Disinto_bot <disinto_bot@noreply.codeberg.org>
johba 2026-03-21 13:47:13 +01:00
parent 636ba989ee
commit 46e928ea97
3 changed files with 78 additions and 1 deletions


@@ -50,4 +50,4 @@
- [2026-03-15] txnBot AGENTS.md ENVIRONMENT enum is stale (#784)
- [2026-03-20] Adoption milestone state ambiguity in MEMORY.md (#1068)
- [2026-03-20] OptimizerV3Push3 as IOptimizer always returns bear defaults — integration risk (#1063)
- [2026-03-21] Optimizer and OptimizerV3 lack _disableInitializers() in constructor (#1055) - [2026-03-20] Red-team schema should add candidate_commit field (#1066)


@@ -82,6 +82,7 @@ Records one adversarial red-team run against a candidate optimizer.
{
  "date": "YYYY-MM-DD",
  "candidate": "OptimizerV3",
  "candidate_commit": "abc1234",
  "optimizer_profile": "push3-default",
  "lm_eth_before": 1000000000000000000000,
  "lm_eth_after": 998500000000000000000,
@@ -104,6 +105,7 @@ Records one adversarial red-team run against a candidate optimizer.
|-------|------|-------------|
| `date` | string (ISO) | Date of the run |
| `candidate` | string | Optimizer under test |
| `candidate_commit` | string | Git commit SHA of the optimizer under test |
| `optimizer_profile` | string | Named profile / push3 variant |
| `lm_eth_before` | integer (wei) | LM total ETH at start |
| `lm_eth_after` | integer (wei) | LM total ETH at end |


@@ -34,6 +34,7 @@ DEPLOYMENTS="$REPO_ROOT/onchain/deployments-local.json"
# ── Candidate metadata (set by red-team-sweep.sh; defaults to unknown for standalone runs) ─
CANDIDATE_NAME="${CANDIDATE_NAME:-unknown}"
OPTIMIZER_PROFILE="${OPTIMIZER_PROFILE:-unknown}"
CANDIDATE_COMMIT="$(git -C "$REPO_ROOT" rev-parse HEAD 2>/dev/null || echo "unknown")"
# ── Anvil accounts ─────────────────────────────────────────────────────────────
# Account 8 — adversary (10k ETH, 0 KRK)
@@ -754,6 +755,80 @@ if python3 -c "import sys; sys.exit(0 if int('${LM_ETH_AFTER:-0}') < int('${LM_E
  BROKE=true
fi
# ── 9a-pre. Write structured evidence JSON ──────────────────────────────────
EVIDENCE_DIR="$REPO_ROOT/evidence/red-team"
EVIDENCE_DATE=$(date -u +%Y-%m-%d)
EVIDENCE_FILE="$EVIDENCE_DIR/$EVIDENCE_DATE.json"
mkdir -p "$EVIDENCE_DIR"
if [[ "$BROKE" == "true" ]]; then
  _verdict="floor_broken"
  _floor_held="false"
  _eth_extracted=$(python3 -c "print(int('${LM_ETH_BEFORE:-0}') - int('${LM_ETH_AFTER:-0}'))")
else
  _verdict="floor_held"
  _floor_held="true"
  _eth_extracted=0
fi
python3 - "$EVIDENCE_FILE" "$REPO_ROOT/tmp/red-team-memory.jsonl" \
  "$EVIDENCE_DATE" "$CANDIDATE_NAME" "$CANDIDATE_COMMIT" "$OPTIMIZER_PROFILE" \
  "$LM_ETH_BEFORE" "$LM_ETH_AFTER" "$_eth_extracted" "$_floor_held" "$_verdict" <<'PYEOF'
import json, sys, os
evidence_file = sys.argv[1]
memory_file = sys.argv[2]
date = sys.argv[3]
candidate = sys.argv[4]
candidate_commit = sys.argv[5]
optimizer_profile = sys.argv[6]
lm_eth_before = int(sys.argv[7]) if sys.argv[7].isdigit() else 0
lm_eth_after = int(sys.argv[8]) if sys.argv[8].isdigit() else 0
eth_extracted = int(sys.argv[9]) if sys.argv[9].isdigit() else 0
floor_held = sys.argv[10].lower() == "true"
verdict = sys.argv[11]
# Build attacks list from memory entries for this candidate
attacks = []
if os.path.isfile(memory_file) and os.path.getsize(memory_file) > 0:
    with open(memory_file) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                e = json.loads(line)
                if e.get("candidate") != candidate:
                    continue
                attacks.append({
                    "strategy": e.get("strategy", ""),
                    "pattern": e.get("pattern", ""),
                    "result": e.get("result", "HELD"),
                    "delta_bps": e.get("delta_bps", 0),
                    "insight": e.get("insight", ""),
                })
            except Exception:
                pass
evidence = {
    "date": date,
    "candidate": candidate,
    "candidate_commit": candidate_commit,
    "optimizer_profile": optimizer_profile,
    "lm_eth_before": lm_eth_before,
    "lm_eth_after": lm_eth_after,
    "eth_extracted": eth_extracted,
    "floor_held": floor_held,
    "verdict": verdict,
    "attacks": attacks,
}
with open(evidence_file, "w") as f:
    json.dump(evidence, f, indent=2)
    f.write("\n")
print(f" Evidence written to {evidence_file}")
PYEOF
log "Evidence file: $EVIDENCE_FILE"
if [[ "$BROKE" == "true" ]]; then
  DELTA=$(python3 -c "print(int('${LM_ETH_BEFORE:-0}') - int('${LM_ETH_AFTER:-0}'))")
  log " RESULT: ETH EXTRACTED ❌"