harb/scripts/harb-evaluator at 23d460542be2fd24f750f6379d3002645594561e - johba/harb

openhands 23d460542b fix: feat: Red-team agent runner — adversarial floor attack (#520)	2026-03-09 03:28:10 +00:00

Adds scripts/harb-evaluator/red-team.sh which:
- Verifies the Anvil stack is running and deployments exist
- Grants recenterAccess to account 2 (impersonating feeDestination)
- Takes an Anvil snapshot as the clean baseline
- Computes ethPerToken before the agent run (mirrors floor.ts logic)
- Builds a self-contained prompt with contract addresses, account keys, protocol mechanics, copy-paste cast command patterns, snapshot/revert instructions, and structured rules for the agent
- Spawns `claude -p --dangerously-skip-permissions` with a 2-hour timeout
- Captures output to tmp/red-team-report.txt
- Computes ethPerToken after the agent run and reports pass/fail

Exit code 0 = floor held, exit code 1 = floor broken, exit code 2 = infra error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
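
The pass/fail step and the three exit codes described in the commit message could be sketched roughly as follows. This is an illustration only, not the contents of red-team.sh: the helper names `strip_zeros` and `floor_verdict` are hypothetical, and it assumes that "floor held" means the ethPerToken value after the run is at least the value before it, with both values passed as non-negative integer strings.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real logic lives in red-team.sh and floor.ts.

# Hypothetical helper: drop leading zeros so decimal strings compare by length.
strip_zeros() {
  local s="$1"
  s="${s#"${s%%[!0]*}"}"
  printf '%s\n' "${s:-0}"
}

# Hypothetical helper: floor_verdict BEFORE AFTER
# Prints 0 if the floor held (assumed to mean AFTER >= BEFORE),
# 1 if the floor broke, and 2 if either value is missing or not
# a non-negative integer (treated here as an infra error).
floor_verdict() {
  local before="${1:-}" after="${2:-}"
  if ! [[ "$before" =~ ^[0-9]+$ && "$after" =~ ^[0-9]+$ ]]; then
    echo 2
    return 0
  fi
  before="$(strip_zeros "$before")"
  after="$(strip_zeros "$after")"
  # Compare as strings so values wider than 64 bits do not overflow
  # bash integer arithmetic.
  if (( ${#after} != ${#before} )); then
    if (( ${#after} > ${#before} )); then echo 0; else echo 1; fi
  elif [[ "$after" < "$before" ]]; then
    echo 1
  else
    echo 0
  fi
}
```

A runner following this pattern might end with `exit "$(floor_verdict "$before" "$after")"`, so that callers such as CI can branch on the exit code alone.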
helpers	fix: address review findings in stake-rpc.ts (#518)	2026-03-09 02:48:51 +00:00
scenarios/passive-confidence	fix: correct buyKrk call sites for new opts param, add eslint-disable for polling loop	2026-03-05 05:53:19 +00:00
evaluate.sh	fix: wait_healthy does not fail fast when a service exits or crashes during the health-check window (#387)	2026-03-06 11:20:54 +00:00
holdout.config.ts	fix: address PR #438 review findings	2026-03-04 08:20:11 +00:00
red-team.sh	fix: feat: Red-team agent runner — adversarial floor attack (#520)	2026-03-09 03:28:10 +00:00