- red-team-sweep.sh: reset CROSS_PATTERNS_FILE at sweep start to prevent
stale patterns from prior invocations contaminating a fresh run
- red-team-sweep.sh: wrap pattern-extraction Python in set +e/set -e and
capture output so log() prefix is applied; move memory truncation outside
the if-block so it runs unconditionally even if Python fails
- red-team.sh: filter entries where candidate == current_candidate before
grouping, removing self-referential cross-candidate evidence
- red-team.sh: skip entries with empty pattern key (both pattern and
strategy fields empty) to prevent spurious bucket merging
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- red-team-sweep.sh: after each candidate completes, extract all memory
entries into /tmp/red-team-cross-patterns.jsonl (append), then clear
the raw memory file so the next candidate starts with a fresh state
- red-team.sh: define CROSS_PATTERNS_FILE; before building the prompt,
read the cross-patterns file and generate a "Cross-Candidate
Intelligence" section grouped by abstract op pattern — universal
patterns (broke 2+ candidates), candidate-specific wins, and patterns
that held everywhere — each annotated with optimizer profiles
- The new section is injected into the Claude prompt above the existing
Previous Findings block, satisfying all acceptance criteria
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- make_pattern: replace text.find('stake')/find('unstake') with
re.search(r'\bstake\b')/re.search(r'\bunstake\b') so 'stake' is never
found as a substring of 'unstake' (bug #1)
- make_pattern: track first-occurrence position of each op and sort by
position before building the sequence string, preserving actual
execution order instead of a hardcoded canonical order (bug #2)
- insight capture: track insight_pri on the current dict; only overwrite
stored insight when new match has strictly higher priority (lower index),
preventing a late 'because...' clause from silently replacing an earlier
'Key Insight:' capture (warning #3)
- run_num: compute max(run)+1 from JSON entries instead of wc -l so run
numbers stay monotonically increasing after memory trim (info #4)
- red-team-sweep.sh: also set adaptive flag when any r37-r40 register has
a variable-form assignment (r40 = uint256(someVar)), catching candidates
where only one branch uses constants (warning #5)
- red-team-sweep.sh: remove unnecessary 'import sys as _sys' in except
block; sys is already in scope (nit #6)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add CANDIDATE_NAME and OPTIMIZER_PROFILE env vars to red-team.sh
(defaults to "unknown" for standalone runs)
- Update extract_memory Python: new fields candidate, optimizer_profile,
pattern (abstract op sequence via make_pattern()), and improved insight
extraction that also captures WHY explanations (because/since/due to)
- Update MEMORY_SECTION Python: entries now grouped by candidate;
universal patterns (DECREASED across multiple candidates) surfaced first
- Update prompt: add "Current Attack Target" table with candidate/profile,
optimizer parameter explanations (CI/AW/AS/DD behavioral impact),
Rule 9 requiring pattern+insight per strategy, updated report format
with Pattern/Insight fields and universal-pattern conclusion field
- Update red-team-sweep.sh: after inject, parse OptimizerV3Push3.sol for
r40/r39/r38/r37 constants to build OPTIMIZER_PROFILE string; pass
CANDIDATE_NAME and OPTIMIZER_PROFILE as env vars to red-team.sh
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## What
- `tools/push3-transpiler/inject.sh` — shared transpile+inject logic used by both batch-eval and red-team-sweep
- `batch-eval.sh` — replaced inline 60-line Python block with `inject.sh` call
- `scripts/harb-evaluator/red-team-sweep.sh` — red-teams each kindergarten seed using existing `red-team.sh`, with random smoke test gate
## Why
Sweep script kept breaking because I rewrote the injection logic instead of reusing batch-eval's proven Python. Now there's one copy.
## Testing
- inject.sh tested manually on DO box with optimizer_v3 seed
- Smoke test picks random seed, injects + compiles before starting sweep
Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/806
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
FEE_DEST is now a keccak-derived address with zero ETH balance.
anvil_impersonateAccount succeeds but cast send fails on gas deduction.
Add anvil_setBalance before impersonation, matching the same fix
already applied in red-team.sh.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
DeployLocal.sol changed feeDest to keccak256('harb.local.feeDest') =
0x8A9145E1Ea4C4d7FB08cF1011c8ac1F0e10F9383 but bootstrap-common.sh
still had the old address 0xf6a3eef9088A255c32b6aD2025f83E57291D9011.
Mismatch caused setRecenterAccess to revert (impersonating wrong address).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- DeployBase.sol: remove broken inline second recenter() (would always
revert with 'recenter cooldown' in same Forge broadcast); replace with
operator instructions to run the new BootstrapVWAPPhase2.s.sol script
at least 60 s after deployment
- BootstrapVWAPPhase2.s.sol: new script for the second VWAP bootstrap
recenter on Base mainnet deployments
- StrategyExecutor.sol: update stale docstring that still described the
removed recenterAccess bypass; reflect permissionless model with vm.warp
- TestBase.sol: remove vestigial recenterCaller parameter from all four
setupEnvironment* functions (parameter was silently ignored after
setRecenterAccess was removed); update all callers across six test files
- bootstrap-common.sh: fix misleading retry recenter in
seed_application_state() — add evm_increaseTime 61 before evm_mine so
the recenter cooldown actually clears and the retry can succeed
All 210 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
red-team.sh called bare `sudo docker compose up/down` which applies
env_reset and drops FORK_URL before anvil-entrypoint.sh can read it.
Change both calls to `sudo -E` so the caller's FORK_URL override is
propagated to docker-compose and into the anvil container.
Update ENVIRONMENT.md to reflect that a plain `FORK_URL=... bash
red-team.sh` invocation now works correctly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace 0x27F971cb582BF9E50F397e4d29a5C7A34f11faA2 (Base Sepolia
NonfungiblePositionManager) with the correct Base mainnet address
0x03a520B32c04bf3beef7BEb72E919cF822Ed34F3 in all four files that
referenced it, and add an inline comment citing the chain and source.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
vm.warp in forge script --broadcast only affects the local simulation
phase, not the actual Anvil node. The pool.observe([300,0]) call in
recenter() therefore reverted with OLD when Forge pre-flighted the
broadcast transactions on Anvil.
Fix:
- Remove the vm.warp + 2-recenter + SeedSwapper VWAP bootstrap from
DeployLocal.sol (only contract deployment now, simpler and reliable).
- Add bootstrap_vwap() to bootstrap-common.sh that uses Anvil RPC
evm_increaseTime + evm_mine to advance chain time before each recenter,
then executes a 0.5 ETH WETH->KRK seed swap between them.
- Call bootstrap_vwap() before fund_liquidity_manager() in both
containers/bootstrap.sh and ci-bootstrap.sh so the LM is seeded with
thin positions (1 ETH) during bootstrap, ensuring the 0.5 ETH swap
moves the price >400 ticks (amplitude gate).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Track CLAUDE_PID before launching the claude subprocess so cleanup()
can kill it before reverting Anvil state. Running claude via `&` +
`wait` lets the trap fire immediately on INT/TERM, killing the
subprocess and preventing it from making calls against an
already-reverted chain.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cast call with a typed '(uint256)' selector returns output like
'140734553600000 [1.407e14]' — the numeric value followed by a
bracketed scientific-notation annotation. The cast to-dec added in the
previous review-fix commit failed on this annotated string and fell back
to echo "0", making call_recenter() always skip the VWAP-already-
bootstrapped guard and attempt the real recenter() call, which then
reverted with "amplitude not reached".
Fix: drop the cast to-dec normalisation. A plain != "0" string check
is sufficient because cast returns "0" (no annotation) for the zero
case and any non-zero annotated string is also != "0".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SPDX license:
- Restore GPL-3.0-or-later SPDX header to DeployBase.sol (removed by
the em-dash sed fix in an earlier commit).
SeedSwapper deduplication:
- Extract SeedSwapper into onchain/script/DeployCommon.sol — a single
canonical definition shared by both deploy scripts. This eliminates
duplicate Foundry artifacts (previously both DeployLocal.sol and
DeployBase.sol produced a SeedSwapper artifact, causing ambiguity for
verification and coverage tools).
- Remove inline SeedSwapper and redundant IWETH9 import from
DeployLocal.sol and DeployBase.sol; add `import "./DeployCommon.sol"`.
SeedSwapper hardening (in DeployCommon.sol):
- Replace magic-literal price sentinels with named constants
SQRT_PRICE_LIMIT_MIN / SQRT_PRICE_LIMIT_MAX.
- Wrap both weth.transfer() calls with require() so a non-standard
WETH9 false-return is caught rather than silently ignored.
- Add post-swap WETH sweep in executeSeedBuy(): if the price limit is
reached before the full input is spent, the residual WETH balance is
returned to `recipient` instead of being stranded in the contract.
bootstrap-common.sh:
- Normalise cumulativeVolume output through `cast to-dec` before the
string comparison, guarding against a future change in cast output
format (decimal vs hex).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
run_forge_script() was piping all output to LOG_FILE (which is /dev/null
in CI), so forge failures were completely silent. Capture output to a
temp file and print to stderr on failure so the CI log shows the actual
error message.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adding SeedSwapper alongside DeployLocal in the same .sol file caused
forge to error "Multiple contracts in the target path" when no --tc flag
was specified, silently failing the CI bootstrap step.
Add --tc DeployLocal to all forge script invocations of DeployLocal.sol:
- scripts/bootstrap-common.sh (CI / local bootstrap)
- tools/deploy-optimizer.sh (manual deploy tool)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
DeployLocal.sol now calls recenter() twice during the VWAP bootstrap
sequence (issue #567 fix). The second recenter leaves ANCHOR positions
at the post-seed-buy tick, so the CI bootstrap's subsequent call_recenter()
failed with "amplitude not reached" (currentTick == anchorCenterTick,
amplitude = 0 < 400).
Fix: before calling recenter(), check cumulativeVolume(). If it is
already > 0, the deploy script has placed positions and bootstrapped
VWAP -- skip the redundant recenter rather than failing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- LiquidityManager.setFeeDestination: add CREATE2 bypass guard — also
blocks re-assignment when the current feeDestination has since acquired
bytecode (was a plain address when set, contract deployed to it later)
- LiquidityManager.setFeeDestination: expand NatSpec to document the
EOA-mutability trade-off and the CREATE2 guard explicitly
- Test: add testSetFeeDestinationEOAToContract_Locks covering the
realistic EOA→contract transition (the primary lock-activation path)
- red-team.sh: add comment that DEPLOYER_PK is Anvil account-0 and must
only be used against a local ephemeral Anvil instance
- ARCHITECTURE.md: document feeDestination conditional-lock semantics and
contrast with Kraiken's strictly set-once liquidityManager/stakingPool
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add `feeDestinationLocked` bool to LiquidityManager
- Replace one-shot setter with conditional trapdoor: EOAs may be set
repeatedly, but setting a contract address locks permanently
- Remove `AddressAlreadySet` error (superseded by the new lock mechanic)
- Replace fragile SLOT7 storage hack in red-team.sh with a proper
`setFeeDestination()` call using the deployer key
- Update tests: replace AddressAlreadySet test with three new tests
covering EOA multi-set, contract lock, and locked revert
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the ethPerToken metric (free balance / adjusted supply) with total
LM ETH (free + WETH + position-locked) using a forge script with exact
Uni V3 integer math. Collapses 4+ RPC calls and Python float approximation
into a single forge script call using LiquidityAmounts + TickMath.
Also updates red-team prompt, report format, memory extraction, and adds
roadmap items for #536-#538 (backtesting pipeline, Push3 evolution).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add AttackRunner.s.sol: structured forge script that reads attack ops from a
JSONL file (ATTACK_FILE env), executes them against the local Anvil deployment,
and emits full state snapshots (tick, positions, VWAP, optimizer output,
adversary balances) as JSON lines after every recenter and at start/end.
- Add 5 canonical attack files in onchain/script/backtesting/attacks/:
* il-crystallization-15.jsonl — 15 buy-recenter cycles + sell (extraction)
* il-crystallization-80.jsonl — 80 buy-recenter cycles + sell (extraction)
* fee-drain-oscillation.jsonl — buy-recenter-sell-recenter oscillation
* round-trip-safe.jsonl — 20 full round-trips (regression: safe)
* staking-safe.jsonl — staking manipulation (regression: safe)
- Add scripts/harb-evaluator/export-attacks.py: parses red-team-stream.jsonl
for tool_use Bash blocks containing cast send commands and converts them to
AttackRunner-compatible JSONL (buy/sell/recenter/stake/unstake/mint_lp/burn_lp).
- Update scripts/harb-evaluator/red-team.sh: after each agent run, automatically
exports the attack sequence via export-attacks.py and replays it with
AttackRunner to capture structured snapshots in tmp/red-team-snapshots.jsonl.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Move snapshot to after setRecenterAccess so agent reverts restore
recenterAccess for account 2 on every retry
- Read feeDestination() dynamically from LM (removes hardcoded constant)
and add || die guards on impersonation calls
- Add EXIT/INT/TERM cleanup trap that reverts to the baseline snapshot
- Fix agent floor-check snippet: add FEE_DEST/FEE_BAL reads so formula
matches compute_eth_per_token (adj=s-f-k, not adj=s-k)
- Use `timeout "$CLAUDE_TIMEOUT"` to enforce wall-clock process limit
- Correct taxRateIndex range: 0-29 (30-element TAX_RATES array)
- Fix outstandingSupply() description: excludes LM-held KRK, not all KRK
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds scripts/harb-evaluator/red-team.sh which:
- Verifies the Anvil stack is running and deployments exist
- Grants recenterAccess to account 2 (impersonating feeDestination)
- Takes an Anvil snapshot as the clean baseline
- Computes ethPerToken before the agent run (mirrors floor.ts logic)
- Builds a self-contained prompt with contract addresses, account keys,
protocol mechanics, copy-paste cast command patterns, snapshot/revert
instructions, and structured rules for the agent
- Spawns `claude -p --dangerously-skip-permissions` with a 2-hour timeout
- Captures output to tmp/red-team-report.txt
- Computes ethPerToken after the agent run and reports pass/fail
Exit code 0 = floor held, exit code 1 = floor broken, exit code 2 = infra error.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove dead krkAddress field from UnstakeRpcConfig (bug)
- Drop swap.js import to avoid transitive Playwright dependency; fix
header comment to accurately describe the module boundary (warning)
- Inline pollReceipt() returning TxReceipt so snatch receipt is reused
for log parsing without a second round-trip (warning)
- Use ZeroAddress from ethers instead of manual constant (info)
- Add comment on fromBlock '0x0' genesis-scan caveat (info)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace waitForLoadState('networkidle') in the post-login redirect with
waitForURL('**/app/stake**'). Persistent WebSocket connections prevent
networkidle from ever firing, mirroring the same fix applied to navigate.ts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Clarifies that the event-driven engineering principles apply to infrastructure (Docker, scripts, startup/teardown) and test/scenario execution — NOT frontend HTTP API polling.
Frontend polling (e.g. LiveStats → Ponder GraphQL every 30s) is fine. The scalability solution is caching at the proxy layer (`Cache-Control` headers via Caddy), not WebSocket subscriptions.
Relates to #447
Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/470