The extract_memory regex previously matched any "lm.?eth" mention,
including mid-execution "Total LM ETH: X wei" output lines produced by
the agent's cast check commands. During a staking step these lines
reflect an intermediate chain state (ETH temporarily locked/moved)
rather than the final reverted state, causing strategies to be recorded
as DECREASED even when the runner confirmed ETH_SAFE.
Fix: narrow the capture to the structured `lm_eth_after: <value>`
label that the agent writes in its final RED-TEAM REPORT block.
Mid-execution total-ETH lines no longer match and cannot corrupt the
per-strategy result in memory or the cross-patterns file.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Move CROSS_PATTERNS_FILE from /tmp/red-team-cross-patterns.jsonl to
tools/red-team/cross-patterns.jsonl (repo-tracked path)
- Remove the reset (> file) at sweep start so patterns accumulate across runs
- Generate a SWEEP_ID (sweep-YYYYMMDD-HHMMSS) at sweep start and stamp
each new entry with sweep_id for traceability
- Deduplicate on (pattern, candidate, result): entries already present in
the file are skipped; intra-batch duplicates are also suppressed
- Create tools/red-team/ directory with .gitkeep
- Add mkdir -p guards in both scripts so the directory is created on first run
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
bootstrap-light.sh now extracts the Uniswap V3 pool address from
DeployLocal.sol deploy output and writes both Pool and V3Factory
(Base Sepolia: 0x4752ba5DBc23f44D87826276BF6Fd6b1C372aD24) into
deployments-local.json alongside the existing contract addresses.
red-team.sh now reads V3_FACTORY and POOL from deployments-local.json
instead of hardcoding the Base mainnet factory address
(0x33128a8fC17869897dcE68Ed026d694621f6FDfD), and removes the getPool()
RPC call that always failed with "contract does not have any code" on
the Sepolia fork.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Document MIN_RECENTER_INTERVAL (60 s, LiquidityManager.sol:61) and
PRICE_STABILITY_INTERVAL (300 s, PriceOracle.sol:14) in
docs/ARCHITECTURE.md and docs/PRODUCT-TRUTH.md so that agent-facing
and product-facing copy stays traceable to source constants.
Add an inline HTML comment in red-team-program.md next to the
hardcoded 60s/300s sentence pointing to the two source constants,
making drift detectable during code review.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Inject Kraiken.sol (outstandingSupply, mint/burn mechanics) and Stake.sol
(snatch, withdrawal, KRK exclusion from floor denominator) into the red-team
agent prompt so agents can reason from actual source rather than guesses.
- red-team.sh: read SOL_KRAIKEN and SOL_STAKE from onchain/src/ alongside
the other six contracts already injected
- red-team-program.md: add ### Kraiken.sol and ### Stake.sol sections in the
Source Code reference block (after PriceOracle.sol)
- AGENTS.md: document the full list of injected contracts in a new
"Red-team Agent Context" section; both files are now listed as in-scope
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- red-team-sweep.sh: reset CROSS_PATTERNS_FILE at sweep start to prevent
stale patterns from prior invocations contaminating a fresh run
- red-team-sweep.sh: wrap pattern-extraction Python in set +e/set -e and
capture output so log() prefix is applied; move memory truncation outside
the if-block so it runs unconditionally even if Python fails
- red-team.sh: filter entries where candidate == current_candidate before
grouping, removing self-referential cross-candidate evidence
- red-team.sh: skip entries with empty pattern key (both pattern and
strategy fields empty) to prevent spurious bucket merging
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- red-team-sweep.sh: after each candidate completes, extract all memory
entries into /tmp/red-team-cross-patterns.jsonl (append), then clear
the raw memory file so the next candidate starts with a fresh state
- red-team.sh: define CROSS_PATTERNS_FILE; before building the prompt,
read the cross-patterns file and generate a "Cross-Candidate
Intelligence" section grouped by abstract op pattern — universal
patterns (broke 2+ candidates), candidate-specific wins, and patterns
that held everywhere — each annotated with optimizer profiles
- The new section is injected into the Claude prompt above the existing
Previous Findings block, satisfying all acceptance criteria
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- make_pattern: replace text.find('stake')/find('unstake') with
re.search(r'\bstake\b')/re.search(r'\bunstake\b') so 'stake' is never
found as a substring of 'unstake' (bug #1)
- make_pattern: track first-occurrence position of each op and sort by
position before building the sequence string, preserving actual
execution order instead of a hardcoded canonical order (bug #2)
- insight capture: track insight_pri on the current dict; only overwrite
stored insight when new match has strictly higher priority (lower index),
preventing a late 'because...' clause from silently replacing an earlier
'Key Insight:' capture (warning #3)
- run_num: compute max(run)+1 from JSON entries instead of wc -l so run
numbers stay monotonically increasing after memory trim (info #4)
- red-team-sweep.sh: also set adaptive flag when any r37-r40 register has
a variable-form assignment (r40 = uint256(someVar)), catching candidates
where only one branch uses constants (warning #5)
- red-team-sweep.sh: remove unnecessary 'import sys as _sys' in except
block; sys is already in scope (nit #6)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add CANDIDATE_NAME and OPTIMIZER_PROFILE env vars to red-team.sh
(defaults to "unknown" for standalone runs)
- Update extract_memory Python: new fields candidate, optimizer_profile,
pattern (abstract op sequence via make_pattern()), and improved insight
extraction that also captures WHY explanations (because/since/due to)
- Update MEMORY_SECTION Python: entries now grouped by candidate;
universal patterns (DECREASED across multiple candidates) surfaced first
- Update prompt: add "Current Attack Target" table with candidate/profile,
optimizer parameter explanations (CI/AW/AS/DD behavioral impact),
Rule 9 requiring pattern+insight per strategy, updated report format
with Pattern/Insight fields and universal-pattern conclusion field
- Update red-team-sweep.sh: after inject, parse OptimizerV3Push3.sol for
r40/r39/r38/r37 constants to build OPTIMIZER_PROFILE string; pass
CANDIDATE_NAME and OPTIMIZER_PROFILE as env vars to red-team.sh
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## What
- `tools/push3-transpiler/inject.sh` — shared transpile+inject logic used by both batch-eval and red-team-sweep
- `batch-eval.sh` — replaced inline 60-line Python block with `inject.sh` call
- `scripts/harb-evaluator/red-team-sweep.sh` — red-teams each kindergarten seed using existing `red-team.sh`, with random smoke test gate
## Why
Sweep script kept breaking because I rewrote the injection logic instead of reusing batch-eval's proven Python. Now there's one copy.
## Testing
- inject.sh tested manually on DO box with optimizer_v3 seed
- Smoke test picks random seed, injects + compiles before starting sweep
Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/806
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
FEE_DEST is now a keccak-derived address with zero ETH balance.
anvil_impersonateAccount succeeds but cast send fails on gas deduction.
Add anvil_setBalance before impersonation, matching the same fix
already applied in red-team.sh.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
DeployLocal.sol changed feeDest to keccak256('harb.local.feeDest') =
0x8A9145E1Ea4C4d7FB08cF1011c8ac1F0e10F9383 but bootstrap-common.sh
still had the old address 0xf6a3eef9088A255c32b6aD2025f83E57291D9011.
Mismatch caused setRecenterAccess to revert (impersonating wrong address).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- DeployBase.sol: remove broken inline second recenter() (would always
revert with 'recenter cooldown' in same Forge broadcast); replace with
operator instructions to run the new BootstrapVWAPPhase2.s.sol script
at least 60 s after deployment
- BootstrapVWAPPhase2.s.sol: new script for the second VWAP bootstrap
recenter on Base mainnet deployments
- StrategyExecutor.sol: update stale docstring that still described the
removed recenterAccess bypass; reflect permissionless model with vm.warp
- TestBase.sol: remove vestigial recenterCaller parameter from all four
setupEnvironment* functions (parameter was silently ignored after
setRecenterAccess was removed); update all callers across six test files
- bootstrap-common.sh: fix misleading retry recenter in
seed_application_state() — add evm_increaseTime 61 before evm_mine so
the recenter cooldown actually clears and the retry can succeed
All 210 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
red-team.sh called bare `sudo docker compose up/down` which applies
env_reset and drops FORK_URL before anvil-entrypoint.sh can read it.
Change both calls to `sudo -E` so the caller's FORK_URL override is
propagated to docker-compose and into the anvil container.
Update ENVIRONMENT.md to reflect that a plain `FORK_URL=... bash
red-team.sh` invocation now works correctly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace 0x27F971cb582BF9E50F397e4d29a5C7A34f11faA2 (Base Sepolia
NonfungiblePositionManager) with the correct Base mainnet address
0x03a520B32c04bf3beef7BEb72E919cF822Ed34F3 in all four files that
referenced it, and add an inline comment citing the chain and source.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
vm.warp in forge script --broadcast only affects the local simulation
phase, not the actual Anvil node. The pool.observe([300,0]) call in
recenter() therefore reverted with OLD when Forge pre-flighted the
broadcast transactions on Anvil.
Fix:
- Remove the vm.warp + 2-recenter + SeedSwapper VWAP bootstrap from
DeployLocal.sol (only contract deployment now, simpler and reliable).
- Add bootstrap_vwap() to bootstrap-common.sh that uses Anvil RPC
evm_increaseTime + evm_mine to advance chain time before each recenter,
then executes a 0.5 ETH WETH->KRK seed swap between them.
- Call bootstrap_vwap() before fund_liquidity_manager() in both
containers/bootstrap.sh and ci-bootstrap.sh so the LM is seeded with
thin positions (1 ETH) during bootstrap, ensuring the 0.5 ETH swap
moves the price >400 ticks (amplitude gate).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Track CLAUDE_PID before launching the claude subprocess so cleanup()
can kill it before reverting Anvil state. Running claude via `&` +
`wait` lets the trap fire immediately on INT/TERM, killing the
subprocess and preventing it from making calls against an
already-reverted chain.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cast call with a typed '(uint256)' selector returns output like
'140734553600000 [1.407e14]' — the numeric value followed by a
bracketed scientific-notation annotation. The cast to-dec added in the
previous review-fix commit failed on this annotated string and fell back
to echo "0", making call_recenter() always skip the VWAP-already-
bootstrapped guard and attempt the real recenter() call, which then
reverted with "amplitude not reached".
Fix: drop the cast to-dec normalisation. A plain != "0" string check
is sufficient because cast returns "0" (no annotation) for the zero
case and any non-zero annotated string is also != "0".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SPDX license:
- Restore GPL-3.0-or-later SPDX header to DeployBase.sol (removed by
the em-dash sed fix in an earlier commit).
SeedSwapper deduplication:
- Extract SeedSwapper into onchain/script/DeployCommon.sol — a single
canonical definition shared by both deploy scripts. This eliminates
duplicate Foundry artifacts (previously both DeployLocal.sol and
DeployBase.sol produced a SeedSwapper artifact, causing ambiguity for
verification and coverage tools).
- Remove inline SeedSwapper and redundant IWETH9 import from
DeployLocal.sol and DeployBase.sol; add `import "./DeployCommon.sol"`.
SeedSwapper hardening (in DeployCommon.sol):
- Replace magic-literal price sentinels with named constants
SQRT_PRICE_LIMIT_MIN / SQRT_PRICE_LIMIT_MAX.
- Wrap both weth.transfer() calls with require() so a non-standard
WETH9 false-return is caught rather than silently ignored.
- Add post-swap WETH sweep in executeSeedBuy(): if the price limit is
reached before the full input is spent, the residual WETH balance is
returned to `recipient` instead of being stranded in the contract.
bootstrap-common.sh:
- Normalise cumulativeVolume output through `cast to-dec` before the
string comparison, guarding against a future change in cast output
format (decimal vs hex).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
run_forge_script() was piping all output to LOG_FILE (which is /dev/null
in CI), so forge failures were completely silent. Capture output to a
temp file and print to stderr on failure so the CI log shows the actual
error message.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adding SeedSwapper alongside DeployLocal in the same .sol file caused
forge to error "Multiple contracts in the target path" when no --tc flag
was specified, silently failing the CI bootstrap step.
Add --tc DeployLocal to all forge script invocations of DeployLocal.sol:
- scripts/bootstrap-common.sh (CI / local bootstrap)
- tools/deploy-optimizer.sh (manual deploy tool)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
DeployLocal.sol now calls recenter() twice during the VWAP bootstrap
sequence (issue #567 fix). The second recenter leaves ANCHOR positions
at the post-seed-buy tick, so the CI bootstrap's subsequent call_recenter()
failed with "amplitude not reached" (currentTick == anchorCenterTick,
amplitude = 0 < 400).
Fix: before calling recenter(), check cumulativeVolume(). If it is
already > 0, the deploy script has placed positions and bootstrapped
VWAP -- skip the redundant recenter rather than failing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- LiquidityManager.setFeeDestination: add CREATE2 bypass guard — also
blocks re-assignment when the current feeDestination has since acquired
bytecode (was a plain address when set, contract deployed to it later)
- LiquidityManager.setFeeDestination: expand NatSpec to document the
EOA-mutability trade-off and the CREATE2 guard explicitly
- Test: add testSetFeeDestinationEOAToContract_Locks covering the
realistic EOA→contract transition (the primary lock-activation path)
- red-team.sh: add comment that DEPLOYER_PK is Anvil account-0 and must
only be used against a local ephemeral Anvil instance
- ARCHITECTURE.md: document feeDestination conditional-lock semantics and
contrast with Kraiken's strictly set-once liquidityManager/stakingPool
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add `feeDestinationLocked` bool to LiquidityManager
- Replace one-shot setter with conditional trapdoor: EOAs may be set
repeatedly, but setting a contract address locks permanently
- Remove `AddressAlreadySet` error (superseded by the new lock mechanic)
- Replace fragile SLOT7 storage hack in red-team.sh with a proper
`setFeeDestination()` call using the deployer key
- Update tests: replace AddressAlreadySet test with three new tests
covering EOA multi-set, contract lock, and locked revert
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the ethPerToken metric (free balance / adjusted supply) with total
LM ETH (free + WETH + position-locked) using a forge script with exact
Uni V3 integer math. Collapses 4+ RPC calls and Python float approximation
into a single forge script call using LiquidityAmounts + TickMath.
Also updates red-team prompt, report format, memory extraction, and adds
roadmap items for #536-#538 (backtesting pipeline, Push3 evolution).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add AttackRunner.s.sol: structured forge script that reads attack ops from a
JSONL file (ATTACK_FILE env), executes them against the local Anvil deployment,
and emits full state snapshots (tick, positions, VWAP, optimizer output,
adversary balances) as JSON lines after every recenter and at start/end.
- Add 5 canonical attack files in onchain/script/backtesting/attacks/:
* il-crystallization-15.jsonl — 15 buy-recenter cycles + sell (extraction)
* il-crystallization-80.jsonl — 80 buy-recenter cycles + sell (extraction)
* fee-drain-oscillation.jsonl — buy-recenter-sell-recenter oscillation
* round-trip-safe.jsonl — 20 full round-trips (regression: safe)
* staking-safe.jsonl — staking manipulation (regression: safe)
- Add scripts/harb-evaluator/export-attacks.py: parses red-team-stream.jsonl
for tool_use Bash blocks containing cast send commands and converts them to
AttackRunner-compatible JSONL (buy/sell/recenter/stake/unstake/mint_lp/burn_lp).
- Update scripts/harb-evaluator/red-team.sh: after each agent run, automatically
exports the attack sequence via export-attacks.py and replays it with
AttackRunner to capture structured snapshots in tmp/red-team-snapshots.jsonl.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>