- Wrap upgradeTo() in try/catch: malformed candidate bytecode no longer
aborts the entire batch; emit {"fitness":0,"error":"upgrade_failed"} and
continue to the next candidate
- Bootstrap recenter: require() after 5 retry attempts so silent failure
(all scores identically equal to free WETH only) is surfaced as a hard
test failure rather than silently producing meaningless results
- mint_lp: capture the NPM tokenId returned by mint() and push it to
_mintedNpmTokenIds; burn_lp now uses a 1-based index into that array
(same pattern as stake/unstake), making attack files fork-block-independent
- Remove dead atkBaseSnap variable and its compiler-warning suppression
- Remove orphaned vm.snapshot() after vm.revertTo() in the attack loop
- Fix misleading comment on delete _stakedPositionIds
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace per-candidate Anvil+forge-script pipeline with in-process EVM
execution using Foundry's native revm backend, achieving 10-100× speedup
for evolutionary search at scale.
New files:
- onchain/test/FitnessEvaluator.t.sol — Forge test that forks Base once,
deploys the full KRAIKEN stack, then for each candidate uses vm.etch to
inject the compiled optimizer bytecode, UUPS-upgrades the proxy, runs all
attack sequences with in-memory vm.snapshot/revertTo (no RPC overhead),
and emits one {"candidate_id","fitness"} JSON line per candidate.
Skips gracefully when BASE_RPC_URL is unset (CI-safe).
- tools/push3-evolution/revm-evaluator/batch-eval.sh — Wrapper that
transpiles+compiles each candidate sequentially, writes a two-file
manifest (ids.txt + bytecodes.txt), then invokes FitnessEvaluator.t.sol
in a single forge test run and parses the score JSON from stdout.
Modified:
- tools/push3-evolution/evolve.sh — Adds EVAL_MODE env var (anvil|revm).
When EVAL_MODE=revm, batch-scores every candidate in a generation with
one batch-eval.sh call instead of N sequential fitness.sh processes;
scores are looked up from the JSONL output in the per-candidate loop.
Default remains EVAL_MODE=anvil for backward compatibility.
Key design decisions:
- Per-candidate Solidity compilation is unavoidable (each Push3 candidate
produces different Solidity); the speedup is in the evaluation phase.
- vm.snapshot/revertTo in forge test are O(1) memory operations (true
revm), not RPC calls — this is the core speedup vs Anvil.
- recenterAccess is set in bootstrap so TWAP stability checks are bypassed
during attack sequences (mirrors the existing fitness.sh bootstrap).
- Test skips cleanly when BASE_RPC_URL is absent, keeping CI green.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace pool.observe() TWAP price source with current pool tick (pool.slot0()) sampled once per recenter
- Remove _getTWAPOrFallback() and TWAPFallback event (added by PR #575)
- _scrapePositions now takes int24 currentTick instead of uint256 prevTimestamp; price computed via _priceAtTick before the burn loop
- Volume weighting (ethFee * 100) is unchanged — fees proxy swap volume over the recenter interval
- Direction fix from #566 (shouldRecordVWAP only on sell events) is preserved
- Remove test_twapReflectsAveragePriceNotJustLastSwap (tested reverted TWAP behaviour)
- ORACLE_CARDINALITY / increaseObservationCardinalityNext retained for _isPriceStable()
- All 188 tests pass
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After a buy→sell round-trip the net price movement is near zero, so
recenter() reverts with "amplitude not reached" and aborts the whole
AttackRunner script.
Wrap the recenter() call in a try/catch so amplitude failures are
caught and logged as a skipped step rather than propagating as a fatal
revert. When recenter is skipped, no state snapshot is emitted and the
attack sequence continues — matching the intended semantics: round-trip
trading should not cause the fitness scorer to crash.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Stake.nextPositionId starts at 654_321, so attack files cannot use literal
on-chain IDs (e.g. positionId=1 always reverts with PositionNotFound).
Fix AttackRunner to treat the JSONL positionId field as a 1-based index into
the list of positions created by stake ops during the current run:
- Add IStake.snatch returns (uint256) to the interface so the returned ID is
captured.
- Track returned IDs in _stakedPositionIds[] (inserted in creation order).
- _executeUnstake resolves positionId to _stakedPositionIds[positionId-1]
before calling exitPosition, matching the natural "unstake position 1"
semantics in the attack DSL.
KRK approval for Stake was already present in _setup(); no other changes needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add virtual to Optimizer.calculateParams() for UUPS override
- Create OptimizerV3.sol: UUPS-upgradeable optimizer with transpiled Push3 logic
- Update deploy-optimizer.sh to deploy OptimizerV3 instead of Optimizer
- Add ~/.foundry/bin to PATH in evolve.sh, fitness.sh, deploy-optimizer.sh
Address round-2 review findings:
- Move BASELINE_SNAP before deploy-optimizer.sh so cleanup fully reverts the
deploy on a shared Anvil; fixes nonce/address collision when a second
sequential evaluation reuses the same chain
- Revert deploy output to capture-and-suppress on success / surface on failure;
removes per-candidate stderr noise in evolution loop batch runs
- Fix cast rpc anvil_mine arg order to match all other cast rpc calls in script
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Address review findings:
- Bug: add BASELINE_SNAP before bootstrap; cleanup reverts it on shared Anvil
to undo setRecenterAccess/WETH-funding/recenter mutations (was dead code before)
- Bug: require ANVIL_FORK_URL when cold-starting Anvil — DeployLocal.sol needs
live Base contracts (Uniswap V3 Factory, WETH) that don't exist on a plain fork
- Warning: flag DIRTY and emit warning when anvil_revert fails instead of || true
- Warning: tee deploy-optimizer.sh output to both log file and stderr so progress
is visible and preserved for post-failure diagnosis
- Nit: replace 50×evm_mine loop with single anvil_mine 0x32 (49 fewer RTTs)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- **Bug**: Fix JSON malformation in _snapshotPositions — closing literal was '"}}}' (three
braces) but only '"}}' is needed (close discovery{} + positions{}). The third brace
prematurely closed the root object, making every snapshot unparseable downstream.
- **Nit**: _executeStake local variable renamed taxRateIndex → taxRate to match the
IStake interface and Stake.sol. JSONL field key '.taxRateIndex' is kept for backward
compatibility with existing attack files; the comment and NatDoc header now say so.
- **Nit**: recenter_is_up now emits JSON null (not false) before the first recenter call,
via a new _hasRecentered flag. Downstream parsers can distinguish "no recenter yet"
from "last recenter moved price down" (false). _hasRecentered is set to true alongside
_lastRecenterIsUp in the recenter handler.
- **Nit**: Added a comment to _logSnapshot explaining that pool.slot0() is a view call
and forge-std finalises broadcast state before executing it, so tick/sqrtPrice are
always post-broadcast accurate.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- **Bug**: `_positionEthValue` now sums both the ETH component and the KRK component
(converted to ETH via `FullMath.mulDiv` at current sqrtPriceX96) so `lm_eth_total`
correctly reflects LM TVL for all price ranges (below/in/above range).
- **Bug**: `recenter()` return value (`bool isUp` — price direction) is now captured in
`_lastRecenterIsUp` state variable and emitted as `"recenter_is_up"` in every snapshot.
Note: `recenter()` reverts on failure; `false` means price moved *down*, not a no-op.
- **Bug**: Discovery position now emits `"ethValue"` in its snapshot JSON object,
matching the floor and anchor fields for symmetric automated parsing.
- **Warning**: `IStake.snatch` interface parameter renamed `taxRateIndex` → `taxRate` to
match the actual `Stake.sol` signature (the value is a raw rate, not a lookup index).
- **Warning**: Unknown op codes in the JSONL file now emit a `console.log` warning
instead of silently skipping, catching typos in attack sequences.
- **Nit**: `_setup()` now wraps 9 000 ETH (up from 1 000) to cover heavy buy sequences
that would otherwise exhaust the adversary's WETH.
- **Nit**: `_computeVwapTick` documents the int128 overflow guard and its tick=0 sentinel
meaning so callers can distinguish "VWAP unavailable" from tick zero.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the five Push3 mutation operators and the meta-operator for
the optimizer evolution pipeline:
- mutateConstant: shifts a random integer literal by ±δ (clamped to 0)
- swapOperator: swaps ADD↔SUB, MUL↔DIV, GT↔LT, GTE↔LTE
- deleteInstruction: removes a random non-EXEC.IF instr; validates result
- insertInstruction: inserts stack-neutral pair (push 0 + DYADIC.POP)
- crossover: single-point crossover of two programs at instruction boundaries
- mutate: applies N random mutations from the four single-program operators
All mutations validate output via transpile() symbolic stack simulation.
Invalid mutations silently return the original program.
35 unit tests cover all operators, edge cases (empty program, single
instruction, deep stack), and the acceptance criterion that
mutate(optimizer_v3, 3) produces ≥10 distinct valid variants in 20 trials.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix unsafe int32 intermediate cast: int56(int32(elapsed)) → int56(uint56(elapsed))
to prevent TWAP tick sign inversion for intervals above int32 max (~68 years)
- Remove redundant lastRecenterTimestamp state variable; capture prevTimestamp
from existing lastRecenterTime instead (saves ~20k gas per recenter)
- Use pool.increaseObservationCardinalityNext(ORACLE_CARDINALITY) in constructor
instead of recomputing the pool address; extract magic 100 to named constant
- Add TWAPFallback(uint32 elapsed) event emitted when pool.observe() reverts
so monitoring can distinguish degraded operation from normal bootstrap
- Remove conditional bypass paths in test_twapReflectsAveragePriceNotJustLastSwap;
assert vwapAfter > 0 and vwapAfter > initialPriceX96 unconditionally
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add lastRecenterTimestamp to track recenter interval for TWAP
- Increase pool observation cardinality to 100 in constructor
- In _scrapePositions, use pool.observe([elapsed, 0]) to get TWAP tick
over the full interval between recenters; falls back to anchor midpoint
when elapsed==0 or pool.observe() reverts (insufficient history)
- Add test_twapReflectsAveragePriceNotJustLastSwap: verifies TWAP-based
VWAP reflects the average price across the recenter interval, not just
the last-swap anchor snapshot
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the ethPerToken metric (free balance / adjusted supply) with total
LM ETH (free + WETH + position-locked) using a forge script with exact
Uni V3 integer math. Collapses 4+ RPC calls and Python float approximation
into a single forge script call using LiquidityAmounts + TickMath.
Also updates red-team prompt, report format, memory extraction, and adds
roadmap items for #536-#538 (backtesting pipeline, Push3 evolution).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- _scrapePositions natspec: 'ETH inflow' → 'ETH outflow / price fell or at bootstrap'
- Inline comment above VWAP block: remove inverted 'KRK sold out / ETH inflow' rationale,
replace with a neutral forward-reference to recenter() where the direction logic lives
- VWAPFloorProtection.t.sol: remove unused Kraiken and forge-std/Test.sol imports
(both are already provided by UniSwapHelper)
- test_floorConservativeAfterBuyOnlyAttack: add assertFalse(token0isWeth) guard so a
future change to the setUp parameter cannot silently invert the gap-direction assertion
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Investigation findings:
- VWAP WAS being fed during buy-only cycles (shouldRecordVWAP = true on ETH inflow / price rising).
Over 80 buy-recenter cycles VWAP converged toward the inflated current price.
- When VWAP ≈ currentTick, mirrorTick = currentTick + vwapDistance ≈ currentTick, placing
the floor near the inflated price. Adversary sells back through the high floor, extracting
nearly all LM ETH.
- Optimizer parameters (anchorShare, CI) were not the primary cause.
Fix (LiquidityManager.sol):
Flip shouldRecordVWAP from buy direction to sell direction. VWAP is now recorded only when
price falls (ETH outflow / sell events) or at initial bootstrap (cumulativeVolume == 0).
Buy-only attack cycles leave VWAP frozen at the historical baseline, keeping mirrorTick and
the floor conservatively anchored far from the inflated current price.
Also updated onchain/AGENTS.md to document the corrected recording direction.
Regression test (VWAPFloorProtection.t.sol):
- test_vwapNotInflatedByBuyOnlyAttack: asserts getVWAP() stays at bootstrap after N buy cycles.
- test_floorConservativeAfterBuyOnlyAttack: asserts floor center is far below inflated tick.
- test_vwapBootstrapsOnFirstFeeEvent: confirms bootstrap path unchanged.
- test_recenterSucceedsOnSellDirectionWithoutReverts: confirms sell-direction recenters work.
All 187 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add AttackRunner.s.sol: structured forge script that reads attack ops from a
JSONL file (ATTACK_FILE env), executes them against the local Anvil deployment,
and emits full state snapshots (tick, positions, VWAP, optimizer output,
adversary balances) as JSON lines after every recenter and at start/end.
- Add 5 canonical attack files in onchain/script/backtesting/attacks/:
* il-crystallization-15.jsonl — 15 buy-recenter cycles + sell (extraction)
* il-crystallization-80.jsonl — 80 buy-recenter cycles + sell (extraction)
* fee-drain-oscillation.jsonl — buy-recenter-sell-recenter oscillation
* round-trip-safe.jsonl — 20 full round-trips (regression: safe)
* staking-safe.jsonl — staking manipulation (regression: safe)
- Add scripts/harb-evaluator/export-attacks.py: parses red-team-stream.jsonl
for tool_use Bash blocks containing cast send commands and converts them to
AttackRunner-compatible JSONL (buy/sell/recenter/stake/unstake/mint_lp/burn_lp).
- Update scripts/harb-evaluator/red-team.sh: after each agent run, automatically
exports the attack sequence via export-attacks.py and replays it with
AttackRunner to capture structured snapshots in tmp/red-team-snapshots.jsonl.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When feeDestination == address(this), _scrapePositions() now skips the
fee safeTransfer calls so collected WETH/KRK stays in the LM balance
and is redeployed as liquidity on the next _setPositions() call.
Also fixes _getOutstandingSupply(): kraiken.outstandingSupply() already
subtracts balanceOf(liquidityManager), so when feeDestination IS the LM
the old code double-subtracted LM-held KRK, causing an arithmetic
underflow once positions were scraped. The subtraction is now skipped
for the self-referencing case.
VWAP recording is refactored to a single unconditional block so it fires
regardless of fee destination.
New test testSelfFeeDestination_FeesAccrueAsLiquidity() demonstrates
that a two-recenter cycle with self-feeDestination completes without
underflow and without leaking WETH to any external address.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>