Commit graph

79 commits

Author SHA1 Message Date
openhands
a18512a644 fix: Stale JSDoc in navigateToStakePage refers to '/stake' not '/app/stake' (#509) 2026-03-13 10:37:14 +00:00
openhands
659044e2d1 fix: claude subprocess not killed on INT/TERM in cleanup trap (#530)
Track CLAUDE_PID before launching the claude subprocess so cleanup()
can kill it before reverting Anvil state. Running claude via `&` +
`wait` lets the trap fire immediately on INT/TERM, killing the
subprocess and preventing it from making calls against an
already-reverted chain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 09:48:34 +00:00
openhands
2ae07e7a49 fix: $FLOOR_BEFORE/$FLOOR_AFTER unquoted inside python3 -c string (#531) 2026-03-13 08:28:26 +00:00
openhands
6924cb03f3 fix: Protocol Mechanics section in agent prompt still exposes ethPerToken formula (#550) 2026-03-13 07:47:35 +00:00
openhands
85503f9c5c fix: remove cast to-dec that broke cumulativeVolume check in call_recenter
cast call with a typed '(uint256)' selector returns output like
'140734553600000 [1.407e14]' — the numeric value followed by a
bracketed scientific-notation annotation.  The cast to-dec added in the
previous review-fix commit failed on this annotated string and fell back
to echo "0", making call_recenter() always skip the VWAP-already-
bootstrapped guard and attempt the real recenter() call, which then
reverted with "amplitude not reached".

Fix: drop the cast to-dec normalisation.  A plain != "0" string check
is sufficient because cast returns "0" (no annotation) for the zero
case and any non-zero annotated string is also != "0".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 01:48:58 +00:00
openhands
38bc0f7057 fix: address AI review findings on VWAP bootstrap PR
SPDX license:
- Restore GPL-3.0-or-later SPDX header to DeployBase.sol (removed by
  the em-dash sed fix in an earlier commit).

SeedSwapper deduplication:
- Extract SeedSwapper into onchain/script/DeployCommon.sol — a single
  canonical definition shared by both deploy scripts.  This eliminates
  duplicate Foundry artifacts (previously both DeployLocal.sol and
  DeployBase.sol produced a SeedSwapper artifact, causing ambiguity for
  verification and coverage tools).
- Remove inline SeedSwapper and redundant IWETH9 import from
  DeployLocal.sol and DeployBase.sol; add `import "./DeployCommon.sol"`.

SeedSwapper hardening (in DeployCommon.sol):
- Replace magic-literal price sentinels with named constants
  SQRT_PRICE_LIMIT_MIN / SQRT_PRICE_LIMIT_MAX.
- Wrap both weth.transfer() calls with require() so a non-standard
  WETH9 false-return is caught rather than silently ignored.
- Add post-swap WETH sweep in executeSeedBuy(): if the price limit is
  reached before the full input is spent, the residual WETH balance is
  returned to `recipient` instead of being stranded in the contract.

bootstrap-common.sh:
- Normalise cumulativeVolume output through `cast to-dec` before the
  string comparison, guarding against a future change in cast output
  format (decimal vs hex).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 00:12:39 +00:00
openhands
e0b61c1b88 fix: surface forge script errors in CI bootstrap log
run_forge_script() was piping all output to LOG_FILE (which is /dev/null
in CI), so forge failures were completely silent.  Capture output to a
temp file and print to stderr on failure so the CI log shows the actual
error message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 23:27:23 +00:00
openhands
73a80ead0b fix: add --tc DeployLocal to forge script invocations
Adding SeedSwapper alongside DeployLocal in the same .sol file caused
forge to error "Multiple contracts in the target path" when no --tc flag
was specified, silently failing the CI bootstrap step.

Add --tc DeployLocal to all forge script invocations of DeployLocal.sol:
  - scripts/bootstrap-common.sh  (CI / local bootstrap)
  - tools/deploy-optimizer.sh    (manual deploy tool)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 23:12:25 +00:00
openhands
3a17404529 fix: skip CI bootstrap recenter when VWAP already seeded by deploy script
DeployLocal.sol now calls recenter() twice during the VWAP bootstrap
sequence (issue #567 fix).  The second recenter leaves ANCHOR positions
at the post-seed-buy tick, so the CI bootstrap's subsequent call_recenter()
failed with "amplitude not reached" (currentTick == anchorCenterTick,
amplitude = 0 < 400).

Fix: before calling recenter(), check cumulativeVolume().  If it is
already > 0, the deploy script has placed positions and bootstrapped
VWAP -- skip the redundant recenter rather than failing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 21:46:00 +00:00
openhands
b902b89e3b fix: address review findings — CREATE2 guard, transition test, docs
- LiquidityManager.setFeeDestination: add CREATE2 bypass guard — also
  blocks re-assignment when the current feeDestination has since acquired
  bytecode (was a plain address when set, contract deployed to it later)
- LiquidityManager.setFeeDestination: expand NatSpec to document the
  EOA-mutability trade-off and the CREATE2 guard explicitly
- Test: add testSetFeeDestinationEOAToContract_Locks covering the
  realistic EOA→contract transition (the primary lock-activation path)
- red-team.sh: add comment that DEPLOYER_PK is Anvil account-0 and must
  only be used against a local ephemeral Anvil instance
- ARCHITECTURE.md: document feeDestination conditional-lock semantics and
  contrast with Kraiken's strictly set-once liquidityManager/stakingPool

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 17:13:50 +00:00
openhands
512640226b fix: fix: Conditional lock on feeDestination — lock when set to contract (#580) (#580)
- Add `feeDestinationLocked` bool to LiquidityManager
- Replace one-shot setter with conditional trapdoor: EOAs may be set
  repeatedly, but setting a contract address locks permanently
- Remove `AddressAlreadySet` error (superseded by the new lock mechanic)
- Replace fragile SLOT7 storage hack in red-team.sh with a proper
  `setFeeDestination()` call using the deployer key
- Update tests: replace AddressAlreadySet test with three new tests
  covering EOA multi-set, contract lock, and locked revert

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 16:13:44 +00:00
johba
514a55a1ac Merge pull request 'fix: Backtesting: replay red-team attack sequences against optimizer candidates (#536)' (#565) from fix/issue-536 into master 2026-03-11 19:24:27 +01:00
openhands
58729b98b4 fix: fix: strip cast formatted annotations from red-team.sh (#577)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 10:19:14 +00:00
openhands
0834433db1 Fix PR #540 review findings
Critical fixes:
- LmTotalEth.s.sol: Fix imports to use @aperture/uni-v3-lib/ (lines 8-9)
- red-team.sh: Update memory regex to match lm.?eth pattern (line 266)

Additional improvements:
- red-team.sh: Update adversary balance claim to ~9000 ETH (after funding LM)
- red-team.sh: Add --no-color to forge invocation + emptiness guard
- red-team.sh: Document feeDestination storage slot 7 fragility

Tested:
- Regex pattern matches all expected formats (lm_eth, lmeth, LM-ETH, etc.)
- Import paths align with remappings.txt
2026-03-11 06:28:02 +00:00
openhands
0ddc1ccd80 fix: Red-team: replace ethPerToken with exact total-LM-ETH metric (#539)
Replace the ethPerToken metric (free balance / adjusted supply) with total
LM ETH (free + WETH + position-locked) using a forge script with exact
Uni V3 integer math. Collapses 4+ RPC calls and Python float approximation
into a single forge script call using LiquidityAmounts + TickMath.

Also updates red-team prompt, report format, memory extraction, and adds
roadmap items for #536-#538 (backtesting pipeline, Push3 evolution).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 06:28:02 +00:00
openhands
c8453f6a33 fix: Backtesting: replay red-team attack sequences against optimizer candidates (#536)
- Add AttackRunner.s.sol: structured forge script that reads attack ops from a
  JSONL file (ATTACK_FILE env), executes them against the local Anvil deployment,
  and emits full state snapshots (tick, positions, VWAP, optimizer output,
  adversary balances) as JSON lines after every recenter and at start/end.

- Add 5 canonical attack files in onchain/script/backtesting/attacks/:
  * il-crystallization-15.jsonl  — 15 buy-recenter cycles + sell (extraction)
  * il-crystallization-80.jsonl  — 80 buy-recenter cycles + sell (extraction)
  * fee-drain-oscillation.jsonl  — buy-recenter-sell-recenter oscillation
  * round-trip-safe.jsonl        — 20 full round-trips (regression: safe)
  * staking-safe.jsonl           — staking manipulation (regression: safe)

- Add scripts/harb-evaluator/export-attacks.py: parses red-team-stream.jsonl
  for tool_use Bash blocks containing cast send commands and converts them to
  AttackRunner-compatible JSONL (buy/sell/recenter/stake/unstake/mint_lp/burn_lp).

- Update scripts/harb-evaluator/red-team.sh: after each agent run, automatically
  exports the attack sequence via export-attacks.py and replays it with
  AttackRunner to capture structured snapshots in tmp/red-team-snapshots.jsonl.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 02:08:06 +00:00
openhands
816b211c2b fix: address review findings in red-team memory (#528)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 10:00:56 +00:00
openhands
c1db4cb93e fix: Red-team memory: persistent cross-run learning for adversarial agent (#528)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 09:23:37 +00:00
openhands
ea53e4cfce fix: address review findings in red-team.sh (#520)
- Move snapshot to after setRecenterAccess so agent reverts restore
  recenterAccess for account 2 on every retry
- Read feeDestination() dynamically from LM (removes hardcoded constant)
  and add || die guards on impersonation calls
- Add EXIT/INT/TERM cleanup trap that reverts to the baseline snapshot
- Fix agent floor-check snippet: add FEE_DEST/FEE_BAL reads so formula
  matches compute_eth_per_token (adj=s-f-k, not adj=s-k)
- Use `timeout "$CLAUDE_TIMEOUT"` to enforce wall-clock process limit
- Correct taxRateIndex range: 0-29 (30-element TAX_RATES array)
- Fix outstandingSupply() description: excludes LM-held KRK, not all KRK

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 03:59:12 +00:00
openhands
23d460542b fix: feat: Red-team agent runner — adversarial floor attack (#520)
Adds scripts/harb-evaluator/red-team.sh which:
- Verifies the Anvil stack is running and deployments exist
- Grants recenterAccess to account 2 (impersonating feeDestination)
- Takes an Anvil snapshot as the clean baseline
- Computes ethPerToken before the agent run (mirrors floor.ts logic)
- Builds a self-contained prompt with contract addresses, account keys,
  protocol mechanics, copy-paste cast command patterns, snapshot/revert
  instructions, and structured rules for the agent
- Spawns `claude -p --dangerously-skip-permissions` with a 2-hour timeout
- Captures output to tmp/red-team-report.txt
- Computes ethPerToken after the agent run and reports pass/fail

Exit code 0 = floor held, exit code 1 = floor broken, exit code 2 = infra error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 03:28:10 +00:00
openhands
722ecaaa0e fix: address review findings in stake-rpc.ts (#518)
- Remove dead krkAddress field from UnstakeRpcConfig (bug)
- Drop swap.js import to avoid transitive Playwright dependency; fix
  header comment to accurately describe the module boundary (warning)
- Inline pollReceipt() returning TxReceipt so snatch receipt is reused
  for log parsing without a second round-trip (warning)
- Use ZeroAddress from ethers instead of manual constant (info)
- Add comment on fromBlock '0x0' genesis-scan caveat (info)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 02:48:51 +00:00
openhands
fd44fa0bcf fix: feat: RPC-only staking helpers for red-team agent (#518)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 02:06:41 +00:00
openhands
e01ef23560 fix: feat: Anvil snapshot/revert and ethPerToken helpers (#519)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 01:23:36 +00:00
openhands
866474510b fix: feat: Anvil snapshot/revert and ethPerToken helpers (#519)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 00:53:17 +00:00
openhands
b2db3c7ae5 fix: feat: Anvil snapshot/revert and ethPerToken helpers (#519)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 00:12:49 +00:00
openhands
962b126e8b fix: networkidle still used in stake.ts login flow (#501)
Replace waitForLoadState('networkidle') in the post-login redirect with
waitForURL('**/app/stake**'). Persistent WebSocket connections prevent
networkidle from ever firing, mirroring the same fix applied to navigate.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 16:11:24 +00:00
openhands
c943db379f fix: wait_healthy does not fail fast when a service exits or crashes during the health-check window (#387) 2026-03-06 11:20:54 +00:00
johba
a937f1cb4c docs: scope engineering principles to infra/tests, not frontend polling (#470)
Clarifies that the event-driven engineering principles apply to infrastructure (Docker, scripts, startup/teardown) and test/scenario execution — NOT frontend HTTP API polling.

Frontend polling (e.g. LiveStats → Ponder GraphQL every 30s) is fine. The scalability solution is caching at the proxy layer (`Cache-Control` headers via Caddy), not WebSocket subscriptions.

Relates to #447

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/470
2026-03-06 11:17:50 +01:00
openhands
d0e651ffc9 fix: sellAllKrk uses amountOutMinimum: 0n with no throw on 0 output (#450)
Replace console.warn with a thrown Error when wethReceived <= 0n so any
caller without a return-value check is protected, not just always-leave.spec.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 01:51:23 +00:00
openhands
4e6182acc6 fix: evaluator: add stakeKrk and unstakeKrk browser helpers (#460)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 15:09:54 +00:00
openhands
b7bbbb9b89 fix: evaluator: add stakeKrk and unstakeKrk browser helpers (#460)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 14:37:56 +00:00
openhands
c891b3c617 fix: address review findings in sellKrk helper
- Fix max-button race: wait for input to be non-empty after clicking Max
  (setMax is async, composable calls loadKrkBalance() before setting value)
- Add on-chain confirmation via WETH Transfer event polling (mirrors buyKrk)
  so balance query happens after the swap is mined, not just UI-idle
- Use Pick<SellConfig, 'rpcUrl' | 'accountAddress'> since krkAddress is unused
- Add page heading assertion after navigate (consistent with buyKrk)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 13:58:14 +00:00
openhands
61a9fd7e58 fix: evaluator: add sellKrk browser helper (uses sell widget from #456) (#461)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 13:13:04 +00:00
openhands
9cda5beb4a fix: address review findings in market and recenter helpers
- recenter.ts: parse isUp from Recentered event logs instead of a
  follow-up eth_call that would decode wrong post-recenter state
- recenter.ts: remove hardcoded private key from comment; add blocks>0
  guard in mineBlocks; call provider.destroy() to prevent leaked intervals
- market.ts: snapshot KRK balance before buy to compute krkBought as
  delta instead of cumulative total; call provider.destroy() on exit;
  remove unused withdraw entry from WETH_ABI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 11:27:31 +00:00
openhands
1973ccf25b fix: evaluator: add market simulation and recenter helpers (#455)
- Export waitForReceipt from swap.ts so market.ts and recenter.ts can reuse it
- Add market.ts with roundTripSwap: direct-RPC buy+sell round-trip using ethers Wallet
- Add recenter.ts with triggerRecenter (calls LiquidityManager.recenter()) and mineBlocks (anvil_mine)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 10:52:28 +00:00
openhands
cd459bb9b0 fix: correct buyKrk call sites for new opts param, add eslint-disable for polling loop
- no-dilution.spec.ts: pass undefined for opts, screenshotPrefix as 4th arg
- swap.ts: add eslint-disable-next-line for eth_getFilterLogs polling delay
2026-03-05 05:53:19 +00:00
openhands
f214ac8587 fix: address PR #437 review findings
- Fix price impact formula: (10000n - ...) instead of (1n - ...)
- Extract ETH_AMOUNT constant in always-leave to avoid duplication
- Add screenshotPrefix param to buyKrk for unique screenshot paths
2026-03-05 05:51:08 +00:00
openhands
c25c757024 feat(holdout): add passive-confidence/no-dilution scenario
Verifies that passive holders are not diluted when new buyers enter.

- Two wallets (Anvil accounts 4 & 5) buy KRK sequentially
- First buyer's balance must remain unchanged after second buy
- Second buyer receives fewer tokens per ETH due to AMM price impact
- Tests core protocol invariant: holding KRK does not dilute position
2026-03-05 05:51:03 +00:00
openhands
f6fe37dcc0 fix: address PR #438 review findings
- Fix HOLDOUT_SCENARIOS_DIR to use absolute path (resolves Playwright testDir issue)
- Remove dead SCENARIOS_DIR variable
- Replace fallback with explicit error in holdout.config.ts
- Add SSH key requirement comment
2026-03-04 08:20:11 +00:00
openhands
69f6a87e20 Move holdout scenarios to separate repo
- Updated holdout.config.ts to use HOLDOUT_SCENARIOS_DIR env var
- Modified evaluate.sh to clone harb-holdout-scenarios repo at runtime
- Deleted scripts/harb-evaluator/scenarios/ directory
- Added .holdout-scenarios/ to .gitignore
- Holdout scenarios are now cloned into .holdout-scenarios/ during evaluation
- This prevents dev-agent from seeing the holdout test set
2026-03-04 08:20:11 +00:00
johba
b2594a28b3 Merge pull request 'fix: lint: Ban waitForTimeout, setTimeout-as-delay, and fixed sleep patterns (#442)' (#443) from fix/issue-442 into master 2026-03-03 23:37:46 +01:00
johba
05191bb15f Merge pull request 'feat(holdout): Add reasonable slippage assertion to always-leave scenario' (#436) from fix/holdout-slippage-check into master 2026-03-03 22:41:23 +01:00
johba
16abdcbefb fix: add RPC propagation delay after browser-initiated swap (#434)
After `buyKrk()` completes (swap widget returns to idle), the Anvil RPC may briefly return stale balance state. Adds 2s delay to ensure `getKrkBalance` reads post-swap state.

Without this fix, the holdout scenario reports identical KRK balance before and after swap despite the transaction succeeding (success toast visible).

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/434
2026-03-03 22:20:02 +01:00
openhands
748557bc83 fix: lint: Ban waitForTimeout, setTimeout-as-delay, and fixed sleep patterns (#442)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 20:58:01 +00:00
openhands
74a043262d feat(holdout): Add reasonable slippage assertion to always-leave scenario
- Modified sellAllKrk helper to return WETH delta received
- Added assertion: WETH received >= 90% of ETH spent (0.09 ETH minimum)
- Added log showing actual slippage percentage
- This proves 'always leave' with reasonable slippage, not just exit ability
2026-03-03 19:45:46 +00:00
openhands
ed5db3b000 fix: use toHaveText to correctly await swap completion after testId migration
getByTestId('swap-buy-button').waitFor({ state: 'visible' }) resolved
immediately because the button is always rendered; only its text changes.
Replace with expect(...).toHaveText('Buy KRK', { timeout: 60_000 }) to
correctly gate on the button returning to its idle state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 23:47:53 +00:00
openhands
68601c255a fix: getByRole('button', { name: 'Buy' }).last() fragility is duplicated across e2e/01 and this scenario (#398)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 23:13:05 +00:00
openhands
77a229018a fix: address review feedback on holdout helpers
- Extract rpcCall into helpers/rpc.ts to eliminate the duplicate copy
  in wallet.ts and assertions.ts (warning: code duplication)

- Fix waitForReceipt() in swap.ts to assert receipt.status === '0x1':
  reverted transactions (status 0x0) now throw immediately with a clear
  message instead of letting sellAllKrk silently succeed and fail later
  at the balance assertion (bug)

- Add screen.width debug log to connectWallet() before the isVisible
  check, restoring the regression signal from always-leave.spec.ts (warning)

- Fix expectPoolHasLiquidity() to only assert sqrtPriceX96 > 0 (pool
  is initialised); drop the active-tick liquidity() check which gives
  false negatives when price moves outside all LiquidityManager ranges
  after a sovereign exit (warning)

- Add WETH balance snapshot before/after the swap in sellAllKrk() and
  log a warning when WETH output is 0, making pool health degradation
  visible despite amountOutMinimum: 0n (warning)

- Add before/after screenshots in buyKrk() (holdout-before-buy.png,
  holdout-after-buy.png) to restore CI debugging artefacts (nit)

- Move waitForTimeout(2_000) settle buffer in buyKrk() to the catch
  path only; when the Submitting→idle transition is observed the extra
  wait is redundant (nit)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 05:59:21 +00:00
openhands
d0a3bdecdc feat: Shared holdout scenario test helpers (#401)
Extract reusable helpers from sovereign-exit/always-leave.spec.ts into
three focused modules under scripts/harb-evaluator/helpers/:

- helpers/wallet.ts: connectWallet, disconnectWallet, getEthBalance,
  getKrkBalance — UI connect/disconnect flow and on-chain balance reads.

- helpers/swap.ts: buyKrk (navigates to the real /app/get-krk page and
  drives the LocalSwapWidget, now that #393 fill() fix is in), sellAllKrk
  (approve + exactInputSingle via window.ethereum, no UI dependency).

- helpers/assertions.ts: expectBalanceIncrease (snapshot/action/assert
  pattern for any token or ETH), expectPoolHasLiquidity (slot0 + liquidity
  sanity check on a Uniswap V3 pool).

always-leave.spec.ts is refactored to use these helpers and to navigate
to /app/get-krk instead of the /app/cheats workaround introduced before
the #393 fix landed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 05:21:24 +00:00
openhands
13249613c0 address review: fix stale JSDoc, correct comments, remove redundant dispatchEvent 2026-03-01 19:11:43 +00:00