Commit graph

1001 commits

Author SHA1 Message Date
openhands
a2f89968db fix: fix: red-team.sh V3_FACTORY hardcodes Base mainnet address instead of Sepolia (#854)
bootstrap-light.sh now extracts the Uniswap V3 pool address from
DeployLocal.sol deploy output and writes both Pool and V3Factory
(Base Sepolia: 0x4752ba5DBc23f44D87826276BF6Fd6b1C372aD24) into
deployments-local.json alongside the existing contract addresses.

red-team.sh now reads V3_FACTORY and POOL from deployments-local.json
instead of hardcoding the Base mainnet factory address
(0x33128a8fC17869897dcE68Ed026d694621f6FDfD), and removes the getPool()
RPC call that always failed with "contract does not have any code" on
the Sepolia fork.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 12:02:17 +00:00
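
The switch from a hardcoded factory address to values read from deployments-local.json can be illustrated with a minimal Python sketch. The key names `V3Factory` and `Pool` are assumptions for illustration; the commit does not show the file's layout, and red-team.sh itself is a shell script rather than Python.

```python
import json
from pathlib import Path


def load_addresses(path):
    """Return (v3_factory, pool) read from a deployments JSON file.

    The key names "V3Factory" and "Pool" are hypothetical; adjust them to
    whatever the bootstrap script actually writes.
    """
    data = json.loads(Path(path).read_text())
    # Fail loudly instead of silently falling back to a hardcoded address.
    missing = [key for key in ("V3Factory", "Pool") if not data.get(key)]
    if missing:
        raise KeyError(f"missing keys in {path}: {', '.join(missing)}")
    return data["V3Factory"], data["Pool"]
```

Raising on a missing key keeps a wrong-network address from slipping in unnoticed, which is the failure mode the commit describes.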
johba
740a871ddc Merge pull request 'fix: STATE.md #763 entry not removed after PR #773 merged (#776)' (#869) from fix/issue-776 into master 2026-03-16 11:37:21 +01:00
openhands
2e3cc6e641 fix: STATE.md #763 entry not removed after PR #773 merged (#776) 2026-03-16 10:17:01 +00:00
johba
c5519a34b1 Merge pull request 'fix: red-team-program.md taxRate naming inconsistency (pre-existing) (#835)' (#868) from fix/issue-835 into master 2026-03-16 11:14:20 +01:00
openhands
91e4bdf926 fix: red-team-program.md taxRate naming inconsistency (pre-existing) (#835) 2026-03-16 09:46:55 +00:00
johba
7e20f9fe74 Merge pull request 'fix: evolution.patch references removed LiquidityManager constant (pre-existing structural debt) (#842)' (#865) from fix/issue-842 into master 2026-03-16 10:36:08 +01:00
openhands
777bec8563 fix: evolution.patch references removed LiquidityManager constant (pre-existing structural debt) (#842)
Extend the patch to also replace the NatSpec comments above MAX_ANCHOR_WIDTH,
which became misleading after switching to type(uint24).max. The old comments
claimed overflow-safety ("fits in int24"); the new comments document that the
production cap is 1233, that values above 123358 overflow int24 and revert,
and that this is tolerable in the evolution context where reverts score zero
fitness. The patch now correctly updates both the constant and its documentation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 09:20:41 +00:00
openhands
c0fa8c064f fix: evolution.patch references removed LiquidityManager constant (pre-existing structural debt) (#842)
Regenerate evolution.patch from the current ThreePositionStrategy.sol.
The old patch had a corrupt hunk header (@@ -33,7 +33,7 @@ claiming 7 lines
but only supplying 4) and placeholder index hashes (0000000..0000000),
causing `git apply` to reject it with "corrupt patch". MAX_ANCHOR_WIDTH
still exists in the file at value 1233; the patch correctly overrides it
to type(uint24).max for unbounded evolution runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 08:53:33 +00:00
johba
8c825589e2 Merge pull request 'fix: MEMORY_FILE parent directory ($REPO_ROOT/tmp/) also not guaranteed to exist (#844)' (#863) from fix/issue-844 into master 2026-03-16 09:36:59 +01:00
openhands
d34fe698ab ci: retrigger after infra failure 2026-03-16 08:09:12 +00:00
openhands
cb305b8c81 fix: MEMORY_FILE parent directory ($REPO_ROOT/tmp/) also not guaranteed to exist (#844)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 07:57:20 +00:00
johba
39793ec0df Merge pull request 'fix: sleep 5 at teardown violates AGENTS.md engineering principles (#845)' (#861) from fix/issue-845 into master 2026-03-16 08:49:06 +01:00
openhands
6e66bfd2f6 ci: retrigger after infra failure 2026-03-16 07:16:46 +00:00
openhands
8986154d8f fix: sleep 5 at teardown violates AGENTS.md engineering principles (#845) 2026-03-16 07:06:57 +00:00
johba
10ff61e6b5 Merge pull request 'fix: package.json missing 'type': 'module' inconsistent with AGENTS.md (#850)' (#855) from fix/issue-850 into master 2026-03-16 07:56:59 +01:00
openhands
3f24faba18 fix: package.json missing 'type': 'module' inconsistent with AGENTS.md (#850)
Update tsconfig.json to use NodeNext module system (fixes CJS/ESM conflict),
enable ts-node ESM mode, and add .js extensions to relative imports so the
built output and ts-node dev script both work correctly with "type":"module".
2026-03-16 06:35:05 +00:00
openhands
0c43054f42 fix: package.json missing 'type': 'module' inconsistent with AGENTS.md (#850) 2026-03-16 06:07:10 +00:00
johba
81501758ad Merge pull request 'fix: AttackRunner.s.sol NPM_ADDR last byte is 0xF1 but scripts use 0xF3 (#807)' (#851) from fix/issue-807 into master 2026-03-16 02:13:28 +01:00
openhands
fd912a2a69 fix: AttackRunner.s.sol NPM_ADDR last byte is 0xF1 but scripts use 0xF3 (#807)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 00:50:28 +00:00
johba
00349d9f45 Merge pull request 'fix: Body extraction stops at first shallow closing brace (#809)' (#849) from fix/issue-809 into master 2026-03-16 01:36:08 +01:00
openhands
34b016a190 fix: Body extraction stops at first shallow closing brace (#809)
Replace the `}` heuristic in inject.sh with a brace-depth counter:
start at depth=1 after the opening {, increment on {, decrement on },
stop when depth reaches 0. This correctly handles nested if/else blocks,
loops, and structs that close at 4-space indent inside calculateParams.

Also emit a non-zero exit with a descriptive message if EOF is reached
without finding the matching closing brace.

Add test_inject_extraction.sh covering simple bodies, nested if/else,
multi-level nesting, and the EOF-without-match error case.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 00:21:06 +00:00
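
The brace-depth counter described in the commit above can be sketched in Python as follows. This is an illustration of the technique only, not the code in inject.sh; the function name `extract_body` and its signature are hypothetical, and the sketch does not account for braces inside strings or comments.

```python
def extract_body(text, open_index):
    """Return the text between the '{' at open_index and its matching '}'."""
    if text[open_index] != "{":
        raise ValueError("open_index must point at an opening brace")
    depth = 1  # start at depth 1, just inside the opening brace
    for i in range(open_index + 1, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1  # entering a nested block
        elif ch == "}":
            depth -= 1  # leaving a block
            if depth == 0:
                # Matching close found: return everything in between.
                return text[open_index + 1:i]
    # Mirrors the commit's non-zero exit when EOF is reached without a match.
    raise ValueError("reached EOF without finding the matching closing brace")
```

For example, given `src = "f() { if (a) { x; } return z; }"`, calling `extract_body(src, src.index("{"))` returns the whole body including the nested `if` block, where a first-closing-brace heuristic would stop after `x; `.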
johba
3e5cbea7f7 Merge pull request 'fix: mutate.test.ts: pre-existing isValid > stack underflow failure (#810)' (#848) from fix/issue-810 into master 2026-03-16 01:09:20 +01:00
openhands
6a55c37b20 fix: mutate.test.ts: pre-existing `isValid > stack underflow` failure (#810)
dpop/bpop silently returned '0'/'false' on stack underflow instead of
throwing, so isValid() never returned false for underflowing programs.
Make dpop and bpop throw an Error on underflow so the transpiler's
existing try/catch in isValid() correctly classifies such programs as
invalid. The output-extraction phase uses state.dStack.pop() directly
(not dpop) and is unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 23:49:55 +00:00
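
The behavior change in the commit above (pop helpers that throw on underflow so a surrounding try/catch can mark the program invalid) can be sketched in Python. The project's transpiler is TypeScript; the toy token language, the `StackUnderflow` class, and the `is_valid` function here are assumptions made only to show the pattern.

```python
class StackUnderflow(Exception):
    """Raised when a pop is attempted on an empty stack."""


def dpop(stack):
    # Raise instead of returning a default such as '0', so callers can
    # tell an underflowing program apart from a valid one.
    if not stack:
        raise StackUnderflow("data stack underflow")
    return stack.pop()


def is_valid(program):
    """Toy validator: integer tokens push, '+' pops two values and pushes the sum."""
    stack = []
    try:
        for token in program:
            if token == "+":
                b, a = dpop(stack), dpop(stack)
                stack.append(a + b)
            else:
                stack.append(int(token))
    except StackUnderflow:
        return False  # underflow now classifies the program as invalid
    return True
```

With a silent default in `dpop`, the `except` branch would never run and `is_valid(["1", "+"])` would wrongly report success.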
johba
3584c03261 Merge pull request 'fix: Fitness re-evaluation for fixed evo_run007_champion (#811)' (#846) from fix/issue-811 into master 2026-03-16 00:38:15 +01:00
openhands
79bcb81b81 fix: Fitness re-evaluation for fixed evo_run007_champion (#811)
Null out the stale fitness score (7116531284966772550194) for
evo_run007_champion.push3, which was recorded against the buggy
processExecIf interpreter (pre-#655 fix). Setting fitness to null
marks the entry for re-scoring by evaluate-seeds.sh once a valid
ANVIL_FORK_URL is available. Updated the note field to document why
the fitness was cleared.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 23:21:04 +00:00
johba
b4f549bf05 Merge pull request 'fix: ATTACKS_OUT directory not guaranteed to exist (#816)' (#843) from fix/issue-816 into master 2026-03-16 00:08:17 +01:00
openhands
ac2fa16e2e fix: ATTACKS_OUT directory not guaranteed to exist (#816) 2026-03-15 22:36:51 +00:00
johba
938c6d284e Merge pull request 'fix: Unclamped anchorWidth can overflow tick range — no upper-bound guard after MAX_ANCHOR_WIDTH removal (#783) (#817)' (#841) from fix/issue-817 into master 2026-03-15 23:26:50 +01:00
openhands
aa274fd8ed fix: address review findings for anchorWidth guard (#817)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 22:04:13 +00:00
openhands
a21cf398bf fix: Unclamped anchorWidth can overflow tick range — no upper-bound guard after MAX_ANCHOR_WIDTH removal (#783) (#817)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 21:34:33 +00:00
johba
cd4926b540 Merge pull request 'fix: feat: structured sweep-results.tsv for red-team sweep (#818)' (#840) from fix/issue-818 into master 2026-03-15 22:16:34 +01:00
openhands
ae3eb14833 fix: address review findings for sweep-results.tsv (#818)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 20:48:33 +00:00
openhands
3c6be7d86f fix: feat: structured sweep-results.tsv for red-team sweep (#818)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 20:20:13 +00:00
johba
2fc0ce2b60 Merge pull request 'fix: Hardcoded TWAP/cooldown values not documented (#825)' (#839) from fix/issue-825 into master 2026-03-15 21:14:57 +01:00
openhands
0d09f598d9 fix: Hardcoded TWAP/cooldown values not documented (#825)
Document MIN_RECENTER_INTERVAL (60 s, LiquidityManager.sol:61) and
PRICE_STABILITY_INTERVAL (300 s, PriceOracle.sol:14) in
docs/ARCHITECTURE.md and docs/PRODUCT-TRUTH.md so that agent-facing
and product-facing copy stays traceable to source constants.

Add an inline HTML comment in red-team-program.md next to the
hardcoded 60s/300s sentence pointing to the two source constants,
making drift detectable during code review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 19:51:52 +00:00
johba
baaca1c9b4 Merge pull request 'fix: 'Trigger recenter (account 2 only)' label contradicts public recenter comment (#826)' (#836) from fix/issue-826 into master 2026-03-15 20:45:14 +01:00
openhands
2293ece915 fix: 'Trigger recenter (account 2 only)' label contradicts public recenter comment (#826)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 19:17:16 +00:00
johba
aa3bc020d9 Merge pull request 'fix: Kraiken.sol and Stake.sol absent from agent context across all runs (#829)' (#834) from fix/issue-829 into master 2026-03-15 20:10:07 +01:00
openhands
13d5b40564 fix: Kraiken.sol and Stake.sol absent from agent context across all runs (#829)
Inject Kraiken.sol (outstandingSupply, mint/burn mechanics) and Stake.sol
(snatch, withdrawal, KRK exclusion from floor denominator) into the red-team
agent prompt so agents can reason from actual source rather than guesses.

- red-team.sh: read SOL_KRAIKEN and SOL_STAKE from onchain/src/ alongside
  the other six contracts already injected
- red-team-program.md: add ### Kraiken.sol and ### Stake.sol sections in the
  Source Code reference block (after PriceOracle.sol)
- AGENTS.md: document the full list of injected contracts in a new
  "Red-team Agent Context" section; both files are now listed as in-scope

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 18:41:57 +00:00
johba
682d55f00a Merge pull request 'fix: refactor: extract red-team prompt to red-team-program.md (#819)' (#833) from fix/issue-819 into master 2026-03-15 19:28:40 +01:00
openhands
012b31056e fix: refactor: extract red-team prompt to red-team-program.md (#819)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 17:54:33 +00:00
johba
55cfbeb291 Merge pull request 'fix: feat: red-team sweep should seed each candidate with cross-candidate attack patterns (#822)' (#832) from fix/issue-822 into master 2026-03-15 18:36:26 +01:00
johba
0122546f54 Merge pull request 'chore: add planner watermarks to all AGENTS.md files' (#831) from chore/agents-watermarks into master
Reviewed-on: https://codeberg.org/johba/harb/pulls/831
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-15 18:25:43 +01:00
openhands
4d0390c4fa fix: address review findings for cross-candidate red-team sweep (#822)
- red-team-sweep.sh: reset CROSS_PATTERNS_FILE at sweep start to prevent
  stale patterns from prior invocations contaminating a fresh run
- red-team-sweep.sh: wrap pattern-extraction Python in set +e/set -e and
  capture output so log() prefix is applied; move memory truncation outside
  the if-block so it runs unconditionally even if Python fails
- red-team.sh: filter entries where candidate == current_candidate before
  grouping, removing self-referential cross-candidate evidence
- red-team.sh: skip entries with empty pattern key (both pattern and
  strategy fields empty) to prevent spurious bucket merging

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 17:02:19 +00:00
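
The two red-team.sh filters listed in the commit above (drop entries for the current candidate, skip entries with an empty pattern key) might look roughly like this in Python. The entry fields `candidate`, `pattern`, and `strategy` come from the commit text; the fallback from `pattern` to `strategy` and the function name are assumptions.

```python
from collections import defaultdict


def group_cross_patterns(entries, current_candidate):
    """Group memory entries by pattern key, excluding self-referential ones."""
    groups = defaultdict(list)
    for entry in entries:
        # An entry about the candidate under attack is not cross-candidate evidence.
        if entry.get("candidate") == current_candidate:
            continue
        # Assumed key rule: prefer 'pattern', fall back to 'strategy'.
        key = entry.get("pattern") or entry.get("strategy") or ""
        if not key:
            continue  # an empty key would merge unrelated entries into one bucket
        groups[key].append(entry)
    return dict(groups)
```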
openhands
9a309634ed chore: add planner watermarks to all AGENTS.md files 2026-03-15 16:42:45 +00:00
openhands
9ee1429604 fix: feat: red-team sweep should seed each candidate with cross-candidate attack patterns (#822)
- red-team-sweep.sh: after each candidate completes, extract all memory
  entries into /tmp/red-team-cross-patterns.jsonl (append), then clear
  the raw memory file so the next candidate starts with a fresh state
- red-team.sh: define CROSS_PATTERNS_FILE; before building the prompt,
  read the cross-patterns file and generate a "Cross-Candidate
  Intelligence" section grouped by abstract op pattern — universal
  patterns (broke 2+ candidates), candidate-specific wins, and patterns
  that held everywhere — each annotated with optimizer profiles
- The new section is injected into the Claude prompt above the existing
  Previous Findings block, satisfying all acceptance criteria

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 16:30:54 +00:00
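
The three buckets named in the commit above (universal patterns that broke 2+ candidates, candidate-specific wins, and patterns that held everywhere) can be sketched as a small classifier. The `result` field and the use of `"DECREASED"` to mean a successful attack are assumptions loosely based on a neighboring commit message; the real entry schema is not shown.

```python
from collections import defaultdict


def classify_patterns(entries):
    """Split patterns into (universal, candidate_specific, held_everywhere)."""
    broke = defaultdict(set)  # pattern -> candidates it broke
    seen = set()              # every pattern that was tried at all
    for entry in entries:
        seen.add(entry["pattern"])
        # Assumption: result == "DECREASED" marks a successful attack.
        if entry["result"] == "DECREASED":
            broke[entry["pattern"]].add(entry["candidate"])
    universal = sorted(p for p, cands in broke.items() if len(cands) >= 2)
    specific = sorted(p for p, cands in broke.items() if len(cands) == 1)
    held = sorted(p for p in seen if p not in broke)
    return universal, specific, held
```

Counting distinct candidates rather than entries means a pattern that broke the same candidate twice still counts as candidate-specific.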
johba
bf1a735481 Merge pull request 'fix: feat: red-team memory should track candidate + abstract learnings (#820)' (#830) from fix/issue-820 into master 2026-03-15 17:17:33 +01:00
openhands
7950608179 fix: address review findings for red-team memory tracking (#820)
- make_pattern: replace text.find('stake')/find('unstake') with
  re.search(r'\bstake\b')/re.search(r'\bunstake\b') so 'stake' is never
  found as a substring of 'unstake' (bug #1)
- make_pattern: track first-occurrence position of each op and sort by
  position before building the sequence string, preserving actual
  execution order instead of a hardcoded canonical order (bug #2)
- insight capture: track insight_pri on the current dict; only overwrite
  stored insight when new match has strictly higher priority (lower index),
  preventing a late 'because...' clause from silently replacing an earlier
  'Key Insight:' capture (warning #3)
- run_num: compute max(run)+1 from JSON entries instead of wc -l so run
  numbers stay monotonically increasing after memory trim (info #4)
- red-team-sweep.sh: also set adaptive flag when any r37-r40 register has
  a variable-form assignment (r40 = uint256(someVar)), catching candidates
  where only one branch uses constants (warning #5)
- red-team-sweep.sh: remove unnecessary 'import sys as _sys' in except
  block; sys is already in scope (nit #6)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 15:54:01 +00:00
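
Three of the review fixes in the commit above lend themselves to a short Python sketch: word-boundary matching so `stake` is not found inside `unstake`, ordering ops by first occurrence, and deriving the next run number from `max(run)+1`. The op vocabulary in `OPS`, the `" -> "` separator, and the function signatures are assumptions; only the `re.search` with `\b` and the `max(run)+1` rule come from the commit text.

```python
import json
import re

# Hypothetical op vocabulary; the real list is not shown in the commit.
OPS = ("stake", "unstake", "buy", "sell", "recenter")


def make_pattern(text):
    """Build an op sequence ordered by where each op first appears in text."""
    lowered = text.lower()
    found = []
    for op in OPS:
        # \b keeps 'stake' from matching inside 'unstake'.
        match = re.search(rf"\b{re.escape(op)}\b", lowered)
        if match:
            found.append((match.start(), op))
    # Sort by position to preserve execution order, not a canonical order.
    return " -> ".join(op for _, op in sorted(found))


def next_run_number(jsonl_lines):
    """Return max(run) + 1 so numbering survives trimming of old lines."""
    runs = [json.loads(line).get("run", 0) for line in jsonl_lines if line.strip()]
    return max(runs, default=0) + 1
```

Counting lines instead would reuse run numbers once the memory file is trimmed, which is the regression the commit describes.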
openhands
e7c60edeb6 fix: feat: red-team memory should track candidate + abstract learnings (#820)
- Add CANDIDATE_NAME and OPTIMIZER_PROFILE env vars to red-team.sh
  (defaults to "unknown" for standalone runs)
- Update extract_memory Python: new fields candidate, optimizer_profile,
  pattern (abstract op sequence via make_pattern()), and improved insight
  extraction that also captures WHY explanations (because/since/due to)
- Update MEMORY_SECTION Python: entries now grouped by candidate;
  universal patterns (DECREASED across multiple candidates) surfaced first
- Update prompt: add "Current Attack Target" table with candidate/profile,
  optimizer parameter explanations (CI/AW/AS/DD behavioral impact),
  Rule 9 requiring pattern+insight per strategy, updated report format
  with Pattern/Insight fields and universal-pattern conclusion field
- Update red-team-sweep.sh: after inject, parse OptimizerV3Push3.sol for
  r40/r39/r38/r37 constants to build OPTIMIZER_PROFILE string; pass
  CANDIDATE_NAME and OPTIMIZER_PROFILE as env vars to red-team.sh

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 15:23:43 +00:00
johba
7a09c16966 Merge pull request 'fix: txnBot AGENTS.md ENVIRONMENT enum is stale (#784)' (#815) from fix/issue-784 into master 2026-03-15 16:06:03 +01:00