Commit graph

968 commits

Author SHA1 Message Date
johba
2fc0ce2b60 Merge pull request 'fix: Hardcoded TWAP/cooldown values not documented (#825)' (#839) from fix/issue-825 into master 2026-03-15 21:14:57 +01:00
openhands
0d09f598d9 fix: Hardcoded TWAP/cooldown values not documented (#825)
Document MIN_RECENTER_INTERVAL (60 s, LiquidityManager.sol:61) and
PRICE_STABILITY_INTERVAL (300 s, PriceOracle.sol:14) in
docs/ARCHITECTURE.md and docs/PRODUCT-TRUTH.md so that agent-facing
and product-facing copy stays traceable to source constants.

Add an inline HTML comment in red-team-program.md next to the
hardcoded 60s/300s sentence pointing to the two source constants,
making drift detectable during code review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 19:51:52 +00:00
johba
baaca1c9b4 Merge pull request 'fix: 'Trigger recenter (account 2 only)' label contradicts public recenter comment (#826)' (#836) from fix/issue-826 into master 2026-03-15 20:45:14 +01:00
openhands
2293ece915 fix: 'Trigger recenter (account 2 only)' label contradicts public recenter comment (#826)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 19:17:16 +00:00
johba
aa3bc020d9 Merge pull request 'fix: Kraiken.sol and Stake.sol absent from agent context across all runs (#829)' (#834) from fix/issue-829 into master 2026-03-15 20:10:07 +01:00
openhands
13d5b40564 fix: Kraiken.sol and Stake.sol absent from agent context across all runs (#829)
Inject Kraiken.sol (outstandingSupply, mint/burn mechanics) and Stake.sol
(snatch, withdrawal, KRK exclusion from floor denominator) into the red-team
agent prompt so agents can reason from actual source rather than guesses.

- red-team.sh: read SOL_KRAIKEN and SOL_STAKE from onchain/src/ alongside
  the other six contracts already injected
- red-team-program.md: add ### Kraiken.sol and ### Stake.sol sections in the
  Source Code reference block (after PriceOracle.sol)
- AGENTS.md: document the full list of injected contracts in a new
  "Red-team Agent Context" section; both files are now listed as in-scope

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 18:41:57 +00:00
johba
682d55f00a Merge pull request 'fix: refactor: extract red-team prompt to red-team-program.md (#819)' (#833) from fix/issue-819 into master 2026-03-15 19:28:40 +01:00
openhands
012b31056e fix: refactor: extract red-team prompt to red-team-program.md (#819)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 17:54:33 +00:00
johba
55cfbeb291 Merge pull request 'fix: feat: red-team sweep should seed each candidate with cross-candidate attack patterns (#822)' (#832) from fix/issue-822 into master 2026-03-15 18:36:26 +01:00
johba
0122546f54 Merge pull request 'chore: add planner watermarks to all AGENTS.md files' (#831) from chore/agents-watermarks into master
Reviewed-on: https://codeberg.org/johba/harb/pulls/831
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-15 18:25:43 +01:00
openhands
4d0390c4fa fix: address review findings for cross-candidate red-team sweep (#822)
- red-team-sweep.sh: reset CROSS_PATTERNS_FILE at sweep start to prevent
  stale patterns from prior invocations contaminating a fresh run
- red-team-sweep.sh: wrap pattern-extraction Python in set +e/set -e and
  capture output so log() prefix is applied; move memory truncation outside
  the if-block so it runs unconditionally even if Python fails
- red-team.sh: filter entries where candidate == current_candidate before
  grouping, removing self-referential cross-candidate evidence
- red-team.sh: skip entries with empty pattern key (both pattern and
  strategy fields empty) to prevent spurious bucket merging

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 17:02:19 +00:00
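The two red-team.sh filters described in the commit above (drop entries belonging to the current candidate, skip entries whose pattern key is empty) can be sketched roughly as follows. This is only an illustration, not the script's actual code: the function names are hypothetical, the fallback from `pattern` to `strategy` is an assumed reading of "both pattern and strategy fields empty", and every entry passed in is assumed to record a successful attack.

```python
from collections import defaultdict

def group_cross_patterns(entries, current_candidate):
    """Map each pattern key to the set of *other* candidates it was seen on."""
    groups = defaultdict(set)
    for entry in entries:
        candidate = entry.get("candidate", "unknown")
        if candidate == current_candidate:
            continue  # self-referential evidence is not cross-candidate evidence
        # Assumption: the key falls back to "strategy" when "pattern" is empty.
        key = entry.get("pattern") or entry.get("strategy") or ""
        if not key:
            continue  # an empty key would merge unrelated entries into one bucket
        groups[key].add(candidate)
    return dict(groups)

def shared_patterns(groups):
    """Pattern keys recorded against two or more candidates."""
    return sorted(key for key, cands in groups.items() if len(cands) >= 2)
```

With entries for candidates `a`, `b` and `c` that all share one pattern, calling `group_cross_patterns(entries, "c")` would leave only `a` and `b` as evidence for that pattern.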
openhands
9a309634ed chore: add planner watermarks to all AGENTS.md files 2026-03-15 16:42:45 +00:00
openhands
9ee1429604 fix: feat: red-team sweep should seed each candidate with cross-candidate attack patterns (#822)
- red-team-sweep.sh: after each candidate completes, extract all memory
  entries into /tmp/red-team-cross-patterns.jsonl (append), then clear
  the raw memory file so the next candidate starts with a fresh state
- red-team.sh: define CROSS_PATTERNS_FILE; before building the prompt,
  read the cross-patterns file and generate a "Cross-Candidate
  Intelligence" section grouped by abstract op pattern — universal
  patterns (broke 2+ candidates), candidate-specific wins, and patterns
  that held everywhere — each annotated with optimizer profiles
- The new section is injected into the Claude prompt above the existing
  Previous Findings block, satisfying all acceptance criteria

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 16:30:54 +00:00
johba
bf1a735481 Merge pull request 'fix: feat: red-team memory should track candidate + abstract learnings (#820)' (#830) from fix/issue-820 into master 2026-03-15 17:17:33 +01:00
openhands
7950608179 fix: address review findings for red-team memory tracking (#820)
- make_pattern: replace text.find('stake')/find('unstake') with
  re.search(r'\bstake\b')/re.search(r'\bunstake\b') so 'stake' is never
  found as a substring of 'unstake' (bug #1)
- make_pattern: track first-occurrence position of each op and sort by
  position before building the sequence string, preserving actual
  execution order instead of a hardcoded canonical order (bug #2)
- insight capture: track insight_pri on the current dict; only overwrite
  stored insight when new match has strictly higher priority (lower index),
  preventing a late 'because...' clause from silently replacing an earlier
  'Key Insight:' capture (warning #3)
- run_num: compute max(run)+1 from JSON entries instead of wc -l so run
  numbers stay monotonically increasing after memory trim (info #4)
- red-team-sweep.sh: also set adaptive flag when any r37-r40 register has
  a variable-form assignment (r40 = uint256(someVar)), catching candidates
  where only one branch uses constants (warning #5)
- red-team-sweep.sh: remove unnecessary 'import sys as _sys' in except
  block; sys is already in scope (nit #6)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 15:54:01 +00:00
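The first two make_pattern fixes in the commit above (whole-word matching and ordering by first occurrence) can be illustrated with a minimal sketch. The op vocabulary and the `->` separator are assumptions made for the example; the real make_pattern() may recognise different ops and build its sequence string differently.

```python
import re

# Hypothetical op vocabulary, for illustration only.
OPS = ("stake", "unstake", "recenter")

def make_pattern(text: str) -> str:
    """Return the ops found in text, ordered by first whole-word occurrence."""
    lowered = text.lower()
    found = []
    for op in OPS:
        # \b keeps 'stake' from matching inside 'unstake'.
        match = re.search(rf"\b{re.escape(op)}\b", lowered)
        if match:
            found.append((match.start(), op))
    found.sort()  # position order, not a hardcoded canonical order
    return "->".join(op for _, op in found)

# Substring search is what caused the original bug:
# "unstake".find("stake") returns 2, so 'stake' was reported as present.
```

For example, `make_pattern("Unstake everything, then recenter")` yields `unstake->recenter`, with no spurious `stake` entry.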
openhands
e7c60edeb6 fix: feat: red-team memory should track candidate + abstract learnings (#820)
- Add CANDIDATE_NAME and OPTIMIZER_PROFILE env vars to red-team.sh
  (defaults to "unknown" for standalone runs)
- Update extract_memory Python: new fields candidate, optimizer_profile,
  pattern (abstract op sequence via make_pattern()), and improved insight
  extraction that also captures WHY explanations (because/since/due to)
- Update MEMORY_SECTION Python: entries now grouped by candidate;
  universal patterns (DECREASED across multiple candidates) surfaced first
- Update prompt: add "Current Attack Target" table with candidate/profile,
  optimizer parameter explanations (CI/AW/AS/DD behavioral impact),
  Rule 9 requiring pattern+insight per strategy, updated report format
  with Pattern/Insight fields and universal-pattern conclusion field
- Update red-team-sweep.sh: after inject, parse OptimizerV3Push3.sol for
  r40/r39/r38/r37 constants to build OPTIMIZER_PROFILE string; pass
  CANDIDATE_NAME and OPTIMIZER_PROFILE as env vars to red-team.sh

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 15:23:43 +00:00
johba
7a09c16966 Merge pull request 'fix: txnBot AGENTS.md ENVIRONMENT enum is stale (#784)' (#815) from fix/issue-784 into master 2026-03-15 16:06:03 +01:00
johba
963c0d316a Merge pull request 'fix: feat: red-team agent should read LM and optimizer Solidity source (#821)' (#828) from fix/issue-821 into master 2026-03-15 15:57:21 +01:00
openhands
4779749f2b fix: feat: red-team agent should read LM and optimizer Solidity source (#821)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 14:18:10 +00:00
openhands
afae00ed9f fix: txnBot AGENTS.md ENVIRONMENT enum is stale (#784) 2026-03-15 14:11:58 +00:00
openhands
0f3399a73c fix: txnBot AGENTS.md ENVIRONMENT enum is stale (#784) 2026-03-15 14:11:45 +00:00
johba
504977941e Merge pull request 'fix: fix: red-team prompt missing evm_increaseTime for TWAP-enforced recenter (#823)' (#824) from fix/issue-823 into master 2026-03-15 12:36:49 +01:00
openhands
ff53625c9c fix: fix: red-team prompt missing evm_increaseTime for TWAP-enforced recenter (#823) 2026-03-15 10:47:47 +00:00
openhands
7d0473ade7 fix: fix: red-team prompt missing evm_increaseTime for TWAP-enforced recenter (#823)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 10:47:36 +00:00
johba
d6e5990802 Merge pull request 'fix: ThreePositionStrategy class comment still advertises 1-100% anchor width (#786)' (#813) from fix/issue-786 into master 2026-03-15 10:36:02 +01:00
johba
ff86b3691d chore: extract shared inject.sh, add red-team-sweep.sh (#806)
## What
- `tools/push3-transpiler/inject.sh` — shared transpile+inject logic used by both batch-eval and red-team-sweep
- `batch-eval.sh` — replaced inline 60-line Python block with `inject.sh` call
- `scripts/harb-evaluator/red-team-sweep.sh` — red-teams each kindergarten seed using existing `red-team.sh`, with random smoke test gate

## Why
Sweep script kept breaking because I rewrote the injection logic instead of reusing batch-eval's proven Python. Now there's one copy.

## Testing
- inject.sh tested manually on DO box with optimizer_v3 seed
- Smoke test picks random seed, injects + compiles before starting sweep

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/806
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-15 10:24:03 +01:00
openhands
d3917c551f fix: ThreePositionStrategy class comment still advertises 1-100% anchor width (#786)
- Fix class-level NatSpec: use accurate wording (width computed from
  anchorWidth param provided by Optimizer) instead of imprecise
  LiquidityManager attribution
- Fix inline comment in _setAnchorPosition (same stale 1-100% claim)
- Update PRODUCT-TRUTH.md and ARCHITECTURE.md which had the same
  incorrect 1-100% range claim

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 08:51:12 +00:00
openhands
6dd246cb55 fix: ThreePositionStrategy class comment still advertises 1-100% anchor width (#786) 2026-03-15 08:20:22 +00:00
openhands
f5fdd329c4 fix: ThreePositionStrategy class comment still advertises 1-100% anchor width (#786)
Remove the misleading "(1-100% width)" range claim from the ANCHOR NatSpec.
Anchor width enforcement lives in LiquidityManager, not this abstract, so
the comment is replaced with a note pointing to where enforcement actually occurs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 08:20:13 +00:00
johba
5bb4c72897 Merge pull request 'fix: evo_run007_champion.push3 note has same CI/DD inversion (#790)' (#812) from fix/issue-790 into master 2026-03-15 09:15:00 +01:00
openhands
5891e9ca6b fix: evo_run007_champion.push3 note has same CI/DD inversion (#790)
Corrected run 7 note in manifest.jsonl: CI and DD values were inverted
(CI=20%, DD=0 → CI=0%, DD=20%) to match stack-pop semantics of the
push sequence 200000000000000000 153 200000000000000000 0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 08:00:04 +00:00
johba
6b07ec0e33 Merge pull request 'fix: evo_run007_champion.push3 always returns fixed params regardless of staking (#791)' (#808) from fix/issue-791 into master 2026-03-15 08:51:13 +01:00
openhands
05f41fe10a fix: evo_run007_champion.push3 always returns fixed params regardless of staking (#791) 2026-03-15 07:30:53 +00:00
openhands
d8a109baf8 fix: evo_run007_champion.push3 always returns fixed params regardless of staking (#791)
Replace the broken EXEC.IF branches where TRUE was ( ) and FALSE was
0 DYADIC.POP, causing the trailing push sequence to execute
unconditionally. Now EXEC.IF correctly branches on STAKED > 88%:
  - TRUE  (staked > 88%): bear defaults ( 0 0 0 0 ) — CI=0, AW=0, AS=0, DD=0
  - FALSE (staked ≤ 88%): ( 200000000000000000 153 200000000000000000 0 )
                            — CI=0, AW=153, AS=20%, DD=20%

Also correct the manifest.jsonl run 7 note which had CI and DD inverted
(CI=20%/DD=0 → CI=0/DD=20%).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 07:30:45 +00:00
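The stack-pop reading behind the corrected note can be checked with a small sketch. The pop order used here (CI, AS, AW, DD) is inferred from the values quoted in the commit message rather than taken from the transpiler source, and treating 10**18 as 100% is likewise an assumption.

```python
SCALE = 10**18  # assumption: 10**18 represents 100%

# Assumed pop order, inferred from the values in the commit message above.
POP_ORDER = ("CI", "AS", "AW", "DD")

def read_params(pushed):
    """Pop one value per parameter; the last value pushed is read first."""
    stack = list(pushed)
    return {name: stack.pop() for name in POP_ORDER}

params = read_params([200000000000000000, 153, 200000000000000000, 0])
percent = {k: params[k] * 100 // SCALE for k in ("CI", "AS", "DD")}
```

Under those assumptions the trailing `0` is read as CI and the leading `200000000000000000` as DD, which matches the corrected CI=0, AW=153, AS=20%, DD=20%. Reading the sequence left to right instead reproduces the inverted CI=20%, DD=0 note.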
johba
8a1bfa9b69 Merge pull request 'fix: red-team.sh and export-attacks.py use Base Sepolia addresses labeled as mainnet (#794)' (#805) from fix/issue-794 into master 2026-03-15 08:17:23 +01:00
openhands
7618309db5 fix: red-team.sh and export-attacks.py use Base Sepolia addresses labeled as mainnet (#794)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 06:48:16 +00:00
johba
4f24d6201f Merge pull request 'fix: Old-format CIDs are warned but still silently dropped from the pool (#801)' (#804) from fix/issue-801 into master 2026-03-15 07:37:06 +01:00
openhands
70ef0eb1bc fix: Old-format CIDs are warned but still silently dropped from the pool (#801)
- Change WARNING to explicitly state "legacy CID format ... migration not supported, skipping"
- Expand comment near the startswith('candidate_') guard to document the CID format
  contract and explain why re-admission is intentionally out of scope (no surviving
  generation_N.jsonl files from runs 1-6 exist in the repo)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 06:17:12 +00:00
johba
a983c5cb16 Merge pull request 'fix: No-op varCounter assignment before false branch in processExecIf (#655)' (#803) from fix/issue-655 into master 2026-03-15 06:47:02 +01:00
openhands
73485c66ea fix: No-op varCounter assignment before false branch in processExecIf (#655) 2026-03-15 05:27:14 +00:00
openhands
38c476f71e fix: No-op varCounter assignment before false branch in processExecIf (#655) 2026-03-15 05:27:09 +00:00
johba
b4720e6f5c Merge pull request 'fix: evolve.sh does not write note field — schema drift between hand-written and evolved entries (#719)' (#802) from fix/issue-719 into master 2026-03-15 06:17:27 +01:00
openhands
4a47e8e2d1 fix: evolve.sh does not write `note` field — schema drift between hand-written and evolved entries (#719)
- Pass seed basename into the admission Python block as argv[7]
- Add `note` field to every new evolved entry: "Evolved from <seed> (run<N> gen<G>)"
- Add migration comment noting entries admitted before this fix may have note: null

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 04:57:58 +00:00
johba
5b8c1cc485 Merge pull request 'fix: CID format change silently drops historical generation JSONL on re-admission (#757)' (#800) from fix/issue-757 into master 2026-03-15 05:47:02 +01:00
openhands
6694b2daa8 fix: CID format change silently drops historical generation JSONL on re-admission (#757)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 04:27:38 +00:00
johba
a9431c87ee Merge pull request 'fix: manifest.jsonl schema has no canonical machine-readable definition (#720)' (#799) from fix/issue-720 into master 2026-03-15 05:17:28 +01:00
openhands
2aad9e98f1 fix: manifest.jsonl schema has no canonical machine-readable definition (#720)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 03:57:31 +00:00
johba
bacc104bc9 Merge pull request 'fix: llm-origin entries in manifest have null fitness and no evaluation path (#724)' (#798) from fix/issue-724 into master 2026-03-15 04:49:41 +01:00
openhands
c508efa31f fix: address review findings for evaluate-seeds.sh (#724)
- Replace unquoted heredoc (shell-injection path) with a temp file: the
  shell loop now appends tab-separated filename/score lines to a temp
  file, which is passed as a plain path argument to the Python manifest-
  rewrite block.  Python reads only file contents, never executes shell-
  expanded strings.
- Add early abort on fitness.sh exit code 2 (infra error: Anvil down,
  missing tool).  Iterating past an infra failure produces no useful
  results; aborting immediately surfaces the real problem.
- Remove unused `os` import from the manifest-rewrite Python block.
- Fix inaccurate comment in evolve.sh --diverse-seeds sampling: the pool
  sampler does a flat random shuffle with no fitness weighting; null-
  fitness seeds are not "treated as 0" — they are sampled with equal
  probability to any other seed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 03:29:47 +00:00
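The temp-file handoff described in the commit above might look roughly like this on the Python side. It is a sketch under stated assumptions, not the code in evaluate-seeds.sh: the function name, the `file` key used to match manifest entries, and parsing scores as floats are all hypothetical. Only the tab-separated filename/score layout, the `fitness: null` condition and the atomic write-back come from the commit messages.

```python
import json
import os
import tempfile

def apply_scores(manifest_path, scores_path):
    """Fill null fitness values in a JSONL manifest from filename<TAB>score lines."""
    scores = {}
    with open(scores_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.rstrip("\n")
            if not line:
                continue
            # The line is treated as data only; nothing here is shell-expanded.
            name, _, score = line.partition("\t")
            scores[name] = float(score)  # assumption: scores are numeric

    with open(manifest_path, encoding="utf-8") as fh:
        entries = [json.loads(line) for line in fh if line.strip()]

    updated = 0
    for entry in entries:
        # "file" is a hypothetical key name for the seed filename.
        if entry.get("fitness") is None and entry.get("file") in scores:
            entry["fitness"] = scores[entry["file"]]
            updated += 1

    # Write to a sibling temp file, then rename over the manifest in one step.
    directory = os.path.dirname(os.path.abspath(manifest_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        for entry in entries:
            fh.write(json.dumps(entry) + "\n")
    os.replace(tmp_path, manifest_path)
    return updated
```

Because the filenames reach Python as file contents rather than through an unquoted heredoc, a name such as `$(touch x).push3` stays an ordinary string.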
openhands
cb6e6708b6 fix: `llm`-origin entries in manifest have null fitness and no evaluation path (#724)
- Add evaluate-seeds.sh: standalone script that reads manifest.jsonl,
  finds every entry with fitness: null, runs fitness.sh against each
  seed file, and atomically writes results back to manifest.jsonl.
  Supports --dry-run to preview without evaluating.
- Add comment to --diverse-seeds sampling in evolve.sh documenting that
  null-fitness seeds are included with effective_fitness=0 and that
  evaluate-seeds.sh should be run to score them.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 03:08:29 +00:00