Commit graph

73 commits

Author SHA1 Message Date
johba
74be110fa1 fix: fix: bundled dust cleanup — tools/push3-evolution (#1035)
- #989: Quote $VARIANT_IDX and $NEXT_IDX in printf '%03d' calls in
  evolve.sh (SC2086 — no behavior change, style consistency)
- #612: Already resolved by commit 79a2e2e (fitness.sh switched from
  deployments-local.json to broadcast JSON, eliminating dead Kraiken/Stake reads)
- #945: Already resolved by commit 052ad7a (manifest.schema.json
  fitness_flags description corrected to "Comma-separated")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 22:11:23 +00:00
johba
abac7f7ed7 fix: use None instead of '' for absent fitness_flags to match schema
Review feedback: d.get('fitness_flags') without a default preserves the
null vs absent distinction mandated by the manifest schema (string | null).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 06:02:20 +00:00
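
A minimal sketch of the behaviour abac7f7ed7 describes, assuming evaluator rows are plain dicts parsed from JSONL. The row contents below are hypothetical and only illustrate how `dict.get` without a default surfaces a missing field as `None` (JSON `null`) rather than an empty string:

```python
import json

# Hypothetical evaluator rows: one carries flags, one omits the field entirely.
flagged = {"candidate_id": "a", "fitness_flags": "token_value_inflation"}
unflagged = {"candidate_id": "b"}

# Without a default, a missing key yields None, which serialises to JSON null
# and so fits a "string | null" schema type.
as_null = json.dumps({"fitness_flags": unflagged.get("fitness_flags")})

# With a '' default, the manifest would record an empty string instead.
as_empty = json.dumps({"fitness_flags": unflagged.get("fitness_flags", "")})

print(as_null)
print(as_empty)
```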
johba
e2963cbcba fix: fitness_flags not propagated to manifest entries for newly admitted candidates (#990)
Two changes in evolve.sh pool-admission code:

1. Include `fitness_flags` from evaluator JSONL in the manifest entry dict
   for newly admitted candidates (~line 866-874). Previously the field was
   omitted, so downstream `effective_fitness()` could never zero-rate a new
   candidate.

2. Use `effective_fitness(entry)` when appending new candidates to the
   evolved ranking list (~line 907), so ZERO_RATED_FLAGS defence applies
   at first admission — not only when re-ranking existing entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 05:47:34 +00:00
johba
cb565e8183 fix: line 202: git apply failure path (after --check passes) still uses bare continue, bypassing STOP_REQUESTED (#979)
When `git apply --check` passes but `git apply` itself fails, the code
now checks STOP_REQUESTED before continuing to the next iteration,
consistent with the check at the end of the main loop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 05:17:07 +00:00
openhands
de8cf65d06 fix: push3-evolution tsconfig rootDir too narrow for cross-project imports
Widen rootDir from "." to ".." and include push3-transpiler sources so
tsc can resolve the ../push3-transpiler/src imports that mutate.ts and
test files use.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 09:54:33 +00:00
openhands
79a2e2ee5e fix: deployments-local.json committed to repo (#589)
- Add onchain/deployments-local.json to .gitignore so it is no longer tracked
- Remove the stale committed file from git
- Update fitness.sh to read LM address from forge broadcast JSON
  (DeployLocal.sol's run-latest.json) instead of the potentially stale
  deployments-local.json, matching the approach deploy-optimizer.sh already uses

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 09:48:00 +00:00
openhands
af2f7e6115 fix: OptimizerV3.sol mutation has no CI guard (#631)
batch-eval.sh mutates OptimizerV3.sol by injecting Push3 candidates but
never restores it on exit. Add a backup/restore trap so the file is
always returned to its committed state, and add a CI step that fails
loudly if OptimizerV3.sol is left dirty after any pipeline step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:36:49 +00:00
johba
052ad7ac1c fix: bundled dust cleanup — push3-evolution/evolve.sh (#210) (#987)
## Summary

Bundled dust cleanup for `push3-evolution/evolve.sh` subsystem:

- **#716**: Fix null-fitness crash in generation JSONL parsing — `int(d.get('fitness', 0))` → `int(d.get('fitness') or 0)` (avoids `TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'` when fitness is JSON `null`)
- **#944**: Add `processExecIf_fix` to `ZERO_RATED_FLAGS` so inflated scores from that flag are zero-rated during pool admission/eviction
- **#945**: `fitness_flags` is comma-separated in practice — update `manifest.schema.json` description from 'Space-separated' to 'Comma-separated' and use `flags.split(',')` in `effective_fitness` instead of substring match
- Fix pre-existing SC2086: quote `$i` in `printf` argument (ShellCheck)

## Test plan
- [ ] ShellCheck passes on `tools/push3-evolution/evolve.sh`
- [ ] CI passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/987
Reviewed-by: Disinto_bot <disinto_bot@noreply.codeberg.org>
2026-03-19 07:33:23 +01:00
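
The null-fitness fix (#716) in 052ad7ac1c can be reproduced in isolation. This is a sketch only: `parse_fitness` and the sample rows are hypothetical, not code from `evolve.sh`.

```python
import json


def parse_fitness(line: str) -> int:
    """Return the fitness from one JSONL row, treating null or missing as 0."""
    d = json.loads(line)
    # d.get("fitness", 0) only covers a missing key. An explicit JSON null
    # still yields None, and int(None) raises TypeError, so fall back with `or`.
    return int(d.get("fitness") or 0)


rows = [
    '{"candidate_id": "c0", "fitness": 42}',
    '{"candidate_id": "c1", "fitness": null}',
    '{"candidate_id": "c2"}',
]
print([parse_fitness(r) for r in rows])
```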
openhands
5a6df66541 fix: replace sleep+continue with exit 1 on stale patch to comply with AGENTS.md (#866)
AGENTS.md principle #1/#3 forbids fixed delays. When evolution.patch fails
the pre-flight --check, exit 1 lets the process supervisor handle restart
timing instead of a hardcoded sleep 300 busy-spin.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 20:07:21 +00:00
openhands
acda1f72bb fix: add sleep before continue in stale-patch error path to avoid busy loop (#866)
When git apply --check fails, the daemon now sleeps 300s before retrying,
preventing a tight busy loop that would hammer the git remote indefinitely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 18:49:23 +00:00
openhands
57b83b6fe9 fix: evolution.patch has no apply-validation step in CI or evolve.sh (#866)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 18:49:23 +00:00
openhands
7949640b04 fix: feat: LLM seed — Balanced Adaptive optimizer (#676)
Add llm_balanced.push3: arithmetic-only optimizer that keeps all
outputs in a balanced mid-range. anchorShare=40-60% (linear with
percentageStaked), anchorWidth=10-200 ticks (linear with taxRate),
discoveryDepth=30-50% (linear with percentageStaked), ci=0. No
EXEC.IF branches — all transitions via multiplication and division.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 14:10:36 +00:00
openhands
26df0a15dc fix: evo_run004_champion fitness also stale after #655 (#847)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 00:17:46 +00:00
openhands
a23064f576 fix: batch-eval.sh aborts entire generation on single candidate compile failure (#901)
- Add skip_candidate() helper that emits fitness=0 JSON to stdout and
  tracks the failed score for the output-dir file, satisfying the
  downstream scorer's expectation of one JSON line per candidate.
- Unify all failure paths (transpile, forge build, bytecode extract,
  empty bytecode) through skip_candidate() with a distinct error key.
- Log message now reads "WARNING: <id> compile failed — scoring as 0"
  as required by the acceptance criteria.
- Output-dir scores.jsonl now merges successful + failed scores so the
  file is complete even when some candidates fail to compile.
- All-candidates-fail path (COMPILED_COUNT=0) still exits 2 (no viable
  population); true infra errors (missing tool, bad RPC) unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 06:09:18 +00:00
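
The contract a23064f576 establishes (one JSON line per candidate, failures scored as 0, exit code 2 only when nothing compiles) can be sketched as follows. The real `skip_candidate()` is a shell helper in `batch-eval.sh`; this Python version, its `error` field name, the `transpile_failed` key and the placeholder score are all assumptions made for illustration.

```python
import json


def skip_candidate(candidate_id, error, failed):
    """Record a fitness=0 row for a candidate that could not be compiled."""
    row = {"candidate_id": candidate_id, "fitness": 0, "error": error}
    failed.append(row)
    return json.dumps(row)


def score_generation(candidates, compile_ok):
    """Return (stdout lines, merged scores) with one entry per candidate."""
    ok, failed, lines = [], [], []
    for cid in candidates:
        if compile_ok(cid):
            row = {"candidate_id": cid, "fitness": 1}  # placeholder score
            ok.append(row)
            lines.append(json.dumps(row))
        else:
            # A single failure no longer aborts the whole generation.
            lines.append(skip_candidate(cid, "transpile_failed", failed))
    if not ok:
        # No viable population at all: mirror the exit-2 path.
        raise SystemExit(2)
    return lines, ok + failed


lines, merged = score_generation(["a", "b", "c"], lambda cid: cid != "b")
print("\n".join(lines))
```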
openhands
777bec8563 fix: evolution.patch references removed LiquidityManager constant (pre-existing structural debt) (#842)
Extend the patch to also replace the NatSpec comments above MAX_ANCHOR_WIDTH,
which became misleading after switching to type(uint24).max. The old comments
claimed overflow-safety ("fits in int24"); the new comments document that the
production cap is 1233, that values above 123358 overflow int24 and revert,
and that this is tolerable in the evolution context where reverts score zero
fitness. The patch now correctly updates both the constant and its documentation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 09:20:41 +00:00
openhands
c0fa8c064f fix: evolution.patch references removed LiquidityManager constant (pre-existing structural debt) (#842)
Regenerate evolution.patch from the current ThreePositionStrategy.sol.
The old patch had a corrupt hunk header (@@ -33,7 +33,7 @@ claiming 7 lines
but only supplying 4) and placeholder index hashes (0000000..0000000),
causing `git apply` to reject it with "corrupt patch". MAX_ANCHOR_WIDTH
still exists in the file at value 1233; the patch correctly overrides it
to type(uint24).max for unbounded evolution runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 08:53:33 +00:00
openhands
79bcb81b81 fix: Fitness re-evaluation for fixed evo_run007_champion (#811)
Null out the stale fitness score (7116531284966772550194) for
evo_run007_champion.push3, which was recorded against the buggy
processExecIf interpreter (pre-#655 fix). Setting fitness to null
marks the entry for re-scoring by evaluate-seeds.sh once a valid
ANVIL_FORK_URL is available. Updated the note field to document why
the fitness was cleared.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 23:21:04 +00:00
openhands
aa274fd8ed fix: address review findings for anchorWidth guard (#817)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 22:04:13 +00:00
johba
ff86b3691d chore: extract shared inject.sh, add red-team-sweep.sh (#806)
## What
- `tools/push3-transpiler/inject.sh` — shared transpile+inject logic used by both batch-eval and red-team-sweep
- `batch-eval.sh` — replaced inline 60-line Python block with `inject.sh` call
- `scripts/harb-evaluator/red-team-sweep.sh` — red-teams each kindergarten seed using existing `red-team.sh`, with random smoke test gate

## Why
Sweep script kept breaking because I rewrote the injection logic instead of reusing batch-eval's proven Python. Now there's one copy.

## Testing
- inject.sh tested manually on DO box with optimizer_v3 seed
- Smoke test picks random seed, injects + compiles before starting sweep

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/806
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-15 10:24:03 +01:00
openhands
d8a109baf8 fix: evo_run007_champion.push3 always returns fixed params regardless of staking (#791)
Replace the broken EXEC.IF branches where TRUE was ( ) and FALSE was
0 DYADIC.POP, causing the trailing push sequence to execute
unconditionally. Now EXEC.IF correctly branches on STAKED > 88%:
  - TRUE  (staked > 88%): bear defaults ( 0 0 0 0 ) — CI=0, AW=0, AS=0, DD=0
  - FALSE (staked ≤ 88%): ( 200000000000000000 153 200000000000000000 0 )
                            — CI=0, AW=153, AS=20%, DD=20%

Also correct the manifest.jsonl run 7 note which had CI and DD inverted
(CI=20%/DD=0 → CI=0/DD=20%).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 07:30:45 +00:00
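
The intended branch semantics from d8a109baf8 can be restated as a small Python function. This is not a Push3 interpreter: the function name, the dict keys and the assumption that `percentageStaked` is scaled by 1e18 are illustrative only.

```python
WAD = 10**18  # assumed fixed-point scale, so 20% is 200000000000000000


def run007_params(percentage_staked_wad: int) -> dict:
    """Mirror the corrected EXEC.IF: branch on staked > 88%."""
    if percentage_staked_wad > 88 * WAD // 100:
        # TRUE branch: bear defaults ( 0 0 0 0 ).
        return {"ci": 0, "anchor_width": 0, "anchor_share": 0, "discovery_depth": 0}
    # FALSE branch: CI=0, AW=153, AS=20%, DD=20%.
    return {
        "ci": 0,
        "anchor_width": 153,
        "anchor_share": WAD // 5,
        "discovery_depth": WAD // 5,
    }


print(run007_params(90 * WAD // 100))
print(run007_params(50 * WAD // 100))
```

Before the fix, both branches fell through to the same trailing push sequence, so a sketch of the broken behaviour would return the second dict for every input.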
openhands
70ef0eb1bc fix: Old-format CIDs are warned but still silently dropped from the pool (#801)
- Change WARNING to explicitly state "legacy CID format ... migration not supported, skipping"
- Expand comment near the startswith('candidate_') guard to document the CID format
  contract and explain why re-admission is intentionally out of scope (no surviving
  generation_N.jsonl files from runs 1-6 exist in the repo)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 06:17:12 +00:00
openhands
4a47e8e2d1 fix: evolve.sh does not write `note` field — schema drift between hand-written and evolved entries (#719)
- Pass seed basename into the admission Python block as argv[7]
- Add `note` field to every new evolved entry: "Evolved from <seed> (run<N> gen<G>)"
- Add migration comment noting entries admitted before this fix may have note: null

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 04:57:58 +00:00
openhands
6694b2daa8 fix: CID format change silently drops historical generation JSONL on re-admission (#757)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 04:27:38 +00:00
openhands
2aad9e98f1 fix: manifest.jsonl schema has no canonical machine-readable definition (#720)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 03:57:31 +00:00
openhands
c508efa31f fix: address review findings for evaluate-seeds.sh (#724)
- Replace unquoted heredoc (shell-injection path) with a temp file: the
  shell loop now appends tab-separated filename/score lines to a temp
  file, which is passed as a plain path argument to the Python manifest-
  rewrite block.  Python reads only file contents, never executes shell-
  expanded strings.
- Add early abort on fitness.sh exit code 2 (infra error: Anvil down,
  missing tool).  Iterating past an infra failure produces no useful
  results; aborting immediately surfaces the real problem.
- Remove unused `os` import from the manifest-rewrite Python block.
- Fix inaccurate comment in evolve.sh --diverse-seeds sampling: the pool
  sampler does a flat random shuffle with no fitness weighting; null-
  fitness seeds are not "treated as 0" — they are sampled with equal
  probability to any other seed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 03:29:47 +00:00
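
The temp-file handoff described in c508efa31f keeps shell-expanded strings out of the Python source: Python only reads file contents. A minimal sketch of the reading side, assuming one `filename<TAB>score` pair per line (the function name and the skipping of malformed lines are assumptions):

```python
import os
import tempfile


def read_scores(path: str) -> dict:
    """Parse tab-separated filename/score lines into a dict."""
    scores = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            name, sep, score = line.rstrip("\n").partition("\t")
            if not sep or not name or not score:
                continue  # ignore blank or malformed lines
            # The score is treated purely as data; nothing is evaluated.
            scores[name] = int(score)
    return scores


# Usage: the shell loop would append to this file; here it is written by hand.
fd, tmp_path = tempfile.mkstemp(suffix=".tsv")
with os.fdopen(fd, "w", encoding="utf-8") as out:
    out.write("llm_floor_hugger.push3\t12\n")
    out.write("$(rm -rf x).push3\t3\n")
parsed = read_scores(tmp_path)
os.unlink(tmp_path)
print(parsed)
```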
openhands
cb6e6708b6 fix: `llm`-origin entries in manifest have null fitness and no evaluation path (#724)
- Add evaluate-seeds.sh: standalone script that reads manifest.jsonl,
  finds every entry with fitness: null, runs fitness.sh against each
  seed file, and atomically writes results back to manifest.jsonl.
  Supports --dry-run to preview without evaluating.
- Add comment to --diverse-seeds sampling in evolve.sh documenting that
  null-fitness seeds are included with effective_fitness=0 and that
  evaluate-seeds.sh should be run to score them.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 03:08:29 +00:00
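
One common way to get the atomic write-back that cb6e6708b6 mentions is to write a sibling temp file and rename it over the original. The sketch below assumes that approach; the `file` key and the function name are hypothetical, and the actual `evaluate-seeds.sh` may differ.

```python
import json
import os
import tempfile


def rewrite_manifest(path: str, scores: dict) -> None:
    """Fill in null fitness values, then replace the manifest atomically."""
    with open(path, encoding="utf-8") as fh:
        entries = [json.loads(line) for line in fh if line.strip()]
    for entry in entries:
        # Only entries still marked null are rescored.
        if entry.get("fitness") is None and entry.get("file") in scores:
            entry["fitness"] = scores[entry["file"]]
    # The temp file must live in the same directory so the rename stays
    # on one filesystem.
    target_dir = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as out:
            for entry in entries:
                out.write(json.dumps(entry) + "\n")
        os.replace(tmp, path)  # readers see the old or the new file, never a partial one
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```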
openhands
273615cfed fix: No generic flag dispatch: only `token_value_inflation` is ever zero-rated (#723)
Define ZERO_RATED_FLAGS set near effective_fitness and check each flag
with any(...in flags...) instead of a single hard-coded substring test.
token_value_inflation behaviour is preserved; new flags can be added to
the set without touching the dispatch logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 02:36:57 +00:00
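
A sketch of the set-based dispatch introduced in 273615cfed, written with the later refinements from this log folded in (comma-separated tokens per #945, `processExecIf_fix` per #944). The function body is an assumption, not the code in `evolve.sh`:

```python
ZERO_RATED_FLAGS = {"token_value_inflation", "processExecIf_fix"}


def effective_fitness(entry: dict) -> int:
    """Return 0 for entries carrying any zero-rated flag, else the raw fitness."""
    flags = entry.get("fitness_flags") or ""
    # Exact token match: a substring test would also hit longer flag names
    # that merely contain a zero-rated one.
    tokens = {f.strip() for f in flags.split(",") if f.strip()}
    if tokens & ZERO_RATED_FLAGS:
        return 0
    return int(entry.get("fitness") or 0)


pool = [
    {"id": "a", "fitness": 900, "fitness_flags": "token_value_inflation"},
    {"id": "b", "fitness": 300, "fitness_flags": None},
    {"id": "c", "fitness": 500, "fitness_flags": "some_other_flag"},
]
ranked = sorted(pool, key=effective_fitness, reverse=True)
print([e["id"] for e in ranked])
```

Adding a new zero-rated flag then means extending the set, with no change to the dispatch logic.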
openhands
7930770570 fix: feat: add evolution run 8 champion to seed pool (#781)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 01:31:06 +00:00
openhands
56aedfae49 fix: feat: add evolution run 8 champion to seed pool (#781)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 01:07:12 +00:00
openhands
7a9b4206ae fix: llm_contrarian.push3 AW=150/250 clamped to 100 — three rounds unaddressed (#756)
Replace AW=250 (VERY AGGRESSIVE) with 100 and AW=150 (AGGRESSIVE) with 80
so neither value is silently clamped by LiquidityManager.MAX_ANCHOR_WIDTH=100.
Update header comment block to match the corrected values.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 21:40:31 +00:00
openhands
17c904aaa3 fix: batch-eval.sh MANIFEST_DIR (mktemp -d) has no cleanup trap (#763)
2026-03-14 19:46:50 +00:00
openhands
ab40930812 fix: fitness.sh individual-scoring path still silences errors (#766)
2026-03-14 19:07:17 +00:00
openhands
524a05286e fix: address review feedback on evolution-daemon.sh (#748)
- Stream evolve.sh output directly to stderr instead of buffering in a
  command substitution; long runs (tens of minutes) are now visible live.
- Use an array (EVOLVE_ARGS) for evolve.sh arguments instead of an
  unquoted DIVERSE_FLAG string variable.
- Abort the current run (continue to next loop iteration) when the patch
  fails to apply, rather than silently running with wrong evaluation semantics.
- Fix notify() to pass the message via stdin to avoid SSH single-quote
  interpolation breakage on messages containing special characters.
- Fix step comment/counter mismatch: "Step 7" comment now reads "Step 6"
  to match the [6/7] log label for the summary-write step.
- Clarify in evolution.conf that GAS_LIMIT and ANCHOR_WIDTH_UNBOUNDED are
  documentation-only (they document what evolution.patch does); editing
  them has no runtime effect.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 17:39:20 +00:00
openhands
bbf3b871b3 fix: feat: evolution-daemon.sh — perpetual evolution loop on DO box (#748)
- Add tools/push3-evolution/evolution-daemon.sh: single-command daemon that
  runs git-pull → apply-patch → clean-tmpdirs → evolve.sh → summary →
  notify → revert-patch → loop, handling SIGINT/SIGTERM cleanly.
- Add tools/push3-evolution/evolution.conf: config file (EVAL_MODE, BASE_RPC_URL,
  POPULATION=20, GENERATIONS=30, MUTATION_RATE=1, ELITES=2, DIVERSE_SEEDS=true,
  GAS_LIMIT=500000, ANCHOR_WIDTH_UNBOUNDED=true).
- Add tools/push3-evolution/evolution.patch: overrides CALCULATE_PARAMS_GAS_LIMIT
  200k→500k in Optimizer.sol + FitnessEvaluator.t.sol, and removes
  MAX_ANCHOR_WIDTH=100 cap in LiquidityManager.sol for unbounded AW exploration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 17:21:51 +00:00
openhands
f355974cc8 fix: fix: evolve.sh silences all batch-eval errors with 2>/dev/null (#749)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 16:51:04 +00:00
openhands
89a9d3e575 fix: fix: evolve.sh silences all batch-eval errors with 2>/dev/null (#749)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 16:27:09 +00:00
johba
cf94d4c342 Merge pull request 'fix: fix: evolve.sh stale tmpdirs break subsequent runs (#750)' (#762) from fix/issue-750 into master
2026-03-14 17:19:42 +01:00
openhands
b168a05930 fix: fix: evolve.sh stale tmpdirs break subsequent runs (#750)
Replace `mktemp -d` with a fixed working directory `evolved/.work/` that
is wiped at startup.  Stale `/tmp/tmp.*` directories from killed runs can
no longer interfere with batch-eval.sh path resolution.  Run outputs are
already preserved in `evolved/run_NNN/` before the work dir is cleaned.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 15:48:07 +00:00
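
The fixed-work-directory idea from b168a05930, sketched in Python rather than shell. The helper name is hypothetical; only the `evolved/.work/` layout comes from the commit message.

```python
import shutil
import tempfile
from pathlib import Path


def fresh_work_dir(base: str) -> Path:
    """Wipe and recreate <base>/.work so leftovers from killed runs cannot leak in."""
    work = Path(base) / ".work"
    shutil.rmtree(work, ignore_errors=True)  # tolerate a missing directory
    work.mkdir(parents=True)
    return work


# Usage: a stale file from a previous run disappears on the next startup.
base_dir = tempfile.mkdtemp()
first = fresh_work_dir(base_dir)
(first / "stale.txt").write_text("left over from a killed run")
second = fresh_work_dir(base_dir)
print(sorted(p.name for p in second.iterdir()))
```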
openhands
564f0c5c69 fix: add fitness_flags to evo_run004, fix run007 note accuracy
- evo_run004_champion: added missing fitness_flags field
- evo_run007_champion: clarified both branches (staked<=88% vs >88%)
2026-03-14 15:43:58 +00:00
openhands
37ecf413d8 fix: resolve manifest.jsonl conflict markers
2026-03-14 15:43:58 +00:00
openhands
648e247ce3 feat: add run 7 champion to kindergarten
evo_run007_champion: fitness 7.117e21, anchorWidth=153 (unbounded),
discoveryDepth=0. Simplified to single percentageStaked>88% threshold.
Evolved under IL crystallization attack pressure.
2026-03-14 15:43:58 +00:00
openhands
34f142ae17 feat: add run7 evolution champion to seed pool
2026-03-14 15:43:58 +00:00
openhands
5f7d002e2a feat: add recovered LLM seeds (floor hugger + contrarian)
Recovered from reflog after rebase accident destroyed PRs #692, #699.
Balanced Adaptive (#688) was garbage collected — will be regenerated.
Kindergarten (#683) needs fresh implementation due to evolve.sh conflicts.

Closes #672, #675.
2026-03-14 15:43:58 +00:00
openhands
fafe317fa5 fix: feat: LLM seed — Defensive Floor Hugger optimizer (#672)
Add llm_floor_hugger.push3: pure-constant Push3 optimizer that keeps
anchorShare=0.05e18, anchorWidth=5 ticks, discoveryDepth=0.05e18, CI=0.
Ignores all staking/tax inputs — floor position is always maximised.
Transpiles without error; manifest.jsonl updated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 12:48:22 +00:00
openhands
cd86774ac8 fix: address review findings for #751 — STATE.md and script header docs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 12:17:23 +00:00
openhands
83ab1683f5 fix: fix: EVAL_MODE defaults to anvil — should default to revm (#751)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 11:56:52 +00:00
openhands
266500fde1 fix: address review findings for #752 — regex and STATE.md cleanup
- Fix run_NNN scan regex: r'run(\d+)' → r'run_(\d+)' so it correctly
  matches the underscore-separated directory names the script creates
  (previously always resolved to 001, overwriting the same dir each run)
- Remove [in-progress] tag from STATE.md entry for #752

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 11:27:53 +00:00
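
The regex fix in 266500fde1 is easy to check in isolation: `r'run(\d+)'` never matches a name like `run_001`, because the character after `run` is an underscore, so the scan always fell back to the first run number. A minimal sketch of the auto-increment logic, with a hypothetical helper name:

```python
import re
import tempfile
from pathlib import Path

RUN_DIR = re.compile(r"run_(\d+)")  # underscore-separated, as the script names them


def next_run_dir(base: str) -> Path:
    """Return the path of the next unused run_NNN directory under base."""
    base_path = Path(base)
    numbers = []
    if base_path.is_dir():
        for child in base_path.iterdir():
            match = RUN_DIR.fullmatch(child.name)
            if child.is_dir() and match:
                numbers.append(int(match.group(1)))
    # With no existing runs this yields run_001.
    return base_path / f"run_{max(numbers, default=0) + 1:03d}"


# Usage: with run_001 and run_002 present, the next directory is run_003.
demo_base = tempfile.mkdtemp()
for name in ("run_001", "run_002"):
    (Path(demo_base) / name).mkdir()
print(next_run_dir(demo_base).name)
```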
openhands
b5bf53b010 fix: feat: evolve.sh auto-incrementing per-run results directory (#752)
- --output now accepts a base dir (default: evolved/) instead of requiring
  an explicit path each run
- On each invocation, scan base dir for existing run_NNN/ subdirectories,
  find the highest N, and create run_(N+1)/ for this run's outputs
- All generation JSONL files, best.push3, diff.txt, and evolution.log are
  written to the new run dir — previous runs are never overwritten
- Log header now shows both Base dir and Output (run dir) for clarity

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 11:08:04 +00:00
openhands
b6c07b1d93 fix: generation_N.jsonl candidate_id format mismatch vs filenames (#669)
2026-03-14 04:27:59 +00:00
openhands
0aa819f168 fix: generation_N.jsonl candidate_id format mismatch vs filenames (#669)
2026-03-14 04:07:00 +00:00