Commit graph

53 commits

Author SHA1 Message Date
openhands
7a9b4206ae fix: llm_contrarian.push3 AW=150/250 clamped to 100 — three rounds unaddressed (#756)
Replace AW=250 (VERY AGGRESSIVE) with 100 and AW=150 (AGGRESSIVE) with 80
so neither value is silently clamped by LiquidityManager.MAX_ANCHOR_WIDTH=100.
Update header comment block to match the corrected values.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 21:40:31 +00:00
openhands
17c904aaa3 fix: batch-eval.sh MANIFEST_DIR (mktemp -d) has no cleanup trap (#763) 2026-03-14 19:46:50 +00:00
openhands
ab40930812 fix: fitness.sh individual-scoring path still silences errors (#766) 2026-03-14 19:07:17 +00:00
openhands
524a05286e fix: address review feedback on evolution-daemon.sh (#748)
- Stream evolve.sh output directly to stderr instead of buffering in a
  command substitution; long runs (tens of minutes) are now visible live.
- Use an array (EVOLVE_ARGS) for evolve.sh arguments instead of an
  unquoted DIVERSE_FLAG string variable.
- Abort the current run (continue to next loop iteration) when the patch
  fails to apply, rather than silently running with wrong evaluation semantics.
- Fix notify() to pass the message via stdin to avoid SSH single-quote
  interpolation breakage on messages containing special characters.
- Fix step comment/counter mismatch: "Step 7" comment now reads "Step 6"
  to match the [6/7] log label for the summary-write step.
- Clarify in evolution.conf that GAS_LIMIT and ANCHOR_WIDTH_UNBOUNDED are
  documentation-only (they document what evolution.patch does); editing
  them has no runtime effect.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 17:39:20 +00:00
openhands
bbf3b871b3 fix: feat: evolution-daemon.sh — perpetual evolution loop on DO box (#748)
- Add tools/push3-evolution/evolution-daemon.sh: single-command daemon that
  runs git-pull → apply-patch → clean-tmpdirs → evolve.sh → summary →
  notify → revert-patch → loop, handling SIGINT/SIGTERM cleanly.
- Add tools/push3-evolution/evolution.conf: config file (EVAL_MODE, BASE_RPC_URL,
  POPULATION=20, GENERATIONS=30, MUTATION_RATE=1, ELITES=2, DIVERSE_SEEDS=true,
  GAS_LIMIT=500000, ANCHOR_WIDTH_UNBOUNDED=true).
- Add tools/push3-evolution/evolution.patch: overrides CALCULATE_PARAMS_GAS_LIMIT
  200k→500k in Optimizer.sol + FitnessEvaluator.t.sol, and removes
  MAX_ANCHOR_WIDTH=100 cap in LiquidityManager.sol for unbounded AW exploration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 17:21:51 +00:00
openhands
f355974cc8 fix: fix: evolve.sh silences all batch-eval errors with 2>/dev/null (#749)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 16:51:04 +00:00
openhands
89a9d3e575 fix: fix: evolve.sh silences all batch-eval errors with 2>/dev/null (#749)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 16:27:09 +00:00
johba
cf94d4c342 Merge pull request 'fix: fix: evolve.sh stale tmpdirs break subsequent runs (#750)' (#762) from fix/issue-750 into master 2026-03-14 17:19:42 +01:00
openhands
b168a05930 fix: fix: evolve.sh stale tmpdirs break subsequent runs (#750)
Replace `mktemp -d` with a fixed working directory `evolved/.work/` that
is wiped at startup.  Stale `/tmp/tmp.*` directories from killed runs can
no longer interfere with batch-eval.sh path resolution.  Run outputs are
already preserved in `evolved/run_NNN/` before the work dir is cleaned.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 15:48:07 +00:00
openhands
564f0c5c69 fix: add fitness_flags to evo_run004, fix run007 note accuracy
- evo_run004_champion: added missing fitness_flags field
- evo_run007_champion: clarified both branches (staked<=88% vs >88%)
2026-03-14 15:43:58 +00:00
openhands
37ecf413d8 fix: resolve manifest.jsonl conflict markers 2026-03-14 15:43:58 +00:00
openhands
648e247ce3 feat: add run 7 champion to kindergarten
evo_run007_champion: fitness 7.117e21, anchorWidth=153 (unbounded),
discoveryDepth=0. Simplified to single percentageStaked>88% threshold.
Evolved under IL crystallization attack pressure.
2026-03-14 15:43:58 +00:00
openhands
34f142ae17 feat: add run7 evolution champion to seed pool 2026-03-14 15:43:58 +00:00
openhands
5f7d002e2a feat: add recovered LLM seeds (floor hugger + contrarian)
Recovered from reflog after rebase accident destroyed PRs #692, #699.
Balanced Adaptive (#688) was garbage collected — will be regenerated.
Kindergarten (#683) needs fresh implementation due to evolve.sh conflicts.

Closes #672, #675.
2026-03-14 15:43:58 +00:00
openhands
fafe317fa5 fix: feat: LLM seed — Defensive Floor Hugger optimizer (#672)
Add llm_floor_hugger.push3: pure-constant Push3 optimizer that keeps
anchorShare=0.05e18, anchorWidth=5 ticks, discoveryDepth=0.05e18, CI=0.
Ignores all staking/tax inputs — floor position is always maximised.
Transpiles without error; manifest.jsonl updated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 12:48:22 +00:00
openhands
cd86774ac8 fix: address review findings for #751 — STATE.md and script header docs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 12:17:23 +00:00
openhands
83ab1683f5 fix: fix: EVAL_MODE defaults to anvil — should default to revm (#751)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 11:56:52 +00:00
openhands
266500fde1 fix: address review findings for #752 — regex and STATE.md cleanup
- Fix run_NNN scan regex: r'run(\d+)' → r'run_(\d+)' so it correctly
  matches the underscore-separated directory names the script creates
  (previously always resolved to 001, overwriting the same dir each run)
- Remove [in-progress] tag from STATE.md entry for #752

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 11:27:53 +00:00
openhands
b5bf53b010 fix: feat: evolve.sh auto-incrementing per-run results directory (#752)
- --output now accepts a base dir (default: evolved/) instead of requiring
  an explicit path each run
- On each invocation, scan base dir for existing run_NNN/ subdirectories,
  find the highest N, and create run_(N+1)/ for this run's outputs
- All generation JSONL files, best.push3, diff.txt, and evolution.log are
  written to the new run dir — previous runs are never overwritten
- Log header now shows both Base dir and Output (run dir) for clarity

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 11:08:04 +00:00
openhands
b6c07b1d93 fix: generation_N.jsonl candidate_id format mismatch vs filenames (#669) 2026-03-14 04:27:59 +00:00
openhands
0aa819f168 fix: generation_N.jsonl candidate_id format mismatch vs filenames (#669) 2026-03-14 04:07:00 +00:00
openhands
958b8cfaa0 fix: batch-eval.sh header comment claims wrong candidate_id format (#668) 2026-03-14 03:36:43 +00:00
openhands
c42a1ca768 fix: evo_run004_champion fitness inflated by token value (#670) (#704)
- Add fitness_flags="token_value_inflation" to evo_run004_champion in
  manifest.jsonl so callers can detect the inflated value without
  discarding the entry entirely.
- Add effective_fitness() helper in evolve.sh pool admission (step 5)
  that returns 0 for any entry with a token_value_inflation flag,
  preventing inflated scores from biasing the top-100 evolved pool
  ranking or eviction decisions.
- Document in evolve.sh that raw fitness values are only comparable
  within the same evaluation run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 01:08:13 +00:00
openhands
b7d0b63ca1 fix: int(e.get('fitness', 0)) crashes on null-fitness manifest entries (#711) 2026-03-13 23:37:00 +00:00
johba
0c4cd23dfa fix: feat: Seed kindergarten — persistent top-100 candidate pool (#667) (#683)
Fixes #667

## Changes
## Summary

Implemented persistent top-100 candidate pool in `tools/push3-evolution/evolve.sh`:

### Changes

**`--run-id <N>` flag** (line 96)
- Optional integer; auto-increments from highest `run` field in `manifest.jsonl` when omitted
- Zero-padded to 3 digits (`001`, `002`, …)

**Seeds pool constants** (after path canonicalization)
- `SEEDS_DIR` → `$SCRIPT_DIR/seeds/`
- `POOL_MANIFEST` → `seeds/manifest.jsonl`
- `ADMISSION_THRESHOLD` → `6000000000000000000000` (6e21 wei)

**`--diverse-seeds` mode** now has two paths:
1. **Pool mode** (pool non-empty): random-shuffles the pool and takes up to `POPULATION` candidates — real evolved diversity, not parametric clones
2. **Fallback** (pool empty): original `seed-gen-cli` parametric variant behavior
- Both paths fall back to mutating `--seed` to fill any shortfall

**Step 5 — End-of-run admission** (after the diff step):
1. Scans all `generation_*.jsonl` in `OUTPUT_DIR` for candidates with `fitness ≥ 6e21`
2. Maps `candidate_id` (e.g. `gen2_c005`) back to `.push3` files in `WORK_DIR` (still exists since cleanup fires on EXIT)
3. Deduplicates by SHA-256 content hash against existing pool
4. Names new files `run{RUN_ID}_gen{N}_c{MMM}.push3`
5. Merges with existing pool, sorts by fitness descending, keeps top 100
6. Copies admitted files to `seeds/`, removes evicted evolved files (never hand-written), rewrites `manifest.jsonl`

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/683
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-13 20:45:03 +01:00
johba
e230a62e11 fix: feat: LLM seed — Fee Maximizer optimizer (#673) (#696)
Fixes #673

## Changes
Done. Here's a summary of what was implemented:

## Changes

**`tools/push3-evolution/seeds/llm_fee_maximizer.push3`** (new file)
A Push3 optimizer seed implementing the "every trade pays us" philosophy. It reads `percentageStaked` (slot 0) and `averageTaxRate` (slot 1), then branches on two thresholds:
- Staking threshold: 60% (bullish vs. neutral sentiment)
- Tax threshold: 10% of 1e18 (high vs. low swap volume)

Strategy matrix:
| | tax < 10% | tax ≥ 10% |
|---|---|---|
| **staked < 60%** | AS=0.70, AW=60, DD=0.50 | AS=0.80, AW=80, DD=0.60 |
| **staked ≥ 60%** | AS=0.90, AW=40, DD=0.80 | AS=0.95, AW=50, DD=0.90 |

CI is always 0. Anchor share is always ≥ 0.70e18 (capital stays in fee-earning zone). High staking shifts discovery depth up; high tax widens the anchor to capture more swap volume.

**`tools/push3-evolution/seeds/manifest.jsonl`** — new entry for `llm_fee_maximizer.push3` with `origin=llm`.

Transpiled successfully: 48-line Solidity function body, outputs correctly bound to `ci`, `anchorShare`, `anchorWidth`, `discoveryDepth`.

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/696
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-13 19:56:39 +01:00
openhands
24c4e94a6b fix: feat: LLM seed — Momentum Follower optimizer (#674)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 14:50:26 +00:00
openhands
c7196aa2b0 feat: seed kindergarten — initial population with OG + first evolution champion
seeds/optimizer_v3.push3     — hand-written original (8.26e21)
seeds/evo_run004_champion.push3 — first evolution winner (2.31e24, gen 3)
seeds/manifest.jsonl         — fitness + provenance tracking

Note: champion fitness inflated by token value (#670). Strategy is
'always bull' — wide positions, max anchor share. Pending ETH-only
metric fix to validate.
2026-03-13 10:32:35 +00:00
johba
3f435f8459 fix: evolution scoring — 3 bugs made all candidates report fitness=0 (#665)
## Three bugs in evolve.sh

1. **Heredoc stdin conflict** — `py_stats()` used `<<PYEOF` heredoc which stole stdin from the pipe, so python never received score values → stats always `min=0 max=0 mean=0`

2. **Bash integer overflow** — global best comparison used `[ $MAX -gt $GLOBAL_BEST_FITNESS ]` which overflows on uint256 wei values (>9.2e18) → best always tracked as 0

3. **candidate_id mismatch** — evolve.sh looked up `gen0_c000` but batch-eval produces `candidate_000` (derived from filename) → score lookup always returned default 0

All 3 previous evolution runs (150+ candidates) reported all zeros despite batch-eval correctly scoring them at ~8.26e21 wei.

## Fix
- `py_stats`: heredoc → `python3 -c` inline
- Global best: bash `[ -gt ]` → `python3` big number comparison
- Score lookup: use `basename $CAND_FILE` instead of synthetic CID

Co-authored-by: root <root@debian-g-2vcpu-8gb-ams3-01>
Reviewed-on: https://codeberg.org/johba/harb/pulls/665
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-13 10:02:24 +01:00
openhands
f8b765a9f8 fix: feat: Push3 evolution — crossover operator (#639)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 05:54:48 +00:00
openhands
89a2734bff fix: address review findings for diverse seed population (#638)
- evolve.sh: fix fail-in-subshell bug — run seed-gen-cli as a direct
  command so its exit code is checked by the parent shell and fail()
  aborts the script correctly; redirect stderr to log file instead of
  discarding it with 2>/dev/null
- seed-generator.ts: reorder enumerateVariants() to put
  STAKED_THRESHOLDS outermost (192 entries/block) so that
  selectVariants(6) with stride=192 covers all 6 staked% thresholds;
  remove false doc claim about "first variant is current seed config";
  add comments explaining CI=0n is intentional in all presets
- seed-gen-cli.ts: emit a stderr diagnostic when count exceeds the
  1152-variant cap so the cap is visible rather than silently producing
  fewer files than requested
- test: strengthen n=6 test to assert all STAKED_THRESHOLDS values are
  represented in the selected variants

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 05:21:05 +00:00
openhands
850131b74f fix: feat: Push3 evolution — diverse seed population (#638)
Add seed-generator.ts module and seed-gen-cli.ts CLI that produce
parametric Push3 variants for initial population seeding.

Variants systematically cover:
  - Staked% thresholds: 80, 85, 88, 91, 94, 97
  - Penalty thresholds: 30, 50, 70, 100
  - Bull params: 4 presets (aggressive → mild)
  - Bear params: 4 presets (standard → very mild)
  - Tax distributions: exponential (seed), linear, sqrt

Total combination space: 6×4×4×4×3 = 1152 variants.
selectVariants(n) samples evenly so every axis is represented.

evolve.sh gains --diverse-seeds flag: when set, gen_0 is seeded with
parametric variants instead of N copies of the same mutated seed.
Remaining slots (if population > generated variants) fall back to
mutations of the base seed.

All generated programs pass transpiler stack validation (33 new tests).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 04:48:04 +00:00
openhands
c87064dc6c fix: feat: Push3 default outputs — crash/no-output falls back to bear strategy (#634)
Three defensive layers so every Push3 program runs without reverting:

Layer A (transpiler/index.ts): assign bear defaults (CI=0, AS=0.3e18,
AW=100, DD=0.3e18) to all four outputs at the top of calculateParams.
Any output the evolved program does not overwrite keeps the safe default.

Layer B (transpiler/transpiler.ts): graceful stack underflow — dpop/bpop
return '0'/'false' instead of throwing, and the final output-pop falls
back to bear-default literals when fewer than 4 values remain on the
stack. Wrong output count no longer aborts transpilation.

Layer C (transpiler/transpiler.ts + index.ts): wrap the entire function
body in `unchecked {}` so integer overflow wraps (matching Push3), and
emit `(b == 0 ? 0 : a / b)` for every DYADIC./ (div-by-zero → 0,
matching Push3 no-op semantics).

Layer 2 (Optimizer.sol getLiquidityParams): clamp the three fraction
outputs (capitalInefficiency, anchorShare, discoveryDepth) to [0, 1e18]
after abi.decode so a buggy evolved program cannot produce out-of-range
values even if it runs without reverting.

Regenerated OptimizerV3Push3.sol with the updated transpiler; all 193
tests pass (34 Optimizer/OptimizerV3Push3 tests explicitly).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 03:47:49 +00:00
johba
8e4bd905ac Merge pull request 'fix: fix: Bootstrap VWAP with seed trade during deployment (#567) (#567)' (#633) from fix/issue-567 into master 2026-03-13 03:17:30 +01:00
openhands
5d369cfab6 fix: address review findings for gas-limit fitness pressure (#637)
- Optimizer.sol: move CALCULATE_PARAMS_GAS_LIMIT constant to top of
  contract (after error declaration) to avoid mid-contract placement.
  Expand natspec with EIP-150 63/64 note: callers need ~203 175 gas to
  deliver the full 200 000 budget to the inner staticcall.

- Optimizer.sol: add ret.length < 128 guard before abi.decode in
  getLiquidityParams(). Malformed return data (truncated / wrong ABI)
  from an evolved program now falls back to _bearDefaults() instead of
  propagating an unhandled revert. The 128-byte minimum is the ABI
  encoding of (uint256, uint256, uint24, uint256) — four 32-byte slots.

- Optimizer.sol: add cross-reference comment to _bearDefaults() noting
  that its values must stay in sync with LiquidityManager.recenter()'s
  catch block to prevent silent divergence.

- FitnessEvaluator.t.sol: add CALCULATE_PARAMS_GAS_LIMIT mirror constant
  (must match Optimizer.sol). Disqualify candidates whose measured gas
  exceeds the production cap with fitness=0 and error="gas_over_limit"
  — prevents the pipeline from selecting programs that are functionally
  dead on-chain (would always produce bear defaults in production).

- batch-eval.sh: update output format comment to document the gas_used
  field and over-gas-limit error object added by this feature.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 01:05:37 +00:00
openhands
73a80ead0b fix: add --tc DeployLocal to forge script invocations
Adding SeedSwapper alongside DeployLocal in the same .sol file caused
forge to error "Multiple contracts in the target path" when no --tc flag
was specified, silently failing the CI bootstrap step.

Add --tc DeployLocal to all forge script invocations of DeployLocal.sol:
  - scripts/bootstrap-common.sh  (CI / local bootstrap)
  - tools/deploy-optimizer.sh    (manual deploy tool)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 23:12:25 +00:00
openhands
64f1af3041 fix: feat: Push3 evolution — elitism (top N survive unchanged) (#640)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 22:29:23 +00:00
openhands
c05b20d640 fix: fix: Bootstrap VWAP with seed trade during deployment (#567) (#567)
Deploy scripts (DeployLocal.sol and DeployBase.sol) now execute a
seed buy + double-recenter sequence before handing control to users:

1. Temporarily grant deployer recenterAccess (via self as feeDestination)
2. Fund LM with a small amount and call recenter() -> places thin positions
3. SeedSwapper executes a small buy, generating a non-zero WETH fee
4. Second recenter() hits the cumulativeVolume==0 bootstrap path with
   ethFee>0 -> _recordVolumeAndPrice fires -> cumulativeVolume>0
5. Revoke recenterAccess and restore the real feeDestination

After deployment, cumulativeVolume>0, so the bootstrap path is
unreachable by external users and cannot be front-run by an attacker
inflating the initial VWAP anchor with a whale buy.

Also adds:
- tools/deploy-optimizer.sh: verification step checks cumulativeVolume>0
  after a fresh local deployment
- test_vwapBootstrappedBySeedTrade() in VWAPFloorProtection.t.sol:
  confirms the deploy sequence (recenter + buy + recenter) leaves
  cumulativeVolume>0 and getVWAP()>0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 21:15:35 +00:00
openhands
87bb5859e2 fix: revm evaluator — UUPS bypass, deployedBytecode, graceful attack ops
- Skip UUPS upgradeTo: etch + vm.store ERC1967 implementation slot directly
  (OptimizerV3Push3 is standalone, no UUPS inheritance needed for evolution)
- Use deployedBytecode (runtime) instead of bytecode (creation) for vm.etch
- Inject transpiled body into OptimizerV3.sol (has getLiquidityParams via Optimizer)
  instead of using standalone OptimizerV3Push3.sol
- Wrap buy/sell/stake/unstake in try/catch — attack ops should not abort the batch
- Add /tmp read to fs_permissions for batch-eval manifest files
- Bootstrap recenter returns bool instead of reverting (soft-fail per candidate)
2026-03-12 19:54:58 +00:00
openhands
403a304c98 fix: tighten stack-depth guard to !== 4 to catch overflow (#584)
Reviewer noted that `< 4` only catches underflow; programs leaving 5+
values on the DYADIC stack silently passed isValid().  Change the guard
to `!== 4` so both under- and overflow are rejected, matching the
documented 'exactly 4 outputs' contract.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 14:48:15 +00:00
openhands
e770191e56 fix: transpile() does not throw on <4 stack outputs (#584)
Replace silent ?? '0' fallbacks with an explicit length check that
throws when the DYADIC stack holds fewer than 4 values at program
termination.  isValid() in the evolution pipeline now correctly
rejects underflow programs instead of silently scoring them as valid
with zeroed outputs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 14:07:32 +00:00
openhands
26b8876691 fix: feat: revm-based fitness evaluator for evolution at scale (#604)
Replace per-candidate Anvil+forge-script pipeline with in-process EVM
execution using Foundry's native revm backend, achieving 10-100× speedup
for evolutionary search at scale.

New files:
- onchain/test/FitnessEvaluator.t.sol — Forge test that forks Base once,
  deploys the full KRAIKEN stack, then for each candidate uses vm.etch to
  inject the compiled optimizer bytecode, UUPS-upgrades the proxy, runs all
  attack sequences with in-memory vm.snapshot/revertTo (no RPC overhead),
  and emits one {"candidate_id","fitness"} JSON line per candidate.
  Skips gracefully when BASE_RPC_URL is unset (CI-safe).

- tools/push3-evolution/revm-evaluator/batch-eval.sh — Wrapper that
  transpiles+compiles each candidate sequentially, writes a two-file
  manifest (ids.txt + bytecodes.txt), then invokes FitnessEvaluator.t.sol
  in a single forge test run and parses the score JSON from stdout.

Modified:
- tools/push3-evolution/evolve.sh — Adds EVAL_MODE env var (anvil|revm).
  When EVAL_MODE=revm, batch-scores every candidate in a generation with
  one batch-eval.sh call instead of N sequential fitness.sh processes;
  scores are looked up from the JSONL output in the per-candidate loop.
  Default remains EVAL_MODE=anvil for backward compatibility.

Key design decisions:
- Per-candidate Solidity compilation is unavoidable (each Push3 candidate
  produces different Solidity); the speedup is in the evaluation phase.
- vm.snapshot/revertTo in forge test are O(1) memory operations (true
  revm), not RPC calls — this is the core speedup vs Anvil.
- recenterAccess is set in bootstrap so TWAP stability checks are bypassed
  during attack sequences (mirrors the existing fitness.sh bootstrap).
- Test skips cleanly when BASE_RPC_URL is absent, keeping CI green.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 11:54:41 +00:00
openhands
ade7e2033a fix: Evolution pipeline UUPS upgrade + Foundry PATH (#593)
- Add virtual to Optimizer.calculateParams() for UUPS override
- Create OptimizerV3.sol: UUPS-upgradeable optimizer with transpiled Push3 logic
- Update deploy-optimizer.sh to deploy OptimizerV3 instead of Optimizer
- Add ~/.foundry/bin to PATH in evolve.sh, fitness.sh, deploy-optimizer.sh
2026-03-12 06:47:35 +00:00
openhands
0496c94681 fix: address review findings in evolve.sh (#546)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 22:06:18 +00:00
openhands
2ee7feb621 fix: address review findings in evolve.sh (#546)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 21:29:14 +00:00
openhands
547e8beae8 fix: Push3 evolution: selection loop orchestrator (#546)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 20:56:19 +00:00
openhands
4564637f85 fix: Push3 evolution: fitness scoring wrapper (transpile → deploy → attack → score) (#545)
Address round-2 review findings:
- Move BASELINE_SNAP before deploy-optimizer.sh so cleanup fully reverts the
  deploy on a shared Anvil; fixes nonce/address collision when a second
  sequential evaluation reuses the same chain
- Revert deploy output to capture-and-suppress on success / surface on failure;
  removes per-candidate stderr noise in evolution loop batch runs
- Fix cast rpc anvil_mine arg order to match all other cast rpc calls in script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 20:16:54 +00:00
openhands
0f91234dbe fix: Push3 evolution: fitness scoring wrapper (transpile → deploy → attack → score) (#545)
Address review findings:
- Bug: add BASELINE_SNAP before bootstrap; cleanup reverts it on shared Anvil
  to undo setRecenterAccess/WETH-funding/recenter mutations (was dead code before)
- Bug: require ANVIL_FORK_URL when cold-starting Anvil — DeployLocal.sol needs
  live Base contracts (Uniswap V3 Factory, WETH) that don't exist on a plain fork
- Warning: flag DIRTY and emit warning when anvil_revert fails instead of || true
- Warning: tee deploy-optimizer.sh output to both log file and stderr so progress
  is visible and preserved for post-failure diagnosis
- Nit: replace 50×evm_mine loop with single anvil_mine 0x32 (49 fewer RTTs)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 19:41:06 +00:00
openhands
a8db761de8 fix: Push3 evolution: fitness scoring wrapper (transpile → deploy → attack → score) (#545)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 19:02:00 +00:00
openhands
a6b64d3219 fix: Push3 evolution: mutation operators for bytecode programs (#544)
Implements the five Push3 mutation operators and the meta-operator for
the optimizer evolution pipeline:

- mutateConstant: shifts a random integer literal by ±δ (clamped to 0)
- swapOperator: swaps ADD↔SUB, MUL↔DIV, GT↔LT, GTE↔LTE
- deleteInstruction: removes a random non-EXEC.IF instr; validates result
- insertInstruction: inserts stack-neutral pair (push 0 + DYADIC.POP)
- crossover: single-point crossover of two programs at instruction boundaries
- mutate: applies N random mutations from the four single-program operators

All mutations validate output via transpile() symbolic stack simulation.
Invalid mutations silently return the original program.

35 unit tests cover all operators, edge cases (empty program, single
instruction, deep stack), and the acceptance criterion that
mutate(optimizer_v3, 3) produces ≥10 distinct valid variants in 20 trials.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 16:24:24 +00:00