Commit graph

54 commits

Author SHA1 Message Date
openhands
56aedfae49 fix: feat: add evolution run 8 champion to seed pool (#781)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 01:07:12 +00:00
openhands
7a9b4206ae fix: llm_contrarian.push3 AW=150/250 clamped to 100 — three rounds unaddressed (#756)
Replace AW=250 (VERY AGGRESSIVE) with 100 and AW=150 (AGGRESSIVE) with 80
so neither value is silently clamped by LiquidityManager.MAX_ANCHOR_WIDTH=100.
Update header comment block to match the corrected values.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 21:40:31 +00:00
openhands
17c904aaa3 fix: batch-eval.sh MANIFEST_DIR (mktemp -d) has no cleanup trap (#763) 2026-03-14 19:46:50 +00:00
openhands
ab40930812 fix: fitness.sh individual-scoring path still silences errors (#766) 2026-03-14 19:07:17 +00:00
openhands
524a05286e fix: address review feedback on evolution-daemon.sh (#748)
- Stream evolve.sh output directly to stderr instead of buffering in a
  command substitution; long runs (tens of minutes) are now visible live.
- Use an array (EVOLVE_ARGS) for evolve.sh arguments instead of an
  unquoted DIVERSE_FLAG string variable.
- Abort the current run (continue to next loop iteration) when the patch
  fails to apply, rather than silently running with wrong evaluation semantics.
- Fix notify() to pass the message via stdin to avoid SSH single-quote
  interpolation breakage on messages containing special characters.
- Fix step comment/counter mismatch: "Step 7" comment now reads "Step 6"
  to match the [6/7] log label for the summary-write step.
- Clarify in evolution.conf that GAS_LIMIT and ANCHOR_WIDTH_UNBOUNDED are
  documentation-only (they document what evolution.patch does); editing
  them has no runtime effect.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 17:39:20 +00:00
openhands
bbf3b871b3 fix: feat: evolution-daemon.sh — perpetual evolution loop on DO box (#748)
- Add tools/push3-evolution/evolution-daemon.sh: single-command daemon that
  runs git-pull → apply-patch → clean-tmpdirs → evolve.sh → summary →
  notify → revert-patch → loop, handling SIGINT/SIGTERM cleanly.
- Add tools/push3-evolution/evolution.conf: config file (EVAL_MODE, BASE_RPC_URL,
  POPULATION=20, GENERATIONS=30, MUTATION_RATE=1, ELITES=2, DIVERSE_SEEDS=true,
  GAS_LIMIT=500000, ANCHOR_WIDTH_UNBOUNDED=true).
- Add tools/push3-evolution/evolution.patch: overrides CALCULATE_PARAMS_GAS_LIMIT
  200k→500k in Optimizer.sol + FitnessEvaluator.t.sol, and removes
  MAX_ANCHOR_WIDTH=100 cap in LiquidityManager.sol for unbounded AW exploration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 17:21:51 +00:00
openhands
f355974cc8 fix: fix: evolve.sh silences all batch-eval errors with 2>/dev/null (#749)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 16:51:04 +00:00
openhands
89a9d3e575 fix: fix: evolve.sh silences all batch-eval errors with 2>/dev/null (#749)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 16:27:09 +00:00
johba
cf94d4c342 Merge pull request 'fix: fix: evolve.sh stale tmpdirs break subsequent runs (#750)' (#762) from fix/issue-750 into master 2026-03-14 17:19:42 +01:00
openhands
b168a05930 fix: fix: evolve.sh stale tmpdirs break subsequent runs (#750)
Replace `mktemp -d` with a fixed working directory `evolved/.work/` that
is wiped at startup.  Stale `/tmp/tmp.*` directories from killed runs can
no longer interfere with batch-eval.sh path resolution.  Run outputs are
already preserved in `evolved/run_NNN/` before the work dir is cleaned.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 15:48:07 +00:00
openhands
564f0c5c69 fix: add fitness_flags to evo_run004, fix run007 note accuracy
- evo_run004_champion: added missing fitness_flags field
- evo_run007_champion: clarified both branches (staked<=88% vs >88%)
2026-03-14 15:43:58 +00:00
openhands
37ecf413d8 fix: resolve manifest.jsonl conflict markers 2026-03-14 15:43:58 +00:00
openhands
648e247ce3 feat: add run 7 champion to kindergarten
evo_run007_champion: fitness 7.117e21, anchorWidth=153 (unbounded),
discoveryDepth=0. Simplified to single percentageStaked>88% threshold.
Evolved under IL crystallization attack pressure.
2026-03-14 15:43:58 +00:00
openhands
34f142ae17 feat: add run7 evolution champion to seed pool 2026-03-14 15:43:58 +00:00
openhands
5f7d002e2a feat: add recovered LLM seeds (floor hugger + contrarian)
Recovered from reflog after rebase accident destroyed PRs #692, #699.
Balanced Adaptive (#688) was garbage collected — will be regenerated.
Kindergarten (#683) needs fresh implementation due to evolve.sh conflicts.

Closes #672, #675.
2026-03-14 15:43:58 +00:00
openhands
fafe317fa5 fix: feat: LLM seed — Defensive Floor Hugger optimizer (#672)
Add llm_floor_hugger.push3: pure-constant Push3 optimizer that keeps
anchorShare=0.05e18, anchorWidth=5 ticks, discoveryDepth=0.05e18, CI=0.
Ignores all staking/tax inputs — floor position is always maximised.
Transpiles without error; manifest.jsonl updated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 12:48:22 +00:00
openhands
cd86774ac8 fix: address review findings for #751 — STATE.md and script header docs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 12:17:23 +00:00
openhands
83ab1683f5 fix: fix: EVAL_MODE defaults to anvil — should default to revm (#751)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 11:56:52 +00:00
openhands
266500fde1 fix: address review findings for #752 — regex and STATE.md cleanup
- Fix run_NNN scan regex: r'run(\d+)' → r'run_(\d+)' so it correctly
  matches the underscore-separated directory names the script creates
  (previously always resolved to 001, overwriting the same dir each run)
- Remove [in-progress] tag from STATE.md entry for #752

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 11:27:53 +00:00
openhands
b5bf53b010 fix: feat: evolve.sh auto-incrementing per-run results directory (#752)
- --output now accepts a base dir (default: evolved/) instead of requiring
  an explicit path each run
- On each invocation, scan base dir for existing run_NNN/ subdirectories,
  find the highest N, and create run_(N+1)/ for this run's outputs
- All generation JSONL files, best.push3, diff.txt, and evolution.log are
  written to the new run dir — previous runs are never overwritten
- Log header now shows both Base dir and Output (run dir) for clarity

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 11:08:04 +00:00
openhands
b6c07b1d93 fix: generation_N.jsonl candidate_id format mismatch vs filenames (#669) 2026-03-14 04:27:59 +00:00
openhands
0aa819f168 fix: generation_N.jsonl candidate_id format mismatch vs filenames (#669) 2026-03-14 04:07:00 +00:00
openhands
958b8cfaa0 fix: batch-eval.sh header comment claims wrong candidate_id format (#668) 2026-03-14 03:36:43 +00:00
openhands
c42a1ca768 fix: evo_run004_champion fitness inflated by token value (#670) (#704)
- Add fitness_flags="token_value_inflation" to evo_run004_champion in
  manifest.jsonl so callers can detect the inflated value without
  discarding the entry entirely.
- Add effective_fitness() helper in evolve.sh pool admission (step 5)
  that returns 0 for any entry with a token_value_inflation flag,
  preventing inflated scores from biasing the top-100 evolved pool
  ranking or eviction decisions.
- Document in evolve.sh that raw fitness values are only comparable
  within the same evaluation run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 01:08:13 +00:00
openhands
b7d0b63ca1 fix: int(e.get('fitness', 0)) crashes on null-fitness manifest entries (#711) 2026-03-13 23:37:00 +00:00
johba
0c4cd23dfa fix: feat: Seed kindergarten — persistent top-100 candidate pool (#667) (#683)
Fixes #667

## Changes
## Summary

Implemented persistent top-100 candidate pool in `tools/push3-evolution/evolve.sh`:

### Changes

**`--run-id <N>` flag** (line 96)
- Optional integer; auto-increments from highest `run` field in `manifest.jsonl` when omitted
- Zero-padded to 3 digits (`001`, `002`, …)

**Seeds pool constants** (after path canonicalization)
- `SEEDS_DIR` → `$SCRIPT_DIR/seeds/`
- `POOL_MANIFEST` → `seeds/manifest.jsonl`
- `ADMISSION_THRESHOLD` → `6000000000000000000000` (6e21 wei)

**`--diverse-seeds` mode** now has two paths:
1. **Pool mode** (pool non-empty): random-shuffles the pool and takes up to `POPULATION` candidates — real evolved diversity, not parametric clones
2. **Fallback** (pool empty): original `seed-gen-cli` parametric variant behavior
- Both paths fall back to mutating `--seed` to fill any shortfall

**Step 5 — End-of-run admission** (after the diff step):
1. Scans all `generation_*.jsonl` in `OUTPUT_DIR` for candidates with `fitness ≥ 6e21`
2. Maps `candidate_id` (e.g. `gen2_c005`) back to `.push3` files in `WORK_DIR` (still exists since cleanup fires on EXIT)
3. Deduplicates by SHA-256 content hash against existing pool
4. Names new files `run{RUN_ID}_gen{N}_c{MMM}.push3`
5. Merges with existing pool, sorts by fitness descending, keeps top 100
6. Copies admitted files to `seeds/`, removes evicted evolved files (never hand-written), rewrites `manifest.jsonl`

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/683
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-13 20:45:03 +01:00
johba
e230a62e11 fix: feat: LLM seed — Fee Maximizer optimizer (#673) (#696)
Fixes #673

## Changes
Done. Here's a summary of what was implemented:

## Changes

**`tools/push3-evolution/seeds/llm_fee_maximizer.push3`** (new file)
A Push3 optimizer seed implementing the "every trade pays us" philosophy. It reads `percentageStaked` (slot 0) and `averageTaxRate` (slot 1), then branches on two thresholds:
- Staking threshold: 60% (bullish vs. neutral sentiment)
- Tax threshold: 10% of 1e18 (high vs. low swap volume)

Strategy matrix:
| | tax < 10% | tax ≥ 10% |
|---|---|---|
| **staked < 60%** | AS=0.70, AW=60, DD=0.50 | AS=0.80, AW=80, DD=0.60 |
| **staked ≥ 60%** | AS=0.90, AW=40, DD=0.80 | AS=0.95, AW=50, DD=0.90 |

CI is always 0. Anchor share is always ≥ 0.70e18 (capital stays in fee-earning zone). High staking shifts discovery depth up; high tax widens the anchor to capture more swap volume.

**`tools/push3-evolution/seeds/manifest.jsonl`** — new entry for `llm_fee_maximizer.push3` with `origin=llm`.

Transpiled successfully: 48-line Solidity function body, outputs correctly bound to `ci`, `anchorShare`, `anchorWidth`, `discoveryDepth`.

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/696
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-13 19:56:39 +01:00
openhands
24c4e94a6b fix: feat: LLM seed — Momentum Follower optimizer (#674)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 14:50:26 +00:00
openhands
c7196aa2b0 feat: seed kindergarten — initial population with OG + first evolution champion
seeds/optimizer_v3.push3     — hand-written original (8.26e21)
seeds/evo_run004_champion.push3 — first evolution winner (2.31e24, gen 3)
seeds/manifest.jsonl         — fitness + provenance tracking

Note: champion fitness inflated by token value (#670). Strategy is
'always bull' — wide positions, max anchor share. Pending ETH-only
metric fix to validate.
2026-03-13 10:32:35 +00:00
johba
3f435f8459 fix: evolution scoring — 3 bugs made all candidates report fitness=0 (#665)
## Three bugs in evolve.sh

1. **Heredoc stdin conflict** — `py_stats()` used `<<PYEOF` heredoc which stole stdin from the pipe, so python never received score values → stats always `min=0 max=0 mean=0`

2. **Bash integer overflow** — global best comparison used `[ $MAX -gt $GLOBAL_BEST_FITNESS ]` which overflows on uint256 wei values (>9.2e18) → best always tracked as 0

3. **candidate_id mismatch** — evolve.sh looked up `gen0_c000` but batch-eval produces `candidate_000` (derived from filename) → score lookup always returned default 0

All 3 previous evolution runs (150+ candidates) reported all zeros despite batch-eval correctly scoring them at ~8.26e21 wei.

## Fix
- `py_stats`: heredoc → `python3 -c` inline
- Global best: bash `[ -gt ]` → `python3` big number comparison
- Score lookup: use `basename $CAND_FILE` instead of synthetic CID

Co-authored-by: root <root@debian-g-2vcpu-8gb-ams3-01>
Reviewed-on: https://codeberg.org/johba/harb/pulls/665
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-13 10:02:24 +01:00
openhands
f8b765a9f8 fix: feat: Push3 evolution — crossover operator (#639)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 05:54:48 +00:00
openhands
89a2734bff fix: address review findings for diverse seed population (#638)
- evolve.sh: fix fail-in-subshell bug — run seed-gen-cli as a direct
  command so its exit code is checked by the parent shell and fail()
  aborts the script correctly; redirect stderr to log file instead of
  discarding it with 2>/dev/null
- seed-generator.ts: reorder enumerateVariants() to put
  STAKED_THRESHOLDS outermost (192 entries/block) so that
  selectVariants(6) with stride=192 covers all 6 staked% thresholds;
  remove false doc claim about "first variant is current seed config";
  add comments explaining CI=0n is intentional in all presets
- seed-gen-cli.ts: emit a stderr diagnostic when count exceeds the
  1152-variant cap so the cap is visible rather than silently producing
  fewer files than requested
- test: strengthen n=6 test to assert all STAKED_THRESHOLDS values are
  represented in the selected variants

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 05:21:05 +00:00
openhands
850131b74f fix: feat: Push3 evolution — diverse seed population (#638)
Add seed-generator.ts module and seed-gen-cli.ts CLI that produce
parametric Push3 variants for initial population seeding.

Variants systematically cover:
  - Staked% thresholds: 80, 85, 88, 91, 94, 97
  - Penalty thresholds: 30, 50, 70, 100
  - Bull params: 4 presets (aggressive → mild)
  - Bear params: 4 presets (standard → very mild)
  - Tax distributions: exponential (seed), linear, sqrt

Total combination space: 6×4×4×4×3 = 1152 variants.
selectVariants(n) samples evenly so every axis is represented.

evolve.sh gains --diverse-seeds flag: when set, gen_0 is seeded with
parametric variants instead of N copies of the same mutated seed.
Remaining slots (if population > generated variants) fall back to
mutations of the base seed.

All generated programs pass transpiler stack validation (33 new tests).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 04:48:04 +00:00
openhands
c87064dc6c fix: feat: Push3 default outputs — crash/no-output falls back to bear strategy (#634)
Three defensive layers so every Push3 program runs without reverting:

Layer A (transpiler/index.ts): assign bear defaults (CI=0, AS=0.3e18,
AW=100, DD=0.3e18) to all four outputs at the top of calculateParams.
Any output the evolved program does not overwrite keeps the safe default.

Layer B (transpiler/transpiler.ts): graceful stack underflow — dpop/bpop
return '0'/'false' instead of throwing, and the final output-pop falls
back to bear-default literals when fewer than 4 values remain on the
stack. Wrong output count no longer aborts transpilation.

Layer C (transpiler/transpiler.ts + index.ts): wrap the entire function
body in `unchecked {}` so integer overflow wraps (matching Push3), and
emit `(b == 0 ? 0 : a / b)` for every DYADIC./ (div-by-zero → 0,
matching Push3 no-op semantics).

Layer 2 (Optimizer.sol getLiquidityParams): clamp the three fraction
outputs (capitalInefficiency, anchorShare, discoveryDepth) to [0, 1e18]
after abi.decode so a buggy evolved program cannot produce out-of-range
values even if it runs without reverting.

Regenerated OptimizerV3Push3.sol with the updated transpiler; all 193
tests pass (34 Optimizer/OptimizerV3Push3 tests explicitly).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 03:47:49 +00:00
johba
8e4bd905ac Merge pull request 'fix: fix: Bootstrap VWAP with seed trade during deployment (#567) (#567)' (#633) from fix/issue-567 into master 2026-03-13 03:17:30 +01:00
openhands
5d369cfab6 fix: address review findings for gas-limit fitness pressure (#637)
- Optimizer.sol: move CALCULATE_PARAMS_GAS_LIMIT constant to top of
  contract (after error declaration) to avoid mid-contract placement.
  Expand natspec with EIP-150 63/64 note: callers need ~203 175 gas to
  deliver the full 200 000 budget to the inner staticcall.

- Optimizer.sol: add ret.length < 128 guard before abi.decode in
  getLiquidityParams(). Malformed return data (truncated / wrong ABI)
  from an evolved program now falls back to _bearDefaults() instead of
  propagating an unhandled revert. The 128-byte minimum is the ABI
  encoding of (uint256, uint256, uint24, uint256) — four 32-byte slots.

- Optimizer.sol: add cross-reference comment to _bearDefaults() noting
  that its values must stay in sync with LiquidityManager.recenter()'s
  catch block to prevent silent divergence.

- FitnessEvaluator.t.sol: add CALCULATE_PARAMS_GAS_LIMIT mirror constant
  (must match Optimizer.sol). Disqualify candidates whose measured gas
  exceeds the production cap with fitness=0 and error="gas_over_limit"
  — prevents the pipeline from selecting programs that are functionally
  dead on-chain (would always produce bear defaults in production).

- batch-eval.sh: update output format comment to document the gas_used
  field and over-gas-limit error object added by this feature.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 01:05:37 +00:00
openhands
73a80ead0b fix: add --tc DeployLocal to forge script invocations
Adding SeedSwapper alongside DeployLocal in the same .sol file caused
forge to error "Multiple contracts in the target path" when no --tc flag
was specified, silently failing the CI bootstrap step.

Add --tc DeployLocal to all forge script invocations of DeployLocal.sol:
  - scripts/bootstrap-common.sh  (CI / local bootstrap)
  - tools/deploy-optimizer.sh    (manual deploy tool)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 23:12:25 +00:00
openhands
64f1af3041 fix: feat: Push3 evolution — elitism (top N survive unchanged) (#640)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 22:29:23 +00:00
openhands
c05b20d640 fix: fix: Bootstrap VWAP with seed trade during deployment (#567) (#567)
Deploy scripts (DeployLocal.sol and DeployBase.sol) now execute a
seed buy + double-recenter sequence before handing control to users:

1. Temporarily grant deployer recenterAccess (via self as feeDestination)
2. Fund LM with a small amount and call recenter() -> places thin positions
3. SeedSwapper executes a small buy, generating a non-zero WETH fee
4. Second recenter() hits the cumulativeVolume==0 bootstrap path with
   ethFee>0 -> _recordVolumeAndPrice fires -> cumulativeVolume>0
5. Revoke recenterAccess and restore the real feeDestination

After deployment, cumulativeVolume>0, so the bootstrap path is
unreachable by external users and cannot be front-run by an attacker
inflating the initial VWAP anchor with a whale buy.

Also adds:
- tools/deploy-optimizer.sh: verification step checks cumulativeVolume>0
  after a fresh local deployment
- test_vwapBootstrappedBySeedTrade() in VWAPFloorProtection.t.sol:
  confirms the deploy sequence (recenter + buy + recenter) leaves
  cumulativeVolume>0 and getVWAP()>0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 21:15:35 +00:00
openhands
87bb5859e2 fix: revm evaluator — UUPS bypass, deployedBytecode, graceful attack ops
- Skip UUPS upgradeTo: etch + vm.store ERC1967 implementation slot directly
  (OptimizerV3Push3 is standalone, no UUPS inheritance needed for evolution)
- Use deployedBytecode (runtime) instead of bytecode (creation) for vm.etch
- Inject transpiled body into OptimizerV3.sol (has getLiquidityParams via Optimizer)
  instead of using standalone OptimizerV3Push3.sol
- Wrap buy/sell/stake/unstake in try/catch — attack ops should not abort the batch
- Add /tmp read to fs_permissions for batch-eval manifest files
- Bootstrap recenter returns bool instead of reverting (soft-fail per candidate)
2026-03-12 19:54:58 +00:00
openhands
403a304c98 fix: tighten stack-depth guard to !== 4 to catch overflow (#584)
Reviewer noted that `< 4` only catches underflow; programs leaving 5+
values on the DYADIC stack silently passed isValid().  Change the guard
to `!== 4` so both under- and overflow are rejected, matching the
documented 'exactly 4 outputs' contract.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 14:48:15 +00:00
openhands
e770191e56 fix: transpile() does not throw on <4 stack outputs (#584)
Replace silent ?? '0' fallbacks with an explicit length check that
throws when the DYADIC stack holds fewer than 4 values at program
termination.  isValid() in the evolution pipeline now correctly
rejects underflow programs instead of silently scoring them as valid
with zeroed outputs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 14:07:32 +00:00
openhands
26b8876691 fix: feat: revm-based fitness evaluator for evolution at scale (#604)
Replace per-candidate Anvil+forge-script pipeline with in-process EVM
execution using Foundry's native revm backend, achieving 10-100× speedup
for evolutionary search at scale.

New files:
- onchain/test/FitnessEvaluator.t.sol — Forge test that forks Base once,
  deploys the full KRAIKEN stack, then for each candidate uses vm.etch to
  inject the compiled optimizer bytecode, UUPS-upgrades the proxy, runs all
  attack sequences with in-memory vm.snapshot/revertTo (no RPC overhead),
  and emits one {"candidate_id","fitness"} JSON line per candidate.
  Skips gracefully when BASE_RPC_URL is unset (CI-safe).

- tools/push3-evolution/revm-evaluator/batch-eval.sh — Wrapper that
  transpiles+compiles each candidate sequentially, writes a two-file
  manifest (ids.txt + bytecodes.txt), then invokes FitnessEvaluator.t.sol
  in a single forge test run and parses the score JSON from stdout.

Modified:
- tools/push3-evolution/evolve.sh — Adds EVAL_MODE env var (anvil|revm).
  When EVAL_MODE=revm, batch-scores every candidate in a generation with
  one batch-eval.sh call instead of N sequential fitness.sh processes;
  scores are looked up from the JSONL output in the per-candidate loop.
  Default remains EVAL_MODE=anvil for backward compatibility.

Key design decisions:
- Per-candidate Solidity compilation is unavoidable (each Push3 candidate
  produces different Solidity); the speedup is in the evaluation phase.
- vm.snapshot/revertTo in forge test are O(1) memory operations (true
  revm), not RPC calls — this is the core speedup vs Anvil.
- recenterAccess is set in bootstrap so TWAP stability checks are bypassed
  during attack sequences (mirrors the existing fitness.sh bootstrap).
- Test skips cleanly when BASE_RPC_URL is absent, keeping CI green.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 11:54:41 +00:00
openhands
ade7e2033a fix: Evolution pipeline UUPS upgrade + Foundry PATH (#593)
- Add virtual to Optimizer.calculateParams() for UUPS override
- Create OptimizerV3.sol: UUPS-upgradeable optimizer with transpiled Push3 logic
- Update deploy-optimizer.sh to deploy OptimizerV3 instead of Optimizer
- Add ~/.foundry/bin to PATH in evolve.sh, fitness.sh, deploy-optimizer.sh
2026-03-12 06:47:35 +00:00
openhands
0496c94681 fix: address review findings in evolve.sh (#546)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 22:06:18 +00:00
openhands
2ee7feb621 fix: address review findings in evolve.sh (#546)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 21:29:14 +00:00
openhands
547e8beae8 fix: Push3 evolution: selection loop orchestrator (#546)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 20:56:19 +00:00
openhands
4564637f85 fix: Push3 evolution: fitness scoring wrapper (transpile → deploy → attack → score) (#545)
Address round-2 review findings:
- Move BASELINE_SNAP before deploy-optimizer.sh so cleanup fully reverts the
  deploy on a shared Anvil; fixes nonce/address collision when a second
  sequential evaluation reuses the same chain
- Revert deploy output to capture-and-suppress on success / surface on failure;
  removes per-candidate stderr noise in evolution loop batch runs
- Fix cast rpc anvil_mine arg order to match all other cast rpc calls in script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 20:16:54 +00:00
openhands
0f91234dbe fix: Push3 evolution: fitness scoring wrapper (transpile → deploy → attack → score) (#545)
Address review findings:
- Bug: add BASELINE_SNAP before bootstrap; cleanup reverts it on shared Anvil
  to undo setRecenterAccess/WETH-funding/recenter mutations (was dead code before)
- Bug: require ANVIL_FORK_URL when cold-starting Anvil — DeployLocal.sol needs
  live Base contracts (Uniswap V3 Factory, WETH) that don't exist on a plain fork
- Warning: flag DIRTY and emit warning when anvil_revert fails instead of || true
- Warning: tee deploy-optimizer.sh output to both log file and stderr so progress
  is visible and preserved for post-failure diagnosis
- Nit: replace 50×evm_mine loop with single anvil_mine 0x32 (49 fewer RTTs)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 19:41:06 +00:00
openhands
a8db761de8 fix: Push3 evolution: fitness scoring wrapper (transpile → deploy → attack → score) (#545)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 19:02:00 +00:00