fix: feat: Seed kindergarten — persistent top-100 candidate pool (#667) (#683)

Fixes #667

## Summary

Implemented a persistent top-100 candidate pool in `tools/push3-evolution/evolve.sh`:

### Changes

**`--run-id <N>` flag** (line 96)
- Optional integer; auto-increments from highest `run` field in `manifest.jsonl` when omitted
- Zero-padded to 3 digits (`001`, `002`, …)
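
A minimal sketch of the auto-increment rule described above, written as a standalone function so it can be exercised without a real `manifest.jsonl`. The function name `next_run_id` and the sample lines are illustrative assumptions, not part of the script; the skip-on-malformed-line behaviour mirrors the heredoc in the diff.

```python
import json


def next_run_id(manifest_lines):
    """Return the next run ID, zero-padded to 3 digits.

    `manifest_lines` is any iterable of JSONL strings (hypothetical helper;
    the real script reads seeds/manifest.jsonl directly).
    """
    max_run = 0
    for line in manifest_lines:
        line = line.strip()
        if not line:
            continue
        try:
            run = json.loads(line).get("run")
            if run is not None:
                max_run = max(max_run, int(run))
        except (json.JSONDecodeError, ValueError, TypeError, AttributeError):
            continue  # malformed or non-object lines are ignored
    return f"{max_run + 1:03d}"


sample = ['{"file": "a.push3", "run": "002"}', '{"file": "hand.push3"}', "not json"]
print(next_run_id(sample))  # 003
print(next_run_id([]))      # 001
```

Entries without a `run` field (for example hand-written seeds) do not affect the counter, so an empty or hand-written-only manifest yields `001`.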

**Seeds pool constants** (after path canonicalization)
- `SEEDS_DIR` → `$SCRIPT_DIR/seeds/`
- `POOL_MANIFEST` → `seeds/manifest.jsonl`
- `ADMISSION_THRESHOLD` → `6000000000000000000000` (6e21 wei)

**`--diverse-seeds` mode** now has two paths:
1. **Pool mode** (pool non-empty): random-shuffles the pool and takes up to `POPULATION` candidates — real evolved diversity, not parametric clones
2. **Fallback** (pool empty): original `seed-gen-cli` parametric variant behavior
- Both paths fall back to mutating `--seed` to fill any shortfall
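
The two paths and the shared shortfall rule can be sketched as follows. This is an illustration under assumptions: `pick_gen0_sources` is a hypothetical name, and the file lists are passed in rather than read from `seeds/` or produced by `seed-gen-cli`.

```python
import random


def pick_gen0_sources(pool_files, variant_files, population, rng=random):
    """Choose gen_0 source files and report how many slots remain.

    Pool mode: shuffle the pool and take up to `population` entries.
    Fallback:  take up to `population` parametric variants.
    The returned shortfall is the number of slots that would be filled
    by mutating the main seed.
    """
    if pool_files:
        shuffled = list(pool_files)
        rng.shuffle(shuffled)
        chosen = shuffled[:population]
    else:
        chosen = list(variant_files)[:population]
    return chosen, population - len(chosen)


chosen, shortfall = pick_gen0_sources(["a.push3", "b.push3"], ["v1.push3"], 5)
print(len(chosen), shortfall)  # 2 3
```

Note that the variants are consulted only when the pool is empty; a small non-empty pool is topped up with seed mutations, not with parametric variants.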

**Step 5 — End-of-run admission** (after the diff step):
1. Scans all `generation_*.jsonl` in `OUTPUT_DIR` for candidates with `fitness ≥ 6e21`
2. Maps `candidate_id` (e.g. `gen2_c005`) back to `.push3` files in `WORK_DIR` (still exists since cleanup fires on EXIT)
3. Deduplicates by SHA-256 content hash against existing pool
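4. Names new files `run{RUN_ID}_gen{NNN}_c{MMM}.push3` (generation zero-padded to 3 digits; a `_r2`, `_r3`, … suffix is appended if a reused run ID would overwrite a different file)
5. Merges with the existing pool, sorts by fitness descending, and keeps the top 100 evolved entries (hand-written seeds are pinned and do not count toward the cap)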
6. Copies admitted files to `seeds/`, removes evicted evolved files (never hand-written), rewrites `manifest.jsonl`
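
The dedup-and-cap logic in steps 3 to 5 can be sketched in memory as below. This is a simplified illustration, not the script's code: `admit` is a hypothetical name, file contents are passed as bytes instead of being read from disk, and the `"hand-written"` origin label is an assumption (the script only tests whether `origin` equals `"evolved"`).

```python
import hashlib

MAX_EVOLVED = 100  # cap on evolved entries only


def admit(existing, candidates, threshold, max_evolved=MAX_EVOLVED):
    """Merge new candidates into the pool.

    existing:   list of dicts with 'fitness', 'origin', 'content' (bytes)
    candidates: list of (fitness, content) tuples from the current run
    Returns (pool, evicted); pinned (non-evolved) entries are never evicted.
    """
    seen = {hashlib.sha256(e["content"]).hexdigest() for e in existing}
    evolved = [e for e in existing if e.get("origin") == "evolved"]
    pinned = [e for e in existing if e.get("origin") != "evolved"]

    # Best candidates first, so a duplicate keeps its highest-fitness record.
    for fitness, content in sorted(candidates, key=lambda c: c[0], reverse=True):
        if fitness < threshold:
            continue
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen:
            continue  # identical program already in the pool or already admitted
        seen.add(digest)
        evolved.append({"fitness": fitness, "origin": "evolved", "content": content})

    evolved.sort(key=lambda e: e["fitness"], reverse=True)
    kept, evicted = evolved[:max_evolved], evolved[max_evolved:]
    pool = sorted(kept + pinned, key=lambda e: e["fitness"], reverse=True)
    return pool, evicted


existing = [
    {"fitness": 5, "origin": "hand-written", "content": b"h"},
    {"fitness": 10, "origin": "evolved", "content": b"a"},
]
candidates = [(12, b"b"), (12, b"b"), (11, b"a"), (3, b"c"), (9, b"d")]
pool, evicted = admit(existing, candidates, threshold=8, max_evolved=2)
print([e["fitness"] for e in pool])     # [12, 10, 5]
print([e["fitness"] for e in evicted])  # [9]
```

In the usage example the hand-written entry survives with the lowest fitness because the cap applies to evolved entries only.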

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/harb/pulls/683
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
johba 2026-03-13 20:45:03 +01:00
parent 944e537d5b
commit 0c4cd23dfa


@@ -12,14 +12,21 @@
# --mutation-rate 2 \
# --elites 2 \
# --output evolved/ \
# [--diverse-seeds]
# [--diverse-seeds] \
# [--run-id <N>]
#
# --diverse-seeds Use the seed generator to initialise gen_0 with parametric
# variants (different staked% thresholds, bull/bear outputs,
# penalty thresholds, and tax distributions) instead of N
# copies of the seed each independently mutated. When the
# generator produces fewer variants than --population the
# remaining slots are filled with mutations of the seed.
# --diverse-seeds Initialise gen_0 with diverse candidates. When the
# persistent seeds pool (tools/push3-evolution/seeds/) is
# non-empty, a random sample from the pool is used (crossover
# between hand-written and evolved programs). When the pool is
# empty, falls back to the parametric seed-gen-cli variants.
# Any shortfall (pool or variants < --population) is filled by
# mutating the main seed.
#
# --run-id <N> Integer identifier for this run, used to name candidates
# admitted to the seeds pool (e.g. run003_gen002_c005.push3).
# Auto-incremented from the highest existing run in the pool
# manifest when omitted.
#
# Algorithm:
# 1. Initialize population: N copies of seed, each with M random mutations.
@@ -75,6 +82,7 @@ MUTATION_RATE=2
ELITES=2
OUTPUT_DIR=""
DIVERSE_SEEDS=false
RUN_ID=""
while [[ $# -gt 0 ]]; do
case $1 in
@@ -85,6 +93,7 @@ while [[ $# -gt 0 ]]; do
--elites) ELITES="$2"; shift 2 ;;
--output) OUTPUT_DIR="$2"; shift 2 ;;
--diverse-seeds) DIVERSE_SEEDS=true; shift ;;
--run-id) RUN_ID="$2"; shift 2 ;;
*) echo "Unknown option: $1" >&2; exit 2 ;;
esac
done
@@ -114,6 +123,44 @@ mkdir -p "$OUTPUT_DIR"
OUTPUT_DIR="$(cd "$OUTPUT_DIR" && pwd)"
LOG="$OUTPUT_DIR/evolution.log"
# Seeds pool — persistent candidate pool across all runs
SEEDS_DIR="$SCRIPT_DIR/seeds"
POOL_MANIFEST="$SEEDS_DIR/manifest.jsonl"
ADMISSION_THRESHOLD=6000000000000000000000 # 6e21 wei
# Validate/auto-compute RUN_ID
if [ -n "$RUN_ID" ]; then
if ! [[ "$RUN_ID" =~ ^[0-9]+$ ]] || [ "$RUN_ID" -lt 1 ]; then
echo "Error: --run-id must be a positive integer (got: $RUN_ID)" >&2
exit 2
fi
RUN_ID=$(printf '%03d' "$RUN_ID")
else
# Auto-increment: find the highest run ID in the manifest and add 1
if [ -f "$POOL_MANIFEST" ]; then
RUN_ID=$(python3 - "$POOL_MANIFEST" <<'PYEOF'
import json, sys
max_run = 0
with open(sys.argv[1]) as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        try:
            d = json.loads(line)
            r = d.get("run")
            if r is not None:
                max_run = max(max_run, int(r))
        except (json.JSONDecodeError, ValueError, TypeError):
            pass
print(f"{max_run + 1:03d}")
PYEOF
) || RUN_ID="001"
else
RUN_ID="001"
fi
fi
# =============================================================================
# Helpers
# =============================================================================
@@ -214,10 +261,6 @@ done
[ -f "$MUTATE_CLI" ] || fail "mutate-cli.ts not found at $MUTATE_CLI"
[ -x "$FITNESS_SH" ] || chmod +x "$FITNESS_SH"
if [ "$DIVERSE_SEEDS" = "true" ]; then
[ -f "$SEED_GEN_CLI" ] || fail "seed-gen-cli.ts not found at $SEED_GEN_CLI"
fi
if [ "$EVAL_MODE" = "revm" ]; then
[ -f "$BATCH_EVAL_SH" ] || fail "batch-eval.sh not found at $BATCH_EVAL_SH"
[ -x "$BATCH_EVAL_SH" ] || chmod +x "$BATCH_EVAL_SH"
@@ -250,6 +293,7 @@ log " Generations: $GENERATIONS"
log " Mutation rate: $MUTATION_RATE"
log " Elites: $ELITES"
log " Diverse seeds: $DIVERSE_SEEDS"
log " Run ID: $RUN_ID"
log " Output: $OUTPUT_DIR"
log " TSX: $TSX_CMD"
log " Eval mode: $EVAL_MODE"
@@ -268,27 +312,72 @@ GEN_DIR="$WORK_DIR/gen_0"
mkdir -p "$GEN_DIR"
if [ "$DIVERSE_SEEDS" = "true" ]; then
# --- Diverse-seeds mode: use seed-gen-cli to produce parametric variants ---
# Generate up to POPULATION variants; any shortfall is filled by mutating the seed.
SEED_VARIANTS_DIR="$WORK_DIR/seed_variants"
SEED_VARIANTS_LIST="$WORK_DIR/seed_variants_list.txt"
# --- Diverse-seeds mode: prefer persistent pool; fall back to seed-gen-cli ---
VARIANT_IDX=0
# Run seed-gen-cli as a direct command (not inside <(...)) so its exit code is
# checked by the parent shell and fail() aborts the entire script on error.
# Stderr goes to the log file for diagnostics rather than being discarded.
run_seed_gen_cli --count "$POPULATION" --output-dir "$SEED_VARIANTS_DIR" \
> "$SEED_VARIANTS_LIST" 2>>"$LOG" \
|| fail "seed-gen-cli.ts failed to generate variants"
# Build a random sample list from the pool in one pass (also determines if
# the pool has any usable entries, avoiding a second manifest parse).
POOL_SAMPLE_LIST="$WORK_DIR/pool_sample.txt"
POOL_COUNT=0
if [ -f "$POOL_MANIFEST" ]; then
python3 - "$POOL_MANIFEST" "$SEEDS_DIR" "$POPULATION" > "$POOL_SAMPLE_LIST" <<'PYEOF'
import json, sys, os, random
manifest_path, seeds_dir, n = sys.argv[1], sys.argv[2], int(sys.argv[3])
entries = []
with open(manifest_path) as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        try:
            d = json.loads(line)
            fpath = os.path.join(seeds_dir, d.get('file', ''))
            # isfile, not exists: an entry without a 'file' key would otherwise
            # resolve to seeds_dir itself and be sampled as a candidate.
            if os.path.isfile(fpath):
                entries.append(fpath)
        except json.JSONDecodeError:
            pass
random.shuffle(entries)
for path in entries[:n]:
    print(path)
PYEOF
POOL_COUNT=$(wc -l < "$POOL_SAMPLE_LIST" 2>/dev/null || echo 0)
fi
while IFS= read -r VARIANT_FILE && [ "$VARIANT_IDX" -lt "$POPULATION" ]; do
CAND_FILE="$GEN_DIR/candidate_$(printf '%03d' $VARIANT_IDX).push3"
cp "$VARIANT_FILE" "$CAND_FILE"
printf '0\n' > "${CAND_FILE%.push3}.ops"
VARIANT_IDX=$((VARIANT_IDX + 1))
done < "$SEED_VARIANTS_LIST"
if [ "$POOL_COUNT" -gt 0 ]; then
# --- Pool mode: random sample from the seeds pool ---
log " diverse-seeds: sampling up to $POPULATION candidates from pool ($POOL_COUNT available)"
# Fill any remaining slots with mutations of the seed (fallback)
while IFS= read -r POOL_FILE && [ "$VARIANT_IDX" -lt "$POPULATION" ]; do
CAND_FILE="$GEN_DIR/candidate_$(printf '%03d' $VARIANT_IDX).push3"
cp "$POOL_FILE" "$CAND_FILE"
printf '0\n' > "${CAND_FILE%.push3}.ops"
VARIANT_IDX=$((VARIANT_IDX + 1))
done < "$POOL_SAMPLE_LIST"
log " diverse-seeds: seeded $VARIANT_IDX candidate(s) from pool"
else
# --- Fallback: parametric variants from seed-gen-cli (pool is empty) ---
log " diverse-seeds: pool empty, falling back to seed-gen-cli parametric variants"
[ -f "$SEED_GEN_CLI" ] || fail "seed-gen-cli.ts not found at $SEED_GEN_CLI"
SEED_VARIANTS_DIR="$WORK_DIR/seed_variants"
SEED_VARIANTS_LIST="$WORK_DIR/seed_variants_list.txt"
# Run seed-gen-cli as a direct command (not inside <(...)) so its exit code is
# checked by the parent shell and fail() aborts the entire script on error.
# Stderr goes to the log file for diagnostics rather than being discarded.
run_seed_gen_cli --count "$POPULATION" --output-dir "$SEED_VARIANTS_DIR" \
> "$SEED_VARIANTS_LIST" 2>>"$LOG" \
|| fail "seed-gen-cli.ts failed to generate variants"
while IFS= read -r VARIANT_FILE && [ "$VARIANT_IDX" -lt "$POPULATION" ]; do
CAND_FILE="$GEN_DIR/candidate_$(printf '%03d' $VARIANT_IDX).push3"
cp "$VARIANT_FILE" "$CAND_FILE"
printf '0\n' > "${CAND_FILE%.push3}.ops"
VARIANT_IDX=$((VARIANT_IDX + 1))
done < "$SEED_VARIANTS_LIST"
fi
# Fill any remaining slots with mutations of the seed
while [ "$VARIANT_IDX" -lt "$POPULATION" ]; do
CAND_FILE="$GEN_DIR/candidate_$(printf '%03d' $VARIANT_IDX).push3"
MUTATED=$(run_mutate_cli mutate "$SEED" "$MUTATION_RATE") \
@@ -298,7 +387,7 @@ if [ "$DIVERSE_SEEDS" = "true" ]; then
VARIANT_IDX=$((VARIANT_IDX + 1))
done
log "Initialized ${POPULATION} candidates in gen_0 (diverse-seeds mode)"
log "Initialized ${POPULATION} candidates in gen_0 (diverse-seeds, pool=$POOL_COUNT)"
else
# --- Default mode: N copies of the seed, each independently mutated ---
for i in $(seq 0 $((POPULATION - 1))); do
@@ -611,3 +700,188 @@ log " Best fitness: $GLOBAL_BEST_FITNESS"
log " Best from gen: $GLOBAL_BEST_GEN"
log " Output directory: $OUTPUT_DIR"
log "========================================================"
# =============================================================================
# Step 5 — Seed pool admission
#
# Scan all generation JSONL files for candidates that scored at or above the
# admission threshold (6e21). Deduplicate by Push3 content hash against the
# existing pool. Admit qualifying candidates into seeds/ and rewrite
# manifest.jsonl, keeping at most the top 100 evolved entries by fitness
# (hand-written seeds are pinned and never evicted).
# =============================================================================
log ""
log "=== Seed pool admission (run=$RUN_ID, threshold=$ADMISSION_THRESHOLD) ==="
mkdir -p "$SEEDS_DIR"
_ADMISSION_OUT="$WORK_DIR/admission_output.txt"
_ADMISSION_RC=0
python3 - "$OUTPUT_DIR" "$WORK_DIR" "$SEEDS_DIR" \
"$ADMISSION_THRESHOLD" "$RUN_ID" "$(date -u '+%Y-%m-%d')" \
> "$_ADMISSION_OUT" 2>&1 <<'PYEOF' || _ADMISSION_RC=$?
import json, sys, os, hashlib, shutil, tempfile
output_dir, work_dir, seeds_dir = sys.argv[1], sys.argv[2], sys.argv[3]
threshold = int(sys.argv[4])
run_id = sys.argv[5]
today = sys.argv[6]
MAX_EVOLVED = 100  # cap applies to evolved entries only; hand-written are always pinned
manifest_path = os.path.join(seeds_dir, 'manifest.jsonl')
# ── 1. Read existing manifest ─────────────────────────────────────────────────
existing = []
if os.path.exists(manifest_path):
    with open(manifest_path) as f:
        for line in f:
            line = line.strip()
            if line:
                try:
                    existing.append(json.loads(line))
                except json.JSONDecodeError:
                    pass
# ── 2. Hash existing pool files for deduplication ────────────────────────────
def file_hash(path):
    with open(path, 'rb') as fh:
        return hashlib.sha256(fh.read()).hexdigest()
existing_hashes = set()
for entry in existing:
    fpath = os.path.join(seeds_dir, entry.get('file', ''))
    # isfile, not exists: an entry without a 'file' key resolves to seeds_dir,
    # which exists but cannot be hashed.
    if os.path.isfile(fpath):
        existing_hashes.add(file_hash(fpath))
# ── 3. Collect qualifying candidates from generation JSONL files ──────────────
qualifying = []  # (fitness, push3_path, gen_idx, cand_str)
for fname in sorted(os.listdir(output_dir)):
    if not (fname.startswith('generation_') and fname.endswith('.jsonl')):
        continue
    try:
        int(fname[len('generation_'):-len('.jsonl')])  # validate integer suffix
    except ValueError:
        continue
    with open(os.path.join(output_dir, fname)) as f:
        for line in f:
            try:
                d = json.loads(line)
                cid = d.get('candidate_id', '')
                fitness = int(d.get('fitness', 0))
                if fitness < threshold:
                    continue
                # cid format: "gen{N}_c{MMM}"
                if not cid.startswith('gen') or '_c' not in cid:
                    continue
                after_gen = cid[3:]  # strip "gen"
                gen_str, cand_str = after_gen.split('_c', 1)
                gen_idx = int(gen_str)
                push3_path = os.path.join(
                    work_dir, f'gen_{gen_idx}',
                    f'candidate_{int(cand_str):03d}.push3'
                )
                if os.path.exists(push3_path):
                    qualifying.append((fitness, push3_path, gen_idx, cand_str))
            except (json.JSONDecodeError, ValueError, TypeError, AttributeError):
                pass
qualifying.sort(key=lambda x: x[0], reverse=True)
# ── 4. Deduplicate and assign filenames (resolve --run-id reuse collisions) ───
new_items = []  # (fitness, push3_path, manifest_entry)
seen = set(existing_hashes)
for fitness, push3_path, gen_idx, cand_str in qualifying:
    h = file_hash(push3_path)
    if h in seen:
        continue
    seen.add(h)
    # Canonical name: run{run_id}_gen{gen_idx:03d}_c{cand_str}.push3
    # If a different file already occupies that name (same run-id reused), add
    # a counter suffix (_r2, _r3, …) until we find an unused or same-content slot.
    base = f'run{run_id}_gen{gen_idx:03d}_c{cand_str}'
    filename = f'{base}.push3'
    dest = os.path.join(seeds_dir, filename)
    if os.path.exists(dest) and file_hash(dest) != h:
        counter = 2
        while True:
            filename = f'{base}_r{counter}.push3'
            dest = os.path.join(seeds_dir, filename)
            if not os.path.exists(dest) or file_hash(dest) == h:
                break
            counter += 1
    entry = {
        'file': filename,
        'fitness': fitness,
        'origin': 'evolved',
        'run': run_id,
        'generation': gen_idx,
        'date': today,
    }
    new_items.append((fitness, push3_path, entry))
if not new_items:
    print(f'No new qualifying candidates from run {run_id} '
          f'(threshold={threshold}, scanned {len(qualifying)} above-threshold hits)')
    sys.exit(0)
# ── 5. Separate pinned (hand-written) from evolved; top-100 cap on evolved only
pinned = [(int(e.get('fitness', 0)), e, None) for e in existing
          if e.get('origin') != 'evolved']
evolved = [(int(e.get('fitness', 0)), e, None) for e in existing
           if e.get('origin') == 'evolved']
for fitness, push3_path, entry in new_items:
    evolved.append((fitness, entry, push3_path))
evolved.sort(key=lambda x: x[0], reverse=True)
admitted_evolved = evolved[:MAX_EVOLVED]
evicted_evolved = evolved[MAX_EVOLVED:]
# ── 6. Copy admitted new files; remove evicted evolved files ─────────────────
admitted_count = 0
for _, entry, src_path in admitted_evolved:
    if src_path is not None:  # new candidate
        dest = os.path.join(seeds_dir, entry['file'])
        shutil.copy2(src_path, dest)
        print(f' admitted: {entry["file"]} fitness={entry["fitness"]}')
        admitted_count += 1
for _, entry, src_path in evicted_evolved:
    if src_path is not None:  # rejected before being copied
        print(f' rejected (below pool floor): {entry["file"]} fitness={entry["fitness"]}')
    else:  # existing evolved entry pushed out
        fpath = os.path.join(seeds_dir, entry.get('file', ''))
        if os.path.isfile(fpath):  # never attempt to remove seeds_dir itself
            os.remove(fpath)
        print(f' evicted from pool: {entry.get("file")} fitness={entry.get("fitness")}')
# Warn if any pinned (hand-written) entry ranks below the current pool floor
if evicted_evolved and pinned:
    pool_floor = evicted_evolved[0][0]
    for fit, entry, _ in pinned:
        if fit <= pool_floor:
            print(f' WARNING: pinned seed "{entry.get("file")}" (fitness={fit}) '
                  f'ranks below evolved pool floor ({pool_floor}) — kept in manifest regardless')
# ── 7. Rewrite manifest.jsonl atomically via temp-file + rename ──────────────
admitted = admitted_evolved + pinned
admitted.sort(key=lambda x: x[0], reverse=True)
manifest_dir = os.path.dirname(manifest_path)
with tempfile.NamedTemporaryFile('w', dir=manifest_dir, delete=False, suffix='.tmp') as tmp:
    tmp_path = tmp.name
    for _, entry, _ in admitted:
        tmp.write(json.dumps(entry) + '\n')
os.replace(tmp_path, manifest_path)
print(f'Pool updated: {len(admitted)} entries total '
      f'({len(admitted_evolved)} evolved + {len(pinned)} pinned), '
      f'+{admitted_count} from run {run_id}')
PYEOF
while IFS= read -r _line; do log " $_line"; done < "$_ADMISSION_OUT"
if [ "$_ADMISSION_RC" -ne 0 ]; then
log " WARNING: seed pool admission failed (exit $_ADMISSION_RC) — pool unchanged"
fi
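
The temp-file + rename pattern used in step 7 of the admission script can be isolated as a small helper. This is a sketch under assumptions: `write_manifest_atomically` is a hypothetical name, and the entries are plain dicts. Creating the temporary file in the manifest's own directory keeps the final `os.replace` on one filesystem, so a reader sees either the old manifest or the new one, not a partially written file.

```python
import json
import os
import tempfile


def write_manifest_atomically(manifest_path, entries):
    """Write one JSON object per line, then swap the file into place."""
    manifest_dir = os.path.dirname(os.path.abspath(manifest_path))
    with tempfile.NamedTemporaryFile(
        "w", dir=manifest_dir, delete=False, suffix=".tmp"
    ) as tmp:
        tmp_path = tmp.name
        for entry in entries:
            tmp.write(json.dumps(entry) + "\n")
    # The temp file is closed (and flushed) before the rename.
    os.replace(tmp_path, manifest_path)


with tempfile.TemporaryDirectory() as seeds_dir:
    path = os.path.join(seeds_dir, "manifest.jsonl")
    write_manifest_atomically(path, [{"file": "a.push3", "fitness": 7}])
    with open(path) as f:
        print(f.read().strip())  # {"file": "a.push3", "fitness": 7}
```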