fix: evo_run004_champion fitness inflated by token value (#670) (#704)

- Add fitness_flags="token_value_inflation" to evo_run004_champion in
  manifest.jsonl so callers can detect the inflated value without
  discarding the entry entirely.
- Add effective_fitness() helper in evolve.sh pool admission (step 5) that
  returns 0 for any entry with a token_value_inflation flag, preventing
  inflated scores from biasing the top-100 evolved pool ranking or
  eviction decisions.
- Document in evolve.sh that raw fitness values are only comparable within
  the same evaluation run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 01:08:13 +00:00 · c42a1ca768
commit c42a1ca768
parent d0eae8b261
2 changed files with 13 additions and 3 deletions
--- a/tools/push3-evolution/evolve.sh
+++ b/tools/push3-evolution/evolve.sh
@@ -828,9 +828,19 @@ if not new_items:
    sys.exit(0)

 # ── 5. Separate pinned (hand-written) from evolved; top-100 cap on evolved only
-pinned  = [(int(e.get('fitness') or 0), e, None) for e in existing
+#
+# NOTE: raw fitness values are only comparable within the same evaluation run.
+# Entries whose fitness_flags contain 'token_value_inflation' are ranked as
+# fitness=0 so that inflated scores do not bias pool admission or eviction.
+def effective_fitness(entry):
+    flags = entry.get('fitness_flags') or ''
+    if 'token_value_inflation' in flags:
+        return 0
+    return int(entry.get('fitness') or 0)
+
+pinned  = [(effective_fitness(e), e, None) for e in existing
           if e.get('origin') != 'evolved']
-evolved = [(int(e.get('fitness') or 0), e, None) for e in existing
+evolved = [(effective_fitness(e), e, None) for e in existing
           if e.get('origin') == 'evolved']
 for fitness, push3_path, entry in new_items:
    evolved.append((fitness, entry, push3_path))
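
The effect of the helper on ranking can be illustrated with a small standalone sketch. The function body mirrors the one added in the hunk above; the sample entries, their `id` field, and their fitness values are hypothetical, since the real manifest.jsonl schema is not shown in this diff.

```python
def effective_fitness(entry):
    """Same logic as the helper in the diff: flagged entries rank as 0."""
    flags = entry.get('fitness_flags') or ''
    if 'token_value_inflation' in flags:
        return 0
    return int(entry.get('fitness') or 0)


# Hypothetical manifest entries; ids and fitness values are made up.
entries = [
    {'id': 'evo_a', 'origin': 'evolved', 'fitness': 120},
    {'id': 'evo_b', 'origin': 'evolved', 'fitness': 9000,
     'fitness_flags': 'token_value_inflation'},   # inflated score
    {'id': 'evo_c', 'origin': 'evolved', 'fitness': None},  # missing score
    {'id': 'evo_d', 'origin': 'evolved', 'fitness': '75'},  # string score
]

# Rank best-first by effective fitness, as a pool-admission step might.
ranked = sorted(entries, key=effective_fitness, reverse=True)
ranked_ids = [e['id'] for e in ranked]
print(ranked_ids)
```

Here `evo_b` drops below `evo_a` and `evo_d` despite its raw fitness of 9000. Note that when `fitness_flags` is a string, `in` performs a substring test, so any flag value that merely contains `token_value_inflation` also zeroes the score; if the field were stored as a list, `in` would test exact membership instead.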