fix: No generic flag dispatch: only `token_value_inflation` is ever zero-rated (#723)

Define a ZERO_RATED_FLAGS set next to effective_fitness and test each
flag with any(flag in flags for flag in ZERO_RATED_FLAGS) instead of a
single hard-coded substring test. Behaviour for token_value_inflation is
preserved; new flags can be added to the set without touching the
dispatch logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
openhands 2026-03-15 02:36:57 +00:00
parent 5dfa824161
commit 273615cfed


@@ -861,11 +861,18 @@ if not new_items:
 # ── 5. Separate pinned (hand-written) from evolved; top-100 cap on evolved only
 #
 # NOTE: raw fitness values are only comparable within the same evaluation run.
-# Entries with fitness_flags='token_value_inflation' (or other flags) are ranked
+# Entries whose fitness_flags contain any flag in ZERO_RATED_FLAGS are ranked
 # as fitness=0 so that inflated scores do not bias pool admission or eviction.
+#
+# ZERO_RATED_FLAGS: canonical set of flag strings that force effective_fitness=0.
+# Add new inflation/distortion flags here; no other code change is required.
+ZERO_RATED_FLAGS = {
+    'token_value_inflation',
+}
 def effective_fitness(entry):
     flags = entry.get('fitness_flags') or ''
-    if 'token_value_inflation' in flags:
+    if any(flag in flags for flag in ZERO_RATED_FLAGS):
         return 0
     return int(entry.get('fitness') or 0)
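
For reference, the post-patch logic can be run on its own. The following is a minimal sketch, not the full script: ZERO_RATED_FLAGS and effective_fitness are taken from the diff, while the sample entries and their 'id' keys are hypothetical and exist only for illustration. It assumes fitness_flags is a string (or missing), so `flag in flags` remains a substring test, the same kind of match the old hard-coded check performed.

```python
# Flag strings that force effective_fitness to 0 (as defined in the patch).
ZERO_RATED_FLAGS = {
    'token_value_inflation',
}


def effective_fitness(entry):
    """Return 0 for zero-rated entries, otherwise the integer fitness."""
    # A missing or None fitness_flags value is treated as "no flags".
    flags = entry.get('fitness_flags') or ''
    # Substring test per flag, matching the behaviour of the old check.
    if any(flag in flags for flag in ZERO_RATED_FLAGS):
        return 0
    return int(entry.get('fitness') or 0)


# Hypothetical sample entries, for illustration only.
entries = [
    {'id': 'a', 'fitness': 90, 'fitness_flags': 'token_value_inflation'},
    {'id': 'b', 'fitness': 40, 'fitness_flags': None},
    {'id': 'c', 'fitness': '25'},
    {'id': 'd', 'fitness': None, 'fitness_flags': ''},
]

# Rank by effective fitness, highest first; flagged entries sink to 0.
ranked = sorted(entries, key=effective_fitness, reverse=True)
print([(e['id'], effective_fitness(e)) for e in ranked])
```

Because the match is a substring test, any flag string that merely contains a zero-rated name would also be zero-rated. If fitness_flags were ever stored as a delimited list, an exact per-token comparison might be the safer choice, but that would be a behaviour change beyond this commit.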