Move holdout scenarios to separate repo

- Updated holdout.config.ts to use HOLDOUT_SCENARIOS_DIR env var
- Modified evaluate.sh to clone harb-holdout-scenarios repo at runtime
- Deleted scripts/harb-evaluator/scenarios/ directory
- Added .holdout-scenarios/ to .gitignore
- Holdout scenarios are now cloned into .holdout-scenarios/ during evaluation
- Keeping the scenarios out of this repo prevents dev-agent from seeing the holdout test set
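
The directory lookup described above could be sketched as follows. This is a minimal illustration, not the actual contents of holdout.config.ts, which this commit view does not show. The function name `resolveScenariosDir`, the fallback to `.holdout-scenarios`, and the `repoRoot` parameter are all assumptions made for the example; only the `HOLDOUT_SCENARIOS_DIR` variable and the `.holdout-scenarios/` clone target come from the commit message.

```typescript
import * as path from "node:path";

// Assumed default: the directory evaluate.sh clones into, per the commit message.
const DEFAULT_SCENARIOS_DIR = ".holdout-scenarios";

/**
 * Hypothetical helper: pick the holdout scenarios directory.
 * Uses HOLDOUT_SCENARIOS_DIR when it is set and non-blank,
 * otherwise falls back to the assumed default under the repo root.
 */
function resolveScenariosDir(
  env: Record<string, string | undefined> = process.env,
  repoRoot: string = process.cwd(),
): string {
  const configured = env.HOLDOUT_SCENARIOS_DIR?.trim();
  const dir = configured ? configured : DEFAULT_SCENARIOS_DIR;
  // path.resolve leaves an absolute `dir` unchanged and anchors a relative one at repoRoot.
  return path.resolve(repoRoot, dir);
}
```

Under these assumptions, an evaluation run would export `HOLDOUT_SCENARIOS_DIR` after cloning, while a local run without the variable would fall back to the git-ignored `.holdout-scenarios/` directory.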
openhands 2026-03-03 19:57:34 +00:00
parent b2594a28b3
commit 69f6a87e20
4 changed files with 23 additions and 86 deletions

.gitignore

@@ -36,3 +36,6 @@ services/ponder/.ponder/
# Temporary files
/tmp/
logs/
# Holdout scenarios (cloned at runtime by evaluate.sh)
.holdout-scenarios/