<!-- last-reviewed: 224edcc6d3a61aa5a61b2816061f253584791cfd -->
# Agent Brief: Formulas

Formulas are TOML files that declare automated pipeline jobs for the harb evaluator.
Each formula describes **what** to run, **when**, and **what it produces**; the
orchestrator reads the TOML and dispatches execution to the scripts referenced in
`[execution]`.

## Sense vs Act

Every formula has a `type` field. Getting this wrong breaks orchestrator scheduling
and evidence routing.

| Type | Meaning | Side-effects | Examples |
|------|---------|-------------|----------|
| `sense` | Read-only observation. Produces metrics / evidence only. | No PRs, no code changes, no contract deployments. | `run-holdout`, `run-protocol`, `run-resources`, `run-user-test` |
| `act` | Produces git artifacts: PRs, new files committed to main, contract upgrades. | Opens PRs, commits evidence + champion files, promotes attack vectors. | `run-evolution`, `run-red-team` |

**Rule of thumb:** if the formula's `deliver` step opens a PR or pushes anything
beyond an evidence file (champion files, promoted attack vectors), it is `act`.
If it only commits an evidence JSON to main, it is `sense`.

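As a rough illustration of the distinction, the fragment below sketches the header and product of a `sense` formula. The id, name, and description are hypothetical and do not refer to an existing formula; the `[products.evidence_file]` keys follow the skeleton in "TOML Structure".

```toml
# Hypothetical sense formula: it observes and records, nothing more.
[formula]
id = "run-example-snapshot"   # hypothetical id, not an existing formula
name = "Example Snapshot"
description = "Collect read-only metrics and write them to an evidence file."
type = "sense"                # deliver only commits an evidence JSON to main

[products.evidence_file]
path = "evidence/{category}/{date}.json"
delivery = "commit to main"

# If this formula also opened a PR (for example to admit a champion file or
# promote an attack vector), the type above would need to be "act".
```
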
## Current Formulas

| ID | Type | Script | Cron | Purpose |
|----|------|--------|------|---------|
| `run-evolution` | act | `tools/push3-evolution/evolve.sh` | — | Evolve Push3 optimizer candidates, admit champions to seed pool via PR |
| `run-holdout` | sense | `scripts/harb-evaluator/evaluate.sh` | — | Deploy PR branch, run blind holdout scenarios, report pass/fail |
| `run-protocol` | sense | `scripts/harb-evaluator/run-protocol.sh` | `0 7 * * *` | On-chain health snapshot (TVL, fees, positions, rebalances) |
| `run-red-team` | act | `scripts/harb-evaluator/red-team.sh` | — | Adversarial agent attacks the optimizer; promotes novel attack vectors via PR |
| `run-resources` | sense | `scripts/harb-evaluator/run-resources.sh` | `0 6 * * *` | Infrastructure snapshot (disk, RAM, API budget, CI queue) |
| `run-user-test` | sense | `scripts/run-usertest.sh` | — | Persona-based Playwright UX evaluation |

## Cron Conventions

- Schedules use standard 5-field cron syntax in `[cron] schedule`.
- Stagger by at least 1 hour to avoid resource contention (`run-resources` at 06:00, `run-protocol` at 07:00).
- Only `sense` formulas should be cron-scheduled. An `act` formula on a timer risks unattended PRs.

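As a sketch of these conventions, a hypothetical third scheduled formula could take the next free hourly slot after the two existing ones. The 08:00 slot is an illustrative assumption, not an existing schedule.

```toml
# Fragment of a hypothetical sense formula; the 08:00 slot is illustrative.
[cron]
# Fields: minute hour day-of-month month day-of-week
schedule = "0 8 * * *"   # daily at 08:00, one hour after run-protocol (07:00)
```
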
## Step ID Naming

Steps are declared as `[[steps]]` arrays. Each step must have an `id` field.

**Conventions:**
- Use lowercase kebab-case: `stack-up`, `run-scenarios`, `collect-tvl`.
- Prefix collection steps with `collect-` followed by the metric dimension: `collect-disk`, `collect-ram`, `collect-fees`.
- Every formula must include a `collect` step (assembles the evidence JSON) and a `deliver` step (commits + posts comment).
- Infrastructure lifecycle steps: `stack-up` / `stack-down` (or `boot-stack` / `teardown`).
- Use descriptive verbs: `run-attack-suite`, `evaluate-seeds`, `export-vectors`.

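Applied to a hypothetical light `sense` formula, the conventions above might produce a step list like the following. The step descriptions are illustrative; `collect-disk` and `collect-ram` reuse the example ids from the list above.

```toml
# Hypothetical step list: per-metric collection steps first,
# then the mandatory collect and deliver steps.
[[steps]]
id = "collect-disk"
description = "Record disk usage."

[[steps]]
id = "collect-ram"
description = "Record memory usage."

[[steps]]
id = "collect"
description = "Assemble the collected metrics into the evidence JSON."
output = "evidence/{category}/{date}.json"

[[steps]]
id = "deliver"
description = "Commit the evidence file and post a summary comment."
```
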
## TOML Structure

A formula file follows this skeleton:

```toml
# formulas/run-{name}.toml
#
# One-line description of what this formula does.
#
# Type: sense | act
# Cron: (schedule if applicable, or "—")

[formula]
id = "run-{name}"
name = "Human-Readable Name"
description = "What it does in one sentence."
type = "sense" # or "act"

# [cron] # optional; only for scheduled formulas
# schedule = "0 6 * * *"

[inputs.example_input]
type = "string" # string | integer | number
required = true
description = "What this input controls."

[execution]
script = "path/to/script.sh"
invocation = "ENV_VAR={example_input} bash path/to/script.sh"

[[steps]]
id = "do-something"
description = """
What this step does, in enough detail for a new contributor to understand.
"""

[[steps]]
id = "collect"
description = "Assemble metrics into evidence/{category}/{date}.json."
output = "evidence/{category}/{date}.json"

[[steps]]
id = "deliver"
description = "Commit evidence file and post summary comment to issue."

[products.evidence_file]
path = "evidence/{category}/{date}.json"
delivery = "commit to main"
schema = "evidence/README.md"

[resources]
profile = "light" # or "heavy"
concurrency = "safe to run in parallel" # or "exclusive"
```

## How to Add a New Formula

1. **Pick a name.** File goes in `formulas/run-{name}.toml`. The `[formula] id` must match: `run-{name}`.

2. **Decide sense vs act.** If your formula only reads state and writes evidence → `sense`. If it creates PRs, commits code, or modifies contracts → `act`.

3. **Write the TOML.** Follow the skeleton above. Key sections:
   - `[formula]`: id, name, description, type.
   - `[inputs.*]`: every tuneable parameter the script accepts.
   - `[execution]`: script path and full invocation with `{input}` interpolation.
   - `[[steps]]`: ordered list of logical steps. Always end with `collect` and `deliver`.
   - `[products.*]`: what the formula produces (evidence file, PR, issue comment).
   - `[resources]`: profile (`light` / `heavy`), concurrency constraints.

4. **Write or wire the backing script.** The `[execution] script` must exist and be executable. Most scripts live in `scripts/harb-evaluator/` or `tools/`. Exit codes: `0` = success, `1` = gate failed, `2` = infra error.

5. **Define the evidence schema.** If your formula writes `evidence/{category}/{date}.json`, add the schema to `evidence/README.md`.

6. **Update this file.** Add your formula to the "Current Formulas" table above.

7. **Test locally.** Run the backing script with the required inputs and verify the evidence file is well-formed JSON.

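To illustrate the `{input}` interpolation mentioned in step 3, here is a sketch with two inputs feeding one invocation. The input names, environment variables, and script path are hypothetical and do not refer to an existing formula; it also assumes `required = false` marks an optional input.

```toml
# Hypothetical inputs; each [inputs.<name>] key becomes a {<name>} placeholder.
[inputs.pr_number]
type = "integer"
required = true
description = "Pull request whose branch should be evaluated."

[inputs.scenario_dir]
type = "string"
required = false   # assumption: false marks the input as optional
description = "Directory containing the scenarios to run."

[execution]
script = "scripts/harb-evaluator/example.sh"   # hypothetical path
# Each placeholder is replaced with the matching input value before the script runs.
invocation = "PR_NUMBER={pr_number} SCENARIO_DIR={scenario_dir} bash scripts/harb-evaluator/example.sh"
```
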
## Resource Profiles

| Profile | Meaning | Can run in parallel? |
|---------|---------|---------------------|
| `light` | Shell commands only (df, curl, cast). No Docker, no Anvil. | Yes. Safe to run alongside anything. |
| `heavy` | Needs Anvil on port 8545, Docker containers, or long-running agents. | No, exclusive. Heavy formulas share port bindings and cannot overlap. |

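For a formula that needs Anvil or Docker, the `[resources]` table would plausibly pair the `heavy` profile with the `exclusive` concurrency value. This is a minimal sketch using only the values listed in the skeleton's comments.

```toml
# Sketch for a formula that binds port 8545 or starts containers.
[resources]
profile = "heavy"           # needs Anvil, Docker, or long-running agents
concurrency = "exclusive"   # must not overlap with other heavy formulas
```
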
## Evaluator Integration

Formula execution is dispatched by the orchestrator to scripts in
`scripts/harb-evaluator/`. See [scripts/harb-evaluator/AGENTS.md](../scripts/harb-evaluator/AGENTS.md)
for details on the evaluator runtime: stack lifecycle, scenario execution,
evidence collection, and the adversarial agent harness.