Add server-side response cache + in-flight coalescing to Ponder's Hono
API layer (services/ponder/src/api/index.ts).
Previously every polling client generated an independent DB query, giving
O(users × 1/poll_interval) load. With a 5 s in-process cache keyed on the
raw request body (POST) or query string (GET), the effective DB hit rate
is capped at O(1/5s) regardless of how many clients are polling.
In-flight coalescing ensures that N concurrent identical queries that
arrive before the first response is ready all share a single DB hit
instead of each issuing their own. Expired entries are evicted every 30 s
to keep memory use bounded.
The 5 s TTL deliberately matches the existing Caddy `Cache-Control:
public, max-age=5` header so that if a caching proxy/CDN is layered in
front later, both layers stay in sync.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>