Be fast without being wrong. Reuse the facts that rarely change, re-fetch the ones that move, and let the server do the waiting.

Cache what is stable, not what moves

Cache for the run:
  • capabilities, keyed by its version
  • Identity from whoami (what to cache)
  • Resolved bucket_id / forwarder_id

Never cache:
  • Search results as if they were live truth
  • DLQ depth, health counts, replay progress
  • A cursor across a different query or window
Cache identity in the agent process and refresh only on unauthorized, active_org_required, forbidden_scope, or quota_exceeded. Re-fetch capabilities only after repost update or when a needed command is absent.
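
The refresh rule above can be sketched as a small per-run cache. This is a minimal illustration, not the tool's own API: the RunCache class, the fetch_identity and fetch_capabilities callables, and the "commands" key in the capabilities payload are all assumptions made for the example. Only the four error codes come from the text.

```python
from typing import Callable, Optional

# Error codes after which cached identity is refreshed (taken from the text above).
REFRESH_IDENTITY_ON = {
    "unauthorized",
    "active_org_required",
    "forbidden_scope",
    "quota_exceeded",
}


class RunCache:
    """Hypothetical per-run cache for identity and capabilities."""

    def __init__(self,
                 fetch_identity: Callable[[], dict],
                 fetch_capabilities: Callable[[], dict]) -> None:
        # Both callables are assumed to wrap whatever actually calls the CLI.
        self._fetch_identity = fetch_identity
        self._fetch_capabilities = fetch_capabilities
        self._identity: Optional[dict] = None
        self._capabilities: Optional[dict] = None

    def identity(self) -> dict:
        """Return cached identity, fetching it only on first use."""
        if self._identity is None:
            self._identity = self._fetch_identity()
        return self._identity

    def capabilities(self) -> dict:
        """Return cached capabilities, fetching them only on first use."""
        if self._capabilities is None:
            self._capabilities = self._fetch_capabilities()
        return self._capabilities

    def on_error(self, code: str) -> None:
        """Drop cached identity only for codes that suggest it is stale."""
        if code in REFRESH_IDENTITY_ON:
            self._identity = None

    def has_command(self, name: str) -> bool:
        """Check for a command, re-fetching capabilities once if it is absent.

        The "commands" key is an assumed payload shape, not a documented one.
        """
        if name not in self.capabilities().get("commands", []):
            self._capabilities = None  # force one re-fetch
        return name in self.capabilities().get("commands", [])
```

Any other error code, such as rate_limited, leaves the cached identity in place, so a transient failure does not trigger an extra whoami round-trip.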

Prefer waits and gates to polling

A wait command blocks server-side and costs one round-trip instead of many. A gate checks the current state once and returns a typed result. Reach for these before writing any loop.

When you must poll

When no wait command fits, poll deliberately:
  • Back off exponentially — 1s, 2s, 4s, 8s — never a tight loop.
  • Stop on terminal codes: validation_failed, active_org_required, forbidden_scope, and quota_exceeded never succeed on retry.
  • Retry only rate_limited (exit 6) and upstream_unavailable (exit 1), backing off further on rate_limited until the server limit resets.
  • Bound every poll with a deadline so an agent can never spin forever.
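
The four rules above can be combined into one bounded loop. The sketch below is illustrative only: the poll function, the PollError class, and the (status, value) return shape of check, including the "ok" and "pending" statuses, are assumptions for the example. Doubling the delay an extra time on rate_limited is one simple way to "back off further"; it does not read the server's actual reset time.

```python
import time
from typing import Callable, Tuple

# Codes named in the text as the only ones worth retrying.
RETRYABLE = {"rate_limited", "upstream_unavailable"}


class PollError(Exception):
    """Raised when check() reports a code that will not succeed on retry."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def poll(check: Callable[[], Tuple[str, object]],
         deadline_s: float = 60.0,
         base_delay_s: float = 1.0,
         max_delay_s: float = 30.0,
         sleep: Callable[[float], None] = time.sleep,
         clock: Callable[[], float] = time.monotonic) -> object:
    """Poll check() with exponential backoff until done, failed, or out of time.

    check() is assumed to return (status, value), where status is "ok",
    "pending", or an error code such as "rate_limited".
    """
    stop_at = clock() + deadline_s
    delay = base_delay_s
    while True:
        status, value = check()
        if status == "ok":
            return value
        # Terminal codes (validation_failed, forbidden_scope, ...) and any
        # unknown code stop the loop immediately.
        if status != "pending" and status not in RETRYABLE:
            raise PollError(status)
        if status == "rate_limited":
            delay = min(delay * 2, max_delay_s)  # back off further
        # Never sleep past the deadline.
        if clock() + delay > stop_at:
            raise TimeoutError(f"poll exceeded {deadline_s}s deadline")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # 1s, 2s, 4s, 8s, ...
```

Passing sleep and clock as parameters keeps the loop testable without real waiting; in normal use the defaults apply and a call looks like poll(my_check, deadline_s=120).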

Continue

Consistency model

Which plane answers which question, and the blind spot between them.

Wait for events

The observe-stream commands that make polling unnecessary.

Output & errors

Every exit code, including rate_limited and conflict.