> ## Documentation Index
> Fetch the complete documentation index at: https://docs.repost.sh/llms.txt
> Use this file to discover all available pages before exploring further.

# Caching and performance

> Cache stable facts, never cache moving ones, and back off deliberately when you must poll.

Be fast without being wrong. Reuse the facts that rarely change, re-fetch the ones that move, and let the server do the waiting.

## Cache what is stable, not what moves

| Cache for the run                                                                           | Never cache                                                                                 |
| ------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------- |
| `capabilities`, keyed by its `version`                                                      | Search results as if they were live truth                                                   |
| Identity from `whoami` ([what to cache](/agents/authentication#cache-identity-for-the-run)) | DLQ depth, health counts, replay progress                                                   |
| Resolved `bucket_id` / `forwarder_id`                                                       | A [cursor](/agents/search-and-filters#page-with-cursors) across a different query or window |

Cache identity in the agent process and refresh it only on `unauthorized`, `active_org_required`, `forbidden_scope`, or `quota_exceeded`. Re-fetch `capabilities` only after `repost update` or when a needed command is absent.
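The refresh policy above can be sketched as a small in-process cache. This is a minimal sketch, not part of the Repost CLI: `RunCache`, the injected `fetch_identity` and `fetch_capabilities` callables, and the `commands` field on the capabilities payload are hypothetical names chosen for illustration. Only the error codes and the `version` key come from the guidance above.

```python
from typing import Any, Callable, Dict, Optional

# Codes that, per the guidance above, should trigger an identity refresh.
IDENTITY_REFRESH_CODES = frozenset(
    {"unauthorized", "active_org_required", "forbidden_scope", "quota_exceeded"}
)


class RunCache:
    """Per-run cache for identity and capabilities (hypothetical helper)."""

    def __init__(
        self,
        fetch_identity: Callable[[], Dict[str, Any]],
        fetch_capabilities: Callable[[], Dict[str, Any]],
    ) -> None:
        # Both fetchers are injected; how they invoke the CLI is up to the caller.
        self._fetch_identity = fetch_identity
        self._fetch_capabilities = fetch_capabilities
        self._identity: Optional[Dict[str, Any]] = None
        self._capabilities: Dict[str, Dict[str, Any]] = {}  # keyed by version
        self._current_version: Optional[str] = None

    def identity(self) -> Dict[str, Any]:
        """Return cached identity, fetching it at most once until invalidated."""
        if self._identity is None:
            self._identity = self._fetch_identity()
        return self._identity

    def on_error(self, code: str) -> None:
        """Drop cached identity only for codes that suggest it may be stale."""
        if code in IDENTITY_REFRESH_CODES:
            self._identity = None

    def capabilities(self) -> Dict[str, Any]:
        """Return cached capabilities for the current version."""
        if self._current_version is None:
            self.refresh_capabilities()
        return self._capabilities[self._current_version]

    def refresh_capabilities(self) -> None:
        """Re-fetch capabilities, e.g. after `repost update`."""
        caps = self._fetch_capabilities()
        version = str(caps["version"])  # assumes the payload carries "version"
        self._capabilities[version] = caps
        self._current_version = version

    def require_command(self, name: str) -> bool:
        """Check for a command, re-fetching once if it appears to be absent."""
        # "commands" is an assumed field name, not a documented one.
        if name in self.capabilities().get("commands", []):
            return True
        self.refresh_capabilities()
        return name in self.capabilities().get("commands", [])
```

Injecting the fetchers keeps the cache independent of how the agent shells out, and makes the refresh rules easy to exercise in isolation.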

## Prefer waits and gates to polling

A wait command blocks server-side and costs one round-trip instead of many. A gate checks the current state once and returns a typed result. Reach for these before writing any loop:

* `expect` / `tail` — wait for events ([Wait for events](/agents/wait-for-events)).
* `replay wait` — wait for a replay job ([Replay deliveries](/agents/replay-deliveries#wait-for-the-result)).
* `health --fail-on` — gate on the current health threshold ([Diagnose webhooks](/agents/diagnose-webhooks#start-with-health)).

## When you must poll

When no wait command fits, poll deliberately:

* Back off exponentially — `1s, 2s, 4s, 8s` — never a tight loop.
* Stop on terminal codes: `validation_failed`, `active_org_required`, `forbidden_scope`, and `quota_exceeded` never succeed on retry.
* Retry only `rate_limited` (exit `6`) and `upstream_unavailable` (exit `1`), backing off further on `rate_limited` until the server limit resets.
* Bound every poll with a deadline so an agent can never spin forever.
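The four rules above can be combined into one bounded loop. The following is a minimal sketch under stated assumptions: `Outcome`, `PollGaveUp`, and `poll_with_backoff` are hypothetical names, the caller supplies a `check` function that performs one poll and maps the CLI result to an `Outcome`, and doubling the delay on `rate_limited` is an illustrative stand-in for waiting until the server limit resets.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional

TERMINAL_CODES = frozenset(
    {"validation_failed", "active_org_required", "forbidden_scope", "quota_exceeded"}
)
RETRYABLE_CODES = frozenset({"rate_limited", "upstream_unavailable"})


@dataclass
class Outcome:
    """Result of one poll (hypothetical shape)."""
    done: bool = False           # True once the awaited state is reached
    code: Optional[str] = None   # error code, or None if the call succeeded
    value: Any = None


class PollGaveUp(Exception):
    """Raised when polling stops without reaching the awaited state."""


def poll_with_backoff(
    check: Callable[[], Outcome],
    deadline_s: float,
    base_delay_s: float = 1.0,
    max_delay_s: float = 8.0,
    rate_limit_factor: float = 2.0,  # assumed extra backoff for rate_limited
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    give_up_at = clock() + deadline_s
    delay = base_delay_s
    while True:
        outcome = check()
        if outcome.code is None and outcome.done:
            return outcome.value
        if outcome.code in TERMINAL_CODES:
            raise PollGaveUp(f"terminal code: {outcome.code}")
        if outcome.code is not None and outcome.code not in RETRYABLE_CODES:
            raise PollGaveUp(f"unexpected code: {outcome.code}")
        # Either still pending or a retryable error: wait, then try again.
        wait = delay * rate_limit_factor if outcome.code == "rate_limited" else delay
        if clock() + wait > give_up_at:
            raise PollGaveUp("deadline exceeded")
        sleep(wait)
        delay = min(delay * 2, max_delay_s)  # 1s, 2s, 4s, 8s, then capped
```

The `sleep` and `clock` parameters default to the real clock but can be replaced with fakes, so the backoff schedule can be checked without waiting.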

**Performance rules**

* Cache `capabilities` (by `version`), identity (`whoami`), and resolved IDs. Never cache cursors across queries, and never cache counts, progress, or depth.
* Prefer waits (`expect`/`tail`, `replay wait`) and current-state gates (`health --fail-on`) over polling loops.
* If you must poll, use exponential backoff (1s/2s/4s/8s) and a deadline. Retry only `rate_limited` (6) and `upstream_unavailable` (1).

## Continue

<Columns cols={3} className="gap-y-4">
  <Card title="Consistency model" icon="layers" href="/agents/consistency-model" cta="Two data planes" arrow="true">
    Which plane answers which question, and the blind spot between them.
  </Card>

  <Card title="Wait for events" icon="timer" href="/agents/wait-for-events" cta="Replace polling" arrow="true">
    The observe-stream commands that make polling unnecessary.
  </Card>

  <Card title="Output & errors" icon="braces" href="/agents/output-and-errors" cta="Code reference" arrow="true">
    Every exit code, including `rate_limited` and `conflict`.
  </Card>
</Columns>
