Confirmations and human gates guide

How ROST routes authority-changing work through human confirmation, and why agents never approve their own requests.


ROST separates work an agent may do from decisions only a human may make. Durable, authority-changing, or sensitive commands carry a confirmation level so the system can stop and ask a person.

Confirmation levels

none: no confirmation gate; the command never returns a pending confirmation. Most none commands are reads or reversible drafts that an agent can run directly, including over MCP (graph reads, charter.draft, status.record, sync.brief.compile, sync.run.start, task.complete). A few still require a human actor or an interactive channel: confirmation.approve is none, because approving a gate cannot itself be gated, yet it runs only as a human in the UI or CLI. So none means "no confirmation gate", not necessarily "agent-callable".
human_required: a human must approve. Structural, staffing, resolution, and go-live commands (seat.reparent, charter.approve, goal.reparent, friction.resolve, agent.go_live, mcp_token.create).
credential_flow: routes through the vault-backed credential path so a secret is captured as a vault reference, never stored or logged in the clear (credential.ingress, tenant.anthropic_key.save; agent.configure_tools is credential_flow too — it stages credential-ingress requests through the same path without ever taking raw secret material).
dangerous: the highest-risk human gate. Only two commands carry it — settings.update and agent.decommission.
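
As a rough mental model, the levels above behave like a lookup from command name to confirmation level. The sketch below is illustrative only: the table holds just the commands this guide names, and CONFIRMATION_LEVELS, HUMAN_ONLY_NONE, and agent_can_run_directly are hypothetical names, not part of ROST.

```python
# Illustrative only: a hand-written table of the commands named in this guide.
# The real registry lives in ROST; every name defined here is an assumption.
CONFIRMATION_LEVELS = {
    "charter.draft": "none",
    "status.record": "none",
    "sync.brief.compile": "none",
    "sync.run.start": "none",
    "task.complete": "none",
    "confirmation.approve": "none",
    "confirmation.reject": "none",
    "seat.reparent": "human_required",
    "charter.approve": "human_required",
    "goal.reparent": "human_required",
    "friction.resolve": "human_required",
    "agent.go_live": "human_required",
    "mcp_token.create": "human_required",
    "credential.ingress": "credential_flow",
    "tenant.anthropic_key.save": "credential_flow",
    "agent.configure_tools": "credential_flow",
    "settings.update": "dangerous",
    "agent.decommission": "dangerous",
}

# Commands that are "none" yet reserved for a human actor. confirmation.reject
# is listed on the strength of the rule that agents never approve or reject.
HUMAN_ONLY_NONE = {"confirmation.approve", "confirmation.reject"}


def agent_can_run_directly(command: str) -> bool:
    """True only for ungated commands that are not reserved for humans."""
    level = CONFIRMATION_LEVELS[command]
    return level == "none" and command not in HUMAN_ONLY_NONE


print(agent_can_run_directly("task.complete"))         # True
print(agent_can_run_directly("confirmation.approve"))  # False
print(agent_can_run_directly("settings.update"))       # False
```

Keeping the human-only set separate from the level table mirrors the distinction in the text: the level answers "is there a gate", while the set answers "who may call it".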

The dangerous confirmation level is rare, and it is not the same thing as the risk badge a pending confirmation can display. The badge shows "dangerous" whenever a command's confirmation level is dangerous or the command redacts secrets (its audit redaction is secret_strict). A credential_flow command such as credential.ingress therefore shows the dangerous badge even though it gates through the credential path, not the dangerous level. Read the badge as "handle with care" and the confirmation level as "who must approve". confirmation.approve and confirmation.reject are themselves none: approving a gate cannot itself require approval.
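
The badge rule is a simple "either condition" check, which a few lines of Python can make concrete. This is a minimal sketch of the rule as described here, not ROST's implementation; the function name and its parameters are assumptions.

```python
from typing import Optional


def shows_dangerous_badge(confirmation_level: str, audit_redaction: Optional[str]) -> bool:
    """Badge rule from this guide: level is dangerous OR secrets are redacted."""
    return confirmation_level == "dangerous" or audit_redaction == "secret_strict"


# credential.ingress: credential_flow level with secret_strict redaction.
print(shows_dangerous_badge("credential_flow", "secret_strict"))  # True
# A dangerous-level command shows the badge regardless of redaction.
print(shows_dangerous_badge("dangerous", None))  # True
# A human_required command with no secret redaction does not.
print(shows_dangerous_badge("human_required", None))  # False
```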

What a gated command returns over MCP

A gated command does not mutate anything when an agent calls it. Instead it returns a pending confirmation containing the command id, a risk level, and an approveVia block with a web URL and the exact rost command confirmation.approve --json ... line. The human approves through the web link or the CLI; the agent surfaces the link and stops.

# Human approves a pending confirmation
rost command confirmation.approve --json '{"confirmation_id":"<confirmation-id>"}'
# Or rejects it
rost command confirmation.reject --json '{"confirmation_id":"<confirmation-id>"}'
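
On the agent side, handling a pending confirmation amounts to reading the returned fields and passing them to a person. The sketch below assumes a JSON-like payload; only approveVia is a name taken from this guide, while the other keys (commandId, risk, web, cli), the example URL, and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingConfirmation:
    command_id: str
    risk: str
    web_url: str
    cli_line: str


def parse_pending(payload: dict) -> PendingConfirmation:
    """Pull out the fields an agent needs to hand off to a human.

    Key names other than "approveVia" are assumptions for this sketch.
    """
    approve_via = payload["approveVia"]
    return PendingConfirmation(
        command_id=payload["commandId"],
        risk=payload["risk"],
        web_url=approve_via["web"],
        cli_line=approve_via["cli"],
    )


def handoff_message(pending: PendingConfirmation) -> str:
    """Text the agent shows the human before it stops."""
    return (
        f"Command {pending.command_id} is waiting for human approval "
        f"(risk: {pending.risk}).\n"
        f"Approve in the web UI: {pending.web_url}\n"
        f"Or from the CLI: {pending.cli_line}"
    )


# Made-up payload for illustration; the URL and ids are placeholders.
example = {
    "commandId": "<command-id>",
    "risk": "dangerous",
    "approveVia": {
        "web": "https://rost.example.invalid/confirmations/<confirmation-id>",
        "cli": "rost command confirmation.approve --json '{\"confirmation_id\":\"<confirmation-id>\"}'",
    },
}
print(handoff_message(parse_pending(example)))
```

The helper only formats a message. It deliberately has no code path that runs the approve command, since that step belongs to the human.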

The rule for agents

An agent never approves or rejects its own request. decisions.decided_by is always a human. The steward decision commands (escalation.resolve, escalation.reject) are not even exposed over MCP. When an agent hits a gate, it prepares the evidence and the recommended action, returns the approve link, and waits for a human.
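
One way to picture this rule is as a guard that refuses any decision command from a non-human actor. The following is a hedged sketch, not how ROST enforces it: the Actor type, its "human" and "agent" labels, and the function and exception names are all assumptions.

```python
from dataclasses import dataclass

# Decision commands this guide reserves for humans.
HUMAN_ONLY_DECISIONS = {
    "confirmation.approve",
    "confirmation.reject",
    "escalation.resolve",
    "escalation.reject",
}


@dataclass(frozen=True)
class Actor:
    kind: str  # "human" or "agent" (hypothetical labels)
    id: str


class HumanRequiredError(PermissionError):
    """Raised when a non-human actor attempts a human-only decision."""


def check_decision_actor(command: str, actor: Actor) -> None:
    """Refuse human-only decision commands unless the actor is a human."""
    if command in HUMAN_ONLY_DECISIONS and actor.kind != "human":
        raise HumanRequiredError(
            f"{command} must be decided by a human, not {actor.kind} {actor.id}"
        )


check_decision_actor("confirmation.approve", Actor("human", "alice"))  # passes
try:
    check_decision_actor("confirmation.approve", Actor("agent", "agent-7"))
except HumanRequiredError as err:
    print(err)
```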

When to stop

Stop before: approving a Charter, signing a manifest, connecting a tool or credential, minting a token, going live, resolving an escalation or Friction, reparenting or dropping a goal, changing a member role, or updating budget caps. These are human gates by design.