Confirmations and human gates guide
How ROST routes authority-changing work through human confirmation, and why agents never approve their own requests.
ROST separates work an agent may do from decisions only a human may make. Durable, authority-changing, or sensitive commands carry a confirmation level so the system can stop and ask a person.
Confirmation levels
none: no confirmation gate; the command never returns a pending confirmation. Most are reads or reversible drafts an agent can run directly, including over MCP (graph reads, charter.draft, status.record, sync.brief.compile, sync.run.start, task.complete). A few none commands still require a human actor or an interactive channel: confirmation.approve is none (approving a gate cannot itself be gated) yet runs only as a human in the UI or CLI. So none means "no confirmation gate", not always "agent-callable".
human_required: a human must approve. Structural, staffing, resolution, and go-live commands (seat.reparent, charter.approve, goal.reparent, friction.resolve, agent.go_live, mcp_token.create).
credential_flow: routes through the vault-backed credential path so a secret is captured as a vault reference, never stored or logged in the clear (credential.ingress, tenant.anthropic_key.save). agent.configure_tools is credential_flow too: it stages credential-ingress requests through the same path without ever taking raw secret material.
dangerous: the highest-risk human gate. Only two commands carry it: settings.update and agent.decommission.
The dangerous confirmation level is rare and is not the same as the risk badge a pending confirmation can display. The badge shows "dangerous" whenever a command's confirmation level is dangerous or the command redacts secrets (its audit redaction is secret_strict), so a credential_flow command such as credential.ingress shows the dangerous badge while still gating through the credential path — not the dangerous level. Read the badge as "handle with care" and the confirmation level as "who must approve". confirmation.approve and confirmation.reject are themselves none — approving a gate cannot itself require approval.
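The badge rule above can be sketched in a few lines. This is only an illustration of the stated rule, not ROST's implementation: the CommandPolicy type, the shows_dangerous_badge function, and the "standard" redaction value are hypothetical names introduced here. Only the confirmation levels and the secret_strict redaction come from this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandPolicy:
    """Hypothetical container for the two properties the badge rule reads."""
    confirmation_level: str  # "none", "human_required", "credential_flow", or "dangerous"
    audit_redaction: str     # "secret_strict" is from the guide; "standard" is a placeholder


def shows_dangerous_badge(policy: CommandPolicy) -> bool:
    # The badge is "dangerous" if EITHER condition holds; the level alone does not decide it.
    return (
        policy.confirmation_level == "dangerous"
        or policy.audit_redaction == "secret_strict"
    )


# Illustrative entries. The redaction values for settings.update and
# seat.reparent are assumptions made for the example.
settings_update = CommandPolicy("dangerous", "standard")
credential_ingress = CommandPolicy("credential_flow", "secret_strict")
seat_reparent = CommandPolicy("human_required", "standard")

for name, policy in [
    ("settings.update", settings_update),
    ("credential.ingress", credential_ingress),
    ("seat.reparent", seat_reparent),
]:
    print(name, policy.confirmation_level, shows_dangerous_badge(policy))
```

Under these assumptions, credential.ingress gets the badge without carrying the dangerous level, which is the distinction the paragraph above draws.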
What a gated command returns over MCP
A gated command does not mutate when an agent calls it. It returns a pending confirmation with the command id, a risk level, and an approveVia block containing a web URL and the exact rost command confirmation.approve --json ... line. The human approves through the web link or the CLI; the agent surfaces the link and stops.
# Human approves a pending confirmation
rost command confirmation.approve --json '{"confirmation_id":"<confirmation-id>"}'
# Or rejects it
rost command confirmation.reject --json '{"confirmation_id":"<confirmation-id>"}'
The rule for agents
An agent never approves or rejects its own request. decisions.decided_by is always a human. The steward decision commands (escalation.resolve, escalation.reject) are not even exposed over MCP. When an agent hits a gate, it prepares the evidence and the recommended action, returns the approve link, and waits for a human.
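As a rough sketch of the agent side of this rule, the function below inspects a command result and, if it is a pending confirmation, returns a hand-off plan instead of acting. Only the approveVia block is named in this guide. The other key names (commandId, risk, web, cli), the example URL, and the returned plan shape are assumptions made for illustration and may not match the real payload.

```python
def handle_command_result(result: dict) -> dict:
    """Decide what an agent does next with a command result (hypothetical shape)."""
    approve_via = result.get("approveVia")
    if approve_via is None:
        # Not gated: the command ran, so the agent can keep working.
        return {"action": "continue", "result": result}

    # Gated: surface the approval paths to a human and stop.
    # The agent never calls confirmation.approve or confirmation.reject itself.
    return {
        "action": "wait_for_human",
        "command_id": result.get("commandId"),    # assumed key name
        "risk": result.get("risk"),               # assumed key name
        "approve_url": approve_via.get("web"),    # assumed key name
        "approve_cli": approve_via.get("cli"),    # assumed key name
    }


# Made-up pending confirmation; placeholders are left unfilled on purpose.
pending_example = {
    "commandId": "<command-id>",
    "risk": "dangerous",
    "approveVia": {
        "web": "https://rost.example/confirmations/<confirmation-id>",
        "cli": "rost command confirmation.approve --json ...",
    },
}

plan = handle_command_result(pending_example)
print(plan["action"], plan["approve_url"])
```

The point of the design is that the gated branch has no code path that approves anything: it only collects what a human needs and returns.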
When to stop
Stop before: approving a Charter, signing a manifest, connecting a tool or credential, minting a token, going live, resolving an escalation or Friction, reparenting or dropping a goal, changing a member role, or updating budget caps. These are human gates by design.