Human-in-the-Loop: Where to Stop the Agent Before It Breaks Things
In 2026 only 22% of people trust autonomous agents — and the EU AI Act explicitly requires human oversight for high-risk systems. An agent that knows when to pause and ask for approval at critical points is not a bug — it is a mandatory part of production architecture.
Level: Intermediate · Topic: AI Agents · Reading time: 20 min · Stack: LangGraph, state checkpointer, any agent framework
1. HITL vs HOTL: human before the action, or after?
There are two fundamentally different control schemes. HITL (Human-in-the-Loop) — the agent stops and waits for an explicit 'yes' from a human BEFORE acting. HOTL (Human-on-the-Loop) — the agent acts immediately, a human watches and can roll back.
Analogy: HITL is a cashier calling the manager before every refund over $100. HOTL is a security camera: nobody hits pause, but someone is watching and can intervene. The choice doesn't depend on how 'important' the task feels — it depends on one question: are the consequences reversible? Wire transfer, sent email, deleted file — not reversible. A query against a staging DB, a draft article, a log search — reversible in seconds.
HITL — pause BEFORE action
- When: action is irreversible
- Latency: seconds to hours
- Risk tolerance: none
- Examples: payments, sending emails, deletions
HOTL — oversight AFTER
- When: action is reversible
- Latency: zero (real-time)
- Risk tolerance: OK if visible
- Examples: search, drafts, chat replies
Golden rule: HITL when the action is irreversible (transfer, send email, delete). HOTL when it's reversible and speed matters. Don't put HITL on every step — you'll get not an agent but a slow assistant.
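The whole split can be compressed into one dispatch rule. A minimal sketch, assuming a per-action `reversible` flag you would define in your own tool registry; the `Action` class and `request_approval` callback are illustrative, not from any particular framework:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    name: str
    reversible: bool            # can it be undone in seconds?
    execute: Callable[[], str]

def run(action: Action, request_approval: Callable[[Action], bool]) -> str:
    """HITL for irreversible actions: pause BEFORE executing.
    HOTL for reversible ones: execute now, keep it visible for review."""
    if not action.reversible:
        if not request_approval(action):   # blocks until a human answers
            return "rejected"
        return action.execute()
    result = action.execute()              # act immediately...
    print(f"[monitor] {action.name} -> {result}")  # ...but leave a visible trail
    return result
```

The point of the sketch: the branch is chosen by reversibility alone, not by how "important" the task feels.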
2. Confidence-based routing: not all decisions carry equal risk
The main mistake is gating EVERY agent action behind approval. Users tire of the constant 'confirm this', and the agent loses its purpose. The fix is routing: the agent grades how confident it is and how risky the action is, and only a subset of requests gets escalated to a human.
But here's the trap: LLMs routinely overestimate their own confidence scores. Relying on that number alone is naive. The right move is to combine several signals: action class (read or write?), destructiveness (create or delete?), dollar amount, whether external parties are affected. It is that combination, not confidence alone, that decides whether to proceed or call in a human.
Routing flow:
- Request → agent decides → confidence score
- Score > 0.9 and low risk → execute automatically
- Score < 0.9 or high risk → route to human → execute after approval
Confidence is one signal among many. A query against a staging server — auto-approve (cheap to restore). Dropping a production DB — always human, even at confidence 0.99. Risk is defined by consequences, not by the model's self-assessment.
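A sketch of such a router. The thresholds and signal names below are purely illustrative assumptions, not a standard API; the shape to copy is that confidence only breaks ties after the hard rules have spoken:

```python
def needs_human(action_class: str, destructive: bool, amount_usd: float,
                affects_external: bool, confidence: float) -> bool:
    """Combine several signals; the model's self-reported confidence is
    just one of them, and the weakest one."""
    if action_class == "read" and not destructive:
        return False              # reads are cheap to redo: auto-approve
    if destructive or affects_external:
        return True               # irreversible or touches third parties: always human
    if amount_usd > 200:
        return True               # hypothetical dollar limit
    return confidence < 0.9       # confidence only decides the remaining gray zone
```

Note that a destructive action escalates even at confidence 0.99, exactly as the rule above demands.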
3. Pausing without losing state: a checkpoint is not a comment in chat
The most underrated part of HITL is preserving context during the pause. When the agent stops and waits for a human, anywhere from 5 seconds to 5 hours to a full weekend can pass between the approval request and the answer. And during all of that time the entire working context has to live somewhere — but not in process memory.
Picture this: the agent is on step five of seven, it already has intermediate search results, an open DB transaction, a partially filled form. If the server restarts while waiting — the context has to come back as if nothing happened. That means the checkpoint is written to durable storage (Postgres, Redis with persistence, a queue), not to an in-memory variable.
What a checkpoint must contain:
- Pause trigger: which action triggered approval
- Intermediate results: what the agent has already found / computed
- Tool calls in progress: open transactions, pending requests
- Conversation context: prompt + message history
Checkpoint sanity check — restart the process during the pause, before the human answers. If after the restart the context doesn't come back and the agent can't continue from the same point — the checkpoint is fake, you're holding state in memory.
4. Timeout: what if the human never shows up
Every HITL point is a potentially stuck agent. The human went on vacation, the Slack bot dropped the notification, the approver is sick. Without an explicit timeout policy you end up with a queue of 'hanging' tasks that piles up for years. Hence the rule: no HITL gate exists without an answer to 'what happens if there's no response in N minutes/hours?'.
There are exactly four options, and all of them beat the fifth — 'just silently proceed with the default'. First: abort with full rollback (safest, but you lose progress). Second: escalate to a more senior approver or another team. Third: default to a safer action — do a reduced version (refund instead of transfer, send a draft instead of a live email). Fourth: queue for async review — park it until morning.
The choice depends on context. A customer-facing operation (customer is waiting) — short timeout and a safe default. An internal report — can wait until the next business day. Critically: the 'do nothing' default is almost always safer than the 'do something' default. An agent stuck with an unresolved task is better than an agent that made a bad call and destroyed data.
approval_gate("refund $500"):
  if human_approved within 5 min: execute
  if not: escalate_to(senior_agent) + notify_user("processing will take a while")
  if the senior is also silent for 30 min: rollback + queue for morning review
approval_gate("send weekly report"):
  if not approved within 1 hour: cancel + log
  (risk is low: just skip this iteration)
A 'do nothing' default is safer than a 'do something' default. An agent stuck with an unresolved task beats an agent that made a bad call. A timeout without a policy is just a hidden form of auto-approval.
5. Audit trail: explaining why the agent called in a human
Every HITL point has to leave a trail readable without source code. Not debugging logs ('agent.invoke failed at line 42'), but an audit journal: what action did the agent propose, why did the gate fire, what did the human decide, when. If six months later a lawyer, an auditor or just a new hire can't reconstruct the decision — the audit trail is useless.
Regulatory context: the EU AI Act explicitly requires traceability for high-risk AI systems. Without the journal you can't prove a human made the call rather than the model. Same in any customer dispute: 'the agent approved it' vs 'Ivan from the risk team approved at 14:32 with a comment' are two very different positions in court.
What to log at every gate:
- Proposed action (full payload)
- Agent's reasoning: why it chose this action
- Confidence score and all signals that triggered the gate
- Human's decision + comment / reason
- User/session ID, UTC timestamp
- Alternative actions the agent considered
- Model and prompt version
The audit trail has to be readable by a non-technical auditor. "Agent called escalate_tool(reason='high_risk')" is bad. "Agent requested approval for a $500 transfer because the amount exceeded the $200 limit" is good. Write the journal as if a lawyer will read it a year from now.
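One way to make each gate emit such a record, sketched as a single JSON line per decision. The field names are illustrative, not a standard schema; the point is that every item from the checklist above gets its own explicit field:

```python
import json
from datetime import datetime, timezone

def audit_record(action: dict, reasoning: str, signals: dict,
                 decision: str, decided_by: str, comment: str,
                 session_id: str, model_version: str,
                 prompt_version: str) -> str:
    """One self-contained JSON line per gate, readable without source code."""
    return json.dumps({
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "proposed_action": action,       # full payload, not a summary
        "agent_reasoning": reasoning,    # why the agent chose this action
        "gate_signals": signals,         # confidence + everything that fired
        "human_decision": decision,
        "decided_by": decided_by,        # a named person, not "the agent"
        "comment": comment,
        "session_id": session_id,
        "model_version": model_version,
        "prompt_version": prompt_version,
    })
```

Six months later, this line alone answers the lawyer's question: what was proposed, why the gate fired, who decided, and when.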
Result
Five engineering decisions without which HITL becomes decoration: HITL vs HOTL split by reversibility, risk-based routing, durable checkpoints, timeout policies, and a human-readable audit trail. This is exactly the architecture the EU AI Act requires and any serious customer expects from a production agent.