
Loop Control Before Cost And Drift Snowball

The goal is not to build a workflow engine inside Stage0. The goal is to stop unsafe repetition before your runtime burns budget, quota, or downstream capacity.

Runaway agents usually fail by repeating themselves.

When retries, tool loops, or self-correction chains lose control, the result is rarely clever autonomy. It is usually wasted spend, stuck workflows, or accidental side effects.

Common failure modes

Same-tool repetition

The agent keeps calling the same tool even though the state never meaningfully changes, and nothing hard-stops the loop.

Retry storms

A failure path becomes a cost path when retries compound across layers: network-level retries, model retries, and downstream backoffs each multiply the others.
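To make the compounding concrete, here is a small worked calculation. The three layers and their retry counts are illustrative assumptions, not measured values, and `worst_case_attempts` is a hypothetical helper rather than anything Stage0 provides.

```python
from math import prod


def worst_case_attempts(retries_per_layer: list[int]) -> int:
    """Worst-case number of innermost attempts when layers retry independently.

    Each layer makes 1 initial attempt plus its retries, and the layers nest,
    so the per-layer totals multiply.
    """
    return prod(1 + retries for retries in retries_per_layer)


# Hypothetical stack: agent-level retries, model-client retries, HTTP-client retries.
layers = [3, 2, 3]
print(worst_case_attempts(layers))  # (1+3) * (1+2) * (1+3) = 48
```

A single logical operation can therefore reach dozens of downstream attempts even when every individual retry setting looks modest.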

Long-running drift

The run keeps going long after it should have been escalated to a human or terminated with a clear reason.

What to gate before the next iteration

Iteration limits

Bound how many times a run can try again before it must stop or ask for intervention.

max_iterations: 5 -> DEFER on iteration 6
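A minimal sketch of that rule as a local check, assuming the runtime keeps its own 1-based iteration counter. The function name and the "CONTINUE" label are placeholders chosen for illustration; only DEFER comes from the rule above.

```python
def check_iteration(current_iteration: int, max_iterations: int = 5) -> str:
    """Return "DEFER" once the run would start an iteration beyond the limit."""
    # Iteration 5 is still allowed; iteration 6 is the first one deferred.
    return "DEFER" if current_iteration > max_iterations else "CONTINUE"


print(check_iteration(5))  # CONTINUE
print(check_iteration(6))  # DEFER
```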

Time budgets

Stop a run that has exceeded the window where automated continuation is still acceptable.

timeout_seconds: 120 -> DEFER after 120 seconds
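One way a runtime might track this budget locally is with a monotonic clock, which is not affected by system clock adjustments. The `TimeBudget` class below is a hypothetical helper, and the injectable `clock` parameter exists only so the behavior can be exercised without waiting.

```python
import time


class TimeBudget:
    """Tracks elapsed time for one run using a monotonic clock."""

    def __init__(self, timeout_seconds: float = 120.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._started = clock()

    def elapsed(self) -> float:
        return self._clock() - self._started

    def decision(self) -> str:
        # DEFER only once the budget is exceeded, matching "after 120 seconds".
        return "DEFER" if self.elapsed() > self.timeout_seconds else "CONTINUE"
```

Calling `decision()` immediately before each tool call keeps the check aligned with the point where the run would otherwise continue.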

Repeated tool detection

Treat repeated calls to the same tool as a signal that the agent is stuck, not making progress.

same_tool_limit: 2 -> DEFER on the third identical call
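The sketch below shows one possible reading of this rule: count how many of the most recent calls were to the same tool as the pending call, and defer when the pending call would exceed the limit. It compares tool names only and looks only at consecutive calls; whether arguments should also match is an assumption left open here, and the function name is hypothetical.

```python
def check_repeated_tool(
    recent_tools: list[str], next_tool: str, same_tool_limit: int = 2
) -> str:
    """DEFER when next_tool would extend a run of identical calls past the limit."""
    streak = 0
    for tool in reversed(recent_tools):
        if tool != next_tool:
            break
        streak += 1
    # streak counts prior consecutive calls; the pending call is number streak + 1.
    return "DEFER" if streak + 1 > same_tool_limit else "CONTINUE"


history = ["payments.retry_payout", "payments.retry_payout"]
print(check_repeated_tool(history, "payments.retry_payout"))  # DEFER
```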

Cost accumulation

Track cumulative estimated cost so that a loop whose individual calls each look cheap cannot run up a large total unnoticed.

max_cost_usd: 2.00 -> DEFER after the threshold is crossed
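A hedged sketch of a running cost meter follows. It uses `Decimal` so that small per-call estimates sum exactly; the `CostMeter` class and the per-call amounts are hypothetical, and how cost is estimated per call is outside this example.

```python
from decimal import Decimal


class CostMeter:
    """Accumulates estimated spend for one run and flags when the cap is crossed."""

    def __init__(self, max_cost_usd: str = "2.00"):
        self.max_cost = Decimal(max_cost_usd)
        self.total = Decimal("0")

    def add(self, estimated_cost_usd: str) -> str:
        self.total += Decimal(estimated_cost_usd)
        # Reaching the cap exactly is still allowed; crossing it is not.
        return "DEFER" if self.total > self.max_cost else "CONTINUE"


meter = CostMeter()
print(meter.add("1.20"))  # CONTINUE
print(meter.add("0.80"))  # CONTINUE (total is exactly 2.00)
print(meter.add("0.34"))  # DEFER (total is 2.34)
```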

Representative deferred loop

{
  "goal": "Resolve a failed payout by retrying the payout tool",
  "tools": ["payments.retry_payout"],
  "constraints": [
    "max_iterations: 5",
    "timeout_seconds: 120",
    "max_cost_usd: 2.00",
    "same_tool_limit: 2"
  ],
  "context": {
    "run_id": "payout-retry-8841",
    "current_iteration": 6,
    "elapsed_seconds": 147,
    "recent_tools": [
      "payments.retry_payout",
      "payments.retry_payout"
    ],
    "cumulative_cost_usd": 2.34
  },
  "side_effects": ["money_movement", "retry"]
}

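This request should return DEFER because the run has exceeded three limits: it is on iteration 6 against a max_iterations of 5, it has run for 147 seconds against a 120-second timeout, and it has accumulated an estimated $2.34 against a max_cost_usd of 2.00. You can provide counters directly, or keep a stable run_id and let Stage0 evaluate loop state across checks.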
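To check the arithmetic, the same comparison can be reproduced locally. This is an illustrative re-implementation under the assumption that each constraint is a "name: number" string; it is not a description of how Stage0 evaluates requests, and `parse_constraints` and `evaluate` are hypothetical names.

```python
REQUEST = {
    "constraints": [
        "max_iterations: 5",
        "timeout_seconds: 120",
        "max_cost_usd: 2.00",
        "same_tool_limit: 2",
    ],
    "context": {
        "run_id": "payout-retry-8841",
        "current_iteration": 6,
        "elapsed_seconds": 147,
        "cumulative_cost_usd": 2.34,
    },
}


def parse_constraints(items: list[str]) -> dict[str, float]:
    """Turn "name: number" strings into a name -> number mapping."""
    limits = {}
    for item in items:
        key, _, value = item.partition(":")
        limits[key.strip()] = float(value)
    return limits


def evaluate(request: dict) -> tuple[str, list[str]]:
    """Return a decision plus the names of every limit the run has exceeded."""
    limits = parse_constraints(request["constraints"])
    ctx = request["context"]
    checks = [
        ("max_iterations", ctx["current_iteration"]),
        ("timeout_seconds", ctx["elapsed_seconds"]),
        ("max_cost_usd", ctx["cumulative_cost_usd"]),
    ]
    reasons = [name for name, observed in checks if observed > limits[name]]
    return ("DEFER" if reasons else "CONTINUE"), reasons


print(evaluate(REQUEST))
# ('DEFER', ['max_iterations', 'timeout_seconds', 'max_cost_usd'])
```

Returning the list of exceeded limits alongside the decision makes the reason for stopping visible to whoever reviews the run.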

Runtime contract notes

run_id matters

A stable run identifier lets the control layer and your runtime talk about the same loop across multiple checks.
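As a rough illustration of why the identifier must stay stable, the sketch below keeps loop state in a dictionary keyed by run_id, so that counters accumulate across checks instead of resetting. `LoopState` and `LoopTracker` are hypothetical names, and an in-memory store is assumed purely for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LoopState:
    iterations: int = 0
    tools: list[str] = field(default_factory=list)


class LoopTracker:
    """Keeps per-run counters so repeated checks see the same loop."""

    def __init__(self) -> None:
        self._runs: dict[str, LoopState] = defaultdict(LoopState)

    def record(self, run_id: str, tool: str) -> LoopState:
        state = self._runs[run_id]
        state.iterations += 1
        state.tools.append(tool)
        return state


tracker = LoopTracker()
tracker.record("payout-retry-8841", "payments.retry_payout")
state = tracker.record("payout-retry-8841", "payments.retry_payout")
print(state.iterations)  # 2
```

If the runtime generated a fresh identifier on every retry, each check would start from zero and no limit would ever trip.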

Decision is not execution

Stage0 can say stop, defer, or continue. Your agent runtime still needs to honor that decision before the next tool call.
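The sketch below shows one way a runtime loop might honor the decision: it asks for a decision before every attempt and makes no tool call unless the answer is to continue. The `check` callable stands in for whatever client the runtime uses to obtain a decision, and the "CONTINUE", "DONE", and "STOP" labels are placeholders assumed for this example.

```python
from typing import Callable


def run_loop(
    check: Callable[[int], str],
    call_tool: Callable[[], bool],
    hard_cap: int = 100,
) -> tuple[str, int]:
    """Call call_tool until it succeeds, asking check() before every attempt.

    Returns the final outcome and the number of tool calls actually made.
    """
    for iteration in range(1, hard_cap + 1):
        decision = check(iteration)
        if decision != "CONTINUE":
            return decision, iteration - 1  # stop before making the next call
        if call_tool():
            return "DONE", iteration
    return "STOP", hard_cap  # local safety net if check never says stop


calls = []


def failing_tool() -> bool:
    calls.append(1)
    return False


result = run_loop(lambda i: "DEFER" if i > 5 else "CONTINUE", failing_tool)
print(result)  # ('DEFER', 5)
```

The local `hard_cap` is a deliberate second line of defense: even if the decision source is unreachable or misconfigured, the loop still terminates.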

Human escalation should be explicit

When the run is stuck, the clean outcome is usually a DEFER to human review, not silent continuation.

Loop control is an operability story

Buyers care less about an abstract anti-loop claim and more about whether they can bound retries, trace the run, and prove why the next call was stopped.