Is your feature request related to a problem? Please describe.
As developers build autonomous loops (e.g., self-healing code agents), there is a significant risk of runaway costs due to infinite loops or hallucinations. Currently, limiting max_iterations is only a proxy for cost: it doesn't account for token density or for cost differences between models.
Describe the solution you'd like
A first-class CircuitBreaker policy or middleware that can be attached to an Agent or Workflow.
- Configurable Limits: Set hard caps on TotalTokens or EstimatedCost per execution/session.
- Stateful Tracking: Accumulate usage across multiple turns in a workflow.
- Fail-Safe: If the limit is breached, the agent should halt immediately and return a safe error state (e.g., CostLimitExceededException), preventing further LLM calls.
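A minimal sketch of the three points above, written in Python purely for illustration. Every name other than CircuitBreaker and CostLimitExceededException (the fields, check, record, run_agent) is hypothetical and not an existing API, and the per-call token and cost figures are assumed to come from whatever usage data the model client reports.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


class CostLimitExceededException(Exception):
    """Raised when accumulated usage breaches a configured cap."""


@dataclass
class CircuitBreaker:
    # Hypothetical policy object; a cap of None disables that limit.
    max_total_tokens: Optional[int] = None
    max_estimated_cost: Optional[float] = None
    # Running totals, accumulated across every turn of the session.
    total_tokens: int = 0
    estimated_cost: float = 0.0
    tripped: bool = False

    def check(self) -> None:
        """Call before each LLM request; refuses once the breaker has tripped."""
        if self.tripped:
            raise CostLimitExceededException("circuit breaker already tripped")

    def record(self, tokens: int, cost: float) -> None:
        """Call after each LLM response; trips and raises if a cap is exceeded."""
        self.total_tokens += tokens
        self.estimated_cost += cost
        over_tokens = (
            self.max_total_tokens is not None
            and self.total_tokens > self.max_total_tokens
        )
        over_cost = (
            self.max_estimated_cost is not None
            and self.estimated_cost > self.max_estimated_cost
        )
        if over_tokens or over_cost:
            self.tripped = True
            raise CostLimitExceededException(
                f"limit exceeded: {self.total_tokens} tokens, "
                f"{self.estimated_cost:.4f} estimated cost"
            )


def run_agent(breaker: CircuitBreaker, steps: Iterable[Tuple[int, float]]) -> int:
    """Simulated agent loop; each step stands in for one LLM call's (tokens, cost)."""
    completed = 0
    for tokens, cost in steps:
        breaker.check()                # refuse to call the model once tripped
        breaker.record(tokens, cost)   # a real agent would call the model first
        completed += 1
    return completed
```

In this sketch the breaker trips on the call that crosses the cap, so the overshoot is bounded by one call's usage. A stricter variant could estimate the next call's usage up front and refuse before making it.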
Describe alternatives you've considered
- Manually tracking usage in every step of the workflow (error-prone and verbose).
- Setting strict max_iterations (imprecise; a single iteration with 128k context is different from 10 iterations of 1k).
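To put rough numbers on that last point, here is the arithmetic in Python, counting tokens only. This is an illustrative assumption: actual cost also depends on per-model pricing, which an iteration count likewise ignores.

```python
# Token totals for the two runs compared above.
single_large = 1 * 128_000   # one iteration with a 128k-token context
many_small = 10 * 1_000      # ten iterations of 1k tokens each

# The single iteration consumes far more tokens than the ten small ones,
# yet an iteration cap treats it as the cheaper run.
ratio = single_large / many_small
print(f"{single_large} vs {many_small} tokens ({ratio:.1f}x)")
```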
Additional context
This feature is critical for "set and forget" autonomous scenarios in enterprise environments.