Enterprise AI systems that act through tools and sub-agents, known as agentic AI, are increasingly deployed in production, yet the controls meant to bound their financial and environmental cost still sit on dashboards that are evaluated alongside or after execution. According to a new arXiv paper by Besanson and Gaston, "Green SARC: Predictive Cost and Carbon Governance for Agentic AI Systems," this reactive oversight leaves organizations exposed to runaway costs and carbon emissions. The researchers propose Green SARC, which applies the SARC governance-by-architecture framework (a set of four enforcement sites in the agent loop) to FinOps and GreenOps, adding a predictive dimension to cost and carbon governance.
The paper reports four policy-independent results that quantify how unconstrained and constrained agentic systems behave. The first, called the "State Snowball," shows that the cost of an unconstrained agent grows quadratically, as Θ(n²) in loop depth n. On 3,000 real multi-step plans from the SWE-rebench dataset, quadratic growth holds on 100% of plans, with a median curvature ĉ₂ = 216 that exceeds the linear-accretion prediction p/2 = 134. In other words, real plans accumulate cost faster than the simple linear-accretion model predicts.
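To see where a quadratic coefficient of p/2 can come from, the sketch below assumes a simplified model that is not taken from the paper: each step re-sends a context that grows by a fixed `p` tokens, so step k costs `base + p * k`. The names `cumulative_cost`, `curvature`, and the value `base=1000.0` are hypothetical; `p=268.0` is chosen only so that p/2 matches the 134 quoted above.

```python
def cumulative_cost(n_steps: int, base: float, p: float) -> float:
    """Total cost after n_steps when step k re-sends base + p * k tokens."""
    return sum(base + p * k for k in range(1, n_steps + 1))


def curvature(costs: list[float]) -> float:
    """Estimate the quadratic coefficient as half the mean second difference."""
    second_diffs = [
        costs[i + 1] - 2 * costs[i] + costs[i - 1]
        for i in range(1, len(costs) - 1)
    ]
    return sum(second_diffs) / len(second_diffs) / 2


# Assumed illustrative values: base context of 1000 tokens, 268 tokens added per step.
costs = [cumulative_cost(n, base=1000.0, p=268.0) for n in range(1, 21)]
c2 = curvature(costs)
print(f"estimated curvature: {c2:.1f}")
```

Under this model the cumulative cost is (p/2)n² plus a linear term, so a measured curvature above p/2, as the paper reports for real plans, indicates that context grows faster than a fixed amount per step.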

| Result | What is measured | Reported value |
|---|---|---|
| State Snowball (real plans) | Share of plans with quadratic growth; median curvature | 100% of plans; ĉ₂ = 216 vs. linear-accretion prediction p/2 = 134 |
| Normal-σ gate (real residuals) | Coverage at nominal 95% | 92% (under-coverage); 95.2% with split-conformal calibration |
| Soft Lagrangian penalty | Share of seeds that breach the budget | 91.5% |
| Architectural gate | Over-budget incidence | 0% on synthetic and real (BurstGPT) arrivals |

The second result concerns prediction gates. On real residuals, the Normal-σ gate under-covers, achieving only 92% coverage at a nominal 95% confidence level. However, split-conformal calibration brings coverage to 95.2%, closely matching the target.
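The calibration step can be sketched with the textbook split-conformal recipe: take held-out residuals, sort their absolute values, and use the ceil((n+1)(1-α))-th smallest as the margin around a prediction. This is a minimal illustration under that assumption, not the paper's implementation; the function names and the toy residuals are hypothetical.

```python
import math


def conformal_quantile(calibration_residuals, alpha=0.05):
    """Split-conformal margin: the ceil((n+1)(1-alpha))-th smallest |residual|."""
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    # The small epsilon guards against floating-point round-up before the ceiling.
    k = math.ceil((n + 1) * (1 - alpha) - 1e-9)
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return scores[k - 1]


def cost_interval(predicted_cost, margin):
    """Symmetric interval around a cost prediction."""
    return predicted_cost - margin, predicted_cost + margin


# Hypothetical held-out residuals (actual minus predicted cost): -1, 2, -3, ..., -19.
residuals = [(-1) ** i * i for i in range(1, 20)]
margin = conformal_quantile(residuals, alpha=0.05)
low, high = cost_interval(500.0, margin)
print(margin, low, high)
```

Unlike a Normal-σ band, this margin makes no distributional assumption about the residuals, which is consistent with the coverage gap the paper reports on real data.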
The third result compares enforcement mechanisms directly. A soft Lagrangian penalty tuned to meet the budget in expectation still breaches that budget on 91.5% of seeds, so it does not reliably keep spending within limits. In contrast, the architectural gate, a hard enforcement site in the agent loop, records a 0% breach rate.
The fourth result extends this to binding budgets: under tight constraints, the gate's over-budget incidence remains 0% on both synthetic and real workload arrivals (the real data coming from BurstGPT traces).
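The difference between a penalty and a gate can be illustrated with a minimal sketch. It assumes each step has a known worst-case cost (for example, a token cap) and that actual cost never exceeds it; the `BudgetGate` class, `run_gated` helper, and all numbers are hypothetical and are not the Green SARC library's API.

```python
import random


class BudgetGate:
    """Hypothetical hard gate: a step runs only if its worst case fits the budget."""

    def __init__(self, budget: float) -> None:
        self.budget = budget
        self.spent = 0.0

    def admit(self, worst_case_cost: float) -> bool:
        # Checked before execution, so a rejected step spends nothing.
        return self.spent + worst_case_cost <= self.budget

    def record(self, actual_cost: float) -> None:
        self.spent += actual_cost


def run_gated(step_caps, budget, rng):
    """Run steps in order until the gate rejects one; return total spend."""
    gate = BudgetGate(budget)
    for cap in step_caps:
        if not gate.admit(cap):
            break
        # Assumption: actual cost lies between half the cap and the cap.
        gate.record(rng.uniform(0.5 * cap, cap))
    return gate.spent


# Twenty steps capped at 100 units each against a budget of 1000 units.
totals = [run_gated([100.0] * 20, 1000.0, random.Random(seed)) for seed in range(100)]
print(max(totals))
```

Because the check happens before the step executes, spend cannot exceed the budget under the stated assumption, whereas a penalty term only shifts expected spend and leaves individual runs free to overshoot.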
Crucially, the paper notes that end-to-end token, USD, and carbon savings of 47–55% are real but policy-dependent in magnitude. These savings are set by a scope-cap knob, not by gate rejections — meaning the policy designer chooses how aggressive the cap is, and the gate simply enforces it.
The library implementing Green SARC is open-source, dependency-free, and ships a regeneration script for every cited number, enabling full reproducibility.
For CTOs and technology procurement leaders evaluating agentic AI platforms, the paper's results point to a clear implication: predictive governance architectures like Green SARC can control financial and environmental cost together, but the budget guarantee comes from architectural enforcement, not from penalty-based tuning. Because cost grows quadratically, doubling an agent's loop depth roughly quadruples the dominant cost term, so even modest increases in depth can consume disproportionate resources without proactive gating. Organizations deploying agents in production should check whether their current cost controls are reactive dashboards or architectural gates integrated into the agent loop.