New Survey Maps How Evidence Tracing and Execution Provenance Can Make LLM Agents Trustworthy

A new survey posted on arXiv examines evidence tracing and execution provenance as key mechanisms for making LLM-based agents trustworthy. The paper defines a unified framework connecting retrieval grounding, tool-use safety, memory lineage, and failure diagnosis, and reviews benchmarks and open challenges.

iGEN Editorial

June 16, 2026

Large language model (LLM)-based agents are evolving from passive text generators into autonomous systems capable of planning, tool use, retrieval, memory access, environmental interaction, and multi-agent collaboration. According to a comprehensive survey published on arXiv, these expanded capabilities make agent behavior harder to verify, debug, and audit. Final-answer accuracy alone cannot explain how an output was produced, which evidence supported each claim, whether tool calls were justified, how memory influenced later decisions, or where failures originated. The survey, titled "From Agent Traces to Trust: A Survey of Evidence Tracing and Execution Provenance in LLM Agents," examines evidence tracing and execution provenance as foundations for process-level accountability in trustworthy LLM agents.

Defining Execution Provenance and Evidence Tracing

The survey defines execution provenance as the typed graph of an agent execution and evidence tracing as its projection onto evidence-support relations. This perspective, according to the authors, connects retrieval grounding, claim support, tool-use safety, memory lineage, observability, debugging, audit, and recovery within a unified framework.
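To make the definition concrete, here is a minimal Python sketch of one way to read it: execution provenance as a graph whose nodes and edges carry types, and evidence tracing as the subgraph kept when only evidence-support edges are retained. The class names, node kinds, and relation labels ("triggered", "supports") are illustrative assumptions, not the survey's own vocabulary.

```python
from dataclasses import dataclass, field

# Assumption: a single relation label marks evidence support.
# The survey's actual relation types are not given in this article.
EVIDENCE_RELATIONS = {"supports"}


@dataclass(frozen=True)
class Node:
    id: str
    kind: str  # hypothetical kinds, e.g. "retrieval", "tool_call", "claim"


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    relation: str  # hypothetical labels, e.g. "triggered", "supports"


@dataclass
class ProvenanceGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, kind):
        self.nodes[node_id] = Node(node_id, kind)

    def add_edge(self, src, dst, relation):
        # Reject edges that point at steps the graph has never recorded.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError(f"unknown endpoint in edge {src} -> {dst}")
        self.edges.append(Edge(src, dst, relation))

    def evidence_trace(self):
        """Project the graph onto evidence-support edges only."""
        projected = ProvenanceGraph()
        projected.edges = [e for e in self.edges if e.relation in EVIDENCE_RELATIONS]
        for e in projected.edges:
            projected.nodes[e.src] = self.nodes[e.src]
            projected.nodes[e.dst] = self.nodes[e.dst]
        return projected


def build_example():
    """A toy execution: one query triggers a retrieval and a tool call,
    and both results support a single claim in the answer."""
    g = ProvenanceGraph()
    g.add_node("q1", "user_query")
    g.add_node("r1", "retrieval")
    g.add_node("t1", "tool_call")
    g.add_node("c1", "claim")
    g.add_edge("q1", "r1", "triggered")
    g.add_edge("q1", "t1", "triggered")
    g.add_edge("r1", "c1", "supports")
    g.add_edge("t1", "c1", "supports")
    return g
```

Calling `build_example().evidence_trace()` drops the query node and the "triggered" edges, leaving only the retrieval and tool-call results linked to the claim they support. The full graph still answers process questions (what ran, and why), while the projection answers the narrower question of which evidence backs each claim.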

A Unified Taxonomy for Trustworthy Agents

The survey introduces a taxonomy covering:

Trace sources
Evidence and execution units
Provenance relations
Tracing granularity and timing
Representation forms
Trust functions

This taxonomy provides a structured way to categorize and compare different approaches to agent transparency and accountability.
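As a rough illustration of how such a taxonomy could be used to compare approaches, the Python sketch below labels each approach along the listed dimensions and reports where two approaches differ. The field names mirror the list above (with granularity and timing split into two fields); every value in the example is a made-up placeholder, since the article does not give the survey's actual categories.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TaxonomyLabel:
    # One field per taxonomy dimension; "tracing granularity and timing"
    # is split into two fields here for convenience (an assumption).
    trace_source: str
    unit: str
    relation: str
    granularity: str
    timing: str
    representation: str
    trust_function: str


def differing_dimensions(a, b):
    """Return the names of the dimensions on which two approaches differ."""
    da, db = asdict(a), asdict(b)
    return [name for name in da if da[name] != db[name]]


# Hypothetical approaches with placeholder values, for illustration only.
citation_attribution = TaxonomyLabel(
    trace_source="retrieval log",
    unit="claim",
    relation="supports",
    granularity="sentence-level",
    timing="post hoc",
    representation="citation list",
    trust_function="claim verification",
)
tool_audit_log = TaxonomyLabel(
    trace_source="tool-call log",
    unit="tool call",
    relation="caused",
    granularity="step-level",
    timing="post hoc",
    representation="event log",
    trust_function="audit",
)
```

Here `differing_dimensions(citation_attribution, tool_audit_log)` lists every dimension except `timing`, which the two placeholder approaches share.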

Methodological Directions in Provenance Research

The authors review key methodological directions, including:

Provenance representation – how to encode the execution graph
Evidence attribution – linking claims back to specific evidence
Tool-use provenance – tracking which tool calls were made and why
Runtime guardrails – preventing unsafe actions
Provenance-bearing memory – memory that retains its own source context
Observability – enabling real-time monitoring of agent internals
Failure diagnosis – identifying where and why errors occurred

These directions, the survey states, are critical for building provenance-aware, auditable, and recoverable agent systems.
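One of these directions, failure diagnosis, lends itself to a short sketch. Assuming each step of an agent run records which earlier steps it depended on, a failed output can be traced backward to the earliest erroring steps. The Python below is a minimal illustration under that assumption; the step names, the `parents` mapping, and the `"ok"`/`"error"` status labels are hypothetical and not taken from the survey.

```python
from collections import deque


def upstream_steps(parents, failed_step):
    """Breadth-first walk over dependency links, returning every step
    the failed step depends on, nearest dependencies first."""
    seen = {failed_step}
    order = []
    queue = deque([failed_step])
    while queue:
        step = queue.popleft()
        for parent in parents.get(step, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


def root_cause_candidates(parents, status, failed_step):
    """Upstream steps that errored and have no erroring step above them."""
    erroring = [s for s in upstream_steps(parents, failed_step)
                if status.get(s) == "error"]
    return [
        s for s in erroring
        if not any(status.get(a) == "error" for a in upstream_steps(parents, s))
    ]


# Hypothetical run: each step maps to the steps it consumed output from.
parents = {
    "answer": ["summarize"],
    "summarize": ["tool:search", "memory:read"],
    "tool:search": ["plan"],
    "memory:read": ["plan"],
}
status = {
    "plan": "ok",
    "tool:search": "error",
    "memory:read": "ok",
    "summarize": "error",
    "answer": "error",
}
```

In this toy run, `root_cause_candidates(parents, status, "answer")` returns `["tool:search"]`: the summarization step also failed, but only because it consumed the failed search result. Without recorded dependency links, the same diagnosis would require guessing from the final answer alone.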

Open Challenges and Future Work

The survey also discusses benchmarks, datasets, metrics, and open challenges. For enterprise technology leaders evaluating LLM agents for critical applications, these findings underscore the need for systems that can provide not just answers but auditable traces of how those answers were derived. Without such capabilities, autonomous agents risk being deployed in high-stakes environments without the transparency required for trust and compliance.
