
ACC Method Compiles Agent Trajectories to Enhance Long-Context Reasoning in LLMs

Researchers propose Agent Context Compilation (ACC), which converts agent trajectories from search, software engineering, and database tasks into long-context question-answer pairs. Training Qwen3-30B-A3B with ACC achieves 68.3 on MRCR and 77.5 on GraphWalks, matching a model 8x larger, while preserving general capabilities.

iGEN Editorial
June 16, 2026

Large language models (LLMs) used in agentic workflows often struggle to reason over long contexts, especially when evidence is scattered across many turns of tool use. Standard supervised fine-tuning (SFT) masks tool responses and only trains turn-level tool selection, creating a supervision blind spot for signals that span distant segments. According to a paper on arXiv, researchers have developed a new method called Agent Context Compilation (ACC) to address this gap.
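The blind spot described above can be made concrete with a minimal sketch of turn-level loss masking. The `Turn` class, the role names, and the whitespace split used in place of a real tokenizer are illustrative assumptions, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    role: str  # "user", "assistant", or "tool" (hypothetical role names)
    text: str


def loss_mask(turns: List[Turn]) -> Tuple[List[str], List[int]]:
    """Return (tokens, mask); mask[i] is 1 only for assistant tokens."""
    tokens: List[str] = []
    mask: List[int] = []
    for turn in turns:
        # Assumption: a whitespace split stands in for a real tokenizer.
        pieces = turn.text.split()
        tokens.extend(pieces)
        # Tool responses and user text receive no gradient signal.
        mask.extend([1 if turn.role == "assistant" else 0] * len(pieces))
    return tokens, mask


trajectory = [
    Turn("user", "Who maintains the parser module?"),
    Turn("assistant", "call search(query='parser maintainer')"),
    Turn("tool", "MAINTAINERS: parser is owned by the compiler team"),
    Turn("assistant", "call read_file(path='MAINTAINERS')"),
]
tokens, mask = loss_mask(trajectory)
print(f"{sum(mask)} of {len(tokens)} tokens are supervised")
```

In this toy trajectory only the tool-call tokens contribute to the loss, so nothing directly teaches the model to connect the question to the evidence in the tool response.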

ACC converts agent trajectories from search, software engineering, and database querying tasks into long-context question-answer pairs. The method combines the original question with the tool responses and environment observations gathered across multiple turns, then trains the model to answer directly without tool use. This makes the dependencies between the question and the evidence explicit, so long-context reasoning over distant segments can be supervised directly without additional annotation. The paper describes ACC as a simple approach that can be combined with any existing long-context extension or training method and that provides scalable supervised fine-tuning data.
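As a rough illustration of the compilation step, the sketch below flattens one trajectory into a single prompt and target. Everything specific here is an assumption made for illustration: the `Step` and `Trajectory` classes, the observation labels, placing the question after the evidence, and using the trajectory's final answer as the training target. The paper's actual data format may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Step:
    tool_call: str    # what the agent requested (dropped during compilation)
    observation: str  # tool response or environment observation


@dataclass
class Trajectory:
    question: str
    steps: List[Step]
    final_answer: str


def compile_trajectory(traj: Trajectory) -> Dict[str, str]:
    """Flatten a multi-turn trajectory into one long-context QA pair."""
    evidence = "\n\n".join(
        f"[Observation {i}]\n{step.observation}"
        for i, step in enumerate(traj.steps, start=1)
    )
    prompt = (
        f"{evidence}\n\n"
        f"Question: {traj.question}\n"
        "Answer directly, without calling tools."
    )
    # Assumption: the trajectory's final answer serves as the training target.
    return {"prompt": prompt, "target": traj.final_answer}


example = Trajectory(
    question="Which team owns the parser module?",
    steps=[
        Step("search('parser owner')", "MAINTAINERS lists parser under compiler-team."),
        Step("read_file('docs/teams.md')", "compiler-team is led from the Berlin office."),
    ],
    final_answer="compiler-team",
)
pair = compile_trajectory(example)
print(pair["prompt"])
```

Because the tool calls are dropped and every observation sits in one context, the loss on the target depends on evidence that was originally spread across separate turns.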

The researchers validated ACC on two challenging long-context benchmarks: MRCR (multi-round co-reference resolution) and GraphWalks (graph traversal over extended contexts). They used ACC to train Qwen3-30B-A3B, a mixture-of-experts model with 30 billion total parameters, of which 3 billion are active. The results are shown in the table below:

| Benchmark | ACC-trained Qwen3-30B-A3B | Baseline (same model) | Larger model Qwen3-235B-A22B |
|---|---|---|---|
| MRCR | 68.3 (+18.1) | 50.2 | 72.1 |
| GraphWalks | 77.5 (+7.6) | 69.9 | 79.8 |

The ACC-trained model scored 68.3 on MRCR (up 18.1 points from the 50.2 baseline) and 77.5 on GraphWalks (up 7.6 points from 69.9). These results come within 3.8 and 2.3 points, respectively, of Qwen3-235B-A22B, a model with 235 billion total parameters and 22 billion active parameters, which makes it roughly 8 times larger in total parameters. At the same time, the ACC-trained model preserved its general capabilities on benchmarks including GPQA, MMLU-Pro, AIME, and IFEval, according to the paper.
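The arithmetic behind these comparisons can be checked in a few lines. The scores and parameter counts come from the reported results; the dictionary layout and variable names are only illustrative.

```python
# Scores as reported in the results table.
scores = {
    "MRCR": {"acc": 68.3, "baseline": 50.2, "large": 72.1},
    "GraphWalks": {"acc": 77.5, "baseline": 69.9, "large": 79.8},
}

gains = {}
gaps = {}
for name, s in scores.items():
    # Round to one decimal to avoid floating-point noise in the differences.
    gains[name] = round(s["acc"] - s["baseline"], 1)  # gain over the baseline
    gaps[name] = round(s["large"] - s["acc"], 1)      # remaining gap to the 235B model
    print(f"{name}: +{gains[name]} over baseline, {gaps[name]} behind Qwen3-235B-A22B")

# Ratio of total parameter counts, in billions.
param_ratio = round(235 / 30, 1)
print(f"Total-parameter ratio: {param_ratio}x")
```

The ratio works out to about 7.8, which is the basis for the "roughly 8 times larger" description.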

Further mechanism analysis revealed that the ACC-trained model exhibits task-adaptive attention restructuring and expert specialization. This suggests that training on compiled trajectories encourages the model to allocate attention more effectively across long-range dependencies, a key requirement for enterprise applications that involve processing extended documents, multi-step reasoning, or historical transaction logs.

For enterprise technology leaders evaluating AI for complex workflows, ACC offers a way to improve long-context reasoning from existing agent trajectories, without expensive manual curation. The method's compatibility with existing training pipelines means it could be integrated into custom LLM deployments for tasks such as contract analysis, supply chain event resolution, or multi-document intelligence, although the paper itself does not test those domains. The research, authored by Su, Qisheng, Fang, Zhen, Huang, Shiting, Zeng, Yu, Zhao, Yiming, Kou, Zhang, Ziao, Chen, Lin, Zehui, Wu, Lijun, and Feng, is available on arXiv under the title "ACC: Compiling Agent Trajectories for Long-Context Training."

