
Open-SWE-Traces: 207K Multilingual Trajectories Set New Standard for Autonomous Software Engineering Agents

Researchers have released Open-SWE-Traces, a dataset of 207,489 software engineering agent trajectories spanning nine programming languages, sourced from 20,000 real-world pull requests. Fine-tuning on this data yields models that achieve state-of-the-art resolve rates on multiple SWE-bench benchmarks, advancing autonomous software engineering.

iGEN Editorial
June 16, 2026

Autonomous software engineering agents have long been limited by a lack of diverse, large-scale trajectory data. A new dataset, Open-SWE-Traces, aims to close that gap by providing 207,489 agentic trajectories across nine programming languages, according to a paper by Ahmad, Ludwig, Majumdar, and Ginsburg posted on arXiv.

Dataset Composition and Synthesis

The trajectories were sourced from 20,000 real-world pull requests (PRs) and generated using the OpenHands and SWE-agent harnesses. They cover nine languages: Python, Go, TypeScript, JavaScript, Rust, Java, PHP, C, and C++. The authors used a hybrid-reasoning synthesis approach: MiniMax-M2.5 produced trajectories with explicit "thinking" processes, while Qwen3.5-122B generated high-quality "non-thinking" traces. The underlying data comes from SWE-rebench-V2 and was filtered to include only material under permissive licenses (MIT, Apache, BSD).
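The license filter and language coverage described above can be pictured with a minimal Python sketch. The record layout (`language` and `license` fields), the license identifier strings, and the sample rows are all hypothetical; the actual schema used by the dataset may differ.

```python
from collections import Counter

# Permissive license families named in the article. The exact identifier
# strings used in the real dataset are an assumption.
PERMISSIVE_PREFIXES = ("mit", "apache", "bsd")


def is_permissive(license_id: str) -> bool:
    """Return True when a license identifier belongs to a permissive family."""
    return license_id.strip().lower().startswith(PERMISSIVE_PREFIXES)


def filter_and_count(records):
    """Keep permissively licensed records and count them per language."""
    kept = [r for r in records if is_permissive(r["license"])]
    return kept, Counter(r["language"] for r in kept)


# Hypothetical sample rows, not taken from the dataset.
sample = [
    {"language": "Python", "license": "MIT"},
    {"language": "Rust", "license": "Apache-2.0"},
    {"language": "Go", "license": "GPL-3.0"},
    {"language": "Python", "license": "BSD-3-Clause"},
]

kept, per_language = filter_and_count(sample)
print(len(kept), dict(per_language))
```

In this toy input the GPL-licensed Go record is dropped, leaving three records across two languages.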

Validation and Performance

The dataset's effectiveness was validated by fine-tuning the Qwen3-30B-A3B series (Thinking, Instruct, and Coder). The best-performing model achieved the following resolve rates on standard benchmarks:

Benchmark                 Resolve Rate
SWE-bench Verified        61.7%
SWE-bench Multilingual    57.1%
SWE-bench Pro             36.8%

These results establish Open-SWE-Traces as a premier resource for distilling human-level software engineering capabilities into efficient, open-source agentic LLMs, according to the paper.
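For readers unfamiliar with the metric, a resolve rate is the share of benchmark task instances that an agent's patch resolves. The sketch below shows that arithmetic only; the `resolve_rate` helper and the outcome list are hypothetical and are not the paper's evaluation code or results.

```python
def resolve_rate(outcomes):
    """Return the percentage of task instances marked as resolved."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("no outcomes to score")
    resolved = sum(bool(o) for o in outcomes)
    return 100.0 * resolved / len(outcomes)


# Hypothetical per-instance outcomes (True = resolved), for illustration only.
outcomes = [True, True, False, True, False, True, True, False]
print(f"{resolve_rate(outcomes):.1f}%")  # prints "62.5%"
```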

Implications for Enterprise Software Engineering

For CTOs and technology procurement leaders, the availability of open-source, multilingual agentic trajectory data could reduce the cost and complexity of building custom software engineering agents. Fine-tuning on Open-SWE-Traces is intended to help models perform long-horizon reasoning across codebases in languages from Python to C++, potentially accelerating development pipelines. Because the tasks are derived from real PRs, they reflect genuine engineering problems rather than synthetic scenarios, even though the trajectories themselves are model-generated.

The use of both thinking and non-thinking trajectories gives developers flexibility: thinking traces can be used for interpretability, while non-thinking traces optimize inference speed. As autonomous agents move toward production use, such resources lower the barrier to entry for enterprises seeking to automate parts of their software lifecycle.
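One way a team might act on the two trace types is to assemble a fine-tuning mix with a chosen share of thinking traces. The sketch below is illustrative only: the `mode` field, its values, and the `build_mix` helper are assumptions, not part of the released dataset or the paper's recipe.

```python
def build_mix(trajectories, thinking_fraction, size):
    """Select `size` trajectories with the requested share of thinking traces."""
    if not 0.0 <= thinking_fraction <= 1.0:
        raise ValueError("thinking_fraction must be between 0 and 1")
    thinking = [t for t in trajectories if t["mode"] == "thinking"]
    non_thinking = [t for t in trajectories if t["mode"] == "non-thinking"]
    n_thinking = round(size * thinking_fraction)
    n_non_thinking = size - n_thinking
    if n_thinking > len(thinking) or n_non_thinking > len(non_thinking):
        raise ValueError("not enough trajectories for the requested mix")
    # Take the first matching traces so the result is deterministic.
    return thinking[:n_thinking] + non_thinking[:n_non_thinking]


# Hypothetical trajectory records, for illustration only.
pool = [{"id": i, "mode": "thinking"} for i in range(4)]
pool += [{"id": i, "mode": "non-thinking"} for i in range(4, 8)]

mix = build_mix(pool, thinking_fraction=0.25, size=4)
print([t["mode"] for t in mix])
```

With a fraction of 0.25 and a size of 4, the mix holds one thinking trace followed by three non-thinking traces.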

Availability and Licensing

Because Open-SWE-Traces includes only data under permissive licenses (MIT, Apache, BSD), it is positioned as suitable for commercial use. The full dataset and fine-tuning recipes are available via the paper's accompanying code repository, enabling organizations to reproduce and extend the results.

