Topic
pipeline
OmniTraffic Pipeline Enables Controlled Training of Spatio-Temporal Traffic AI for Logistics
Researchers introduce OmniTraffic, a controllable generation pipeline and benchmark for spatio-temporal traffic reasoning. Built on 12 real-world intersections and surveillance footage from two countries, it generates 8M visual question answering (VQA) samples and a 3K human-verified test set. An evaluation of 11 frontier multimodal large language models (MLLMs) shows a large gap between human and model performance, especially in topology-grounded reasoning. Fine-tuning on OmniTraffic data improves real-world performance, which makes the pipeline a candidate training resource for logistics and supply chain AI.
Mask-Proof: New LLM Pipeline Automates Data Curation for Mathematical Proofs, with 96.8% Evaluator-Expert Agreement
Researchers introduce Mask-Proof, an LLM-based pipeline that turns real mathematical proofs into automatically checkable masked-step tasks. The resulting benchmark, Mask-ProofBench, contains 292 problems. Across the seventeen models tested, reasoning-enhanced models outperform standard ones by 12-27%, and the evaluator reaches 96.8% agreement with expert annotators.