Artificial Intelligence #llm #reasoning
Semi-Supervised Framework Scales LLM Reasoning Using 10-15x Fewer Labels Than Traditional Methods
A new semi-supervised framework for training LLM reasoning uses a lightweight verifier to judge reasoning quality and requires only a few labeled samples. In experiments on math problems and visual question answering, it reaches accuracy comparable to training with 10-15x more labeled data. The method could reduce the cost of building large-scale reasoning datasets.
Jun 16, 2026 2 sources