Topic
theorem proving
Artificial Intelligence #llms #mathematical reasoning
LLMs Struggle with Multi-Step Logic: New Framework DREAM Boosts Theorem Proving Performance
Large language models (LLMs) have shown promise in mathematical reasoning but struggle with multi-step first-order logic (FOL) tasks. A new paper introduces DREAM, a self-adaptive approach that enhances the diversity and reasoning of generation strategies, improving performance by up to 6.4% on a dataset of 447 theorems.
Jun 16, 2026 1 source
Software #lean 4 #proof autoformalization
Study Reveals Serious Robustness Flaws in Proof Autoformalization for Lean 4
A new arXiv preprint presents the first systematic study of the robustness of proof autoformalization in Lean 4, introducing a benchmark with global and local perturbations. Evaluating seven recent LLM-based models on miniF2F and MATH-500, the study finds that all are sensitive to global paraphrasing and mostly fail to reflect local changes faithfully, raising concerns for dependable formal verification.
Jun 16, 2026 1 source