Software #lean 4 #proof autoformalization
Study Reveals Serious Robustness Flaws in Proof Autoformalization for Lean 4
A new arXiv preprint presents the first systematic study of the robustness of proof autoformalization in Lean 4 and introduces a benchmark built on global and local perturbations. Evaluating seven recent LLM-based models on miniF2F and MATH-500, the study finds that all of them are sensitive to global paraphrasing and that they mostly fail to reflect local changes faithfully, raising concerns about their dependability for formal verification.
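To make the notion of a local perturbation concrete, here is a minimal Lean 4 sketch. It is not taken from the preprint or its benchmark: the two informal statements, the theorem names `original` and `perturbed`, and the choice of constants are all hypothetical, and the sketch assumes a recent Lean 4 toolchain in which the `omega` tactic is available. The point is only that when one constant in the informal text changes, a faithful formalization should change in the same place.

```lean
-- Hypothetical informal input: "If x + 2 = 5 for a natural number x, then x = 3."
theorem original (x : Nat) (h : x + 2 = 5) : x = 3 := by
  omega

-- Hypothetical local perturbation of the input: "If x + 3 = 5, then x = 2."
-- A faithful formalization tracks the edit: both the hypothesis and the
-- conclusion change. Emitting `original` again would still type-check,
-- but it would no longer match the perturbed informal text.
theorem perturbed (x : Nat) (h : x + 3 = 5) : x = 2 := by
  omega
```

A global paraphrase is the opposite case: rewording the first statement without changing its meaning should still lead to a formalization equivalent to `original`.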
Jun 16, 2026 1 source