Artificial Intelligence #ai #theorem proving
SorryDB Benchmark Tests AI Provers on Real-World Lean Theorem Completion Tasks
Researchers present SorryDB, a benchmark of open Lean proof tasks drawn from 78 GitHub projects. Evaluating a snapshot of 1,000 tasks, they find that current approaches are complementary: agentic methods based on Gemini Flash perform best overall, but they do not solve every task that other approaches solve.
Jun 17, 2026