Artificial Intelligence #llm #tutors
New Diagnostic Measures Whether LLM Tutors Teach or Simply Solve Problems
Researchers have proposed a diagnostic for evaluating whether large language model (LLM) tutors actually support learning or merely solve problems. An analysis of eight models on the MathTutorBench benchmark found a correlation of only 0.421 between problem-solving and pedagogy performance, and several models changed rank when evaluated on teaching-oriented criteria.
Jun 16, 2026 1 source
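
To make the kind of comparison described above concrete, the following is a minimal sketch of how a correlation between solving and pedagogy scores, and the resulting rank shifts, could be computed. It is not the researchers' method: the model names and scores are invented for illustration, and Pearson correlation is an assumption, since the summary does not say which correlation coefficient was used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length score lists (assumed metric)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def ranks(scores):
    """Map each model to its rank, where 1 is the highest score (ties not handled)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


# Hypothetical scores, invented for illustration only; not MathTutorBench data.
solving = {"model_a": 0.90, "model_b": 0.80, "model_c": 0.70, "model_d": 0.60}
pedagogy = {"model_a": 0.55, "model_b": 0.75, "model_c": 0.50, "model_d": 0.65}

names = sorted(solving)
r = pearson([solving[n] for n in names], [pedagogy[n] for n in names])

solve_rank = ranks(solving)
teach_rank = ranks(pedagogy)
# Positive shift: the model ranks better on pedagogy than on solving.
shifts = {n: solve_rank[n] - teach_rank[n] for n in names}

print(f"correlation: {r:.3f}")
for name in names:
    print(name, "solving rank", solve_rank[name],
          "pedagogy rank", teach_rank[name], "shift", shifts[name])
```

A low correlation in a comparison like this means that a model's solving score says little about its teaching score, which is why rankings can reorder when the criterion changes.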