New MBABench Evaluates LLM Agents on End-to-End Finance Spreadsheet Tasks
MBABench is a new benchmark that evaluates LLM agents on end-to-end finance spreadsheet tasks, with a focus on modeling and scenario analysis. It scores agents on accuracy, formula use, and formatting. Claude-family models lead but still fall short of professional standards.
Jun 16, 2026