Study Reveals 27 Error Types in LLM Text-to-SQL, Introduces MapleDoctor Repair Framework

Researchers conducted the first comprehensive study of errors in LLM-based text-to-SQL systems using in-context learning. They identified 27 error types across 7 categories and proposed MapleDoctor, a detection and repair framework that outperforms existing solutions by repairing 13.8% more queries with negligible mis-repairs and reducing repair latency by 67.4%.

iGEN Editorial

June 16, 2026

Large language models (LLMs) are increasingly deployed to translate natural language questions into SQL queries through in-context learning (ICL), a technique that places example question-SQL pairs in the prompt to guide the model. However, according to a new study by Jiawei Shen, Chengcheng Wan, Ruoyi Qiao, et al. (arXiv, 2025), these systems suffer from widespread correctness problems. The study, which the authors describe as the first comprehensive examination of ICL-based text-to-SQL errors, systematically analyzed four representative ICL techniques, five basic repairing methods, two benchmarks, and two LLM settings.
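To make the ICL setup concrete, here is a minimal sketch of how a few-shot text-to-SQL prompt might be assembled. It is an illustration only, not one of the four techniques the study evaluated; the `Example` class, the `build_icl_prompt` function, the prompt layout, and the `orders` schema are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One demonstration pair shown to the model (hypothetical structure)."""
    question: str
    sql: str


def build_icl_prompt(schema: str, examples: list[Example], question: str) -> str:
    """Assemble schema, demonstration pairs, and the new question into one prompt."""
    parts = ["-- Schema", schema.strip(), ""]
    for ex in examples:
        parts += [f"-- Question: {ex.question}", ex.sql.strip(), ""]
    # The prompt ends with the unanswered question; the model is expected
    # to continue with the SQL for it.
    parts += [f"-- Question: {question}", "-- SQL:"]
    return "\n".join(parts)


# Hypothetical schema and demonstrations, for illustration only.
schema = "CREATE TABLE orders (id INTEGER, status TEXT, total REAL);"
examples = [
    Example("How many orders are there?", "SELECT COUNT(*) FROM orders;"),
    Example("What is the largest order total?", "SELECT MAX(total) FROM orders;"),
]
prompt = build_icl_prompt(schema, examples, "How many orders have shipped?")
print(prompt)
```

The resulting string would be sent to an LLM, and whatever SQL comes back is the output whose correctness the study examines.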

Scope of the Study

The research covered a broad range of configurations to capture real-world error patterns. The four ICL techniques studied are representative approaches from the literature, though the paper does not name them explicitly. The five basic repairing methods span common strategies, possibly including re-prompting or syntax correction. Two standard benchmarks were used along with two LLM settings, which may correspond to different model sizes or temperatures. This design allowed the team to identify errors that persist across methods and contexts.

Error Categories and Types

The analysis uncovered 27 distinct error types grouped into 7 major categories. While the paper does not enumerate each type, the categories cover semantic, syntactic, and logical mistakes common when LLMs misinterpret database schemas or user intent. The authors note that errors are widespread, indicating that even advanced ICL-based text-to-SQL systems are far from reliable for production use.

Limitations of Existing Repairs

According to the study, existing repair attempts yield only limited correctness improvements. The researchers found that current methods carry high computational overhead and produce many mis-repairs, that is, fixes that introduce new errors or break queries that were already correct. This makes them impractical for enterprise environments where accuracy and speed are critical.
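The difference between a repair and a mis-repair can be illustrated with a small execution-based check. This is a sketch under stated assumptions, not the paper's evaluation procedure: it assumes a reference ("gold") query is available, treats two queries as equivalent when they return the same rows on one SQLite database, and uses hypothetical names (`run`, `classify_repair`) and a hypothetical `orders` table.

```python
import sqlite3


def run(conn, sql):
    """Return the result rows in a stable order, or None if the query fails."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None


def classify_repair(conn, gold_sql, before_sql, after_sql):
    """Label the effect of a repair by comparing results against the gold query."""
    gold = run(conn, gold_sql)
    if gold is None:
        raise ValueError("gold query must execute")
    was_ok = run(conn, before_sql) == gold
    is_ok = run(conn, after_sql) == gold
    if not was_ok and is_ok:
        return "repaired"
    if was_ok and not is_ok:
        return "mis-repair"  # a correct query was made wrong
    return "unchanged-correct" if was_ok else "still-wrong"


# Hypothetical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER, status TEXT, total REAL);
INSERT INTO orders VALUES (1, 'shipped', 20.0), (2, 'open', 5.0), (3, 'shipped', 7.5);
""")
gold = "SELECT COUNT(*) FROM orders WHERE status = 'shipped'"

good_fix = classify_repair(conn, gold, "SELECT COUNT(*) FROM orders", gold)
bad_fix = classify_repair(
    conn, gold, gold, "SELECT COUNT(*) FROM orders WHERE status = 'open'"
)
print(good_fix, bad_fix)
```

Comparing results on a single database is a simplification: two different queries can agree by coincidence, so a stricter check would use several database instances.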

MapleDoctor: A New Detection and Repair Framework

To address these shortcomings, the team developed MapleDoctor, a novel framework for detecting and repairing text-to-SQL errors. MapleDoctor combines error detection with targeted repair strategies. The evaluation demonstrates:

Metric	Existing Solutions	MapleDoctor	Improvement
Queries repaired	Baseline	+13.8%	More queries fixed
Mis-repairs	Common	Negligible	Fewer introduced errors
Repair latency	High	-67.4%	Faster repairs

According to the paper, MapleDoctor outperforms existing solutions by repairing 13.8% more queries while introducing a negligible number of mis-repairs and reducing repair latency by 67.4%. A 67.4% reduction means a repair takes roughly a third of the baseline time. The artifact is publicly available on GitHub, enabling replication and extension.
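The general idea of pairing detection with a targeted repair can be sketched as follows. This is not MapleDoctor's algorithm, which the article does not describe in detail; it is a generic detect-then-repair pattern for one hypothetical error type (a misspelled column name), using SQLite's own error message as the detector and fuzzy matching from Python's `difflib` as the repair. All function names and the `orders` table are assumptions made for illustration.

```python
import difflib
import re
import sqlite3


def detect_unknown_column(conn, sql):
    """Return the column name SQLite rejects, or None if no such error is raised."""
    try:
        conn.execute(f"EXPLAIN {sql}")  # compiles the query without running it
    except sqlite3.OperationalError as exc:
        match = re.search(r"no such column: (\S+)", str(exc))
        return match.group(1) if match else None
    return None


def schema_columns(conn):
    """Collect every column name from every table in the database."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    cols = []
    for (table,) in tables:
        for row in conn.execute(f"PRAGMA table_info({table})"):
            cols.append(row[1])  # row[1] is the column name
    return cols


def repair_unknown_column(conn, sql):
    """Replace a rejected column name with its closest match in the schema."""
    bad = detect_unknown_column(conn, sql)
    if bad is None:
        return sql  # nothing detected: leave the query untouched
    candidates = difflib.get_close_matches(bad, schema_columns(conn), n=1)
    if not candidates:
        return sql  # no plausible replacement: do not guess
    return re.sub(rf"\b{re.escape(bad)}\b", candidates[0], sql)


# Hypothetical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, total REAL)")

broken = "SELECT COUNT(*) FROM orders WHERE statu = 'shipped'"
fixed = repair_unknown_column(conn, broken)
print(fixed)
```

In this sketch, a query that triggers no detector is returned unchanged, which is one plausible way a detection step could limit both mis-repairs and wasted repair time.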

Implications for Enterprise Database Systems

For enterprises relying on natural language interfaces to databases, which are common in supply chain analytics, inventory management, and logistics, the findings highlight the gap between LLM capabilities and production reliability. Text-to-SQL errors can lead to incorrect data retrieval, flawed reporting, and costly decisions. Tools like MapleDoctor offer a path to automated error correction, but the study underscores that manual validation remains essential. The systematic error taxonomy provides a foundation for building more robust systems, and the open-source release invites further innovation from the community.

As LLMs continue to be integrated into enterprise software, understanding and mitigating their failure modes will be critical for achieving trusted automation. This study takes a step toward that goal by quantifying the problem and proposing a practical remedy.
