New LLM Framework Detects Phishing Emails with Over 90% Accuracy

A paper on arXiv introduces LLMPEA, a framework using GPT-4o, Claude Sonnet 4, and Grok-3 to detect phishing emails with over 90% accuracy. The study also reveals vulnerabilities to adversarial attacks, prompt injection, and multilingual attacks, emphasizing the need for hardening before deployment.

iGEN Editorial

June 16, 2026

New LLM Framework Detects Phishing Emails with Over 90% Accuracy

Phishing remains one of the most prevalent and globally consequential vectors of cyber intrusion, affecting organizations worldwide. Researchers have proposed a new framework, LLMPEA, that leverages large language models (LLMs) to detect phishing emails with over 90% accuracy, according to a paper published on arXiv. The framework is designed to counter evolving attacks that exploit the fundamental architectures of LLMs.

The LLMPEA Framework

The LLMPEA (LLM-based Phishing Email Attack detection) framework evaluates three frontier LLMs: GPT-4o (OpenAI), Claude Sonnet 4 (Anthropic), and Grok-3 (xAI). The researchers employed comprehensive prompting design to assess the feasibility, robustness, and limitations of these models against phishing email attacks. According to the paper, the empirical analysis reveals that LLMs can detect phishing emails with over 90% accuracy.

Performance Across Models

All three models demonstrated strong performance on standard phishing detection tasks. However, the paper highlights significant weaknesses: LLM-based phishing detection systems could be exploited by adversarial attacks, prompt injection, and multilingual attacks. These attack vectors are specifically designed to exploit architectural vulnerabilities inherent in current LLMs. The authors note that current LLMs require substantial hardening before deployment in email security systems, particularly against coordinated multi-vector attacks.

Attack Vector	Description
Prompt Injection	Maliciously crafted inputs to alter model behavior
Text Refinement	Subtle rewording to evade detection
Multilingual Attacks	Using non-English languages to bypass filters

Vulnerabilities and Challenges

The paper explicitly identifies the attack vectors tested: prompt injection, text refinement, and multilingual attacks. These are coordinated multi-vector attacks that exploit architectural vulnerabilities. The findings provide critical insights for LLM-based phishing detection in real-world settings, where attackers exploit multiple vulnerabilities in combination. The researchers emphasize that hardening measures—such as adversarial training and input sanitization—are necessary before these systems can be reliably deployed.

Implications for Enterprise Email Security

For enterprise technology leaders, the results underscore both the promise and the risks of adopting LLM-based phishing detection. Achieving over 90% accuracy is encouraging, but the identified vulnerabilities mean that organizations cannot rely solely on LLMs without additional safeguards. As phishing grows more sophisticated—especially against systems that deploy LLM applications—enterprises should consider layered security approaches. These combine LLM detection with traditional rule-based systems, user training, and continuous monitoring. The paper, authored by Hasan, Najmul, BusiReddyGari, Prashanth, Zhao, Haitao, Ren, Yihao, Xu, Jinsheng, Zhang, and Shaohu, provides a foundation for hardening LLMs against coordinated attacks. Decision-makers should weigh the accuracy gains against the need for robust defensive postures.

Sources:

New LLM Framework Detects Phishing Emails with Over 90% Accuracy

The LLMPEA Framework

Performance Across Models

Vulnerabilities and Challenges

Implications for Enterprise Email Security

Recommended Stories

New Method LUCID Detects Hallucinations in LLM-Based Knowledge Graph Reasoning

Apple's Hide My Email Vulnerability Exposes User Addresses for Over a Year

Beyond Reasoning Gains: Mitigating General-Capability Forgetting in Large Reasoning Models

Fine-Tuning LLMs for Vulnerability Detection Fails to Improve Security Reasoning, Study Finds