Phishing remains one of the most prevalent and globally consequential vectors of cyber intrusion, affecting organizations worldwide. Researchers have proposed a new framework, LLMPEA, that leverages large language models (LLMs) to detect phishing emails with over 90% accuracy, according to a paper published on arXiv. The framework is designed to counter evolving attacks that exploit the fundamental architectures of LLMs.
The LLMPEA Framework
The LLMPEA (LLM-based Phishing Email Attack detection) framework evaluates three frontier LLMs: GPT-4o (OpenAI), Claude Sonnet 4 (Anthropic), and Grok-3 (xAI). The researchers employed comprehensive prompting design to assess the feasibility, robustness, and limitations of these models against phishing email attacks. According to the paper, the empirical analysis reveals that LLMs can detect phishing emails with over 90% accuracy.
Performance Across Models
All three models demonstrated strong performance on standard phishing detection tasks. However, the paper highlights significant weaknesses: LLM-based phishing detection systems could be exploited by adversarial attacks, prompt injection, and multilingual attacks. These attack vectors are specifically designed to exploit architectural vulnerabilities inherent in current LLMs. The authors note that current LLMs require substantial hardening before deployment in email security systems, particularly against coordinated multi-vector attacks.
| Attack Vector | Description |
|---|---|
| Prompt Injection | Maliciously crafted inputs to alter model behavior |
| Text Refinement | Subtle rewording to evade detection |
| Multilingual Attacks | Using non-English languages to bypass filters |
Vulnerabilities and Challenges
The paper explicitly identifies the attack vectors tested: prompt injection, text refinement, and multilingual attacks. These are coordinated multi-vector attacks that exploit architectural vulnerabilities. The findings provide critical insights for LLM-based phishing detection in real-world settings, where attackers exploit multiple vulnerabilities in combination. The researchers emphasize that hardening measures—such as adversarial training and input sanitization—are necessary before these systems can be reliably deployed.
Implications for Enterprise Email Security
For enterprise technology leaders, the results underscore both the promise and the risks of adopting LLM-based phishing detection. Achieving over 90% accuracy is encouraging, but the identified vulnerabilities mean that organizations cannot rely solely on LLMs without additional safeguards. As phishing grows more sophisticated—especially against systems that deploy LLM applications—enterprises should consider layered security approaches. These combine LLM detection with traditional rule-based systems, user training, and continuous monitoring. The paper, authored by Hasan, Najmul, BusiReddyGari, Prashanth, Zhao, Haitao, Ren, Yihao, Xu, Jinsheng, Zhang, and Shaohu, provides a foundation for hardening LLMs against coordinated attacks. Decision-makers should weigh the accuracy gains against the need for robust defensive postures.