
Cordyceps: New Data Poisoning Attack Covertly Controls Large Language Models

A new paper on arXiv presents Cordyceps, a data poisoning attack that embeds covert control instructions into large language models through semantic associations. Tested across five LLMs, it reaches attack success rates of up to 93% after backdoor defenses and up to 98% after prompt injection defenses, and it outperforms heuristic-based prompt injection attacks, with average attack success improving by approximately 40%.

iGEN Editorial
June 16, 2026

Large language models (LLMs) are increasingly fine-tuned on uncurated text datasets, which gives adversaries an opening to inject malicious behavior. A new attack method, named Cordyceps, demonstrates a subtler and more persistent threat than earlier poisoning attacks. According to a paper on arXiv, Cordyceps teaches an LLM an information-hiding scheme via data poisoning, enabling covert control over the model's outputs without relying on fixed trigger phrases.

The Attack Mechanism

Traditional poisoning attacks depend on specific trigger words that defenses can detect and neutralize. Cordyceps, by contrast, builds semantic associations between shared knowledge—such as common facts or concepts—and attacker-chosen phrases. The paper explains that this induces a hiding scheme capable of encoding and decoding arbitrary malicious instructions. The attack is named after the parasitic fungus that takes over its host, reflecting the attack's ability to subvert the model from within.
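To make the idea of a hiding scheme that can encode and decode messages more concrete, here is a deliberately simplified Python sketch. It is not the paper's construction: the article does not describe how Cordyceps builds its associations, and in the real attack the mapping is learned by the model from poisoned fine-tuning data rather than stored in a lookup table. The codebook, its phrases, and the function names are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

# Toy illustration only: a fixed codebook that pairs hidden symbols with
# innocuous-looking statements of common knowledge. Every phrase and name
# here is hypothetical and is NOT taken from the Cordyceps paper.
CODEBOOK = {
    "a": "the sky is blue",
    "b": "water boils at 100 C",
    "c": "paris is in france",
}

# Inverse mapping used to recover symbols from cover phrases.
REVERSE = {phrase: symbol for symbol, phrase in CODEBOOK.items()}


def encode(message: str) -> list[str]:
    """Map each symbol of the message to its cover phrase."""
    return [CODEBOOK[symbol] for symbol in message]


def decode(phrases: list[str]) -> str:
    """Recover the hidden symbols from a sequence of cover phrases."""
    return "".join(REVERSE[phrase] for phrase in phrases)


cover_text = encode("cab")
print(cover_text)
print(decode(cover_text))
```

The point of the sketch is only that a shared association between ordinary-looking text and hidden symbols lets two parties exchange arbitrary messages without any single fixed trigger word appearing in the text.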

Performance Metrics

The researchers evaluated Cordyceps across five LLMs, using three backdoor defenses and four prompt injection defenses. Even when only a small fraction of the training data was poisoned, covert control attacks outperformed heuristic-based prompt injection attacks. The average attack success rate improved by approximately 40% relative to clean fine-tuned models.
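Attack success rate is conventionally the fraction of attack attempts that produce the attacker's intended behavior. The article does not say whether the roughly 40% improvement is an absolute gain in percentage points or a relative gain, so the Python sketch below computes both to show how far apart they can be. The outcome lists are made-up illustrative data, not results from the paper, and the function names are assumptions.

```python
def attack_success_rate(outcomes):
    """Return the fraction of attempts that succeeded, given a list of booleans."""
    if not outcomes:
        raise ValueError("need at least one attempt")
    return sum(outcomes) / len(outcomes)


def improvement(baseline, attack):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute_pp = (attack - baseline) * 100
    relative_pct = (attack - baseline) / baseline * 100
    return absolute_pp, relative_pct


# Made-up outcomes for illustration only: 5 of 10 vs. 7 of 10 successes.
baseline_asr = attack_success_rate([True] * 5 + [False] * 5)
covert_asr = attack_success_rate([True] * 7 + [False] * 3)

abs_pp, rel_pct = improvement(baseline_asr, covert_asr)
print(f"{abs_pp:.1f} percentage points, {rel_pct:.1f}% relative")
# prints "20.0 percentage points, 40.0% relative"
```

With these invented numbers, a gain of 20 percentage points equals a 40% relative gain, which is why it helps to check which of the two a reported figure refers to.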

Resilience Against Defenses

A key finding is Cordyceps' ability to circumvent existing mitigation strategies. The attack maintained a success rate of up to 93% after backdoor defenses—which typically involve outlier detection or clean-data regularization—and up to 98% after prompt injection defenses, which monitor outputs for malicious instructions. The following table summarizes the attack's persistence:

Defense type                                   | Maximum attack success rate
Backdoor defenses (detection & fine-tuning)    | 93%
Prompt injection defenses (online monitoring)  | 98%

Implications for Enterprise AI

For organizations deploying LLMs in critical workflows, including supply chain management, logistics optimization, and trade documentation, this research highlights a new vulnerability. Because Cordyceps can encode arbitrary instructions through semantic associations, even models fine-tuned on apparently benign data could harbor hidden backdoors. Enterprises relying on third-party fine-tuning or uncurated datasets must reassess their AI supply chain security. The paper, authored by Zedian Shao, Charles Fleming, and Teodora Baluta, is a wake-up call for adopting more robust validation and monitoring practices in AI systems.

