Artificial Intelligence #snyk #vulnbench
Snyk VulnBench JS 1.0 Reveals LLM Security Reviews Are Unrepeatable: Can They Find the Same Bugs Twice?
A new benchmark from Snyk finds that agentic LLM security reviews are hard to repeat: 80 of 161 unique findings, roughly half, appeared in only one of five identical runs. By contrast, Claude's reference-matched findings were stable across runs, and Snyk Code SAST was deterministic. The study argues for combining LLM and SAST approaches rather than treating one as a replacement for the other.
Jun 16, 2026