Artificial Intelligence #token pruning #llm
DCP-Prune: New Token Pruning Method Preserves AI Model Performance at Ultra-Low Budgets
Researchers propose DCP-Prune, a two-stage token pruning framework that maintains model accuracy even under ultra-low token budgets. With just 16 visual tokens, the method retains 92.1% of upper-bound average performance on LLaVA-1.5-7B, and it addresses the distribution shift that plagues aggressive pruning.
Jun 16, 2026