Artificial Intelligence #diffusion llm #inference
Fast-dLLM++ Boosts Diffusion LLM Inference Up to 37% With Fréchet Profile Decoding
Researchers propose Fast-dLLM++, a training-free extension to Fast-dLLM that uses Fréchet profile decoding to select parallel token commit sets from the full confidence profile. Experiments on LLaDA-8B show up to 37% higher throughput at comparable accuracy on benchmarks including GSM8K, MATH, HumanEval, and MBPP.
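The summary does not spell out the selection rule, so the following is only an illustrative sketch of how a commit set could be chosen from a full confidence profile. It assumes the "Fréchet" in the name refers to the classical Fréchet inequality, which bounds the probability that all of n events hold by max(0, sum of the individual probabilities minus (n - 1)). The function names and the `min_joint` parameter are hypothetical and are not taken from Fast-dLLM++ or its code.

```python
from typing import List


def frechet_lower_bound(confidences: List[float]) -> float:
    """Fréchet lower bound on the probability that all events hold at once."""
    n = len(confidences)
    return max(0.0, sum(confidences) - (n - 1))


def select_commit_set(confidences: List[float], min_joint: float = 0.9) -> List[int]:
    """Pick positions to commit in parallel (hypothetical rule, not the paper's).

    Positions are visited from most to least confident. A position is added
    only while the Fréchet lower bound on the joint correctness of the chosen
    set stays at or above `min_joint`. At least one position is always
    committed so that decoding makes progress.
    """
    if not confidences:
        return []
    order = sorted(range(len(confidences)), key=lambda i: confidences[i], reverse=True)
    chosen = [order[0]]
    total = confidences[order[0]]
    for i in order[1:]:
        new_total = total + confidences[i]
        size = len(chosen) + 1
        # Adding a lower-confidence token can only lower the bound,
        # so the first failure ends the search.
        if max(0.0, new_total - (size - 1)) < min_joint:
            break
        chosen.append(i)
        total = new_total
    return sorted(chosen)


if __name__ == "__main__":
    profile = [0.99, 0.50, 0.97, 0.98]  # made-up per-position confidences
    print(select_commit_set(profile, min_joint=0.9))
```

With the made-up profile above, the three high-confidence positions are committed together and the 0.50 position is left masked for a later step. A fixed per-token threshold would look at each confidence in isolation, whereas a rule of this kind lets the size of the commit set depend on the whole profile.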
Jun 16, 2026