Artificial Intelligence #tied expert layers #mixture-of-experts
Expert Tying Reduces Memory Footprint of Mixture-of-Experts LLMs by Nearly Half
A new arXiv paper from Jaggi proposes Expert Tying, an architectural modification for Mixture-of-Experts (MoE) LLMs that shares expert parameters across consecutive transformer layers. In pretraining experiments on OLMoE, Qwen3, and DeepSeek-style architectures, it cuts the memory footprint by almost 2x with virtually no degradation in perplexity or downstream quality.
Jun 16, 2026 1 source
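
To make the parameter-sharing idea concrete, here is a minimal sketch of how tying expert weights across consecutive layers changes the parameter count. It is an illustration under stated assumptions, not the paper's implementation: the names `MoEConfig`, `tie_group`, `bank_index`, `expert_params`, and `router_params` are hypothetical, each expert is assumed to be a bias-free two-matrix feed-forward block, and routers are assumed to stay separate per layer. The summary does not specify any of these details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEConfig:
    """Hypothetical MoE shape; none of these values come from the paper."""
    n_layers: int
    n_experts: int
    d_model: int
    d_ff: int
    tie_group: int = 1  # assumption: number of consecutive layers sharing one expert bank


def bank_index(layer: int, tie_group: int) -> int:
    """Map a layer to the expert bank it reads from (consecutive layers share a bank)."""
    return layer // tie_group


def expert_params(cfg: MoEConfig) -> int:
    """Count expert weights, storing each shared bank only once."""
    # Assumption: one up-projection and one down-projection per expert, no biases.
    per_expert = 2 * cfg.d_model * cfg.d_ff
    n_banks = -(-cfg.n_layers // cfg.tie_group)  # ceiling division
    return n_banks * cfg.n_experts * per_expert


def router_params(cfg: MoEConfig) -> int:
    """Assumption: every layer keeps its own router, so this count is unaffected by tying."""
    return cfg.n_layers * cfg.d_model * cfg.n_experts


untied = MoEConfig(n_layers=16, n_experts=64, d_model=2048, d_ff=1024)
tied = MoEConfig(n_layers=16, n_experts=64, d_model=2048, d_ff=1024, tie_group=2)

# Ratio of expert weights before and after tying pairs of layers.
expert_ratio = expert_params(untied) / expert_params(tied)
```

Under these assumptions, tying pairs of layers halves the expert weights exactly, while routers (and, in a real model, attention and embeddings) are unchanged. That would be consistent with an overall reduction of "almost" rather than exactly 2x when expert weights dominate the total.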