Artificial Intelligence #llm #inference
SMEPilot Boosts LLM Inference Up to 3.94x on CPUs with Scalable Matrix Extensions
Researchers have developed SMEPilot, an LLM inference engine that uses the Arm Scalable Matrix Extension (SME) to accelerate execution on CPUs. By choosing CPU-only, SME-only, or cooperative SME+CPU execution for each operator shape, SMEPilot speeds up end-to-end inference by up to 3.94x across multiple models and platforms.
Jun 16, 2026
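
To make the per-shape selection concrete, here is a minimal sketch of how a dispatcher might pick among the three execution modes for a matrix-multiply operator. It is not SMEPilot's actual method: the `Mode` enum, `estimate_cost`, `select_mode`, and every constant in the cost model are hypothetical and chosen only for illustration, not taken from any measurement. The sketch assumes that SME has higher throughput but a fixed setup cost, and that cooperative execution adds a synchronization cost on top.

```python
from enum import Enum


class Mode(Enum):
    CPU_ONLY = "cpu"
    SME_ONLY = "sme"
    COOPERATIVE = "sme+cpu"


# Hypothetical constants in arbitrary units; not measured on any hardware.
CPU_RATE = 1.0        # assumed work units per time unit on the regular CPU path
SME_RATE = 4.0        # assumed work units per time unit on the SME path
SME_SETUP = 50_000.0  # assumed fixed cost of entering the SME path
SYNC_COST = 30_000.0  # assumed extra cost of coordinating SME and CPU


def estimate_cost(mode: Mode, m: int, n: int, k: int) -> float:
    """Estimate the cost of an (m x k) @ (k x n) matmul under a toy model."""
    work = 2.0 * m * n * k  # multiply-add count for a dense matmul
    if mode is Mode.CPU_ONLY:
        return work / CPU_RATE
    if mode is Mode.SME_ONLY:
        return SME_SETUP + work / SME_RATE
    # Cooperative: both units share the work, at the price of synchronization.
    return SME_SETUP + SYNC_COST + work / (CPU_RATE + SME_RATE)


def select_mode(m: int, n: int, k: int) -> Mode:
    """Pick the mode with the lowest estimated cost for this operator shape."""
    return min(Mode, key=lambda mode: estimate_cost(mode, m, n, k))


if __name__ == "__main__":
    for shape in [(1, 64, 64), (32, 64, 64), (512, 512, 512)]:
        print(shape, select_mode(*shape).value)
```

Under these made-up constants, small shapes stay on the CPU because the setup cost dominates, mid-sized shapes go to SME alone, and large shapes justify the coordination overhead of cooperative execution. A real engine would presumably replace the toy cost model with profiling data or a tuned heuristic.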