Artificial Intelligence #ai#machine learning
NVIDIA Open-Sources Nemotron 3 Ultra: 550B-Parameter Hybrid Mamba-Transformer Model for Agentic AI
NVIDIA introduced Nemotron 3 Ultra, a 550 billion total parameter Mixture-of-Experts language model with a hybrid Mamba-Attention architecture. Only 55 billion parameters are active per inference. Pre-trained on 20 trillion tokens and supporting a 1 million token context length, the model achieves up to 6x higher inference throughput versus state-of-the-art public LLMs while matching accuracy. All checkpoints, training data, and recipes are open-sourced on HuggingFace.
Jun 16, 2026 2 sources