NVIDIA Nemotron 3 Ultra now available on Amazon SageMaker JumpStart with one-click deployment

Amazon AI Blog · Jun 4, 2026

3 Key Points

NVIDIA Nemotron 3 Ultra, an open large language model with 550 billion total parameters and 55 billion active parameters, is now available for day-zero deployment on Amazon SageMaker JumpStart. The model uses a hybrid Transformer-Mamba Mixture-of-Experts (MoE) architecture and supports a context length of up to 1M tokens.
The model delivers 5x faster inference and up to 30% lower cost for agentic workloads (AI systems that autonomously plan, call tools, and iterate across many steps). Its MoE architecture activates only 55B of its 550B parameters per forward pass, enabling sustained multi-step reasoning across hundreds of turns while maintaining coherence.
Users can deploy Nemotron 3 Ultra through SageMaker Studio or the SageMaker Python SDK without managing infrastructure, choosing one of the supported GPU instance types (ml.p5en.48xlarge, ml.p5.48xlarge, or ml.g7e.48xlarge). The model is optimized for NVFP4, NVIDIA's 4-bit floating-point format.
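
As a rough illustration of the SDK route mentioned in the third point, the sketch below shows what a deployment might look like with the v2-style SageMaker Python SDK (`JumpStartModel`). The summary does not give the JumpStart model ID, so `MODEL_ID` is a hypothetical placeholder that must be replaced with the ID shown in SageMaker Studio. The helper names `check_instance_type` and `deploy_nemotron` are made up for this example, and passing `accept_eula=True` assumes the model ships with a license that has to be accepted at deploy time.

```python
# Instance types named in the summary as supported for this model.
SUPPORTED_INSTANCE_TYPES = (
    "ml.p5en.48xlarge",
    "ml.p5.48xlarge",
    "ml.g7e.48xlarge",
)

# Hypothetical placeholder: the real JumpStart model ID is not given in the
# summary. Look it up in SageMaker Studio before deploying.
MODEL_ID = "<nemotron-3-ultra-jumpstart-model-id>"


def check_instance_type(instance_type: str) -> str:
    """Return the instance type unchanged, or raise if it is not in the supported list."""
    if instance_type not in SUPPORTED_INSTANCE_TYPES:
        raise ValueError(
            f"{instance_type!r} is not one of {SUPPORTED_INSTANCE_TYPES}"
        )
    return instance_type


def deploy_nemotron(model_id: str = MODEL_ID,
                    instance_type: str = "ml.p5en.48xlarge"):
    """Deploy the model to a real-time endpoint and return its predictor.

    Calling this creates billable AWS resources and needs valid AWS
    credentials plus a SageMaker execution role.
    """
    check_instance_type(instance_type)

    # Imported lazily so the validation helper works without the SDK installed.
    from sagemaker.jumpstart.model import JumpStartModel

    model = JumpStartModel(model_id=model_id, instance_type=instance_type)
    # accept_eula=True is an assumption about this model's licensing terms.
    return model.deploy(accept_eula=True)
```

Calling `deploy_nemotron()` returns a predictor object for the new endpoint. Remember to delete the endpoint when it is no longer needed, since these GPU instances are billed while the endpoint is running.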
