NVIDIA released Nemotron 3 Super, a 120B open model with 12B active parameters and a 1M-token window, on OpenRouter with free access. Evaluate it for low-cost agent backends, especially if you need local or self-hosted deployment options.

The practical news is simple: Nemotron 3 Super is already callable through OpenRouter, and Teknium's Hermes setup shows one immediate path into agent workflows: paste `nvidia/nemotron-3-super-120b-a12b:free` into Hermes Agent's custom model field. That makes this less of a research release and more of a drop engineers can test today.
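For teams not using Hermes Agent, the same free endpoint is reachable through OpenRouter's OpenAI-compatible chat completions API. A minimal sketch, assuming an `OPENROUTER_API_KEY` environment variable is set; the function names here are illustrative, not part of any SDK:

```python
import json
import os
import urllib.request

MODEL = "nvidia/nemotron-3-super-120b-a12b:free"
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Assemble an OpenAI-style chat payload for the free Nemotron endpoint."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }

def call_openrouter(prompt: str) -> str:
    """POST the payload to OpenRouter and return the first completion's text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Inspect the payload without spending a network call.
    print(build_request("Plan a three-step refactor.")["model"])
```

Because the endpoint is OpenAI-compatible, the same model slug also drops into any OpenAI-style client by overriding the base URL.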
According to the OpenRouter page, the model is a 120B open hybrid MoE system with only 12B parameters active at inference, a 1M-token context window, and multi-token prediction aimed at long-context reasoning and multi-step planning. The same listing says it is released with weights, datasets, and recipes under the NVIDIA Open License, and reports roughly 28 tokens/sec average throughput alongside benchmark strength on AIME 2025, TerminalBench, and SWE-Bench.
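The "only 12B active" figure refers to sparse expert routing: a gate scores all experts per token but runs only the top few, so most of the 120B parameters sit idle on any single forward pass. A toy sketch of top-k routing, with shapes and names purely illustrative and no claim to match Nemotron's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, experts, gate_w, k=2):
    """Route input x to the top-k experts by gate score; only those experts run."""
    scores = x @ gate_w                       # one gate score per expert
    top = np.argsort(scores)[-k:]             # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                  # softmax over the selected experts only
    out = sum(w * experts[i](x) for w, i in zip(weights, top))
    return out, sorted(top.tolist())

d, n_experts = 8, 16
# Each "expert" is just a linear map here; real experts are MLP blocks.
experts = [
    (lambda W: (lambda x: x @ W))(rng.standard_normal((d, d)))
    for _ in range(n_experts)
]
gate_w = rng.standard_normal((d, n_experts))
x = rng.standard_normal(d)

out, active = moe_forward(x, experts, gate_w, k=2)
print(f"active experts: {active} of {n_experts}")
```

The compute saving is the ratio of active to total experts; the same idea, scaled up, is how a 120B model can price like a much smaller one at inference time.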
The first concrete implementation signal is from OpenHands: its team says it had early access, that the model "works well," and, in an early-access note, that they are "excited to have a great new locally deployable LLM." That lines up with the release's strongest engineering angle: a big-context open model positioned for agent backends that teams may want to run outside closed hosted APIs.
The performance case is still mostly benchmark-driven, but it is specific enough to watch. Wes Roth's Artificial Analysis chart cites a score of 36 for Nemotron 3 Super versus 33 for gpt-oss-120B, and claims it is "roughly 10% faster per GPU," while OpenRouter amplified a separate report that it is, on average, the best model on PinchBench for OpenClaw. Nathan Lambert's interview post also framed this release as "a LONG time coming," pointing to NVIDIA's broader open-model push rather than a one-off model drop.
Vals AI switched SWE-Bench Verified from SWE-Agent to the bash-only mini-swe-agent harness, aligning results more closely with the official benchmark setup. Top score dipped slightly to 78.8%, but the change reduces harness-specific confounds when comparing models.
release: OpenClaw shipped version 2026.3.22 with ClawHub, OpenShell plus SSH sandboxes, side-question flows, and more search and model options, then followed with a 2026.3.23 patch. Teams get a broader plugin surface, but should patch quickly and review plugin trust boundaries as the ecosystem grows.
release: Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
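The mechanism behind an n-gram candidate index is easy to sketch: index every character trigram of every file, and at query time intersect the posting sets for the trigrams of the search literal, so the expensive scan runs only over surviving candidates. A toy version of that idea; the class and method names are illustrative, not Cursor's implementation, and a production index would typically layer Bloom filters over the postings to cut memory:

```python
import re
from collections import defaultdict

class TrigramIndex:
    """Toy candidate index: maps each character trigram to the files containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> set of file names
        self.files = {}                   # file name -> contents

    def add(self, name: str, text: str) -> None:
        self.files[name] = text
        for i in range(len(text) - 2):
            self.postings[text[i:i + 3]].add(name)

    def candidates(self, literal: str) -> set:
        """Files containing every trigram of the literal (a superset of true matches)."""
        grams = [literal[i:i + 3] for i in range(len(literal) - 2)]
        if not grams:
            return set(self.files)  # query too short to filter; scan everything
        return set.intersection(*(self.postings.get(g, set()) for g in grams))

    def search(self, literal: str) -> list:
        """Run the real scan only over the surviving candidates."""
        pat = re.compile(re.escape(literal))
        return sorted(n for n in self.candidates(literal) if pat.search(self.files[n]))

idx = TrigramIndex()
idx.add("a.py", "def handle_request(req): ...")
idx.add("b.py", "def render_page(ctx): ...")
print(idx.search("handle_request"))  # only a.py survives the trigram filter
```

The speedup comes from the set intersection pruning nearly all files before any regex runs; real regex queries are first decomposed into required literal trigrams the same way.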
breaking: ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.
breaking: Epoch AI says GPT-5.4 Pro elicited a publishable solution to one 2019 conjecture in its FrontierMath Open Problems set, with a formal writeup planned. Treat it as an early milestone worth reproducing, not blanket evidence that frontier models can already automate math research.
Run Nemotron as your agent driver in Hermes Agent for free with OpenRouter: openrouter.ai/nvidia/nemotro… Just type `hermes model`, select OpenRouter, choose custom model name, and enter: nvidia/nemotron-3-super-120b-a12b:free
Want to see where OpenHands is headed next? 👀 Join our call TODAY. We will be presenting our roadmap and want feedback from YOU. RSVP below 👇️
Nemotron 3 Super is the new "Gold Standard" for open-weight intelligence, hitting a 36 on the Intelligence Index while remaining highly efficient. It is smarter than GPT-OSS-120B while being roughly 10% faster per GPU.