Mistral shipped Mistral Small 4, a 119B MoE model with 6.5B active parameters, multimodal input, configurable reasoning, and Apache 2.0 weights. Teams running SGLang or vLLM can deploy it into existing stacks immediately, since both added day-one support.

In Mistral's API, the mistral-small-latest alias now maps to "Mistral Small 4," and the launch post links to the official model release. The model arrived after pre-release signs in a Hugging Face integration PR, which surfaced the core packaging before launch. That material described a "powerful hybrid model" that "unifies" Instruct, Reasoning, and Devstral-style capabilities in one model, with 128 experts, 4 active experts, 119B total parameters, and 6.5B active per token, as shown in the pre-release PR leak and the architecture screenshot.
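The 128-expert, 4-active routing described above can be sketched as a toy top-k gating step. This is illustrative only (random logits, stdlib-only softmax), not Mistral's implementation; the constants come from the PR leak.

```python
import math
import random

NUM_EXPERTS = 128   # total experts per MoE layer (from the PR leak)
TOP_K = 4           # experts activated per token

def route_token(logits, k=TOP_K):
    """Pick the top-k experts for one token and softmax-normalize
    their gate weights, as in standard top-k MoE routing."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = [math.exp(logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
chosen = route_token(logits)
print(chosen)  # 4 (expert_index, weight) pairs; weights sum to 1
```

Because only 4 of 128 experts fire per token, serving cost tracks the 6.5B active-parameter figure rather than the 119B total, which is what makes a model this size viable on modest hardware.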
The shipped model keeps a broad feature surface for one checkpoint: multimodal input with text output, configurable reasoning effort per request, native function calling, JSON output, multilingual support, and a 256K context window. Mistral is also releasing it as open weights under Apache 2.0, and the Hugging Face collection makes clear this is a family of checkpoints rather than a single artifact.
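Per-request reasoning effort means clients pick a mode at call time instead of switching checkpoints. A minimal sketch of an OpenAI-style chat payload follows; the `reasoning_effort` knob and `response_format` field names are assumptions made for illustration and may differ from Mistral's actual API.

```python
import json

def build_request(prompt: str, effort: str = "low") -> dict:
    """Build a chat-completion payload with a per-request reasoning
    knob. `reasoning_effort` is an assumed field name, shown only to
    illustrate the configurable-reasoning idea."""
    assert effort in {"low", "medium", "high"}
    return {
        "model": "mistral-small-latest",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,                  # assumed knob name
        "response_format": {"type": "json_object"},  # JSON output mode
    }

payload = build_request("Summarize this contract.", effort="high")
print(json.dumps(payload, indent=2))
```

The practical upshot: one deployed checkpoint can serve both cheap instruct traffic and expensive reasoning traffic, routed by a single request field.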
SGLang shipped day-one support, and LMSYS's post includes a concrete server command that uses mistralai/Mistral-Small-4-119B-2603 plus --tool-call-parser mistral and --reasoning-parser mistral, so existing tool-calling pipelines need no custom glue to expose the model's agentic and hybrid reasoning modes. In the same announcement, LMSYS claims "3× more RPS vs Mistral Small 3," framing the release as a throughput play as much as a capability upgrade.
vLLM also added day-one support, with its launch note calling out MLA attention, tool calling, and configurable reasoning mode, verified on NVIDIA GPUs. The example container config exposes the operational knobs engineers actually care about for rollout: a 262,144-token max model length, the Flash Attention MLA backend, tensor parallel size 2, automatic tool choice, and batching settings up to 16,384 tokens and 128 sequences.
Mistral is positioning Small 4 as a consolidation release, not just a smaller checkpoint. The comparison chart in the launch materials shows separate instruct and reasoning results for the same model, with reasoning mode lifting GPQA Diamond from 59.1 to 71.2, MMLU-Pro from 73.5 to 78.0, IFBench from 35.7 to 48.0, and MMMU-Pro from 46.3 to 60.0; Arena Hard improves more modestly, from 55.8 to 58.3. A reposted chart shows the same numbers.
That launch landed alongside Mistral's new NVIDIA partnership, which the announcement thread framed as co-developing “frontier open-source AI models.” In practice, that gives Small 4 more than a model-card moment: it shipped with immediate availability in Mistral Playground via the model picker and immediate support in two popular open serving stacks, which is the part most likely to matter for engineering teams evaluating it this week.
Vercel Emulate added a programmatic API for creating, resetting, and closing local GitHub, Vercel, and Google emulators inside automated tests. That makes deterministic integration tests easier to wire into CI and agent loops without manual setup.
OpenClaw shipped version 2026.3.22 with ClawHub, OpenShell plus SSH sandboxes, side-question flows, and more search and model options, then followed with a 2026.3.23 patch. Teams get a broader plugin surface, but should patch quickly and review plugin trust boundaries as the ecosystem grows.
Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.
Epoch AI says GPT-5.4 Pro elicited a publishable solution to one 2019 conjecture in its FrontierMath Open Problems set, with a formal writeup planned. Treat it as an early milestone worth reproducing, not blanket evidence that frontier models can already automate math research.
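The n-gram trick behind Cursor's Instant Grep can be illustrated with a tiny trigram index: extract the trigrams of a literal query, intersect per-trigram posting lists of file IDs, and only regex-scan the surviving candidates. This is a generic sketch of the technique, not Cursor's code; in practice a Bloom filter per file can answer the "does this file contain this trigram" test even more cheaply than the sets used here.

```python
import re
from collections import defaultdict

def trigrams(s: str):
    """All 3-character substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

class TrigramIndex:
    """Map each trigram to the set of file IDs containing it, so a
    search only regex-scans files that hold every query trigram."""
    def __init__(self):
        self.postings = defaultdict(set)
        self.files = {}

    def add(self, file_id, text):
        self.files[file_id] = text
        for g in trigrams(text):
            self.postings[g].add(file_id)

    def search_literal(self, query):
        grams = trigrams(query)
        if not grams:   # query too short to filter; scan everything
            candidates = set(self.files)
        else:           # intersect posting lists to prune candidates
            candidates = set.intersection(*(self.postings[g] for g in grams))
        pat = re.compile(re.escape(query))
        return sorted(f for f in candidates if pat.search(self.files[f]))

idx = TrigramIndex()
idx.add("a.py", "def instant_grep(): pass")
idx.add("b.py", "print('hello world')")
print(idx.search_literal("grep"))  # ['a.py']
```

The speedup comes from the candidate set shrinking with each intersected posting list: on a large repo, most files never reach the regex engine at all.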
Mistral Small 4 is out huggingface.co/collections/mi…
🎉 Congrats on launching Mistral Small 4 from @MistralAI, day-0 support is now live in SGLang! Mistral Small 4 is a 119B MoE model that unifies Instruct, Reasoning, and Agentic capabilities into a single model. ⚡️ Efficient MoE: 3× more RPS vs Mistral Small 3 🧠 Hybrid…
🎉 Congrats to @MistralAI on releasing Mistral Small 4 — a 119B MoE model (6.5B active per token) that unifies instruct, reasoning, and coding in one checkpoint. Multimodal, 256K context. Day-0 support in vLLM — MLA attention backend, tool calling, and configurable reasoning…
We're so back Mistral has announced a partnership with NVIDIA to develop frontier open source models... But also released Mistral Small 4 🔥 - 100% open source - 6.5B activated parameters - Reasoning & non-reasoning mode - Beat the previous (closed) medium And it's multimodal
🚀Announcing a strategic partnership with NVIDIA to co-develop frontier open-source AI models, combining Mistral AI’s frontier model architecture and full-stack AI offering with NVIDIA’s leading compute infrastructure and development tools.
BREAKING 🚨: Mistral AI and NVIDIA joining forces to co-develop frontier open-source AI models! Release of Mistral Small 4 has been mentioned in the blog post as well. “As part of this commitment, today Mistral AI is releasing Mistral Small 4 to empower developers,…”
BREAKING 🚨: Mistral AI is preparing Mistral 4 models for the upcoming release, as a new PR has been opened to Huggingface. Mistral-Small-4-119B-2603 is on the horizon!
Mistral 4 is coming - "Mistral 4 is a powerful hybrid model with the capability of acting as both a general instruction model and a reasoning model. It unifies the capabilities of three different model families - Instruct, Reasoning (previously called Magistral), and Devstral -