New write-ups on Mamba-3 add more detail on its MIMO decode path, discretization changes, and complex-valued state updates. That gives infra teams a clearer basis for testing state-space models as inference-efficient alternatives in long-sequence or agent-heavy systems.

Cartesia's launch post frames Mamba-3 as a redesign for the part of the stack that now dominates cost and latency: inference. The linked write-up says earlier SSMs were a major efficiency advance, and that Mamba-3 takes the next step by shaping the model for "a world where AI workloads are increasingly dominated by inference," not just training throughput.
The clearest architectural deltas come from the paper summary. It describes a new exponential-trapezoidal discretization with a three-term recurrence that is "more expressive" than Mamba-2's exponential-Euler update, plus complex-valued state updates implemented through data-dependent RoPE. In the summary's wording, that enables "rotational state dynamics" and improves performance on tasks that require persistent state tracking, including parity-style problems that weaker linear dynamics struggle with.
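As a rough illustration of the two ideas in this paragraph, the sketch below uses scalar, constant-coefficient toy recurrences. The three-term form (the weights `alpha`, `beta`, `gamma` and the mixing parameter `lam`) is an assumed generalized-trapezoid parameterization, not one quoted from the paper summary, and in the actual model the coefficients would be data-dependent. The parity function hand-sets the rotation angle to pi per 1-bit, where a real model would have to learn it.

```python
import cmath
import math


def ssm_scan(xs, dt, a, b, lam):
    """Toy scalar recurrence (assumed form, not taken from the paper):

        h_t = alpha * h_{t-1} + beta * b * x_{t-1} + gamma * b * x_t

    lam = 1.0 drops the x_{t-1} term (two-term, exponential-Euler style).
    lam = 0.5 weights both interval endpoints equally (trapezoid style).
    xs[0] is the input sample at t = 0; the state starts at h = 0.
    """
    alpha = math.exp(dt * a)          # exact decay of the state over one step
    beta = (1.0 - lam) * dt * alpha   # weight on the previous input (third term)
    gamma = lam * dt                  # weight on the current input
    h = 0.0
    for x_prev, x_cur in zip(xs, xs[1:]):
        h = alpha * h + beta * b * x_prev + gamma * b * x_cur
    return h


def demo_errors(dt=0.1, steps=50):
    """Compare both updates against the exact solution of h' = -h + 1, h(0) = 0."""
    xs = [1.0] * (steps + 1)
    exact = 1.0 - math.exp(-dt * steps)
    euler = ssm_scan(xs, dt, a=-1.0, b=1.0, lam=1.0)
    trap = ssm_scan(xs, dt, a=-1.0, b=1.0, lam=0.5)
    return abs(euler - exact), abs(trap - exact)


def parity_by_rotation(bits):
    """Track parity with a complex state that rotates by pi for every 1-bit.

    The angle pi * bit is hand-picked for illustration only.
    """
    h = 1 + 0j
    for bit in bits:
        h = cmath.exp(1j * math.pi * bit) * h  # data-dependent rotation
    return 0 if h.real > 0 else 1
```

On this toy problem the `lam=0.5` update tracks the exact solution more closely than the `lam=1.0` update at the same step size. The rotating state flips sign on every 1-bit, which a state restricted to positive real decay factors cannot do; that is the usual intuition for why rotational dynamics help on parity-style tasks.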
Together's thread ties the research to a familiar infra problem: linear models can look efficient in FLOPs while still being memory-bound during decode. Its description of the MIMO path is practical: switching the recurrence's state update from a vector outer product to a matrix multiply lets the model do more useful compute per decode step at the same speed, which is exactly the kind of trade that matters when low GPU utilization is the bottleneck.
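A back-of-envelope cost model makes the memory-bound argument concrete. The sketch below is an assumption-laden estimate, not a description of the released kernels: it counts FLOPs and bytes for one decode step of a single head whose state is an `n_state` x `head_dim` matrix, with `rank` = 1 standing in for the outer-product update and `rank` > 1 for a matrix-multiply update. All sizes and the function name are hypothetical.

```python
def decode_step_intensity(n_state, head_dim, rank, bytes_per_elem=2):
    """Estimate arithmetic intensity (FLOPs per byte moved) for one decode step.

    Assumed step, per head:
      H <- decay * H + B @ X^T   with B: (n_state, rank), X: (head_dim, rank)
      Y <- C^T @ H               with C: (n_state, rank)
    rank = 1 makes B @ X^T a plain vector outer product.
    """
    state_elems = n_state * head_dim
    # Decay multiply, then multiply-accumulate for the update and the readout.
    flops = state_elems + 2 * state_elems * rank + 2 * state_elems * rank
    # Read and write the state once; read B and C; read X and write Y.
    elems_moved = 2 * state_elems + 2 * n_state * rank + 2 * head_dim * rank
    return flops / (elems_moved * bytes_per_elem)


if __name__ == "__main__":
    for r in (1, 2, 4, 8):
        print(r, round(decode_step_intensity(128, 64, r), 2))
```

With these hypothetical sizes, going from `rank=1` to `rank=4` raises the estimated FLOPs per byte roughly threefold while the bytes moved grow by under 10%, because state traffic dominates. That is the sense in which extra compute can be close to free, as long as the step stays memory-bound.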
The same thread claims Mamba-3 delivers the fastest prefill+decode at 1.5B and beats Mamba-2, Gated DeltaNet, and Llama-3.2-1B at that scale. The paper summary adds a smaller but concrete quality delta, saying the MIMO variant improved accuracy by 1.2 points over a comparable baseline. Together's thread also says the kernels are open-sourced, which makes this more testable than a pure benchmark claim.
Mamba-3 is out! 🐍 SSMs marked a major advance for the efficiency of modern LLMs. Mamba-3 takes the next step, shaping SSMs for a world where AI workloads are increasingly dominated by inference. Read about it on the Cartesia blog: blog.cartesia.ai/p/mamba-3