MiniMax released M2.7 on its API and agent platform with coding and office-task claims plus a self-improving training harness. Engineers should validate the benchmark gains on real workloads, especially given mixed third-party results and aggressive pricing.

MiniMax is positioning M2.7 as a production model for software work, agent teams, and office-style workflows. In its launch thread, the company claims “SOTA performance in SWE-Pro (56.22%) and Terminal Bench 2 (57.0%),” reports “97% skill adherence across 40+ complex skills,” and says the model can edit Office files across multi-turn sessions launch thread. A benchmark summary circulating alongside the launch puts M2.7 at 56.2 on SWE-Pro, 52.7 on Multi-SWE Bench, 55.6 on VIBE-Pro, 46.3 on Toolathlon, 62.7 on MM-ClawBench, and 50 on the Artificial Analysis index benchmark summary.
The access story is unusually broad on day one. MiniMax's quickstart has developers call the model through the Anthropic SDK and documents integrations with Claude Code, Cursor, Cline, Roo Code, Codex CLI, and MCP-style tooling Quick Start docs. Outside MiniMax's own platform, OpenRouter says the model is live now OpenRouter launch, Ollama added a cloud-hosted variant with direct commands for Claude Code and OpenClaw Ollama launch, and Vercel exposed both a standard model and a “high-speed” variant that it says reaches about 100 tokens per second Vercel AI Gateway.
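For teams that want to try the Anthropic-SDK path, a minimal sketch looks like the following; the base URL and model identifier here are assumptions for illustration, so confirm the exact values in MiniMax's Quick Start docs before use.

```python
# Minimal sketch of the Anthropic-SDK-compatible access path MiniMax's
# quickstart describes. The base_url and model name are assumptions, not
# confirmed values; check the Quick Start docs for the real ones.
import anthropic

client = anthropic.Anthropic(
    api_key="YOUR_MINIMAX_API_KEY",               # issued by the MiniMax platform
    base_url="https://api.minimax.io/anthropic",  # hypothetical endpoint
)

message = client.messages.create(
    model="MiniMax-M2.7",                         # hypothetical model identifier
    max_tokens=1024,
    messages=[{"role": "user", "content": "Refactor this function to be iterative."}],
)
print(message.content[0].text)
```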
MiniMax’s pricing is aggressive for a model making frontier-adjacent coding claims. Artificial Analysis says the model keeps M2.5 pricing at $0.30 per million input tokens and $1.20 per million output tokens with a 200K context window AA results, and OpenRouter listings show a 204,800-token context window at the same rates OpenRouter pricing.
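At those rates, the worst-case per-call arithmetic is easy to check; the snippet below prices one maximally filled request against the OpenRouter listing, purely as an illustration of the listed rates.

```python
# Price one full-context call at the listed rates: 204,800 input tokens
# (the OpenRouter context window) plus an assumed 4K-token reply.
in_rate, out_rate = 0.30, 1.20  # USD per million tokens
cost = 204_800 / 1e6 * in_rate + 4_000 / 1e6 * out_rate
print(f"${cost:.4f}")  # about $0.066 for a maximally filled request
```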
MiniMax’s core differentiator is not just benchmark position but the claim that M2.7 helped build the system around itself. The company says it ran the 22 competitions from OpenAI’s open-sourced MLE-Bench Lite on a single A30-class setup, with an agent harness built around “short-term memory, self-feedback, and self-optimization” self-evolving thread. After each round, the agent writes a memory file, critiques its own results, and uses that chain to guide the next iteration. Across three 24-hour runs, MiniMax says the best run earned 9 golds, 5 silvers, and 1 bronze, for a 66.6% average medal rate self-evolving thread.
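MiniMax has not published the harness code, but the loop it describes (run, write a memory file, self-critique, carry the chain into the next attempt) maps onto a simple pattern. Everything below, including run_agent and the memory format, is an assumed sketch, not MiniMax's implementation.

```python
import json
import pathlib

MEMORY = pathlib.Path("memory.jsonl")  # assumed persistent short-term memory

def run_agent(prompt: str, context: list) -> dict:
    """Stub for the underlying model/agent call; swap in a real client."""
    raise NotImplementedError

def run_round(task: str, round_id: int) -> dict:
    # Load the memory chain written by all previous rounds.
    history = ([json.loads(line) for line in MEMORY.read_text().splitlines()]
               if MEMORY.exists() else [])
    # Attempt the competition task with prior memories as context.
    result = run_agent(task, context=history)
    # Self-feedback: the agent critiques its own run log.
    critique = run_agent(f"Critique this attempt and list fixes:\n{result['log']}",
                         context=history)
    # Append the round's outcome so the next iteration builds on it.
    record = {"round": round_id, "score": result["score"], "critique": critique}
    with MEMORY.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

The reported three 24-hour runs would then amount to calling run_round repeatedly under a wall-clock budget.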
The more operational claim is that the harness also evolved. MiniMax says its internal setup “autonomously collects feedback, builds evaluation sets,” and iterates on “architecture, skills/MCP implementation, and memory mechanisms” harness details. The system diagram shows humans still setting goals, guardrails, and escalation boundaries, while the agent reads docs and logs, chains skills, generates reports, and escalates for approval rather than running fully unsupervised [img:3|iteration system].
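That division of labor suggests a simple guard pattern: the harness changes itself freely inside bounds humans set and escalates beyond them. The scope names and functions below are illustrative assumptions, not details MiniMax has published.

```python
from typing import Callable

# Scopes a human must sign off on before the harness may change them
# (illustrative names; MiniMax has not published its boundary list).
GUARDRAILED_SCOPES = {"training_config", "eval_set_definition", "mcp_permissions"}

def apply_self_change(change: dict, approve: Callable[[dict], bool]) -> bool:
    """Apply a harness-proposed change, escalating guardrailed scopes."""
    if change["scope"] in GUARDRAILED_SCOPES and not approve(change):
        return False   # escalated to a human and rejected; change is dropped
    deploy(change)     # hypothetical step that persists the change
    return True

def deploy(change: dict) -> None:
    """Stub: write the change into the harness's own configuration."""
    ...
```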
Third-party measurements support part of the launch story, especially on cost-adjusted performance. Artificial Analysis says M2.7 gained 8 points over M2.5 to reach 50 on its intelligence index, tied GLM-5 at that level, and did so at roughly $176 to run the suite versus $547 for GLM-5 AA results. It also reports a GDPval-AA Elo around 1494-1495 and says the jump was driven partly by “reduced hallucinations,” with M2.7 improving to a 34% hallucination rate and an AA-Omniscience score of +1 from -40 on M2.5 AA results AA breakdown. Vals also published a dashboard showing 60.14% on its aggregate index, 72.4% on SWE-bench, and 47.19% on Terminal-Bench 2.0, though with very uneven performance across domains like ProofBench and MedCode Vals dashboard.
But the early read is not uniformly bullish. BridgeBench says M2.7 fell from M2.5’s rank 12 to rank 19 on its vibe-coding benchmark, with drops in UI, refactor, and generation subtasks, despite M2.7’s much stronger showing on synthetic or semi-synthetic coding leaderboards BridgeBench post. That gap matters because MiniMax is marketing M2.7 for “online incidents,” tool use, and multi-step agent work, and those are exactly the cases where benchmark wins need confirmation in real repos and long-running workflows launch thread.
OpenHands introduced EvoClaw, a benchmark that reconstructs milestone DAGs from repo history to test continuous software evolution instead of isolated tasks. The first results show agents can clear single tasks yet still collapse under regressions and technical debt over longer runs.
Breaking: Malicious LiteLLM 1.82.7 and 1.82.8 releases executed .pth startup code to steal credentials and were quarantined after disclosure. Rotate secrets, audit transitive AI-tooling dependencies, and add package-age controls before letting agents install packages autonomously.
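The vector here is that CPython's site module executes any line in a site-packages .pth file that begins with "import", so a poisoned package runs code on every interpreter start. A standard-library-only audit sketch for flagging such lines:

```python
# Flag .pth lines that execute at interpreter startup. site.py runs any
# .pth line starting with "import", which is the mechanism the malicious
# LiteLLM releases reportedly abused.
import pathlib
import site

for sp in site.getsitepackages():
    for pth in pathlib.Path(sp).glob("*.pth"):
        for line in pth.read_text(errors="ignore").splitlines():
            if line.startswith("import "):
                print(f"{pth}: runs at startup -> {line[:100]}")
```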
Breaking: TurboQuant claims 6x KV-cache memory reduction and up to 8x faster attention on H100s without retraining or quality loss on long-context tasks. If those results hold in serving stacks, teams should revisit long-context cost, capacity, and vector-search design.
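To gauge what a 6x KV-cache reduction buys, back-of-envelope math on an illustrative 70B-class shape (assumed numbers, not TurboQuant's published configuration) is enough:

```python
# KV cache bytes = 2 (K and V) * layers * kv_heads * head_dim * seq_len * bytes.
# Shape numbers below are illustrative of a Llama-70B-like config.
layers, kv_heads, head_dim = 80, 8, 128
seq_len, bytes_fp16 = 128_000, 2

kv_bytes = 2 * layers * kv_heads * head_dim * seq_len * bytes_fp16
print(f"fp16 KV cache:   {kv_bytes / 2**30:.1f} GiB")      # ~39.1 GiB
print(f"at 6x reduction: {kv_bytes / 6 / 2**30:.1f} GiB")  # ~6.5 GiB
```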
Release: OpenCode is adding remote sandboxes, synced state across laptop, server, and cloud, and more product surface inside its plugin system. That makes long-running off-laptop workflows more practical, but operators should still review telemetry, sandbox, and exposure defaults.
Release: Claude Code 2.1.84 adds an opt-in PowerShell tool, new task and worktree hooks, safer MCP limits, and better startup and prompt-cache behavior. Anthropic also documented auto mode’s action classifier and added iMessage as a channel, so teams should review permissions and remote-control workflows.
During the iteration process, we also realized that the model's ability to recursively evolve its harness is equally critical. Our internal harness autonomously collects feedback, builds evaluation sets for internal tasks, and based on this continuously iterates on its own architecture, skills/MCP implementation, and memory mechanisms.
MiniMax M2.7 scores worse than M2.5 on BridgeBench. M2.5 ranked #12 with an overall score of 92.3; M2.7 ranked #19 with 88.1. UI dropped from 76.6 to 61.9, Refactor from 97.3 to 90.7, and Gen from 94.3 to 89.2. #1 on Multi-SWE Bench but #19 on BridgeBench: that's an 18-rank difference.
MiniMax has released MiniMax-M2.7, delivering GLM-5-level intelligence for less than one third of the cost. MiniMax-M2.7 from @MiniMax_AI scores 50 on the Artificial Analysis Intelligence Index, an 8-point improvement over MiniMax-M2.5, which was released one month ago.
Building a self-evolving intelligent agent model: MiniMax M2.7. "M2.7 is our first model which deeply participated in its own evolution." We believe that future AI self-evolution will gradually transition towards full autonomy, coordinating data construction, model training, ...