Cursor shipped Composer 2 with gains on CursorBench, Terminal-Bench 2.0, and SWE-bench Multilingual, plus a fast tier and an early Glass interface alpha. It resets the price-performance baseline for coding agents and shows Cursor is now a model company as much as an IDE maker.

Cursor released Composer 2 as an in-house coding model available directly in the editor, with two serving tiers. The launch thread lists standard at $0.50/M input and $2.50/M output tokens, and fast at $1.50/M input and $7.50/M output tokens; Cursor's blog post (cursor.com/blog/composer-2) says the model is included in usage pools for individual plans and paired with an “early alpha” interface.
That interface is Glass, which Cursor describes in the Glass page as a simpler environment for working with AI agents. Cursor framed the release as both a model update and a product-surface update: the model ships now, while Glass is being shared as an early alpha rather than a finished default UI.
Cursor's published numbers show a large jump over prior Composer versions: 61.3 on CursorBench, 61.7 on Terminal-Bench 2.0, and 73.7 on SWE-bench Multilingual, versus Composer 1.5 at 44.2, 47.9, and 65.9 respectively in the benchmark table. The launch post attributes the gains to “our first continued pretraining run” and reinforcement learning on “long-horizon coding tasks” that require “hundreds of actions.”
On Terminal-Bench 2.0, the comparison chart places Composer 2 above Opus 4.6 at 61.7 versus 58.0, though still behind GPT-5.4 at 75.1. That makes the claim narrower than “best coding model”: Cursor is showing frontier-adjacent scores, not category leadership across every benchmark, but it is closing the gap while moving a lot of the curve on cost.
The key engineering story is the economics. Cursor's price-performance chart places Composer 2 around GPT-5.4's CursorBench range while cutting median cost per task to roughly the low end of GPT-5.4's range and far below Opus 4.6 at high settings; in the speed-and-price chart, the fast tier also comes in much cheaper than competing fast modes.
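To make the published rates concrete, here is a quick back-of-envelope cost check at both tiers. The 200k-input/10k-output task size below is an illustrative assumption, not a measured Composer 2 workload:

```typescript
// Back-of-envelope task cost from the published per-million-token rates.
// The 200k-in / 10k-out task size is an illustrative assumption.
const tiers = {
  standard: { inputPerM: 0.5, outputPerM: 2.5 },
  fast: { inputPerM: 1.5, outputPerM: 7.5 },
};

function taskCostUSD(
  tier: keyof typeof tiers,
  inputTokens: number,
  outputTokens: number,
): number {
  const { inputPerM, outputPerM } = tiers[tier];
  return (inputTokens / 1e6) * inputPerM + (outputTokens / 1e6) * outputPerM;
}

console.log(taskCostUSD("standard", 200_000, 10_000)); // $0.125
console.log(taskCostUSD("fast", 200_000, 10_000));     // $0.375
```

Since the fast tier's rates are exactly 3x the standard tier's on both input and output, any given task costs three times as much on fast; the tier choice trades dollars purely for latency.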
Early usage reports suggest that tradeoff is already useful even when teams still prefer a stronger model for the hardest tasks. One developer wrote that for “a large codebase,” Composer 2 works well for “targeted fixes, quick refactors, and getting specific questions answered” without the long waits, while conceding “it doesn't reach the quality of GPT-5.4.” Another head-to-head, shared by Dan Shipper, had Composer 2 beating GPT-5.4 on a production-QA optimization prompt as judged by GPT-5.4 and Opus 4.6; that is anecdotal, but it is consistent with Cursor's pitch that the model is now good enough for real workflow slices rather than just cheap fallback usage.
Vercel Emulate added a programmatic API for creating, resetting, and closing local GitHub, Vercel, and Google emulators inside automated tests. That makes deterministic integration tests easier to wire into CI and agent loops without manual setup.
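For teams wiring this into CI, the shape is the usual per-suite lifecycle: start an emulator once, reset it before each test, close it at the end. The sketch below is hedged: the createEmulator/reset/close names and the handle shape are hypothetical stand-ins for whatever Vercel Emulate actually exposes, with only the vitest hooks being real.

```typescript
// Hedged sketch of emulator lifecycle in a test suite. createEmulator,
// reset, and close are hypothetical stand-ins, not the confirmed API;
// swap in the real import from Vercel Emulate's docs.
import { afterAll, beforeAll, beforeEach, expect, test } from "vitest";

interface EmulatorHandle {
  baseUrl: string;         // local address tests point their clients at
  reset(): Promise<void>;  // return to a clean, deterministic state
  close(): Promise<void>;  // release the process/port after the suite
}

// Stand-in factory so this sketch runs without the real package.
async function createEmulator(
  service: "github" | "vercel" | "google",
): Promise<EmulatorHandle> {
  const repos = new Set<string>(); // fake per-run state
  return {
    baseUrl: `http://127.0.0.1:4100/${service}`, // illustrative address
    async reset() { repos.clear(); },
    async close() { /* a real emulator would stop its server here */ },
  };
}

let github: EmulatorHandle;

beforeAll(async () => { github = await createEmulator("github"); });
beforeEach(async () => { await github.reset(); }); // clean slate per test
afterAll(async () => { await github.close(); });

test("agent code talks to the emulator, not api.github.com", () => {
  // Point the system under test at the emulator's base URL via config/env.
  process.env.GITHUB_API_URL = github.baseUrl;
  expect(process.env.GITHUB_API_URL).toContain("127.0.0.1");
});
```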
OpenClaw shipped version 2026.3.22 with ClawHub, OpenShell plus SSH sandboxes, side-question flows, and more search and model options, then followed with a 2026.3.23 patch. Teams get a broader plugin surface, but should patch quickly and review plugin trust boundaries as the ecosystem grows.
Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
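The mechanics are worth spelling out. Below is a minimal TypeScript sketch of the general trigram-plus-Bloom-filter idea: index each file's 3-grams in a small Bloom filter, then use the trigrams of a query's required literal to rule files out before running the full regex. This is the textbook technique, not Cursor's actual implementation, and all names are illustrative.

```typescript
// Minimal sketch of trigram-based candidate filtering for regex search.
// Not Cursor's implementation; the general technique only.

// Extract the set of 3-grams from a string.
function trigrams(text: string): Set<string> {
  const grams = new Set<string>();
  for (let i = 0; i + 3 <= text.length; i++) grams.add(text.slice(i, i + 3));
  return grams;
}

// A tiny Bloom filter: k hash probes into an m-bit array.
class BloomFilter {
  private bits: Uint8Array;
  constructor(private m: number, private k: number) {
    this.bits = new Uint8Array(Math.ceil(m / 8));
  }
  private hash(s: string, seed: number): number {
    let h = 2166136261 ^ seed; // FNV-1a variant, one seed per probe
    for (let i = 0; i < s.length; i++) {
      h ^= s.charCodeAt(i);
      h = Math.imul(h, 16777619);
    }
    return (h >>> 0) % this.m;
  }
  add(s: string): void {
    for (let j = 0; j < this.k; j++) {
      const bit = this.hash(s, j);
      this.bits[bit >> 3] |= 1 << (bit & 7);
    }
  }
  mayContain(s: string): boolean {
    for (let j = 0; j < this.k; j++) {
      const bit = this.hash(s, j);
      if (!(this.bits[bit >> 3] & (1 << (bit & 7)))) return false;
    }
    return true; // possibly present: false positives, never false negatives
  }
}

// Index: one Bloom filter of trigrams per file.
function buildIndex(files: Map<string, string>): Map<string, BloomFilter> {
  const index = new Map<string, BloomFilter>();
  for (const [path, content] of files) {
    const bf = new BloomFilter(1 << 16, 3);
    for (const g of trigrams(content)) bf.add(g);
    index.set(path, bf);
  }
  return index;
}

// Search: a file can only match if it contains every trigram of the query's
// required literal (real systems derive this literal from the pattern), so
// skip files whose filter rules one out; run the regex only on survivors.
function search(
  files: Map<string, string>,
  index: Map<string, BloomFilter>,
  literal: string,
  pattern: RegExp,
): string[] {
  const required = [...trigrams(literal)];
  const hits: string[] = [];
  for (const [path, content] of files) {
    const bf = index.get(path)!;
    if (required.every((g) => bf.mayContain(g))) {
      if (pattern.test(content)) hits.push(path); // full scan only here
    }
  }
  return hits;
}
```

Because the Bloom filters admit false positives but never false negatives, every skipped file is provably a non-match; only the surviving candidates pay the cost of the full regex scan, which is where the seconds-to-milliseconds speedup comes from.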
ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.
Epoch AI says GPT-5.4 Pro elicited a publishable solution to one 2019 conjecture in its FrontierMath Open Problems set, with a formal writeup planned. Treat it as an early milestone worth reproducing, not blanket evidence that frontier models can already automate math research.