Vercel's Next.js evals place Composer 2 second, ahead of Opus and Gemini despite the recent Kimi-base controversy. The result matters because it separates base-model branding from measured task performance on a real framework workflow.

Vercel's Next.js evals page frames this as agent performance on Next.js code generation and migration tasks, not a broad consumer-model leaderboard. In that setting, Composer 2 takes second place and, per the page summary, lands at a 76% success rate while beating both Opus and Gemini on the benchmark.
That matters because the benchmark is tied to a real framework workflow engineers already care about: shipping and updating Next.js apps. A repost quickly carried the result beyond Vercel's original post, helping turn a product release into a public comparison point for coding agents.
The main pushback came from posts arguing that Cursor made the wrong foundation-model choice. In one widely shared example, the critique says Composer 2 was built on Kimi K2.5 and highlights a screenshot where Kimi sits at #14 on the LMArena Code leaderboard, behind Claude, GPT-5.4, Gemini 3.1 Pro, GLM-5, and MiniMax.
But Vercel's result is a reminder that base-model rank and agent rank are not the same thing. A coding agent is a full system: prompting, planning, tool use, edit strategy, and product UX all affect outcome. Cursor itself has been leaning into that system view; in its Glass teaser, the company describes the experience as "still early" but "clearer now," pointing to a more controlled desktop interface for working with agents.
The gap between those two signals is the real story here. Composer 2 can be built on a debated base model and still score near the top on a framework-specific eval if the surrounding agent stack is good enough.
The early workflow evidence is less about replacing frontier models outright and more about specialization. One practitioner's usage note is blunt: "gpt 5.4 xhigh to plan," then "cursor composer 2 to implement," then back to GPT-5.4 to "audit + fix" before shipping a pull request.
That pattern matches the benchmark story. Composer 2 is showing up as an implementation engine inside a multi-model loop, not necessarily as the only model in the stack. The missing piece, according to an API request, is programmability: users already want Composer 2 exposed through something like OpenRouter so they can plug it into their own agents rather than keep it inside Cursor's product surface.
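If Composer 2 ever does land behind an OpenRouter-style endpoint, that plan/implement/audit loop becomes a few dozen lines of glue. Here is a minimal sketch, assuming an OpenAI-compatible chat-completions endpoint in OpenRouter's request format; both model IDs are hypothetical, since Composer 2 is not actually exposed through OpenRouter today, which is exactly the gap users are flagging:

```ts
// Sketch of the plan -> implement -> audit loop described above.
// Model IDs below are hypothetical, not confirmed OpenRouter slugs.
const API_URL = "https://openrouter.ai/api/v1/chat/completions";

async function chat(model: string, prompt: string): Promise<string> {
  const res = await fetch(API_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  });
  if (!res.ok) throw new Error(`chat failed: ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}

const task = "Add rate limiting to the /api/upload route.";

// 1. Frontier model plans the change.
const plan = await chat("openai/gpt-5.4-xhigh", `Write an implementation plan for: ${task}`);

// 2. Specialized implementation model turns the plan into a diff.
const diff = await chat("cursor/composer-2", `Implement this plan as a unified diff:\n${plan}`);

// 3. Frontier model audits and fixes before the pull request goes up.
const fixed = await chat("openai/gpt-5.4-xhigh", `Review this diff for bugs and return a corrected diff:\n${diff}`);
console.log(fixed);
```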
Vercel Emulate added a programmatic API for creating, resetting, and closing local GitHub, Vercel, and Google emulators inside automated tests. That makes deterministic integration tests easier to wire into CI and agent loops without manual setup.
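The item doesn't show Emulate's actual API surface, so the following is only a sketch of the lifecycle it describes: create an emulator once, reset it between tests for determinism, and close it when the suite ends. The emulator here is a local stub, and every helper name is an assumption rather than Emulate's real interface:

```ts
// Minimal sketch of the create/reset/close lifecycle, with a stub server
// standing in for a real emulator. All names here are assumptions.
import http from "node:http";
import { test, beforeEach, afterAll, expect } from "vitest";

function createEmulator() {
  let state: string[] = []; // in-memory stand-in for emulator state
  const server = http.createServer((req, res) => {
    if (req.method === "POST") state.push(req.url ?? "/");
    res.writeHead(201, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ objects: state.length }));
  });
  server.listen(0); // random free port
  return {
    url: () => {
      const addr = server.address() as { port: number };
      return `http://127.0.0.1:${addr.port}`;
    },
    reset: () => { state = []; }, // deterministic state for every test
    close: () => new Promise<void>((done) => server.close(() => done())),
  };
}

const emulator = createEmulator();
beforeEach(() => emulator.reset()); // wire reset into the test runner
afterAll(() => emulator.close());   // tear down once the suite ends

test("each test starts from a clean emulator", async () => {
  const res = await fetch(`${emulator.url()}/repos/acme/app/hooks`, { method: "POST" });
  expect(res.status).toBe(201);
  expect(await res.json()).toEqual({ objects: 1 });
});
```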
Release: OpenClaw shipped version 2026.3.22 with ClawHub, OpenShell plus SSH sandboxes, side-question flows, and more search and model options, then followed with a 2026.3.23 patch. Teams get a broader plugin surface, but should patch quickly and review plugin trust boundaries as the ecosystem grows.
Release: Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
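Cursor hasn't published the implementation, but the core n-gram trick is well known from tools like Google Code Search and Zoekt: intersect posting lists for the query's trigrams to shrink the candidate set, then run the real scan only over survivors. A minimal sketch of that candidate-filtering step (per-file Bloom filters and full regex-to-trigram compilation omitted):

```ts
// Trigram inverted index: trigram -> set of file ids. Illustrative only;
// this is the general technique, not Cursor's actual code.
type FileId = number;

function trigrams(text: string): Set<string> {
  const grams = new Set<string>();
  for (let i = 0; i + 3 <= text.length; i++) grams.add(text.slice(i, i + 3));
  return grams;
}

class TrigramIndex {
  private postings = new Map<string, Set<FileId>>();
  private files: string[] = [];

  add(content: string): FileId {
    const id = this.files.push(content) - 1;
    for (const gram of trigrams(content)) {
      let ids = this.postings.get(gram);
      if (!ids) this.postings.set(gram, (ids = new Set()));
      ids.add(id);
    }
    return id;
  }

  // Every trigram of a literal query must occur in any matching file, so
  // intersecting posting lists prunes candidates before the real scan.
  search(literal: string): string[] {
    let candidates: Set<FileId> | null = null;
    for (const gram of trigrams(literal)) {
      const ids = this.postings.get(gram) ?? new Set<FileId>();
      candidates = candidates === null
        ? ids
        : new Set([...candidates].filter((id) => ids.has(id)));
    }
    const pool = candidates ?? new Set(this.files.keys()); // query < 3 chars
    return [...pool]
      .map((id) => this.files[id])
      .filter((content) => content.includes(literal)); // verify, don't trust
  }
}

const index = new TrigramIndex();
index.add("export function useSearchParams() {}");
index.add("const unrelated = 42;");
console.log(index.search("useSearchParams")); // only the first file survives
```

The speedup comes from the intersection step: a literal that appears in a handful of files touches only a handful of posting lists, so the expensive scan runs over candidates instead of the whole repo.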
Breaking: ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.
Breaking: Epoch AI says GPT-5.4 Pro elicited a publishable solution to one 2019 conjecture in its FrontierMath Open Problems set, with a formal writeup planned. Treat it as an early milestone worth reproducing, not blanket evidence that frontier models can already automate math research.
Cursor's Composer 2 just took second place on the Next.js evals leaderboard, beating both Opus and Gemini. See the full rankings ↓ vercel.fyi/next-composer2
Cursor built Composer 2 on top of Kimi K2.5. Kimi K2.5 ranks #14 on LMArena Code with 1431 Elo. Behind Claude Opus 4.6. Behind Claude Sonnet 4.6. Behind GPT 5.4. Behind Gemini 3.1 Pro. Behind GLM-5. Behind MiniMax M2.7. You're telling me Cursor picked the #14 ranked …
new workflow for the weekend: - gpt 5.4 xhigh to plan - cursor composer 2 to implement - back to 5.4 xhigh to audit + fix - ship pull request - repeat