Claude Opus 4.5 wins 54% coder poll – 67% cost drop


Executive Summary

Following this week’s eval charts, today’s twist is adoption and theory-of-the-case: in a 1,142‑dev poll, 54% say Claude Opus 4.5 now handles most of their coding, while GPT‑5 Codex/max sits at 23% and Gemini 3 at 12%. That’s happening alongside a WeirdML jump from roughly 42.8% to 63.7% accuracy and a weighted cost slide from $27 to $9, a two‑thirds price cut for “thinking‑grade” runs.

Analysts are calling this a “black swan” cost–performance point because Opus 4.5 behaves like a premium reasoning model but is priced like yesterday’s mid‑tier. New SWE‑bench analysis makes it weirder still: thinking mode barely moves the needle, implying the base model itself is doing most of the work, unlike OpenAI’s o‑series where long reasoning is the main booster. If that holds in your stack, the practical advice is simple: rerun your harness with Opus 4.5 as the default brain for complex agents and coding, then add heavy thinking selectively instead of assuming it’s required.

In parallel, MiniMax’s open MiniMax‑M2 is positioning as the scrappy alternative—a 230B MoE with 10B active parameters and interleaved thinking that pushes GAIA to 75.7—so multi‑model shops have a serious open contender to A/B before they anoint Opus 4.5 as the only game in town.


Feature Spotlight

Feature: ChatGPT ads pilot signals monetization pivot

ChatGPT Android beta (1.2025.329) contains “ads feature” hooks (bazaar content, search ad, carousel), pointing to a search ads pilot; community debates Bing infra, free tier funding, and trust trade‑offs.


📣 Feature: ChatGPT ads pilot signals monetization pivot

Cross‑account leak: new “ads feature” strings in ChatGPT Android beta. Threads debate search‑style sponsored units vs trust impact. Mostly product strings and analyst takes; excludes other business items from this section.

ChatGPT Android beta exposes new search‑ads feature plumbing

New strings in ChatGPT’s Android 1.2025.329 beta show an internal “ads feature” with types like ApiSearchAd, SearchAdsCarousel and ApiBazaarContentWrapper, pointing to a coming search‑style ad layer rather than random chat banners android leak breaking alert.

Android ads feature strings

APK diff watchers say these classes only appeared in the latest build and sit under com.openai.feature.ads.data.*, implying a structured pipeline for targeted units such as carousels or marketplace cards around web/search answers rather than generic interstitials explainer thread feature article. Analysts speculate OpenAI could lean on Microsoft’s existing Bing Ads infra for serving and targeting, given their Azure partnership and “search ad” naming breaking alert. One read is that Android will be the first client to light this up—raising the odd possibility that Android gets a feature ahead of iOS for once, as users joke in replies android first joke. With estimates of ~800M weekly users and 2.5B prompts per day, even a light ad load limited to commercial queries could turn ChatGPT into a sizable performance‑based ad channel while keeping paid tiers ad‑free scale analysis.

Builders warn ChatGPT ads could erode trust in AI answers

The leak of a ChatGPT ads feature has triggered immediate backlash from power users who worry that any blending of sponsored units into answers will make people question whether advice is “genuine” or paid trust concern.

User trust worries

Several threads argue that OpenAI’s huge infra burn and debt‑funded data center buildout make an ad pivot understandable, but say this is exactly what risks the product’s core value: confidence that reasoning isn’t biased toward advertisers trust concern HSBC forecast. Some users are explicitly urging Altman “don’t do it” and framing this as a one‑way loss of credibility if ads are interleaved with normal completions rather than clearly separated user pushback. Creators are already mocking the likely UX with sample chats where every mundane complaint (“my boss yelled at me”, “my back hurts”) gets answered with brand plugs for headphones, routers, productivity apps, chairs and editors, highlighting how subtle product mentions could seep into everyday advice ad parody. Others are taking a wait‑and‑see stance, hoping OpenAI restricts ads to obvious shopping or recommendation flows and keeps enterprise and paid tiers entirely ad‑free, but even neutrals are polling their audiences about whether this feels like a “pivot to search ads” moment for ChatGPT poll question.


🧩 Open MiniMax‑M2 focused on agents and coding

New open weights positioned for agentic coding: full‑attention MoE (230B, 10B active), interleaved thinking persisted across turns. Today’s posts emphasize how to use it inside coding IDEs and early agent gains.

Interleaved thinking in MiniMax-M2 lifts GAIA and BrowseComp agent scores

MiniMax and independent analysts are converging on interleaved thinking as M2’s main superpower: when you persist its <think>...</think> blocks across tool calls and later turns, agent benchmarks jump sharply. With interleaved traces kept in history, GAIA rises to 75.7 vs 67.9 without them, and BrowseComp goes to 44.0 vs 31.4, showing 3–40 point swings depending on task mix interleaved explainer.

interleaved thinking graphic

The workflow is: plan in a short think block, call tools like browser/terminal, read results, then write a new think block that updates the plan before acting again; M2 is trained to treat these as working memory, not disposable text interleaved explainer. MiniMax’s own guidance stresses not stripping or hiding the reasoning from subsequent calls and only summarizing older segments when context pressure is high, since removing these notes collapses long‑horizon reliability (interleaved blog). For anyone building agents, this is a concrete implementation recipe: keep raw think text in the conversation state and let M2 compound its own hypotheses and error notes over many steps instead of trying to be clever with aggressive history pruning.
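In practice the pattern is a few lines of agent-loop plumbing. The sketch below assumes an OpenAI‑compatible chat endpoint; the base URL, model id, tool schemas and run_tool helper are placeholders rather than official MiniMax values, but the one decision that matters is visible: append the assistant turn, think text included, back into the message history before the next call.

```python
# Sketch of the "keep the think blocks" loop; base_url, MODEL, TOOLS and
# run_tool are placeholders/stubs, not official MiniMax values.
from openai import OpenAI

client = OpenAI(base_url="https://your-minimax-compatible-endpoint/v1", api_key="...")
MODEL = "MiniMax-M2"  # placeholder id

TOOLS = []                    # your browser/terminal tool schemas go here
def run_tool(call) -> str:    # stub: execute one tool call, return its output text
    return "tool output"

messages = [{"role": "user", "content": "Fix the failing test in this repo."}]

for _ in range(8):  # bounded agent loop
    resp = client.chat.completions.create(model=MODEL, messages=messages, tools=TOOLS)
    msg = resp.choices[0].message

    # The part that matters: keep the assistant turn verbatim, <think> text and all,
    # so the next call sees the model's own working memory.
    assistant_turn = {"role": "assistant", "content": msg.content or ""}
    if msg.tool_calls:
        assistant_turn["tool_calls"] = [tc.model_dump() for tc in msg.tool_calls]
    messages.append(assistant_turn)

    if not msg.tool_calls:
        break  # final answer reached

    for call in msg.tool_calls:
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": run_tool(call)})

# Under context pressure, summarize *older* think segments rather than deleting
# the recent ones the model is still building on.
```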

MiniMax-M2 open MoE goes all‑in on full attention for agentic coding

MiniMax is positioning MiniMax‑M2 as an open, agent‑first coding model: a 230B‑parameter MoE with ~10B active parameters per token, deliberately built as a full‑attention model rather than using linear/sparse attention shortcuts. Their launch roundup bundles a Hugging Face release, GitHub repo and API docs, plus a detailed blog on why they optimized for quality and agent reliability over theoretical attention efficiency resources roundup (model card, github repo).

The team argues that current "efficient" attention schemes haven’t yet beaten full attention on real workloads, so they spent their finite compute on a smaller but sharper architecture tuned for code, tools and long chains rather than chasing maximal parameter counts (attention blog). For builders, the practical upside is that M2 drops into existing stacks via Anthropic/OpenAI‑style chat APIs with tool use and prompt caching guidance already documented (api guide), making it a realistic candidate to A/B against proprietary coding models inside existing agents and IDE plugins without re‑architecting your infrastructure.
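A minimal A/B harness can be as small as swapping the base URL and model id while everything downstream stays the same; the endpoints and model names below are placeholders, not published values.

```python
# Same prompts, two OpenAI-compatible backends; URLs and model ids are placeholders.
from openai import OpenAI

BACKENDS = {
    "minimax-m2": {"base_url": "https://your-minimax-endpoint/v1", "model": "MiniMax-M2"},
    "incumbent":  {"base_url": "https://your-llm-gateway/v1",      "model": "your-coding-model"},
}

def ask(backend: str, prompt: str) -> str:
    cfg = BACKENDS[backend]
    client = OpenAI(base_url=cfg["base_url"], api_key="...")
    resp = client.chat.completions.create(
        model=cfg["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

for name in BACKENDS:
    print(name, "->", ask(name, "Write a pytest for parse_config().")[:120])
```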

MiniMax-M2 is already powering Claude Code-style IDE agents

Early users are wiring MiniMax‑M2 straight into coding environments: one GitHubProjects experiment sets M2 as the active model inside Claude Code in VS Code to see how it behaves on real repo navigation, search, and multi‑file edits rather than synthetic tasks claude code demo. At the same time, practitioners report that M2 “shines on front‑end and game work because it keeps a running plan while coding,” using interleaved think blocks to scaffold, test, and fix iteratively without forgetting earlier steps coding usage.

Because M2 is exposed through Anthropic/OpenAI‑compatible chat endpoints with tool use, teams can drop it behind existing agent harnesses or MCP tools—the main adaptation is to preserve its think text between turns and tool invocations so it can track UI state, game loops, and race‑prone flows instead of restarting its reasoning every action (api guide). The emerging pattern for IDE agents is to let M2 handle high‑level planning and cross‑file reasoning, then delegate execution and diffs to specialized tools, treating its interleaved thoughts as a shared scratchpad for both the model and the human developer rather than something to be hidden or discarded.


📈 Opus 4.5 momentum: cost, capability, and dev sentiment

Continues the week’s eval race with fresh charts and usage. Today centers on WeirdML gains, pricing cuts, coding strength, and polls. Excludes any ad/monetization news (feature).

Opus 4.5 is becoming many developers’ primary coding model

A new community poll shows Claude Opus 4.5 has already become the main coding model for a majority of respondents, beating both GPT‑5 Codex and Gemini 3 for time spent in use. In a survey of 1,142 voters on X, 54% said they now use Opus 4.5 for more than half of their coding, compared with 23% for gpt‑5‑codex/max, 12% for Gemini 3, and 11% for Sonnet 4.5 coding model poll.

Coding poll results

Hands‑on builders echo the poll: Omar El Soury says Opus 4.5 is “hands‑down the best at coding in my workflows” and notes it better tracks intent and even style constraints like avoiding em‑dashes dev workflow comment. Others describe it as “Sonnet 3.5 of 2025. Try it. Do it now” review quote and remark that Anthropic “really cooked with opus 4.5” short sentiment. At the same time, some engineers stress that leadership is cyclical: GPT‑5.1 led, then Gemini 3, and now Opus 4.5 seems ahead on coding tasks, suggesting teams should keep their harnesses multi‑model rather than locking in prematurely cyclical leadership view.

Analysts label Opus 4.5 a “black swan” for cost–performance tradeoff

Analysts are starting to frame Claude Opus 4.5 as a “black swan LLM” because it combines a large capability jump with a steep price cut, breaking the usual tradeoff between quality and cost. Following up on WeirdML gains, which highlighted Opus 4.5’s 21‑point jump on the WeirdML benchmark (from ~42.8% to 63.7%) while dropping weighted cost from $27 to $9 per million tokens, Daniel Mac argues that “it shouldn’t be as capable as it is, for as cheap as it is” black swan framing.

WeirdML performance chart

The WeirdML chart circulating in the community shows previous Opus generations underperforming Sonnet, then Opus 4.5 suddenly leaping ahead at roughly one‑third of the prior price weirdml explanation. That discontinuous point has people re‑examining Anthropic’s internal advances, speculating about better data, training recipes, or architecture tweaks. For engineering leaders, the implication is direct: if these numbers hold up across their own workloads, Opus‑class “thinking” models may no longer be reserved for rare, expensive runs but could become the default for many complex agent and coding tasks without blowing the budget.

New analysis suggests Opus 4.5’s edge comes from its base model, not long “thinking”

A widely shared analysis of recent SWE‑bench results argues that Claude Opus 4.5’s strength seems to come from its underlying base model rather than from expensive inference‑time “thinking” modes. Summarizing Peter’s take, one thread notes that Opus 4.5’s SWE‑bench scores are effectively identical with and without its thinking setting, unlike OpenAI’s o‑series where long reasoning dramatically boosts performance at the cost of latency and tokens reasoning analysis.

The same thread claims OpenAI has de‑prioritized massive new pre‑trains after GPT‑4.5 under‑delivered, while Google and Anthropic continued to push base model pre‑training, leaving OpenAI temporarily ahead only when o‑style reasoning is enabled reasoning analysis. If accurate, this would reinforce the view that Opus 4.5’s cost–performance “black swan” behavior is grounded in a stronger, cheaper base model rather than special test‑time tricks—good news for teams that want high quality without paying for multi‑minute traces. It also hints at a strategic split: Anthropic leaning on base‑model quality plus light “thinking”, while OpenAI leans further into heavy reasoning modes whose latency and token costs sit less comfortably in everyday coding and product workflows today.
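If you want to test this claim on your own workload, the cheapest experiment is to run the same tasks with thinking toggled on and off and diff the scores. Here is a minimal sketch using the Anthropic Messages API’s extended‑thinking parameter; the model id is an assumption, so substitute whatever your account exposes.

```python
# Rerun the same tasks with thinking toggled; model id is an assumption.
import anthropic

client = anthropic.Anthropic()

def solve(task: str, use_thinking: bool) -> str:
    kwargs = {"thinking": {"type": "enabled", "budget_tokens": 8000}} if use_thinking else {}
    resp = client.messages.create(
        model="claude-opus-4-5",   # assumed id; use whatever your account lists
        max_tokens=16000,          # must exceed the thinking budget when enabled
        messages=[{"role": "user", "content": task}],
        **kwargs,
    )
    # Thinking arrives as separate content blocks; keep only the final text.
    return "".join(block.text for block in resp.content if block.type == "text")

task = "Here is a failing test and the diff under suspicion... propose a fix."
for flag in (False, True):
    print("with thinking" if flag else "base model", "->", solve(task, flag)[:200])
```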


🛡️ Policy and legal: AI‑assisted cyber ops and copyright risk

Two governance fronts: a US hearing on alleged Claude‑assisted cyber‑espionage, and OpenAI’s discovery loss over “books1/2” deletions. Operational takeaways for providers and enterprises. Excludes the ads pilot (feature).

Congress probes Claude-assisted Chinese cyber campaign in Dec. 17 hearing

The House Homeland Security Committee has summoned Anthropic CEO Dario Amodei, along with leaders from Google Cloud and Quantum Xchange, to a Dec. 17 hearing on what lawmakers describe as the first largely AI-orchestrated cyber‑espionage campaign against ~30 major organizations. The Chinese state‑linked group allegedly used Claude Code to plan and execute long chains of intrusion steps with minimal human guidance, targeting big tech, financial firms, chemical manufacturers and government agencies, with only a small fraction of targets actually breached so far hearing summary cyberattack context.

Congress cyberattack story

The panel wants details on how the attackers leveraged Claude, what traces they left in cloud logs, and which safeguards or monitoring Anthropic and its partners already had in place, raising the bar for logging, abuse detection and response around AI coding tools CyberScoop article. For AI providers and enterprise security teams this is a warning shot: powerful code assistants are now demonstrably part of state‑sponsored toolchains, so policies, rate limits, atypical‑usage detection and incident playbooks need to assume "machine‑speed" campaigns by default, following up on hearing plan that first flagged this Claude‑assisted operation.

OpenAI ordered to disclose internal chats on deleted pirated book datasets

A U.S. federal judge has forced OpenAI to turn over internal Slack messages and documents about why it deleted two massive book datasets (“books1” and “books2”), a key win for authors suing over alleged training on pirated works lawsuit explainer. Judge Ona Wang ruled that OpenAI waived attorney‑client privilege by first telling the court the datasets were removed for "non‑use" and later reframing the deletion as legally driven, opening up Slack channels such as “project clear” and “excise libgen” plus depositions of in‑house counsel Hollywood Reporter article.

Hollywood Reporter headline

If those chats show the company knew the data was infringing and tried to quietly scrub it, plaintiffs can argue willful infringement and pursue much higher statutory damages per book—potentially hundreds of millions or even billions of dollars—rather than ordinary licensing‑style remedies lawsuit explainer. For any lab training on web‑scraped or shadow‑library data, the message is blunt: internal discussions about dataset provenance, cleanup projects and legal risk are discoverable, so teams need real data‑governance processes and consistent external disclosures, not casual Slack threads that contradict what they tell courts and regulators.


🛠️ Agent IDEs and coding flows in practice

Mostly concrete agent/coding tooling: RepoPrompt MCP surface, LangSmith Agent Builder, grep acceleration, and vibe coding demos. Excludes evaluation methodology papers and media generation.

Google shows Gemini 3 “vibe coding” and Antigravity dev platform in action

Google streamed a "vibe coding" session from Mountain View, showing Gemini 3 building small games and visualizations in one shot inside AI Studio, and previewing the Antigravity agentic development platform for more serious builders Gemini vibe coding. In the recap, you can see Gemini 3 generate a playable flight simulator, a fishing game, and 3D dance formations from natural language prompts, then deploy them straight from the browser with almost no manual scaffolding YouTube recap.

For engineers, the point is that Gemini 3 isn’t just a chatty code helper; it’s starting to look like a UI where you describe the behavior you want and let the model handle project setup, iterations, and redeploys. Antigravity sits on top of that as a more opinionated harness for long‑running, agentic dev workflows, which is the part infra and tools teams will watch when deciding whether to invest in Google’s stack or stick with IDE‑native agents like Claude Code and Cursor.

RepoPrompt 1.5 turns into a full agent IDE with MCP hooks and parallel tabs

RepoPrompt’s 1.5 series expands it from a smart repo prompt into a mini agent IDE, with Context Builder 2.0, compose tabs, Pro Edit Agent mode, new CLI providers, and deeper MCP integration. Following up on RepoPrompt MCP, which first exposed Context Builder as an MCP sub‑agent, the changelog now shows agentic context building using Claude Code or Codex, parallel compose tabs, zero‑network MCP transport for locked‑down networks, and MCP‑driven tab management among other features RepoPrompt changelog.

RepoPrompt 1.5 changelog

On top of that, the "Step 1" discovery phase is now directly invokable as an MCP tool, so Claude Code can ask RepoPrompt to map a repo and then take over implementation in the same workflow Step as mcp tool. Builders are already treating it as a "full blown agent ide" and wiring Codex as a dedicated sub‑agent that only talks to RepoPrompt MCP tools for safer, more constrained automation agent ide comment codex subagent setup.

WarpGrep subagent accelerates coding agents with fast context search

MorphLLM’s WarpGrep is a new "fast context" subagent that slots under existing coding agents to handle all the grepping and file search, instead of sending every query through a slower, general LLM. The team reports that offloading these lookups to WarpGrep speeds up coding tasks by around 40% and cuts down on expensive tool invocations, which matters once you’re chaining many edit and search steps in an agent harness WarpGrep intro.

Early users are already asking for WarpGrep packaged as a reusable skill and tool bundle so it can be snapped into broader agent stacks alongside other MCP tools, which is exactly the kind of plug‑in component most IDE‑style agent setups need skill tools question.
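WarpGrep’s own interface isn’t spelled out in these threads, but the pattern it implements is easy to copy: give your main agent a cheap local search tool and stop paying a frontier model to grep. A rough sketch follows, with ripgrep standing in for the fast‑context service and a hypothetically named tool schema.

```python
# Not WarpGrep's API: an illustrative fast-context tool that lets the main agent
# delegate code search to a local ripgrep call instead of extra LLM round-trips.
import subprocess

def fast_context(pattern: str, path: str = ".", per_file_hits: int = 20) -> str:
    """Return file:line:match lines for the agent to read before editing."""
    proc = subprocess.run(
        ["rg", "--line-number", "--max-count", str(per_file_hits), pattern, path],
        capture_output=True, text=True,
    )
    return proc.stdout[:20_000]  # truncate so the snippet stays prompt-sized

# Exposed to the agent as an ordinary tool schema (name is hypothetical):
FAST_CONTEXT_TOOL = {
    "type": "function",
    "function": {
        "name": "fast_context",
        "description": "Find code locations matching a regex before proposing edits.",
        "parameters": {
            "type": "object",
            "properties": {"pattern": {"type": "string"}, "path": {"type": "string"}},
            "required": ["pattern"],
        },
    },
}
```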

AgentBase pitches serverless architecture for 50+ production agents per call

AgentBase’s founder walked through a serverless architecture that lets you spin up 50+ collaborating agents—with orchestration, memory, and eval loops—using a single API call instead of managing long‑lived workers yourself AgentBase workshop. Their design treats each task as data flowing through lightweight agent runtimes, with a central orchestrator, shared trajectory store, and an "observe & improve" loop that runs evals and refinements over real traces.

AgentBase architecture slide

In production, they lean on decision trees, guardrails, QA agents, and human‑in‑the‑loop checks for reliability, then use "Middle‑out" context engineering, agentic memory, and document indexing to keep prompts small while agents stay stateful AgentBase workshop. If you’re trying to scale from one or two coding agents to dozens of specialized ones (compliance, debugging, data cleaning, etc.), this kind of serverless pattern is a concrete blueprint rather than a vague multi‑agent story.

LangSmith Agent Builder gets a full Chinese walkthrough for no-code agents

The LangChain community published a full Chinese‑language walkthrough of LangSmith’s Agent Builder, covering its "non‑workflow" architecture, no‑code UI, and how observability and deployment are built in from day one Chinese walkthrough. For Chinese‑speaking teams, this fills a real gap: most prior content assumed English and direct Python use, while many companies want to prototype agents via UI first and drop to code only when needed.

LangSmith Agent Builder video

The video walks through building and debugging agents that call tools, use memory, and run evals inside LangSmith, turning it into more than a tracing dashboard—effectively an agent studio that product teams can use without waiting on infra engineers. That makes it easier to standardize on Agent Builder as the front door for production agents while still keeping LangChain under the hood for power users.

Memex AI IDE gets real-world praise for front-end refactors

Memex, the desktop "AI software engineer" app, is starting to show up in real workflows: one user used its Claude‑powered frontend skill prompt to redesign a CRM view, saying Memex "made my CRM look expensive" without manually touching layout code Memex CRM design. The screenshot shows a polished Kanban‑style deals pipeline UI that would normally take a front‑end dev a fair bit of time to design and wire up.

Memex CRM interface

Memex ships as a Windows and macOS app with its own project space, so you can treat it more like an AI IDE than a web chat; the same user is nudging others to download it and sign up with a referral code Memex site. For teams who don’t want to move wholesale to Cursor or full agent IDEs yet, it’s an example of how a standalone AI editor can slot into existing repos and take over specific flows like UI refactors or CRUD scaffolding.


🎞️ Video understanding and gen video shake‑ups

Media items today cluster around video: ByteDance Vidi2’s long‑context grounding and a new top text‑to‑video contender. Also creator tools like Midjourney Style Creator and real‑world showcases.

ByteDance’s Vidi2 beats Gemini and GPT‑5 on long‑video grounding

ByteDance introduced Vidi2, a large multimodal video model that does fine‑grained spatio‑temporal grounding and long‑horizon retrieval, and on the new VUE‑STG/VUE‑TR‑V2 benchmarks it comfortably outperforms Gemini 3 Pro Preview and GPT‑5 across ultra‑long through ultra‑short clips. benchmark thread On VUE‑STG it reaches tIOU 53.2 vs Gemini’s 27.5 and GPT‑5’s 16.4, with similarly large gaps on vIOU metrics, and on VUE‑TR‑V2 temporal retrieval it leads across ultra‑long (38.7 vs 21.2 vs 12.5) and long (48.2 vs 26.2 vs 9.4) segments. benchmark thread

Vidi2 benchmark chart

The research paper describes a multimodal encoder + language backbone that ingests text, visual frames, and audio while adaptively compressing visual tokens so memory stays bounded on both short and multi‑hour videos. ArXiv paper A companion demo shows a storyline‑based editor where Vidi2 takes a set of raw restaurant clips plus an instruction like “craft a compelling narrative with a clear emotional arc” and emits both a human‑readable script and a JSON editing plan with shot timestamps, speed, VO lines, and subtitle styling, then renders a finished TikTok‑style cut. storyline demo For builders, this is one of the clearer signals that long‑video understanding ("where in this 2‑hour file is X, and who is involved?") is becoming product‑ready rather than a toy benchmark.

Mystery model “Whisper Thunder” tops text‑to‑video leaderboard

Artificial Analysis’ text‑to‑video leaderboard now has an unknown model, “Whisper Thunder (aka David),” in the #1 slot with an ELO of 1,247, edging out Google’s Veo 3 and Veo 3.1 Preview as well as Kling 2.5 Turbo 1080p. leaderboard snapshot The confidence interval (−8/+10) is already tight after 7,344 pairwise appearances, suggesting the win isn’t a fluke even though the creator field is blank and the release date is undisclosed. leaderboard snapshot

Text to video leaderboard

Speculation in the community links Whisper Thunder to a potential upcoming xAI video model, since it quietly appeared on the board alongside other freshly added systems like Hailuo 2.3 and LTX‑2 Fast/Pro. leaderboard snapshot In parallel, Emad Mostaque is hinting that “video pixel generation [may be] ‘solved’ next year,” framing Whisper Thunder as an early sign of a coming wave of high‑quality video generators rather than an isolated outlier. video trend comment For anyone building video workflows, this is a nudge to treat the model landscape as moving fast enough that your “best” generator today might be dethroned in weeks.

Midjourney ships Style Creator for reusable visual styles

Midjourney rolled out its long‑teased Style Creator, a workflow that lets artists assemble a visual style from example images and refinements, then apply it via a one‑click “Try” button instead of crafting prompts from scratch each time. creator impressions The tool shows style tiles you can hover to preview, and when you click “Try” it generates a prompt wired to that style, which you can then reuse or tweak in normal generations. creator impressions

Early users say the feature meaningfully changes how you prompt—more like building and reusing a style system than hand‑writing every descriptor—while also calling out UX gaps: new results aren’t always front‑and‑center, real‑time previews are missing, and there’s no fine‑grained slider to control how strongly a style is applied. creator impressions For teams shipping brand‑consistent imagery, the upside is obvious: you can standardize house styles in Midjourney itself, then hand non‑experts a small palette of vetted styles instead of sending them a 200‑word prompt doc.

ComfyUI turns fortress walls into a projection canvas for AI video

ComfyUI announced the final challenge of its first Comfy Challenge season, “Echoes of Time,” where 15 community‑made videos will be projected onto the 300‑year‑old Niš Fortress in Serbia using Comfy‑generated content. challenge teaser Submissions must be at 1920×1080, 24 FPS, and at least 30 seconds long, with a theme around time and historical echoes; the deadline is December 6 and the selected works will be recorded during a live light festival in partnership with the University of Niš. challenge teaser

A follow‑up post shows real photos of the fortress and an initial projection test on its stone walls, underscoring that this isn’t a mockup but a physical mapping setup that needs technically correct aspect ratios and brightness to look good at scale. projection details

Niš fortress projection

The full brief explains that accepted artists get their clips professionally projected plus a recorded video of the show, and all participants receive a month of Comfy Cloud, making this both a creative prompt and a testbed for using AI‑generated video in real‑world architectural projection. challenge blog For video engineers, it’s a rare public spec of what generative clips must look like to survive on a literal stone canvas instead of a laptop screen.


💾 China accelerator signals: near‑memory and GPTPU claims

Hardware threads focus on domestic alternatives: near‑memory 14nm compute stacked on 18nm DRAM, and a GPTPU ASIC pitched as 1.5× A100. Concept stage but important for AI supply diversification.

Zhonghao Xinying’s GPTPU Ghana chip claims 1.5× A100 at 42% better efficiency

Chinese startup Zhonghao Xinying is advertising its first "GPTPU" ASIC, codenamed Ghana, as delivering about 1.5× Nvidia A100 throughput while drawing roughly 75% of the power, which it markets as ~42% better performance per watt. The design drops graphics logic entirely and hard‑wires the matrix operations that transformer models hammer, and the startup, founded by engineers from Google’s TPU program, says every compute block uses self‑controlled IP to avoid foreign licenses and export‑control drama. gptpu overview

gptpu headline image

Following up on china chips, which covered China’s broader push to replace constrained Nvidia parts, this is the clearest single‑chip claim yet: still well below H100 or Blackwell, but targeted squarely at Chinese clouds currently scraping together gray‑market A100s. The catch is the software stack—Xinying still has to ship compilers, kernels, and cluster tooling solid enough that major providers standardize on Ghana; without that, the chip stays a lab demo rather than a real dent in Nvidia’s share inside China.

China’s 14nm logic-on-DRAM concept pitches 120 TFLOPS at ~2 TFLOPS/W

A Chinese team is touting a near-memory AI accelerator that stacks 14 nm compute tiles directly on top of 18 nm DRAM using 3D hybrid bonding, claiming around 120 TFLOPS at roughly 60 W (about 2 TFLOPS/W) and A100‑class throughput despite using an older node. The design routes thousands of ultra‑short copper links between logic and memory to get near on‑die bandwidth and lower latency, explicitly targeting the “memory wall” where today’s GPUs often sit idle waiting for HBM or off‑package DRAM. chip concept overview

headline chip article

For AI infra planners this matters less as a 2026 buying option and more as a signal that Chinese vendors are shifting from node‑bragging to packaging and memory‑proximity tricks similar to 3D stacked cache and HBM, which is where global high‑end accelerators are heading anyway. But the paper remains a concept: there are no public benchmarks, no confirmed shipping silicon, and even optimistic comparisons put it well below Nvidia’s Blackwell parts on both raw performance and TFLOPS per watt, so the right mental model is "promising direction of travel" rather than a drop‑in A100 killer yet.


⚡ Power and capacity outlook for AI

Macro energy notes tied to AI demand: US data center pipeline near 80 GW, GDI framing of intelligence‑per‑watt, and per‑prompt energy measurements. Excludes space compute plans covered prior days.

US data center pipeline hits ~80 GW as AI power squeeze looms

US data center capacity (built + under construction + planned) has surged to roughly 80 GW as of 2025, more than doubling in a year and running 8× higher than in 2022, driven largely by AI workloads and cloud expansion data center chart.

us datacenter gw chart

Most of that 80 GW is still on paper—about 65 GW is in the "planned" bucket—so there’s a long tail of projects depending on local grid upgrades, siting approvals, and long‑lead generation buildouts rather than already‑hooked‑up capacity data center chart. Builders should read this as confirmation that electricity, not GPUs, is the hard constraint: even if Nvidia and TPU supply improve, interconnection queues, transmission build, and non‑renewable baseload (likely including nuclear) will determine how much AI compute actually comes online this decade.

New Gross Domestic Intelligence metric reframes AI race around watts, not chips

A new "Gross Domestic Intelligence" (GDI) framing argues that the real US–China AI race is about usable intelligence per watt times total available power, not just who builds more data centers or buys more GPUs gdi explainer.

gdi pie graphic

The proposal breaks GDI into three levers: model/chip intelligence per watt, grid and datacenter power available, and how much of that power actually feeds AI chips versus sitting idle in consumer NPUs and GPUs—which they suggest could be tapped via hybrid local‑plus‑cloud inference to boost national AI capacity without a single new facility gdi explainer. For practitioners this underscores two things: efficiency work (better models, compilers, and hardware) is geopolitically meaningful, and architectures that offload some inference to phones and laptops aren’t just cost hacks—they’re part of the macro AI capacity story.
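The framing reduces to a simple product, which is why both efficiency work and edge offload move the number. Here is a toy calculation with invented figures, purely to show how the levers multiply.

```python
# Toy GDI arithmetic: intelligence-per-watt x grid power x share feeding AI chips.
# All numbers are invented purely to show how the levers multiply.
def gdi(intel_per_watt: float, power_gw: float, ai_share: float) -> float:
    return intel_per_watt * (power_gw * 1e9) * ai_share

baseline      = gdi(intel_per_watt=1.0, power_gw=80, ai_share=0.30)
better_models = gdi(1.5, 80, 0.30)   # efficiency work: +50% intelligence per watt
edge_offload  = gdi(1.0, 80, 0.45)   # tap idle consumer NPUs/GPUs for inference
print(better_models / baseline, edge_offload / baseline)  # both ~1.5x the baseline
```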

Independent measurements converge on ~0.0003 kWh per LLM response

New results from the ML Energy Leaderboard suggest that a typical frontier LLM response consumes on the order of 0.0003 kWh, with large models like Mixtral‑8x22B logging about 1,161 J (0.0003225 kWh) per answer on an A100‑40GB GPU energy table.

llm energy table

Ethan Mollick notes this lines up with prior back‑of‑the‑envelope estimates, implying that even millions of daily prompts are a rounding error compared to datacenter cooling or training runs, but that at Internet scale the numbers still add up—1 million such responses is roughly 300 kWh, and 1 billion is 300 MWh energy table. For engineers, the takeaway is that optimization efforts should prioritize training and long‑context or "thinking" inference first, while still treating per‑call efficiency as a meaningful lever when you operate at multi‑billion request volumes.


🏢 Enterprise rollouts: education, hiring, and Gemini UX

A quieter but relevant set: ChatGPT for Teachers program, Copilot Career Coach mock interviews, Gemini app UX 2.0 push, and AI Studio mobile tease. Excludes ChatGPT ads (feature).

OpenAI rolls out free ChatGPT for Teachers for US K–12 schools

OpenAI has launched ChatGPT for Teachers, a free tier for verified US K–12 educators that runs through June 2027, bundling a secure workspace, FERPA‑aligned privacy controls, and admin oversight for districts and schools teacher rollout. The accompanying AI Literacy Blueprint and collaboration features position this as a verticalized enterprise entry point into education rather than a generic consumer product, giving AI teams a clear signal that OpenAI is willing to segment UX, policy, and pricing by industry (OpenAI blog post).

Google invests in Gemini App UX 2.0 and plans a macOS client

Google’s Logan Kilpatrick says there is a “huge investment” underway in Gemini App UX 2.0, with the explicit goal of winning users over in the next iteration, and confirms that a standalone macOS app is in active development ux 2 announcement.

Gemini UX 2.0 tweet

For AI product teams this signals Google treating the Gemini client as first‑class cross‑platform software rather than a thin web shell, which could affect where enterprises standardize day‑to‑day AI usage, how they handle data residency and identity on desktop, and how easily developers can plug Gemini into existing macOS‑heavy workflows.

Microsoft pilots Copilot Career Coach for avatar-led mock interviews

Microsoft is testing a Copilot Career Coach mode where users practice interviews with a face‑to‑face AI avatar that asks a fixed set of six questions while simultaneously recording their own camera feed for later review career coach screenshot.

Copilot Career Coach UI

Early strings and UX show this as a labs‑style experiment rather than a fully shipped feature, but it illustrates how Copilot is moving from passive drafting to evaluative coaching flows, which matters for AI leaders thinking about hiring workflows, assessment bias, and where interview simulation fits into internal L&D pipelines (feature explainer).

AI Studio mobile app tease points to on-device builder workflows

A new "Build Anything" app icon spotted by creators strongly suggests Google is preparing a mobile AI Studio experience, likely bringing Gemini‑backed app and agent building to phones and tablets rather than keeping it browser‑only mobile app tease.

AI Studio mobile icon

If this evolves into a full client, it would give AI engineers and PMs a way to prototype, iterate, and possibly deploy lightweight agents directly from mobile, tightening the loop between Gemini 3 APIs, AI Studio projects, and real‑world usage in field or frontline scenarios rather than only from desktop IDEs.


📚 Eval methods and synthetic data frameworks

Research‑first items: structured prompting to de‑bias evals and Meta’s Matrix P2P framework for multi‑agent synthetic data. Excludes productized coding tools and media model papers.

Meta’s Matrix P2P framework speeds multi-agent synthetic data by up to 15×

FAIR at Meta introduced Matrix, a peer‑to‑peer framework for multi‑agent synthetic data generation that replaces the usual single central orchestrator with serialized messages flowing through distributed queues. framework overview Lightweight Ray actors stay stateless, pull messages, call LLMs or tools, update a task’s state, then hand it off—so each task becomes its own mini‑orchestrator that moves independently instead of waiting behind slow siblings.

Matrix architecture diagram

On collaborative dialogue, web reasoning, and tool‑use datasets, Matrix delivers roughly 2×–15× higher throughput on the same hardware while matching the quality of more rigid, controller‑centric pipelines. framework overview For anyone generating large synthetic corpora—agent traces, browsing trajectories, tool‑calling logs—this offers a concrete pattern: store big artifacts in an object store, keep only small state bundles in messages, and let a pool of stateless workers advance tasks one step at a time. The open‑sourced Ray implementation means you can start by mirroring their patterns, then tune queue structure and message schemas around your own synthesis workloads.
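This is not Meta’s code, but the queue‑of‑stateless‑workers pattern is easy to prototype with Ray primitives: small task‑state messages circulate through a shared queue, each worker advances a task one step, and finished tasks drain into a results queue. A rough sketch with a stubbed step function:

```python
# Matrix-style pattern (not Meta's code): stateless workers pull small task-state
# messages from a shared queue, advance one step, and recirculate until done.
# Big artifacts would live in an object store; only ids/state ride in the message.
import ray
from ray.util.queue import Queue

def advance_one_step(task: dict) -> dict:
    # Stub for the real LLM/tool call that updates task["state"].
    task["step"] += 1
    return task

@ray.remote
def worker(tasks: Queue, done: Queue):
    while True:
        task = tasks.get()
        if task is None:          # poison pill: shut this worker down
            break
        task = advance_one_step(task)
        (done if task["step"] >= task["max_steps"] else tasks).put(task)

ray.init()
tasks, done = Queue(), Queue()
N = 100
for i in range(N):
    tasks.put({"id": i, "step": 0, "max_steps": 3, "state": {}})

handles = [worker.remote(tasks, done) for _ in range(8)]
finished = [done.get() for _ in range(N)]     # each task advanced 3 steps
for _ in handles:
    tasks.put(None)                           # stop the workers
```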

Structured prompting paper shows single-prompt benchmarks are underestimating LLMs

Stanford researchers connect HELM with DSPy and show that letting each model use structured prompts (direct answers, chain‑of‑thought, worked examples) lifts average accuracy by about 4 points across 7 general and medical benchmarks, and can even flip which model looks best. paper summary Most of the gain comes from enabling chain‑of‑thought rather than from expensive search‑based prompt tuning, which adds little accuracy but many tokens.

Structured prompting paper

For builders, the message is that "one fixed prompt per task" evals are systematically pessimistic: your models are probably stronger than the leaderboard if you let them show their best prompt style, and model rankings based on a shared prompt may be misleading. This lines up with calls from people like Ethan Mollick to treat "AI fails" as a null hypothesis that must be stress‑tested with the strongest available model and harness, not the weakest prompt. eval commentary If you run internal evals, this suggests adding CoT and a small prompt search over styles per model, then reporting both default‑prompt and tuned‑prompt scores so stakeholders see how much headroom is real versus artifact.
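DSPy makes the "report both numbers" habit cheap to adopt: the same signature can run as a direct predictor or with chain‑of‑thought, and you score each against the same dev set. A small sketch, with the model id and dev items as placeholders:

```python
# Score one model with a direct prompt vs chain-of-thought on the same dev set.
# Model id and the toy dev items are placeholders.
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))   # placeholder model id

direct = dspy.Predict("question -> answer")
cot    = dspy.ChainOfThought("question -> answer")

devset = [
    dspy.Example(question="What is 17 * 6?", answer="102").with_inputs("question"),
    # ... your real eval items
]

def accuracy(program) -> float:
    hits = sum(program(question=ex.question).answer.strip() == ex.answer for ex in devset)
    return hits / len(devset)

print("default prompt   :", accuracy(direct))
print("chain-of-thought :", accuracy(cot))
```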

GigaWorld‑0 turns world models into a synthetic data engine for robots

GigaAI’s GigaWorld‑0 paper frames a video‑centric world model as a "data engine" that can mass‑produce realistic robot experience, letting embodied agents train only on synthetic rollouts yet still complete real household tasks like folding laundry or moving dishes. paper summary The system first learns a powerful video model that can generate and edit robot manipulation clips—changing materials, lighting, or camera angles, and even swapping human hands for robot arms while preserving motion semantics—then couples it with 3D scene reconstruction so objects and robots move consistently across views.

GigaWorld-0 capabilities

These tools restyle cheap simulator footage to look real, generate new viewpoints to augment data diversity, and translate human demonstration videos into robot‑arm trajectories, all feeding a synthetic training stream that outperforms training on limited real data. For robotics and embodied AI teams, GigaWorld‑0 is a worked example of how to structure a synthetic data stack: a world model for appearance and view control, a 3D backbone for geometry and physics, and a curriculum that mixes edited sim, recolored real, and cross‑domain transfers instead of relying on expensive, brittle real‑world collection.

Comet’s Opik open-sources an LLM eval and tracing platform

Comet‑ML released Opik as an open‑source platform for debugging, evaluating, and monitoring LLM applications, RAG systems, and multi‑step agent workflows, with tracing and automated evaluations wired into production‑ready dashboards. tool announcement The example dashboard shows time‑series for feedback scores, number of traces, latency, and token usage, plus drill‑downs into individual runs, aimed at making it easier to see when a prompt change, model swap, or tool update quietly degrades quality.

Opik dashboard

Because Opik is a library and UI rather than a benchmark per se, it sits alongside methods like structured prompting and frameworks like Matrix: you still choose your tasks and metrics, but Opik handles wiring those metrics to actual traces and surfacing regressions over time. For teams shipping agentic systems or RAG backends, this offers a ready‑made alternative to building yet another bespoke eval dashboard—especially if you want to combine human feedback, LLM‑as‑judge scores, and operational signals (timeouts, tool errors) in one place and keep them under version control via the linked GitHub repo. GitHub repo
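Getting traces flowing is mostly decorator plumbing. Below is a minimal sketch assuming Opik’s Python track decorator, with the retriever and LLM steps stubbed out; check the linked repo if the API has moved.

```python
# Assumes Opik's `track` decorator; the retriever and answer bodies are stubs.
from opik import track

@track   # records inputs, outputs, latency and nesting as a trace
def retrieve(query: str) -> list[str]:
    return ["doc snippet 1", "doc snippet 2"]             # stand-in for your retriever

@track
def answer(query: str) -> str:
    context = retrieve(query)                             # shows up as a child span
    return f"Answer grounded in {len(context)} snippets"  # stand-in for the LLM call

print(answer("What changed in the last deploy?"))
```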


🧾 Parsing and extraction at scale

Practical data/RAG plumbing: LLM‑driven, lossless table extraction from PDFs and new open web corpora variants. Mostly concrete pipelines and demos today.

LlamaExtract mode reliably pulls large hospital tables from PDFs with 100% recall

LlamaIndex’s new LlamaExtract mode is being demoed on a Blue Shield PDF listing network coverage across 380+ California hospitals, extracting every single table row into structured output with no drops or hallucinated entries. LlamaExtract demo In the walkthrough, the team shows how naive structured-output prompting causes rows to vanish once outputs get long, while the new mode turns the document into a complete row-by-row table that can be ETL’d into a database and queried like any other dataset.


Full setup details and code are in the accompanying tutorial. blog post
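The schema side of this is ordinary Pydantic: model the table as a list of row objects so the extractor has nowhere to silently drop rows. The LlamaExtract client calls below are assumptions about the llama_cloud_services API, so defer to the tutorial for exact names.

```python
# Schema-first sketch: model the table as a list of rows so nothing can be
# silently dropped. The LlamaExtract calls are assumptions; see the tutorial.
from pydantic import BaseModel
from llama_cloud_services import LlamaExtract   # assumed import path

class HospitalRow(BaseModel):
    hospital_name: str
    county: str
    in_network: bool

class HospitalTable(BaseModel):
    rows: list[HospitalRow]   # one entry per table row

agent = LlamaExtract().create_agent(name="hospital-network", data_schema=HospitalTable)
result = agent.extract("blue_shield_network.pdf")    # assumed method name
print(len(result.data["rows"]))                      # expect all 380+ rows back
```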

AICC ships first MinerU‑processed Markdown shards of Common Crawl

OpenDataLab’s AICC project is releasing a "Markdown version of Common Crawl" where raw HTML has been parsed and cleaned into Markdown text using their MinerU pipeline, giving model builders a more usable, web‑scale corpus from day one. AICC mention The initial drop includes two shards with more promised, and is hosted in a Hugging Face repo so teams can start experimenting with higher‑quality web pretraining data or retrieval indexes without rebuilding the entire crawl stack themselves. The code and datasets are openly available for inspection and extension. GitHub repo
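Streaming a shard for a quick look is a one‑liner with the datasets library; the dataset id below is a placeholder, so grab the real one from the Hugging Face page.

```python
# Peek at a shard without downloading everything; the dataset id is a placeholder.
from datasets import load_dataset

ds = load_dataset("OpenDataLab/AICC", split="train", streaming=True)  # substitute real id
for i, record in enumerate(ds):
    print(record)   # field names (markdown text, url, ...) depend on the release
    if i >= 2:
        break
```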

DocETL proposes YAML‑defined LLM pipelines for large‑scale document ETL

Researchers behind DocETL are pushing an LLM‑driven ETL framework where you describe complex document‑processing pipelines in declarative YAML, then let the system orchestrate map‑reduce style passes, retries and validation over large corpora. DocETL comment The docs emphasize handling long, unstructured PDFs (legal, medical, social‑science) by composing operators like resolve and gather, and even include an optimizer that can search over pipeline rewrites to maximize extraction accuracy before you unleash it on your full dataset. project docs The flip side, as one practitioner notes, is that a misconfigured run on "your entire data" could have very broad impact, so teams will want strong dry‑run and sampling safeguards around it. DocETL comment
