GPT‑5.2 Pro claims Erdős #3 and #397 solves – Tao acceptance cited

Stay in the loop

Free daily newsletter & Telegram daily report

Executive Summary

Posts today attribute multiple new mathematics results to OpenAI’s GPT‑5.2 Pro; screenshots of erdosproblems.com threads are circulated as receipts, with claims that Terence Tao and others validated/accepted at least Erdős problem #3. A separate wave points to Erdős problem #397 marked “DISPROVED (LEAN),” implying a formal artifact exists; attribution varies between “GPT‑5.2 Pro” and “ChatGPT” in the cited captures, and the public threads shared via screenshots aren’t a full reproducible package yet.

• Anthropic/Claude Code: Desktop UI now exposes an Ask/Act/Plan selector; Claude Code 2.1.5 appears with no detailed CLI changelog, pushing behavior-diff tracking onto community regression reports.
• Google/UCP + Direct Offers: Universal Commerce Protocol proposes /.well-known/ucp manifests and capability-based checkout flows; Google also pilots sponsored “Direct Offers” inside AI Mode shopping, but conformance tests and auction/reporting details aren’t shown.
• Meta power: Meta cites up to 6.6 GW of nuclear PPAs by 2035 with 20‑year deal structure; firmness vs pipeline remains unclear.

Across the feed, “agent era” credibility is converging on verification layers—Lean proofs for math; provenance/review UX for code and docs—while infra and standards (power PPAs; commerce manifests) shape what agents can actually ship.

Feature: GPT‑5.2 Pro’s verified Erdős-problem streak (Tao acceptance)

GPT‑5.2 Pro is widely reported to have solved multiple long‑standing Erdős problems with solutions accepted by Terence Tao—suggesting a new phase for verifiable AI math and downstream acceleration of formal/technical research.

Multiple high-volume posts cite GPT‑5.2 Pro solving multiple Erdős Problems, with solutions accepted/validated by Terence Tao and others—framed as a step-change in AI’s ability to do new, checkable mathematics. This category is solely the math-breakthrough storyline and excludes other AI research/tooling.

Jump to Feature: GPT‑5.2 Pro’s verified Erdős-problem streak (Tao acceptance) topics

🧮 Feature: GPT‑5.2 Pro’s verified Erdős-problem streak (Tao acceptance)

GPT‑5.2 Pro is credited with solving Erdős problem #3 with Tao acceptance

Erdős problem #3 (OpenAI): Posts today claim GPT‑5.2 Pro solved Erdős problem #3 and that the result was accepted by Terence Tao, with a screenshot of the erdosproblems.com discussion used as evidence in the Erdős #3 claim.

The framing around it is explicitly “a ‘lift-off’ moment for AI science,” as stated in the Erdős #3 claim, and the credibility hinge in the thread is the “accepted by Terence Tao” assertion in the same Erdős #3 claim.

GPT‑5.2 Pro claims Erdős #3 and #397 solves – Tao acceptance cited

Executive Summary

Top links today

Feature: GPT‑5.2 Pro’s verified Erdős-problem streak (Tao acceptance)

Table of Contents

🧮 Feature: GPT‑5.2 Pro’s verified Erdős-problem streak (Tao acceptance)

GPT‑5.2 Pro is credited with solving Erdős problem #3 with Tao acceptance

GPT‑5.2 Pro is credited with solving Erdős problem #397 as “Disproved (Lean)”

The “three Erdős problems in a weekend” GPT‑5.2 Pro streak narrative spreads

Near-miss Erdős attempts are said to be giving way to genuine solves

🧩 Claude Code: Plan mode, desktop UX, and quiet point releases

Claude Code desktop UI surfaces Ask/Act/Plan mode selector

Anthropic promotion link circulates for installing Claude Code via Claude Desktop

Claude Code 2.1.5 ships with no public CLI changelog yet

ClaudeCodeLog solicits biggest improvements/regressions for Claude Code 2.1.4

🧭 Cursor agents: Grind mode, mobile agent UI, and long-run loops

Cursor Dashboard surfaces Grind mode toggle alongside multi-model picker

Cursor guide formalizes agentic TDD loop for long-running iterations

Cursor mobile agent UI appears with model selector and repo picker

🛠️ OpenCode momentum: adoption scale and provenance in diffs

OpenCode tracks which diff hunks were AI-generated, raising git provenance questions

OpenCode crosses 60K stars and keeps climbing on GitHub Trending

🏗️ Power and capacity race: nuclear PPAs, solar scale, and AI buildout signals

Meta targets up to 6.6 GW of nuclear power by 2035 to supply AI buildout

China adds 256 GW of solar in H1 2025, about 67% of global additions

Taiwan Big 6 AI server OEMs hit NT$1.59T December revenue, +35.9% YoY

Alibaba Qwen lead pegs China leapfrogging odds under 20% and cites compute limits

📐 Agentic coding practice: TDD loops, context discipline, and “tool vs skill” debates

Agentic TDD loop standardizes tests-first guardrails for coding agents

Agent build checklist emphasizes Unix tools, shared FS, and verification scripts

“Focus on the skill, not the tool” backlash grows around Ralph-style hype

Bash loop framing resurfaces as the alternative to Ralph-style autonomous loops

Design-doc-first AI coding workflow gets framed as “how FAANG ships with agents”

GPT‑5.2 “thinking level” knob is framed as a throughput vs quality control

“Dumb zone starts at ~40% usage” becomes a token-budget heuristic

“Human on the Loop” framing clarifies Ralph is not hands-off automation

“Watch your agent fix the tests” becomes a preferred iteration loop

Linux “30M LOC ≈ 500M tokens” becomes a mental model for agent-scale projects

🧩 Installables for coding agents: Ralph add-ons, skills packs, and doc fetchers

ralph-research plugin runs paper implementations as a self-improving loop

ralph-claude-code GitHub repo spikes as a Claude Code autonomous loop add-on

hyperbrowser adds /docs fetch to pull live docs into agent sessions

superpowers gains attention as a Claude Code skills pack

🛒 Agentic commerce standardization: Google’s UCP and checkout primitives

Google launches Universal Commerce Protocol (UCP) for agentic shopping across merchants

Google AI Mode pilots “Direct Offers” as sponsored deals inside conversational shopping

✅ Code quality in the agent era: review automation, provenance, and evidence

GPT-5.2 Low beats Claude Opus 4.5 High in a real code review demo

Anthropic agent evals playbook circulates: capability vs regression, graders, pass@k vs pass^k

Every introduces Proof, a markdown editor that tracks AI provenance and review status

GitHub Copilot AI review bot posts 'unable to review this pull request' error

OpenCode tracks which diff hunks were AI-generated, raising git provenance questions

Proposal: add an “Evidence” attachment to PRs for artifacts outside the repo

🧵 Running many agents: Clawdbot automation, cloud agents, and “watch it work” UIs

Clawdbot starts using subagents automatically for long-running work

Clawdbot uses Codex on Discord chatter to auto-write FAQ entries

KiloCode pushes Cloud Agents in the browser with a prebuilt starter demo

Yutori Scout adds a watch-it-work view when creating a new agent

Multi-account parallelism becomes a visible tactic for running more agents at once

🧱 Accelerated computing economics: chips, fabs, and GPU benchmarks

Nvidia free cash flow projections climb to $158.3B by FY27E

AI semiconductor value chain map connects EDA, fabs, packaging, and accelerators

Jensen Huang: the foundation of AI is accelerated computing

🔌 MCP and agent interoperability: hosts, tooling, and integration friction

Vercel Labs ships agent-browser CLI, pitching 93% less context than Playwright MCP

Playwright MCP + Ralph pairing gets traction for agent-driven browser automation

Cursor CLI publishes MCP configuration docs as a dedicated mcp.md page

“No one wrote down what MCP stands for” becomes a small MCP ecosystem signal

🧠 Training & optimization: multi-reward RL, LoRA-for-MoE, and scheduling rules

Learning-rate setting at scale: fitting rules beats muTransfer under WSD

Batch-size setting under WSD: grow batch size as training progresses

DR-LoRA grows LoRA capacity only on the MoE experts that matter

MoE “standing committee”: 2–5 experts dominate ~70% routing weight

Training on wrong traces can help generalization, via GLOW weighting

Learnable Multipliers: letting matrix scale adapt without inference overhead

Policy optimization roundup: 11 named variants to track in 2026

🧰 Dev tools around agents: browser automation CLIs, sandboxes, and context builders

Vercel Labs ships agent-browser, a Rust browser-automation CLI built for agent loops

Chrome DevTools adds per-request network throttling for targeted slow-resource testing

RepoPrompt context_builder workflow: generate a plan before GPT‑5.2 Codex writes code