OpenAI Codex CLI PR 13212 adds /fast tier – GPT‑5.1 leaves March 11

Executive Summary

A Codex CLI pull request (PR 13212) adds a persistent Fast mode. The TUI gains a /fast command, the setting is stored locally in codex-core, and requests sent with Fast mode enabled carry service_tier=priority, which implies a first-class routing tier rather than a client-side trick. The diff also shows a ServiceTier enum (standard, fast) and hard-coded help text referencing “Fast mode for GPT‑5.4,” which turns a latency toggle into a rollout breadcrumb for a new model name. None of this is an official launch artifact, but it reads like plumbing being wired ahead of a model swap.

• GPT‑5.4 surfacing: sightings span a ChatGPT model picker option, an alpha-gpt-5.4 entry near alpha-gpt-5.3-codex in a models listing, and the Codex CLI string itself; still no public model card or benchmarks.
• ChatGPT churn: a model menu screenshot tags GPT‑5.1 Instant and GPT‑5.1 Thinking as “Leaving on March 11,” while newer 5.2/mini variants appear alongside.
• Cognition SWE‑1.6 preview: claims 51.7% SWE‑Bench Pro at 950 tok/s via scaled RL (same pre-trained base); early access only; Cognition flags overthinking/self-verification as remaining issues.

Codex CLI “Fast mode” + GPT‑5.4 breadcrumbs (priority tier + model churn signals)

Codex CLI code changes point to a new /fast toggle that routes requests to a priority service tier, alongside repeated GPT‑5.4 breadcrumbs. Together they hint at imminent model and tier packaging changes that would affect latency, cost, and agent throughput.

Cross-account chatter centers on a Codex CLI PR adding a persistent /fast toggle that sends `service_tier=priority`, with multiple sightings of “GPT‑5.4” strings and model inventory entries. This is the day’s clearest workflow-changing signal for Codex users (latency tiers + impending model swap).

Codex CLI PR adds /fast mode with a priority service tier

Codex CLI (OpenAI): A new Codex CLI pull request adds a persistent Fast mode toggle stored locally in codex-core. When Fast mode is enabled, requests include service_tier=priority, and the TUI gains a /fast slash command that persists the setting, as described in the PR summary screenshot and the linked GitHub pull request.

• Request-tier change: Fast mode explicitly flips requests to service_tier=priority, which implies a first-class latency tier split rather than a purely client-side tweak, per the notes in the PR summary screenshot.
• UI + persistence: The change adds /fast in the TUI and persists the setting on disk (mirroring how the model id is stored), as shown in the TUI diff screenshot.
• Tier taxonomy emerging: A separate snippet shows a ServiceTier enum with standard and fast values, reinforcing that Codex is formalizing “normal vs priority” routing as a product primitive.

The diff also hard-codes the help text “toggle Fast mode for GPT-5.4,” tying the UX to an upcoming model name in a way that reads like a rollout breadcrumb, as shown in the slash command snippet.
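For readers who want to picture the mechanics, here is a minimal sketch of a persisted tier toggle. It is not the code from the pull request, which lives in codex-core and is not reproduced here. The only details taken from the reporting are the two enum values (standard, fast) and the service_tier=priority request field. The function names, the JSON config layout, the "model" and "input" payload fields, and the model string are all hypothetical and exist only for illustration.

```python
import json
from enum import Enum
from pathlib import Path


class ServiceTier(Enum):
    """Client-side tier names reported in the diff (standard, fast)."""
    STANDARD = "standard"
    FAST = "fast"


# Assumption: only the fast tier adds a field to the request, and that
# field is service_tier=priority, as the PR summary describes.
WIRE_VALUE = {ServiceTier.FAST: "priority"}


def load_tier(config_path: Path) -> ServiceTier:
    """Read the persisted tier; default to standard when nothing is saved."""
    if not config_path.exists():
        return ServiceTier.STANDARD
    data = json.loads(config_path.read_text())
    return ServiceTier(data.get("service_tier", ServiceTier.STANDARD.value))


def save_tier(config_path: Path, tier: ServiceTier) -> None:
    """Persist the tier to a hypothetical JSON config file."""
    config_path.write_text(json.dumps({"service_tier": tier.value}))


def toggle_fast(config_path: Path) -> ServiceTier:
    """What a /fast handler might do: flip the tier and persist the result."""
    current = load_tier(config_path)
    new = ServiceTier.STANDARD if current is ServiceTier.FAST else ServiceTier.FAST
    save_tier(config_path, new)
    return new


def build_request(model: str, prompt: str, tier: ServiceTier) -> dict:
    """Build a request payload, adding service_tier only for the fast tier."""
    payload = {"model": model, "input": prompt}
    wire = WIRE_VALUE.get(tier)
    if wire is not None:
        payload["service_tier"] = wire
    return payload
```

In this sketch, calling toggle_fast once on a fresh config file saves the fast tier, and every later build_request call that reads the file through load_tier carries service_tier set to "priority" until the toggle is flipped back. That server-visible field is what separates a routing tier from a client-side tweak.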

Table of Contents

⚡ Codex CLI “Fast mode” + GPT‑5.4 breadcrumbs (priority tier + model churn signals)

Codex CLI PR adds /fast mode with a priority service tier

GPT-5.4 breadcrumbs expand across UI and model inventories

ChatGPT UI flags GPT-5.1 models as leaving March 11

🧑‍💻 Codex in the wild: speed vs overthinking, subagent ergonomics, and reliability gaps

Codex can “finish the plan” while missing plan items

Codex used for phased refactor with acceptance tests held constant

Codex 0.106 reportedly causes connectivity issues; 0.105 seen as stable

Codex team asks what to fix next after speed and frontend work

Codex Xhigh feels reliable for planning, but slow on some runs

Some builders default to Codex, then switch to Claude for UI loops

Codex subagents show new naming conventions in logs

Uncle Bob reports Codex is more interruption-tolerant than Claude

Under-specified app builds still produce “slop” without strong human taste

Codex open source program criteria gets discussed publicly

🧠 Claude memory portability: import/export as a cross-assistant workflow primitive

Claude ships a copy-paste memory migration flow for paid users

A concrete switch workflow: seed Claude Memory from long-term ChatGPT use

Builders push for memory export/import portability across assistants

🧪 Agentic coding model race: Cognition’s SWE‑1.6 preview (RL scale + 950 tok/s)

Cognition previews SWE‑1.6: big SWE‑Bench Pro jump without slowing inference

🕹️ Running lots of agents: harnesses, handoffs, and “agents on their own machines”

RepoPrompt adds compressed handoff payloads to move a session between agents

A Cursor cloud agent was prompted to build a full Windows VM and snapshot it

Hermes Agent adds ChatGPT/Codex OAuth subscription support

WezTerm reportedly leaks memory under many-agent workloads, prompting FrankenTerm

A tutorial claims OpenClaw can be hosted on an old Android phone

Mac Mini stockouts in NYC get attributed to OpenClaw demand

OpenClaw is claimed to have passed React in GitHub stars

Phone-to-SSH becomes a lightweight control plane for coding agents

🔌 MCP + agent UI: interactive “mini-apps” rendered inside chat

CopilotKit’s MCP Apps playground shows interactive mini-apps rendered inside chat

🗂️ Context engineering becomes architecture: file-system memories + evidence of docs decay

Everything is Context: agentic file-system abstraction for memory, tools, and provenance

OSS study: only ~5% of repos use AGENTS.md-style context files, and many never change

🧩 Skills & extensions: giving agents new surfaces (Electron apps, orchestration skills)

Vercel agent-browser adds Electron skill to control desktop apps like Figma and VS Code

Warp explores /orchestrate skill to automatically plan and coordinate subagents

📏 Agent reliability & evaluation: deep-agent tests, confabulations, and leading questions

LangChain lays out a test strategy for “deep agents,” not one-shot LLM prompts

Leading questions can manufacture “bugs” in agent-assisted debugging

Hinton: “hallucinations” are better understood as confabulations

AI timelines still collide with real-world rollout constraints

🛡️ AI + defense governance: Anthropic standoff aftermath and “AI in the kill chain” reporting

WSJ: CENTCOM reportedly used Claude for strike support despite ban, with 6‑month phaseout

Amodei reiterates two red lines: domestic surveillance and fully autonomous weapons

Altman AMA frames OpenAI’s DoW deal as de-escalation with three flexible red lines

Supply chain risk designation is treated as a new political-risk cost for AI infra

Mollick: assume government models lag yours, and inference constraints are similar

“Virtually no progress” vs later reports: negotiation state may have shifted within 24 hours

🛠️ Builder utilities: scripted demos, dev-environment assistants, and streaming UX primitives

Readout adds an “Assistant” that can answer fast and run cleanup actions

FastAPI 0.135.0 lands Server-Sent Events support for streaming endpoints

Vercel Labs ships webreel for scripted, never-stale product demo videos

WezTerm hits 182GB memory under heavy agent usage; FrankenTerm starts as a workaround

💼 Business & enterprise signals: revenue scale, data moats, and agents as ‘employees’

ChatGPT scale screenshot claims 900M weekly actives and 50M paid subs

OpenAI pegs total revenue above $20B and downplays DoW contract size

Coinbase frames agents as workers, backed by stablecoin wallets for payments

Block layoffs are framed as AI-driven productivity, with a positive market reaction

Ellison’s moat thesis: proprietary data over model advantages

ARK chart: AI capex share of GDP is tracking above prior tech waves

💾 Hardware & systems acceleration: baked-in inference chips + storage-bottleneck workarounds

DeepSeek DualPath analysis argues KV-cache I/O, not FLOPs, is the agent bottleneck

Taalas pitches HC1 chips with “baked-in” models running ~17,000 tok/s

🏗️ Infra reliability under conflict: UAE AZ fire, multi‑AZ failover, and energy-cost shock risks

AWS UAE availability zone mec1-az2 hit by “objects,” causing fire and outage

Data centers discussed as potential conflict targets, not just collateral

Vercel details how dxb1 traffic and Fluid functions ride through the UAE AZ incident

Hormuz energy-risk thread ties oil/LNG shocks to inference and chip input costs

Multi-AZ resilience gets a rare “actually mattered” moment in production

🧠 Model & integration watch: DeepSeek V4 timing, GLM‑5‑Code, and open‑weight agents in Notion

DeepSeek V4 rumors firm up around a “next week” release window

GLM-5-Code shows up as a distinct coding model with its own pricing

Notion Custom Agents adds an open-weight model option via MiniMax M2.5