Fresh stories
Codex raises weekly and hourly limits to 100% after 5 million users
OpenAI restored Codex weekly and hourly quotas across paid ChatGPT plans after Tibo Sottiaux said the product hit 5 million users. Watch for long-running QA loops, migration PRs, and remote desktop sessions that can still burn through quotas fast.


Opus 4.8 users report token burn, failed tool calls, and DeepSWE gaps
Three days after Opus 4.8 launched, new tests and field reports added failed tool calls, Bash-specific breakdowns, and higher token burn to the complaint list. Users report materially worse cost and stability in long coding sessions, while DeepSWE and GBA Eval point in different directions.
CopilotKit integrates Claude Agent SDK with AG-UI for React and mobile frontends
CopilotKit shipped an AG-UI integration that streams Claude Agent SDK agents into web and mobile frontends with generative UI and approval checkpoints. The adapter lets teams embed terminal-first Claude agents in React, Vue, Angular, and React Native without rewriting transport or state plumbing.


MiniMax M3 launches with 1M context and 59.0 SWE-Bench Pro
MiniMax shipped M3 with a 1M-token context window, native multimodal input, and frontier coding claims across SWE-Bench Pro, Terminal Bench, and MCP Atlas. It also appeared on OpenRouter, Ollama Cloud, Venice, Hermes, Cline, Together, and Arena on day one.

Developers report Codex beats Claude Code on DeepSWE, token burn, and multi-hour /goal sessions
Independent users compared GPT-5.5/Codex with Opus 4.8/Claude Code using DeepSWE cost charts, GBA Eval runs, and long coding sessions. The split matters because engineers choosing a daily coding stack now have external quality-versus-cost evidence instead of only vendor launch claims.
Coding-agent builders add shared memory, provider routing, and app launchers
Grok Imagine Video 1.5 adds fal and Venice API access after xAI rollout
OpenClaw adds Auto exec approvals with guardian-agent review
Top stories this week
Opus 4.8 users report write failures, sycophancy, and 58% DeepSWE
Two days after launch, users and benchmarks pointed to write failures, sycophancy, lower security recall, and a 58% DeepSWE result. GPT-5.5 still leads on cost, output tokens, and pass@1 in shared coding-agent tests, so compare both before switching.


Codex community ships /dynamic swarms, session lifecycles, and model routing
Builders added /dynamic orchestration, custom-model routing, and repo runbooks around Codex as users exposed new session lifecycle controls in the app. That makes Codex a better fit for long-running, multi-context coding work.

Hermes ecosystem ships Web UI, Control Room, and 14% lower read_file tokens
Builders released a chat-first Web UI and a multi-agent Control Room template around Hermes Agent, while core updates cut read_file input tokens by 14% and fixed TUI startup hangs. Use the new controls to manage local multi-agent setups while reducing routine token burn.

Pi ecosystem adds /goal tasks, acceptance gates, and Lovely Dev Tools
Three independent Pi builders shipped a goal runner, contract-style subagent acceptance gates, and a new Lovely Dev Tools extension in the same window. That gives Pi users more deterministic long-running loops and cleaner local tool interfaces without starting from an empty harness.

OpenRouter launches Guardrails with budget caps, ZDR, and prompt-injection filters
OpenRouter released Guardrails to apply budget limits, provider restrictions, zero-data-retention rules, prompt-injection defense, and DLP checks across routed traffic. Google Model Armor and Lakera Guard connectors are in beta, so plan around limited availability.









