Fresh stories

Codex app adds /goal for long-running React Doctor and iOS runs
OpenAI staff said /goal is now available in the Codex app, and users posted long-running runs that fixed React Doctor scores, built iOS features, and queued weekend tasks. The update moves Codex from CLI-only planning to persistent, steerable work sessions.

Local users report DeepSeek V4 Flash, Qwen 3.6, and Gemma 4 at 40-200 tok/s on Macs and 3090s
Developers posted new local-model measurements for DS4, Qwen 3.6, and Gemma 4: about 40 tok/s on an M3 Ultra, 70+ tok/s on MacBooks with MPS, and 120-200 tok/s for Qwen3.6-27B on a single RTX 3090. The numbers suggest coding-capable local runs are moving from demos toward regular use.
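For readers trying to reproduce these figures, here is a minimal timing harness. The `fake_generate` stand-in is a placeholder so the sketch runs without a model; swap in a real llama.cpp, MLX, or transformers call to get comparable numbers.

```python
import time

def decode_tok_per_s(generate, prompt, max_new_tokens):
    """Time one generation call and return decode throughput in tok/s.

    Prompt-processing time is included, so warm up first and use long
    generations for numbers comparable to the reports above.
    """
    start = time.perf_counter()
    generate(prompt, max_new_tokens)
    elapsed = time.perf_counter() - start
    return max_new_tokens / elapsed

# Stand-in generator so the sketch runs without a model; replace with a
# real local-model call (llama.cpp / MLX / transformers) to benchmark.
def fake_generate(prompt, max_new_tokens):
    time.sleep(0.05)  # pretend decoding takes 50 ms
    return "tok " * max_new_tokens

rate = decode_tok_per_s(fake_generate, "write fizzbuzz", 64)
print(f"{rate:.0f} tok/s")
```

Note that single-run numbers are noisy; the community reports above are best read as medians over several long generations.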

Hermes Agent adds LINE gateway with `hermes update` support
Hermes Agent added an official LINE gateway and OpenRouter published Pareto Code setup docs while users shared Discord and mobile SSH/TUI workflows. The change matters because Hermes is moving from ranking chatter into more concrete distribution channels and repeatable operator setups.



GPT-5.5 users report 3.3M cached tokens and 2.5x /fast credits
Engineers shared fresh measurements on GPT-5.5 cache reuse, /fast pricing, and bug-finding budgets after comparison posts for GPT-5.5 and Opus 4.7 led the coding round-up. The reports suggest Codex cost and quality now swing on cache behavior and effort settings as much as on list prices.
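The cache effect is easy to see with back-of-envelope arithmetic. The per-token prices below are illustrative placeholders, not GPT-5.5 list prices; the point is how a 3.3M-token cache hit reshapes the bill.

```python
# Back-of-envelope input-cost model showing why cache reuse now moves
# the bill as much as list price. Prices are assumed placeholders.
UNCACHED_PER_M = 10.0  # $ per 1M fresh input tokens (assumed)
CACHED_PER_M = 1.0     # $ per 1M cached input tokens (assumed 90% discount)

def input_cost(total_tokens, cached_tokens):
    """Dollar cost of one request's input, split by cache status."""
    fresh = total_tokens - cached_tokens
    return (fresh * UNCACHED_PER_M + cached_tokens * CACHED_PER_M) / 1e6

# A long session reusing 3.3M cached tokens out of 4M total input:
with_cache = input_cost(4_000_000, 3_300_000)  # 0.7*10 + 3.3*1 = $10.30
no_cache = input_cost(4_000_000, 0)            # 4.0*10        = $40.00
```

Under these assumed rates, losing the cache roughly quadruples input spend, which is why users are tracking cache-hit behavior alongside /fast credit multipliers.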

Pi community ships pi-treebase, Miko voice mode, and OpenCode Go guides
Builders shipped pi-treebase, a Miko voice mode for pi-listens, devrage support, and a Japanese OpenCode Go guide after the first Pi extension burst. The releases arrive as Pi’s provider abstraction gets stress-tested by OpenClaw-scale multi-provider use.

Amp Neo limits beta access after sqs says the team paused expansion for stability
OpenCode adds Ring 2.6 1T with 256K context and free limited-time access
Crabbox 0.11.0 adds Google Cloud provider and repo-local job workflows

Top stories this week
GPT-5.5 vs Opus 4.7: users compare plan mode, frontend output, and 120K-context use
User posts and HN threads compared GPT-5.5 and Opus 4.7 across plan mode, frontend work, and 120K-context sessions. The split results mean token burn and instruction discipline matter as much as raw benchmark scores.


Pi community ships `pi-listens`, `pi-kanban`, and `pi-codex-conversion` in one-day extension burst
Independent Pi builders shipped a voice layer, a kanban and observability dashboard, a Codex-conversion tool with `apply_patch`, and smaller UI extensions in the same window. The burst matters because it turns Pi from a single coding agent into a real local-first extension ecosystem with voice, review, and workflow primitives.

OpenRouter launches Pareto Code with min_coding_score tiers and Nitro routing
OpenRouter released Pareto Code, which routes requests to the cheapest coding model above a chosen score threshold and can re-rank for speed with Nitro. Use the API to trade cost against latency with benchmark-based routing controls.
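A minimal request sketch, assuming Pareto Code is exposed through OpenRouter's OpenAI-compatible chat endpoint: the `openrouter/pareto-code` slug and the placement of `min_coding_score` in the request schema are guesses based on the story, while `:nitro` is OpenRouter's existing speed-sorted routing suffix.

```python
import json

# Hypothetical payload for Pareto Code routing. The model slug and the
# `min_coding_score` field come from the story; where they live in the
# request schema is an assumption -- check OpenRouter's docs before use.
payload = {
    "model": "openrouter/pareto-code",  # assumed slug
    "messages": [
        {"role": "user", "content": "Write a binary search in Go."}
    ],
    "provider": {
        # Route to the cheapest model at or above this benchmark score.
        "min_coding_score": 70,
    },
}

# Re-rank the eligible pool for speed instead of price by appending
# OpenRouter's :nitro routing suffix to the slug.
fast_payload = {**payload, "model": payload["model"] + ":nitro"}

body = json.dumps(payload)
# Send with: POST https://openrouter.ai/api/v1/chat/completions
# plus an Authorization: Bearer $OPENROUTER_API_KEY header.
```

Raising the score threshold shrinks the candidate pool and usually raises cost; the Nitro variant trades some of that saving back for latency.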

Claude Code guide fixes hallucinated SHAs with adaptive thinking off and effort=high
A Claude Code guide tied hallucinated package names, API versions, and SHAs to zero-thinking turns and recommended config changes to force fixed reasoning budgets and higher effort. HN discussion and user reports suggest the workaround is being used against a broader reliability regression, not just one bad prompt.
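The guide's recommended setup can be sketched as environment configuration. `MAX_THINKING_TOKENS` is a documented Claude Code environment variable for pinning the thinking budget; the guide's "effort=high" knob is named here after the story and is an assumption — verify the exact setting name against your version's settings reference.

```shell
# Pin a fixed thinking budget so turns never run with zero thinking,
# which the guide ties to hallucinated package names and SHAs.
# MAX_THINKING_TOKENS is a documented Claude Code env var; the budget
# value 8192 is an illustrative choice, not the guide's exact number.
export MAX_THINKING_TOKENS=8192
echo "thinking budget: $MAX_THINKING_TOKENS"
```

Set this in the shell profile (or the project's `.env`) that launches Claude Code so every session picks it up.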

ERNIE 5.1 Preview ranks No. 4 on Search Arena and claims 6% pretraining cost
Baidu pushed ERNIE 5.1 Preview with new leaderboard claims, including No. 4 on Search Arena and No. 13 on LMArena Text. Treat the 6% pretraining cost claim cautiously until an independent technical report confirms it.









