Fresh stories
Hermes Agent ships v0.14.0 with Grok subscriptions, Codex runtime, and Windows beta
Nous Research shipped Hermes Agent v0.14.0 with Grok subscription access, Codex as an OpenAI runtime, LINE, native video generation, and a Windows beta. This matters because Hermes is moving beyond point integrations into a broader agent runtime with new access paths and deployment surfaces.

Kilo Code introduces Cloud Agent CVE and smoke-test workflows with webhook triggers
Kilo Code posted two cloud-agent automations: a webhook-driven CVE patch flow that opens PRs in parallel and a post-deploy smoke test that checks health, 2xx responses, and latency under 2 seconds. This matters because the examples show coding agents moving into CI-style remediation and production verification loops.

Codex fixes usage-limit sync bug after 2-hour subscriber lockout
OpenAI said a metering bug put many Codex subscribers at the wrong usage level for about two hours, then restored balances and waived usage from that window. This matters because the incident interrupted active sessions and showed how subscription sync failures can halt agent runs mid-task.


Gemini users report Canvas and Fast mode routing to 3.2 variants ahead of I/O
Multiple users posted reproducible steps and videos showing Gemini app UI changes, Thinking Level rollout, and Fast mode or Canvas sessions that look like 3.2 or 3.5-class routing. This matters because Google appears to be testing new model paths and app surfaces in production ahead of I/O, though the exact model names remain unconfirmed.

Claude Code users launch `/goal`, Obsidian, and audit playbooks to fight long-session drift
Independent builders published Claude Code memory and workflow scaffolding, including a `/goal` prompting guide, Obsidian-backed knowledge capture, and audit tooling for long-running agents. This matters because context compaction and stale session memory are becoming practical bottlenecks for multi-session coding workflows.

Practitioners benchmark Qwen3.6 and Gemma 4 at 40-65 tok/s on M3 Ultra, iPhone 17 Pro, and 4x A4000
New reports show Qwen3.6 and Gemma 4 running locally across Apple and Nvidia setups, with wide variance tied to context length, runtime choice, and MTP tuning. This matters because the latest open models are reaching usable agent speeds on consumer hardware, but prefill and long-context performance still cap throughput.
Pi raises minimum Node.js to 22.19.0 after Undici login breakage
TanStack AI supports AG-UI client-to-server compatibility with zero breaking changes
Top stories this week
llama.cpp provider adds in-process AI SDK support with tool calling
A new llama.cpp provider lets the AI SDK run directly inside a Node process without a separate server, while exposing reasoning, tool calling, image inputs, and prompt caching. The setup shortens local deployment paths for AI SDK apps that want llama.cpp bindings.


Codex users report 2-hour mech-interp runs and 150-hour tasks with `/goal`
Days after `/goal` workflows first surfaced, users showed the command also works in the Codex app and shared runs for SSH setup, mech-interp scripts, and recurring work that lasted hours or days. The evidence points to Codex being used as a long-running research and ops agent, though the app still lacks explicit `/goal` UI.

Codex adds remote connections for Mac mini devboxes in the ChatGPT app
OpenAI documented Codex remote connections, letting the ChatGPT app point at a separate Codex host such as a Mac mini or rented VPS. Try it for long runs that need to stay alive off-device or for phone-first coding sessions.

Claude Code users report tmux claude-p wrappers and cache fixes after June 15
Developers published two Claude Code workarounds after users flagged metered `-p` mode: a tmux-backed `claude-p` wrapper and a setting to stop attribution headers from breaking prompt caching. Both reduce repeated-token spend in agent-heavy runs.

Zero launches systems language for agents after 3,000 agent tasks
Triangle Company introduced Zero as a systems language aimed at agent-friendly tooling and said the compiler mostly self-hosts after about 3,000 agent tasks in three days. Early inspection praised the tiny C compiler but found broken Mach-O lowering and no fuzz tests, so the release looks experimental rather than production-ready.



