Fresh stories

DeepSeek cuts input cache-hit price 90% to $0.003625 per 1M tokens
DeepSeek said cache-hit pricing across its API series is now one-tenth of launch levels, on top of the temporary V4-Pro discount through May 5. The cut lowers costs for cache-heavy long-context and agent workloads, so teams should recheck spend assumptions.
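The blended effect on a cache-heavy workload is easy to estimate. A minimal sketch of the arithmetic, using the $0.003625/1M cache-hit figure from the story; the token counts, hit ratio, and cache-miss price below are hypothetical placeholders, not DeepSeek's published numbers:

```python
# Cache-hit input price from the story: $0.003625 per 1M tokens,
# described as one-tenth of launch levels.
CACHE_HIT_PER_M = 0.003625
LAUNCH_PER_M = CACHE_HIT_PER_M * 10  # implied pre-cut level

def input_cost(tokens: int, hit_ratio: float, miss_per_m: float) -> float:
    """Blended input cost when hit_ratio of tokens hit the cache.
    miss_per_m is a placeholder cache-miss price; check DeepSeek's sheet."""
    hits = tokens * hit_ratio
    misses = tokens - hits
    return (hits * CACHE_HIT_PER_M + misses * miss_per_m) / 1_000_000

# Example: an agent loop replaying a 200k-token context 100 times at a
# 90% hit rate, with a hypothetical $0.05/1M cache-miss price.
print(round(input_cost(200_000 * 100, 0.9, 0.05), 4))
```

At these assumed rates almost all of the spend comes from the cache-miss fraction, which is why the hit ratio matters more than the headline price when rechecking budgets.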

Pi ecosystem ships computer use, `/parallel-review`, and Chrome extension templates
Independent builders shipped Pi-GUI computer use, pi-subagents parallel review, and starter templates for extensions, Docker workers, and voice add-ons. The releases add reusable computer-use, subagent, and local-runtime building blocks around the base Pi harness.

Codex app-server supports 32-64 parallel jobs and burns limits 3-5x faster
OpenAI docs say Codex image generation counts against general usage and burns included limits 3-5x faster, while users showed app-server runs with 32 or 64 parallel workers. The workflow turns bulk image or research jobs into quota-backed batches, so teams should watch usage spikes closely.
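To see why parallel image jobs can exhaust an included limit quickly, here is an illustrative back-of-envelope calculator. Only the 3-5x multiplier and the 32/64 worker counts come from the story; the budget and per-job rates are invented for the example:

```python
def hours_until_limit(included_units: float, units_per_job: float,
                      jobs_per_hour_per_worker: float, workers: int,
                      burn_multiplier: float) -> float:
    """Estimate how long an included-usage budget lasts under parallel jobs.

    burn_multiplier models image generation counting against general
    usage at a faster rate (3-5x per the OpenAI docs cited above).
    """
    burn_per_hour = (units_per_job * jobs_per_hour_per_worker
                     * workers * burn_multiplier)
    return included_units / burn_per_hour

# 64 workers at the high end (5x), with a hypothetical 10,000-unit
# budget and 6 jobs per worker per hour.
print(hours_until_limit(10_000, 1.0, 6, 64, 5))
```

The point of the sketch: at 64 workers and a 5x multiplier, burn scales 320x over a single standard-rate worker, so a budget sized for interactive use can vanish within hours.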

Anthropic fixes Claude Code harness bug tied to `HERMES.md` and `git status`
Anthropic said a third-party harness-detection bug pulled `git status` into Claude Code prompts, and it is refunding affected users with extra credits. Watch for hidden client logic that can change spend and behavior in real agent workflows.

Users report GPT-5.5 speeds up coding and cuts over-editing in low-reasoning runs
New evals and day-three user tests show GPT-5.5 performing well at low or medium reasoning, with benchmark gains over GPT-5.4 in coding-heavy use. That matters because stronger results no longer require xhigh runs, though some users still flag sycophancy.

DeepSeek V4 supports Anthropic-compatible routing into Claude Code and Cowork for ~90% lower cost
Independent guides showed DeepSeek V4 running inside Claude Cowork and Claude Code via Anthropic-compatible endpoints, and Ollama added launch commands for Claude-style wrappers. The workflow matters because teams can keep Claude-centered agent UX while sharply lowering model spend, with provider compatibility and setup still the main caveats.
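The routing pattern the guides describe usually comes down to pointing Claude Code's Anthropic-format traffic at a compatible endpoint via environment variables. A minimal sketch; the endpoint path and model identifier below are assumptions for illustration, so check DeepSeek's current integration docs before relying on them:

```shell
# Route Claude Code's Anthropic-format requests to an
# Anthropic-compatible DeepSeek endpoint (path is an assumption).
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="$DEEPSEEK_API_KEY"
# Placeholder model id -- substitute the V4 identifier from provider docs.
export ANTHROPIC_MODEL="deepseek-chat"

claude  # Claude Code now talks to the DeepSeek endpoint above
```

Because the client is unchanged, the Claude-centered agent UX stays intact; the caveats in the story (provider compatibility and setup) show up exactly here, in whether the endpoint faithfully implements the Anthropic API surface the harness expects.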

Pi ecosystem ships computer use, `/parallel-review`, and Chrome extension templates
Qwen3.6 community ships MLX and 3-bit quants with 40-56 tok/s local agent runs
OpenRouter launches `create-headless-agent` for Bun-based multi-model CLIs
Codex app-server supports 32-64 parallel jobs and burns limits 3-5x faster
Top stories this week
DeepSeek cuts V4-Pro API 75% to $0.43/$0.87 per 1M tokens through May 5
DeepSeek lowered V4-Pro API pricing and updated integration guidance for Claude Code, OpenCode, and OpenClaw a day after V4 launched. Check whether V4-Flash is the easier deploy today, while Pro stays heavier and more rate-limited.

GPT-5.5 users report 4-10x shorter runs and smoother tool calls one day after launch
Users and third-party evals reported shorter runs, stronger long-context scores, and faster rollout into Cursor and other tools a day after GPT-5.5 hit the API. Higher per-token pricing may be partly offset by lower loop time and fewer tool-call stalls, so watch early bench data before changing defaults.

ClawSweeper closes 4,000 OpenClaw issues with 50 Codex agents in one day
Steipete’s maintainer bot ran 50 Codex agents in parallel and closed about 4,000 OpenClaw issues in a day. The cleanup pushed into rate limits, so use the README dashboard and Project Clowfish clustering to track large agent sweeps.

Kilo Code opens Roo migration with --install-extension and AGENTS.md conversion
Kilo Code published a Roo Code migration path ahead of Roo’s May 15 archive, including one-command install, automated file renames, custom-agent conversion, and API key re-auth. Use the guide to map Roo modes, rules, MCP config, and checkpoints into Kilo’s agent and worktree model before the cutoff.

Qwen-Image-2.0-Pro launches at #9 on Arena with multilingual text rendering
Alibaba launched Qwen-Image-2.0-Pro on ModelScope and API with better prompt adherence, multilingual typography, and steadier style quality. The model is aimed at text-heavy jobs like UI mockups and posters, so test it for layout-heavy generation.
