Fresh stories
DeepSeek V4 reports CSA/HCA attention and 10% KV cache at 1M context
Engineers unpacked DeepSeek V4's hybrid CSA/HCA attention a day after launch; the release claims 27% of V3.2's FLOPs and 10% of its KV cache at 1M tokens. External tests pushed V4 Pro near the top of open-model indexes, but users also reported rate limits and mixed third-party results.
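For a rough sense of why a 10% KV cache matters at 1M tokens, the back-of-envelope sizing below sketches the arithmetic; the layer, head, and dimension numbers are illustrative placeholders, not DeepSeek V4's actual architecture.

```python
# Hedged KV-cache sizing sketch: bytes = 2 (K and V) x layers x KV heads
# x head dim x dtype bytes x sequence length. All architecture numbers
# here are made up for illustration.

def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, dtype_bytes=2):
    # 2x accounts for storing both the key and value tensors per layer.
    return 2 * n_layers * n_kv_heads * head_dim * dtype_bytes * seq_len

full = kv_cache_bytes(1_000_000, 61, 128, 128)  # fp16, illustrative shapes
compressed = int(full * 0.10)                   # the reported ~10% figure

print(f"full: {full / 2**30:.0f} GiB, compressed: {compressed / 2**30:.0f} GiB")
```

At long contexts the KV cache, not the weights, dominates memory, which is why a 10x reduction directly changes what fits on a node.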

OpenAI opens GPT-5.5 API with 1M context and Responses support
OpenAI added GPT-5.5 and GPT-5.5 Pro to the API and Playground with 1M context and Responses support. Partners including OpenRouter, Perplexity, GitHub Copilot, Vercel, Warp, and Devin rolled it out the same day, widening access beyond Codex.

BidirLM-Omni-2.5B-Embedding launches 2048-dim text-image-audio vectors
BidirLM released a 2.5B multilingual encoder that embeds text, images, and audio into one shared 2048-dimensional space and works directly with Sentence Transformers. It tops several open-data embedding leaderboards and can run locally on GPU.
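To illustrate what a single shared 2048-dimensional space buys, cross-modal retrieval reduces to cosine similarity between vectors. The vectors below are random stand-ins, not real model outputs; in practice they would come from the encoder (e.g. loaded through Sentence Transformers).

```python
import numpy as np

DIM = 2048  # the model's reported shared embedding size

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Mock vectors standing in for text/image/audio embeddings.
rng = np.random.default_rng(0)
text_vec = rng.normal(size=DIM)
image_vec = text_vec + 0.1 * rng.normal(size=DIM)  # a "related" image
audio_vec = rng.normal(size=DIM)                   # an unrelated clip

# A related pair scores higher than an unrelated one, so one index
# can serve text->image, text->audio, and image->audio search alike.
print(cosine(text_vec, image_vec) > cosine(text_vec, audio_vec))
```

Because all three modalities land in one space, a single vector index can answer queries across any modality pair without per-pair projection heads.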

Claude Code users report deleted tests, string-edit stalls, and higher spend
A day after Anthropic published its Claude Code postmortem, users kept reporting Opus 4.7 deleting tests, stalling on trivial edits, and burning more budget than expected. Claude Code 2.1.120 shipped more fixes, but teams are still rechecking prompts, settings, and model choice.

Codex users report one-shot bug fixes, 10-hour runs, and lower token burn a day after GPT-5.5 launch
A day after GPT-5.5 and the new Codex workflows launched, developers reported one-shot bug fixes, longer unattended runs, and lower token use in real coding tasks. The early hands-on comparisons matter because they are already shifting some teams' default agent workflow away from Claude Code.
DeepSeek V4 adds day-1 support from vLLM, SGLang, Ollama, OpenCode, Venice, and Together
Claude Code 2.1.120 adds `claude ultrareview` and fixes `DISABLE_TELEMETRY` opt-out
Cursor 3.2 adds /multitask async subagents, worktrees, and GPT-5.5
Top stories this week
DeepSeek releases V4-Pro and V4-Flash with 1M context and $0.14/M input
DeepSeek open-sourced V4-Pro and V4-Flash under MIT, with 1M context and aggressive Flash pricing. Day-one support in SGLang, vLLM, and OpenRouter pushes open-weight agentic coding closer to closed frontier models.
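At the reported $0.14 per million input tokens, budgeting is simple arithmetic; the helper below is illustrative only and ignores output-token pricing, which is not quoted here.

```python
# Illustrative input-cost arithmetic at the reported Flash rate.
PRICE_PER_M_INPUT = 0.14  # USD per 1M input tokens

def input_cost_usd(tokens):
    return tokens / 1_000_000 * PRICE_PER_M_INPUT

print(input_cost_usd(1_000_000))  # a full 1M-token context -> prints 0.14
```

At that rate, filling the entire 1M context costs about fourteen cents of input, which is what makes long-context agentic runs on open weights economically plausible.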

OpenAI releases GPT-5.5 with 82.7% Terminal-Bench and Codex browser control
OpenAI rolled out GPT-5.5 and GPT-5.5 Pro in ChatGPT and Codex, with higher scores on terminal, OS, cyber, and math evals than GPT-5.4. Codex also gained browser, document, and computer-use features for longer agent workflows.

Cua Driver opens macOS background app control with multi-cursor support for Claude Code and Codex
Cua Driver open-sourced a macOS driver that lets agents control apps in the background with multi-player and multi-cursor support. It matters because it turns background computer use from an app-specific feature into a reusable primitive that any agent loop can adopt.

Tencent launches Hy3 preview with 295B/21B, 256K context, and day-one OpenRouter, vLLM, and SGLang support
Tencent open-sourced Hy3 preview, a 295B MoE with 21B active parameters and 256K context, then pushed it into OpenRouter, OpenCode, OpenClaw, vLLM, and SGLang immediately. That matters because engineers can test and deploy a new reasoning-agent model on day one instead of waiting for the runtime ecosystem to catch up.

Anthropic reports Claude Code regressions after March 26 thinking bug and xhigh default shift
Anthropic said three harness-side changes degraded Claude Code quality, then reset subscriber limits and rolled out fixes in 2.1.119. The update matters because recent failures came from tool defaults and prompt handling rather than the base model alone.