Fresh stories
Gemini 3.5 Flash adds Computer Use with 78.4 OSWorld score
Google released built-in Computer Use for Gemini 3.5 Flash across browser, mobile, and desktop. Try it for agent workflows, but watch for timeout issues on long design-from-scratch runs.

Baidu releases Unlimited OCR with 3B params for single-pass long documents
Baidu released Unlimited OCR as an open-source long-document OCR model with 3B total parameters and 500M active at inference. Early ParseBench testing suggests it is strong on tables and reading order but weaker on semantic formatting and charts, so teams get a new open-weight OCR option with clear tradeoffs.

OpenRouter launches Image API with typed capabilities and exact USD cost
OpenRouter released a dedicated Image API that normalizes request shapes across 30-plus models from eight providers. Agents can inspect limits, passthrough options, streaming, and exact per-call cost without hardcoding vendor quirks.


Seedance 2.0 adds native 4K as fal, Replicate, Pika MCP, and ComfyUI ship support
Seedance 2.0 rolled out native 4K generation while Seedance 2.0 Mini landed on fal, Replicate, Pika MCP, and ComfyUI. That matters because engineers can now reach the same video model family through APIs, MCP workflows, and local graph tooling instead of a single web surface.

Claude Tag users report token billing and shared-memory concerns
A day after Claude Tag launched, engineers raised token billing, lock-in, and shared-memory concerns while Anthropic described its agent-identity model. Watch how Claude behaves in shared Slack channels, where it uses its own credentials and scoped access.

Vercel AI Gateway adds GLM-5.2 Fast at 150-250 tok/s
Zed v1.8 adds agent.terminal_init_command and faster Git operations
Claude Code 2.1.191 adds /rewind and cuts CPU use 37%
Top stories this week
Anthropic launches Claude Tag in Slack beta with channel memory
Claude Tag puts Claude into Slack as a teammate that can handle threads, use approved tools, and follow up proactively in selected channels. Team and Enterprise users can try it in beta to keep shared channel context instead of restarting from private chats.


Mistral releases OCR 4 with bounding boxes and 85.20 OlmOCRBench
Mistral OCR 4 adds layout-aware extraction with bounding boxes, block typing, and inline confidence across 170 languages. Use it through the API or self-hosted deployments when document pipelines need structure, citations, redaction, and chunking.

AssemblyAI launches Universal-3.5 Pro Realtime with Context Carryover
AssemblyAI’s Universal-3.5 Pro Realtime now carries forward the agent side of a conversation to improve live transcription. The release also ships multilingual realtime ASR features, and one early deployment reported that critical-utterance errors fell from 26% to 9%.

Latitude launches MIT-licensed agent monitoring with Signals clustering and MCP access
Latitude released an open-source platform for monitoring AI agents in production, with plain-English trace search, repeated-failure clustering, and MCP access from coding agents. That gives teams a self-hostable way to inspect token burn, surface recurring failures, and turn production traces into evals and fixes.

Perceptron releases Files API with reusable upload IDs
Perceptron’s Files API lets developers upload an image or video once and reference it by ID across later requests instead of resending base64 or URLs. That simplifies repeated multimodal workflows and cuts transfer overhead for video-heavy pipelines.








