Fresh stories
Microsoft launches MAI-Thinking-1 and six companion models with 97.0% AIME 2025
Microsoft introduced MAI-Thinking-1, MAI-Code-1-Flash, and five other MAI models across code, image, voice, and speech. The launch puts Microsoft back into the frontier-model race and starts landing pieces of the stack in Copilot and partner runtimes.

Nous Research launches Hermes Desktop public preview for macOS, Windows, and Linux
Nous Research put Hermes Agent into a native desktop app and added Portal and Ollama-backed setup paths plus a Tailscale remote-connect fix. Hermes now has a local-first desktop surface instead of a terminal-only workflow.

Cognition launches Devin Desktop with ACP support for local and cloud agents
Cognition added a desktop control surface that can run Devin, Codex, Claude, and other ACP-compatible agents across local and cloud contexts. The app turns Devin from a single hosted agent into a broader orchestration surface.


OpenAI launches Codex Sites and role-specific plugins as weekly users pass 5M
OpenAI rolled out Codex Sites, annotations, and role-specific plugins, while weekly users topped 5 million. The release pushes Codex beyond coding into hosted workspace and app workflows for enterprise teams.

Turbopuffer, Archil, TigerFS, and LangSmith add branching, snapshots, and rollback for agent runs
Multiple agent-infra vendors shipped copy-on-write branches, checkpoints, snapshots, forks, or rollback primitives on the same day. That matters because long-running agents can now explore, retry, and recover state without relying only on Git or full sandbox rebuilds.

Anthropic opens Project Glasswing to ~200 organizations with Claude Mythos Preview
Anthropic widened Project Glasswing from roughly 50 to about 200 vetted organizations, expanding access to Claude Mythos Preview for defensive security work. The program keeps Mythos restricted while Anthropic argues AI-assisted exploit discovery is accelerating.
Vals launches ProgramBench: Opus 4.8 solves 2 of 200 software-reconstruction tasks
Conductor integrates Vercel Sandboxes for remote parallel coding agents

H Company launches Holo 3.1 with local computer use and 79.3% AndroidWorld

Microsoft launches OpenClaw Companion for Windows with Microsoft Execution Containers

GitHub Copilot app opens preview with canvases and CLI voice mode

Perplexity Computer adds hybrid agentic inference with local-cloud model splits

Factory introduces Router with 25% lower AI spend and 99% of Opus 4.7 Terminal-Bench 2
Top stories this week
OpenAI releases GPT-5.4, GPT-5.5, and Codex on Amazon Bedrock
OpenAI made GPT-5.4, GPT-5.5, and Codex generally available through Amazon Bedrock. AWS shops can now use OpenAI models inside existing IAM, compliance, and procurement workflows instead of adopting a separate vendor stack.


MiniMax M3 adds OpenCode, Hermes Agent, Atomic Chat, and Vercel AI Gateway support
A day after MiniMax M3 launched, OpenCode, Hermes Agent, Flowith, Atomic Chat, Kilo Code, Cloudflare AI Gateway, and Vercel AI Gateway shipped support. That breadth shows M3 plugged into agent harnesses and routing layers immediately, not just its own API.

NVIDIA launches Cosmos 3 open 16B and 64B omnimodels with datasets and SGLang support
NVIDIA released Cosmos 3 as an open omnimodel family with 16B and 64B variants, plus code, datasets, and a coalition around physical AI. The release matters because it arrives with SGLang serving support and top rankings among open-weight image and video models, so teams can deploy it rather than treat it as a research teaser.

Microsoft and NVIDIA launch RTX Spark PCs with 128GB unified memory and 1 PFLOP FP4
Microsoft and NVIDIA unveiled RTX Spark systems, including Surface Laptop Ultra and DGX-class Windows hardware, with 128GB unified memory and 1 PFLOP FP4 local AI. Day-one support from Hermes Agent, vLLM, Ollama, and Unsloth makes the launch useful for local inference and fine-tuning, not just a PC refresh.

Perplexity launches Search as Code in Agent API with WANDR 0.386 and Python search pipelines
Perplexity replaced one-shot search calls with Search as Code, a Python-based search runtime in its Agent API that is now also the default in Computer. The change matters because agents can batch, rank, filter, and aggregate search steps inside code. Perplexity says the system scored 0.386 on WANDR, versus 0.152 for the next-best system.






