The most complete AI hub: fresh stories, workflows, prompts, and deals. Updated daily.
OpenAI added a $100 ChatGPT Pro tier with 5x more Codex usage than Plus and kept the $200 tier as the highest-capacity option. The new tier resets Codex limits again and temporarily doubles Pro usage through May 31.
Anthropic added a beta advisor tool to the Messages API so Sonnet or Haiku can call Opus mid-run inside one request. Anthropic says Sonnet plus Opus scored 2.7 points higher on SWE-bench Multilingual while cutting per-task cost 11.9%.
Sentence Transformers v5.4 adds one encode API for text, image, audio, and video, plus multimodal reranking and a modular CrossEncoder stack. It also flattens Flash Attention 2 inputs for text workloads, reducing padding waste and VRAM use.
ElevenLabs added on-prem and on-device deployment options alongside its existing VPC and cloud paths for the voice stack. The rollout gives government, automotive, and edge teams more data-boundary choices, with VPC available now and the new modes in early access.
LangChain launched Deep Agents Deploy in beta as a production path for open, model-agnostic agent harnesses configured with AGENTS.md, skills, and mcp.json. Deployments run on LangSmith and can expose MCP, A2A, and agent protocol while teams choose models and sandbox providers.
Cursor 3 adds Design Mode for targeting browser UI elements
The new mode lets you annotate browser elements directly, which is cleaner than pixel hunting for brittle UI automation.
Summarize 0.13 adds local video slides and better GPT-5.4 routing
It can now summarize local slide videos, and the new model routing makes media and streaming inputs less fiddly for CLI users.
Vercel Emulate adds email inboxes for magic link and OTP testing
You can test signup, magic-link, and OTP flows without sending real mail. That removes one of the slowest parts of auth QA.
Portless adds --lan so phones on Wi-Fi can hit local dev apps
The new LAN mode exposes local apps on your network with HTTPS intact, which makes real-device testing much less painful.
Unsloth publishes Gemma 4 fine-tuning notebooks for 8GB VRAM setups
The notebooks make Gemma 4 tuning feasible on small hardware, with free Colab paths and lower VRAM requirements than stock training.
Claude Code 2.1.94 raises default effort and fixes hook and login bugs
The release fixes rate-limit hangs, broken macOS login, and ignored plugin hooks while nudging more users to higher-effort runs.
Exa adds x402 support so agents can pay for web search in USDC
Agents can now fetch search results with a 402 flow and settle in USDC, removing API keys from many retrieval workflows.
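The pay-then-retry pattern behind a 402 flow can be sketched without any network calls. This is a minimal in-process simulation under stated assumptions, not Exa's API or the x402 wire format: `search_endpoint`, `FakeWallet`, the `payment_proof` argument, and the price are all hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical price per search, in micro-USDC (USDC uses 6 decimals).
PRICE_MICRO_USDC = 10_000


@dataclass
class FakeWallet:
    """Hypothetical stand-in for an agent's USDC wallet."""
    balance: int  # micro-USDC
    receipts: list = field(default_factory=list)

    def pay(self, amount: int) -> str:
        """Deduct the amount and return an illustrative receipt id."""
        if amount > self.balance:
            raise ValueError("insufficient balance")
        self.balance -= amount
        receipt = f"receipt-{len(self.receipts) + 1}"
        self.receipts.append(receipt)
        return receipt


def search_endpoint(query: str, payment_proof: Optional[str] = None) -> dict:
    """Hypothetical server: answers 402 until a payment proof is attached."""
    if payment_proof is None:
        return {"status": 402, "price": PRICE_MICRO_USDC}
    return {"status": 200, "results": [f"result for {query!r}"]}


def paid_search(query: str, wallet: FakeWallet) -> dict:
    """Call once, pay if the server asks for payment, then retry once."""
    response = search_endpoint(query)
    if response["status"] == 402:
        proof = wallet.pay(response["price"])
        response = search_endpoint(query, payment_proof=proof)
    return response
```

The point of the sketch is that the agent never holds an API key: the only credential is proof of payment, attached on the retry.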
Cognition ships SWE-1.6 with more parallel tool calls and fewer loops
Cognition says SWE-1.6 calls tools in parallel far more often, which should mean fewer reasoning loops and faster repo work.
Superset pitches a parallel-agent IDE for running coding agents together
Superset wants to be the IDE for running coding agents side by side, so you can keep multiple tasks moving instead of waiting on one model.
Zed's Zeta2 gains a 30% better acceptance rate with a better teacher
Zeta2 improves code completion acceptance by 30% without just scaling the model. Zed says a better teacher and better traces did it.
Hugging Face publishes Hermes Agent reasoning traces for tool-use training
The dataset gives agent trainers real multi-turn tool-use trajectories, not synthetic prompts. That is useful fuel for better harnesses.
Stanford shows single-agent LLMs beat multi-agent systems at equal budget
When thinking tokens are matched, single agents win more often. A lot of multi-agent gains look like hidden compute, not architecture.
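Matching the budget means counting every thinking token a multi-agent system spends across all of its agents, then giving the single agent that same total. A minimal sketch of the accounting, assuming hypothetical per-agent token counts (the function names, the tolerance, and any numbers used with them are illustrative and not taken from the Stanford study):

```python
def total_thinking_tokens(per_agent_tokens: list) -> int:
    """Sum the thinking tokens spent by every agent in the system."""
    return sum(per_agent_tokens)


def is_budget_matched(single_agent_tokens: int,
                      per_agent_tokens: list,
                      tolerance: float = 0.05) -> bool:
    """Return True when the single agent's budget is within `tolerance`
    (as a fraction) of the multi-agent system's total."""
    multi_total = total_thinking_tokens(per_agent_tokens)
    if multi_total == 0:
        return single_agent_tokens == 0
    return abs(single_agent_tokens - multi_total) / multi_total <= tolerance
```

For example, three agents at 4,000 tokens each spend 12,000 in total, so a fair comparison gives the single agent roughly 12,000 tokens rather than 4,000. Comparing against the per-agent figure is one way hidden compute creeps in.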
ComfyUI publishes a node workflow to recreate the viral inflation effect
ComfyUI turned a viral image effect into a reusable node graph, so others can recreate the look instead of reverse-engineering it from scratch.
Google AI Overviews show why answer quality is hard to measure
The same answer can be wrong, right for the wrong reason, or hard to source back. That makes search-quality evals much messier.
Box Agent fills out an RFP response by searching the knowledge base
The demo shows Box Agent extracting questions, pulling sources, and generating a full Word doc while the user does something else.
DeepMind's AlphaGenome traces disease mutations in non-coding DNA
The model targets the hardest part of genetics: predicting which variants matter before wet-lab work begins. That makes it a real science model, not a demo.
Claude Code skill adds source-cited deep research over complex documents
The skill turns PDFs, Word docs, and slides into an auditable report with word-level citations and bounding boxes back to source.
Google ships AI Edge Eloquent, an offline dictation app for iPhone
The app auto-polishes dictated text and supports custom dictionaries, which is a surprisingly useful fix for a long-standing mobile UX gap.
Browser Use turns CAPTCHA into a gate that only agents can solve
If an agent clears the challenge, it gets free API access. That is a clever pattern for bot-gated services and internal demos.
Claude Code prompt repo reconstructs hidden system and tool prompts
The repo surfaces the hidden prompts behind Claude Code, making its planner, memory, and verifier behavior much easier to study.
Warp adds rendered Markdown tables to its terminal editor
Rendered tables make Markdown less annoying in-terminal, which matters if you inspect docs and data without leaving Warp.
AI Battle turns Claude vs Codex into a judged problem-posing competition
The flow asks two agents to invent hard questions for each other, then scores the answers. That's a clever stress test for creativity and verification.
Render Workflows turns one annotation into async orchestration
One annotation now covers retries, parallelism, nesting, and observability, simplifying durable agent pipelines and background jobs.
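To illustrate how a single annotation can wrap a job with retry behavior, here is a minimal sketch using a plain Python decorator. It is not Render's API: `with_retries`, its parameters, and `flaky_job` are hypothetical names, and the sketch covers only retries, not parallelism, nesting, or observability.

```python
import functools
import time


def with_retries(max_attempts=3, delay_seconds=0.0):
    """Hypothetical decorator: rerun the function until it succeeds
    or `max_attempts` is exhausted, then re-raise the last error."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as error:
                    last_error = error
                    # Wait before the next attempt, but not after the last one.
                    if attempt < max_attempts:
                        time.sleep(delay_seconds)
            raise last_error
        return wrapper
    return decorator


calls = {"count": 0}


@with_retries(max_attempts=3)
def flaky_job():
    """Illustrative job that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "done"
```

The job body stays free of retry logic; the policy lives entirely in the annotation, which is the property the blurb describes.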
Browserbase argues agents need a full web stack, not just APIs
The platform pitches one API for browsers, search, fetch, sandboxes, and models, which is a more realistic starting point for web agents.
Architect turns one prompt into a multi-agent workflow with UI and scheduler
The demo shows an orchestrator spawning subagents, selecting model endpoints, and wiring a manual trigger UI plus a scheduler.
S3 files let agent swarms share one filesystem without a sandbox VM
The argument is that POSIX tools plus S3 can replace a VM for many agent swarms, letting you fan out far more compute on the same storage.
Claude Code's /autofix-pr command starts autofix from the terminal
A single CLI command now kicks off PR autofix after you finish a change. That cuts one more bit of review busywork from the loop.
OpenCode adds Search and Extract plugins for browsing and parsing
The plugin split gives OpenCode a cleaner search and fetch path, so agents can get web context with less custom glue code.
OpenAI says Codex now blocks cyber abuse by passing user IDs downstream
The team says it can reject bad actors more reliably when user IDs are attached to requests. That's a concrete abuse-prevention pattern for agent platforms.
Ideogram Layerize adds editable text layers and transparent PNG output
The API now turns flat outputs into editable layers, which makes localization, compositing, and design reuse much easier.
VS Code Learn launches a guided course for agent-first development
Microsoft is shipping a guided course that teaches agent-first app building inside VS Code, which lowers the barrier for new builders.
CodexBar 0.20 adds provider switching and cost history merges
The menubar tracker now supports more providers, lets you swap Codex accounts without relogin, and de-dupes session costs.
MiroMind turns market analysis into a cited workflow over news and charts
The thread shows a repeatable workflow for market research, with sources, charts, and forecast framing instead of a black-box answer.
ARC Prize 2026 upgrades ARC-AGI-2 compute to L4x4s for all entrants
DSPy case study cuts annual AI spend from $5.5M to $73K
W&B and OpenPI add experiment tracking for physical AI on ALOHA
Artificial Analysis adds HappyHorse-1.0 to the video arena leaderboard
Google adds Gemma 4 to Gemini API and AI Studio with code examples
ElevenLabs rounds up ElevenHacks winners for voice, sound, and tooling
LiveKit's Rime Mist v3 adds concurrent request handling and phonetic brackets
VoiceCaptcha replaces image CAPTCHAs with spoken prompts and Whisper
Epoch AI chart shows Google owning roughly 5 million H100 equivalents
Epoch AI's chip ownership hub shows who controls AI compute
DeepTeam ships a local red-teaming toolkit for 50+ LLM failure modes
Epoch AI chart puts Anthropic's ARR run-rate above OpenAI's
The Turing Post maps AI-native orgs from tribal teams to self-improving systems
OpenAI lays out 11 policy levers for the intelligence age
Anthropic put Claude Managed Agents into public beta with hosted sandboxes, vaults, memory filesystems, and long-running sessions. Use the managed setup if you want explicit controls for tools, credentials, and completion criteria instead of custom harness code.
Meta released Muse Spark, the first model from Meta Superintelligence Labs, with multimodal reasoning, tool use, and a parallel-agent Contemplating mode. Access stays limited to Meta AI and private API preview, so watch for broader availability before planning production use.
Providers and agent platforms added GLM-5.1 endpoints across Modal, Together AI, Letta Code, Tembo, and Tabbit, with free trials, no-key access, and 99.9% SLA options. Use the new hosting options to test the model for coding and long-horizon agent workloads without waiting on self-hosting.
Z.ai released GLM-5.1, a 744B open model built for long-horizon agentic coding and ranked first among open systems on SWE-Bench Pro. Day-0 support in OpenRouter, Ollama, SGLang, vLLM, OpenCode, and local quantization paths makes it ready to test in existing stacks.
OpenAI said Codex reached 3 million weekly users and reset usage limits, with another reset planned for each additional million users up to 10 million. Codex accessed through ChatGPT sign-in will also retire the gpt-5.2 and gpt-5.1-era lineup on April 14, so teams should watch for model-default changes.
Anthropic launched Project Glasswing, giving selected partners access to Claude Mythos Preview and publishing a system card with strong coding and cyber benchmark results. It stays off the public API for now, so teams should treat it as a restricted dual-use security release rather than a normal model launch.
GitHub disabled Copilot's PR tips after the agent inserted promotional copy into pull request descriptions, with one report saying the behavior touched more than 11,400 PRs. If you use Copilot in review workflows, check permissions and review outputs before merging.
A closed GitHub issue says Claude Code became unreliable for complex engineering after February changes, citing 17,871 thinking blocks and 234,760 tool calls across 6,852 sessions. Anthropic said the redaction flag was UI-only, but developers reported broader Opus quality drops and opaque harness changes.
Bram Cohen used the Claude Code leak to argue that prompt-only development produces bad software, while a separate 250-hour syntaqlite build said the durable version arrived only after a Python-to-Rust rewrite. Practitioners say specs, tests, linters, repo skills, and codebase context are the controls that keep coding agents maintainable.
Builders shipped a direct Claude Code harness and a ClawHub marketplace skill for OpenClaw workflows. Use these routes to wire agent tooling into OpenClaw, but watch Claude API limits and token burn costs.