OpenAI acquires Astral as Codex hits 2M weekly – uv at 126M/month

Replying to @cursor_ai

Learn more: cursor.com/blog/composer-2

4:30 PM · Mar 19, 2026

295

Read 8 replies

Composer 2 triggers “workhorse tier” repricing in coding agents

Composer 2 economics (Cursor): Multiple posts frame Composer 2 as beating Claude Opus 4.6 on Terminal-Bench 2.0 (61.7 vs 58.0) while being materially cheaper per token, as shown in the Terminal-Bench comparison and repeated in the benchmark recap.

The more interesting detail is the “cost per task” presentation on CursorBench, where Composer 2 is plotted near GPT-5.4’s performance at a much lower median cost, per the cost-performance chart. Commentary also generalizes this into a strategic shift—“IDE companies are becoming model companies,” as argued in the Terminal-Bench comparison and echoed in the pricing gap recap.

BridgeMind

@bridgemindai

Composer 2 outperforms Claude Opus 4.6 Composer 2 scores 61.7 on Terminal-Bench 2.0. Claude Opus 4.6 at 58.0. $0.50/M input. $2.50/M output. 10x cheaper than Claude Opus 4.6. Cursor isn't just an IDE anymore. They're training their own models now. The IDE companies are Show more

8:23 PM · Mar 19, 2026

123

Read 21 replies

Composer 2’s quality jump is credited to continued pretraining plus scaled RL

Composer 2 training (Cursor): Cursor attributes the biggest step up in Composer 2 to its first continued pretraining run (a stronger base) followed by scaled reinforcement learning on long-horizon coding tasks that require “hundreds of actions,” per the launch details and the release blog.

Aman Sanger also contextualized this as a year of “exclusively focused on coding” model work—“every FLOP, token, parameter… dedicated to software engineering,” as stated in the team focus post. One open question in the tweets is how much of the gains are from pretraining vs RL vs harness changes; Cursor’s public artifacts emphasize the pretraining→RL sequence but don’t provide a reproducible training/eval package in-thread.

Cursor

@cursor_ai

Replying to @cursor_ai

Learn more: cursor.com/blog/composer-2

4:30 PM · Mar 19, 2026

295

Read 8 replies

Early users treat Composer 2 as a fast refactor/fix model, not the final boss

Composer 2 in practice (Cursor): Early usage reports characterize Composer 2 as strong for “targeted fixes” and “quick refactors” on large codebases, while still trailing GPT‑5.4 on the hardest work, as described in the large codebase notes.

There’s also at least one anecdote of “pitting it against GPT‑5.4” where Composer 2’s response was judged better by other models, per the QA process comparison. The vibe across these posts is “fast worker model again,” not “replace the flagship,” with speed and iteration loop quality being the recurring reason people keep it in the stack.

Kevin Kern

@kevinkern

composer 2. for a large codebase, it's a great model for targeted fixes, quick refactors, and getting specific questions answered without the long waiting times we're already used to. It doesn't reach the quality of GPT-5.4, but it's still incredibly useful for this part of the Show more

Cursor

@cursor_ai

Composer 2 is now available in Cursor.

4:49 PM · Mar 19, 2026

Together AI says it serves the Composer 2 Fast endpoint

Composer 2 Fast serving (Together AI + Cursor): Together AI says it “helps power” the Composer 2 Fast endpoint on its AI Native Cloud, per the partner note. This matters because Cursor’s launch splits pricing into standard vs Fast tiers, and Together’s statement is one of the few concrete infra datapoints about where the fast SKU runs.

The tweets don’t specify latency targets, batch sizes, or which accelerator generation is used. That’s the missing engineering detail. The public signal is simply that Cursor is leaning on external inference partners for at least part of the new model’s deployment.

Together AI

@togethercompute

Congrats to the @cursor_ai team on Composer 2 — a huge milestone for RL-trained models and step forward for open-source coding intelligence. Together AI is proud to partner on this launch. Composer 2 is turning heads for its speed and quality — and we help power the Composer 2 Show more

Cursor

@cursor_ai

Composer 2 is now available in Cursor.

1:27 AM · Mar 20, 2026

Fireworks posts hint at Composer 2 availability and RL infra involvement

Composer 2 ecosystem (Fireworks AI): A Fireworks-adjacent post claims “Cursor Composer2 launched on Fireworks” and frames the rollout as not only inference but also “RL powered by Fireworks,” as stated in the ecosystem note.

There’s not enough detail in the tweet set to confirm what “RL powered” means operationally (offline training infrastructure, online RL, eval harnessing, or simply hosting). It’s still a notable distribution signal: Composer 2 appears to be landing across multiple inference vendors rather than being Cursor-only.

Lin Qiao

@lqiao

🔥 Cursor Composer2 launched on Fireworks 🔥 This time it's not just inference but also RL powered by @FireworksAI_HQ. So much hard work and sleepless nights to get this gift out. Congrats @cursor_ai team on launching this SOTA model beating Opus 4.6 on terminal bench! 🚀 Show more

11:33 PM · Mar 19, 2026

103

Read 5 replies

🟣 Claude Code: Channels (Telegram/Discord) + CLI 2.1.80 reliability fixes

Anthropic-focused coding tool updates: new “Channels” remote control path via MCP plus a notable Claude Code 2.1.80 changelog (resume reliability, memory staleness checks, SQL workflows). Excludes Cursor Composer 2 coverage.

Claude Code adds Channels: remote-control sessions from Telegram and Discord (research preview)

Claude Code Channels (Anthropic): Anthropic shipped Channels in research preview, letting select MCP servers push messages into a running Claude Code session—starting with Telegram and Discord—so you can drive a local coding session from your phone, as shown in launch demo and detailed in the Channels docs.

Access + security details: Setup requires Claude Code >= 2.1.80 and auth via claude.ai login (not API key/console login), with sender allowlists/policies to restrict who can inject events into the session per docs note.

Thariq

@trq212

We just released Claude Code channels, which allows you to control your Claude Code session through select MCPs, starting with Telegram and Discord. Use this to message Claude Code directly from your phone.

10:36 PM · Mar 19, 2026

16.8K

Read 1.1K replies

Claude Code 2.1.80 fixes --resume dropping parallel tool results

Claude Code CLI 2.1.80 (Anthropic): Resumed sessions now restore all parallel tool_use/tool_result pairs instead of showing "[Tool result missing]" placeholders, removing a common failure mode in long-running, tool-heavy workflows according to release highlights.

This is a straight reliability win for anyone using parallel tools (tests + grep/read loops) and then resuming later.

Claude Code 2.1.80 has been released. 1 flag change, 17 CLI changes, 1 system prompt change Highlights: • Memories are checked against current files before use to avoid relying on stale data • Sessions restored with --resume include all parallel tool results, replacing '[Tool Show more

Claude Code 2.1.80 tightens memory usage to avoid stale assumptions

Claude Code 2.1.80 (Anthropic): The system prompt now explicitly treats stored memories as historical context; Claude should verify critical details against current files/resources before relying on them, and update/delete entries when they conflict with the repo’s present state, as shown in prompt change notes and the prompt diff.

This is aimed at reducing “confidently wrong” actions after codebases drift.

Replying to @ClaudeCodeLog

Claude Code 2.1.80 system prompt updates Notable changes: 1) Claude’s memory guidance is tightened: stored memories can go stale, so Claude should treat them as historical context and verify key details by checking current files/resources before answering or assuming anything. Show more

Read 1 reply

Claude Code 2.1.80 reinstates previously blocked SQL analysis workflows

Claude Code 2.1.80 (Anthropic): “Many previously blocked SQL analysis functions” were reinstated, restoring prior SQL analysis workflows and outputs as called out in release highlights.

The change reads as a rollback of earlier capability gating that affected SQL-heavy repos and data debugging.

Claude Code 2.1.80 adds a settings.json plugin source for marketplace entries

Claude Code 2.1.80 (Anthropic): A new plugin marketplace source, source: "settings", allows declaring plugin entries inline in settings.json, per the “New features” list in changelog details.

This is a workflow tweak for teams that want plugin config checked into repo/environment config rather than relying only on interactive installs.

Claude Code 2.1.80 exposes rate limit usage to statusline scripts

Claude Code 2.1.80 (Anthropic): Statusline scripts can now read a rate_limits field showing usage across 5-hour and 7-day windows (including used_percentage and resets_at), as listed in changelog details.

This is one of the first “instrumentation hooks” in Claude Code that makes quota pressure observable without leaving the terminal.

Claude Code 2.1.80 reduces startup memory in huge repos and speeds @ autocomplete

Claude Code 2.1.80 (Anthropic): The CLI reports better responsiveness for @ file autocomplete in large git repos and reduced startup memory use—"~80 MB saved on 250k-file repos"—as described in changelog details.

This targets the “agent feels slow because the repo is big” path rather than model latency.

Claude Code + Figma MCP livestream scheduled for March 31

Claude Code + Figma MCP (Anthropic/Figma): trq212 announced a March 31 livestream on using Claude Code with the Figma MCP to collaborate between engineers and designers, with sign-up shared in event announcement.

The pitch is hands-on examples and a Q&A format, framed as a shared workflow rather than a one-way design export.

Thariq

@trq212

I'll be doing a livestream with Figma on March 31st on how to use Claude Code with the Figma MCP to collaborate between engineers and designers. Hope to see you there, you can sign up below!

8:32 PM · Mar 19, 2026

1.3K

Read 40 replies

🟥 Google AI Studio “vibe coding” goes full‑stack (Firebase/Auth, multiplayer, persistence)

Google AI Studio’s build experience gets a major full-stack step up: Firebase backends, auth, real services, multiplayer, persistent builds, and an Antigravity-powered coding agent roadmap. Excludes Gemini subscription/CLI backlash (covered separately).

AI Studio can provision Firebase Auth + database and manage secrets in-build

Firebase primitives (AI Studio / Google): Build mode now treats backend setup as a first-class step: one-click database support plus “Sign in with Google” via Firebase, per Firebase and auth primitives, with additional notes that it can store API keys in a Secrets Manager and supports app frameworks like Next.js/React/Angular out of the box, as described in Frameworks and secrets.

This also aligns with the “stop bouncing between consoles” pitch in DB and auth in clicks, which is the real workflow delta for builders trying to ship beyond a static frontend.

Logan Kilpatrick

@OfficialLoganK

Introducing the all new vibe coding experience in @GoogleAIStudio, feating: - One click database support - Sign in with Google support - A new coding agent powered by Antigravity - Multiplayer + backend app support and so much more coming soon! x.com/GoogleAIStudio…

Google AI Studio

@GoogleAIStudio

x.com/i/article/2034…

3:40 PM · Mar 19, 2026

3.3K

Read 233 replies

Google AI Studio Build mode adds multiplayer and persistent full-stack sessions

AI Studio Build mode (Google): Google shipped a rebuilt “vibe coding” experience in AI Studio that can produce real-time multiplayer apps/tools, connect to “real services” (live data), and keep builds running even after you close the tab, as shown in the feature rundown from Upgrade announcement and the expanded set of capabilities listed in Feature list. It’s being framed as a full-stack environment rather than a throwaway prototype sandbox.

The rebuild timeline is also unusually explicit for a Google surface—Logan says the team spent 4 months rebuilding it from scratch, per the announcement screenshot in Launch teaser screenshot.

Google AI Studio

@GoogleAIStudio

vibe coding in AI Studio just got a major upgrade 🚀 • multiplayer: build real-time games & tools • real services: connect live data • persistent builds: close the tab, it keeps working • pro UI: shadcn, Framer Motion & npm support we can't wait to see what you build!

3:35 PM · Mar 19, 2026

3.6K

Read 177 replies

Antigravity becomes the in-product coding agent powering AI Studio builds

Antigravity agent (Google): The new Build experience is positioned as being powered by an embedded coding agent called Antigravity, called out directly in the launch framing in Antigravity agent intro. One concrete anecdote shows the intended “ops loop”: diagnose via logs, open Firebase console, change permissions in UI, then verify the fix—reported as taking under 2 minutes in Production permissions fix.

This is still anecdotal (one user flow), but it describes the kind of end-to-end debugging path that historically breaks most prompt-only app builders.

Logan Kilpatrick

@OfficialLoganK

Google AI Studio

@GoogleAIStudio

x.com/i/article/2034…

3:40 PM · Mar 19, 2026

3.3K

Read 233 replies

AI Studio Build mode adds shadcn, Framer Motion, and npm for UI polish

Frontend stack (AI Studio / Google): The updated Build mode now explicitly supports a more “production UI” toolchain—shadcn UI, Framer Motion, and npm package installs—so the agent can iterate with real component libraries instead of hand-rolled HTML/CSS, as called out in UI tooling callout. This matters because UI polish is often the bottleneck after the first working demo.

Google AI Studio

@GoogleAIStudio

3:35 PM · Mar 19, 2026

3.6K

Read 177 replies

AI Studio roadmap: Design mode, Figma, Workspace, planning mode, agents, G1

Product roadmap (AI Studio / Google): Logan published a near-term roadmap for the next few weeks—Design mode, Figma integration, Google Workspace integration, better GitHub support, planning mode, agents, multiple chats per app, simplified deploys, and G1 support—as listed in Roadmap post and captured in the screenshot shared in Roadmap screenshot.

If all of this lands, it shifts AI Studio from “single chat builds a demo” toward “multi-session app development with design + deploy + agent workflows,” but the tweet doesn’t include dates per item.

Logan Kilpatrick

@OfficialLoganK

Our AI Studio vibe coding roadmap for the new few weeks: - Design mode - Figma integration - Google Workspace integration - Better GitHub support - Planning mode - Immersive UI - Agents - Multiple chats per app - Simplified deploys - G1 support And more, should be fun : )

6:37 PM · Mar 19, 2026

1.9K

Read 208 replies

Firebase Studio enters shutdown phase; migrate projects to AI Studio/Antigravity

Firebase Studio (Google): A notice says Firebase Studio is entering its shutdown phase on March 19, 2026 and will be fully shut down on March 22, 2027, with guidance to migrate because “core capabilities are already built into Google AI Studio and Google Antigravity,” as shown in the email screenshot in Shutdown notice.

For teams with preview-era Firebase Studio projects, this is an operational deadline plus a signal that Google wants one primary surface for agentic app-building (AI Studio) rather than parallel builders.

AshutoshShrivastava

@ai_for_success

RIP Firebase Studio. > March 19, 2026: Firebase Studio will enter its shutdown phase. > March 22, 2027: Firebase Studio will be shut down and will no longer be accessible via the Firebase Studio product URL.

4:57 PM · Mar 19, 2026

130

Read 19 replies

Stitch→AI Studio→Firebase pitch: Google’s end-to-end design-to-app pipeline

Ecosystem stitching (Google): Builders are now explicitly describing a pipeline where Stitch handles design/prototyping and AI Studio handles the full-stack build, with Firebase providing auth/data—see the Stitch canvas example in Stitch prototype screenshot and the “Google ecosystem for builders” summary in Stack workflow pitch.

This is less a single feature than a distribution story: if the handoff between design artifacts and generated code is smooth, Google can compress “prototype → app → backend” into one stack without requiring users to adopt separate agent products.

Ethan Mollick

@emollick

I think Google's new Stitch tool is a really great example of bringing "vibework" to an area outside of coding with an interface built around design & prototyping. There are rough edges, but (a) the results are very impressive and (b) it will feel more natural for many non-coders

7:03 PM · Mar 19, 2026

370

Read 23 replies

🟩 OpenAI acquires Astral to deepen Codex’s Python toolchain (uv/ruff/ty)

A major ecosystem move: OpenAI agrees to acquire Astral (uv/ruff/ty) and position it inside the Codex org, with explicit open-source continuation claims and Codex usage growth metrics. Excludes general model launches.

OpenAI agrees to acquire Astral to bring uv/Ruff/ty into the Codex org

OpenAI × Astral (Codex): OpenAI says it has reached an agreement to acquire Astral and, after closing, plans for the Astral team to join the Codex team, with an explicit commitment to continue supporting Astral’s open-source tools as described in the acquisition RT and the OpenAI post in OpenAI post.

The practical bet is that “AI that participates in the workflow” needs first-class hooks into dependency/env management and code-quality gates—not just code generation—mirroring the lifecycle framing in the Codex lifecycle excerpt screenshot.

OpenAI Newsroom

@OpenAINewsroom

We've reached an agreement to acquire Astral. After we close, OpenAI plans for @astral_sh to join our Codex team, with a continued focus on building great tools and advancing the shared mission of making developers more productive. openai.com/index/openai-t…

1:04 PM · Mar 19, 2026

6.8K

Read 452 replies

Codex is cited at 2M weekly active users alongside the Astral acquisition

Codex (OpenAI): OpenAI is anchoring the Astral acquisition around Codex adoption, citing 3× user growth, 5× usage growth, and 2 million weekly active users in the deal context as shown in the growth metrics screenshot.

This matters because it positions Astral’s tools as “in the blast radius” of a product already operating at consumer-like scale, and it reinforces OpenAI’s stated goal of pushing Codex beyond codegen into planning, tool-running, verification, and maintenance as described in the same growth metrics screenshot.

Chubby♨️

@kimmonismus

OpenAI's Codex is becoming increasingly popular: 3x user growth and 5x usage increase since the start of the year, and over 2 million weekly active users. The battle between Claude and Codex is intensifying, because, as Dario already said: being the best AI company with the Show more

OpenAI Newsroom

@OpenAINewsroom

6:55 PM · Mar 19, 2026

472

Read 38 replies

Guido van Rossum signals he’s joining OpenAI during the Astral/Codex cycle

OpenAI (staffing): Guido van Rossum (gdb) posted “Welcome to OpenAI! … to make great tools to make developers everywhere more productive,” as stated in the joining announcement.

The timing overlaps the Astral/Codex push, where OpenAI is explicitly tying developer productivity to deeper integration with the Python toolchain in the OpenAI post at OpenAI post.

Greg Brockman

@gdb

Welcome to OpenAI! Very excited to be working together and to make great tools to make developers everywhere more productive.

Charlie Marsh

@charliermarsh

We've entered into an agreement to join OpenAI as part of the Codex team. I'm incredibly proud of the work we've done so far, incredibly grateful to everyone that's supported us, and incredibly excited to keep building tools that make programming feel different.

4:04 PM · Mar 19, 2026

839

Read 87 replies

uv is treated as “load-bearing” infra for agentic Python workflows

uv (Astral): Simon Willison’s acquisition analysis argues uv is the most strategically “load-bearing” Astral project—because Python environment management is a chronic pain point—and calls out its massive adoption (including a cited 126M downloads/month) in his write-up linked from analysis link via Blog analysis.

For AI coding agents, uv isn’t a nicer pip; it’s a way to make “create env, resolve deps, run tests” a deterministic subroutine that can be invoked repeatedly without bespoke glue—exactly the kind of boring reliability surface that becomes critical once Codex is asked to run changes end-to-end, as described in the OpenAI post in OpenAI post.

Simon Willison

@simonw

Thoughts on OpenAI acquiring Astral and uv/ruff/ty simonwillison.net/2026/Mar/19/op…

4:45 PM · Mar 19, 2026

389

Read 23 replies

OpenAI Devs announces Codex meetups and hackathons in India

Codex (OpenAI Devs): OpenAI is pushing Codex community programming in India with in-person events—starting with a Codex Meetup Mumbai (March 28) and additional hackathons announced in the same campaign as shown in the India events video.

This is a distribution signal: OpenAI is treating “hands-on with Codex” as a repeatable community motion rather than a purely online rollout, per the India events video listing.

OpenAI Developers

@OpenAIDevs

Codex is coming to India 🇮🇳 Join events led by Codex ambassadors to: Get hands-on with Codex Meet developers building with Codex Ship projects at hackathons

12:00 PM · Mar 19, 2026

312

Read 31 replies

Ruff’s speed makes it a natural verifier loop for coding agents

Ruff (Astral): In the acquisition framing, Astral’s tools sit “directly in the workflow” Codex wants to automate, and Ruff is the obvious tight-loop component because it can act as a fast, repeatable quality gate during agent edits, as described in the OpenAI post in OpenAI post.

Willison’s discussion of the bundle (uv/Ruff/ty) in Blog analysis also highlights how pairing a linter/formatter with a type checker (ty) is a straightforward way to turn “agent wrote code” into “agent can cheaply self-verify and iterate” without running full integration tests every time.

A desktop command-palette UX lands for summoning Codex anywhere

Codex (OpenAI): A new desktop UX shows Codex being summonable “from anywhere” via a command-palette style launcher, as demonstrated in the desktop summon demo clip.

It’s a small UI change, but it’s aimed at reducing the friction of starting short, iterative agent sessions—especially when compared to workflows that require switching into a dedicated app or terminal context first, as implied by the desktop summon demo.

Guinness Chen

@guinnesschen

You can now summon Codex from anywhere on your desktop. It's easier than ever to reach for Codex.

12:41 AM · Mar 19, 2026

895

Read 81 replies

Codex for OSS is rumored to be extending mega-grant access to Astral

Codex for OSS (OpenAI): Community posts claim Astral is receiving one of Codex for OSS’s “mega grants,” described as including access to the Codex team and “unlimited tokens,” per mega grant claim.

A related anecdote about sponsoring Astral “on GitHub … for Codex for OSS” appears in sponsorship anecdote, but there’s no official grant program detail in the tweets beyond these claims.

jason liu

@jxnlco

Excited to see astral get one of codex for oss’s mega grants. They get access to the codex team and unlimited tokens!

OpenAI Newsroom

@OpenAINewsroom

10:22 PM · Mar 19, 2026

🧭 Agent operations: fleets, parallel workers, and stateful research runs

Operational tooling for running many agents: fleet management, identity/permissions, parallel execution, and chaining long-running research sessions. Excludes MCP servers and coding model releases.

Devin can now manage parallel Devins across separate VMs

Devin (Cognition): Cognition shipped “managed Devins,” where one Devin decomposes a larger task and delegates subtasks to multiple Devins running concurrently in isolated VMs, with the coordinator improving its decomposition strategy over time, as shown in the Multi-VM demo and described in the Announcement blog.

• Parallel execution model: Each delegated Devin runs in its own VM, which is a concrete operational shift from “one agent thread” to a supervised pool, per the Multi-VM demo.
• Feedback loop: Cognition claims Devin “gets better” at breaking down and managing tasks for your codebase, which turns task decomposition into an asset that accumulates over repeated runs, as detailed in the Announcement blog.

This is squarely aimed at long-horizon work where concurrency and isolation are the difference between shipping and thrashing.

Cognition

@cognition

Devin can now manage a team of Devins. Devin will break down large tasks and delegate them to parallel Devins that each run in their own VM. Over time, Devin gets better at breaking down and managing tasks for your codebase. Available now for all users.

5:14 PM · Mar 19, 2026

363

Read 24 replies

LangSmith Fleet adds identity, permissions, and approvals for org-wide agent fleets

LangSmith Fleet (LangChain): LangChain launched LangSmith Fleet, an enterprise workspace for creating and running many agents with shared governance—agents can be created “with natural language,” then shared with per-agent edit/run/clone permissions, plus human-in-the-loop approvals and tracing/auditability through LangSmith Observability, as shown in the Product demo video.

• Agent identity + access control: Fleet introduces “agent identity” and credential management so agents can have distinct access surfaces (Slack/GitHub/etc.) instead of inheriting a human’s ambient permissions, as described in the Feature breakdown.
• Operational visibility: Fleet leans on LangSmith tracing for action tracking and audits, with the “agents for every team” framing and collaboration controls outlined in the Product demo.

This is a productized answer to the “we have lots of agents, now what?” phase—where permissions, approvals, and logs become the bottleneck rather than prompts.

LangChain

@LangChain

Introducing LangSmith Fleet. Agents for every team. → Build agents with natural language → Share and control who can edit, run, or clone each agent → Manage authentication with agent identity → Approve actions with human-in-the-loop → Track and audit actions with tracing in Show more

6:12 PM · Mar 19, 2026

124

Parallel Task API adds interaction_id to chain stateful research runs

Parallel Task API (Parallel): Parallel added interaction_id to Task runs so web research agents can chain multiple turns into a stateful pipeline—later runs can reference earlier outputs, and each turn can use different processors plus different input/output schemas, as announced in the API update and illustrated in the API update.

• Workflow composition: The product framing is “previously isolated runs, now chainable,” enabling fan-out → narrow-down research patterns with persistent state, per the API update.
• Debuggability surface: The interaction_id becomes an anchor for tracing multi-turn work (and swapping configs per step), with a runnable example surfaced in the Developer showcase.

It’s a small API change that targets a common operational pain: multi-step research that needs continuity without keeping one huge session alive.

Parallel Web Systems

@p0

You can now create stateful web research agents with the Parallel Task API. Every web research run now produces an interaction_id, which enables agents to reference previous research outputs sequentially, resulting in more efficient and higher-quality research. Try interactions Show more

10:43 PM · Mar 19, 2026

Agent compute budgets look set to rise—and shift from IT to business owners

Compute budgeting (Box): Box CEO Aaron Levie argues that as workers run more parallel agents, per-person compute budgets will “monotonically go up,” expanding beyond engineering into legal/marketing/sales; he expects this to become a business-owned allocation problem rather than an IT line item, as described in the Budgeting shift note.

A separate industry quip—“If your $500K engineer isn’t burning at least $250K in tokens…”—captures the same directional pressure toward large token budgets tied to outcomes rather than per-seat SaaS pricing, as shown in the Token burn clip. The practical unknown is how quickly companies standardize controls (identity, approvals, audit trails) so this spend can scale without turning into untraceable agent sprawl.

Aaron Levie

@levie

Without getting into the specific numbers, this underlying concept and trend is going to be very real. For any worker who is able to wield AI agents effectively in an organization, their compute budgets are just going to monotonically go up over time. This will of course start Show more

TFTC

@TFTC21

Jensen Huang: "If that $500,000 engineer did not consume at least $250,000 worth of tokens, I am going to be deeply alarmed. This is no different than a chip designer who says 'I'm just going to use paper and pencil. I don't think I'm going to need any CAD tools.'"

3:23 AM · Mar 20, 2026

254

Read 45 replies

Amp Smart mode expands Opus 4.6 context to 300k with flatter long-context pricing

Amp Smart mode (Amp): Amp increased Smart mode’s Opus 4.6 input context to 300k tokens (up from 168k) and changed pricing so tokens above 200k are no longer charged at a higher per-token rate, with the rationale that 1M-token Smart mode isn’t worth the quality/cost tradeoff, as stated in the Context and pricing note.

The operational impact is mostly about fewer “context resets” during long-running agent sessions while keeping spend predictable; Amp also notes a separate “large mode” exists when 1M context is truly needed, per the Context and pricing note.

Quinn Slack

@sqs

Smart mode in Amp now lets you use Opus 4.6 wth 300k input tokens of context, up from 168k. Also, tokens above 200k are now charged at the same (not higher) per-token rate. Why not 1M tokens in smart? Quality/cost aren't right. If truly needed, use (hidden) large mode.

11:14 PM · Mar 19, 2026

103

Read 11 replies

Offload (Imbue + Modal): Imbue introduced Offload, a Rust CLI that distributes a test suite across 200+ Modal sandboxes, reporting a Playwright example that dropped from 12 minutes to 2 minutes at about $0.08/run, as shown in the Speedup demo.

• Drop-in runner model: Offload is positioned as a thin orchestration layer for common runners (pytest, cargo-nextest, vitest) with a single TOML config, per the Speedup demo.
• Open-source footprint: Imbue published the code as a general-purpose tool for scaling CI-like workloads that sit in the critical path of multi-agent development loops, with implementation details in the GitHub repo.

This isn’t “agent orchestration” directly, but it attacks a real parallel-agent bottleneck: keeping verification throughput high enough that many workers don’t queue behind one laptop CPU.

Imbue

@imbue_ai

Your parallel agents needed scalable test coverage yesterday Introducing Offload: a Rust CLI that spreads your test suite across 200+ @Modal sandboxes, freeing your CPU to keep your agents shipping. On our Playwright suite, it took a 12 min run to 2, at $0.08 a run

5:23 PM · Mar 19, 2026

148

🛡️ Agent security incidents & controls: identity, monitoring, and platform clampdowns

Security and governance updates where agents touch real systems: internal monitoring, rogue-agent incidents, credential/permission models, and platform policy impacts on agentic app builders.

OpenAI details internal monitoring for coding-agent misalignment

Coding agent monitoring (OpenAI): OpenAI published a concrete monitoring design for internal coding agents—claiming 99.9% of internal coding traffic is monitored for misalignment using their strongest models, with reviews happening on ~30-minute latency as described in the monitoring post and detailed in the OpenAI post.

The operational point is explicit: as agents act inside real repos and tools, OpenAI is treating “agent oversight” as a production system (detection + triage), not an offline eval—useful context for any org letting agents touch CI, secrets, or deploy pipelines.

Adam.GPT

@TheRealAdamG

openai.com/index/how-we-m… "How we monitor internal coding agents for misalignment - Using our most powerful models to detect and study misaligned behavior in real-world deployments."

8:33 PM · Mar 19, 2026

Read 2 replies

Meta internal AI agent incident reportedly exposed sensitive data to unauthorized employees

Internal agent incident (Meta): A report says an internal Meta AI agent took actions beyond its assignment—posting advice without approval and triggering a Sev 1 that exposed sensitive company and user-related data to employees without clearance for nearly two hours, as summarized in the incident report and recapped in the Sev 1 takeaway.

This is an unusually crisp example of the failure mode security teams worry about: autonomy + internal permissions + “helpful” behavior crossing an access boundary, even without any external data leak claim.

Wes Roth

@WesRoth

A new report from The Information has revealed that a major security alert was recently triggered inside Meta after an internal AI agent went "rogue," taking unauthorized actions that exposed sensitive data. According to internal communications, the AI agent bypassed security Show more

The Information

@theinformation

Exclusive: A rogue AI agent recently triggered a major security alert inside Meta after taking actions that led to the exposure of sensitive data to employees. Read more from @Jjyoti_mann1 👇 thein.fo/4tdRPRV

6:00 PM · Mar 19, 2026

108

Read 12 replies

Apple reportedly pauses App Store updates for vibe-coding apps and forces UX changes

Platform clampdown (Apple App Store): A report claims Apple has halted App Store updates for popular AI “vibe-coding” apps—calling out Replit—while demanding UX changes such as forcing generated-app previews to open in an external browser, plus telling another builder (Vibecode) to remove the ability to generate software specifically for Apple devices, per the policy report.

This is a distribution-level control point: even if the agent works, App Store policy can force product-level constraints on preview, execution, and platform-targeting flows.

Wes Roth

@WesRoth

Apple has quietly halted App Store updates for popular AI "vibe-coding" applications most notably the $9 billion startup Replit and mobile app builder Vibecode. After months of pushback, Apple is reportedly demanding major UX changes. Replit is being asked to force its Show more

8:00 PM · Mar 19, 2026

563

Read 61 replies

EdgeClaw adds local sensitivity routing for OpenClaw agents

Edge routing and redaction (OpenBMB): OpenBMB released EdgeClaw, a “local routing layer” for OpenClaw that classifies requests into S1/S2/S3 and routes accordingly—passthrough, on-device desensitization, or 100% local inference—as described in the EdgeClaw overview and implemented in the GitHub repo.

The explicit claim is operational: sensitivity detection happens on the edge (regex + local LLM “judge”), so private memory artifacts can stay local while still allowing cloud models for safe tasks, per the EdgeClaw overview.

OpenBMB

@OpenBMB

(1/2)🦞 Using @openclaw but worried about sending sensitive data to the cloud? 🤔 Meet #EdgeClaw — a dedicated Local Routing Layer for #OpenClaw that handles data sensitivity and task complexity on the edge. 💻 It’s a drop-in enhancement that reactivates your local hardware to Show more

2:23 PM · Mar 19, 2026

Read 40 replies

Keycard launches task-scoped identity and step-up approvals for coding agents

Keycard for coding agents (Keycard Labs): Keycard Labs introduced an identity-based control layer aimed at the “agents inherit your credentials” problem—splitting identity by user/agent/runtime/task and issuing short-lived, tool-call-scoped credentials, with “step-up approval” for sensitive actions, per the launch thread.

• Execution-time policy, not login-time: The product framing targets approval fatigue (“click Allow 50 times an hour”) and the drift toward --dangerously-skip-permissions, as argued in the launch thread.
• Cross-agent surface: They claim one command (keycard run) can wrap multiple coding assistants (Claude Code, Codex, Cursor, ChatGPT, OpenClaw), as stated in the execution policy note.

The open question from the tweets is integration depth: whether it can actually constrain high-risk tool calls in heterogeneous harnesses without becoming another bypassable prompt-layer gate.

Keycard

@KeycardLabs

Your coding agents inherit your credentials and your permissions. No identity system in the stack can tell the difference between you and the agent acting in your name. Today: Keycard for Coding Agents 🧵

3:38 PM · Mar 19, 2026

140

🧠 Workflow patterns: context limits, spec→eval loops, and “taste” as the bottleneck

Practitioner techniques and failure modes when using coding agents at scale: long-context quality drop-offs, spec discipline, evaluation-driven iteration, and human taste/judgment as a constraint. Excludes specific tool releases.

Long-context “dumb zone” shows up past ~100K tokens even with 1M windows

Long-context sessions: Builders report a consistent quality cliff once coding sessions run past ~100K tokens—worse decisions, worse code, and weaker instruction-following—despite a 1M context window being available, as described in the “100K smart, 900K dumb” framing from Dumb zone report. Clearing context can immediately unstick the model after an hour of flailing at ~150K–200K tokens, per the same thread in Dumb zone report.

A separate thread argues this isn’t surprising because large tasks demand more domain + project context than current transformers can realistically carry, and that the compute cost may not pencil out even if context windows grew 10–100×, as discussed in Reduction problem note.

Matt Pocock

@mattpocockuk

Doing some experiments today with Opus 4.6's 1M context window. Trying to push coding sessions deep into what I would consider the 'dumb zone' of SOTA models: >100K tokens. The drop-off in quality is really noticeable. Dumber decisions, worse code, worse instruction-following. Show more

10:05 AM · Mar 19, 2026

1.0K

Read 133 replies

Compute budgeting is becoming part of the job

Compute budgeting: Multiple posts converge on the idea that a growing slice of engineering work is deciding how much model compute to spend—sometimes “2-line changes need hours of verification,” while other times large diffs go through quickly, as noted in Verification variability.

Token burn is also becoming an explicit cultural metric: a widely repeated line suggests that if a “$500K engineer isn’t burning at least $250K in tokens,” something’s wrong, per Token spend quote, while another thread warns that the most lucrative customers often use elaborate, expensive loops that may be net-negative for the vendor and sometimes for outcomes, according to Expensive workflow incentive.

Thariq

@trq212

Replying to @trq212

not that this is easy or intuitive! sometimes 2 line changes need hours of verification, and other times 500 line PRs can be one-shot easily

2:58 PM · Mar 19, 2026

Specs with measurable evals become an agent’s hill-climb target

Spec→eval loop: One practitioner frames codegen as an “impure function” (stochastic), and argues the stable way to express intent is to put measurable constraints directly in the spec—behavioral tests, style rules, perf checks—so an agent can iterate toward a target rather than “guess what I meant,” as laid out in Spec equals evaluator view.

They extend this into a multi-agent loop where an “Eval Agent” proposes evals and an “Optimizer” hill-climbs until thresholds are met, then ratchets difficulty (curriculum), as sketched in Curriculum via evals.

Viv

@Vtrivedy10

spec != code literally but in a very fuzzy way code ~= spec + executor previously the executor was a human, today mostly agents so maybe what we should care about is “what kind of spec produces code that solves my problem?” because for better or worse, humans are becoming Show more

alex fazio

@alxfazio

a sufficiently detailed spec is not code

8:38 PM · Mar 19, 2026

UI generation still bottlenecks on taste, not syntax

Taste in UI work: A short but sticky heuristic is that AI “doesn’t replace taste; it multiplies whatever taste you already have,” as stated in Taste multiplier.

A concrete workflow version of that claim shows up in UI generation: models struggle to design “good templates” from scratch, so teams get better results by starting from an existing template/UI kit and iterating slowly, as argued in Template-first workflow, with the author explicitly wanting agents to “pull in actual designs/components” like Tailwind UI and then apply prompt-driven edits via ui.sh, as described in UI kit tooling.

Addy Osmani

@addyosmani

AI doesn't replace taste. It multiplies whatever taste you already have.

5:38 PM · Mar 19, 2026

254

Read 67 replies

Agent multithreading can be cognitively heavier than solo coding

Human-in-the-loop reality: A rebuttal to “LLMs make you stop thinking” argues that managing multiple agent threads in parallel has been among the most cognitively intensive work people have done in years, per Cognitive load claim.

That aligns with the framing that using agents feels like ML engineering—running experiments, deciding compute spend, and testing stochastic outputs—summarized in ML engineer analogy.

Peter Gostev

@petergostev

Good grief

@_akhaliq

DLSS-5 anything for free app: huggingface.co/spaces/victor/…

7:15 PM · Mar 19, 2026

Read 2 replies

GPU kernel skills get framed as a post-agent specialization bet

GPU kernels as a moat: One thread claims that learning to write kernels may be one of the highest-ROI paths for displaced software engineers—6–12 months of study with outsized compensation outcomes—arguing demand is being pulled by accelerated compute stacks and inference optimization, per Kernel ROI claim.

This is a values-and-market signal, not evidence of outcomes; there’s no hiring dataset in the tweets, but it matches the broader “inference engineering” pull toward lower-level performance work.

dr. jack morris

@jxmnop

Learning to write kernels might be the highest-ROI activity for displaced SWEs: → prereq: reasonable engineering ablity → six to twelve months of study → millions of dollars, mark zuckerberg showing up at your house to hire you, etc. i wish this were an exaggeration

1:50 AM · Mar 20, 2026

960

Read 26 replies

“Explain in plain language” as a control knob for verbose models

Prompt discipline: A practical trick for working with verbose coding assistants is to repeatedly ask for “plain language” until the explanation is usable; the claim is that some models default to “the full picture” and overcomplicate, and that “plain lang” / “plainer lang” reliably compresses output, per Plain language tip.

Onur Solmaz

@onusoz

Pro tip: tell AI to "explain in plain language" until you understand what you are reading Codex has a tendency to give the full picture, but overcomplicates the response in the process I just use "plain lang" or "plainer lang" as a prompt, it works every time

1:19 PM · Mar 19, 2026

The “jagged frontier” warning shows up again: expert guidance still matters

Capability limits in practice: A reminder that the ability frontier remains “jagged”—systems still need expert human guidance at key points and are far from “doing all jobs”—shows up in Jagged frontier note.

This pairs with the long-context “dumb zone” reports in Dumb zone report, where the model can appear capable in shorter windows but degrades in extended sessions, reinforcing the pattern that human judgment is still doing a lot of the error-catching.

Ethan Mollick

@emollick

We are back to the phase of the AI news cycle where people are underestimating how jagged the AI ability frontier is, as well as how much they still depend on expert human decision-making or guidance at key points in order to function well. Still far from "doing all jobs," today.

3:53 PM · Mar 19, 2026

284

Read 43 replies

✅ Keeping agent-written code shippable: tests, diffs, and CI parallelism

Tools and practices that harden agent output: distributing test suites, reviewing diffs precisely, and warnings about test/process side-effects. Excludes general coding assistant releases.

Offload (Imbue): Imbue introduced Offload, a Rust CLI that parallelizes test suites across 200+ Modal sandboxes to keep local CPUs free for agent-driven dev; they report a Playwright run dropping from 12 minutes to 2 minutes at about $0.08 per run, as described in the launch demo and reinforced by Modal’s support note.

• Why this matters for shippable agent output: as more teams run multiple coding agents in parallel, test execution becomes the pacing item; Offload’s pitch is that CI-like fanout becomes a local “inner loop” primitive instead of a centralized pipeline step, per the launch demo.
• Adoption artifact: the open-source implementation details (providers, retries, configs) are in the GitHub repo, which makes it easier to audit and adapt to org-specific runners.

Imbue

@imbue_ai

5:23 PM · Mar 19, 2026

148

v0 adds a dedicated diff view for reviewing generated code changes

v0 (Vercel): v0 now includes a diff view designed for reviewing agent-generated multi-file edits, showing what changed across files with line counts and commit messages, as shown in the diff view walkthrough.

This is a workflow-level change: it moves “trust but verify” from an external git tool back into the agent UI, which tends to reduce review friction when you’re iterating quickly on generated patches.

@v0

v0 now includes a dedicated diff view to review code changes. See exactly what changed across files, complete with line counts and commit messages.

7:02 PM · Mar 19, 2026

120

Read 5 replies

Mutation testing can over-stabilize old behavior and slow change

Mutation testing (practice): A cautionary note is circulating that mutation testing is expensive in CPU/wall time and can "stabilize" legacy behavior so well that replacing old behavior becomes harder, not easier, as argued in the caveat note.

The practical implication for agent-heavy teams is that more tests aren’t automatically better if they lock in incidental behavior—especially when agents are already increasing change volume.

Uncle Bob Martin

@unclebobmartin

Mutation testing has a dark side. Not only does it consume rather large amounts of CPU and wall time; but it makes is much more difficult to remove old behavior and replace it with new, "better", behavior. Those extra tests do their job of stabilizing the behavior very well -- Show more

10:52 AM · Mar 19, 2026

Read 10 replies

🔌 MCP and connectors: plug assistants into model catalogs and workplace context

Interop plumbing where the artifact is an MCP server/connector enabling new capabilities from chat. Excludes Claude Code Channels (covered under Claude Code updates).

fal ships an MCP server that turns Claude/Cursor chats into a 1,000+ model router

fal MCP server (fal): fal says its MCP server is now live, exposing a chat-native interface to “1,000+ generative AI models” so assistants like Claude or Cursor can search models and run image/video generation and related actions via tool calls from a single conversation, as shown in the launch demo.

This is an interoperability move: instead of wiring individual provider SDKs into every agent harness, the MCP server becomes the adapter layer—and fal becomes the catalog + execution plane for multi-model workflows.

fal

@fal

🔥 The fal MCP Server is live ! Connect Claude, Cursor, or any AI assistant to 1,000+ generative AI models. Search models, generate images, create videos, check doc, create app : from a conversation. All the link to access it here 👇

8:19 PM · Mar 19, 2026

387

Read 32 replies

Manus adds a Granola MCP connector to pull conversation context into builds

Granola MCP Connector (ManusAI): Manus announced a Granola connector that auto-pulls the “exact context needed” from prior conversations so its agent can draft PRDs, generate designs, or build apps using meeting/chat history as input, according to the connector announcement.

The practical point for teams is connector-driven context loading: the agent doesn’t rely on users re-prompting meeting notes, and the integration becomes a repeatable ingestion path for “work already done” in discussions.

Manus

@ManusAI

Introducing the Granola MCP Connector for Manus: Connect @meetgranola, and Manus will automatically pull the exact context needed to build apps, draft PRDs, or generate designs straight from your conversations. Build with the context you already have. Live now!

Granola

@meetgranola

Build in @ManusAI, now with your conversations as context. The Granola MCP Connector is now live.

3:32 PM · Mar 19, 2026

206

Read 9 replies

AURL pattern: stop installing CLIs and let agents learn APIs directly

API-first agent tooling (aurl): A shared workflow claim is that agents can skip local CLI installs and instead call HTTP APIs directly—using aurl to interpret docs and generate the right curl/request shape—per the workflow note.

This pattern reframes “developer environment setup” as a tool-using step: the harness exposes network access + auth, and the agent synthesizes requests on demand rather than depending on bespoke command-line utilities.

shawn

@shawn_pana

I've stopped downloading CLI tools. Agents can call APIs directly. aurl allows agents to understand and use APIs. > curl for humans → aurl for agents > API docs as --help flags and SKILL[.]md files pass in an API spec, agent instantly learns new tools

5:06 AM · Mar 19, 2026

423

Read 42 replies

🛠️ Dev utilities for agents: fast document parsing and OCR pipelines

Developer-facing repos and utilities that make agents more effective by improving document ingestion and parsing throughput. Excludes retrieval model research (covered under RAG/retrieval).

LiteParse open-sourced for fast, layout-aware doc parsing without models

LiteParse (LlamaIndex): LlamaIndex open-sourced LiteParse, a model-free local document parser aimed at agent pipelines; it runs without a GPU and claims throughput around ~500 pages in ~2 seconds on commodity hardware, while preserving layout (notably tables) in a more readable grid-like representation than typical PDF-to-text tools, as described in the Launch thread and the accompanying Blog post.

The release positions LiteParse as the “fast path” for most docs (with optional OCR paths for images), and it’s designed to plug into coding agents like Claude Code/OpenClaw-style setups, with implementation details and source in the GitHub repo.

Jerry Liu

@jerryjliu0

Introducing LiteParse - the best model-free document parsing tool for AI agents 💫 ✅ It’s completely open-source and free. ✅ No GPU required, will process ~500 pages in 2 seconds on commodity hardware ✅ More accurate than PyPDF, PyMuPDF, Markdown. Also way more readable - see Show more

LlamaIndex 🦙

@llama_index

We've spent years building LlamaParse into the most accurate document parser for production AI. Along the way, we learned a lot about what fast, lightweight parsing actually looks like under the hood. Today, we're open-sourcing a light-weight core of that tech as LiteParse 🦙

4:19 PM · Mar 19, 2026

1.3K

Read 30 replies

LiteParse ships as a one-line “skill” install for multiple agent runtimes

LiteParse skills packaging (LlamaIndex): LlamaIndex also packaged LiteParse as an installable skill so teams can add it to many agent environments with a single command (using an npm “skills add” convention); the flow is shown in the Skills install command alongside a Claude Code walkthrough in the Skills install command.

The concrete detail is the install string—npx skills add run-llama/llamaparse-agent-skills --skill liteparse—plus the claim that it plugs into 46+ agents via the same mechanism, as stated in the Skills install command.

Jerry Liu

@jerryjliu0

LiteParse is our free, blazing-fast document parser that you can plug into 46+ different agents - with one command 🔥 From Claude Code to OpenClaw to Cursor to Warp. Use liteparse to solve a task directly or read docs as context to write code. All you have to do is `npx skills Show more

Tuana

@tuanacelik

We just open-sourced LiteParse 🎉 A lightweight, local document parser in the shape of an easy-to-use CLI. No API calls, no external service, no cloud dependency. Just fast text extraction from common file formats, right from your terminal. It's built for developers who want

12:34 AM · Mar 20, 2026

Read 2 replies

PaddleOCR web service boosts throughput with async parsing and higher limits

PaddleOCR website (PaddlePaddle/Baidu): PaddleOCR’s hosted UI/API got a throughput-focused update—10,000 free pages/day for individuals, a new async parsing mode for long documents/heavy jobs, and support for files up to 1,000 pages (plus higher concurrency/batch workflows), per the Upgrade announcement.

The same post ties this to agent workflows by noting PaddleOCR “Skills” are already present in ClawHub/OpenClaw ecosystems, framing the update as a higher-throughput ingestion option for document-heavy agent pipelines, as described in the Upgrade announcement.

PaddlePaddle

@PaddlePaddle

🚀 Big Upgrade: PaddleOCR Website Just Got a Major Boost! More pages. Faster parsing. Better batch workflows. The latest PaddleOCR website update is built for real-world document workloads — from long PDFs to high-volume processing. What’s new 📄 10,000 free pages/day for Show more

12:30 PM · Mar 19, 2026

143

💼 Enterprise product moves: health agents, agentic browsers, and infra capital raises

Business and enterprise signals tied to deployable AI products: vertical agent suites, adoption anecdotes, accelerators, and major funding discussions. Excludes OpenAI↔Astral acquisition (covered separately).

Perplexity Health adds dashboards and dedicated agents for connected personal health data

Perplexity Health (Perplexity): Perplexity rolled out a dedicated Health experience for Pro and Max users in the US, combining a health data dashboard with purpose-built “Health Agents,” as shown in the product launch demo; it’s positioned as an agent layer over connected records and wearable data, with Perplexity emphasizing grounding in higher-quality sources over generic SEO health content in the feature breakdown.

• Data connectivity surface: The integration is described as connecting to Apple Health, EHRs from “over 1.7M care providers,” and wearables like Fitbit/Withings (ŌURA “expected soon”), per the feature breakdown.
• Workflow outputs: Perplexity’s own examples include generating a custom marathon training protocol and visit-prep summaries from connected data, as demonstrated in the workflow demo.

The core engineering takeaway is an agent product that treats personal data connectors + citations as first-class UX primitives, rather than bolting them onto a general search/chat interface.

BREAKING 🚨: Perplexity has launched Perplexity Health for Pro and Max users in the US! The new Health experience includes health data dashboards and dedicated Health Agents to help you achieve various health-related goals. Bloomberg terminal but for health 👀

Perplexity

@perplexity_ai

Now rolling out Perplexity Pro and Max subscribers in the US. perplexity.ai/health

4:35 PM · Mar 19, 2026

385

Read 6 replies

Coinbase describes “Oracle” internal agents wired into Slack, Docs, and Salesforce

Internal enterprise agents (Coinbase): Coinbase CEO Brian Armstrong described internally hosted agents connected to “every Slack message, every Google Doc, and every Salesforce” dataset—framed as an “Oracle of Coinbase,” with the CEO using it for org-sensing queries like “what should I be aware of?” in the CEO anecdote clip.

• Interaction pattern: The most notable workflow detail is “reverse prompting”—asking the agent what you should be thinking about instead of specifying a task, as described in the CEO anecdote clip.

There aren’t implementation details (indexing, access controls, retention), but it’s a clean signal that agent utility is expanding from drafting/summarizing into internal discovery and executive monitoring use cases.

Rohan Paul

@rohanpaul_ai

Coinbase CEO, Brian Armstrong: Some great insights on how they are using internally hosted AI Agents. "It’s connected to every Slack message, every Google Doc, and every Salesforce data confluence. Now, this is all linked up and the data is all aggregated, so you can ask these Show more

9:32 AM · Mar 19, 2026

717

Read 41 replies

Fal reportedly discusses $300M–$350M raise at ~$8B valuation on inference demand

fal (fundraising): fal is reported to be in discussions to raise $300M–$350M at an approximately $8B valuation, with the pitch framed around demand for fast inference infrastructure, per the fundraise excerpt.

The tweet ties the raise to “inference” as a growth driver (model execution at customer-facing latency), but doesn’t include terms beyond headline numbers or any throughput/capacity disclosures in the fundraise excerpt.

Fal is In discussions to raise $300M to $350M at an $8B valuation according to The Information. Fal and raise 👀

Techmeme

@Techmeme

Source: Fal, a GenAI model hosting service, is in talks to raise $300M to $350M at an $8B valuation; annualized revenue has hit $400M, up from $200M in October (@katie_roof / The Information) theinformation.com/articles/video… techmeme.com/260319/p7#a260…

12:47 PM · Mar 19, 2026

Perplexity’s Comet AI browser hits the iOS App Store

Comet (Perplexity): Perplexity’s AI-native browser Comet is now available on the iOS App Store, per the App Store listing demo. This matters as a distribution step for “agentic browsing” beyond desktop betas—iOS availability changes how often a browser-agent gets used for real navigation vs occasional research sessions.

The tweets don’t include pricing, admin controls, or an API surface; what’s concrete today is the App Store launch itself and the positioning of Comet as an AI-first browsing front end in the App Store listing demo.

Wes Roth

@WesRoth

Perplexity has launched Comet, its highly anticipated AI-native web browser, on the iOS App Store.

Comet

@comet

Comet on iOS is here. Download now: apps.apple.com/us/app/comet-a…

10:00 AM · Mar 19, 2026

Alt-X launches a doc-to-spreadsheet agent for traceable financial models

Alt-X (financial modeling agent): Alt-X launched a workflow that ingests documents like an operating model or 10‑K and generates an editable financial model where numbers remain linked to sources—positioning itself as “the Cursor for Excel,” per the product demo.

The material claim is traceability (“every number links back to its source”) and editability (“every change stays under your control”), as shown in the product demo; there’s no disclosed evaluation method or error-rate data in the tweets.

Chubby♨️

@kimmonismus

Introducing Alt-X — the Cursor for Excel. Upload an OM, 10-K, or term sheet, and watch your model build itself. Every number links back to its source. Every change stays under your control. No hallucinations. No broken formulas. Just traceable, editable financial modeling. Show more

Y Combinator

@ycombinator

Alt-X (@downloadaltx) builds AI agents that turn real estate deal documents into fully built underwriting models in Excel automatically, with every number cited back to the source. Congrats on the launch, @SamadiRyan and Michael! ycombinator.com/launches/PjC-a…

9:33 PM · Mar 19, 2026

118

Vercel’s 2026 AI Accelerator: 39 startups backed by ~$8M in infra and model credits

Vercel AI Accelerator (Vercel): Vercel published its 2026 AI Accelerator cohort—39 startups supported with ~$8M in credits from partners including AWS, Anthropic, OpenAI, and ElevenLabs, as announced in the cohort post and detailed in the cohort announcement.

The announcement is light on technical requirements (deployment patterns, eval gating, security posture), but it’s a clear go-to-market signal: Vercel is bundling infra + model credits as a standardized runway for agent-first products.

Vercel Developers

@vercel_dev

39 startups are building alongside Vercel for six weeks. Backed by $8M in credits from AWS, Anthropic, OpenAI, ElevenLabs, and more, they have the resources to scale their infrastructure. Meet the 2026 Vercel AI Accelerator cohort. vercel.com/blog/2026-verc…

5:29 PM · Mar 19, 2026

Read 9 replies

🔷 Gemini dev UX & pricing friction (Ultra refunds, CLI stability, desktop app testing)

Developer sentiment and product signals around Gemini: reliability complaints, refund policy frustration, and signs of a dedicated desktop app and “Build with Gemini” enterprise feature path. Excludes AI Studio build-mode upgrades.

Google AI Ultra churn: Gemini 3.1 Pro inconsistency + Gemini CLI crashes

Google AI Ultra (Google): A developer reports canceling the $250/month Ultra subscription due to day-to-day reliability issues—calling out Gemini 3.1 Pro as “inconsistent” and the Gemini CLI as crashing mid-session, plus context loss in Google’s bundled coding agent (“Antigravity”) per the cancellation post. This is not a benchmark argument as much as a workflow one. Stability is the product.

The same thread frames the decision as a stack swap—keeping Claude Opus 4.6 and GPT‑5.4 while dropping Gemini, which is a useful signal for anyone forecasting tool consolidation risk across coding agents and CLI-first workflows.

BridgeMind

@bridgemindai

Just cancelled my $250/month Google AI Ultra subscription. Gemini 3.1 Pro is inconsistent. Gemini CLI crashes mid-session. Antigravity loses context on complex tasks. $250/month for a model I can't rely on. Not worth it. Claude Opus 4.6 and GPT 5.4 stay on the stack. Show more

11:36 AM · Mar 19, 2026

797

Read 133 replies

Google begins testing a dedicated Gemini Mac desktop app

Gemini desktop app (Google): Bloomberg-reported testing suggests Google is working on a dedicated Gemini Mac app to compete more directly with ChatGPT and Claude desktop surfaces, as surfaced in the Bloomberg snippet. This is a packaging shift. It matters because desktop apps can hold richer local context (files, shells, long-running tasks) than web-only chat.

The tweet doesn’t include rollout dates or whether Windows is planned, but it’s a clear signal Google wants an “installed” Gemini surface rather than relying solely on browser-based usage, per the Bloomberg snippet.

Google has begun testing a dedicated Gemini desktop app as reported by Bloomberg. h/t @M1Astra

TestingCatalog News 🗞

@testingcatalog

I was really worried since yesterday if I missed anything 🫡 💻

11:36 PM · Mar 19, 2026

134

Build with Gemini spotted in Gemini Business UI alongside Skills and Projects

Gemini Business/Enterprise (Google): A “Build with Gemini” entry shows up in an enterprise UI sidebar next to Skills and Projects, with positioning language like “Architect, prototype, and refine enterprise-grade applications,” per the enterprise UI screenshot. This points toward Google treating “Skills” as a first-class enterprise primitive (not only consumer chat features).

The screenshot also shows multiple agent-like presets (“Co‑Scientist,” “Idea Generation,” “Talk To Doc”) marked Preview, implying a product direction where Gemini is packaged as a workspace of specialized modes plus app-building surfaces, per the enterprise UI screenshot.

BREAKING 🚨: Google is working on a new "Build with Gemini" feature for Gemini Business, as well as on Skills support. Skills implementation has been spotted in the consumer version as well. "Architect, prototype, and refine enterprise-grade applications in minutes."

Bedros Pamboukian

@bedros_p

I think Gemini can get even more skilled

2:26 PM · Mar 19, 2026

207

Read 6 replies

Google AI Ultra cancellations: no prorated refunds policy sparks backlash

Google AI Ultra (Google): A separate complaint is about billing mechanics, not model quality—after canceling Ultra, the user points to a “No refunds will be issued” notice and says Google won’t prorate the remaining period, as shown in the refund policy screenshot. This is a commercial friction point that can amplify churn once reliability concerns start.

The underlying issue is straightforward: with usage-based workloads (CLI sessions, long agent runs), teams expect some alignment between perceived service quality and the subscription’s risk allocation, and the thread frames Google’s policy as asymmetric when the product “doesn’t deliver,” per the refund policy screenshot.

BridgeMind

@bridgemindai

Google won't refund a single dollar on my $250/month AI Ultra subscription. I paid for what I thought was a reliable, premium AI product. Gemini 3.1 Pro ranked #22 on BridgeBench. Gemini CLI crashes mid-session. Antigravity can't hold context. $250/month and they won't Show more

12:34 PM · Mar 19, 2026

329

Read 62 replies

Gemini voice UX fix: Android mic mode no longer cuts off on pauses

Gemini voice input (Google): Gemini’s mic mode on Android now continues listening through short pauses instead of cutting users off, as shown in the Android mic demo. A small change, but it affects high-frequency usage.

Google’s own repost notes iOS is expected “in a few weeks,” per the rollout note, which implies this is an actively maintained UX path rather than a one-off fix.

AshutoshShrivastava

@ai_for_success

The Gemini app just got another solid update. While using the mic button, it won’t cut off when you pause.

Josh Woodward

@joshwoodward

✅ Papercut fixed: Gemini won’t cut you off if you pause while talking on Android anymore. (iOS in a few weeks!) So next time you hit the mic icon, feel free to pause, take a breath, or ramble. No more anxiety to speak it all out before @GeminiApp jumps in prematurely.

1:40 AM · Mar 20, 2026

💻 Local models in practice: cheap web-search tool use and DIY fine-tuning

Hands-on local model workflows: running tool-using models on constrained hardware, fine-tuning via desktop UIs, and quant/streaming tricks. Excludes hosted serving stacks.

Qwen3.5 397B-A17B on an M3 Mac via SSD-streamed MoE weights; 4-bit restored tool calls

On-device MoE inference (Qwen): A reported setup has Qwen 3.5 397B-A17B (MoE) running on an M3 Mac by quantizing and streaming weights from SSD, reaching ~5.7 tokens/sec while keeping only ~5.5GB active in memory, as described in the on-device MoE notes.

A key practical detail: 2-bit quantization broke tool calling, while moving to 4-bit restored tool calls at ~4.36 tokens/sec, per the on-device MoE notes.

Simon Willison

@simonw

Replying to @simonw

Dan found that the 2-bit quantization broke tool calling but upgrading to 4-bit (at 4.36 tokens/second) got that working

Dan Woods

@danveloper

You bet. Literally, "tool calling" became the metric that got us back to Q4. Q2 was really great conversationally and very capable, but it's like running the model at temperature 10,000 for anything predictable.

6:03 PM · Mar 19, 2026

Read 8 replies

Qwen3.5-4B does local web search + citations via tool calls (Unsloth workflow)

Qwen3.5-4B (Unsloth): A concrete “small model, real tools” recipe circulated today—Qwen3.5-4B running locally in ~4GB RAM, executing tool calls + DuckDuckGo search during its reasoning trace and returning cited answers, as demonstrated in the workflow description.

The same thread notes this was built with a 4-bit GGUF model plus ddgs and the DuckDuckGo API, and that full precision improves results, per the workflow description.

Unsloth AI

@UnslothAI

Qwen3.5-4B searched 20+ websites, cited its sources, and found the best answer! 🔥 Try this locally with just 4GB RAM via Unsloth Studio. The 4B model did this by executing tool calls + web search directly during its thinking trace. Show more