Together AI launched a single-cloud stack for realtime voice agents that hosts Deepgram, Cartesia, MiniMax, and other voice components on one platform. Use it to cut latency and deployment overhead if you want one billing surface for production voice apps.

Together's launch is a unified runtime for real-time voice agents: speech-to-text, LLM inference, and text-to-speech run on one cloud instead of hopping across separate vendors. In the announcement thread, the company says the practical change for builders is co-location, model swapping across the stack, and one surface for billing, deployment, and access.
The first-party and partner lineup is broader than a single STT/TTS pair. Together's voice stack diagram shows Cartesia, MiniMax, Rime, Deepgram, Whisper, Voxtral, Kokoro, and Orpheus connected to the same "AI native cloud for voice," while Cartesia's post says Cartesia is now a dedicated model partner and the Deepgram note confirms Deepgram STT is hosted natively on Together infrastructure.
The engineering pitch is fewer network boundaries. Together's blog post says most current voice systems are "stitched together across vendors," which adds latency and operational overhead as audio and tokens move over the internet between STT, LLM, and TTS services. Its replacement is a modular but co-located stack, and the company says that gets end-to-end latency below 700 ms for live conversations.
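The latency argument is simple arithmetic: model inference time is roughly fixed, so the savings come from replacing internet round trips between vendors with intra-datacenter hops. A minimal sketch, with all numbers illustrative assumptions rather than Together's published figures:

```python
# Hypothetical latency budget: why co-locating STT -> LLM -> TTS helps.
# All per-stage and per-hop numbers below are illustrative assumptions.

def pipeline_latency(stt_ms: int, llm_ms: int, tts_ms: int,
                     hop_ms: int, hops: int) -> int:
    """End-to-end latency: model time plus network hops between stages."""
    return stt_ms + llm_ms + tts_ms + hop_ms * hops

# Multi-vendor: audio and tokens cross the public internet between stages.
multi_vendor = pipeline_latency(150, 300, 120, hop_ms=60, hops=4)  # 810 ms
# Co-located: same stages, but hops stay inside one datacenter.
co_located = pipeline_latency(150, 300, 120, hop_ms=2, hops=4)     # 578 ms

print(multi_vendor, co_located)
```

With identical model times, cutting four 60 ms internet hops down to 2 ms internal hops removes over 200 ms, which is the kind of margin that moves a pipeline under a sub-700 ms conversational target.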
That matters operationally as much as interactively. The same product post says the platform exposes unified API access, security controls including zero data retention and SOC 2 Type II support, and deployment options aimed at enterprise voice workloads. Meanwhile, MiniMax's update shows Together is treating the stack as a multi-model platform rather than a fixed pipeline: MiniMax Speech 2.6 Turbo has already been added alongside Deepgram and Cartesia, which makes the "swap models" claim more concrete.
Vercel Emulate added a programmatic API for creating, resetting, and closing local GitHub, Vercel, and Google emulators inside automated tests. That makes deterministic integration tests easier to wire into CI and agent loops without manual setup.
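The value of a programmatic create/reset/close lifecycle is that each test starts from a known-clean emulator state without manual setup. A hedged sketch of that pattern, using entirely hypothetical names (`FakeEmulator`, `run_with_emulator`), since the announcement summary does not show Vercel Emulate's actual API:

```python
# Hypothetical sketch of a create/reset/close emulator lifecycle in a test.
# FakeEmulator and run_with_emulator are invented names, not Vercel Emulate's API.

class FakeEmulator:
    """Stand-in for a local service emulator with deterministic state."""
    def __init__(self, service: str):
        self.service = service
        self.state: dict = {}

    def reset(self) -> None:
        self.state.clear()   # return to a known-clean baseline between tests

    def close(self) -> None:
        self.state = None    # release the local process/port


def run_with_emulator(service: str, test_fn):
    """Create an emulator, reset it for determinism, and always close it."""
    emu = FakeEmulator(service)
    try:
        emu.reset()          # deterministic starting point for each test
        return test_fn(emu)
    finally:
        emu.close()          # never leak emulators across CI runs


ok = run_with_emulator("github", lambda emu: emu.state == {})
print(ok)  # True
```

The same lifecycle slots naturally into pytest fixtures or an agent loop, where the reset step is what makes repeated runs deterministic.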
OpenClaw shipped version 2026.3.22 with ClawHub, OpenShell plus SSH sandboxes, side-question flows, and more search and model options, then followed with a 2026.3.23 patch. Teams get a broader plugin surface, but should patch quickly and review plugin trust boundaries as the ecosystem grows.
Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
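The general technique behind this kind of index is trigram-based candidate filtering: decompose the query's required literals into n-grams, intersect their posting lists, and only run the real regex scan over the surviving files. A minimal sketch of that idea (Cursor's actual index layout, Bloom-filter placement, and on-disk format are not described in the item and are assumptions here):

```python
# Minimal trigram inverted index for candidate filtering, the general
# technique Instant Grep-style tools use; details of Cursor's real index
# are assumptions, not confirmed by the announcement.
from collections import defaultdict

def trigrams(s: str) -> set[str]:
    """All length-3 substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

def build_index(files: dict[int, str]) -> dict[str, set[int]]:
    """Inverted index: trigram -> set of file ids containing it."""
    index = defaultdict(set)
    for fid, text in files.items():
        for gram in trigrams(text):
            index[gram].add(fid)
    return index

def candidates(index: dict[str, set[int]], literal: str) -> set[int]:
    """Files containing every trigram of the literal; only these need a full regex scan."""
    sets = [index.get(g, set()) for g in trigrams(literal)]
    return set.intersection(*sets) if sets else set()

files = {
    1: "def instant_grep():",
    2: "bloom filters are neat",
    3: "grep is slow on big repos",
}
idx = build_index(files)
print(sorted(candidates(idx, "grep")))  # [1, 3]
```

In a production index the posting-list intersection is typically fronted by a Bloom filter per shard, so most non-matching files are rejected without touching the inverted index at all; the false-positive cost is just a wasted candidate scan.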
ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.
Epoch AI says GPT-5.4 Pro elicited a publishable solution to one 2019 conjecture in its FrontierMath Open Problems set, with a formal writeup planned. Treat it as an early milestone worth reproducing, not blanket evidence that frontier models can already automate math research.
Today, Together AI is launching a unified solution for building real-time voice agents with the entire pipeline running on one cloud. AI natives can now deploy voice apps for every use case at production scale.
The world’s leading AI infrastructure platforms are converging on the same voice model 🔥 Excited to announce that Cartesia is now a dedicated model partner on @togethercompute's Voice Platform for the 450K+ teams and developers building on Together.
Real-time voice agents are getting fast enough to feel conversational🎙️ MiniMax Speech 2.6 Turbo is now part of the voice stack on @togethercompute
Most voice stacks today are stitched together across vendors. Together puts the whole pipeline in one place for natural, real-time conversation. Here is how it works: together.ai/blog/build-rea…