Together AI launched a single-cloud stack for realtime voice agents that hosts Deepgram, Cartesia, MiniMax, and other voice components on one platform. Use it to cut latency and deployment overhead if you want one billing surface for production voice apps.

Together's launch is a unified runtime for real-time voice agents: speech-to-text, LLM inference, and text-to-speech run on one cloud instead of hopping across separate vendors. In the announcement thread, the company says the practical change for builders is co-location, model swapping across the stack, and one surface for billing, deployment, and access.
The first-party and partner lineup is broader than a single STT/TTS pair. Together's voice stack diagram shows Cartesia, MiniMax, Rime, Deepgram, Whisper, Voxtral, Kokoro, and Orpheus connected to the same "AI native cloud for voice," while Cartesia's post says Cartesia is now a dedicated model partner and the Deepgram note confirms Deepgram STT is hosted natively on Together infrastructure.
The engineering pitch is fewer network boundaries. Together's blog post says most current voice systems are "stitched together across vendors," which adds latency and operational overhead as audio and tokens move over the internet between STT, LLM, and TTS services. Its replacement is a modular but co-located stack, and the company says that gets end-to-end latency below 700 ms for live conversations.
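The latency argument is simple arithmetic: model inference time is roughly fixed, so the savings come from replacing internet round trips between vendors with intra-datacenter hops. A minimal sketch, with all numbers illustrative assumptions rather than Together's published figures:

```python
# Hypothetical latency budget: why co-locating STT -> LLM -> TTS helps.
# All per-stage and per-hop numbers below are illustrative assumptions.

def pipeline_latency(stt_ms: int, llm_ms: int, tts_ms: int,
                     hop_ms: int, hops: int) -> int:
    """End-to-end latency: model time plus network hops between stages."""
    return stt_ms + llm_ms + tts_ms + hop_ms * hops

# Multi-vendor: audio and tokens cross the public internet between stages.
multi_vendor = pipeline_latency(150, 300, 120, hop_ms=60, hops=4)  # 810 ms
# Co-located: same stages, but hops stay inside one datacenter.
co_located = pipeline_latency(150, 300, 120, hop_ms=2, hops=4)     # 578 ms

print(multi_vendor, co_located)
```

With identical model times, cutting four 60 ms internet hops down to 2 ms internal hops removes over 200 ms, which is the kind of margin that moves a pipeline under a sub-700 ms conversational target.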
That matters operationally as much as interactively. The same product post says the platform exposes unified API access, security controls including zero data retention and SOC 2 Type II support, and deployment options aimed at enterprise voice workloads. Meanwhile, MiniMax's update shows Together is treating the stack as a multi-model platform rather than a fixed pipeline: MiniMax Speech 2.6 Turbo has already been added alongside Deepgram and Cartesia, which makes the "swap models" claim more concrete.
Vercel Emulate added a programmatic API for creating, resetting, and closing local GitHub, Vercel, and Google emulators inside automated tests. That makes deterministic integration tests easier to wire into CI and agent loops without manual setup.
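The value of a programmatic create/reset/close lifecycle is that each test starts from a known-clean emulator state without manual setup. A hedged sketch of that pattern, using entirely hypothetical names (`FakeEmulator`, `run_with_emulator`), since the announcement summary does not show Vercel Emulate's actual API:

```python
# Hypothetical sketch of a create/reset/close emulator lifecycle in a test.
# FakeEmulator and run_with_emulator are invented names, not Vercel Emulate's API.

class FakeEmulator:
    """Stand-in for a local service emulator with deterministic state."""
    def __init__(self, service: str):
        self.service = service
        self.state: dict = {}

    def reset(self) -> None:
        self.state.clear()   # return to a known-clean baseline between tests

    def close(self) -> None:
        self.state = None    # release the local process/port


def run_with_emulator(service: str, test_fn):
    """Create an emulator, reset it for determinism, and always close it."""
    emu = FakeEmulator(service)
    try:
        emu.reset()          # deterministic starting point for each test
        return test_fn(emu)
    finally:
        emu.close()          # never leak emulators across CI runs


ok = run_with_emulator("github", lambda emu: emu.state == {})
print(ok)  # True
```

The same lifecycle slots naturally into pytest fixtures or an agent loop, where the reset step is what makes repeated runs deterministic.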
OpenClaw shipped version 2026.3.22 with ClawHub, OpenShell plus SSH sandboxes, side-question flows, and more search and model options, then followed with a 2026.3.23 patch. Teams get a broader plugin surface, but should patch quickly and review plugin trust boundaries as the ecosystem grows.
Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
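The general technique behind this kind of index is trigram-based candidate filtering: decompose the query's required literals into n-grams, intersect their posting lists, and only run the real regex scan over the surviving files. A minimal sketch of that idea (Cursor's actual index layout, Bloom-filter placement, and on-disk format are not described in the item and are assumptions here):

```python
# Minimal trigram inverted index for candidate filtering, the general
# technique Instant Grep-style tools use; details of Cursor's real index
# are assumptions, not confirmed by the announcement.
from collections import defaultdict

def trigrams(s: str) -> set[str]:
    """All length-3 substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

def build_index(files: dict[int, str]) -> dict[str, set[int]]:
    """Inverted index: trigram -> set of file ids containing it."""
    index = defaultdict(set)
    for fid, text in files.items():
        for gram in trigrams(text):
            index[gram].add(fid)
    return index

def candidates(index: dict[str, set[int]], literal: str) -> set[int]:
    """Files containing every trigram of the literal; only these need a full regex scan."""
    sets = [index.get(g, set()) for g in trigrams(literal)]
    return set.intersection(*sets) if sets else set()

files = {
    1: "def instant_grep():",
    2: "bloom filters are neat",
    3: "grep is slow on big repos",
}
idx = build_index(files)
print(sorted(candidates(idx, "grep")))  # [1, 3]
```

In a production index the posting-list intersection is typically fronted by a Bloom filter per shard, so most non-matching files are rejected without touching the inverted index at all; the false-positive cost is just a wasted candidate scan.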
ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.
Epoch AI says GPT-5.4 Pro elicited a publishable solution to one 2019 conjecture in its FrontierMath Open Problems set, with a formal writeup planned. Treat it as an early milestone worth reproducing, not blanket evidence that frontier models can already automate math research.
Today, Together AI is launching a unified solution for building real-time voice agents with the entire pipeline running on one cloud. AI natives can now deploy voice apps for every use case at production scale.
The world’s leading AI infrastructure platforms are converging on the same voice model 🔥 Excited to announce that Cartesia is now a dedicated model partner on @togethercompute's Voice Platform for the 450K+ teams and developers building on Together.
Real-time voice agents are getting fast enough to feel conversational🎙️ MiniMax Speech 2.6 Turbo is now part of the voice stack on @togethercompute
Most voice stacks today are stitched together across vendors. Together puts the whole pipeline in one place for natural, real-time conversation. Here is how it works: together.ai/blog/build-rea…