Google launched Gemini 3.1 Flash Live in AI Studio, the API, and Gemini Live with stronger audio tool use, lower latency, and 128K context. Voice-agent teams should benchmark quality, latency, and thinking settings before switching.

Google's launch details describe Gemini 3.1 Flash Live as its fastest native realtime model for building agents, with 70 languages, video streaming, audio transcriptions, 128K context, and generated audio watermarked with SynthID. The same post points developers to the Live API docs and shows the SDK surface for client.aio.live.connect using gemini-3.1-flash-live-preview with audio response modalities.
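Based on the SDK surface the post shows, a minimal connection sketch with the google-genai Python SDK might look like the following. The model name and audio response modality come from the launch material; everything inside the session (the greeting turn, the receive loop) is illustrative, not the announced API surface.

```python
# Minimal sketch of opening a Live API session with the google-genai SDK.
# MODEL and CONFIG come from Google's launch post; the session body is an
# illustrative assumption.
MODEL = "gemini-3.1-flash-live-preview"
CONFIG = {"response_modalities": ["AUDIO"]}

async def talk() -> None:
    # Deferred import so the sketch parses even where google-genai isn't installed.
    from google import genai

    client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment
    async with client.aio.live.connect(model=MODEL, config=CONFIG) as session:
        await session.send_client_content(
            turns={"role": "user", "parts": [{"text": "Hello there"}]}
        )
        async for message in session.receive():
            if message.data:  # raw audio bytes streamed back by the model
                pass  # hand off to your audio playback pipeline

# To run for real: asyncio.run(talk()) with a valid GEMINI_API_KEY.
```

The connect call is an async context manager, so the WebSocket session closes cleanly when the block exits.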
The product pitch is not just lower latency. DeepMind's announcement thread says the model is "better at completing tasks," handles noisy environments better, and can "follow long conversations" so users do not need to repeat themselves. The consumer Gemini app announcement adds that conversations can run through "2x longer" exchanges and that response length and tone adjust dynamically in session.
Google's benchmark post claims a "step function improvement in quality, reliability, and latency," and the biggest visible delta is audio tool use. Its chart shows 90.8% on ComplexFuncBench Audio versus 71.5% for Gemini 2.5 Flash Native Audio (12-2025) and 66.0% for the 09-2025 version. On speech reasoning, the same launch material shows 95.9% on Big Bench Audio with high thinking, behind only Step-Audio R1.1 at 97.0% and ahead of Grok Voice Agent at 92.9%.
Artificial Analysis' benchmark fills in the operational cost of that gain. With thinking set to high, it measured 95.9% on Big Bench Audio and 2.98 seconds time to first audio (TTFA); with minimal thinking, the model drops to 70.5% but improves to 0.96 seconds TTFA, which its speed data calls the sixth-fastest result on the speech-to-speech leaderboard. Artificial Analysis also says pricing stayed flat versus Gemini 2.5 Flash Native Audio Dialog at $0.35 per hour of audio input and $1.38 per hour of audio output, excluding reasoning tokens.
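At those rates, per-session cost is easy to estimate. The sketch below uses the quoted $0.35/hr input and $1.38/hr output audio pricing and excludes reasoning tokens, matching the quote; the 50/50 split of talk time between user and model is an illustrative assumption.

```python
# Rough session cost at the per-hour audio rates Artificial Analysis quotes
# for Gemini 3.1 Flash Live. Reasoning tokens are excluded, as in the quote.
INPUT_RATE_PER_HOUR = 0.35   # user audio in
OUTPUT_RATE_PER_HOUR = 1.38  # model audio out

def session_cost(total_minutes: float, user_share: float = 0.5) -> float:
    """Estimated dollar cost of a voice session of total_minutes of audio."""
    input_hours = total_minutes * user_share / 60
    output_hours = total_minutes * (1 - user_share) / 60
    return input_hours * INPUT_RATE_PER_HOUR + output_hours * OUTPUT_RATE_PER_HOUR

# A 10-minute call split evenly works out to roughly $0.14.
print(f"${session_cost(10):.4f}")
```

Because output audio costs roughly 4x input audio, sessions where the model does most of the talking will skew noticeably more expensive than this even-split estimate.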
Availability landed immediately across Google's own surfaces and partner tooling. LiveKit's support announcement says this is the first Gemini 3 native audio model on the Live API and highlights better instruction following, improved tool calling, reduced speaker drift, and support for 70-plus languages inside its agents framework.
Developers also spotted the model in AI Studio on launch day. The AI Studio listing labels gemini-3.1-flash-live-preview as a low-latency audio-to-audio model optimized for realtime dialogue with "acoustic nuance detection, numeric precision, and multimodal awareness," while TestingCatalog separately reported rollout across AI Studio, APIs, and Gemini Live. Together, that makes this a same-day launch across Google's consumer app, developer API, and at least one major voice-agent integration path.
Lyria 3 Pro and Lyria 3 Clip are now in the Gemini API and AI Studio, with Lyria 3 Pro priced at $0.08 per song and able to structure tracks into verses and choruses. That gives developers a clearer path to longer-form music features, with watermarking and prompt design built in.
Breaking: Anthropic said free, Pro, and Max users will hit 5-hour Claude session limits faster on weekdays from 5am to 11am PT, while weekly caps stay the same. Shift long Claude Code jobs off-peak and watch prompt-cache misses.
Release: OpenAI rolled out Codex plugins across the app, CLI, and IDE extensions, with app auth, reusable skills, and optional MCP servers. Teams should test plugin-backed workflows and permission models before broad rollout.
Release: Cline launched Kanban, a local multi-agent board that runs Claude, Codex, and Cline CLI tasks in isolated worktrees with dependency chains and diffs. Teams can use it as a visual control layer for parallel coding agents on repo chores that split cleanly.
Release: Mistral released open-weight Voxtral TTS with low-latency streaming, voice cloning, and cross-lingual adaptation, and vLLM Omni shipped day-0 support. Voice-agent teams should compare quality, latency, and serving cost against closed APIs.
Google has released Gemini 3.1 Flash Live Preview, achieving #2 in our Big Bench Audio Speech to Speech model benchmark, and now features configurable thinking levels. With thinking level set to high, it scores 95.9% on Big Bench Audio, making it the second-highest scoring speech…
Try it out today: docs.livekit.io/agents/models/…
Say hello to Gemini 3.1 Flash Live. 🗣️ Our latest audio model delivers more natural conversations with improved function calling – making it more useful and informed. Here's what's new 🧵
We just launched Gemini 3.1 Flash Live! Our fastest, most natural real-time voice AI model for building Agents. - Scores 90.8% on ComplexFuncBench Audio for tool use. - 70 languages, video streaming, audio transcriptions, 128K context. - Comes with Agent Skill for building live…
Gemini 3.1 Flash live is now on AI Studio
gemini-3.1-flash-live-preview appears to now be available on Vertex! 3.1 Flash Live before 3.1 Flash though? lol