Gemini 3.1 Flash Live posts 90.8% function calling and 95.9% speech reasoning – Search Live rollout

Executive Summary

Google DeepMind is rolling out Gemini 3.1 Flash Live as a real-time audio model across Gemini Live, Search Live, and Google AI Studio; positioning is low-latency conversation plus stronger tool use via function calling, with claims of better performance in noisy settings and less “repeat yourself” drift over long chats. Shared charts report 90.8% function-calling accuracy on ComplexFuncBench (audio); a “Big Bench Audio” plot shows 95.9% speech reasoning for a Thinking High mode vs 70.5% for Thinking Minimal, implying a quality/latency trade-off, but the numbers circulate as vendor-posted graphics without independent reproduction.

DeepMind/Safety: a public manipulation-measurement toolkit ships alongside studies totaling 10,000 participants; results cite domain dependence (finance higher influence; health constrained by safeguards).
Runway/Multi‑Shot: Multi‑Shot app pushes prompt→scene construction (cuts, pacing, dialogue, SFX) over montage-style clips.
Research/Computer-use agents: CUA‑Suite drops ~55 hours of desktop demos across 87 apps plus 3.6M UI annotations—raw fuel for UI-driving agents.

Open questions: latency/pricing tiers for Flash Live in AI Studio; real-world tool-calling reliability outside curated evals; how safety measurement integrates into voice-first product surfaces.

Feature Spotlight

Gemini 3.1 Flash Live rolls out for real‑time voice agents (Gemini Live + Search Live)

Gemini 3.1 Flash Live is shipping across Gemini Live + Search Live with improved latency, reliability, and function calling—meaning creators can build and deploy real-time voice/vision agents without a bespoke audio stack.

Multiple posts center on Gemini 3.1 Flash Live as a practical, low-latency audio model for natural conversation and tool use—showing up both in consumer surfaces (Gemini Live / Search Live) and in Google AI Studio for builders.

🗣️ Gemini 3.1 Flash Live rolls out for real‑time voice agents (Gemini Live + Search Live)

Multiple posts center on Gemini 3.1 Flash Live as a practical, low-latency audio model for natural conversation and tool use—showing up both in consumer surfaces (Gemini Live / Search Live) and in Google AI Studio for builders.

Gemini 3.1 Flash Live ships for real-time voice agents across Gemini Live, Search Live, and AI Studio

Gemini 3.1 Flash Live (Google DeepMind): Google is rolling out Gemini 3.1 Flash Live as a real-time audio model focused on more natural back-and-forth plus stronger tool use via function calling, as described in the launch thread.

[Video: Gemini 3.1 Flash Live intro]

It’s now appearing in consumer surfaces—Gemini Live inside the Gemini app and Search Live—and also as a build target for developers inside Google AI Studio, as stated in the rollout note and detailed in the launch post.

Conversation reliability upgrades: DeepMind frames the practical gains as being better in noisy environments and better at tracking long conversations “so you don’t have to repeat yourself,” according to the launch thread.
Realtime agents framing: The model is pitched specifically for building voice-and-vision agents with improved quality and latency, per the “step function improvement” claim in the realtime model launch.
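For builders targeting this in Google AI Studio, the sketch below shows the general shape of wiring one declared tool into a Live API session with the google-genai Python SDK. It is a minimal sketch, not the launch’s own sample: the model id, the tool, and the stubbed result are placeholders, and the method names should be checked against the current Live API docs.

```python
# Minimal sketch: a Live API session with one declared tool, via the
# google-genai Python SDK. The model id is a placeholder -- the posts
# don't give the exact identifier for Gemini 3.1 Flash Live.
import asyncio
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

MODEL = "gemini-3.1-flash-live"  # hypothetical id; check AI Studio's model list

get_local_time = types.FunctionDeclaration(
    name="get_local_time",
    description="Return the current local time for a city (illustrative tool).",
    parameters=types.Schema(
        type=types.Type.OBJECT,
        properties={"city": types.Schema(type=types.Type.STRING)},
        required=["city"],
    ),
)

config = types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    tools=[types.Tool(function_declarations=[get_local_time])],
)

async def main() -> None:
    async with client.aio.live.connect(model=MODEL, config=config) as session:
        # Text stands in for streamed microphone audio to keep the sketch short.
        await session.send_client_content(
            turns=types.Content(
                role="user",
                parts=[types.Part(text="What time is it in Lisbon right now?")],
            )
        )
        async for message in session.receive():
            if message.tool_call:  # the model decided to call get_local_time
                for call in message.tool_call.function_calls:
                    await session.send_tool_response(
                        function_responses=[
                            types.FunctionResponse(
                                id=call.id,
                                name=call.name,
                                response={"time": "14:05 WET"},  # stubbed result
                            )
                        ]
                    )
            elif message.data:  # audio bytes of the spoken reply
                pass  # route these to your audio output of choice

asyncio.run(main())
```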

Gemini 3.1 Flash Live posts 90.8% on audio function calling in shared charts

Gemini 3.1 Flash Live (Google): Shared benchmark charts position Flash Live as a big step up for speech-to-tools reliability, with ComplexFuncBench (audio) function-calling accuracy at 90.8% in the benchmarks charts.

On the same post, a “Big Bench Audio” chart shows 95.9% speech reasoning for “Gemini 3.1 Flash Live (Thinking High),” placing it near the top of the compared set in the benchmarks charts. The charts also highlight a large gap between “Thinking High” and “Thinking Minimal” (70.5%), which implies an explicit quality/latency trade-off available to builders, as shown in the benchmarks charts.


🎬 Video generators move toward “directed scenes” (Runway Multi‑Shot, Seedance horror, Pika agents)

Today’s video chatter is less about raw model hype and more about structured scene-building: Runway’s Multi‑Shot examples, Seedance genre experiments, and Pika’s agentic ‘AI Selves’. Excludes Gemini 3.1 Flash Live (covered as the feature).

Runway’s Multi‑Shot App turns one prompt into a cut, paced scene

Multi‑Shot App (Runway): Runway introduced Multi‑Shot, a web app flow that takes a single prompt and outputs a structured scene with intentional cuts, pacing, cinematic framing, and support for dialogue + sound effects, as described in the Multi‑Shot announcement and surfaced via the web app. It’s being shown through short comedic “scripted moment” prompts—less montage, more scene construction—with examples like awkward character banter and therapy-room setups in the Squirrel and seagull clip and the Monster group therapy clip.

[Video: Multi‑Shot demo reel]

Prompt shapes shown today: Dialogue-first awkwardness in the Squirrel and seagull clip and premise-plus-turn structure in the Mice fishing argument suggest Multi‑Shot is optimized for punchline timing (beats, reaction cuts), not only visual spectacle.
Genre stretch beyond comedy: A darker, cinematic prompt is also showcased in the Toad potion shop scene, implying the same “multi-cut” scaffold can carry mood and tension, not just gags.

Pika’s “AI Selves” opens public beta on web and iOS

AI Selves (Pika): Pika says AI Selves are now in public beta—positioned as an “agentic extension” of a person—available at pika.me and via a new iOS app, per the Public beta announcement. The pitch is less “make a clip” and more “spawn a persistent persona,” which matters for creators building recurring characters and formats.

[Video: AI Selves beta demo]

A Seedance 2.0 elevator micro‑short shows horror pacing in 15 seconds

Seedance 2.0 (Dreamina/ByteDance): A tight horror beat—an elevator descent with escalating tension anchored by the line “What floor are you going to?”—is being used as a quick genre stress test for pacing, timing, and mood control in very short runtimes, as shown in the Elevator horror clip. It reads like a template for testing whether a model can hold suspense without “doing too much.”

[Video: Elevator horror microshort]

Kling’s sword‑duel sample spotlights motion and impact beats

Kling (Kuaishou): Kling posted a stylized sword-duel clip (“two cat masters face off”) as a motion/impact showcase, emphasizing readable contact moments (sparks on clash) and sustained choreography in a short vertical sequence, as shown in the Sword duel sample. For filmmakers, it’s a quick reference point for how Kling handles continuous action versus cut-heavy edits.

[Video: Sword duel clip]

🖼️ Image models get more “directable” (Uni‑1 info design, Midjourney style play, Nano Banana 4K)

The image side is heavy on controllability and aesthetics: Uni‑1’s typography/layout demos, Midjourney style-reference experiments, and Nano Banana’s resolution bump. Identity/likeness-specific tools are covered in the identity category.

UNI-1 posts clearer “do exactly this” localized edit examples

UNI-1 (Luma Labs): Following up on Directable pitch—multi-reference control—Luma published a set of four “Input 1 + explicit instruction + Output” panels that foreground what “directable” means in practice: keep a character reference stable while changing setting/aesthetic, execute precise material constraints in an architectural render, and even transform an entire landscape into an embroidered patch while preserving composition, as shown in the directable examples.

Prompt pattern to steal: The screenshots repeatedly use “Use image 1 as character reference…” plus tight art-direction cues (era, palette, lens/film grain) to reduce unintended drift, as demonstrated in the directable examples and described on the Model page.

Nano Banana Pro and Nano Banana 2 add a 4K output option

Nano Banana (image generation): Nano Banana Pro and Nano Banana 2 added 4K image generation, expanding beyond prior 1K/2K choices; the update is framed as a practical lever for better downstream AI video results when using images as keyframes or references, according to the 4K support note.

The UI screenshot shows the new 1K / 2K / 4K selector alongside model choice and batching controls, as visible in the 4K support note, and it also incidentally shows that policy filtering still applies at higher resolutions (a “Generation failed” panel appears in the same capture).

UNI-1 examples lean into dense infographics and readable type

UNI-1 (Luma Labs): Following up on Intelligent pitch—“intent-grounded visuals” positioning—Luma is now showing four concrete samples where UNI-1 keeps dense layouts readable (blueprints, timelines, atlas-style charts) rather than melting into texture, as highlighted in the info design claim and reinforced by the visual set.

The examples matter because they’re the kind of artifacts creatives actually ship—deck diagrams, world-building charts, editorial infographics—where legibility and hierarchy are the whole job; Luma’s claim is that UNI-1 can hold layout hierarchy + typography under heavy information density, per the info design claim and the details on the Model page.

Midjourney style ref 3272229711 targets Bob’s Burgers-style sitcom frames

Midjourney (style reference): Another widely copy-pasted code is --sref 3272229711, described as a 2D adult-sitcom animation look in the neighborhood of Bob’s Burgers / The Great North, with some BoJack-like touches, per the style description.

Because it’s tuned for “animation still” readability—flat shapes, clear silhouettes, simple lighting—it’s a practical house style for pitch frames, animatics, and quick scene boards where consistency beats painterly detail, as illustrated in the style description.

Midjourney style ref 2890513616 nails crude doodle linework on purpose

Midjourney (style reference): A new shared style reference, --sref 2890513616, is being used to deliberately produce “bad drawing” aesthetics—rough portraits and scribbly fan-art-like silhouettes—framed as “mimic the drawing style of an AI hater,” according to the sref callout.

This is useful when the goal is anti-polish (zines, comedic inserts, storyboard scratch frames, poster marginalia), because it makes “imperfect” a controllable look instead of a failure mode, as shown in the sref callout.


🪪 Likeness & “mini‑me” waves (Uni‑1 Pouty Pals, Phota look‑alikes, LoRA personas)

A big creator-facing theme is identity capture: chibi ‘Pouty Pals’ from Uni‑1, Phota’s “looks like you” photos, and persona LoRAs—focused on preserving recognizable likeness across outputs.

Phota Labs opens public access for look‑alike photo generation and edits

Phota (Phota Labs): Phota’s likeness-focused image model is now described as publicly available, with a pitch that it can generate new photos that “actually look like you (and your pets)” and also edit/enhance real photos (e.g., fix flaws, adjust expression), as laid out in the Public availability claim and the Access link. More detail on where to try it appears via the Studio page, with additional product/onramp framing in Studio and API mention.

What creators are doing with it: The shared use-cases include short-prompt “photo booth” generations, “style me” with reference images, and “make pro” portrait polishing, as described across Photo booth workflow, Style me reference workflow, and Make pro feature.

No pricing, evals, or model card is surfaced in the tweets; the evidence here is workflow-level and testimonial (e.g., “made hundreds of photos”).

UNI‑1 “Pouty Pal” mini‑me prompt becomes a copy‑paste format

UNI‑1 Pouty Pals (Luma): A specific “mini‑me” image format is spreading where UNI‑1 generates a 3D chibi version of you sitting in an open palm, with a second hand pressing the cheek into a pout; multiple creators are posting variants and re-sharing the exact prompt, which does a lot of the consistency work (scale, hand interaction, expression), as seen in the Exact prompt text and the examples in Pouty Pal output, Family Pouty Pal, and Chibi-fied self.

Prompt payload that matters: The repeating core is “big head small body,” “sitting on an open left palm,” and the “right hand gently pressing its cheek,” which bakes in an interaction constraint that’s harder for many models than a plain portrait, per the shared template in Exact prompt text.

The meme is less about style variety and more about a reliable likeness capture ritual that people can run on any clean selfie.

UNI‑1 Pouty Pal how‑to circulates with explicit privacy claims

UNI‑1 Pouty Pal (Luma): A step-by-step “make yours in ~60 seconds” recipe is being shared (select UNI‑1, upload a clear front-facing image, paste the prompt, and wait ~30 seconds), along with explicit claims about what happens to uploaded photos, per the walkthrough in 60-second how-to and the UI demo in Prompt steps demo.

[Video: Prompt steps on screen]

Privacy and rights language: The thread claims uploaded photos aren’t used to train models and that users keep rights to their likeness/output, framed as “Luma confirmed all of this,” according to Privacy claim.

Mechanical expectation-setting: The same thread positions the output as “hyper-detailed 3D chibi” with “realistic hands” and shallow DoF, matching the recipe described in Output description.

The only hard evidence here is the repeated reproducible prompt + outputs; the privacy assertions are presented as claims in-thread rather than linked to a formal policy artifact.

Home-trained persona LoRAs show up for LTX 2.3

LTX 2.3 LoRA (LTX): A creator reports training an LTX 2.3 LoRA for a specific character persona (George Costanza) locally on an RTX 5090 in about a day using “AI Toolkit,” framing it as a DIY path to reusable identity/character consistency, according to Home LoRA training note.

There aren’t settings, dataset size, or a reproducible notebook in the tweet, but it’s a clear signal that “character in a box” workflows are moving into consumer-hardware timeframes for some creators.

Phota’s “family photo” trick: separate likenesses, then compose in-editor

Phota (Phota Labs): A specific identity workflow is being highlighted where you train separate likenesses for multiple people/pets and then combine them inside Phota’s editor for “family photo” style results, which is positioned as preserving each subject’s details while blending them together, per the walkthrough in Multi-person and pets workflow and the broader capability framing in Model resembles you pitch.

The notable part is the composition step: instead of one prompt trying to juggle multiple identities, the workflow treats each identity as its own asset and then composes downstream.


🧩 Multi-tool creator pipelines (Kling→CapCut affiliate farms, Midjourney→Nano Banana→Seedance films)

The most actionable content is end-to-end pipelines: scripted persona channels, fast music video production stacks, and repeatable shotlist-style prompting. This section is explicitly about combining multiple tools into one production loop.

Clawdbot→Kling→CapCut turns one fake “expert” into a high-volume affiliate channel

Clawdbot + Kling + CapCut (workflow): A repeatable affiliate-content loop is being shared as “one consistent persona, endless hooks”—Clawdbot generates the expert persona + scripts, Kling generates the talking-head clips, and CapCut handles captions/retention editing, with affiliate links converting from the bio as described in the workflow breakdown.

What’s operationally new: The workflow emphasizes constant A/B testing at scale (“same character, different symptom… winners get scaled”) rather than polishing a single flagship asset, per the workflow breakdown.
Why it matters for creators: It’s an explicit “production line” model—persona consistency + modular hooks—so the creative work shifts toward scripting and thumbnail/hook iteration more than filming, as shown in the workflow breakdown.

Kling Motion Control + Suno workflow claims a music video in under 2 hours

Kling + Suno + edit pipeline (workflow): A creator claims a full “high-budget” music video can be produced in under 2 hours for about $200 by combining Kling for video generation (including Motion Control) with Suno for the track, then packaging prompts + a production breakdown for reuse, as outlined in the workflow offer.

[Video: Music video snippet]

Proof-of-process angle: The same thread includes a “we flew to Africa… didn’t actually fly” reveal that frames the approach as location/production-value simulation rather than on-set capture, as shown in the green screen reveal.
Where the full pack lives: The author points people to a Telegram group for the detailed workflow + prompts, per the Telegram group.

Seedance 2 prompting format: timecoded beats, then cut incoherent shots in Resolve

Seedance 2 (workflow pattern): A detailed template is being shared that treats video prompting like a shotlist—declare lens/stock/lighting + “face stable” constraints, then specify actions and camera moves in timecoded blocks (0–1s, 1–3s, …) and build multiple sequences that get trimmed in DaVinci Resolve, as described in the base template and expanded in the two-sequence example.

[Video: Timecoded prompt demo]

Why the structure helps: The example sequences split a scene into “Armor construction” and “Launch,” each with its own cinematic setup and negative constraints (no face morph, no armor floating), per the two-sequence example.
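As a concrete illustration of the pattern (not the creator’s exact template), here is a small helper that assembles a prompt from a fixed cinematic header plus timecoded beats and a negative list; the field names and wording are placeholders showing the structure, not official Seedance prompt syntax.

```python
# Illustrative only: build a shotlist-style prompt from a cinematic header
# plus timecoded beats, mirroring the structure described above. Phrasing
# and field names are placeholders, not a Seedance-specific schema.
HEADER = (
    "35mm anamorphic, Kodak 250D look, soft key light from camera left, "
    "handheld micro-shake, shallow depth of field, face stable, no morphing"
)

NEGATIVES = "no face morph, no armor floating, no extra limbs, no text overlays"

def timecoded_prompt(beats: list[tuple[str, str]]) -> str:
    """beats: list of (time_range, action/camera description) pairs."""
    lines = [HEADER, ""]
    for time_range, action in beats:
        lines.append(f"{time_range}: {action}")
    lines += ["", f"Negative: {NEGATIVES}"]
    return "\n".join(lines)

sequence_one = timecoded_prompt([
    ("0-1s", "close on gauntlet plates snapping shut around the forearm, slow push-in"),
    ("1-3s", "chest armor assembles piece by piece, camera orbits 30 degrees right"),
    ("3-5s", "helmet seals, eyes ignite, hold on the face for a beat"),
])
print(sequence_one)
```

Each sequence is generated separately, and the incoherent takes get trimmed in Resolve, per the workflow above.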

“Zero moat except taste” framing spreads as AI cloning gets faster

Creator moat discourse: A recurring claim is that product/creative execution is becoming trivially replicable—“run 3 prompts… makes a replica… open-source it”—leaving “taste” (creative direction and judgment) as the defensible edge, as argued in the moat quote.

The thread doesn’t provide the specific repo or implementation details, so treat it as a sentiment signal rather than a verified case study.

Claude Code as a bridge from network poking to a working device-control app

Claude Code (workflow): One creator shows an end-to-end loop from “hacking my network” to building a custom home automation app UI, positioning Claude Code as the agentic coding layer that turns device control ideas into working endpoints, per the build note.

[Video: Home automation app demo]

The demo shows a simple phone UI (e.g., Lights/Fan) tied to code functions, suggesting a pattern where creators build bespoke control surfaces for real-world setups instead of relying on off-the-shelf dashboards, as shown in the build note.


🧪 Copy‑paste aesthetics: SREFs, poster mockups, and cinematic shot language

Prompt culture remains a distinct beat: Midjourney SREF codes, brand mockup prompt blocks, and cinematic camera/lens phrasing. (NSFW prompt dumps in the raw feed are intentionally excluded.)

Nano Banana wall-poster mega-prompt standardizes high-end brand mockups

Nano Banana (prompt template): A full “citylight poster leaning on brutalist concrete wall” mega-prompt is circulating with unusually strict art-direction rules—brand-palette logic, blob-grid layout system, hero product masked inside a blob, and a hard “text embargo” (only tagline + wordmark), as laid out in the Wall poster mockup prompt and repeated in the Full prompt block.

The point is consistency: it bakes in wall material, dappled tree shadows, lens/DoF, and typography constraints so you can swap [BRAND NAME] and keep the same premium outdoor-OOH look across many renders.

Midjourney SREF 2885679472 pushes Wong Kar-wai-style motion blur

Midjourney (SREF 2885679472): A motion-blur aesthetic is being shared as a reusable “frozen frame from a moving film” look—heavy grain, glowing highlights, smeared motion—framed as “Wong Kar-wai shot through a dirty 35mm lens,” per the Motion blur sref writeup.

If you want the exact prompt scaffolding, it’s packaged in the prompt breakdown page, which also spells out common use cases (album covers, fashion editorials, sports campaigns).

Nano Banana 2 “collectible figurine render” prompt focuses on PBR realism

Nano Banana 2 (prompt recipe): A long, production-leaning prompt is being used to turn a 2D character concept into a hyper-real 3D CGI collectible—explicitly calling for physically based materials, Octane/V-Ray-like lighting language, and aggressive error repair (“Fix and reconstruct all hands and fingers… no fused digits”), per the Collectible figurine prompt.

It’s basically a “make it manufacturable” checklist for characters: preserve pose/outfit, remove 2D artifacts, enforce symmetry, and push micro-textures (weave, pores, scratches).

A Firefly macro template builds “world in a bottle” product shots

Adobe Firefly (prompt template): A copy-paste macro/product-photo prompt is being shared that generates a miniature world inside a transparent container (decanter/flask/lantern), with explicit lens (85mm), shallow DoF, internal self-lighting, and refraction rules, as written in the Container prompt share.

It’s structured as fill-in slots like [TRANSPARENT CONTAINER] and [COSMIC OR ATMOSPHERIC PHENOMENON], which makes it quick to iterate across themes.

Midjourney SREF 2873816195 leans into neon retrowave haze

Midjourney (SREF 2873816195): A “neon dream” look is being shared as a repeatable recipe—deep blue shadows, electric edge glow, soft blur and digital noise—positioned for flyers, indie game art, VR worlds, and sci‑fi promos in the Neon dream sref post.

The longer prompt setup is collected in the prompt breakdown page.

Midjourney SREF 3422279710 targets Art Nouveau “expensive” visuals

Midjourney (SREF 3422279710): A style reference is being pitched as a reliable way to get “expensive” Art Nouveau—hand-drawn texture, graceful linework, muted tones—optimized for covers/editorial/luxury visuals, as described in the Art Nouveau sref post.

The copy-paste base parameters and prompt guidance live in the prompt guide.

A cinematic shot-language prompt for Seedance leans on handheld realism cues

Seedance 2.0 (shot-language prompt): A single prompt example is being shared that leans heavily on film grammar—“handheld camera angle,” “realistic lens tremor,” “shallow depth of field,” and slow micro-actions—to keep a calm fantasy close-up stable (child reading while held by a dragon), as shown in the Prompted video example.

[Video: Child with dragon reading]

It’s a good example of prompting for pace and micro-motion (page turns, blinking, breathing) instead of big action beats.

Midjourney SREF 2543866241: jet trails and long-exposure motion abstractions

Midjourney (SREF 2543866241): A “jet trails tearing through the sky” SREF is being shared with example outputs that read like long-exposure aviation photography—streaked lights, twilight blues, and motion-smeared faces/terrain—per the Jet trails sref post.

It’s a compact code drop: the value is in the reference’s motion language, not a long prompt.

Midjourney SREF 8006572439 trends for teal neo-noir cyber visuals

Midjourney (SREF 8006572439): A “cold-tone cyber dark aesthetics” SREF is highlighted as the current most-liked style on a SREF-tracking site, with the post emphasizing teal/blue grading, volumetric fog, wet reflections, and neo-noir tension in the Sref popularity snapshot.

The same post links to the catalog entry on the Sref library page, but treat the ranking as site-specific (not a platform-wide Midjourney leaderboard).


🧼 Finishing passes that make GenAI footage usable (Topaz Starlight Precise 2.5)

Post and polish is a clear thread today: Topaz pushes realism/artifact reduction and 4K upscaling specifically for GenAI characters and footage, with integrations into popular creator toolchains.

Topaz ships Starlight Precise 2.5 for more realistic GenAI footage and 4K upscales

Starlight Precise 2.5 (Topaz Labs): Topaz launched Starlight Precise 2.5 to make GenAI characters/footage more usable—targeting the “plastic” look and common artifacts while upscaling outputs to 4K, with availability called out across Astra plus partner surfaces like HeyGen, Higgsfield, and ComfyUI in the launch post.

[Video: Precise 2.5 upscaling demo]

Where it runs: Topaz points creators to Astra for access, as described on the Astra app page, and frames the rollout as integration-friendly in the launch post.
Production signal: Topaz also says Precise 2.5 is live for API customers, per the API availability note.

Midjourney-to-Topaz finishing pass shows up in “Uncharted Life 2.0” example

Uncharted Life 2.0 (Topaz Labs community): A shared clip credits a Midjourney + Topaz stack as the finishing workflow—positioning Topaz as the last-mile step to clean up and present AI-generated visuals as a coherent, higher-fidelity sequence, as shown in the creator example.

[Video: Midjourney plus Topaz montage]

The post doesn’t disclose settings (model choice, strength, or any pre-denoise), so it reads as a “this is the polish stage” proof point rather than a replicable recipe.

Topaz Express upscaling becomes a lightweight social sharing loop for artworks

Topaz Express (Topaz Labs): A simple pattern is being promoted in art threads—share an image, run a Topaz Express upscale, then repost as the “finished” version—explicitly framed as “QT with your Eye art” in the prompt to participate.

This is less about heavy editorial control and more about consistent presentation quality for quick community sharing.


🧱 3D scenes & novel-view workflows (Freepik 3D Scenes, Wonder3D prompts)

3D-adjacent creative tooling shows up as “camera control without a full 3D pipeline,” plus lightweight 3D asset generation prompts for game/film prototyping.

Freepik launches 3D Scenes for AI-driven camera moves from a single image

3D Scenes (Freepik): Freepik is pitching 3D Scenes as an “AI 3D photo shoot” workflow—generate a full environment from any image, drop in objects, then move the camera like a real shoot while keeping lighting/detail coherent across angles, as shown in the 3D Scenes demo.

[Video: Camera move in generated scene]

Developers/teams who’ve avoided full 3D pipelines but still want repeatable coverage (product spins, ad plates, storyboards) now have a more direct “scene-first” interface, with the product entry point available via the 3D Scenes tool.

LagerNVS claims real-time neural novel-view synthesis without explicit 3D

LagerNVS (paper): A new novel view synthesis method, LagerNVS, is being highlighted as “fully neural real-time NVS,” with the paper page reporting 30+ FPS at 512×512 on a single H100 and a 31.4 PSNR on Re10k, as summarized in the paper page and shared via the demo clip.

[Video: Real-time view rotation demo]

For creative tooling, this points at a near-term path to “move the camera” experiences without a conventional reconstruction step—especially if paired with diffusion decoders for generative extrapolation (noted in the same paper page).

3D simulation plus diffusion gets framed as the new photo workflow

Hybrid photography pipeline: Linus Ekenstam frames the emerging workflow as “simulated inside a 3D environment, enhanced and generated by diffusion,” arguing it keeps the old affordances (camera control) while shedding the painful parts of traditional 3D pipelines, per his future of photography take.

[Video: 3D environment then diffusion demo]

This lands as a broader pattern around “camera-first” creation—where the scene representation becomes the stable substrate and diffusion becomes the renderer/filler—rather than the usual text-to-image loop.

Wonder3D prompting pattern: specify a vibe, then refine the mesh downstream

Wonder 3D (Autodesk Flow Studio): Flow Studio is sharing a prompt-driven loop where each Wonder3D prompt is treated as a “vibe selector” for 3D character direction, then the output is pushed through refinement steps like remeshing and retexturing (including export to other tools), as described in the Wonder3D prompt loop.

[Video: Multiple character vibes montage]

The emphasis is less “one perfect gen” and more “generate directions fast, then clean up where it matters”—which maps closely to game/previz prototyping workflows.


🧰 Creator studios consolidate: CapCut web studio, Pictory 2.0, and “all-in-one” editors

Several posts point to consolidation into single apps that cover scripting→editing→branding, especially on the web. This is about where creators do the work, not model internals.

Seedance 2.0 lands across Dreamina, CapCut Video Studio, and Pippit model menus

Seedance 2.0 (ByteDance): Seedance 2.0 is now visibly selectable across multiple ByteDance creation surfaces—Dreamina, CapCut’s web Video Studio, and Pippit—via in-product model menus that show different “Fast/+” variants and duration tiers (including <15s vs 15–90s), as documented in the model selector screenshots.

This matters for creators because it signals a single underlying video model being distributed through multiple “studio” front-ends (ads, social edits, short clips, and longer-form assembly), with the practical difference living in the UI constraints (duration, aspect ratio, and reference handling) rather than the model name alone.

Phota Studio opens public access for “photos that look like you” generation and edits

Phota Studio (PhotaLabs): Phota’s likeness-focused image model is being presented as publicly available now, with a workflow that includes a text-to-photo “photo booth,” multi-person/pet composition, style transfer from reference shoots, and edit tools like “unselfie”/expression fixes—framed as “a few things to try” in the public availability thread and expanded as a 5-part walkthrough in the photo booth example, multi-person composition, and editing on real photos.

Even without detailed pricing in the tweets, the product shape is clear: a consumer-friendly Studio surface first, with API access implied by the handoff in the team access note and a direct entry point in the Studio access page.

Pictory 2.0 ships Pictory Central, AI avatars, brand kits, and a new timeline

Pictory 2.0 (Pictory): Pictory is positioning Pictory 2.0 as a consolidated web studio—“Pictory Central” plus AI avatars, GenAI, Brand Kits, a new Timeline, and a Script Generator—with the feature bundle called out in the feature slate and the signup flow available via the try link in Signup page.

The concrete shift is packaging the whole loop (script → edit → consistent branding → avatar delivery) into one UI, instead of stitching together separate script tools, editors, and brand templates.


🧰 Builders’ corner for creatives: Claude toolchains, MCP servers, and agent integration layers

This slice is for creators who build: Claude Code ecosystems, MCP servers, and agent connectors that wire AI into real apps and workflows (distinct from art prompts and film tools).

Claude Mem adds persistent project memory to Claude Code sessions

Claude Mem (thedotmack): Claude Mem is being shared as a Claude Code plugin that captures activity from coding sessions, compresses it, and reinjects relevant context in later sessions—aimed at reducing “start from scratch” drift, per the repo roundup and the GitHub repo. It also claims progressive disclosure and a local web UI for inspecting what’s being stored, which matters if you’re trying to keep memory useful without blowing tokens.
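The tweets don’t include the plugin’s internals, but the capture→compress→reinject loop it describes has roughly the shape sketched below. This is a generic illustration of the idea, not Claude Mem’s actual code: store a short summary per session locally, then pull the most relevant summaries back in at the start of the next one.

```python
# Generic illustration of a capture -> compress -> reinject loop; NOT Claude
# Mem's implementation, just the shape of the idea described above.
import json
from pathlib import Path

STORE = Path(".project_memory.json")

def save_session_summary(summary: str, tags: list[str]) -> None:
    """Append a compressed note about what happened in this session."""
    entries = json.loads(STORE.read_text()) if STORE.exists() else []
    entries.append({"summary": summary, "tags": tags})
    STORE.write_text(json.dumps(entries, indent=2))

def recall(query: str, limit: int = 3) -> str:
    """Naive keyword scoring: return the most relevant past summaries to
    prepend to the next session (progressive disclosure would expand these
    only when the agent asks for more)."""
    entries = json.loads(STORE.read_text()) if STORE.exists() else []
    words = set(query.lower().split())
    def score(entry: dict) -> int:
        text = (entry["summary"] + " " + " ".join(entry["tags"])).lower()
        return len(words & set(text.split()))
    ranked = sorted(entries, key=score, reverse=True)
    return "\n".join(e["summary"] for e in ranked[:limit])

save_session_summary("Refactored the render queue; ffmpeg args live in render.py", ["render", "ffmpeg"])
print(recall("why are the ffmpeg args set this way?"))
```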

One launches an open-source integration layer for wiring agents to 250+ apps

One (katibmoe): A new connector called One is being introduced as a single integration layer to “connect and monitor AI agents” across hundreds of apps, with an explicit open-sourcing claim in the launch teaser. Another thread frames the scale as 47,000 actions across 250+ apps, positioning it as the missing plumbing for production agents in the actions and apps claim.

What’s not in the tweets yet is the concrete repo/docs surface (auth model, auditing, and how “monitoring” is implemented), so treat it as a promising pitch until the code drop lands.

GSD pitches a spec-driven system to fight “context rot” in agents

GSD (TÂCHES): The Get Shit Done (GSD) repo is being promoted as a lightweight meta-prompting/context-engineering system for Claude Code and other coding agents, explicitly targeting “context rot” (quality degradation as context fills up), as shown in the repo roundup. The framing is less about a single prompt and more about a repeatable spec-and-structure workflow you can reuse across tools.

NemoClaw + Qwen3.5-27B: local agent over Telegram with no API costs

NemoClaw (NVIDIA): A deploy pattern is circulating where NemoClaw runs Qwen3.5-27B fully local while exposing it through a Telegram interface, highlighted as “no API costs” and “no data being sent” in the local Telegram setup. This is mainly interesting for creators building agent utilities that need a chat UI and want local privacy/cost control without building a frontend.
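The tweet doesn’t share NemoClaw’s code, but the general “local model behind a Telegram chat UI” pattern is easy to sketch: point an OpenAI-compatible client at a locally served model (vLLM, llama.cpp server, etc.) and wrap it in a python-telegram-bot handler. The endpoint, model name, and token below are assumptions for illustration, not NemoClaw itself.

```python
# Sketch of the general pattern (not NemoClaw): a locally served model exposed
# through Telegram. Assumes an OpenAI-compatible server already running at
# localhost:8000 (vLLM, llama.cpp server, etc.).
import asyncio
from openai import OpenAI
from telegram import Update
from telegram.ext import Application, ContextTypes, MessageHandler, filters

llm = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")  # local, no API costs
MODEL = "qwen3.5-27b"  # whatever name the local server registered

async def answer(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    prompt = update.message.text
    # Run the blocking client call off the event loop.
    resp = await asyncio.to_thread(
        llm.chat.completions.create,
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    await update.message.reply_text(resp.choices[0].message.content)

def main() -> None:
    app = Application.builder().token("YOUR_TELEGRAM_BOT_TOKEN").build()
    app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, answer))
    app.run_polling()  # inference stays on the machine; only chat traffic goes through Telegram

if __name__ == "__main__":
    main()
```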

Obsidian Skills packages vault operations as reusable agent skills

Obsidian Skills (kepano): A skill collection for Obsidian is being shared to let agents create/edit Obsidian-flavored Markdown, Bases, and Canvas files, plus interact via Obsidian CLI—useful for turning a vault into an “agent-readable” project log, per the repo roundup and the GitHub repo. It also includes a “defuddle” cleaner for web-to-Markdown extraction, which can help keep downstream context tighter.

Superpowers turns coding-agent work into composable skills + subagents

Superpowers (repo): Superpowers is being shared as a “complete software development workflow for coding agents” built on composable skills, where the agent starts with an interactive design phase, produces an implementation plan (with TDD/YAGNI/DRY called out), then runs a subagent-driven build loop when you say “go,” per the repo roundup.

LightRAG resurfaces as a practical open RAG framework for agent projects

LightRAG (HKUDS): LightRAG is being recommended again as an open-source RAG framework with multiple storage backends and a focus on speed/simplicity, per the repo roundup and the GitHub repo. In creator stacks, this typically shows up when you want an agent to stay grounded in your project docs/scripts/shot lists without hand-feeding context every run.

UI UX Pro Max: a design-system generator repo shared for Claude Code

UI UX Pro Max (nextlevelbuilder): This repo is being circulated as a Claude-adjacent “skill” for generating UI/UX design systems (patterns, typography, palettes) to accelerate product surface work, referenced in the repo roundup and detailed in the GitHub repo. For creator-tool builders, it reads like a reusable “design output module” you can drop into a broader agent pipeline.

Awesome Claude Code: curated discovery list for Claude Code add-ons

Awesome Claude Code (list repo): “Awesome Claude Code” is also highlighted as a discovery list for Claude Code-related tools and workflows in the repo roundup. No concrete items are described in-thread, but it’s being positioned as a curator surface for quickly finding what’s working now.

Everything Claude Code: a shortcut index repo creators are sharing

Everything Claude Code (collection repo): A repo called “Everything Claude Code” is being passed around as a quick way to find Claude Code resources and add-ons, called out in the repo roundup. The tweet doesn’t include specifics (scope/maintenance cadence), but the intent is a single starting point instead of chasing scattered threads.


🎵 Music tools keep shipping: Suno model update + creator scoring stacks

Audio news is lighter than video/image today, but Suno gets a concrete model update and creators continue to pair music gen with video pipelines for fast turnaround content.

Suno adds “upload your own voice” capability

Suno (Suno): Following up on voice teaser (voice cloning anticipated), a Turkish-language creator post says Suno now supports adding your own voice—“kendi sesini ekleme” (“add your own voice”) in the feature note.

This is a direct unlock for voice-forward music workflows (consistent singer identity, character songs, localized covers), but today’s tweet doesn’t include constraints like minimum audio length, consent checks, or whether it’s a full clone vs a timbre reference.

Suno drops v5.5 music model

Suno (Suno): A new Suno v5.5 music model is being reported as released, with creators already framing it as an “experiment now” update in the v5.5 announcement.

[Video: Suno v5.5 announcement clip]

What’s still missing from today’s signal is the practical delta (promptability, vocals, mix consistency, and stem/structure controls); no release notes, pricing, or side-by-side examples were included in the tweets provided.


🛠️ Practical how‑tos: Claude prompt structure, vibe-coding games, and DIY analytics

Single-tool learning content today clusters around Claude prompt tactics, creator productivity systems, and ‘vibe coding’ as a way to learn complex software (Unity, automation).

Claude-powered X analytics tracker: setup guide plus dashboard patterns

Claude analytics workflow (X): A creator published a step-by-step guide for setting up an X analytics tracking system inside Claude, pointing to a reusable artifact for templates and prompts in the setup guide link. The same thread shows the kind of outputs the system produces—daily impressions heatmaps, follower-vs-impressions charts, and day-of-week breakdowns—captured in the dashboard screenshot, plus a goal-tracking visualization for follower targets in the projection graphic.

Artifacts you can copy: The public Claude artifact is linked directly in the setup guide, and a longer PDF version is mentioned in the PDF note.
What it’s tracking: The dashboard example includes rollups like net followers gained over 84 days and 826K impressions YTD, as visible in the dashboard screenshot.

Some desired metrics may still be missing from X itself (e.g., dislike visibility), as raised in the dislikes question.
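The linked artifact has the full templates, but the day-of-week and follower-vs-impressions rollups shown in the screenshots are straightforward to reproduce from an X analytics CSV export; the column names below are assumptions and will vary by export.

```python
# Rough reproduction of the rollups shown in the dashboard screenshot,
# starting from an X analytics CSV export. Column names are assumptions --
# adjust them to match whatever your export actually contains.
import pandas as pd

df = pd.read_csv("account_analytics.csv", parse_dates=["Date"])

# Day-of-week breakdown of impressions (the heatmap-style view).
df["weekday"] = df["Date"].dt.day_name()
by_weekday = df.groupby("weekday")["Impressions"].mean().sort_values(ascending=False)

# Followers vs impressions over time, resampled weekly.
df["net_followers"] = df["New follows"] - df["Unfollows"]
weekly = df.set_index("Date")[["Impressions", "net_followers"]].resample("W").sum()

print(by_weekday)
print("Net followers, trailing 84 days:", df.tail(84)["net_followers"].sum())
print("Impressions this year:", int(df[df["Date"].dt.year == df["Date"].dt.year.max()]["Impressions"].sum()))
```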

Claude prompt tactic: ask first, context second

Claude prompting (Anthropic): A Claude power-user thread claims prompt ordering matters more than extra detail—lead with the exact output you want, then add context, instead of burying the ask after backstory, as described in the technique write-up. The post frames it as part of a broader set of “internal ranking” observations from heavy API testing, per the reverse-engineering claim, and attributes a 31% improvement on complex tasks to reordering alone in the front-load example.

The evidence here is anecdotal (no shared benchmark harness), but the mechanic is concrete: put the request, format, and success criteria in the first lines; push narrative/context below.
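There is no shared harness behind the 31% figure, but the reordering itself is trivial to apply. A minimal sketch with the Anthropic SDK, where the ask, format, and success criteria come first and the long context goes last (the model id and the example task are placeholders):

```python
# Front-loaded prompt structure: request, format, and success criteria first,
# long context last. Model id is a placeholder; swap in a current one.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

transcript = "...long video transcript pasted here..."

front_loaded = f"""Write a 150-word YouTube description for the transcript below.
Format: one hook sentence, three bullet points, one call-to-action line.
Success criteria: no hashtags, no emoji, mention the tool name exactly once.

--- context (read only after the task above) ---
{transcript}
"""

msg = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder id
    max_tokens=400,
    messages=[{"role": "user", "content": front_loaded}],
)
print(msg.content[0].text)
```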

Vibe-coding a Unity game with Bezi: plan first, then implement

Bezi (Unity workflow): A Unity game dev report describes using Bezi for “all agentic coding” while still needing small manual Unity Editor steps; the author calls out that asking the agent to write a plan before implementing big features reduces churn and token burn, as summarized in the learnings post. The same thread notes Unity’s Asset Store (UI packs, VFX, shaders) remains a leverage point when iterating quickly with agents, and shows a PvZ-style autobattler prototype coming together “in a few days,” per the Unity editor clip.

[Video: Unity autobattler demo]

Creators report AI-driven analysis paralysis as tool choice explodes

Creator productivity: One post flags a pattern where access to more capable AI tools can slow people down—too many options and workflows create “analysis paralysis,” leaving creators feeling more hindered than before, according to the observation. The tweet frames it as a navigation problem (how to choose what to do with new “superpowers”) rather than a model capability limit, per the same discussion.


📚 Research & benchmarks that will hit creative tools next (CUA-Suite, NVS, eval frameworks)

Research posts today skew toward agents that use computers, novel-view synthesis, and evaluation frameworks—useful for forecasting what ‘next’ capabilities will land in creator products.

CUA-Suite dataset: 55 hours of desktop-video demos for computer-use agents

CUA-Suite (research): A new dataset package targets computer-use agents with long, continuous human demonstrations—VideoCUA is ~55 hours of video across 87 applications (~6M frames), paired with GroundCUA (3.6M UI element annotations) and UI-Vision for grounding/planning evaluation, as introduced in the paper post and detailed on the Paper page.

[Video: Cursor demo through apps]

Why creatives should care: this kind of supervision is the raw ingredient for “agents that can actually drive After Effects/Blender/Resolve-like UIs,” and the paper notes current models still fail a lot on pro apps (roughly ~60% task failure in preliminary evals), per the Paper page.

Cohere releases an Apache 2.0 model on Hugging Face (no restricted license)

Cohere (open weights licensing): Cohere is being cited for releasing a model under Apache 2.0—framed as “actually open” with no “research only” restrictions in the license callout. This is mainly a license signal: it affects whether studios can embed a model inside commercial creative tools (on-device, on-prem, or bundled) without negotiated terms.

LagerNVS paper: real-time neural novel-view synthesis using latent geometry

LagerNVS (novel view synthesis): A latent-geometry approach claims fully neural, real-time NVS without explicit 3D reconstruction—reporting 30+ FPS at 512×512 on a single H100 and PSNR 31.4 on Re10k, as summarized in the paper post and broken down on the Paper page.

[Video: Room view rotates in real time]

Creative tool implication: this is the research direction behind “move the camera after the fact” shots—fast viewpoint changes that can later be combined with diffusion decoders for generative extrapolation, as described in the Paper page.

Cohere Transcribe 03-2026 surfaces as #1 on the Open ASR leaderboard

cohere-transcribe-03-2026 (Cohere): A new transcription model is being boosted as #1 on the Open ASR leaderboard, with claims like “#4 multilingual” and “#6 long-form,” as shared in the leaderboard stats and reinforced by the accuracy note. For creators, the immediate relevance is cleaner dialogue transcripts for subtitles, paper edits, and searchable dailies—though the tweets don’t include a model card, pricing, or reproducible WER breakdown.

Qworld paper: generating per-question evaluation criteria for LLMs

Qworld (LLM evaluation): A framework proposes building an “evaluation world” per prompt—generating 45+ criteria per question, yielding 200k+ criteria on HealthBench with ~79% described as novel and human-validated, according to the paper post and the Paper page.

[Video: Eval dimensions animation]

Why it matters for creative AI: it’s a concrete path toward richer evals for story/brand safety, calibration, and “did the model follow the brief,” rather than a single scalar score—Qworld reports surfacing blind spots like sustainability/equity/emergency recognition in standard benchmarks, per the Paper page.

Voxtral-4B-TTS demo gets early praise as a small, high-quality voice model

Voxtral-4B-TTS (TTS): A demo of Voxtral-4B-TTS is getting praised for voice quality—“it sounds so good” in the demo reaction. The notable signal here is the continued push toward smaller TTS that can be run more cheaply (or locally) for character voices, scratch VO, and iterative animatics; the tweet itself doesn’t specify licensing, inference speed, or supported languages.


🛡️ Safety for conversational AI: measuring manipulation and protecting audiences

Safety content today is unusually creator-relevant because it targets conversational agents: what ‘persuasion vs manipulation’ looks like, where models have influence, and how to evaluate risk before shipping voice-first experiences.

DeepMind open-sources a toolkit to measure harmful AI manipulation

Manipulation measurement toolkit (Google DeepMind): DeepMind says it built and released an “empirically validated” toolkit intended to measure harmful manipulation by AI systems “in the real world,” with backing from a multi-study, 10,000-person research effort described in the Toolkit release and detailed on the project page in Toolkit details.

Why it matters for voice-first products: the release frames manipulation as distinct from benign persuasion, aiming to quantify both how effective influence attempts are and how often models try them, according to the Toolkit details.

DeepMind positions this as a way to harden conversational experiences as models get more natural, echoing the broader risk framing in the Research thread.

DeepMind finds AI persuasion power varies sharply by domain

AI manipulation research (Google DeepMind): DeepMind shared results from studies totaling 10,000 participants, finding that conversational AI’s ability to shift decisions is domain-dependent—models showed high influence in finance, but “hit a wall” in health where existing safeguards blocked false medical advice, as summarized in the Research thread.

[Video: Manipulation risk explainer]

What creators can take from it: the thread flags “red tactics” like fear-based pressure as measurable risk factors, framing evaluation as something voice/companion/chatbot teams can do before shipping, per the Research thread.

The public artifact for how they measure this lands separately via the toolkit announcement in the Toolkit release.

Synthetic-body realism prompts a trust erosion worry from creators

Synthetic media trust (signal): A creator argued that within five years it may be hard to tell whether a human-shaped body is human or robot “at first glance,” and that this is already creeping in via AI images/video today, as stated in the Trust erosion worry.

For filmmakers and storytellers, this shows the audience-side pressure building around authenticity cues (provenance, BTS, watermarks), even when the content is made for entertainment rather than deception.


📅 Deadlines & gatherings creators can act on (Luma Dream Brief, AIgorithm Saigon, VidCon)

A few time-sensitive items appear: an imminent creative brief deadline and upcoming creator events/exhibitions where AI work is being showcased or taught.

Luma Dream Brief deadline hits March 27 (midnight PST)

Luma Dream Brief (Luma Labs): Following up on deadline extended—the brief is now at “one day left,” with submissions due March 27 at midnight PST, and the prize positioning framed as “a Cannes Lion” and “$1M on the line,” per the deadline reminder post and its linked submission site.

The framing is explicitly for ideas “that never got made” (a second-chance creative brief), so it’s less about tool demos and more about shipping a single polished concept by the cutoff, as described in the deadline reminder post.

AIgorithm in Saigon opens April 3 with 45+ AI artists

AIgorithm in Saigon (0xInk / AIgorithm): The second edition of AIgorithm in Saigon is set to open April 3, billed as featuring “45+ AI artists,” with the teaser noted as made in Seedance 2 using “3 image references” and “one prompt,” according to the event announcement and the added participant callout in participant list note.

[Video: Seedance 2 teaser reel]

This is an exhibition-scale signal (not just an online drop), and it also functions as a real-world screening context for what Seedance-era short pieces look like when curated, as shown in the event announcement.

VidCon adds an AI-creator session to the June Anaheim lineup

VidCon (Chris First): Chris First says he’ll be speaking at VidCon in Anaheim in late June, positioned as a “Creator Track Speaker,” focused on “AI as a creative assistant” rather than a replacement, per the VidCon speaker announcement.

The announcement is light on curriculum specifics (no tools named), but it’s a concrete calendar item for creators planning in-person learning/networking around AI workflows, as stated in the VidCon speaker announcement.


📣 Distribution & moat talk for creators (authenticity, ads, and consumer AI business models)

Marketing/distribution discourse shows up as the news itself: creators debating what remains defensible (taste/authenticity), and how consumer AI monetization may shift beyond subscriptions.

Consumer AI may move beyond subscriptions toward ads as adoption stays early

Consumer AI business models: One creator frames consumer AI as still early—claiming only ~10% of the global population uses ChatGPT weekly—and argues the current subscription-heavy phase will likely broaden into other monetization (explicitly calling out ChatGPT testing ads) in the adoption and ads note. The same post spotlights the emerging “power user” willingness-to-pay with the $200/month tier as a real segment, which matters for creators because it hints at a split between paid premium creative tools and ad-supported mass distribution.

The thread is directional rather than data-backed beyond the 10% figure; it’s a useful read for where product experiences (and creator monetization surfaces) might drift next, per the adoption and ads note.

“No moat except taste” spreads as a creator business claim

Taste as moat: A creator argues that with AI, a “$1M business” can be replicated “in a few minutes” and even open-sourced—so “there is ZERO moat left… except taste,” according to the taste moat post. That framing matters for creatives because it treats distribution and creative direction (not implementation) as the durable edge.

A follow-up link-only post signals this was reacting to a concrete example, but details stay off-timeline in the replication reference.

101ads.org catalogs Silicon Valley’s AI/tech billboard wars as a design signal

101 Ads billboard map: A project called 101ads.org maps and categorizes the tech (including AI) billboards along California’s Highway 101, positioned as a running snapshot of what companies are buying attention with right now, per the billboard map share and the live map at billboard map. It’s a useful artifact for creatives tracking how AI companies brand themselves in public (taglines, visual tropes, the “what are we optimizing for?” messaging).

The post credits the site’s creator and frames it as intentionally systematic documentation, as noted in the creator credit.

Authenticity gets framed as the creator moat amid rapid AI replication

Authenticity as moat: A short but telling positioning statement—“Escape competition through authenticity”—shows up as a direct response to how quickly AI makes styles and formats copyable, per the authenticity moat line. The implication for creative distribution is that differentiation shifts from production difficulty to audience trust and recognizable voice.

There’s no “how-to” here, just a distilled strategy claim; it’s notable mainly because it’s becoming a default answer to “what’s defensible now,” as phrased in the authenticity moat line.


🎞️ What shipped: short films, reels, and creator demos worth studying

Beyond tools and prompts, there are several “finished work” signals—released cuts, festival-style shorts, and studio reels that demonstrate current ceilings (and failure modes like consistency drift).

TZIGANE: Director’s Cut lands on Escape as an interactive, music-driven film

TZIGANE — Director’s Cut (Dustin Hollywood / Escape): A remastered 16‑minute “interactive cinematic experience” is now available on Escape, positioned as a cut whose visuals adapt in real time to live orchestral performance cues, as described on the film page in Film page and announced in the Release post.

[Video: Moody street-walk excerpt]

The Escape writeup frames it as a music-driven screening where “each screening” can vary, which is a useful reference point for what “interactive” is being claimed to mean in current AI-film marketing, per the details in Film page.

Anima_Labs pushes Seedance 2 with a creature-and-location density stress test

Seedance 2 (ByteDance/Dreamina): Anima_Labs shipped a ~109‑second short built to overload the model with “different creatures, assets, and locations,” then explicitly called out inconsistency as the current failure mode when you pack the frame with variety, as documented in the Short film notes.

[Video: Creature quest short]

The tool stack is part of the signal here: character/background setup via Midjourney (V7/V8) plus Nano Banana and Kling 2.6 on Freepik, then animation in Seedance 2 on Dreamina, all listed in Short film notes.

CONTROLLED SUBSTANCE concept cut shows Uni‑1 × Ray3.14 as a short-form pipeline

CONTROLLED SUBSTANCE (DreamLabLA / Luma): A concept set built with Uni‑1 × Ray3.14 was posted with a clear logline (mercenary contracts traded for life-sustaining medicine) and a multi-clip drop, per the Project post and the follow-up Extra clip set.

[Video: Concept montage clip]

The format itself is the takeaway: a tight, trailer-like sequence of shots released as separate clips rather than a single long render, which is consistent with how teams are shipping “proof of ceiling” work while models still drift over longer durations, as shown across the clip set in Project post.


📉 Platform friction creators feel: reach stats weirdness + language targeting fears

A small but concrete set of posts flags platform-level instability and policy shifts that affect distribution (analytics mismatches, language/localization concerns).

Creators report X follower counts lagging behind net-gain stats

X analytics discrepancy: A creator reports their visible follower count “barely changed” even as X stats imply roughly +50 net followers/day after unfollows are accounted for, per the follower count mismatch note; the practical worry is that growth tracking (and any “did this post work?” loop) becomes less trustworthy when the surface metric doesn’t match the backend tally.

No corroborating screenshots or platform explanation show up in the tweets, so it’s unclear whether this is UI lag, anti-spam reconciliation, or a reporting bug.

Concern: X reach changes could penalize global-language creators

Language targeting fears: A creator with an audience “mostly in the US” says only ~5% of their audience is from their home country, and worries that an algorithm shift toward local language/local audience would “mess up everything” they’ve built—especially for niches that don’t map cleanly to a creator’s country or for creators who move, as argued in the local audience concern post.

The tweet frames this as a distribution risk rather than a content-quality issue, but doesn’t include a specific policy announcement or changelog from X.

On this page

Executive Summary
Feature Spotlight: Gemini 3.1 Flash Live rolls out for real‑time voice agents (Gemini Live + Search Live)
🗣️ Gemini 3.1 Flash Live rolls out for real‑time voice agents (Gemini Live + Search Live)
Gemini 3.1 Flash Live ships for real-time voice agents across Gemini Live, Search Live, and AI Studio
Gemini 3.1 Flash Live posts 90.8% on audio function calling in shared charts
🎬 Video generators move toward “directed scenes” (Runway Multi‑Shot, Seedance horror, Pika agents)
Runway’s Multi‑Shot App turns one prompt into a cut, paced scene
Pika’s “AI Selves” opens public beta on web and iOS
A Seedance 2.0 elevator micro‑short shows horror pacing in 15 seconds
Kling’s sword‑duel sample spotlights motion and impact beats
🖼️ Image models get more “directable” (Uni‑1 info design, Midjourney style play, Nano Banana 4K)
UNI-1 posts clearer “do exactly this” localized edit examples
Nano Banana Pro and Nano Banana 2 add a 4K output option
UNI-1 examples lean into dense infographics and readable type
Midjourney style ref 3272229711 targets Bob’s Burgers-style sitcom frames
Midjourney style ref 2890513616 nails crude doodle linework on purpose
🪪 Likeness & “mini‑me” waves (Uni‑1 Pouty Pals, Phota look‑alikes, LoRA personas)
Phota Labs opens public access for look‑alike photo generation and edits
UNI‑1 “Pouty Pal” mini‑me prompt becomes a copy‑paste format
UNI‑1 Pouty Pal how‑to circulates with explicit privacy claims
Home-trained persona LoRAs show up for LTX 2.3
Phota’s “family photo” trick: separate likenesses, then compose in-editor
🧩 Multi-tool creator pipelines (Kling→CapCut affiliate farms, Midjourney→Nano Banana→Seedance films)
Clawdbot→Kling→CapCut turns one fake “expert” into a high-volume affiliate channel
Kling Motion Control + Suno workflow claims a music video in under 2 hours
Seedance 2 prompting format: timecoded beats, then cut incoherent shots in Resolve
“Zero moat except taste” framing spreads as AI cloning gets faster
Claude Code as a bridge from network poking to a working device-control app
🧪 Copy‑paste aesthetics: SREFs, poster mockups, and cinematic shot language
Nano Banana wall-poster mega-prompt standardizes high-end brand mockups
Midjourney SREF 2885679472 pushes Wong Kar-wai-style motion blur
Nano Banana 2 “collectible figurine render” prompt focuses on PBR realism
A Firefly macro template builds “world in a bottle” product shots
Midjourney SREF 2873816195 leans into neon retrowave haze
Midjourney SREF 3422279710 targets Art Nouveau “expensive” visuals
A cinematic shot-language prompt for Seedance leans on handheld realism cues
Midjourney SREF 2543866241: jet trails and long-exposure motion abstractions
Midjourney SREF 8006572439 trends for teal neo-noir cyber visuals
🧼 Finishing passes that make GenAI footage usable (Topaz Starlight Precise 2.5)
Topaz ships Starlight Precise 2.5 for more realistic GenAI footage and 4K upscales
Midjourney-to-Topaz finishing pass shows up in “Uncharted Life 2.0” example
Topaz Express upscaling becomes a lightweight social sharing loop for artworks
🧱 3D scenes & novel-view workflows (Freepik 3D Scenes, Wonder3D prompts)
Freepik launches 3D Scenes for AI-driven camera moves from a single image
LagerNVS claims real-time neural novel-view synthesis without explicit 3D
3D simulation plus diffusion gets framed as the new photo workflow
Wonder3D prompting pattern: specify a vibe, then refine the mesh downstream
🧰 Creator studios consolidate: CapCut web studio, Pictory 2.0, and “all-in-one” editors
Seedance 2.0 lands across Dreamina, CapCut Video Studio, and Pippit model menus
Phota Studio opens public access for “photos that look like you” generation and edits
Pictory 2.0 ships Pictory Central, AI avatars, brand kits, and a new timeline
🧰 Builders’ corner for creatives: Claude toolchains, MCP servers, and agent integration layers
Claude Mem adds persistent project memory to Claude Code sessions
One launches an open-source integration layer for wiring agents to 250+ apps
GSD pitches a spec-driven system to fight “context rot” in agents
NemoClaw + Qwen3.5-27B: local agent over Telegram with no API costs
Obsidian Skills packages vault operations as reusable agent skills
Superpowers turns coding-agent work into composable skills + subagents
LightRAG resurfaces as a practical open RAG framework for agent projects
UI UX Pro Max: a design-system generator repo shared for Claude Code
Awesome Claude Code: curated discovery list for Claude Code add-ons
Everything Claude Code: a shortcut index repo creators are sharing
🎵 Music tools keep shipping: Suno model update + creator scoring stacks
Suno adds “upload your own voice” capability
Suno drops v5.5 music model
🛠️ Practical how‑tos: Claude prompt structure, vibe-coding games, and DIY analytics
Claude-powered X analytics tracker: setup guide plus dashboard patterns
Claude prompt tactic: ask first, context second
Vibe-coding a Unity game with Bezi: plan first, then implement
Creators report AI-driven analysis paralysis as tool choice explodes
📚 Research & benchmarks that will hit creative tools next (CUA-Suite, NVS, eval frameworks)
CUA-Suite dataset: 55 hours of desktop-video demos for computer-use agents
Cohere releases an Apache 2.0 model on Hugging Face (no restricted license)
LagerNVS paper: real-time neural novel-view synthesis using latent geometry
Cohere Transcribe 03-2026 surfaces as #1 on the Open ASR leaderboard
Qworld paper: generating per-question evaluation criteria for LLMs
Voxtral-4B-TTS demo gets early praise as a small, high-quality voice model
🛡️ Safety for conversational AI: measuring manipulation and protecting audiences
DeepMind open-sources a toolkit to measure harmful AI manipulation
DeepMind finds AI persuasion power varies sharply by domain
Synthetic-body realism prompts a trust erosion worry from creators
📅 Deadlines & gatherings creators can act on (Luma Dream Brief, AIgorithm Saigon, VidCon)
Luma Dream Brief deadline hits March 27 (midnight PST)
AIgorithm in Saigon opens April 3 with 45+ AI artists
VidCon adds an AI-creator session to the June Anaheim lineup
📣 Distribution & moat talk for creators (authenticity, ads, and consumer AI business models)
Consumer AI may move beyond subscriptions toward ads as adoption stays early
“No moat except taste” spreads as a creator business claim
101ads.org catalogs Silicon Valley’s AI/tech billboard wars as a design signal
Authenticity gets framed as the creator moat amid rapid AI replication
🎞️ What shipped: short films, reels, and creator demos worth studying
TZIGANE: Director’s Cut lands on Escape as an interactive, music-driven film
Anima_Labs pushes Seedance 2 with a creature-and-location density stress test
CONTROLLED SUBSTANCE concept cut shows Uni‑1 × Ray3.14 as a short-form pipeline
📉 Platform friction creators feel: reach stats weirdness + language targeting fears
Creators report X follower counts lagging behind net-gain stats
Concern: X reach changes could penalize global-language creators