
Google Gemini 3 Pro rolls out in AI Studio – $2 in, 1M‑token video analysis
Executive Summary
Google’s Gemini 3 Pro is live in AI Studio and surfacing in the Gemini web app, which means you can ship against it today. Leaked pricing points to $2 in / $12 out for prompts of 200K tokens or less and $4 in / $18 out above that, with a January 2025 knowledge cutoff. The headliners: agentic coding, Generative UI/Visual Layout, and a 1M‑token context for long‑video analysis.
What’s new since our weekend sightings: real agent demos. Antigravity spins up coding agents that edit a Supabase backend, drive a browser, and close bugs end‑to‑end, while Generative UI assembles tappable layouts and mini‑apps directly in chat. The Deep Think preview posts 45% on ARC‑AGI‑2 (ARC Prize verified), 88% on ARC‑AGI‑1, and 93.8% on GPQA Diamond, but access is capped to safety testers; Pro lands around 31% on ARC‑AGI‑2. Run agents in sandboxed accounts and require diffs—still a sharp tool, not autopilot.
Early signals look strong: one creator reports 72.7% on ScreenSpot‑Pro and community arenas place Gemini 3 Pro at or near the top across text, vision, and webdev. Per Andrej Karpathy, treat public leaderboards as hints, not verdicts—do a week of A/Bs on your workload, then switch defaults if it holds.
Feature Spotlight
Gemini 3 for creators: agents, UI, and rollout
Gemini 3 Pro + Deep Think bring agentic coding and Generative UI to creators, with AI Studio access and Antigravity demos—raising the bar for building apps, tools, and visuals directly from a prompt.
Massive cross‑account story: Gemini 3 Pro lands with agentic coding (Antigravity), Generative UI/Visual Layout, 1M‑token video analysis, and AI Studio access. Mostly hands‑on demos, pricing leaks, and early benchmark effects.
Gemini 3 for creators: agents, UI, and rollout
Deep Think preview posts 45% on ARC‑AGI‑2; limited access for now
Gemini 3 Deep Think preview hit 45% on ARC‑AGI‑2 (ARC Prize verified), with 88% on ARC‑AGI‑1 and 93.8% on GPQA Diamond, while Pro clocks 31% on ARC‑AGI‑2 ARC‑AGI results. Access is currently restricted to safety testers, with a planned rollout to Ultra subscribers after additional checks Safety testers note.
The point is: this narrows the gap on long‑horizon reasoning. It’s pricey per task in preview, so budget non‑trivial runs carefully, and keep Pro as the default until the mode opens more broadly.
Gemini 3 Pro rolls out in AI Studio and Gemini web
Google began rolling out Gemini 3 Pro to AI Studio, with creators reporting live access, and it’s also appearing on the Gemini web app. That moves the model from speculation to daily use for builders. Following up on the earlier UI‑string sightings, those hints in app strings have now translated into broad availability. See creator confirmations and a global montage in the launch clips AI Studio check and Web app check.
So what? You can start vibe‑coding apps, testing multimodal prompts, and kicking the tires on the new agentic behaviors today. Expect staggered enablement by account and region, so keep rechecking the model picker in AI Studio Global rollout.
Generative UI lands: dynamic layouts, mini‑apps, and 1M‑token video analysis
Creators are seeing Gemini 3 assemble visual layouts and bespoke tools on the fly—tour plans with tappable cards, simulators, and code‑backed mini‑apps. One demo highlights a “Visual Layout” mode, and another shows it generating calculators and physics visualizations directly in the response Visual layout explainer Tool coding demo. Google also touts a 1M‑token context for long‑video analysis Visual layout explainer.
Try concrete, outcome‑first asks (“compare three AAA loans”) and let it decide tables vs. widgets. For education and research explainers, use the three‑body sim style prompts to force visual reasoning Three body sim.
Google’s Antigravity IDE demos agentic coding, browser control, and live fixes
Multiple demos show Antigravity spawning agents that test apps, control the browser, make Supabase changes, and even play a pinball sim—alongside whiteboard and flight‑tracker examples Early tester thread Demo set. One creator reports the agent found a bug, edited the backend, and resolved the issue end‑to‑end without manual glue.
Here’s the catch: oversight still matters. Run in isolated accounts, watch permission scopes, and expect occasional mis‑edits. But for prototyping and QA loops, this compresses hours into minutes.
Gemini 3 Pro climbs to #1 across major Arena leaderboards
Community leaderboards show Gemini‑3‑Pro taking top slots across Text, Vision, and WebDev, edging Grok‑4.1, Claude‑4.5, and GPT‑5 variants Arena overview. Creators also shared LMArena/WebDev placements and site sightings after the model went live Ranking video.
- Text Elo: ~1,501 (reported) and big WebDev gains vs Gemini 2.5 Arena overview
- Visibility: also flagged as available on the Gemini web app Web app check
Use this to prioritize your A/B queue. Then verify on your tasks before switching defaults.
Leaked Gemini 3 Pro pricing and docs detail token tiers, Jan 2025 cutoff
Pricing screens circulating show two token tiers: prompts of ≤200K tokens at $2.00 in / $12.00 out (per 1M tokens), and >200K tokens at $4.00 in / $18.00 out, with a knowledge cutoff listed as Jan 2025 Pricing details. Docs briefly surfaced as “Confidential” and then 404’d for some users, suggesting a staged docs rollout Docs 404.
For teams budgeting experiments, those output rates matter. The official endpoint path appeared under Google’s API docs before vanishing—keep an eye on the re‑published page when it stabilizes API docs.
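If the leaked tiers hold, a back‑of‑the‑envelope cost check is straightforward. A minimal sketch, assuming the rates are USD per 1M tokens and the tier is selected by prompt size—both readings of the leaked screens, not anything Google has confirmed:

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost from the leaked Gemini 3 Pro tiers.

    Assumes rates are USD per 1M tokens and the tier is chosen by
    prompt (input) size, per the circulating pricing screens.
    """
    if input_tokens <= 200_000:
        in_rate, out_rate = 2.00, 12.00   # <=200K-token prompts
    else:
        in_rate, out_rate = 4.00, 18.00   # >200K-token prompts
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A 100K-token prompt with a 5K-token reply; then a 500K-token
# long-video prompt with a 10K-token reply.
print(round(estimate_cost_usd(100_000, 5_000), 2))   # 0.26
print(round(estimate_cost_usd(500_000, 10_000), 2))  # 2.18
```

Note how the output rate dominates: at tier one, a reply token costs six times an input token, so long generations (not long prompts) are what blow up experiment budgets.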
“Vibe coding” in practice: custom instructions and a one‑prompt app build
Builders are sharing prompt discipline for Gemini 3 Pro: plan first, debug elegantly, create closed‑loop tests, and fully own UI checks—then let the model iterate internally before handing back Prompting guide. In a separate demo, a full interactive app was generated from a single high‑level prompt in one shot Vibe coding demo.
Actionable today: encode those instructions in the system slot, ask for self‑tests, and require diffs for changes. It reduces babysitting and yields steadier builds.
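One way to wire that up, as a sketch: bake the prompt discipline into a reusable system instruction and require diff‑formatted changes. The instruction wording, the `model` name, and the payload shape below are illustrative assumptions, not official guidance—adapt them to whatever client you use:

```python
# Reusable system instruction encoding the "vibe coding" discipline
# shared by early Gemini 3 Pro builders. Pass it in your client's
# system slot (e.g. a system_instruction config field).
VIBE_CODING_SYSTEM = "\n".join([
    "Plan before writing any code; state the plan briefly.",
    "Debug elegantly: isolate the failure before patching it.",
    "Write closed-loop tests and run them before answering.",
    "When changing existing code, output unified diffs only.",
    "Own UI checks: describe how each visible change was verified.",
    "Iterate internally; return only the final, self-tested result.",
])

def build_request(user_prompt: str, model: str = "gemini-3-pro") -> dict:
    """Assemble a provider-agnostic request payload (model name assumed)."""
    return {
        "model": model,
        "system_instruction": VIBE_CODING_SYSTEM,
        "contents": user_prompt,
    }

req = build_request("Add dark mode to the settings page.")
print(req["model"])  # gemini-3-pro
```

Keeping the discipline in one constant means every call—chat, agent, or batch—gets the same guardrails, and the diff‑only rule makes the model’s edits reviewable by default.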
Karpathy urges hands‑on model tests amid public benchmark spikes
Andrej Karpathy calls Gemini 3 a tier‑1 daily driver on personality, writing, coding, and humor—but warns that public benchmarks can be nudged via adjacent data, advising people to A/B models directly Karpathy notes. He shared a funny exchange where the model denied the 2025 date until search tools were enabled, then relented Round‑up post.
The takeaway: keep your own eval set close to your workflow. Rotate models daily for a week before picking a default.
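That week‑long rotation is easy to mechanize. A minimal sketch—model names and the win tally are placeholders for your own eval set, not real results:

```python
from collections import Counter

# Candidate daily drivers for the trial week (names illustrative).
MODELS = ["gemini-3-pro", "claude-4.5", "gpt-5.1"]

def model_for_day(day_index: int, models=MODELS) -> str:
    """Round-robin: which model is the daily driver on day N of the trial."""
    return models[day_index % len(models)]

def pick_default(wins: Counter) -> str:
    """After the trial, promote the model that won the most eval tasks."""
    return wins.most_common(1)[0][0]

# Example tally of task-level wins from a week of hands-on A/Bs
# (numbers invented for illustration).
wins = Counter({"gemini-3-pro": 9, "claude-4.5": 7, "gpt-5.1": 4})
print(model_for_day(0), "->", pick_default(wins))
```

The point of the round‑robin is to spread each model across different days and task mixes, so one unlucky afternoon doesn’t sink a candidate.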
Ecosystem moves: Lovart, Kilo Code, and Verdent adopt Gemini 3
Third‑party tools are lighting up support: Lovart says Gemini 3 is live for UI design studies Lovart availability, Kilo Code shares internal coding scores (Gemini 3 Pro 72% vs Claude 4.5 54% vs GPT‑5.1 Codex 18%) Kilo Code test, and Verdent markets multi‑agent orchestration that runs parallel Gemini sessions with auto verify steps Verdent orchestration.
So what? The model is already where designers and engineers work. Try a small sprint inside one of these tools before migrating full pipelines.
Gemini 3 posts 72.7% on ScreenSpot‑Pro, hinting at stronger UI‑use skills
On the ScreenSpot‑Pro benchmark, a creator reports Gemini 3 at 72.7%, with the next best model at 36.2% Screenspot score. The claim, if it holds, suggests a faster path to robust computer‑using agents.
Treat it as directional until wider replications land. But for RPA‑ish tasks, route trials through Gemini 3 first and compare head‑to‑head.
Filmmaking in the wild: 30k‑ft ad, model face‑offs, controllable motion
Practical production wins and tool tests: a full airline ad made in 14 hours mid‑flight, creator comparisons of Grok/Kling/Veo, and new node‑level motion control in ComfyUI. Excludes the Gemini 3 launch (see feature).
A full Qatar Airways ad was made in 14 hours at 30,000 ft using AI tools
A creative team produced two Qatar Airways commercials mid‑flight in 14 hours by mixing Google/Starlink connectivity, Gemini for planning/assets, Figma for layout, Veo 3.1 to animate stills, and Suno for music; they delivered before landing project overview. They shot reference photos to match aircraft details reference photos, used clean camera‑move prompts to animate scenes in Veo 3.1 prompt example, and cut the final with a bespoke track made in Suno music workflow, wrapping with less than 15 minutes to spare final delivery.
Veo 3.1 turns stills into polished shots with simple camera‑move prompts
Creators show Veo 3.1 reliably animating single images into usable shots using terse directives like “photorealistic; slow left orbit; sip coffee; push‑in close‑up,” which keeps motion natural and avoids overacting prompt example. A separate roundup highlights Veo 3.1’s prompt adherence, precise audio‑visual alignment, and strong object‑level edits—useful when timing to music or correcting props Veo overview.
Adobe Firefly now auto‑scores your video with licensed music
Firefly’s new Generate Soundtrack analyzes your uploaded clip, proposes a fitting prompt, and returns four synchronized tracks; you can tweak vibe/style/tempo and re‑gen, then download stems or a scored video. It’s built on licensed content, so results are safe for commercial use feature overview. A step‑by‑step shows the full workflow from upload to selection and export workflow steps.
ComfyUI’s Time‑to‑Move adds controllable motion to Wan 2.2 pipelines
ComfyUI is hosting a deep dive on Time‑to‑Move (TTM), a plug‑and‑play technique to inject intentional, controllable motion into Wan 2.2—useful for precise pans, pushes, and character action beats deep dive session. There’s also a how‑to covering blockouts and motion intent for animating sequences inside ComfyUI tutorial video.
Grok, Kling 2.5 Turbo, and Veo 3.1 compared on emotional range
A side‑by‑side creator test compares Grok Imagine, Kling 2.5 Turbo, and Veo 3.1 on delivering nuanced, believable emotion across short scenes. The cut makes it easier for teams to pick a model per sequence—e.g., close‑ups needing micro‑expressions versus stylized affect model comparison.
Leonardo tests: when to reach for Sora 2, Veo 3.1, Kling 2.5, or Hailuo 2.3
LeonardoAI shared real‑project notes that map model strengths to tasks: Sora 2 for physics‑accurate, constraint‑respecting shots Sora summary; Veo 3.1 for tight prompt adherence, audio‑visual timing, and object edits Veo brief; Kling 2.5 Turbo for professional transitions and start→end frame control Kling summary; and Hailuo 2.3 for budget‑friendly runs that still look strong Hailuo summary. The reel is a handy cheat sheet for shot planning and budget routing model highlights.
