Google Gemini 3 Pro reaches Search and mobile – API limits rise 5×, 20B tokens/day

Stay in the loop

Free daily newsletter & Telegram daily report

Executive Summary

Google flipped Gemini 3 Pro across more of its stack and turned on Gemini Agent for Ultra users on desktop in the U.S. Why it matters: capacity and safety both moved. External API caps rose roughly 5×, and one operator pushed 20B tokens in a day before hitting limits again. DeepMind’s safety card cites stronger prompt‑injection resistance and an 11/12 score on the hardest slice of its cybersecurity eval, plus stateful controls on tool use.

The agent story is pragmatic. It decomposes tasks, connects to Gmail and Calendar with consent, drafts replies, and requires confirmations for high‑risk actions like purchases. Search’s AI Mode is now more visual and interactive, with dynamic layouts and even on‑the‑fly simulations (yes, a pendulum toy can appear in your results). On mobile, SynthID tags help verify Gemini‑generated images, and Gemini on the web can now pull from Google Photos to ground prompts.

For builders, the ecosystem lit up quickly: Weaviate shipped zero‑migration RAG support via Gemini API/Vertex, Replicate added a multimodal endpoint for fast trials, Zed IDE enabled Gemini 3 Pro, and MagicPath demoed one‑shot image→website generation. If you pilot the agent, mirror Google’s confirmation gates and log every tool call; the rate‑limit headroom is real, but demand is proving it’s easy to saturate.

Feature Spotlight

Feature: Gemini 3 Pro and Agent land across Google surfaces

Google ships Gemini 3 Pro and a desktop Gemini Agent (Ultra, U.S.). 1M context, fast outputs, and a published safety card signal readiness; rate‑limit bumps and early integrations show rapid ecosystem uptake.

Today’s timeline is dominated by Google’s Gemini 3 Pro and the new Gemini Agent: core launch, safety model card, rate‑limit boosts, and first integrations. This section focuses on the rollout and platform availability; benchmarks and third‑party tooling appear elsewhere.

Jump to Feature: Gemini 3 Pro and Agent land across Google surfaces topics

🛠️ Feature: Gemini 3 Pro and Agent land across Google surfaces

DeepMind publishes Gemini 3 Pro safety report; stronger injection resistance

DeepMind released the Gemini 3 Pro Frontier Safety Framework report and model card, highlighting broader CBRN/cyber testing, improved prompt‑injection resistance, and stateful tool‑use controls model card, with specifics in the downloadable PDF FSF report. Notably, Gemini 3 Pro scored 11/12 on the hardest slice of their cybersecurity eval and showed novel "synthetic environment" awareness during tests results highlights, including a now‑viral "virtual table flip" anecdote behavior note.

Google Gemini 3 Pro reaches Search and mobile – API limits rise 5×, 20B tokens/day

Executive Summary

Feature: Gemini 3 Pro and Agent land across Google surfaces

Table of Contents

🛠️ Feature: Gemini 3 Pro and Agent land across Google surfaces

DeepMind publishes Gemini 3 Pro safety report; stronger injection resistance

Google Search rolls out Gemini‑powered dynamic layouts and simulations

Jules SWE agent goes live for Gemini Ultra; Slack and Live Preview in the works

Weaviate lights up Gemini 3 via Gemini API/Vertex for vector/RAG workflows

Gemini web adds Google Photos import for prompt context

Replicate offers Gemini 3 Pro endpoint with image/video/audio input

Stitch can export designs to AI Studio to spin up Gemini apps

How Gemini Agent operates: step planning, connected apps, confirmations

NotebookLM iOS adds camera/image sources and audio progress resume

Zed IDE adds Gemini 3 Pro model support

🧬 Frontier model rollouts: OpenAI, xAI and Deep Cogito

GPT‑5.1‑Codex‑Max becomes Codex default with million‑token ‘compaction’ and new SOTA scores

External evals: Codex‑Max hits 2:42h time‑horizon at 50% (METR), improves on CVE‑Bench

OpenAI rolls out GPT‑5.1 Pro to all Pro users

xAI launches Grok 4.1 Fast (2M context) and Agent Tools API, free for two weeks on OpenRouter

Deep Cogito releases 671B open‑weight Cogito v2.1; $1.25/M token inference on Together

👨‍💻 Agentic dev stacks: Codex CLI, Warp Agents 3.0, Cline, OpenCode

Codex adopts GPT‑5.1‑Codex‑Max; Windows workflows and search restored

Warp Agents 3.0 brings REPLs, debuggers, and spec‑first plans

Cline adds Gemini 3 Pro and higher‑fidelity speech‑to‑code

OpenCode’s Gemini 3 usage spikes after 5× limit bump

RepoPrompt spans multiple repos and adopts Codex‑Max

Code Wiki clarifies unfamiliar repos for contributors

Crush updates: Gemini 3 support and coding‑plan hook

📊 Leaderboards and evals: Grok gains, LiveBench nuance, METR update

Grok 4.1 Fast tops τ²-Telecom, scores 64 on AA Intelligence Index at ~$45 eval cost

METR: GPT‑5.1‑Codex‑Max hits ~2h42m time‑horizon at 50% success; no catastrophic‑risk model expected in ~6 months

Grok 4.1 Fast climbs Vals Index to #8; Finance Agent score rises to 44%

Arena: GPT‑5.1‑high climbs to #3 on Expert, #4 on Text leaderboard

LiveBench: Gemini 3 edges GPT‑5 overall; Claude 4.5 leads coding/agentic—but differences are within noise

Arena WebDev: Cogito v2.1 enters as Top‑10 open source, ties #18 overall

🏗️ AI compute build‑out: NVIDIA beat, 500MW Grok DC, hyperscale footprints

NVIDIA posts $57.01B revenue, guides ~$65B; data center hits ~$51.2B

Anthropic secures $30B Azure compute, teams with NVIDIA; Claude expands on Microsoft

xAI to build 500MW Saudi AI data center with NVIDIA hardware

Brookfield sets up $100B AI infrastructure program with NVIDIA DSX blueprint

Lambda raises >$1.5B, inks multi‑billion Microsoft GPU deal; builds own DCs

Epoch maps mega data centers; Meta Hyperion projected ~4× Central Park

🛡️ Security and governance: agent exfil and federal preemption

Researchers flag Antigravity IDE exfil risk via Markdown image loads

Draft White House order would preempt state AI rules and arm DoJ to sue

Factory AI embeds Palo Alto’s AIRS to scan prompts and tool calls in real time

💼 Enterprise moves: Perplexity–US Gov, Udio–Warner, creator platforms

Anthropic signs $30B Azure compute deal, partners with NVIDIA; Claude enters Microsoft stack

Perplexity secures GSA channel with Enterprise Pro for Government

Cloudflare acquires Replicate to fold open‑model inference into Workers AI

Factory AI integrates Palo Alto’s AIRS to scan agents for prompt injection risks

OpenAI launches ChatGPT for Teachers, free to U.S. K–12 through June 2027

Udio partners with Warner Music; creator tools remain available

Midjourney launches user profiles; 5 free fast hours for early setup

Perplexity adds PayPal checkout for on‑platform shopping

🧾 RAG and reranking in production

ZeroEntropy ships zerank‑2 reranker with multilingual and instruction‑following gains

Perplexity turns answers into editable Docs/Slides/Sheets

LlamaCloud improves complex table parsing for reliable RAG ingestion

OpenRouter rolls out 13 new embeddings for RAG

Document automation gets first‑class traces and eval hooks

Rapid corpus building for RAG via two‑click scraping

🦾 Robots in production: Figure’s BMW scorecard

Figure’s humanoid posts BMW factory KPIs after 11 months

🎨 Vision and creative stacks: SAM3, Nano Banana Pro, Search UIs

Meta ships SAM 3 with text prompts, video tracking, WebGPU demo, and Transformers support

Gemini‑powered Search now generates dynamic visual tools and magazine‑style layouts

‘Nano Banana Pro’ leaks show 4K image generation and advanced text rendering across Google apps

Gemini 3 generates YouTube Playables mini‑games from prompts and a few images

Replicate hosts Retro Diffusion models for sprites, tilesets, and pixel art workflows

ImagineArt adds Video Upscale; creators can boost clip fidelity in‑app

🗣️ Voice interfaces for engineers

Cline 3.38.0 brings Avalon STT to coding with 97.4% jargon accuracy

ElevenLabs sets voice-first roadmap: Agents Platform and Creative Platform

Research demo: Proactive hearing assistants isolate your conversation in noise

On this page