Gemini 3.5 Flash Lands at Google I/O 2026: Frontier Coding Performance at Flash Pricing
- May 27
- 2 min read
On May 19, 2026, at Google I/O, Google introduced Gemini 3.5, framed as the first family that fuses frontier intelligence with action. The first model out the door is Gemini 3.5 Flash, and the headline is unusual for a Flash release: on Google's published agentic and coding benchmarks, it edges past its own previous Pro tier while remaining priced as a fast model.
What Google Actually Shipped
Gemini 3.5 Flash is available immediately through the Gemini API, Google AI Studio, Vertex AI, Google Antigravity, the Gemini Enterprise Agent Platform, the consumer Gemini app, and AI Mode in Google Search. The API model ID is gemini-3.5-flash. The context window is 1,048,576 input tokens with a maximum of 65,536 output tokens, and the model accepts text, image, audio, and video inputs. Gemini 3.5 Pro is in internal use but is not expected to be widely available until the following month.
Benchmarks That Matter for Builders
Google's published scores center on agentic and coding workloads rather than pure knowledge tests, which is a meaningful signal about where Google sees the next round of competition.
Terminal-Bench 2.1: 76.2 percent
GDPval-AA: 1656 Elo
MCP Atlas: 83.6 percent
CharXiv Reasoning (multimodal): 84.2 percent
Google reports that Gemini 3.5 Flash outperforms Gemini 3.1 Pro on the coding and agentic suite while running about four times faster than other frontier models. On the composite Artificial Analysis Intelligence Index, the model lands at 55.
Pricing
Per Artificial Analysis listings, standard-tier pricing is 1.50 dollars per 1M input tokens and 9.00 dollars per 1M output tokens, with cached input at 0.15 dollars per 1M. The thinking variant is priced separately. For agent loops that lean heavily on tool calls and reasoning chains, the cached input rate is the line item to watch.
Why This Release Reads Like a Strategy Shift
Two things stand out. First, Google led with Flash, not Pro. The implicit positioning is that the bottleneck for agentic systems is not raw intelligence but the cost and latency of running many model calls inside one task. Second, Gemini 3.5 Flash is the default model under Google Antigravity 2.0, the personal agent Gemini Spark, and AI Mode in Search, so the same model serves everything from consumer chat to enterprise agent orchestration.
For teams already running on Gemini 3.1 Pro, a benchmarks-led migration to 3.5 Flash is now worth a structured eval pass, especially for long-horizon coding agents where the speed delta compounds across many tool calls.



