Google I/O 2026: Quadrillions of Tokens, Billions in Capex, and an Agent That Plans Your Block Party
Sundar Pichai kicked off Google I/O by doing what tech CEOs do best: presenting large numbers as proof that everything is going brilliantly.
The headline stat was token throughput. Two years ago, Google was processing 9.7 trillion tokens per month. Last year that climbed to 480 trillion. Now it's sitting at 3.2 quadrillion per month. Pichai acknowledged the obvious with a rare flash of self-awareness: 'Some out there might call this tokenmaxxing, and there's probably some truth to it.' Credit where it's due for saying the quiet part out loud.
For context, over 8.5 million developers are using Gemini monthly via API, burning through roughly 19 billion tokens per minute. More than 375 enterprise customers have each consumed over a trillion tokens in the past year. Someone is paying for all this.
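Those figures are easier to judge once they are put on the same scale. The sketch below is a back-of-the-envelope check, not anything Google published: it computes the year-on-year multiples from the quoted monthly totals, then converts the API rate to a monthly figure. The 30-day month and the idea that 19 billion tokens per minute holds around the clock are assumptions made purely for illustration.

```python
TRILLION = 10**12
QUADRILLION = 10**15

# Monthly token volumes as quoted in the keynote.
monthly_tokens = {
    "two years ago": 9.7 * TRILLION,
    "last year": 480 * TRILLION,
    "now": 3.2 * QUADRILLION,
}


def growth_multiple(before: float, after: float) -> float:
    """How many times larger `after` is than `before`."""
    return after / before


def api_monthly_tokens(tokens_per_minute: int, days: int = 30) -> int:
    """Scale a per-minute rate to a month.

    Assumption: the rate is constant, 24 hours a day, for `days` days.
    """
    return tokens_per_minute * 60 * 24 * days


first_jump = growth_multiple(monthly_tokens["two years ago"], monthly_tokens["last year"])
second_jump = growth_multiple(monthly_tokens["last year"], monthly_tokens["now"])
api_month = api_monthly_tokens(19 * 10**9)
api_share = api_month / monthly_tokens["now"]

print(f"Two years ago to last year: {first_jump:.1f}x")
print(f"Last year to now: {second_jump:.1f}x")
print(f"API tokens per month: {api_month / QUADRILLION:.2f} quadrillion")
print(f"API share of the monthly total: {api_share:.0%}")
```

Under those assumptions, developer API traffic comes to roughly a quarter of the monthly total, which suggests most of the volume runs through Google's own products. The growth multiple also shrinks sharply from the first year to the second, though from a far larger base.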
And paying they are. Google's infrastructure spend has gone from $31 billion annually in 2022 to an expected $180-190 billion this year. Roughly six times the capex in four years is a staggering commitment, even by hyperscaler standards.
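For a rough sense of what that trajectory implies, here is a minimal sketch of the multiple and the compound annual growth rate at both ends of the quoted range. The span of four years is an assumption (2022 to the 2026 of the headline), and the function names are illustrative only.

```python
def capex_multiple(start_billion: float, end_billion: float) -> float:
    """Ratio of ending to starting annual spend."""
    return end_billion / start_billion


def annual_growth_rate(start_billion: float, end_billion: float, years: int) -> float:
    """Compound annual growth rate implied by the start and end values."""
    return (end_billion / start_billion) ** (1 / years) - 1


START = 31.0                 # $ billion, 2022, as quoted
END_LOW, END_HIGH = 180.0, 190.0  # $ billion, expected this year, as quoted
YEARS = 4                    # assumption: 2022 through 2026

for end in (END_LOW, END_HIGH):
    multiple = capex_multiple(START, end)
    rate = annual_growth_rate(START, end, YEARS)
    print(f"${end:.0f}B: {multiple:.1f}x the 2022 figure, about {rate:.0%} growth per year")
```

Either end of the range works out to spend growing by more than half every year, compounding, over the period.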
Demis Hassabis then appeared to pitch Gemini Omni as the latest waypoint on the road to AGI, that perpetually imminent milestone where AI supposedly matches human capability across meaningful tasks. Omni combines Gemini's core reasoning with Google's generative media models, physics simulation, and multimodal capabilities from Veo, Nano Banana, and Genie. The pitch: it understands the world well enough to model how physical objects actually behave, not just what they look like. The first release, Gemini Omni Flash, is available now.
On the content authenticity front, Google is expanding SynthID, its AI watermarking system, to Search and Chrome. Users will be able to right-click any image and ask whether it was AI-generated. Google is also backing C2PA content credentials, which track whether media was captured by a camera or cooked up by a model. OpenAI, Kakao, and ElevenLabs are joining SynthID's coalition, which is either a sign of genuine industry coordination or competitive optics. Possibly both.
The main model announcement was Gemini 3.5 Flash, billed as faster and cheaper than current frontier alternatives. Google claims it runs at around 289 tokens per second, roughly four times the speed of comparable models. Inside the Antigravity coding environment, the claimed advantage rises to twelve times. Pichai offered a thinly veiled pitch to cloud customers: if companies processing a trillion tokens per day shifted 80% of their workload from other frontier models to 3.5 Flash, they would collectively save over a billion dollars a year. Subtle.
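Pichai's savings claim can be turned around to see what price gap it implies. The sketch below assumes the trillion tokens per day is a collective total across those companies (the keynote phrasing is ambiguous on this) and treats the billion dollars as a floor; neither figure comes with a published price list attached, so the result is an inference, not a quoted rate.

```python
def implied_saving_per_million(
    tokens_per_day: float,
    shifted_fraction: float,
    annual_saving_usd: float,
    days_per_year: int = 365,
) -> float:
    """Dollar saving per million shifted tokens implied by an annual total."""
    shifted_tokens_per_year = tokens_per_day * shifted_fraction * days_per_year
    return annual_saving_usd / (shifted_tokens_per_year / 1_000_000)


# Figures as quoted: a trillion tokens per day, 80% shifted, over $1 billion saved.
saving = implied_saving_per_million(
    tokens_per_day=1e12,
    shifted_fraction=0.80,
    annual_saving_usd=1e9,
)
print(f"Implied saving: at least ${saving:.2f} per million tokens shifted")
```

On that reading, the claim amounts to 3.5 Flash being a few dollars per million tokens cheaper than the models it would replace.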
The more significant product reveal was Gemini Spark, a persistent AI agent running 24/7 on dedicated cloud VMs. It sits in the background, takes instructions, and acts on your behalf across Gmail, Google Chat, and eventually third-party tools via MCP. Chrome integration for agentic browsing is coming later this summer. As a demonstration of its capabilities, Google's Josh Woodward described using Spark to organise a neighbourhood party: emailing people, tracking RSVPs in a spreadsheet, building a slide deck. If that sounds like something an intern used to handle, that's rather the point.
Spark is available to trusted testers now and will reach Google AI Ultra subscribers in the US next week. Its arrival coincides with a new $100 per month Ultra tier and a modest price cut on the top tier from $250 to $200.
Pichai reassured everyone that 'it's still early days when it comes to making agents easy to use, super secure, and truly helpful,' which is a remarkably relaxed framing for software that autonomously reads your email and writes documents on your behalf.
Search is getting a similar treatment. Gemini 3.5 Flash is now the default model for AI Mode, the search interface has been redesigned to accept images, files, videos, and Chrome tabs alongside plain text, and Search Agents will run in the background, monitoring topics and surfacing updates while you get on with other things. Google is also rolling out generative UI in Search this summer, letting users spin up interactive charts, layouts, and mini-apps on demand, borrowing a concept Anthropic has been experimenting with.
Token counts will keep climbing. Subscription pressure will follow.