Apple's New Siri Runs on Nvidia Chips in Google's Cloud

Days before WWDC 2026 opens, The Information reported that Apple’s revamped Siri, previewing on June 8 and shipping in September with iOS 27, will route AI queries through Google Cloud infrastructure running Nvidia Blackwell B200 chips. Apple’s own server hardware, built so the company could control the full compute stack, reportedly wasn’t fast enough.

That last point alone should interest anyone building on Apple’s platform.

What Apple Is Actually Building

The new Siri is designed to be a conversational AI agent capable of multi-step, cross-app tasks — closer to ChatGPT or Google Gemini than the command-driven assistant that shipped in 2011. The underlying models are built on Google’s Gemini technology. Apple and Google issued a joint statement earlier this year confirming that the next generation of Apple Foundation Models would use Gemini as their base.

What’s new in this week’s reporting is the infrastructure layer: queries will run on Google Cloud servers powered by Nvidia’s Blackwell B200 GPUs. The B200 is Nvidia’s current data center chip, rated at up to 20 petaflops of FP4 inference throughput per GPU, and it includes hardware-level confidential computing: memory operations are encrypted in silicon so that even the host system cannot read in-flight data. That last part is how Apple is trying to preserve its privacy story.

Why Apple Silicon Wasn’t Enough

Apple Silicon is very strong at on-device AI inference. The Neural Engine in A-series and M-series chips runs local models efficiently, and Apple has invested in Private Cloud Compute (PCC): purpose-built Apple Silicon servers that run Apple Intelligence workloads in the cloud while keeping the server software cryptographically auditable.

The bottleneck is throughput at scale. Handling millions of complex, multi-turn queries simultaneously demands a different class of compute. Nvidia’s Blackwell B200 delivers roughly 4x the inference throughput per chip compared to the previous Hopper H100 generation, and Google Cloud has deployed them at massive scale. Apple’s PCC servers were designed around privacy guarantees and Apple-controlled hardware, not raw throughput. When Apple tried to run the new Siri’s inference workload on its own infrastructure, it reportedly couldn’t keep pace.

This is the same wall every frontier AI lab has hit. On-device inference is excellent for latency-sensitive, privacy-sensitive tasks with bounded complexity. Cloud inference wins for heavy, stateful, multi-turn reasoning — which is exactly what a “do this 10-step task across my apps” Siri needs.
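That split can be written down as a small routing rule. The Swift sketch below is illustrative only: the InferenceRequest fields, the step threshold, and the chooseTarget function are assumptions made for this example, not part of any Apple API and not a description of how Apple actually routes Siri queries.

```swift
/// Hypothetical description of a request. None of these fields come from an Apple API.
struct InferenceRequest {
    let estimatedSteps: Int     // rough count of app actions the task is expected to need
    let isMultiTurn: Bool       // true when the task depends on earlier conversation state
    let userAllowsCloud: Bool   // false when the user or app policy forbids leaving the device
}

enum InferenceTarget {
    case onDevice
    case cloud
}

/// Picks where to run a request. The default threshold is an arbitrary placeholder.
func chooseTarget(for request: InferenceRequest,
                  maxOnDeviceSteps: Int = 2) -> InferenceTarget {
    // Policy comes first: if cloud is not allowed, stay local even for heavy tasks.
    guard request.userAllowsCloud else { return .onDevice }

    // Heavy means many steps or a dependency on conversation state.
    let isHeavy = request.estimatedSteps > maxOnDeviceSteps || request.isMultiTurn
    return isHeavy ? .cloud : .onDevice
}

let tenStepTask = InferenceRequest(estimatedSteps: 10, isMultiTurn: true, userAllowsCloud: true)
let quickLookup = InferenceRequest(estimatedSteps: 1, isMultiTurn: false, userAllowsCloud: true)
print(chooseTarget(for: tenStepTask), chooseTarget(for: quickLookup))
```

In a real app the threshold would come from measurement rather than a constant, and a blocked cloud request would probably need a degraded local fallback rather than a silent reroute.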

The Private Cloud Compute Question

Private Cloud Compute was a genuine security architecture. Apple used its own Apple Silicon servers, published signed binaries that researchers could verify, and enforced that request data was never retained or cross-correlated. Independent security researchers gave it high marks precisely because Apple controlled the entire hardware and software stack.

Moving to Google Cloud on Nvidia hardware changes the trust model in ways Apple hasn’t yet explained publicly. The apparent answer is the Blackwell B200’s confidential computing capability: hardware encryption that keeps memory contents opaque even to Google’s operators. This is real technology, and Google Cloud already sells confidential computing for enterprise workloads. But whether it provides the same level of verifiability as PCC’s auditable, reproducible builds on Apple-controlled hardware is an open architectural question.
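One way to frame the verifiability question is to ask what a client can check before it releases a request. The Swift sketch below models that check in the abstract, assuming a node presents a signed measurement of the software it runs and the client compares it against a published list. NodeAttestation, shouldSendRequest, and the sample digests are hypothetical names for illustration, and the signature check is reduced to a Boolean stand-in. It does not describe Apple’s, Google’s, or Nvidia’s actual attestation protocols.

```swift
/// Hypothetical attestation record a server node might present.
/// Not a real Apple, Google, or Nvidia type.
struct NodeAttestation {
    let softwareMeasurement: String  // digest of the software image the node claims to run
    let signatureIsValid: Bool       // stand-in for verifying a hardware-rooted signature
}

/// Returns true only when the node's signature checks out and its
/// measurement appears in the set of publicly logged builds.
func shouldSendRequest(to node: NodeAttestation,
                       publishedMeasurements: Set<String>) -> Bool {
    guard node.signatureIsValid else { return false }
    return publishedMeasurements.contains(node.softwareMeasurement)
}

// Placeholder digests, made up for this example.
let publicLog: Set<String> = ["a1b2c3", "d4e5f6"]
let knownBuild = NodeAttestation(softwareMeasurement: "a1b2c3", signatureIsValid: true)
let unknownBuild = NodeAttestation(softwareMeasurement: "ffffff", signatureIsValid: true)
print(shouldSendRequest(to: knownBuild, publishedMeasurements: publicLog))
print(shouldSendRequest(to: unknownBuild, publishedMeasurements: publicLog))
```

The open question in the text maps onto the two inputs: who roots the signature, and who publishes and audits the list of acceptable measurements.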

Apple will need to address this at WWDC. Developers who’ve built privacy messaging around Apple Intelligence features, or who advise clients on data handling for App Intents integrations, should read whatever updated PCC documentation Apple releases at the conference very carefully.

What WWDC 2026 Will Show

WWDC 2026 opens Monday, June 8, with the Platforms State of the Union on Tuesday. Based on pre-conference reporting, expect:

A demo of the new conversational Siri completing multi-step tasks across apps without scripted hand-holding
An expanded App Intents API — the framework that connects your app’s actions to Siri
New Apple Intelligence APIs reflecting the Gemini-powered backend’s broader capability envelope
iOS 27 developer beta available on day one

The developer angle matters more than the consumer keynote moment. If Apple expands what App Intents can express — richer parameter types, stateful multi-turn conversations, background task execution — that’s a new surface area for every app that wants to be part of an AI-driven workflow. The difference between Siri understanding “book me a table for two at 7pm” and Siri actually completing it across three apps is almost entirely an App Intents API design decision.
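For reference, this is roughly what the table-booking example looks like with the App Intents framework as it exists today (iOS 16 and later). It is a minimal sketch, not a preview of anything in iOS 27: BookTableIntent, its parameters, and the reservationSummary helper are hypothetical names for this example, and no real booking happens. The intent is wrapped in a canImport check so the helper can still be compiled and tested where the framework is unavailable.

```swift
import Foundation
#if canImport(AppIntents)
import AppIntents
#endif

/// Plain helper kept outside the framework so it can be tested anywhere.
func reservationSummary(partySize: Int, timeLabel: String) -> String {
    let noun = partySize == 1 ? "person" : "people"
    return "Table for \(partySize) \(noun) at \(timeLabel)"
}

#if canImport(AppIntents)
/// Hypothetical intent that exposes a "book a table" action to Siri and Shortcuts.
@available(iOS 16.0, macOS 13.0, watchOS 9.0, tvOS 16.0, *)
struct BookTableIntent: AppIntent {
    static let title: LocalizedStringResource = "Book a Table"

    // Each @Parameter is a value the system can ask the user for or fill in.
    @Parameter(title: "Party Size")
    var partySize: Int

    @Parameter(title: "Time")
    var time: Date

    func perform() async throws -> some IntentResult & ProvidesDialog {
        // A real app would call its reservation backend here.
        let label = time.formatted(date: .omitted, time: .shortened)
        let summary = reservationSummary(partySize: partySize, timeLabel: label)
        return .result(dialog: "\(summary)")
    }
}
#endif
```

Everything this intent can express is a flat set of typed parameters and a single result. Multi-app, multi-turn completion would need more than that, which is why any App Intents changes at WWDC are the part to watch.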

What This Means for the Products We Ship

We build apps on Apple’s platform. A more capable Siri changes the calculus on which AI features need custom implementations versus which can be surfaced through standard platform APIs.

The apps we’ve shipped — including Amali and TeleTabeb — have required careful decisions about on-device versus cloud inference, and where user trust sits on each axis. That analysis gets more interesting when the platform’s own AI layer becomes substantively more capable rather than nominally so.

The infrastructure bet is also a signal worth taking seriously: Apple is willing to sacrifice vertical integration when competitive capability demands it. The “wait for Apple’s on-device model to improve” strategy for certain feature categories now has a shorter runway. If you’ve been deferring AI features for your iOS app because the quality floor wasn’t there, your roadmap is worth re-evaluating in light of WWDC 2026 and iOS 27.

If you’re deciding how AI fits into a product you’re building on Apple’s platform, we’re worth talking to before the session videos are even published.

Sources

Apple finally set to launch all-new Siri in September, powered by Google cloud and Nvidia chips — Mac Daily News, June 4 2026 (citing The Information)
Apple partners with Google and Nvidia for next-generation Siri infrastructure — IntoMobile, June 4 2026
Apple’s Gemini-powered Siri might run on Nvidia’s encrypted chips — Cult of Mac, June 4 2026
Apple to use Google servers with Nvidia hardware for the new Siri — Macworld, June 4 2026
WWDC returns June 8: What we know and how to watch the Apple event — ZDNet, June 5 2026