From Models to Agents: How Enterprises Are Scaling AI with Google Cloud

Daniel Newman and Oliver Parker, VP of Global Gen AI GTM at Google Cloud, examine the enterprise AI inflection at Google Cloud Next 2026. The conversation covers the drivers behind the shift from production capability to scale production, how inference cost structures are shaping what gets deployed, vertical AI execution through industry-specific customer deployments, and the emerging FinOps framework for evaluating agent ROI against labor cost equivalents.

Enterprise AI has crossed a threshold. Token consumption on Google's platform grew 50% in three months, driven by agentic workloads running across enterprise systems, not chat interfaces. The infrastructure and cost structures required to sustain that growth are now coming fully into focus.

Daniel Newman sits down with Oliver Parker, VP of Global Gen AI GTM at Google Cloud, at Google Cloud Next 2026 to discuss what drove the inflection from POC to production, how enterprises navigate rapid model iteration without stalling deployment, where vertical AI is delivering measurable outcomes, and how CFOs are beginning to reframe AI spend as a labor cost decision.

Key Takeaways:

  • Scale production is the next phase. Token consumption grew from 10 billion per minute in December to over 15 billion within three months, driven by agents running 24/7 across enterprise systems. Parker sees this as the beginning, not the ceiling.
  • Inference economics determines what scales. Capability that outpaces cost structures stays in pilot mode. Google's infrastructure investment is explicitly aimed at keeping cost-per-token aligned with enterprise deployment realities.
  • Model iteration speed is a change management problem. Parker's answer is using AI to accelerate organizational learning itself, compressing the upskilling gap before it stalls deployment momentum.
  • Vertical execution is where business outcomes get delivered. The Citi Wealth app and Signal Iduna's insurance platform show what happens when platform capability meets industry-specific workflow design. Gemma 4 and open source are accelerating that verticalization further.
  • FinOps is coming to AI spend. CFOs will evaluate token budgets against labor cost equivalents. That framing will drive broader organizational investment in agents, not constrain it.
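The FinOps takeaway can be made concrete with a small back-of-the-envelope calculation. The sketch below is illustrative only: the token price, daily consumption, and loaded salary figures are hypothetical assumptions, not numbers from the conversation.

```python
# Illustrative FinOps sketch: comparing an agent's token budget to a labor
# cost equivalent. All figures below are hypothetical assumptions.

def monthly_token_cost(tokens_per_day: float, usd_per_million_tokens: float) -> float:
    """Monthly spend for an agent's token consumption, assuming a 30-day month."""
    return tokens_per_day * 30 / 1_000_000 * usd_per_million_tokens

def labor_equivalent_ratio(agent_monthly_cost: float, loaded_monthly_salary: float) -> float:
    """How many agent-months one human-month of fully loaded cost would fund."""
    return loaded_monthly_salary / agent_monthly_cost

# Hypothetical agent: 40M tokens/day at $2.00 per million tokens.
agent_cost = monthly_token_cost(40_000_000, 2.00)
ratio = labor_equivalent_ratio(agent_cost, 12_000)  # vs. $12k/month loaded cost

print(f"Agent token spend: ${agent_cost:,.0f}/month")
print(f"One human-month funds {ratio:.1f} agent-months")
```

Under these assumed numbers, the agent runs about $2,400 a month, so one $12k human-month funds roughly five agent-months — the kind of ratio a CFO would weigh when framing token budgets as a labor cost decision.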

The organizations scaling fastest aren't just optimizing existing workflows; they're questioning whether those workflows should exist at all.

Watch now and subscribe to Six Five Media for analyst-led coverage from Google Cloud Next 2026.

Disclaimer: Six Five Media is a media and analyst firm. All statements, views, and opinions expressed in this program are those of the hosts and guests and do not represent the views of any companies discussed. This content is for informational purposes only and should not be construed as investment advice.

Transcript

OLIVER PARKER:

While we've seen huge acceleration the last 12 months, I actually think you're going to see even more now because I think the enterprise is now sort of what I call legitimately waking up and moving beyond what I would call sort of production capability to scale production.

DANIEL NEWMAN: 

Hey, everyone, The Six Five is on the road. We are here in Las Vegas, Nevada, at Google Cloud Next 2026. Excited to be here with all of you. So much going on. This show has been humming, wall-to-wall traffic, and that isn't because I'm here, or maybe it is, but it may be because of all the announcements that are going on. We are at the epicenter of an AI revolution, and there is a lot to discuss. And this week I've got an opportunity here at Google Cloud Next to talk to someone who's been on the show before, actually, two years ago, just when he rejoined Google: Oliver Parker, back, leading the AI go-to-market for Google Cloud. It's good to be back here with you. Man, this place is nuts. I was trying to get to the keynote yesterday, and you just start flowing through the Mandalay, and I mean, it is wall to wall. I remember that inflection when it happened at GTC, and it's happened here at Google Cloud Next. This has got to be one of the AI shows at this point, and Google Cloud is absolutely crushing it.

OLIVER PARKER:

I mean, this has been a really good show for us. I think we're day two or day three. I forget my days right now.

DANIEL NEWMAN: 

You're humble. You're humble.

OLIVER PARKER: 

OK, there we go. So look. When I think about our conversation two years ago and what we've shipped since, all the AI work, it's been really, really strong: strong product announcements and actually great customer examples. It's shipping and customer value and how you get that right, so hopefully you'll see that.

DANIEL NEWMAN: 

Yeah, we're seeing it when we're talking to your customers. We're seeing it in our data, in our decision intelligence, and we're seeing evidence of it here through some of the sessions I've sat in listening. I mean, when Thomas got on stage, I think he kind of really banged the gong. I mean, he was like, boom, this is the year. We're done with the POC talk. We are moving on, and it is full scale, full production enterprises are rolling. So let's start there. I mean, what's the great inflection? Because I feel like two years ago, you know, you just kind of told everyone, you know, it was the idea of an agent of a vision. Last year, I think, you know, there was some real encouragement, but man, the innovation, the iteration, the agentic work, everything's humming. What was the inflection that gave Thomas and you and the team the confidence to say, we are full scale, we are full tilt?

OLIVER PARKER: 

Well, I think there's an inflection point in what we started to see, but we've been building a lot of this. You've been covering us for a long period of time, and you deeply understand some of the work we're doing, at least on the lower layers, on the infrastructure side. A lot of it was building, but when we released 2.5 last year, we just saw a massive scale inflection point. And then obviously we started to release products with these capabilities in the second half of last year. And I think Thomas has even said publicly, you know, we had, what was it, 10 billion tokens a minute in December, and now we're 50% greater than that within a three-month period. So we're seeing a massive inflection point in the model. And a big part of that is because of these agents that are now getting built and leveraging these models for massive-scale token consumption inference. So a lot of the building blocks that we've shared in the last couple of days, we think, lay the foundation for that honestly to continue to accelerate. And while we've seen huge acceleration the last 12 months, I actually think you're going to see even more now, because I think the enterprise is now, what I call, legitimately waking up and moving beyond what I would call production capability to scale production. You're seeing use cases that are now in production. You talked about this pilot to production. I think there's another phase, and a big part of it we'll see more of this year, which is this scale production, where you start unleashing these capabilities across many products, many services, and digital channels. So even though we thought it was quite extreme already, I think we're going to see a significant amount more. And obviously, you saw the announcements we made on the infrastructure side.
So some of that is obviously critical as it relates to sort of what you need from a compute standpoint as you start to unleash thousands of agents across these enterprises.

DANIEL NEWMAN: 

What comes after trillion? I mean, because you were at 10 billion a minute, right? Of tokens. And now it's 15, up 50%. And when you start to add up the minutes in a day, you know, Google alone has got to be zeroing in on that.

OLIVER PARKER: 

And that's just the enterprise. And then obviously we've got our own token consumption taking place, like our search platform. So yeah.

DANIEL NEWMAN: 

But it's all, you know, not to use a tech term, but it's compiling, right? You've got the infrastructure becoming more capable. Of course, Ironwood was a big step forward, and now you've got the TPU-8, the 8I and the 8T, which is really interesting. We have another show all about infra, so check that one out if you want. And of course, the networking's caught up, next-generation CPUs. I mean, who would have thought there'd be this renaissance of CPUs? You know, I get a lot of things right. I did not see that coming as fast as it came. And then you see software, tooling, just the capabilities of these models. And by the way, it's unleashed the imagination of pretty much anyone that can type or talk.

OLIVER PARKER: 

Yes, not only is it unleashing the imagination, but the cost structures need to make sense so that people can start creating. In addition to the capability, you also need the right cost structures to support it. As you've noticed, we've done a huge amount to bring out really capable models at the right cost structure. We're seeing more and more use cases get picked up because of the cost structures and the performance. And again, a lot of the work that Amin has been doing around that infrastructure is powering this at a cost and an efficiency level, because as much as you can unleash great capability, it's got to be done at the right pricing structures so that you can take advantage of it. That's been a really important focus of ours.

DANIEL NEWMAN: 

Yeah, and it's interesting, too. Part of what took POC to production, I think, has more to do with inference. Returning content and tokens was one thing. Agentic workflows are a whole different thing, right? It's one thing when companies start to see, oh, it can write a press release. But when you can not only look at your current processes and how agents can help them, but start to reimagine all your processes and how agents can change them, I mean, that has to be something you see in that go-to-market motion, talking to customers. Was that one of the real drivers, taking it from just generative capabilities to true agentic capabilities?

OLIVER PARKER: 

I still think we're really early, even though we're seeing this big increase. I still think we're early for what I would call scale agentic use. You'll see companies build some of these agents or rethink workflows or even redesign them, but they're still in a really small part of the organization, even at companies that are being quite successful. There's still so much more even within those companies, and we're not even talking about the new ones coming online. So one is redesigning existing workflows. The other is complete new workflows that get created as an outcome, a derivative, of AI. And between the redesign and the new businesses, I still think we're really early as it relates to unleashing these agentic capabilities. And then obviously token consumption: you've got these agents working 24/7. So the underlying compute requirements, the efficiency, and the cost structures are a really big focus, because otherwise you can see huge potential, but the cost structures and the performance won't be there to enable it. So we're really focused on, to your point, the full stack and enabling that unleashing at the business level.

DANIEL NEWMAN: 

Yeah, that is one of the big challenges: matching capabilities, costs, and choice, by the way. Because when you are looking at your business, there's all this optionality that you have. And one of the things, at least that we're seeing, is that the best implementers of AI have the imagination and the intellectual curiosity not to do what we did in digital transformation, which was, how do we take this physical piece of paper and put it online? With AI, you don't want to just take the process that you have online and now add AI to it. You want to collaborate with this technology and reinvent it. And actually rethink, should it be online?

OLIVER PARKER: 

Should it be on paper? Should I even be in this business? I mean, those are the questions that you now ask yourself as a function of having this sort of intelligent capacity you get to unleash. It's really changing things.

DANIEL NEWMAN: 

One of my favorite questions, Oliver, is what am I not asking you that I should be? What am I not thinking about? What do you see out there? Did you ask Gemini before we met?

OLIVER PARKER: 

Or is this… Yeah. What did Gemini say?

DANIEL NEWMAN: 

Oh, in general? Yeah.

OLIVER PARKER: 

Did you ask Gemini that question or were you asking me that question?

DANIEL NEWMAN: 

No, I ask it every time. And my point is that every time I'm working on any workflow, when I'm trying to think, hey, I want this certain reporting structure on RevOps, you know, think about all the things you ask for in RevOps. And I'm like, hey, Gemini, what am I not asking you that I should be asking my RevOps report? And it always comes up with three or four things that are like, holy smokes, why didn't I think of that?

OLIVER PARKER: 

It's funny, you get this sort of PhD sitting on your shoulder, and you need to utilize it. You're like, what should I be thinking about that I'm not thinking about? I do this a lot, obviously, in my role, because I lead a lot of the AI go-to-market. I'm like, I'm meeting with this client, here's how I think about it, what am I not thinking about? What are some of the new opportunities for clients? I run a sales function, a go-to-market function, and using AI to rethink how to engage with clients, in addition to them obviously buying your technology, has been a huge part of my personal growth and value in my current role, helping me rethink my client engagement.

DANIEL NEWMAN: 

Your level of preparation, your level of understanding, the ideation that you can bring to the customer, instead of just showing up and asking, what do you need? It's like you really… Here's your problem, here's what we've got.

OLIVER PARKER: 

You actually get to have fundamentally different conversations by having that sort of, that intelligence next to you to be able to…

DANIEL NEWMAN: 

Understand what the best of breed are doing. Understand what the best in the industry are doing. And then, by the way, put it in a context that looks at the personalities of the people in the room with you. It's great. We could talk about this all day, but I want to talk about the rapid iteration and innovation that's going on. One of the things I do hear from a lot of business leaders and IT leaders that are implementing AI is just, wow, this is moving fast. It's hard to digest. Every day there's a new model, a new tool, a new capability. I feel behind from the minute I wake up. Not me, I'm way ahead. But they're telling me they always feel behind because of how fast things are changing. How do you arbitrate that when you're talking to customers, to keep this rapid innovation from stalling, almost paralyzing, their progress?

OLIVER PARKER: 

Look, I think your head can spin if you live in the world of iterative releases and that sort of stuff. One of the things that we've gotten a lot better at at a client level is actually articulating where businesses need to go and the fact that this technology will continue to improve. But the reality is, you've got what I think some people refer to as AI asymmetry, or this capability overhang. You have to be able to package this capability up into things that are very useful and can be applied at a business level, and a lot of the tooling and more product-centric thinking is the stuff that we're working on. So you can talk about what a reasoning model does for this product versus, you know, a 3.0 or a 3.1, but the reality is, I think all of us inside of Google, and actually our customers, are having to rethink how we learn. That's a really big part of it. And using AI to help you think about how you learn. I mean, my learning journey now for new technology could not be more different than it was 18 months ago. I use AI to help me learn on an ongoing basis. So rethinking how I upskill, rethinking how I learn. And a lot of our clients are now starting to do that, and we're helping them along that journey.

DANIEL NEWMAN: 

Yeah, it's really interesting you say that. I legitimately learned nuclear fusion using AI in about a 12-day sprint.

OLIVER PARKER: 

How much time a day would you do?

DANIEL NEWMAN: 

Was it… No, I mean, it was a few hours a day. It was just me iterating on it and using things that I do understand well, like semiconductor technology. I was using the parallels to try to understand fission versus fusion. I mean, obviously, I'm not dissertation-ready, but I was able to walk in and talk to the CEO of a fusion company and have a really interesting conversation and be able to help at the level where we function, on AI and fusion. And what I'm saying is, using tools like Gemini, spending time iterating with it, building on it, it was a super interesting process.

OLIVER PARKER: 

I spent a lot of time, I was going through the First World War last week. I did a bunch of trips in my car, and I was actually relearning the First World War. My history was outdated, and I actually got a lot more context. And you can also ask for context around these kinds of events, especially for me, I'm quite interested in history. You could build an app that could become a training and teaching tool, a consumer app, just by having a conversation on an ongoing basis. So I think it's changing people's learning, and I also think it's spiking people's curiosity. So yeah, there's a lot that goes into learning and understanding more things.

DANIEL NEWMAN: 

I use Gemini a ton for health- and fitness-related things, like optimizing workouts. Training is a big thing for me. And working with it, it's given me ideas that, you know, I've had trainers, I've had nutritionists, I've had doctors, and, obviously, talk to your doctors, I'm not saying don't do that, but some of the very real-time, iterative ideation it gave me. Because I've worked out for 30 years straight, and my workouts were stale. I asked it to help me understand what I should be working on, what muscle groups I should be focusing on developing that are good for aging and health, and it just gave me so many great things.

OLIVER PARKER: 

So many great things. Obviously, then you add in tool calling and pulling stuff from the internet and searching. So at least for Gemini, that access to the world's information, in a way, it's great.

DANIEL NEWMAN: 

It's great at that. It's actually really great. But by the way, that's a great example of verticalization. One of the things, too, is companies want to see AI in the language of their business, right? You have things like Gemini, which, of course, is very much a world model. It's frontier. But companies are wanting to go to SLMs, vertical models. They want to fork a Gemma model and use it for something specific. Because a lot of businesses don't need the power of a Gemini Pro. They don't need any of these big frontier lab models. They're great, I'm not saying that, but for some of the things they're doing. So how are they thinking about and differentiating what they want for an enterprise, for a very specific automation and workflow? How is that evolving?

OLIVER PARKER: 

I think, actually, that's one space where we've seen a massive evolution. When you and I first met, the number of what I would call model providers in the ecosystem was really high.

DANIEL NEWMAN: 

Yeah.

OLIVER PARKER: 

I think, you know, what we've also seen, and you mentioned a little bit of this with Gemma, is a significant increase in open source. And I think that started to fill a void of what people want to do with these things, obviously taking the weights to really start to make these things work for their specific use cases. And we've just come out with Gemma 4, a smaller model, which we think will end up sitting on the edge, maybe even on small devices. So we think a lot of that adaptability is starting to show up in the open source ecosystem. And again, we've been clear since day one, we're a platform player. So we give people access to as many models as we can, whether they're frontier lab, whether it's us, Anthropic, or others, right the way through to people that are using open source, and being able to use the platform to govern and guardrail and control at the same time. So I actually think the open source community is really benefiting from a lot of it, and that's an important thing that we continue to invest in. So I think you'll see a lot more from open source in addition, obviously, to the frontier labs.

DANIEL NEWMAN: 

Yeah, I think open source is a great point, and so is verticalizing. I think we've sprawled out a lot, Oliver, but I also think we're going to start to narrow back in, and we're going to start to find that, hey, in healthcare, in legal, some of it's going to come from the frontier labs, and some of it's going to be companies that Google Ventures and others are going to invest in. Because in the end, I don't think it's a winner-takes-all. I think the zero-sum thinking of the market has been a little overdone, that every software company is going to be disintermediated, that every job is going to be replaced. I think the more optimistic types, and hopefully you and I see this somewhat similarly, say, look, a lot of disruption is coming. AI is certainly transformative. But having said that, industrial revolutions of the past have proved time and time again the ingenuity and the human spirit of innovating on these technologies. What's going to come next? Many of us can't even imagine yet.

OLIVER PARKER: 

Back to your vertical point, though, we showed a couple of examples. I think we showed the Citi Wealth app, the My Wealth app from Citi. We showed Signal Iduna, which does insurance and financial services in Germany. So we've ended up helping customers build on and take advantage of these platforms for very specific vertical use cases, and we're going to continue to do that. I also think you're going to get people that build on top of these models and take them almost as products to market. You've seen stuff in the legal environment where they're using these models and going toward legal. So again, I think we'll continue to put the building blocks in there and go vertical where it makes sense, but I also think that's a whole new ecosystem slowly being built as well.

DANIEL NEWMAN: 

We're taking the real-time decision intelligence that we as analysts build and we're building it right on top of models like Gemini and Opus. I mean, literally, all our proprietary data, all our proprietary market models, all the proprietary decision-maker surveys we do.

OLIVER PARKER: 

An application-centric capability that is powered by the model. It's powered by all of them, in fact.

DANIEL NEWMAN: 

And we actually use a mixture of experts, and the models actually validate each other. You know, we use them for fact-checking language. And we do that in real time. So even, you know… Even your own, what you'll do in-house. Yeah, what you've created is democratized innovation; it's allowed smaller businesses and companies to become so inventive. Kind of pulling this all together, all this spend needs to be justified, right? So as you're out there talking to customers, how are they getting to the right ROI? What are they measuring right now? Because it is easy, I mean, anyone that turns on these tools, it's easy to go token crazy. You see token maxing. You hear some of the executives talking about, I need every person to have half a million dollars of tokens a year. I mean, these are big numbers. It's a big spend.

OLIVER PARKER: 

Yeah. I mean, look, we see a significant acceleration in token usage. I mean, we just talked about how many tokens, you know, billions of tokens a minute. I think you'll start to see some level of settling down. I think people are realizing what they can go build. I think you're now going to enter into a phase of people wanting to make sure that the tokens they're using are the right cost tokens. They're being used for the right things. That's sort of number one. And I also think number two is sort of there's almost like a new experimental phase that we're seeing around agents. And I think people are going to start figuring out what these agents do, whether they're long running, should they be long running? Are they doing the right things? Are they doing it efficiently? So I think sort of there's going to be sort of like an efficiency play in the agent world. I'm sort of fully anticipating that. But I think that'll get offset by agents being applied to many more things. So you'll see some compression to a certain extent, but you'll also see some, you know, broader footprints. The other thing is, you know, and sort of take us back to sort of when some of the cloud stuff started, FinOps started to show up. So I think you'll start to see some of those kinds of enterprise capabilities either in platforms or third parties building and helping customers be really thoughtful around how they use these agents, how they apply those agents, what are the financial justifications, and then you'll eventually get down to people sort of thinking in budgets, and I'm talking with CFOs around how do I think about sort of my labor cost, whether they're humans or agents, through the lens of token costing. So I think a lot of the financial constructs are starting to sort of be put through this, and I think there'll be some efficiency gains, but I think those efficiency gains will be far eclipsed by, quite frankly, growth across multiple functions inside many companies.

DANIEL NEWMAN: 

Yeah, the expectation is trillions, 20 or 30 trillion, of GDP growth that is going to come from the productivity that will be gained.

OLIVER PARKER: 

I mean, I think the early signs are there's legitimacy in that thinking. I think like a lot of things, and you know this probably better than anyone, sort of the foundation of compute and power and energy and how that enables that, that's a big part of it. And as you know, that's a really big focus of ours at a company level.

DANIEL NEWMAN: 

Yeah, it's a really exciting time.

OLIVER PARKER: 

It's the most exciting.

DANIEL NEWMAN: 

I want to thank you so much for being here with me.

OLIVER PARKER: 

Thank you. And thanks for having me on. And let's not wait two years before we do it again.

DANIEL NEWMAN: 

Let's do it next year.

OLIVER PARKER: 

Next year. All right. That's a deal. Awesome.

DANIEL NEWMAN: 

All right. And thank you, everybody, for being part of this Six Five On The Road here at Google Cloud Next 2026 in Las Vegas. Stay with us for all of our coverage here at the event. Subscribe. Be part of our community. For this episode, I've got to go. We'll see you all later.
