
Lenovo’s VP of Cloud Solutions on How Cloud and On-Prem Are Evolving Together in the AI Era


Conor Malone and Binoy Unnikrishnan of Lenovo discuss how cloud service providers and enterprise infrastructure are converging to support AI anywhere.

AI is reshaping the relationship between cloud providers and enterprise infrastructure, not replacing one with the other.

Recorded live at Lenovo Tech World at CES in Las Vegas, hosts Patrick Moorhead and Daniel Newman are joined by Conor Malone, Vice President of CSP for Lenovo ISG, and Binoy Unnikrishnan, Vice President of Worldwide CSP Sales for Lenovo ISG, to discuss how cloud service providers have evolved over the past decade and why hybrid AI architectures are becoming foundational to modern deployment strategies.

As expectations of infrastructure partners have shifted with increased AI workloads, Conor and Binoy share their perspectives on why workload portability now plays a central role in AI strategy, allowing organizations to place training and inferencing where performance, cost, and governance align.

Key Takeaways Include:

🔹 Cloud and On-Prem Are Becoming Complementary: AI is driving hybrid architectures that combine the strengths of both models.
🔹 CSP Expectations Have Changed: As providers scale globally over time, infrastructure partners have been tasked with delivering more consistency, speed, and flexibility.
🔹 AI Introduces New Infrastructure Pressures: Power, efficiency, and scalability are now first-order design considerations.
🔹 Workload Portability Enables Choice: Organizations need the ability to run AI workloads where they make the most operational and economic sense.
🔹 Partnerships Accelerate Adoption: Deeper collaboration between CSPs and infrastructure vendors reduces complexity and speeds deployment.

Learn more at Lenovo.

Watch the full video at sixfivemedia.com, and be sure to subscribe to our YouTube channel so you never miss future coverage.


Disclaimer: Six Five On The Road is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.

Transcript

Patrick Moorhead:
The Six Five is on the road here at Lenovo Tech World in Las Vegas. It's been a great show so far. Wow, watching YY and the bevy of partners kick it off in the Sphere was amazing. AI pretty much from pocket to cloud.

Daniel Newman:

Yeah, the Sphere was awesome. We're not there anymore, so we can't do the whole, oh, we're in the Sphere. But you know what? The event goes on, Pat, and the technology will continue to roll, and the opportunities will continue to scale. And if I walked away with any one thing yesterday, it was, man, this is a big opportunity. And you know me, I'm the guy who already thinks the opportunity is really big. But it was just so exciting to hear from so many partners at so many levels. AI in your pocket, AI in the cloud, and everywhere in between, it's moving quick.

Patrick Moorhead:

It really is, and it's amazing. Just five years ago, people were talking about slow or no growth in data center infrastructure. And now here we are: infrastructure is cool and growing at these exponential rates, complexity is going up, and co-design, particularly across CSPs and what's now being termed neoclouds, is becoming more and more important given, again, the complexity, but also the requirements for time to first token. And I'd like to introduce Conor and Binoy with Lenovo to chat about this. Welcome, guys. Good to see you again. Gosh, I saw you last night at a party. We cranked out a video at the Sphere, but it's great to see you. Good to be back. Yeah, good to see you.

Daniel Newman:

And Binoy, nice to have you on. First time, right? First time. All right. Well, we'll ask you the hard questions. You get the hard questions. Well, let's start off and talk a little bit, and I'd like to hear from both of you on this one. Cloud service providers: the term is evolving very quickly. A couple of years ago, we would have thought of maybe three or four names when you say cloud service providers. Now there are actually hundreds. And there always were more, but it was always very centric to a few big hyperscalers. Neoclouds have come up. AI has created this whole new wave. Some of them were Bitcoin miners that became clouds. Some of them were built brand new. Talk a little bit about how that whole business is evolving and how it's changing the AI ecosystem.

Conor Malone:

Yeah, I mean, they've had a lot of different names over the years. I'm partial to hyperscalers, but to me it's always just been about anybody that needs infrastructure at scale, right? They're not talking about onesies, twosies, it's rack scale, it's how many racks, how many thousands of racks. And you're right, that used to be a handful, and now it is in the hundreds. Their kind of approach to the world's changed over time. At the tippy top end of it, it's a lot of collaborative custom design work. Sometimes they're doing design work. Sometimes we're doing design work. Sometimes it's both of us doing it together. But it's a very interesting model in that they often have the same design teams that we've got in terms of designing these servers and these racks and these architectures.

Binoy Unnikrishnan:

Yeah, I mean, I kind of think of cloud as not a destination. People think of cloud as hyperscalers, as public cloud, as a destination, right? But cloud is really an architecture, and it's an architecture that has evolved over time. It started out with workloads migrating to the cloud as a one-way destination. Those workloads changed and became cloud native. And then it morphed back in terms of running on-prem as well as in a cloud, what we call the hybrid cloud. I think that model applies well to AI too: workloads run where they're well-suited. So I like to think of my customers as a complete spectrum, anywhere from hyperscalers all the way to neoclouds to even enterprise customers running their own private cloud.

Patrick Moorhead:

Yeah, I mean, cloud is an operating model, not necessarily a destination. Isn't that what we used to say years ago? But I think it is being extended, with even cloud providers managing on-prem infrastructure. And sovereign cloud has really driven a huge need for that. And as much as the hyperscalers protested that, it's what customers want. It's nice to say, oh, I only need one CSP, but the typical enterprise has three-plus. And I always like to say, if you're homing in on one of them, you're one acquisition away from adding another cloud partner. So, yeah, the reality has set in here. The other phenomenon, too, is global scale. You do have some very large companies that want to operate on a global scale. How has that shifted the expectations for you guys? Maybe we'll start off with you, Conor.

Conor Malone:

Yeah, sure. So that's one of the strengths that Lenovo brings to this market, in that we have a massive global manufacturing footprint. As you look at things like sovereign clouds, or various other forcing functions, you end up having to build in region, close to where the gear needs to be deployed. There are other factors it helps with, too; you mentioned time to first token earlier. Time to first token is a lot faster if you're manufacturing close to these massive data centers that get built. So it's all about how quickly you can get these things up. Being able to be super flexible about where we build these things helps us deploy them faster to where our customers are located.

Binoy Unnikrishnan:

Yeah, and I'd say from a customer perspective, YY talked about it yesterday a little bit, on the evolution from the PC era to the cloud era to the AI era. And in the AI era, you kind of think, oh, training equals public cloud, inference equals edge. But I think it's really a continuum of where these workloads are deployed. You want that training to be more domain-specific. You want to leverage these big models that are built on the public internet, but you also want it to run on your own infrastructure with your own data. So that's driving a lot of sovereign requirements around where the training is done, and also requirements around where we build our infrastructure.

Conor Malone:

Makes sense. One interesting thing to note, again, on Lenovo's strengths: I think we're uniquely positioned to do the highest end of hyperscale deployments, but we also have the teams and the services and all the logistics support to deploy similar rack-scale architectures into the enterprise.

Patrick Moorhead:

Yeah, I mean, Jensen getting up on stage and co-announcing the solution here at Tech World, I think, put a big exclamation point on that for him and YY.

Daniel Newman:

I think you're hearing a lot about that. I mean, you saw the announcement of the MI 440X. They're trying to build infrastructure that's designed for the enterprise. NVIDIA has already been down that path. You guys are picking and choosing which partners to build with. You have the ODM capabilities to build very specific infrastructure for these customers. And I think that's really important, because what is going to happen is you are going to land hybrid. Hybrid cloud versus hybrid AI have some differences, but really there are a lot of similarities, right? Customers are going to say, look, these are the workloads we want to run on-prem. This is the data we don't want to leave the premises, or this is the data that, because of governance, compliance, sovereignty, we have to run locally. And we want to control our destiny there. Maybe it's cost-related, maybe it's governance-related, but they want to do that. But this also creates a big challenge, because you're basically constantly navigating between cost, speed, outcome, all these different things. The cost and economics are probably going to be a lot better building on-prem in a lot of cases, especially if you're doing high-volume work with tons of token generation. But speed is like, hey, it takes a while to stand this stuff up; at the hyperscalers, it's all running already. What are some of the challenges, the trade-offs, in this AI hybrid cloud era for enterprises? Because building their own might make economic sense, but that can mean saying, hey, we're going to take six months or a year, when we could go to AWS or Google today and start doing stuff. How are they balancing that?

Conor Malone:

Yeah, I mean, the time to light up the workload is one thing, but I think what's very interesting about AI hardware specifically is that it has evolved very quickly. Its power density has gone up way faster than general-purpose compute or storage. And so there are unique challenges in that not all enterprises are going to have the facilities, right? They might have data centers or corporate campus data rooms that just don't have the power and cooling to run these things anymore. But maybe there are things they can run. Maybe they've got some inference they can do on-prem while training is off in the cloud. The key is flexibility, right? Don't get locked into any one model. Have the flexibility, the portability, the containerization, lots of different technologies to help move these things around. And to me, that's the key: the flexibility of the deployment model.

Binoy Unnikrishnan:

Yeah, I'd say our goal at Lenovo is to meet the customer where they're at. If it's a hyperscaler customer that wants their own design, we can help them there with the ODM model. If it's an enterprise customer that wants something off the shelf, we can help there too. And if it's somewhere in the middle, we've got plenty of customers there today.

Patrick Moorhead:

A question, and I don't know if this is an architecture question, but it gets back to the classic: how do CSPs layer on AI services between the cloud and on-prem? We've seen GDC from Google, we've seen AWS come out with their flavor. I was always wondering, because they just didn't seem serious until now, particularly with sovereign AI: how, structurally, does that work if they want to do it in both? How do they manage and not have the utter chaos of supporting two completely different environments?

Conor Malone:

I've seen a couple of different models. There was kind of an earlier era of this where the CSPs were like, OK, you can have stuff on-prem, but it's going to be our hardware, our racks, and it's basically just the same version of what they deploy, but on the customer's premises. That's not quite as flexible, I think, as what customers need and want. And so you've seen more effort around data movement and actually having connectors between clouds. A whole ecosystem of people has popped up as this kind of glue, helping implement that portability. So I think that's a good thing.

Patrick Moorhead:

Yeah, I think my tweet was "hell did freeze over" when we saw the network sharing between the two hyperscalers that had never done it before, right? And it was great, because that's what enterprises want. They don't want these gigantic ingress and egress charges or things like that. And then the sovereign AI announcements: you've had more in a year than I saw in the previous five, and the implementations are actually good, and you can figure them out, right? I mean, they make sense.

Binoy Unnikrishnan:

Yeah, part of this, I think, and I don't know if you saw some of the announcements in the last couple of days, is actually using not just one model, but getting the benefit of multiple models in your answers. And that sort of solves the multi-cloud problem in some ways. You're doing it at the model level, taking the benefit of all five, and then you have your own version that runs on-prem based on your data.

Daniel Newman:

The abstraction of AI is incredibly experiential. For most enterprises, in terms of users and why they're doing IT and why they're doing AI, it has nothing to do with any of what we're talking about. I mean that with no cynicism. They don't really care where the workload runs. The reason we've seen the AI device thing have a varying level of success in the market (and most people will be on AI devices, no question about that) is that people don't care whether the cloud serves them up the token or the prem serves them up the token. Now, the CIO cares, the CFO cares, but for the average user, it's experiential. So when you're building models, mixtures of experts, and you're doing it on different infrastructure, you're saying, look, I want to make sure that my proprietary data, the open data, and the models that we're using are all able to easily aggregate, with latency appropriate to the workload. If it's a vehicle, it needs basically no latency; if it's a language thing, it's like, oh, if it takes a second or two to generate a token, I'm okay with that. And in the end, all these things come down to: are we driving productivity? The measure of this era will be productivity. Tokens and productivity. That's where we're going to go. So if you guys can enable that, enable enterprises to do it, do it securely, do it sovereign, do it compliant, do it for government, you will be successful. I want to thank you both very much for joining us here on The Six Five. Let's have you back again soon. Not you. You've been on twice in a couple of days.

Binoy Unnikrishnan:

All right. Awesome.

Daniel Newman:

I need a third time.

Binoy Unnikrishnan:

Third time's a charm. Thank you both very much. Happy to be back anytime. Thank you.

Daniel Newman:

Thanks for having us, guys. Thank you very much. And thank you, everybody, for being part of this Six Five On The Road coverage. We are here at Lenovo Tech World 26 in Las Vegas. It's been a great week. Check out all of our other coverage, including the day at the Sphere, and the rest of our coverage here today. Stick with us for more. We'll see you all soon.

