AI Everywhere: Why Inferencing Is the Turning Point from Building AI to Using It
Patrick Moorhead and Daniel Newman are joined by Lenovo’s Flynn Maloy to discuss why inferencing is becoming central to deploying AI across real-world, latency-sensitive environments.
AI adoption is reaching an inflection point, and inferencing is where the shift towards AI at scale becomes real.
Patrick Moorhead and Daniel Newman are joined by Flynn Maloy, Chief Marketing Officer of ISG at Lenovo, to examine why inferencing is emerging as the capability that transforms AI from a research exercise into an operational tool. As organizations move beyond experimentation, attention is turning to how AI is actually deployed in production environments.
The discussion focuses on how organizations spanning retail, manufacturing, and healthcare are deploying intelligent systems in environments where latency, reliability, and operational continuity matter most. Rather than centering on model development, inferencing is positioned as the capability that allows intelligence to deliver value at the point of action, where data is generated and decisions must happen in real time.
Key Takeaways Include:
🔹Inferencing marks the transition from building AI to using AI: Enterprises are shifting away from centralized experimentation toward embedding AI directly into workflows, decisions, and customer-facing systems.
🔹Edge inferencing enables real-time intelligence: Latency-sensitive environments benefit from local processing that supports faster response times and greater operational resilience.
🔹Private enterprise data is central to AI value creation: Unlocking this data requires strategies that keep data local, secure, and compliant while still participating in broader AI lifecycles.
🔹AI at scale depends on efficient inferencing: Running models continuously across environments demands cost-conscious, energy-aware inferencing strategies.
🔹Consistency across environments is critical: Successful AI adoption depends on architectures that support deployment, governance, and performance across edge, datacenter, and cloud.
Learn more at Lenovo.
Watch the full video at sixfivemedia.com, and be sure to subscribe to our YouTube channel so you never miss an episode.
Disclaimer: Six Five On The Road is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Patrick Moorhead:
The Six Five is On The Road at Lenovo Tech World in Las Vegas at the iconic Sphere. Daniel, what a venue. Isn't this great?
Daniel Newman:
Yeah, it's great to be here, Pat. Great way to start the year off. This place never disappoints. And you and I, you know, we're professional event attendees. So when I get these invites to events at the Sphere, that one moves up the list of things I'm going to show up to. And then when you add the speakers they're going to have here at Tech World, you get YY, you get Jensen, you get Lisa, you get Lip-Bu, you get Cristiano, and obviously so many more. I think we probably won't do it justice, right? Yeah, exactly, because there are like 20 awesome speakers. But yeah, I mean, man, I'm here, butt in chair.
Patrick Moorhead:
Yeah, it's great. There's been a lot of tech discussed here at Lenovo Tech World, obviously. And, you know, Lenovo is unique in that it has AI that goes from pocket to cloud, from the largest hyperscalers to even a smartphone, and pretty much everything in between. Now, the industry conversation for a while had been training, training, training, but as we move into this agentic AI era, and enterprises in particular actually have to make use of it, inference is now the name of the game. We've been talking a lot about that here at Lenovo Tech World. And I can't imagine a better guy to talk about this than Flynn Maloy, longtime Six Five guest. It's great to see you. Great to see you. Great to see everyone out there. So first things first, do you want to talk a little about this? Your event here and your team have a lot to do with it.
Flynn Maloy:
A lot of work is going into this. We're excited about it for everyone, so please tune in. It's coming up. But this is the Sphere. We're going to be launching a lot of news, a lot of new products, lots of AI-related innovation that we're going to put forward. But we're also here announcing the launch of our FIFA technology sponsorship. It kicks off today. You know, Adidas has the ball launch and Coca-Cola has the trophy launch; this is the technology launch for the FIFA sponsorship. It's going to run all the way through the Men's World Cup over the next few months, culminating in the matches in July. So exciting. Yeah, it's an exciting event. And there's no better place to do it than here.
Daniel Newman:
Seriously. The whole show is kicking off. You're talking my love language. First, you talk about training. We're talking about the gym, right? Oh, you meant AI training. Yeah. Oh, I see where you're taking this. And then we're talking about soccer? Football, as we call it around the world? Football. Football. Football. You know, I'm going to become better friends with these guys as sponsors of FIFA. I mean, I always liked Flynn, but now I like Flynn a lot.
Patrick Moorhead:
F1 wasn't enough clearly. Remember who's your F1 buddy, okay? All right, let's just get that straight.
Daniel Newman:
Let's get back to inference. I mean, look, you're right. Training is a small subset of the market. Now, everybody needs the training, and the training created this massive boom; the 2022-to-now boom has been a lot of training. But enterprises are trying to do AI, and my big predictions for the year were all about this enterprise AI inflection and agents. Here's a stat on Gemini: something like 10 trillion tokens a day. And that's kind of a hybrid of consumer and enterprise. Now, when you think about the enterprise unlocking all the rest of that data, how big is this? How big of a turning point is this as enterprises go big into AI? And what's the opportunity for Lenovo?
Flynn Maloy:
Well, first off, we completely agree with the assessment you just laid out. The capital build-out so far has been about training models, building models. But go ask businesses, big businesses, small businesses, go ask them: hey, what do you want to do? Is their answer, well, I want to build models, I need the technology to build models? No. They're going to say, I don't want to build AI, I want to use AI, at scale, at speed, all around the world. And when you think about that, how are they going to do that? That's our conclusion and our belief about why this hasn't expanded everywhere in the business yet. I think there are a number of studies out there that say maybe only 5% of enterprises have arrived where they want to arrive in terms of using AI. The enterprise build-out is still on its way, and there are a couple of reasons for that. One of them is that not everything is going to run in the public cloud. There's a lot of fantastic innovation coming out; that's one part of the answer. We've always said, and we've been talking to you guys and the market for a couple of years now, we announced it first on stage with Jensen, YY did, that hybrid AI is the right answer. Which means some AI can and should run in the public cloud, but you don't have to take your data to the AI. Bring AI out to your data, to your private data center, to your edge, to your store, to your bank, to the mall, out where the customers are, where the data is. And that's not a training answer. Training technology is built around simultaneous upload of data so that you can train those models. Inferencing technology is about throughput. It's about speed. Ten thousand chatbots an hour. A thousand agents running on the platform at the same time, helping all the different functions inside of a company. That's what's coming. And I think that North Star has gotten clearer, but the technology to get there is not fully mature yet. And, you know, more and more businesses are leaning in. So we agree. We think there's a big push around the inferencing wave into the market across big businesses and small businesses.
Patrick Moorhead:
Yeah, so Flynn, in the early days of machine learning, seven or eight years ago, there was a lot of discussion about unlocking value on the edge. Very similar conversations are happening today, whether it's manufacturing, retail, or even healthcare, which quite frankly have to operate disconnected from the public cloud and even the corporate data center. How does this new generation of AI, I'll call it agentic inference, unlock value for those verticals?
Flynn Maloy:
I think a lot of that is about bringing AI out to where the data is. It's about where the data is. Where's it coming from? Where's it generated? What are the compliance rules around it? So latency matters then. It does, but I'll say there are several known bottlenecks in the inferencing technology of today, which is not as mature as it's going to be, right? It's memory: getting the right amount of memory to the right place so that you can process right where the data is coming in. Latency, as you say: do you want to round-trip, can you round-trip everything up to the cloud? Some things you can, other things you cannot. Security: a lot of that data is really private. Do you want to cycle all of it through the learning models, even the private ones, up in the cloud? Or do you want to keep it local? And then energy: if you're going to move that kind of processing power out to the edge and do the kind of throughput inferencing that you want, that stuff gives off a lot of heat and draws a lot of energy. So these are the known bottlenecks around inference technology, and what Lenovo is doing is solving those bottlenecks. A lot of what we're going to hear in the Tech World show today is about how we're solving them, because we want to jump to the front and lead the next generation of technology and the solution stacks on top of it to deliver those inferencing solutions all the way out to the edge. And that doesn't mean you're disconnected from the cloud. It means this is an important component of your full hybrid AI strategy of the future.
Daniel Newman:
It's such an important inflection when you factor in every other use case for AI and all that data. And there needs to be an inflection, because basically at some point people need to start to see value derived and directly attributable, like, hey, we unlocked 25 years of data from our grocery store shoppers, and we were able to more precisely place products around our store. We were able to deliver more rapidly, to get the supply chain optimized. And you actually start to see margins grow, you start to see revenues grow. But this isn't all easy. I mean, what are the infrastructure shifts that need to happen to really make this work? Because it's not all going to happen from superclusters in the cloud run by a very small number of companies that are doing AI for a very particular language, image, or video creation purpose.
Flynn Maloy:
Yeah, yeah. There are a number of things there. What I would say is that the North Star for what a lot of businesses are thinking about, what their company is going to look like in the future, any business, in any vertical, just think about it: what do they want to use AI for? By all means, run some of the subscriptions, some of the tests, some of the chatbots out of the cloud. But the vision for the future is that every function, marketing, sales, supply chain, development, manufacturing, legal, HR, all of them will have AI as part of their core workflows. And part of the technology challenge there is, can you get the right amount of power all the way to the end? Because it's not just telemetry analysis; that's good, we've been there, we're there now. The vision for the future of AI for marketing, for sales, for supply chain is, what about workflow? So beyond just looking at telemetry, okay, now we want to understand it. Now we want to schedule the recycle. We want to send out for the next thing. We want to schedule the guy who's going to come and restock it, actual workflow processes that are running with AI. You can't do that by just gathering a prompt, sending it to the model, and getting a response back. It's very different when you take AI inferencing technology all the way out to the edge with the right amount of memory, with the right amount of security, to process that in real time. Now you're in the middle of the enterprise workflow. That's a giant bubble. Every business, every function in every business, is going to be thinking about re-workflowing, and that's when you can start measuring ROI. That's when you can start measuring gains in revenue and sales and cost reduction. That's what the enterprise is kind of waiting for. The killer app isn't there yet, and part of the reason is that the technologies are not mature enough. The solutions are not mature enough, but that's coming. And when it does, it's going to change every function in every business.
Patrick Moorhead:
Yes, so I want to talk about scaling AI inside of the enterprise. We talked about PCs on the edge, we talked about the industrial edge, and also the corporate data center. Why is inference key? Is it just because there aren't enough cycles in the cloud, or is it everything else: the latency, the data protection, and so on? Can you talk a little bit about that?
Flynn Maloy:
Yeah, so every year, and we've talked to you guys about this before, we do a survey of 3,000-plus C-level clients and customers all around the world. And we ask them: if you could choose where to land your AI, where do you want to land it? And 65% of them said, we need to land it in a hybrid way, which means some of that data we need to keep private, some of the workflows we need to keep private. And by the way, where are our customers? Customers are not up here; the customers are all the way out here. So we want to move that all the way out to where the retail store is, where the bank is transacting. Think about a hospital, right? We've given this example many times. There are nurses, there are doctors throughout the whole hospital. They have devices that are connected up to the hospital AI, right? They're personal, with an intelligent device. You know, when was the last time I met with Pat and what were his problems? That's customer-private, very private data, right? They need to interface with the hospital, sometimes go to the public clouds, other times to hospital-specific applications. It's a mix. It's a mix of personal device intelligence, private intelligence, and public. And that is the nirvana that everybody's trying to get to, and we're not quite there yet. But that's why we believe in the inference bubble, because that's all inferencing, right? You're not building training models; that's not what you're doing with those applications. It's about throughput. Those agent interactions, the platform running, that's a totally different motion. And that's the kind of AI that we want to put out in businesses around the world.
Daniel Newman:
That's why, at the infrastructure level, there's so much focus on tokenomics: being able to generate tokens inexpensively. But the other problems are still some of the same problems we had with the past generation. It's making data available to the AI; before, it was making data available to compute. It's just a different compute. It's playing within the rules: the compliance, the governance, and now sovereignty is a big issue. But Flynn, what is so amazing about what I think is going to happen is that every company on the planet has the opportunity to scale at a breakneck pace right now. And it's going to take a lot of partnerships and technology, and it's going to take a lot of work. And I think that's where your opportunity at Lenovo is so exciting: you work with thousands of enterprises, whether it's your handsets or your PCs. Around the world, people are touching Lenovo all the time. And I think that's really the opportunity. I'll give my one second of advice to you guys: look, you need to land with every one of these enterprises that you can help them unlock inference. Because right now, people are unlocking AI through the lens of data that's publicly available and easy to access. They're not unlocking all that rich, high-value data that they've spent years or decades building in their enterprise. And when they figure out how to do that, off we go.
Flynn Maloy:
Yeah, I totally agree. And it's going to come in all those different forms, even out to the personal. And that's why what we're going to talk about in the show, of course, is going to focus on personal AI and enterprise AI, how you bring on-device intelligence all the way to the biggest supercomputers. AI is going to change all of that. And the opportunity for companies to accelerate is going to be right there in front of us.
Patrick Moorhead:
Yeah, you connect that data fabric all the way from pocket to cloud, with agent orchestrators that cover the same real estate, and you're really cooking with fire there.
Daniel Newman:
Seamlessly connecting the back end to the front end. That's right. I know. Finally, we can do that. Step by step, too. Step by step. That's going to happen. I'm telling you, it's going to happen. Finally, it's going to happen. Flynn Maloy, thanks so much for joining us here. Always a lot of fun. It's going to be a great event.
Flynn Maloy:
Yeah, I'm looking forward to it. Stay tuned, everyone.
Daniel Newman:
And thank you, everybody, for being part of our Lenovo Tech World pregame here at the iconic Sphere. Check out all our other content. Stay with us. We'll be back in just a little bit.