AI Inferencing at the Speed of Real Life - Six Five On The Road
Scott Tease of Lenovo joins Patrick Moorhead and Daniel Newman to discuss why inferencing has become the moment where AI shifts from experimentation to real-time execution.
Inferencing is where AI proves it can act, not just think.
Patrick Moorhead and Daniel Newman sit down with Scott Tease, Vice President, Product Group, at Lenovo ISG, during Lenovo Tech World in Las Vegas, to focus on the moment AI leaves experimentation and enters live operations. As models mature, AI inference has become the layer that decides whether AI delivers real value or remains stuck in demos and pilots.
Their discussion shifts from training to inferencing and into the realities of running AI at business speed, where latency, data movement, infrastructure readiness, and energy efficiency quickly surface as limiting factors. Bringing AI closer to where data is created is increasingly unavoidable, but inference across edge, data center, and cloud raises new demands for consistency, security, and governance. As inferencing becomes the dominant AI workload, organizations that build this capability early gain not just performance, but operational leverage.
Key Takeaways:
🔷 Inferencing is where AI becomes operational: AI only creates impact when models can act reliably in real time, not just generate outputs in isolation.
🔷 Latency and data movement are the real bottlenecks: Once AI leaves centralized systems, response time and data flow define success more than model accuracy.
🔷 Edge inferencing is accelerating adoption: Bringing AI closer to data reduces delay but increases architectural and operational complexity.
🔷 Governance must scale with deployment: Security, consistency, and control become harder as inference spans environments.
🔷 Operational readiness determines winners: Enterprises that invest in inferencing infrastructure now are better positioned as AI becomes embedded everywhere.
Watch the full video at sixfivemedia.com and subscribe to our YouTube channel so you never miss an episode.
Disclaimer: Six Five On The Road is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Patrick Moorhead:
The Six Five is On The Road here at Lenovo Tech World in Las Vegas. It's been an incredible event so far, with YY kicking it off at the Sphere. Quite a Sphere experience.
Daniel Newman:
Yeah, you know as well as I do. No I don't. Yeah, it was a great experience. You are the one-liner guy. Well, it was a lot of fun, Pat. And you know, the setting's always great. The haptics, the sound, the video, you know, it's really a pretty amazing venue, and it's kind of ruined it for all the other events now, Pat. Because when you can use that destination, you have the seating, the sound, the noise, the wind …
Patrick Moorhead:
Blowing through your hair.
Daniel Newman:
Not my hair. You know, it really does make for a tremendous night. And it was great. It's just been an exciting Tech World. Yeah, it was cool. I mean, the luminaries, the announcements, together. And the cool part is that there's a lot of Lenovo technology powering the Sphere, which I didn't know until yesterday.
Yeah, which was really interesting. But it is all about AI, whether it's literally from, you know, pocket to cloud and everything in between. Gosh, it's hard to believe that this new generative AI era is, what, four years old now, or we're coming up on that fourth year. It really started off with training, training, training; LLM, LLM, LLM.
Patrick Moorhead:
But as we get into true implementation, particularly across the enterprise and how they're going to use this, inferencing really has come to the forefront. It makes sense. I mean, inference is where you run the application. You don't need a giant LLM to do a supply chain model, right? You want a smaller, more focused model. Let's bring in Scott Tease to talk through that.
Scott Tease:
Hey, you guys.
Patrick Moorhead:
Great to see you again.
Scott Tease:
Nice to see you again too.
Patrick Moorhead:
May I just say thanks for being an oft-time guest? So great to see you.
Scott Tease:
Good to see you guys. Thanks for having me, as always.
Daniel Newman:
He's a regular. So you heard us a little bit in the preamble there.
Scott Tease:
Yep.
Daniel Newman:
It was all about training, and now it's not. It can't be. We've got to move. If we want to stop talking about whether AI is a bubble, if we want to stop talking about whether we're overdoing AI, it's going to come down to inference. It's going to come down to putting AI to use, driving ROI. This is where it's all at. So talk a little bit about why this inflection point is so important. How do we make this real? How do we make the investment start to yield?
Scott Tease:
Yeah, I mean, you said the perfect word: making it real. When you deliver inference, that's when you're making AI real for people. You know, if you're an employee and you're interfacing with a solution that's being inferenced on, you know, your corporate data center, you're getting more efficient. That's where AI becomes real. If you're a customer and you really like how you're interfacing with your vendor, you're a more loyal customer. That's where it becomes real. So inference is where people are really going to experience what AI is all about. But it is a very different game than training models.
Training models is a challenge, no doubt. And that's why we've been talking about it so much for the past four years. But moving from model training to putting it into real operation, real deployment? Totally different game. So that's the next challenge for us.
Patrick Moorhead:
Yeah. What are some of the things that are slowing this down? You know, centralized is easier than decentralized, right? You have more things you have to worry about, and you also have to manage all those decentralized points together. Why is this so hard, and why does it seem like it's happening so slowly?
Scott Tease:
Well, you know, with AI, you're seeing AI being pulled back towards the data, right? So your customers are going to want to put the AI right where the data is being created or where it's being stored.
That could be at the edge, it could be in a data center, or it could be out in a cloud. So customers need to plan going in for this AI to be truly hybrid, with some portions happening in all different locations. And I think that's a change in people's mentality. We've been used to thinking that we're either going to be in cloud or on prem. We know going in this is going to be a combination of all of that, right? And there's no doubt, doing things at the edge is a lot harder than doing things in the data center. In a data center, I've got one room, fully secured, all air conditioned, with thousands of items in there. On the edge, I could have thousands of locations with 1 or 2 boxes in each one. It's a different challenge. So this is what we've been focused on: trying to help our customers get through that and figure out how they're going to work on that.
Patrick Moorhead:
Yeah. And a lot of these folks are looking at trying to standardize on certain operational planes, right? Whether it's the data plane, whether it's the agentic plane with an agentic orchestrator. I mean, it's never as simple as that, because you always have legacy that you have to deal with. But when you are starting anew, people are trying to standardize on software, on operational systems that they can manage across that entire AI ecosystem.
Scott Tease:
And to your point, as you get into this and you start running AI at scale, you know, in full deployment, there are a lot of things that have got to be in place that we weren't really prepared for when we were just running traditional IT code. Right? I mean, the latency needs, the security needs, just the data movement needs across different locations, to and from the cloud. It's a totally different game, so customers have got to be prepared for that. We're spending more time guiding them on that than we are doing real AI right now. It's putting the foundation in place, and once you've got the foundation in place, building more AI into the existing workflows is going to be a much easier task.
Patrick Moorhead:
Yeah, you've got to have that foundation in before you pick out the paint and put it up on the wall.
Daniel Newman:
Let's drill into that a little bit, though. I think it's fair to say enterprises have discovered that the workloads will not always be centralized. It can be governance, it can be compliance, it can be sovereignty, it can be economics, it can be security. There are a lot of reasons why they're making those choices. But in the end, it's not going to end up centralized. And so doing that adds complexity. It would be easier to just say, hey, put it all in the cloud and run it all there.
But you're going to have real issues. So now that we've seen it, the companies have acknowledged it, they're doing it: where are they underestimating? I know it's harder, but what are they underestimating about how hard it is, and how do they solve that?
Scott Tease:
Yeah, I think the big part of it is underestimating the operational excellence that it takes to put this into place. Building a model, as we said, is cool, but it's a different challenge. I can create, you know, the killer app, the killer model. But taking it from creation into production is a very different skill set, and I think customers often underestimate what's involved in that. Even once you put the model in, it's got to be kept evergreen.
So how do you do that in a remote location? What data comes back to the data center? What data can I just act on locally and never need to move again? Those are the kinds of things we're helping guide them through, and those are the big challenges. You know this: for AI to be successful, these inferencing workloads have got to be built into existing workflows, not kind of one-offs, not experimentations. I think that's where we're heading, and that's where we're trying to get customers, you know, to get their mindset wrapped around that.
Patrick Moorhead:
So I want to get back to this manageability part, maybe auger in on it a little bit: consistency, security, governance. How do you maintain standards for that across these different environments?
Scott Tease:
Yeah. So one of the tools that we've got is a tool called XClarity One. It's basically a cloud-based tool that keeps an eye on everything that's out there, whether it's in the data center or at the edge, and it's looking at things like: Is my code up to date? Are there any security vulnerabilities? Things like that. And we can actually alert the customer that something's got to be updated, and we can take action on it, because it's all talking to the cloud and making sure that everything's up to date. It helps with deployment as well as ongoing management. So that's one thing we're doing to help make sure things are easy to maintain, especially when you've got 100 or 1,000 locations. It's going to be even more important that we have that sort of thing as we're pushing models out every single week. These models, again, are not going to be static.
They're going to be changing as we get new data in. I've got to be able to push that new, updated model back down to those endpoint servers. So starting with that good management approach is really one of the keys, and XClarity One is a great way to start. Yeah.
Daniel Newman:
So time is money. We're hearing a lot about, you know, time to first token, a lot of concerns about whether we can energize the racks and get, you know, to value. There's so much focus on this right now, and I think it is a bit of an arms race. It's a race, of course, for the chip companies. Then there's a race for the OEMs and ODMs: we want to build the best version of those.
But then there's also a race in the enterprise. The enterprise is saying, we want to be the first to market using this technology. It's a time-to-market advantage. There's no doubt. So how big is that? Because I think all companies are going to get there. All survivors, let's say, are going to use this stuff, and they're going to use it successfully. But how much of an advantage is it to really lean into this stuff and get there now?
Scott Tease:
Yeah, I mean, the efficiency gains we expect to see from putting AI at scale are just going to be something that you're not going to overcome by, you know, just throwing more people at it. It's just not going to happen. A civil engineer that's deploying AI is going to be far better at designing our roads and our infrastructure than an engineer that doesn't apply AI. Same with your doctors, same with your retailers. They're going to be better at it, and it's just going to make our interactions with these companies better.
We're going to be more loyal to the companies that deploy AI. So not making it happen is, as you say, almost a choice of life or death. It may not be right away, but you've got to get started. Once you've gotten started a few times and you understand how to operationalize inference, it's going to be much easier to do it again in the future.
Patrick Moorhead:
Yeah. It's interesting, getting back to that life-or-death thing. I mean, I know Gen Z and millennials may not remember, but there are a lot of companies that went by the wayside, right? In web 1.0 it was the brokers, it was the travel agents. We used to have to go to a human being to do a stock trade, and also to get tickets to go on a trip.
Right. And then the second phase was content, right, with CDs and DVDs leading to streaming. And a lot of those companies, the music record companies and such, went away. And then we saw social, local, mobile. So hopefully people have enough experience with change that they might do whatever they can to get ahead of this thing, so they don't become fodder, right, for somebody else's next book about companies that went out of business.
Daniel Newman:
The Blockbusters, the BlackBerrys, the examples everyone likes to use. What do they say about history? It doesn't actually always repeat itself, but it rhymes. Yeah.
Scott Tease:
I like that.
Daniel Newman:
Well, Scott, I want to thank you so much for joining us for this episode. Let's make sure we create a whole highlight reel of Scott's appearances on our show.
Patrick Moorhead:
Exactly.
Daniel Newman:
He's become a …
Scott Tease:
I'll try not to embarrass myself. Come on.
Daniel Newman:
You've become a wily veteran, and we appreciate you so much for being part of the Six Five. And we appreciate all of you being part of the Six Five. We're here at Lenovo Tech World 2026 in Las Vegas. Hit subscribe, check out all of our content here at the event and all of the Six Five content when you have a chance. We appreciate you tuning in. We'll see you all soon.