Building And Scaling Hybrid AI Factories With AI Agents And Solutions
Flynn Maloy from Lenovo and Sandeep Gupte from NVIDIA join Daniel Newman to share insights on scaling Hybrid AI with innovative solutions.
The AI revolution is upon us, and Hybrid AI is about finding the right mix of computing to meet specific business needs.
Daniel Newman hosts Flynn Maloy, CMO & VP ISG Marketing at Lenovo, and Sandeep Gupte, VP Product Marketing, Enterprise Platforms at NVIDIA, for a discussion on how businesses can implement energy-efficient Hybrid AI solutions, and on the innovative collaboration between Lenovo and NVIDIA in creating powerful Hybrid AI technologies.
Key takeaways include:
🔹Challenges to Agentic AI’s Potential. The impact of data governance, cost management, and compliance.
🔹Smarter Hybrid AI with Lenovo and NVIDIA. Lenovo's strategic approach to delivering Hybrid AI solutions, addressing key challenges like flexibility, security, and cost efficiency, and the specifics of the Lenovo-NVIDIA collaboration towards smarter Hybrid AI factories, innovations, and energy efficiency.
🔹The AI Factory is revolutionizing infrastructure. NVIDIA's concept of the AI Factory, powered by GPUs and NIMs (NVIDIA Inference Microservices), is transforming how we approach computing. It's shifting from a tool-based approach to a model-centric one, where AI models become the core of the computing process.
🔹As AI workloads grow, so does the demand for energy. Lenovo's focus on liquid cooling and close collaboration with NVIDIA on reference designs ensures that Hybrid AI deployments are both powerful and sustainable.
Learn more at Lenovo and NVIDIA.
Watch the video below at Six Five Media, and be sure to subscribe to our YouTube channel, so you never miss an episode.
Or listen to the audio here:
Disclaimer: Six Five Media is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Daniel Newman: Hey, everyone. Welcome back to another episode of The Six Five Podcast. Daniel Newman here. Excited for this continuation of a series that we're doing in partnership with Lenovo. It's called AI for All: Hybrid AI. This is going to be a great conversation, excited to have this one. We are going to not only have a friend of mine that I've known for a long time back on the show. We're also going to be joined by an Nvidia executive. We're going to be talking hybrid AI. We're going to be talking about AI. We're going to talk a little bit about the market. There's so much ground to cover here. First and foremost though, let me say hello to the guests. Flynn, welcome back. You got a great smile today, you look like you're having fun. Welcome back. How are you?
Flynn Maloy: Good to see you, Dan. Good.
Daniel Newman: Sandeep, how are you doing today?
Sandeep Gupte: Doing great. It's great to be here with you. Thanks for having me.
Daniel Newman: Yeah, a lot of fun to have you here. Look, the idea of hybrid AI. And let's just say this entire industry is changing so much and it's moving so quickly. Over the last two years since this advent of ChatGPT, AI has come into the consciousness of almost everyone in the world. But there's a few companies that are really powering this. Of course, Sandeep, your organization, Nvidia, has been at the top of headlines. I can't even begin to say how many media appearances I've done to talk about Nvidia over the last couple of years. There are these partnerships, companies like Lenovo, Flynn, like your organization, that are really taking all that technology, packing into these systems, and delivering it not only to cloud providers, which have been maybe a lot of the headlines. But this is also flowing into enterprises who are now looking to build their own.
Hybrid AI is such a hot topic because the complexity is so substantial for companies. We know these open-data large language models that people are using. Take this with a grain of salt, but they're like the world's open internet of data, and now you've seen the ChatGPTs and Anthropics, and all these companies build. But the future of this hybrid world is all about enterprises that have rich, unique proprietary data, and it's really these partnerships and these technologies that help companies unlock this data. We've seen it move quickly. We've seen it move from generative, "Hey, we're going to generate text," to the next thing is, "We're going to build assistants that make all of us more productive." And then it really flows into this agentic future now, where we're not only having AI assist us with things like text or writing emails, but it's really becoming a team of support that can help us in our roles. That can do multiple things concurrently, it can arbitrate and negotiate tasks.
In the end it's like, "Hey, Flynn, next time you want to book that trip to wherever in the world," I can have a bunch of agents find my hotel, find my flight, reach out to my customers, connect the dots on all of these things, make me way more productive. The story is great, but getting it done is what I want to talk to both of you about today. Perhaps we start off right there. Flynn, I'm going to have you start off here. We talk about the smarter approach to delivering hybrid AI. You heard my background, you heard my story, you heard my buildup. Now you're doing it, your company is doing it. Talk to us about the approach you're taking to delivering hybrid AI. How you're handling flexibility, security, and cost-efficiency so we can realize all these cool opportunities that we like to talk about.
Flynn Maloy: Well, thanks, Dan. Happy to be with you and Sandeep today. Let's start with where you set it up: "I can take my vacation, and I have all these agents helping me with my vacation." Think about it in a B2B sense.
Daniel Newman: I was thinking a work trip, by the way.
Flynn Maloy: Okay.
Daniel Newman: Sorry.
Flynn Maloy: Okay, fair enough. Okay, sorry. Because you never do anything but work, you're always on the go. Let's take a look at it from a B2B sense. Let's say you're a hospital, nurses, doctors. Or a manufacturing site with factory folks on the floor, logistics. Imagine that agentic future of all these agents that are helping the doctors and nurses, or helping the factory floor with the logistics. When we say hybrid AI ... There's so much excitement going on out there. But the challenge is really how can I bring that to my enterprise, to my school, to my manufacturing site, to my hospital when I've got data all over the place? I see some fantastic stuff from the public cloud, but I also know that not all my stuff can go in the public cloud. What is this mix of using some of the great tools, the powerful innovation in the public cloud, as well as my own private stuff? My factory, it has its latency issues, it has its data gravity issues.
Hybrid AI is about how can we help put the right mix of AI in place for you, for the workloads and outcomes that you're trying to accomplish for your business. Be they wherever they are, whether it's public cloud, whether it's in a private, or all the way out to the edge. I think that's what we mean when we say hybrid AI. The challenge is how do you build that? How do you do that? A good example is last year, we did a survey here at Lenovo with our Nvidia partners, 3000 customers around the world. We asked the LOB, the CEO, as well as the CIO, "Have you started? What are you doing? What are the outcomes? What are your challenges?" The CEOs were all saying, "Number one is AI, I want this agentic AI. Give me four or five AIs tomorrow." CIOs are looking at them saying, "I can't give you four or five AIs. That's not a thing. What do you want to do with it? What are the outcomes that you want?"
Let's take a look at the manufacturing site. Are you looking for computer vision? Are you looking for chat agents for the factory floor? Are you looking for logistics? What are the outcomes you're looking for? Let's build a hybrid solution, with our partners, that can help achieve those outcomes. That's where the Lenovo Hybrid AI Advantage comes in. That's where our partnership comes in. We've got a lot of the tools. What the B2B community is looking for is how can we put those tools and those outcomes together for the businesses where they need it, when they need it.
Daniel Newman: Yeah. The opportunity to be more efficient, more productive is the obvious opportunity. But it's not as easy as, "Hey, let's AI this."
Flynn Maloy: Exactly.
Daniel Newman: CEOs love that kind of stuff. But in the end, it's almost like digital transformation was. It's like, "Well, first of all, what was analog? What are we trying to change over to?" Well, AI provides a great opportunity for companies to get more efficient and more effective in their business, and then it's all about driving that productivity. Sandeep, I love the way Jensen created the AI factory. It was really profound and thoughtful, because we really are changing the way infrastructure looks. It's really substantial, the cloud of old and the cloud of new. And of course, we're seeing this iterative, and then some it's completely transformative. Some of it's, "Hey, we're still using old compute, we're going to add AI compute." It's moving quickly. But AI Factories seem to be a pivotal set of technologies to drive agentic, to basically make companies capable of doing all the stuff Flynn talked about. My little dream of simplicity in my business travel, Flynn. I don't know this vacation that you speak of. How does Nvidia support this using its GPUs? And NIMs. I think it's important to talk about NIMs because that's a lot of the ingredients to help companies do this in a more streamlined fashion.
Sandeep Gupte: Yeah, absolutely. We are in the middle of a massive industry revolution that's powered by AI. At the heart of it is this AI factory. As you said, Jensen introduced this concept at the beginning of this year and it really hit the mark. Because it really talks about how our traditional computing model, which was general purpose computing on CPUs, transformed into accelerated computing that's based on GPUs. Previously, you used to write software, a detailed set of instructions that the CPU would go and work on data, and essentially deliver us the results. Essentially, it was information that we had to interpret. That model where the computer was a tool has now completely changed into a model where the heart of computing is a large language model. A neural network that's running on GPUs, and we interact with this model in a very natural way. With text, with speech. This model understands the meaning of our words, and is generating a series of tokens, these units of intelligence. That's the new computing model.
The measure of success for this AI factory that's generating tokens is the rate of generating these tokens, the throughput. It's about the performance. An X-factor gain in performance means an X-factor reduction in energy consumption and cost. An AI factory is absolutely critical. These tokens can be anything. These tokens can be text. These tokens can be images and videos. They can be molecules and proteins for drug discovery. They could be mechanical designs for automotive parts and manufacturing. They can be physics. They can be furniture that's used in architecture. It's a completely different way of thinking about computing. It's not easy. This requires us to do a full stack reinvention. This is not the traditional computing stack that we used where the computer was a tool. This is completely different. We had to rethink everything, all the way from the silicon level, up to libraries and APIs, the environments to build and optimize models and to do inferencing, and all the way up to the application layer. I think that's the exciting part, and that's the piece that we're working on to rethink end-to-end, top-to-bottom, this new stack that is needed for powering AI Factories and to deliver the performance that we have promised our customers.
Daniel Newman: Yes. That was a bit more of an in-depth take on Jensen's quote, and I think I quote, "The more you buy, the more you save."
Sandeep Gupte: That's right.
Daniel Newman: I think that's his infamous quote. I think the point there that some people misconstrued is when you have more efficient, more powerful computing, you can get more done with less spend. You bring the compute cost per token down as low as possible. These much more efficient computing machines are capable of doing that. Flynn, Lenovo has an AI factory in partnership with Nvidia. You also are working on this hybrid AI advantage. We're working together in our testing and performance lab on some proof of concepts. I think right now, when you get to enterprise, the big cloud providers get it. The big cloud providers understand what they're doing, they're building it out.
But as you get to enterprises who are trying to deal with everything from data compliance and sovereignty, to wanting to have certain workloads, we still know 70 to 75 percent of enterprise workloads remain on-prem. We're going to have to take these factories out to the edge. We're going to have to take these factories and put them in on-prem data centers. Then of course, we're going to connect them back to these hyperscale cloud providers. That is what's going to happen. How are you guys thinking through that process of piloting, testing, deploying to enable ISVs, to enable enterprises? Maybe share a few examples of how you're doing that over at Lenovo.
Flynn Maloy: Yeah. That is the magic right there. Actually, on our stage about a month-and-a-half ago, Jensen and our CEO YY announced Hybrid AI Advantage with Nvidia. Which is our AI factory, with a layer of AI library on top. That AI library is key, because underneath it, we've got Lenovo infrastructure, we've got the data layer, we've got all the fantastic software partnering with Nvidia around AI enterprise and all of the NIMs, other third party partners in there. Then on top, we have the AI library. As you look at all of that, it's designed for our customers to take a look at what's the use case, what's the outcome that we're trying to achieve, and pre-build and pre-configure full stack solutions, exactly as Sandeep says, for the data needs that are specific to what you're doing.
A great example is what we're doing together with Nvidia for the city of Barcelona. Where we've got an edge solution out around the city, where we use the visual network. It's Lenovo Viva, Lenovo Edge Guardian, together with Nvidia Metropolis, Nvidia AI Blueprints on top of our infrastructure, our edge tech. Out into the city, where we're helping bring AI out to emergency services, to citizen services, to traffic monitoring. It doesn't just take the cameras and aggregate. We're supplying that intelligence, that neural network intelligence that Sandeep outlined, to exactly what's going on inside of the city of Barcelona. Which is providing better control, better understanding, better traffic monitoring, faster emergency response, better citizen services. This is a great example of hybrid, because the data is happening out at the edge, on the streets of Barcelona. You can't just roundtrip everything that's going on in the street up to the clouds and back. You need to be able to bring the AI full stack solution out into a hybrid environment. So you're doing processing with GPU sharing, capacity balancing out at the edge, while also bringing in data processing at the data center, and connecting it up to the public cloud. When you engineer all of that, it's very complicated.
What the Lenovo Hybrid AI Advantage Program is trying to do is build these libraries of blueprints, T-shirt sizes that you can sit down and say, "Let's run a pilot." Let's take a look at what is that activity you're trying to do, run a pilot, and then scale it from there. That's exactly how we're trying to speed things up: take "I got an idea," or "Give me four or five AIs," and turn that into an actual solution, for example the city of Barcelona, on your streets, with the citizens, driving around, emergency response. Getting that time to value is the key to accelerating the ROI of AI.
Daniel Newman: Yeah. When I get to Mobile World Congress in February, I expect this to all be done.
Flynn Maloy: There you go.
Daniel Newman: You'll have all of the challenges-
Flynn Maloy: Traffic will part in front of you.
Daniel Newman: All of the challenges with Barcelona will be fixed, there will be no waits for my cabs anymore. I'm joking. But it is really exciting to see how this can work in an environment like that. Barcelona has always really prided itself on being a smart city, so this is definitely a next wave. You're absolutely right, Flynn. What you are doing, and of course what Nvidia's doing with NIMs, is bringing together the frameworks, libraries, the hybrid architecture to be able to put all those workloads into a container, making it accessible to the enterprise to be able to run proprietary. Whether it's drug discovery, as Sandeep had suggested, or whether you're doing architectural designs and you're using Omniverse, the ability for these things to tie together, it's the simplicity. This stuff's hard. Asking people to DIY it is tough, and you've heard different companies talk about that; it is better when you can bring more of the package. And then you bring your value, then the SIs layer value on top of it, the enterprises bring their value. In the end, they can get much more productive. Sandeep, I'd like to take this back to thermals, cooling, energy. We've heard so much about, as this scales, how do we deal with the energy requirement? Why is it so important to get the infrastructure energy efficient to be able to deliver hybrid AI at scale?
Sandeep Gupte: Yeah, for sure. The hybrid AI execution relies on a perfect integration of hardware and software. There are tons and tons of components that actually make this possible. If you look at Blackwell, our latest GPU, it's not a single chip. It's a system. There are seven chips that actually contribute to its performance. Then you have the fabric that connects all the different components. Then when you start to rack and stack this in a data center, now you're dealing with scale, you're dealing with thermals, you're dealing with heat, you're dealing with noise. There are all the different things that you've got to consider when you start to build out these solutions.
It's a complex problem. But it's one of those things that we have to solve. We focus a lot on energy efficiency in our GPU architectures. But it goes beyond that. We have to test the system as a whole, we have to validate it, we have to test it across different workloads and different topologies. It's a complex problem. We do it at our end by doing a lot of this heavy lifting in our labs. We build out what we call enterprise reference architectures, and that allows us to come up with optimal configs where we can test different workloads, scale-up workloads where all these components are working on a single problem, or scale-out workloads when they're working on different problems individually. But those are all the different things that we do in our labs to optimize, fine-tune, and really find the most optimal performance and energy efficiency balance.
Then we deliver that to Lenovo. That's where the collaboration between us and Lenovo comes into play. Lenovo engineers then take that and productize it. That's where their magic comes into play. Then all the T-shirt sizing and so on that Flynn was talking about, all that comes into play. But together, that's really the place where the work gets done. The fundamental products that we build, and then the productization and the integration work that Lenovo does, I think that's the magic to find that best performance and energy efficiency.
Flynn Maloy: If I could just jump on that too, Daniel and Sandeep. One of my favorite things is what Jensen often says. "One of the things I love most about Lenovo is that you're both an ODM as well as an OEM." Which differentiates us from a lot of our competitors. Which means exactly as Sandeep says, we have an engineering team that is helping build out the reference designs for the new technology. For example, Lenovo was awarded the MGX Reference Design from Nvidia. What that allows us to do is work collaboratively at an engineering level with Nvidia in order to build out, especially like you said, for energy efficiency. Lenovo's been a pioneer for over 10 years in Neptune liquid cooling. What we know well is you can't just wake up one day and decide, "You know, I think I'm going to do some liquid cooling." It's really complex. It's not just a bolt-in thing, which you often see in the industry. It's just, "All right, let's just bolt something into our existing server."
No, that's not going to maximize your energy efficiency or the reliability over time, because by the way, you're taking liquid into the data center. You better have the most reliable, the most well-engineered solution. With our engineering partnership with Nvidia, we're able to build integrated designs with liquid cooling into the technology and the firmware, partnering with Nvidia at the engineering level to deliver the most energy efficient solution. Going back to your first question, Daniel, what does smarter mean? Smarter doesn't mean that our AI has a higher IQ than their AI. No. It means doing it in a smarter way, which means getting the ROI you're looking for, doing it in a more sustainable, energy efficient way. Doing it with faster time to value. That's how you do smarter AI. This engineering partnership with Nvidia is really allowing us to deliver some of the most energy efficient solutions anywhere in the world.
Daniel Newman: Yeah, it's really impressive. I've been following the work that Lenovo's been doing with liquid cooling. Ahead of its time. It was really doing this work very aggressively well before we saw this GPU compute era take off so exponentially. It's definitely going to be something to follow. As we tie this together, gentlemen, first of all, thanks so much for being part of today's Six Five. Great having this conversation with both of you. Sandeep, would love to start with you. I like to give some recommendations out there to our audiences about how enterprises should really be thinking about tackling this hybrid AI journey, and really getting scale and productivity out of this hybrid AI investment. Maybe just a couple things that come to your mind, and then, Flynn, I'll pitch this one over to you.
Sandeep Gupte: Yeah. That's really a good thing to talk about because it can be daunting. Customers are all excited about AI, they all want to jump on it. But it can be daunting when you think about the hardware. When you think about, "Where do I start? All this looks complicated. I'm used to my traditional data center infrastructure and software stack. All this sounds very different." The pace of innovation is so fast that every day, we learn about new models and new technologies. It can get confusing. Where do you start? Well, it's best to take a step-by-step approach. The best way to start is to think about, within your organization, is there a problem area or a use case? Pick one use case. Think about where the data sources are for that use case and focus on that. Come up with a POC plan, a proof of concept plan, working with Lenovo's consulting teams, with Nvidia consulting teams. We're there to help. We have a lot of things that we have put in place to get this journey started in an easy way for our customers.
You referred to blueprints. That's a perfect example where these are reference workflows that we have crafted that show exactly how you can set up agentic AI within your enterprise. The full recipe is there. The sample code, demos, documentation for libraries and APIs that you need, everything has been packaged together in this blueprint. We have reference architectures, as I referred to earlier, on the hardware side. We have created some of these starting points in partnership with Lenovo to get this journey started in a simple, step-by-step manner. Then of course, we've got some great starting points. I know Flynn will talk about some of the resources that Lenovo has.
On the Nvidia side, ai.nvidia.com is a great place to go and learn about what other customers are doing. There's lots of resources there. We've got another website called build.nvidia.com, where all the NIMs are posted. We've got NIM blueprints on that website as well. There you can see everything that is the state-of-the-art. There are demos so you can actually download these things, you can experience these NIMs. Whether you're doing virtual screening for drug discovery, or if you're doing PDF data extraction, or if you're trying to build an agent for customer care. All those things are present on these websites. These are some really simple things to think about and will help you get started.
Daniel Newman: Sandeep, thanks a lot for that. Flynn, take us home.
Flynn Maloy: Well, give us a call. That's the right answer. Leverage all of the hard work, the years of design work between Nvidia and Lenovo. Give us a call. We have a services team that does AI workshops, so exactly as Sandeep says. What's the workload you're looking for? What are the outcomes? Are you looking for productivity? Are you looking for computer vision? Are you looking for fraud detection? Are you looking for a chatbot? Any of these solutions that you're looking for, we can run a workshop with you. Then we've got a series of well-named AI fast start deployments and AI fast start solutions that take all of the goodness that we've been talking about, with the stacked Hybrid AI Advantage solutions, pre-engineered blueprints, and put it all together. The fast start helps you get up and running as quickly as possible on your pilot, and then you can expand from there. Give us a call. We've done the hard work and the hard engineering. That's how to get started.
Daniel Newman: Flynn, the marketer came out just a little bit there. The call center is waiting, Flynn is here with the earpiece on, ready to talk to you. No, in all seriousness though, this really is part of it. It is about collaboration, it is about the ecosystem working together. Nvidia is developing a lot of great technology. Companies like Lenovo are adding some secret sauce, some really important sauce to make this stuff more digestible and to help companies really unlock the full value of what AI can do for them. This is something we, as a research and analysis firm, have been focusing on very closely and look forward to continuing to see how this works out. Flynn, Sandeep, I want to thank you both. Really appreciate you taking the time to sit down with me today. Let's chat more soon because this is going to be a rapidly evolving topic, isn't it?
Flynn Maloy: It is, indeed.
Sandeep Gupte: It's intense times, exciting times. Thank you so much for having us.
Flynn Maloy: Yeah.
Daniel Newman: Absolutely.
Flynn Maloy: Thank you, Dan.
Daniel Newman: Thank you all, so much-
Flynn Maloy: Thanks, Sandeep.
Daniel Newman: ... for tuning in, for being part of our community. Hit that subscribe button. We'll put some information in the show notes on all the things that both Sandeep and Flynn talked about, in terms of their recommendations for getting started. But for this episode as part of this series about hybrid AI, and Lenovo, and the great work they're doing with smarter AI, got to say goodbye. We'll see you all later.