AI Fabrics and Addressing Key Networking Concerns - Six Five Media at Dell Tech World 2025
Saurabh Kapoor, Director of Product Management & Strategy at Dell Technologies, joins Patrick Moorhead and Daniel Newman to share his insights on AI fabrics and how Dell's AI Factory is addressing key networking concerns and performance bottlenecks in AI workloads.
What will it take to run AI without limits? 🤔
Patrick Moorhead and Daniel Newman sit down with Saurabh Kapoor, Director of Product Management & Strategy at Dell Technologies, as he details the Dell AI Factory, an integrated platform for compute, storage, and networking, showcasing its capability to address AI performance bottlenecks and support diverse AI workloads with advanced AI Fabric Connectivity. Tune in for the latest updates from Dell Tech World 2025:
Highlights include:
🔹Integrated AI Solutions: How the Dell AI Factory seamlessly integrates compute, storage, and networking to overcome AI performance bottlenecks and its real-world impacts.
🔹Supporting Diverse AI Workloads: The variety of AI workloads the Dell AI Factory supports and design considerations for optimal performance.
🔹SONiC's Role in AI: The contribution of SONiC to AI optimizations within Dell's networking solutions and the benefits it brings to AI fabrics.
🔹Flexibility with AI Fabric Connectivity: How Dell provides technology choices, from InfiniBand to Ethernet, through its AI Fabric Connectivity, enabling customers to build resilient AI infrastructures.
🔹The Future of AIOps: The anticipated enhancements in orchestration, observability, and the comprehensive services that will help the Dell AI Factory evolve further into the AI operations realm.
Learn more at Dell Technologies.
Watch the full video at Six Five Media, and be sure to subscribe to our YouTube channel, so you never miss an episode.
Disclaimer: Six Five On The Road is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Patrick Moorhead: The Six Five is On The Road here in Las Vegas, Nevada at Dell Tech World. Daniel, it has been AI all the time, whether it's PCs, data center, edge and everything in between.
Daniel Newman: Yeah, it has. It's been a great first day, covering a lot of bases. We knew that AI would be the theme, but we're seeing it really permeate throughout the enterprise, from the handset devices and the AI PCs all the way out to the edge and back to the data center. We knew this was going to happen, Pat, and I've been beating this drum all day: we're hearing about how AI is being brought to life and bringing value to customers at Dell Technologies World this year.
Patrick Moorhead: Yeah, it is important. And it's funny, Daniel, probably about a year ago we started picking a little bit at what was getting covered, right? You have compute, which can be CPU, GPU and ASIC. You have memory, you have storage. But very little was said about networking. Yet we had GPUs that sat idle if you didn't have the right networking. We had networking that, if misconfigured, could kill an entire training run or a low-latency inference run. So let's dive into networking, and I can't think of a better person here at Dell Tech World to talk about that than Saurabh. Great to meet you and welcome to the Six Five.
Saurabh Kapoor: Thank you so much, Pat and Dan, it's great to be here. And as you rightly said, networking is becoming a hot topic across the board. There have been studies showing that AI workloads spend as much as 57% of their time waiting on the network. So you could have the best compute, but if you don't have the networking done right, you're not going to get peak performance.
Patrick Moorhead: Exactly, exactly.
Daniel Newman: I tell you what, when you think about the bottlenecks to AI proliferation, we hear a lot about energy. We know that's a significant constraint, especially as we go to this next era of liquid cooling and just the size of these systems and the amount of power they require. But Pat, next up has to be, to some extent, networking, right? You've got to scale out the racks, got to scale up. Just yesterday at Computex, I think Jensen showed the spine. I don't know if you saw what he showed, but basically a single spine handles as much traffic as the entire Internet today, which is all about networking. So when you all of a sudden have agents, and there are trillions of these things working non-stop around the clock, all requiring compute and networking, it's going to be an incredible ramp of needs. Saurabh, we've heard a lot about the Dell AI Factory. We know that your partnership with Nvidia has really been designed to make these things seamless out of the box. You're looking at these in the real world. What are you seeing? What's the impact of these AI factories?
Saurabh Kapoor: So the concept of the Dell AI Factory, and Michael spoke about that in the keynote earlier today, the 2.0 version, came about with the very thought of how compute, storage and networking have to work together to deliver an end state of performance characteristics. There are a few elements of infrastructure, but I'll start with data, which is the most important thing; it's the fuel for AI. Dell has been a champion and a leader in the storage category for years: protecting, managing and storing data, ensuring it's delivering what it's supposed to, and we have carried that into the AI space. So data is one of the most important things. But when it comes to infrastructure, you have compute, storage and networking, and it has to be highly optimized to deliver an end state. That's a choice of technology, from partnerships with Nvidia, AMD and Intel on the compute side, and now on the networking side a portfolio that spans Broadcom and Nvidia. We package it together, validate it, and create reference architectures to ensure it meets the use cases it's meant for. But what's also important is that open ecosystem, because when you open it up, you create room for a lot of innovation: Mistral, Cohere and a lot of other partners that we have brought on board. And then you put the Dell wrapper of services around it.
Patrick Moorhead: That's right.
Saurabh Kapoor: Which is ensuring that, from the first gate of consulting, your data center is all set up for the infrastructure, all the way through to driving the use cases, whether it's training, inferencing or fine-tuning. Networking is critical to characterizing the infrastructure based on the workloads, so you meet that end state of use cases and business objectives.
Patrick Moorhead: So when it comes to use cases inside of the enterprise, it's not homogenous, it's not one workload and it's called AI. Okay. There are a lot of different things that you need to do. I love when Dan talks about all these agents moving around and doing all these things.
Daniel Newman: Heck yeah. MCP. API.
Patrick Moorhead: But how do you segment use cases and then marry them up against specific technologies?
Saurabh Kapoor: Well, the way we go about this is by the kind of workloads you're looking at. For training, you're looking at the GPU farms, the AI factories, the super-large clusters we're talking about, and Michael gave some references earlier in the day, the Groks of the world and the large language models they're building. The characteristics there: you're looking at elephant flows, very busy traffic. These architectures have to be highly optimized to deliver the kind of capabilities you want; these are several thousands of GPUs that you have to connect together while ensuring the networking is set up right. But when you shift right into the enterprise space, things start to change. This is where the real action happens. You look at fine-tuning those large language models and at distributed inferencing use cases, where the scales are slightly smaller but still significant, where you have a few hosts, a few GPUs, working together. This is where you connect these large language models to enterprise-specific data. Think of banks, healthcare providers and other enterprises in that category who have to keep the data on premises. You're not taking the data to AI; you're taking AI to data, protecting the user data on prem while still fine-tuning. And while we take that journey across different workloads, Dell internally has championed AI across different business functions, from content generation for our marketing teams, to AI sales chatbots for our sellers, to creating better services capabilities and improving our SLAs, with a lot of these workloads mirroring customer use cases. And now we're taking that to the next stage with the supply chain: we're looking at demand planning, UPP and all that to predict supply chain results across the board.
Patrick Moorhead: Excellent. Makes sense.
Daniel Newman: So let's talk a little bit about SONiC for networking solutions. It's something Dell champions and has been very focused on. But in the AI era, talk a little bit about how SONiC is bringing benefits to the modern AI fabric.
Saurabh Kapoor: Right, so SONiC is Software for Open Networking in the Cloud. I'll go back in time a little bit.
Daniel Newman: Do that again quick. No, I'm just kidding.
Saurabh Kapoor: Go back in time to when Microsoft was looking at setting up Azure, and they were asking, hey, how do we commoditize this network and ensure we're not dealing with pockets of different vendors? How do we homogenize that a little bit? That's when they came up with this technology, Software for Open Networking in the Cloud, and said any participating silicon vendor has to qualify for it. So that was the origin, and we've been working with Microsoft all these years. We took the journey with SONiC after Microsoft contributed it to the open source community, because we felt it was a natural evolution for open networking, bringing the best practices of the hyperscalers to the enterprise market. We put a Red Hat-style model around it: created our Enterprise SONiC version, started adding the features needed for different enterprise use cases, and put Dell support and services around it. As we took that journey over the last 24 months, AI has been the next big use case, with Ethernet getting ready, with higher-radix switching, to deliver the characteristics needed for AI. In SONiC we have done a lot of feature enhancements focused on bringing load-balancing capabilities and congestion management, supporting RoCEv2 for lossless fabrics, and addressing low-entropy use cases with enhanced hashing capabilities. Basically, we're extending SONiC's capabilities from the cloud to the world of AI, enabling the features needed for the different workloads I mentioned, from training to inferencing and fine-tuning, and making sure we deliver on the performance characteristics needed there.
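[Editor's note: the low-entropy problem mentioned here can be illustrated with a short sketch. This is hypothetical illustrative code, not SONiC's actual hashing implementation: with only a handful of long-lived elephant flows, a standard 5-tuple ECMP hash can pile several flows onto one link while others sit idle, whereas thousands of small flows spread out statistically. That imbalance is why enhanced hashing matters for AI fabrics.]

```python
import hashlib
from collections import Counter

def ecmp_link(five_tuple, num_links=4):
    """Pick an ECMP link by hashing the flow's 5-tuple (illustrative only)."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_links

# Four long-lived "elephant" flows: low entropy, so hash collisions are likely,
# meaning two heavy flows can land on the same link and congest it.
elephants = [("10.0.0.1", "10.0.1.%d" % i, 6, 49152, 4791) for i in range(4)]
print("elephant flows per link:", Counter(ecmp_link(f) for f in elephants))

# Thousands of short "mice" flows spread much more evenly across the links.
mice = [("10.0.0.1", "10.0.1.1", 6, 49152 + i, 4791) for i in range(4000)]
print("mice flows per link:", Counter(ecmp_link(f) for f in mice))
```

Running this shows the mice flows landing near 1,000 per link, while the four elephants frequently collide, which is the traffic pattern AI training produces.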
Patrick Moorhead: So, going to dig a little deeper here. How does Dell AI Fabric connectivity span the best of both worlds with Ethernet and InfiniBand, and how does that result in a higher level of resiliency for these AI workloads?
Saurabh Kapoor: Right, that's a great question. This comes up in every AI networking meeting we get into. InfiniBand has been a great technology championing HPC use cases for years and decades. Ethernet was not so focused on HPC all those years, but with the advent of AI and generative AI, it's becoming more and more important. Just as Ethernet championed the cloud ecosystem, cloud computing, which was nothing but distributed computing all connected by Ethernet, we see the same trend happening in the AI space. With the advent of higher-radix switching and better silicon capabilities, when you look at Broadcom's Tomahawk 4 and Tomahawk 5 and Nvidia's Spectrum-4 switching, all these Ethernet technologies have brought in a lot of native capabilities to carry AI traffic. Think of these big pipelines that have to move petabytes of data across those switches. That's one. And then what we're doing is baking networking into the Dell AI Factory concept: connecting it into the infrastructure block and validating those networking use cases across compute, storage and networking to deliver an end state. We have a portfolio that spans Broadcom to Nvidia, a choice of Ethernet and InfiniBand technologies, so customers get to pick the right technology for their use case. In summary, Ethernet, with all the optimizations I mentioned in SONiC and the silicon capabilities, now delivers performance similar to InfiniBand, and we have results that prove it. Now it's about customers picking the right technology: do they want all-Nvidia reference architectures, the NCPs and ERAs, or more open architectures with SONiC and the like?
Daniel Newman: It does feel like we're going to have a bit of that race, that Android-versus-Apple kind of thing. Is it all InfiniBand? Is it all Ethernet? Is it some hybrid? Of course, this week I think there's been a bit of an opening up of at least scale-up between non-Nvidia compute.
Patrick Moorhead: UEC, UALink, all that stuff.
Daniel Newman: Starting to see it all happen. Yeah. CXL. I mean we like acronyms too.
Patrick Moorhead: We do.
Daniel Newman: Speaking of, let's talk about AIOps before we wrap up here. We see a lot of work, whether it's with agents or just with infrastructure: you've got orchestration, you've got observability, you've got AIOps. Dell is focused on all of these things driving better outcomes for the AI factory. What about the services component of all this? Because, in the end, like I say all the time, right now the two big money makers in AI are infrastructure and consulting and services. So talk about the services side of this. It seems like a big opportunity for Dell.
Saurabh Kapoor: Absolutely. The world of automation and observability has changed quite a lot. If you look at core data center use cases, your standard SNMP and syslog, and automation with CI/CD and infrastructure as code, could work fine. But AI infrastructures run at peak capacity throughout, and you cannot just depend on log-based telemetry tools to manage them. You really need insight into what's going on in the fabric, and then technologies like agentic AI solutions on the NICs, looking at loads on the fabric and taking real-time decisions on managing congestion. AIOps technologies give you the end-to-end fabric view of the flows, from the compute GPUs to the NICs all the way to the switches. And now you bring LLMs into these stacks, machine learning interfaces that learn about the fabric and give you predictive analytics, health monitoring, and insights that enable better decision-making. Then you bake all of this into the services world, where services deliver better because you now know the fabric much better; you're able to give predictive insights on when you need to increase your infrastructure capacity. We have brought a lot of those capabilities into our services side: a full spectrum, from assessment of data center infrastructures and consulting practices, to designs, architecture guidelines, best practices and reference architectures, to pre-production readiness with rack-and-stack and configuration setup, all the way to day-two operations and management with the AIOps capabilities. So a lot of innovation is happening across these services and AIOps categories.
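[Editor's note: the shift from log-based monitoring to counter-driven fabric telemetry can be sketched in a few lines. This is an illustrative example with hypothetical port names, counter fields and thresholds, not a Dell or SONiC API: rather than waiting for a syslog error, an AIOps loop polls per-port congestion counters, such as ECN marks and PFC pause frames, and flags ports whose rate of change between polls suggests congestion.]

```python
def congested_ports(prev, curr, ecn_limit=1000, pause_limit=100):
    """Return ports whose counter deltas since the last poll suggest congestion."""
    flagged = []
    for port, now in curr.items():
        before = prev.get(port, {"ecn_marked": 0, "pfc_pause": 0})
        # A burst of ECN marks or PFC pauses between polls signals a congested link.
        if (now["ecn_marked"] - before["ecn_marked"] > ecn_limit or
                now["pfc_pause"] - before["pfc_pause"] > pause_limit):
            flagged.append(port)
    return flagged

# Two consecutive polls of (hypothetical) per-port counters.
prev = {"Ethernet0": {"ecn_marked": 10, "pfc_pause": 0},
        "Ethernet4": {"ecn_marked": 20, "pfc_pause": 5}}
curr = {"Ethernet0": {"ecn_marked": 5010, "pfc_pause": 2},   # heavy ECN marking
        "Ethernet4": {"ecn_marked": 40, "pfc_pause": 6}}     # quiet
print(congested_ports(prev, curr))  # → ['Ethernet0']
```

In a real deployment this delta logic would feed a rebalancing or alerting decision rather than a print statement, which is the "real-time decisions on managing congestion" idea described above.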
Daniel Newman: Saurabh, I want to thank you so much for joining us here at Dell Technologies World. It's great to have a conversation pulling together, you know, sometimes the missing link.
Patrick Moorhead: Exactly.
Daniel Newman: Which is networking all of this compute power. You know, we got the data, we got the compute, we've got all the agents and we've got the generative tools. But we've got to make sure all these computers can talk to each other and move all this data. Networking is the key. Saurabh, let's have you back sometime soon. Thanks for joining The Six Five.
Saurabh Kapoor: Thank you so much for having me.
Daniel Newman: Thank you everybody for being part of The Six Five On The Road here at Dell Technologies World 2025. We're going to step away for a little break, but we'll be back with you very shortly. Stay tuned.