Inside Azure's Compute Strategy: How Cobalt 200 Builds on Early Momentum
Nobody predicted that agents would be the thing that brought CPUs back to the center of the infrastructure conversation. Patrick Moorhead and Daniel Newman sit down with Mark Russinovich and Arun Kishan from Microsoft Azure to break down Cobalt 200, Azure's open-source infrastructure strategy, and what the next phase of enterprise cloud compute actually looks like.
AI and agentic workloads have shifted CPU demand in ways much of the industry did not anticipate, and Azure's custom silicon roadmap is now built around that reality. Patrick Moorhead and Daniel Newman sit down with Mark Russinovich, CTO, Deputy CISO, and Technical Fellow at Microsoft Azure, and Arun Kishan, Technical Fellow and Corporate Vice President of Azure Compute at Microsoft Azure, to break down what Cobalt 200 actually delivers, how open source fits into Azure's infrastructure strategy, and where Azure compute is heading as cloud-native and agentic workloads continue to scale.
The conversation covers how customer feedback from Cobalt 100 shaped the architectural decisions behind Cobalt 200, including the move to the Neoverse V core family and the expansion of VM families to address a wider range of workload scenarios. Kishan details the performance gains driven by larger caches, increased memory bandwidth, per-core turbo controls, and on-chip accelerators for compression, cryptography, and data movement. Russinovich addresses how ARM's long history with Linux makes Cobalt a natural platform for open source stacks, why Azure Linux is now available as a VM-level distribution option, and how Azure's infrastructure already runs on Linux, deeper than most customers realize. Both guests weigh in on what Cobalt 200 signals about the long-term trajectory of Azure compute.
Key Takeaways:
🔹 Cobalt 100 generated clear performance and efficiency signals that directly shaped Cobalt 200. Teams achieved 45% better performance per core and deployed 35% less total footprint to handle the same workload, feedback that drove the architectural investments in the next generation.
🔹 The move from Neoverse N to Neoverse V delivers approximately 50% better performance per core. The V family's single-threaded core design avoids the contention that degrades performance under load in hyper-threaded architectures, which matters significantly for agentic and latency-sensitive workloads.
🔹 Cobalt 200 expands the VM portfolio to cover storage-optimized and high-memory use cases. The addition of L-series and M-series VM families broadens the platform beyond the general-purpose and memory-optimized scenarios Cobalt 100 addressed.
🔹 Linux is not a strategic posture for Azure; it is embedded infrastructure. Azure Boost, which runs on ARM cores executing Linux, is now present in over 30% of the Azure fleet. Every server carrying Azure Boost is already running Linux at the node level.
🔹 Cobalt 200 is positioned as the premier Azure platform for cloud-native and agentic workloads going forward. Kishan frames each generation as a maturation step, with 200 representing the point where Microsoft hits its stride on custom silicon investment and scale.
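As a rough sanity check on the Teams figures in the first takeaway, the sketch below works out how much footprint a fixed workload could shed when per-core performance rises by 45%. It assumes that the number of cores required scales inversely with per-core performance, which is a simplification the speakers do not state, and the function name is illustrative only.

```python
def footprint_reduction(perf_gain_per_core: float) -> float:
    """Fraction of cores saved for a fixed workload.

    Assumption (not from the episode): required cores scale inversely
    with per-core performance, so a gain g needs 1 / (1 + g) as many cores.
    """
    if perf_gain_per_core <= -1.0:
        raise ValueError("performance gain must be greater than -100%")
    return 1.0 - 1.0 / (1.0 + perf_gain_per_core)


if __name__ == "__main__":
    saved = footprint_reduction(0.45)
    print(f"Cores saved at +45% per-core performance: {saved:.1%}")
```

Under that assumption the saving comes to roughly 31%, a little below the 35% footprint reduction quoted for Teams. The episode does not break the difference down, so the remainder presumably comes from factors this simple model ignores.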
Watch the full video at sixfivemedia.com, and be sure to subscribe to our YouTube channel so you never miss an episode.
Disclaimer: Six Five Media is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
ARUN KISHAN:
Cobalt is going to be our premier offer to enable us to optimize our fleet for best-in-class performance for this next generation of cloud-native and agentic workloads on Azure.
PATRICK MOORHEAD:
The Six Five is back with another webcast, and we are talking about our two favorite topics. So we covered Microsoft Build on our last pod, and we talked about a lot of discussions around having a full stack to bring to bear. And one of the most important stacks is obviously infrastructure. And in the infrastructure conversation, Daniel, as we've had, what is super duper hot? CPUs, right? We pretended to forget about them for a while, thinking that GPUs could solve the world's problems along with ASICs. And then agents hit and the CPU demand has gone up, what? 90%.
DANIEL NEWMAN:
Yeah, it's been wild. I think a lot of people missed that one. They saw the other bottlenecks. People caught on to the memory. We, of course, talked endlessly for three years about whether there were going to be enough GPUs and how many custom XPUs could be created. And then out of nowhere, I'd say fall 2025, December 2025, there was a moment where the model suddenly became super useful. And then people started putting all these agentic workflows into place. And they were like, we're going to need a lot more CPU compute to orchestrate all this stuff. And off we go. And then all the companies that are in this space that are building full stack, that are building, you know, XPU, CPU, and of course everything that sits in between are looking really, really smart because this is going to be a huge inflection. And I don't know if you've heard me say this before, Pat, but I don't think we have enough compute.
PATRICK MOORHEAD:
Daniel, now with a rocketship emoji on X, I get used to, but no, seriously, let's dive in here and we are going to talk infrastructure, custom silicon, open source innovation. It's my pleasure to welcome back to the show, Mark and Arun from Microsoft. Guys, great to see you and congratulations on what I think was a successful Build.
MARK RUSSINOVICH:
Thanks for having us. Yeah, thanks for having us. It was a great Build. Just got back yesterday.
DANIEL NEWMAN:
It was a great Build. I've been talking a lot about how significant the moat is. I was on broadcast today talking about Microsoft, and I actually said how significant the enterprise moat is and how misunderstood many people in the market think that the disintermediation of that moat is actually going to be, that not everyone's going to go over and start using Claude and fix all their problems. That deep relationship that Microsoft has with so many enterprises is going to be really sticky. And some of the things you're doing with your next generation with the models that you've released, the thinking models and stuff, it just adds more to the repertoire. So really, really good stuff. Congratulations on the event. Arun, I'm going to start with you, though, because we're going to talk a little infrastructure. So, you know, the Cobalt 100 VMs, strong adoption since launch, but of course, we know with any first-generation custom silicon, there's a lot to be learned. What lessons, you know, kind of from that first generation really influenced how you guys went about building Cobalt 200?
ARUN KISHAN:
Yeah, thanks for the question. So as you mentioned, we've seen a really strong uptake on the Cobalt 100 family across both first and third party customers, such as Teams internally and third party customers like Amadeus, who really benefited from the cost savings, throughput, and responsiveness improvements they've enjoyed by adopting the new platform. Teams, I think we talked a little bit last time, they got about 45% better performance per core that allows them to actually deploy 35% less total footprint to satisfy the same workload. So we had kind of a lot of clear signals from customers that they're looking to optimize absolute performance and the performance they get per dollar. And they're seeking to support kind of like a broader range of workloads that go beyond just general purpose and the memory optimized use cases we had looked at initially. And so some of the demand, as you kind of hinted at earlier, is coming from AI and agentic focused workloads that want to either complement or like replace, you know, the GPU constrained VMs with deployment on CPUs. And so we're sort of addressing the performance point by migrating the Cobalt 200 CPU core to the Neoverse V family, as opposed to N, which was in Cobalt 100. So Neoverse V is geared towards high performance. And we're also expanding the scope of the VM families that are going to be offered on the platform to address the customer workload needs across a wider gamut of scenarios. So for example, we'll be introducing the L-series and the M-series of VMs to support both storage-optimized and memory-optimized, or very high-memory use cases, I should say, on the next generation.
PATRICK MOORHEAD:
Yes, Arun, I was glad that you positioned the product as not good for everything for everybody. And I was also glad to see a lot of your internal workloads, because there's nothing like proximity to give you feedback on what to do to the next one. So not that external customers hold back, and there are external customers on Cobalt 100, but when you're serving up the person that is in the next building next to you doing Teams or something like that, the feedback, there's probably no governor on it. And that only can make your next product even better. So you talked about the core architecture upscaling of Cobalt 200 from N to V. Let's talk about what does that mean for customer workloads? What aperture does that open for new types of workloads that maybe Cobalt 100 wasn't necessarily architected for?
ARUN KISHAN:
The new generation CPU core with the Neoverse V, which is using the latest process technology, is going to sort of unlock even more capabilities in terms of what you can get, in terms of how densely you can pack workloads on there and deliver best price for performance. So the Neoverse V family, compared to our Cobalt 100, we're getting about 50% better performance per core than the last generation. So just like the last generation with Cobalt 100, the individual cores in Cobalt 200 can continue to scale performance as you increase CPU load far beyond what you see on the other traditional hyper-threaded architectures. And I think we talked about it a little bit last time, because as you start going to these higher performance points with hyper-threading, you get a lot of contention between those threads that sort of starts impacting performance. So when you have individual cores, you can push the workloads much, much further. So if you look at things that we're going to start seeing, you know, with these agentic and AI workloads, where we'll do some inferencing, possibly on the CPUs, or we're doing data processing or latency sensitive operations, you'll see quite a bit better performance characteristics also when you look at oversubscription and how tightly you can pack per core on a server without degrading. But also if you look at something like the media benchmark, same kind of thing that we looked at when we evaluated for Teams, we are seeing about 50% better throughput on the Cobalt 200 for the media transcode benchmark, which is representative of what we do in Microsoft Teams, compared to the latest v7 offers that we have today from the other architectures. So it's quite a big leap in terms of the absolute performance.
I think beyond the CPU itself though, we have many additional enhancements in the part to improve the performance like bigger L2 and L3 caches, we have more memory bandwidth, we have enhanced support for per core turbo and power control as well. So this like opens up additional knobs for better performance as well as better power efficiency. From the workload perspective, we've also built a bunch of accelerators into the chip that help with things like compression, cryptography, and optimized data movement, which further expand the scope of what you can do with the workloads. Then separately from this, the next generation Cobalt 200 platforms will all be paired with the other Microsoft Silicon that you've heard about. For example, they will have the Azure Integrated HSM, which helps with doing secure key management at the node level, and also the latest generation of the Azure Boost technology, so you get your industry-leading network bandwidth and storage throughput on those nodes as well. So I think we're offering a pretty comprehensive solution going into the next generation on the kinds of workloads and scenarios we can tackle.
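Kishan's contrast between one-thread-per-core designs and hyper-threaded ones can be checked from inside a Linux guest. The sketch below reads the standard sysfs topology file `thread_siblings_list` and counts how many hardware threads share a core with a given CPU. It is only an illustration: the helper names are made up here, and the topology a hypervisor exposes to a VM may not mirror the physical host.

```python
from pathlib import Path
from typing import Optional


def count_cpus_in_list(cpu_list: str) -> int:
    """Count CPUs in a Linux cpulist string such as "0-1" or "0,64"."""
    total = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            total += int(end) - int(start) + 1
        else:
            int(part)  # raises ValueError on malformed input
            total += 1
    return total


def threads_per_core(cpu: int = 0,
                     sysfs_root: str = "/sys/devices/system/cpu") -> Optional[int]:
    """Return how many hardware threads share a core with `cpu`.

    Returns None when the sysfs file is missing or unreadable
    (for example on a non-Linux system).
    """
    path = Path(sysfs_root) / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    try:
        return count_cpus_in_list(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("Hardware threads per core as seen by this OS:", threads_per_core())
```

A result of 1 means each vCPU the guest sees is presented as its own core, while 2 indicates sibling threads sharing a core.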
PATRICK MOORHEAD:
Yeah, I appreciate you adding the HSM and the turbo function. It is funny. People just think you truck roll up a CPU and kind of call it a day. But when we say full stack, we mean full stack.
DANIEL NEWMAN:
Yeah. And Mark, welcome to the conversation. How are you doing? Another thing Microsoft's been, you know, very focused on is open source leadership. And of course there's going to be a huge role to play for open source as we continue to proliferate all these workloads. So let's tie this together with Cobalt 200 and Azure and Microsoft's open source perspective. How does all this tie together to become a platform for open source?
MARK RUSSINOVICH:
Yeah, well, if you take a look at cloud native workloads and AI workloads, they're effectively almost entirely built on top of Linux and open source stacks, open source frameworks, whether it's PyTorch or vLLM, or even the frameworks that people are generally deploying to orchestrate their AI agentic systems. That entire stack's ecosystem is built on top of Linux and open source. And so we want, of course, to make sure Azure is a fantastic platform for open source, and we've been on this journey for over a decade of making Azure a great platform, starting with the launch of even IaaS, where we launched with Linux support. But one of the things that we've done is, of course, make sure that open source works great on Cobalt. And if you take a look at the benefit that you get from Cobalt, as opposed to, you know, coming from Windows, where ARM, up until the Surface laptop lines started to support ARM, was really a new architecture. Well, ARM has been a primary architecture for Linux and open source for a very long time. And so we get the benefits of that with Cobalt. For our customers that are looking at Cobalt and are using those open source stacks, it's very easy for them to take their workloads onto Cobalt. And a lot of times it's just picking up the ARM versions of whatever they've been using and they just work. And so that's been really great. Everything from AKS and Kubernetes to, like I said, the open source AI serving stacks and even training stacks, they all just work great on top of ARM, on top of Linux.
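To make "picking up the ARM versions" concrete, here is a minimal sketch of how a deployment script might choose an architecture-specific download at run time. The alias table, the `artifact_name` helper, and the file-naming pattern are hypothetical and chosen for illustration; only `platform.machine()` is a real standard-library call, and real projects name their release artifacts in their own ways.

```python
import platform
from typing import Optional

# Hypothetical mapping from platform.machine() values to the architecture
# labels often seen in package and container image names.
ARCH_ALIASES = {
    "aarch64": "arm64",
    "arm64": "arm64",
    "x86_64": "amd64",
    "amd64": "amd64",
}


def normalize_arch(machine: str) -> str:
    """Map a raw machine string to a common architecture label."""
    key = machine.strip().lower()
    if key not in ARCH_ALIASES:
        raise ValueError(f"unrecognized architecture: {machine!r}")
    return ARCH_ALIASES[key]


def artifact_name(tool: str, version: str, machine: Optional[str] = None) -> str:
    """Build an illustrative artifact file name for the current or given machine."""
    arch = normalize_arch(machine or platform.machine())
    return f"{tool}-{version}-linux-{arch}.tar.gz"


if __name__ == "__main__":
    # "mytool" and "1.2.3" are placeholder values, not a real package.
    print(artifact_name("mytool", "1.2.3", machine="aarch64"))
```

On an ARM-based Linux VM `platform.machine()` typically reports `aarch64`, so the same script would select the ARM build without any other change.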
PATRICK MOORHEAD:
Yeah, it is. I think everybody did the double-click. It was around a decade ago when you leaned into Linux. And I guess fortunately or unfortunately, you're still explaining that or we're still talking about that, but it's because you're so strong in Windows too. And you've made massive investments in the ARM stack as well. I did want to drill down though into very specifically Azure Linux. What role does it play in your broader cloud infrastructure strategy that may not be evident right now?
MARK RUSSINOVICH:
Well, what might not be evident is that we started to take big dependencies on Linux and build Linux into our infrastructure, again, close to a decade ago. Of course, when it came to Kubernetes becoming popular, we had to meet Kubernetes customers with AKS where they were. So, of course, we had to start making, you know, a foundation for our services on top of Linux. And so a service like AKS is built entirely on top of Linux. But what a lot of people might not know is Linux is actually embedded deeply in Azure's infrastructure. Arun mentioned Azure Boost, and that's our infrastructure offload card, which has an SoC on it with ARM cores that are running Linux. So every single server in Azure that has Azure Boost, and it's over 30% of the fleet now, has Linux on it. So to say that Linux powers the infrastructure of Azure is not an understatement. And then we've had our own internal distributions of Linux to support our own uses. And then a few years ago, we made a distro available to our customers as a container host for Kubernetes. And then, you know, we're just coming back from Build, where we announced the preview of Azure Linux as a distro for our virtual machine offerings as well. As an option for customers, of course, alongside Ubuntu and Red Hat Enterprise Linux and SUSE, they can choose Azure Linux as well for any Linux workload in any service that they've got the operating system choice on.
PATRICK MOORHEAD:
Microsoft is all in on Linux, everybody. Hear that? Never ask them again.
DANIEL NEWMAN:
Infrastructure, you're going to own it all. Except I feel like you guys are going to be all in on the MOE side of models. I just feel like you might not try to chase that pure frontier. You might just chase making all those models really, really useful for people. And by the way, doing it at a fraction of cost, because we all know, everyone on this call, everyone on this video knows, we got to get that price per token down, because now the companies are finally feeling the actual normalized cost of using these incredible tools. Running it on the right infrastructure, on the right device, all that stuff you guys talked about at Build is going to really come to fruition. So let's pull this all together. I'd like to hear from both of you on this one. I'll start with you, Mark. But like, what does this Cobalt 200 and all the things you're doing here kind of signal about how you guys are planning the roadmap, the future? And by the way, feel free to give us anything that the market doesn't know. We love breaking stuff on. I'm kidding. I'm kidding. But like, what does it say about the future of Azure Compute?
MARK RUSSINOVICH:
Well, the future of it, I mean, I'll summarize the future of Azure Compute. I think from my perspective and broadly technology speaking, it's serverless. And I mean, as you can see, I think the directionality here is Linux and containers and serverless and AI. That kind of sums it up in a nutshell, I think. And it's optimizing every single cycle, whether it's a GPU cycle, an AI accelerator cycle, a cycle on the server, or a cycle in our offload cards. It's not only making them more efficient and faster and lower power, but also driving utilization to a hundred percent so that we're not wasting resources sitting idle. I gave an Azure innovation talk at Build. You can take a look at all of those innovations from that lens and see how they all are supporting that goal right there.
ARUN KISHAN:
Yeah, I think, like Mark mentioned, what we've seen with Cobalt 100 is real momentum from customers on really adopting it for cloud-native, so serverless, dynamic kind of workloads, as well as lots of OSS and Linux. And they really enjoyed strong efficiency and performance when it comes to running things like containers, microservices, and databases, and all this kind of agentic stuff on the Cobalt platform. So that feedback has kind of shaped how we think about, you know, where our investments go, looking at Cobalt 200. And we've gotten the feedback from customers that they want to bring more of their workloads onto this platform and do it within Azure. So overall, I would say, you know, with Armv5, our Armv5 offering in Azure, it sort of was dipping our toes in the water and getting something out there. With 100, we really started shaping our silicon investment from a first-party perspective. And 200 is, I think, where we're really hitting our stride. We're going to ramp, innovate, and scale on the Cobalt 200 platform. And moving forward, I think Cobalt is going to be our premier offer to enable us to optimize our fleet for best-in-class performance for this next generation of cloud-native and agentic workloads on Azure. So that's kind of where I see both our silicon investment and how we shape the compute offering going over the future roadmap.
DANIEL NEWMAN:
Well, Mark and Arun, I want to congratulate you both. Strong Build, strong progress on the Azure infrastructure. Congratulations on Cobalt 200. I can assure you every week Pat and I are talking about the developments across the infrastructure space. Microsoft is coming on strong and it's great to see all the progress that's being made. And I imagine there's a Cobalt 300 and maybe a 400 somewhere on a whiteboard. Maybe even being taped out or playing within a lab somewhere. Who knows? But, you know, let's keep having these conversations and we'll continue to watch and discuss what's going on over there at Azure.
MARK RUSSINOVICH:
Great. Great. It's been a great conversation. Thanks for having us on the show.
PATRICK MOORHEAD:
Thank you.
DANIEL NEWMAN:
And thank you, everybody, for being part of this Six Five Virtual Webcast. Build was a great event. So much going on at Microsoft. Stay tuned with us. Click and check out all of the coverage here on the Six Five for Build and across Microsoft. And of course, all the things that Pat and I talk about on the show. But for this one, we got to say goodbye. See you all later.