For Silicon the Future is Custom
How is the shift to custom silicon enabling the transformation of data centers?
At the Six Five Summit: AI Unleashed, we're thrilled to feature Marvell's Mark Kuemerle, VP of Technology and CTO of the ASIC Business Unit, and Rishi Chugh, VP of Product Marketing, Network Switching Group at Marvell, as two of our Cloud Infrastructure Speakers. They join host Will Townsend for a conversation about the shift toward custom silicon in data centers and the efficiency and performance gains it is driving.
Key takeaways include:
🔹The Paradigm Shift to Custom Silicon: Explore the transformative transition from standard products to highly specialized custom silicon within data centers, and how this redefines traditional ASIC business models.
🔹Chiplets – Building Blocks of Custom Design: Learn about the critical role of chiplets in accelerating custom silicon development. Our speakers shed light on whether these are truly synonymous with bespoke designs.
🔹Innovating Custom Memory Solutions: Understand emerging trends in custom memory, including Marvell's pioneering initiatives in custom SRAM development that challenge conventional standards-based memory approaches.
🔹Expanding Custom Silicon's Reach: Discover how custom silicon is expanding its influence across a broader spectrum of technologies, beyond traditional CPUs and XPUs, into adjacent fields like memory.
Learn more at Marvell.
Watch the full video at Six Five Media, and be sure to subscribe to our YouTube channel, so you never miss an episode.
Or listen to the audio here:
Will Townsend: Hi everyone, and welcome to The Six Five Summit: AI Unleashed. I'm Will Townsend. I manage the networking, telecommunications, and cybersecurity practices for Moor Insights & Strategy, and today I'm joined by Marvell's VP of Technology, Mark Kuemerle, and VP of Product Marketing, Rishi Chugh, for this cloud infrastructure spotlight on expanding the boundaries of custom silicon. And with that context set, gentlemen, let's jump in. Mark, let's start with you.
Mark Kuemerle: Sure.
Will Townsend: I've got a guess at the answer to this question, but I want to hear your perspective. For decades, merchant silicon has really dominated the data center. From your perspective, what's giving rise to custom, and how is this different from the old ASIC business model?
Mark Kuemerle: Yeah, I think it's really a couple of factors. First of all, what we've seen over the last several years has been a pretty pronounced slowing of Moore's Law. Unfortunately, we no longer get the doubling every technology generation that we used to get. But the data center doesn't care that Moore's Law is slowing; it still needs to find a way to increase performance every generation. What that means is we're seeing the emergence of things like chiplets to enable these customers to take advantage of these bigger, more complicated systems.
Will Townsend: And so, I mean, I would assume AI workloads are driving the need for that as well, right Mark?
Mark Kuemerle: Absolutely. And the data centers have really specific workloads that are really dedicated to specific functions and they're different for each data center. AI is a big part of that.
Will Townsend: Sure. And Rishi, anything to add to that?
Rishi Chugh: Yeah, exactly, adding to that. Today's infrastructure deployments in the data center are more focused on specific workloads, and the service providers are constructing these data centers with opex in mind. They want to optimize their opex and give the maximum return to their customers in terms of quality of service. That is what fuels custom silicon coming into play rather than a standard product. On the AI front as well, AI is very power hungry, so there is a heavy amount of customization and a unique way of doing things so that you optimize power to get maximum return on these investments, and on your opex, to be really profitable. In fact, what we see is that 25% of compute silicon moving forward will be on the custom side, and the custom XPU projects we are seeing are headed in that direction. This trend has really become the norm in the industry, especially for the hyperscalers who are building their own data centers.
Will Townsend: It really is. I mean, the hyperscalers are doing their own silicon designs as well, right? So purpose-built silicon is really driving the train. And Rishi, Mark mentioned chiplets, so I'd love for you to settle the debate: are chiplets considered custom?
Rishi Chugh: So there are two aspects to chiplets, and I will let Mark speak to them. But at a high level, the whole idea of chiplets came about for two reasons. One reason was to accelerate execution time when there is a repeatable function that an organization or a vendor wants to put into all of its silicon. It becomes more of a copy-and-paste exercise, which reduces the cost of doing it and puts their stamp on the silicon. That's number one. The second piece is that, with these designs getting more complicated and with the limitations of die size, the reticle size of the die, coming into play, chiplets become a much more viable option. And if there are forward-looking changes, they can be made very quickly without changing the parent, mothership die.
Will Townsend: Okay, and Mark, anything to add there?
Mark Kuemerle: Yeah, and I want to circle back to what I was speaking about in the first question, which was Moore's Law slowing down. We started to see Moore's Law slow down at around 7 nanometers, transitioning into the technology nodes that followed. And once we saw it slowing down, as I mentioned, we couldn't sacrifice performance. So the amount of silicon that we need to put into a package has been increasing every technology generation since then. By the time we're at 2 nanometers with designs today, we're looking at over a thousand square millimeters, even two thousand square millimeters, of silicon that really needs to be integrated together in a package. The only way to do it is to break it into pieces using technologies like chiplets. So in a way, chiplets don't have to be custom, but with a lot of these big designs, the only way to achieve that performance is to use multiple chips working together.
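To make the die-size point concrete, here is a minimal back-of-the-envelope sketch in Python. The roughly 830 mm² single-exposure reticle limit and the die-to-die overhead factor are illustrative assumptions on my part, not figures from the conversation; only the 1,000 to 2,000 mm² total silicon range comes from Mark's comments.

```python
# Rough sketch: why ~2,000 mm^2 of logic forces a multi-chiplet design.
# Assumptions (illustrative, not from the conversation):
#   - ~830 mm^2 practical single-exposure reticle limit for one die
#   - ~15% extra area per design for die-to-die interfaces and duplication
import math

RETICLE_LIMIT_MM2 = 830   # assumed ceiling for a monolithic die
D2D_OVERHEAD = 0.15       # assumed overhead introduced by splitting

def min_chiplets(total_logic_mm2: float) -> int:
    """Smallest chiplet count whose combined area covers the logic plus overhead."""
    effective_need = total_logic_mm2 * (1 + D2D_OVERHEAD)
    return math.ceil(effective_need / RETICLE_LIMIT_MM2)

for total in (1000, 1500, 2000):
    print(f"{total} mm^2 of logic -> at least {min_chiplets(total)} chiplets")
```

Even under optimistic assumptions, anything approaching 2,000 mm² cannot be built as a single die, which is the force pushing these designs toward chiplet partitioning.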
Will Townsend: Yeah, that makes perfect sense. So let's shift the conversation to memory. I've spent a lot of time with the Marvell team and I know this is a big focus for Marvell. Historically, memory has been based on standards for obvious reasons: scale, supply chain, cost, and that sort of thing. But I'm beginning to see the rise of custom memory, like custom high bandwidth memory. Mark, from your perspective, why are we seeing this? Is the answer AI again, or is there more to it than that?
Mark Kuemerle: Well, AI is certainly the major driver, but again, it comes back to the incredible need to get more and more performance. What happens when you customize HBM, especially with Marvell's custom HBM solutions, is that we can pull a lot of the content that was just there to interface to those memories off of the main die of an accelerator. That opens up a lot more area for our customers to really get the performance, the number of teraflops, that they need in the design, and that makes a big difference. For example, we can free up 25% of the die space that would otherwise have been dedicated just to talking to the HBM memories themselves, and we still get even better performance from the custom HBM, with about 70% less power spent on the interface to those memories. And as I'm sure you're aware, power has become really critical for a lot of these applications.
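A quick illustrative calculation using the two ratios Mark quotes (about 25% of die area reclaimed, about 70% less interface power). The 800 mm² baseline die and the 30 W baseline HBM-interface power are hypothetical numbers of mine, used only to show the shape of the trade-off, not Marvell product figures.

```python
# Illustrative only: plugging the quoted ratios into hypothetical baselines.
BASE_DIE_MM2 = 800.0     # hypothetical accelerator compute die area
HBM_PHY_FRACTION = 0.25  # quoted: ~25% of the die was spent talking to HBM
BASE_IF_POWER_W = 30.0   # hypothetical power of a standard HBM interface
IF_POWER_SAVING = 0.70   # quoted: ~70% less interface power

reclaimed_mm2 = BASE_DIE_MM2 * HBM_PHY_FRACTION
interface_power_w = BASE_IF_POWER_W * (1 - IF_POWER_SAVING)

print(f"Die area reclaimed for compute: {reclaimed_mm2:.0f} mm^2")
print(f"HBM interface power: {BASE_IF_POWER_W:.0f} W -> {interface_power_w:.0f} W")
```

The point is not the absolute numbers but that both scarce resources in an accelerator, beachfront die area and package power, are freed up at the same time.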
Will Townsend: And Mark, you sort of touched on this when you were talking about Moore's Law and hitting the wall there. Custom has long been associated with XPUs and CPUs, right? And you look at what others are doing with GPU technology, the AMDs and the NVIDIAs of the world, and in adjacent technologies and packages like memory. But is it expanding even beyond that base as well?
Mark Kuemerle: So we spoke about how customizing HBM is really important when we're integrating high capacity memory in the package. One thing that's equally important is the SRAM, the embedded SRAM that's built within the accelerator or switch or whatever kind of customized data center device you're building. We are building very specific, highly customized memory solutions for our customers that crank out as much bandwidth as we can per square millimeter of space. In fact, our memory solutions provide about 17 times the bandwidth you can get from an off-the-shelf memory today. It's a really big deal for us, it's a really big deal for our customers, and it's existential to continuing to grow performance in AI. Beyond that embedded SRAM, high capacity memory is a huge need in the data center as well. We address that with a set of product families optimized for CXL connectivity, for memory acceleration and for memory pooling or capacity expansion, enabling 12 terabytes of memory accessible through one highly optimized chip. So it's a pretty special offering, spanning all the different types of memory that are needed for the data center. I think there's a huge need for customization, again driven by those very unique workloads that we see in the data center. Not only do we customize accelerators, NICs, and components like that, but we're even seeing customization moving into the switch arena. That's actually what Rishi is an expert on, and I'd really like to get his perspective on that.
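As a side note on why pooling matters: the sketch below is a generic illustration of memory stranding, not a description of how Marvell's CXL products behave, and the per-server figures are made up for the example.

```python
# Generic illustration of memory stranding vs. pooling (hypothetical numbers).
# Each server provisioned for its own peak wastes DRAM, because peaks rarely
# coincide; a shared pool can cover the same workloads with less total memory.
servers = 16
typical_need_gb = 512    # hypothetical steady-state need per server
peak_need_gb = 1024      # hypothetical occasional peak per server
coincident_peaks = 4     # assume only a few servers peak at the same time

static_total = servers * peak_need_gb
pooled_total = servers * typical_need_gb + coincident_peaks * (peak_need_gb - typical_need_gb)

print(f"Static per-server provisioning: {static_total / 1024:.1f} TB")
print(f"Pooled provisioning:            {pooled_total / 1024:.1f} TB")
```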
Will Townsend: Yeah, Rishi would love to hear that.
Rishi Chugh: On the switching side, what is switching all about? It's basically routing a large number of packets to different segments. So there is a lookup table associated with the switch, and there is also a huge centralized buffer scheme inside the switch. As we increase the radix of the switch, meaning more fan-out, we're indirectly saying there are more individual ports talking to the switch and more unique MAC addresses, plus there's a cocktail of things going on between layer 2 and layer 3 in terms of routing interfaces. All of this has to be encapsulated in a large buffer and managed accurately to do the right pipelining for the switching to happen, with very low latency memory, SRAM, inside the device. The overall management of those memory modules becomes very important for us: in terms of fan-out, in terms of the data manipulation happening inside, and in terms of the bandwidth rate we're operating at. Overall, this whole infrastructure and the SRAM implementation on the custom switch side become very, very critical. In fact, I'll speak more about this when we talk about the different classes of switching. Within the switch and this memory, you have different bifurcations happening because of AI, namely scale-up and scale-out. On scale-up especially, there is much more requirement for memory, because scale-up is where you want to maximize utilization of your GPUs. The biggest spend is on the GPUs, and the last thing you want is for them to be underutilized. And that underutilization of the GPUs comes down to memory: some GPUs will be starving for memory, and some will be underutilizing the memory they have. The switch fabric is what encapsulates all of them, brings them into a unified structure for scale-up, and provides a unique value proposition in custom, where you can make sure that your investment in your infrastructure, especially on the GPU side, is amortized across all the nodes.
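To put rough numbers on the buffering point, here is a sketch using the classic bandwidth-delay-product rule of thumb (buffer ≈ aggregate bandwidth × round-trip time). This is a textbook heuristic, not Marvell's sizing methodology, and the radix, port speed, and RTT values are assumptions chosen only for illustration.

```python
# Rule-of-thumb switch buffer sizing: buffer ≈ bandwidth × RTT.
# Textbook heuristic with assumed numbers; not a vendor methodology.
PORTS = 64        # assumed switch radix
PORT_GBPS = 800   # assumed per-port speed in Gb/s
RTT_US = 10       # assumed in-fabric round-trip time in microseconds

aggregate_bps = PORTS * PORT_GBPS * 1e9
buffer_bits = aggregate_bps * (RTT_US * 1e-6)
buffer_mb = buffer_bits / 8 / 1e6

print(f"Aggregate bandwidth: {aggregate_bps / 1e12:.1f} Tb/s")
print(f"BDP-sized buffer:    {buffer_mb:.0f} MB of low-latency on-die SRAM")
```

Even a modest round-trip time turns into tens of megabytes of low-latency on-die SRAM, which is why the SRAM implementation becomes so critical as radix and port speeds grow.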
Will Townsend: Yeah, and you can actually fine-tune the performance as well. Another trend that I'm seeing is the integration of DPUs into top-of-rack switches to embed network and security services. Marvell has a DPU line; I've written about that over the last few years as well. There's a lot of innovation coming within the switching segment. Mark, any other perspective on the whole smart switch evolution?
Mark Kuemerle: Yeah, and this really speaks to the topic of our discussion today: customization is really everywhere. I think, especially with innovative NIC solutions, we're going to continue to see what was a very, let's say, standard and well-known architecture for switching evolve to be extraordinarily unique and very specific to different customers and their implementations. So I think we're going to see exactly the trends that you alluded to, more and more than we have in the past.
Will Townsend: Yeah, well, as we round out our conversation, that's a great segue to the final question I have for both of you. And Rishi, I'll start with you. What's next in the evolution of custom from your perspective?
Rishi Chugh: A good question, actually. If you look at the whole phenomenon of custom, I would start with the data center rack and all the big iron components sitting inside it. We have already seen customization happening on the CPU and XPU components inside the rack. And now we are seeing, as I mentioned, the bifurcation of different classes of switch: there is a traditional cloud switch and then there is an AI switch, and within the AI switch there is further bifurcation into scale-up and scale-out switches. So it will be no surprise that the next frontier of customization will be on the switching side, where people will be doing custom switches, and this goes hand in hand with the custom XPUs, DPUs, and NICs. What we are also seeing is a lot of push on the photonics side, with the photonic engine or optical engine being integrated inside the silicon, which is CPO, another frontier. There are different technologies there, and customization will be a very crucial factor in CPO, because the CPO engine itself will be defined by your usage, your scalability, what length you want to drive, and what applications you want to drive. To get these things done, it goes a further level into the platform, where you need to look at customization at the platform level: what reach you are using, whether you are using copper or optics, and what thermal solution you are using in your design, whether it is liquid cooled or just air cooled. All of this will be factored into the custom silicon. It will not be just silicon engineering that is customized for application usage; it will be the platform and the system that influence the customization. So this rack is going through a transformation phase from CPUs to XPUs, and we have already seen that phase happen for the NIC; we have done a custom NIC for a number of our hyperscalers, which is in the public domain. The next evolution will be on the connectivity and switch side.
Will Townsend: Yeah, networking is sexy again, right? Mark, what do you see in your crystal ball with the evolution of custom?
Mark Kuemerle: Yeah, absolutely. And really building off of what Rishi shared with us, I think we're going to see more and more complexity integrated into these devices as we essentially define the platform that allows the data center to achieve its goals. From a connectivity point of view, we're going to see more and more integration: things like co-packaged copper and co-packaged optics, driving really complex package technology. We just recently announced a novel multi-interposer 2.5D package integration technology platform that allows our customers to really scale up the amount of content and also to simplify 2.5D integration by removing silicon from the equation. These kinds of innovations and investments in the platform are really going to continue to grow this trend of systems in a package, which is key for the data center to achieve its performance goals.
Will Townsend: Yeah, and I'm glad you mentioned co-packaged optics. That's a pretty hot topic right now, right? The efficiencies there, the performance improvement, the power management that comes along with it. That's an area I'm beginning to dig into and hope to publish some insights on in the near future. So, gentlemen, thank you for your time. It's been a very compelling conversation. I want to thank all of our audience for joining us for this cloud infrastructure spotlight at The Six Five Summit. Stay connected with us on social media. You can find us at SixFiveMedia.com/summit, and there are more insights to come, so stay tuned.
Disclaimer: The Six Five Summit is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Speakers
Mark Kuemerle – VP of Technology for Marvell – responsible for architecture and roadmap for custom data center products. He has been heavily involved in industry collaborative efforts to enable chiplets and interfaces for the last 10 years and is very excited about the future of advanced packaging.



Rishi Chugh – VP, Product Marketing, Network Switch Business – responsible for product marketing and management of the network switch product line for the cloud and AI market segments. He has more than 20 years of experience in product management and design of infrastructure ASICs and high-speed interconnect technologies and is currently focused on CPO and CPC technology for scale-up and scale-out deployments.


