Micron's AI Strategy: A Conversation with Sumit Sadana
How is the role of memory and storage being redefined to unleash the full potential of next-generation AI architectures? 🏗️
Find out in this conversation from the Six Five Summit: AI Unleashed featuring Semiconductor spotlight speaker Sumit Sadana, EVP and Chief Business Officer at Micron. He joins host Patrick Moorhead for a great discussion on transforming the very foundation of the AI era.
Key takeaways include:
🔹Redefining Memory for Next-Gen AI: Explore how Micron is fundamentally transforming memory and storage to support cutting-edge AI architectures, driven by a data-centric approach and the evolving demands of AI workloads.
🔹Conquering the Memory Bandwidth Bottleneck: Micron has made focused efforts to ease the critical memory bandwidth bottleneck, including HBM3E and HBM4E with custom base die designs, bringing LPDDR into the data center, and exploring memory-centric compute approaches such as processing-in-memory.
🔹Democratizing AI Beyond Hyperscalers: Understand how Micron is making AI accessible to businesses of all sizes, extending its impact beyond hyperscalers, and contributing to societal advancements, including through edge AI and initiatives to reduce poverty.
🔹The Future of Edge AI & Global Impact: Gain forward-looking insights into the full arrival and maturity of edge AI, what its widespread adoption signifies for the entire global digital ecosystem, and its unique contributions to the AI revolution.
Learn more at Micron.
Watch the video below at Six Five Media, and be sure to subscribe to our YouTube channel, so you never miss an episode.
Or listen to the audio here:
Patrick Moorhead: Welcome back to the Six Five Summit. And we are talking about one of my favorite topics. Okay, it's my favorite topic, and that's semiconductors. We're joined here by Sumit from Micron. You know, HBM, memory, and storage are becoming absolutely paramount to fully executing this build-out in AI. And I really appreciate you joining the show.
Sumit Sadana: Thank you for having me, Pat. I'm really excited to be here.
Patrick Moorhead: Isn't it amazing? You know, I celebrated my 35th year in tech this month, and memory and storage, man, you'd better hang on, it's going to move, right? Roller coaster going up, roller coaster going down. But man, it is up, up, and to the right. And when I start hearing about things like HBM as the catalyst of the industry, and then all the stuff on DDR5, I'm getting way ahead of myself. I'm here to ask you questions. But thank you for joining us here. So the first question is, how are you rethinking the role of memory and storage in enabling all these next-generation AI architectures? I mean, regardless of hyperscaler, data center, tier two, neo cloud, client computing, the car, AI is literally everywhere.
Sumit Sadana: Absolutely. AI is such an exciting secular growth trajectory for us and for the whole industry, and we just reported earnings yesterday, so you have seen a lot of the growth that AI is driving for us. If we take a step back and look at the overall momentum that AI has gained over the last couple of years since the advent of ChatGPT and generative AI, we see it as four important exponential trends that have come together. The first one is the radical improvement in compute architecture and compute capability. These have been driven by GPUs, which have really enabled AI-based hardware to get to the next level of capability. And very much hand in hand with the compute capability is the memory subsystem capability required to feed the processor all of that insatiable amount of data. So really, the exponential trend in compute capability is the first important one. The second one is the dramatic improvement in software capability. Generative AI itself was a big leap, because if you look at the last couple of decades, the AI field has been making progress for many, many years, but this was a huge step function. And now, of course, a lot of new ideas are coming to the forefront to make generative AI even better. The third one is the massive amounts of data that we all have access to and how the models can be trained on immense volumes of data. You've heard about the parameters in these models growing from billions to hundreds of billions to now trillions. So huge amounts of data, and these models, through the hardware and software capability, are able to extract a lot of insight, knowledge, and intelligence out of it. And the fourth one is the ubiquitous connectivity between devices, which then generates even more data and feeds this trend. So those four exponentials are coming together, and it starts off in the data center. We have huge growth in the data center going on; that is what has underpinned the results we just reported yesterday, with 50% sequential growth in HBM alone. High bandwidth memory is now more than a $6 billion annualized revenue run rate for us, which is great when we think about this whole AI revolution. If you think about GPUs as the brains of AI, then you have memory as the heart of AI. Just like the heart pumps blood to all parts of the body and enables the brain to do its thinking, memory has to feed all of the data these processors have to process, and memory bandwidth oftentimes, especially in inferencing, becomes the bottleneck of system-level performance. So we are doing a lot of work with our customers on how to make these memory subsystems a lot more capable. And as AI grows, it is not going to just stay in the data center. It's going to go to the edge, into automobiles, PCs, smartphones. So it is a really exciting time, and we are just at the start of that big wave of AI growth for at least the next decade, if not two.
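As a rough illustration of that memory-bound inferencing point, here is a minimal back-of-envelope sketch; the model size, precision, and bandwidth figures below are assumptions chosen for illustration, not numbers from the conversation:

```python
# Illustrative only (assumed values, not figures from the conversation):
# in a memory-bound decode step, each generated token streams the model's
# weights out of memory, so tokens/second is capped near bandwidth / bytes-per-token.

def max_tokens_per_second(params_billion: float,
                          bytes_per_param: float,
                          mem_bandwidth_gb_s: float) -> float:
    """Upper bound on single-stream decode throughput for a memory-bound model."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return mem_bandwidth_gb_s * 1e9 / bytes_per_token

# Hypothetical example: a 70B-parameter model held in FP16 (2 bytes/param) on an
# accelerator with ~5 TB/s of memory bandwidth tops out near ~36 tokens/s per
# stream, no matter how much raw compute sits next to the memory.
print(f"{max_tokens_per_second(70, 2.0, 5000):.0f} tokens/s upper bound")
```

Under these assumed numbers, doubling memory bandwidth roughly doubles the achievable tokens per second, which is why feeding the processor matters as much as the processor itself.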
Patrick Moorhead: Yeah, I agree with the decade. It is exciting. I was one of a hundred people invited to the original announcement with Satya and Sam Altman, and I was sitting in the audience having a lot of thoughts. One of them was about architecture and data, and it's a well-known understanding now that there is indeed a memory bandwidth bottleneck out there, and that it is a challenge for AI. It's hard enough to have an application address a ton of memory and a large model, but now we're into this idea of reasoning engines, where even inference is difficult. I'm curious: you invest a tremendous amount of money into R&D, and you've got a multi-year roadmap. I've seen you talk about exploring new memory-compute architectures to address this challenge.
Sumit Sadana: Yeah, you bring up a really important point. This memory bandwidth bottleneck is an important one, I wouldn't say to solve, because it's not easy to have memory bandwidth grow at the same rate at which the number of cores and GPU performance grows, but at least to improve on in as meaningful a way as possible. If you step back and think about the bigger problem, you have the memory bandwidth issue and you have a power consumption challenge in the data center. In aggregate, there are a few ways in which you can solve the memory bandwidth challenge, or at least improve it, but they may not be the most efficient ways, or they may violate other boundary conditions related to power consumption. So one focus in all of the innovation we have been driving is power consumption: improving it from one product generation to the next, as well as improving it versus the competitive bar. That has been a huge focus for us. Another big focus has been leadership on the process technology side, because every new node of process technology gets us better performance and more power efficiency. And the idea behind both of these is that as our products become more power efficient, they enable the GPUs to do more. You can think of it in two ways: you can either have a data center consume less power than it otherwise would, or, within a given power envelope for a data center, you can run more compute than you would otherwise be able to. So in thinking about the memory bandwidth issue, we are coming at this problem from many different angles. Power is a very important one, because if you take the example of HBM3E, Micron's HBM3E has 30% lower power consumption than the next best competitor. And this is in an industry where 1 to 3% power consumption differences are very typical between competing products from different companies. So 30% is like an order of magnitude better than what is typical between competing products. What this does is enable the processor to run at higher performance within the same power envelope, so we improve memory bandwidth from that angle. The other thing we do to reduce overall power consumption, so that aggregate memory bandwidth and aggregate memory capacity can be increased within the same power envelope, is to focus on bringing LPDRAM, which is low-power DRAM, into the data center. Low-power DRAM has thus far been relegated to devices like laptops and smartphones, where battery life is a critical factor, so those products end up using low-power DRAM. Micron had the idea of taking low-power DRAM into the data center, and we had to solve a lot of technical challenges to do that, most importantly RAS-type challenges: reliability, availability, and serviceability. The DDR5-based products have a very different RAS profile than LPDRAM. So we pioneered the use of LPDRAM in the data center. It is now shipping in volume with one of the largest customers in the world for AI products and AI memory, and it has become a very significant business for us. We just mentioned that LPDRAM and high-capacity DIMMs in the data center have together become a multibillion-dollar business for us.
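A minimal sketch of that power-envelope argument, using assumed numbers rather than Micron specifications: if each memory device draws less power, more devices, and therefore more capacity and aggregate bandwidth, fit inside the same fixed memory power budget.

```python
# Toy model of a fixed memory power envelope. All numbers are assumptions for
# illustration, not Micron product specifications.

def devices_in_envelope(envelope_watts: float, watts_per_device: float) -> int:
    """How many memory devices fit inside a fixed power budget."""
    return int(envelope_watts // watts_per_device)

ENVELOPE_W = 1200.0                            # assumed memory power budget per server
BASELINE_DEVICE_W = 10.0                       # assumed power per baseline device
EFFICIENT_DEVICE_W = BASELINE_DEVICE_W * 0.7   # the 30% lower power cited above

baseline = devices_in_envelope(ENVELOPE_W, BASELINE_DEVICE_W)    # 120 devices
efficient = devices_in_envelope(ENVELOPE_W, EFFICIENT_DEVICE_W)  # 171 devices

# Under these assumptions the same envelope holds roughly 42% more devices, which
# scales aggregate capacity and bandwidth, or the saved watts can go to processors.
print(baseline, efficient, f"{efficient / baseline - 1:.1%} more devices")
```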
So this is another big way: as we take LPDRAM into the data center, processors can connect to more DRAM and have more aggregate bandwidth, helping solve some of this memory bandwidth challenge, because within that same constrained power envelope you can have a lot more DRAM in the system. The next one I will mention is our HBM roadmap. We have just sampled HBM4, and the generation after that is HBM4E. For HBM4E, we are doing custom base die designs in partnership with our largest customers. A custom base die enables our customers to move some blocks of logic from their processor, GPU, or ASIC onto the base die, and that enables much better power efficiency, performance, and memory bandwidth in the aggregate system. By aggregate system I mean whether you're connecting just to the HBM or to LPDRAM around it; the overall system design just becomes higher bandwidth, higher performance, and much more capable. This is a tremendous effort done in partnership with our customers, we are very excited about it, and it is another important vector. The last one I will mention is that we are doing a lot of innovation on our roadmap related to new architectures and new approaches. One example is bringing processing closer to memory; think of what is called PIM, processing in memory. That is one approach, and there are other approaches we are pursuing as well. In all of these approaches there is a varying level of impact on how the software stack in the system sees the memory: does the new architecture need changes to the software stack or not? Depending on whether the software stack has to change to see and use the new architecture, it obviously creates a new level of friction in adoption. So we are going through the pros and cons of many different approaches to solve and improve the memory bandwidth issue. We are working closely with customers, running different test cases with them, and it's a very exciting time to be in memory because there's so much opportunity for differentiation, which is what we love.
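To illustrate why bringing processing closer to memory is attractive, here is a toy model under stated assumptions (it is not a description of any specific PIM product): for a reduction such as a dot product, the conventional path moves every operand across the memory interface, while a PIM-style path computes near the data and moves back only a small result.

```python
# Toy comparison of bytes crossing the memory interface for a dot product,
# conventional path vs. a PIM-style path. Purely illustrative assumptions;
# not a model of any specific product.

VECTOR_LEN = 1_000_000   # assumed number of elements per input vector
BYTES_PER_ELEMENT = 2    # assumed FP16 operands
RESULT_BYTES = 4         # assumed FP32 result sent back to the host

# Conventional: both input vectors stream across the interface to the compute die.
conventional_bytes = 2 * VECTOR_LEN * BYTES_PER_ELEMENT

# PIM-style: the multiply-accumulate happens near the memory arrays, so only the
# scalar result moves (command/metadata traffic is ignored in this sketch).
pim_bytes = RESULT_BYTES

print(f"conventional: {conventional_bytes:,} bytes over the interface")
print(f"pim-style:    {pim_bytes:,} bytes over the interface")
print(f"reduction:    ~{conventional_bytes // pim_bytes:,}x less data movement")
```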
Patrick Moorhead: Yeah, it is a great time to be in memory. As I step back and reflect on what you just said, the amount of innovation is mind-blowing. A lot of people I know get fixated on the compute side, but quite frankly the compute can't do anything without the memory, and on the data side as well, the ability to store it. We're creating more data in a year than has ever been created before it. And I think the way you're approaching it works: LPDDR as a way to reduce power in one area, lowering the overall power footprint and increasing density while reducing power with HBM3E, and then in the future with HBM4E. By the way, I have done some research on custom HBM4, and I even did a video with one of your partners, Marvell, as well, so I find that fascinating. In-memory processing is definitely in our future outlook here, and I think we're going to get there; the whole power piece, I think, is the biggest driver, which is pretty cool to see. But let me move on. Right now the fixation in AI seems to be on the hyperscalers, and I get it, given the capex being invested; you're a huge beneficiary of that. But the reality, and we've seen this time and time again historically, is that, at least in our thesis, this moves from hyperscalers and tier 2 CSPs to enterprise AI to the industrial edge, and then finally, and by the way, this is in no particular order of when it's going to hit, to devices like PCs and smartphones. I'm curious, how are you enabling, or let me use the word democratizing, AI for these different use cases and platforms?
Sumit Sadana: Yeah, we think about the concept you just brought up, democratizing AI for various categories of customers, and we also think about how we democratize AI for the masses, meaning having more people in society get access to more of these AI capabilities. Let me touch on both of those, because both are important in different ways. If you think about the data center first, you're right: right now a lot of the capex is driven by the largest hyperscalers in the world as they build out all of these AI factories. These AI factories have huge amounts of compute capability, of course, GPUs and ASICs for processors, and massive amounts of memory, and these AI servers also have huge amounts of data center SSDs. That's a big play for us as well; we are very proud of our trajectory of getting to record share quarter after quarter there. But as we think about expanding all of this, you're seeing so much of an embrace of AI from sovereign countries wanting to implement AI in data centers in their own country, for their own societies, and have the AI trained on their own localized data sets. This is important to a lot of countries for their own growth purposes, so a lot of companies focused on those markets are coming up around the world, and that's going to be a growth market for many years to come. We are also seeing OEMs that supply these AI servers to some of these hyperscalers starting to supply them to captive, smaller installations for enterprises in the Fortune 1000, for example. There is some data that companies are happy to put in the cloud, and there is some data that, for security and privacy reasons, has to stay within the four walls of the company. That is where private clouds or on-prem installations become important, and training on data sets that are company-specific, with very confidential, highly secure requirements, also creates the opportunity for that part of the market to grow quite a bit. So within the data center there are various pockets of growth that are going to continue to become much bigger in the future. And as you pointed out, this is starting out in the data center, but it's going to proliferate from there. The industrial opportunity is just massive. Initially it's going to be more AI getting infused into a lot of existing industrial devices, but you can very well see huge amounts of automation being enabled by AI, and huge amounts of growth coming from robotics. Robotics has been around for a while, and automation through robotics has been around for some time, but it is getting new wings of growth because the capability of these robotic subsystems, and even humanoid robots in the future, is being accelerated with the use of AI. Then, when you combine the conversational capability that generative AI has created to extract knowledge, insights, and intelligence out of massive quantities of data with robotics in humanoid form, you can get extraordinarily capable assistants of all kinds, whether for dangerous work or repetitive chores. There is a huge opportunity for robotics to grow over the next 20 years in a very, very big way.
And then, of course, you look at automobiles: you're seeing the growth of autonomous vehicles, with Waymo having completed so many tens of thousands of rides and now Tesla launching its service. These are early days, and there will be ups and downs, but the technology is evolving super rapidly with the help of AI, and I have no doubt this trend is going to gain momentum over the next five to ten years as well. So there are a lot of these structural trends. And as you mentioned, PCs and smartphones are going to have a lot of new, compelling applications. The underlying capability to create these applications exists today; now it's a matter of perfecting them and ensuring the right level of privacy is maintained so people don't freak out when their personal data gets accessed in ways they didn't envision or, heaven forbid, didn't agree to. All of those things have to be taken care of. It's taking a little bit longer, but there are applications that don't have to wait very long. For example, real-time translation: being able to speak to people in different languages in a seamless, real-time way, using just your phone or your PC on a Zoom call, making it a very simple, frictionless experience compared to what it has been in the past. These are just simple examples. Then, when you think about democratization for the masses, think about access to all human expertise on these cloud platforms at a very low cost per token. As we keep improving the hardware platforms and the software, the cost per token keeps going down, and now you can have very broad access for the masses to all kinds of expertise. You want to draft a legal letter? ChatGPT can do that for you. You have a question about odd symptoms you're having and want to know what it could be before you wait two weeks, or two months, to see a doctor? You can get a lot of input from these generative AI models. And think about the evolution, the revolution, in education that can happen: tailored education modules can be created for students of different caliber within a class, and it can be a huge aid for teachers that can totally revolutionize education. So in any field you look at, the sky's the limit in terms of not just productivity but bringing the benefit of AI to the masses as well. Really exciting times ahead.
Patrick Moorhead: Wow, you said a lot there, very meaningful things. Even as recently as Davos, the biggest tech discussion was the air-gapped sovereign AI cloud. And now we see what popped up in the Middle East with all the deals that were cut there, exacerbated by tariffs. In that discussion, I'm convinced there is 100% something there: everybody wants their own AI cloud, and they want it on their own edge. What some people forget is that if you look seven, eight years ago, when the edge was hot, compared to now we have between 100 and 1000x the AI performance per watt, and you're just going to be able to do a whole lot more. Old-time robotics used to spend most of the expense on setting up the robots, because they had to be within millimeters, and sometimes microns, of perfection. Now, with object recognition, a robot acts more like a human, adjusting to a changing environment. And I love what you said about democratizing benefits for society. I know a lot of people like to be really down, but look at what we've done in the past 50 years: we've almost eliminated poverty. Okay, not completely. Imagine if we could take that to education and health care; it would be absolutely game-changing for society. I'm really glad you brought that up. Sumit, this has been a great conversation, and I could camp out here for an hour, but I want to wrap up with a question. My research firm tracks probably a hundred different semiconductor companies, and we track three to four memory and storage companies. I'm curious, how are you different when it comes to AI? In aggregate, in the big picture, but also versus your competitors.
Sumit Sadana: Yeah, it's a great question, Pat. Thanks for having us discuss this, because I'm very passionate about Micron's place in the ecosystem. We are very proud to be the only US-headquartered memory company of scale in the world, and we are the only manufacturer of memory in the US. Today only 2% of the world's DRAM gets manufactured in the US, and that is done by Micron in our Manassas, Virginia facility. But we are on a huge project: as you know, we announced a $200 billion investment for R&D and manufacturing in the US over the next 20-plus years, $50 billion in R&D and $150 billion in manufacturing. This includes building two fabs in Boise, Idaho, following that up with up to four fabs in upstate New York, modernizing and bringing 1-alpha DRAM technology to our Virginia fab, as well as doing advanced packaging. We didn't get a chance to speak much about packaging, but packaging is a huge, huge deal, as you know, and you've spoken about that in the past in your videos. So leading-edge packaging technology investments from us are hugely important for the US. I think that is one huge differentiator in terms of our US headquarters, our US presence, and the US investments we are making on behalf of customers, and our customers are very excited about that. The other aspect is that you have known Micron for many years, and our customers and people in the ecosystem have known Micron for decades. We have been around for a long time, but this is not the same Micron that people have known for decades. Micron today is better positioned competitively than at any time in our history, and let me give you a couple of data points. In leading-edge DRAM technology, four generations in a row, Micron has led the entire world in time to market in coming out with the latest node of DRAM technology, and not by a small amount, sometimes by as much as a year. These are huge improvements in our technological capability. On the NAND side, we have led the world three nodes in a row. Our product portfolio has never been stronger, and I think when the story of Micron over the last five years or so is written, it will show that our product portfolio has taken giant leaps. We are so proud of the work the Micron team has done and innovated with our customers, because we have so many industry-first and industry-best products in our portfolio today. I gave you two examples of that with HBM3E and with LPDRAM. These innovations are very meaningful to our customers; they are not small deltas to the competitive baseline, they are complete game changers. And when we look at data center SSDs, we have been on a very successful trajectory of improving our share in some of the most complex products in the industry. I'll just end there: we have been gaining share in all of the high-profit pools of the industry across the board. It is a really exciting time, and as we discussed, and as you have mentioned a lot in your videos, AI is a phenomenal growth opportunity over a decade or two, and we are so excited to be able to co-innovate with our customers for the benefit of AI for all of society.
Patrick Moorhead: Yeah. On the differentiation side, I spent a lot of time with Jeremy on your side.
Sumit Sadana: Yes.
Patrick Moorhead: Over the past couple of years, and the amount of firsts and the time-to-market advantage were evident. It's great to see you hitting on all cylinders on HBM. And I know the world is more than HBM, but you're striking while the iron's hot. The whole DDR5 transition is exciting, and I think it's meaningful, particularly when it comes to performance and server consolidation out there in the data center. So, Sumit, I really appreciate you coming on and taking us through that. It was great to see you again, and I hope we can do it again sometime.
Sumit Sadana: Thank you Pat, and I really appreciate your time and as always I've enjoyed our discussion and I look forward to the next one.
Patrick Moorhead: Definitely. So that wraps up our Semiconductor Spotlight session here with Micron. Whether it's the core hyperscaler data center or the edge, I hope you're all convinced out there that memory is fast becoming the most important element when it comes to AI, and Micron has a lot of firsts in it. It was great to see how they crushed their earnings recently. And I know earnings aren't everything, but it is a very important metric that shows customers are buying a lot of memory from Micron, and it also lets them fill their coffers to build out facilities and invest in R&D for the future. Hit that subscribe button. Thanks for being part of the community. Take care.
Disclaimer: The Six Five Summit is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Speaker
Sumit Sadana is executive vice president and chief business officer at Micron Technology. Sumit is responsible for the company's P&L and all the company's business units, driving revenue and profitability and positioning the company for success through strategic partnerships with customers. The business units are also responsible for product roadmap definition and aligning company R&D initiatives with market and customer requirements. Sumit's organization also includes the company's strategy and corporate business development, global communications and marketing, as well as Micron Ventures (Micron's venture capital investment arm). Sumit joined Micron in 2017 and has over 30 years of technology industry experience, in roles ranging from chip design, software development, operations management, strategy development, and IP licensing to executive roles such as CTO, CFO, and GM. He also served in leadership positions at SanDisk, Freescale Semiconductor, and IBM. Sumit has completed approximately $40 billion of M&A in his career. He has served on the board of directors of Silicon Labs, an industry leader in IoT, since 2015 and was appointed lead independent director in 2022. Sumit graduated from the Indian Institute of Technology with a bachelor's degree in electrical engineering and earned a master's degree in electrical engineering from Stanford University.