An Interview with AMD CEO Lisa Su About Solving Hard Problems – Stratechery by Ben Thompson
On the business, strategy, and impact of technology.
Good morning,
This week’s Stratechery Interview is with AMD CEO Lisa Su. Su began her career at Texas Instruments, after earning her PhD in electrical engineering at MIT, where she played a significant role in developing silicon-on-insulator transistor technology. Su then spent 12 years at IBM, where she led the development of copper interconnects for semiconductors, served as technical assistant to CEO Lou Gerstner, and led the team that created the Cell microprocessor used in the PlayStation 3. After a stint as the CTO of Freescale Semiconductor, Su joined AMD in 2012, before ascending to the CEO role in 2014.
Su has led a remarkable run of success for AMD over the last decade. After decades of being an also-ran to Intel, AMD has developed the best x86 chips in the world, and continues to take significant share from Intel in datacenters in particular. AMD has also been a major player in console gaming, in addition to its traditional PC business and graphics chip business. That GPU business is now increasingly at center stage, as AMD takes on Nvidia in the market for datacenter GPUs.
In this interview, conducted a day after Su’s Computex keynote, we talk about Su’s career path, including lessons she learned at her various stops to the top, before discussing why AMD has been able to achieve so much during her tenure. We discuss how the “ChatGPT” moment changed the industry, how AMD has responded, and why Su believes the long-run structure of the industry will ultimately work in the company’s favor.
As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player.
On to the Interview:
This interview is lightly edited for clarity.
Lisa Su, welcome to Stratechery.
Dr. Lisa Su: Thank you. It’s great to be here.
I’m truly honored to have you here. I’ve had the opportunity to talk to a lot of your peers in the semiconductor space, and without fail one of the bits of feedback I get from my subscribers is, “When are you going to talk to Lisa Su?”. I want to thank you for helping me now avoid those emails going forward.
LS: Thank you, I appreciate that. I’m very happy to have this opportunity to chat.
I know you don’t want to talk about yourself too much, but I need some fact verification here. We were just talking before we started recording, you were born in Taiwan, emigrated to the US when you were very young, eventually ended up at MIT, where as legend has it, you were deciding between computer science and electrical engineering, and chose electrical engineering because it was harder. Is this true?
LS: It is actually true. I was always around math and science, my parents were always saying, “You have to do these hard things”. When I went to MIT, at that time it was a decision between electrical engineering and computer science. Computer science, you could just write software programs, whereas electrical engineering, you had to build things. I wanted to build things.
They had to actually work, right?
LS: Yes, that’s right.
Your PhD was focused on silicon-on-insulator technology, and then you went to IBM — you pioneered using copper interconnects on chips. I have three questions about your IBM experience and what lessons you might’ve learned. Number one, when it comes to the copper interconnect, you said something, I think it was to the MIT Review, where you were ready to do something new, but your boss made you stay, and you felt like the actual learnings you accumulated in that time when you thought you were done were some of the most impactful. What were those learnings?
LS: I really learned so much when I was at IBM, it was the early part of my career. When you go to school and when you get a PhD, you think the sexy thing is the research that you do and the papers that you write, which we all write papers and things like that.
When you actually join a company and join a project, these projects are usually several years to complete something. But the “sexy stuff” is at the beginning when you’re coming up with the new ideas.
What I learned actually was one of the very first products that I worked on was a microprocessor with copper interconnects, and it turns out that the last 5% of what it takes to get a product out is probably the hardest, where most of the secret sauce is. And if you learn how to do that, then frankly, it’s—
All the software engineers are saying, “Hey, it’s the same thing for us. Don’t you know?”
LS: (laughing) That might be true, that might be true. But we all have this view of what “secret sauce” is. It’s things like yields, reliability, when something goes wrong. When you’re trying to produce millions versus producing five of something, you learn a lot, I learned a ton.
Yes, as a young researcher you think, “Hey, I’m ready to move on to my next thing”, and you realize that it’s so rewarding to see your product actually ship and go on the shelf and you can walk into Best Buy and buy it. Those are the types of things that I learned.
How much, even today, do you feel your time and attention ends up being balanced between what you’re building going forward versus actually executing and getting what you’ve promised out the door?
LS: Certainly today, I personally spend a lot of time looking forward in terms of roadmaps, forward in terms of technologies.
You just have more patience for the rest of the company that’s trying to get it shipped.
LS: That’s right. Frankly, a lot of time on customers and markets, and where’s the market going and where should we be investing.
Just out of curiosity, how deeply do you need to be involved in things like, not you specifically, but AMD generally, now that you’re fabless, in thinking about that actual final mile? What’s the degree of interaction with, say, a TSMC or your packaging partners or whatever it might be, and actually getting the yields up and out the door?
LS: It’s definitely true for us as a fabless company or a design company. We are actually doing end-to-end development, so you can imagine, from day one of the concept of a product — actually even before that, we’re thinking about what technologies are going to be ready, what are the next big things that we should bet on? That goes all the way through. Sometimes it could be a five-year cycle or even longer before the technology actually comes to fruition. We’re right there at the end as well, ensuring that it ships with high quality, at the right yields, the right cost structure, in high volume production.
So, it’s really end-to-end and the difference is it’s not all in one company, which you would see in a more traditional integrated manufacturing model, it’s through partnership. We found that actually it works extremely well, because you have experts on all sides working together.
The second IBM lesson I’m curious about is you worked on the Cell processor that shipped in the PlayStation 3. That chip was a technological marvel, but the PlayStation 3 is viewed as the least successful PlayStation, which drove a real shift in Sony strategy in the long run, away from hardware differentiation towards exclusives. I guess this is a two-parter, but number one: What did you take away from that experience? Number two ties into this: How much impact did that have on your later gaming experience? The gaming experience question is obvious. I’m more curious, was there a management takeaway from all the work you put into the Cell processor and the reality of how that manifested in the market?
LS: Yeah, it’s interesting that you mentioned that. I’ve been working on PlayStation for a long time, if you think about it, PlayStation 3, 4, 5…
It’s like the common thread through your career.
LS: Across multiple companies, yes. What I would say honestly is, these are decisions that are made that are more around architectural decisions. From that standpoint, whether you talk about any of the PlayStation consoles or some of the other work that we’ve done in partnership — we being AMD, but it was similar during that time at IBM — it really is a close collaboration of what the customer or partner is trying to achieve.
The Cell processor was extremely ambitious at that time, thinking about the type of parallelism that it was trying to get out there. Again, I would say from a business standpoint, it was certainly successful. As you rank things, I think history will tell you that there may be different rankings.
My perspective is, the console era has gone through phases, and in that PlayStation 1 and PlayStation 2 phase, they made smart hardware decisions, and that differentiated their approach from Nintendo in particular. But once you went to HD, you had a tremendous increase in the cost of asset creation, you had developers heavily motivated to support multiple processors, you had game engines coming along. Suddenly, no one wanted to take on the burden of differentiating on the Cell, they just wanted to run on the Cell.
LS: Perhaps one could say, if you look in hindsight, programmability is so important.
Right.
LS: To have real business success on day one of anything, we have to think about both hardware and software. As we’ve seen, one of the things that I’m very proud of in the work that we’ve done at AMD over the last 10-plus years, with PlayStation 4 and PlayStation 5, is that we’ve always had new leaps in hardware.
Much easier.
LS: And they come with compatibility to the previous generations, which is super helpful.
Number three: You were Lou Gerstner’s technical assistant for a year. What did you learn from him? This is a pro-Lou Gerstner podcast, for the record.
LS: (laughing) You’ve done your homework, haven’t you? The year that I spent with Lou was one of the most educational experiences of my career. IBM was a fantastic company at talent development, so they identified people earlier in their career and they said, “Hey, what types of experiences would you like?”.
In my case they asked me, do you want to go on the technical track or do you want to be more on — let’s call it — the management track, the terminology would be an IBM Fellow or an IBM Vice President. Honestly, I didn’t think I was going to be smart enough to be an IBM Fellow. There were people like Bob Dennard, who are—
I feel like there’s people that would dispute that characterization.
LS: There were amazing people there, and so I was like, “Okay, let me try this thing at management and business”. They gave me an opportunity to spend a year with Lou, he’s just an amazing person. If you think about someone out of school five years who has really only done, let’s call it, pure engineering, and then getting to sit for—
It’s basically the best MBA in the world.
LS: Yes! Yes, it absolutely was, and what was most interesting to me is really understanding where he spent his time. The time was always trying to learn, was very externally focused, and understanding what’s going on in the market, what’s going on with customers.
That ties into what you were saying about what you do today.
LS: Exactly. How does that change your strategy and how does that change how you guide the leadership team? The fun thing that I got to do is, I got to teach him about technology. I would say, “Hey Lou, there’s this interesting new thing, Napster, people are downloading it” — I don’t know if you remember that.
I do, I introduced my college dorm floor to Napster. That was my claim to fame in college.
LS: I got to introduce Lou to Napster. We were really thinking about what digital rights management meant at that point in time, those were just the types of things that we got to think about.
The other thing that I always appreciate about Lou Gerstner is, to your point, it wasn’t just looking outside and understanding the market, what’s going on, but really understanding what was IBM, what was IBM intrinsically capable of and uniquely differentiated at. Basically my perspective is, IBM was big, and what does that actually mean? What can you actually bring to bear in a way? The whole middleware revolution, and look, we can solve this Internet problem for companies that are even older and bigger than us, and that’s going to be a differentiated thing. But then, obviously, it all fell apart. IBM should have done the cloud, Lou actually wrote that in his book, I don’t know how much looking backwards that was. If you had taken over after him, could you have led IBM to greater heights?
LS: I don’t know that I would’ve been on that path. I was a semiconductor person, I am a semiconductor person. If I really think about, IBM was such a wonderful career for me, but if I wanted to stay a semiconductor person, I had to go to a semiconductor company.
You went to Freescale [Semiconductor] after that, much more in a business role.
LS: Yes.
Was there a personal admission of, “Okay, I’m a business person now”, or was that in the choice when you went that direction?
LS: I have always straddled the technology/business line. At Freescale I actually started as CTO. I joined as CTO, and then over a couple of years I ended up running the networking and multimedia business. That was definitely a choice, and the choice is, at the end of the day, I want to drive outcomes, and to drive outcomes, yes, the technology is great, but you also need to have the right business strategy.
Is that the limiter for a lot of technical people, that they under-appreciate all the drivers of outcomes that have nothing to do with technology?
LS: I think that that is something that technologists have to learn. And by the way, there are phenomenal CTOs who truly understand that. My CTO right now, Mark Papermaster, he was my partner in crime at IBM, we grew up together, and then we’ve been partners here at AMD, he truly understands that technology is great, but you also need to drive business outcomes. That’s what I love about what I get to do, because yes, I get to put together great tech with a phenomenal team, but there’s also an opportunity to drive very significant business outcomes.
Let’s talk about AMD. I mentioned the console strategy earlier, that was a big shift in focus when you came on board. Was that just a view like, “Look, this is an easy win, high volume, we can get back in the game”? What was the thinking there?
LS: Well, I would never say anything is an easy win.
Yeah, fair enough.
LS: Let me start with this notion of when I first joined AMD, we were probably 90 plus percent in the PC market and by the way, I really like the PC market. We’re going to talk more about it I’m sure.
Absolutely. You spent the first 45 minutes talking about the PC market yesterday.
LS: Yes, so I’ll caveat with that. But the PC market goes in cycles and the cycles can be quite dramatic.
Very painful.
LS: They can be — I would use the word quite dramatic. So from a business strategy standpoint, it was really important for us early in those AMD days to diversify and get to a strategy where the underlying principle is around high-performance computing. We are a computing company, we are great at building computing capability, and now what are the markets that can really utilize that? Gaming is one of those markets, and we are very fortunate that both Sony and Microsoft, as leading console manufacturers, chose us.
Who was driving that shift to x86 in consoles? How much was this Sony learning the lesson of Cell? Was that you going to them and saying, “Look, this is the way to go”? How did this commonality of architecture develop?
LS: Yeah, look, I think it was a set of choices, so it was a choice between x86 and other architectures, and if you think about just the developer ecosystem around x86 when you’re thinking about software development, I think that was a very key piece, but I don’t know that the architecture itself was enough. I think it was also the incredible graphics capability, and the fact that, especially if you want to customize graphics, there are very few companies who can do that; AMD was one of them.
Even then, to what extent was there integration between the CPUs you’re delivering and the GPUs? AMD had acquired ATI in 2006, so before your time there, but was there any other company that could have actually delivered what you did for the consoles?
LS: I would say we were unique in what we were capable of doing for two reasons. One is we had the fundamental IP, so the combination of let’s call it the CPU or the microprocessor cores with the graphics IP capability, and we were willing to customize. Frankly, we have huge teams that were put on these projects to customize.
Do you see this as a pattern where initially, it’s all about the cutting-edge, getting the best possible performance, but as it, I don’t want to say slows down, but capabilities commoditize, the customization is more important. I mean, you’ve acquired…I can never pronounce it correctly.
LS: Xilinx.
Xilinx, I haven’t been able to pronounce that word correctly for two years. It sounds like with this customization approach, there seems to be a common thread there. That’s something you want to leverage.
LS: The best way I’d say it is there are a couple of principles. First, the fact is the world needs more semiconductors. Semiconductors, chips, are now foundational to so much of what we do, and much of what we do is, let’s call it, standard products that fit a broad set of use cases. But you do find those high-volume applications, like game consoles, like some of the work that’s being done in the cloud right now, like some of the AI work, that I believe will be customized, and in these cases, because the volume is so high, it makes sense to customize. That’s something that I’ve always believed. It’s part of our strategy, our strategy of deep partnerships. So if you have the right building blocks, then you can work with a broad set of customers to really figure out what they need to accomplish their vision.
Is there a bit though, where as design costs as we move down the process curve are just becoming so astronomically high that there’s a bottom limit to customization, where only AMD has sufficient scale to customize, paradoxically?
LS: I think the important thing is looking at which markets really lend themselves to, let’s call it, significant customization, and it’s not everything. Probably your IoT device, you’re not going to want to do that because it just won’t return on investment. But for large computing capabilities, I think it’s the combination of the right IP plus the ability to work deeply with partners. By the way, I should say, it doesn’t all have to be hardware customization, there’s a lot that we can do in software work as well. I think it is one of the important trends going forward.
So I have to ask, you came to AMD, you were there for a couple of years, then you took over the CEO role. Was that another example of choosing the hard problem?
LS: I think it was. I’ll say it this way, when I joined AMD, it was really this idea that I’ve worked on high performance processors all my life, that was my background, there are very few companies in the United States that you could do that at. I always respected AMD very much as a company that mattered, but I thought I could make a difference, and so joining the company, I realized that, “Boy, there was a lot I had to learn”. Those first couple of years, I did learn a lot about just the market dynamics in this world, but it was also a phenomenal opportunity to make a difference.
Where could you make the difference? We can see the difference — I mean, just look at the stock chart, and we can look at the performance of your chips. So in that context, it’s maybe hard to get in your exact state of mind 10 years ago, but what was your plan? What did you say, “Look, I can do this, there’s something, there’s a path here, I see it”? What was the path that you saw?
LS: What I saw that was very clear is that we had the building blocks of what you needed to build an incredible roadmap. We were very differentiated in those building blocks.
What were those building blocks? Was that IP or was that customer relationships?
LS: It was high performance CPUs and high performance GPUs and if you think about it, those are pretty incredible building blocks. Now, what we didn’t have is a very clear strategy of what did we want to be when we grew up, and then an execution machine that could make that happen.
So from a strategy standpoint, I think we had some choices to make. If you remember back, this is 2014, the exciting thing then was mobile phones, like apps processors. So we would have these conversations like, “Should we go into phones?”, and we were like, “No, we shouldn’t because we’re not a phone company. There are others who are much better at that. We are a high performance company, so we have to build a roadmap that leverages our strengths, and that requires us to revamp the way we do architecture and design and manufacturing”. I knew how to do that; it would take time, you can’t do that in 12 months, and I felt it would take five years. But it was very clear that we had the pieces, we just had to really methodically build that execution engine.
Well, you mentioned manufacturing. AMD had spun out GlobalFoundries before you took over, I want to use the technical term here, how much of a pain in the ass was the eternally amended wafer agreement you had with GlobalFoundries? Was that just something you just had to deal with on a constant basis as you’re trying to execute the strategy?
LS: Well, to be fair, AMD and GlobalFoundries were one company at one time, yeah.
It was there for a reason, it was understandable.
LS: Exactly. So that wafer supply agreement was something that was before my time, but I think it was one of the larger — if I think about the couple of big strategy things that we had to do, it was if you want to build high performance processors, you need the best technology partner, the best manufacturing partner and GlobalFoundries is a great company and they’re still a great partner. It’s just you need scale to be able to build at the bleeding-edge, and the scale didn’t exist.
Was it almost a blessing in disguise when they internalized that and, “We’re not going to 7nm“?
LS: It was a very good decision for both of us and financially, AMD had to —
Yeah, you had to give all the money back that you had gotten originally.
LS: There was a business arrangement, but from a technology standpoint, it was absolutely the right thing to do, and like I said, GlobalFoundries is a great partner for us. I have tremendous respect for [GlobalFoundries CEO] Tom Caulfield as a partner, and I think both companies were better off by focusing on what we were going to be good at.
Well, you were the first high performance chip maker to move to chiplets, and everyone’s headed there now, so you’re definitely in the lead in that regard. Is there a bit where you were actually forced there because of the wafer agreement so that you could do some volume with GlobalFoundries, some with TSMC and still deliver your chips?
LS: Not at all. Actually, I think that was clearly one of the best decisions that we made, it wasn’t that clear at the time.
Yeah, for sure.
LS: But what we were looking at is where was Moore’s Law going, and how were we going to differentiate ourselves? Frankly, our thought process was we needed to bring something to the processor market that was different, so building these big humongous chips that didn’t yield and were very expensive wasn’t going to be the answer.
So I remember very well spending time with Mark and our architects and trying to decide, “Is this the time that we go to chiplets? Is this the time we’re going to bet the company on going to chiplets?” And we said, “Yes, it is because we’re going to get to much higher performance, many more cores, as well as a much better cost point”, and it gave us tremendous flexibility, and we learned a lot along the way.
The first generation Zen 1 chiplets were okay, but we had some programming model issues that we had to deal with and that got better with Zen 2 and really, really hit our stride with Zen 3.
When you took over in 2014 and you felt like you could make a difference, I see a couple of big shifts here. So you have the shift to chiplets on your side, and that’s around the time TSMC is beginning its transition to EUV. To what extent did you see those secular shifts in the market, and did that inform your decision that, “Look, there’s something I can do here”?
LS: Yeah, we definitely looked very much at the technology roadmaps and what TSMC was doing, as well as just where the packaging technology was at the time and we decided that this was the time to make the bet. I like to say that the world that we live in is we have to make bets that sometimes take three to five years to come to fruition.
Yeah. I don’t mind asking you about 2014 decisions because that’s often when decisions that matter today were being made.
LS: That’s exactly right, and there was risk associated with that in terms of, “Would we actually get the performance that we thought we were going to get by going to chiplets?”, but we learned a ton, and I think history would say we made the right bet, but at the time, some of our competitors were calling it glue, they were gluing chips together. It’s like, “We’re not gluing chips together”.
And now they’re doing the exact same thing. What’s the balance of credit when you look back over the 10 years, with AMD actually taking the performance lead in a really meaningful way in x86? Where do you balance the credit between your design decisions and being on the leading-edge process from TSMC, and how that paid off?
LS: I really believe that they are inextricably linked.
Yeah, the decisions went together.
LS: Absolutely, and it’s one of those things that we found so helpful, and TSMC is a phenomenal partner in this realm. When you take a lot of design risk, you want to know that your technology is rock-solid so that you know where to spend time and effort.
That’s what TSMC and ASML did, going to 300mm and then going to EUV. That co-partnership proved out that it could be done, and then you were able to do it with them, following on from that.
LS: That’s right, I think it’s been a very synergistic partnership.
AMD’s most consequential moment before your tenure was actually, we were talking about this earlier, when they extended x86 to 64-bit and dragged Intel kicking and screaming in that regard, and it’s kind of a hardware and a software story. That was before your time, but it strikes me that one of the ongoing critiques of AMD is the software needs to be better. Where is the software piece of this? You can’t just be a hardware cowboy. When you joined, was there a sense of, “Look, we had this opportunity, we could have built on this over time”? What is the reticence to software at AMD and how have you worked to change that?
LS: Well, let me be clear, there’s no reticence at all.
Is that a change though?
LS: No, not at all. I think we’ve always believed in the importance of the hardware-software linkage and really, the key thing about software is, we’re supposed to make it easy for customers to use all of the incredible capability that we’re putting in these chips, there is complete clarity on that.
I think what you will see is that we’ve actually been on several arcs of technology development. So, the CPU arc and everything that we’ve done to build the Zen product portfolio. Now, we just previewed Zen 5 here at Computex, in the data center, and then launched it in the client products. That particular arc was one arc and now we’re in sort of the next arc, which is around—
The GPU arc.
LS: Yeah, AI and GPU.
I do want to ask you one other thing. As far as this trend, we talked about the chiplet trend, we talked about the EUV thing. How important was the rise of hyperscalers to your success? Because what I see from that is, they’re buying at such scale, they will actually do LTV calculations to say that, “Look, yes, these AMD processors are worth it in the long run”. And number two, to the extent there is a software hole, they will do the work to fill that in, because they can see the long-term benefit. Did that impact when you were thinking about what we can actually win here? Was that a driver?
LS: Yeah, it’s a great point. When you think about high-performance computing and just how things have changed, the fact is, the hyperscalers are such a significant piece of the overall market that we have spent a lot of time there, and the point that you make is absolutely true, which is, you’d like to think that in every market the best product always wins, but that’s not necessarily true. In the hyperscaler market, the best product does win.
Yes.
LS: And we were able to show that. Frankly, the key thing in this market is, it’s not enough to win once and it’s not enough to win temporarily.
You have to win the roadmap.
LS: You have to win the roadmap and that was very much what we did in that particular point in time.
And so when you come in 2014, you’re like, “Look, I can see a roadmap where we can actually win”.
LS: That’s right.
And there’s customers coming along that actually will buy on the roadmap.
LS: That’s right and by the way, they’ll ask you to prove it. In Zen 1, they were like, “Okay, that’s pretty good”, Zen 2 was better, Zen 3 was much, much better. That roadmap execution has put us in the spot where now we are very much deep partners with all the hyperscalers, which we really appreciate and as you think about, again, the AI journey, it is a similar journey.
Yes. Well, one more question on x86. How do you think about the consumer space in conjunction with all this? You think about, say like an Intel, they have to keep the fabs full, so they need to maximize their chips for everything. The point with fabs is, Intel wants to be integrated and there’s a bit where AMD is in a different position so they can meet the hyperscalers where they are better and just make great chips. But is there a volume consideration just because you want to get leverage on your design costs, on your IP investments? I’m just curious how those calculations work in a world where it’s not your fabs on the line, it’s not your billions of CapEx. I’m curious how you think about that differently from an integrated player.
LS: The way we think about it is, it is about scale. Back in 2014-15, we were a $4 billion company, and in that case, you can spend a certain amount on R&D. Last year, we were, whatever, a $22-plus billion company; you can spend a lot more on R&D.
Yes, so basically, it’s still the same calculation by and large.
LS: It is the same calculation of how do we leverage.
But maybe less risk of going bankrupt if you spend way too much on fabs.
LS: Well, I think the key thing is leveraging of the IP. It’s sort of the engines, the compute engines that we have. That’s our absolutely number one priority, is to get those compute engines on a very aggressive roadmap and then, we build products out of that.
What was your response in November 2022 when ChatGPT shows up?
LS: Well, it was really the crystallization of what AI is all about.
Obviously you’ve been in the graphics game for a long time, you’ve been thinking about high-performance computing, so the idea that GPUs would be important was not foreign to you. But were you surprised the extent to which it changed the perception of everyone else around you and what happened after that?
LS: We were very much on this path of GPUs for high-performance computing and AI. Actually, it was probably a very significant arc that we started, let’s call it back in the 2017 plus timeframe. We’ve always been in GPUs, but really focusing on-
What was it in 2017 that made you realize that, “Wait, we have these, we thought we bought ATI for gaming, suddenly, there’s this completely different application”?
LS: It was the next big opportunity, we knew it was the next big opportunity. It was something that Mark and I discussed, which was, by putting CPUs and GPUs together in systems and designing them together, we’re going to get a better answer, and the first near-term applications were around supercomputing. We were very focused on these large machines that would reside at national laboratories and deep research facilities, and we knew that we could build these massively parallel GPU machines to do that. The AI portion, we always thought about as clearly an HPC plus AI play.
You said before that AI is the killer application for HPC.
LS: Yes.
But if you talk to people in HPC, they’re like, “Well, it’s a little bit different”. To what extent is that the same category versus an adjacent category?
LS: It’s adjacent but highly related categories, and it all depends on the accuracy that you want in your calculations, whether you’re using the full accuracy or you want to use some of these other data formats. But I think the real key though, and the thing that we really had good foresight on is, because of our chiplet strategy, we could build a highly modular system that could be, let’s call it, an integrated CPU and GPU, or it could be just incredible GPU capability that people needed.
And so, the ChatGPT moment for me was the clarity around, now everybody knew what AI was for. Before, it was only the scientists and the engineers who thought about AI, now everybody could use AI. These models are not perfect, but they’re amazingly good, and with that, I think the clarity around how do we get more AI compute in people’s hands as soon as possible was clear. Because of the way we had built our design system, we could really have two flavors. We had the HPC-only flavor, which is what we would call our MI300A, and we had the AI-only flavor, which was the MI300X.
Was that kind of an uncomfortable shift? Like, “Actually, no, we want less precision because the scalability is so important”.
LS: It wasn’t uncomfortable. It was strikingly fast.
It happened so fast. AMD has done very well; you hit an all-time high a couple of months ago. But by and large, Nvidia obviously captured the gestalt, as it were, and a lot of the momentum and upside. What did they have, from your perspective, in that period that AMD had to catch up on?
LS: I think the way to think about it is just, where was the focus and relatively speaking — look, I give [Nvidia CEO] Jensen [Huang] and Nvidia a lot of credit. They were investing in this space for a long time before it was absolutely clear where things were going. We were also investing, although I would say we had a couple of arcs. We had our CPU arc, and then we have our GPU arc.
Hey, your hands are full crushing Intel, so I get it.
LS: I would say it a different way, we are at the beginning of what AI is all about. One of the things that I find curious is when people think about technology in short spurts. Technology is not a short-spurt kind of sport, this is like a 10-year arc we’re on, and we’re through maybe the first 18 months. From that standpoint, I think we’re very clear on where we need to go and what the roadmap needs to look like. One of the things that you mentioned earlier on software, we are very, very clear on how we make that transition super easy for developers, and one of the great things about our acquisition of Xilinx is we acquired a phenomenal team of 5,000 people that included tremendous software talent that is right now working on making AMD AI as easy to use as possible.
One of the things that does strike me about the contrast: one of Nvidia’s really brilliant moves was the acquisition of Mellanox and their networking portfolio, which matters for tying all these chips together, particularly for training.
In your Computex keynote, you talked about the new Ultra Accelerator Link and Ultra Ethernet standards, and this idea of bringing lots of companies together, kind of calling back to the Open Compute Project back in the day as far as data centers. It makes perfect sense, particularly given that Nvidia’s proprietary solutions have the same high margins we all know and love as the rest of their products.
But I guess this is my question about your long-term run — do you think it’s fair to say that, from a theoretical Clayton Christensen perspective, because we’re early in AI, maybe it’s not a surprise, the more proprietary integrated solution is the belle of the ball in many respects? There’s a bit where, yes, being open and modular all makes sense, but maybe that’s not going to be good enough for a while.
LS: I would say it this way. When you look at what the market will look like five years from now, what I see is a world where you have multiple solutions. I’m not a believer in one-size-fits-all, and from that standpoint, the beauty of open and modular is that you are able to, I don’t want to use the word customize here because they may not all be custom, but you are able to tailor.
Customize in the broad sense.
LS: That’s right.
Tailor is a good word.
LS: Tailor is the right word — you are able to tailor the solutions for different workloads, and my belief is that there’s no one company who’s going to come up with every possible solution for every possible workload. So, I think we’re going to get there in different ways.
By the way, I am a big believer that these big GPUs that we’re going to build are going to continue to be the center of the universe for a while, and yes, you’re going to need the entire network system and reference system together. The point of what we’re doing is, all of those pieces are going to be in reference architectures going forward, so I think architecturally that’s going to be very important.
My only point is, there is no one size that’s going to fit all and so the modularity and the openness will allow the ecosystem to innovate in the places that they want to innovate. The solution that you want for hyperscaler 1 may not be the same as a solution you want for hyperscaler 2, or 3.
Where do you think the balance is going to be then, between there being a standard approach versus, “This is the Microsoft approach”, “This is the Meta approach”? There’s some commonality there, but it is actually fairly customized to their use cases and needs. Again, not next year, but in the long run.
LS: I think as you get out three, four or five years, I think you’re going to see more tailoring for different workloads, and what happens is, the algorithms are going to — right now, we’re going through a period of time where the algorithms are just changing so, so quickly. At some point, you’re going to get to the place where, “Hey, it’s a bit more stable, it’s a little bit more clear”, and at the types of volumes that we’re talking about, there is significant benefit you can get not just from a cost standpoint, but from a power standpoint. People talk about chip efficiency, system efficiency now being as important if not more important than performance, and for all of those reasons, I think you’re going to see multiple solutions.
Is this an underrated tailwind for your x86 business? You talked in your keynote about the fact that the majority of CPUs in the cloud are more than five years old, and you said something like, “One of our CPUs can replace five or six of these old ones”. Do you see that actually happening? Because right now I think there’s a lot of trepidation around your business and Intel’s business, that all the spend is going to AI and no one’s even buying CPUs anymore. Is there a sort of power wall where, if we can take out a bunch of old CPUs from our data center, we can save power by putting in new ones?
LS: I think both things are true. I think the modernization of data centers absolutely has to happen. It will happen, and then the other point is—
It might not happen right now.
LS: Well, no. I think we’re seeing the investments come back into the areas for modernization, but the other thing that’s really important is, again, as much as we love GPUs, and that’s a huge growth driver for us going forward, not every workload’s going to go to a GPU. You are going to have traditional workloads, you’re going to have mixed workloads, and I think that’s the key point of the story: there are a lot of things that you have to do in large enterprises, and our goal is to make sure that we have the right solution across all of those capabilities.
How much inference do you see actually going back to the CPU?
LS: I think a good amount of inference will be done on the CPU. The very large models obviously need to be on GPUs, but how many companies can really afford to be on the largest of models? And so, you can see now already that for smaller models, for more fine-tuning, those kinds of things, the CPU is quite capable of it, especially if you go to the edge.
Right. You noted on the last earnings call that the MI300 has been supply-constrained, your fastest ramp ever, but is maybe, relative to the expectations of some investors, a little disappointing in the projections for the end of the year. How much do you feel that shift to being demand-constrained is about the MI325X coming along, which you talked about this week, versus the fact that Nvidia supply has generally gone up as everyone’s trying to figure this stuff out? Yes, your long-term opportunity is being this sort of customized supplier — tailored supplier, sorry, is the word that we’re going for — versus, “Look, I don’t want to say picking up, but just, we need GPUs, we’ll buy them from anyone”. Where do you feel your demand curves are relative to the competition and the rapid progression of the space?
LS: Again, let me take a step back and make sure we frame the conversation. The demand for AI compute has been off the charts, I think nobody would have predicted this type of demand, and so when I say that there is tightness in the supply chain, that’s to be expected, because nobody expected that you would need this many GPUs in this timeframe. The fact is the semiconductor industry is really good at building capacity, and so that is really what we’ve seen. As we’ve started to forecast-
And so you feel it’s more a function of there’s just so much supply coming online?
LS: Absolutely, and that’s our job. Our job is to make it to a place where you’re not constrained by manufacturing capacity.
Really, for us, it is about ensuring that customers are really ramping their workloads and that is a lot of deep work, deep partnerships that we’re doing with our customers. So honestly, I feel really good about the opportunities here. We’ve been through this before where it’s very similar to what we saw when we did the initial data center server CPU ramps, which is our customers work very closely with us, they get their software optimized, and then they add new workloads, and add more volumes, and that’s what I would expect to happen here, too.
The difference in AI is that I think customers are willing to take more risk, because there’s a desire to get as much, as fast as possible.
Is there a challenge for you, because that desire to take more risks means they’re more accepting of, say, high margins to get the leading GPUs, or whatever it might be, or the GPU with the largest developer ecosystem?
LS: What I will say is I’m super happy with the progress we’ve made on software.
Fair enough.
LS: What we’re seeing is excellent out-of-box performance. The fact is things just run, the fact is that much of the developer ecosystem wants to move up the abstraction layer, because everybody wants choice.
And do you feel you’re going to get to a stage where that move up the abstraction layer is a common layer across companies? As opposed to one company internally moving up the abstraction layer so they can buy any GPU, which doesn’t necessarily benefit you going into another company, or do you feel that’s going to be-
LS: I absolutely believe that it’ll be across the industry. Things like PyTorch, I think PyTorch is extremely widely adopted, OpenAI Triton, similar. These are larger industry things where frankly, part of the desire is it takes a long time to program down to the hardware. Everyone wants to innovate quickly, and so the abstraction layer is good from the standpoint of just rapid innovation.
You’ve traditionally been a second-wave adopter of TSMC’s new nodes, maybe a year, year-and-a-half behind. Do you feel pressure to move up to the top tier? Obviously, you’re a relatively small company compared to some of the players in this world — $22 billion is impressive, but you still have to think about your costs in that regard. Or is there just a pressing need to be on the absolute cutting edge?
LS: Well, I think you would say that we’re certainly one of the top five in terms of our overall volumes from a fabless standpoint, and absolutely, bleeding-edge is helpful. It’s not something that we think about in terms of should we or shouldn’t we; what we think about is the roadmap. For example, we talked about a one-year cadence in terms of GPUs coming out.
Unfortunately for you, you’re kind of on the opposite tick-tock from Nvidia a little bit. Is that a little frustrating?
LS: No, not at all. Look, again, one of the things that’s important for me is our roadmap is based on what we believe is possible, and what we believe our customers want and need.
Everyone like me wants to talk about the short-term head-to-head, so annoying.
LS: No, it’s not so annoying, it’s just context, everything requires context.
Is there ever a world where AMD fabs with Intel?
LS: I would say that we’re very happy with our manufacturing relationships right now.
It does occur to me, Intel, AMD — it’s one of the greatest rivalries in the history of technology from basically the very beginning. Is there a bit, though, where when you step back in these conversations, you are in it together, because the real enemy is Arm?
LS: You make it sound like Arm is an enemy; I don’t consider Arm an enemy, so let me start with that. We use Arm all over our product portfolio. We think x86 is a phenomenal architecture, and the capabilities are there, but please don’t think of AMD as an x86 company. We are a computing company, and we will use the right compute engine for the right workload.
As it relates to how I think about — if you look at the semiconductor industry today, there are places where we compete, and then there are places where we partner. So on your Intel point, we do compete in certain areas, but we also partner in certain areas. Intel is part of the UALink Consortium, they’re part of the Ultra Ethernet Consortium.
They’re very interested in this sort of modularization and standardization as well.
LS: We agree that having a link that can go across different accelerators is actually a good thing. So, I think that’s true across the industry. We’re at a place where there are places we compete, but there are also places where we can partner.
You have had an amazing 10-year run with x86; the results in the server space and the data center speak for themselves. Now, it’s like a new champion appears: are you girded up and ready to go for another round?
LS: This is the next arc. I can tell you that the thing that’s so amazing about where we are today in high-performance computing is, who would have imagined? It’s like a new world. It’s incredibly exciting.
You’re feeling re-energized, you’re ready to go?
LS: Absolutely ready to go. More than ready.
Lisa, thank you very much.
LS: Thank you.
This Daily Update Interview is also available as a podcast. To receive it in your podcast player, visit Stratechery.
The Daily Update is intended for a single recipient, but occasional forwarding is totally fine! If you would like to order multiple subscriptions for your team with a group discount (minimum 5), please contact me directly.
Thanks for being a supporter, and have a great day!