OpenAI x Broadcom

Transcript:

Andrew Mayne: Hello, I’m Andrew Mayne, and welcome to the OpenAI podcast. Today, we’re excited to be breaking some news involving Broadcom and OpenAI. Joining me from OpenAI are Sam Altman and Greg Brockman, and from Broadcom, Hock Tan and Charlie Kawwas.

Sam Altman: In a lot of the ways that you would look at the AI infrastructure build-out right now, you would say it’s the biggest joint industrial project in human history.

Charlie Kawwas: We’re defining civilization’s next-generation operating system.

Greg Brockman: That is a drop in the bucket compared to where we need to go.

Sam Altman: It’s a big drop...

Andrew Mayne: So what are we talking about today? What brought you all together?

Sam Altman: So today we’re announcing a partnership between Broadcom and OpenAI. We’ve been working together for about the last 18 months designing a new custom chip. More recently, we’ve also started working on a whole custom system; these things have gotten so complex, you need the whole thing. And starting late next year, we will be deploying 10 gigawatts of these racks, these systems and our chip, which is a gigantic amount of computing infrastructure to serve the world’s need for advanced intelligence.

Andrew Mayne: So this is going to entail both compute and chip design and scaling out?

Sam Altman: This is a full system. We closely collaborated for a while on designing a chip that is specific to our workloads. When it became clear to us just how much inference capacity the world was going to need, we began to think about whether we could do a chip that was meant just for that very specific kind of workload. Broadcom is the best partner in the world for that, obviously. And then, to our great surprise, because this was not the way we started, as we realized that we were going to really need the whole system together to support this, and as it got more and more complex, it turned out Broadcom is also incredible at helping design systems. So we are working together on that entire package, and this will help us even further increase the amount of capacity we can offer for our services.

Andrew Mayne: So, Hock, how did this come about? When did this start? When did you guys first talk about working together on this?

Hock Tan: Well, other than the fact that Sam and Greg are great people to work with, it’s a natural fit, because OpenAI has been doing, and continues to do, the most advanced frontier models in generative AI out there. But as part of that, you continue to need compute capacity, the best, latest compute capacity, as you progress on a roadmap towards better and better frontier models and towards superintelligence. Compute is a key part, and that comes with semiconductors, and, as Sam indicated, more than semiconductors. And we are, even if I say so myself, probably the best semiconductor company out there. More than that, AI is a very, very exciting opportunity for us, in that my engineers are pushing the innovation envelope in newer and newer generations of semiconductor technology. So for us, collaborating with the best generative AI company out there is a natural fit.

Andrew Mayne: And this isn’t just chips, it’s going out to scale like 10 gigawatts. And I have trouble kind of even understanding that. What does that even mean when you’re talking about 10 gigawatts?

Sam Altman: First of all, you said it’s not just chips, and Hock touched on this too, but the vertical integration point is really important. We are able to think from, like, etching the transistors all the way up to the token that comes out when you ask ChatGPT a question, and design the whole system: all of the stuff about the chip, the way we design these racks, the networking between them, how the algorithms that we’re using fit the inference chip itself, a lot of other stuff, all the way to the end product. And one of the many reasons I’m so excited about it is that by being able to optimize across that entire stack, we can get huge efficiency gains, and that will lead to much better performance, faster models, cheaper models, all of that. As you get that better performance and cheaper and smarter models, one thing we have consistently seen is that people just want to use way more. So we used to think, oh, we’ll optimize things by 10x and we’ll solve all of our problems. But you optimize by 10x and there’s 20x more demand. So 10 gigawatts, 10 incremental gigawatts, this is all on top of what we’re already doing with other partners and all the other data centers and silicon partnerships we’ve done. 10 gigawatts is a gigantic amount of capacity. And yet, if we do as good of a job as we hope, even though it’s vastly more than the world has today, we expect that with very high-quality intelligence delivered very fast and at very low price, the world will absorb it super fast and just find incredible new things to use it for. So what is the hope with this? The hope is that the kinds of things people are doing now with this compute, writing code, automating more and more of enterprises, generating videos in Sora, whatever it is, they will be able to do much more of it and with much smarter models.

Andrew Mayne: It’s amazing. So Greg and Charlie, when you think about it historically, when people have tried to develop chips or hardware to suit whatever was the current mode of computing at that point, what examples have you looked to historically to figure out how to plan forward? What’s been inspiring you when you think about this?

Greg Brockman: Well, I’d say the number one thing, honestly, is working with good partners. I think it’s very clear that we, as a company, are not able to do everything ourselves. And getting into actually building our own chips for our own specific workloads was not something we could do from a total standstill without working with Hock and Charlie and Broadcom. So it’s just been really incredible to lean on their expertise, together with our understanding of the workload. And it’s been actually very interesting to see the places where OpenAI is able to do things very differently from the rest of the industry, or from the way that things would historically be done. For example, we’ve been able to apply our own models to designing this chip, which has been really cool. We’ve been able to pull in the schedule. We’ve been able to get massive area reductions, right? You take components that humans have already optimized and just pour compute into it, and the model comes up with its own optimizations. And it’s very interesting. We’re at the point now where I don’t think any of the optimizations we have are ones that human designers couldn’t have come up with. Usually our experts take a look at it later and say, yeah, this was on my list, but it was like 20 things down and would have taken them another month to get to. One really interesting moment was that we were coming up on a deadline working with Charlie’s team, and we were running optimizations. We had a choice: do we actually take a look at what those optimizations were, or do we just keep going until the deadline and then take a look after? And we decided, of course, you’ve got to just keep going. And so we’ve really been building up this expertise in-house to understand this domain. And that’s something we actually think can help lift up the whole industry. But I think that we are heading to a world where AI intelligence is able to help humanity make new breakthroughs that just would not be possible otherwise. And we’re going to need just as much compute as possible to power that. One concrete example is that we are in a world now where ChatGPT is changing from something that you talk to interactively to something that can go do work for you behind the scenes. If you’ve used features like Pulse: you wake up every morning, and it has some really interesting things that are related to what you’re interested in. It’s very personalized. And our intent is to turn ChatGPT into something that helps you achieve your goals. The thing is, we can only release this to the Pro tier, because that’s the amount of compute we have available. And ideally, everyone would have an agent that’s running for them 24/7 behind the scenes, helping them achieve their goals. And so ideally, everyone has their own accelerator, their own compute power, that’s just running constantly. And there are 10 billion humans. We are nowhere near being able to build 10 billion chips. So there’s a long way to go before we are able to saturate not just the demand, but what humanity really deserves.

Andrew Mayne: So, Charlie, being very deeply technical and being with a company that’s been at a number of forefronts of some of these revolutions, what’s it been like working with a company like OpenAI and working with Greg on this?

Charlie Kawwas: So for us, it’s been absolutely exciting and refreshing, because the beauty of the work we do together is the focus on a certain workload. We actually started first by looking at the IP and the AI accelerator, which is what we call the XPU. And then we realized very quickly that we can now actually go from the workload all the way down to the transistor and, as Greg was just explaining, work together to customize that platform for your workload, resulting in the best platform in the world. Then we realized, as Sam was saying earlier on, it’s not just that XPU or accelerator. Actually, it’s the networking that is needed to scale it up, scale it out, and scale it across. And so suddenly we started seeing that we can actually drive the next level of standardization and openness, which doesn’t just benefit us. I think it will actually benefit the entire ecosystem, and it gets gen AI to AGI much faster. So we’re very excited about the technical capabilities of the teams we have, but also the vision and, I think, the speed at which we’ve been moving.

Andrew Mayne: I’m still kind of wrapping my head around the scale of it, because it spans everything from designing something like a chip, and figuring out how you’re going to get the maximum efficiency out of it, to just the size of the infrastructure and what’s involved in this. This is a global effort. What comparisons have you been able to draw between this and other examples in history?

Sam Altman: I always think the historical analogies are tough. I don’t know what fraction of global GDP building the Great Wall was at the time. But in a lot of the ways that you would look at the AI infrastructure build-out right now, you would say it’s the biggest joint industrial project in human history. And this requires a lot of companies, a lot of countries, a lot of industries to come together. A lot of stuff has to happen at the same time, and we’ve all got to kind of invest together. But at this point, given everything we see coming on the research front, given all of the value we see being created on the business front, I think the whole industry has decided this is a very good bet to take. But it is huge. You go to even one of these one-gigawatt data centers and look at the scale of what’s happening there. It’s like a tiny city; it’s a big, complex thing. So it is just incredible scale.

Greg Brockman: To the point of this being a massive collaborative project, I feel like whenever I call Charlie, he’s in a different part of the world trying to secure capacity, trying to find a way to help us build what we’re trying to do together.

Charlie Kawwas: Exactly. Actually, one of the coolest things I was thinking about is what we’re doing together in this wonderful partnership: we’re defining civilization’s next-generation operating system. And we’re doing it, as you’re saying, at the transistor level, building new fabs, building new manufacturing sites, all the way to building these racks and ultimately the data centers you’re talking about, 10 gigawatts of data centers.

Andrew Mayne: Yeah, I think it’s an important thing to keep track of that people often get fixated just on the chips themselves. It’s kind of like thinking the National Highway Project was about selling asphalt, or railroads were about steel. In reality, it’s the things that become possible on top of that. And you’ve probably thought a lot about that. What happens?

Hock Tan: Well, I think this is like the railroad, the internet. That’s what I think this is becoming over time: critical infrastructure, or a critical utility, and more than just a critical utility for, say, 10,000 enterprises. This is a critical utility over time, right, Sam, for 8 billion people globally. I think it’s like an industrial revolution of a different sort coming. But it cannot be done with just one party, or, as we like to think, with two. More than that, it needs a lot of partnerships. It needs collaboration across an ecosystem. And because of that, much as we talk about developing chips for specific workloads, applications and LLMs, it’s also important to create standards that are somewhat open, more transparent for all to use, because at the end of the day you need to build up a whole infrastructure to become a critical utility for 6 billion people in the world. And we’re very excited, frankly, which is why we think we make great partners, because I think we share the same conviction. And more than that, it is about scaling computing to create breakthroughs in superintelligence and models. It’s building the foundation of that.

Andrew Mayne: You guys have a lot on your plate. Why design chips now?

Greg Brockman: Well, you know, we’ve probably been working on this project for 18 months now, and it’s moved incredibly quickly. We’ve hired some really amazing people. And I think what we found is that we have a deep understanding of the workload. We work with a number of parties across the ecosystem, and there’s a number of chips out there that I think are really incredible, and there’s a niche for each one. So we’ve really been looking for specific workloads that we feel are underserved: how can we build something that will be able to accelerate what’s possible? And the ability to do the full vertical integration for something we see coming, but that is hard for us to address through other partners, that’s a very clear use case for this kind of project.

Hock Tan: Yeah, actually more than that, and Greg, you put it very well. Really, why you want to do your own chip is that computing is a big part of what’s gating this journey towards superintelligence, towards creating better and better frontier models. A lot of it really comes down to computing, and not just any computing: computing that is effective, high performance, and efficient, especially on power. And what Greg is saying is exactly what we learned and saw here. For instance, if you want to train, you design chips that are much stronger in compute capacity, measured in TFLOPs, as well as in networking, because it’s not just one chip that makes it happen. It’s a cluster, as Charlie put it. But if you want to do inference, you put in more memory and memory access relative to compute. So you are actually, over time, creating chips optimized for particular workloads and applications as we go along. And what will create the most effective models, at the end of the day, is a platform that you create end to end.
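To make that training-versus-inference tradeoff concrete, here is a minimal roofline-style sketch in Python (not from the conversation; the accelerator specs and arithmetic-intensity values are made-up, illustrative numbers) of why weight-reuse-heavy training tends to be gated by raw TFLOPs while token-by-token inference tends to be gated by memory bandwidth.

```python
# A minimal roofline-style sketch (illustrative numbers only): a workload is
# compute-bound when its arithmetic intensity (FLOPs per byte moved) is high,
# and memory-bandwidth-bound when it is low.

def attainable_tflops(peak_tflops: float, mem_bw_tb_per_s: float, flops_per_byte: float) -> float:
    """Achievable throughput is capped by the lower of raw compute and
    memory bandwidth times arithmetic intensity (TB/s * FLOPs/byte = TFLOPs)."""
    return min(peak_tflops, mem_bw_tb_per_s * flops_per_byte)

# Hypothetical accelerator: 1,000 TFLOPs peak compute, 4 TB/s memory bandwidth.
PEAK_TFLOPS, MEM_BW = 1000.0, 4.0

# Large-batch training reuses each loaded weight many times -> high intensity.
print("training  :", attainable_tflops(PEAK_TFLOPS, MEM_BW, flops_per_byte=500))  # 1000.0 -> compute-bound

# Token-by-token inference streams most weights for every token -> low intensity.
print("inference :", attainable_tflops(PEAK_TFLOPS, MEM_BW, flops_per_byte=50))   # 200.0 -> bandwidth-bound
```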

Greg Brockman: And also, one piece of historical context is that when we started OpenAI, we didn’t really have that much of a focus on compute. We felt that the path to AGI was really about ideas, about trying things out; eventually we’d put the right conceptual pieces in place, and then, AGI. And about two years in, in 2017, the thing that we found was that we were getting the best results out of scale. It wasn’t something we set out to prove. It was something we really discovered empirically, because everything else didn’t work nearly as well. And the first results were scaling up our reinforcement learning in the context of the video game Dota 2. Did you guys pay attention to the Dota 2 project back in the day? Yes. It was a super cool project. And we really saw that you scale it by 2x, and suddenly your agent is 2x better. It’s like, okay, we have to push this to the limit. And at that point, we started paying attention to the whole ecosystem. There were all sorts of chip startups with novel approaches that were very different from GPUs. And we started giving them a ton of feedback, saying, here’s where we think things are going; it needs to be models of this shape. And honestly, a lot of them just didn’t listen to us, right? And so it’s very frustrating to be in this position where you say, we see the direction the future should be going, but we have no ability to really influence it besides, you know, sort of trying to influence other people’s roadmaps. And so by being able to take some of this in-house, we feel like we are able to actually realize that vision. And again, in a way that we hope shows a direction other people will fill in, because for the amount of compute required to bring our vision of AGI to the world, 10 gigawatts is not enough. That is a drop in the bucket compared to where we need to go.

Sam Altman: It’s a big drop...

Andrew Mayne: The bucket’s really big. What becomes possible with this when you’re building your own chips for inference and for training? Where can you take this?

Sam Altman: To zoom out a little bit, you can simplify what we do in this whole process to: you know, melt sand, run energy through it, and get intelligence out the other end. You’re not literally melting sand, but it’s a nice visual.

Hock Tan: That’s a good one.

Charlie Kawwas: That’s all we have to do.

Hock Tan: I like that.

Sam Altman: What we want is the most intelligence we can get out of each unit of energy, because that will become the gate at some point. And that’s what I hope this whole process will show us: that from the model we design to the chip to the rack, we will be able to wring out so much more intelligence per watt. And then everybody that’s using these models in all of these incredible ways will do so much more with it. That’s what I hope for.

Hock Tan: And you control your own destiny. If you do your own chips, you control your destiny.

Andrew Mayne: Yeah, it’s interesting to think about how the things that we’re doing today are pretty amazing, remarkable, but we’re using stuff that wasn’t actually designed specifically for the way we’re doing it.

Sam Altman: Oh, I mean, the GPUs of today are incredible, incredible things. I’m very grateful, and we will continue to really need a lot of those. The flexibility and the ability to let us do fast research is amazing. But you are right that as we get more and more confident in what the shape of the future is going to look like, a system very optimized to the workload will let us wring more out per watt. That’s great.

Charlie Kawwas: And it’s a long journey that takes decades. If you go back to Hock’s example and take railroads, it took about a century to roll them out as critical infrastructure. If you take the internet, it took about 30 years. This is not going to take five years; it’s going to take a long time. So I think as we collectively, especially with this partnership, continue to figure out ways to wring more tokens out of it, we’ll discover that, oh, for this training or research, maybe a GPU is great. Or maybe, you know what, we can take what we’re doing with Greg, which is actually a platform that, like Lego blocks, allows you to swap things in and out, and suddenly we can get another XPU or accelerator for the next generation that’s targeted at training or inference or research.

Greg Brockman: Yeah, and to Sam’s point that GPUs have really come an incredible way: in 2017, when we started looking at all these other accelerators, it was actually very non-obvious what the landscape would look like in 5 or 10 years. And I think it’s really a testament to companies like NVIDIA and AMD how much the GPU has just moved forward and continued to be the dominant accelerator. But at the same time, there’s a massive design space out there, right? And I think that what we see is workloads that are not served by existing platforms. And that’s where that full vertical integration is something unique.

Andrew Mayne: It’s interesting too, because the idea that you’d want to put inference close to the user is something relatively new. You know, we understood training, but then you think about the number of people every day using these products and how much compute they need to do fun things or serious things. And when you start thinking about the scale of it, like we talked about before, I keep coming back to: it’s a very big thing. Where does it keep going? Is it just a thing where we’re going to continuously find new things to use compute for?

Sam Altman: The first cluster OpenAI had, the first one that I can remember the energy size for, was 2 megawatts. Adorable.

Greg Brockman: Yeah. We got things done with those two.

Sam Altman: I don’t remember when we got to 20. I remember when we got to 200. You know, we will finish this year a little bit over 2 gigawatts, and these recent partnerships will take us close to 30. The world has done far more than I thought they were going to do. It turns out you can serve 10% of the world’s population with ChatGPT, and do the research, and do Sora, and do our API and a few other things, on 2 gigawatts. But think about how much more the world would like to do than they get to do right now. If we had 30 gigawatts today, with today’s quality of models, I think you would still saturate that relatively quickly in terms of what people would do, especially with the lower cost we’ll be able to offer with this. But the thing we have learned again and again is, let’s say we can push GPT-6 to feel like, you know, 30 IQ points past GPT-5, something big, and it can work on problems not for a few hours but for a few days, weeks, months, whatever, and while we do that, we bring the cost per token down. The amount of economic value and sort of surplus demand that happens each time we’ve been able to do that goes up a crazy amount. So, to pick a well-known example at this point: when ChatGPT could write a little bit of code, people actually used it for that. They would very painfully paste in their code and wait, and they would say, do this for me, and paste it back in, and whatever. And the models couldn’t do much, but they could do a few things. The models got better, the UX got better, and now we have Codex. Codex is growing unbelievably fast and can now do a few hours of work at a higher level of capability. And when that’s possible, the demand increase is crazy. Maybe the next version of Codex can do a few days of work at about the level of one of the best engineers you know, or maybe that takes a few more versions, whatever; it’ll get there. Think how much demand there will be just for that, and then do it for every knowledge-work industry.
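As a rough illustration of the figures Sam mentions, here is a back-of-envelope sketch, assuming roughly 8 billion people and taking the ‘10% of the world’s population on about 2 gigawatts’ figure at face value, of what that capacity works out to per user. The numbers are illustrative only, not official figures.

```python
# Back-of-envelope arithmetic on the figures mentioned above, under rough
# assumptions: ~8 billion people, and "10% of the world's population" served
# on roughly 2 gigawatts. Purely illustrative.

WORLD_POPULATION = 8e9
users_today = 0.10 * WORLD_POPULATION      # ~800 million users
power_today_watts = 2e9                    # ~2 GW

print(f"Today: ~{power_today_watts / users_today:.1f} W of continuous capacity per user")

# If capacity reaches ~30 GW and every person on Earth were a user:
power_future_watts = 30e9
print(f"At 30 GW for everyone: ~{power_future_watts / WORLD_POPULATION:.2f} W per person")
```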

Greg Brockman: And one way I like to think of it is that intelligence is the fundamental driver of economic growth, of increasing the standard of living for everyone. And what we’re doing with AI is actually bringing more intelligence and amplifying the intelligence of everyone. And so as these models get better, I think everyone’s going to become more productive. The output of what is possible is going to just be totally different from what exists today.

Andrew Mayne: It’s interesting, too, going from GPT-3, which was comparatively expensive, to the level of GPT-5 and the fact that you can provide that freely to people. Is that a motivating factor for you, the fact that every time you create these new efficiencies, it just benefits so many more people? Yes. Absolutely. Absolutely.

Hock Tan: Absolutely. And from our side on hardware and compute capacity, which is to some extent where the rubber hits the road on this, it’s really incumbent on us to keep optimizing, pushing the envelope on leading-edge technology. And there’s still room to go, even from where we are, as we go from two nanometers onward to smaller than two nanometers and start doing all kinds of different technology. These are really great, exciting times, especially for the hardware and semiconductor industry.

Sam Altman: What Broadcom has done here is really quite incredible. It used to be extremely difficult for a company like ours to go about making a competitive chip. In fact, so hard we just wouldn’t have done it. And I think a lot of other companies wouldn’t have done it either, and all of this, a chip and system customized to a workload, just wouldn’t be a thing in the world. But they have pushed so hard and so well on making it so that a company can partner with them and do a miracle-of-technology chip quickly and at scale. Unfortunately, they do it for all of our competitors too, but hopefully our chip will be the best.

Hock Tan: - Yes, of course.

Sam Altman: It’s really quite incredible.

Greg Brockman: And I think also not just what they can do for us today, but looking at the upcoming roadmap, it’s just so exciting the kinds of technologies that they’re going to be able to bring to bear for us to be able to utilize.

Hock Tan: Well, it’s just the excitement of jointly and collaboratively enabling models, ChatGPT-5, 6, 7, on and on. And each of them will require a different chip, a better chip, a more developed, more advanced chip that we haven’t even begun to figure out how to get to. But we will.

Greg Brockman: And actually, the GPTs are definitely going to be an increasing part of that. Yes. It’ll be very interesting.

Charlie Kawwas: We’re actually looking forward to that, because my software engineers already use that on the software side, and it’s delivering efficiencies equivalent to dozens of engineers.

Sam Altman: Really?

Charlie Kawwas: Yes.

Sam Altman: Great.

Charlie Kawwas: On the hardware side, we’re not there yet. But, you know, the good news is we’ll get there.

Sam Altman: We should talk.

Charlie Kawwas: Yes, we should absolutely leverage this. But I was going to say, with respect to compute: when we started building these XPUs, you could build at most a certain amount of compute in 800 square millimeters; that’s it. Today, we’re actually working together to ship multiples of these in a two-dimensional space. The next thing we’re talking about is stacking these into the same chip, so now we’re going in the Y or Z dimension, if you want to think three-dimensionally. And the last step we’re also talking about is bringing optics into this, which is actually what we just announced: 100 terabits of switching with optics integrated into the same chip. So these are the sorts of technologies that will take the compute, the size of the cluster, and the total performance and wattage of the cluster to a whole new level. I think it will keep doubling at least every six to twelve months.
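As a quick aside on that last remark, here is a trivial sketch (the three-year horizon is just an example input, not from the conversation) of what ‘doubling at least every six to twelve months’ compounds to.

```python
# Simple compounding of "doubling at least every six to twelve months".
# The three-year horizon and both doubling periods are example inputs only.

def growth_factor(months: float, doubling_period_months: float) -> float:
    """How many times capability multiplies if it doubles every doubling_period_months."""
    return 2 ** (months / doubling_period_months)

for period in (12, 6):
    print(f"Doubling every {period} months -> {growth_factor(36, period):.0f}x after 3 years")
```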

Andrew Mayne: What kind of timeframe are we talking about? When are we going to first start to see what’s coming out of the relationship?

Sam Altman: End of next year, and then we’ll deploy very rapidly over the next three years. Absolutely.

Charlie Kawwas: Greg and I are talking about this at least once a week. We just had a chat earlier today on this. Yes, good progress today. Yes, exactly.

Greg Brockman: But yeah, we’re really excited to get silicon back starting soon, actually. Yes, very soon. Yeah, I think my view of this whole project is that it’s not easy, right? It’s easy to just say, oh yeah, 10 gigawatts. But when you look at what is required to actually design a whole new chip, deliver it at scale, and get the whole thing working end to end, it’s just an astronomical amount of work. And I would say that we’re very serious. You know, our mission is to ensure that AGI benefits all of humanity. We’re very serious about it benefiting everyone. We really want this to be a technology that is accessible to the whole world, that lifts up everyone. And you can really see that in trying to make the world one of compute abundance, because I think by default we’re heading towards one that is quite compute scarce.

Andrew Mayne: Ask my wife when she’s trying to get more Sora credits; it feels very scarce.

Greg Brockman: Yeah, no, we feel it so concretely. For teams within OpenAI, their output is just a direct function of how much compute they get. And so the intensity around who gets the compute allocation is extreme. And I think what we really want is a world where, if you have an idea, something you want to create, something you want to go build, you have the compute power behind you to make it happen.

Andrew Mayne: Gentlemen, thank you very much for sharing this with us. It’s going to be very exciting to see where this goes, and I hope we can keep talking about this as it continues to develop. Thank you.

Sam Altman: Thank you guys for the partnership.

Hock Tan: Thank you. Thank you for the partnership. We’re really enjoying it.

Greg Brockman: We are too.

Sam Altman: Yeah.

Hock Tan: Thank you.