Advanced Insights S2E4: Deploying Intelligence at Scale

Mark Papermaster:

Welcome to Advanced Insights, where we provide just what the show name suggests: advanced insights into some of the most exciting trends and topics in technology. I’m Mark Papermaster, CTO and EVP at AMD. In this episode, I’m joined by Chris Gandolfo. He is the EVP, North America Cloud Infrastructure and AI Sales at Oracle. He reinvigorated the One Oracle mentality by tightly integrating his field teams with Oracle Digital, an organization he later led. Chris, it is great to have you on the show. You know, we find ourselves at yet another juncture in this AI journey, with change coming at a rate and pace beyond what I think any of us anticipated. I’ve really been looking forward to our conversation, and honestly, I don’t think we could have planned the timing any better. We’re now at the start of this inflection point: all the readiness work we’ve done for AI, all the foundational work, all of the models and generative AI capabilities. Now people are starting to really use it and deploy it. And I can’t wait to hear your perspectives, from where you sit at Oracle, on what is going on right now and what’s coming in the future.

Chris Gandolfo:

Mark, it’s great to be here. Thank you for having me on the show. I’m really excited to be here. It’s an exciting time to be in the industry, but it’s also a really exciting time for us at Oracle, right? You think of our past and our heritage—we grew up as a mission-critical software company. Our expertise was around mission-critical software for databases and the whole application stack. And now we find ourselves as a player in the hyperscaler market with Oracle Cloud Infrastructure, OCI. We are now contending and competing for some of the largest workloads out there, processed by language model trainers and enterprises alike. So it’s a really exciting time for us.

Mark Papermaster:

Let’s start with, I think, a question on a lot of people’s minds. From your vantage point, here you are with the hyperscalers. People come to you with a whole range of problems that they want you to provide the compute infrastructure for. So there’s training on one hand—training these huge models. And now people are starting to run actual inferencing: real-world AI applications. Might be fun to get the dialogue going with some context. What are your Oracle customers coming to you with? Is it training? Is it inference? What kind of mix? And then maybe we can delve a little bit deeper into the two workloads.

Chris Gandolfo:

Sure. It’s a hot topic for sure. I think training thus far has taken all the oxygen out of the room. It has been at a scale that I don’t think anyone saw coming or could possibly have imagined. Even if you’ve lived through other technology inflection points, this has been unique. A lot of our focus has been just building the right infrastructure for training. And we’re also one of the world’s largest ISVs, right? We have a massive portfolio of software. And with the advent of these large language models, all of that software is going to be influenced by AI. And we have a front seat for that because we have a huge portfolio of SaaS applications and a lot of on-premise software.

So, in parallel with building infrastructure, we have been taking all these novel models—whether it’s ChatGPT, Grok, LLaMA, Cohere—and making ourselves customer number one. We have to be inference-ready. Our software is going to have to be ready to handle the AI experience. We’re seeing that happen right now with our own customers. We’ve made a choice to take all of our subscription software and find areas where we can embed models. We’re starting with the easy stuff first.

Like, if you have an HCM platform, you’re an HR manager, you want to write a job description, you want to read a resume—speeding up the time to hire is kind of the promise of enterprise software. We’re going to automate all these processes. But then you implement it and realize: this is complex stuff. It’s hard to simplify it. Now we have the transformer and we have all these LLMs. I think we can deliver on that promise, and we’ve made a good choice to embed that into the subscription and just say, “Look, this is about feature enhancement. This is about changing the experience.”

People don't use Oracle as a nice-to-have. It's doing something incredibly important. So we had to build a cloud that was ready for that type of work. So when we entered the market we made some design choices that I think are paying off very nicely now.

Because, for example, a cloud region at Oracle fits into nine racks, right? We don’t have to build a super Walmart to stand up a region. We can just fit in wherever the customers want us.

When you think about globally distributed inference—when you think about inference at the edge, or wherever it needs to occur—I think we’re really well-positioned, because building a cloud region takes an incredible amount of capital. You can’t just say, “Hey, build it and hope people show up.”

Mark Papermaster:

I’m going to get into that in more detail in a minute, but keep going.

Chris Gandolfo:

Sure. So the idea of building these portable, more distributed clouds—and the fact that we’ve created the tooling, automation, and density so a full OCI region can fit in such a small footprint—is the reason why the other hyperscalers are peaking at 50–60 global regions each, while we already have 200 either live or coming online. And I think we’re going to wake up soon and see 2,000 regions.

Mark Papermaster:

Because globally distributed inference needs to happen as close to the work as possible for the lowest latency.

Chris Gandolfo:

That’s right.

Mark Papermaster:

And because you’re a SaaS provider, you’re inherently figuring out how to lower the friction of leveraging AI.

Chris Gandolfo:

That’s right.

Mark Papermaster:

I love that you’re applying your “home cooking” to improve the experience. The number one comment I hear from enterprise customers is: “Make it easy.” SaaS applications are the easiest way because the SaaS provider does the heavy lifting.

Chris Gandolfo:

Exactly.

Mark Papermaster:

You also have customers coming to you with both training and inference needs, so it’s giving you insight into the workloads and letting you accommodate both. Let’s dive into that—say someone comes to you with a foundation LLM, trillions of parameters, huge model that needs to be trained. How do you think about that from an infrastructure standpoint?

Chris Gandolfo:

It starts with a lot of planning. We’re operating in a scarce environment—there’s not enough power to serve the compute demand. And not long ago, there weren’t enough chips. Doing things at enormous scale takes planning and foresight. Our choice has been to place the highest priority on performance and efficiency over anything else, because you have to squeeze every ounce of economic throughput out of every system to survive in this business.

It’s not just having access to power and chips—you need skillful planning. We chose Bare Metal so enterprises can get all the power out of a system without a hypervisor, especially for workloads they can’t change.

We also learned from Exadata—how to build a high-performing, low-latency, non-blocking RDMA network. That experience perfectly suits AI, especially when meshing huge numbers of GPUs.

Mark Papermaster:

So you had a head start.

Chris Gandolfo:

We did, though we may not have realized it at the time. But those choices—Bare Metal, low-latency networks—are paying off. When people run high-intensity training workloads, they compare FLOPs throughput in our cloud versus another. The price may be similar, but the throughput—the speed of the model run—is significantly different.

Mark Papermaster:

That’s a unique perspective. Let’s talk about enterprise customers—they may already be running on OCI or Exadata and now say, “I need to add inference for a range of applications.” Where is enterprise on this journey?

Chris Gandolfo:

Maybe we’re at the top of the first inning—or still in the on-deck circle. It’s still nascent. I think enterprise SaaS will see more practical application of models and inference. Pure agentic, bespoke inference at the enterprise level is still mostly experimentation. Most enterprises will consume inference rather than build it all themselves.

We’re staging for when those inference needs come—making sure we have infrastructure where they need it, when they need it. Back to the small-footprint distributed inference tech—I think that will be big.

We’ve also made another choice—we compete with Microsoft, AWS, GCP, but we also partner. You can now take an OCI database rack and consume it in Amazon or Azure. And not just with a license—we physically nest Exadata Cloud Service inside those other clouds for performance.

Mark Papermaster:

So the walled gardens are coming down.

Chris Gandolfo:

Exactly. Customers now see more choice than ever.

Mark Papermaster:

As you well know, because we work closely together, we’re all about partnership and ecosystem. There should be choice—choice in compute devices, and in how you configure your compute infrastructure. We’ve been all about that, and I’d love to tell the story of how AMD and Oracle have worked together. The complexity AI has brought means we have to partner better than ever. From your perspective, thinking from the silicon all the way to the end application, whether it’s your own app or a third party’s running on OCI—let’s start with a little history, maybe with Exadata.

Chris Gandolfo:

Our partnership with AMD was grounded in engineering, benchmarking, and performance. That’s a core value for how we make tech choices. We have something great to build on with Exadata high-performance systems.

Think about our non-GPU compute—it’s largely AMD EPYC. We do “bill compares” with customers and have built-in advantages because of Bare Metal. We’ve also done off-box virtualization with the Pensando SmartNIC—moving the OCI control plane off the customer’s tenancy. All virtualization software orchestrating OCI is off the box.

Mark Papermaster:

I love that.

Chris Gandolfo:

It’s because of the Pensando chip. This gives customers direct access to your CPU—no muscling through Oracle software—so when they rent a core, they get more performance automatically. We’re ~40–50% cheaper on a like-for-like workload because of these design choices. It’s paved a path for us as an economic disruptor.

It also changes our security posture—moving the control plane off-box reduces the threat surface. If a tenancy is compromised, it’s much harder to pivot elsewhere in the cloud. So we get performance, scalability, and better security.

Mark Papermaster:

Those Pensando NIC attributes came from listening to customers. We formed a strong technical advisory board, and Oracle was one of the loudest voices. That’s how solutions get optimized—you give us input, we fold it in.

Back to Exadata—you’ve deployed EPYC x86 widely, and you’ve got deep chip expertise dating back to Sun. My team says they’ve never had more intense low-level optimization feedback from a partner. That input makes our CPU roadmap better for your needs. That’s the name of the game—cross-collaboration.

Chris Gandolfo:

It absolutely is. We’ve purposely not tried to be the “kingmaker” who does everything in the stack. We’d rather leverage partners’ strengths and influence them to expose our tech advantages. We don’t have a GPU team, we’re not building a language model—we partner.

That’s where the MI300 series comes in—you made a design choice for higher memory throughput. For distributed inference, that means fewer GPUs per rack at the edge—smaller footprint, less power. Perfect for our approach.

Mark Papermaster:

Exactly. We designed with a high HBM stack-up to feed AI workloads. MI300 was our first direct head-to-head against the big data center GPU incumbent. We hoped it would enable fewer GPUs, less power, and a smaller footprint—and your feedback proved it.

Chris Gandolfo:

That’s right. And with MI355, we’re looking at 35× throughput improvement—a very good marriage for us.

Mark Papermaster:

When we first started working with Oracle Cloud, we were actually a customer. A lot of our workload is chip design—EDA—and we had some storage requirements you hadn’t seen before. I thought, “Well, that’s it, it’s not going to work.” But you came back and said, “We’ll architect a different solution to meet your needs.”

Chris Gandolfo:

We try to listen to the market and customers so we make the right decisions. We’ve never led by saying, “We figured everything out—do it our way.” All of our big LLM customers have influenced our infrastructure design. That preserves our future together—if we make their decisions easier and give them the performance they need, they’ll see us as an attractive partner.

Unlike our heritage, we’ve made the choice to be the most open cloud provider. I tell customers: “We’re the easiest cloud to leave.” We have the least commercial lock-in—moving data out of OCI is ~90% cheaper than other major hyperscalers. If you don’t like our performance or service, you’re free to leave. That keeps us on our toes.

Mark Papermaster:

We share that open ethos. I’m sure that’s why we work together so well.

Chris, I’m going to put you on the spot—what’s coming next?

Chris Gandolfo:

Training isn’t going to stop. AI will influence every piece of software. Only a few players will have the capital to train frontier models, but every developer will need to be inference-ready. No one is building retro software anymore—everything new will be AI-ready.

We’re already seeing this in SaaS. The next frontier is getting to the developers—ensuring enterprises are ready to consume inference without building it all themselves. There are already companies doing practical enterprise inference, like Windsurf (formerly Codeium) with code generation.

Mark Papermaster:

The challenge is that AI’s societal impact will be huge—it could change our roles as technology providers. That’s a burden, but one we have to step up to. Have you thought about that at Oracle?

Chris Gandolfo:

Yes. Given our heritage and customer base, security and responsible deployment are core to our ethos. Like you, I’m optimistic that society will put the right guardrails in place.

Mark Papermaster:

Chris, it was great to get your perspective. You’re in the middle of building out not only the compute infrastructure our industry needs, but also mission-critical applications that listeners are running on every day.

Chris Gandolfo:

Thank you, Mark. I’m grateful for our partnership and the opportunity to be on your show. It’s been fun.

Mark Papermaster:

My thanks to you. The main takeaways: it’s vital for enterprise customers to prepare now for the AI revolution so their infrastructure can handle inference needs when they come; and collaboration—from application to infrastructure to silicon—is essential.

Thank you for joining us today on Advanced Insights. I look forward to bringing you more in-depth conversations on cutting-edge technology, industry insights, and visionary perspectives from some of the brightest minds in the field.