AMD Presents: Advancing AI

Table of contents

11:24
Introduction
18:32
AMD Instinct™ MI300X Accelerators
24:52
Microsoft
29:30
AMD Instinct™ Platform
33:03
Oracle Cloud
36:14
Software and Ecosystem
44:18
AI Innovators Panel: Databricks, EssentialAI, Lamini
1:01:09
Meta
1:07:45
Dell
1:14:05
Supermicro
1:18:14
Lenovo
1:25:33
Networking
1:30:37
Panel: Broadcom, Arista, Cisco
1:40:51
AMD Instinct™ MI300A APUs
1:47:05
Exascale Computing – El Capitan
1:56:13
AI PCs
1:56:49
AMD XDNA™ Architecture
1:58:31
AMD Ryzen™ AI Software
2:00:49
Microsoft
2:07:05
Closing
Video tags

AMD
Advanced Micro Devices
EPYC
cpu
processors
gpu
graphics
pc
together we advance_
Gaming
Server
Computer
Desktop
Laptop
Ryzen
AI
Subtitles

00:13:15
Hey, good morning.
00:13:19
Good morning, everyone.
00:13:20
Welcome to all of you who are joining us here in Silicon Valley and to everyone who's
00:13:23
joining us online from around the world.
00:13:26
It has been just an incredibly exciting year with all of the new products and all the innovation
00:13:31
that has come across our business and our industry.
00:13:34
But today, it's all about AI.
00:13:37
We have a lot of new AI solutions to launch today and news to share with you, so let's
00:13:41
go ahead and get started.
00:13:43
Now, I know we've all felt this this year.
00:13:46
I mean, it's been just an amazing year.
00:13:48
I mean, if you think about it, a year ago, OpenAI unveiled ChatGPT.
00:13:53
And it's really sparked a revolution that has totally reshaped the technology landscape.
00:13:59
In this just short amount of time, AI hasn't just progressed.
00:14:03
It's actually exploded.
00:14:05
The year has shown us that AI isn't just kind of a cool new thing.
00:14:10
It's actually the future of computing.
00:14:12
And at AMD, when we think about it, we actually view AI as the single most transformational
00:14:18
technology over the last 50 years.
00:14:22
Maybe the only thing that has been close has been the introduction of the internet.
00:14:26
But what's different about AI is that the adoption rate is just much, much faster.
00:14:32
So although so much has happened, the truth is right now, we're just at the very beginning
00:14:37
of the AI era.
00:14:39
And we can see how it's so capable of touching every aspect of our lives.
00:14:44
So if you guys just take a step back and just look, I mean, AI is already being used everywhere.
00:14:50
Think about improving healthcare, accelerating climate research, enabling personal assistance
00:14:55
for all of us and for greater business productivity, things like industrial robotics, security,
00:15:02
and providing lots of new tools for content creators.
00:15:05
Now the key to all of this is generative AI.
00:15:09
It requires a significant investment in new infrastructure.
00:15:13
And that's to enable training and all of the inference that's needed.
00:15:17
And that market is just huge.
00:15:19
Now a year ago when we were thinking about AI, we were super excited.
00:15:23
And we estimated the data center AI accelerator market would grow approximately 50% annually
00:15:30
over the next few years, from something like $30 billion in 2023 to more than $150 billion
00:15:36
in 2027.
00:15:38
And that felt like a big number.
00:15:40
However, as we look at everything that's happened in the last 12 months and the rate and pace
00:15:46
of adoption that we're seeing across the industry, across our customers, across the world,
00:15:52
it's really clear that the demand is just growing much, much faster.
00:15:57
So if you look at now to enable AI infrastructure, of course it starts with the cloud,
00:16:02
but it goes into the enterprise.
00:16:03
We believe we'll see plenty of AI throughout the embedded markets and into personal computing.
00:16:09
We're now expecting that the data center accelerator TAM will grow more than 70% annually
00:16:14
over the next four years to over $400 billion in 2027.
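As a quick sanity check on those projections, compound growth over four years reproduces the quoted totals. A minimal sketch, assuming a $30 billion base for the original forecast and a $45 billion 2023 base for the revised one (the latter is an assumption, not stated in this transcript):

```python
# Compound-growth check of the TAM figures quoted above (sketch, not AMD data).
def project_tam(base_billions: float, cagr: float, years: int) -> float:
    """Project a market size forward at a compound annual growth rate."""
    return base_billions * (1 + cagr) ** years

print(project_tam(30, 0.50, 4))  # ~151.9 -> "more than $150 billion in 2027"
print(project_tam(45, 0.70, 4))  # ~375.8 -> ">70% annually" lands near $400B
                                 # (assumed $45B 2023 base)
```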
00:16:20
So does that sound exciting for us as an industry?
00:16:28
I have to say for someone like me who's been in the industry for a while, this pace of
00:16:32
innovation is faster than anything I've ever seen before.
00:16:36
And for us at AMD, we are so well positioned to power that end-to-end infrastructure that
00:16:42
defines this new AI era.
00:16:44
So from massive cloud server installations, to on-prem enterprise
00:16:50
clusters to the next generation of AI in embedded and PCs, our AI strategy is really centered
00:16:56
around three big strategic priorities.
00:17:00
First, we must deliver a broad portfolio of very performant, energy-efficient GPUs, CPUs,
00:17:07
and adaptive computing solutions for AI training and inference.
00:17:11
And we believe, frankly, that you're going to need all of these pieces for AI.
00:17:15
Second, it's really about expanding our open, proven, and being very developer-friendly
00:17:21
in our software platform to ensure that leading AI frameworks, libraries, and models are all
00:17:27
fully enabled for AMD hardware and that it's really easy for people to use.
00:17:32
And then third, it's really about partnership.
00:17:35
You're going to see a lot of partners today.
00:17:37
That's who we are as a company.
00:17:38
It's about expanding the co-innovation work and working with all parts of the ecosystem,
00:17:44
including cloud providers, OEMs, software developers.
00:17:49
You're going to hear from some real AI leaders in the industry to really accelerate how we
00:17:54
work together and get that widespread deployment of our solutions across the board.
00:18:00
So we have so much to share with you today.
00:18:01
I'd like to get started.
00:18:03
And of course, let's start with the cloud.
00:18:06
Generative AI is the most demanding data center workload ever.
00:18:11
It requires tens of thousands of accelerators to train and refine models with billions of
00:18:16
parameters.
00:18:17
And that same infrastructure is also needed to answer the millions of queries from everyone
00:18:22
around the world to these smart models.
00:18:25
And it's very simple.
00:18:26
The more compute you have, the more capable the model, the faster the answers are generated.
00:18:32
And the GPU is at the center of this generative AI world.
00:18:37
And right now, I think we all know it, everyone I've talked to says it, the availability and
00:18:42
capability of GPU compute is the single most important driver of AI adoption.
00:18:48
Do you guys agree with that?
00:18:49
So that's why I'm so excited today to launch our Instinct MI300X.
00:18:59
It's the highest performance accelerator in the world for generative AI.
00:19:04
MI300X is actually built on our new CDNA 3 data center architecture.
00:19:09
And it's optimized for performance and power efficiency.
00:19:12
CDNA 3 has a lot of new features.
00:19:15
It combines a new compute engine.
00:19:17
It supports sparsity, the latest data formats, including FP8.
00:19:22
It has industry-leading memory capacity and bandwidth.
00:19:25
And we're going to talk a lot about memory today.
00:19:28
And it's built on the most advanced process technologies and 3D packaging.
00:19:32
So if you compare it to our previous generation, which frankly was also very good, CDNA 3 actually
00:19:38
delivers more than three times higher performance for key AI data types, like FP16 and BF16,
00:19:45
and a nearly seven times increase in INT8 performance.
00:19:50
So if you look underneath it, how do we get MI300X?
00:19:53
It's actually 153 billion transistors, 153 billion.
00:20:04
It's across a dozen 5-nanometer and 6-nanometer chiplets.
00:20:08
It uses the most advanced packaging in the world.
00:20:11
And if you take a look at how we put it together, it's actually pretty amazing.
00:20:15
We start with four IO die in the base layer.
00:20:19
And what we have on the IO dies are 256 megabytes of infinity cache and all of the next-gen
00:20:24
IO that you need.
00:20:26
Things like 128-channel HBM3 interfaces, PCIe Gen 5 support, our fourth-gen infinity fabric
00:20:34
that connects multiple MI300Xs so that we get 896 gigabytes per second.
00:20:40
And then we stack eight CDNA 3 accelerator chiplets, or XCDs, on top of the IO die.
00:20:46
And that's where we deliver 1.3 petaflops of FP16 and 2.6 petaflops of FP8 performance.
00:20:54
And then we connect these 304 compute units with dense through-silicon vias, or TSVs,
00:21:01
and that supports up to 17 terabytes per second of bandwidth.
00:21:05
And of course, to take advantage of all of this compute, we connect eight stacks of HBM3
00:21:11
for a total of 192 gigabytes of memory at 5.3 terabytes per second of bandwidth.
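The headline figures above compose from the per-part numbers. A minimal sketch of that arithmetic (the 24 GB-per-stack split is an inference from 192 GB across eight stacks, not a quoted spec):

```python
# Aggregating the MI300X figures quoted above (back-of-the-envelope sketch).
hbm_stacks = 8
gb_per_stack = 24                    # inferred: 192 GB total / 8 stacks
print(hbm_stacks * gb_per_stack)     # 192 GB of HBM3, as quoted

fp16_petaflops = 1.3
print(fp16_petaflops * 2)            # 2.6 PF FP8: halving data width doubles throughput
```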
00:21:17
That's a lot of stuff on that.
00:21:25
I have to say, it's truly the most advanced product we've ever built, and it is the most
00:21:31
advanced AI accelerator in the industry.
00:21:34
Now let's talk about some of the performance and why it's so great.
00:21:39
For generative AI, memory capacity and bandwidth are really important for performance.
00:21:44
If you look at MI300X, we made a very conscious decision to add more flexibility, more memory
00:21:50
capacity, and more bandwidth, and what that translates to is 2.4 times more memory capacity
00:21:56
and 1.6 times more memory bandwidth than the competition.
00:22:00
Now when you run things like lower precision data types that are widely used in LLMs, the
00:22:06
new CDNA 3 compute units and memory density actually enable MI300X to deliver 1.3 times
00:22:13
more teraflops of FP8 and FP16 performance than the competition.
00:22:19
Now these are good numbers, but what's more important is how things look in real world
00:22:24
inference workloads.
00:22:26
So let's start with some of the most common kernels used by the latest AI models.
00:22:31
LLMs use attention algorithms to generate precise results.
00:22:35
So for something like FlashAttention-2 kernels, MI300X actually delivers up to 1.2 times better
00:22:42
performance than the competition.
00:22:44
And if you look at something like the Llama 2 70B LLM, and we're going to use this a lot
00:22:48
throughout the show, MI300X again delivers up to 1.2 times more performance.
00:22:55
And what this means is the performance at the kernel level actually directly translates
00:23:00
into faster results when running LLMs on a single MI300X accelerator.
00:23:06
But we also know, we talked about these models getting so large, so what's really important
00:23:11
is how that AI performance scales when you go to the platform level and beyond.
00:23:16
So let's take a look at how MI300X scales.
00:23:20
Let's start first with training.
00:23:22
Training is really hard.
00:23:23
People talk about how hard training is.
00:23:25
When you look at something like the 30 billion parameter model from Databricks, MPT LLM,
00:23:31
it's a pretty good example of something that is used by multiple enterprises for a lot
00:23:36
of different things.
00:23:38
And you can see here that the training performance for MI300X is actually equal to the competition.
00:23:43
And that means it's actually a very, very competitive training platform today.
00:23:48
But when you turn to the inference performance of MI300X, this is where our performance really
00:23:53
shines.
00:23:55
We're showing some data here, measured data on two widely used models, Bloom 176B.
00:24:02
It's the world's largest open multi-language AI model.
00:24:06
It generates text in 46 languages.
00:24:09
And our Llama 2 70B, which is also very popular, as I said, for enterprise customers.
00:24:15
And what we see in this case is a single server with eight MI300X accelerators is substantially
00:24:21
faster than the competition, 1.4 to 1.6X.
00:24:25
So these are pretty big numbers here.
00:24:28
And what this performance does is it just directly translates into a better user experience.
00:24:32
You guys have used it.
00:24:33
When you ask the model something, you'd like it to come back faster, especially as the
00:24:38
responses get more complicated.
00:24:41
So that gives you a view of the performance of MI300X.
00:24:45
Now excited as we are about the performance, we are even more excited about the work we're
00:24:50
doing with our partners.
00:24:52
So let me turn to our first guest, very, very special.
00:24:55
Microsoft is truly a visionary leader in AI.
00:24:59
We've been so fortunate to have a deep partnership with Microsoft for many, many years across
00:25:04
all aspects of our business.
00:25:06
And the work we're doing today in AI is truly taking that partnership to the next level.
00:25:10
So here to tell us more about that is Microsoft's Chief Technology Officer, Kevin Scott.
00:25:22
Kevin, it is so great to see you.
00:25:24
Thank you so much for being here with us.
00:25:25
It's a real pleasure to be here with you all today.
00:25:29
We've done so much work together on EPYC and Instinct over the years.
00:25:33
Can you just tell our audience a little bit about that partnership?
00:25:36
Yeah, I think Microsoft and AMD have a very special partnership.
00:25:41
And as you mentioned, it has been one that we've enjoyed for a really long time.
00:25:46
It started with the PC.
00:25:47
It continued then with a bunch of custom silicon work that we've done together over the years
00:25:52
on Xbox.
00:25:54
It's extended through the work that we've done with you all on EPYC for the high-performance
00:25:59
computing workloads that we have in our cloud.
00:26:02
And like the thing that I've been spending a bunch of time with you all on the past couple
00:26:06
of years, like actually a little bit longer even, is on AI compute, which I think everybody
00:26:12
now understands how important it is to driving progress on this new platform that we're trying
00:26:19
to deliver to the world.
00:26:20
I have to say we talk pretty often.
00:26:22
We do.
00:26:24
But Kevin, what I admire so much is just your vision, Satya's vision about where AI is going
00:26:31
in the industry.
00:26:32
So can you just give us a perspective of where are we on this journey?
00:26:36
Yeah, so we have been with a huge amount of intensity over the past five years or so,
00:26:44
been trying to prepare for the moment that I think we brought the world into over the
00:26:49
past year.
00:26:50
So it is almost a year to the day since the launch of ChatGPT, which I think is perhaps
00:26:56
most people's first contact with this new wave of generative AI.
00:27:01
But the thing that allowed Microsoft and OpenAI to do this was just a deep amount of infrastructure
00:27:09
work that we've been investing in for a very long while.
00:27:13
And one of the things that we realized fairly early in our journey is just how important
00:27:19
compute was going to be and just how important it is to think about the sort of full systems
00:27:24
optimization.
00:27:26
So the work that we've been doing with you all has been not just about figuring out what
00:27:33
the silicon architecture looks like, but that's been a very important thing and making sure
00:27:37
that we together are building things that are going to intercept where the actual platform
00:27:42
is going to be years in advance, but also just doing all of that software work that
00:27:49
needs to be done to make this thing usable by all the developers of the world.
00:27:55
I think that's really key.
00:27:56
I think sometimes people don't understand, they think about AI as this year, but the
00:28:01
truth is we've been building the foundation for so many years.
00:28:04
Kevin, I want to take this moment to really acknowledge that Microsoft has been so instrumental
00:28:09
in our AI journey.
00:28:11
The work we've done over the last several generations, the software work that we're
00:28:15
doing, the platform work that we're doing, we're super excited for this moment.
00:28:19
Now I know you guys just had Ignite recently and Satya previewed some of the stuff you're
00:28:23
doing with 300X, but can you share that with our audience?
00:28:26
We're super enthusiastic about 300X.
00:28:29
Satya announced that the MI300X VMs were going to be available in Azure.
00:28:38
It's really, really exciting right now seeing the bring up of GPT-4 on MI300X, seeing the
00:28:45
performance of Llama 2, getting it rolled into production.
00:28:50
The thing that I'm excited here today is we will have the MI300X VMs in preview available
00:28:57
today.
00:29:06
I completely agree with you.
00:29:07
The thing that's so exciting about AI is every day we discover something new and we're learning
00:29:12
that together.
00:29:13
Kevin, we're so honored to be Microsoft's partner in AI.
00:29:17
Thank you for all the work that your teams have done, that we've done together.
00:29:21
We look forward to a lot more progress.
00:29:23
Likewise.
00:29:24
Thank you very much.
00:29:31
All right, so look
00:29:32
We certainly do learn a tremendous amount every day and we're always pushing the envelope.
00:29:37
Let me talk to you a little bit about how we bring more people into our ecosystem.
00:29:41
When I talk about the Instinct platform, you have to understand our goal has really been
00:29:47
to enable as many customers as possible to deploy Instinct as fast and as simply as possible.
00:29:53
To do this, we really adopted industry standards.
00:29:57
We built the Instinct platform based on an industry standard OCP server design.
00:30:02
I'd actually like to show you what that means because I don't know if everyone understands.
00:30:06
Let's bring her out.
00:30:07
Her or him?
00:30:15
Let me show you the most powerful gen AI computer in the world.
00:30:27
Those of you who follow our shows know that I'm usually holding up a chip, but we've shown
00:30:32
you the MI300X chip already, so we thought it would be important to show you just what
00:30:38
it means to do generative AI at a system level.
00:30:42
What you see here is eight MI300X GPUs and they're connected by our high-performance
00:30:49
Infinity fabric in an OCP-compliant design.
00:30:53
What makes that special?
00:30:55
This board actually drops right into any OCP-compliant design, which is the majority of AI systems
00:31:02
today.
00:31:03
We did this for a very deliberate reason.
00:31:04
We want to make this as easy as possible for customers to adopt so you can take out your
00:31:09
other board and put in the MI300X Instinct platform.
00:31:14
If you take a look at the specifications, we actually support all of the same connectivity
00:31:19
and networking capabilities of our competition, so PCIe Gen 5, support for 400 gig ethernet,
00:31:26
that 896 gigabytes per second of total system bandwidth, but all of that is with 2.4 times
00:31:33
more memory and 1.3 times more compute per server than the competition.
00:31:38
That's really why we call it the most powerful gen AI system in the world.
00:31:42
Now, I've talked about some of the performance in AI workloads, but I want to give you just
00:31:47
a little bit more color on that.
00:31:50
When you look at deploying servers at scale, it's not just about performance.
00:31:54
Our customers are also trying to optimize power, space, CapEx and OpEx, and that's where
00:32:01
you see some really nice benefits of our platform.
00:32:05
When you compare our Instinct platform to the competition, I've already showed you that
00:32:09
we deliver comparable training performance and significantly higher inference performance,
00:32:14
but in addition, what that memory capacity and bandwidth gives us is that customers can
00:32:19
actually either run more models, if you're running multiple models on a given server,
00:32:24
or you can run larger models on that same server.
00:32:28
In the case where you're running multiple different models on a single server, the Instinct
00:32:32
platform can run twice as many models for both training and inference as the competition.
00:32:38
On the other side, if what you're doing is trying to run very large models, you'd like
00:32:42
to fit them on as few GPUs as possible.
00:32:46
With the FP16 data format, you can run twice the number of LLMs on a single MI300X server
00:32:52
compared to our competition.
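A minimal sketch of the memory arithmetic behind that claim, assuming two bytes per FP16 weight and ignoring activation and KV-cache overhead:

```python
# Weights-only footprint of a 70B-parameter model in FP16 (sketch).
import math

params = 70e9
bytes_per_param = 2                           # FP16
weights_gb = params * bytes_per_param / 1e9
print(weights_gb)                             # ~140 GB of weights alone

mi300x_hbm_gb = 192                           # quoted MI300X capacity
print(math.ceil(weights_gb / mi300x_hbm_gb))  # 1 -> the weights fit on a single GPU
```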
00:32:55
This directly translates into lower CapEx, and especially if you don't have enough GPUs,
00:33:01
this is really, really helpful.
00:33:03
So, to talk more about MI300X and how we're bringing it to market, let me bring our next
00:33:09
guest to the stage.
00:33:11
Oracle Cloud and AMD have been engaged for many, many years in bringing great computing
00:33:15
solutions to the cloud.
00:33:17
Here to tell us more about our work together is Karan Batta, Senior Vice President at
00:33:21
Oracle Cloud Infrastructure.
00:33:28
Hey, Karan.
00:33:29
Hi, Lisa.
00:33:30
Thank you so much for being here.
00:33:31
Thank you for your partnership.
00:33:32
Can you tell us a little bit about the work that we're doing together?
00:33:36
Yeah, thank you.
00:33:37
Excited to be here today.
00:33:39
Oracle and AMD have been working together for a long, long time, right, since the inception
00:33:43
of OCI back in 2017.
00:33:45
And so, we've launched every generation of EPYC as part of our bare metal compute platform,
00:33:51
and it's been so successful, customers like Red Bull as an example.
00:33:55
And we've expanded that across the board for all of our portfolio of past services
00:33:59
like Kubernetes, VMware, et cetera.
00:34:02
And then we are also collaborating on Pensando DPUs, where we offload a lot of that logic
00:34:07
so that customers can get much better performance, flexibility.
00:34:10
And then, you know, earlier this year, we also announced that we're partnering with
00:34:13
you guys on Exadata, which is a big deal, right?
00:34:17
So, we're super excited about our partnership with AMD, and then what's to come with 300X?
00:34:21
Yeah.
00:34:22
We really appreciate OCI has really been a leading customer as we talk about how do we
00:34:28
bring new technology into Oracle Cloud.
00:34:31
Now, you're spending a lot of time on AI as well.
00:34:33
Tell us a little bit about your strategy for AI and how we fit into that strategy.
00:34:37
Absolutely.
00:34:38
You know, we're spending a lot of time on AI, obviously.
00:34:41
Everyone is.
00:34:42
We are.
00:34:43
Everybody is.
00:34:44
It's the new thing.
00:34:45
You know, we're doing that across the stack, from infrastructure all the way up to applications.
00:34:48
Oracle is an applications company as well.
00:34:50
And so, we're doing that across the stack, but from an infrastructure standpoint, we're
00:34:54
investing a lot of effort into our core compute stack, our networking stack.
00:34:59
We announced clustered networking.
00:35:01
And what I'm really excited to announce is that we're going to be supporting MI300X as
00:35:04
part of that bare-metal compute stack.
00:35:12
We are super thrilled about that partnership.
00:35:14
We love the fact that you're going to have 300X.
00:35:17
I know your customers and our customers are talking to us every day about it.
00:35:20
Tell us a little bit about what customers are saying.
00:35:22
Yeah, we've been working with a lot of customers.
00:35:24
Obviously, we've been collaborating a lot at the engineering level as well with AMD.
00:35:28
And you know, customers are seeing incredible results already from the previous generation.
00:35:32
And so, I think that will actually carry through with the 300X.
00:35:36
And so much so that we're also excited to actually support MI300X as part of our generative
00:35:41
AI service that's going to be coming up live very soon as well.
00:35:44
So, we're very, very excited about that.
00:35:46
We're working with some of our early customer adopters like Naveen from Databricks Mosaic.
00:35:51
So, we're very excited about the possibility.
00:35:54
We're also very excited about the fact that the ROCm ecosystem is going to help us
00:35:58
continue that effort moving forward.
00:36:00
So, we're very pumped.
00:36:02
That's wonderful.
00:36:03
Karan, thank you so much.
00:36:04
Thank your teams.
00:36:05
We're so excited about the work we're doing together and look forward to a lot more.
00:36:08
Thank you, Lisa.
00:36:09
Thank you.
00:36:14
Now, as important as the hardware is, software actually is what drives adoption.
00:36:20
And we have made significant investments in our software capabilities and our overall
00:36:23
ecosystem.
00:36:24
So, let me now welcome to the stage AMD President Victor Peng to talk about our software and
00:36:29
ecosystem progress.
00:36:35
Thank you, Lisa.
00:36:36
Thank you.
00:36:37
And good morning, everyone.
00:36:39
You know, last June at the AI event in San Francisco, I said that the ROCm software
00:36:44
stack was open, proven, and ready.
00:36:47
And today, I'm really excited to tell you about the tremendous progress we've made in
00:36:51
delivering powerful new features as well as the high performance on ROCm.
00:36:56
And how the ecosystem partners have been significantly expanding the support for Instinct GPUs and
00:37:02
the entire product portfolio.
00:37:04
Today, there are multiple tens of thousands of AI models that run right out of the box
00:37:09
on Instinct.
00:37:11
And more developers are running on the MI250, and soon they'll be running on the MI300.
00:37:17
So we've expanded deployments in the data center, at the edge, in client, embedded applications
00:37:23
of our GPUs, CPUs, FPGAs, and adaptive SoCs, really end to end.
00:37:29
And we're executing on that strategy of building a unified AI software stack so any model,
00:37:34
including generative AI, can run seamlessly across an entire product portfolio.
00:37:39
Now, today, I'm going to focus on ROCm and the expanded ecosystem support for our
00:37:44
Instinct GPUs.
00:37:47
We architected ROCm to be modular and open source to enable very broad user accessibility
00:37:53
and rapid contribution by the open source community and AI community.
00:37:58
Open source and ecosystem are really integral to our software strategy, and in fact, really
00:38:03
open is integral to our overall strategy.
00:38:06
This contrasts with CUDA, which is proprietary and closed.
00:38:09
Now, the open source community, everybody knows, moves at the speed of light in deploying
00:38:14
and proliferating new algorithms, models, tools, and performance enhancements.
00:38:19
And we are definitely seeing the benefits of that in the tremendous ecosystem momentum
00:38:24
that we've established.
00:38:26
To further accelerate developer adoption, we recently announced that we're going to
00:38:30
be supporting ROCm on our Radeon GPUs.
00:38:33
This makes AI development on AMD GPUs more accessible to more developers, start-ups,
00:38:39
and researchers.
00:38:41
So our foot is firmly on the gas pedal with driving the MI300 to volume production and
00:38:46
our next ROCm release.
00:38:49
So I'm really super excited that we'll be shipping ROCm 6 later this month.
00:38:53
I'm really proud of what the team has done with this really big release.
00:38:56
ROCm 6 has been optimized for gen AI, particularly large language models, has powerful
00:39:02
new features, library optimizations, expanded ecosystem support, and increases performance
00:39:09
by factors.
00:39:10
It really delivers for AI developers.
00:39:13
ROCm 6 supports FP16, BF16, and the new FP8 data types for higher performance while
00:39:20
reducing both memory and bandwidth needs.
00:39:25
We've incorporated advanced graph and kernel optimizations and optimized libraries for
00:39:29
improved efficiency.
00:39:31
We're shipping state-of-the-art attention algorithms like FlashAttention-2 and paged attention,
00:39:35
which are critical for performance in LLMs and other models.
00:39:40
These algorithms and optimizations are complemented with a new release of RCCL, our collective
00:39:45
communications library for efficient, very large-scale GPU deployments.
00:39:50
So look, the bottom line is ROCm 6 delivers a quantum leap in performance and capability.
00:39:56
Now I'm going to first work you through the inference performance gains you'll see with
00:40:00
some of these optimizations on ROCm 6.
00:40:02
So for instance, running a 70 billion parameter Llama 2 model, paged attention and other algorithms
00:40:07
speed up the token generation by paging attention keys and values, delivering 2.6x higher performance.
00:40:15
HIP graph allows processing to be defined in graphs rather than single operations,
00:40:21
and that delivers a 1.4x speed up.
00:40:24
FlashAttention, which is a widely used kernel for high-performance LLMs,
00:40:29
delivers 1.3x speed up.
00:40:32
So all those optimizations together deliver an 8x speed up on the MI300X with ROCm 6
00:40:39
compared to the MI250 and ROCm 5.
00:40:42
That's 8x performance in a single generation.
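For intuition, the quoted per-optimization gains roughly compound. A minimal sketch (speedups rarely multiply cleanly in practice, and the hardware/software split below is an assumption):

```python
# Compounding the ROCm 6 optimization gains quoted above (rough sketch).
paged_attention = 2.6
hip_graph = 1.4
flash_attention = 1.3
software = paged_attention * hip_graph * flash_attention
print(round(software, 2))       # ~4.73x from software optimizations alone
print(round(8 / software, 2))   # ~1.69x left over, implicitly the MI300X
                                # hardware's gen-on-gen contribution (assumption)
```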
00:40:46
So this is one of those huge benefits we provide to customers with this great performance improvement
00:40:50
with the MI300X.
00:40:52
So now let's look at it from a competitive perspective.
00:40:56
Lisa had highlighted the performance of large models running on multiple GPUs.
00:41:01
What I'm sharing here is the performance of smaller models running on a single GPU,
00:41:07
in this case the 13 billion parameter Llama 2 model.
00:41:10
The MI300X and ROCm 6 together deliver 1.2x higher performance than the competition.
00:41:16
So this is the reason why our customers and our partners are super excited about creating
00:41:20
the next innovations in AI on the MI300X.
00:41:26
So we're relentlessly focused on delivering leadership technology and very comprehensive
00:41:30
software support for AI developers.
00:41:33
And to fuel that drive, we've been significantly strengthening our software teams through both
00:41:38
organic and inorganic means, and we're expanding our ecosystem engagements.
00:41:43
So we recently acquired Nod.ai and Mipsology.
00:41:46
Nod brings world-class expertise in open source compilers and runtime technology.
00:41:52
They've been instrumental in the MLIR compiler technology as well as in the communities.
00:41:57
And as part of our team, they are significantly strengthening our customer engagements and
00:42:01
they're accelerating our software development plans.
00:42:05
Mipsology also strengthens our capabilities, especially in delivering to customers
00:42:09
in very AI-rich applications like autonomous vehicles and industrial automation.
00:42:16
So now let me turn over to the ecosystem.
00:42:20
In addition to working closely with the ecosystem, oh, sorry.
00:42:23
We announced that we had the partnership with Hugging Face just last June.
00:42:28
Today they have 62,000 models running daily on Instinct platforms.
00:42:34
And in addition, we've worked closely on getting these LLM optimizations as part of their optimal
00:42:39
library and toolkit.
00:42:41
Our partnership with PyTorch Foundation has also continued to thrive with CI/CD pipelines
00:42:48
and validation, enabling developers to target our platforms directly.
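In practice that targeting is transparent to most PyTorch users: ROCm builds of PyTorch surface the HIP backend through the familiar "cuda" device API. A minimal sketch, assuming a ROCm build of PyTorch and a supported AMD GPU:

```python
# Checking for and using an AMD GPU from a ROCm build of PyTorch (sketch).
import torch

print(torch.version.hip)          # set on ROCm builds (None on CUDA builds)
print(torch.cuda.is_available())  # True when a supported AMD GPU is present

x = torch.randn(1024, 1024, device="cuda")  # "cuda" maps to HIP on ROCm
print((x @ x).sum().item())                 # runs the matmul on the GPU
```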
00:42:52
And we continue to make very significant contributions to all the major frameworks, including upstream
00:42:57
support for AMD GPUs in JAX, OpenXLA, QPI, and even initiatives like DeepSpeed for Science.
00:43:06
Just yesterday, the AI Alliance was announced with over 50 founding members that also include
00:43:11
AMD, IBM, and Meta and other companies.
00:43:16
And I'm really delighted to share some very late-breaking news.
00:43:20
AMD GPUs, including the MI300, will be supported in the standard OpenAI Triton distribution
00:43:27
starting with the 3.0 release.
00:43:34
We're really thrilled to be working with Philippe Tillet, who created Triton, and the whole OpenAI team.
00:43:41
AI developers using the OpenAI Triton are more productive working at a higher level
00:43:45
of design abstraction, and they still get really excellent performance.
00:43:49
This is great for developers and aligned with our strategy to empower developers with powerful
00:43:55
and open software stacks and GPU platforms.
00:43:58
This is in contrast to the much greater effort developers would need to invest working at
00:44:02
a much lower level of abstraction in order to eke out performance.
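For a sense of that abstraction level, here is the canonical Triton vector-add kernel: tiling, masking, and memory access are a few lines of Python rather than hand-tuned GPU code. A minimal sketch, assuming Triton and a GPU-enabled PyTorch build are installed:

```python
# Canonical Triton vector-add kernel (sketch of the abstraction level).
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n_elements                  # guard the final partial block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")             # "cuda" maps to HIP on ROCm
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK=1024)
print(torch.allclose(out, x + y))                # True
```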
00:44:07
Now I've shared a lot with you about the progress we made on software, but the best indication
00:44:12
of the progress we've really made are the people who are using our software and GPUs
00:44:16
and what they're saying.
00:44:17
So it gives me great pleasure to have three AI luminaries and entrepreneurs from Databricks,
00:44:23
Essential AI, and Lamini to join me on stage.
00:44:27
Please give a very warm welcome to Ion Stoica, Ashish Vaswani, and Sharon Zhou.
00:44:54
Great.
00:44:57
Welcome, Ion, Ashish, and Sharon.
00:44:58
Thank you so much for joining us here.
00:45:00
Really appreciate it.
00:45:01
So I'm gonna ask each of you a bit about first with the mission of your company
00:45:06
and share about the innovations you're doing with our GPUs and software
00:45:10
and what the experience has been like.
00:45:12
So Ion, let me start with you.
00:45:14
Now you're also not only founder of Databricks,
00:45:17
but you're on the faculty of the computer science department at UC Berkeley,
00:45:21
director of the Sky Computing Lab,
00:45:22
and you've also been involved with AnyScale and many AI startups.
00:45:27
So maybe you could talk about your engagement with AMD
00:45:30
as well as your experience in the MI200 and MI300.
00:45:33
Yeah, thank you very much.
00:45:34
Very glad to be here.
00:45:36
And yes, indeed, I collaborated with AMD wearing multiple hats,
00:45:43
director of the Sky Computing Lab at Berkeley,
00:45:46
which AMD is supporting, and also as a founder of AnyScale and Databricks.
00:45:52
And in all my work over the years, one thing I really focus on
00:45:56
is democratizing the access to AI.
00:46:00
What this means, it's improving the scale, performance, and cost,
00:46:05
reducing the cost, to run these large AI applications,
00:46:11
which means everything from AI workloads, everything from training,
00:46:15
fine-tuning, inference, and generative AI applications.
00:46:21
Just to give you some examples, we developed vLLM,
00:46:24
which is arguably now the most popular open-source
00:46:28
inference engine for LLMs.
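For readers unfamiliar with it, vLLM exposes a small offline-inference API. A minimal sketch (the checkpoint name is illustrative; any supported model works):

```python
# Minimal vLLM offline-inference sketch (model name is an example).
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-hf")
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["What makes an LLM inference engine fast?"], params)
print(outputs[0].outputs[0].text)
```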
00:46:31
We have developed Ray, another open-source framework
00:46:34
which is used to distribute machine learning workloads.
00:46:37
Ray has been used by OpenAI to train ChatGPT.
00:46:41
And more recently, Sky Computing, one of the projects there is SkyPilot,
00:46:46
which helps you to run your applications or machine learning applications
00:46:52
and workloads across multiple clouds.
00:46:54
And why do you want to do that?
00:46:56
It's because you want to alleviate the scarcity of the GPUs
00:47:01
and reduce the costs.
00:47:04
Now, when it comes to our collaborations,
00:47:07
we collaborate on all these kind of projects.
00:47:10
And one thing which was a very pleasant surprise
00:47:14
is that it was very easy to run and include ROCm in our stack.
00:47:22
It really runs out of the box from day one.
00:47:27
Of course, you need to do more optimization for that.
00:47:29
And this is what we are doing and we are working on.
00:47:32
So for instance, we added support for the MI250 to Ray.
00:47:39
And we are working, actually, collaborating with AMD,
00:47:43
like I mentioned, to optimize the inference for vLLM,
00:47:48
again, running on MI250 and MI300X.
00:47:52
And from the point of view of SkyPilot,
00:47:55
we're really looking forward to having more and more MI250s
00:48:02
and MI300X in various clouds.
00:48:05
So we have more choices.
00:48:07
It sounds great.
00:48:09
Thank you so much for all the collaboration across all those clouds.
00:48:12
Ashish why don't you tell us about Essential's mission
00:48:16
and also your experience with ROCm and Instinct?
00:48:20
Thank you.
00:48:22
Great to be here, Victor.
00:48:25
Essential, we're really excited.
00:48:27
We're really excited to push the boundaries of human-machine
00:48:32
partnership in enterprises.
00:48:33
We should be able to do it.
00:48:34
We're at the beginning stages where
00:48:36
we'll be able to do 10x or 50x more than what we can just
00:48:39
do by ourselves today.
00:48:40
So we're extremely excited.
00:48:41
And what that's going to take, I believe
00:48:45
it's going to be a full-stack approach.
00:48:47
So you're building the models, serving infrastructure,
00:48:49
but more importantly, understanding workflows
00:48:52
in enterprises today and giving people the tools
00:48:55
to configure these models, teach these models to configure them
00:48:59
for their workflows end to end.
00:49:01
And so the models learn with feedback.
00:49:02
They get better with feedback.
00:49:04
They get smarter.
00:49:06
And then they're eventually able to even guide non-experts
00:49:08
to do tasks they were not able to do.
00:49:10
We're really excited.
00:49:11
And we actually were lucky to start to benchmark the 250s
00:49:16
earlier this year.
00:49:18
And hey, we want to solve a couple of hard problems,
00:49:21
scientific problems. And we were like, hey, are we going
00:49:23
to get long context and check?
00:49:24
OK, so are we going to be able to train larger models?
00:49:26
Are we able to serve larger models on smaller chips?
00:49:28
And so as we saw, and the ease of using the software
00:49:32
was also very pleasant.
00:49:36
And then we saw how things were progressing.
00:49:38
For example, I think in two months, I believe,
00:49:40
FlashAttention, which is a critical component
00:49:42
to actually scale to longer sequences,
00:49:44
appeared, so it was generally very happy
00:49:45
and just impressed with the progress
00:49:47
and excited about the chips.
00:49:49
Thanks so much, Ashish. And Sharon.
00:49:50
So Sharon, Lamini has a very innovative business model
00:49:57
and working with enterprise for their private models.
00:49:59
Why don't you share the mission and how the experience
00:50:01
with AMD has been?
00:50:03
Yeah, thanks, Victor.
00:50:04
So by way of quick background, Sharon,
00:50:07
co-founder CEO of Lamini, most recently,
00:50:09
I was a computer science faculty at Stanford
00:50:11
leading a research group in generative AI.
00:50:13
I did my PhD there also under Andrew Ng
00:50:16
and teach about a quarter million students
00:50:18
and professionals online in generative AI.
00:50:20
And I left Stanford to pursue Lamini and co-found Lamini
00:50:24
on the premise of making the magical, difficult, expensive
00:50:29
pieces of building your own language
00:50:31
model inside an enterprise extremely accessible, easy
00:50:35
to use so that companies who understand
00:50:38
their domain-specific problems best
00:50:39
can be the ones who can actually wield this technology
00:50:42
and, more importantly, fully own that technology.
00:50:47
In just a few lines of code, you can run an LLM
00:50:50
and be able to imbue it with knowledge
00:50:53
from millions of documents, which
00:50:55
is 40,000 times more than hitting
00:51:00
Claude 2 Pro on that API.
00:51:02
So just a huge amount of information
00:51:05
can be imbued into this technology
00:51:06
using our infrastructure.
00:51:08
And more importantly, our customers
00:51:11
get to fully own their models.
00:51:12
For example, NordicTrack, one of our customers
00:51:16
that makes all the ellipticals and treadmills in the gym,
00:51:20
whose parent company is iFit, they have over 6 million users
00:51:23
on their mobile app platform.
00:51:26
And so they're building an LLM that can actually
00:51:29
create this personal AI fitness coach imbued
00:51:31
with all the knowledge they have in-house
00:51:34
on what a good fitness coach is.
00:51:35
And it turns out it's actually not a professional athlete.
00:51:37
They tried to hire Michael Phelps, did not work.
00:51:39
So they have real knowledge inside of their company
00:51:42
and they're imbuing the LLM with that
00:51:44
so that we can all have personal fitness trainers.
00:51:47
So we're very excited to be working with AMD.
00:51:51
We actually have had a cloud, AMD cloud,
00:51:54
in production for over the past year on MI200,
00:51:58
so MI210, MI250s.
00:52:00
And we're very excited about the MI300s.
00:52:04
And I think something that's been super important to us
00:52:07
is that with Lamini software,
00:52:09
we've actually reached software parity with CUDA
00:52:12
on all the things that matter with large language models,
00:52:15
including inference and training.
00:52:17
And I would say even beyond CUDA.
00:52:19
We have reached beyond CUDA
00:52:21
for things that matter for our customers.
00:52:23
So that's including higher memory,
00:52:25
higher memory or higher capacity means bigger models.
00:52:28
And our customers wanna be able to build
00:52:31
bigger and more capable models.
00:52:32
And then a second point,
00:52:33
which Lisa kind of touched on earlier today is,
00:52:39
these machines, these chips can actually,
00:52:42
given higher bandwidth, be able to return results
00:52:45
with lower latency, which matters for the user experience,
00:52:49
certainly a personal fitness coach,
00:52:51
but for all of our customers as well.
00:52:53
Super exciting, that's great.
00:52:56
Great.
00:52:58
So, Ion, back to you, changing this up a little bit.
00:53:00
So, you heard several key components
00:53:02
of ROCm are open source.
00:53:03
And we did that for rapid adoption
00:53:05
and also getting better, more enhancements
00:53:07
from the community, both open source and AI.
00:53:09
So what do you think about this strategy
00:53:11
and how do you think this approach might help
00:53:12
some of the companies that you've founded?
00:53:15
So obviously, given my history,
00:53:17
really love the open source.
00:53:19
I love the open source ecosystem.
00:53:21
And we try over time to make our own contributions,
00:53:27
and I think that one thing to note
00:53:31
is that many of the generative AI tools today are open source.
00:53:35
And we are talking here about Hugging Face,
00:53:37
about PyTorch, Triton, like I mentioned,
00:53:40
vLLM, Ray, and many others.
00:53:43
And many of these tools actually can run today
00:53:49
on the AMD ROCm stack today.
00:53:54
And this makes ROCm another key component
00:53:58
of the open source ecosystem.
00:54:02
And I think this is great.
00:54:03
And in time, I'm sure, actually quite fast,
00:54:10
the community will take advantage
00:54:12
of the unique capabilities of the AMD
00:54:17
MI250 and MI300X to innovate
00:54:22
and to improve the performance of all these tools
00:54:27
which are running at a higher level of the generative AI stack.
00:54:30
Great, and that's our purpose and aim,
00:54:32
so I'm glad to hear that.
00:54:34
So I'm gonna, out of order execution,
00:54:38
jump over to Sharon.
00:54:40
So Sharon, what do you think about how AI workloads
00:54:44
are evolving in the future?
00:54:46
And what do you think, GPU Instincts,
00:54:48
since you have great experience with it
00:54:50
and ROCm can play in that future of AI development?
00:54:54
Okay, so maybe a bit of a spicy take.
00:54:56
I think that GOFAI, good old-fashioned AI,
00:55:00
is not the future of AI.
00:55:03
And I really do think it's LLMs,
00:55:05
or some variant of LLMs of these models
00:55:08
that can actually be able to soak up
00:55:10
all this general knowledge that is missing
00:55:13
from these traditional algorithms.
00:55:15
And we've seen this across so many different algorithms
00:55:17
in our customers already.
00:55:19
Those who are even at the bleeding edge
00:55:21
of recommendation systems, forecasting systems,
00:55:23
classification, are even using this
00:55:26
because of that general knowledge that it's able to learn.
00:55:29
So I think that's the future.
00:55:30
It's maybe more known as Software 2.0,
00:55:34
coined by my friend, Andrej Karpathy.
00:55:36
And I really do think Software 2.0,
00:55:38
which is hitting these models time and time again,
00:55:41
instead of writing really extensive software
00:55:43
inside a company, we'll be supporting enterprises 2.0,
00:55:48
meaning enterprises of the future, of the next generation.
00:55:53
And I think the AMD Instinct GPUs
00:55:57
are critical to basically supporting,
00:56:01
ubiquitously supporting the Software 2.0 of the future.
00:56:05
And we absolutely need compute
00:56:07
to be able to run these models efficiently,
00:56:09
to run lots of these models, more of these models,
00:56:12
and larger models with greater capabilities.
00:56:15
So overall, very excited with the direction
00:56:18
of not only these AI workloads,
00:56:20
but also the direction that AMD is taking
00:56:23
in doubling down on these MI300s
00:56:25
that, of course, can take on larger models
00:56:28
and more capable models for us.
00:56:30
Awesome.
00:56:33
So Ashish, we'll finish up with you
00:56:36
and I'll give you the same kind of question.
00:56:37
So what do you think about the future of AI workloads
00:56:39
and how do you think our GPUs and ROCm can play
00:56:42
and how you're driving things at Essential?
00:56:44
Yep.
00:56:51
So I think that we have to improve reasoning
00:56:58
and planning to solve these complex tasks,
00:57:01
like take an analyst and if they actually,
00:57:04
they want to absorb an earnings call
00:57:06
and figure out how they should revise their opinion
00:57:10
and whether to invest in a company or what recommendations they should provide.
00:57:13
It's actually gonna take,
00:57:14
it's gonna take multiple reasoning over multiple steps.
00:57:17
It's gonna take ingesting a large document
00:57:20
and being able to extract information from it,
00:57:23
apply their models, actually ask for information
00:57:25
when they don't have any, get world knowledge,
00:57:28
but also maybe have some reasoning
00:57:32
and some outside reasoning and planning there.
00:57:34
And then for all these sort of,
00:57:36
so when I look at the MI300 with very large HBM
00:57:40
and high memory bandwidth,
00:57:42
I think of what's gonna be unlocked
00:57:44
and which capabilities are going to be improved
00:57:46
and what new capabilities will be available.
00:57:48
So I mean, even with what we have today,
00:57:51
just imagine a world where you can process long documents
00:57:55
or you can make these models much more accurate
00:57:57
by adding more examples in the prompt.
00:58:00
But imagine just complete user sessions
00:58:02
that you can maintain and model state,
00:58:04
how they would actually improve
00:58:06
the end-to-end user experience, right?
00:58:08
And I think that we're moving to a kind of architecture
00:58:15
where what typically is to happen in inference,
00:58:17
a lot of search is now gonna go into training
00:58:19
where the models are gonna explore thousands of solutions
00:58:22
and eventually pick one that's actually the best option
00:58:24
for the goal, the best solution for the goal.
00:58:28
And that's good, and definitely the large HBM
00:58:31
and high bandwidth is gonna not only be important
00:58:33
for serving large models with low latency
00:58:35
for better end-to-end experience,
00:58:36
but also for some of these new techniques
00:58:38
that we're just exploring
00:58:40
that are gonna improve the capabilities of these models.
00:58:43
So very excited about the new chip
00:58:46
and what it's gonna unlock.
00:58:47
Great, thank you, Ashish.
00:58:48
Ion, Ashish, Sharon, this has been really terrific.
00:58:51
Thank you so much for all the great insights
00:58:54
you have provided us.
00:58:55
Thank you.
00:58:56
And thank you for joining us today.
00:58:57
Thank you.
00:58:58
Thank you.
00:58:59
Thank you.
00:59:00
Thank you.
00:59:03
It's just so exciting to hear what companies like Databricks,
00:59:06
Essential AI, Lamini are achieving with our GPUs
00:59:09
and just super thrilled that their experience
00:59:12
with our software has been so smooth and really a delight.
00:59:16
So you can tell, they see absolutely no barriers, right?
00:59:19
And they're extremely motivated
00:59:20
to innovate on AMD platforms.
00:59:23
Okay, to sum it up, what we delivered
00:59:25
over the past six months is empowering developers
00:59:28
to execute their mission and realize their vision.
00:59:32
We'll be shipping ROCm 6 very soon.
00:59:34
It's optimized for LLMs and together with the MI300X,
00:59:38
it's gonna deliver 8X gen-on-gen performance improvement
00:59:42
and it's higher performance in inference
00:59:44
than the competition.
00:59:46
We have 62,000 models running on Instinct today
00:59:49
and more models will be running on the MI300 very soon.
00:59:54
We have very strong momentum,
00:59:55
as you can see in the ecosystem,
00:59:57
adding OpenAI Triton to our extensive list
01:00:00
of industry-standard frameworks, models, runtimes, and libraries.
01:00:04
And you heard from the panels, right?
01:00:06
Our tools are proven and easy to use.
01:00:09
Innovators are advancing the state of the art of AI
01:00:12
on AMD GPUs today.
01:00:15
ROCm 6 and the MI300X will drive an inflection point
01:00:19
in developer adoption, I'm confident of that.
01:00:22
We're empowering innovators to realize the profound benefits
01:00:26
of pervasive AI faster on AMD.
01:00:30
Thank you.
01:00:35
And now I'd like to invite Lisa back on the stage.
01:00:45
Thank you, Victor.
01:00:46
And weren't those innovators great?
01:00:48
I mean, you love the energy
01:00:49
and just all of the thought there.
01:00:51
So look, as you can see,
01:00:53
the team has really made great, great progress with ROCm
01:00:57
and our overall software ecosystem.
01:00:59
Now, I said I wanted though,
01:01:00
we really want broad adoption for MI300X.
01:01:03
So let's go through and talk to some additional customers
01:01:07
and partners who are early adopters of MI300X.
01:01:10
Our next guest is a partner really at the forefront
01:01:13
of GenAI innovation and working across models,
01:01:17
software and hardware.
01:01:18
Please welcome Ajit Matthews of Meta to the stage.
01:01:28
Hello, Ajit, it's so nice of you to be here.
01:01:30
We're incredibly proud of our partnership together.
01:01:34
Meta and AMD have been doing so much work together.
01:01:36
Can you tell us a little bit about Meta's vision in AI?
01:01:39
Cause it's really broad and key for the industry.
01:01:43
Absolutely, thanks Lisa.
01:01:45
We are excited to partner with you and others
01:01:48
and innovate together to bring generative AI
01:01:51
to people around the world at scale.
01:01:54
Generative AI is enabling new forms of connection
01:01:57
for people around the world,
01:01:59
giving them the tools to be more creative,
01:02:01
expressive and productive.
01:02:04
We are investing for the future
01:02:06
by building new experiences for people across our services
01:02:09
and advancing open technologies
01:02:12
and research for the industry.
01:02:14
We recently launched AI stickers,
01:02:17
image editing, Meta AI, which is our AI assistant
01:02:21
that spans our family of apps and devices
01:02:25
and lots of AIs for people to interact
01:02:28
within our messaging platforms.
01:02:31
In July, we opened access to our Llama 2 family of models
01:02:36
and as you've seen, have been blown away
01:02:38
by the reception from the community
01:02:40
who have built some truly amazing applications
01:02:44
on top of them.
01:02:45
We believe that an open approach leads to better
01:02:51
and safer technology in the long run
01:02:53
as we have seen from our involvement
01:02:55
in the PyTorch Foundation, Open Compute Project
01:02:58
and across dozens of previous AI models
01:03:02
and data set releases.
01:03:04
We're excited to have partnered with the industry
01:03:06
on our generative AI work, including AMD.
01:03:10
We have a shared vision to create new opportunities
01:03:13
for innovation in both hardware and software
01:03:16
to improve the performance and efficiency of AI solutions.
01:03:22
That's so great, Ajit.
01:03:23
We completely agree with the vision.
01:03:26
We agree with the open ecosystem
01:03:28
and that really being the path to get all of the innovation
01:03:32
from all the smart folks in the industry.
01:03:34
Now, we've collaborated a lot on the product front as well,
01:03:38
both EPYC and Instinct.
01:03:39
Can you talk a little bit about that work?
01:03:42
Yeah, absolutely.
01:03:43
We have been working together on EPYC CPUs since 2019
01:03:48
and most recently deployed Genoa and Bergamo-based servers
01:03:53
at scale across Meta's infrastructure
01:03:55
where it now serves many diverse workloads.
01:04:00
But our partnership is much broader than EPYC CPUs
01:04:04
and we have been working together on Instinct GPUs
01:04:06
starting since the MI100 in 2020.
01:04:10
We have been benchmarking ROCm
01:04:12
and working together on improvements for its support
01:04:16
in PyTorch across each generation of AMD Instinct GPU,
01:04:20
leading up to MI300X now.
01:04:22
Over the years, ROCm has evolved,
01:04:25
becoming a competitive software platform
01:04:27
due to optimizations and ecosystem growth.
01:04:31
AMD is a founding member of the PyTorch Foundation
01:04:34
and has made a significant commitment to PyTorch,
01:04:37
providing day-zero support for PyTorch 2.0
01:04:40
with ROCm, torch.compile, torch.export,
01:04:43
all of those things are great.
01:04:44
We have seen tremendous progress
01:04:45
on both Instinct GPU performance and ROCm maturity
01:04:49
and are excited to see ecosystem support
01:04:52
grow beyond PyTorch 2.0,
01:04:53
like OpenAI Triton, today's announcement
01:04:56
with respect to AMD being a default backend,
01:04:59
that's great, FlashAttention-2 is great,
01:05:02
Hugging Face, great, and other industry frameworks.
01:05:05
All of these are great partnerships.
01:05:08
It really means a lot to hear you say that, Ajit.
01:05:10
I think we also view
01:05:12
that it's been an incredible partnership.
01:05:13
I think the teams work super closely together,
01:05:16
that's what you need to do to drive innovation.
01:05:18
And the work with PyTorch Foundation
01:05:20
is foundational for AMD, but really the ecosystem as well.
01:05:25
But our partnership is very exciting right now with GPUs,
01:05:29
so can you talk a little bit about the 300X plans?
01:05:31
Oh, here we go.
01:05:32
We are excited to be expanding our partnership
01:05:34
to include Instinct MI300X GPUs
01:05:37
in our data centers for AI inference workloads.
01:05:40
Thank you, so much.
01:05:45
So, just to give you a little background,
01:05:47
MI300X leverages the OCP accelerator module,
01:05:51
standard and platform,
01:05:52
which has helped us adopt it in record time.
01:05:55
In fact, MI300X is trending to be one of the fastest
01:05:58
design-to-deployment solutions in Meta's history.
01:06:05
We have also had a great experience with ROCm,
01:06:09
and the performance it is able to deliver with MI300X.
01:06:12
The optimizations and the ecosystem growth over the years
01:06:16
have made ROCm a competitive software platform.
01:06:19
As model parameters increase
01:06:21
and the Llama family of models continues to grow in size
01:06:24
and power, which it will,
01:06:26
the MI300X with its 192 GB of memory
01:06:29
and higher memory bandwidth meets the expanding requirements
01:06:32
for large language model inference.
01:06:34
We are really pleased with the ROCm optimizations
01:06:37
that AMD has done,
01:06:39
focused on the Llama 2 family of models on MI300X.
01:06:43
We are seeing great, promising performance numbers,
01:06:46
which we believe will benefit the industry.
01:06:49
So, to summarize, we are thrilled with our partnership
01:06:52
and excited about the capabilities offered by the MI300X
01:06:56
and the ROCm platform as we start to scale their use
01:06:59
in our infrastructure for production workloads.
01:07:02
That is absolutely fantastic, Ajit.
01:07:04
Thank you, Lisa.
01:07:05
Thank you so much.
01:07:07
We are thrilled with the partnership
01:07:09
and we look forward to seeing lots of MI300Xs
01:07:12
in your infrastructure. So, thank you for being here.
01:07:14
That's good. Thank you.
01:07:19
So, super exciting.
01:07:21
We said cloud is really where a lot of the infrastructure
01:07:24
is being deployed,
01:07:25
but enterprise is also super important.
01:07:28
So, when you think about the enterprise right now,
01:07:30
many enterprises are actually thinking about their strategy.
01:07:33
They want to deploy AI broadly
01:07:35
across both cloud and on-prem,
01:07:38
and we're working very closely with our OEM partners
01:07:41
to bring very integrated enterprise AI solutions
01:07:44
to the market.
01:07:45
So, to talk more about this,
01:07:47
I'd like to invite one of our closest partners to the stage,
01:07:50
Arthur Lewis, President of Dell Technologies
01:07:52
Infrastructure Solutions Group.
01:07:58
Hey, welcome, Arthur.
01:08:00
I'm so glad you could join us for this event.
01:08:02
And Dell and AMD have had such a strong history of partnership.
01:08:06
I actually also think, Arthur,
01:08:08
you have a very unique perspective
01:08:10
of what's happening in the enterprise,
01:08:11
just given your purview.
01:08:13
So, can we just start with giving the audience
01:08:15
a little bit of a view of what's happening in enterprise AI?
01:08:18
Yeah, Lisa, thank you for having me today.
01:08:21
We are at an inflection point with artificial intelligence.
01:08:26
Traditional machine learning and now generative AI
01:08:29
is a catalyst for much greater data utilization,
01:08:32
making the value of data tangible
01:08:34
and therefore quantifiable.
01:08:37
Data, as we all know, is growing exponentially.
01:08:39
A hundred zettabytes of data was generated last year,
01:08:42
more than doubling over the last three years.
01:08:44
And IDC projects that data will double again by 2026.
01:08:49
And it is clear that data is becoming
01:08:51
the world's most valuable asset.
01:08:53
And this data has gravity.
01:08:55
83% of the world's data resides on-prem,
01:08:59
and much of the new data will be generated at the edge.
01:09:03
Yet customers are dealing with years of rapid data growth,
01:09:07
multiple copies on-prem across clouds,
01:09:10
proliferating data sources, formats, and tools.
01:09:13
These challenges, if not overcome,
01:09:15
will prevent customers from realizing
01:09:17
the full potential of artificial intelligence
01:09:19
and maximizing real business outcomes.
01:09:22
Today, customers are faced with two suboptimal choices.
01:09:27
Number one, stitch together a complex web
01:09:30
of technologies and tools and manage it themselves,
01:09:33
or two, replicate their entire data estate
01:09:37
in the public cloud.
01:09:39
Customers need and deserve a better solution.
01:09:43
Our job is to bring artificial intelligence to the data.
01:09:47
That's great perspective, Arthur.
01:09:49
And that 83% of the data and where it resides,
01:09:53
I think, is something that sticks in my mind a lot.
01:09:55
Now let's move to a little bit of the technology.
01:09:57
I mean, we've been partnering together
01:09:58
to bring some great solutions to the market.
01:10:00
Tell us more about what you have planned
01:10:02
from a tech standpoint.
01:10:03
Well, today's an exciting day.
01:10:05
We are announcing a much-anticipated update
01:10:08
to our PowerEdge XE9680 family,
01:10:11
the fastest growing product in Dell ISG history,
01:10:15
with the addition of AMD's Instinct MI300X Accelerator
01:10:19
for artificial intelligence.
01:10:27
Effective today, we are going to be able to offer
01:10:29
a new configuration of eight MI300X accelerators,
01:10:35
providing 1.5 terabytes of coherent HBM3 memory,
01:10:39
delivering 5.3 terabytes per second of memory bandwidth per GPU.
01:10:44
This is an unprecedented level of performance
01:10:47
in the industry and will allow customers
01:10:49
to consolidate large language model inferencing
01:10:53
onto a smaller number of servers,
01:10:55
while providing for training at scale,
01:10:58
while also reducing complexity, cost,
01:11:01
and data center footprint.
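A back-of-the-envelope sketch of that consolidation math, assuming a hypothetical 140 GB FP16 model:

```python
# Aggregate HBM3 in the eight-GPU configuration just described.
gpus_per_server = 8
hbm_per_gpu_gb = 192
server_hbm_gb = gpus_per_server * hbm_per_gpu_gb
print(server_hbm_gb)  # 1536 GB, i.e. the ~1.5 TB quoted

# How many copies of a hypothetical 140 GB model fit per server;
# more replicas per box is what enables consolidation onto fewer servers.
model_gb = 140
print(server_hbm_gb // model_gb)  # 10 replicas
```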
01:11:03
We are also leveraging AMD's Instinct Infinity Platform,
01:11:07
which provides a unified fabric
01:11:10
for connecting multiple GPUs within and across servers,
01:11:14
delivering near linear scaling
01:11:16
and low latency for distributed AI.
01:11:20
Further,
01:11:22
and there's more.
01:11:26
Through our collaboration with AMD
01:11:28
on software and open source frameworks,
01:11:30
which Lisa, you talked a lot about today,
01:11:32
including PyTorch and TensorFlow,
01:11:34
we can bring seamless services for customers
01:11:37
and an out-of-the-box LLM experience.
01:11:40
We talked about making it simple.
01:11:41
This makes it incredibly simple.
01:11:43
And we've also optimized the entire stack
01:11:47
with Dell storage,
01:11:48
specifically PowerScale and ObjectScale,
01:11:50
providing ultra low latency ethernet fabrics,
01:11:53
which are designed specifically
01:11:54
to deliver the best performance and maximum throughput
01:11:58
for generative AI training and inferencing.
01:12:01
This is an incredibly exciting step forward.
01:12:04
And again, effective today, Lisa,
01:12:06
we're open for business,
01:12:08
we're ready to quote,
01:12:09
and we're taking orders.
01:12:10
I like the sound of that.
01:12:15
Look, it's so great to see how this all comes together.
01:12:19
Our teams have been working so closely together
01:12:22
over the last few years
01:12:23
and definitely over the last year.
01:12:26
Tell us though, there's a lot of co-innovation
01:12:28
and differentiation in these solutions.
01:12:31
So just tell us a little bit more about that.
01:12:33
Well, our biggest differentiator
01:12:35
is really the breadth of our technology portfolio at
01:12:38
Dell Technologies.
01:12:39
Products like PowerScale,
01:12:41
which is our OneFS file system for unstructured data storage,
01:12:44
has been helping customers in industries
01:12:46
like financial services, manufacturing, life sciences,
01:12:49
to help solve the world's most challenging problems
01:12:52
for decades as the complexity of their workflows
01:12:55
and scale of their data estate increases.
01:12:58
And with AMD, we are bringing these components together
01:13:01
with open networking products and AI fabric solutions,
01:13:05
taking the guesswork out of building tailored gen AI solutions
01:13:09
for customers of all sizes, again, making it simple.
01:13:13
We have both partnered with Hugging Face
01:13:15
to ensure transformers and LLMs for generative AI
01:13:19
don't just work for our combined solutions
01:13:21
but are optimized for AMD's accelerators
01:13:24
and easy to configure and size for workloads with our products.
01:13:29
And in addition to that, with Dell Validated Designs,
01:13:34
we have a comprehensive set
01:13:35
and a growing array of services and offerings
01:13:38
that can be tailored to meet the needs of customers
01:13:41
looking for a complimentary gen AI strategy consultation
01:13:46
all the way up to a fully managed solution
01:13:49
for generative AI.
01:13:50
That's fantastic, Arthur.
01:13:52
Great set of solutions, love the partnership
01:13:54
and love what we can do
01:13:56
for our enterprise customers together.
01:13:57
Thank you so much for being here.
01:13:58
Thank you for having me, Lisa.
01:14:00
Yeah.
01:14:04
Our next guest is another great friend.
01:14:06
Supermicro and AMD have been working together
01:14:09
to bring leadership computing solutions to the market
01:14:11
for many years based on AMD EPYC processors
01:14:14
as well as Instinct accelerators.
01:14:15
Here to tell us more about that,
01:14:17
please join me in welcoming CEO Charles Liang to the stage.
01:14:20
Congratulations.
01:14:27
Thank you so much.
01:14:28
Hello, Charles.
01:14:29
For a successful launch.
01:14:30
Yeah, thank you so much for being here.
01:14:31
I mean, Supermicro is really well known
01:14:33
for building highly optimized systems
01:14:37
for lots of workloads.
01:14:38
We've done so much together.
01:14:40
Can you share a little bit
01:14:41
about how you're approaching gen AI?
01:14:43
Thank you.
01:14:44
Our building block solutions are
01:14:46
based on a modularized design.
01:14:48
So that enables Supermicro to design products
01:14:51
quicker than others, deliver products to customers
01:14:55
quicker, better leverage inventory,
01:14:58
and provide better service.
01:15:00
And thank you for our close relationship
01:15:02
and for all your support.
01:15:04
That's why we are able to bring products
01:15:06
to market as soon as possible.
01:15:10
Well, I really appreciate that our teams
01:15:12
also work very closely together.
01:15:14
And we now know that everybody is calling us
01:15:18
for AI solutions.
01:15:19
You've built a lot of AI infrastructure.
01:15:22
What are you seeing in the market today?
01:15:24
Oh, the market continues to grow very fast.
01:15:27
The only limitation is-
01:15:28
Very fast, right?
01:15:29
Very fast.
01:15:30
Maybe more than very fast.
01:15:33
So all we need is just more chips.
01:15:36
I know.
01:15:45
So today, across the USA,
01:15:47
the Netherlands, Taiwan, and Malaysia,
01:15:49
we have more than 4,000 racks per month of capacity,
01:15:54
and customers are facing not-enough-power,
01:15:58
not-enough-space problems.
01:16:00
So with our rack-scale building block solutions,
01:16:04
with free-air cooling,
01:16:07
optimized for hybrid air and free-air cooling,
01:16:10
or optimized for liquid cooling,
01:16:12
we can help customers save energy
01:16:15
by up to 30 or even 40%.
01:16:17
And that allows customers to install more systems
01:16:21
within a fixed power budget:
01:16:23
the same power and the same systems,
01:16:27
but less energy cost.
01:16:29
So with all of those,
01:16:30
together with our rack-scale building block solution,
01:16:34
we install the whole rack,
01:16:36
including the CPUs, GPUs,
01:16:40
storage, switches,
01:16:45
firmware, management software,
01:16:48
and security functions.
01:16:49
And when we ship to the customer,
01:16:51
the customer simply plugs in two cables,
01:16:54
the power cable and the data cable,
01:16:57
and then it's ready to run, ready to go online.
01:17:00
For liquid-cooling customers,
01:17:02
for sure, they also need to connect the water tubing.
01:17:06
So that means customers can easily get online
01:17:11
as soon as chips are available.
01:17:13
Yeah, no, that's fantastic.
01:17:15
Thank you, Charles.
01:17:16
Now, let's talk a little bit about MI300X.
01:17:18
What do you have planned for MI300?
01:17:21
Okay, the big product.
01:17:22
We have products based on MI300X,
01:17:27
like an 8U optimized
01:17:29
for air cooling,
01:17:31
and then a 4U optimized for liquid cooling.
01:17:33
For the air-cooled rack,
01:17:37
we support up to 40 kW or 50 kW.
01:17:40
For liquid cooling,
01:17:41
we support up to 80 kW or 100 kW.
01:17:46
And it's all rack-scale plug and play.
01:17:50
So when customers need it,
01:17:51
once we have the chips,
01:17:52
we can ship to customers quicker.
01:17:55
That sounds wonderful.
01:17:56
Well, look, we appreciate all the partnership, Charles,
01:17:58
and we will definitely see a lot of opportunity
01:18:02
to collaborate together on the generative AI.
01:18:04
So thank you so much.
01:18:05
Thank you so much.
01:18:06
Thank you.
01:18:12
Okay, now let's turn to our next guest.
01:18:14
Lenovo and AMD have a broad partnership as well
01:18:17
that spans from data center to workstations and PCs,
01:18:20
and now to AI.
01:18:22
So here to tell us about this special partnership,
01:18:24
please welcome to the stage, Kirk Skaugen,
01:18:26
EVP and President of Infrastructure Solutions Group
01:18:29
at Lenovo.
01:18:34
Hello, Kirk.
01:18:35
Thank you so much for being here.
01:18:37
We truly appreciate the partnership with Lenovo.
01:18:41
You have a great perspective as well.
01:18:43
Tell us about your view of AI
01:18:45
and what's going on in the market.
01:18:47
Sure.
01:18:48
Well, AI is not new for Lenovo.
01:18:49
We've been talking and innovating around AI for many years.
01:18:52
We just had a great Supercomputing conference,
01:18:54
where we're the number one supercomputer provider
01:18:56
on the TOP500,
01:18:58
and we're proud that IDC just ranked us number three
01:19:01
in AI server infrastructure in the world as well.
01:19:02
So it's not new to us,
01:19:04
but you were at Tech World,
01:19:05
so thanks for joining us in Austin.
01:19:08
We're trying to help shape the future of AI
01:19:10
from the pocket to the edge to the cloud,
01:19:12
and we've had this kind of concept of AI for all.
01:19:15
So what does that mean?
01:19:16
Pocket meaning Motorola, smartphone, AI devices,
01:19:21
and then all the way to the cloud with our ODM Plus model.
01:19:23
So our collaboration with our customers
01:19:27
is really to accelerate AI adoption,
01:19:29
and we recently announced another billion dollars
01:19:32
to the original $1.2 billion we announced a few years ago
01:19:34
to deliver AI solutions to businesses of all sizes,
01:19:37
from the smallest business to the largest cloud.
01:19:39
So we believe that generative AI
01:19:41
will ultimately be a hybrid approach,
01:19:44
and fundamentally we do want to bring AI to the data.
01:19:47
I think one of the most exciting things for me is,
01:19:49
I think like Arthur said, right,
01:19:50
we'd see data doubling in the world over the next few years.
01:19:54
75% of that compute is moving to the edge,
01:19:56
and today we're only computing 2% of it,
01:19:58
so we're throwing away 98%.
01:20:00
So more data is going to be created in the next few years
01:20:02
than in the entire history of the world combined,
01:20:04
and together we're bringing AI to the edge
01:20:07
with the recent SE455 ThinkEdge that we announced.
01:20:10
We think that there's kind of three views of generative AI,
01:20:13
public AI, private AI, and personal AI,
01:20:16
and the key for us is protecting privacy
01:20:18
and addressing data security.
01:20:19
So public AI where you'd use obviously public data,
01:20:23
enterprise AI where you'd use only your enterprise data
01:20:26
within your firewall, and then on things like an AI PC,
01:20:29
things that you choose to have only on your device,
01:20:31
whether that's a phone, a tablet, or a PC.
01:20:33
Yeah, no, no, it's a very comprehensive vision,
01:20:35
and we see it very much the same way.
01:20:38
Now, you talked a lot about your AI strategy at Tech World,
01:20:41
and you had some key pillars there.
01:20:44
Do you want to just tell us a little bit more about that?
01:20:45
Yeah, so I think there's three fundamental pillars
01:20:47
of our AI vision and strategy.
01:20:48
First, we have an AI product roadmap,
01:20:50
I think that's second to none,
01:20:51
from a rich smart device portfolio,
01:20:53
and we'll probably talk more about AI PCs another day,
01:20:56
smartphones and tablets.
01:20:57
Then we have a huge array now of over 70 AI-ready server
01:21:02
and storage infrastructure products,
01:21:04
and then we've recently launched a whole set of solutions
01:21:07
and services around that as well.
01:21:08
So more than 70 products,
01:21:10
and we'll talk about the new ones we're announcing today,
01:21:12
which are very exciting.
01:21:13
The second thing is we have something called
01:21:15
an AI innovators program.
01:21:16
What's really daunting to people
01:21:18
is there's over 16,000 AI startups out there.
01:21:20
So if you have an IT department of a few dozen people,
01:21:23
how do you even start?
01:21:24
So we've gone and scoured the earth,
01:21:27
we've found 65 ISVs, 165 solutions
01:21:30
where we've optimized them
01:21:31
on top of Lenovo infrastructure
01:21:33
for some of the key verticals,
01:21:34
and are delivering kind of simplified AI
01:21:36
to the customer base.
01:21:37
And then at Tech World,
01:21:38
we launched a comprehensive set of professional services.
01:21:42
Now Lenovo, more than 40% of our revenue is non-PC,
01:21:45
so we're transforming into a data center and services company.
01:21:48
So we're doing everything in the AI
01:21:50
from just basic customer discovery of what you can do
01:21:52
if you're a stadium,
01:21:54
what are the best-in-class stadium solutions
01:21:55
if you're a fast food chain, if you're a supermarket,
01:21:58
all the way to AI adoption.
01:22:00
And then even from a sustainability perspective,
01:22:02
things like asset recovery services
01:22:04
to make sure you have a sustainable AI journey as well.
01:22:06
Yeah, I know it makes a lot of sense.
01:22:07
And you know, gen AI and large language models
01:22:10
is sort of the defining moment for us right now.
01:22:12
You're spending a lot of time with customers.
01:22:14
What are you hearing from them
01:22:15
and what are their challenges?
01:22:16
Yeah, so I think the key message
01:22:18
is that customers need help in simplifying their AI journey.
01:22:21
I mean, there's so much coming at them.
01:22:23
So our investments in that $2 billion we talked about
01:22:26
are really expanding our AI-ready portfolio
01:22:28
to deliver fully integrated systems
01:22:30
that bring AI-powered computing to everywhere data is created,
01:22:34
especially the edge,
01:22:35
and helping businesses easily and efficiently
01:22:37
deploy generative AI applications.
01:22:39
We're also hearing that customers want choice.
01:22:42
Choice in systems, choice in software,
01:22:44
choice in services, and definitely large language models
01:22:47
and model training are creating a lot of buzz.
01:22:49
But over time, I think we all know inference
01:22:51
is gonna become the dominant AI workload
01:22:53
as data flows from these billions
01:22:55
of connected devices at the edge.
01:22:57
So generative AI from our perspective,
01:22:59
like you said, I think in your opening comments,
01:23:01
needs high-performance compute,
01:23:03
large and fast memory, and a software stack
01:23:05
to support the leading AI ecosystem solution.
01:23:07
So with that, I believe Lenovo and AMD
01:23:10
are really uniquely positioned
01:23:12
to take advantage of these trends.
01:23:14
Yeah, absolutely. And our teams are doing a lot of work together
01:23:17
and working closely on MI300X.
01:23:19
Tell us more about your plans.
01:23:21
Well, we have a long proven track record as a PC company
01:23:25
and as a data center company of bringing Ryzen AI
01:23:27
to our ThinkPads, and we're committed
01:23:30
to being time to market on large language models,
01:23:32
on inferencing, and we're working with AMD
01:23:35
to develop our next-gen AI product roadmap
01:23:37
and our solution portfolios.
01:23:39
So we're incredibly excited today
01:23:40
about the addition of the MI300X
01:23:42
to the Lenovo ThinkSystem platform.
01:23:44
It's gonna be very exciting.
01:23:45
Thank you. Thank you.
01:23:48
So we're committed to be time to market
01:23:50
with a dual-EPYC, eight-GPU MI300X system
01:23:54
and have a lot of customer interest on that.
01:23:56
So bottom line, from edge to cloud,
01:23:59
we are incredibly excited about what's ahead for us.
01:24:03
We're gonna have all of this available as a service
01:24:05
through our Lenovo TruScale as well.
01:24:07
So you only have to pay for what you need.
01:24:09
So as we move to an as-a-service model,
01:24:10
everything we talked about today
01:24:12
will be available through that as well.
01:24:13
So thank you very much and look forward
01:24:15
to continuing the collaboration.
01:24:16
Absolutely, Kirk.
01:24:17
Thank you so much.
01:24:18
Thanks for the partnership.
01:24:19
All right, thank you.
01:24:23
So that's great.
01:24:24
Big thank you to Kirk and Arthur and Charles
01:24:27
for all the work that we're doing together
01:24:28
to really bring MI300X to our customers.
01:24:31
It really does take an entire ecosystem.
01:24:33
We're very proud of actually the broad OEM and ODM ecosystem
01:24:37
that we have brought together
01:24:38
to bring a wide range of MI300X solutions to market in 2024.
01:24:43
And in addition to the OEM and ODM ecosystem,
01:24:46
we're also significantly expanding our work
01:24:49
with some of these specialized AI cloud partners.
01:24:51
So I'm happy to say today that all of these partners
01:24:54
are adding MI300X to their portfolio.
01:24:57
And what's important about this is
01:24:59
it will actually make it easier for developers
01:25:01
and AI startups to get access to MI300X GPUs
01:25:05
as soon as possible with a proven set of providers
01:25:08
who each have their unique value and capabilities.
01:25:12
So that tells you a little bit about the ecosystem
01:25:14
that we're putting together for MI300X.
01:25:21
Now, we've given you a lot of information already,
01:25:24
but what is very, very important is not just the hardware
01:25:28
and the software and all of our customer partnerships,
01:25:31
but it's also the rest of the system partnerships.
01:25:33
So now let me welcome to the stage Forrest Norrod
01:25:36
to talk more about our AI networking
01:25:38
and high-performance computing solutions.
01:25:46
Thank you, Lisa.
01:25:47
Good morning.
01:25:48
So far, we've talked about the amazing GPU
01:25:51
and open software ecosystem that AMD is building
01:25:55
to power generative AI systems.
01:25:57
But there's a third element that's equally important
01:26:02
to the performance and scalability
01:26:03
of these large AI deployments, and that's networking.
01:26:08
The compute required to train the most advanced models
01:26:12
has increased by a factor of 50 billion
01:26:15
over the past decade.
01:26:17
While GPU performance has also increased,
01:26:21
what that performance demand means is we need many GPUs
01:26:25
in order to deliver the required total performance.
01:26:30
Leading AI clusters are now tens of thousands of GPUs,
01:26:35
and that's only going to increase.
01:26:38
Well, so the first way we've scaled to meet that demand
01:26:42
is within the server.
01:26:43
A typical server has perhaps a couple
01:26:46
of high-performance x86 CPUs and perhaps eight GPUs.
01:26:50
You've seen that today.
01:26:52
These are interconnected with a high-performance,
01:26:54
low-latency, non-blocking local fabric.
01:26:58
In the case of NVIDIA, that's NVLink.
01:27:01
For AMD, that's Infinity Fabric.
01:27:04
Both have high signaling rates, low latency,
01:27:08
both are coherent.
01:27:10
Both have demonstrated the ability
01:27:11
to offer near-linear scaling performance
01:27:14
as you increase the number of GPUs,
01:27:17
and both have been proprietary,
01:27:20
effectively only supported by the companies
01:27:22
that created them.
01:27:24
I'm pleased to say that today, AMD is changing that.
01:27:29
We are extending access to the Infinity Fabric ecosystem
01:27:33
to strategic partners and innovative companies
01:27:36
across the industry.
01:27:45
Doing so allows others to innovate
01:27:47
around the AMD GPU ecosystem to the benefit of customers
01:27:51
and the entire industry.
01:27:53
You'll hear more about this from one of our partners
01:27:55
in a few minutes and much more on this initiative next year.
01:28:00
But beyond the node, we still need to connect and scale
01:28:04
to much larger numbers.
01:28:06
We need fabrics to connect the servers to one another,
01:28:09
welding them into one resource.
01:28:12
Now, there are usually two networks connected
01:28:15
to each of these GPU servers.
01:28:17
A traditional ethernet network used to connect the server
01:28:21
to the rest of the traditional data center infrastructure,
01:28:25
and more importantly, a backside network
01:28:29
to interconnect the GPUs, allowing them to share parameters,
01:28:33
results, activations, and coordinate
01:28:36
in the overall training and inference tasks.
01:28:40
When we're connecting thousands of nodes
01:28:42
like we do in AI systems, the network is critical
01:28:46
to overall performance.
01:28:48
It has to deliver fast switching rates
01:28:50
at very low latency.
01:28:52
It must be efficiently scalable
01:28:54
so that congestion problems don't limit performance.
01:28:58
And in AMD, we believe it must also be open,
01:29:01
open to allow innovation.
01:29:04
Today, there are two options for the backend fabric,
01:29:08
InfiniBand or Ethernet.
01:29:10
At AMD, we believe Ethernet is the right answer.
01:29:14
It's a high-performance technology with leading signaling rates.
01:29:18
It has extensions such as RoCE and RDMA
01:29:21
to efficiently move data between nodes,
01:29:25
a set of innovations developed
01:29:27
for leading supercomputers over the years.
01:29:30
It's scalable, offering the highest-rate switching technology
01:29:34
from leading vendors such as Broadcom, Cisco, and Marvell.
01:29:38
And we've seen tremendous innovation recently
01:29:41
in advanced congestion control
01:29:43
to deal with the issues of scale effectively.
01:29:47
And most of all, it's open.
01:29:49
Open means companies can extend ethernet,
01:29:51
innovating on top as needed to solve new problems.
01:29:56
We've seen that from Hewlett Packard Enterprise
01:29:58
with their Slingshot technology,
01:30:00
which powers the network at the heart of Frontier,
01:30:02
the world's fastest supercomputer,
01:30:04
enabling it to achieve exascale performance.
01:30:08
And we've seen Google and AWS,
01:30:10
who run some of the largest clusters in the world,
01:30:13
develop their own ethernet extensions.
01:30:16
And finally, maybe most importantly,
01:30:18
we've seen the industry come together
01:30:21
to create the Ultra Ethernet Consortium and Standard,
01:30:25
where leaders across the field have united
01:30:27
to drive the future of ethernet
01:30:30
and ensure it's the best high-performance interconnect
01:30:34
for AI and HPC.
01:30:37
And we're proud to welcome to the stage today
01:30:40
some of those networking leaders.
01:30:42
Andy Bechtolsheim from Arista,
01:30:46
Jas Tremblay from Broadcom,
01:30:48
and Jonathan Davidson from Cisco.
01:31:00
Welcome, gentlemen.
01:31:01
It's not often that we have such a panel
01:31:05
of ethernet experts on the stage.
01:31:08
But before we jump right into ethernet,
01:31:12
perhaps we can talk a little bit about the work
01:31:14
of enabling an ecosystem for AI solutions,
01:31:17
and what that looks like,
01:31:18
and why is it so important to have an open approach?
01:31:22
And maybe, Jonathan, you can start.
01:31:24
Sure, absolutely. Well, first of all, congratulations
01:31:26
on all the announcements today.
01:31:29
We look at how ethernet is so critical,
01:31:34
because I remember back in the day
01:31:37
doing testing on 10 megabit ethernet interoperability.
01:31:42
We're now at 400 gig, 800 gig.
01:31:44
We have line of sight to 1.6 terabit.
01:31:46
It is absolutely ubiquitous across the industry,
01:31:49
and it's also interoperable.
01:31:52
It's a beautiful thing.
01:31:53
So that open standard is really important
01:31:55
for us to be able to make this successful.
01:31:59
Absolutely.
01:32:00
And, Jas, your thoughts as well.
01:32:02
No, I 100% agree.
01:32:04
Forrest, you and I share a vision
01:32:06
of the power of the data center ecosystem.
01:32:08
You think about a data center,
01:32:10
you've got thousands of companies coming together
01:32:12
to work as one, and this is really enabled
01:32:16
by open standards and a code of conduct
01:32:19
that says we shall interoperate.
01:32:20
We're gonna make things work together across companies,
01:32:23
in some cases, across competitors,
01:32:25
and I'm especially excited about the work
01:32:27
that you and I have been doing on Infinity Fabric xGMI,
01:32:32
and we wanna let the industry know
01:32:36
that the next generation of Broadcom PCIe switches,
01:32:40
which are used as the internal fabric inside AI servers,
01:32:44
are gonna support Infinity Fabric xGMI,
01:32:46
and we'll be sharing more details around that
01:32:48
over the next few quarters.
01:32:49
But I think it's important that we offer choices
01:32:54
and options to customers,
01:32:55
and that we come together and jointly innovate.
01:32:58
I completely agree,
01:32:59
and Andy, you've long been a proponent of open.
01:33:03
Yeah, well, open standards have been the driving force
01:33:07
for a lot of the innovation
01:33:09
throughout the industry's history,
01:33:11
but nowhere is this more true than in the case of ethernet,
01:33:14
where the incredible progress we've seen
01:33:16
for the last 40 years would not have happened
01:33:19
without the contributions
01:33:21
of many, many ecosystem participants,
01:33:23
including the companies that are represented here
01:33:25
on this stage.
01:33:27
Absolutely, well, okay,
01:33:28
so since this is a panel of ethernet luminaries,
01:33:33
let's talk about ethernet in particular.
01:33:35
What are the advantages of ethernet for AI?
01:33:38
What are the advantages of ethernet in general,
01:33:41
and how are customers using it today?
01:33:43
We'll talk about the future in a minute,
01:33:44
but let's reflect on current state.
01:33:46
Maybe, Andy, you can start out.
01:33:48
Yeah, so ethernet, at least to me,
01:33:51
is the clear choice for AI fabrics,
01:33:54
and for a very basic reason:
01:33:56
it doesn't have a scalability limit.
01:33:58
It can truly support not just tens of thousands of nodes today,
01:34:02
but hundreds of thousands, perhaps even a million nodes in the future,
01:34:05
and there is no other network technology
01:34:08
that has that attribute,
01:34:10
and without that scalability,
01:34:12
you're just boxing yourself in.
01:34:15
Yeah, very true, and Jonathan,
01:34:16
I know you guys have been working quite a bit
01:34:19
on AI networking systems as well.
01:34:22
Maybe you could amplify. Absolutely.
01:34:23
Well, for today specifically,
01:34:25
we see the majority of hyperscalers,
01:34:26
as you've had some of them on the stage today,
01:34:28
are either using ethernet for AI fabrics,
01:34:31
or there's a high desire for them to move to ethernet
01:34:34
for the AI fabrics, and so that requires
01:34:37
a lot of collaboration from the folks up here on stage
01:34:39
to make that happen.
01:34:41
In the past, we have also been helping customers deploy
01:34:45
their AI networks for enterprise use cases globally,
01:34:49
and it might have started more
01:34:50
in the financial trading sector in the past,
01:34:53
but we're seeing a tremendous amount
01:34:54
of interest in use cases for that whole system
01:34:58
and how you pull all those things together
01:35:00
from the network, the GPU, the NIC, the DPU,
01:35:03
all the way to how you wrap the software around that
01:35:06
to really make it simple and understand
01:35:09
how things are working, and when they're not working,
01:35:11
why, and making that simple for them to do that as well.
01:35:14
Absolutely, and Jas, I know, well, all of us
01:35:17
have been working together in deploying
01:35:19
ethernet-based solutions for AI leaders today,
01:35:23
but I mean, we've been working with the two gentlemen
01:35:27
on the end on switching, but Jas,
01:35:30
maybe you can reflect on the NIC as well.
01:35:33
I think the NIC is critical.
01:35:35
People want choices, and we need to move the innovation
01:35:39
even faster in the NIC, and you'll see much more linkages
01:35:44
between the NIC and the switch,
01:35:45
where before you had a compute domain and a network domain,
01:35:50
and these things are really coming together,
01:35:53
and AI is a driving force of that,
01:35:55
because the complexity is going up so much.
01:35:57
Yeah, absolutely.
01:35:58
Well, okay, so let's talk about the future a little bit.
01:36:01
You know, with the Ultra Ethernet Consortium, all three,
01:36:06
all four companies on stage are founding members,
01:36:10
and there's many others that have joined.
01:36:14
You know, UEC is one of the fastest growing,
01:36:17
or maybe the fastest growing consortium
01:36:20
under the Linux Foundation, which has been great to see.
01:36:23
It's gonna shape, I think, UEC is gonna shape
01:36:25
the future of AI networking, and so let's unpack that,
01:36:29
because I think that's a critical topic for folks.
01:36:31
And maybe, Jas, why don't you go ahead and start off.
01:36:34
Yeah, so first of all, ethernet is ready today for AI,
01:36:38
but we need to continue to innovate,
01:36:41
and UEC started with a group of eight companies,
01:36:44
including four of our companies here, cloud providers,
01:36:48
system providers, and semiconductor providers,
01:36:52
coming together around a common vision,
01:36:54
and the vision is AI networks need to be open,
01:36:58
standards-based, we need to offer choices,
01:37:01
and we need to enhance them.
01:37:03
And with that common vision, you know,
01:37:05
the engineers we've assigned from our companies
01:37:07
really got together and rolled up their sleeves,
01:37:10
and the innovation happened extremely quickly.
01:37:14
It's quite exciting, actually.
01:37:15
And one of the things that I'm most excited about this
01:37:19
is we're not building something new.
01:37:22
We are jointly going to enhance ethernet
01:37:26
that's existed for 50 years.
01:37:29
So it's not starting from scratch, it's enhancing,
01:37:30
it's recognizing that ethernet is what people want.
01:37:33
We just need to continue to enhance it
01:37:35
and keep it open and standards-based.
01:37:37
Absolutely, and Jonathan, I know Cisco's been
01:37:41
a huge proponent of UEC as well.
01:37:43
Maybe you can reflect on your thoughts
01:37:45
of where this is going.
01:37:46
Absolutely, well, I think that UEC absolutely
01:37:49
is very critical for Cisco, everyone on the panel,
01:37:52
and the whole industry so that we can continue
01:37:54
to drive that movement towards open.
01:37:59
It always takes time. You gotta debate what are the right
01:38:00
technical ways to solve things,
01:38:01
but I think that overall it's moving in the right direction.
01:38:04
What I see happening here is that
01:38:06
we're gonna have to have interoperability
01:38:08
in more than just one area.
01:38:10
Andy, I wanna talk about LPO and all the things
01:38:12
that we need to do there to make that actually happen.
01:38:17
And what's happening at UEC is another important part.
01:38:20
And what I see happening between now
01:38:22
and when the first standard comes out
01:38:24
is really a coalition of the willing.
01:38:25
Like, how do we get all of us together
01:38:27
to drive towards those open interfaces,
01:38:30
whether it be at the ethernet layer,
01:38:32
whether it be at things that you need to plug into it,
01:38:34
how the GPUs connect into that,
01:38:36
how you're actually gonna spray traffic
01:38:38
across a very broad radix,
01:38:40
how you're gonna make sure you can reorder packets
01:38:42
in a consistent way.
01:38:43
These are all things that we need to make sure
01:38:45
that we are driving towards
01:38:47
from an interoperability perspective.
01:38:49
And we've got our own silicon, we've got optics,
01:38:52
but we also are in the component business at Cisco.
01:38:55
And so we sell those things.
01:38:57
Hyperscalers might wanna just buy pieces from us,
01:38:59
like the silicon, and enterprises may want the full system.
01:39:02
But we wanna make sure that it's absolutely 100%
01:39:04
interoperable in every single environment.
01:39:07
Absolutely.
01:39:08
And Andy, maybe you can hone in a little bit more.
01:39:11
I mean, I think many people that aren't familiar
01:39:14
with networking may think, hey, how hard can this be?
01:39:17
We're just shuffling bits around between systems,
01:39:20
but there's a lot of problems to solve.
01:39:22
Yeah, so UEC is in fact solving
01:39:25
a very important technical problem,
01:39:27
which is the way we describe it is modern RDMA at scale.
01:39:32
And this has not been solved before.
01:39:34
To be clear, you know, RoCE today exists,
01:39:36
but it has its limitations.
01:39:38
And it does take an ecosystem effort approach,
01:39:42
and it involves in particular the adapter,
01:39:45
the NIC silicon vendors,
01:39:47
but also the whole end-to-end interoperability
01:39:50
of that architecture.
01:39:53
We're very excited to be part of this.
01:39:55
We're not in the NIC business ourselves,
01:39:56
but this is absolutely key to enable scaling of RDMA
01:40:01
across hundreds of thousands, if not a million nodes.
01:40:04
Yeah, absolutely.
01:40:05
And when you look at what's being predicted
01:40:08
in terms of hundreds of thousands
01:40:11
up to a million node systems,
01:40:15
I mean, we all have our work cut out for us,
01:40:18
but working together, I know we can solve the problems.
01:40:21
Well, guys, thanks so much for coming to talk to us today.
01:40:26
I'd like to thank you all for your partnership
01:40:28
in this journey, and thank you all for coming today.
01:40:31
Thanks very much.
01:40:32
Thanks, guys.
01:40:34
Thank you so much.
01:40:39
I'd really like to thank our partners
01:40:40
from Arista, Broadcom, and Cisco
01:40:43
for attending and for their partnership
01:40:45
in driving this critical third leg
01:40:48
that determines the performance of AI systems.
01:40:51
Now, let's turn our focus to high-performance computing,
01:40:55
the traditional realm of the world's largest systems.
01:40:59
AMD has been driving HPC technology for many years.
01:41:03
In 2021, we delivered the MI250,
01:41:07
introducing third-generation Infinity architecture.
01:41:10
It connected an EPYC CPU to the MI250 GPU
01:41:14
through a high-speed bus, Infinity Fabric.
01:41:17
That allowed the CPU and the GPU
01:41:19
to share a coherent memory space
01:41:22
and easily trade data back and forth,
01:41:24
simplifying programming and speeding up processing.
01:41:28
But today, we're taking that concept one step further,
01:41:32
really to its logical conclusion,
01:41:35
with the fourth-generation Infinity architecture
01:41:38
bringing the CPU and the GPU together into one package,
01:41:42
sharing a unified pool of memory.
01:41:46
This is an APU, an Accelerated Processing Unit.
01:41:50
And I'm very proud to say
01:41:52
that the industry's first data center APU for AI and HPC,
01:41:56
the MI300A, began volume production earlier this quarter
01:42:01
and is now being built into what we expect
01:42:03
to be the world's highest-performing system.
01:42:12
Now, Lisa already showed you
01:42:13
what our chiplet technologies make possible with the MI300X.
01:42:19
The MI300A takes those same building blocks
01:42:23
in a slightly different fashion.
01:42:25
Now, the IO die is laid down first, as before,
01:42:28
and contains the infinity cache
01:42:30
and connections to memory and IO.
01:42:32
The XCD accelerator chiplets are bonded on top,
01:42:35
as in the MI300X.
01:42:38
But with the MI300A, we also take CPU chiplets
01:42:43
leveraged directly from our fourth-generation
01:42:46
EPYC CPUs, Genoa, and we put those
01:42:50
on top of the IODs as well,
01:42:53
thus bringing together our leading CPU,
01:42:57
Zen, and CDNA technologies into one amazing part.
01:43:03
Finally, eight stacks of HBM3
01:43:05
with up to 128 gigs of capacity complete the MI300A.
01:43:11
A key advantage of the APU is no longer needing
01:43:15
to copy data from one processor to another,
01:43:19
even through a coherent link,
01:43:22
because the memory is unified,
01:43:25
both in the RAM as well as in the cache.
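As a conceptual sketch of what goes away, here is the explicit-copy pattern a discrete GPU requires, in PyTorch-style Python (illustrative only):

```python
import torch

# Discrete GPU: host-resident data must cross a link before GPU work.
host_tensor = torch.randn(1 << 20)       # lives in CPU DRAM
device_tensor = host_tensor.to("cuda")   # explicit copy over the bus
result = device_tensor.sum()

# On an APU with unified memory, CPU and GPU address the same physical
# HBM pool, so this copy step (and its latency) disappears for code
# tuned to the architecture.
```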
01:43:29
The second advantage is the ability
01:43:31
to optimize power management between the CPU and the GPU.
01:43:35
That means dynamically shifting power
01:43:38
from one processor to another,
01:43:40
depending on the needs of the workload,
01:43:42
optimizing application performance.
01:43:45
And very importantly, an APU can dramatically
01:43:49
streamline programming, making it easier
01:43:52
for HPC users to unlock its full performance.
01:43:56
And let's talk about that performance.
01:43:58
61 teraflops of double-precision floating point, FP64.
01:44:04
122 teraflops of single-precision.
01:44:08
Combined with that 128 gigabytes of HBM3 memory
01:44:11
at 5.3 terabytes a second of bandwidth,
01:44:14
the capabilities of the MI300A are impressive.
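Taken together, those figures imply a roofline-style balance point; a quick calculation on the quoted numbers:

```python
# Quoted MI300A figures from the talk.
fp64_flops = 61e12        # 61 TFLOPs double precision
fp32_flops = 122e12       # 122 TFLOPs single precision
bandwidth = 5.3e12        # 5.3 TB/s of HBM3 bandwidth

# FLOPs that must be done per byte read to stay compute-bound:
print(f"FP64 balance: {fp64_flops / bandwidth:.1f} flop/byte")  # ~11.5
print(f"FP32 balance: {fp32_flops / bandwidth:.1f} flop/byte")  # ~23.0
```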
01:44:19
And they're impressive, too,
01:44:20
when you compare it to the alternative.
01:44:23
When you look at the competition,
01:44:25
the MI300A has 1.6 times the memory capacity
01:44:29
and bandwidth of Hopper.
01:44:32
For low-precision operations like FP16,
01:44:34
the two are at parity in terms
01:44:37
of computational performance.
01:44:39
But where precision is needed,
01:44:42
MI300A delivers 1.8 times the double and single-precision
01:44:49
FP64 and FP32 floating point performance.
01:44:54
And beyond simple benchmarks,
01:44:55
the real advantages of an APU come with the performance
01:44:59
of real-world applications which have been tuned
01:45:02
for the APU architecture.
01:45:04
For example, let's look at OpenFOAM.
01:45:07
OpenFOAM is a set of computational fluid dynamics codes
01:45:10
widely used across research, academia, and industry.
01:45:15
With MI300A, we see four times the performance
01:45:20
of Hopper on common OpenFOAM codes.
01:45:23
Now, that performance comes from several places,
01:45:32
from higher performance math operations as we talked,
01:45:36
larger memory and the increased memory bandwidth.
01:45:39
But much of that uplift really comes
01:45:41
from that unified memory eliminating the need
01:45:43
to copy data around the system.
01:45:45
For tuned applications, that can deliver
01:45:48
truly transformative performance.
01:45:52
And I'm also proud to say that beyond performance,
01:45:54
AMD has stayed true to its heritage,
01:45:58
to its history of leading in power efficiency.
01:46:02
At the node level, the MI300A has twice
01:46:06
the HPC performance per watt of the nearest competitor.
01:46:10
Customers can thus fit more nodes
01:46:13
into their overall facility power budget
01:46:17
and better support their sustainability goals.
01:46:21
With the MI300A, we set out to help our customers
01:46:25
advance the frontiers of research
01:46:28
and not just running traditional HPC applications.
01:46:32
One of the most exciting new areas in HPC
01:46:35
is actually the convergence with AI,
01:46:38
where AI is used in conjunction with HPC techniques
01:46:42
to help steer simulations,
01:46:45
thus getting much better results much faster.
01:46:49
A great example of this is CosmoFlow.
01:46:51
It couples deep learning
01:46:53
with traditional HPC simulation methods,
01:46:57
giving researchers the ability to probe more deeply
01:47:00
and allowing us to learn more about the universe at scale.
01:47:05
CosmoFlow is one of the first applications
01:47:07
targeted to be run on El Capitan,
01:47:09
which we believe will be the industry's first true
01:47:13
two exaflop supercomputer running double precision float
01:47:17
when it's fully commissioned
01:47:19
at Lawrence Livermore National Labs.
01:47:22
It's gonna be an amazing machine.
01:47:29
So let's hear more about El Capitan
01:47:32
and its applications for HPC and AI
01:47:35
from our partners at LLNL and Hewlett Packard Enterprise.
01:47:40
We expect El Capitan to be an engine
01:47:43
for artificial intelligence and deep learning.
01:47:47
We will recreate the experimental environment in simulation,
01:47:51
generate lots of data, for example,
01:47:53
and then train our artificial intelligence methods
01:47:55
on that simulation data.
01:47:58
El Capitan will be the most capable AI machine
01:48:01
and its use of APUs at this scale
01:48:04
will be the first of its kind.
01:48:07
As you operate these exascale level workloads,
01:48:11
all of those nodes talk to each other.
01:48:14
AMD and HPE have a long legacy of partnership
01:48:17
and it was only natural for us to partner again
01:48:19
for El Capitan.
01:48:21
The MI300A can be versatile across many different workloads
01:48:25
and we couple it directly with our slingshot fabric
01:48:28
to give it high performance as it operates as a system.
01:48:31
We work very closely with AMD and HPE
01:48:34
to deliver the hardware and the software
01:48:36
that's actually used by the scientists
01:48:38
in the machine itself.
01:48:39
It's really that partnership together
01:48:41
that can really go after and build these supercomputers.
01:48:45
El Capitan will be 16 times faster
01:48:48
than our existing machine here at Lawrence Livermore.
01:48:51
It will enable scientific breakthroughs
01:48:53
that we can't even imagine.
01:49:03
We're proud to have partnered
01:49:05
with Hewlett Packard Enterprise to design
01:49:07
and now build this amazing system.
01:49:10
And so I'd like to invite to the stage Trish Damkroger,
01:49:13
the Senior Vice President and Chief Product Officer
01:49:16
for HPC AI and Labs from Hewlett Packard Enterprise.
01:49:24
Thank you.
01:49:26
Welcome, Trish.
01:49:27
The AMD and HPE teams have been working closely together
01:49:31
over the years to deliver some next generation supercomputers.
01:49:35
Most recently, of course, we've broken the exascale barrier.
01:49:40
I gotta say that again. We broke the exascale barrier
01:49:46
with Frontier for Oak Ridge National Labs.
01:49:48
And now we're looking forward to powering
01:49:50
another exascale system and another benchmark,
01:49:53
another record with you with El Capitan
01:49:57
for Lawrence Livermore National Labs,
01:49:58
another US Department of Energy lab.
01:50:02
Maybe you can share more with this audience
01:50:04
about our journey together and the innovations
01:50:06
that we've ushered in this journey to exascale.
01:50:10
Sure.
01:50:11
First, I wanna echo the long partnership
01:50:13
that we've had with AMD.
01:50:15
Frontier continues to be the fastest computer in the world.
01:50:20
Many doubted our ability to actually reach exascale,
01:50:24
but we were able to achieve this feat
01:50:26
with industry-leading liquid cooling infrastructure,
01:50:30
next-generation high-performance interconnect
01:50:32
with Slingshot, our highly differentiated system management
01:50:36
and Cray programming environment software,
01:50:39
along with the incredible MI250.
01:50:42
With Frontier, exascale computing
01:50:44
has already made breakthroughs in areas such as aerospace,
01:50:47
climate modeling, healthcare, and nuclear physics.
01:50:51
Frontier is also one of the world's
01:50:53
top 10 greenest supercomputers.
01:50:56
In fact, HPE and AMD have the majority
01:51:00
of the world's top 10 energy-efficient supercomputers.
01:51:09
I am very excited to deliver El Capitan to Lawrence Livermore.
01:51:13
As you know, I worked there for over 15 years.
01:51:16
El Capitan's computing prowess will fundamentally shift
01:51:20
what the scientists and engineers will be able to achieve.
01:51:24
El Capitan's gonna be 15 to 20 times faster
01:51:27
than their current system.
01:51:29
Supercomputing is truly essential
01:51:31
to the mission of the Department of Energy.
01:51:34
Lawrence Livermore has been at the forefront
01:51:37
driving the convergence of HPC and AI,
01:51:40
demonstrated by work at the National Ignition Facility
01:51:43
and other national security programs.
01:51:45
I'm really looking forward to continuing our journey
01:51:48
of bringing more leadership-class systems to the world.
01:51:52
Absolutely.
01:51:53
I couldn't agree more, Trish.
01:51:54
It's been a rewarding journey working together with HPE.
01:51:58
But speaking of our shared success
01:52:03
in building these record-breaking systems,
01:52:06
can you tell us a bit more about El Capitan
01:52:09
and how HPE is bringing the Instinct MI300A-powered
01:52:16
APU to El Capitan?
01:52:18
Great, yes.
01:52:19
El Capitan will feature the HPE Cray EX supercomputer
01:52:23
with the MI300A accelerators
01:52:26
to power large AI-driven scientific projects.
01:52:29
The HPE Cray EX supercomputer was built from the ground up
01:52:34
with end-to-end capabilities
01:52:35
to support the magnitude of exascale.
01:52:38
El Capitan nodes include the MI300A,
01:52:42
coupled with our Slingshot Fabric
01:52:44
to operate as a fully integrated system.
01:52:47
Supercomputing is the foundation needed for large-scale AI,
01:52:51
and HPE is uniquely positioned to deliver this
01:52:54
with our Cray supercomputers.
01:52:56
El Capitan will be that engine for AI
01:52:59
and deep learning for the Department of Energy.
01:53:02
They will be recreating the experimental environment
01:53:05
and simulations and training the AI models
01:53:08
with all of that vast amount of data.
01:53:11
El Capitan will be one of the most capable AI systems
01:53:15
in the world.
01:53:17
And beyond El Capitan, we're excited to have expanded
01:53:20
our supercomputing portfolio with the MI300A
01:53:23
to bring next-generation accelerated compute
01:53:26
to a broad set of customers.
01:53:28
Yeah, so Trish, that's fantastic.
01:53:30
And actually, let's double-click into that a little bit more.
01:53:33
I know that there are a growing number
01:53:35
of supercomputing customers, not just at LLNL,
01:53:40
that are really applying AI to their projects.
01:53:42
Can you tell us a little bit even more about that?
01:53:45
Sure, so AI undoubtedly will be the catalyst
01:53:48
to transform scientific research.
01:53:51
As I said earlier, supercomputing is the foundation
01:53:54
needed to run AI.
01:53:56
And HPE is the undisputed leader
01:53:58
in delivering supercomputers.
01:54:00
Some examples where AI will be fundamental in El Capitan
01:54:04
include the National Ignition Facility,
01:54:07
where they will be using 1D, 2D, 3D simulations,
01:54:11
along with trained AI models to develop a more robust design
01:54:16
for higher-yield fusion reactions.
01:54:18
Just imagine fusion energy in our future.
01:54:21
Another application is high-resolution
01:54:23
earthquake modeling, essential for understanding
01:54:26
building structural integrity and also emergency planning.
01:54:30
And one more application is bioassurance,
01:54:32
where simulation and AI models will be key
01:54:35
in developing rapid therapeutics.
01:54:38
Supercomputing and AI are tools that allow engineers
01:54:41
and scientists to find the unknown.
01:54:45
I'm thrilled to be part of the journey
01:54:47
of accelerating scientific discovery
01:54:49
and the scale of impact it has
01:54:52
on changing the way people live and work.
01:54:55
Fantastic.
01:54:56
Well, Trish, thank you.
01:54:57
I'm so excited about the opportunities
01:54:59
that researchers and scientists will have
01:55:02
with the systems that we're bringing to the market together.
01:55:05
Thanks so much.
01:55:06
Thank you.
01:55:09
Yeah, on behalf of AMD and the entire team,
01:55:12
I really wanna just thank HPE and our customers
01:55:16
for the opportunity to participate
01:55:18
in the development of these massive systems.
01:55:20
Because El Capitan will be an amazing machine
01:55:23
and a real showcase for the MI300A,
01:55:27
which defines leadership at this critical junction
01:55:30
as HPC and AI converge.
01:55:33
AMD is proud of the leadership systems powered by MI300A,
01:55:37
which will be available soon from partners around the world.
01:55:41
I can't wait to see what researchers and scientists
01:55:45
are gonna do with these systems.
01:55:47
And with that, I'd like to welcome Lisa back on stage
01:55:51
to conclude our journey today.
01:55:53
Thank you.
01:56:00
All right, thank you, Forrest,
01:56:02
and thank you to all of our partners who joined us.
01:56:05
You've heard from Victor, Forrest, our key partners.
01:56:08
We have significant momentum,
01:56:10
and we're building on that for the data center AI platforms.
01:56:14
To cap off the day,
01:56:15
let me now talk about another important area for AMD
01:56:18
where we're delivering leadership AI solutions,
01:56:20
and that's the PC.
01:56:22
Now, for the PCs, we recognized several years ago
01:56:25
that on-chip AI accelerators, or NPUs,
01:56:28
would be very, very important for next generation PCs.
01:56:31
And the NPU is actually the compute engine
01:56:34
that will enable us to reimagine what it means
01:56:37
to build a truly intelligent and personal PC experience.
01:56:41
At AMD, we're on actually a multi-year journey.
01:56:43
We have a strong roadmap to deliver the highest performance
01:56:46
and most power-efficient NPUs possible.
01:56:49
We were actually the first company to integrate an NPU
01:56:52
into an x86 processor
01:56:54
when we launched the Ryzen 7040 Mobile Series earlier this year,
01:56:58
and we integrated the XDNA architecture
01:57:01
that actually came from our acquisition of Xilinx.
01:57:04
It actually took us less than a year
01:57:05
to bring Xilinx's proven technology into our PC products.
01:57:11
Let me tell you a little bit about XDNA.
01:57:13
It's a scalable and adaptive computing architecture.
01:57:15
It's built around a large computing array
01:57:18
that can efficiently transfer the massive amounts of data
01:57:21
required for AI inference.
01:57:23
And as a result, XDNA is both extremely performant
01:57:27
and also very energy-efficient.
01:57:29
So you can run multiple AI workloads
01:57:31
simultaneously in real time.
01:57:34
Now, I'm happy to say that we've already shipped millions
01:57:37
of Ryzen AI-enabled PCs into the market
01:57:40
with all of the leading PC OEMs
01:57:42
and all of this provides the hardware foundation
01:57:45
for developers to leverage this first wave of AI PCs.
01:57:49
Now, if you look at some of the applications,
01:57:51
today Ryzen AI powers hundreds of different AI functions,
01:57:55
things like advanced motion tracking
01:57:57
and sharpening to deblur 4K video,
01:58:00
enabling production-level digital capabilities
01:58:03
with unlimited virtual cameras,
01:58:05
all in an ultra-thin notebook for the very first time.
01:58:09
We're also working with key software leaders
01:58:11
like Adobe and Blackmagic,
01:58:13
and they're using our on-chip Radeon GPU
01:58:16
to accelerate the AI-enabled editing features
01:58:19
so that you can dramatically improve productivity
01:58:22
for content creators.
01:58:24
And of course, we've worked very, very closely with Microsoft
01:58:27
to enable Windows 11 Studio Effects on Ryzen AI.
01:58:31
Now, today we're launching some additional capabilities.
01:58:34
So Ryzen AI 1.0 software,
01:58:37
it will make it easier for developers
01:58:38
to add advanced gen AI capabilities.
01:58:41
So with this new package,
01:58:43
developers can create an AI-enabled application
01:58:46
that's ready to run on Ryzen AI hardware
01:58:50
just by choosing a pre-trained model.
01:58:52
So for example, you can choose one of the models
01:58:54
that are available on Hugging Face,
01:58:57
quantize it based on your needs,
01:58:58
and then deploy it through ONNX Runtime.
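A minimal sketch of that flow, with hypothetical file names and an assumed NPU execution-provider name (CPU as fallback):

```python
import numpy as np
import onnxruntime as ort
from onnxruntime.quantization import quantize_dynamic, QuantType

# 1) Start from a pre-trained model exported to ONNX (paths illustrative),
#    e.g. one pulled from Hugging Face, and quantize it for the NPU.
quantize_dynamic("model.onnx", "model_int8.onnx",
                 weight_type=QuantType.QInt8)

# 2) Run it through ONNX Runtime. The Vitis AI execution provider name
#    is assumed here as the Ryzen AI NPU target; CPU is the fallback.
session = ort.InferenceSession(
    "model_int8.onnx",
    providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
)

# Input name and shape depend on the chosen model; placeholders here.
dummy = np.zeros((1, 16), dtype=np.int64)
outputs = session.run(None, {"input_ids": dummy})
```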
01:59:01
So this is a major step forward
01:59:03
when you think about the broad ecosystem
01:59:06
that wants to run AI apps for Windows,
01:59:08
and we can't wait to see what ISVs will do
01:59:11
when they really capture the leadership performance
01:59:14
that you can get from an NPU in Ryzen AI.
01:59:18
Now, of course, we know developers
01:59:20
always want more AI compute.
01:59:22
So today, I'm very happy to say that we're launching
01:59:25
our "Hawk Point" Ryzen 8040 Series Mobile Processors.
01:59:29
And,
01:59:31
Thank you.
01:59:35
"Hawk Point" combines all of our industry-leading performance
01:59:38
and battery life, and it increases AI TOPS by 60%
01:59:42
compared to the previous generation.
01:59:44
So if you just take a look at some of the performance metrics
01:59:47
for the Ryzen 8040 Series,
01:59:50
if you look at the top of the stack,
01:59:51
so Ryzen 9 8945,
01:59:53
it's actually significantly faster
01:59:55
than the competition in many areas,
01:59:57
delivering more performance for multi-threaded applications,
02:00:01
1.8x higher frame rates for games,
02:00:04
and 1.4x faster performance
02:00:06
across content creation applications.
02:00:08
But when you look at the AI improvements of Ryzen 8040,
02:00:12
you really see some substantial improvements.
02:00:15
So I talked about additional TOPS in "Hawk Point",
02:00:18
and what that results in is faster performance
02:00:21
when you're running the key models.
02:00:22
So things like Llama 2 7B, we run 1.4x faster,
02:00:27
and also 1.4x faster on things like AI image recognition
02:00:31
and object detection models.
02:00:33
So all of this, what does it do?
02:00:35
It provides faster response times
02:00:37
and overall better experiences.
02:00:40
Now, I really believe that we're actually at the beginning
02:00:43
of this AI PC journey,
02:00:45
and it's something that is really gonna change
02:00:47
the way we think about productivity at a personal level.
02:00:50
So we've been working very closely with Microsoft
02:00:52
to ensure that we are co-innovating
02:00:54
across hardware and software
02:00:56
to enable those next generation of AI PCs.
02:00:59
To share more about this work,
02:01:01
I'm pleased to welcome Pavan Davuluri,
02:01:03
Corporate Vice President of Windows and Devices
02:01:05
at Microsoft to the stage.
02:01:12
Hey, how are you?
02:01:13
Great to be here.
02:01:14
Pavan, thank you so much for being here.
02:01:16
We started the show with Kevin Scott
02:01:18
talking about the great partnership
02:01:20
between Microsoft and AMD,
02:01:21
and all the work we're doing on the big iron,
02:01:24
and the cloud, and Azure.
02:01:25
And it seemed fitting that we close the show
02:01:28
with the other very, very important work
02:01:31
that we're doing together on the client side.
02:01:33
So can you tell us a little bit, Pavan,
02:01:36
about all the great work and your vision for client AI?
02:01:40
For sure.
02:01:41
As you and Kevin covered,
02:01:42
Microsoft and AMD have a long partnership together
02:01:45
across Azure and Windows.
02:01:47
And it's incredible to see us moving that partnership
02:01:49
together into the next wave of technology with AI.
02:01:53
As you shared, Lisa, for us,
02:01:54
there are millions of PCs right now
02:01:56
with Ryzen AI-enabled 7040 Series processors in market.
02:01:59
And that's amazing because these are the first x86 PCs
02:02:02
with integrated NPUs, enabling enhanced AI experiences.
02:02:05
You told me everybody wanted NPUs.
02:02:07
Absolutely.
02:02:08
And you know, right now we get to see
02:02:09
some incredible AI features.
02:02:11
Somebody talked about Windows Studio Effects coming to life
02:02:13
across the scale of the ecosystem.
02:02:15
Absolutely fantastic, I would say.
02:02:17
Now, for us at Microsoft and for the ecosystem,
02:02:20
our marquee AI experience is really Copilot.
02:02:24
Similar to how the start button is the gateway into Windows,
02:02:27
the Copilot for us is the entry point
02:02:29
into this world of AI on the PC.
02:02:33
It has a fundamental impact on everything
02:02:35
we will do on a computer, from work and school
02:02:37
and play and entertainment and creation.
02:02:40
You know, I completely agree, Pavan.
02:02:41
I think Copilot is so transformational.
02:02:44
I mean, for everyone who's had a chance to experience it,
02:02:46
it really changes the way we do work.
02:02:49
So let's talk about the tech that's underneath it.
02:02:52
So to enable Copilot and everything
02:02:55
that we want to do on PCs,
02:02:56
we are putting together new system architectures
02:02:58
that really power those experiences going forward
02:03:01
and they really pull together GPU, NPU
02:03:04
and certainly the cloud as well.
02:03:06
And quite honestly, we're seeing customer habits
02:03:08
change early at this point in time
02:03:10
and we believe to your point earlier,
02:03:11
we're early in the cycle of innovation that's coming.
02:03:15
When we have these powerful NPUs
02:03:16
like the ones you're building,
02:03:18
it gives us an opportunity to create apps
02:03:20
that take advantage of both local and cloud inferencing.
02:03:23
And to me, that's what the Windows AI ecosystem is about
02:03:26
and that's what we're building in partnership with you.
02:03:28
It's designed to enable those scenarios
02:03:31
with the ONNX runtime of course
02:03:33
and the Olive tool chain to back this up.
02:03:35
Applications are gonna have many models
02:03:37
like Llama that you mentioned, or Phi-2, running
02:03:39
and they will run very capably in the TOPS that we will have.
02:03:42
And of course, not to mention the foundation models
02:03:45
that are powered by the GPUs in the cloud.
02:03:47
Yeah, I mean, I think this is an area
02:03:49
where Microsoft and AMD really have a very unique position
02:03:52
because we have so much capability in the cloud,
02:03:55
we also have access to the client and the local view.
02:03:59
Can you share a bit about how we're thinking about
02:04:01
all of these, the cloud and the local view?
02:04:04
Yeah, with AMD, we're making it simpler to incorporate
02:04:07
what we call the hybrid pattern
02:04:09
or the hybrid loop into applications.
02:04:11
And we wanna be able to load shift between the cloud
02:04:13
and the client to provide the best of computing
02:04:15
across both those worlds.
02:04:17
For us, it's really about seamless computing
02:04:20
across the cloud and the client.
02:04:22
It brings together the benefits of local compute,
02:04:24
things like enhanced privacy and responsiveness
02:04:26
and latency with the power of the cloud,
02:04:29
high performance models, large data sets,
02:04:32
cross platform inferencing.
02:04:33
And so for us, we feel like we're working together
02:04:36
to build that future where Windows is the destination
02:04:39
for the best AI experiences on PCs.
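
As a rough illustration of that load-shifting idea, here is a small Python sketch that prefers a local NPU-backed ONNX Runtime session and otherwise ships the request to a cloud endpoint. The endpoint URL and the simple routing policy are hypothetical stand-ins for illustration; the actual hybrid-loop tooling is considerably richer than this.

    # Toy sketch of the hybrid pattern: run locally when an NPU session can be
    # created, otherwise fall back to a (hypothetical) cloud inference service.
    import numpy as np
    import onnxruntime as ort
    import requests

    CLOUD_ENDPOINT = "https://example.com/v1/infer"  # hypothetical service

    def local_session(model_path):
        """Try to create an NPU-backed session; return None if unavailable."""
        try:
            return ort.InferenceSession(
                model_path,
                providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
            )
        except Exception:
            return None

    def infer(model_path, inputs):
        """Prefer local compute (privacy, responsiveness); else use the cloud."""
        session = local_session(model_path)
        if session is not None:
            return session.run(None, inputs)
        # Cloud path: larger models, large datasets, cross-platform inference.
        payload = {name: arr.tolist() for name, arr in inputs.items()}
        resp = requests.post(CLOUD_ENDPOINT, json=payload, timeout=30)
        resp.raise_for_status()
        return resp.json()
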
02:04:42
Yeah, no, I think that sounds great.
02:04:44
Now, one of the things though that you definitely
02:04:46
are always talking to me about is more TOPS.
02:04:50
I ask for more TOPS all the time.
02:04:52
So look, we completely believe that to enable
02:04:57
your vision for AI experiences,
02:05:00
we've really thought about how we actually accelerate
02:05:02
our client AI roadmap.
02:05:03
So I wanna share a little bit of our roadmap today.
02:05:07
Ryzen 7040 and 8040, we've already delivered those
02:05:10
industry-leading NPU capabilities.
02:05:12
But today, I'm very excited to announce
02:05:14
that our next-gen Strix Point Ryzen processors
02:05:17
will actually include a new NPU
02:05:19
powered by our second-generation XDNA architecture
02:05:22
coming in 2024.
02:05:24
Congratulations.
02:05:25
Thank you.
02:05:29
So a little bit about XDNA 2.
02:05:32
It's designed for leadership generative AI performance.
02:05:35
It delivers more than three times the NPU performance
02:05:38
of our current Ryzen 7040 series.
02:05:41
And Pavan, I'm very happy to share,
02:05:42
I know your teams already know this
02:05:44
because you have the silicon,
02:05:45
but today, Strix Point is running great in our labs
02:05:48
and we're really excited about it.
02:05:50
Our teams have been working really closely together
02:05:52
to make sure that all of those great future Windows AI
02:05:55
features run really well on Strix Point.
02:05:58
So I can't wait to share more about that later this year.
02:06:02
Lisa, that's awesome.
02:06:03
And we will use every TOP you will provide us.
02:06:05
You promised, right?
02:06:07
Absolutely.
02:06:09
And it's not just the size of the neural engines,
02:06:12
it's the dramatic increase in efficiency and performance per watt
02:06:15
of these next-generation NPUs.
02:06:17
We think they'll bring a whole new level of capabilities
02:06:19
to the market, enabling personalization
02:06:21
on every interaction on these devices.
02:06:24
Together with Windows, we feel like we're building
02:06:26
that future for the Copilot where we will orchestrate
02:06:28
multiple apps and services across devices, quite frankly,
02:06:32
functioning as an agent in your life that has context
02:06:35
and maintains context across entire workflows.
02:06:38
So we're very excited about these devices coming to life
02:06:40
for the Windows ecosystem.
02:06:41
We're excited to see what developers will do
02:06:43
with this technology.
02:06:44
And quite frankly, at the end of the day,
02:06:45
ultimately, what customers will do
02:06:47
with all of this innovation.
02:06:49
Thank you so much, Pavan.
02:06:50
We are so excited about the partnership.
02:06:52
We appreciate all the long-term work we're doing together
02:06:55
and look forward to lots of great things to come.
02:06:58
Thank you for having me, Lisa.
02:06:59
Thank you, Pavan.
02:07:00
Thank you. Thank you.
02:07:05
All right, so it's been such a fun day,
02:07:07
but now it's time for me to wrap up a bit.
02:07:09
We've shown you a lot of new products,
02:07:11
a lot of new platforms, a lot of new technologies
02:07:14
that are all about taking AI infrastructure
02:07:17
to the next level.
02:07:18
MI300X, MI300A accelerators,
02:07:21
these are all shipping today in production.
02:07:23
They're already being adopted by Microsoft, Oracle,
02:07:26
Meta, Dell, Hewlett Packard Enterprise, Lenovo, Supermicro,
02:07:30
and many others.
02:07:32
You heard from Victor how we're expanding the ecosystem
02:07:34
of AI developers working with us,
02:07:37
with ROCm 6 software and the open ecosystem;
02:07:40
our goal is to make it incredibly easy
02:07:42
for everyone to use Instinct GPUs.
02:07:45
You heard from Forrest in our panel
02:07:47
on the overall system architecture,
02:07:48
and about our work with Arista, Broadcom, and Cisco.
02:07:51
We believe that to create this high-performance
02:07:54
AI infrastructure, it has to be open,
02:07:56
and that's what we're doing together
02:07:58
for scale-out AI solutions.
02:08:00
And then you heard what we're doing on the other side,
02:08:02
the client part of our business,
02:08:04
because we actually believe AI should be everywhere.
02:08:07
So our latest Ryzen processors really extend
02:08:10
our compute vision and our AI leadership.
02:08:12
I hope you can see that AI is absolutely
02:08:16
the number one priority at AMD.
02:08:18
Our goal is to push the envelope,
02:08:20
to bring innovation to the market,
02:08:22
to do more than anyone thought was possible,
02:08:24
because we believe, as wonderful as our technology is,
02:08:28
it is about doing it together in a partner ecosystem
02:08:32
where everybody brings their best to the market.
02:08:35
Today is a...
02:08:44
I want to say on a personal level,
02:08:46
today is an incredibly proud moment for AMD.
02:08:48
If you think about all of the innovation,
02:08:51
everything that we bring to the market,
02:08:53
to be part of AI at this time,
02:08:56
at the beginning of this era,
02:08:58
to work with these amazing people throughout the industry,
02:09:01
throughout the ecosystem, and at AMD,
02:09:05
I can say that I've never seen something more exciting.
02:09:08
A very, very special thank you
02:09:09
to all of our partners who joined us today,
02:09:11
and thank you all for joining us.

Description:

Join us to discover how AMD and its partners are powering the future of AI. Learn more: https://www.amd.com/en/corporate/events/advancing-ai.html
