AMD Presents: Advancing AI

Table of contents

11:24
Introduction
18:32
AMD Instinct™ MI300X Accelerators
24:52
Microsoft
29:30
AMD Instinct™ Platform
33:03
Oracle Cloud
36:14
Software and Ecosystem
44:18
AI Innovators Panel: Databricks, EssentialAI, Lamini
1:01:09
Meta
1:07:45
Dell
1:14:05
Supermicro
1:18:14
Lenovo
1:25:33
Networking
1:30:37
Panel: Broadcom, Arista, Cisco
1:40:51
AMD Instinct™ MI300A APUs
1:47:05
Exascale Computing – El Capitan
1:56:13
AI PCs
1:56:49
AMD XDNA™ Architecture
1:58:31
AMD Ryzen™ AI Software
2:00:49
Microsoft
2:07:05
Closing
Video tags

AMD
Advanced Micro Devices
EPYC
cpu
processors
gpu
graphics
pc
together we advance_
Gaming
Server
Computer
Desktop
Laptop
Ryzen
AI
Subtitles

00:13:15
Hey, good morning.
00:13:19
Good morning, everyone.
00:13:20
Welcome to all of you who are joining us here in Silicon Valley and to everyone who's
00:13:23
joining us online from around the world.
00:13:26
It has been just an incredibly exciting year with all of the new products and all the innovation
00:13:31
that has come across our business and our industry.
00:13:34
But today, it's all about AI.
00:13:37
We have a lot of new AI solutions to launch today and news to share with you, so let's
00:13:41
go ahead and get started.
00:13:43
Now, I know we've all felt this this year.
00:13:46
I mean, it's been just an amazing year.
00:13:48
I mean, if you think about it, a year ago, OpenAI unveiled ChatGPT.
00:13:53
And it's really sparked a revolution that has totally reshaped the technology landscape.
00:13:59
In this just short amount of time, AI hasn't just progressed.
00:14:03
It's actually exploded.
00:14:05
The year has shown us that AI isn't just kind of a cool new thing.
00:14:10
It's actually the future of computing.
00:14:12
And at AMD, when we think about it, we actually view AI as the single most transformational
00:14:18
technology over the last 50 years.
00:14:22
Maybe the only thing that has been close has been the introduction of the internet.
00:14:26
But what's different about AI is that the adoption rate is just much, much faster.
00:14:32
So although so much has happened, the truth is right now, we're just at the very beginning
00:14:37
of the AI era.
00:14:39
And we can see how it's so capable of touching every aspect of our lives.
00:14:44
So if you guys just take a step back and just look, I mean, AI is already being used everywhere.
00:14:50
Think about improving healthcare, accelerating climate research, enabling personal assistance
00:14:55
for all of us and for greater business productivity, things like industrial robotics, security,
00:15:02
and providing lots of new tools for content creators.
00:15:05
Now the key to all of this is generative AI.
00:15:09
It requires a significant investment in new infrastructure.
00:15:13
And that's to enable training and all of the inference that's needed.
00:15:17
And that market is just huge.
00:15:19
Now a year ago when we were thinking about AI, we were super excited.
00:15:23
And we estimated the data center AI accelerator market would grow approximately 50% annually
00:15:30
over the next few years, from something like $30 billion in 2023 to more than $150 billion
00:15:36
in 2027.
00:15:38
And that felt like a big number.
00:15:40
However, as we look at everything that's happened in the last 12 months and the rate and pace
00:15:46
of adoption that we're seeing across the industry, across our customers, across the world,
00:15:52
it's really clear that the demand is just growing much, much faster.
00:15:57
So if you look at now to enable AI infrastructure, of course it starts with the cloud,
00:16:02
but it goes into the enterprise.
00:16:03
We believe we'll see plenty of AI throughout the embedded markets and into personal computing.
00:16:09
We're now expecting that the data center accelerator TAM will grow more than 70% annually
00:16:14
over the next four years to over $400 billion in 2027.
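As a quick sanity check on those projections, compound growth over four years reproduces the quoted totals. A minimal sketch, assuming a $30 billion base for the original forecast and a $45 billion 2023 base for the revised one (the latter is an assumption, not stated in this transcript):

```python
# Compound-growth check of the TAM figures quoted above (sketch, not AMD data).
def project_tam(base_billions: float, cagr: float, years: int) -> float:
    """Project a market size forward at a compound annual growth rate."""
    return base_billions * (1 + cagr) ** years

print(project_tam(30, 0.50, 4))  # ~151.9 -> "more than $150 billion in 2027"
print(project_tam(45, 0.70, 4))  # ~375.8 -> ">70% annually" lands near $400B
                                 # (assumed $45B 2023 base)
```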
00:16:20
So does that sound exciting for us as an industry?
00:16:28
I have to say for someone like me who's been in the industry for a while, this pace of
00:16:32
innovation is faster than anything I've ever seen before.
00:16:36
And for us at AMD, we are so well positioned to power that end-to-end infrastructure that
00:16:42
defines this new AI era.
00:16:44
So from massive cloud server installations, to on-prem enterprise
00:16:50
clusters to the next generation of AI in embedded and PCs, our AI strategy is really centered
00:16:56
around three big strategic priorities.
00:17:00
First, we must deliver a broad portfolio of very performant, energy-efficient GPUs, CPUs,
00:17:07
and adaptive computing solutions for AI training and inference.
00:17:11
And we believe, frankly, that you're going to need all of these pieces for AI.
00:17:15
Second, it's really about expanding our open, proven, and being very developer-friendly
00:17:21
in our software platform to ensure that leading AI frameworks, libraries, and models are all
00:17:27
fully enabled for AMD hardware and that it's really easy for people to use.
00:17:32
And then third, it's really about partnership.
00:17:35
You're going to see a lot of partners today.
00:17:37
That's who we are as a company.
00:17:38
It's about expanding the co-innovation work and working with all parts of the ecosystem,
00:17:44
including cloud providers, OEMs, software developers.
00:17:49
You're going to hear from some real AI leaders in the industry to really accelerate how we
00:17:54
work together and get that widespread deployment of our solutions across the board.
00:18:00
So we have so much to share with you today.
00:18:01
I'd like to get started.
00:18:03
And of course, let's start with the cloud.
00:18:06
Generative AI is the most demanding data center workload ever.
00:18:11
It requires tens of thousands of accelerators to train and refine models with billions of
00:18:16
parameters.
00:18:17
And that same infrastructure is also needed to answer the millions of queries from everyone
00:18:22
around the world to these smart models.
00:18:25
And it's very simple.
00:18:26
The more compute you have, the more capable the model, the faster the answers are generated.
00:18:32
And the GPU is at the center of this generative AI world.
00:18:37
And right now, I think we all know it, everyone I've talked to says it, the availability and
00:18:42
capability of GPU compute is the single most important driver of AI adoption.
00:18:48
Do you guys agree with that?
00:18:49
So that's why I'm so excited today to launch our Instinct MI300X.
00:18:59
It's the highest performance accelerator in the world for generative AI.
00:19:04
MI300X is actually built on our new CDNA 3 data center architecture.
00:19:09
And it's optimized for performance and power efficiency.
00:19:12
CDNA 3 has a lot of new features.
00:19:15
It combines a new compute engine.
00:19:17
It supports sparsity, the latest data formats, including FP8.
00:19:22
It has industry-leading memory capacity and bandwidth.
00:19:25
And we're going to talk a lot about memory today.
00:19:28
And it's built on the most advanced process technologies and 3D packaging.
00:19:32
So if you compare it to our previous generation, which frankly was also very good, CDNA 3 actually
00:19:38
delivers more than three times higher performance for key AI data types, like FP16 and BF16,
00:19:45
and a nearly seven times increase in INT8 performance.
00:19:50
So if you look underneath it, how do we get MI300X?
00:19:53
It's actually 153 billion transistors, 153 billion.
00:20:04
It's across a dozen 5-nanometer and 6-nanometer chiplets.
00:20:08
It uses the most advanced packaging in the world.
00:20:11
And if you take a look at how we put it together, it's actually pretty amazing.
00:20:15
We start with four IO die in the base layer.
00:20:19
And what we have on the IO dies are 256 megabytes of infinity cache and all of the next-gen
00:20:24
IO that you need.
00:20:26
Things like 128-channel HBM3 interfaces, PCIe Gen 5 support, our fourth-gen infinity fabric
00:20:34
that connects multiple MI300Xs so that we get 896 gigabytes per second.
00:20:40
And then we stack eight CDNA 3 accelerator chiplets, or XCDs, on top of the IO die.
00:20:46
And that's where we deliver 1.3 petaflops of FP16 and 2.6 petaflops of FP8 performance.
00:20:54
And then we connect these 304 compute units with dense through-silicon vias, or TSVs,
00:21:01
and that supports up to 17 terabytes per second of bandwidth.
00:21:05
And of course, to take advantage of all of this compute, we connect eight stacks of HBM3
00:21:11
for a total of 192 gigabytes of memory at 5.3 terabytes per second of bandwidth.
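The headline figures above compose from the per-part numbers. A minimal sketch of that arithmetic (the 24 GB-per-stack split is an inference from 192 GB across eight stacks, not a quoted spec):

```python
# Aggregating the MI300X figures quoted above (back-of-the-envelope sketch).
hbm_stacks = 8
gb_per_stack = 24                    # inferred: 192 GB total / 8 stacks
print(hbm_stacks * gb_per_stack)     # 192 GB of HBM3, as quoted

fp16_petaflops = 1.3
print(fp16_petaflops * 2)            # 2.6 PF FP8: halving data width doubles throughput
```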
00:21:17
That's a lot of stuff on that.
00:21:25
I have to say, it's truly the most advanced product we've ever built, and it is the most
00:21:31
advanced AI accelerator in the industry.
00:21:34
Now let's talk about some of the performance and why it's so great.
00:21:39
For generative AI, memory capacity and bandwidth are really important for performance.
00:21:44
If you look at MI300X, we made a very conscious decision to add more flexibility, more memory
00:21:50
capacity, and more bandwidth, and what that translates to is 2.4 times more memory capacity
00:21:56
and 1.6 times more memory bandwidth than the competition.
00:22:00
Now when you run things like lower precision data types that are widely used in LLMs, the
00:22:06
new CDNA 3 compute units and memory density actually enable MI300X to deliver 1.3 times
00:22:13
more teraflops of FP8 and FP16 performance than the competition.
00:22:19
Now these are good numbers, but what's more important is how things look in real world
00:22:24
inference workloads.
00:22:26
So let's start with some of the most common kernels used by the latest AI models.
00:22:31
LLMs use attention algorithms to generate precise results.
00:22:35
So for something like FlashAttention-2 kernels, MI300X actually delivers up to 1.2 times better
00:22:42
performance than the competition.
00:22:44
And if you look at something like the Llama 2 70B LLM, and we're going to use this a lot
00:22:48
throughout the show, MI300X again delivers up to 1.2 times more performance.
00:22:55
And what this means is the performance at the kernel level actually directly translates
00:23:00
into faster results when running LLMs on a single MI300X accelerator.
00:23:06
But we also know, we talked about these models getting so large, so what's really important
00:23:11
is how that AI performance scales when you go to the platform level and beyond.
00:23:16
So let's take a look at how MI300X scales.
00:23:20
Let's start first with training.
00:23:22
Training is really hard.
00:23:23
People talk about how hard training is.
00:23:25
When you look at something like the 30 billion parameter model from Databricks, MPT LLM,
00:23:31
it's a pretty good example of something that is used by multiple enterprises for a lot
00:23:36
of different things.
00:23:38
And you can see here that the training performance for MI300X is actually equal to the competition.
00:23:43
And that means it's actually a very, very competitive training platform today.
00:23:48
But when you turn to the inference performance of MI300X, this is where our performance really
00:23:53
shines.
00:23:55
We're showing some data here, measured data on two widely used models, Bloom 176B.
00:24:02
It's the world's largest open multi-language AI model.
00:24:06
It generates text in 46 languages.
00:24:09
And our Llama 2 70B, which is also very popular, as I said, for enterprise customers.
00:24:15
And what we see in this case is a single server with eight MI300X accelerators is substantially
00:24:21
faster than the competition, 1.4 to 1.6X.
00:24:25
So these are pretty big numbers here.
00:24:28
And what this performance does is it just directly translates into a better user experience.
00:24:32
You guys have used it.
00:24:33
When you ask the model something, you'd like it to come back faster, especially as the
00:24:38
responses get more complicated.
00:24:41
So that gives you a view of the performance of MI300X.
00:24:45
Now excited as we are about the performance, we are even more excited about the work we're
00:24:50
doing with our partners.
00:24:52
So let me turn to our first guest, very, very special.
00:24:55
Microsoft is truly a visionary leader in AI.
00:24:59
We've been so fortunate to have a deep partnership with Microsoft for many, many years across
00:25:04
all aspects of our business.
00:25:06
And the work we're doing today in AI is truly taking that partnership to the next level.
00:25:10
So here to tell us more about that is Microsoft's Chief Technology Officer, Kevin Scott.
00:25:22
Kevin, it is so great to see you.
00:25:24
Thank you so much for being here with us.
00:25:25
It's a real pleasure to be here with you all today.
00:25:29
We've done so much work together on EPYC and Instinct over the years.
00:25:33
Can you just tell our audience a little bit about that partnership?
00:25:36
Yeah, I think Microsoft and AMD have a very special partnership.
00:25:41
And as you mentioned, it has been one that we've enjoyed for a really long time.
00:25:46
It started with the PC.
00:25:47
It continued then with a bunch of custom silicon work that we've done together over the years
00:25:52
on Xbox.
00:25:54
It's extended through the work that we've done with you all on EPYC for the high-performance
00:25:59
computing workloads that we have in our cloud.
00:26:02
And like the thing that I've been spending a bunch of time with you all on the past couple
00:26:06
of years, like actually a little bit longer even, is on AI compute, which I think everybody
00:26:12
now understands how important it is to driving progress on this new platform that we're trying
00:26:19
to deliver to the world.
00:26:20
I have to say we talk pretty often.
00:26:22
We do.
00:26:24
But Kevin, what I admire so much is just your vision, Satya's vision about where AI is going
00:26:31
in the industry.
00:26:32
So can you just give us a perspective of where are we on this journey?
00:26:36
Yeah, so we have been with a huge amount of intensity over the past five years or so,
00:26:44
been trying to prepare for the moment that I think we brought the world into over the
00:26:49
past year.
00:26:50
So it is almost a year to the day since the launch of ChatGPT, which I think is perhaps
00:26:56
most people's first contact with this new wave of generative AI.
00:27:01
But the thing that allowed Microsoft and OpenAI to do this was just a deep amount of infrastructure
00:27:09
work that we've been investing in for a very long while.
00:27:13
And one of the things that we realized fairly early in our journey is just how important
00:27:19
compute was going to be and just how important it is to think about the sort of full systems
00:27:24
optimization.
00:27:26
So the work that we've been doing with you all has been not just about figuring out what
00:27:33
the silicon architecture looks like, but that's been a very important thing and making sure
00:27:37
that we together are building things that are going to intercept where the actual platform
00:27:42
is going to be years in advance, but also just doing all of that software work that
00:27:49
needs to be done to make this thing usable by all the developers of the world.
00:27:55
I think that's really key.
00:27:56
I think sometimes people don't understand, they think about AI as this year, but the
00:28:01
truth is we've been building the foundation for so many years.
00:28:04
Kevin, I want to take this moment to really acknowledge that Microsoft has been so instrumental
00:28:09
in our AI journey.
00:28:11
The work we've done over the last several generations, the software work that we're
00:28:15
doing, the platform work that we're doing, we're super excited for this moment.
00:28:19
Now I know you guys just had Ignite recently and Satya previewed some of the stuff you're
00:28:23
doing with 300X, but can you share that with our audience?
00:28:26
We're super enthusiastic about 300X.
00:28:29
Satya announced that the MI300X VMs were going to be available in Azure.
00:28:38
It's really, really exciting right now seeing the bring up of GPT-4 on MI300X, seeing the
00:28:45
performance of Llama 2, getting it rolled into production.
00:28:50
The thing that I'm excited here today is we will have the MI300X VMs in preview available
00:28:57
today.
00:29:06
I completely agree with you.
00:29:07
The thing that's so exciting about AI is every day we discover something new and we're learning
00:29:12
that together.
00:29:13
Kevin, we're so honored to be Microsoft's partner in AI.
00:29:17
Thank you for all the work that your teams have done, that we've done together.
00:29:21
We look forward to a lot more progress.
00:29:23
Likewise.
00:29:24
Thank you very much.
00:29:31
All right, so look
00:29:32
We certainly do learn a tremendous amount every day and we're always pushing the envelope.
00:29:37
Let me talk to you a little bit about how we bring more people into our ecosystem.
00:29:41
When I talk about the Instinct platform, you have to understand our goal has really been
00:29:47
to enable as many customers as possible to deploy Instinct as fast and as simply as possible.
00:29:53
To do this, we really adopted industry standards.
00:29:57
We built the Instinct platform based on an industry standard OCP server design.
00:30:02
I'd actually like to show you what that means because I don't know if everyone understands.
00:30:06
Let's bring her out.
00:30:07
Her or him?
00:30:15
Let me show you the most powerful gen AI computer in the world.
00:30:27
Those of you who follow our shows know that I'm usually holding up a chip, but we've shown
00:30:32
you the MI300X chip already, so we thought it would be important to show you just what
00:30:38
it means to do generative AI at a system level.
00:30:42
What you see here is eight MI300X GPUs and they're connected by our high-performance
00:30:49
Infinity fabric in an OCP-compliant design.
00:30:53
What makes that special?
00:30:55
This board actually drops right into any OCP-compliant design, which is the majority of AI systems
00:31:02
today.
00:31:03
We did this for a very deliberate reason.
00:31:04
We want to make this as easy as possible for customers to adopt so you can take out your
00:31:09
other board and put in the MI300X Instinct platform.
00:31:14
If you take a look at the specifications, we actually support all of the same connectivity
00:31:19
and networking capabilities of our competition, so PCIe Gen 5, support for 400 gig ethernet,
00:31:26
that 896 gigabytes per second of total system bandwidth, but all of that is with 2.4 times
00:31:33
more memory and 1.3 times more compute per server than the competition.
00:31:38
That's really why we call it the most powerful gen AI system in the world.
00:31:42
Now, I've talked about some of the performance in AI workloads, but I want to give you just
00:31:47
a little bit more color on that.
00:31:50
When you look at deploying servers at scale, it's not just about performance.
00:31:54
Our customers are also trying to optimize power, space, CapEx and OpEx, and that's where
00:32:01
you see some really nice benefits of our platform.
00:32:05
When you compare our Instinct platform to the competition, I've already showed you that
00:32:09
we deliver comparable training performance and significantly higher inference performance,
00:32:14
but in addition, what that memory capacity and bandwidth gives us is that customers can
00:32:19
actually either run more models, if you're running multiple models on a given server,
00:32:24
or you can run larger models on that same server.
00:32:28
In the case where you're running multiple different models on a single server, the Instinct
00:32:32
platform can run twice as many models for both training and inference as the competition.
00:32:38
On the other side, if what you're doing is trying to run very large models, you'd like
00:32:42
to fit them on as few GPUs as possible.
00:32:46
With the FP16 data format, you can run twice the number of LLMs on a single MI300X server
00:32:52
compared to our competition.
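A minimal sketch of the memory arithmetic behind that claim, assuming two bytes per FP16 weight and ignoring activation and KV-cache overhead:

```python
# Weights-only footprint of a 70B-parameter model in FP16 (sketch).
import math

params = 70e9
bytes_per_param = 2                           # FP16
weights_gb = params * bytes_per_param / 1e9
print(weights_gb)                             # ~140 GB of weights alone

mi300x_hbm_gb = 192                           # quoted MI300X capacity
print(math.ceil(weights_gb / mi300x_hbm_gb))  # 1 -> the weights fit on a single GPU
```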
00:32:55
This directly translates into lower CapEx, and especially if you don't have enough GPUs,
00:33:01
this is really, really helpful.
00:33:03
So, to talk more about MI300X and how we're bringing it to market, let me bring our next
00:33:09
guest to the stage.
00:33:11
Oracle Cloud and AMD have been engaged for many, many years in bringing great computing
00:33:15
solutions to the cloud.
00:33:17
Here to tell us more about our work together is Karan Batta, Senior Vice President at
00:33:21
Oracle Cloud Infrastructure.
00:33:28
Hey, Karan.
00:33:29
Hi, Lisa.
00:33:30
Thank you so much for being here.
00:33:31
Thank you for your partnership.
00:33:32
Can you tell us a little bit about the work that we're doing together?
00:33:36
Yeah, thank you.
00:33:37
Excited to be here today.
00:33:39
Oracle and AMD have been working together for a long, long time, right, since the inception
00:33:43
of OCI back in 2017.
00:33:45
And so, we've launched every generation of EPYC as part of our bare metal compute platform,
00:33:51
and it's been so successful, customers like Red Bull as an example.
00:33:55
And we've expanded that across the board for all of our portfolio of past services
00:33:59
like Kubernetes, VMware, et cetera.
00:34:02
And then we are also collaborating on Pensando DPUs, where we offload a lot of that logic
00:34:07
so that customers can get much better performance, flexibility.
00:34:10
And then, you know, earlier this year, we also announced that we're partnering with
00:34:13
you guys on Exadata, which is a big deal, right?
00:34:17
So, we're super excited about our partnership with AMD, and then what's to come with 300X?
00:34:21
Yeah.
00:34:22
We really appreciate OCI has really been a leading customer as we talk about how do we
00:34:28
bring new technology into Oracle Cloud.
00:34:31
Now, you're spending a lot of time on AI as well.
00:34:33
Tell us a little bit about your strategy for AI and how we fit into that strategy.
00:34:37
Absolutely.
00:34:38
You know, we're spending a lot of time on AI, obviously.
00:34:41
Everyone is.
00:34:42
We are.
00:34:43
Everybody is.
00:34:44
It's the new thing.
00:34:45
You know, we're doing that across the stack, from infrastructure all the way up to applications.
00:34:48
Oracle is an applications company as well.
00:34:50
And so, we're doing that across the stack, but from an infrastructure standpoint, we're
00:34:54
investing a lot of effort into our core compute stack, our networking stack.
00:34:59
We announced clustered networking.
00:35:01
And what I'm really excited to announce is that we're going to be supporting MI300X as
00:35:04
part of that bare-metal compute stack.
00:35:12
We are super thrilled about that partnership.
00:35:14
We love the fact that you're going to have 300X.
00:35:17
I know your customers and our customers are talking to us every day about it.
00:35:20
Tell us a little bit about what customers are saying.
00:35:22
Yeah, we've been working with a lot of customers.
00:35:24
Obviously, we've been collaborating a lot at the engineering level as well with AMD.
00:35:28
And you know, customers are seeing incredible results already from the previous generation.
00:35:32
And so, I think that will actually carry through with the 300X.
00:35:36
And so much so that we're also excited to actually support MI300X as part of our generative
00:35:41
AI service that's going to be coming up live very soon as well.
00:35:44
So, we're very, very excited about that.
00:35:46
We're working with some of our early customer adopters like Naveen from Databricks Mosaic.
00:35:51
So, we're very excited about the possibility.
00:35:54
We're also very excited about the fact that the ROCm ecosystem is going to help us
00:35:58
continue that effort moving forward.
00:36:00
So, we're very pumped.
00:36:02
That's wonderful.
00:36:03
Karan, thank you so much.
00:36:04
Thank your teams.
00:36:05
We're so excited about the work we're doing together and look forward to a lot more.
00:36:08
Thank you, Lisa.
00:36:09
Thank you.
00:36:14
Now, as important as the hardware is, software actually is what drives adoption.
00:36:20
And we have made significant investments in our software capabilities and our overall
00:36:23
ecosystem.
00:36:24
So, let me now welcome to the stage AMD President Victor Peng to talk about our software and
00:36:29
ecosystem progress.
00:36:35
Thank you, Lisa.
00:36:36
Thank you.
00:36:37
And good morning, everyone.
00:36:39
You know, last June at the AI event in San Francisco, I said that the ROCm software
00:36:44
stack was open, proven, and ready.
00:36:47
And today, I'm really excited to tell you about the tremendous progress we've made in
00:36:51
delivering powerful new features as well as the high performance on ROCm.
00:36:56
And how the ecosystem partners have been significantly expanding the support for Instinct GPUs and
00:37:02
the entire product portfolio.
00:37:04
Today, there are multiple tens of thousands of AI models that run right out of the box
00:37:09
on Instinct.
00:37:11
And more developers are running on the MI250, and soon they'll be running on the MI300.
00:37:17
So we've expanded deployments in the data center, at the edge, in client, embedded applications
00:37:23
of our GPUs, CPUs, FPGAs, and adaptive SoCs, really end to end.
00:37:29
And we're executing on that strategy of building a unified AI software stack so any model,
00:37:34
including generative AI, can run seamlessly across an entire product portfolio.
00:37:39
Now, today, I'm going to focus on ROCm and the expanded ecosystem support for our
00:37:44
Instinct GPUs.
00:37:47
We architected ROCm to be modular and open source to enable very broad user accessibility
00:37:53
and rapid contribution by the open source community and AI community.
00:37:58
Open source and ecosystem are really integral to our software strategy, and in fact, really
00:38:03
open is integral to our overall strategy.
00:38:06
This contrasts with CUDA, which is proprietary and closed.
00:38:09
Now, the open source community, everybody knows, moves at the speed of light in deploying
00:38:14
and proliferating new algorithms, models, tools, and performance enhancements.
00:38:19
And we are definitely seeing the benefits of that in the tremendous ecosystem momentum
00:38:24
that we've established.
00:38:26
To further accelerate developer adoption, we recently announced that we're going to
00:38:30
be supporting ROCm on our Radeon GPUs.
00:38:33
This makes AI development on AMD GPUs more accessible to more developers, start-ups,
00:38:39
and researchers.
00:38:41
So our foot is firmly on the gas pedal with driving the MI300 to volume production and
00:38:46
our next ROCm release.
00:38:49
So I'm really super excited that we'll be shipping ROCm 6 later this month.
00:38:53
I'm really proud of what the team has done with this really big release.
00:38:56
ROCm 6 has been optimized for gen AI, particularly large language models, has powerful
00:39:02
new features, library optimizations, expanded ecosystem support, and increases performance
00:39:09
by factors.
00:39:10
It really delivers for AI developers.
00:39:13
ROCm 6 supports FP16, BF16, and the new FP8 data types for higher performance while
00:39:20
reducing both memory and bandwidth needs.
00:39:25
We've incorporated advanced graph and kernel optimizations and optimized libraries for
00:39:29
improved efficiency.
00:39:31
We're shipping state-of-the-art attention algorithms like FlashAttention-2 and paged attention,
00:39:35
which are critical for performance in LLMs and other models.
00:39:40
These algorithms and optimizations are complemented with a new release of RCCL, our collective
00:39:45
communications library for efficient, very large-scale GPU deployments.
00:39:50
So look, the bottom line is ROCm 6 delivers a quantum leap in performance and capability.
00:39:56
Now I'm going to first work you through the inference performance gains you'll see with
00:40:00
some of these optimizations on ROCm 6.
00:40:02
So for instance, running a 70 billion parameter Llama 2 model, paged attention and other algorithms
00:40:07
speed up the token generation by paging attention keys and values, delivering 2.6x higher performance.
00:40:15
HIP graph allows processing to be defined in graphs rather than single operations,
00:40:21
and that delivers a 1.4x speed up.
00:40:24
FlashAttention, which is a widely used kernel for high-performance LLMs,
00:40:29
delivers 1.3x speed up.
00:40:32
So all those optimizations together deliver an 8x speed up on the MI300X with ROCm 6
00:40:39
compared to the MI250 and ROCm 5.
00:40:42
That's 8x performance in a single generation.
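For intuition, the quoted per-optimization gains roughly compound. A minimal sketch (speedups rarely multiply cleanly in practice, and the hardware/software split below is an assumption):

```python
# Compounding the ROCm 6 optimization gains quoted above (rough sketch).
paged_attention = 2.6
hip_graph = 1.4
flash_attention = 1.3
software = paged_attention * hip_graph * flash_attention
print(round(software, 2))       # ~4.73x from software optimizations alone
print(round(8 / software, 2))   # ~1.69x left over, implicitly the MI300X
                                # hardware's gen-on-gen contribution (assumption)
```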
00:40:46
So this is one of those huge benefits we provide to customers with this great performance improvement
00:40:50
with the MI300X.
00:40:52
So now let's look at it from a competitive perspective.
00:40:56
Lisa had highlighted the performance of large models running on multiple GPUs.
00:41:01
What I'm sharing here is the performance of smaller models running on a single GPU,
00:41:07
in this case the 13 billion parameter Llama 2 model.
00:41:10
The MI300X and ROCm 6 together deliver 1.2x higher performance than the competition.
00:41:16
So this is the reason why our customers and our partners are super excited about creating
00:41:20
the next innovations in AI on the MI300X.
00:41:26
So we're relentlessly focused on delivering leadership technology and very comprehensive
00:41:30
software support for AI developers.
00:41:33
And to fuel that drive, we've been significantly strengthening our software teams through both
00:41:38
organic and inorganic means, and we're expanding our ecosystem engagements.
00:41:43
So we recently acquired Nod.ai and Mipsology.
00:41:46
Nod brings world-class expertise in open source compilers and runtime technology.
00:41:52
They've been instrumental in the MLIR compiler technology as well as in the communities.
00:41:57
And as part of our team, they are significantly strengthening our customer engagements and
00:42:01
they're accelerating our software development plans.
00:42:05
Mipsology also strengthens our capabilities, especially in delivering to customers
00:42:09
in very AI-rich applications like autonomous vehicles and industrial automation.
00:42:16
So now let me turn over to the ecosystem.
00:42:20
In addition to working closely with the ecosystem, oh, sorry.
00:42:23
We announced that we had the partnership with Hugging Face just last June.
00:42:28
Today they have 62,000 models running daily on Instinct platforms.
00:42:34
And in addition, we've worked closely on getting these LLM optimizations as part of their optimal
00:42:39
library and toolkit.
00:42:41
Our partnership with PyTorch Foundation has also continued to thrive with CI/CD pipelines
00:42:48
and validation, enabling developers to target our platforms directly.
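In practice that targeting is transparent to most PyTorch users: ROCm builds of PyTorch surface the HIP backend through the familiar "cuda" device API. A minimal sketch, assuming a ROCm build of PyTorch and a supported AMD GPU:

```python
# Checking for and using an AMD GPU from a ROCm build of PyTorch (sketch).
import torch

print(torch.version.hip)          # set on ROCm builds (None on CUDA builds)
print(torch.cuda.is_available())  # True when a supported AMD GPU is present

x = torch.randn(1024, 1024, device="cuda")  # "cuda" maps to HIP on ROCm
print((x @ x).sum().item())                 # runs the matmul on the GPU
```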
00:42:52
And we continue to make very significant contributions to all the major frameworks, including upstream
00:42:57
support for AMD GPUs in JAX, OpenXLA, QPI, and even initiatives like DeepSpeed for Science.
00:43:06
Just yesterday, the AI Alliance was announced with over 50 founding members that also include
00:43:11
AMD, IBM, and Meta and other companies.
00:43:16
And I'm really delighted to share some very late-breaking news.
00:43:20
AMD GPUs, including the MI300, will be supported in the standard OpenAI Triton distribution
00:43:27
starting with the 3.0 release.
00:43:34
We're really thrilled to be working with Philippe Tillet, who created Triton, and the whole OpenAI team.
00:43:41
AI developers using the OpenAI Triton are more productive working at a higher level
00:43:45
of design abstraction, and they still get really excellent performance.
00:43:49
This is great for developers and aligned with our strategy to empower developers with powerful
00:43:55
and open software stacks and GPU platforms.
00:43:58
This is in contrast to the much greater effort developers would need to invest working at
00:44:02
a much lower level of abstraction in order to eke out performance.
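For a sense of that abstraction level, here is the canonical Triton vector-add kernel: tiling, masking, and memory access are a few lines of Python rather than hand-tuned GPU code. A minimal sketch, assuming Triton and a GPU-enabled PyTorch build are installed:

```python
# Canonical Triton vector-add kernel (sketch of the abstraction level).
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n_elements                  # guard the final partial block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")             # "cuda" maps to HIP on ROCm
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK=1024)
print(torch.allclose(out, x + y))                # True
```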
00:44:07
Now I've shared a lot with you about the progress we made on software, but the best indication
00:44:12
of the progress we've really made are the people who are using our software and GPUs
00:44:16
and what they're saying.
00:44:17
So it gives me great pleasure to have three AI luminaries and entrepreneurs from Databricks,
00:44:23
Essential AI, and Lamini to join me on stage.
00:44:27
Please give a very warm welcome to Ion Stoica, Ashish Vaswani, and Sharon Zhou.
00:44:54
Great.
00:44:57
Welcome, Ion, Ashish, and Sharon.
00:44:58
Thank you so much for joining us here.
00:45:00
Really appreciate it.
00:45:01
So I'm gonna ask each of you a bit about first with the mission of your company
00:45:06
and share about the innovations you're doing with our GPUs and software
00:45:10
and what the experience has been like.
00:45:12
So Ion, let me start with you.
00:45:14
Now you're also not only founder of Databricks,
00:45:17
but you're on the faculty of the computer science department at UC Berkeley,
00:45:21
director of the Sky Computing Lab,
00:45:22
and you've also been involved with AnyScale and many AI startups.
00:45:27
So maybe you could talk about your engagement with AMD
00:45:30
as well as your experience in the MI200 and MI300.
00:45:33
Yeah, thank you very much.
00:45:34
Very glad to be here.
00:45:36
And yes, indeed, I collaborated with AMD wearing multiple hats,
00:45:43
director of the Sky Computing Lab at Berkeley,
00:45:46
which AMD is supporting, and also as a founder of AnyScale and Databricks.
00:45:52
And in all my work over the years, one thing I really focus on
00:45:56
is democratizing the access to AI.
00:46:00
What this means, it's improving the scale, performance, and cost,
00:46:05
reducing the cost, to run these large AI applications,
00:46:11
which means everything from AI workloads, everything from training,
00:46:15
fine-tuning, inference, and generative AI applications.
00:46:21
Just to give you some examples, we developed vLLM,
00:46:24
which is arguably now the most popular open-source
00:46:28
inference engine for LLMs.
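For readers unfamiliar with it, vLLM exposes a small offline-inference API. A minimal sketch (the checkpoint name is illustrative; any supported model works):

```python
# Minimal vLLM offline-inference sketch (model name is an example).
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-hf")
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["What makes an LLM inference engine fast?"], params)
print(outputs[0].outputs[0].text)
```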
00:46:31
We have developed Ray, another open-source framework
00:46:34
which is used to distribute machine learning workloads.
00:46:37
Ray has been used by OpenAI to train ChatGPT.
00:46:41
And more recently, Sky Computing, one of the projects there is SkyPilot,
00:46:46
which helps you to run your applications or machine learning applications
00:46:52
and workloads across multiple clouds.
00:46:54
And why do you want to do that?
00:46:56
It's because you want to alleviate the scarcity of the GPUs
00:47:01
and reduce the costs.
00:47:04
Now, when it comes to our collaborations,
00:47:07
we collaborate on all these kind of projects.
00:47:10
And one thing which was a very pleasant surprise
00:47:14
is that it was very easy to run and include ROCm in our stack.
00:47:22
It really runs out of the box from day one.
00:47:27
Of course, you need to do more optimization for that.
00:47:29
And this is what we are doing and we are working on.
00:47:32
So for instance, we added support for the MI250 to Ray.
00:47:39
And we are working, actually, collaborating with AMD,
00:47:43
like I mentioned, to optimize the inference for vLLM,
00:47:48
again, running on MI250 and MI300X.
00:47:52
And from the point of view of SkyPilot,
00:47:55
we're really looking forward to having more and more MI250s
00:48:02
and MI300X in various clouds.
00:48:05
So we have more choices.
00:48:07
It sounds great.
00:48:09
Thank you so much for all the collaboration across all those clouds.
00:48:12
Ashish why don't you tell us about Essential's mission
00:48:16
and also your experience with ROCm and Instinct?
00:48:20
Thank you.
00:48:22
Great to be here, Victor.
00:48:25
Essential, we're really excited.
00:48:27
We're really excited to push the boundaries of human-machine
00:48:32
partnership in enterprises.
00:48:33
We should be able to do it.
00:48:34
We're at the beginning stages where
00:48:36
we'll be able to do 10x or 50x more than what we can just
00:48:39
do by ourselves today.
00:48:40
So we're extremely excited.
00:48:41
And what that's going to take, I believe
00:48:45
it's going to be a full-stack approach.
00:48:47
So you're building the models, serving infrastructure,
00:48:49
but more importantly, understanding workflows
00:48:52
in enterprises today and giving people the tools
00:48:55
to configure these models, teach these models to configure them
00:48:59
for their workflows end to end.
00:49:01
And so the models learn with feedback.
00:49:02
They get better with feedback.
00:49:04
They get smarter.
00:49:06
And then they're eventually able to even guide non-experts
00:49:08
to do tasks they were not able to do.
00:49:10
We're really excited.
00:49:11
And we actually were lucky to start to benchmark the 250s
00:49:16
earlier this year.
00:49:18
And hey, we want to solve a couple of hard problems,
00:49:21
scientific problems. And we were like, hey, are we going
00:49:23
to get long context and check?
00:49:24
OK, so are we going to be able to train larger models?
00:49:26
Are we able to serve larger models on smaller chips?
00:49:28
And so as we saw, and the ease of using the software
00:49:32
was also very pleasant.
00:49:36
And then we saw how things were progressing.
00:49:38
For example, I think in two months, I believe,
00:49:40
FlashAttention, which is a critical component
00:49:42
to actually scale to longer sequences,
00:49:44
appeared, so it was generally very happy
00:49:45
and just impressed with the progress
00:49:47
and excited about the chips.
00:49:49
Thanks so much, Ashish. And Sharon.
00:49:50
So Sharon, Lamini has a very innovative business model
00:49:57
and working with enterprise for their private models.
00:49:59
Why don't you share the mission and how the experience
00:50:01
with AMD has been?
00:50:03
Yeah, thanks, Victor.
00:50:04
So by way of quick background, Sharon,
00:50:07
co-founder CEO of Lamini, most recently,
00:50:09
I was a computer science faculty at Stanford
00:50:11
leading a research group in generative AI.
00:50:13
I did my PhD there also under Andrew Ng
00:50:16
and teach about a quarter million students
00:50:18
and professionals online in generative AI.
00:50:20
And I left Stanford to pursue Lamini and co-found Lamini
00:50:24
on the premise of making the magical, difficult, expensive
00:50:29
pieces of building your own language
00:50:31
model inside an enterprise extremely accessible, easy
00:50:35
to use so that companies who understand
00:50:38
their domain-specific problems best
00:50:39
can be the ones who can actually wield this technology
00:50:42
and, more importantly, fully own that technology.
00:50:47
In just a few lines of code, you can run an LLM
00:50:50
and be able to imbue it with knowledge
00:50:53
from millions of documents, which
00:50:55
is 40,000 times more than hitting
00:51:00
Claude 2 Pro on that API.
00:51:02
So just a huge amount of information
00:51:05
can be imbued into this technology
00:51:06
using our infrastructure.
00:51:08
And more importantly, our customers
00:51:11
get to fully own their models.
00:51:12
For example, NordicTrack, one of our customers
00:51:16
that makes all the ellipticals and treadmills in the gym,
00:51:20
whose parent company is iFit, they have over 6 million users
00:51:23
on their mobile app platform.
00:51:26
And so they're building an LLM that can actually
00:51:29
create this personal AI fitness coach imbued
00:51:31
with all the knowledge they have in-house
00:51:34
on what a good fitness coach is.
00:51:35
And it turns out it's actually not a professional athlete.
00:51:37
They tried to hire Michael Phelps, did not work.
00:51:39
So they have real knowledge inside of their company
00:51:42
and they're imbuing the LLM with that
00:51:44
so that we can all have personal fitness trainers.
00:51:47
So we're very excited to be working with AMD.
00:51:51
We actually have had a cloud, AMD cloud,
00:51:54
in production for over the past year on MI200,
00:51:58
so MI210, MI250s.
00:52:00
And we're very excited about the MI300s.
00:52:04
And I think something that's been super important to us
00:52:07
is that with Lamini software,
00:52:09
we've actually reached software parity with CUDA
00:52:12
on all the things that matter with large language models,
00:52:15
including inference and training.
00:52:17
And I would say even beyond CUDA.
00:52:19
We have reached beyond CUDA
00:52:21
for things that matter for our customers.
00:52:23
So that's including higher memory,
00:52:25
higher memory or higher capacity means bigger models.
00:52:28
And our customers wanna be able to build
00:52:31
bigger and more capable models.
00:52:32
And then a second point,
00:52:33
which Lisa kind of touched on earlier today is,
00:52:39
these machines, these chips can actually,
00:52:42
given higher bandwidth, be able to return results
00:52:45
with lower latency, which matters for the user experience,
00:52:49
certainly a personal fitness coach,
00:52:51
but for all of our customers as well.
00:52:53
Super exciting, that's great.
00:52:56
Great.
00:52:58
So, Ion, back to you, changing this up a little bit.
00:53:00
So, you heard several key components
00:53:02
of ROCm are open source.
00:53:03
And we did that for rapid adoption
00:53:05
and also getting better, more enhancements
00:53:07
from the community, both open source and AI.
00:53:09
So what do you think about this strategy
00:53:11
and how do you think this approach might help
00:53:12
some of the companies that you've founded?
00:53:15
So obviously, given my history,
00:53:17
really love the open source.
00:53:19
I love the open source ecosystem.
00:53:21
And we try over time to make our own contributions,
00:53:27
and I think that one thing to note
00:53:31
is that many of the generative AI tools today are open source.
00:53:35
And we are talking here about Hugging Face,
00:53:37
about PyTorch, Triton, like I mentioned,
00:53:40
vLLM, Ray, and many others.
00:53:43
And many of these tools actually can run today
00:53:49
on the AMD ROCm stack today.
00:53:54
And this makes ROCm another key component
00:53:58
of the open source ecosystem.
00:54:02
And I think this is great.
00:54:03
And in time, I'm sure, actually quite fast,
00:54:10
the community will take advantage
00:54:12
of the unique capabilities of the AMD
00:54:17
MI250 and MI300X to innovate
00:54:22
and to improve the performance of all these tools
00:54:27
which are running at a higher level of the generative AI stack.
00:54:30
Great, and that's our purpose and aim,
00:54:32
so I'm glad to hear that.
00:54:34
So I'm gonna, out of order execution,
00:54:38
jump over to Sharon.
00:54:40
So Sharon, what do you think about how AI workloads
00:54:44
are evolving in the future?
00:54:46
And what do you think, GPU Instincts,
00:54:48
since you have great experience with it
00:54:50
and ROCm can play in that future of AI development?
00:54:54
Okay, so maybe a bit of a spicy take.
00:54:56
I think that GOFAI, good old-fashioned AI,
00:55:00
is not the future of AI.
00:55:03
And I really do think it's LLMs,
00:55:05
or some variant of LLMs of these models
00:55:08
that can actually be able to soak up
00:55:10
all this general knowledge that is missing
00:55:13
from these traditional algorithms.
00:55:15
And we've seen this across so many different algorithms
00:55:17
in our customers already.
00:55:19
Those who are even at the bleeding edge
00:55:21
of recommendation systems, forecasting systems,
00:55:23
classification, are even using this
00:55:26
because of that general knowledge that it's able to learn.
00:55:29
So I think that's the future.
00:55:30
It's maybe more known as Software 2.0,
00:55:34
coined by my friend, Andrej Karpathy.
00:55:36
And I really do think Software 2.0,
00:55:38
which is hitting these models time and time again,
00:55:41
instead of writing really extensive software
00:55:43
inside a company, we'll be supporting enterprises 2.0,
00:55:48
meaning enterprises of the future, of the next generation.
00:55:53
And I think the AMD Instinct GPUs
00:55:57
are critical to basically supporting,
00:56:01
ubiquitously supporting the Software 2.0 of the future.
00:56:05
And we absolutely need compute
00:56:07
to be able to run these models efficiently,
00:56:09
to run lots of these models, more of these models,
00:56:12
and larger models with greater capabilities.
00:56:15
So overall, very excited with the direction
00:56:18
of not only these AI workloads,
00:56:20
but also the direction that AMD is taking
00:56:23
in doubling down on these MI300s
00:56:25
that, of course, can take on larger models
00:56:28
and more capable models for us.
00:56:30
Awesome.
00:56:33
So Ashish, we'll finish up with you
00:56:36
and I'll give you the same kind of question.
00:56:37
So what do you think about the future of AI workloads
00:56:39
and how do you think our GPUs and ROCm can play
00:56:42
and how you're driving things at Essential?
00:56:44
Yep.
00:56:51
So I think that we have to improve reasoning
00:56:58
and planning to solve these complex tasks,
00:57:01
like take an analyst and if they actually,
00:57:04
they want to absorb an earnings call
00:57:06
and figure out how they should revise their opinion
00:57:10
and whether to invest in a company or what recommendations they should provide.
00:57:13
It's actually gonna take,
00:57:14
it's gonna take multiple reasoning over multiple steps.
00:57:17
It's gonna take ingesting a large document
00:57:20
and being able to extract information from it,
00:57:23
apply their models, actually ask for information
00:57:25
when they don't have any, get world knowledge,
00:57:28
but also maybe have some reasoning
00:57:32
and some outside reasoning and planning there.
00:57:34
And then for all these sort of,
00:57:36
so when I look at the MI300 with very large HBM
00:57:40
and high memory bandwidth,
00:57:42
I think of what's gonna be unlocked
00:57:44
and which capabilities are going to be improved
00:57:46
and what new capabilities will be available.
00:57:48
So I mean, even with what we have today,
00:57:51
just imagine a world where you can process long documents
00:57:55
or you can make these models much more accurate
00:57:57
by adding more examples in the prompt.
00:58:00
But imagine just complete user sessions
00:58:02
that you can maintain and model state,
00:58:04
how they would actually improve
00:58:06
the end-to-end user experience, right?
00:58:08
And I think that we're moving to a kind of architecture
00:58:15
where what typically is to happen in inference,
00:58:17
a lot of search is now gonna go into training
00:58:19
where the models are gonna explore thousands of solutions
00:58:22
and eventually pick one that's actually the best option
00:58:24
for the goal, the best solution for the goal.
00:58:28
And that's good, and definitely the large HBM
00:58:31
and high bandwidth is gonna not only be important
00:58:33
for serving large models with low latency
00:58:35
for better end-to-end experience,
00:58:36
but also for some of these new techniques
00:58:38
that we're just exploring
00:58:40
that are gonna improve the capabilities of these models.
00:58:43
So very excited about the new chip
00:58:46
and what it's gonna unlock.
00:58:47
Great, thank you, Ashish.
00:58:48
Ion, Ashish, Sharon, this has been really terrific.
00:58:51
Thank you so much for all the great insights
00:58:54
you have provided us.
00:58:55
Thank you.
00:58:56
And thank you for joining us today.
00:58:57
Thank you.
00:58:58
Thank you.
00:58:59
Thank you.
00:59:00
Thank you.
00:59:03
It's just so exciting to hear what companies like Databricks,
00:59:06
Essential AI, Lamini are achieving with our GPUs
00:59:09
and just super thrilled that their experience
00:59:12
with our software has been so smooth and really a delight.
00:59:16
So you can tell, they see absolutely no barriers, right?
00:59:19
And they're extremely motivated
00:59:20
to innovate on AMD platforms.
00:59:23
Okay, to sum it up, what we delivered
00:59:25
over the past six months is empowering developers
00:59:28
to execute their mission and realize their vision.
00:59:32
We'll be shipping ROCm 6 very soon.
00:59:34
It's optimized for LLMs and together with the MI300X,
00:59:38
it's gonna deliver 8X gen-on-gen performance improvement
00:59:42
and it's higher performance in inference
00:59:44
than the competition.
00:59:46
We have 62,000 models running on Instinct today
00:59:49
and more models will be running on the MI300 very soon.
00:59:54
We have very strong momentum,
00:59:55
as you can see in the ecosystem,
00:59:57
adding OpenAI Triton to our extensive list
01:00:00
of industry-standard frameworks, models, runtimes, and libraries.
01:00:04
And you heard from the panels, right?
01:00:06
Our tools are proven and easy to use.
01:00:09
Innovators are advancing the state of the art of AI
01:00:12
on AMD GPUs today.
01:00:15
ROCm 6 and the MI300X will drive an inflection point
01:00:19
in developer adoption, I'm confident of that.
01:00:22
We're empowering innovators to realize the profound benefits
01:00:26
of pervasive AI faster on AMD.
01:00:30
Thank you.
01:00:35
And now I'd like to invite Lisa back on the stage.
01:00:45
Thank you, Victor.
01:00:46
And weren't those innovators great?
01:00:48
I mean, you love the energy
01:00:49
and just all of the thought there.
01:00:51
So look, as you can see,
01:00:53
the team has really made great, great progress with ROCm
01:00:57
and our overall software ecosystem.
01:00:59
Now, I said I wanted though,
01:01:00
we really want broad adoption for MI300X.
01:01:03
So let's go through and talk to some additional customers
01:01:07
and partners who are early adopters of MI300X.
01:01:10
Our next guest is a partner really at the forefront
01:01:13
of GenAI innovation and working across models,
01:01:17
software and hardware.
01:01:18
Please welcome Ajit Matthews of Meta to the stage.
01:01:28
Hello, Ajit, it's so nice of you to be here.
01:01:30
We're incredibly proud of our partnership together.
01:01:34
Meta and AMD have been doing so much work together.
01:01:36
Can you tell us a little bit about Meta's vision in AI?
01:01:39
Cause it's really broad and key for the industry.
01:01:43
Absolutely, thanks Lisa.
01:01:45
We are excited to partner with you and others
01:01:48
and innovate together to bring generative AI
01:01:51
to people around the world at scale.
01:01:54
Generative AI is enabling new forms of connection
01:01:57
for people around the world,
01:01:59
giving them the tools to be more creative,
01:02:01
expressive and productive.
01:02:04
We are investing for the future
01:02:06
by building new experiences for people across our services
01:02:09
and advancing open technologies
01:02:12
and research for the industry.
01:02:14
We recently launched AI stickers,
01:02:17
image editing, Meta AI, which is our AI assistant
01:02:21
that spans our family of apps and devices
01:02:25
and lots of AIs for people to interact
01:02:28
within our messaging platforms.
01:02:31
In July, we opened access to our Llama 2 family of models
01:02:36
and as you've seen, have been blown away
01:02:38
by the reception from the community
01:02:40
who have built some truly amazing applications
01:02:44
on top of them.
01:02:45
We believe that an open approach leads to better
01:02:51
and safer technology in the long run
01:02:53
as we have seen from our involvement
01:02:55
in the PyTorch Foundation, Open Compute Project
01:02:58
and across dozens of previous AI models
01:03:02
and data set releases.
01:03:04
We're excited to have partnered with the industry
01:03:06
on our generative AI work, including AMD.
01:03:10
We have a shared vision to create new opportunities
01:03:13
for innovation in both hardware and software
01:03:16
to improve the performance and efficiency of AI solutions.
01:03:22
That's so great, Ajit.
01:03:23
We completely agree with the vision.
01:03:26
We agree with the open ecosystem
01:03:28
and that really being the path to get all of the innovation
01:03:32
from all the smart folks in the industry.
01:03:34
Now, we've collaborated a lot on the product front as well,
01:03:38
both EPYC and Instinct.
01:03:39
Can you talk a little bit about that work?
01:03:42
Yeah, absolutely.
01:03:43
We have been working together on EPYC CPUs since 2019
01:03:48
and most recently deployed Genoa and Bergamo-based servers
01:03:53
at scale across Meta's infrastructure
01:03:55
where it now serves many diverse workloads.
01:04:00
But our partnership is much broader than EPYC CPUs
01:04:04
and we have been working together on Instinct GPUs
01:04:06
starting since the MI100 in 2020.
01:04:10
We have been benchmarking ROCm
01:04:12
and working together on improvements for its support
01:04:16
in PyTorch across each generation of AMD Instinct GPU,
01:04:20
leading up to MI300X now.
01:04:22
Over the years, ROCm has evolved,
01:04:25
becoming a competitive software platform
01:04:27
due to optimizations and ecosystem growth.
01:04:31
AMD is a founding member of the PyTorch Foundation
01:04:34
and has made a significant commitment to PyTorch,
01:04:37
providing day-zero support for PyTorch 2.0
01:04:40
with ROCm, torch.compile, torch.export,
01:04:43
all of those things are great.
01:04:44
We have seen tremendous progress
01:04:45
on both Instinct GPU performance and ROCm maturity
01:04:49
and are excited to see ecosystem support
01:04:52
grow beyond PyTorch 2.0,
01:04:53
like OpenAI Triton, today's announcement
01:04:56
with respect to AMD being a default backend,
01:04:59
that's great, FlashAttention-2 is great,
01:05:02
Hugging Face, great, and other industry frameworks.
01:05:05
All of these are great partnerships.
01:05:08
It really means a lot to hear you say that, Ajit.
01:05:10
I think we also view
01:05:12
that it's been an incredible partnership.
01:05:13
I think the teams work super closely together,
01:05:16
that's what you need to do to drive innovation.
01:05:18
And the work with PyTorch Foundation
01:05:20
is foundational for AMD, but really the ecosystem as well.
01:05:25
But our partnership is very exciting right now with GPUs,
01:05:29
so can you talk a little bit about the 300X plans?
01:05:31
Oh, here we go.
01:05:32
We are excited to be expanding our partnership
01:05:34
to include Instinct MI300X GPUs
01:05:37
in our data centers for AI inference workloads.
01:05:40
Thank you, so much.
01:05:45
So, just to give you a little background,
01:05:47
MI300X leverages the OCP accelerator module,
01:05:51
standard and platform,
01:05:52
which has helped us adopt it in record time.
01:05:55
In fact, MI300X is trending to be one of the fastest
01:05:58
design-to-deployment solutions in Meta's history.
01:06:05
We have also had a great experience with ROCm,
01:06:09
and the performance it is able to deliver with MI300X.
01:06:12
The optimizations and the ecosystem growth over the years
01:06:16
have made ROCm a competitive software platform.
01:06:19
As model parameters increase
01:06:21
and the Llama family of models continues to grow in size
01:06:24
and power, which it will,
01:06:26
the MI300X with its 192 GB of memory
01:06:29
and higher memory bandwidth meets the expanding requirements
01:06:32
for large language model inference.
01:06:34
We are really pleased with the ROCm optimizations
01:06:37
that AMD has done,
01:06:39
focused on the Llama 2 family of models on MI300X.
01:06:43
We are seeing great, promising performance numbers,
01:06:46
which we believe will benefit the industry.
01:06:49
So, to summarize, we are thrilled with our partnership
01:06:52
and excited about the capabilities offered by the MI300X
01:06:56
and the ROCm platform as we start to scale their use
01:06:59
in our infrastructure for production workloads.
01:07:02
That is absolutely fantastic, Ajit.
01:07:04
Thank you, Lisa.
01:07:05
Thank you so much.
01:07:07
We are thrilled with the partnership
01:07:09
and we look forward to seeing lots of MI300Xs
01:07:12
in your infrastructure. So, thank you for being here.
01:07:14
That's good. Thank you.
01:07:19
So, super exciting.
01:07:21
We said cloud is really where a lot of the infrastructure
01:07:24
is being deployed,
01:07:25
but enterprise is also super important.
01:07:28
So, when you think about the enterprise right now,
01:07:30
many enterprises are actually thinking about their strategy.
01:07:33
They want to deploy AI broadly
01:07:35
across both cloud and on-prem,
01:07:38
and we're working very closely with our OEM partners
01:07:41
to bring very integrated enterprise AI solutions
01:07:44
to the market.
01:07:45
So, to talk more about this,
01:07:47
I'd like to invite one of our closest partners to the stage,
01:07:50
Arthur Lewis, President of Dell Technologies
01:07:52
Infrastructure Solutions Group.
01:07:58
Hey, welcome, Arthur.
01:08:00
I'm so glad you could join us for this event.
01:08:02
And Dell and AMD have had such a strong history of partnership.
01:08:06
I actually also think, Arthur,
01:08:08
you have a very unique perspective
01:08:10
of what's happening in the enterprise,
01:08:11
just given your purview.
01:08:13
So, can we just start with giving the audience
01:08:15
a little bit of a view of what's happening in enterprise AI?
01:08:18
Yeah, Lisa, thank you for having me today.
01:08:21
We are at an inflection point with artificial intelligence.
01:08:26
Traditional machine learning and now generative AI
01:08:29
is a catalyst for much greater data utilization,
01:08:32
making the value of data tangible
01:08:34
and therefore quantifiable.
01:08:37
Data, as we all know, is growing exponentially.
01:08:39
A hundred zettabytes of data was generated last year,
01:08:42
more than doubling over the last three years.
01:08:44
And IDC projects that data will double again by 2026.
01:08:49
And it is clear that data is becoming
01:08:51
the world's most valuable asset.
01:08:53
And this data has gravity.
01:08:55
83% of the world's data resides on-prem,
01:08:59
and much of the new data will be generated at the edge.
01:09:03
Yet customers are dealing with years of rapid data growth,
01:09:07
multiple copies on-prem across clouds,
01:09:10
proliferating data sources, formats, and tools.
01:09:13
These challenges, if not overcome,
01:09:15
will prevent customers from realizing
01:09:17
the full potential of artificial intelligence
01:09:19
and maximizing real business outcomes.
01:09:22
Today, customers are faced with two suboptimal choices.
01:09:27
Number one, stitch together a complex web
01:09:30
of technologies and tools and manage it themselves,
01:09:33
or two, replicate their entire data estate
01:09:37
in the public cloud.
01:09:39
Customers need and deserve a better solution.
01:09:43
Our job is to bring artificial intelligence to the data.
01:09:47
That's great perspective, Arthur.
01:09:49
And that 83% of the data and where it resides,
01:09:53
I think, is something that sticks in my mind a lot.
01:09:55
Now let's move to a little bit of the technology.
01:09:57
I mean, we've been partnering together
01:09:58
to bring some great solutions to the market.
01:10:00
Tell us more about what you have planned
01:10:02
from a tech standpoint.
01:10:03
Well, today's an exciting day.
01:10:05
We are announcing a much-anticipated update
01:10:08
to our PowerEdge XE9680 family,
01:10:11
the fastest growing product in Dell ISG history,
01:10:15
with the addition of AMD's Instinct MI300X Accelerator
01:10:19
for artificial intelligence.
01:10:27
Effective today, we are going to be able to offer
01:10:29
a new configuration of eight MI300X accelerators,
01:10:35
providing 1.5 terabytes of coherent HBM3 memory,
01:10:39
delivering 5.3 terabytes per second of memory bandwidth per GPU.
01:10:44
This is an unprecedented level of performance
01:10:47
in the industry and will allow customers
01:10:49
to consolidate large language model inferencing
01:10:53
onto a smaller number of servers,
01:10:55
while providing for training at scale,
01:10:58
while also reducing complexity, cost,
01:11:01
and data center footprint.
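A back-of-the-envelope sketch of that consolidation math, assuming a hypothetical 140 GB FP16 model:

```python
# Aggregate HBM3 in the eight-GPU configuration just described.
gpus_per_server = 8
hbm_per_gpu_gb = 192
server_hbm_gb = gpus_per_server * hbm_per_gpu_gb
print(server_hbm_gb)  # 1536 GB, i.e. the ~1.5 TB quoted

# How many copies of a hypothetical 140 GB model fit per server;
# more replicas per box is what enables consolidation onto fewer servers.
model_gb = 140
print(server_hbm_gb // model_gb)  # 10 replicas
```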
01:11:03
We are also leveraging AMD's Instinct Infinity Platform,
01:11:07
which provides a unified fabric
01:11:10
for connecting multiple GPUs within and across servers,
01:11:14
delivering near linear scaling
01:11:16
and low latency for distributed AI.
01:11:20
Further,
01:11:22
and there's more.
01:11:26
Through our collaboration with AMD
01:11:28
on software and open source frameworks,
01:11:30
which Lisa, you talked a lot about today,
01:11:32
including PyTorch and TensorFlow,
01:11:34
we can bring seamless services for customers
01:11:37
and an out-of-the-box LLM experience.
01:11:40
We talked about making it simple.
01:11:41
This makes it incredibly simple.
01:11:43
And we've also optimized the entire stack
01:11:47
with Dell storage,
01:11:48
specifically PowerScale and ObjectScale,
01:11:50
providing ultra low latency ethernet fabrics,
01:11:53
which are designed specifically
01:11:54
to deliver the best performance and maximum throughput
01:11:58
for generative AI training and inferencing.
01:12:01
This is an incredibly exciting step forward.
01:12:04
And again, effective today, Lisa,
01:12:06
we're open for business,
01:12:08
we're ready to quote,
01:12:09
and we're taking orders.
01:12:10
I like the sound of that.
01:12:15
Look, it's so great to see how this all comes together.
01:12:19
Our teams have been working so closely together
01:12:22
over the last few years
01:12:23
and definitely over the last year.
01:12:26
Tell us though, there's a lot of co-innovation
01:12:28
and differentiation in these solutions.
01:12:31
So just tell us a little bit more about that.
01:12:33
Well, our biggest differentiator
01:12:35
is really the breadth of our technology portfolio at
01:12:38
Dell Technologies.
01:12:39
Products like PowerScale,
01:12:41
which is our OneFS file system for unstructured data storage,
01:12:44
has been helping customers in industries
01:12:46
like financial services, manufacturing, life sciences,
01:12:49
to help solve the world's most challenging problems
01:12:52
for decades as the complexity of their workflows
01:12:55
and scale of their data estate increases.
01:12:58
And with AMD, we are bringing these components together
01:13:01
with open networking products and AI fabric solutions,
01:13:05
taking the guesswork out of building tailored gen AI solutions
01:13:09
for customers of all sizes, again, making it simple.
01:13:13
We have both partnered with Hugging Face
01:13:15
to ensure transformers and LLMs for generative AI
01:13:19
don't just work for our combined solutions
01:13:21
but are optimized for AMD's accelerators
01:13:24
and easy to configure and size for workloads with our products.
01:13:29
And in addition to that, with Dell Validated Designs,
01:13:34
we have a comprehensive set
01:13:35
and a growing array of services and offerings
01:13:38
that can be tailored to meet the needs of customers
01:13:41
looking for a complimentary gen AI strategy consultation
01:13:46
all the way up to a fully managed solution
01:13:49
for generative AI.
01:13:50
That's fantastic, Arthur.
01:13:52
Great set of solutions, love the partnership
01:13:54
and love what we can do
01:13:56
for our enterprise customers together.
01:13:57
Thank you so much for being here.
01:13:58
Thank you for having me, Lisa.
01:14:00
Yeah.
01:14:04
Our next guest is another great friend.
01:14:06
Supermicro and AMD have been working together
01:14:09
to bring leadership computing solutions to the market
01:14:11
for many years based on AMD EPYC processors
01:14:14
as well as Instinct accelerators.
01:14:15
Here to tell us more about that,
01:14:17
please join me in welcoming CEO Charles Liang to the stage.
01:14:20
Congratulations.
01:14:27
Thank you so much.
01:14:28
Hello, Charles.
01:14:29
For a successful launch.
01:14:30
Yeah, thank you so much for being here.
01:14:31
I mean, Supermicro is really well known
01:14:33
for building highly optimized systems
01:14:37
for lots of workloads.
01:14:38
We've done so much together.
01:14:40
Can you share a little bit
01:14:41
about how you're approaching gen AI?
01:14:43
Thank you.
01:14:44
Our building block solutions are
01:14:46
based on a modularized design.
01:14:48
So that enables Supermicro to design products
01:14:51
quicker than others, deliver products to customers
01:14:55
quicker, better leverage inventory,
01:14:58
and provide better service.
01:15:00
And thank you for our close relationship
01:15:02
and for all your support.
01:15:04
That's why we are able to bring products
01:15:06
to market as soon as possible.
01:15:10
Well, I really appreciate that our teams
01:15:12
also work very closely together.
01:15:14
And we now know that everybody is calling us
01:15:18
for AI solutions.
01:15:19
You've built a lot of AI infrastructure.
01:15:22
What are you seeing in the market today?
01:15:24
Oh, the market continues to grow very fast.
01:15:27
The only limitation is-
01:15:28
Very fast, right?
01:15:29
Very fast.
01:15:30
Maybe more than very fast.
01:15:33
So all we need is just more chips.
01:15:36
I know.
01:15:45
So today, across the USA,
01:15:47
the Netherlands, Taiwan, and Malaysia,
01:15:49
we have more than 4,000 racks per month of capacity,
01:15:54
and customers are facing not-enough-power,
01:15:58
not-enough-space problems.
01:16:00
So with our rack-scale building block solutions,
01:16:04
with free-air cooling,
01:16:07
optimized for hybrid air and free-air cooling,
01:16:10
or optimized for liquid cooling,
01:16:12
we can help customers save energy
01:16:15
by up to 30 or even 40%.
01:16:17
And that allows customers to install more systems
01:16:21
within a fixed power budget:
01:16:23
the same power and the same systems,
01:16:27
but less energy cost.
01:16:29
So with all of those,
01:16:30
together with our rack-scale building block solution,
01:16:34
we install the whole rack,
01:16:36
including the CPUs, GPUs,
01:16:40
storage, switches,
01:16:45
firmware, management software,
01:16:48
and security functions.
01:16:49
And when we ship to the customer,
01:16:51
the customer simply plugs in two cables,
01:16:54
the power cable and the data cable,
01:16:57
and then it's ready to run, ready to go online.
01:17:00
For liquid-cooling customers,
01:17:02
for sure, they also need to connect the water tubing.
01:17:06
So that means customers can easily get online
01:17:11
as soon as chips are available.
01:17:13
Yeah, no, that's fantastic.
01:17:15
Thank you, Charles.
01:17:16
Now, let's talk a little bit about MI300X.
01:17:18
What do you have planned for MI300?
01:17:21
Okay, the big product.
01:17:22
We have products based on MI300X,
01:17:27
like an 8U optimized
01:17:29
for air cooling,
01:17:31
and then a 4U optimized for liquid cooling.
01:17:33
For the air-cooled rack,
01:17:37
we support up to 40 kW or 50 kW.
01:17:40
For liquid cooling,
01:17:41
we support up to 80 kW or 100 kW.
01:17:46
And it's all rack-scale plug and play.
01:17:50
So when customers need it,
01:17:51
once we have the chips,
01:17:52
we can ship to customers quicker.
01:17:55
That sounds wonderful.
01:17:56
Well, look, we appreciate all the partnership, Charles,
01:17:58
and we will definitely see a lot of opportunity
01:18:02
to collaborate together on the generative AI.
01:18:04
So thank you so much.
01:18:05
Thank you so much.
01:18:06
Thank you.
01:18:12
Okay, now let's turn to our next guest.
01:18:14
Lenovo and AMD have a broad partnership as well
01:18:17
that spans from data center to workstations and PCs,
01:18:20
and now to AI.
01:18:22
So here to tell us about this special partnership,
01:18:24
please welcome to the stage, Kirk Skaugen,
01:18:26
EVP and President of Infrastructure Solutions Group
01:18:29
at Lenovo.
01:18:34
Hello, Kirk.
01:18:35
Thank you so much for being here.
01:18:37
We truly appreciate the partnership with Lenovo.
01:18:41
You have a great perspective as well.
01:18:43
Tell us about your view of AI
01:18:45
and what's going on in the market.
01:18:47
Sure.
01:18:48
Well, AI is not new for Lenovo.
01:18:49
We've been talking and innovating around AI for many years.
01:18:52
We just had a great Supercomputing conference,
01:18:54
where we're the number one supercomputer provider
01:18:56
on the TOP500,
01:18:58
and we're proud that IDC just ranked us number three
01:19:01
in AI server infrastructure in the world as well.
01:19:02
So it's not new to us,
01:19:04
but you were at Tech World,
01:19:05
so thanks for joining us in Austin.
01:19:08
We're trying to help shape the future of AI
01:19:10
from the pocket to the edge to the cloud,
01:19:12
and we've had this kind of concept of AI for all.
01:19:15
So what does that mean?
01:19:16
Pocket meaning Motorola, smartphone, AI devices,
01:19:21
and then all the way to the cloud with our ODM Plus model.
01:19:23
So our collaboration with our customers
01:19:27
is really to accelerate AI adoption,
01:19:29
and we recently announced another billion dollars
01:19:32
to the original $1.2 billion we announced a few years ago
01:19:34
to deliver AI solutions to businesses of all sizes,
01:19:37
from the smallest business to the largest cloud.
01:19:39
So we believe that generative AI
01:19:41
will ultimately be a hybrid approach,
01:19:44
and fundamentally we do want to bring AI to the data.
01:19:47
I think one of the most exciting things for me is,
01:19:49
I think like Arthur said, right,
01:19:50
we'd see data doubling in the world over the next few years.
01:19:54
75% of that compute is moving to the edge,
01:19:56
and today we're only computing 2% of it,
01:19:58
so we're throwing away 98%.
01:20:00
So more data is going to be created in the next few years
01:20:02
than in the entire history of the world combined,
01:20:04
and together we're bringing AI to the edge
01:20:07
with the recent SE455 ThinkEdge that we announced.
01:20:10
We think that there's kind of three views of generative AI,
01:20:13
public AI, private AI, and personal AI,
01:20:16
and the key for us is protecting privacy
01:20:18
and addressing data security.
01:20:19
So public AI where you'd use obviously public data,
01:20:23
enterprise AI where you'd use only your enterprise data
01:20:26
within your firewall, and then on things like an AI PC,
01:20:29
things that you choose to have only on your device,
01:20:31
whether that's a phone, a tablet, or a PC.
01:20:33
Yeah, no, no, it's a very comprehensive vision,
01:20:35
and we see it very much the same way.
01:20:38
Now, you talked a lot about your AI strategy at Tech World,
01:20:41
and you had some key pillars there.
01:20:44
Do you want to just tell us a little bit more about that?
01:20:45
Yeah, so I think there's three fundamental pillars
01:20:47
of our AI vision and strategy.
01:20:48
First, we have an AI product roadmap,
01:20:50
I think that's second to none,
01:20:51
from a rich smart device portfolio,
01:20:53
and we'll probably talk more about AI PCs another day,
01:20:56
smartphones and tablets.
01:20:57
Then we have a huge array now of over 70 AI-ready server
01:21:02
and storage infrastructure products,
01:21:04
and then we've recently launched a whole set of solutions
01:21:07
and services around that as well.
01:21:08
So more than 70 products,
01:21:10
and we'll talk about the new ones we're announcing today,
01:21:12
which are very exciting.
01:21:13
The second thing is we have something called
01:21:15
an AI innovators program.
01:21:16
What's really daunting to people
01:21:18
is there's over 16,000 AI startups out there.
01:21:20
So if you have an IT department of a few dozen people,
01:21:23
how do you even start?
01:21:24
So we've gone and scoured the earth,
01:21:27
we've found 65 ISVs, 165 solutions
01:21:30
where we've optimized them
01:21:31
on top of Lenovo infrastructure
01:21:33
for some of the key verticals,
01:21:34
and are delivering kind of simplified AI
01:21:36
to the customer base.
01:21:37
And then at Tech World,
01:21:38
we launched a comprehensive set of professional services.
01:21:42
Now Lenovo, more than 40% of our revenue is non-PC,
01:21:45
so we're transforming into a data center and services company.
01:21:48
So we're doing everything in the AI
01:21:50
from just basic customer discovery of what you can do
01:21:52
if you're a stadium,
01:21:54
what are the best-in-class stadium solutions
01:21:55
if you're a fast food chain, if you're a supermarket,
01:21:58
all the way to AI adoption.
01:22:00
And then even from a sustainability perspective,
01:22:02
things like asset recovery services
01:22:04
to make sure you have a sustainable AI journey as well.
01:22:06
Yeah, I know it makes a lot of sense.
01:22:07
And you know, gen AI and large language models
01:22:10
is sort of the defining moment for us right now.
01:22:12
You're spending a lot of time with customers.
01:22:14
What are you hearing from them
01:22:15
and what are their challenges?
01:22:16
Yeah, so I think the key message
01:22:18
is that customers need help in simplifying their AI journey.
01:22:21
I mean, there's so much coming at them.
01:22:23
So our investments in that $2 billion we talked about
01:22:26
are really expanding our AI-ready portfolio
01:22:28
to deliver fully integrated systems
01:22:30
that bring AI-powered computing to everywhere data is created,
01:22:34
especially the edge,
01:22:35
and helping businesses easily and efficiently
01:22:37
deploy generative AI applications.
01:22:39
We're also hearing that customers want choice.
01:22:42
Choice in systems, choice in software,
01:22:44
choice in services, and definitely large language models
01:22:47
and model training are creating a lot of buzz.
01:22:49
But over time, I think we all know inference
01:22:51
is gonna become the dominant AI workload
01:22:53
as data flows from these billions
01:22:55
of connected devices at the edge.
01:22:57
So generative AI from our perspective,
01:22:59
like you said, I think in your opening comments,
01:23:01
needs high-performance compute,
01:23:03
large and fast memory, and a software stack
01:23:05
to support the leading AI ecosystem solution.
01:23:07
So with that, I believe Lenovo and AMD
01:23:10
are really uniquely positioned
01:23:12
to take advantage of these trends.
01:23:14
Yeah, absolutely. And our teams are doing a lot of work together
01:23:17
and working closely on MI300X.
01:23:19
Tell us more about your plans.
01:23:21
Well, we have a long proven track record as a PC company
01:23:25
and as a data center company of bringing Ryzen AI
01:23:27
to our ThinkPads, and we're committed
01:23:30
to being time to market on large language models,
01:23:32
on inferencing, and we're working with AMD
01:23:35
to develop our next-gen AI product roadmap
01:23:37
and our solution portfolios.
01:23:39
So we're incredibly excited today
01:23:40
about the addition of the MI300X
01:23:42
to the Lenovo ThinkSystem platform.
01:23:44
It's gonna be very exciting.
01:23:45
Thank you. Thank you.
01:23:48
So we're committed to be time to market
01:23:50
with a dual-EPYC, eight-GPU MI300X system
01:23:54
and have a lot of customer interest on that.
01:23:56
So bottom line, from edge to cloud,
01:23:59
we are incredibly excited about what's ahead for us.
01:24:03
We're gonna have all of this available as a service
01:24:05
through our Lenovo TruScale as well.
01:24:07
So you only have to pay for what you need.
01:24:09
So as we move to an as-a-service model,
01:24:10
everything we talked about today
01:24:12
will be available through that as well.
01:24:13
So thank you very much and look forward
01:24:15
to continuing the collaboration.
01:24:16
Absolutely, Kirk.
01:24:17
Thank you so much.
01:24:18
Thanks for the partnership.
01:24:19
All right, thank you.
01:24:23
So that's great.
01:24:24
Big thank you to Kirk and Arthur and Charles
01:24:27
for all the work that we're doing together
01:24:28
to really bring MI300X to our customers.
01:24:31
It really does take an entire ecosystem.
01:24:33
We're very proud of actually the broad OEM and ODM ecosystem
01:24:37
that we have brought together
01:24:38
to bring a wide range of MI300X solutions to market in 2024.
01:24:43
And in addition to the OEM and ODM ecosystem,
01:24:46
we're also significantly expanding our work
01:24:49
with some of these specialized AI cloud partners.
01:24:51
So I'm happy to say today that all of these partners
01:24:54
are adding MI300X to their portfolio.
01:24:57
And what's important about this is
01:24:59
it will actually make it easier for developers
01:25:01
and AI startups to get access to MI300X GPUs
01:25:05
as soon as possible with a proven set of providers
01:25:08
who each have their unique value and capabilities.
01:25:12
So that tells you a little bit about the ecosystem
01:25:14
that we're putting together for MI300X.
01:25:21
Now, we've given you a lot of information already,
01:25:24
but what is very, very important is not just the hardware
01:25:28
and the software and all of our customer partnerships,
01:25:31
but it's also the rest of the system partnerships.
01:25:33
So now let me welcome to the stage Forrest Norrod
01:25:36
to talk more about our AI networking
01:25:38
and high-performance computing solutions.
01:25:46
Thank you, Lisa.
01:25:47
Good morning.
01:25:48
So far, we've talked about the amazing GPU
01:25:51
and open software ecosystem that AMD is building
01:25:55
to power generative AI systems.
01:25:57
But there's a third element that's equally important
01:26:02
to the performance and scalability
01:26:03
of these large AI deployments, and that's networking.
01:26:08
The compute required to train the most advanced models
01:26:12
has increased by a factor of 50 billion
01:26:15
over the past decade.
01:26:17
While GPU performance has also increased,
01:26:21
what that performance demand means is we need many GPUs
01:26:25
in order to deliver the required total performance.
01:26:30
Leading AI clusters are now tens of thousands of GPUs,
01:26:35
and that's only going to increase.
01:26:38
Well, so the first way we've scaled to meet that demand
01:26:42
is within the server.
01:26:43
A typical server has perhaps a couple
01:26:46
of high-performance x86 CPUs and perhaps eight GPUs.
01:26:50
You've seen that today.
01:26:52
These are interconnected with a high-performance,
01:26:54
low-latency, non-blocking local fabric.
01:26:58
In the case of NVIDIA, that's NVLink.
01:27:01
For AMD, that's Infinity Fabric.
01:27:04
Both have high signaling rates, low latency,
01:27:08
both are coherent.
01:27:10
Both have demonstrated the ability
01:27:11
to offer near-linear scaling performance
01:27:14
as you increase the number of GPUs,
01:27:17
and both have been proprietary,
01:27:20
effectively only supported by the companies
01:27:22
that created them.
01:27:24
I'm pleased to say that today, AMD is changing that.
01:27:29
We are extending access to the Infinity Fabric ecosystem
01:27:33
to strategic partners and innovative companies
01:27:36
across the industry.
01:27:45
Doing so allows others to innovate
01:27:47
around the AMD GPU ecosystem to the benefit of customers
01:27:51
and the entire industry.
01:27:53
You'll hear more about this from one of our partners
01:27:55
in a few minutes and much more on this initiative next year.
01:28:00
But beyond the node, we still need to connect and scale
01:28:04
to much larger numbers.
01:28:06
We need fabrics to connect the servers to one another,
01:28:09
welding them into one resource.
01:28:12
Now, there are usually two networks connected
01:28:15
to each of these GPU servers.
01:28:17
A traditional ethernet network used to connect the server
01:28:21
to the rest of the traditional data center infrastructure,
01:28:25
and more importantly, a backside network
01:28:29
to interconnect the GPUs, allowing them to share parameters,
01:28:33
results, activations, and coordinate
01:28:36
in the overall training and inference tasks.
01:28:40
When we're connecting thousands of nodes
01:28:42
like we do in AI systems, the network is critical
01:28:46
to overall performance.
01:28:48
It has to deliver fast switching rates
01:28:50
at very low latency.
01:28:52
It must be efficiently scalable
01:28:54
so that congestion problems don't limit performance.
01:28:58
And in AMD, we believe it must also be open,
01:29:01
open to allow innovation.
01:29:04
Today, there are two options for the backend fabric,
01:29:08
InfiniBand or Ethernet.
01:29:10
At AMD, we believe Ethernet is the right answer.
01:29:14
It's a high-performance technology with leading signaling rates.
01:29:18
It has extensions such as RoCE and RDMA
01:29:21
to efficiently move data between nodes,
01:29:25
a set of innovations developed
01:29:27
for leading supercomputers over the years.
01:29:30
It's scalable, offering the highest-rate switching technology
01:29:34
from leading vendors such as Broadcom, Cisco, and Marvell.
01:29:38
And we've seen tremendous innovation recently
01:29:41
in advanced congestion control
01:29:43
to deal with the issues of scale effectively.
01:29:47
And most of all, it's open.
01:29:49
Open means companies can extend ethernet,
01:29:51
innovating on top as needed to solve new problems.
01:29:56
We've seen that from Hewlett Packard Enterprise
01:29:58
with their Slingshot technology,
01:30:00
which powers the network at the heart of Frontier,
01:30:02
the world's fastest supercomputer,
01:30:04
enabling it to achieve exascale performance.
01:30:08
And we've seen Google and AWS,
01:30:10
who run some of the largest clusters in the world,
01:30:13
develop their own ethernet extensions.
01:30:16
And finally, maybe most importantly,
01:30:18
we've seen the industry come together
01:30:21
to create the Ultra Ethernet Consortium and Standard,
01:30:25
where leaders across the field have united
01:30:27
to drive the future of ethernet
01:30:30
and ensure it's the best high-performance interconnect
01:30:34
for AI and HPC.
01:30:37
And we're proud to welcome to the stage today
01:30:40
some of those networking leaders.
01:30:42
Andy Bechtolsheim from Arista,
01:30:46
Jas Tremblay from Broadcom,
01:30:48
and Jonathan Davidson from Cisco.
01:31:00
Welcome, gentlemen.
01:31:01
It's not often that we have such a panel
01:31:05
of ethernet experts on the stage.
01:31:08
But before we jump right into ethernet,
01:31:12
perhaps we can talk a little bit about the work
01:31:14
of enabling an ecosystem for AI solutions,
01:31:17
and what that looks like,
01:31:18
and why is it so important to have an open approach?
01:31:22
And maybe, Jonathan, you can start.
01:31:24
Sure, absolutely. Well, first of all, congratulations
01:31:26
on all the announcements today.
01:31:29
We look at how ethernet is so critical,
01:31:34
because I remember back in the day
01:31:37
doing testing on 10 megabit ethernet interoperability.
01:31:42
We're now at 400 gig, 800 gig.
01:31:44
We have line of sight to 1.6 terabit.
01:31:46
It is absolutely ubiquitous across the industry,
01:31:49
and it's also interoperable.
01:31:52
It's a beautiful thing.
01:31:53
So that open standard is really important
01:31:55
for us to be able to make this successful.
01:31:59
Absolutely.
01:32:00
And, Jas, your thoughts as well.
01:32:02
No, I 100% agree.
01:32:04
Forrest, you and I share a vision
01:32:06
of the power of the data center ecosystem.
01:32:08
You think about a data center,
01:32:10
you've got thousands of companies coming together
01:32:12
to work as one, and this is really enabled
01:32:16
by open standards and a code of conduct
01:32:19
that says we shall interoperate.
01:32:20
We're gonna make things work together across companies,
01:32:23
in some cases, across competitors,
01:32:25
and I'm especially excited about the work
01:32:27
that you and I have been doing on Infinity Fabric xGMI,
01:32:32
and we wanna let the industry know
01:32:36
that the next generation of Broadcom PCIe switches,
01:32:40
which are used as the internal fabric inside AI servers,
01:32:44
are gonna support Infinity Fabric xGMI,
01:32:46
and we'll be sharing more details around that
01:32:48
over the next few quarters.
01:32:49
But I think it's important that we offer choices
01:32:54
and options to customers,
01:32:55
and that we come together and jointly innovate.
01:32:58
I completely agree,
01:32:59
and Andy, you've long been a proponent of open.
01:33:03
Yeah, well, open standards have been the driving force
01:33:07
for a lot of the innovation
01:33:09
throughout the industry's history,
01:33:11
but nowhere is this more true than in the case of ethernet,
01:33:14
where the incredible progress we've seen
01:33:16
for the last 40 years would not have happened
01:33:19
without the contributions
01:33:21
of many, many ecosystem participants,
01:33:23
including the companies that are represented here
01:33:25
on this stage.
01:33:27
Absolutely, well, okay,
01:33:28
so since this is a panel of ethernet luminaries,
01:33:33
let's talk about ethernet in particular.
01:33:35
What are the advantages of ethernet for AI?
01:33:38
What are the advantages of ethernet in general,
01:33:41
and how are customers using it today?
01:33:43
We'll talk about the future in a minute,
01:33:44
but let's reflect on current state.
01:33:46
Maybe, Andy, you can start out.
01:33:48
Yeah, so ethernet, at least to me,
01:33:51
is the clear choice for AI fabrics,
01:33:54
and for a very basic reason:
01:33:56
it doesn't have a scalability limit.
01:33:58
It can truly support not just tens of thousands of nodes today,
01:34:02
but hundreds of thousands, perhaps even a million nodes in the future,
01:34:05
and there is no other network technology
01:34:08
that has that attribute,
01:34:10
and without that scalability,
01:34:12
you're just boxing yourself in.
01:34:15
Yeah, very true, and Jonathan,
01:34:16
I know you guys have been working quite a bit
01:34:19
on AI networking systems as well.
01:34:22
Maybe you could amplify. Absolutely.
01:34:23
Well, for today specifically,
01:34:25
we see the majority of hyperscalers,
01:34:26
as you've had some of them on the stage today,
01:34:28
are either using ethernet for AI fabrics,
01:34:31
or there's a high desire for them to move to ethernet
01:34:34
for the AI fabrics, and so that requires
01:34:37
a lot of collaboration from the folks up here on stage
01:34:39
to make that happen.
01:34:41
In the past, we have also been helping customers deploy
01:34:45
their AI networks for enterprise use cases globally,
01:34:49
and it might have started more
01:34:50
in the financial trading sector in the past,
01:34:53
but we're seeing a tremendous amount
01:34:54
of interest in use cases for that whole system
01:34:58
and how you pull all those things together
01:35:00
from the network, the GPU, the NIC, the DPU,
01:35:03
all the way to how you wrap the software around that
01:35:06
to really make it simple and understand
01:35:09
how things are working, and when they're not working,
01:35:11
why, and making that simple for them to do that as well.
01:35:14
Absolutely, and Jas, I know, well, all of us
01:35:17
have been working together in deploying
01:35:19
ethernet-based solutions for AI leaders today,
01:35:23
but I mean, we've been working with the two gentlemen
01:35:27
on the end on switching, but Jas,
01:35:30
maybe you can reflect on the NIC as well.
01:35:33
I think the NIC is critical.
01:35:35
People want choices, and we need to move the innovation
01:35:39
even faster in the NIC, and you'll see much more linkages
01:35:44
between the NIC and the switch,
01:35:45
where before you had a compute domain and a network domain,
01:35:50
and these things are really coming together,
01:35:53
and AI is a driving force of that,
01:35:55
because the complexity is going up so much.
01:35:57
Yeah, absolutely.
01:35:58
Well, okay, so let's talk about the future a little bit.
01:36:01
You know, with the Ultra Ethernet Consortium, all three,
01:36:06
all four companies on stage are founding members,
01:36:10
and there's many others that have joined.
01:36:14
You know, UEC is one of the fastest growing,
01:36:17
or maybe the fastest growing consortium
01:36:20
under the Linux Foundation, which has been great to see.
01:36:23
It's gonna shape, I think, UEC is gonna shape
01:36:25
the future of AI networking, and so let's unpack that,
01:36:29
because I think that's a critical topic for folks.
01:36:31
And maybe, Jas, why don't you go ahead and start off.
01:36:34
Yeah, so first of all, ethernet is ready today for AI,
01:36:38
but we need to continue to innovate,
01:36:41
and UEC started with a group of eight companies,
01:36:44
including four of our companies here, cloud providers,
01:36:48
system providers, and semiconductor providers,
01:36:52
coming together around a common vision,
01:36:54
and the vision is AI networks need to be open,
01:36:58
standards-based, we need to offer choices,
01:37:01
and we need to enhance them.
01:37:03
And with that common vision, you know,
01:37:05
the engineers we've assigned from our companies
01:37:07
really got together and rolled up their sleeves,
01:37:10
and the innovation happened extremely quickly.
01:37:14
It's quite exciting, actually.
01:37:15
And one of the things that I'm most excited about this
01:37:19
is we're not building something new.
01:37:22
We are jointly going to enhance ethernet
01:37:26
that's existed for 50 years.
01:37:29
So it's not starting from scratch, it's enhancing,
01:37:30
it's recognizing that ethernet is what people want.
01:37:33
We just need to continue to enhance it
01:37:35
and keep it open and standards-based.
01:37:37
Absolutely, and Jonathan, I know Cisco's been
01:37:41
a huge proponent of UEC as well.
01:37:43
Maybe you can reflect on your thoughts
01:37:45
of where this is going.
01:37:46
Absolutely, well, I think that UEC absolutely
01:37:49
is very critical for Cisco, everyone on the panel,
01:37:52
and the whole industry so that we can continue
01:37:54
to drive that movement towards open.
01:37:59
It always takes time. You gotta debate what are the right
01:38:00
technical ways to solve things,
01:38:01
but I think that overall it's moving in the right direction.
01:38:04
What I see happening here is that
01:38:06
we're gonna have to have interoperability
01:38:08
in more than just one area.
01:38:10
Andy, I wanna talk about LPO and all the things
01:38:12
that we need to do there to make that actually happen.
01:38:17
And what's happening at UEC is another important part.
01:38:20
And what I see happening between now
01:38:22
and when the first standard comes out
01:38:24
is really a coalition of the willing.
01:38:25
Like, how do we get all of us together
01:38:27
to drive towards those open interfaces,
01:38:30
whether it be at the ethernet layer,
01:38:32
whether it be at things that you need to plug into it,
01:38:34
how the GPUs connect into that,
01:38:36
how you're actually gonna spray traffic
01:38:38
across a very broad radix,
01:38:40
how you're gonna make sure you can reorder packets
01:38:42
in a consistent way.
01:38:43
These are all things that we need to make sure
01:38:45
that we are driving towards
01:38:47
from an interoperability perspective.
01:38:49
And we've got our own silicon, we've got optics,
01:38:52
but we also are in the component business at Cisco.
01:38:55
And so we sell those things.
01:38:57
Hyperscalers might wanna just buy pieces from us,
01:38:59
like the silicon, and enterprises may want the full system.
01:39:02
But we wanna make sure that it's absolutely 100%
01:39:04
interoperable in every single environment.
01:39:07
Absolutely.
01:39:08
And Andy, maybe you can hone in a little bit more.
01:39:11
I mean, I think many people that aren't familiar
01:39:14
with networking may think, hey, how hard can this be?
01:39:17
We're just shuffling bits around between systems,
01:39:20
but there's a lot of problems to solve.
01:39:22
Yeah, so UEC is in fact solving
01:39:25
a very important technical problem,
01:39:27
which is the way we describe it is modern RDMA at scale.
01:39:32
And this has not been solved before.
01:39:34
To be clear, you know, RoCE today exists,
01:39:36
but it has its limitations.
01:39:38
And it does take an ecosystem effort approach,
01:39:42
and it involves in particular the adapter,
01:39:45
the NIC silicon vendors,
01:39:47
but also the whole end-to-end interoperability
01:39:50
of that architecture.
01:39:53
We're very excited to be part of this.
01:39:55
We're not in the NIC business ourselves,
01:39:56
but this is absolutely key to enable scaling of RDMA
01:40:01
across hundreds of thousands, if not a million nodes.
01:40:04
Yeah, absolutely.
01:40:05
And when you look at what's being predicted
01:40:08
in terms of hundreds of thousands
01:40:11
up to a million node systems,
01:40:15
I mean, we all have our work cut out for us,
01:40:18
but working together, I know we can solve the problems.
01:40:21
Well, guys, thanks so much for coming to talk to us today.
01:40:26
I'd like to thank you all for your partnership
01:40:28
in this journey, and thank you all for coming today.
01:40:31
Thanks very much.
01:40:32
Thanks, guys.
01:40:34
Thank you so much.
01:40:39
I'd really like to thank our partners
01:40:40
from Arista, Broadcom, and Cisco
01:40:43
for attending and for their partnership
01:40:45
in driving this critical third leg
01:40:48
that determines the performance of AI systems.
01:40:51
Now, let's turn our focus to high-performance computing,
01:40:55
the traditional realm of the world's largest systems.
01:40:59
AMD has been driving HPC technology for many years.
01:41:03
In 2021, we delivered the MI250,
01:41:07
introducing third-generation Infinity architecture.
01:41:10
It connected an EPYC CPU to the MI250 GPU
01:41:14
through a high-speed bus, Infinity Fabric.
01:41:17
That allowed the CPU and the GPU
01:41:19
to share a coherent memory space
01:41:22
and easily trade data back and forth,
01:41:24
simplifying programming and speeding up processing.
01:41:28
But today, we're taking that concept one step further,
01:41:32
really to its logical conclusion,
01:41:35
with the fourth-generation Infinity architecture
01:41:38
bringing the CPU and the GPU together into one package,
01:41:42
sharing a unified pool of memory.
01:41:46
This is an APU, an Accelerated Processing Unit.
01:41:50
And I'm very proud to say
01:41:52
that the industry's first data center APU for AI and HPC,
01:41:56
the MI300A, began volume production earlier this quarter
01:42:01
and is now being built into what we expect
01:42:03
to be the world's highest-performing system.
01:42:12
Now, Lisa already showed you
01:42:13
what our chiplet technologies make possible with the MI300X.
01:42:19
The MI300A takes those same building blocks
01:42:23
in a slightly different fashion.
01:42:25
Now, the IO die is laid down first, as before,
01:42:28
and contains the infinity cache
01:42:30
and connections to memory and IO.
01:42:32
The XCD accelerator chiplets are bonded on top,
01:42:35
as in the MI300X.
01:42:38
But with the MI300A, we also take CPU chiplets
01:42:43
leveraged directly from our fourth-generation
01:42:46
EPYC CPUs, Genoa, and we put those
01:42:50
on top of the IODs as well,
01:42:53
thus bringing together our leading CPU,
01:42:57
Zen, and CDNA technologies into one amazing part.
01:43:03
Finally, eight stacks of HBM3
01:43:05
with up to 128 gigs of capacity complete the MI300A.
01:43:11
A key advantage of the APU is no longer needing
01:43:15
to copy data from one processor to another,
01:43:19
even through a coherent link,
01:43:22
because the memory is unified,
01:43:25
both in the RAM as well as in the cache.
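As a conceptual sketch of what goes away, here is the explicit-copy pattern a discrete GPU requires, in PyTorch-style Python (illustrative only):

```python
import torch

# Discrete GPU: host-resident data must cross a link before GPU work.
host_tensor = torch.randn(1 << 20)       # lives in CPU DRAM
device_tensor = host_tensor.to("cuda")   # explicit copy over the bus
result = device_tensor.sum()

# On an APU with unified memory, CPU and GPU address the same physical
# HBM pool, so this copy step (and its latency) disappears for code
# tuned to the architecture.
```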
01:43:29
The second advantage is the ability
01:43:31
to optimize power management between the CPU and the GPU.
01:43:35
That means dynamically shifting power
01:43:38
from one processor to another,
01:43:40
depending on the needs of the workload,
01:43:42
optimizing application performance.
01:43:45
And very importantly, an APU can dramatically
01:43:49
streamline programming, making it easier
01:43:52
for HPC users to unlock its full performance.
01:43:56
And let's talk about that performance.
01:43:58
61 teraflops of double-precision floating point, FP64.
01:44:04
122 teraflops of single-precision.
01:44:08
Combined with that 128 gigabytes of HBM3 memory
01:44:11
at 5.3 terabytes a second of bandwidth,
01:44:14
the capabilities of the MI300A are impressive.
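Taken together, those figures imply a roofline-style balance point; a quick calculation on the quoted numbers:

```python
# Quoted MI300A figures from the talk.
fp64_flops = 61e12        # 61 TFLOPs double precision
fp32_flops = 122e12       # 122 TFLOPs single precision
bandwidth = 5.3e12        # 5.3 TB/s of HBM3 bandwidth

# FLOPs that must be done per byte read to stay compute-bound:
print(f"FP64 balance: {fp64_flops / bandwidth:.1f} flop/byte")  # ~11.5
print(f"FP32 balance: {fp32_flops / bandwidth:.1f} flop/byte")  # ~23.0
```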
01:44:19
And they're impressive, too,
01:44:20
when you compare it to the alternative.
01:44:23
When you look at the competition,
01:44:25
the MI300A has 1.6 times the memory capacity
01:44:29
and bandwidth of Hopper.
01:44:32
For low-precision operations like FP16,
01:44:34
the two are at parity in terms
01:44:37
of computational performance.
01:44:39
But where precision is needed,
01:44:42
MI300A delivers 1.8 times the double and single-precision
01:44:49
FP64 and FP32 floating point performance.
01:44:54
And beyond simple benchmarks,
01:44:55
the real advantages of an APU come with the performance
01:44:59
of real-world applications which have been tuned
01:45:02
for the APU architecture.
01:45:04
For example, let's look at OpenFOAM.
01:45:07
OpenFOAM is a set of computational fluid dynamics codes
01:45:10
widely used across research, academia, and industry.
01:45:15
With MI300A, we see four times the performance
01:45:20
of Hopper on common OpenFOAM codes.
01:45:23
Now, that performance comes from several places,
01:45:32
from higher performance math operations as we talked,
01:45:36
larger memory and the increased memory bandwidth.
01:45:39
But much of that uplift really comes
01:45:41
from that unified memory eliminating the need
01:45:43
to copy data around the system.
01:45:45
For tuned applications, that can deliver
01:45:48
truly transformative performance.
01:45:52
And I'm also proud to say that beyond performance,
01:45:54
AMD has stayed true to its heritage,
01:45:58
to its history of leading in power efficiency.
01:46:02
At the node level, the MI300A has twice
01:46:06
the HPC performance per watt of the nearest competitor.
01:46:10
Customers can thus fit more nodes
01:46:13
into their overall facility power budget
01:46:17
and better support their sustainability goals.
01:46:21
With the MI300A, we set out to help our customers
01:46:25
advance the frontiers of research
01:46:28
and not just running traditional HPC applications.
01:46:32
One of the most exciting new areas in HPC
01:46:35
is actually the convergence with AI,
01:46:38
where AI is used in conjunction with HPC techniques
01:46:42
to help steer simulations,
01:46:45
thus getting much better results much faster.
01:46:49
A great example of this is CosmoFlow.
01:46:51
It couples deep learning
01:46:53
with traditional HPC simulation methods,
01:46:57
giving researchers the ability to probe more deeply
01:47:00
and allowing us to learn more about the universe at scale.
01:47:05
CosmoFlow is one of the first applications
01:47:07
targeted to be run on El Capitan,
01:47:09
which we believe will be the industry's first true
01:47:13
two exaflop supercomputer running double precision float
01:47:17
when it's fully commissioned
01:47:19
at Lawrence Livermore National Labs.
01:47:22
It's gonna be an amazing machine.
01:47:29
So let's hear more about El Capitan
01:47:32
and its applications for HPC and AI
01:47:35
from our partners at LLNL and Hewlett Packard Enterprise.
01:47:40
We expect El Capitan to be an engine
01:47:43
for artificial intelligence and deep learning.
01:47:47
We will recreate the experimental environment in simulation,
01:47:51
generate lots of data, for example,
01:47:53
and then train our artificial intelligence methods
01:47:55
on that simulation data.
01:47:58
El Capitan will be the most capable AI machine
01:48:01
and its use of APUs at this scale
01:48:04
will be the first of its kind.
01:48:07
As you operate these exascale level workloads,
01:48:11
all of those nodes talk to each other.
01:48:14
AMD and HPE have a long legacy of partnership
01:48:17
and it was only natural for us to partner again
01:48:19
for El Capitan.
01:48:21
The MI300A can be versatile across many different workloads
01:48:25
and we couple it directly with our slingshot fabric
01:48:28
to give it high performance as it operates as a system.
01:48:31
We work very closely with AMD and HPE
01:48:34
to deliver the hardware and the software
01:48:36
that's actually used by the scientists
01:48:38
in the machine itself.
01:48:39
It's really that partnership together
01:48:41
that can really go after and build these supercomputers.
01:48:45
El Capitan will be 16 times faster
01:48:48
than our existing machine here at Lawrence Livermore.
01:48:51
It will enable scientific breakthroughs
01:48:53
that we can't even imagine.
01:49:03
We're proud to have partnered
01:49:05
with Hewlett Packard Enterprise to design
01:49:07
and now build this amazing system.
01:49:10
And so I'd like to invite to the stage Trish Damkroger,
01:49:13
the Senior Vice President and Chief Product Officer
01:49:16
for HPC AI and Labs from Hewlett Packard Enterprise.
01:49:24
Thank you.
01:49:26
Welcome, Trish.
01:49:27
The AMD and HPE teams have been working closely together
01:49:31
over the years to deliver some next generation supercomputers.
01:49:35
Most recently, of course, we've broken the exascale barrier.
01:49:40
I gotta say that again. We broke the exascale barrier
01:49:46
with Frontier for Oak Ridge National Labs.
01:49:48
And now we're looking forward to powering
01:49:50
another exascale system and another benchmark,
01:49:53
another record with you with El Capitan
01:49:57
for Lawrence Livermore National Labs,
01:49:58
another US Department of Energy lab.
01:50:02
Maybe you can share more with this audience
01:50:04
about our journey together and the innovations
01:50:06
that we've ushered in this journey to exascale.
01:50:10
Sure.
01:50:11
First, I wanna echo the long partnership
01:50:13
that we've had with AMD.
01:50:15
Frontier continues to be the fastest computer in the world.
01:50:20
Many doubted our ability to actually reach exascale,
01:50:24
but we were able to achieve this feat
01:50:26
with industry-leading liquid cooling infrastructure,
01:50:30
next-generation high-performance interconnect
01:50:32
with Slingshot, our highly differentiated system management
01:50:36
and Cray programming environment software,
01:50:39
along with the incredible MI250.
01:50:42
With Frontier, exascale computing
01:50:44
has already made breakthroughs in areas such as aerospace,
01:50:47
climate modeling, healthcare, and nuclear physics.
01:50:51
Frontier is also one of the world's
01:50:53
top 10 greenest supercomputers.
01:50:56
In fact, HPE and AMD have the majority
01:51:00
of the world's top 10 energy-efficient supercomputers.
01:51:09
I am very excited to deliver El Capitan to Lawrence Livermore.
01:51:13
As you know, I worked there for over 15 years.
01:51:16
El Capitan's computing prowess will fundamentally shift
01:51:20
what the scientists and engineers will be able to achieve.
01:51:24
El Capitan's gonna be 15 to 20 times faster
01:51:27
than their current system.
01:51:29
Supercomputing is truly essential
01:51:31
to the mission of the Department of Energy.
01:51:34
Lawrence Livermore has been at the forefront
01:51:37
driving the convergence of HPC and AI,
01:51:40
demonstrated by work at the National Ignition Facility
01:51:43
and other national security programs.
01:51:45
I'm really looking forward to continuing our journey
01:51:48
of bringing more leadership-class systems to the world.
01:51:52
Absolutely.
01:51:53
I couldn't agree more, Trish.
01:51:54
It's been a rewarding journey working together with HPE.
01:51:58
But speaking of our shared success
01:52:03
in building these record-breaking systems,
01:52:06
can you tell us a bit more about El Capitan
01:52:09
and how HPE is bringing the Instinct MI300A-powered
01:52:16
APU to El Capitan?
01:52:18
Great, yes.
01:52:19
El Capitan will feature the HPE Cray EX supercomputer
01:52:23
with the MI300A accelerators
01:52:26
to power large AI-driven scientific projects.
01:52:29
The HPE Cray EX supercomputer was built from the ground up
01:52:34
with end-to-end capabilities
01:52:35
to support the magnitude of exascale.
01:52:38
El Capitan nodes include the MI300A,
01:52:42
coupled with our Slingshot Fabric
01:52:44
to operate as a fully integrated system.
01:52:47
Supercomputing is the foundation needed for large-scale AI,
01:52:51
and HPE is uniquely positioned to deliver this
01:52:54
with our Cray supercomputers.
01:52:56
El Capitan will be that engine for AI
01:52:59
and deep learning for the Department of Energy.
01:53:02
They will be recreating the experimental environment
01:53:05
and simulations and training the AI models
01:53:08
with all of that vast amount of data.
01:53:11
El Capitan will be one of the most capable AI systems
01:53:15
in the world.
01:53:17
And beyond El Capitan, we're excited to have expanded
01:53:20
our supercomputing portfolio with the MI300A
01:53:23
to bring next-generation accelerated compute
01:53:26
to a broad set of customers.
01:53:28
Yeah, so Trish, that's fantastic.
01:53:30
And actually, let's double-click into that a little bit more.
01:53:33
I know that there are a growing number
01:53:35
of supercomputing customers, not just at LLNL,
01:53:40
that are really applying AI to their projects.
01:53:42
Can you tell us a little bit even more about that?
01:53:45
Sure, so AI undoubtedly will be the catalyst
01:53:48
to transform scientific research.
01:53:51
As I said earlier, supercomputing is the foundation
01:53:54
needed to run AI.
01:53:56
And HPE is the undisputed leader
01:53:58
in delivering supercomputers.
01:54:00
Some examples where AI will be fundamental in El Capitan
01:54:04
include the National Ignition Facility,
01:54:07
where they will be using 1D, 2D, 3D simulations,
01:54:11
along with trained AI models to develop a more robust design
01:54:16
for higher-yield fusion reactions.
01:54:18
Just imagine fusion energy in our future.
01:54:21
Another application is high-resolution
01:54:23
earthquake modeling, essential for understanding
01:54:26
building structural integrity and also emergency planning.
01:54:30
And one more application is bioassurance,
01:54:32
where simulation and AI models will be key
01:54:35
in developing rapid therapeutics.
01:54:38
Supercomputing and AI are tools that allow engineers
01:54:41
and scientists to find the unknown.
01:54:45
I'm thrilled to be part of the journey
01:54:47
of accelerating scientific discovery
01:54:49
and the scale of impact it has
01:54:52
on changing the way people live and work.
01:54:55
Fantastic.
01:54:56
Well, Trish, thank you.
01:54:57
I'm so excited about the opportunities
01:54:59
that researchers and scientists will have
01:55:02
with the systems that we're bringing to the market together.
01:55:05
Thanks so much.
01:55:06
Thank you.
01:55:09
Yeah, on behalf of AMD and the entire team,
01:55:12
I really wanna just thank HPE and our customers
01:55:16
for the opportunity to participate
01:55:18
in the development of these massive systems.
01:55:20
Because El Capitan will be an amazing machine
01:55:23
and a real showcase for the MI300A,
01:55:27
which defines leadership at this critical junction
01:55:30
as HPC and AI converge.
01:55:33
AMD is proud of the leadership systems powered by MI300A,
01:55:37
which will be available soon from partners around the world.
01:55:41
I can't wait to see what researchers and scientists
01:55:45
are gonna do with these systems.
01:55:47
And with that, I'd like to welcome Lisa back on stage
01:55:51
to conclude our journey today.
01:55:53
Thank you.
01:56:00
All right, thank you, Forrest,
01:56:02
and thank you to all of our partners who joined us.
01:56:05
You've heard from Victor, Forrest, our key partners.
01:56:08
We have significant momentum,
01:56:10
and we're building on that for the data center AI platforms.
01:56:14
To cap off the day,
01:56:15
let me now talk about another important area for AMD
01:56:18
where we're delivering leadership AI solutions,
01:56:20
and that's the PC.
01:56:22
Now, for the PCs, we recognized several years ago
01:56:25
that on-chip AI accelerators, or NPUs,
01:56:28
would be very, very important for next generation PCs.
01:56:31
And the NPU is actually the compute engine
01:56:34
that will enable us to reimagine what it means
01:56:37
to build a truly intelligent and personal PC experience.
01:56:41
At AMD, we're on actually a multi-year journey.
01:56:43
We have a strong roadmap to deliver the highest performance
01:56:46
and most power-efficient NPUs possible.
01:56:49
We were actually the first company to integrate an NPU
01:56:52
into an x86 processor
01:56:54
when we launched the Ryzen 7040 Mobile Series earlier this year,
01:56:58
and we integrated the XDNA architecture
01:57:01
that actually came from our acquisition of Xilinx.
01:57:04
It actually took us less than a year
01:57:05
to bring Xilinx's proven technology into our PC products.
01:57:11
Let me tell you a little bit about XDNA.
01:57:13
It's a scalable and adaptive computing architecture.
01:57:15
It's built around a large computing array
01:57:18
that can efficiently transfer the massive amounts of data
01:57:21
required for AI inference.
01:57:23
And as a result, XDNA is both extremely performant
01:57:27
and also very energy-efficient.
01:57:29
So you can run multiple AI workloads
01:57:31
simultaneously in real time.
01:57:34
Now, I'm happy to say that we've already shipped millions
01:57:37
of Ryzen AI-enabled PCs into the market
01:57:40
with all of the leading PC OEMs
01:57:42
and all of this provides the hardware foundation
01:57:45
for developers to leverage this first wave of AI PCs.
01:57:49
Now, if you look at some of the applications,
01:57:51
today Ryzen AI powers hundreds of different AI functions,
01:57:55
things like advanced motion tracking
01:57:57
and sharpening to deblur 4K video,
01:58:00
enabling production-level digital capabilities
01:58:03
with unlimited virtual cameras,
01:58:05
all in an ultra-thin notebook for the very first time.
01:58:09
We're also working with key software leaders
01:58:11
like Adobe and Blackmagic,
01:58:13
and they're using our on-chip Radeon GPU
01:58:16
to accelerate the AI-enabled editing features
01:58:19
so that you can dramatically improve productivity
01:58:22
for content creators.
01:58:24
And of course, we've worked very, very closely with Microsoft
01:58:27
to enable Windows 11 Studio Effects on Ryzen AI.
01:58:31
Now, today we're launching some additional capabilities.
01:58:34
So Ryzen AI 1.0 software,
01:58:37
it will make it easier for developers
01:58:38
to add advanced gen AI capabilities.
01:58:41
So with this new package,
01:58:43
developers can create an AI-enabled application
01:58:46
that's ready to run on Ryzen AI hardware
01:58:50
just by choosing a pre-trained model.
01:58:52
So for example, you can choose one of the models
01:58:54
that are available on Hugging Face,
01:58:57
quantize it based on your needs,
01:58:58
and then deploy it through ONNX Runtime.
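A minimal sketch of that flow, with hypothetical file names and an assumed NPU execution-provider name (CPU as fallback):

```python
import numpy as np
import onnxruntime as ort
from onnxruntime.quantization import quantize_dynamic, QuantType

# 1) Start from a pre-trained model exported to ONNX (paths illustrative),
#    e.g. one pulled from Hugging Face, and quantize it for the NPU.
quantize_dynamic("model.onnx", "model_int8.onnx",
                 weight_type=QuantType.QInt8)

# 2) Run it through ONNX Runtime. The Vitis AI execution provider name
#    is assumed here as the Ryzen AI NPU target; CPU is the fallback.
session = ort.InferenceSession(
    "model_int8.onnx",
    providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
)

# Input name and shape depend on the chosen model; placeholders here.
dummy = np.zeros((1, 16), dtype=np.int64)
outputs = session.run(None, {"input_ids": dummy})
```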
01:59:01
So this is a major step forward
01:59:03
when you think about the broad ecosystem
01:59:06
that wants to run AI apps for Windows,
01:59:08
and we can't wait to see what ISVs will do
01:59:11
when they really capture the leadership performance
01:59:14
that you can get from an NPU in Ryzen AI.
01:59:18
Now, of course, we know developers
01:59:20
always want more AI compute.
01:59:22
So today, I'm very happy to say that we're launching
01:59:25
our "Hawk Point" Ryzen 8040 Series Mobile Processors.
01:59:29
And,
01:59:31
Thank you.
01:59:35
"Hawk Point" combines all of our industry-leading performance
01:59:38
and battery life, and it increases AI TOPS by 60%
01:59:42
compared to the previous generation.
01:59:44
So if you just take a look at some of the performance metrics
01:59:47
for the Ryzen 8040 Series,
01:59:50
if you look at the top of the stack,
01:59:51
so Ryzen 9 8945,
01:59:53
it's actually significantly faster
01:59:55
than the competition in many areas,
01:59:57
delivering more performance for multi-threaded applications,
02:00:01
1.8x higher frame rates for games,
02:00:04
and 1.4x faster performance
02:00:06
across content creation applications.
02:00:08
But when you look at the AI improvements of Ryzen 8040,
02:00:12
you really see some substantial improvements.
02:00:15
So I talked about additional TOPS in "Hawk Point",
02:00:18
and what that results in is faster performance
02:00:21
when you're running the key models.
02:00:22
So things like Llama 2 7B, we run 1.4x faster,
02:00:27
and also 1.4x faster on things like AI image recognition
02:00:31
and object detection models.
02:00:33
So all of this, what does it do?
02:00:35
It provides faster response times
02:00:37
and overall better experiences.
02:00:40
Now, I really believe that we're actually at the beginning
02:00:43
of this AI PC journey,
02:00:45
and it's something that is really gonna change
02:00:47
the way we think about productivity at a personal level.
02:00:50
So we've been working very closely with Microsoft
02:00:52
to ensure that we are co-innovating
02:00:54
across hardware and software
02:00:56
to enable those next generation of AI PCs.
02:00:59
To share more about this work,
02:01:01
I'm pleased to welcome Pavan Davuluri,
02:01:03
Corporate Vice President of Windows and Devices
02:01:05
at Microsoft to the stage.
02:01:12
Hey, how are you?
02:01:13
Great to be here.
02:01:14
Pavan, thank you so much for being here.
02:01:16
We started the show with Kevin Scott
02:01:18
talking about the great partnership
02:01:20
between Microsoft and AMD,
02:01:21
and all the work we're doing on the big iron,
02:01:24
and the cloud, and Azure.
02:01:25
And it seemed fitting that we close the show
02:01:28
with the other very, very important work
02:01:31
that we're doing together on the client side.
02:01:33
So can you tell us a little bit, Pavan,
02:01:36
about all the great work and your vision for client AI?
02:01:40
For sure.
02:01:41
As you and Kevin covered,
02:01:42
Microsoft and AMD have a long partnership together
02:01:45
across Azure and Windows.
02:01:47
And it's incredible to see us moving that partnership
02:01:49
together into the next wave of technology with AI.
02:01:53
As you shared, Lisa, for us,
02:01:54
there are millions of PCs right now
02:01:56
with Ryzen AI-enabled 7040 Series processors in market.
02:01:59
And that's amazing because these are the first x86 PCs
02:02:02
with integrated NPUs, enabling enhanced AI experiences.
02:02:05
You told me everybody wanted NPUs.
02:02:07
Absolutely.
02:02:08
And you know, right now we get to see
02:02:09
some incredible AI features.
02:02:11
Somebody talked about Windows Studio Effects coming to life
02:02:13
across the scale of the ecosystem.
02:02:15
Absolutely fantastic, I would say.
02:02:17
Now, for us at Microsoft and for the ecosystem,
02:02:20
our marquee AI experience is really Copilot.
02:02:24
Similar to how the start button is the gateway into Windows,
02:02:27
the Copilot for us is the entry point
02:02:29
into this world of AI on the PC.
02:02:33
It has a fundamental impact on everything
02:02:35
we will do on a computer, from work and school
02:02:37
and play and entertainment and creation.
02:02:40
You know, I completely agree, Pavan.
02:02:41
I think Copilot is so transformational.
02:02:44
I mean, for everyone who's had a chance to experience it,
02:02:46
it really changes the way we do work.
02:02:49
So let's talk about the tech that's underneath it.
02:02:52
So to enable Copilot and everything
02:02:55
that we want to do on PCs,
02:02:56
we are putting together new system architectures
02:02:58
that really power those experiences going forward
02:03:01
and they really pull together GPU, NPU
02:03:04
and certainly the cloud as well.
02:03:06
And quite honestly, we're seeing customer habits
02:03:08
change early at this point in time
02:03:10
and we believe to your point earlier,
02:03:11
we're early in the cycle of innovation that's coming.
02:03:15
When we have these powerful NPUs
02:03:16
like the ones you're building,
02:03:18
it gives us an opportunity to create apps
02:03:20
that take advantage of both local and cloud inferencing.
02:03:23
And to me, that's what the Windows AI ecosystem is about
02:03:26
and that's what we're building in partnership with you.
02:03:28
It's designed to enable those scenarios
02:03:31
with the ONNX runtime of course
02:03:33
and the Olive tool chain to back this up.
02:03:35
Applications are gonna have many models
02:03:37
like Llama that you mentioned, or Phi-2, running
02:03:39
and they will run very capably in the TOPS that we will have.
02:03:42
And of course, not to mention the foundation models
02:03:45
that are powered by the GPUs in the cloud.
02:03:47
Yeah, I mean, I think this is an area
02:03:49
where Microsoft and AMD really have a very unique position
02:03:52
because we have so much capability in the cloud,
02:03:55
we also have access to the client and the local view.
02:03:59
Can you share a bit about how we're thinking about
02:04:01
all of these, the cloud and the local view?
02:04:04
Yeah, with AMD, we're making it simpler to incorporate
02:04:07
what we call the hybrid pattern
02:04:09
or the hybrid loop into applications.
02:04:11
And we wanna be able to load shift between the cloud
02:04:13
and the client to provide the best of computing
02:04:15
across both those worlds.
02:04:17
For us, it's really about seamless computing
02:04:20
across the cloud and the client.
02:04:22
It brings together the benefits of local compute,
02:04:24
things like enhanced privacy and responsiveness
02:04:26
and latency with the power of the cloud,
02:04:29
high performance models, large data sets,
02:04:32
cross platform inferencing.
02:04:33
And so for us, we feel like we're working together
02:04:36
to build that future where Windows is the destination
02:04:39
for the best AI experiences on PCs.
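
As a rough illustration of that load-shifting idea, here is a small Python sketch that prefers a local NPU-backed ONNX Runtime session and otherwise ships the request to a cloud endpoint. The endpoint URL and the simple routing policy are hypothetical stand-ins for illustration; the actual hybrid-loop tooling is considerably richer than this.

    # Toy sketch of the hybrid pattern: run locally when an NPU session can be
    # created, otherwise fall back to a (hypothetical) cloud inference service.
    import numpy as np
    import onnxruntime as ort
    import requests

    CLOUD_ENDPOINT = "https://example.com/v1/infer"  # hypothetical service

    def local_session(model_path):
        """Try to create an NPU-backed session; return None if unavailable."""
        try:
            return ort.InferenceSession(
                model_path,
                providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
            )
        except Exception:
            return None

    def infer(model_path, inputs):
        """Prefer local compute (privacy, responsiveness); else use the cloud."""
        session = local_session(model_path)
        if session is not None:
            return session.run(None, inputs)
        # Cloud path: larger models, large datasets, cross-platform inference.
        payload = {name: arr.tolist() for name, arr in inputs.items()}
        resp = requests.post(CLOUD_ENDPOINT, json=payload, timeout=30)
        resp.raise_for_status()
        return resp.json()
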
02:04:42
Yeah, no, I think that sounds great.
02:04:44
Now, one of the things though that you definitely
02:04:46
are always talking to me about is more TOPS.
02:04:50
I ask for more TOPS all the time.
02:04:52
So look, we completely believe that to enable
02:04:57
your vision for AI experiences,
02:05:00
we've really thought about how we actually accelerate
02:05:02
our client AI roadmap.
02:05:03
So I wanna share a little bit of our roadmap today.
02:05:07
Ryzen 7040 and 8040, we've already delivered those
02:05:10
industry-leading NPU capabilities.
02:05:12
But today, I'm very excited to announce
02:05:14
that our next-gen Strix Point Ryzen processors
02:05:17
will actually include a new NPU
02:05:19
powered by our second-generation XDNA architecture
02:05:22
coming in 2024.
02:05:24
Congratulations.
02:05:25
Thank you.
02:05:29
So a little bit about XDNA 2.
02:05:32
It's designed for leadership generative AI performance.
02:05:35
It delivers more than three times the NPU performance
02:05:38
of our current Ryzen 7040 series.
02:05:41
And Pavan, I'm very happy to share,
02:05:42
I know your teams already know this
02:05:44
because you have the silicon,
02:05:45
but today, Strix Point is running great in our labs
02:05:48
and we're really excited about it.
02:05:50
Our teams have been working really closely together
02:05:52
to make sure that all of those great future Windows AI
02:05:55
features run really well on Strix Point.
02:05:58
So I can't wait to share more about that later this year.
02:06:02
Lisa, that's awesome.
02:06:03
And we will use every TOP you will provide us.
02:06:05
You promised, right?
02:06:07
Absolutely.
02:06:09
And it's not just the size of the neural engines,
02:06:12
it's the dramatic increase in efficiency and performance per watt
02:06:15
of these next-generation NPUs.
02:06:17
We think they'll bring a whole new level of capabilities
02:06:19
to the market, enabling personalization
02:06:21
on every interaction on these devices.
02:06:24
Together with Windows, we feel like we're building
02:06:26
that future for the Copilot where we will orchestrate
02:06:28
multiple apps and services across devices, quite frankly,
02:06:32
functioning as an agent in your life that has context
02:06:35
and maintains context across entire workflows.
02:06:38
So we're very excited about these devices coming to life
02:06:40
for the Windows ecosystem.
02:06:41
We're excited to see what developers will do
02:06:43
with this technology.
02:06:44
And quite frankly, at the end of the day,
02:06:45
ultimately, what customers will do
02:06:47
with all of this innovation.
02:06:49
Thank you so much, Pavan.
02:06:50
We are so excited about the partnership.
02:06:52
We appreciate all the long-term work we're doing together
02:06:55
and look forward to lots of great things to come.
02:06:58
Thank you for having me, Lisa.
02:06:59
Thank you, Pavan.
02:07:00
Thank you. Thank you.
02:07:05
All right, so it's been such a fun day,
02:07:07
but now it's time for me to wrap up a bit.
02:07:09
We've shown you a lot of new products,
02:07:11
a lot of new platforms, a lot of new technologies
02:07:14
that are all about taking AI infrastructure
02:07:17
to the next level.
02:07:18
MI300X, MI300A accelerators,
02:07:21
these are all shipping today in production.
02:07:23
They're already being adopted by Microsoft, Oracle,
02:07:26
Meta, Dell, Hewlett Packard Enterprise, Lenovo, Supermicro,
02:07:30
and many others.
02:07:32
You heard from Victor how we're expanding the ecosystem
02:07:34
of AI developers working with us,
02:07:37
with ROCm 6 software and the open ecosystem;
02:07:40
our goal is to make it incredibly easy
02:07:42
for everyone to use Instinct GPUs.
02:07:45
You heard from Forrest in our panel
02:07:47
on the overall system architecture,
02:07:48
and about our work with Arista, Broadcom, and Cisco.
02:07:51
We believe that to create this high-performance
02:07:54
AI infrastructure, it has to be open,
02:07:56
and that's what we're doing together
02:07:58
for scale-out AI solutions.
02:08:00
And then you heard what we're doing on the other side,
02:08:02
the client part of our business,
02:08:04
because we actually believe AI should be everywhere.
02:08:07
So our latest Ryzen processors really extend
02:08:10
our compute vision and our AI leadership.
02:08:12
I hope you can see that AI is absolutely
02:08:16
the number one priority at AMD.
02:08:18
Our goal is to push the envelope,
02:08:20
to bring innovation to the market,
02:08:22
to do more than anyone thought was possible,
02:08:24
because we believe, as wonderful as our technology is,
02:08:28
it is about doing it together in a partner ecosystem
02:08:32
where everybody brings their best to the market.
02:08:35
Today is a...
02:08:44
I want to say on a personal level,
02:08:46
today is an incredibly proud moment for AMD.
02:08:48
If you think about all of the innovation,
02:08:51
everything that we bring to the market,
02:08:53
to be part of AI at this time,
02:08:56
at the beginning of this era,
02:08:58
to work with these amazing people throughout the industry,
02:09:01
throughout the ecosystem, and at AMD,
02:09:05
I can say that I've never seen something more exciting.
02:09:08
A very, very special thank you
02:09:09
to all of our partners who joined us today,
02:09:11
and thank you all for joining us.

Description:

Join us to discover how AMD and its partners are powering the future of AI. Learn more: https://www.amd.com/en/corporate/events/advancing-ai.html
