1 00:00:02,520 --> 00:00:07,000 Speaker 1: Bloomberg Audio Studios, podcasts, radio news. 2 00:00:07,840 --> 00:00:10,160 Speaker 2: Now, let's narrow our focus from the broader markets to 3 00:00:10,280 --> 00:0015,079 Speaker 2: one single stock. Amazon, the tech giant, hosting its annual 4 00:00:15,160 --> 00:00:19,280 Speaker 2: Amazon Web Services re:Invent conference down in Las Vegas this week. 5 00:00:19,520 --> 00:00:23,400 Speaker 2: The cloud-focused confab draws developers, engineers, and other thought 6 00:00:23,480 --> 00:00:25,840 Speaker 2: leaders in tech to explore the latest cloud and AI 7 00:00:25,960 --> 00:00:29,640 Speaker 2: projects happening under Amazon's roof, including a new AI chip. 8 00:00:29,960 --> 00:00:32,839 Speaker 2: Let's go live now, where Bloomberg Tech co-host Ed 9 00:00:33,000 --> 00:00:38,960 Speaker 2: Ludlow is joined by a special guest. Ed, take it away. Yeah. 10 00:00:39,000 --> 00:00:41,639 Speaker 3: Three pieces of news moved markets this morning. A new 11 00:00:41,720 --> 00:00:45,960 Speaker 3: generation of frontier model from AWS, new agentic tools, and 12 00:00:46,000 --> 00:00:51,720 Speaker 3: then a very quickly released, installed, and now ramping generation 13 00:00:51,920 --> 00:00:55,520 Speaker 3: of in-house custom accelerator, which is Trainium three. All 14 00:00:55,520 --> 00:00:59,640 Speaker 3: points of discussion for Matt Garman, AWS CEO. You know, the 15 00:00:59,680 --> 00:01:02,920 Speaker 3: base point with Trainium three, and you've moved quickly to 16 00:01:02,960 --> 00:01:07,360 Speaker 3: bring it to the real world, is cost-performance efficiency 17 00:01:07,440 --> 00:01:10,800 Speaker 3: over the prior generation, but also over Nvidia GPUs, 18 00:01:10,920 --> 00:01:14,360 Speaker 3: over Google TPUs.
I think what people are trying to 19 00:01:14,400 --> 00:01:17,440 Speaker 3: understand is that ramp part I was talking about: when 20 00:01:17,560 --> 00:01:22,240 Speaker 3: will real-world customers use it beyond this anchor customer of Anthropic, 21 00:01:22,480 --> 00:01:23,800 Speaker 3: which relies on it currently? 22 00:01:24,200 --> 00:01:27,520 Speaker 1: Yeah. Well, look, we're quite excited about Trainium, and Trainium 23 00:01:27,520 --> 00:01:30,360 Speaker 1: three in particular, as you mentioned, excited to get it 24 00:01:30,360 --> 00:01:32,760 Speaker 1: into customers' hands. And part of where we have a 25 00:01:32,800 --> 00:01:35,759 Speaker 1: benefit that we can bring to bear is, as you mentioned, 26 00:01:35,760 --> 00:01:38,120 Speaker 1: getting it into market quickly, and it's because we control 27 00:01:38,200 --> 00:01:41,360 Speaker 1: that full stack. We control the silicon development, we control 28 00:01:41,360 --> 00:01:43,440 Speaker 1: the data centers that it all lands in. We know that 29 00:01:43,480 --> 00:01:45,880 Speaker 1: full environment, and we can land that in very large 30 00:01:45,880 --> 00:01:47,720 Speaker 1: clusters for people to take advantage of, and the 31 00:01:47,760 --> 00:01:50,840 Speaker 1: performance that we're seeing out of it is quite incredible, 32 00:01:50,840 --> 00:01:53,280 Speaker 1: and so we're anxious and excited to get more and 33 00:01:53,320 --> 00:01:54,080 Speaker 1: more people using it. 34 00:01:54,640 --> 00:01:56,600 Speaker 3: I've been able to go inside Annapurna Labs 35 00:01:56,640 --> 00:01:59,280 Speaker 3: and look at the engineering work between the first generation 36 00:01:59,360 --> 00:02:02,960 Speaker 3: of Trainium and the second. It wasn't just the accelerator, but at 37 00:02:02,960 --> 00:02:04,320 Speaker 3: the server level as well. 38 00:02:04,440 --> 00:02:04,800 Speaker 1: That's right.
39 00:02:05,040 --> 00:02:07,040 Speaker 3: But a part of the surprise of today is this: 40 00:02:07,560 --> 00:02:10,560 Speaker 3: you appear to be committing to an annual cadence of a 41 00:02:10,560 --> 00:02:13,799 Speaker 3: new generation of Trainium. How do you keep that up? 42 00:02:14,000 --> 00:02:16,880 Speaker 1: Well, the key thing that we're focused on is making 43 00:02:16,880 --> 00:02:20,160 Speaker 1: sure that we can iterate on the technology as fast 44 00:02:20,160 --> 00:02:23,359 Speaker 1: as possible. The desire and the hunger out there for 45 00:02:23,760 --> 00:02:28,240 Speaker 1: more power and more compute is almost insatiable. And so 46 00:02:28,320 --> 00:02:31,799 Speaker 1: the more we can take an existing power footprint, an 47 00:02:31,800 --> 00:02:34,600 Speaker 1: existing set of capabilities, and bring more and more compute 48 00:02:34,600 --> 00:02:38,799 Speaker 1: into that for customers to build cool applications and cool 49 00:02:38,880 --> 00:02:42,000 Speaker 1: environments and to get value from that, that's what we're focused on. And 50 00:02:42,040 --> 00:02:44,160 Speaker 1: so we're going to be pushing that envelope as fast 51 00:02:44,200 --> 00:02:46,600 Speaker 1: as we possibly can to get those new 52 00:02:46,600 --> 00:02:48,120 Speaker 1: capabilities out to customers. 53 00:02:48,280 --> 00:02:50,760 Speaker 3: The pitch for Trainium in both the training and inference 54 00:02:50,840 --> 00:02:54,160 Speaker 3: use case is that it's a great deal, you know, 55 00:02:54,320 --> 00:02:57,480 Speaker 3: cost-effective performance. At the same time, you went on 56 00:02:57,520 --> 00:03:00,720 Speaker 3: stage and said AWS is, quote, by far the best 57 00:03:00,720 --> 00:03:05,560 Speaker 3: place to run Nvidia GPUs. How are both possible?
58 00:03:05,320 --> 00:03:07,920 Speaker 1: Well, I mean, both are possible because it 59 00:03:08,040 --> 00:03:12,760 Speaker 1: is a great environment to run accelerators and compute in. 60 00:03:13,160 --> 00:03:15,960 Speaker 1: And so we've been working for fifteen-plus years with 61 00:03:16,040 --> 00:03:19,400 Speaker 1: the Nvidia team and Jensen and team to deliver 62 00:03:19,639 --> 00:03:23,240 Speaker 1: outstanding capabilities for our customers, and when you're running 63 00:03:23,280 --> 00:03:26,160 Speaker 1: a large cluster of Nvidia GPUs, people will tell you 64 00:03:26,240 --> 00:03:28,600 Speaker 1: AWS is the best place: you get the best performance, 65 00:03:28,639 --> 00:03:32,120 Speaker 1: the most stable clusters, the best capabilities out there at 66 00:03:32,320 --> 00:03:34,720 Speaker 1: broad scale, and it's why folks like OpenAI and others 67 00:03:34,760 --> 00:03:37,880 Speaker 1: are running in AWS, and we have that choice. And 68 00:03:37,960 --> 00:03:39,800 Speaker 1: so for others that want to be able to take 69 00:03:39,840 --> 00:03:42,800 Speaker 1: advantage of Trainium, there's some use cases that are 70 00:03:42,800 --> 00:03:45,040 Speaker 1: best for Trainium, there's other use cases where Nvidia 71 00:03:45,080 --> 00:03:47,040 Speaker 1: GPUs are going to be your best option. We want 72 00:03:47,040 --> 00:03:49,200 Speaker 1: to have all of those available, and so we think 73 00:03:49,200 --> 00:03:51,680 Speaker 1: that if we can continue to push the envelope on 74 00:03:51,720 --> 00:03:54,480 Speaker 1: what Trainium can deliver for customers and make sure that 75 00:03:54,560 --> 00:03:58,040 Speaker 1: we are supporting the latest and greatest of everything that 76 00:03:58,040 --> 00:04:00,800 Speaker 1: the awesome team at Nvidia is delivering, that's going to 77 00:04:00,800 --> 00:04:02,240 Speaker 1: be the best outcome for our customers.
78 00:04:02,800 --> 00:04:06,200 Speaker 3: The plan for AWS is to basically double capacity by 79 00:04:06,200 --> 00:04:08,760 Speaker 3: the end of twenty twenty-seven, to around eight gigawatts. 80 00:04:09,080 --> 00:04:11,480 Speaker 3: Do you have a sense of how you apportion that 81 00:04:11,520 --> 00:04:15,920 Speaker 3: capacity, in silicon and server designs, to Trainium versus 82 00:04:16,080 --> 00:04:17,159 Speaker 3: Nvidia GPUs? 83 00:04:18,000 --> 00:04:19,560 Speaker 1: We're just going to keep pushing as fast as we 84 00:04:19,600 --> 00:04:22,200 Speaker 1: can, and we'll see where customer demand drives us as 85 00:04:22,240 --> 00:04:25,719 Speaker 1: we go. And as you said, we're massively adding capacity. 86 00:04:25,760 --> 00:04:28,200 Speaker 1: In the last year alone, we've added three point eight 87 00:04:28,200 --> 00:04:30,760 Speaker 1: gigawatts of capacity, and we'll continue to add more and 88 00:04:30,800 --> 00:04:33,640 Speaker 1: more over the next couple of years, and we'll 89 00:04:33,680 --> 00:04:35,880 Speaker 1: let customer demand drive us a little bit on what 90 00:04:35,920 --> 00:04:38,919 Speaker 1: they're looking for and what they want, and that's what 91 00:04:38,960 --> 00:04:40,520 Speaker 1: we always listen to and that's what we'll continue to 92 00:04:40,520 --> 00:04:40,880 Speaker 1: listen to. 93 00:04:41,400 --> 00:04:43,760 Speaker 3: The focus with Trainium, in the time I've been able 94 00:04:43,760 --> 00:04:46,240 Speaker 3: to interact with you and talk about it, again not 95 00:04:46,279 --> 00:04:49,400 Speaker 3: just the accelerator, but at the server design level, there's 96 00:04:49,440 --> 00:04:52,239 Speaker 3: a lot of benefits to the customer.
When does that benefit 97 00:04:52,320 --> 00:04:55,520 Speaker 3: start accruing to AWS in terms of profitability? Like, if 98 00:04:55,520 --> 00:04:59,160 Speaker 3: it's such a good financial proposition, you must be able 99 00:04:59,240 --> 00:05:01,080 Speaker 3: soon to say you're making a lot of money on this. 100 00:05:01,240 --> 00:05:03,600 Speaker 1: Yeah. Well, you're already seeing some of the benefits 101 00:05:03,680 --> 00:05:06,239 Speaker 1: accrue. You see things like Bedrock growing really, really rapidly, 102 00:05:06,440 --> 00:05:08,760 Speaker 1: and you see Trainium powering that under the covers, and 103 00:05:08,800 --> 00:05:12,920 Speaker 1: we announced today that more than half of all tokens 104 00:05:12,920 --> 00:05:15,760 Speaker 1: and inference done in Bedrock are done on Trainium two 105 00:05:15,760 --> 00:05:18,000 Speaker 1: servers under the covers, and so you're already seeing that 106 00:05:18,040 --> 00:05:21,160 Speaker 1: benefit come. You see the models that we're building in 107 00:05:21,240 --> 00:05:23,480 Speaker 1: Nova and Nova two start to get better and better 108 00:05:23,520 --> 00:05:26,839 Speaker 1: over time and be accelerated by Trainium, and so we 109 00:05:26,880 --> 00:05:29,240 Speaker 1: really think that there's a whole bunch of dimensions on 110 00:05:29,279 --> 00:05:32,520 Speaker 1: which both our customers, our partners, and our own products 111 00:05:32,520 --> 00:05:34,480 Speaker 1: are going to get accelerated, all from Trainium. 112 00:05:34,800 --> 00:05:37,200 Speaker 3: Every time you come onto the program, I always offer 113 00:05:37,240 --> 00:05:39,400 Speaker 3: the audience the opportunity to pose a question to you. There's 114 00:05:39,400 --> 00:05:41,680 Speaker 3: a lot of interest in AWS, right. Many of your 115 00:05:41,920 --> 00:05:45,680 Speaker 3: customers span global technology.
Actually, most of the questions were 116 00:05:45,720 --> 00:05:49,680 Speaker 3: about Anthropic. There wasn't much said on stage. I think 117 00:05:49,680 --> 00:05:52,760 Speaker 3: people are trying to understand what is the benefit and 118 00:05:52,839 --> 00:05:57,560 Speaker 3: advantage AWS offers to Anthropic while they are ramping Trainium 119 00:05:57,560 --> 00:06:01,320 Speaker 3: through Project Rainier, but also ramping their TPU allocations as well. 120 00:06:01,839 --> 00:06:04,000 Speaker 1: Well. Look, our partners at Anthropic, our partnership with 121 00:06:04,000 --> 00:06:06,440 Speaker 1: them is incredibly strong and it's never been stronger, and 122 00:06:07,720 --> 00:06:09,720 Speaker 1: we do a ton of collaboration with them, and as 123 00:06:09,720 --> 00:06:12,480 Speaker 1: I mentioned, through Project Rainier it's a huge collaboration 124 00:06:12,600 --> 00:06:15,279 Speaker 1: there to go build their current generation models, and all 125 00:06:15,320 --> 00:06:18,400 Speaker 1: their models run today and launch on day one on 126 00:06:18,520 --> 00:06:21,120 Speaker 1: top of Trainium and on top of AWS, which we're 127 00:06:21,160 --> 00:06:23,520 Speaker 1: incredibly excited about, and we'll continue that partnership for 128 00:06:23,520 --> 00:06:26,560 Speaker 1: a long time. I think for them, they have a 129 00:06:26,680 --> 00:06:29,080 Speaker 1: huge demand for compute, and so they'll go to other 130 00:06:29,120 --> 00:06:32,080 Speaker 1: places where it makes sense to round out their compute 131 00:06:32,480 --> 00:06:35,640 Speaker 1: needs, because they just have such massive needs for compute, 132 00:06:35,680 --> 00:06:38,039 Speaker 1: and they have customers in other clouds as well. But 133 00:06:38,080 --> 00:06:41,640 Speaker 1: we're definitely their primary cloud provider and closest partner, 134 00:06:41,680 --> 00:06:42,000 Speaker 1: for sure.
135 00:06:42,960 --> 00:06:46,680 Speaker 3: Supply constraints. So Anthropic is supply constrained, they can't 136 00:06:46,680 --> 00:06:49,400 Speaker 3: get the compute they need. We've talked about the ramp on 137 00:06:49,440 --> 00:06:52,279 Speaker 3: Nvidia GPUs and in-house silicon. Is there a 138 00:06:52,320 --> 00:06:55,679 Speaker 3: supply constraint element with AWS? Are you able to get 139 00:06:56,040 --> 00:06:57,479 Speaker 3: the chips that you need? 140 00:06:57,640 --> 00:07:01,440 Speaker 1: Yeah, I think there's always... anytime you see an industry 141 00:07:01,440 --> 00:07:03,840 Speaker 1: that's growing as fast as this is right now, when 142 00:07:03,839 --> 00:07:07,120 Speaker 1: you think about AI and model development and chips, there 143 00:07:07,160 --> 00:07:09,440 Speaker 1: are going to be constraints no matter what. There is 144 00:07:09,520 --> 00:07:13,040 Speaker 1: more demand than there is supply. Sometimes it's in chips, 145 00:07:13,080 --> 00:07:15,680 Speaker 1: sometimes it's in power and data centers, sometimes it's in, 146 00:07:16,440 --> 00:07:19,120 Speaker 1: you know, different parts of that. At some points it's, 147 00:07:19,440 --> 00:07:23,320 Speaker 1: you know, networking equipment. At some point it's transistors, you know, 148 00:07:23,480 --> 00:07:25,520 Speaker 1: resistors, or whatever it is. And you look at the 149 00:07:25,680 --> 00:07:29,320 Speaker 1: entire supply chain that is needed to ramp up at 150 00:07:29,400 --> 00:07:32,720 Speaker 1: such a massive rate, right? Never before has the technology 151 00:07:32,720 --> 00:07:34,640 Speaker 1: industry ramped at the rate that we are right now, 152 00:07:35,240 --> 00:07:37,640 Speaker 1: and so there are always constraints. And so it's not 153 00:07:37,680 --> 00:07:40,920 Speaker 1: that there is necessarily one constraint where it's like, wow, 154 00:07:40,960 --> 00:07:42,600 Speaker 1: I can't get Nvidia chips.
We can get 155 00:07:42,640 --> 00:07:45,720 Speaker 1: Nvidia chips. And actually Jensen's team have been incredibly supportive 156 00:07:45,840 --> 00:07:48,160 Speaker 1: and great partners in helping us get capacity there. It's 157 00:07:48,200 --> 00:07:50,480 Speaker 1: not that you can't get power. We're getting power all 158 00:07:50,520 --> 00:07:52,720 Speaker 1: over the place. But it's just, we're ramping all of 159 00:07:52,760 --> 00:07:56,280 Speaker 1: these places at such rapid rates that there's always a 160 00:07:56,320 --> 00:07:58,560 Speaker 1: constraint in that system, and it'll change every month you 161 00:07:58,600 --> 00:07:59,720 Speaker 1: ask me what the current one is. 162 00:08:00,080 --> 00:08:01,960 Speaker 3: Throughout the day, I was just speaking with your team 163 00:08:02,040 --> 00:08:05,200 Speaker 3: about the idea we're moving from AI assistants to AI 164 00:08:05,280 --> 00:08:08,400 Speaker 3: co-workers. You know, particular focus on the agentic offering 165 00:08:08,440 --> 00:08:10,560 Speaker 3: that you've done. You're in the camp of people, if 166 00:08:10,600 --> 00:08:13,520 Speaker 3: you don't mind me saying, that sees basically ninety percent 167 00:08:13,560 --> 00:08:16,840 Speaker 3: of the value in enterprise coming from agentic technology. Do 168 00:08:16,880 --> 00:08:19,280 Speaker 3: you have any data or evidence to support that all 169 00:08:19,320 --> 00:08:20,800 Speaker 3: of your customers are ready for that? 170 00:08:21,320 --> 00:08:23,920 Speaker 1: Yeah, I don't think all of our customers are 171 00:08:23,920 --> 00:08:25,800 Speaker 1: yet ready for that, but they're excited about it. So, 172 00:08:25,840 --> 00:08:27,360 Speaker 1: you know, I think it would definitely be an overstatement to 173 00:08:27,360 --> 00:08:29,280 Speaker 1: say everybody's ready for it. And part of that is 174 00:08:29,320 --> 00:08:31,480 Speaker 1: because it is going to take change.
Right, people are 175 00:08:31,480 --> 00:08:33,240 Speaker 1: going to have to change how they think about work. 176 00:08:33,280 --> 00:08:35,240 Speaker 1: They're going to have to change their process flows, they're 177 00:08:35,240 --> 00:08:37,240 Speaker 1: going to have to change some of the things about how 178 00:08:37,240 --> 00:08:38,680 Speaker 1: they get work done. It's not just going to be 179 00:08:38,880 --> 00:08:40,920 Speaker 1: a magic wand that's going to come in and magically 180 00:08:40,920 --> 00:08:44,199 Speaker 1: get them value. But almost everyone that I 181 00:08:44,280 --> 00:08:46,440 Speaker 1: talk to definitely sees that that's the path. The 182 00:08:46,520 --> 00:08:50,560 Speaker 1: agentic power, the power of agents, is what 183 00:08:50,720 --> 00:08:53,480 Speaker 1: allows customers to actually get that work done. And when 184 00:08:53,520 --> 00:08:55,880 Speaker 1: they see that efficiency gain, they see themselves able to 185 00:08:55,880 --> 00:08:58,640 Speaker 1: accomplish things they weren't able to do before, that is 186 00:08:58,640 --> 00:09:00,440 Speaker 1: when it's worth it to go make these changes. And 187 00:09:00,480 --> 00:09:01,920 Speaker 1: so there's going to be work for people, and it's 188 00:09:01,960 --> 00:09:04,440 Speaker 1: going to take some time, right? We're twenty 189 00:09:04,480 --> 00:09:07,319 Speaker 1: years into the cloud journey and still only a fraction 190 00:09:07,400 --> 00:09:09,280 Speaker 1: of workloads have moved to the cloud. So it's going 191 00:09:09,360 --> 00:09:10,720 Speaker 1: to take time. It's not like people are going 192 00:09:10,800 --> 00:09:12,880 Speaker 1: to magically switch. And I think it's going to be really... 193 00:09:12,800 --> 00:09:15,719 Speaker 3: Fair. We just have sixty seconds. Twenty years into 194 00:09:15,760 --> 00:09:17,920 Speaker 3: the cloud journey. When I touched down in Vegas,
everyone 195 00:09:17,960 --> 00:09:23,000 Speaker 3: accepts AWS is number one in terms of scale infrastructure. The question 196 00:09:23,200 --> 00:09:26,319 Speaker 3: is, is AWS number one in AI? Just in the thirty 197 00:09:26,360 --> 00:09:28,760 Speaker 3: seconds we have left. Yeah, I think I'll give it a go. 198 00:09:28,760 --> 00:09:30,240 Speaker 1: It's a question that we got a lot two years 199 00:09:30,240 --> 00:09:32,160 Speaker 1: ago and not that much a year ago, and today 200 00:09:32,160 --> 00:09:33,719 Speaker 1: I don't think we get it nearly as much. It's 201 00:09:33,720 --> 00:09:35,360 Speaker 1: just people that are kind of playing the same tapes. 202 00:09:35,679 --> 00:09:39,319 Speaker 1: We have a huge choice of models. We see, when 203 00:09:39,360 --> 00:09:42,480 Speaker 1: customers are actually moving their workloads to production, they want 204 00:09:42,480 --> 00:09:44,440 Speaker 1: to run those AI workloads on AWS, and that to 205 00:09:44,480 --> 00:09:46,800 Speaker 1: me is the biggest signal. When we see our customers, 206 00:09:46,840 --> 00:09:48,600 Speaker 1: they say, I ran proof of concepts in a lot 207 00:09:48,600 --> 00:09:50,439 Speaker 1: of places. When I want to move to production, I 208 00:09:50,480 --> 00:09:52,200 Speaker 1: want to run on AWS. And that's the thing that 209 00:09:52,200 --> 00:09:54,120 Speaker 1: we hear over and over again, which makes me think 210 00:09:54,160 --> 00:09:55,320 Speaker 1: we're actually in a great position. 211 00:09:55,640 --> 00:09:59,240 Speaker 3: Matt Garman, AWS CEO, with the full-stack AI company 212 00:09:59,280 --> 00:10:01,080 Speaker 3: pitch here in Vegas at re:Invent.