1 00:00:02,520 --> 00:00:07,000 Speaker 1: Bloomberg Audio Studios, podcasts, radio news. 2 00:00:07,840 --> 00:00:10,160 Speaker 2: Now, let's narrow our focus from the broader markets to 3 00:00:10,280 --> 00:0015,079 Speaker 2: one single stock. Amazon, the tech giant, hosting its annual 4 00:00:15,160 --> 00:00:19,280 Speaker 2: Amazon Web Services re:Invent conference down in Las Vegas this week. 5 00:00:19,520 --> 00:00:23,400 Speaker 2: The cloud-focused confab draws developers, engineers, and other thought 6 00:00:23,480 --> 00:00:25,840 Speaker 2: leaders in tech to explore the latest cloud and AI 7 00:00:25,960 --> 00:00:29,640 Speaker 2: projects happening under Amazon's roof, including a new AI chip. 8 00:00:29,960 --> 00:00:32,839 Speaker 2: Let's go live now, where Bloomberg Tech co-host Ed 9 00:00:33,000 --> 00:00:38,960 Speaker 2: Ludlow is joined by a special guest. Ed, take it away. Yeah. 10 00:00:39,000 --> 00:00:41,639 Speaker 3: Three pieces of news moved markets this morning. A new 11 00:00:41,720 --> 00:00:45,960 Speaker 3: generation of frontier model from AWS, new agentic tools, and 12 00:00:46,000 --> 00:00:51,720 Speaker 3: then a very quickly released, installed, and now ramping generation 13 00:00:51,920 --> 00:00:55,520 Speaker 3: of in-house custom accelerator, which is Trainium three. All 14 00:00:55,520 --> 00:00:59,640 Speaker 3: points of discussion for Matt Garman, AWS CEO. You know, the 15 00:00:59,680 --> 00:01:02,920 Speaker 3: base point with Trainium three, and you've moved quickly to 16 00:01:02,960 --> 00:01:07,360 Speaker 3: bring it to the real world, is cost-performance efficiency 17 00:01:07,440 --> 00:01:10,800 Speaker 3: over the prior generation, but also over Nvidia GPUs, 18 00:01:10,920 --> 00:01:14,360 Speaker 3: over Google TPUs.
I think what people are trying to 19 00:01:14,400 --> 00:01:17,440 Speaker 3: understand is that ramp part I was talking about: when 20 00:01:17,560 --> 00:01:22,240 Speaker 3: will real-world customers use it beyond this anchor customer of Anthropic, 21 00:01:22,480 --> 00:01:23,800 Speaker 3: which relies on it currently? 22 00:01:24,200 --> 00:01:27,520 Speaker 1: Yeah. Well, look, we're quite excited about Trainium, and Trainium 23 00:01:27,520 --> 00:01:30,360 Speaker 1: three in particular, as you mentioned, excited to get it 24 00:01:30,360 --> 00:01:32,760 Speaker 1: into customers' hands. And part of where we have a 25 00:01:32,800 --> 00:01:35,759 Speaker 1: benefit that we can bring to bear is, as you mentioned, 26 00:01:35,760 --> 00:01:38,120 Speaker 1: getting it into market quickly, and it's because we control 27 00:01:38,200 --> 00:01:41,360 Speaker 1: that full stack. We control the silicon development, we control 28 00:01:41,360 --> 00:01:43,440 Speaker 1: the data centers that it all lands in. We know that 29 00:01:43,480 --> 00:01:45,880 Speaker 1: full environment, and we can land that in very large 30 00:01:45,880 --> 00:01:47,720 Speaker 1: clusters for people to take advantage of, and the 31 00:01:47,760 --> 00:01:50,840 Speaker 1: performance that we're seeing out of it is quite incredible, 32 00:01:50,840 --> 00:01:53,280 Speaker 1: and so we're anxious and excited to get more and 33 00:01:53,320 --> 00:01:54,080 Speaker 1: more people using it. 34 00:01:54,640 --> 00:01:56,600 Speaker 3: I've been able to go inside Annapurna Labs 35 00:01:56,640 --> 00:01:59,280 Speaker 3: and look at the engineering work between the first generation 36 00:01:59,360 --> 00:02:02,960 Speaker 3: of Trainium and the second. It wasn't just the accelerator, but at 37 00:02:02,960 --> 00:02:04,320 Speaker 3: the server level as well. 38 00:02:04,440 --> 00:02:04,800 Speaker 1: That's right.
39 00:02:05,040 --> 00:02:07,040 Speaker 3: But a part of the surprise of today is this: 40 00:02:07,560 --> 00:02:10,560 Speaker 3: you appear to be committing to an annual cadence of a 41 00:02:10,560 --> 00:02:13,799 Speaker 3: new generation of Trainium. How do you keep that up? 42 00:02:14,000 --> 00:02:16,880 Speaker 1: Well, the key thing that we're focused on is making 43 00:02:16,880 --> 00:02:20,160 Speaker 1: sure that we can iterate on the technology as fast 44 00:02:20,160 --> 00:02:23,359 Speaker 1: as possible. The desire and the hunger out there for 45 00:02:23,760 --> 00:02:28,240 Speaker 1: more power and more compute is almost insatiable. And so 46 00:02:28,320 --> 00:02:31,799 Speaker 1: the more we can take an existing power footprint, an 47 00:02:31,800 --> 00:02:34,600 Speaker 1: existing set of capabilities, and bring more and more compute 48 00:02:34,600 --> 00:02:38,799 Speaker 1: into that for customers to build cool applications and cool 49 00:02:38,880 --> 00:02:42,000 Speaker 1: environments and to get value from that, that's what we're focused on. And 50 00:02:42,040 --> 00:02:44,160 Speaker 1: so we're going to be pushing that envelope as fast 51 00:02:44,200 --> 00:02:46,600 Speaker 1: as we possibly can to get those new 52 00:02:46,600 --> 00:02:48,120 Speaker 1: capabilities out to customers. 53 00:02:48,280 --> 00:02:50,760 Speaker 3: The pitch for Trainium in both the training and inference 54 00:02:50,840 --> 00:02:54,160 Speaker 3: use case is that it's a great deal, you know, 55 00:02:54,320 --> 00:02:57,480 Speaker 3: cost-effective performance. At the same time, you went on 56 00:02:57,520 --> 00:03:00,720 Speaker 3: stage and said AWS is, quote, by far the best 57 00:03:00,720 --> 00:03:05,560 Speaker 3: place to run Nvidia GPUs. How are both possible?
58 00:03:05,320 --> 00:03:07,920 Speaker 1: Well, I mean, both are possible because it 59 00:03:08,040 --> 00:03:12,760 Speaker 1: is a great environment to run accelerators and compute in. 60 00:03:13,160 --> 00:03:15,960 Speaker 1: And so we've been working for fifteen-plus years with 61 00:03:16,040 --> 00:03:19,400 Speaker 1: the Nvidia team and Jensen and team to deliver 62 00:03:19,639 --> 00:03:23,240 Speaker 1: outstanding capabilities for our customers, and when you're running 63 00:03:23,280 --> 00:03:26,160 Speaker 1: a large cluster of Nvidia GPUs, people will tell you 64 00:03:26,240 --> 00:03:28,600 Speaker 1: AWS is the best place: you get the best performance, 65 00:03:28,639 --> 00:03:32,120 Speaker 1: the most stable clusters, the best capabilities out there at 66 00:03:32,320 --> 00:03:34,720 Speaker 1: broad scale, and it's why folks like OpenAI and others 67 00:03:34,760 --> 00:03:37,880 Speaker 1: are running in AWS, and we have that choice. And 68 00:03:37,960 --> 00:03:39,800 Speaker 1: so for others that want to be able to take 69 00:03:39,840 --> 00:03:42,800 Speaker 1: advantage of Trainium, there's some use cases that are 70 00:03:42,800 --> 00:03:45,040 Speaker 1: best for Trainium, there's other use cases where Nvidia 71 00:03:45,080 --> 00:03:47,040 Speaker 1: GPUs are going to be your best option. We want 72 00:03:47,040 --> 00:03:49,200 Speaker 1: to have all of those available, and so we think 73 00:03:49,200 --> 00:03:51,680 Speaker 1: that if we can continue to push the envelope on 74 00:03:51,720 --> 00:03:54,480 Speaker 1: what Trainium can deliver for customers and make sure that 75 00:03:54,560 --> 00:03:58,040 Speaker 1: we are supporting the latest and greatest of everything that 76 00:03:58,040 --> 00:04:00,800 Speaker 1: the awesome team at Nvidia is delivering, that's going to 77 00:04:00,800 --> 00:04:02,240 Speaker 1: be the best outcome for our customers.
78 00:04:02,800 --> 00:04:06,200 Speaker 3: The plan for AWS is to basically double capacity by 79 00:04:06,200 --> 00:04:08,760 Speaker 3: the end of twenty twenty-seven, to around eight gigawatts. 80 00:04:09,080 --> 00:04:11,480 Speaker 3: Do you have a sense of how you apportion that 81 00:04:11,520 --> 00:04:15,920 Speaker 3: capacity, in silicon and server designs, to Trainium versus 82 00:04:16,080 --> 00:04:17,159 Speaker 3: Nvidia GPUs? 83 00:04:18,000 --> 00:04:19,560 Speaker 1: We're just going to keep pushing as fast as we 84 00:04:19,600 --> 00:04:22,200 Speaker 1: can, and we'll see where customer demand drives us as 85 00:04:22,240 --> 00:04:25,719 Speaker 1: we go. And as you said, we're massively adding capacity. 86 00:04:25,760 --> 00:04:28,200 Speaker 1: In the last year alone, we've added three point eight 87 00:04:28,200 --> 00:04:30,760 Speaker 1: gigawatts of capacity, and we'll continue to add more and 88 00:04:30,800 --> 00:04:33,640 Speaker 1: more over the next couple of years, and we'll 89 00:04:33,680 --> 00:04:35,880 Speaker 1: let customer demand drive us a little bit on what 90 00:04:35,920 --> 00:04:38,919 Speaker 1: they're looking for and what they want, and that's what 91 00:04:38,960 --> 00:04:40,520 Speaker 1: we always listen to and that's what we'll continue to 92 00:04:40,520 --> 00:04:40,880 Speaker 1: listen to. 93 00:04:41,400 --> 00:04:43,760 Speaker 3: The focus with Trainium, in the time I've been able 94 00:04:43,760 --> 00:04:46,240 Speaker 3: to interact with you and talk about it, again not 95 00:04:46,279 --> 00:04:49,400 Speaker 3: just the accelerator, but at the server design level, there's 96 00:04:49,440 --> 00:04:52,239 Speaker 3: a lot of benefits to the customer.
When does that benefit 97 00:04:52,320 --> 00:04:55,520 Speaker 3: start accruing to AWS in terms of profitability? Like, if 98 00:04:55,520 --> 00:04:59,160 Speaker 3: it's such a good financial proposition, you must be able 99 00:04:59,240 --> 00:05:01,080 Speaker 3: soon to say you're making a lot of money on this. 100 00:05:01,240 --> 00:05:03,600 Speaker 1: Yeah. Well, you're already seeing some of the benefits 101 00:05:03,680 --> 00:05:06,239 Speaker 1: accrue. You see things like Bedrock growing really, really rapidly, 102 00:05:06,440 --> 00:05:08,760 Speaker 1: and you see Trainium powering that under the covers, and 103 00:05:08,800 --> 00:05:12,920 Speaker 1: we announced today that more than half of all tokens 104 00:05:12,920 --> 00:05:15,760 Speaker 1: and inference done in Bedrock are done on Trainium two 105 00:05:15,760 --> 00:05:18,000 Speaker 1: servers under the covers, and so you're already seeing that 106 00:05:18,040 --> 00:05:21,160 Speaker 1: benefit come. You see the models that we're building in 107 00:05:21,240 --> 00:05:23,480 Speaker 1: Nova and Nova two start to get better and better 108 00:05:23,520 --> 00:05:26,839 Speaker 1: over time and be accelerated by Trainium, and so we 109 00:05:26,880 --> 00:05:29,240 Speaker 1: really think that there's a whole bunch of dimensions on 110 00:05:29,279 --> 00:05:32,520 Speaker 1: which both our customers, our partners, and our own products 111 00:05:32,520 --> 00:05:34,480 Speaker 1: are going to get accelerated, all from Trainium. 112 00:05:34,800 --> 00:05:37,200 Speaker 3: Every time you come onto the program, I always offer 113 00:05:37,240 --> 00:05:39,400 Speaker 3: the audience the opportunity to pose a question to you. There's 114 00:05:39,400 --> 00:05:41,680 Speaker 3: a lot of interest in AWS, right. Many of your 115 00:05:41,920 --> 00:05:45,680 Speaker 3: customers span global technology.
Actually, most of the questions were 116 00:05:45,720 --> 00:05:49,680 Speaker 3: about Anthropic. There wasn't much said on stage. I think 117 00:05:49,680 --> 00:05:52,760 Speaker 3: people are trying to understand what is the benefit and 118 00:05:52,839 --> 00:05:57,560 Speaker 3: advantage AWS offers to Anthropic while they are ramping Trainium 119 00:05:57,560 --> 00:06:01,320 Speaker 3: through Project Rainier, but also ramping their TPU allocations as well. 120 00:06:01,839 --> 00:06:04,000 Speaker 1: Well. Look, our partners at Anthropic, our partnership with 121 00:06:04,000 --> 00:06:06,440 Speaker 1: them is incredibly strong and it's never been stronger, and 122 00:06:07,720 --> 00:06:09,720 Speaker 1: we do a ton of collaboration with them, and as 123 00:06:09,720 --> 00:06:12,480 Speaker 1: I mentioned, through Project Rainier it's a huge collaboration 124 00:06:12,600 --> 00:06:15,279 Speaker 1: there to go build their current generation models, and all 125 00:06:15,320 --> 00:06:18,400 Speaker 1: their models run today and launch on day one on 126 00:06:18,520 --> 00:06:21,120 Speaker 1: top of Trainium and on top of AWS, which we're 127 00:06:21,160 --> 00:06:23,520 Speaker 1: incredibly excited about, and we'll continue that partnership for 128 00:06:23,520 --> 00:06:26,560 Speaker 1: a long time. I think for them, they have a 129 00:06:26,680 --> 00:06:29,080 Speaker 1: huge demand for compute, and so they'll go to other 130 00:06:29,120 --> 00:06:32,080 Speaker 1: places where it makes sense to round out their compute 131 00:06:32,480 --> 00:06:35,640 Speaker 1: needs, because they just have such massive needs for compute, 132 00:06:35,680 --> 00:06:38,039 Speaker 1: and they have customers in other clouds as well. But 133 00:06:38,080 --> 00:06:41,640 Speaker 1: we're definitely their primary cloud provider and closest partner, 134 00:06:41,680 --> 00:06:42,000 Speaker 1: for sure.
135 00:06:42,960 --> 00:06:46,680 Speaker 3: Supply constraints. So Anthropic is supply constrained, they can't 136 00:06:46,680 --> 00:06:49,400 Speaker 3: get the compute they need. We've talked about the ramp on 137 00:06:49,440 --> 00:06:52,279 Speaker 3: Nvidia GPUs and in-house silicon. Is there a 138 00:06:52,320 --> 00:06:55,679 Speaker 3: supply constraint element with AWS? Are you able to get 139 00:06:56,040 --> 00:06:57,479 Speaker 3: the chips that you need? 140 00:06:57,640 --> 00:07:01,440 Speaker 1: Yeah, I think there's always... anytime you see an industry 141 00:07:01,440 --> 00:07:03,840 Speaker 1: that's growing as fast as this is right now, when 142 00:07:03,839 --> 00:07:07,120 Speaker 1: you think about AI and model development and chips, there 143 00:07:07,160 --> 00:07:09,440 Speaker 1: are going to be constraints no matter what. There is 144 00:07:09,520 --> 00:07:13,040 Speaker 1: more demand than there is supply. Sometimes it's in chips, 145 00:07:13,080 --> 00:07:15,680 Speaker 1: sometimes it's in power and data centers, sometimes it's in, 146 00:07:16,440 --> 00:07:19,120 Speaker 1: you know, different parts of that. At some points it's, 147 00:07:19,440 --> 00:07:23,320 Speaker 1: you know, networking equipment. At some point it's transistors, you know, 148 00:07:23,480 --> 00:07:25,520 Speaker 1: resistors, or whatever it is. And you look at the 149 00:07:25,680 --> 00:07:29,320 Speaker 1: entire supply chain that is needed to ramp up at 150 00:07:29,400 --> 00:07:32,720 Speaker 1: such a massive rate, right? Never before has the technology 151 00:07:32,720 --> 00:07:34,640 Speaker 1: industry ramped at the rate that we are right now, 152 00:07:35,240 --> 00:07:37,640 Speaker 1: and so there are always constraints. And so it's not 153 00:07:37,680 --> 00:07:40,920 Speaker 1: that there is necessarily one constraint where it's like, wow, 154 00:07:40,960 --> 00:07:42,600 Speaker 1: I can't get Nvidia chips.
We can get 155 00:07:42,640 --> 00:07:45,720 Speaker 1: Nvidia chips. And actually Jensen's team have been incredibly supportive 156 00:07:45,840 --> 00:07:48,160 Speaker 1: and great partners in helping us get capacity there. It's 157 00:07:48,200 --> 00:07:50,480 Speaker 1: not that you can't get power. We're getting power all 158 00:07:50,520 --> 00:07:52,720 Speaker 1: over the place. But it's just, we're ramping all of 159 00:07:52,760 --> 00:07:56,280 Speaker 1: these places at such rapid rates that there's always a 160 00:07:56,320 --> 00:07:58,560 Speaker 1: constraint in that system, and it'll change every month you 161 00:07:58,600 --> 00:07:59,720 Speaker 1: ask me what the current one is. 162 00:08:00,080 --> 00:08:01,960 Speaker 3: Throughout the day, I was just speaking with your team 163 00:08:02,040 --> 00:08:05,200 Speaker 3: about the idea we're moving from AI assistants to AI 164 00:08:05,280 --> 00:08:08,400 Speaker 3: co-workers. You know, particular focus on the agentic offering 165 00:08:08,440 --> 00:08:10,560 Speaker 3: that you've done. You're in the camp of people, if 166 00:08:10,600 --> 00:08:13,520 Speaker 3: you don't mind me saying, that sees basically ninety percent 167 00:08:13,560 --> 00:08:16,840 Speaker 3: of the value in enterprise coming from agentic technology. Do 168 00:08:16,880 --> 00:08:19,280 Speaker 3: you have any data or evidence to support that all 169 00:08:19,320 --> 00:08:20,800 Speaker 3: of your customers are ready for that? 170 00:08:21,320 --> 00:08:23,920 Speaker 1: Yeah, I don't think all of our customers are 171 00:08:23,920 --> 00:08:25,800 Speaker 1: yet ready for that, but they're excited about it. So, 172 00:08:25,840 --> 00:08:27,360 Speaker 1: you know, I think it would definitely be an overstatement to 173 00:08:27,360 --> 00:08:29,280 Speaker 1: say everybody's ready for it. And part of that is 174 00:08:29,320 --> 00:08:31,480 Speaker 1: because it is going to take change.
Right, people are 175 00:08:31,480 --> 00:08:33,240 Speaker 1: going to have to change how they think about work. 176 00:08:33,280 --> 00:08:35,240 Speaker 1: They're going to have to change their process flows, they're 177 00:08:35,240 --> 00:08:37,240 Speaker 1: going to have to change some of the things about how 178 00:08:37,240 --> 00:08:38,680 Speaker 1: they get work done. It's not just going to be 179 00:08:38,880 --> 00:08:40,920 Speaker 1: a magic wand that's going to come in and magically 180 00:08:40,920 --> 00:08:44,199 Speaker 1: get them value. But almost everyone that I 181 00:08:44,280 --> 00:08:46,440 Speaker 1: talk to definitely sees that that's the path. The 182 00:08:46,520 --> 00:08:50,560 Speaker 1: agentic power, the power of agents, is what 183 00:08:50,720 --> 00:08:53,480 Speaker 1: allows customers to actually get that work done. And when 184 00:08:53,520 --> 00:08:55,880 Speaker 1: they see that efficiency gain, they see themselves able to 185 00:08:55,880 --> 00:08:58,640 Speaker 1: accomplish things they weren't able to do before, that is 186 00:08:58,640 --> 00:09:00,440 Speaker 1: when it's worth it to go make these changes. And 187 00:09:00,480 --> 00:09:01,920 Speaker 1: so there's going to be work for people, and it's 188 00:09:01,960 --> 00:09:04,440 Speaker 1: going to take some time, right? We're twenty 189 00:09:04,480 --> 00:09:07,319 Speaker 1: years into the cloud journey and still only a fraction 190 00:09:07,400 --> 00:09:09,280 Speaker 1: of workloads have moved to the cloud. So it's going 191 00:09:09,360 --> 00:09:10,720 Speaker 1: to take time. It's not like people are going 192 00:09:10,800 --> 00:09:12,880 Speaker 1: to magically switch. And I think it's going to be really... 193 00:09:12,800 --> 00:09:15,719 Speaker 3: Fair. We just have sixty seconds. Twenty years into 194 00:09:15,760 --> 00:09:17,920 Speaker 3: the cloud journey. When I touched down in Vegas,
everyone 195 00:09:17,960 --> 00:09:23,000 Speaker 3: accepts AWS is number one in terms of scale infrastructure. The question 196 00:09:23,200 --> 00:09:26,319 Speaker 3: is, is AWS number one in AI? Just in the thirty 197 00:09:26,360 --> 00:09:28,760 Speaker 3: seconds we have left. Yeah, I think I'll give it a go. 198 00:09:28,760 --> 00:09:30,240 Speaker 1: It's a question that we got a lot two years 199 00:09:30,240 --> 00:09:32,160 Speaker 1: ago and not that much a year ago, and today 200 00:09:32,160 --> 00:09:33,719 Speaker 1: I don't think we get it nearly as much. It's 201 00:09:33,720 --> 00:09:35,360 Speaker 1: just people that are kind of playing the same tapes. 202 00:09:35,679 --> 00:09:39,319 Speaker 1: We have a huge choice of models. We see, when 203 00:09:39,360 --> 00:09:42,480 Speaker 1: customers are actually moving their workloads to production, they want 204 00:09:42,480 --> 00:09:44,440 Speaker 1: to run those AI workloads on AWS, and that to 205 00:09:44,480 --> 00:09:46,800 Speaker 1: me is the biggest signal. When we see our customers, 206 00:09:46,840 --> 00:09:48,600 Speaker 1: they say, I ran proof of concepts in a lot 207 00:09:48,600 --> 00:09:50,439 Speaker 1: of places. When I want to move to production, I 208 00:09:50,480 --> 00:09:52,200 Speaker 1: want to run on AWS. And that's the thing that 209 00:09:52,200 --> 00:09:54,120 Speaker 1: we hear over and over again, which makes me think 210 00:09:54,160 --> 00:09:55,320 Speaker 1: we're actually in a great position. 211 00:09:55,640 --> 00:09:59,240 Speaker 3: Matt Garman, AWS CEO, with the full-stack AI company 212 00:09:59,280 --> 00:10:01,080 Speaker 3: pitch here in Vegas at re:Invent.