WEBVTT - How CoreWeave Sees the Market for Compute Right Now 0:00:02.720 --> 0:00:15.840 Bloomberg Audio Studios. Podcasts, radio, news. 0:00:18.520 --> 0:00:21.560 Hello and welcome to another episode of the Odd Lots podcast. 0:00:21.640 --> 0:00:24.040 I'm Joe Weisenthal and I'm Tracy Alloway. 0:00:24.280 --> 0:00:29.680 Tracy, I'm envisioning this future where like we have to 0:00:29.720 --> 0:00:33.680 do a state of the sort of AI inference market episode, 0:00:33.720 --> 0:00:36.600 like once a month, you know, where it's like things 0:00:36.600 --> 0:00:39.480 are moving so rapidly and there's so much change either 0:00:39.479 --> 0:00:42.120 in terms of what models are using or what they're 0:00:42.120 --> 0:00:44.720 being used for, et cetera, that in the same way 0:00:44.760 --> 0:00:47.440 we would do, like you know, the occasional regular stock 0:00:47.479 --> 0:00:50.839 market episode or whatever, we would just do, Okay, what 0:00:50.920 --> 0:00:53.880 are we seeing right now in inference trends, because 0:00:53.920 --> 0:00:56.120 it just feels like the moment we do an episode, 0:00:56.120 --> 0:00:58.000 a few weeks later it may be out of date. 0:00:58.240 --> 0:01:00.600 We should just bite the bullet and do a weekly episode, 0:01:00.680 --> 0:01:04.800 transform Lots More into a market update on compute. 0:01:04.840 --> 0:01:07.840 We could do inference in I don't know, we'll have 0:01:07.880 --> 0:01:12.160 to workshop inference. No, no, we'd have to. But anyway, 0:01:12.360 --> 0:01:15.640 this is lots of lots of inference. This is like 0:01:15.959 --> 0:01:19.360 the story of the moment, and we know that, you know, 0:01:19.400 --> 0:01:22.640 a couple of years ago, everyone was sort of dabbling 0:01:22.680 --> 0:01:28.000 around with various things and experimenting and using AI, like 0:01:27.680 --> 0:01:30.240 oh like write a poem for me about this, et cetera.
0:01:30.400 --> 0:01:32.440 That phase of AI is long over, and we know 0:01:32.560 --> 0:01:36.800 that companies specifically are spending a ton on compute, so 0:01:37.040 --> 0:01:40.839 much so that CFOs around the world are getting sticker 0:01:40.880 --> 0:01:43.520 shock about their compute budgets. And there was even a 0:01:43.560 --> 0:01:47.240 headline of like Uber saying like, okay, like fifteen hundred 0:01:47.240 --> 0:01:50.760 dollars max per employee, like don't spend more than 0:01:50.800 --> 0:01:53.000 that in a month on tokens. So like this is 0:01:53.040 --> 0:01:54.360 a very fast moving area. 0:01:54.600 --> 0:01:56.880 Yeah, you're starting to get headlines about, I guess, a 0:01:57.000 --> 0:02:01.400 corporate reckoning with AI as more people experiment and spend money 0:02:01.440 --> 0:02:04.840 on it. The Uber headline that you mentioned, apparently Uber 0:02:04.880 --> 0:02:08.160 burned through its entire twenty twenty six AI budget in 0:02:08.240 --> 0:02:12.160 four months basically, and like what's more important is the 0:02:12.160 --> 0:02:15.639 COO was actually asking whether or not that was worth it, 0:02:15.680 --> 0:02:18.800 like whether they saw productivity gains or whatever as a 0:02:18.800 --> 0:02:22.720 result of that. The other very amusing headline that I saw, 0:02:22.800 --> 0:02:25.679 and it was citing an unnamed source. It's from Axios, 0:02:25.720 --> 0:02:28.480 so you know, oh yeah, not entirely sure it's true, 0:02:28.680 --> 0:02:31.480 but reportedly great headline. It was a great headline. An 0:02:31.520 --> 0:02:34.760 AI consultant told Axios that one of their clients recently 0:02:34.760 --> 0:02:38.120 spent half a billion dollars in a single month after 0:02:38.160 --> 0:02:39.959 failing to put usage limits on this. 0:02:40.080 --> 0:02:43.080 Yeah, it's because there's everyone that's like, oh, I just 0:02:43.120 --> 0:02:45.320 have a simple question.
I want to look up our 0:02:45.320 --> 0:02:48.320 guest's title. I'm going to use the most advanced model 0:02:48.480 --> 0:02:50.720 to do that, et cetera. I have a theory and 0:02:50.760 --> 0:02:53.200 we'll get into this with our guest that one of 0:02:53.240 --> 0:02:55.400 the things that will and we've talked about this with 0:02:55.480 --> 0:02:58.639 Goldman's Marco Argenti, but one of the things I 0:02:58.680 --> 0:03:02.280 predict is that companies are like, clearly you know, they're 0:03:02.280 --> 0:03:04.080 going to keep using it more and more would be 0:03:04.080 --> 0:03:06.520 my guess. But there's probably a lot of investment 0:03:06.600 --> 0:03:09.840 made in sort of like optimal model routing. Because some 0:03:09.919 --> 0:03:12.680 models are like one hundredth per query of what a 0:03:12.680 --> 0:03:15.320 frontier model is, probably a lot of people don't know 0:03:15.360 --> 0:03:18.519 like what is the sort of like efficient frontier model usage, 0:03:18.720 --> 0:03:21.640 and so actually routing the query to the sort of 0:03:21.720 --> 0:03:25.679 most efficient model. I have a feeling we're going to 0:03:25.720 --> 0:03:27.280 see a lot of investment in that area. 0:03:27.120 --> 0:03:29.560 Specifically. Well, there's also just the question of whether or 0:03:29.560 --> 0:03:32.560 not the models get cheaper overall as they advance, right, 0:03:32.600 --> 0:03:34.800 and we have seen some I think Nvidia has a 0:03:34.840 --> 0:03:37.880 new system or chip out or something that is supposed 0:03:37.920 --> 0:03:41.160 to reduce token usage. We can get into that as well.
0:03:41.360 --> 0:03:44.480 And you know, we did that live episode recently with 0:03:44.600 --> 0:03:47.520 Iain Dunning of Hudson River Trading and he said a 0:03:47.520 --> 0:03:49.760 lot of interesting things in that. But one of the 0:03:49.800 --> 0:03:54.160 things he said is that the scarcity is increasingly like 0:03:54.440 --> 0:03:58.560 just the real estate component. Finding a suitable place to 0:03:58.640 --> 0:04:02.280 plug in your GPU, at least from his perspective, right now, 0:04:02.800 --> 0:04:05.320 is as much, if not more so, of a challenge 0:04:05.360 --> 0:04:07.840 than securing GPUs themselves, so like. 0:04:08.040 --> 0:04:10.400 Which is different to what it was like three years ago. 0:04:10.480 --> 0:04:12.320 Yeah, yeah, so just like where you plug it in. 0:04:12.400 --> 0:04:14.680 We know there's all the like the anti data center 0:04:14.720 --> 0:04:16.680 politics out there, so it's like, yeah, we got to 0:04:16.680 --> 0:04:17.279 take the pulse of 0:04:17.279 --> 0:04:20.240 this market. All right, consider this our inference update. 0:04:20.440 --> 0:04:22.480 Yeah, well, I'm really excited to say we really do 0:04:22.560 --> 0:04:25.880 have the perfect guest. Someone we spoke to like truly 0:04:25.920 --> 0:04:28.280 feels like eons ago. I think the first thing 0:04:28.360 --> 0:04:30.680 we ever connected with this company, They've always had a 0:04:30.680 --> 0:04:32.400 lot of chips. But I think the first time we 0:04:32.440 --> 0:04:34.120 ever linked up with this company was still in the 0:04:34.160 --> 0:04:36.680 era where people were excited about Nvidia chips being 0:04:36.760 --> 0:04:39.560 used for like crypto mining and stuff like that.
But we 0:04:39.560 --> 0:04:41.599 are now in this very different era and this is 0:04:41.640 --> 0:04:44.200 truly like one of the companies of the moment, and 0:04:44.279 --> 0:04:46.360 that is, of course, CoreWeave, one of the so 0:04:46.440 --> 0:04:51.000 called neoclouds, offering both training and inference services for 0:04:51.040 --> 0:04:54.120 all sorts of different AI workloads. I'm very excited to 0:04:54.160 --> 0:04:57.560 say back on the show, we have Brannin McBee, CoreWeave's 0:04:57.560 --> 0:05:00.960 co-founder and chief development officer. So thank you so 0:05:01.040 --> 0:05:02.440 much for coming on Odd Lots. 0:05:02.880 --> 0:05:06.920 Appreciate being invited back, guys, and that was a fantastic intro. 0:05:07.000 --> 0:05:08.559 We look forward to hitting these topics today. 0:05:08.720 --> 0:05:11.440 All right, here's my question. So we know that like 0:05:11.640 --> 0:05:14.400 at the tail end of last year and then in 0:05:14.440 --> 0:05:17.799 the first quarter of this year, everyone started using 0:05:17.880 --> 0:05:22.320 Claude Code and just there's clearly a key inflection moment 0:05:22.720 --> 0:05:25.920 for sort of like overall AI demand. And then we 0:05:25.960 --> 0:05:29.719 get into Q two and suddenly the CFO is, oh 0:05:29.800 --> 0:05:32.800 my gosh, we're spending this much on inference. We got 0:05:32.800 --> 0:05:36.600 to like figure things out. Just straight up, like in 0:05:36.720 --> 0:05:41.359 the last month, whatever, do you see any signs of 0:05:41.400 --> 0:05:44.839 that happening yet, of these companies which are all like 0:05:45.440 --> 0:05:51.200 still AI eager, AI adopters, trying to get a little 0:05:51.200 --> 0:05:55.159 bit of a handle and maybe slowing the rate of 0:05:55.200 --> 0:05:57.320 the rate of growth. Is that happening yet? 0:05:57.680 --> 0:06:01.360 Yeah, I think you see headlines there that there 0:06:01.400 --> 0:06:06.640 are surprises of spend, et cetera.
I'd say our interpretation 0:06:07.000 --> 0:06:12.280 of it is entirely, look at the authentic and foundational 0:06:12.360 --> 0:06:14.680 demand that is out there, right? Like, all we're really 0:06:14.680 --> 0:06:19.119 doing is talking about how much consumption there is of 0:06:19.400 --> 0:06:22.080 AI and use for it. And I think that that 0:06:22.200 --> 0:06:24.880 was a real question in the market twelve, eighteen, twenty 0:06:24.920 --> 0:06:27.840 four months ago, is will there be demand for AI? Where 0:06:27.920 --> 0:06:30.640 is this inference demand that everyone's been talking about? And 0:06:31.240 --> 0:06:34.880 I think you're absolutely correct. January or so, with this 0:06:35.080 --> 0:06:38.039 kind of like next group of models that were coming out, 0:06:38.720 --> 0:06:41.880 everyone all of a sudden and all at once said 0:06:42.279 --> 0:06:44.279 this is what we've needed, like this is the real 0:06:44.360 --> 0:06:47.640 product breakthrough. But I think it's worth keeping in mind that that 0:06:47.760 --> 0:06:51.680 product breakthrough was like for a limited set of people 0:06:51.960 --> 0:06:54.159 at the end of the day, right, we're talking like 0:06:54.520 --> 0:06:59.280 coding professionals, finance professionals, but it's a relatively small group 0:06:59.320 --> 0:07:03.679 of people that are using infrastructure at this enormous scale. 0:07:03.720 --> 0:07:07.080 And so where we see this moving towards next is 0:07:07.240 --> 0:07:11.480 broader enterprise use, like likely not seeing this whole token 0:07:11.600 --> 0:07:16.840 maxxing approach, and I think that that is unsustainable. But 0:07:17.560 --> 0:07:20.560 do we see adoption in other sectors and how this 0:07:20.640 --> 0:07:24.200 can continue to spread out? Absolutely.
I mean, you know, 0:07:24.520 --> 0:07:27.920 on our end, I think we have ten over one 0:07:27.960 --> 0:07:33.840 billion dollar clients at this point, and our financial services 0:07:33.880 --> 0:07:37.840 client backlog is into tens of billions of dollars at 0:07:37.840 --> 0:07:41.040 this point. And so we're now talking about things outside 0:07:41.040 --> 0:07:45.080 of AI labs, outside of hyperscalers. And look, as you 0:07:45.120 --> 0:07:51.080 guys know, we support nine of the top ten AI 0:07:51.200 --> 0:07:54.080 labs on the planet, if you exclude China and 0:07:54.120 --> 0:07:57.280 everything that's going on over there. Like, we have a 0:07:57.360 --> 0:08:01.000 lot of visibility into what people are doing, and we're 0:08:01.040 --> 0:08:05.720 not seeing any pullback on what they're doing on inference today. 0:08:05.720 --> 0:08:10.440 If anything, it just remains this unrelenting demand for access 0:08:10.480 --> 0:08:15.360 to the best technology solution in the market for running 0:08:15.520 --> 0:08:19.280 artificial intelligence, and that's CoreWeave's solution in the market. 0:08:19.640 --> 0:08:22.800 Wait, say more about the customer mix now versus say 0:08:23.320 --> 0:08:26.600 three years ago. So you have hyperscalers, you've got startups, 0:08:26.600 --> 0:08:30.640 you've got various businesses. How has that, I guess, composition 0:08:30.760 --> 0:08:31.760 shifted over time? 0:08:32.120 --> 0:08:38.800 Yeah, it's shifted enormously towards a more diverse customer base. Right. 0:08:38.840 --> 0:08:41.160 We got a lot of flak for this in our IPO. Right, 0:08:41.320 --> 0:08:45.760 people were noting that we only had a handful of 0:08:45.840 --> 0:08:50.040 large clients, that our clients were like just the hyperscalers 0:08:50.080 --> 0:08:53.560 and an AI lab or two. And I think that we 0:08:53.600 --> 0:08:57.840 have made tremendous progress in driving diversification.
So I'd say 0:08:57.920 --> 0:09:01.880 it's broadly across three buckets today. Right. We have hyperscaler 0:09:01.880 --> 0:09:05.440 clients who continue to grow with us. We have AI 0:09:05.600 --> 0:09:09.000 lab clients. As I said, nine of the top ten 0:09:09.120 --> 0:09:13.199 AI labs on the planet choose CoreWeave. And then 0:09:13.200 --> 0:09:17.200 we have this enterprise base. And the enterprise base just 0:09:17.240 --> 0:09:20.440 doesn't grab as many headlines as you would expect because 0:09:20.440 --> 0:09:24.480 it's not these massive, multi billion dollar contracts that are 0:09:24.720 --> 0:09:28.800 being signed. But I think in Q four alone, we 0:09:28.920 --> 0:09:33.600 added twice as many logos to our client base as 0:09:33.640 --> 0:09:37.280 we had ever done versus any previous quarter. Right, And 0:09:37.320 --> 0:09:40.320 that enterprise base is one that's growing so much. And 0:09:40.880 --> 0:09:43.280 there was a point you guys hit on in the 0:09:43.320 --> 0:09:45.880 intro that I think is really worth acknowledging, and it 0:09:46.000 --> 0:09:50.520 was this concept of model routing and the idea that like, 0:09:50.880 --> 0:09:53.960 not everyone needs just the latest model, that different 0:09:54.040 --> 0:09:57.000 types of models can hit different use cases. And 0:09:57.600 --> 0:10:00.000 this is something we've been talking about for a while 0:10:00.480 --> 0:10:03.280 right as it relates to the infrastructure side of things 0:10:03.440 --> 0:10:07.160 as well, right, because you don't need that latest model 0:10:07.200 --> 0:10:10.440 for everything, and accordingly you don't need the latest piece 0:10:10.480 --> 0:10:14.760 of infrastructure to support every single inference or training query 0:10:14.800 --> 0:10:18.680 that's out there.
You can kind of conceptualize this matrix 0:10:18.760 --> 0:10:21.600 of different sizes of workloads mapped to the different sizes 0:10:21.640 --> 0:10:24.320 of GPUs, and all of a sudden that tells you, 0:10:24.679 --> 0:10:28.160 my god, like H one hundreds could last six, seven, 0:10:28.280 --> 0:10:30.960 eight years, A one hundreds are going to last longer. 0:10:31.000 --> 0:10:35.439 And it totally changes the entire conversation around depreciable 0:10:35.520 --> 0:10:39.920 life of infrastructure, as that was a really popular topic 0:10:40.280 --> 0:10:43.400 during twenty twenty five. People were saying like, oh, this 0:10:43.440 --> 0:10:46.680 stuff will last two years, it's worth zero afterwards, and 0:10:46.800 --> 0:10:50.720 like we've never seen any semblance of that. Because of 0:10:50.760 --> 0:10:55.560 the point you guys are accurately making, which is users 0:10:55.600 --> 0:10:59.839 are going to need to find the way to use 0:11:00.120 --> 0:11:03.640 the appropriate model for their prompts, and that'll be solved 0:11:03.640 --> 0:11:06.680 by model routing, to your point, but that just further 0:11:06.800 --> 0:11:11.560 enables this concept that infrastructure is going to be used longer, 0:11:11.640 --> 0:11:15.080 and we see that every day in our portfolio, extending 0:11:15.080 --> 0:11:16.880 all the way back to A one hundreds. 0:11:17.080 --> 0:11:19.480 I just want to ask a specific question about the 0:11:19.520 --> 0:11:22.760 broadening out of the customer base. And you mentioned, for example, 0:11:23.120 --> 0:11:26.960 financial services clients. When you talk about, say a financial 0:11:27.000 --> 0:11:31.960 services client as being a distinct client from one of the 0:11:32.000 --> 0:11:34.960 major AI labs, does that mean, well, you're saying so 0:11:35.000 --> 0:11:37.200 it's like I'm just making it up. Let's just say 0:11:37.440 --> 0:11:39.720 I don't know if these relationships exist.
Let's say 0:11:39.760 --> 0:11:45.680 Citigroup has an enterprise license with Anthropic. Does 0:11:45.720 --> 0:11:48.720 that count as Anthropic as a customer or Citi as 0:11:48.760 --> 0:11:50.920 a customer? And when you talk about this broadening out, 0:11:51.240 --> 0:11:55.280 are there essentially more types of entities who are building 0:11:55.400 --> 0:11:59.559 some type of model, not necessarily an LLM per se, 0:11:59.840 --> 0:12:03.480 but it's some type of internal house specific model from 0:12:03.559 --> 0:12:05.040 which they want to run inference? 0:12:05.320 --> 0:12:08.480 It's a great question. In the scenario you presented, Anthropic would 0:12:08.480 --> 0:12:09.040 be our client. 0:12:09.040 --> 0:12:09.839 Okay, got it. 0:12:09.920 --> 0:12:11.880 So what I'm highlighting, and I want to correct a number 0:12:11.880 --> 0:12:15.240 I said earlier, are financial services clients, and this is 0:12:15.400 --> 0:12:21.079 direct to those financial services. They're approaching ten billion in backlog. Okay, 0:12:21.120 --> 0:12:23.080 so this would be you know, a good example of this, 0:12:23.120 --> 0:12:25.600 and that's one we made recently, is with Jane Street. Okay, right. 0:12:25.640 --> 0:12:28.160 That's not Jane Street coming through OpenAI or Anthropic 0:12:28.280 --> 0:12:31.280 to get to us. That is Jane Street coming directly 0:12:31.400 --> 0:12:34.200 to us and using our platform, and that for. 0:12:34.280 --> 0:12:39.319 A model that they're building. So it's a Jane Street. No, no, no, 0:12:39.400 --> 0:12:41.640 I'm not saying it's setting inside training, but it would 0:12:41.640 --> 0:12:46.080 be inference of a model that it's the Jane Street's 0:12:46.200 --> 0:12:50.920 model of something rather than Jane Street's contract and enterprise 0:12:51.000 --> 0:12:53.000 relationship with one of the major labs.
0:12:53.160 --> 0:12:55.800 At the end of the day, we don't know what 0:12:55.960 --> 0:13:00.319 exact workloads these entities are running, especially for entities 0:13:00.320 --> 0:13:03.320 like Jane Street, I would imagine that's highly secretive. Yeah. But 0:13:03.440 --> 0:13:06.439 the point I would say is more that this is 0:13:06.480 --> 0:13:10.040 not them coming through an AI lab to us. They are 0:13:10.440 --> 0:13:15.319 interfacing with and managing the infrastructure directly on our platform. 0:13:15.440 --> 0:13:19.600 And that's a really important distinction as we grow this 0:13:19.720 --> 0:13:23.560 diversified client base. And again, I think that we've 0:13:23.600 --> 0:13:26.720 just done a wonderful job of executing 0:13:26.400 --> 0:13:45.040 that over the past. As you've talked about, including in 0:13:45.160 --> 0:13:47.520 earnings releases, and as you can just tell from these 0:13:47.559 --> 0:13:52.360 huge token budgets, inference demand is booming, but model training 0:13:52.559 --> 0:13:56.480 is still important. But in addition to model training, to say, okay, 0:13:57.000 --> 0:14:00.280 relative, if you have a pie chart, the part that's 0:14:00.360 --> 0:14:03.800 inference is getting bigger. But I assume the training is 0:14:03.840 --> 0:14:07.680 also growing as well. But I'm curious from the perspective 0:14:07.760 --> 0:14:12.600 of like say, the AI labs when they think about growth, 0:14:13.000 --> 0:14:16.440 has there been a subtle shift from investing to push 0:14:16.520 --> 0:14:19.760 the pure model frontier, having the absolute best state of 0:14:19.800 --> 0:14:24.520 the art model, versus investing in, say, better harnesses?
Because 0:14:24.520 --> 0:14:27.280 a big reason we're excited, and all talking about AI 0:14:27.520 --> 0:14:31.560 right now, is really the excitement that happened over with 0:14:31.680 --> 0:14:34.920 Claude Code in the final quarter of twenty twenty five, 0:14:35.200 --> 0:14:38.680 and it's like, oh, this harness has really unlocked a 0:14:38.680 --> 0:14:41.680 bunch of capabilities. Has there been a shift in investment 0:14:41.720 --> 0:14:45.960 from rather than just the purest, most advanced model to 0:14:46.560 --> 0:14:50.880 let's invest more in tooling capacity and other things that 0:14:50.960 --> 0:14:54.840 allow companies and clients to get more juice from an 0:14:54.880 --> 0:14:55.680 advanced model? 0:14:55.920 --> 0:15:00.400 I don't think that we're exposed to that decision making 0:15:01.040 --> 0:15:05.320 with the AI labs as counterparties to us. The observation 0:15:05.520 --> 0:15:09.000 I would make in a behavior change for the AI 0:15:09.160 --> 0:15:16.560 labs is they want access to more infrastructure for longer duration, right, 0:15:16.640 --> 0:15:19.560 And I'll qualify it a little bit, which is, a year, 0:15:20.080 --> 0:15:23.760 two years ago, we were signing three year committed contracts. 0:15:23.880 --> 0:15:25.960 The type of contracts we sign are basically like take 0:15:25.960 --> 0:15:29.760 or pay contracts, which is the best way to finance 0:15:29.840 --> 0:15:34.240 the infrastructure that we are building for our clients. Last 0:15:34.320 --> 0:15:37.400 year it was four year contracts. Right. They were saying, 0:15:37.440 --> 0:15:42.440 we want explicit access to Hopper for four years or 0:15:42.480 --> 0:15:46.720 Blackwell for four years. Now they're coming and saying, well, 0:15:46.720 --> 0:15:49.000 actually we want it for five years. We don't want 0:15:49.120 --> 0:15:52.880 any interruption of use.
We'll commit to the exact same 0:15:52.920 --> 0:15:56.440 economics throughout the full duration of the contract. You can't 0:15:56.480 --> 0:15:59.080 upgrade or change the infrastructure within it. You cannot cancel 0:15:59.120 --> 0:16:02.240 the contract. It's for five years, and they want it 0:16:02.280 --> 0:16:06.280 at more scale. Right. The deployments are getting larger and larger, 0:16:06.440 --> 0:16:10.800 so that's probably the best characterization we can offer on 0:16:11.000 --> 0:16:13.960 decision making that AI labs are going through right now 0:16:14.000 --> 0:16:17.200 as they look from an infrastructure perspective. It absolutely seems 0:16:17.280 --> 0:16:21.800 like tooling is important, but scaling laws are still holding, right, 0:16:21.920 --> 0:16:27.040 Like your ability to advance your frontier model through accessing 0:16:27.160 --> 0:16:30.960 more infrastructure at scale holds, and that will hold through 0:16:31.120 --> 0:16:36.240 Vera Rubin, we expect, and seemingly it's not stopping anytime soon. 0:16:36.400 --> 0:16:38.480 Oh yeah, what's the deal with Vera Rubin? Can you 0:16:38.480 --> 0:16:39.320 explain that to us? 0:16:40.720 --> 0:16:41.680 Which aspect of it? 0:16:41.880 --> 0:16:42.480 What is it? 0:16:42.640 --> 0:16:42.840 Oh? 0:16:42.960 --> 0:16:45.040 Yes, basically yeah. 0:16:44.800 --> 0:16:48.120 So it's just Nvidia's next architecture that's coming out, right, 0:16:48.160 --> 0:16:51.920 Like the current architecture that we're deploying today is Blackwell. 0:16:52.200 --> 0:16:56.640 Blackwell comes, we deploy predominantly in an NVL seventy two configuration, 0:16:56.760 --> 0:17:00.680 which was an entire architecture change from the, right, if 0:17:00.680 --> 0:17:04.879 you recall, Hopper came before Blackwell. Hopper, you could deploy 0:17:05.000 --> 0:17:07.880 these forty two U racks, which was typically like eight 0:17:07.960 --> 0:17:10.520 GPUs in a server case.
You would take it, plug 0:17:10.560 --> 0:17:14.480 it in, largely air cooled as well. Right, we ran 0:17:14.560 --> 0:17:17.919 some liquid cooling just so we understood the requirements of 0:17:17.960 --> 0:17:22.800 liquid cooling, because Blackwell for our deployments is overwhelmingly liquid 0:17:22.840 --> 0:17:27.480 cooled in its deployment configuration, and instead of eight GPUs 0:17:27.520 --> 0:17:30.879 in a forty two U configuration, it's in this larger 0:17:30.920 --> 0:17:35.800 seventy two GPU rack. It's like an entire chassis that's 0:17:35.840 --> 0:17:39.840 being brought in and it just looks entirely different in 0:17:39.880 --> 0:17:42.479 the data center. It's like this giant tower thing that 0:17:42.560 --> 0:17:47.120 you've seen in pictures floating around on X. So Vera 0:17:47.200 --> 0:17:53.560 Rubin will be the next architecture that comes out, and 0:17:53.840 --> 0:17:57.480 we've started receiving testing racks for Vera Rubin. 0:17:57.560 --> 0:18:01.840 But the basic idea is like the new configuration makes 0:18:01.880 --> 0:18:05.760 the whole system more efficient, like more tokens per energy 0:18:05.880 --> 0:18:06.920 use and that sort of thing. 0:18:07.480 --> 0:18:10.560 Yes, yeah, I think that's kind of where you're getting 0:18:10.560 --> 0:18:13.320 to with it. But that doesn't necessarily mean, going back 0:18:13.320 --> 0:18:16.520 to the point earlier, that everyone only wants the latest 0:18:16.520 --> 0:18:23.160 generation of GPU. Right, we have massive demand for Ampere, Hopper, Blackwell, 0:18:23.280 --> 0:18:28.120 et cetera. And it just varies by use case, model, 0:18:28.800 --> 0:18:31.120 and type of client as well.
Like I would qualify 0:18:31.440 --> 0:18:35.119 that AI labs are probably the ones who are lining 0:18:35.200 --> 0:18:40.359 up first to secure access to the latest generation GPUs, 0:18:40.920 --> 0:18:46.720 whereas enterprise clients might be probably very focused on current generation, 0:18:47.400 --> 0:18:50.359 right, like Hopper and Blackwell. 0:18:50.040 --> 0:18:52.760 Right now, I'm going to be honest for a second. 0:18:52.880 --> 0:18:54.720 You know, I try to keep up on a lot 0:18:54.760 --> 0:18:57.480 of things AI related, I really do, and every single 0:18:57.520 --> 0:19:00.320 day the ones that I got to do not keep 0:19:00.520 --> 0:19:03.320 like in my mind. If you asked me, like I 0:19:03.440 --> 0:19:05.560 liked it in the old days, when it was like 0:19:05.960 --> 0:19:08.320 one eighty six, two eighty six, three eighty six, four 0:19:08.359 --> 0:19:11.440 eighty six, Pentium and then like Pentium two, et cetera. 0:19:11.840 --> 0:19:14.159 There was just this numerical sequence that I could keep 0:19:14.200 --> 0:19:16.359 track of in my head. And so if someone asked 0:19:16.359 --> 0:19:19.879 me like Joe, like Vera Rubin, Hopper, Blackwell, what was 0:19:19.960 --> 0:19:22.600 the sequence, I'd be like, I gotta be honest with you. 0:19:22.640 --> 0:19:26.359 I like, don't exactly remember, and I will prioritize that 0:19:26.400 --> 0:19:30.760 at some point. But speaking of silicon, so yesterday Microsoft 0:19:30.800 --> 0:19:32.560 came out with a big, they're really, they want to 0:19:32.600 --> 0:19:34.119 be in the game too. They don't want to just 0:19:34.160 --> 0:19:37.720 be connected to the labs. They want to have advanced models too, 0:19:37.760 --> 0:19:40.520 and apparently it's a good model. And they announced the 0:19:40.880 --> 0:19:45.040 MAI Thinking one model, but they said it's optimized on 0:19:45.200 --> 0:19:47.800 the Maia two hundred chip, which is their own chip.
0:19:48.280 --> 0:19:51.000 And this is a thing which is even, again going 0:19:51.040 --> 0:19:53.720 back to our recent conversation we had, even a place 0:19:53.760 --> 0:19:57.760 like Hudson River Trading is thinking about getting into the 0:19:57.800 --> 0:20:02.760 customized hardware game. How much juice for the squeeze is 0:20:02.840 --> 0:20:07.440 there of aligning the model with custom silicon, from your 0:20:07.600 --> 0:20:08.800 vantage point? 0:20:09.080 --> 0:20:11.680 What we could offer is what we hear from our clients. Yeah, 0:20:12.040 --> 0:20:15.760 on that. And it's important to keep in mind we 0:20:15.840 --> 0:20:19.600 can run any type of silicon on our platform. Right. 0:20:19.800 --> 0:20:23.000 We are entirely customer led in what we build. Like, 0:20:23.040 --> 0:20:26.879 we don't go commit to CAPEX and speculatively hope people 0:20:26.920 --> 0:20:30.040 come and use infrastructure, right? Like, we wait until a 0:20:30.119 --> 0:20:33.280 client says, we want you to go do this specific build, 0:20:33.359 --> 0:20:35.600 here's what we want it to look like, and then 0:20:35.640 --> 0:20:37.399 we go commit to that CAPEX, right, more like a 0:20:37.400 --> 0:20:42.679 success based CAPEX approach. And the client isn't asking for 0:20:42.840 --> 0:20:47.800 anything but Nvidia infrastructure. And I think a large 0:20:47.840 --> 0:20:50.840 contributor to that is, I mean, they built this incredible 0:20:50.880 --> 0:20:55.280 ecosystem around their chipset. They have been dedicated to that 0:20:55.359 --> 0:20:57.880 for I think over fifteen years at this point, through 0:20:57.960 --> 0:21:02.600 the CUDA architecture. And Nvidia, from what we hear 0:21:02.640 --> 0:21:07.160 from our clients, that platform just remains the most efficient, 0:21:07.200 --> 0:21:11.719 the most scalable, the most reliable set of infrastructure that 0:21:11.800 --> 0:21:16.800 is in the market.
Right, so I think others, there's 0:21:16.880 --> 0:21:19.000 always been, you think over the past few years, right, 0:21:19.000 --> 0:21:21.200 there's always been talk like what it is, but yeah, 0:21:21.240 --> 0:21:24.320 this other again and these other chips, and at the 0:21:24.400 --> 0:21:27.879 end of the day, like people are still using 0:21:28.000 --> 0:21:32.280 Nvidia infrastructure, they're committing to Nvidia infrastructure for five 0:21:32.440 --> 0:21:37.840 plus year contracts in these billion, multi billion dollar commitments 0:21:38.600 --> 0:21:40.520 because they know that that is going to be a 0:21:40.600 --> 0:21:44.760 critical part of how they scale their business. We really 0:21:44.760 --> 0:21:49.439 don't see demand on a material basis for anything but 0:21:49.880 --> 0:21:53.240 that Nvidia compute, and that's what we are building today. 0:21:53.640 --> 0:21:56.400 Obviously, just to push back on this a little bit, 0:21:56.520 --> 0:21:59.040 and I'm not really in any position to push back, 0:21:59.080 --> 0:22:02.240 I can only relate what past guests have said and 0:22:02.440 --> 0:22:05.760 my own reading. So what one of our guests said 0:22:05.920 --> 0:22:11.359 is that absolutely Nvidia has the lock on model training, 0:22:11.400 --> 0:22:14.359 that if you want to train a model, that yes, 0:22:14.400 --> 0:22:16.399 Nvidia chips are the only game in town, but 0:22:16.400 --> 0:22:19.879 that for inference there really, his view, this is Iain 0:22:20.040 --> 0:22:23.080 Dunning again, his view is there really were options. And then 0:22:23.119 --> 0:22:25.080 of course we had someone who was much more biased.
0:22:25.119 --> 0:22:27.960 We interviewed the CEO of Cerebras, the company that 0:22:27.960 --> 0:22:32.119 makes the gigantic plate and, or sorry, the gigantic pichanic, 0:22:32.480 --> 0:22:34.560 and of course he did, but I mean, of course 0:22:34.640 --> 0:22:37.320 he was gonna say, yeah, the CUDA moat is vastly 0:22:37.359 --> 0:22:41.119 overrated for inference. It barely exists now. Of course, of 0:22:41.119 --> 0:22:43.240 course he's going to say that, so, like, you know, 0:22:43.320 --> 0:22:45.960 he's a competitor, but we've also heard it from 0:22:46.000 --> 0:22:50.080 a user of inference, and intuitively it makes sense, like 0:22:50.200 --> 0:22:54.640 training is very complicated, all that stuff. But what you're 0:22:54.680 --> 0:22:58.320 saying is that from the customer standpoint, you see the 0:22:58.560 --> 0:23:02.360 demand for Nvidia on both the training and the 0:23:02.400 --> 0:23:06.719 inference as being steady, and that you perceive that advantage 0:23:06.760 --> 0:23:09.120 to be consistent through both aspects. 0:23:09.280 --> 0:23:12.959 So I believe in our last quarterly report, our CEO, 0:23:13.119 --> 0:23:18.639 Mike, qualified that inference workloads represent well in excess of 0:23:18.800 --> 0:23:23.800 fifty percent of infrastructure utilization on our platform. Exact same 0:23:23.800 --> 0:23:26.480 infrastructure we use for training. Yeah. Right, going 0:23:26.480 --> 0:23:28.680 back to my comment of, like, it's very fungible between those 0:23:28.760 --> 0:23:33.920 different types of workloads. Those customers are choosing Nvidia 0:23:33.960 --> 0:23:37.240