WEBVTT - Why Cerebras CEO Andrew Feldman Built The World's Largest Computer Chip

0:00:02.720 --> 0:00:13.960
<v Speaker 1>Bloomberg Audio Studios, Podcasts, Radio News.

0:00:18.600 --> 0:00:22.239
<v Speaker 2>Hello and welcome to another episode of the Odd Lots podcast.

0:00:22.360 --> 0:00:24.840
<v Speaker 3>I'm Joe Weisenthal and I'm Tracy Alloway.

0:00:25.040 --> 0:00:29.600
<v Speaker 2>Tracy, I have to say, unfortunately, I don't have AI psychosis.

0:00:29.640 --> 0:00:31.040
<v Speaker 4>I'm certain of that. Debatable.

0:00:31.640 --> 0:00:34.240
<v Speaker 2>I'm pretty sure. I'm pretty sure I don't have AI psychosis.

0:00:34.440 --> 0:00:39.400
<v Speaker 2>I do have to say, unfortunately, like the amount of

0:00:39.640 --> 0:00:43.320
<v Speaker 2>time now where it's like it feels like AI related

0:00:43.440 --> 0:00:47.360
<v Speaker 2>questions and there's many of them are sort of like

0:00:47.760 --> 0:00:51.000
<v Speaker 2>swallowing up the other thoughts that I have in my head,

0:00:51.240 --> 0:00:55.279
<v Speaker 2>whether it's questions about which model's best and why, and

0:00:55.360 --> 0:00:58.880
<v Speaker 2>what are the economics of inference and how much training

0:00:59.120 --> 0:01:02.200
<v Speaker 2>is pre training versus post training for each model.

0:01:02.600 --> 0:01:05.160
<v Speaker 4>Like it's just sort of like this blob that's

0:01:05.200 --> 0:01:07.240
<v Speaker 4>growing, that's taking up more and more of my thoughts.

0:01:07.319 --> 0:01:10.640
<v Speaker 3>What is your definition of AI psychosis? Because one would

0:01:10.800 --> 0:01:14.600
<v Speaker 3>argue that maybe thinking about AI literally all the time

0:01:14.920 --> 0:01:16.240
<v Speaker 3>would be a form of psychosis.

0:01:16.240 --> 0:01:18.240
<v Speaker 2>Well, let's just say, like, I'm not the type who

0:01:18.280 --> 0:01:21.039
<v Speaker 2>thinks that, Like, I don't like think that the AI

0:01:21.360 --> 0:01:24.240
<v Speaker 2>is a friend, for one thing. I'm not in love

0:01:24.280 --> 0:01:27.280
<v Speaker 2>with the AI models. I don't think that in collaboration

0:01:27.560 --> 0:01:32.160
<v Speaker 2>with ChatGPT, that I'm stumbling on a unified theory of

0:01:32.240 --> 0:01:33.959
<v Speaker 2>physics and things like that.

0:01:34.319 --> 0:01:36.520
<v Speaker 3>So like, but you do spend a lot of time

0:01:37.160 --> 0:01:40.959
<v Speaker 3>inputting instructions, pressing the button, and seeing what comes out.

0:01:41.000 --> 0:01:41.720
<v Speaker 4>See what comes out.

0:01:41.840 --> 0:01:44.039
<v Speaker 2>I'm just saying I think I'm aware that I'm talking

0:01:44.080 --> 0:01:47.920
<v Speaker 2>to a machine and that we're not establishing any great breakthroughs

0:01:48.360 --> 0:01:51.279
<v Speaker 2>of which we are collaborators and partners and friends.

0:01:51.320 --> 0:01:54.080
<v Speaker 3>Recognizing you have a problem is the first step towards

0:01:54.080 --> 0:01:57.760
<v Speaker 3>healing, Joe. Seriously, though, there's a good reason to

0:01:57.840 --> 0:02:01.080
<v Speaker 3>think about AI more and more, which is a huge

0:02:01.160 --> 0:02:03.160
<v Speaker 3>chunk of not just the market, but the real economy

0:02:03.200 --> 0:02:05.840
<v Speaker 3>is now revolving around AI, right? Totally.

0:02:05.880 --> 0:02:09.480
<v Speaker 2>So anyway, again, within the AI conversation, there are a

0:02:09.520 --> 0:02:13.640
<v Speaker 2>lot of subcategories. One of the subcategories happens to be

0:02:13.800 --> 0:02:16.760
<v Speaker 2>another Odd Lots favorite topic, which is chips. Of course,

0:02:16.880 --> 0:02:19.560
<v Speaker 2>chips are used in multiple different ways. The chips are

0:02:19.639 --> 0:02:22.639
<v Speaker 2>used in different parts of the AI supply chain, different

0:02:22.639 --> 0:02:24.640
<v Speaker 2>types of chips have different roles, and so we have

0:02:24.680 --> 0:02:25.200
<v Speaker 2>to learn more.

0:02:25.240 --> 0:02:26.919
<v Speaker 3>We have to learn more, and I have to say

0:02:27.000 --> 0:02:30.280
<v Speaker 3>I'm particularly interested in the company we're about to speak

0:02:30.320 --> 0:02:33.519
<v Speaker 3>to partly because the two things I know about them

0:02:33.600 --> 0:02:37.119
<v Speaker 3>are number one, they just had a huge IPO yep, right,

0:02:37.280 --> 0:02:40.119
<v Speaker 3>raising something like five point five billion dollars at a kind

0:02:40.160 --> 0:02:43.000
<v Speaker 3>of insane multiple. I can't even do a price to

0:02:43.040 --> 0:02:46.440
<v Speaker 3>earnings multiple because they're not profitable yet, but I think

0:02:46.520 --> 0:02:49.600
<v Speaker 3>just on a sales basis, it was like sixty seven

0:02:49.639 --> 0:02:54.520
<v Speaker 3>times forward earnings, which is pretty juicy, pretty hot. And

0:02:54.560 --> 0:02:57.200
<v Speaker 3>the second thing I know about the company is they

0:02:57.200 --> 0:03:01.000
<v Speaker 3>make giant wafers, which is just a fun image

0:03:01.520 --> 0:03:02.080
<v Speaker 3>in your head.

0:03:02.280 --> 0:03:02.720
<v Speaker 5>That's right.

0:03:02.840 --> 0:03:05.919
<v Speaker 2>So if you were thinking it's like, okay, there is

0:03:06.000 --> 0:03:10.120
<v Speaker 2>a hot entrant in this space. What is their differentiator? Well,

0:03:10.160 --> 0:03:13.359
<v Speaker 2>one fact about them is their chips are just enormous.

0:03:13.400 --> 0:03:15.840
<v Speaker 2>About the size of a dinner plate. One might think

0:03:15.880 --> 0:03:18.120
<v Speaker 2>you're reading an Onion article, but in fact it's real

0:03:18.240 --> 0:03:23.160
<v Speaker 2>and apparently it actually has some real technical advantages.

0:03:22.840 --> 0:03:25.240
<v Speaker 3>And it's different than what everyone else is doing.

0:03:25.280 --> 0:03:27.560
<v Speaker 3>So everyone else is, I guess, doing this sort of

0:03:27.600 --> 0:03:30.359
<v Speaker 3>like modular networking thing where you get together a bunch

0:03:30.400 --> 0:03:32.760
<v Speaker 3>of chips and you connect them together and that's how

0:03:32.800 --> 0:03:36.400
<v Speaker 3>you get more compute, more memory, more power basically, But

0:03:36.520 --> 0:03:38.760
<v Speaker 3>this company has done something different in the form of

0:03:39.200 --> 0:03:39.920
<v Speaker 3>the giant wafers.

0:03:40.000 --> 0:03:42.520
<v Speaker 2>The giant wafer, and if you figure that to get

0:03:42.600 --> 0:03:46.400
<v Speaker 2>maximum performance, you sort of want to lessen the distance

0:03:46.480 --> 0:03:48.200
<v Speaker 2>between things, then put it all

0:03:48.080 --> 0:03:48.720
<v Speaker 4>on one wafer.

0:03:48.760 --> 0:03:51.800
<v Speaker 2>Anyway, we're gonna learn a lot more. I'm very sad

0:03:52.680 --> 0:03:53.600
<v Speaker 2>about giant wafers.

0:03:53.640 --> 0:03:53.760
<v Speaker 5>More.

0:03:53.800 --> 0:03:56.040
<v Speaker 2>I'm very excited to say we do have the founder

0:03:56.400 --> 0:03:59.520
<v Speaker 2>and CEO of Cerebras on the podcast, Andrew Feldman.

0:03:59.600 --> 0:04:00.520
<v Speaker 4>Truly the perfect guest.

0:04:00.600 --> 0:04:03.160
<v Speaker 2>So, Andrew, thank you so much for coming on the

0:04:03.200 --> 0:04:04.960
<v Speaker 2>podcast on the week of your IPO.

0:04:05.280 --> 0:04:07.280
<v Speaker 5>Well, thank you so much for having me. What a pleasure.

0:04:07.440 --> 0:04:09.760
<v Speaker 2>Absolutely. Why don't you just start us

0:04:09.600 --> 0:04:11.600
<v Speaker 4>off: the big giant chip.

0:04:11.680 --> 0:04:14.640
<v Speaker 2>They're apparently real. They're as big as a dinner plate.

0:04:14.920 --> 0:04:19.320
<v Speaker 2>What is the technical reason why this actually makes sense

0:04:19.760 --> 0:04:23.359
<v Speaker 2>as a superior form of architecture for at least some

0:04:23.640 --> 0:04:24.520
<v Speaker 2>aspect of AI.

0:04:25.200 --> 0:04:29.680
<v Speaker 6>I think larger chips process more information in less time, okay, and

0:04:29.640 --> 0:04:31.360
<v Speaker 5>that produces faster results.

0:04:32.200 --> 0:04:36.160
<v Speaker 6>And everybody had gone to bigger chips. Nvidia had

0:04:36.200 --> 0:04:40.039
<v Speaker 6>moved from four hundred square millimeters to eight hundred square

0:04:40.080 --> 0:04:41.320
<v Speaker 6>millimeters over

0:04:41.120 --> 0:04:43.440
<v Speaker 5>the course of five or six years for

0:04:43.360 --> 0:04:48.000
<v Speaker 6>this exact reason. And in the compute industry, wafer scale,

0:04:48.040 --> 0:04:50.080
<v Speaker 6>which is building a chip for.

0:04:50.000 --> 0:04:52.520
<v Speaker 2>Those, by the way, for those who are just listening,

0:04:52.600 --> 0:04:54.840
<v Speaker 2>Andrew's now holding up the chip, and yes, it looks

0:04:55.160 --> 0:04:57.760
<v Speaker 2>it actually looks bigger than a dinner plate, to be honest.

0:04:57.800 --> 0:05:01.359
<v Speaker 4>But that is a big, that's a big chip.

0:05:00.640 --> 0:05:03.400
<v Speaker 5>That's fifty, I think

0:05:03.440 --> 0:05:05.719
<v Speaker 6>it's fifty-eight times larger than any other chip that

0:05:05.760 --> 0:05:09.800
<v Speaker 6>had ever been. Wow. And what it did was it

0:05:09.839 --> 0:05:13.560
<v Speaker 6>allowed us to use a different type of memory, a

0:05:13.600 --> 0:05:17.240
<v Speaker 6>type of memory that at the beginning, there are two

0:05:17.279 --> 0:05:19.440
<v Speaker 6>types of memory. There's memory that can store a lot,

0:05:20.080 --> 0:05:23.120
<v Speaker 6>but it's really slow, and there's memory that can't store

0:05:23.279 --> 0:05:27.400
<v Speaker 6>very much per square millimeter, but it's blisteringly fast. And

0:05:27.960 --> 0:05:34.120
<v Speaker 6>historically all graphics processing units use this memory that could

0:05:34.120 --> 0:05:37.200
<v Speaker 6>store a lot but was really slow, and that's the

0:05:37.279 --> 0:05:40.000
<v Speaker 6>reason they do inference so slowly. So if you're using

0:05:40.040 --> 0:05:43.479
<v Speaker 6>Claude right now, or you're using anything but ChatGPT,

0:05:44.000 --> 0:05:47.240
<v Speaker 6>what you frequently feel is you'll enter your prompt and

0:05:47.279 --> 0:05:52.080
<v Speaker 6>you'll wait for an answer, right, And that's because the

0:05:52.200 --> 0:05:54.360
<v Speaker 6>memory is slow and they have to move a ton

0:05:54.360 --> 0:05:58.040
<v Speaker 6>of information from memory to compute. Now, by going to

0:05:58.560 --> 0:06:02.960
<v Speaker 6>wafer scale, we could use this fast memory. Now we couldn't make

0:06:03.000 --> 0:06:07.120
<v Speaker 6>that memory store more information per square millimeter, but we

0:06:07.160 --> 0:06:11.119
<v Speaker 6>could add square millimeters, and so by building this big chip,

0:06:11.400 --> 0:06:14.520
<v Speaker 6>we were able to stuff it to the gills with

0:06:14.560 --> 0:06:18.920
<v Speaker 6>this fast memory. And that's why we're fifteen times faster

0:06:19.000 --> 0:06:23.480
<v Speaker 6>than the fastest GPU. That's why on some problems we're fifty,

0:06:23.520 --> 0:06:27.720
<v Speaker 6>one hundred, even one thousand times faster than graphics processing units.

0:06:28.000 --> 0:06:31.200
<v Speaker 3>Wait, can you explain how you actually managed to do this?

0:06:31.440 --> 0:06:34.080
<v Speaker 3>Because I know there have been previous attempts to do

0:06:34.360 --> 0:06:37.160
<v Speaker 3>wafer scale, and I seem to remember there was even

0:06:37.200 --> 0:06:39.720
<v Speaker 3>like an early attempt in the nineteen eighties or something

0:06:39.800 --> 0:06:42.640
<v Speaker 3>to do it. How are you able to pull this off?

0:06:42.960 --> 0:06:46.080
<v Speaker 6>Yeah, it was an ambitious undertaking, that's for sure. Every

0:06:46.120 --> 0:06:49.360
<v Speaker 6>previous effort in the seventy five year history of our

0:06:49.400 --> 0:06:54.320
<v Speaker 6>industry had failed, including Gene Amdahl, who's sort of on

0:06:54.360 --> 0:06:57.839
<v Speaker 6>the Mount Rushmore of compute in our industry. He failed

0:06:58.200 --> 0:07:00.920
<v Speaker 6>sort of spectacularly in the mid eighties at a company

0:07:01.000 --> 0:07:05.920
<v Speaker 6>called Trilogy. Not only that, but after we succeeded, people

0:07:05.960 --> 0:07:08.560
<v Speaker 6>who had visited us, who'd been in our labs tried

0:07:08.600 --> 0:07:12.040
<v Speaker 6>to copy us, and they also failed. And so what

0:07:12.080 --> 0:07:14.440
<v Speaker 6>we were able to do is solve a set of

0:07:14.680 --> 0:07:18.760
<v Speaker 6>really fundamental problems, and those problems cut across a wide

0:07:18.880 --> 0:07:22.760
<v Speaker 6>swath of technology. They cut across lithography, so we had

0:07:22.800 --> 0:07:25.880
<v Speaker 6>to collaborate closely with TSMC, and they turned out to

0:07:25.880 --> 0:07:29.720
<v Speaker 6>be a great partner. We had to make inventions in

0:07:29.800 --> 0:07:33.240
<v Speaker 6>materials and packaging. That's how you put a processor, or

0:07:33.280 --> 0:07:35.640
<v Speaker 6>how you put a piece of silicon, on a motherboard

0:07:35.840 --> 0:07:39.920
<v Speaker 6>and deliver power and I/O to it. We had to make

0:07:40.000 --> 0:07:44.160
<v Speaker 6>inventions in power delivery. Right when you build a giant chip,

0:07:44.200 --> 0:07:46.480
<v Speaker 6>you're going to deliver way more power to it than

0:07:46.520 --> 0:07:48.960
<v Speaker 6>if you do a chip the size of a postage stamp.

0:07:49.400 --> 0:07:51.640
<v Speaker 6>We had to invent ways to cool it. We had

0:07:51.680 --> 0:07:52.840
<v Speaker 6>to write new types of

0:07:52.800 --> 0:07:54.080
<v Speaker 5>software that ran on it.

0:07:54.640 --> 0:07:57.880
<v Speaker 6>All of these had never been done before, and it

0:07:57.960 --> 0:08:01.760
<v Speaker 6>was a decade long process. It took us five years

0:08:01.800 --> 0:08:04.880
<v Speaker 6>and about five hundred million dollars to deliver the first one,

0:08:05.480 --> 0:08:10.080
<v Speaker 6>and it's been an extraordinary run since. In December, we

0:08:10.200 --> 0:08:13.360
<v Speaker 6>signed a deal with OpenAI north of twenty billion dollars,

0:08:13.400 --> 0:08:16.840
<v Speaker 6>one of the largest contracts ever signed in Silicon Valley,

0:08:17.280 --> 0:08:19.600
<v Speaker 6>and then in March we signed a deal with

0:08:19.680 --> 0:08:24.280
<v Speaker 6>AWS where they would deploy our systems in their data

0:08:24.320 --> 0:08:27.680
<v Speaker 6>centers in their AWS data centers, and so it's just

0:08:27.720 --> 0:08:30.880
<v Speaker 6>been an extraordinary run. But it took a long time.

0:08:31.160 --> 0:08:34.800
<v Speaker 6>It took extraordinary engineering, and there were certainly long periods

0:08:34.840 --> 0:08:37.000
<v Speaker 6>of time when it wasn't clear we were going to

0:08:37.040 --> 0:08:37.679
<v Speaker 6>make this work.

0:08:38.000 --> 0:08:41.400
<v Speaker 2>Obviously you've hit this remarkable milestone. You have in fact

0:08:41.640 --> 0:08:45.959
<v Speaker 2>IPO'd and so forth, and right now the market's valuing your

0:08:45.960 --> 0:08:49.680
<v Speaker 2>company at sixty four billion dollars early days of the IPO.

0:08:50.000 --> 0:08:53.839
<v Speaker 2>Just for the listener to understand, the chips are

0:08:53.840 --> 0:08:57.560
<v Speaker 2>solely in inference as opposed to, you know, in training?

0:08:57.559 --> 0:09:01.000
<v Speaker 2>When we think about AI, I think about, okay, there's training, training

0:09:01.040 --> 0:09:03.360
<v Speaker 2>the model, and then answer-giving, that's the inference.

0:09:03.640 --> 0:09:05.680
<v Speaker 4>Are the chips just for inference?

0:09:05.800 --> 0:09:08.400
<v Speaker 5>So a couple things. I think you framed it exactly right.

0:09:08.520 --> 0:09:11.320
<v Speaker 6>Training is how we make AI, and inference is how

0:09:11.360 --> 0:09:15.800
<v Speaker 6>we use AI. And so what happened was that in

0:09:16.080 --> 0:09:18.000
<v Speaker 6>sort of twenty twenty five, in the first part of

0:09:18.040 --> 0:09:21.480
<v Speaker 6>twenty twenty five, the models we made were smart enough

0:09:21.520 --> 0:09:24.960
<v Speaker 6>to be useful, and there was an explosion of use.

0:09:26.120 --> 0:09:28.160
<v Speaker 6>And we use AI by doing inference. So there was

0:09:28.160 --> 0:09:32.520
<v Speaker 6>this sort of tidal wave of demand on inference, and

0:09:32.559 --> 0:09:34.760
<v Speaker 6>that has continued in twenty twenty six, and we think

0:09:34.800 --> 0:09:38.480
<v Speaker 6>it will continue for years and years to come. And

0:09:38.559 --> 0:09:43.400
<v Speaker 6>so that's what had happened in twenty fifteen. When we

0:09:43.520 --> 0:09:46.920
<v Speaker 6>began thinking about the company. We knew that AI was

0:09:46.920 --> 0:09:49.079
<v Speaker 6>on the horizon and that it would eat a huge amount

0:09:49.120 --> 0:09:54.199
<v Speaker 6>of compute, right, and we made sort of two fundamental bets.

0:09:54.400 --> 0:09:59.719
<v Speaker 6>We bet that it would need dedicated silicon, and right,

0:10:00.000 --> 0:10:02.520
<v Speaker 6>graphics had needed dedicated silicon, that's how you got

0:10:02.360 --> 0:10:03.800
<v Speaker 5>the graphics processing unit.

0:10:04.160 --> 0:10:07.000
<v Speaker 6>Mobile compute had needed dedicated compute.

0:10:07.040 --> 0:10:08.640
<v Speaker 5>That's where you got ARM processors.

0:10:09.320 --> 0:10:11.160
<v Speaker 6>We made that bet, and we made a bet that

0:10:11.800 --> 0:10:15.400
<v Speaker 6>modifying the GPU architecture wouldn't be right. You needed to

0:10:15.440 --> 0:10:18.080
<v Speaker 6>start with a clean sheet of paper. And so what

0:10:18.160 --> 0:10:22.400
<v Speaker 6>we started with was a new vision, and that vision

0:10:22.440 --> 0:10:25.920
<v Speaker 6>could do training and it could do inference, and it

0:10:25.960 --> 0:10:30.280
<v Speaker 6>was orders of magnitude faster at both. But right now

0:10:30.320 --> 0:10:34.200
<v Speaker 6>what we're seeing is such an explosion in demand for

0:10:34.320 --> 0:10:36.680
<v Speaker 6>inference that a lot of the business this minute is inference,

0:10:36.920 --> 0:10:40.920
<v Speaker 6>even though we're just as fast at the same amount

0:10:41.000 --> 0:10:43.160
<v Speaker 6>faster than GPUs on training.

0:10:43.320 --> 0:10:43.959
<v Speaker 4>That's interesting.

0:10:44.000 --> 0:10:46.679
<v Speaker 2>Maybe we'll get more to the theoretical training market a

0:10:46.720 --> 0:10:47.240
<v Speaker 2>little later.

0:10:47.600 --> 0:10:49.199
<v Speaker 4>Just real quick on inference.

0:10:49.320 --> 0:10:52.640
<v Speaker 2>Ben Thompson, who writes a newsletter about tech, he wrote

0:10:52.640 --> 0:10:56.640
<v Speaker 2>a piece in which he distinguishes between answer inference and

0:10:56.720 --> 0:11:02.520
<v Speaker 2>agentic. So answer inference is like, format my resume or whatever,

0:11:02.880 --> 0:11:05.240
<v Speaker 2>or write me an essay on X or Y, or

0:11:05.240 --> 0:11:08.400
<v Speaker 2>answer some questions, and then agentic inference is like, Okay,

0:11:08.440 --> 0:11:11.360
<v Speaker 2>here's this thing that's going to go around

0:11:11.440 --> 0:11:15.440
<v Speaker 2>and do services for you, not producing visual answers.

0:11:15.520 --> 0:11:18.440
<v Speaker 4>Do you distinguish between those two? Is that a real

0:11:18.480 --> 0:11:21.800
<v Speaker 4>divide in your view? And can your chips do both?

0:11:22.120 --> 0:11:25.640
<v Speaker 6>Our chips can do both. I think it is a divide, Okay.

0:11:25.800 --> 0:11:28.000
<v Speaker 6>I think speed

0:11:27.960 --> 0:11:29.280
<v Speaker 5>matters equally in both.

0:11:29.480 --> 0:11:29.800
<v Speaker 4>Okay.

0:11:30.360 --> 0:11:33.800
<v Speaker 6>I think if you are engaged with the AI, if

0:11:33.800 --> 0:11:37.000
<v Speaker 6>you're writing code, which is agentic. If you're writing code

0:11:37.080 --> 0:11:41.040
<v Speaker 6>or you're doing work, nobody wants to wait. I mean,

0:11:41.480 --> 0:11:43.400
<v Speaker 6>we could just turn the question around and say, well,

0:11:43.400 --> 0:11:46.959
<v Speaker 6>how big is the market for slow search? Zero. How

0:11:46.960 --> 0:11:49.240
<v Speaker 6>big is the market for dial-up internet? Zero.

0:11:49.280 --> 0:11:52.680
<v Speaker 5>Why is that? Because nobody wants to wait, right?

0:11:52.760 --> 0:11:56.040
<v Speaker 6>So, if you're engaged with the AI, speed is of

0:11:56.080 --> 0:11:59.760
<v Speaker 6>the essence. But if the AI is doing agentic work

0:12:00.559 --> 0:12:04.480
<v Speaker 6>and your competitor gets three times, five times, ten times

0:12:04.480 --> 0:12:07.520
<v Speaker 6>as much work done in twenty minutes as you do,

0:12:07.800 --> 0:12:11.760
<v Speaker 6>you're gonna get smoked. And so this notion that's somehow

0:12:11.880 --> 0:12:16.520
<v Speaker 6>been proposed, that speed isn't very important in agentic flows,

0:12:16.640 --> 0:12:20.520
<v Speaker 6>is dead wrong. That speed is important in all aspects

0:12:20.559 --> 0:12:24.280
<v Speaker 6>of productive work, and that your ability to get more

0:12:24.360 --> 0:12:29.400
<v Speaker 6>done in less time is a fundamental advantage that accrues

0:12:29.520 --> 0:12:33.680
<v Speaker 6>over time. Right If while your competitor is doing one

0:12:33.800 --> 0:12:37.640
<v Speaker 6>unit of work, you can do three, and in the

0:12:37.679 --> 0:12:41.600
<v Speaker 6>next time they do one unit of work, you do six. Sure, right,

0:12:41.679 --> 0:12:46.280
<v Speaker 6>this adds up over time and you beat them in

0:12:46.360 --> 0:12:49.719
<v Speaker 6>any line of work. And so speed, which is sort

0:12:49.720 --> 0:12:53.079
<v Speaker 6>of our specialty, is important across the board.

0:12:53.400 --> 0:12:56.520
<v Speaker 3>What do giant wafers and speed in general actually mean

0:12:56.679 --> 0:13:00.160
<v Speaker 3>for I guess the economics of tokens, because one way

0:13:00.760 --> 0:13:02.839
<v Speaker 3>I think about it, I have this sort of vision

0:13:02.880 --> 0:13:07.040
<v Speaker 3>in my head, like, Okay, if I'm out shopping for toothpaste,

0:13:07.400 --> 0:13:09.120
<v Speaker 3>I know I need toothpaste every once in a while,

0:13:09.160 --> 0:13:11.000
<v Speaker 3>and I go into like a CVS store, I

0:13:11.040 --> 0:13:13.040
<v Speaker 3>get one thing of toothpaste, and then maybe a week

0:13:13.160 --> 0:13:15.680
<v Speaker 3>later I get some more toothpaste. Or I could go

0:13:15.720 --> 0:13:19.640
<v Speaker 3>to Costco and buy a giant thing of toothpaste and

0:13:19.679 --> 0:13:22.360
<v Speaker 3>take it home, probably at a cheaper cost. And that's

0:13:22.400 --> 0:13:25.200
<v Speaker 3>sort of how I think of the giant wafers. Maybe

0:13:25.200 --> 0:13:28.400
<v Speaker 3>it's a bad analogy, but what does speed actually mean for

0:13:29.080 --> 0:13:30.320
<v Speaker 3>the cost of tokens?

0:13:30.679 --> 0:13:33.840
<v Speaker 6>Well, I think there are a couple observations. I think

0:13:33.880 --> 0:13:38.440
<v Speaker 6>people have chosen so far to price speed a little higher.

0:13:39.600 --> 0:13:45.280
<v Speaker 6>For example, Anthropic offered a premium service in which they

0:13:45.720 --> 0:13:49.640
<v Speaker 6>offered tokens twice as fast and charged six times as much,

0:13:50.200 --> 0:13:53.560
<v Speaker 6>and they sold it out and they couldn't meet the demand. Now,

0:13:53.840 --> 0:13:56.520
<v Speaker 6>just to give you an idea, we're fifteen times faster

0:13:56.880 --> 0:14:01.400
<v Speaker 6>than their twice-as-fast, and so people value speed

0:14:01.600 --> 0:14:04.720
<v Speaker 6>because it allows them to do more work and they

0:14:04.800 --> 0:14:07.920
<v Speaker 6>value their time. And when you can do more work

0:14:07.920 --> 0:14:11.280
<v Speaker 6>in less time, you are making people more productive. That's

0:14:11.320 --> 0:14:13.920
<v Speaker 6>why people have chosen to price them at a premium.

0:14:13.920 --> 0:14:15.400
<v Speaker 5>They don't cost more to make.

0:14:16.640 --> 0:14:21.000
<v Speaker 6>In fact, the GPU architecture is an extremely good

0:14:21.080 --> 0:14:25.960
<v Speaker 6>architecture and extremely efficient at building very slow tokens. And

0:14:26.000 --> 0:14:29.800
<v Speaker 6>if you don't mind slow, the cost per token on

0:14:29.880 --> 0:14:34.080
<v Speaker 6>a GPU is extremely low. But the GPU has a

0:14:34.200 --> 0:14:38.840
<v Speaker 6>characteristic that as you try and go faster, the cost

0:14:38.920 --> 0:14:43.640
<v Speaker 6>and the power used per token increase, sort of like

0:14:43.800 --> 0:14:46.200
<v Speaker 6>as you go faster in your car, your miles per

0:14:46.240 --> 0:14:51.080
<v Speaker 6>gallon decrease. Right, So what happens is as you try

0:14:51.120 --> 0:14:54.080
<v Speaker 6>and get fast enough to be useful, fast enough to

0:14:54.120 --> 0:14:59.400
<v Speaker 6>be interesting, fast enough to keep users intelligence focused on

0:14:59.440 --> 0:15:04.440
<v Speaker 6>this product, they become extremely expensive and extremely power hungry.

0:15:05.280 --> 0:15:08.480
<v Speaker 6>And so the question is not just what people

0:15:08.480 --> 0:15:11.400
<v Speaker 6>are paying for a token, what people are choosing to

0:15:11.480 --> 0:15:13.800
<v Speaker 6>price them at, but what they actually cost to make.

0:15:14.440 --> 0:15:17.080
<v Speaker 5>And GPUs make very

0:15:16.960 --> 0:15:22.040
<v Speaker 6>slow tokens very cheaply, and they're unbelievably expensive at fast tokens.

0:15:22.080 --> 0:15:26.160
<v Speaker 6>We make fast tokens vastly less expensively than the GPUs,

0:15:26.160 --> 0:15:32.760
<v Speaker 6>and we use a tiny fraction of the power.

0:15:43.440 --> 0:15:46.120
<v Speaker 2>Let's say we stipulate that this is all true and

0:15:46.560 --> 0:15:48.800
<v Speaker 2>everyone wants the fastest and everyone's like, you know what,

0:15:49.400 --> 0:15:53.560
<v Speaker 2>this is the solution, the Cerebras technology, one big chip.

0:15:53.920 --> 0:15:58.440
<v Speaker 2>This is really where it's at. How much of your

0:15:58.600 --> 0:16:01.920
<v Speaker 2>market share for the inference market when you look out

0:16:02.000 --> 0:16:05.440
<v Speaker 2>next year, the year after, et cetera, how much is

0:16:05.480 --> 0:16:10.320
<v Speaker 2>your market share going to be dictated by your ability

0:16:10.360 --> 0:16:13.680
<v Speaker 2>to get capacity at TSMC fabs. How much is that

0:16:13.760 --> 0:16:15.360
<v Speaker 2>a gating mechanism for growth?

0:16:15.600 --> 0:16:19.560
<v Speaker 5>You know, TSMC is a huge part of the supply chain. Yeah,

0:16:19.640 --> 0:16:21.480
<v Speaker 5>but we have some real advantages.

0:16:21.840 --> 0:16:25.680
<v Speaker 6>There are three areas right now that are limiting vendors

0:16:25.680 --> 0:16:27.120
<v Speaker 6>in building AI compute.

0:16:28.040 --> 0:16:30.640
<v Speaker 5>Number one is HBM memory.

0:16:31.200 --> 0:16:34.920
<v Speaker 6>It's this memory we described earlier that can store a lot,

0:16:34.960 --> 0:16:39.920
<v Speaker 6>but it's really slow. That's made by three companies, approximately: Samsung, Hynix,

0:16:39.960 --> 0:16:44.640
<v Speaker 6>and Micron, and it's under unbelievable supply pressure. It's extremely

0:16:44.680 --> 0:16:47.960
<v Speaker 6>difficult to get. There are very long lead times. It's unbelievably

0:16:47.960 --> 0:16:51.880
<v Speaker 6>expensive right now. We don't use it. The second part

0:16:51.920 --> 0:16:56.920
<v Speaker 6>that's limiting is a process inside of TSMC called CoWoS,

0:16:57.880 --> 0:17:00.440
<v Speaker 6>and this is the process that Nvidia and other

0:17:00.520 --> 0:17:01.400
<v Speaker 6>GPUs use.

0:17:02.160 --> 0:17:02.880
<v Speaker 5>We don't use it.

0:17:03.680 --> 0:17:07.919
<v Speaker 6>The third thing is that at TSMC, the factory that

0:17:08.040 --> 0:17:11.560
<v Speaker 6>is under most pressure is their three-nanometer factory.

0:17:12.000 --> 0:17:14.000
<v Speaker 5>We don't use it. We use five nanometer.

0:17:14.760 --> 0:17:18.879
<v Speaker 6>So we have managed to avoid some of the most

0:17:18.920 --> 0:17:23.919
<v Speaker 6>binding supply constraints. Now, TSMC still has to give us

0:17:23.920 --> 0:17:27.080
<v Speaker 6>a meaningful allocation, and they've been an extraordinary partner from

0:17:27.080 --> 0:17:30.560
<v Speaker 6>the get go, and they are the greatest manufacturing company

0:17:30.680 --> 0:17:33.600
<v Speaker 6>on earth by far. A fab is sort of a

0:17:33.640 --> 0:17:37.280
<v Speaker 6>modern pyramid. It's an unbelievable thing. And I highly recommend

0:17:37.359 --> 0:17:39.639
<v Speaker 6>you or any of your listeners, if you get

0:17:39.640 --> 0:17:42.040
<v Speaker 6>a chance to go to Taipei, go and see them.

0:17:42.280 --> 0:17:44.240
<v Speaker 5>They are just extraordinary.

0:17:44.280 --> 0:17:47.639
<v Speaker 3>Can you do fab tours? You can't, actually. You can't, but

0:17:48.560 --> 0:17:50.560
<v Speaker 3>you can go and they have a Museum of Innovation

0:17:50.680 --> 0:17:52.680
<v Speaker 3>and it is an extraordinary thing.

0:17:53.280 --> 0:17:56.040
<v Speaker 6>They are the sort of the national champion of Taiwan.

0:17:56.720 --> 0:18:00.680
<v Speaker 6>But I think today TSMC has given us as many

0:18:00.680 --> 0:18:05.280
<v Speaker 6>wafers as we've needed. Business today is constrained by data centers,

0:18:06.080 --> 0:18:09.440
<v Speaker 6>and that's the grand irony, right You invent technology that

0:18:09.520 --> 0:18:13.320
<v Speaker 6>has been unbuildable, never been invented for seventy five years

0:18:13.359 --> 0:18:17.520
<v Speaker 6>in the history of compute. You write software that is extraordinary,

0:18:17.560 --> 0:18:19.720
<v Speaker 6>You build a product that is vastly faster

0:18:19.560 --> 0:18:22.679
<v Speaker 5>than the incumbent. And what are we all constrained by? Buildings.

0:18:24.000 --> 0:18:24.520
<v Speaker 5>All right?

0:18:24.760 --> 0:18:28.040
<v Speaker 6>The data centers right now... everybody's constrained, in the

0:18:28.160 --> 0:18:30.640
<v Speaker 6>entire industry, by powered buildings. So, real estate.

0:18:31.000 --> 0:18:32.600
<v Speaker 5>It is an amazing thing right now.

0:18:33.480 --> 0:18:37.160
<v Speaker 6>And that is true sort of across the board, and

0:18:37.240 --> 0:18:40.760
<v Speaker 6>that will not change for the next fifteen or eighteen

0:18:40.760 --> 0:18:41.440
<v Speaker 6>months for sure.

0:18:41.880 --> 0:18:45.119
<v Speaker 3>I mean, since we're talking physical constraints, I guess I

0:18:45.119 --> 0:18:48.800
<v Speaker 3>should ask you. We did an episode about helium recently,

0:18:49.000 --> 0:18:51.840
<v Speaker 3>a helium shortage given the situation in the Strait of

0:18:51.880 --> 0:18:54.840
<v Speaker 3>Hormuz, and one of the things that helium is

0:18:54.960 --> 0:18:58.920
<v Speaker 3>used for is lithography on semiconductor chips. Has that affected

0:18:58.960 --> 0:19:01.400
<v Speaker 3>you at all, or is that something that you're monitoring?

0:19:01.760 --> 0:19:03.600
<v Speaker 6>We monitor, but there's not a lot we can do,

0:19:04.520 --> 0:19:06.720
<v Speaker 6>and there's plenty of stuff to worry about that we

0:19:06.800 --> 0:19:11.760
<v Speaker 6>can't affect. We obviously are in communication every day with TSMC.

0:19:12.240 --> 0:19:14.320
<v Speaker 5>We're in communication with our entire

0:19:14.240 --> 0:19:17.960
<v Speaker 6>supply chain every single day, and we stay abreast of

0:19:18.640 --> 0:19:23.000
<v Speaker 6>the various issues. But it has had no impact on us,

0:19:23.359 --> 0:19:26.720
<v Speaker 6>and we put that in the bucket of things that

0:19:26.840 --> 0:19:31.200
<v Speaker 6>our manufacturing partners worry about also and that we can't help.

0:19:31.720 --> 0:19:36.320
<v Speaker 2>You know, So, in addition to manufacturing these chips, you

0:19:36.400 --> 0:19:39.320
<v Speaker 2>actually I didn't realize this. You have your own cloud

0:19:39.400 --> 0:19:42.119
<v Speaker 2>(We do.) Or, you have your own cloud services,

0:19:42.400 --> 0:19:45.040
<v Speaker 2>which I have a bunch of questions about that. You

0:19:45.080 --> 0:19:48.600
<v Speaker 2>have your own cloud services through which a user can

0:19:48.680 --> 0:19:52.400
<v Speaker 2>actually get access to various open source models and so forth.

0:19:52.520 --> 0:19:54.520
<v Speaker 2>It looks a little bit sort of visually, it looks

0:19:54.560 --> 0:19:59.000
<v Speaker 2>a lot like the OpenRouter interface, roughly the same environment,

0:19:59.359 --> 0:20:03.320
<v Speaker 2>except, like, open source. Something I'm curious

0:20:03.320 --> 0:20:06.240
<v Speaker 2>about and maybe you could speak to this. You know,

0:20:07.720 --> 0:20:11.600
<v Speaker 2>in traditional open source software, one nice thing about open

0:20:11.640 --> 0:20:13.640
<v Speaker 2>source is you don't have to pay for it, so it's free.

0:20:13.840 --> 0:20:14.600
<v Speaker 5>It's a little bit

0:20:14.480 --> 0:20:16.600
<v Speaker 2>different when we're talking about AI. There's really no such thing

0:20:16.640 --> 0:20:19.439
<v Speaker 2>as free AI software, because even if it's free,

0:20:19.680 --> 0:20:21.760
<v Speaker 2>you still have to pay for the depreciation of the

0:20:21.800 --> 0:20:24.000
<v Speaker 2>chips and you have to pay for the electricity to

0:20:24.119 --> 0:20:26.720
<v Speaker 2>run them. So there's no real such thing as free

0:20:26.800 --> 0:20:29.600
<v Speaker 2>open source AI software. But what I am curious about

0:20:29.720 --> 0:20:34.679
<v Speaker 2>in your experience as a cloud vendor, are the open

0:20:34.760 --> 0:20:39.800
<v Speaker 2>source models cheaper on a per unit of intelligence basis?

0:20:40.040 --> 0:20:44.160
<v Speaker 2>If we had some way of saying levelized cost of intelligence,

0:20:44.160 --> 0:20:46.720
<v Speaker 2>which I don't know if the industry has yet, Are

0:20:46.760 --> 0:20:52.240
<v Speaker 2>open source models cheaper per IQ point whatever we want,

0:20:52.280 --> 0:20:54.600
<v Speaker 2>however we want to measure intelligence.

0:20:54.200 --> 0:20:55.520
<v Speaker 5>Yes, by a lot.

0:20:55.640 --> 0:20:59.240
<v Speaker 6>Really? Yeah, I think in the closed source world you're

0:20:59.280 --> 0:21:01.920
<v Speaker 6>paying a lot for that extra little bit of intelligence.

0:21:02.000 --> 0:21:04.320
<v Speaker 6>Right? The open source models: there are no open source

0:21:04.359 --> 0:21:06.120
<v Speaker 6>models that are as good as.

0:21:06.000 --> 0:21:07.040
<v Speaker 5>The closed source models.

0:21:07.240 --> 0:21:10.800
<v Speaker 6>Okay, think of it as three percent, four percent, five percent

0:21:10.880 --> 0:21:14.320
<v Speaker 6>different. Okay, something in that range, and it could be

0:21:14.359 --> 0:21:16.760
<v Speaker 6>a little more, could be a little less, but the

0:21:16.920 --> 0:21:20.720
<v Speaker 6>cost to you using them. You can jump up right

0:21:20.760 --> 0:21:23.160
<v Speaker 6>now and run Kimi K2. It's a one trillion

0:21:23.200 --> 0:21:26.600
<v Speaker 6>parameter model. It's an open source model. On Cerebras, we're

0:21:26.640 --> 0:21:30.560
<v Speaker 6>ten or fifteen times faster than others. And what you're

0:21:30.560 --> 0:21:35.000
<v Speaker 6>paying for is the cost of our power and some

0:21:35.720 --> 0:21:38.679
<v Speaker 6>cost of the compute that it took to calculate it. What

0:21:38.720 --> 0:21:40.800
<v Speaker 6>you're not paying for was the cost to train it.

0:21:41.520 --> 0:21:43.400
<v Speaker 6>And that's a battle that.

0:21:43.400 --> 0:21:44.520
<v Speaker 5>Is underway in the market.

0:21:45.200 --> 0:21:49.720
<v Speaker 6>You have OpenAI with their coding software, you have

0:21:50.320 --> 0:21:53.840
<v Speaker 6>Anthropic with their coding software. And you've got companies like

0:21:53.960 --> 0:21:58.440
<v Speaker 6>Cursor and Cognition that are using open source. We power

0:21:58.480 --> 0:22:01.800
<v Speaker 6>OpenAI and we power Cognition. You have a battle

0:22:01.880 --> 0:22:06.359
<v Speaker 6>underway between closed source and open source, and I think

0:22:06.400 --> 0:22:09.000
<v Speaker 6>that the winner of that battle is yet to be determined.

0:22:09.359 --> 0:22:13.840
<v Speaker 6>What is clear is that closed source is strictly

0:22:14.040 --> 0:22:18.040
<v Speaker 6>better by a little bit (by how much varies), and

0:22:18.119 --> 0:22:18.920
<v Speaker 6>it's more expensive.

0:22:19.320 --> 0:22:21.919
<v Speaker 3>Yeah, I think we've talked about this before, but like

0:22:21.920 --> 0:22:23.639
<v Speaker 3>I've heard of a lot of big companies in the

0:22:23.760 --> 0:22:26.520
<v Speaker 3>US who have been like very quietly shifting from some

0:22:26.560 --> 0:22:28.960
<v Speaker 3>of the closed source models to the open source models,

0:22:29.000 --> 0:22:33.840
<v Speaker 3>like the Chinese ones, like Kimi and Qwen. I'm

0:22:33.880 --> 0:22:35.320
<v Speaker 3>sorry to press you on this point, but if you

0:22:35.359 --> 0:22:37.959
<v Speaker 3>had to make a bet, like in twenty years, is

0:22:38.000 --> 0:22:40.680
<v Speaker 3>the dominant AI model going to be a cheap

0:22:40.720 --> 0:22:45.159
<v Speaker 3>open source thing or a more expensive, incrementally better closed

0:22:45.160 --> 0:22:45.760
<v Speaker 3>source model?

0:22:46.160 --> 0:22:48.520
<v Speaker 5>I don't think there's going to be one. Right? There's

0:22:48.560 --> 0:22:52.320
<v Speaker 5>not one SaaS software, right? There are some big dogs, right?

0:22:52.160 --> 0:22:55.439
<v Speaker 6>There's Salesforce, there are some other sort of giant players, and

0:22:55.480 --> 0:22:58.359
<v Speaker 6>there are lots of other specialists. I can't think of

0:22:58.440 --> 0:23:02.719
<v Speaker 6>many markets where we've sort of settled on one player.

0:23:02.800 --> 0:23:05.320
<v Speaker 5>Right. If you look at the semiconductor market, you've got

0:23:05.440 --> 0:23:08.200
<v Speaker 5>x86, where you've got two major players

0:23:07.640 --> 0:23:11.560
<v Speaker 6>in AMD and Intel, and then you've got a whole

0:23:11.560 --> 0:23:16.359
<v Speaker 6>adjacent market owned by ARM and the companies that build

0:23:16.440 --> 0:23:19.560
<v Speaker 6>ARM parts, and then you've got custom silicon around that.

0:23:19.840 --> 0:23:21.800
<v Speaker 6>I think that's the way you're going to have this.

0:23:21.880 --> 0:23:23.919
<v Speaker 6>You know, OpenAI is going

0:23:23.960 --> 0:23:27.040
<v Speaker 6>to continue to do extraordinary things. There will be competitors

0:23:27.080 --> 0:23:29.440
<v Speaker 6>to them, and there will be open source. I don't

0:23:29.440 --> 0:23:31.240
<v Speaker 6>think any of those go away.

0:23:31.840 --> 0:23:34.800
<v Speaker 3>Since we're on the topic of software, one of the

0:23:34.840 --> 0:23:37.960
<v Speaker 3>things you often hear when talking about you know, new

0:23:38.119 --> 0:23:42.119
<v Speaker 3>chip entrants going up against Nvidia, is this idea that, well,

0:23:42.160 --> 0:23:44.439
<v Speaker 3>you know, like, Nvidia chips, they're great and all,

0:23:44.560 --> 0:23:48.880
<v Speaker 3>but the real moat of Nvidia's business is CUDA, right, the

0:23:49.680 --> 0:23:53.719
<v Speaker 3>software stack that goes with it. What's your take on that? Like,

0:23:53.840 --> 0:23:57.119
<v Speaker 3>is that a realistic concern for someone who's trying to

0:23:57.119 --> 0:23:59.960
<v Speaker 3>go up against a company as big, and I guess

0:24:00.119 --> 0:24:04.080
<v Speaker 3>as embedded in the software system as Nvidia currently is.

0:24:04.480 --> 0:24:07.200
<v Speaker 6>Nvidia is probably the greatest company in the first

0:24:07.200 --> 0:24:10.080
<v Speaker 6>part of this century, right, you know, Jensen's one of

0:24:10.080 --> 0:24:13.200
<v Speaker 6>the great CEOs of our era, along with Hock Tan

0:24:13.280 --> 0:24:15.479
<v Speaker 6>at Broadcom and maybe Lisa at AMD.

0:24:15.760 --> 0:24:16.800
<v Speaker 5>Just extraordinary.

0:24:17.119 --> 0:24:20.159
<v Speaker 6>And CUDA was really important in the creation of the

0:24:20.200 --> 0:24:25.440
<v Speaker 6>AI landscape, but it's not important now and it has

0:24:25.440 --> 0:24:26.880
<v Speaker 6>no role whatsoever in inference.

0:24:27.320 --> 0:24:30.400
<v Speaker 5>If you want to move from running.

0:24:30.400 --> 0:24:34.880
<v Speaker 6>a model on GPUs today to running it on us,

0:24:35.000 --> 0:24:38.280
<v Speaker 6>we can move it in ten keystrokes: just point

0:24:38.280 --> 0:24:41.959
<v Speaker 6>to our API. So that's the first part. The second

0:24:42.000 --> 0:24:48.200
<v Speaker 6>part is that a year ago, every major frontier lab

0:24:48.280 --> 0:24:53.480
<v Speaker 6>model had been built on a CUDA foundation, and today

0:24:54.400 --> 0:24:58.040
<v Speaker 6>two of three haven't, so they lost seventy percent market share.

0:24:58.440 --> 0:25:04.160
<v Speaker 6>There are three leading frontier models: Gemini, Claude, and GPT.

0:25:05.080 --> 0:25:10.280
<v Speaker 6>Gemini built by Google on TPUs, trained on TPUs, served

0:25:10.320 --> 0:25:11.400
<v Speaker 6>on TPUs.

0:25:11.000 --> 0:25:12.440
<v Speaker 5>No CUDA.

0:25:12.800 --> 0:25:20.560
<v Speaker 6>Anthropic's models: trained on Trainium, no CUDA; served on TPUs,

0:25:21.080 --> 0:25:26.040
<v Speaker 6>on Trainium, and on GPUs. And OpenAI's GPT: trained

0:25:26.080 --> 0:25:30.119
<v Speaker 6>on GPUs in the CUDA environment. So two of the

0:25:30.200 --> 0:25:35.240
<v Speaker 6>three leading models today use no CUDA. That's a hemorrhaging

0:25:35.280 --> 0:25:38.680
<v Speaker 6>of share. And so I think what was true three

0:25:38.840 --> 0:25:41.879
<v Speaker 6>or five years ago, in which CUDA had a dominant

0:25:41.880 --> 0:25:47.280
<v Speaker 6>position and was central, has shrunk significantly: it's not important at

0:25:47.320 --> 0:25:50.679
<v Speaker 6>all in inference and shrinking in its role in training.

0:25:51.000 --> 0:25:54.600
<v Speaker 2>You know, since we're talking about the economics, since we're

0:25:54.600 --> 0:25:57.440
<v Speaker 2>talking about you know, the economics of inference and all

0:25:57.480 --> 0:25:59.600
<v Speaker 2>this stuff, I would love to get your

0:25:59.680 --> 0:26:02.840
<v Speaker 2>take on one of the things. Like, literally in the

0:26:02.880 --> 0:26:06.359
<v Speaker 2>last couple of weeks, there's been this flurry of announcements

0:26:06.400 --> 0:26:11.200
<v Speaker 2>of these attempts to financialize the market for compute and

0:26:11.280 --> 0:26:14.080
<v Speaker 2>so it's like, oh, you're going to buy some capacity,

0:26:14.320 --> 0:26:17.880
<v Speaker 2>the H100 benchmark, et cetera. And people want,

0:26:18.000 --> 0:26:22.640
<v Speaker 2>maybe, theoretically, to hedge it. I'm not entirely convinced. It still

0:26:22.640 --> 0:26:25.480
<v Speaker 2>seems to me like I it's not like maybe. But

0:26:25.600 --> 0:26:29.760
<v Speaker 2>on the other hand, like an inference provider can lock

0:26:29.840 --> 0:26:34.040
<v Speaker 2>in a very long term relationship bilaterally with the data

0:26:34.080 --> 0:26:36.359
<v Speaker 2>center and so forth, and no need for like these

0:26:36.520 --> 0:26:39.679
<v Speaker 2>spot hedging markets. Do you think the market is going

0:26:39.760 --> 0:26:41.760
<v Speaker 2>to evolve in such a way that there will be

0:26:41.840 --> 0:26:48.119
<v Speaker 2>significant demand for financial instruments that allow inference providers to

0:26:48.960 --> 0:26:50.240
<v Speaker 2>hedge their price exposure.

0:26:50.600 --> 0:26:52.440
<v Speaker 5>I don't know. I'm not a financial engineer. That's the

0:26:52.480 --> 0:26:54.159
<v Speaker 5>first thing, okay? But we can look a little bit

0:26:54.160 --> 0:26:55.080
<v Speaker 5>at history.

0:26:55.240 --> 0:27:00.840
<v Speaker 6>The guys at CoreWeave were enormously innovative in how

0:27:00.880 --> 0:27:05.000
<v Speaker 6>to fund some of their massive deployments. They were some

0:27:05.080 --> 0:27:08.320
<v Speaker 6>of the first to use a debt instrument that had

0:27:08.320 --> 0:27:12.560
<v Speaker 6>a backstop with the GPU, and this enabled them to

0:27:13.000 --> 0:27:17.400
<v Speaker 6>really leap out and sort of have first mover advantage

0:27:17.400 --> 0:27:18.399
<v Speaker 6>in the neocloud space.

0:27:19.440 --> 0:27:22.200
<v Speaker 5>And that was an innovation in financial.

0:27:21.800 --> 0:27:26.000
<v Speaker 6>engineering, and extremely creative. Others followed, and now there's a big,

0:27:26.600 --> 0:27:30.560
<v Speaker 6>active debt market in funding the building and the

0:27:30.600 --> 0:27:33.760
<v Speaker 6>fit out of data centers. When you have a market

0:27:33.880 --> 0:27:37.040
<v Speaker 6>that is that big and that active, you have people

0:27:37.080 --> 0:27:40.199
<v Speaker 6>who want to make bets on either side, and I

0:27:40.200 --> 0:27:44.320
<v Speaker 6>think over time those bets normalize and regularize, and you

0:27:44.359 --> 0:27:46.560
<v Speaker 6>can wrap them up and you can make it easy

0:27:46.560 --> 0:27:50.000
<v Speaker 6>to make the bet. When, sort of, Coatue was

0:27:50.080 --> 0:27:53.679
<v Speaker 6>one of the first to loan money against GPUs for

0:27:54.320 --> 0:27:57.760
<v Speaker 6>CoreWeave, this was really innovative. And not only does

0:27:58.080 --> 0:28:01.080
<v Speaker 6>CoreWeave get credit for the creating of the instrument,

0:28:01.080 --> 0:28:02.840
<v Speaker 6>but so does the other side of the deal for

0:28:03.000 --> 0:28:08.399
<v Speaker 6>doing it and making a successful innovative bet. And as

0:28:08.520 --> 0:28:11.800
<v Speaker 6>sort of more and more people jumped in and these

0:28:11.840 --> 0:28:15.359
<v Speaker 6>could be regularized, they could be more easily priced, and

0:28:15.400 --> 0:28:18.920
<v Speaker 6>then once it's regularized and you have a market, then

0:28:19.160 --> 0:28:23.720
<v Speaker 6>derivatives of that market are easy to make, historically. And

0:28:23.800 --> 0:28:26.719
<v Speaker 6>that's sort of the way I see this unfolding: that

0:28:26.920 --> 0:28:32.040
<v Speaker 6>as this market for data centers and compute matures, there'll

0:28:32.080 --> 0:28:37.440
<v Speaker 6>be people making bets on either side, and financial instruments.

0:28:36.920 --> 0:28:38.240
<v Speaker 5>Will be created to do it.

0:28:38.240 --> 0:28:40.320
<v Speaker 6>Whether it's a good idea or not, I have no

0:28:40.360 --> 0:28:41.280
<v Speaker 6>opinion at this time.

0:28:41.680 --> 0:28:44.680
<v Speaker 3>Since we brought up finance, I was looking through the

0:28:44.720 --> 0:28:47.680
<v Speaker 3>IPO filing and looking at some of the actual numbers

0:28:47.720 --> 0:28:50.200
<v Speaker 3>in there, and I know you have the OpenAI

0:28:50.400 --> 0:28:54.280
<v Speaker 3>deal now, but a huge chunk of your revenue comes

0:28:54.280 --> 0:28:57.719
<v Speaker 3>from this company called G42 in Abu Dhabi,

0:28:57.800 --> 0:29:00.440
<v Speaker 3>and I think they're both like your biggest customer and

0:29:00.720 --> 0:29:05.120
<v Speaker 3>also a major investor. What does G42 actually

0:29:05.160 --> 0:29:07.120
<v Speaker 3>do with all these chips?

0:29:07.560 --> 0:29:12.600
<v Speaker 6>Sure. Last year they were a really important chunk of

0:29:12.640 --> 0:29:13.720
<v Speaker 6>our business, a lot of it.

0:29:14.600 --> 0:29:16.520
<v Speaker 5>They're a minority investor.

0:29:17.440 --> 0:29:21.680
<v Speaker 6>They are the national champion, the national AI champion of

0:29:21.760 --> 0:29:25.360
<v Speaker 6>the UAE, and they build a cloud that is used

0:29:25.360 --> 0:29:28.640
<v Speaker 6>across the UAE's ecosystem.

0:29:29.720 --> 0:29:34.160
<v Speaker 5>So it's used by leading universities there. It's used by leading

0:29:33.840 --> 0:29:38.240
<v Speaker 6>companies there, companies like ADNOC, their leading oil company.

0:29:38.840 --> 0:29:43.880
<v Speaker 6>It's used by G42's nine operating companies.

0:29:44.560 --> 0:29:47.000
<v Speaker 5>The deployments to date have been in the US.

0:29:47.920 --> 0:29:52.000
<v Speaker 6>We have data centers, massive data centers, that run

0:29:52.240 --> 0:29:57.720
<v Speaker 6>equipment for G42 here in Santa Clara, but

0:29:57.800 --> 0:30:02.120
<v Speaker 6>also in Minneapolis and Dallas, Texas, soon in Toronto,

0:30:03.080 --> 0:30:07.280
<v Speaker 6>and so they're doing training and they're doing inference. The

0:30:07.320 --> 0:30:10.320
<v Speaker 6>training they're doing, they have pioneered some of the leading

0:30:10.560 --> 0:30:15.040
<v Speaker 6>English-Arabic models. They've done genomic work. They are doing

0:30:15.320 --> 0:30:20.080
<v Speaker 6>serving of models, and they're operating as a cloud, particularly

0:30:20.120 --> 0:30:24.080
<v Speaker 6>for the UAE ecosystem, but also for global companies.

0:30:40.240 --> 0:30:45.040
<v Speaker 2>Do you think that over time, corporate users and perhaps

0:30:45.040 --> 0:30:49.960
<v Speaker 2>individual users, but corporate users will want inference served from

0:30:50.000 --> 0:30:53.440
<v Speaker 2>a company that's separate from the model maker, such that

0:30:53.520 --> 0:30:57.000
<v Speaker 2>they can be certain that they are not revealing and

0:30:57.080 --> 0:30:59.560
<v Speaker 2>thus training the company that might replace them.

0:30:59.600 --> 0:31:00.640
<v Speaker 4>I mean, look.

0:31:00.720 --> 0:31:03.240
<v Speaker 2>Anthropic every couple of days announces some new thing.

0:31:04.120 --> 0:31:06.480
<v Speaker 2>Oh, we have a new markdown file that could do

0:31:06.560 --> 0:31:09.440
<v Speaker 2>this for tax or that could do this for whatever.

0:31:09.480 --> 0:31:10.880
<v Speaker 4>And then a bunch of companies

0:31:10.520 --> 0:31:16.400
<v Speaker 2>fall. Like, are companies that use AI increasingly going

0:31:16.480 --> 0:31:19.720
<v Speaker 2>to want to use data centers and

0:31:19.760 --> 0:31:23.400
<v Speaker 2>inference providers that aren't the model makers themselves?

0:31:23.440 --> 0:31:27.360
<v Speaker 6>Well, first, I think there is a type of professional,

0:31:28.200 --> 0:31:33.360
<v Speaker 6>a type of job that is most directly under threat

0:31:33.520 --> 0:31:37.640
<v Speaker 6>from AI, and they're almost always white collar, and they

0:31:37.720 --> 0:31:43.920
<v Speaker 6>require you to have expertise over a body of knowledge. Right,

0:31:43.960 --> 0:31:47.239
<v Speaker 6>That's what an accountant is, right? You have

0:31:47.320 --> 0:31:51.680
<v Speaker 6>expertise over a body of knowledge of rulings, of previous

0:31:51.720 --> 0:31:56.479
<v Speaker 6>examples of tax case law, et cetera. That's exactly what

0:31:56.600 --> 0:32:00.920
<v Speaker 6>AI is good at right now, exactly. So lawyers, accountants.

0:32:01.160 --> 0:32:05.720
<v Speaker 6>There's sort of these professionals who have stood between sort

0:32:05.760 --> 0:32:09.120
<v Speaker 6>of the ordinary person who doesn't know anything about IRS

0:32:09.160 --> 0:32:14.400
<v Speaker 6>tax rules and the tax rules: that is under threat,

0:32:15.080 --> 0:32:18.360
<v Speaker 6>and that is something that it will be very easy

0:32:19.120 --> 0:32:24.440
<v Speaker 6>for companies like OpenAI and Anthropic to chew through.

0:32:25.400 --> 0:32:31.720
<v Speaker 6>There are other areas like say drug design, genetics, genomics,

0:32:32.720 --> 0:32:37.480
<v Speaker 6>where companies like GlaxoSmithKline have remarkable and

0:32:37.560 --> 0:32:38.640
<v Speaker 5>unique data sets.

0:32:40.280 --> 0:32:42.760
<v Speaker 6>This is true for one of our large customers, Mayo Clinic.

0:32:43.560 --> 0:32:46.640
<v Speaker 6>It's true for GlaxoSmithKline and others of our pharma customers.

0:32:47.200 --> 0:32:48.720
<v Speaker 5>They have unique.

0:32:48.480 --> 0:32:52.800
<v Speaker 6>Data and they will be able to find insight in

0:32:52.880 --> 0:32:55.880
<v Speaker 6>that data, and they will be able to get value

0:32:55.920 --> 0:32:59.440
<v Speaker 6>from that data, and they will certainly.

0:32:59.000 --> 0:33:04.120
<v Speaker 5>Not want to share that data with the foundation.

0:33:03.840 --> 0:33:07.440
<v Speaker 6>Model makers unless they are guaranteed that it will not

0:33:08.280 --> 0:33:12.920
<v Speaker 6>sort of make the general model smarter. And these are

0:33:12.920 --> 0:33:15.480
<v Speaker 6>companies that have spent twenty or thirty years spending tens

0:33:15.480 --> 0:33:19.959
<v Speaker 6>of billions of dollars a year gathering data, right? Patient

0:33:20.000 --> 0:33:24.960
<v Speaker 6>care records or test results for drug design. They're going

0:33:25.040 --> 0:33:29.800
<v Speaker 6>to mine the insight in this work and they're going

0:33:29.840 --> 0:33:33.600
<v Speaker 6>to find extraordinary things, and those are much more

0:33:33.640 --> 0:33:36.719
<v Speaker 6>protected because the insight's in the data and they have

0:33:36.760 --> 0:33:37.240
<v Speaker 6>the data.

0:33:37.680 --> 0:33:41.320
<v Speaker 3>You know, you were talking about fabs in Taiwan earlier,

0:33:41.440 --> 0:33:44.360
<v Speaker 3>and I'm now regretting not going on a fab tour

0:33:44.400 --> 0:33:46.560
<v Speaker 3>when I was in Taipei, but it just didn't cross

0:33:46.600 --> 0:33:50.040
<v Speaker 3>my mind at that time. Next time, Yeah, hopefully. There

0:33:50.040 --> 0:33:53.280
<v Speaker 3>have been various efforts under the CHIPS Act and some

0:33:53.360 --> 0:33:58.800
<v Speaker 3>other industrial policies to try to build more chip making

0:33:59.040 --> 0:34:03.240
<v Speaker 3>capacity in the US. In your view, what's the big

0:34:03.360 --> 0:34:06.280
<v Speaker 3>I guess, impediment to actually doing it? Yeah. A, is

0:34:06.320 --> 0:34:09.319
<v Speaker 3>it happening? And then B why does it seem so

0:34:09.560 --> 0:34:11.280
<v Speaker 3>difficult to actually make happen?

0:34:11.680 --> 0:34:11.839
<v Speaker 5>Right?

0:34:12.239 --> 0:34:15.280
<v Speaker 6>The first thing is, it's difficult because it's a difficult problem.

0:34:15.320 --> 0:34:18.680
<v Speaker 6>They're hard: they cost thirty or forty billion

0:34:18.719 --> 0:34:21.440
<v Speaker 6>dollars and take five or six years to build. So

0:34:22.200 --> 0:34:24.760
<v Speaker 6>that amount of money and that amount of time cuts

0:34:24.800 --> 0:34:30.880
<v Speaker 6>across administrations, right, And that's a problem with the politics

0:34:30.880 --> 0:34:33.560
<v Speaker 6>in the US is it's hard to make policy that's

0:34:33.719 --> 0:34:37.239
<v Speaker 6>durable across administrations and across time.

0:34:37.960 --> 0:34:38.560
<v Speaker 5>That's the first thing.

0:34:39.280 --> 0:34:45.479
<v Speaker 6>The second thing is these are remarkably complicated buildings, and

0:34:46.160 --> 0:34:50.160
<v Speaker 6>we have a sort of a hodgepodge, a sort of

0:34:50.960 --> 0:34:55.920
<v Speaker 6>strange latticework of local and regional building codes

0:34:55.800 --> 0:35:01.560
<v Speaker 5>that a fab maker has to negotiate. Third is, we're trying.

0:35:02.080 --> 0:35:07.560
<v Speaker 6>TSMC has dedicated tens of billions of dollars to their

0:35:07.600 --> 0:35:10.919
<v Speaker 6>fabs in Arizona and have committed hundreds of billions more.

0:35:11.719 --> 0:35:14.760
<v Speaker 6>Samsung has dedicated tens of billions of dollars and committed

0:35:14.840 --> 0:35:17.759
<v Speaker 6>hundreds of billions more to their fabs in Texas. But

0:35:17.880 --> 0:35:21.680
<v Speaker 6>they take a long time, and we have to remain

0:35:21.960 --> 0:35:26.440
<v Speaker 6>committed to building not just the fab, but the surrounding ecosystem,

0:35:26.800 --> 0:35:29.240
<v Speaker 6>not just for three or five years, but for twenty

0:35:29.280 --> 0:35:32.080
<v Speaker 6>years or twenty five years, because you want not just

0:35:32.200 --> 0:35:35.480
<v Speaker 6>one fab, but you want a whole trajectory of fabs.

0:35:36.239 --> 0:35:39.759
<v Speaker 6>You want them working at today's cutting edge, but tomorrow's

0:35:39.840 --> 0:35:42.520
<v Speaker 6>and next year's and in ten years' cutting edge as well.

0:35:43.600 --> 0:35:46.880
<v Speaker 6>And those are things that have proven really challenging in

0:35:46.920 --> 0:35:51.600
<v Speaker 6>the US, and I think we need it. They're strategic assets,

0:35:52.160 --> 0:35:55.560
<v Speaker 6>and I think we need to find ways to collaborate

0:35:55.680 --> 0:35:59.960
<v Speaker 6>with those that have the expertise and to find ways

0:36:00.080 --> 0:36:03.120
<v Speaker 6>to build policy that is durable over a length of

0:36:03.160 --> 0:36:07.240
<v Speaker 6>time that can build a vibrant ecosystem in the fab

0:36:07.280 --> 0:36:09.800
<v Speaker 6>and the associated elements.

0:36:09.680 --> 0:36:13.200
<v Speaker 3>So the other big political economy theme I guess when

0:36:13.200 --> 0:36:15.759
<v Speaker 3>it comes to semiconductors is this idea that they are

0:36:15.760 --> 0:36:19.680
<v Speaker 3>in fact a strategically important technology, and so the US

0:36:19.719 --> 0:36:24.000
<v Speaker 3>should place some limitations on their use abroad. And so

0:36:24.120 --> 0:36:29.680
<v Speaker 3>we've seen things like export controls, export restrictions. You're an

0:36:29.719 --> 0:36:33.000
<v Speaker 3>actual chip company, and so I'm very curious at an

0:36:33.080 --> 0:36:37.440
<v Speaker 3>operating level what your experience of these kind of export

0:36:37.520 --> 0:36:41.680
<v Speaker 3>controls has actually been, Like how much time does that

0:36:41.760 --> 0:36:44.640
<v Speaker 3>take up for you? And then also given that one

0:36:44.680 --> 0:36:48.680
<v Speaker 3>of your biggest customers is an international firm in Abu Dhabi, like,

0:36:48.960 --> 0:36:52.080
<v Speaker 3>how important is the trajectory of those export controls to

0:36:52.120 --> 0:36:52.960
<v Speaker 3>your future business?

0:36:53.440 --> 0:36:55.560
<v Speaker 6>I think, you know, three or four years ago, I

0:36:55.560 --> 0:36:58.160
<v Speaker 6>would have said not important at all. I think today

0:36:58.200 --> 0:37:01.839
<v Speaker 6>they're really important. In the administration, I got to know

0:37:02.120 --> 0:37:05.200
<v Speaker 6>the leadership in the Department of Commerce and in the

0:37:06.200 --> 0:37:10.919
<v Speaker 6>BIS division of Commerce, which oversees the licensing. I think

0:37:10.920 --> 0:37:14.879
<v Speaker 6>this is an extraordinarily difficult job, and we saw really

0:37:15.000 --> 0:37:18.280
<v Speaker 6>hard working, smart people doing a job.

0:37:18.080 --> 0:37:19.600
<v Speaker 5>That is very, very difficult.

0:37:20.080 --> 0:37:22.160
<v Speaker 6>I got to know the people in this administration and

0:37:22.360 --> 0:37:24.480
<v Speaker 6>I found the same. Every single one of them is

0:37:24.560 --> 0:37:26.440
<v Speaker 6>earning a tiny fraction of what they could earn in

0:37:26.440 --> 0:37:29.920
<v Speaker 6>the private sector, and is doing this because they believe

0:37:30.200 --> 0:37:33.520
<v Speaker 6>that this is an important mission. The problem is

0:37:33.560 --> 0:37:35.960
<v Speaker 6>that there are differing views about the right way to

0:37:36.000 --> 0:37:41.080
<v Speaker 6>do this, and there are differing views on the right way.

0:37:40.920 --> 0:37:43.880
<v Speaker 5>To achieve the goal, which is.

0:37:43.920 --> 0:37:49.400
<v Speaker 6>To not give your most precious technology to your industrial enemy.

0:37:49.840 --> 0:37:53.760
<v Speaker 6>And I think we can agree that today, in today's environment,

0:37:54.160 --> 0:37:55.960
<v Speaker 6>China is an industrial enemy.

0:37:56.640 --> 0:37:58.080
<v Speaker 5>Good well meaning people can.

0:37:58.000 --> 0:38:01.520
<v Speaker 6>Disagree on whether the right strategy is to limit them

0:38:01.600 --> 0:38:06.640
<v Speaker 6>from gaining access. Others argue, as those at Nvidia have argued,

0:38:06.760 --> 0:38:08.800
<v Speaker 6>that the right strategy is to give them access

0:38:08.840 --> 0:38:13.240
<v Speaker 6>and to keep them working on our product, on US-made,

0:38:13.239 --> 0:38:16.640
<v Speaker 6>on sort of US-designed product. I come down

0:38:17.040 --> 0:38:19.640
<v Speaker 6>on the other side of that argument. I understand there are

0:38:20.200 --> 0:38:26.839
<v Speaker 6>good arguments in both directions. I think limiting the distribution

0:38:28.160 --> 0:38:33.640
<v Speaker 6>the diffusion of our most precious technologies makes sense, and

0:38:34.000 --> 0:38:36.279
<v Speaker 6>I think we have to do it thoughtfully and we

0:38:36.320 --> 0:38:39.520
<v Speaker 6>have to recognize that means some markets will be foreclosed

0:38:39.520 --> 0:38:42.000
<v Speaker 6>to us, and I'm okay with that.

0:38:42.680 --> 0:38:45.719
<v Speaker 2>Just quickly, on the sort of like current business stuff

0:38:45.760 --> 0:38:49.200
<v Speaker 2>you mentioned the deal with AWS. How does that work?

0:38:49.239 --> 0:38:53.600
<v Speaker 2>Could customers right now, like, could customers of AWS pay

0:38:53.680 --> 0:38:58.360
<v Speaker 2>them to have inference served specifically on one of your chips?

0:38:58.520 --> 0:39:00.719
<v Speaker 5>Not yet, but soon, okay, they will be.

0:39:00.960 --> 0:39:03.720
<v Speaker 6>It will be served in Bedrock, which is their AI

0:39:03.840 --> 0:39:07.839
<v Speaker 6>as a service offering, and they will yes be able

0:39:07.880 --> 0:39:11.200
<v Speaker 6>to go down the clickdown menu and get super fast inference,

0:39:11.360 --> 0:39:15.440
<v Speaker 6>which will be delivered via a combination of what's called

0:39:15.480 --> 0:39:19.680
<v Speaker 6>a disaggregated solution, which is using some Trainium for some

0:39:19.760 --> 0:39:24.960
<v Speaker 6>of the inference work and using the Cerebras technology in

0:39:25.000 --> 0:39:28.080
<v Speaker 6>our systems, called the CS-3, for other parts of

0:39:28.080 --> 0:39:28.480
<v Speaker 6>the work.

0:39:28.640 --> 0:39:31.520
<v Speaker 2>And presumably someone who scrolls down and selects that, they

0:39:31.520 --> 0:39:35.000
<v Speaker 2>would pay some premium for that ultra-fast inference.

0:39:35.080 --> 0:39:36.960
<v Speaker 5>I think they will pay a premium.

0:39:37.040 --> 0:39:40.239
<v Speaker 6>We will see. This is entirely as Amazon wishes to

0:39:40.280 --> 0:39:41.520
<v Speaker 6>price it. It's their product.

0:39:41.840 --> 0:39:45.719
<v Speaker 2>So you IPO'd this week. It's May twenty six. This

0:39:45.800 --> 0:39:48.640
<v Speaker 2>is not the first time that you've tried to or

0:39:48.719 --> 0:39:52.800
<v Speaker 2>look towards going to the IPO market. There were headlines going

0:39:52.880 --> 0:39:56.640
<v Speaker 2>back to twenty twenty four about wanting to try for

0:39:56.680 --> 0:39:59.920
<v Speaker 2>the IPO market, and then there were headlines last year,

0:40:00.239 --> 0:40:03.640
<v Speaker 2>especially because of the relationship with G forty two, about

0:40:03.960 --> 0:40:07.240
<v Speaker 2>CFIUS and some of the national security concerns, and maybe

0:40:07.280 --> 0:40:10.360
<v Speaker 2>that was an issue with the IPO. And then but

0:40:10.440 --> 0:40:14.880
<v Speaker 2>also last September, you got one of your looks like

0:40:15.200 --> 0:40:18.359
<v Speaker 2>G round, G round. One of the participants in the

0:40:18.400 --> 0:40:22.080
<v Speaker 2>G round, investor, was seventeen eighty nine Capital, which is

0:40:22.120 --> 0:40:26.560
<v Speaker 2>of course the firm that's associated with Donald Trump Junior,

0:40:26.640 --> 0:40:29.680
<v Speaker 2>which is a lot of things, and then the IPO happens.

0:40:30.800 --> 0:40:34.960
<v Speaker 2>I'm a cynic, so I wonder if the participation, if

0:40:35.719 --> 0:40:39.560
<v Speaker 2>Donald Trump Junior's investment in your company made it easier

0:40:39.600 --> 0:40:42.280
<v Speaker 2>to get the green light from these national security concerns

0:40:42.280 --> 0:40:42.960
<v Speaker 2>to do an IPO.

0:40:43.560 --> 0:40:46.520
<v Speaker 5>I wish it were that easy. No, it had no

0:40:46.800 --> 0:40:47.319
<v Speaker 5>role at all.

0:40:47.560 --> 0:40:51.959
<v Speaker 6>We resolved all CFIUS issues in March of twenty twenty five.

0:40:52.360 --> 0:40:56.200
<v Speaker 6>I believe that was before we took money from seventeen

0:40:56.280 --> 0:41:00.319
<v Speaker 6>eighty nine. Okay, moreover, I wouldn't ask. That's not who

0:41:00.320 --> 0:41:02.839
<v Speaker 6>I am and that's not the way we roll. So

0:41:03.320 --> 0:41:06.319
<v Speaker 6>we took money because they are a thoughtful venture firm,

0:41:07.239 --> 0:41:10.960
<v Speaker 6>and we don't believe that there's only one point

0:41:10.680 --> 0:41:13.120
<v Speaker 5>of political view. There are lots of political views.

0:41:13.200 --> 0:41:16.200
<v Speaker 6>They all have some merit, they all have some weaknesses, and

0:41:16.280 --> 0:41:20.799
<v Speaker 6>so we have some right leaning investors, we have

0:41:20.880 --> 0:41:23.960
<v Speaker 6>left leaning. The fact that this firm had some right

0:41:24.040 --> 0:41:29.680
<v Speaker 6>leaning in sort of investors. We were looking only at

0:41:29.840 --> 0:41:31.359
<v Speaker 6>their ability to help us

0:41:31.239 --> 0:41:32.640
<v Speaker 5>build an extraordinary company.

0:41:33.280 --> 0:41:36.920
<v Speaker 6>And we have not asked, and we will not. We

0:41:37.000 --> 0:41:40.560
<v Speaker 6>have never asked, nor will we ever ask for political

0:41:40.600 --> 0:41:42.479
<v Speaker 6>access or anything of the kind.

0:41:42.640 --> 0:41:45.600
<v Speaker 3>What's it like to become a billionaire in a single day?

0:41:45.760 --> 0:41:47.600
<v Speaker 3>This is something I assume will never happen to me,

0:41:47.680 --> 0:41:49.239
<v Speaker 3>so I might as well ask you no.

0:41:49.360 --> 0:41:52.759
<v Speaker 6>I think the honest truth is it was a big

0:41:52.840 --> 0:41:56.400
<v Speaker 6>nothing for me. I had some wealth before and have

0:41:56.480 --> 0:41:59.719
<v Speaker 6>some wealth after. Right, I think this is a very

0:41:59.760 --> 0:42:03.880
<v Speaker 6>difficult way to make money, right, being a tech CEO,

0:42:04.280 --> 0:42:05.680
<v Speaker 6>I think what you have to do is you have

0:42:05.719 --> 0:42:08.040
<v Speaker 6>to love the work, you have to love the people,

0:42:08.120 --> 0:42:10.120
<v Speaker 6>and you have to think every day about how to

0:42:10.120 --> 0:42:15.439
<v Speaker 6>make your team rich. And far more important than sort

0:42:15.440 --> 0:42:18.399
<v Speaker 6>of some change in my wealth was we made more

0:42:18.400 --> 0:42:22.359
<v Speaker 6>than eight hundred millionaires. Nice, and that's something I'm proud

0:42:22.440 --> 0:42:26.440
<v Speaker 6>of every minute of every day. And at my last

0:42:26.440 --> 0:42:30.560
<v Speaker 6>company we made a hundred millionaires. And at this company,

0:42:31.160 --> 0:42:34.040
<v Speaker 6>through our IPO, we made more than eight hundred. And

0:42:34.760 --> 0:42:38.000
<v Speaker 6>that's something that you wake up feeling good about yourself

0:42:38.080 --> 0:42:38.879
<v Speaker 5>every single day.

0:42:39.160 --> 0:42:41.200
<v Speaker 3>That was going to be my last question, but actually

0:42:41.200 --> 0:42:44.759
<v Speaker 3>you just reminded me in that answer. You know the

0:42:44.840 --> 0:42:48.160
<v Speaker 3>idea that getting here, I said, you became a billionaire

0:42:48.200 --> 0:42:50.279
<v Speaker 3>in a day, but obviously this was the outcome of

0:42:50.400 --> 0:42:53.200
<v Speaker 3>years and years and years of work. And if we

0:42:53.239 --> 0:42:58.680
<v Speaker 3>think about technological hardware, one of the things most people

0:42:58.680 --> 0:43:03.080
<v Speaker 3>associate with it is really long lead times and really big

0:43:03.239 --> 0:43:07.080
<v Speaker 3>research and development budgets. Now that you're a public company,

0:43:07.920 --> 0:43:11.000
<v Speaker 3>how do you sort of balance that quarter to quarter

0:43:11.280 --> 0:43:14.600
<v Speaker 3>financial performance pressure with the idea that you still need

0:43:14.640 --> 0:43:19.520
<v Speaker 3>to be investing in capex, in new you know, new

0:43:19.600 --> 0:43:23.320
<v Speaker 3>ways of designing chips, new improvements to the existing ones.

0:43:23.840 --> 0:43:28.200
<v Speaker 6>Well, first, we think the opportunity for innovation, based on

0:43:28.200 --> 0:43:31.560
<v Speaker 6>our wafer scale engine, the best work is still

0:43:31.560 --> 0:43:32.120
<v Speaker 6>ahead of us.

0:43:32.480 --> 0:43:34.800
<v Speaker 5>Number one, we see an

0:43:34.640 --> 0:43:38.720
<v Speaker 6>opportunity for extraordinary innovation in the years ahead to make leaps

0:43:38.760 --> 0:43:41.719
<v Speaker 6>every bit as big and often bigger than what we

0:43:41.840 --> 0:43:44.920
<v Speaker 6>made by building the largest chip on earth. When you

0:43:45.000 --> 0:43:48.400
<v Speaker 6>love building hardware, the fact that it takes time is

0:43:48.840 --> 0:43:53.359
<v Speaker 6>part of the deal, right that what we do can't

0:43:53.400 --> 0:43:55.040
<v Speaker 6>be done in a week or a month or a year.

0:43:56.400 --> 0:44:00.240
<v Speaker 6>And that's what you sign up for, and that's true

0:44:00.280 --> 0:44:01.920
<v Speaker 5>in every profession.

0:44:02.040 --> 0:44:06.759
<v Speaker 6>You sign up for the good and the challenging, and

0:44:07.200 --> 0:44:09.640
<v Speaker 6>you have to sort of make peace with that. If

0:44:09.680 --> 0:44:13.319
<v Speaker 6>you're a person that wants to dive in and sort

0:44:13.360 --> 0:44:17.680
<v Speaker 6>of begin iterating right away and fail quickly and code

0:44:17.760 --> 0:44:19.440
<v Speaker 6>up something and look at it and throw it out

0:44:19.440 --> 0:44:22.720
<v Speaker 6>in the market and see if it wins, Godspeed,

0:44:22.760 --> 0:44:26.520
<v Speaker 6>that's great, and that's not for me. You know, in

0:44:26.560 --> 0:44:30.680
<v Speaker 6>our business, we measure twice before we cut once. And

0:44:31.400 --> 0:44:34.200
<v Speaker 6>you have to put that in your soul, and you

0:44:34.239 --> 0:44:37.360
<v Speaker 6>have to like it. You have to like that mistakes

0:44:37.360 --> 0:44:40.440
<v Speaker 6>in our business are really expensive, and you have to

0:44:40.600 --> 0:44:43.000
<v Speaker 6>like the fact that you breathe life

0:44:42.760 --> 0:44:46.080
<v Speaker 5>into a chunk of silicon and you get it to do

0:44:46.000 --> 0:44:48.080
<v Speaker 6>things that nobody else has ever been able to make

0:44:48.120 --> 0:44:51.200
<v Speaker 6>a chunk of silicon do. And if that's for you,

0:44:51.280 --> 0:44:56.160
<v Speaker 6>then this process that takes time and money, you love

0:44:56.200 --> 0:44:59.400
<v Speaker 6>that too. And so I think I would love it

0:44:59.480 --> 0:45:01.480
<v Speaker 6>less if you could do it in a week. And

0:45:01.680 --> 0:45:04.520
<v Speaker 6>I think the people that I love to work with

0:45:04.840 --> 0:45:09.520
<v Speaker 6>they feel the same way. And they like being engineers

0:45:09.520 --> 0:45:11.279
<v Speaker 6>not because it's a path to money. They like being

0:45:11.280 --> 0:45:14.720
<v Speaker 6>engineers because they like building things, and they like building

0:45:14.760 --> 0:45:17.759
<v Speaker 6>hard things. And I like working with them for

0:45:17.880 --> 0:45:18.800
<v Speaker 6>exactly that reason.

0:45:19.000 --> 0:45:21.480
<v Speaker 2>Yeah, you mentioned breathing life into a chunk of silicon.

0:45:21.560 --> 0:45:24.280
<v Speaker 2>My dad, who's a physicist, always likes to point out

0:45:24.600 --> 0:45:27.480
<v Speaker 2>how carbon and silicon are right next to each other

0:45:27.680 --> 0:45:30.359
<v Speaker 2>on the periodic table. They are, and they're sort of like,

0:45:30.640 --> 0:45:33.040
<v Speaker 2>here are the two things that we have closest to life,

0:45:33.080 --> 0:45:34.759
<v Speaker 2>and they're literally touching each other.

0:45:35.160 --> 0:45:36.920
<v Speaker 4>Maybe there's something deep in that.

0:45:37.400 --> 0:45:39.799
<v Speaker 6>I think that's a really thoughtful thing your father said,

0:45:39.920 --> 0:45:44.080
<v Speaker 6>thank you. And I think that's really cool. And nobody

0:45:44.120 --> 0:45:47.560
<v Speaker 6>pointed that out to me. I've stared at periodic tables

0:45:47.640 --> 0:45:50.240
<v Speaker 6>for a long time. But I think to the extent

0:45:50.320 --> 0:45:52.280
<v Speaker 6>we can make artificial life, we need silicon.

0:45:52.480 --> 0:45:53.799
<v Speaker 4>Yeah, and they're right next to each other.

0:45:54.160 --> 0:45:56.640
<v Speaker 6>Right, carbon, carbon is the heart of all other life,

0:45:56.680 --> 0:45:59.759
<v Speaker 6>and artificial life will be founded, at least the

0:46:00.000 --> 0:46:01.920
<v Speaker 6>intelligent part will be founded on silicon.

0:46:02.200 --> 0:46:04.080
<v Speaker 2>Right below silicon is germanium.

0:46:04.239 --> 0:46:06.160
<v Speaker 4>Maybe the next, I don't

0:46:05.960 --> 0:46:07.040
<v Speaker 5>know. What does that mean?

0:46:07.320 --> 0:46:10.520
<v Speaker 2>Let's keep Yeah, let's keep an eye on germanium next. Andrew,

0:46:10.600 --> 0:46:13.440
<v Speaker 2>thank you so much for coming on Odd Lots. Fascinating

0:46:13.440 --> 0:46:16.280
<v Speaker 2>conversation, right in the sweet spot of what we're interested in.

0:46:16.400 --> 0:46:17.680
<v Speaker 4>Really appreciate you taking your time.

0:46:17.800 --> 0:46:20.239
<v Speaker 6>Hey, thank you guys for having me, and I really

0:46:20.280 --> 0:46:20.799
<v Speaker 6>appreciate it.

0:46:20.840 --> 0:46:21.920
<v Speaker 5>Look forward to seeing you again soon.

0:46:34.680 --> 0:46:35.399
<v Speaker 4>That was really fun.

0:46:35.480 --> 0:46:39.440
<v Speaker 2>I'm super interested in this topic and it does feel

0:46:39.640 --> 0:46:44.359
<v Speaker 2>to me like the economics of inference in particular and

0:46:44.520 --> 0:46:47.440
<v Speaker 2>the market for inference capacity, speed.

0:46:47.640 --> 0:46:49.839
<v Speaker 4>Like it's still day one, you know what I'm saying.

0:46:50.320 --> 0:46:51.760
<v Speaker 3>I just like looking at the giant.

0:46:51.880 --> 0:46:54.200
<v Speaker 2>It's so cool. It really does seem like an

0:46:54.200 --> 0:46:55.720
<v Speaker 2>Onion thing, doesn't it?

0:46:55.719 --> 0:46:58.200
<v Speaker 4>It's like, company solved inference

0:46:58.040 --> 0:47:01.000
<v Speaker 3>with a giant, building the biggest. But

0:47:01.040 --> 0:47:01.640
<v Speaker 4>it is interesting.

0:47:01.719 --> 0:47:05.200
<v Speaker 2>We did that episode of course with Ray Wang from

0:47:05.320 --> 0:47:09.359
<v Speaker 2>SemiAnalysis and talking about the role, like, memory as

0:47:09.400 --> 0:47:11.680
<v Speaker 2>being this really important part of the sort of cutting

0:47:11.680 --> 0:47:14.839
<v Speaker 2>edge chipsets, and it's interesting to think it's like, Okay, well,

0:47:14.880 --> 0:47:17.759
<v Speaker 2>here is a bottleneck that doesn't run into that they

0:47:17.800 --> 0:47:20.320
<v Speaker 2>don't have, and the idea that at least as he

0:47:20.480 --> 0:47:24.000
<v Speaker 2>described it, they're not fighting to get the smallest nanometer

0:47:24.320 --> 0:47:27.799
<v Speaker 2>chips and so maybe that gives them a little bit

0:47:27.880 --> 0:47:29.600
<v Speaker 2>of breathing room on capacity there too.

0:47:29.840 --> 0:47:33.000
<v Speaker 3>Yeah, I mean, I do imagine there are some downsides

0:47:33.239 --> 0:47:36.279
<v Speaker 3>to having giant chips, and you know, just as there

0:47:36.280 --> 0:47:39.400
<v Speaker 3>are upsides that Andrew laid out. The other thing I

0:47:39.560 --> 0:47:42.080
<v Speaker 3>was wondering. I know he made the case for the

0:47:42.160 --> 0:47:45.799
<v Speaker 3>reason speed is very important, but like I can also

0:47:46.000 --> 0:47:51.400
<v Speaker 3>imagine a world where maybe it's not that important, you know,

0:47:51.880 --> 0:47:56.160
<v Speaker 3>Like I think at some point, like the incremental speed

0:47:56.360 --> 0:47:59.880
<v Speaker 3>factor just starts to become less important when weighed against

0:48:00.080 --> 0:48:03.600
<v Speaker 3>like the incremental cost of generating that speed.

0:48:03.840 --> 0:48:05.799
<v Speaker 2>I think it really this is like one of those

0:48:05.880 --> 0:48:08.640
<v Speaker 2>things where it probably really depends on what

0:48:08.680 --> 0:48:11.400
<v Speaker 2>you're using it for. Right, So it's like if you're like,

0:48:11.800 --> 0:48:16.640
<v Speaker 2>you know what, I'm really curious why pterodactyls aren't actually dinosaurs?

0:48:16.680 --> 0:48:18.880
<v Speaker 4>Can you explain it to me? Then it's like I

0:48:18.920 --> 0:48:21.000
<v Speaker 4>don't care about that, Like that fraction of a second.

0:48:21.120 --> 0:48:23.600
<v Speaker 3>I would wait five minutes for the chat bought to

0:48:23.640 --> 0:48:24.680
<v Speaker 3>tell you you're wrong, Joe.

0:48:24.880 --> 0:48:27.200
<v Speaker 2>You just you just don't really care that much. But

0:48:27.320 --> 0:48:29.640
<v Speaker 2>if you're doing some sort of, like, agentic coding thing

0:48:30.200 --> 0:48:35.480
<v Speaker 2>or whatever, et cetera, then like, yeah, that definitely adds up.

0:48:35.600 --> 0:48:38.160
<v Speaker 2>And I will say, like, as you use it more

0:48:38.320 --> 0:48:41.600
<v Speaker 2>like, it's just like everything else, the hedonic treadmill

0:48:41.680 --> 0:48:44.680
<v Speaker 2>of expectations. Here's some task that you can do in

0:48:44.760 --> 0:48:48.120
<v Speaker 2>thirty seconds, which maybe several years ago would have taken

0:48:48.160 --> 0:48:50.520
<v Speaker 2>you thirty minutes, and you get impatient in that

0:48:50.640 --> 0:48:52.880
<v Speaker 2>thirty seconds, and you want it in ten seconds. And

0:48:53.040 --> 0:48:56.279
<v Speaker 2>that's just like that competition to shave down seconds. I

0:48:56.320 --> 0:48:58.319
<v Speaker 2>think it's always going to be there, so no one

0:48:58.360 --> 0:49:01.040
<v Speaker 2>ever gets satisfied with this, is my point. It

0:49:01.200 --> 0:49:04.200
<v Speaker 2>always eventually becomes like it feels like waiting.

0:49:04.400 --> 0:49:07.280
<v Speaker 3>But to me, this feels like this is the crux

0:49:07.520 --> 0:49:11.399
<v Speaker 3>of the AI valuation argument, which is like how much

0:49:11.440 --> 0:49:13.960
<v Speaker 3>of a premium are we going to place on a

0:49:14.080 --> 0:49:16.520
<v Speaker 3>model, maybe a closed source model, that is maybe

0:49:16.600 --> 0:49:19.760
<v Speaker 3>slightly better than an open source model. How much premium

0:49:19.800 --> 0:49:22.960
<v Speaker 3>are we going to place on compute that is slightly

0:49:23.200 --> 0:49:26.600
<v Speaker 3>faster than this other type of compute or like other

0:49:26.800 --> 0:49:30.520
<v Speaker 3>use of compute like that. To me, it's an unanswered question.

0:49:30.640 --> 0:49:33.880
<v Speaker 3>And Andrew is pretty upfront about closed versus open source,

0:49:33.920 --> 0:49:36.839
<v Speaker 3>but I think on the speed question too, like we're

0:49:36.840 --> 0:49:37.440
<v Speaker 3>going to find.

0:49:37.320 --> 0:49:40.800
<v Speaker 2>out. We're going to find out, and you know, I

0:49:40.920 --> 0:49:43.720
<v Speaker 2>think one of the things that is going to happen.

0:49:44.239 --> 0:49:46.480
<v Speaker 2>And there have been all these stories about sort of

0:49:46.560 --> 0:49:49.160
<v Speaker 2>like token shock, like how these companies are spending on tokens.

0:49:49.680 --> 0:49:52.160
<v Speaker 2>My guess is one of the things that will happen

0:49:52.640 --> 0:49:54.960
<v Speaker 2>at some point is there's going to be a lot

0:49:55.080 --> 0:49:58.160
<v Speaker 2>more discussion about why are we using this ultra premium

0:49:58.239 --> 0:50:00.840
<v Speaker 2>model when we could have done this, Like there is

0:50:00.920 --> 0:50:03.280
<v Speaker 2>a lot of just like throw it at the AI,

0:50:04.160 --> 0:50:04.680
<v Speaker 2>rack up

0:50:04.600 --> 0:50:05.680
<v Speaker 4>those bills, et cetera.

0:50:05.719 --> 0:50:08.120
<v Speaker 2>And at some point there's going to be this like, Okay,

0:50:08.160 --> 0:50:10.919
<v Speaker 2>what really needs to be served fast? What really needs

0:50:10.960 --> 0:50:13.560
<v Speaker 2>to be served on the most premium closed source models,

0:50:13.880 --> 0:50:16.440
<v Speaker 2>and companies are probably going to get a lot more

0:50:17.200 --> 0:50:22.080
<v Speaker 2>skilled at allocating from you know, different forms of inference

0:50:22.280 --> 0:50:23.080
<v Speaker 2>depending on the need.

0:50:23.280 --> 0:50:25.720
<v Speaker 3>Yeah, I think that's exactly it. And at that point,

0:50:25.920 --> 0:50:28.120
<v Speaker 3>like we could well see some of the dynamics in

0:50:28.200 --> 0:50:31.879
<v Speaker 3>the market start to change in terms of valuation. Shall

0:50:31.880 --> 0:50:32.279
<v Speaker 3>we leave it there?

0:50:32.400 --> 0:50:33.000
<v Speaker 4>Let's leave it there.

0:50:33.160 --> 0:50:35.640
<v Speaker 3>This has been another episode of the Odd Lots podcast. I'm

0:50:35.680 --> 0:50:38.400
<v Speaker 3>Tracy Alloway. You can follow me at Tracy Alloway. And

0:50:38.480 --> 0:50:41.240
<v Speaker 4>I'm Joe Weisenthal. You can follow me at the Stalwart.

0:50:41.560 --> 0:50:44.960
<v Speaker 2>Follow our producers Carmen Rodriguez at Carmen Arman, Dashiell

0:50:44.960 --> 0:50:48.560
<v Speaker 2>Bennett at Dashbot, Kail Brooks at Kail Brooks and Kevin

0:50:48.600 --> 0:50:51.920
<v Speaker 2>Lozano at Kevin Lloyd Lozano. And for more Odd Lots content,

0:50:52.000 --> 0:50:53.240
<v Speaker 4>go to Bloomberg dot com

0:50:53.239 --> 0:50:55.719
<v Speaker 2>slash odd lots, where we have the daily newsletter and all of

0:50:55.800 --> 0:50:57.960
<v Speaker 2>our episodes, and you can chat about all of these

0:50:58.000 --> 0:51:01.520
<v Speaker 2>topics twenty four seven in our Discord, discord dot

0:51:01.640 --> 0:51:03.239
<v Speaker 2>gg slash odd lots. And

0:51:03.320 --> 0:51:05.160
<v Speaker 3>if you enjoy Odd Lots, if you like it when

0:51:05.200 --> 0:51:07.759
<v Speaker 3>we talk about giant wafers, then please leave us a

0:51:07.840 --> 0:51:11.120
<v Speaker 3>positive review on your favorite podcast platform. And remember, if

0:51:11.120 --> 0:51:13.560
<v Speaker 3>you're a Bloomberg subscriber, you can listen to all of

0:51:13.600 --> 0:51:16.480
<v Speaker 3>our episodes absolutely ad free. All you need to do

0:51:16.719 --> 0:51:19.600
<v Speaker 3>is find the Bloomberg channel on Apple Podcasts and follow

0:51:19.640 --> 0:51:21.720
<v Speaker 3>the instructions there. Thanks for listening.