WEBVTT - Two Veteran Chip Builders Have a Plan to Take On Nvidia

0:00:03.160 --> 0:00:18.520
<v Speaker 1>Bloomberg Audio Studios. Podcasts. Radio. News.

0:00:20.079 --> 0:00:23.959
<v Speaker 2>Hello and welcome to another episode of the Odd Lots podcast.

0:00:24.040 --> 0:00:25.680
<v Speaker 2>I'm Joe Weisenthal.

0:00:25.360 --> 0:00:26.439
<v Speaker 3>And I'm Tracy Alloway.

0:00:26.720 --> 0:00:30.880
<v Speaker 2>Tracy, here's something I know about AI. I don't know much,

0:00:30.920 --> 0:00:31.920
<v Speaker 2>but here's something.

0:00:31.680 --> 0:00:32.080
<v Speaker 4>I do know.

0:00:32.240 --> 0:00:33.600
<v Speaker 3>How to log into ChatGPT?

0:00:33.920 --> 0:00:35.680
<v Speaker 2>No, I'm good at it. I'm good at that. I'm

0:00:35.680 --> 0:00:38.479
<v Speaker 2>good at logging into ChatGPT and Claude, and I'm

0:00:38.520 --> 0:00:41.680
<v Speaker 2>reasonably good at asking questions. Now, here's actually something about

0:00:41.680 --> 0:00:44.280
<v Speaker 2>the business of AI that I know.

0:00:44.520 --> 0:00:44.840
<v Speaker 3>Okay.

0:00:45.120 --> 0:00:45.879
<v Speaker 4>I know that

0:00:46.080 --> 0:00:50.120
<v Speaker 2>Nvidia is making a ton of money and the stock

0:00:50.159 --> 0:00:53.280
<v Speaker 2>has gone to the moon, and that other companies would

0:00:53.280 --> 0:00:54.560
<v Speaker 2>like a slice of that pie.

0:00:55.160 --> 0:00:57.560
<v Speaker 3>Yes, yes, that's a good thing to know.

0:00:58.000 --> 0:01:00.360
<v Speaker 2>It's like a basic, simple thing, which is that when

0:01:00.360 --> 0:01:04.080
<v Speaker 2>people think about AI chips, there's literally one company that

0:01:04.160 --> 0:01:08.280
<v Speaker 2>comes to mind. I know others are involved. AMD has stuff,

0:01:08.440 --> 0:01:11.800
<v Speaker 2>Intel obviously wants to play, and others, but there is obviously

0:01:11.840 --> 0:01:15.840
<v Speaker 2>that one gigantic pile of cash that's flowing to this

0:01:15.840 --> 0:01:18.120
<v Speaker 2>one company. I don't know if it still is, but at

0:01:18.120 --> 0:01:20.240
<v Speaker 2>one point it was the biggest company in the world. It's

0:01:20.440 --> 0:01:21.160
<v Speaker 2>pulled back

0:01:21.000 --> 0:01:21.520
<v Speaker 4>a little bit.

0:01:22.080 --> 0:01:24.640
<v Speaker 2>Well, I would say two things. One, other companies would

0:01:24.680 --> 0:01:27.920
<v Speaker 2>like a piece of that pie. And B, companies

0:01:27.959 --> 0:01:31.639
<v Speaker 2>that are in the business of building AI models would

0:01:31.680 --> 0:01:35.039
<v Speaker 2>like to find a way to get cheaper, more efficient,

0:01:35.360 --> 0:01:38.640
<v Speaker 2>less energy intensive chips so that they don't have to

0:01:38.680 --> 0:01:40.160
<v Speaker 2>always pay the Nvidia tax.

0:01:40.440 --> 0:01:43.240
<v Speaker 3>Do you want to know what I know about AI

0:01:43.319 --> 0:01:46.320
<v Speaker 3>and semiconductors? Let's go for it. Okay, here's the one

0:01:46.360 --> 0:01:49.160
<v Speaker 3>thing that I know, which is that whenever you have

0:01:49.280 --> 0:01:52.800
<v Speaker 3>this conversation about Nvidia, the one word that always

0:01:52.800 --> 0:01:54.080
<v Speaker 3>comes up is moat.

0:01:54.400 --> 0:01:55.440
<v Speaker 2>Oh yes, moat yeah.

0:01:55.520 --> 0:01:59.400
<v Speaker 3>So, like you're either talking about like medieval castles or

0:01:59.440 --> 0:02:02.280
<v Speaker 3>you're talking about semiconductor manufacturing. That's when you hear the

0:02:02.320 --> 0:02:05.360
<v Speaker 3>word moat, because over and over again people will say

0:02:05.400 --> 0:02:07.480
<v Speaker 3>it is expensive to make the chips. You need a

0:02:07.480 --> 0:02:10.040
<v Speaker 3>lot of money for research and development and to set

0:02:10.120 --> 0:02:12.480
<v Speaker 3>up the fabs, and you need a lot of first

0:02:12.520 --> 0:02:16.080
<v Speaker 3>person expertise in building them. And then there's also the

0:02:16.120 --> 0:02:20.160
<v Speaker 3>network effect. So a company like Nvidia has this huge

0:02:20.200 --> 0:02:23.560
<v Speaker 3>moat around its business. The question, of course, is whether

0:02:23.680 --> 0:02:26.520
<v Speaker 3>or not, getting back to the medieval castle analogy, it

0:02:26.600 --> 0:02:28.560
<v Speaker 3>is unassailable. That's right.

0:02:28.720 --> 0:02:32.519
<v Speaker 2>Semiconductors seem to be moat after moat after moat,

0:02:32.520 --> 0:02:36.840
<v Speaker 2>because there's ASML's moat, and then there's Taiwan Semiconductor's moat,

0:02:37.440 --> 0:02:41.000
<v Speaker 2>and then there's Nvidia's moat, and so yes, it's like

0:02:41.040 --> 0:02:44.880
<v Speaker 2>there's a series of moats, and if someone could overcome

0:02:45.000 --> 0:02:46.960
<v Speaker 2>these moats or find a way to build a

0:02:47.000 --> 0:02:50.800
<v Speaker 2>bridge over one of these moats and enter this proverbial castle,

0:02:51.080 --> 0:02:53.760
<v Speaker 2>that would be very lucrative. We know that many are

0:02:53.880 --> 0:02:57.919
<v Speaker 2>trying to enter these moats, but it's incredibly costly and

0:02:58.080 --> 0:03:01.680
<v Speaker 2>capital intensive and difficult. There are just not many people

0:03:01.680 --> 0:03:04.080
<v Speaker 2>who know how to do any of this stuff, and

0:03:04.200 --> 0:03:06.840
<v Speaker 2>so the question is whether these moats can be overcome.

0:03:07.200 --> 0:03:09.480
<v Speaker 2>But again, there are many businesses that would love to

0:03:09.480 --> 0:03:13.320
<v Speaker 2>see more robust competition in the space so that their

0:03:13.400 --> 0:03:15.160
<v Speaker 2>payment is not a tax.

0:03:15.520 --> 0:03:18.359
<v Speaker 3>You know, one thing I don't know, and I don't

0:03:18.400 --> 0:03:21.120
<v Speaker 3>think we've ever done an episode purely on this, but

0:03:21.200 --> 0:03:25.040
<v Speaker 3>I don't really understand the different designs of chips. So

0:03:25.200 --> 0:03:28.720
<v Speaker 3>I know that some chips, specifically Nvidia's, are supposed

0:03:28.760 --> 0:03:33.040
<v Speaker 3>to be better at AI. They're better at running lots

0:03:33.120 --> 0:03:36.400
<v Speaker 3>of little calculations all at the same time. And I

0:03:36.440 --> 0:03:40.200
<v Speaker 3>know there's basic chips that go into your refrigerator or

0:03:40.200 --> 0:03:42.360
<v Speaker 3>your car or whatever. But I don't really know the

0:03:42.400 --> 0:03:46.560
<v Speaker 3>difference between what a chip that was designed specifically to

0:03:46.720 --> 0:03:50.120
<v Speaker 3>run a large language model would look like compared to

0:03:50.560 --> 0:03:52.080
<v Speaker 3>a standard basic chip.

0:03:52.320 --> 0:03:54.400
<v Speaker 2>I don't know anything about chip design. I just sort

0:03:54.400 --> 0:03:58.760
<v Speaker 2>of imagine someone, like, using some CAD software, etching

0:03:58.880 --> 0:04:02.520
<v Speaker 2>little lines in the thing and drawing some sort of

0:04:02.560 --> 0:04:05.560
<v Speaker 2>like circuitry or, you know, placing the transistors.

0:04:06.040 --> 0:04:08.520
<v Speaker 3>You know, a chip design game would be really fun,

0:04:08.600 --> 0:04:10.400
<v Speaker 3>now that I think about it. Yeah, you could just

0:04:10.520 --> 0:04:13.360
<v Speaker 3>draw little things on the square. Okay. Anyway, Well, we

0:04:13.400 --> 0:04:13.800
<v Speaker 3>are going.

0:04:13.760 --> 0:04:17.200
<v Speaker 2>To learn about how chip design works. We are going

0:04:17.279 --> 0:04:21.200
<v Speaker 2>to learn about what makes a chip particularly good for

0:04:21.279 --> 0:04:25.320
<v Speaker 2>the task of training and running inference on these AI models.

0:04:25.600 --> 0:04:27.479
<v Speaker 2>And I have to say, I really do believe we

0:04:27.600 --> 0:04:31.400
<v Speaker 2>have the two perfect guests because they are both veterans

0:04:31.480 --> 0:04:34.400
<v Speaker 2>in this space, and they are both active in the

0:04:34.680 --> 0:04:38.120
<v Speaker 2>attempt to bridge some of these moats and enter the

0:04:38.160 --> 0:04:41.479
<v Speaker 2>space and bring competition to the industry. We are going

0:04:41.520 --> 0:04:44.320
<v Speaker 2>to be speaking with Reiner Pope, co-founder and

0:04:44.440 --> 0:04:47.400
<v Speaker 2>CEO of MatX, as well as Mike Gunter, co-founder

0:04:47.400 --> 0:04:50.679
<v Speaker 2>and CTO of MatX. It's a new company that's trying

0:04:50.720 --> 0:04:55.960
<v Speaker 2>to build chips specifically for the purpose of large language models.

0:04:56.279 --> 0:04:58.839
<v Speaker 2>Both of them have a lot of experience in the

0:04:58.880 --> 0:05:01.440
<v Speaker 2>space. We're going to get our hands dirty, so

0:05:01.520 --> 0:05:04.560
<v Speaker 2>to speak, and understand how you build the hardware for

0:05:04.560 --> 0:05:06.800
<v Speaker 2>all this stuff and what makes it win and whether

0:05:06.839 --> 0:05:09.400
<v Speaker 2>it's even a winnable game. Reiner and Mike, thank

0:05:09.440 --> 0:05:11.080
<v Speaker 2>you so much for coming on Odd Lots.

0:05:11.440 --> 0:05:14.040
<v Speaker 5>Thanks, happy to be here, pleasure to be here.

0:05:14.160 --> 0:05:16.839
<v Speaker 2>So why don't you tell us, what does a chip

0:05:16.880 --> 0:05:20.640
<v Speaker 2>designer do? I know, I have this completely cartoonish view

0:05:20.680 --> 0:05:23.880
<v Speaker 2>in my head that cannot possibly be right of someone

0:05:23.960 --> 0:05:27.200
<v Speaker 2>on a big screen using some CAD software to sort of,

0:05:27.279 --> 0:05:28.880
<v Speaker 2>you know, figure out what's going to be etched in

0:05:28.920 --> 0:05:31.560
<v Speaker 2>that wafer of silicon. What is the job of

0:05:31.640 --> 0:05:32.280
<v Speaker 2>chip design?

0:05:33.200 --> 0:05:35.440
<v Speaker 5>So maybe this is best told by what is the

0:05:35.480 --> 0:05:38.520
<v Speaker 5>story of chip development from the beginning of a project

0:05:38.560 --> 0:05:41.000
<v Speaker 5>to the end of it. So there's a range of

0:05:41.000 --> 0:05:42.360
<v Speaker 5>different ways this can go, but there's a lot of

0:05:42.400 --> 0:05:46.000
<v Speaker 5>things that are in common. So generally a chip design

0:05:46.200 --> 0:05:49.880
<v Speaker 5>team is at the low end, maybe thirty people, up

0:05:49.920 --> 0:05:52.560
<v Speaker 5>to many many thousands of people at the high end,

0:05:53.000 --> 0:05:56.479
<v Speaker 5>and the project typically runs for somewhere in

0:05:56.480 --> 0:05:58.800
<v Speaker 5>the range of three to five years from conception to

0:05:58.880 --> 0:06:02.160
<v Speaker 5>actually shipping to customer, and so over that time what

0:06:02.160 --> 0:06:04.760
<v Speaker 5>we see in the life cycle is we tend to

0:06:04.800 --> 0:06:07.840
<v Speaker 5>start with a small team of architects. If you think

0:06:07.839 --> 0:06:10.080
<v Speaker 5>of designing a house, the team of architects are the

0:06:10.080 --> 0:06:12.440
<v Speaker 5>people who decide what rooms go in here, or how

0:06:12.440 --> 0:06:14.880
<v Speaker 5>many bedrooms, how many bathrooms, what are the flows between them,

0:06:14.880 --> 0:06:16.640
<v Speaker 5>how do people walk through the corridors, and so on,

0:06:17.000 --> 0:06:19.840
<v Speaker 5>the coarse-grained design of the chip. In the chip itself,

0:06:19.880 --> 0:06:22.080
<v Speaker 5>that is, you know what kinds of components at the

0:06:22.360 --> 0:06:26.160
<v Speaker 5>high level we have, and then after that initial exploration,

0:06:26.680 --> 0:06:29.039
<v Speaker 5>this moves then over to the micro architects. These are

0:06:29.080 --> 0:06:31.200
<v Speaker 5>the people who are designing the individual rooms. What are

0:06:31.240 --> 0:06:34.320
<v Speaker 5>the components that go in the individual rooms. So at

0:06:34.360 --> 0:06:36.760
<v Speaker 5>that point everything we've done so far is a design

0:06:36.839 --> 0:06:41.040
<v Speaker 5>stage thing. This is done in documents, spreadsheets, and it's

0:06:41.080 --> 0:06:44.080
<v Speaker 5>a verbal and human communication form. But beyond that, that's

0:06:44.080 --> 0:06:46.160
<v Speaker 5>when it starts to actually touch the computer in a

0:06:46.520 --> 0:06:49.839
<v Speaker 5>more meaningful sense. And so the micro architects will hand

0:06:49.880 --> 0:06:52.760
<v Speaker 5>over to the logic designers. They are the people who

0:06:52.800 --> 0:06:55.200
<v Speaker 5>are actually writing code. So even though you think of

0:06:55.240 --> 0:06:58.080
<v Speaker 5>chips as being this very physical thing where there's wires

0:06:58.120 --> 0:07:00.320
<v Speaker 5>and gates and everything, the way we transmit

0:07:00.320 --> 0:07:02.400
<v Speaker 5>this information to the computer is actually by writing code. We

0:07:02.440 --> 0:07:05.760
<v Speaker 5>write Verilog that expresses the design of the chip. So

0:07:06.120 --> 0:07:10.040
<v Speaker 5>that's what the logic designers are doing. That's an extended

0:07:10.080 --> 0:07:12.400
<v Speaker 5>period of time building out all of the different you know,

0:07:12.680 --> 0:07:16.320
<v Speaker 5>matrix multiplies, memories, circuitry that connects to the outside world,

0:07:16.360 --> 0:07:18.800
<v Speaker 5>and so on. And then the output of all of

0:07:18.840 --> 0:07:21.920
<v Speaker 5>them is this Verilog piece of software code that gets

0:07:22.080 --> 0:07:25.000
<v Speaker 5>then compiled by a computer down to a set of

0:07:25.080 --> 0:07:27.960
<v Speaker 5>gates, which are logic gates (AND gates, OR gates, and so on),

0:07:28.040 --> 0:07:29.800
<v Speaker 5>and then wires that connect them together. That's the

0:07:29.840 --> 0:07:33.560
<v Speaker 5>netlist. This file, and there's a few more stages still

0:07:33.560 --> 0:07:36.960
<v Speaker 5>coming here, this file gets handed off to physical designers,

0:07:37.000 --> 0:07:39.760
<v Speaker 5>who again work with CAD tools to convert this kind

0:07:39.760 --> 0:07:40.600
<v Speaker 5>of logical description.

0:07:40.640 --> 0:07:42.560
<v Speaker 2>I was right! Someone is using CAD tools.

0:07:43.480 --> 0:07:46.040
<v Speaker 5>Absolutely, there's a CAD tool, but it's only part

0:07:46.080 --> 0:07:50.040
<v Speaker 5>of the job. Okay, so the physical designers are converting

0:07:50.080 --> 0:07:52.800
<v Speaker 5>the sort of logical description into a physical placement. So

0:07:53.240 --> 0:07:55.560
<v Speaker 5>where do each of these gates go? Now there's two

0:07:55.640 --> 0:07:58.000
<v Speaker 5>hundred billion logic gates on a chip, so a human

0:07:58.040 --> 0:07:59.760
<v Speaker 5>is not going to be placing all of those manually.

0:08:00.040 --> 0:08:03.120
<v Speaker 5>So there's a huge amount of software assistance here. But

0:08:03.160 --> 0:08:05.240
<v Speaker 5>what the human is doing is providing oversight through this

0:08:05.280 --> 0:08:07.520
<v Speaker 5>process and saying, I've done this a ton of times before.

0:08:07.640 --> 0:08:10.560
<v Speaker 5>This placement kind of looks wrong, it doesn't match my heuristics,

0:08:10.600 --> 0:08:12.760
<v Speaker 5>and so I can probably do a better job here.

0:08:13.160 --> 0:08:15.360
<v Speaker 5>So that's the physical designers, and the output of their

0:08:15.400 --> 0:08:18.920
<v Speaker 5>work is, eventually, a set of polygons, so basically

0:08:18.920 --> 0:08:21.600
<v Speaker 5>an image saying here is the thing that is going

0:08:21.680 --> 0:08:26.160
<v Speaker 5>to get etched onto a piece of silicon. So that

0:08:26.600 --> 0:08:29.640
<v Speaker 5>file is ultimately a huge, like really big image in

0:08:29.640 --> 0:08:32.000
<v Speaker 5>some form, with a bunch of polygons on it. It gets

0:08:32.040 --> 0:08:36.439
<v Speaker 5>handed over to a manufacturing company such as TSMC. They

0:08:36.440 --> 0:08:41.040
<v Speaker 5>spend maybe four or five months initially creating a mask set,

0:08:41.120 --> 0:08:43.760
<v Speaker 5>so those are like the templates or the stencils that

0:08:43.800 --> 0:08:46.160
<v Speaker 5>will be used to stamp out many many copies of

0:08:46.160 --> 0:08:48.679
<v Speaker 5>the chip, and then stamp out many copies of the chip.

0:08:48.720 --> 0:08:51.840
<v Speaker 5>You get a chip back. This is typically about two

0:08:51.920 --> 0:08:54.160
<v Speaker 5>or three years after you started the project. You get

0:08:54.200 --> 0:08:57.000
<v Speaker 5>chips back, and now you have a bring up team

0:08:57.000 --> 0:09:00.520
<v Speaker 5>who puts this chip onto a whole board, connects it

0:09:00.559 --> 0:09:02.680
<v Speaker 5>to power and starts testing it,

0:09:03.240 --> 0:09:05.760
<v Speaker 5>and then after another six to twelve months or maybe

0:09:05.760 --> 0:09:08.800
<v Speaker 5>even more, eventually you actually can hand this over to customers.

0:09:09.160 --> 0:09:10.920
<v Speaker 5>There's maybe just one or two other things which are

0:09:10.920 --> 0:09:13.920
<v Speaker 5>not in that flow but very essential to call out too.

0:09:14.360 --> 0:09:18.040
<v Speaker 5>Because this whole process takes so long, especially

0:09:18.040 --> 0:09:21.440
<v Speaker 5>the manufacturing, we also have like very large teams of

0:09:21.600 --> 0:09:24.920
<v Speaker 5>verification people. So these are the people who before we

0:09:24.920 --> 0:09:27.160
<v Speaker 5>actually send it to manufacturing and pay twenty to thirty

0:09:27.160 --> 0:09:31.480
<v Speaker 5>million dollars of manufacturing, we have a substantial team doing

0:09:31.480 --> 0:09:33.640
<v Speaker 5>a lot of testing. And this is software based testing,

0:09:33.720 --> 0:09:36.120
<v Speaker 5>so writing tests in the same way a software engineer

0:09:36.160 --> 0:09:39.600
<v Speaker 5>might to make sure that the functionality actually works as intended.

0:09:39.920 --> 0:09:44.240
<v Speaker 6>To underline the comparison to ordinary software, which Reiner touched

0:09:44.280 --> 0:09:47.760
<v Speaker 6>on: we're writing code, but it's on super hard mode.

0:09:48.160 --> 0:09:50.600
<v Speaker 6>So if you have software

0:09:50.640 --> 0:09:54.000
<v Speaker 6>that's deployed to a website, you can fix a bug in,

0:09:54.120 --> 0:09:57.880
<v Speaker 6>you know, ten minutes at basically zero cost. Whereas in

0:09:57.920 --> 0:09:59.880
<v Speaker 6>our case, the reason that we have a large team

0:10:00.080 --> 0:10:03.280
<v Speaker 6>of people doing verification, making sure that what we've done is

0:10:03.320 --> 0:10:07.439
<v Speaker 6>correct is that it's potentially four months and thirty million

0:10:07.440 --> 0:10:11.079
<v Speaker 6>dollars for every mistake that you let through. Likewise, there

0:10:11.120 --> 0:10:14.280
<v Speaker 6>is software, but it's a relatively small fraction of software

0:10:14.320 --> 0:10:16.719
<v Speaker 6>that's very performance critical where you want the code to

0:10:16.760 --> 0:10:19.400
<v Speaker 6>run as fast as possible. But in some sense, every

0:10:19.480 --> 0:10:22.120
<v Speaker 6>line of code that you write in hardware has an

0:10:22.120 --> 0:10:25.480
<v Speaker 6>impact on the overall performance of the product, because every

0:10:25.520 --> 0:10:28.400
<v Speaker 6>line of code ends up getting embodied in silicon, and

0:10:28.440 --> 0:10:31.280
<v Speaker 6>every line of code affects the eventual performance. So it's

0:10:31.360 --> 0:10:34.080
<v Speaker 6>kind of coding, but on hard mode.

0:10:34.800 --> 0:10:40.520
<v Speaker 3>So I intuitively understand the importance of getting the software right.

0:10:40.679 --> 0:10:45.360
<v Speaker 3>But why does placement on the actual chip or wafer,

0:10:45.480 --> 0:10:48.280
<v Speaker 3>Why does that matter? Are you trying to make it

0:10:48.280 --> 0:10:51.280
<v Speaker 3>more efficient, are you trying to reduce the rise time?

0:10:51.440 --> 0:10:53.640
<v Speaker 3>Or why does it matter where the little bits and

0:10:53.679 --> 0:10:56.679
<v Speaker 3>bobs are placed, to use the scientific term?

0:10:56.200 --> 0:11:00.400
<v Speaker 6>Yeah, you're right that reducing the rise time is

0:11:00.640 --> 0:11:04.320
<v Speaker 6>a massive issue. And you know, fundamentally the issue is

0:11:04.320 --> 0:11:07.520
<v Speaker 6>that chips, you know, at a very abstract level, are

0:11:07.960 --> 0:11:11.480
<v Speaker 6>composed of, well, at a somewhat concrete level, really

0:11:11.800 --> 0:11:16.000
<v Speaker 6>are composed of transistors and wires, and the placement has

0:11:16.000 --> 0:11:19.720
<v Speaker 6>a dramatic effect on the length of the wires, which

0:11:19.720 --> 0:11:22.199
<v Speaker 6>has a dramatic effect on both the performance of the

0:11:22.240 --> 0:11:24.760
<v Speaker 6>chip and how much you can fit. In terms of

0:11:24.800 --> 0:11:27.679
<v Speaker 6>the impact that this has on the quality of chip

0:11:27.720 --> 0:11:32.080
<v Speaker 6>that you produce, wires have over time not been shrinking

0:11:32.200 --> 0:11:36.559
<v Speaker 6>in the same way that transistors have, and so getting

0:11:36.800 --> 0:11:39.560
<v Speaker 6>the wiring right, which usually means getting the placement right,

0:11:39.679 --> 0:11:41.560
<v Speaker 6>has become more and more important over time.

0:11:57.960 --> 0:12:01.160
<v Speaker 3>Can chips be beautiful? I know code can be elegant,

0:12:01.720 --> 0:12:04.160
<v Speaker 3>and some people will say certain code is beautiful, But

0:12:04.320 --> 0:12:07.120
<v Speaker 3>have you ever looked at a semiconductor and been like, oh, wow,

0:12:07.320 --> 0:12:09.680
<v Speaker 3>that's really nicely put together.

0:12:10.520 --> 0:12:12.640
<v Speaker 5>For me, I mean I think absolutely yes. This is

0:12:12.679 --> 0:12:14.320
<v Speaker 5>like why I work in this space is I just

0:12:14.400 --> 0:12:16.560
<v Speaker 5>really like geeking out on the design of things. But

0:12:16.800 --> 0:12:19.000
<v Speaker 5>to me, what beautiful for a chip means is that

0:12:19.280 --> 0:12:21.439
<v Speaker 5>it kind of does exactly what it was designed to do,

0:12:21.960 --> 0:12:24.679
<v Speaker 5>and no more and no less. I mean, obviously less

0:12:24.720 --> 0:12:27.720
<v Speaker 5>would be a bit of a disappointment, but often if

0:12:27.720 --> 0:12:29.600
<v Speaker 5>it does more, you think, well, maybe I designed

0:12:29.600 --> 0:12:31.600
<v Speaker 5>it for slightly the wrong purpose or something like that.

0:12:32.000 --> 0:12:35.240
<v Speaker 2>I think this is a good segue into getting into

0:12:35.360 --> 0:12:39.120
<v Speaker 2>your business specifically, so we all know that so much

0:12:39.120 --> 0:12:42.720
<v Speaker 2>of this AI is powered by these Nvidia GPUs,

0:12:43.240 --> 0:12:46.520
<v Speaker 2>but Nvidia GPUs have been used for a long

0:12:46.559 --> 0:12:49.480
<v Speaker 2>time for many things that do not have anything to

0:12:49.559 --> 0:12:53.880
<v Speaker 2>do with large language models or the specific AI applications

0:12:53.880 --> 0:12:56.120
<v Speaker 2>that people are excited about today in twenty twenty four.

0:12:56.640 --> 0:12:58.960
<v Speaker 2>So for a while they were, well, the video games

0:12:59.000 --> 0:13:01.400
<v Speaker 2>is obviously the big one for decades and decades, and

0:13:01.440 --> 0:13:03.560
<v Speaker 2>then there was like five minutes where people got really

0:13:03.600 --> 0:13:07.520
<v Speaker 2>excited to use them for Ethereum mining, and now everyone's

0:13:07.559 --> 0:13:11.600
<v Speaker 2>really excited about their use for artificial intelligence and large

0:13:11.679 --> 0:13:14.920
<v Speaker 2>language models and some of these other generative AI applications

0:13:14.960 --> 0:13:18.440
<v Speaker 2>that people are excited about right now. Why don't you

0:13:18.559 --> 0:13:21.920
<v Speaker 2>tell us maybe the sort of idea behind MatX, but

0:13:22.040 --> 0:13:25.640
<v Speaker 2>specifically what you were both doing when you were at

0:13:25.679 --> 0:13:29.440
<v Speaker 2>Alphabet or Google, which, you know, has its own chips.

0:13:29.480 --> 0:13:33.319
<v Speaker 2>I believe it has something called TPUs. What was the

0:13:33.440 --> 0:13:38.160
<v Speaker 2>project at Google? Why did Google find it necessary or

0:13:38.280 --> 0:13:40.600
<v Speaker 2>a good business to start building their own chips for

0:13:40.640 --> 0:13:43.520
<v Speaker 2>in house purposes? And then why did you feel the

0:13:43.559 --> 0:13:46.960
<v Speaker 2>need to then leave to build what you're building now

0:13:47.040 --> 0:13:48.400
<v Speaker 2>for LLMs specifically?

0:13:48.960 --> 0:13:52.760
<v Speaker 6>Yeah, So what Google was seeing, and this was at

0:13:52.760 --> 0:13:56.640
<v Speaker 6>this point sometime back more than a decade ago, they

0:13:56.679 --> 0:14:01.439
<v Speaker 6>were seeing that the use of artificial intelligence (LLMs were

0:14:01.440 --> 0:14:04.160
<v Speaker 6>not a thing at that point) was going up, and

0:14:04.440 --> 0:14:08.520
<v Speaker 6>they were worried about how much money they would have

0:14:08.720 --> 0:14:11.960
<v Speaker 6>to spend on traditional, it would have

0:14:12.000 --> 0:14:16.040
<v Speaker 6>been GPUs at that time, and so they built a

0:14:16.160 --> 0:14:21.040
<v Speaker 6>very specialized chip to do neural nets, and that chip

0:14:21.400 --> 0:14:27.240
<v Speaker 6>specializes in matrix multiplication. So they put in a structure

0:14:27.280 --> 0:14:31.520
<v Speaker 6>called a systolic array, which they definitely didn't invent (it

0:14:32.120 --> 0:14:35.400
<v Speaker 6>has existed since the seventies), that is especially good at

0:14:35.400 --> 0:14:39.920
<v Speaker 6>doing matrix multiplication. Now after that, Nvidia has added a

0:14:39.960 --> 0:14:44.680
<v Speaker 6>similar structure into their chips. And the initial Google TPU

0:14:45.000 --> 0:14:47.600
<v Speaker 6>was an inference-only chip, and then they have

0:14:47.840 --> 0:14:51.360
<v Speaker 6>subsequently made chips that can be used for both training

0:14:51.360 --> 0:14:54.480
<v Speaker 6>and inference. And I guess now is a good point

0:14:54.520 --> 0:14:56.920
<v Speaker 6>to say: the very last thing that I was doing

0:14:56.920 --> 0:14:59.440
<v Speaker 6>at Google was I was on the TPU team and

0:14:59.480 --> 0:15:02.120
<v Speaker 6>Reiner was on the large language model team. And it's

0:15:02.120 --> 0:15:04.680
<v Speaker 6>probably good to have him sort of tell it from here.

0:15:05.040 --> 0:15:07.320
<v Speaker 5>So I mean, what we were seeing, and this

0:15:07.400 --> 0:15:09.320
<v Speaker 5>is what we personally were seeing, but Google was seeing it

0:15:09.360 --> 0:15:12.120
<v Speaker 5>more generally as well, is just that large language models were

0:15:12.120 --> 0:15:14.400
<v Speaker 5>a thing. There was this period of time between

0:15:14.560 --> 0:15:17.480
<v Speaker 5>GPT-3 and ChatGPT coming out. GPT-3 came out

0:15:17.480 --> 0:15:20.440
<v Speaker 5>in twenty twenty, and so people who were very plugged

0:15:20.480 --> 0:15:24.560
<v Speaker 5>into the field recognized the importance of it, or at

0:15:24.640 --> 0:15:26.720
<v Speaker 5>least to some extent recognized the importance of it, back then,

0:15:27.280 --> 0:15:30.080
<v Speaker 5>and so there was this push to you know, everyone

0:15:30.120 --> 0:15:32.600
<v Speaker 5>wanted to create their own large language model that was

0:15:32.640 --> 0:15:35.800
<v Speaker 5>better than GPT-3, and so, I mean, at the time,

0:15:35.840 --> 0:15:38.280
<v Speaker 5>I was on the Large Language Model team. We helped

0:15:38.320 --> 0:15:41.440
<v Speaker 5>train Google PaLM, and we were using thousands of TPUs

0:15:41.480 --> 0:15:44.240
<v Speaker 5>for that, and one of the things we were saying is, well,

0:15:44.240 --> 0:15:47.240
<v Speaker 5>look, what does it cost to deploy this in Google Search?

0:15:47.360 --> 0:15:49.280
<v Speaker 5>There's quite a lot of search queries. I think

0:15:49.320 --> 0:15:51.200
<v Speaker 5>the public estimates are about one hundred thousand of them

0:15:51.200 --> 0:15:54.600
<v Speaker 5>per second. If you multiply out how much each query costs,

0:15:54.720 --> 0:15:56.400
<v Speaker 5>and if you want to run that on large language models,

0:15:56.400 --> 0:15:58.680
<v Speaker 5>that's a lot more expensive. And then also,

0:15:58.720 --> 0:16:00.680
<v Speaker 5>if I want to train a model that's ten times bigger

0:16:00.680 --> 0:16:03.840
<v Speaker 5>than my current model or one hundred times bigger, suddenly

0:16:04.280 --> 0:16:07.120
<v Speaker 5>these models have just moved from costing you know, a

0:16:07.160 --> 0:16:09.640
<v Speaker 5>million dollars or one hundred thousand dollars to train to

0:16:10.000 --> 0:16:12.040
<v Speaker 5>tens of millions and hundreds of millions of dollars, and

0:16:12.120 --> 0:16:16.000
<v Speaker 5>so the overall goal was can we make it cheaper

0:16:16.000 --> 0:16:18.440
<v Speaker 5>by any way possible. So, of course there's algorithmic approaches.

0:16:18.480 --> 0:16:21.440
<v Speaker 5>There's a lot of opportunity on the algorithm and research side.

0:16:21.480 --> 0:16:23.560
<v Speaker 5>But then the other really big lever is just making

0:16:23.560 --> 0:16:25.840
<v Speaker 5>better hardware. So one of the things we were looking

0:16:25.880 --> 0:16:29.440
<v Speaker 5>at was trying to make Google's TPUs better for large

0:16:29.480 --> 0:16:32.000
<v Speaker 5>language models. What led us, actually, I mean this is

0:16:32.040 --> 0:16:33.760
<v Speaker 5>personally about Mike and me in this case, what

0:16:33.840 --> 0:16:36.440
<v Speaker 5>led us to leave Google to make MatX was we

0:16:36.480 --> 0:16:38.640
<v Speaker 5>saw, and we believe, that there is some

0:16:38.720 --> 0:16:42.400
<v Speaker 5>opportunity to make chips substantially better if you're only looking

0:16:42.400 --> 0:16:45.160
<v Speaker 5>to focus on large language models. And so the chips

0:16:45.160 --> 0:16:49.560
<v Speaker 5>that were designed pre-GPT-3 and especially pre-ChatGPT

0:16:49.600 --> 0:16:52.560
<v Speaker 5>try to do a really

0:16:52.560 --> 0:16:54.440
<v Speaker 5>good job on small models as well as a really

0:16:54.480 --> 0:16:56.840
<v Speaker 5>good job on large models. And so what you find

0:16:56.880 --> 0:16:59.040
<v Speaker 5>is that the circuitry in those chips, there's a bit

0:16:59.080 --> 0:17:01.120
<v Speaker 5>of circuitry for what you need for small models, there's

0:17:01.120 --> 0:17:03.080
<v Speaker 5>a bit of circuitry for what you need for large models.

0:17:03.120 --> 0:17:05.760
<v Speaker 5>Also for maybe embedding lookups. There's three or four

0:17:05.760 --> 0:17:08.560
<v Speaker 5>different kinds of workloads, and all of them take some

0:17:08.640 --> 0:17:11.640
<v Speaker 5>of the real estate in your silicon. And so if

0:17:11.640 --> 0:17:13.280
<v Speaker 5>you really want to make the best use of the

0:17:13.280 --> 0:17:15.119
<v Speaker 5>real estate, you should just focus on the thing you

0:17:15.160 --> 0:17:17.520
<v Speaker 5>care about most and hope that there's a big market there.

0:17:17.640 --> 0:17:20.639
<v Speaker 5>So that's the game, and what we decided to

0:17:20.680 --> 0:17:22.600
<v Speaker 5>do, and we see some others deciding to do as well,

0:17:22.720 --> 0:17:25.680
<v Speaker 5>is to really try and focus on just the one

0:17:25.680 --> 0:17:27.639
<v Speaker 5>workload that seems like it's going to become a one

0:17:27.680 --> 0:17:30.320
<v Speaker 5>hundred billion dollar or a trillion dollar industry.

0:17:30.680 --> 0:17:33.160
<v Speaker 2>I know there's always this sort of cliche when talking

0:17:33.160 --> 0:17:36.480
<v Speaker 2>about tech: oh, Google and Facebook, they can just build

0:17:36.480 --> 0:17:38.760
<v Speaker 2>this and they'll destroy your little startup because they have

0:17:38.840 --> 0:17:42.000
<v Speaker 2>infinite amounts of money. Except that doesn't actually seem to

0:17:42.200 --> 0:17:44.840
<v Speaker 2>happen in the real world as much as people on

0:17:44.880 --> 0:17:48.400
<v Speaker 2>Twitter expect it to happen. But can you just sort

0:17:48.400 --> 0:17:51.639
<v Speaker 2>of give a sense of maybe the business and organizational

0:17:52.200 --> 0:17:57.960
<v Speaker 2>incentives for why a company like Google doesn't say, oh,

0:17:58.040 --> 0:18:00.159
<v Speaker 2>this is a one hundred billion dollar market, Nvidia is

0:18:00.200 --> 0:18:02.320
<v Speaker 2>worth three and a half trillion or three trillion dollars,

0:18:02.440 --> 0:18:06.240
<v Speaker 2>let's build our own LLM-specific chips. Why doesn't that

0:18:06.880 --> 0:18:11.159
<v Speaker 2>happen at these large, hyperscaler companies that presumably have all

0:18:11.200 --> 0:18:12.520
<v Speaker 2>the talent and money to do it.

0:18:13.920 --> 0:18:20.919
<v Speaker 6>So Google's TPUs are primarily built to serve their internal customers,

0:18:21.520 --> 0:18:25.320
<v Speaker 6>and Google's revenue for the most part comes from Google

0:18:25.359 --> 0:18:28.960
<v Speaker 6>Search, and in particular from Google Search ads.

0:18:29.400 --> 0:18:34.280
<v Speaker 6>Google Search ads is, you know, a customer of the TPUs.

0:18:34.040 --> 0:18:38.720
<v Speaker 6>It's a relatively difficult thing to say that hundreds of

0:18:38.800 --> 0:18:41.480
<v Speaker 6>billions of dollars of revenue that we're making, we're going

0:18:41.520 --> 0:18:44.359
<v Speaker 6>to make a chip that doesn't really support that particularly well,

0:18:44.400 --> 0:18:47.400
<v Speaker 6>and focuses on this at this point unproven in terms

0:18:47.440 --> 0:18:51.840
<v Speaker 6>of revenue market. And it's not just ads, but there

0:18:51.880 --> 0:18:54.320
<v Speaker 6>are, you know, a variety of other customers. For instance,

0:18:54.560 --> 0:18:57.359
<v Speaker 6>you know, you may have noticed how Google is pretty

0:18:57.359 --> 0:19:01.679
<v Speaker 6>good at identifying good photos and doing a whole variety

0:19:01.760 --> 0:19:04.359
<v Speaker 6>of other things that are supported in many cases by

0:19:04.400 --> 0:19:05.000
<v Speaker 6>the TPUs.

0:19:06.280 --> 0:19:08.240
<v Speaker 5>I think one of the other things too, that we

0:19:08.320 --> 0:19:11.760
<v Speaker 5>see in all chip companies in general, or companies producing chips,

0:19:11.840 --> 0:19:14.919
<v Speaker 5>is because producing chips is so expensive, you end up

0:19:14.960 --> 0:19:16.600
<v Speaker 5>in this place where you really want to put all

0:19:16.640 --> 0:19:21.320
<v Speaker 5>your resources behind one chip effort. And so just because

0:19:21.400 --> 0:19:23.520
<v Speaker 5>the thinking is that there's a huge amount of return

0:19:23.600 --> 0:19:25.879
<v Speaker 5>on investment in making this one thing better rather than

0:19:25.920 --> 0:19:28.199
<v Speaker 5>fragmenting your efforts. Really, what you'd like to do in

0:19:28.200 --> 0:19:30.880
<v Speaker 5>this situation where there's a new emerging field that might

0:19:30.960 --> 0:19:33.600
<v Speaker 5>be huge or might not, but it's hard to say yet,

0:19:33.720 --> 0:19:35.399
<v Speaker 5>what you'd like to do is maybe spin up a

0:19:35.440 --> 0:19:37.760
<v Speaker 5>second effort on the side and have like a skunk works. Yeah,

0:19:37.880 --> 0:19:38.439
<v Speaker 5>that's work, right.

0:19:38.440 --> 0:19:41.199
<v Speaker 2>That would be just to let Reiner and just

0:19:41.320 --> 0:19:43.280
<v Speaker 2>let the two of you go have your own little

0:19:43.280 --> 0:19:44.160
<v Speaker 2>office somewhere else.

0:19:44.560 --> 0:19:48.199
<v Speaker 5>Yeah, just organizationally that it's often challenging to do, and

0:19:48.240 --> 0:19:50.720
<v Speaker 5>we see this across all companies. Every chip company really

0:19:50.720 --> 0:19:54.760
<v Speaker 5>has essentially only one mainstream chip product that is that

0:19:54.800 --> 0:19:57.120
<v Speaker 5>they're iterating on and making better and better over time.

0:19:58.200 --> 0:20:03.000
<v Speaker 3>To what degree is the design driven by the customer?

0:20:03.119 --> 0:20:05.440
<v Speaker 3>And what I mean by that is, so the TPUs

0:20:05.480 --> 0:20:09.639
<v Speaker 3>at Google were developed to handle Google's internal workloads, but

0:20:09.920 --> 0:20:13.920
<v Speaker 3>at other chip designers, to what degree will customers come

0:20:13.960 --> 0:20:16.600
<v Speaker 3>and like basically do a reverse inquiry and ask for

0:20:16.640 --> 0:20:20.320
<v Speaker 3>a specific chip or what does the dialogue between customers

0:20:20.400 --> 0:20:23.320
<v Speaker 3>and the big chip designers actually look like.

0:20:24.080 --> 0:20:27.040
<v Speaker 5>Yeah, it's a fun interplay of I want my provider

0:20:27.080 --> 0:20:28.479
<v Speaker 5>to do a good job, but I also don't want

0:20:28.520 --> 0:20:31.880
<v Speaker 5>to leak my IP too much. So you can see

0:20:31.920 --> 0:20:34.640
<v Speaker 5>this how this played out in so Mike was talking

0:20:34.680 --> 0:20:37.880
<v Speaker 5>about through the development of the TPUs which were publicly

0:20:37.920 --> 0:20:41.439
<v Speaker 5>announced in twenty sixteen and around the same time

0:20:41.520 --> 0:20:44.119
<v Speaker 5>Nvidia's first GPU with the Tensor Cores. So that was

0:20:44.160 --> 0:20:46.520
<v Speaker 5>the first GPU that was really focused on matrix multiplication.

0:20:46.800 --> 0:20:49.320
<v Speaker 5>That was the Volta generation, which came out at about the

0:20:49.359 --> 0:20:52.479
<v Speaker 5>same time. And some of this actually was a result

0:20:52.520 --> 0:20:56.680
<v Speaker 5>of when Google had this recognition of look, matrix multiplication

0:20:56.760 --> 0:20:58.600
<v Speaker 5>is so important, we need to make it really better.

0:20:58.800 --> 0:21:01.800
<v Speaker 5>They simultaneously worked on it themselves but also went to Nvidia and

0:21:01.840 --> 0:21:04.399
<v Speaker 5>said we're not telling you much, but can you do

0:21:04.440 --> 0:21:07.879
<v Speaker 5>better at matrix multiplication? And so that was enough for

0:21:08.000 --> 0:21:10.760
<v Speaker 5>Nvidia to go on the first generation. They made a

0:21:10.760 --> 0:21:12.720
<v Speaker 5>pretty good attempt. But if you talk to people at

0:21:12.800 --> 0:21:15.199
<v Speaker 5>Nvidia, they'll say that actually the second generation of

0:21:15.240 --> 0:21:17.760
<v Speaker 5>the Tensor Core, which was in the Ampere generation, was where

0:21:17.760 --> 0:21:20.399
<v Speaker 5>they really nailed it. So when it's big enough, you

0:21:20.440 --> 0:21:22.760
<v Speaker 5>sometimes see these customers coming and saying what they want,

0:21:22.800 --> 0:21:26.040
<v Speaker 5>but they'll maybe they'll try and disguise what they're asking

0:21:26.080 --> 0:21:28.240
<v Speaker 5>for, giving you the absolute minimum amount of

0:21:28.280 --> 0:21:31.600
<v Speaker 5>information to help a vendor make what they want without

0:21:31.600 --> 0:21:32.760
<v Speaker 5>revealing too much about their

0:21:32.640 --> 0:21:50.240
<v Speaker 4>IP. Let's get to MatX.

0:21:50.680 --> 0:21:54.280
<v Speaker 2>Tell us the product that you're designing and how it

0:21:54.720 --> 0:21:59.040
<v Speaker 2>fundamentally will differ from the offerings on the market, most

0:21:59.080 --> 0:22:00.000
<v Speaker 2>notably from Nvidia.

0:22:00.040 --> 0:22:00.240
<v Speaker 4>Yeah.

0:22:01.240 --> 0:22:04.320
<v Speaker 5>Yeah, So we make chips and in fact racks and

0:22:04.320 --> 0:22:08.840
<v Speaker 5>clusters for large language models. So when you look at

0:22:09.160 --> 0:22:11.719
<v Speaker 5>Nvidia's GPUs, you already talked about all of this,

0:22:12.000 --> 0:22:15.679
<v Speaker 5>the original background in gaming, this brief movement in Ethereum,

0:22:15.920 --> 0:22:18.280
<v Speaker 5>and then even within AI they're doing small models and

0:22:18.400 --> 0:22:22.760
<v Speaker 5>large models. So what that translates to is, you can

0:22:22.760 --> 0:22:24.560
<v Speaker 5>think of it as the rooms of the house or something.

0:22:24.680 --> 0:22:27.000
<v Speaker 5>They have a different room for each of each of

0:22:27.040 --> 0:22:29.880
<v Speaker 5>those different use cases, so different circuitry in the chip

0:22:29.920 --> 0:22:32.840
<v Speaker 5>for all of these use cases. And the fundamental bet

0:22:32.920 --> 0:22:35.919
<v Speaker 5>is that if you say, look, I don't care about that,

0:22:35.960 --> 0:22:37.720
<v Speaker 5>I'm going to do a lousy job if you try

0:22:37.720 --> 0:22:38.919
<v Speaker 5>and run a game on me, or I'm going to

0:22:38.960 --> 0:22:41.280
<v Speaker 5>do a lousy job if you want to run a

0:22:41.280 --> 0:22:44.479
<v Speaker 5>convolutional network on me. But if you give me a

0:22:44.560 --> 0:22:47.240
<v Speaker 5>large model with very large matrices, I'm going to crush it.

0:22:47.640 --> 0:22:50.440
<v Speaker 5>That's the bet that we're making at MatX, so we spend

0:22:50.440 --> 0:22:52.520
<v Speaker 5>as much of our silicon as we can on making

0:22:52.600 --> 0:22:54.959
<v Speaker 5>this work. There's a lot of detail in making all

0:22:54.960 --> 0:22:56.480
<v Speaker 5>of this work out, because you need not just the

0:22:56.480 --> 0:22:58.960
<v Speaker 5>matrix multiplication, but all of the memory bandwidths and

0:22:58.960 --> 0:23:02.639
<v Speaker 5>communication bandwidths and the actual engineering things to make it

0:23:02.640 --> 0:23:05.280
<v Speaker 5>pan out. But that's the core bet.

0:23:05.200 --> 0:23:09.040
<v Speaker 3>And why can't Nvidia do this? So you know,

0:23:09.160 --> 0:23:11.359
<v Speaker 3>Nvidia has a lot of resources. It has that big

0:23:11.400 --> 0:23:13.959
<v Speaker 3>moat as we were discussing in the intro, and it

0:23:14.000 --> 0:23:17.119
<v Speaker 3>has the GPUs that are already in production and working

0:23:17.240 --> 0:23:20.159
<v Speaker 3>on new ones. But why couldn't it start designing an

0:23:20.480 --> 0:23:22.680
<v Speaker 3>LLM-focused chip from scratch?

0:23:23.800 --> 0:23:27.200
<v Speaker 6>Right. So you talked about Nvidia's moat, and that

0:23:27.680 --> 0:23:30.919
<v Speaker 6>moat has two components. One component is that they build

0:23:31.000 --> 0:23:34.040
<v Speaker 6>the very best hardware, and I think you know that

0:23:34.200 --> 0:23:38.520
<v Speaker 6>is the result of having a very large team that

0:23:38.680 --> 0:23:42.720
<v Speaker 6>executes extremely well and making good choices about how to

0:23:42.760 --> 0:23:46.479
<v Speaker 6>serve their market. They also have a tremendous software moat.

0:23:46.840 --> 0:23:48.800
<v Speaker 6>And you know, both of these moats are important to

0:23:48.840 --> 0:23:53.040
<v Speaker 6>different sets of customers. So their tremendous software moat: they

0:23:53.040 --> 0:23:57.359
<v Speaker 6>have a very broad, deep software ecosystem based on CUDA

0:23:57.880 --> 0:23:58.879
<v Speaker 6>that allows it.

0:23:59.040 --> 0:24:01.080
<v Speaker 3>Oh yeah, I remember this came up in our discussion

0:24:01.080 --> 0:24:01.720
<v Speaker 3>with CoreWeave.

0:24:02.000 --> 0:24:06.439
<v Speaker 6>Yeah yeah. And so that allows customers who are not

0:24:06.560 --> 0:24:11.720
<v Speaker 6>very sophisticated, who don't have gigantic engineering budgets themselves, to

0:24:11.880 --> 0:24:15.560
<v Speaker 6>use those chips, use Nvidia's chips, and be efficient

0:24:15.640 --> 0:24:18.960
<v Speaker 6>at that. So the thing about a moat is not

0:24:19.000 --> 0:24:21.840
<v Speaker 6>only does it in some sense keep other people out,

0:24:21.880 --> 0:24:25.560
<v Speaker 6>it also keeps you in. So insofar as they want

0:24:25.600 --> 0:24:29.040
<v Speaker 6>to keep their software moat, their CUDA moat, they have

0:24:29.080 --> 0:24:34.240
<v Speaker 6>to remain compatible with CUDA and compatibility with that software moat.

0:24:34.440 --> 0:24:39.640
<v Speaker 6>Compatibility with CUDA requires certain hardware structures. So Nvidia

0:24:40.080 --> 0:24:43.080
<v Speaker 6>has lots and lots of threads, they have a very

0:24:43.119 --> 0:24:47.080
<v Speaker 6>flexible memory system. These things are great for being able

0:24:47.119 --> 0:24:50.479
<v Speaker 6>to flexibly address a whole bunch of different types of

0:24:50.560 --> 0:24:53.760
<v Speaker 6>neural net problems, but they all cost in terms of hardware,

0:24:53.840 --> 0:24:58.160
<v Speaker 6>and they're not necessarily those The choices to have those

0:24:58.160 --> 0:25:01.040
<v Speaker 6>sorts of things are not necessarily the in fact, not

0:25:01.119 --> 0:25:03.399
<v Speaker 6>the choices that you would want to make if you

0:25:03.440 --> 0:25:06.879
<v Speaker 6>were aiming specifically at an LLM. So in order to

0:25:06.920 --> 0:25:11.400
<v Speaker 6>be you know, fully competitive with a chip that's specialized

0:25:11.440 --> 0:25:14.080
<v Speaker 6>for LLMs, they would have to give up all of that.

0:25:14.600 --> 0:25:18.120
<v Speaker 6>And you know, Jensen himself has said that the one

0:25:18.680 --> 0:25:21.320
<v Speaker 6>non negotiable rule in our company is that we have

0:25:21.359 --> 0:25:22.520
<v Speaker 6>to be compatible with CUDA.

0:25:23.480 --> 0:25:27.240
<v Speaker 2>This is interesting. So the challenge for them of spinning

0:25:27.280 --> 0:25:31.280
<v Speaker 2>out something totally different is that it would be outside

0:25:31.359 --> 0:25:35.359
<v Speaker 2>the family. And so it's outside the CUDA family, so

0:25:35.400 --> 0:25:35.800
<v Speaker 2>to speak.

0:25:35.880 --> 0:25:38.640
<v Speaker 3>And meanwhile, you already have, like, PyTorch and Triton

0:25:38.680 --> 0:25:39.600
<v Speaker 3>waiting in the wings.

0:25:39.640 --> 0:25:42.920
<v Speaker 2>I guess, so why don't you tell us a little

0:25:42.920 --> 0:25:46.680
<v Speaker 2>bit more about the business of LLM chips specifically, because

0:25:46.680 --> 0:25:50.320
<v Speaker 2>there's a lot of questions, Like, you know, one question

0:25:50.440 --> 0:25:52.920
<v Speaker 2>is you have all these people in Silicon Valley who

0:25:52.960 --> 0:25:57.040
<v Speaker 2>seem motivated by the idea of, like, AGI, that that's

0:25:57.080 --> 0:26:00.280
<v Speaker 2>the goal, that we're going to have super intelligence one day,

0:26:00.320 --> 0:26:03.720
<v Speaker 2>maybe thousands IQs and the hundreds of thousands one day.

0:26:03.720 --> 0:26:06.200
<v Speaker 2>That'll make us all seem very dumb, et cetera. Are

0:26:06.200 --> 0:26:09.240
<v Speaker 2>you implicitly making a bet by your company that it'll

0:26:09.280 --> 0:26:12.480
<v Speaker 2>be LLMs that will get us there? Because as you mentioned,

0:26:12.520 --> 0:26:15.240
<v Speaker 2>there are other algorithmic ideas, There are other ideas for

0:26:15.320 --> 0:26:18.800
<v Speaker 2>how you might be able to expand intelligence. How much

0:26:18.920 --> 0:26:22.080
<v Speaker 2>of your company's bet is the idea that the future

0:26:22.240 --> 0:26:26.119
<v Speaker 2>of generative AI or as we know it, is going

0:26:26.160 --> 0:26:27.960
<v Speaker 2>to be along the LLM pathway?

0:26:28.520 --> 0:26:30.440
<v Speaker 5>One of the core things. I think there's two core

0:26:30.600 --> 0:26:34.760
<v Speaker 5>ingredients of the LLM pathway. Yeah, one so far is

0:26:34.920 --> 0:26:37.840
<v Speaker 5>the transformer architecture, which is a model architecture and was

0:26:37.880 --> 0:26:41.040
<v Speaker 5>substantially better than the things that came before. But the

0:26:41.080 --> 0:26:43.919
<v Speaker 5>other one, and that actually has a much longer history,

0:26:44.080 --> 0:26:48.000
<v Speaker 5>is the scaling hypothesis in general, sorry. But

0:26:48.200 --> 0:26:51.359
<v Speaker 5>that's the there's a general observation which has been widely

0:26:51.400 --> 0:26:56.040
<v Speaker 5>recognized for a decade or more that if I am sorry,

0:26:56.160 --> 0:26:58.840
<v Speaker 5>I'm training a neural net or some kind of AI model,

0:26:59.200 --> 0:27:01.080
<v Speaker 5>if I want to make its quality better, I can make

0:27:01.119 --> 0:27:03.680
<v Speaker 5>it bigger, and so what does bigger mean? Bigger means

0:27:03.680 --> 0:27:06.359
<v Speaker 5>I have to spend more compute training it. Bigger means

0:27:06.359 --> 0:27:09.800
<v Speaker 5>I have more neurons. Those are loosely analogous to

0:27:09.880 --> 0:27:12.439
<v Speaker 5>the sort of processing power in a human brain, although

0:27:12.800 --> 0:27:15.520
<v Speaker 5>the analogy is weak. If I make my model bigger, I

0:27:15.520 --> 0:27:18.160
<v Speaker 5>get better quality. That's a sort of simple qualitative thing

0:27:18.160 --> 0:27:20.800
<v Speaker 5>to say, and that's been true for a really long

0:27:20.840 --> 0:27:25.199
<v Speaker 5>time in these models. So the advantage of that, or

0:27:25.400 --> 0:27:27.480
<v Speaker 5>the thing that we've seen really recently is we've seen

0:27:27.480 --> 0:27:31.760
<v Speaker 5>this turned up to eleven. So around the time when

0:27:31.800 --> 0:27:34.280
<v Speaker 5>GPT three came out, So in twenty twenty, a paper

0:27:34.359 --> 0:27:37.880
<v Speaker 5>was published called the Scaling Laws, and so this took

0:27:37.920 --> 0:27:41.879
<v Speaker 5>this qualitative observation and made it quantitative and said, actually,

0:27:41.920 --> 0:27:43.680
<v Speaker 5>we can even fit an equation to it, and so

0:27:43.760 --> 0:27:46.760
<v Speaker 5>that gave people a lot more conviction to it. And

0:27:46.960 --> 0:27:50.159
<v Speaker 5>this is what led to the people saying, well, if

0:27:50.200 --> 0:27:52.240
<v Speaker 5>I have a better model, I can solve more problems

0:27:52.240 --> 0:27:54.440
<v Speaker 5>with AI than I could before. And so every time

0:27:54.480 --> 0:27:57.360
<v Speaker 5>I spend ten times as much training on it, I

0:27:57.480 --> 0:28:00.879
<v Speaker 5>unlock new use cases. And so that's what led to this craze.

0:28:00.920 --> 0:28:03.840
<v Speaker 5>And the remarkable thing is that while there are these

0:28:03.840 --> 0:28:05.920
<v Speaker 5>diminishing returns, I have to spend ten times as much

0:28:05.920 --> 0:28:09.639
<v Speaker 5>computing power to get some improvement beyond that sort of

0:28:09.800 --> 0:28:13.679
<v Speaker 5>logarithmic scale. We don't see as yet any plateau, and

0:28:13.840 --> 0:28:16.880
<v Speaker 5>so it seems like there continues to be opportunity here.

0:28:17.160 --> 0:28:19.879
<v Speaker 5>So the key thing is this scaling hypothesis or scaling

0:28:19.960 --> 0:28:21.960
<v Speaker 5>laws in general that are causing these models to grow.

0:28:22.480 --> 0:28:24.600
<v Speaker 5>And then I mean as a hardware provider, what you

0:28:24.680 --> 0:28:26.520
<v Speaker 5>might look at is you might say, that's the thing

0:28:26.520 --> 0:28:28.000
<v Speaker 5>I really want to bet on. I want to bet

0:28:28.040 --> 0:28:30.840
<v Speaker 5>on the growth of models, and I mean, now it's

0:28:30.840 --> 0:28:33.320
<v Speaker 5>a little more in the details, but the thing you

0:28:33.359 --> 0:28:35.399
<v Speaker 5>actually have to bet on is the growth of matrix sizes,

0:28:35.600 --> 0:28:37.920
<v Speaker 5>which is very strongly correlated with the growth of models.

0:28:38.560 --> 0:28:42.280
<v Speaker 3>Just to hammer this point home, if more AI was

0:28:42.360 --> 0:28:47.560
<v Speaker 3>learning from stuff like self play or synthetic data rather

0:28:47.640 --> 0:28:51.520
<v Speaker 3>than scraping the internet, would the design of the chips

0:28:51.720 --> 0:28:54.480
<v Speaker 3>have to take that into account, Like, how would the

0:28:54.560 --> 0:28:58.360
<v Speaker 3>chips vary between those different learning styles.

0:28:59.080 --> 0:29:02.240
<v Speaker 5>Yeah, so in general, when you're building a chip, you

0:29:02.520 --> 0:29:04.560
<v Speaker 5>have to make it programmable because you're going to make

0:29:04.560 --> 0:29:06.080
<v Speaker 5>this chip and you will ship a new version every

0:29:06.120 --> 0:29:07.920
<v Speaker 5>two years, but what people want to do with the

0:29:08.000 --> 0:29:10.239
<v Speaker 5>chip is going to change every month or so, so

0:29:10.280 --> 0:29:12.760
<v Speaker 5>it has to be programmable to some extent. So that's

0:29:12.760 --> 0:29:15.520
<v Speaker 5>true for all of the chips that anyone ships, and

0:29:15.560 --> 0:29:19.360
<v Speaker 5>so there's different scales of programmability and what kinds of

0:29:19.440 --> 0:29:23.160
<v Speaker 5>changes you need to adapt to, So changes in kind

0:29:23.200 --> 0:29:25.880
<v Speaker 5>of the way you feed it data that's maybe on

0:29:25.920 --> 0:29:28.480
<v Speaker 5>the very, very outer layers and doesn't affect much of

0:29:28.520 --> 0:29:30.360
<v Speaker 5>the core of the chip, and so those kinds of

0:29:30.480 --> 0:29:33.120
<v Speaker 5>changes tend to be some of the easier changes to

0:29:33.120 --> 0:29:35.560
<v Speaker 5>adapt to. The things that then become a little harder

0:29:35.560 --> 0:29:39.200
<v Speaker 5>to adapt to is if I'm substantially changing my model architecture.

0:29:39.320 --> 0:29:41.600
<v Speaker 5>So a small change might be maybe I change the

0:29:41.680 --> 0:29:44.640
<v Speaker 5>number of layers, or I reorder some of the layers

0:29:44.640 --> 0:29:47.240
<v Speaker 5>in my model, or maybe I use the same ingredients

0:29:47.320 --> 0:29:49.800
<v Speaker 5>but shuffle them around in some way. A bigger change

0:29:49.840 --> 0:29:51.560
<v Speaker 5>would be to say, okay, I'm actually going to throw

0:29:51.560 --> 0:29:53.320
<v Speaker 5>out all of these ingredients and use a completely different

0:29:53.320 --> 0:29:56.600
<v Speaker 5>set of primitives. And that's often that's that last step

0:29:56.640 --> 0:29:58.560
<v Speaker 5>is the one that would really kill you if

0:29:58.600 --> 0:30:01.280
<v Speaker 5>you're betting very much on a particular set of ingredients.

0:30:01.920 --> 0:30:05.400
<v Speaker 6>So an example of a potential different set of primitives

0:30:05.440 --> 0:30:08.920
<v Speaker 6>that are used in other models that aren't used in

0:30:09.320 --> 0:30:13.120
<v Speaker 6>LLMs: we made mention of these embedding things that

0:30:13.160 --> 0:30:16.800
<v Speaker 6>are used in recommender and ad models. So Facebook has

0:30:17.000 --> 0:30:20.240
<v Speaker 6>talked about building special purpose hardware to support inference on

0:30:20.280 --> 0:30:24.160
<v Speaker 6>those kind of models. Those are they have much less

0:30:24.200 --> 0:30:31.080
<v Speaker 6>emphasis relative emphasis, particularly on matrix multiply. Another possible direction

0:30:31.200 --> 0:30:35.280
<v Speaker 6>that model architecture could go. That would be different and

0:30:35.680 --> 0:30:38.880
<v Speaker 6>bad for a chip designed for current LLMs, would be

0:30:39.400 --> 0:30:43.560
<v Speaker 6>instead of having very large matrices in about one hundred layers,

0:30:43.600 --> 0:30:46.800
<v Speaker 6>you could have much smaller matrices but ten thousand layers,

0:30:47.280 --> 0:30:50.920
<v Speaker 6>and that would demand a different sort of design to

0:30:51.000 --> 0:30:54.520
<v Speaker 6>be good at that kind of model. So a bet

0:30:54.560 --> 0:30:58.880
<v Speaker 6>that looks good given the modern history of neural nets

0:30:58.960 --> 0:31:01.280
<v Speaker 6>is that matrices will get larger over time.

0:31:01.960 --> 0:31:04.280
<v Speaker 2>You know, you're talking about scaling laws, and so everyone

0:31:04.320 --> 0:31:09.000
<v Speaker 2>talks about okay, computation, power, energy efficiency, et cetera, and

0:31:09.120 --> 0:31:11.520
<v Speaker 2>I never know if they're true. But then sometimes you

0:31:11.600 --> 0:31:14.520
<v Speaker 2>read these stories they're like Sam Altman wants to go

0:31:14.560 --> 0:31:18.400
<v Speaker 2>around the world and raise like five trillion dollars to

0:31:18.480 --> 0:31:22.600
<v Speaker 2>like build his own semiconductor fabs and have the entire architecture,

0:31:22.600 --> 0:31:25.400
<v Speaker 2>because that's like what it's going to take. What about

0:31:25.400 --> 0:31:27.920
<v Speaker 2>the data side, because this is another thing people talk about,

0:31:27.920 --> 0:31:30.520
<v Speaker 2>the data wall that you know, there's only one Internet

0:31:30.600 --> 0:31:33.480
<v Speaker 2>to scrape, and then you know, after that, what if

0:31:33.480 --> 0:31:36.320
<v Speaker 2>you're not there at AGI yet again, I know you're

0:31:36.360 --> 0:31:39.520
<v Speaker 2>solving for the hardware side, but when you think about

0:31:39.640 --> 0:31:44.400
<v Speaker 2>risks going forward along the LLM pathway, what's your perspective

0:31:44.600 --> 0:31:47.880
<v Speaker 2>on well, what happens when we've just we've ingested all

0:31:47.880 --> 0:31:49.000
<v Speaker 2>the data.

0:31:49.080 --> 0:31:51.760
<v Speaker 5>So there's two ways you can make a model better.

0:31:51.960 --> 0:31:54.000
<v Speaker 5>One of them is by training on more data, and

0:31:54.040 --> 0:31:56.280
<v Speaker 5>the other one is making a bigger model. And these

0:31:56.320 --> 0:31:59.520
<v Speaker 5>two effects work in a really complementary way. So you

0:31:59.520 --> 0:32:01.520
<v Speaker 5>can think of it like having a bigger brain and

0:32:01.560 --> 0:32:03.840
<v Speaker 5>then practicing more and so both of these are going

0:32:03.920 --> 0:32:06.720
<v Speaker 5>to help to some extent. So there's a risk that

0:32:06.760 --> 0:32:09.680
<v Speaker 5>we hit a data wall. In general, there's been a

0:32:09.720 --> 0:32:13.440
<v Speaker 5>long history of people predicting walls in different kinds of

0:32:13.480 --> 0:32:18.040
<v Speaker 5>walls in technology, and then ingenuity overcoming this, and

0:32:18.080 --> 0:32:22.280
<v Speaker 5>so I wouldn't I would bet that there's a fairly

0:32:22.360 --> 0:32:26.000
<v Speaker 5>large amount of mileage to continue here. Tracy mentioned self

0:32:26.040 --> 0:32:30.840
<v Speaker 5>training and generating new data. That's the vibe in the

0:32:30.840 --> 0:32:33.640
<v Speaker 5>industry is that this is a promising direction for sure.

0:32:34.040 --> 0:32:36.520
<v Speaker 5>But even if you don't bet on that, there's mileage,

0:32:36.520 --> 0:32:38.800
<v Speaker 5>and it's less attractive mileage, but there is mileage in

0:32:38.800 --> 0:32:42.480
<v Speaker 5>making the models bigger. So I believe, and I think

0:32:42.480 --> 0:32:46.000
<v Speaker 5>this is shared by many people insiders in the industry

0:32:46.000 --> 0:32:48.520
<v Speaker 5>as well, is that there's at least a few more

0:32:48.600 --> 0:32:50.880
<v Speaker 5>orders of magnitude available here before we run out of

0:32:51.080 --> 0:32:54.200
<v Speaker 5>easy engineering knobs to turn. But of course, one of

0:32:54.240 --> 0:32:56.800
<v Speaker 5>the limiting factors here is just the dollars you spend.

0:32:57.200 --> 0:33:01.200
<v Speaker 5>So you have some amount of budget that I'm willing

0:33:01.240 --> 0:33:03.440
<v Speaker 5>to spend. And I mean, maybe Sam can raise five

0:33:03.440 --> 0:33:05.880
<v Speaker 5>trillion dollars, I don't think necessarily everyone else can raise

0:33:05.960 --> 0:33:08.360
<v Speaker 5>that amount of money to train a model. And so

0:33:08.400 --> 0:33:10.120
<v Speaker 5>if you've got a fixed amount of dollars that you

0:33:10.160 --> 0:33:11.920
<v Speaker 5>want to spend, and you want to train the best model,

0:33:12.280 --> 0:33:14.240
<v Speaker 5>you want to make the best use of the multipliers,

0:33:14.440 --> 0:33:15.840
<v Speaker 5>you want to make the best use of the dollars

0:33:15.880 --> 0:33:18.320
<v Speaker 5>you spend, and so that means fundamentally, what you're paying

0:33:18.320 --> 0:33:21.200
<v Speaker 5>for is the flops, where a flop is a floating-point operation,

0:33:21.760 --> 0:33:24.840
<v Speaker 5>so the number of multiplies you can do. And then

0:33:24.880 --> 0:33:27.320
<v Speaker 5>every time I increase my model size or increase the

0:33:27.360 --> 0:33:29.920
<v Speaker 5>amount of training data I've got, I'm spending more flops,

0:33:30.040 --> 0:33:34.320
<v Speaker 5>and so flops converts into intelligence. And then if I've

0:33:34.320 --> 0:33:36.200
<v Speaker 5>got a fixed budget, really what I want to maximize

0:33:36.280 --> 0:33:37.280
<v Speaker 5>is my flops per dollar.

0:33:38.680 --> 0:33:41.840
<v Speaker 3>I find this so fascinating because there are so many

0:33:41.880 --> 0:33:45.560
<v Speaker 3>different directions that you could theoretically go in, and so

0:33:45.680 --> 0:33:49.760
<v Speaker 3>many decisions that need to be made, you know, do

0:33:49.800 --> 0:33:52.440
<v Speaker 3>you go after that scale? How do you tailor

0:33:52.520 --> 0:33:55.479
<v Speaker 3>the design for different methods of data input? Although, as

0:33:55.480 --> 0:33:57.680
<v Speaker 3>you said earlier, maybe that's one of the easiest things

0:33:57.840 --> 0:34:01.280
<v Speaker 3>to respond to. But then there are other trade offs

0:34:01.320 --> 0:34:04.680
<v Speaker 3>that you have to think about between speed and power

0:34:04.720 --> 0:34:09.440
<v Speaker 3>consumption and I guess area utilization or the placement of

0:34:09.480 --> 0:34:11.840
<v Speaker 3>all the bits and bobs that we were discussing earlier,

0:34:11.880 --> 0:34:16.200
<v Speaker 3>and cost effectiveness too. How do you balance all those

0:34:16.320 --> 0:34:19.600
<v Speaker 3>elements and are there particular things that you're willing to

0:34:19.760 --> 0:34:21.520
<v Speaker 3>sacrifice for others?

0:34:22.719 --> 0:34:27.000
<v Speaker 6>So different people can choose different targets to go after

0:34:27.480 --> 0:34:31.400
<v Speaker 6>in the market, and so one target, which you

0:34:31.440 --> 0:34:35.640
<v Speaker 6>could argue Nvidia is winning on currently, and one

0:34:35.680 --> 0:34:37.719
<v Speaker 6>of the reasons that their chips, their products, are so

0:34:37.760 --> 0:34:41.400
<v Speaker 6>popular is, as Reiner said, just the amount of flops

0:34:41.440 --> 0:34:43.520
<v Speaker 6>you can get out of a chip, and if

0:34:43.719 --> 0:34:46.440
<v Speaker 6>all the chips are roughly the same to make, that

0:34:46.800 --> 0:34:52.080
<v Speaker 6>translates into flops per dollar. So another target

0:34:52.160 --> 0:34:55.440
<v Speaker 6>you could also go after would be the time to

0:34:55.520 --> 0:34:58.279
<v Speaker 6>respond to one user so to get the answer back.

0:34:58.560 --> 0:35:01.960
<v Speaker 6>One approach is maximizing the throughput that you can have

0:35:02.040 --> 0:35:05.520
<v Speaker 6>and the other is minimizing the latency. So kind of the difference

0:35:05.560 --> 0:35:09.040
<v Speaker 6>between a 747 flying a group of passengers

0:35:09.080 --> 0:35:12.719
<v Speaker 6>across the country versus an SR-71 getting there

0:35:13.000 --> 0:35:15.720
<v Speaker 6>in a couple hours but only bringing one or two people.

0:35:16.120 --> 0:35:18.799
<v Speaker 2>Let's talk about the business itself. So you know, in

0:35:18.840 --> 0:35:22.000
<v Speaker 2>the old you know, ten years ago, someone starting a

0:35:22.360 --> 0:35:26.080
<v Speaker 2>tech startup, they you know, get three or four people

0:35:26.080 --> 0:35:28.120
<v Speaker 2>in an office and then they write something up. But

0:35:28.160 --> 0:35:30.319
<v Speaker 2>then they have code, and maybe they

0:35:30.360 --> 0:35:32.719
<v Speaker 2>don't even have to raise any money to do it,

0:35:32.760 --> 0:35:35.600
<v Speaker 2>and they certainly don't have to depend on whether Taiwan

0:35:35.680 --> 0:35:39.319
<v Speaker 2>Semiconductor has any capacity at their fab or anything like

0:35:39.360 --> 0:35:42.600
<v Speaker 2>this. Walk us through the sort of nuts and bolts

0:35:42.760 --> 0:35:46.280
<v Speaker 2>of what it actually takes to build a chip business

0:35:46.320 --> 0:35:49.799
<v Speaker 2>from the ground up, both in terms of costs and

0:35:50.200 --> 0:35:52.440
<v Speaker 2>time and what you have to rely on. You know,

0:35:52.480 --> 0:35:55.560
<v Speaker 2>we've talked about some of the design elements. What are

0:35:55.560 --> 0:35:58.239
<v Speaker 2>the business side requirements and what will it take to

0:35:58.280 --> 0:35:59.080
<v Speaker 2>actually succeed?

0:35:59.800 --> 0:36:05.200
<v Speaker 6>So fortunately we've kind of referred to this in multiple places.

0:36:05.520 --> 0:36:10.239
<v Speaker 6>There's a huge ecosystem around designing chips. So there's a

0:36:10.280 --> 0:36:12.440
<v Speaker 6>portion you have to do yourself, and there's a portion

0:36:12.520 --> 0:36:15.520
<v Speaker 6>that you can buy, so the placement of Tracy's bits

0:36:15.560 --> 0:36:18.280
<v Speaker 6>and bobs and also the testing that we've talked about,

0:36:18.800 --> 0:36:24.080
<v Speaker 6>there are EDA (electronic design automation) companies that build those tools.

0:36:24.680 --> 0:36:28.600
<v Speaker 6>There are companies that do just manufacturing, so TSMC

0:36:29.800 --> 0:36:34.480
<v Speaker 6>and their suppliers, and then there are many other companies.

0:36:34.520 --> 0:36:39.440
<v Speaker 6>So most companies don't go directly to TSMC. So

0:36:39.640 --> 0:36:45.360
<v Speaker 6>very sophisticated companies like Apple or Nvidia interface directly with them,

0:36:45.400 --> 0:36:49.279
<v Speaker 6>but most other companies go through ASIC vendors. And so

0:36:49.440 --> 0:36:52.400
<v Speaker 6>you know, the most prominent companies

0:36:52.440 --> 0:36:56.399
<v Speaker 6>in that space are Broadcom and Marvell, and then there

0:36:56.400 --> 0:36:59.040
<v Speaker 6>are a bunch of smaller companies. A couple that are

0:36:59.520 --> 0:37:04.319
<v Speaker 6>close to TSMC are Alchip and GUC, and so

0:37:04.560 --> 0:37:08.040
<v Speaker 6>they'll do a lot of the work of taking your

0:37:08.160 --> 0:37:11.799
<v Speaker 6>code and actually getting it placed on the chip. That's

0:37:11.800 --> 0:37:15.600
<v Speaker 6>often a very good thing to outsource because the

0:37:15.640 --> 0:37:18.160
<v Speaker 6>work is somewhat seasonal. You're only ready to do that

0:37:18.239 --> 0:37:21.880
<v Speaker 6>placement when you're near the end of this three year project,

0:37:22.360 --> 0:37:25.279
<v Speaker 6>and so you kind of don't have work unless you're

0:37:25.280 --> 0:37:29.680
<v Speaker 6>a massive company for people the whole time. So while

0:37:29.840 --> 0:37:32.040
<v Speaker 6>that ecosystem means that you don't have to hire a

0:37:32.040 --> 0:37:35.600
<v Speaker 6>huge number of people yourself, all of

0:37:35.600 --> 0:37:39.040
<v Speaker 6>those people have to get paid, and so you do

0:37:39.120 --> 0:37:40.800
<v Speaker 6>have to raise a fair bit of money. And another

0:37:40.840 --> 0:37:43.520
<v Speaker 6>big element of what you actually end up spending

0:37:43.640 --> 0:37:46.400
<v Speaker 6>money on is there are parts of the chip that

0:37:46.920 --> 0:37:51.680
<v Speaker 6>are very special, difficult to design, and take multiple iterations

0:37:51.719 --> 0:37:54.520
<v Speaker 6>of taping things out and seeing if they work. So

0:37:54.960 --> 0:37:58.200
<v Speaker 6>the very high-speed interconnect that connects

0:37:58.239 --> 0:38:02.400
<v Speaker 6>chips together is an example of that. So those are designed

0:38:02.400 --> 0:38:06.239
<v Speaker 6>by yet another set of companies, and the design is

0:38:06.239 --> 0:38:08.840
<v Speaker 6>difficult and fairly expensive because of the need to do

0:38:08.920 --> 0:38:13.359
<v Speaker 6>multiple tapeouts, and so it's fairly expensive to buy

0:38:13.440 --> 0:38:17.279
<v Speaker 6>that IP. So when you add up the cost of

0:38:17.320 --> 0:38:21.560
<v Speaker 6>the IP, the cost of the ASIC vendor's services, and

0:38:21.600 --> 0:38:27.680
<v Speaker 6>then the mask fees that TSMC charges using ASML's and

0:38:27.719 --> 0:38:31.799
<v Speaker 6>mask creation software, you're talking about tens of millions of

0:38:31.800 --> 0:38:34.960
<v Speaker 6>dollars to bring a state of the art chip to market.

0:38:35.000 --> 0:38:38.239
<v Speaker 6>The numbers are much lower for a simpler chip

0:38:38.280 --> 0:38:41.080
<v Speaker 6>without the very high-speed I/Os and on

0:38:41.120 --> 0:38:44.799
<v Speaker 6>an older node, but for an advanced node it's a

0:38:44.840 --> 0:38:46.440
<v Speaker 6>pretty expensive process.

0:38:46.680 --> 0:38:48.480
<v Speaker 3>When do you think you'll be able to bring your

0:38:48.520 --> 0:38:49.240
<v Speaker 3>chips to market?

0:38:49.760 --> 0:38:52.879
<v Speaker 5>Generally, we see these projects taking three to five years

0:38:53.440 --> 0:38:56.040
<v Speaker 5>across most companies. We started on this seriously at the

0:38:56.080 --> 0:38:58.680
<v Speaker 5>beginning of twenty four, so about three years from there

0:38:58.760 --> 0:38:59.839
<v Speaker 5>is likely for us.

0:39:00.719 --> 0:39:04.520
<v Speaker 2>Tell us about what customers because I've heard this, you know,

0:39:04.920 --> 0:39:08.520
<v Speaker 2>we're all trying to find some alternative to Nvidia, whether

0:39:08.560 --> 0:39:12.800
<v Speaker 2>it's to reduce energy costs or just reduce costs in general,

0:39:13.040 --> 0:39:16.480
<v Speaker 2>or be able to even access chips at all, since

0:39:16.520 --> 0:39:18.480
<v Speaker 2>not everyone can get them because there are only so

0:39:18.480 --> 0:39:20.680
<v Speaker 2>many chips getting made. But when you talk to like

0:39:20.880 --> 0:39:25.759
<v Speaker 2>theoretical customers, A, who do you imagine as your customers?

0:39:25.880 --> 0:39:28.239
<v Speaker 2>Is it the OpenAIs of the world, is it

0:39:28.360 --> 0:39:31.440
<v Speaker 2>the Metas of the world? Is it labs that we

0:39:31.560 --> 0:39:34.600
<v Speaker 2>haven't heard of yet that could only get into this

0:39:35.040 --> 0:39:38.000
<v Speaker 2>if there were sort of more focused, lower cost options?

0:39:38.600 --> 0:39:40.719
<v Speaker 2>And then B, what are they asking for? What do

0:39:40.800 --> 0:39:43.360
<v Speaker 2>they say, like, you know what, we're using Nvidia

0:39:43.520 --> 0:39:45.840
<v Speaker 2>right now, but we would really like X or Y

0:39:46.120 --> 0:39:47.440
<v Speaker 2>in the ideal world.

0:39:48.160 --> 0:39:50.760
<v Speaker 5>So there's a range of possible customers in the world.

0:39:50.920 --> 0:39:53.440
<v Speaker 5>The way that we see it, or a way you can divide them up,

0:39:53.560 --> 0:39:55.799
<v Speaker 5>and how we choose to do that is what is

0:39:55.840 --> 0:39:58.160
<v Speaker 5>the ratio of engineering time they're putting into their work

0:39:58.239 --> 0:40:01.319
<v Speaker 5>versus the amount of compute spend that they're putting in.

0:40:01.800 --> 0:40:05.600
<v Speaker 5>So the ideal customer in general for a hardware vendor

0:40:05.640 --> 0:40:08.920
<v Speaker 5>who's trying to make the absolute best, but not necessarily

0:40:08.960 --> 0:40:12.680
<v Speaker 5>easiest to use hardware is a company that is spending

0:40:12.719 --> 0:40:14.400
<v Speaker 5>a lot more on their computing power than they are

0:40:14.400 --> 0:40:16.680
<v Speaker 5>spending on the engineering time, because then that makes a

0:40:16.719 --> 0:40:18.680
<v Speaker 5>really good trade off of maybe I can spend a

0:40:18.719 --> 0:40:20.759
<v Speaker 5>bit more engineering time to make your hardware work, but

0:40:20.800 --> 0:40:23.839
<v Speaker 5>I get a big saving on my computing costs. So

0:40:24.360 --> 0:40:27.359
<v Speaker 5>companies like OpenAI would be obviously a slam dunk.

0:40:27.640 --> 0:40:30.640
<v Speaker 5>There's many more companies as well. So the companies that

0:40:30.680 --> 0:40:34.440
<v Speaker 5>meet this criteria of spending many times more on compute

0:40:34.600 --> 0:40:38.359
<v Speaker 5>than on engineering. There's actually a set of maybe ten

0:40:38.360 --> 0:40:41.040
<v Speaker 5>to fifteen large language model labs that are not as

0:40:41.080 --> 0:40:44.719
<v Speaker 5>well known as OpenAI, but you might think Character.AI, Cohere,

0:40:44.760 --> 0:40:48.719
<v Speaker 5>and many other companies like that, and Mistral. So the

0:40:48.800 --> 0:40:51.120
<v Speaker 5>common thing that we hear from those companies, all of

0:40:51.120 --> 0:40:53.960
<v Speaker 5>those are spending hundreds of millions of dollars on compute,

0:40:55.239 --> 0:40:59.480
<v Speaker 5>is I just want better flops per dollar. That's actually

0:40:59.480 --> 0:41:03.040
<v Speaker 5>the single deciding factor, And that's primarily the reason they're

0:41:03.040 --> 0:41:07.040
<v Speaker 5>deciding on today, deciding on Nvidia's products rather than

0:41:07.080 --> 0:41:09.280
<v Speaker 5>some of the other products in the market, is because

0:41:09.280 --> 0:41:11.440
<v Speaker 5>the flops per dollar of those products is the best

0:41:11.520 --> 0:41:13.799
<v Speaker 5>you can buy. But when you give them a spec

0:41:13.840 --> 0:41:15.600
<v Speaker 5>sheet, the first thing they're going to look at

0:41:15.680 --> 0:41:17.719
<v Speaker 5>is just what's the most floating point operations I can

0:41:17.760 --> 0:41:20.400
<v Speaker 5>run on my chip? And then you can rule out

0:41:20.440 --> 0:41:22.640
<v Speaker 5>ninety percent of products there on the basis of okay,

0:41:22.760 --> 0:41:25.880
<v Speaker 5>just doesn't meet that bar. But then after that you

0:41:25.960 --> 0:41:28.720
<v Speaker 5>then go through the more detailed analysis of saying, okay, well,

0:41:28.880 --> 0:41:31.799
<v Speaker 5>I've got these floating point operations, but is the rest

0:41:31.840 --> 0:41:33.640
<v Speaker 5>going to work out? Do I have the memory bandwidth

0:41:33.719 --> 0:41:36.600
<v Speaker 5>and the interconnect? But for sure, the number one criteria

0:41:36.719 --> 0:41:38.200
<v Speaker 5>is that top line flops.

0:41:38.600 --> 0:41:42.120
<v Speaker 2>When we talk about delivering more flops per dollar, what

0:41:42.160 --> 0:41:46.000
<v Speaker 2>are you aiming for? What is the current benchmark flops per dollar?

0:41:46.360 --> 0:41:48.000
<v Speaker 2>And then are we talking like can it be done

0:41:48.120 --> 0:41:51.600
<v Speaker 2>like ninety percent cheaper? What do you think is realistic

0:41:51.640 --> 0:41:54.600
<v Speaker 2>in terms of coming to market with something meaningfully better

0:41:54.640 --> 0:41:55.480
<v Speaker 2>on that metric?

0:41:56.280 --> 0:42:00.120
<v Speaker 5>So Nvidia's Blackwell, in their FP4 format, offers

0:42:00.680 --> 0:42:03.399
<v Speaker 5>ten petaflops in their chip, and that chip

0:42:03.440 --> 0:42:08.840
<v Speaker 5>sells for ballpark thirty to fifty thousand, depending on many factors.

0:42:09.360 --> 0:42:12.800
<v Speaker 5>That is about a factor of two to four better

0:42:13.080 --> 0:42:15.440
<v Speaker 5>than the previous generation Nvidia chip, which is the

0:42:15.480 --> 0:42:18.359
<v Speaker 5>Hopper chip. So part of that factor is coming from

0:42:18.360 --> 0:42:21.040
<v Speaker 5>going to lower precision, going from eight bit precision to

0:42:21.080 --> 0:42:24.480
<v Speaker 5>four bit precision. In general, precision is one of

0:42:24.520 --> 0:42:27.640
<v Speaker 5>the best ways to improve the flops you can pack

0:42:27.680 --> 0:42:30.040
<v Speaker 5>into a certain amount of silicon, and then some of

0:42:30.040 --> 0:42:31.960
<v Speaker 5>it is also coming from other factors such as cost

0:42:32.000 --> 0:42:35.000
<v Speaker 5>reductions that Nvidia has deployed. So that's a

0:42:35.000 --> 0:42:37.480
<v Speaker 5>benchmark for where Nvidia is at. Now you need

0:42:37.520 --> 0:42:40.120
<v Speaker 5>to be at least a few integer multiples better than

0:42:40.120 --> 0:42:42.160
<v Speaker 5>that in order to compete with the incumbent. So at

0:42:42.239 --> 0:42:45.240
<v Speaker 5>least you know, two or three times better on that metric,

0:42:45.280 --> 0:42:47.520
<v Speaker 5>we would say. But then, of course, if you're designing

0:42:47.520 --> 0:42:49.359
<v Speaker 5>for the future, you have to compete against the next

0:42:49.400 --> 0:42:51.960
<v Speaker 5>generation after that too, and so you want to be

0:42:52.280 --> 0:42:54.839
<v Speaker 5>many times better than the future chip, which isn't out yet,

0:42:54.880 --> 0:42:56.360
<v Speaker 5>And so that's the thing you aim for.

0:42:57.000 --> 0:42:59.760
<v Speaker 2>Is there anything else that we should sort of understand

0:43:00.080 --> 0:43:02.360
<v Speaker 2>about this business that we haven't touched on that you

0:43:02.400 --> 0:43:03.359
<v Speaker 2>think is important?

0:43:03.560 --> 0:43:06.400
<v Speaker 6>One thing, given that this is Odd Lots: I

0:43:06.440 --> 0:43:09.360
<v Speaker 6>think the reason that Sam Altman is going around the

0:43:09.360 --> 0:43:12.839
<v Speaker 6>world talking about trillions of dollars of spend is that

0:43:12.920 --> 0:43:16.120
<v Speaker 6>he wants to move the expectations of all of the

0:43:16.160 --> 0:43:20.719
<v Speaker 6>suppliers up. So as we've observed in the

0:43:20.800 --> 0:43:26.160
<v Speaker 6>semiconductor shortage, if the suppliers are preparing for a certain

0:43:26.200 --> 0:43:28.600
<v Speaker 6>amount of demand and, you know, in the case

0:43:29.640 --> 0:43:33.240
<v Speaker 6>famously of the auto manufacturers, who as a result of COVID

0:43:33.719 --> 0:43:37.040
<v Speaker 6>canceled their orders and then found that demand was much, much,

0:43:37.160 --> 0:43:40.279
<v Speaker 6>much larger than they expected, it took a very long

0:43:40.360 --> 0:43:44.640
<v Speaker 6>time to catch up. A similar thing happened with the

0:43:45.000 --> 0:43:48.600
<v Speaker 6>Nvidia H100. So TSMC was actually perfectly

0:43:48.640 --> 0:43:51.839
<v Speaker 6>capable of keeping up with demand for the chips themselves.

0:43:52.280 --> 0:43:57.279
<v Speaker 6>But the chips for these AI products use a

0:43:57.360 --> 0:44:01.040
<v Speaker 6>very special kind of packaging which puts the compute chips

0:44:01.120 --> 0:44:03.239
<v Speaker 6>very close to the memory chips and hence allows them

0:44:03.280 --> 0:44:08.719
<v Speaker 6>to communicate very quickly, called CoWoS. And the capacity for

0:44:08.840 --> 0:44:14.040
<v Speaker 6>CoWoS was limited because TSMC built with a particular expectation

0:44:14.160 --> 0:44:17.520
<v Speaker 6>of demand, and when H100 became such a

0:44:17.560 --> 0:44:22.520
<v Speaker 6>monster product, their CoWoS capacity wasn't able to keep pace

0:44:22.680 --> 0:44:26.440
<v Speaker 6>with demand. So, you know, supply chain tends to be

0:44:26.560 --> 0:44:31.200
<v Speaker 6>really good if you predict accurately, and if you predict badly,

0:44:31.480 --> 0:44:33.920
<v Speaker 6>you know, on the low side, then you end

0:44:34.000 --> 0:44:37.520
<v Speaker 6>up with these shortages. But on the other hand, these companies,

0:44:37.680 --> 0:44:41.600
<v Speaker 6>because the manufacturing companies have very high capex, they are

0:44:41.640 --> 0:44:44.759
<v Speaker 6>fairly loath to predict badly on the high side

0:44:44.800 --> 0:44:48.040
<v Speaker 6>because that leads them to having spent a bunch of

0:44:48.040 --> 0:44:50.920
<v Speaker 6>money on capex that they're unable to recover.

0:44:51.520 --> 0:44:54.879
<v Speaker 2>So, yeah, this is very interesting, this idea that in

0:44:54.920 --> 0:44:58.960
<v Speaker 2>some part it's a signal we're not slowing down. We're

0:44:59.040 --> 0:45:01.000
<v Speaker 2>you know, we have more and more that we want

0:45:01.040 --> 0:45:04.759
<v Speaker 2>to do. So if you're anywhere along the semiconductor supply chain,

0:45:05.280 --> 0:45:08.359
<v Speaker 2>don't start, you know, curbing your expectations or curbing your

0:45:08.360 --> 0:45:10.960
<v Speaker 2>production because we want to build a lot more. I'm

0:45:11.000 --> 0:45:14.120
<v Speaker 2>curious one last question, I guess for both of you.

0:45:14.120 --> 0:45:15.880
<v Speaker 2>You know, you hear a lot of people in the

0:45:15.960 --> 0:45:18.840
<v Speaker 2>industry say, like, we might just be three or

0:45:18.880 --> 0:45:24.400
<v Speaker 2>four years away from AGI or super intelligence, however that's defined,

0:45:24.800 --> 0:45:27.000
<v Speaker 2>and then you get into a lot of these philosophical

0:45:27.120 --> 0:45:30.000
<v Speaker 2>questions and ethical questions about you know, whatever is the

0:45:30.080 --> 0:45:32.440
<v Speaker 2>AI going to, what's gonna be the role for

0:45:32.520 --> 0:45:34.520
<v Speaker 2>humans or is it gonna kill us all? Or whatever

0:45:35.160 --> 0:45:37.640
<v Speaker 2>you know, fear scenario you want. But the two of

0:45:37.680 --> 0:45:39.879
<v Speaker 2>you like, how do you see that question? Like could

0:45:39.960 --> 0:45:42.680
<v Speaker 2>we hit it in just a few short years where

0:45:43.200 --> 0:45:45.840
<v Speaker 2>we have something that people agree is oh, this is

0:45:45.960 --> 0:45:49.360
<v Speaker 2>AGI? Like, is it a short runway, or just

0:45:49.440 --> 0:45:51.200
<v Speaker 2>a couple of years away from this or does it

0:45:51.200 --> 0:45:53.719
<v Speaker 2>feel like no, that's still quite a few years out.

0:45:53.960 --> 0:45:56.560
<v Speaker 5>If ever, I think what we have what's your.

0:46:00.560 --> 0:46:06.480
<v Speaker 6>Approximately zero, to be blunt. Thank you. My p of great things,

0:46:06.640 --> 0:46:09.480
<v Speaker 6>I mean, I think we kind of already have great things,

0:46:09.480 --> 0:46:12.960
<v Speaker 6>and we've just gotten the models of this level of

0:46:13.000 --> 0:46:15.440
<v Speaker 6>quality recently and we're learning how to use them, and

0:46:15.480 --> 0:46:19.520
<v Speaker 6>the quality is going up. The you know, the fact

0:46:19.520 --> 0:46:21.960
<v Speaker 6>that we can get a computer to write code pretty

0:46:21.960 --> 0:46:26.160
<v Speaker 6>well is fairly amazing to me. That you can ask

0:46:26.200 --> 0:46:28.520
<v Speaker 6>it to tell a good joke in the style of

0:46:28.560 --> 0:46:32.760
<v Speaker 6>a particular person and it can do that is also amazing. Yeah.

0:46:32.840 --> 0:46:36.160
<v Speaker 2>Well, uh, I'm glad

0:46:36.200 --> 0:46:39.239
<v Speaker 2>your odds of total doom and annihilation are zero. That

0:46:39.320 --> 0:46:41.799
<v Speaker 2>makes me feel a little bit better. Reiner and Mike,

0:46:41.840 --> 0:46:43.560
<v Speaker 2>thank you so much for coming on Odd Lots.

0:46:43.600 --> 0:47:02.600
<v Speaker 7>I learned a lot from that conversation. That was a pleasure.

0:46:58.840 --> 0:46:59.240
<v Speaker 4>Tracy.

0:46:59.239 --> 0:47:02.920
<v Speaker 2>There was obviously a ton that was really interesting in that conversation,

0:47:03.000 --> 0:47:07.000
<v Speaker 2>but I particularly like the part about incentives of large

0:47:07.200 --> 0:47:11.360
<v Speaker 2>legacy incumbents about entering a totally new business. So for

0:47:11.400 --> 0:47:15.720
<v Speaker 2>a company like Google, the primary purpose of their chips

0:47:16.120 --> 0:47:19.720
<v Speaker 2>is going to be serving an in house business purpose.

0:47:19.760 --> 0:47:21.600
<v Speaker 2>And even with all the money that they have, and

0:47:21.640 --> 0:47:24.920
<v Speaker 2>even with the engineering talent, there's still a sort of

0:47:25.080 --> 0:47:28.359
<v Speaker 2>trade off question involved of how much do we want

0:47:28.440 --> 0:47:31.839
<v Speaker 2>to build chips for some other purpose, for some sort

0:47:31.880 --> 0:47:33.000
<v Speaker 2>of external service.

0:47:33.120 --> 0:47:36.120
<v Speaker 3>Yeah, and I also thought the point about why Sam

0:47:36.160 --> 0:47:39.120
<v Speaker 3>Altman is going around talking about how, you know, how

0:47:39.160 --> 0:47:42.319
<v Speaker 3>many billions he's going to spend was really interesting and

0:47:42.480 --> 0:47:44.960
<v Speaker 3>it kind of makes sense in the aftermath of the

0:47:45.000 --> 0:47:49.000
<v Speaker 3>pandemic and semiconductors. I'm sure you remember this. I think

0:47:49.040 --> 0:47:51.600
<v Speaker 3>that was actually where we first learned about the bullwhip

0:47:51.640 --> 0:47:55.040
<v Speaker 3>effect and this idea that very small changes in one

0:47:55.280 --> 0:47:57.880
<v Speaker 3>end of the supply chain, which would be customer demand,

0:47:58.160 --> 0:48:02.000
<v Speaker 3>can end up reverberating, you know, all the way through

0:48:02.040 --> 0:48:05.440
<v Speaker 3>the supply chain. And so when you had carmakers start

0:48:05.440 --> 0:48:08.280
<v Speaker 3>to cut back on their orders. That had a much

0:48:08.320 --> 0:48:12.000
<v Speaker 3>bigger and longer impact than you might have anticipated. And

0:48:12.040 --> 0:48:15.000
<v Speaker 3>so it's interesting to see companies coming at it from

0:48:15.040 --> 0:48:17.799
<v Speaker 3>the other end and saying like, no, we have all

0:48:17.840 --> 0:48:19.640
<v Speaker 3>this money and we're going to be here for a

0:48:19.680 --> 0:48:20.240
<v Speaker 3>long time.

0:48:20.480 --> 0:48:23.480
<v Speaker 2>We're not slowing down. We are going to AGI. And

0:48:23.560 --> 0:48:25.680
<v Speaker 2>so if you think like, oh, we're gonna come out

0:48:25.719 --> 0:48:28.360
<v Speaker 2>with GPT-5 and then we're going to focus on

0:48:28.480 --> 0:48:31.719
<v Speaker 2>just like commercializing that and selling it to airlines to

0:48:31.719 --> 0:48:34.560
<v Speaker 2>do customer support after that, and just go into glide

0:48:34.600 --> 0:48:37.120
<v Speaker 2>mode and take business. Like, they want to signal that

0:48:37.160 --> 0:48:39.600
<v Speaker 2>they're like building more and more and more. I thought

0:48:39.600 --> 0:48:42.400
<v Speaker 2>that was interesting. I thought it was interesting the point

0:48:42.480 --> 0:48:47.160
<v Speaker 2>about Nvidia and CUDA and the idea that, Okay, yes,

0:48:47.520 --> 0:48:51.400
<v Speaker 2>the CUDA software ecosystem is perceived to be this moat

0:48:51.400 --> 0:48:55.000
<v Speaker 2>that makes it harder for other semiconductor companies to break

0:48:55.080 --> 0:48:58.680
<v Speaker 2>into the same business, but it's also constraining from an

0:48:58.680 --> 0:49:01.839
<v Speaker 2>Nvidia perspective, the idea that, Okay, if they want

0:49:01.920 --> 0:49:06.280
<v Speaker 2>everything to be CUDA compatible or be within the same

0:49:06.480 --> 0:49:10.799
<v Speaker 2>family of software usage, then that also constrains the potential

0:49:11.160 --> 0:49:13.400
<v Speaker 2>sidelines that they might get into right.

0:49:13.280 --> 0:49:16.200
<v Speaker 3>And opens up space for competitors. But I don't know

0:49:16.239 --> 0:49:20.080
<v Speaker 3>why I haven't really like internalized this lesson before, because

0:49:20.120 --> 0:49:24.360
<v Speaker 3>it comes up in every conversation we do on semiconductors.

0:49:24.440 --> 0:49:26.919
<v Speaker 3>But I think there's still a perception, or at least

0:49:26.960 --> 0:49:29.520
<v Speaker 3>maybe I still have this perception that the moat around

0:49:29.600 --> 0:49:32.319
<v Speaker 3>Nvidia is like the actual hardware. Yes, but it's not.

0:49:32.640 --> 0:49:34.560
<v Speaker 3>It's the software. It's CUDA.

0:49:34.840 --> 0:49:35.800
<v Speaker 2>It seems like it's both.

0:49:36.080 --> 0:49:39.160
<v Speaker 3>Well, yeah, but I think I'm starting to appreciate how

0:49:39.239 --> 0:49:41.120
<v Speaker 3>much of it is CUDA, is what I'm saying.

0:49:40.960 --> 0:49:43.960
<v Speaker 2>It certainly seems to come up over

0:49:44.120 --> 0:49:47.400
<v Speaker 2>and over again. How much the fact that this is

0:49:47.440 --> 0:49:50.920
<v Speaker 2>what people use. It's the software that makes it easy

0:49:51.000 --> 0:49:56.360
<v Speaker 2>for less sophisticated customers to use the applications.

0:49:56.520 --> 0:49:59.439
<v Speaker 2>It seems extremely powerful. It's also interesting to hear about

0:49:59.480 --> 0:50:05.880
<v Speaker 2>like the ecosystem of businesses around semiconductor design. And you know,

0:50:06.120 --> 0:50:10.120
<v Speaker 2>he mentioned Broadcom. Reiner mentioned Broadcom, which is a company

0:50:10.160 --> 0:50:12.759
<v Speaker 2>that I don't think we've ever really talked about very

0:50:12.840 --> 0:50:15.880
<v Speaker 2>much on the show. But if you look at that stock,

0:50:16.280 --> 0:50:18.880
<v Speaker 2>I mean, it looks kind of like you're looking at

0:50:18.920 --> 0:50:20.960
<v Speaker 2>a chart of Nvidia. Like, that has been a

0:50:21.040 --> 0:50:25.640
<v Speaker 2>gigantic winner over the last few years. Back in twenty twenty,

0:50:25.800 --> 0:50:27.600
<v Speaker 2>it was a thirty-one-dollar stock. Now it's a one

0:50:27.680 --> 0:50:29.800
<v Speaker 2>hundred-and-forty-six-dollar stock. Okay, so that's

0:50:29.800 --> 0:50:32.560
<v Speaker 2>a five-bagger or so, maybe not quite Nvidia returns.

0:50:33.000 --> 0:50:35.799
<v Speaker 3>And this idea that, like, Nvidia has just

0:50:35.880 --> 0:50:39.719
<v Speaker 3>skewed, like, you know, what's expected of every stock. It's like,

0:50:40.080 --> 0:50:41.600
<v Speaker 3>but this is on a different plane.

0:50:41.719 --> 0:50:46.040
<v Speaker 2>And this idea that a semiconductor startup doesn't necessarily interface

0:50:46.120 --> 0:50:50.440
<v Speaker 2>directly with TSMC, like that's really for the most sophisticated, advanced ones,

0:50:50.480 --> 0:50:52.560
<v Speaker 2>and then there are some of these companies in the middle.

0:50:52.640 --> 0:50:54.080
<v Speaker 2>I thought that was extremely interesting.

0:50:54.320 --> 0:50:57.160
<v Speaker 3>Uh, you know what, Joe, I asked ChatGPT what

0:50:57.280 --> 0:51:03.080
<v Speaker 3>the most beautiful semiconductor is. Yeah, it says gallium arsenide

0:51:03.440 --> 0:51:09.080
<v Speaker 3>is considered beautiful for several reasons. Its crystal structure is

0:51:09.160 --> 0:51:12.600
<v Speaker 3>often admired for its clarity and elegance. Wow, So I

0:51:12.600 --> 0:51:15.399
<v Speaker 3>guess semiconductors made of gallium arsenide.

0:51:16.320 --> 0:51:19.600
<v Speaker 2>So there's beauty at the molecular level. Yeah. But actually I thought,

0:51:19.680 --> 0:51:22.000
<v Speaker 2>you know, I thought when you asked that question, it's like, oh,

0:51:22.040 --> 0:51:25.440
<v Speaker 2>it's just sort of a you know, philosophical, you know,

0:51:25.600 --> 0:51:29.960
<v Speaker 2>fun whimsical question, but this idea of like doing the

0:51:30.040 --> 0:51:33.360
<v Speaker 2>minimum required or not building a bunch of extra rooms

0:51:33.440 --> 0:51:36.239
<v Speaker 2>in the house that you don't really need. And as

0:51:36.280 --> 0:51:38.959
<v Speaker 2>we know, I mean, it's just objectively true that even

0:51:38.960 --> 0:51:41.080
<v Speaker 2>if Nvidia chips are the best in the world

0:51:41.280 --> 0:51:44.879
<v Speaker 2>for AI, they do other stuff beyond AI, and they

0:51:44.920 --> 0:51:47.840
<v Speaker 2>do Ethereum mining, or they used to, and that was

0:51:48.160 --> 0:51:50.479
<v Speaker 2>based on proof of work back in the old days.

0:51:50.520 --> 0:51:52.320
<v Speaker 2>And of course they're for video games. But if you

0:51:52.400 --> 0:51:55.560
<v Speaker 2>really just want a computer, or if you really just

0:51:55.600 --> 0:51:59.520
<v Speaker 2>want a model that can speak in English or write code,

0:52:00.200 --> 0:52:04.240
<v Speaker 2>or can just think without doing video games and crypto mining,

0:52:04.560 --> 0:52:06.440
<v Speaker 2>then perhaps there are a bunch of rooms in the

0:52:06.440 --> 0:52:07.920
<v Speaker 2>house that are totally unnecessary.

0:52:08.080 --> 0:52:11.480
<v Speaker 3>Yeah. And I mean, there's efficiency costs to that. Efficiency costs, yeah,

0:52:11.680 --> 0:52:14.759
<v Speaker 3>you're trying to streamline it as much as possible. All right,

0:52:14.800 --> 0:52:15.520
<v Speaker 3>shall we leave it there.

0:52:15.600 --> 0:52:16.279
<v Speaker 2>Let's leave it there.

0:52:16.480 --> 0:52:19.280
<v Speaker 3>This has been another episode of the Odd Lots podcast.

0:52:19.360 --> 0:52:22.760
<v Speaker 3>I'm Tracy Alloway. You can follow me at Tracy Alloway.

0:52:22.480 --> 0:52:25.040
<v Speaker 2>And I'm Joe Weisenthal. You can follow me at The Stalwart.

0:52:25.280 --> 0:52:28.200
<v Speaker 2>Follow our guests Reiner Pope. He's at Reiner

0:52:28.280 --> 0:52:32.400
<v Speaker 2>Pope and Mike Gunter. He's at Mike Gunter underscore. Follow our

0:52:32.400 --> 0:52:35.920
<v Speaker 2>producers Carmen Rodriguez at Carmen Armen, Dashiell Bennett at

0:52:36.000 --> 0:52:39.520
<v Speaker 2>Dashbot and Kail Brooks at Kailbrooks. Thank you to our producer

0:52:39.560 --> 0:52:42.919
<v Speaker 2>Moses Ondam. For more Odd Lots content, go to Bloomberg dot

0:52:42.920 --> 0:52:46.040
<v Speaker 2>com slash odd Lots, where we have transcripts, a blog,

0:52:46.120 --> 0:52:48.400
<v Speaker 2>and a newsletter and you can chat about all of

0:52:48.440 --> 0:52:51.480
<v Speaker 2>these topics twenty-four seven in the Discord, discord

0:52:51.520 --> 0:52:55.239
<v Speaker 2>dot gg slash odd lots. There's even a semiconductor room

0:52:55.239 --> 0:52:57.480
<v Speaker 2>in there, so you can just go there and just

0:52:57.520 --> 0:52:59.040
<v Speaker 2>talk about chips all day if you want.

0:53:00.040 --> 0:53:02.399
<v Speaker 3>If you enjoy Odd Lots, if you like it when

0:53:02.440 --> 0:53:05.359
<v Speaker 3>we talk about what the most beautiful semiconductor is, then

0:53:05.400 --> 0:53:09.200
<v Speaker 3>please leave us a positive review on your favorite podcast platform.

0:53:09.560 --> 0:53:12.359
<v Speaker 3>And remember, if you're a Bloomberg subscriber, you can listen

0:53:12.400 --> 0:53:15.520
<v Speaker 3>to all of our episodes absolutely ad free. All you

0:53:15.600 --> 0:53:18.960
<v Speaker 3>need to do is connect your Bloomberg account with Apple Podcasts.

0:53:19.239 --> 0:53:21.640
<v Speaker 3>In order to do that, just find the Bloomberg channel

0:53:21.719 --> 0:53:41.200
<v Speaker 3>on the platform and follow the instructions there. Thanks for listening.