WEBVTT - Inside the Battle for Chips That Will Power Artificial Intelligence

0:00:10.160 --> 0:00:14.480
<v Speaker 1>Hello, and welcome to another episode of the Odd Lots podcast.

0:00:14.560 --> 0:00:16.280
<v Speaker 1>I'm Joe Weisenthal.

0:00:15.840 --> 0:00:16.880
<v Speaker 2>And I'm Tracy Alloway.

0:00:17.239 --> 0:00:21.160
<v Speaker 1>Tracy, I'm not sure if you've heard anyone talking about

0:00:21.200 --> 0:00:22.800
<v Speaker 1>it or anything, but have you heard about like this

0:00:22.920 --> 0:00:24.920
<v Speaker 1>sort of AI thing people have been discussing?

0:00:24.960 --> 0:00:27.720
<v Speaker 2>Oh, you know what, I discovered this really cool new

0:00:27.800 --> 0:00:29.240
<v Speaker 2>thing called ChatGPT.

0:00:29.280 --> 0:00:31.520
<v Speaker 1>Oh yeah, I saw that website too. Yeah.

0:00:31.560 --> 0:00:32.440
<v Speaker 2>Have you tried it?

0:00:32.960 --> 0:00:35.080
<v Speaker 1>I tried it. Yeah, I had it, like, write a

0:00:35.080 --> 0:00:38.840
<v Speaker 1>poem for me. It's pretty cool technology. We should probably

0:00:38.880 --> 0:00:39.639
<v Speaker 1>learn more about it.

0:00:39.880 --> 0:00:42.199
<v Speaker 2>Yeah, I think we should. No, okay, all right, obviously

0:00:42.320 --> 0:00:46.920
<v Speaker 2>we're being facetious and joking, but everyone has been talking

0:00:47.159 --> 0:00:51.920
<v Speaker 2>about AI and these new sort of natural language interfaces

0:00:52.000 --> 0:00:56.440
<v Speaker 2>that allow you to ask questions or generate all different

0:00:56.480 --> 0:00:59.320
<v Speaker 2>types of texts and things like that. It feels like

0:00:59.440 --> 0:01:02.280
<v Speaker 2>everyone is very excited about that space.

0:01:02.160 --> 0:01:06.240
<v Speaker 1>Almost every time. Like, I went out with

0:01:06.280 --> 0:01:08.399
<v Speaker 1>some friends that I hadn't seen in a long time,

0:01:08.480 --> 0:01:10.720
<v Speaker 1>Like I was at a bar last night, and like

0:01:10.800 --> 0:01:13.840
<v Speaker 1>the conversation like turned to AI within like two minutes.

0:01:13.880 --> 0:01:16.120
<v Speaker 1>Never got to talk about the experiments they did. But yes,

0:01:16.240 --> 0:01:18.959
<v Speaker 1>there is a lot. It's basically like this, like wall

0:01:19.040 --> 0:01:22.240
<v Speaker 1>of noise, and everyone's been talking about it, actually, but us,

0:01:22.280 --> 0:01:24.280
<v Speaker 1>because I don't think we have done as far as

0:01:24.319 --> 0:01:27.400
<v Speaker 1>I can recall, like an AI episode. We don't want

0:01:27.400 --> 0:01:30.240
<v Speaker 1>to just add to the noise and get another sort

0:01:30.280 --> 0:01:33.240
<v Speaker 1>of chin stroke around. But obviously there's a lot there

0:01:33.280 --> 0:01:33.679
<v Speaker 1>for us.

0:01:33.560 --> 0:01:36.320
<v Speaker 2>To discuss. Totally, and I'm sure this will be the

0:01:36.319 --> 0:01:39.720
<v Speaker 2>first of many episodes. But one of the ways that

0:01:39.760 --> 0:01:43.640
<v Speaker 2>it fits into sort of classic Odd Lots lore is

0:01:44.000 --> 0:01:45.360
<v Speaker 2>via semiconductors.

0:01:45.480 --> 0:01:45.640
<v Speaker 3>Right.

0:01:45.840 --> 0:01:49.480
<v Speaker 2>If you think about what ChatGPT, for instance, is doing,

0:01:49.680 --> 0:01:55.000
<v Speaker 2>it's taking words and transforming them into numbers and then

0:01:55.240 --> 0:01:57.920
<v Speaker 2>spitting those words back out at you. And the thing

0:01:58.000 --> 0:02:01.520
<v Speaker 2>that enables it to do that is semiconductors, chips.

0:02:01.800 --> 0:02:04.560
<v Speaker 1>Right, So here's like the four things I think I

0:02:04.600 --> 0:02:08.440
<v Speaker 1>know about this and so this is that A. Training

0:02:08.480 --> 0:02:10.680
<v Speaker 1>the AI models so that they can do that is

0:02:10.680 --> 0:02:16.119
<v Speaker 1>a computationally intensive process. B. Each query is much more

0:02:16.120 --> 0:02:18.680
<v Speaker 1>computationally intensive than say a Google search.

0:02:19.400 --> 0:02:19.680
<v Speaker 3>Three.

0:02:20.360 --> 0:02:23.880
<v Speaker 1>The company that's absolutely crushing the space and printing money

0:02:24.000 --> 0:02:27.840
<v Speaker 1>because of this is Nvidia. Yeah. And four, there's

0:02:27.880 --> 0:02:31.920
<v Speaker 1>a general scarcity of computing power, so that even if

0:02:31.960 --> 0:02:35.200
<v Speaker 1>you and I like were brilliant mathematicians and AI theorists,

0:02:35.200 --> 0:02:38.440
<v Speaker 1>et cetera. If we wanted to start a ChatGPT competitor,

0:02:39.200 --> 0:02:42.400
<v Speaker 1>just getting access to the computing power in order to

0:02:42.480 --> 0:02:44.960
<v Speaker 1>do that would not be trivial, even if we had

0:02:45.000 --> 0:02:46.040
<v Speaker 1>tons of money outside of it.

0:02:46.120 --> 0:02:49.200
<v Speaker 2>I'm going to buy an out-of-business crypto mine and

0:02:49.240 --> 0:02:49.799
<v Speaker 2>take all the.

0:02:50.280 --> 0:02:53.280
<v Speaker 1>They've already been bought. Someone got that. But that's that's

0:02:53.360 --> 0:02:57.240
<v Speaker 1>basically the extent of my understanding of the nexus between

0:02:57.360 --> 0:03:01.320
<v Speaker 1>this AI and chips, and I suspect there's more to know.

0:03:01.400 --> 0:03:05.120
<v Speaker 2>Well, I also think having a conversation about

0:03:05.160 --> 0:03:09.280
<v Speaker 2>semiconductors and AI is a really good way to understand

0:03:09.480 --> 0:03:12.720
<v Speaker 2>the underlying technology of both those things. So that's what

0:03:12.760 --> 0:03:14.280
<v Speaker 2>I'm hoping for out of this conversation.

0:03:14.320 --> 0:03:16.320
<v Speaker 1>All right, Well, you mentioned we've been doing We've done

0:03:16.360 --> 0:03:18.560
<v Speaker 1>lots of chips episodes in the past, so we're going

0:03:18.639 --> 0:03:22.040
<v Speaker 1>to go back to the future or something like that.

0:03:22.080 --> 0:03:23.960
<v Speaker 1>We're going to go back to our first episode, our

0:03:24.000 --> 0:03:27.240
<v Speaker 1>first guest, where we started exploring chips episodes. I think

0:03:27.240 --> 0:03:29.720
<v Speaker 1>it was the first one that we did sometime maybe

0:03:29.760 --> 0:03:32.760
<v Speaker 1>in early twenty twenty one. We're going to be speaking

0:03:32.800 --> 0:03:36.320
<v Speaker 1>with Stacy Rasgon, managing director and senior analyst of US

0:03:36.360 --> 0:03:40.960
<v Speaker 1>Semiconductors and Semiconductor Capital Equipment at Bernstein Research, someone who's

0:03:41.040 --> 0:03:43.280
<v Speaker 1>great at breaking all this stuff down, has been doing

0:03:43.320 --> 0:03:46.280
<v Speaker 1>a lot of research on this question now. So Stacy,

0:03:46.680 --> 0:03:48.760
<v Speaker 1>thank you so much for coming back on Odd Lots.

0:03:49.680 --> 0:03:51.520
<v Speaker 3>I am so happy to be back. Thank you so

0:03:51.640 --> 0:03:52.560
<v Speaker 3>much for having me.

0:03:52.560 --> 0:03:54.560
<v Speaker 1>Right, so I'm going to start with just sort of like

0:03:54.880 --> 0:03:58.560
<v Speaker 1>not even a business question, but a sort of semiconductor

0:03:58.600 --> 0:04:03.280
<v Speaker 1>design question, which is this company, Nvidia. Like, for

0:04:03.440 --> 0:04:05.480
<v Speaker 1>years I just sort of knew them as, like, they

0:04:05.480 --> 0:04:08.680
<v Speaker 1>were the company that made graphics cards for video games,

0:04:08.720 --> 0:04:10.880
<v Speaker 1>and then for a while they were like, oh,

0:04:10.920 --> 0:04:13.960
<v Speaker 1>and they're also good for crypto mining, and they were

0:04:14.040 --> 0:04:16.880
<v Speaker 1>very popular for a while in Ethereum mining when it

0:04:17.000 --> 0:04:20.279
<v Speaker 1>used proof of work. And now my understanding is everyone

0:04:20.320 --> 0:04:22.800
<v Speaker 1>wants their chips for AI purposes. And we'll get into

0:04:22.839 --> 0:04:25.760
<v Speaker 1>all that, but just to start, what is it about

0:04:25.839 --> 0:04:29.920
<v Speaker 1>the design of their chips that makes them naturally suited

0:04:29.960 --> 0:04:32.200
<v Speaker 1>for these other things? A company that started in graphics

0:04:32.240 --> 0:04:35.440
<v Speaker 1>cards that makes them naturally suited for these things like

0:04:35.560 --> 0:04:39.240
<v Speaker 1>AI in a way apparently that other chip makers, like

0:04:39.279 --> 0:04:43.400
<v Speaker 1>say, Intel, their chips do not seem to be as

0:04:43.720 --> 0:04:44.640
<v Speaker 1>used for this space.

0:04:46.160 --> 0:04:48.560
<v Speaker 3>Yeah, so let me step back.

0:04:48.640 --> 0:04:52.040
<v Speaker 1>Yeah, sure, if the question, if the question is totally

0:04:52.120 --> 0:04:54.320
<v Speaker 1>flawed in its premise, then feel free to say your

0:04:54.400 --> 0:04:56.320
<v Speaker 1>question is totally... Let me step back.

0:04:56.360 --> 0:05:00.279
<v Speaker 3>So sure, I'd say the idea of, like, using compute

0:05:00.360 --> 0:05:02.599
<v Speaker 3>and artificial intelligence has obviously been around for a long

0:05:02.880 --> 0:05:05.120
<v Speaker 3>long time, and actually the AI industry has been through

0:05:05.120 --> 0:05:08.240
<v Speaker 3>a number of what they call AI winters over the years,

0:05:08.279 --> 0:05:10.760
<v Speaker 3>where people would get really excited about this and then

0:05:10.760 --> 0:05:12.279
<v Speaker 3>they would do work, and then it would just turn

0:05:12.320 --> 0:05:15.640
<v Speaker 3>out it wasn't working, and pretty much it was just

0:05:15.680 --> 0:05:19.839
<v Speaker 3>because the compute capacity and capabilities of the hardware at

0:05:19.880 --> 0:05:21.720
<v Speaker 3>the time wasn't really up to the task,

0:05:21.760 --> 0:05:24.080
<v Speaker 3>and so interest would wane and you'd go through this

0:05:24.160 --> 0:05:27.560
<v Speaker 3>winter period, and a while back, I don't know, ten

0:05:27.720 --> 0:05:29.719
<v Speaker 3>fifteen years ago, whenever it was, it was sort of

0:05:29.760 --> 0:05:35.520
<v Speaker 3>discovered that the types of calculations that are used for

0:05:35.839 --> 0:05:38.280
<v Speaker 3>neural networks and machine learning, it turns out they are

0:05:38.440 --> 0:05:41.080
<v Speaker 3>very similar to the kinds of applications, the kinds of

0:05:41.200 --> 0:05:45.479
<v Speaker 3>mathematics that are used for graphics processing and graphics rendering.

0:05:45.520 --> 0:05:48.960
<v Speaker 3>As it turns out it's primarily matrix multiplication and we'll

0:05:48.960 --> 0:05:51.000
<v Speaker 3>probably get into this on this call a little

0:05:51.040 --> 0:05:53.960
<v Speaker 3>bit in terms of how these machine learning models and

0:05:53.960 --> 0:05:55.680
<v Speaker 3>everything actually work. But at the end of the day,

0:05:55.800 --> 0:05:59.520
<v Speaker 3>really it comes down to like really really large amounts

0:05:59.520 --> 0:06:02.840
<v Speaker 3>of matrix multiplication and parallel operations. And as it turned out,

0:06:03.600 --> 0:06:07.200
<v Speaker 3>the GPU, the graphics processing unit, was quite suitable.

0:06:07.640 --> 0:06:10.400
<v Speaker 1>Before you go on then and maybe we'll get into

0:06:10.440 --> 0:06:13.159
<v Speaker 1>this in hour three of this conversation. No, we're not

0:06:13.160 --> 0:06:15.599
<v Speaker 1>going to go that long, but what is matrix multiplication?

0:06:17.000 --> 0:06:18.599
<v Speaker 3>Yeah. So, I don't know how many of you or

0:06:18.640 --> 0:06:21.880
<v Speaker 3>your listeners here have had linear algebra or anything, but

0:06:22.120 --> 0:06:24.000
<v Speaker 3>a matrix is just like an array of numbers, like

0:06:24.120 --> 0:06:27.279
<v Speaker 3>think about, like, a square array of numbers, okay,

0:06:27.320 --> 0:06:29.800
<v Speaker 3>and matrix multiplication is, I've got two of these arrays and

0:06:29.839 --> 0:06:32.960
<v Speaker 3>I'm multiplying them together, and it's it's not as simple

0:06:33.000 --> 0:06:35.800
<v Speaker 3>as the kind of math or multiplication that maybe you're

0:06:35.960 --> 0:06:39.880
<v Speaker 3>typically used to, but it can be done. And it

0:06:39.960 --> 0:06:42.240
<v Speaker 3>turns out there are some of these characteristics of these

0:06:42.320 --> 0:06:44.520
<v Speaker 3>kinds of matrices. A number of these matrices can be really big,

0:06:44.560 --> 0:06:46.680
<v Speaker 3>and there's like lots and lots of operations that need

0:06:46.760 --> 0:06:49.000
<v Speaker 3>to happen, and this stuff needs to happen like like

0:06:49.080 --> 0:06:52.520
<v Speaker 3>quite rapidly. And again I'm grossly simplifying here for the listeners,

0:06:53.279 --> 0:06:56.360
<v Speaker 3>But when when you're working through these kinds of machine

0:06:56.440 --> 0:06:58.960
<v Speaker 3>learning models, that that's really what you're doing. It's it's

0:06:58.960 --> 0:07:02.000
<v Speaker 3>a bunch of different matrices, a bunch of different arrays

0:07:02.720 --> 0:07:06.080
<v Speaker 3>of numbers that contain all of the different parameters and things.
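
NOTE
Editor's sketch, not part of the spoken audio. A minimal pure-Python illustration of the matrix multiplication described above. The function name matmul and the sample 2x2 values are assumptions chosen for illustration only. Each output cell is computed independently of the others, which is why this kind of work lends itself to parallel hardware.
```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must equal rows of b")
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Each cell is one row of a times one column of b, summed up.
            out[i][j] = sum(a[i][k] * b[k][j] for k in range(inner))
    return out
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))  # [[19, 22], [43, 50]]
```
Two n-by-n arrays need roughly n cubed multiply-adds this way, so the work grows quickly as the arrays get large.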

0:07:06.279 --> 0:07:08.120
<v Speaker 3>But we should probably step up a bit and talk

0:07:08.160 --> 0:07:11.200
<v Speaker 3>about what we actually mean when we talk about machine

0:07:11.240 --> 0:07:14.720
<v Speaker 3>learning and models and all kinds of things. But at

0:07:14.760 --> 0:07:16.440
<v Speaker 3>the end of the day, you have these really large

0:07:16.480 --> 0:07:19.560
<v Speaker 3>arrays of numbers that have to get multiplied together in

0:07:19.600 --> 0:07:21.760
<v Speaker 3>many cases, over and over again, many many times, and

0:07:21.800 --> 0:07:26.000
<v Speaker 3>it turns into a very very large compute problem. And

0:07:26.040 --> 0:07:30.000
<v Speaker 3>it's something that the GPU architecture can actually can do

0:07:30.120 --> 0:07:33.800
<v Speaker 3>really really efficiently, much more efficiently than you could say

0:07:33.840 --> 0:07:37.760
<v Speaker 3>on a traditional CPU. And so, as it turns out,

0:07:37.760 --> 0:07:40.200
<v Speaker 3>the GPU has become a good architecture for this. Now

0:07:40.200 --> 0:07:41.640
<v Speaker 3>what Nvidia has done on top of this, not

0:07:41.640 --> 0:07:44.160
<v Speaker 3>only having the hardware, is they've also built a

0:07:44.240 --> 0:07:48.160
<v Speaker 3>really massive software ecosystem around all of this. They have

0:07:48.360 --> 0:07:51.240
<v Speaker 3>their software, it's called CUDA. Think about it as kind

0:07:51.280 --> 0:07:54.440
<v Speaker 3>of like the software, the programming environment, like

0:07:54.440 --> 0:07:57.440
<v Speaker 3>the parallel programming environment for these GPUs, and they've layered

0:07:57.480 --> 0:08:01.120
<v Speaker 3>on all kinds of other libraries, SDKs, and everything on

0:08:01.440 --> 0:08:05.480
<v Speaker 3>top of that that actually makes this relatively easy to

0:08:05.640 --> 0:08:07.600
<v Speaker 3>use and to deploy and to deliver. And so they've

0:08:07.640 --> 0:08:09.800
<v Speaker 3>built up not just the hardware but the software

0:08:09.800 --> 0:08:12.160
<v Speaker 3>around this, and it's given them a really really sort

0:08:12.160 --> 0:08:15.520
<v Speaker 3>of, like, massive gap versus, like, a lot

0:08:15.520 --> 0:08:17.480
<v Speaker 3>of the other competitors that are now trying to get

0:08:17.480 --> 0:08:19.960
<v Speaker 3>into this market as well. And so, and it's funny,

0:08:20.000 --> 0:08:22.720
<v Speaker 3>if you look at Nvidia as a stock I mean today,

0:08:22.760 --> 0:08:24.320
<v Speaker 3>I mean this morning, it's at about

0:08:24.320 --> 0:08:26.640
<v Speaker 3>two hundred and sixty or two hundred and seventy dollars

0:08:26.680 --> 0:08:29.920
<v Speaker 3>a share. This was a ten to twenty dollars stock forever,

0:08:30.000 --> 0:08:33.319
<v Speaker 3>and they did a four-to-one stock split recently,

0:08:33.400 --> 0:08:35.200
<v Speaker 3>so that'd be more like, you know, like a two

0:08:35.240 --> 0:08:37.880
<v Speaker 3>dollars and fifty cent to five dollars stock on today's

0:08:37.880 --> 0:08:40.560
<v Speaker 3>basis for for years and years and years. And just

0:08:40.600 --> 0:08:44.640
<v Speaker 3>the magnitude of the growth that we've had with these

0:08:44.640 --> 0:08:47.000
<v Speaker 3>guys over over the last like five or ten years,

0:08:47.000 --> 0:08:51.040
<v Speaker 3>particularly around their data center business and artificial intelligence and everything,

0:08:51.240 --> 0:08:54.000
<v Speaker 3>has just been quite remarkable, and so the earnings have

0:08:54.040 --> 0:08:56.959
<v Speaker 3>gone through the roof, and clearly the multiple that you're

0:08:57.000 --> 0:08:59.280
<v Speaker 3>placing on those earnings has gone through the roof, because

0:08:59.440 --> 0:09:01.400
<v Speaker 3>you know, the the view is that the opportunity here

0:09:01.440 --> 0:09:02.960
<v Speaker 3>is massive and that we're early and there's a lot

0:09:02.960 --> 0:09:05.000
<v Speaker 3>of runway ahead of us. And the stock, I mean,

0:09:05.000 --> 0:09:07.000
<v Speaker 3>it's had its ups and downs, but in general it's

0:09:07.000 --> 0:09:07.640
<v Speaker 3>been a home run.

0:09:08.200 --> 0:09:10.240
<v Speaker 2>I definitely want to ask you about where we are

0:09:10.280 --> 0:09:14.800
<v Speaker 2>in the sort of semiconductor stock price cycle. But before

0:09:14.840 --> 0:09:17.560
<v Speaker 2>we get into that, you know, I will also bite

0:09:17.640 --> 0:09:21.240
<v Speaker 2>on the really basic question that you already alluded to,

0:09:21.400 --> 0:09:26.560
<v Speaker 2>but how does machine learning slash AI actually work. You

0:09:26.640 --> 0:09:29.560
<v Speaker 2>mentioned this idea of I guess processing a bunch of

0:09:29.640 --> 0:09:34.199
<v Speaker 2>data in parallel versus I guess old style computing where

0:09:34.240 --> 0:09:36.960
<v Speaker 2>it would be sequential. But like, talk to us about

0:09:37.000 --> 0:09:40.280
<v Speaker 2>what is actually happening here and how does it fit

0:09:40.480 --> 0:09:42.200
<v Speaker 2>into the semiconductor space.

0:09:43.360 --> 0:09:45.120
<v Speaker 3>You bet, you bet. So let me first

0:09:45.160 --> 0:09:47.679
<v Speaker 3>abstract this up and I'll give you a really contrived

0:09:47.720 --> 0:09:50.959
<v Speaker 3>example just sort of simplistically about what's going on, and

0:09:51.000 --> 0:09:52.319
<v Speaker 3>then we can go a little bit more into the

0:09:52.360 --> 0:09:55.199
<v Speaker 3>actual details of what's happening. But let's imagine you want

0:09:55.200 --> 0:09:58.079
<v Speaker 3>to have some kind of a neural net. But the

0:09:58.280 --> 0:10:01.079
<v Speaker 3>machine learning is typically done with something called a neural network,

0:10:01.480 --> 0:10:03.600
<v Speaker 3>and I'll talk about what that is in a moment.

0:10:03.600 --> 0:10:05.680
<v Speaker 3>And let's let's just imagine, for example, you want to

0:10:05.679 --> 0:10:09.720
<v Speaker 3>build an artificial intelligence, a neural network, to recognize

0:10:09.760 --> 0:10:13.040
<v Speaker 3>pictures of cats. Just saying. Okay, let's imagine I've

0:10:13.080 --> 0:10:15.040
<v Speaker 3>got this black box sitting in front of me, and

0:10:15.280 --> 0:10:17.680
<v Speaker 3>it's got a slot on one side where I'm taking

0:10:17.720 --> 0:10:20.800
<v Speaker 3>pictures and I'm feeding them in. It's got a display

0:10:20.880 --> 0:10:22.800
<v Speaker 3>on the other side which tells me, yes, it's a

0:10:22.840 --> 0:10:25.360
<v Speaker 3>cat or no it's not. And on the side of

0:10:25.400 --> 0:10:30.080
<v Speaker 3>the box there are a billion knobs that you can turn, okay,

0:10:30.679 --> 0:10:34.160
<v Speaker 3>and and they'll change various parameters of this model that

0:10:34.280 --> 0:10:36.520
<v Speaker 3>right now are inside the black box. Don't worry about

0:10:36.520 --> 0:10:38.920
<v Speaker 3>what those parameters are, but there's there's knobs that can

0:10:39.000 --> 0:10:41.760
<v Speaker 3>change them, and so effectively what you're doing when you're

0:10:42.480 --> 0:10:43.880
<v Speaker 3>training the thing. And by the way, when you have

0:10:43.920 --> 0:10:45.440
<v Speaker 3>the artificial intelligence, what you have is you have this

0:10:45.480 --> 0:10:48.320
<v Speaker 3>big black box. You need to train it to do

0:10:48.400 --> 0:10:50.600
<v Speaker 3>a specific task. That's what I'm going to talk about

0:10:50.600 --> 0:10:53.760
<v Speaker 3>in a moment. That's called training, and then once it's trained,

0:10:53.800 --> 0:10:56.800
<v Speaker 3>you need to use it for whatever task you've trained it for.

0:10:57.080 --> 0:10:59.280
<v Speaker 3>That task is called inference. So you got to do

0:10:59.520 --> 0:11:02.040
<v Speaker 3>the training and the inference. So the training, here's what we have.

0:11:02.280 --> 0:11:04.160
<v Speaker 3>I got my box with a slot and the display

0:11:04.160 --> 0:11:06.920
<v Speaker 3>and a billion knobs. Okay, So what I do for

0:11:06.960 --> 0:11:09.360
<v Speaker 3>the training process effectively is I take a picture and

0:11:10.440 --> 0:11:12.400
<v Speaker 3>a known picture okay, so I know if it's a

0:11:12.440 --> 0:11:15.599
<v Speaker 3>cat or not. I feed it into the box and

0:11:15.720 --> 0:11:18.400
<v Speaker 3>I look at the display and it tells me yes

0:11:18.440 --> 0:11:20.240
<v Speaker 3>it's a cat or no it's not, and it probably gets

0:11:20.280 --> 0:11:21.640
<v Speaker 3>it wrong. And so then what I do is I

0:11:21.679 --> 0:11:25.240
<v Speaker 3>turn some of the knobs and I feed another picture in,

0:11:26.160 --> 0:11:27.920
<v Speaker 3>and then I turned some of the knobs, and I'm

0:11:27.920 --> 0:11:31.440
<v Speaker 3>basically tuning all of the parameters and sort of measuring

0:11:31.559 --> 0:11:35.280
<v Speaker 3>how accurate is this network at doing this task of recognizing

0:11:35.360 --> 0:11:36.679
<v Speaker 3>is this a picture of a cat or is it not?

0:11:37.400 --> 0:11:42.200
<v Speaker 3>And I keep feeding pictures in, known pictures, a known data set,

0:11:42.679 --> 0:11:45.080
<v Speaker 3>and I keep playing with all the knobs until the

0:11:45.120 --> 0:11:47.040
<v Speaker 3>accuracy of the thing is wherever I want it to be.

0:11:47.120 --> 0:11:50.480
<v Speaker 3>So yes, it's decided that that now it's very good

0:11:50.520 --> 0:11:52.840
<v Speaker 3>at recognizing is this a picture of a cat or is

0:11:52.840 --> 0:11:55.600
<v Speaker 3>it not. At that point, my model, my box is trained.
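
NOTE
Editor's sketch, not part of the spoken audio. A toy version of the knob-turning loop described above, assuming a simple perceptron update rule as a stand-in (real networks use gradient-based training on far more parameters). The three features, the six labeled examples, and the names predict, accuracy, train, knobs and offset are all hypothetical.
```python
def predict(knobs, offset, picture):
    """Return 1 ("cat") if the weighted sum of features is positive, else 0."""
    score = sum(k * x for k, x in zip(knobs, picture)) + offset
    return 1 if score > 0 else 0
def accuracy(knobs, offset, labeled):
    """Fraction of known pictures the box currently gets right."""
    hits = sum(predict(knobs, offset, pic) == label for pic, label in labeled)
    return hits / len(labeled)
def train(labeled, max_passes=100):
    """Turn the knobs a little after every wrong answer (perceptron rule)."""
    knobs, offset = [0] * len(labeled[0][0]), 0
    for _ in range(max_passes):
        if accuracy(knobs, offset, labeled) == 1.0:
            break  # accurate enough: lock the knobs in place
        for picture, label in labeled:
            error = label - predict(knobs, offset, picture)
            knobs = [k + error * x for k, x in zip(knobs, picture)]
            offset += error
    return knobs, offset
# Hypothetical "pictures": (has_whiskers, pointy_ears, barks); label 1 means cat.
labeled = [((1, 1, 0), 1), ((1, 0, 0), 1), ((0, 1, 1), 0),
           ((1, 1, 1), 0), ((0, 0, 1), 0), ((0, 0, 0), 0)]
knobs, offset = train(labeled)
print(accuracy(knobs, offset, labeled))
```
Once train returns, the knobs stay fixed and predict alone is the inference step.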

0:11:56.240 --> 0:11:58.280
<v Speaker 3>I now lock all of those knobs in place, I

0:11:58.280 --> 0:12:00.720
<v Speaker 3>don't move them anymore, and I use it now I

0:12:00.720 --> 0:12:02.839
<v Speaker 3>can just feed in pictures and it'll tell me yes,

0:12:02.880 --> 0:12:05.360
<v Speaker 3>it's a cat, or no, it's not. And so the process

0:12:05.400 --> 0:12:07.920
<v Speaker 3>of training this model, that's really what it's about.

0:12:07.920 --> 0:12:11.079
<v Speaker 3>It's about varying all of the parameters. And by the way,

0:12:11.120 --> 0:12:14.480
<v Speaker 3>these models can have billions or hundreds of billions or

0:12:14.480 --> 0:12:17.679
<v Speaker 3>even more parameters that can be changed. And

0:12:17.720 --> 0:12:20.920
<v Speaker 3>that's the process of training. You're basically trying to optimize

0:12:20.960 --> 0:12:24.240
<v Speaker 3>this this sort of situation. I'm changing the parameters a

0:12:24.280 --> 0:12:26.960
<v Speaker 3>little bit at a time such that I can optimize

0:12:27.000 --> 0:12:29.040
<v Speaker 3>the response of this thing such that I can

0:12:29.080 --> 0:12:33.280
<v Speaker 3>get the performance of it, the accuracy of the network

0:12:33.320 --> 0:12:36.040
<v Speaker 3>to be high. So that's the training process, and it

0:12:36.120 --> 0:12:39.040
<v Speaker 3>is very very compute intensive, because you can imagine, if

0:12:39.040 --> 0:12:41.480
<v Speaker 3>I've got a billion different knobs on turning, I'm trying

0:12:41.520 --> 0:12:43.640
<v Speaker 3>to optimize the output, that takes a lot of compute.

0:12:43.960 --> 0:12:47.280
<v Speaker 3>The inference process, once all that's done, is much less compute

0:12:47.280 --> 0:12:50.640
<v Speaker 3>intensive because I'm not changing anything. I'm just applying the

0:12:50.679 --> 0:12:53.559
<v Speaker 3>network as it is to whatever data that I'm feeding

0:12:53.559 --> 0:12:55.480
<v Speaker 3>in at that point. I'm not changing anything. But I

0:12:55.559 --> 0:12:57.240
<v Speaker 3>may be doing a lot more of it. That's the difference with

0:12:57.320 --> 0:12:58.679
<v Speaker 3>the inference. I may be using it all the time,

0:12:58.720 --> 0:13:01.280
<v Speaker 3>whereas once I've trained the model, I've trained it. So it's

0:13:01.280 --> 0:13:04.000
<v Speaker 3>more like a one and done versus like a continual

0:13:04.080 --> 0:13:04.679
<v Speaker 3>use sort of thing.

0:13:05.160 --> 0:13:07.160
<v Speaker 1>Since, as you said, we're getting into sort of the

0:13:07.240 --> 0:13:12.199
<v Speaker 1>economics of training versus inference. A is there sort of

0:13:12.240 --> 0:13:14.440
<v Speaker 1>any way to get a sense of Like let's say

0:13:14.679 --> 0:13:18.000
<v Speaker 1>Tracy and I start Odd Lots GPT. It's a competitor

0:13:18.080 --> 0:13:21.000
<v Speaker 1>to ChatGPT, a competitor to OpenAI. Like, what are

0:13:21.040 --> 0:13:23.199
<v Speaker 1>we thinking of in terms of just that scale? How

0:13:23.280 --> 0:13:27.400
<v Speaker 1>much we're spending on compute for the training part? Then

0:13:27.440 --> 0:13:30.520
<v Speaker 1>how much our recurring costs in terms of inference are?

0:13:30.920 --> 0:13:33.280
<v Speaker 1>And then I'm also just curious, like also, like I

0:13:33.640 --> 0:13:36.280
<v Speaker 1>know you said the inference is much cheaper, but how

0:13:36.360 --> 0:13:41.120
<v Speaker 1>much cheaper is it versus, say, asking Google a question? How

0:13:41.200 --> 0:13:43.960
<v Speaker 1>much more expensive is it? How much more expensive is

0:13:44.000 --> 0:13:47.320
<v Speaker 1>a ChatGPT query or an Odd Lots GPT query

0:13:47.520 --> 0:13:49.520
<v Speaker 1>versus just a normal Google search?

0:13:50.000 --> 0:13:52.080
<v Speaker 3>Yeah, and by the way, when I say cheaper,

0:13:52.080 --> 0:13:54.800
<v Speaker 3>it's like for any given single use, right? Again,

0:13:54.840 --> 0:13:56.480
<v Speaker 3>if I've got, like, one

0:13:56.520 --> 0:13:58.719
<v Speaker 3>hundred billion different inference activities, maybe it's not.

0:13:58.880 --> 0:13:59.840
<v Speaker 1>It's still expensive.
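
NOTE
Editor's sketch, not part of the spoken audio. Simple arithmetic for the point above that inference is cheap per use but adds up with volume. TRAINING_COST and COST_PER_QUERY are made-up, unit-less numbers, not real figures for any model or company, and the function names are hypothetical.
```python
TRAINING_COST = 1_000_000_000   # hypothetical units for one full training run
COST_PER_QUERY = 1              # hypothetical units for a single inference
def inference_total(queries, cost_per_query=COST_PER_QUERY):
    """Cumulative inference cost: small per use, but it scales with usage."""
    return queries * cost_per_query
def exceeds_training(queries):
    """True once cumulative inference spend passes the one-off training spend."""
    return inference_total(queries) > TRAINING_COST
for queries in (1_000, 1_000_000_000, 100_000_000_000):
    print(queries, inference_total(queries), exceeds_training(queries))
```
With these assumed numbers, a thousand queries cost a tiny fraction of training, while a hundred billion queries cost a hundred times more than training.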

0:14:00.360 --> 0:14:02.400
<v Speaker 3>Yeah, But I first want to talk about it, just

0:14:02.400 --> 0:14:04.160
<v Speaker 3>just really quickly about like so that this is my

0:14:04.200 --> 0:14:07.760
<v Speaker 3>big abstract, contrived example about what's going on. If if

0:14:07.800 --> 0:14:10.000
<v Speaker 3>I go just a little bit deeper about what what

0:14:10.040 --> 0:14:11.880
<v Speaker 3>this thing is, like, let's talk just briefly about a

0:14:11.920 --> 0:14:13.959
<v Speaker 3>neural network, and then I will get to your question, but

0:14:14.559 --> 0:14:17.120
<v Speaker 3>it kind of influences it. So think what is a

0:14:17.160 --> 0:14:19.640
<v Speaker 3>neural network? If I was to draw, like, a representation of

0:14:19.640 --> 0:14:21.160
<v Speaker 3>a neural network for you, what I would do is

0:14:21.200 --> 0:14:24.000
<v Speaker 3>I have a bunch of circles. Each of the circles

0:14:24.000 --> 0:14:25.760
<v Speaker 3>would be a neuron, and I wish I was there.

0:14:25.760 --> 0:14:28.200
<v Speaker 3>I could draw a picture for you. But imagine, like...

0:14:27.960 --> 0:14:30.680
<v Speaker 1>Send a picture after you're done, send a picture and we'll

0:14:30.720 --> 0:14:31.840
<v Speaker 1>run it with the episode.

0:14:31.840 --> 0:14:34.200
<v Speaker 3>We'll run it with the... Okay, okay, I can, I

0:14:34.200 --> 0:14:34.480
<v Speaker 3>can do that.

0:14:34.520 --> 0:14:38.760
<v Speaker 1>Your hand-drawn explanation of these arrays.

0:14:39.400 --> 0:14:42.680
<v Speaker 3>These arrays, fine. But anyways, imagine like

0:14:42.720 --> 0:14:44.720
<v Speaker 3>I've got like a group of circles. I've got like

0:14:44.760 --> 0:14:47.720
<v Speaker 3>a column, you know, in column one with like three circles,

0:14:47.720 --> 0:14:50.160
<v Speaker 3>and then column two, I've got i don't know, three

0:14:50.200 --> 0:14:52.520
<v Speaker 3>or four circles, and column three, I've got some circles.

0:14:52.760 --> 0:14:55.160
<v Speaker 3>These are my neurons. And imagine I've got arrows that

0:14:55.200 --> 0:14:58.960
<v Speaker 3>are connecting each circle to the circles in one row,

0:14:59.000 --> 0:15:00.720
<v Speaker 3>to all of the circles in the next row. Those

0:15:00.760 --> 0:15:03.280
<v Speaker 3>are my connections between my neurons. So you can see

0:15:03.280 --> 0:15:05.880
<v Speaker 3>it looks like kind of a net or a network. Okay.

0:15:06.520 --> 0:15:09.960
<v Speaker 3>And so within each circle, I've got something that's

0:15:10.000 --> 0:15:12.480
<v Speaker 3>called an activation function. So what each circle does is it

0:15:12.520 --> 0:15:16.120
<v Speaker 3>takes an input, the arrow that's coming into it, and

0:15:16.160 --> 0:15:18.720
<v Speaker 3>it has to decide based on those inputs, do I

0:15:18.800 --> 0:15:22.520
<v Speaker 3>send an output out the other side or not? Right,

0:15:22.840 --> 0:15:25.960
<v Speaker 3>So there's some certain threshold. If the inputs reach some

0:15:26.040 --> 0:15:28.200
<v Speaker 3>amount of threshold, the neuron will fire, just just like

0:15:28.240 --> 0:15:31.760
<v Speaker 3>the neuron in your brain. Okay. Each each neuron can

0:15:31.800 --> 0:15:33.800
<v Speaker 3>have more than one input coming in from from more

0:15:33.840 --> 0:15:36.480
<v Speaker 3>than one neuron in the previous... These are called layers,

0:15:36.480 --> 0:15:38.840
<v Speaker 3>by the way, these rows of circles. It can have more

0:15:38.840 --> 0:15:41.360
<v Speaker 3>than one input from the different neurons in the previous layer,

0:15:41.640 --> 0:15:44.600
<v Speaker 3>and the neuron can weight those different inputs

0:15:44.640 --> 0:15:46.720
<v Speaker 3>differently. So it can say, you know, from

0:15:46.920 --> 0:15:48.600
<v Speaker 3>this one neuron, I'm going to give that a fifty

0:15:48.640 --> 0:15:50.680
<v Speaker 3>percent weight, and from the other neuron, I'll only weight it at

0:15:50.680 --> 0:15:52.640
<v Speaker 3>twenty percent. I'm not going to take the full signal.

0:15:53.040 --> 0:15:57.400
<v Speaker 3>So those are called the weights of the network. And

0:15:57.440 --> 0:16:01.160
<v Speaker 3>so each neuron has inputs coming in and outputs going out,

0:16:01.200 --> 0:16:02.760
<v Speaker 3>and each of those inputs and outputs will have a

0:16:02.760 --> 0:16:04.960
<v Speaker 3>weight associated with it. So those are, when I

0:16:05.000 --> 0:16:08.320
<v Speaker 3>talk about those knobs, those parameters, yeah, those weights are

0:16:08.400 --> 0:16:11.800
<v Speaker 3>one set of parameters. And then within each neuron

0:16:12.000 --> 0:16:15.600
<v Speaker 3>there's basically a certain threshold. With all those

0:16:15.640 --> 0:16:17.760
<v Speaker 3>signals coming in, when you add them up,

0:16:17.760 --> 0:16:20.560
<v Speaker 3>if they reach a certain threshold, then the neuron fires. Okay,

0:16:20.720 --> 0:16:23.080
<v Speaker 3>So that threshold is called the bias, and you

0:16:23.120 --> 0:16:25.520
<v Speaker 3>can tune that. Like I can have a really sensitive

0:16:25.560 --> 0:16:28.080
<v Speaker 3>neuron where I don't need a

0:16:28.080 --> 0:16:29.920
<v Speaker 3>lot of signal coming in to make it fire. I

0:16:29.920 --> 0:16:32.200
<v Speaker 3>can have a neuron that's less sensitive. I need a

0:16:32.200 --> 0:16:35.560
<v Speaker 3>lot of signal coming in for it to fire. That's called the bias.

0:16:35.600 --> 0:16:37.520
<v Speaker 3>That's also a parameter. So those are the

0:16:37.560 --> 0:16:41.440
<v Speaker 3>parameters that you're setting. The structure of the network itself,

0:16:41.480 --> 0:16:43.640
<v Speaker 3>the number of neurons and the number of layers and

0:16:43.640 --> 0:16:46.640
<v Speaker 3>everything, that's sort of set, and then you're trying

0:16:46.680 --> 0:16:50.160
<v Speaker 3>to determine these weights and biases. And again, just

0:16:50.200 --> 0:16:53.160
<v Speaker 3>to level set, ChatGPT, which you've been getting

0:16:53.160 --> 0:16:56.360
<v Speaker 3>excited about, has one hundred and seventy-five billion separate

0:16:56.400 --> 0:17:00.400
<v Speaker 3>parameters that get set during the training process. Okay,

0:17:00.640 --> 0:17:02.640
<v Speaker 3>So that's kind of what's going on.

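NOTE
A minimal sketch of the single neuron described above, added as an illustration rather than anything the speakers showed. The function name neuron_fires and the threshold values 0.6 and 0.9 are assumptions; the fifty and twenty percent weights come from the conversation. Real networks typically use smoother activation functions than this hard threshold.
```python
def neuron_fires(inputs, weights, threshold):
    """Return True when the weighted sum of the inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return total >= threshold
# Two incoming signals, weighted at fifty and twenty percent as in the example.
signals = [1.0, 1.0]
weights = [0.5, 0.2]
print(neuron_fires(signals, weights, threshold=0.6))  # a more sensitive neuron
print(neuron_fires(signals, weights, threshold=0.9))  # a less sensitive neuron
```
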
0:17:19.440 --> 0:17:21.640
<v Speaker 2>Before you talk about the economics. Can I just ask

0:17:21.800 --> 0:17:24.920
<v Speaker 2>so one of the things about the technology is it's

0:17:24.960 --> 0:17:28.360
<v Speaker 2>sort of it's supposed to be iterative, right, like it's

0:17:28.480 --> 0:17:31.440
<v Speaker 2>learning as it goes along. Can you talk just briefly

0:17:31.480 --> 0:17:36.760
<v Speaker 2>maybe about how it's incorporating like new inputs as it develops.

0:17:37.880 --> 0:17:40.639
<v Speaker 3>Yeah. So when you're training, let's talk

0:17:40.640 --> 0:17:43.760
<v Speaker 3>about training now. So when you train the network, it

0:17:43.880 --> 0:17:47.000
<v Speaker 3>happens on a static data set. Okay, so you have

0:17:47.080 --> 0:17:49.359
<v Speaker 3>to start with a data set, right, and in terms

0:17:49.359 --> 0:17:53.159
<v Speaker 3>of ChatGPT, that is, you know, it has a

0:17:53.400 --> 0:17:56.000
<v Speaker 3>large corpus of data that it was trained on. It

0:17:56.040 --> 0:17:58.399
<v Speaker 3>was there's a lot of data from the Internet and

0:17:58.400 --> 0:17:59.680
<v Speaker 3>from other sources.

0:17:59.359 --> 0:18:02.439
<v Speaker 1>Right, basically trained on, like, all of the Internet,

0:18:03.200 --> 0:18:06.920
<v Speaker 1>but also a lot of Reddit. So it's like we've right,

0:18:07.080 --> 0:18:09.120
<v Speaker 1>like is it like we've trained just like the greatest

0:18:09.119 --> 0:18:11.120
<v Speaker 1>brain of all time, and it's, like, Reddit-pilled.

0:18:11.800 --> 0:18:13.880
<v Speaker 2>Now it talks like a seventeen year old boy.

0:18:14.400 --> 0:18:16.440
<v Speaker 3>So there's a lot of data, and so yes,

0:18:16.560 --> 0:18:18.639
<v Speaker 3>sort of, how does that data get, you know,

0:18:19.560 --> 0:18:22.760
<v Speaker 3>incorporated? I don't want to get too...

0:18:22.760 --> 0:18:24.480
<v Speaker 3>I don't want to get too complicated.

0:18:24.760 --> 0:18:26.600
<v Speaker 3>Let me talk about how standard training works, and

0:18:26.600 --> 0:18:28.400
<v Speaker 3>then we can talk about ChatGPT because that uses

0:18:28.440 --> 0:18:30.760
<v Speaker 3>a different kind of model. It's called a transformer model.

0:18:30.840 --> 0:18:33.639
<v Speaker 3>But anyway, when I'm training this,

0:18:33.680 --> 0:18:35.800
<v Speaker 3>what happens is I feed this stuff... there's

0:18:35.840 --> 0:18:38.600
<v Speaker 3>a process, it's called backpropagation. Basically

0:18:38.680 --> 0:18:42.879
<v Speaker 3>what you do is you sort of feed this stuff

0:18:42.920 --> 0:18:46.679
<v Speaker 3>through the network itself, and then you

0:18:46.720 --> 0:18:48.680
<v Speaker 3>work it backwards, and basically what you're doing is

0:18:48.720 --> 0:18:51.480
<v Speaker 3>you're measuring the output against a known response. I want

0:18:51.480 --> 0:18:54.480
<v Speaker 3>to sort of, you know, that's my cat picture.

0:18:54.560 --> 0:18:56.080
<v Speaker 3>Is it a cat or is it not a cat, right,

0:18:56.119 --> 0:18:58.160
<v Speaker 3>I'm trying to minimize the difference between them, because I want

0:18:58.160 --> 0:19:00.080
<v Speaker 3>to be accurate. Right. So what you sort of

0:19:00.160 --> 0:19:03.280
<v Speaker 3>do is you roll a certain step through the network, right,

0:19:03.320 --> 0:19:06.040
<v Speaker 3>You measure the output against the known, what

0:19:06.200 --> 0:19:08.400
<v Speaker 3>it should be. And then there's a process that's called

0:19:08.480 --> 0:19:11.200
<v Speaker 3>backpropagation, where what you're doing is you actually

0:19:11.200 --> 0:19:14.160
<v Speaker 3>calculate what's called the gradients of all of these things.

0:19:14.160 --> 0:19:16.119
<v Speaker 3>You're basically looking at, sort of,

0:19:16.119 --> 0:19:19.720
<v Speaker 3>like the rate of change of these different parameters,

0:19:19.720 --> 0:19:23.000
<v Speaker 3>and you sort of work the network backwards, and that

0:19:23.160 --> 0:19:25.400
<v Speaker 3>gradient that you're calculating kind of tells you how much

0:19:25.440 --> 0:19:28.560
<v Speaker 3>to adjust each parameter. So you work it back and

0:19:28.600 --> 0:19:30.280
<v Speaker 3>then you work it forward again, and then you work

0:19:30.280 --> 0:19:31.879
<v Speaker 3>it backward, and then you work it forward and you

0:19:31.920 --> 0:19:35.720
<v Speaker 3>work it backward, and then you do that until you've

0:19:35.760 --> 0:19:38.800
<v Speaker 3>converged, like, the network itself is

0:19:39.000 --> 0:19:41.359
<v Speaker 3>accurate to wherever you want it to

0:19:41.400 --> 0:19:45.320
<v Speaker 3>be accurate. So that's, again, I'm

0:19:45.359 --> 0:19:47.760
<v Speaker 3>grossly simplifying here. I'm trying to keep this as high

0:19:47.840 --> 0:19:50.720
<v Speaker 3>level as possible, but that's kind of what you're doing. And

0:19:50.800 --> 0:19:52.320
<v Speaker 3>just in terms of the amount of compute to, sort

0:19:52.359 --> 0:19:55.720
<v Speaker 3>of, train ChatGPT, and ChatGPT we can do.

0:19:55.760 --> 0:19:57.919
<v Speaker 3>They've actually released all the details of the network, like

0:19:57.960 --> 0:20:01.119
<v Speaker 3>how many layers and what's the dimension, the parameters, all

0:20:01.119 --> 0:20:02.920
<v Speaker 3>this stuff, so we can do this math. It turns

0:20:02.960 --> 0:20:05.000
<v Speaker 3>out to take about three times ten to the twenty

0:20:05.040 --> 0:20:07.800
<v Speaker 3>third operations to train it. And so that's

0:20:07.880 --> 0:20:13.080
<v Speaker 3>three hundred sextillion operations it took to train ChatGPT.

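NOTE
The forward-and-backward loop described above can be sketched at toy scale, under loud assumptions: this is plain gradient descent on one linear neuron, not a transformer, so the backward step collapses to a one-line gradient. The function name train, the learning rate, the step count and the three data points are all hypothetical choices made for illustration.
```python
def train(samples, steps=500, lr=0.1):
    """Fit output = w * x + b by repeated forward and backward passes."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, target in samples:
            output = w * x + b         # forward pass through the toy network
            error = output - target    # compare against the known answer
            grad_w += 2 * error * x    # gradient of the squared error w.r.t. w
            grad_b += 2 * error        # gradient of the squared error w.r.t. b
        w -= lr * grad_w / len(samples)  # the gradient says how much to adjust
        b -= lr * grad_b / len(samples)
    return w, b
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # points generated from y = 2x + 1
w, b = train(data)
print(round(w, 2), round(b, 2))
```
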
0:20:14.080 --> 0:20:16.680
<v Speaker 3>Now in terms of how much it costs, so ChatGPT

0:20:16.880 --> 0:20:19.320
<v Speaker 3>was, they kind of said this, it was trained

0:20:19.320 --> 0:20:23.040
<v Speaker 3>on ten thousand Nvidia, what they call the V

0:20:23.119 --> 0:20:25.000
<v Speaker 3>one hundred. That's the Volta chip. That's a chip

0:20:25.040 --> 0:20:27.240
<v Speaker 3>that's several years old for Nvidia. But it was

0:20:27.280 --> 0:20:29.760
<v Speaker 3>trained on supposedly about ten thousand of these. And we

0:20:29.800 --> 0:20:31.760
<v Speaker 3>did some of this math ourselves. I was coming out

0:20:31.760 --> 0:20:33.840
<v Speaker 3>more like three or four thousand, but there's a ton

0:20:33.840 --> 0:20:35.560
<v Speaker 3>of other assumptions you have to make here. Ten

0:20:35.560 --> 0:20:37.760
<v Speaker 3>thousand seems to be the right order of magnitude for

0:20:37.880 --> 0:20:41.000
<v Speaker 3>that part. That part at the time cost about, you know,

0:20:41.080 --> 0:20:43.720
<v Speaker 3>I don't know, eight thousand bucks. And so the number

0:20:43.760 --> 0:20:45.280
<v Speaker 3>that was kind of tossed out was something like eighty

0:20:45.320 --> 0:20:48.600
<v Speaker 3>million dollars to train ChatGPT one time.

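NOTE
The back-of-the-envelope figures quoted in this answer can be checked with a few lines of arithmetic. This is only a sketch that multiplies the numbers as spoken (ten thousand chips, roughly eight thousand dollars each, three times ten to the twenty-third operations); the variable names are assumptions, and it counts chip purchase price only, not power, networking or staff.
```python
gpus = 10_000           # V100 count quoted in the conversation
price_per_gpu = 8_000   # rough dollars per chip, as quoted
hardware_cost = gpus * price_per_gpu
training_ops = 3e23     # operations to train, as quoted
print(f"${hardware_cost:,} of chips")
print(f"{training_ops / 1e21:.0f} sextillion operations")
```
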
0:20:49.160 --> 0:20:51.480
<v Speaker 1>I think on some of the... it doesn't seem like

0:20:51.480 --> 0:20:54.080
<v Speaker 1>that much to me. Well, so this is, like,

0:20:54.080 --> 0:20:55.119
<v Speaker 1>I get it, but like there are a lot of

0:20:55.160 --> 0:20:57.200
<v Speaker 1>companies that could spend that, that have eighty million.

0:20:57.400 --> 0:20:59.679
<v Speaker 3>I actually agree with it. We're jumping ahead. But my

0:21:00.040 --> 0:21:02.440
<v Speaker 3>take is that for for large language models, and we

0:21:02.480 --> 0:21:05.439
<v Speaker 3>can talk about these different things, but for large language

0:21:05.440 --> 0:21:08.320
<v Speaker 3>models like ChatGPT, I actually think inference is a bigger opportunity,

0:21:08.680 --> 0:21:10.119
<v Speaker 3>and you're kind of getting to the heart of it.

0:21:10.119 --> 0:21:13.200
<v Speaker 3>It's because inference scales directly the more queries I run.

0:21:14.960 --> 0:21:17.159
<v Speaker 1>Train once and that's done, and that's eighty million, or

0:21:17.160 --> 0:21:17.600
<v Speaker 1>even if.

0:21:17.480 --> 0:21:20.520
<v Speaker 3>You're training more than once and again to your question, Tracy,

0:21:20.600 --> 0:21:22.440
<v Speaker 3>like you can add to the data set

0:21:22.480 --> 0:21:24.959
<v Speaker 3>and retrain it. But if I've already got the infra...

0:21:25.080 --> 0:21:28.320
<v Speaker 3>let's say I'm training it every two weeks, Okay, yeah,

0:21:28.400 --> 0:21:30.440
<v Speaker 3>that'd be training it like twenty four to twenty five

0:21:30.480 --> 0:21:32.840
<v Speaker 3>times a year. But I've got the infrastructure that

0:21:32.920 --> 0:21:35.919
<v Speaker 3>is in place already right to do that, and so

0:21:36.440 --> 0:21:41.000
<v Speaker 3>the training TAM will be more around how many different

0:21:41.400 --> 0:21:44.680
<v Speaker 3>entities actually develop these models and how many models each

0:21:44.760 --> 0:21:47.480
<v Speaker 3>do they develop and how often do they train those models,

0:21:47.520 --> 0:21:49.280
<v Speaker 3>and importantly how big do the models get, Because this

0:21:49.359 --> 0:21:52.040
<v Speaker 3>is one of the things. ChatGPT is big,

0:21:52.080 --> 0:21:54.920
<v Speaker 3>but GPT-4, which they've released, that is even bigger.

0:21:54.920 --> 0:21:57.679
<v Speaker 3>They haven't talked about specs, but I wouldn't

0:21:57.680 --> 0:22:00.240
<v Speaker 3>be surprised. GPT-4 is rumored to have over a trillion

0:22:00.280 --> 0:22:03.040
<v Speaker 3>parameters, like, it very well might. And yeah, we're

0:22:03.160 --> 0:22:05.199
<v Speaker 3>very early into this, like, these models are going

0:22:05.240 --> 0:22:07.080
<v Speaker 3>to keep getting bigger and bigger and bigger. And so

0:22:07.119 --> 0:22:10.200
<v Speaker 3>that's how I think the training market, the training TAM,

0:22:10.320 --> 0:22:12.760
<v Speaker 3>will be growing. It's a function of the

0:22:12.840 --> 0:22:15.199
<v Speaker 3>number of trainings of all these models we're doing

0:22:15.240 --> 0:22:16.760
<v Speaker 3>every year, and the size of these models, and the

0:22:16.760 --> 0:22:17.520
<v Speaker 3>models will get big.

0:22:18.280 --> 0:22:20.480
<v Speaker 1>So let's get into it. But in your view, the big

0:22:20.560 --> 0:22:22.439
<v Speaker 1>money is going to be made on the inference, So

0:22:22.560 --> 0:22:23.320
<v Speaker 1>let's talk about it.

0:22:23.440 --> 0:22:24.200
<v Speaker 3>I think.

0:22:24.400 --> 0:22:28.720
<v Speaker 1>So let's talk about what happens then, and your

0:22:28.760 --> 0:22:31.320
<v Speaker 1>sort of sense of the size. I don't know. Yeah,

0:22:31.359 --> 0:22:33.840
<v Speaker 1>just talk to us about the inference part and the economics.

0:22:34.200 --> 0:22:37.280
<v Speaker 3>You bet. ChatGPT and these large language models, it's

0:22:37.320 --> 0:22:39.680
<v Speaker 3>a new type of model, it's called a transformer model,

0:22:39.680 --> 0:22:42.919
<v Speaker 3>and there's a bunch of compute steps that have to happen.

0:22:43.600 --> 0:22:45.760
<v Speaker 3>There's also a step in there that helps it map

0:22:45.800 --> 0:22:49.320
<v Speaker 3>the relation, capture the relationship between... you know, by

0:22:49.320 --> 0:22:51.320
<v Speaker 3>the way, if you've ever used ChatGPT, you know,

0:22:51.320 --> 0:22:54.560
<v Speaker 3>you type in like a query into a box and

0:22:54.560 --> 0:22:57.480
<v Speaker 3>it returns a response. So that query is

0:22:57.480 --> 0:23:00.199
<v Speaker 3>broken into what are called tokens. Basically, the way to

0:23:00.240 --> 0:23:03.080
<v Speaker 3>think about a token is kind of like a word

0:23:03.160 --> 0:23:05.760
<v Speaker 3>or a group of words sort of. But the transformer

0:23:05.800 --> 0:23:08.880
<v Speaker 3>model has something, it's called a self-attention mechanism,

0:23:09.359 --> 0:23:11.879
<v Speaker 3>and what that does is it captures the relationship between

0:23:11.880 --> 0:23:14.600
<v Speaker 3>those different tokens in the input sequence based on the

0:23:14.640 --> 0:23:16.560
<v Speaker 3>training data that it has. And that's how it knows

0:23:16.680 --> 0:23:20.320
<v Speaker 3>what it's really doing. It's predictive text. It knows based

0:23:20.359 --> 0:23:22.320
<v Speaker 3>on this query, I'm going to start the response with

0:23:22.400 --> 0:23:25.240
<v Speaker 3>this word, and based on this word and this query

0:23:25.280 --> 0:23:27.959
<v Speaker 3>and my data set, I know these other words typically follow,

0:23:28.440 --> 0:23:31.679
<v Speaker 3>and it kind of constructs the response from that. And

0:23:31.720 --> 0:23:35.760
<v Speaker 3>so our math suggests that for like a typical query

0:23:35.840 --> 0:23:38.280
<v Speaker 3>response, call it, like, you know, five hundred tokens or maybe

0:23:38.320 --> 0:23:42.800
<v Speaker 3>two thousand words, it was something like four hundred quadrillion

0:23:43.000 --> 0:23:46.679
<v Speaker 3>operations needed to accomplish something like that. And so you

0:23:46.720 --> 0:23:49.760
<v Speaker 3>can size this up because I know, for like an

0:23:49.840 --> 0:23:52.080
<v Speaker 3>Nvidia GPU, and you can do it for different GPUs.

0:23:52.119 --> 0:23:55.520
<v Speaker 3>I know how many operations per second each GPU can run,

0:23:57.000 --> 0:23:59.879
<v Speaker 3>and I know how much these GPUs, ballpark, kind of cost.

0:24:00.040 --> 0:24:02.080
<v Speaker 3>And so then, you know, you've got to assume, like, well, okay,

0:24:02.080 --> 0:24:03.440
<v Speaker 3>how many queries per day are you going to do?

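NOTE
The sizing approach just described (operations per query, operations per second per GPU, queries per day) reduces to a short calculation. This is a sketch only: the function name gpus_needed is hypothetical, every number passed in below is a placeholder chosen to show the arithmetic rather than a figure from the conversation, and it assumes each GPU runs at full utilization around the clock.
```python
import math
def gpus_needed(ops_per_query, queries_per_day, gpu_ops_per_second):
    """GPUs required to serve a daily query load at full utilization."""
    ops_per_day = ops_per_query * queries_per_day
    ops_per_gpu_per_day = gpu_ops_per_second * 86_400  # seconds in a day
    return math.ceil(ops_per_day / ops_per_gpu_per_day)
# Placeholder inputs, not real hardware or traffic figures.
count = gpus_needed(ops_per_query=1e12, queries_per_day=1e6, gpu_ops_per_second=1e12)
print(count)
```
Multiplying the resulting count by a per-GPU price gives a rough hardware figure, which is why the total depends so heavily on the assumed query volume.
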
0:24:03.480 --> 0:24:06.200
<v Speaker 3>And you can come up with a number, and I mean, frankly,

0:24:06.200 --> 0:24:07.720
<v Speaker 3>the number can be as big as you want. It

0:24:07.760 --> 0:24:10.160
<v Speaker 3>depends on how many queries. But I think a TAM,

0:24:10.200 --> 0:24:12.200
<v Speaker 3>you know, at least in the multiple tens of billions

0:24:12.200 --> 0:24:16.080
<v Speaker 3>of dollars is not unreasonable, if not more, and just

0:24:16.080 --> 0:24:18.120
<v Speaker 3>to level set, I mean, I guess to your Google question,

0:24:18.160 --> 0:24:20.239
<v Speaker 3>Google does about ten billion searches a day, give

0:24:20.320 --> 0:24:23.000
<v Speaker 3>or take. I think a lot of people have been

0:24:23.040 --> 0:24:25.719
<v Speaker 3>looking at that level as sort of like, you know,

0:24:25.840 --> 0:24:28.000
<v Speaker 3>like the end-all, be-all for where this could go.

0:24:28.720 --> 0:24:32.280
<v Speaker 3>I'll be honest, like, I understand why people are, especially

0:24:32.280 --> 0:24:34.760
<v Speaker 3>the Internet investors, are concerned that large language models and

0:24:34.800 --> 0:24:38.080
<v Speaker 3>things like ChatGPT can start to disrupt search. I'm

0:24:38.119 --> 0:24:40.680
<v Speaker 3>not exactly sure that search is the right proxy, per se.

0:24:40.800 --> 0:24:42.760
<v Speaker 3>It feels kind of limiting to me. I mean, you

0:24:42.760 --> 0:24:45.720
<v Speaker 3>could imagine I've watched a little too much Star Trek,

0:24:45.760 --> 0:24:47.240
<v Speaker 3>I guess, but I mean you could imagine, you know,

0:24:47.280 --> 0:24:49.040
<v Speaker 3>you have like a virtual assistant in the ceiling,

0:24:49.080 --> 0:24:51.680
<v Speaker 3>I'm calling out to it, and you know, it doesn't

0:24:51.760 --> 0:24:54.000
<v Speaker 3>have to be just search on my screen. I could

0:24:54.080 --> 0:24:56.879
<v Speaker 3>have it in my car, right, I could have you know,

0:24:56.920 --> 0:24:59.280
<v Speaker 3>I call up American Airlines to change my airline tickets

0:24:59.320 --> 0:25:02.480
<v Speaker 3>and it's a chatbot, the chatbot that's talking

0:25:02.520 --> 0:25:04.040
<v Speaker 3>to me. So this could be very big and by

0:25:04.040 --> 0:25:06.400
<v Speaker 3>the way, I think to get... by the way, the

0:25:06.440 --> 0:25:08.439
<v Speaker 3>one problem with this sort of calculation is that it's kind

0:25:08.440 --> 0:25:11.160
<v Speaker 3>of static, Like the cost is sort of an output

0:25:11.240 --> 0:25:15.160
<v Speaker 3>rather than an input. I think to drive adoption, cost

0:25:15.200 --> 0:25:17.960
<v Speaker 3>will come down, and we've already seen that. Like, Nvidia

0:25:17.960 --> 0:25:20.639
<v Speaker 3>has a new product it's called Hopper, which is like

0:25:20.680 --> 0:25:23.040
<v Speaker 3>two generations past those V one hundreds that I was

0:25:23.080 --> 0:25:26.320
<v Speaker 3>talking about, past the Volta generation. The cost per query

0:25:26.400 --> 0:25:28.640
<v Speaker 3>to do this or the cost for training on Hopper

0:25:28.920 --> 0:25:31.120
<v Speaker 3>is much lower than on Volta because it's much more efficient.

0:25:31.160 --> 0:25:34.560
<v Speaker 3>That's a good thing, though. It's TAM-accretive. It

0:25:34.560 --> 0:25:36.240
<v Speaker 3>will drive adoption.

0:25:36.320 --> 0:25:40.080
<v Speaker 4>Nvidia actually has specific products specifically designed to do this

0:25:40.280 --> 0:25:43.720
<v Speaker 4>kind of thing, and Hopper has specific blocks on

0:25:43.760 --> 0:25:46.080
<v Speaker 4>it that actually help with the training and inference

0:25:46.080 --> 0:25:47.440
<v Speaker 4>on these kind of large language models.

0:25:47.480 --> 0:25:50.200
<v Speaker 3>And so I actually think over time, as the efficiency

0:25:50.200 --> 0:25:52.439
<v Speaker 3>gets better and better, you're going to drive adoption more

0:25:52.480 --> 0:25:54.160
<v Speaker 3>and more. I think this is a big thing. And

0:25:54.200 --> 0:25:56.480
<v Speaker 3>remember, we're still really early. ChatGPT only showed

0:25:56.520 --> 0:25:57.240
<v Speaker 3>up in November.

0:25:57.680 --> 0:25:59.720
<v Speaker 1>Yeah, it's crazy, it's really early.

0:25:59.760 --> 0:26:00.280
<v Speaker 3>Still.

0:26:00.000 --> 0:26:04.639
<v Speaker 2>Well, just on that note, can you draw directly the

0:26:04.680 --> 0:26:08.760
<v Speaker 2>connection between the software and the hardware here, because

0:26:08.800 --> 0:26:11.920
<v Speaker 2>I think at this point probably everyone listening has

0:26:11.960 --> 0:26:14.960
<v Speaker 2>tried ChatGPT, and you're used to seeing it as

0:26:15.000 --> 0:26:17.159
<v Speaker 2>a sort of you know, it's an interface on the

0:26:17.200 --> 0:26:19.560
<v Speaker 2>Internet and you type stuff into it and it spits

0:26:19.640 --> 0:26:24.440
<v Speaker 2>something out. But like, where do the semiconductors actually come

0:26:24.480 --> 0:26:28.600
<v Speaker 2>in when we're talking about crunching these enormous data sets

0:26:28.760 --> 0:26:31.119
<v Speaker 2>and what makes... you kind of touched on this

0:26:31.160 --> 0:26:33.480
<v Speaker 2>a little bit with Nvidia, but what makes a semiconductor

0:26:34.040 --> 0:26:38.840
<v Speaker 2>better at doing AI versus more traditional computational processes?

0:26:39.200 --> 0:26:41.359
<v Speaker 3>Yeah, yeah, you bet. So. To answer that second question,

0:26:41.480 --> 0:26:44.120
<v Speaker 3>I think AI is really much more around parallel processing,

0:26:44.160 --> 0:26:46.760
<v Speaker 3>and in particular, it's this kind of math, matrix math.

0:26:48.160 --> 0:26:54.159
<v Speaker 3>It's a single class of calculations that these things do

0:26:54.400 --> 0:26:57.040
<v Speaker 3>very very efficiently and do very very well, and they

0:26:57.040 --> 0:26:58.920
<v Speaker 3>do them much more efficiently than a CPU, that

0:26:59.000 --> 0:27:02.560
<v Speaker 3>performs a little more serially versus parallel. You just couldn't

0:27:02.600 --> 0:27:05.000
<v Speaker 3>run this stuff on CPUs. But don't get me wrong,

0:27:05.200 --> 0:27:07.440
<v Speaker 3>you do some... we've been talking about inference on

0:27:08.760 --> 0:27:12.000
<v Speaker 3>large language models. There's all kinds of inference. Inference

0:27:12.040 --> 0:27:15.600
<v Speaker 3>workloads range from very simplistic to very very complex like

0:27:15.680 --> 0:27:19.480
<v Speaker 3>and my, you know, cat recognition example was very simplistic,

0:27:20.359 --> 0:27:23.159
<v Speaker 3>something like this, or frankly something like autonomous driving, that

0:27:23.560 --> 0:27:26.360
<v Speaker 3>is an inference activity, but is a hugely computationally intense

0:27:26.920 --> 0:27:29.400
<v Speaker 3>inference activity. And so there's still a lot of inference

0:27:29.440 --> 0:27:32.000
<v Speaker 3>today that actually happens. In fact, most inference today actually

0:27:32.040 --> 0:27:35.600
<v Speaker 3>happens on CPUs. But i'd say the types of things

0:27:35.640 --> 0:27:38.040
<v Speaker 3>that you're trying to do are getting more and more complex,

0:27:38.600 --> 0:27:41.760
<v Speaker 3>and CPUs are getting less and less viable for that

0:27:41.800 --> 0:27:43.520
<v Speaker 3>kind of inference. And so

0:27:43.920 --> 0:27:46.560
<v Speaker 3>that's kind of the difference between GPUs and other types

0:27:46.600 --> 0:27:50.440
<v Speaker 3>of parallel offerings versus like a CPU. I should say,

0:27:50.440 --> 0:27:52.000
<v Speaker 3>by the way, GPUs are not the only way to

0:27:52.000 --> 0:27:54.639
<v Speaker 3>do this. Google, for example, has their own AI chips.

0:27:54.680 --> 0:27:57.000
<v Speaker 3>They call them a TPU, tensor processing unit.

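NOTE
To illustrate the matrix math mentioned in this answer: one layer of a network is a matrix-vector product, with one weighted sum per neuron. The sketch below is an assumption-laden toy (the function name layer_outputs and the integer weights are hypothetical). Each row's sum is independent of the others, which is what lets parallel hardware compute them simultaneously; this plain Python version simply computes them one after another.
```python
def layer_outputs(weight_rows, inputs):
    """Matrix-vector product: one weighted sum per neuron in the layer."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weight_rows]
weights = [[1, 2], [0, 1], [3, -1]]  # three neurons, two inputs each
print(layer_outputs(weights, [2, 5]))
```
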
0:27:57.520 --> 0:27:59.760
<v Speaker 1>One thing I like about talking to Stacey about

0:28:00.080 --> 0:28:03.080
<v Speaker 1>things is, I think he comes up with better

0:28:03.200 --> 0:28:05.720
<v Speaker 1>versions of our questions than we do.

0:28:05.840 --> 0:28:08.760
<v Speaker 2>Which it's like one thing about the question is just ask.

0:28:09.880 --> 0:28:11.560
<v Speaker 1>He's always like, all right, that's a good question, but

0:28:11.640 --> 0:28:14.919
<v Speaker 1>let me actually reframe the question to get a better response.

0:28:14.920 --> 0:28:18.720
<v Speaker 1>So I appreciate that, and he also anticipates because I literally,

0:28:18.880 --> 0:28:21.479
<v Speaker 1>like on my computer right now, I had Google Cloud

0:28:21.520 --> 0:28:24.080
<v Speaker 1>tensor processing units because that was my next question. And

0:28:24.160 --> 0:28:27.879
<v Speaker 1>also important because I think yesterday The Information reported that

0:28:27.960 --> 0:28:30.840
<v Speaker 1>Microsoft is also... So why don't you talk to us

0:28:30.880 --> 0:28:34.480
<v Speaker 1>about these others, and what's competing directly?

0:28:36.240 --> 0:28:39.680
<v Speaker 3>Yeah, yeah, yeah, you bet. So Google... by the way,

0:28:39.720 --> 0:28:41.200
<v Speaker 3>this is not new. Google has been doing their own

0:28:41.240 --> 0:28:44.160
<v Speaker 3>chips for seven or eight years. It is not new, right,

0:28:44.200 --> 0:28:46.160
<v Speaker 3>But they have what they call a TPU, and they

0:28:46.200 --> 0:28:50.480
<v Speaker 3>use it extensively for their own internal workloads. Absolutely, Amazon

0:28:50.560 --> 0:28:52.840
<v Speaker 3>has their own chips. They have a training chip. It's

0:28:52.960 --> 0:28:55.520
<v Speaker 3>called, you know, kind of hysterically, it's called Trainium.

0:28:56.200 --> 0:29:00.000
<v Speaker 3>They have an inference chip. It's called Inferentia. Microsoft apparently

0:29:00.120 --> 0:29:03.680
<v Speaker 3>is working on their own. My feeling is every hyperscaler

0:29:03.760 --> 0:29:06.760
<v Speaker 3>is working on their own chip, particularly for their own

0:29:06.800 --> 0:29:09.320
<v Speaker 3>internal workloads. And that is an area we talked about

0:29:09.320 --> 0:29:12.560
<v Speaker 3>Nvidia's software moat. Like, Google doesn't need Nvidia's

0:29:12.680 --> 0:29:15.800
<v Speaker 3>software moat. They're not running CUDA. They're just running

0:29:15.840 --> 0:29:19.240
<v Speaker 3>TensorFlow, right, and doing their thing. They don't

0:29:19.240 --> 0:29:23.000
<v Speaker 3>need CUDA. Anything, however, that is facing an end customer,

0:29:23.040 --> 0:29:25.200
<v Speaker 3>like an enterprise like end customer, like on a public cloud,

0:29:25.280 --> 0:29:28.479
<v Speaker 3>like a customer going to AWS and renting, you know,

0:29:28.880 --> 0:29:32.440
<v Speaker 3>compute power, that tends to be GPUs because customers don't

0:29:32.480 --> 0:29:36.720
<v Speaker 3>have Google's sophistication. They really do need the software

0:29:36.960 --> 0:29:40.360
<v Speaker 3>ecosystem that's built around GPUs. So for example, I

0:29:40.360 --> 0:29:42.520
<v Speaker 3>can go to Google Cloud, I can actually rent a

0:29:42.640 --> 0:29:46.920
<v Speaker 3>TPU instance. It can be done. Nobody really does it. And

0:29:46.960 --> 0:29:49.400
<v Speaker 3>actually if you look how they're priced, typically it's actually

0:29:49.400 --> 0:29:52.120
<v Speaker 3>more expensive, usually, even than the way that

0:29:52.120 --> 0:29:56.320
<v Speaker 3>Google's pricing GPUs on Google Cloud. It's

0:29:56.320 --> 0:29:59.360
<v Speaker 3>similar for Amazon and others, And so I do think

0:29:59.400 --> 0:30:01.200
<v Speaker 3>that all the hyperscalers are working on their own

0:30:01.280 --> 0:30:03.640
<v Speaker 3>and there is certainly a place for that,

0:30:03.760 --> 0:30:06.600
<v Speaker 3>especially for their own internal workloads. Anything that's facing a

0:30:06.640 --> 0:30:09.680
<v Speaker 3>customer, that Nvidia GPU ecosystem is really kind of...

0:30:09.520 --> 0:30:12.560
<v Speaker 1>Yeah, this is so... Actually,

0:30:12.960 --> 0:30:15.200
<v Speaker 1>just to clarify, because that point is really interesting that

0:30:15.280 --> 0:30:18.680
<v Speaker 1>for like, if again Tracy and I want to launch

0:30:18.760 --> 0:30:23.000
<v Speaker 1>Odd Lots GPT, part of the issue would be not

0:30:23.160 --> 0:30:28.920
<v Speaker 1>necessarily the hardware, sort of the silicon, but actually

0:30:29.480 --> 0:30:33.000
<v Speaker 1>that Nvidia's software suite built around it would make

0:30:33.040 --> 0:30:36.239
<v Speaker 1>it much easier for us to sort of start and

0:30:36.360 --> 0:30:38.280
<v Speaker 1>use Nvidia for training our model.

0:30:38.360 --> 0:30:41.320
<v Speaker 3>Yeah, yes, it would. And they've built a lot of...

0:30:41.360 --> 0:30:44.160
<v Speaker 3>It's funny. You can go listen to Nvidia's announcements and

0:30:44.200 --> 0:30:46.000
<v Speaker 3>their analyst days and things, and they're as much about

0:30:46.000 --> 0:30:48.840
<v Speaker 3>software as they are about hardware. So not only have

0:30:48.920 --> 0:30:52.680
<v Speaker 3>they continued to extend, like, the basic CUDA ecosystem,

0:30:52.680 --> 0:30:56.040
<v Speaker 3>they've layered all kinds of other application specific things on

0:30:56.560 --> 0:30:58.400
<v Speaker 3>top of it. So they've got what they call RAPIDS,

0:30:58.400 --> 0:31:01.920
<v Speaker 3>which is for enterprise machine learning. They've got a library

0:31:01.920 --> 0:31:04.760
<v Speaker 3>package called Isaac, which is for automation and robotics. They've got

0:31:04.760 --> 0:31:08.080
<v Speaker 3>a package called Clara, which is specifically for medical imaging

0:31:08.120 --> 0:31:11.520
<v Speaker 3>and diagnostics. They've got something called cuQuantum, which is

0:31:11.520 --> 0:31:15.600
<v Speaker 3>actually for quantum computer simulations. They've got something for drug discovery.

0:31:15.960 --> 0:31:20.000
<v Speaker 3>So they're layering all these things on top, right depending

0:31:20.040 --> 0:31:22.520
<v Speaker 3>on your application. They've got internal teams that are working

0:31:22.520 --> 0:31:24.760
<v Speaker 3>on it. It's not just throwing the software out there. They've

0:31:24.800 --> 0:31:27.040
<v Speaker 3>got people there that can actually like help you work

0:31:27.080 --> 0:31:30.040
<v Speaker 3>or work and come along with it. They're doing other

0:31:30.080 --> 0:31:32.480
<v Speaker 3>things to make it easier, you know. So they actually just launched a

0:31:32.520 --> 0:31:35.200
<v Speaker 3>cloud service, and this is with Google and Oracle and

0:31:35.200 --> 0:31:37.480
<v Speaker 3>Google and Microsoft, where you can almost, they'll do, like,

0:31:37.520 --> 0:31:41.680
<v Speaker 3>a fully provisioned Nvidia AI supercomputer in the cloud.

0:31:41.800 --> 0:31:43.880
<v Speaker 3>So because like you, they sell these AI servers and

0:31:43.920 --> 0:31:46.800
<v Speaker 3>they can cost hundreds of thousands of dollars apiece. If

0:31:46.840 --> 0:31:48.960
<v Speaker 3>you want now you can just go to Oracle Cloud

0:31:49.040 --> 0:31:50.840
<v Speaker 3>or Google Cloud or whatever. You can sort of rent

0:31:50.880 --> 0:31:54.840
<v Speaker 3>a fully provisioned Nvidia supercomputer sitting in the cloud

0:31:54.840 --> 0:31:56.960
<v Speaker 3>that they'll, all you've got to do is access it

0:31:57.000 --> 0:32:00.000
<v Speaker 3>right from a web browser. This kind of gets super easy.

0:32:00.160 --> 0:32:02.000
<v Speaker 2>This is going to be my next question actually because

0:32:02.240 --> 0:32:06.040
<v Speaker 2>so I take the point about software, but like what

0:32:06.160 --> 0:32:11.120
<v Speaker 2>do the AI supercomputers actually look like nowadays, Like is

0:32:11.160 --> 0:32:14.760
<v Speaker 2>there a physical thing in a giant data center somewhere? Are

0:32:14.840 --> 0:32:17.960
<v Speaker 2>they mostly like cloud based or what does this look like?

0:32:17.960 --> 0:32:21.520
<v Speaker 3>Like, walk us through. So Nvidia sells, Nvidia sells something

0:32:21.560 --> 0:32:24.000
<v Speaker 3>they call a DGX. It's a, it's a box. I

0:32:24.040 --> 0:32:26.280
<v Speaker 3>mean, it's, I don't know, it's, what, it's two feet,

0:32:26.320 --> 0:32:27.720
<v Speaker 3>but I don't know what the dimensions are, two feet

0:32:27.760 --> 0:32:30.200
<v Speaker 3>by two feet or something like that. It's got eight

0:32:30.280 --> 0:32:33.760
<v Speaker 3>GPUs and two CPUs and a bunch of memory and

0:32:33.760 --> 0:32:35.840
<v Speaker 3>a bunch of networking. They've got their own like you know,

0:32:35.840 --> 0:32:37.960
<v Speaker 3>they bought a company called Mellanox a while back that

0:32:38.040 --> 0:32:41.320
<v Speaker 3>did networking hardware. So it's got a bunch of proprietary

0:32:41.360 --> 0:32:43.400
<v Speaker 3>networking, because that's, but that's something else we haven't talked about.

0:32:43.440 --> 0:32:45.680
<v Speaker 3>It's not just enough to have the computer the compute.

0:32:46.000 --> 0:32:48.440
<v Speaker 3>These models are so big they don't fit on a

0:32:48.480 --> 0:32:50.160
<v Speaker 3>single GPU. So you have to be able to

0:32:50.200 --> 0:32:53.400
<v Speaker 3>network all this stuff together, right, And so they've got

0:32:53.480 --> 0:32:56.080
<v Speaker 3>networking in there, and they have this this box, and

0:32:56.120 --> 0:32:57.720
<v Speaker 3>then you can you can stack a whole bunch of

0:32:57.720 --> 0:33:01.520
<v Speaker 3>boxes together. Like, Nvidia has their own internal supercomputer.

0:33:01.560 --> 0:33:03.600
<v Speaker 3>It's, it's fairly high on the top five hundred list.

0:33:03.600 --> 0:33:06.720
<v Speaker 3>They call it Selene. It's a bunch of these DGX

0:33:06.840 --> 0:33:10.080
<v Speaker 3>like servers that they make, all just like stacked together effectively,

0:33:10.520 --> 0:33:13.800
<v Speaker 3>and they sell for the older generation. Their prior generation

0:33:13.920 --> 0:33:16.000
<v Speaker 3>was called Ampere, and that box sold for one hundred

0:33:16.000 --> 0:33:18.800
<v Speaker 3>and ninety nine thousand dollars. I don't believe they've released

0:33:18.840 --> 0:33:20.960
<v Speaker 3>pricing on the Hopper version, but I know for the

0:33:20.960 --> 0:33:25.080
<v Speaker 3>Hopper GPU it costs two to three x what Ampere

0:33:25.160 --> 0:33:26.880
<v Speaker 3>costs the prior generation.

0:33:27.000 --> 0:33:32.440
<v Speaker 1>So this raises a separate question to me, which is, okay,

0:33:32.520 --> 0:33:34.880
<v Speaker 1>there's the price, and it exists, and you could go

0:33:34.920 --> 0:33:38.320
<v Speaker 1>to you could theoretically go and use Google's tensor based

0:33:38.320 --> 0:33:42.160
<v Speaker 1>cloud or is it available or is it because I

0:33:42.240 --> 0:33:44.680
<v Speaker 1>sort of get the impression that, like for some of

0:33:44.720 --> 0:33:47.920
<v Speaker 1>the technology that people want to use, it's not available

0:33:47.960 --> 0:33:50.760
<v Speaker 1>at any price, and that there is actually is that

0:33:50.800 --> 0:33:51.320
<v Speaker 1>real or not?

0:33:52.080 --> 0:33:54.440
<v Speaker 3>It seems to be. So we're, like, so their

0:33:54.520 --> 0:33:57.600
<v Speaker 3>new generation, which is called Hopper, which like I said,

0:33:57.720 --> 0:34:01.400
<v Speaker 3>has characteristics that make it very attractive, especially for these

0:34:01.440 --> 0:34:04.120
<v Speaker 3>kind of like chat GPT large language models, is in

0:34:04.200 --> 0:34:05.840
<v Speaker 3>tight supply. We're at the very beginning of that

0:34:05.880 --> 0:34:08.399
<v Speaker 3>product cycle. They just launched it like in the last

0:34:08.560 --> 0:34:11.399
<v Speaker 3>couple of quarters, and so that ramp up takes time,

0:34:11.440 --> 0:34:15.160
<v Speaker 3>and it does seem like they are seeing accelerated demand

0:34:15.320 --> 0:34:17.120
<v Speaker 3>because of this kind of stuff, and so yeah, I

0:34:17.160 --> 0:34:20.880
<v Speaker 3>think supply is tight. We've heard stories about GPU shortages

0:34:20.960 --> 0:34:23.719
<v Speaker 3>at Microsoft and the cloud vendors, and I think there

0:34:23.760 --> 0:34:25.279
<v Speaker 3>was a Bloomberg story the other day that said these

0:34:25.280 --> 0:34:27.400
<v Speaker 3>things were selling for like forty thousand dollars on eBay.

0:34:27.400 --> 0:34:30.040
<v Speaker 3>It's a thing, right? I took a look at some

0:34:30.040 --> 0:34:31.759
<v Speaker 3>of those listings. They looked a little shady to me,

0:34:31.920 --> 0:34:33.839
<v Speaker 3>But yeah, it's tight. You have to remember, these parts

0:34:33.880 --> 0:34:36.279
<v Speaker 3>are very complicated, so the lead times to actually have

0:34:36.360 --> 0:34:38.240
<v Speaker 3>more made, it takes a while.

0:34:38.480 --> 0:34:41.680
<v Speaker 2>Wait, so just on this note, I joked about this

0:34:41.760 --> 0:34:44.839
<v Speaker 2>in the intro, But you know, could I buy like

0:34:45.360 --> 0:34:50.360
<v Speaker 2>a bitcoin mining facility and take all that computer processing

0:34:50.440 --> 0:34:53.239
<v Speaker 2>power and like convert it into something that could be

0:34:53.320 --> 0:34:55.120
<v Speaker 2>used for AI. Is that a possibility?

0:34:55.360 --> 0:34:57.960
<v Speaker 3>You could. The Bitcoin stuff, at least a lot

0:34:58.000 --> 0:35:00.520
<v Speaker 3>of the Bitcoin stuff that was done, that was with GPUs.

0:35:00.760 --> 0:35:03.960
<v Speaker 3>Those were still mostly gaming GPUs. People were buying gaming

0:35:03.960 --> 0:35:07.160
<v Speaker 3>GPUs and repurposing them for Bitcoin and Ethereum,

0:35:07.320 --> 0:35:10.080
<v Speaker 3>mostly Ethereum mining. Yeah, they're, they're not nearly as

0:35:10.080 --> 0:35:13.319
<v Speaker 3>compute efficient as the data center parts, right, but I

0:35:13.320 --> 0:35:15.520
<v Speaker 3>mean in theory, yeah, you could get you know, gaming

0:35:15.600 --> 0:35:17.759
<v Speaker 3>GPUs, if you could, and string them together, but it would

0:35:17.800 --> 0:35:20.440
<v Speaker 3>be prohibitive, right, And even now most of that stuff's

0:35:20.440 --> 0:35:23.160
<v Speaker 3>cleared out. I think as as Joe said, but the

0:35:23.200 --> 0:35:27.640
<v Speaker 3>math is somewhat similar, I'd say for these kinds of models, though, again,

0:35:27.760 --> 0:35:30.440
<v Speaker 3>like, Hopper, Nvidia's new data center product, has,

0:35:30.520 --> 0:35:32.760
<v Speaker 3>they have something that they call a Transformer Engine.

0:35:33.520 --> 0:35:35.400
<v Speaker 3>What it really does is it allows you to do

0:35:35.480 --> 0:35:38.440
<v Speaker 3>the training at a slightly lower precision. It lets you

0:35:38.480 --> 0:35:41.000
<v Speaker 3>do it at eight bit floating point versus sixteen bit

0:35:41.400 --> 0:35:44.319
<v Speaker 3>it'll so it lets you get higher performance. And then

0:35:44.360 --> 0:35:47.200
<v Speaker 3>there's another process. There's like a conversion process. Sometimes it

0:35:47.280 --> 0:35:49.880
<v Speaker 3>has to go when you go from training to inference.

0:35:49.920 --> 0:35:53.040
<v Speaker 3>It's something called quantization, and with these Transformer Engines you

0:35:53.080 --> 0:35:55.120
<v Speaker 3>don't have to do that. So it increases the efficiency

0:35:55.480 --> 0:35:58.040
<v Speaker 3>which you wouldn't get by picking some random GPUs.

0:35:58.080 --> 0:35:59.640
<v Speaker 1>Where is Intel in this story?

0:36:00.360 --> 0:36:03.000
<v Speaker 3>Well, so let's let's talk about the other competitive options

0:36:03.000 --> 0:36:05.319
<v Speaker 3>that are out there. Okay, so we talked about some

0:36:05.400 --> 0:36:08.920
<v Speaker 3>of the captive silicon at hyperscalers. That is there, and

0:36:08.960 --> 0:36:10.680
<v Speaker 3>it is real, and they're all building their own and

0:36:10.680 --> 0:36:12.760
<v Speaker 3>they've been doing it forever and it hasn't slowed anything

0:36:12.760 --> 0:36:14.839
<v Speaker 3>down in the slightest, because we're still early, and

0:36:14.880 --> 0:36:17.080
<v Speaker 3>the opportunity is big. By the way, I will say,

0:36:17.320 --> 0:36:19.920
<v Speaker 3>I don't worry to lead with it. I don't worry

0:36:19.920 --> 0:36:23.719
<v Speaker 3>so much about competition at this point because think about it.

0:36:23.719 --> 0:36:25.719
<v Speaker 3>Nvidia's run-rating their data center business right now,

0:36:25.719 --> 0:36:28.080
<v Speaker 3>it's something like fifteen billion dollars a year. That's where

0:36:28.080 --> 0:36:29.920
<v Speaker 3>it is. It's growing, but that's where it is. So

0:36:30.120 --> 0:36:33.200
<v Speaker 3>Jensen, Nvidia's CEO, likes to throw out big numbers,

0:36:33.200 --> 0:36:36.040
<v Speaker 3>and he threw out I think he said for silicon

0:36:36.120 --> 0:36:38.160
<v Speaker 3>and hardware TAM in the data center, and he thought

0:36:38.160 --> 0:36:41.520
<v Speaker 3>that their TAM over time is three hundred billion dollars, and

0:36:41.600 --> 0:36:43.680
<v Speaker 3>it seemed kind of crazy. Although I would say, like

0:36:43.719 --> 0:36:46.000
<v Speaker 3>it's seeming a little less and less crazy every day.

0:36:46.680 --> 0:36:49.120
<v Speaker 3>But if you thought the TAM was three hundred billion

0:36:49.320 --> 0:36:51.960
<v Speaker 3>or two or one hundred billion or like whatever, and

0:36:52.000 --> 0:36:54.400
<v Speaker 3>they're run rating at fifteen billion dollars, there's tons of

0:36:54.440 --> 0:36:57.160
<v Speaker 3>headroom. Competition doesn't really matter, and that's what we've seen.

0:36:57.200 --> 0:37:01.439
<v Speaker 3>We've seen competition, but there's so much opportunity like who

0:37:01.480 --> 0:37:03.520
<v Speaker 3>cares right versus like if you thought it was a

0:37:03.560 --> 0:37:05.880
<v Speaker 3>twenty billion dollar TAM, like, they would have a problem

0:37:05.960 --> 0:37:08.640
<v Speaker 3>like already today. So that's why I don't worry too

0:37:08.680 --> 0:37:11.359
<v Speaker 3>much because I think the opportunity is still very very

0:37:11.440 --> 0:37:15.080
<v Speaker 3>large relative to where they're running the business today. In

0:37:15.120 --> 0:37:17.520
<v Speaker 3>terms of other competitors, though, so you mentioned, let's

0:37:17.520 --> 0:37:20.959
<v Speaker 3>talk about AMD first, because AMD actually makes GPUs,

0:37:21.360 --> 0:37:23.439
<v Speaker 3>they make data center GPUs. They don't sell very many

0:37:23.480 --> 0:37:25.640
<v Speaker 3>of them. Their current product is something called the

0:37:25.680 --> 0:37:30.560
<v Speaker 3>MI two fifty, and they've sold de minimis, basically. And in fact,

0:37:30.560 --> 0:37:33.400
<v Speaker 3>you know, when the China sanctions were put on, and

0:37:33.520 --> 0:37:35.040
<v Speaker 3>you know, we didn't talk about that, but the US

0:37:35.120 --> 0:37:38.480
<v Speaker 3>has stopped allowing, like, high-end AI chips from being shipped

0:37:38.480 --> 0:37:41.200
<v Speaker 3>to China. The MI two fifty, AMD's part, was

0:37:41.200 --> 0:37:42.480
<v Speaker 3>on the list, but it didn't affect them at all

0:37:42.480 --> 0:37:45.080
<v Speaker 3>because they weren't selling anything. Hey, so their sales were zero.

0:37:45.320 --> 0:37:47.680
<v Speaker 3>They've got another product coming out as the follow-on that's

0:37:47.680 --> 0:37:49.560
<v Speaker 3>called the MI three hundred, and people have been

0:37:49.560 --> 0:37:51.279
<v Speaker 3>getting kind of excited about AMD. They've been

0:37:51.360 --> 0:37:52.640
<v Speaker 3>sort of looking to play it as kind of like

0:37:52.640 --> 0:37:55.359
<v Speaker 3>the poor man's Nvidia. I'll be honest, I don't think

0:37:55.360 --> 0:37:57.480
<v Speaker 3>it's the poor man's Nvidia. Nvidia is doing, you know,

0:37:57.640 --> 0:38:00.600
<v Speaker 3>close to four billion dollars a quarter in data center revenues.

0:38:01.040 --> 0:38:02.799
<v Speaker 3>I don't know that I see anything like that with

0:38:02.840 --> 0:38:05.160
<v Speaker 3>the MI three hundred, and AMD, as

0:38:05.200 --> 0:38:07.480
<v Speaker 3>far as I can tell, has not even released any sort

0:38:07.520 --> 0:38:10.480
<v Speaker 3>of specifications for what it looks like at this point. So,

0:38:10.600 --> 0:38:13.160
<v Speaker 3>but that is an option, and some people would say

0:38:13.160 --> 0:38:15.520
<v Speaker 3>and there's maybe some truth to this, you know, if

0:38:15.520 --> 0:38:19.120
<v Speaker 3>you want an alternative, AMD will present an alternative. And

0:38:19.120 --> 0:38:20.880
<v Speaker 3>if the opportunity is really that big, they'll get some.

0:38:21.000 --> 0:38:23.320
<v Speaker 3>They'll, they'll probably get some. So you have that. You

0:38:23.400 --> 0:38:27.640
<v Speaker 3>have Intel. So Intel's got a few things. On their CPUs,

0:38:27.680 --> 0:38:31.680
<v Speaker 3>Their current version is called Sapphire Rapids. It has AI

0:38:31.800 --> 0:38:34.839
<v Speaker 3>specific accelerators for inference, not, not so much

0:38:34.840 --> 0:38:38.560
<v Speaker 3>maybe for this kind of stuff, but for general inference activities.

0:38:39.080 --> 0:38:41.800
<v Speaker 3>They're trying to play up the capabilities of their CPU

0:38:42.560 --> 0:38:44.600
<v Speaker 3>on that. Fine. And why are they doing that? It's

0:38:44.640 --> 0:38:47.520
<v Speaker 3>because their accelerator roadmap isn't so good. So they have

0:38:47.600 --> 0:38:51.200
<v Speaker 3>a GPU roadmap. The code name for it was Ponte Vecchio,

0:38:52.239 --> 0:38:54.720
<v Speaker 3>and they've kind of gutted that roadmap. So the follow

0:38:54.800 --> 0:38:57.680
<v Speaker 3>on product was something called Rialto Bridge that they've since canceled,

0:38:58.560 --> 0:39:01.800
<v Speaker 3>and one of the Ponte Vecchio products recently they just canceled,

0:39:02.680 --> 0:39:06.040
<v Speaker 3>and Ponte Vecchio originally was designed for the Aurora supercomputer

0:39:06.080 --> 0:39:09.759
<v Speaker 3>and it was massively late. I mean so like they

0:39:09.800 --> 0:39:11.560
<v Speaker 3>took a, it was something like a three

0:39:11.640 --> 0:39:15.160
<v Speaker 3>hundred million dollar charge. I think it was at

0:39:15.160 --> 0:39:16.759
<v Speaker 3>the end of twenty twenty one. It was either the

0:39:16.840 --> 0:39:19.040
<v Speaker 3>end of twenty twenty or twenty twenty-one where they're,

0:39:19.080 --> 0:39:21.520
<v Speaker 3>they basically they gave it away. It was so late,

0:39:21.880 --> 0:39:23.759
<v Speaker 3>So that's that's how late they were. They also have

0:39:23.840 --> 0:39:28.440
<v Speaker 3>another product. They bought an Israeli AI company called Habana,

0:39:29.239 --> 0:39:31.840
<v Speaker 3>and Habana has a product called Gaudi. It's not a

0:39:31.880 --> 0:39:36.759
<v Speaker 3>GPU exactly, but it's like a specific accelerator technology. And

0:39:36.880 --> 0:39:39.040
<v Speaker 3>Amazon bought some of them and they sell a little bit,

0:39:39.040 --> 0:39:42.000
<v Speaker 3>but again, versus Intel's total revenues, it's de minimis,

0:39:42.360 --> 0:39:44.839
<v Speaker 3>so they're not really there. There's also a bunch of

0:39:44.840 --> 0:39:48.520
<v Speaker 3>startups and the problem with most of the startups is

0:39:48.560 --> 0:39:51.080
<v Speaker 3>their their their story tends to be something like, you know,

0:39:51.080 --> 0:39:53.160
<v Speaker 3>we have a product that's ten times as good as Nvidia,

0:39:53.200 --> 0:39:56.160
<v Speaker 3>and the issue is, with every generation, Nvidia has

0:39:56.160 --> 0:39:57.960
<v Speaker 3>something that's ten times as good as Nvidia, and

0:39:58.000 --> 0:40:00.600
<v Speaker 3>they have the software ecosystem that goes with it. Neither

0:40:00.640 --> 0:40:02.760
<v Speaker 3>AMD, nor Intel, nor most of the startups

0:40:02.760 --> 0:40:05.960
<v Speaker 3>have anything remotely resembling Nvidia's software. So that's another

0:40:06.040 --> 0:40:08.479
<v Speaker 3>huge issue right that all of them are facing. There's

0:40:08.520 --> 0:40:11.520
<v Speaker 3>a few startups that have some niche success. One of

0:40:11.560 --> 0:40:14.359
<v Speaker 3>the one that's probably gotten the most attention is called

0:40:14.400 --> 0:40:18.240
<v Speaker 3>Cerebras, and their whole thing, they make a chip.

0:40:18.400 --> 0:40:21.560
<v Speaker 3>It's, imagine taking a three hundred millimeter silicon wafer and

0:40:21.600 --> 0:40:25.000
<v Speaker 3>inscribing a square on it. That's their chip. It's like

0:40:25.040 --> 0:40:27.759
<v Speaker 3>one chip per wafer, and so you can put very

0:40:27.920 --> 0:40:30.960
<v Speaker 3>large models onto these chips, and they've been deploying them

0:40:30.960 --> 0:40:34.040
<v Speaker 3>for those kinds of things. But again the software becomes

0:40:34.200 --> 0:40:35.880
<v Speaker 3>an issue. But they've had a little bit of success.

0:40:36.400 --> 0:40:38.640
<v Speaker 3>There's some other names that that you know, You've got

0:40:38.760 --> 0:40:41.000
<v Speaker 3>Groq and some others I think that are still out there.

0:40:41.000 --> 0:40:43.560
<v Speaker 3>And then there's a company called Tenstorrent, which is

0:40:43.640 --> 0:40:47.160
<v Speaker 3>interesting not because of so far what they're doing because

0:40:47.200 --> 0:40:49.360
<v Speaker 3>it's early, but it's run now by Jim Keller. And

0:40:49.360 --> 0:40:52.520
<v Speaker 3>do you guys know who Jim Keller is. Jim Keller

0:40:52.680 --> 0:40:55.280
<v Speaker 3>was was He's sort of like a star chip designer.

0:40:55.320 --> 0:40:59.239
<v Speaker 3>He designed Apple's first custom processor. He designed AMD's

0:40:59.320 --> 0:41:01.520
<v Speaker 3>Zen and EPYC roadmaps that they've been

0:41:01.600 --> 0:41:02.920
<v Speaker 3>that they've been taking a lot of share with. He

0:41:03.040 --> 0:41:05.000
<v Speaker 3>was even at Tesla for a while and at Intel,

0:41:05.480 --> 0:41:07.719
<v Speaker 3>and so he's now running Tenstorrent, and they

0:41:07.760 --> 0:41:10.319
<v Speaker 3>do, it's RISC-V. RISC-V is another type

0:41:10.320 --> 0:41:13.239
<v Speaker 3>of architecture, and they do, they do an AI chip.

0:41:13.280 --> 0:41:14.120
<v Speaker 3>So Jim is running that.

0:41:14.520 --> 0:41:16.960
<v Speaker 2>So can I just ask based on that? I mean,

0:41:17.120 --> 0:41:22.439
<v Speaker 2>how like capex intensive is developing chips that are well

0:41:22.480 --> 0:41:26.600
<v Speaker 2>suited for AI versus other types of chips. And then secondly,

0:41:26.760 --> 0:41:32.040
<v Speaker 2>like where do the improvements come from or what are

0:41:32.120 --> 0:41:36.279
<v Speaker 2>the like improvements focused on? Is it speed or like

0:41:36.600 --> 0:41:40.799
<v Speaker 2>scale, given the data sets involved and the parallel processes

0:41:40.800 --> 0:41:41.600
<v Speaker 2>that you described.

0:41:42.480 --> 0:41:43.960
<v Speaker 3>Yeah, so it's a few things. So in terms of

0:41:44.000 --> 0:41:46.480
<v Speaker 3>capex intensity, these are mostly design companies, so they

0:41:46.480 --> 0:41:48.440
<v Speaker 3>don't have a lot of Capex. It's certainly r and

0:41:48.520 --> 0:41:51.640
<v Speaker 3>D intensive, So maybe maybe that's that's what you're getting

0:41:51.640 --> 0:41:53.800
<v Speaker 3>at. Nvidia spends, like, many billions of dollars a

0:41:53.920 --> 0:41:56.160
<v Speaker 3>year on R and D, and Nvidia has a

0:41:56.160 --> 0:41:58.160
<v Speaker 3>little bit of advantage too because it's it's effectively the

0:41:58.200 --> 0:42:01.319
<v Speaker 3>same architecture between data center and gaming, so they've got

0:42:01.360 --> 0:42:04.759
<v Speaker 3>other other volume effectively to sort of amortize some of

0:42:04.760 --> 0:42:07.440
<v Speaker 3>those investments over although now I mean this year, I mean,

0:42:07.480 --> 0:42:10.120
<v Speaker 3>data center's probably sixty percent of Nvidia's revenues now,

0:42:10.120 --> 0:42:11.879
<v Speaker 3>so, I mean, Nvidia is sort of, the center

0:42:11.920 --> 0:42:13.719
<v Speaker 3>of, data center is the center of gravity for

0:42:13.800 --> 0:42:16.480
<v Speaker 3>Nvidia now, but it's very R and D intensive and

0:42:16.560 --> 0:42:18.920
<v Speaker 3>probably getting more so. And you've got folks all up

0:42:18.920 --> 0:42:20.879
<v Speaker 3>and down the value chain that are investing, both

0:42:20.880 --> 0:42:23.719
<v Speaker 3>the silicon guys you know, and the cloud guys and

0:42:23.760 --> 0:42:25.839
<v Speaker 3>the customers and everything else. But I mean, that's that's

0:42:25.920 --> 0:42:28.000
<v Speaker 3>kind of where we are in terms of what you're

0:42:28.080 --> 0:42:30.960
<v Speaker 3>you're looking for. So there's a few things you're looking for.

0:42:31.080 --> 0:42:33.800
<v Speaker 3>Performance and on training, quite often that comes down to

0:42:33.920 --> 0:42:36.759
<v Speaker 3>like time to train. So I've got a model, Like

0:42:36.800 --> 0:42:38.520
<v Speaker 3>some of these models, I mean, you could imagine it

0:42:38.520 --> 0:42:43.640
<v Speaker 3>could take weeks or months historically to train right, and

0:42:43.880 --> 0:42:46.279
<v Speaker 3>that's a problem. You want it to be faster, so

0:42:46.320 --> 0:42:48.400
<v Speaker 3>I can get that down, you know, to weeks or,

0:42:48.440 --> 0:42:50.440
<v Speaker 3>you know, to days or hours, that would be better.

0:42:50.920 --> 0:42:52.640
<v Speaker 3>So that's one thing clearly that they work on.

0:42:53.040 --> 0:42:53.560
<v Speaker 1>I don't want to.

0:42:53.719 --> 0:42:56.080
<v Speaker 3>It's something notice, yeah, go ahead.

0:42:56.160 --> 0:42:58.480
<v Speaker 1>No, finish your thought, then I have a slightly, oh yeah.

0:42:58.760 --> 0:43:00.000
<v Speaker 3>The other thing I was talking about, there's something

0:43:00.040 --> 0:43:02.359
<v Speaker 3>called, like, scale-out. So basically, remember I said,

0:43:02.360 --> 0:43:05.400
<v Speaker 3>you're you're connecting lots and lots of these chips together.

0:43:06.320 --> 0:43:08.840
<v Speaker 3>So for example, if if I if I increase the

0:43:08.920 --> 0:43:12.040
<v Speaker 3>number of chips by ten X, does my training time

0:43:12.080 --> 0:43:13.880
<v Speaker 3>go back down by like a factor of ten or

0:43:13.920 --> 0:43:16.040
<v Speaker 3>is it like by factor of two? So yeah, ideally

0:43:16.040 --> 0:43:18.480
<v Speaker 3>you would want like linear scaling, right, I want, like

0:43:18.680 --> 0:43:20.920
<v Speaker 3>I add resources, it scales linearly.

0:43:21.080 --> 0:43:23.080
<v Speaker 1>So this is kind of gonna was going to get

0:43:23.080 --> 0:43:25.759
<v Speaker 1>into my next question. Actually, and you know, we can

0:43:25.920 --> 0:43:29.520
<v Speaker 1>talk another time with someone else about certain, like, AI

0:43:29.680 --> 0:43:30.399
<v Speaker 1>fantasy doom.

0:43:30.840 --> 0:43:33.440
<v Speaker 3>But I think, but I'm not an AI. I'm not

0:43:33.480 --> 0:43:36.400
<v Speaker 3>an AI architecture expert. But I'm a down past here.

0:43:36.480 --> 0:43:38.120
<v Speaker 3>So I could just say you may want to get aged,

0:43:38.200 --> 0:43:38.520
<v Speaker 3>no I.

0:43:38.480 --> 0:43:41.480
<v Speaker 1>Know somebody, but I am curious though, because I do

0:43:41.560 --> 0:43:44.080
<v Speaker 1>think it relates to this question, which is that okay,

0:43:44.200 --> 0:43:46.600
<v Speaker 1>like with each one like GPT five and they're going

0:43:46.640 --> 0:43:49.200
<v Speaker 1>to keep adding more knobs on the box, et cetera,

0:43:49.440 --> 0:43:54.520
<v Speaker 1>like and is your perception that this sort of quality

0:43:54.560 --> 0:43:58.080
<v Speaker 1>of the output is growing exponentially or is it the

0:43:58.160 --> 0:44:01.960
<v Speaker 1>kind of thing where it's like GPT four, you know,

0:44:02.080 --> 0:44:04.120
<v Speaker 1>there's a lot more knobs and they got a big

0:44:04.200 --> 0:44:08.320
<v Speaker 1>jump from GPT three. GPT five will be way more knobs,

0:44:08.520 --> 0:44:10.880
<v Speaker 1>but like is it going to be marginally better? Like

0:44:11.080 --> 0:44:12.960
<v Speaker 1>what is this sort of like where are we in

0:44:13.000 --> 0:44:14.680
<v Speaker 1>the sort of like what does the shape of the

0:44:14.719 --> 0:44:17.440
<v Speaker 1>output curve look like? And this sort of like cost

0:44:17.680 --> 0:44:21.640
<v Speaker 1>of you know, these chip developments of getting there. I

0:44:21.640 --> 0:44:23.360
<v Speaker 1>don't know, it's kind of so there's a couple of things.

0:44:23.400 --> 0:44:25.759
<v Speaker 3>So, so, first of all, when you're talking about large

0:44:25.800 --> 0:44:28.840
<v Speaker 3>language models, accuracy, it's sort of a nebulous

0:44:28.920 --> 0:44:30.839
<v Speaker 3>term, because it's not just accuracy. It's, like, in this case,

0:44:30.840 --> 0:44:34.399
<v Speaker 3>it's also capability, like what could it do? What chat

0:44:34.440 --> 0:44:36.719
<v Speaker 3>GPT and GPT four can do. And also, like, I

0:44:36.760 --> 0:44:38.360
<v Speaker 3>think as you're going forward and you talk about the

0:44:38.400 --> 0:44:42.640
<v Speaker 3>trajectories here, it's not just text, right? We're talking text

0:44:42.680 --> 0:44:45.200
<v Speaker 3>to text here, but there's also text to images, and anybody, like,

0:44:45.200 --> 0:44:48.600
<v Speaker 3>with, like, DALL-E or whatever. You know, it's generating images

0:44:48.719 --> 0:44:51.320
<v Speaker 3>from a text prompt and now we've got like video

0:44:52.000 --> 0:44:54.400
<v Speaker 3>what is it, was it mid, was it Midsummer? Is that

0:44:54.440 --> 0:44:57.160
<v Speaker 3>what it's called? Midjourney? Midjourney. Yeah,

0:44:57.160 --> 0:44:59.799
<v Speaker 3>so it's it's it's creating like video prompts. I mean,

0:44:59.800 --> 0:45:03.040
<v Speaker 3>so, like, the, like, text is just

0:45:03.239 --> 0:45:05.360
<v Speaker 3>the tip of the iceberg, I think in terms of

0:45:05.400 --> 0:45:08.360
<v Speaker 3>what we're going to need, but they're.

0:45:08.200 --> 0:45:10.919
<v Speaker 1>Never they're never going to get to where they could

0:45:10.920 --> 0:45:14.560
<v Speaker 1>have three people having a conversation with voices that sound like Tracy,

0:45:14.640 --> 0:45:17.520
<v Speaker 1>Joe and Stacy. Right, No, I'm just kidding, No, I mean,

0:45:18.440 --> 0:45:21.840
<v Speaker 1>I'm just kidding. It feels like, yeah, this job.

0:45:21.719 --> 0:45:24.759
<v Speaker 3>Now, one of the dangers, clearly, and maybe this gets

0:45:24.800 --> 0:45:27.160
<v Speaker 3>to capabilities. So one thing with ChatGPT is

0:45:27.200 --> 0:45:29.879
<v Speaker 3>it's very, very good. This is why I should worry about

0:45:29.920 --> 0:45:32.239
<v Speaker 3>my job, because it's very good at that, at

0:45:32.400 --> 0:45:34.439
<v Speaker 3>sounding like it knows what it's talking about, where maybe

0:45:34.480 --> 0:45:37.120
<v Speaker 3>it doesn't. So maybe I should be worried about

0:45:37.160 --> 0:45:39.680
<v Speaker 3>my job, you know, And accuracy, I think is a

0:45:39.680 --> 0:45:41.279
<v Speaker 3>big issue, but you have to remember it.

0:45:41.360 --> 0:45:44.640
<v Speaker 1>So, but like on this accuracy question, like I assume,

0:45:44.719 --> 0:45:47.919
<v Speaker 1>you know, like self driving cars, like when people were

0:45:47.920 --> 0:45:50.600
<v Speaker 1>really hyped about them ten years ago, they're like, oh,

0:45:50.600 --> 0:45:52.959
<v Speaker 1>it's ninety five percent solid, we just have a little

0:45:52.960 --> 0:45:56.040
<v Speaker 1>bit more, and then it's solid ten years later. Yeah,

0:45:56.160 --> 0:45:58.399
<v Speaker 1>ten years later, it feels like they haven't made any

0:45:58.440 --> 0:45:59.880
<v Speaker 1>progress on that final five percent.

0:46:00.040 --> 0:46:01.719
<v Speaker 3>Yeah. I mean, these things are always a power law.

0:46:01.840 --> 0:46:06.280
<v Speaker 1>So this is my question when we talk about accuracy

0:46:06.360 --> 0:46:08.840
<v Speaker 1>or these things, like are we at the point where

0:46:08.920 --> 0:46:10.080
<v Speaker 1>like is it going to be the kind of thing

0:46:10.080 --> 0:46:13.160
<v Speaker 1>where it's like, yeah, GPT five will definitely be better

0:46:13.160 --> 0:46:16.319
<v Speaker 1>than GPT four, but it will be like ninety six

0:46:16.360 --> 0:46:17.359
<v Speaker 1>percent of the way there.

0:46:18.000 --> 0:46:21.080
<v Speaker 3>Well, again, let me separate out, let me separate

0:46:21.120 --> 0:46:25.200
<v Speaker 3>accuracy from capability again. So on accuracy, you have

0:46:25.280 --> 0:46:28.960
<v Speaker 3>to remember, like, the model has no idea what

0:46:29.120 --> 0:46:32.480
<v Speaker 3>accurate even means. It doesn't. Remember, these things are not

0:46:32.560 --> 0:46:34.600
<v Speaker 3>actually intelligent. I know there's a lot of worry about

0:46:34.640 --> 0:46:36.800
<v Speaker 3>like, what they call, like, AGI, like, artificial

0:46:36.840 --> 0:46:39.319
<v Speaker 3>general intelligence, right? I don't think this is it.

0:46:39.400 --> 0:46:42.359
<v Speaker 3>This is predictive text. That's all. The model doesn't know

0:46:42.400 --> 0:46:44.960
<v Speaker 3>if it's, if it's spewing bull crap or truth. It

0:46:44.960 --> 0:46:46.880
<v Speaker 3>has no idea, it's just predicting the next word in

0:46:47.160 --> 0:46:49.560
<v Speaker 3>the thing. And it's because of what it's trained on.

0:46:49.600 --> 0:46:52.880
<v Speaker 3>So you need to add on maybe other kinds of

0:46:52.880 --> 0:46:55.200
<v Speaker 3>things to ensure accuracy, maybe to put guard rails or

0:46:55.239 --> 0:46:57.560
<v Speaker 3>things like that. You may need to very carefully,

0:46:57.640 --> 0:47:00.040
<v Speaker 3>like more harsh like your input like data sets and

0:47:00.120 --> 0:47:02.360
<v Speaker 3>things like that. I think that's a problem now. I

0:47:02.400 --> 0:47:05.440
<v Speaker 3>think it'll get solved. There's enough data. But like, and

0:47:05.520 --> 0:47:07.759
<v Speaker 3>this has already been an issue and you got you

0:47:07.800 --> 0:47:09.319
<v Speaker 3>can take it like the other like the I don't

0:47:09.320 --> 0:47:10.520
<v Speaker 3>know if it's the converse of it or not, but

0:47:10.600 --> 0:47:12.920
<v Speaker 3>things like deep fakes, people are deliberately trying to use

0:47:13.280 --> 0:47:15.680
<v Speaker 3>AI to deceive. I mean, this is just human nature.

0:47:15.719 --> 0:47:17.719
<v Speaker 3>This is this is why we have problems. But I

0:47:17.760 --> 0:47:20.000
<v Speaker 3>think they can work through that just in terms of

0:47:20.040 --> 0:47:23.640
<v Speaker 3>capabilities now, I think it's it's really interesting to look

0:47:23.640 --> 0:47:27.440
<v Speaker 3>at like like sort of similar like a response like

0:47:27.440 --> 0:47:30.600
<v Speaker 3>to a similar prompt between like ChatGPT and GPT four,

0:47:30.680 --> 0:47:33.279
<v Speaker 3>and like what people are getting out of GPT four.

0:47:33.280 --> 0:47:35.239
<v Speaker 3>It's it's it's miles ahead of like some of the

0:47:35.280 --> 0:47:37.480
<v Speaker 3>stuff that ChatGPT, which was trained on

0:47:37.920 --> 0:47:40.719
<v Speaker 3>GPT three point five and all that, than what it was,

0:47:40.840 --> 0:47:44.600
<v Speaker 3>what is delivering in terms of nuance, right, and color

0:47:44.680 --> 0:47:46.839
<v Speaker 3>and every and everything else. I mean, and I think

0:47:46.880 --> 0:47:49.480
<v Speaker 3>that's going to continue. I wouldn't be And already you're

0:47:49.480 --> 0:47:51.000
<v Speaker 3>on the boat where these things can already pass the

0:47:51.000 --> 0:47:53.760
<v Speaker 3>Turing test. Oh yeah, right, it can be very difficult

0:47:53.760 --> 0:47:55.560
<v Speaker 3>to know if it's, if I'm putting the question

0:47:55.600 --> 0:47:58.320
<v Speaker 3>of accuracy aside for a moment, it's very difficult to know for

0:47:58.440 --> 0:47:59.960
<v Speaker 3>some of these things if if you didn't know any

0:48:00.040 --> 0:48:02.040
<v Speaker 3>better whether it was coming from a real person or not.

0:48:02.200 --> 0:48:04.840
<v Speaker 3>And I think it's going to get like harder and

0:48:04.920 --> 0:48:07.520
<v Speaker 3>harder to tell, like whether you know even if it's

0:48:07.520 --> 0:48:10.040
<v Speaker 3>not you know, quote unquote really thinking it's going to

0:48:10.080 --> 0:48:11.440
<v Speaker 3>be hard for us to tell what's really going on.

0:48:11.520 --> 0:48:14.080
<v Speaker 3>That is sort of like other interesting you know, implications

0:48:14.400 --> 0:48:17.279
<v Speaker 3>or for what this might be over the next five

0:48:17.360 --> 0:48:33.280
<v Speaker 3>years or ten years.

0:48:35.440 --> 0:48:37.759
<v Speaker 2>Just going back to the stock prices, I mean, we

0:48:37.800 --> 0:48:40.680
<v Speaker 2>mentioned the Nvidia chart, which is up quite a lot,

0:48:40.680 --> 0:48:43.799
<v Speaker 2>although it hasn't reached its peak back in

0:48:43.840 --> 0:48:48.360
<v Speaker 2>twenty twenty one. The SOX Index is recovering, but you know,

0:48:48.440 --> 0:48:51.479
<v Speaker 2>still below. And Intel, I mean, I won't even mention,

0:48:51.760 --> 0:48:55.839
<v Speaker 2>but like, where are we in the semiconductor cycle, because

0:48:55.880 --> 0:48:59.040
<v Speaker 2>it feels like, on the one hand there's talk about

0:48:59.040 --> 0:49:01.960
<v Speaker 2>excess capacity and orders starting to fall, but on the

0:49:02.000 --> 0:49:04.720
<v Speaker 2>other hand, there is this real excitement about the future

0:49:04.880 --> 0:49:05.920
<v Speaker 2>in the form of AI.

0:49:06.960 --> 0:49:11.080
<v Speaker 3>Yes. Yes. So semis in general were pretty lousy last year.

0:49:11.160 --> 0:49:14.120
<v Speaker 3>They've had a very strong year to date performance and

0:49:14.160 --> 0:49:17.480
<v Speaker 3>the sectors up, which is sectors up, you know, twenty

0:49:17.600 --> 0:49:20.120
<v Speaker 3>twenty two percent year to date, quite a bit above

0:49:20.120 --> 0:49:23.640
<v Speaker 3>the overall market. And the reason is, to your point,

0:49:23.800 --> 0:49:25.680
<v Speaker 3>we've been in a cycle. Numbers have been coming down.

0:49:25.680 --> 0:49:27.120
<v Speaker 3>And we may have talked about this last time. I

0:49:27.120 --> 0:49:30.200
<v Speaker 3>don't remember, but semiconductor investors, it turns out the

0:49:30.200 --> 0:49:33.200
<v Speaker 3>best time to buy stocks in general is after numbers

0:49:33.239 --> 0:49:35.000
<v Speaker 3>come down, but before they hit bottom. Like if you

0:49:35.040 --> 0:49:38.200
<v Speaker 3>could buy them right before the last cut, if you

0:49:38.239 --> 0:49:40.759
<v Speaker 3>could have perfect foresight. You never know when that is.

0:49:40.840 --> 0:49:42.480
<v Speaker 3>But I mean, numbers get cut. But numbers have come

0:49:42.520 --> 0:49:45.240
<v Speaker 3>down a lot. So estimates, forward estimates for the industry

0:49:45.280 --> 0:49:49.080
<v Speaker 3>peaked last June and they are down over thirty percent,

0:49:49.160 --> 0:49:51.000
<v Speaker 3>like thirty five percent since then. It's actually the

0:49:51.280 --> 0:49:56.040
<v Speaker 3>largest negative earnings revision we've had probably since the financial crisis. Wow,

0:49:57.120 --> 0:50:00.320
<v Speaker 3>and people are looking for, you know, playing the bottoming

0:50:00.400 --> 0:50:02.520
<v Speaker 3>theme and that hopefully things get better into the second half.

0:50:02.560 --> 0:50:06.319
<v Speaker 3>You know, we get hope, hopefully China reopening, and you've

0:50:06.360 --> 0:50:08.520
<v Speaker 3>got markets like and this relates to Intel like like

0:50:08.600 --> 0:50:12.560
<v Speaker 3>PCs and things where you know, we've now corrected kind

0:50:12.600 --> 0:50:14.399
<v Speaker 3>of we're back like more on a pre COVID run

0:50:14.440 --> 0:50:17.520
<v Speaker 3>rate for PCs versus where we were, and the CPUs

0:50:17.520 --> 0:50:20.400
<v Speaker 3>which were massively overshipping at the peak, they're now undershipping.

0:50:20.400 --> 0:50:23.280
<v Speaker 3>And so we're in that inventory flush part of the cycle,

0:50:23.960 --> 0:50:25.960
<v Speaker 3>and so people have been sort of playing the space

0:50:26.480 --> 0:50:28.960
<v Speaker 3>for that like second half recovery. Now, all that

0:50:28.960 --> 0:50:31.560
<v Speaker 3>being said, if you look at the overall industry, if

0:50:31.560 --> 0:50:33.439
<v Speaker 3>you look at numbers in the second half, they're actually

0:50:33.440 --> 0:50:35.680
<v Speaker 3>above seasonal. So people are starting to bake in that

0:50:35.760 --> 0:50:39.960
<v Speaker 3>cyclical recovery to the numbers. And if you look at inventories,

0:50:40.040 --> 0:50:42.919
<v Speaker 3>it just overall in the space they are ludicrously high.

0:50:42.920 --> 0:50:46.040
<v Speaker 3>I've actually never seen them this high before. So we've

0:50:46.040 --> 0:50:48.160
<v Speaker 3>had some inventory correction, but we may we may have not,

0:50:48.960 --> 0:50:50.920
<v Speaker 3>we may just be getting started there. And if you

0:50:50.920 --> 0:50:53.479
<v Speaker 3>look at valuations, I think the sector's trading at something

0:50:53.520 --> 0:50:55.279
<v Speaker 3>like a thirty percent premium to the S and

0:50:55.280 --> 0:50:58.560
<v Speaker 3>P five hundred, which is the largest premium we've had again,

0:50:58.680 --> 0:51:01.200
<v Speaker 3>probably since things normalized after the tech bubble or

0:51:01.360 --> 0:51:04.640
<v Speaker 3>after the financial crisis at least, so people have been

0:51:04.680 --> 0:51:07.279
<v Speaker 3>playing this back half recovery. But yeah, we better get it.

0:51:09.120 --> 0:51:11.000
<v Speaker 3>as as as it relates to some of the other

0:51:11.080 --> 0:51:13.520
<v Speaker 3>some of the individual stocks, like you mentioned Intel, It's funny.

0:51:13.520 --> 0:51:14.759
<v Speaker 3>I think you guys may not know this. I just

0:51:14.880 --> 0:51:20.200
<v Speaker 3>upgraded Intel. Oh. The title of the note was we

0:51:20.280 --> 0:51:26.239
<v Speaker 3>hate this call, and I meant I desperately would like

0:51:26.280 --> 0:51:28.840
<v Speaker 3>the standard prom It was and it was not a

0:51:28.920 --> 0:51:31.080
<v Speaker 3>we like an Intel call. It was just I think

0:51:31.120 --> 0:51:33.759
<v Speaker 3>that they, that they're now undershipping in PCs by

0:51:33.760 --> 0:51:35.560
<v Speaker 3>a wide margin, and I think for the first time

0:51:35.560 --> 0:51:38.680
<v Speaker 3>in a while, the second half street numbers might actually

0:51:38.680 --> 0:51:41.320
<v Speaker 3>be too low. So that's it's not like a super

0:51:41.360 --> 0:51:44.640
<v Speaker 3>compelling call. But I felt uncomfortable, although they

0:51:44.640 --> 0:51:46.920
<v Speaker 3>report earnings next week. I may be kicking myself,

0:51:46.920 --> 0:51:50.120
<v Speaker 3>like, we'll still see. Nvidia, however, so it's clearly,

0:51:50.200 --> 0:51:52.759
<v Speaker 3>you know, you're right. It hasn't reached its prior peak

0:51:52.760 --> 0:51:55.120
<v Speaker 3>from a stock price basis, and the reason is the numbers

0:51:55.120 --> 0:51:57.080
<v Speaker 3>have come down a lot. I mean, let's be honest,

0:51:57.560 --> 0:52:01.040
<v Speaker 3>the gaming, you know, business was inflated significantly

0:52:01.040 --> 0:52:04.240
<v Speaker 3>by crypto, right, and so that's all come out right.

0:52:04.320 --> 0:52:06.440
<v Speaker 3>And then you know with data center, you had some

0:52:06.560 --> 0:52:10.360
<v Speaker 3>impacts from China. China in general was weak, and then

0:52:10.400 --> 0:52:12.080
<v Speaker 3>we had some of the export controls that they had

0:52:12.080 --> 0:52:15.080
<v Speaker 3>to work their way around, and see had some issues there. Now,

0:52:15.120 --> 0:52:18.560
<v Speaker 3>all of that being said, graphics cards in gaming, we

0:52:18.760 --> 0:52:21.600
<v Speaker 3>talked about some of these inventory corrections. Graphics cards actually

0:52:21.640 --> 0:52:23.840
<v Speaker 3>corrected the most and the most rapidly. So those have

0:52:23.880 --> 0:52:25.799
<v Speaker 3>already hit bottom and they're growing again. And Nvidia

0:52:25.840 --> 0:52:27.759
<v Speaker 3>has got a product cycle there that they just kicked off.

0:52:27.920 --> 0:52:30.400
<v Speaker 3>The new cards are called Lovelace and they and they

0:52:30.480 --> 0:52:32.480
<v Speaker 3>look really good, especially at the high end, and they're starting to

0:52:32.480 --> 0:52:34.280
<v Speaker 3>fill out like the rest of the stack. So gaming

0:52:34.360 --> 0:52:37.040
<v Speaker 3>is okay. And then in data center, again, this, you know,

0:52:37.080 --> 0:52:40.200
<v Speaker 3>this generative AI has really caught everybody's fancy. And

0:52:40.320 --> 0:52:42.920
<v Speaker 3>Nvidia had a data center of and they're saying that

0:52:42.960 --> 0:52:44.800
<v Speaker 3>they were at the beginning of a product cycle in

0:52:44.880 --> 0:52:47.440
<v Speaker 3>data center, and you know, they had an event a

0:52:47.440 --> 0:52:49.319
<v Speaker 3>couple of weeks ago, their GTC event, where they

0:52:49.320 --> 0:52:53.360
<v Speaker 3>actually basically and directly said we're seeing upside from generative

0:52:53.360 --> 0:52:56.480
<v Speaker 3>AI even now, right? So people have been buying

0:52:56.600 --> 0:52:59.480
<v Speaker 3>Nvidia on that thesis, and like

0:52:59.520 --> 0:53:01.560
<v Speaker 3>the last the stock hit the peak at these peaks,

0:53:01.560 --> 0:53:04.120
<v Speaker 3>at least in terms of valuation. The issue is we

0:53:04.120 --> 0:53:07.200
<v Speaker 3>were at the peak of their product cycles and numbers

0:53:07.239 --> 0:53:10.160
<v Speaker 3>came down. This time, valuations kind of went back to

0:53:10.160 --> 0:53:12.720
<v Speaker 3>where they were at those peaks, but we're at the beginning

0:53:12.719 --> 0:53:14.640
<v Speaker 3>of the product cycles, and numbers are probably going up

0:53:14.680 --> 0:53:15.399
<v Speaker 3>not down.

0:53:15.719 --> 0:53:19.080
<v Speaker 1>So that's, that's why, Stacy, I joked at the beginning

0:53:19.160 --> 0:53:21.560
<v Speaker 1>that we could talk about about this for three hours,

0:53:21.560 --> 0:53:24.920
<v Speaker 1>and I'm sure we could. It's such a deep area,

0:53:25.000 --> 0:53:28.120
<v Speaker 1>but that was a great overview of just like the

0:53:28.160 --> 0:53:30.239
<v Speaker 1>state of competition, of the state of play, and the

0:53:30.320 --> 0:53:32.799
<v Speaker 1>economics of this a very good way for us to

0:53:32.840 --> 0:53:36.359
<v Speaker 1>sort of enter talking about AI stuff more broadly. Thank

0:53:36.400 --> 0:53:38.760
<v Speaker 1>you so much for coming back on Odd Lots.

0:53:39.960 --> 0:53:42.120
<v Speaker 3>My pleasure. Anytime you guys want me here, just let

0:53:42.120 --> 0:53:42.799
<v Speaker 3>me know. All right.

0:53:42.719 --> 0:53:46.759
<v Speaker 1>We'll have you back next week for Intel. Take care,

0:53:46.800 --> 0:54:04.760
<v Speaker 1>Stacy. I really like talking to Stacy. He's really

0:54:04.760 --> 0:54:07.399
<v Speaker 1>good at explaining complicated Yeah.

0:54:07.400 --> 0:54:09.040
<v Speaker 2>I know, he made a point of saying that he's

0:54:09.080 --> 0:54:11.040
<v Speaker 2>not an AI expert, but I thought he did a

0:54:11.040 --> 0:54:13.480
<v Speaker 2>pretty good job of explaining it. I do think the

0:54:13.640 --> 0:54:17.920
<v Speaker 2>trajectory of how all this, I mean, this is such

0:54:17.960 --> 0:54:19.560
<v Speaker 2>an obvious thing to say, but it's going to be

0:54:19.600 --> 0:54:23.719
<v Speaker 2>really interesting to watch and how businesses adapt to this,

0:54:23.920 --> 0:54:26.640
<v Speaker 2>and we're what's kind of fascinating to me is that

0:54:26.640 --> 0:54:30.680
<v Speaker 2>we're already seeing that differentiation play out in the market,

0:54:30.800 --> 0:54:33.160
<v Speaker 2>with Nvidia shares up quite a bit and Intel,

0:54:33.239 --> 0:54:35.800
<v Speaker 2>which is seen as not as competitive in the space,

0:54:36.000 --> 0:54:36.799
<v Speaker 2>down quite a bit.

0:54:37.400 --> 0:54:40.359
<v Speaker 1>I was really interested in some of his points about

0:54:40.480 --> 0:54:44.960
<v Speaker 1>software in particular, and so I have realized that, Yeah,

0:54:45.040 --> 0:54:47.960
<v Speaker 1>like I mean, I you know, like sometimes I see

0:54:48.160 --> 0:54:49.799
<v Speaker 1>like someone will post on Twitter it's like, look at

0:54:49.800 --> 0:54:52.200
<v Speaker 1>this cool thing Nvidia just rolled out where they

0:54:52.200 --> 0:54:54.200
<v Speaker 1>can make your face look like something else or whatever.

0:54:54.800 --> 0:54:59.040
<v Speaker 1>But thinking about like how important that is in terms

0:54:59.080 --> 0:55:01.239
<v Speaker 1>of like, Okay, you and I want to start an

0:55:01.280 --> 0:55:04.759
<v Speaker 1>AI company and idea for a large language model or

0:55:04.760 --> 0:55:08.040
<v Speaker 1>something specifically have a model to train. There's going to

0:55:08.080 --> 0:55:10.319
<v Speaker 1>be a big advantage going with the company that has

0:55:10.360 --> 0:55:13.440
<v Speaker 1>this huge like wealth of like libraries and code bases

0:55:13.480 --> 0:55:17.680
<v Speaker 1>and specific tools around specific industries as opposed to it

0:55:17.760 --> 0:55:20.480
<v Speaker 1>seems like where some of the other competitors are, or

0:55:20.480 --> 0:55:24.080
<v Speaker 1>it's just much more technically challenging to even like use

0:55:24.200 --> 0:55:27.719
<v Speaker 1>the chips if they exist, like Google's.

0:55:27.200 --> 0:55:32.239
<v Speaker 2>TPUs. Totally. The other thing that caught my attention, and

0:55:32.320 --> 0:55:34.760
<v Speaker 2>I know these are very different spaces in many ways,

0:55:34.800 --> 0:55:38.399
<v Speaker 2>but there's so much of the terminology and like that's

0:55:38.520 --> 0:55:41.880
<v Speaker 2>very reminiscent of crypto. So just the idea of like

0:55:41.920 --> 0:55:45.239
<v Speaker 2>an AI winter and a crypto winter, and you can see,

0:55:45.280 --> 0:55:47.680
<v Speaker 2>I mean, you can see the pivot happening right now

0:55:47.719 --> 0:55:50.600
<v Speaker 2>from like crypto people moving into AI. So that's going

0:55:50.640 --> 0:55:53.359
<v Speaker 2>to be interesting to watch play out. Like how much

0:55:53.360 --> 0:55:56.760
<v Speaker 2>of it is hype, classic sort of Gartner hype cycle,

0:55:57.200 --> 0:55:58.399
<v Speaker 2>versus the real thing.

0:55:58.560 --> 0:56:01.120
<v Speaker 1>But you know, two things, it's absolutely you know, so

0:56:01.200 --> 0:56:02.840
<v Speaker 1>two things I think would be interesting. It'd be interesting

0:56:02.920 --> 0:56:06.279
<v Speaker 1>to go back to like past AI summers, like what

0:56:06.320 --> 0:56:08.400
<v Speaker 1>were some past periods which people thought we made this

0:56:08.440 --> 0:56:10.120
<v Speaker 1>break through and then what happened? So that might be

0:56:10.120 --> 0:56:12.600
<v Speaker 1>an interesting And then the other thing is like, look

0:56:12.680 --> 0:56:16.719
<v Speaker 1>like you know, in twenty twenty three, I have never

0:56:16.840 --> 0:56:20.200
<v Speaker 1>actually like found a reason I've ever felt compelled to

0:56:20.280 --> 0:56:23.120
<v Speaker 1>like need to use a blockchain for something. And I

0:56:23.160 --> 0:56:26.920
<v Speaker 1>get use out of ChatGPT on something like almost

0:56:26.960 --> 0:56:31.040
<v Speaker 1>every day. And so for example, we recently did an

0:56:31.160 --> 0:56:33.759
<v Speaker 1>episode you know, yeah, look, we'll do an episode now

0:56:33.800 --> 0:56:35.040
<v Speaker 1>of a question. At the end, they're like, oh, what

0:56:35.160 --> 0:56:37.960
<v Speaker 1>is the difference Like yesterday, you know, we recently did

0:56:37.960 --> 0:56:40.560
<v Speaker 1>an episode on like lending, and so it's like, oh,

0:56:40.600 --> 0:56:44.479
<v Speaker 1>what's the difference sort of structurally between the leveraged loan

0:56:44.480 --> 0:56:46.239
<v Speaker 1>market and the private debt market. It's like, this might

0:56:46.239 --> 0:56:48.600
<v Speaker 1>be an interesting question for ChatGPT, and like

0:56:48.640 --> 0:56:52.160
<v Speaker 1>I got this like very useful, clear answer from it

0:56:52.239 --> 0:56:55.000
<v Speaker 1>that like I couldn't have gotten perhaps as easily from

0:56:55.040 --> 0:56:57.520
<v Speaker 1>a Google search. So I do think like some of

0:56:57.560 --> 0:57:00.120
<v Speaker 1>these hype cycles like are really useful, But like I

0:57:00.160 --> 0:57:04.040
<v Speaker 1>am already in my daily life and very already getting

0:57:04.239 --> 0:57:06.000
<v Speaker 1>use out of this technology in a way that I

0:57:06.040 --> 0:57:08.600
<v Speaker 1>cannot say for anything related to, like, web three. No, that

0:57:08.760 --> 0:57:09.760
<v Speaker 1>is very true.

0:57:09.840 --> 0:57:11.799
<v Speaker 2>And you know the fact that this only came out

0:57:11.880 --> 0:57:14.520
<v Speaker 2>a few months ago and everyone has been talking about

0:57:14.520 --> 0:57:17.200
<v Speaker 2>it and experimenting with it kind of speaks for itself.

0:57:17.520 --> 0:57:19.120
<v Speaker 1>Shall we leave it there? Let's leave it there.

0:57:19.280 --> 0:57:22.640
<v Speaker 2>This has been another episode of the Odd Lots podcast. I'm

0:57:22.640 --> 0:57:25.920
<v Speaker 2>Tracy Alloway. You can follow me on Twitter at Tracy Alloway.

0:57:26.000 --> 0:57:29.040
<v Speaker 1>And I'm Joe Weisenthal. You can follow me on Twitter

0:57:29.120 --> 0:57:32.720
<v Speaker 1>at the Stalwart. Follow our guest Stacy Rasgon. He's at

0:57:33.000 --> 0:57:37.360
<v Speaker 1>S Rasgon. Follow our producers Carmen Rodriguez at Carmen Armen

0:57:37.480 --> 0:57:40.520
<v Speaker 1>and Dashiell Bennett at dashbot. And check out all

0:57:40.560 --> 0:57:44.240
<v Speaker 1>of our podcasts at Bloomberg under the handle at podcasts,

0:57:44.280 --> 0:57:47.479
<v Speaker 1>and for more Odd Lots content, go to Bloomberg dot com

0:57:47.480 --> 0:57:51.000
<v Speaker 1>slash odd Lots. We blog, we post transcripts, we have

0:57:51.000 --> 0:57:54.760
<v Speaker 1>a newsletter, and check out the Odd Lots Discord, with

0:57:54.880 --> 0:57:57.240
<v Speaker 1>listeners chatting twenty four to seven about all the things

0:57:57.280 --> 0:57:59.680
<v Speaker 1>we talk about here. We even have an AI specific

0:57:59.720 --> 0:58:03.360
<v Speaker 1>world that's really fun and set and the semiconductor room,

0:58:03.520 --> 0:58:05.760
<v Speaker 1>and so people chatting about these things. I even

0:58:05.840 --> 0:58:09.000
<v Speaker 1>solicited some questions for today from that group, so

0:58:09.160 --> 0:58:11.520
<v Speaker 1>it's really fun. I like hanging out there. So go

0:58:11.560 --> 0:58:16.000
<v Speaker 1>to Discord dot gg slash odd lots. Thanks for listening