WEBVTT - Monologue: What Happened With DeepSeek?

0:00:01.800 --> 0:00:05.840
<v Speaker 1>Cool Zone Media. Hi, and welcome to the very first

0:00:05.880 --> 0:00:08.080
<v Speaker 1>Better Offline Monologue. This is going to be a short

0:00:08.080 --> 0:00:10.559
<v Speaker 1>weekly episode where I take a quick look at something

0:00:10.560 --> 0:00:13.240
<v Speaker 1>going on in the tech industry that doesn't quite warrant a

0:00:13.240 --> 0:00:16.360
<v Speaker 1>full episode. One might say, they're like quick bites of

0:00:16.400 --> 0:00:18.959
<v Speaker 1>content, Quibis if you will, and this is a business

0:00:18.960 --> 0:00:22.360
<v Speaker 1>model that's proven successful time and time again. This week,

0:00:22.400 --> 0:00:24.200
<v Speaker 1>I'm going to give you a distilled rundown of a

0:00:24.239 --> 0:00:27.320
<v Speaker 1>recent situation that rocked both the economy and the AI world.

0:00:27.520 --> 0:00:29.440
<v Speaker 1>For those of you that either need a refresh or

0:00:29.480 --> 0:00:39.720
<v Speaker 1>rejected the notion of a two-part podcast. At the end

0:00:39.720 --> 0:00:43.120
<v Speaker 1>of January, something happened that radically overturned not just the

0:00:43.159 --> 0:00:46.360
<v Speaker 1>AI industry status quo, but also called into question the

0:00:46.360 --> 0:00:50.320
<v Speaker 1>dominance of the American tech industry. Our story starts on

0:00:50.400 --> 0:00:53.240
<v Speaker 1>January twentieth, when a little-known Chinese company called DeepSeek

0:00:53.280 --> 0:00:56.760
<v Speaker 1>released its R1 AI model, terrifying the Western

0:00:56.800 --> 0:01:00.160
<v Speaker 1>tech behemoths that had plowed over two hundred billion dollars combined

0:01:00.360 --> 0:01:03.920
<v Speaker 1>into data centers and industrial-grade graphics processing units, GPUs

0:01:04.040 --> 0:01:07.280
<v Speaker 1>for short, to power generative AI models like those behind

0:01:07.319 --> 0:01:12.000
<v Speaker 1>ChatGPT and Anthropic's Claude. Like OpenAI's o1 model,

0:01:12.000 --> 0:01:15.120
<v Speaker 1>DeepSeek's R1 model is a reasoning model, which is

0:01:15.160 --> 0:01:16.959
<v Speaker 1>a way to say that it works through problems step

0:01:16.959 --> 0:01:19.360
<v Speaker 1>by step, showing the users the steps it took to

0:01:19.360 --> 0:01:22.800
<v Speaker 1>reach its conclusion. Generally, when you make a request of

0:01:22.840 --> 0:01:26.440
<v Speaker 1>a generative model, it generates an answer probabilistically, meaning it's

0:01:26.480 --> 0:01:29.480
<v Speaker 1>guessing at each next bit based on the request you've made.

0:01:29.640 --> 0:01:32.119
<v Speaker 1>In the case of OpenAI's o1 model, and indeed

0:01:32.160 --> 0:01:35.440
<v Speaker 1>DeepSeek's R1 model, the model thinks. I use

0:01:35.480 --> 0:01:38.240
<v Speaker 1>that term loosely. These models do not know anything. They're

0:01:38.280 --> 0:01:41.399
<v Speaker 1>not thinking. They have no consciousness, but they think through

0:01:41.520 --> 0:01:44.440
<v Speaker 1>each step by generating it piece by piece and reviewing

0:01:44.440 --> 0:01:46.520
<v Speaker 1>it piece by piece with separate parts of the model.

0:01:47.360 --> 0:01:50.800
<v Speaker 1>In theory, this ability to reason means it's well suited

0:01:50.800 --> 0:01:53.440
<v Speaker 1>for tasks where there's a definitive right and wrong answer,

0:01:53.760 --> 0:01:57.120
<v Speaker 1>like logic and maths. It's also what makes it

0:01:57.160 --> 0:01:59.960
<v Speaker 1>different from the standard ChatGPT or GPT-4o,

0:02:00.600 --> 0:02:03.680
<v Speaker 1>which is considerably faster, as it doesn't undertake this step

0:02:03.720 --> 0:02:06.680
<v Speaker 1>by step thinking and thus is better suited for more

0:02:06.760 --> 0:02:09.400
<v Speaker 1>open ended questions such as what would it be like

0:02:09.440 --> 0:02:12.720
<v Speaker 1>if Garfield had a gun? To be clear, this doesn't

0:02:12.720 --> 0:02:15.519
<v Speaker 1>mean the answers are any good. Now, just a few

0:02:15.520 --> 0:02:18.640
<v Speaker 1>weeks earlier, DeepSeek had released another model, albeit to

0:02:18.680 --> 0:02:21.400
<v Speaker 1>far less fanfare, likely due to it being launched the day

0:02:21.400 --> 0:02:24.360
<v Speaker 1>after Christmas, of course, but nevertheless, it was called

0:02:24.400 --> 0:02:27.800
<v Speaker 1>V3, and it was still pretty impressive. V3 competes

0:02:27.840 --> 0:02:30.680
<v Speaker 1>with the same model that powers ChatGPT, as I just mentioned,

0:02:30.720 --> 0:02:32.880
<v Speaker 1>which at the time of recording this is called

0:02:33.080 --> 0:02:36.079
<v Speaker 1>GPT-4o, and that's a more general purpose kind of product.

0:02:36.240 --> 0:02:38.560
<v Speaker 1>It can write code and solve maths problems, but it's

0:02:38.560 --> 0:02:41.400
<v Speaker 1>better suited for tasks that are rooted in language, writing

0:02:41.440 --> 0:02:44.480
<v Speaker 1>that term paper, summarizing a document, whatever it is you

0:02:44.600 --> 0:02:47.600
<v Speaker 1>do with this. And it's also important to know that

0:02:47.680 --> 0:02:50.880
<v Speaker 1>this is the most commonly used style of model. You're

0:02:50.919 --> 0:02:53.520
<v Speaker 1>not really getting reasoning in everything, at least not yet,

0:02:53.680 --> 0:02:56.960
<v Speaker 1>and I don't know how prevalent it'll ever be. Now,

0:02:56.960 --> 0:02:59.959
<v Speaker 1>DeepSeek's tech didn't just match OpenAI in capabilities.

0:03:00.080 --> 0:03:02.880
<v Speaker 1>It was also purportedly cheaper to train and to operate.

0:03:03.400 --> 0:03:06.600
<v Speaker 1>Whereas OpenAI's GPT-4 model reportedly cost one hundred

0:03:06.680 --> 0:03:10.000
<v Speaker 1>million dollars to train, some experts estimate that DeepSeek's

0:03:10.000 --> 0:03:13.560
<v Speaker 1>reasoning model, called R1, cost a lot less than that,

0:03:14.000 --> 0:03:16.680
<v Speaker 1>and their V3 model actually cost less than six

0:03:16.800 --> 0:03:19.959
<v Speaker 1>million dollars to train. This figure is open to some debate,

0:03:20.760 --> 0:03:22.880
<v Speaker 1>but the big thing about these models is they're

0:03:22.960 --> 0:03:26.920
<v Speaker 1>dramatically cheaper. They can be run on your computer, though

0:03:27.080 --> 0:03:29.760
<v Speaker 1>much slower, or they can be run on other cloud infrastructure.

0:03:30.280 --> 0:03:32.160
<v Speaker 1>And in the case of the V3 model, the

0:03:32.160 --> 0:03:35.160
<v Speaker 1>one that competes with ChatGPT, it was actually about

0:03:35.200 --> 0:03:38.800
<v Speaker 1>fifty times cheaper, and the reasoning model, R1, about

0:03:38.880 --> 0:03:41.480
<v Speaker 1>thirty, which is crazy. Now, these are the prices that

0:03:41.520 --> 0:03:43.840
<v Speaker 1>are run on the servers where DeepSeek runs, but

0:03:43.880 --> 0:03:46.080
<v Speaker 1>we're very quickly going to see as other people host

0:03:46.160 --> 0:03:48.560
<v Speaker 1>them exactly how much cheaper they are. And they're more

0:03:48.600 --> 0:03:52.360
<v Speaker 1>efficient too, which is crazy. They're so much more efficient.

0:03:53.720 --> 0:03:56.400
<v Speaker 1>And it's also important to note that they train these

0:03:56.400 --> 0:03:59.360
<v Speaker 1>models using older-generation Nvidia chips because they had

0:03:59.400 --> 0:04:01.600
<v Speaker 1>sanctions on them from China. They got some of the

0:04:01.640 --> 0:04:05.360
<v Speaker 1>newer ones too through weird resellers, but nevertheless this made

0:04:05.400 --> 0:04:08.640
<v Speaker 1>it much harder for them to get GPUs in general,

0:04:09.120 --> 0:04:11.480
<v Speaker 1>and thus they were able to kind of squeeze more

0:04:11.520 --> 0:04:13.200
<v Speaker 1>power out, and they had to come up with really

0:04:13.280 --> 0:04:16.479
<v Speaker 1>interesting kind of assembly-language-level stuff where they did

0:04:16.520 --> 0:04:19.279
<v Speaker 1>extra things with the GPUs that, well, the fat and

0:04:19.360 --> 0:04:22.520
<v Speaker 1>happy tech executives never thought of, and Sam Altman and

0:04:22.560 --> 0:04:25.160
<v Speaker 1>his ilk from OpenAI never really thought of, because, well,

0:04:25.320 --> 0:04:27.200
<v Speaker 1>why would they have to be, why would they have

0:04:27.240 --> 0:04:29.680
<v Speaker 1>to think of that? They had the unlimited money cheat

0:04:29.720 --> 0:04:32.080
<v Speaker 1>from the hyperscalers, like in the case of OpenAI,

0:04:32.320 --> 0:04:35.120
<v Speaker 1>funded by Microsoft, in the case of Anthropic funded by

0:04:35.240 --> 0:04:38.720
<v Speaker 1>Amazon and Google. And this is where the narrative has

0:04:38.760 --> 0:04:41.000
<v Speaker 1>begun to kind of fall apart, because all of this

0:04:41.040 --> 0:04:43.839
<v Speaker 1>has made it much harder to justify these companies building

0:04:43.839 --> 0:04:47.279
<v Speaker 1>new data centers and buying new Nvidia GPUs. This

0:04:47.640 --> 0:04:50.440
<v Speaker 1>entire AI boom has been based off of the assumption

0:04:50.480 --> 0:04:52.880
<v Speaker 1>that the only way to build powerful models was to

0:04:52.920 --> 0:04:55.560
<v Speaker 1>get the biggest, most hugest chips from Nvidia each year,

0:04:55.960 --> 0:04:57.560
<v Speaker 1>and that there was just no way to make these

0:04:57.640 --> 0:05:01.640
<v Speaker 1>models cheaper. Now, as an aside, OpenAI lost five billion dollars

0:05:01.680 --> 0:05:04.400
<v Speaker 1>in twenty twenty four and all of their products are unprofitable,

0:05:04.520 --> 0:05:07.520
<v Speaker 1>even their two hundred dollars a month OpenAI

0:05:07.560 --> 0:05:11.240
<v Speaker 1>ChatGPT Pro subscription. I hate these terms, by the way,

0:05:11.400 --> 0:05:15.640
<v Speaker 1>They're all different. Nevertheless, everyone assumed that there was never

0:05:15.680 --> 0:05:18.360
<v Speaker 1>going to be a more efficient model and I personally

0:05:18.440 --> 0:05:20.600
<v Speaker 1>made the mistake of saying, well, if it was going

0:05:20.680 --> 0:05:22.599
<v Speaker 1>to be more efficient, surely they would want it to

0:05:22.640 --> 0:05:25.760
<v Speaker 1>be, or they could do that, right? Right? Maybe they

0:05:25.839 --> 0:05:27.839
<v Speaker 1>just have to do this stuff even though it's stupid.

0:05:28.680 --> 0:05:31.760
<v Speaker 1>That was never the case, and DeepSeek proved it. Crucially,

0:05:31.800 --> 0:05:34.560
<v Speaker 1>DeepSeek released its models under an open source license,

0:05:34.640 --> 0:05:37.520
<v Speaker 1>meaning any company can reuse and repurpose its tech without

0:05:37.560 --> 0:05:40.480
<v Speaker 1>having to pay anyone anything, any license fees or anything,

0:05:40.640 --> 0:05:43.960
<v Speaker 1>or ask anyone for permission. OpenAI, by contrast, keeps

0:05:43.960 --> 0:05:46.840
<v Speaker 1>its technology under lock and key. Despite their name,

0:05:46.880 --> 0:05:50.080
<v Speaker 1>OpenAI is a deeply secretive organization, open in name only.

0:05:50.839 --> 0:05:53.800
<v Speaker 1>In summary, DeepSeek has created a viable alternative to

0:05:53.839 --> 0:05:58.240
<v Speaker 1>OpenAI's tech, and indeed Anthropic's, that's equally capable, vastly cheaper,

0:05:58.360 --> 0:06:00.680
<v Speaker 1>and open source, and proven that you don't need the

0:06:00.680 --> 0:06:03.640
<v Speaker 1>most expensive and powerful chips to do so. And they

0:06:03.720 --> 0:06:06.520
<v Speaker 1>kind of came out of nowhere. Well, DeepSeek isn't

0:06:06.560 --> 0:06:10.280
<v Speaker 1>exactly a tiny little startup. They're also not a Silicon

0:06:10.360 --> 0:06:13.880
<v Speaker 1>Valley giant with billions of dollars of venture capital, or

0:06:14.120 --> 0:06:16.880
<v Speaker 1>someone who's backed by one of the many different companies

0:06:16.880 --> 0:06:19.680
<v Speaker 1>with a three trillion dollar market cap. They started off

0:06:19.680 --> 0:06:21.880
<v Speaker 1>as a side project from a Chinese hedge fund. No,

0:06:22.000 --> 0:06:25.480
<v Speaker 1>I'm not kidding. Now, still an eight billion dollars under

0:06:25.480 --> 0:06:29.520
<v Speaker 1>management hedge fund. They're not small at all. It's so strange.

0:06:29.920 --> 0:06:32.880
<v Speaker 1>It's a kind of cynical version of David versus Goliath,

0:06:32.960 --> 0:06:37.040
<v Speaker 1>where David is a hedge fund baby and Goliath is

0:06:37.600 --> 0:06:42.640
<v Speaker 1>several different hyperscalers taped together with a bad idea. But anyway,

0:06:42.680 --> 0:06:45.039
<v Speaker 1>put yourself in the shoes of OpenAI CEO and

0:06:45.080 --> 0:06:48.160
<v Speaker 1>co-founder Sam Altman. You've crafted this public perception of

0:06:48.200 --> 0:06:51.080
<v Speaker 1>yourself as a visionary that isn't just bringing generative AI

0:06:51.120 --> 0:06:53.360
<v Speaker 1>to the masses, but you're on the path that will

0:06:53.360 --> 0:06:56.359
<v Speaker 1>bring about artificial general intelligence, which is to say, an

0:06:56.400 --> 0:06:59.400
<v Speaker 1>AI that's as capable as a human being. You've crafted

0:06:59.400 --> 0:07:01.679
<v Speaker 1>this myth not just about yourself, but about your company

0:07:01.680 --> 0:07:03.520
<v Speaker 1>and what you'll do, and this has allowed you to,

0:07:03.680 --> 0:07:05.760
<v Speaker 1>in essence, defy the laws of physics when it

0:07:05.760 --> 0:07:08.080
<v Speaker 1>comes to business. You can burn money at a rate

0:07:08.160 --> 0:07:11.440
<v Speaker 1>unlike any tech company in history, with no hope of

0:07:11.480 --> 0:07:13.160
<v Speaker 1>making a profit, or at least not in the short

0:07:13.200 --> 0:07:16.400
<v Speaker 1>to medium term, and no real expectation that you'll do so,

0:07:16.720 --> 0:07:19.400
<v Speaker 1>as investors will still line up to give you more money,

0:07:19.400 --> 0:07:22.560
<v Speaker 1>with your company valued at ever more ludicrous numbers seemingly

0:07:22.600 --> 0:07:25.760
<v Speaker 1>every other month. You can say these outlandish things like

0:07:25.800 --> 0:07:28.680
<v Speaker 1>you need seven trillion dollars to build the infrastructure and

0:07:28.760 --> 0:07:31.400
<v Speaker 1>chip manufacturing capacity to bring your plans to life, and

0:07:31.440 --> 0:07:33.280
<v Speaker 1>you don't get laughed out of the room. If I

0:07:33.360 --> 0:07:35.880
<v Speaker 1>said this shit, they'd ask me if I had a concussion.

0:07:36.640 --> 0:07:38.880
<v Speaker 1>You can say stuff like I want to build five

0:07:38.920 --> 0:07:41.520
<v Speaker 1>hundred billion dollars worth of data centers, and instead of

0:07:41.520 --> 0:07:44.240
<v Speaker 1>people rolling their eyes, the world's largest tech companies and

0:07:44.400 --> 0:07:47.680
<v Speaker 1>investors will say, damn, man, that's sick. And then it

0:07:47.720 --> 0:07:51.200
<v Speaker 1>turns out that you were wrong. You'd always assumed that

0:07:51.240 --> 0:07:54.320
<v Speaker 1>AI must be expensive, that the models used to power

0:07:54.440 --> 0:07:58.480
<v Speaker 1>your apps, like ChatGPT and DALL-E, their image generator,

0:07:59.720 --> 0:08:02.000
<v Speaker 1>they'd always cost more to build, they'd always cost more

0:08:02.040 --> 0:08:05.520
<v Speaker 1>to run, they'd always require more powerful hardware, or maybe

0:08:05.520 --> 0:08:07.600
<v Speaker 1>you just never thought about it too hard because you

0:08:07.680 --> 0:08:10.240
<v Speaker 1>never have to worry about money. And to grow, to

0:08:10.240 --> 0:08:12.920
<v Speaker 1>build more capable AI models, you assumed that you would

0:08:12.920 --> 0:08:15.640
<v Speaker 1>always need more money, and so much more money than

0:08:15.680 --> 0:08:19.000
<v Speaker 1>anyone's ever had. And then here comes this Chinese company

0:08:19.040 --> 0:08:23.040
<v Speaker 1>that didn't just replicate the functionality of your model. And on

0:08:23.080 --> 0:08:25.320
<v Speaker 1>top of that, by the way, o1 is OpenAI's

0:08:25.400 --> 0:08:27.640
<v Speaker 1>one moat. It was the one thing that people liked.

0:08:27.760 --> 0:08:31.760
<v Speaker 1>It was their most sophisticated AI model. But this company

0:08:31.800 --> 0:08:34.440
<v Speaker 1>came along and did it on a shoestring budget, both

0:08:34.520 --> 0:08:37.240
<v Speaker 1>for actually training it even if the estimates are off

0:08:37.280 --> 0:08:39.719
<v Speaker 1>by like factors of ten. But these things are more

0:08:39.720 --> 0:08:42.920
<v Speaker 1>efficient too. And this company didn't even have access to

0:08:42.960 --> 0:08:46.000
<v Speaker 1>the most capable GPUs. They didn't have the server architecture

0:08:46.120 --> 0:08:50.560
<v Speaker 1>provided by Microsoft or Amazon or Google. And wow, and

0:08:50.600 --> 0:08:52.360
<v Speaker 1>what did they do next with this thing they built

0:08:52.360 --> 0:08:55.200
<v Speaker 1>that's competitive with your only real moat? They gave it away.

0:08:56.080 --> 0:08:59.080
<v Speaker 1>Oh goodness me, Sammy, things aren't looking good at all.

0:08:59.679 --> 0:09:02.079
<v Speaker 1>And this is where Sam Altman's at. This is where

0:09:02.080 --> 0:09:03.920
<v Speaker 1>OpenAI and the companies that have backed it,

0:09:03.960 --> 0:09:06.800
<v Speaker 1>and their competitors, this is where they're all at. The

0:09:06.880 --> 0:09:10.200
<v Speaker 1>decisive lead they once enjoyed has, like a puddle on

0:09:10.240 --> 0:09:13.360
<v Speaker 1>a hot day, evaporated. And you'd see that happen a

0:09:13.400 --> 0:09:16.400
<v Speaker 1>lot here in beautiful Las Vegas, Nevada. Now, don't get

0:09:16.400 --> 0:09:19.120
<v Speaker 1>me wrong, OpenAI still burns money. But now when

0:09:19.120 --> 0:09:21.920
<v Speaker 1>Sam Altman dusts off his begging bowl, investors will ask,

0:09:22.000 --> 0:09:31.560
<v Speaker 1>perhaps for the first time, one very simple question, why