WEBVTT - Monologue: What Happened With DeepSeek? 0:00:01.800 --> 0:00:05.840 Cool Zone Media. Hi, and welcome to the very first 0:00:05.880 --> 0:00:08.080 Better Offline Monologue. This is going to be a short 0:00:08.080 --> 0:00:10.559 weekly episode where I take a quick look at something 0:00:10.560 --> 0:00:13.240 going on in the tech industry that doesn't quite warrant a 0:00:13.240 --> 0:00:16.360 full episode. One might say they're like quick bites of 0:00:16.400 --> 0:00:18.959 content, Quibis if you will, and this is a business 0:00:18.960 --> 0:00:22.360 model that's proven successful time and time again. This week, 0:00:22.400 --> 0:00:24.200 I'm going to give you a distilled rundown of a 0:00:24.239 --> 0:00:27.320 recent situation that rocked both the economy and the AI world, 0:00:27.520 --> 0:00:29.440 for those of you that either need a refresh or 0:00:29.480 --> 0:00:39.720 rejected the notion of a two-part podcast. At the end 0:00:39.720 --> 0:00:43.120 of January, something happened that radically overturned not just the 0:00:43.159 --> 0:00:46.360 AI industry status quo, but also called into question the 0:00:46.360 --> 0:00:50.320 dominance of the American tech industry. Our story starts on 0:00:50.400 --> 0:00:53.240 January twentieth, when a little-known Chinese company called 0:00:53.280 --> 0:00:56.760 DeepSeek released its R1 AI model, terrifying the Western 0:00:56.800 --> 0:01:00.160 tech behemoths that had plowed over two hundred billion dollars combined 0:01:00.360 --> 0:01:03.920 into data centers and industrial-grade graphics processing units, GPUs 0:01:04.040 --> 0:01:07.280 for short, to power generative AI models like those behind 0:01:07.319 --> 0:01:12.000 ChatGPT and Anthropic's
Claude. Like OpenAI's o1 model, 0:01:12.000 --> 0:01:15.120 DeepSeek's R1 model is a reasoning model, which is 0:01:15.160 --> 0:01:16.959 a way to say that it works through problems step 0:01:16.959 --> 0:01:19.360 by step, showing the users the steps it took to 0:01:19.360 --> 0:01:22.800 reach its conclusion. Generally, when you make a request of 0:01:22.840 --> 0:01:26.440 a generative model, it generates an answer probabilistically, meaning it's 0:01:26.480 --> 0:01:29.480 guessing at each next bit based on the request you've made. 0:01:29.640 --> 0:01:32.119 In the case of OpenAI's o1 model, and indeed 0:01:32.160 --> 0:01:35.440 DeepSeek's R1 model, the model thinks. I use 0:01:35.480 --> 0:01:38.240 that term loosely. These models do not know anything. They're 0:01:38.280 --> 0:01:41.399 not thinking. They have no consciousness, but they think through 0:01:41.520 --> 0:01:44.440 each step by generating it piece by piece and reviewing 0:01:44.440 --> 0:01:46.520 it piece by piece with separate parts of the model. 0:01:47.360 --> 0:01:50.800 In theory, this ability to reason means it's well suited 0:01:50.800 --> 0:01:53.440 for tasks where there's a definitive right and wrong answer, 0:01:53.760 --> 0:01:57.120 like logic and maths. It's also what makes it 0:01:57.160 --> 0:01:59.960 different from the standard ChatGPT, or GPT-4o, 0:02:00.600 --> 0:02:03.680 which is considerably faster, as it doesn't undertake this step-by-step 0:02:03.720 --> 0:02:06.680 thinking and thus is better suited for more 0:02:06.760 --> 0:02:09.400 open-ended questions, such as "What would it be like 0:02:09.440 --> 0:02:12.720 if Garfield had a gun?" To be clear, this doesn't 0:02:12.720 --> 0:02:15.519 mean the answers are any good now.
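As an editorial aside, the idea of "guessing at each next bit" can be sketched in a few lines of Python. This is a toy illustration only, not how DeepSeek or OpenAI implement anything: the word table, its probabilities, and the function names `sample_next` and `generate` are all hypothetical, and a real model computes its probabilities with a neural network over a very large vocabulary.

```python
import random

# Hypothetical toy table: for each word, made-up probabilities for the next word.
# A real model computes such probabilities instead of looking them up.
NEXT_WORD_PROBS = {
    "garfield": {"has": 0.6, "wants": 0.4},
    "has": {"lasagna": 0.7, "a": 0.3},
    "wants": {"lasagna": 0.9, "a": 0.1},
    "a": {"nap": 1.0},
    "lasagna": {"<end>": 1.0},
    "nap": {"<end>": 1.0},
}


def sample_next(word, rng):
    """Pick the next word at random, weighted by its probability."""
    options = NEXT_WORD_PROBS[word]
    words = list(options)
    weights = [options[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(start, seed, max_words=10):
    """Build a sentence one word at a time until '<end>' or the length limit."""
    rng = random.Random(seed)  # seeded so a given run is repeatable
    out = [start]
    while len(out) < max_words:
        nxt = sample_next(out[-1], rng)
        if nxt == "<end>":
            break
        out.append(nxt)
    return out


print(" ".join(generate("garfield", seed=0)))
```

Different seeds can produce different sentences from the same starting word, which is the sense in which the output is a guess rather than a lookup. A reasoning model, as described above, additionally generates and reviews intermediate steps before answering; that part is not modeled here.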
Just a few 0:02:15.520 --> 0:02:18.640 weeks earlier, DeepSeek had released another model, albeit to 0:02:18.680 --> 0:02:21.400 far less fanfare, likely due to it being launched the day 0:02:21.400 --> 0:02:24.360 after Christmas, of course, but nevertheless, it was called 0:02:24.400 --> 0:02:27.800 V3, and it was still pretty impressive. V3 competes 0:02:27.840 --> 0:02:30.680 with the same model that powers ChatGPT that I just mentioned, 0:02:30.720 --> 0:02:32.880 which at the time of recording this is called 0:02:33.080 --> 0:02:36.079 GPT-4o, and that's a more general-purpose kind of product. 0:02:36.240 --> 0:02:38.560 It can write code and solve maths problems, but it's 0:02:38.560 --> 0:02:41.400 better suited for tasks that are rooted in language: writing 0:02:41.440 --> 0:02:44.480 that term paper, summarizing a document, whatever it is you 0:02:44.600 --> 0:02:47.600 do with this. And it's also important to know that 0:02:47.680 --> 0:02:50.880 this is the most commonly used style of model. You're 0:02:50.919 --> 0:02:53.520 not really getting reasoning in everything, at least not yet, 0:02:53.680 --> 0:02:56.960 and I don't know how prevalent it'll ever be. Now, 0:02:56.960 --> 0:02:59.959 DeepSeek's tech didn't just match OpenAI in capabilities. 0:03:00.080 --> 0:03:02.880 It was also purportedly cheaper to train and to operate. 0:03:03.400 --> 0:03:06.600 Whereas OpenAI's GPT-4 model reportedly cost one hundred 0:03:06.680 --> 0:03:10.000 million dollars to train, some experts estimate that DeepSeek's 0:03:10.000 --> 0:03:13.560 reasoning model, called R1, cost a lot less than that, 0:03:14.000 --> 0:03:16.680 and their V3 model actually cost less than six 0:03:16.800 --> 0:03:19.959 million dollars to train. This figure is open to some debate, 0:03:20.760 --> 0:03:22.880 but the big thing about these models is they're 0:03:22.960 --> 0:03:26.920 dramatically cheaper.
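As an editorial aside, the two training figures quoted above can be turned into a rough ratio. This is a back-of-the-envelope sketch, not an audited comparison: both numbers are reported or estimated, the V3 figure is treated as an upper bound because the episode says "less than six million", and the tenfold stress test is a hypothetical added purely for illustration.

```python
# Figures quoted in the episode; both are reported or estimated, not confirmed.
GPT4_TRAINING_COST_USD = 100_000_000  # "reportedly" one hundred million dollars
V3_TRAINING_COST_USD = 6_000_000      # "less than six million", so an upper bound


def cost_ratio(baseline, alternative):
    """Return how many times cheaper the alternative is than the baseline."""
    return baseline / alternative


ratio = cost_ratio(GPT4_TRAINING_COST_USD, V3_TRAINING_COST_USD)

# Hypothetical stress test: suppose the V3 figure understates the real cost tenfold.
pessimistic = cost_ratio(GPT4_TRAINING_COST_USD, V3_TRAINING_COST_USD * 10)

print(f"As quoted: about {ratio:.1f}x cheaper to train")
print(f"If the V3 figure is 10x too low: about {pessimistic:.1f}x cheaper to train")
```

On the quoted figures the gap is roughly seventeen to one, and it stays above one even under the hypothetical tenfold correction.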
They can be run on your computer, though 0:03:27.080 --> 0:03:29.760 much slower, or they can be run on other cloud infrastructure. 0:03:30.280 --> 0:03:32.160 And in the case of the V3 model, the 0:03:32.160 --> 0:03:35.160 one that competes with ChatGPT, it was actually about 0:03:35.200 --> 0:03:38.800 fifty times cheaper, and the reasoning model, R1, about 0:03:38.880 --> 0:03:41.480 thirty, which is crazy. Now, these are the prices when they 0:03:41.520 --> 0:03:43.840 are run on the servers that DeepSeek runs, but 0:03:43.880 --> 0:03:46.080 we're very quickly going to see, as other people host 0:03:46.160 --> 0:03:48.560 them, exactly how much cheaper they are. And they're more 0:03:48.600 --> 0:03:52.360 efficient too, which is crazy. They're so much more efficient. 0:03:53.720 --> 0:03:56.400 And it's also important to note that they trained these 0:03:56.400 --> 0:03:59.360 models using older-generation Nvidia chips because they had 0:03:59.400 --> 0:04:01.600 sanctions on them from China.
They got some of the 0:04:01.640 --> 0:04:05.360 newer ones too through weird resellers, but nevertheless this made 0:04:05.400 --> 0:04:08.640 it much harder for them to get GPUs in general, 0:04:09.120 --> 0:04:11.480 and thus they were able to kind of squeeze more 0:04:11.520 --> 0:04:13.200 power out, and they had to come up with really 0:04:13.280 --> 0:04:16.479 interesting kind of assembly-language-level stuff where they did 0:04:16.520 --> 0:04:19.279 extra things with the GPUs that, well, the fat and 0:04:19.360 --> 0:04:22.520 happy tech executives never thought of, and Sam Altman and 0:04:22.560 --> 0:04:25.160 his ilk from OpenAI never really thought of, because, well, 0:04:25.320 --> 0:04:27.200 why would they have to be, why would they have 0:04:27.240 --> 0:04:29.680 to think of that? They had the unlimited money cheat 0:04:29.720 --> 0:04:32.080 from the hyperscalers, like in the case of OpenAI, 0:04:32.320 --> 0:04:35.120 funded by Microsoft, in the case of Anthropic, funded by 0:04:35.240 --> 0:04:38.720 Amazon and Google. And this is where the narrative has 0:04:38.760 --> 0:04:41.000 begun to kind of fall apart, because all of this 0:04:41.040 --> 0:04:43.839 has made it much harder to justify these companies building 0:04:43.839 --> 0:04:47.279 new data centers and buying new Nvidia GPUs. This 0:04:47.640 --> 0:04:50.440 entire AI boom has been based off of the assumption 0:04:50.480 --> 0:04:52.880 that the only way to build powerful models was to 0:04:52.920 --> 0:04:55.560 get the biggest, most hugest chips from Nvidia each year, 0:04:55.960 --> 0:04:57.560 and that there was just no way to make these 0:04:57.640 --> 0:05:01.640 models cheaper. Now, as an aside, OpenAI lost five billion dollars 0:05:01.680 --> 0:05:04.400 in twenty twenty-four, and all of their products are unprofitable, 0:05:04.520 --> 0:05:07.520 even their two-hundred-dollar-a-month OpenAI 0:05:07.560 --> 0:05:11.240 ChatGPT Pro subscription.
I hate these terms, by the way. 0:05:11.400 --> 0:05:15.640 They're all different. Nevertheless, everyone assumed that there was never 0:05:15.680 --> 0:05:18.360 going to be a more efficient model, and I personally 0:05:18.440 --> 0:05:20.600 made the mistake of saying, well, if it was going 0:05:20.680 --> 0:05:22.599 to be more efficient, surely they would want it to 0:05:22.640 --> 0:05:25.760 be, or they could do that, right? Right? Maybe they 0:05:25.839 --> 0:05:27.839 just have to do this stuff even though it's stupid. 0:05:28.680 --> 0:05:31.760 That was never the case, and DeepSeek proved it. Crucially, 0:05:31.800 --> 0:05:34.560 DeepSeek released its models under an open-source license, 0:05:34.640 --> 0:05:37.520 meaning any company can reuse and repurpose its tech without 0:05:37.560 --> 0:05:40.480 having to pay anyone anything, any license fees or anything, 0:05:40.640 --> 0:05:43.960 or ask anyone for permission. OpenAI, by contrast, keeps 0:05:43.960 --> 0:05:46.840 its technology under lock and key. Despite their name, 0:05:46.880 --> 0:05:50.080 OpenAI is a deeply secretive organization, open in name only. 0:05:50.839 --> 0:05:53.800 In summary, DeepSeek has created a viable alternative to 0:05:53.839 --> 0:05:58.240 OpenAI's tech, and indeed Anthropic's, that's equally capable, vastly cheaper, 0:05:58.360 --> 0:06:00.680 and open source, and proven that you don't need the 0:06:00.680 --> 0:06:03.640 most expensive and powerful chips to do so. And they 0:06:03.720 --> 0:06:06.520 kind of came out of nowhere. While DeepSeek isn't 0:06:06.560 --> 0:06:10.280 exactly a tiny little startup, they're also not a Silicon 0:06:10.360 --> 0:06:13.880 Valley giant with billions of dollars of venture capital, or 0:06:14.120 --> 0:06:16.880 someone who's backed by one of the many different companies 0:06:16.880 --> 0:06:19.680 with a three-trillion-dollar market cap. They started off 0:06:19.680 --> 0:06:21.880 as a side project from a Chinese hedge fund.
No, 0:06:22.000 --> 0:06:25.480 I'm not kidding. Now, still an eight billion dollars under 0:06:25.480 --> 0:06:29.520 management hedge fund. They're not small at all. It's so strange. 0:06:29.920 --> 0:06:32.880 It's a kind of cynical version of David versus Goliath, 0:06:32.960 --> 0:06:37.040 where David is a hedge fund baby and Goliath is 0:06:37.600 --> 0:06:42.640 several different hyperscalers taped together with a bad idea. But anyway, 0:06:42.680 --> 0:06:45.039 put yourself in the shoes of OpenAI CEO and 0:06:45.080 --> 0:06:48.160 co-founder Sam Altman. You've crafted this public perception of 0:06:48.200 --> 0:06:51.080 yourself as a visionary that isn't just bringing generative AI 0:06:51.120 --> 0:06:53.360 to the masses, but you're on the path that will 0:06:53.360 --> 0:06:56.359 bring about artificial general intelligence, which is to say, an 0:06:56.400 --> 0:06:59.400 AI that's as capable as a human being. You've crafted 0:06:59.400 --> 0:07:01.679 this myth not just about yourself, but about your company 0:07:01.680 --> 0:07:03.520 and what you'll do, and this has allowed you to, 0:07:03.680 --> 0:07:05.760 in essence, defy the laws of physics when it 0:07:05.760 --> 0:07:08.080 comes to business. You can burn money at a rate 0:07:08.160 --> 0:07:11.440 unlike any tech company in history, with no hope of 0:07:11.480 --> 0:07:13.160 making a profit, or at least not in the short 0:07:13.200 --> 0:07:16.400 to medium term, and no real expectation that you'll do so, 0:07:16.720 --> 0:07:19.400 as investors will still line up to give you more money.
0:07:19.400 --> 0:07:22.560 With your company valued at even more ludicrous numbers seemingly 0:07:22.600 --> 0:07:25.760 every other month, you can say these outlandish things, like 0:07:25.800 --> 0:07:28.680 you need seven trillion dollars to build the infrastructure and 0:07:28.760 --> 0:07:31.400 chip manufacturing capacity to bring your plans to life, and 0:07:31.440 --> 0:07:33.280 you don't get laughed out of the room. If I 0:07:33.360 --> 0:07:35.880 said this shit, they'd ask me if I had a concussion. 0:07:36.640 --> 0:07:38.880 You can say stuff like "I want to build five 0:07:38.920 --> 0:07:41.520 hundred billion dollars' worth of data centers," and instead of 0:07:41.520 --> 0:07:44.240 people rolling their eyes, the world's largest tech companies and 0:07:44.400 --> 0:07:47.680 investors will say, "Damn, man, that's sick." And then it 0:07:47.720 --> 0:07:51.200 turns out that you were wrong. You'd always assumed that 0:07:51.240 --> 0:07:54.320 AI must be expensive, that the models used to power 0:07:54.440 --> 0:07:58.480 your apps, like ChatGPT and DALL-E, their image generator, 0:07:59.720 --> 0:08:02.000 they'd always cost more to build, they'd always cost more 0:08:02.040 --> 0:08:05.520 to run, they'd always require more powerful hardware. Or maybe 0:08:05.520 --> 0:08:07.600 you just never thought about it too hard because you 0:08:07.680 --> 0:08:10.240 never had to worry about money. And to grow, to 0:08:10.240 --> 0:08:12.920 build more capable AI models, you assumed that you would 0:08:12.920 --> 0:08:15.640 always need more money, and so much more money than 0:08:15.680 --> 0:08:19.000 anyone's ever had. And then here comes this Chinese company 0:08:19.040 --> 0:08:23.040 that didn't just replicate the functionality of your model. And on 0:08:23.080 --> 0:08:25.320 top of that, by the way, o1 is OpenAI's 0:08:25.400 --> 0:08:27.640 one moat. It was the one thing that people liked.
0:08:27.760 --> 0:08:31.760 It was their most sophisticated AI model. But this company 0:08:31.800 --> 0:08:34.440 came along and did it on a shoestring budget, both 0:08:34.520 --> 0:08:37.240 for actually training it, even if the estimates are off 0:08:37.280 --> 0:08:39.719 by, like, factors of ten. But these things are more 0:08:39.720 --> 0:08:42.920 efficient too. And this company didn't even have access to 0:08:42.960 --> 0:08:46.000 the most capable GPUs. They didn't have the server architecture 0:08:46.120 --> 0:08:50.560 provided by Microsoft or Amazon or Google. And wow, and 0:08:50.600 --> 0:08:52.360 what did they do next with this thing they built 0:08:52.360 --> 0:08:55.200 that's competitive with your only real moat? They gave it away. 0:08:56.080 --> 0:08:59.080 Oh, goodness me, Sammy, things aren't looking good at all. 0:08:59.679 --> 0:09:02.079 And this is where Sam Altman's at. This is where 0:09:02.080 --> 0:09:03.920 OpenAI and the companies that have backed it, 0:09:03.960 --> 0:09:06.800 and their competitors, this is where they're all at. The 0:09:06.880 --> 0:09:10.200 decisive lead they once enjoyed has, like a puddle on 0:09:10.240 --> 0:09:13.360 a hot day, evaporated. And you'd see that happen a 0:09:13.400 --> 0:09:16.400 lot here in beautiful Las Vegas, Nevada. Now, don't get 0:09:16.400 --> 0:09:19.120 me wrong, OpenAI still burns money. But now, when 0:09:19.120 --> 0:09:21.920 Sam Altman dusts off his begging bowl, investors will ask, 0:09:22.000 --> 0:09:31.560 perhaps for the first time, one very simple question: why?