1 00:00:02,120 --> 00:00:02,880 Speaker 1: Cool Zone Media. 2 00:00:04,200 --> 00:00:07,000 Speaker 2: Hello and welcome to Better Offline. I'm your host Ed 3 00:00:07,080 --> 00:00:21,840 Speaker 2: Zitron. This is part two of our three-part 4 00:00:21,880 --> 00:00:25,040 Speaker 2: series on how to argue with an AI booster. When 5 00:00:25,040 --> 00:00:27,440 Speaker 2: we last left off, I'd started talking about some of 6 00:00:27,480 --> 00:00:30,240 Speaker 2: the most common and vacuous talking points used by those 7 00:00:30,240 --> 00:00:32,959 Speaker 2: who defend the generative AI industry and why a lot 8 00:00:32,960 --> 00:00:36,080 Speaker 2: of them are wholly without merit. These are the booster quips, 9 00:00:36,120 --> 00:00:38,680 Speaker 2: assertions that, if you don't know much, sound convincing but 10 00:00:38,720 --> 00:00:41,680 Speaker 2: are easily disproven with the right information. And in that 11 00:00:41,800 --> 00:00:44,000 Speaker 2: last episode we addressed the quips that say we're in 12 00:00:44,040 --> 00:00:47,080 Speaker 2: the early days of AI and that people doubted smartphones 13 00:00:47,080 --> 00:00:49,479 Speaker 2: and the internet, things they didn't do, just like they 14 00:00:49,479 --> 00:00:52,880 Speaker 2: did generative AI, which they should do. In the cycle 15 00:00:52,920 --> 00:00:55,360 Speaker 2: of grief, that's the denial stage. Now we're going to 16 00:00:55,400 --> 00:00:58,880 Speaker 2: move on to bargaining. This is just like the dot 17 00:00:58,920 --> 00:01:01,920 Speaker 2: com boom. Even if all of this collapses, the overcapacity will 18 00:01:01,960 --> 00:01:04,200 Speaker 2: be practical for the market like the fiber boom was. 19 00:01:05,040 --> 00:01:07,760 Speaker 2: All right, folks, time for a little history. You know me, 20 00:01:07,840 --> 00:01:10,800 Speaker 2: I love me some mystery.
The fiber boom began after 21 00:01:10,840 --> 00:01:14,520 Speaker 2: the Telecommunications Act of nineteen ninety six deregulated large parts 22 00:01:14,520 --> 00:01:18,920 Speaker 2: of America's communications infrastructure, creating a massive boom, a five 23 00:01:19,000 --> 00:01:25,720 Speaker 2: hundred billion dollar one to be precise, primarily funded with debt. Obviously, 24 00:01:25,720 --> 00:01:28,400 Speaker 2: we're still using the infrastructure bought during that boom, and 25 00:01:28,480 --> 00:01:30,640 Speaker 2: this fact is used as a defense of the insane 26 00:01:30,720 --> 00:01:35,520 Speaker 2: capex spending surrounding generative AI. High speed Internet is useful, right? Sure. 27 00:01:35,600 --> 00:01:38,480 Speaker 2: But the fiber optic boom period was also defined by 28 00:01:38,480 --> 00:01:43,280 Speaker 2: a gluttony of overinvestment, ridiculous valuations, and genuine, outright fraud. 29 00:01:43,480 --> 00:01:45,560 Speaker 2: In any case, this is not remotely the same thing, 30 00:01:45,560 --> 00:01:47,480 Speaker 2: and anyone making this point needs to learn the very 31 00:01:47,520 --> 00:01:51,520 Speaker 2: fucking basics of technology. Let's get going. Now, the fiber 32 00:01:51,520 --> 00:01:54,120 Speaker 2: optic cable of this era is mostly owned by a 33 00:01:54,120 --> 00:01:57,360 Speaker 2: few companies. Forty two percent of Nvidia's revenue is from 34 00:01:57,400 --> 00:02:00,440 Speaker 2: the Magnificent Seven, and the companies buying these GPUs are 35 00:02:00,480 --> 00:02:02,360 Speaker 2: for the most part not going to go bust once 36 00:02:02,400 --> 00:02:05,680 Speaker 2: the AI bubble bursts. You can also already get the 37 00:02:05,800 --> 00:02:09,560 Speaker 2: cheap fiber of this era too: cheap AI GPUs are already here. 38 00:02:09,840 --> 00:02:13,040 Speaker 2: GPUs are depreciating assets, meaning that the good deals are 39 00:02:13,080 --> 00:02:16,640 Speaker 2: already happening.
I found an Nvidia A100 40 00:02:16,639 --> 00:02:19,160 Speaker 2: for two or three thousand dollars multiple times on eBay, 41 00:02:19,360 --> 00:02:21,120 Speaker 2: and you can get the H100s, which are 42 00:02:21,160 --> 00:02:23,639 Speaker 2: more powerful, for, well, I think thirty grand, and those 43 00:02:23,680 --> 00:02:27,720 Speaker 2: things go for forty five thousand retail, so not brilliant. AI GPUs 44 00:02:27,760 --> 00:02:29,760 Speaker 2: also do not have a variety of use cases and 45 00:02:29,800 --> 00:02:33,440 Speaker 2: are limited by CUDA, Nvidia's programming libraries and APIs. 46 00:02:33,760 --> 00:02:37,760 Speaker 2: AI GPUs are integrated into applications using this language, CUDA, and 47 00:02:37,800 --> 00:02:41,280 Speaker 2: this is specifically Nvidia's programming language. While there are 48 00:02:41,400 --> 00:02:45,320 Speaker 2: other use cases (scientific simulations, image and video processing, data 49 00:02:45,360 --> 00:02:48,880 Speaker 2: science and analytics, medical imaging, and so on), CUDA is 50 00:02:48,880 --> 00:02:53,720 Speaker 2: not a one-size-fits-all digital panacea, whereas fiber 51 00:02:53,720 --> 00:02:57,040 Speaker 2: optic cable was, and it was also put everywhere. It 52 00:02:57,200 --> 00:03:00,240 Speaker 2: truly did set up the future. What are these 53 00:03:00,240 --> 00:03:04,679 Speaker 2: GPUs setting up, exactly? Also, widespread access to cheaper GPUs 54 00:03:04,720 --> 00:03:08,280 Speaker 2: has already happened, and what new use cases are there? 55 00:03:08,600 --> 00:03:11,520 Speaker 2: What are the new innovative things we can do? As 56 00:03:11,520 --> 00:03:14,440 Speaker 2: a result of the AI bubble, there are now many, many, many, many, 57 00:03:14,440 --> 00:03:17,720 Speaker 2: many different vendors to get access to GPUs. You can 58 00:03:17,760 --> 00:03:20,000 Speaker 2: pay at an hourly rate.
Who knows if it's profitable, 59 00:03:20,040 --> 00:03:21,880 Speaker 2: but you can do it, and sometimes you can get 60 00:03:21,880 --> 00:03:23,880 Speaker 2: them for as little as one dollar an hour, which 61 00:03:23,919 --> 00:03:26,640 Speaker 2: is really not good. It definitely isn't making them money. 62 00:03:26,639 --> 00:03:30,520 Speaker 2: But putting the financial collapse aside, while they might be 63 00:03:30,639 --> 00:03:33,840 Speaker 2: cheaper when the AI bubble bursts, does cheaper actually enable 64 00:03:33,840 --> 00:03:36,920 Speaker 2: people to do new stuff? Is cost the problem? Because 65 00:03:36,920 --> 00:03:38,080 Speaker 2: I think the costs are going to go up. But 66 00:03:38,120 --> 00:03:40,440 Speaker 2: even if they weren't going up, what are the things 67 00:03:40,480 --> 00:03:42,520 Speaker 2: that you could do that are new? What is the 68 00:03:42,560 --> 00:03:46,520 Speaker 2: prohibitive cost? No one can actually answer this question because 69 00:03:46,560 --> 00:03:50,080 Speaker 2: the answer isn't fun. GPUs are built to shove massive 70 00:03:50,080 --> 00:03:52,960 Speaker 2: amounts of compute into one specific function, again and again 71 00:03:53,000 --> 00:03:55,560 Speaker 2: and again, like generating the output of a model, which, remember, 72 00:03:55,680 --> 00:03:59,640 Speaker 2: mostly boils down to complex maths. Unlike CPUs, a GPU 73 00:03:59,680 --> 00:04:03,240 Speaker 2: can't easily change tasks or handle many little distinct operations, 74 00:04:03,520 --> 00:04:05,560 Speaker 2: meaning that these things aren't going to be adopted for 75 00:04:05,640 --> 00:04:08,640 Speaker 2: another mass market use case because there probably isn't one. 76 00:04:09,280 --> 00:04:12,800 Speaker 2: In simpler terms, this was not an infrastructure build-out.
77 00:04:13,000 --> 00:04:16,360 Speaker 2: The GPU boom is a heavily centralized, capital expenditure funded 78 00:04:16,400 --> 00:04:18,640 Speaker 2: asset bubble where a bunch of chips will sit in 79 00:04:18,680 --> 00:04:22,560 Speaker 2: warehouses or kind of fallow data centers waiting for somebody 80 00:04:22,560 --> 00:04:24,480 Speaker 2: to make up a use case for them. And if 81 00:04:24,520 --> 00:04:27,000 Speaker 2: an enduring one existed, we'd already have it, because we 82 00:04:27,040 --> 00:04:31,920 Speaker 2: already have all the fucking GPUs. Now here's a really 83 00:04:31,920 --> 00:04:34,359 Speaker 2: big booster quip, and one I have been looking forward to. 84 00:04:34,360 --> 00:04:35,880 Speaker 2: I get a lot of people asking me about this. 85 00:04:36,839 --> 00:04:41,280 Speaker 2: Um, Ed, you're so stupid. Why am I stupid, exactly? Well, 86 00:04:41,320 --> 00:04:44,200 Speaker 2: five really smart guys got together and wrote AI twenty 87 00:04:44,279 --> 00:04:47,320 Speaker 2: twenty seven, which is a very real sounding extrapolation that... 88 00:04:47,440 --> 00:04:52,559 Speaker 2: Shut the fuck up, shut up, shut up. AI twenty 89 00:04:52,600 --> 00:04:55,440 Speaker 2: twenty seven is fan fiction. If you were scared by this, 90 00:04:55,480 --> 00:04:57,560 Speaker 2: and you're not a booster, you shouldn't feel bad. By 91 00:04:57,560 --> 00:05:00,320 Speaker 2: the way, this was written to scare you. By the way, 92 00:05:00,320 --> 00:05:02,200 Speaker 2: if you don't know what it is I'm talking about, 93 00:05:02,360 --> 00:05:04,880 Speaker 2: you should consider yourself lucky. It's essentially a piece of 94 00:05:04,920 --> 00:05:09,000 Speaker 2: speculative fiction that describes where GenAI companies get fatter models 95 00:05:09,000 --> 00:05:11,400 Speaker 2: that get exponentially better, and the US and China are 96 00:05:11,440 --> 00:05:14,120 Speaker 2: embroiled in an AI arms race. It's really silly.
97 00:05:14,160 --> 00:05:17,000 Speaker 2: It's so very silly, and I call it fan fiction 98 00:05:17,080 --> 00:05:19,680 Speaker 2: because it is. If we're thinking about this in purely 99 00:05:19,720 --> 00:05:22,080 Speaker 2: intellectual terms, it's up there with My Immortal, and no, 100 00:05:22,200 --> 00:05:24,599 Speaker 2: I'm not explaining that. You can google that one for yourselves. 101 00:05:25,160 --> 00:05:27,240 Speaker 2: It doesn't matter if all the people writing the fan 102 00:05:27,279 --> 00:05:30,080 Speaker 2: fiction are scientists or that they have the right credentials. 103 00:05:30,440 --> 00:05:33,200 Speaker 2: They themselves said that AI twenty twenty seven is a 104 00:05:33,279 --> 00:05:36,960 Speaker 2: guess, an extrapolation (which means guess) with expert feedback (which 105 00:05:37,000 --> 00:05:40,120 Speaker 2: means someone editing your fan fiction), and involves experience at 106 00:05:40,200 --> 00:05:42,240 Speaker 2: OpenAI. There are people that worked on the shows 107 00:05:42,240 --> 00:05:45,479 Speaker 2: they write fan fiction about. We're not even insulting fan fiction, 108 00:05:45,560 --> 00:05:48,520 Speaker 2: by the way. Go nuts. You are one 109 00:05:48,600 --> 00:05:53,040 Speaker 2: hundred times more ethically positive than these people. At least 110 00:05:53,040 --> 00:05:56,960 Speaker 2: you admit it's fan fiction. Could Knuckles get pregnant? I'm sure 111 00:05:56,960 --> 00:05:59,200 Speaker 2: somebody's found out. I'm not going to go line by 112 00:05:59,240 --> 00:06:01,160 Speaker 2: line and cut this up any more than I'm going to 113 00:06:01,200 --> 00:06:03,839 Speaker 2: go and do a lengthy takedown of someone's erotic Banjo 114 00:06:03,920 --> 00:06:07,640 Speaker 2: Kazooie story, because both are fictional.
The entire premise of 115 00:06:07,640 --> 00:06:10,400 Speaker 2: this nonsense is that at one point someone invents a 116 00:06:10,400 --> 00:06:13,400 Speaker 2: self learning agent that teaches itself stuff, and it does 117 00:06:13,400 --> 00:06:16,520 Speaker 2: a bunch of other stuff requiring a bazillion compute points 118 00:06:17,000 --> 00:06:19,599 Speaker 2: with different agents with different numbers after them. There is 119 00:06:19,640 --> 00:06:21,800 Speaker 2: no proof that this is possible. Nobody has done it, 120 00:06:21,839 --> 00:06:24,600 Speaker 2: and nobody will do it. AI twenty twenty seven was 121 00:06:24,640 --> 00:06:27,120 Speaker 2: written specifically to fool people that want to be fooled, 122 00:06:27,279 --> 00:06:29,440 Speaker 2: with big charts and the right technical terms used to 123 00:06:29,480 --> 00:06:31,400 Speaker 2: lull the credulous into a wet dream and a New 124 00:06:31,480 --> 00:06:33,680 Speaker 2: York Times column where one of the writers folds their 125 00:06:33,720 --> 00:06:36,520 Speaker 2: hands and looks worried. It was also written to scare 126 00:06:36,520 --> 00:06:40,480 Speaker 2: people that are already scared. It makes big, scary proclamations 127 00:06:40,480 --> 00:06:43,000 Speaker 2: with tons of links to stuff that looks really legitimate, 128 00:06:43,080 --> 00:06:45,920 Speaker 2: but when you piece it all together, is literally just 129 00:06:46,000 --> 00:06:50,440 Speaker 2: fan fiction, except really not that endearing. My personal favorite 130 00:06:50,480 --> 00:06:53,200 Speaker 2: part is mid twenty twenty six, China Wakes Up, which 131 00:06:53,240 --> 00:06:56,520 Speaker 2: involves China's intelligence agencies
trying to steal OpenBrain's 132 00:06:56,560 --> 00:06:59,960 Speaker 2: agent (no idea who this company could be referring to, please email 133 00:07:00,000 --> 00:07:02,000 Speaker 2: if you can work it out, to I don't care 134 00:07:02,080 --> 00:07:05,760 Speaker 2: at business dot org), before the headline of AI takes 135 00:07:05,839 --> 00:07:08,560 Speaker 2: some jobs after OpenBrain releases a model. Oh God, 136 00:07:08,600 --> 00:07:12,520 Speaker 2: I'm so bored even fucking talking about this. Now, Sarah 137 00:07:12,600 --> 00:07:15,120 Speaker 2: Lyons puts this well, arguing that AI twenty twenty seven, 138 00:07:15,160 --> 00:07:17,680 Speaker 2: and AI in general, is no different from the spurious 139 00:07:17,720 --> 00:07:20,200 Speaker 2: spectral evidence used to accuse someone of being a witch 140 00:07:20,280 --> 00:07:23,520 Speaker 2: during the Salem witch trials, and I quote: and the 141 00:07:23,520 --> 00:07:26,320 Speaker 2: evidence is spectral. What is the real evidence in AI 142 00:07:26,320 --> 00:07:29,680 Speaker 2: twenty twenty seven beyond trust us and vibes? People who 143 00:07:29,680 --> 00:07:32,720 Speaker 2: wrote it cite themselves in the piece. Do not demand 144 00:07:32,720 --> 00:07:35,440 Speaker 2: I take this seriously. This is so clearly a marketing 145 00:07:35,960 --> 00:07:38,240 Speaker 2: device to scare people into buying your product before this 146 00:07:38,280 --> 00:07:41,600 Speaker 2: imaginary window closes. Don't call me stupid for not falling 147 00:07:41,640 --> 00:07:44,840 Speaker 2: for your spectral evidence. My whole life, people have been 148 00:07:44,880 --> 00:07:48,200 Speaker 2: saying artificial intelligence is around the corner, and it never arrives.
149 00:07:48,640 --> 00:07:50,680 Speaker 2: I simply do not believe a chatbot will ever be 150 00:07:50,720 --> 00:07:52,720 Speaker 2: more than a chatbot, and until you show me 151 00:07:52,760 --> 00:07:57,040 Speaker 2: it doing that, I will not believe it. Anyway, AI 152 00:07:57,080 --> 00:08:00,480 Speaker 2: twenty twenty seven is fan fiction, nothing more. Just because 153 00:08:00,480 --> 00:08:02,920 Speaker 2: it's full of fancy words and has five different grifters 154 00:08:02,960 --> 00:08:19,400 Speaker 2: on its byline doesn't mean a goddamn thing. Now, now, now, now, now, folks, 155 00:08:20,240 --> 00:08:24,120 Speaker 2: we've all been waiting for this moment, and here's the 156 00:08:24,200 --> 00:08:28,239 Speaker 2: ultimate booster quip: the cost of inference is coming down. 157 00:08:28,520 --> 00:08:31,640 Speaker 2: This proves that things are getting cheaper. And here's a 158 00:08:31,640 --> 00:08:34,000 Speaker 2: bonus trick for you before I get to my ben 159 00:08:34,640 --> 00:08:37,640 Speaker 2: Here we go: ask them to explain whether things have 160 00:08:37,720 --> 00:08:40,000 Speaker 2: actually got cheaper, and if they say they have, ask 161 00:08:40,040 --> 00:08:42,880 Speaker 2: them why there are no profitable AI companies. If they 162 00:08:42,920 --> 00:08:45,240 Speaker 2: say they're in the growth stage, ask them why there 163 00:08:45,240 --> 00:08:47,920 Speaker 2: are no profitable AI companies again. I'd say it's been 164 00:08:48,000 --> 00:08:50,679 Speaker 2: several years and they've not got one. At this point they 165 00:08:50,679 --> 00:08:53,640 Speaker 2: should try and kill you. But really, I'm about to 166 00:08:53,679 --> 00:08:55,880 Speaker 2: be petty. I'm about to be petty for a fucking 167 00:08:55,920 --> 00:08:58,960 Speaker 2: reason, though.
In an interview on a podcast from earlier 168 00:08:58,960 --> 00:09:01,560 Speaker 2: this year that I will not even quote because the 169 00:09:01,679 --> 00:09:04,040 Speaker 2: journalist in question did not back me up and it 170 00:09:04,080 --> 00:09:08,240 Speaker 2: pisses me off, journalist Casey Newton said the following about 171 00:09:08,240 --> 00:09:08,720 Speaker 2: my work. 172 00:09:09,720 --> 00:09:11,160 Speaker 1: You don't think that that kind of flies in the 173 00:09:11,160 --> 00:09:13,120 Speaker 1: face of Sam Altman saying that we need billions of 174 00:09:13,160 --> 00:09:15,880 Speaker 1: dollars for years? No, not at all. And I think 175 00:09:15,920 --> 00:09:18,080 Speaker 1: that's why it's so important when you're reading about AI 176 00:09:18,240 --> 00:09:20,600 Speaker 1: to read people who actually interview people who work at 177 00:09:20,640 --> 00:09:23,640 Speaker 1: these companies and understand how the technology works. Because the 178 00:09:23,800 --> 00:09:28,000 Speaker 1: entire industry has been on this curve where they are 179 00:09:28,200 --> 00:09:32,440 Speaker 1: trying to find micro innovations that reduce the cost of 180 00:09:32,480 --> 00:09:35,240 Speaker 1: training the models and to reduce the cost of what 181 00:09:35,280 --> 00:09:37,600 Speaker 1: they call inference, which is when you actually enter a query 182 00:09:37,640 --> 00:09:41,000 Speaker 1: into ChatGPT. And if you plotted the curve of 183 00:09:41,280 --> 00:09:44,360 Speaker 1: how the cost has been falling over time, DeepSeek 184 00:09:44,440 --> 00:09:47,520 Speaker 1: is on that curve. Right? So everything that DeepSeek 185 00:09:47,559 --> 00:09:50,160 Speaker 1: did, it was expected by the AI labs that someone 186 00:09:50,200 --> 00:09:52,520 Speaker 1: would be able to do. The novelty was just that 187 00:09:52,559 --> 00:09:54,760 Speaker 1: a Chinese company did it.
So to say that it 188 00:09:54,920 --> 00:09:58,600 Speaker 1: like upends expectations of how AI would be built 189 00:09:58,760 --> 00:10:01,440 Speaker 1: is just purely false and the opinion of somebody who 190 00:10:01,440 --> 00:10:02,680 Speaker 1: does not know what he's talking about. 191 00:10:03,280 --> 00:10:06,520 Speaker 2: Newton then says, several octaves higher, which shows you exactly 192 00:10:06,520 --> 00:10:09,360 Speaker 2: how mad he isn't, that he thought what he said 193 00:10:09,480 --> 00:10:12,000 Speaker 2: was very civil, and that there are things that are 194 00:10:12,000 --> 00:10:14,679 Speaker 2: true and there are things that are false, like you 195 00:10:14,720 --> 00:10:17,560 Speaker 2: can choose which ones you want to believe. I'm not 196 00:10:17,600 --> 00:10:20,240 Speaker 2: going to be so civil. Other than the fact that 197 00:10:20,280 --> 00:10:23,959 Speaker 2: Casey refers to micro innovations (the fuck are you talking about?) 198 00:10:24,200 --> 00:10:26,640 Speaker 2: and DeepSeek being on a curve that was expected, 199 00:10:27,000 --> 00:10:30,320 Speaker 2: he makes, as many do, two very big mistakes, and personally, 200 00:10:30,360 --> 00:10:34,160 Speaker 2: if I was doing this, I personally would not have 201 00:10:34,280 --> 00:10:37,680 Speaker 2: said these things in a sentence that began with me 202 00:10:37,760 --> 00:10:40,560 Speaker 2: suggesting that I, being Casey Newton in this 203 00:10:40,679 --> 00:10:44,080 Speaker 2: example, knew how the technology works. Now here's the Casey 204 00:10:44,120 --> 00:10:47,160 Speaker 2: Newton quip: inference, which is when you actually enter 205 00:10:47,200 --> 00:10:50,040 Speaker 2: a query into ChatGPT. This statement is false. It's 206 00:10:50,040 --> 00:10:52,760 Speaker 2: not what inference means. Inference (and I've gotten this wrong 207 00:10:52,800 --> 00:10:55,680 Speaker 2: in the past too, I'm being accountable)
is everything that 208 00:10:55,760 --> 00:10:58,120 Speaker 2: happens when you put in a prompt to generate an output. 209 00:10:58,400 --> 00:11:02,080 Speaker 2: It's when an AI, based on your prompt, infers meaning. To 210 00:11:02,160 --> 00:11:05,280 Speaker 2: be more specific, and quoting Google: machine learning inference is 211 00:11:05,280 --> 00:11:07,720 Speaker 2: the process of running data points into a machine learning 212 00:11:07,720 --> 00:11:10,960 Speaker 2: model to calculate an output, such as a single numerical score. 213 00:11:11,320 --> 00:11:13,439 Speaker 2: Except that's what these things are bad at. But nevertheless, 214 00:11:13,720 --> 00:11:15,440 Speaker 2: Casey will try and weasel out of this one and 215 00:11:15,480 --> 00:11:18,320 Speaker 2: say this is what he meant. It wasn't. He also said, 216 00:11:18,400 --> 00:11:20,240 Speaker 2: if you plotted the curve of how the cost of 217 00:11:20,280 --> 00:11:24,200 Speaker 2: inference has been falling over time. Well, that's wrong, Casey, 218 00:11:24,320 --> 00:11:26,320 Speaker 2: that's wrong, my man. The cost of inference has gone 219 00:11:26,360 --> 00:11:28,960 Speaker 2: up over time. Now, Casey, like many people who talk 220 00:11:28,960 --> 00:11:31,600 Speaker 2: about stuff without learning about it first, is likely referring 221 00:11:31,600 --> 00:11:33,320 Speaker 2: to the fact that the price of tokens for some 222 00:11:33,360 --> 00:11:36,240 Speaker 2: models has gone down in some cases. But you know what, folks, 223 00:11:36,320 --> 00:11:38,959 Speaker 2: let's establish some facts about inference. I'm doing the train. 224 00:11:39,320 --> 00:11:41,960 Speaker 2: I'm pulling the big horn on the invisible train. I'm 225 00:11:42,000 --> 00:11:45,000 Speaker 2: cooking now.
Inference, a thing that costs money, is 226 00:11:45,120 --> 00:11:47,760 Speaker 2: entirely different to the price of tokens, and conflating the 227 00:11:47,800 --> 00:11:51,000 Speaker 2: two is journalistic malpractice. The cost of inference would be 228 00:11:51,000 --> 00:11:53,720 Speaker 2: the price of running the GPU and the associated architecture. 229 00:11:53,800 --> 00:11:55,800 Speaker 2: Of course, we do not at this point have any 230 00:11:55,840 --> 00:11:59,520 Speaker 2: real insight into that. Token prices are set by the people 231 00:11:59,520 --> 00:12:02,160 Speaker 2: who sell access to the tokens, such as OpenAI 232 00:12:02,200 --> 00:12:05,120 Speaker 2: and Anthropic. For example, OpenAI dropped the price of 233 00:12:05,160 --> 00:12:07,959 Speaker 2: its o3 model's token costs almost immediately after the 234 00:12:08,000 --> 00:12:10,520 Speaker 2: launch of Claude Opus four. Do you think it did 235 00:12:10,559 --> 00:12:12,800 Speaker 2: that because the price of serving the models got cheaper? 236 00:12:13,000 --> 00:12:16,040 Speaker 2: If you do, I don't know how you possibly put 237 00:12:16,080 --> 00:12:19,920 Speaker 2: your trousers on every morning without cutting yourself in half. Now, 238 00:12:19,920 --> 00:12:22,960 Speaker 2: the cost of inference conversation comes from articles that say 239 00:12:23,000 --> 00:12:25,400 Speaker 2: that we now have models that are cheaper that can 240 00:12:25,400 --> 00:12:28,960 Speaker 2: now hit higher benchmark scores. Though the article I'm referring to, 241 00:12:29,000 --> 00:12:31,080 Speaker 2: which will be in the show notes, is from November 242 00:12:31,080 --> 00:12:33,240 Speaker 2: twenty twenty four, and the comparison it makes is between 243 00:12:33,280 --> 00:12:36,280 Speaker 2: GPT three, which is from November twenty twenty one, and 244 00:12:36,400 --> 00:12:40,400 Speaker 2: Llama three point two 3B, September twenty twenty four.
Now, 245 00:12:40,440 --> 00:12:42,200 Speaker 2: the suggestion is, in any case, that the cost of 246 00:12:42,200 --> 00:12:45,040 Speaker 2: inference is going down ten x year over year. The 247 00:12:45,080 --> 00:12:47,600 Speaker 2: problem is, however, that these are raw token costs, not 248 00:12:47,640 --> 00:12:51,199 Speaker 2: actual expressions or evaluations of token burn in a practical setting. 249 00:12:51,720 --> 00:12:54,199 Speaker 2: And really, I realize that was a bit technical. 250 00:12:54,960 --> 00:12:57,920 Speaker 2: These are just what it costs to do something. It 251 00:12:57,960 --> 00:13:01,120 Speaker 2: doesn't actually tell you how many tokens will be 252 00:13:01,160 --> 00:13:03,640 Speaker 2: burned or at what volume they will be burned, because that 253 00:13:03,679 --> 00:13:06,800 Speaker 2: would change things. And well, wouldn't you know it, the 254 00:13:06,840 --> 00:13:10,120 Speaker 2: cost of inference actually went up as a result. In 255 00:13:10,160 --> 00:13:12,080 Speaker 2: an excellent blog from Kilo Code, and I did not 256 00:13:12,160 --> 00:13:14,640 Speaker 2: get the chance to find out the pronunciation of this 257 00:13:15,400 --> 00:13:17,319 Speaker 2: second name, so I'm just going to call her... It 258 00:13:17,400 --> 00:13:22,760 Speaker 2: is Ewa S-Z-Y-S-Z-K-A. I am so sorry. I would 259 00:13:22,840 --> 00:13:25,679 Speaker 2: rather spell it out than actually mispronounce it. I 260 00:13:25,720 --> 00:13:29,240 Speaker 2: hate when people say Zitron wrong. Great blog. Anyway, 261 00:13:29,320 --> 00:13:33,520 Speaker 2: let me quote: application inference costs increased for two reasons. 262 00:13:33,559 --> 00:13:36,600 Speaker 2: The frontier model's cost per token stayed constant, and the 263 00:13:36,679 --> 00:13:40,760 Speaker 2: token consumption per application grew a lot.
Token consumption per 264 00:13:40,800 --> 00:13:43,600 Speaker 2: application grew a lot because models allowed for longer context 265 00:13:43,600 --> 00:13:46,880 Speaker 2: windows and bigger suggestions from the models. The combination of 266 00:13:46,920 --> 00:13:49,840 Speaker 2: a steady price per token and more token consumption caused 267 00:13:49,880 --> 00:13:52,880 Speaker 2: app inference costs to grow about ten times over the 268 00:13:52,880 --> 00:13:56,600 Speaker 2: past two years. To explain that in really simple terms, 269 00:13:56,640 --> 00:13:59,440 Speaker 2: while the costs of old models may have decreased, new models, 270 00:13:59,640 --> 00:14:02,760 Speaker 2: which you need to do most things, cost about the same, 271 00:14:02,800 --> 00:14:05,600 Speaker 2: and the reasoning that these new models use does actually 272 00:14:05,600 --> 00:14:09,079 Speaker 2: burn way, way more tokens. When these new models reason, 273 00:14:09,160 --> 00:14:11,280 Speaker 2: they break the user's input down into 274 00:14:11,280 --> 00:14:14,360 Speaker 2: component parts, then run inference on each of those parts. 275 00:14:14,600 --> 00:14:16,200 Speaker 2: When you plug an LLM into an AI 276 00:14:16,240 --> 00:14:19,320 Speaker 2: coding environment, it will naturally burn an absolute shit ton 277 00:14:19,360 --> 00:14:21,640 Speaker 2: of tokens, in part because of the large amount of 278 00:14:21,640 --> 00:14:23,800 Speaker 2: information you have to load into the prompt and the 279 00:14:23,840 --> 00:14:25,960 Speaker 2: context window, or the amount of information you can load 280 00:14:26,000 --> 00:14:29,440 Speaker 2: in at once, and in part because generating code is inference 281 00:14:29,520 --> 00:14:31,920 Speaker 2: intensive and also breaking down all those coding tasks.
Each 282 00:14:31,960 --> 00:14:34,360 Speaker 2: of those tasks requires a coding tool and takes 283 00:14:34,400 --> 00:14:38,200 Speaker 2: a bunch of inference itself. It's really bad. In fact, 284 00:14:38,240 --> 00:14:40,640 Speaker 2: the inference costs are so severe that Kilo Code says 285 00:14:40,680 --> 00:14:43,160 Speaker 2: that a combination of a steady price per token and 286 00:14:43,200 --> 00:14:46,040 Speaker 2: more token consumption caused app inference costs to grow about 287 00:14:46,040 --> 00:14:49,160 Speaker 2: ten x over the last two years. I'm repeating myself, 288 00:14:49,200 --> 00:14:51,520 Speaker 2: I realize, but I really need you to get one thing, 289 00:14:51,760 --> 00:14:53,960 Speaker 2: which is that the cost of inference went up. But 290 00:14:54,120 --> 00:14:56,600 Speaker 2: I'm not done. I refuse to let this point go 291 00:14:56,800 --> 00:14:58,760 Speaker 2: because people love to say the cost of inference is 292 00:14:58,800 --> 00:15:01,400 Speaker 2: going down when the cost of inference has increased, and 293 00:15:01,440 --> 00:15:04,240 Speaker 2: they do so to a national audience, all while suggesting 294 00:15:04,320 --> 00:15:07,880 Speaker 2: I'm wrong somehow and acting superior. I don't like being 295 00:15:07,920 --> 00:15:10,680 Speaker 2: made to feel this way. I don't think it's nice 296 00:15:10,680 --> 00:15:13,360 Speaker 2: to do this to people. And if you're gonna do it, 297 00:15:13,440 --> 00:15:15,720 Speaker 2: if you have the temerity to call someone out directly, 298 00:15:15,840 --> 00:15:20,160 Speaker 2: at least be fucking right. I'm not wrong. You're wrong.
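The Kilo Code arithmetic can be sketched in a few lines of Python. This is only an illustration, assuming app inference cost is the price per token multiplied by the tokens consumed. The price and the token counts are made-up placeholders, not figures from Kilo Code or any vendor.

```python
# Hypothetical illustration: app inference cost = price per token * tokens consumed.
# PRICE, tokens_then and tokens_now are placeholders, not measured figures.

def inference_cost(price_per_million_tokens: float, tokens_consumed: int) -> float:
    """Return the dollar cost of consuming `tokens_consumed` tokens."""
    return price_per_million_tokens * tokens_consumed / 1_000_000

# Assumption: the per-token price stays flat while consumption per task grows tenfold.
PRICE = 10.0          # dollars per million tokens (placeholder)
tokens_then = 2_000   # tokens per task two years ago (placeholder)
tokens_now = 20_000   # tokens per task today (placeholder)

cost_then = inference_cost(PRICE, tokens_then)
cost_now = inference_cost(PRICE, tokens_now)
growth = cost_now / cost_then

print(f"then: ${cost_then:.2f}, now: ${cost_now:.2f}, growth: {growth:.0f}x")
```

With a flat price, the bill grows exactly as fast as token consumption does, which is the whole point of the quote.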
299 00:15:20,600 --> 00:15:24,240 Speaker 2: In fact, software developer influencer Theo Browne recently put out 300 00:15:24,240 --> 00:15:26,960 Speaker 2: a video called I Was Wrong About AI Costs (They 301 00:15:27,040 --> 00:15:30,240 Speaker 2: Keep Going Up), which he breaks down as follows: reasoning 302 00:15:30,240 --> 00:15:34,000 Speaker 2: models are significantly increasing the amount of output tokens being generated. 303 00:15:34,320 --> 00:15:37,760 Speaker 2: These tokens are also more expensive. In one example, Browne 304 00:15:37,840 --> 00:15:41,080 Speaker 2: finds that Grok 4's reasoning mode uses six hundred and three 305 00:15:41,120 --> 00:15:45,760 Speaker 2: tokens to generate two words. This was a problem across 306 00:15:45,800 --> 00:15:48,720 Speaker 2: every single reasoning model, as even cheap reasoning models would 307 00:15:48,760 --> 00:15:51,600 Speaker 2: do the same thing. As a result, tasks are taking 308 00:15:51,680 --> 00:15:55,240 Speaker 2: longer and burning more tokens. Another writer called Ethan Ding 309 00:15:55,280 --> 00:15:57,760 Speaker 2: noted a few months ago that reasoning models burn so 310 00:15:57,800 --> 00:16:00,680 Speaker 2: many tokens that there is no flat subscription price that 311 00:16:00,720 --> 00:16:03,200 Speaker 2: works in this new world, as the number of tokens 312 00:16:03,240 --> 00:16:06,920 Speaker 2: they consume went absolutely nuclear. The price drops have 313 00:16:07,000 --> 00:16:09,920 Speaker 2: also, for the most part, stopped. You cannot at this 314 00:16:10,040 --> 00:16:12,560 Speaker 2: point fairly evaluate whether a model is cheaper just based 315 00:16:12,600 --> 00:16:15,640 Speaker 2: on its cost per token, because reasoning models inherently burn, 316 00:16:15,880 --> 00:16:19,080 Speaker 2: and are built to inherently burn, more tokens to create 317 00:16:19,120 --> 00:16:21,560 Speaker 2: an output.
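To see why a per-token price alone says so little, here is a rough Python sketch of cost per visible word. The six hundred and three tokens for two words is the Grok 4 example from the video. Both prices and the non-reasoning token count are hypothetical placeholders chosen only to make the comparison concrete.

```python
# Rough sketch: what a visible word costs once hidden reasoning tokens are billed.
# Only the 603-tokens-for-two-words figure comes from the transcript; the rest is assumed.

def cost_per_visible_word(price_per_million_tokens: float,
                          tokens_generated: int,
                          visible_words: int) -> float:
    """Dollar cost of each word the user actually sees."""
    total = price_per_million_tokens * tokens_generated / 1_000_000
    return total / visible_words

# Assumption: the reasoning model is priced lower per token than the plain one.
reasoning_cost = cost_per_visible_word(5.0, 603, 2)   # 603 tokens billed for two words
plain_cost = cost_per_visible_word(15.0, 3, 2)        # placeholder: 3 tokens for two words

print(f"reasoning: ${reasoning_cost:.7f} per word")
print(f"plain:     ${plain_cost:.7f} per word")
print("reasoning is pricier per word:", reasoning_cost > plain_cost)
```

Even with a per-token price set at a third of the plain model's, the reasoning run costs far more for the same two words, because nearly all of the billed tokens never reach the user.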
Reasoning models are also the only way that 318 00:16:21,600 --> 00:16:23,840 Speaker 2: model developers have been able to improve the efficacy of 319 00:16:23,880 --> 00:16:26,640 Speaker 2: new models, using something called test-time compute to burn 320 00:16:26,680 --> 00:16:30,080 Speaker 2: extra tokens to complete a task, and in basically anything 321 00:16:30,120 --> 00:16:31,800 Speaker 2: you're using today, there's going to be some sort of 322 00:16:31,880 --> 00:16:35,360 Speaker 2: reasoning model, especially if you're coding. The cost of inference 323 00:16:35,360 --> 00:16:38,800 Speaker 2: has gone up. Statements otherwise are purely false and are 324 00:16:38,840 --> 00:16:41,000 Speaker 2: the opinion of somebody who does not know what he's 325 00:16:41,040 --> 00:16:44,240 Speaker 2: talking about. But, you ask, could the costs of inference 326 00:16:44,280 --> 00:16:49,000 Speaker 2: go down? Maybe. It sure isn't trending that way, nor 327 00:16:49,040 --> 00:16:51,560 Speaker 2: has it gone down yet. I also predict that there's 328 00:16:51,560 --> 00:16:53,440 Speaker 2: going to be some sort of sudden realization in the 329 00:16:53,440 --> 00:16:55,720 Speaker 2: media that inference is going up, which has kind of 330 00:16:55,720 --> 00:16:58,960 Speaker 2: already started.
The Information had a piece on it in 331 00:16:59,040 --> 00:17:01,480 Speaker 2: late August where they note that Intuit paid twenty 332 00:17:01,480 --> 00:17:03,880 Speaker 2: million dollars to Azure last year, primarily to access 333 00:17:03,920 --> 00:17:06,160 Speaker 2: OpenAI's models, and it's on track to spend thirty 334 00:17:06,200 --> 00:17:08,720 Speaker 2: million this year, which outpaces the company's revenue growth in 335 00:17:08,760 --> 00:17:11,800 Speaker 2: the same period, raising questions about how sustainable the spending 336 00:17:11,920 --> 00:17:13,560 Speaker 2: is and how much of the cost it can pass 337 00:17:13,560 --> 00:17:16,320 Speaker 2: along to customers. Christopher Mims of The Wall Street Journal 338 00:17:16,359 --> 00:17:18,359 Speaker 2: also had a piece about the costs going up. Do 339 00:17:18,520 --> 00:17:21,040 Speaker 2: not be mad at Chris. Chris and I chatted before 340 00:17:21,080 --> 00:17:24,040 Speaker 2: he submitted that piece, like, he literally on Bluesky 341 00:17:24,080 --> 00:17:26,360 Speaker 2: called me out. It fucking rocks. By the way, big 342 00:17:26,440 --> 00:17:28,600 Speaker 2: up to Chris Mims, because it's nice to see the 343 00:17:28,640 --> 00:17:31,639 Speaker 2: mainstream media actually engaging with these things, even though it's 344 00:17:31,720 --> 00:17:34,600 Speaker 2: dangerous to the bubble. But you know what, the truth 345 00:17:34,680 --> 00:17:37,040 Speaker 2: must win out, and the problem here is that the 346 00:17:37,160 --> 00:17:41,600 Speaker 2: architecture underlying large language models is inherently unreliable.
I imagine OpenAI's 347 00:17:41,600 --> 00:17:44,520 Speaker 2: introduction of the router to ChatGPT-5 is 348 00:17:44,560 --> 00:17:46,359 Speaker 2: an attempt to moderate both the costs of the model 349 00:17:46,440 --> 00:17:49,320 Speaker 2: chosen and reduce the amount of exposure to reasoning models 350 00:17:49,320 --> 00:17:52,520 Speaker 2: for simple queries, though Sam Altman was boasting on August 351 00:17:52,520 --> 00:17:54,880 Speaker 2: tenth about the significant increase in both free and paid 352 00:17:54,960 --> 00:17:58,000 Speaker 2: users' exposure to reasoning models. They don't teach you this 353 00:17:58,119 --> 00:18:01,640 Speaker 2: in business school. Still, a study written up by VentureBeat 354 00:18:01,680 --> 00:18:04,040 Speaker 2: found that open-weight models burn between one point five 355 00:18:04,080 --> 00:18:06,119 Speaker 2: and four times more tokens, in part due to a 356 00:18:06,200 --> 00:18:08,879 Speaker 2: lack of token efficiency and in part thanks to, you 357 00:18:09,040 --> 00:18:13,440 Speaker 2: guessed it, reasoning models. I quote: the findings challenge a 358 00:18:13,480 --> 00:18:16,560 Speaker 2: prevailing assumption in the AI industry that open source models 359 00:18:16,560 --> 00:18:20,520 Speaker 2: offer clear economic advantages over proprietary alternatives. While open 360 00:18:20,520 --> 00:18:23,000 Speaker 2: source models typically cost less per token to run, the 361 00:18:23,000 --> 00:18:25,520 Speaker 2: study suggests that this advantage could be, and I quote 362 00:18:25,560 --> 00:18:28,280 Speaker 2: the study, easily offset if they require more tokens to 363 00:18:28,320 --> 00:18:31,560 Speaker 2: reason about a given problem. And models keep getting bigger 364 00:18:31,560 --> 00:18:36,399 Speaker 2: and more expensive too. So why did this happen?
Well, 365 00:18:36,520 --> 00:18:39,359 Speaker 2: it's because model developers hit a wall of diminishing returns 366 00:18:39,400 --> 00:18:41,159 Speaker 2: and the only way to make models do more was 367 00:18:41,200 --> 00:18:43,080 Speaker 2: to make them burn more tokens to generate a more 368 00:18:43,119 --> 00:18:46,560 Speaker 2: accurate response, which is a very simple way of describing 369 00:18:46,600 --> 00:18:49,160 Speaker 2: reasoning, a thing that OpenAI launched in September twenty 370 00:18:49,200 --> 00:18:52,120 Speaker 2: twenty four, and others followed. As a result, all the 371 00:18:52,160 --> 00:18:55,040 Speaker 2: gains from powerful new models come from burning more and 372 00:18:55,119 --> 00:18:57,639 Speaker 2: more tokens. The cost per million token number is no 373 00:18:57,720 --> 00:18:59,840 Speaker 2: longer an accurate measure of the actual cost of generative 374 00:18:59,880 --> 00:19:02,720 Speaker 2: AI, because it's much, much, much, much harder to tell 375 00:19:02,720 --> 00:19:04,920 Speaker 2: how many tokens a reasoning model may burn, and it 376 00:19:05,040 --> 00:19:08,399 Speaker 2: varies, as Theo Boin, Theo Boying. I'm keeping that, 377 00:19:08,480 --> 00:19:11,080 Speaker 2: all right? You get the real cuts. As Theo 378 00:19:11,240 --> 00:19:14,840 Speaker 2: Browne noted, from model to model. In any case, there 379 00:19:14,880 --> 00:19:17,600 Speaker 2: really is no changing this path. These companies are out 380 00:19:17,600 --> 00:19:22,679 Speaker 2: of ideas. Now, another one of my favorite ultimate 381 00:19:22,720 --> 00:19:25,120 Speaker 2: booster quips. This is a classic and I still get 382 00:19:25,160 --> 00:19:28,679 Speaker 2: this on social media.
I have people yapping in 383 00:19:28,720 --> 00:19:31,919 Speaker 2: my ear saying OpenAI and Anthropic are just like 384 00:19:32,080 --> 00:19:34,840 Speaker 2: Uber, because Uber burnt twenty five billion dollars over the 385 00:19:34,880 --> 00:19:37,960 Speaker 2: course of fifteen or so years and look, look, Edward, 386 00:19:38,119 --> 00:19:40,399 Speaker 2: they're now profitable. Why are you calling me Edward? Shut up. 387 00:19:40,640 --> 00:19:43,199 Speaker 2: This proves that OpenAI, a totally different company with 388 00:19:43,240 --> 00:19:46,280 Speaker 2: different economics, will be totally fine. So I've heard this 389 00:19:46,400 --> 00:19:48,520 Speaker 2: argument maybe fifty times in the last year, to the 390 00:19:48,520 --> 00:19:49,879 Speaker 2: point that I had to talk about it in my 391 00:19:49,960 --> 00:19:53,160 Speaker 2: piece How Does OpenAI Survive, which I also turned 392 00:19:53,160 --> 00:19:55,720 Speaker 2: into a podcast around July twenty twenty four. Go back 393 00:19:55,720 --> 00:19:58,960 Speaker 2: and link a link to it in the piece. Yaddy yaddy, yadda. Nevertheless, 394 00:19:58,960 --> 00:20:00,840 Speaker 2: people make a few points about Uber and AI that 395 00:20:00,840 --> 00:20:02,880 Speaker 2: I think are fundamentally incorrect, and I'm going to break 396 00:20:02,920 --> 00:20:05,680 Speaker 2: them down for you now. They claim that AI is 397 00:20:05,720 --> 00:20:08,200 Speaker 2: making itself too big to fail and embedding itself everywhere 398 00:20:08,240 --> 00:20:10,920 Speaker 2: and becoming essential, and none of these things are the case. 399 00:20:11,560 --> 00:20:13,480 Speaker 2: I've heard this argument a lot, by the way, and 400 00:20:13,520 --> 00:20:16,879 Speaker 2: it's one that's both ahistorical and alarmingly ignorant of the 401 00:20:17,040 --> 00:20:21,320 Speaker 2: very basics of society.
But Ed, the government. No, no, no, no, no, no, 402 00:20:21,680 --> 00:20:23,960 Speaker 2: you've heard, you've heard. OpenAI got a two hundred million 403 00:20:23,960 --> 00:20:26,720 Speaker 2: dollar defense contract with an estimated completion date of July 404 00:20:26,760 --> 00:20:28,600 Speaker 2: twenty twenty six. And just to be clear, that's up 405 00:20:28,640 --> 00:20:31,120 Speaker 2: to two hundred million dollars, and that they're selling ChatGPT 406 00:20:31,160 --> 00:20:34,120 Speaker 2: Enterprise to the US government for a dollar a year, 407 00:20:34,320 --> 00:20:37,160 Speaker 2: along with Anthropic doing the same thing, and even Google's 408 00:20:37,200 --> 00:20:40,000 Speaker 2: doing it, except they're doing forty cents for a year. Now, 409 00:20:40,000 --> 00:20:42,960 Speaker 2: you're probably hearing this and thinking, ah shit, this means 410 00:20:42,960 --> 00:20:45,080 Speaker 2: the government's paid them. They're never going away. And I 411 00:20:45,160 --> 00:20:47,720 Speaker 2: cannot be clear enough that you believing this is the 412 00:20:47,880 --> 00:20:51,240 Speaker 2: very intention of these deals. They are built specifically to 413 00:20:51,280 --> 00:20:53,359 Speaker 2: make you feel like these things are never going away. 414 00:20:53,640 --> 00:20:56,159 Speaker 2: This is also an attempt to get in with the 415 00:20:56,160 --> 00:20:58,440 Speaker 2: government at a rate that makes trying these models a 416 00:20:58,520 --> 00:21:02,800 Speaker 2: no-brainer.
At which point I ask: and? The government 417 00:21:02,880 --> 00:21:05,120 Speaker 2: is going to have cheap access to AI software. That does 418 00:21:05,119 --> 00:21:08,200 Speaker 2: not mean that the government relies on them. Every member 419 00:21:08,200 --> 00:21:11,199 Speaker 2: of the government having access to ChatGPT, something that 420 00:21:11,320 --> 00:21:14,040 Speaker 2: is not even necessarily the case, does not make this 421 00:21:14,119 --> 00:21:17,200 Speaker 2: software useful, let alone essential. And if OpenAI burns 422 00:21:17,240 --> 00:21:19,600 Speaker 2: a bunch of money making it work for them, it 423 00:21:19,720 --> 00:21:22,240 Speaker 2: still won't be essential, because large language models are not 424 00:21:22,280 --> 00:21:25,960 Speaker 2: actually that useful for doing stuff. Now let's talk Uber. 425 00:21:26,359 --> 00:21:29,360 Speaker 2: Uber was and is useful, which eventually made it essential. 426 00:21:30,080 --> 00:21:33,320 Speaker 2: Uber used lobbyist Bradley Tusk to steamroll local governments 427 00:21:33,359 --> 00:21:35,960 Speaker 2: into allowing Uber to operate in their cities, but Tusk 428 00:21:36,040 --> 00:21:38,520 Speaker 2: did not have to convince local governments that Uber was 429 00:21:38,600 --> 00:21:41,440 Speaker 2: useful or have to train people how to use Uber. 430 00:21:42,160 --> 00:21:44,760 Speaker 2: Uber's too-big-to-fail moment was that local cabs 431 00:21:44,840 --> 00:21:48,000 Speaker 2: kind of fucking sucked just about everywhere. You ever try 432 00:21:48,000 --> 00:21:50,760 Speaker 2: and take a yellow cab from downtown Manhattan to Hoboken, 433 00:21:50,800 --> 00:21:53,880 Speaker 2: New Jersey, or Brooklyn or Queens? Do you ever try 434 00:21:53,880 --> 00:21:56,000 Speaker 2: and pay with a credit card? How about trying to 435 00:21:56,000 --> 00:21:58,480 Speaker 2: get a cab outside a major metropolitan area?
Do you 436 00:21:58,520 --> 00:22:02,520 Speaker 2: remember how bad it was? It was really awful. I 437 00:22:02,560 --> 00:22:05,560 Speaker 2: don't think people realize or remember how bad it was. 438 00:22:05,760 --> 00:22:08,720 Speaker 2: And I'm not saying that Uber is good. I'm not 439 00:22:08,720 --> 00:22:11,600 Speaker 2: glorifying Uber in any way. But the experience that Uber 440 00:22:11,680 --> 00:22:14,640 Speaker 2: replaced was very, very bad. As a result, Uber did 441 00:22:14,680 --> 00:22:16,840 Speaker 2: become too big to fail, because people now rely on 442 00:22:16,880 --> 00:22:19,840 Speaker 2: it because the old system sucked. Uber used its masses 443 00:22:19,880 --> 00:22:22,080 Speaker 2: of venture capital to keep prices low to get people 444 00:22:22,200 --> 00:22:24,880 Speaker 2: used to it too, but the fundamental experience was better 445 00:22:24,920 --> 00:22:27,000 Speaker 2: than calling a cab company and hoping they showed up. 446 00:22:27,520 --> 00:22:28,879 Speaker 2: I also want to be clear that this is not 447 00:22:28,920 --> 00:22:32,080 Speaker 2: me condoning Uber. Take public transport if you can. To 448 00:22:32,119 --> 00:22:34,439 Speaker 2: be clear, Uber has created a new kind of horrifying, 449 00:22:34,520 --> 00:22:38,440 Speaker 2: extractive labor practice which deprives people of benefits and dignity, 450 00:22:38,640 --> 00:22:40,800 Speaker 2: paying off academics to help the media gloss over the 451 00:22:40,800 --> 00:22:44,119 Speaker 2: horrors of their platform, and also now having to increase 452 00:22:44,160 --> 00:22:48,479 Speaker 2: prices so that they reached profitability by doing that. That 453 00:22:48,600 --> 00:22:51,159 Speaker 2: isn't something that's going to happen with generative AI. Just, 454 00:22:51,880 --> 00:23:08,840 Speaker 2: the costs are too high. They're way too high. But anyway, 455 00:23:09,240 --> 00:23:14,840 Speaker 2: what is essential about generative AI?
What exactly, and be specific, 456 00:23:15,040 --> 00:23:18,679 Speaker 2: is the essential experience of generative AI? What are we... 457 00:23:18,920 --> 00:23:24,919 Speaker 2: if ChatGPT disappeared tomorrow, what actually disappears? And on 458 00:23:24,960 --> 00:23:28,240 Speaker 2: an enterprise or governmental level, what exactly are these tools 459 00:23:28,320 --> 00:23:31,480 Speaker 2: doing for governments that would make removing them so painful? 460 00:23:31,640 --> 00:23:34,760 Speaker 2: What use cases, what outcomes? If your answer here is 461 00:23:34,800 --> 00:23:36,639 Speaker 2: to say, well, they're putting it in and they're choosing, 462 00:23:36,680 --> 00:23:40,640 Speaker 2: they're choosing which people to cut out of benefits, please, goddamn, 463 00:23:40,920 --> 00:23:43,280 Speaker 2: this is what they want you to do. They want 464 00:23:43,320 --> 00:23:46,680 Speaker 2: you to be scared so they can feel powerful. They're 465 00:23:46,680 --> 00:23:48,760 Speaker 2: not doing that. You notice that we get all these 466 00:23:48,760 --> 00:23:51,720 Speaker 2: horrible stories, by the way, of internal government things shoving 467 00:23:51,720 --> 00:23:55,159 Speaker 2: stuff into LLMs. You know what we don't get? Another 468 00:23:55,240 --> 00:23:57,320 Speaker 2: thing we don't get: oh, and then... It's just, 469 00:23:57,359 --> 00:24:00,840 Speaker 2: they're doing this scary, bad thing that they shouldn't be. 470 00:24:00,840 --> 00:24:04,280 Speaker 2: They shouldn't be putting people's private information into... anyway, I'm rambling.
471 00:24:04,600 --> 00:24:07,199 Speaker 2: Uber's essential nature is that millions of people use it 472 00:24:07,359 --> 00:24:10,680 Speaker 2: in place of regular taxis, and it effectively replaced decrepit, 473 00:24:10,760 --> 00:24:13,679 Speaker 2: exploitative systems like the yellow cab medallions in 474 00:24:13,680 --> 00:24:16,760 Speaker 2: New York with its own tech-enabled exploitation system that 475 00:24:16,920 --> 00:24:20,560 Speaker 2: nevertheless worked far better for the user. Okay, I also 476 00:24:20,560 --> 00:24:22,240 Speaker 2: want to do a side note just to acknowledge that 477 00:24:22,800 --> 00:24:26,399 Speaker 2: the disruption from Uber brought something to the medallion system 478 00:24:26,440 --> 00:24:30,240 Speaker 2: that was genuinely horrendous. The consequences were horrifying for the 479 00:24:30,240 --> 00:24:32,560 Speaker 2: owners of the medallions, some of whom had paid more 480 00:24:32,560 --> 00:24:34,919 Speaker 2: than a million dollars for the privilege of driving a 481 00:24:34,960 --> 00:24:37,400 Speaker 2: New York cab and were burdened under mountains of debt. 482 00:24:37,680 --> 00:24:41,280 Speaker 2: That system is so fucking evil. I think it's horrifying, 483 00:24:41,520 --> 00:24:44,240 Speaker 2: and I think the payday loan people involved should all 484 00:24:44,280 --> 00:24:47,520 Speaker 2: be in fucking prison, worst scum of the world. The 485 00:24:47,560 --> 00:24:49,600 Speaker 2: people who are taking advantage of people who come to this 486 00:24:49,640 --> 00:24:51,600 Speaker 2: country to drive a fucking cab that they have to 487 00:24:51,960 --> 00:24:55,639 Speaker 2: take out massive loans to buy. That is evil. Uber 488 00:24:55,680 --> 00:24:58,399 Speaker 2: is also, just to be clear, but that also is... 489 00:24:58,480 --> 00:25:02,840 Speaker 2: That's the point I'm trying to make. You should feel sorry 490 00:25:02,920 --> 00:25:06,199 Speaker 2: for the victims of that system.
That system was a 491 00:25:06,280 --> 00:25:10,640 Speaker 2: kind of corruption unto itself. Anyway, getting back to the thing, 492 00:25:10,680 --> 00:25:12,760 Speaker 2: because, I don't know, I actually feel a 493 00:25:12,760 --> 00:25:14,919 Speaker 2: lot for the people who are the victims of the 494 00:25:14,920 --> 00:25:17,760 Speaker 2: medallion system. It's fucking rough, and every time I think 495 00:25:17,800 --> 00:25:20,760 Speaker 2: of it, I feel very sad inside. But let's get 496 00:25:20,800 --> 00:25:22,320 Speaker 2: back to the episode. I don't want to think about 497 00:25:22,359 --> 00:25:25,919 Speaker 2: it any longer. There really are no essential use cases 498 00:25:25,960 --> 00:25:29,359 Speaker 2: for ChatGPT, or really any gen AI system. You cannot 499 00:25:29,359 --> 00:25:31,280 Speaker 2: point to one use case that is anywhere near as 500 00:25:31,280 --> 00:25:34,560 Speaker 2: necessary as cabs in cities, and indeed the biggest use cases, 501 00:25:34,600 --> 00:25:37,399 Speaker 2: things like brainstorming and search, are either easily replaced by 502 00:25:37,480 --> 00:25:39,919 Speaker 2: any other commoditized LLM or already exist, in the 503 00:25:39,920 --> 00:25:44,440 Speaker 2: case of Google Search. Now let's do another booster quip: 504 00:25:45,200 --> 00:25:47,920 Speaker 2: data centers are important economic growth vehicles and are now helping 505 00:25:48,000 --> 00:25:51,480 Speaker 2: drive innovation and jobs throughout America. Having data centers promotes innovation, 506 00:25:51,600 --> 00:25:54,960 Speaker 2: making OpenAI and AI data centers essential. And the 507 00:25:55,000 --> 00:25:58,119 Speaker 2: answer there is no. No. Sorry, this is a 508 00:25:58,160 --> 00:26:00,560 Speaker 2: really simple one.
These data centers are not in and 509 00:26:00,600 --> 00:26:03,959 Speaker 2: of themselves driving much economic growth other than the costs 510 00:26:03,960 --> 00:26:07,600 Speaker 2: of building them, which I went into last episode. As 511 00:26:07,600 --> 00:26:10,280 Speaker 2: I've discussed again and again, there's maybe forty billion dollars 512 00:26:10,320 --> 00:26:12,720 Speaker 2: in revenue and no profit coming out of AI companies. 513 00:26:12,840 --> 00:26:15,240 Speaker 2: There isn't any economic growth. They're not holding up anything 514 00:26:15,480 --> 00:26:19,640 Speaker 2: other than the massive, massive infrastructure built to make them 515 00:26:19,800 --> 00:26:23,600 Speaker 2: make no money and lose billions. There's no great loss 516 00:26:23,600 --> 00:26:25,960 Speaker 2: associated with the death of large language models or the 517 00:26:26,119 --> 00:26:28,920 Speaker 2: death of this era. Taking away Uber would be genuinely 518 00:26:28,960 --> 00:26:32,720 Speaker 2: catastrophic for some people's ability to get places and people's jobs, 519 00:26:32,760 --> 00:26:37,560 Speaker 2: even if they are horrifyingly underpaid. But here's another booster quip: 520 00:26:37,720 --> 00:26:40,320 Speaker 2: Uber burned a lot of money, twenty five billion dollars 521 00:26:40,400 --> 00:26:43,440 Speaker 2: or more, to get where it is today. Ooh, Mister Zitron, 522 00:26:43,720 --> 00:26:46,480 Speaker 2: Mister Zitron, you're dead. And my response is that OpenAI 523 00:26:46,520 --> 00:26:49,080 Speaker 2: and Anthropic have both separately burned more than four 524 00:26:49,119 --> 00:26:51,240 Speaker 2: times as much money since the beginning of twenty twenty 525 00:26:51,240 --> 00:26:54,159 Speaker 2: four as Uber did in its entire existence.
So the 526 00:26:54,240 --> 00:26:57,280 Speaker 2: classic and wrong argument about OpenAI and companies like 527 00:26:57,320 --> 00:26:59,400 Speaker 2: OpenAI is that Uber burned a bunch of money and 528 00:26:59,440 --> 00:27:03,080 Speaker 2: is now cash flow positive or profitable. I want to 529 00:27:03,080 --> 00:27:06,000 Speaker 2: be clear that Uber's costs are nothing like large language models, 530 00:27:06,000 --> 00:27:09,240 Speaker 2: and making this comparison is ridiculous and desperate. But let's 531 00:27:09,240 --> 00:27:11,320 Speaker 2: talk about raw losses, shall we, and where people are 532 00:27:11,320 --> 00:27:14,440 Speaker 2: making this assumption. So Uber lost twenty four point nine 533 00:27:14,560 --> 00:27:16,480 Speaker 2: billion dollars in the space of four years, from twenty 534 00:27:16,600 --> 00:27:18,679 Speaker 2: nineteen to twenty twenty two, in part because of the 535 00:27:18,680 --> 00:27:20,800 Speaker 2: billions it was spending on sales and marketing and R 536 00:27:20,840 --> 00:27:22,960 Speaker 2: and D, four point six billion dollars and four point 537 00:27:23,040 --> 00:27:26,720 Speaker 2: eight billion dollars respectively in twenty nineteen alone. It also 538 00:27:27,000 --> 00:27:29,840 Speaker 2: massively subsidized the cost of rides, which is why prices 539 00:27:29,880 --> 00:27:33,119 Speaker 2: had to increase, and spent heavily on driver recruitment, burning 540 00:27:33,119 --> 00:27:35,880 Speaker 2: cash to get scale, you know, the classic Silicon Valley way. 541 00:27:36,480 --> 00:27:40,200 Speaker 2: This is absolutely nothing like how large language models are growing. 542 00:27:40,200 --> 00:27:42,840 Speaker 2: And I'm tired of defending this point, but defend it I 543 00:27:42,920 --> 00:27:46,800 Speaker 2: shall. OpenAI and Anthropic burn money primarily through compute 544 00:27:46,800 --> 00:27:50,119 Speaker 2: costs and specialized talent.
These costs are increasing, especially with 545 00:27:50,160 --> 00:27:52,399 Speaker 2: the rush to hire every single AI scientist at the 546 00:27:52,440 --> 00:27:56,680 Speaker 2: most expensive price possible. There are also essential, immovable costs 547 00:27:56,760 --> 00:28:00,280 Speaker 2: that neither OpenAI nor Anthropic have to shoulder: the 548 00:28:00,320 --> 00:28:02,800 Speaker 2: construction of the data centers necessary to train and run 549 00:28:02,800 --> 00:28:05,159 Speaker 2: inference for their models, and of course the GPUs 550 00:28:05,240 --> 00:28:08,000 Speaker 2: inside them, which I will get to in a little bit. Yes, 551 00:28:08,200 --> 00:28:10,919 Speaker 2: Uber raised thirty three point five billion dollars through multiple 552 00:28:11,000 --> 00:28:13,840 Speaker 2: rounds of post-IPO debt, though it raised about twenty 553 00:28:13,840 --> 00:28:17,040 Speaker 2: five billion dollars in actual funding. Yes, Uber burned an 554 00:28:17,040 --> 00:28:19,760 Speaker 2: absolute ass ton of money. Yes, Uber has scale, but 555 00:28:19,880 --> 00:28:21,680 Speaker 2: Uber has not burned money as a means of making 556 00:28:21,680 --> 00:28:25,400 Speaker 2: its product functional or useful. Uber worked immediately. I mean, 557 00:28:25,840 --> 00:28:27,879 Speaker 2: it was twenty twelve, I think, I used it for the 558 00:28:27,920 --> 00:28:30,119 Speaker 2: first time. Maybe earlier. No, no, it would have been 559 00:28:30,119 --> 00:28:33,760 Speaker 2: twenty ten. It worked immediately. You used it, you're like, wow, this, 560 00:28:34,040 --> 00:28:35,919 Speaker 2: I can just put in my address. I don't have 561 00:28:36,040 --> 00:28:38,320 Speaker 2: to say my address three times because I have a 562 00:28:38,320 --> 00:28:41,480 Speaker 2: British accent and nobody can fucking understand me sometimes. You can, 563 00:28:41,560 --> 00:28:46,320 Speaker 2: though. You're special.
Yeah, it was really obvious that it worked, 564 00:28:46,520 --> 00:28:49,080 Speaker 2: and also the costs associated with Uber and its capital 565 00:28:49,080 --> 00:28:52,120 Speaker 2: expenditures from twenty nineteen through twenty twenty four were around 566 00:28:52,240 --> 00:28:54,640 Speaker 2: two point two billion dollars, by the way, minuscule 567 00:28:54,680 --> 00:28:57,880 Speaker 2: compared to the actual real costs of OpenAI and Anthropic. 568 00:28:58,520 --> 00:29:01,520 Speaker 2: Both OpenAI and Anthropic lost around five billion dollars each 569 00:29:01,520 --> 00:29:04,560 Speaker 2: in twenty twenty four, but their infrastructure was entirely paid 570 00:29:04,560 --> 00:29:07,480 Speaker 2: for by either Microsoft, Google, or Amazon. And by which 571 00:29:07,520 --> 00:29:09,640 Speaker 2: I mean the building of it and the expansion therein. 572 00:29:09,640 --> 00:29:12,800 Speaker 2: While we don't know how much of this infrastructure 573 00:29:12,840 --> 00:29:16,240 Speaker 2: is specifically for OpenAI or Anthropic, as the largest 574 00:29:16,280 --> 00:29:18,760 Speaker 2: model developers, it's fair to assume that a large chunk, 575 00:29:18,760 --> 00:29:21,840 Speaker 2: at least thirty percent, of Amazon and Microsoft's capital expenditures 576 00:29:21,880 --> 00:29:24,880 Speaker 2: have been to support these loads. Great sentence to cut 577 00:29:24,920 --> 00:29:27,520 Speaker 2: and listen to again. I also leave out Google, as 578 00:29:27,520 --> 00:29:30,840 Speaker 2: it's unclear whether it's expanded its infrastructure for Anthropic, but 579 00:29:30,880 --> 00:29:33,600 Speaker 2: we know Amazon has done so. As a result, the 580 00:29:33,600 --> 00:29:35,880 Speaker 2: true cost of OpenAI and Anthropic is at least 581 00:29:35,920 --> 00:29:39,120 Speaker 2: ten times what Uber burned.
Amazon spent eighty three billion dollars 582 00:29:39,160 --> 00:29:41,680 Speaker 2: in capital expenditures in twenty twenty four and expects one 583 00:29:41,760 --> 00:29:43,840 Speaker 2: hundred and five billion dollars of the fuckers in twenty 584 00:29:43,840 --> 00:29:47,160 Speaker 2: twenty five. Microsoft spent fifty five point six billion dollars 585 00:29:47,200 --> 00:29:49,400 Speaker 2: in twenty twenty four and expects to spend eighty billion 586 00:29:49,400 --> 00:29:52,080 Speaker 2: dollars this year. I'm actually confident most of that is 587 00:29:52,120 --> 00:29:55,760 Speaker 2: OpenAI, but based on my conservative calculations, the true 588 00:29:55,760 --> 00:29:58,280 Speaker 2: cost of OpenAI is at least eighty two billion dollars, 589 00:29:58,440 --> 00:30:01,800 Speaker 2: and that only includes capex from twenty twenty four onwards, based 590 00:30:01,840 --> 00:30:04,479 Speaker 2: on thirty percent of Microsoft's capex, as not everything has 591 00:30:04,520 --> 00:30:07,480 Speaker 2: been invested yet in twenty twenty five, and OpenAI 592 00:30:07,880 --> 00:30:11,320 Speaker 2: might not be all of the capex, and also the 593 00:30:11,360 --> 00:30:13,480 Speaker 2: forty one point four billion dollars of funding that OpenAI 594 00:30:13,480 --> 00:30:16,160 Speaker 2: has received so far. The true cost of Anthropic 595 00:30:16,200 --> 00:30:18,320 Speaker 2: is around seventy seven point one billion dollars, and that's 596 00:30:18,360 --> 00:30:21,040 Speaker 2: not including the thirteen billion they just raised, but it 597 00:30:21,040 --> 00:30:23,400 Speaker 2: does include all their previous funding and thirty percent of 598 00:30:23,400 --> 00:30:26,320 Speaker 2: Amazon's capex from the beginning of twenty twenty four.
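For anyone who wants to check the arithmetic behind those totals, here is a minimal sketch. Every input is a figure quoted in this episode, in billions of dollars, and the thirty percent attribution is the episode's own assumption. The last value is only the residual implied by the quoted seventy seven point one billion total; it is not a separately sourced funding figure.

```python
# All figures in billions of USD, as quoted in the episode.
CAPEX_SHARE = 0.30  # the episode's assumed share of hyperscaler capex

microsoft_capex = [55.6, 80.0]   # 2024 actual, 2025 expected
amazon_capex = [83.0, 105.0]     # 2024 actual, 2025 expected
openai_funding = 41.4            # funding received so far, per the episode
anthropic_quoted_total = 77.1    # the episode's stated "true cost"


def attributed_capex(capex_by_year, share=CAPEX_SHARE):
    """Portion of a hyperscaler's capex attributed to one model developer."""
    return share * sum(capex_by_year)


openai_capex_share = attributed_capex(microsoft_capex)
openai_true_cost = openai_capex_share + openai_funding

anthropic_capex_share = attributed_capex(amazon_capex)
# Residual implied by the quoted total; derived here, not stated in the episode.
anthropic_implied_prior_funding = anthropic_quoted_total - anthropic_capex_share

print(f"OpenAI: {openai_capex_share:.2f} capex share + {openai_funding} funding "
      f"= {openai_true_cost:.2f}")
print(f"Anthropic: {anthropic_capex_share:.2f} capex share, "
      f"implied prior funding {anthropic_implied_prior_funding:.2f}")
```

Under these inputs the OpenAI figure comes to roughly eighty two billion dollars, matching the number above. Changing `CAPEX_SHARE` shows how sensitive both totals are to the thirty percent assumption.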
Now 599 00:30:26,320 --> 00:30:29,840 Speaker 2: these are inexact comparisons, but the classic argument is 600 00:30:29,880 --> 00:30:32,680 Speaker 2: that Uber burned lots of money and worked out okay, 601 00:30:32,760 --> 00:30:35,400 Speaker 2: when in fact the combined capital expenditures from twenty twenty 602 00:30:35,400 --> 00:30:38,120 Speaker 2: four onwards that are necessary to make OpenAI and 603 00:30:38,120 --> 00:30:41,320 Speaker 2: Anthropic work are each, on their own, four times what Uber 604 00:30:41,480 --> 00:30:45,880 Speaker 2: burned in over a decade. I also believe these numbers 605 00:30:45,920 --> 00:30:48,200 Speaker 2: are conservative. There's a good chance that OpenAI and 606 00:30:48,240 --> 00:30:51,920 Speaker 2: Anthropic dominate the capex of Amazon, Google, and Microsoft, in 607 00:30:51,960 --> 00:30:54,120 Speaker 2: part because, what the fuck else are they buying 608 00:30:54,120 --> 00:30:56,920 Speaker 2: all these GPUs for, as their own AI services don't 609 00:30:56,920 --> 00:31:00,720 Speaker 2: appear to be making much money at all? Anyway, to 610 00:31:00,760 --> 00:31:03,360 Speaker 2: put it real simple, AI has burned way more in 611 00:31:03,360 --> 00:31:05,720 Speaker 2: the last two years than Uber burned in ten. Uber 612 00:31:05,760 --> 00:31:07,920 Speaker 2: didn't burn money in the same way, didn't burn much 613 00:31:07,920 --> 00:31:10,840 Speaker 2: in the way of capital expenditures, didn't require massive amounts 614 00:31:10,840 --> 00:31:13,600 Speaker 2: of infrastructure, and isn't remotely the same in any way, 615 00:31:13,640 --> 00:31:15,840 Speaker 2: shape or form, other than that it burned a lot 616 00:31:15,880 --> 00:31:18,160 Speaker 2: of money. And that burning wasn't because it was trying 617 00:31:18,200 --> 00:31:20,520 Speaker 2: to build the core product. It was trying to scale.
618 00:31:20,720 --> 00:31:23,320 Speaker 2: It's all so stupid. And you know what, I'm not 619 00:31:23,400 --> 00:31:27,800 Speaker 2: even done. Our next and final AI booster episode will 620 00:31:27,800 --> 00:31:31,480 Speaker 2: breeze through the dumbest of the dumb arguments, and I'll 621 00:31:31,480 --> 00:31:34,360 Speaker 2: say why I'm finally drawing a line under these arguments 622 00:31:34,400 --> 00:31:36,760 Speaker 2: for real, because it needs to be said. We need 623 00:31:36,800 --> 00:31:41,240 Speaker 2: to say something. I hope you've enjoyed this. See you tomorrow. Godspeed. 624 00:31:49,960 --> 00:31:52,400 Speaker 2: Thank you for listening to Better Offline. The editor and 625 00:31:52,400 --> 00:31:55,600 Speaker 2: composer of the Better Offline theme song is Mattosowski. You 626 00:31:55,600 --> 00:31:57,840 Speaker 2: can check out more of his music and audio projects 627 00:31:58,040 --> 00:32:01,520 Speaker 2: at Mattosowski dot com, M A T T O S 628 00:32:01,600 --> 00:32:05,640 Speaker 2: O W S K I dot com. You can email me 629 00:32:05,680 --> 00:32:08,280 Speaker 2: at ez at betteroffline dot com or visit Better 630 00:32:08,320 --> 00:32:10,760 Speaker 2: Offline dot com to find more podcast links and, of course, 631 00:32:10,800 --> 00:32:13,920 Speaker 2: my newsletter. I also really recommend you go to chat 632 00:32:13,960 --> 00:32:16,600 Speaker 2: dot wheresyoured dot at to visit the Discord, and 633 00:32:16,640 --> 00:32:19,360 Speaker 2: go to r slash BetterOffline to check out our Reddit. 634 00:32:20,120 --> 00:32:23,400 Speaker 2: Thank you so much for listening. Better Offline is a 635 00:32:23,400 --> 00:32:26,240 Speaker 2: production of Cool Zone Media.
For more from Cool Zone Media, 636 00:32:26,600 --> 00:32:29,720 Speaker 2: visit our website coolzonemedia dot com, or check us 637 00:32:29,760 --> 00:32:32,760 Speaker 2: out on the iHeartRadio app, Apple Podcasts, or wherever you 638 00:32:32,800 --> 00:32:33,920 Speaker 2: get your podcasts.