1
00:00:02,120 --> 00:00:02,880
Speaker 1: Cool Zone Media.

2
00:00:04,200 --> 00:00:07,000
Speaker 2: Hello and welcome to Better Offline. I'm your host, Ed

3
00:00:07,080 --> 00:00:21,840
Speaker 2: Zitron. This is part two of our three-part

4
00:00:21,880 --> 00:00:25,040
Speaker 2: series on how to argue with an AI booster. When

5
00:00:25,040 --> 00:00:27,440
Speaker 2: we last left off, I'd started talking about some of

6
00:00:27,480 --> 00:00:30,240
Speaker 2: the most common and vacuous talking points used by those

7
00:00:30,240 --> 00:00:32,959
Speaker 2: who defend the generative AI industry and why a lot

8
00:00:32,960 --> 00:00:36,080
Speaker 2: of them are wholly without merit. These are the booster quips,

9
00:00:36,120 --> 00:00:38,680
Speaker 2: assertions that if you don't know much, sound convincing but

10
00:00:38,720 --> 00:00:41,680
Speaker 2: are easily disproven with the right information. And in that

11
00:00:41,800 --> 00:00:44,000
Speaker 2: last episode we addressed the quips that say we're in

12
00:00:44,040 --> 00:00:47,080
Speaker 2: the early days of AI and that people doubted smartphones

13
00:00:47,080 --> 00:00:49,479
Speaker 2: and the internet. Things they didn't do just like they

14
00:00:49,479 --> 00:00:52,880
Speaker 2: did generative AI, which they should do. In the cycle

15
00:00:52,920 --> 00:00:55,360
Speaker 2: of grief, that's the denial stage. Now we're going to

16
00:00:55,400 --> 00:00:58,880
Speaker 2: move on to bargaining. This is just like the dot

17
00:00:58,920 --> 00:01:01,920
Speaker 2: com boom. Even if all of this collapses, the overcapacity will

18
00:01:01,960 --> 00:01:04,200
Speaker 2: be practical for the market like the fiber boom was.

19
00:01:05,040 --> 00:01:07,760
Speaker 2: All right, folks, time for a little history. You know me,

20
00:01:07,840 --> 00:01:10,800
Speaker 2: I love me some history. The fiber boom began after

21
00:01:10,840 --> 00:01:14,520
Speaker 2: the Telecommunications Act of nineteen ninety six deregulated large parts

22
00:01:14,520 --> 00:01:18,920
Speaker 2: of America's communications infrastructure, creating a massive boom, a five

23
00:01:19,000 --> 00:01:25,720
Speaker 2: hundred billion dollar one, to be precise, primarily funded with debt. Obviously,

24
00:01:25,720 --> 00:01:28,400
Speaker 2: we're still using the infrastructure bought during that boom, and

25
00:01:28,480 --> 00:01:30,640
Speaker 2: this fact is used as a defense of the insane

26
00:01:30,720 --> 00:01:35,520
Speaker 2: capex spending surrounding generative AI. High-speed internet is useful, right? Sure.

27
00:01:35,600 --> 00:01:38,480
Speaker 2: But the fiber optic boom period was also defined by

28
00:01:38,480 --> 00:01:43,280
Speaker 2: a gluttony of overinvestment, ridiculous valuations, and genuine, outright fraud.

29
00:01:43,480 --> 00:01:45,560
Speaker 2: In any case, this is not remotely the same thing,

30
00:01:45,560 --> 00:01:47,480
Speaker 2: and anyone making this point needs to learn the very

31
00:01:47,520 --> 00:01:51,520
Speaker 2: fucking basics of technology. Let's get going now. The fiber

32
00:01:51,520 --> 00:01:54,120
Speaker 2: optic cable of this era is mostly owned by a

33
00:01:54,120 --> 00:01:57,360
Speaker 2: few companies. Forty two percent of Nvidia's revenue is from

34
00:01:57,400 --> 00:02:00,440
Speaker 2: the Magnificent Seven, and the companies buying these GPUs are

35
00:02:00,480 --> 00:02:02,360
Speaker 2: for the most part not going to go bust once

36
00:02:02,400 --> 00:02:05,680
Speaker 2: the AI bubble bursts. You can also already get the

37
00:02:05,800 --> 00:02:09,560
Speaker 2: cheap fiber of this era too. Cheap AI GPUs are already here.

38
00:02:09,840 --> 00:02:13,040
Speaker 2: GPUs are depreciating assets, meaning that the good deals are

39
00:02:13,080 --> 00:02:16,640
Speaker 2: already happening. I found an Nvidia A100

40
00:02:16,639 --> 00:02:19,160
Speaker 2: for two or three thousand dollars multiple times on eBay,

41
00:02:19,360 --> 00:02:21,120
Speaker 2: and you can get the H100s, which are

42
00:02:21,160 --> 00:02:23,639
Speaker 2: more powerful, for, well, I think thirty grand, and those

43
00:02:23,680 --> 00:02:27,720
Speaker 2: things go for forty five thousand retail, so not brilliant. AI GPUs

44
00:02:27,760 --> 00:02:29,760
Speaker 2: also do not have a variety of use cases and

45
00:02:29,800 --> 00:02:33,440
Speaker 2: are limited by CUDA, Nvidia's programming libraries and APIs.

46
00:02:33,760 --> 00:02:37,760
Speaker 2: AI GPUs are integrated into applications using this language, CUDA, and

47
00:02:37,800 --> 00:02:41,280
Speaker 2: this is specifically Nvidia's programming language. While there are

48
00:02:41,400 --> 00:02:45,320
Speaker 2: other use cases, scientific simulations, image and video processing, data

49
00:02:45,360 --> 00:02:48,880
Speaker 2: science and analytics, medical imaging, and so on, CUDA is

50
00:02:48,880 --> 00:02:53,720
Speaker 2: not a one-size-fits-all digital panacea. While fiber

51
00:02:53,720 --> 00:02:57,040
Speaker 2: optic cable was, and it was also put everywhere, it

52
00:02:57,200 --> 00:03:00,240
Speaker 2: truly did set up the future. What are these

53
00:03:00,240 --> 00:03:04,679
Speaker 2: GPUs setting up exactly? Also, widespread access to cheaper GPUs

54
00:03:04,720 --> 00:03:08,280
Speaker 2: has already happened, and what new use cases are there?

55
00:03:08,600 --> 00:03:11,520
Speaker 2: What are the new innovative things we can do? As

56
00:03:11,520 --> 00:03:14,440
Speaker 2: a result of the AI bubble, there are now many, many, many, many,

57
00:03:14,440 --> 00:03:17,720
Speaker 2: many different vendors to get access to GPUs. You can

58
00:03:17,760 --> 00:03:20,000
Speaker 2: pay at an hourly rate. Who knows if it's profitable,

59
00:03:20,040 --> 00:03:21,880
Speaker 2: but you can do it, and sometimes you can get

60
00:03:21,880 --> 00:03:23,880
Speaker 2: them for as little as one dollar an hour, which

61
00:03:23,919 --> 00:03:26,640
Speaker 2: is really not good. It definitely isn't making them money.

62
00:03:26,639 --> 00:03:30,520
Speaker 2: But putting the financial collapse aside, while they might be

63
00:03:30,639 --> 00:03:33,840
Speaker 2: cheaper when the AI bubble bursts, does cheaper actually enable

64
00:03:33,840 --> 00:03:36,920
Speaker 2: people to do new stuff? Is cost the problem? Because

65
00:03:36,920 --> 00:03:38,080
Speaker 2: I think the costs are going to go up. But

66
00:03:38,120 --> 00:03:40,440
Speaker 2: even if they weren't going up, what are the things

67
00:03:40,480 --> 00:03:42,520
Speaker 2: that you could do that are new? What is the

68
00:03:42,560 --> 00:03:46,520
Speaker 2: prohibitive cost? No one can actually answer this question because

69
00:03:46,560 --> 00:03:50,080
Speaker 2: the answer isn't fun. GPUs are built to shove massive

70
00:03:50,080 --> 00:03:52,960
Speaker 2: amounts of compute into one specific function, again and again

71
00:03:53,000 --> 00:03:55,560
Speaker 2: and again, like generating the output of a model, which, remember,

72
00:03:55,680 --> 00:03:59,640
Speaker 2: mostly boils down to complex maths. Unlike CPUs, a GPU

73
00:03:59,680 --> 00:04:03,240
Speaker 2: can't easily change tasks or handle many little distinct operations,

74
00:04:03,520 --> 00:04:05,560
Speaker 2: meaning that these things aren't going to be adopted for

75
00:04:05,640 --> 00:04:08,640
Speaker 2: another mass market use case because there probably isn't one.

76
00:04:09,280 --> 00:04:12,800
Speaker 2: In simpler terms, this was not an infrastructure build-out.

77
00:04:13,000 --> 00:04:16,360
Speaker 2: The GPU boom is a heavily centralized, capital expenditure funded

78
00:04:16,400 --> 00:04:18,640
Speaker 2: asset bubble where a bunch of chips will sit in

79
00:04:18,680 --> 00:04:22,560
Speaker 2: warehouses or kind of fallow data centers waiting for somebody

80
00:04:22,560 --> 00:04:24,480
Speaker 2: to make up a use case for them. And if

81
00:04:24,520 --> 00:04:27,000
Speaker 2: an enduring one existed, we'd already have it, because we

82
00:04:27,040 --> 00:04:31,920
Speaker 2: already have all the fucking GPUs. Now here's a really

83
00:04:31,920 --> 00:04:34,359
Speaker 2: big booster quip, and one I have been looking forward to.

84
00:04:34,360 --> 00:04:35,880
Speaker 2: I get a lot of people asking me about this.

85
00:04:36,839 --> 00:04:41,280
Speaker 2: Um, Ed, you're so stupid. Why am I stupid, exactly? Well,

86
00:04:41,320 --> 00:04:44,200
Speaker 2: five really smart guys got together and wrote AI twenty

87
00:04:44,279 --> 00:04:47,320
Speaker 2: twenty seven, which is a very real sounding extrapolation that

88
00:04:47,440 --> 00:04:52,559
Speaker 2: shut the fuck up, shut up, shut up. AI twenty

89
00:04:52,600 --> 00:04:55,440
Speaker 2: twenty seven is fan fiction. If you were scared by this,

90
00:04:55,480 --> 00:04:57,560
Speaker 2: and you're not a booster, you shouldn't feel bad. By

91
00:04:57,560 --> 00:05:00,320
Speaker 2: the way, this was written to scare you. By the way,

92
00:05:00,320 --> 00:05:02,200
Speaker 2: if you don't know what it is I'm talking about,

93
00:05:02,360 --> 00:05:04,880
Speaker 2: you should consider yourself lucky. It's essentially a piece of

94
00:05:04,920 --> 00:05:09,000
Speaker 2: speculative fiction that describes where gen AI companies get fatter models

95
00:05:09,000 --> 00:05:11,400
Speaker 2: that get exponentially better, and the US and China are

96
00:05:11,440 --> 00:05:14,120
Speaker 2: embroiled in an AI arms race. It's really silly.

97
00:05:14,160 --> 00:05:17,000
Speaker 2: It's so very silly, and I call it fan fiction

98
00:05:17,080 --> 00:05:19,680
Speaker 2: because it is. If we're thinking about this in purely

99
00:05:19,720 --> 00:05:22,080
Speaker 2: intellectual terms, it's up there with My Immortal, and no,

100
00:05:22,200 --> 00:05:24,599
Speaker 2: I'm not explaining that. You can Google that one for yourselves.

101
00:05:25,160 --> 00:05:27,240
Speaker 2: It doesn't matter if all the people writing the fan

102
00:05:27,279 --> 00:05:30,080
Speaker 2: fiction are scientists or that they have the right credentials.

103
00:05:30,440 --> 00:05:33,200
Speaker 2: They themselves said that AI twenty twenty seven is a

104
00:05:33,279 --> 00:05:36,960
Speaker 2: guess, an extrapolation, which means guess, with expert feedback, which

105
00:05:37,000 --> 00:05:40,120
Speaker 2: means someone editing your fan fiction, and involves experience at

106
00:05:40,200 --> 00:05:42,240
Speaker 2: OpenAI. There are people that worked on the shows

107
00:05:42,240 --> 00:05:45,479
Speaker 2: they write fan fiction about. We're not even insulting fan fiction.

108
00:05:45,560 --> 00:05:48,520
Speaker 2: By the way, go nuts. You are one

109
00:05:48,600 --> 00:05:53,040
Speaker 2: hundred times more ethically positive than these people. At least

110
00:05:53,040 --> 00:05:56,960
Speaker 2: you admit it's fan fiction. Could Knuckles get pregnant? I'm sure

111
00:05:56,960 --> 00:05:59,200
Speaker 2: somebody's found out. I'm not going to go line by

112
00:05:59,240 --> 00:06:01,160
Speaker 2: line and cut this any more than I'm going to

113
00:06:01,200 --> 00:06:03,839
Speaker 2: go and do a lengthy takedown of someone's erotic Banjo-

114
00:06:03,920 --> 00:06:07,640
Speaker 2: Kazooie story, because both are fictional. The entire premise of

115
00:06:07,640 --> 00:06:10,400
Speaker 2: this nonsense is that at one point someone invents a

116
00:06:10,400 --> 00:06:13,400
Speaker 2: self learning agent that teaches itself stuff, and it does

117
00:06:13,400 --> 00:06:16,520
Speaker 2: a bunch of other stuff requiring a bazillion compute points

118
00:06:17,000 --> 00:06:19,599
Speaker 2: with different agents with different numbers after them. There is

119
00:06:19,640 --> 00:06:21,800
Speaker 2: no proof that this is possible. Nobody has done it,

120
00:06:21,839 --> 00:06:24,600
Speaker 2: and nobody will do it. AI twenty twenty seven was

121
00:06:24,640 --> 00:06:27,120
Speaker 2: written specifically to fool people that want to be fooled,

122
00:06:27,279 --> 00:06:29,440
Speaker 2: with big charts and the right technical terms used to

123
00:06:29,480 --> 00:06:31,400
Speaker 2: lull the credulous into a wet dream and a New

124
00:06:31,480 --> 00:06:33,680
Speaker 2: York Times column where one of the writers folds their

125
00:06:33,720 --> 00:06:36,520
Speaker 2: hands and looks worried. It was also written to scare

126
00:06:36,520 --> 00:06:40,480
Speaker 2: people that are already scared. It makes big, scary proclamations

127
00:06:40,480 --> 00:06:43,000
Speaker 2: with tons of links to stuff that looks really legitimate,

128
00:06:43,080 --> 00:06:45,920
Speaker 2: but when you piece it all together, is literally just

129
00:06:46,000 --> 00:06:50,440
Speaker 2: fan fiction, except really not that endearing. My personal favorite

130
00:06:50,480 --> 00:06:53,200
Speaker 2: part is Mid twenty twenty six: China Wakes Up, which

131
00:06:53,240 --> 00:06:56,520
Speaker 2: involves China's intelligence agencies trying to steal OpenBrain's

132
00:06:56,560 --> 00:06:59,960
Speaker 2: agent, no idea who this could possibly be referring to, please email

133
00:07:00,000 --> 00:07:02,000
Speaker 2: if you can work it out to I don't care

134
00:07:02,080 --> 00:07:05,760
Speaker 2: at business dot org, before the headline of AI Takes

135
00:07:05,839 --> 00:07:08,560
Speaker 2: Some Jobs, after OpenBrain releases a model. Oh God,

136
00:07:08,600 --> 00:07:12,520
Speaker 2: I'm so bored even fucking talking about this now. Sarah

137
00:07:12,600 --> 00:07:15,120
Speaker 2: Lyons puts this well, arguing that AI twenty twenty seven

138
00:07:15,160 --> 00:07:17,680
Speaker 2: and AI in general is no different from the spurious

139
00:07:17,720 --> 00:07:20,200
Speaker 2: spectral evidence used to accuse someone of being a witch

140
00:07:20,280 --> 00:07:23,520
Speaker 2: during the Salem witch trials, and I quote: And the

141
00:07:23,520 --> 00:07:26,320
Speaker 2: evidence is spectral. What is the real evidence in AI

142
00:07:26,320 --> 00:07:29,680
Speaker 2: twenty twenty seven beyond trust us and vibes? People who

143
00:07:29,680 --> 00:07:32,720
Speaker 2: wrote it cite themselves in the piece. Do not demand

144
00:07:32,720 --> 00:07:35,440
Speaker 2: I take this seriously. This is so clearly a marketing

145
00:07:35,960 --> 00:07:38,240
Speaker 2: device to scare people into buying your product before this

146
00:07:38,280 --> 00:07:41,600
Speaker 2: imaginary window closes. Don't call me stupid for not falling

147
00:07:41,640 --> 00:07:44,840
Speaker 2: for your spectral evidence. My whole life, people have been

148
00:07:44,880 --> 00:07:48,200
Speaker 2: saying artificial intelligence is around the corner, and it never arrives.

149
00:07:48,640 --> 00:07:50,680
Speaker 2: I simply do not believe a chatbot will ever be

150
00:07:50,720 --> 00:07:52,720
Speaker 2: more than a chatbot, and until you show me

151
00:07:52,760 --> 00:07:57,040
Speaker 2: it doing that, I will not believe it. Anyway, AI

152
00:07:57,080 --> 00:08:00,480
Speaker 2: twenty twenty seven is fan fiction, nothing more. Just because

153
00:08:00,480 --> 00:08:02,920
Speaker 2: it's full of fancy words and has five different grifters

154
00:08:02,960 --> 00:08:19,400
Speaker 2: on its byline doesn't mean a goddamn thing. Now now, now, now, now, folks,

155
00:08:20,240 --> 00:08:24,120
Speaker 2: we've all been waiting for this moment, and here's the

156
00:08:24,200 --> 00:08:28,239
Speaker 2: ultimate booster quip: the cost of inference is coming down.

157
00:08:28,520 --> 00:08:31,640
Speaker 2: This proves that things are getting cheaper. And here's a

158
00:08:31,640 --> 00:08:34,000
Speaker 2: bonus trick for you before I get to my ben

159
00:08:34,640 --> 00:08:37,640
Speaker 2: Here we go, ask them to explain whether things have

160
00:08:37,720 --> 00:08:40,000
Speaker 2: actually got cheaper, and if they say they have, ask

161
00:08:40,040 --> 00:08:42,880
Speaker 2: them why there are no profitable AI companies. If they

162
00:08:42,920 --> 00:08:45,240
Speaker 2: say they're in the growth stage, ask them why there

163
00:08:45,240 --> 00:08:47,920
Speaker 2: are no profitable AI companies. Again, I'd say it's been

164
00:08:48,000 --> 00:08:50,679
Speaker 2: several years and not got one. At this point they

165
00:08:50,679 --> 00:08:53,640
Speaker 2: should try and kill you. But really, I'm about to

166
00:08:53,679 --> 00:08:55,880
Speaker 2: be petty. I'm about to be petty for a fucking

167
00:08:55,920 --> 00:08:58,960
Speaker 2: reason though. In an interview on a podcast from earlier

168
00:08:58,960 --> 00:09:01,560
Speaker 2: this year that I will not even quote because the

169
00:09:01,679 --> 00:09:04,040
Speaker 2: journalist in question did not back me up and it

170
00:09:04,080 --> 00:09:08,240
Speaker 2: pisses me off, Journalist Casey Newton said the following about

171
00:09:08,240 --> 00:09:08,720
Speaker 2: my work.

172
00:09:09,720 --> 00:09:11,160
Speaker 1: You don't think that that kind of flies in the

173
00:09:11,160 --> 00:09:13,120
Speaker 1: face of Sam Altman saying that we need billions of

174
00:09:13,160 --> 00:09:15,880
Speaker 1: dollars for years. No, not at all. And I think

175
00:09:15,920 --> 00:09:18,080
Speaker 1: that's why it's so important when you're reading about AI

176
00:09:18,240 --> 00:09:20,600
Speaker 1: to read people who actually interview people who work at

177
00:09:20,640 --> 00:09:23,640
Speaker 1: these companies and understand how the technology works. Because the

178
00:09:23,800 --> 00:09:28,000
Speaker 1: entire industry has been on this curve where they are

179
00:09:28,200 --> 00:09:32,440
Speaker 1: trying to find micro innovations that reduce the cost of

180
00:09:32,480 --> 00:09:35,240
Speaker 1: training the models and to reduce the cost of what

181
00:09:35,280 --> 00:09:37,600
Speaker 1: they call inference, which is when you actually enter a query into

182
00:09:37,640 --> 00:09:41,000
Speaker 1: ChatGPT. And if you plotted the curve of

183
00:09:41,280 --> 00:09:44,360
Speaker 1: how the cost has been falling over time, DeepSeek

184
00:09:44,440 --> 00:09:47,520
Speaker 1: is on that curve, right? So everything that DeepSeek

185
00:09:47,559 --> 00:09:50,160
Speaker 1: did, it was expected by the AI labs that someone

186
00:09:50,200 --> 00:09:52,520
Speaker 1: would be able to do. The novelty was just that

187
00:09:52,559 --> 00:09:54,760
Speaker 1: a Chinese company did it. So to say that it

188
00:09:54,920 --> 00:09:58,600
Speaker 1: like, upends expectations of how AI would be built

189
00:09:58,760 --> 00:10:01,440
Speaker 1: is just purely false and the opinion of somebody who

190
00:10:01,440 --> 00:10:02,680
Speaker 1: does not know what he's talking about.

191
00:10:03,280 --> 00:10:06,520
Speaker 2: Newton then says, several octaves higher, which shows you exactly

192
00:10:06,520 --> 00:10:09,360
Speaker 2: how mad he isn't, that he thought what he said

193
00:10:09,480 --> 00:10:12,000
Speaker 2: was very civil, and that there are things that are

194
00:10:12,000 --> 00:10:14,679
Speaker 2: true and there are things that are false, like you

195
00:10:14,720 --> 00:10:17,560
Speaker 2: can choose which ones you want to believe. I'm not

196
00:10:17,600 --> 00:10:20,240
Speaker 2: going to be so civil. Other than the fact that

197
00:10:20,280 --> 00:10:23,959
Speaker 2: Casey refers to micro innovations, the fuck are you talking about?

198
00:10:24,200 --> 00:10:26,640
Speaker 2: and DeepSeek being on a curve that was expected,

199
00:10:27,000 --> 00:10:30,320
Speaker 2: he makes, as many do, two very big mistakes. And personally,

200
00:10:30,360 --> 00:10:34,160
Speaker 2: if I was doing this, I personally would not have

201
00:10:34,280 --> 00:10:37,680
Speaker 2: said these things in a sentence that began with me

202
00:10:37,760 --> 00:10:40,560
Speaker 2: suggesting that I, being Casey Newton in this

203
00:10:40,679 --> 00:10:44,080
Speaker 2: example, knew how the technology works. Now here's the Casey

204
00:10:44,120 --> 00:10:47,160
Speaker 2: Newton quip: inference, which is when you actually enter

205
00:10:47,200 --> 00:10:50,040
Speaker 2: a query into ChatGPT. This statement is false. It's

206
00:10:50,040 --> 00:10:52,760
Speaker 2: not what inference means. Inference, and I've gotten this wrong

207
00:10:52,800 --> 00:10:55,680
Speaker 2: in the past too, I'm being accountable, is everything that

208
00:10:55,760 --> 00:10:58,120
Speaker 2: happens when you put in a prompt to generate an output.

209
00:10:58,400 --> 00:11:02,080
Speaker 2: It's when an AI, based on your prompt, infers meaning. To

210
00:11:02,160 --> 00:11:05,280
Speaker 2: be more specific, and quoting Google: machine learning inference is

211
00:11:05,280 --> 00:11:07,720
Speaker 2: the process of running data points into a machine learning

212
00:11:07,720 --> 00:11:10,960
Speaker 2: model to calculate an output, such as a single numerical score.

213
00:11:11,320 --> 00:11:13,439
Speaker 2: Except that's what these things are bad at. But nevertheless,

214
00:11:13,720 --> 00:11:15,440
Speaker 2: Casey will try and weasel out of this one and

215
00:11:15,480 --> 00:11:18,320
Speaker 2: say this is what he meant. It wasn't. He also said,

216
00:11:18,400 --> 00:11:20,240
Speaker 2: if you plotted the curve of how the cost of

217
00:11:20,280 --> 00:11:24,200
Speaker 2: inference has been falling over time, well that's wrong, Casey,

218
00:11:24,320 --> 00:11:26,320
Speaker 2: that's wrong, my man. The cost of inference has gone

219
00:11:26,360 --> 00:11:28,960
Speaker 2: up over time. Now, Casey, like many people who talk

220
00:11:28,960 --> 00:11:31,600
Speaker 2: about stuff without learning about it first, is likely referring

221
00:11:31,600 --> 00:11:33,320
Speaker 2: to the fact that the price of tokens for some

222
00:11:33,360 --> 00:11:36,240
Speaker 2: models has gone down in some cases. But you know what, folks,

223
00:11:36,320 --> 00:11:38,959
Speaker 2: let's establish some facts about inference. I'm doing the train.

224
00:11:39,320 --> 00:11:41,960
Speaker 2: I'm pulling the big horn on the invisible train. I'm

225
00:11:42,000 --> 00:11:45,000
Speaker 2: cooking now. Inference is a thing that costs money, is

226
00:11:45,120 --> 00:11:47,760
Speaker 2: entirely different to the price of tokens, and conflating the

227
00:11:47,800 --> 00:11:51,000
Speaker 2: two is journalistic malpractice. The cost of inference would be

228
00:11:51,000 --> 00:11:53,720
Speaker 2: the price of running the GPU and the associated architecture.

229
00:11:53,800 --> 00:11:55,800
Speaker 2: Of course, we do not at this point have any

230
00:11:55,840 --> 00:11:59,520
Speaker 2: real insight into it. Token prices are set by the people

231
00:11:59,520 --> 00:12:02,160
Speaker 2: who sell access to the tokens, such as OpenAI

232
00:12:02,200 --> 00:12:05,120
Speaker 2: and Anthropic. For example, OpenAI dropped the price of

233
00:12:05,160 --> 00:12:07,959
Speaker 2: its o3 model's token costs almost immediately after the

234
00:12:08,000 --> 00:12:10,520
Speaker 2: launch of Claude Opus four. Do you think it did

235
00:12:10,559 --> 00:12:12,800
Speaker 2: that because the price of serving the models got cheaper?

236
00:12:13,000 --> 00:12:16,040
Speaker 2: If you do, I don't know how you possibly put

237
00:12:16,080 --> 00:12:19,920
Speaker 2: your trousers on every morning without cutting yourself in half. Now,

238
00:12:19,920 --> 00:12:22,960
Speaker 2: the cost of inference conversation comes from articles that say

239
00:12:23,000 --> 00:12:25,400
Speaker 2: that we now have models that are cheaper that can

240
00:12:25,400 --> 00:12:28,960
Speaker 2: now hit higher benchmark scores. Though the article I'm referring to,

241
00:12:29,000 --> 00:12:31,080
Speaker 2: which will be in the show notes, is from November

242
00:12:31,080 --> 00:12:33,240
Speaker 2: twenty twenty four, and the comparison it makes is between

243
00:12:33,280 --> 00:12:36,280
Speaker 2: GPT three, which is from November twenty twenty one, and

244
00:12:36,400 --> 00:12:40,400
Speaker 2: Llama 3.2 3B, from September twenty twenty four. Now,

245
00:12:40,440 --> 00:12:42,200
Speaker 2: the suggestion is in any case, that the cost of

246
00:12:42,200 --> 00:12:45,040
Speaker 2: inference is going down ten x year over year. The

247
00:12:45,080 --> 00:12:47,600
Speaker 2: problem is, however, that these are raw token costs, not

248
00:12:47,640 --> 00:12:51,199
Speaker 2: actual expressions or evaluations of token burn in a practical setting.

249
00:12:51,720 --> 00:12:54,199
Speaker 2: And I realize that was a bit technical.

250
00:12:54,960 --> 00:12:57,920
Speaker 2: These are just what it costs to do something. It

251
00:12:57,960 --> 00:13:01,120
Speaker 2: doesn't actually tell you how many tokens will be

252
00:13:01,160 --> 00:13:03,640
Speaker 2: burned, or at what volume they will be burned, because that

253
00:13:03,679 --> 00:13:06,800
Speaker 2: would change things. And well, wouldn't you know it, the

254
00:13:06,840 --> 00:13:10,120
Speaker 2: cost of inference actually went up as a result. In

255
00:13:10,160 --> 00:13:12,080
Speaker 2: an excellent blog from Kilo Code, and I did not

256
00:13:12,160 --> 00:13:14,640
Speaker 2: get the chance to find out the pronunciation of this

257
00:13:15,400 --> 00:13:17,319
Speaker 2: second name, so I'm just going to call her. It

258
00:13:17,400 --> 00:13:22,760
Speaker 2: is Ewa S-z-y-s-z-k-a. I am so sorry. I would

259
00:13:22,840 --> 00:13:25,679
Speaker 2: rather spell it out, miss than actually mispronounce it. I

260
00:13:25,720 --> 00:13:29,240
Speaker 2: hate when people say Zitron wrong. Great blog anyway,

261
00:13:29,320 --> 00:13:33,520
Speaker 2: let me quote: application inference costs increased for two reasons.

262
00:13:33,559 --> 00:13:36,600
Speaker 2: The frontier model's cost per token stayed constant, and the

263
00:13:36,679 --> 00:13:40,760
Speaker 2: token consumption per application grew a lot. Token consumption per

264
00:13:40,800 --> 00:13:43,600
Speaker 2: application grew a lot because models allowed for longer context

265
00:13:43,600 --> 00:13:46,880
Speaker 2: windows and bigger suggestions from the models. The combination of

266
00:13:46,920 --> 00:13:49,840
Speaker 2: a steady price per token and more token consumption caused

267
00:13:49,880 --> 00:13:52,880
Speaker 2: app inference costs to grow about ten times over the

268
00:13:52,880 --> 00:13:56,600
Speaker 2: past two years. To explain that in really simple terms,

269
00:13:56,640 --> 00:13:59,440
Speaker 2: while the costs of old models may have decreased, new models,

270
00:13:59,640 --> 00:14:02,760
Speaker 2: which you need to do most things, cost about the same,

271
00:14:02,800 --> 00:14:05,600
Speaker 2: and the reasoning that these new models use does actually

272
00:14:05,600 --> 00:14:09,079
Speaker 2: burn way way more tokens. When these new models reason,

273
00:14:09,160 --> 00:14:11,280
Speaker 2: they break the user's input down and break it into

274
00:14:11,280 --> 00:14:14,360
Speaker 2: component parts, then run inference on each of those parts.

275
00:14:14,600 --> 00:14:16,200
Speaker 2: When you plug an LLM into an AI

276
00:14:16,240 --> 00:14:19,320
Speaker 2: coding environment, it will naturally burn an absolute shit ton

277
00:14:19,360 --> 00:14:21,640
Speaker 2: of tokens, in part because of the large amount of

278
00:14:21,640 --> 00:14:23,800
Speaker 2: information you have to load into the prompt and the

279
00:14:23,840 --> 00:14:25,960
Speaker 2: context window, or the amount of information you can load

280
00:14:26,000 --> 00:14:29,440
Speaker 2: in at once, and in part because generating code is inference

281
00:14:29,520 --> 00:14:31,920
Speaker 2: intensive, and also breaking down all those coding tasks, with

282
00:14:31,960 --> 00:14:34,360
Speaker 2: each of those tasks requiring a coding tool and taking

283
00:14:34,400 --> 00:14:38,200
Speaker 2: a bunch of inference themselves. It's really bad. In fact,

284
00:14:38,240 --> 00:14:40,640
Speaker 2: the inference costs are so severe that Kilo Code says

285
00:14:40,680 --> 00:14:43,160
Speaker 2: that a combination of a steady price per token and

286
00:14:43,200 --> 00:14:46,040
Speaker 2: more token consumption caused app inference costs to grow about

287
00:14:46,040 --> 00:14:49,160
Speaker 2: ten x over the last two years. I'm repeating myself.

288
00:14:49,200 --> 00:14:51,520
Speaker 2: I realize, but I really need you to get one thing,

289
00:14:51,760 --> 00:14:53,960
Speaker 2: which is that the cost of inference went up. But

290
00:14:54,120 --> 00:14:56,600
Speaker 2: I'm not done. I refuse to let this point go

291
00:14:56,800 --> 00:14:58,760
Speaker 2: because people love to say the cost of inference is

292
00:14:58,800 --> 00:15:01,400
Speaker 2: going down when the cost of inference has increased, and

293
00:15:01,440 --> 00:15:04,240
Speaker 2: they do so to a national audience, all while suggesting

294
00:15:04,320 --> 00:15:07,880
Speaker 2: I'm wrong somehow and acting superior. I don't like being

295
00:15:07,920 --> 00:15:10,680
Speaker 2: made to feel this way. I don't think it's nice

296
00:15:10,680 --> 00:15:13,360
Speaker 2: to do this to people. And if you're gonna do it,

297
00:15:13,440 --> 00:15:15,720
Speaker 2: if you have the temerity to call someone out directly,

298
00:15:15,840 --> 00:15:20,160
Speaker 2: at least be fucking right. I'm not wrong, You're wrong.

299
00:15:20,600 --> 00:15:24,240
Speaker 2: In fact, software developer influencer Theo Browne recently put out

300
00:15:24,240 --> 00:15:26,960
Speaker 2: a video called "I was wrong about AI costs (they

301
00:15:27,040 --> 00:15:30,240
Speaker 2: keep going up)," which he breaks down as follows: reasoning

302
00:15:30,240 --> 00:15:34,000
Speaker 2: models are significantly increasing the amount of output tokens being generated.

303
00:15:34,320 --> 00:15:37,760
Speaker 2: These tokens are also more expensive. In one example, Browne

304
00:15:37,840 --> 00:15:41,080
Speaker 2: finds that Grok 4's reasoning mode uses six hundred and three

305
00:15:41,120 --> 00:15:45,760
Speaker 2: tokens to generate two words. This was a problem across

306
00:15:45,800 --> 00:15:48,720
Speaker 2: every single reasoning model, as even cheap reasoning models would

307
00:15:48,760 --> 00:15:51,600
Speaker 2: do the same thing. As a result, tasks are taking

308
00:15:51,680 --> 00:15:55,240
Speaker 2: longer and burning more tokens. Another writer called Ethan Ding

309
00:15:55,280 --> 00:15:57,760
Speaker 2: noted a few months ago that reasoning models burn so

310
00:15:57,800 --> 00:16:00,680
Speaker 2: many tokens that there is no flat subscription price that

311
00:16:00,720 --> 00:16:03,200
Speaker 2: works in this new world. As the number of tokens

312
00:16:03,240 --> 00:16:06,920
Speaker 2: they consume went absolutely nuclear. The price drops have

313
00:16:07,000 --> 00:16:09,920
Speaker 2: also for the most part stopped. You cannot at this

314
00:16:10,040 --> 00:16:12,560
Speaker 2: point fairly evaluate whether a model is cheaper just based

315
00:16:12,600 --> 00:16:15,640
Speaker 2: on its cost per token, because reasoning models inherently burn

316
00:16:15,880 --> 00:16:19,080
Speaker 2: and are built to inherently burn more tokens to create

317
00:16:19,120 --> 00:16:21,560
Speaker 2: an output. Reasoning models are also the only way that

318
00:16:21,600 --> 00:16:23,840
Speaker 2: model developers have been able to improve the efficacy of

319
00:16:23,880 --> 00:16:26,640
Speaker 2: new models, using something called test-time compute to burn

320
00:16:26,680 --> 00:16:30,080
Speaker 2: extra tokens to complete a task, and in basically anything

321
00:16:30,120 --> 00:16:31,800
Speaker 2: you're using today, there's going to be some sort of

322
00:16:31,880 --> 00:16:35,360
Speaker 2: reasoning model, especially if you're coding. The cost of inference

323
00:16:35,360 --> 00:16:38,800
Speaker 2: has gone up. Statements otherwise are purely false and are

324
00:16:38,840 --> 00:16:41,000
Speaker 2: the opinion of somebody who does not know what he's

325
00:16:41,040 --> 00:16:44,240
Speaker 2: talking about. But you ask, could the costs of inference

326
00:16:44,280 --> 00:16:49,000
Speaker 2: go down? Maybe. It sure isn't trending that way, nor

327
00:16:49,040 --> 00:16:51,560
Speaker 2: has it gone down yet. I also predict that there's

328
00:16:51,560 --> 00:16:53,440
Speaker 2: going to be some sort of sudden realization in the

329
00:16:53,440 --> 00:16:55,720
Speaker 2: media that inference is going up, which is kind of

330
00:16:55,720 --> 00:16:58,960
Speaker 2: already started. The Information had a piece on it in

331
00:16:59,040 --> 00:17:01,480
Speaker 2: late August where they note that Intuit paid twenty

332
00:17:01,480 --> 00:17:03,880
Speaker 2: million dollars to Azure last year, primarily to access

333
00:17:03,920 --> 00:17:06,160
Speaker 2: OpenAI's models, and it's on track to spend thirty

334
00:17:06,200 --> 00:17:08,720
Speaker 2: million this year, which outpaces the company's revenue growth in

335
00:17:08,760 --> 00:17:11,800
Speaker 2: the same period, raising questions about how sustainable the spending

336
00:17:11,920 --> 00:17:13,560
Speaker 2: is and how much of the cost it can pass

337
00:17:13,560 --> 00:17:16,320
Speaker 2: along to customers. Christopher Mims in The Wall Street Journal

338
00:17:16,359 --> 00:17:18,359
Speaker 2: also had a piece about the costs going up. Do

339
00:17:18,520 --> 00:17:21,040
Speaker 2: not be mad at Chris. Chris and I chatted before

340
00:17:21,080 --> 00:17:24,040
Speaker 2: he submitted that piece, like he literally on Bluesky

341
00:17:24,080 --> 00:17:26,360
Speaker 2: called me out. It fucking rocks. By the way, big

342
00:17:26,440 --> 00:17:28,600
Speaker 2: up to Chris Mims because it's nice to see the

343
00:17:28,640 --> 00:17:31,639
Speaker 2: mainstream media actually engaging with these things, even though it's

344
00:17:31,720 --> 00:17:34,600
Speaker 2: dangerous to the bubble. But you know what, the truth

345
00:17:34,680 --> 00:17:37,040
Speaker 2: must win out, and the problem here is that the

346
00:17:37,160 --> 00:17:41,600
Speaker 2: architecture underlying large language models is inherently unreliable. I imagine

347
00:17:41,600 --> 00:17:44,520
Speaker 2: OpenAI's introduction of the router to ChatGPT 5 was

348
00:17:44,560 --> 00:17:46,359
Speaker 2: an attempt to moderate both the costs of the model

349
00:17:46,440 --> 00:17:49,320
Speaker 2: chosen and reduce the amount of exposure to reasoning models

350
00:17:49,320 --> 00:17:52,520
Speaker 2: for simple queries, though Sam Altman was boasting on August

351
00:17:52,520 --> 00:17:54,880
Speaker 2: tenth about the significant increase in both free and paid

352
00:17:54,960 --> 00:17:58,000
Speaker 2: users' exposure to reasoning models. They don't teach you this

353
00:17:58,119 --> 00:18:01,640
Speaker 2: in business school. Still, a study written up by VentureBeat

354
00:18:01,680 --> 00:18:04,040
Speaker 2: found that open-weight models burn between one point five

355
00:18:04,080 --> 00:18:06,119
Speaker 2: to four times more tokens, in part due to a

356
00:18:06,200 --> 00:18:08,879
Speaker 2: lack of token efficiency and in part thanks to, you

357
00:18:09,040 --> 00:18:13,440
Speaker 2: guessed it, reasoning models. I quote: the findings challenge a

358
00:18:13,480 --> 00:18:16,560
Speaker 2: prevailing assumption in the AI industry that open source models

359
00:18:16,560 --> 00:18:20,520
Speaker 2: offer clear economic advantages over proprietary alternatives. While open

360
00:18:20,520 --> 00:18:23,000
Speaker 2: source models typically cost less per token to run, the

361
00:18:23,000 --> 00:18:25,520
Speaker 2: study suggests that this advantage could be, and I quote

362
00:18:25,560 --> 00:18:28,280
Speaker 2: the study, easily offset if they require more tokens to

363
00:18:28,320 --> 00:18:31,560
Speaker 2: reason about a given problem. And models keep getting bigger

364
00:18:31,560 --> 00:18:36,399
Speaker 2: and more expensive too. So why did this happen? Well,

365
00:18:36,520 --> 00:18:39,359
Speaker 2: it's because model developers hit a wall of diminishing returns

366
00:18:39,400 --> 00:18:41,159
Speaker 2: and the only way to make models do more was

367
00:18:41,200 --> 00:18:43,080
Speaker 2: to make them burn more tokens to generate a more

368
00:18:43,119 --> 00:18:46,560
Speaker 2: accurate response, which is a very simple way of describing

369
00:18:46,600 --> 00:18:49,160
Speaker 2: reasoning, a thing that OpenAI launched in September twenty

370
00:18:49,200 --> 00:18:52,120
Speaker 2: twenty four, and others followed. As a result, all the

371
00:18:52,160 --> 00:18:55,040
Speaker 2: gains from powerful new models come from burning more and

372
00:18:55,119 --> 00:18:57,639
Speaker 2: more tokens. The cost per million token number is no

373
00:18:57,720 --> 00:18:59,840
Speaker 2: longer an accurate measure of the actual cost of generative

374
00:18:59,880 --> 00:19:02,720
Speaker 2: AI because it's much, much, much, much harder to tell

375
00:19:02,720 --> 00:19:04,920
Speaker 2: how many tokens a reasoning model may burn, and it

376
00:19:05,040 --> 00:19:08,399
Speaker 2: varies, as Theo Boi... Theo Boying. I'm keeping that,

377
00:19:08,480 --> 00:19:11,080
Speaker 2: all right. You get the real cuts. As Theo

378
00:19:11,240 --> 00:19:14,840
Speaker 2: Browne noted, from model to model. In any case, there

379
00:19:14,880 --> 00:19:17,600
Speaker 2: really is no changing this path. These companies are out

380
00:19:17,600 --> 00:19:22,679
Speaker 2: of ideas. Now, another one of my favorite ultimate

381
00:19:22,720 --> 00:19:25,120
Speaker 2: booster quips. This is a classic and I still get

382
00:19:25,160 --> 00:19:28,679
Speaker 2: this on social media. I have people yapping in

383
00:19:28,720 --> 00:19:31,919
Speaker 2: my ear saying OpenAI and Anthropic are just like

384
00:19:32,080 --> 00:19:34,840
Speaker 2: Uber because Uber burned twenty five billion dollars over the

385
00:19:34,880 --> 00:19:37,960
Speaker 2: course of fifteen or so years and look, look, Edward,

386
00:19:38,119 --> 00:19:40,399
Speaker 2: they're now profitable. Why are you calling me Edward? Shut up.

387
00:19:40,640 --> 00:19:43,199
Speaker 2: This proves that OpenAI, a totally different company with

388
00:19:43,240 --> 00:19:46,280
Speaker 2: different economics, will be totally fine. So I've heard this

389
00:19:46,400 --> 00:19:48,520
Speaker 2: argument maybe fifty times in the last year, to the

390
00:19:48,520 --> 00:19:49,879
Speaker 2: point that I had to talk about it in my

391
00:19:49,960 --> 00:19:53,160
Speaker 2: piece "How Does OpenAI Survive?", which I also turned

392
00:19:53,160 --> 00:19:55,720
Speaker 2: into a podcast around July twenty twenty four. Go back

393
00:19:55,720 --> 00:19:58,960
Speaker 2: and link a link to it in the piece. Yadda, yadda, yadda. Nevertheless,

394
00:19:58,960 --> 00:20:00,840
Speaker 2: people make a few points about Uber and AI that

395
00:20:00,840 --> 00:20:02,880
Speaker 2: I think are fundamentally incorrect, and I'm going to break

396
00:20:02,920 --> 00:20:05,680
Speaker 2: them down for you now. They claim that AI is

397
00:20:05,720 --> 00:20:08,200
Speaker 2: making itself too big to fail and embedding itself everywhere

398
00:20:08,240 --> 00:20:10,920
Speaker 2: and becoming essential, and none of these things are the case.

399
00:20:11,560 --> 00:20:13,480
Speaker 2: I've heard this argument a lot, by the way, and

400
00:20:13,520 --> 00:20:16,879
Speaker 2: it's one that's both ahistorical and alarmingly ignorant of the

401
00:20:17,040 --> 00:20:21,320
Speaker 2: very basics of society. But Ed, the government! No, no, no, no, no, no,

402
00:20:21,680 --> 00:20:23,960
Speaker 2: you've heard, you've heard. OpenAI got a two hundred million

403
00:20:23,960 --> 00:20:26,720
Speaker 2: dollar defense contract with an estimated completion date of July

404
00:20:26,760 --> 00:20:28,600
Speaker 2: twenty twenty six. And just to be clear, that's up

405
00:20:28,640 --> 00:20:31,120
Speaker 2: to two hundred million dollars, and that they're selling

406
00:20:31,160 --> 00:20:34,120
Speaker 2: ChatGPT Enterprise to the US government for a dollar a year,

407
00:20:34,320 --> 00:20:37,160
Speaker 2: along with Anthropic doing the same thing, and even Google's

408
00:20:37,200 --> 00:20:40,000
Speaker 2: doing it, except they're doing forty cents for a year. Now,

409
00:20:40,000 --> 00:20:42,960
Speaker 2: you're probably hearing this and thinking, ah shit, this means

410
00:20:42,960 --> 00:20:45,080
Speaker 2: the government's paid them. They're never going away. And I

411
00:20:45,160 --> 00:20:47,720
Speaker 2: cannot be clear enough that you believing this is the

412
00:20:47,880 --> 00:20:51,240
Speaker 2: very intention of these deals. They are built specifically to

413
00:20:51,280 --> 00:20:53,359
Speaker 2: make you feel like these things are never going away.

414
00:20:53,640 --> 00:20:56,159
Speaker 2: This is also an attempt to get in with the

415
00:20:56,160 --> 00:20:58,440
Speaker 2: government at a rate that makes trying these models a

416
00:20:58,520 --> 00:21:02,800
Speaker 2: no brainer. At which point I ask: and? That the government

417
00:21:02,880 --> 00:21:05,120
Speaker 2: is going to have cheap access to AI software does

418
00:21:05,119 --> 00:21:08,200
Speaker 2: not mean that the government relies on them. Every member

419
00:21:08,200 --> 00:21:11,199
Speaker 2: of the government having access to ChatGPT, something that

420
00:21:11,320 --> 00:21:14,040
Speaker 2: is not even necessarily the case, does not make this

421
00:21:14,119 --> 00:21:17,200
Speaker 2: software useful, let alone essential. And if OpenAI burns

422
00:21:17,240 --> 00:21:19,600
Speaker 2: a bunch of money making it work for them, it

423
00:21:19,720 --> 00:21:22,240
Speaker 2: still won't be essential because large language models are not

424
00:21:22,280 --> 00:21:25,960
Speaker 2: actually that useful for doing stuff. Now let's talk Uber.

425
00:21:26,359 --> 00:21:29,360
Speaker 2: Uber was and is useful, which eventually made it essential.

426
00:21:30,080 --> 00:21:33,320
Speaker 2: Uber used lobbyist Bradley Tusk to steamroll local governments

427
00:21:33,359 --> 00:21:35,960
Speaker 2: into allowing Uber to operate in their cities, but Tusk

428
00:21:36,040 --> 00:21:38,520
Speaker 2: did not have to convince local governments that Uber was

429
00:21:38,600 --> 00:21:41,440
Speaker 2: useful or have to train people how to use Uber.

430
00:21:42,160 --> 00:21:44,760
Speaker 2: Uber's too big to fail moment was that local cabs

431
00:21:44,840 --> 00:21:48,000
Speaker 2: kind of fucking sucked just about everywhere. You ever try

432
00:21:48,000 --> 00:21:50,760
Speaker 2: and take a yellow cab from downtown Manhattan to Hoboken,

433
00:21:50,800 --> 00:21:53,880
Speaker 2: New Jersey, or Brooklyn or Queens? Did you ever try

434
00:21:53,880 --> 00:21:56,000
Speaker 2: and pay with a credit card? How about trying to

435
00:21:56,000 --> 00:21:58,480
Speaker 2: get a cab outside a major metropolitan area. Do you

436
00:21:58,520 --> 00:22:02,520
Speaker 2: remember how bad it was? It was really awful. I

437
00:22:02,560 --> 00:22:05,560
Speaker 2: don't think people realize or remember how bad it was.

438
00:22:05,760 --> 00:22:08,720
Speaker 2: And I'm not saying that Uber is good. I'm not

439
00:22:08,720 --> 00:22:11,600
Speaker 2: glorifying Uber in any way. But the experience that Uber

440
00:22:11,680 --> 00:22:14,640
Speaker 2: replaced was very, very bad. As a result, Uber did

441
00:22:14,680 --> 00:22:16,840
Speaker 2: become too big to fail because people now rely on

442
00:22:16,880 --> 00:22:19,840
Speaker 2: it because the old system sucked. Uber used its masses

443
00:22:19,880 --> 00:22:22,080
Speaker 2: of venture capital to keep prices low to get people

444
00:22:22,200 --> 00:22:24,880
Speaker 2: used to it too, but the fundamental experience was better

445
00:22:24,920 --> 00:22:27,000
Speaker 2: than calling a cab company and hoping they showed up.

446
00:22:27,520 --> 00:22:28,879
Speaker 2: I also want to be clear that this is not

447
00:22:28,920 --> 00:22:32,080
Speaker 2: me condoning Uber. Take public transport if you can. To

448
00:22:32,119 --> 00:22:34,439
Speaker 2: be clear, Uber has created a new kind of horrifying,

449
00:22:34,520 --> 00:22:38,440
Speaker 2: extractive labor practice which deprives people of benefits and dignity,

450
00:22:38,640 --> 00:22:40,800
Speaker 2: paying off academics to help the media gloss over the

451
00:22:40,800 --> 00:22:44,119
Speaker 2: horrors of their platform, and also now having to increase

452
00:22:44,160 --> 00:22:48,479
Speaker 2: prices so that they reached profitability by doing that. That

453
00:22:48,600 --> 00:22:51,159
Speaker 2: isn't something that's going to happen with generative AI. Just,

454
00:22:51,880 --> 00:23:08,840
Speaker 2: the costs are too high, They're way too high. But anyway,

455
00:23:09,240 --> 00:23:14,840
Speaker 2: what is essential about generative AI? What exactly, and be specific,

456
00:23:15,040 --> 00:23:18,679
Speaker 2: is the essential experience of generative AI? What are we

457
00:23:18,920 --> 00:23:24,919
Speaker 2: if ChatGPT disappeared tomorrow, what actually disappears? And on

458
00:23:24,960 --> 00:23:28,240
Speaker 2: an enterprise or governmental level, what exactly are these tools

459
00:23:28,320 --> 00:23:31,480
Speaker 2: doing for governments that would make removing them so painful?

460
00:23:31,640 --> 00:23:34,760
Speaker 2: What use cases, what outcomes? If your answer here is

461
00:23:34,800 --> 00:23:36,639
Speaker 2: to say, well, they're putting it in and they're choosing,

462
00:23:36,680 --> 00:23:40,640
Speaker 2: they're choosing which people to cut out of benefits, and please, goddamn,

463
00:23:40,920 --> 00:23:43,280
Speaker 2: this is what they want you to do. They want

464
00:23:43,320 --> 00:23:46,680
Speaker 2: you to be scared so they can feel powerful. They're

465
00:23:46,680 --> 00:23:48,760
Speaker 2: not doing that. You notice that we get all these

466
00:23:48,760 --> 00:23:51,720
Speaker 2: horrible stories by the way of internal government things, shoving

467
00:23:51,720 --> 00:23:55,159
Speaker 2: stuff into LLMs. You know what we don't get? Another

468
00:23:55,240 --> 00:23:57,320
Speaker 2: thing we don't get, oh and then have It's just

469
00:23:57,359 --> 00:24:00,840
Speaker 2: they're doing this scary, bad thing that they shouldn't be.

470
00:24:00,840 --> 00:24:04,280
Speaker 2: They shouldn't be putting people's private information into... anyway, I'm rambling.

471
00:24:04,600 --> 00:24:07,199
Speaker 2: Uber's essential nature is that millions of people use it

472
00:24:07,359 --> 00:24:10,680
Speaker 2: in place of regular taxis, and it effectively replaced

473
00:24:10,760 --> 00:24:13,679
Speaker 2: decrepit, exploitative systems like the yellow cab medallions in

474
00:24:13,680 --> 00:24:16,760
Speaker 2: New York with its own tech-enabled exploitation system that

475
00:24:16,920 --> 00:24:20,560
Speaker 2: nevertheless worked far better for the user. Okay, I also

476
00:24:20,560 --> 00:24:22,240
Speaker 2: want to do a side note just to acknowledge that

477
00:24:22,800 --> 00:24:26,399
Speaker 2: the disruption from Uber brought something to the medallion system

478
00:24:26,440 --> 00:24:30,240
Speaker 2: that was genuinely horrendous. The consequences were horrifying for the

479
00:24:30,240 --> 00:24:32,560
Speaker 2: owners of the medallions, some of whom had paid more

480
00:24:32,560 --> 00:24:34,919
Speaker 2: than a million dollars for the privilege of driving a

481
00:24:34,960 --> 00:24:37,400
Speaker 2: New York cab and were burdened under mountains of debt.

482
00:24:37,680 --> 00:24:41,280
Speaker 2: That system is so fucking evil. I think it's horrifying,

483
00:24:41,520 --> 00:24:44,240
Speaker 2: and I think the payday loan people involved should all

484
00:24:44,280 --> 00:24:47,520
Speaker 2: be in fucking prison, worst scum of the world. The

485
00:24:47,560 --> 00:24:49,600
Speaker 2: people who are taking advantage of people who come to this

486
00:24:49,640 --> 00:24:51,600
Speaker 2: country to drive a fucking cab that they have to

487
00:24:51,960 --> 00:24:55,639
Speaker 2: take out massive loans to buy. That is evil. Uber

488
00:24:55,680 --> 00:24:58,399
Speaker 2: is also, just to be clear, but that also is.

489
00:24:58,480 --> 00:25:02,840
Speaker 2: That's the point I'm trying to make. You should feel sorry

490
00:25:02,920 --> 00:25:06,199
Speaker 2: for the victims of that system. That system was a

491
00:25:06,280 --> 00:25:10,640
Speaker 2: kind of corruption unto itself. Anyway, getting back to the thing,

492
00:25:10,680 --> 00:25:12,760
Speaker 2: because I don't know, I feel I actually feel a

493
00:25:12,760 --> 00:25:14,919
Speaker 2: lot for the people who are the victims of the

494
00:25:14,920 --> 00:25:17,760
Speaker 2: medallion system. It's fucking rough, and every time I think

495
00:25:17,800 --> 00:25:20,760
Speaker 2: of it, I feel very sad inside. But let's get

496
00:25:20,800 --> 00:25:22,320
Speaker 2: back to the episode. I don't want to think about

497
00:25:22,359 --> 00:25:25,919
Speaker 2: it any longer. There really are no essential use cases

498
00:25:25,960 --> 00:25:29,359
Speaker 2: for ChatGPT, or really any GenAI system. You cannot

499
00:25:29,359 --> 00:25:31,280
Speaker 2: point to one use case that is anywhere near as

500
00:25:31,280 --> 00:25:34,560
Speaker 2: necessary as cabs in cities. And indeed, the biggest use cases,

501
00:25:34,600 --> 00:25:37,399
Speaker 2: things like brainstorming and search, are either easily replaced by

502
00:25:37,480 --> 00:25:39,919
Speaker 2: any other commoditized LLM or already exist, in the

503
00:25:39,920 --> 00:25:44,440
Speaker 2: case of Google Search. Now let's do another booster quip:

504
00:25:45,200 --> 00:25:47,920
Speaker 2: Data centers are important economic growth vehicles and are now helping

505
00:25:48,000 --> 00:25:51,480
Speaker 2: drive innovation and jobs throughout America. Having data centers promotes innovation,

506
00:25:51,600 --> 00:25:54,960
Speaker 2: making OpenAI and AI data centers essential. And the

507
00:25:55,000 --> 00:25:58,119
Speaker 2: answer to that is no. No. Sorry, this is a

508
00:25:58,160 --> 00:26:00,560
Speaker 2: really simple one. These data centers are not in and

509
00:26:00,600 --> 00:26:03,959
Speaker 2: of themselves driving much economic growth other than the costs

510
00:26:03,960 --> 00:26:07,600
Speaker 2: of building them, which I went into last episode. As

511
00:26:07,600 --> 00:26:10,280
Speaker 2: I've discussed again and again, there's maybe forty billion dollars

512
00:26:10,320 --> 00:26:12,720
Speaker 2: in revenue and no profit coming out of AI companies.

513
00:26:12,840 --> 00:26:15,240
Speaker 2: There isn't any economic growth. They're not holding up anything

514
00:26:15,480 --> 00:26:19,640
Speaker 2: other than the massive, massive infrastructure built to make them

515
00:26:19,800 --> 00:26:23,600
Speaker 2: make no money and lose billions. There's no great loss

516
00:26:23,600 --> 00:26:25,960
Speaker 2: associated with the death of large language models or the

517
00:26:26,119 --> 00:26:28,920
Speaker 2: death of this era. Taking away Uber would be genuinely

518
00:26:28,960 --> 00:26:32,720
Speaker 2: catastrophic for some people's ability to get places and people's jobs,

519
00:26:32,760 --> 00:26:37,560
Speaker 2: even if they are horrifyingly underpaid. But here's another booster quip:

520
00:26:37,720 --> 00:26:40,320
Speaker 2: Uber burned a lot of money twenty five billion dollars

521
00:26:40,400 --> 00:26:43,440
Speaker 2: or more to get where it is today. Ooh, Mister Zitron,

522
00:26:43,720 --> 00:26:46,480
Speaker 2: Mister Zitron, you're dead. And my response is that

523
00:26:46,520 --> 00:26:49,080
Speaker 2: OpenAI and Anthropic have both separately burned more than four

524
00:26:49,119 --> 00:26:51,240
Speaker 2: times as much money since the beginning of twenty twenty

525
00:26:51,240 --> 00:26:54,159
Speaker 2: four as Uber did in its entire existence. So the

526
00:26:54,240 --> 00:26:57,280
Speaker 2: classic and wrong argument about OpenAI and companies like

527
00:26:57,320 --> 00:26:59,400
Speaker 2: OpenAI is that Uber burned a bunch of money and

528
00:26:59,440 --> 00:27:03,080
Speaker 2: is now cash flow positive or profitable. I want to

529
00:27:03,080 --> 00:27:06,000
Speaker 2: be clear that Uber's costs are nothing like large language models,

530
00:27:06,000 --> 00:27:09,240
Speaker 2: and making this comparison is ridiculous and desperate. But let's

531
00:27:09,240 --> 00:27:11,320
Speaker 2: talk about raw losses, shall we, and where people are

532
00:27:11,320 --> 00:27:14,440
Speaker 2: making this assumption. So Uber lost twenty four point nine

533
00:27:14,560 --> 00:27:16,480
Speaker 2: billion dollars in the space of four years from twenty

534
00:27:16,600 --> 00:27:18,679
Speaker 2: nineteen to twenty twenty two, in part because of the

535
00:27:18,680 --> 00:27:20,800
Speaker 2: billions it was spending on sales and marketing and R

536
00:27:20,840 --> 00:27:22,960
Speaker 2: and D, four point six billion dollars and four point

537
00:27:23,040 --> 00:27:26,720
Speaker 2: eight billion dollars respectively in twenty nineteen alone. It also

538
00:27:27,000 --> 00:27:29,840
Speaker 2: massively subsidized the cost of rides, which is why prices

539
00:27:29,880 --> 00:27:33,119
Speaker 2: had to increase, and spent heavily on driver recruitment, burning

540
00:27:33,119 --> 00:27:35,880
Speaker 2: cash to get scale, you know, the classic Silicon Valley way.

541
00:27:36,480 --> 00:27:40,200
Speaker 2: This is absolutely nothing like how large language models are growing.

542
00:27:40,200 --> 00:27:42,840
Speaker 2: And I'm tired of defending this point, but defend it I

543
00:27:42,920 --> 00:27:46,800
Speaker 2: shall. OpenAI and Anthropic burn money primarily through compute

544
00:27:46,800 --> 00:27:50,119
Speaker 2: costs and specialized talent. These costs are increasing, especially with

545
00:27:50,160 --> 00:27:52,399
Speaker 2: the rush to hire every single AI scientist at the

546
00:27:52,440 --> 00:27:56,680
Speaker 2: most expensive price possible. There are also essential immovable costs

547
00:27:56,760 --> 00:28:00,280
Speaker 2: that neither OpenAI nor Anthropic have to shoulder: the

548
00:28:00,320 --> 00:28:02,800
Speaker 2: construction of the data centers necessary to train and run

549
00:28:02,800 --> 00:28:05,159
Speaker 2: inference for their models, and of course the GPUs

550
00:28:05,240 --> 00:28:08,000
Speaker 2: inside them, which I will get to in a little bit. Yes,

551
00:28:08,200 --> 00:28:10,919
Speaker 2: Uber raised thirty three point five billion dollars through multiple

552
00:28:11,000 --> 00:28:13,840
Speaker 2: rounds of post-IPO debt, though it raised about twenty

553
00:28:13,840 --> 00:28:17,040
Speaker 2: five billion dollars in actual funding. Yes, Uber burned an

554
00:28:17,040 --> 00:28:19,760
Speaker 2: absolute ass-ton of money. Yes, Uber has scale, but

555
00:28:19,880 --> 00:28:21,680
Speaker 2: Uber has not burned money as a means of making

556
00:28:21,680 --> 00:28:25,400
Speaker 2: its product functional or useful. Uber worked immediately. I mean

557
00:28:25,840 --> 00:28:27,879
Speaker 2: it was twenty twelve, I think, I used it for the

558
00:28:27,920 --> 00:28:30,119
Speaker 2: first time. Maybe earlier. No, no, it would have been

559
00:28:30,119 --> 00:28:33,760
Speaker 2: twenty ten. It worked immediately. You used it, you're like, wow, this,

560
00:28:34,040 --> 00:28:35,919
Speaker 2: I can just put in my address. I don't have

561
00:28:36,040 --> 00:28:38,320
Speaker 2: to say my address three times because I have a

562
00:28:38,320 --> 00:28:41,480
Speaker 2: British accent and nobody can fucking understand me. Sometimes you can,

563
00:28:41,560 --> 00:28:46,320
Speaker 2: though you're special. Yeah, it was really obvious that it worked,

564
00:28:46,520 --> 00:28:49,080
Speaker 2: and also the costs associated with Uber and its capital

565
00:28:49,080 --> 00:28:52,120
Speaker 2: expenditures from twenty nineteen through twenty twenty four were around

566
00:28:52,240 --> 00:28:54,640
Speaker 2: two point two billion dollars, by the way, minuscule

567
00:28:54,680 --> 00:28:57,880
Speaker 2: compared to the actual real costs of OpenAI and Anthropic.

568
00:28:58,520 --> 00:29:01,520
Speaker 2: Both OpenAI and Anthropic burned around five billion dollars each

569
00:29:01,520 --> 00:29:04,560
Speaker 2: in twenty twenty four, but their infrastructure was entirely paid

570
00:29:04,560 --> 00:29:07,480
Speaker 2: for by either Microsoft, Google, or Amazon. And by which

571
00:29:07,520 --> 00:29:09,640
Speaker 2: I mean the building of it and the expansion

572
00:29:09,640 --> 00:29:12,800
Speaker 2: therein. While we don't know how much of this infrastructure

573
00:29:12,840 --> 00:29:16,240
Speaker 2: is specifically for OpenAI or Anthropic, as the largest

574
00:29:16,280 --> 00:29:18,760
Speaker 2: model developers, it's fair to assume that a large chunk

575
00:29:18,760 --> 00:29:21,840
Speaker 2: (at least thirty percent) of Amazon and Microsoft's capital expenditures

576
00:29:21,880 --> 00:29:24,880
Speaker 2: have been to support these loads. Great sentence to cut

577
00:29:24,920 --> 00:29:27,520
Speaker 2: and listen to again. I also leave out Google, as

578
00:29:27,520 --> 00:29:30,840
Speaker 2: it's unclear whether it's expanded its infrastructure for Anthropic, but

579
00:29:30,880 --> 00:29:33,600
Speaker 2: we know Amazon has done so. As a result, the

580
00:29:33,600 --> 00:29:35,880
Speaker 2: true cost of OpenAI and Anthropic is at least

581
00:29:35,920 --> 00:29:39,120
Speaker 2: ten times what Uber burned. Amazon spent eighty three billion dollars

582
00:29:39,160 --> 00:29:41,680
Speaker 2: in capital expenditures in twenty twenty four and expects one

583
00:29:41,760 --> 00:29:43,840
Speaker 2: hundred and five billion dollars of the fuckers in twenty

584
00:29:43,840 --> 00:29:47,160
Speaker 2: twenty five. Microsoft spent fifty five point six billion dollars

585
00:29:47,200 --> 00:29:49,400
Speaker 2: in twenty twenty four and expects to spend eighty billion

586
00:29:49,400 --> 00:29:52,080
Speaker 2: dollars this year. I'm actually confident most of that is

587
00:29:52,120 --> 00:29:55,760
Speaker 2: OpenAI, but based on my conservative calculations, the true

588
00:29:55,760 --> 00:29:58,280
Speaker 2: cost of OpenAI is at least eighty two billion dollars,

589
00:29:58,440 --> 00:30:01,800
Speaker 2: and that only includes capex from twenty twenty four onwards, based

590
00:30:01,840 --> 00:30:04,479
Speaker 2: on thirty percent of Microsoft's capex, as not everything has

591
00:30:04,520 --> 00:30:07,480
Speaker 2: been invested yet in twenty twenty five, and OpenAI

592
00:30:07,880 --> 00:30:11,320
Speaker 2: might not be all of the capex, and also the

593
00:30:11,360 --> 00:30:13,480
Speaker 2: forty one point four billion dollars of funding that

594
00:30:13,480 --> 00:30:16,160
Speaker 2: OpenAI has received so far. The true cost of Anthropic

595
00:30:16,200 --> 00:30:18,320
Speaker 2: is around seventy seven point one billion dollars, and that's

596
00:30:18,360 --> 00:30:21,040
Speaker 2: not including the thirteen billion they just raised, but it

597
00:30:21,040 --> 00:30:23,400
Speaker 2: does include all their previous funding and thirty percent of

598
00:30:23,400 --> 00:30:26,320
Speaker 2: Amazon's capex from the beginning of twenty twenty four. Now

599
00:30:26,320 --> 00:30:29,840
Speaker 2: these are inexact comparisons, but the classic argument is

600
00:30:29,880 --> 00:30:32,680
Speaker 2: that Uber burned lots of money and worked out okay,

601
00:30:32,760 --> 00:30:35,400
Speaker 2: when in fact the combined capital expenditures from twenty twenty

602
00:30:35,400 --> 00:30:38,120
Speaker 2: four onwards that are necessary to make OpenAI and

603
00:30:38,120 --> 00:30:41,320
Speaker 2: Anthropic work are each, on their own, four times what Uber

604
00:30:41,480 --> 00:30:45,880
Speaker 2: burned in over a decade. I also believe these numbers

605
00:30:45,920 --> 00:30:48,200
Speaker 2: are conservative. There's a good chance that OpenAI and

606
00:30:48,240 --> 00:30:51,920
Speaker 2: Anthropic dominate the capex of Amazon, Google, and Microsoft in

607
00:30:51,960 --> 00:30:54,120
Speaker 2: part because of what the fuck else are they buying

608
00:30:54,120 --> 00:30:56,920
Speaker 2: all these GPUs for as their own AI services don't

609
00:30:56,920 --> 00:31:00,720
Speaker 2: appear to be making much money at all anyway. To

610
00:31:00,760 --> 00:31:03,360
Speaker 2: put it real simple, AI has burned way more in

611
00:31:03,360 --> 00:31:05,720
Speaker 2: the last two years than Uber burned in ten. Uber

612
00:31:05,760 --> 00:31:07,920
Speaker 2: didn't burn money in the same way, didn't burn much

613
00:31:07,920 --> 00:31:10,840
Speaker 2: in the way of capital expenditures, didn't require massive amounts

614
00:31:10,840 --> 00:31:13,600
Speaker 2: of infrastructure, and isn't remotely the same in any way,

615
00:31:13,640 --> 00:31:15,840
Speaker 2: shape or form other than that it burned a lot

616
00:31:15,880 --> 00:31:18,160
Speaker 2: of money. And that burning wasn't because it was trying

617
00:31:18,200 --> 00:31:20,520
Speaker 2: to build the core product. It was trying to scale.

618
00:31:20,720 --> 00:31:23,320
Speaker 2: It's all so stupid. And you know what, I'm not

619
00:31:23,400 --> 00:31:27,800
Speaker 2: even done. Our next and final AI booster episode will

620
00:31:27,800 --> 00:31:31,480
Speaker 2: breeze through the dumbest of the dumb arguments, and I'll

621
00:31:31,480 --> 00:31:34,360
Speaker 2: say why I'm finally drawing a line under these arguments

622
00:31:34,400 --> 00:31:36,760
Speaker 2: for real, because it needs to be said. We need

623
00:31:36,800 --> 00:31:41,240
Speaker 2: to say something. I hope you've enjoyed this, see you tomorrow, godspeed.

624
00:31:49,960 --> 00:31:52,400
Speaker 2: Thank you for listening to Better Offline. The editor and

625
00:31:52,400 --> 00:31:55,600
Speaker 2: composer of the Better Offline theme song is Mattosowski. You

626
00:31:55,600 --> 00:31:57,840
Speaker 2: can check out more of his music and audio projects

627
00:31:58,040 --> 00:32:01,520
Speaker 2: at mattosowski dot com, M A T T O S

628
00:32:01,600 --> 00:32:05,640
Speaker 2: O W S K I dot com. You can email me

629
00:32:05,680 --> 00:32:08,280
Speaker 2: at ez at betteroffline dot com or visit Better

630
00:32:08,320 --> 00:32:10,760
Speaker 2: Offline dot com to find more podcast links and of course,

631
00:32:10,800 --> 00:32:13,920
Speaker 2: my newsletter. I also really recommend you go to chat

632
00:32:13,960 --> 00:32:16,600
Speaker 2: dot wheresyoured dot at to visit the Discord, and

633
00:32:16,640 --> 00:32:19,360
Speaker 2: go to r slash BetterOffline to check out our Reddit.

634
00:32:20,120 --> 00:32:23,400
Speaker 2: Thank you so much for listening. Better Offline is a

635
00:32:23,400 --> 00:32:26,240
Speaker 2: production of Cool Zone Media. For more from Cool Zone Media,

636
00:32:26,600 --> 00:32:29,720
Speaker 2: visit our website coolzonemedia dot com, or check us

637
00:32:29,760 --> 00:32:32,760
Speaker 2: out on the iHeartRadio app, Apple Podcasts, or wherever you

638
00:32:32,800 --> 00:32:33,920
Speaker 2: get your podcasts.