WEBVTT - How To Argue With An AI Booster, Part Two

0:00:02.120 --> 0:00:02.880
<v Speaker 1>Cool Zone Media.

0:00:04.200 --> 0:00:07.000
<v Speaker 2>Hello and welcome to Better Offline. I'm your host Ed

0:00:07.080 --> 0:00:21.840
<v Speaker 2>Zitron. This is part two of our three-part

0:00:21.880 --> 0:00:25.040
<v Speaker 2>series on how to argue with an AI booster. When

0:00:25.040 --> 0:00:27.440
<v Speaker 2>we last left off, I'd started talking about some of

0:00:27.480 --> 0:00:30.240
<v Speaker 2>the most common and vacuous talking points used by those

0:00:30.240 --> 0:00:32.959
<v Speaker 2>who defend the generative AI industry and why a lot

0:00:32.960 --> 0:00:36.080
<v Speaker 2>of them are wholly without merit. These are the booster quips,

0:00:36.120 --> 0:00:38.680
<v Speaker 2>assertions that if you don't know much, sound convincing but

0:00:38.720 --> 0:00:41.680
<v Speaker 2>are easily disproven with the right information. And in that

0:00:41.800 --> 0:00:44.000
<v Speaker 2>last episode we addressed the quips that say we're in

0:00:44.040 --> 0:00:47.080
<v Speaker 2>the early days of AI and that people doubted smartphones

0:00:47.080 --> 0:00:49.479
<v Speaker 2>and the internet, things they didn't do, just like they

0:00:49.479 --> 0:00:52.880
<v Speaker 2>did generative AI, which they should do. In the cycle

0:00:52.920 --> 0:00:55.360
<v Speaker 2>of grief, that's the denial stage. Now we're going to

0:00:55.400 --> 0:00:58.880
<v Speaker 2>move on to bargaining. This is just like the dot

0:00:58.920 --> 0:01:01.920
<v Speaker 2>com boom. Even if all of this collapses, the overcapacity will

0:01:01.960 --> 0:01:04.200
<v Speaker 2>be practical for the market like the fiber boom was.

0:01:05.040 --> 0:01:07.760
<v Speaker 2>All right, folks, time for a little history. You know me,

0:01:07.840 --> 0:01:10.800
<v Speaker 2>I love me some history. The fiber boom began after

0:01:10.840 --> 0:01:14.520
<v Speaker 2>the Telecommunications Act of nineteen ninety six deregulated large parts

0:01:14.520 --> 0:01:18.920
<v Speaker 2>of America's communications infrastructure, creating a massive boom, a five

0:01:19.000 --> 0:01:25.720
<v Speaker 2>hundred billion dollar one, to be precise, primarily funded with debt. Obviously,

0:01:25.720 --> 0:01:28.400
<v Speaker 2>we're still using the infrastructure bought during that boom, and

0:01:28.480 --> 0:01:30.640
<v Speaker 2>this fact is used as a defense of the insane

0:01:30.720 --> 0:01:35.520
<v Speaker 2>capex spending surrounding generative AI. High-speed internet is useful, right? Sure.

0:01:35.600 --> 0:01:38.480
<v Speaker 2>But the fiber optic boom period was also defined by

0:01:38.480 --> 0:01:43.280
<v Speaker 2>a gluttony of overinvestment, ridiculous valuations, and genuine, outright fraud.

0:01:43.480 --> 0:01:45.560
<v Speaker 2>In any case, this is not remotely the same thing,

0:01:45.560 --> 0:01:47.480
<v Speaker 2>and anyone making this point needs to learn the very

0:01:47.520 --> 0:01:51.520
<v Speaker 2>fucking basics of technology. Let's get going now. The fiber

0:01:51.520 --> 0:01:54.120
<v Speaker 2>optic cable of this era is mostly owned by a

0:01:54.120 --> 0:01:57.360
<v Speaker 2>few companies. Forty two percent of Nvidia's revenue is from

0:01:57.400 --> 0:02:00.440
<v Speaker 2>the Magnificent Seven, and the companies buying these GPUs are

0:02:00.480 --> 0:02:02.360
<v Speaker 2>for the most part not going to go bust once

0:02:02.400 --> 0:02:05.680
<v Speaker 2>the AI bubble bursts. You can also already get the

0:02:05.800 --> 0:02:09.560
<v Speaker 2>cheap fiber of this era too. Cheap AI GPUs are already here.

0:02:09.840 --> 0:02:13.040
<v Speaker 2>GPUs are depreciating assets, meaning that the good deals are

0:02:13.080 --> 0:02:16.640
<v Speaker 2>already happening. I found an Nvidia A100

0:02:16.639 --> 0:02:19.160
<v Speaker 2>for two or three thousand dollars multiple times on eBay,

0:02:19.360 --> 0:02:21.120
<v Speaker 2>and you can get the H100s, which are

0:02:21.160 --> 0:02:23.639
<v Speaker 2>more powerful, for, well, I think thirty grand, and those

0:02:23.680 --> 0:02:27.720
<v Speaker 2>things go for forty five thousand retail, so not brilliant. AI GPUs

0:02:27.760 --> 0:02:29.760
<v Speaker 2>also do not have a variety of use cases and

0:02:29.800 --> 0:02:33.440
<v Speaker 2>are limited by CUDA, Nvidia's programming libraries and APIs.

0:02:33.760 --> 0:02:37.760
<v Speaker 2>AI GPUs are integrated into applications using this language, CUDA, and

0:02:37.800 --> 0:02:41.280
<v Speaker 2>this is specifically Nvidia's programming language. While there are

0:02:41.400 --> 0:02:45.320
<v Speaker 2>other use cases, such as scientific simulations, image and video processing, data

0:02:45.360 --> 0:02:48.880
<v Speaker 2>science and analytics, medical imaging, and so on, CUDA is

0:02:48.880 --> 0:02:53.720
<v Speaker 2>not a one-size-fits-all digital panacea, while fiber

0:02:53.720 --> 0:02:57.040
<v Speaker 2>optic cable was, and it was also put everywhere. It

0:02:57.200 --> 0:03:00.240
<v Speaker 2>truly did set up the future. What are these

0:03:00.240 --> 0:03:04.679
<v Speaker 2>GPUs setting up exactly? Also, widespread access to cheaper GPUs

0:03:04.720 --> 0:03:08.280
<v Speaker 2>has already happened, and what new use cases are there?

0:03:08.600 --> 0:03:11.520
<v Speaker 2>What are the new innovative things we can do? As

0:03:11.520 --> 0:03:14.440
<v Speaker 2>a result of the AI bubble, there are now many, many, many, many,

0:03:14.440 --> 0:03:17.720
<v Speaker 2>many different vendors to get access to GPUs. You can

0:03:17.760 --> 0:03:20.000
<v Speaker 2>pay at an hourly rate. Who knows if it's profitable,

0:03:20.040 --> 0:03:21.880
<v Speaker 2>but you can do it, and sometimes you can get

0:03:21.880 --> 0:03:23.880
<v Speaker 2>them for as little as one dollar an hour, which

0:03:23.919 --> 0:03:26.640
<v Speaker 2>is really not good. It definitely isn't making them money.

0:03:26.639 --> 0:03:30.520
<v Speaker 2>But putting the financial collapse aside, while they might be

0:03:30.639 --> 0:03:33.840
<v Speaker 2>cheaper when the AI bubble bursts, does cheaper actually enable

0:03:33.840 --> 0:03:36.920
<v Speaker 2>people to do new stuff? Is cost the problem? Because

0:03:36.920 --> 0:03:38.080
<v Speaker 2>I think the costs are going to go up. But

0:03:38.120 --> 0:03:40.440
<v Speaker 2>even if they weren't going up, what are the things

0:03:40.480 --> 0:03:42.520
<v Speaker 2>that you could do that are new? What is the

0:03:42.560 --> 0:03:46.520
<v Speaker 2>prohibitive cost? No one can actually answer this question because

0:03:46.560 --> 0:03:50.080
<v Speaker 2>the answer isn't fun. GPUs are built to shove massive

0:03:50.080 --> 0:03:52.960
<v Speaker 2>amounts of compute into one specific function, again and again

0:03:53.000 --> 0:03:55.560
<v Speaker 2>and again, like generating the output of a model, which, remember,

0:03:55.680 --> 0:03:59.640
<v Speaker 2>mostly boils down to complex maths. Unlike CPUs, a GPU

0:03:59.680 --> 0:04:03.240
<v Speaker 2>can't easily change tasks or handle many little distinct operations,

0:04:03.520 --> 0:04:05.560
<v Speaker 2>meaning that these things aren't going to be adopted for

0:04:05.640 --> 0:04:08.640
<v Speaker 2>another mass market use case because there probably isn't one.

0:04:09.280 --> 0:04:12.800
<v Speaker 2>In simpler terms, this was not an infrastructure build-out.

0:04:13.000 --> 0:04:16.360
<v Speaker 2>The GPU boom is a heavily centralized, capital expenditure funded

0:04:16.400 --> 0:04:18.640
<v Speaker 2>asset bubble where a bunch of chips will sit in

0:04:18.680 --> 0:04:22.560
<v Speaker 2>warehouses or kind of fallow data centers waiting for somebody

0:04:22.560 --> 0:04:24.480
<v Speaker 2>to make up a use case for them. And if

0:04:24.520 --> 0:04:27.000
<v Speaker 2>an endearing one existed, we'd already have it, because we

0:04:27.040 --> 0:04:31.920
<v Speaker 2>already have all the fucking GPUs. Now here's a really

0:04:31.920 --> 0:04:34.359
<v Speaker 2>big booster quip, and one I have been looking forward to.

0:04:34.360 --> 0:04:35.880
<v Speaker 2>I get a lot of people asking me about this.

0:04:36.839 --> 0:04:41.280
<v Speaker 2>Um, Ed, you're so stupid. Why am I stupid, exactly? Well,

0:04:41.320 --> 0:04:44.200
<v Speaker 2>five really smart guys got together and wrote AI twenty

0:04:44.279 --> 0:04:47.320
<v Speaker 2>twenty seven, which is a very real sounding extrapolation that

0:04:47.440 --> 0:04:52.559
<v Speaker 2>shut the fuck up, shut up, shut up. AI twenty

0:04:52.600 --> 0:04:55.440
<v Speaker 2>twenty seven is fan fiction. If you were scared by this,

0:04:55.480 --> 0:04:57.560
<v Speaker 2>and you're not a booster, you shouldn't feel bad. By

0:04:57.560 --> 0:05:00.320
<v Speaker 2>the way, this was written to scare you. By the way,

0:05:00.320 --> 0:05:02.200
<v Speaker 2>if you don't know what it is I'm talking about,

0:05:02.360 --> 0:05:04.880
<v Speaker 2>you should consider yourself lucky. It's essentially a piece of

0:05:04.920 --> 0:05:09.000
<v Speaker 2>speculative fiction that describes a world where gen AI companies get fatter models

0:05:09.000 --> 0:05:11.400
<v Speaker 2>that get exponentially better, and the US and China are

0:05:11.440 --> 0:05:14.120
<v Speaker 2>embroiled in an AI arms race. It's really silly.

0:05:14.160 --> 0:05:17.000
<v Speaker 2>It's so very silly, and I call it fan fiction

0:05:17.080 --> 0:05:19.680
<v Speaker 2>because it is. If we're thinking about this in purely

0:05:19.720 --> 0:05:22.080
<v Speaker 2>intellectual terms, it's up there with My Immortal, and no,

0:05:22.200 --> 0:05:24.599
<v Speaker 2>I'm not explaining that. You can Google that one for yourselves.

0:05:25.160 --> 0:05:27.240
<v Speaker 2>It doesn't matter if all the people writing the fan

0:05:27.279 --> 0:05:30.080
<v Speaker 2>fiction are scientists or that they have the right credentials.

0:05:30.440 --> 0:05:33.200
<v Speaker 2>They themselves said that AI twenty twenty seven is a

0:05:33.279 --> 0:05:36.960
<v Speaker 2>guess, an extrapolation, which means guess, with expert feedback, which

0:05:37.000 --> 0:05:40.120
<v Speaker 2>means someone editing your fan fiction, and involves experience at

0:05:40.200 --> 0:05:42.240
<v Speaker 2>OpenAI. There are people that worked on the shows

0:05:42.240 --> 0:05:45.479
<v Speaker 2>they write fan fiction about. We're not even insulting fan fiction.

0:05:45.560 --> 0:05:48.520
<v Speaker 2>By the way, go nuts. You're more... you are one

0:05:48.600 --> 0:05:53.040
<v Speaker 2>hundred times more ethically positive than these people. At least

0:05:53.040 --> 0:05:56.960
<v Speaker 2>you admit it's fan fiction. Could Knuckles get pregnant? I'm sure

0:05:56.960 --> 0:05:59.200
<v Speaker 2>somebody's found out. I'm not going to go line by

0:05:59.240 --> 0:06:01.160
<v Speaker 2>line and cut this any more than I'm going to

0:06:01.200 --> 0:06:03.839
<v Speaker 2>go and do a lengthy takedown of someone's erotic

0:06:03.920 --> 0:06:07.640
<v Speaker 2>Banjo-Kazooie story, because both are fictional. The entire premise of

0:06:07.640 --> 0:06:10.400
<v Speaker 2>this nonsense is that at one point someone invents a

0:06:10.400 --> 0:06:13.400
<v Speaker 2>self learning agent that teaches itself stuff, and it does

0:06:13.400 --> 0:06:16.520
<v Speaker 2>a bunch of other stuff requiring a bajillion compute points

0:06:17.000 --> 0:06:19.599
<v Speaker 2>with different agents with different numbers after them. There is

0:06:19.640 --> 0:06:21.800
<v Speaker 2>no proof that this is possible. Nobody has done it,

0:06:21.839 --> 0:06:24.600
<v Speaker 2>and nobody will do it. AI twenty twenty seven was

0:06:24.640 --> 0:06:27.120
<v Speaker 2>written specifically to fool people that want to be fooled,

0:06:27.279 --> 0:06:29.440
<v Speaker 2>with big charts and the right technical terms used to

0:06:29.480 --> 0:06:31.400
<v Speaker 2>lull the credulous into a wet dream and a New

0:06:31.480 --> 0:06:33.680
<v Speaker 2>York Times column where one of the writers folds their

0:06:33.720 --> 0:06:36.520
<v Speaker 2>hands and looks worried. It was also written to scare

0:06:36.520 --> 0:06:40.480
<v Speaker 2>people that are already scared. It makes big, scary proclamations

0:06:40.480 --> 0:06:43.000
<v Speaker 2>with tons of links to stuff that looks really legitimate,

0:06:43.080 --> 0:06:45.920
<v Speaker 2>but when you piece it all together, is literally just

0:06:46.000 --> 0:06:50.440
<v Speaker 2>fan fiction, except really not that endearing. My personal favorite

0:06:50.480 --> 0:06:53.200
<v Speaker 2>part is "Mid twenty twenty six: China Wakes Up," which

0:06:53.240 --> 0:06:56.520
<v Speaker 2>involves China's intelligence agencies trying to steal OpenBrain's

0:06:56.560 --> 0:06:59.960
<v Speaker 2>agent (no idea who this could possibly be referring to, please email

0:07:00.000 --> 0:07:02.000
<v Speaker 2>if you can work it out to I don't care

0:07:02.080 --> 0:07:05.760
<v Speaker 2>at business dot org), before the headline of "AI Takes

0:07:05.839 --> 0:07:08.560
<v Speaker 2>Some Jobs" after OpenBrain releases a model. Oh God,

0:07:08.600 --> 0:07:12.520
<v Speaker 2>I'm so bored even fucking talking about this now. Sarah

0:07:12.600 --> 0:07:15.120
<v Speaker 2>Lyons puts this well, arguing that AI twenty twenty seven

0:07:15.160 --> 0:07:17.680
<v Speaker 2>and AI in general is no different from the spurious

0:07:17.720 --> 0:07:20.200
<v Speaker 2>spectral evidence used to accuse someone of being a witch

0:07:20.280 --> 0:07:23.520
<v Speaker 2>during the Salem witch trials, and I quote and the

0:07:23.520 --> 0:07:26.320
<v Speaker 2>evidence is spectral. What is the real evidence in AI

0:07:26.320 --> 0:07:29.680
<v Speaker 2>twenty twenty seven beyond trust us and vibes? People who

0:07:29.680 --> 0:07:32.720
<v Speaker 2>wrote it cite themselves in the piece. Do not demand

0:07:32.720 --> 0:07:35.440
<v Speaker 2>I take this seriously. This is so clearly a marketing

0:07:35.960 --> 0:07:38.240
<v Speaker 2>device to scare people into buying your product before this

0:07:38.280 --> 0:07:41.600
<v Speaker 2>imaginary window closes. Don't call me stupid for not falling

0:07:41.640 --> 0:07:44.840
<v Speaker 2>for your spectral evidence. My whole life, people have been

0:07:44.880 --> 0:07:48.200
<v Speaker 2>saying artificial intelligence is around the corner, and it never arrives.

0:07:48.640 --> 0:07:50.680
<v Speaker 2>I simply do not believe a chatbot will ever be

0:07:50.720 --> 0:07:52.720
<v Speaker 2>more than a chatbot, and until you show me

0:07:52.760 --> 0:07:57.040
<v Speaker 2>it doing that, I will not believe it. Anyway, AI

0:07:57.080 --> 0:08:00.480
<v Speaker 2>twenty twenty seven is fan fiction, nothing more. Just because

0:08:00.480 --> 0:08:02.920
<v Speaker 2>it's full of fancy words and has five different grifters

0:08:02.960 --> 0:08:19.400
<v Speaker 2>on its byline doesn't mean a goddamn thing. Now now, now, now, now, folks,

0:08:20.240 --> 0:08:24.120
<v Speaker 2>we've all been waiting for this moment, and here's the

0:08:24.200 --> 0:08:28.239
<v Speaker 2>ultimate booster quip: the cost of inference is coming down.

0:08:28.520 --> 0:08:31.640
<v Speaker 2>This proves that things are getting cheaper. And here's a

0:08:31.640 --> 0:08:34.000
<v Speaker 2>bonus trick for you before I get to my ben

0:08:34.640 --> 0:08:37.640
<v Speaker 2>Here we go, ask them to explain whether things have

0:08:37.720 --> 0:08:40.000
<v Speaker 2>actually got cheaper, and if they say they have, ask

0:08:40.040 --> 0:08:42.880
<v Speaker 2>them why there are no profitable AI companies. If they

0:08:42.920 --> 0:08:45.240
<v Speaker 2>say they're in the growth stage, ask them why there

0:08:45.240 --> 0:08:47.920
<v Speaker 2>are no profitable AI companies. Again, I'd say it's been

0:08:48.000 --> 0:08:50.679
<v Speaker 2>several years and not got one. At this point they

0:08:50.679 --> 0:08:53.640
<v Speaker 2>should try and kill you. But really, I'm about to

0:08:53.679 --> 0:08:55.880
<v Speaker 2>be petty. I'm about to be petty for a fucking

0:08:55.920 --> 0:08:58.960
<v Speaker 2>reason though. In an interview on a podcast from earlier

0:08:58.960 --> 0:09:01.560
<v Speaker 2>this year that I will not even quote because the

0:09:01.679 --> 0:09:04.040
<v Speaker 2>journalist in question did not back me up and it

0:09:04.080 --> 0:09:08.240
<v Speaker 2>pisses me off, Journalist Casey Newton said the following about

0:09:08.240 --> 0:09:08.720
<v Speaker 2>my work.

0:09:09.720 --> 0:09:11.160
<v Speaker 1>You don't think that that kind of flies in the

0:09:11.160 --> 0:09:13.120
<v Speaker 1>face of Sam Altman saying that we need billions of

0:09:13.160 --> 0:09:15.880
<v Speaker 1>dollars for years. No, not at all. And I think

0:09:15.920 --> 0:09:18.080
<v Speaker 1>that's why it's so important when you're reading about AI

0:09:18.240 --> 0:09:20.600
<v Speaker 1>to read people who actually interview people who work at

0:09:20.640 --> 0:09:23.640
<v Speaker 1>these companies and understand how the technology works. Because the

0:09:23.800 --> 0:09:28.000
<v Speaker 1>entire industry has been on this curve where they are

0:09:28.200 --> 0:09:32.440
<v Speaker 1>trying to find micro innovations that reduce the cost of

0:09:32.480 --> 0:09:35.240
<v Speaker 1>training the models and to reduce the cost of what

0:09:35.280 --> 0:09:37.600
<v Speaker 1>they call inference, which is when you actually enter a query into

0:09:37.640 --> 0:09:41.000
<v Speaker 1>ChatGPT, and if you plotted the curve of

0:09:41.280 --> 0:09:44.360
<v Speaker 1>how the cost has been falling over time, DeepSeek

0:09:44.440 --> 0:09:47.520
<v Speaker 1>is on that curve, right? So everything that DeepSeek

0:09:47.559 --> 0:09:50.160
<v Speaker 1>did, it was expected by the AI labs that someone

0:09:50.200 --> 0:09:52.520
<v Speaker 1>would be able to do. The novelty was just that

0:09:52.559 --> 0:09:54.760
<v Speaker 1>a Chinese company did it. So to say that it

0:09:54.920 --> 0:09:58.600
<v Speaker 1>like, upends expectations of how AI would be built

0:09:58.760 --> 0:10:01.440
<v Speaker 1>is just purely false and the opinion of somebody who

0:10:01.440 --> 0:10:02.680
<v Speaker 1>does not know what he's talking about.

0:10:03.280 --> 0:10:06.520
<v Speaker 2>Newton then says, several octaves higher, which shows you exactly

0:10:06.520 --> 0:10:09.360
<v Speaker 2>how mad he isn't, that he thought what he said

0:10:09.480 --> 0:10:12.000
<v Speaker 2>was very civil, and that there are things that are

0:10:12.000 --> 0:10:14.679
<v Speaker 2>true and there are things that are false, like you

0:10:14.720 --> 0:10:17.560
<v Speaker 2>can choose which ones you want to believe. I'm not

0:10:17.600 --> 0:10:20.240
<v Speaker 2>going to be so civil. Other than the fact that

0:10:20.280 --> 0:10:23.959
<v Speaker 2>Casey refers to micro innovations (the fuck are you talking about?)

0:10:24.200 --> 0:10:26.640
<v Speaker 2>and DeepSeek being on a curve that was expected,

0:10:27.000 --> 0:10:30.320
<v Speaker 2>he makes, as many do, two very big mistakes. And personally,

0:10:30.360 --> 0:10:34.160
<v Speaker 2>if I was doing this, I personally would not have

0:10:34.280 --> 0:10:37.680
<v Speaker 2>said these things in a sentence that began with me

0:10:37.760 --> 0:10:40.560
<v Speaker 2>suggesting that I, being Casey Newton in this

0:10:40.679 --> 0:10:44.080
<v Speaker 2>example, knew how the technology works. Now here's the Casey

0:10:44.120 --> 0:10:47.160
<v Speaker 2>Newton quip: inference, which is when you actually enter

0:10:47.200 --> 0:10:50.040
<v Speaker 2>a query into ChatGPT. This statement is false. It's

0:10:50.040 --> 0:10:52.760
<v Speaker 2>not what inference means. Inference (and I've gotten this wrong

0:10:52.800 --> 0:10:55.680
<v Speaker 2>in the past too, I'm being accountable) is everything that

0:10:55.760 --> 0:10:58.120
<v Speaker 2>happens when you put in a prompt to generate an output.

0:10:58.400 --> 0:11:02.080
<v Speaker 2>It's when an AI, based on your prompt, infers meaning. To

0:11:02.160 --> 0:11:05.280
<v Speaker 2>be more specific, and quoting Google: machine learning inference is

0:11:05.280 --> 0:11:07.720
<v Speaker 2>the process of running data points into a machine learning

0:11:07.720 --> 0:11:10.960
<v Speaker 2>model to calculate an output, such as a single numerical score.

0:11:11.320 --> 0:11:13.439
<v Speaker 2>Except that's what these things are bad at. But nevertheless,

0:11:13.720 --> 0:11:15.440
<v Speaker 2>Casey will try and weasel out of this one and

0:11:15.480 --> 0:11:18.320
<v Speaker 2>say this is what he meant. It wasn't. He also said,

0:11:18.400 --> 0:11:20.240
<v Speaker 2>if you plotted the curve of how the cost of

0:11:20.280 --> 0:11:24.200
<v Speaker 2>inference has been falling over time, well that's wrong, Casey,

0:11:24.320 --> 0:11:26.320
<v Speaker 2>that's wrong, my man. The cost of inference has gone

0:11:26.360 --> 0:11:28.960
<v Speaker 2>up over time. Now, Casey, like many people who talk

0:11:28.960 --> 0:11:31.600
<v Speaker 2>about stuff without learning about it first, is likely referring

0:11:31.600 --> 0:11:33.320
<v Speaker 2>to the fact that the price of tokens for some

0:11:33.360 --> 0:11:36.240
<v Speaker 2>models has gone down in some cases. But you know what, folks,

0:11:36.320 --> 0:11:38.959
<v Speaker 2>let's establish some facts about inference. I'm doing the train.

0:11:39.320 --> 0:11:41.960
<v Speaker 2>I'm pulling the big horn on the invisible train. I'm

0:11:42.000 --> 0:11:45.000
<v Speaker 2>cooking now. Inference, a thing that costs money, is

0:11:45.120 --> 0:11:47.760
<v Speaker 2>entirely different to the price of tokens, and conflating the

0:11:47.800 --> 0:11:51.000
<v Speaker 2>two is journalistic malpractice. The cost of inference would be

0:11:51.000 --> 0:11:53.720
<v Speaker 2>the price of running the GPU and the associated architecture,

0:11:53.800 --> 0:11:55.800
<v Speaker 2>which, of course, we do not at this point have any

0:11:55.840 --> 0:11:59.520
<v Speaker 2>real insight into. Token prices are set by the people

0:11:59.520 --> 0:12:02.160
<v Speaker 2>who sell access to the tokens, such as OpenAI

0:12:02.200 --> 0:12:05.120
<v Speaker 2>and Anthropic. For example, OpenAI dropped the price of

0:12:05.160 --> 0:12:07.959
<v Speaker 2>its o3 model's token costs almost immediately after the

0:12:08.000 --> 0:12:10.520
<v Speaker 2>launch of Claude Opus four. Do you think it did

0:12:10.559 --> 0:12:12.800
<v Speaker 2>that because the price of serving the models got cheaper?

0:12:13.000 --> 0:12:16.040
<v Speaker 2>If you do, I don't know how you possibly put

0:12:16.080 --> 0:12:19.920
<v Speaker 2>your trousers on every morning without cutting yourself in half. Now,

0:12:19.920 --> 0:12:22.960
<v Speaker 2>the cost of inference conversation comes from articles that say

0:12:23.000 --> 0:12:25.400
<v Speaker 2>that we now have models that are cheaper that can

0:12:25.400 --> 0:12:28.960
<v Speaker 2>now hit higher benchmark scores. Though the article I'm referring to,

0:12:29.000 --> 0:12:31.080
<v Speaker 2>which will be in the show notes, is from November

0:12:31.080 --> 0:12:33.240
<v Speaker 2>twenty twenty four, and the comparison it makes is between

0:12:33.280 --> 0:12:36.280
<v Speaker 2>GPT three, which is from November twenty twenty one, and

0:12:36.400 --> 0:12:40.400
<v Speaker 2>Llama 3.2 3B, from September twenty twenty four. Now,

0:12:40.440 --> 0:12:42.200
<v Speaker 2>the suggestion is, in any case, that the cost of

0:12:42.200 --> 0:12:45.040
<v Speaker 2>inference is going down ten x year over year. The

0:12:45.080 --> 0:12:47.600
<v Speaker 2>problem is, however, that these are raw token costs, not

0:12:47.640 --> 0:12:51.199
<v Speaker 2>actual expressions of evaluations of token burn in a practical setting.

0:12:51.720 --> 0:12:54.199
<v Speaker 2>And, really, I realize that was a bit technical.

0:12:54.960 --> 0:12:57.920
<v Speaker 2>These are just what it costs to do something. It

0:12:57.960 --> 0:13:01.120
<v Speaker 2>doesn't actually tell you how many tokens will be

0:13:01.160 --> 0:13:03.640
<v Speaker 2>burned, or at what volume they will be burned, because that

0:13:03.679 --> 0:13:06.800
<v Speaker 2>would change things. And well, wouldn't you know it, the

0:13:06.840 --> 0:13:10.120
<v Speaker 2>cost of inference actually went up as a result. In

0:13:10.160 --> 0:13:12.080
<v Speaker 2>an excellent blog from Kilo Code, and I did not

0:13:12.160 --> 0:13:14.640
<v Speaker 2>get the chance to find out the pronunciation of this

0:13:15.400 --> 0:13:17.319
<v Speaker 2>second name, so I'm just going to call her. It

0:13:17.400 --> 0:13:22.760
<v Speaker 2>is Ewa S-Z-Y-S-Z-K-A. I am so sorry. I would

0:13:22.840 --> 0:13:25.679
<v Speaker 2>rather spell it out than actually mispronounce it. I

0:13:25.720 --> 0:13:29.240
<v Speaker 2>hate when people say Zitron wrong. Great blog. Anyway,

0:13:29.320 --> 0:13:33.520
<v Speaker 2>let me quote: application inference costs increased for two reasons:

0:13:33.559 --> 0:13:36.600
<v Speaker 2>the frontier model's cost per token stayed constant, and the

0:13:36.679 --> 0:13:40.760
<v Speaker 2>token consumption per application grew a lot. Token consumption per

0:13:40.800 --> 0:13:43.600
<v Speaker 2>application grew a lot because models allowed for longer context

0:13:43.600 --> 0:13:46.880
<v Speaker 2>windows and bigger suggestions from the models. The combination of

0:13:46.920 --> 0:13:49.840
<v Speaker 2>a steady price per token and more token consumption caused

0:13:49.880 --> 0:13:52.880
<v Speaker 2>app inference costs to grow about ten times over the

0:13:52.880 --> 0:13:56.600
<v Speaker 2>past two years. To explain that in really simple terms,

0:13:56.640 --> 0:13:59.440
<v Speaker 2>while the costs of old models may have decreased, new models,

0:13:59.640 --> 0:14:02.760
<v Speaker 2>which you need to do most things, cost about the same,

0:14:02.800 --> 0:14:05.600
<v Speaker 2>and the reasoning that these new models use does actually

0:14:05.600 --> 0:14:09.079
<v Speaker 2>burn way way more tokens. When these new models reason,

0:14:09.160 --> 0:14:11.280
<v Speaker 2>they break the user's input down and break it into

0:14:11.280 --> 0:14:14.360
<v Speaker 2>component parts, then run inference on each of those parts.
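
NOTE
Editor's sketch of the arithmetic described above, not something said in the episode.
It assumes app inference cost is simply price per token times tokens consumed.
The price, token counts and task count are hypothetical values picked for illustration;
only the shape (flat price, roughly ten times the cost) comes from the blog quoted above.
```python
# Hypothetical model: cost = price per token * tokens per task * number of tasks.
def app_inference_cost(price_per_million_tokens, tokens_per_task, tasks):
    """Return the dollar cost of running `tasks` tasks at a fixed token price."""
    return price_per_million_tokens / 1_000_000 * tokens_per_task * tasks
PRICE = 10.0  # assumed dollars per million tokens, held constant in both cases
before = app_inference_cost(PRICE, tokens_per_task=2_000, tasks=1_000)
after = app_inference_cost(PRICE, tokens_per_task=20_000, tasks=1_000)
growth = after / before  # ratio of the new cost to the old cost
print(f"before=${before:.2f} after=${after:.2f} growth={growth:.1f}x")
```
With the price held fixed, the cost ratio equals the ratio of tokens consumed, which is
why a flat or falling price per token says little about what an application actually pays.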

0:14:14.600 --> 0:14:16.200
<v Speaker 2>When you plug an LLM into an AI

0:14:16.240 --> 0:14:19.320
<v Speaker 2>coding environment, it will naturally burn an absolute shit ton

0:14:19.360 --> 0:14:21.640
<v Speaker 2>of tokens, in part because of the large amount of

0:14:21.640 --> 0:14:23.800
<v Speaker 2>information you have to load into the prompt and the

0:14:23.840 --> 0:14:25.960
<v Speaker 2>context window, or the amount of information you can load

0:14:26.000 --> 0:14:29.440
<v Speaker 2>in at once, and in part because generating code is inference

0:14:29.520 --> 0:14:31.920
<v Speaker 2>intensive, and also breaking down all those coding tasks, with

0:14:31.960 --> 0:14:34.360
<v Speaker 2>each of those tasks requiring a coding tool and taking

0:14:34.400 --> 0:14:38.200
<v Speaker 2>a bunch of inference themselves. It's really bad. In fact,

0:14:38.240 --> 0:14:40.640
<v Speaker 2>the inference costs are so severe that Kilo Code says

0:14:40.680 --> 0:14:43.160
<v Speaker 2>that a combination of a steady price per token and

0:14:43.200 --> 0:14:46.040
<v Speaker 2>more token consumption caused app inference costs to grow about

0:14:46.040 --> 0:14:49.160
<v Speaker 2>ten x over the last two years. I'm repeating myself,

0:14:49.200 --> 0:14:51.520
<v Speaker 2>I realize, but I really need you to get one thing,

0:14:51.760 --> 0:14:53.960
<v Speaker 2>which is that the cost of inference went up. But

0:14:54.120 --> 0:14:56.600
<v Speaker 2>I'm not done. I refuse to let this point go

0:14:56.800 --> 0:14:58.760
<v Speaker 2>because people love to say the cost of inference is

0:14:58.800 --> 0:15:01.400
<v Speaker 2>going down when the cost of inference has increased, and

0:15:01.440 --> 0:15:04.240
<v Speaker 2>they do so to a national audience, all while suggesting

0:15:04.320 --> 0:15:07.880
<v Speaker 2>I'm wrong somehow and acting superior. I don't like being

0:15:07.920 --> 0:15:10.680
<v Speaker 2>made to feel this way. I don't think it's nice

0:15:10.680 --> 0:15:13.360
<v Speaker 2>to do this to people. And if you're gonna do it,

0:15:13.440 --> 0:15:15.720
<v Speaker 2>if you have the temerity to call someone out directly,

0:15:15.840 --> 0:15:20.160
<v Speaker 2>at least be fucking right. I'm not wrong, You're wrong.

0:15:20.600 --> 0:15:24.240
<v Speaker 2>In fact, software developer influencer Theo Browne recently put out

0:15:24.240 --> 0:15:26.960
<v Speaker 2>a video called "I Was Wrong About AI Costs (They

0:15:27.040 --> 0:15:30.240
<v Speaker 2>Keep Going Up)," which he breaks down as follows: reasoning

0:15:30.240 --> 0:15:34.000
<v Speaker 2>models are significantly increasing the amount of output tokens being generated.

0:15:34.320 --> 0:15:37.760
<v Speaker 2>These tokens are also more expensive. In one example, Browne

0:15:37.840 --> 0:15:41.080
<v Speaker 2>finds that Grok 4's reasoning mode uses six hundred and three

0:15:41.120 --> 0:15:45.760
<v Speaker 2>tokens to generate two words. This was a problem across

0:15:45.800 --> 0:15:48.720
<v Speaker 2>every single reasoning model, as even cheap reasoning models would

0:15:48.760 --> 0:15:51.600
<v Speaker 2>do the same thing. As a result, tasks are taking

0:15:51.680 --> 0:15:55.240
<v Speaker 2>longer and burning more tokens. Another writer called Ethan Ding

0:15:55.280 --> 0:15:57.760
<v Speaker 2>noted a few months ago that reasoning models burn so

0:15:57.800 --> 0:16:00.680
<v Speaker 2>many tokens that there is no flat subscription price that

0:16:00.720 --> 0:16:03.200
<v Speaker 2>works in this new world, as the number of tokens

0:16:03.240 --> 0:16:06.920
<v Speaker 2>they consume went absolutely nuclear. The price drops have

0:16:07.000 --> 0:16:09.920
<v Speaker 2>also for the most part stopped. You cannot at this

0:16:10.040 --> 0:16:12.560
<v Speaker 2>point fairly evaluate whether a model is cheaper just based

0:16:12.600 --> 0:16:15.640
<v Speaker 2>on its cost per token, because reasoning models inherently burn

0:16:15.880 --> 0:16:19.080
<v Speaker 2>and are built to inherently burn more tokens to create

0:16:19.120 --> 0:16:21.560
<v Speaker 2>an output. Reasoning models are also the only way that

0:16:21.600 --> 0:16:23.840
<v Speaker 2>model developers have been able to improve the efficacy of

0:16:23.880 --> 0:16:26.640
<v Speaker 2>new models, using something called test time compute to burn

0:16:26.680 --> 0:16:30.080
<v Speaker 2>extra tokens to complete a task, and in basically anything

0:16:30.120 --> 0:16:31.800
<v Speaker 2>you're using today, there's going to be some sort of

0:16:31.880 --> 0:16:35.360
<v Speaker 2>reasoning model, especially if you're coding. The cost of inference

0:16:35.360 --> 0:16:38.800
<v Speaker 2>has gone up. Statements otherwise are purely false and are

0:16:38.840 --> 0:16:41.000
<v Speaker 2>the opinion of somebody who does not know what he's

0:16:41.040 --> 0:16:44.240
<v Speaker 2>talking about. But, you ask, could the costs of inference

0:16:44.280 --> 0:16:49.000
<v Speaker 2>go down? Maybe. It sure isn't trending that way, nor

0:16:49.040 --> 0:16:51.560
<v Speaker 2>has it gone down yet. I also predict that there's

0:16:51.560 --> 0:16:53.440
<v Speaker 2>going to be some sort of sudden realization in the

0:16:53.440 --> 0:16:55.720
<v Speaker 2>media that inference is going up, which is kind of

0:16:55.720 --> 0:16:58.960
<v Speaker 2>already started. The Information had a piece on it in

0:16:59.040 --> 0:17:01.480
<v Speaker 2>late August where they note that Intuit paid twenty

0:17:01.480 --> 0:17:03.880
<v Speaker 2>million dollars to Azure last year, primarily to access

0:17:03.920 --> 0:17:06.160
<v Speaker 2>OpenAI's models, and it's on track to spend thirty

0:17:06.200 --> 0:17:08.720
<v Speaker 2>million this year, which outpaces the company's revenue growth in

0:17:08.760 --> 0:17:11.800
<v Speaker 2>the same period, raising questions about how sustainable the spending

0:17:11.920 --> 0:17:13.560
<v Speaker 2>is and how much of the cost it can pass

0:17:13.560 --> 0:17:16.320
<v Speaker 2>along to customers. Christopher Mims in The Wall Street Journal

0:17:16.359 --> 0:17:18.359
<v Speaker 2>also had a piece about the costs going up. Do

0:17:18.520 --> 0:17:21.040
<v Speaker 2>not be mad at Chris. Chris and I chatted before

0:17:21.080 --> 0:17:24.040
<v Speaker 2>he submitted that piece, like he literally on Bluesky

0:17:24.080 --> 0:17:26.360
<v Speaker 2>called me out. It fucking rocks. By the way, big

0:17:26.440 --> 0:17:28.600
<v Speaker 2>up to Chris Mims because it's nice to see the

0:17:28.640 --> 0:17:31.639
<v Speaker 2>mainstream media actually engaging with these things, even though it's

0:17:31.720 --> 0:17:34.600
<v Speaker 2>dangerous to the bubble. But you know what, the truth

0:17:34.680 --> 0:17:37.040
<v Speaker 2>must win out, and the problem here is that the

0:17:37.160 --> 0:17:41.600
<v Speaker 2>architecture underlying large language models is inherently unreliable. I imagine

0:17:41.600 --> 0:17:44.520
<v Speaker 2>OpenAI's introduction of the router to ChatGPT's GPT five is

0:17:44.560 --> 0:17:46.359
<v Speaker 2>an attempt to moderate both the costs of the model

0:17:46.440 --> 0:17:49.320
<v Speaker 2>chosen and reduce the amount of exposure to reasoning models

0:17:49.320 --> 0:17:52.520
<v Speaker 2>for simple queries. Though Sam Altman was boasting on August

0:17:52.520 --> 0:17:54.880
<v Speaker 2>tenth about the significant increase in both free and paid

0:17:54.960 --> 0:17:58.000
<v Speaker 2>users' exposure to reasoning models. They don't teach you this

0:17:58.119 --> 0:18:01.640
<v Speaker 2>in business school. Still, a study written up by VentureBeat

0:18:01.680 --> 0:18:04.040
<v Speaker 2>found that open-weight models burn one point five

0:18:04.080 --> 0:18:06.119
<v Speaker 2>to four times more tokens, in part due to a

0:18:06.200 --> 0:18:08.879
<v Speaker 2>lack of token efficiency and in part thanks to, you

0:18:09.040 --> 0:18:13.440
<v Speaker 2>guessed it, reasoning models. I quote: the findings challenge a

0:18:13.480 --> 0:18:16.560
<v Speaker 2>prevailing assumption in the AI industry that open source models

0:18:16.560 --> 0:18:20.520
<v Speaker 2>offer clear economic advantages over proprietary alternatives. While open

0:18:20.520 --> 0:18:23.000
<v Speaker 2>source models typically cost less per token to run, the

0:18:23.000 --> 0:18:25.520
<v Speaker 2>study suggests that this advantage could be, and I quote

0:18:25.560 --> 0:18:28.280
<v Speaker 2>the study, easily offset if they require more tokens to

0:18:28.320 --> 0:18:31.560
<v Speaker 2>reason about a given problem. And models keep getting bigger

0:18:31.560 --> 0:18:36.399
<v Speaker 2>and more expensive too. So why did this happen? Well,

0:18:36.520 --> 0:18:39.359
<v Speaker 2>it's because model developers hit a wall of diminishing returns

0:18:39.400 --> 0:18:41.159
<v Speaker 2>and the only way to make models do more was

0:18:41.200 --> 0:18:43.080
<v Speaker 2>to make them burn more tokens to generate a more

0:18:43.119 --> 0:18:46.560
<v Speaker 2>accurate response, which is a very simple way of describing

0:18:46.600 --> 0:18:49.160
<v Speaker 2>reasoning, a thing that OpenAI launched in September twenty

0:18:49.200 --> 0:18:52.120
<v Speaker 2>twenty four, and others followed. As a result, all the

0:18:52.160 --> 0:18:55.040
<v Speaker 2>gains from powerful new models come from burning more and

0:18:55.119 --> 0:18:57.639
<v Speaker 2>more tokens. The cost per million token number is no

0:18:57.720 --> 0:18:59.840
<v Speaker 2>longer an accurate measure of the actual cost of generative

0:18:59.880 --> 0:19:02.720
<v Speaker 2>AI because it's much, much, much, much harder to tell

0:19:02.720 --> 0:19:04.920
<v Speaker 2>how many tokens a reasoning model may burn, and it

0:19:05.040 --> 0:19:08.399
<v Speaker 2>varies, as Theo Boint... Theo Boying, I'm keeping that,

0:19:08.480 --> 0:19:11.080
<v Speaker 2>all right, you get the real cuts. As Theo

0:19:11.240 --> 0:19:14.840
<v Speaker 2>Browne noted, from model to model. In any case, there

0:19:14.880 --> 0:19:17.600
<v Speaker 2>really is no changing this path. These companies are out

0:19:17.600 --> 0:19:22.679
<v Speaker 2>of ideas. Now, another one of my favorite ultimate

0:19:22.720 --> 0:19:25.120
<v Speaker 2>booster quips. This is a classic and I still get

0:19:25.160 --> 0:19:28.679
<v Speaker 2>this on social media. I have people yapping in

0:19:28.720 --> 0:19:31.919
<v Speaker 2>my ear saying OpenAI and Anthropic are just like

0:19:32.080 --> 0:19:34.840
<v Speaker 2>Uber because Uber burned twenty five billion dollars over the

0:19:34.880 --> 0:19:37.960
<v Speaker 2>course of fifteen or so years and look, look, Edward,

0:19:38.119 --> 0:19:40.399
<v Speaker 2>they're now profitable. Why are you calling me Edward? Shut up.

0:19:40.640 --> 0:19:43.199
<v Speaker 2>This proves that OpenAI, a totally different company with

0:19:43.240 --> 0:19:46.280
<v Speaker 2>different economics, will be totally fine. So I've heard this

0:19:46.400 --> 0:19:48.520
<v Speaker 2>argument maybe fifty times in the last year, to the

0:19:48.520 --> 0:19:49.879
<v Speaker 2>point that I had to talk about it in my

0:19:49.960 --> 0:19:53.160
<v Speaker 2>piece How Does OpenAI Survive, which I also turned

0:19:53.160 --> 0:19:55.720
<v Speaker 2>into a podcast around July twenty twenty four. Go back

0:19:55.720 --> 0:19:58.960
<v Speaker 2>and link a link to it in the piece. Yaddy yaddy, yadda. Nevertheless,

0:19:58.960 --> 0:20:00.840
<v Speaker 2>people make a few points about Uber and AI that

0:20:00.840 --> 0:20:02.880
<v Speaker 2>I think are fundamentally incorrect, and I'm going to break

0:20:02.920 --> 0:20:05.680
<v Speaker 2>them down for you now. They claim that AI is

0:20:05.720 --> 0:20:08.200
<v Speaker 2>making itself too big to fail and embedding itself everywhere

0:20:08.240 --> 0:20:10.920
<v Speaker 2>and becoming essential, and none of these things are the case.

0:20:11.560 --> 0:20:13.480
<v Speaker 2>I've heard this argument a lot, by the way, and

0:20:13.520 --> 0:20:16.879
<v Speaker 2>it's one that's both ahistorical and alarmingly ignorant of the

0:20:17.040 --> 0:20:21.320
<v Speaker 2>very basics of society. But Ed, the government! No, no, no, no, no, no,

0:20:21.680 --> 0:20:23.960
<v Speaker 2>you've heard, you've heard. OpenAI got a two hundred million

0:20:23.960 --> 0:20:26.720
<v Speaker 2>dollar Defense contract with an estimated completion date of July

0:20:26.760 --> 0:20:28.600
<v Speaker 2>twenty twenty six. And just to be clear, that's up

0:20:28.640 --> 0:20:31.120
<v Speaker 2>to two hundred million dollars, and that they're selling chat

0:20:31.160 --> 0:20:34.120
<v Speaker 2>GPT Enterprise to the US government for a dollar a year,

0:20:34.320 --> 0:20:37.160
<v Speaker 2>along with Anthropic doing the same thing, and even Google's

0:20:37.200 --> 0:20:40.000
<v Speaker 2>doing it, except they're doing forty cents for a year. Now,

0:20:40.000 --> 0:20:42.960
<v Speaker 2>you're probably hearing this and thinking, ah shit, this means

0:20:42.960 --> 0:20:45.080
<v Speaker 2>the government's paid them. They're never going away. And I

0:20:45.160 --> 0:20:47.720
<v Speaker 2>cannot be clear enough that you believing this is the

0:20:47.880 --> 0:20:51.240
<v Speaker 2>very intention of these deals. They are built specifically to

0:20:51.280 --> 0:20:53.359
<v Speaker 2>make you feel like these things are never going away.

0:20:53.640 --> 0:20:56.159
<v Speaker 2>This is also an attempt to get in with the

0:20:56.160 --> 0:20:58.440
<v Speaker 2>government at a rate that makes trying these models a

0:20:58.520 --> 0:21:02.800
<v Speaker 2>no-brainer. At which point I ask: and? The government

0:21:02.880 --> 0:21:05.120
<v Speaker 2>is going to have cheap access to AI software does

0:21:05.119 --> 0:21:08.200
<v Speaker 2>not mean that the government relies on them. Every member

0:21:08.200 --> 0:21:11.199
<v Speaker 2>of the government having access to ChatGPT, something that

0:21:11.320 --> 0:21:14.040
<v Speaker 2>is not even necessarily the case, does not make this

0:21:14.119 --> 0:21:17.200
<v Speaker 2>software useful, let alone essential. And if OpenAI burns

0:21:17.240 --> 0:21:19.600
<v Speaker 2>a bunch of money making it work for them, it

0:21:19.720 --> 0:21:22.240
<v Speaker 2>still won't be essential because large language models are not

0:21:22.280 --> 0:21:25.960
<v Speaker 2>actually that useful for doing stuff. Now let's talk Uber.

0:21:26.359 --> 0:21:29.360
<v Speaker 2>Uber was and is useful, which eventually made it essential.

0:21:30.080 --> 0:21:33.320
<v Speaker 2>Uber used lobbyist Bradley Tusk to steamroll local governments

0:21:33.359 --> 0:21:35.960
<v Speaker 2>into allowing Uber to operate in their cities, but Tusk

0:21:36.040 --> 0:21:38.520
<v Speaker 2>did not have to convince local governments that Uber was

0:21:38.600 --> 0:21:41.440
<v Speaker 2>useful or have to train people how to use Uber.

0:21:42.160 --> 0:21:44.760
<v Speaker 2>Uber's too big to fail moment was that local cabs

0:21:44.840 --> 0:21:48.000
<v Speaker 2>kind of fucking sucked just about everywhere. You ever try

0:21:48.000 --> 0:21:50.760
<v Speaker 2>and take a yellow cab from downtown Manhattan to Hoboken,

0:21:50.800 --> 0:21:53.880
<v Speaker 2>New Jersey, or Brooklyn or Queens? Do you ever try

0:21:53.880 --> 0:21:56.000
<v Speaker 2>and pay with a credit card? How about trying to

0:21:56.000 --> 0:21:58.480
<v Speaker 2>get a cab outside a major metropolitan area? Do you

0:21:58.520 --> 0:22:02.520
<v Speaker 2>remember how bad it was? It was really awful. I

0:22:02.560 --> 0:22:05.560
<v Speaker 2>don't think people realize or remember how bad it was.

0:22:05.760 --> 0:22:08.720
<v Speaker 2>And I'm not saying that Uber is good. I'm not

0:22:08.720 --> 0:22:11.600
<v Speaker 2>glorifying Uber in any way. But the experience that Uber

0:22:11.680 --> 0:22:14.640
<v Speaker 2>replaced was very, very bad. As a result, Uber did

0:22:14.680 --> 0:22:16.840
<v Speaker 2>become too big to fail because people now rely on

0:22:16.880 --> 0:22:19.840
<v Speaker 2>it because the old system sucked. Uber used its masses

0:22:19.880 --> 0:22:22.080
<v Speaker 2>of venture capital to keep prices low to get people

0:22:22.200 --> 0:22:24.880
<v Speaker 2>used to it too, but the fundamental experience was better

0:22:24.920 --> 0:22:27.000
<v Speaker 2>than calling a cab company and hoping they showed up.

0:22:27.520 --> 0:22:28.879
<v Speaker 2>I also want to be clear that this is not

0:22:28.920 --> 0:22:32.080
<v Speaker 2>me condoning Uber. Take public transport if you can. To

0:22:32.119 --> 0:22:34.439
<v Speaker 2>be clear, Uber has created a new kind of horrifying,

0:22:34.520 --> 0:22:38.440
<v Speaker 2>extractive labor practice which deprives people of benefits and dignity,

0:22:38.640 --> 0:22:40.800
<v Speaker 2>paying off academics to help the media gloss over the

0:22:40.800 --> 0:22:44.119
<v Speaker 2>horrors of their platform, and also now having to increase

0:22:44.160 --> 0:22:48.479
<v Speaker 2>prices so that they reached profitability by doing that. That

0:22:48.600 --> 0:22:51.159
<v Speaker 2>isn't something that's going to happen with generative AI. Just

0:22:51.880 --> 0:23:08.840
<v Speaker 2>the costs are too high, They're way too high. But anyway,

0:23:09.240 --> 0:23:14.840
<v Speaker 2>what is essential about generative AI? What exactly, and be specific,

0:23:15.040 --> 0:23:18.679
<v Speaker 2>is the essential experience of generative AI? What are we

0:23:18.920 --> 0:23:24.919
<v Speaker 2>if ChatGPT disappeared tomorrow, what actually disappears? And on

0:23:24.960 --> 0:23:28.240
<v Speaker 2>an enterprise or governmental level, what exactly are these tools

0:23:28.320 --> 0:23:31.480
<v Speaker 2>doing for governments that would make removing them so painful?

0:23:31.640 --> 0:23:34.760
<v Speaker 2>What use cases, what outcomes? If your answer here is

0:23:34.800 --> 0:23:36.639
<v Speaker 2>to say, well, they're putting it in and they're choosing,

0:23:36.680 --> 0:23:40.640
<v Speaker 2>they're choosing which people to cut out of benefits, and please, goddamn,

0:23:40.920 --> 0:23:43.280
<v Speaker 2>this is what they want you to do. They want

0:23:43.320 --> 0:23:46.680
<v Speaker 2>you to be scared so they can feel powerful. They're

0:23:46.680 --> 0:23:48.760
<v Speaker 2>not doing that. You notice that we get all these

0:23:48.760 --> 0:23:51.720
<v Speaker 2>horrible stories by the way of internal government things, shoving

0:23:51.720 --> 0:23:55.159
<v Speaker 2>stuff into LLMs. You know what we don't get? Another

0:23:55.240 --> 0:23:57.320
<v Speaker 2>thing we don't get, oh and then have It's just

0:23:57.359 --> 0:24:00.840
<v Speaker 2>they're doing this scary, bad thing that they shouldn't be.

0:24:00.840 --> 0:24:04.280
<v Speaker 2>They shouldn't be putting people's private information into... anyway, I'm rambling.

0:24:04.600 --> 0:24:07.199
<v Speaker 2>Uber's essential nature is that millions of people use it

0:24:07.359 --> 0:24:10.680
<v Speaker 2>in place of regular taxis, and it effectively replaced decrepit,

0:24:10.760 --> 0:24:13.679
<v Speaker 2>exploitative systems like the yellow cab medallions in

0:24:13.680 --> 0:24:16.760
<v Speaker 2>New York with its own tech enabled exploitation system that

0:24:16.920 --> 0:24:20.560
<v Speaker 2>nevertheless worked far better for the user. Okay, I also

0:24:20.560 --> 0:24:22.240
<v Speaker 2>want to do a side note just to acknowledge that

0:24:22.800 --> 0:24:26.399
<v Speaker 2>the disruption from Uber brought something to the medallion system

0:24:26.440 --> 0:24:30.240
<v Speaker 2>that was genuinely horrendous. The consequences were horrifying for the

0:24:30.240 --> 0:24:32.560
<v Speaker 2>owners of the medallions, some of whom had paid more

0:24:32.560 --> 0:24:34.919
<v Speaker 2>than a million dollars for the privilege of driving a

0:24:34.960 --> 0:24:37.400
<v Speaker 2>New York cab and were burdened under mountains of debt.

0:24:37.680 --> 0:24:41.280
<v Speaker 2>That system is so fucking evil. I think it's horrifying,

0:24:41.520 --> 0:24:44.240
<v Speaker 2>and I think the payday loan people involved should all

0:24:44.280 --> 0:24:47.520
<v Speaker 2>be in fucking prison, worst scum of the world. The

0:24:47.560 --> 0:24:49.600
<v Speaker 2>people who are taking advantage of people who come to this

0:24:49.640 --> 0:24:51.600
<v Speaker 2>country to drive a fucking cab that they have to

0:24:51.960 --> 0:24:55.639
<v Speaker 2>take out massive loans to buy. That is evil. Uber

0:24:55.680 --> 0:24:58.399
<v Speaker 2>is also just to be clear, but that also is

0:24:58.480 --> 0:25:02.840
<v Speaker 2>That's the point I'm trying to make. Should feel sorry

0:25:02.920 --> 0:25:06.199
<v Speaker 2>for the victims of that system. That system was a

0:25:06.280 --> 0:25:10.640
<v Speaker 2>kind of corruption unto itself. Anyway, getting back to the thing,

0:25:10.680 --> 0:25:12.760
<v Speaker 2>because I don't know, I feel I actually feel a

0:25:12.760 --> 0:25:14.919
<v Speaker 2>lot for the people who are the victims of the

0:25:14.920 --> 0:25:17.760
<v Speaker 2>medallion system. It's fucking rough, and every time I think

0:25:17.800 --> 0:25:20.760
<v Speaker 2>of it, I feel very sad inside. But let's get

0:25:20.800 --> 0:25:22.320
<v Speaker 2>back to the episode. I don't want to think about

0:25:22.359 --> 0:25:25.919
<v Speaker 2>it any longer. There really are no essential use cases

0:25:25.960 --> 0:25:29.359
<v Speaker 2>for ChatGPT, or really any GenAI system. You cannot

0:25:29.359 --> 0:25:31.280
<v Speaker 2>point to one use case that is anywhere near as

0:25:31.280 --> 0:25:34.560
<v Speaker 2>necessary as cabs in cities, And indeed the biggest use cases,

0:25:34.600 --> 0:25:37.399
<v Speaker 2>things like brainstorming and search, are either easily replaced by

0:25:37.480 --> 0:25:39.919
<v Speaker 2>any other commoditized LLM or already exist in the

0:25:39.920 --> 0:25:44.440
<v Speaker 2>case of Google Search. Now let's do another booster quip:

0:25:45.200 --> 0:25:47.920
<v Speaker 2>Data centers are important economic growth vehicles and are now helping

0:25:48.000 --> 0:25:51.480
<v Speaker 2>drive innovation and jobs throughout America. Having data centers promotes innovation,

0:25:51.600 --> 0:25:54.960
<v Speaker 2>making OpenAI and AI data centers essential. And the

0:25:55.000 --> 0:25:58.119
<v Speaker 2>answer to that is no. No. Sorry, this is a

0:25:58.160 --> 0:26:00.560
<v Speaker 2>really simple one. These data centers are not in and

0:26:00.600 --> 0:26:03.959
<v Speaker 2>of themselves driving much economic growth other than the costs

0:26:03.960 --> 0:26:07.600
<v Speaker 2>of building them, which I went into last episode. As

0:26:07.600 --> 0:26:10.280
<v Speaker 2>I've discussed again and again, there's maybe forty billion dollars

0:26:10.320 --> 0:26:12.720
<v Speaker 2>in revenue and no profit coming out of AI companies.

0:26:12.840 --> 0:26:15.240
<v Speaker 2>There isn't any economic growth. They're not holding up anything

0:26:15.480 --> 0:26:19.640
<v Speaker 2>other than the massive, massive infrastructure built to make them

0:26:19.800 --> 0:26:23.600
<v Speaker 2>make no money and lose billions. There's no great loss

0:26:23.600 --> 0:26:25.960
<v Speaker 2>associated with the death of large language models or the

0:26:26.119 --> 0:26:28.920
<v Speaker 2>death of this era. Taking away Uber would be genuinely

0:26:28.960 --> 0:26:32.720
<v Speaker 2>catastrophic for some people's ability to get places and people's jobs,

0:26:32.760 --> 0:26:37.560
<v Speaker 2>even if they are horrifyingly underpaid. But here's another booster quip.

0:26:37.720 --> 0:26:40.320
<v Speaker 2>Uber burned a lot of money twenty five billion dollars

0:26:40.400 --> 0:26:43.440
<v Speaker 2>or more to get where it is today. Ooh, Mister Zitron,

0:26:43.720 --> 0:26:46.480
<v Speaker 2>Mister Zitron, you're dead. And my response is that

0:26:46.520 --> 0:26:49.080
<v Speaker 2>OpenAI and Anthropic have both separately burned more than four

0:26:49.119 --> 0:26:51.240
<v Speaker 2>times as much money since the beginning of twenty twenty

0:26:51.240 --> 0:26:54.159
<v Speaker 2>four as Uber did in its entire existence. So the

0:26:54.240 --> 0:26:57.280
<v Speaker 2>classic and wrong argument about OpenAI and companies like

0:26:57.320 --> 0:26:59.400
<v Speaker 2>OpenAI is that Uber burned a bunch of money

0:26:59.440 --> 0:27:03.080
<v Speaker 2>and is now cash flow positive or profitable. I want to

0:27:03.080 --> 0:27:06.000
<v Speaker 2>be clear that Uber's costs are nothing like large language models,

0:27:06.000 --> 0:27:09.240
<v Speaker 2>and making this comparison is ridiculous and desperate. But let's

0:27:09.240 --> 0:27:11.320
<v Speaker 2>talk about raw losses, shall we, and where people are

0:27:11.320 --> 0:27:14.440
<v Speaker 2>making this assumption. So Uber lost twenty four point nine

0:27:14.560 --> 0:27:16.480
<v Speaker 2>billion dollars in the space of four years from twenty

0:27:16.600 --> 0:27:18.679
<v Speaker 2>nineteen to twenty twenty two, in part because of the

0:27:18.680 --> 0:27:20.800
<v Speaker 2>billions it was spending on sales and marketing and R

0:27:20.840 --> 0:27:22.960
<v Speaker 2>and D, four point six billion dollars and four point

0:27:23.040 --> 0:27:26.720
<v Speaker 2>eight billion dollars respectively in twenty nineteen alone. It also

0:27:27.000 --> 0:27:29.840
<v Speaker 2>massively subsidized the cost of rides, which is why prices

0:27:29.880 --> 0:27:33.119
<v Speaker 2>had to increase, and spent heavily on driver recruitment, burning

0:27:33.119 --> 0:27:35.880
<v Speaker 2>cash to get scale, you know, the classic Silicon Valley way.

0:27:36.480 --> 0:27:40.200
<v Speaker 2>This is absolutely nothing like how large language models are growing.

0:27:40.200 --> 0:27:42.840
<v Speaker 2>And I'm tired of defending this point, but defend it I

0:27:42.920 --> 0:27:46.800
<v Speaker 2>shall. OpenAI and Anthropic burn money primarily through compute

0:27:46.800 --> 0:27:50.119
<v Speaker 2>costs and specialized talent. These costs are increasing, especially with

0:27:50.160 --> 0:27:52.399
<v Speaker 2>the rush to hire every single AI scientist at the

0:27:52.440 --> 0:27:56.680
<v Speaker 2>most expensive price possible. There are also essential, immovable costs

0:27:56.760 --> 0:28:00.280
<v Speaker 2>that neither OpenAI nor Anthropic have to shoulder: the

0:28:00.320 --> 0:28:02.800
<v Speaker 2>construction of the data centers necessary to train and run

0:28:02.800 --> 0:28:05.159
<v Speaker 2>inference for their models, and of course the GPUs

0:28:05.240 --> 0:28:08.000
<v Speaker 2>inside them, which I will get to in a little bit. Yes,

0:28:08.200 --> 0:28:10.919
<v Speaker 2>Uber raised thirty three point five billion dollars through multiple

0:28:11.000 --> 0:28:13.840
<v Speaker 2>rounds of post-IPO debt, though it raised about twenty

0:28:13.840 --> 0:28:17.040
<v Speaker 2>five billion dollars in actual funding. Yes, Uber burned an

0:28:17.040 --> 0:28:19.760
<v Speaker 2>absolute ass-ton of money. Yes, Uber has scale, but

0:28:19.880 --> 0:28:21.680
<v Speaker 2>Uber has not burned money as a means of making

0:28:21.680 --> 0:28:25.400
<v Speaker 2>its product functional or useful. Uber worked immediately. I mean

0:28:25.840 --> 0:28:27.879
<v Speaker 2>it was twenty twelve, I think, when I used it for the

0:28:27.920 --> 0:28:30.119
<v Speaker 2>first time. Maybe earlier. No, no, it would have been

0:28:30.119 --> 0:28:33.760
<v Speaker 2>twenty ten. It worked immediately. You used it, You're like, wow, this,

0:28:34.040 --> 0:28:35.919
<v Speaker 2>I can just put in my address. I don't have

0:28:36.040 --> 0:28:38.320
<v Speaker 2>to say my address three times because I have a

0:28:38.320 --> 0:28:41.480
<v Speaker 2>British accent and nobody can fucking understand me. Sometimes you can,

0:28:41.560 --> 0:28:46.320
<v Speaker 2>though you're special. Yeah, it was really obvious that it worked,

0:28:46.520 --> 0:28:49.080
<v Speaker 2>and also the costs associated with Uber and its capital

0:28:49.080 --> 0:28:52.120
<v Speaker 2>expenditures from twenty nineteen through twenty twenty four were around

0:28:52.240 --> 0:28:54.640
<v Speaker 2>two point two billion dollars, by the way, minuscule

0:28:54.680 --> 0:28:57.880
<v Speaker 2>compared to the actual real costs of OpenAI and Anthropic.

0:28:58.520 --> 0:29:01.520
<v Speaker 2>Both OpenAI and Anthropic lost around five billion dollars each

0:29:01.520 --> 0:29:04.560
<v Speaker 2>in twenty twenty four, but their infrastructure was entirely paid

0:29:04.560 --> 0:29:07.480
<v Speaker 2>for by either Microsoft, Google, or Amazon. And by which

0:29:07.520 --> 0:29:09.640
<v Speaker 2>I mean the building of it and the expansion therein.

0:29:09.640 --> 0:29:12.800
<v Speaker 2>While we don't know how much of this infrastructure

0:29:12.840 --> 0:29:16.240
<v Speaker 2>is specifically for OpenAI or Anthropic, as the largest

0:29:16.280 --> 0:29:18.760
<v Speaker 2>model developers, it's fair to assume that a large chunk

0:29:18.760 --> 0:29:21.840
<v Speaker 2>at least thirty percent of Amazon and Microsoft's capital expenditures

0:29:21.880 --> 0:29:24.880
<v Speaker 2>have been to support these loads. Great sentence to cut

0:29:24.920 --> 0:29:27.520
<v Speaker 2>and listen to again. I also leave out Google, as

0:29:27.520 --> 0:29:30.840
<v Speaker 2>it's unclear whether it's expanded its infrastructure for Anthropic, but

0:29:30.880 --> 0:29:33.600
<v Speaker 2>we know Amazon has done so. As a result, the

0:29:33.600 --> 0:29:35.880
<v Speaker 2>true cost of OpenAI and Anthropic is at least

0:29:35.920 --> 0:29:39.120
<v Speaker 2>ten times what Uber burned. Amazon spent eighty three billion dollars

0:29:39.160 --> 0:29:41.680
<v Speaker 2>in capital expenditures in twenty twenty four and expects one

0:29:41.760 --> 0:29:43.840
<v Speaker 2>hundred and five billion dollars of the fuckers in twenty

0:29:43.840 --> 0:29:47.160
<v Speaker 2>twenty five. Microsoft spent fifty five point six billion dollars

0:29:47.200 --> 0:29:49.400
<v Speaker 2>in twenty twenty four and expects to spend eighty billion

0:29:49.400 --> 0:29:52.080
<v Speaker 2>dollars this year. I'm actually confident most of that is

0:29:52.120 --> 0:29:55.760
<v Speaker 2>OpenAI, but based on my conservative calculations, the true

0:29:55.760 --> 0:29:58.280
<v Speaker 2>cost of OpenAI is at least eighty two billion dollars,

0:29:58.440 --> 0:30:01.800
<v Speaker 2>and that only includes capex from twenty twenty four onwards, based

0:30:01.840 --> 0:30:04.479
<v Speaker 2>on thirty percent of Microsoft's capex, as not everything has

0:30:04.520 --> 0:30:07.480
<v Speaker 2>been invested yet in twenty twenty five, and OpenAI

0:30:07.880 --> 0:30:11.320
<v Speaker 2>might not be all of the capex, and also the

0:30:11.360 --> 0:30:13.480
<v Speaker 2>forty one point four billion dollars of funding that

0:30:13.480 --> 0:30:16.160
<v Speaker 2>OpenAI has received so far. The true cost of Anthropic

0:30:16.200 --> 0:30:18.320
<v Speaker 2>is around seventy seven point one billion dollars, and that's

0:30:18.360 --> 0:30:21.040
<v Speaker 2>not including the thirteen billion they just raised, but it

0:30:21.040 --> 0:30:23.400
<v Speaker 2>does include all their previous funding and thirty percent of

0:30:23.400 --> 0:30:26.320
<v Speaker 2>Amazon's capex from the beginning of twenty twenty four. Now,

0:30:26.320 --> 0:30:29.840
<v Speaker 2>these are inexact comparisons, but the classic argument is

0:30:29.880 --> 0:30:32.680
<v Speaker 2>that Uber burned lots of money and worked out okay,

0:30:32.760 --> 0:30:35.400
<v Speaker 2>when in fact the combined capital expenditures from twenty twenty

0:30:35.400 --> 0:30:38.120
<v Speaker 2>four onwards that are necessary to make OpenAI and

0:30:38.120 --> 0:30:41.320
<v Speaker 2>Anthropic work are each, on their own, four times what Uber

0:30:41.480 --> 0:30:45.880
<v Speaker 2>burned in over a decade. I also believe these numbers

0:30:45.920 --> 0:30:48.200
<v Speaker 2>are conservative. There's a good chance that OpenAI and

0:30:48.240 --> 0:30:51.920
<v Speaker 2>Anthropic dominate the capex of Amazon, Google, and Microsoft in

0:30:51.960 --> 0:30:54.120
<v Speaker 2>part because of what the fuck else are they buying

0:30:54.120 --> 0:30:56.920
<v Speaker 2>all these GPUs for as their own AI services don't

0:30:56.920 --> 0:31:00.720
<v Speaker 2>appear to be making much money at all. Anyway, to

0:31:00.760 --> 0:31:03.360
<v Speaker 2>put it real simple, AI has burned way more in

0:31:03.360 --> 0:31:05.720
<v Speaker 2>the last two years than Uber burned in ten. Uber

0:31:05.760 --> 0:31:07.920
<v Speaker 2>didn't burn money in the same way, didn't burn much

0:31:07.920 --> 0:31:10.840
<v Speaker 2>in the way of capital expenditures, didn't require massive amounts

0:31:10.840 --> 0:31:13.600
<v Speaker 2>of infrastructure, and isn't remotely the same in any way,

0:31:13.640 --> 0:31:15.840
<v Speaker 2>shape or form other than that it burned a lot

0:31:15.880 --> 0:31:18.160
<v Speaker 2>of money. And that burning wasn't because it was trying

0:31:18.200 --> 0:31:20.520
<v Speaker 2>to build the core product. It was trying to scale.
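
NOTE
A quick way to check the "true cost" figures quoted in this episode is to redo the sum.
This is a minimal sketch, not the host's own spreadsheet. The function name true_cost and
all variable names are made up for illustration. Figures are in billions of US dollars as
spoken in the episode, and the thirty percent capex share is the episode's stated assumption.
Anthropic's prior funding is not stated in the episode, so the sketch backs it out of the
quoted 77.1 total rather than asserting it.
```python
# All figures are in billions of US dollars, as quoted in the episode.
SHARE = 0.30  # assumed share of a cloud provider's capex attributed to one lab
def true_cost(capex_by_year, funding, share=SHARE):
    """Return share * total capex + funding raised (hypothetical helper)."""
    return share * sum(capex_by_year) + funding
microsoft_capex = [55.6, 80.0]  # 2024 actual, 2025 expected
amazon_capex = [83.0, 105.0]    # 2024 actual, 2025 expected
openai_funding = 41.4           # funding received so far, per the episode
openai_cost = true_cost(microsoft_capex, openai_funding)
# Capex component of the Anthropic figure: thirty percent of Amazon's capex.
amazon_share = SHARE * sum(amazon_capex)
# The episode quotes 77.1 as Anthropic's total; the funding below is implied
# by that total and is an inference, not a number given in the episode.
anthropic_total_quoted = 77.1
anthropic_funding_implied = anthropic_total_quoted - amazon_share
print(f"OpenAI true cost:           {openai_cost:.2f}")
print(f"30% of Amazon capex:        {amazon_share:.2f}")
print(f"Implied Anthropic funding:  {anthropic_funding_implied:.2f}")
```
Under these assumptions the OpenAI line comes to about 82.08, in line with the
"at least eighty two billion dollars" quoted in the episode. Changing SHARE shows
how sensitive both totals are to the thirty percent guess.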

0:31:20.720 --> 0:31:23.320
<v Speaker 2>It's all so stupid, And you know what, I'm not

0:31:23.400 --> 0:31:27.800
<v Speaker 2>even done. Our next and final AI booster episode will

0:31:27.800 --> 0:31:31.480
<v Speaker 2>breeze through the dumbest of the dumb arguments, and I'll

0:31:31.480 --> 0:31:34.360
<v Speaker 2>say why I'm finally drawing a line under these arguments

0:31:34.400 --> 0:31:36.760
<v Speaker 2>for real, because it needs to be said. We need

0:31:36.800 --> 0:31:41.240
<v Speaker 2>to say something. I hope you've enjoyed this, see you tomorrow, godspeed.

0:31:49.960 --> 0:31:52.400
<v Speaker 2>Thank you for listening to Better Offline. The editor and

0:31:52.400 --> 0:31:55.600
<v Speaker 2>composer of the Better Offline theme song is Mattosowski. You

0:31:55.600 --> 0:31:57.840
<v Speaker 2>can check out more of his music and audio projects

0:31:58.040 --> 0:32:01.520
<v Speaker 2>at mattosowski dot com, M A T T O S

0:32:01.600 --> 0:32:05.640
<v Speaker 2>O W S K I dot com. You can email me

0:32:05.680 --> 0:32:08.280
<v Speaker 2>at ez at betteroffline dot com or visit Better

0:32:08.320 --> 0:32:10.760
<v Speaker 2>Offline dot com to find more podcast links and of course,

0:32:10.800 --> 0:32:13.920
<v Speaker 2>my newsletter. I also really recommend you go to chat

0:32:13.960 --> 0:32:16.600
<v Speaker 2>dot wheresyoured dot at to visit the Discord, and

0:32:16.640 --> 0:32:19.360
<v Speaker 2>go to r slash BetterOffline to check out our Reddit.

0:32:20.120 --> 0:32:23.400
<v Speaker 2>Thank you so much for listening. Better Offline is a

0:32:23.400 --> 0:32:26.240
<v Speaker 2>production of Cool Zone Media. For more from Cool Zone Media,

0:32:26.600 --> 0:32:29.720
<v Speaker 2>visit our website coolzonemedia dot com, or check us

0:32:29.760 --> 0:32:32.760
<v Speaker 2>out on the iHeartRadio app, Apple Podcasts, or wherever you

0:32:32.800 --> 0:32:33.920
<v Speaker 2>get your podcasts.