WEBVTT - OpenAI's Video Generating AI Is Dead On Arrival

0:00:02.520 --> 0:00:07.880
<v Speaker 1>Cool Zone Media. Hello and welcome to Better Offline. As usual,

0:00:07.960 --> 0:00:21.600
<v Speaker 1>I'm your host, Ed Zitron. A few months ago, Open

0:00:21.640 --> 0:00:24.800
<v Speaker 1>AI showed off Sora, a product that can generate videos

0:00:24.800 --> 0:00:27.120
<v Speaker 1>based on a short text prompt, kind of like Chat

0:00:27.240 --> 0:00:31.800
<v Speaker 1>GPT does for text or DALL-E does for images. These videos,

0:00:31.840 --> 0:00:34.680
<v Speaker 1>which are usually no more than sixty seconds long, can

0:00:34.760 --> 0:00:38.040
<v Speaker 1>at times seem impressive until you notice a little detail

0:00:38.040 --> 0:00:40.640
<v Speaker 1>that breaks the entire facade, like in a video where

0:00:40.640 --> 0:00:42.559
<v Speaker 1>a cat wakes up its owner, but the owner's arm

0:00:42.600 --> 0:00:44.760
<v Speaker 1>appears to be part of the cushion and the cat's paw

0:00:44.880 --> 0:00:48.360
<v Speaker 1>explodes out of its arm like an ameba. Reactions to

0:00:48.479 --> 0:00:51.440
<v Speaker 1>Sora's AI-generated videos, and indeed the existence of the

0:00:51.440 --> 0:00:54.840
<v Speaker 1>model itself, have ranged from kind of a breathless hype

0:00:54.920 --> 0:00:57.840
<v Speaker 1>to genuine fear that this will be used to replace

0:00:57.920 --> 0:01:01.640
<v Speaker 1>video producers, in that it can create reality adjacent videos

0:01:01.680 --> 0:01:04.880
<v Speaker 1>that for a few seconds kind of seem real, especially

0:01:04.959 --> 0:01:07.280
<v Speaker 1>in the case of some of OpenAI's handpicked

0:01:07.360 --> 0:01:12.200
<v Speaker 1>demo videos. Yet even in these handpicked Sora outputs, you'll

0:01:12.240 --> 0:01:15.800
<v Speaker 1>find these weird little things that immediately shatter the illusion,

0:01:16.160 --> 0:01:19.119
<v Speaker 1>like one where a woman's legs awkwardly shuffle, then somehow

0:01:19.160 --> 0:01:21.959
<v Speaker 1>switch sides as she walks around, or blobs of people

0:01:22.000 --> 0:01:25.240
<v Speaker 1>merging in the background of images. These are, on some

0:01:25.520 --> 0:01:31.280
<v Speaker 1>level, genuinely remarkable technological achievements, until you consider what

0:01:31.440 --> 0:01:35.560
<v Speaker 1>they are and what they might do, and that there

0:01:35.600 --> 0:01:39.120
<v Speaker 1>are problems in them that run through the entire fabric

0:01:39.200 --> 0:01:43.319
<v Speaker 1>of artificial intelligence. A little over a month after Sora

0:01:43.520 --> 0:01:46.440
<v Speaker 1>was announced, OpenAI would debut a series of short films,

0:01:46.480 --> 0:01:49.760
<v Speaker 1>including one called Airhead, where filmmakers Shy Kids told the

0:01:49.760 --> 0:01:51.360
<v Speaker 1>story of a man with a balloon for a head,

0:01:51.800 --> 0:01:55.120
<v Speaker 1>and because this is AI, said balloon changes sizes twenty three,

0:01:55.280 --> 0:01:58.240
<v Speaker 1>twenty four, twenty six, twenty seven, twenty nine, thirty two,

0:01:58.280 --> 0:02:01.240
<v Speaker 1>thirty four, thirty nine, forty one, forty two, forty three,

0:02:01.240 --> 0:02:03.960
<v Speaker 1>and forty five seconds into the piece, at which point

0:02:04.000 --> 0:02:06.120
<v Speaker 1>I stopped counting because it got boring and I really

0:02:06.120 --> 0:02:08.280
<v Speaker 1>don't want to be mean to Shy Kids, as this

0:02:08.600 --> 0:02:12.840
<v Speaker 1>really isn't their fault. The very nature of filmmaking is

0:02:12.840 --> 0:02:15.560
<v Speaker 1>that you take different shots of the same thing, something

0:02:15.639 --> 0:02:19.200
<v Speaker 1>that I anticipated Sora was incapable of doing, as each

0:02:19.280 --> 0:02:22.480
<v Speaker 1>shot is generated fresh, as Sora itself, much like all

0:02:22.600 --> 0:02:27.360
<v Speaker 1>generative AI, does not actually know anything. When one asks

0:02:27.560 --> 0:02:29.880
<v Speaker 1>for a man with a yellow balloon as his head,

0:02:30.160 --> 0:02:32.960
<v Speaker 1>Sora must then look at the parameters spawned during its

0:02:33.000 --> 0:02:36.040
<v Speaker 1>training process and create an output guessing what a man

0:02:36.080 --> 0:02:38.360
<v Speaker 1>looks like, what a balloon looks like, what a man's

0:02:38.480 --> 0:02:41.760
<v Speaker 1>features are on his body, what color yellow is, what

0:02:41.760 --> 0:02:45.680
<v Speaker 1>the man's doing, and so on and so forth. This

0:02:45.800 --> 0:02:49.680
<v Speaker 1>becomes extremely problematic when you're working in film or television,

0:02:49.720 --> 0:02:52.120
<v Speaker 1>where viewers are far more likely to see when something

0:02:52.240 --> 0:02:55.680
<v Speaker 1>just doesn't look right, a problem exacerbated by moving images,

0:02:55.840 --> 0:02:59.800
<v Speaker 1>high resolution footage, and big television screens which are now ubiquitous.

0:03:00.680 --> 0:03:05.400
<v Speaker 1>Yet the press, as usual, credulously accepted Sora's "stunning"

0:03:05.520 --> 0:03:09.360
<v Speaker 1>videos that were "amazing and scary," suggesting to the public

0:03:09.400 --> 0:03:11.280
<v Speaker 1>that we were on the verge of some sort of

0:03:11.440 --> 0:03:16.760
<v Speaker 1>artificial intelligence takeover of the film industry, helping buoy Sam Altman,

0:03:16.880 --> 0:03:20.440
<v Speaker 1>their CEO, and his dumbass attempts to convince Hollywood that

0:03:20.600 --> 0:03:25.440
<v Speaker 1>Sora won't destroy the movie business. These stories only serve

0:03:25.520 --> 0:03:28.680
<v Speaker 1>to help Sam Altman, who desperately needs you to believe

0:03:28.680 --> 0:03:31.480
<v Speaker 1>that Hollywood is scared of Sora and even more scared

0:03:31.480 --> 0:03:34.120
<v Speaker 1>of generative AI, because the more you talk about fear

0:03:34.160 --> 0:03:36.680
<v Speaker 1>and lost jobs and the machines taking over, the less

0:03:36.720 --> 0:03:40.360
<v Speaker 1>you ask a very, very simple question: does any of

0:03:40.360 --> 0:03:45.160
<v Speaker 1>this shit actually work? The answer, it turns out, is

0:03:45.200 --> 0:03:48.560
<v Speaker 1>not very well. In a piece for FX Guide, Mike

0:03:48.600 --> 0:03:51.160
<v Speaker 1>Seymour sat down with Shy Kids, the people behind Airhead,

0:03:51.320 --> 0:03:54.720
<v Speaker 1>and revealed how Sora is in many ways a little

0:03:54.720 --> 0:03:58.680
<v Speaker 1>bit useless for making films. Sora takes ten to twenty

0:03:58.720 --> 0:04:01.560
<v Speaker 1>minutes to generate a single three to twenty second shot,

0:04:02.000 --> 0:04:04.400
<v Speaker 1>something that isn't really a problem until you realize that

0:04:04.520 --> 0:04:07.520
<v Speaker 1>until the shot is rendered, you really have absolutely no

0:04:07.600 --> 0:04:10.600
<v Speaker 1>idea what the hell it's going to spit out. Sora

0:04:10.880 --> 0:04:13.480
<v Speaker 1>has no mechanism to connect one shot to another, even

0:04:13.480 --> 0:04:17.640
<v Speaker 1>with hyperdescriptive prompts. It hallucinates extra features when you haven't

0:04:17.680 --> 0:04:20.360
<v Speaker 1>asked for them. And Shy Kids were shocked by how

0:04:20.400 --> 0:04:23.680
<v Speaker 1>surprised OpenAI's researchers were when they requested the ability

0:04:23.680 --> 0:04:27.080
<v Speaker 1>to use a prompt to request a particular angle in

0:04:27.120 --> 0:04:31.520
<v Speaker 1>a shot, a feature that was initially unavailable. It took

0:04:32.200 --> 0:04:35.200
<v Speaker 1>this is what kind of drives me crazy here and

0:04:35.240 --> 0:04:38.279
<v Speaker 1>you'll hear this in the interview with him later. These

0:04:38.320 --> 0:04:40.960
<v Speaker 1>people that are OpenAI people, and they were making

0:04:40.960 --> 0:04:44.560
<v Speaker 1>this tool for making visual images for making moving images.

0:04:44.600 --> 0:04:47.320
<v Speaker 1>They didn't think that people might want different shots. I'm

0:04:47.320 --> 0:04:49.480
<v Speaker 1>so glad these are the people who were in control

0:04:49.520 --> 0:04:53.159
<v Speaker 1>of the future. Anyway, to quote the piece, it took

0:04:53.320 --> 0:04:56.400
<v Speaker 1>hundreds of generations at ten to twenty seconds apiece

0:04:56.440 --> 0:05:00.720
<v Speaker 1>to make a minute and nineteen second long film. And

0:05:00.760 --> 0:05:05.679
<v Speaker 1>what's really fun about this is that the movie's fine.

0:05:05.800 --> 0:05:09.200
<v Speaker 1>It was kind of fine. I just, I have

0:05:09.320 --> 0:05:11.280
<v Speaker 1>nothing really to say about it. It's a minute and

0:05:11.360 --> 0:05:15.720
<v Speaker 1>twenty seconds long, but it's it kind of works. But also,

0:05:15.960 --> 0:05:18.799
<v Speaker 1>the balloon looks different in every other shot. This isn't

0:05:18.880 --> 0:05:23.279
<v Speaker 1>Shy Kids's fault. But also this isn't gonna get better.

0:05:23.480 --> 0:05:26.520
<v Speaker 1>And I will get into why as we go along.

0:05:28.080 --> 0:05:31.479
<v Speaker 1>These tiny little problems I've mentioned, though, they all lead

0:05:31.520 --> 0:05:35.359
<v Speaker 1>to one overwhelming issue that Sora isn't so much a

0:05:35.440 --> 0:05:37.800
<v Speaker 1>tool to make movies as it is a big, fat

0:05:37.839 --> 0:05:40.360
<v Speaker 1>slot machine that spits out footage that may or may

0:05:40.400 --> 0:05:43.440
<v Speaker 1>not be of any use at all. Almost all of

0:05:43.440 --> 0:05:47.360
<v Speaker 1>the footage in Airhead was graded, treated, stabilized, and upscaled,

0:05:48.000 --> 0:05:50.800
<v Speaker 1>and that ten to twenty second lead time on generations

0:05:50.920 --> 0:05:54.520
<v Speaker 1>was for 480p resolution footage, meaning that

0:05:54.600 --> 0:05:58.200
<v Speaker 1>even useful footage needed significant post-production work to look

0:05:58.200 --> 0:06:00.680
<v Speaker 1>good enough, and just to give you an idea for

0:06:00.760 --> 0:06:02.840
<v Speaker 1>the non technical members of the audience, and this is fair.

0:06:03.839 --> 0:06:06.599
<v Speaker 1>The video you see on YouTube is usually somewhere between

0:06:06.600 --> 0:06:09.920
<v Speaker 1>720p, 1080p, or 4K. The TV shows

0:06:09.960 --> 0:06:13.880
<v Speaker 1>you watch, usually 1080p, 4K, or upscaled 1080p.

0:06:14.120 --> 0:06:16.359
<v Speaker 1>These are all lots of numbers. What I'm saying is

0:06:16.839 --> 0:06:20.440
<v Speaker 1>the stuff that Sora spits out, that takes burning a

0:06:20.440 --> 0:06:24.680
<v Speaker 1>small zoo to spit out, is incredibly low resolution, on

0:06:24.760 --> 0:06:29.599
<v Speaker 1>top of not being specific. Look, to put it as

0:06:29.640 --> 0:06:34.119
<v Speaker 1>plainly as possible, every single time that Shy Kids wanted

0:06:34.120 --> 0:06:37.400
<v Speaker 1>to generate a shot, even a three second long shot,

0:06:37.600 --> 0:06:40.440
<v Speaker 1>they would give Sora a text prompt and then they

0:06:40.440 --> 0:06:44.040
<v Speaker 1>would wait at least ten minutes to find out if

0:06:44.080 --> 0:06:47.640
<v Speaker 1>it was right, and they'd have to accept footage that

0:06:47.800 --> 0:06:52.000
<v Speaker 1>was subprime or inaccurate. And there's a really good example

0:06:52.040 --> 0:06:54.479
<v Speaker 1>of this. If you watch Airhead, a lot of the

0:06:54.520 --> 0:06:57.240
<v Speaker 1>shots are in slow motion, and you may think, no,

0:06:57.400 --> 0:07:00.040
<v Speaker 1>this is a cinematic choice, right, because you're kind of

0:07:00.160 --> 0:07:02.200
<v Speaker 1>just admiring this man with a balloon for a head

0:07:02.240 --> 0:07:05.880
<v Speaker 1>going about his business. No, no, no, no, no. They

0:07:06.000 --> 0:07:08.440
<v Speaker 1>found that this was just what Sora wanted to give

0:07:08.480 --> 0:07:10.880
<v Speaker 1>them when they asked for it. This was, in and

0:07:10.920 --> 0:07:14.520
<v Speaker 1>of itself a hallucination. In the same way that Chat

0:07:14.600 --> 0:07:18.560
<v Speaker 1>GPT will authoritatively tell you that something is true that

0:07:18.720 --> 0:07:22.040
<v Speaker 1>is not, Sora will spit out a man running in

0:07:22.080 --> 0:07:27.960
<v Speaker 1>slow motion despite you not asking for that, And it's

0:07:27.960 --> 0:07:31.040
<v Speaker 1>so weird. They had to, to quote them, do quite a

0:07:31.080 --> 0:07:33.880
<v Speaker 1>bit of adjusting to keep the whole thing from feeling

0:07:34.520 --> 0:07:37.920
<v Speaker 1>like a big slow-mo project, and it still kind

0:07:37.920 --> 0:07:43.680
<v Speaker 1>of does. And that's rough. That's really rough. But you know,

0:07:43.800 --> 0:07:46.920
<v Speaker 1>I'm a curious little critter, So I decided to sit

0:07:47.000 --> 0:07:49.640
<v Speaker 1>down with Shy Kids's Walter Woodman to talk about his

0:07:49.680 --> 0:07:52.120
<v Speaker 1>experience with Sora and have him delve a little deeper

0:07:52.120 --> 0:07:55.040
<v Speaker 1>into his experience with the product. And I'd say he

0:07:55.120 --> 0:07:59.000
<v Speaker 1>had a far more utopian experience and perspective on the

0:07:59.040 --> 0:08:03.560
<v Speaker 1>whole thing than I expected. Now, some of you might

0:08:04.320 --> 0:08:07.320
<v Speaker 1>critique Walter for being so positive about it, but I

0:08:07.320 --> 0:08:09.520
<v Speaker 1>actually caution you to just listen to what he's saying,

0:08:10.040 --> 0:08:13.400
<v Speaker 1>because Walter's perspective is interesting. He sees this as a tool,

0:08:13.440 --> 0:08:15.680
<v Speaker 1>he doesn't see it as a replacement, and I think

0:08:15.680 --> 0:08:18.320
<v Speaker 1>it's a valid perspective to come at Sora with. I

0:08:18.360 --> 0:08:21.560
<v Speaker 1>also think it's a perspective that kind of accepts a

0:08:21.640 --> 0:08:25.440
<v Speaker 1>conceit of OpenAI's marketing strategy, that these things will

0:08:25.480 --> 0:08:30.520
<v Speaker 1>get better. If they do, perhaps Walter is right, perhaps

0:08:30.560 --> 0:08:33.600
<v Speaker 1>this will be an essential tool in filmmaking, even though

0:08:33.600 --> 0:08:35.440
<v Speaker 1>he didn't say essential. Don't want to put words in

0:08:35.440 --> 0:08:39.240
<v Speaker 1>the man's mouth, but I don't think that's the case.

0:08:40.320 --> 0:08:54.319
<v Speaker 1>Let me talk to him. You decide for yourself. All right.

0:08:54.440 --> 0:08:57.960
<v Speaker 1>So how did the relationship between Shy Kids and Open

0:08:58.000 --> 0:08:58.920
<v Speaker 1>AI actually begin?

0:09:00.160 --> 0:09:03.840
<v Speaker 2>The relationship between Shy Kids and OpenAI began when

0:09:03.880 --> 0:09:08.079
<v Speaker 2>we made an installation for a film called Dalíland,

0:09:08.240 --> 0:09:12.560
<v Speaker 2>which was premiering at Toronto International Film Festival, and we

0:09:12.559 --> 0:09:15.480
<v Speaker 2>were the only people that our friends at Pressman Film

0:09:15.600 --> 0:09:19.720
<v Speaker 2>knew in Toronto, and so we made an installation that

0:09:19.840 --> 0:09:26.040
<v Speaker 2>looked like Salvador Dali's like studio inside of the basement

0:09:26.240 --> 0:09:29.679
<v Speaker 2>of the Saint Regis, which is where he lived and

0:09:30.240 --> 0:09:33.880
<v Speaker 2>made work out of, And inside of that installation we

0:09:34.600 --> 0:09:38.360
<v Speaker 2>made a like you could make your own surrealist painting,

0:09:39.520 --> 0:09:41.800
<v Speaker 2>and the way that you could make that was using

0:09:41.880 --> 0:09:48.160
<v Speaker 2>DALL-E, the OpenAI program, and so the OpenAI

0:09:48.320 --> 0:09:53.840
<v Speaker 2>people came to visit and check out the like what

0:09:53.920 --> 0:09:56.080
<v Speaker 2>we were working on, and making sure that it was

0:09:56.160 --> 0:09:58.080
<v Speaker 2>like something that they wanted to be a part of.

0:09:58.840 --> 0:09:59.840
<v Speaker 3>And so.

0:10:01.240 --> 0:10:05.440
<v Speaker 2>They met our producer Sydney, who they loved. She's easy

0:10:05.480 --> 0:10:06.120
<v Speaker 2>to love.

0:10:06.360 --> 0:10:07.840
<v Speaker 3>And they.

0:10:09.120 --> 0:10:11.840
<v Speaker 2>We sent them our previous work and so from there

0:10:12.120 --> 0:10:16.520
<v Speaker 2>they asked us to join this artist group. And then

0:10:16.720 --> 0:10:18.800
<v Speaker 2>when Sora came out, we saw it at the same

0:10:18.840 --> 0:10:24.720
<v Speaker 2>time as everyone else and we yeah, we got tapped

0:10:24.720 --> 0:10:27.720
<v Speaker 2>on the shoulder and said, hey, would you like to

0:10:27.800 --> 0:10:29.400
<v Speaker 2>check this out and try this out? And we said,

0:10:29.440 --> 0:10:32.360
<v Speaker 2>of course, that's how it came to be.

0:10:33.280 --> 0:10:37.119
<v Speaker 1>So how did you onboard? Were you just given access?

0:10:37.280 --> 0:10:39.959
<v Speaker 1>Did they give you instructions? Did they physically come to you?

0:10:40.480 --> 0:10:44.199
<v Speaker 2>What was that like? It was top secret. They

0:10:44.240 --> 0:10:48.720
<v Speaker 2>gave us a briefcase and in a cloudy room.

0:10:48.960 --> 0:10:49.720
<v Speaker 3>No, it was.

0:10:50.840 --> 0:10:54.000
<v Speaker 2>Yeah, there was a very simple onboarding process where they

0:10:54.080 --> 0:10:58.080
<v Speaker 2>walked us through the technology as well as some of

0:10:58.120 --> 0:11:05.160
<v Speaker 2>its features, and yeah, it was pretty. It was pretty.

0:11:05.400 --> 0:11:07.640
<v Speaker 2>And then from there they gave us access to begin

0:11:08.280 --> 0:11:09.959
<v Speaker 2>using it and making things.

0:11:09.600 --> 0:11:13.160
<v Speaker 1>And you were allowed to use it without their presence?

0:11:13.200 --> 0:11:14.319
<v Speaker 1>You had direct access?

0:11:14.360 --> 0:11:15.400
<v Speaker 3>Yep, yep.

0:11:16.320 --> 0:11:20.240
<v Speaker 1>So okay, did you get instructions on how to write

0:11:20.280 --> 0:11:23.479
<v Speaker 1>effective prompts or did you just kind of do trial and error?

0:11:23.200 --> 0:11:25.439
<v Speaker 3>No, nothing like that.

0:11:25.600 --> 0:11:29.320
<v Speaker 2>I mean in the artist group itself, there's a lot

0:11:29.360 --> 0:11:33.440
<v Speaker 2>of really amazing and thoughtful creative people who kind of

0:11:34.160 --> 0:11:37.040
<v Speaker 2>show their work and show how they got to make

0:11:37.120 --> 0:11:43.480
<v Speaker 2>the things that they did. But no, not, there was

0:11:43.600 --> 0:11:49.480
<v Speaker 2>no real engineering of our prompts. They were very much

0:11:49.720 --> 0:11:55.360
<v Speaker 2>just play, kind of see what comes out of you.

0:11:55.360 --> 0:12:00.040
<v Speaker 2>You're creative people that we trust. Why don't

0:11:59.880 --> 0:12:03.440
<v Speaker 3>you just see what works, throw spaghetti at the wall?

0:12:04.360 --> 0:12:07.800
<v Speaker 1>That's cool. So in the piece with FX

0:12:07.880 --> 0:12:11.240
<v Speaker 1>Guide, in the interview, someone from Shy Kids said

0:12:11.240 --> 0:12:14.839
<v Speaker 1>that OpenAI's researchers were surprised when they were

0:12:14.880 --> 0:12:20.400
<v Speaker 1>asked about being able to say specific shots. What happened there?

0:12:20.840 --> 0:12:23.120
<v Speaker 1>Was it just that you tried to ask Sora to

0:12:23.120 --> 0:12:25.040
<v Speaker 1>do specific shots and it didn't work, or was it

0:12:25.120 --> 0:12:26.040
<v Speaker 1>just not a feature?

0:12:27.760 --> 0:12:30.520
<v Speaker 2>I think that's maybe taken a little bit out of context.

0:12:30.840 --> 0:12:31.599
<v Speaker 3>I think.

0:12:32.880 --> 0:12:38.000
<v Speaker 2>More so it's just people come from distant, different disciplines.

0:12:37.480 --> 0:12:39.079
<v Speaker 3>And when.

0:12:40.760 --> 0:12:43.760
<v Speaker 2>I say a wide shot on a one hundred and

0:12:43.800 --> 0:12:50.160
<v Speaker 2>thirty millimeter lens, people from my area of expertise know

0:12:50.400 --> 0:12:52.360
<v Speaker 2>sort of immediately what I'm talking about.

0:12:52.400 --> 0:12:55.160
<v Speaker 3>Whereas the researchers, they are.

0:12:56.200 --> 0:13:01.440
<v Speaker 2>More invested in sort of other other things, and so

0:13:02.320 --> 0:13:05.839
<v Speaker 2>it's it's not so much that they didn't understand or

0:13:05.920 --> 0:13:08.920
<v Speaker 2>that sort of didn't understand. It's more so just there's

0:13:08.960 --> 0:13:11.280
<v Speaker 2>all these terms in films.

0:13:10.720 --> 0:13:12.400
<v Speaker 3>like a zolly or like a

0:13:12.520 --> 0:13:15.800
<v Speaker 2>Hitchcock zoom or all of these different things that are

0:13:16.520 --> 0:13:19.320
<v Speaker 2>very understandable, but even when you go from set to set,

0:13:19.360 --> 0:13:22.680
<v Speaker 2>they mean something different. So I think it's about trying

0:13:22.800 --> 0:13:28.200
<v Speaker 2>to create a lingua franca between all of these sort

0:13:28.240 --> 0:13:34.360
<v Speaker 2>of different, very different people and very different ways of

0:13:34.480 --> 0:13:37.680
<v Speaker 2>using a tool. What I may call a zoom, you

0:13:37.760 --> 0:13:40.360
<v Speaker 2>may call a dolly shot, et cetera, et cetera.

0:13:40.480 --> 0:13:44.559
<v Speaker 1>So that feels like a training data challenge.

0:13:44.760 --> 0:13:49.200
<v Speaker 2>Yeah, I think it's about trying to figure out how

0:13:49.360 --> 0:13:53.520
<v Speaker 2>and yeah, exactly what to what to train on.

0:13:54.600 --> 0:13:58.480
<v Speaker 1>Yeah, so tell me what was the interface like? Was

0:13:58.480 --> 0:14:01.120
<v Speaker 1>it a chat box? Did you have, like... Just

0:14:01.160 --> 0:14:02.679
<v Speaker 1>tell me about what it actually looked like.

0:14:03.520 --> 0:14:07.839
<v Speaker 2>Sure, there's limitations of what I can say about things

0:14:07.920 --> 0:14:13.480
<v Speaker 2>like that, but I think the way that I've described

0:14:13.480 --> 0:14:16.679
<v Speaker 2>it to people without giving too much away is I

0:14:16.800 --> 0:14:21.040
<v Speaker 2>think if you're familiar with using something like the Adobe Suite.

0:14:21.480 --> 0:14:26.480
<v Speaker 2>I think that there's some commonalities whether you're using After

0:14:26.520 --> 0:14:32.600
<v Speaker 2>Effects or Premiere or whatever, Illustrator, there's like commonalities and

0:14:32.640 --> 0:14:35.280
<v Speaker 2>if you can use one, you can sort of flu's

0:14:35.320 --> 0:14:39.560
<v Speaker 2>your way around the others. I would say it's very

0:14:39.600 --> 0:14:42.800
<v Speaker 2>similar like that with Open

0:14:42.480 --> 0:14:46.200
<v Speaker 3>AI's tools and models, that if you are

0:14:47.200 --> 0:14:51.840
<v Speaker 2>used to things like ChatGPT and DALL-E and those

0:14:51.880 --> 0:14:57.360
<v Speaker 2>types of models, I think you will find it find

0:14:57.400 --> 0:14:59.600
<v Speaker 2>an ease of use in using Sora.

0:15:01.400 --> 0:15:04.320
<v Speaker 1>So within that article they mentioned that there was like

0:15:04.320 --> 0:15:07.560
<v Speaker 1>a three hundred to one shooting ratio, which correct me

0:15:07.560 --> 0:15:09.800
<v Speaker 1>if I'm wrong, means like three hundred seconds of material

0:15:10.560 --> 0:15:13.720
<v Speaker 1>for each second of usable material. How does that compare to

0:15:14.320 --> 0:15:18.400
<v Speaker 1>conventional filmmaking, in your experience?

0:15:18.320 --> 0:15:20.920
<v Speaker 2>It would be even more seconds than that. I would say,

0:15:21.160 --> 0:15:26.280
<v Speaker 2>just three hundred shots at probably ten to twenty seconds apiece.

0:15:26.440 --> 0:15:30.080
<v Speaker 2>So whatever the math is on that, I would say

0:15:30.080 --> 0:15:35.000
<v Speaker 2>that that's pretty common with shooting. You know, when you

0:15:35.160 --> 0:15:40.040
<v Speaker 2>are shooting a fiction film or like even a documentary

0:15:40.120 --> 0:15:42.800
<v Speaker 2>is even crazier for that you shoot all day and

0:15:42.840 --> 0:15:47.760
<v Speaker 2>all day and from We shot a documentary recently and

0:15:47.840 --> 0:15:50.280
<v Speaker 2>I actually had to go back and watch all the dailies,

0:15:50.920 --> 0:15:54.560
<v Speaker 2>we counted about ninety hours of footage that we had,

0:15:54.840 --> 0:15:57.920
<v Speaker 2>and from that ninety hours, you're making an hour and

0:15:57.920 --> 0:15:59.800
<v Speaker 2>a half movie, So you.

0:15:59.760 --> 0:16:02.360
<v Speaker 3>Know, you are really trimming things down.

0:16:02.440 --> 0:16:06.600
<v Speaker 2>And I think also it's like you are getting the

0:16:06.720 --> 0:16:11.880
<v Speaker 2>five seconds that work or the you know, the section

0:16:12.200 --> 0:16:15.600
<v Speaker 2>of that shot that works. And I would say that's

0:16:15.600 --> 0:16:17.200
<v Speaker 2>pretty common to filmmaking.

0:16:19.240 --> 0:16:21.920
<v Speaker 1>How about narrative filmmaking, because I know documentary you have

0:16:21.960 --> 0:16:25.120
<v Speaker 1>a lot of stuff, But I'm just wondering what the

0:16:25.160 --> 0:16:28.400
<v Speaker 1>burden of selection is like compared to the amount of

0:16:28.400 --> 0:16:30.760
<v Speaker 1>shots you take in just a regular movie or regular

0:16:30.840 --> 0:16:31.400
<v Speaker 1>short film.

0:16:31.440 --> 0:16:34.160
<v Speaker 3>Even. Again, I would

0:16:33.920 --> 0:16:36.520
<v Speaker 2>say, at least I can only speak for the way

0:16:36.560 --> 0:16:40.160
<v Speaker 2>that I shoot films. You know, if you had it's subjective.

0:16:40.400 --> 0:16:43.560
<v Speaker 2>It's subjective for sure. If you're David Fincher, you're shooting

0:16:43.640 --> 0:16:47.120
<v Speaker 2>eight hundred takes of like someone picking up a pencil,

0:16:47.320 --> 0:16:50.560
<v Speaker 2>or Stanley Kubrick, you know, is like famous for a

0:16:50.680 --> 0:16:55.240
<v Speaker 2>thousand takes. I would say that the burn rate was

0:16:55.320 --> 0:16:59.680
<v Speaker 2>very similar. I would say that the challenges with Sora

0:17:00.480 --> 0:17:05.560
<v Speaker 2>are like it's unbelievable at making these images that are

0:17:06.560 --> 0:17:09.800
<v Speaker 2>unbelievable and so interesting to look at, but

0:17:11.480 --> 0:17:14.400
<v Speaker 3>at its current state, it

0:17:14.480 --> 0:17:19.080
<v Speaker 2>can sometimes be difficult to do things that in traditional

0:17:19.080 --> 0:17:21.880
<v Speaker 2>shooting would be much easier, where you say, hey, can

0:17:21.680 --> 0:17:23.920
<v Speaker 3>that guy go over here?

0:17:24.040 --> 0:17:26.199
<v Speaker 2>Or can that person move from one side of the

0:17:26.200 --> 0:17:30.600
<v Speaker 2>screen to the other. Things like that are more difficult.

0:17:30.600 --> 0:17:34.320
<v Speaker 2>But again this is baby steps. We are in like

0:17:34.480 --> 0:17:37.919
<v Speaker 2>the toddler phase, so I assume that those things will

0:17:37.960 --> 0:17:38.400
<v Speaker 2>get better.

0:17:39.880 --> 0:17:44.040
<v Speaker 1>So you mentioned, well, Shy Kids mentioned in the interview

0:17:44.200 --> 0:17:47.080
<v Speaker 1>that by default it tries to prevent you from creating

0:17:47.200 --> 0:17:51.919
<v Speaker 1>videos that violate copyright law, existing copyrights. Did you accidentally

0:17:52.840 --> 0:17:55.040
<v Speaker 1>bump into this regularly or was this something that just

0:17:55.080 --> 0:17:56.199
<v Speaker 1>didn't really bother you?

0:17:57.560 --> 0:18:00.760
<v Speaker 2>No, you couldn't generate things that So when I was

0:18:00.960 --> 0:18:04.960
<v Speaker 2>mentioning like a Hitchcock zoom, you couldn't mention Hitchcock, So

0:18:05.040 --> 0:18:07.480
<v Speaker 2>you had to find a different way to describe that

0:18:07.640 --> 0:18:13.960
<v Speaker 2>as opposed to like using public figures, anything that would

0:18:13.960 --> 0:18:17.119
<v Speaker 2>have a public figure or a title you would not

0:18:17.160 --> 0:18:21.760
<v Speaker 2>be allowed to generate. From my experience, there wasn't too

0:18:21.800 --> 0:18:26.200
<v Speaker 2>many logos or brands or anything like that in any

0:18:26.240 --> 0:18:28.280
<v Speaker 2>of the things that I generated, and.

0:18:29.600 --> 0:18:32.640
<v Speaker 1>But something copyrighted. Did you generate anything that looked copyrighted?

0:18:33.080 --> 0:18:36.680
<v Speaker 3>No. Not to my, not to my eye.

0:18:36.760 --> 0:18:41.560
<v Speaker 1>That's fine. So well, I know you don't know how

0:18:41.640 --> 0:18:44.200
<v Speaker 1>much Sora will cost, and we don't know that, don't

0:18:44.200 --> 0:18:46.920
<v Speaker 1>even know when it will launch. Can you talk about

0:18:46.920 --> 0:18:48.639
<v Speaker 1>how much you'd be willing to pay for it? What

0:18:48.720 --> 0:18:50.600
<v Speaker 1>do you think it's worth? And I realize that this

0:18:50.760 --> 0:18:52.280
<v Speaker 1>is a vague question.

0:18:53.240 --> 0:18:53.760
<v Speaker 3>For sure.

0:18:55.600 --> 0:19:02.840
<v Speaker 2>I think that there is this illusion that Sora will

0:19:02.880 --> 0:19:08.000
<v Speaker 2>be this solution to all problems, and I don't think

0:19:08.040 --> 0:19:10.800
<v Speaker 2>that that is the case. I think Sora is a

0:19:10.840 --> 0:19:15.880
<v Speaker 2>tool amongst many tools, and for certain things it will

0:19:15.920 --> 0:19:16.840
<v Speaker 2>be very valuable.

0:19:17.040 --> 0:19:17.400
<v Speaker 3>And so.

0:19:19.000 --> 0:19:21.280
<v Speaker 2>In terms of value, it's like, well, how much is

0:19:21.320 --> 0:19:24.399
<v Speaker 2>a glass of water? Well, yes, if a glass of

0:19:24.440 --> 0:19:28.080
<v Speaker 2>water is just like right now in my kitchen, I

0:19:27.560 --> 0:19:29.320
<v Speaker 3>wouldn't like to pay that high for it.

0:19:29.720 --> 0:19:31.760
<v Speaker 2>If a glass of water is for a person in

0:19:31.800 --> 0:19:34.840
<v Speaker 2>the desert who desperately needs that glass of water, you

0:19:34.920 --> 0:19:37.600
<v Speaker 2>can really name your price. And I would say that

0:19:38.119 --> 0:19:42.240
<v Speaker 2>for some projects, I think that the usage of Sora

0:19:42.400 --> 0:19:44.560
<v Speaker 2>would be absolutely invaluable, and

0:19:44.560 --> 0:19:47.240
<v Speaker 3>I would, I would...

0:19:47.680 --> 0:19:49.680
<v Speaker 2>I don't know how much exactly that would be, would

0:19:49.680 --> 0:19:51.800
<v Speaker 2>depend on the budget, would depend on the limits and

0:19:51.840 --> 0:19:56.640
<v Speaker 2>the scales, but I would say that there's other projects

0:19:56.640 --> 0:19:58.960
<v Speaker 2>where I think it would be like totally inappropriate or

0:19:59.000 --> 0:20:04.600
<v Speaker 2>like just not worth... Like what? Well, just when I

0:20:04.640 --> 0:20:08.280
<v Speaker 2>think of Studio Ghibli films that are hand drawn, and

0:20:09.760 --> 0:20:12.760
<v Speaker 2>I think the reason that those films work is because

0:20:12.800 --> 0:20:16.080
<v Speaker 2>of the way that they're made, or I think that

0:20:16.119 --> 0:20:19.280
<v Speaker 2>when you think of Aardman animation, it's like I

0:20:19.320 --> 0:20:21.720
<v Speaker 2>feel that you could feel the fingerprints in that clay,

0:20:22.240 --> 0:20:24.959
<v Speaker 2>and so I don't think maybe for those types of

0:20:25.000 --> 0:20:29.040
<v Speaker 2>films that it would be appropriate, But I think for

0:20:29.119 --> 0:20:31.880
<v Speaker 2>other types of films like Airhead or others, I think

0:20:31.920 --> 0:20:36.960
<v Speaker 2>it would be extremely appropriate. I think it's up to

0:20:37.000 --> 0:20:42.240
<v Speaker 2>the artist's sort of discretion how much they think that

0:20:42.240 --> 0:20:43.520
<v Speaker 2>that tool is needed.

0:20:45.000 --> 0:20:50.440
<v Speaker 1>Doesn't the inconsistency of shots make this deeply impractical,

0:20:50.520 --> 0:20:52.199
<v Speaker 1>because that's the thing I kept coming back to.

0:20:53.000 --> 0:20:55.359
<v Speaker 2>Yeah, I mean, depends on what project you're working on.

0:20:55.400 --> 0:20:58.000
<v Speaker 2>And again, I think that this is like early days.

0:20:58.359 --> 0:21:00.960
<v Speaker 2>I think that these are kinks and bugs that are

0:21:01.119 --> 0:21:07.280
<v Speaker 2>going to be changed, and already from day one where

0:21:07.280 --> 0:21:12.440
<v Speaker 2>we started using it to where we are today, massive

0:21:12.480 --> 0:21:15.919
<v Speaker 2>improvements have happened, and actually improvements where they've listened to

0:21:16.080 --> 0:21:19.919
<v Speaker 2>things that we have suggested and things that we'd like

0:21:20.000 --> 0:21:21.560
<v Speaker 2>to see and tools we'd.

0:21:21.440 --> 0:21:22.040
<v Speaker 3>Like to see.

0:21:22.119 --> 0:21:31.400
<v Speaker 2>So I think that, for example, for Airhead, the inconsistency

0:21:31.520 --> 0:21:38.800
<v Speaker 2>of having a protagonist, having a protagonist that stays true

0:21:39.000 --> 0:21:41.119
<v Speaker 2>through all these different shots, that's the reason why we

0:21:41.160 --> 0:21:43.680
<v Speaker 2>put a balloon in front of their head, Because while

0:21:43.680 --> 0:21:47.760
<v Speaker 2>different bodies can sort of be accepted, a different face

0:21:47.800 --> 0:21:49.400
<v Speaker 2>and a different head is going to be a little

0:21:49.440 --> 0:21:53.880
<v Speaker 2>bit difficult. And so we turned the limitation into our

0:21:54.440 --> 0:21:58.600
<v Speaker 2>sort of main attribute. And I would say that again,

0:21:58.720 --> 0:22:01.719
<v Speaker 2>that works for that story. But I don't think that

0:22:01.840 --> 0:22:06.239
<v Speaker 2>all stories are going to find this valuable. And I

0:22:06.240 --> 0:22:11.280
<v Speaker 2>also don't think every single shot needs to come from Sora.

0:22:11.600 --> 0:22:14.720
<v Speaker 3>I think that there's a world where it can be.

0:22:14.800 --> 0:22:18.399
<v Speaker 2>An addition, or it can be the start of a

0:22:18.480 --> 0:22:21.920
<v Speaker 2>story where instead of just brainstorming and just having a script,

0:22:22.400 --> 0:22:26.119
<v Speaker 2>you make a sort of moving mood board or a

0:22:26.200 --> 0:22:30.320
<v Speaker 2>trailer or so. I think that there's like tons of

0:22:30.400 --> 0:22:35.919
<v Speaker 2>stages along the pipeline that it would be extremely valuable

0:22:36.200 --> 0:22:41.400
<v Speaker 2>and help elucidate concepts and bring them to life.

0:22:41.680 --> 0:22:46.720
<v Speaker 1>So thematic question, so you avoided filming locations and all

0:22:46.720 --> 0:22:49.160
<v Speaker 1>of this, but you spend a lot of time writing

0:22:49.200 --> 0:22:53.360
<v Speaker 1>prompts and you're waiting for Sora to generate clips, then

0:22:53.440 --> 0:22:56.000
<v Speaker 1>upscaling and all that. Do you think you could

0:22:56.000 --> 0:22:58.960
<v Speaker 1>make Airhead, assuming you could get around the balloon head thing?

0:22:59.320 --> 0:23:02.480
<v Speaker 1>Do you think you could make it quicker in real life?

0:23:02.640 --> 0:23:05.000
<v Speaker 1>Or was Sora kind of essential to get it done

0:23:05.040 --> 0:23:06.480
<v Speaker 1>in the timeline you did, because it's like a week

0:23:06.520 --> 0:23:07.960
<v Speaker 1>and a half two weeks, I.

0:23:07.920 --> 0:23:13.600
<v Speaker 2>Think, Yeah, I don't know, that's an interesting question. I mean,

0:23:13.600 --> 0:23:15.879
<v Speaker 2>we definitely wouldn't be able to fly around the world

0:23:16.240 --> 0:23:20.560
<v Speaker 2>and yes, get the shots at the car race and

0:23:20.640 --> 0:23:22.000
<v Speaker 2>all of those things, so.

0:23:23.560 --> 0:23:26.199
<v Speaker 3>I think it would probably be shorter.

0:23:26.440 --> 0:23:30.840
<v Speaker 2>But I think in general, the conversations about like time

0:23:30.920 --> 0:23:35.240
<v Speaker 2>and money are like super reductive in a way in

0:23:35.280 --> 0:23:39.760
<v Speaker 2>that I think that without Sora, this wouldn't exist, And

0:23:40.040 --> 0:23:44.160
<v Speaker 2>I think that that is the more interesting conversation. As

0:23:44.880 --> 0:23:48.879
<v Speaker 2>a director, most directors I know have a folder of

0:23:50.359 --> 0:23:53.959
<v Speaker 2>unrealized ideas, and I think that my hope is that

0:23:54.119 --> 0:23:58.160
<v Speaker 2>Sora will allow us to dust off those folders and

0:23:59.359 --> 0:24:02.320
<v Speaker 2>breathe new life into concepts, and when people see

0:24:02.640 --> 0:24:07.080
<v Speaker 2>what those concepts could be, my hope is that it

0:24:07.640 --> 0:24:13.280
<v Speaker 2>gives a lot more people opportunities to have their ideas illuminated.

0:24:13.720 --> 0:24:16.520
<v Speaker 2>And whether that means to go and shoot it now

0:24:16.560 --> 0:24:20.760
<v Speaker 2>traditionally or some hybrid. I think that that, to me

0:24:20.960 --> 0:24:22.600
<v Speaker 2>is what's most exciting.

0:24:23.960 --> 0:24:26.960
<v Speaker 1>So where do you see Sora going? I know you're

0:24:27.000 --> 0:24:29.400
<v Speaker 1>considering looking at it as kind of a complementary tool,

0:24:30.119 --> 0:24:31.840
<v Speaker 1>but do you think that that's its use case or

0:24:31.880 --> 0:24:34.080
<v Speaker 1>do you think it'll ever do end to end filmmaking?

0:24:35.440 --> 0:24:40.000
<v Speaker 2>I think I think let a thousand flowers bloom, you know.

0:24:40.119 --> 0:24:43.919
<v Speaker 2>I think that there is people who are going to

0:24:44.160 --> 0:24:48.240
<v Speaker 2>just use it for small complementary things to maybe help

0:24:48.320 --> 0:24:50.920
<v Speaker 2>with in the same way we use stock footage.

0:24:50.960 --> 0:24:51.160
<v Speaker 3>Now.

0:24:51.760 --> 0:24:57.560
<v Speaker 2>I think some people are going to use it as

0:24:57.600 --> 0:25:00.679
<v Speaker 2>a way, say you are from a community that

0:25:01.440 --> 0:25:05.720
<v Speaker 2>has maybe a little bit of a less established film community,

0:25:05.760 --> 0:25:09.159
<v Speaker 2>and it's a way to have you compete with the

0:25:09.160 --> 0:25:13.200
<v Speaker 2>big boys in terms of special effects and usage. And again,

0:25:13.280 --> 0:25:16.000
<v Speaker 2>I don't just think it's as easy as bleep blue

0:25:16.000 --> 0:25:19.719
<v Speaker 2>block type in the prompt here comes the thing, but

0:25:19.840 --> 0:25:23.480
<v Speaker 2>rather it allows you to just have a really powerful

0:25:23.640 --> 0:25:28.399
<v Speaker 2>collaborator that you can help make maybe larger concepts and

0:25:28.440 --> 0:25:31.240
<v Speaker 2>bigger ideas. And then yeah, I think that there's some

0:25:31.280 --> 0:25:33.560
<v Speaker 2>people end to end who are going to make things

0:25:33.640 --> 0:25:40.399
<v Speaker 2>that are completely generated or most of the shots in

0:25:40.440 --> 0:25:46.280
<v Speaker 2>it are generated or things like that. In general, the

0:25:46.359 --> 0:25:50.919
<v Speaker 2>thing that feels interesting to me is like helping to

0:25:51.280 --> 0:25:58.199
<v Speaker 2>deepen humanity, Whereas the more you sort of simplify the process,

0:25:58.280 --> 0:26:01.960
<v Speaker 2>I think that that is like, I don't know, it's

0:26:02.040 --> 0:26:07.160
<v Speaker 2>never a simple process. So anytime you hear about something

0:26:07.200 --> 0:26:09.320
<v Speaker 2>that is going to make it all easy and make

0:26:09.359 --> 0:26:11.680
<v Speaker 2>all your troubles go away, I'd be very wary of that.

0:26:11.840 --> 0:26:12.960
<v Speaker 3>I think film is.

0:26:12.880 --> 0:26:17.919
<v Speaker 2>Going to always be difficult and a challenge, and I

0:26:18.040 --> 0:26:24.840
<v Speaker 2>think the benefit of Sora will be to help lead

0:26:24.920 --> 0:26:27.720
<v Speaker 2>us into new paths and lead us into new directions.

0:26:27.760 --> 0:26:30.240
<v Speaker 2>If I were to tell you, hey, we made this

0:26:30.320 --> 0:26:33.160
<v Speaker 2>film called Lord of the Rings and it uses CGI

0:26:33.400 --> 0:26:37.479
<v Speaker 2>orcs and it makes massive orc fights. You know, if

0:26:37.520 --> 0:26:40.720
<v Speaker 2>I told you that in the nineteen thirties, you'd probably gasp.

0:26:41.800 --> 0:26:44.080
<v Speaker 2>Or if I told you that CGI is going to

0:26:44.080 --> 0:26:46.280
<v Speaker 2>be a predominant way in which we make films in

0:26:46.320 --> 0:26:48.920
<v Speaker 2>twenty twenty four, I think you would go, ah, that's

0:26:48.960 --> 0:26:50.000
<v Speaker 2>not real filmmaking.

0:26:50.640 --> 0:26:53.160
<v Speaker 1>And I don't think I think you kind of saw

0:26:53.160 --> 0:26:54.119
<v Speaker 1>that in the nineties.

0:26:54.320 --> 0:26:59.399
<v Speaker 2>Really yeah, I don't think history is too kind to

0:26:59.440 --> 0:27:03.040
<v Speaker 2>those people that go, this is not gonna work. This

0:27:03.160 --> 0:27:06.040
<v Speaker 2>is not art. This technology is not the way. I

0:27:06.160 --> 0:27:09.600
<v Speaker 2>just think it's it depends on the artist, and it

0:27:09.600 --> 0:27:11.440
<v Speaker 2>depends what they want to bring to it. I think

0:27:11.480 --> 0:27:14.160
<v Speaker 2>that's the key X factor here.

0:27:15.160 --> 0:27:18.560
<v Speaker 1>One final question, with that all in mind, do you

0:27:18.600 --> 0:27:20.840
<v Speaker 1>think that Sora is going to hurt filmmakers? Do you

0:27:20.880 --> 0:27:22.200
<v Speaker 1>think it's going to replace people?

0:27:23.440 --> 0:27:27.080
<v Speaker 2>I mean, I hope not. I mean that's my job,

0:27:27.320 --> 0:27:29.119
<v Speaker 2>so I would very hope not.

0:27:31.600 --> 0:27:33.919
<v Speaker 1>No. I very much.

0:27:33.840 --> 0:27:42.280
<v Speaker 2>Understand people's fears, and I think that you know, I'm

0:27:42.280 --> 0:27:45.119
<v Speaker 2>a student of history, so when I look back in

0:27:45.320 --> 0:27:52.399
<v Speaker 2>history and the camera obscura comes out, painters are talking

0:27:52.440 --> 0:27:55.159
<v Speaker 2>about how we aren't going to need painters anymore, because

0:27:55.200 --> 0:27:58.919
<v Speaker 2>now we can capture reality, why do you need a

0:27:58.920 --> 0:28:02.000
<v Speaker 2>painter to go and paint it? And it's a very

0:28:02.080 --> 0:28:06.360
<v Speaker 2>valid point, But painters didn't go away. And then there

0:28:06.440 --> 0:28:10.600
<v Speaker 2>was this whole new industry called photography, and then after photography,

0:28:10.600 --> 0:28:13.400
<v Speaker 2>there was this whole new industry called film. And then

0:28:13.440 --> 0:28:16.720
<v Speaker 2>after film, there was this whole new industry called home video.

0:28:17.240 --> 0:28:19.280
<v Speaker 2>And then after home video, there was this whole new

0:28:19.280 --> 0:28:22.760
<v Speaker 2>industry called cell phone video. And then there was this

0:28:22.800 --> 0:28:26.399
<v Speaker 2>whole new industry called tiktoks and vines, and I just

0:28:26.440 --> 0:28:33.320
<v Speaker 2>think that when people don't come in contact with things

0:28:33.359 --> 0:28:38.280
<v Speaker 2>their immediate... As humans, our immediate reaction is fear, and

0:28:39.640 --> 0:28:43.960
<v Speaker 2>we're worried about things that are new because we do

0:28:44.040 --> 0:28:48.160
<v Speaker 2>not yet understand them. And I think that for us,

0:28:49.560 --> 0:28:51.920
<v Speaker 2>we like to face those things face on. And I

0:28:52.040 --> 0:28:55.640
<v Speaker 2>think that the other side of that coin is that

0:28:56.000 --> 0:29:00.680
<v Speaker 2>there's some kid right now in rural Bangladesh who has

0:29:00.760 --> 0:29:04.320
<v Speaker 2>this amazing, big idea and maybe doesn't have all the

0:29:04.360 --> 0:29:08.400
<v Speaker 2>resources that everyone else has, and with these types of technologies,

0:29:08.840 --> 0:29:11.719
<v Speaker 2>it may level the playing field for kids like that

0:29:11.960 --> 0:29:15.320
<v Speaker 2>to compete with the Avatars of the world, compete with

0:29:15.360 --> 0:29:18.600
<v Speaker 2>the Marvels of the world, And then I think we're

0:29:18.600 --> 0:29:20.640
<v Speaker 2>going to all be on this level playing field, and

0:29:20.680 --> 0:29:23.280
<v Speaker 2>what's going to matter is not just who has the

0:29:23.360 --> 0:29:26.960
<v Speaker 2>highest budgets and who has the most resources, but who

0:29:27.000 --> 0:29:32.080
<v Speaker 2>has the best stories. And for me, that's the exciting part.

0:29:33.840 --> 0:29:37.560
<v Speaker 2>We work with groups of collaborators that we love and respect,

0:29:37.640 --> 0:29:42.320
<v Speaker 2>and our hope is never let's work with them less.

0:29:42.440 --> 0:29:47.160
<v Speaker 2>Our hope is always let's enrich those relationships and hopefully

0:29:47.200 --> 0:29:51.360
<v Speaker 2>grow them and hopefully bring more people into our collective

0:29:51.760 --> 0:29:55.240
<v Speaker 2>and more people into our process. So that's our hope.

0:29:55.560 --> 0:30:00.360
<v Speaker 2>Maybe I'm utopic, maybe I'm wrong, but that's the that's

0:30:00.400 --> 0:30:03.360
<v Speaker 2>the choice, that that's the way we're choosing to look

0:30:03.360 --> 0:30:03.640
<v Speaker 2>at this.

0:30:17.120 --> 0:30:20.840
<v Speaker 1>In Woodman's mind, Sora is a tool, an extension of

0:30:20.880 --> 0:30:25.280
<v Speaker 1>creatives' methods rather than a replacement of filmographers or actors,

0:30:25.320 --> 0:30:27.880
<v Speaker 1>what have you. And that very much lines up with

0:30:28.000 --> 0:30:31.560
<v Speaker 1>Sam Altman and OpenAI's sales pitch for Sora. His

0:30:31.720 --> 0:30:36.080
<v Speaker 1>utopian perspective, his words, not mine, is predicated on both

0:30:36.120 --> 0:30:39.880
<v Speaker 1>film studios acting with integrity, something they've proven to never do,

0:30:40.320 --> 0:30:43.840
<v Speaker 1>and OpenAI being able to make Sora a significantly

0:30:43.920 --> 0:30:47.640
<v Speaker 1>better tool, something that's going to require masses more training

0:30:47.760 --> 0:30:51.960
<v Speaker 1>data and compute than I think is actually in existence.

0:30:53.240 --> 0:30:56.600
<v Speaker 1>Paul Trillo, an LA based artist and filmmaker, speaking to

0:30:56.680 --> 0:31:00.400
<v Speaker 1>Business Insider in April, described Sora as a research project

0:31:00.400 --> 0:31:03.520
<v Speaker 1>in Alpha, mentioning that it was a little confusing who

0:31:03.520 --> 0:31:06.640
<v Speaker 1>the market was for the service, and I think that

0:31:06.760 --> 0:31:11.120
<v Speaker 1>gels with another problem that Woodman raised, that what might

0:31:11.480 --> 0:31:16.240
<v Speaker 1>be a zoom out shot for you would be a

0:31:16.280 --> 0:31:19.840
<v Speaker 1>completely different term for someone else, which in turn would

0:31:19.920 --> 0:31:22.600
<v Speaker 1>require OpenAI to have both the right training data

0:31:22.760 --> 0:31:25.040
<v Speaker 1>of a zoom shot and many, many, many of them,

0:31:25.040 --> 0:31:28.880
<v Speaker 1>to be clear, but they need to know the multitudes

0:31:28.920 --> 0:31:33.800
<v Speaker 1>of different terminologies that go into filmmaking. Now, if they

0:31:33.800 --> 0:31:38.000
<v Speaker 1>don't give a shit, maybe that's a completely different story.

0:31:38.080 --> 0:31:42.040
<v Speaker 1>In short, Sora faces both the intractable problems of AI

0:31:42.120 --> 0:31:44.760
<v Speaker 1>that I've mentioned in the previous episode, please go and

0:31:44.760 --> 0:31:47.360
<v Speaker 1>listen to it, but also a few of its own,

0:31:48.280 --> 0:31:52.640
<v Speaker 1>namely that generating moving images isn't just about ingesting a

0:31:52.640 --> 0:31:56.360
<v Speaker 1>bunch of footage, but it's about understanding said footage well

0:31:56.440 --> 0:31:59.600
<v Speaker 1>enough to generate something else based on a multitude of

0:31:59.600 --> 0:32:06.240
<v Speaker 1>different perspectives, descriptions, and cultural contexts. I'm not sure that

0:32:06.360 --> 0:32:10.600
<v Speaker 1>OpenAI, or really most people, realize how complex even the

0:32:10.720 --> 0:32:15.840
<v Speaker 1>simplest movie is, how much work goes into making a film,

0:32:16.360 --> 0:32:18.600
<v Speaker 1>and I think that that's actually what excites people about this,

0:32:18.680 --> 0:32:22.080
<v Speaker 1>because making films can be inefficient, it can be extremely taxing,

0:32:22.240 --> 0:32:27.240
<v Speaker 1>it can be extremely expensive. But the problem here, I'll

0:32:27.240 --> 0:32:30.400
<v Speaker 1>get into the other ones as well, is that Sora

0:32:30.560 --> 0:32:33.720
<v Speaker 1>is being sold to film studios. That is who Sam

0:32:33.760 --> 0:32:37.360
<v Speaker 1>Altman is going to, and thus it's going to be

0:32:37.400 --> 0:32:39.760
<v Speaker 1>built for people who don't make movies. I'm actually really

0:32:39.840 --> 0:32:42.320
<v Speaker 1>happy to hear that Shy Kids and other artists are involved,

0:32:42.360 --> 0:32:46.200
<v Speaker 1>so it'll actually be tuned to be somewhat useful. But

0:32:46.280 --> 0:32:50.920
<v Speaker 1>I don't think people realize how gigantic the task is

0:32:50.960 --> 0:32:55.120
<v Speaker 1>that Sora is going after, and how I think it's

0:32:55.160 --> 0:32:59.200
<v Speaker 1>impossible it can go any further. But I digress. I

0:32:59.440 --> 0:33:03.120
<v Speaker 1>just don't believe that Sora actually works if you're making

0:33:03.160 --> 0:33:07.720
<v Speaker 1>a movie. While Pixar movies may take years to render,

0:33:07.880 --> 0:33:11.920
<v Speaker 1>they've got supercomputers and specialized hardware, and more importantly, the

0:33:11.960 --> 0:33:14.640
<v Speaker 1>ability to actually design and move characters in the three

0:33:14.720 --> 0:33:19.000
<v Speaker 1>D space. If you are putting something in Sora, what

0:33:19.040 --> 0:33:22.880
<v Speaker 1>are you designing? If you put a character in this

0:33:23.520 --> 0:33:27.640
<v Speaker 1>again, you cannot have consistency between these things. That

0:33:27.800 --> 0:33:31.440
<v Speaker 1>is a problem across all generative AI. You cannot

0:33:31.600 --> 0:33:35.600
<v Speaker 1>do that unless, of course, you're using copyrighted footage, mister Altman.

0:33:35.920 --> 0:33:40.720
<v Speaker 1>But seriously, though, with no consistency across shots, what the

0:33:40.760 --> 0:33:45.000
<v Speaker 1>hell are you doing? While there are unexpected things that

0:33:45.120 --> 0:33:47.560
<v Speaker 1>might happen in a three D animated movie or a

0:33:47.640 --> 0:33:51.600
<v Speaker 1>CGI situation, you still have complete control over the thing

0:33:51.680 --> 0:33:53.720
<v Speaker 1>you are putting on there, the thing you are animating.

0:33:53.760 --> 0:33:56.400
<v Speaker 1>You can make subtle tweaks to them. That doesn't seem

0:33:56.400 --> 0:33:59.880
<v Speaker 1>to be the case with Sora. You can adjust what's

0:34:00.160 --> 0:34:03.760
<v Speaker 1>on the screen. But even though this is AI generated,

0:34:04.040 --> 0:34:07.960
<v Speaker 1>it doesn't have the benefits of regular generative stuff like CGI,

0:34:08.600 --> 0:34:11.360
<v Speaker 1>which stands of course for a computer generated image. I believe,

0:34:11.400 --> 0:34:12.799
<v Speaker 1>and if I'm wrong, you're gonna yell at me in

0:34:12.800 --> 0:34:17.800
<v Speaker 1>the emails. But seriously, though, the practical use cases for Sora,

0:34:18.960 --> 0:34:24.279
<v Speaker 1>they're just kind of not there. Sora's attempts to replace filmmakers,

0:34:24.320 --> 0:34:26.560
<v Speaker 1>if that is OpenAI's goal, and I really believe

0:34:26.560 --> 0:34:31.200
<v Speaker 1>it is, they're dead on arrival because it's an impractical

0:34:31.200 --> 0:34:34.640
<v Speaker 1>and ineffective solution and the problems it's solving are really

0:34:34.719 --> 0:34:39.360
<v Speaker 1>only ones created by Hollywood executives. The AI hype bubble,

0:34:39.400 --> 0:34:43.320
<v Speaker 1>as I have noted repeatedly, is one entirely reliant on

0:34:43.480 --> 0:34:46.680
<v Speaker 1>us accepting the idea of what these companies will do,

0:34:47.040 --> 0:34:51.400
<v Speaker 1>rather than interrogating their ability to actually do it. Sora,

0:34:51.680 --> 0:34:55.120
<v Speaker 1>much like all generative AI, suffers from an imprecision and

0:34:55.160 --> 0:34:59.960
<v Speaker 1>an unreliability caused by hallucinations, an unavoidable result of your

0:35:00.080 --> 0:35:04.360
<v Speaker 1>using mathematics to generate things, and the massive power and

0:35:04.400 --> 0:35:08.759
<v Speaker 1>compute requirements are just prohibitively expensive. If this is going

0:35:08.840 --> 0:35:12.279
<v Speaker 1>to end up as a VFX tool, or a productivity tool,

0:35:12.520 --> 0:35:15.920
<v Speaker 1>or as a fill in tool. It's going to need

0:35:16.000 --> 0:35:18.360
<v Speaker 1>to be a lot cheaper than it is to run.

0:35:19.000 --> 0:35:24.319
<v Speaker 1>Generative AI is already unprofitable. To make Sora any kind

0:35:24.320 --> 0:35:26.799
<v Speaker 1>of useful, OpenAI will have to find a way

0:35:26.800 --> 0:35:30.600
<v Speaker 1>to dramatically increase the precision of the prompts, reduce hallucinations

0:35:30.600 --> 0:35:34.760
<v Speaker 1>to pretty much nothing, and vastly increase processing power across

0:35:34.800 --> 0:35:38.399
<v Speaker 1>the board. Sora hasn't even been launched save for, of course,

0:35:38.480 --> 0:35:41.520
<v Speaker 1>these handpicked companies that got to test it, meaning that

0:35:41.560 --> 0:35:45.000
<v Speaker 1>this ten to twenty minute wait between generations of moving

0:35:45.000 --> 0:35:48.760
<v Speaker 1>images is likely to increase once people use the product.

0:35:48.880 --> 0:35:51.799
<v Speaker 1>And that's before you consider how expensive it's going to

0:35:51.840 --> 0:35:54.840
<v Speaker 1>be to run the bloody thing. This is a significantly

0:35:54.920 --> 0:35:59.040
<v Speaker 1>more complex model than ChatGPT, which is already unprofitable.

0:36:00.080 --> 0:36:03.080
<v Speaker 1>Sam Altman can make money, but can he make profit?

0:36:03.760 --> 0:36:07.279
<v Speaker 1>I severely bloody doubt it. He hasn't before, and I

0:36:07.320 --> 0:36:09.840
<v Speaker 1>don't think he's going to in the future. He's still

0:36:10.040 --> 0:36:13.480
<v Speaker 1>begging Daddy Satya over at Microsoft to give him a

0:36:13.520 --> 0:36:17.360
<v Speaker 1>supercomputer so his things can fart out things more profitably.

0:36:17.400 --> 0:36:22.319
<v Speaker 1>It just drives me a little insane. And these things

0:36:22.360 --> 0:36:25.680
<v Speaker 1>I've talked about, they're intractable problems that OpenAI has

0:36:25.680 --> 0:36:29.040
<v Speaker 1>failed to solve. They've failed to make a more efficient

0:36:29.080 --> 0:36:31.359
<v Speaker 1>model for Microsoft last year in twenty twenty three, their

0:36:31.440 --> 0:36:35.879
<v Speaker 1>Arrakis model, Jesus Christ. And while GPT five is meant

0:36:35.920 --> 0:36:38.919
<v Speaker 1>to be materially better, to quote mister Altman, it isn't

0:36:38.960 --> 0:36:42.880
<v Speaker 1>obvious what better means when GPT four performs worse at

0:36:42.880 --> 0:36:46.840
<v Speaker 1>some tasks than its predecessor. I do believe Sam Altman

0:36:46.920 --> 0:36:48.600
<v Speaker 1>is telling the truth when he says that the future

0:36:48.640 --> 0:36:51.799
<v Speaker 1>of AI requires an energy breakthrough. But the thing I

0:36:51.800 --> 0:36:54.400
<v Speaker 1>think he's leaving out is that it may take an

0:36:54.480 --> 0:36:58.440
<v Speaker 1>energy breakthrough and indeed more chips for generative AI to

0:36:58.560 --> 0:37:03.160
<v Speaker 1>approach any level of usefulness. And he's hoping that people

0:37:03.200 --> 0:37:06.280
<v Speaker 1>will buy the hype without asking too many annoying questions

0:37:06.280 --> 0:37:09.600
<v Speaker 1>like what does this stuff actually do? Or is this useful?

0:37:09.840 --> 0:37:12.520
<v Speaker 1>Or does this actually help me? Or will this be

0:37:12.640 --> 0:37:16.160
<v Speaker 1>around in ten years? To be clear, Sam Altman is

0:37:16.200 --> 0:37:19.520
<v Speaker 1>the single most well connected and well funded man in AI,

0:37:19.880 --> 0:37:23.600
<v Speaker 1>with a direct connection to Microsoft, a multi trillion dollar

0:37:23.640 --> 0:37:27.760
<v Speaker 1>tech company, and a Rolodex that includes effectively every major founder

0:37:27.800 --> 0:37:31.000
<v Speaker 1>of the last decade, and he still can't get past

0:37:31.160 --> 0:37:34.440
<v Speaker 1>any of these problems, partly because he is not technical

0:37:34.600 --> 0:37:37.200
<v Speaker 1>and thus can't really solve the problems himself, and partly

0:37:37.200 --> 0:37:40.320
<v Speaker 1>because the problems he's facing are burdened by the laws

0:37:40.360 --> 0:37:45.720
<v Speaker 1>of maths and physics. Generative AI hallucinates because it doesn't

0:37:45.760 --> 0:37:49.000
<v Speaker 1>have a consciousness or any ability to learn or know anything.

0:37:50.080 --> 0:37:54.480
<v Speaker 1>It's extremely expensive because even the simplest prompts require GPT

0:37:54.560 --> 0:37:59.000
<v Speaker 1>four to run highly complex mathematical equations on graphics processing

0:37:59.120 --> 0:38:03.240
<v Speaker 1>units that cost upwards of ten thousand dollars apiece. Even

0:38:03.320 --> 0:38:06.520
<v Speaker 1>if generative AI were cheaper or more efficient or required

0:38:06.600 --> 0:38:10.200
<v Speaker 1>less power, it would still be a process that generates

0:38:10.239 --> 0:38:13.319
<v Speaker 1>answers based on the extremely complex process of ingesting an

0:38:13.360 --> 0:38:18.120
<v Speaker 1>increasingly dwindling amount of training data. These problems are significantly

0:38:18.160 --> 0:38:22.160
<v Speaker 1>compounded when you consider the complexity, size, and massive legal

0:38:22.239 --> 0:38:26.600
<v Speaker 1>ramifications of training a model on videos. A problem that

0:38:26.640 --> 0:38:30.279
<v Speaker 1>nobody has seen fit to push Altman or Murati or anyone else

0:38:30.280 --> 0:38:34.279
<v Speaker 1>at OpenAI about, which is a piss-take. It really seems like

0:38:34.280 --> 0:38:37.640
<v Speaker 1>an obvious one, like, hey man, you need a bunch

0:38:37.719 --> 0:38:41.200
<v Speaker 1>of training data to train ChatGPT, which does words.

0:38:41.360 --> 0:38:43.880
<v Speaker 1>How are you getting all these videos? Again, big credit

0:38:43.920 --> 0:38:47.839
<v Speaker 1>to Joanna Stern, who asked Mira Murati, CTO of OpenAI,

0:38:47.960 --> 0:38:52.080
<v Speaker 1>whether Sora was trained on YouTube videos, and then Mira Murati

0:38:52.120 --> 0:38:55.440
<v Speaker 1>of course made that incredible face. Go look up that video.

0:38:55.560 --> 0:38:58.840
<v Speaker 1>I'll link it in the notes. That's ultimately the

0:38:58.920 --> 0:39:02.640
<v Speaker 1>problem with the current bubble. So much of its success

0:39:02.680 --> 0:39:05.760
<v Speaker 1>requires us to tolerate and applaud these half-assed, half

0:39:05.800 --> 0:39:08.880
<v Speaker 1>finished tools that only sort of kind of do the

0:39:08.920 --> 0:39:10.880
<v Speaker 1>things they're meant to do, and we're meant to nod

0:39:10.960 --> 0:39:14.279
<v Speaker 1>and smile and clap and say great job, Sammy, like

0:39:14.320 --> 0:39:17.240
<v Speaker 1>we're talking to a bloody child rather than a startup

0:39:17.280 --> 0:39:20.200
<v Speaker 1>with thirteen billion dollars in funding with a CEO that

0:39:20.280 --> 0:39:24.080
<v Speaker 1>has the backing of goddamn Microsoft. And Sora is the

0:39:24.160 --> 0:39:29.120
<v Speaker 1>ugliest, messiest problem of them all. Its videos, while superficially impressive,

0:39:29.280 --> 0:39:32.439
<v Speaker 1>are still deeply, deeply flawed. They take way too long

0:39:32.440 --> 0:39:34.560
<v Speaker 1>to generate, a problem that's only going to get worse,

0:39:35.040 --> 0:39:38.320
<v Speaker 1>and they're just far too inconsistent, which is a problem

0:39:38.360 --> 0:39:42.000
<v Speaker 1>created by the nature of how generative AI works and

0:39:42.040 --> 0:39:47.759
<v Speaker 1>its approach to generating things using mathematics, and if it's

0:39:47.800 --> 0:39:49.960
<v Speaker 1>planning to be a VFX tool, if it's planning to

0:39:49.960 --> 0:39:54.040
<v Speaker 1>be a sidearm for filmographers, it's going to have to

0:39:54.080 --> 0:39:59.400
<v Speaker 1>be a lot cheaper than it's really practical to make it. Again,

0:40:00.160 --> 0:40:03.879
<v Speaker 1>nothing OpenAI makes is profitable. They may make over

0:40:03.920 --> 0:40:07.440
<v Speaker 1>a billion dollars of revenue, but everything is burning money.

0:40:08.480 --> 0:40:14.600
<v Speaker 1>It's just very frustrating. It's all very frustrating. Sora seems

0:40:14.719 --> 0:40:17.920
<v Speaker 1>kind of cool, but when you take away the cool

0:40:18.400 --> 0:40:20.440
<v Speaker 1>side and you just look at it for what it is,

0:40:20.920 --> 0:40:23.800
<v Speaker 1>it's just another con from Sam Altman. It's just another

0:40:24.120 --> 0:40:27.480
<v Speaker 1>unfinished product that is not able to fit the task.

0:40:28.440 --> 0:40:31.040
<v Speaker 1>It's just another thing that you look at and you say, oh,

0:40:31.120 --> 0:40:33.120
<v Speaker 1>if that was just a bit better, it'd be really good.

0:40:33.200 --> 0:40:36.560
<v Speaker 1>Except in this case it would be a lot better. Yeah,

0:40:36.840 --> 0:40:39.839
<v Speaker 1>all the press writes about it's incredible, it's amazing, and

0:40:40.480 --> 0:40:44.520
<v Speaker 1>you can separate the technological achievement of using maths to

0:40:44.560 --> 0:40:50.120
<v Speaker 1>generate a visual moving image that's genuinely cool. But you

0:40:50.280 --> 0:40:52.759
<v Speaker 1>gotta stop for a second and say, as cool as

0:40:52.800 --> 0:40:55.399
<v Speaker 1>this is, the people in the back of their shot,

0:40:55.440 --> 0:41:00.279
<v Speaker 1>they're molding into each other. It's like The Thing, it's disgusting. Hey,

0:41:00.360 --> 0:41:04.640
<v Speaker 1>that monkey's got like five arms. That's weird. I don't know.

0:41:04.800 --> 0:41:09.160
<v Speaker 1>I just feel like normal people don't get this much leniency.

0:41:10.080 --> 0:41:12.920
<v Speaker 1>You and I don't get people saying great job when

0:41:12.960 --> 0:41:15.640
<v Speaker 1>we do kind of a shitty job. And if we

0:41:15.840 --> 0:41:19.920
<v Speaker 1>brought something to someone that was insanely expensive, only really

0:41:19.920 --> 0:41:22.000
<v Speaker 1>did ten percent of the job you needed it to,

0:41:22.800 --> 0:41:26.680
<v Speaker 1>and also the things it created took forever and looked horrifying,

0:41:27.080 --> 0:41:29.799
<v Speaker 1>I don't think we'd get told great job. I think

0:41:29.840 --> 0:41:32.400
<v Speaker 1>we'd be told we'd wasted a lot of money and

0:41:32.440 --> 0:41:36.279
<v Speaker 1>that someone was quite mad at us. I'm tired of this.

0:41:36.960 --> 0:41:41.040
<v Speaker 1>I'm tired of these companies announcing these half completed products

0:41:41.200 --> 0:41:43.319
<v Speaker 1>and having the media dance around and act like they've

0:41:43.320 --> 0:41:47.439
<v Speaker 1>delivered something truly incredible. I'm tired of the public being

0:41:47.520 --> 0:41:50.759
<v Speaker 1>expected to do the mental and emotional labor for Sam

0:41:50.800 --> 0:41:53.960
<v Speaker 1>Altman and other AI companies, saying it's remarkable that they're

0:41:54.000 --> 0:41:56.919
<v Speaker 1>even able to do this, and assume and give them

0:41:56.960 --> 0:42:00.760
<v Speaker 1>credit for some inevitable future where all of these problems are gone,

0:42:00.880 --> 0:42:03.520
<v Speaker 1>despite little proof that such a thing is possible and

0:42:03.560 --> 0:42:08.080
<v Speaker 1>plenty of proof that it isn't. And as I've suggested,

0:42:08.560 --> 0:42:11.319
<v Speaker 1>I really don't think it is. I think Sora is

0:42:11.360 --> 0:42:15.799
<v Speaker 1>dead on arrival. I think it's too expensive, too imprecise,

0:42:15.840 --> 0:42:19.400
<v Speaker 1>and there is no fixing those problems. You can iterate

0:42:19.440 --> 0:42:22.120
<v Speaker 1>on them, you can improve them, but without some kind

0:42:22.120 --> 0:42:24.799
<v Speaker 1>of energy or chips breakthrough, they're not even going to

0:42:24.840 --> 0:42:28.640
<v Speaker 1>have the compute or really the money to build this

0:42:28.680 --> 0:42:32.440
<v Speaker 1>thing into anything even half functional. And I'm calling on

0:42:32.480 --> 0:42:35.440
<v Speaker 1>the press to push back on these companies. I'm calling

0:42:35.440 --> 0:42:39.719
<v Speaker 1>on them to refuse to declare this quasi-functional software

0:42:40.239 --> 0:42:46.080
<v Speaker 1>as complete. I'm tired of seeing the media back these

0:42:46.120 --> 0:42:50.320
<v Speaker 1>companies and do marketing work for them when they're not done.

0:42:50.719 --> 0:42:55.120
<v Speaker 1>They don't deserve the credit, and I'm demanding that people

0:42:55.160 --> 0:42:59.520
<v Speaker 1>like Sam Altman actually change the world before anyone says

0:42:59.520 --> 0:43:00.200
<v Speaker 1>that they're doing

0:43:00.280 --> 0:43:00.320
<v Speaker 1>so.

0:43:08.600 --> 0:43:11.040
<v Speaker 1>Thank you for listening to Better Offline. The editor and

0:43:11.040 --> 0:43:14.239
<v Speaker 1>composer of the Better Offline theme song is Mattosowski. You

0:43:14.239 --> 0:43:16.480
<v Speaker 1>can check out more of his music and audio projects

0:43:16.640 --> 0:43:20.160
<v Speaker 1>at mattosowski dot com, M A T T O S

0:43:20.200 --> 0:43:24.279
<v Speaker 1>O W S K I dot com. You can email me

0:43:24.320 --> 0:43:26.919
<v Speaker 1>at ez at betteroffline dot com, or visit better

0:43:26.960 --> 0:43:29.399
<v Speaker 1>offline dot com to find more podcast links and, of course,

0:43:29.440 --> 0:43:32.560
<v Speaker 1>my newsletter. I also really recommend you go to chat

0:43:32.600 --> 0:43:35.239
<v Speaker 1>dot wheresyoured dot at to visit the Discord, and

0:43:35.280 --> 0:43:37.960
<v Speaker 1>go to r slash Better Offline to check out our Reddit.

0:43:38.760 --> 0:43:42.000
<v Speaker 1>Thank you so much for listening. Better Offline is a

0:43:42.040 --> 0:43:45.120
<v Speaker 1>production of Cool Zone Media. For more from Cool Zone Media,

0:43:45.239 --> 0:43:48.400
<v Speaker 1>visit our website coolzonemedia dot com, or check us

0:43:48.400 --> 0:43:51.360
<v Speaker 1>out on the iHeartRadio app, Apple Podcasts, or wherever you

0:43:51.440 --> 0:44:12.800
<v Speaker 1>get your podcasts.