WEBVTT - OpenAI's Video Generating AI Is Dead On Arrival 0:00:02.520 --> 0:00:07.880 Cool Zone Media. Hello and welcome to Better Offline. As usual, 0:00:07.960 --> 0:00:21.600 I'm your host, Ed Zitron. A few months ago, 0:00:21.640 --> 0:00:24.800 OpenAI showed off Sora, a product that can generate videos 0:00:24.800 --> 0:00:27.120 based on a short text prompt, kind of like 0:00:27.240 --> 0:00:31.800 ChatGPT does for text or DALL-E does for images. These videos, 0:00:31.840 --> 0:00:34.680 which are usually no more than sixty seconds long, can 0:00:34.760 --> 0:00:38.040 at times seem impressive until you notice a little detail 0:00:38.040 --> 0:00:40.640 that breaks the entire facade, like in a video where 0:00:40.640 --> 0:00:42.559 a cat wakes up its owner, but the owner's arm 0:00:42.600 --> 0:00:44.760 appears to be part of the cushion and the cat's paw 0:00:44.880 --> 0:00:48.360 explodes out of its arm like an amoeba. Reactions to 0:00:48.479 --> 0:00:51.440 Sora's AI-generated videos, and indeed the existence of the 0:00:51.440 --> 0:00:54.840 model itself, have ranged from kind of a breathless hype 0:00:54.920 --> 0:00:57.840 to genuine fear that this will be used to replace 0:00:57.920 --> 0:01:01.640 video producers, in that it can create reality-adjacent videos 0:01:01.680 --> 0:01:04.880 that for a few seconds kind of seem real, especially 0:01:04.959 --> 0:01:07.280 in the case of some of OpenAI's hand-picked 0:01:07.360 --> 0:01:12.200 demo videos. Yet even in these hand-picked Sora outputs, you'll 0:01:12.240 --> 0:01:15.800 find these weird little things that immediately shatter the illusion, 0:01:16.160 --> 0:01:19.119 like one where a woman's legs awkwardly shuffle, then somehow 0:01:19.160 --> 0:01:21.959 switch sides as she walks around, or blobs of people 0:01:22.000 --> 0:01:25.240 merging in the background of images. 
These are, on some 0:01:25.520 --> 0:01:31.280 level, genuinely remarkable technological achievements, until you consider what 0:01:31.440 --> 0:01:35.560 they are and what they might do, and that there 0:01:35.600 --> 0:01:39.120 are problems in them that run through the entire fabric 0:01:39.200 --> 0:01:43.319 of artificial intelligence. A little over a month after Sora 0:01:43.520 --> 0:01:46.440 was announced, OpenAI would debut a series of short films, 0:01:46.480 --> 0:01:49.760 including one called Airhead, where filmmakers Shy Kids told the 0:01:49.760 --> 0:01:51.360 story of a man with a balloon for a head, 0:01:51.800 --> 0:01:55.120 and because this is AI, said balloon changes sizes twenty three, 0:01:55.280 --> 0:01:58.240 twenty four, twenty six, twenty seven, twenty nine, thirty two, 0:01:58.280 --> 0:02:01.240 thirty four, thirty nine, forty one, forty two, forty three, 0:02:01.240 --> 0:02:03.960 and forty five seconds into the piece, at which point 0:02:04.000 --> 0:02:06.120 I stopped counting because it got boring and I really 0:02:06.120 --> 0:02:08.280 don't want to be mean to Shy Kids, as this 0:02:08.600 --> 0:02:12.840 really isn't their fault. The very nature of filmmaking is 0:02:12.840 --> 0:02:15.560 that you take different shots of the same thing, something 0:02:15.639 --> 0:02:19.200 that I anticipated Sora was incapable of doing, as each 0:02:19.280 --> 0:02:22.480 shot is generated fresh, as Sora itself, much like all 0:02:22.600 --> 0:02:27.360 generative AI, does not actually know anything. When one asks 0:02:27.560 --> 0:02:29.880 for a man with a yellow balloon as his head, 
0:02:30.160 --> 0:02:32.960 Sora must then look at the parameters spawned during its 0:02:33.000 --> 0:02:36.040 training process and create an output guessing what a man 0:02:36.080 --> 0:02:38.360 looks like, what a balloon looks like, what a man's 0:02:38.480 --> 0:02:41.760 features are on his body, what color yellow is, what 0:02:41.760 --> 0:02:45.680 the man's doing, and so on and so forth. This 0:02:45.800 --> 0:02:49.680 becomes extremely problematic when you're working in film or television, 0:02:49.720 --> 0:02:52.120 where viewers are far more likely to see when something 0:02:52.240 --> 0:02:55.680 just doesn't look right, a problem exacerbated by moving images, 0:02:55.840 --> 0:02:59.800 high resolution footage, and big television screens which are now ubiquitous. 0:03:00.680 --> 0:03:05.400 Yet the press, as usual, credulously accepted Sora's, quote, stunning 0:03:05.520 --> 0:03:09.360 videos that were amazing and scary, suggesting to the public 0:03:09.400 --> 0:03:11.280 that we were on the verge of some sort of 0:03:11.440 --> 0:03:16.760 artificial intelligence takeover of the film industry, helping buoy Sam Altman, 0:03:16.880 --> 0:03:20.440 their CEO, and his dumbass attempts to convince Hollywood that 0:03:20.600 --> 0:03:25.440 Sora won't destroy the movie business. These stories only serve 0:03:25.520 --> 0:03:28.680 to help Sam Altman, who desperately needs you to believe 0:03:28.680 --> 0:03:31.480 that Hollywood is scared of Sora and even more scared 0:03:31.480 --> 0:03:34.120 of generative AI, because the more you talk about fear 0:03:34.160 --> 0:03:36.680 and lost jobs and the machines taking over, the less 0:03:36.720 --> 0:03:40.360 you ask a very, very simple question: does any of 0:03:40.360 --> 0:03:45.160 this shit actually work? The answer, it turns out, is 0:03:45.200 --> 0:03:48.560 not very well. 
In a piece for fxguide, Mike 0:03:48.600 --> 0:03:51.160 Seymour sat down with Shy Kids, the people behind Airhead, 0:03:51.320 --> 0:03:54.720 and revealed how Sora is in many ways a little 0:03:54.720 --> 0:03:58.680 bit useless for making films. Sora takes ten to twenty 0:03:58.720 --> 0:04:01.560 minutes to generate a single three to twenty second shot, 0:04:02.000 --> 0:04:04.400 something that isn't really a problem until you realize that 0:04:04.520 --> 0:04:07.520 until the shot is rendered, you really have absolutely no 0:04:07.600 --> 0:04:10.600 idea what the hell it's going to spit out. Sora 0:04:10.880 --> 0:04:13.480 has no mechanism to connect one shot to another, even 0:04:13.480 --> 0:04:17.640 with hyper-descriptive prompts. It hallucinates extra features when you haven't 0:04:17.680 --> 0:04:20.360 asked for them. And Shy Kids were shocked by how 0:04:20.400 --> 0:04:23.680 surprised OpenAI's researchers were when they requested the ability 0:04:23.680 --> 0:04:27.080 to use a prompt to request a particular angle in 0:04:27.120 --> 0:04:31.520 a shot, a feature that was initially unavailable. It took... 0:04:32.200 --> 0:04:35.200 This is what kind of drives me crazy here, and 0:04:35.240 --> 0:04:38.279 you'll hear this in the interview with him later. These 0:04:38.320 --> 0:04:40.960 people that are OpenAI people, and they were making 0:04:40.960 --> 0:04:44.560 this tool for making visual images, for making moving images, 0:04:44.600 --> 0:04:47.320 they didn't think that people might want different shots. I'm 0:04:47.320 --> 0:04:49.480 so glad these are the people who were in control 0:04:49.520 --> 0:04:53.159 of the future. Anyway, to quote the piece, it took 0:04:53.320 --> 0:04:56.400 hundreds of generations at ten to twenty seconds apiece 0:04:56.440 --> 0:05:00.720 to make a minute and nineteen second long film. And 0:05:00.760 --> 0:05:05.679 what's really fun about this is that the movie's fine. 
0:05:05.800 --> 0:05:09.200 I, it was kind of fine. I just, I have 0:05:09.320 --> 0:05:11.280 nothing really to say about it. It's a minute and 0:05:11.360 --> 0:05:15.720 twenty seconds long, but it's, it kind of works. But also, 0:05:15.960 --> 0:05:18.799 the balloon looks different in every other shot. This isn't 0:05:18.880 --> 0:05:23.279 Shy Kids's fault. But also this isn't gonna get better. 0:05:23.480 --> 0:05:26.520 And I will get into why as we go along. 0:05:28.080 --> 0:05:31.479 These tiny little problems I've mentioned, though, they all lead 0:05:31.520 --> 0:05:35.359 to one overwhelming issue: that Sora isn't so much a 0:05:35.440 --> 0:05:37.800 tool to make movies as it is a big, fat 0:05:37.839 --> 0:05:40.360 slot machine that spits out footage that may or may 0:05:40.400 --> 0:05:43.440 not be of any use at all. Almost all of 0:05:43.440 --> 0:05:47.360 the footage in Airhead was graded, treated, stabilized, and upscaled, 0:05:48.000 --> 0:05:50.800 and that ten to twenty second lead time on generations 0:05:50.920 --> 0:05:54.520 was for 480p resolution footage, meaning that 0:05:54.600 --> 0:05:58.200 even useful footage needed significant post-production work to look 0:05:58.200 --> 0:06:00.680 good enough. And just to give you an idea, for 0:06:00.760 --> 0:06:02.840 the non-technical members of the audience, and this is fair: 0:06:03.839 --> 0:06:06.599 the video you see on YouTube is usually somewhere between 0:06:06.600 --> 0:06:09.920 720p, 1080p or 4K. The TV shows 0:06:09.960 --> 0:06:13.880 you watch are usually 1080p, 4K or upscaled 1080p. 0:06:14.120 --> 0:06:16.359 These are all lots of numbers. What I'm saying is 0:06:16.839 --> 0:06:20.440 the stuff that Sora spits out, that takes burning a 0:06:20.440 --> 0:06:24.680 small zoo to spit out, is incredibly low resolution. 
On 0:06:24.760 --> 0:06:29.599 top of not being specific. Look, to put it as 0:06:29.640 --> 0:06:34.119 plainly as possible, every single time that Shy Kids wanted 0:06:34.120 --> 0:06:37.400 to generate a shot, even a three second long shot, 0:06:37.600 --> 0:06:40.440 they would give Sora a text prompt and then they 0:06:40.440 --> 0:06:44.040 would wait at least ten minutes to find out if 0:06:44.080 --> 0:06:47.640 it was right, and they'd have to accept footage that 0:06:47.800 --> 0:06:52.000 was subprime or inaccurate. And there's a really good example 0:06:52.040 --> 0:06:54.479 of this. If you watch Airhead, a lot of the 0:06:54.520 --> 0:06:57.240 shots are in slow motion, and you may think, no, 0:06:57.400 --> 0:07:00.040 this is a cinematic choice, right, because you're kind of 0:07:00.160 --> 0:07:02.200 just admiring this man with a balloon for a head 0:07:02.240 --> 0:07:05.880 going about his business. No, no, no, no, no. They 0:07:06.000 --> 0:07:08.440 found that this was just what Sora wanted to give 0:07:08.480 --> 0:07:10.880 them when they asked for it. This was, in and 0:07:10.920 --> 0:07:14.520 of itself, a hallucination. In the same way that 0:07:14.600 --> 0:07:18.560 ChatGPT will authoritatively tell you that something is true that 0:07:18.720 --> 0:07:22.040 is not, Sora will spit out a man running in 0:07:22.080 --> 0:07:27.960 slow motion despite you not asking for that. And it's 0:07:27.960 --> 0:07:31.040 so weird. They had to, to quote them, do quite a 0:07:31.080 --> 0:07:33.880 bit of adjusting to keep the whole thing from feeling 0:07:34.520 --> 0:07:37.920 like a big slow-mo project, and it still kind 0:07:37.920 --> 0:07:43.680 of does. And that's rough. That's really rough. 
But you know, 0:07:43.800 --> 0:07:46.920 I'm a curious little critter, so I decided to sit 0:07:47.000 --> 0:07:49.640 down with Shy Kids's Walter Woodman to talk about his 0:07:49.680 --> 0:07:52.120 experience with Sora and have him delve a little deeper 0:07:52.120 --> 0:07:55.040 into his experience with the product. And I'd say he 0:07:55.120 --> 0:07:59.000 had a far more utopian experience and perspective on the 0:07:59.040 --> 0:08:03.560 whole thing than I expected. Now, some of you might 0:08:04.320 --> 0:08:07.320 critique Walter for being so positive about it, but I 0:08:07.320 --> 0:08:09.520 actually caution you to just listen to what he's saying, 0:08:10.040 --> 0:08:13.400 because Walter's perspective is interesting. He sees this as a tool, 0:08:13.440 --> 0:08:15.680 he doesn't see it as a replacement, and I think 0:08:15.680 --> 0:08:18.320 it's a valid perspective to come at Sora with. I 0:08:18.360 --> 0:08:21.560 also think it's a perspective that kind of accepts a 0:08:21.640 --> 0:08:25.440 conceit of OpenAI's marketing strategy, that these things will 0:08:25.480 --> 0:08:30.520 get better. If they do, perhaps Walter is right, perhaps 0:08:30.560 --> 0:08:33.600 this will be an essential tool in filmmaking, even though 0:08:33.600 --> 0:08:35.440 he didn't say essential. Don't want to put words in 0:08:35.440 --> 0:08:39.240 the man's mouth. But I don't think that's the case. 0:08:40.320 --> 0:08:54.319 Let me talk to him. You decide for yourself. All right. 0:08:54.440 --> 0:08:57.960 So how did the relationship between Shy Kids and 0:08:58.000 --> 0:08:58.920 OpenAI actually begin? 
0:09:00.160 --> 0:09:03.840 The relationship between Shy Kids and OpenAI began when 0:09:03.880 --> 0:09:08.079 we made an installation for a film called Dalíland, 0:09:08.240 --> 0:09:12.560 which was premiering at Toronto International Film Festival, and we 0:09:12.559 --> 0:09:15.480 were the only people that our friends at Pressman Film 0:09:15.600 --> 0:09:19.720 knew in Toronto, and so we made an installation that 0:09:19.840 --> 0:09:26.040 looked like Salvador Dalí's, like, studio inside of the basement 0:09:26.240 --> 0:09:29.679 of the Saint Regis, which is where he lived and 0:09:30.240 --> 0:09:33.880 made work out of. And inside of that installation we 0:09:34.600 --> 0:09:38.360 made a, like, you could make your own surrealist painting, 0:09:39.520 --> 0:09:41.800 and the way that you could make that was using 0:09:41.880 --> 0:09:48.160 DALL-E, the OpenAI program. And so the OpenAI 0:09:48.320 --> 0:09:53.840 people came to visit and check out the, like, what 0:09:53.920 --> 0:09:56.080 we were working on, and making sure that it was 0:09:56.160 --> 0:09:58.080 like something that they wanted to be a part of. 0:09:58.840 --> 0:09:59.840 And so 0:10:01.240 --> 0:10:05.440 they met our producer Sydney, who they loved. She's easy 0:10:05.480 --> 0:10:06.120 to love. 0:10:06.360 --> 0:10:07.840 And they... 0:10:09.120 --> 0:10:11.840 We sent them our previous work, and so from there 0:10:12.120 --> 0:10:16.520 they asked us to join this artist group. And then 0:10:16.720 --> 0:10:18.800 when Sora came out, we saw it at the same 0:10:18.840 --> 0:10:24.720 time as everyone else, and we, yeah, we got tapped 0:10:24.720 --> 0:10:27.720 on the shoulder and said, hey, would you like to 0:10:27.800 --> 0:10:29.400 check this out and try this out? And we said, 0:10:29.440 --> 0:10:32.360 of course. That's how it came to be. 0:10:33.280 --> 0:10:37.119 So how did you onboard? Were you just given access? 0:10:37.280 --> 0:10:39.959 Did they give you instructions? 
Did they physically come to you? 0:10:40.480 --> 0:10:44.199 What was that like? It was top secret. They 0:10:44.240 --> 0:10:48.720 gave us a briefcase and, in a cloudy room... 0:10:48.960 --> 0:10:49.720 No, it was... 0:10:50.840 --> 0:10:54.000 Yeah, there was a very simple onboarding process where they 0:10:54.080 --> 0:10:58.080 walked us through the technology as well as some of 0:10:58.120 --> 0:11:05.160 its features, and yeah, it was pretty... It was pretty... 0:11:05.400 --> 0:11:07.640 And then from there they gave us access to begin 0:11:08.280 --> 0:11:09.959 using it and making 0:11:09.600 --> 0:11:13.160 things. And you were allowed to use it without their presence? 0:11:13.200 --> 0:11:14.319 You had direct access? 0:11:14.360 --> 0:11:15.400 Yep, yep. 0:11:16.320 --> 0:11:20.240 So okay, did you get instructions on how to write 0:11:20.280 --> 0:11:23.479 effective prompts, or did you just kind of do trial 0:11:23.200 --> 0:11:25.439 and error? No, nothing like that. 0:11:25.600 --> 0:11:29.320 I mean, in the artist group itself, there's a lot 0:11:29.360 --> 0:11:33.440 of really amazing and thoughtful creative people who kind of 0:11:34.160 --> 0:11:37.040 show their work and show how they got to make 0:11:37.120 --> 0:11:43.480 the things that they did. But no, not, there was 0:11:43.600 --> 0:11:49.480 no real engineering of our prompts. They were very much 0:11:49.720 --> 0:11:55.360 just, play, kind of see, see what comes out of you. 0:11:55.360 --> 0:12:00.040 You're creative people that we trust. Why don't 0:11:59.880 --> 0:12:03.440 you just see what works? Throw spaghetti at the wall. 0:12:04.360 --> 0:12:07.800 That's cool. So during the, in the piece at 0:12:07.880 --> 0:12:11.240 fxguide, in the interview, someone from Shy Kids said 0:12:11.240 --> 0:12:14.839 the OpenAI researchers, they were surprised when they were 0:12:14.880 --> 0:12:20.400 asked about being able to say specific shots. What happened there? 
0:12:20.840 --> 0:12:23.120 Was it just that you tried to ask Sora to 0:12:23.120 --> 0:12:25.040 do specific shots and it didn't work, or was it 0:12:25.120 --> 0:12:26.040 just not a feature? 0:12:27.760 --> 0:12:30.520 I think that's maybe taken a little bit out of context. 0:12:30.840 --> 0:12:31.599 I think 0:12:32.880 --> 0:12:38.000 more so it's just people come from distant, different disciplines. 0:12:37.480 --> 0:12:39.079 And when 0:12:40.760 --> 0:12:43.760 I say a wide shot on a one hundred and 0:12:43.800 --> 0:12:50.160 thirty millimeter lens, people from my area of expertise know 0:12:50.400 --> 0:12:52.360 sort of immediately what I'm talking about, 0:12:52.400 --> 0:12:55.160 whereas the researchers, they are 0:12:56.200 --> 0:13:01.440 more invested in sort of other, other things. And so 0:13:02.320 --> 0:13:05.839 it's, it's not so much that they didn't understand or 0:13:05.920 --> 0:13:08.920 that sort of didn't understand. It's more so just there's 0:13:08.960 --> 0:13:11.280 all these terms in films, 0:13:10.720 --> 0:13:12.400 like a zolly or like a 0:13:12.520 --> 0:13:15.800 Hitchcock zoom or all of these different things that are 0:13:16.520 --> 0:13:19.320 very understandable, but even when you go from set to set, 0:13:19.360 --> 0:13:22.680 they mean something different. So I think it's about trying 0:13:22.800 --> 0:13:28.200 to create a lingua franca between all of these sort 0:13:28.240 --> 0:13:34.360 of different, very different people and very different ways of 0:13:34.480 --> 0:13:37.680 using a tool. What I may call a zoom, you 0:13:37.760 --> 0:13:40.360 may call a dolly shot, et cetera, et cetera. 0:13:40.480 --> 0:13:44.559 So, so that feels like a training data challenge. 0:13:44.760 --> 0:13:49.200 Yeah, I think it's about trying to figure out how 0:13:49.360 --> 0:13:53.520 and, yeah, exactly what to, what to train on. 0:13:54.600 --> 0:13:58.480 Yeah, so tell me what was the interface like? 
Was 0:13:58.480 --> 0:14:01.120 it a chat box? Did you have, like... Just 0:14:01.160 --> 0:14:02.679 tell me about what it actually looked like. 0:14:03.520 --> 0:14:07.839 Sure, there's limitations of what I can say about things 0:14:07.920 --> 0:14:13.480 like that, but I think the way that I've described 0:14:13.480 --> 0:14:16.679 it to people without giving too much away is, I 0:14:16.800 --> 0:14:21.040 think, if you're familiar with using something like the Adobe Suite, 0:14:21.480 --> 0:14:26.480 I think that there's some commonalities. Whether you're using After 0:14:26.520 --> 0:14:32.600 Effects or Premiere or whatever, Illustrator, there's like commonalities, and 0:14:32.640 --> 0:14:35.280 if you can use one, you can sort of flu's 0:14:35.320 --> 0:14:39.560 your way around the others. I would say it's very 0:14:39.600 --> 0:14:42.800 similar like that with 0:14:42.480 --> 0:14:46.200 OpenAI's tools and models, that if you are 0:14:47.200 --> 0:14:51.840 used to things like ChatGPT and DALL-E and those 0:14:51.880 --> 0:14:57.360 types of models, I think you will find it, find 0:14:57.400 --> 0:14:59.600 an ease of use in using Sora. 0:15:01.400 --> 0:15:04.320 So within that article they mentioned that there was like 0:15:04.320 --> 0:15:07.560 a three hundred to one shooting ratio, which, correct me 0:15:07.560 --> 0:15:09.800 if I'm wrong, means like three hundred seconds of material 0:15:10.560 --> 0:15:13.720 for each second of usable material. How does that compare to 0:15:14.320 --> 0:15:18.400 conventional filmmaking in your experience? 0:15:18.320 --> 0:15:20.920 It would be even more seconds than that, I would say, 0:15:21.160 --> 0:15:26.280 just three hundred shots at probably ten to twenty seconds apiece. 0:15:26.440 --> 0:15:30.080 So whatever the math is on that, I would say 0:15:30.080 --> 0:15:35.000 that that's pretty common with shooting. 
You know, when you 0:15:35.160 --> 0:15:40.040 are shooting a fiction film, or like even a documentary 0:15:40.120 --> 0:15:42.800 is even crazier for that, you shoot all day and 0:15:42.840 --> 0:15:47.760 all day, and from... We shot a documentary recently and 0:15:47.840 --> 0:15:50.280 I actually had to go back and watch all the dailies. 0:15:50.920 --> 0:15:54.560 We counted about ninety hours of footage that we had, 0:15:54.840 --> 0:15:57.920 and from that ninety hours, you're making an hour and 0:15:57.920 --> 0:15:59.800 a half movie. So, you 0:15:59.760 --> 0:16:02.360 know, you are really trimming things down. 0:16:02.440 --> 0:16:06.600 And I think also it's like you are getting the 0:16:06.720 --> 0:16:11.880 five seconds that work or the, you know, the section 0:16:12.200 --> 0:16:15.600 of that shot that works. And I would say that's 0:16:15.600 --> 0:16:17.200 pretty common to filmmaking. 0:16:19.240 --> 0:16:21.920 How about narrative filmmaking? Because I know documentary you have 0:16:21.960 --> 0:16:25.120 a lot of stuff, but I'm just wondering what the 0:16:25.160 --> 0:16:28.400 burden of selection is like compared to the amount of 0:16:28.400 --> 0:16:30.760 shots you take in just a regular movie or regular 0:16:30.840 --> 0:16:31.400 short film, 0:16:31.440 --> 0:16:34.160 even. Again, I would 0:16:33.920 --> 0:16:36.520 say, at least I can only speak for the way 0:16:36.560 --> 0:16:40.160 that I shoot films. You know, if you had... It's subjective. 0:16:40.400 --> 0:16:43.560 It's subjective for sure. If you're David Fincher, you're shooting 0:16:43.640 --> 0:16:47.120 eight hundred takes of like someone picking up a pencil, 0:16:47.320 --> 0:16:50.560 or Stanley Kubrick, you know, is like famous for a 0:16:50.680 --> 0:16:55.240 thousand takes. I would say that the burn rate was 0:16:55.320 --> 0:16:59.680 very similar. 
I would say that the challenges with Sora 0:17:00.480 --> 0:17:05.560 are, like, it's unbelievable at making these images that are 0:17:06.560 --> 0:17:09.800 unbelievable and so interesting to look at, but 0:17:11.480 --> 0:17:14.400 at its current state, it 0:17:14.480 --> 0:17:19.080 can sometimes be difficult to do things that in traditional 0:17:19.080 --> 0:17:21.880 shooting would be much easier, where you say, hey, can 0:17:21.680 --> 0:17:23.920 that guy go over here? 0:17:24.040 --> 0:17:26.199 Or can that person move from one side of the 0:17:26.200 --> 0:17:30.600 screen to the other? Things like that are more difficult. 0:17:30.600 --> 0:17:34.320 But again this is baby steps. We are in like 0:17:34.480 --> 0:17:37.919 the toddler phase, so I assume that those things will 0:17:37.960 --> 0:17:38.400 get better. 0:17:39.880 --> 0:17:44.040 So you mentioned, well, Shy Kids mentioned in the interview 0:17:44.200 --> 0:17:47.080 that by default it tries to prevent you from creating 0:17:47.200 --> 0:17:51.919 videos that violate copyright law, existing copyrights. Did you accidentally 0:17:52.840 --> 0:17:55.040 bump into this regularly, or was this something that just 0:17:55.080 --> 0:17:56.199 didn't really bother you? 0:17:57.560 --> 0:18:00.760 No, you couldn't generate things that... So when I was 0:18:00.960 --> 0:18:04.960 mentioning like a Hitchcock zoom, you couldn't mention Hitchcock, so 0:18:05.040 --> 0:18:07.480 you had to find a different way to describe that, 0:18:07.640 --> 0:18:13.960 as opposed to, like, using public figures. Anything that would 0:18:13.960 --> 0:18:17.119 have a public figure or a title you would not 0:18:17.160 --> 0:18:21.760 be allowed to generate. From my experience, there wasn't too 0:18:21.800 --> 0:18:26.200 many logos or brands or anything like that in any 0:18:26.240 --> 0:18:28.280 of the things that I generated, and... 0:18:29.600 --> 0:18:32.640 But something copyright. 
Did you generate anything that looked copyrighted? 0:18:33.080 --> 0:18:36.680 No. Not to my, not to my eye. 0:18:36.760 --> 0:18:41.560 That's fine. So, well, I know you don't know how 0:18:41.640 --> 0:18:44.200 much Sora will cost, and we don't know that, don't 0:18:44.200 --> 0:18:46.920 even know when it will launch. Can you talk about 0:18:46.920 --> 0:18:48.639 how much you'd be willing to pay for it? What 0:18:48.720 --> 0:18:50.600 do you think it's worth? And I realize that this 0:18:50.760 --> 0:18:52.280 is a vague question. 0:18:53.240 --> 0:18:53.760 For sure. 0:18:55.600 --> 0:19:02.840 I think that there is this illusion that Sora will 0:19:02.880 --> 0:19:08.000 be this solution to all problems, and I don't think 0:19:08.040 --> 0:19:10.800 that that is the case. I think Sora is a 0:19:10.840 --> 0:19:15.880 tool amongst many tools, and for certain things it will 0:19:15.920 --> 0:19:16.840 be very valuable. 0:19:17.040 --> 0:19:17.400 And so 0:19:19.000 --> 0:19:21.280 in terms of value, it's like, well, how much is 0:19:21.320 --> 0:19:24.399 a glass of water? Well, yes, if a glass of 0:19:24.440 --> 0:19:28.080 water is just like right now in my kitchen, I 0:19:27.560 --> 0:19:29.320 wouldn't like to pay that high for it. 0:19:29.720 --> 0:19:31.760 If a glass of water is for a person in 0:19:31.800 --> 0:19:34.840 the desert who desperately needs that glass of water, you 0:19:34.920 --> 0:19:37.600 can really name your price. And I would say that 0:19:38.119 --> 0:19:42.240 for some projects, I think that the usage of Sora 0:19:42.400 --> 0:19:44.560 would be absolutely invaluable, and 0:19:44.560 --> 0:19:47.240 I would, I would... 
0:19:47.680 --> 0:19:49.680 I don't know how much exactly that would be, would 0:19:49.680 --> 0:19:51.800 depend on the budget, would depend on the limits and 0:19:51.840 --> 0:19:56.640 the scales, but I would say that there's other projects 0:19:56.640 --> 0:19:58.960 where I think it would be like totally inappropriate or 0:19:59.000 --> 0:20:04.600 like just not worth... Like what? Well, just when I 0:20:04.640 --> 0:20:08.280 think of Studio Ghibli films that are hand drawn, and 0:20:09.760 --> 0:20:12.760 I think the reason that those films work is because 0:20:12.800 --> 0:20:16.080 of the way that they're made, or I think that 0:20:16.119 --> 0:20:19.280 when you think of Aardman animation, it's like I 0:20:19.320 --> 0:20:21.720 feel that you could feel the fingerprints in that clay, 0:20:22.240 --> 0:20:24.959 and so I don't think maybe for those types of 0:20:25.000 --> 0:20:29.040 films that it would be appropriate. But I think for 0:20:29.119 --> 0:20:31.880 other types of films like Airhead or others, I think 0:20:31.920 --> 0:20:36.960 it would be extremely appropriate. I think it's up to 0:20:37.000 --> 0:20:42.240 the artist's sort of discretion how much they think that 0:20:42.240 --> 0:20:43.520 that tool is needed. 0:20:45.000 --> 0:20:50.440 Doesn't the inconsistency of shots make this deeply impractical? 0:20:50.520 --> 0:20:52.199 Because that's the thing I kept coming back to. 0:20:53.000 --> 0:20:55.359 Yeah, I mean, depends on what project you're working on. 0:20:55.400 --> 0:20:58.000 And again, I think that this is like early days. 
0:20:58.359 --> 0:21:00.960 I think that these are kinks and bugs that are 0:21:01.119 --> 0:21:07.280 going to be changed, and already from day one where 0:21:07.280 --> 0:21:12.440 we started using it to where we are today, massive 0:21:12.480 --> 0:21:15.919 improvements have happened, and actually improvements where they've listened to 0:21:16.080 --> 0:21:19.919 things that we have suggested and things that we'd like 0:21:20.000 --> 0:21:21.560 to see and tools we'd 0:21:21.440 --> 0:21:22.040 like to see. 0:21:22.119 --> 0:21:31.400 So I think that, for example, for Airhead, the inconsistency 0:21:31.520 --> 0:21:38.800 of having a protagonist, having a protagonist that stays true 0:21:39.000 --> 0:21:41.119 through all these different shots, that's the reason why we 0:21:41.160 --> 0:21:43.680 put a balloon in front of their head, because while 0:21:43.680 --> 0:21:47.760 different bodies can sort of be accepted, a different face 0:21:47.800 --> 0:21:49.400 and a different head is going to be a little 0:21:49.440 --> 0:21:53.880 bit difficult. And so we turned the limitation into our 0:21:54.440 --> 0:21:58.600 sort of main attribute. And I would say that again, 0:21:58.720 --> 0:22:01.719 that works for that story. But I don't think that 0:22:01.840 --> 0:22:06.239 all stories are going to find this valuable. And I 0:22:06.240 --> 0:22:11.280 also don't think every single shot needs to come from Sora. 0:22:11.600 --> 0:22:14.720 I think that there's a world where it can be 0:22:14.800 --> 0:22:18.399 an addition, or it can be the start of a 0:22:18.480 --> 0:22:21.920 story where instead of just brainstorming and just having a script, 0:22:22.400 --> 0:22:26.119 you make a sort of moving mood board or a 0:22:26.200 --> 0:22:30.320 trailer or so. I think that there's like tons of 0:22:30.400 --> 0:22:35.919 stages along the pipeline that it would be extremely valuable 0:22:36.200 --> 0:22:41.400 and help elucidate concepts and bring them to life. 
0:22:41.680 --> 0:22:46.720 So thematic question. So you avoided filming locations and all 0:22:46.720 --> 0:22:49.160 of this, but you spend a lot of time writing 0:22:49.200 --> 0:22:53.360 prompts and you're waiting for Sora to generate clips, then 0:22:53.440 --> 0:22:56.000 upscaling and all that. Do you think you could 0:22:56.000 --> 0:22:58.960 make Airhead, assuming you could get around the balloon head thing, 0:22:59.320 --> 0:23:02.480 do you think you could make it quicker in real life? 0:23:02.640 --> 0:23:05.000 Or was Sora kind of essential to get it done 0:23:05.040 --> 0:23:06.480 in the timeline you did? Because it's like a week 0:23:06.520 --> 0:23:07.960 and a half, two weeks, I 0:23:07.920 --> 0:23:13.600 think. Yeah, I don't know, that's an interesting question. I mean, 0:23:13.600 --> 0:23:15.879 we definitely wouldn't be able to fly around the world 0:23:16.240 --> 0:23:20.560 and, yes, get the shots at the car race and 0:23:20.640 --> 0:23:22.000 all of those things, so 0:23:23.560 --> 0:23:26.199 I think it would probably be shorter. 0:23:26.440 --> 0:23:30.840 But I think in general, the conversations about like time 0:23:30.920 --> 0:23:35.240 and money are like super reductive in a way, in 0:23:35.280 --> 0:23:39.760 that I think that without Sora, this wouldn't exist. And 0:23:40.040 --> 0:23:44.160 I think that that is the more interesting conversation. As 0:23:44.880 --> 0:23:48.879 a director, most directors I know have a folder of 0:23:50.359 --> 0:23:53.959 unrealized ideas, and I think that my hope is that 0:23:54.119 --> 0:23:58.160 Sora will allow us to dust off those folders and 0:23:59.359 --> 0:24:02.320 breathe new life into concepts, and when people see 0:24:02.640 --> 0:24:07.080 what those concepts could be, my hope is that it 0:24:07.640 --> 0:24:13.280 gives a lot more people opportunities to have their ideas illuminated. 
0:24:13.720 --> 0:24:16.520 And whether that means to go and shoot it now 0:24:16.560 --> 0:24:20.760 traditionally or some hybrid. I think that that, to me 0:24:20.960 --> 0:24:22.600