WEBVTT - TechSupport: Pixel Peeping - How to Spot AI Video

0:00:07.800 --> 0:00:17.280
<v Speaker 1>Welcome to Tech Stuff. I'm Kara Price. Today's interview

0:00:17.400 --> 0:00:20.880
<v Speaker 1>is all about Sora, the video generation tool and invite-only

0:00:20.880 --> 0:00:23.320
<v Speaker 1>social media app that OpenAI released at the

0:00:23.360 --> 0:00:26.960
<v Speaker 1>beginning of October. If you're on TikTok, Instagram, or X,

0:00:27.120 --> 0:00:30.080
<v Speaker 1>you've likely seen videos made by Sora plastered all over

0:00:30.120 --> 0:00:33.920
<v Speaker 1>your feeds. These videos range from the absurd, like cats dancing

0:00:33.960 --> 0:00:37.920
<v Speaker 1>by a dumpster with sunglasses on, to the hyperrealistic, like

0:00:38.040 --> 0:00:41.720
<v Speaker 1>Queen Elizabeth trying jerk chicken in Jamaica. When I first

0:00:41.760 --> 0:00:44.560
<v Speaker 1>saw these videos, I was entertained by the absurdist ones

0:00:44.600 --> 0:00:47.920
<v Speaker 1>and kind of floored by the realistic ones. To me,

0:00:48.479 --> 0:00:51.400
<v Speaker 1>Sora signals that we have officially entered the post-bunny-trampoline

0:00:51.440 --> 0:00:54.680
<v Speaker 1>internet. Yeah, I'm talking about the AI video of

0:00:54.720 --> 0:00:57.400
<v Speaker 1>the horde of bunnies jumping on a trampoline in the dark.

0:00:57.880 --> 0:01:00.320
<v Speaker 1>I was very convinced that this video was real, and

0:01:00.360 --> 0:01:02.960
<v Speaker 1>so were many people, which led to a mini panic.

0:01:03.800 --> 0:01:06.640
<v Speaker 1>Is it even possible to detect what's fake and what's

0:01:06.680 --> 0:01:09.720
<v Speaker 1>not anymore? That's where my guest today comes in. His

0:01:09.840 --> 0:01:12.960
<v Speaker 1>name is Jeremy Carrasco and he runs multiple social media

0:01:13.000 --> 0:01:15.679
<v Speaker 1>accounts under the name ShowToolsAI.

0:01:16.240 --> 0:01:18.360
<v Speaker 2>The idea that we can't tell what's real or not

0:01:18.880 --> 0:01:22.480
<v Speaker 2>because of AI video is so far definitely

0:01:21.959 --> 0:01:24.240
<v Speaker 1>not the case. He has only been a full-time

0:01:24.280 --> 0:01:27.000
<v Speaker 1>creator for four months, but he has become a trusted

0:01:27.040 --> 0:01:30.880
<v Speaker 1>source for dissecting viral AI videos and explaining the tells.

0:01:31.000 --> 0:01:35.160
<v Speaker 2>There is a physical truth to shooting a video with

0:01:35.200 --> 0:01:38.959
<v Speaker 2>a camera. That physical truth isn't going away, and AI

0:01:39.520 --> 0:01:43.600
<v Speaker 2>does a version that to our eyes looks like that

0:01:43.640 --> 0:01:48.400
<v Speaker 2>physical truth. But upon examination you can figure out that

0:01:48.440 --> 0:01:51.120
<v Speaker 2>these things break down. And I do think that any

0:01:51.240 --> 0:01:55.120
<v Speaker 2>normal person with decent eyesight can zoom into these AI

0:01:55.240 --> 0:01:56.600
<v Speaker 2>videos and figure that out.

0:01:57.200 --> 0:02:00.680
<v Speaker 1>So Jeremy wants his social videos to be educational. He

0:02:00.760 --> 0:02:03.120
<v Speaker 1>wants more people to get excited by what he calls

0:02:03.200 --> 0:02:06.760
<v Speaker 1>pixel peeping, and he wants to improve people's media literacy

0:02:07.320 --> 0:02:10.120
<v Speaker 1>and hopes his accounts can help people tune their AI

0:02:10.240 --> 0:02:10.880
<v Speaker 1>vibe checker.

0:02:11.360 --> 0:02:13.400
<v Speaker 2>I'm not naive to the fact that people aren't going

0:02:13.440 --> 0:02:16.160
<v Speaker 2>to be pixel peeping on the videos that they watch,

0:02:16.320 --> 0:02:20.000
<v Speaker 2>so it's just about trying to tune people's initial impressions

0:02:20.040 --> 0:02:22.440
<v Speaker 2>so that they have something in their head that says, hey,

0:02:22.520 --> 0:02:25.040
<v Speaker 2>something might not be right here, and then they can use,

0:02:25.080 --> 0:02:27.880
<v Speaker 2>hopefully, other media skills that I teach them in order

0:02:27.919 --> 0:02:28.959
<v Speaker 2>to dive a little bit deeper.

0:02:28.760 --> 0:02:31.800
<v Speaker 1>I talk to Jeremy about so many things:

0:02:31.840 --> 0:02:35.840
<v Speaker 1>how video generation tools work, how to pick up on AI tells,

0:02:36.200 --> 0:02:39.200
<v Speaker 1>why Sora is an inflection point for the Internet, and

0:02:39.240 --> 0:02:41.960
<v Speaker 1>what this signals for the future of social media. I

0:02:42.000 --> 0:02:44.880
<v Speaker 1>started out by asking Jeremy to clarify what Sora is

0:02:45.280 --> 0:02:46.440
<v Speaker 1>and what it does.

0:02:47.120 --> 0:02:52.360
<v Speaker 2>So, Sora was originally released as OpenAI's first video model.

0:02:52.680 --> 0:02:56.239
<v Speaker 2>In October twenty twenty five, they reused the Sora name

0:02:56.440 --> 0:02:59.240
<v Speaker 2>to launch their social media app. A lot of the

0:02:59.280 --> 0:03:01.680
<v Speaker 2>hype has been around the Sora app, which is currently

0:03:01.720 --> 0:03:04.720
<v Speaker 2>invite only, and then there's the Sora 2 model that

0:03:04.760 --> 0:03:08.680
<v Speaker 2>you can already access if you have API access or

0:03:08.720 --> 0:03:11.760
<v Speaker 2>if you're a developer or even a normal person. There

0:03:11.800 --> 0:03:14.360
<v Speaker 2>are tools that let you generate a video with the

0:03:14.400 --> 0:03:18.440
<v Speaker 2>Sora 2 video model without an invite. The Sora app

0:03:18.480 --> 0:03:23.520
<v Speaker 2>experience is very unique in some ways and very familiar

0:03:23.520 --> 0:03:26.200
<v Speaker 2>in others. It does feel like a TikTok For You

0:03:26.320 --> 0:03:30.160
<v Speaker 2>page, just for AI videos. You can scroll, it has

0:03:30.200 --> 0:03:32.840
<v Speaker 2>an algorithm to suggest videos. But what's gotten a lot of

0:03:32.840 --> 0:03:36.720
<v Speaker 2>the attention is the ability to cameo someone. But really,

0:03:36.840 --> 0:03:38.880
<v Speaker 2>these are just deepfakes. Like, you're creating deepfakes

0:03:38.920 --> 0:03:41.200
<v Speaker 2>of your friends. You're creating deepfakes of whoever lets

0:03:41.240 --> 0:03:42.960
<v Speaker 2>you create a deepfake of them. And you have

0:03:43.040 --> 0:03:46.840
<v Speaker 2>different levels of permissions. So, for example, Jake Paul and

0:03:46.920 --> 0:03:51.080
<v Speaker 2>Sam Altman let anyone deepfake them, whereas I let

0:03:51.120 --> 0:03:54.120
<v Speaker 2>no one deepfake me because I'm not comfortable with that.

0:03:54.600 --> 0:03:56.680
<v Speaker 1>What does it look like to let someone deepfake

0:03:56.720 --> 0:03:57.400
<v Speaker 1>you on Sora?

0:03:57.840 --> 0:04:00.800
<v Speaker 2>It looks like a version of you doing whatever they

0:04:01.520 --> 0:04:04.240
<v Speaker 2>prompted you to do. Now, there are safety features in place,

0:04:04.480 --> 0:04:08.120
<v Speaker 2>so you can't have them do anything violent, you can't

0:04:08.160 --> 0:04:11.560
<v Speaker 2>do anything sexual. But it's really up to OpenAI

0:04:11.640 --> 0:04:14.280
<v Speaker 2>to set those boundaries. And I don't think it's

0:04:14.320 --> 0:04:17.680
<v Speaker 2>completely accurate. I've made versions of myself that I think

0:04:17.720 --> 0:04:19.919
<v Speaker 2>don't look very much like me. I've made other versions

0:04:19.920 --> 0:04:22.400
<v Speaker 2>of myself that look a lot like me. That's really

0:04:22.520 --> 0:04:27.520
<v Speaker 2>up to luck, because as we'll learn, these models aren't deterministic.

0:04:27.600 --> 0:04:30.120
<v Speaker 2>There's a part of this that is random, so it's

0:04:30.160 --> 0:04:33.440
<v Speaker 2>not repeatable. So Jake Paul is a very good example.

0:04:33.480 --> 0:04:35.880
<v Speaker 2>There are a ton of AI videos of Jake Paul

0:04:35.960 --> 0:04:38.080
<v Speaker 2>right now. All of them look a little bit different,

0:04:38.120 --> 0:04:41.960
<v Speaker 2>but have his likeness, so you have to give permission

0:04:42.000 --> 0:04:43.680
<v Speaker 2>for someone to make a video of you through the

0:04:43.680 --> 0:04:44.440
<v Speaker 2>cameo feature.

0:04:44.760 --> 0:04:48.960
<v Speaker 1>So would you say that AI video generation scares you? Like,

0:04:49.360 --> 0:04:51.120
<v Speaker 1>is it something that keeps you up at night?

0:04:51.800 --> 0:04:54.159
<v Speaker 2>It's not, because I'm doing something about it now, but

0:04:54.200 --> 0:04:56.280
<v Speaker 2>it really was, and I think it is keeping people

0:04:56.320 --> 0:04:58.000
<v Speaker 2>up at night because so much of our time is

0:04:58.040 --> 0:05:01.120
<v Speaker 2>spent on these short-form video platforms, for better

0:05:01.240 --> 0:05:04.800
<v Speaker 2>or worse. I do think that it is the primary

0:05:04.839 --> 0:05:08.039
<v Speaker 2>way that people get information now. It was probably never

0:05:08.120 --> 0:05:10.520
<v Speaker 2>the best format for that information in the first place,

0:05:10.640 --> 0:05:13.080
<v Speaker 2>but here we are. So I think what keeps me

0:05:13.200 --> 0:05:16.800
<v Speaker 2>up is really general media literacy skills, and I think

0:05:16.800 --> 0:05:19.640
<v Speaker 2>of AI video as an extension of that. A lot

0:05:19.680 --> 0:05:21.560
<v Speaker 2>of people are kept up by what I think are

0:05:21.720 --> 0:05:25.480
<v Speaker 2>irrational fears about AI video. Like, in my opinion, it's

0:05:25.520 --> 0:05:28.039
<v Speaker 2>probably not going to be framing you for a crime

0:05:28.080 --> 0:05:31.400
<v Speaker 2>anytime soon, but it might turn the court of public

0:05:31.440 --> 0:05:34.320
<v Speaker 2>opinion against you. It might be spreading disinformation.

0:05:34.600 --> 0:05:34.680
<v Speaker 1>Like.

0:05:34.760 --> 0:05:38.320
<v Speaker 2>It's an extension of other media literacy problems, and it's

0:05:38.360 --> 0:05:41.440
<v Speaker 2>a very believable one, because when people are scrolling,

0:05:41.560 --> 0:05:43.400
<v Speaker 2>they're just there to tune out and scroll. They're not

0:05:43.440 --> 0:05:46.640
<v Speaker 2>there to pixel peep and really pay attention, right?

0:05:46.600 --> 0:05:48.960
<v Speaker 1>I mean, you don't think that we are living in a

0:05:49.000 --> 0:05:51.560
<v Speaker 1>world where soon people could be framed for something they

0:05:51.560 --> 0:05:53.599
<v Speaker 1>didn't do using manipulated video?

0:05:54.000 --> 0:05:56.960
<v Speaker 2>Well, I'm not a lawyer, but I've

0:05:57.000 --> 0:05:59.920
<v Speaker 2>done some looking into this, and the reality is that

0:06:00.279 --> 0:06:03.000
<v Speaker 2>in order for something to be admitted into evidence, at

0:06:03.080 --> 0:06:04.800
<v Speaker 2>least in the United States, it has to have an

0:06:04.839 --> 0:06:09.039
<v Speaker 2>extensive metadata trail. It has to be authenticated. You have

0:06:09.080 --> 0:06:11.039
<v Speaker 2>to get the person who filmed the video into the

0:06:11.040 --> 0:06:13.839
<v Speaker 2>courtroom to say that they filmed it. And we have

0:06:13.920 --> 0:06:17.840
<v Speaker 2>to understand that while our perception might be getting tricked,

0:06:18.040 --> 0:06:21.919
<v Speaker 2>there are procedural and mathematical ways that these can be detected.

0:06:22.440 --> 0:06:25.440
<v Speaker 2>So it is not undetectable yet. And anyone who says

0:06:25.440 --> 0:06:28.640
<v Speaker 2>it's undetectable is probably either selling you something or doesn't

0:06:28.640 --> 0:06:30.760
<v Speaker 2>have a good eye. And anyone who says it will

0:06:30.800 --> 0:06:34.680
<v Speaker 2>be undetectable does not know that, and frankly doesn't understand

0:06:34.760 --> 0:06:36.840
<v Speaker 2>the technology that's making these AI videos very well,

0:06:36.880 --> 0:06:40.160
<v Speaker 1>in my opinion. And right now, your likeness is not shared?

0:06:41.160 --> 0:06:46.640
<v Speaker 2>No, I have a strong, strong bias against this because

0:06:46.720 --> 0:06:49.680
<v Speaker 2>I believe that once your likeness gets out there and

0:06:49.839 --> 0:06:53.360
<v Speaker 2>is deepfakable, so to speak, it's really hard to pull

0:06:53.360 --> 0:06:56.719
<v Speaker 2>that back, not because you can't, like you can tell

0:06:56.760 --> 0:06:59.800
<v Speaker 2>people to stop, but once it's out there, I think

0:06:59.800 --> 0:07:01.880
<v Speaker 2>you lose a sense of trust. It's a line that

0:07:01.960 --> 0:07:04.440
<v Speaker 2>I just don't want to cross. I'm not comfortable crossing,

0:07:04.440 --> 0:07:06.719
<v Speaker 2>and I've actually told my followers I will never cross

0:07:06.760 --> 0:07:09.760
<v Speaker 2>that line because it's just not what I'm interested in.

0:07:10.440 --> 0:07:13.640
<v Speaker 1>So I was hoping that you could show me how

0:07:13.640 --> 0:07:15.559
<v Speaker 1>to make a video using the Sora app.

0:07:15.760 --> 0:07:19.520
<v Speaker 2>Sure, so this is the Sora desktop app. It is

0:07:19.640 --> 0:07:22.400
<v Speaker 2>not the vertical experience that you have, you know, on

0:07:22.400 --> 0:07:25.360
<v Speaker 2>the phone. It is, however, showing a lot of the

0:07:25.400 --> 0:07:28.280
<v Speaker 2>same content. So this is essentially the For You page

0:07:28.280 --> 0:07:30.400
<v Speaker 2>of Sora. And the thing to note here is that

0:07:30.440 --> 0:07:33.280
<v Speaker 2>there are Sora watermarks over each one of these videos.

0:07:33.560 --> 0:07:36.040
<v Speaker 2>In the mobile experience, those watermarks go away, but

0:07:36.160 --> 0:07:39.200
<v Speaker 2>they don't let you screen record in the mobile version,

0:07:39.320 --> 0:07:42.360
<v Speaker 2>whereas theoretically anyone could do what I'm doing right now,

0:07:42.400 --> 0:07:44.720
<v Speaker 2>like I can share my screen here, I could record

0:07:44.760 --> 0:07:47.560
<v Speaker 2>my screen. When you see Sora videos on social media,

0:07:48.080 --> 0:07:49.360
<v Speaker 2>this is how they're being made.

0:07:49.440 --> 0:07:52.040
<v Speaker 1>So let's try to make a Sora video. Let's do

0:07:53.000 --> 0:07:55.680
<v Speaker 1>skiing with candy.

0:07:56.520 --> 0:07:59.000
<v Speaker 2>Skiing with Candy. You want me to just say that

0:07:59.080 --> 0:08:02.520
<v Speaker 2>and see what it comes up with? Yes, let's do it.

0:08:02.600 --> 0:08:03.720
<v Speaker 2>I think that's a great idea.

0:08:03.760 --> 0:08:04.840
<v Speaker 1>Why do you think it's a good idea?

0:08:04.960 --> 0:08:10.120
<v Speaker 2>Because something that people aren't talking about enough with

0:08:10.240 --> 0:08:12.880
<v Speaker 2>Sora is that you can have a very simple prompt

0:08:12.960 --> 0:08:16.080
<v Speaker 2>and it can come up with something really creative. That's

0:08:16.200 --> 0:08:19.960
<v Speaker 2>really what, in my opinion, distinguishes it from other video models.

0:08:20.240 --> 0:08:22.880
<v Speaker 2>Google Veo 3 was how a lot of AI content

0:08:22.960 --> 0:08:25.320
<v Speaker 2>was made a few weeks ago. If you don't give

0:08:25.320 --> 0:08:28.480
<v Speaker 2>Google Veo 3 a good prompt, it's just boring, whereas

0:08:28.480 --> 0:08:32.040
<v Speaker 2>Sora will go through some attempts to at least make

0:08:32.080 --> 0:08:33.120
<v Speaker 2>it entertaining anyway.

0:08:33.320 --> 0:08:36.360
<v Speaker 1>It's just incredible to me that in a given three

0:08:36.400 --> 0:08:38.160
<v Speaker 1>weeks the world sort of changes.

0:08:38.480 --> 0:08:42.080
<v Speaker 2>I think that there is a misconception that the world

0:08:42.200 --> 0:08:48.440
<v Speaker 2>just changed because video AI made a huge, undetectable leap.

0:08:48.960 --> 0:08:52.160
<v Speaker 2>It did make a step towards more realism. What Sora

0:08:52.280 --> 0:08:57.000
<v Speaker 2>2 really improved were a lot of the human parts

0:08:57.120 --> 0:09:00.840
<v Speaker 2>of video AI, such as hand movement, or if they

0:09:01.160 --> 0:09:04.840
<v Speaker 2>have a missing limb, or if their teeth look weird,

0:09:05.080 --> 0:09:08.200
<v Speaker 2>or if their eyes look uncanny, or hair. There were

0:09:08.240 --> 0:09:10.880
<v Speaker 2>all these little things that people would pick on, again,

0:09:11.000 --> 0:09:14.360
<v Speaker 2>a lot of them subconscious. Sora 2 made a step towards

0:09:14.360 --> 0:09:18.199
<v Speaker 2>improving those things. It still has a lot of background issues.

0:09:18.720 --> 0:09:22.720
<v Speaker 2>It is actually a noisier or muddier looking model in

0:09:22.720 --> 0:09:25.520
<v Speaker 2>my opinion than Veo, but a lot of people aren't

0:09:25.559 --> 0:09:27.480
<v Speaker 2>looking for that. A lot of the videos that go

0:09:27.640 --> 0:09:32.720
<v Speaker 2>viral that are AI generated are security cams, or body

0:09:32.800 --> 0:09:36.960
<v Speaker 2>cams, or GoPro-looking cameras, things that people aren't

0:09:37.000 --> 0:09:40.160
<v Speaker 2>looking at every day. But it really made improvements in

0:09:41.080 --> 0:09:45.560
<v Speaker 2>how good the outputs are to watch, like, story-wise.

0:09:46.160 --> 0:09:49.080
<v Speaker 2>If you were to release Google Veo 3 as a

0:09:49.120 --> 0:09:53.440
<v Speaker 2>social media app, it would fail just entirely because people

0:09:53.440 --> 0:09:57.600
<v Speaker 2>would get on there and unless you're a good prompter, like,

0:09:57.640 --> 0:10:00.320
<v Speaker 2>you're not going to come up with anything interesting. Whereas

0:10:00.320 --> 0:10:03.520
<v Speaker 2>Sora made it so that anyone getting into AI

0:10:03.559 --> 0:10:06.280
<v Speaker 2>can come up with something interesting with a very basic prompt.

0:10:06.320 --> 0:10:09.760
<v Speaker 2>That's a really, really big innovation that they didn't talk about.

0:10:09.880 --> 0:10:12.160
<v Speaker 2>But I think that's why it's had such an impact

0:10:12.280 --> 0:10:16.920
<v Speaker 2>is because there's a huge volume of somewhat meaningful Sora

0:10:17.040 --> 0:10:19.880
<v Speaker 2>videos out there, whereas there really wasn't with Veo when

0:10:19.880 --> 0:10:22.360
<v Speaker 2>that came out, right? So, all right, it came

0:10:22.440 --> 0:10:24.120
<v Speaker 2>up with skiing with candy. Let's see

0:10:24.120 --> 0:10:24.640
<v Speaker 2>what it did here.

0:10:25.240 --> 0:10:25.520
<v Speaker 1>Look what.

0:10:27.160 --> 0:10:29.959
<v Speaker 2>A mid-slope snack, classy, and

0:10:29.920 --> 0:10:31.000
<v Speaker 3>a peppermint for the win.

0:10:31.200 --> 0:10:34.000
<v Speaker 1>Nothing like sweet fuel to keep the turns smooth. Catch

0:10:34.000 --> 0:10:35.079
<v Speaker 1>you at the bottom.

0:10:35.480 --> 0:10:37.599
<v Speaker 2>All right. What are your impressions?

0:10:37.840 --> 0:10:40.760
<v Speaker 1>I just don't, I'm sorry, is it okay

0:10:40.760 --> 0:10:42.000
<v Speaker 1>that this is blowing my mind?

0:10:42.400 --> 0:10:43.400
<v Speaker 2>It should.

0:10:43.440 --> 0:10:45.360
<v Speaker 1>Okay, good, it should blow your mind, because I feel daft,

0:10:45.520 --> 0:10:49.640
<v Speaker 1>like I feel like I can't wrap my head around this,

0:10:49.960 --> 0:10:52.160
<v Speaker 1>like I'm assuming this woman in the video with her

0:10:52.200 --> 0:10:53.960
<v Speaker 1>ski mask on is not a real person.

0:10:54.120 --> 0:10:56.120
<v Speaker 2>No, she's not a real person. And we don't know

0:10:56.480 --> 0:10:58.600
<v Speaker 2>how they invented her. They just came up with that.

0:10:58.679 --> 0:10:59.880
<v Speaker 3>What?

0:11:00.000 --> 0:11:03.000
<v Speaker 2>So there are things about this that stick out to

0:11:03.040 --> 0:11:05.720
<v Speaker 2>me as obvious AI video. And then there are things

0:11:05.720 --> 0:11:09.040
<v Speaker 2>about this that I just have to say, wow, that

0:11:09.160 --> 0:11:12.280
<v Speaker 2>is incredible. So if I can just explain what I

0:11:12.360 --> 0:11:15.360
<v Speaker 2>see here as someone who watches these. Please. She starts

0:11:15.360 --> 0:11:18.120
<v Speaker 2>out by skiing down the hill, but she's kind of

0:11:18.160 --> 0:11:21.640
<v Speaker 2>skiing like it's snowboarding. Then she stops. She has some

0:11:21.679 --> 0:11:24.240
<v Speaker 2>peppermints in her hand, she has some bags of candy

0:11:24.400 --> 0:11:27.920
<v Speaker 2>in her hand, and there are some weird things going

0:11:27.960 --> 0:11:30.280
<v Speaker 2>on here. But what it did is, without

0:11:30.320 --> 0:11:34.000
<v Speaker 2>any input, it basically made a social media video.

0:11:34.000 --> 0:11:38.200
<v Speaker 2>It's like she's promoting this candy. There's someone responding to

0:11:38.240 --> 0:11:42.319
<v Speaker 2>her in the background. It invented a story for her.

0:11:42.400 --> 0:11:45.120
<v Speaker 2>Exactly. She talks like an influencer.

0:11:45.520 --> 0:11:47.440
<v Speaker 1>I just, it really trips me up that she's not

0:11:47.480 --> 0:11:49.319
<v Speaker 1>a real person, that this person does not exist in

0:11:49.360 --> 0:11:50.800
<v Speaker 1>the world. It's really weird.

0:11:51.040 --> 0:11:53.360
<v Speaker 2>Same. I mean, I have to tell myself it's not

0:11:53.400 --> 0:11:53.920
<v Speaker 2>a real person.

0:11:54.000 --> 0:11:55.679
<v Speaker 1>I mean, it would be like if you didn't exist.

0:11:56.000 --> 0:12:00.680
<v Speaker 2>Yeah, that's the thing, it visually feels the

0:12:00.679 --> 0:12:03.640
<v Speaker 2>same as talking to another person online. Of course, there

0:12:03.640 --> 0:12:05.640
<v Speaker 2>are tells, so I'll get into those. So,

0:12:06.280 --> 0:12:08.880
<v Speaker 2>first of all, you have just the context. Why is

0:12:08.960 --> 0:12:11.800
<v Speaker 2>she skiing down the hill with a bag of candy

0:12:11.960 --> 0:12:13.880
<v Speaker 2>and why is she just putting it in her mouth

0:12:13.960 --> 0:12:17.800
<v Speaker 2>with the wrappers? Then there are some artifacts that I

0:12:17.840 --> 0:12:20.760
<v Speaker 2>can see, especially at the beginning of the generation. Her

0:12:20.880 --> 0:12:24.400
<v Speaker 2>jacket and her pants are incredibly pixelated when it starts.

0:12:24.920 --> 0:12:28.160
<v Speaker 2>But the other thing here is that it's very noisy.

0:12:28.200 --> 0:12:32.840
<v Speaker 2>If we actually zoom in, there are a lot of artifacts

0:12:33.040 --> 0:12:34.640
<v Speaker 2>in the mountains back there.

0:12:34.880 --> 0:12:36.679
<v Speaker 1>It is weird how she's eating the candy. That's a

0:12:36.679 --> 0:12:37.480
<v Speaker 1>little uncanny.

0:12:37.520 --> 0:12:41.400
<v Speaker 2>It's weird. Yeah, she's eating wrapped candy, and the bag

0:12:41.440 --> 0:12:43.960
<v Speaker 2>there just stuck to her knee. Yeah, you know, so

0:12:44.040 --> 0:12:46.480
<v Speaker 2>at first it's a ziplock bag, then it's not a

0:12:46.559 --> 0:12:50.720
<v Speaker 2>ziplock bag, then it sticks to her knee. Her feet

0:12:50.720 --> 0:12:54.199
<v Speaker 2>are backwards, like her foot there is literally backwards in

0:12:54.280 --> 0:12:56.840
<v Speaker 2>this version. She doesn't have a foot, like, you know,

0:12:56.880 --> 0:12:58.160
<v Speaker 2>you get into it. It's kind of funny.

0:12:58.200 --> 0:12:59.960
<v Speaker 1>But this is why you have such a large platform,

0:13:00.320 --> 0:13:02.320
<v Speaker 1>because like I look at this at first and I'm like, oh,

0:13:02.360 --> 0:13:05.840
<v Speaker 1>it's perfect. Like in a way, if I see the

0:13:05.880 --> 0:13:09.199
<v Speaker 1>trappings of what I think I'm seeing, I don't really

0:13:09.240 --> 0:13:10.760
<v Speaker 1>look for the detail that's wrong.

0:13:11.120 --> 0:13:14.439
<v Speaker 2>Especially when you're just scrolling on TikTok or Instagram. You're

0:13:14.440 --> 0:13:15.079
<v Speaker 2>not looking for

0:13:15.000 --> 0:13:16.719
<v Speaker 1>anything wrong. Right, which is how they want you to

0:13:16.760 --> 0:13:18.559
<v Speaker 1>look at it, or scrolling on Sora.

0:13:18.480 --> 0:13:20.679
<v Speaker 2>Or scrolling on Sora. A lot of them are leaving

0:13:20.679 --> 0:13:23.560
<v Speaker 2>Sora and making it out to all these platforms. Yeah,

0:13:24.240 --> 0:13:26.480
<v Speaker 2>you're not going to be looking for these things. I'm

0:13:26.520 --> 0:13:29.640
<v Speaker 2>totally aware of that. I mean, on first watch, are

0:13:29.679 --> 0:13:31.480
<v Speaker 2>you gonna pick out everything that's wrong with this? No.

0:13:31.600 --> 0:13:33.800
<v Speaker 2>But if you watch it five times and start zooming in,

0:13:34.240 --> 0:13:37.199
<v Speaker 2>you're gonna start noticing that her feet are literally backwards.

0:13:37.480 --> 0:13:41.000
<v Speaker 2>So yeah, when it comes down to it, I think

0:13:41.040 --> 0:13:44.800
<v Speaker 2>what's really very important about Sora is that it did

0:13:44.840 --> 0:13:47.400
<v Speaker 2>all that work for you. You didn't need to know

0:13:47.480 --> 0:13:50.160
<v Speaker 2>how to prompt the video AI. If you were to

0:13:50.240 --> 0:13:54.320
<v Speaker 2>put skiing with candy into Google Veo, it's just going

0:13:54.400 --> 0:13:56.320
<v Speaker 2>to be boring. I'll just tell you that right now.

0:13:56.679 --> 0:14:00.720
<v Speaker 1>So if I wanted this video, this exact video, from Veo 3,

0:14:01.720 --> 0:14:03.560
<v Speaker 1>what would I have to prompt it to do?

0:14:04.040 --> 0:14:06.320
<v Speaker 2>You'd have to act like a camera director. You'd have

0:14:06.400 --> 0:14:11.040
<v Speaker 2>to say: video starting with a woman skiing down the slope.

0:14:11.120 --> 0:14:15.880
<v Speaker 2>She is wearing a pink and yellow top, a turquoise bottom,

0:14:16.320 --> 0:14:18.320
<v Speaker 2>She's holding a bag of candy in her right hand,

0:14:18.360 --> 0:14:20.600
<v Speaker 2>peppermints in her left hand, and you'd have to go shot

0:14:20.640 --> 0:14:23.880
<v Speaker 2>by shot to give it. I can actually show you

0:14:24.560 --> 0:14:27.320
<v Speaker 2>something that I came up with that more clearly demonstrates

0:14:27.400 --> 0:14:29.960
<v Speaker 2>this point. So this is a video I made yesterday

0:14:30.080 --> 0:14:34.600
<v Speaker 2>with the prompt: epic anime of Diego Maradona scoring a

0:14:34.680 --> 0:14:37.000
<v Speaker 2>goal in the World Cup.

0:14:36.680 --> 0:14:41.600
<v Speaker 1>Weaving past one, still going, two defenders beaten, he won't...

0:14:42.280 --> 0:14:45.680
<v Speaker 2>This is him dribbling through an entire defense. It is

0:14:45.720 --> 0:14:49.120
<v Speaker 2>an epic-looking anime. Anime people would say it doesn't

0:14:49.160 --> 0:14:53.040
<v Speaker 2>look great, but normal people probably wouldn't notice it. And

0:14:53.320 --> 0:14:57.360
<v Speaker 2>what blew me away about this is that it created

0:14:57.720 --> 0:15:02.280
<v Speaker 2>Diego Maradona's most famous goal, and it added the announcers.

0:15:02.560 --> 0:15:04.680
<v Speaker 2>I didn't tell it to do any of that. Now,

0:15:04.720 --> 0:15:07.720
<v Speaker 2>if I compare that to what Google Veo did with

0:15:07.840 --> 0:15:15.080
<v Speaker 2>the exact same prompt, it did this.

0:15:15.000 --> 0:15:16.120
<v Speaker 1>One is b team.

0:15:16.160 --> 0:15:20.200
<v Speaker 2>It is. The quality of the video is actually better,

0:15:20.440 --> 0:15:23.840
<v Speaker 2>but it didn't make it interesting. So again, that's why

0:15:23.880 --> 0:15:25.840
<v Speaker 2>you're seeing so much Sora, as you don't need to

0:15:25.840 --> 0:15:26.600
<v Speaker 2>be very creative.

0:15:27.080 --> 0:15:30.120
<v Speaker 1>What are the implications of a social media app being

0:15:30.200 --> 0:15:34.280
<v Speaker 1>designed to house videos full of fake people? Like, it's

0:15:34.320 --> 0:15:35.760
<v Speaker 1>just crazy to me that I can watch a video

0:15:35.840 --> 0:15:37.440
<v Speaker 1>of someone who doesn't exist.

0:15:37.960 --> 0:15:42.040
<v Speaker 2>I think that we don't know the implications, and I

0:15:42.080 --> 0:15:45.920
<v Speaker 2>would push back on it being like our inevitable future

0:15:46.360 --> 0:15:49.760
<v Speaker 2>a bit, but I would say that it is normalizing

0:15:50.000 --> 0:15:53.960
<v Speaker 2>deepfaking, and I don't think we know what that

0:15:54.040 --> 0:15:56.720
<v Speaker 2>will mean for us. But I don't think it'll be good.

0:15:57.200 --> 0:15:59.960
<v Speaker 2>I think it might be entertaining, I think it might

0:15:59.960 --> 0:16:04.320
<v Speaker 2>be interesting. It is certainly a technical achievement, but I

0:16:04.320 --> 0:16:07.800
<v Speaker 2>don't consider it to be a technological advancement. I'm not

0:16:07.920 --> 0:16:11.080
<v Speaker 2>so sure it is progress. But it is a pretty

0:16:11.080 --> 0:16:14.320
<v Speaker 2>incredible thing that they've been able to pull off, and

0:16:14.840 --> 0:16:17.840
<v Speaker 2>I think that it is rational for people to look

0:16:17.880 --> 0:16:21.080
<v Speaker 2>at these videos and be pretty freaked out. And that's

0:16:21.120 --> 0:16:25.080
<v Speaker 2>what a lot of my comments are because what isn't

0:16:25.120 --> 0:16:29.320
<v Speaker 2>clear is how this is going to improve social media

0:16:29.320 --> 0:16:32.320
<v Speaker 2>in any way, or improve our media literacy skills in any way.

0:16:32.640 --> 0:16:37.760
<v Speaker 2>There are definitely tech advancements here that can improve progress

0:16:37.800 --> 0:16:42.680
<v Speaker 2>towards artificial general intelligence, like there are technical reasons that

0:16:42.720 --> 0:16:46.000
<v Speaker 2>this could be helpful in the future. But the step

0:16:46.040 --> 0:16:48.960
<v Speaker 2>that OpenAI took to release this in a social

0:16:49.040 --> 0:16:53.760
<v Speaker 2>media app was a huge jump, in my opinion, in

0:16:53.800 --> 0:16:56.920
<v Speaker 2>the wrong direction. But the technology is here to stay

0:16:56.960 --> 0:16:57.320
<v Speaker 2>for sure.

0:17:03.920 --> 0:17:08.000
<v Speaker 1>After the break, will we become desensitized to deepfakes?

0:17:08.520 --> 0:17:12.880
<v Speaker 3>Stay with us.

0:17:27.800 --> 0:17:30.120
<v Speaker 1>One thing that I can't really get over about Sora

0:17:30.240 --> 0:17:34.440
<v Speaker 1>2 is that Sam Altman is letting anybody use his likeness.

0:17:34.560 --> 0:17:37.840
<v Speaker 1>He opened his likeness to any Sora user, so I

0:17:37.840 --> 0:17:42.000
<v Speaker 1>could say, Sam Altman building a snowman, for example. Why

0:17:42.040 --> 0:17:45.320
<v Speaker 1>do this, like, as the head of the company?

0:17:45.359 --> 0:17:48.480
<v Speaker 2>I can only guess. I think that it is generally

0:17:49.080 --> 0:17:54.480
<v Speaker 2>just an attempt at normalizing deepfaking people, and I think

0:17:54.520 --> 0:17:57.240
<v Speaker 2>people should be really scared of crossing that line. I

0:17:57.240 --> 0:17:59.560
<v Speaker 2>think it's a serious thing to do, and I think

0:17:59.640 --> 0:18:03.960
<v Speaker 2>OpenAI pushing everyone in that direction before anyone was even

0:18:04.000 --> 0:18:08.320
<v Speaker 2>asking for it is really frightening. You could create

0:18:08.359 --> 0:18:11.560
<v Speaker 2>deepfakes of people before; there was the technology to do it,

0:18:11.760 --> 0:18:14.720
<v Speaker 2>but there was a lot of friction and social pressure not

0:18:14.880 --> 0:18:18.280
<v Speaker 2>to do it. That friction was helpful in keeping our

0:18:18.359 --> 0:18:22.320
<v Speaker 2>information economy healthy. Even with safety features on the Sora

0:18:22.359 --> 0:18:25.639
<v Speaker 2>app, like letting you set permissions, people are

0:18:25.640 --> 0:18:28.880
<v Speaker 2>gonna mess that up. People won't know that they can

0:18:28.880 --> 0:18:31.479
<v Speaker 2>be deepfaked, and of course that's their responsibility to know.

0:18:31.680 --> 0:18:34.120
<v Speaker 2>But you've just opened up an entire can of worms.

0:18:34.160 --> 0:18:37.200
<v Speaker 2>There are other issues here, like currently you can't delete

0:18:37.240 --> 0:18:40.639
<v Speaker 2>your Sora account without deleting your entire ChatGPT account.

0:18:40.960 --> 0:18:41.200
<v Speaker 3>Wow.

0:18:41.359 --> 0:18:43.879
<v Speaker 2>And again, like, you can't pull this back. Like, in

0:18:44.040 --> 0:18:46.280
<v Speaker 2>theory you could stop people. But if you are a

0:18:46.280 --> 0:18:48.520
<v Speaker 2>public figure and you open up this can of worms,

0:18:48.880 --> 0:18:54.040
<v Speaker 2>it could really backfire. So it's Sora accelerating this

0:18:54.080 --> 0:18:57.399
<v Speaker 2>deepfake idea into a space that just hasn't been that

0:18:57.440 --> 0:18:59.760
<v Speaker 2>fully explored yet. And I don't think I'd want to

0:18:59.760 --> 0:19:02.920
<v Speaker 2>be an early adopter of this because there's a lot of negative,

0:19:03.040 --> 0:19:05.520
<v Speaker 2>like, downside risk that I just don't think we've figured

0:19:05.600 --> 0:19:06.000
<v Speaker 2>out yet.

0:19:06.520 --> 0:19:09.040
<v Speaker 1>So you have a video where you talk about how

0:19:09.080 --> 0:19:12.600
<v Speaker 1>Sora is actually costing OpenAI about one dollar per post.

0:19:12.720 --> 0:19:15.760
<v Speaker 1>Can you explain that calculation and what it means for

0:19:15.840 --> 0:19:17.160
<v Speaker 1>Sora long term?

0:19:17.240 --> 0:19:19.520
<v Speaker 2>This was an educated guess that ended up being right.

0:19:19.840 --> 0:19:23.280
<v Speaker 2>Every video you create is basically on OpenAI's dime. So,

0:19:23.520 --> 0:19:26.840
<v Speaker 2>for example, two weeks ago, if I, as a creator

0:19:26.960 --> 0:19:30.320
<v Speaker 2>wanted to post an AI video to TikTok or Instagram,

0:19:30.680 --> 0:19:33.040
<v Speaker 2>I would have to pay a subscription to make that

0:19:33.119 --> 0:19:37.400
<v Speaker 2>video and download it or pay per post. So there

0:19:37.520 --> 0:19:42.640
<v Speaker 2>are commodity prices for these video models. For Google Veo,

0:19:42.920 --> 0:19:46.000
<v Speaker 2>it's a dollar fifty to three dollars. Sora is currently

0:19:46.080 --> 0:19:51.040
<v Speaker 2>around a dollar. But the Sora application is free, and

0:19:51.160 --> 0:19:53.760
<v Speaker 2>anytime you create an AI video on that, it is

0:19:53.880 --> 0:19:57.600
<v Speaker 2>free to you. So as always I would ask the question,

0:19:57.760 --> 0:19:59.959
<v Speaker 2>if it is free, are you the product? And in

0:20:00.080 --> 0:20:02.960
<v Speaker 2>this case, they are taking your data, they're taking your

0:20:02.960 --> 0:20:06.119
<v Speaker 2>face scans, they're taking your prompts. Right, so there's that

0:20:06.200 --> 0:20:08.040
<v Speaker 2>question of why are they doing this? Of course they're

0:20:08.040 --> 0:20:10.879
<v Speaker 2>also doing it to get users. But imagine you were

0:20:10.920 --> 0:20:15.119
<v Speaker 2>TikTok or Instagram and every single time someone posted a

0:20:15.200 --> 0:20:18.200
<v Speaker 2>video on your site you needed to pay a dollar.

0:20:18.600 --> 0:20:20.840
<v Speaker 2>How quickly is that going to add up for Sora?

0:20:21.160 --> 0:20:21.760
<v Speaker 1>Very quickly.

0:20:22.200 --> 0:20:26.280
<v Speaker 2>Would advertisers be able to make up that difference? Are

0:20:26.280 --> 0:20:28.280
<v Speaker 2>you going to need subscribers to help make up

0:20:28.320 --> 0:20:31.200
<v Speaker 2>that difference? I mean, video takes a ton of compute.

0:20:31.240 --> 0:20:34.640
<v Speaker 2>It is costing them GPU compute, it is costing them

0:20:35.119 --> 0:20:38.760
<v Speaker 2>opportunity costs. The GPUs could be used for other things, right,

0:20:39.080 --> 0:20:42.639
<v Speaker 2>So the fact that they chose a video social media

0:20:42.680 --> 0:20:45.040
<v Speaker 2>app where every time someone posts on your platform it's

0:20:45.040 --> 0:20:48.160
<v Speaker 2>costing you money is pretty confusing to me as someone

0:20:48.200 --> 0:20:52.440
<v Speaker 2>who understands that those advertiser clicks are not even close

0:20:52.480 --> 0:20:53.359
<v Speaker 2>to worth that much.

0:20:53.720 --> 0:20:59.199
<v Speaker 1>My question is, if you're Sam Altman, you oversee the

0:20:59.320 --> 0:21:04.280
<v Speaker 1>most popular AI tool on the market, why are

0:21:04.280 --> 0:21:05.600
<v Speaker 1>you going into social media?

0:21:06.160 --> 0:21:09.320
<v Speaker 2>You're asking the right question that I think even OpenAI's

0:21:09.359 --> 0:21:12.480
<v Speaker 2>own employees are asking. There has been some reporting

0:21:12.680 --> 0:21:16.280
<v Speaker 2>on even OpenAI people being confused by this. At

0:21:16.280 --> 0:21:19.560
<v Speaker 2>the end of the day, TikTok is releasing an AI generator,

0:21:19.600 --> 0:21:21.679
<v Speaker 2>I get ads for that all the time. YouTube is

0:21:21.720 --> 0:21:26.400
<v Speaker 2>putting Google Veo 3 into YouTube Shorts. Everyone's looking at

0:21:26.400 --> 0:21:29.680
<v Speaker 2>this as how do we build like the AI video feed?

0:21:29.960 --> 0:21:34.359
<v Speaker 2>And it appears to me the rationale would be to

0:21:34.600 --> 0:21:38.160
<v Speaker 2>generate some sort of advertiser revenue. I think that would

0:21:38.200 --> 0:21:41.040
<v Speaker 2>be the simple answer. But whether or not that actually

0:21:41.040 --> 0:21:43.680
<v Speaker 2>works is a huge open question.

0:21:44.200 --> 0:21:47.280
<v Speaker 1>So in the future, say, Sora, the app is running

0:21:47.320 --> 0:21:48.560
<v Speaker 1>ads between videos.

0:21:48.760 --> 0:21:50.680
<v Speaker 2>Yeah, absolutely. Interesting.

0:21:51.640 --> 0:21:54.040
<v Speaker 1>So in one of your videos, you say that AI

0:21:54.200 --> 0:21:56.720
<v Speaker 1>will end social media. What do you mean by that?

0:21:57.480 --> 0:21:59.960
<v Speaker 2>I think it has the potential to end the For

0:22:00.119 --> 0:22:03.080
<v Speaker 2>You page as we know it, unless the social media

0:22:03.119 --> 0:22:07.800
<v Speaker 2>companies figure out a way to filter AI content. Again,

0:22:08.240 --> 0:22:10.720
<v Speaker 2>we do not know how people are going to react

0:22:10.720 --> 0:22:14.719
<v Speaker 2>to this when it's deployed much wider. But it is

0:22:14.760 --> 0:22:18.840
<v Speaker 2>a rational thing to not want to only see AI

0:22:18.920 --> 0:22:21.520
<v Speaker 2>slop in your feed. And I say AI slop because

0:22:21.560 --> 0:22:24.240
<v Speaker 2>it's bad. Let's even assume that it's better. Let's assume

0:22:24.240 --> 0:22:28.560
<v Speaker 2>that AI video were indistinguishable. If that were the case,

0:22:29.000 --> 0:22:31.040
<v Speaker 2>would you actually want more of it in your feed,

0:22:31.480 --> 0:22:33.720
<v Speaker 2>or would you want to turn it off even more.

0:22:34.280 --> 0:22:36.639
<v Speaker 2>I don't think that we know the answers to these questions,

0:22:37.080 --> 0:22:41.480
<v Speaker 2>but it's very likely that if companies that are running

0:22:41.480 --> 0:22:45.280
<v Speaker 2>these platforms can't figure out a way to filter out

0:22:45.320 --> 0:22:48.600
<v Speaker 2>AI content, there's a part of the population that's going

0:22:48.640 --> 0:22:52.359
<v Speaker 2>to start tuning out. There's also advertisers that might be

0:22:52.400 --> 0:22:55.400
<v Speaker 2>scared by that, So I do think it's an existential

0:22:55.440 --> 0:22:57.800
<v Speaker 2>threat to the for you page. I think it actually

0:22:57.880 --> 0:23:01.800
<v Speaker 2>might be a boon for these subscriber or Substack-type communities,

0:23:02.040 --> 0:23:05.200
<v Speaker 2>like, I think that's interesting, when people start rushing towards people

0:23:05.200 --> 0:23:07.920
<v Speaker 2>that they trust, I think that that could be a really,

0:23:08.000 --> 0:23:11.399
<v Speaker 2>really positive thing. I'll say for me, one of the

0:23:11.440 --> 0:23:13.119
<v Speaker 2>things that I would be looking at if I were

0:23:13.119 --> 0:23:16.200
<v Speaker 2>an AI creator is the fact that, because Sora 2

0:23:16.480 --> 0:23:19.439
<v Speaker 2>is so good at making videos, it lowered the barrier

0:23:19.440 --> 0:23:22.400
<v Speaker 2>of entry so far that I don't think OpenAI

0:23:22.560 --> 0:23:25.760
<v Speaker 2>is that far from generating their own feed. You know,

0:23:25.760 --> 0:23:28.879
<v Speaker 2>if you can make an interesting video with only two sentences, well,

0:23:28.960 --> 0:23:32.840
<v Speaker 2>ChatGPT can make two sentences. They're collecting everyone's prompts,

0:23:32.880 --> 0:23:37.760
<v Speaker 2>they're seeing what gets likes and engagement on Sora. I

0:23:37.800 --> 0:23:40.119
<v Speaker 2>don't understand why they would need a human in the

0:23:40.160 --> 0:23:40.720
<v Speaker 2>loop soon.

0:23:41.200 --> 0:23:44.639
<v Speaker 1>I believe there's, actually, we were just covering a story

0:23:44.680 --> 0:23:48.080
<v Speaker 1>in the Financial Times about gen Z being less on

0:23:48.200 --> 0:23:50.840
<v Speaker 1>social media, and I think a lot of it has

0:23:50.880 --> 0:23:54.520
<v Speaker 1>to do with the sort of enshittification of the feed.

0:23:55.040 --> 0:23:57.320
<v Speaker 1>And I see a lot of people kind of resigned

0:23:57.359 --> 0:24:00.200
<v Speaker 1>to the fact that going on Instagram means scrolling through

0:24:00.359 --> 0:24:01.840
<v Speaker 1>a lot of shit, and a lot of shit that's

0:24:01.840 --> 0:24:05.320
<v Speaker 1>AI generated. It's no longer social media. It's like watching

0:24:05.359 --> 0:24:09.080
<v Speaker 1>fake video. Yeah, it's hyper-enshittification. It is the

0:24:09.200 --> 0:24:13.240
<v Speaker 1>most enshittified feed you could possibly have. And I

0:24:13.280 --> 0:24:16.960
<v Speaker 1>totally agree that there will be people who are

0:24:17.000 --> 0:24:19.560
<v Speaker 1>super down with that and who are going to enjoy it.

0:24:20.119 --> 0:24:22.520
<v Speaker 2>Again, there are people who enjoy this. I don't want

0:24:22.600 --> 0:24:25.520
<v Speaker 2>to say that they're doing the wrong thing by enjoying

0:24:25.560 --> 0:24:26.439
<v Speaker 2>AI video.

0:24:26.480 --> 0:24:28.720
<v Speaker 1>Fruit cutting another fruit, something like that.

0:24:29.040 --> 0:24:31.560
<v Speaker 2>Yeah, Like, I'm not here to judge what people are watching.

0:24:31.920 --> 0:24:36.359
<v Speaker 2>But if you play this out to its logical conclusion, here,

0:24:36.720 --> 0:24:41.720
<v Speaker 2>it looks like social media companies generating their own videos

0:24:41.800 --> 0:24:46.679
<v Speaker 2>without creators in the middle, for a hyper-enshittified feed.

0:24:47.320 --> 0:24:49.840
<v Speaker 1>So five to ten years is a huge difference. So

0:24:49.920 --> 0:24:52.600
<v Speaker 1>let's just say, five years from now, what do you

0:24:52.680 --> 0:24:55.199
<v Speaker 1>think the state of AI video looks like, and what

0:24:55.240 --> 0:24:58.080
<v Speaker 1>does it mean for the Internet, for politics, and just

0:24:58.240 --> 0:24:59.439
<v Speaker 1>us generally as a culture.

0:25:00.119 --> 0:25:05.640
<v Speaker 2>If we project the current growth out, it is indistinguishable

0:25:05.720 --> 0:25:10.200
<v Speaker 2>and everywhere. If we take a contrarian view, we can

0:25:10.280 --> 0:25:12.560
<v Speaker 2>see that people might not be into it and it

0:25:12.640 --> 0:25:15.680
<v Speaker 2>might lose a lot of money. We don't know which

0:25:15.680 --> 0:25:18.040
<v Speaker 2>direction it's going to go, and I don't claim to

0:25:18.080 --> 0:25:21.439
<v Speaker 2>be able to tell which direction we're going in. But

0:25:21.840 --> 0:25:25.000
<v Speaker 2>in that first scenario where it's indistinguishable, it'll still be

0:25:25.040 --> 0:25:30.080
<v Speaker 2>distinguishable by machine learning algorithms, it'll still be detectable by experts.

0:25:30.400 --> 0:25:34.520
<v Speaker 2>I still don't think it presents legal problems, but it

0:25:34.560 --> 0:25:39.359
<v Speaker 2>presents massive disinformation problems. I'm very scared about that. And

0:25:39.400 --> 0:25:41.920
<v Speaker 2>then there's another scenario which I think is a little

0:25:41.960 --> 0:25:44.439
<v Speaker 2>bit more optimistic, which I actually subscribe to, which is

0:25:44.440 --> 0:25:48.080
<v Speaker 2>that AI content becomes its own genre. There are companies

0:25:48.080 --> 0:25:51.360
<v Speaker 2>that figure out a way to monetize it. It stays

0:25:51.480 --> 0:25:57.720
<v Speaker 2>separate from our real feeds to whatever degree the viewer wants.

0:25:58.119 --> 0:26:00.199
<v Speaker 2>And I think that this is the optimistic vision, and

0:26:00.280 --> 0:26:02.040
<v Speaker 2>one that a lot of the tech community believes in too,

0:26:02.080 --> 0:26:04.000
<v Speaker 2>and that Sam Altman would probably say, you know, he's

0:26:04.040 --> 0:26:05.800
<v Speaker 2>been asked about this, He's been asked, how do we

0:26:05.800 --> 0:26:09.240
<v Speaker 2>tell what's real or fake? And I actually didn't hate

0:26:09.240 --> 0:26:11.600
<v Speaker 2>his answer. He said, well, just like we've always done,

0:26:11.800 --> 0:26:13.800
<v Speaker 2>we follow the people we trust, like we have human

0:26:13.880 --> 0:26:18.280
<v Speaker 2>communication networks. Now, I think that his accelerationist view is

0:26:18.400 --> 0:26:20.639
<v Speaker 2>kind of running against that a little bit, but I

0:26:20.680 --> 0:26:22.880
<v Speaker 2>do believe that at its core, that's how we're going

0:26:22.880 --> 0:26:26.320
<v Speaker 2>to figure this out, and it might push people less online.

0:26:26.440 --> 0:26:29.359
<v Speaker 2>Like I just think that there's just so many unanswered questions.

0:26:29.400 --> 0:26:32.359
<v Speaker 2>But yeah, there's a few different scenarios that right now,

0:26:32.480 --> 0:26:34.159
<v Speaker 2>I think we just have to flip a coin on

0:26:34.200 --> 0:26:35.639
<v Speaker 2>which one we believe in.

0:26:37.160 --> 0:26:40.080
<v Speaker 1>So you said the reason that you got interested in

0:26:40.280 --> 0:26:45.040
<v Speaker 1>understanding AI video was as a tool for production. When

0:26:45.040 --> 0:26:47.480
<v Speaker 1>that was the case, what were you excited about and

0:26:47.520 --> 0:26:49.520
<v Speaker 1>sort of why has that now changed for you?

0:26:50.720 --> 0:26:54.080
<v Speaker 2>I was excited about it lowering the barriers to doing

0:26:54.200 --> 0:26:57.080
<v Speaker 2>creative things. I have a green screen studio in my basement.

0:26:57.119 --> 0:26:59.199
<v Speaker 2>I was excited about it, you know, putting me in

0:26:59.200 --> 0:27:01.879
<v Speaker 2>different types of studios and different types of environments. I

0:27:01.920 --> 0:27:06.800
<v Speaker 2>was excited about it improving my graphics workflows. What started

0:27:06.840 --> 0:27:09.159
<v Speaker 2>steering me away from it was some of the

0:27:09.160 --> 0:27:11.960
<v Speaker 2>ethical concerns. I did realize that at the end of

0:27:12.000 --> 0:27:15.720
<v Speaker 2>the day, like this was mostly stolen information. It was

0:27:15.920 --> 0:27:19.320
<v Speaker 2>actually not that much more useful than the actual room

0:27:19.520 --> 0:27:21.600
<v Speaker 2>I'm in right now, like I can make a decent

0:27:21.640 --> 0:27:28.639
<v Speaker 2>studio myself. And really what made me turn was just

0:27:28.920 --> 0:27:31.680
<v Speaker 2>using the tools. I think a lot of the people

0:27:31.960 --> 0:27:36.359
<v Speaker 2>who are using them, who come from my background, realize

0:27:36.359 --> 0:27:39.320
<v Speaker 2>that they aren't very fun tools to use. It's not

0:27:39.359 --> 0:27:42.000
<v Speaker 2>a creative process for me. It's really frustrating.

0:27:42.080 --> 0:27:43.280
<v Speaker 1>Well, you just type something in.

0:27:43.400 --> 0:27:45.080
<v Speaker 2>You just type something in, and you hope it comes

0:27:45.080 --> 0:27:47.480
<v Speaker 2>back the way you want it. It's like, because

0:27:47.480 --> 0:27:49.240
<v Speaker 2>I have a history as a director, it is like

0:27:49.400 --> 0:27:52.119
<v Speaker 2>every time I needed to tell the actor exactly what

0:27:52.240 --> 0:27:55.720
<v Speaker 2>to say, exactly how to deliver it, over and over

0:27:55.840 --> 0:27:59.560
<v Speaker 2>and over. And as a creative person and as a director,

0:28:00.080 --> 0:28:02.680
<v Speaker 2>I just want to collaborate with people who bring something

0:28:02.680 --> 0:28:04.520
<v Speaker 2>to the table. I don't want to bring everything to

0:28:04.560 --> 0:28:06.479
<v Speaker 2>the table myself. I don't want to tell everyone how

0:28:06.480 --> 0:28:09.040
<v Speaker 2>to do everything, right? That's not what the process of

0:28:09.080 --> 0:28:11.919
<v Speaker 2>creating ever was. It was always about collaboration. It was

0:28:11.920 --> 0:28:14.720
<v Speaker 2>always a fun process. I find the idea of just

0:28:14.720 --> 0:28:19.119
<v Speaker 2>sitting in my basement creating AI videos with text is

0:28:19.160 --> 0:28:22.719
<v Speaker 2>just exhausting. It doesn't feel creative at all.

0:28:23.320 --> 0:28:25.960
<v Speaker 2>But I'm not saying that people should hate every AI

0:28:26.080 --> 0:28:28.680
<v Speaker 2>video they see, like some of them can be creative.

0:28:28.720 --> 0:28:32.280
<v Speaker 2>But yeah, it's just taking that opportunity to train yourself

0:28:32.320 --> 0:28:34.600
<v Speaker 2>to see what these video models look like. Because if

0:28:34.600 --> 0:28:37.439
<v Speaker 2>you're into it, that's totally fine, but then you're at

0:28:37.520 --> 0:28:40.080
<v Speaker 2>least ready for when it is used for disinformation, which

0:28:40.080 --> 0:28:41.440
<v Speaker 2>I think is inevitable at this point.

0:28:41.840 --> 0:28:45.680
<v Speaker 1>Well, thank you so much, Jeremy. I will be tuned

0:28:45.760 --> 0:28:48.680
<v Speaker 1>into your feed. You are, I don't know what I

0:28:48.680 --> 0:28:51.520
<v Speaker 1>would call you. Is it vigilante justice? I don't think so.

0:28:51.720 --> 0:28:55.520
<v Speaker 1>But you're doing some kind of public service, education.

0:28:55.640 --> 0:28:57.200
<v Speaker 3>You're an educator. Yeah, there you go.

0:28:57.320 --> 0:28:58.480
<v Speaker 1>You're an AI educator.

0:28:58.600 --> 0:29:22.040
<v Speaker 3>Yeah. For Tech Stuff,

0:29:22.240 --> 0:29:25.520
<v Speaker 1>I'm Kara Price. This episode was produced by Eliza Dennis,

0:29:25.560 --> 0:29:28.680
<v Speaker 1>Melissa Slaughter, and Tyler Hill. It was executive produced by

0:29:28.720 --> 0:29:32.720
<v Speaker 1>me, Oz Woloshyn, Julia Nutter, and Kate Osborne for Kaleidoscope

0:29:33.000 --> 0:29:36.680
<v Speaker 1>and Katrina Norvell for iHeart Podcasts. Kyle Murdoch mixed this

0:29:36.760 --> 0:29:39.680
<v Speaker 1>episode and wrote our theme song. Join us on Friday

0:29:39.720 --> 0:29:41.840
<v Speaker 1>for The Week in Tech. Oz and I will run

0:29:41.880 --> 0:29:44.640
<v Speaker 1>through the headlines you may have missed. Please rate, review,

0:29:44.680 --> 0:29:47.160
<v Speaker 1>and reach out to us at Tech Stuff Podcast at

0:29:47.160 --> 0:29:57.320
<v Speaker 1>gmail dot com.