1 00:00:07,800 --> 00:00:17,280 Speaker 1: Welcome to Tech Stuff. I'm Kara Price. Today's interview 2 00:00:17,400 --> 00:00:20,880 Speaker 1: is all about Sora, the video generation tool and invite 3 00:00:20,880 --> 00:00:23,320 Speaker 1: only social media app that OpenAI released at the 4 00:00:23,360 --> 00:00:26,960 Speaker 1: beginning of October. If you're on TikTok, Instagram, or X, 5 00:00:27,120 --> 00:00:30,080 Speaker 1: you've likely seen videos made by Sora plastered all over 6 00:00:30,120 --> 00:00:33,920 Speaker 1: your feeds. These videos ranged from the absurd, cats dancing 7 00:00:33,960 --> 00:00:37,920 Speaker 1: by a dumpster with sunglasses on, to the hyper realistic, like 8 00:00:38,040 --> 00:00:41,720 Speaker 1: Queen Elizabeth trying jerk chicken in Jamaica. When I first 9 00:00:41,760 --> 00:00:44,560 Speaker 1: saw these videos, I was entertained by the absurdist ones 10 00:00:44,600 --> 00:00:47,920 Speaker 1: and kind of floored by the realistic ones. To me, 11 00:00:48,479 --> 00:00:51,400 Speaker 1: Sora signals that we have officially entered the post bunny 12 00:00:51,440 --> 00:00:54,680 Speaker 1: trampoline internet. Yeah, I'm talking about the AI video of 13 00:00:54,720 --> 00:00:57,400 Speaker 1: the horde of bunnies jumping on a trampoline in the dark. 14 00:00:57,880 --> 00:01:00,320 Speaker 1: I was very convinced that this video was real, and 15 00:01:00,360 --> 00:01:02,960 Speaker 1: so were many people, which led to a mini panic. 16 00:01:03,800 --> 00:01:06,640 Speaker 1: Is it even possible to detect what's fake and what's 17 00:01:06,680 --> 00:01:09,720 Speaker 1: not anymore? That's where my guest today comes in. His 18 00:01:09,840 --> 00:01:12,960 Speaker 1: name is Jeremy Carrasco and he runs multiple social media 19 00:01:13,000 --> 00:01:15,679 Speaker 1: accounts under the name ShowTools AI. 20 00:01:16,240 --> 00:01:18,360 Speaker 2: The idea that we can't tell what's real or not 21 00:01:18,880 --> 00:01:22,480 Speaker 2: because of AI video is so far definitely 22 00:01:21,959 --> 00:01:24,240 Speaker 1: not the case. He has only been a full time 23 00:01:24,280 --> 00:01:27,000 Speaker 1: creator for four months, but he has become a trusted 24 00:01:27,040 --> 00:01:30,880 Speaker 1: source for dissecting viral AI videos and explaining the tells. 25 00:01:31,000 --> 00:01:35,160 Speaker 2: There is a physical truth to shooting a video with 26 00:01:35,200 --> 00:01:38,959 Speaker 2: a camera. That physical truth isn't going away, and AI 27 00:01:39,520 --> 00:01:43,600 Speaker 2: does a version that to our eyes looks like that 28 00:01:43,640 --> 00:01:48,400 Speaker 2: physical truth. But upon examination you can figure out that 29 00:01:48,440 --> 00:01:51,120 Speaker 2: these things break down. And I do think that any 30 00:01:51,240 --> 00:01:55,120 Speaker 2: normal person with decent eyesight can zoom into these AI 31 00:01:55,240 --> 00:01:56,600 Speaker 2: videos and figure that out. 32 00:01:57,200 --> 00:02:00,680 Speaker 1: So Jeremy wants his social videos to be educational. He 33 00:02:00,760 --> 00:02:03,120 Speaker 1: wants more people to get excited by what he calls 34 00:02:03,200 --> 00:02:06,760 Speaker 1: pixel peeping, and he wants to improve people's media literacy, 35 00:02:07,320 --> 00:02:10,120 Speaker 1: and hopes his accounts can help people tune their AI 36 00:02:10,240 --> 00:02:10,880 Speaker 1: vibe checker.
37 00:02:11,360 --> 00:02:13,400 Speaker 2: I'm not naive to the fact that people aren't going 38 00:02:13,440 --> 00:02:16,160 Speaker 2: to be pixel peeping on the videos that they watch, 39 00:02:16,320 --> 00:02:20,000 Speaker 2: so it's just about trying to tune people's initial impressions 40 00:02:20,040 --> 00:02:22,440 Speaker 2: so that they have something in their head that says, hey, 41 00:02:22,520 --> 00:02:25,040 Speaker 2: something might not be right here, and then they can use, 42 00:02:25,080 --> 00:02:27,880 Speaker 2: hopefully, other media skills that I teach them in order 43 00:02:27,919 --> 00:02:28,959 Speaker 2: to dive a little 44 00:02:28,760 --> 00:02:31,800 Speaker 1: bit deeper. I talk to Jeremy about so many things: 45 00:02:31,840 --> 00:02:35,840 Speaker 1: how video generation tools work, how to pick up on AI tells, 46 00:02:36,200 --> 00:02:39,200 Speaker 1: why Sora is an inflection point for the Internet, and 47 00:02:39,240 --> 00:02:41,960 Speaker 1: what this signals for the future of social media. I 48 00:02:42,000 --> 00:02:44,880 Speaker 1: started out by asking Jeremy to clarify what Sora is 49 00:02:45,280 --> 00:02:46,440 Speaker 1: and what it does. 50 00:02:47,120 --> 00:02:52,360 Speaker 2: So, Sora was originally released as OpenAI's first video model. 51 00:02:52,680 --> 00:02:56,239 Speaker 2: In October twenty twenty five, they reused the Sora name 52 00:02:56,440 --> 00:02:59,240 Speaker 2: to launch their social media app. A lot of the 53 00:02:59,280 --> 00:03:01,680 Speaker 2: hype has been around the Sora app, which is currently 54 00:03:01,720 --> 00:03:04,720 Speaker 2: invite only, and then there's the Sora 2 model that 55 00:03:04,760 --> 00:03:08,680 Speaker 2: you can already access if you have API access or 56 00:03:08,720 --> 00:03:11,760 Speaker 2: if you're a developer or even a normal person. There 57 00:03:11,800 --> 00:03:14,360 Speaker 2: are tools that let you generate a video with the 58 00:03:14,400 --> 00:03:18,440 Speaker 2: Sora 2 video model without an invite. The Sora app 59 00:03:18,480 --> 00:03:23,520 Speaker 2: experience is very unique in some ways and very familiar 60 00:03:23,520 --> 00:03:26,200 Speaker 2: in others. It does feel like a TikTok For You 61 00:03:26,320 --> 00:03:30,160 Speaker 2: page, just for AI videos. You can scroll, it has 62 00:03:30,200 --> 00:03:32,840 Speaker 2: an algorithm to suggest. But what's gotten a lot of 63 00:03:32,840 --> 00:03:36,720 Speaker 2: the attention is the ability to cameo someone. But really, 64 00:03:36,840 --> 00:03:38,880 Speaker 2: these are just deepfakes. Like, you're creating deepfakes 65 00:03:38,920 --> 00:03:41,200 Speaker 2: of your friends. You're creating deepfakes of whoever lets 66 00:03:41,240 --> 00:03:42,960 Speaker 2: you create a deepfake of them. And you have 67 00:03:43,040 --> 00:03:46,840 Speaker 2: different levels of permissions. So, for example, Jake Paul and 68 00:03:46,920 --> 00:03:51,080 Speaker 2: Sam Altman let anyone deepfake them, whereas I let 69 00:03:51,120 --> 00:03:54,120 Speaker 2: no one deepfake me because I'm not comfortable with that. 70 00:03:54,600 --> 00:03:56,680 Speaker 1: What does it look like to let someone deepfake 71 00:03:56,720 --> 00:03:57,400 Speaker 1: you on Sora? 72 00:03:57,840 --> 00:04:00,800 Speaker 2: It looks like a version of you doing whatever they 73 00:04:01,520 --> 00:04:04,240 Speaker 2: prompted you to do.
Now, there are safety features in place, 74 00:04:04,480 --> 00:04:08,120 Speaker 2: so you can't have them do anything violent, you can't 75 00:04:08,160 --> 00:04:11,560 Speaker 2: do anything sexual. But it's really up to OpenAI 76 00:04:11,640 --> 00:04:14,280 Speaker 2: to set those boundaries. And I don't think it's 77 00:04:14,320 --> 00:04:17,680 Speaker 2: completely accurate. I've made versions of myself that I think 78 00:04:17,720 --> 00:04:19,919 Speaker 2: don't look very much like me. I've made other versions 79 00:04:19,920 --> 00:04:22,400 Speaker 2: of myself that look a lot like me. That's really 80 00:04:22,520 --> 00:04:27,520 Speaker 2: up to luck, because as we'll learn, these models aren't deterministic. 81 00:04:27,600 --> 00:04:30,120 Speaker 2: There's a part of this that is random, so it's 82 00:04:30,160 --> 00:04:33,440 Speaker 2: not repeatable. So Jake Paul is a very good example. 83 00:04:33,480 --> 00:04:35,880 Speaker 2: There are a ton of AI videos of Jake Paul 84 00:04:35,960 --> 00:04:38,080 Speaker 2: right now. All of them look a little bit different, 85 00:04:38,120 --> 00:04:41,960 Speaker 2: but have his likeness. So you have to give permission 86 00:04:42,000 --> 00:04:43,680 Speaker 2: for someone to make a video of you through the 87 00:04:43,680 --> 00:04:44,440 Speaker 2: cameo feature. 88 00:04:44,760 --> 00:04:48,960 Speaker 1: So would you say that AI video generation scares you? Like, 89 00:04:49,360 --> 00:04:51,120 Speaker 1: is it something that keeps you up at night? 90 00:04:51,800 --> 00:04:54,159 Speaker 2: It's not, because I'm doing something about it now, but 91 00:04:54,200 --> 00:04:56,280 Speaker 2: it really was, and I think it is keeping people 92 00:04:56,320 --> 00:04:58,000 Speaker 2: up at night, because so much of our time is 93 00:04:58,040 --> 00:05:01,120 Speaker 2: spent on these short-form video platforms, like, for better 94 00:05:01,240 --> 00:05:04,800 Speaker 2: or worse. I do think that it is the primary 95 00:05:04,839 --> 00:05:08,039 Speaker 2: way that people get information now. It was probably never 96 00:05:08,120 --> 00:05:10,520 Speaker 2: the best format for that information in the first place, 97 00:05:10,640 --> 00:05:13,080 Speaker 2: but here we are. So I think what keeps me 98 00:05:13,200 --> 00:05:16,800 Speaker 2: up is really general media literacy skills, and I think 99 00:05:16,800 --> 00:05:19,640 Speaker 2: of AI video as an extension of that. A lot 100 00:05:19,680 --> 00:05:21,560 Speaker 2: of people are kept up by what I think are 101 00:05:21,720 --> 00:05:25,480 Speaker 2: irrational fears about AI video. Like, in my opinion, it's 102 00:05:25,520 --> 00:05:28,039 Speaker 2: probably not going to be framing you for a crime 103 00:05:28,080 --> 00:05:31,400 Speaker 2: anytime soon, but it might turn the court of public 104 00:05:31,440 --> 00:05:34,320 Speaker 2: opinion against you. It might be spreading disinformation. 105 00:05:34,600 --> 00:05:34,680 Speaker 1: Like. 106 00:05:34,760 --> 00:05:38,320 Speaker 2: It's an extension of other media literacy problems, and it's 107 00:05:38,360 --> 00:05:41,440 Speaker 2: a very believable one, because people, when they are scrolling, 108 00:05:41,560 --> 00:05:43,400 Speaker 2: they're just there to tune out and scroll. They're not 109 00:05:43,440 --> 00:05:46,640 Speaker 2: there to pixel peep and really pay attention, right?
110 00:05:46,600 --> 00:05:48,960 Speaker 1: I mean, you don't think that we are living in a 111 00:05:49,000 --> 00:05:51,560 Speaker 1: world where soon people could be framed for something they 112 00:05:51,560 --> 00:05:53,599 Speaker 1: didn't do using manipulated video? 113 00:05:54,000 --> 00:05:56,960 Speaker 2: Well, I think that... I'm not a lawyer, but I've 114 00:05:57,000 --> 00:05:59,920 Speaker 2: done some looking into this, and the reality is that 115 00:06:00,279 --> 00:06:03,000 Speaker 2: in order for something to be admitted into evidence, at 116 00:06:03,080 --> 00:06:04,800 Speaker 2: least in the United States, it has to have an 117 00:06:04,839 --> 00:06:09,039 Speaker 2: extensive metadata trail. It has to be authenticated. You have 118 00:06:09,080 --> 00:06:11,039 Speaker 2: to get the person who filmed the video into the 119 00:06:11,040 --> 00:06:13,839 Speaker 2: courtroom to say that they filmed it. And we have 120 00:06:13,920 --> 00:06:17,840 Speaker 2: to understand that while our perception might be getting tricked, 121 00:06:18,040 --> 00:06:21,919 Speaker 2: there are procedural and mathematical ways that these can be detected. 122 00:06:22,440 --> 00:06:25,440 Speaker 2: So it is not undetectable yet. And anyone who says 123 00:06:25,440 --> 00:06:28,640 Speaker 2: it's undetectable is probably either selling you something or doesn't 124 00:06:28,640 --> 00:06:30,760 Speaker 2: have a good eye. And anyone who says it will 125 00:06:30,800 --> 00:06:34,680 Speaker 2: be undetectable does not know that, and frankly doesn't understand 126 00:06:34,760 --> 00:06:36,840 Speaker 2: the technology that's making these AI videos very well, in my opinion. 127 00:06:36,880 --> 00:06:40,160 Speaker 1: And right now, your likeness is not shared? 128 00:06:41,160 --> 00:06:46,640 Speaker 2: No. I have a strong, strong bias against this, because 129 00:06:46,720 --> 00:06:49,680 Speaker 2: I believe that once your likeness gets out there and 130 00:06:49,839 --> 00:06:53,360 Speaker 2: is deepfakable, so to speak, it's really hard to pull 131 00:06:53,360 --> 00:06:56,719 Speaker 2: that back. Not because you can't, like, you can tell 132 00:06:56,760 --> 00:06:59,800 Speaker 2: people to stop, but once it's out there, I think 133 00:06:59,800 --> 00:07:01,880 Speaker 2: you lose a sense of trust. It's a line that 134 00:07:01,960 --> 00:07:04,440 Speaker 2: I just don't want to cross, that I'm not comfortable crossing, 135 00:07:04,440 --> 00:07:06,719 Speaker 2: and I've actually told my followers I will never cross 136 00:07:06,760 --> 00:07:09,760 Speaker 2: that line, because it's just not what I'm interested in. 137 00:07:10,440 --> 00:07:13,640 Speaker 1: So I was hoping that you could show me how 138 00:07:13,640 --> 00:07:15,559 Speaker 1: to make a video using the Sora app. 139 00:07:15,760 --> 00:07:19,520 Speaker 2: Sure, so this is the Sora desktop app. It is 140 00:07:19,640 --> 00:07:22,400 Speaker 2: not the vertical experience that you have, you know, on 141 00:07:22,400 --> 00:07:25,360 Speaker 2: the phone. It is, however, showing a lot of the 142 00:07:25,400 --> 00:07:28,280 Speaker 2: same content. So this is essentially the For You page 143 00:07:28,280 --> 00:07:30,400 Speaker 2: of Sora. And the thing to note here is that 144 00:07:30,440 --> 00:07:33,280 Speaker 2: there are Sora watermarks over each one of these videos.
145 00:07:33,560 --> 00:07:36,040 Speaker 2: In the mobile experience, those watermarks go away, but 146 00:07:36,160 --> 00:07:39,200 Speaker 2: they don't let you screen record in the mobile version, 147 00:07:39,320 --> 00:07:42,360 Speaker 2: whereas theoretically anyone could do what I'm doing right now, 148 00:07:42,400 --> 00:07:44,720 Speaker 2: like I can share my screen here, I could record 149 00:07:44,760 --> 00:07:47,560 Speaker 2: my screen. When you see Sora videos on social media, 150 00:07:48,080 --> 00:07:49,360 Speaker 2: this is how they're being made. 151 00:07:49,440 --> 00:07:52,040 Speaker 1: So let's try to make a Sora video. Let's do 152 00:07:53,000 --> 00:07:55,680 Speaker 1: skiing with candy. 153 00:07:56,520 --> 00:07:59,000 Speaker 2: Skiing with candy. You want me to just say that 154 00:07:59,080 --> 00:08:02,520 Speaker 2: and see what it comes up with? Yes, let's do it. 155 00:08:02,600 --> 00:08:03,720 Speaker 2: I think that's a great idea. 156 00:08:03,760 --> 00:08:04,840 Speaker 1: Why do you think it's a good idea? 157 00:08:04,960 --> 00:08:10,120 Speaker 2: Because... something that people aren't talking about enough with 158 00:08:10,240 --> 00:08:12,880 Speaker 2: Sora is that you can have a very simple prompt 159 00:08:12,960 --> 00:08:16,080 Speaker 2: and it can come up with something really creative. That's 160 00:08:16,200 --> 00:08:19,960 Speaker 2: really what, in my opinion, distinguishes it from other video models. 161 00:08:20,240 --> 00:08:22,880 Speaker 2: Google Veo 3 was how a lot of AI content 162 00:08:22,960 --> 00:08:25,320 Speaker 2: was made a few weeks ago. If you don't give 163 00:08:25,320 --> 00:08:28,480 Speaker 2: Google Veo 3 a good prompt, it's just boring, whereas 164 00:08:28,480 --> 00:08:32,040 Speaker 2: Sora will go through some attempts to at least make 165 00:08:32,080 --> 00:08:33,120 Speaker 2: it entertaining anyway. 166 00:08:33,320 --> 00:08:36,360 Speaker 1: It's just incredible to me that in a given three 167 00:08:36,400 --> 00:08:38,160 Speaker 1: weeks the world sort of changes. 168 00:08:38,480 --> 00:08:42,080 Speaker 2: I think that there is a misconception that the world 169 00:08:42,200 --> 00:08:48,440 Speaker 2: just changed because video AI made a huge, undetectable leap. 170 00:08:48,960 --> 00:08:52,160 Speaker 2: It did make a step towards more realism. What Sora 2 171 00:08:52,280 --> 00:08:57,000 Speaker 2: really improved were a lot of the human parts 172 00:08:57,120 --> 00:09:00,840 Speaker 2: of video AI, such as hand movement, or if they 173 00:09:01,160 --> 00:09:04,840 Speaker 2: have a missing limb, or if their teeth look weird, 174 00:09:05,080 --> 00:09:08,200 Speaker 2: or if their eyes look uncanny, or their hair. Like, there were 175 00:09:08,240 --> 00:09:10,880 Speaker 2: all these little things that people would pick up on, again, 176 00:09:11,000 --> 00:09:14,360 Speaker 2: a lot of them subconscious. Sora made a step towards 177 00:09:14,360 --> 00:09:18,199 Speaker 2: improving those things. It still has a lot of background issues. 178 00:09:18,720 --> 00:09:22,720 Speaker 2: It is actually a noisier or muddier looking model, in 179 00:09:22,720 --> 00:09:25,520 Speaker 2: my opinion, than Veo, but a lot of people aren't 180 00:09:25,559 --> 00:09:27,480 Speaker 2: looking for that.
A lot of the videos that go 181 00:09:27,640 --> 00:09:32,720 Speaker 2: viral that are AI generated are security cams, or body 182 00:09:32,800 --> 00:09:36,960 Speaker 2: cams, or GoPro-looking cameras, things that people aren't 183 00:09:37,000 --> 00:09:40,160 Speaker 2: looking at every day. But it really made improvements in 184 00:09:41,080 --> 00:09:45,560 Speaker 2: how good the outputs are to watch, like, story-wise. 185 00:09:46,160 --> 00:09:49,080 Speaker 2: If you were to release Google Veo 3 as a 186 00:09:49,120 --> 00:09:53,440 Speaker 2: social media app, it would just fail entirely, because people 187 00:09:53,440 --> 00:09:57,600 Speaker 2: would get on there and, unless you're a good prompter, like, 188 00:09:57,640 --> 00:10:00,320 Speaker 2: you're not going to come up with anything interesting. Whereas 189 00:10:00,320 --> 00:10:03,520 Speaker 2: with Sora, anyone getting into AI, it's possible for you 190 00:10:03,559 --> 00:10:06,280 Speaker 2: to come up with something interesting with a very basic prompt. 191 00:10:06,320 --> 00:10:09,760 Speaker 2: That's a really, really big innovation that they didn't talk about. 192 00:10:09,880 --> 00:10:12,160 Speaker 2: But I think that's why it's had such an impact 193 00:10:12,280 --> 00:10:16,920 Speaker 2: is because there's a huge volume of somewhat meaningful Sora 194 00:10:17,040 --> 00:10:19,880 Speaker 2: videos out there, whereas there really wasn't with Veo when 195 00:10:19,880 --> 00:10:22,360 Speaker 2: that came out, right? So, all right, so it came 196 00:10:22,440 --> 00:10:24,120 Speaker 2: up with skiing with candy. Let's see... let's see 197 00:10:24,120 --> 00:10:24,640 Speaker 2: what it did here. 198 00:10:25,240 --> 00:10:25,520 Speaker 1: Look what. 199 00:10:27,160 --> 00:10:29,959 Speaker 2: Go, mid-slope snack, classy, and 200 00:10:29,920 --> 00:10:31,000 Speaker 3: a peppermint for the wind. 201 00:10:31,200 --> 00:10:34,000 Speaker 1: Nothing like sweet fuel to keep the turns smooth. Catch 202 00:10:34,000 --> 00:10:35,079 Speaker 1: you at the bottom. 203 00:10:35,480 --> 00:10:37,599 Speaker 2: All right, what are your impressions? 204 00:10:37,840 --> 00:10:40,760 Speaker 1: I just don't... I'm sorry, this is... Is it okay 205 00:10:40,760 --> 00:10:42,000 Speaker 1: that this is blowing my mind? 206 00:10:42,400 --> 00:10:43,400 Speaker 2: It should, okay? 207 00:10:43,440 --> 00:10:45,360 Speaker 1: Good. It should blow your mind, because I feel daft, 208 00:10:45,520 --> 00:10:49,640 Speaker 1: like I feel like I can't wrap my head around this, 209 00:10:49,960 --> 00:10:52,160 Speaker 1: like I'm assuming this woman in the video with her 210 00:10:52,200 --> 00:10:53,960 Speaker 1: ski mask on is not a real person. 211 00:10:54,120 --> 00:10:56,120 Speaker 2: No, she's not a real person. And we don't know 212 00:10:56,480 --> 00:10:58,600 Speaker 2: how they invented her. They just came up with that. 213 00:10:58,679 --> 00:10:59,880 Speaker 3: What? 214 00:11:00,000 --> 00:11:03,000 Speaker 2: So there are things about this that stick out to 215 00:11:03,040 --> 00:11:05,720 Speaker 2: me as obvious AI video. And then there are things 216 00:11:05,720 --> 00:11:09,040 Speaker 2: about this that I just have to say, wow, that 217 00:11:09,160 --> 00:11:12,280 Speaker 2: is incredible. So if I can just explain what I 218 00:11:12,360 --> 00:11:15,360 Speaker 2: see here as someone who watches these, so please.
She starts 219 00:11:15,360 --> 00:11:18,120 Speaker 2: out by skiing down the hill, but she's kind of 220 00:11:18,160 --> 00:11:21,640 Speaker 2: skiing like it's snowboarding. Then she stops. She has some 221 00:11:21,679 --> 00:11:24,240 Speaker 2: peppermints in her hand, she has some bags of candy 222 00:11:24,400 --> 00:11:27,920 Speaker 2: in her hand, and there are some weird things going 223 00:11:27,960 --> 00:11:30,280 Speaker 2: on here. But what it did with it is, without 224 00:11:30,320 --> 00:11:34,000 Speaker 2: any input, it basically made a social media video with it. 225 00:11:34,000 --> 00:11:38,200 Speaker 2: It's like she's promoting this candy. There's someone responding to 226 00:11:38,240 --> 00:11:42,319 Speaker 2: her in the background. It invented a straw for her. Exactly. 227 00:11:42,400 --> 00:11:45,120 Speaker 2: She talks like an influencer. 228 00:11:45,520 --> 00:11:47,440 Speaker 1: I just... it really trips me up that she's not 229 00:11:47,480 --> 00:11:49,319 Speaker 1: a real person, that this person does not exist in 230 00:11:49,360 --> 00:11:50,800 Speaker 1: the world. It's really weird. 231 00:11:51,040 --> 00:11:53,360 Speaker 2: Same. I mean, I have to tell myself it's not 232 00:11:53,400 --> 00:11:53,920 Speaker 2: a real person. 233 00:11:54,000 --> 00:11:55,679 Speaker 1: I mean, it would be like if you didn't exist. 234 00:11:56,000 --> 00:12:00,680 Speaker 2: Yeah, that's the thing, it visually feels the 235 00:12:00,679 --> 00:12:03,640 Speaker 2: same as talking to another person online. Of course, there 236 00:12:03,640 --> 00:12:05,640 Speaker 2: are tells, so I'll get into those. So 237 00:12:06,280 --> 00:12:08,880 Speaker 2: first of all, you have just the context. Why is 238 00:12:08,960 --> 00:12:11,800 Speaker 2: she skiing down the hill with a bag of candy, 239 00:12:11,960 --> 00:12:13,880 Speaker 2: and why is she just putting it in her mouth 240 00:12:13,960 --> 00:12:17,800 Speaker 2: with the wrappers? Then there are some artifacts that I 241 00:12:17,840 --> 00:12:20,760 Speaker 2: can see, especially at the beginning of the generation. Her 242 00:12:20,880 --> 00:12:24,400 Speaker 2: jacket and her pants are incredibly pixelated when it starts. 243 00:12:24,920 --> 00:12:28,160 Speaker 2: But the other thing here is that it's very noisy. 244 00:12:28,200 --> 00:12:32,840 Speaker 2: If we actually zoom in, there's a lot of artifacts 245 00:12:33,040 --> 00:12:34,640 Speaker 2: in the mountains back there. 246 00:12:34,880 --> 00:12:36,679 Speaker 1: It is weird how she's eating the candy. That's a 247 00:12:36,679 --> 00:12:37,480 Speaker 1: little uncanny. 248 00:12:37,520 --> 00:12:41,400 Speaker 2: It's weird. Yeah, she's eating wrapped candy, and the bag 249 00:12:41,440 --> 00:12:43,960 Speaker 2: there just stuck to her knee. Yeah, you know, so 250 00:12:44,040 --> 00:12:46,480 Speaker 2: at first it's a ziplock bag, then it's not a 251 00:12:46,559 --> 00:12:50,720 Speaker 2: ziplock bag, then it sticks to her knee. Her feet 252 00:12:50,720 --> 00:12:54,199 Speaker 2: are backwards, like her foot there is literally backwards in 253 00:12:54,280 --> 00:12:56,840 Speaker 2: this version. She doesn't have a foot. Like, you know, 254 00:12:56,880 --> 00:12:58,160 Speaker 2: you get into it. It's kind of funny.
255 00:12:58,200 --> 00:12:59,960 Speaker 1: But this is why you have such a large platform, 256 00:13:00,320 --> 00:13:02,320 Speaker 1: because, like, I look at this at first and I'm like, oh, 257 00:13:02,360 --> 00:13:05,840 Speaker 1: it's perfect. Like, in a way, if I see the 258 00:13:05,880 --> 00:13:09,199 Speaker 1: trappings of what I think I'm seeing, I don't really 259 00:13:09,240 --> 00:13:10,760 Speaker 1: look for the detail that's wrong. 260 00:13:11,120 --> 00:13:14,439 Speaker 2: Especially when you're just scrolling on TikTok or Instagram. You're 261 00:13:14,440 --> 00:13:15,079 Speaker 2: not looking for 262 00:13:15,000 --> 00:13:16,719 Speaker 1: anything wrong. Right, which is how they want you to 263 00:13:16,760 --> 00:13:18,559 Speaker 1: look at it, or scrolling on Sora. 264 00:13:18,480 --> 00:13:20,679 Speaker 2: Or scrolling on Sora. A lot of them are leaving 265 00:13:20,679 --> 00:13:23,560 Speaker 2: Sora and making it out to all these platforms. Yeah, 266 00:13:24,240 --> 00:13:26,480 Speaker 2: you're not going to be looking for these things. I'm 267 00:13:26,520 --> 00:13:29,640 Speaker 2: totally aware of that. I mean, on first watch, are 268 00:13:29,679 --> 00:13:31,480 Speaker 2: you gonna pick out everything that's wrong with this? No. 269 00:13:31,600 --> 00:13:33,800 Speaker 2: But if you watch it five times and start zooming in, 270 00:13:34,240 --> 00:13:37,199 Speaker 2: you're gonna start noticing that her feet are literally backwards. 271 00:13:37,480 --> 00:13:41,000 Speaker 2: So yeah, when it comes down to it, I think 272 00:13:41,040 --> 00:13:44,800 Speaker 2: what's really very important about Sora is that it did 273 00:13:44,840 --> 00:13:47,400 Speaker 2: all that work for you. You didn't need to know 274 00:13:47,480 --> 00:13:50,160 Speaker 2: how to prompt the video AI. If you were to 275 00:13:50,240 --> 00:13:54,320 Speaker 2: put skiing with candy into Google Veo, it's just going 276 00:13:54,400 --> 00:13:56,320 Speaker 2: to be boring. I'll just tell you that right now. 277 00:13:56,679 --> 00:14:00,720 Speaker 1: So if I wanted this video, this exact video, from Veo 3, 278 00:14:01,720 --> 00:14:03,560 Speaker 1: what would I have to prompt it to do? 279 00:14:04,040 --> 00:14:06,320 Speaker 2: You'd have to act like a camera director. You'd have 280 00:14:06,400 --> 00:14:11,040 Speaker 2: to say, video starting with a woman skiing down the slope. 281 00:14:11,120 --> 00:14:15,880 Speaker 2: She is wearing a pink and yellow top, a turquoise bottom. 282 00:14:16,320 --> 00:14:18,320 Speaker 2: She's holding a bag of candy in her right hand, 283 00:14:18,360 --> 00:14:20,600 Speaker 2: peppermints in her left hand, and you'd have to go shot 284 00:14:20,640 --> 00:14:23,880 Speaker 2: by shot to give it that. I can actually show you 285 00:14:24,560 --> 00:14:27,320 Speaker 2: something that I came up with that more clearly demonstrates 286 00:14:27,400 --> 00:14:29,960 Speaker 2: this point. So this is a video I made yesterday 287 00:14:30,080 --> 00:14:34,600 Speaker 2: with the prompt epic anime of Diego Maradona scoring a 288 00:14:34,680 --> 00:14:37,000 Speaker 2: goal in the World Cup. Weaving 289 00:14:36,680 --> 00:14:41,600 Speaker 1: past one, still going, two defenders beaten, he won't... Announcers. 290 00:14:42,280 --> 00:14:45,680 Speaker 2: This is him dribbling through an entire defense. It is 291 00:14:45,720 --> 00:14:49,120 Speaker 2: an epic-looking anime. Anime
people would say it doesn't 292 00:14:49,160 --> 00:14:53,040 Speaker 2: look great, but normal people probably wouldn't notice it. And 293 00:14:53,320 --> 00:14:57,360 Speaker 2: what blew me away about this is that it created 294 00:14:57,720 --> 00:15:02,280 Speaker 2: Diego Maradona's most famous goal and it added the announcers. 295 00:15:02,560 --> 00:15:04,680 Speaker 2: I didn't tell it to do any of that. Now, 296 00:15:04,720 --> 00:15:07,720 Speaker 2: if I compare that to what Google Veo did with 297 00:15:07,840 --> 00:15:15,080 Speaker 2: the exact same prompt, it did this. This 298 00:15:15,000 --> 00:15:16,120 Speaker 1: one is B team. 299 00:15:16,160 --> 00:15:20,200 Speaker 2: It is. The quality of the video is actually better, 300 00:15:20,440 --> 00:15:23,840 Speaker 2: but it didn't make it interesting. So again, that's why 301 00:15:23,880 --> 00:15:25,840 Speaker 2: you're seeing so much Sora, as you don't need to 302 00:15:25,840 --> 00:15:26,600 Speaker 2: be very creative. 303 00:15:27,080 --> 00:15:30,120 Speaker 1: What are the implications of a social media app being 304 00:15:30,200 --> 00:15:34,280 Speaker 1: designed and housing videos full of fake people? Like, it's 305 00:15:34,320 --> 00:15:35,760 Speaker 1: just crazy to me that I can watch a video 306 00:15:35,840 --> 00:15:37,440 Speaker 1: of someone who doesn't exist. 307 00:15:37,960 --> 00:15:42,040 Speaker 2: I think that we don't know the implications, and I 308 00:15:42,080 --> 00:15:45,920 Speaker 2: would push back on it being, like, our inevitable future 309 00:15:46,360 --> 00:15:49,760 Speaker 2: a bit, but I would say that it is normalizing 310 00:15:50,000 --> 00:15:53,960 Speaker 2: deepfaking, and I don't think we know what that 311 00:15:54,040 --> 00:15:56,720 Speaker 2: will mean for us. But I don't think it'll be good. 312 00:15:57,200 --> 00:15:59,960 Speaker 2: I think it might be entertaining, I think it might 313 00:15:59,960 --> 00:16:04,320 Speaker 2: be interesting. It is certainly a technical achievement, but I 314 00:16:04,320 --> 00:16:07,800 Speaker 2: don't consider it to be a technological advancement. I'm not 315 00:16:07,920 --> 00:16:11,080 Speaker 2: so sure it is progress. But it is a pretty 316 00:16:11,080 --> 00:16:14,320 Speaker 2: incredible thing that they've been able to pull off, and 317 00:16:14,840 --> 00:16:17,840 Speaker 2: I think that it is rational for people to look 318 00:16:17,880 --> 00:16:21,080 Speaker 2: at these videos and be pretty freaked out. And that's 319 00:16:21,120 --> 00:16:25,080 Speaker 2: what a lot of my comments are, because what isn't 320 00:16:25,120 --> 00:16:29,320 Speaker 2: clear is how this is going to improve social media 321 00:16:29,320 --> 00:16:32,320 Speaker 2: in any way, or improve our media literacy skills in any way. 322 00:16:32,640 --> 00:16:37,760 Speaker 2: There are definitely tech advancements here that can contribute to advancements 323 00:16:37,800 --> 00:16:42,680 Speaker 2: towards artificial general intelligence, like there are technical reasons that 324 00:16:42,720 --> 00:16:46,000 Speaker 2: this could be helpful in the future. But the step 325 00:16:46,040 --> 00:16:48,960 Speaker 2: that OpenAI took to release this in a social 326 00:16:49,040 --> 00:16:53,760 Speaker 2: media app was a huge jump, in my opinion, in 327 00:16:53,800 --> 00:16:56,920 Speaker 2: the wrong direction. But the technology is here to stay 328 00:16:56,960 --> 00:16:57,320 Speaker 2: for sure.
329 00:17:03,920 --> 00:17:08,000 Speaker 1: After the break, will we become desensitized to deepfakes? 330 00:17:08,520 --> 00:17:12,880 Speaker 3: Stay with us. 331 00:17:27,800 --> 00:17:30,120 Speaker 1: One thing that I can't really get over about Sora 2 332 00:17:30,240 --> 00:17:34,440 Speaker 1: is that Sam Altman is letting anybody use his likeness. 333 00:17:34,560 --> 00:17:37,840 Speaker 1: He opened his likeness to any Sora user, so I 334 00:17:37,840 --> 00:17:42,000 Speaker 1: could say Sam Altman building a snowman, for example. Why 335 00:17:42,040 --> 00:17:45,320 Speaker 1: do this, like, as the head of the company? 336 00:17:45,359 --> 00:17:48,480 Speaker 2: I can only guess. I think that it is generally 337 00:17:49,080 --> 00:17:54,480 Speaker 2: just an attempt at normalizing deepfaking people, and I think 338 00:17:54,520 --> 00:17:57,240 Speaker 2: people should be really scared of crossing that line. I 339 00:17:57,240 --> 00:17:59,560 Speaker 2: think it's a serious thing to do, and I think 340 00:17:59,640 --> 00:18:03,960 Speaker 2: OpenAI pushing everyone in that direction before anyone was even 341 00:18:04,000 --> 00:18:08,320 Speaker 2: asking for it is really frightening. You could create 342 00:18:08,359 --> 00:18:11,560 Speaker 2: deepfakes of people before, the technology to do it existed, but 343 00:18:11,760 --> 00:18:14,720 Speaker 2: there was a lot of friction and social pressure not 344 00:18:14,880 --> 00:18:18,280 Speaker 2: to do it. That friction was helpful in keeping our 345 00:18:18,359 --> 00:18:22,320 Speaker 2: information economy healthy. Even with safety features on the Sora 346 00:18:22,359 --> 00:18:25,639 Speaker 2: app, like letting you set permissions, people are 347 00:18:25,640 --> 00:18:28,880 Speaker 2: gonna mess that up. People won't know that they can 348 00:18:28,880 --> 00:18:31,479 Speaker 2: be deepfaked, and of course that's their responsibility to know. 349 00:18:31,680 --> 00:18:34,120 Speaker 2: But you've just opened up an entire can of worms. 350 00:18:34,160 --> 00:18:37,200 Speaker 2: There are other issues here, like currently you can't delete 351 00:18:37,240 --> 00:18:40,639 Speaker 2: your Sora account without deleting your entire ChatGPT account. 352 00:18:40,960 --> 00:18:41,200 Speaker 3: Wow. 353 00:18:41,359 --> 00:18:43,879 Speaker 2: And again, like, you can't pull this back. Like, in 354 00:18:44,040 --> 00:18:46,280 Speaker 2: theory you could stop people. But if you are a 355 00:18:46,280 --> 00:18:48,520 Speaker 2: public figure and you open up this can of worms, 356 00:18:48,880 --> 00:18:54,040 Speaker 2: it could really backfire. So it's Sora accelerating this 357 00:18:54,080 --> 00:18:57,399 Speaker 2: deepfake idea into a space that just hasn't been that 358 00:18:57,440 --> 00:18:59,760 Speaker 2: fully explored yet. And I don't think I'd want to 359 00:18:59,760 --> 00:19:02,920 Speaker 2: be an early adopter of this, because there's a lot of negative, 360 00:19:03,040 --> 00:19:05,520 Speaker 2: like, downside risk that I just don't think we've figured 361 00:19:05,600 --> 00:19:06,000 Speaker 2: out yet. 362 00:19:06,520 --> 00:19:09,040 Speaker 1: So you have a video where you talk about how 363 00:19:09,080 --> 00:19:12,600 Speaker 1: Sora is actually costing OpenAI about one dollar per post. 364 00:19:12,720 --> 00:19:15,760 Speaker 1: Can you explain that calculation and what it means for 365 00:19:15,840 --> 00:19:17,160 Speaker 1: Sora long term?
366 00:19:17,240 --> 00:19:19,520 Speaker 2: This was an educated guess that ended up being right. 367 00:19:19,840 --> 00:19:23,280 Speaker 2: Every video you create is basically on OpenAI's dime. So, 368 00:19:23,520 --> 00:19:26,840 Speaker 2: for example, two weeks ago, if I, as a creator, 369 00:19:26,960 --> 00:19:30,320 Speaker 2: wanted to post an AI video to TikTok or Instagram, 370 00:19:30,680 --> 00:19:33,040 Speaker 2: I would have to pay a subscription to make that 371 00:19:33,119 --> 00:19:37,400 Speaker 2: video and download it, or pay per post. So there 372 00:19:37,520 --> 00:19:42,640 Speaker 2: are commodity prices for these video models. For Google Veo 373 00:19:42,920 --> 00:19:46,000 Speaker 2: it's a dollar fifty to three dollars. Sora is currently 374 00:19:46,080 --> 00:19:51,040 Speaker 2: around a dollar. But the Sora application is free, and 375 00:19:51,160 --> 00:19:53,760 Speaker 2: anytime you create an AI video on it, it is 376 00:19:53,880 --> 00:19:57,600 Speaker 2: free to you. So, as always, I would ask the question, 377 00:19:57,760 --> 00:19:59,959 Speaker 2: if it is free, are you the product? And in 378 00:20:00,080 --> 00:20:02,960 Speaker 2: this case they are taking your data, they're taking your 379 00:20:02,960 --> 00:20:06,119 Speaker 2: face scans, they're taking your prompts. Right, so there's that 380 00:20:06,200 --> 00:20:08,040 Speaker 2: question of why are they doing this? Of course they're 381 00:20:08,040 --> 00:20:10,879 Speaker 2: also doing it to get users. But imagine you were 382 00:20:10,920 --> 00:20:15,119 Speaker 2: TikTok or Instagram and every single time someone posted a 383 00:20:15,200 --> 00:20:18,200 Speaker 2: video on your site you needed to pay a dollar. 384 00:20:18,600 --> 00:20:20,840 Speaker 2: How quickly is that going to add up for Sora? 385 00:20:21,160 --> 00:20:21,760 Speaker 1: Very quickly. 386 00:20:22,200 --> 00:20:26,280 Speaker 2: Would advertisers be able to make up that difference? Are 387 00:20:26,280 --> 00:20:28,280 Speaker 2: you going to need subscribers to help make up 388 00:20:28,320 --> 00:20:31,200 Speaker 2: that difference? I mean, video takes a ton of compute. 389 00:20:31,240 --> 00:20:34,640 Speaker 2: It is costing them GPU compute, it is costing them 390 00:20:35,119 --> 00:20:38,760 Speaker 2: opportunity cost. The GPUs could be used for other things, right? 391 00:20:39,080 --> 00:20:42,639 Speaker 2: So the fact that they chose a video social media 392 00:20:42,680 --> 00:20:45,040 Speaker 2: app, where every time someone posts on your platform it's 393 00:20:45,040 --> 00:20:48,160 Speaker 2: costing you money, is pretty confusing to me as someone 394 00:20:48,200 --> 00:20:52,440 Speaker 2: who understands that those advertiser clicks are not even close 395 00:20:52,480 --> 00:20:53,359 Speaker 2: to worth that much. 396 00:20:53,720 --> 00:20:59,199 Speaker 1: My question is, if you're Sam Altman, you oversee the 397 00:20:59,320 --> 00:21:04,280 Speaker 1: most popular AI tool on the market. Why are 398 00:21:04,280 --> 00:21:05,600 Speaker 1: you going into social media? 399 00:21:06,160 --> 00:21:09,320 Speaker 2: You're asking the right question that I think even OpenAI's 400 00:21:09,359 --> 00:21:12,480 Speaker 2: own employees are asking. There has been some reporting 401 00:21:12,680 --> 00:21:16,280 Speaker 2: on even OpenAI people being confused by this.
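To put rough numbers on the per-post cost math Jeremy walks through above, here is a minimal back-of-the-envelope sketch in Python. The per-video generation prices (about a dollar for Sora 2, a dollar fifty to three dollars for Veo) are the figures quoted in the conversation; the daily post volumes are hypothetical placeholders chosen only to show how quickly the spend scales, not numbers from the episode.

```python
# Back-of-the-envelope sketch of what "every post costs a dollar" adds up to.
# Per-video prices are the rough figures quoted above; post volumes are
# hypothetical placeholders, not numbers from the episode.

COST_PER_VIDEO_USD = {
    "Sora 2": 1.00,        # ~ $1 per generation, per the conversation
    "Veo 3 (low)": 1.50,   # low end of the quoted Veo range
    "Veo 3 (high)": 3.00,  # high end of the quoted Veo range
}

HYPOTHETICAL_POSTS_PER_DAY = [10_000, 100_000, 1_000_000]

for model, cost in COST_PER_VIDEO_USD.items():
    for posts in HYPOTHETICAL_POSTS_PER_DAY:
        daily = posts * cost
        print(f"{model}: {posts:>9,} posts/day -> "
              f"${daily:>12,.0f}/day, ${daily * 365:>15,.0f}/year")
```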
At 402 00:21:16,280 --> 00:21:19,560 Speaker 2: the end of the day, TikTok is releasing an AI generator, 403 00:21:19,600 --> 00:21:21,679 Speaker 2: I get ads for that all the time. YouTube is 404 00:21:21,720 --> 00:21:26,400 Speaker 2: putting Google Veo 3 into YouTube Shorts. Everyone's looking at 405 00:21:26,400 --> 00:21:29,680 Speaker 2: this as, how do we build, like, the AI video feed? 406 00:21:29,960 --> 00:21:34,359 Speaker 2: And it appears to me the rationale would be to 407 00:21:34,600 --> 00:21:38,160 Speaker 2: generate some sort of advertiser revenue. I think that would 408 00:21:38,200 --> 00:21:41,040 Speaker 2: be the simple answer. But whether or not that actually 409 00:21:41,040 --> 00:21:43,680 Speaker 2: works is a huge open question. 410 00:21:44,200 --> 00:21:47,280 Speaker 1: So in the future, say, Sora the app is running 411 00:21:47,320 --> 00:21:48,560 Speaker 1: ads between videos. 412 00:21:48,760 --> 00:21:50,680 Speaker 2: Yeah, absolutely. Interesting. 413 00:21:51,640 --> 00:21:54,040 Speaker 1: So in one of your videos, you say that AI 414 00:21:54,200 --> 00:21:56,720 Speaker 1: will end social media? What do you mean by that? 415 00:21:57,480 --> 00:21:59,960 Speaker 2: I think it has the potential to end the For 416 00:22:00,119 --> 00:22:03,080 Speaker 2: You page as we know it, unless the social media 417 00:22:03,119 --> 00:22:07,800 Speaker 2: companies figure out a way to filter AI content. Again, 418 00:22:08,240 --> 00:22:10,720 Speaker 2: we do not know how people are going to react 419 00:22:10,720 --> 00:22:14,719 Speaker 2: to this when it's deployed much wider. But it is 420 00:22:14,760 --> 00:22:18,840 Speaker 2: a rational thing to not want to only see AI 421 00:22:18,920 --> 00:22:21,520 Speaker 2: slop in your feed. And I say AI slop because 422 00:22:21,560 --> 00:22:24,240 Speaker 2: it's bad. Let's even assume that it's better. Let's assume 423 00:22:24,240 --> 00:22:28,560 Speaker 2: that AI video were indistinguishable. If that were the case, 424 00:22:29,000 --> 00:22:31,040 Speaker 2: would you actually want more of it in your feed, 425 00:22:31,480 --> 00:22:33,720 Speaker 2: or would you want to turn it off even more? 426 00:22:34,280 --> 00:22:36,639 Speaker 2: I don't think that we know the answers to these questions, 427 00:22:37,080 --> 00:22:41,480 Speaker 2: but it's very likely that if the companies that are running 428 00:22:41,480 --> 00:22:45,280 Speaker 2: these platforms can't figure out a way to filter out 429 00:22:45,320 --> 00:22:48,600 Speaker 2: AI content, there's a part of the population that's going 430 00:22:48,640 --> 00:22:52,359 Speaker 2: to start tuning out. There's also advertisers that might be 431 00:22:52,400 --> 00:22:55,400 Speaker 2: scared by that. So I do think it's an existential 432 00:22:55,440 --> 00:22:57,800 Speaker 2: threat to the For You page. I think it actually 433 00:22:57,880 --> 00:23:01,800 Speaker 2: might be a boon for these subscriber or Substack-type communities, 434 00:23:02,040 --> 00:23:05,200 Speaker 2: like, I think that when people start rushing towards people 435 00:23:05,200 --> 00:23:07,920 Speaker 2: that they trust, I think that that could be a really, 436 00:23:07,920 --> 00:23:11,399 Speaker 2: really positive thing.
I'll say, for me, one of the 437 00:23:11,440 --> 00:23:13,119 Speaker 2: things that I would be looking at if I were 438 00:23:13,119 --> 00:23:16,200 Speaker 2: an AI creator is the fact that, because Sora 2 439 00:23:16,480 --> 00:23:19,439 Speaker 2: is so good at making videos, it lowered the barrier 440 00:23:19,440 --> 00:23:22,400 Speaker 2: to entry so far that I don't think OpenAI 441 00:23:22,560 --> 00:23:25,760 Speaker 2: is that far from generating their own feed. You know, 442 00:23:25,760 --> 00:23:28,879 Speaker 2: if you can make an interesting video with only two sentences, well, 443 00:23:28,960 --> 00:23:32,840 Speaker 2: ChatGPT can make two sentences. They're collecting everyone's prompts, 444 00:23:32,880 --> 00:23:37,760 Speaker 2: they're seeing what gets likes and engagement on Sora. I 445 00:23:37,800 --> 00:23:40,119 Speaker 2: don't understand why they would need a human in the 446 00:23:40,160 --> 00:23:40,720 Speaker 2: loop soon. 447 00:23:41,200 --> 00:23:44,639 Speaker 1: I believe there's... Actually, we were just covering a story 448 00:23:44,680 --> 00:23:48,080 Speaker 1: in the Financial Times about Gen Z being less on 449 00:23:48,200 --> 00:23:50,840 Speaker 1: social media, and I think a lot of it has 450 00:23:50,880 --> 00:23:54,520 Speaker 1: to do with the sort of enshittification of the feed. 451 00:23:55,040 --> 00:23:57,320 Speaker 1: And I see a lot of people kind of resigned 452 00:23:57,359 --> 00:24:00,200 Speaker 1: to the fact that going on Instagram means scrolling through 453 00:24:00,359 --> 00:24:01,840 Speaker 1: a lot of shit, and a lot of shit that's 454 00:24:01,840 --> 00:24:05,320 Speaker 1: AI generated. It's no longer social media. It's like watching 455 00:24:05,359 --> 00:24:09,080 Speaker 1: fake video. Yeah, it's hyper-enshittification. It is the 456 00:24:09,200 --> 00:24:13,240 Speaker 1: most enshittified feed you could possibly have. And I 457 00:24:13,280 --> 00:24:16,960 Speaker 1: totally agree that there will be people who are 458 00:24:17,000 --> 00:24:19,560 Speaker 1: super down with that and who are going to enjoy it. 459 00:24:20,119 --> 00:24:22,520 Speaker 2: Again, there are people who enjoy this. I don't want 460 00:24:22,600 --> 00:24:25,520 Speaker 2: to say that they're doing the wrong thing by enjoying 461 00:24:25,560 --> 00:24:26,439 Speaker 2: AI video of a 462 00:24:26,480 --> 00:24:28,720 Speaker 1: fruit cutting another fruit, something like that. 463 00:24:29,040 --> 00:24:31,560 Speaker 2: Yeah. Like, I'm not here to judge what people are watching. 464 00:24:31,920 --> 00:24:36,359 Speaker 2: But if you play this out to its logical conclusion here, 465 00:24:36,720 --> 00:24:41,720 Speaker 2: it looks like social media companies generating their own videos 466 00:24:41,800 --> 00:24:46,679 Speaker 2: without creators in the middle, for a hyper-enshittified feed. 467 00:24:47,320 --> 00:24:49,840 Speaker 1: So five to ten years is a huge difference. So 468 00:24:49,920 --> 00:24:52,600 Speaker 1: let's just say, five years from now, what do you 469 00:24:52,680 --> 00:24:55,199 Speaker 1: think the state of AI video looks like, and what 470 00:24:55,240 --> 00:24:58,080 Speaker 1: does it mean for the Internet, for politics, and just 471 00:24:58,240 --> 00:24:59,439 Speaker 1: us generally as a culture? 472 00:25:00,119 --> 00:25:05,640 Speaker 2: If we project the current growth out, it is indistinguishable 473 00:25:05,720 --> 00:25:10,200 Speaker 2: and everywhere.
If we take a contrarian view, we can 474 00:25:10,280 --> 00:25:12,560 Speaker 2: see that people might not be into it and it 475 00:25:12,640 --> 00:25:15,680 Speaker 2: might lose a lot of money. We don't know which 476 00:25:15,680 --> 00:25:18,040 Speaker 2: direction it's going to go, and I don't claim to 477 00:25:18,080 --> 00:25:21,439 Speaker 2: be able to tell which direction we're going in. But 478 00:25:21,840 --> 00:25:25,000 Speaker 2: in that first scenario where it's indistinguishable, it'll still be 479 00:25:25,040 --> 00:25:30,080 Speaker 2: distinguishable by machine learning algorithms. It'll still be detectable by experts. 480 00:25:30,400 --> 00:25:34,520 Speaker 2: I still don't think it presents legal problems, but it 481 00:25:34,560 --> 00:25:39,359 Speaker 2: presents massive disinformation problems. I'm very scared about that. And 482 00:25:39,400 --> 00:25:41,920 Speaker 2: then there's another scenario, which I think is a little 483 00:25:41,960 --> 00:25:44,439 Speaker 2: bit more optimistic, which I actually subscribe to, which is 484 00:25:44,440 --> 00:25:48,080 Speaker 2: that AI content becomes its own genre. There are companies 485 00:25:48,080 --> 00:25:51,360 Speaker 2: that figure out a way to monetize it. It stays 486 00:25:51,480 --> 00:25:57,720 Speaker 2: separate from our real feeds to whatever degree the viewer wants. 487 00:25:58,119 --> 00:26:00,199 Speaker 2: And I think that this is the optimistic vision, one 488 00:26:00,280 --> 00:26:02,040 Speaker 2: that a lot of the tech community believes in too, 489 00:26:02,080 --> 00:26:04,000 Speaker 2: and that Sam Altman would probably share. You know, he's 490 00:26:04,040 --> 00:26:05,800 Speaker 2: been asked about this, he's been asked, how do we 491 00:26:05,800 --> 00:26:09,240 Speaker 2: tell what's real or fake? And I actually didn't hate 492 00:26:09,240 --> 00:26:11,600 Speaker 2: his answer. He said, well, just like we always have, 493 00:26:11,800 --> 00:26:13,800 Speaker 2: we follow the people we trust, like we have human 494 00:26:13,880 --> 00:26:18,280 Speaker 2: communication networks. Now, I think that his accelerationist view is 495 00:26:18,400 --> 00:26:20,639 Speaker 2: kind of running against that a little bit, but I 496 00:26:20,680 --> 00:26:22,880 Speaker 2: do believe that at its core, that's how we're going 497 00:26:22,880 --> 00:26:26,320 Speaker 2: to figure this out, and it might push people to be less online. 498 00:26:26,440 --> 00:26:29,359 Speaker 2: Like, I just think that there's just so many unanswered questions. 499 00:26:29,400 --> 00:26:32,359 Speaker 2: But yeah, there's a few different scenarios that, right now, 500 00:26:32,480 --> 00:26:34,159 Speaker 2: I think we just have to flip a coin on 501 00:26:34,200 --> 00:26:35,639 Speaker 2: which one we believe in. 502 00:26:37,160 --> 00:26:40,080 Speaker 1: So you said the reason that you got interested in 503 00:26:40,280 --> 00:26:45,040 Speaker 1: understanding AI video was as a tool for production. When 504 00:26:45,040 --> 00:26:47,480 Speaker 1: that was the case, what were you excited about, and 505 00:26:47,520 --> 00:26:49,520 Speaker 1: sort of why has that now changed for you? 506 00:26:50,720 --> 00:26:54,080 Speaker 2: I was excited about it lowering the barrier to doing 507 00:26:54,200 --> 00:26:57,080 Speaker 2: creative things. I have a green screen studio in my basement.
508 00:26:57,119 --> 00:26:59,199 Speaker 2: I was excited about it, you know, putting me in 509 00:26:59,200 --> 00:27:01,879 Speaker 2: different types of studios and different types of environments. I 510 00:27:01,920 --> 00:27:06,800 Speaker 2: was excited about it improving my graphics workflows. What started 511 00:27:06,840 --> 00:27:09,159 Speaker 2: steering me away from it was some of the 512 00:27:09,160 --> 00:27:11,960 Speaker 2: ethical concerns. I did realize that, at the end of 513 00:27:12,000 --> 00:27:15,720 Speaker 2: the day, like, this was mostly stolen information. It was 514 00:27:15,920 --> 00:27:19,320 Speaker 2: actually not that much more useful than the actual room 515 00:27:19,520 --> 00:27:21,600 Speaker 2: I'm in right now. Like, I can make a decent 516 00:27:21,640 --> 00:27:28,639 Speaker 2: studio myself. And really what made me turn was just 517 00:27:28,920 --> 00:27:31,680 Speaker 2: using the tools. I think a lot of the people 518 00:27:31,960 --> 00:27:36,359 Speaker 2: who are using them, who come from my background, realize 519 00:27:36,359 --> 00:27:39,320 Speaker 2: that they aren't very fun tools to use. It's not 520 00:27:39,359 --> 00:27:42,000 Speaker 2: a creative process for me. It's really frustrating. 521 00:27:42,080 --> 00:27:43,280 Speaker 1: Well, you just type something in. 522 00:27:43,400 --> 00:27:45,080 Speaker 2: You just type something in, and you hope it comes 523 00:27:45,080 --> 00:27:47,480 Speaker 2: back the way you want it. It's like, because 524 00:27:47,480 --> 00:27:49,240 Speaker 2: I have a history as a director, it is like 525 00:27:49,400 --> 00:27:52,119 Speaker 2: every time I needed to tell the actor exactly what 526 00:27:52,240 --> 00:27:55,720 Speaker 2: to say, exactly how to deliver it, over and over 527 00:27:55,840 --> 00:27:59,560 Speaker 2: and over. And as a creative person and as a director, 528 00:28:00,080 --> 00:28:02,680 Speaker 2: I just want to collaborate with people who bring something 529 00:28:02,680 --> 00:28:04,520 Speaker 2: to the table. I don't want to bring everything to 530 00:28:04,560 --> 00:28:06,479 Speaker 2: the table myself. I don't want to tell everyone how 531 00:28:06,480 --> 00:28:09,040 Speaker 2: to do everything, right? That's not what the process of 532 00:28:09,080 --> 00:28:11,919 Speaker 2: creating ever was. It was always about collaboration. It was 533 00:28:11,920 --> 00:28:14,720 Speaker 2: always a fun process. I find the idea of just 534 00:28:14,720 --> 00:28:19,119 Speaker 2: sitting in my basement creating AI videos with text is 535 00:28:19,160 --> 00:28:22,719 Speaker 2: just... it's exhausting. It doesn't feel creative at all. 536 00:28:23,320 --> 00:28:25,960 Speaker 2: But I'm not saying that people should hate every AI 537 00:28:26,080 --> 00:28:28,680 Speaker 2: video they see, like, some of them can be creative. 538 00:28:28,720 --> 00:28:32,280 Speaker 2: But yeah, it's just taking that opportunity to train yourself 539 00:28:32,320 --> 00:28:34,600 Speaker 2: to see what these video models look like. Because if 540 00:28:34,600 --> 00:28:37,439 Speaker 2: you're into it, that's totally fine, but then you're at 541 00:28:37,520 --> 00:28:40,080 Speaker 2: least ready for when it is used for disinformation, which 542 00:28:40,080 --> 00:28:41,440 Speaker 2: I think is inevitable at this point. 543 00:28:41,840 --> 00:28:45,680 Speaker 1: Well, thank you so much, Jeremy. I will be tuned 544 00:28:45,760 --> 00:28:48,680 Speaker 1: into your feed.
You are... I don't know what I 545 00:28:48,680 --> 00:28:51,520 Speaker 1: would call you. Is it vigilante justice? I don't think so. 546 00:28:51,720 --> 00:28:55,520 Speaker 1: But you're doing some kind of public service. Education. 547 00:28:55,640 --> 00:28:57,200 Speaker 3: You're an educator. Yeah, there you go. 548 00:28:57,320 --> 00:28:58,480 Speaker 1: You're an AI educator. 549 00:28:58,600 --> 00:29:22,040 Speaker 3: Yeah. 550 00:29:22,240 --> 00:29:25,520 Speaker 1: For Tech Stuff, I'm Kara Price. This episode was produced by Eliza Dennis, 551 00:29:25,560 --> 00:29:28,680 Speaker 1: Melissa Slaughter, and Tyler Hill. It was executive produced by 552 00:29:28,720 --> 00:29:32,720 Speaker 1: me, Oz Woloshyn, Julia Nutter, and Kate Osborne for Kaleidoscope 553 00:29:33,000 --> 00:29:36,680 Speaker 1: and Katrina Norvell for iHeart Podcasts. Kyle Murdoch mixed this 554 00:29:36,760 --> 00:29:39,680 Speaker 1: episode and wrote our theme song. Join us on Friday 555 00:29:39,720 --> 00:29:41,840 Speaker 1: for The Week in Tech. Oz and I will run 556 00:29:41,880 --> 00:29:44,640 Speaker 1: through the headlines you may have missed. Please rate, review, 557 00:29:44,680 --> 00:29:47,160 Speaker 1: and reach out to us at Tech Stuff Podcast at 558 00:29:47,160 --> 00:29:57,320 Speaker 1: gmail dot com.