1 00:00:07,800 --> 00:00:17,280 Speaker 1: Welcome to Tech Stuff. I'm Kara Price. Today's interview 2 00:00:17,400 --> 00:00:20,880 Speaker 1: is all about Sora, the video generation tool and invite 3 00:00:20,880 --> 00:00:23,320 Speaker 1: only social media app that OpenAI released at the 4 00:00:23,360 --> 00:00:26,960 Speaker 1: beginning of October. If you're on TikTok, Instagram, or X, 5 00:00:27,120 --> 00:00:30,080 Speaker 1: you've likely seen videos made by Sora plastered all over 6 00:00:30,120 --> 00:00:33,920 Speaker 1: your feeds. These videos ranged from the absurd, cats dancing 7 00:00:33,960 --> 00:00:37,920 Speaker 1: by a dumpster with sunglasses on, to the hyper realistic, like 8 00:00:38,040 --> 00:00:41,720 Speaker 1: Queen Elizabeth trying jerk chicken in Jamaica. When I first 9 00:00:41,760 --> 00:00:44,560 Speaker 1: saw these videos, I was entertained by the absurdist ones 10 00:00:44,600 --> 00:00:47,920 Speaker 1: and kind of floored by the realistic ones. To me, 11 00:00:48,479 --> 00:00:51,400 Speaker 1: Sora signals that we have officially entered the post bunny 12 00:00:51,440 --> 00:00:54,680 Speaker 1: trampoline internet. Yeah, I'm talking about the AI video of 13 00:00:54,720 --> 00:00:57,400 Speaker 1: the horde of bunnies jumping on a trampoline in the dark. 14 00:00:57,880 --> 00:01:00,320 Speaker 1: I was very convinced that this video was real, and 15 00:01:00,360 --> 00:01:02,960 Speaker 1: so were many people, which led to a mini panic. 16 00:01:03,800 --> 00:01:06,640 Speaker 1: Is it even possible to detect what's fake and what's 17 00:01:06,680 --> 00:01:09,720 Speaker 1: not anymore? That's where my guest today comes in. His 18 00:01:09,840 --> 00:01:12,960 Speaker 1: name is Jeremy Carrasco and he runs multiple social media 19 00:01:13,000 --> 00:01:15,679 Speaker 1: accounts under the name ShowTools AI. 20 00:01:16,240 --> 00:01:18,360 Speaker 2: The idea that we can't tell what's real or not 21 00:01:18,880 --> 00:01:22,480 Speaker 2: because of AI video is so far definitely 22 00:01:21,959 --> 00:01:24,240 Speaker 1: not the case. He has only been a full time 23 00:01:24,280 --> 00:01:27,000 Speaker 1: creator for four months, but he has become a trusted 24 00:01:27,040 --> 00:01:30,880 Speaker 1: source for dissecting viral AI videos and explaining the tells. 25 00:01:31,000 --> 00:01:35,160 Speaker 2: There is a physical truth to shooting a video with 26 00:01:35,200 --> 00:01:38,959 Speaker 2: a camera. That physical truth isn't going away, and AI 27 00:01:39,520 --> 00:01:43,600 Speaker 2: does a version that to our eyes looks like that 28 00:01:43,640 --> 00:01:48,400 Speaker 2: physical truth. But upon examination you can figure out that 29 00:01:48,440 --> 00:01:51,120 Speaker 2: these things break down. And I do think that any 30 00:01:51,240 --> 00:01:55,120 Speaker 2: normal person with decent eyesight can zoom into these AI 31 00:01:55,240 --> 00:01:56,600 Speaker 2: videos and figure that out. 32 00:01:57,200 --> 00:02:00,680 Speaker 1: So Jeremy wants his social videos to be educational. He 33 00:02:00,760 --> 00:02:03,120 Speaker 1: wants more people to get excited by what he calls 34 00:02:03,200 --> 00:02:06,760 Speaker 1: pixel peeping, and he wants to improve people's media literacy, 35 00:02:07,320 --> 00:02:10,120 Speaker 1: and hopes his accounts can help people tune their AI 36 00:02:10,240 --> 00:02:10,880 Speaker 1: vibe checker.
37 00:02:11,360 --> 00:02:13,400 Speaker 2: I'm not naive to the fact that people aren't going 38 00:02:13,440 --> 00:02:16,160 Speaker 2: to be pixel peeping on the videos that they watch, 39 00:02:16,320 --> 00:02:20,000 Speaker 2: so it's just about trying to tune people's initial impressions 40 00:02:20,040 --> 00:02:22,440 Speaker 2: so that they have something in their head that says, hey, 41 00:02:22,520 --> 00:02:25,040 Speaker 2: something might not be right here, and then they can use, 42 00:02:25,080 --> 00:02:27,880 Speaker 2: hopefully, other media skills that I teach them in order 43 00:02:27,919 --> 00:02:28,959 Speaker 2: to dive a little 44 00:02:28,760 --> 00:02:31,800 Speaker 1: bit deeper. I talk to Jeremy about so many things: 45 00:02:31,840 --> 00:02:35,840 Speaker 1: how video generation tools work, how to pick up on AI tells, 46 00:02:36,200 --> 00:02:39,200 Speaker 1: why Sora is an inflection point for the Internet, and 47 00:02:39,240 --> 00:02:41,960 Speaker 1: what this signals for the future of social media. I 48 00:02:42,000 --> 00:02:44,880 Speaker 1: started out by asking Jeremy to clarify what Sora is 49 00:02:45,280 --> 00:02:46,440 Speaker 1: and what it does. 50 00:02:47,120 --> 00:02:52,360 Speaker 2: So, Sora was originally released as OpenAI's first video model. 51 00:02:52,680 --> 00:02:56,239 Speaker 2: In October twenty twenty five, they reused the Sora name 52 00:02:56,440 --> 00:02:59,240 Speaker 2: to launch their social media app. A lot of the 53 00:02:59,280 --> 00:03:01,680 Speaker 2: hype has been around the Sora app, which is currently 54 00:03:01,720 --> 00:03:04,720 Speaker 2: invite only, and then there's the Sora 2 model that 55 00:03:04,760 --> 00:03:08,680 Speaker 2: you can already access if you have API access or 56 00:03:08,720 --> 00:03:11,760 Speaker 2: if you're a developer or even a normal person. There 57 00:03:11,800 --> 00:03:14,360 Speaker 2: are tools that let you generate a video with the 58 00:03:14,400 --> 00:03:18,440 Speaker 2: Sora 2 video model without an invite. The Sora app 59 00:03:18,480 --> 00:03:23,520 Speaker 2: experience is very unique in some ways and very familiar 60 00:03:23,520 --> 00:03:26,200 Speaker 2: in others. It does feel like a TikTok For You 61 00:03:26,320 --> 00:03:30,160 Speaker 2: page, just for AI videos. You can scroll, it has 62 00:03:30,200 --> 00:03:32,840 Speaker 2: an algorithm to suggest. But what's gotten a lot of 63 00:03:32,840 --> 00:03:36,720 Speaker 2: the attention is the ability to cameo someone. But really, 64 00:03:36,840 --> 00:03:38,880 Speaker 2: these are just deepfakes. Like, you're creating deepfakes 65 00:03:38,920 --> 00:03:41,200 Speaker 2: of your friends. You're creating deepfakes of whoever lets 66 00:03:41,240 --> 00:03:42,960 Speaker 2: you create a deepfake of them. And you have 67 00:03:43,040 --> 00:03:46,840 Speaker 2: different levels of permissions. So, for example, Jake Paul and 68 00:03:46,920 --> 00:03:51,080 Speaker 2: Sam Altman let anyone deepfake them, whereas I let 69 00:03:51,120 --> 00:03:54,120 Speaker 2: no one deepfake me because I'm not comfortable with that. 70 00:03:54,600 --> 00:03:56,680 Speaker 1: What does it look like to let someone deepfake 71 00:03:56,720 --> 00:03:57,400 Speaker 1: you on Sora? 72 00:03:57,840 --> 00:04:00,800 Speaker 2: It looks like a version of you doing whatever they 73 00:04:01,520 --> 00:04:04,240 Speaker 2: prompted you to do.
Now, there are safety features in place, 74 00:04:04,480 --> 00:04:08,120 Speaker 2: so you can't have them do anything violent, you can't 75 00:04:08,160 --> 00:04:11,560 Speaker 2: do anything sexual. But it's really up to OpenAI 76 00:04:11,640 --> 00:04:14,280 Speaker 2: to set those boundaries. And I don't think it's 77 00:04:14,320 --> 00:04:17,680 Speaker 2: completely accurate. I've made versions of myself that I think 78 00:04:17,720 --> 00:04:19,919 Speaker 2: don't look very much like me. I've made other versions 79 00:04:19,920 --> 00:04:22,400 Speaker 2: of myself that look a lot like me. That's really 80 00:04:22,520 --> 00:04:27,520 Speaker 2: up to luck, because as we'll learn, these models aren't deterministic. 81 00:04:27,600 --> 00:04:30,120 Speaker 2: There's a part of this that is random, so it's 82 00:04:30,160 --> 00:04:33,440 Speaker 2: not repeatable. So Jake Paul is a very good example. 83 00:04:33,480 --> 00:04:35,880 Speaker 2: There are a ton of AI videos of Jake Paul 84 00:04:35,960 --> 00:04:38,080 Speaker 2: right now. All of them look a little bit different, 85 00:04:38,120 --> 00:04:41,960 Speaker 2: but have his likeness. So you have to give permission 86 00:04:42,000 --> 00:04:43,680 Speaker 2: for someone to make a video of you through the 87 00:04:43,680 --> 00:04:44,440 Speaker 2: cameo feature. 88 00:04:44,760 --> 00:04:48,960 Speaker 1: So would you say that AI video generation scares you? Like, 89 00:04:49,360 --> 00:04:51,120 Speaker 1: is it something that keeps you up at night? 90 00:04:51,800 --> 00:04:54,159 Speaker 2: It's not, because I'm doing something about it now, but 91 00:04:54,200 --> 00:04:56,280 Speaker 2: it really was, and I think it is keeping people 92 00:04:56,320 --> 00:04:58,000 Speaker 2: up at night, because so much of our time is 93 00:04:58,040 --> 00:05:01,120 Speaker 2: spent on these short-form video platforms, like, for better 94 00:05:01,240 --> 00:05:04,800 Speaker 2: or worse. I do think that it is the primary 95 00:05:04,839 --> 00:05:08,039 Speaker 2: way that people get information now. It was probably never 96 00:05:08,120 --> 00:05:10,520 Speaker 2: the best format for that information in the first place, 97 00:05:10,640 --> 00:05:13,080 Speaker 2: but here we are. So I think what keeps me 98 00:05:13,200 --> 00:05:16,800 Speaker 2: up is really general media literacy skills, and I think 99 00:05:16,800 --> 00:05:19,640 Speaker 2: of AI video as an extension of that. A lot 100 00:05:19,680 --> 00:05:21,560 Speaker 2: of people are kept up by what I think are 101 00:05:21,720 --> 00:05:25,480 Speaker 2: irrational fears about AI video. Like, in my opinion, it's 102 00:05:25,520 --> 00:05:28,039 Speaker 2: probably not going to be framing you for a crime 103 00:05:28,080 --> 00:05:31,400 Speaker 2: anytime soon, but it might turn the court of public 104 00:05:31,440 --> 00:05:34,320 Speaker 2: opinion against you. It might be spreading disinformation. 105 00:05:34,600 --> 00:05:34,680 Speaker 1: Like. 106 00:05:34,760 --> 00:05:38,320 Speaker 2: It's an extension of other media literacy problems, and it's 107 00:05:38,360 --> 00:05:41,440 Speaker 2: a very believable one, because people, when they are scrolling, 108 00:05:41,560 --> 00:05:43,400 Speaker 2: they're just there to tune out and scroll. They're not 109 00:05:43,440 --> 00:05:46,640 Speaker 2: there to pixel peep and really pay attention, right?
110 00:05:46,600 --> 00:05:48,960 Speaker 1: I mean, you don't think that we are living in a 111 00:05:49,000 --> 00:05:51,560 Speaker 1: world where soon people could be framed for something they 112 00:05:51,560 --> 00:05:53,599 Speaker 1: didn't do using manipulated video? 113 00:05:54,000 --> 00:05:56,960 Speaker 2: Well, I think that... I'm not a lawyer, but I've 114 00:05:57,000 --> 00:05:59,920 Speaker 2: done some looking into this, and the reality is that 115 00:06:00,279 --> 00:06:03,000 Speaker 2: in order for something to be admitted into evidence, at 116 00:06:03,080 --> 00:06:04,800 Speaker 2: least in the United States, it has to have an 117 00:06:04,839 --> 00:06:09,039 Speaker 2: extensive metadata trail. It has to be authenticated. You have 118 00:06:09,080 --> 00:06:11,039 Speaker 2: to get the person who filmed the video into the 119 00:06:11,040 --> 00:06:13,839 Speaker 2: courtroom to say that they filmed it. And we have 120 00:06:13,920 --> 00:06:17,840 Speaker 2: to understand that while our perception might be getting tricked, 121 00:06:18,040 --> 00:06:21,919 Speaker 2: there are procedural and mathematical ways that these can be detected. 122 00:06:22,440 --> 00:06:25,440 Speaker 2: So it is not undetectable yet. And anyone who says 123 00:06:25,440 --> 00:06:28,640 Speaker 2: it's undetectable is probably either selling you something or doesn't 124 00:06:28,640 --> 00:06:30,760 Speaker 2: have a good eye. And anyone who says it will 125 00:06:30,800 --> 00:06:34,680 Speaker 2: be undetectable does not know that, and frankly doesn't understand 126 00:06:34,760 --> 00:06:36,840 Speaker 2: the technology that's making these AI videos very well, in my opinion. 127 00:06:36,880 --> 00:06:40,160 Speaker 1: And right now, your likeness is not shared? 128 00:06:41,160 --> 00:06:46,640 Speaker 2: No. I have a strong, strong bias against this, because 129 00:06:46,720 --> 00:06:49,680 Speaker 2: I believe that once your likeness gets out there and 130 00:06:49,839 --> 00:06:53,360 Speaker 2: is deepfakable, so to speak, it's really hard to pull 131 00:06:53,360 --> 00:06:56,719 Speaker 2: that back. Not because you can't, like, you can tell 132 00:06:56,760 --> 00:06:59,800 Speaker 2: people to stop, but once it's out there, I think 133 00:06:59,800 --> 00:07:01,880 Speaker 2: you lose a sense of trust. It's a line that 134 00:07:01,960 --> 00:07:04,440 Speaker 2: I just don't want to cross, that I'm not comfortable crossing, 135 00:07:04,440 --> 00:07:06,719 Speaker 2: and I've actually told my followers I will never cross 136 00:07:06,760 --> 00:07:09,760 Speaker 2: that line, because it's just not what I'm interested in. 137 00:07:10,440 --> 00:07:13,640 Speaker 1: So I was hoping that you could show me how 138 00:07:13,640 --> 00:07:15,559 Speaker 1: to make a video using the Sora app. 139 00:07:15,760 --> 00:07:19,520 Speaker 2: Sure, so this is the Sora desktop app. It is 140 00:07:19,640 --> 00:07:22,400 Speaker 2: not the vertical experience that you have, you know, on 141 00:07:22,400 --> 00:07:25,360 Speaker 2: the phone. It is, however, showing a lot of the 142 00:07:25,400 --> 00:07:28,280 Speaker 2: same content. So this is essentially the For You page 143 00:07:28,280 --> 00:07:30,400 Speaker 2: of Sora. And the thing to note here is that 144 00:07:30,440 --> 00:07:33,280 Speaker 2: there are Sora watermarks over each one of these videos.
145 00:07:33,560 --> 00:07:36,040 Speaker 2: In the mobile experience, those watermarks go away, but 146 00:07:36,160 --> 00:07:39,200 Speaker 2: they don't let you screen record in the mobile version, 147 00:07:39,320 --> 00:07:42,360 Speaker 2: whereas theoretically anyone could do what I'm doing right now, 148 00:07:42,400 --> 00:07:44,720 Speaker 2: like I can share my screen here, I could record 149 00:07:44,760 --> 00:07:47,560 Speaker 2: my screen. When you see Sora videos on social media, 150 00:07:48,080 --> 00:07:49,360 Speaker 2: this is how they're being made. 151 00:07:49,440 --> 00:07:52,040 Speaker 1: So let's try to make a Sora video. Let's do 152 00:07:53,000 --> 00:07:55,680 Speaker 1: skiing with candy. 153 00:07:56,520 --> 00:07:59,000 Speaker 2: Skiing with candy. You want me to just say that 154 00:07:59,080 --> 00:08:02,520 Speaker 2: and see what it comes up with? Yes, let's do it. 155 00:08:02,600 --> 00:08:03,720 Speaker 2: I think that's a great idea. 156 00:08:03,760 --> 00:08:04,840 Speaker 1: Why do you think it's a good idea? 157 00:08:04,960 --> 00:08:10,120 Speaker 2: Because... something that people aren't talking about enough with 158 00:08:10,240 --> 00:08:12,880 Speaker 2: Sora is that you can have a very simple prompt 159 00:08:12,960 --> 00:08:16,080 Speaker 2: and it can come up with something really creative. That's 160 00:08:16,200 --> 00:08:19,960 Speaker 2: really what, in my opinion, distinguishes it from other video models. 161 00:08:20,240 --> 00:08:22,880 Speaker 2: Google Veo 3 was how a lot of AI content 162 00:08:22,960 --> 00:08:25,320 Speaker 2: was made a few weeks ago. If you don't give 163 00:08:25,320 --> 00:08:28,480 Speaker 2: Google Veo 3 a good prompt, it's just boring, whereas 164 00:08:28,480 --> 00:08:32,040 Speaker 2: Sora will go through some attempts to at least make 165 00:08:32,080 --> 00:08:33,120 Speaker 2: it entertaining anyway. 166 00:08:33,320 --> 00:08:36,360 Speaker 1: It's just incredible to me that in a given three 167 00:08:36,400 --> 00:08:38,160 Speaker 1: weeks the world sort of changes. 168 00:08:38,480 --> 00:08:42,080 Speaker 2: I think that there is a misconception that the world 169 00:08:42,200 --> 00:08:48,440 Speaker 2: just changed because video AI made a huge, undetectable leap. 170 00:08:48,960 --> 00:08:52,160 Speaker 2: It did make a step towards more realism. What Sora 2 171 00:08:52,280 --> 00:08:57,000 Speaker 2: really improved were a lot of the human parts 172 00:08:57,120 --> 00:09:00,840 Speaker 2: of video AI, such as hand movement, or if they 173 00:09:01,160 --> 00:09:04,840 Speaker 2: have a missing limb, or if their teeth look weird, 174 00:09:05,080 --> 00:09:08,200 Speaker 2: or if their eyes look uncanny, or their hair. Like, there were 175 00:09:08,240 --> 00:09:10,880 Speaker 2: all these little things that people would pick up on, again, 176 00:09:11,000 --> 00:09:14,360 Speaker 2: a lot of them subconscious. Sora made a step towards 177 00:09:14,360 --> 00:09:18,199 Speaker 2: improving those things. It still has a lot of background issues. 178 00:09:18,720 --> 00:09:22,720 Speaker 2: It is actually a noisier or muddier looking model, in 179 00:09:22,720 --> 00:09:25,520 Speaker 2: my opinion, than Veo, but a lot of people aren't 180 00:09:25,559 --> 00:09:27,480 Speaker 2: looking for that.
A lot of the videos that go 181 00:09:27,640 --> 00:09:32,720 Speaker 2: viral that are AI generated are security cams, or body 182 00:09:32,800 --> 00:09:36,960 Speaker 2: cams, or GoPro-looking cameras, things that people aren't 183 00:09:37,000 --> 00:09:40,160 Speaker 2: looking at every day. But it really made improvements in 184 00:09:41,080 --> 00:09:45,560 Speaker 2: how good the outputs are to watch, like, story-wise. 185 00:09:46,160 --> 00:09:49,080 Speaker 2: If you were to release Google Veo 3 as a 186 00:09:49,120 --> 00:09:53,440 Speaker 2: social media app, it would just fail entirely, because people 187 00:09:53,440 --> 00:09:57,600 Speaker 2: would get on there and, unless you're a good prompter, like, 188 00:09:57,640 --> 00:10:00,320 Speaker 2: you're not going to come up with anything interesting. Whereas 189 00:10:00,320 --> 00:10:03,520 Speaker 2: with Sora, anyone getting into AI, it's possible for you 190 00:10:03,559 --> 00:10:06,280 Speaker 2: to come up with something interesting with a very basic prompt. 191 00:10:06,320 --> 00:10:09,760 Speaker 2: That's a really, really big innovation that they didn't talk about. 192 00:10:09,880 --> 00:10:12,160 Speaker 2: But I think that's why it's had such an impact 193 00:10:12,280 --> 00:10:16,920 Speaker 2: is because there's a huge volume of somewhat meaningful Sora 194 00:10:17,040 --> 00:10:19,880 Speaker 2: videos out there, whereas there really wasn't with Veo when 195 00:10:19,880 --> 00:10:22,360 Speaker 2: that came out, right? So, all right, so it came 196 00:10:22,440 --> 00:10:24,120 Speaker 2: up with skiing with candy. Let's see... let's see 197 00:10:24,120 --> 00:10:24,640 Speaker 2: what it did here. 198 00:10:25,240 --> 00:10:25,520 Speaker 1: Look what. 199 00:10:27,160 --> 00:10:29,959 Speaker 2: Go, mid-slope snack, classy, and 200 00:10:29,920 --> 00:10:31,000 Speaker 3: a peppermint for the wind. 201 00:10:31,200 --> 00:10:34,000 Speaker 1: Nothing like sweet fuel to keep the turns smooth. Catch 202 00:10:34,000 --> 00:10:35,079 Speaker 1: you at the bottom. 203 00:10:35,480 --> 00:10:37,599 Speaker 2: All right, what are your impressions? 204 00:10:37,840 --> 00:10:40,760 Speaker 1: I just don't... I'm sorry, this is... Is it okay 205 00:10:40,760 --> 00:10:42,000 Speaker 1: that this is blowing my mind? 206 00:10:42,400 --> 00:10:43,400 Speaker 2: It should, okay? 207 00:10:43,440 --> 00:10:45,360 Speaker 1: Good. It should blow your mind, because I feel daft, 208 00:10:45,520 --> 00:10:49,640 Speaker 1: like I feel like I can't wrap my head around this, 209 00:10:49,960 --> 00:10:52,160 Speaker 1: like I'm assuming this woman in the video with her 210 00:10:52,200 --> 00:10:53,960 Speaker 1: ski mask on is not a real person. 211 00:10:54,120 --> 00:10:56,120 Speaker 2: No, she's not a real person. And we don't know 212 00:10:56,480 --> 00:10:58,600 Speaker 2: how they invented her. They just came up with that. 213 00:10:58,679 --> 00:10:59,880 Speaker 3: What? 214 00:11:00,000 --> 00:11:03,000 Speaker 2: So there are things about this that stick out to 215 00:11:03,040 --> 00:11:05,720 Speaker 2: me as obvious AI video. And then there are things 216 00:11:05,720 --> 00:11:09,040 Speaker 2: about this that I just have to say, wow, that 217 00:11:09,160 --> 00:11:12,280 Speaker 2: is incredible. So if I can just explain what I 218 00:11:12,360 --> 00:11:15,360 Speaker 2: see here as someone who watches these, so please.
She starts 219 00:11:15,360 --> 00:11:18,120 Speaker 2: out by skiing down the hill, but she's kind of 220 00:11:18,160 --> 00:11:21,640 Speaker 2: skiing like it's snowboarding. Then she stops. She has some 221 00:11:21,679 --> 00:11:24,240 Speaker 2: peppermints in her hand, she has some bags of candy 222 00:11:24,400 --> 00:11:27,920 Speaker 2: in her hand, and there are some weird things going 223 00:11:27,960 --> 00:11:30,280 Speaker 2: on here. But what it did with it is, without 224 00:11:30,320 --> 00:11:34,000 Speaker 2: any input, it basically made a social media video with it. 225 00:11:34,000 --> 00:11:38,200 Speaker 2: It's like she's promoting this candy. There's someone responding to 226 00:11:38,240 --> 00:11:42,319 Speaker 2: her in the background. It invented a straw for her. Exactly. 227 00:11:42,400 --> 00:11:45,120 Speaker 2: She talks like an influencer. 228 00:11:45,520 --> 00:11:47,440 Speaker 1: I just... it really trips me up that she's not 229 00:11:47,480 --> 00:11:49,319 Speaker 1: a real person, that this person does not exist in 230 00:11:49,360 --> 00:11:50,800 Speaker 1: the world. It's really weird. 231 00:11:51,040 --> 00:11:53,360 Speaker 2: Same. I mean, I have to tell myself it's not 232 00:11:53,400 --> 00:11:53,920 Speaker 2: a real person. 233 00:11:54,000 --> 00:11:55,679 Speaker 1: I mean, it would be like if you didn't exist. 234 00:11:56,000 --> 00:12:00,680 Speaker 2: Yeah, that's the thing, it visually feels the 235 00:12:00,679 --> 00:12:03,640 Speaker 2: same as talking to another person online. Of course, there 236 00:12:03,640 --> 00:12:05,640 Speaker 2: are tells, so I'll get into those. So 237 00:12:06,280 --> 00:12:08,880 Speaker 2: first of all, you have just the context. Why is 238 00:12:08,960 --> 00:12:11,800 Speaker 2: she skiing down the hill with a bag of candy, 239 00:12:11,960 --> 00:12:13,880 Speaker 2: and why is she just putting it in her mouth 240 00:12:13,960 --> 00:12:17,800 Speaker 2: with the wrappers? Then there are some artifacts that I 241 00:12:17,840 --> 00:12:20,760 Speaker 2: can see, especially at the beginning of the generation. Her 242 00:12:20,880 --> 00:12:24,400 Speaker 2: jacket and her pants are incredibly pixelated when it starts. 243 00:12:24,920 --> 00:12:28,160 Speaker 2: But the other thing here is that it's very noisy. 244 00:12:28,200 --> 00:12:32,840 Speaker 2: If we actually zoom in, there's a lot of artifacts 245 00:12:33,040 --> 00:12:34,640 Speaker 2: in the mountains back there. 246 00:12:34,880 --> 00:12:36,679 Speaker 1: It is weird how she's eating the candy. That's a 247 00:12:36,679 --> 00:12:37,480 Speaker 1: little uncanny. 248 00:12:37,520 --> 00:12:41,400 Speaker 2: It's weird. Yeah, she's eating wrapped candy, and the bag 249 00:12:41,440 --> 00:12:43,960 Speaker 2: there just stuck to her knee. Yeah, you know, so 250 00:12:44,040 --> 00:12:46,480 Speaker 2: at first it's a ziplock bag, then it's not a 251 00:12:46,559 --> 00:12:50,720 Speaker 2: ziplock bag, then it sticks to her knee. Her feet 252 00:12:50,720 --> 00:12:54,199 Speaker 2: are backwards, like her foot there is literally backwards in 253 00:12:54,280 --> 00:12:56,840 Speaker 2: this version. She doesn't have a foot. Like, you know, 254 00:12:56,880 --> 00:12:58,160 Speaker 2: you get into it. It's kind of funny.
255 00:12:58,200 --> 00:12:59,960 Speaker 1: But this is why you have such a large platform, 256 00:13:00,320 --> 00:13:02,320 Speaker 1: because, like, I look at this at first and I'm like, oh, 257 00:13:02,360 --> 00:13:05,840 Speaker 1: it's perfect. Like, in a way, if I see the 258 00:13:05,880 --> 00:13:09,199 Speaker 1: trappings of what I think I'm seeing, I don't really 259 00:13:09,240 --> 00:13:10,760 Speaker 1: look for the detail that's wrong. 260 00:13:11,120 --> 00:13:14,439 Speaker 2: Especially when you're just scrolling on TikTok or Instagram. You're 261 00:13:14,440 --> 00:13:15,079 Speaker 2: not looking for 262 00:13:15,000 --> 00:13:16,719 Speaker 1: anything wrong. Right, which is how they want you to 263 00:13:16,760 --> 00:13:18,559 Speaker 1: look at it, or scrolling on Sora. 264 00:13:18,480 --> 00:13:20,679 Speaker 2: Or scrolling on Sora. A lot of them are leaving 265 00:13:20,679 --> 00:13:23,560 Speaker 2: Sora and making it out to all these platforms. Yeah, 266 00:13:24,240 --> 00:13:26,480 Speaker 2: you're not going to be looking for these things. I'm 267 00:13:26,520 --> 00:13:29,640 Speaker 2: totally aware of that. I mean, on first watch, are 268 00:13:29,679 --> 00:13:31,480 Speaker 2: you gonna pick out everything that's wrong with this? No. 269 00:13:31,600 --> 00:13:33,800 Speaker 2: But if you watch it five times and start zooming in, 270 00:13:34,240 --> 00:13:37,199 Speaker 2: you're gonna start noticing that her feet are literally backwards. 271 00:13:37,480 --> 00:13:41,000 Speaker 2: So yeah, when it comes down to it, I think 272 00:13:41,040 --> 00:13:44,800 Speaker 2: what's really very important about Sora is that it did 273 00:13:44,840 --> 00:13:47,400 Speaker 2: all that work for you. You didn't need to know 274 00:13:47,480 --> 00:13:50,160 Speaker 2: how to prompt the video AI. If you were to 275 00:13:50,240 --> 00:13:54,320 Speaker 2: put skiing with candy into Google Veo, it's just going 276 00:13:54,400 --> 00:13:56,320 Speaker 2: to be boring. I'll just tell you that right now. 277 00:13:56,679 --> 00:14:00,720 Speaker 1: So if I wanted this video, this exact video, from Veo 3, 278 00:14:01,720 --> 00:14:03,560 Speaker 1: what would I have to prompt it to do? 279 00:14:04,040 --> 00:14:06,320 Speaker 2: You'd have to act like a camera director. You'd have 280 00:14:06,400 --> 00:14:11,040 Speaker 2: to say, video starting with a woman skiing down the slope. 281 00:14:11,120 --> 00:14:15,880 Speaker 2: She is wearing a pink and yellow top, a turquoise bottom. 282 00:14:16,320 --> 00:14:18,320 Speaker 2: She's holding a bag of candy in her right hand, 283 00:14:18,360 --> 00:14:20,600 Speaker 2: peppermints in her left hand, and you'd have to go shot 284 00:14:20,640 --> 00:14:23,880 Speaker 2: by shot to give it that. I can actually show you 285 00:14:24,560 --> 00:14:27,320 Speaker 2: something that I came up with that more clearly demonstrates 286 00:14:27,400 --> 00:14:29,960 Speaker 2: this point. So this is a video I made yesterday 287 00:14:30,080 --> 00:14:34,600 Speaker 2: with the prompt epic anime of Diego Maradona scoring a 288 00:14:34,680 --> 00:14:37,000 Speaker 2: goal in the World Cup. Weaving 289 00:14:36,680 --> 00:14:41,600 Speaker 1: past one, still going, two defenders beaten, he won't... Announcers. 290 00:14:42,280 --> 00:14:45,680 Speaker 2: This is him dribbling through an entire defense. It is 291 00:14:45,720 --> 00:14:49,120 Speaker 2: an epic-looking anime. Anime
people would say it doesn't 292 00:14:49,160 --> 00:14:53,040 Speaker 2: look great, but normal people probably wouldn't notice it. And 293 00:14:53,320 --> 00:14:57,360 Speaker 2: what blew me away about this is that it created 294 00:14:57,720 --> 00:15:02,280 Speaker 2: Diego Maradona's most famous goal and it added the announcers. 295 00:15:02,560 --> 00:15:04,680 Speaker 2: I didn't tell it to do any of that. Now, 296 00:15:04,720 --> 00:15:07,720 Speaker 2: if I compare that to what Google Veo did with 297 00:15:07,840 --> 00:15:15,080 Speaker 2: the exact same prompt, it did this. This 298 00:15:15,000 --> 00:15:16,120 Speaker 1: one is B team. 299 00:15:16,160 --> 00:15:20,200 Speaker 2: It is. The quality of the video is actually better, 300 00:15:20,440 --> 00:15:23,840 Speaker 2: but it didn't make it interesting. So again, that's why 301 00:15:23,880 --> 00:15:25,840 Speaker 2: you're seeing so much Sora, as you don't need to 302 00:15:25,840 --> 00:15:26,600 Speaker 2: be very creative. 303 00:15:27,080 --> 00:15:30,120 Speaker 1: What are the implications of a social media app being 304 00:15:30,200 --> 00:15:34,280 Speaker 1: designed and housing videos full of fake people? Like, it's 305 00:15:34,320 --> 00:15:35,760 Speaker 1: just crazy to me that I can watch a video 306 00:15:35,840 --> 00:15:37,440 Speaker 1: of someone who doesn't exist. 307 00:15:37,960 --> 00:15:42,040 Speaker 2: I think that we don't know the implications, and I 308 00:15:42,080 --> 00:15:45,920 Speaker 2: would push back on it being, like, our inevitable future 309 00:15:46,360 --> 00:15:49,760 Speaker 2: a bit, but I would say that it is normalizing 310 00:15:50,000 --> 00:15:53,960 Speaker 2: deepfaking, and I don't think we know what that 311 00:15:54,040 --> 00:15:56,720 Speaker 2: will mean for us. But I don't think it'll be good. 312 00:15:57,200 --> 00:15:59,960 Speaker 2: I think it might be entertaining, I think it might 313 00:15:59,960 --> 00:16:04,320 Speaker 2: be interesting. It is certainly a technical achievement, but I 314 00:16:04,320 --> 00:16:07,800 Speaker 2: don't consider it to be a technological advancement. I'm not 315 00:16:07,920 --> 00:16:11,080 Speaker 2: so sure it is progress. But it is a pretty 316 00:16:11,080 --> 00:16:14,320 Speaker 2: incredible thing that they've been able to pull off, and 317 00:16:14,840 --> 00:16:17,840 Speaker 2: I think that it is rational for people to look 318 00:16:17,880 --> 00:16:21,080 Speaker 2: at these videos and be pretty freaked out. And that's 319 00:16:21,120 --> 00:16:25,080 Speaker 2: what a lot of my comments are, because what isn't 320 00:16:25,120 --> 00:16:29,320 Speaker 2: clear is how this is going to improve social media 321 00:16:29,320 --> 00:16:32,320 Speaker 2: in any way, or improve our media literacy skills in any way. 322 00:16:32,640 --> 00:16:37,760 Speaker 2: There are definitely tech advancements here that can contribute to advancements 323 00:16:37,800 --> 00:16:42,680 Speaker 2: towards artificial general intelligence, like there are technical reasons that 324 00:16:42,720 --> 00:16:46,000 Speaker 2: this could be helpful in the future. But the step 325 00:16:46,040 --> 00:16:48,960 Speaker 2: that OpenAI took to release this in a social 326 00:16:49,040 --> 00:16:53,760 Speaker 2: media app was a huge jump, in my opinion, in 327 00:16:53,800 --> 00:16:56,920 Speaker 2: the wrong direction. But the technology is here to stay 328 00:16:56,960 --> 00:16:57,320 Speaker 2: for sure.
329 00:17:03,920 --> 00:17:08,000 Speaker 1: After the break, will we become desensitized to deepfakes? 330 00:17:08,520 --> 00:17:12,880 Speaker 3: Stay with us. 331 00:17:27,800 --> 00:17:30,120 Speaker 1: One thing that I can't really get over about Sora 2 332 00:17:30,240 --> 00:17:34,440 Speaker 1: is that Sam Altman is letting anybody use his likeness. 333 00:17:34,560 --> 00:17:37,840 Speaker 1: He opened his likeness to any Sora user, so I 334 00:17:37,840 --> 00:17:42,000 Speaker 1: could say Sam Altman building a snowman, for example. Why 335 00:17:42,040 --> 00:17:45,320 Speaker 1: do this, like, as the head of the company? 336 00:17:45,359 --> 00:17:48,480 Speaker 2: I can only guess. I think that it is generally 337 00:17:49,080 --> 00:17:54,480 Speaker 2: just an attempt at normalizing deepfaking people, and I think 338 00:17:54,520 --> 00:17:57,240 Speaker 2: people should be really scared of crossing that line. I 339 00:17:57,240 --> 00:17:59,560 Speaker 2: think it's a serious thing to do, and I think 340 00:17:59,640 --> 00:18:03,960 Speaker 2: OpenAI pushing everyone in that direction before anyone was even 341 00:18:04,000 --> 00:18:08,320 Speaker 2: asking for it is really frightening. You could create 342 00:18:08,359 --> 00:18:11,560 Speaker 2: deepfakes of people before, the technology to do it existed, but 343 00:18:11,760 --> 00:18:14,720 Speaker 2: there was a lot of friction and social pressure not 344 00:18:14,880 --> 00:18:18,280 Speaker 2: to do it. That friction was helpful in keeping our 345 00:18:18,359 --> 00:18:22,320 Speaker 2: information economy healthy. Even with safety features on the Sora 346 00:18:22,359 --> 00:18:25,639 Speaker 2: app, like letting you set permissions, people are 347 00:18:25,640 --> 00:18:28,880 Speaker 2: gonna mess that up. People won't know that they can 348 00:18:28,880 --> 00:18:31,479 Speaker 2: be deepfaked, and of course that's their responsibility to know. 349 00:18:31,680 --> 00:18:34,120 Speaker 2: But you've just opened up an entire can of worms. 350 00:18:34,160 --> 00:18:37,200 Speaker 2: There are other issues here, like currently you can't delete 351 00:18:37,240 --> 00:18:40,639 Speaker 2: your Sora account without deleting your entire ChatGPT account. 352 00:18:40,960 --> 00:18:41,200 Speaker 3: Wow. 353 00:18:41,359 --> 00:18:43,879 Speaker 2: And again, like, you can't pull this back. Like, in 354 00:18:44,040 --> 00:18:46,280 Speaker 2: theory you could stop people. But if you are a 355 00:18:46,280 --> 00:18:48,520 Speaker 2: public figure and you open up this can of worms, 356 00:18:48,880 --> 00:18:54,040 Speaker 2: it could really backfire. So it's Sora accelerating this 357 00:18:54,080 --> 00:18:57,399 Speaker 2: deepfake idea into a space that just hasn't been that 358 00:18:57,440 --> 00:18:59,760 Speaker 2: fully explored yet. And I don't think I'd want to 359 00:18:59,760 --> 00:19:02,920 Speaker 2: be an early adopter of this, because there's a lot of negative, 360 00:19:03,040 --> 00:19:05,520 Speaker 2: like, downside risk that I just don't think we've figured 361 00:19:05,600 --> 00:19:06,000 Speaker 2: out yet. 362 00:19:06,520 --> 00:19:09,040 Speaker 1: So you have a video where you talk about how 363 00:19:09,080 --> 00:19:12,600 Speaker 1: Sora is actually costing OpenAI about one dollar per post. 364 00:19:12,720 --> 00:19:15,760 Speaker 1: Can you explain that calculation and what it means for 365 00:19:15,840 --> 00:19:17,160 Speaker 1: Sora long term?
366 00:19:17,240 --> 00:19:19,520 Speaker 2: This was an educated guess that ended up being right. 367 00:19:19,840 --> 00:19:23,280 Speaker 2: Every video you create is basically on OpenAI's dime. So, 368 00:19:23,520 --> 00:19:26,840 Speaker 2: for example, two weeks ago, if I, as a creator, 369 00:19:26,960 --> 00:19:30,320 Speaker 2: wanted to post an AI video to TikTok or Instagram, 370 00:19:30,680 --> 00:19:33,040 Speaker 2: I would have to pay a subscription to make that 371 00:19:33,119 --> 00:19:37,400 Speaker 2: video and download it, or pay per post. So there 372 00:19:37,520 --> 00:19:42,640 Speaker 2: are commodity prices for these video models. For Google Veo 373 00:19:42,920 --> 00:19:46,000 Speaker 2: it's a dollar fifty to three dollars. Sora is currently 374 00:19:46,080 --> 00:19:51,040 Speaker 2: around a dollar. But the Sora application is free, and 375 00:19:51,160 --> 00:19:53,760 Speaker 2: anytime you create an AI video on it, it is 376 00:19:53,880 --> 00:19:57,600 Speaker 2: free to you. So, as always, I would ask the question, 377 00:19:57,760 --> 00:19:59,959 Speaker 2: if it is free, are you the product? And in 378 00:20:00,080 --> 00:20:02,960 Speaker 2: this case they are taking your data, they're taking your 379 00:20:02,960 --> 00:20:06,119 Speaker 2: face scans, they're taking your prompts. Right, so there's that 380 00:20:06,200 --> 00:20:08,040 Speaker 2: question of why are they doing this? Of course they're 381 00:20:08,040 --> 00:20:10,879 Speaker 2: also doing it to get users. But imagine you were 382 00:20:10,920 --> 00:20:15,119 Speaker 2: TikTok or Instagram and every single time someone posted a 383 00:20:15,200 --> 00:20:18,200 Speaker 2: video on your site you needed to pay a dollar. 384 00:20:18,600 --> 00:20:20,840 Speaker 2: How quickly is that going to add up for Sora? 385 00:20:21,160 --> 00:20:21,760 Speaker 1: Very quickly. 386 00:20:22,200 --> 00:20:26,280 Speaker 2: Would advertisers be able to make up that difference? Are 387 00:20:26,280 --> 00:20:28,280 Speaker 2: you going to need subscribers to help make up 388 00:20:28,320 --> 00:20:31,200 Speaker 2: that difference? I mean, video takes a ton of compute. 389 00:20:31,240 --> 00:20:34,640 Speaker 2: It is costing them GPU compute, it is costing them 390 00:20:35,119 --> 00:20:38,760 Speaker 2: opportunity cost. The GPUs could be used for other things, right? 391 00:20:39,080 --> 00:20:42,639 Speaker 2: So the fact that they chose a video social media 392 00:20:42,680 --> 00:20:45,040 Speaker 2: app, where every time someone posts on your platform it's 393 00:20:45,040 --> 00:20:48,160 Speaker 2: costing you money, is pretty confusing to me as someone 394 00:20:48,200 --> 00:20:52,440 Speaker 2: who understands that those advertiser clicks are not even close 395 00:20:52,480 --> 00:20:53,359 Speaker 2: to worth that much. 396 00:20:53,720 --> 00:20:59,199 Speaker 1: My question is, if you're Sam Altman, you oversee the 397 00:20:59,320 --> 00:21:04,280 Speaker 1: most popular AI tool on the market. Why are 398 00:21:04,280 --> 00:21:05,600 Speaker 1: you going into social media? 399 00:21:06,160 --> 00:21:09,320 Speaker 2: You're asking the right question that I think even OpenAI's 400 00:21:09,359 --> 00:21:12,480 Speaker 2: own employees are asking. There has been some reporting 401 00:21:12,680 --> 00:21:16,280 Speaker 2: on even OpenAI people being confused by this.
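To put rough numbers on the per-post cost math Jeremy walks through above, here is a minimal back-of-the-envelope sketch in Python. The per-video generation prices (about a dollar for Sora 2, a dollar fifty to three dollars for Veo) are the figures quoted in the conversation; the daily post volumes are hypothetical placeholders chosen only to show how quickly the spend scales, not numbers from the episode.

```python
# Back-of-the-envelope sketch of what "every post costs a dollar" adds up to.
# Per-video prices are the rough figures quoted above; post volumes are
# hypothetical placeholders, not numbers from the episode.

COST_PER_VIDEO_USD = {
    "Sora 2": 1.00,        # ~ $1 per generation, per the conversation
    "Veo 3 (low)": 1.50,   # low end of the quoted Veo range
    "Veo 3 (high)": 3.00,  # high end of the quoted Veo range
}

HYPOTHETICAL_POSTS_PER_DAY = [10_000, 100_000, 1_000_000]

for model, cost in COST_PER_VIDEO_USD.items():
    for posts in HYPOTHETICAL_POSTS_PER_DAY:
        daily = posts * cost
        print(f"{model}: {posts:>9,} posts/day -> "
              f"${daily:>12,.0f}/day, ${daily * 365:>15,.0f}/year")
```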
At 402 00:21:16,280 --> 00:21:19,560 Speaker 2: the end of the day, TikTok is releasing an AI generator, 403 00:21:19,600 --> 00:21:21,679 Speaker 2: I get ads for that all the time. YouTube is 404 00:21:21,720 --> 00:21:26,400 Speaker 2: putting Google Veo 3 into YouTube Shorts. Everyone's looking at 405 00:21:26,400 --> 00:21:29,680 Speaker 2: this as, how do we build, like, the AI video feed? 406 00:21:29,960 --> 00:21:34,359 Speaker 2: And it appears to me the rationale would be to 407 00:21:34,600 --> 00:21:38,160 Speaker 2: generate some sort of advertiser revenue. I think that would 408 00:21:38,200 --> 00:21:41,040 Speaker 2: be the simple answer. But whether or not that actually 409 00:21:41,040 --> 00:21:43,680 Speaker 2: works is a huge open question. 410 00:21:44,200 --> 00:21:47,280 Speaker 1: So in the future, say, Sora the app is running 411 00:21:47,320 --> 00:21:48,560 Speaker 1: ads between videos. 412 00:21:48,760 --> 00:21:50,680 Speaker 2: Yeah, absolutely. Interesting. 413 00:21:51,640 --> 00:21:54,040 Speaker 1: So in one of your videos, you say that AI 414 00:21:54,200 --> 00:21:56,720 Speaker 1: will end social media? What do you mean by that? 415 00:21:57,480 --> 00:21:59,960 Speaker 2: I think it has the potential to end the For 416 00:22:00,119 --> 00:22:03,080 Speaker 2: You page as we know it, unless the social media 417 00:22:03,119 --> 00:22:07,800 Speaker 2: companies figure out a way to filter AI content. Again, 418 00:22:08,240 --> 00:22:10,720 Speaker 2: we do not know how people are going to react 419 00:22:10,720 --> 00:22:14,719 Speaker 2: to this when it's deployed much wider. But it is 420 00:22:14,760 --> 00:22:18,840 Speaker 2: a rational thing to not want to only see AI 421 00:22:18,920 --> 00:22:21,520 Speaker 2: slop in your feed. And I say AI slop because 422 00:22:21,560 --> 00:22:24,240 Speaker 2: it's bad. Let's even assume that it's better. Let's assume 423 00:22:24,240 --> 00:22:28,560 Speaker 2: that AI video were indistinguishable. If that were the case, 424 00:22:29,000 --> 00:22:31,040 Speaker 2: would you actually want more of it in your feed, 425 00:22:31,480 --> 00:22:33,720 Speaker 2: or would you want to turn it off even more? 426 00:22:34,280 --> 00:22:36,639 Speaker 2: I don't think that we know the answers to these questions, 427 00:22:37,080 --> 00:22:41,480 Speaker 2: but it's very likely that if the companies that are running 428 00:22:41,480 --> 00:22:45,280 Speaker 2: these platforms can't figure out a way to filter out 429 00:22:45,320 --> 00:22:48,600 Speaker 2: AI content, there's a part of the population that's going 430 00:22:48,640 --> 00:22:52,359 Speaker 2: to start tuning out. There's also advertisers that might be 431 00:22:52,400 --> 00:22:55,400 Speaker 2: scared by that. So I do think it's an existential 432 00:22:55,440 --> 00:22:57,800 Speaker 2: threat to the For You page. I think it actually 433 00:22:57,880 --> 00:23:01,800 Speaker 2: might be a boon for these subscriber or Substack-type communities, 434 00:23:02,040 --> 00:23:05,200 Speaker 2: like, I think that when people start rushing towards people 435 00:23:05,200 --> 00:23:07,920 Speaker 2: that they trust, I think that that could be a really, 436 00:23:07,920 --> 00:23:11,399 Speaker 2: really positive thing.
I'll say, for me, one of the 437 00:23:11,440 --> 00:23:13,119 Speaker 2: things that I would be looking at if I were 438 00:23:13,119 --> 00:23:16,200 Speaker 2: an AI creator is the fact that, because Sora 2 439 00:23:16,480 --> 00:23:19,439 Speaker 2: is so good at making videos, it lowered the barrier 440 00:23:19,440 --> 00:23:22,400 Speaker 2: to entry so far that I don't think OpenAI 441 00:23:22,560 --> 00:23:25,760 Speaker 2: is that far from generating their own feed. You know, 442 00:23:25,760 --> 00:23:28,879 Speaker 2: if you can make an interesting video with only two sentences, well, 443 00:23:28,960 --> 00:23:32,840 Speaker 2: ChatGPT can make two sentences. They're collecting everyone's prompts, 444 00:23:32,880 --> 00:23:37,760 Speaker 2: they're seeing what gets likes and engagement on Sora. I 445 00:23:37,800 --> 00:23:40,119 Speaker 2: don't understand why they would need a human in the 446 00:23:40,160 --> 00:23:40,720 Speaker 2: loop soon. 447 00:23:41,200 --> 00:23:44,639 Speaker 1: I believe there's... Actually, we were just covering a story 448 00:23:44,680 --> 00:23:48,080 Speaker 1: in the Financial Times about Gen Z being less on 449 00:23:48,200 --> 00:23:50,840 Speaker 1: social media, and I think a lot of it has 450 00:23:50,880 --> 00:23:54,520 Speaker 1: to do with the sort of enshittification of the feed. 451 00:23:55,040 --> 00:23:57,320 Speaker 1: And I see a lot of people kind of resigned 452 00:23:57,359 --> 00:24:00,200 Speaker 1: to the fact that going on Instagram means scrolling through 453 00:24:00,359 --> 00:24:01,840 Speaker 1: a lot of shit, and a lot of shit that's 454 00:24:01,840 --> 00:24:05,320 Speaker 1: AI generated. It's no longer social media. It's like watching 455 00:24:05,359 --> 00:24:09,080 Speaker 1: fake video. Yeah, it's hyper-enshittification. It is the 456 00:24:09,200 --> 00:24:13,240 Speaker 1: most enshittified feed you could possibly have. And I 457 00:24:13,280 --> 00:24:16,960 Speaker 1: totally agree that there will be people who are 458 00:24:17,000 --> 00:24:19,560 Speaker 1: super down with that and who are going to enjoy it. 459 00:24:20,119 --> 00:24:22,520 Speaker 2: Again, there are people who enjoy this. I don't want 460 00:24:22,600 --> 00:24:25,520 Speaker 2: to say that they're doing the wrong thing by enjoying 461 00:24:25,560 --> 00:24:26,439 Speaker 2: AI video of a 462 00:24:26,480 --> 00:24:28,720 Speaker 1: fruit cutting another fruit, something like that. 463 00:24:29,040 --> 00:24:31,560 Speaker 2: Yeah. Like, I'm not here to judge what people are watching. 464 00:24:31,920 --> 00:24:36,359 Speaker 2: But if you play this out to its logical conclusion here, 465 00:24:36,720 --> 00:24:41,720 Speaker 2: it looks like social media companies generating their own videos 466 00:24:41,800 --> 00:24:46,679 Speaker 2: without creators in the middle, for a hyper-enshittified feed. 467 00:24:47,320 --> 00:24:49,840 Speaker 1: So five to ten years is a huge difference. So 468 00:24:49,920 --> 00:24:52,600 Speaker 1: let's just say, five years from now, what do you 469 00:24:52,680 --> 00:24:55,199 Speaker 1: think the state of AI video looks like, and what 470 00:24:55,240 --> 00:24:58,080 Speaker 1: does it mean for the Internet, for politics, and just 471 00:24:58,240 --> 00:24:59,439 Speaker 1: us generally as a culture? 472 00:25:00,119 --> 00:25:05,640 Speaker 2: If we project the current growth out, it is indistinguishable 473 00:25:05,720 --> 00:25:10,200 Speaker 2: and everywhere.
If we take a contrarian view, we can 474 00:25:10,280 --> 00:25:12,560 Speaker 2: see that people might not be into it and it 475 00:25:12,640 --> 00:25:15,680 Speaker 2: might lose a lot of money. We don't know which 476 00:25:15,680 --> 00:25:18,040 Speaker 2: direction it's going to go, and I don't claim to 477 00:25:18,080 --> 00:25:21,439 Speaker 2: be able to tell which direction we're going in. But 478 00:25:21,840 --> 00:25:25,000 Speaker 2: in that first scenario where it's indistinguishable, it'll still be 479 00:25:25,040 --> 00:25:30,080 Speaker 2: distinguishable by machine learning algorithms. It'll still be detectable by experts. 480 00:25:30,400 --> 00:25:34,520 Speaker 2: I still don't think it presents legal problems, but it 481 00:25:34,560 --> 00:25:39,359 Speaker 2: presents massive disinformation problems. I'm very scared about that. And 482 00:25:39,400 --> 00:25:41,920 Speaker 2: then there's another scenario, which I think is a little 483 00:25:41,960 --> 00:25:44,439 Speaker 2: bit more optimistic, which I actually subscribe to, which is 484 00:25:44,440 --> 00:25:48,080 Speaker 2: that AI content becomes its own genre. There are companies 485 00:25:48,080 --> 00:25:51,360 Speaker 2: that figure out a way to monetize it. It stays 486 00:25:51,480 --> 00:25:57,720 Speaker 2: separate from our real feeds to whatever degree the viewer wants. 487 00:25:58,119 --> 00:26:00,199 Speaker 2: And I think that this is the optimistic vision, one 488 00:26:00,280 --> 00:26:02,040 Speaker 2: that a lot of the tech community believes in too, 489 00:26:02,080 --> 00:26:04,000 Speaker 2: and that Sam Altman would probably share. You know, he's 490 00:26:04,040 --> 00:26:05,800 Speaker 2: been asked about this, he's been asked, how do we 491 00:26:05,800 --> 00:26:09,240 Speaker 2: tell what's real or fake? And I actually didn't hate 492 00:26:09,240 --> 00:26:11,600 Speaker 2: his answer. He said, well, just like we always have, 493 00:26:11,800 --> 00:26:13,800 Speaker 2: we follow the people we trust, like we have human 494 00:26:13,880 --> 00:26:18,280 Speaker 2: communication networks. Now, I think that his accelerationist view is 495 00:26:18,400 --> 00:26:20,639 Speaker 2: kind of running against that a little bit, but I 496 00:26:20,680 --> 00:26:22,880 Speaker 2: do believe that at its core, that's how we're going 497 00:26:22,880 --> 00:26:26,320 Speaker 2: to figure this out, and it might push people to be less online. 498 00:26:26,440 --> 00:26:29,359 Speaker 2: Like, I just think that there's just so many unanswered questions. 499 00:26:29,400 --> 00:26:32,359 Speaker 2: But yeah, there's a few different scenarios that, right now, 500 00:26:32,480 --> 00:26:34,159 Speaker 2: I think we just have to flip a coin on 501 00:26:34,200 --> 00:26:35,639 Speaker 2: which one we believe in. 502 00:26:37,160 --> 00:26:40,080 Speaker 1: So you said the reason that you got interested in 503 00:26:40,280 --> 00:26:45,040 Speaker 1: understanding AI video was as a tool for production. When 504 00:26:45,040 --> 00:26:47,480 Speaker 1: that was the case, what were you excited about, and 505 00:26:47,520 --> 00:26:49,520 Speaker 1: sort of why has that now changed for you? 506 00:26:50,720 --> 00:26:54,080 Speaker 2: I was excited about it lowering the barrier to doing 507 00:26:54,200 --> 00:26:57,080 Speaker 2: creative things. I have a green screen studio in my basement.
508 00:26:57,119 --> 00:26:59,199 Speaker 2: I was excited about it, you know, putting me in 509 00:26:59,200 --> 00:27:01,879 Speaker 2: different types of studios and different types of environments. I 510 00:27:01,920 --> 00:27:06,800 Speaker 2: was excited about it improving my graphics workflows. What started 511 00:27:06,840 --> 00:27:09,159 Speaker 2: steering me away from it was some of the 512 00:27:09,160 --> 00:27:11,960 Speaker 2: ethical concerns. I did realize that, at the end of 513 00:27:12,000 --> 00:27:15,720 Speaker 2: the day, like, this was mostly stolen information. It was 514 00:27:15,920 --> 00:27:19,320 Speaker 2: actually not that much more useful than the actual room 515 00:27:19,520 --> 00:27:21,600 Speaker 2: I'm in right now. Like, I can make a decent 516 00:27:21,640 --> 00:27:28,639 Speaker 2: studio myself. And really what made me turn was just 517 00:27:28,920 --> 00:27:31,680 Speaker 2: using the tools. I think a lot of the people 518 00:27:31,960 --> 00:27:36,359 Speaker 2: who are using them, who come from my background, realize 519 00:27:36,359 --> 00:27:39,320 Speaker 2: that they aren't very fun tools to use. It's not 520 00:27:39,359 --> 00:27:42,000 Speaker 2: a creative process for me. It's really frustrating. 521 00:27:42,080 --> 00:27:43,280 Speaker 1: Well, you just type something in. 522 00:27:43,400 --> 00:27:45,080 Speaker 2: You just type something in, and you hope it comes 523 00:27:45,080 --> 00:27:47,480 Speaker 2: back the way you want it. It's like, because 524 00:27:47,480 --> 00:27:49,240 Speaker 2: I have a history as a director, it is like 525 00:27:49,400 --> 00:27:52,119 Speaker 2: every time I needed to tell the actor exactly what 526 00:27:52,240 --> 00:27:55,720 Speaker 2: to say, exactly how to deliver it, over and over 527 00:27:55,840 --> 00:27:59,560 Speaker 2: and over. And as a creative person and as a director, 528 00:28:00,080 --> 00:28:02,680 Speaker 2: I just want to collaborate with people who bring something 529 00:28:02,680 --> 00:28:04,520 Speaker 2: to the table. I don't want to bring everything to 530 00:28:04,560 --> 00:28:06,479 Speaker 2: the table myself. I don't want to tell everyone how 531 00:28:06,480 --> 00:28:09,040 Speaker 2: to do everything, right? That's not what the process of 532 00:28:09,080 --> 00:28:11,919 Speaker 2: creating ever was. It was always about collaboration. It was 533 00:28:11,920 --> 00:28:14,720 Speaker 2: always a fun process. I find the idea of just 534 00:28:14,720 --> 00:28:19,119 Speaker 2: sitting in my basement creating AI videos with text is 535 00:28:19,160 --> 00:28:22,719 Speaker 2: just... it's exhausting. It doesn't feel creative at all. 536 00:28:23,320 --> 00:28:25,960 Speaker 2: But I'm not saying that people should hate every AI 537 00:28:26,080 --> 00:28:28,680 Speaker 2: video they see, like, some of them can be creative. 538 00:28:28,720 --> 00:28:32,280 Speaker 2: But yeah, it's just taking that opportunity to train yourself 539 00:28:32,320 --> 00:28:34,600 Speaker 2: to see what these video models look like. Because if 540 00:28:34,600 --> 00:28:37,439 Speaker 2: you're into it, that's totally fine, but then you're at 541 00:28:37,520 --> 00:28:40,080 Speaker 2: least ready for when it is used for disinformation, which 542 00:28:40,080 --> 00:28:41,440 Speaker 2: I think is inevitable at this point. 543 00:28:41,840 --> 00:28:45,680 Speaker 1: Well, thank you so much, Jeremy. I will be tuned 544 00:28:45,760 --> 00:28:48,680 Speaker 1: into your feed.
You are... I don't know what I 545 00:28:48,680 --> 00:28:51,520 Speaker 1: would call you. Is it vigilante justice? I don't think so. 546 00:28:51,720 --> 00:28:55,520 Speaker 1: But you're doing some kind of public service. Education. 547 00:28:55,640 --> 00:28:57,200 Speaker 3: You're an educator. Yeah, there you go. 548 00:28:57,320 --> 00:28:58,480 Speaker 1: You're an AI educator. 549 00:28:58,600 --> 00:29:22,040 Speaker 3: Yeah. 550 00:29:22,240 --> 00:29:25,520 Speaker 1: For Tech Stuff, I'm Kara Price. This episode was produced by Eliza Dennis, 551 00:29:25,560 --> 00:29:28,680 Speaker 1: Melissa Slaughter, and Tyler Hill. It was executive produced by 552 00:29:28,720 --> 00:29:32,720 Speaker 1: me, Oz Woloshyn, Julia Nutter, and Kate Osborne for Kaleidoscope 553 00:29:33,000 --> 00:29:36,680 Speaker 1: and Katrina Norvell for iHeart Podcasts. Kyle Murdoch mixed this 554 00:29:36,760 --> 00:29:39,680 Speaker 1: episode and wrote our theme song. Join us on Friday 555 00:29:39,720 --> 00:29:41,840 Speaker 1: for The Week in Tech. Oz and I will run 556 00:29:41,880 --> 00:29:44,640 Speaker 1: through the headlines you may have missed. Please rate, review, 557 00:29:44,680 --> 00:29:47,160 Speaker 1: and reach out to us at Tech Stuff Podcast at 558 00:29:47,160 --> 00:29:57,320 Speaker 1: gmail dot com.