1
00:00:21,273 --> 00:00:24,423
S1: All right. So welcome, Michelle. Good to have you back

2
00:00:24,423 --> 00:00:25,893
S1: on Unsupervised Learning.

3
00:00:26,463 --> 00:00:28,623
S2: Yeah, I'm happy to be here. Thanks, Daniel.

4
00:00:29,463 --> 00:00:33,333
S1: So you're the senior VP of product engineering and data

5
00:00:33,333 --> 00:00:37,863
S1: science at BlackBerry, and you've been on UL before. So

6
00:00:37,863 --> 00:00:40,893
S1: good to have you back. And what I wanted to

7
00:00:40,893 --> 00:00:46,323
S1: talk to you about today is deepfakes. And basically what

8
00:00:46,323 --> 00:00:50,883
S1: you're seeing around that, and I guess starting off like

9
00:00:50,913 --> 00:00:54,693
S1: what are the main cyber threats that you see deepfakes, uh,

10
00:00:54,933 --> 00:00:56,463
S1: picking up for us?

11
00:00:57,303 --> 00:01:02,943
S2: Yeah. You know, I think we're constantly getting immersed with, um,

12
00:01:02,973 --> 00:01:06,303
S2: you know, this intricate dance with innovation and malicious intent.

13
00:01:06,303 --> 00:01:09,783
S2: And I think initially we were seeing that content was

14
00:01:09,783 --> 00:01:14,343
S2: getting generated, whether textual in nature, like better phishing, more

15
00:01:14,343 --> 00:01:18,903
S2: convincing phishing, I would say, or personalized phishing, if you will,

16
00:01:18,903 --> 00:01:23,433
S2: from a content perspective, trying to reflect your browsing habits.

17
00:01:23,463 --> 00:01:27,303
S2: You know, have a phishing email that would create such

18
00:01:27,303 --> 00:01:31,083
S2: that you would probably click on them. And on that,

19
00:01:31,083 --> 00:01:35,223
S2: I think it's progressively gotten more sophisticated. And, you know,

20
00:01:35,253 --> 00:01:39,333
S2: with media, I think we're seeing, you know, generative AI

21
00:01:39,363 --> 00:01:44,373
S2: that is used for generating content with the multimodal

22
00:01:44,373 --> 00:01:49,023
S2: model technology. It basically revolutionized. I mean, the idea was

23
00:01:49,023 --> 00:01:53,493
S2: that it was mostly for the entertainment and education industry.

24
00:01:53,493 --> 00:01:55,503
S2: But on the other hand, as we are seeing with

25
00:01:55,503 --> 00:02:00,213
S2: these deepfakes, it's not just limited to phishing and textual, uh,

26
00:02:00,213 --> 00:02:03,783
S2: type of attacks or social engineering attacks, but more powerful

27
00:02:03,783 --> 00:02:09,543
S2: sort of reality, indistinguishable reality from fiction type of attacks where,

28
00:02:09,693 --> 00:02:14,193
S2: you know, deepfakes cause this dystopian vision that is becoming

29
00:02:14,193 --> 00:02:21,123
S2: a reality. Now malicious actors are creating highly convincing videos, audio,

30
00:02:21,453 --> 00:02:25,803
S2: individuals saying things that they would never say or do before.

31
00:02:25,833 --> 00:02:30,843
S2: Identity theft has been a main fragment of that that

32
00:02:30,843 --> 00:02:36,693
S2: is sort of coming into effect with these deep voice fakes.

33
00:02:36,693 --> 00:02:40,053
S2: So yeah, so I think it sort of started with

34
00:02:40,053 --> 00:02:44,763
S2: the deception and now in a full form of, um,

35
00:02:44,763 --> 00:02:46,563
S2: identity compromise.

36
00:02:47,073 --> 00:02:47,553
S3: Mhm.

37
00:02:48,633 --> 00:02:51,993
S1: Yeah. And when you, when you talk about identity compromise, uh,

38
00:02:51,993 --> 00:02:54,843
S1: what do you mean? What type of attack would that

39
00:02:54,843 --> 00:02:57,363
S1: be like? What does the scenario look like?

40
00:02:57,963 --> 00:03:01,803
S2: So um, I think, you know, if you, if you

41
00:03:01,803 --> 00:03:05,553
S2: look at like, you know, um, increasing number of deepfakes,

42
00:03:05,553 --> 00:03:11,013
S2: what we see in, in social media, even things that are, um,

43
00:03:11,013 --> 00:03:14,973
S2: you know, pretty benign people trying to lose a few years, uh,

44
00:03:14,973 --> 00:03:18,663
S2: or trying to lose a few, um, years from their

45
00:03:18,663 --> 00:03:21,693
S2: life in terms of looking more young or, you know,

46
00:03:21,723 --> 00:03:24,903
S2: seeing how they would look if they get older, uh,

47
00:03:24,903 --> 00:03:30,153
S2: even with some of these benign activities, the most concerning development

48
00:03:30,153 --> 00:03:33,813
S2: is the ability to take some of the same technology

49
00:03:33,813 --> 00:03:38,133
S2: and apply on voice and creating, you know, voice cloning, uh,

50
00:03:38,133 --> 00:03:40,833
S2: and voice fakes. And the reason why it is very

51
00:03:40,833 --> 00:03:45,993
S2: disturbing as a trend is because, uh, typically what you hear,

52
00:03:45,993 --> 00:03:48,873
S2: you're you're being trained to like, with all these years is,

53
00:03:48,903 --> 00:03:53,133
S2: is real. Like, if you recognize somebody's voice, we, um,

54
00:03:53,133 --> 00:03:57,603
S2: you know, our, our brains are trained to associate relationship

55
00:03:57,603 --> 00:04:03,243
S2: based on, uh, audio senses. So with deep voice or

56
00:04:03,243 --> 00:04:08,343
S2: voice cloning technology coming out because this enables cyber criminals

57
00:04:08,343 --> 00:04:13,893
S2: to create fake identities, um, you know, that enables people

58
00:04:13,893 --> 00:04:19,863
S2: to disclose information that they would otherwise not pass some biometric. Um,

59
00:04:19,893 --> 00:04:22,863
S2: you know, voice checks and things like that. That's what

60
00:04:22,863 --> 00:04:26,973
S2: I mean by identity, um, yeah, cloning.

61
00:04:27,543 --> 00:04:29,493
S1: Yeah, that makes sense. One way I like to think

62
00:04:29,493 --> 00:04:34,683
S1: about this is to imagine, um, when I think about

63
00:04:34,683 --> 00:04:39,243
S1: what can happen from a deepfake. I like to think

64
00:04:39,243 --> 00:04:42,813
S1: less about the deepfake itself and just imagine the impact

65
00:04:42,813 --> 00:04:46,713
S1: that it would have. So, for example, one, uh, one

66
00:04:46,713 --> 00:04:49,503
S1: of the big things is, uh, BEC attacks. And this

67
00:04:49,503 --> 00:04:54,183
S1: is before AI, right, or before modern AI. So it

68
00:04:54,183 --> 00:04:57,723
S1: was like, um, you just send an email and say, hey,

69
00:04:57,723 --> 00:04:59,943
S1: the boss wants you to transfer this money because we're

70
00:04:59,943 --> 00:05:03,333
S1: doing this merger and it's really important. And if the

71
00:05:03,333 --> 00:05:07,983
S1: email was, uh, convincing enough, then that money would would

72
00:05:07,983 --> 00:05:10,863
S1: transfer and they would lose, you know, thousands of

73
00:05:10,863 --> 00:05:14,673
S1: dollars or millions of dollars or whatever. So that would

74
00:05:14,673 --> 00:05:18,483
S1: be one. Um, and then there's other things like, uh,

75
00:05:19,053 --> 00:05:22,053
S1: you're convinced to vote a certain way, you're convinced to

76
00:05:22,083 --> 00:05:25,743
S1: have a certain opinion, a positive or negative opinion about

77
00:05:25,743 --> 00:05:29,163
S1: a person. So I like to think about the impact

78
00:05:29,163 --> 00:05:31,473
S1: of it and then be like, okay, so how do

79
00:05:31,473 --> 00:05:36,453
S1: we defend against that? Mhm. Um, because there's multiple ways

80
00:05:36,453 --> 00:05:39,723
S1: to trick you into doing something like somebody could just

81
00:05:39,723 --> 00:05:42,033
S1: get on and it's not a deepfake at all.

82
00:05:42,033 --> 00:05:45,723
S1: They just convince you that you should transfer this money

83
00:05:45,933 --> 00:05:49,623
S1: like, like, oh, you should buy this real estate. It's, uh,

84
00:05:49,653 --> 00:05:52,233
S1: you know, it's oceanside, but somehow it's in the middle

85
00:05:52,233 --> 00:05:54,723
S1: of the country and there's no ocean, but they're just

86
00:05:54,723 --> 00:05:57,633
S1: really good at talking. So they convince you. So it's

87
00:05:57,633 --> 00:06:00,393
S1: like the technique to get you to do the thing

88
00:06:01,113 --> 00:06:03,393
S1: might not be the best place to look for it,

89
00:06:03,393 --> 00:06:07,983
S1: because there's so many of those techniques. The question is

90
00:06:07,983 --> 00:06:13,233
S1: the money transfer, the vote, the, um, opening up access

91
00:06:13,233 --> 00:06:15,573
S1: to an attacker to, like, hey, I need you to

92
00:06:15,603 --> 00:06:19,353
S1: turn on remote access so I can get access in. Well,

93
00:06:19,353 --> 00:06:21,213
S1: that that would be the flag, right? What do you

94
00:06:21,213 --> 00:06:24,093
S1: think about that sort of mental framework?

95
00:06:24,423 --> 00:06:27,243
S2: Yeah, I think, you know, that is a that is

96
00:06:27,243 --> 00:06:32,283
S2: definitely the correct mental framework. I think we're talking about the, um,

97
00:06:33,123 --> 00:06:39,963
S2: the malicious intent largely has not changed. Right. Whether, like

98
00:06:39,963 --> 00:06:42,903
S2: you said, whether it's, you know, you know, getting people

99
00:06:42,903 --> 00:06:45,363
S2: to do something that they would otherwise, not in a

100
00:06:45,363 --> 00:06:50,973
S2: very simple terms. Right. And the speed at which the

101
00:06:50,973 --> 00:06:55,353
S2: act of convincing the speed, if you could map it

102
00:06:55,353 --> 00:06:59,673
S2: to the act of convincing somebody has definitely increased because

103
00:06:59,673 --> 00:07:05,823
S2: of these audiovisual, these perceptive senses, which we believe

104
00:07:05,853 --> 00:07:08,883
S2: as real. What you see is, you know, what the

105
00:07:08,883 --> 00:07:12,783
S2: reality looks like and when that is being questioned on,

106
00:07:12,813 --> 00:07:15,783
S2: you know, that sort of definitely, you know, gets it

107
00:07:15,783 --> 00:07:18,213
S2: to a point where you are now going to be

108
00:07:18,243 --> 00:07:22,953
S2: targeting the different aspects to take advantage of. You talked

109
00:07:22,953 --> 00:07:28,173
S2: about financial, social, uh, defamation, personal attacks, like, you know,

110
00:07:28,203 --> 00:07:32,103
S2: all all of these put together I think is, is

111
00:07:32,103 --> 00:07:35,973
S2: now the landscape that these actors are operating with some

112
00:07:35,973 --> 00:07:38,433
S2: of these technologies. And I think, you know, the most

113
00:07:38,433 --> 00:07:42,363
S2: concerning aspect, I'm convinced that, you know, as the technology evolves, like,

114
00:07:42,393 --> 00:07:44,013
S2: you know, there will always be this cat and mouse.

115
00:07:44,013 --> 00:07:46,983
S2: But the most concerning aspect, I think, of these deepfakes

116
00:07:47,013 --> 00:07:53,223
S2: is the potential for eroding trust. Trust from from systems

117
00:07:53,223 --> 00:07:56,793
S2: that are legitimate, that are that are true. And and

118
00:07:56,793 --> 00:08:00,873
S2: I think, you know, that that is more of these intangible, uh,

119
00:08:00,873 --> 00:08:03,303
S2: effects of this technology, I think.

120
00:08:03,723 --> 00:08:07,413
S1: Yeah. So how do you see these being used in

121
00:08:07,413 --> 00:08:11,343
S1: attack chains? So we already have existing attack chains. How

122
00:08:11,343 --> 00:08:14,043
S1: do you see these being added in or like augmented

123
00:08:14,043 --> 00:08:15,903
S1: with this technology?

124
00:08:16,113 --> 00:08:19,473
S2: Yeah. So I think, you know, we started the discussion with,

125
00:08:19,503 --> 00:08:23,253
S2: you know, phishing. Um, we talked about this disturbing trend

126
00:08:23,283 --> 00:08:27,423
S2: with textual content. Now we're seeing with, with video. Um,

127
00:08:27,423 --> 00:08:30,363
S2: and we're seeing with voice. So voice we already talked about,

128
00:08:30,363 --> 00:08:34,143
S2: for example identity. Identity masquerading, for example, you know, faking

129
00:08:34,143 --> 00:08:37,263
S2: voice identity. We've already seen in the media, for example,

130
00:08:37,293 --> 00:08:39,813
S2: some of these things playing out with millions of dollars

131
00:08:40,353 --> 00:08:45,933
S2: or potentially private information being disclosed. Right. As cyber financial crimes,

132
00:08:45,933 --> 00:08:51,183
S2: for instance. Right. Um, and in video like, you know, uh,

133
00:08:51,183 --> 00:08:56,943
S2: elections coming up in both Canada and United States, you know, uh, this, um,

134
00:08:56,943 --> 00:09:00,573
S2: this disinformation or spread of disinformation at this, at this

135
00:09:00,573 --> 00:09:05,073
S2: speed is changing public opinion visually. Uh, these are sort

136
00:09:05,073 --> 00:09:09,603
S2: of these, uh, attack vectors, I would say like changing perception,

137
00:09:09,603 --> 00:09:14,763
S2: changing or distorting reality. Um, in, in sense for, for

138
00:09:14,763 --> 00:09:18,873
S2: the mass and also in financial crimes like, you know,

139
00:09:18,903 --> 00:09:25,803
S2: leveraging this technology, uh, to defame brands, uh, create a

140
00:09:25,833 --> 00:09:28,443
S2: direct financial like, you know, you talked about millions of

141
00:09:28,443 --> 00:09:34,683
S2: dollars getting siphoned, uh, creating those. So it's all motivated in,

142
00:09:34,683 --> 00:09:38,853
S2: in those areas. So, um, and we're seeing, you know,

143
00:09:38,883 --> 00:09:41,973
S2: the reality it's no longer hypothetical. Um.

144
00:09:42,543 --> 00:09:46,083
S1: Yeah, absolutely. So there was, uh, one recently with, uh, Ferrari.

145
00:09:46,113 --> 00:09:48,813
S1: I don't know if you saw that one. It was, uh,

146
00:09:49,443 --> 00:09:55,743
S1: it was basically somebody masqueraded the CEO's voice, um, on

147
00:09:55,743 --> 00:09:58,143
S1: a phone call, and they would actually they actually had

148
00:09:58,143 --> 00:10:00,933
S1: done a bunch of stuff on WhatsApp first to get

149
00:10:00,933 --> 00:10:03,303
S1: them to the point of almost doing this thing that

150
00:10:03,303 --> 00:10:06,213
S1: they wanted. And then the final step was, hey, I

151
00:10:06,243 --> 00:10:08,793
S1: need to talk to you on the phone. So it

152
00:10:08,793 --> 00:10:12,813
S1: was Ferrari. So they're Italian, so they have the voice

153
00:10:12,813 --> 00:10:15,963
S1: of the person. They also have the right accent from

154
00:10:15,963 --> 00:10:20,433
S1: the right part of Italy. Mhm. So, um, it was

155
00:10:20,433 --> 00:10:23,973
S1: fairly convincing, but something the executive that they were talking

156
00:10:24,003 --> 00:10:27,753
S1: to and trying to trick something made them question it.

157
00:10:27,753 --> 00:10:30,813
S1: So they asked him a personal question that they knew

158
00:10:30,933 --> 00:10:35,433
S1: about the CEO. Mhm. And the fake CEO the deep

159
00:10:35,463 --> 00:10:40,143
S1: fake couldn't answer. So they ended the call. Yeah. So

160
00:10:40,173 --> 00:10:43,773
S1: so I think um that was a lucky case that

161
00:10:43,773 --> 00:10:48,603
S1: you had somebody who was suspicious. But have you seen

162
00:10:48,603 --> 00:10:52,353
S1: other similar sort of attacks where um, I guess there

163
00:10:52,353 --> 00:10:55,443
S1: was also the one where someone was convinced to send money.

164
00:10:55,443 --> 00:10:57,363
S1: I think they actually did convince them to send.

165
00:10:57,363 --> 00:11:01,953
S2: Money in a British, um, British firm. Yeah. Yeah. That's right.

166
00:11:02,013 --> 00:11:04,473
S2: Very one. You know, I think the Ferrari one is

167
00:11:04,473 --> 00:11:09,093
S2: actually a very good case study. Like, I think, you know,

168
00:11:09,123 --> 00:11:11,643
S2: in that one I haven't looked fully into it. But

169
00:11:11,673 --> 00:11:15,603
S2: on the on on the brief, um, articles that I read,

170
00:11:15,603 --> 00:11:22,563
S2: it seemed like the suspicious element was the subtle mechanical, uh,

171
00:11:22,563 --> 00:11:27,843
S2: intonations in the voice that was detected, which sort of

172
00:11:27,873 --> 00:11:32,043
S2: got them into sort of questioning. And, and it's great

173
00:11:32,043 --> 00:11:36,813
S2: for this person to have asked a, um, a shared secret,

174
00:11:36,813 --> 00:11:39,783
S2: if you will, or a previous yes question that they

175
00:11:39,783 --> 00:11:43,353
S2: would have otherwise not known. Um, but you know what? Like,

176
00:11:43,383 --> 00:11:47,073
S2: you know, my first thought reading that was like, uh,

177
00:11:47,373 --> 00:11:51,693
S2: this is real time voice cloning, right? As you're speaking. Uh,

178
00:11:51,693 --> 00:11:55,743
S2: it's sort of generating this. Right? So the compute cycles

179
00:11:55,743 --> 00:11:58,953
S2: that are required, typically for audio latencies to work is

180
00:11:58,953 --> 00:12:01,953
S2: anywhere from 9 to 42 milliseconds for a continuous stream of,

181
00:12:01,953 --> 00:12:06,003
S2: of communication. And this will get progressively better. Like if

182
00:12:06,003 --> 00:12:08,463
S2: it were a static recording that you're playing like a

183
00:12:08,493 --> 00:12:12,213
S2: like a video. Um, that voice overlay or an audio,

184
00:12:12,213 --> 00:12:16,113
S2: for example, uh, you know, uh, or audio voice notes,

185
00:12:16,113 --> 00:12:19,503
S2: for example. Uh, they would be spot on because it

186
00:12:19,503 --> 00:12:22,473
S2: would have had the compute power necessary. And the algorithms

187
00:12:22,473 --> 00:12:25,293
S2: that we have today and that are used by these

188
00:12:25,293 --> 00:12:28,983
S2: threat actors. But I bet that this particular technology of

189
00:12:28,983 --> 00:12:32,763
S2: real time cloning, which is as I'm speaking, it is

190
00:12:32,793 --> 00:12:37,863
S2: sort of, um, transferring the the audio nuances to somebody

191
00:12:37,863 --> 00:12:43,173
S2: else's Mercury would get just better, uh, over time. So

192
00:12:43,173 --> 00:12:46,713
S2: this is quite, quite concerning. But the other one, I think,

193
00:12:46,743 --> 00:12:49,413
S2: you know, there is, I came across Resemble AI or

194
00:12:49,413 --> 00:12:52,083
S2: Resemble.ai, I forget, like, you know, they track quite a

195
00:12:52,083 --> 00:12:56,163
S2: few of these incidents like, you know, worldwide. Um, for

196
00:12:56,163 --> 00:13:03,183
S2: many of these fake like whether it's robocall AI misinformation. Um, and,

197
00:13:03,213 --> 00:13:05,823
S2: you know, I recently read a report from, from Deloitte,

198
00:13:05,823 --> 00:13:11,043
S2: I think the fastest growing forms of adversarial AI, uh, like,

199
00:13:11,073 --> 00:13:15,333
S2: you know, deepfake related are all on financial crimes, like

200
00:13:15,333 --> 00:13:19,773
S2: financial losses. And they're projecting about over 12 north of

201
00:13:19,773 --> 00:13:26,433
S2: 12 billion in 2023, uh, which was the case, and 40 billion, um,

202
00:13:26,463 --> 00:13:30,963
S2: by 2027 in aggregate. So which is, you know, is

203
00:13:30,963 --> 00:13:34,203
S2: growing at an astounding rate over, I don't know, like

204
00:13:34,323 --> 00:13:41,493
S2: 12 to 40, like 25, 30%. Yeah. You know, compounding rate. Um,

205
00:13:41,493 --> 00:13:44,343
S2: and you know, in one of those reports, you know,

206
00:13:44,703 --> 00:13:50,793
S2: especially for the financial crimes Deloitte reported, these fakes are proliferating, um,

207
00:13:50,823 --> 00:13:54,303
S2: mostly in the banking and financial services as being the

208
00:13:54,303 --> 00:13:56,313
S2: main target. Hmm.

209
00:13:57,303 --> 00:14:03,933
S1: Interesting. So as the stuff gets better and like you said,

210
00:14:03,933 --> 00:14:09,483
S1: it becomes indistinguishable. Like there's no way to tell the difference. Mhm. Um,

211
00:14:10,383 --> 00:14:13,743
S1: so one one thing is the voice sounds better, but

212
00:14:13,833 --> 00:14:18,993
S1: what I feel like if there's extra context the more

213
00:14:19,023 --> 00:14:21,843
S1: the attacker knows about the thing. So imagine this is

214
00:14:21,843 --> 00:14:25,203
S1: a fully automated AI attack. That's even worse. So it's

215
00:14:25,203 --> 00:14:28,233
S1: not even a voice, a real-time clone of a person.

216
00:14:28,233 --> 00:14:32,163
S1: It's actually like an AI agent that's just calling and

217
00:14:32,163 --> 00:14:35,613
S1: trying to get the things to happen. But but it

218
00:14:35,613 --> 00:14:39,273
S1: has been given a full database about everything about you

219
00:14:39,273 --> 00:14:43,653
S1: or about me or about this, uh, Italian executive. So

220
00:14:43,653 --> 00:14:45,333
S1: it knows, like, the name of their dog because it

221
00:14:45,333 --> 00:14:48,483
S1: got it from, like, Instagram. Right? And it's got all

222
00:14:48,483 --> 00:14:51,393
S1: this personal data about like, it knows the wife's name

223
00:14:51,393 --> 00:14:56,373
S1: and everything or whatever. So how how do you defend

224
00:14:56,373 --> 00:14:59,623
S1: against that? So let's say it's a perfect voice. It's

225
00:14:59,623 --> 00:15:04,363
S1: perfectly real time, but it also has deep knowledge from

226
00:15:04,573 --> 00:15:08,293
S1: open source intelligence about the actual perpetrator.

227
00:15:08,323 --> 00:15:09,643
S4: Yeah, yeah.

228
00:15:09,883 --> 00:15:12,703
S2: No, I think, you know, it's, uh, as you're saying, right?

229
00:15:12,733 --> 00:15:17,323
S2: I mean, this type of, uh, generative AI, adversarial AI,

230
00:15:17,353 --> 00:15:20,983
S2: you know, it creates new attack vectors with all of

231
00:15:20,983 --> 00:15:25,603
S2: this information, multimodal information that no one sees coming, and

232
00:15:25,603 --> 00:15:30,733
S2: it creates a more complex, nuanced threat landscape, um, that,

233
00:15:30,763 --> 00:15:37,063
S2: you know, prioritizes identity driven attacks, and it'll only get better. Um,

234
00:15:37,063 --> 00:15:41,713
S2: so in the short term, the way I think about

235
00:15:41,713 --> 00:15:44,983
S2: this and most companies that that I talk to their

236
00:15:44,983 --> 00:15:48,553
S2: CISOs is like training, right? The first thing is at

237
00:15:48,583 --> 00:15:55,003
S2: least deepfake detection training, which is recognizing inconsistencies in facial expression, uh,

238
00:15:55,053 --> 00:16:00,933
S2: audio quality or video quality. Uh, before disclosing sensitive information, uh,

239
00:16:00,933 --> 00:16:06,033
S2: validate using a verification protocol, uh, of contacting them through

240
00:16:06,033 --> 00:16:09,783
S2: known means, like, for example, through their, uh, known contacts,

241
00:16:09,813 --> 00:16:12,723
S2: like phone or ask them questions like you talked about

242
00:16:12,723 --> 00:16:15,303
S2: this shared secret, like ask them something that you know

243
00:16:15,333 --> 00:16:18,693
S2: they would otherwise have not shared or disclosed. Uh, create

244
00:16:18,693 --> 00:16:25,023
S2: these these validation, uh, protocols, um, and have people be

245
00:16:25,023 --> 00:16:31,773
S2: more aware of social engineering from, from an awareness perspective. Right. Um,

246
00:16:32,733 --> 00:16:35,433
S2: I mean, that's the, you know, that's just to get

247
00:16:35,433 --> 00:16:39,843
S2: get to the short term problem just with exposure and training, really.

248
00:16:39,843 --> 00:16:45,033
S2: But on a more broader term, I think social engineering awareness, uh,

249
00:16:45,033 --> 00:16:49,143
S2: basically should should drive these regulatory verification process. I think

250
00:16:49,143 --> 00:16:52,173
S2: the governments have have a part to play here to

251
00:16:52,203 --> 00:16:59,613
S2: make content provenance or identity, um, you know, spoofing using AI, uh,

252
00:16:59,613 --> 00:17:03,333
S2: as a mechanism, as a mandatory safeguarding, like in Canada,

253
00:17:03,333 --> 00:17:07,143
S2: for example, I can say, uh, most of the provinces

254
00:17:07,143 --> 00:17:11,253
S2: have enacted legislation on sharing non-consensual media, for example. So

255
00:17:11,253 --> 00:17:14,763
S2: the acts are already there, uh, in identity fakes, for example.

256
00:17:14,763 --> 00:17:18,123
S2: But that needs to be put on AI. Um, the No

257
00:17:18,153 --> 00:17:21,873
S2: AI FRAUD Act, uh, that was introduced in the US

258
00:17:21,873 --> 00:17:24,783
S2: House of Representatives, I think, uh, earlier this year I

259
00:17:24,783 --> 00:17:29,553
S2: believe is a good first step. Right. Uh, fraud itself.

260
00:17:29,613 --> 00:17:32,253
S2: There are regulations around that. There's laws around it. We

261
00:17:32,253 --> 00:17:35,763
S2: just need the governments to to catch up to the

262
00:17:35,763 --> 00:17:39,633
S2: level at which the technology is progressing and, and create

263
00:17:39,633 --> 00:17:43,773
S2: it within, within the AI framework. Um, yeah.

264
00:17:43,983 --> 00:17:46,773
S1: I I'm sorry. Go ahead.

265
00:17:47,013 --> 00:17:49,113
S2: No, no, I was just saying that, you know, training

266
00:17:49,113 --> 00:17:52,173
S2: and then the regulatory criteria. But there are things in

267
00:17:52,173 --> 00:17:54,393
S2: the technology side as well that we can do. But perhaps,

268
00:17:54,423 --> 00:18:00,003
S2: you know, the immediately, uh, there are these two things

269
00:18:00,003 --> 00:18:01,173
S2: that come to mind.

270
00:18:02,223 --> 00:18:05,883
S1: Yeah. Well, one thing I worry about there is that

271
00:18:06,873 --> 00:18:09,693
S1: I agree that government is going to get involved and

272
00:18:09,693 --> 00:18:15,033
S1: should get involved, but I think about spam calls in

273
00:18:15,033 --> 00:18:16,863
S1: the US. I don't know about Canada. In the US,

274
00:18:16,893 --> 00:18:21,753
S1: it's still very bad. Um, I basically have an allow list. Mhm. Um,

275
00:18:21,783 --> 00:18:26,013
S1: and all other calls just go directly to voicemail because

276
00:18:26,013 --> 00:18:30,903
S1: I couldn't handle it any other way. Mhm. Um spam

277
00:18:30,903 --> 00:18:35,283
S1: calls are already illegal. Fraud is already illegal. So I

278
00:18:35,313 --> 00:18:39,573
S1: if the government says it's illegal to do fraud with AI,

279
00:18:40,383 --> 00:18:43,113
S1: if it's already illegal to do fraud, I'm not sure

280
00:18:43,113 --> 00:18:46,983
S1: what exactly. Like who would be willing to do the

281
00:18:46,983 --> 00:18:51,843
S1: fraud without AI. Mhm. Even though it's illegal. But once

282
00:18:51,843 --> 00:18:54,153
S1: the new law came out they would be like oh

283
00:18:54,183 --> 00:18:56,133
S1: well now it's illegal with AI so I'm not going

284
00:18:56,163 --> 00:18:56,853
S1: to do it.

285
00:18:57,063 --> 00:18:58,623
S3: Mhm. Yeah.

286
00:18:58,653 --> 00:19:01,953
S2: No I think you know and that's where so, so

287
00:19:01,953 --> 00:19:03,963
S2: I think the that's a great point. So the first

288
00:19:03,963 --> 00:19:06,693
S2: thing is you know educate. Right. We all need to

289
00:19:06,693 --> 00:19:09,903
S2: be aware that reality is getting distorted and be aware

290
00:19:09,903 --> 00:19:12,423
S2: of our surroundings. Yes. Number two we do need some

291
00:19:12,423 --> 00:19:16,053
S2: guardrails regulatory guardrails. Right. So that, you know, at least

292
00:19:16,083 --> 00:19:20,733
S2: it it puts some checks and balances. Or if somebody

293
00:19:20,763 --> 00:19:24,093
S2: were to to file a complaint, there is a legal

294
00:19:24,093 --> 00:19:26,343
S2: framework to act upon. So today for example, if I

295
00:19:26,373 --> 00:19:29,733
S2: went and said something like this happened. Um, the legal

296
00:19:29,733 --> 00:19:34,803
S2: framework is not there to support, uh, let's say the, the,

297
00:19:34,803 --> 00:19:37,323
S2: the legal proceedings that would follow this.

298
00:19:37,623 --> 00:19:38,733
S3: Yeah, that's a good point.

299
00:19:38,763 --> 00:19:41,463
S2: Yeah. But also there is a technology side to it, right. Like,

300
00:19:41,493 --> 00:19:43,413
S2: I mean, if you take a step back and if

301
00:19:43,413 --> 00:19:46,233
S2: you think about it, look it will get progressively better.

302
00:19:46,233 --> 00:19:48,693
S2: And why do I think about it that way? Because,

303
00:19:48,723 --> 00:19:51,453
S2: you know, this branch of generative AI, you know, works

304
00:19:51,693 --> 00:19:55,563
S2: in a way like like, you know, we talked about this,

305
00:19:55,563 --> 00:19:59,433
S2: I think, in our previous discussion about adversarial networks. Right.

306
00:19:59,463 --> 00:20:05,403
S2: Adversarial networks, GANs, like, or, or various types of autoencoders,

307
00:20:05,403 --> 00:20:07,263
S2: for example. Typically, the way it works is like you

308
00:20:07,263 --> 00:20:11,313
S2: have two pairs of network, right? One, you know, deep

309
00:20:11,313 --> 00:20:14,643
S2: neural network is called the generator that's generating this content.

310
00:20:14,643 --> 00:20:17,403
S2: And the other one is a discriminator and it's critiquing

311
00:20:17,403 --> 00:20:20,103
S2: the content. Simplest way to think about that. So the

312
00:20:20,103 --> 00:20:24,003
S2: generator will get better and better as the discriminator gets

313
00:20:24,003 --> 00:20:28,473
S2: better at critiquing, critiquing the generation. Right. Yeah. Zero-sum game.

314
00:20:28,473 --> 00:20:31,803
S2: So the very technology that is sort of enabling to

315
00:20:31,833 --> 00:20:34,173
S2: detect and identifying what is a fake and what is

316
00:20:34,173 --> 00:20:38,703
S2: real in itself is assisting in getting better at the content.

317
00:20:38,703 --> 00:20:41,223
S2: So that's why, you know, I think about this as,

318
00:20:41,253 --> 00:20:46,653
S2: as the need to accelerate the framework, the regulatory framework

319
00:20:46,683 --> 00:20:52,113
S2: for authenticity detection technology. So accelerate the innovation on cryptographically

320
00:20:52,113 --> 00:20:54,963
S2: securing generative AI content through fingerprint.

321
00:20:55,203 --> 00:20:56,733
S1: Okay. I like that.

322
00:20:56,763 --> 00:21:00,423
S2: Simplicity. You know, the government regulatory criteria can come in

323
00:21:00,423 --> 00:21:03,483
S2: and enforce the authenticity. Like we all need to learn

324
00:21:03,483 --> 00:21:06,513
S2: how to verify you know, what is valid and what

325
00:21:06,513 --> 00:21:08,763
S2: is authentic content. Just look at how we learned how

326
00:21:08,763 --> 00:21:10,983
S2: to check if a browser is safe, right? There is

327
00:21:10,983 --> 00:21:11,673
S2: a little.

328
00:21:12,063 --> 00:21:12,873
S3: Um yep.

329
00:21:13,203 --> 00:21:16,353
S2: Lock at the left hand side. Oh, okay. It's safe. Right?

330
00:21:16,383 --> 00:21:18,843
S2: SSL and all that. Right. The same way we need

331
00:21:18,843 --> 00:21:20,553
S2: to sort of, you know, do the same thing to

332
00:21:20,583 --> 00:21:25,773
S2: learn how to check if media content is safe. And,

333
00:21:25,803 --> 00:21:29,283
S2: you know, we consume our content through browsers. Right. I mean,

334
00:21:29,283 --> 00:21:33,483
S2: all of this thing is manifesting itself in the very

335
00:21:33,483 --> 00:21:37,563
S2: vehicle that's delivering it, which is these browsers. So there.

336
00:21:37,563 --> 00:21:38,343
S3: Is a way and.

337
00:21:38,343 --> 00:21:39,933
S1: Mobile apps and mobile apps.

338
00:21:39,963 --> 00:21:40,503
S3: And mobile.

339
00:21:40,503 --> 00:21:42,153
S2: Apps. But at the end of the day, there is

340
00:21:42,153 --> 00:21:46,023
S2: a way and and we consume most of the content

341
00:21:46,053 --> 00:21:50,913
S2: using these sort of instruments. So we can see how

342
00:21:50,913 --> 00:21:53,883
S2: I'm drawing these parallels, that if we sort of accelerate

343
00:21:53,913 --> 00:21:59,013
S2: the innovation on cryptographically securing and validating this

344
00:21:59,013 --> 00:22:04,233
S2: fingerprinted content, then we should be able to tackle this technology, right?

345
00:22:04,263 --> 00:22:04,743
S2: I mean.

346
00:22:05,073 --> 00:22:07,983
S1: So I like that. Yeah. Is it. So it's when

347
00:22:07,983 --> 00:22:11,193
S1: you're saying legislation you're not talking about make it illegal

348
00:22:11,193 --> 00:22:12,993
S1: to do bad things because it's already. No no.

349
00:22:12,993 --> 00:22:13,893
S3: No. Yeah. Yeah.

350
00:22:13,923 --> 00:22:22,083
S1: You're talking about requiring, uh, technology providers to have an

351
00:22:22,083 --> 00:22:25,593
S1: authenticity mechanism. It's really funny you say that because I

352
00:22:25,593 --> 00:22:28,143
S1: was going to bring up a similar point. So, um,

353
00:22:28,143 --> 00:22:30,903
S1: we're in Zoom right now. You see, when I'm talking,

354
00:22:30,903 --> 00:22:34,053
S1: you see, I have a green outline around. Mhm. When

355
00:22:34,053 --> 00:22:37,383
S1: you talk, you have a green outline. And in

356
00:22:37,383 --> 00:22:40,803
S1: this case that's to indicate that that person is talking. Right.

357
00:22:40,833 --> 00:22:46,713
S1: Obviously however I've been thinking about exactly what you said.

358
00:22:46,713 --> 00:22:53,313
S1: Which is, um, what if Apple and Android and YouTube

359
00:22:53,553 --> 00:22:59,583
S1: and all of Meta, they had a mechanism where, um,

360
00:23:00,243 --> 00:23:03,093
S1: so let's say I'm talking to you on the phone. Mhm.

361
00:23:03,123 --> 00:23:06,843
S1: When I initiate my phone call to you. Mhm. Um,

362
00:23:06,873 --> 00:23:10,863
S1: it's using my secure enclave on my phone. And so

363
00:23:10,893 --> 00:23:14,373
S1: it's verifying that I got in with my face or

364
00:23:14,373 --> 00:23:18,033
S1: my finger or, um, Touch ID or whatever.

365
00:23:18,063 --> 00:23:19,053
S2: Authenticity. Yeah.

366
00:23:19,293 --> 00:23:22,623
S1: Yeah. So it authenticates that. And then when the call

367
00:23:22,623 --> 00:23:25,623
S1: comes over to you, you see something, you see a

368
00:23:25,623 --> 00:23:28,023
S1: blue outline, you see a green outline, you see a

369
00:23:28,023 --> 00:23:31,353
S1: check mark, just like you said with the lock symbol.

370
00:23:31,383 --> 00:23:35,823
S1: So now when we're having a conversation, it's validated. So

371
00:23:35,853 --> 00:23:38,433
S1: same same with this. Like you said, we could do

372
00:23:38,463 --> 00:23:41,553
S1: you could do a real time deepfake pretty soon. So

373
00:23:41,553 --> 00:23:44,763
S1: it looks like I'm talking to Shil, but it's not

374
00:23:44,763 --> 00:23:47,973
S1: actually you. But if there was a green outline or

375
00:23:47,973 --> 00:23:51,843
S1: a check mark, that would mean that some combination of

376
00:23:51,843 --> 00:23:58,623
S1: my operating system and Zoom had validated and continued to validate.

377
00:23:58,743 --> 00:24:01,053
S1: So maybe in the middle of our conversation, we both

378
00:24:01,053 --> 00:24:04,203
S1: get prompts because we've been talking for ten minutes. We

379
00:24:04,203 --> 00:24:06,483
S1: have to re-authenticate the feed. Mhm.

380
00:24:07,173 --> 00:24:07,413
S3: Mhm.

381
00:24:07,443 --> 00:24:11,973
S2: So yeah you know you you're spot on. And you

382
00:24:11,973 --> 00:24:17,313
S2: know I almost think of this as a joint government

383
00:24:17,313 --> 00:24:20,733
S2: and industry partnership. Like you know I'm often I have

384
00:24:20,733 --> 00:24:24,933
S2: the opinion that this is definitely a solvable problem. Yeah.

385
00:24:24,963 --> 00:24:28,713
S2: And as industry thought leaders, we have to establish

386
00:24:28,713 --> 00:24:33,963
S2: what this common identity assertion protocol is and standardize that.

387
00:24:33,963 --> 00:24:37,083
S2: And then all of these companies that are in the

388
00:24:37,083 --> 00:24:43,593
S2: business of of creating media or transmitting media or exchanging media,

389
00:24:43,803 --> 00:24:48,183
S2: sort of, you know, um, adhere to it. Like, I mean,

390
00:24:48,213 --> 00:24:52,473
S2: you know, I'm, I'm actually quite, um, encouraged to see

391
00:24:52,473 --> 00:24:56,013
S2: the DARPA very recently, uh, to deal with at least the, uh,

392
00:24:56,013 --> 00:25:01,773
S2: face swapping technology and the puppeteering technology, which which is

393
00:25:01,773 --> 00:25:06,363
S2: also a phenomenally interesting branch of generative models is where

394
00:25:06,363 --> 00:25:08,943
S2: your expressions you are still the same person, your identity

395
00:25:08,943 --> 00:25:11,943
S2: is not swapped, but your expressions are. But to deal

396
00:25:11,943 --> 00:25:14,763
S2: with that, they initiated a new research, I think called

397
00:25:14,763 --> 00:25:19,233
S2: the Media Forensic Research Acceleration Program R&D program, if you will,

398
00:25:19,263 --> 00:25:24,963
S2: to identify detection methods for fake digital visual media. Right. Um,

399
00:25:24,993 --> 00:25:28,293
S2: and that will tackle, you know, those sort of things.

400
00:25:28,293 --> 00:25:31,983
S2: But I think in order to deal with the identity

401
00:25:32,013 --> 00:25:34,143
S2: side of things or validating, I think, you know, what

402
00:25:34,173 --> 00:25:37,053
S2: what you're suggesting is, is a great way of of

403
00:25:37,053 --> 00:25:39,663
S2: tackling that. In fact, I was recently reading a paper

404
00:25:39,693 --> 00:25:45,513
S2: like there is this cryptographic mechanism called um, uh, zero

405
00:25:45,513 --> 00:25:51,723
S2: knowledge proof. And the idea is, is quite simple. It's basically, um,

406
00:25:52,323 --> 00:25:55,413
S2: there is something that both you and I know without

407
00:25:55,413 --> 00:25:59,673
S2: you disclosing what you know, I'm able to verify, uh,

408
00:25:59,673 --> 00:26:03,003
S2: whether your claims are true or not.
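
To make the zero-knowledge idea concrete, here is a minimal sketch of one well-known scheme, the Schnorr identification protocol. It is an illustration chosen for this note, not the scheme from the paper mentioned above. In the usual formulation only the prover holds the secret; the verifier holds just a public value derived from it. The group parameters P, Q, G and all function names are illustrative assumptions, and the numbers are far too small for real use.

```python
import secrets

# Toy parameters (assumption, NOT secure): P = 2*Q + 1 with P and Q prime,
# and G = 4 generates the subgroup of order Q modulo P.
P, Q, G = 2039, 1019, 4


def keygen():
    """Prover picks a secret x and publishes y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1  # secret in [1, Q-1]
    return x, pow(G, x, P)


def commit():
    """Prover picks a one-time nonce r and sends t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def challenge():
    """Verifier replies with a random challenge c."""
    return secrets.randbelow(Q)


def respond(r, c, x):
    """Prover answers s = r + c*x mod Q; s alone does not reveal x."""
    return (r + c * x) % Q


def verify(y, t, c, s):
    """Verifier checks G^s == t * y^c (mod P) without ever seeing x."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


if __name__ == "__main__":
    x, y = keygen()
    r, t = commit()
    c = challenge()
    print("claim accepted:", verify(y, t, c, respond(r, c, x)))
```

The check works because G^(r + c*x) equals G^r times (G^x)^c, so a prover who knows x always passes, while someone guessing a different secret fails for every nonzero challenge.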

409
00:26:03,033 --> 00:26:04,413
S1: Almost like Diffie-Hellman.

410
00:26:05,193 --> 00:26:08,613
S2: Yeah, except in Diffie-Hellman. Yeah, yeah, except that there is.

411
00:26:08,823 --> 00:26:11,283
S1: But but, um. But the middle person doesn't get to

412
00:26:11,433 --> 00:26:13,113
S1: see the thing exchanged.

413
00:26:13,263 --> 00:26:17,613
S2: Exactly. Yeah. And this zero knowledge proof is, is actually

414
00:26:17,613 --> 00:26:20,163
S2: applied in many different places. It's not new, but what

415
00:26:20,163 --> 00:26:23,433
S2: is new here is the application of zero knowledge proof

416
00:26:23,433 --> 00:26:29,403
S2: in authenticating and privacy maintaining hardware. Like, um, you know,

417
00:26:29,433 --> 00:26:32,463
S2: I came across a company that's sort of dabbling with

418
00:26:32,463 --> 00:26:37,323
S2: this called snark. They're using zero knowledge proof microphones,

419
00:26:37,323 --> 00:26:40,413
S2: which you know, can prove the audio was indeed recorded

420
00:26:40,413 --> 00:26:44,223
S2: in that thing. Some media companies like Canon and Nikon

421
00:26:44,223 --> 00:26:48,093
S2: are dabbling with zero knowledge imaging technology, whereby they can

422
00:26:48,093 --> 00:26:52,173
S2: ascertain that this was actually captured using light rays coming

423
00:26:52,173 --> 00:26:53,583
S2: in a camera lens.

424
00:26:53,613 --> 00:26:54,993
S3: Oh, interesting.

425
00:26:54,993 --> 00:26:58,173
S2: And or if it has been edited in a particular way.

426
00:26:58,173 --> 00:27:03,753
S2: But this whole audio visual industry coalition content provenance, authenticity

427
00:27:03,783 --> 00:27:06,393
S2: is a serious topic. And it is time that, you know,

428
00:27:06,423 --> 00:27:11,043
S2: we we find ways to certify, uh, source of digital content,

429
00:27:11,073 --> 00:27:14,943
S2: how it was generated. Um, and things that you just

430
00:27:14,943 --> 00:27:20,253
S2: talked about ascertain before these gadgets, like mobile devices and

431
00:27:20,253 --> 00:27:23,973
S2: other such things are used to validate these things right now,

432
00:27:23,973 --> 00:27:27,483
S2: if you think about it. Right. Privacy, uh, or identity

433
00:27:27,513 --> 00:27:31,263
S2: verification is only used for content that you own. Like

434
00:27:31,293 --> 00:27:34,803
S2: for example, my phone, for example, is going to ask

435
00:27:34,803 --> 00:27:36,993
S2: me for my password and a bunch of other things

436
00:27:36,993 --> 00:27:41,583
S2: before it shows me my stuff, right? Or opens. But

437
00:27:41,583 --> 00:27:44,073
S2: that has no bearing when I'm calling you, for example.

438
00:27:44,103 --> 00:27:47,763
S2: Like you have no idea. Right. So so this this

439
00:27:47,763 --> 00:27:50,913
S2: thing is, as you mentioned, is actually now very important

440
00:27:50,913 --> 00:27:55,293
S2: is all of this identity validation that happened on the phone.

441
00:27:55,953 --> 00:28:00,963
S2: This trust needs to be expanded into the other entity

442
00:28:00,963 --> 00:28:04,893
S2: that you're interacting with. So the trust network needs to be, um,

443
00:28:04,893 --> 00:28:10,713
S2: shared using whatever methodology that these companies choose. But yeah,

444
00:28:10,743 --> 00:28:12,033
S2: or standardized.

445
00:28:12,303 --> 00:28:18,783
S1: Yeah. So I think that's correct. Um, I'm also thinking

446
00:28:18,783 --> 00:28:22,773
S1: about a thing that you said earlier when you mentioned puppeteering. Um, yeah.

447
00:28:22,803 --> 00:28:24,543
S1: I didn't know there was a name for this, but

448
00:28:24,543 --> 00:28:26,733
S1: I think it might be the same thing I'm thinking of.

449
00:28:26,763 --> 00:28:30,573
S1: So the the avatar on the other side looks like

450
00:28:30,573 --> 00:28:32,283
S1: this young, like.

451
00:28:32,703 --> 00:28:33,363
S3: Um.

452
00:28:34,173 --> 00:28:38,883
S1: Like, uh, almost like anime looking, uh, influencer girl. And she's, like,

453
00:28:38,883 --> 00:28:41,703
S1: really animated and, you know, pretty and everything, and she's

454
00:28:41,703 --> 00:28:45,033
S1: talking about whatever the topic is. And then you see

455
00:28:45,063 --> 00:28:48,063
S1: right next to it, it's actually like a 47 year

456
00:28:48,093 --> 00:28:52,233
S1: old male, and he's the one actually doing all of

457
00:28:52,233 --> 00:28:57,573
S1: the emoting and everything. And, um, it is real time

458
00:28:57,573 --> 00:29:04,293
S1: face swapping and real time hand swapping, uh, costume clothes, everything.

459
00:29:04,323 --> 00:29:09,543
S1: So that raises an interesting point. Uh, based on everything

460
00:29:09,543 --> 00:29:10,653
S1: we talked about.

461
00:29:11,103 --> 00:29:11,583
S3: At.

462
00:29:11,583 --> 00:29:18,603
S1: The start of the call, um, there was authentication that happened.

463
00:29:18,603 --> 00:29:22,893
S1: So they got a green box. Um, but then this

464
00:29:22,893 --> 00:29:26,253
S1: technology is now on. So now it looks like this

465
00:29:26,253 --> 00:29:30,723
S1: other person. Um, so what? This just got me thinking,

466
00:29:30,723 --> 00:29:33,273
S1: and I hadn't thought of this before. You know, this

467
00:29:33,273 --> 00:29:38,703
S1: reminds me of, uh, gaming situations where, um, games, there's

468
00:29:38,703 --> 00:29:42,603
S1: so much hacking happening in games that, uh, a lot

469
00:29:42,633 --> 00:29:47,043
S1: of game vendors switched to basically having to run a rootkit.

470
00:29:47,313 --> 00:29:47,673
S3: Mhm.

471
00:29:48,423 --> 00:29:51,993
S1: So they need end to end, top to bottom deep

472
00:29:51,993 --> 00:29:55,923
S1: kernel implementation to know that you do not have some

473
00:29:55,953 --> 00:30:01,473
S1: sort of shim. Yeah. Some sort of injection capability inside

474
00:30:01,473 --> 00:30:04,443
S1: of the thing. And it's looking at all the processes

475
00:30:04,443 --> 00:30:07,293
S1: that are running. It's looking for evidence of malware. It's

476
00:30:07,293 --> 00:30:11,463
S1: looking for evidence of tampering. So, so the question is

477
00:30:11,493 --> 00:30:13,533
S1: like if we start a video call and then I

478
00:30:13,533 --> 00:30:17,493
S1: start software like that. Mhm. That technology needs to be

479
00:30:17,493 --> 00:30:20,313
S1: able to know that I'm using the puppet technology and

480
00:30:20,313 --> 00:30:22,953
S1: that there's an interception and translation happening.

481
00:30:22,983 --> 00:30:23,553
S3: Correct.

482
00:30:23,583 --> 00:30:26,373
S2: Yeah. And I think you know that's a that's a

483
00:30:26,373 --> 00:30:32,283
S2: very well, uh, explained kind of, uh, thought process. And

484
00:30:32,283 --> 00:30:36,063
S2: that's also one of the reasons why I think it's

485
00:30:36,063 --> 00:30:38,943
S2: not just that. And I talked about this zero knowledge

486
00:30:39,123 --> 00:30:44,403
S2: proof mechanism for authenticity built into the hardware. Um, because,

487
00:30:44,403 --> 00:30:49,353
S2: you see, if my camera is showing my video and

488
00:30:49,353 --> 00:30:53,763
S2: if the camera in the live stream, this hardware authenticates

489
00:30:53,763 --> 00:30:58,203
S2: that the video stream is basically what it is processing

490
00:30:58,203 --> 00:31:03,483
S2: using the, uh, the ZK hardware, uh, research I talked about.

491
00:31:03,513 --> 00:31:08,883
S2: Then any cross stream or stream mixing in the middle, um,

492
00:31:08,883 --> 00:31:12,423
S2: the receiving software should be able to validate the authenticity

493
00:31:12,423 --> 00:31:15,633
S2: that this is not what the camera captured. Right? Yes.

494
00:31:15,663 --> 00:31:18,663
S2: And that is the key, right? It you know, it's

495
00:31:18,663 --> 00:31:21,333
S2: not just at one level. It has to be that

496
00:31:21,333 --> 00:31:25,533
S2: the trust has to go all the way from the

497
00:31:25,533 --> 00:31:30,213
S2: physical level. Right? The lights and everything around here. To

498
00:31:30,243 --> 00:31:34,743
S2: what the camera sensor captures, to how the media gets digitized.

499
00:31:34,893 --> 00:31:38,583
S2: So just trying to tackle that at the digital media layer

500
00:31:38,583 --> 00:31:44,673
S2: is insufficient. It needs to have the analog. Um, it

501
00:31:44,673 --> 00:31:48,873
S2: needs to have the analog ancillary to also transport this

502
00:31:48,873 --> 00:31:54,603
S2: authenticity and validation mechanism back for for real time communication,

503
00:31:55,023 --> 00:31:58,953
S2: whether it's an audio microphone or a video camera or sensor. Yeah.

504
00:31:59,433 --> 00:32:01,953
S1: Yeah. I love what you're saying there, because I love

505
00:32:01,953 --> 00:32:05,283
S1: the fact that the hardware itself is involved. So to

506
00:32:05,283 --> 00:32:12,093
S1: your point, Canon and Nikon. So it's almost

507
00:32:12,093 --> 00:32:14,043
S1: like they would have their own version of like a

508
00:32:14,043 --> 00:32:17,733
S1: secure enclave or something similar where it's like, that's a

509
00:32:17,733 --> 00:32:21,333
S1: protected system. It's the one doing the signing at the

510
00:32:21,363 --> 00:32:24,123
S1: at the camera hardware level, which is part of a

511
00:32:24,123 --> 00:32:26,883
S1: later signature which is passed on. That's right. So it's

512
00:32:26,883 --> 00:32:29,883
S1: like this chain of custody where it's an unbroken thing.
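
The chain-of-custody idea can be sketched as a chain of tags, where each stage signs the frame digest together with the previous stage's tag. This is only a sketch under stated assumptions: a real design would use asymmetric signatures rooted in the camera hardware, whereas this stand-in uses HMAC with shared keys from the Python standard library, and the stage names and keys in the usage example are hypothetical.

```python
import hashlib
import hmac


def sign_stage(key: bytes, frame_digest: bytes, prev_tag: bytes) -> bytes:
    """One custody step: bind this stage's key to the frame and the prior tag."""
    return hmac.new(key, prev_tag + frame_digest, hashlib.sha256).digest()


def build_chain(frame: bytes, stage_keys: list) -> list:
    """Tag the frame at every stage (e.g. sensor, then OS, then call app)."""
    digest = hashlib.sha256(frame).digest()
    tags, prev = [], b""
    for key in stage_keys:
        prev = sign_stage(key, digest, prev)
        tags.append(prev)
    return tags


def verify_chain(frame: bytes, stage_keys: list, tags: list) -> bool:
    """Recompute the chain; any altered frame or missing stage breaks it."""
    if len(tags) != len(stage_keys):
        return False
    expected = build_chain(frame, stage_keys)
    return all(hmac.compare_digest(a, b) for a, b in zip(expected, tags))


if __name__ == "__main__":
    keys = [b"camera-key", b"os-key", b"app-key"]  # hypothetical stage keys
    frame = b"raw sensor bytes"
    tags = build_chain(frame, keys)
    print("untouched frame:", verify_chain(frame, keys, tags))
    print("swapped frame:", verify_chain(b"deepfake bytes", keys, tags))
```

Because every tag depends on the one before it, a frame substituted after the camera stage cannot reproduce the later tags without the earlier keys, which is the unbroken-chain property described above.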

513
00:32:29,913 --> 00:32:30,273
S3: Yeah.

514
00:32:30,303 --> 00:32:32,403
S2: But the challenge with that, Daniel, is that, you see,

515
00:32:32,613 --> 00:32:35,163
S2: right now, the fragmentation in this space is going to

516
00:32:35,163 --> 00:32:39,243
S2: be devastating. Like, that's the worst thing that can happen. Yeah. Fragmentation.

517
00:32:39,243 --> 00:32:42,243
S2: Meaning that okay, one person is doing or one company

518
00:32:42,273 --> 00:32:43,833
S2: doing it this way. The other company is doing it

519
00:32:43,833 --> 00:32:46,503
S2: that way. And there is no like, you know, um,

520
00:32:46,503 --> 00:32:48,783
S2: so so that's why I think it's very important that

521
00:32:48,783 --> 00:32:54,393
S2: the Logitech camera can interact with, say, my phone camera,

522
00:32:54,393 --> 00:32:57,783
S2: for example, or, you know, the interoperability of this. So like,

523
00:32:57,813 --> 00:33:01,713
S2: imagine if your web browser did something different, uh, of

524
00:33:01,713 --> 00:33:04,983
S2: SSL and something else did something different. It's going to

525
00:33:04,983 --> 00:33:08,133
S2: be just chaos. So standardization of this mechanism to

526
00:33:08,163 --> 00:33:12,843
S2: tackle deepfake authenticity of digital media, whether it's

527
00:33:12,843 --> 00:33:15,093
S2: stored media or media in transit.

528
00:33:15,393 --> 00:33:15,963
S3: Um, you know.

529
00:33:16,083 --> 00:33:19,113
S1: Now that I'm thinking about this, I think you're right

530
00:33:19,113 --> 00:33:24,003
S1: about that, because I think what will probably happen is

531
00:33:24,063 --> 00:33:26,823
S1: the what we will agree on is we agree to

532
00:33:26,853 --> 00:33:32,313
S1: trust Zoom and then Zoom on each of our sides.

533
00:33:32,433 --> 00:33:36,903
S1: Does the camera validation because the camera got some sort

534
00:33:36,903 --> 00:33:42,693
S1: of certification from somebody like Apple or macOS or Windows? Correct.

535
00:33:42,723 --> 00:33:47,823
S1: So Zoom trusts the camera. Therefore Zoom signs it. Therefore

536
00:33:47,823 --> 00:33:52,473
S1: your side agrees because Zoom signed both sides.

537
00:33:52,503 --> 00:33:53,853
S3: Sure. Something like that.

538
00:33:53,883 --> 00:33:57,573
S2: That works. That works too. And yeah, that's you know,

539
00:33:57,603 --> 00:33:59,793
S2: that that is a kind of standardization. But I was

540
00:33:59,793 --> 00:34:02,223
S2: going a little bit broader. I was saying that we

541
00:34:02,223 --> 00:34:06,873
S2: should like, we should almost go to the layers of

542
00:34:06,873 --> 00:34:12,933
S2: network communication, the same way how we communicate with streams

543
00:34:12,933 --> 00:34:17,493
S2: like we do. Like you talked about Diffie-Hellman, I'm talking about,

544
00:34:17,523 --> 00:34:22,653
S2: you know, um, stream establishment at the internet protocols. Authenticity.

545
00:34:22,683 --> 00:34:23,643
S3: Oh, sure.

546
00:34:23,673 --> 00:34:27,213
S2: So I think it's time for us to look at, like.

547
00:34:27,243 --> 00:34:30,693
S2: I mean, we can keep patching this stuff, right? We

548
00:34:30,693 --> 00:34:33,663
S2: can we can keep creating these glues and, you know,

549
00:34:33,693 --> 00:34:37,563
S2: but I think it's time to to take a step

550
00:34:37,563 --> 00:34:43,293
S2: further and start, um, you know, um, the contracts of

551
00:34:43,293 --> 00:34:46,233
S2: this authenticity of the hardware, the data, like the same

552
00:34:46,233 --> 00:34:50,013
S2: way how we digitize the data. We need to embed

553
00:34:50,523 --> 00:34:54,003
S2: some of these validation mechanisms right into the protocol.

554
00:34:54,573 --> 00:34:58,803
S1: You know, honestly, we should, um, not not perfectly on topic,

555
00:34:58,803 --> 00:35:02,643
S1: but we should actually collaborate on this because, um, I

556
00:35:02,643 --> 00:35:05,013
S1: don't think it's going to be easy for a small

557
00:35:05,013 --> 00:35:07,143
S1: company to do this. I think this is really going

558
00:35:07,173 --> 00:35:11,373
S1: to be like a consortium. Mhm. Um, but I used

559
00:35:11,373 --> 00:35:14,493
S1: to be at Apple. Um, I still know a lot

560
00:35:14,523 --> 00:35:16,683
S1: of people over there. I know a lot of people

561
00:35:16,683 --> 00:35:19,983
S1: are thinking about this, but I am very surprised that

562
00:35:19,983 --> 00:35:22,203
S1: I have not heard more people talk about what you

563
00:35:22,203 --> 00:35:30,063
S1: just said. So for example, um, IPsec, uh, Rijndael, like

564
00:35:30,093 --> 00:35:34,893
S1: all the fundamental protocols, uh, the fundamental algorithms, what is

565
00:35:34,893 --> 00:35:40,203
S1: an underlying base standard like TCP/IP, like TLS? Um, um,

566
00:35:40,773 --> 00:35:44,943
S1: you know, is it, uh, are we doing public key

567
00:35:44,973 --> 00:35:48,213
S1: for the exchange? Are we doing symmetric for the for the, uh,

568
00:35:48,213 --> 00:35:53,763
S1: the communication? Yeah. So it's like all those things need

569
00:35:53,793 --> 00:35:56,403
S1: to be considered and built into like, like you said,

570
00:35:56,403 --> 00:36:01,083
S1: a fundamental protocol which includes the authentication piece, which includes

571
00:36:01,083 --> 00:36:05,373
S1: the re-prompting for authentication over certain periods of time

572
00:36:05,373 --> 00:36:08,523
S1: based on, uh, so for example, here would be a

573
00:36:08,523 --> 00:36:12,273
S1: great like method for the, uh, thing. Uh, you have

574
00:36:12,273 --> 00:36:18,003
S1: a policy established during the, the initiation of the call

575
00:36:18,243 --> 00:36:21,483
S1: so that if certain things are being talked about. It

576
00:36:21,483 --> 00:36:25,503
S1: up-levels the requirements so it prompts both sides

577
00:36:25,503 --> 00:36:26,463
S1: more often.

578
00:36:26,613 --> 00:36:27,633
S3: Mhm. Mhm.

579
00:36:27,663 --> 00:36:30,003
S1: For revalidation. Yeah. Things like that.
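
The policy idea from this exchange, where sensitive topics shorten the time between re-authentication prompts, could look roughly like the sketch below. Everything here is a hypothetical illustration: the class name, the keyword list, and the two-minute interval are assumptions; only the ten-minute baseline echoes the number mentioned earlier in the conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReauthPolicy:
    """Policy agreed by both sides when the call is set up (hypothetical)."""
    base_interval_s: int = 600       # re-prompt every ten minutes by default
    sensitive_interval_s: int = 120  # assumed tighter interval
    sensitive_topics: frozenset = frozenset(
        {"password", "wire transfer", "payment"}
    )

    def interval_for(self, recent_text: str) -> int:
        """Pick the interval based on what is currently being discussed."""
        text = recent_text.lower()
        if any(topic in text for topic in self.sensitive_topics):
            return self.sensitive_interval_s
        return self.base_interval_s


def needs_reauth(policy: ReauthPolicy, seconds_since_auth: int,
                 recent_text: str) -> bool:
    """True when both sides should be prompted to revalidate."""
    return seconds_since_auth >= policy.interval_for(recent_text)


if __name__ == "__main__":
    policy = ReauthPolicy()
    print(needs_reauth(policy, 300, "how was your weekend"))
    print(needs_reauth(policy, 300, "please send the wire transfer today"))
```

A keyword match is the simplest possible trigger; a real protocol would need a far more careful definition of what raises the requirement level.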

580
00:36:30,033 --> 00:36:33,063
S2: Yeah absolutely. And you know you talk about Apple and

581
00:36:33,063 --> 00:36:35,493
S2: I think it's interesting right. Apple is in a unique

582
00:36:35,493 --> 00:36:42,393
S2: place to really solve deepfakes, because they have

583
00:36:42,393 --> 00:36:48,183
S2: full control of the end-to-end ecosystem, if you will.

584
00:36:48,213 --> 00:36:49,083
S3: Yes.

585
00:36:49,173 --> 00:36:51,903
S2: Um, all the way from the hardware to the content

586
00:36:51,933 --> 00:36:54,063
S2: to the method by which that content is distributed, and they have

587
00:36:54,063 --> 00:37:00,303
S2: a statistically significant density of communities that interact with that content. Um, so,

588
00:37:00,303 --> 00:37:02,673
S2: so that, that, you know, that's one aspect. And the

589
00:37:02,673 --> 00:37:05,463
S2: other aspect is I think, you know, if you look

590
00:37:05,493 --> 00:37:10,143
S2: at the rate at which the technology is evolving, um,

591
00:37:11,553 --> 00:37:23,013
S2: deepfakes are probably significantly impacting our ability to tell what

592
00:37:23,043 --> 00:37:28,473
S2: reality looks like and/or eroding trust in systems.

593
00:37:28,773 --> 00:37:29,253
S3: Yep.

594
00:37:29,283 --> 00:37:31,713
S2: And that is massively concerning.

595
00:37:32,133 --> 00:37:37,713
S1: Yeah I agree. Yeah. One thing I just realized is, um,

596
00:37:37,983 --> 00:37:41,343
S1: I would love like a little I think this is

597
00:37:41,343 --> 00:37:43,953
S1: probably coming soon with AI agents. So you have like

598
00:37:43,983 --> 00:37:47,583
S1: a little bot that is watching this chat. And one

599
00:37:47,583 --> 00:37:52,323
S1: of the things it would have reported is, um, Shil's

600
00:37:52,323 --> 00:37:57,843
S1: background looks like a real background that is blurred. Daniel's

601
00:37:57,843 --> 00:38:03,663
S1: background looks to be AI generated. So I, I'm watching

602
00:38:03,663 --> 00:38:05,853
S1: him very carefully to make sure he doesn't have six

603
00:38:05,853 --> 00:38:08,853
S1: fingers or something. You know what I mean? So you

604
00:38:08,853 --> 00:38:10,953
S1: could just have an alert that's like right off the

605
00:38:10,953 --> 00:38:14,373
S1: start before we even started. It's a fake background.

606
00:38:14,403 --> 00:38:14,883
S3: Mhm.

607
00:38:15,183 --> 00:38:16,053
S1: You know what I mean.

608
00:38:16,083 --> 00:38:16,683
S3: Yeah. No.

609
00:38:16,683 --> 00:38:21,153
S2: Absolutely. Yeah. I think, um, I think there are various

610
00:38:21,153 --> 00:38:26,073
S2: ways to solve this, but, you know, um, there are

611
00:38:26,073 --> 00:38:27,663
S2: things that can be done in the short term. There

612
00:38:27,693 --> 00:38:29,313
S2: are things that can be done in the mid-term. But

613
00:38:29,313 --> 00:38:31,863
S2: I think, you know, if we're talking about thought leadership,

614
00:38:31,863 --> 00:38:33,993
S2: vision as to where we're going, I think it's time

615
00:38:33,993 --> 00:38:39,813
S2: for us to kind of, you know, uh, rethink what

616
00:38:39,813 --> 00:38:42,303
S2: we are doing and how we're going to deal with

617
00:38:42,333 --> 00:38:46,473
S2: fakes in general. Digital fakes. AI is helping make them better.

618
00:38:46,593 --> 00:38:49,203
S3: But yeah, yeah, yeah, yeah.

619
00:38:49,203 --> 00:38:52,053
S1: I think the way you're talking about it is exactly correct.

620
00:38:52,053 --> 00:38:55,743
S1: Ultimately it's a trust issue. So anything that is eroding

621
00:38:55,743 --> 00:38:58,593
S1: that trust is really the problem. And that's where we start.

622
00:38:58,623 --> 00:38:59,133
S3: Exactly.

623
00:38:59,163 --> 00:39:02,283
S1: And then we start with that trust problem. And then

624
00:39:02,283 --> 00:39:06,753
S1: you start thinking about a trust protocol a more fundamental

625
00:39:06,783 --> 00:39:11,523
S1: technology protocol like TCP/IP, like HTTP, something, you know,

626
00:39:11,553 --> 00:39:12,843
S1: at a deeper, more fundamental.

627
00:39:12,873 --> 00:39:14,163
S3: Yeah, yeah.

628
00:39:14,193 --> 00:39:17,163
S2: And you know, Daniel, there is also another important thing here. Like,

629
00:39:17,193 --> 00:39:20,703
S2: you know, some of these things were developed for entertainment purposes. Like,

630
00:39:20,703 --> 00:39:23,673
S2: if you think about it, the very premise. Right. If

631
00:39:23,673 --> 00:39:29,553
S2: you go look up in GitHub and you search for, uh, FS, uh, GAN, uh,

632
00:39:29,553 --> 00:39:34,473
S2: facial expression and you'll see like incredible research papers and

633
00:39:34,473 --> 00:39:39,543
S2: then implementations of them. And for the majority, uh, the

634
00:39:39,573 --> 00:39:42,213
S2: goal is to demonstrate what the technology is capable of.

635
00:39:42,213 --> 00:39:45,453
S2: Some of the first applications were for fun and. Yeah.

636
00:39:45,453 --> 00:39:47,643
S2: So what like, you know, I sent a picture or

637
00:39:47,643 --> 00:39:52,503
S2: video that looks, uh, five, ten years, um, you know,

638
00:39:52,533 --> 00:39:55,923
S2: of my age taken off, right? As long as I

639
00:39:55,923 --> 00:39:58,803
S2: do not claim, I think it's perfectly fine. As long

640
00:39:58,803 --> 00:40:01,713
S2: as they say, hey, you know, look, this is. And

641
00:40:01,713 --> 00:40:04,203
S2: there is no claims made that this is who I

642
00:40:04,203 --> 00:40:07,443
S2: am or this is what it is. The problem becomes

643
00:40:07,443 --> 00:40:09,363
S2: when some like, you know. So the root of the

644
00:40:09,363 --> 00:40:12,903
S2: problem is, is a fake, whether it's deep or not,

645
00:40:12,933 --> 00:40:15,993
S2: I think is is it or AI generated is a

646
00:40:15,993 --> 00:40:17,793
S2: different point altogether?

647
00:40:18,513 --> 00:40:19,323
S3: Yeah.

648
00:40:19,383 --> 00:40:22,563
S1: No, I think that's right. It's it's a great point

649
00:40:22,563 --> 00:40:28,323
S1: because there's a harmless removal of 15 years of age. Mhm.

650
00:40:28,803 --> 00:40:31,593
S1: But if it's a guy and he's trying to get

651
00:40:31,593 --> 00:40:36,813
S1: a model, uh, modeling job and the modeling company stands

652
00:40:36,813 --> 00:40:40,473
S1: to lose money from this contract being signed, now that

653
00:40:40,473 --> 00:40:42,153
S1: innocent thing is no longer innocent.

654
00:40:42,183 --> 00:40:43,023
S3: Exactly.

655
00:40:43,053 --> 00:40:46,323
S2: Yeah. Exactly. Which is why, you know, the technology is

656
00:40:46,323 --> 00:40:48,783
S2: just enabling. And that's why my points were like, we

657
00:40:48,783 --> 00:40:52,593
S2: need to find a way to deal with the technology. Mhm.

658
00:40:53,223 --> 00:40:56,253
S1: Yeah. So any any tips for people to learn more

659
00:40:56,253 --> 00:40:57,363
S1: about this.

660
00:40:57,663 --> 00:41:02,283
S2: Yeah I think you know um like we recently did

661
00:41:02,313 --> 00:41:04,263
S2: like the threat research team and the and the data

662
00:41:04,263 --> 00:41:09,693
S2: science team, uh, did some work to, to publish this um,

663
00:41:09,723 --> 00:41:12,813
S2: thing from BlackBerry about deepfakes. I encourage people to

664
00:41:12,813 --> 00:41:15,063
S2: read it. I think they're going to find it informative.

665
00:41:15,063 --> 00:41:18,183
S2: It's developed in a language that is very easy to understand,

666
00:41:18,183 --> 00:41:21,903
S2: and I think right now I would encourage people to

667
00:41:21,933 --> 00:41:25,023
S2: sort of learn about these things of what's possible. Right.

668
00:41:25,053 --> 00:41:27,963
S2: That's the first thing that at least you're skeptical when

669
00:41:27,963 --> 00:41:30,633
S2: you see something or your antennas kind of pick up

670
00:41:30,633 --> 00:41:35,013
S2: something that you otherwise might not have. So

671
00:41:35,043 --> 00:41:37,563
S2: awareness, I think, is the key at this time.

672
00:41:38,133 --> 00:41:40,473
S1: Okay. Yeah, we'll definitely put the link to that in

673
00:41:40,473 --> 00:41:44,433
S1: the show notes. Um, any predictions for like the next

674
00:41:44,463 --> 00:41:46,323
S1: year or two or three years?

675
00:41:47,703 --> 00:41:52,533
S2: Um, well, I think this technology is going to get

676
00:41:52,533 --> 00:41:57,063
S2: progressively better. You're going to see more hyper-realistic content.

677
00:41:57,063 --> 00:42:00,393
S2: In fact, you're going to start seeing full body, not

678
00:42:00,393 --> 00:42:04,773
S2: just faces and expression puppeteering. I think you're going to see,

679
00:42:04,803 --> 00:42:09,003
S2: you know, hyper-realistic content. You're going to see content

680
00:42:09,003 --> 00:42:13,773
S2: interacting with other content in social settings. You're going to

681
00:42:13,773 --> 00:42:19,563
S2: see more personalized attacks through this mechanism. Uh, you know,

682
00:42:19,593 --> 00:42:22,653
S2: public figures or people you dislike. You're going to be

683
00:42:22,683 --> 00:42:27,933
S2: able to start propaganda and the availability of these tools like,

684
00:42:27,963 --> 00:42:30,843
S2: I mean, from $5 to $15 a month, from a

685
00:42:30,843 --> 00:42:34,143
S2: subscription perspective, you can create some of this stuff, uh,

686
00:42:34,143 --> 00:42:36,783
S2: with a bit of programming. You can go download these

687
00:42:36,783 --> 00:42:39,573
S2: GitHub projects and do your own, if you will. Um,

688
00:42:39,573 --> 00:42:43,623
S2: you know, the like, you know, the possibility is limitless.

689
00:42:43,623 --> 00:42:47,883
S2: So deepfake as a technology will continue to evolve because

690
00:42:47,883 --> 00:42:53,283
S2: it does stoke a, a, uh, in a reason for

691
00:42:53,283 --> 00:42:56,703
S2: why we do certain things that, that are, uh, not

692
00:42:56,703 --> 00:42:59,043
S2: the best moral grounds, if you will. So it will

693
00:42:59,043 --> 00:43:03,753
S2: become more sophisticated, harder to detect. The very technology

694
00:43:03,753 --> 00:43:08,493
S2: that is required to do this, um, is going

695
00:43:08,493 --> 00:43:12,363
S2: to basically enable this growth. And the challenge will

696
00:43:12,363 --> 00:43:15,903
S2: be there in the coming years unless we as a

697
00:43:15,903 --> 00:43:18,393
S2: community do something about it.

698
00:43:18,903 --> 00:43:21,003
S1: Yeah. So the better that stuff gets, the more we're

699
00:43:21,003 --> 00:43:23,673
S1: going to need the types of controls that you talked about.

700
00:43:23,703 --> 00:43:24,303
S3: Exactly.

701
00:43:24,333 --> 00:43:25,893
S2: Yeah, absolutely.

702
00:43:25,923 --> 00:43:28,623
S1: Where can we learn more about you and your team

703
00:43:28,623 --> 00:43:29,973
S1: and the work that you're doing?

704
00:43:30,423 --> 00:43:33,123
S2: Uh, we, uh. That's great. Like, you know, we have

705
00:43:33,123 --> 00:43:36,693
S2: a data science research blog where we publish, uh, things

706
00:43:36,693 --> 00:43:41,823
S2: that we learn, um, from time to time, uh, at BlackBerry, um,

707
00:43:41,853 --> 00:43:46,413
S2: papers that we publish. Um, so, so I welcome, uh,

708
00:43:46,413 --> 00:43:49,653
S2: people reaching out if they want. Uh, I always love

709
00:43:49,653 --> 00:43:52,533
S2: to have a great conversation. Some of these conversations we

710
00:43:52,533 --> 00:43:55,773
S2: had were very insightful. Um, yeah.

711
00:43:56,523 --> 00:43:59,463
S1: Okay. Well, awesome. Well, it's great to have you back. And, uh,

712
00:43:59,493 --> 00:44:02,403
S1: great conversation, as always. I appreciate the time.

713
00:44:02,583 --> 00:44:04,053
S2: Hey, thanks a lot, Daniel. Thanks.

714
00:44:04,083 --> 00:44:04,503
S1: All right.

715
00:44:04,533 --> 00:44:06,003
S3: Take care. Bye.