Speaker 1: Pushkin. Earlier this year, an employee working in Hong Kong for an international company got a weird message from one of his colleagues. He was supposed to make a secret transfer of millions of dollars. It seems sketchy. It obviously seems sketchy, so he got on a video call with a bunch of people, including the company's CFO, the chief financial officer. The CFO said the request was legit, so the employee did what he was told. He transferred roughly twenty-five million dollars to several bank accounts. As it turned out, the CFO on the video call was not really the CFO. It was a deepfake, an AI-generated twin created from publicly available audio and video of the real CFO. By the time the company figured out what was going on, it was too late. The money was gone.

Speaker 1: I'm Jacob Goldstein and this is What's Your Problem, the show where I talk to people who are trying to make technological progress. My guest today is Ali Shahriyari. He's the co-founder and chief technology officer at the audaciously named Reality Defender. Ali's problem is this: how can you use AI to protect the world from AI? More specifically, how do you build a set of models to spot the difference between reality and AI-generated deepfakes? How'd you get into the defending reality business?

Speaker 2: Yeah, so when I started, it was around actually generating videos and deepfakes.

Speaker 1: So you were attacking reality before you were defending it.

Speaker 2: I wouldn't say we were attacking anything, but we were definitely looking into this technology. And this is way back, before all this stuff kind of went crazy. This is back in like twenty nineteen, around that time. So we were building digital twins, and we were looking at, how do you make it so that it looks realistic? Is it a cartoon-looking thing? Is it like a Unity 3D thing?
Speaker 2: And then that's when we started to see these early research papers where they were taking someone's face and putting it on a video and blending it in, and it looked really good, and we were like, oh, maybe we can do the digital twins that way. And while we were in that business, we were like, you know, probably in a few years someone can download an app and just make anything very easily. And that's kind of the origins of how we started. We're very mission driven. What we're trying to do here is really protect the world and people from the dangers of AI, but in a way where, you know, we want people not to abuse the technology. We love AI, we just don't want it to be abused.

Speaker 1: So let's talk about this sort of deepfake detection, gen AI detection market more generally. Who's selling deepfake detection right now, and who's buying? What does the sort of market landscape look like?

Speaker 2: The type of clients that we have right now are banks. For example, we are currently live with one of the largest banks in the world. When you call that bank, the audio goes through our deepfake detection models and we're able to tell the call center this person might be a deepfake. And part of that is, that's actually happened. Someone's called the bank and they've transferred money out. And actually, this goes back to twenty nineteen, so the first incident of deepfake fraud...

Speaker 1: Twenty nineteen, that we're aware of. Right, right, exactly. So what happened in twenty nineteen?
Speaker 2: Yeah, so this is back when this was early and nobody really knew about this. And there was a CEO that called a smaller company. It was a parent company calling the child company, the CEO calling the other CEO, and he wanted to transfer some money out, and it sounded like him, and the guy transferred, I think it was in the UK, about two hundred to three hundred thousand dollars out. And that was like one of the first ones that we know of.

Speaker 1: And they got away with it, I believe.

Speaker 2: I believe so, yeah.

Speaker 1: And there was an instance earlier this year, right, where, I think it was in Hong Kong, some employee was on a Zoom call with the company's CFO, and the CFO was like, you know, wire twenty-five million dollars or something to some bank account. And then the employee did it, and it turned out the CFO on the call was a deepfake, right?

Speaker 2: Yeah.

Speaker 1: Were they your client?

Speaker 2: They were not our clients, unfortunately. But this shows how quickly the technology is evolving. You know, twenty nineteen, audio. Fast forward a few years, now you've got a Zoom call, there's a bunch of people on it, and they all look like people you know, and they're all deepfakes.

Speaker 1: So you were starting to mention banks are some of your main clients. Who are some of your other main clients?

Speaker 2: Media companies. Think of some of the big ones; they're using our product this year, especially with the election. You know, back in twenty twenty, we thought it would be a problem. It wasn't. This year we think it is a big problem, for sure. I think we were early, but it's already happening everywhere, even this year. This year is the largest election year in the world, more than fifty percent of the people are voting, and we already have documented cases of election issues with deepfakes.

Speaker 1: Okay. Media companies, banks. Any other kind of big categories of clients?
Speaker 2: Yeah, so other ones are government agencies. But in the end, we believe everyone needs this product. It shouldn't be up to the people to decide or figure out if something's a deepfake. If you're on a social media platform, you shouldn't have to figure out, hey, is this person real or not. It should just be built in, and anyone should be able to use it.

Speaker 1: Well, are social media companies either buying or building deepfake detection tools, or do they want to stay out of that business and be like, no, we don't want to be in the business of saying yes, this is real, no, this isn't real?

Speaker 2: I can tell you we've been in contact and have talked to some social media platforms. I think one issue is they don't have to flag these things. It's up to them, right; there's not a lot of regulation. So I know they're thinking about it. We've chatted with some, but that's the extent of it.

Speaker 1: So okay. So let's talk about how it works, and there's two ways that I want to talk about how it works. One is from the point of view of the user, whoever that may be, and the other is sort of what's going on under the hood, right? So let's start with the point of view of the user. If I'm a whatever, a bank, a university, a media company who is paying for your service, how does it work for me?

Speaker 2: Depends on exactly the user and the use case. If, let's say, it's a media company, they're looking at maybe filtering through a lot of content, so content moderation. Actually, that would be more like a social media company: they're looking at content moderation. Maybe they're looking at millions of assets and they want to quickly flag those things, if they were in that business. The bank, for the example I gave, the issue is someone could call in and biometrics fail. By the way, if you call a bank, some banks say, repeat after me: my voice is my passport. That actually fails now, what do you think? So a bank wants to make sure the person calling in is actually that person. This is more relevant to private banking, where there's actually a one-on-one relationship between the client and the bank.
Speaker 1: And so in that case, so let's take that case. So in that case, someone calls in and talks to their banker. They're a rich person who has a private banker, basically, is what you're talking about, right? So this rich person calls in and talks to their private banker. Is the system just always running in the background in that case? And, like, how does it work from the point of view of the private banker?

Speaker 2: Sure, and I have to be careful what I say here, but the high level is, the models are listening, and if they detect a potential deepfake, they will alert the call center. That person will get a notification, so it's integrated into their existing workflow. They'll get a notification that says, hey, this...

Speaker 1: Does the person get, like, a text or a Slack or something they're using, saying, you're talking to a deepfake?

Speaker 2: No, they're using software for the bank. They're still using their software, and there's a dashboard. In that scenario, they escalate. So they might say, let me ask you some more questions, or let me call you back.

Speaker 1: Huh. Let me call you back is a super safe one, right? Because if they have a relationship, probably they know the number. They just call them back.

Speaker 2: Yeah, absolutely.

Speaker 1: Okay. And then how does it work for, like, when you say... I presume, by the way, that you can't name your clients. You said a media company and a bank. It's secret who they are?

Speaker 2: Yeah, we're not allowed to.

Speaker 1: Okay. So let's say a media company. How's it work for a media company?

Speaker 2: Their use case is slightly different, especially right now, as I mentioned, around the election. So there might be something that's starting to go viral in the news, and they want to check, hey, is this real or not? I will say, with something like this, usually when something goes viral the damage is already done.
Speaker 1: Yes. Although if you're, whatever, the New York Times or the Wall Street Journal, you don't want to repeat the viral lie. Part of your business model is people are paying to subscribe to you because you are more reliable.

Speaker 2: Right, exactly. So that's why they come to us. They upload the assets, and our web app returns the results.

Speaker 1: I see. So it's just like, you just go to whatever, Reality Defender dot whatever, and you upload the viral video, and your machine says it's a fake.

Speaker 2: Yeah. So we give results and probabilities; we don't have the ground truth, so we give a probability. There's several different models running, so we use an ensemble of models. We have different models looking at different things, and we give an overall score averaging those. In the case of a video, we actually highlight the areas of a deepfake. If the person is speaking and they're a fake, there'll be a red box around them. If they're real, there'll be a green box around them.

Speaker 1: Well, that latter part sounds more binary, as opposed to probabilistic.

Speaker 2: We give both. So yeah, there's a probability score and there's the visual.

Speaker 1: And so the probabilistic score is basically, according to our model, there's a seventy percent chance that this is fake, something of that nature.

Speaker 2: According to our ensemble of models.

Speaker 1: Yes, yeah. Our model of models, our fund of funds of models.

Speaker 2: Exactly.
Speaker 1: So okay, so you're actually leading us toward what's under the hood, right? I'm interested in discussing this on a few levels. There is the sort of broad, beyond-Reality-Defender level: you know, what are the basic ways that the technology works? Like, how does deepfake detection, gen AI detection, work in a broad way? Can you talk me through that?

Speaker 2: Absolutely, yeah. There's currently two ways people are looking at this problem. Number one is provenance. For example, you watermark media that you create. Maybe you watermark it or you digitally sign it, maybe you put it on a blockchain somewhere, or something like that. But basically there's a source of truth that this video is real, and there's a watermark. That's number one.

Speaker 1: But we're concerned, we're concerned with instances where that is not the case, right? Our world is full of videos today that are not clearly watermarked, blockchained, whatever, for provenance. So we have this problem. What are the ways people are solving it?

Speaker 2: Yeah, the second way is how we're solving it, which is basically, we use AI to detect AI, which we call inference. So we train AI models, as I mentioned, a bunch of them, to look at various aspects of, let's say, video.

Speaker 1: So, like, is a sort of generative adversarial network the right term? I mean, it seems like, if I were making up how to do this, I'd be like, well, I'm gonna have one model that's cranking out really good deepfakes, but I'll know which ones are the deepfakes, and then I'm gonna feed the deepfakes and the real ones to my other model and score it on how well it does, and it'll get really good at figuring out the difference.

Speaker 2: Yeah, that's actually exactly how a lot of these work. For example, there's a website you can go to where it just generates a person every time you go to it, a GAN, right, and that's actually using a GAN to generate that person. So, the way we detect, and I can give a little bit more detail here: for example, one of our models, which we actually removed, was looking at blood flow. So yeah, imagine, in this video, if lighting and conditions are right, we can actually detect the heartbeat and the blood flow in the veins, the way we're looking at each other.

Speaker 1: As I'm looking at myself, weirdly, today, maybe because it's hot or because of the lighting here, I can actually see a vein bulging on my forehead. So, like, you're saying an AI could measure my pulse from that, or something.

Speaker 2: In the right conditions. Now, that model has a lot of limitations, and you need to have the right... it basically has a lot of bias, right? So we tossed that.

Speaker 1: Wait, you're saying it didn't work. You're saying it didn't work.

Speaker 2: It worked in the right conditions and with the right skin tone; otherwise it was biased. So this was experimental and we tossed it.
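A toy version of the blood-flow idea just described (the model Reality Defender says it ultimately discarded): remote photoplethysmography estimates a pulse from tiny periodic color changes in facial skin. This sketch assumes the mean green-channel value of a face region has already been extracted for every frame; a real system would also need face tracking, motion compensation, and a way to handle the lighting and skin-tone limitations mentioned in the conversation.

```python
import numpy as np

def estimate_pulse_bpm(green_means, fps):
    """Toy pulse estimate from per-frame mean green intensity of a face patch.

    green_means: 1-D array, one value per video frame (assumed pre-extracted).
    fps: frame rate. Returns the dominant frequency in a heart-rate band, in BPM.
    """
    signal = np.asarray(green_means, dtype=float)
    signal = signal - signal.mean()                     # drop the DC component
    spectrum = np.abs(np.fft.rfft(signal))              # magnitude spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)   # frequencies in Hz
    band = (freqs >= 0.7) & (freqs <= 3.0)              # roughly 42 to 180 BPM
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_hz

# Synthetic check: a 1.2 Hz (72 BPM) pulse buried in noise, 30 fps, 10-second clip.
t = np.arange(300) / 30.0
fake_green = 0.5 * np.sin(2 * np.pi * 1.2 * t) + np.random.normal(0, 0.3, t.size)
print(estimate_pulse_bpm(fake_green, fps=30))  # ~72
```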
Speaker 1: A lot of things didn't work. So you tried it, and in a broad way it didn't work. It worked in narrow conditions, but you need things that work more broadly. What's another thing you tried that didn't work?

Speaker 2: Well, I can tell you, every month we may be throwing away models.

Speaker 1: Well, presumably there's things that work for a while and then they don't, right? It's kind of like antibiotics versus bacteria, right, like your adversaries are getting better every day.

Speaker 2: Basically, what we like to say is we're like an antivirus company. So every month there's a new generative technique. Maybe we detect it, but maybe it's something we don't anticipate and we don't detect, and so we have to make sure we quickly update our models. And then a model that worked last year, it's completely irrelevant now.

Speaker 1: So what else, like, what else is happening technologically on the reality defense side, on the detection side?

Speaker 2: Okay, so we have a few different products. One is, as I mentioned, real-time audio, scanning and listening for telephone calls. The other one is a place where a journalist or any user can go and upload not just videos; we also detect images, we also detect audio, we also detect text, like ChatGPT. And these tools also explain to a user why something is a deepfake. We don't just give a score. For an image, we might put a heat map and say, these are the areas that set the model off. For text, we might highlight areas and say, these are the areas that appear to be generated.

Speaker 1: There's a case study you have about a university that is a client of yours that, among other things, uses your service to tell when students are turning in papers written by ChatGPT, basically, as I read it. Right? Like, I just assume that everybody writes papers with ChatGPT now and there's nothing anybody can do about it. But is that not true? Like, if I have GPT write my paper and then I change a few words, does that sort of let me sail past your defense?

Speaker 2: It depends, depends how much you change. If you change, like, over fifty percent, maybe it would. So it depends.

Speaker 1: Over fifty percent is more than a few words. And so can you talk, I mean, I know you can't name the university, but do you know how they use it in practice? So, you know, some professor runs the students' papers through your software and it says, for one student, there's a whatever, sixty percent chance that this was created using a large language model. I mean, do you know, in practice... obviously the professor could do whatever they want, or the university could have whatever policy, but do you know, in practice, what do they do with this information? Like, that's in a way a harder one to figure out than the banker who's like, oh, it might be a deepfake on the phone, I'll call you right back for security. Like, if my... I don't have a banker, but if I had a banker and they did that, I'd be like, oh, that's cool, I'm glad my bank is doing this thing. Whereas with the professor and the student, that's a much more sort of fraught situation, right, and harder to think of how to deal with, again, the probabilistic nature of the output of the model.

Speaker 2: Yes. I think a couple more things here. First of all, I think even universities are trying to figure out this problem. How do you solve it, you know. But the second thing to note: most of our users are not interested in a text detector. That seems to be a much smaller market. The biggest one is actually audio. Imagine you get a call from a loved one saying, send me money, and you send money, and then you realize it's not who you thought, it was a deepfake, right? That's actually a much more widely used system.

Speaker 1: That's the big one in terms of the business. It's interesting. I mean, I wonder if that's partly, like, relative; we think about the video more. But is it partly because deepfake audio is now quite good, and there are lots of instances where people will transfer lots of money based solely on audio?

Speaker 2: Deepfake audio is the best, and it's getting better, right? It used to be that to make your voice, maybe I needed a minute. Now I need just a few seconds and I can make your voice. It's getting exponentially better. All of them are, but audio is definitely top of the list right now.

Speaker 1: Huh. And how are you keeping up?

Speaker 2: Yeah, I mean, when we detect audio, it's tricky. There's a lot of factors to think about: a person's accent, right; is the model biased, does it not understand, or is there an issue where it detects one person with a certain type of accent always as a deepfake? There's also issues of, like, noise. When there's a lot of background noise, the model could be impacted. When there's crosstalk, multiple people speaking at the same time, that could impact the model. So there's a variety of factors. And the other thing to think about is, our models support multiple languages; we don't just do English. And so all of these kind of make it very complicated. So when we detect something, there's what's called pre-processing: a whole bunch of steps applied to the audio before it actually goes to our AI models, where we have to clean up the audio and do certain types of transformations before we push it to the models.

Speaker 1: And is that happening in real time with these companies? Huh. And what is the frontier of pre-processing? Like, is it an efficiency and speed problem, because you're trying to do it in real time, and so you're just trying to make the sort of algorithmic part of it as fast and efficient as possible?

Speaker 2: Yeah, I mean, this is a challenge, and there's a lot to be done, so that's ongoing research: how do we continue to speed up not just the pre-processing, but the inference. And there's a variety of... one thing that's called a foundation model, I'm not sure if you've heard what those are, but these are extremely large pre-trained models. GPT is a foundation model; it is a pre-trained model. And these models can be useful in some parts of the pre-processing, where they can quickly extract certain features for us, and then we can use those further down the pipeline.

Speaker 1: Still to come on the show: the problems that Ali is trying to solve now.

Speaker 1: How good are you at detecting deepfakes? Can you quantify how good you are?

Speaker 2: So the way they usually do this is they look at benchmarks, right? There's public data sets which we can take and run, and we're in the nineties. But, you know, that's not the real world.

Speaker 1: When you say you're in the nineties, you mean, in a binary sense, you guess correctly ninety percent of the time?

Speaker 2: Yeah, so on a public benchmark, we're in the nineties. There's accuracy, precision, and recall. Accuracy is, how accurate are we: let's say the sample set is one hundred, maybe fifty are fake, fifty are real, right? The accuracy is, you take, okay, how many of those did you get right, how many of the real and the fake, divided by the total. Right, that's the accuracy. The problem with that is an unbalanced data set: maybe only two are fake and the other ninety-eight are real. So in that case, if we had said, okay, everything is real, we would be at ninety-eight percent, right? That's not very useful, because you missed the deepfakes. So that's why precision and recall come in. They look specifically at how you did on that specific class, the fakes or the reals. So there's more than just accuracy; there's also other factors to look at.
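A small worked version of the imbalance point just made: on a set of 98 real and 2 fake samples, a detector that calls everything real scores 98 percent accuracy but catches zero fakes, which is exactly what precision and recall on the fake class expose.

```python
def accuracy_precision_recall(y_true, y_pred, positive="fake"):
    """Basic metrics for a fake/real classifier, treating 'fake' as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of flagged items, how many were fake
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual fakes, how many were caught
    return accuracy, precision, recall

# The unbalanced example from the conversation: 98 real samples, 2 fakes.
truth = ["real"] * 98 + ["fake"] * 2
lazy_detector = ["real"] * 100                      # a detector that calls everything real
print(accuracy_precision_recall(truth, lazy_detector))  # (0.98, 0.0, 0.0)
```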
Speaker 1: So it's kind of like the false positive, false negative challenge with medical tests, right? You want a test that both says you have the disease when you have the disease, and also says you don't have the disease when you don't have the disease. And that actually ends up being a really complicated problem given the nature of baselines, right? Like, in your universe, certainly in the universe of people calling their banker, almost everybody calling their banker is a real person, right? But there are these very high-stakes, presumably very rare cases where it is a deepfake. And so that's a complicated problem.

Speaker 2: It actually is, it absolutely is. And it's something, as we work with each customer, we have to tweak those. Some want higher false positives, some higher false negatives. It depends on each use case. In the case of a bank, they want to be a bit more cautious. But that also causes, it could cause, a lot of pain, depending on the volume.

Speaker 1: Right. Because if with every client it's like, oh, sorry, I've got to call you back to make sure you're not a deepfake, like, that's not great.

Speaker 2: Yeah. And if you have thousands of calls a day, and even one percent is a false positive or negative, that creates a lot of work, yeah, because it adds up.

Speaker 1: How do you solve that? What do you do about that?

Speaker 2: So the way it works is, it's all about adjusting. You can think of thresholds, right? We can adjust a variety of parameters on the output of a model, not just the model itself. For example, in audio, as we speak, you know, we could look at, okay, how long do you want to listen before you give an answer? And the longer you listen, the more confident...

Speaker 1: That's smart. That makes sense, right, because it's essentially more data for the model.

Speaker 2: Exactly, yeah.

Speaker 1: What are you trying to figure out now? Like, what is the frontier?

Speaker 2: What's really the latest now, and it's just amazing how quickly it's going, is videos. So the videos that we detect are, like, a face swap: you're sitting there speaking and another person's face is on there. That's a face swap. But now you can generate an entire video completely from scratch; you just type in the description and the video comes out. You can take... I can take your voice, a few seconds of your voice, and I can then have you say anything I want. Which, you can clearly see, a bad person can misuse these tools. So the latest is, these things are getting really good.

Speaker 1: And over time, like, with those videos, how is your reliability and accuracy changing? Are you getting better or worse or staying the same as the technology to create the deepfakes improves?

Speaker 2: So what's interesting is, it has slowed down in terms of, like, the signatures. Like, we don't need as much data as we used to. So of course there's still a lot of work and we're never going to stop, but it is stabilizing a little bit.

Speaker 1: When you say "it," what is stabilizing a little bit?

Speaker 2: So, like, the deepfake signatures are stabilizing.

Speaker 1: The signatures, meaning the giveaways, the things that I can't see, but that your models can see.

Speaker 2: Exactly. So our models, going back and giving a bit more detail: they're looking at different attributes of a piece of media, and they pull out those attributes and then send those to our in-house neural networks that study those attributes.

Speaker 1: Like, one that you have mentioned, that the company has mentioned publicly, is the sync of audio and video, right? Maybe that's one where it's gotten better and it doesn't matter anymore, but from what I understand, from what I've read, there was at least a time when the sync of the audio and video tended to be off in deepfake videos, right? Is that an example of a signature?

Speaker 2: So the way that works is, we train the model. We say, hey, here's a bunch of people speaking, here's what they look like, look at the sync. Here's a bunch of people that are deepfakes, look at the sync. And we tune the model so we can tell the difference. That's also happening with video, by the way. If you look at Sora and some of these new models, where someone's walking, for example, their legs are not, like, you know, they're not really smooth, or they don't look right. So you can look at that as well. That's the temporal dynamics, we call it.

Speaker 1: Temporal dynamics is basically, are things proceeding in time in a natural way?

Speaker 2: Exactly, how things change over time.
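A toy illustration of the lip-sync signature mentioned here: given a per-frame "mouth openness" signal from a face tracker and the audio loudness over the same frames (both assumed to be already extracted), a genuine recording tends to show strong correlation between the two at a small lag, while a badly synced or fully generated clip often does not. This is a generic sketch, not Reality Defender's model.

```python
import numpy as np

def sync_score(mouth_openness, audio_energy, max_lag=5):
    """Best normalized cross-correlation between mouth motion and audio loudness
    within +/- max_lag frames. Values near 1.0 suggest plausible sync."""
    m = np.asarray(mouth_openness, dtype=float)
    a = np.asarray(audio_energy, dtype=float)
    m = (m - m.mean()) / (m.std() + 1e-9)
    a = (a - a.mean()) / (a.std() + 1e-9)
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            c = np.mean(m[lag:] * a[:len(a) - lag])
        else:
            c = np.mean(m[:lag] * a[-lag:])
        best = max(best, float(c))
    return best

# Synthetic check: mouth motion that follows the audio with a 2-frame delay (plausible sync)
# versus mouth motion that has nothing to do with the audio.
rng = np.random.default_rng(0)
audio = np.abs(rng.normal(size=200))
good_mouth = np.roll(audio, 2) + rng.normal(0, 0.1, 200)
bad_mouth = np.abs(rng.normal(size=200))
print(sync_score(good_mouth, audio), sync_score(bad_mouth, audio))  # roughly 0.99 vs 0.1
```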
Speaker 1: So, yeah, all of these seem like things that are going to be fleeting, right? Like, my baseline assumption is it'll all get solved. How long do you think you'll be able to defend reality for?

Speaker 2: You know, this question comes up all the time. There is always a giveaway, or there is always a new way to look at the problem. We're not just looking always at the raw pixels, right; we could look at different aspects. We could look at the frequency, for example. If you look at an image, you can actually break it down into frequencies.

Speaker 1: When you say frequency, what do you mean? When you say you can look at the frequency, what does that mean?

Speaker 2: So, for example, okay, let's go with audio. You know, you can use something called Fourier transforms to actually break up audio into individual wavelengths, sines and cosines, and look at those. You can do the same with an image, for example; you can break that up.

Speaker 1: Like the analogy of a waveform of audio.

Speaker 2: Yeah, it can, it can be translated into a bunch of waves. So there's multiple things that we look at, and with the AI there's always a giveaway. And again, we're also thinking outside the box, right, like the blood flow, for example. But there's other kinds of similar things we could think about.

Speaker 1: I mean, presumably, you know, Renaissance Technologies, the James Simons fund, is one of the first quant hedge funds, and they made tons of money for a long time. They wildly outperformed the market. Clearly they had a technological advantage. And the thing Simons said, the founder, this math guy, about that company: one of the things he said was, like, we actually don't want to hire finance people who have some story about why a stock is going to outperform, because if there's a story about it, then somebody else is going to know it already, right? Their thing was just, like, we just give the model all the data and let the model find these weird-ass patterns that no human even understands. But they work more often than they don't work, and we make tons of money. And I would think that would be the case for you, to some extent: that if you could think of a thing like monitoring blood flow or whatever, then the bad guys, or whatever, the people who want to make realistic gen AI, would also think of it. And the real kind of secret sauce would be in weird correlations that the model finds that we wouldn't even understand.

Speaker 2: Exactly. I mean, that is oftentimes what the model is training on. And the way it determines it: you think it's looking at certain features, but it is something that we don't even tell it, right? It determines it on its own.

Speaker 1: Like, that's the beauty of this kind of new era of, whatever, neural networks, machine learning, right? It's just, you throw everything at it and let the machine figure it out.

Speaker 2: We like to say we throw the kitchen sink at it sometimes.

Speaker 1: Yes, yes. I mean, and so when you were talking before about explainability, right, about sort of saying in your output, here's why we think it's fake: I feel like that kind of throw-everything-at-it-and-let-the-machine-figure-it-out makes it hard. Like, sometimes you don't know, right? It's just, like, well, the machine is very smart and it says this is probably fake. Like, is that a tension?

Speaker 2: That can happen. So you'll look at it: we'll show you an image and it'll say the model was looking at certain areas. And by the way, this also helps us with debugging and bias. Maybe it was, for some reason, looking at an area of the face where we couldn't tell why that would set off the model. And so in those scenarios we also investigate, like, why was this area flagged? And it could be one hundred percent correct; it's just we do have to examine it further.
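A minimal sketch of one common way to produce the kind of heat map described above: occlusion saliency. Hide one patch of the image at a time, re-score the occluded image with the detector, and record how much the fake score moves; the regions that move it most are "the areas that set the model off." The score_fake function here is a hypothetical stand-in for a real detector, not Reality Defender's method.

```python
import numpy as np

def occlusion_heatmap(image, score_fake, patch=16, stride=16, fill=0.0):
    """Occlusion saliency: how much does hiding each region change the fake score?

    image: float array of shape (H, W, C); score_fake: callable returning P(fake).
    Returns an (H, W) map where larger values mean the region mattered more."""
    h, w = image.shape[:2]
    base = score_fake(image)
    heat = np.zeros((h, w))
    for y in range(0, h, stride):
        for x in range(0, w, stride):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill     # hide this region
            heat[y:y + patch, x:x + patch] = abs(base - score_fake(occluded))
    return heat

# Hypothetical detector: pretends the top-left corner is what looks "fake".
def toy_score_fake(img):
    return float(img[:32, :32].mean())

img = np.random.default_rng(1).random((64, 64, 3))
heat = occlusion_heatmap(img, toy_score_fake)
print(heat.shape)  # (64, 64); only the top-left cells end up with nonzero heat
```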
Speaker 1: Could you create a deepfake that would fool your deepfake detector?

Speaker 2: Yes.

Speaker 1: Haha. Well, if you could do it, somebody else could do it, don't you think?

Speaker 2: I could do it because I have access to a lot more knowledge, right? Like, you know, if I was running an antivirus company, I could probably write a virus, if I knew exactly what... we're constantly actually trying to do that, by the way.

Speaker 1: Yeah. I mean, in a sense, that's the whole adversarial network thing, right? Like, I guess you have to do that for your detection models, your suite of models, to get better, right?

Speaker 2: Yeah. So we have what's called red teaming, both black box and with understanding of the code. So we're trying to break the models. That's part of what we do.
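A very simplified picture of the black-box side of that red teaming: take clips you already know are fake, apply the kinds of perturbations an attacker might use (added noise, gain changes, trimming), and measure how often the detector still flags them. Everything below, the toy_detector, the perturbations, and the clip, is a hypothetical stand-in, not Reality Defender's tooling.

```python
import numpy as np

def add_noise(x, rng):    # crude stand-in for re-encoding artifacts
    return x + rng.normal(0, 0.01, x.shape)

def change_gain(x, rng):  # attacker turns the volume up or down
    return x * rng.uniform(0.5, 1.5)

def trim_edges(x, rng):   # attacker clips the start and end of the clip
    cut = int(rng.integers(1, len(x) // 10))
    return x[cut:-cut]

def red_team(detector, known_fakes, perturbations, trials=20, seed=0):
    """Black-box probe: fraction of perturbed known-fake clips the detector still flags."""
    rng = np.random.default_rng(seed)
    caught = total = 0
    for clip in known_fakes:
        for perturb in perturbations:
            for _ in range(trials):
                caught += detector(perturb(clip, rng)) >= 0.5  # still flagged as fake?
                total += 1
    return caught / total

# Hypothetical detector: flags clips whose sample-to-sample variation looks too smooth.
def toy_detector(clip):
    return 1.0 if np.abs(np.diff(clip)).mean() < 0.05 else 0.0

fakes = [np.sin(np.linspace(0, 40, 4000))]  # a synthetic, suspiciously smooth "voice"
print(red_team(toy_detector, fakes, [add_noise, change_gain, trim_edges]))  # e.g. 1.0
```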
Every cyber security company does 605 00:31:44,756 --> 00:31:49,396 Speaker 2: its best to defend, right, but we do not promise 606 00:31:49,476 --> 00:31:53,316 Speaker 2: one hundred percent. Our models are always a probability. 607 00:31:54,076 --> 00:31:57,436 Speaker 1: Who's the best at making deep fakes that you're 608 00:31:57,476 --> 00:31:57,996 Speaker 1: aware of? 609 00:31:58,716 --> 00:32:01,836 Speaker 2: There's a few, right, there's like Sora from OpenAI. There's Runway, 610 00:32:01,996 --> 00:32:03,676 Speaker 2: there's Synthesia. 611 00:32:03,636 --> 00:32:06,116 Speaker 1: You better be able to catch those, right, anything I've heard of. 612 00:32:06,236 --> 00:32:08,796 Speaker 1: You better be really good at the technique. Presumably it's 613 00:32:08,836 --> 00:32:13,116 Speaker 1: like some, like, you know, Russian genius squad or, I 614 00:32:13,116 --> 00:32:15,196 Speaker 1: don't know, the North Koreans or something. I would 615 00:32:15,236 --> 00:32:17,396 Speaker 1: imagine it is some state-funded actor, but... 616 00:32:17,556 --> 00:32:20,076 Speaker 2: I would actually say, you know, we're in 617 00:32:20,076 --> 00:32:23,356 Speaker 2: a place where this problem is getting bigger. 618 00:32:23,516 --> 00:32:25,436 Speaker 2: But we're in a place where a lot of the 619 00:32:25,516 --> 00:32:28,676 Speaker 2: deep fakes coming out are actually from entertainment and they're not, 620 00:32:28,996 --> 00:32:31,316 Speaker 2: like, used for evil. You know, you've seen the famous 621 00:32:31,356 --> 00:32:36,196 Speaker 2: Tom Cruise or other actors running around doing things, 622 00:32:36,196 --> 00:32:38,196 Speaker 2: and those are deep fakes, right, those are actually pretty good. 623 00:32:38,236 --> 00:32:39,916 Speaker 2: We detect them, but they're actually very good. 624 00:32:40,596 --> 00:32:42,556 Speaker 1: What are you thinking about in the context of 625 00:32:42,756 --> 00:32:44,436 Speaker 1: the election in the US this year, and do 626 00:32:44,476 --> 00:32:48,436 Speaker 1: you have particular clients who are especially focused on election 627 00:32:48,556 --> 00:32:50,436 Speaker 1: related deep fakes? 628 00:32:51,436 --> 00:32:56,556 Speaker 2: Yeah, the media companies are the main ones, and we're ready. 629 00:32:56,996 --> 00:33:02,356 Speaker 2: We detect the best, the best deep fakes, right, everything that's 630 00:33:02,396 --> 00:33:05,516 Speaker 2: coming out we detect. So we're ready and we want 631 00:33:05,596 --> 00:33:09,716 Speaker 2: to make sure we're there as one avenue of people 632 00:33:10,396 --> 00:33:13,596 Speaker 2: verifying content. I believe late last year there was an 633 00:33:13,636 --> 00:33:17,876 Speaker 2: election in Slovakia where there was an audio of one 634 00:33:17,916 --> 00:33:21,796 Speaker 2: of the candidates saying he's gonna double the price of beer. Yeah, 635 00:33:21,836 --> 00:33:25,756 Speaker 2: and that actually was a deep fake. It was caught, but 636 00:33:25,876 --> 00:33:28,556 Speaker 2: it kind of caused some damage. So it's starting to 637 00:33:28,596 --> 00:33:28,996 Speaker 2: happen now. 638 00:33:29,076 --> 00:33:31,356 Speaker 1: It's an awesomely stupid deep fake. I mean, to me, 639 00:33:31,596 --> 00:33:35,516 Speaker 1: the real risk of deep fakes is not people believing 640 00:33:35,556 --> 00:33:39,636 Speaker 1: something that's false. It's people ceasing to believe anything, right.
641 00:33:39,716 --> 00:33:43,116 Speaker 1: It's just saying, oh, that's probably just a deep fake, 642 00:33:43,196 --> 00:33:45,716 Speaker 1: right? Like that, actually, to me seems like the bigger 643 00:33:45,796 --> 00:33:49,236 Speaker 1: risk: nothing is true anymore. Nobody cares about the 644 00:33:49,276 --> 00:33:49,996 Speaker 1: truth anymore. 645 00:33:50,556 --> 00:33:56,316 Speaker 2: That's definitely a problem as well. Now we're seeing people saying, oh, 646 00:33:56,396 --> 00:33:58,996 Speaker 2: this is a deep fake. That's actually happened. There's a few. 647 00:33:59,996 --> 00:34:03,156 Speaker 2: I believe it was a Kate Middleton video, if I'm correct, 648 00:34:03,156 --> 00:34:04,876 Speaker 2: that was earlier this year, where everyone thought that was 649 00:34:04,876 --> 00:34:08,316 Speaker 2: a deep fake and it wasn't. So this kind of problem 650 00:34:08,356 --> 00:34:09,356 Speaker 2: is happening. 651 00:34:08,956 --> 00:34:12,476 Speaker 1: Like, that's because people want to believe things that 652 00:34:12,516 --> 00:34:14,836 Speaker 1: are consistent with their prior beliefs, and they don't want 653 00:34:14,876 --> 00:34:18,356 Speaker 1: to believe things that call their prior beliefs into question, right, 654 00:34:18,436 --> 00:34:20,796 Speaker 1: and so deep fakes in a way are an easy 655 00:34:20,836 --> 00:34:23,676 Speaker 1: out, where if you see something you like, you assume 656 00:34:23,676 --> 00:34:25,476 Speaker 1: it's true. If you see something you don't like, you 657 00:34:25,516 --> 00:34:27,716 Speaker 1: assume it's not true, or you assume everything's just kind 658 00:34:27,716 --> 00:34:29,756 Speaker 1: of bullshit. Like, that to me seems like a big 659 00:34:29,836 --> 00:34:32,476 Speaker 1: kind of societal level risk of deep fakes. 660 00:34:32,636 --> 00:34:36,276 Speaker 2: We'll never fix that. That's something that we'll never solve. Yeah, 661 00:34:36,836 --> 00:34:40,076 Speaker 2: people have their own beliefs. You can show them anything, 662 00:34:40,236 --> 00:34:44,156 Speaker 2: the facts, math, that's not going to fix it at all. Yeah. 663 00:34:44,196 --> 00:34:46,476 Speaker 1: No, I guess that's a human nature problem, if not 664 00:34:46,556 --> 00:34:52,636 Speaker 1: an AI problem. We'll be back in a minute with 665 00:34:52,676 --> 00:35:08,556 Speaker 1: the lightning round. Okay, let's close with a lightning round. 666 00:35:09,196 --> 00:35:09,516 Speaker 2: Okay. 667 00:35:10,236 --> 00:35:13,396 Speaker 1: How often do people applying to work at Reality Defender 668 00:35:13,756 --> 00:35:16,076 Speaker 1: use generative AI to write cover letters? 669 00:35:16,476 --> 00:35:19,116 Speaker 2: Oh, that's a good one. Not a lot, but 670 00:35:19,116 --> 00:35:21,396 Speaker 2: we've seen it for sure. I would say maybe about 671 00:35:21,676 --> 00:35:22,356 Speaker 2: three percent. 672 00:35:22,836 --> 00:35:27,316 Speaker 1: Okay. If I want to use generative AI to write 673 00:35:27,316 --> 00:35:29,756 Speaker 1: a cover letter to apply to work at Reality Defender, 674 00:35:29,836 --> 00:35:32,116 Speaker 1: but I don't want to get caught, what should I do? 675 00:35:33,156 --> 00:35:35,276 Speaker 2: Change about seventy five percent 676 00:35:34,996 --> 00:35:39,356 Speaker 1: of the words? Okay, who is Gabe Reagan? 677 00:35:41,476 --> 00:35:45,316 Speaker 2: Gabe was, I think, our VP of public 678 00:35:45,596 --> 00:35:48,476 Speaker 2: relations or something like that. He's a deep fake.
We 679 00:35:48,516 --> 00:35:50,996 Speaker 2: created him as kind 680 00:35:51,036 --> 00:35:54,196 Speaker 2: of a fun joke. But obviously we tell everyone. 681 00:35:54,516 --> 00:35:56,636 Speaker 1: Tell me, tell me a little bit more about that. 682 00:35:57,756 --> 00:36:01,316 Speaker 2: If you go on certain websites where you put your 683 00:36:01,316 --> 00:36:05,436 Speaker 2: photo and maybe your job experience, there's quite a large 684 00:36:05,516 --> 00:36:10,036 Speaker 2: number of deep fake profiles on these websites, like LinkedIn. 685 00:36:11,276 --> 00:36:16,996 Speaker 1: Yes, huh. Why would people be doing that? 686 00:36:17,076 --> 00:36:18,276 Speaker 2: Sorry, scammers. 687 00:36:18,676 --> 00:36:20,276 Speaker 1: I'm trying to think, how do you get money out 688 00:36:20,276 --> 00:36:22,196 Speaker 1: of people by having a fake LinkedIn account? 689 00:36:22,316 --> 00:36:25,396 Speaker 2: Oh, I can tell you. Let's say, the 690 00:36:25,476 --> 00:36:28,956 Speaker 2: most popular one that I'm aware of is, like, cryptocurrency. 691 00:36:28,996 --> 00:36:31,116 Speaker 2: Maybe you create a coin and you're like, here's a 692 00:36:31,236 --> 00:36:33,956 Speaker 2: CEO and here's this person, and they have these great 693 00:36:33,956 --> 00:36:36,716 Speaker 2: LinkedIn profiles. Here's their photo, and they're not real, but 694 00:36:36,876 --> 00:36:38,516 Speaker 2: it sells a story. Right. 695 00:36:41,076 --> 00:36:45,356 Speaker 1: Is it right that you founded a clothing company? 696 00:36:46,116 --> 00:36:46,436 Speaker 2: I did. 697 00:36:46,516 --> 00:36:49,316 Speaker 1: Yes, what's one thing you learned about fashion from doing that? 698 00:36:50,876 --> 00:36:54,196 Speaker 2: It's much different than software development. 699 00:36:54,956 --> 00:36:57,196 Speaker 1: Sure, I don't think you needed to start a company 700 00:36:57,236 --> 00:37:00,756 Speaker 1: to learn that. I mean, the marginal cost is not 701 00:37:00,956 --> 00:37:01,956 Speaker 1: zero, for one thing. 702 00:37:02,556 --> 00:37:05,436 Speaker 2: Yeah, the software is easy, you write some... It's not 703 00:37:05,636 --> 00:37:08,116 Speaker 2: easy at all. But what I mean is you're writing 704 00:37:08,196 --> 00:37:11,636 Speaker 2: some code and you ship it. Versus in fashion, you 705 00:37:11,676 --> 00:37:13,396 Speaker 2: have to, like, you've got to source the fabric. 706 00:37:13,436 --> 00:37:16,356 Speaker 2: You gotta design it, you gotta make the patterns, 707 00:37:16,356 --> 00:37:18,156 Speaker 2: you gotta cut it, sew it, make sure it fits. 708 00:37:18,236 --> 00:37:19,076 Speaker 2: It's a lot more work. 709 00:37:22,196 --> 00:37:24,196 Speaker 1: What are the chances that we exist in a simulation? 710 00:37:25,836 --> 00:37:27,236 Speaker 2: You know, I used to think this was kind of 711 00:37:27,276 --> 00:37:30,396 Speaker 2: a joke, but I don't know. I'm seeing every 712 00:37:30,796 --> 00:37:34,716 Speaker 2: month it seems to get higher, from my perspective. 713 00:37:35,196 --> 00:37:36,076 Speaker 1: Why do you say that?
714 00:37:37,356 --> 00:37:40,156 Speaker 2: I'm seeing what's happening with tech and what we're building, 715 00:37:40,316 --> 00:37:43,076 Speaker 2: and you can see there was one paper 716 00:37:43,116 --> 00:37:46,036 Speaker 2: where they took a bunch of agents and they gave 717 00:37:46,076 --> 00:37:47,836 Speaker 2: them all a job and they started to do it, 718 00:37:47,876 --> 00:37:50,036 Speaker 2: and they just started to, like, create their own kind 719 00:37:50,116 --> 00:37:52,996 Speaker 2: of, like, workflows. Right, I don't know, it should 720 00:37:53,036 --> 00:37:53,556 Speaker 2: be getting there. 721 00:37:53,876 --> 00:37:56,236 Speaker 1: So it's like, well, if we can create a 722 00:37:56,316 --> 00:38:00,196 Speaker 1: simulation that seems like reality, maybe someone created a simulation 723 00:38:00,356 --> 00:38:05,156 Speaker 1: that is our reality. Exactly. Yeah. What do you wish 724 00:38:05,236 --> 00:38:06,756 Speaker 1: more people understood about AI? 725 00:38:07,716 --> 00:38:09,796 Speaker 2: I mean, it's a tool, and I don't think people 726 00:38:09,796 --> 00:38:12,556 Speaker 2: should be afraid of it. They should embrace it. And 727 00:38:12,996 --> 00:38:16,036 Speaker 2: you know, there's people just running away from it. 728 00:38:16,036 --> 00:38:20,676 Speaker 2: It's fantastic, it's great. Embrace it. Just be careful. One 729 00:38:20,676 --> 00:38:22,836 Speaker 2: thing I'd like to tell, like, my friends and family, 730 00:38:23,036 --> 00:38:25,476 Speaker 2: especially with the deep fake audio: have a safe word. 731 00:38:25,516 --> 00:38:28,716 Speaker 2: If somebody calls you and you're like, that's weird, you know, 732 00:38:29,236 --> 00:38:31,356 Speaker 2: call them back or ask for the safe word. 733 00:38:31,876 --> 00:38:39,236 Speaker 1: What do you wish more people understood about reality? 734 00:38:39,916 --> 00:38:42,876 Speaker 2: Reality? I would say, just be aware that you exist. And 735 00:38:43,036 --> 00:38:45,316 Speaker 2: every day's a gift, so you should be excited that 736 00:38:45,476 --> 00:38:48,276 Speaker 2: you're here. Like, the chances of you existing, it's like 737 00:38:48,596 --> 00:38:52,116 Speaker 2: you've won the lottery a million times, so every day's 738 00:38:52,116 --> 00:38:52,476 Speaker 2: a gift. 739 00:38:56,836 --> 00:39:00,996 Speaker 1: Ali Shakiyari is the co founder and CTO at Reality Defender. 740 00:39:01,956 --> 00:39:05,236 Speaker 1: Today's show was produced by Gabriel Hunter Chang. It was 741 00:39:05,476 --> 00:39:08,916 Speaker 1: edited by Lydia Jean Kott and engineered by Sarah Bruguer. 742 00:39:09,396 --> 00:39:13,036 Speaker 1: You can email us at problem at Pushkin dot fm. 743 00:39:13,116 --> 00:39:15,476 Speaker 1: I'm Jacob Goldstein and we'll be back next week with 744 00:39:15,516 --> 00:39:26,996 Speaker 1: another episode of What's Your Problem.