1
00:00:00,880 --> 00:00:05,040
S1: Unsupervised Learning is a podcast about trends and ideas in cybersecurity,

2
00:00:05,080 --> 00:00:09,960
S1: national security, AI, technology and society, and how best to

3
00:00:10,000 --> 00:00:17,680
S1: upgrade ourselves to be ready for what's coming. All right,

4
00:00:17,680 --> 00:00:22,160
S1: welcome to Unsupervised Learning. I'm here with Bar-El Tayouri, head

5
00:00:22,160 --> 00:00:24,959
S1: of Mend AI, and it's great to see you.

6
00:00:25,880 --> 00:00:28,120
S2: It's great to be here. What a pleasure.

7
00:00:29,200 --> 00:00:34,800
S1: Awesome. So before we get into the AI stuff, uh,

8
00:00:35,360 --> 00:00:38,800
S1: it looks like you have a pretty interesting background. And

9
00:00:38,800 --> 00:00:41,120
S1: I just want to, like, get a walkthrough of that

10
00:00:41,120 --> 00:00:45,040
S1: real quick. Like, uh, what gets you excited about tech? Like,

11
00:00:45,040 --> 00:00:48,000
S1: what have you been doing in tech all these years? And, uh,

12
00:00:48,360 --> 00:00:49,600
S1: just like to hear about you.

13
00:00:51,040 --> 00:00:56,520
S2: And, wow, it's a it's a very long and short time. Um, basically,

14
00:00:56,520 --> 00:01:01,740
S2: since I'm 12, I'm programming, um, I'm into tech. I

15
00:01:01,860 --> 00:01:07,300
S2: also, like, I was into game development and I

16
00:01:07,340 --> 00:01:11,020
S2: also taught game development. And then I went slightly

17
00:01:11,020 --> 00:01:14,420
S2: into cyber of it, like hacking games into getting more

18
00:01:14,459 --> 00:01:17,700
S2: like a, you know, getting money in the game or,

19
00:01:17,740 --> 00:01:19,380
S2: you know, points or something.

20
00:01:19,740 --> 00:01:21,460
S1: Yeah. Yeah. Higher. Higher scores.

21
00:01:22,020 --> 00:01:26,540
S2: Exactly. And it's so fun. And yeah, especially in the

22
00:01:26,540 --> 00:01:28,220
S2: early days, it was so easy.

23
00:01:28,580 --> 00:01:32,580
S1: Yeah. Because everything everything was local, right? Uh, all the resources,

24
00:01:32,620 --> 00:01:35,539
S1: everything was stored locally. So you could just edit it

25
00:01:35,540 --> 00:01:37,620
S1: and it would appear on the server side. Yeah.

26
00:01:38,260 --> 00:01:43,780
S2: Yeah, I remember all the, all the hex, all the hex. Yes.

27
00:01:44,260 --> 00:01:46,140
S2: All the hex that they tried to find the right

28
00:01:46,140 --> 00:01:50,100
S2: place with the, with the scores to, to change. And

29
00:01:50,100 --> 00:01:51,940
S2: it's also fun. Um.

30
00:01:52,300 --> 00:01:54,420
S1: Yeah. You imagine the tools that we have now. It

31
00:01:54,420 --> 00:01:56,820
S1: would be so much easier. But I.

32
00:01:56,820 --> 00:01:57,220
S2: Know it's.

33
00:01:57,220 --> 00:01:59,880
S1: Crazy. Yeah, yeah, yeah. Um.

34
00:02:00,640 --> 00:02:03,440
S2: And then, like, you know, since then, I've, I've moved

35
00:02:03,440 --> 00:02:08,040
S2: into more network security, operating systems. I also continued into

36
00:02:08,040 --> 00:02:10,840
S2: the Army. And so I have like a lot of

37
00:02:10,840 --> 00:02:19,440
S2: experience in Army stuff like cryptography, networks, research. Um, then

38
00:02:19,760 --> 00:02:24,600
S2: I was the first engineer in augmented reality startup. Um,

39
00:02:24,600 --> 00:02:28,040
S2: and afterwards I built my own company and we did

40
00:02:28,080 --> 00:02:32,320
S2: prioritization for, uh, for cloud native alerts, like for container,

41
00:02:32,680 --> 00:02:36,960
S2: container image scanning. Uh, and then we got acquired and

42
00:02:37,000 --> 00:02:41,520
S2: into Mend, which is the same company I'm in now. Um, yeah.

43
00:02:41,520 --> 00:02:44,680
S2: And it's now basically it's called Mend Container. Today it's

44
00:02:44,680 --> 00:02:49,040
S2: the container reachability. And now also I moved into Mend AI,

45
00:02:49,280 --> 00:02:52,600
S2: which is a new product, and we founded, uh, to

46
00:02:52,639 --> 00:02:53,679
S2: do AI security.

47
00:02:54,800 --> 00:02:57,610
S1: Very, very cool. What do you think is, like, the

48
00:02:57,610 --> 00:03:01,170
S1: common thread going through all of this. Like the main thing, uh,

49
00:03:01,169 --> 00:03:02,690
S1: driving curiosity.

50
00:03:04,490 --> 00:03:11,290
S2: Um, wow. Amazing question. Um, I think I always was really, uh,

51
00:03:11,690 --> 00:03:19,290
S2: really excited from new, uh, new industry changing technology. It

52
00:03:19,290 --> 00:03:23,889
S2: was always the technology that, uh, that, that, like, was

53
00:03:24,130 --> 00:03:29,170
S2: enabling something and, and make it so exciting, uh, especially

54
00:03:29,169 --> 00:03:32,770
S2: in the startups world where you have these giant companies

55
00:03:33,010 --> 00:03:38,850
S2: and one day or like in a process, pretty quick process, um,

56
00:03:39,290 --> 00:03:44,090
S2: they're all disrupted. There's tons of new things, uh, you

57
00:03:44,090 --> 00:03:47,730
S2: can build. And yeah, it's a challenge, to challenge

58
00:03:47,770 --> 00:03:49,730
S2: the giants. So it's amazing.

59
00:03:51,290 --> 00:03:56,350
S1: Yeah. So, um, are you mostly interested in the network

60
00:03:56,350 --> 00:03:59,430
S1: security stuff? Obviously you're doing AI stuff now. Everyone's doing

61
00:03:59,470 --> 00:04:05,950
S1: AI stuff. Um, but like network security, application security. Uh, like,

62
00:04:05,950 --> 00:04:08,630
S1: what is like your center of mass? Like your favorite thing.

63
00:04:09,670 --> 00:04:14,550
S2: So for sure, 100% application security. It's so exciting. But basically,

64
00:04:14,550 --> 00:04:17,670
S2: at the end of the day, it's all about developers

65
00:04:17,670 --> 00:04:22,549
S2: and the vulnerabilities, um, like you have in your code. Um,

66
00:04:22,550 --> 00:04:25,870
S2: so my focus is now application security and AI security

67
00:04:25,910 --> 00:04:32,589
S2: for inside the application. And that's like our special take. Um,

68
00:04:32,910 --> 00:04:36,830
S2: we look just on application, we try to, you know,

69
00:04:37,070 --> 00:04:40,470
S2: there's so many issues there. Um, also on the AI

70
00:04:40,510 --> 00:04:42,310
S2: components there. So.

71
00:04:43,110 --> 00:04:46,310
S1: Yeah, what do you see as the biggest problems right

72
00:04:46,310 --> 00:04:49,630
S1: now in AppSec? Um, obviously a big part of that

73
00:04:49,630 --> 00:04:52,750
S1: is going to be AI because there's so many AI applications.

74
00:04:52,750 --> 00:04:55,930
S1: But would you say AI is the biggest sort of

75
00:04:55,970 --> 00:04:58,609
S1: application security thing happening right now?

76
00:04:59,970 --> 00:05:03,770
S2: I thought like we looked on many issues and when

77
00:05:03,770 --> 00:05:07,570
S2: we founded Atom Security, and it was clear the

78
00:05:07,570 --> 00:05:14,770
S2: biggest issue is prioritization and the the like the size

79
00:05:14,770 --> 00:05:18,770
S2: of the backlog of vulnerabilities of critical vulnerabilities you have. Yeah.

80
00:05:19,650 --> 00:05:24,289
S2: And then now with all the reachability technologies that allows

81
00:05:24,290 --> 00:05:27,969
S2: you to to like reduce the noise, basically remove the

82
00:05:27,970 --> 00:05:33,130
S2: false positives and the, the trend of platforms and ASPM

83
00:05:33,250 --> 00:05:35,969
S2: allows you to take all the findings in one place

84
00:05:35,970 --> 00:05:41,490
S2: and prioritize them smartly. And like we see the biggest

85
00:05:41,490 --> 00:05:47,810
S2: issues now like a what's introduced in AI with AI.

86
00:05:48,410 --> 00:05:52,270
S2: And then there's like two different things. There's the AI

87
00:05:52,390 --> 00:05:55,150
S2: the employees are using in order to be more productive

88
00:05:55,510 --> 00:05:59,230
S2: so that introduced some risks. And, and what we

89
00:05:59,230 --> 00:06:02,950
S2: see as like the biggest risk. And because we like, uh,

90
00:06:03,750 --> 00:06:06,110
S2: we see how the world is changing and how our

91
00:06:06,110 --> 00:06:13,750
S2: customers are moving like towards the, um, in the products. Um,

92
00:06:13,870 --> 00:06:20,029
S2: is the AI components inside applications and especially the AI agents, um,

93
00:06:20,430 --> 00:06:22,630
S2: inside these applications that are in production.

94
00:06:24,550 --> 00:06:32,190
S1: Okay. So when you say AI components, do you mean like, uh, libraries?

95
00:06:32,230 --> 00:06:34,910
S1: Like what other pieces other than the agents do you

96
00:06:34,910 --> 00:06:36,310
S1: mean by AI components?

97
00:06:37,990 --> 00:06:42,110
S2: So it's a great question because like in the beginning, um,

98
00:06:42,350 --> 00:06:46,630
S2: when when I like did simple data science before before,

99
00:06:46,670 --> 00:06:51,089
S2: you know, ChatGPT and LLMs, um, there weren't many components.

100
00:06:51,089 --> 00:06:54,570
S2: You know, you had data set and then you build

101
00:06:54,570 --> 00:07:00,410
S2: with it some some machine learning model. And and now

102
00:07:00,410 --> 00:07:05,809
S2: it's the amount of components is exploding because you have the, um,

103
00:07:05,850 --> 00:07:08,010
S2: you know, the data layer and you have data set

104
00:07:08,170 --> 00:07:11,570
S2: and then you have also data is like changing because

105
00:07:11,570 --> 00:07:15,170
S2: you have data for training. But if you take already

106
00:07:15,770 --> 00:07:19,210
S2: a model which is another component and then fine tuned it,

107
00:07:19,210 --> 00:07:23,250
S2: or just doing alignment. So you have many types of data.

108
00:07:23,650 --> 00:07:26,250
S2: You also have many types of models. You have models

109
00:07:26,250 --> 00:07:31,690
S2: you're using for training, models you're using for uh, for, uh,

110
00:07:32,890 --> 00:07:38,210
S2: for, like, for fine tuning, and models you're using as is. Yeah.

111
00:07:38,930 --> 00:07:42,210
S2: So that's more components. But we also have all the

112
00:07:42,210 --> 00:07:45,929
S2: components on the code layer. And we have the system

113
00:07:45,930 --> 00:07:51,340
S2: prompt and the agent tools and the third party

114
00:07:51,740 --> 00:07:57,580
S2: tools like MCP servers. Yep. And the actual agents that

115
00:07:57,580 --> 00:08:02,460
S2: can speak with many of these tools. And we have

116
00:08:02,500 --> 00:08:07,380
S2: like many agents we're already seeing, we already see adoption

117
00:08:07,820 --> 00:08:10,460
S2: of like multiple multiple agent frameworks.

118
00:08:12,100 --> 00:08:15,220
S1: Yeah. And maybe the APIs themselves, which they're not actually

119
00:08:15,340 --> 00:08:18,540
S1: AI components, but those agents will be calling back to

120
00:08:18,580 --> 00:08:21,780
S1: traditional APIs as well. Yeah. So that's a pretty good

121
00:08:21,780 --> 00:08:24,980
S1: list of the components. You're going to mention something else

122
00:08:24,980 --> 00:08:26,660
S1: like another another component.

123
00:08:27,340 --> 00:08:30,220
S2: And, definitely. Someone just asked me today, like, what's

124
00:08:30,220 --> 00:08:35,859
S2: the difference between like old APIs, third party APIs and

125
00:08:35,900 --> 00:08:41,940
S2: MCP servers or agent tools? And basically like you, you

126
00:08:41,940 --> 00:08:47,620
S2: would you would think there's no difference. And I would argue,

127
00:08:47,880 --> 00:08:49,760
S2: but like, tell me what? Maybe it's also a question

128
00:08:49,760 --> 00:08:51,760
S2: for you. Tell me what you think. I argue the

129
00:08:51,760 --> 00:08:56,439
S2: main difference is the way you're using them and not

130
00:08:56,960 --> 00:08:59,839
S2: the tool is basically the same tool. The way you

131
00:08:59,840 --> 00:09:05,600
S2: use them is that it's more close and more connected

132
00:09:05,880 --> 00:09:08,840
S2: to your data, your critical data.

133
00:09:10,360 --> 00:09:15,880
S1: Yeah, yeah, I would say another big difference is that, um,

134
00:09:16,480 --> 00:09:20,240
S1: usually when you build an API from scratch, you have

135
00:09:20,240 --> 00:09:24,880
S1: a skilled developer who is going through very systematic process

136
00:09:25,480 --> 00:09:30,080
S1: to define exactly what methods are possible. They still make mistakes, obviously,

137
00:09:30,080 --> 00:09:34,360
S1: because we have API security problems. However, at least it's

138
00:09:34,360 --> 00:09:39,120
S1: being manually done right. Whereas with um, I feel like

139
00:09:39,120 --> 00:09:43,880
S1: with MCP servers, the problem is you could have a,

140
00:09:43,880 --> 00:09:46,160
S1: you know, data and you can have APIs on the

141
00:09:46,300 --> 00:09:49,940
S1: back end. You spin up this MCP server and it

142
00:09:49,940 --> 00:09:52,980
S1: just kind of goes and collects all that functionality and

143
00:09:52,980 --> 00:09:56,780
S1: presents its own new APIs that could be used

144
00:09:56,780 --> 00:09:59,179
S1: by the agent tools. So I feel like you could

145
00:09:59,820 --> 00:10:06,140
S1: create more functionality without knowledge. And that that's kind of

146
00:10:06,179 --> 00:10:08,460
S1: the issue, because you might be surprised by actually what

147
00:10:08,460 --> 00:10:12,740
S1: can happen through that MCP server. So I think, yeah,

148
00:10:12,780 --> 00:10:14,460
S1: I think a lot of stuff is being stood up

149
00:10:14,460 --> 00:10:18,180
S1: with MCP servers, and the people hosting it don't actually

150
00:10:18,179 --> 00:10:20,380
S1: know all of its capabilities.

151
00:10:21,380 --> 00:10:24,780
S2: It's a great point. Also, in a way, you can't

152
00:10:24,820 --> 00:10:29,179
S2: know all the capabilities because part of the like, the

153
00:10:29,179 --> 00:10:33,740
S2: best thing about this AI revolution is that the interface,

154
00:10:33,780 --> 00:10:36,820
S2: but also the worst, is that the interface is fuzzy.

155
00:10:37,300 --> 00:10:39,340
S2: The input and the output is fuzzy. It means.

156
00:10:39,860 --> 00:10:40,939
S1: Yeah, it's 100%.

157
00:10:40,940 --> 00:10:44,180
S2: You can't define exactly the signature of the function with

158
00:10:44,179 --> 00:10:47,440
S2: all the variables. That's only integer. You can't put string there, right?

159
00:10:47,480 --> 00:10:50,199
S2: So then you can be afraid only of integer overflow, maybe.

160
00:10:50,720 --> 00:10:56,200
S2: And and and now the interface is so fuzzy. It's text,

161
00:10:56,240 --> 00:11:01,160
S2: it's PDF, it's voice. It's PDF with image that has

162
00:11:01,160 --> 00:11:04,200
S2: some something inside it. You have so many options. So

163
00:11:04,200 --> 00:11:06,000
S2: the attack surface is huge.

164
00:11:06,800 --> 00:11:09,520
S1: That's right. And then there's also the issue of like

165
00:11:09,520 --> 00:11:14,080
S1: the fuzziness of if you're actually interacting with an agent

166
00:11:14,080 --> 00:11:17,840
S1: on the receiving side that is front ending but separate

167
00:11:17,840 --> 00:11:20,360
S1: from the MCP server. If you're talking to an agent

168
00:11:20,600 --> 00:11:23,480
S1: and it has the ability to use the tools, you

169
00:11:23,480 --> 00:11:26,280
S1: might be able to confuse it or trick it into

170
00:11:26,640 --> 00:11:32,640
S1: using the APIs it has available in unsafe ways. Right. Um.

171
00:11:33,160 --> 00:11:34,160
S2: That's a great point.

172
00:11:34,280 --> 00:11:37,600
S1: Yeah. And it might it might respond back and say no.

173
00:11:37,600 --> 00:11:40,240
S1: And you ask in a different way and it still

174
00:11:40,240 --> 00:11:41,239
S1: gives you results.

175
00:11:42,400 --> 00:11:45,929
S2: And by the way, we've seen in the wild and many,

176
00:11:45,929 --> 00:11:51,449
S2: many patterns, malicious patterns we've seen with open source libraries,

177
00:11:51,490 --> 00:11:54,290
S2: like in the beginning of a, you know, the concept

178
00:11:54,290 --> 00:11:58,770
S2: of SCA and open source security and with models and

179
00:11:58,770 --> 00:12:02,730
S2: like a concept, like a typosquatting when you're trying to

180
00:12:02,770 --> 00:12:08,490
S2: do phishing to humans. Yeah. So humans would choose, you know,

181
00:12:08,530 --> 00:12:12,410
S2: the wrong open source library because they change something. They

182
00:12:12,410 --> 00:12:15,330
S2: added a dash. So same with models. We've seen it.

183
00:12:15,370 --> 00:12:18,329
S1: Yeah. For like for like npm packages, stuff like that.

184
00:12:18,330 --> 00:12:19,450
S1: Package managers.

185
00:12:19,850 --> 00:12:23,050
S2: Exactly. So we've seen it also with models in the

186
00:12:23,050 --> 00:12:27,809
S2: evolution of our product. And now we're seeing

187
00:12:27,809 --> 00:12:33,010
S2: it with both, with agent tools and libraries. So Cursor,

188
00:12:33,370 --> 00:12:38,610
S2: uh, you can very easily trick Cursor, uh, to use a

189
00:12:38,770 --> 00:12:42,630
S2: typosquatting package. Let's call it this way. And same for

190
00:12:42,830 --> 00:12:46,750
S2: MCP servers. You have malicious MCP servers.

191
00:12:47,590 --> 00:12:51,910
S1: Right? That's a that's a great point. And that could

192
00:12:51,910 --> 00:12:53,710
S1: just be a man in the middle that just passes

193
00:12:53,710 --> 00:12:54,750
S1: on requests. Right?

194
00:12:55,429 --> 00:12:58,550
S2: Exactly. It totally, it looks 100% legit.

195
00:12:59,309 --> 00:13:03,590
S1: Yeah. But I'm saying when you submit to that malicious one,

196
00:13:03,590 --> 00:13:06,309
S1: it could be still submitting to the the real one

197
00:13:06,630 --> 00:13:09,829
S1: and returning you real results. But in the meantime, gathering

198
00:13:09,830 --> 00:13:11,390
S1: data or doing whatever.

199
00:13:12,990 --> 00:13:15,470
S2: I guess that's the worst, because when you have a

200
00:13:15,510 --> 00:13:19,870
S2: Bitcoin miner, I guess like the let's call it malicious

201
00:13:19,870 --> 00:13:24,110
S2: actor knows that it's going to be caught in the

202
00:13:24,110 --> 00:13:26,750
S2: next few months, and you're trying to make the best

203
00:13:26,910 --> 00:13:32,390
S2: out of this month. And this silent, silent man in

204
00:13:32,390 --> 00:13:37,150
S2: the middle type of malicious packages, malicious models and malicious servers,

205
00:13:37,190 --> 00:13:40,730
S2: that's probably the worst, especially when it's third party. So

206
00:13:40,730 --> 00:13:43,809
S2: you don't need even to open source your server.

207
00:13:44,850 --> 00:13:48,530
S1: Yeah, MCP is going to need some serious security help

208
00:13:48,530 --> 00:13:53,849
S1: very quickly. It's because everyone's just running full speed, installing

209
00:13:53,850 --> 00:13:56,610
S1: as many of these things as they can. And yeah,

210
00:13:56,610 --> 00:14:00,130
S1: it's it's a good point. It's it's a really big

211
00:14:00,130 --> 00:14:04,810
S1: mess right now. Um, what what other things? Uh, we

212
00:14:04,850 --> 00:14:08,490
S1: got malicious MCP servers. Um, we've got agents that can

213
00:14:08,490 --> 00:14:13,410
S1: be tricked. I always talk about just like the, um,

214
00:14:14,210 --> 00:14:19,210
S1: the agents having too many tools available. Uh, because oftentimes

215
00:14:19,210 --> 00:14:21,770
S1: I think when if the business is pushing, like, we

216
00:14:21,770 --> 00:14:24,370
S1: must have AI, we must have AI. Oh, and by

217
00:14:24,370 --> 00:14:27,450
S1: the way, we must have an agent. They just give

218
00:14:27,450 --> 00:14:31,250
S1: the agent access to too many tools. And, um, the

219
00:14:31,250 --> 00:14:35,730
S1: guardrail infrastructure isn't really there yet. I don't know if, uh,

220
00:14:35,770 --> 00:14:40,430
S1: you guys have something around this, but, um, like, uh,

221
00:14:40,430 --> 00:14:44,110
S1: I use Bedrock a lot, and Bedrock has, um, uh,

222
00:14:44,110 --> 00:14:47,790
S1: some pretty cool guardrails stuff built in, but we're running

223
00:14:47,790 --> 00:14:51,750
S1: way faster than the guardrails are being laid down. So

224
00:14:51,790 --> 00:14:54,510
S1: I just feel like there's a there's a mismatch between

225
00:14:54,510 --> 00:14:58,350
S1: the amount of power that agents have and the amount of, um,

226
00:14:58,390 --> 00:15:00,470
S1: access and power that they have.

227
00:15:02,110 --> 00:15:06,430
S2: I think this asymmetry was right for every new

228
00:15:06,470 --> 00:15:11,430
S2: category in security. That's true. I think AI security is

229
00:15:11,430 --> 00:15:15,030
S2: probably one of the quickest categories to to catch up.

230
00:15:15,270 --> 00:15:18,230
S2: If you look like on on games, for example, game

231
00:15:18,230 --> 00:15:22,430
S2: security on operating systems, um, in the early days or

232
00:15:22,470 --> 00:15:24,990
S2: networks in the early days, we still have protocols like

233
00:15:24,990 --> 00:15:29,750
S2: DHCP and ARP or DNS. It's so easy, so much

234
00:15:29,750 --> 00:15:34,950
S2: trust into into other, uh, other people in the early days.

235
00:15:35,990 --> 00:15:36,310
S1: Yeah.

236
00:15:36,350 --> 00:15:41,090
S2: Categories like, uh, car security. Um. Oh, yeah. Security. Very.

237
00:15:41,330 --> 00:15:46,130
S2: It's amazing how quickly they adopt security practices even before

238
00:15:46,250 --> 00:15:50,730
S2: the Log4j, uh, happened, of the industry.

239
00:15:51,650 --> 00:15:54,650
S1: Yeah, it's a good point. Yeah. ARP is always the

240
00:15:54,650 --> 00:15:58,570
S1: one that trips me out the most. It's like. It's like, um.

241
00:15:59,010 --> 00:16:01,330
S1: You don't even have to ask a question. You could

242
00:16:01,330 --> 00:16:06,810
S1: just receive answers like, oh, by the way, um, here

243
00:16:06,810 --> 00:16:08,890
S1: is the MAC address that you're supposed to talk to.

244
00:16:09,690 --> 00:16:12,290
S1: And your host is like, thank you very much. I

245
00:16:12,290 --> 00:16:15,170
S1: will update that table immediately. It's just.

246
00:16:15,330 --> 00:16:16,090
S2: It's amazing.

247
00:16:16,930 --> 00:16:20,610
S1: And the fact that it all still works. Yeah. Interesting.

248
00:16:20,610 --> 00:16:24,250
S1: So I think I agree with you. I mean, we're

249
00:16:24,250 --> 00:16:28,170
S1: basically running with scissors with AI, um, it's funny because

250
00:16:28,210 --> 00:16:31,170
S1: I've been in the space for so long, like 90

251
00:16:31,210 --> 00:16:36,900
S1: since 99. So we, uh, and I started mostly in

252
00:16:36,940 --> 00:16:40,180
S1: network security. And as you move to web security, you

253
00:16:40,180 --> 00:16:44,700
S1: have to relearn all the network security issues, right? We

254
00:16:44,700 --> 00:16:48,260
S1: learned those lessons for ten years, 15 years, we

255
00:16:48,260 --> 00:16:50,740
S1: forget them when we move to web security, we forget

256
00:16:50,740 --> 00:16:53,620
S1: them again. You go to mobile security, you forget them

257
00:16:53,620 --> 00:16:55,739
S1: again a little bit when you go to cloud security.

258
00:16:55,860 --> 00:17:00,260
S1: And now with AI but yeah, maybe. Well, okay, here's

259
00:17:00,260 --> 00:17:06,020
S1: the question. Why is AI security picking up so fast

260
00:17:06,020 --> 00:17:09,940
S1: compared to the other ones? Why is the delay shorter?

261
00:17:12,100 --> 00:17:16,300
S2: Um, I think it's I think if you like, if

262
00:17:16,300 --> 00:17:23,419
S2: you think about it, um, everyone understand all the products

263
00:17:23,420 --> 00:17:28,420
S2: are going to have AI. And because of this fuzzy

264
00:17:28,420 --> 00:17:31,899
S2: interface and that you want to you want it to

265
00:17:31,900 --> 00:17:37,280
S2: be connected into your most important data sources. And I

266
00:17:37,280 --> 00:17:41,640
S2: think everyone understands kind of a game of of who's

267
00:17:41,640 --> 00:17:45,080
S2: going to be the first Log4j, in a way.

268
00:17:45,119 --> 00:17:48,360
S2: Maybe everyone just understand it's a matter of time and

269
00:17:48,359 --> 00:17:50,800
S2: you don't want. It's just you don't want it to

270
00:17:50,800 --> 00:17:52,719
S2: be you. The first Log4j.

271
00:17:53,400 --> 00:17:53,800
S1: That makes.

272
00:17:53,800 --> 00:17:57,399
S2: Sense. I understand I understand all the companies, like all

273
00:17:57,400 --> 00:18:01,080
S2: my customers that like trying to push hard for to

274
00:18:01,119 --> 00:18:05,720
S2: develop with AI, to develop AI into their products. Because

275
00:18:05,720 --> 00:18:08,480
S2: if they won't do it, all the competitors will do it.

276
00:18:09,240 --> 00:18:11,400
S2: I mean, it's a huge advantage. Also us as like

277
00:18:11,440 --> 00:18:15,679
S2: a security vendor, we need to we're using AI to

278
00:18:15,720 --> 00:18:20,280
S2: solve problems in our products to to understand code and

279
00:18:20,280 --> 00:18:24,520
S2: to suggest code remediations. And it's something that you know,

280
00:18:25,440 --> 00:18:27,800
S2: or you're going to be the first to do it

281
00:18:28,080 --> 00:18:30,879
S2: and you have this advantage, or you'll be the last

282
00:18:30,880 --> 00:18:34,860
S2: to do it. And you, like, you lose the game.

283
00:18:35,820 --> 00:18:38,740
S1: Yeah, that could be it. The the fact that it

284
00:18:38,740 --> 00:18:41,740
S1: just feels so big. People just have a natural fear

285
00:18:41,740 --> 00:18:45,420
S1: of it. Whereas maybe it was slower with the previous revolutions.

286
00:18:45,859 --> 00:18:49,780
S1: How do you see the distinction between security of AI

287
00:18:49,940 --> 00:18:51,700
S1: versus AI security.

288
00:18:54,180 --> 00:19:00,379
S2: Security of AI? Basically, with every new every category in security.

289
00:19:00,380 --> 00:19:06,459
S2: When you say X security, it usually means securing X, right? Like, yeah,

290
00:19:06,700 --> 00:19:11,939
S2: cloud security. It's securing the cloud. SaaS security is

291
00:19:11,940 --> 00:19:16,700
S2: securing SaaS. Somehow with AI security, uh, it's, it's still

292
00:19:16,700 --> 00:19:19,980
S2: confusing because I guess it's in the early days, uh,

293
00:19:20,380 --> 00:19:23,420
S2: I think we should align that AI security is securing

294
00:19:23,460 --> 00:19:30,540
S2: AI and not like it's not, uh, securing the the

295
00:19:30,540 --> 00:19:35,760
S2: output of AI or something like that. Um, but a

296
00:19:35,760 --> 00:19:36,960
S2: time will tell, I guess.

297
00:19:37,480 --> 00:19:40,600
S1: Yeah, yeah. I wonder if they start to merge. I

298
00:19:40,680 --> 00:19:43,840
S1: think the reason, maybe one of the reasons that it

299
00:19:43,840 --> 00:19:47,840
S1: started out being separate was this whole concept of, um,

300
00:19:48,359 --> 00:19:51,600
S1: just the model, uh, remember, uh, data poisoning. It's not

301
00:19:51,600 --> 00:19:56,320
S1: talked about as much as, like, um, in 22 or 23,

302
00:19:56,359 --> 00:19:59,560
S1: but it was like, what is the, um, can the

303
00:19:59,560 --> 00:20:03,840
S1: data be poisoned that, uh, you know, the AIs are

304
00:20:03,840 --> 00:20:08,159
S1: being trained on? Right. So it was like, I don't know,

305
00:20:08,160 --> 00:20:11,000
S1: there's just not nearly as much focus on that anymore.

306
00:20:11,040 --> 00:20:13,040
S1: And now it's more about it. I would agree with you.

307
00:20:13,040 --> 00:20:14,800
S1: I think it's actually merging more now.

308
00:20:16,960 --> 00:20:21,720
S2: Yeah. Also. Yeah. Also like, uh, in the beginning everyone

309
00:20:21,720 --> 00:20:26,280
S2: spoke about like, AI driven security. Um, and it's kind

310
00:20:26,280 --> 00:20:32,100
S2: of funny because, uh, anomaly detection, it's basically like using

311
00:20:32,140 --> 00:20:36,900
S2: AI in order to find anomalies. Um, and so many

312
00:20:36,900 --> 00:20:40,060
S2: of the categories are already using heavily, heavily based on AI.

313
00:20:41,220 --> 00:20:43,780
S2: But if we look in the future, there's not going

314
00:20:43,780 --> 00:20:47,020
S2: to be any vendor that's not using AI. That's right.

315
00:20:47,060 --> 00:20:50,380
S2: Also in application security, like your promise at the end

316
00:20:50,420 --> 00:20:55,780
S2: is suggesting how to harden your code. And of course

317
00:20:55,780 --> 00:20:56,740
S2: you're going to use AI.

318
00:20:57,540 --> 00:21:00,420
S1: That's right. That's right. We don't have any database companies

319
00:21:00,420 --> 00:21:04,300
S1: because, um, well, we do, that only make databases

320
00:21:04,300 --> 00:21:07,380
S1: or whatever. But every company is a database company. Every

321
00:21:07,380 --> 00:21:11,100
S1: company is an Excel company. Like. Yeah, it's just, uh,

322
00:21:11,420 --> 00:21:15,419
S1: it's just built in, um, so so what what do

323
00:21:15,420 --> 00:21:18,060
S1: you think about the current solutions? Uh, what do you

324
00:21:18,060 --> 00:21:20,979
S1: think about current solutions in terms of, like, what are

325
00:21:20,980 --> 00:21:23,620
S1: the current, like, AppSec vendors, like the ones that have

326
00:21:23,619 --> 00:21:26,619
S1: been around for, you know, 15 years or whatever? How

327
00:21:26,660 --> 00:21:30,670
S1: are their solutions, like, solving these problems that we've

328
00:21:30,670 --> 00:21:31,470
S1: been talking about.

329
00:21:33,030 --> 00:21:40,070
S2: And the really simple answer: we just don't. And, like,

330
00:21:40,109 --> 00:21:44,390
S2: classic AppSec, AppSec solutions, doing a really good job

331
00:21:44,670 --> 00:21:49,550
S2: in detecting some patterns in your code of vulnerabilities, like

332
00:21:49,590 --> 00:21:52,430
S2: CWE is, at the end of the day, inherently

333
00:21:52,430 --> 00:21:55,710
S2: a pattern, and not it's not a technique to solve it.

334
00:21:55,710 --> 00:21:58,630
S2: To find the CWE, the way we think about it

335
00:21:58,630 --> 00:22:05,030
S2: is about finding patterns. And CVEs, which today is maybe

336
00:22:05,070 --> 00:22:08,310
S2: a bit sad day for CVEs. And let's say there's

337
00:22:08,310 --> 00:22:14,429
S2: many other security advisories. GHSA, the others, you know, the

338
00:22:14,470 --> 00:22:19,310
S2: GitHub security advisory, the Ruby Security advisory. So all of them, um,

339
00:22:19,350 --> 00:22:23,149
S2: it's something that it's in your libraries. Um, so the

340
00:22:23,150 --> 00:22:26,670
S2: way you think about it is only libraries and code issues.

341
00:22:26,910 --> 00:22:31,090
S2: But the thing about AI security is that first, to

342
00:22:31,130 --> 00:22:34,570
S2: even to detect these components, it's not enough to look

343
00:22:34,570 --> 00:22:37,929
S2: in libraries. And libraries can give you a hint. And

344
00:22:37,970 --> 00:22:41,930
S2: actually it's something it's a great thing we do. We

345
00:22:41,930 --> 00:22:45,490
S2: only using the libraries tell you the hints. It it.

346
00:22:46,170 --> 00:22:49,369
S2: I remember the day when we found you can do it.

347
00:22:49,369 --> 00:22:52,530
S2: You can just just take the libraries even before you put,

348
00:22:52,570 --> 00:22:55,490
S2: you know, some heavy scanners on all the places. Just

349
00:22:55,490 --> 00:22:57,850
S2: take the libraries and extract all the hints you can

350
00:22:57,930 --> 00:23:00,929
S2: about the usage of AI. And that was like a great, uh,

351
00:23:00,970 --> 00:23:07,810
S2: you know, uh, eureka for us. Um, and, uh, and

352
00:23:07,810 --> 00:23:10,850
S2: that's maybe the first step, basically. How do you find models?

353
00:23:11,290 --> 00:23:14,810
S2: Models are something where, um, you find the

354
00:23:14,810 --> 00:23:18,890
S2: model artifact, which can be either in the repository,

355
00:23:18,890 --> 00:23:23,730
S2: in the container, in some S3 bucket, or in special, uh,

356
00:23:23,770 --> 00:23:27,630
S2: you know, model repositories. So that's a new place for models.

357
00:23:27,630 --> 00:23:30,790
S2: But if it's, you know, third party, if we're using

358
00:23:30,950 --> 00:23:35,670
S2: some inference providers, which from my experience most customers

359
00:23:35,670 --> 00:23:40,310
S2: start from, using inference providers. Most companies, they start with

360
00:23:40,310 --> 00:23:43,550
S2: some prototype and they will use the easiest, you know, OpenAI, Azure,

361
00:23:43,550 --> 00:23:48,630
S2: OpenAI, Bedrock. Yeah. And so, so that's something

362
00:23:48,670 --> 00:23:51,550
S2: you can't find anywhere. Not in the container. Not the

363
00:23:51,670 --> 00:23:54,709
S2: it's not an artifact. And the only place is in

364
00:23:54,710 --> 00:23:58,470
S2: the code. And none of our current solutions is, like,

365
00:23:58,670 --> 00:24:06,350
S2: positioned to find it. And so, so like, uh,

366
00:24:06,430 --> 00:24:10,550
S2: you need a new, new solution. And I think

367
00:24:10,590 --> 00:24:12,950
S2: the AI security companies, the new

368
00:24:12,950 --> 00:24:17,230
S2: products, are going to be something, like, totally new, and

369
00:24:17,270 --> 00:24:20,990
S2: that is going to be merged into the current DevSecOps

370
00:24:20,990 --> 00:24:22,630
S2: and Appsec workflows.
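
The "take the libraries and extract all the hints" step described above can be sketched roughly as follows. This is a minimal illustration only, not the vendor's method: the `AI_LIBRARY_HINTS` mapping and the `extract_ai_hints` function are hypothetical names, and the package list is a small assumed sample rather than a complete catalog.

```python
import re

# Hypothetical mapping from package names to the AI usage they hint at.
# A real inventory would be far larger; these entries are illustrative.
AI_LIBRARY_HINTS = {
    "openai": "hosted inference provider (OpenAI or Azure OpenAI)",
    "boto3": "possible Amazon Bedrock usage (needs a code-level check)",
    "transformers": "self-hosted or open source models",
    "torch": "model training or local inference",
    "langchain": "agent or RAG orchestration",
}


def extract_ai_hints(requirements_text: str) -> dict:
    """Return {package: hint} for AI-related packages in a requirements file."""
    hints = {}
    for raw_line in requirements_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        # The package name is everything before a version specifier,
        # an extras bracket, an environment marker, or a space.
        name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0].lower()
        if name in AI_LIBRARY_HINTS:
            hints[name] = AI_LIBRARY_HINTS[name]
    return hints


sample = "openai==1.30.0\nrequests>=2.0\n# torch\ntransformers[torch]>=4.0\n"
found = extract_ai_hints(sample)
```

A hint like this only says where to look. As the speaker notes, a hosted inference provider leaves no artifact, so confirming actual usage still means reading the code that calls it.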

371
00:24:23,630 --> 00:24:26,490
S1: Yeah. The way I think about it, or I

372
00:24:26,490 --> 00:24:29,129
S1: always used to think about it because my background is

373
00:24:29,450 --> 00:24:34,090
S1: largely web security, um, where we would have a major

374
00:24:34,090 --> 00:24:38,250
S1: distinction between dynamic and static security. Right. So, like, I

375
00:24:38,250 --> 00:24:40,210
S1: was at Fortify for a very long time,

376
00:24:40,250 --> 00:24:42,129
S1: and there were the static people over there, and we

377
00:24:42,130 --> 00:24:45,490
S1: were the dynamic people. I feel like the AI piece

378
00:24:45,970 --> 00:24:49,850
S1: definitely is on the dynamic side. Um, it has to

379
00:24:49,850 --> 00:24:53,570
S1: be right. So rather than just like web testing and

380
00:24:53,570 --> 00:25:00,609
S1: API testing, it's got to be more comprehensive. Um, so

381
00:25:01,090 --> 00:25:02,730
S1: let's just jump right into.

382
00:25:03,130 --> 00:25:06,930
S2: We, you know, we started with, uh, with the static

383
00:25:07,290 --> 00:25:11,889
S2: part of AI because we found that the biggest issue

384
00:25:12,090 --> 00:25:15,530
S2: is threat modeling just to discover what you don't know

385
00:25:15,530 --> 00:25:19,410
S2: you have. And we found that, uh, that's we found

386
00:25:19,410 --> 00:25:21,570
S2: that it's not just a big issue. It's like huge

387
00:25:21,570 --> 00:25:24,540
S2: issue because most companies just don't know what they have

388
00:25:25,340 --> 00:25:28,660
S2: by a factor of ten. But then, because of that,

389
00:25:28,660 --> 00:25:33,260
S2: we moved into the dynamic section. And we're now

390
00:25:33,300 --> 00:25:37,699
S2: like trying to offer both because at the end of

391
00:25:37,700 --> 00:25:40,300
S2: the day, an AI model is a bunch of numbers

392
00:25:40,660 --> 00:25:43,620
S2: and, unlike code you can understand, you just can't understand it.

393
00:25:43,660 --> 00:25:48,060
S2: And the only way is through through a conversation, through

394
00:25:48,100 --> 00:25:52,500
S2: like dynamic and simulating attacks, through pentesting basically.

395
00:25:53,020 --> 00:26:01,140
S1: Yeah. Yeah. Absolutely. So. So let's think about that. So, um,

396
00:26:02,180 --> 00:26:06,300
S1: just talk me through like how your solutions are set up.

397
00:26:06,300 --> 00:26:10,340
S1: Like what is the the basic tagline for it like, um,

398
00:26:11,540 --> 00:26:15,900
S1: is it, uh, is it asset discovery? Are you discovering

399
00:26:15,940 --> 00:26:20,020
S1: like the the structure of the application? Are you enumerating

400
00:26:20,060 --> 00:26:23,960
S1: like controls? Roles. Are you testing controls? What exactly is

401
00:26:23,960 --> 00:26:25,320
S1: it that the suite does?

402
00:26:26,400 --> 00:26:31,760
S2: Um, so it's it's kind of both. Um, we're starting from, uh,

403
00:26:31,920 --> 00:26:36,680
S2: statically scanning all the assets, all the AI resources, components,

404
00:26:36,680 --> 00:26:39,720
S2: you can call it in different names, and to find

405
00:26:39,760 --> 00:26:42,520
S2: all the AI you have. And usually at this stage

406
00:26:43,080 --> 00:26:48,720
S2: we find that usually we start like, you know. Like

407
00:26:48,760 --> 00:26:53,639
S2: average company will say we have some AI components, some

408
00:26:53,640 --> 00:26:58,399
S2: AI driven applications. Then when we do the initial scan,

409
00:26:58,560 --> 00:27:01,359
S2: we find it's more than that by a factor of ten.

410
00:27:02,080 --> 00:27:07,480
S2: And and that's every time exciting to see like and

411
00:27:07,480 --> 00:27:10,560
S2: you know, in security, it's really

412
00:27:10,560 --> 00:27:13,000
S2: easy to say that it's bad. But actually it's pretty

413
00:27:13,000 --> 00:27:16,159
S2: amazing how the industry is trying to push harder to

414
00:27:16,200 --> 00:27:21,220
S2: push forward. So, so even without security knowing, everyone's

415
00:27:21,220 --> 00:27:24,700
S2: just trying to use it, trying to use it, trying

416
00:27:24,700 --> 00:27:30,859
S2: to make smarter applications and more valuable products. Um,

417
00:27:30,980 --> 00:27:34,699
S2: so so we detect it. And then the question is, uh,

418
00:27:34,780 --> 00:27:38,899
S2: if I have DeepSeek, uh, with a RAG, maybe it's

419
00:27:38,900 --> 00:27:42,460
S2: bad because RAG is connected to data, you know, you don't

420
00:27:42,460 --> 00:27:45,700
S2: want to leak this data, and you may be concerned

421
00:27:45,700 --> 00:27:48,859
S2: about DeepSeek unless you're in China, which is a very

422
00:27:48,859 --> 00:27:52,500
S2: it's the opposite. You probably want to use DeepSeek. And

423
00:27:52,540 --> 00:27:54,860
S2: so you just want to know what you have. Then

424
00:27:54,900 --> 00:27:58,580
S2: you want to detect all the risks you have in

425
00:27:58,580 --> 00:28:03,540
S2: these components. And which is kind of a composition analysis, um,

426
00:28:03,820 --> 00:28:06,860
S2: when you think about it deeply, not only libraries

427
00:28:06,859 --> 00:28:13,260
S2: in SCA. Um. And that's amazing. You find, uh, legal risk.

428
00:28:13,300 --> 00:28:15,780
S2: You find security risk for all these components. It can

429
00:28:15,780 --> 00:28:20,359
S2: be models, uh, you know, uh, agents, and up to

430
00:28:20,400 --> 00:28:22,200
S2: MCP servers, agent tools.

431
00:28:22,960 --> 00:28:26,160
S1: And how are you getting all the components? Are they

432
00:28:26,160 --> 00:28:29,400
S1: providing them to you, or are you getting them dynamically? Like,

433
00:28:29,640 --> 00:28:31,520
S1: how are you getting these from the customer?

434
00:28:32,640 --> 00:28:36,840
S2: Um, so we started with taking the hints from the

435
00:28:36,840 --> 00:28:40,920
S2: open source libraries, and then we used these hints to

436
00:28:40,960 --> 00:28:44,760
S2: look into the code and find, uh, all the components

437
00:28:44,760 --> 00:28:49,760
S2: in code. Um, and if we have, uh, if you're

438
00:28:49,760 --> 00:28:54,800
S2: using self-hosted models or like open source models, uh, we

439
00:28:54,800 --> 00:28:59,480
S2: looked for artifacts. So we'll find a file that we, that,

440
00:28:59,520 --> 00:29:02,240
S2: you know, from the magic of the file, from the

441
00:29:02,240 --> 00:29:05,640
S2: headers of the file, we know it's a model. And

442
00:29:05,640 --> 00:29:09,880
S2: then we have some fingerprinting, uh, technique to, to match

443
00:29:09,880 --> 00:29:14,360
S2: it to the actual model it's inherited from. So

444
00:29:14,360 --> 00:29:16,400
S2: if you took a file, fine-tuned it, you know,

445
00:29:16,440 --> 00:29:18,980
S2: took something from Hugging Face, fine-tuning it and using it,

446
00:29:19,700 --> 00:29:22,940
S2: we'll tell you about this file and that it's related

447
00:29:22,940 --> 00:29:26,260
S2: to a Hugging Face model, and hopefully not a malicious one.
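
The "magic of the file, the headers of the file" check mentioned above can be sketched like this. It is a rough illustration under stated assumptions: `sniff_model_format` is a hypothetical helper, it covers only four well-known signatures (GGUF, ZIP, HDF5, and pickle protocol 2 or later), and a match means "worth a closer look", not "confirmed model". The fingerprinting back to a base model that the speaker describes is not shown.

```python
from typing import Optional


def sniff_model_format(header: bytes) -> Optional[str]:
    """Guess a container format from the first bytes of a file.

    Hypothetical helper: a signature match is only a hint that the
    file may hold model weights.
    """
    if header.startswith(b"GGUF"):
        return "gguf"
    if header.startswith(b"PK\x03\x04"):
        # ZIP container, used for example by recent PyTorch checkpoints.
        return "zip"
    if header.startswith(b"\x89HDF\r\n\x1a\n"):
        # HDF5 container, used for example by Keras .h5 files.
        return "hdf5"
    if len(header) >= 2 and header[0] == 0x80 and 2 <= header[1] <= 5:
        # PROTO opcode followed by the protocol number (2 through 5).
        return "pickle"
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read only the first 16 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_model_format(handle.read(16))
```

Reading just the header keeps the scan cheap enough to run across a repository, a container layer, or a bucket listing before any heavier analysis.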

448
00:29:27,460 --> 00:29:30,820
S1: Oh, interesting. I mean, that's that's an offering by itself, right?

449
00:29:30,820 --> 00:29:32,500
S1: Just like asset discovery.

450
00:29:33,140 --> 00:29:39,060
S2: Of course, you'll be surprised by how much, um, you know,

451
00:29:39,100 --> 00:29:43,380
S2: there's a the proportion between what you know and what

452
00:29:43,380 --> 00:29:46,180
S2: you don't know, uh, is, is large.

453
00:29:46,700 --> 00:29:47,100
S1: It's pretty.

454
00:29:47,140 --> 00:29:47,660
S2: Amazing.

455
00:29:47,940 --> 00:29:50,940
S1: Yeah. Whenever, whenever I do security assessments, I usually have

456
00:29:50,980 --> 00:29:54,900
S1: a really large visual that I'm building out throughout the week.

457
00:29:55,300 --> 00:29:58,340
S1: And honestly, it's just laying out what the application does.

458
00:29:58,340 --> 00:30:01,380
S1: It just lays out what the functionality is, where the

459
00:30:01,380 --> 00:30:04,580
S1: data is flowing. And as I'm interviewing more and more

460
00:30:04,580 --> 00:30:07,420
S1: developers and more and more people in the company, I

461
00:30:07,420 --> 00:30:09,420
S1: just bring them in and show them this thing and

462
00:30:09,420 --> 00:30:15,870
S1: they're like, oh yeah, everyone starts taking pictures, they start taking,

463
00:30:16,070 --> 00:30:18,390
S1: they're like, oh yeah. Yeah. So because they don't have

464
00:30:18,390 --> 00:30:21,750
S1: any documentation that's actually this good. And it turns out

465
00:30:21,750 --> 00:30:25,230
S1: if you just see it, if you just visualize it,

466
00:30:25,550 --> 00:30:28,230
S1: you're like, it's obvious to everyone who walks in the

467
00:30:28,230 --> 00:30:32,670
S1: room that this is a problem and, you know, when

468
00:30:32,670 --> 00:30:38,350
S1: it's not visualized, um, or it's not explicitly laid out. Yeah.

469
00:30:38,350 --> 00:30:41,270
S1: You just miss the stuff. Okay. So so you have

470
00:30:41,270 --> 00:30:42,390
S1: the list of components.

471
00:30:42,430 --> 00:30:45,790
S2: By the way. It's a, I like like all the

472
00:30:45,790 --> 00:30:47,750
S2: trend of showing topology.

473
00:30:48,590 --> 00:30:49,030
S1: Yes.

474
00:30:49,070 --> 00:30:52,750
S2: What you have. Yeah I love this trend. Um, I

475
00:30:52,750 --> 00:30:56,870
S2: think there's a really large debate in the security industry

476
00:30:57,150 --> 00:31:00,390
S2: because people say at the end of the day, it's

477
00:31:00,390 --> 00:31:04,150
S2: not it's not showing the data. You need tables. But

478
00:31:04,190 --> 00:31:08,670
S2: there's something about visualizations that like pass the give you

479
00:31:08,670 --> 00:31:12,230
S2: the value shows it in a different way. And it's funny.

480
00:31:12,910 --> 00:31:15,370
S1: I would just show an arrow and I would like

481
00:31:15,370 --> 00:31:18,130
S1: color code the arrows like according I would have the

482
00:31:18,130 --> 00:31:22,090
S1: data classification for the company on the board. And if

483
00:31:22,090 --> 00:31:26,530
S1: it's one of the like last two data classification like, um,

484
00:31:27,010 --> 00:31:30,770
S1: sensitive secret, whatever their classification is, then I would just

485
00:31:30,770 --> 00:31:34,650
S1: have all the connecting dots or the connecting arrows be red.

486
00:31:35,730 --> 00:31:39,210
S1: They're like, why are these red? Because, um, you know,

487
00:31:39,330 --> 00:31:43,330
S1: Sarah over here or John over here said that that

488
00:31:43,330 --> 00:31:47,330
S1: type of data classification is included in this data. And

489
00:31:47,330 --> 00:31:51,010
S1: suddenly after you talk to 20 different people, the whole

490
00:31:51,010 --> 00:31:55,130
S1: board is red, right? The whole board's red. And they're like, okay,

491
00:31:55,130 --> 00:31:57,770
S1: I didn't realize the problem was this bad. Yeah. And

492
00:31:57,770 --> 00:32:00,490
S1: then those documents end up being used as the official

493
00:32:00,490 --> 00:32:04,330
S1: document for the application going forward. So I feel like

494
00:32:04,330 --> 00:32:09,090
S1: this is, this is absolutely needed, especially for AI. When

495
00:32:09,090 --> 00:32:12,090
S1: I talk to people about AI implementations, I'm just like,

496
00:32:12,990 --> 00:32:17,310
S1: Show me exactly where the agent is in this workflow.

497
00:32:17,950 --> 00:32:21,110
S1: Show me exactly which APIs it has access to. And

498
00:32:21,110 --> 00:32:24,190
S1: as they start writing it down, they're like, oh, well,

499
00:32:24,190 --> 00:32:26,350
S1: I think I see the problem. I haven't even said

500
00:32:26,350 --> 00:32:27,230
S1: anything yet.

501
00:32:27,270 --> 00:32:32,350
S2: Yes. It's amazing. You know, I love threat modeling. I love, like,

502
00:32:32,390 --> 00:32:36,870
S2: doing it, uh, with customers. And you learn from it

503
00:32:36,870 --> 00:32:37,510
S2: so much.

504
00:32:38,590 --> 00:32:39,310
S1: Yeah, absolutely.

505
00:32:39,350 --> 00:32:41,430
S2: And at the end of the day, uh, like, it's

506
00:32:41,430 --> 00:32:45,910
S2: not enough because you need to detect the issues. Um,

507
00:32:46,310 --> 00:32:55,270
S2: so there's malicious models, which is, like, surprisingly, like, uh, getting, uh,

508
00:32:55,910 --> 00:32:59,590
S2: an evolving category. And because in open source models, there

509
00:32:59,630 --> 00:33:02,990
S2: is a you think a model is only like a

510
00:33:02,990 --> 00:33:07,310
S2: matrix of numbers, but actually it has also some serialized code.

511
00:33:07,470 --> 00:33:09,870
S2: It's like many of the many types of models are

512
00:33:09,870 --> 00:33:14,850
S2: like pickle files. Yeah. Not necessarily. It can be a

513
00:33:14,850 --> 00:33:18,650
S2: family of, uh, of types of pickle files, uh, which

514
00:33:18,650 --> 00:33:23,170
S2: is a code that is serialized into some, like, uh,

515
00:33:24,210 --> 00:33:25,490
S2: some opcodes.

516
00:33:25,730 --> 00:33:29,250
S1: It's always it's always the parsers. It's always the parsers.

517
00:33:29,930 --> 00:33:32,610
S2: Yes. Yeah. And it's going to, you know, pull code.

518
00:33:32,650 --> 00:33:36,170
S2: It's theoretically it can pull another binary, uh, from remote

519
00:33:36,170 --> 00:33:39,570
S2: and run it. Yeah. So it can be super malicious.
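
The point about pickle-based model files carrying serialized opcodes can be illustrated with the standard library alone. The sketch below is an assumption-laden example, not a production scanner: `CODE_OPCODES`, `find_code_opcodes`, and the `Payload` class are hypothetical names. It walks the opcode stream with `pickletools.genops`, which does not execute anything, and reports opcodes that can import or call objects when the file is loaded.

```python
import pickle
import pickletools

# Opcodes that can import a callable or invoke one during unpickling.
CODE_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def find_code_opcodes(data: bytes) -> list:
    """List import/call opcodes in a pickle without unpickling it."""
    return [
        opcode.name
        for opcode, _arg, _pos in pickletools.genops(data)
        if opcode.name in CODE_OPCODES
    ]


class Payload:
    """Stand-in for a crafted object: loading it would call print()."""

    def __reduce__(self):
        return (print, ("this would run on load",))


# Serializing is safe here; only pickle.loads() would trigger the call.
benign = pickle.dumps({"weights": [1.0, 2.0]})
crafted = pickle.dumps(Payload())

benign_hits = find_code_opcodes(benign)
crafted_hits = find_code_opcodes(crafted)
```

Legitimate checkpoints also use these opcodes to rebuild tensors, so a practical check would likely compare the imported names against an allowlist rather than flag every occurrence.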

520
00:33:40,530 --> 00:33:40,810
S1: Yeah.

521
00:33:41,050 --> 00:33:43,930
S2: It's also like a known risk, you know, like other

522
00:33:43,930 --> 00:33:49,930
S2: classic vulnerabilities. Um, only this week I saw a company that, uh,

523
00:33:50,170 --> 00:33:53,850
S2: scanned all the papers, like, all the academic papers, and

524
00:33:53,850 --> 00:33:58,170
S2: tried to extract the existing attacks and to map, you know,

525
00:33:58,210 --> 00:34:03,130
S2: attacks to models. Hmm. I think we still it's still not,

526
00:34:03,170 --> 00:34:05,290
S2: you know, we're still not there at the end of

527
00:34:05,330 --> 00:34:08,049
S2: the game. It's not like, uh, CVE, where you have

528
00:34:08,050 --> 00:34:13,190
S2: so many security advisories. I think we're getting there. And

529
00:34:13,950 --> 00:34:16,670
S2: we're for sure more mature as an industry.

530
00:34:17,830 --> 00:34:23,030
S1: Yeah. Okay. So let's say someone what is an ideal

531
00:34:23,030 --> 00:34:26,950
S1: customer look like for you, like in terms of they

532
00:34:26,950 --> 00:34:28,990
S1: come to you and they say, I have this problem

533
00:34:28,989 --> 00:34:32,069
S1: or this problem or this problem and you say, okay, perfect.

534
00:34:32,110 --> 00:34:34,790
S1: That's exactly what we do. What would those problems look

535
00:34:34,790 --> 00:34:37,030
S1: like and what would you tell them? The solution is.

536
00:34:38,510 --> 00:34:43,190
S2: Um, there's two types of companies that I see. One:

537
00:34:43,230 --> 00:34:46,469
S2: It's a company that's concerned about AI, but they don't

538
00:34:46,469 --> 00:34:49,190
S2: know what to do. And so that's kind of the

539
00:34:49,230 --> 00:34:54,270
S2: pre-threat-modeling stage. That's, that's a stage where the

540
00:34:54,270 --> 00:34:56,870
S2: first thing you need to do is discover everything you

541
00:34:56,870 --> 00:35:01,870
S2: have, all the agents, models. Um, usually a company will say,

542
00:35:01,870 --> 00:35:04,310
S2: we have a policy that says we're only using, you know,

543
00:35:04,350 --> 00:35:07,110
S2: Azure AI. And then we'll find many, like, Hugging

544
00:35:07,150 --> 00:35:12,839
S2: Face models, DeepSeek, and other service providers. And, and that's

545
00:35:12,840 --> 00:35:15,799
S2: the first stage. And once we help to do

546
00:35:15,800 --> 00:35:18,239
S2: the threat modeling, we see how the companies start like

547
00:35:18,280 --> 00:35:21,760
S2: thinking and getting more advanced. And the other type of

548
00:35:21,760 --> 00:35:25,239
S2: companies are the bit more sophisticated ones, where they

549
00:35:25,239 --> 00:35:29,040
S2: already have some like 2 to 3, let's say, or

550
00:35:29,080 --> 00:35:33,400
S2: a few like major AI driven products. They know about them.

551
00:35:33,440 --> 00:35:37,360
S2: They already, like, threat modeled them. So the discovery

552
00:35:37,360 --> 00:35:40,920
S2: will still help them a lot, but the more advanced

553
00:35:40,960 --> 00:35:46,680
S2: they know, they have some, let's say uh, tax uh application,

554
00:35:47,000 --> 00:35:51,239
S2: uh reviewer, automatic reviewer. Um, that like is very smart,

555
00:35:51,239 --> 00:35:54,000
S2: but it's very risky because it's a PDF and something

556
00:35:54,000 --> 00:35:58,120
S2: can happen. Usually they wouldn't know exactly what the risks.

557
00:35:58,120 --> 00:36:00,919
S2: They wouldn't know that what they need is, you know,

558
00:36:00,960 --> 00:36:05,640
S2: some, something that will simulate attacks of sending a PDF with

559
00:36:05,680 --> 00:36:09,299
S2: an image inside; the image will have some prompt injection.

560
00:36:10,780 --> 00:36:12,780
S2: But they will know; they will have heard about the OWASP

561
00:36:12,780 --> 00:36:16,299
S2: Top 10 for LLM Applications, which is, by the way, a great

562
00:36:16,300 --> 00:36:20,460
S2: list of threats and, you know, great awareness.

563
00:36:21,660 --> 00:36:22,660
S1: Yeah, that makes sense.

564
00:36:23,140 --> 00:36:25,540
S2: And then we like tell them, hey, you have here

565
00:36:25,580 --> 00:36:29,580
S2: a potentially malicious model. Check it. You have here DeepSeek.

566
00:36:29,860 --> 00:36:33,700
S2: And that's the findings from the dynamic scanning of the

567
00:36:33,739 --> 00:36:36,859
S2: you know, all the attack simulations. Okay. And then we

568
00:36:36,860 --> 00:36:39,620
S2: need to fine-tune how to do it: what's

569
00:36:39,620 --> 00:36:42,660
S2: the right attacks. What's the most important attacks to check.

570
00:36:43,580 --> 00:36:46,580
S1: Nice. Okay. And then what what is the product like.

571
00:36:46,580 --> 00:36:49,580
S1: What does the product suite do? Like, what are the

572
00:36:49,580 --> 00:36:50,980
S1: the pieces of functionality.

573
00:36:52,219 --> 00:36:56,420
S2: And so it's basically what we spoke about. And it

574
00:36:56,420 --> 00:37:00,260
S2: starts with discovery of all the components. Then it moves

575
00:37:00,260 --> 00:37:03,980
S2: to the risk of each component individually. And once you

576
00:37:03,980 --> 00:37:06,840
S2: know about all the components you need somehow to connect

577
00:37:06,840 --> 00:37:10,719
S2: them into a behavioral risk and to to understand, to

578
00:37:10,760 --> 00:37:15,040
S2: contextualize all the risk of the entire components together. And

579
00:37:15,040 --> 00:37:17,160
S2: because you can have a system prompt, you can have

580
00:37:17,200 --> 00:37:20,160
S2: a model. And you can have some, you know, RAG

581
00:37:20,600 --> 00:37:23,280
S2: that connects them to a database. But unless you understand

582
00:37:23,280 --> 00:37:26,040
S2: you have this database, this system prompt with this model,

583
00:37:26,320 --> 00:37:29,520
S2: that's the only way to understand that you have context leakage.

584
00:37:30,239 --> 00:37:33,680
S2: And so for that we have these behavioral risk rating

585
00:37:33,680 --> 00:37:39,080
S2: automatic red teaming, attack simulations, and on top of everything, we're doing

586
00:37:39,120 --> 00:37:43,720
S2: mitigation what we call governance and mitigations and where we

587
00:37:43,960 --> 00:37:46,560
S2: where you can create policies to prevent let's say if

588
00:37:46,560 --> 00:37:51,400
S2: it's RAG and DeepSeek, block it, this specific combination. And

589
00:37:51,440 --> 00:37:55,640
S2: in the near future we're going to release the ability

590
00:37:55,640 --> 00:38:00,240
S2: to what we call guiderails, which is not guardrails. It's guiderails.

591
00:38:00,560 --> 00:38:01,040
S1: Interesting.

592
00:38:01,040 --> 00:38:04,900
S2: The concept of putting, like, the mitigations into

593
00:38:04,900 --> 00:38:05,460
S2: the code.
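
The kind of governance policy described above (block one specific combination, such as RAG together with a particular model) could look roughly like this in code. Everything here is hypothetical: the `AppInventory` record, the `policy_violations` function, and the example model names are assumptions made for illustration, not the product's actual policy format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppInventory:
    """Hypothetical result of the discovery step for one application."""

    name: str
    models: frozenset
    uses_rag: bool


def policy_violations(app: AppInventory, blocked_with_rag: set) -> list:
    """Return the models that break a 'not together with RAG' policy.

    A model listed in blocked_with_rag is acceptable on its own; it is
    only the combination with retrieval over internal data that is flagged.
    """
    if not app.uses_rag:
        return []
    return sorted(model for model in app.models if model in blocked_with_rag)


blocked = {"deepseek"}
support_bot = AppInventory("support-bot", frozenset({"deepseek", "gpt-4o"}), True)
playground = AppInventory("playground", frozenset({"deepseek"}), False)

support_bot_violations = policy_violations(support_bot, blocked)
playground_violations = policy_violations(playground, blocked)
```

The check depends on the contextual view the speaker describes: neither the model nor the RAG connection is a finding alone, so the policy can only be evaluated once discovery has linked them to the same application.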

594
00:38:06,700 --> 00:38:07,299
S1: Not like.

595
00:38:07,300 --> 00:38:07,420
S2: A.

596
00:38:07,460 --> 00:38:12,420
S1: Firewall. Like dynamic. Dynamic detections as it's coming in.

597
00:38:13,180 --> 00:38:16,900
S2: No. So actually not dynamic. And the point is that

598
00:38:17,300 --> 00:38:20,819
S2: it's integrated into the development process. And the developers will get

599
00:38:20,820 --> 00:38:24,819
S2: suggestions for mitigations to the code. For example, add to

600
00:38:24,820 --> 00:38:28,500
S2: the system prompt this and that in order to block

601
00:38:28,540 --> 00:38:32,540
S2: these red teaming findings, in order to mitigate these findings.

602
00:38:32,820 --> 00:38:36,899
S2: For example, format your LLM output into a more,

603
00:38:36,940 --> 00:38:39,940
S2: like a less fuzzy format because you don't need a

604
00:38:39,940 --> 00:38:43,500
S2: fuzzy output. So why, uh, like why open your attack

605
00:38:43,500 --> 00:38:45,220
S2: surface and.

606
00:38:45,219 --> 00:38:45,580
S1: So many.

607
00:38:45,580 --> 00:38:49,660
S2: Small things that everyone has. Everyone got these issues and

608
00:38:49,660 --> 00:38:52,700
S2: if you have an agent, maybe the same agent should not access

609
00:38:52,700 --> 00:38:56,460
S2: the data and run code afterwards, because if someone will

610
00:38:56,660 --> 00:38:58,940
S2: like, prompt-inject the agent, they can run code that

611
00:38:58,940 --> 00:39:02,299
S2: access the data. And so you know these simple. There's

612
00:39:02,300 --> 00:39:04,350
S2: so many simple steps no one knows.
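
The "less fuzzy output format" mitigation mentioned above can be sketched as a strict parser placed between the model and the rest of the application. This is an assumed example, not a documented feature: the JSON shape, the `ALLOWED_DECISIONS` values, and `parse_decision` are hypothetical, loosely modeled on the automatic reviewer scenario from earlier in the conversation.

```python
import json

# Hypothetical closed set of answers the application accepts from the model.
ALLOWED_DECISIONS = {"approve", "reject", "needs_review"}


def parse_decision(raw_output: str) -> str:
    """Accept only {"decision": <allowed value>}; reject everything else."""
    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc

    # Exactly one key, so injected extra fields are refused.
    if not isinstance(payload, dict) or set(payload) != {"decision"}:
        raise ValueError("model output has unexpected structure")

    decision = payload["decision"]
    if not isinstance(decision, str) or decision not in ALLOWED_DECISIONS:
        raise ValueError("model output is outside the allowed decisions")
    return decision


def is_rejected(raw_output: str) -> bool:
    """Small helper for demonstration: True when the parser refuses input."""
    try:
        parse_decision(raw_output)
    except ValueError:
        return True
    return False
```

Free text from the model, including anything a prompt injection persuades it to say, never reaches downstream code. Only one of three fixed strings does, which narrows the attack surface in the way the speaker suggests.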

613
00:39:05,350 --> 00:39:09,710
S1: Yeah, that makes sense. Okay. So it's it's the initial

614
00:39:09,710 --> 00:39:14,750
S1: assessment of, like, what the attack surface is. Um, just

615
00:39:14,750 --> 00:39:19,109
S1: discovering the assets. Then there's the dynamic assessment, and then

616
00:39:19,110 --> 00:39:20,310
S1: there's mitigation.

617
00:39:21,030 --> 00:39:21,710
S2: Exactly.

618
00:39:22,710 --> 00:39:27,630
S1: Perfect. Um. All right. What do you think is happening next?

619
00:39:27,630 --> 00:39:31,870
S1: What are you worried about happening coming up? Uh, trends

620
00:39:31,870 --> 00:39:33,990
S1: or risks that you see coming up soon?

621
00:39:35,510 --> 00:39:39,670
S2: Um, I think we it seems like we have a

622
00:39:40,230 --> 00:39:44,229
S2: every new like it seems like every, every every old

623
00:39:44,230 --> 00:39:48,190
S2: category in security needs to find a way to adapt.

624
00:39:48,590 --> 00:39:51,550
S2: Because, like, speaking with customers, we see that

625
00:39:51,750 --> 00:39:56,390
S2: identities that is like, you know, kind of an existing, uh,

626
00:39:56,430 --> 00:40:00,750
S2: category with tons of issues now has a new aspect

627
00:40:00,790 --> 00:40:04,370
S2: in the AI driven applications because you want to manage,

628
00:40:04,770 --> 00:40:06,730
S2: you want to make sure one user can't access a

629
00:40:06,730 --> 00:40:11,290
S2: different user's data. And so, you know, it's an old problem

630
00:40:11,530 --> 00:40:15,890
S2: with a new suit, let's call it. And so.

631
00:40:15,930 --> 00:40:16,450
S1: Yeah.

632
00:40:16,489 --> 00:40:19,729
S2: We see so many, so many, so many new trends. And

633
00:40:19,730 --> 00:40:24,610
S2: exactly like you said before, so many new same problems

634
00:40:24,650 --> 00:40:28,810
S2: and new problems which are basically the same. So that's one

635
00:40:28,810 --> 00:40:31,810
S2: thing, I think, which we need to remember: that

636
00:40:31,969 --> 00:40:34,090
S2: all the new things are basically old.

637
00:40:35,489 --> 00:40:40,250
S1: Yeah. Yeah. I like you're talking about um, yeah. I

638
00:40:40,250 --> 00:40:43,529
S1: think the distinction between what an agent is doing and

639
00:40:43,530 --> 00:40:49,689
S1: making sure it's not acting on behalf of an actual human. Right. Uh, making. Yeah,

640
00:40:49,730 --> 00:40:54,810
S1: non-human identity versus human identities and making those distinctions, having

641
00:40:54,810 --> 00:40:58,489
S1: separate policies, separate policies for those. I think that's going

642
00:40:58,489 --> 00:40:59,290
S1: to be important.

643
00:41:00,190 --> 00:41:03,430
S2: Yes. And, you know, like the same concept. You want

644
00:41:03,430 --> 00:41:08,030
S2: multi-tenancy also for agents. You want permissions also for agents, and.

645
00:41:08,350 --> 00:41:11,549
S1: Separation of duties. Like all like you said, all the

646
00:41:11,550 --> 00:41:16,150
S1: old stuff we have to re-implement relearning the lessons from,

647
00:41:16,270 --> 00:41:17,550
S1: you know, 25 years.

648
00:41:18,790 --> 00:41:22,069
S2: Exactly. And but I think like if you, if you

649
00:41:22,070 --> 00:41:26,070
S2: look on the new things classes and you see that

650
00:41:26,469 --> 00:41:31,390
S2: multiple agents of course, but multiple agent frameworks, agents orchestration

651
00:41:31,910 --> 00:41:35,790
S2: create more issues, because it's way more complicated to

652
00:41:35,790 --> 00:41:41,830
S2: understand how it works. And the communication between agents opens

653
00:41:41,870 --> 00:41:47,590
S2: the, makes the exploit way more complicated. It's way more complicated,

654
00:41:47,590 --> 00:41:51,989
S2: but like way stronger. A very similar to to what

655
00:41:51,989 --> 00:41:56,350
S2: you have when you try to exploit memory corruptions and

656
00:41:56,390 --> 00:42:01,049
S2: that you try to jump between a different places in

657
00:42:01,050 --> 00:42:04,850
S2: the program in order to finally run the code. And

658
00:42:05,090 --> 00:42:08,210
S2: so it's very similar to here. And if the code

659
00:42:08,250 --> 00:42:11,969
S2: is if you have more attack surface, when you have

660
00:42:12,010 --> 00:42:15,090
S2: the code is larger and you have more places and

661
00:42:15,090 --> 00:42:17,770
S2: more gadgets that you can more places you can jump

662
00:42:17,770 --> 00:42:21,930
S2: between each other until you run the code. And here

663
00:42:21,930 --> 00:42:24,169
S2: you have many agents. So you can you need just

664
00:42:24,170 --> 00:42:28,489
S2: to find the agent, you know, to send the right

665
00:42:28,489 --> 00:42:31,049
S2: prompt into to run the code, I think.

666
00:42:31,050 --> 00:42:33,370
S1: I think it's worse, like you said, I think it's

667
00:42:33,410 --> 00:42:38,850
S1: worse with multi-agent because that's more opportunities for tricking. Like

668
00:42:38,850 --> 00:42:44,770
S1: you might have a smart one. Um, so a prompt

669
00:42:44,770 --> 00:42:47,489
S1: injection might not detonate with the first one you're talking to,

670
00:42:47,530 --> 00:42:50,370
S1: but it might pass it along to a dumber one

671
00:42:50,370 --> 00:42:55,570
S1: behind it. And that one it will detonate on. Yeah. Interesting. Um.

672
00:42:56,450 --> 00:42:58,629
S2: Also, if you think about it, when you have more

673
00:42:58,630 --> 00:43:02,550
S2: components that are speaking with each other, it's hard to track

674
00:43:02,590 --> 00:43:05,509
S2: What's the purpose of each one. So you're going to

675
00:43:05,510 --> 00:43:09,710
S2: have these, you know, uh, let's call it a small

676
00:43:09,710 --> 00:43:14,709
S2: drift in, in the communication between them. Uh, there's one

677
00:43:14,710 --> 00:43:16,989
S2: that should do X, one that should do Y, but

678
00:43:16,989 --> 00:43:21,670
S2: they will have somewhere like some very small interaction that

679
00:43:21,670 --> 00:43:23,350
S2: will probably be exploitable.

680
00:43:24,830 --> 00:43:26,350
S1: Yeah. Because for each one.

681
00:43:26,350 --> 00:43:29,230
S2: Individually is totally okay. It looks fine.

682
00:43:30,870 --> 00:43:35,070
S1: Yeah. So for the CISOs that are listening, like, what's

683
00:43:35,070 --> 00:43:38,069
S1: the one piece of advice like what's the one thing

684
00:43:38,070 --> 00:43:41,270
S1: you would say to someone who's implementing AI or trying

685
00:43:41,270 --> 00:43:42,070
S1: to secure it?

686
00:43:44,510 --> 00:43:49,190
S2: Um, I would say the, the easiest thing to do is,

687
00:43:49,630 --> 00:43:52,870
S2: at least from my experience, is to try to create

688
00:43:53,630 --> 00:44:00,040
S2: really hard enforcing policies that prevent everything and

689
00:44:00,040 --> 00:44:04,920
S2: block everything. And that's really easy. Um, but I think

690
00:44:05,360 --> 00:44:09,759
S2: the right way to go is to create integrated workflows

691
00:44:09,760 --> 00:44:16,080
S2: like any other security issues, and that will help to enable, um,

692
00:44:16,120 --> 00:44:19,040
S2: enable developers and enable the products to be really good

693
00:44:19,040 --> 00:44:23,239
S2: and valuable and AI driven, um, so more enable and

694
00:44:23,239 --> 00:44:26,880
S2: less disable. Um, and that's right for everything. And I

695
00:44:26,880 --> 00:44:30,480
S2: think it's especially for AI and I see it firsthand, uh,

696
00:44:30,760 --> 00:44:33,000
S2: maybe a bit too much, unfortunately.

697
00:44:34,200 --> 00:44:37,359
S1: Yeah, that makes sense. And where can people learn more

698
00:44:37,360 --> 00:44:38,200
S1: about the company?

699
00:44:39,800 --> 00:44:44,960
S2: Um, so Mend.io is an AppSec company. Um, traditional AppSec company.

700
00:44:44,960 --> 00:44:48,760
S2: One of the first SCA vendors, that grew into, you know, SAST,

701
00:44:48,760 --> 00:44:52,520
S2: SCA, container image scanning. Like I spoke before

702
00:44:52,760 --> 00:44:58,219
S2: about the ability technology that got acquired for and now

703
00:44:58,219 --> 00:45:02,060
S2: also AI security. And we love what we do. We

704
00:45:02,060 --> 00:45:06,379
S2: have a lot of customers. Um, and uh, we're I

705
00:45:06,460 --> 00:45:08,700
S2: think we're the first AI security vendor in the market

706
00:45:08,780 --> 00:45:11,620
S2: doing it shift left. And so we're really proud about it

707
00:45:11,660 --> 00:45:14,740
S2: and hoping to make awareness in the industry about it.

708
00:45:15,660 --> 00:45:19,340
S1: Awesome. Well, it was great chatting with you and, uh,

709
00:45:19,820 --> 00:45:21,220
S1: look forward to chatting again.

710
00:45:21,940 --> 00:45:23,780
S2: Lovely. It was my pleasure.

711
00:45:24,180 --> 00:45:25,180
S1: All right. Take care.

712
00:45:26,500 --> 00:45:27,100
S2: Thank you.

713
00:45:29,260 --> 00:45:32,820
S1: Unsupervised Learning is produced on Hindenburg Pro using an

714
00:45:32,820 --> 00:45:36,419
S1: SM7B microphone. A video version of the podcast is

715
00:45:36,420 --> 00:45:40,140
S1: available on the Unsupervised Learning YouTube channel, and the text

716
00:45:40,140 --> 00:45:43,420
S1: version with full links and notes is available at Daniel

717
00:45:43,420 --> 00:45:47,220
S1: Miessler newsletter. We'll see you next time.