1 00:00:14,040 --> 00:00:16,520 Speaker 1: Welcome to Tech Stuff. This is Tech Support. I'm Oz
2 00:00:16,560 --> 00:00:18,560 Speaker 1: Woloshyn and I'm here with Karah Preiss.
3 00:00:18,680 --> 00:00:19,840 Speaker 2: Hey Oz. Hey Karah.
4 00:00:20,480 --> 00:00:23,480 Speaker 1: So today we wanted to talk about this ChatGPT feature,
5 00:00:23,520 --> 00:00:26,960 Speaker 1: which is now defunct, but our friends at 404
6 00:00:26,960 --> 00:00:30,200 Speaker 1: Media had a story with the headline nearly one
7 00:00:30,280 --> 00:00:34,839 Speaker 1: hundred thousand ChatGPT conversations were searchable on Google. And as
8 00:00:34,880 --> 00:00:36,599 Speaker 1: soon as that email hit my inbox, before I'd
9 00:00:36,600 --> 00:00:38,559 Speaker 1: even read it, I forwarded it to you and to
10 00:00:38,600 --> 00:00:41,440 Speaker 1: our producer Eliza, and I said, let's jump on this.
11 00:00:41,760 --> 00:00:43,280 Speaker 3: Yeah. You know, part of it is that it taps
12 00:00:43,320 --> 00:00:45,400 Speaker 3: into this fear that we all have about our most
13 00:00:45,440 --> 00:00:48,199 Speaker 3: intimate thoughts being made public. This isn't like having a
14 00:00:48,200 --> 00:00:52,199 Speaker 3: private Instagram account. This is very much between us and
15 00:00:52,320 --> 00:00:55,440 Speaker 3: ChatGPT. It's a little bit like talking in our sleep.
16 00:00:55,880 --> 00:00:57,760 Speaker 3: And I think most people who have played around with
17 00:00:57,800 --> 00:01:00,800 Speaker 3: a chatbot have some questions or responses that they'd rather
18 00:01:00,840 --> 00:01:03,040 Speaker 3: the general public be blind to. I know I have
19 00:01:03,080 --> 00:01:03,760 Speaker 3: my fair share.
20 00:01:04,120 --> 00:01:04,360 Speaker 2: Yeah.
21 00:01:04,400 --> 00:01:07,280 Speaker 1: We did that piece recently with Kashmir Hill about AI
22 00:01:07,440 --> 00:01:11,120 Speaker 1: induced psychosis and the guy who'd fallen into the rabbit
23 00:01:11,200 --> 00:01:14,080 Speaker 1: hole by talking with ChatGPT about whether or not
24 00:01:14,080 --> 00:01:16,559 Speaker 1: he might be living in a simulation. So I started
25 00:01:16,600 --> 00:01:18,480 Speaker 1: talking with ChatGPT about this to see if I
26 00:01:18,520 --> 00:01:20,080 Speaker 1: would also be taken down the rabbit hole, and then
27 00:01:20,080 --> 00:01:21,440 Speaker 1: I was like, oh my god, I'm not sure if
28 00:01:21,440 --> 00:01:23,520 Speaker 1: I want this to be made public at a later date.
29 00:01:24,120 --> 00:01:27,200 Speaker 1: So yeah, OpenAI says they're now working with Google
30 00:01:27,280 --> 00:01:31,000 Speaker 1: to scrub these conversations off the web, but of course
31 00:01:31,160 --> 00:01:34,119 Speaker 1: some quick thinkers have already archived them.
32 00:01:34,400 --> 00:01:35,960 Speaker 2: And I can't help but be rather
33 00:01:35,880 --> 00:01:38,360 Speaker 1: curious about what it is that people are talking to
34 00:01:38,440 --> 00:01:39,680 Speaker 1: ChatGPT about.
35 00:01:40,000 --> 00:01:42,399 Speaker 3: I mean, obviously, we do have a segment at the
36 00:01:42,480 --> 00:01:45,320 Speaker 3: end of every Friday episode called Chat and Me about
37 00:01:45,319 --> 00:01:48,520 Speaker 3: how our listeners are really using their chatbots, and now
38 00:01:49,520 --> 00:01:52,440 Speaker 3: we have hundreds of thousands of additional responses to explore.
39 00:01:52,560 --> 00:01:54,920 Speaker 1: Of course, there's a difference between how our listeners tell
40 00:01:55,000 --> 00:01:59,440 Speaker 1: us they're using chatbots and the reality which is apparent from
41 00:01:59,440 --> 00:02:02,440 Speaker 1: these logs, and one researcher actually created a data
42 00:02:02,520 --> 00:02:05,080 Speaker 1: set of all the responses that were indexed by Google,
43 00:02:05,480 --> 00:02:07,480 Speaker 1: and again our friends at 404 Media were
44 00:02:07,480 --> 00:02:10,000 Speaker 1: able to take a look. Here to tell us about
45 00:02:10,000 --> 00:02:12,240 Speaker 1: what everyone's asking ChatGPT is
46 00:02:12,280 --> 00:02:14,120 Speaker 2: 404 Media's Joseph Cox.
47 00:02:13,800 --> 00:02:16,280 Speaker 3: Joseph, welcome back to Tech Stuff.
48 00:02:16,560 --> 00:02:17,600 Speaker 4: Hi, thank you for having me.
49 00:02:17,919 --> 00:02:18,239 Speaker 2: Joseph.
50 00:02:18,320 --> 00:02:20,920 Speaker 1: Let's start at the beginning. How is it that one
51 00:02:21,000 --> 00:02:25,200 Speaker 1: hundred thousand ChatGPT conversations ended up on Google Search?
52 00:02:25,240 --> 00:02:27,080 Speaker 1: I thought that these conversations were private.
53 00:02:27,480 --> 00:02:31,120 Speaker 4: Yeah. So this starts with an article on Fast Company
54 00:02:31,320 --> 00:02:36,760 Speaker 4: on July thirtieth, and that outlet found that ChatGPT
55 00:02:36,880 --> 00:02:41,440 Speaker 4: conversations were being indexed by Google. That is, as your
56 00:02:41,440 --> 00:02:44,639 Speaker 4: listeners will know, Google is constantly going around the web
57 00:02:44,960 --> 00:02:48,639 Speaker 4: and essentially grabbing content from websites. Of course, it can
58 00:02:48,760 --> 00:02:52,240 Speaker 4: use it to make its search engine. What was different
59 00:02:52,320 --> 00:02:56,560 Speaker 4: here was that while ordinarily, when you're talking to ChatGPT,
60 00:02:56,840 --> 00:03:01,480 Speaker 4: thankfully all of the content of that conversation is private,
61 00:03:01,880 --> 00:03:04,600 Speaker 4: in this case, what some people have been doing was
62 00:03:04,680 --> 00:03:07,480 Speaker 4: using, I think, a little-known feature where they could
63 00:03:07,520 --> 00:03:11,399 Speaker 4: share the contents of that communication. Now, maybe you want
64 00:03:11,400 --> 00:03:13,680 Speaker 4: to do that because you want to show your friend, wow,
65 00:03:13,720 --> 00:03:16,920 Speaker 4: look at this really wacky, crazy thing that ChatGPT
66 00:03:17,080 --> 00:03:19,760 Speaker 4: told me. Or maybe there's a business need, right, like, hey,
67 00:03:19,840 --> 00:03:22,360 Speaker 4: I've done this with ChatGPT, now I need to
68 00:03:22,360 --> 00:03:25,320 Speaker 4: show other people in my team. And you would select
69 00:03:25,440 --> 00:03:30,480 Speaker 4: the share feature and this would create a public, essentially
70 00:03:30,520 --> 00:03:35,040 Speaker 4: a public web page version of that chat, and although
71 00:03:35,080 --> 00:03:36,960 Speaker 4: you can then send that to your friends or your
72 00:03:36,960 --> 00:03:40,600 Speaker 4: co-workers, it can also be seen by Google obviously,
73 00:03:41,040 --> 00:03:44,080 Speaker 4: and OpenAI probably could have done some stuff to protect
74 00:03:44,120 --> 00:03:46,680 Speaker 4: it there.
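(The protection Joseph alludes to is, in standard web practice, a robots "noindex" directive: a shared page can carry a robots meta tag in its HTML, or an X-Robots-Tag HTTP response header, asking search engines not to list it even though anyone with the link can still open it. The sketch below, using only Python's standard library, is a hypothetical illustration of that general mechanism, not OpenAI's actual implementation; the host, port, and page contents are made up.)

# Hypothetical sketch: serve a "shared chat" page while asking crawlers not to index it.
from http.server import BaseHTTPRequestHandler, HTTPServer

SHARED_CHAT_HTML = b"""<!doctype html>
<html>
  <head>
    <!-- Mechanism 1: a page-level robots directive inside the HTML itself -->
    <meta name="robots" content="noindex, nofollow">
    <title>Shared conversation</title>
  </head>
  <body>The shared transcript would be rendered here.</body>
</html>"""

class SharedChatHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        # Mechanism 2: a response header that major search engines honor
        self.send_header("X-Robots-Tag", "noindex, nofollow")
        self.end_headers()
        self.wfile.write(SHARED_CHAT_HTML)

if __name__ == "__main__":
    # Placeholder host and port; the page stays reachable by anyone with the
    # link, it just asks search engines not to list it in results.
    HTTPServer(("0.0.0.0", 8000), SharedChatHandler).serve_forever()

(Pages that carry neither signal are treated as indexable by crawlers, which is how shared conversation pages can end up in search results.)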
But the result is that a bunch of
75 00:03:46,680 --> 00:03:51,000 Speaker 4: these conversations are now publicly available and indexed by Google,
76 00:03:51,400 --> 00:03:54,760 Speaker 4: and I seriously doubt that all of the people using
77 00:03:54,840 --> 00:03:59,240 Speaker 4: this share feature really understood what they were getting into.
78 00:03:59,440 --> 00:04:03,440 Speaker 1: Yeah, can you elaborate on that, because I'm thinking about WhatsApp,
79 00:04:03,520 --> 00:04:06,120 Speaker 1: for example, where there's like a forward button, or like
80 00:04:06,200 --> 00:04:11,200 Speaker 1: on X, I can do like a share link to tweet.
81 00:04:11,720 --> 00:04:14,720 Speaker 1: Is this like, somebody thinks they're pressing a button
82 00:04:14,800 --> 00:04:18,440 Speaker 1: to share an individual version of the transcript with another person,
83 00:04:18,880 --> 00:04:21,120 Speaker 1: but in so doing is kind of making their whole
84 00:04:21,480 --> 00:04:24,520 Speaker 1: ChatGPT history visible to Google? Or what's the practical
85 00:04:25,200 --> 00:04:26,720 Speaker 1: explanation of how this happened?
86 00:04:27,040 --> 00:04:32,400 Speaker 4: Yeah, the users are making that particular conversation publicly available,
87 00:04:32,920 --> 00:04:35,359 Speaker 4: and it works in a very similar way to the
88 00:04:35,360 --> 00:04:38,799 Speaker 4: things you just outlined. I sometimes compare it a little
89 00:04:38,839 --> 00:04:42,400 Speaker 4: bit to a Google Doc link where you will go
90 00:04:42,440 --> 00:04:44,800 Speaker 4: and you'll make that public and there's that setting you
91 00:04:44,839 --> 00:04:48,159 Speaker 4: can do that says, hey, anybody with this link is
92 00:04:48,200 --> 00:04:51,600 Speaker 4: going to be able to read your awful article draft.
93 00:04:51,640 --> 00:04:53,359 Speaker 4: I mean that would be my case or whatever, or
94 00:04:53,400 --> 00:04:56,560 Speaker 4: your private thoughts or whatever. But as long as you don't then go
95 00:04:56,680 --> 00:05:00,840 Speaker 4: and paste that link online, Google takes steps so
96 00:05:00,880 --> 00:05:04,440 Speaker 4: that it's not included in search engine results. Of course, if
97 00:05:04,480 --> 00:05:06,080 Speaker 4: you want to post it on a forum or you
98 00:05:06,120 --> 00:05:08,400 Speaker 4: post it on Twitter, that's going to be something else.
99 00:05:08,440 --> 00:05:11,520 Speaker 4: But that's usually how I think most people expect this
100 00:05:11,720 --> 00:05:14,880 Speaker 4: sort of sharing behavior to work. They expect that, well,
101 00:05:14,920 --> 00:05:16,719 Speaker 4: I'm going to just share it with one or two
102 00:05:16,800 --> 00:05:20,040 Speaker 4: people or, you know, a dozen or whatever. They don't
103 00:05:20,120 --> 00:05:24,240 Speaker 4: expect typically that it's going to be available to anyone
104 00:05:24,640 --> 00:05:26,960 Speaker 4: on the Internet who knows where to look, or of
105 00:05:27,000 --> 00:05:31,400 Speaker 4: course anyone with Google now, because Google has archived it
106 00:05:31,480 --> 00:05:34,480 Speaker 4: as well. It's sort of a big mix. The
107 00:05:34,600 --> 00:05:38,159 Speaker 4: user is partly at fault for perhaps not fully understanding
108 00:05:38,160 --> 00:05:40,640 Speaker 4: what is going on.
Of course, OpenAI maybe not
109 00:05:40,640 --> 00:05:43,560 Speaker 4: fully explaining what is going on, and not taking steps
110 00:05:43,600 --> 00:05:46,640 Speaker 4: to stop Google indexing, and then of course Google indexing
111 00:05:46,680 --> 00:05:49,840 Speaker 4: it as well. There's a lot of, maybe blame is
112 00:05:49,880 --> 00:05:51,839 Speaker 4: too strong a word, there's a lot of blame to go around,
113 00:05:51,880 --> 00:05:52,839 Speaker 4: I think, to all parties.
114 00:05:53,600 --> 00:05:55,640 Speaker 2: So this is one hundred thousand conversations.
115 00:05:55,680 --> 00:05:59,920 Speaker 1: Do we know how many users those hundred thousand conversations represent?
116 00:06:00,120 --> 00:06:02,400 Speaker 1: And also, you know, what are some of the things
117 00:06:02,520 --> 00:06:03,560 Speaker 1: in those conversations?
118 00:06:03,680 --> 00:06:05,880 Speaker 4: Yeah, I don't think I've seen figures that drill down
119 00:06:05,960 --> 00:06:08,599 Speaker 4: to how many users, but you're right, it's nearly one
120 00:06:08,680 --> 00:06:14,240 Speaker 4: hundred thousand conversations in this data set the researcher scraped
121 00:06:14,279 --> 00:06:17,680 Speaker 4: from Google. I mean, before this, some researchers were going
122 00:06:17,680 --> 00:06:21,240 Speaker 4: through hundreds of conversations and that was already bad enough,
123 00:06:21,240 --> 00:06:24,919 Speaker 4: and of course newsworthy. What this researcher did was scrape
124 00:06:24,960 --> 00:06:27,320 Speaker 4: them en masse and put them into a data set. And
125 00:06:27,360 --> 00:06:29,880 Speaker 4: I'm actually looking at it now and there's a lot
126 00:06:29,920 --> 00:06:32,359 Speaker 4: of benign stuff in here. It looks like somebody is
127 00:06:32,400 --> 00:06:36,359 Speaker 4: making their first iPhone app and they're using ChatGPT
128 00:06:36,560 --> 00:06:40,560 Speaker 4: for that. There are others where people are clearly discussing
129 00:06:41,080 --> 00:06:45,000 Speaker 4: sensitive business materials, such as, could you help me write
130 00:06:45,000 --> 00:06:48,760 Speaker 4: this contract? There is potentially, you know, some bank information
131 00:06:49,320 --> 00:06:51,839 Speaker 4: in here. I say potentially because it sure looks like
132 00:06:51,880 --> 00:06:55,680 Speaker 4: bank information. And then you have, I mean, you mentioned
133 00:06:55,920 --> 00:07:00,760 Speaker 4: at the top, these sorts of delusional conversations that some
134 00:07:00,800 --> 00:07:04,280 Speaker 4: people have with ChatGPT, and I'm sure there is some
135 00:07:04,360 --> 00:07:07,159 Speaker 4: of that in here. I have seen some people talking
136 00:07:07,160 --> 00:07:12,240 Speaker 4: about therapy. I have seen some people talking about relationship issues,
137 00:07:12,280 --> 00:07:15,160 Speaker 4: such as one, it seems to be a man talking
138 00:07:15,200 --> 00:07:18,080 Speaker 4: about his ex-girlfriend and wondering why she's not looking
139 00:07:18,160 --> 00:07:22,520 Speaker 4: at his Instagram stories, that sort of thing, which I
140 00:07:22,520 --> 00:07:23,440 Speaker 4: don't know if I would term.
141 00:07:23,480 --> 00:07:24,720 Speaker 2: She's just not that into you.
142 00:07:26,080 --> 00:07:28,680 Speaker 4: That means yes, I think ChatGPT was trying to
143 00:07:28,720 --> 00:07:33,760 Speaker 4: say that, basically. So this is only what people have
144 00:07:33,840 --> 00:07:38,120 Speaker 4: decided to share, which is a very interesting caveat to
145 00:07:38,760 --> 00:07:39,280 Speaker 4: the data.
146 00:07:39,440 --> 00:07:40,920 Speaker 1: They don't want to share it with the world, but
147 00:07:40,960 --> 00:07:43,480 Speaker 1: they've chosen at least one other person to share it with,
148 00:07:43,560 --> 00:07:47,360 Speaker 1: so therefore, by definition, it's not their most private use case.
149 00:07:47,600 --> 00:07:51,800 Speaker 4: Yes, and maybe the researcher or others will be able
150 00:07:51,840 --> 00:07:55,240 Speaker 4: to do some sort of deeper analysis on this than me.
151 00:07:55,640 --> 00:07:57,840 Speaker 4: But that's interesting in that, what are the sorts of
152 00:07:57,880 --> 00:08:00,720 Speaker 4: things that people are willing to share with another person?
153 00:08:00,880 --> 00:08:02,760 Speaker 4: And of course, you know, what does that tell us
154 00:08:02,760 --> 00:08:05,480 Speaker 4: about the things they're not sharing? That being said, I
155 00:08:05,480 --> 00:08:07,760 Speaker 4: don't think anybody wants a security issue where we're actually
156 00:08:07,760 --> 00:08:09,560 Speaker 4: able to see all of that private data either.
157 00:08:10,200 --> 00:08:12,360 Speaker 3: So this was something that was reported out a few
158 00:08:12,400 --> 00:08:15,239 Speaker 3: weeks ago, as you said. Has there been any change,
159 00:08:15,440 --> 00:08:19,920 Speaker 3: and how did OpenAI respond to the exclusive?
160 00:08:19,720 --> 00:08:23,680 Speaker 4: So OpenAI has now disabled this, like, opt-in
161 00:08:24,160 --> 00:08:27,440 Speaker 4: sharing feature, because the company actually said they don't think
162 00:08:27,480 --> 00:08:30,840 Speaker 4: people fully understood what was going on. And then the
163 00:08:30,880 --> 00:08:33,960 Speaker 4: company also says it is working with Google to remove
164 00:08:34,520 --> 00:08:37,839 Speaker 4: some of those indexed results. Because of course there's a
165 00:08:37,880 --> 00:08:40,120 Speaker 4: few things going on here. There's the exposure in the
166 00:08:40,160 --> 00:08:43,520 Speaker 4: first place, there's the sharing, there's the indexing by Google.
167 00:08:43,760 --> 00:08:48,240 Speaker 4: But even if Google does remove these search results, these
168 00:08:48,520 --> 00:08:52,600 Speaker 4: chats have been archived by this researcher, and I presume
169 00:08:52,760 --> 00:08:55,800 Speaker 4: others as well. Like, I seriously doubt there's only one
170 00:08:55,880 --> 00:08:58,680 Speaker 4: or two people who grabbed all of this data. It's
171 00:08:58,960 --> 00:09:02,800 Speaker 4: very much an interesting privacy issue that I think researchers
172 00:09:02,800 --> 00:09:04,160 Speaker 4: want to look into and learn from.
173 00:09:04,440 --> 00:09:07,520 Speaker 3: I don't understand why OpenAI seemed to think that
174 00:09:07,559 --> 00:09:09,560 Speaker 3: this tool would be useful. Like, have you given that
175 00:09:09,600 --> 00:09:10,080 Speaker 3: any thought?
176 00:09:10,600 --> 00:09:14,520 Speaker 4: Yeah, I think that people do want to sometimes share
177 00:09:15,160 --> 00:09:21,319 Speaker 4: the interesting or crazy or insightful stuff they get from ChatGPT.
Now,
178 00:09:21,720 --> 00:09:25,679 Speaker 4: OpenAI probably should have taken steps to ensure that
179 00:09:25,720 --> 00:09:29,920 Speaker 4: people can share this in a much more private manner,
180 00:09:30,200 --> 00:09:33,679 Speaker 4: maybe something like you have to add a particular Chat
181 00:09:33,800 --> 00:09:36,520 Speaker 4: GPT user to the conversation, then they can see it,
182 00:09:36,559 --> 00:09:38,880 Speaker 4: in the same way you add somebody to a Google Doc,
183 00:09:39,000 --> 00:09:42,119 Speaker 4: for example. That would be a little bit more laborious,
184 00:09:42,160 --> 00:09:44,880 Speaker 4: there'd be a bit more friction there. But I'm just
185 00:09:45,000 --> 00:09:49,280 Speaker 4: interested in why OpenAI did not take more steps
186 00:09:49,320 --> 00:09:52,640 Speaker 4: to protect this from being scraped by Google. It is
187 00:09:52,840 --> 00:09:57,480 Speaker 4: possible to share material online without it being touched by
188 00:09:57,520 --> 00:10:00,240 Speaker 4: search engines. You can ask search engines, hey, if you
189 00:10:00,240 --> 00:10:03,800 Speaker 4: come across this, please do not index it. I'm curious
190 00:10:03,840 --> 00:10:06,840 Speaker 4: why OpenAI did not take those steps, and I don't
191 00:10:06,880 --> 00:10:10,240 Speaker 4: have any insight either way. But the result is that
192 00:10:10,280 --> 00:10:12,800 Speaker 4: all of these chats have now been indexed on Google,
193 00:10:12,840 --> 00:10:14,160 Speaker 4: and I think that's pretty significant.
194 00:10:14,440 --> 00:10:15,720 Speaker 2: What do you think might happen next?
195 00:10:15,960 --> 00:10:19,560 Speaker 4: What happens next is that I think other companies are
196 00:10:19,640 --> 00:10:24,880 Speaker 4: going to start checking whether they also have similar issues
197 00:10:25,440 --> 00:10:27,000 Speaker 4: like this. And I do want to stress, like, this
198 00:10:27,040 --> 00:10:30,559 Speaker 4: is not the vast majority of ChatGPT conversations or
199 00:10:30,559 --> 00:10:33,880 Speaker 4: anything like that. ChatGPT was not hacked, it wasn't breached.
200 00:10:33,920 --> 00:10:38,240 Speaker 4: There was a somewhat niche security issue, but because these
201 00:10:38,280 --> 00:10:42,640 Speaker 4: tools are becoming so, so popular now, even a relatively
202 00:10:42,760 --> 00:10:45,640 Speaker 4: niche issue can actually impact a ton of people.
203 00:10:51,960 --> 00:10:56,560 Speaker 3: After the break: so how secure are AI chatbots? Stay
204 00:10:56,640 --> 00:10:56,959 Speaker 3: with us.
205 00:11:11,720 --> 00:11:16,560 Speaker 1: It's interesting, because Sam Altman was recently on Theo Von's
206 00:11:16,640 --> 00:11:20,560 Speaker 1: podcast and he was sort of pointing out some of
207 00:11:20,600 --> 00:11:25,080 Speaker 1: the risks, to my surprise, about the privacy issues in
208 00:11:25,640 --> 00:11:29,280 Speaker 1: ChatGPT. He was saying, like, therapist conversations are protected
209 00:11:29,280 --> 00:11:34,040 Speaker 1: by HIPAA, lawyer conversations are protected by attorney-client privilege,
210 00:11:34,040 --> 00:11:37,360 Speaker 1: and people assume that when they're talking with ChatGPT that
211 00:11:37,520 --> 00:11:40,839 Speaker 1: maybe some of these protections apply, whereas in fact they don't.
212 00:11:41,120 --> 00:11:43,720 Speaker 1: And I was kind of wondering why he, of all people,
213 00:11:44,040 --> 00:11:46,560 Speaker 1: was out there on this topic. I did read some
214 00:11:46,600 --> 00:11:48,880 Speaker 1: other reporting saying that it may be part of the
215 00:11:49,400 --> 00:11:51,640 Speaker 1: lawsuit with the New York Times. The New York Times,
216 00:11:51,679 --> 00:11:55,480 Speaker 1: as part of their discovery in the lawsuit against Open
217 00:11:55,520 --> 00:11:58,400 Speaker 1: AI for copyright infringement, is demanding, I think, one hundred
218 00:11:58,480 --> 00:12:03,000 Speaker 1: million OpenAI conversations for analysis. But I was
219 00:12:03,040 --> 00:12:06,120 Speaker 1: surprised to hear Altman out there on this. Nonetheless, can
220 00:12:06,120 --> 00:12:08,400 Speaker 1: you kind of take a step back and maybe reflect
221 00:12:08,440 --> 00:12:12,239 Speaker 1: on this story about the breach in the broader context
222 00:12:12,559 --> 00:12:18,400 Speaker 1: of how people are using chatbots and what chatbot makers
223 00:12:18,600 --> 00:12:21,920 Speaker 1: are incentivized to do or not do to protect their users?
224 00:12:22,360 --> 00:12:25,319 Speaker 4: Yeah, so I haven't seen those comments. But to zoom
225 00:12:25,360 --> 00:12:29,240 Speaker 4: out a little bit, Altman and other people in the space,
226 00:12:29,880 --> 00:12:34,160 Speaker 4: they enjoy kind of having their cake and eating it too,
227 00:12:34,240 --> 00:12:37,480 Speaker 4: where on one side they will warn about the dangers
228 00:12:37,480 --> 00:12:40,640 Speaker 4: of AI. They'll say it needs to be regulated, it
229 00:12:40,640 --> 00:12:43,600 Speaker 4: needs to be taken really very seriously, and also it
230 00:12:43,679 --> 00:12:45,679 Speaker 4: is coming and there's nothing we can do about it,
231 00:12:45,920 --> 00:12:48,600 Speaker 4: while also building those tools at the same time and
232 00:12:48,640 --> 00:12:50,880 Speaker 4: making a lot of money from it. They actually benefit
233 00:12:50,920 --> 00:12:53,600 Speaker 4: from being on both sides of the conversation at the
234 00:12:53,640 --> 00:12:58,000 Speaker 4: same time, and Altman and others very easily switch between
235 00:12:58,000 --> 00:13:01,560 Speaker 4: those positions depending on the context in which they're talking.
236 00:13:01,600 --> 00:13:05,079 Speaker 4: So of course, you know, an AI developer can say
237 00:13:05,480 --> 00:13:08,800 Speaker 4: very, very sensitive stuff is going on here and people
238 00:13:08,880 --> 00:13:10,680 Speaker 4: need to be careful, and then on the other side
239 00:13:10,679 --> 00:13:13,840 Speaker 4: they'll say, well, our technology is absolutely suitable for that
240 00:13:13,880 --> 00:13:16,960 Speaker 4: because we take privacy very seriously, or whatever. I've just
241 00:13:17,040 --> 00:13:19,920 Speaker 4: kind of got a little bit jaded by all of
242 00:13:19,960 --> 00:13:22,880 Speaker 4: these companies playing both sides at the same time, and
243 00:13:22,920 --> 00:13:27,760 Speaker 4: that's why I think you need outside journalists, outside experts, policymakers,
244 00:13:28,240 --> 00:13:31,319 Speaker 4: activists who can probe it a little bit more, because
245 00:13:31,360 --> 00:13:34,560 Speaker 4: every time I hear Altman or someone similar make these
246 00:13:34,600 --> 00:13:37,240 Speaker 4: points about their own technology, I have to remember, yeah,
247 00:13:37,280 --> 00:13:37,920 Speaker 4: but they're making it.
248 00:13:38,120 --> 00:13:38,840 Speaker 2: Yeah.
249 00:13:39,120 --> 00:13:42,199 Speaker 3: OpenAI is apparently trying to remove the shared content
250 00:13:42,240 --> 00:13:45,760 Speaker 3: from search engines, but smart people like this researcher accessed
251 00:13:45,760 --> 00:13:48,520 Speaker 3: and stored it while it was live. While they're using
252 00:13:48,559 --> 00:13:51,000 Speaker 3: it for an altruistic purpose, I'm wondering if you think
253 00:13:51,040 --> 00:13:54,920 Speaker 3: people should be concerned, like, what if they do end
254 00:13:55,000 --> 00:13:55,880 Speaker 3: up in the wrong hands?
255 00:13:56,160 --> 00:13:59,440 Speaker 4: I don't think people need to necessarily be concerned about
256 00:13:59,520 --> 00:14:02,880 Speaker 4: this specific breach. I mean, that being said, maybe there's
257 00:14:03,000 --> 00:14:05,680 Speaker 4: something really, really bad in there and I simply haven't
258 00:14:05,800 --> 00:14:08,080 Speaker 4: seen it, and the researcher and others are going to
259 00:14:08,120 --> 00:14:12,080 Speaker 4: continue to dig through it. But people should absolutely be
260 00:14:12,200 --> 00:14:15,800 Speaker 4: careful with how they are using chatbots. I mean, maybe
261 00:14:15,800 --> 00:14:18,480 Speaker 4: they used this now-disabled feature and maybe they're going
262 00:14:18,520 --> 00:14:21,000 Speaker 4: to be concerned about that. But putting that aside, you
263 00:14:21,280 --> 00:14:25,400 Speaker 4: have to remember every single command, every single prompt, every
264 00:14:25,400 --> 00:14:29,200 Speaker 4: single sentence that you put into ChatGPT or any
265 00:14:29,240 --> 00:14:33,000 Speaker 4: of these other ones, it is going somewhere. It's not
266 00:14:33,560 --> 00:14:36,720 Speaker 4: just sat on your computer. It's not being locally processed.
267 00:14:36,880 --> 00:14:40,200 Speaker 4: It's going off to their systems, and ultimately you don't
268 00:14:40,280 --> 00:14:43,480 Speaker 4: really know what it's being used for. That is, maybe
269 00:14:43,480 --> 00:14:47,360 Speaker 4: it's used in retraining and improving the training of the system itself,
270 00:14:47,560 --> 00:14:51,960 Speaker 4: or whether there's some sort of quirk in its security
271 00:14:52,040 --> 00:14:54,640 Speaker 4: or privacy or sharing settings that ends up with it
272 00:14:54,720 --> 00:14:58,280 Speaker 4: now being publicly available. And I know that I'm a
273 00:14:58,320 --> 00:15:00,600 Speaker 4: little bit more extreme than others, but I would never
274 00:15:01,040 --> 00:15:04,640 Speaker 4: put sensitive information into one of these things. And I
275 00:15:04,720 --> 00:15:08,920 Speaker 4: know that plenty of companies are having to implement policies
276 00:15:08,960 --> 00:15:12,600 Speaker 4: where they tell employees, please do not put confidential information
277 00:15:13,000 --> 00:15:16,240 Speaker 4: into a chatbot that we don't own. I think people
278 00:15:16,320 --> 00:15:20,160 Speaker 4: just have to be really, really cognizant of that. In
279 00:15:20,200 --> 00:15:22,920 Speaker 4: the same way that when we all first got smartphones,
280 00:15:22,960 --> 00:15:25,800 Speaker 4: we had to learn, oh, right, it's tracking my location
281 00:15:25,960 --> 00:15:28,320 Speaker 4: data if I turn location data on.
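(To make that concrete, here is a rough, hypothetical sketch of what any hosted chatbot client does when you hit send: the text is packaged into an HTTPS request and processed on the provider's servers rather than on your device. The request shape below follows OpenAI's public chat completions API purely as an illustration; the model name, prompt, and environment variable are placeholders, and this is not a description of the ChatGPT app's internals.)

# Hypothetical illustration: the prompt leaves your machine in a network call.
import os
import requests

API_KEY = os.environ["OPENAI_API_KEY"]  # read from the environment; never hard-code credentials

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o-mini",  # placeholder model name
        "messages": [
            # Everything in this list is transmitted to, and processed on,
            # the provider's infrastructure; none of it is handled locally.
            {"role": "user", "content": "Help me draft this confidential contract..."},
        ],
    },
    timeout=30,
)
print(response.json()["choices"][0]["message"]["content"])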
I think we
282 00:15:28,360 --> 00:15:30,840 Speaker 4: need to remember and to learn, oh, when I put
283 00:15:30,840 --> 00:15:34,200 Speaker 4: this thing into ChatGPT, I don't know exactly where
284 00:15:34,200 --> 00:15:37,400 Speaker 4: it's going, and it could potentially bite me later if
285 00:15:37,400 --> 00:15:38,080 Speaker 4: I'm not careful.
286 00:15:38,360 --> 00:15:40,000 Speaker 2: Yeah, and I think it's an important point,
287 00:15:40,040 --> 00:15:42,320 Speaker 1: just as we think about the stakes of the, you know,
288 00:15:42,360 --> 00:15:46,000 Speaker 1: OpenAI or ChatGPT logs being indexed and available on
289 00:15:46,040 --> 00:15:50,120 Speaker 1: Google. Because, like, information that, you know, you share with
290 00:15:50,200 --> 00:15:53,359 Speaker 1: a chatbot that you may think is more or less harmless
291 00:15:53,840 --> 00:15:58,480 Speaker 1: could have, you know, identifying information or sensitive personal information
292 00:15:58,560 --> 00:16:01,640 Speaker 1: about addresses or accounts or whatever it may be.
293 00:16:01,800 --> 00:16:05,000 Speaker 2: And so I think there's this kind of almost
294 00:16:04,640 --> 00:16:09,240 Speaker 1: willful ignorance which many of us, including me, persist with
295 00:16:09,400 --> 00:16:13,479 Speaker 1: despite knowing better, in terms of how important proper security
296 00:16:13,520 --> 00:16:17,480 Speaker 1: practices around digital information are. And as you say, like,
297 00:16:17,800 --> 00:16:21,120 Speaker 1: we're all of a sudden standing on the doorstep of
298 00:16:21,200 --> 00:16:22,920 Speaker 1: a much more scary reality.
299 00:16:23,280 --> 00:16:26,720 Speaker 4: Yeah, I would say that with security you really have
300 00:16:26,800 --> 00:16:30,280 Speaker 4: to be proactive rather than reactive. After something has happened,
301 00:16:30,520 --> 00:16:34,480 Speaker 4: you know, your bank account got broken into or anything
302 00:16:34,560 --> 00:16:37,040 Speaker 4: like that, sure, you can deal with it, but it's
303 00:16:37,080 --> 00:16:39,280 Speaker 4: going to be annoying, it's going to be hard, it's
304 00:16:39,320 --> 00:16:41,520 Speaker 4: going to be tricky, and maybe some people steal some
305 00:16:41,560 --> 00:16:44,120 Speaker 4: money from you, maybe somebody hacks into your company or
306 00:16:44,160 --> 00:16:48,920 Speaker 4: something like that. You really should do security proactively if
307 00:16:48,920 --> 00:16:51,160 Speaker 4: you can. And that's really a thing that applies to everybody,
308 00:16:51,160 --> 00:16:53,760 Speaker 4: which isn't to say that it should be on users
309 00:16:53,800 --> 00:16:56,080 Speaker 4: all of the time. It really is up to the
310 00:16:56,080 --> 00:16:59,400 Speaker 4: people who make these products, such as ChatGPT by
311 00:16:59,440 --> 00:17:02,960 Speaker 4: OpenAI or whatever else, for them to put in
312 00:17:03,000 --> 00:17:06,520 Speaker 4: these guardrails so people can't make these mistakes in the
313 00:17:06,560 --> 00:17:07,160 Speaker 4: first place.
314 00:17:07,680 --> 00:17:09,159 Speaker 3: You were lucky enough to get a hold of this
315 00:17:09,240 --> 00:17:11,639 Speaker 3: data set by this researcher. Do you know what the
316 00:17:11,640 --> 00:17:14,120 Speaker 3: researcher is planning to do with the information?
317 00:17:14,119 --> 00:17:20,000 Speaker 4: Not specifically, beyond analyzing it for trends,
I believe, seeing
318 00:17:20,040 --> 00:17:25,320 Speaker 4: what is in there. Absolutely no criminal activity or anything
319 00:17:25,400 --> 00:17:28,240 Speaker 4: like that. But again, that's not to say that other
320 00:17:28,280 --> 00:17:30,879 Speaker 4: people may not be doing that as well. I can
321 00:17:30,960 --> 00:17:34,240 Speaker 4: imagine a situation in which, let's say, and this is a hypothetical,
322 00:17:34,359 --> 00:17:36,520 Speaker 4: but I'm sure I can find something that would reflect
323 00:17:36,520 --> 00:17:38,879 Speaker 4: this in some sort of data set. There, say you
324 00:17:38,920 --> 00:17:42,640 Speaker 4: were using ChatGPT or something similar to make a quick
325 00:17:42,760 --> 00:17:46,360 Speaker 4: prototype app for your company. In that, you include your
326 00:17:46,480 --> 00:17:50,920 Speaker 4: username and password and access keys for the infrastructure of
327 00:17:50,960 --> 00:17:53,400 Speaker 4: your company to make that app. It's all well and good,
328 00:17:53,440 --> 00:17:56,159 Speaker 4: it works, and it accidentally gets shared in a database
329 00:17:56,440 --> 00:17:59,840 Speaker 4: like this. Someone who is malicious could then go, well,
330 00:18:00,040 --> 00:18:02,040 Speaker 4: thank you very much for those access keys, I'm now
331 00:18:02,080 --> 00:18:05,560 Speaker 4: going to break into XYZ company. And although we haven't
332 00:18:05,560 --> 00:18:08,800 Speaker 4: seen that happen specifically with this data set, that sort
333 00:18:08,800 --> 00:18:14,040 Speaker 4: of stuff happens constantly, where, you know, an engineer at a company,
334 00:18:14,080 --> 00:18:17,919 Speaker 4: even a very junior one, will put those keys in
335 00:18:18,040 --> 00:18:22,680 Speaker 4: code which is accidentally exposed online. It's accidentally publicly available,
336 00:18:22,840 --> 00:18:24,840 Speaker 4: and that's how we end up with data breaches.
337 00:18:24,920 --> 00:18:27,439 Speaker 1: Now, yeah, I mean, as AI is being marketed as
338 00:18:27,480 --> 00:18:30,720 Speaker 1: a tool for work, obviously the leverage, like, an individual
339 00:18:30,800 --> 00:18:35,360 Speaker 1: consumer has versus OpenAI or Google is really limited, right?
340 00:18:35,400 --> 00:18:38,600 Speaker 1: Like, you know, I can complain and holler and post
341 00:18:38,600 --> 00:18:41,480 Speaker 1: on Reddit, and journalists like you can pick it up.
342 00:18:41,920 --> 00:18:45,640 Speaker 1: But when, you know, Pepsi or Ernst and Young has
343 00:18:45,720 --> 00:18:50,240 Speaker 1: concerns about how its employees' chats are being handled by
344 00:18:50,280 --> 00:18:53,880 Speaker 1: third-party companies, that perhaps, you know, can drive
345 00:18:54,000 --> 00:18:56,680 Speaker 1: change more rapidly, given these are, like, big corporate spenders.
346 00:18:56,680 --> 00:18:59,320 Speaker 1: So I'm curious, do you know anything about what the
347 00:18:59,320 --> 00:19:03,159 Speaker 1: conversations are like, the kind of B to B conversations, around
348 00:19:03,600 --> 00:19:07,360 Speaker 1: operational security for LLMs?
349 00:19:07,280 --> 00:19:09,280 Speaker 4: Well, I mean, I would also draw a parallel even just with
350 00:19:09,440 --> 00:19:13,400 Speaker 4: the intellectual property one, where a lot of these companies
351 00:19:13,400 --> 00:19:17,040 Speaker 4: weren't really paying attention until somebody was taking Mickey Mouse
352 00:19:17,520 --> 00:19:20,960 Speaker 4: and doing some very strange things with AI with it, for example.
353 00:19:20,960 --> 00:19:22,560 Speaker 4: And now of course we have the lawsuit, you know,
354 00:19:22,600 --> 00:19:25,239 Speaker 4: between Disney and Midjourney, for example, which is an
355 00:19:25,280 --> 00:19:30,280 Speaker 4: AI image generation engine. When it comes to security, I
356 00:19:30,320 --> 00:19:33,879 Speaker 4: don't know about the specific conversations, but it's absolutely something
357 00:19:33,920 --> 00:19:37,639 Speaker 4: that people need to be educated about inside their companies.
358 00:19:38,000 --> 00:19:41,320 Speaker 4: Funnily enough, about Disney, there was a breach of Disney
359 00:19:41,640 --> 00:19:43,720 Speaker 4: I think a year ago at this point, and that
360 00:19:43,880 --> 00:19:47,399 Speaker 4: started because one of their employees downloaded a piece of
361 00:19:47,440 --> 00:19:50,560 Speaker 4: software that they believed was some sort of AI agent
362 00:19:50,720 --> 00:19:54,280 Speaker 4: or some sort of AI generation tool. Hidden inside that
363 00:19:54,920 --> 00:19:59,160 Speaker 4: was malware which then stole passwords, and which then logged
364 00:19:59,200 --> 00:20:03,840 Speaker 4: into Disney's Slack and stole a mountain of data. And
365 00:20:03,880 --> 00:20:06,320 Speaker 4: it turns out the hacker behind this had been deliberately
366 00:20:06,640 --> 00:20:10,320 Speaker 4: putting malware into their own custom AI tools to try
367 00:20:10,359 --> 00:20:13,520 Speaker 4: to get unsuspecting people to download it. So this is
368 00:20:13,560 --> 00:20:17,280 Speaker 4: a real threat to anybody working, I think, in any
369 00:20:17,320 --> 00:20:22,040 Speaker 4: sort of company. Hackers do not care really who you are.
370 00:20:22,080 --> 00:20:24,520 Speaker 4: They only care what you may or may not have
371 00:20:25,000 --> 00:20:29,159 Speaker 4: access to, and AI is just another consideration of that,
372 00:20:29,240 --> 00:20:33,200 Speaker 4: whether that's the data that an employee is unwisely putting
373 00:20:33,240 --> 00:20:38,160 Speaker 4: into ChatGPT or a sketchy tool that someone may download.
374 00:20:38,240 --> 00:20:39,720 Speaker 4: You know, like, this is something that we have to
375 00:20:39,760 --> 00:20:40,320 Speaker 4: live with now.
376 00:20:40,520 --> 00:20:43,560 Speaker 2: Joseph, thank you. Thank you, Joseph. Thank you so much.
377 00:20:58,680 --> 00:20:59,359 Speaker 3: For Tech Stuff, I'm Karah Preiss,
378 00:20:59,400 --> 00:21:02,520 Speaker 1: and I'm Oz Woloshyn. This episode was produced
379 00:21:02,560 --> 00:21:05,600 Speaker 1: by Eliza Dennis and Tyler Hill. It was executive produced
380 00:21:05,600 --> 00:21:08,919 Speaker 1: by me, Karah Preiss, and Kate Osborne for Kaleidoscope, and
381 00:21:09,000 --> 00:21:13,120 Speaker 1: Katrina Norvell for iHeart Podcasts. Jack Insley mixed this episode
382 00:21:13,160 --> 00:21:14,840 Speaker 1: and Kyle Murdoch wrote our theme song.
383 00:21:15,040 --> 00:21:17,240 Speaker 3: Join us on Friday for the Week in Tech. Oz and
384 00:21:17,280 --> 00:21:19,800 Speaker 3: I will run through the tech headlines you may have missed.
385 00:21:19,680 --> 00:21:22,159 Speaker 1: And please do rate and review the show wherever you
386 00:21:22,200 --> 00:21:24,560 Speaker 1: listen to your podcasts, and also send us a note
387 00:21:24,600 --> 00:21:27,520 Speaker 1: at Tech Stuff Podcast at gmail dot com with any
388 00:21:27,520 --> 00:21:28,600 Speaker 1: comments or suggestions.