1 00:00:14,040 --> 00:00:16,520 Speaker 1: Welcome to Tech Stuff. This is Tech Support. I'm Oz
2 00:00:16,560 --> 00:00:18,560 Speaker 1: Woloshyn and I'm here with Karah Preiss.
3 00:00:18,680 --> 00:00:19,840 Speaker 2: Hey Oz. Hey Karah.
4 00:00:20,480 --> 00:00:23,480 Speaker 1: So today we wanted to talk about this ChatGPT feature,
5 00:00:23,520 --> 00:00:26,960 Speaker 1: which is now defunct, but our friends at 404
6 00:00:26,960 --> 00:00:30,200 Speaker 1: Media had a story with the headline nearly one
7 00:00:30,280 --> 00:00:34,839 Speaker 1: hundred thousand ChatGPT conversations were searchable on Google. And as
8 00:00:34,880 --> 00:00:36,599 Speaker 1: soon as that email hit my inbox, before I'd
9 00:00:36,600 --> 00:00:38,559 Speaker 1: even read it, I forwarded it to you and to
10 00:00:38,600 --> 00:00:41,440 Speaker 1: our producer Eliza, and I said, let's jump on this.
11 00:00:41,760 --> 00:00:43,280 Speaker 3: Yeah. You know, part of it is that it taps
12 00:00:43,320 --> 00:00:45,400 Speaker 3: into this fear that we all have about our most
13 00:00:45,440 --> 00:00:48,199 Speaker 3: intimate thoughts being made public. This isn't like having a
14 00:00:48,200 --> 00:00:52,199 Speaker 3: private Instagram account. This is very much between us and
15 00:00:52,320 --> 00:00:55,440 Speaker 3: ChatGPT. It's a little bit like talking in our sleep.
16 00:00:55,880 --> 00:00:57,760 Speaker 3: And I think most people who have played around with
17 00:00:57,800 --> 00:01:00,800 Speaker 3: a chatbot have some questions or responses that they'd rather
18 00:01:00,840 --> 00:01:03,040 Speaker 3: the general public be blind to. I know I have
19 00:01:03,080 --> 00:01:03,760 Speaker 3: my fair share.
20 00:01:04,120 --> 00:01:04,360 Speaker 2: Yeah.
21 00:01:04,400 --> 00:01:07,280 Speaker 1: We did that piece recently with Kashmir Hill about AI
22 00:01:07,440 --> 00:01:11,120 Speaker 1: induced psychosis and the guy who'd fallen into the rabbit
23 00:01:11,200 --> 00:01:14,080 Speaker 1: hole by talking with ChatGPT about whether or not
24 00:01:14,080 --> 00:01:16,559 Speaker 1: he might be living in a simulation. So I started
25 00:01:16,600 --> 00:01:18,480 Speaker 1: talking with ChatGPT about this to see if I
26 00:01:18,520 --> 00:01:20,080 Speaker 1: would also be taken down the rabbit hole, and then
27 00:01:20,080 --> 00:01:21,440 Speaker 1: I was like, oh my god, I'm not sure if
28 00:01:21,440 --> 00:01:23,520 Speaker 1: I want this to be made public at a later date.
29 00:01:24,120 --> 00:01:27,200 Speaker 1: So yeah, OpenAI says they're now working with Google
30 00:01:27,280 --> 00:01:31,000 Speaker 1: to scrub these conversations off the web, but of course
31 00:01:31,160 --> 00:01:34,119 Speaker 1: some quick thinkers have already archived them.
32 00:01:34,400 --> 00:01:35,960 Speaker 2: And I can't help but be rather
33 00:01:35,880 --> 00:01:38,360 Speaker 1: curious about what it is that people are talking to
34 00:01:38,440 --> 00:01:39,680 Speaker 1: ChatGPT about.
35 00:01:40,000 --> 00:01:42,399 Speaker 3: I mean, obviously, we do have a segment at the
36 00:01:42,480 --> 00:01:45,320 Speaker 3: end of every Friday episode called Chat and Me about
37 00:01:45,319 --> 00:01:48,520 Speaker 3: how our listeners are really using their chatbots, and now
38 00:01:49,520 --> 00:01:52,440 Speaker 3: we have hundreds of thousands of additional responses to explore.
39 00:01:52,560 --> 00:01:54,920 Speaker 1: Of course, there's a difference between how our listeners tell
40 00:01:55,000 --> 00:01:59,440 Speaker 1: us they're using chatbots and the reality which is apparent from
41 00:01:59,440 --> 00:02:02,440 Speaker 1: these logs, and one researcher actually created a data
42 00:02:02,520 --> 00:02:05,080 Speaker 1: set of all the responses that were indexed by Google,
43 00:02:05,480 --> 00:02:07,480 Speaker 1: and again our friends at 404 Media were
44 00:02:07,480 --> 00:02:10,000 Speaker 1: able to take a look. Here to tell us about
45 00:02:10,000 --> 00:02:12,240 Speaker 1: what everyone's asking ChatGPT is
46 00:02:12,280 --> 00:02:14,120 Speaker 2: 404 Media's Joseph Cox.
47 00:02:13,800 --> 00:02:16,280 Speaker 3: Joseph, welcome back to Tech Stuff.
48 00:02:16,560 --> 00:02:17,600 Speaker 4: Hi, thank you for having me.
49 00:02:17,919 --> 00:02:18,239 Speaker 2: Joseph.
50 00:02:18,320 --> 00:02:20,920 Speaker 1: Let's start at the beginning. How is it that one
51 00:02:21,000 --> 00:02:25,200 Speaker 1: hundred thousand ChatGPT conversations ended up on Google Search?
52 00:02:25,240 --> 00:02:27,080 Speaker 1: I thought that these conversations were private.
53 00:02:27,480 --> 00:02:31,120 Speaker 4: Yeah. So this starts with an article on Fast Company
54 00:02:31,320 --> 00:02:36,760 Speaker 4: on July thirtieth, and that outlet found that ChatGPT
55 00:02:36,880 --> 00:02:41,440 Speaker 4: conversations were being indexed by Google. That is, as your
56 00:02:41,440 --> 00:02:44,639 Speaker 4: listeners will know, Google is constantly going around the web
57 00:02:44,960 --> 00:02:48,639 Speaker 4: and essentially grabbing content from websites. Of course, it can
58 00:02:48,760 --> 00:02:52,240 Speaker 4: use it to make its search engine. What was different
59 00:02:52,320 --> 00:02:56,560 Speaker 4: here was that while ordinarily, when you're talking to ChatGPT,
60 00:02:56,840 --> 00:03:01,480 Speaker 4: thankfully all of the content of that conversation is private,
61 00:03:01,880 --> 00:03:04,600 Speaker 4: in this case, what some people have been doing was
62 00:03:04,680 --> 00:03:07,480 Speaker 4: using, I think, a little-known feature where they could
63 00:03:07,520 --> 00:03:11,399 Speaker 4: share the contents of that communication. Now, maybe you want
64 00:03:11,400 --> 00:03:13,680 Speaker 4: to do that because you want to show your friend, wow,
65 00:03:13,720 --> 00:03:16,920 Speaker 4: look at this really wacky, crazy thing that ChatGPT
66 00:03:17,080 --> 00:03:19,760 Speaker 4: told me. Or maybe there's a business need, right, like, hey,
67 00:03:19,840 --> 00:03:22,360 Speaker 4: I've done this with ChatGPT, now I need to
68 00:03:22,360 --> 00:03:25,320 Speaker 4: show other people in my team. And you would select
69 00:03:25,440 --> 00:03:30,480 Speaker 4: the share feature and this would create a public, essentially
70 00:03:30,520 --> 00:03:35,040 Speaker 4: a public web page version of that chat, and although
71 00:03:35,080 --> 00:03:36,960 Speaker 4: you can then send that to your friends or your
72 00:03:36,960 --> 00:03:40,600 Speaker 4: co-workers, it can also be seen by Google obviously,
73 00:03:41,040 --> 00:03:44,080 Speaker 4: and OpenAI probably could have done some stuff to protect
74 00:03:44,120 --> 00:03:46,680 Speaker 4: it there.
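(The protection Joseph alludes to is, in standard web practice, a robots "noindex" directive: a shared page can carry a robots meta tag in its HTML, or an X-Robots-Tag HTTP response header, asking search engines not to list it even though anyone with the link can still open it. The sketch below, using only Python's standard library, is a hypothetical illustration of that general mechanism, not OpenAI's actual implementation; the host, port, and page contents are made up.)

# Hypothetical sketch: serve a "shared chat" page while asking crawlers not to index it.
from http.server import BaseHTTPRequestHandler, HTTPServer

SHARED_CHAT_HTML = b"""<!doctype html>
<html>
  <head>
    <!-- Mechanism 1: a page-level robots directive inside the HTML itself -->
    <meta name="robots" content="noindex, nofollow">
    <title>Shared conversation</title>
  </head>
  <body>The shared transcript would be rendered here.</body>
</html>"""

class SharedChatHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        # Mechanism 2: a response header that major search engines honor
        self.send_header("X-Robots-Tag", "noindex, nofollow")
        self.end_headers()
        self.wfile.write(SHARED_CHAT_HTML)

if __name__ == "__main__":
    # Placeholder host and port; the page stays reachable by anyone with the
    # link, it just asks search engines not to list it in results.
    HTTPServer(("0.0.0.0", 8000), SharedChatHandler).serve_forever()

(Pages that carry neither signal are treated as indexable by crawlers, which is how shared conversation pages can end up in search results.)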
But the result is that a bunch of
75 00:03:46,680 --> 00:03:51,000 Speaker 4: these conversations are now publicly available and indexed by Google,
76 00:03:51,400 --> 00:03:54,760 Speaker 4: and I seriously doubt that all of the people using
77 00:03:54,840 --> 00:03:59,240 Speaker 4: this share feature really understood what they were getting into.
78 00:03:59,440 --> 00:04:03,440 Speaker 1: Yeah, can you elaborate on that, because I'm thinking about WhatsApp,
79 00:04:03,520 --> 00:04:06,120 Speaker 1: for example, where there's like a forward button, or like
80 00:04:06,200 --> 00:04:11,200 Speaker 1: on X, I can do like a share link to tweet.
81 00:04:11,720 --> 00:04:14,720 Speaker 1: Is this like, somebody thinks they're pressing a button
82 00:04:14,800 --> 00:04:18,440 Speaker 1: to share an individual version of the transcript with another person,
83 00:04:18,880 --> 00:04:21,120 Speaker 1: but in so doing is kind of making their whole
84 00:04:21,480 --> 00:04:24,520 Speaker 1: ChatGPT history visible to Google? Or what's the practical
85 00:04:25,200 --> 00:04:26,720 Speaker 1: explanation of how this happened?
86 00:04:27,040 --> 00:04:32,400 Speaker 4: Yeah, the users are making that particular conversation publicly available,
87 00:04:32,920 --> 00:04:35,359 Speaker 4: and it works in a very similar way to the
88 00:04:35,360 --> 00:04:38,799 Speaker 4: things you just outlined. I sometimes compare it a little
89 00:04:38,839 --> 00:04:42,400 Speaker 4: bit to a Google Doc link where you will go
90 00:04:42,440 --> 00:04:44,800 Speaker 4: and you'll make that public and there's that setting you
91 00:04:44,839 --> 00:04:48,159 Speaker 4: can do that says, hey, anybody with this link is
92 00:04:48,200 --> 00:04:51,600 Speaker 4: going to be able to read your awful article draft.
93 00:04:51,640 --> 00:04:53,359 Speaker 4: I mean that would be my case or whatever, or
94 00:04:53,400 --> 00:04:56,560 Speaker 4: your private thoughts or whatever. But as long as you don't then go
95 00:04:56,680 --> 00:05:00,840 Speaker 4: and paste that link online, Google takes steps so
96 00:05:00,880 --> 00:05:04,440 Speaker 4: that it's not included in search engine results. Of course, if
97 00:05:04,480 --> 00:05:06,080 Speaker 4: you want to post it on a forum or you
98 00:05:06,120 --> 00:05:08,400 Speaker 4: post it on Twitter, that's going to be something else.
99 00:05:08,440 --> 00:05:11,520 Speaker 4: But that's usually how I think most people expect this
100 00:05:11,720 --> 00:05:14,880 Speaker 4: sort of sharing behavior to work. They expect that, well,
101 00:05:14,920 --> 00:05:16,719 Speaker 4: I'm going to just share it with one or two
102 00:05:16,800 --> 00:05:20,040 Speaker 4: people or, you know, a dozen or whatever. They don't
103 00:05:20,120 --> 00:05:24,240 Speaker 4: expect typically that it's going to be available to anyone
104 00:05:24,640 --> 00:05:26,960 Speaker 4: on the Internet who knows where to look, or of
105 00:05:27,000 --> 00:05:31,400 Speaker 4: course anyone with Google now, because Google has archived it
106 00:05:31,480 --> 00:05:34,480 Speaker 4: as well. It's sort of a big mix. The
107 00:05:34,600 --> 00:05:38,159 Speaker 4: user is partly at fault for perhaps not fully understanding
108 00:05:38,160 --> 00:05:40,640 Speaker 4: what is going on.
Of course, OpenAI maybe not
109 00:05:40,640 --> 00:05:43,560 Speaker 4: fully explaining what is going on, and not taking steps
110 00:05:43,600 --> 00:05:46,640 Speaker 4: to stop Google indexing, and then of course Google indexing
111 00:05:46,680 --> 00:05:49,840 Speaker 4: it as well. There's a lot of, maybe blame is
112 00:05:49,880 --> 00:05:51,839 Speaker 4: too strong a word, there's a lot of blame to go around,
113 00:05:51,880 --> 00:05:52,839 Speaker 4: I think, to all parties.
114 00:05:53,600 --> 00:05:55,640 Speaker 2: So this is one hundred thousand conversations.
115 00:05:55,680 --> 00:05:59,920 Speaker 1: Do we know how many users those hundred thousand conversations represent?
116 00:06:00,120 --> 00:06:02,400 Speaker 1: And also, you know, what are some of the things
117 00:06:02,520 --> 00:06:03,560 Speaker 1: in those conversations?
118 00:06:03,680 --> 00:06:05,880 Speaker 4: Yeah, I don't think I've seen figures that drill down
119 00:06:05,960 --> 00:06:08,599 Speaker 4: to how many users, but you're right, it's nearly one
120 00:06:08,680 --> 00:06:14,240 Speaker 4: hundred thousand conversations in this data set the researcher scraped
121 00:06:14,279 --> 00:06:17,680 Speaker 4: from Google. I mean, before this, some researchers were going
122 00:06:17,680 --> 00:06:21,240 Speaker 4: through hundreds of conversations and that was already bad enough,
123 00:06:21,240 --> 00:06:24,919 Speaker 4: and of course newsworthy. What this researcher did was scrape
124 00:06:24,960 --> 00:06:27,320 Speaker 4: them en masse and put them into a data set. And
125 00:06:27,360 --> 00:06:29,880 Speaker 4: I'm actually looking at it now and there's a lot
126 00:06:29,920 --> 00:06:32,359 Speaker 4: of benign stuff in here. It looks like somebody is
127 00:06:32,400 --> 00:06:36,359 Speaker 4: making their first iPhone app and they're using ChatGPT
128 00:06:36,560 --> 00:06:40,560 Speaker 4: for that. There are others where people are clearly discussing
129 00:06:41,080 --> 00:06:45,000 Speaker 4: sensitive business materials, such as, could you help me write
130 00:06:45,000 --> 00:06:48,760 Speaker 4: this contract? There is potentially, you know, some bank information
131 00:06:49,320 --> 00:06:51,839 Speaker 4: in here. I say potentially because it sure looks like
132 00:06:51,880 --> 00:06:55,680 Speaker 4: bank information. And then you have, I mean, you mentioned
133 00:06:55,920 --> 00:07:00,760 Speaker 4: at the top, these sorts of delusional conversations that some
134 00:07:00,800 --> 00:07:04,280 Speaker 4: people have with ChatGPT, and I'm sure there is some
135 00:07:04,360 --> 00:07:07,159 Speaker 4: of that in here. I have seen some people talking
136 00:07:07,160 --> 00:07:12,240 Speaker 4: about therapy. I have seen some people talking about relationship issues,
137 00:07:12,280 --> 00:07:15,160 Speaker 4: such as one, it seems to be a man talking
138 00:07:15,200 --> 00:07:18,080 Speaker 4: about his ex-girlfriend and wondering why she's not looking
139 00:07:18,160 --> 00:07:22,520 Speaker 4: at his Instagram stories, that sort of thing, which I
140 00:07:22,520 --> 00:07:23,440 Speaker 4: don't know if I would term.
141 00:07:23,480 --> 00:07:24,720 Speaker 2: She's just not that into you.
142 00:07:26,080 --> 00:07:28,680 Speaker 4: That means yes, I think ChatGPT was trying to
143 00:07:28,720 --> 00:07:33,760 Speaker 4: say that, basically. So this is only what people have
144 00:07:33,840 --> 00:07:38,120 Speaker 4: decided to share, which is a very interesting caveat to
145 00:07:38,760 --> 00:07:39,280 Speaker 4: the data.
146 00:07:39,440 --> 00:07:40,920 Speaker 1: They don't want to share it with the world, but
147 00:07:40,960 --> 00:07:43,480 Speaker 1: they've chosen at least one other person to share it with,
148 00:07:43,560 --> 00:07:47,360 Speaker 1: so therefore, by definition, it's not their most private use case.
149 00:07:47,600 --> 00:07:51,800 Speaker 4: Yes, and maybe the researcher or others will be able
150 00:07:51,840 --> 00:07:55,240 Speaker 4: to do some sort of deeper analysis on this than me.
151 00:07:55,640 --> 00:07:57,840 Speaker 4: But that's interesting in that, what are the sorts of
152 00:07:57,880 --> 00:08:00,720 Speaker 4: things that people are willing to share with another person?
153 00:08:00,880 --> 00:08:02,760 Speaker 4: And of course, you know, what does that tell us
154 00:08:02,760 --> 00:08:05,480 Speaker 4: about the things they're not sharing? That being said, I
155 00:08:05,480 --> 00:08:07,760 Speaker 4: don't think anybody wants a security issue where we're actually
156 00:08:07,760 --> 00:08:09,560 Speaker 4: able to see all of that private data either.
157 00:08:10,200 --> 00:08:12,360 Speaker 3: So this was something that was reported out a few
158 00:08:12,400 --> 00:08:15,239 Speaker 3: weeks ago, as you said. Has there been any change,
159 00:08:15,440 --> 00:08:19,920 Speaker 3: and how did OpenAI respond to the exclusive?
160 00:08:19,720 --> 00:08:23,680 Speaker 4: So OpenAI has now disabled this, like, opt-in
161 00:08:24,160 --> 00:08:27,440 Speaker 4: sharing feature, because the company actually said they don't think
162 00:08:27,480 --> 00:08:30,840 Speaker 4: people fully understood what was going on. And then the
163 00:08:30,880 --> 00:08:33,960 Speaker 4: company also says it is working with Google to remove
164 00:08:34,520 --> 00:08:37,839 Speaker 4: some of those indexed results. Because of course there's a
165 00:08:37,880 --> 00:08:40,120 Speaker 4: few things going on here. There's the exposure in the
166 00:08:40,160 --> 00:08:43,520 Speaker 4: first place, there's the sharing, there's the indexing by Google.
167 00:08:43,760 --> 00:08:48,240 Speaker 4: But even if Google does remove these search results, these
168 00:08:48,520 --> 00:08:52,600 Speaker 4: chats have been archived by this researcher, and I presume
169 00:08:52,760 --> 00:08:55,800 Speaker 4: others as well. Like, I seriously doubt there's only one
170 00:08:55,880 --> 00:08:58,680 Speaker 4: or two people who grabbed all of this data. It's
171 00:08:58,960 --> 00:09:02,800 Speaker 4: very much an interesting privacy issue that I think researchers
172 00:09:02,800 --> 00:09:04,160 Speaker 4: want to look into and learn from.
173 00:09:04,440 --> 00:09:07,520 Speaker 3: I don't understand why OpenAI seemed to think that
174 00:09:07,559 --> 00:09:09,560 Speaker 3: this tool would be useful. Like, have you given that
175 00:09:09,600 --> 00:09:10,080 Speaker 3: any thought?
176 00:09:10,600 --> 00:09:14,520 Speaker 4: Yeah, I think that people do want to sometimes share
177 00:09:15,160 --> 00:09:21,319 Speaker 4: the interesting or crazy or insightful stuff they get from ChatGPT.
Now,
178 00:09:21,720 --> 00:09:25,679 Speaker 4: OpenAI probably should have taken steps to ensure that
179 00:09:25,720 --> 00:09:29,920 Speaker 4: people can share this in a much more private manner,
180 00:09:30,200 --> 00:09:33,679 Speaker 4: maybe something like you have to add a particular Chat
181 00:09:33,800 --> 00:09:36,520 Speaker 4: GPT user to the conversation, then they can see it,
182 00:09:36,559 --> 00:09:38,880 Speaker 4: in the same way you add somebody to a Google Doc,
183 00:09:39,000 --> 00:09:42,119 Speaker 4: for example. That would be a little bit more laborious,
184 00:09:42,160 --> 00:09:44,880 Speaker 4: there'd be a bit more friction there. But I'm just
185 00:09:45,000 --> 00:09:49,280 Speaker 4: interested in why OpenAI did not take more steps
186 00:09:49,320 --> 00:09:52,640 Speaker 4: to protect this from being scraped by Google. It is
187 00:09:52,840 --> 00:09:57,480 Speaker 4: possible to share material online without it being touched by
188 00:09:57,520 --> 00:10:00,240 Speaker 4: search engines. You can ask search engines, hey, if you
189 00:10:00,240 --> 00:10:03,800 Speaker 4: come across this, please do not index it. I'm curious
190 00:10:03,840 --> 00:10:06,840 Speaker 4: why OpenAI did not take those steps, and I don't
191 00:10:06,880 --> 00:10:10,240 Speaker 4: have any insight either way. But the result is that
192 00:10:10,280 --> 00:10:12,800 Speaker 4: all of these chats have now been indexed on Google,
193 00:10:12,840 --> 00:10:14,160 Speaker 4: and I think that's pretty significant.
194 00:10:14,440 --> 00:10:15,720 Speaker 2: What do you think might happen next?
195 00:10:15,960 --> 00:10:19,560 Speaker 4: What happens next is that I think other companies are
196 00:10:19,640 --> 00:10:24,880 Speaker 4: going to start checking whether they also have similar issues
197 00:10:25,440 --> 00:10:27,000 Speaker 4: like this. And I do want to stress, like, this
198 00:10:27,040 --> 00:10:30,559 Speaker 4: is not the vast majority of ChatGPT conversations or
199 00:10:30,559 --> 00:10:33,880 Speaker 4: anything like that. ChatGPT was not hacked, it wasn't breached.
200 00:10:33,920 --> 00:10:38,240 Speaker 4: There was a somewhat niche security issue, but because these
201 00:10:38,280 --> 00:10:42,640 Speaker 4: tools are becoming so, so popular now, even a relatively
202 00:10:42,760 --> 00:10:45,640 Speaker 4: niche issue can actually impact a ton of people.
203 00:10:51,960 --> 00:10:56,560 Speaker 3: After the break: so how secure are AI chatbots? Stay
204 00:10:56,640 --> 00:10:56,959 Speaker 3: with us.
205 00:11:11,720 --> 00:11:16,560 Speaker 1: It's interesting, because Sam Altman was recently on Theo Von's
206 00:11:16,640 --> 00:11:20,560 Speaker 1: podcast and he was sort of pointing out some of
207 00:11:20,600 --> 00:11:25,080 Speaker 1: the risks, to my surprise, about the privacy issues in
208 00:11:25,640 --> 00:11:29,280 Speaker 1: ChatGPT. He was saying, like, therapist conversations are protected
209 00:11:29,280 --> 00:11:34,040 Speaker 1: by HIPAA, lawyer conversations are protected by attorney-client privilege,
210 00:11:34,040 --> 00:11:37,360 Speaker 1: and people assume that when they're talking with ChatGPT that
211 00:11:37,520 --> 00:11:40,839 Speaker 1: maybe some of these protections apply, whereas in fact they don't.
212 00:11:41,120 --> 00:11:43,720 Speaker 1: And I was kind of wondering why he, of all people,
213 00:11:44,040 --> 00:11:46,560 Speaker 1: was out there on this topic. I did read some
214 00:11:46,600 --> 00:11:48,880 Speaker 1: other reporting saying that it may be part of the
215 00:11:49,400 --> 00:11:51,640 Speaker 1: lawsuit with the New York Times. The New York Times,
216 00:11:51,679 --> 00:11:55,480 Speaker 1: as part of their discovery in the lawsuit against Open
217 00:11:55,520 --> 00:11:58,400 Speaker 1: AI for copyright infringement, is demanding, I think, one hundred
218 00:11:58,480 --> 00:12:03,000 Speaker 1: million OpenAI conversations for analysis. But I was
219 00:12:03,040 --> 00:12:06,120 Speaker 1: surprised to hear Altman out there on this. Nonetheless, can
220 00:12:06,120 --> 00:12:08,400 Speaker 1: you kind of take a step back and maybe reflect
221 00:12:08,440 --> 00:12:12,239 Speaker 1: on this story about the breach in the broader context
222 00:12:12,559 --> 00:12:18,400 Speaker 1: of how people are using chatbots and what chatbot makers
223 00:12:18,600 --> 00:12:21,920 Speaker 1: are incentivized to do or not do to protect their users?
224 00:12:22,360 --> 00:12:25,319 Speaker 4: Yeah, so I haven't seen those comments. But to zoom
225 00:12:25,360 --> 00:12:29,240 Speaker 4: out a little bit, Altman and other people in the space,
226 00:12:29,880 --> 00:12:34,160 Speaker 4: they enjoy kind of having their cake and eating it too,
227 00:12:34,240 --> 00:12:37,480 Speaker 4: where on one side they will warn about the dangers
228 00:12:37,480 --> 00:12:40,640 Speaker 4: of AI. They'll say it needs to be regulated, it
229 00:12:40,640 --> 00:12:43,600 Speaker 4: needs to be taken really very seriously, and also it
230 00:12:43,679 --> 00:12:45,679 Speaker 4: is coming and there's nothing we can do about it,
231 00:12:45,920 --> 00:12:48,600 Speaker 4: while also building those tools at the same time and
232 00:12:48,640 --> 00:12:50,880 Speaker 4: making a lot of money from it. They actually benefit
233 00:12:50,920 --> 00:12:53,600 Speaker 4: from being on both sides of the conversation at the
234 00:12:53,640 --> 00:12:58,000 Speaker 4: same time, and Altman and others very easily switch between
235 00:12:58,000 --> 00:13:01,560 Speaker 4: those positions depending on the context in which they're talking.
236 00:13:01,600 --> 00:13:05,079 Speaker 4: So of course, you know, an AI developer can say
237 00:13:05,480 --> 00:13:08,800 Speaker 4: very, very sensitive stuff is going on here and people
238 00:13:08,880 --> 00:13:10,680 Speaker 4: need to be careful, and then on the other side
239 00:13:10,679 --> 00:13:13,840 Speaker 4: they'll say, well, our technology is absolutely suitable for that
240 00:13:13,880 --> 00:13:16,960 Speaker 4: because we take privacy very seriously, or whatever. I've just
241 00:13:17,040 --> 00:13:19,920 Speaker 4: kind of got a little bit jaded by all of
242 00:13:19,960 --> 00:13:22,880 Speaker 4: these companies playing both sides at the same time, and
243 00:13:22,920 --> 00:13:27,760 Speaker 4: that's why I think you need outside journalists, outside experts, policymakers,
244 00:13:28,240 --> 00:13:31,319 Speaker 4: activists who can probe it a little bit more, because
245 00:13:31,360 --> 00:13:34,560 Speaker 4: every time I hear Altman or someone similar make these
246 00:13:34,600 --> 00:13:37,240 Speaker 4: points about their own technology, I have to remember, yeah,
247 00:13:37,280 --> 00:13:37,920 Speaker 4: but they're making it.
248 00:13:38,120 --> 00:13:38,840 Speaker 2: Yeah.
249 00:13:39,120 --> 00:13:42,199 Speaker 3: OpenAI is apparently trying to remove the shared content
250 00:13:42,240 --> 00:13:45,760 Speaker 3: from search engines, but smart people like this researcher accessed
251 00:13:45,760 --> 00:13:48,520 Speaker 3: and stored it while it was live. While they're using
252 00:13:48,559 --> 00:13:51,000 Speaker 3: it for an altruistic purpose, I'm wondering if you think
253 00:13:51,040 --> 00:13:54,920 Speaker 3: people should be concerned, like, what if they do end
254 00:13:55,000 --> 00:13:55,880 Speaker 3: up in the wrong hands?
255 00:13:56,160 --> 00:13:59,440 Speaker 4: I don't think people need to necessarily be concerned about
256 00:13:59,520 --> 00:14:02,880 Speaker 4: this specific breach. I mean, that being said, maybe there's
257 00:14:03,000 --> 00:14:05,680 Speaker 4: something really, really bad in there and I simply haven't
258 00:14:05,800 --> 00:14:08,080 Speaker 4: seen it, and the researcher and others are going to
259 00:14:08,120 --> 00:14:12,080 Speaker 4: continue to dig through it. But people should absolutely be
260 00:14:12,200 --> 00:14:15,800 Speaker 4: careful with how they are using chatbots. I mean, maybe
261 00:14:15,800 --> 00:14:18,480 Speaker 4: they used this now-disabled feature and maybe they're going
262 00:14:18,520 --> 00:14:21,000 Speaker 4: to be concerned about that. But putting that aside, you
263 00:14:21,280 --> 00:14:25,400 Speaker 4: have to remember every single command, every single prompt, every
264 00:14:25,400 --> 00:14:29,200 Speaker 4: single sentence that you put into ChatGPT or any
265 00:14:29,240 --> 00:14:33,000 Speaker 4: of these other ones, it is going somewhere. It's not
266 00:14:33,560 --> 00:14:36,720 Speaker 4: just sat on your computer. It's not being locally processed.
267 00:14:36,880 --> 00:14:40,200 Speaker 4: It's going off to their systems, and ultimately you don't
268 00:14:40,280 --> 00:14:43,480 Speaker 4: really know what it's being used for. That is, maybe
269 00:14:43,480 --> 00:14:47,360 Speaker 4: it's used in retraining and improving the training of the system itself,
270 00:14:47,560 --> 00:14:51,960 Speaker 4: or whether there's some sort of quirk in its security
271 00:14:52,040 --> 00:14:54,640 Speaker 4: or privacy or sharing settings that ends up with it
272 00:14:54,720 --> 00:14:58,280 Speaker 4: now being publicly available. And I know that I'm a
273 00:14:58,320 --> 00:15:00,600 Speaker 4: little bit more extreme than others, but I would never
274 00:15:01,040 --> 00:15:04,640 Speaker 4: put sensitive information into one of these things. And I
275 00:15:04,720 --> 00:15:08,920 Speaker 4: know that plenty of companies are having to implement policies
276 00:15:08,960 --> 00:15:12,600 Speaker 4: where they tell employees, please do not put confidential information
277 00:15:13,000 --> 00:15:16,240 Speaker 4: into a chatbot that we don't own. I think people
278 00:15:16,320 --> 00:15:20,160 Speaker 4: just have to be really, really cognizant of that. In
279 00:15:20,200 --> 00:15:22,920 Speaker 4: the same way that when we all first got smartphones,
280 00:15:22,960 --> 00:15:25,800 Speaker 4: we had to learn, oh, right, it's tracking my location
281 00:15:25,960 --> 00:15:28,320 Speaker 4: data if I turn location data on.
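(To make that concrete, here is a rough, hypothetical sketch of what any hosted chatbot client does when you hit send: the text is packaged into an HTTPS request and processed on the provider's servers rather than on your device. The request shape below follows OpenAI's public chat completions API purely as an illustration; the model name, prompt, and environment variable are placeholders, and this is not a description of the ChatGPT app's internals.)

# Hypothetical illustration: the prompt leaves your machine in a network call.
import os
import requests

API_KEY = os.environ["OPENAI_API_KEY"]  # read from the environment; never hard-code credentials

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o-mini",  # placeholder model name
        "messages": [
            # Everything in this list is transmitted to, and processed on,
            # the provider's infrastructure; none of it is handled locally.
            {"role": "user", "content": "Help me draft this confidential contract..."},
        ],
    },
    timeout=30,
)
print(response.json()["choices"][0]["message"]["content"])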
I think we
282 00:15:28,360 --> 00:15:30,840 Speaker 4: need to remember and to learn, oh, when I put
283 00:15:30,840 --> 00:15:34,200 Speaker 4: this thing into ChatGPT, I don't know exactly where
284 00:15:34,200 --> 00:15:37,400 Speaker 4: it's going, and it could potentially bite me later if
285 00:15:37,400 --> 00:15:38,080 Speaker 4: I'm not careful.
286 00:15:38,360 --> 00:15:40,000 Speaker 2: Yeah, and I think it's an important point,
287 00:15:40,040 --> 00:15:42,320 Speaker 1: just as we think about the stakes of the, you know,
288 00:15:42,360 --> 00:15:46,000 Speaker 1: OpenAI or ChatGPT logs being indexed and available on
289 00:15:46,040 --> 00:15:50,120 Speaker 1: Google. Because, like, information that, you know, you share with
290 00:15:50,200 --> 00:15:53,359 Speaker 1: a chatbot that you may think is more or less harmless
291 00:15:53,840 --> 00:15:58,480 Speaker 1: could have, you know, identifying information or sensitive personal information
292 00:15:58,560 --> 00:16:01,640 Speaker 1: about addresses or accounts or whatever it may be.
293 00:16:01,800 --> 00:16:05,000 Speaker 2: And so I think there's this kind of almost
294 00:16:04,640 --> 00:16:09,240 Speaker 1: willful ignorance which many of us, including me, persist with
295 00:16:09,400 --> 00:16:13,479 Speaker 1: despite knowing better, in terms of how important proper security
296 00:16:13,520 --> 00:16:17,480 Speaker 1: practices around digital information are. And as you say, like,
297 00:16:17,800 --> 00:16:21,120 Speaker 1: we're all of a sudden standing on the doorstep of
298 00:16:21,200 --> 00:16:22,920 Speaker 1: a much more scary reality.
299 00:16:23,280 --> 00:16:26,720 Speaker 4: Yeah, I would say that with security you really have
300 00:16:26,800 --> 00:16:30,280 Speaker 4: to be proactive rather than reactive. After something has happened,
301 00:16:30,520 --> 00:16:34,480 Speaker 4: you know, your bank account got broken into or anything
302 00:16:34,560 --> 00:16:37,040 Speaker 4: like that, sure, you can deal with it, but it's
303 00:16:37,080 --> 00:16:39,280 Speaker 4: going to be annoying, it's going to be hard, it's
304 00:16:39,320 --> 00:16:41,520 Speaker 4: going to be tricky, and maybe some people steal some
305 00:16:41,560 --> 00:16:44,120 Speaker 4: money from you, maybe somebody hacks into your company or
306 00:16:44,160 --> 00:16:48,920 Speaker 4: something like that. You really should do security proactively if
307 00:16:48,920 --> 00:16:51,160 Speaker 4: you can. And that's really a thing that applies to everybody,
308 00:16:51,160 --> 00:16:53,760 Speaker 4: which isn't to say that it should be on users
309 00:16:53,800 --> 00:16:56,080 Speaker 4: all of the time. It really is up to the
310 00:16:56,080 --> 00:16:59,400 Speaker 4: people who make these products, such as ChatGPT by
311 00:16:59,440 --> 00:17:02,960 Speaker 4: OpenAI or whatever else, for them to put in
312 00:17:03,000 --> 00:17:06,520 Speaker 4: these guardrails so people can't make these mistakes in the
313 00:17:06,560 --> 00:17:07,160 Speaker 4: first place.
314 00:17:07,680 --> 00:17:09,159 Speaker 3: You were lucky enough to get a hold of this
315 00:17:09,240 --> 00:17:11,639 Speaker 3: data set by this researcher. Do you know what the
316 00:17:11,640 --> 00:17:14,120 Speaker 3: researcher is planning to do with the information?
317 00:17:14,119 --> 00:17:20,000 Speaker 4: Not specifically, beyond analyzing it for trends,
I believe, seeing
318 00:17:20,040 --> 00:17:25,320 Speaker 4: what is in there. Absolutely no criminal activity or anything
319 00:17:25,400 --> 00:17:28,240 Speaker 4: like that. But again, that's not to say that other
320 00:17:28,280 --> 00:17:30,879 Speaker 4: people may not be doing that as well. I can
321 00:17:30,960 --> 00:17:34,240 Speaker 4: imagine a situation in which, let's say, and this is a hypothetical,
322 00:17:34,359 --> 00:17:36,520 Speaker 4: but I'm sure I can find something that would reflect
323 00:17:36,520 --> 00:17:38,879 Speaker 4: this in some sort of data set. There, say you
324 00:17:38,920 --> 00:17:42,640 Speaker 4: were using ChatGPT or something similar to make a quick
325 00:17:42,760 --> 00:17:46,360 Speaker 4: prototype app for your company. In that, you include your
326 00:17:46,480 --> 00:17:50,920 Speaker 4: username and password and access keys for the infrastructure of
327 00:17:50,960 --> 00:17:53,400 Speaker 4: your company to make that app. It's all well and good,
328 00:17:53,440 --> 00:17:56,159 Speaker 4: it works, and it accidentally gets shared in a database
329 00:17:56,440 --> 00:17:59,840 Speaker 4: like this. Someone who is malicious could then go, well,
330 00:18:00,040 --> 00:18:02,040 Speaker 4: thank you very much for those access keys, I'm now
331 00:18:02,080 --> 00:18:05,560 Speaker 4: going to break into XYZ company. And although we haven't
332 00:18:05,560 --> 00:18:08,800 Speaker 4: seen that happen specifically with this data set, that sort
333 00:18:08,800 --> 00:18:14,040 Speaker 4: of stuff happens constantly, where, you know, an engineer at a company,
334 00:18:14,080 --> 00:18:17,919 Speaker 4: even a very junior one, will put those keys in
335 00:18:18,040 --> 00:18:22,680 Speaker 4: code which is accidentally exposed online. It's accidentally publicly available,
336 00:18:22,840 --> 00:18:24,840 Speaker 4: and that's how we end up with data breaches.
337 00:18:24,920 --> 00:18:27,439 Speaker 1: Now, yeah, I mean, as AI is being marketed as
338 00:18:27,480 --> 00:18:30,720 Speaker 1: a tool for work, obviously the leverage, like, an individual
339 00:18:30,800 --> 00:18:35,360 Speaker 1: consumer has versus OpenAI or Google is really limited, right?
340 00:18:35,400 --> 00:18:38,600 Speaker 1: Like, you know, I can complain and holler and post
341 00:18:38,600 --> 00:18:41,480 Speaker 1: on Reddit, and journalists like you can pick it up.
342 00:18:41,920 --> 00:18:45,640 Speaker 1: But when, you know, Pepsi or Ernst and Young has
343 00:18:45,720 --> 00:18:50,240 Speaker 1: concerns about how its employees' chats are being handled by
344 00:18:50,280 --> 00:18:53,880 Speaker 1: third-party companies, that perhaps, you know, can drive
345 00:18:54,000 --> 00:18:56,680 Speaker 1: change more rapidly, given these are, like, big corporate spenders.
346 00:18:56,680 --> 00:18:59,320 Speaker 1: So I'm curious, do you know anything about what the
347 00:18:59,320 --> 00:19:03,159 Speaker 1: conversations are like, the kind of B to B conversations, around
348 00:19:03,600 --> 00:19:07,360 Speaker 1: operational security for LLMs?
349 00:19:07,280 --> 00:19:09,280 Speaker 4: Well, I mean, I would also draw a parallel even just with
350 00:19:09,440 --> 00:19:13,400 Speaker 4: the intellectual property one, where a lot of these companies
351 00:19:13,400 --> 00:19:17,040 Speaker 4: weren't really paying attention until somebody was taking Mickey Mouse
352 00:19:17,520 --> 00:19:20,960 Speaker 4: and doing some very strange things with AI with it, for example.
353 00:19:20,960 --> 00:19:22,560 Speaker 4: And now of course we have the lawsuit, you know,
354 00:19:22,600 --> 00:19:25,239 Speaker 4: between Disney and Midjourney, for example, which is an
355 00:19:25,280 --> 00:19:30,280 Speaker 4: AI image generation engine. When it comes to security, I
356 00:19:30,320 --> 00:19:33,879 Speaker 4: don't know about the specific conversations, but it's absolutely something
357 00:19:33,920 --> 00:19:37,639 Speaker 4: that people need to be educated about inside their companies.
358 00:19:38,000 --> 00:19:41,320 Speaker 4: Funnily enough, about Disney, there was a breach of Disney
359 00:19:41,640 --> 00:19:43,720 Speaker 4: I think a year ago at this point, and that
360 00:19:43,880 --> 00:19:47,399 Speaker 4: started because one of their employees downloaded a piece of
361 00:19:47,440 --> 00:19:50,560 Speaker 4: software that they believed was some sort of AI agent
362 00:19:50,720 --> 00:19:54,280 Speaker 4: or some sort of AI generation tool. Hidden inside that
363 00:19:54,920 --> 00:19:59,160 Speaker 4: was malware which then stole passwords, and which then logged
364 00:19:59,200 --> 00:20:03,840 Speaker 4: into Disney's Slack and stole a mountain of data. And
365 00:20:03,880 --> 00:20:06,320 Speaker 4: it turns out the hacker behind this had been deliberately
366 00:20:06,640 --> 00:20:10,320 Speaker 4: putting malware into their own custom AI tools to try
367 00:20:10,359 --> 00:20:13,520 Speaker 4: to get unsuspecting people to download it. So this is
368 00:20:13,560 --> 00:20:17,280 Speaker 4: a real threat to anybody working, I think, in any
369 00:20:17,320 --> 00:20:22,040 Speaker 4: sort of company. Hackers do not care really who you are.
370 00:20:22,080 --> 00:20:24,520 Speaker 4: They only care what you may or may not have
371 00:20:25,000 --> 00:20:29,159 Speaker 4: access to, and AI is just another consideration of that,
372 00:20:29,240 --> 00:20:33,200 Speaker 4: whether that's the data that an employee is unwisely putting
373 00:20:33,240 --> 00:20:38,160 Speaker 4: into ChatGPT or a sketchy tool that someone may download.
374 00:20:38,240 --> 00:20:39,720 Speaker 4: You know, like, this is something that we have to
375 00:20:39,760 --> 00:20:40,320 Speaker 4: live with now.
376 00:20:40,520 --> 00:20:43,560 Speaker 2: Joseph, thank you. Thank you, Joseph. Thank you so much.
377 00:20:58,680 --> 00:20:59,359 Speaker 3: For Tech Stuff, I'm Karah Preiss,
378 00:20:59,400 --> 00:21:02,520 Speaker 1: and I'm Oz Woloshyn. This episode was produced
379 00:21:02,560 --> 00:21:05,600 Speaker 1: by Eliza Dennis and Tyler Hill. It was executive produced
380 00:21:05,600 --> 00:21:08,919 Speaker 1: by me, Karah Preiss, and Kate Osborne for Kaleidoscope, and
381 00:21:09,000 --> 00:21:13,120 Speaker 1: Katrina Norvell for iHeart Podcasts. Jack Insley mixed this episode
382 00:21:13,160 --> 00:21:14,840 Speaker 1: and Kyle Murdoch wrote our theme song.
383 00:21:15,040 --> 00:21:17,240 Speaker 3: Join us on Friday for the Week in Tech. Oz and
384 00:21:17,280 --> 00:21:19,800 Speaker 3: I will run through the tech headlines you may have missed.
385 00:21:19,680 --> 00:21:22,159 Speaker 1: And please do rate and review the show wherever you
386 00:21:22,200 --> 00:21:24,560 Speaker 1: listen to your podcasts, and also send us a note
387 00:21:24,600 --> 00:21:27,520 Speaker 1: at Tech Stuff Podcast at gmail dot com with any
388 00:21:27,520 --> 00:21:28,600 Speaker 1: comments or suggestions.