Speaker 1: Welcome to Tech Stuff, a production from iHeartRadio. Hey there, and welcome to Tech Stuff. I'm your host, Jonathan Strickland. I'm an executive producer with iHeartRadio, and how the tech are you? I'm here to tell you something: you write like a robot. But that's okay, because I do too. One of the founding fathers of the United States, James Madison, wrote like a robot. Robots weren't even a thing when he was writing back in the eighteenth century, all right, so really, I guess it's more fair to say that robots write like us. And while I'm having a little bit of fun using the word robots, what I'm really talking about is generative AI, you know, stuff like ChatGPT and Google Bard, that kind of thing. These AI-powered chatbots write like humans, right? That's one of the big selling features of the chatbots: one, that they can understand a prompt we give them, that they can understand what we mean when we give them a prompt; and two, that they then generate a response as if it had been written by an actual person. But obviously this also creates some challenges, some issues. So you might remember that since ChatGPT became publicly available last year, when OpenAI opened it up and let people start playing with it, there were people in education, teachers and administrators, that sort of thing, who raised the alarm about the possibility that students could use ChatGPT and similar tools to auto-generate essays and stuff, and thus bypass school assignments. "My robot wrote it for me." Beyond the education sector, there are plenty of arenas where people are worried that the less scrupulous folks out there will attempt to pass off AI-generated text as their own writing, whether this is creative writing or business writing, whatever it may be.
Speaker 1: So this then leads us to the concept of AI writing detection tools, you know, some sort of tool to determine if a piece of text originated from a real human being or from that character Haley Joel Osment played in that film about artificial intelligence. I forget what that movie was called. Subsequent to the release of these detection tools, we started hearing reports of teachers failing students, sometimes an entire class of students, because the detection tool indicated that the real source of the works being turned in wasn't the students, but AI. Now, a lot of students have actually come forward to argue that no, no, they wrote those pieces themselves, that they authored that work, they didn't use AI to do it, and that they are the victims of false positives, that these writing detection tools made a mistake. And as it turns out, at least some of them, and likely a lot of them, were telling the truth. And we can say that because these AI writing detection tools have abysmal accuracy rates. They can be worse than chance. That's how bad these tools can be. The success rate for an AI writing detector can be so low that it has led some of the companies to shut them down, and it has led a lot of critics to just dismiss the concept of an AI writing detection tool entirely. In fact, there are quite a few who have argued that AI writing detection tools are essentially snake oil: that there are companies claiming to make reliable tools that can tell the difference between text written by a person and text written by AI, but really they're just peddling a hoax or a scam, trying to make money selling these tools to various organizations like schools and such, when in fact those tools don't work, or at least they don't work very well.
Speaker 1: Even OpenAI, the company responsible for ChatGPT, had a tool that was meant to detect whether or not something was written by AI. It was called AI Classifier, but they shut it down earlier this year. Why? Because its accuracy rate was twenty-six percent. Twenty-six percent accurate. That is bonkers. That means nearly three quarters of the time, that detection tool came up with the wrong answer. Either it gave a pass to an AI-generated piece, or it accused a work that a human being actually wrote, like definitively wrote, of being the product of AI. This brings us to James Madison. James Madison wrote the US Constitution, and folks have fed the US Constitution into these AI writing detection tools and received a notification that the piece was very likely written by AI, which obviously led to lots of jocularity on the Internet, as people said, "I knew it. I knew the founding fathers of the United States of America were really robots from the future, sent back in time to create an ultra-capitalist society that preys upon the disenfranchised," or something like that. There were a lot of jokes about it. But the fact is, no, it's just that this writing detection tool is completely unreliable. So you certainly cannot use these kinds of tools to justify flunking an entire class of students when you know the reliability is so low. Now, I decided to do this short episode about AI writing detection tools after reading a couple of great pieces in Ars Technica. Those of y'all who listen to my show frequently know that I often reference Ars Technica, because the folks there reliably post great articles. In this case, the author of both pieces I read was Benj Edwards, that's B-E-N-J, Edwards. And at some point I probably should reach out to them and ask if they would like to join Tech Stuff for an episode to talk about something like generative AI, because Edwards has done some really good work.
Speaker 1: Anyway, as we think about how this generative AI works, the underlying technology that powers it, we start to see why there's this big reliability problem, why we're having such issues with an automated detection tool really determining whether something was written by a person or by AI. It's because tools like ChatGPT are built on top of large language models, also known as LLMs. And if we take a moment to really understand LLMs, then we start to get a handle on why these detector tools are so unreliable. So first off, let's actually talk about a precursor to large language models. This would be recurrent neural networks, or RNNs. Now, I've talked a lot about neural networks on this show, but just as a refresher: a neural network is an attempt to create a computer system or computer model that processes information in a way that is similar to how our brains process information. So you have layers of artificial neurons, or you can think of them as nodes. These layers connect to other layers of artificial neurons; you have multiple connections from each neuron to other neurons, and you have layers that go from top to bottom. You can think of it like this: at the top is where you put input, and at the bottom is where you get output. So essentially, you feed information into the model, and the information goes through a series of operations in which data passes through these different nodes. The nodes make decisions based upon the input, then send output to different nodes, and eventually you get the ultimate output. Sometimes that output is correct; it gives you the answer that is correct. Sometimes it's wrong. And typically what that means is that you then have to adjust how those artificial neurons are making decisions. Those neurons apply a sort of bias to input, we call it a weight, so they will favor some types of input over others in an effort to make a decision.
Speaker 1: If they didn't, the data would never go anywhere. You would never be able to have it processed through the system. So the weighting affects how the neuron actually processes the data, where it passes it on to. It may say: if the value is greater than X, send to node A; if the value is less than X, send to node B. That could be a very basic weight, X being the weight in that case, and maybe that would lead you to a correct outcome. So by adjusting the weighting, you can change how these neurons make decisions. And if you build a neural network for a purpose, let's give it a hypothetical, let's say it's identifying pictures of cats (it's always my go-to), and you start looking at the output and you see that it is mistakenly saying that pictures of flowers are pictures of cats, you would say, all right, the nodes in this artificial neural network are making the wrong decisions. The weighting is wrong in these nodes. I need to go and start adjusting things so that I can get it back to correctly saying whether or not an image has a cat in it. And your goal is to train this model over and over and over again until it gets better and better at this task, so that you can then send it any raw data you like and not have to worry about checking up on it afterward, because its accuracy level will be high enough to be reliable. That's your ultimate goal, but there's a whole process of learning, of training, that you have to go through first.
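Speaker 1: Just to make that weighting idea concrete, here's a bare-bones Python sketch of a single artificial neuron, with a crude update rule for when it gets the answer wrong. Everything in it, the made-up "cat features," the numbers, the learning rate, is invented for illustration; real networks stack many layers and use much fancier training, but the flavor is the same.

```python
# A toy "neuron": weighted inputs, a threshold decision, and a crude
# weight nudge whenever the answer is wrong. Purely illustrative.

def neuron(inputs, weights, threshold=0.5):
    # Weighted sum: the "bias" that favors some inputs over others.
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0  # 1 = "cat", 0 = "not a cat"

# Made-up training data: two fake features per image, label 1 = cat.
examples = [([0.9, 0.1], 1), ([0.8, 0.3], 1), ([0.1, 0.9], 0), ([0.2, 0.7], 0)]
weights = [0.0, 0.0]

# Training loop: when the neuron is wrong, shift the weights toward
# the inputs that should have mattered more.
for _ in range(20):
    for inputs, label in examples:
        error = label - neuron(inputs, weights)
        weights = [w + 0.1 * error * x for w, x in zip(weights, inputs)]

print(weights)                       # the learned weighting
print(neuron([0.85, 0.2], weights))  # should now say 1: "cat"
```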
Speaker 1: Now, a recurrent neural network is a little more specific than just an artificial neural network. Recurrent neural networks use sequential data: these networks can and do take information from earlier inputs into consideration when processing a new input. There's a different model, the convolutional neural network, or CNN (not the news channel), which is the other big type of neural network, where every time data goes into an input, it's like a blank slate. It's its own thing; nothing about that decision is based upon any past decision. It's an instance-by-instance kind of case, so you're starting from scratch. But with recurrent neural networks, the network can actually incorporate past inputs as part of how it processes a current input. One issue with these types of networks, the recurrent neural networks, is that they need a full sequence before they can produce a result. So when we're talking about text, like if we wanted to process text through a recurrent neural network, it would need to work over the entire text before producing a result in order to understand things like context. Sometimes this approach can lead to errors, because the model essentially forgets the stuff that was at the beginning of the text by the time it gets to the end. Which sounds a lot like me, honestly, where I'll have a discussion with someone about a book that we've both read, and they'll be like, "Oh, I liked that part early in the book, blah blah blah, and it pays off much later," and meanwhile I'm thinking, I totally forgot that happened earlier in the book. I remember where we ended up, but I don't remember how we got there. Recurrent neural networks can fall into the same sort of trap, and that creates a bit of a hurdle when it comes to things like analyzing text for the purposes of building natural language systems. But I'll explain how that all started to change in twenty seventeen. First, however, we need to take a quick break to thank our sponsors.

Speaker 1: Okay, before the break, I was talking about recurrent neural networks and how those have certain limitations when it comes to the way they process data, because it has to be sequential.
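Speaker 1: To picture that fading memory, here's a tiny Python sketch of a recurrent step. The weights are just numbers I picked; the point is that each new hidden state is a blend of the current input and the previous hidden state, so the trace of an early input shrinks a little with every step.

```python
import math

# One bare-bones recurrent step: mix the previous hidden state with
# the current input. All the numbers here are invented for illustration.

def rnn_step(hidden, x, w_hidden=0.5, w_input=1.0):
    return math.tanh(w_hidden * hidden + w_input * x)

sequence = [1.0, 0.0, 0.0, 0.0, 0.0]  # one "loud" early token, then silence
hidden = 0.0
for step, x in enumerate(sequence):
    hidden = rnn_step(hidden, x)
    print(f"step {step}: hidden state = {hidden:.4f}")

# The first input's influence decays at every step: the network
# remembers where it ended up, but not how it got there.
```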
Speaker 1: Well, in twenty seventeen, a group of AI researchers working over at Google came up with an alternative to this approach, and they published a paper titled "Attention Is All You Need," in which they suggested that you could do something different from the recurrent neural network approach for the purposes of analyzing stuff like text. Their approach was what they called a transformer model. While your old RNN would analyze text essentially a character at a time, not even a word at a time, but a character at a time, so the sequential data is character by character, building it up and then analyzing the whole thing, the transformer model instead would tackle a sentence as a unit, as opposed to a character, or even an entire passage of text would be a single unit. And so it would analyze this to understand the context of what was being said, and that's a huge benefit. Getting a handle on context is absolutely critical to understanding what someone means, because words can have multiple meanings, right, and without context we can't really be sure which meaning someone intended. So here's an example: the English word "late." That can mean a lot of things if you're an English speaker. If you're talking about the time of day, when you say it's late, you usually mean it's getting close to nighttime. You could say it's late at night, which means it's actually close to morning time, or maybe it even is the morning, because it's still dark, and so you think of it as night, but technically speaking it's morning and you're just saying it's late at night. If you're saying somebody is late, you could either mean they are not on time for some appointment, or, tragically, you could mean that this is a person who has passed away: they are "the late" so-and-so. But you need the rest of the sentence. You need that context to understand which meaning of "late" was actually intended.
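Speaker 1: The mechanism at the heart of that paper is called attention, and stripped way down, it looks like this. A minimal Python sketch, assuming random stand-in vectors instead of real word embeddings: every word scores every other word in the sentence, and each word's output becomes a weighted blend of the whole sentence. That blend is what carries context, so the representation of "late" ends up shaped by whether "appointment" or "the late Mr. Smith" is sitting next to it.

```python
import numpy as np

# Scaled dot-product self-attention, reduced to a toy. Q, K, and V
# all come from the same sentence here (that's the "self" part).

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # word-vs-word relevance scores
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V, weights                     # blended, context-aware vectors

rng = np.random.default_rng(0)
tokens = ["she", "was", "late"]
X = rng.normal(size=(3, 4))  # one random 4-dim stand-in vector per token

output, weights = attention(X, X, X)
for token, row in zip(tokens, weights):
    print(token, np.round(row, 2))  # how much this word attends to every word
```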
Speaker 1: So you need that contextual vision to be able to understand the whole thing. Transformer models began to revolutionize certain types of AI applications, specifically in the realm of natural language processing and generative AI, and that's what led to the development of large language models, the LLMs. Essentially, a large language model is just a huge transformer model. And to make a large language model, you need a lot of text to train your model, like a lot a lot. OpenAI trained its large language model, known as GPT, which stands for Generative Pre-trained Transformer, on countless documents, millions and millions of documents found across the web. Some authors allege that the training material included copyrighted works, and that the authors did not give permission for those works to be part of the information fed into the model. That leads into its own set of problems, which are a little beyond the scope of what I'm talking about today, but they are big problems and they're ongoing. Now, Stephen King argued that his works were clearly used to train up large language models. A dead giveaway is if you ask a chatbot built on top of a large language model to recite passages from a specific author's works, and it can do that accurately, like it's really giving you an accurate representation of that text. Yeah, there's no way it could have produced that without having trained on the original text at least somewhere. Now, if it's just making stuff up, that's different. That falls into the category of hallucinations, which we might touch upon again before we finish out this episode.
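Speaker 1: That giveaway test is easy to sketch, at least in outline. This is just a rough illustration: the `generate` function below is a hypothetical stand-in for whatever chatbot you're probing, not a real API call, and the similarity ratio is a crude measure. But near-verbatim recall of a long passage is hard to explain any other way than training exposure.

```python
import difflib

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a chatbot call; plug a real one in here."""
    raise NotImplementedError

def memorization_score(opening: str, real_continuation: str) -> float:
    # Prompt the model with the opening of a known passage, then compare
    # its continuation to the real text. A ratio near 1.0 means
    # near-verbatim recall of the original.
    model_text = generate(opening)
    return difflib.SequenceMatcher(None, model_text, real_continuation).ratio()
```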
Speaker 1: Anyway, the benefit of feeding so much information to a transformer model is that the transformer model, the large language model, gets pretty darn good at sussing out context. Even stuff that you would expect to trip up an AI chatbot can become a breeze. You know, you might think slang or idioms could trip up an AI tool, but then you have to remember that these tools rely on essentially all the stuff that's on the Internet, at least all the stuff that's publicly available, that's not locked behind something, and maybe even some stuff that is locked behind stuff, as it turns out. And as such, that means these models have trained with data sets that originate from the same communities that are creating the culture that generates certain slang and idioms in the first place. So if your AI model is using the same source material where these turns of phrase and certain slang terms are originating from, well, of course it's going to understand them, because that was part of its training. It has that grounding. It's not like me, where I am old. I don't understand the slang the kids use these days, because I'm not in those communities. You wouldn't expect me to understand. I am definitely the stereotypical out-of-touch old dude. So when I hear about, you know, people rizzing each other up, I'm like, wait, what? And I have to look things up. And as we all know, Urban Dictionary is not the most reliable of resources. It is frequently entertaining, usually in a way that is incredibly offensive, but it's not always accurate. Anyway, this ultimately starts to lead us to why these AI writing detection tools are not very good. The material that AI generates is built upon how we communicate. It's built on how we write. That's how it was trained. So it's not like AI, or robots, as I was facetiously saying earlier in the episode, has a different path toward writing than we do. The AI is not following an established set of rules that's unique to AI, right? No one is saying, "Write this like artificial intelligence." So the stuff that AI produces can come across as very human, and vice versa.
Speaker 1: Now, this does not mean it is absolutely impossible for someone like a teacher to tell if something was written by AI or by a student. If the teacher is actually really familiar with the writing style of the student or students in question, it's entirely possible that the teacher might notice if that writing style were to suddenly, and maybe significantly, change between assignments. This can be a big ask, by the way, for certain teachers, because class sizes can get huge depending on where you are, and if you're talking about an overworked English teacher who's teaching multiple classes, and each class has got, you know, thirty kids in it, it can be hard to really build up a working knowledge and memory of the writing styles of every single person in every single class. But that is one way teachers can tell. If a teacher reads an essay and thinks, "Wow, you know, Robert didn't write like this in the essay we did last month; this is a very different approach to writing," perhaps that's an indicator that someone else wrote the piece, whether that was AI or, you know, maybe another human being, and that can be an indication that something hinky is going on. Also, I mean, obviously some people get sloppy. This happens a lot too, when people just aren't paying attention as they're using AI to generate, you know, an educational assignment or business writing or whatever. There have been so many examples of people accidentally copying and pasting not just the body of the text, but stuff that's outside the body of the text. It might even be a little disclaimer saying it was made by AI, or it could be a command like "regenerate response." That's something you find in certain chatbots, and "regenerate response" just means, hey, can you create a new AI response to the initial prompt I gave you? So I wrote a prompt, I had you generate a response, and now I want you to create a whole new response based on that original prompt.
Speaker 1: If you have "regenerate response" written in your essay, that's a dead giveaway that you copied and pasted that essay off of an AI chatbot. So there are ways teachers can tell the difference, but it's not as granular as saying, "Oh, this is clearly something that was written by artificial intelligence, versus this was written by a human." It's more like, "This is different from what I have received before from this particular student," or, "This contains obvious errors that reveal the student has used AI." Now, the AI writing detection tools are at least claiming to use a couple of strategies to try to determine if something was written by AI or by a human. They're saying, we can automate that process; we can actually analyze a block of text and give you a determination as to whether it was made by AI or by a human, which suggests that maybe there is some sort of fundamental difference between the way AI generates content and the way people do. But these strategies that the AI writing detection tools are built upon have fundamental flaws, and we know that because we know the tools are bad. It was bad enough for OpenAI to shut down its version back in June. So this isn't just us postulating that these tools are bad. We know they're bad. We know they create things like false positives. So knowing already that they are unreliable, you then have to start asking, well, why are they unreliable? What are the things that are leading these tools to make these wrong determinations? And when we come back, I'll talk about how Benj Edwards, in those Ars Technica articles, really digs into two main concepts that these writing detection tools use to try to make a determination, and why they are fundamentally flawed. But first, let's take another quick break.
Speaker 1: So before the break, I mentioned that I was going to talk about some strategies that Benj Edwards outlines in his Ars Technica articles, and they fall into two categories. The first is called perplexity, and that really means: how surprising or perplexing are the word choices, how creative are the sentences in a given piece of text, compared to what an AI language model would expect? The thinking behind this is that if a block of text seems to conform to the same sort of stuff the language model would produce, then AI probably created the text. That's the idea. They're essentially saying that if the text is really similar to what AI would create, then AI probably created it. Let's think about how some tools use autocomplete to help you write a text or a sentence, using a purely hypothetical scenario to kind of get our minds wrapped around this. Let's say you were typing into something that has autocomplete built into it the sentence, or the phrase, "I'm going to go for a...," and whatever tool you're typing into suggests the word "walk" as an autocomplete option. Well, that would be because the language model powering this autocomplete function has sampled millions of passages, millions and millions and millions of documents, and has found that "walk" is the most common word to follow the phrase "I'm going to go for a...," and so it offers that as the suggestion. And maybe it would even offer you a few options. Maybe it would say "walk," maybe it'd say "swim," maybe in the UK it'd say "curry," who knows. It would give you maybe a couple of different options, but they would be the ones most likely to follow that phrase, based upon the training material that large language model had used to build itself up.
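Speaker 1: You can get surprisingly far on that idea with nothing but counting. Here's a little Python sketch of the autocomplete intuition, using a tiny made-up corpus; real language models are enormously more sophisticated than a follower count, but the spirit is the same.

```python
from collections import Counter

# A tiny, invented "corpus." Real models train on millions of documents.
corpus = [
    "i'm going to go for a walk",
    "i'm going to go for a walk in the park",
    "i'm going to go for a swim",
    "i'm going to go for a curry",
    "i'm going to go for a walk with the dog",
]

def suggest(prefix, texts, k=3):
    # Count every word that follows the prefix anywhere in the corpus.
    followers = Counter()
    p = prefix.split()
    for line in texts:
        words = line.split()
        for i in range(len(words) - len(p)):
            if words[i:i + len(p)] == p:
                followers[words[i + len(p)]] += 1
    return followers.most_common(k)

print(suggest("go for a", corpus))  # [('walk', 3), ('swim', 1), ('curry', 1)]
```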
Speaker 1: Right, so if you were to measure the perplexity of the sentence "I'm going to go for a walk," it would be very, very low, very low perplexity, because it's in line with what the language model would expect. So the thought is, if a passage in general has very low perplexity, these tools tend to suspect that the passage as a whole could have come from AI. Now let's say it had very high perplexity instead. Let's say that instead of saying "I'm going to go for a walk," you said "I'm going to go for a zebra," or "zeh-bra" if you're in the UK. Well, one, that doesn't really make any sense, but two, that would be very perplexing, right? That would not be something the large language model would expect. And so if it has high perplexity, then the writing detection tool is more likely to say this was written by a human, because what generative chat system would have made that sentence? It's like, no sane robot would say "I'm going to go for a zebra." Clearly some human wrote this. Now, the problem is, these companies are training their large language models on enormous amounts of human-generated text. And unless you're purposefully trying to be really original in your writing, and that's a kind way of saying you're being a weirdo as you're writing your sentences, chances are a lot of the stuff you're writing is going to have fairly low perplexity. Unless you're trying to write in, like, the milieu of humor or absurdity, unless you're purposely trying to do that, chances are your perplexity is going to be pretty low too. Particularly for very structured writing, like business writing or academic writing, that perplexity is going to be very low.
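Speaker 1: If you want the arithmetic, perplexity boils down to this: take the probability the model assigned to each word as it actually appeared, average the log-probabilities, and exponentiate the negative average. The probabilities below are invented for illustration, but they show the shape of it: text the model saw coming scores low, and one out-of-nowhere word sends the number soaring.

```python
import math

def perplexity(word_probs):
    # Average log-probability of the observed words, then exp(-mean).
    avg_log_prob = sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(-avg_log_prob)

# "...go for a walk": every word is unsurprising to the model.
print(perplexity([0.9, 0.8, 0.9, 0.7]))     # roughly 1.2 (very low)

# "...go for a zebra": the last word came out of nowhere.
print(perplexity([0.9, 0.8, 0.9, 0.0001]))  # roughly 11 (much higher)
```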
Speaker 1: So unless you're prone to throwing in very odd, random, weird sentences, like "William Shakespeare's Othello is one of the great tragedies of English theater, and also I enjoy shoving hot dogs through mail slots," well, there's a pretty good chance that an AI detector tool is going to think your human-written, legitimate essay was in fact an AI's work, because the perplexity would likely be pretty low, again, unless you're doing something really avant-garde. So there's a fundamental flaw in the logic of using perplexity as one of your metrics for determining whether something was written by AI versus a human. Benj Edwards also goes on to explain that another factor these AI detection tools take into consideration is one called burstiness. Perplexity and burstiness: makes me feel like I've fallen into a Lewis Carroll novel. But anyway, burstiness really has to do with variability, particularly between sentences. So y'all have probably noticed I have a tendency toward really long sentences, often with a lot of parentheticals thrown in there. Now, if I also incorporate short sentences on occasion, breaking up these very long sentences, this creates a lot more variety, a lot more dynamic elements between my sentences, because I'm switching back and forth between these very long, pontificating sentences and then short ones to make a point. Maybe in some sentences I use tons of adverbs to describe action; maybe in the next sentence I don't use any adverbs at all. This is what creates that variability. The conventional wisdom is that AI-generated work is more uniform. It's more consistent; it has less variability from sentence to sentence. The sentence length and complexity are going to remain more or less the same throughout an entire passage. So if you're able to quantify how dynamic a writing style is, the thinking goes, you could potentially determine whether a human or an AI wrote that specific piece. If it's not very dynamic, well, that leans more toward AI.
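Speaker 1: Burstiness is even easier to caricature. Here's a crude Python sketch, assuming one very simple proxy for variability, how much sentence length swings around its average; real detectors use fancier measures, but the intuition holds: uniform sentences score near zero, a mix of long and short scores high.

```python
import statistics

def burstiness(text):
    # Split on sentence-ending punctuation (crudely) and measure how
    # much sentence length varies relative to its average.
    cleaned = text.replace("!", ".").replace("?", ".")
    sentences = [s.strip() for s in cleaned.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

uniform = "The cat sat down. The dog sat down. The bird sat down."
varied = ("The cat, having surveyed the entire living room with evident "
          "disdain, finally sat down. The dog did too. Chaos.")

print(burstiness(uniform))  # 0.0: perfectly uniform sentences
print(burstiness(varied))   # noticeably higher: long and short mixed
```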
Speaker 1: But that approach depends upon a couple of things that are not always reliable. So first up, it assumes that AI-generated content is going to continue to show more consistency than the stuff humans write, that it's going to keep taking this very consistent approach to sentence structure. But the language models, and the generative AI built on top of the language models, are growing more sophisticated all the time. A lot of the companies that make these language models are mining platforms like X, formerly known as Twitter, or Reddit in order to train their language models. They're reading these sort of idiosyncratic messages that people write. Sometimes people are writing purposefully in a way that is not consistent, and it can get to be a little unpredictable. Well, if you're training your language model on these things, then over time the language models, and the tools built on top of them, begin to reflect that training material. It means we should expect generative AI to start increasing variability between sentences, because that's what we're training it on. You can't expect to train it on one thing and have it generate something totally different. It's going to mimic the material that was used to teach it in the first place. So that means you're going to see a reduction in the gap between how AI creates text and how humans do. But on top of that, again, for certain types of writing, human authors may take a more structured approach, and they may purposefully reduce variability between sentences, or unconsciously reduce variability. That means their writing is going to start looking more like the stuff these writing detection tools assume is a marker for AI-generated content. If I were to write a term paper, I would probably take a more consistent, uniform approach to my writing style. That's not to suggest that would be the right choice, right? I'm not saying that if you write a term paper you need to have this very consistent, uniform approach, because papers written in a style like that can get really boring to read.
Speaker 1: But that would probably be my inclination. Like, thinking in my head, I'd be: I want to make sure I'm consistent, I'm academic, I am thoughtful, I'm methodical. That means the work I would produce would have this low burstiness, because I was purposefully doing it. Even if that was the wrong decision, it would probably be the one I would make, because I'd be working under the mistaken belief that this is somehow more academic. So these AI writing detection tools are looking for text that has low burstiness and low perplexity before suggesting that AI created that particular block of text. But as we've talked about, humans write in that kind of style too, particularly for formal writing, and so you get a lot of false positives, like if you feed the US Constitution to a writing detection tool and it says, "Well, AI wrote this." Well, a lot of stuff has been written about the Constitution, including passages from the Constitution itself. The Constitution is clearly available on the web, so it's obviously part of these large language models' training sets, and of course the models are going to reflect what's in the training set. It was literally incorporated into them. So if you're working backward from that logic, then your conclusion, "Oh, AI wrote this, because it reflects what the language model was trained on," well, yeah, but that's because the language model was literally trained on the material you were analyzing. It becomes a sort of catch-22 situation. So, in large part, we cannot rely on these detection tools.
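Speaker 1: Put those two signals together and you get something like the sketch below, a caricature of the decision rule, with thresholds I made up entirely. And that's rather the point: careful, formal human writing (low surprise, uniform sentences) lands on the "AI" side of lines like these, which is exactly how a document like the Constitution gets flagged as machine-written.

```python
def naive_ai_detector(perplexity_score, burstiness_score):
    # Invented cutoffs: real tools tune these, but the logic is similar.
    PERPLEXITY_CUTOFF = 3.0
    BURSTINESS_CUTOFF = 0.4
    if perplexity_score < PERPLEXITY_CUTOFF and burstiness_score < BURSTINESS_CUTOFF:
        return "likely AI-generated"
    return "likely human-written"

# A careful, formal, human-written essay can easily score low on both,
# producing exactly the kind of false positive described above.
print(naive_ai_detector(perplexity_score=1.4, burstiness_score=0.2))
```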
Speaker 1: Now, this doesn't even touch upon the challenges that non-native English speakers face with their writing. When they're writing in English and these AI detection tools are used on their work, they can face disproportionate bias from these detection tools: they get a lot more false positives. So you're already seeing a lot of false positives anyway, because, as we've discussed, the criteria being used by these AI writing detection tools are faulty. They assume that humans are not writing in those styles, when in fact they are, and that AI is writing in one specific style, when in fact, at least over time, it migrates away from that. So you've got a double whammy here. Now, there are some applications of AI detection tools where it works and it makes sense, just not in writing, but for stuff like photo or video manipulation. AI detection tools can still look for telltale signs that can indicate that maybe what you're looking at has, at least in some part, been created by a generative AI tool, like an image creation tool. Obviously, there are examples of this where you take one look and you know immediately that it was made by AI, because you look at it and you're like, "No one has that many fingers on one hand." But there are other cases where it may be far more subtle to human perception, and yet if you were to analyze the image deeply with a very well-trained AI detection tool, it could indicate the image was made by AI because of little subtle things. Maybe it's inconsistent lighting; maybe it's the blinking pattern of a person in a video, things like that. Little things that would be hard for us to spot as human beings, but easy for a detection tool to spot. Those AI detection tools make sense. They're not necessarily foolproof or flawless, but they have a better success rate than the ones aimed at writing, because it's just not that clear-cut when we're talking about writing. This is unfortunate when teachers may rely heavily on AI writing detection tools to determine whether their students are actually doing their own work or not.
All of this is unfortunate, because teachers 565 00:36:58,719 --> 00:37:03,160 Speaker 1: may rely heavily on AI writing detection tools in order 566 00:37:03,200 --> 00:37:05,839 Speaker 1: to determine if their students are actually doing their own 567 00:37:05,880 --> 00:37:09,560 Speaker 1: work or not. If the teachers are unaware that these 568 00:37:09,600 --> 00:37:13,320 Speaker 1: detection tools are unreliable, they can make some really drastic 569 00:37:13,360 --> 00:37:16,800 Speaker 1: decisions that will have a huge negative impact on their students' 570 00:37:16,840 --> 00:37:21,239 Speaker 1: work and lives, and that's not really fair. Hopefully, the 571 00:37:21,400 --> 00:37:26,960 Speaker 1: educators out there are themselves educating themselves, to be repetitive, 572 00:37:27,840 --> 00:37:33,920 Speaker 1: about these tools and their unreliability, because otherwise they're going 573 00:37:33,960 --> 00:37:37,880 Speaker 1: to be punishing students and they can't justify it, because 574 00:37:38,600 --> 00:37:40,840 Speaker 1: it's all based on a tool that has proven to 575 00:37:40,920 --> 00:37:45,799 Speaker 1: be unreliable from the get go. Unless, of course, we're 576 00:37:45,800 --> 00:37:49,680 Speaker 1: talking about instances where someone has copied and pasted some 577 00:37:49,880 --> 00:37:54,000 Speaker 1: ridiculous part of an AI generated response that just gives 578 00:37:54,040 --> 00:37:59,200 Speaker 1: it away. That's a different case entirely, obviously. But yeah, 579 00:37:59,440 --> 00:38:03,080 Speaker 1: I think it's important to understand the limitations of these tools. 580 00:38:04,120 --> 00:38:07,400 Speaker 1: As we explore generative AI, and we look at the 581 00:38:07,440 --> 00:38:10,960 Speaker 1: pros and the cons, and we consider the impact that 582 00:38:11,000 --> 00:38:15,000 Speaker 1: generative AI has on multiple segments of our lives, we 583 00:38:15,080 --> 00:38:18,840 Speaker 1: also have to really think about how do we know 584 00:38:19,080 --> 00:38:22,560 Speaker 1: when it's in use, and how do we know that 585 00:38:22,600 --> 00:38:25,640 Speaker 1: the tools we're using to make those determinations are actually 586 00:38:26,400 --> 00:38:29,960 Speaker 1: good tools. In the case of these AI writing detection tools, 587 00:38:30,640 --> 00:38:34,000 Speaker 1: it looks to me like you might as well not 588 00:38:34,080 --> 00:38:37,880 Speaker 1: even look at them. You are more likely than not 589 00:38:37,960 --> 00:38:43,959 Speaker 1: to get an incorrect answer, because, again, we train these 590 00:38:44,719 --> 00:38:48,239 Speaker 1: generative tools to communicate very much the way humans do, 591 00:38:48,280 --> 00:38:51,920 Speaker 1: at least in certain use cases, and those use cases 592 00:38:51,960 --> 00:38:54,239 Speaker 1: typically are the ones where we're most concerned about whether 593 00:38:54,320 --> 00:38:56,320 Speaker 1: or not AI was put to use in the first place. 594 00:38:56,880 --> 00:39:00,600 Speaker 1: So, really interesting articles over on Ars Technica. They lead 595 00:39:00,640 --> 00:39:06,360 Speaker 1: to this really deep discussion about generative AI and the limitations 596 00:39:06,360 --> 00:39:10,200 Speaker 1: that we have in detecting it. And obviously there are 597 00:39:10,280 --> 00:39:12,239 Speaker 1: a lot of other things we could touch on. I 598 00:39:12,280 --> 00:39:17,600 Speaker 1: mentioned copyright. That's a big one, because if AI can 599 00:39:17,880 --> 00:39:24,200 Speaker 1: regurgitate copyrighted works with no flaws, then that can be 600 00:39:24,400 --> 00:39:29,839 Speaker 1: a huge blow to authors, for example. Or, we talked 601 00:39:29,840 --> 00:39:33,120 Speaker 1: a little bit about hallucinations.
Hallucinations are when an AI 602 00:39:33,880 --> 00:39:39,560 Speaker 1: tool does not have the information to be able to 603 00:39:39,600 --> 00:39:42,960 Speaker 1: determine what should come next in a sentence. You have 604 00:39:43,040 --> 00:39:46,760 Speaker 1: to remember, when you really boil it down, these generative AI 605 00:39:46,880 --> 00:39:49,719 Speaker 1: tools, what they're doing is they're following a very 606 00:39:50,000 --> 00:39:55,600 Speaker 1: sophisticated statistical model to determine what should come next in 607 00:39:55,640 --> 00:39:59,200 Speaker 1: their answer. So you give one a prompt and it's 608 00:39:59,440 --> 00:40:03,799 Speaker 1: referencing this incredibly complicated statistical model to say, all right, 609 00:40:04,640 --> 00:40:07,759 Speaker 1: what should I put as a response? Some of that 610 00:40:07,800 --> 00:40:11,480 Speaker 1: information involves things like the actual answers to questions, but 611 00:40:11,520 --> 00:40:14,960 Speaker 1: there are cases where the AI model may be unable 612 00:40:15,000 --> 00:40:18,400 Speaker 1: to identify what the answer to the question is, but 613 00:40:18,480 --> 00:40:22,120 Speaker 1: it still needs to answer your query. It doesn't have 614 00:40:22,200 --> 00:40:24,759 Speaker 1: the answer, so it makes one up, but following this 615 00:40:24,880 --> 00:40:28,880 Speaker 1: very sophisticated statistical model, so that the answer it generates 616 00:40:29,000 --> 00:40:32,719 Speaker 1: appears to be valid even though it's just completely made up. 617 00:40:32,800 --> 00:40:36,279 Speaker 1: This is what we call hallucinations in AI. It's when 618 00:40:36,320 --> 00:40:41,240 Speaker 1: AI generates an answer in order to respond to a query, 619 00:40:41,880 --> 00:40:46,040 Speaker 1: but that answer is fabricated. It's a confabulation. That's another 620 00:40:46,040 --> 00:40:49,719 Speaker 1: word that some people are using rather than hallucination. And 621 00:40:50,840 --> 00:40:53,840 Speaker 1: it comes across as being very much legitimate, because, again, 622 00:40:53,920 --> 00:41:00,000 Speaker 1: these very sophisticated statistical models make it seem authoritative and knowledgeable. 623 00:41:00,719 --> 00:41:03,000 Speaker 1: The way the sentences are structured, it doesn't come across as 624 00:41:03,000 --> 00:41:06,880 Speaker 1: wishy washy. It's not like, maybe it's blah blah blah. 625 00:41:06,920 --> 00:41:10,560 Speaker 1: It ends up being, it's blah blah blah, and it's presented 626 00:41:10,560 --> 00:41:12,880 Speaker 1: in such a way that you feel like it's reliable, 627 00:41:12,960 --> 00:41:16,960 Speaker 1: even though ultimately it's not. That's another issue. It's related 628 00:41:16,960 --> 00:41:20,279 Speaker 1: to what we're talking about. And it also means that, 629 00:41:20,360 --> 00:41:23,160 Speaker 1: as a student, or as a business writer, or as 630 00:41:23,239 --> 00:41:25,640 Speaker 1: a lawyer, as one person found out earlier this year, 631 00:41:26,160 --> 00:41:30,200 Speaker 1: you should not rely on generative AI as your one 632 00:41:30,280 --> 00:41:36,080 Speaker 1: and only source for anything. Generative AI has even 633 00:41:36,120 --> 00:41:41,680 Speaker 1: been found to fabricate quotations from people. Obviously that's not 634 00:41:41,760 --> 00:41:45,680 Speaker 1: good either. There are lots of issues here.
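[Editor's note: here is a minimal Python sketch of that "what should come next" loop, again using GPT-2 from the transformers library as a stand-in for larger models. The court case in the prompt is deliberately made up; the model will complete the sentence fluently anyway, which is the confabulation problem in miniature.]

```python
# Minimal next-token sampling loop: the model always emits a statistically
# plausible continuation, whether or not a true answer exists in its training.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# A fabricated case -- there is no "Smith v. Dataworks" -- yet the loop below
# still produces a confident-sounding completion rather than "I don't know."
prompt = "The landmark 1987 Supreme Court case Smith v. Dataworks held that"
ids = tokenizer(prompt, return_tensors="pt").input_ids

for _ in range(40):
    with torch.no_grad():
        logits = model(ids).logits[0, -1]   # scores for every candidate next token
    probs = torch.softmax(logits, dim=-1)   # convert scores to probabilities
    next_id = torch.multinomial(probs, 1)   # sample one plausible token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(ids[0]))  # fluent, authoritative... and fabricated
```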
Anyway, I 635 00:41:45,719 --> 00:41:47,360 Speaker 1: hope that was some food for thought for y'all. I 636 00:41:47,360 --> 00:41:51,160 Speaker 1: hope you're doing well. I will talk to you again 637 00:41:52,239 --> 00:42:01,160 Speaker 1: really soon. Tech Stuff is an iHeartRadio production. 638 00:42:01,480 --> 00:42:06,480 Speaker 1: For more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts, 639 00:42:06,640 --> 00:42:08,640 Speaker 1: or wherever you listen to your favorite shows.