WEBVTT - Did AI Write This?

0:00:04.440 --> 0:00:12.280
<v Speaker 1>Welcome to Tech Stuff, a production from iHeartRadio. Hey there,

0:00:12.280 --> 0:00:15.640
<v Speaker 1>and welcome to Tech Stuff. I'm your host, Jonathan Strickland.

0:00:15.680 --> 0:00:18.160
<v Speaker 1>I'm an executive producer with iHeartRadio. And how the tech

0:00:18.239 --> 0:00:21.600
<v Speaker 1>are you? I'm here to tell you something. You write

0:00:21.600 --> 0:00:25.480
<v Speaker 1>like a robot. But that's okay because I do too.

0:00:25.880 --> 0:00:29.720
<v Speaker 1>One of the founding fathers of the United States, James Madison,

0:00:30.120 --> 0:00:33.239
<v Speaker 1>wrote like a robot. Robots weren't even a thing when

0:00:33.280 --> 0:00:36.080
<v Speaker 1>he was writing back in the eighteenth century, all right,

0:00:36.159 --> 0:00:38.960
<v Speaker 1>so really, I guess it's more fair to say that

0:00:39.159 --> 0:00:43.520
<v Speaker 1>robots write like us. And while I'm having a little

0:00:43.560 --> 0:00:46.760
<v Speaker 1>bit of fun using the word robots, what I'm really

0:00:46.800 --> 0:00:51.000
<v Speaker 1>talking about is generative AI. You know, stuff like

0:00:51.080 --> 0:00:55.520
<v Speaker 1>ChatGPT and Google Bard, that kind of thing. These AI

0:00:55.680 --> 0:00:59.280
<v Speaker 1>powered chatbots write like humans, right? That's one of

0:00:59.320 --> 0:01:02.800
<v Speaker 1>the big features of the chatbots: one, that they

0:01:02.800 --> 0:01:06.560
<v Speaker 1>can understand a prompt that we give them, that they

0:01:06.560 --> 0:01:09.480
<v Speaker 1>can understand what we mean when we give them a prompt,

0:01:09.520 --> 0:01:12.840
<v Speaker 1>and two, that they then generate a response as if

0:01:12.920 --> 0:01:15.759
<v Speaker 1>it had been written by an actual person. But obviously

0:01:15.800 --> 0:01:20.399
<v Speaker 1>this also creates some challenges, some issues. So you might

0:01:20.440 --> 0:01:25.440
<v Speaker 1>remember that since ChatGPT became publicly available last year

0:01:25.480 --> 0:01:29.319
<v Speaker 1>when OpenAI opened it up and let people start playing

0:01:29.319 --> 0:01:34.200
<v Speaker 1>with ChatGPT, there were people in education, teachers and

0:01:34.280 --> 0:01:37.839
<v Speaker 1>administrators, that sort of thing, who raised the alarm about

0:01:37.840 --> 0:01:42.320
<v Speaker 1>the possibility that students could use ChatGPT and similar

0:01:42.360 --> 0:01:47.800
<v Speaker 1>tools to auto-generate essays and stuff and thus bypass

0:01:47.920 --> 0:01:51.920
<v Speaker 1>school assignments. "My robot wrote it for me." Beyond the

0:01:52.040 --> 0:01:55.480
<v Speaker 1>education sector, there are plenty of arenas where people are

0:01:55.520 --> 0:01:59.639
<v Speaker 1>worried that the less scrupulous folks out there will attempt

0:01:59.680 --> 0:02:02.840
<v Speaker 1>to pass off AI-generated text as their own writing,

0:02:03.240 --> 0:02:08.760
<v Speaker 1>whether this is creative writing or business writing, whatever it

0:02:08.800 --> 0:02:13.440
<v Speaker 1>may be. So this then leads us to the concept

0:02:14.040 --> 0:02:18.640
<v Speaker 1>of AI writing detection tools, you know, some sort of

0:02:19.360 --> 0:02:23.280
<v Speaker 1>tool to determine if a piece of text originated from

0:02:23.480 --> 0:02:27.560
<v Speaker 1>a real human being or from that character that Haley

0:02:27.639 --> 0:02:31.240
<v Speaker 1>Joel Osment played in that film about artificial intelligence. I

0:02:31.240 --> 0:02:35.239
<v Speaker 1>forget what that movie was called. Subsequent to the release

0:02:35.680 --> 0:02:39.239
<v Speaker 1>of these detection tools, we started hearing reports of teachers

0:02:39.560 --> 0:02:45.000
<v Speaker 1>failing students, sometimes an entire class of students, because the

0:02:45.120 --> 0:02:49.520
<v Speaker 1>detection tool indicated that the real source of the works

0:02:49.560 --> 0:02:52.040
<v Speaker 1>that were being turned in by the students it wasn't

0:02:52.120 --> 0:02:54.919
<v Speaker 1>from the students, but from AI. Now a lot of

0:02:54.960 --> 0:02:57.359
<v Speaker 1>students have actually come forward to argue that no, no,

0:02:57.520 --> 0:03:03.200
<v Speaker 1>they actually wrote those pieces themselves, that they authored that work,

0:03:03.240 --> 0:03:05.440
<v Speaker 1>they didn't use AI to do it, and that they

0:03:05.440 --> 0:03:09.000
<v Speaker 1>are the victims of false positives, that these writing detection

0:03:09.120 --> 0:03:12.240
<v Speaker 1>tools made a mistake, and as it turns out, at

0:03:12.320 --> 0:03:15.440
<v Speaker 1>least some of them, and likely a lot of them

0:03:15.560 --> 0:03:18.239
<v Speaker 1>were telling the truth. And we can say that because

0:03:18.280 --> 0:03:25.160
<v Speaker 1>these AI writing detection tools have abysmal accuracy rates. They

0:03:25.240 --> 0:03:29.400
<v Speaker 1>are worse than chance. That's how bad these tools can be.

0:03:30.160 --> 0:03:33.000
<v Speaker 1>So the success rate for an AI writing detector can

0:03:33.040 --> 0:03:36.040
<v Speaker 1>be so low that it has led some of the

0:03:36.040 --> 0:03:40.320
<v Speaker 1>companies to shut them down, and it led a

0:03:40.360 --> 0:03:43.600
<v Speaker 1>lot of critics to just dismiss the concept of an

0:03:43.600 --> 0:03:48.240
<v Speaker 1>AI writing detection tool entirely. In fact, there are quite a

0:03:48.240 --> 0:03:51.120
<v Speaker 1>few who have argued that AI writing detection tools are

0:03:51.560 --> 0:03:54.720
<v Speaker 1>essentially snake oil, that there are companies that are making

0:03:54.760 --> 0:03:57.480
<v Speaker 1>what they say are reliable tools that can tell the

0:03:57.480 --> 0:04:00.560
<v Speaker 1>difference between text that was written by a person and text

0:04:00.640 --> 0:04:04.200
<v Speaker 1>that was written by AI, but really they're just peddling

0:04:04.720 --> 0:04:08.800
<v Speaker 1>a hoax or a scam, and they're trying to make

0:04:08.920 --> 0:04:13.400
<v Speaker 1>money selling these tools to various organizations like schools and such,

0:04:14.160 --> 0:04:17.360
<v Speaker 1>but in fact those tools don't work, or at least

0:04:17.400 --> 0:04:21.160
<v Speaker 1>they don't work very well. Even OpenAI, which is

0:04:21.200 --> 0:04:25.640
<v Speaker 1>the company that is responsible for ChatGPT, they had

0:04:26.279 --> 0:04:28.880
<v Speaker 1>a tool that was meant to be a detection tool

0:04:28.920 --> 0:04:32.159
<v Speaker 1>to tell whether or not something was written by AI.

0:04:32.279 --> 0:04:35.560
<v Speaker 1>It was called AI Classifier, but they shut it down

0:04:36.240 --> 0:04:41.760
<v Speaker 1>earlier this year. Why? Because its accuracy rate was

0:04:42.160 --> 0:04:47.960
<v Speaker 1>twenty-six percent. Twenty-six percent accurate, that is bonkers. That

0:04:48.000 --> 0:04:52.320
<v Speaker 1>means nearly three quarters of the time that detection tool

0:04:52.400 --> 0:04:54.920
<v Speaker 1>came up with the wrong answer. Either it gave a

0:04:55.000 --> 0:04:59.400
<v Speaker 1>pass to an AI-generated piece, or it accused a

0:04:59.600 --> 0:05:04.760
<v Speaker 1>work that a human being actually wrote, like definitively wrote,

0:05:05.320 --> 0:05:08.560
<v Speaker 1>as being the product of AI. This brings us to

0:05:08.640 --> 0:05:14.040
<v Speaker 1>James Madison. James Madison wrote the US Constitution, and folks

0:05:14.080 --> 0:05:17.880
<v Speaker 1>have fed the US Constitution into these AI writing detection

0:05:18.000 --> 0:05:22.440
<v Speaker 1>tools and received a notification that this piece was very

0:05:22.520 --> 0:05:25.800
<v Speaker 1>likely written by AI, which obviously led to lots of

0:05:26.320 --> 0:05:29.400
<v Speaker 1>jocularity on the Internet, as people said, I knew it.

0:05:29.440 --> 0:05:31.479
<v Speaker 1>I knew that the founding fathers of the United States

0:05:31.520 --> 0:05:34.440
<v Speaker 1>of America were really robots from the future sent back

0:05:34.480 --> 0:05:39.599
<v Speaker 1>in time to create an ultra-capitalist society that preys

0:05:39.720 --> 0:05:44.440
<v Speaker 1>upon the disenfranchised or something like that. There are a lot

0:05:44.440 --> 0:05:47.120
<v Speaker 1>of jokes about it, but the fact is no, it's

0:05:47.200 --> 0:05:51.680
<v Speaker 1>just that this writing detection tool is completely unreliable. So

0:05:51.720 --> 0:05:55.039
<v Speaker 1>you certainly cannot use these kinds of tools to justify

0:05:55.120 --> 0:05:59.080
<v Speaker 1>flunking an entire class of students when you know that

0:05:59.200 --> 0:06:02.680
<v Speaker 1>the reliability is so low. Now, I decided to do

0:06:03.120 --> 0:06:07.960
<v Speaker 1>this short episode about AI writing detection tools after reading

0:06:08.160 --> 0:06:11.279
<v Speaker 1>a couple of great pieces in Ars Technica. Those of

0:06:11.320 --> 0:06:14.080
<v Speaker 1>y'all who listen to my show frequently know that I

0:06:14.120 --> 0:06:19.760
<v Speaker 1>often reference Ars Technica because the folks there reliably post

0:06:20.240 --> 0:06:23.400
<v Speaker 1>great articles. So in this case, the author of both

0:06:23.440 --> 0:06:28.320
<v Speaker 1>pieces I read was Benj Edwards, B-E-N-J Edwards.

0:06:28.680 --> 0:06:31.000
<v Speaker 1>And at some point I probably should reach out to

0:06:31.040 --> 0:06:33.440
<v Speaker 1>them and ask if they would like to join Tech

0:06:33.480 --> 0:06:36.920
<v Speaker 1>Stuff for an episode to talk about something like generative AI,

0:06:37.400 --> 0:06:41.839
<v Speaker 1>because Edwards has done some really good work. Anyways, as

0:06:41.880 --> 0:06:48.280
<v Speaker 1>we think about the issue about how this generative AI works,

0:06:48.680 --> 0:06:53.960
<v Speaker 1>the underlying technology that powers generative AI, we start to

0:06:53.960 --> 0:06:58.800
<v Speaker 1>see why there's this big reliability problem. Why are we

0:06:58.880 --> 0:07:04.240
<v Speaker 1>having such issues with an automated detection tool really determining

0:07:04.400 --> 0:07:07.760
<v Speaker 1>if something was written by a person or AI? And

0:07:07.760 --> 0:07:12.320
<v Speaker 1>it's because the tools like chat GPT are built on

0:07:12.400 --> 0:07:17.800
<v Speaker 1>top of large language models, also known as LLMs. And

0:07:17.840 --> 0:07:21.360
<v Speaker 1>if we take a moment to really understand LLMs, then

0:07:21.400 --> 0:07:23.720
<v Speaker 1>we start to get a handle on why these detector

0:07:23.760 --> 0:07:27.920
<v Speaker 1>tools are so unreliable. So first off, let's actually talk

0:07:27.920 --> 0:07:32.040
<v Speaker 1>about a precursor to large language models. This would be

0:07:32.200 --> 0:07:37.080
<v Speaker 1>recurrent neural networks, or RNNs. Now I've talked a

0:07:37.120 --> 0:07:39.800
<v Speaker 1>lot about neural networks on this show, but just as

0:07:39.800 --> 0:07:43.640
<v Speaker 1>a refresher, a neural network is an attempt to create a

0:07:43.800 --> 0:07:48.680
<v Speaker 1>computer system or computer model that processes information in a

0:07:48.680 --> 0:07:53.080
<v Speaker 1>way that is similar to how our brains process information.

0:07:53.640 --> 0:07:58.560
<v Speaker 1>So you have layers of artificial neurons, or you can

0:07:58.560 --> 0:08:02.920
<v Speaker 1>think of them as nodes. These layers connect to other

0:08:03.000 --> 0:08:07.080
<v Speaker 1>artificial neurons. You have multiple connections from one neuron to other neurons,

0:08:07.480 --> 0:08:09.520
<v Speaker 1>and you have layers that go from top to bottom.

0:08:09.520 --> 0:08:11.560
<v Speaker 1>You can think of it like at the top that's

0:08:11.560 --> 0:08:14.120
<v Speaker 1>where you put input and at the bottom that's where

0:08:14.160 --> 0:08:18.120
<v Speaker 1>you get output. So essentially, you feed information into the

0:08:18.160 --> 0:08:20.960
<v Speaker 1>model and then the information goes through a series of

0:08:20.960 --> 0:08:25.160
<v Speaker 1>operations in which data passes through these different nodes, and

0:08:25.200 --> 0:08:28.480
<v Speaker 1>the nodes make decisions based upon the input, and then

0:08:28.520 --> 0:08:32.640
<v Speaker 1>they send output to different nodes and eventually you get

0:08:33.000 --> 0:08:36.800
<v Speaker 1>the ultimate output. And sometimes that output is correct. It

0:08:36.800 --> 0:08:40.480
<v Speaker 1>gives you the answer that is correct. Sometimes it's wrong.

0:08:41.000 --> 0:08:43.120
<v Speaker 1>And typically what that means is that you then have

0:08:43.200 --> 0:08:47.520
<v Speaker 1>to adjust how those artificial neurons are making decisions. Those

0:08:47.559 --> 0:08:52.480
<v Speaker 1>neurons apply a sort of bias to input; we call

0:08:52.520 --> 0:08:56.880
<v Speaker 1>it a weight, so they will favor some types of

0:08:56.960 --> 0:08:59.800
<v Speaker 1>input over others in an effort to make a decision.

0:08:59.840 --> 0:09:03.000
<v Speaker 1>If they didn't, then the data would never go anywhere.

0:09:03.080 --> 0:09:05.120
<v Speaker 1>You would never be able to have it processed through

0:09:05.160 --> 0:09:09.440
<v Speaker 1>the system. So the weighting affects how the neuron actually

0:09:09.440 --> 0:09:11.920
<v Speaker 1>processes the data, where does it pass it on to.

0:09:12.559 --> 0:09:16.720
<v Speaker 1>So it may say, if value is greater than X,

0:09:16.960 --> 0:09:20.679
<v Speaker 1>send to node A. If value is less than X,

0:09:20.920 --> 0:09:24.839
<v Speaker 1>send to node B. That could be a very basic weight.

0:09:25.240 --> 0:09:28.040
<v Speaker 1>X would be the weight in that case, and maybe

0:09:28.120 --> 0:09:31.640
<v Speaker 1>that would lead you to a correct outcome. So by

0:09:31.679 --> 0:09:36.400
<v Speaker 1>adjusting the weighting, you can change how these neurons make decisions.
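
NOTE
A minimal sketch of the "greater than X, send to node A" rule described above.
The function name neuron and all numbers are hypothetical, chosen only for illustration.
It assumes the common simplification in which each input is multiplied by a weight and
the sum is compared against a threshold; real networks use smoother activation functions.
```python
def neuron(inputs, weights, threshold):
    # Scale each input by its weight, then add everything up.
    total = sum(x * w for x, w in zip(inputs, weights))
    # Simplified routing rule from the episode: above the threshold goes to node A.
    return "A" if total > threshold else "B"
# Same inputs, different weights: changing the weighting changes the decision.
print(neuron([0.9, 0.2], [1.0, 1.0], threshold=1.0))
print(neuron([0.9, 0.2], [0.5, 1.0], threshold=1.0))
```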

0:09:36.880 --> 0:09:39.000
<v Speaker 1>And if you build a neural network for the purposes,

0:09:39.360 --> 0:09:42.200
<v Speaker 1>let's give it a hypothetical. Let's say it's identifying pictures

0:09:42.240 --> 0:09:46.320
<v Speaker 1>of cats. It's always my go-to. And you start

0:09:46.400 --> 0:09:48.640
<v Speaker 1>looking at the output and you see that it is

0:09:48.760 --> 0:09:53.199
<v Speaker 1>mistakenly saying that pictures of flowers are pictures of cats.

0:09:53.600 --> 0:09:56.760
<v Speaker 1>You would say, all right, these artificial neural networks, the

0:09:57.040 --> 0:10:00.640
<v Speaker 1>nodes in this artificial neural network are making the wrong decisions.

0:10:01.000 --> 0:10:04.280
<v Speaker 1>The weighting is wrong in these nodes. I need to

0:10:04.280 --> 0:10:07.280
<v Speaker 1>go and start adjusting things so that I can start

0:10:07.280 --> 0:10:12.520
<v Speaker 1>to get back to this correctly saying whether or not

0:10:12.559 --> 0:10:15.400
<v Speaker 1>an image has a cat in it or doesn't. And

0:10:16.040 --> 0:10:18.240
<v Speaker 1>your goal is to train this model over and over

0:10:18.280 --> 0:10:21.200
<v Speaker 1>and over again until it gets better and better at

0:10:21.200 --> 0:10:24.120
<v Speaker 1>this task, so that then you can just send it

0:10:24.200 --> 0:10:26.720
<v Speaker 1>any raw data you like and not have to worry

0:10:26.760 --> 0:10:31.120
<v Speaker 1>about checking up on it afterward because its accuracy level

0:10:31.120 --> 0:10:34.520
<v Speaker 1>will be high enough to be reliable. That's your ultimate goal,

0:10:34.880 --> 0:10:38.120
<v Speaker 1>But there's a whole process of learning of training that

0:10:38.200 --> 0:10:41.760
<v Speaker 1>you have to go through first. Now, a recurrent neural network,

0:10:41.760 --> 0:10:44.760
<v Speaker 1>it's a little more specific than just artificial neural network.

0:10:45.320 --> 0:10:50.679
<v Speaker 1>Recurrent neural networks use sequential data. These networks can and

0:10:50.760 --> 0:10:55.720
<v Speaker 1>do take information from earlier inputs into consideration when processing

0:10:55.920 --> 0:11:00.280
<v Speaker 1>a new input. So there's a different model, the convolutional

0:11:00.520 --> 0:11:04.040
<v Speaker 1>neural network, or CNN, not the news channel. This is the

0:11:04.040 --> 0:11:08.000
<v Speaker 1>other big type of neural network where every time data

0:11:08.080 --> 0:11:11.480
<v Speaker 1>goes into an input, it's like a blank slate. It's

0:11:11.480 --> 0:11:15.320
<v Speaker 1>its own thing. Nothing about that decision is

0:11:15.400 --> 0:11:19.880
<v Speaker 1>based upon any past decision. It's an instance-by-instance

0:11:20.000 --> 0:11:22.960
<v Speaker 1>kind of case. So you're starting from scratch. But with

0:11:23.160 --> 0:11:27.720
<v Speaker 1>recurrent neural networks, the network can actually incorporate past inputs

0:11:28.080 --> 0:11:31.840
<v Speaker 1>as part of how it processes a current input. But

0:11:31.960 --> 0:11:35.400
<v Speaker 1>one issue with these types of networks, the recurrent neural

0:11:35.400 --> 0:11:38.800
<v Speaker 1>networks is that they need a full sequence before they

0:11:38.840 --> 0:11:42.600
<v Speaker 1>can process the information. So when we're talking about text,

0:11:43.040 --> 0:11:45.880
<v Speaker 1>like if we wanted to process text through a recurrent

0:11:45.920 --> 0:11:49.120
<v Speaker 1>neural network, it would need to work over the entire

0:11:49.240 --> 0:11:53.240
<v Speaker 1>text before producing a result in order to understand things

0:11:53.280 --> 0:11:57.000
<v Speaker 1>like context. Sometimes this approach can lead to errors because

0:11:57.040 --> 0:12:01.720
<v Speaker 1>the model essentially forgets the stuff that was at the

0:12:01.760 --> 0:12:04.160
<v Speaker 1>beginning of the text by the time it gets to

0:12:04.200 --> 0:12:07.160
<v Speaker 1>the end, which sounds a lot like me honestly, where

0:12:07.600 --> 0:12:10.440
<v Speaker 1>I will finish a book and then I'll think, like

0:12:10.520 --> 0:12:13.560
<v Speaker 1>I'll have a discussion with someone about a book that

0:12:13.600 --> 0:12:15.360
<v Speaker 1>we've both read and they'll be like, Oh, I like

0:12:15.440 --> 0:12:18.320
<v Speaker 1>that part where, early in the book, blah blah

0:12:18.320 --> 0:12:20.600
<v Speaker 1>blah blah blah, and it pays off much later, and meanwhile,

0:12:20.600 --> 0:12:23.320
<v Speaker 1>I'm thinking, I totally forgot that that happened earlier in

0:12:23.360 --> 0:12:25.559
<v Speaker 1>the book. I remember where we ended up, but I

0:12:25.600 --> 0:12:28.960
<v Speaker 1>don't remember how we got there. Recurrent neural networks can

0:12:29.000 --> 0:12:33.360
<v Speaker 1>fall into the same sort of trap, and so that

0:12:34.679 --> 0:12:38.520
<v Speaker 1>creates a bit of a hurdle when it comes to

0:12:38.559 --> 0:12:44.640
<v Speaker 1>things like analyzing text for the purposes of building natural

0:12:44.720 --> 0:12:48.960
<v Speaker 1>language systems. But I'll explain how that all started to

0:12:49.040 --> 0:12:52.559
<v Speaker 1>change in twenty seventeen. First, however, we need to take

0:12:52.600 --> 0:13:05.680
<v Speaker 1>a quick break to thank our sponsors. Okay, before the break,

0:13:05.720 --> 0:13:08.840
<v Speaker 1>I was talking about recurrent neural networks and how those

0:13:08.880 --> 0:13:11.439
<v Speaker 1>have certain limitations when it comes to the way they

0:13:11.440 --> 0:13:14.800
<v Speaker 1>process data because it has to be sequential. Well, in

0:13:14.840 --> 0:13:18.480
<v Speaker 1>twenty seventeen, a group of AI researchers who were working

0:13:18.520 --> 0:13:24.120
<v Speaker 1>specifically over at Google were coming up with an alternative

0:13:24.760 --> 0:13:27.760
<v Speaker 1>to this approach, and they published a paper, and the

0:13:27.800 --> 0:13:32.000
<v Speaker 1>paper's title was Attention is All You Need, in which

0:13:32.000 --> 0:13:35.680
<v Speaker 1>they suggested that you could do something differently from the

0:13:35.720 --> 0:13:39.000
<v Speaker 1>recurrent neural network approach for the purposes of analyzing stuff

0:13:39.080 --> 0:13:44.360
<v Speaker 1>like text. Their approach was what they called a transformer model.
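
NOTE
A minimal sketch of scaled dot-product attention, the core operation in the
"Attention Is All You Need" paper named above. The helper names softmax and attend and the
toy two-dimensional vectors are assumptions made up for illustration; this is not the
paper's full model, only the step that lets every position be weighed against a query at once.
```python
import math
def softmax(scores):
    # Subtract the largest score first for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
def attend(query, keys, values):
    # Score every position against the query, scaled by the square root of the dimension.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Blend the value vectors using the attention weights.
    width = len(values[0])
    blended = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(width)]
    return weights, blended
# Toy vectors, invented for illustration only.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
weights, blended = attend([1.0, 0.0], keys, values)
print(weights, blended)
```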

0:13:45.240 --> 0:13:49.600
<v Speaker 1>While your old RNN would analyze text essentially a character

0:13:49.679 --> 0:13:51.600
<v Speaker 1>at a time, not even a word at a time,

0:13:51.600 --> 0:13:54.840
<v Speaker 1>but a character at a time, and thus that's sequential, right.

0:13:54.880 --> 0:13:58.440
<v Speaker 1>The sequential data is character by character. It builds this

0:13:58.600 --> 0:14:02.120
<v Speaker 1>up and then analyzes the whole thing. The transformer model

0:14:02.160 --> 0:14:06.680
<v Speaker 1>instead would tackle a sentence as a unit as opposed

0:14:06.679 --> 0:14:10.280
<v Speaker 1>to a character or even an entire passage of text

0:14:10.440 --> 0:14:13.319
<v Speaker 1>would be a single unit, and so it would analyze

0:14:13.360 --> 0:14:17.160
<v Speaker 1>this to understand the context of what was being said,

0:14:17.880 --> 0:14:20.880
<v Speaker 1>and that's a huge benefit. Getting a handle on

0:14:21.000 --> 0:14:26.160
<v Speaker 1>context is absolutely critical to understanding what someone means, because

0:14:26.200 --> 0:14:29.400
<v Speaker 1>words can have multiple meanings, right, and without context, we

0:14:29.440 --> 0:14:33.720
<v Speaker 1>can't really be sure which meaning someone intended. So here's

0:14:33.720 --> 0:14:37.760
<v Speaker 1>an example. The English word late. That can mean a

0:14:37.760 --> 0:14:40.280
<v Speaker 1>lot of things if you're an English speaker. So if

0:14:40.320 --> 0:14:42.280
<v Speaker 1>you're talking about the time of day, if you say

0:14:42.280 --> 0:14:45.560
<v Speaker 1>it's late, you usually mean it's getting close to nighttime.

0:14:45.800 --> 0:14:47.760
<v Speaker 1>You could say it's late at night, which means it's

0:14:47.800 --> 0:14:51.120
<v Speaker 1>actually close to morning time, or maybe it even is

0:14:51.200 --> 0:14:55.440
<v Speaker 1>the morning because it's still dark. And so you think

0:14:55.480 --> 0:14:59.360
<v Speaker 1>of it as night, but technically speaking, it's morning and

0:14:59.360 --> 0:15:01.760
<v Speaker 1>you're just saying it's late at night. If you're saying

0:15:01.880 --> 0:15:05.800
<v Speaker 1>somebody is late, you could either mean they are not

0:15:05.960 --> 0:15:10.320
<v Speaker 1>on time for some appointment, or tragically, you could mean

0:15:10.360 --> 0:15:13.440
<v Speaker 1>that this is a person who has passed away. They

0:15:13.480 --> 0:15:16.960
<v Speaker 1>are late. But you need the rest of the sentence.

0:15:17.000 --> 0:15:21.920
<v Speaker 1>You need that context to understand what meaning of late

0:15:22.480 --> 0:15:27.720
<v Speaker 1>was actually intended. So you need that contextual vision to

0:15:27.760 --> 0:15:31.680
<v Speaker 1>be able to understand the whole thing. So transformer models

0:15:32.240 --> 0:15:37.840
<v Speaker 1>began to revolutionize certain types of AI applications, specifically in

0:15:37.880 --> 0:15:43.760
<v Speaker 1>the realm of natural language processing and generative AI, and

0:15:43.840 --> 0:15:47.600
<v Speaker 1>it's what led to the development of large language models

0:15:47.960 --> 0:15:52.040
<v Speaker 1>the LLMs. Essentially, a large language model is just a

0:15:52.280 --> 0:15:56.600
<v Speaker 1>huge transformer model. And to make a large language model,

0:15:57.040 --> 0:16:00.760
<v Speaker 1>you need a lot of text to train your model,

0:16:01.120 --> 0:16:04.960
<v Speaker 1>like a lot a lot. OpenAI trained its large

0:16:05.000 --> 0:16:09.040
<v Speaker 1>language model known as GPT, which stands for Generative

0:16:09.160 --> 0:16:15.640
<v Speaker 1>Pre-trained Transformer. They trained it on countless documents, millions and

0:16:15.800 --> 0:16:22.040
<v Speaker 1>millions of documents found across the web. Some authors allege

0:16:22.440 --> 0:16:26.000
<v Speaker 1>that the training material included copyrighted material and that the

0:16:26.000 --> 0:16:28.840
<v Speaker 1>authors did not give permission for their works to be

0:16:28.960 --> 0:16:32.200
<v Speaker 1>part of the information that fed into this model. That

0:16:32.400 --> 0:16:35.400
<v Speaker 1>leads into its own set of problems that are a

0:16:35.480 --> 0:16:37.760
<v Speaker 1>little bit beyond the scope of what I'm talking about today,

0:16:37.760 --> 0:16:41.080
<v Speaker 1>but they are big problems and they're ongoing now. Stephen

0:16:41.160 --> 0:16:45.120
<v Speaker 1>King argued that his works were clearly used to train

0:16:45.240 --> 0:16:48.720
<v Speaker 1>up large language models. A dead giveaway is if you

0:16:48.840 --> 0:16:53.360
<v Speaker 1>ask a chatbot built on top of a large language

0:16:53.360 --> 0:16:58.360
<v Speaker 1>model to recite passages from specific authors' works, and if

0:16:58.360 --> 0:17:01.560
<v Speaker 1>it can do that accurately, like it's really giving

0:17:01.600 --> 0:17:06.760
<v Speaker 1>you an accurate representation of that text. Yeah, there's no

0:17:06.880 --> 0:17:12.240
<v Speaker 1>way it could have received that information without having trained on

0:17:12.440 --> 0:17:16.399
<v Speaker 1>the original text at least somewhere. Now, if it's just

0:17:16.440 --> 0:17:20.520
<v Speaker 1>making stuff up, that's different. That falls into the category

0:17:20.560 --> 0:17:24.080
<v Speaker 1>of hallucinations, which we might touch upon again before we

0:17:24.320 --> 0:17:30.320
<v Speaker 1>finish up this episode. Anyway, the benefit of feeding so

0:17:30.640 --> 0:17:34.480
<v Speaker 1>much information to a transformer model is that the transformer model,

0:17:34.560 --> 0:17:38.000
<v Speaker 1>the large language model, gets pretty darn good at sussing

0:17:38.040 --> 0:17:42.040
<v Speaker 1>out context. Even stuff that you would expect would trip

0:17:42.200 --> 0:17:45.720
<v Speaker 1>up an AI chatbot can become a breeze. You know,

0:17:45.800 --> 0:17:49.479
<v Speaker 1>you might think that slang or idioms could trip up

0:17:49.480 --> 0:17:52.840
<v Speaker 1>an AI tool, but then you have to remember that

0:17:52.920 --> 0:17:55.960
<v Speaker 1>these tools rely on essentially all the stuff that's on

0:17:56.000 --> 0:17:59.320
<v Speaker 1>the Internet, at least all the stuff that's publicly available

0:17:59.320 --> 0:18:03.560
<v Speaker 1>that's not locked behind something, and maybe even some stuff

0:18:03.560 --> 0:18:07.080
<v Speaker 1>that is locked behind stuff. As it turns out, and

0:18:07.200 --> 0:18:09.840
<v Speaker 1>as such, that means that these models have trained with

0:18:09.960 --> 0:18:12.960
<v Speaker 1>data sets that originate from the same communities that are

0:18:13.000 --> 0:18:16.919
<v Speaker 1>creating the culture that generates certain slang and idioms in

0:18:16.920 --> 0:18:20.400
<v Speaker 1>the first place. So if your AI model is using

0:18:20.440 --> 0:18:25.320
<v Speaker 1>the same source material where these turns of phrase and

0:18:25.440 --> 0:18:29.960
<v Speaker 1>certain slang terms are originating from, well, of course

0:18:30.000 --> 0:18:32.119
<v Speaker 1>it's going to understand it because that was part of

0:18:32.119 --> 0:18:36.240
<v Speaker 1>its training, so it has that grounding. It's not like me,

0:18:36.800 --> 0:18:40.199
<v Speaker 1>where I am old. I don't understand slang that the

0:18:40.280 --> 0:18:43.880
<v Speaker 1>kids use these days because I'm not in those communities.

0:18:44.560 --> 0:18:47.080
<v Speaker 1>You wouldn't expect me to understand. I am definitely the

0:18:48.400 --> 0:18:51.800
<v Speaker 1>stereotypical out of touch old dude. So when I hear

0:18:51.880 --> 0:18:55.840
<v Speaker 1>people talk about, you know, people rizzing up, I'm like, wait, what?

0:18:56.880 --> 0:18:59.600
<v Speaker 1>And I have to look things up. And as we

0:18:59.640 --> 0:19:03.720
<v Speaker 1>all know, Urban Dictionary is not the most reliable of resources.

0:19:04.200 --> 0:19:08.679
<v Speaker 1>It is frequently entertaining, usually in a way that is

0:19:08.720 --> 0:19:13.600
<v Speaker 1>incredibly offensive, but it's not always accurate. Anyway, this ultimately

0:19:13.680 --> 0:19:16.680
<v Speaker 1>starts to lead us to why these AI writing detection

0:19:16.800 --> 0:19:21.280
<v Speaker 1>tools are not very good. The material that AI generates

0:19:21.400 --> 0:19:24.840
<v Speaker 1>is built upon how we communicate. It's built on

0:19:24.880 --> 0:19:28.360
<v Speaker 1>how we write. That's how it was trained. So it's

0:19:28.359 --> 0:19:33.199
<v Speaker 1>not like AI or robots, as I was facetiously saying

0:19:33.280 --> 0:19:36.080
<v Speaker 1>earlier in the episode. It's not like AI has a

0:19:36.119 --> 0:19:39.320
<v Speaker 1>different path toward writing than we do. The AI is

0:19:39.359 --> 0:19:43.760
<v Speaker 1>not following an established set of rules that's unique to AI, right?

0:19:43.800 --> 0:19:47.760
<v Speaker 1>They're not saying, write this like artificial intelligence. So the

0:19:47.840 --> 0:19:51.639
<v Speaker 1>stuff that AI produces can come across as very human

0:19:52.040 --> 0:19:56.159
<v Speaker 1>and vice versa. Now, this does not mean that it

0:19:56.280 --> 0:20:01.080
<v Speaker 1>is absolutely impossible for someone like a teacher to tell

0:20:01.160 --> 0:20:04.720
<v Speaker 1>if something was written by AI or a student. If

0:20:04.760 --> 0:20:07.439
<v Speaker 1>the teacher is actually really familiar with the writing style

0:20:07.720 --> 0:20:12.120
<v Speaker 1>of that student or students in question, it's entirely possible

0:20:12.320 --> 0:20:15.120
<v Speaker 1>that the teacher might notice if that writing style were

0:20:15.160 --> 0:20:20.880
<v Speaker 1>to suddenly and maybe significantly change between assignments. This can

0:20:20.960 --> 0:20:23.640
<v Speaker 1>be a big ask, by the way, for certain teachers,

0:20:23.800 --> 0:20:26.880
<v Speaker 1>because class sizes can get huge depending on where you are,

0:20:27.600 --> 0:20:30.320
<v Speaker 1>and if you're talking about an overworked English teacher who's

0:20:30.359 --> 0:20:33.879
<v Speaker 1>teaching multiple classes and each class has got, you know,

0:20:34.000 --> 0:20:37.720
<v Speaker 1>thirty kids in it. It can be hard to really

0:20:37.920 --> 0:20:42.879
<v Speaker 1>build up a working knowledge and memory of the writing

0:20:42.920 --> 0:20:45.520
<v Speaker 1>styles of every single person in every single class. But

0:20:46.119 --> 0:20:48.800
<v Speaker 1>that is one way that teachers can tell. If teachers

0:20:49.040 --> 0:20:52.040
<v Speaker 1>read an essay and think, wow, you know, Robert didn't

0:20:52.119 --> 0:20:55.720
<v Speaker 1>write like this in the essay we did last month,

0:20:56.200 --> 0:20:59.960
<v Speaker 1>this is a very different approach to writing, and

0:21:00.040 --> 0:21:04.080
<v Speaker 1>perhaps that's an indicator that someone else wrote the piece,

0:21:04.119 --> 0:21:07.680
<v Speaker 1>whether that was AI or maybe you know, another human being,

0:21:08.480 --> 0:21:12.320
<v Speaker 1>and that can be an indication something hinky is going on. Also,

0:21:12.400 --> 0:21:15.480
<v Speaker 1>I mean, obviously some people get sloppy. This happens a

0:21:15.480 --> 0:21:18.640
<v Speaker 1>lot too when people just aren't paying attention as they're

0:21:18.760 --> 0:21:24.600
<v Speaker 1>using AI to generate either you know, an educational assignment

0:21:24.880 --> 0:21:28.840
<v Speaker 1>or business or whatever. There have been so many examples

0:21:29.240 --> 0:21:33.280
<v Speaker 1>of how people have accidentally copied and pasted not just

0:21:33.560 --> 0:21:36.760
<v Speaker 1>the body of the text, but stuff that's outside the

0:21:36.800 --> 0:21:39.000
<v Speaker 1>body of the text, like it might even be a

0:21:39.000 --> 0:21:42.600
<v Speaker 1>little disclaimer saying it was made by AI, or it

0:21:42.640 --> 0:21:47.080
<v Speaker 1>could be a command like regenerate response. That's something you

0:21:47.160 --> 0:21:51.480
<v Speaker 1>find in certain chat bots, and that is just what

0:21:51.520 --> 0:21:55.760
<v Speaker 1>regenerate response means. It just means, hey, can you create

0:21:55.920 --> 0:22:03.119
<v Speaker 1>a new AI response to the initial prompt I gave you.

0:22:03.320 --> 0:22:06.200
<v Speaker 1>So I wrote a prompt, I had you generate a response.

0:22:07.040 --> 0:22:09.280
<v Speaker 1>I want you to create a whole new response based

0:22:09.320 --> 0:22:13.280
<v Speaker 1>on that original prompt. If you have regenerate response written

0:22:13.320 --> 0:22:18.560
<v Speaker 1>in your essay, that's a dead giveaway that you

0:22:18.800 --> 0:22:22.160
<v Speaker 1>copied and pasted that essay off of an AI chatbot.

0:22:22.520 --> 0:22:25.920
<v Speaker 1>So there are ways that teachers can tell the difference,

0:22:26.840 --> 0:22:30.359
<v Speaker 1>but they aren't. It's not as granular as saying, oh,

0:22:30.600 --> 0:22:35.280
<v Speaker 1>this is clearly something that was written by artificial intelligence

0:22:35.359 --> 0:22:37.719
<v Speaker 1>versus this was written by a human. It's more like

0:22:38.400 --> 0:22:41.359
<v Speaker 1>this is different from what I have received before from

0:22:41.440 --> 0:22:47.440
<v Speaker 1>this particular student, or this contains obvious errors that reveal

0:22:47.720 --> 0:22:52.399
<v Speaker 1>that the student has used AI. Now, the AI writing

0:22:52.440 --> 0:22:58.160
<v Speaker 1>detection tools are at least claiming to use a couple

0:22:58.160 --> 0:23:01.040
<v Speaker 1>of strategies to try and determine if something was written

0:23:01.080 --> 0:23:04.360
<v Speaker 1>by AI or a human. So they're saying, we can

0:23:04.440 --> 0:23:08.359
<v Speaker 1>automate that process, and we can actually analyze a block

0:23:08.400 --> 0:23:11.120
<v Speaker 1>of text and give you a determination as to whether

0:23:11.240 --> 0:23:13.399
<v Speaker 1>or not that was made by AI or a human,

0:23:13.760 --> 0:23:18.080
<v Speaker 1>which suggests that maybe there is some sort of fundamental

0:23:18.119 --> 0:23:22.360
<v Speaker 1>difference between the way AI generates content and the way

0:23:22.520 --> 0:23:27.800
<v Speaker 1>people do. But these strategies that the AI writing detection

0:23:27.920 --> 0:23:32.199
<v Speaker 1>tools are built upon have fundamental flaws, and we know

0:23:32.280 --> 0:23:35.000
<v Speaker 1>that because we know the tools are bad. It was

0:23:35.040 --> 0:23:37.639
<v Speaker 1>bad enough for OpenAI to shut down its version

0:23:37.760 --> 0:23:42.919
<v Speaker 1>back in June. So this isn't like just us postulating

0:23:43.160 --> 0:23:45.919
<v Speaker 1>that these tools are bad. We know they're bad. We

0:23:46.080 --> 0:23:49.640
<v Speaker 1>know they create things like false positives. So knowing that

0:23:49.920 --> 0:23:53.639
<v Speaker 1>already they are unreliable, you then have to start asking, well,

0:23:53.960 --> 0:23:56.560
<v Speaker 1>why are they unreliable? What are the things that are

0:23:56.680 --> 0:24:00.480
<v Speaker 1>leading these tools to make these wrong determinations? And when

0:24:00.480 --> 0:24:04.359
<v Speaker 1>we come back, I'll talk about how Benj Edwards in

0:24:04.480 --> 0:24:08.880
<v Speaker 1>those Ars Technica articles really kind of digs into two

0:24:09.359 --> 0:24:14.159
<v Speaker 1>main concepts that end up leading to these writing detection

0:24:14.280 --> 0:24:17.080
<v Speaker 1>tools trying to make a determination and why they are

0:24:17.760 --> 0:24:32.600
<v Speaker 1>fundamentally flawed. But first let's take another quick break. So

0:24:33.320 --> 0:24:35.439
<v Speaker 1>before the break, I mentioned that I was going to

0:24:35.440 --> 0:24:39.760
<v Speaker 1>talk about some strategies that Benj Edwards outlines in his

0:24:39.960 --> 0:24:43.800
<v Speaker 1>Ars Technica articles, and they fall into two categories. The

0:24:43.840 --> 0:24:49.600
<v Speaker 1>first is called perplexity, and that really means how surprising

0:24:49.800 --> 0:24:54.480
<v Speaker 1>or perplexing are the word choices, how creative are the

0:24:54.600 --> 0:24:59.640
<v Speaker 1>sentences in a given piece of text compared to an

0:24:59.680 --> 0:25:04.800
<v Speaker 1>AI training model. So the thinking behind this is that

0:25:05.560 --> 0:25:09.600
<v Speaker 1>if a block of text seems to conform to the

0:25:09.640 --> 0:25:12.880
<v Speaker 1>same sort of stuff that the language model would produce,

0:25:13.760 --> 0:25:17.639
<v Speaker 1>then AI probably created the text. That's the idea.

0:25:17.880 --> 0:25:22.280
<v Speaker 1>They're saying essentially that if the text is really similar

0:25:22.320 --> 0:25:25.880
<v Speaker 1>to what AI would create, then AI probably created it.

0:25:26.880 --> 0:25:30.119
<v Speaker 1>And let's think about how some tools use autocomplete to

0:25:30.160 --> 0:25:32.600
<v Speaker 1>help you write a text or sentence. Using a purely

0:25:32.680 --> 0:25:35.520
<v Speaker 1>hypothetical scenario to kind of get our minds wrapped around this,

0:25:36.160 --> 0:25:38.880
<v Speaker 1>Let's say that you were typing into something that has

0:25:38.960 --> 0:25:43.640
<v Speaker 1>autocomplete built into it, the sentence or the phrase I'm

0:25:43.680 --> 0:25:48.800
<v Speaker 1>going to go for a and then whatever tool you're

0:25:48.840 --> 0:25:53.920
<v Speaker 1>typing it into suggests the word walk as an autocomplete option. Well,

0:25:53.960 --> 0:25:57.800
<v Speaker 1>that would be because the language model that is powering

0:25:58.280 --> 0:26:04.760
<v Speaker 1>this autocomplete function has sampled millions of passages,

0:26:04.880 --> 0:26:07.879
<v Speaker 1>millions and millions and millions of documents, and has found

0:26:08.440 --> 0:26:11.760
<v Speaker 1>that the word walk has been the most common word

0:26:11.800 --> 0:26:15.199
<v Speaker 1>to follow the phrase I'm going to go for a

0:26:16.520 --> 0:26:21.720
<v Speaker 1>and so therefore it offers that as the suggestion, and

0:26:21.920 --> 0:26:24.160
<v Speaker 1>maybe it would even offer you a few options. Maybe

0:26:24.160 --> 0:26:27.359
<v Speaker 1>it would say walk, maybe it'd say swim. In the UK,

0:26:27.480 --> 0:26:32.080
<v Speaker 1>maybe it'd say a curry. Who knows? So, you know,

0:26:32.119 --> 0:26:33.959
<v Speaker 1>it would give you maybe a couple of different options,

0:26:33.960 --> 0:26:35.960
<v Speaker 1>but they would be the ones that would most likely

0:26:36.040 --> 0:26:40.240
<v Speaker 1>follow that phrase based upon the training material that that

0:26:40.400 --> 0:26:44.640
<v Speaker 1>large language model had used to build itself up. Right,

0:26:45.119 --> 0:26:49.240
<v Speaker 1>So if you were to measure the perplexity of the

0:26:49.280 --> 0:26:52.840
<v Speaker 1>sentence I'm going to go for a walk, it would

0:26:52.880 --> 0:26:56.600
<v Speaker 1>be very very low, very low perplexity because it's in

0:26:56.640 --> 0:27:00.320
<v Speaker 1>line with what the language model would expect. So the

0:27:00.400 --> 0:27:03.720
<v Speaker 1>thought is, if a passage in general has a very

0:27:03.960 --> 0:27:08.960
<v Speaker 1>low perplexity, these tools tend to suspect that the passage

0:27:08.960 --> 0:27:11.760
<v Speaker 1>as a whole could have come from AI. So let's

0:27:11.800 --> 0:27:14.119
<v Speaker 1>say that it had a very high perplexity. Let's say that

0:27:14.160 --> 0:27:16.399
<v Speaker 1>instead of saying I'm going to go for a walk,

0:27:16.880 --> 0:27:19.919
<v Speaker 1>you said I'm going to go for a zebra or

0:27:20.040 --> 0:27:23.760
<v Speaker 1>zebra if you're in the UK. Well, one,

0:27:23.800 --> 0:27:25.680
<v Speaker 1>it doesn't really make any sense. But two, that would

0:27:25.680 --> 0:27:28.240
<v Speaker 1>be very perplexing, right, that would not be something that

0:27:28.280 --> 0:27:30.720
<v Speaker 1>the large language model would expect. And so if it

0:27:30.720 --> 0:27:35.000
<v Speaker 1>has high perplexity, then the writing detection tool is more

0:27:35.200 --> 0:27:37.640
<v Speaker 1>likely to say this was written by a human, because

0:27:38.119 --> 0:27:41.920
<v Speaker 1>what generative chat system would have made that sentence? It's

0:27:42.040 --> 0:27:44.280
<v Speaker 1>like, no sane robot would say I'm going to

0:27:44.280 --> 0:27:47.680
<v Speaker 1>go for a zebra. Clearly some human wrote this. Now,

0:27:47.680 --> 0:27:51.400
<v Speaker 1>the problem is these companies are training their large language

0:27:51.400 --> 0:27:56.119
<v Speaker 1>models on enormous amounts of human generated text. And unless

0:27:56.160 --> 0:28:01.600
<v Speaker 1>you're purposefully trying to be really original in your writing,

0:28:01.760 --> 0:28:03.840
<v Speaker 1>that's a kind way of saying you're being a weirdo

0:28:04.080 --> 0:28:07.119
<v Speaker 1>as you're writing your sentences. Chances are a lot of

0:28:07.119 --> 0:28:09.479
<v Speaker 1>the stuff you're writing is going to have a fairly

0:28:09.560 --> 0:28:13.800
<v Speaker 1>low perplexity, unless you're trying to write in like the

0:28:13.840 --> 0:28:18.800
<v Speaker 1>milieu of humor or absurdity. Unless you're purposely trying

0:28:18.800 --> 0:28:22.800
<v Speaker 1>to do that, then chances are your perplexity is going

0:28:22.880 --> 0:28:25.960
<v Speaker 1>to be pretty low too. Particularly for very structured writing

0:28:26.080 --> 0:28:30.280
<v Speaker 1>like business writing or academic writing, that perplexity is going

0:28:30.320 --> 0:28:33.720
<v Speaker 1>to be very low. So unless you're prone to throwing

0:28:33.760 --> 0:28:38.680
<v Speaker 1>in very odd, random, weird sentences like William Shakespeare's Othello

0:28:38.920 --> 0:28:41.040
<v Speaker 1>is one of the great tragedies of English theater, and

0:28:41.120 --> 0:28:46.120
<v Speaker 1>also I enjoy shoving hot dogs through mail slots. Well,

0:28:46.160 --> 0:28:48.600
<v Speaker 1>there's a pretty good chance that an AI detector tool

0:28:48.960 --> 0:28:52.720
<v Speaker 1>is going to think that your human written, legitimate essay

0:28:53.440 --> 0:28:57.680
<v Speaker 1>was in fact an AI's work, because the perplexity would

0:28:57.720 --> 0:29:00.840
<v Speaker 1>likely be pretty low, again unless you're doing something really

0:29:00.880 --> 0:29:04.479
<v Speaker 1>avant-garde. So there's a fundamental flaw in the logic

0:29:04.520 --> 0:29:08.160
<v Speaker 1>of using perplexity as one of your metrics for determining

0:29:08.200 --> 0:29:11.680
<v Speaker 1>if something was written by AI versus a human. Benj

0:29:11.720 --> 0:29:15.640
<v Speaker 1>Edwards also goes on to explain that another factor that

0:29:15.680 --> 0:29:19.040
<v Speaker 1>AI detection tools will take into consideration is one that's

0:29:19.080 --> 0:29:25.520
<v Speaker 1>called burstiness. Perplexity and burstiness makes me feel like I've

0:29:25.600 --> 0:29:29.960
<v Speaker 1>fallen into a Lewis Carroll novel. But anyway, burstiness really

0:29:30.000 --> 0:29:34.600
<v Speaker 1>has to do with variability, particularly between sentences. So y'all

0:29:34.640 --> 0:29:39.880
<v Speaker 1>probably have noticed I have a tendency toward really long sentences,

0:29:40.000 --> 0:29:43.960
<v Speaker 1>and often with a lot of parentheticals thrown in there. Now,

0:29:44.000 --> 0:29:48.080
<v Speaker 1>if I also incorporate short sentences on occasion, breaking up

0:29:48.120 --> 0:29:51.400
<v Speaker 1>these very long sentences, this creates a lot more variety,

0:29:51.680 --> 0:29:56.280
<v Speaker 1>a lot more dynamic elements between my sentences, because I'm

0:29:56.400 --> 0:30:02.560
<v Speaker 1>switching back and forth between these very long, pontificating sentences

0:30:02.600 --> 0:30:05.960
<v Speaker 1>and then short ones to make a point. Maybe in

0:30:06.000 --> 0:30:09.160
<v Speaker 1>some sentences I use tons of adverbs to describe action.

0:30:09.600 --> 0:30:11.920
<v Speaker 1>Maybe in the next sentence I don't use any adverbs

0:30:11.920 --> 0:30:16.520
<v Speaker 1>at all. This is what creates that variability. The conventional

0:30:16.560 --> 0:30:21.320
<v Speaker 1>wisdom is that AI generated work is more uniform, it's

0:30:21.360 --> 0:30:25.880
<v Speaker 1>more consistent, it has less variability from sentence to sentence.

0:30:25.920 --> 0:30:30.040
<v Speaker 1>Your sentence length and complexity is going to remain more

0:30:30.160 --> 0:30:33.719
<v Speaker 1>or less the same throughout an entire passage. So if

0:30:33.760 --> 0:30:38.040
<v Speaker 1>you're able to quantify how dynamic a writing style is,

0:30:39.160 --> 0:30:42.760
<v Speaker 1>the thinking goes, you could potentially determine if a human

0:30:42.800 --> 0:30:45.760
<v Speaker 1>wrote it or if an AI wrote that specific piece.

0:30:46.480 --> 0:30:50.600
<v Speaker 1>If it's not very dynamic, well, that leans more toward AI.

0:30:51.800 --> 0:30:54.480
<v Speaker 1>But that approach depends upon a couple of things that

0:30:54.520 --> 0:30:57.680
<v Speaker 1>are not always reliable. So first up, it assumes that

0:30:57.800 --> 0:31:00.560
<v Speaker 1>AI generated content is going to continue to show

0:31:00.640 --> 0:31:04.880
<v Speaker 1>more consistency than the stuff that humans write, that it's going

0:31:04.920 --> 0:31:09.600
<v Speaker 1>to continue to be this very consistent approach to sentence structure.

0:31:09.920 --> 0:31:13.160
<v Speaker 1>But the language models and the generative AI that are

0:31:13.240 --> 0:31:15.880
<v Speaker 1>built on top of the language models are growing more

0:31:15.920 --> 0:31:18.840
<v Speaker 1>sophisticated all the time. A lot of these companies that

0:31:18.960 --> 0:31:23.880
<v Speaker 1>make these language models are mining platforms like X, formerly

0:31:23.960 --> 0:31:27.560
<v Speaker 1>known as Twitter, or Reddit in order to train their

0:31:27.640 --> 0:31:32.560
<v Speaker 1>language models. They're reading these sort of idiosyncratic messages that

0:31:32.600 --> 0:31:36.600
<v Speaker 1>people write. Sometimes they're writing purposefully in a way that

0:31:36.720 --> 0:31:41.040
<v Speaker 1>is not consistent, and it can get to be a

0:31:41.040 --> 0:31:45.360
<v Speaker 1>little unpredictable. Well, if you're training your language model on

0:31:45.400 --> 0:31:48.760
<v Speaker 1>these things, then over time the language models and the

0:31:48.800 --> 0:31:51.160
<v Speaker 1>tools that are built on top of them begin to

0:31:51.240 --> 0:31:54.920
<v Speaker 1>reflect that training material. It means that we should expect

0:31:55.480 --> 0:32:01.160
<v Speaker 1>generative AI to start increasing variability in sentences because that's

0:32:01.160 --> 0:32:04.760
<v Speaker 1>what we're training it on. You can't expect to train

0:32:04.800 --> 0:32:07.400
<v Speaker 1>it on one thing and it generates something totally different.

0:32:07.440 --> 0:32:10.600
<v Speaker 1>It's going to kind of mimic the material that was

0:32:10.720 --> 0:32:13.720
<v Speaker 1>used to teach it in the first place. So that

0:32:13.760 --> 0:32:15.840
<v Speaker 1>means you're going to see a reduction in the gap

0:32:15.960 --> 0:32:20.720
<v Speaker 1>between how AI creates text and how humans do. But

0:32:20.840 --> 0:32:24.440
<v Speaker 1>on top of that, again, for certain types of writing,

0:32:25.160 --> 0:32:28.240
<v Speaker 1>human authors may take a more structured approach and they

0:32:28.280 --> 0:32:34.720
<v Speaker 1>may purposefully reduce variability between sentences or unconsciously reduce variability.

0:32:35.480 --> 0:32:38.520
<v Speaker 1>That means that their writing is going to start looking

0:32:38.560 --> 0:32:41.560
<v Speaker 1>more like the stuff that these writing detection tools assume

0:32:42.200 --> 0:32:45.880
<v Speaker 1>is a marker for AI generated content. If I were

0:32:45.920 --> 0:32:49.200
<v Speaker 1>to write a term paper, I would probably take a

0:32:49.240 --> 0:32:53.479
<v Speaker 1>more consistent, uniform approach to my writing style. That's not

0:32:53.520 --> 0:32:55.680
<v Speaker 1>to suggest that would be the right choice, right, Like,

0:32:56.040 --> 0:32:57.880
<v Speaker 1>I'm not saying that if you write a term paper

0:32:57.920 --> 0:33:01.040
<v Speaker 1>you need to have this very consistent, uniform approach because

0:33:01.080 --> 0:33:04.640
<v Speaker 1>they can get really boring to read papers that are

0:33:04.680 --> 0:33:07.440
<v Speaker 1>written in a style like that. But that would probably

0:33:07.480 --> 0:33:11.080
<v Speaker 1>be my inclination, like thinking in my head, I'd be

0:33:11.600 --> 0:33:14.880
<v Speaker 1>I want to make sure I'm consistent, I'm academic, I

0:33:14.920 --> 0:33:18.120
<v Speaker 1>am thoughtful, I'm methodical. That means that the work I

0:33:18.120 --> 0:33:21.640
<v Speaker 1>would produce would have this low burstiness because I was

0:33:21.720 --> 0:33:24.400
<v Speaker 1>purposefully doing it. Even if that was the wrong decision,

0:33:24.400 --> 0:33:26.880
<v Speaker 1>it'd probably be the one that I would make because

0:33:26.920 --> 0:33:29.280
<v Speaker 1>I'd be working under the mistaken belief that this is

0:33:29.280 --> 0:33:33.320
<v Speaker 1>somehow more academic. So these AI writing detection tools are

0:33:33.360 --> 0:33:37.280
<v Speaker 1>looking for text that has low burstiness and low perplexity

0:33:37.560 --> 0:33:41.040
<v Speaker 1>before suggesting that AI had created that particular block of text.

0:33:41.080 --> 0:33:44.120
<v Speaker 1>But as we've talked about, humans write in that kind

0:33:44.120 --> 0:33:47.440
<v Speaker 1>of style too, particularly for formal writing, and so you

0:33:47.480 --> 0:33:49.880
<v Speaker 1>get a lot of false positives, like if you feed

0:33:49.920 --> 0:33:54.200
<v Speaker 1>the US Constitution to a writing detection tool, and it says, well,

0:33:54.240 --> 0:33:56.760
<v Speaker 1>AI wrote this. Well, a lot of stuff has been

0:33:56.760 --> 0:34:00.520
<v Speaker 1>written about the Constitution, including passages from the Constitution.

0:34:00.800 --> 0:34:04.160
<v Speaker 1>The Constitution itself is clearly available on the web, so

0:34:05.120 --> 0:34:09.040
<v Speaker 1>it's obviously part of these large language models' training sets.

0:34:09.680 --> 0:34:12.400
<v Speaker 1>So of course it's going to reflect what's in the

0:34:12.440 --> 0:34:18.359
<v Speaker 1>training set. It was literally incorporated into it. So if

0:34:18.400 --> 0:34:23.239
<v Speaker 1>you're working backward from that logic, then your conclusion is, oh,

0:34:23.280 --> 0:34:26.880
<v Speaker 1>AI wrote this because it reflects what the language model

0:34:27.280 --> 0:34:30.800
<v Speaker 1>was trained on. Well, yeah, but that's because the language

0:34:30.800 --> 0:34:33.680
<v Speaker 1>model was literally trained on the material you were analyzing.

0:34:34.680 --> 0:34:37.200
<v Speaker 1>It becomes this sort of catch-22 situation.

0:34:37.640 --> 0:34:42.799
<v Speaker 1>So we cannot rely on these detection tools in large part. Now,

0:34:42.800 --> 0:34:46.080
<v Speaker 1>this doesn't even touch upon the challenges that non native

0:34:46.120 --> 0:34:49.680
<v Speaker 1>English speakers face with their writing. When they're writing in

0:34:49.840 --> 0:34:53.840
<v Speaker 1>English and these AI detection tools are used on their work,

0:34:54.560 --> 0:34:57.640
<v Speaker 1>they can face disproportionate bias when it comes to these

0:34:57.640 --> 0:35:00.920
<v Speaker 1>detection tools. They get a lot more false positives. So

0:35:01.760 --> 0:35:04.480
<v Speaker 1>you're already seeing a lot of false positives anyway, because

0:35:04.640 --> 0:35:08.879
<v Speaker 1>as we've discussed, the criteria being used by these AI

0:35:08.920 --> 0:35:14.600
<v Speaker 1>writing detection tools are faulty because it's making assumptions that

0:35:14.680 --> 0:35:16.799
<v Speaker 1>humans are not writing in those styles when in fact

0:35:16.840 --> 0:35:20.160
<v Speaker 1>they are, and that AI is writing in one specific style,

0:35:20.239 --> 0:35:23.920
<v Speaker 1>when in fact, at least over time, it migrates away

0:35:23.920 --> 0:35:27.680
<v Speaker 1>from that. So you got a double whammy here. Now,

0:35:27.680 --> 0:35:31.160
<v Speaker 1>there are some applications of AI detection tools where it

0:35:31.239 --> 0:35:34.560
<v Speaker 1>works and it makes sense, just not in writing, but

0:35:34.760 --> 0:35:39.720
<v Speaker 1>for stuff like photo or video manipulation. AI detection tools

0:35:39.719 --> 0:35:44.920
<v Speaker 1>can still look for telltale signs that can indicate that

0:35:45.000 --> 0:35:47.000
<v Speaker 1>maybe what you're looking at has at least in some

0:35:47.040 --> 0:35:51.200
<v Speaker 1>part been created by a generative AI tool, right like

0:35:51.239 --> 0:35:55.080
<v Speaker 1>an image creation tool. Obviously, there are examples of this

0:35:55.120 --> 0:35:57.279
<v Speaker 1>where you take one look and you know immediately that

0:35:57.320 --> 0:35:59.360
<v Speaker 1>this was made by AI, because you look at it

0:35:59.360 --> 0:36:02.160
<v Speaker 1>and you're like, no one has that many fingers on

0:36:02.160 --> 0:36:06.440
<v Speaker 1>one hand, but there are other cases where it may not.

0:36:06.680 --> 0:36:11.240
<v Speaker 1>It may be far more subtle to a human perception,

0:36:11.480 --> 0:36:16.280
<v Speaker 1>but if you were to actually analyze the image deeply

0:36:16.440 --> 0:36:19.719
<v Speaker 1>with a very well trained AI detection tool, it could

0:36:19.719 --> 0:36:25.399
<v Speaker 1>indicate this was made by AI because of little subtle things.

0:36:25.440 --> 0:36:29.960
<v Speaker 1>Maybe it's inconsistent lighting, Maybe it's a blinking pattern of

0:36:30.000 --> 0:36:32.919
<v Speaker 1>a person in a video, things like that, Little things

0:36:32.920 --> 0:36:35.640
<v Speaker 1>that would be hard for us to spot as human beings,

0:36:35.680 --> 0:36:40.920
<v Speaker 1>but easy for a detection tool to spot. These AI

0:36:41.000 --> 0:36:46.440
<v Speaker 1>detection tools make sense. They're not necessarily foolproof or flawless,

0:36:47.000 --> 0:36:51.360
<v Speaker 1>but they have a better success rate than when it

0:36:51.400 --> 0:36:54.240
<v Speaker 1>comes to writing, because it's just not that clear cut

0:36:54.719 --> 0:36:58.719
<v Speaker 1>when we're talking about writing. This is unfortunate when teachers

0:36:58.719 --> 0:37:03.160
<v Speaker 1>may rely heavily on AI writing detection tools in order

0:37:03.200 --> 0:37:05.839
<v Speaker 1>to determine if their students are actually doing their own

0:37:05.880 --> 0:37:09.560
<v Speaker 1>work or not. If the teachers are unaware that these

0:37:09.600 --> 0:37:13.320
<v Speaker 1>detection tools are unreliable, they can make some really drastic

0:37:13.360 --> 0:37:16.800
<v Speaker 1>decisions that will have a huge negative impact on their students'

0:37:16.840 --> 0:37:21.239
<v Speaker 1>work and lives, and that's not really fair. Hopefully, the

0:37:21.400 --> 0:37:26.960
<v Speaker 1>educators out there are themselves educating themselves, to be repetitive,

0:37:27.840 --> 0:37:33.920
<v Speaker 1>about these tools and their unreliability, because otherwise they're going

0:37:33.960 --> 0:37:37.880
<v Speaker 1>to be punishing students and they can't justify it because

0:37:38.600 --> 0:37:40.840
<v Speaker 1>it's all based on a tool that has proven to

0:37:40.920 --> 0:37:45.799
<v Speaker 1>be unreliable at the get go, unless, of course, we're

0:37:45.800 --> 0:37:49.680
<v Speaker 1>talking about instances where someone has copied and pasted some

0:37:49.880 --> 0:37:54.000
<v Speaker 1>ridiculous part of an AI generated response that just gives

0:37:54.040 --> 0:37:59.200
<v Speaker 1>it away. That's a different case entirely, obviously. But yeah,

0:37:59.440 --> 0:38:03.080
<v Speaker 1>I think it's important to understand the limitations of these

0:38:04.120 --> 0:38:07.400
<v Speaker 1>As we explore generative AI, and we look at the

0:38:07.440 --> 0:38:10.960
<v Speaker 1>pros and the cons and we consider the impact the

0:38:11.000 --> 0:38:15.000
<v Speaker 1>generative AI has on multiple segments of our lives, we

0:38:15.080 --> 0:38:18.840
<v Speaker 1>also have to really think about how do we know

0:38:19.080 --> 0:38:22.560
<v Speaker 1>when it's in use, and how do we know that

0:38:22.600 --> 0:38:25.640
<v Speaker 1>the tools we're using to make those determinations are actually

0:38:26.400 --> 0:38:29.960
<v Speaker 1>good tools. In the case of these AI writing detection tools,

0:38:30.640 --> 0:38:34.000
<v Speaker 1>it looks to me like you might as well not

0:38:34.080 --> 0:38:37.880
<v Speaker 1>even look at them. You are more likely than not

0:38:37.960 --> 0:38:43.959
<v Speaker 1>to get an incorrect answer, because again, we train these

0:38:44.719 --> 0:38:48.239
<v Speaker 1>generative tools to communicate very much the way humans do,

0:38:48.280 --> 0:38:51.920
<v Speaker 1>at least in certain use cases, and those use cases

0:38:51.960 --> 0:38:54.239
<v Speaker 1>typically are the ones where we're most concerned about whether

0:38:54.320 --> 0:38:56.320
<v Speaker 1>or not AI was put to use in the first place.

0:38:56.880 --> 0:39:00.600
<v Speaker 1>So really interesting articles over on Ars Technica. It leads

0:39:00.640 --> 0:39:06.360
<v Speaker 1>to this really deep discussion about generative AI, the limitations

0:39:06.360 --> 0:39:10.200
<v Speaker 1>that we have in detecting it. And obviously there are

0:39:10.280 --> 0:39:12.239
<v Speaker 1>a lot of other things we could touch on. I

0:39:12.280 --> 0:39:17.600
<v Speaker 1>mentioned copyright. That's a big one, because if AI can

0:39:17.880 --> 0:39:24.200
<v Speaker 1>regurgitate copyrighted works with no flaws, then that can be

0:39:24.400 --> 0:39:29.839
<v Speaker 1>a huge blow to authors, for example, or we talked

0:39:29.840 --> 0:39:33.120
<v Speaker 1>a little bit about hallucinations. Hallucinations are when an AI

0:39:33.880 --> 0:39:39.560
<v Speaker 1>tool does not have the information to be able to

0:39:39.600 --> 0:39:42.960
<v Speaker 1>determine what should come next in a sentence. You have

0:39:43.040 --> 0:39:46.760
<v Speaker 1>to remember when you really boil it down these AI

0:39:46.880 --> 0:39:49.719
<v Speaker 1>generative tools, what they're doing is they're following a very

0:39:50.000 --> 0:39:55.600
<v Speaker 1>sophisticated statistical model to determine what should come next in

0:39:55.640 --> 0:39:59.200
<v Speaker 1>its answer. So you give it a prompt and it's

0:39:59.440 --> 0:40:03.799
<v Speaker 1>referencing this incredibly complicated statistical model to say, all right,

0:40:04.640 --> 0:40:07.759
<v Speaker 1>what should I put as a response. Some of the

0:40:07.800 --> 0:40:11.480
<v Speaker 1>information involves things like the actual answers to questions, but

0:40:11.520 --> 0:40:14.960
<v Speaker 1>there are cases where the AI model may be unable

0:40:15.000 --> 0:40:18.400
<v Speaker 1>to identify what the answer to the question is, but

0:40:18.480 --> 0:40:22.120
<v Speaker 1>it still needs to answer your query. It doesn't have

0:40:22.200 --> 0:40:24.759
<v Speaker 1>the answer, so it makes it up, but following this

0:40:24.880 --> 0:40:28.880
<v Speaker 1>very sophisticated statistical model so that the answer it generates

0:40:29.000 --> 0:40:32.719
<v Speaker 1>appears to be valid even though it's just completely made up.

0:40:32.800 --> 0:40:36.279
<v Speaker 1>This is what we call hallucinations in AI. It's when

0:40:36.320 --> 0:40:41.240
<v Speaker 1>AI generates an answer in order to respond to a query,

0:40:41.880 --> 0:40:46.040
<v Speaker 1>but that answer is fabricated. It's a confabulation. That's another

0:40:46.040 --> 0:40:49.719
<v Speaker 1>word that some people are using rather than hallucination, and

0:40:50.840 --> 0:40:53.840
<v Speaker 1>it comes across as being very much legitimate because again,

0:40:53.920 --> 0:41:00.000
<v Speaker 1>these very sophisticated statistical models make it seem authoritative and knowledgeable.

0:41:00.719 --> 0:41:03.000
<v Speaker 1>The way the sentences are structured, it doesn't come across

0:41:03.000 --> 0:41:06.880
<v Speaker 1>wishy washy. It's not like maybe it's blah blah blah.

0:41:06.920 --> 0:41:10.560
<v Speaker 1>It ends up being it's blah blah blah and presented

0:41:10.560 --> 0:41:12.880
<v Speaker 1>in such a way that you feel like it's reliable,

0:41:12.960 --> 0:41:16.960
<v Speaker 1>even though ultimately it's not. That's another issue. It's related

0:41:16.960 --> 0:41:20.279
<v Speaker 1>to what we're talking about. And it also means that

0:41:20.360 --> 0:41:23.160
<v Speaker 1>as a student, or as a business writer or as

0:41:23.239 --> 0:41:25.640
<v Speaker 1>a lawyer, as one person found out earlier this year,

0:41:26.160 --> 0:41:30.200
<v Speaker 1>you should not rely on generative AI as your one

0:41:30.280 --> 0:41:36.080
<v Speaker 1>and only source for anything. Generative AI has even

0:41:36.120 --> 0:41:41.680
<v Speaker 1>been found to fabricate quotations from people. Obviously that's not

0:41:41.760 --> 0:41:45.680
<v Speaker 1>good either. There are lots of issues here. Anyway. I

0:41:45.719 --> 0:41:47.360
<v Speaker 1>hope that was some food for thought for y'all. I

0:41:47.360 --> 0:41:51.160
<v Speaker 1>hope you're doing well. I will talk to you again

0:41:52.239 --> 0:42:01.160
<v Speaker 1>really soon. Tech Stuff is an iHeartRadio production.

0:42:01.480 --> 0:42:06.480
<v Speaker 1>For more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts,

0:42:06.640 --> 0:42:08.640
<v Speaker 1>or wherever you listen to your favorite shows.