Speaker 1: Welcome to Tech Stuff, a production from iHeartRadio. Hey there, and welcome to Tech Stuff. I'm your host, Jonathan Strickland. I'm an executive producer with iHeartRadio, and how the tech are you? I'm here to tell you something: you write like a robot. But that's okay, because I do too. One of the founding fathers of the United States, James Madison, wrote like a robot. Robots weren't even a thing when he was writing back in the eighteenth century, all right, so really, I guess it's more fair to say that robots write like us. And while I'm having a little bit of fun using the word robots, what I'm really talking about is generative AI, you know, stuff like ChatGPT and Google Bard, that kind of thing. These AI-powered chatbots write like humans, right? That's one of the big selling features of the chatbots: one, that they can understand a prompt we give them, that they can understand what we mean when we give them a prompt; and two, that they then generate a response as if it had been written by an actual person. But obviously this also creates some challenges, some issues. So you might remember that since ChatGPT became publicly available last year, when OpenAI opened it up and let people start playing with it, there were people in education, teachers and administrators, that sort of thing, who raised the alarm about the possibility that students could use ChatGPT and similar tools to auto-generate essays and stuff, and thus bypass school assignments. "My robot wrote it for me." Beyond the education sector, there are plenty of arenas where people are worried that the less scrupulous folks out there will attempt to pass off AI-generated text as their own writing, whether this is creative writing or business writing, whatever it may be.
Speaker 1: So this then leads us to the concept of AI writing detection tools, you know, some sort of tool to determine if a piece of text originated from a real human being or from that character Haley Joel Osment played in that film about artificial intelligence. I forget what that movie was called. Subsequent to the release of these detection tools, we started hearing reports of teachers failing students, sometimes an entire class of students, because the detection tool indicated that the real source of the works being turned in wasn't the students, but AI. Now, a lot of students have actually come forward to argue that no, no, they wrote those pieces themselves, that they authored that work, they didn't use AI to do it, and that they are the victims of false positives, that these writing detection tools made a mistake. And as it turns out, at least some of them, and likely a lot of them, were telling the truth. And we can say that because these AI writing detection tools have abysmal accuracy rates. They can be worse than chance. That's how bad these tools can be. The success rate for an AI writing detector can be so low that it has led some of the companies to shut them down, and it has led a lot of critics to just dismiss the concept of an AI writing detection tool entirely. In fact, there are quite a few who have argued that AI writing detection tools are essentially snake oil: that there are companies claiming to make reliable tools that can tell the difference between text written by a person and text written by AI, but really they're just peddling a hoax or a scam, trying to make money selling these tools to various organizations like schools and such, when in fact those tools don't work, or at least they don't work very well.
Speaker 1: Even OpenAI, the company responsible for ChatGPT, had a tool that was meant to detect whether or not something was written by AI. It was called AI Classifier, but they shut it down earlier this year. Why? Because its accuracy rate was twenty-six percent. Twenty-six percent accurate. That is bonkers. That means nearly three quarters of the time, that detection tool came up with the wrong answer. Either it gave a pass to an AI-generated piece, or it accused a work that a human being actually wrote, like definitively wrote, of being the product of AI. This brings us to James Madison. James Madison wrote the US Constitution, and folks have fed the US Constitution into these AI writing detection tools and received a notification that the piece was very likely written by AI, which obviously led to lots of jocularity on the Internet, as people said, "I knew it. I knew the founding fathers of the United States of America were really robots from the future, sent back in time to create an ultra-capitalist society that preys upon the disenfranchised," or something like that. There were a lot of jokes about it. But the fact is, no, it's just that this writing detection tool is completely unreliable. So you certainly cannot use these kinds of tools to justify flunking an entire class of students when you know the reliability is so low. Now, I decided to do this short episode about AI writing detection tools after reading a couple of great pieces in Ars Technica. Those of y'all who listen to my show frequently know that I often reference Ars Technica, because the folks there reliably post great articles. In this case, the author of both pieces I read was Benj Edwards, that's B-E-N-J, Edwards. And at some point I probably should reach out to them and ask if they would like to join Tech Stuff for an episode to talk about something like generative AI, because Edwards has done some really good work.
Speaker 1: Anyway, as we think about how this generative AI works, the underlying technology that powers it, we start to see why there's this big reliability problem, why we're having such issues with an automated detection tool really determining whether something was written by a person or by AI. It's because tools like ChatGPT are built on top of large language models, also known as LLMs. And if we take a moment to really understand LLMs, then we start to get a handle on why these detector tools are so unreliable. So first off, let's actually talk about a precursor to large language models. This would be recurrent neural networks, or RNNs. Now, I've talked a lot about neural networks on this show, but just as a refresher: a neural network is an attempt to create a computer system or computer model that processes information in a way that is similar to how our brains process information. So you have layers of artificial neurons, or you can think of them as nodes. These layers connect to other layers of artificial neurons; you have multiple connections from each neuron to other neurons, and you have layers that go from top to bottom. You can think of it like this: at the top is where you put input, and at the bottom is where you get output. So essentially, you feed information into the model, and the information goes through a series of operations in which data passes through these different nodes. The nodes make decisions based upon the input, then send output to different nodes, and eventually you get the ultimate output. Sometimes that output is correct; it gives you the answer that is correct. Sometimes it's wrong. And typically what that means is that you then have to adjust how those artificial neurons are making decisions. Those neurons apply a sort of bias to input, we call it a weight, so they will favor some types of input over others in an effort to make a decision.
Speaker 1: If they didn't, the data would never go anywhere. You would never be able to have it processed through the system. So the weighting affects how the neuron actually processes the data, where it passes it on to. It may say: if the value is greater than X, send to node A; if the value is less than X, send to node B. That could be a very basic weight, X being the weight in that case, and maybe that would lead you to a correct outcome. So by adjusting the weighting, you can change how these neurons make decisions. And if you build a neural network for a purpose, let's give it a hypothetical, let's say it's identifying pictures of cats (it's always my go-to), and you start looking at the output and you see that it is mistakenly saying that pictures of flowers are pictures of cats, you would say, all right, the nodes in this artificial neural network are making the wrong decisions. The weighting is wrong in these nodes. I need to go and start adjusting things so that I can get it back to correctly saying whether or not an image has a cat in it. And your goal is to train this model over and over and over again until it gets better and better at this task, so that you can then send it any raw data you like and not have to worry about checking up on it afterward, because its accuracy level will be high enough to be reliable. That's your ultimate goal, but there's a whole process of learning, of training, that you have to go through first.
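Speaker 1: Just to make that weighting idea concrete, here's a bare-bones Python sketch of a single artificial neuron, with a crude update rule for when it gets the answer wrong. Everything in it, the made-up "cat features," the numbers, the learning rate, is invented for illustration; real networks stack many layers and use much fancier training, but the flavor is the same.

```python
# A toy "neuron": weighted inputs, a threshold decision, and a crude
# weight nudge whenever the answer is wrong. Purely illustrative.

def neuron(inputs, weights, threshold=0.5):
    # Weighted sum: the "bias" that favors some inputs over others.
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0  # 1 = "cat", 0 = "not a cat"

# Made-up training data: two fake features per image, label 1 = cat.
examples = [([0.9, 0.1], 1), ([0.8, 0.3], 1), ([0.1, 0.9], 0), ([0.2, 0.7], 0)]
weights = [0.0, 0.0]

# Training loop: when the neuron is wrong, shift the weights toward
# the inputs that should have mattered more.
for _ in range(20):
    for inputs, label in examples:
        error = label - neuron(inputs, weights)
        weights = [w + 0.1 * error * x for w, x in zip(weights, inputs)]

print(weights)                       # the learned weighting
print(neuron([0.85, 0.2], weights))  # should now say 1: "cat"
```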
Speaker 1: Now, a recurrent neural network is a little more specific than just an artificial neural network. Recurrent neural networks use sequential data: these networks can and do take information from earlier inputs into consideration when processing a new input. There's a different model, the convolutional neural network, or CNN (not the news channel), which is the other big type of neural network, where every time data goes into an input, it's like a blank slate. It's its own thing; nothing about that decision is based upon any past decision. It's an instance-by-instance kind of case, so you're starting from scratch. But with recurrent neural networks, the network can actually incorporate past inputs as part of how it processes a current input. One issue with these types of networks, the recurrent neural networks, is that they need a full sequence before they can produce a result. So when we're talking about text, like if we wanted to process text through a recurrent neural network, it would need to work over the entire text before producing a result in order to understand things like context. Sometimes this approach can lead to errors, because the model essentially forgets the stuff that was at the beginning of the text by the time it gets to the end. Which sounds a lot like me, honestly, where I'll have a discussion with someone about a book that we've both read, and they'll be like, "Oh, I liked that part early in the book, blah blah blah, and it pays off much later," and meanwhile I'm thinking, I totally forgot that happened earlier in the book. I remember where we ended up, but I don't remember how we got there. Recurrent neural networks can fall into the same sort of trap, and that creates a bit of a hurdle when it comes to things like analyzing text for the purposes of building natural language systems. But I'll explain how that all started to change in twenty seventeen. First, however, we need to take a quick break to thank our sponsors.

Speaker 1: Okay, before the break, I was talking about recurrent neural networks and how those have certain limitations when it comes to the way they process data, because it has to be sequential.
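Speaker 1: To picture that fading memory, here's a tiny Python sketch of a recurrent step. The weights are just numbers I picked; the point is that each new hidden state is a blend of the current input and the previous hidden state, so the trace of an early input shrinks a little with every step.

```python
import math

# One bare-bones recurrent step: mix the previous hidden state with
# the current input. All the numbers here are invented for illustration.

def rnn_step(hidden, x, w_hidden=0.5, w_input=1.0):
    return math.tanh(w_hidden * hidden + w_input * x)

sequence = [1.0, 0.0, 0.0, 0.0, 0.0]  # one "loud" early token, then silence
hidden = 0.0
for step, x in enumerate(sequence):
    hidden = rnn_step(hidden, x)
    print(f"step {step}: hidden state = {hidden:.4f}")

# The first input's influence decays at every step: the network
# remembers where it ended up, but not how it got there.
```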
Speaker 1: Well, in twenty seventeen, a group of AI researchers working over at Google came up with an alternative to this approach, and they published a paper titled "Attention Is All You Need," in which they suggested that you could do something different from the recurrent neural network approach for the purposes of analyzing stuff like text. Their approach was what they called a transformer model. While your old RNN would analyze text essentially a character at a time, not even a word at a time, but a character at a time, so the sequential data is character by character, building it up and then analyzing the whole thing, the transformer model instead would tackle a sentence as a unit, as opposed to a character, or even an entire passage of text would be a single unit. And so it would analyze this to understand the context of what was being said, and that's a huge benefit. Getting a handle on context is absolutely critical to understanding what someone means, because words can have multiple meanings, right, and without context we can't really be sure which meaning someone intended. So here's an example: the English word "late." That can mean a lot of things if you're an English speaker. If you're talking about the time of day, when you say it's late, you usually mean it's getting close to nighttime. You could say it's late at night, which means it's actually close to morning time, or maybe it even is the morning, because it's still dark, and so you think of it as night, but technically speaking it's morning and you're just saying it's late at night. If you're saying somebody is late, you could either mean they are not on time for some appointment, or, tragically, you could mean that this is a person who has passed away: they are "the late" so-and-so. But you need the rest of the sentence. You need that context to understand which meaning of "late" was actually intended.
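Speaker 1: The mechanism at the heart of that paper is called attention, and stripped way down, it looks like this. A minimal Python sketch, assuming random stand-in vectors instead of real word embeddings: every word scores every other word in the sentence, and each word's output becomes a weighted blend of the whole sentence. That blend is what carries context, so the representation of "late" ends up shaped by whether "appointment" or "the late Mr. Smith" is sitting next to it.

```python
import numpy as np

# Scaled dot-product self-attention, reduced to a toy. Q, K, and V
# all come from the same sentence here (that's the "self" part).

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # word-vs-word relevance scores
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V, weights                     # blended, context-aware vectors

rng = np.random.default_rng(0)
tokens = ["she", "was", "late"]
X = rng.normal(size=(3, 4))  # one random 4-dim stand-in vector per token

output, weights = attention(X, X, X)
for token, row in zip(tokens, weights):
    print(token, np.round(row, 2))  # how much this word attends to every word
```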
Speaker 1: So you need that contextual vision to be able to understand the whole thing. Transformer models began to revolutionize certain types of AI applications, specifically in the realm of natural language processing and generative AI, and that's what led to the development of large language models, the LLMs. Essentially, a large language model is just a huge transformer model. And to make a large language model, you need a lot of text to train your model, like a lot a lot. OpenAI trained its large language model, known as GPT, which stands for Generative Pre-trained Transformer, on countless documents, millions and millions of documents found across the web. Some authors allege that the training material included copyrighted works, and that the authors did not give permission for those works to be part of the information fed into the model. That leads into its own set of problems, which are a little beyond the scope of what I'm talking about today, but they are big problems and they're ongoing. Now, Stephen King argued that his works were clearly used to train up large language models. A dead giveaway is if you ask a chatbot built on top of a large language model to recite passages from a specific author's works, and it can do that accurately, like it's really giving you an accurate representation of that text. Yeah, there's no way it could have produced that without having trained on the original text at least somewhere. Now, if it's just making stuff up, that's different. That falls into the category of hallucinations, which we might touch upon again before we finish out this episode.
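Speaker 1: That giveaway test is easy to sketch, at least in outline. This is just a rough illustration: the `generate` function below is a hypothetical stand-in for whatever chatbot you're probing, not a real API call, and the similarity ratio is a crude measure. But near-verbatim recall of a long passage is hard to explain any other way than training exposure.

```python
import difflib

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a chatbot call; plug a real one in here."""
    raise NotImplementedError

def memorization_score(opening: str, real_continuation: str) -> float:
    # Prompt the model with the opening of a known passage, then compare
    # its continuation to the real text. A ratio near 1.0 means
    # near-verbatim recall of the original.
    model_text = generate(opening)
    return difflib.SequenceMatcher(None, model_text, real_continuation).ratio()
```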
Speaker 1: Anyway, the benefit of feeding so much information to a transformer model is that the transformer model, the large language model, gets pretty darn good at sussing out context. Even stuff that you would expect to trip up an AI chatbot can become a breeze. You know, you might think slang or idioms could trip up an AI tool, but then you have to remember that these tools rely on essentially all the stuff that's on the Internet, at least all the stuff that's publicly available, that's not locked behind something, and maybe even some stuff that is locked behind stuff, as it turns out. And as such, that means these models have trained with data sets that originate from the same communities that are creating the culture that generates certain slang and idioms in the first place. So if your AI model is using the same source material where these turns of phrase and certain slang terms are originating from, well, of course it's going to understand them, because that was part of its training. It has that grounding. It's not like me, where I am old. I don't understand the slang the kids use these days, because I'm not in those communities. You wouldn't expect me to understand. I am definitely the stereotypical out-of-touch old dude. So when I hear about, you know, people rizzing each other up, I'm like, wait, what? And I have to look things up. And as we all know, Urban Dictionary is not the most reliable of resources. It is frequently entertaining, usually in a way that is incredibly offensive, but it's not always accurate. Anyway, this ultimately starts to lead us to why these AI writing detection tools are not very good. The material that AI generates is built upon how we communicate. It's built on how we write. That's how it was trained. So it's not like AI, or robots, as I was facetiously saying earlier in the episode, has a different path toward writing than we do. The AI is not following an established set of rules that's unique to AI, right? No one is saying, "Write this like artificial intelligence." So the stuff that AI produces can come across as very human, and vice versa.
Speaker 1: Now, this does not mean it is absolutely impossible for someone like a teacher to tell if something was written by AI or by a student. If the teacher is actually really familiar with the writing style of the student or students in question, it's entirely possible that the teacher might notice if that writing style were to suddenly, and maybe significantly, change between assignments. This can be a big ask, by the way, for certain teachers, because class sizes can get huge depending on where you are, and if you're talking about an overworked English teacher who's teaching multiple classes, and each class has got, you know, thirty kids in it, it can be hard to really build up a working knowledge and memory of the writing styles of every single person in every single class. But that is one way teachers can tell. If a teacher reads an essay and thinks, "Wow, you know, Robert didn't write like this in the essay we did last month; this is a very different approach to writing," perhaps that's an indicator that someone else wrote the piece, whether that was AI or, you know, maybe another human being, and that can be an indication that something hinky is going on. Also, I mean, obviously some people get sloppy. This happens a lot too, when people just aren't paying attention as they're using AI to generate, you know, an educational assignment or business writing or whatever. There have been so many examples of people accidentally copying and pasting not just the body of the text, but stuff that's outside the body of the text. It might even be a little disclaimer saying it was made by AI, or it could be a command like "regenerate response." That's something you find in certain chatbots, and "regenerate response" just means, hey, can you create a new AI response to the initial prompt I gave you? So I wrote a prompt, I had you generate a response, and now I want you to create a whole new response based on that original prompt.
Speaker 1: If you have "regenerate response" written in your essay, that's a dead giveaway that you copied and pasted that essay off of an AI chatbot. So there are ways teachers can tell the difference, but it's not as granular as saying, "Oh, this is clearly something that was written by artificial intelligence, versus this was written by a human." It's more like, "This is different from what I have received before from this particular student," or, "This contains obvious errors that reveal the student has used AI." Now, the AI writing detection tools are at least claiming to use a couple of strategies to try to determine if something was written by AI or by a human. They're saying, we can automate that process; we can actually analyze a block of text and give you a determination as to whether it was made by AI or by a human, which suggests that maybe there is some sort of fundamental difference between the way AI generates content and the way people do. But these strategies that the AI writing detection tools are built upon have fundamental flaws, and we know that because we know the tools are bad. It was bad enough for OpenAI to shut down its version back in June. So this isn't just us postulating that these tools are bad. We know they're bad. We know they create things like false positives. So knowing already that they are unreliable, you then have to start asking, well, why are they unreliable? What are the things that are leading these tools to make these wrong determinations? And when we come back, I'll talk about how Benj Edwards, in those Ars Technica articles, really digs into two main concepts that these writing detection tools use to try to make a determination, and why they are fundamentally flawed. But first, let's take another quick break.
Speaker 1: So before the break, I mentioned that I was going to talk about some strategies that Benj Edwards outlines in his Ars Technica articles, and they fall into two categories. The first is called perplexity, and that really means: how surprising or perplexing are the word choices, how creative are the sentences in a given piece of text, compared to what an AI language model would expect? The thinking behind this is that if a block of text seems to conform to the same sort of stuff the language model would produce, then AI probably created the text. That's the idea. They're essentially saying that if the text is really similar to what AI would create, then AI probably created it. Let's think about how some tools use autocomplete to help you write a text or a sentence, using a purely hypothetical scenario to kind of get our minds wrapped around this. Let's say you were typing into something that has autocomplete built into it the sentence, or the phrase, "I'm going to go for a...," and whatever tool you're typing into suggests the word "walk" as an autocomplete option. Well, that would be because the language model powering this autocomplete function has sampled millions of passages, millions and millions and millions of documents, and has found that "walk" is the most common word to follow the phrase "I'm going to go for a...," and so it offers that as the suggestion. And maybe it would even offer you a few options. Maybe it would say "walk," maybe it'd say "swim," maybe in the UK it'd say "curry," who knows. It would give you maybe a couple of different options, but they would be the ones most likely to follow that phrase, based upon the training material that large language model had used to build itself up.
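Speaker 1: You can get surprisingly far on that idea with nothing but counting. Here's a little Python sketch of the autocomplete intuition, using a tiny made-up corpus; real language models are enormously more sophisticated than a follower count, but the spirit is the same.

```python
from collections import Counter

# A tiny, invented "corpus." Real models train on millions of documents.
corpus = [
    "i'm going to go for a walk",
    "i'm going to go for a walk in the park",
    "i'm going to go for a swim",
    "i'm going to go for a curry",
    "i'm going to go for a walk with the dog",
]

def suggest(prefix, texts, k=3):
    # Count every word that follows the prefix anywhere in the corpus.
    followers = Counter()
    p = prefix.split()
    for line in texts:
        words = line.split()
        for i in range(len(words) - len(p)):
            if words[i:i + len(p)] == p:
                followers[words[i + len(p)]] += 1
    return followers.most_common(k)

print(suggest("go for a", corpus))  # [('walk', 3), ('swim', 1), ('curry', 1)]
```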
Speaker 1: Right, so if you were to measure the perplexity of the sentence "I'm going to go for a walk," it would be very, very low, very low perplexity, because it's in line with what the language model would expect. So the thought is, if a passage in general has very low perplexity, these tools tend to suspect that the passage as a whole could have come from AI. Now let's say it had very high perplexity instead. Let's say that instead of saying "I'm going to go for a walk," you said "I'm going to go for a zebra," or "zeh-bra" if you're in the UK. Well, one, that doesn't really make any sense, but two, that would be very perplexing, right? That would not be something the large language model would expect. And so if it has high perplexity, then the writing detection tool is more likely to say this was written by a human, because what generative chat system would have made that sentence? It's like, no sane robot would say "I'm going to go for a zebra." Clearly some human wrote this. Now, the problem is, these companies are training their large language models on enormous amounts of human-generated text. And unless you're purposefully trying to be really original in your writing, and that's a kind way of saying you're being a weirdo as you're writing your sentences, chances are a lot of the stuff you're writing is going to have fairly low perplexity. Unless you're trying to write in, like, the milieu of humor or absurdity, unless you're purposely trying to do that, chances are your perplexity is going to be pretty low too. Particularly for very structured writing, like business writing or academic writing, that perplexity is going to be very low.
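Speaker 1: If you want the arithmetic, perplexity boils down to this: take the probability the model assigned to each word as it actually appeared, average the log-probabilities, and exponentiate the negative average. The probabilities below are invented for illustration, but they show the shape of it: text the model saw coming scores low, and one out-of-nowhere word sends the number soaring.

```python
import math

def perplexity(word_probs):
    # Average log-probability of the observed words, then exp(-mean).
    avg_log_prob = sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(-avg_log_prob)

# "...go for a walk": every word is unsurprising to the model.
print(perplexity([0.9, 0.8, 0.9, 0.7]))     # roughly 1.2 (very low)

# "...go for a zebra": the last word came out of nowhere.
print(perplexity([0.9, 0.8, 0.9, 0.0001]))  # roughly 11 (much higher)
```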
Speaker 1: So unless you're prone to throwing in very odd, random, weird sentences, like "William Shakespeare's Othello is one of the great tragedies of English theater, and also I enjoy shoving hot dogs through mail slots," well, there's a pretty good chance that an AI detector tool is going to think your human-written, legitimate essay was in fact an AI's work, because the perplexity would likely be pretty low, again, unless you're doing something really avant-garde. So there's a fundamental flaw in the logic of using perplexity as one of your metrics for determining whether something was written by AI versus a human. Benj Edwards also goes on to explain that another factor these AI detection tools take into consideration is one called burstiness. Perplexity and burstiness: makes me feel like I've fallen into a Lewis Carroll novel. But anyway, burstiness really has to do with variability, particularly between sentences. So y'all have probably noticed I have a tendency toward really long sentences, often with a lot of parentheticals thrown in there. Now, if I also incorporate short sentences on occasion, breaking up these very long sentences, this creates a lot more variety, a lot more dynamic elements between my sentences, because I'm switching back and forth between these very long, pontificating sentences and then short ones to make a point. Maybe in some sentences I use tons of adverbs to describe action; maybe in the next sentence I don't use any adverbs at all. This is what creates that variability. The conventional wisdom is that AI-generated work is more uniform. It's more consistent; it has less variability from sentence to sentence. The sentence length and complexity are going to remain more or less the same throughout an entire passage. So if you're able to quantify how dynamic a writing style is, the thinking goes, you could potentially determine whether a human or an AI wrote that specific piece. If it's not very dynamic, well, that leans more toward AI.
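Speaker 1: Burstiness is even easier to caricature. Here's a crude Python sketch, assuming one very simple proxy for variability, how much sentence length swings around its average; real detectors use fancier measures, but the intuition holds: uniform sentences score near zero, a mix of long and short scores high.

```python
import statistics

def burstiness(text):
    # Split on sentence-ending punctuation (crudely) and measure how
    # much sentence length varies relative to its average.
    cleaned = text.replace("!", ".").replace("?", ".")
    sentences = [s.strip() for s in cleaned.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

uniform = "The cat sat down. The dog sat down. The bird sat down."
varied = ("The cat, having surveyed the entire living room with evident "
          "disdain, finally sat down. The dog did too. Chaos.")

print(burstiness(uniform))  # 0.0: perfectly uniform sentences
print(burstiness(varied))   # noticeably higher: long and short mixed
```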
Speaker 1: But that approach depends upon a couple of things that are not always reliable. So first up, it assumes that AI-generated content is going to continue to show more consistency than the stuff humans write, that it's going to keep taking this very consistent approach to sentence structure. But the language models, and the generative AI built on top of the language models, are growing more sophisticated all the time. A lot of the companies that make these language models are mining platforms like X, formerly known as Twitter, or Reddit in order to train their language models. They're reading these sort of idiosyncratic messages that people write. Sometimes people are writing purposefully in a way that is not consistent, and it can get to be a little unpredictable. Well, if you're training your language model on these things, then over time the language models, and the tools built on top of them, begin to reflect that training material. It means we should expect generative AI to start increasing variability between sentences, because that's what we're training it on. You can't expect to train it on one thing and have it generate something totally different. It's going to mimic the material that was used to teach it in the first place. So that means you're going to see a reduction in the gap between how AI creates text and how humans do. But on top of that, again, for certain types of writing, human authors may take a more structured approach, and they may purposefully reduce variability between sentences, or unconsciously reduce variability. That means their writing is going to start looking more like the stuff these writing detection tools assume is a marker for AI-generated content. If I were to write a term paper, I would probably take a more consistent, uniform approach to my writing style. That's not to suggest that would be the right choice, right? I'm not saying that if you write a term paper you need to have this very consistent, uniform approach, because papers written in a style like that can get really boring to read.
Speaker 1: But that would probably be my inclination. Like, thinking in my head, I'd be: I want to make sure I'm consistent, I'm academic, I am thoughtful, I'm methodical. That means the work I would produce would have this low burstiness, because I was purposefully doing it. Even if that was the wrong decision, it would probably be the one I would make, because I'd be working under the mistaken belief that this is somehow more academic. So these AI writing detection tools are looking for text that has low burstiness and low perplexity before suggesting that AI created that particular block of text. But as we've talked about, humans write in that kind of style too, particularly for formal writing, and so you get a lot of false positives, like if you feed the US Constitution to a writing detection tool and it says, "Well, AI wrote this." Well, a lot of stuff has been written about the Constitution, including passages from the Constitution itself. The Constitution is clearly available on the web, so it's obviously part of these large language models' training sets, and of course the models are going to reflect what's in the training set. It was literally incorporated into them. So if you're working backward from that logic, then your conclusion, "Oh, AI wrote this, because it reflects what the language model was trained on," well, yeah, but that's because the language model was literally trained on the material you were analyzing. It becomes a sort of catch-22 situation. So, in large part, we cannot rely on these detection tools.
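Speaker 1: Put those two signals together and you get something like the sketch below, a caricature of the decision rule, with thresholds I made up entirely. And that's rather the point: careful, formal human writing (low surprise, uniform sentences) lands on the "AI" side of lines like these, which is exactly how a document like the Constitution gets flagged as machine-written.

```python
def naive_ai_detector(perplexity_score, burstiness_score):
    # Invented cutoffs: real tools tune these, but the logic is similar.
    PERPLEXITY_CUTOFF = 3.0
    BURSTINESS_CUTOFF = 0.4
    if perplexity_score < PERPLEXITY_CUTOFF and burstiness_score < BURSTINESS_CUTOFF:
        return "likely AI-generated"
    return "likely human-written"

# A careful, formal, human-written essay can easily score low on both,
# producing exactly the kind of false positive described above.
print(naive_ai_detector(perplexity_score=1.4, burstiness_score=0.2))
```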
Speaker 1: Now, this doesn't even touch upon the challenges that non-native English speakers face with their writing. When they're writing in English and these AI detection tools are used on their work, they can face disproportionate bias from these detection tools: they get a lot more false positives. So you're already seeing a lot of false positives anyway, because, as we've discussed, the criteria being used by these AI writing detection tools are faulty. They assume that humans are not writing in those styles, when in fact they are, and that AI is writing in one specific style, when in fact, at least over time, it migrates away from that. So you've got a double whammy here. Now, there are some applications of AI detection tools where it works and it makes sense, just not in writing, but for stuff like photo or video manipulation. AI detection tools can still look for telltale signs that can indicate that maybe what you're looking at has, at least in some part, been created by a generative AI tool, like an image creation tool. Obviously, there are examples of this where you take one look and you know immediately that it was made by AI, because you look at it and you're like, "No one has that many fingers on one hand." But there are other cases where it may be far more subtle to human perception, and yet if you were to analyze the image deeply with a very well-trained AI detection tool, it could indicate the image was made by AI because of little subtle things. Maybe it's inconsistent lighting; maybe it's the blinking pattern of a person in a video, things like that. Little things that would be hard for us to spot as human beings, but easy for a detection tool to spot. Those AI detection tools make sense. They're not necessarily foolproof or flawless, but they have a better success rate than the ones aimed at writing, because it's just not that clear-cut when we're talking about writing. This is unfortunate when teachers may rely heavily on AI writing detection tools to determine whether their students are actually doing their own work or not.
All of this is unfortunate, because teachers 565 00:36:58,719 --> 00:37:03,160 Speaker 1: may rely heavily on AI writing detection tools in order 566 00:37:03,200 --> 00:37:05,839 Speaker 1: to determine if their students are actually doing their own 567 00:37:05,880 --> 00:37:09,560 Speaker 1: work or not. If the teachers are unaware that these 568 00:37:09,600 --> 00:37:13,320 Speaker 1: detection tools are unreliable, they can make some really drastic 569 00:37:13,360 --> 00:37:16,800 Speaker 1: decisions that will have a huge negative impact on their students' 570 00:37:16,840 --> 00:37:21,239 Speaker 1: work and lives, and that's not really fair. Hopefully, the 571 00:37:21,400 --> 00:37:26,960 Speaker 1: educators out there are themselves educating themselves, to be repetitive, 572 00:37:27,840 --> 00:37:33,920 Speaker 1: about these tools and their unreliability, because otherwise they're going 573 00:37:33,960 --> 00:37:37,880 Speaker 1: to be punishing students and they can't justify it, because 574 00:37:38,600 --> 00:37:40,840 Speaker 1: it's all based on a tool that has proven to 575 00:37:40,920 --> 00:37:45,799 Speaker 1: be unreliable from the get go. Unless, of course, we're 576 00:37:45,800 --> 00:37:49,680 Speaker 1: talking about instances where someone has copied and pasted some 577 00:37:49,880 --> 00:37:54,000 Speaker 1: ridiculous part of an AI generated response that just gives 578 00:37:54,040 --> 00:37:59,200 Speaker 1: it away. That's a different case entirely, obviously. But yeah, 579 00:37:59,440 --> 00:38:03,080 Speaker 1: I think it's important to understand the limitations of these tools. 580 00:38:04,120 --> 00:38:07,400 Speaker 1: As we explore generative AI, and we look at the 581 00:38:07,440 --> 00:38:10,960 Speaker 1: pros and the cons, and we consider the impact that 582 00:38:11,000 --> 00:38:15,000 Speaker 1: generative AI has on multiple segments of our lives, we 583 00:38:15,080 --> 00:38:18,840 Speaker 1: also have to really think about how do we know 584 00:38:19,080 --> 00:38:22,560 Speaker 1: when it's in use, and how do we know that 585 00:38:22,600 --> 00:38:25,640 Speaker 1: the tools we're using to make those determinations are actually 586 00:38:26,400 --> 00:38:29,960 Speaker 1: good tools. In the case of these AI writing detection tools, 587 00:38:30,640 --> 00:38:34,000 Speaker 1: it looks to me like you might as well not 588 00:38:34,080 --> 00:38:37,880 Speaker 1: even look at them. You are more likely than not 589 00:38:37,960 --> 00:38:43,959 Speaker 1: to get an incorrect answer, because, again, we train these 590 00:38:44,719 --> 00:38:48,239 Speaker 1: generative tools to communicate very much the way humans do, 591 00:38:48,280 --> 00:38:51,920 Speaker 1: at least in certain use cases, and those use cases 592 00:38:51,960 --> 00:38:54,239 Speaker 1: typically are the ones where we're most concerned about whether 593 00:38:54,320 --> 00:38:56,320 Speaker 1: or not AI was put to use in the first place. 594 00:38:56,880 --> 00:39:00,600 Speaker 1: So, really interesting articles over on Ars Technica. They lead 595 00:39:00,640 --> 00:39:06,360 Speaker 1: to this really deep discussion about generative AI and the limitations 596 00:39:06,360 --> 00:39:10,200 Speaker 1: that we have in detecting it. And obviously there are 597 00:39:10,280 --> 00:39:12,239 Speaker 1: a lot of other things we could touch on. I 598 00:39:12,280 --> 00:39:17,600 Speaker 1: mentioned copyright. That's a big one, because if AI can 599 00:39:17,880 --> 00:39:24,200 Speaker 1: regurgitate copyrighted works with no flaws, then that can be 600 00:39:24,400 --> 00:39:29,839 Speaker 1: a huge blow to authors, for example. Or, we talked 601 00:39:29,840 --> 00:39:33,120 Speaker 1: a little bit about hallucinations.
Hallucinations are when an AI 602 00:39:33,880 --> 00:39:39,560 Speaker 1: tool does not have the information to be able to 603 00:39:39,600 --> 00:39:42,960 Speaker 1: determine what should come next in a sentence. You have 604 00:39:43,040 --> 00:39:46,760 Speaker 1: to remember, when you really boil it down, these generative AI 605 00:39:46,880 --> 00:39:49,719 Speaker 1: tools, what they're doing is they're following a very 606 00:39:50,000 --> 00:39:55,600 Speaker 1: sophisticated statistical model to determine what should come next in 607 00:39:55,640 --> 00:39:59,200 Speaker 1: their answer. So you give one a prompt and it's 608 00:39:59,440 --> 00:40:03,799 Speaker 1: referencing this incredibly complicated statistical model to say, all right, 609 00:40:04,640 --> 00:40:07,759 Speaker 1: what should I put as a response? Some of that 610 00:40:07,800 --> 00:40:11,480 Speaker 1: information involves things like the actual answers to questions, but 611 00:40:11,520 --> 00:40:14,960 Speaker 1: there are cases where the AI model may be unable 612 00:40:15,000 --> 00:40:18,400 Speaker 1: to identify what the answer to the question is, but 613 00:40:18,480 --> 00:40:22,120 Speaker 1: it still needs to answer your query. It doesn't have 614 00:40:22,200 --> 00:40:24,759 Speaker 1: the answer, so it makes one up, but following this 615 00:40:24,880 --> 00:40:28,880 Speaker 1: very sophisticated statistical model, so that the answer it generates 616 00:40:29,000 --> 00:40:32,719 Speaker 1: appears to be valid even though it's just completely made up. 617 00:40:32,800 --> 00:40:36,279 Speaker 1: This is what we call hallucinations in AI. It's when 618 00:40:36,320 --> 00:40:41,240 Speaker 1: AI generates an answer in order to respond to a query, 619 00:40:41,880 --> 00:40:46,040 Speaker 1: but that answer is fabricated. It's a confabulation. That's another 620 00:40:46,040 --> 00:40:49,719 Speaker 1: word that some people are using rather than hallucination. And 621 00:40:50,840 --> 00:40:53,840 Speaker 1: it comes across as being very much legitimate, because, again, 622 00:40:53,920 --> 00:41:00,000 Speaker 1: these very sophisticated statistical models make it seem authoritative and knowledgeable. 623 00:41:00,719 --> 00:41:03,000 Speaker 1: The way the sentences are structured, it doesn't come across as 624 00:41:03,000 --> 00:41:06,880 Speaker 1: wishy washy. It's not like, maybe it's blah blah blah. 625 00:41:06,920 --> 00:41:10,560 Speaker 1: It ends up being, it's blah blah blah, and it's presented 626 00:41:10,560 --> 00:41:12,880 Speaker 1: in such a way that you feel like it's reliable, 627 00:41:12,960 --> 00:41:16,960 Speaker 1: even though ultimately it's not. That's another issue. It's related 628 00:41:16,960 --> 00:41:20,279 Speaker 1: to what we're talking about. And it also means that, 629 00:41:20,360 --> 00:41:23,160 Speaker 1: as a student, or as a business writer, or as 630 00:41:23,239 --> 00:41:25,640 Speaker 1: a lawyer, as one person found out earlier this year, 631 00:41:26,160 --> 00:41:30,200 Speaker 1: you should not rely on generative AI as your one 632 00:41:30,280 --> 00:41:36,080 Speaker 1: and only source for anything. Generative AI has even 633 00:41:36,120 --> 00:41:41,680 Speaker 1: been found to fabricate quotations from people. Obviously that's not 634 00:41:41,760 --> 00:41:45,680 Speaker 1: good either. There are lots of issues here.
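[Editor's note: here is a minimal Python sketch of that "what should come next" loop, again using GPT-2 from the transformers library as a stand-in for larger models. The court case in the prompt is deliberately made up; the model will complete the sentence fluently anyway, which is the confabulation problem in miniature.]

```python
# Minimal next-token sampling loop: the model always emits a statistically
# plausible continuation, whether or not a true answer exists in its training.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# A fabricated case -- there is no "Smith v. Dataworks" -- yet the loop below
# still produces a confident-sounding completion rather than "I don't know."
prompt = "The landmark 1987 Supreme Court case Smith v. Dataworks held that"
ids = tokenizer(prompt, return_tensors="pt").input_ids

for _ in range(40):
    with torch.no_grad():
        logits = model(ids).logits[0, -1]   # scores for every candidate next token
    probs = torch.softmax(logits, dim=-1)   # convert scores to probabilities
    next_id = torch.multinomial(probs, 1)   # sample one plausible token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(ids[0]))  # fluent, authoritative... and fabricated
```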
Anyway, I 635 00:41:45,719 --> 00:41:47,360 Speaker 1: hope that was some food for thought for y'all. I 636 00:41:47,360 --> 00:41:51,160 Speaker 1: hope you're doing well. I will talk to you again 637 00:41:52,239 --> 00:42:01,160 Speaker 1: really soon. Tech Stuff is an iHeartRadio production. 638 00:42:01,480 --> 00:42:06,480 Speaker 1: For more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts, 639 00:42:06,640 --> 00:42:08,640 Speaker 1: or wherever you listen to your favorite shows.