Speaker 1: Welcome to TechStuff, a production from iHeartRadio. Hey there, and welcome to TechStuff. I'm your host, Jonathan Strickland. I'm an executive producer with iHeartRadio and I love all things tech, and today I want to tackle a really interesting, complicated, and potentially scary topic, and that is predictive text generation. And I know that sounds weird to say potentially scary, but you know, stick with me.

I'm sure many of you have seen social media posts that say things like type "I am the" on your phone and then generate a result using the middle option of predictive text. So, you know, just for example, if I did that on my phone, then I get "I am the only one who can help me with this." Ooh, too real, predictive text. I mean, I'm the only one who researches and writes these episodes. That's... it's way too real. But the whole meme of using predictive text to generate seemingly meaningful or, you know, sometimes wildly absurd phrases is just part of what I want to talk about today.

Now, the reason this topic jumped out at me is because of a recent news article that I read over on The Verge. The article, written by Kim Lyons, has the title "A college student used GPT-3 to write fake blog posts and ended up at the top of Hacker News." Now, as the headline indicates, a computer science student used a predictive text engine called GPT-3, a beta build of it in fact, which stands for Generative Pre-trained Transformer, and generated a blog that was featured on a site called Hacker News as if it were a piece written by a flesh-and-blood human being. What's more, a thread on Reddit showed that only a few people were picking up on the feeling that something hinky was going on, and that perhaps the blog post had not been written but generated.
Speaker 1: And Lyons goes on to point out that the fact that there's a lot of, you know, not very good writing on the Internet makes it a little harder to suss out a decent generated post as opposed to a written one. It's not so much that the AI has become super awesome at writing, but rather that we've kind of lowered the bar more than a little.

This kind of plays into the whole concept of a Turing test. So, just to go off on a tangent here, this isn't in my notes, I'm just going to speak off the cuff. The Turing test is named after Alan Turing, the famous computer scientist, and nowadays the idea has kind of evolved into this: you have a series of interviews that a person conducts over a computer, and some of the interviewees are people and some of them are, essentially, chat bots. The goal of this whole exercise is to see if the person who's doing the interview can consistently tell whether the entity on the other side of the interview is a person or a chat bot. And if the bot passes with a certain percentage, you would say that the chat bot has passed the Turing test, that people are unable to tell the difference between the chat bot and a real human being, and that this is kind of one of the markers for artificial intelligence. We're gonna be dipping into that sort of thing with this discussion as well.

So today I really wanted to dive into the whole concept of predictive text and how it's done and how it could absolutely destroy platforms like Facebook in the future. That's how I'm going to end this episode, so stick around, but we have to build on this gradually. So let's start at the very beginning, which, according to this woman who's singing outside my window, is a very good place to start. And we are going to start with a particularly tricky concept for a former English lit major to try and explain, and this is called a Markov model.
Speaker 1: It's named after a mathematician named Andrey Andreyevich Markov, who was born in Russia in eighteen fifty-six, and he did a lot of work on an area of mathematics called stochastic processes. But that just raises another question, right? What does stochastic mean? Well, a stochastic variable is one that is randomly determined. A stochastic system has a random probability pattern that you can study, but you can't predict it precisely. There's always uncertainty. So you can assign probabilities as to how the pattern will form, but those are just indications of how likely a particular pattern is to form, not a guarantee.

So let's take a very simple example, and let's pick something really random. Let's talk about my two-year-old niece. So let's say my niece is standing in the middle of a room and I walk in. Now, based on my past interactions with this random creature, I know my niece is likely to do one of three things. She is going to run at me and grab my hand, and then boss me around and put me someplace and tell me I have to stay there. Or she's going to run away from me and then hide and then demand very loudly that I come find her. She has not, I should add, quite grasped the concept of hiding. Or she is going to ignore me and sing and/or dance. Those are the things that she typically does. There are other things she might do as well, but they happen much less frequently.

So let's say I want to sketch out this scenario on paper. I might start with the scenario: my niece is in a room and I come into the room. Then I would draw little bubbles on my paper to represent the potential actions, or states as we would call them in a Markov chain, that could follow this input of me walking into the room. Now, based on the number of times I've seen her respond before, I could weight each of those states with a certain probability.
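That bubbles-and-weights picture translates almost directly into code. Here's a minimal Python sketch of the scenario as a Markov chain; the state names and the weights (which I'll walk through next) are made up for the example:

```python
import random

# A minimal sketch of the niece scenario as a Markov chain: one starting
# state and three outcome states. The weights are the illustrative
# numbers discussed in the episode, not real observations.
transitions = {
    "I walk in": [
        ("boss me around", 0.60),
        ("hide and demand I find her", 0.35),
        ("ignore me and sing or dance", 0.05),
    ],
}

def next_state(current):
    """Draw the next state at random, respecting the assigned weights."""
    states, weights = zip(*transitions[current])
    return random.choices(states, weights=weights)[0]

# Each call is an independent random draw; over many trials the outcomes
# drift toward the assigned probabilities, but no single draw is certain.
print(next_state("I walk in"))
```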
Speaker 1: If, for example, she runs at me and grabs my hand then bosses me around more than half the time, I can weight that outcome as, you know, sixty percent. And does that mean the next time I walk into a room that she's going to do that? No, each incident is random. I'm just illustrating how likely a particular outcome is going to be. I would then assign probabilities for the other two outcomes I outlined, and maybe just ignore all the outliers, and say that one of them is, you know, thirty-five percent likely, which means the third one is only five percent likely to happen, because it all has to add up to one hundred percent.

Now, the example I just gave is ridiculously simple, despite the fact that my niece is already incredibly complicated, and it just gives us the odds of one starting state, that is, me walking into a room, that then transitions into one of three outcome states. Markov models can have lots of variables, with some variables dependent upon the value of other variables. So you might see a chain as something like: if outcome A happens, and there's a sixty percent chance that it will, then there's a thirty percent chance that a subsequent outcome A3 will happen. And it can become a really complex branching path of possibilities, but we can stick with simple.

Let's take the coin flip, the classic example of a random variable. We know that the odds of a fair coin landing heads up are fifty percent, and the odds of it landing tails up are fifty percent. Flipping a coin many thousands of times should show that collectively you're gravitating towards those probabilities, that about half of your coin flips will be heads and the other half will be tails. But that does not mean you won't get on streaks where you flip heads over and over, à la Rosencrantz and Guildenstern Are Dead.
Speaker 1: And if you don't know that reference, I highly recommend that you read that play, or you watch the excellent film version that has Tim Roth and Gary Oldman in it, because it is fantastic and it kind of dives into a fun discussion of probabilities and what they actually mean.

Anyway, the odds of flipping a coin heads are fifty percent for a single coin flip. But what about a second coin flip? Well, if we look at just that flip in isolation, that second coin flip, it's still a fifty percent chance that it's going to land on heads. But if we frame it a different way, if we ask the question, what are the odds of flipping heads twice in a row? This is a different question, because you're not thinking about individual flips. You're saying, what are the odds of this happening twice sequentially? Well, now we have to take the odds of it happening once, which is fifty percent, and then we have to multiply it against itself, because it's a fifty percent chance again that it would happen the second time. So fifty percent of fifty percent is, let me do the math, twenty-five percent, or one in four. So if you were to do a pair of coin flips, and you were to repeat this experiment over and over and over again, over the long run you would find that twenty-five percent of those sequences would end up with heads followed by heads.

But what if we wanted to ask, what are the odds of flipping three heads in a row? Well, then we have to halve it again. So instead of one out of every four trials, we would see one out of every eight, or twelve point five percent. And we can keep extending this out. We can figure out the odds of some ridiculously long stretch of flipping heads in a row. Now, in Rosencrantz and Guildenstern Are Dead, we are told that it happens an astonishing ninety-two times in a row. That streak has a probability of about one in five octillion. That would be a five followed by twenty-seven zeros. This does not mean that it would be impossible, but it is unfathomably unlikely.
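For anyone who wants to check that streak math, a few lines of Python will do it:

```python
from fractions import Fraction

# Checking the streak math: the chance of flipping n heads in a row
# with a fair coin is one half multiplied by itself n times.
def heads_streak(n):
    return Fraction(1, 2) ** n

print(heads_streak(2))          # 1/4, twenty-five percent
print(heads_streak(3))          # 1/8, twelve point five percent
print(float(heads_streak(92)))  # about 2e-28, roughly 1 in 5 octillion
```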
Speaker 1: Clemson University has a useful lecture available online in the form of a presentation titled Introduction to Markov Models, and it uses weather forecasting as an example. Their example takes three initial states: sunny, rainy, and cloudy. Consequently, those are also the three potential output states, so each state can transition into three states, including transitioning into itself. You could go sunny to cloudy, sunny to rainy, or sunny to sunny; that's a valid result as well. And in their example, the idea is that we have, based on past observations, figured out the probability for specific forecasts based on whatever the current weather happens to be. So, for example, we've figured out that rain tomorrow is a lot more likely if it's raining today than it is if today is just cloudy or sunny.

Our model would need to have probabilities assigned to each pair of starting and ending states, so I'm going to follow through with that just for the purposes of this conversation. We've covered tomorrow being rainy; the example from Clemson also gives the other two outcome states. So if we're looking at the probability of tomorrow being cloudy, we see, based on our past observations, that if today is rainy, there's a thirty percent chance of cloudy tomorrow, and if today is cloudy, there's a fifty percent chance. And finally, if we want to know if it's going to be sunny tomorrow, and again this is all just based on the example, we see that if today is sunny, there's an eighty percent chance that tomorrow will be too. If today is rainy, it's just a five percent chance, and if today is cloudy, there's a fifteen percent chance. Now, the reason we need to know all of these probabilities will become clear in a second. And again, these are just examples; they don't reflect real data.
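Here's one way to sketch that table in Python. The figures marked "episode" are the ones quoted in this discussion (one of them, the twenty percent, comes up a little later); the rest are fill-ins I chose only so that each row sums to one, as the rows of a real transition matrix must:

```python
# The weather example as a transition table, where
# transition[today][tomorrow] = probability of tomorrow's weather.
# "episode" figures come from the illustration above; the others are
# assumed fill-ins to complete each row. None of it reflects real data.
transition = {
    "sunny":  {"sunny": 0.80, "rainy": 0.20, "cloudy": 0.00},  # 0.80, 0.20 episode; 0.00 forced by those two
    "rainy":  {"sunny": 0.05, "rainy": 0.65, "cloudy": 0.30},  # 0.05, 0.30 episode; 0.65 assumed
    "cloudy": {"sunny": 0.15, "rainy": 0.35, "cloudy": 0.50},  # 0.15, 0.50 episode; 0.35 assumed
}

# Sanity check: each row must be a complete probability distribution.
for today, row in transition.items():
    assert abs(sum(row.values()) - 1.0) < 1e-9, today
```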
Speaker 1: Markov got very clever and began to use math to describe probabilities for predictions that are further out than one state. So, for example, you might ask, what is the probability that, if today is cloudy, tomorrow will be sunny and the following day will be rainy? This is kind of similar to us asking what the odds are of flipping heads two or three times in a row, except we're looking at probabilities of weather that are based on what our current conditions happen to be.

So, using the example probabilities from that lecture, we would find that sunny days follow cloudy days just fifteen percent of the time, so there's a fifteen percent chance that tomorrow will be sunny if today is cloudy. And rainy days follow sunny days twenty percent of the time, so if tomorrow is sunny, there's a twenty percent chance the day after tomorrow will be rainy. So then, that means that if today is cloudy, we've got that fifteen percent chance tomorrow will be sunny, and if it is sunny, there's a twenty percent chance that the day after tomorrow will be rainy. So we have to multiply those probabilities together: fifteen percent times twenty percent, or point one five times point two. That gives us point zero three, which we convert to a percentage. That means there's just a three percent chance that, if today is cloudy, tomorrow will be sunny and the day after tomorrow will be rainy. Just a three percent chance of that happening. And the further out we try to predict a particular sequence of weather, the lower the probability will be. Meaning, you know, it could happen, it's not like it's impossible, but it gets less likely the further out we go from our initial state.
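And the chained math, reusing the transition table from the previous sketch:

```python
# Chaining steps with the transition table sketched earlier: the
# probability of one specific sequence of days is the product of the
# step-by-step transition probabilities.
def path_probability(transition, days):
    p = 1.0
    for today, tomorrow in zip(days, days[1:]):
        p *= transition[today][tomorrow]
    return p

# Cloudy today, sunny tomorrow, rainy the day after:
# 0.15 * 0.20 = 0.03, the three percent figure from above.
print(path_probability(transition, ["cloudy", "sunny", "rainy"]))
```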
Speaker 1: So a Markov model is a stochastic model that describes potential sequences, and it is temporal in nature. That means we are really concerned with the state of things and how those states will change over time, and it gives us a way to explain how current states depend upon previous states. It's not just about predicting the future, but also understanding the present. Why are things the way they are right now? And it gives us the chance to weigh predictions of the future based upon past observational data. This is why we see weather forecasts that give us percentages for rainy days. Like, a high chance of rain tells us that it's probably a good idea to bring an umbrella if we're going outside, because based on past observations, there's a decent chance it's going to rain today.

Now, let's get more complicated. What if we don't actually know the current state of the weather? Let's say that you are stuck inside and you can't see out a window, you have no windows in the room you're in, and someone else comes into your room, and you wonder: what's the weather like outside? Well, the only hint that we have in this scenario is whether the person who just came in is carrying an umbrella or not. We don't actually know the current state. We can only make an educated guess based on the presence or absence of an umbrella. The reality of the current state is hidden from us. This leads us to a type of sequential analysis that's used in computer science: the hidden Markov model. So with these models, we're trying to learn more about the hidden states by analyzing the outcomes that we can observe. Another way of putting it is that we're trying to answer the question: why are things how they are right now? Why did this happen? Let's look back and figure out the probability that a particular initial state led to what is going on right now.
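Here's a minimal sketch of that umbrella inference, with invented numbers. A full hidden Markov model would also track how the hidden weather state changes from day to day; this toy shows just the reasoning-backwards step:

```python
# A minimal sketch of the umbrella idea behind hidden Markov models:
# the weather (the hidden state) is never observed directly; we only
# see whether the visitor carries an umbrella, and we reason backwards
# with Bayes' rule. Every number here is invented for illustration.
prior = {"rainy": 0.3, "sunny": 0.7}        # how common each state is
p_umbrella = {"rainy": 0.9, "sunny": 0.1}   # P(umbrella | weather)

def weather_given(saw_umbrella):
    """P(weather | what we observed), via Bayes' rule."""
    unnormalized = {
        w: prior[w] * (p_umbrella[w] if saw_umbrella else 1 - p_umbrella[w])
        for w in prior
    }
    total = sum(unnormalized.values())
    return {w: round(p / total, 3) for w, p in unnormalized.items()}

print(weather_given(True))   # umbrella seen: rain jumps to about 79%
print(weather_given(False))  # no umbrella: rain drops to about 5%
```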
Speaker 1: Now, the whole reason I spent time talking about Markov models and probability is that they tie heavily into predictive text. They're also used in tons of other computational processes and analyses, from natural language analysis to genome sequencing. It's really powerful stuff.

If we think about language, we know that there are certain rules to things. You can't just string random letters in a sequence and expect that to make a word that other people can understand. We have developed languages that have their own vocabularies and syntax and grammars. We know that in English, for example, the letter Q is nearly always followed by the letter U. We know that it would be very odd to see the letter H follow right behind the letter J in English. And so we can start building out a dictionary and a matrix. The dictionary would include lots of common words, and the matrix would include basic rules to help us identify when someone is making a typo or misspelling something. And with these tools we could build out a method for predicting a letter based on the letters that were already typed. So if I typed T and then H, my predictive text might helpfully offer up the letter E, because I frequently type the word "the." If I ignore that and I hit the letter A, I might get the prompt of "than" or "thank," or maybe even "thanks," or maybe something else. And we're starting down that journey toward generative text.

When we come back, I'll explain more about this and some really cool experiments with machine learning and what all of that means. But first, let's take a quick break.

Okay, so we're building out a tool that quote unquote understands basic probabilities of words appearing in a given language in a given order, and it understands that, for example, a Q will be followed by a U nearly one hundred percent of the time in English. We build into this model all sorts of probabilities, so that words that are more common are going to pop up as autocomplete options more frequently than uncommon words.
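A toy version of that letter-level prediction takes only a few lines. The sample sentence here is invented, and real keyboards train on far more text with far richer models:

```python
from collections import Counter, defaultdict

# A toy of letter-level prediction: count which letter tends to follow
# each two-letter context in some sample text, then offer the most
# common continuations.
sample = "the thing is that thanks to the theory the thaw thinned"

follow = defaultdict(Counter)
for a, b, c in zip(sample, sample[1:], sample[2:]):
    follow[a + b][c] += 1

def suggest(typed, k=3):
    """Top-k letters seen after the last two characters typed."""
    return [letter for letter, _ in follow[typed[-2:]].most_common(k)]

print(suggest("th"))  # ['e', 'a', 'i'] for this sample: the, than..., thi...
```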
Speaker 1: But we can do better than this. We can pair this with a learning model. Learning models evolve over time. They adjust based on the input fed to them, and we're talking about lots and lots of input. They refine themselves; in other words, they learn. So with learning models, our predictive text begins to adjust to the specific individual who uses it over time, like on a phone.

So let's say you and I each have the same particular model of smartphone, and we're both running the same operating system version and everything; our phones are essentially identical, at least at a casual glance. And we've both been using these phones for a few weeks, and in that time, you and I have each used our phones to send various messages to our friends, our family, our colleagues, you know, your arch nemesis Ben Bowlin, you know, the usual. As we do that, our predictive text keyboards start to pick up on how we use words, and they can build up a frequency matrix, which isn't just looking at words that are common in general, but words that are common to us as individuals, and the way that we use words, and sometimes the way we generate words. Maybe you happen to use the word balderdash a lot, and so when you start typing the word, the autocomplete for balderdash will jump up much faster than it would if I were typing it on my phone, because my phone has never heard me use that, so it doesn't automatically assume that's what I'm typing. Maybe I use the word folderol a lot, and the same happens with my phone compared to yours. The models learn the words we use, and not just the words in the dictionary, but the words we create as well.
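A sketch of that personalization idea might look like the following; the word list, the counts, and the seventy-thirty blend are all arbitrary illustrative choices:

```python
from collections import Counter

# A sketch of personalization: keep a per-user word counter and blend
# it with a global frequency list when ranking suggestions.
global_freq = {"balance": 900, "balcony": 350, "balderdash": 3}
user_freq = Counter()

def record(word):
    """Call whenever the user commits a word, so the model can learn."""
    user_freq[word.lower()] += 1

def rank(prefix, k=3):
    candidates = [w for w in global_freq if w.startswith(prefix)]
    score = lambda w: 0.7 * user_freq[w] + 0.3 * global_freq[w] / 1000
    return sorted(candidates, key=score, reverse=True)[:k]

print(rank("bal"))         # fresh phone: ['balance', 'balcony', 'balderdash']
for _ in range(5):
    record("balderdash")   # this user says balderdash a lot
print(rank("bal"))         # now: ['balderdash', 'balance', 'balcony']
```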
Speaker 1: So let's say that I was, for some reason, a big fan of How I Met Your Mother, which I'm not, but let's say that I am a big fan of Neil Patrick Harris, which is true, and his character often says that is, wait for it, legendary. And he might extend the word legendary. So to do that, I might throw in a whole bunch of extra E's in legendary. Well, my phone might pick up on the fact that I tend to do this, and so it includes that as a legitimate word, even though any sort of spell check would say, this ain't a word, stop it. But my phone's predictive text is going to include it, saying this is something that is meaningful and thus a valid option.

Also, phones can learn to adapt to our own sense of syntax and grammar. Perhaps, for purposes of a particular effect, one of us tends to tweak the syntax of the language that we're communicating in. Maybe it's for comedic effect and it's not following the established rules of grammar for English, but our phone starts to understand that that's how we communicate, based on how we order our words and how we generate our phrases. So while our choices aren't necessarily in alignment with an established formal system, they represent a particular approach to communicating, and predictive text can start to get a handle on that if it's built properly. Even someone who communicates in an idiosyncratic way might find that their phone is offering up particularly relevant suggestions.

So how does all this work? How do machines actually learn stuff? Well, there's not one single method, but there is a collection of related processes that computer scientists have developed to train machines. And you can look at two major categories of machine learning, with a lot of subtypes under each: supervised learning and unsupervised learning. Supervised learning involves training a computer model using known input and output information. So let's take an example that I like to use a lot, and it's about image recognition. Let's say you're teaching a computer to recognize images of coffee mugs, and you have an enormous supply of images, millions of them.
Speaker 1: Some of them contain coffee mugs in various shapes and sizes and colors and orientations, and the lighting can be different. You might have the handle pointing to the left in some, or pointing to the right in others; in some cases the mug might be on its side. But you've got tons of these, and you also have millions of images of other stuff. Some of it might not even resemble a mug remotely; maybe it's an airplane, or Christopher Walken. Others might look kind of like a mug, you know; it might be a glass or a bowl or something similar. Now, as a human being, you can tell straight away if the image you've got in front of you represents a coffee mug or not, but machines don't inherently possess this ability.

You could feed in one photo of a generic off-white coffee mug, where the handle happens to be pointed to the left, and you tag that photo as a coffee mug; you give metadata to the computer to classify that as a coffee mug. And if you create a database of images and you do a search for coffee mug, that one would come up as a result because of all the work you've done with tagging this thing, effectively telling the computer, this is what I mean by coffee mug. However, if you fed in a new image, and this one is of a red coffee mug that's a different size, maybe the photo has different lighting conditions, maybe the mug is a little closer to the camera, the handle's pointing to the right instead of the left, would the computer automatically know that that's a coffee mug? No, it hasn't learned that. So you would have to build a predictive model for the computer to follow based on the known inputs and outputs. Your output is that you want the computer to classify photos as either having a coffee mug in them or not, and you might use an artificial neural network.
Speaker 1: In this case, you're creating nodes that accept input, apply some sort of decision-making process to that input, and then pass it further along the network. You can almost think of nodes as essentially making a yes-or-no judgment on a piece of data. Does the input qualify or does it not? Does it have this particular aspect of whatever it is you're looking at, in our case coffee mugs, or does it lack that? With our mug example, it could be a simple question like, is this mug-shaped? But the nodes are asking lots of questions and making lots of judgments and passing them throughout the neural network until you get to the final output, the final judgment: is this a coffee mug or is it not?

And computer scientists influence how the computer processes information. They adjust the weighting of answers, weighting as in weight, as in heavy, W-E-I-G-H-T, weighting. So you create your model, you use nodes that are making a series of judgments on images, you weight those decisions so that you're hopefully moving toward a more accurate result, and you feed your photos through and you look at the output. Now, you know whether the photos have a coffee mug in them or not; you're looking to see if the computer can recognize that. So you're looking to see if your model succeeded or failed, and then you go back and you make adjustments to your neural network. You adjust the weightings of those decisions so that the nodes process information in a slightly different way, always with the goal of improving the accuracy of the overall system. You feed the images through it again, and you do this over and over. You train the computer model so that it gets more accurate as you make these adjustments, and ultimately you get to a system that can accept brand new images, ones that haven't been deliberately chosen, and then sort those into images that either are of a coffee mug or are not.
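Here's a bare-bones sketch of that adjust-the-weightings loop: a single artificial node learning a yes-or-no judgment from labeled examples. The two input features, think of them as hypothetical roundness and has-a-handle scores between zero and one, are invented:

```python
# One artificial "node" learning a yes-or-no judgment. Real image
# classifiers stack huge numbers of these; all figures are invented.
data = [
    ([0.9, 0.8], 1),  # mug-like features, labeled "mug"
    ([0.2, 0.1], 0),  # not mug-like, labeled "not a mug"
    ([0.8, 0.9], 1),
    ([0.1, 0.3], 0),
]

weights = [0.0, 0.0]
bias = 0.0
for _ in range(20):  # feed the examples through over and over
    for features, label in data:
        total = sum(w * x for w, x in zip(weights, features)) + bias
        guess = 1 if total > 0 else 0
        error = label - guess  # did the judgment succeed or fail?
        # Nudge the weightings so the next pass is a little more accurate.
        weights = [w + 0.1 * error * x for w, x in zip(weights, features)]
        bias += 0.1 * error

print(weights, bias)  # a trained node, ready to judge brand-new inputs
```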
Speaker 1: This whole approach is an area called classification. In our simple example, images just fall into two broad classifications, photos with mugs or photos without, though we're going to get a little more complicated a bit later, because you can have all sorts of classifications. Medical imaging systems make use of this sort of machine learning process to indicate whether or not an image of a tumor is benign. Handwriting recognition programs do this too, and speech recognition can do this as well.

Supervised learning systems can also use a different approach called regression as a means of training a system. Regression is all about predicting a continuous response, like how much electricity a community is going to need over time. It's about predicting things to which you can assign real numbers. So, for example, predicting a change in temperature; temperature happens to have a value that is a real number, so that falls into this category. That's supervised learning, where we have the known inputs and known outputs. We know definitively if the information the computer generates is accurate or not because we can actually check its work. It's kind of like a teacher grading student tests and then working with a student who has a low score to get a better understanding of the subject matter, and then on the next test hopefully they score better, and you keep working with that student over and over until they have reached a high level of consistency in being correct.
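As a tiny worked example of the regression idea, here's an ordinary least-squares line fit in plain Python, with invented temperature readings:

```python
# Fit a straight line to numeric observations so we can predict a
# continuous value. The readings rise roughly two degrees per hour.
hours = [0, 1, 2, 3, 4]
temps = [10.0, 12.1, 13.9, 16.2, 18.0]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(temps) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, temps))
         / sum((x - mean_x) ** 2 for x in hours))
intercept = mean_y - slope * mean_x

# Predict the temperature an hour past the data we have.
print(f"hour 5 estimate: {intercept + slope * 5:.1f} degrees")
```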
Speaker 1: Unsupervised learning is more about finding patterns or meaning in data where no such patterns or meaning are initially obvious. When we talk about sifting through big data to find patterns, this is the kind of thing we're talking about. Those patterns might be subtle, or they might only be obvious when you're dealing with truly enormous amounts of information. We humans are really good at spotting patterns, up to a point. It's part of our survival mechanism. Recognizing patterns helped ancient humans recognize prey or predators, so it's a key element in the survival of our species. But when you get to really, really big quantities of data, it's hard for us to see patterns. It would be kind of like if you jumped off a boat in the middle of the ocean and then you were told to look for patterns that are the size of New Zealand. You'd be lost right away; the scale is something we can't deal with. But computer systems can handle data far more efficiently than we can, and that means they can potentially spot patterns where we would not.

Unsupervised learning techniques are best for this, and they have a few different approaches. One is clustering, which is pretty much what it sounds like: the system looks for groupings in data, indications of clusters, pattern clusters. And now I need to get back to my image recognition coffee mug analogy. If we were just feeding in images that are either a coffee mug on a neutral background or something else, then we could go supervised learning all the way. But if we wanted to create a system that could recognize whether a coffee mug were in a larger scene, like a crowded kitchen table with lots of other stuff on it, we could probably rely a bit on unsupervised learning, in which we would use clustering to teach the system to look for data that collectively appears to represent a coffee mug. We're trying to create a system that can pick out the shape of a coffee mug in an image that has a lot of other shapes in it. The system needs to understand which shapes, which lines and curves, represent the borders of objects. So what is a coffee mug as opposed to, say, a tablecloth or a shadow or a bowl with a spoon next to it? Unsupervised pattern recognition can lead to that outcome. Again, it requires a lot of training; you feed millions of images to a system numerous times to refine this approach. The method often relies upon hidden Markov models.
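Here's a toy sketch of the clustering idea using k-means, one common unsupervised method, named here for illustration rather than because it's what any particular system uses. The two-dimensional points are invented stand-ins for image features, and no labels are provided; the grouping emerges on its own:

```python
# k-means on invented 2-D points: repeatedly assign points to the
# nearest center, then move each center to the average of its group.
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.1, 4.8), (4.9, 5.0)]
centers = [points[0], points[3]]  # naive initial guesses

for _ in range(10):
    # Assign each point to its nearest center...
    groups = [[], []]
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        groups[dists.index(min(dists))].append(p)
    # ...then move each center to the average of its group.
    centers = [
        tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
        for g, c in zip(groups, centers)
    ]

print(centers)  # one center near each natural grouping of points
```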
Speaker 1: Oh, and this also ties into something else that's, you know, tangentially related, but I thought I would bring it up in case you guys have been experiencing it as much as I have: if you've noticed a lot more instances of websites demanding that you prove you're not a robot with a CAPTCHA. By the way, this is a good reminder that if you go to the TechStuff store at tee public dot com slash stores slash tech stuff, you can get a shirt, or, you know, dare I say, a coffee mug, with this CAPTCHA robot idea on it.

A lot of those CAPTCHAs involve a series of photos, and it's your job to click all the photos that have something specific in them, you know, like bicycles or crosswalks or traffic lights or fire hydrants. If you've wondered why that is, well, it all comes down to good traffic versus bad traffic. There's a lot of traffic out there that is, uh, powered by bots for various reasons, and that can clog things up. So systems and companies like Google want to prioritize good traffic, traffic that represents actual people trying to do stuff, and give it preferential access over other requests that might be malevolent, or that just might end up making things run slower if they get unfettered access. And the reason these CAPTCHAs are getting so difficult is because machine learning and image recognition software has gotten really good, and so to protect against bad traffic, companies like Google are using difficult CAPTCHA systems that present fuzzy, dimly lit, or otherwise, you know, bad photographs to you, and your job is to stare at them, possibly on a tiny smartphone screen, and figure out which ones are legit. The whole goal is to present photos that are so lousy that machines can't really deal with them. The problem is, over the long run, machines get better at doing this sort of stuff, whereas we kind of, you know, we have a cap on our performance.
Speaker 1: There will come a point where an image will just be, you know, too fuzzy or too dim for us to make out whether there's a fire hydrant in there or not. The machines will always get better at this than we are over the long run. Heck, older CAPTCHA systems are completely obsolete now because computer systems can complete them at a success rate that's actually higher than that of humans. We've got a lot of science fiction stories about machines becoming sentient and ruining humanity, but the truth of the matter is they don't need sentience to be disruptive. If they are directed by someone for a specific malevolent purpose, that's bad enough, even if the machines aren't really, you know, thinking for themselves.

Okay, but let's get back to predictive text after all of this. You could create a machine learning model that has a huge database of words, you know, a dictionary, and you could program the system to classify the words, to suss out which words are nouns and verbs and adjectives, and then apply rules to how those words can go together to make sentences. Or you could just, you know, analyze a ton of literature and have the computer kind of figure that out for itself, just through statistical analysis: understand how words fit together based upon the history of the written word, at least in modern English. For example, if you went further back, to like Old English, first of all your vocabulary would be totally different, but your grammar would be too, and suddenly things would not make much sense. Everything would sound like Yoda. So the system could go through millions of pages of material, building a statistical model that shows how frequently certain words pair together and in which order. Effectively, you're analyzing how humans put letters together to make words, and words together to make sentences. You could move up from there. You could try to analyze how sentences come together to make up paragraphs, but it starts to get tricky. However, you can work on a system that can present a series of sentences that are related enough to be a coherent presentation of ideas, at least in the short run. It might not be super compelling or as effective as what a human could do, but it could be a lot more impressive than just, you know, a string of totally unrelated words.
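Here's a small sketch of that statistical approach: tally which word follows which in a tiny invented corpus, then generate text by repeatedly sampling a likely next word. The snippet stands in for millions of pages:

```python
import random
from collections import Counter, defaultdict

# Count word-to-word transitions, then sample likely continuations.
corpus = ("the cat sat on the mat and the dog saw the cat and "
          "the dog sat on the rug").split()

pairs = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    pairs[w1][w2] += 1

def generate(start, length=8):
    words = [start]
    for _ in range(length):
        options = pairs[words[-1]]
        if not options:
            break  # dead end: this word never appeared mid-corpus
        choice = random.choices(list(options), weights=list(options.values()))[0]
        words.append(choice)
    return " ".join(words)

print(generate("the"))  # grammar-ish word salad, different every run
```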
However, 578 00:35:35,080 --> 00:35:37,440 Speaker 1: you can work on a system that can present a 579 00:35:37,520 --> 00:35:40,359 Speaker 1: series of sentences that are related enough to be a 580 00:35:40,400 --> 00:35:43,640 Speaker 1: coherent presentation of ideas, at least in the short run. 581 00:35:44,120 --> 00:35:47,200 Speaker 1: It might not be super compelling or as effective as 582 00:35:47,239 --> 00:35:49,440 Speaker 1: what a human could do, but it could be a 583 00:35:49,440 --> 00:35:51,759 Speaker 1: lot more impressive than just, you know, a string of 584 00:35:51,760 --> 00:35:55,120 Speaker 1: totally unrelated words. When we come back, I'll talk a 585 00:35:55,120 --> 00:35:57,760 Speaker 1: bit more about how computer systems can put words together 586 00:35:57,800 --> 00:35:59,800 Speaker 1: for us and what that could mean in the future. 587 00:35:59,840 --> 00:36:11,680 Speaker 1: But first let's take another quick break. Okay, so AI systems, 588 00:36:11,880 --> 00:36:16,120 Speaker 1: if sophisticated enough, can use stuff like hidden Markov models 589 00:36:16,120 --> 00:36:18,680 Speaker 1: and machine learning to put together strings of words that, 590 00:36:18,880 --> 00:36:23,359 Speaker 1: from a probability standpoint, a statistical standpoint at least, are 591 00:36:23,440 --> 00:36:27,720 Speaker 1: likely to make some sense. There's no guarantee it will 592 00:36:27,760 --> 00:36:31,200 Speaker 1: actually make sense, but if things are going well, the 593 00:36:31,239 --> 00:36:34,399 Speaker 1: phrases will be grammatically correct, and if they're going really well, 594 00:36:34,760 --> 00:36:37,400 Speaker 1: the word choice will be reasonable enough to pass muster. 595 00:36:38,040 --> 00:36:41,879 Speaker 1: But this is still pretty hard. Computer systems typically lack 596 00:36:41,960 --> 00:36:46,080 Speaker 1: the ability to build on context and meaning because they're 597 00:36:46,080 --> 00:36:49,080 Speaker 1: effectively looking for what is most likely to come next, 598 00:36:49,239 --> 00:36:52,360 Speaker 1: rather than looking back at what has already come before. 599 00:36:53,160 --> 00:36:55,160 Speaker 1: Does that make sense? Well, let me put it 600 00:36:55,200 --> 00:36:58,359 Speaker 1: another way. In our weather example, I talked about how 601 00:36:58,400 --> 00:37:02,759 Speaker 1: the predictions for future weather depended on current weather. So 602 00:37:02,920 --> 00:37:06,440 Speaker 1: what is it doing today? If it is sunny today, 603 00:37:06,480 --> 00:37:08,960 Speaker 1: there's an eight percent chance it will be sunny tomorrow 604 00:37:09,000 --> 00:37:13,279 Speaker 1: according to our example. But the predictions don't depend upon 605 00:37:13,440 --> 00:37:17,319 Speaker 1: the weather that came earlier, like what happened yesterday. The 606 00:37:17,400 --> 00:37:22,000 Speaker 1: system doesn't care about yesterday's weather. We might care because 607 00:37:22,000 --> 00:37:25,200 Speaker 1: we're using long trends of weather to act as our 608 00:37:25,280 --> 00:37:28,080 Speaker 1: data source to train the computer model, you know, to 609 00:37:28,200 --> 00:37:31,640 Speaker 1: create those probabilities. But yesterday's weather, as far as the 610 00:37:31,640 --> 00:37:34,959 Speaker 1: computer system is concerned, has no impact on tomorrow's weather.
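Here is that memoryless property as a tiny Python sketch. The transition probabilities are invented for this illustration rather than taken from the episode's example; the point is simply that the prediction function receives only today's state, so yesterday cannot influence the forecast.

import random

# Made-up transition probabilities for illustration:
# P(tomorrow's weather | today's weather).
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def predict_tomorrow(today):
    # The Markov property in action: the forecast depends on today's
    # state and nothing else. Yesterday is never consulted.
    options = TRANSITIONS[today]
    states = list(options)
    weights = list(options.values())
    return random.choices(states, weights=weights)[0]

print(predict_tomorrow("sunny"))  # yesterday's weather plays no part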
611 00:37:35,320 --> 00:37:38,719 Speaker 1: So if yesterday were rainy and today is sunny, the 612 00:37:38,760 --> 00:37:42,280 Speaker 1: computer doesn't really care. It just cares that today is sunny. 613 00:37:42,480 --> 00:37:45,080 Speaker 1: The same thing can hold true with systems that are 614 00:37:45,120 --> 00:37:49,080 Speaker 1: creating predictive text. The goal with standard predictive text is 615 00:37:49,120 --> 00:37:53,080 Speaker 1: to save users time and effort by suggesting likely words 616 00:37:53,280 --> 00:37:55,960 Speaker 1: as you, you know, start typing. So if you start 617 00:37:56,040 --> 00:37:59,760 Speaker 1: typing the word technology, at some point, the system recognizes 618 00:37:59,840 --> 00:38:02,920 Speaker 1: the letter pattern and offers that up as an option, 619 00:38:03,280 --> 00:38:06,359 Speaker 1: and for words that are frequently used in pairs, you'll 620 00:38:06,400 --> 00:38:09,520 Speaker 1: get those suggestions right away after you type the first word. 621 00:38:09,880 --> 00:38:13,200 Speaker 1: Since this is typically presented as an option, you know, 622 00:38:13,280 --> 00:38:16,400 Speaker 1: something you can choose to use or not, it's pretty 623 00:38:16,400 --> 00:38:19,960 Speaker 1: simple to avoid going wrong unless you, as a user, 624 00:38:20,080 --> 00:38:23,080 Speaker 1: fumble things and accidentally pick the wrong word, which can 625 00:38:23,120 --> 00:38:26,400 Speaker 1: get kind of embarrassing, or if it autocorrects after the fact, 626 00:38:26,680 --> 00:38:29,759 Speaker 1: thinking that you made a spelling error, and then you 627 00:38:30,200 --> 00:38:34,440 Speaker 1: have accidentally spelled Tim Minchin's name as Tim Munchkin, and 628 00:38:34,600 --> 00:38:39,400 Speaker 1: I am deeply sorry for that. Auto replies with email 629 00:38:39,680 --> 00:38:42,879 Speaker 1: get a little more complicated, as the system is analyzing 630 00:38:42,920 --> 00:38:46,040 Speaker 1: the message that is coming in to you before formulating a 631 00:38:46,080 --> 00:38:49,440 Speaker 1: possible response. So I have email systems that do this 632 00:38:49,600 --> 00:38:52,719 Speaker 1: for me. And one common example for me is that 633 00:38:52,800 --> 00:38:55,480 Speaker 1: our sales team here at our company will send me 634 00:38:55,520 --> 00:38:58,840 Speaker 1: an email asking if I'm okay running a particular sponsor's 635 00:38:58,880 --> 00:39:01,480 Speaker 1: ads on my show. Now, normally I like to do 636 00:39:01,560 --> 00:39:04,719 Speaker 1: research on my sponsors, so I'll take time to look 637 00:39:04,760 --> 00:39:08,480 Speaker 1: into things and then respond myself. But sometimes the request 638 00:39:08,520 --> 00:39:12,000 Speaker 1: is for a sponsor I'm familiar with and I definitely 639 00:39:12,080 --> 00:39:15,840 Speaker 1: want or, you know, occasionally definitely do not want on 640 00:39:15,960 --> 00:39:18,680 Speaker 1: my show, and I'll see on my phone that I 641 00:39:18,719 --> 00:39:20,840 Speaker 1: have the option to pick a quick reply of something 642 00:39:20,880 --> 00:39:25,319 Speaker 1: like sure or yes, that's fine, or something similar.
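Backing up to the plain autocomplete case for a second, here's a deliberately simplified Python sketch of that idea: keep counts of the words a user actually types, then match the letters typed so far and surface the most frequent candidates first. Real keyboards layer context, word pairs, and personalization on top of this, so treat the word list and counts here as placeholders.

# Toy autocomplete: rank known words by how often this user has typed
# them, then match against the current prefix.
usage_counts = {
    "technology": 42,
    "technique": 17,
    "technical": 9,
    "ten": 3,
}

def suggest(prefix, counts, limit=3):
    # Return up to `limit` known words starting with `prefix`,
    # most frequently used first.
    matches = [word for word in counts if word.startswith(prefix)]
    return sorted(matches, key=counts.get, reverse=True)[:limit]

print(suggest("tech", usage_counts))
# ['technology', 'technique', 'technical']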
In 643 00:39:25,360 --> 00:39:28,720 Speaker 1: that quick-reply case, the email program is using natural language systems 644 00:39:28,719 --> 00:39:31,640 Speaker 1: and predictive text to suss out that there is a 645 00:39:31,680 --> 00:39:35,200 Speaker 1: request and that the common responses I might make to 646 00:39:35,280 --> 00:39:38,319 Speaker 1: that request should be options. Now, it's not that the 647 00:39:38,320 --> 00:39:42,719 Speaker 1: computer system actually understands the nature of this request, but 648 00:39:42,920 --> 00:39:45,759 Speaker 1: more like the structure of a request. In other words, 649 00:39:45,760 --> 00:39:48,120 Speaker 1: it's saying, this looks like it's a yes or no question. 650 00:39:48,520 --> 00:39:52,239 Speaker 1: Let's present him with responses that are in a yes 651 00:39:52,360 --> 00:39:56,560 Speaker 1: or no format. The fact that the system doesn't really 652 00:39:56,560 --> 00:40:00,000 Speaker 1: have a deeper understanding can become evident in other use cases. 653 00:40:00,560 --> 00:40:04,680 Speaker 1: So for example, Janelle Shane, who is a research scientist 654 00:40:04,719 --> 00:40:09,239 Speaker 1: and who has a delightful blog called AI Weirdness, took 655 00:40:09,320 --> 00:40:11,920 Speaker 1: time to try and train a machine learning system to 656 00:40:11,960 --> 00:40:16,240 Speaker 1: tell jokes. It became clear that the system could construct 657 00:40:16,400 --> 00:40:21,760 Speaker 1: something resembling a classic question slash punchline style of joke. 658 00:40:22,320 --> 00:40:25,640 Speaker 1: But it was also clear that the punchline rarely had 659 00:40:25,760 --> 00:40:29,040 Speaker 1: any connection to the question. It actually reminded me a 660 00:40:29,040 --> 00:40:31,520 Speaker 1: lot of how little kids like my two year old 661 00:40:31,600 --> 00:40:35,080 Speaker 1: niece tell jokes. These jokes are some of my favorite 662 00:40:35,080 --> 00:40:38,360 Speaker 1: in the world, not because the jokes are inherently funny, 663 00:40:38,680 --> 00:40:41,399 Speaker 1: but because they are absurd and they show how little 664 00:40:41,480 --> 00:40:44,840 Speaker 1: children can recognize the structure, but not how to build 665 00:40:44,920 --> 00:40:49,160 Speaker 1: an actual joke. My favorite of the AI generated jokes 666 00:40:49,360 --> 00:40:53,319 Speaker 1: almost got it right, and it went like this: what 667 00:40:53,440 --> 00:40:57,359 Speaker 1: do you get when you cross a dinosaur? They get 668 00:40:57,400 --> 00:41:01,759 Speaker 1: a lawyer's. I mean, that's almost a real joke. 669 00:41:01,840 --> 00:41:05,279 Speaker 1: I actually love that one. Shane pointed out the bit 670 00:41:05,400 --> 00:41:08,480 Speaker 1: that I mentioned earlier, that these systems have next to 671 00:41:08,600 --> 00:41:12,280 Speaker 1: no short term memory, and so building any lengthy response 672 00:41:12,480 --> 00:41:15,279 Speaker 1: is pretty much impossible because the computer system is so 673 00:41:15,320 --> 00:41:18,560 Speaker 1: focused on choosing the word that comes next without an 674 00:41:18,640 --> 00:41:22,960 Speaker 1: understanding of the connection or context of what came earlier.
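To show how shallow that structure-matching can be, here is a crude, hypothetical Python sketch of the yes-or-no detection idea. It never understands the request; it only checks whether a message is shaped like a yes/no question and, if so, offers canned replies. A real email system would rely on trained language models rather than a regular expression, so this captures only the flavor of the approach.

import re

# Messages shaped like yes/no questions usually open with an auxiliary
# verb and end with a question mark. That shape is all this toy
# detector checks; the meaning of the request never enters into it.
YES_NO_OPENER = re.compile(
    r"^(are|is|do|does|did|can|could|would|will|should)\b",
    re.IGNORECASE,
)

def quick_replies(message):
    # Offer canned responses only when the message has the right shape.
    text = message.strip()
    if text.endswith("?") and YES_NO_OPENER.match(text):
        return ["Sure", "Yes, that's fine", "No, sorry"]
    return []  # shape not recognized, so no suggestions

print(quick_replies("Are you okay running this sponsor's ads?"))
# ['Sure', "Yes, that's fine", 'No, sorry']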
675 00:41:23,560 --> 00:41:26,640 Speaker 1: And you may have come across stuff like a social 676 00:41:26,680 --> 00:41:28,880 Speaker 1: media post that says something along the lines of I 677 00:41:28,960 --> 00:41:32,279 Speaker 1: fed a computer ten thousand movie scripts and asked it 678 00:41:32,320 --> 00:41:35,120 Speaker 1: to write the next, you know, Highlander movie or whatever, 679 00:41:35,760 --> 00:41:39,560 Speaker 1: and then you get a little screenplay, and inevitably they 680 00:41:39,640 --> 00:41:44,160 Speaker 1: end up being silly and absurd, with crazy stage directions 681 00:41:44,200 --> 00:41:47,840 Speaker 1: and dialogue and descriptions. They also tend to be written 682 00:41:48,040 --> 00:41:53,319 Speaker 1: entirely by human beings. Most AI systems are incapable of 683 00:41:53,440 --> 00:41:57,920 Speaker 1: keeping things consistent, like character names. A computer system might 684 00:41:57,960 --> 00:42:01,960 Speaker 1: create a character name and give that character a line, but 685 00:42:02,840 --> 00:42:06,239 Speaker 1: that name is not likely to return later on in 686 00:42:06,280 --> 00:42:09,440 Speaker 1: the screenplay. It's not necessarily going to show up in 687 00:42:09,440 --> 00:42:13,200 Speaker 1: any stage directions or descriptions. It ends up being more 688 00:42:13,320 --> 00:42:17,600 Speaker 1: dreamlike and free form. It's still absurd, but it's not 689 00:42:17,719 --> 00:42:21,960 Speaker 1: as internally consistent. So if you come across a long 690 00:42:22,080 --> 00:42:25,360 Speaker 1: piece of absurdist humor that was quote unquote written 691 00:42:25,360 --> 00:42:29,160 Speaker 1: by a computer, chances are it wasn't. It was written 692 00:42:29,200 --> 00:42:33,120 Speaker 1: by a person who was emulating the dreamlike absurdism of 693 00:42:33,160 --> 00:42:37,359 Speaker 1: computer generated text. They're still really funny, they're just not 694 00:42:37,440 --> 00:42:41,400 Speaker 1: necessarily actually generated by a computer. So about that blog 695 00:42:41,480 --> 00:42:44,439 Speaker 1: post that ran on Hacker News. How did that get 696 00:42:44,480 --> 00:42:48,520 Speaker 1: past so many people? It started with Liam Porr, a 697 00:42:48,600 --> 00:42:52,279 Speaker 1: college student studying computer science, who made contact with a 698 00:42:52,320 --> 00:42:55,880 Speaker 1: PhD student who in turn had access to a private 699 00:42:55,960 --> 00:43:00,520 Speaker 1: beta build of the GPT three autocomplete tool. Porr created 700 00:43:00,560 --> 00:43:04,040 Speaker 1: a blog post title and an introduction to serve as 701 00:43:04,080 --> 00:43:07,680 Speaker 1: the launch point for the system to build upon. And 702 00:43:07,840 --> 00:43:11,280 Speaker 1: together they ran a few trials with this machine learning 703 00:43:11,320 --> 00:43:15,680 Speaker 1: system, this auto-generated text system, using those prompts, 704 00:43:15,800 --> 00:43:19,880 Speaker 1: and then Porr picked one of the results to submit 705 00:43:19,920 --> 00:43:22,200 Speaker 1: as a legit blog post. Now, I'm going to read 706 00:43:22,239 --> 00:43:25,120 Speaker 1: a little section of it. The blog post title 707 00:43:25,320 --> 00:43:30,160 Speaker 1: was Feeling Unproductive? Maybe You Should Stop Overthinking. And here's 708 00:43:30,160 --> 00:43:34,320 Speaker 1: a segment that comes from the middle of the blog post. Quote.
709 00:43:35,280 --> 00:43:38,920 Speaker 1: When you engage in creative thinking, your brain starts working 710 00:43:38,960 --> 00:43:42,200 Speaker 1: more efficiently. It becomes more active and more open to 711 00:43:42,280 --> 00:43:45,360 Speaker 1: new ideas. It also helps you think outside the box 712 00:43:45,440 --> 00:43:48,399 Speaker 1: and look at things from a different perspective. So how 713 00:43:48,440 --> 00:43:53,000 Speaker 1: does this all tie into productivity? Well, if you're a creator, 714 00:43:53,280 --> 00:43:56,040 Speaker 1: then you should be engaging in creative thinking on a 715 00:43:56,040 --> 00:43:59,160 Speaker 1: regular basis. The more you do it, the better your 716 00:43:59,160 --> 00:44:02,520 Speaker 1: brain becomes at thinking up ideas. This makes it easier 717 00:44:02,560 --> 00:44:04,840 Speaker 1: for you to work on your projects because you won't 718 00:44:04,920 --> 00:44:09,680 Speaker 1: get stuck as often. End quote. Now, the phrasing makes sense. 719 00:44:10,239 --> 00:44:12,920 Speaker 1: It's in a very casual style, and other parts of 720 00:44:12,920 --> 00:44:15,840 Speaker 1: the blog post get, you know, even more casual, sometimes 721 00:44:16,000 --> 00:44:21,239 Speaker 1: straying into grammatical error territory. It's not terribly precise, nor 722 00:44:21,320 --> 00:44:25,400 Speaker 1: is it saying anything, really. The example I gave to 723 00:44:25,440 --> 00:44:28,040 Speaker 1: a friend of mine is that this blog post is 724 00:44:28,239 --> 00:44:31,080 Speaker 1: just like if I said, you know, if I'm caught 725 00:44:31,120 --> 00:44:34,080 Speaker 1: outside when it starts pouring down rain, I get wet. 726 00:44:34,800 --> 00:44:39,120 Speaker 1: I mean, yeah, that statement is true, but it's also, 727 00:44:39,239 --> 00:44:41,440 Speaker 1: you know, not saying anything, or at least not anything 728 00:44:41,440 --> 00:44:45,760 Speaker 1: that isn't already evident. All that being said, the blog 729 00:44:45,800 --> 00:44:48,880 Speaker 1: post impresses the heck out of me. And that's because 730 00:44:48,920 --> 00:44:53,360 Speaker 1: the paragraphs follow in a logical pattern. It's not well written, 731 00:44:53,800 --> 00:44:56,759 Speaker 1: but there's so much bad writing out there that it 732 00:44:56,840 --> 00:44:59,960 Speaker 1: also doesn't stand out. If I had read this without 733 00:45:00,200 --> 00:45:03,399 Speaker 1: knowing a computer generated it, I'm not certain I would 734 00:45:03,440 --> 00:45:06,400 Speaker 1: pick up on it. Again, not because it's great writing, 735 00:45:06,440 --> 00:45:09,560 Speaker 1: but because I've read a lot of really bad writing 736 00:45:09,560 --> 00:45:13,080 Speaker 1: out there. Heck, I've probably written some of it. Think 737 00:45:13,120 --> 00:45:16,240 Speaker 1: of some of the content farms out there that post 738 00:45:16,680 --> 00:45:19,839 Speaker 1: thousands of blog posts a day. There aren't as many 739 00:45:19,880 --> 00:45:22,680 Speaker 1: as there were, you know, maybe five years ago, but 740 00:45:22,760 --> 00:45:25,279 Speaker 1: there are still quite a few. Well, a lot of that 741 00:45:25,320 --> 00:45:29,280 Speaker 1: content is written in a very quick, slapdash style, 742 00:45:29,640 --> 00:45:32,840 Speaker 1: and no shade being thrown at the writers; 743 00:45:32,840 --> 00:45:35,680 Speaker 1: they're trying to make a living, but it's not exactly 744 00:45:35,800 --> 00:45:39,919 Speaker 1: well crafted work.
This piece could have passed for one 745 00:45:39,920 --> 00:45:44,520 Speaker 1: of those, and the piece does actually seem to build 746 00:45:44,719 --> 00:45:48,000 Speaker 1: on itself. New paragraphs reference a point made in an 747 00:45:48,040 --> 00:45:51,160 Speaker 1: earlier paragraph, something that you didn't see so much of 748 00:45:51,280 --> 00:45:55,279 Speaker 1: in other systems. New paragraphs build on those earlier ones, 749 00:45:55,360 --> 00:45:59,400 Speaker 1: not in substantial ways, but there is a coherent link 750 00:45:59,600 --> 00:46:01,759 Speaker 1: from one paragraph to the next. It's not as free 751 00:46:01,800 --> 00:46:05,520 Speaker 1: form and absurd as other generative texts that I've seen. 752 00:46:06,480 --> 00:46:09,640 Speaker 1: As for the autocorrect on our phones, those get more 753 00:46:09,640 --> 00:46:11,799 Speaker 1: individualized as we use them. Like I said, if I 754 00:46:11,840 --> 00:46:14,520 Speaker 1: type a proper name, like my dog Tim Bolt's name, my 755 00:46:14,600 --> 00:46:17,120 Speaker 1: phone starts to pick up on this, that it's a 756 00:46:17,120 --> 00:46:20,040 Speaker 1: word that has a particular meaning to me, that it's 757 00:46:20,040 --> 00:46:23,279 Speaker 1: also a proper noun because I always capitalize it, and 758 00:46:23,320 --> 00:46:25,920 Speaker 1: that it's not a typo, it's not a misspelling. So 759 00:46:26,200 --> 00:46:28,719 Speaker 1: while the name wasn't in my phone's dictionary when I 760 00:46:28,719 --> 00:46:31,919 Speaker 1: first got it, it has been added to the dictionary now 761 00:46:31,920 --> 00:46:33,640 Speaker 1: that I've been using it so much, and it can 762 00:46:33,719 --> 00:46:36,360 Speaker 1: even auto complete the name as I start to type. 763 00:46:36,360 --> 00:46:40,200 Speaker 1: Now we have some really impressive examples of generated text 764 00:46:40,320 --> 00:46:44,200 Speaker 1: or generated language applications in AI. A couple of years ago, 765 00:46:44,280 --> 00:46:47,800 Speaker 1: Google demonstrated how the Google Assistant could make a phone 766 00:46:47,800 --> 00:46:51,560 Speaker 1: call to a real, human-operated business and make 767 00:46:51,560 --> 00:46:55,440 Speaker 1: an appointment for you. In a demonstration, the assistant called 768 00:46:55,520 --> 00:46:58,239 Speaker 1: a hair salon and had a brief conversation with the 769 00:46:58,239 --> 00:47:02,600 Speaker 1: salon employee to book a haircut appointment, and it all sounded, 770 00:47:02,680 --> 00:47:06,600 Speaker 1: you know, fairly natural. This approach to natural language recognition 771 00:47:06,680 --> 00:47:10,480 Speaker 1: and generative language is really powerful stuff. In this case, 772 00:47:10,640 --> 00:47:14,640 Speaker 1: the assistant was relying upon certain parameters, right? The assistant 773 00:47:14,680 --> 00:47:18,560 Speaker 1: knew which salon the user wanted to call. It knew 774 00:47:18,560 --> 00:47:22,799 Speaker 1: the time frame that the user had outlined as being appropriate.
775 00:47:22,920 --> 00:47:26,800 Speaker 1: In this particular demonstration, it was an appointment slot anytime 776 00:47:26,840 --> 00:47:29,960 Speaker 1: between ten am and twelve pm, and it knew what day 777 00:47:30,080 --> 00:47:33,360 Speaker 1: the user wanted an appointment, and it had all the basics, 778 00:47:33,560 --> 00:47:37,239 Speaker 1: and then the assistant could respond to questions and statements 779 00:47:37,280 --> 00:47:41,239 Speaker 1: from the salon employee on the phone and book the appointment, 780 00:47:41,400 --> 00:47:45,920 Speaker 1: all without obviously revealing that it was an AI program. 781 00:47:46,000 --> 00:47:48,799 Speaker 1: The appearance is that the assistant is able to have 782 00:47:49,040 --> 00:47:53,840 Speaker 1: persistent knowledge, but that's more of an illusion than anything else. 783 00:47:54,320 --> 00:47:56,920 Speaker 1: Still, it does show that computer scientists are making a lot 784 00:47:56,920 --> 00:48:00,280 Speaker 1: of progress towards building systems that can generate language that, 785 00:48:00,320 --> 00:48:04,320 Speaker 1: if not deeply meaningful, can at least be useful. 786 00:48:05,040 --> 00:48:07,160 Speaker 1: I'll close out with something that I covered at the 787 00:48:07,200 --> 00:48:11,400 Speaker 1: IBM Think Conference back in twenty nineteen. To demonstrate the 788 00:48:11,400 --> 00:48:14,920 Speaker 1: power of the Watson platform, which is a foundation for 789 00:48:15,040 --> 00:48:19,719 Speaker 1: various applications that all tap into deep AI processes, IBM 790 00:48:19,840 --> 00:48:24,360 Speaker 1: organized a debate between a debate champion and a system 791 00:48:24,400 --> 00:48:27,560 Speaker 1: called Project Debater, and the debate was on the 792 00:48:27,640 --> 00:48:32,480 Speaker 1: topic of subsidizing preschools. IBM had drawn the pro side 793 00:48:32,719 --> 00:48:35,480 Speaker 1: of the argument, and I got to watch this debate 794 00:48:35,600 --> 00:48:38,600 Speaker 1: live in person, and it was impressive. Not that I 795 00:48:38,640 --> 00:48:42,400 Speaker 1: felt that Watson was able to outmaneuver the skilled, logical, 796 00:48:42,680 --> 00:48:46,759 Speaker 1: eloquent human champion, but it was able to construct a 797 00:48:46,840 --> 00:48:51,600 Speaker 1: pretty sound and consistent argument. It wasn't as strong in rhetoric, 798 00:48:52,080 --> 00:48:55,000 Speaker 1: but it appeared to parse the flow of the debate 799 00:48:55,280 --> 00:48:58,520 Speaker 1: properly for the most part, constructing arguments and supporting them 800 00:48:58,560 --> 00:49:03,080 Speaker 1: with information wherever possible. It didn't come across as quite human, 801 00:49:03,560 --> 00:49:06,359 Speaker 1: but it was still really impressive. I think it will 802 00:49:06,400 --> 00:49:09,319 Speaker 1: be quite some time before machines can generate text or 803 00:49:09,400 --> 00:49:13,680 Speaker 1: speech at a level that compares with skilled humans, you know, 804 00:49:14,000 --> 00:49:18,200 Speaker 1: humans who incorporate so many things from creativity to insight 805 00:49:18,320 --> 00:49:22,160 Speaker 1: to intelligence in order to build communication.
But progress is 806 00:49:22,200 --> 00:49:24,759 Speaker 1: being made all the time, and thanks to a surplus 807 00:49:24,800 --> 00:49:27,839 Speaker 1: of, you know, not so great communication out there, we're 808 00:49:27,880 --> 00:49:31,399 Speaker 1: more likely to not notice the computer generated stuff as 809 00:49:31,440 --> 00:49:35,719 Speaker 1: it improves. This opens up a lot of thorny problems. 810 00:49:36,080 --> 00:49:39,160 Speaker 1: We've already got a problem with fake news. In a 811 00:49:39,200 --> 00:49:43,160 Speaker 1: world where computer systems could generate endless blog posts and 812 00:49:43,320 --> 00:49:47,719 Speaker 1: articles supporting narratives that don't reflect the truth, we're really 813 00:49:47,719 --> 00:49:50,320 Speaker 1: going to be in trouble. And I think that's why 814 00:49:50,320 --> 00:49:54,040 Speaker 1: this news about the blog post passing for a real 815 00:49:54,160 --> 00:49:58,319 Speaker 1: article should scare platforms like Facebook. If we reach a 816 00:49:58,360 --> 00:50:02,440 Speaker 1: point where computers can load Facebook with fake news and 817 00:50:02,560 --> 00:50:06,760 Speaker 1: other computers are running bots that interact with that fake news, 818 00:50:07,520 --> 00:50:11,120 Speaker 1: fewer people are going to stick around on that platform. 819 00:50:11,160 --> 00:50:13,840 Speaker 1: It's just gonna turn into 820 00:50:13,920 --> 00:50:18,799 Speaker 1: a cesspit of total nonsense. You know, some 821 00:50:18,840 --> 00:50:20,520 Speaker 1: people will stick around, but a lot of people are just 822 00:50:20,560 --> 00:50:23,080 Speaker 1: gonna bail. People have been bailing already. We're gonna see 823 00:50:23,080 --> 00:50:26,160 Speaker 1: a lot more leave, and once the advertisers get wind 824 00:50:26,320 --> 00:50:30,360 Speaker 1: that the majority of activity on Facebook isn't even human 825 00:50:31,040 --> 00:50:35,759 Speaker 1: and therefore doesn't represent actual potential customers, advertising money will 826 00:50:35,760 --> 00:50:39,319 Speaker 1: start to dry up, and then even a behemoth like 827 00:50:39,400 --> 00:50:42,960 Speaker 1: Facebook could crumble. Now I'm not saying this is going 828 00:50:43,000 --> 00:50:46,200 Speaker 1: to happen quickly, but I think it definitely could and 829 00:50:46,280 --> 00:50:49,840 Speaker 1: probably will happen at least in some respect over the 830 00:50:49,880 --> 00:50:55,840 Speaker 1: course of the next few years. So hey, Facebook, maybe 831 00:50:55,840 --> 00:51:00,200 Speaker 1: think about your oncoming existential crisis and, you know, get 832 00:51:00,200 --> 00:51:04,200 Speaker 1: ahead of it. It would be good for everybody, including 833 00:51:04,239 --> 00:51:09,000 Speaker 1: your shareholders, and I know you really care about those. Alright, 834 00:51:09,080 --> 00:51:11,160 Speaker 1: that wraps up this episode of tech Stuff and how 835 00:51:11,320 --> 00:51:15,200 Speaker 1: artificial intelligence and machine learning and predictive text are all 836 00:51:15,239 --> 00:51:20,600 Speaker 1: evolving rapidly in ways that are both cool and, you know, concerning, 837 00:51:20,800 --> 00:51:23,600 Speaker 1: if we're being totally honest. But I want to know 838 00:51:23,640 --> 00:51:25,520 Speaker 1: what you guys think. I also want to know if 839 00:51:25,520 --> 00:51:28,040 Speaker 1: you have any suggestions for future episodes of tech Stuff.
840 00:51:28,360 --> 00:51:31,280 Speaker 1: Reach out to me on Twitter. The handle is tech 841 00:51:31,280 --> 00:51:34,320 Speaker 1: stuff H S W, and I'll talk to you again 842 00:51:35,080 --> 00:51:43,360 Speaker 1: really soon. Tech Stuff is an I Heart Radio production. 843 00:51:43,600 --> 00:51:46,400 Speaker 1: For more podcasts from I Heart Radio, visit the I 844 00:51:46,520 --> 00:51:49,759 Speaker 1: Heart Radio app, Apple Podcasts, or wherever you listen to 845 00:51:49,800 --> 00:51:50,720 Speaker 1: your favorite shows.