WEBVTT - Computer-Generated Communication

0:00:04.400 --> 0:00:12.760
<v Speaker 1>Welcome to TechStuff, a production from iHeartRadio. Hey there,

0:00:12.760 --> 0:00:16.480
<v Speaker 1>and welcome to TechStuff. I'm your host, Jonathan Strickland.

0:00:16.520 --> 0:00:19.000
<v Speaker 1>I'm an executive producer with iHeartRadio and I

0:00:19.079 --> 0:00:22.599
<v Speaker 1>love all things tech, and today I want to tackle

0:00:22.840 --> 0:00:29.480
<v Speaker 1>a really interesting, complicated, and potentially scary topic, and that

0:00:29.640 --> 0:00:34.360
<v Speaker 1>is predictive text generation. And I know that sounds weird

0:00:34.400 --> 0:00:36.760
<v Speaker 1>to say potentially scary, but you know, stick with me.

0:00:37.360 --> 0:00:40.720
<v Speaker 1>I'm sure many of you have seen social media posts

0:00:40.760 --> 0:00:45.080
<v Speaker 1>that say things like, type "I am the" on your

0:00:45.120 --> 0:00:48.519
<v Speaker 1>phone and then generate a result using the middle option

0:00:48.840 --> 0:00:52.280
<v Speaker 1>of predictive text. So you know, just for example, I

0:00:52.320 --> 0:00:54.600
<v Speaker 1>did that. If I did that on my phone, then

0:00:54.920 --> 0:00:58.320
<v Speaker 1>I get "I am the only one who can help

0:00:58.400 --> 0:01:04.679
<v Speaker 1>me with this." Oh, too real, predictive text. I mean,

0:01:04.680 --> 0:01:07.360
<v Speaker 1>I'm the only one who researches and writes these episodes.

0:01:07.400 --> 0:01:11.840
<v Speaker 1>That's it's way too real. But the whole meme of

0:01:12.000 --> 0:01:17.120
<v Speaker 1>using predictive text to generate seemingly meaningful or you know,

0:01:17.240 --> 0:01:22.120
<v Speaker 1>sometimes wildly absurd phrases is just part of what I

0:01:22.200 --> 0:01:25.640
<v Speaker 1>want to talk about today. Now. The reason this topic

0:01:25.920 --> 0:01:29.120
<v Speaker 1>jumped at me is because of a recent news article

0:01:29.200 --> 0:01:32.600
<v Speaker 1>that I read over on The Verge. The article, which

0:01:32.680 --> 0:01:35.959
<v Speaker 1>was written by Kim Lyons, has the title "A college

0:01:36.080 --> 0:01:40.959
<v Speaker 1>student used GPT-3 to write fake blog posts and

0:01:41.120 --> 0:01:44.560
<v Speaker 1>ended up at the top of Hacker News." Now, as

0:01:44.600 --> 0:01:49.080
<v Speaker 1>the headline indicates, a computer science student used a predictive

0:01:49.160 --> 0:01:52.960
<v Speaker 1>text engine called GPT-3, a beta build of it

0:01:53.040 --> 0:01:58.000
<v Speaker 1>in fact, which stands for Generative Pre-trained Transformer, and

0:01:58.040 --> 0:02:01.400
<v Speaker 1>then generated a blog that was featured on a site

0:02:01.400 --> 0:02:04.320
<v Speaker 1>called Hacker News as if it were a piece written

0:02:04.320 --> 0:02:07.960
<v Speaker 1>by a flesh and blood human being. What's more, a

0:02:08.040 --> 0:02:10.919
<v Speaker 1>thread on Reddit showed that only a few people were

0:02:10.960 --> 0:02:13.920
<v Speaker 1>picking up on the feeling that something hinky was going on,

0:02:14.000 --> 0:02:17.240
<v Speaker 1>and that perhaps the blog post had not been written

0:02:17.760 --> 0:02:21.040
<v Speaker 1>but generated. And Lyons goes on to point out that

0:02:21.120 --> 0:02:24.680
<v Speaker 1>the fact that there's a lot of, you know, not

0:02:25.000 --> 0:02:28.359
<v Speaker 1>very good writing on the Internet makes it a little

0:02:28.400 --> 0:02:32.359
<v Speaker 1>harder to suss out a decent generated post as opposed

0:02:32.400 --> 0:02:35.240
<v Speaker 1>to a written one. It's not so much that the

0:02:35.280 --> 0:02:39.480
<v Speaker 1>AI has become super awesome at writing, but rather that

0:02:39.520 --> 0:02:42.400
<v Speaker 1>we've kind of lowered the bar more than a little.

0:02:42.680 --> 0:02:44.760
<v Speaker 1>This kind of plays into the whole concept of a

0:02:44.800 --> 0:02:48.720
<v Speaker 1>Turing test. So, just to go off on a tangent here,

0:02:48.919 --> 0:02:50.960
<v Speaker 1>this isn't in my notes, I'm just going to speak

0:02:51.360 --> 0:02:54.880
<v Speaker 1>off the cuff. The Turing test is named after Alan Turing,

0:02:55.560 --> 0:03:01.600
<v Speaker 1>famous computer scientist, and the idea nowadays has kind of

0:03:01.600 --> 0:03:05.400
<v Speaker 1>evolved into this idea of you have a series of

0:03:05.639 --> 0:03:10.040
<v Speaker 1>interviews that a person does over a computer, and some

0:03:10.160 --> 0:03:14.119
<v Speaker 1>of the interviewees are people and some of them are

0:03:14.840 --> 0:03:18.960
<v Speaker 1>chatbots, essentially, and the goal of this whole exercise

0:03:19.080 --> 0:03:21.000
<v Speaker 1>is to see if the person who's doing the interview

0:03:21.560 --> 0:03:25.679
<v Speaker 1>can consistently tell if the other entity on the other

0:03:25.680 --> 0:03:28.359
<v Speaker 1>side of the interview is a person or if it's

0:03:28.360 --> 0:03:32.239
<v Speaker 1>a chatbot. And if you pass with a certain percentage,

0:03:32.520 --> 0:03:35.040
<v Speaker 1>you would say that the chatbot has passed the

0:03:35.040 --> 0:03:38.240
<v Speaker 1>Turing test, that people are unable to tell the difference

0:03:38.280 --> 0:03:41.080
<v Speaker 1>between the chatbot and a real human being, and

0:03:41.080 --> 0:03:43.880
<v Speaker 1>that this is kind of one of the markers for

0:03:44.040 --> 0:03:48.600
<v Speaker 1>artificial intelligence. We're gonna be dipping into that sort of

0:03:48.680 --> 0:03:53.520
<v Speaker 1>thing with this discussion as well. So today I really

0:03:53.520 --> 0:03:56.520
<v Speaker 1>wanted to dive into the whole concept of predictive text

0:03:56.560 --> 0:04:00.000
<v Speaker 1>and how it's done and how it could absolutely destroy

0:04:00.120 --> 0:04:02.960
<v Speaker 1>platforms like Facebook in the future. That's how I'm going

0:04:03.000 --> 0:04:06.040
<v Speaker 1>to end this episode, So stick around, but we have

0:04:06.200 --> 0:04:10.760
<v Speaker 1>to build on this gradually, So let's start at the

0:04:10.880 --> 0:04:14.600
<v Speaker 1>very beginning, which, according to this woman who's singing outside

0:04:14.600 --> 0:04:17.320
<v Speaker 1>my window, is a very good place to start. And

0:04:17.400 --> 0:04:21.080
<v Speaker 1>we are going to start with a particularly tricky concept

0:04:21.200 --> 0:04:25.120
<v Speaker 1>for a former English Lit Major to try and explain,

0:04:25.680 --> 0:04:29.560
<v Speaker 1>and this is called a Markov model. It's named after

0:04:29.760 --> 0:04:35.920
<v Speaker 1>a mathematician named Andrey Andreyevich Markov, and he was born

0:04:35.920 --> 0:04:39.400
<v Speaker 1>in Russia in eighteen fifty six, and he did a

0:04:39.400 --> 0:04:43.840
<v Speaker 1>lot of work on an area of mathematics called stochastic processes.

0:04:44.560 --> 0:04:48.800
<v Speaker 1>But that just raises another question, right, what does stochastic mean? Well,

0:04:48.839 --> 0:04:53.640
<v Speaker 1>a stochastic variable is one that is randomly determined. A

0:04:53.680 --> 0:04:59.040
<v Speaker 1>stochastic system has a random probability pattern that you can study,

0:04:59.080 --> 0:05:03.720
<v Speaker 1>but you can't predict it precisely. There's always uncertainty. So

0:05:03.760 --> 0:05:07.880
<v Speaker 1>you can assign probabilities as to how the pattern will form,

0:05:08.480 --> 0:05:11.720
<v Speaker 1>but those are just indications of how likely a particular

0:05:11.800 --> 0:05:15.320
<v Speaker 1>pattern will form, not a guarantee. So let's take a

0:05:15.400 --> 0:05:19.640
<v Speaker 1>very simple example, and let's pick something really random. Let's

0:05:19.680 --> 0:05:22.760
<v Speaker 1>talk about my two year old niece. So let's say

0:05:22.960 --> 0:05:25.520
<v Speaker 1>my niece is standing in the middle of a room

0:05:25.760 --> 0:05:29.560
<v Speaker 1>and I walk in. Now, based on my past interactions

0:05:29.800 --> 0:05:34.480
<v Speaker 1>with this random creature, I know my niece is likely

0:05:34.560 --> 0:05:38.000
<v Speaker 1>to do one of three things. She is going to

0:05:38.120 --> 0:05:40.799
<v Speaker 1>run at me and grab my hand, and then boss

0:05:40.880 --> 0:05:43.359
<v Speaker 1>me around and put me someplace and tell me I

0:05:43.400 --> 0:05:46.159
<v Speaker 1>have to stay there. She's going to run away from

0:05:46.200 --> 0:05:49.760
<v Speaker 1>me and then hide and then demand very loudly that

0:05:49.839 --> 0:05:52.760
<v Speaker 1>I come find her. She has not, I should add,

0:05:52.839 --> 0:05:57.960
<v Speaker 1>quite grasped the concept of hiding. Or she is going

0:05:58.040 --> 0:06:02.520
<v Speaker 1>to ignore me and sing and/or dance. Those are

0:06:02.560 --> 0:06:05.440
<v Speaker 1>the things that she typically does. There are other things

0:06:05.560 --> 0:06:08.760
<v Speaker 1>she might do as well, but they happen much less frequently.

0:06:09.080 --> 0:06:12.440
<v Speaker 1>So let's say I want to sketch out this scenario

0:06:12.560 --> 0:06:16.640
<v Speaker 1>on paper. I might start with the scenario: my

0:06:16.720 --> 0:06:19.400
<v Speaker 1>niece is in a room and I come into the room.

0:06:19.640 --> 0:06:22.440
<v Speaker 1>Then I would draw little bubbles on my paper

0:06:22.800 --> 0:06:26.960
<v Speaker 1>to represent the potential actions or states as we would

0:06:26.960 --> 0:06:30.240
<v Speaker 1>call them, in a Markov chain that could follow this

0:06:30.440 --> 0:06:33.560
<v Speaker 1>input of me walking into the room. Now, based on

0:06:33.600 --> 0:06:36.640
<v Speaker 1>the number of times I've seen her respond before, I

0:06:36.680 --> 0:06:41.440
<v Speaker 1>could weight each of those states with a certain probability. If,

0:06:41.640 --> 0:06:44.680
<v Speaker 1>for example, she runs at me and grabs my hand

0:06:44.760 --> 0:06:48.039
<v Speaker 1>then bosses me around more than half the time I

0:06:48.080 --> 0:06:52.760
<v Speaker 1>can weight that outcome as, you know, more than fifty percent likely. And does that

0:06:52.839 --> 0:06:54.800
<v Speaker 1>mean the next time I walk into a room that

0:06:54.880 --> 0:07:00.160
<v Speaker 1>she's going to do that? No, each incident is random.

0:07:00.160 --> 0:07:03.400
<v Speaker 1>I'm just illustrating how likely a particular outcome is going

0:07:03.440 --> 0:07:07.080
<v Speaker 1>to be. I would then assign probabilities for the other

0:07:07.200 --> 0:07:11.280
<v Speaker 1>two outcomes I outlined, and maybe just ignore all

0:07:11.280 --> 0:07:16.200
<v Speaker 1>the outliers and say that one of them is you know, likely,

0:07:16.240 --> 0:07:18.880
<v Speaker 1>which means the third one is only five percent likely

0:07:18.960 --> 0:07:22.320
<v Speaker 1>to happen because it has to add up to one hundred percent. Now,

0:07:22.440 --> 0:07:26.280
<v Speaker 1>the example I just gave is ridiculously simple, despite the

0:07:26.280 --> 0:07:29.760
<v Speaker 1>fact that my niece is already incredibly complicated, And it

0:07:29.920 --> 0:07:34.320
<v Speaker 1>just gives us the odds of one starting state, that

0:07:34.400 --> 0:07:37.720
<v Speaker 1>of me walking into a room, transitioning into

0:07:37.760 --> 0:07:42.480
<v Speaker 1>one of three outcome states. Markov models can have lots

0:07:42.480 --> 0:07:46.040
<v Speaker 1>of variables, with some variables dependent upon the value of

0:07:46.120 --> 0:07:49.760
<v Speaker 1>other variables. So you might see a chain as something

0:07:50.080 --> 0:07:54.040
<v Speaker 1>like if outcome A happens, and there's a sixty percent chance

0:07:54.080 --> 0:07:57.280
<v Speaker 1>that it will, then there's a thirty percent chance that

0:07:57.440 --> 0:08:01.880
<v Speaker 1>a subsequent outcome A three will happen, And it can

0:08:01.920 --> 0:08:05.640
<v Speaker 1>become a really complex branching path of possibilities, but we

0:08:05.720 --> 0:08:09.679
<v Speaker 1>can stick with simple. Let's take the coin flip, the

0:08:09.840 --> 0:08:13.720
<v Speaker 1>classic example of a random variable. We know that the

0:08:13.760 --> 0:08:18.040
<v Speaker 1>odds of a fair coin landing heads up are fifty percent, and

0:08:18.120 --> 0:08:22.040
<v Speaker 1>landing tails up are also fifty percent. Flipping a coin many

0:08:22.240 --> 0:08:27.320
<v Speaker 1>thousands of times should show that collectively you're gravitating towards

0:08:27.400 --> 0:08:31.560
<v Speaker 1>those probabilities, that about half of your coin flips will

0:08:31.600 --> 0:08:34.000
<v Speaker 1>be heads and the other half will be tails. But

0:08:34.120 --> 0:08:37.480
<v Speaker 1>that does not mean you won't get on streaks where

0:08:37.520 --> 0:08:41.600
<v Speaker 1>you flip heads over and over, à la Rosencrantz and Guildenstern

0:08:41.640 --> 0:08:44.640
<v Speaker 1>Are Dead. And if you don't know that reference, I

0:08:44.720 --> 0:08:48.360
<v Speaker 1>highly recommend that you read that play or you watch

0:08:48.480 --> 0:08:51.280
<v Speaker 1>the excellent film version that has Tim Roth and Gary

0:08:51.320 --> 0:08:54.400
<v Speaker 1>Oldman in it, because it is fantastic and it kind

0:08:54.400 --> 0:08:59.160
<v Speaker 1>of dives into a fun discussion of probabilities and what

0:08:59.320 --> 0:09:03.080
<v Speaker 1>does that actually mean. Anyway, the odds of flipping a

0:09:03.160 --> 0:09:07.440
<v Speaker 1>coin heads are fifty percent for a single coin flip, but what

0:09:07.520 --> 0:09:11.160
<v Speaker 1>about a second coin flip. Well, if we look at

0:09:11.280 --> 0:09:15.560
<v Speaker 1>just that flip in isolation, that second coin flip, it's

0:09:15.559 --> 0:09:18.200
<v Speaker 1>still a fifty percent chance that it's going to land on heads.

0:09:18.840 --> 0:09:21.160
<v Speaker 1>But if we frame it a different way, if we

0:09:21.200 --> 0:09:25.360
<v Speaker 1>ask the question, what are the odds of flipping

0:09:25.360 --> 0:09:28.120
<v Speaker 1>heads twice in a row? This is a different question

0:09:28.240 --> 0:09:32.040
<v Speaker 1>because you're not thinking about individual flips. You're saying, what

0:09:32.160 --> 0:09:36.360
<v Speaker 1>are the odds of this happening twice sequentially? Well, now

0:09:36.720 --> 0:09:38.640
<v Speaker 1>we have to take the odds of it happening once,

0:09:38.679 --> 0:09:42.280
<v Speaker 1>which is fifty percent, and then we have to multiply it by itself.

0:09:42.320 --> 0:09:46.000
<v Speaker 1>It's a fifty percent chance again that it would happen twice.

0:09:46.200 --> 0:09:50.280
<v Speaker 1>So fifty percent of fifty percent is, let me do the math, it is

0:09:51.640 --> 0:09:54.400
<v Speaker 1>twenty five percent, or one in four. So if you were to do a

0:09:54.440 --> 0:09:57.080
<v Speaker 1>pair of coin flips, and you were to repeat this

0:09:57.160 --> 0:10:00.760
<v Speaker 1>experiment over and over and over again over the long run,

0:10:00.800 --> 0:10:05.400
<v Speaker 1>you would find that twenty five percent of those sequences would end up

0:10:05.440 --> 0:10:08.880
<v Speaker 1>with heads followed by heads. But what if we wanted

0:10:08.920 --> 0:10:11.400
<v Speaker 1>to say, what are the odds of flipping three

0:10:11.520 --> 0:10:14.719
<v Speaker 1>heads in a row? Well, then we have to halve

0:10:15.000 --> 0:10:20.199
<v Speaker 1>it again. So instead of one out of every four trials,

0:10:20.520 --> 0:10:23.080
<v Speaker 1>we would see one out of every eight, or twelve

0:10:23.120 --> 0:10:26.120
<v Speaker 1>point five percent. And we can keep extending this out.

0:10:26.200 --> 0:10:29.920
<v Speaker 1>We can figure out the odds of some ridiculously long

0:10:30.080 --> 0:10:33.880
<v Speaker 1>stretch of flipping heads in a row. Now, in Rosencrantz and

0:10:33.880 --> 0:10:37.240
<v Speaker 1>Guildenstern Are Dead, we are told that it happens an

0:10:37.400 --> 0:10:42.400
<v Speaker 1>astonishing ninety two times in a row. That streak has

0:10:42.440 --> 0:10:48.120
<v Speaker 1>a probability of one in five octillion. That would be

0:10:48.160 --> 0:10:53.160
<v Speaker 1>a five followed by twenty seven zeros. This does not

0:10:53.280 --> 0:10:58.479
<v Speaker 1>mean that it would be impossible, but it is unfathomably unlikely.
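
NOTE
A minimal sketch of the streak arithmetic described above, assuming a fair coin and independent flips. The function name streak_probability is illustrative and does not come from the episode.
```python
from fractions import Fraction
def streak_probability(n_heads, p_heads=Fraction(1, 2)):
    """Probability of n_heads heads in a row, assuming independent flips."""
    return p_heads ** n_heads
for n in (1, 2, 3, 92):
    p = streak_probability(n)
    # Each extra flip halves the probability: 1/2, 1/4, 1/8, and so on.
    print(f"{n} heads in a row: 1 in {p.denominator:,}")
```
Exact fractions are used instead of floats so the 92-flip case prints the full denominator, which is roughly five octillion.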

0:10:59.440 --> 0:11:03.520
<v Speaker 1>Clemson University has a useful lecture available online in the

0:11:03.559 --> 0:11:08.600
<v Speaker 1>form of a presentation, and it's titled Introduction to Markov Models,

0:11:08.880 --> 0:11:12.880
<v Speaker 1>and it uses weather forecasting as an example. And their

0:11:12.960 --> 0:11:19.760
<v Speaker 1>example takes three initial states, sunny, rainy, and cloudy. Consequently,

0:11:19.760 --> 0:11:23.319
<v Speaker 1>those are also the three potential output states, so each

0:11:23.440 --> 0:11:29.079
<v Speaker 1>state can transition into three states, including transitioning into itself,

0:11:29.120 --> 0:11:32.960
<v Speaker 1>so you could go sunny to cloudy, sunny to rainy,

0:11:33.080 --> 0:11:36.400
<v Speaker 1>or sunny to sunny. That's a valid result as well.

0:11:36.600 --> 0:11:40.640
<v Speaker 1>And in their example, the idea is that we have, based

0:11:40.640 --> 0:11:46.160
<v Speaker 1>on past observations, figured out the probability for specific forecasts

0:11:46.200 --> 0:11:48.800
<v Speaker 1>based on whatever the current weather happens to be. So,

0:11:49.120 --> 0:11:54.720
<v Speaker 1>for example, we've figured out that rain tomorrow is more likely

0:11:54.840 --> 0:11:59.040
<v Speaker 1>if it's raining today, but it's less likely if it's

0:11:59.080 --> 0:12:04.959
<v Speaker 1>just cloudy or sunny today. So whether it's cloudy,

0:12:04.960 --> 0:12:09.200
<v Speaker 1>sunny, or raining today, there's some chance that we'll see rain tomorrow.

0:12:09.400 --> 0:12:11.960
<v Speaker 1>But our model would need to have probabilities assigned to

0:12:12.120 --> 0:12:15.600
<v Speaker 1>each pair of starting and ending states. So I'm gonna

0:12:15.600 --> 0:12:18.200
<v Speaker 1>follow through with that just for the purposes of this conversation.

0:12:18.640 --> 0:12:21.959
<v Speaker 1>And we've covered the probabilities of tomorrow being rainy based

0:12:21.960 --> 0:12:25.520
<v Speaker 1>on whatever today's weather is. But the example from Clemson

0:12:25.559 --> 0:12:29.079
<v Speaker 1>also gives the other two outcome states. So if we're

0:12:29.120 --> 0:12:33.520
<v Speaker 1>looking at the probability of tomorrow being cloudy, we see

0:12:33.520 --> 0:12:37.520
<v Speaker 1>that based on our past observations, that if today is sunny,

0:12:37.559 --> 0:12:41.080
<v Speaker 1>it's a chance of cloudy tomorrow. If today is rainy,

0:12:41.120 --> 0:12:43.800
<v Speaker 1>it's a thirty percent chance, and if today is cloudy,

0:12:43.840 --> 0:12:46.679
<v Speaker 1>there's a fifty percent chance. And finally, if we want

0:12:46.760 --> 0:12:49.200
<v Speaker 1>to know if it's going to be sunny tomorrow, again

0:12:49.200 --> 0:12:51.600
<v Speaker 1>this is all just based on the example. We see

0:12:51.600 --> 0:12:54.200
<v Speaker 1>that if today is sunny, there's an eighty percent chance

0:12:54.240 --> 0:12:56.800
<v Speaker 1>that tomorrow will be too. If today is rainy, it's

0:12:56.840 --> 0:12:59.760
<v Speaker 1>just a five percent chance. If today is cloudy there's

0:12:59.760 --> 0:13:02.400
<v Speaker 1>a fifteen percent chance. Now, the reason we need to

0:13:02.400 --> 0:13:05.040
<v Speaker 1>know all of these probabilities will become clear in a second.

0:13:05.280 --> 0:13:08.679
<v Speaker 1>And again these are just examples, they don't reflect real data.

0:13:09.360 --> 0:13:12.840
<v Speaker 1>Markov got very clever and began to use math to

0:13:12.920 --> 0:13:18.120
<v Speaker 1>describe probabilities for predictions that are further out than one state. So,

0:13:18.240 --> 0:13:21.440
<v Speaker 1>for example, you might say, what is the probability that,

0:13:21.679 --> 0:13:25.320
<v Speaker 1>if today is cloudy, that tomorrow will be sunny and

0:13:25.360 --> 0:13:29.240
<v Speaker 1>that the following day will be rainy. This is kind

0:13:29.240 --> 0:13:31.520
<v Speaker 1>of similar to us asking the question of what are

0:13:31.520 --> 0:13:34.800
<v Speaker 1>the odds of flipping heads two or three times in

0:13:34.800 --> 0:13:37.920
<v Speaker 1>a row, except we're looking at the probabilities of weather

0:13:38.360 --> 0:13:41.400
<v Speaker 1>that are based on what our current conditions happen to be.

0:13:41.800 --> 0:13:45.360
<v Speaker 1>So using the example probabilities that were used in that lecture,

0:13:45.720 --> 0:13:49.600
<v Speaker 1>we would find that sunny days follow cloudy days just

0:13:49.880 --> 0:13:52.160
<v Speaker 1>fifteen percent of the time, So there's a fifteen percent

0:13:52.320 --> 0:13:55.719
<v Speaker 1>chance that tomorrow will be sunny if today is cloudy,

0:13:56.400 --> 0:14:00.480
<v Speaker 1>and rainy days follow sunny days twenty percent of

0:14:00.559 --> 0:14:04.400
<v Speaker 1>the time. So if tomorrow is sunny, there's a twenty percent

0:14:04.840 --> 0:14:08.600
<v Speaker 1>chance the day after tomorrow will be rainy. So then

0:14:09.520 --> 0:14:12.800
<v Speaker 1>that means that if today's cloudy, we've got that fifteen percent

0:14:13.320 --> 0:14:15.360
<v Speaker 1>chance tomorrow will be sunny, and if it is sunny,

0:14:15.400 --> 0:14:18.240
<v Speaker 1>there's a twenty percent chance that the day after tomorrow will be rainy.

0:14:18.320 --> 0:14:21.080
<v Speaker 1>So we have to multiply those probabilities together. We have

0:14:21.120 --> 0:14:26.640
<v Speaker 1>to multiply fifteen percent by twenty percent, or point one five times

0:14:26.640 --> 0:14:30.520
<v Speaker 1>point two. That gives us point zero three, which we

0:14:30.760 --> 0:14:33.480
<v Speaker 1>convert to a percentage. That means there's just a three

0:14:33.520 --> 0:14:37.760
<v Speaker 1>percent chance that if today is cloudy, tomorrow will be sunny,

0:14:37.800 --> 0:14:40.400
<v Speaker 1>and the day after tomorrow will be rainy. That's just

0:14:40.440 --> 0:14:43.200
<v Speaker 1>a three percent chance of that happening. And the further

0:14:43.280 --> 0:14:45.800
<v Speaker 1>out we try to predict a particular sequence of weather,

0:14:46.200 --> 0:14:49.280
<v Speaker 1>the lower the probability will be, meaning you know it

0:14:49.320 --> 0:14:52.080
<v Speaker 1>could happen. It's not like it's impossible, but it gets

0:14:52.200 --> 0:14:55.520
<v Speaker 1>less likely the further out we go from our initial state.
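
NOTE
A minimal sketch of the chained weather calculation described above. It uses only the two transition probabilities quoted for this sequence (cloudy to sunny at fifteen percent, sunny to rainy at twenty percent). The names TRANSITIONS and sequence_probability are illustrative assumptions, not part of the Clemson lecture.
```python
# Transition probabilities quoted in the episode, keyed as (today, tomorrow).
# Only the two values needed here are listed; this is not a full table.
TRANSITIONS = {
    ("cloudy", "sunny"): 0.15,
    ("sunny", "rainy"): 0.20,
}
def sequence_probability(states, transitions=TRANSITIONS):
    """Multiply the one-step probabilities along a sequence of states."""
    probability = 1.0
    for today, tomorrow in zip(states, states[1:]):
        probability *= transitions[(today, tomorrow)]
    return probability
p = sequence_probability(["cloudy", "sunny", "rainy"])
print(f"cloudy, then sunny, then rainy: {p:.0%}")
```
Every extra day multiplies in another factor no larger than one, which is why longer specific sequences can only become less likely.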

0:14:55.880 --> 0:14:59.520
<v Speaker 1>So a Markov model is a stochastic model that describes

0:14:59.600 --> 0:15:03.960
<v Speaker 1>potential sequences. It is temporal in nature. That means

0:15:04.400 --> 0:15:07.600
<v Speaker 1>we are really concerned with the state of things and

0:15:07.640 --> 0:15:11.000
<v Speaker 1>how those states will change over time, and it gives

0:15:11.080 --> 0:15:15.080
<v Speaker 1>us a way to explain how current states will depend

0:15:15.160 --> 0:15:18.800
<v Speaker 1>upon previous states. It's not just about predicting the future,

0:15:18.840 --> 0:15:23.040
<v Speaker 1>but also understanding the present. Why are things the way

0:15:23.080 --> 0:15:25.840
<v Speaker 1>they are right now? And it gives us the chance

0:15:25.880 --> 0:15:30.280
<v Speaker 1>to weigh the predictions of the future based upon past

0:15:30.360 --> 0:15:35.560
<v Speaker 1>observational data. This is why we see weather forecasts that

0:15:35.600 --> 0:15:39.000
<v Speaker 1>give us percentages for rainy days, Like a chance for

0:15:39.120 --> 0:15:41.800
<v Speaker 1>rain tells us that it's probably a good idea to

0:15:41.800 --> 0:15:44.440
<v Speaker 1>bring an umbrella if we're going outside, because based on

0:15:44.520 --> 0:15:49.320
<v Speaker 1>past observations, there's a decent chance it's going to rain today. Now,

0:15:50.640 --> 0:15:54.000
<v Speaker 1>let's get more complicated. What if we don't actually know

0:15:54.760 --> 0:15:58.080
<v Speaker 1>the current state of the weather. Let's say that you

0:15:58.160 --> 0:16:01.280
<v Speaker 1>are stuck inside and you can't see out a window,

0:16:01.320 --> 0:16:03.160
<v Speaker 1>you have no windows in the room you're in, and

0:16:03.240 --> 0:16:06.160
<v Speaker 1>someone else comes into your room and says, what's the

0:16:06.200 --> 0:16:10.280
<v Speaker 1>weather like outside? Well, the only hint that we have

0:16:10.560 --> 0:16:14.120
<v Speaker 1>in this experience is if the person that comes in

0:16:14.360 --> 0:16:17.160
<v Speaker 1>is carrying an umbrella or not. We don't actually know

0:16:17.400 --> 0:16:20.800
<v Speaker 1>the current state. We can only make an educated guess

0:16:20.840 --> 0:16:24.440
<v Speaker 1>based on the presence or absence of an umbrella. The

0:16:24.560 --> 0:16:28.040
<v Speaker 1>reality of the current state is hidden from us. This

0:16:28.160 --> 0:16:31.200
<v Speaker 1>leads us to a type of sequential analysis that's used

0:16:31.200 --> 0:16:35.640
<v Speaker 1>in computer science, the hidden Markov model. So with these models,

0:16:35.920 --> 0:16:39.280
<v Speaker 1>we're trying to learn more about the initial states by

0:16:39.320 --> 0:16:42.960
<v Speaker 1>analyzing the outcomes that we can observe. And another way

0:16:42.960 --> 0:16:45.080
<v Speaker 1>of putting it is we're trying to answer the question

0:16:45.920 --> 0:16:48.440
<v Speaker 1>Why are things how they are right now? Why did

0:16:48.440 --> 0:16:53.120
<v Speaker 1>this happen? Let's look back and figure out the probability

0:16:53.160 --> 0:16:57.560
<v Speaker 1>that a particular initial state led to what is going

0:16:57.600 --> 0:17:00.440
<v Speaker 1>on right now. Now, the whole reason I spent time

0:17:00.440 --> 0:17:04.080
<v Speaker 1>talking about Markov models and probability is that it ties

0:17:04.200 --> 0:17:08.199
<v Speaker 1>heavily into predictive text. It's also used in tons of

0:17:08.240 --> 0:17:12.800
<v Speaker 1>other computational processes and analysis, from natural language analysis to

0:17:12.920 --> 0:17:17.639
<v Speaker 1>genome sequencing. It's really powerful stuff. If we think about language,

0:17:18.000 --> 0:17:20.439
<v Speaker 1>we know that there are certain rules to things. You

0:17:20.480 --> 0:17:24.240
<v Speaker 1>can't just string random letters in a sequence and expect

0:17:24.359 --> 0:17:27.520
<v Speaker 1>that to make a word that other people can understand.

0:17:28.119 --> 0:17:31.320
<v Speaker 1>We have developed languages that have their own vocabularies and

0:17:31.440 --> 0:17:35.440
<v Speaker 1>syntax and grammars. We know that in English, for example,

0:17:35.680 --> 0:17:39.439
<v Speaker 1>the letter Q is nearly always followed by the letter U.

0:17:40.160 --> 0:17:42.920
<v Speaker 1>We know that it would be very odd to see

0:17:42.960 --> 0:17:46.960
<v Speaker 1>the letter H follow right behind the letter J in English.

0:17:47.320 --> 0:17:49.879
<v Speaker 1>And so we can start building out a dictionary and

0:17:49.960 --> 0:17:53.800
<v Speaker 1>a matrix, and the dictionary would include lots of common words,

0:17:53.840 --> 0:17:56.439
<v Speaker 1>and the matrix would include basic rules to help us

0:17:56.480 --> 0:18:00.679
<v Speaker 1>identify when someone is making a typo or misspelling something.
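
NOTE
A minimal sketch of the dictionary-plus-matrix idea described above: count which letter follows which across a word list, then turn the counts into probabilities. The tiny WORDS list and both function names are hypothetical; a real keyboard would use a far larger dictionary and a more elaborate model.
```python
from collections import Counter, defaultdict
# Hypothetical mini word list standing in for a real dictionary.
WORDS = ["the", "than", "thank", "thanks", "that", "quick", "queen", "quiet"]
def build_letter_matrix(words):
    """Count how often each letter is followed by each other letter."""
    matrix = defaultdict(Counter)
    for word in words:
        for current, following in zip(word, word[1:]):
            matrix[current][following] += 1
    return matrix
def next_letter_probabilities(matrix, letter):
    """Turn the raw follow-on counts for one letter into probabilities."""
    counts = matrix[letter]
    total = sum(counts.values())
    return {nxt: n / total for nxt, n in counts.items()}
matrix = build_letter_matrix(WORDS)
print(next_letter_probabilities(matrix, "q"))
print(next_letter_probabilities(matrix, "h"))
```
In this toy list Q is always followed by U, while H splits between E and A, which mirrors the T-H example that follows in the episode.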

0:18:01.200 --> 0:18:03.959
<v Speaker 1>And with these tools we could build out a method

0:18:04.000 --> 0:18:07.280
<v Speaker 1>for predicting a letter based on the letters that were

0:18:07.320 --> 0:18:11.359
<v Speaker 1>already typed. So if I typed T and then H,

0:18:11.520 --> 0:18:14.680
<v Speaker 1>my predictive text might helpfully offer out the letter E

0:18:14.960 --> 0:18:18.080
<v Speaker 1>because I frequently type the word the If I ignore

0:18:18.160 --> 0:18:20.680
<v Speaker 1>that and I hit the letter A, I might get

0:18:20.720 --> 0:18:25.280
<v Speaker 1>the prompt of using than or thank or maybe even

0:18:25.359 --> 0:18:29.399
<v Speaker 1>thanks or maybe something else. And we're starting down that

0:18:29.520 --> 0:18:34.320
<v Speaker 1>journey toward generative text. When we come back, I'll explain

0:18:34.359 --> 0:18:39.320
<v Speaker 1>more about this and some really cool experiments with using

0:18:39.640 --> 0:18:42.720
<v Speaker 1>machine learning and what that all means. But first let's

0:18:42.760 --> 0:18:53.919
<v Speaker 1>take a quick break. Okay, So we're building out a

0:18:53.920 --> 0:18:59.040
<v Speaker 1>tool that quote unquote understands basic probabilities of words appearing

0:18:59.080 --> 0:19:01.479
<v Speaker 1>in a given language in a given order, and it

0:19:01.560 --> 0:19:04.320
<v Speaker 1>understands that, for example, a Q will be followed by

0:19:04.480 --> 0:19:08.280
<v Speaker 1>U nearly all of the time in English. We build into

0:19:08.320 --> 0:19:12.320
<v Speaker 1>this model all sorts of probabilities, so that words that

0:19:12.359 --> 0:19:15.280
<v Speaker 1>are more common are going to pop up as autocomplete

0:19:15.280 --> 0:19:19.520
<v Speaker 1>options more frequently than uncommon words. But we can do

0:19:19.600 --> 0:19:23.679
<v Speaker 1>better than this. We can pair this with a learning model.

0:19:24.160 --> 0:19:28.680
<v Speaker 1>Learning models evolve over time, They adjust based on the

0:19:28.720 --> 0:19:32.320
<v Speaker 1>input fed to them, and we're talking about lots and

0:19:32.480 --> 0:19:37.240
<v Speaker 1>lots of input, they refine themselves, so, in other words,

0:19:37.640 --> 0:19:42.200
<v Speaker 1>they learn. So with learning models, our predictive text begins

0:19:42.240 --> 0:19:47.160
<v Speaker 1>to adjust to the specific individual who uses the predictive

0:19:47.200 --> 0:19:49.679
<v Speaker 1>text over time, like on a phone. So let's say you

0:19:49.720 --> 0:19:53.960
<v Speaker 1>and I each have the same particular model of smartphone,

0:19:54.480 --> 0:19:58.159
<v Speaker 1>and we're both running the same operating system version and everything,

0:19:58.200 --> 0:20:02.080
<v Speaker 1>like our phones are essentially identical, at least at

0:20:02.119 --> 0:20:05.520
<v Speaker 1>casual glance. And we've both been using these phones for

0:20:05.760 --> 0:20:08.439
<v Speaker 1>a few weeks. And in that time, you and I

0:20:08.480 --> 0:20:11.560
<v Speaker 1>have each used our phones to send various messages to

0:20:11.600 --> 0:20:14.960
<v Speaker 1>our friends, our family, our colleagues, you know, your arch nemesis,

0:20:14.960 --> 0:20:18.159
<v Speaker 1>Ben Bowlin, you know, the usual. As we do that,

0:20:18.800 --> 0:20:22.000
<v Speaker 1>our predictive text keyboards start to pick up on how

0:20:22.119 --> 0:20:26.360
<v Speaker 1>we use words, and it can build up a frequency matrix,

0:20:26.359 --> 0:20:30.160
<v Speaker 1>which isn't just looking at words that are common in general,

0:20:30.359 --> 0:20:34.000
<v Speaker 1>but words that are common to us as individuals, and

0:20:34.040 --> 0:20:36.920
<v Speaker 1>the way that we use words, and sometimes the way

0:20:36.960 --> 0:20:40.040
<v Speaker 1>we generate words. Maybe you happen to use the word

0:20:40.160 --> 0:20:44.040
<v Speaker 1>balderdash a lot, and so you start typing the

0:20:44.080 --> 0:20:46.800
<v Speaker 1>word and the autocomplete for balderdash will jump up

0:20:46.880 --> 0:20:49.679
<v Speaker 1>much faster than it would if I were typing it

0:20:49.800 --> 0:20:52.119
<v Speaker 1>on my phone, because my phone has never heard me

0:20:52.720 --> 0:20:56.639
<v Speaker 1>use that, so it doesn't automatically assume that's what I'm typing.
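
NOTE
Sketch, not part of the spoken episode: one minimal way to picture the per-user frequency idea just described. The class name, the starting counts, and the scoring rule (general count plus personal count) are assumptions made up for illustration; real keyboards use far richer models.
```python
from collections import Counter
class PersonalAutocomplete:
    """Toy autocomplete that ranks candidates by general plus personal usage counts."""
    def __init__(self, general_counts):
        self.general = Counter(general_counts)  # how common each word is for everyone
        self.personal = Counter()               # how often this user has typed each word
    def learn(self, message):
        # Every word the user sends nudges that word's personal count upward.
        self.personal.update(message.lower().split())
    def suggest(self, prefix, k=3):
        # Candidates are all known words (general or personal) starting with the prefix.
        vocab = set(self.general) | set(self.personal)
        matches = [w for w in vocab if w.startswith(prefix.lower())]
        # Highest combined count first; ties are broken alphabetically.
        matches.sort(key=lambda w: (-(self.general[w] + self.personal[w]), w))
        return matches[:k]
# Hypothetical general-purpose counts shared by both phones.
general = {"bald": 50, "balance": 40, "ball": 60, "balderdash": 1}
yours = PersonalAutocomplete(general)
mine = PersonalAutocomplete(general)
for _ in range(100):
    yours.learn("what utter balderdash")  # you use the word a lot; I never do
print(yours.suggest("bal"))  # ['balderdash', 'ball', 'bald']
print(mine.suggest("bal"))   # ['ball', 'bald', 'balance']
```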

0:20:56.880 --> 0:20:59.800
<v Speaker 1>Maybe I use the word folderol a lot, and

0:20:59.840 --> 0:21:02.679
<v Speaker 1>the same happens with my phone compared to yours. The

0:21:02.720 --> 0:21:06.520
<v Speaker 1>models learn the words we use, and not just

0:21:06.600 --> 0:21:09.560
<v Speaker 1>existing words, but the words we create as well. So

0:21:09.640 --> 0:21:12.320
<v Speaker 1>let's say that I was, for some reason a big

0:21:12.400 --> 0:21:14.840
<v Speaker 1>fan of How I Met Your Mother, which I'm not.

0:21:15.040 --> 0:21:16.919
<v Speaker 1>But let's say that I am a big fan of

0:21:16.920 --> 0:21:20.119
<v Speaker 1>Neil Patrick Harris, which is true, and his character often

0:21:20.160 --> 0:21:24.080
<v Speaker 1>says "that is, wait for it, legendary." And I

0:21:24.440 --> 0:21:27.560
<v Speaker 1>might extend the word legendary. So to do that, I

0:21:27.680 --> 0:21:29.840
<v Speaker 1>might throw in a whole bunch of extra e's at

0:21:29.880 --> 0:21:34.040
<v Speaker 1>the beginning of legendary. Well, my phone might pick up

0:21:34.080 --> 0:21:36.560
<v Speaker 1>that I tend to do this, and so it includes

0:21:36.640 --> 0:21:40.200
<v Speaker 1>that as a legitimate word, even though any sort of

0:21:40.560 --> 0:21:45.600
<v Speaker 1>spelling check would say this ain't a word, stop it,

0:21:45.640 --> 0:21:48.520
<v Speaker 1>But my phone's predictive text is going to include it

0:21:48.560 --> 0:21:52.200
<v Speaker 1>as saying this is something that is meaningful and thus

0:21:52.240 --> 0:21:57.480
<v Speaker 1>a valid option. Also, the phones can learn to adapt

0:21:57.520 --> 0:22:01.439
<v Speaker 1>to our own sense of syntax and grammar. Perhaps for

0:22:01.520 --> 0:22:05.200
<v Speaker 1>purposes of a particular effect. One of us tends to

0:22:05.240 --> 0:22:08.719
<v Speaker 1>tweak the syntax of the language that we're communicating in

0:22:08.760 --> 0:22:12.080
<v Speaker 1>for some reason. Maybe it's for comedic effect and it's

0:22:12.080 --> 0:22:15.480
<v Speaker 1>not following the established rules of grammar for English. But

0:22:15.560 --> 0:22:18.560
<v Speaker 1>our phone starts to understand that's how we communicate, based

0:22:18.600 --> 0:22:21.880
<v Speaker 1>on how we order our words and how we generate

0:22:21.880 --> 0:22:25.479
<v Speaker 1>our phrases, you know, how we communicate that. While our

0:22:25.560 --> 0:22:30.560
<v Speaker 1>choices aren't necessarily in alignment with an established formal system,

0:22:30.600 --> 0:22:34.880
<v Speaker 1>they represent a particular approach to communicating. Predictive text can

0:22:34.960 --> 0:22:38.840
<v Speaker 1>start to get a handle on that if it's built properly,

0:22:39.359 --> 0:22:43.640
<v Speaker 1>and even someone who communicates in an idiosyncratic way might

0:22:43.680 --> 0:22:47.680
<v Speaker 1>find that their phone is offering up particularly relevant suggestions.

0:22:47.720 --> 0:22:50.720
<v Speaker 1>So how does all this work? How do machines actually

0:22:51.160 --> 0:22:55.760
<v Speaker 1>learn stuff? Well, there's not one single method, but there

0:22:55.800 --> 0:23:00.160
<v Speaker 1>are a collection of related processes that computer scientists develop

0:23:00.160 --> 0:23:04.480
<v Speaker 1>to train machines. And you can look at two major

0:23:04.640 --> 0:23:08.359
<v Speaker 1>types of categories of machine learning, and there are a

0:23:08.400 --> 0:23:10.800
<v Speaker 1>lot of subtypes under each of these, and those would

0:23:10.840 --> 0:23:16.280
<v Speaker 1>be supervised learning and unsupervised learning. Supervised learning involves training

0:23:16.280 --> 0:23:21.280
<v Speaker 1>a computer model using known input and output information. So

0:23:21.560 --> 0:23:23.680
<v Speaker 1>let's take an example that I like to use a lot,

0:23:23.960 --> 0:23:26.919
<v Speaker 1>and it's about image recognition. So let's say you're teaching

0:23:26.920 --> 0:23:31.320
<v Speaker 1>a computer to recognize images of coffee mugs, and you

0:23:31.400 --> 0:23:35.720
<v Speaker 1>have an enormous supply of images, millions of them. Some

0:23:35.840 --> 0:23:39.120
<v Speaker 1>of them contain coffee mugs in various shapes and sizes

0:23:39.160 --> 0:23:44.320
<v Speaker 1>and colors and orientations, and the lighting can be different.

0:23:44.400 --> 0:23:46.560
<v Speaker 1>You might have the handle pointing to the left, and

0:23:46.680 --> 0:23:48.680
<v Speaker 1>some are pointing to the right. In some

0:23:48.720 --> 0:23:51.040
<v Speaker 1>cases it might be on its side. But you've got

0:23:51.160 --> 0:23:55.120
<v Speaker 1>tons of these, and you also have millions of images

0:23:55.320 --> 0:23:58.240
<v Speaker 1>of other stuff. Some of it might not even resemble

0:23:58.320 --> 0:24:02.280
<v Speaker 1>a mug remotely. Maybe it's an airplane or Christopher Walken.

0:24:02.840 --> 0:24:05.840
<v Speaker 1>Others might look kind of like a mug, you know,

0:24:05.840 --> 0:24:09.160
<v Speaker 1>it might be a glass or a bowl or something similar. Now,

0:24:09.200 --> 0:24:12.600
<v Speaker 1>as a human being, you can tell straight away if

0:24:12.640 --> 0:24:14.840
<v Speaker 1>the image you've got in front of you represents a

0:24:14.840 --> 0:24:21.280
<v Speaker 1>coffee mug or not, But machines don't inherently possess this ability.

0:24:21.640 --> 0:24:25.480
<v Speaker 1>You could feed one photo of a generic off white

0:24:25.520 --> 0:24:28.160
<v Speaker 1>coffee mug, the handle happens to be pointed to the left,

0:24:28.200 --> 0:24:30.720
<v Speaker 1>and you tag that photo as a coffee mug, you

0:24:30.760 --> 0:24:33.320
<v Speaker 1>give metadata to the computer to classify that as

0:24:33.359 --> 0:24:36.320
<v Speaker 1>a coffee mug. And if you create a database of images,

0:24:36.760 --> 0:24:39.480
<v Speaker 1>maybe you do a search for coffee mug, that one

0:24:39.480 --> 0:24:41.560
<v Speaker 1>would come up as a result because of all the

0:24:41.600 --> 0:24:46.040
<v Speaker 1>work you've done with tagging this thing and effectively telling

0:24:46.080 --> 0:24:49.440
<v Speaker 1>the computer this is what I mean by coffee mug. However,

0:24:49.720 --> 0:24:52.560
<v Speaker 1>if you fed a new image and this one is

0:24:52.600 --> 0:24:55.800
<v Speaker 1>of a red coffee mug that's of a different size,

0:24:56.119 --> 0:24:59.119
<v Speaker 1>maybe the photo has different lighting conditions, maybe the mug

0:24:59.160 --> 0:25:02.440
<v Speaker 1>is a little closer to the camera, the handle points

0:25:02.440 --> 0:25:04.760
<v Speaker 1>to the right and not the left. Would the computer

0:25:04.880 --> 0:25:09.280
<v Speaker 1>automatically know that that's a coffee mug? No, it hasn't

0:25:09.400 --> 0:25:13.040
<v Speaker 1>learned that. So you would have to build a predictive

0:25:13.119 --> 0:25:16.520
<v Speaker 1>model for a computer to follow based on the known

0:25:16.640 --> 0:25:20.840
<v Speaker 1>inputs and outputs. Your output is that you want the computer

0:25:20.960 --> 0:25:24.240
<v Speaker 1>to classify photos as either having a coffee mug in

0:25:24.280 --> 0:25:28.200
<v Speaker 1>them or not, And you might use an artificial neural network.

0:25:28.760 --> 0:25:32.880
<v Speaker 1>In this case, you're creating nodes that accept input, then

0:25:32.920 --> 0:25:35.920
<v Speaker 1>they apply some sort of decision making process to that

0:25:36.040 --> 0:25:40.000
<v Speaker 1>input and then pass it along further along the network.

0:25:40.320 --> 0:25:43.679
<v Speaker 1>You can almost think of nodes as essentially making a

0:25:43.800 --> 0:25:46.800
<v Speaker 1>yes or no judgment on a piece of data. Does

0:25:46.880 --> 0:25:50.320
<v Speaker 1>the input qualify or does it not? Does it have

0:25:50.600 --> 0:25:54.240
<v Speaker 1>this particular aspect of whatever it is you're looking at,

0:25:54.240 --> 0:25:57.640
<v Speaker 1>in our case, coffee mugs or does it lack that?

0:25:58.200 --> 0:26:01.479
<v Speaker 1>With our mug example, it could be a simple question

0:26:01.560 --> 0:26:05.320
<v Speaker 1>like is this mug shaped? But the nodes are asking

0:26:05.440 --> 0:26:08.840
<v Speaker 1>lots of questions and making lots of judgments and passing

0:26:08.880 --> 0:26:11.000
<v Speaker 1>them throughout the neural network until you get to the

0:26:11.040 --> 0:26:14.320
<v Speaker 1>final output, the final judgment of is this a coffee

0:26:14.359 --> 0:26:18.840
<v Speaker 1>mug or is it not? And computer scientists influence how

0:26:18.920 --> 0:26:23.119
<v Speaker 1>the computer processes information. They adjust the weighting of answers,

0:26:23.200 --> 0:26:27.160
<v Speaker 1>weighting as in weight, as in heavy: W-E-

0:26:27.280 --> 0:26:30.399
<v Speaker 1>I-G-H-T, weighting. So you create your model,

0:26:30.680 --> 0:26:33.200
<v Speaker 1>you use nodes that are making a series of judgments

0:26:33.200 --> 0:26:38.520
<v Speaker 1>on images. You weight those decisions so that you're hopefully

0:26:38.600 --> 0:26:42.000
<v Speaker 1>going toward a more accurate result, and you feed your

0:26:42.720 --> 0:26:45.840
<v Speaker 1>photos through and you look at the output. Now you

0:26:45.920 --> 0:26:48.679
<v Speaker 1>know whether the photos have a coffee mug in them

0:26:48.760 --> 0:26:51.120
<v Speaker 1>or not. You're looking to see if the computer can

0:26:51.200 --> 0:26:53.680
<v Speaker 1>recognize that. So you're looking to see if your model

0:26:53.760 --> 0:26:56.439
<v Speaker 1>succeeded or failed. And then you go back and you

0:26:56.480 --> 0:26:59.400
<v Speaker 1>make adjustments to your neural network. You adjust the weightings

0:26:59.480 --> 0:27:03.040
<v Speaker 1>of those decisions so that the nodes process information in

0:27:03.040 --> 0:27:05.720
<v Speaker 1>a slightly different way, and you always have the goal

0:27:06.000 --> 0:27:09.840
<v Speaker 1>of improving the accuracy of the overall system. You feed

0:27:10.000 --> 0:27:12.600
<v Speaker 1>the images through it again, and you do this over

0:27:12.920 --> 0:27:17.000
<v Speaker 1>and over. You train the computer model so that it

0:27:17.119 --> 0:27:20.320
<v Speaker 1>gets more accurate as you make these adjustments, and ultimately

0:27:20.760 --> 0:27:23.840
<v Speaker 1>you get to a system that can accept brand new images,

0:27:24.240 --> 0:27:28.160
<v Speaker 1>ones that haven't been deliberately chosen, and then sort those

0:27:28.160 --> 0:27:31.359
<v Speaker 1>into images that either are of a coffee mug or

0:27:31.440 --> 0:27:36.120
<v Speaker 1>are not. And this is in an area called classification.
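
NOTE
Sketch, not part of the spoken episode: the train, check, adjust-the-weightings loop described above, shown with a single perceptron-style node. The three hand-made features and the tiny labeled set are hypothetical; a real image classifier learns from pixels with many layers of nodes and a more sophisticated update rule.
```python
def predict(weights, bias, features):
    # A node sums its weighted inputs and makes a yes/no call: 1 = mug, 0 = not a mug.
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0
def train(examples, epochs=20, lr=1.0):
    # examples: list of (features, label) pairs whose correct labels are already known.
    weights, bias = [0.0] * len(examples[0][0]), 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)  # 0 when the guess was right
            # Nudge each weighting toward the correct answer, then keep going.
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias
# Hypothetical features per image: (has_handle, is_roughly_cylindrical, has_wings)
labeled = [
    ((1, 1, 0), 1),  # coffee mug
    ((1, 1, 0), 1),  # another coffee mug
    ((0, 1, 0), 0),  # drinking glass
    ((0, 0, 1), 0),  # airplane
    ((1, 0, 0), 0),  # something with a handle that is not a mug
]
weights, bias = train(labeled)
print(all(predict(weights, bias, f) == y for f, y in labeled))  # True
```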

0:27:36.400 --> 0:27:39.320
<v Speaker 1>So in our simple example, images just fall into two

0:27:39.359 --> 0:27:43.639
<v Speaker 1>broad classifications, photos with mugs or photos without, though we're

0:27:43.640 --> 0:27:46.160
<v Speaker 1>gonna get a little more complicated a little bit later,

0:27:46.440 --> 0:27:50.360
<v Speaker 1>so you can have all sorts of classifications. Medical imaging

0:27:50.400 --> 0:27:53.280
<v Speaker 1>systems make use of this sort of machine learning process

0:27:53.320 --> 0:27:56.359
<v Speaker 1>to indicate whether or not an image of

0:27:56.359 --> 0:28:00.239
<v Speaker 1>a tumor is benign or not. Handwriting recognition programs

0:28:00.280 --> 0:28:02.960
<v Speaker 1>do this too. Speech recognition can do this as well.

0:28:03.359 --> 0:28:06.920
<v Speaker 1>Supervised learning systems can also use a different approach

0:28:06.960 --> 0:28:10.160
<v Speaker 1>called regression as a means of training a system. Regression

0:28:10.200 --> 0:28:14.400
<v Speaker 1>is all about predicting a continuous response, like how much

0:28:14.560 --> 0:28:18.280
<v Speaker 1>electricity a community is going to need over time. It's

0:28:18.280 --> 0:28:22.160
<v Speaker 1>about predicting things to which you can assign real numbers. So,

0:28:22.280 --> 0:28:25.840
<v Speaker 1>for example, predicting a change in temperature. Temperature happens to

0:28:26.280 --> 0:28:29.240
<v Speaker 1>have a value that is a real number, so that

0:28:29.320 --> 0:28:34.399
<v Speaker 1>falls into this category. That's supervised learning, where we have

0:28:34.640 --> 0:28:38.240
<v Speaker 1>the known inputs and known outputs. We know definitively if

0:28:38.280 --> 0:28:41.280
<v Speaker 1>the information the computer generates is accurate or not because

0:28:41.320 --> 0:28:43.800
<v Speaker 1>we can actually check its work. It's kind of like

0:28:44.160 --> 0:28:47.560
<v Speaker 1>a teacher grading student tests and then working with a

0:28:47.600 --> 0:28:49.800
<v Speaker 1>student who has a low score to get a better

0:28:49.840 --> 0:28:52.760
<v Speaker 1>understanding of subject matter, and then on the next test

0:28:52.800 --> 0:28:56.120
<v Speaker 1>hopefully they score better, and you keep working with that

0:28:56.200 --> 0:28:59.840
<v Speaker 1>student over and over until they have reached a high

0:29:00.000 --> 0:29:05.200
<v Speaker 1>level of consistency in being correct. Unsupervised learning is

0:29:05.240 --> 0:29:09.440
<v Speaker 1>more about finding patterns or meaning in data where no

0:29:09.560 --> 0:29:13.600
<v Speaker 1>such patterns or meaning is initially obvious. When we talk

0:29:13.680 --> 0:29:17.800
<v Speaker 1>about sifting through big data to find patterns, this is

0:29:17.840 --> 0:29:21.360
<v Speaker 1>the kind of thing we're talking about. Those patterns might

0:29:21.360 --> 0:29:24.320
<v Speaker 1>be subtle, or they might only be obvious when you're

0:29:24.360 --> 0:29:29.520
<v Speaker 1>dealing with truly enormous amounts of information. We humans are

0:29:29.640 --> 0:29:33.280
<v Speaker 1>really good at spotting patterns up to a point. It's

0:29:33.360 --> 0:29:38.000
<v Speaker 1>part of our survival mechanism. Recognizing patterns helped ancient humans

0:29:38.000 --> 0:29:41.880
<v Speaker 1>recognize prey or predators, so it's a key element to

0:29:41.920 --> 0:29:45.640
<v Speaker 1>the survival of our species. But when you get to really,

0:29:45.880 --> 0:29:49.560
<v Speaker 1>really big quantities of data, it's hard for us to

0:29:49.600 --> 0:29:51.400
<v Speaker 1>see patterns. It would be kind of like if you

0:29:51.560 --> 0:29:53.800
<v Speaker 1>jumped off a boat in the middle of the ocean

0:29:54.280 --> 0:29:56.520
<v Speaker 1>and then you were told to look for patterns that

0:29:56.560 --> 0:29:59.480
<v Speaker 1>are the size of New Zealand. You'd be lost right away.

0:29:59.520 --> 0:30:02.960
<v Speaker 1>The scale is something we can't deal with. But computer

0:30:03.000 --> 0:30:06.280
<v Speaker 1>systems can handle data far more efficiently than we can,

0:30:06.680 --> 0:30:10.320
<v Speaker 1>and that means they can potentially spot patterns where we

0:30:10.680 --> 0:30:15.440
<v Speaker 1>would not. Unsupervised learning techniques are best for this, and

0:30:15.520 --> 0:30:19.240
<v Speaker 1>they have a few different approaches. One is clustering, which

0:30:19.480 --> 0:30:21.720
<v Speaker 1>is pretty much what it sounds like. The system looks for

0:30:21.880 --> 0:30:27.280
<v Speaker 1>groupings in data, indications of clusters, pattern clusters. And now

0:30:27.320 --> 0:30:29.960
<v Speaker 1>I need to get back to my image recognition coffee

0:30:30.040 --> 0:30:33.959
<v Speaker 1>mug analogy. If we were just feeding images that are

0:30:34.080 --> 0:30:38.600
<v Speaker 1>either a coffee mug on a neutral background or something else,

0:30:39.080 --> 0:30:42.000
<v Speaker 1>then we could go supervised learning all the way. But

0:30:42.160 --> 0:30:44.440
<v Speaker 1>if we wanted to create a system that could recognize

0:30:44.480 --> 0:30:47.640
<v Speaker 1>if a coffee mug were in a larger scene, like

0:30:47.800 --> 0:30:51.200
<v Speaker 1>a crowded kitchen table, lots of other stuff is on it,

0:30:51.280 --> 0:30:55.080
<v Speaker 1>we could probably rely a bit on unsupervised learning, in

0:30:55.120 --> 0:30:57.880
<v Speaker 1>which we would use clustering to teach the system to

0:30:57.920 --> 0:31:02.280
<v Speaker 1>look for data that collectively appears to represent a coffee mug.
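
NOTE
Sketch, not part of the spoken episode: clustering in its simplest form, using k-means on a handful of 2-D points. The episode does not name a specific algorithm, so k-means is a stand-in chosen for brevity; the points and the choice of the first k points as starting centers are assumptions. No labels are supplied, and the groupings fall out of the data alone.
```python
def squared_distance(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
def kmeans(points, k, rounds=10):
    # Start with the first k points as provisional cluster centers (a simplification).
    centers = list(points[:k])
    groups = []
    for _ in range(rounds):
        # Step 1: assign every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            groups[nearest].append(p)
        # Step 2: move each center to the average position of its group.
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers, groups
# Hypothetical data: two loose clumps of points, given in mixed order.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
centers, groups = kmeans(points, k=2)
print(groups[0])  # the three points near (1, 1.5)
print(groups[1])  # the three points near (8.5, 8.5)
```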

0:31:02.440 --> 0:31:04.920
<v Speaker 1>We're trying to create a system that can pick out

0:31:05.040 --> 0:31:07.280
<v Speaker 1>the shape of a coffee mug in an image that

0:31:07.320 --> 0:31:09.600
<v Speaker 1>has a lot of other shapes in it. The system

0:31:09.640 --> 0:31:13.960
<v Speaker 1>needs to understand which shapes, which lines and curves represent

0:31:14.120 --> 0:31:17.400
<v Speaker 1>the borders of objects. So what is a coffee mug

0:31:17.440 --> 0:31:20.920
<v Speaker 1>as opposed to say, a tablecloth or a shadow or

0:31:21.000 --> 0:31:24.400
<v Speaker 1>a bowl with a spoon next to it. Unsupervised pattern

0:31:24.440 --> 0:31:27.640
<v Speaker 1>recognition can lead to that outcome. Again, it requires a

0:31:27.680 --> 0:31:30.560
<v Speaker 1>lot of training. You feed millions of images to a

0:31:30.600 --> 0:31:34.360
<v Speaker 1>system numerous times to refine this approach. The method often

0:31:34.400 --> 0:31:38.320
<v Speaker 1>relies upon hidden Markov models. Oh and this also ties

0:31:38.360 --> 0:31:40.960
<v Speaker 1>into something else that's you know, tangentially related. But I

0:31:40.960 --> 0:31:42.840
<v Speaker 1>thought I would bring it up in case you guys

0:31:42.840 --> 0:31:45.760
<v Speaker 1>have been experiencing it as much as I have. If

0:31:45.760 --> 0:31:48.920
<v Speaker 1>you've noticed a lot more instances of websites demanding that

0:31:48.960 --> 0:31:52.080
<v Speaker 1>you prove you're not a robot with a CAPTCHA. By

0:31:52.120 --> 0:31:54.000
<v Speaker 1>the way, this is a good reminder that if you

0:31:54.120 --> 0:31:57.400
<v Speaker 1>go to the TechStuff store at TeePublic dot

0:31:57.440 --> 0:32:00.560
<v Speaker 1>com slash stores slash TechStuff, you can get a

0:32:00.600 --> 0:32:03.280
<v Speaker 1>shirt or you know, dare I say, a coffee mug

0:32:03.600 --> 0:32:07.440
<v Speaker 1>with this CAPTCHA robot idea on it. A lot of

0:32:07.480 --> 0:32:10.440
<v Speaker 1>those CAPTCHAs involve a series of photos, and it's your

0:32:10.520 --> 0:32:13.960
<v Speaker 1>job to click all the photos that have something specific

0:32:14.000 --> 0:32:17.000
<v Speaker 1>in them, you know, like bicycles or crosswalks, or traffic

0:32:17.080 --> 0:32:22.000
<v Speaker 1>lights or fire hydrants. If you've wondered why that is, well,

0:32:22.920 --> 0:32:25.680
<v Speaker 1>it all comes down to good traffic versus bad traffic.

0:32:25.720 --> 0:32:28.680
<v Speaker 1>There's a lot of traffic out there that is

0:32:28.840 --> 0:32:33.920
<v Speaker 1>powered by bots for various reasons, and that can clog

0:32:34.000 --> 0:32:37.720
<v Speaker 1>things up, and so systems and companies like Google want

0:32:37.800 --> 0:32:41.000
<v Speaker 1>to prioritize traffic that's good traffic, that represents actual people

0:32:41.120 --> 0:32:45.720
<v Speaker 1>trying to do stuff, and give them preferential access over

0:32:46.120 --> 0:32:50.200
<v Speaker 1>other traffic that might be malevolent or just might end

0:32:50.280 --> 0:32:54.600
<v Speaker 1>up making things run slower if they get unfettered access.

0:32:54.680 --> 0:32:57.840
<v Speaker 1>And the reason these CAPTCHAs are getting so difficult is

0:32:57.840 --> 0:33:02.480
<v Speaker 1>because machine learning and image recognition software has gotten really good,

0:33:02.720 --> 0:33:06.000
<v Speaker 1>and so to protect against bad traffic, companies like Google

0:33:06.120 --> 0:33:09.960
<v Speaker 1>are using difficult CAPTCHA systems that present fuzzy, dimly lit,

0:33:10.080 --> 0:33:14.800
<v Speaker 1>or otherwise you know, bad photographs to you, and your

0:33:14.840 --> 0:33:17.680
<v Speaker 1>job is to stare at them, possibly on a tiny

0:33:17.720 --> 0:33:21.520
<v Speaker 1>smartphone screen, and figure out which ones are legit. The

0:33:21.560 --> 0:33:24.800
<v Speaker 1>whole goal is to present photos that are so lousy

0:33:24.840 --> 0:33:28.960
<v Speaker 1>that machines can't really deal with them. The problem is,

0:33:29.280 --> 0:33:33.000
<v Speaker 1>over the long run, machines get better at doing this

0:33:33.040 --> 0:33:35.120
<v Speaker 1>sort of stuff, whereas we kind of, you know, we

0:33:35.200 --> 0:33:38.200
<v Speaker 1>have a cap on our performance. There will come a

0:33:38.280 --> 0:33:40.800
<v Speaker 1>point where an image will get, you know, too

0:33:40.800 --> 0:33:42.880
<v Speaker 1>fuzzy or too dim for us to make out if

0:33:42.920 --> 0:33:46.240
<v Speaker 1>there's a fire hydrant in there or not. The machines

0:33:46.280 --> 0:33:49.440
<v Speaker 1>will always get better at this stuff than

0:33:49.560 --> 0:33:53.880
<v Speaker 1>we are over the long run. Heck, older CAPTCHA systems

0:33:53.920 --> 0:33:57.880
<v Speaker 1>are completely obsolete now because computer systems can complete them

0:33:57.880 --> 0:34:01.440
<v Speaker 1>at a success rate that's actually higher than humans. We've

0:34:01.480 --> 0:34:04.000
<v Speaker 1>got a lot of science fiction stories about machines becoming

0:34:04.040 --> 0:34:07.000
<v Speaker 1>sentient and ruining humanity, but the truth of the matter

0:34:07.080 --> 0:34:10.920
<v Speaker 1>is they don't need sentience to be disruptive. If they

0:34:10.920 --> 0:34:14.719
<v Speaker 1>are directed by someone for a specific malevolent purpose, that's

0:34:14.760 --> 0:34:17.279
<v Speaker 1>bad enough, even if the machines aren't really you know,

0:34:17.600 --> 0:34:22.319
<v Speaker 1>thinking for themselves. Okay, but let's get back to predictive text

0:34:22.400 --> 0:34:25.560
<v Speaker 1>after all of this. You could create a machine learning

0:34:25.560 --> 0:34:28.160
<v Speaker 1>model that has a huge database of words, you know,

0:34:28.200 --> 0:34:31.520
<v Speaker 1>a dictionary, and you could program the system to classify

0:34:31.600 --> 0:34:34.360
<v Speaker 1>the words. You can suss out which words are nouns

0:34:34.440 --> 0:34:37.839
<v Speaker 1>and verbs and adjectives, and then apply rules to how

0:34:37.880 --> 0:34:41.080
<v Speaker 1>those words can go together to make sentences. Or you

0:34:41.080 --> 0:34:45.399
<v Speaker 1>could just you know, analyze a ton of literature and

0:34:45.440 --> 0:34:48.320
<v Speaker 1>have the computer kind of figure that out for itself,

0:34:48.680 --> 0:34:54.160
<v Speaker 1>just through statistical analysis, understand how words fit together based

0:34:54.200 --> 0:34:57.480
<v Speaker 1>upon the history of the written word, at least in

0:34:57.640 --> 0:35:01.200
<v Speaker 1>modern English, for example. If you went further back to,

0:35:01.719 --> 0:35:04.640
<v Speaker 1>like, Old English, first of all, your vocabulary would be

0:35:04.640 --> 0:35:07.600
<v Speaker 1>totally different, but your grammar would be too, and suddenly

0:35:07.680 --> 0:35:10.040
<v Speaker 1>things would not make much sense. Everything would

0:35:10.080 --> 0:35:13.600
<v Speaker 1>sound like Yoda. So the system could go through millions

0:35:13.640 --> 0:35:16.760
<v Speaker 1>of pages of materials building a statistical model that shows

0:35:16.760 --> 0:35:20.640
<v Speaker 1>how frequently certain words pair together and in which order. Effectively,

0:35:20.680 --> 0:35:24.360
<v Speaker 1>you're analyzing how humans put letters together to make words,

0:35:24.400 --> 0:35:27.400
<v Speaker 1>and words together to make sentences. You could move up

0:35:27.440 --> 0:35:30.879
<v Speaker 1>from there. You could try and analyze how sentences come

0:35:30.920 --> 0:35:35.000
<v Speaker 1>together to make up paragraphs, but it starts to get tricky. However,

0:35:35.080 --> 0:35:37.440
<v Speaker 1>you can work on a system that can present a

0:35:37.520 --> 0:35:40.359
<v Speaker 1>series of sentences that are related enough to be a

0:35:40.400 --> 0:35:43.640
<v Speaker 1>coherent presentation of ideas, at least in the short run.
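
NOTE
Sketch, not part of the spoken episode: a tiny version of the statistical model described above, counting how often one word follows another and then chaining the most frequent followers. The three-sentence corpus and the function names are made up for illustration; a real system would be built from millions of pages.
```python
from collections import Counter, defaultdict
def build_bigram_counts(text):
    # For each word, count which words follow it and how often.
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts
def generate(counts, start, length=5):
    # Repeatedly append the most frequent follower of the latest word only.
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # dead end: this word was never followed by anything
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)
# Hypothetical miniature corpus; periods are kept as separate tokens.
corpus = "i am the only one . i am the only host . i love the show ."
counts = build_bigram_counts(corpus)
print(counts["the"])              # Counter({'only': 2, 'show': 1})
print(generate(counts, "i", 4))   # i am the only
```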

0:35:44.120 --> 0:35:47.200
<v Speaker 1>It might not be super compelling or as effective as

0:35:47.239 --> 0:35:49.440
<v Speaker 1>what a human could do, but it could be a

0:35:49.440 --> 0:35:51.759
<v Speaker 1>lot more impressive than just, you know, a string of

0:35:51.760 --> 0:35:55.120
<v Speaker 1>totally unrelated words. When we come back, I'll talk a

0:35:55.120 --> 0:35:57.760
<v Speaker 1>bit more about how computer systems can put words together

0:35:57.800 --> 0:35:59.800
<v Speaker 1>for us and what that could mean in the future.

0:35:59.840 --> 0:36:11.680
<v Speaker 1>But first let's take another quick break. Okay, So, AI systems,

0:36:11.880 --> 0:36:16.120
<v Speaker 1>if sophisticated enough, can use stuff like hidden Markov models

0:36:16.120 --> 0:36:18.680
<v Speaker 1>and machine learning to put together strings of words that,

0:36:18.880 --> 0:36:23.359
<v Speaker 1>from a probability standpoint, a statistical standpoint, at least are

0:36:23.440 --> 0:36:27.720
<v Speaker 1>likely to make some sense. There's no guarantee it will

0:36:27.760 --> 0:36:31.200
<v Speaker 1>actually make sense, but if things are going well, the

0:36:31.239 --> 0:36:34.399
<v Speaker 1>phrases will be grammatically correct, and if they're going really well,

0:36:34.760 --> 0:36:37.400
<v Speaker 1>the word choice will be reasonable enough to pass muster.

0:36:38.040 --> 0:36:41.879
<v Speaker 1>But this is still pretty hard. Computer systems typically lack

0:36:41.960 --> 0:36:46.080
<v Speaker 1>the ability to build on context and meaning because they're

0:36:46.080 --> 0:36:49.080
<v Speaker 1>effectively looking for what is most likely to come next,

0:36:49.239 --> 0:36:52.360
<v Speaker 1>rather than looking back at what has already come before.

0:36:53.160 --> 0:36:55.160
<v Speaker 1>Does that make sense? Well, let me put it in

0:36:55.200 --> 0:36:58.359
<v Speaker 1>another way. In our weather example, I talked about how

0:36:58.400 --> 0:37:02.759
<v Speaker 1>the predictions for future weather depended on current weather. So

0:37:02.920 --> 0:37:06.440
<v Speaker 1>what is it doing today? If it is sunny today,

0:37:06.480 --> 0:37:08.960
<v Speaker 1>there's an eight percent chance it will be sunny tomorrow

0:37:09.000 --> 0:37:13.279
<v Speaker 1>according to our example. But the predictions don't depend upon

0:37:13.440 --> 0:37:17.319
<v Speaker 1>the weather that came earlier, like what happened yesterday. The

0:37:17.400 --> 0:37:22.000
<v Speaker 1>system doesn't care about yesterday's weather. We might care because

0:37:22.000 --> 0:37:25.200
<v Speaker 1>we're using long trends of weather to act as our

0:37:25.280 --> 0:37:28.080
<v Speaker 1>data source to train the computer model, you know, to

0:37:28.200 --> 0:37:31.640
<v Speaker 1>create those probabilities. But yesterday's weather, as far as the

0:37:31.640 --> 0:37:34.959
<v Speaker 1>computer system is concerned, has no impact on tomorrow's weather.

0:37:35.320 --> 0:37:38.719
<v Speaker 1>So if yesterday was rainy and today is sunny, the

0:37:38.760 --> 0:37:42.280
<v Speaker 1>computer doesn't really care. It just cares that today is sunny.
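
NOTE
Sketch, not part of the spoken episode: the point that only today's weather matters, written as a two-state Markov chain. The transition probabilities below are invented for this sketch and are not the figures from the episode's weather example.
```python
import random
# Hypothetical transition table: today's weather maps to the odds for tomorrow.
TRANSITIONS = {
    "sunny": {"sunny": 0.7, "rainy": 0.3},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}
def forecast(history):
    # Only the last entry (today) is consulted; everything earlier is ignored.
    return TRANSITIONS[history[-1]]
def simulate(today, days, seed=0):
    # Walk the chain forward, drawing each new day from the current day's odds.
    rng = random.Random(seed)
    path = [today]
    for _ in range(days):
        options = TRANSITIONS[path[-1]]
        path.append(rng.choices(list(options), weights=list(options.values()))[0])
    return path
# Yesterday differs, today is the same, so the forecast is identical.
print(forecast(["rainy", "sunny"]) == forecast(["sunny", "sunny"]))  # True
```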

0:37:42.480 --> 0:37:45.080
<v Speaker 1>The same thing can hold true with systems that are

0:37:45.120 --> 0:37:49.080
<v Speaker 1>creating predictive text. The goal with standard predictive text is

0:37:49.120 --> 0:37:53.080
<v Speaker 1>to save users time and effort by suggesting likely words

0:37:53.280 --> 0:37:55.960
<v Speaker 1>as you, you know, start typing. So if you start

0:37:56.040 --> 0:37:59.760
<v Speaker 1>typing the word technology, at some point, the system recognizes

0:37:59.840 --> 0:38:02.920
<v Speaker 1>the letter pattern and offers that up as an option.

0:38:03.280 --> 0:38:06.359
<v Speaker 1>And for words that are frequently used in pairs, you'll

0:38:06.400 --> 0:38:09.520
<v Speaker 1>get those suggestions right away after you type the first word.

0:38:09.880 --> 0:38:13.200
<v Speaker 1>Since this is typically presented as an option, you know,

0:38:13.280 --> 0:38:16.400
<v Speaker 1>something you can choose to use or not, it's pretty

0:38:16.400 --> 0:38:19.960
<v Speaker 1>simple to avoid going wrong unless you, as a user,

0:38:20.080 --> 0:38:23.080
<v Speaker 1>fumble things and accidentally pick the wrong word, which can

0:38:23.120 --> 0:38:26.400
<v Speaker 1>get kind of embarrassing, or if it autocompletes after the fact,

0:38:26.680 --> 0:38:29.759
<v Speaker 1>thinking that you made a spelling error and then you

0:38:30.200 --> 0:38:34.440
<v Speaker 1>have accidentally spelled Tim Minchin's name as Tim Munchkin, and

0:38:34.600 --> 0:38:39.400
<v Speaker 1>I am deeply sorry for that. Auto replies with email

0:38:39.680 --> 0:38:42.879
<v Speaker 1>get a little more complicated as the system is analyzing

0:38:42.920 --> 0:38:46.040
<v Speaker 1>the message that is coming into you before formulating a

0:38:46.080 --> 0:38:49.440
<v Speaker 1>possible response. So I have email systems that do this

0:38:49.600 --> 0:38:52.719
<v Speaker 1>for me. And one common example for me is that

0:38:52.800 --> 0:38:55.480
<v Speaker 1>our sales team here at our company will send me

0:38:55.520 --> 0:38:58.840
<v Speaker 1>an email asking if I'm okay running a particular sponsor's

0:38:58.880 --> 0:39:01.480
<v Speaker 1>ads on my show. Now, normally I like to do

0:39:01.560 --> 0:39:04.719
<v Speaker 1>research on my sponsors, so I'll take time to look

0:39:04.760 --> 0:39:08.480
<v Speaker 1>into things and then respond myself. But sometimes the request

0:39:08.520 --> 0:39:12.000
<v Speaker 1>is for a sponsor I'm familiar with and I definitely

0:39:12.080 --> 0:39:15.840
<v Speaker 1>want or you know, occasionally definitely do not want on

0:39:15.960 --> 0:39:18.680
<v Speaker 1>my show, and I'll see on my phone that I

0:39:18.719 --> 0:39:20.840
<v Speaker 1>have the option to pick a quick reply of something

0:39:20.880 --> 0:39:25.319
<v Speaker 1>like sure or yes, that's fine, or something similar. In

0:39:25.360 --> 0:39:28.720
<v Speaker 1>this case, the email program is using natural language systems

0:39:28.719 --> 0:39:31.640
<v Speaker 1>and predictive text to suss out that there is a

0:39:31.680 --> 0:39:35.200
<v Speaker 1>request and that the common responses I might make to

0:39:35.280 --> 0:39:38.319
<v Speaker 1>that request should be options. Now, it's not that the

0:39:38.320 --> 0:39:42.719
<v Speaker 1>computer system actually understands the nature of this request, but

0:39:42.920 --> 0:39:45.759
<v Speaker 1>more like the structure of a request. In other words,

0:39:45.760 --> 0:39:48.120
<v Speaker 1>it's saying, this looks like it's a yes or no question.

0:39:48.520 --> 0:39:52.239
<v Speaker 1>Let's present him with responses that are in a yes

0:39:52.360 --> 0:39:56.560
<v Speaker 1>or no format. The fact that the system doesn't really

0:39:56.560 --> 0:40:00.000
<v Speaker 1>have a deeper understanding can become evident in other use cases.
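
NOTE
The "structure, not meaning" idea can be sketched with a crude rule: if a sentence opens with an auxiliary verb and ends in a question mark, offer yes or no style replies. This is only a stand-in for the natural language systems mentioned in the episode, which are not described in detail. The opener list, the function name, and the "No, thanks" reply are assumptions for illustration.
```python
import re
#
# Hypothetical rule: a sentence that starts with one of these words and ends in "?" is treated as yes/no.
YES_NO_OPENERS = ("are", "is", "do", "does", "can", "could", "would", "will", "should")
#
def quick_replies(message):
    """Offer canned replies when the message looks like a yes/no question."""
    sentences = re.split(r"(?<=[.?!])\s+", message.strip())
    for sentence in sentences:
        words = sentence.lower().split()
        if words and sentence.endswith("?") and words[0] in YES_NO_OPENERS:
            return ["Sure", "Yes, that's fine", "No, thanks"]
    return []
#
print(quick_replies("Hi Jonathan. Are you okay running this sponsor's ads on your show?"))
print(quick_replies("Here are the download numbers for last month."))
```
The rule never looks at which sponsor is being asked about, which is the point: it reacts to the shape of the request and nothing else.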

0:40:00.560 --> 0:40:04.680
<v Speaker 1>So for example, Janelle Shane, who is a research scientist

0:40:04.719 --> 0:40:09.239
<v Speaker 1>and who has a delightful blog called AI Weirdness, took

0:40:09.320 --> 0:40:11.920
<v Speaker 1>time to try and train a machine learning system to

0:40:11.960 --> 0:40:16.240
<v Speaker 1>tell jokes. It became clear that the system could construct

0:40:16.400 --> 0:40:21.760
<v Speaker 1>something resembling a classic question slash punchline style of joke.

0:40:22.320 --> 0:40:25.640
<v Speaker 1>But it was also clear that the punchline rarely had

0:40:25.760 --> 0:40:29.040
<v Speaker 1>any connection to the question. It actually reminded me a

0:40:29.040 --> 0:40:31.520
<v Speaker 1>lot of how little kids like my two year old

0:40:31.600 --> 0:40:35.080
<v Speaker 1>niece tell jokes. These jokes are some of my favorite

0:40:35.080 --> 0:40:38.360
<v Speaker 1>in the world, not because the jokes are inherently funny,

0:40:38.680 --> 0:40:41.399
<v Speaker 1>but because they are absurd and they show how little

0:40:41.480 --> 0:40:44.840
<v Speaker 1>children can recognize the structure, but not how to build

0:40:44.920 --> 0:40:49.160
<v Speaker 1>an actual joke. My favorite of the AI generated jokes

0:40:49.360 --> 0:40:53.319
<v Speaker 1>almost got it right, and it went like this, what

0:40:53.440 --> 0:40:57.359
<v Speaker 1>do you get when you cross a dinosaur? They get

0:40:57.400 --> 0:41:01.759
<v Speaker 1>a lawyers. I mean, that's almost a real joke.

0:41:01.840 --> 0:41:05.279
<v Speaker 1>I actually love that one. Shane pointed out the bit

0:41:05.400 --> 0:41:08.480
<v Speaker 1>that I mentioned earlier that these systems have next to

0:41:08.600 --> 0:41:12.280
<v Speaker 1>no short term memory, and so building any lengthy response

0:41:12.480 --> 0:41:15.279
<v Speaker 1>is pretty much impossible because the computer system is so

0:41:15.320 --> 0:41:18.560
<v Speaker 1>focused on choosing the word that comes next without an

0:41:18.640 --> 0:41:22.960
<v Speaker 1>understanding of the connection or context of what came earlier.
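
NOTE
The "next to no short term memory" problem can be sketched with a generator that picks each word from the previous word alone. This is a simplified illustration, not the system Shane trained; the training sentence, the function names, and the seed are assumptions.
```python
import random
from collections import defaultdict
#
def build_chain(text):
    """Map each word to the list of words that followed it in the text."""
    chain = defaultdict(list)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        chain[prev].append(nxt)
    return chain
#
def generate(chain, start, length, seed=0):
    """Pick each next word from the previous word alone; nothing earlier is remembered."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length and out[-1] in chain:
        out.append(rng.choice(chain[out[-1]]))
    return " ".join(out)
#
# Hypothetical training text, invented for illustration.
jokes = "what do you get when you cross a cow and a duck ? you get milk and quackers"
chain = build_chain(jokes)
print(generate(chain, "what", 12))
```
Every adjacent pair of words in the output appeared in the training text, yet the sentence as a whole can wander, because the generator has no record of how the question began by the time it reaches the punchline.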

0:41:23.560 --> 0:41:26.640
<v Speaker 1>And you may have come across stuff like a social

0:41:26.680 --> 0:41:28.880
<v Speaker 1>media post that says something along the lines of I

0:41:28.960 --> 0:41:32.279
<v Speaker 1>fed a computer ten thousand movie scripts and asked it

0:41:32.320 --> 0:41:35.120
<v Speaker 1>to write the next you know, Highlander movie or whatever,

0:41:35.760 --> 0:41:39.560
<v Speaker 1>and then you get a little screenplay, and inevitably they

0:41:39.640 --> 0:41:44.160
<v Speaker 1>end up being silly and absurd, with crazy stage directions

0:41:44.200 --> 0:41:47.840
<v Speaker 1>and dialogue and descriptions. They also tend to be written

0:41:48.040 --> 0:41:53.319
<v Speaker 1>entirely by human beings. Most AI systems are incapable of

0:41:53.440 --> 0:41:57.920
<v Speaker 1>keeping things consistent, like character names. A computer system might

0:41:57.960 --> 0:42:01.960
<v Speaker 1>create a character name and give that character a line, but

0:42:02.840 --> 0:42:06.239
<v Speaker 1>that name is not likely to return later on in

0:42:06.280 --> 0:42:09.440
<v Speaker 1>the screenplay. It's not necessarily going to show up in

0:42:09.440 --> 0:42:13.200
<v Speaker 1>any stage directions or descriptions. It ends up being more

0:42:13.320 --> 0:42:17.600
<v Speaker 1>dreamlike and free form. It's still absurd, but it's not

0:42:17.719 --> 0:42:21.960
<v Speaker 1>as internally consistent. So if you come across a long

0:42:22.080 --> 0:42:25.360
<v Speaker 1>piece of absurdist humor that was quote unquote written

0:42:25.360 --> 0:42:29.160
<v Speaker 1>by a computer, chances are it wasn't. It was written

0:42:29.200 --> 0:42:33.120
<v Speaker 1>by a person who was emulating the dreamlike absurdism of

0:42:33.160 --> 0:42:37.359
<v Speaker 1>computer generated text. They're still really funny, they're just not

0:42:37.440 --> 0:42:41.400
<v Speaker 1>necessarily actually generated by a computer. So about that blog

0:42:41.480 --> 0:42:44.439
<v Speaker 1>post that ran on Hacker News. How did that get

0:42:44.480 --> 0:42:48.520
<v Speaker 1>past so many people? It started with Liam Porr, a

0:42:48.600 --> 0:42:52.279
<v Speaker 1>college student, a computer scientist, who made contact with a

0:42:52.320 --> 0:42:55.880
<v Speaker 1>PhD student who in turn had access to a private

0:42:55.960 --> 0:43:00.520
<v Speaker 1>beta build of the GPT-3 autocomplete tool. Porr created

0:43:00.560 --> 0:43:04.040
<v Speaker 1>a blog post title and an introduction to serve as

0:43:04.080 --> 0:43:07.680
<v Speaker 1>the launch point for the system to build upon. And

0:43:07.840 --> 0:43:11.280
<v Speaker 1>together they ran a few trials with this machine learning

0:43:11.320 --> 0:43:15.680
<v Speaker 1>system and auto-generated text system with those prompts,

0:43:15.800 --> 0:43:19.880
<v Speaker 1>and then Porr picked one of the results to submit

0:43:19.920 --> 0:43:22.200
<v Speaker 1>as a legit blog post. Now, I'm going to read

0:43:22.239 --> 0:43:25.120
<v Speaker 1>a little section of it. Now, the blog post title

0:43:25.320 --> 0:43:30.160
<v Speaker 1>was Feeling unproductive? Maybe you should stop overthinking. And here's

0:43:30.160 --> 0:43:34.320
<v Speaker 1>a segment that comes from the middle of the blog post. Quote.

0:43:35.280 --> 0:43:38.920
<v Speaker 1>When you engage in creative thinking, your brain starts working

0:43:38.960 --> 0:43:42.200
<v Speaker 1>more efficiently. It becomes more active and more open to

0:43:42.280 --> 0:43:45.360
<v Speaker 1>new ideas. It also helps you think outside the box

0:43:45.440 --> 0:43:48.399
<v Speaker 1>and look at things from a different perspective. So how

0:43:48.440 --> 0:43:53.000
<v Speaker 1>does this all tie into productivity? Well, if you're a creator,

0:43:53.280 --> 0:43:56.040
<v Speaker 1>then you should be engaging in creative thinking on a

0:43:56.040 --> 0:43:59.160
<v Speaker 1>regular basis. The more you do it, the better your

0:43:59.160 --> 0:44:02.520
<v Speaker 1>brain becomes at thinking up ideas. This makes it easier

0:44:02.560 --> 0:44:04.840
<v Speaker 1>for you to work on your projects because you won't

0:44:04.920 --> 0:44:09.680
<v Speaker 1>get stuck as often. End quote. Now the phrasing makes sense.

0:44:10.239 --> 0:44:12.920
<v Speaker 1>It's in a very casual style, and other parts of

0:44:12.920 --> 0:44:15.840
<v Speaker 1>the blog post get, you know, even more casual, sometimes

0:44:16.000 --> 0:44:21.239
<v Speaker 1>straying into grammatical error territory. It's not terribly precise, nor

0:44:21.320 --> 0:44:25.400
<v Speaker 1>is it saying anything really. The example I gave to

0:44:25.440 --> 0:44:28.040
<v Speaker 1>a friend of mine is that this blog post is

0:44:28.239 --> 0:44:31.080
<v Speaker 1>just like if I said, you know, if I'm caught

0:44:31.120 --> 0:44:34.080
<v Speaker 1>outside when it starts pouring down rain, I get wet.

0:44:34.800 --> 0:44:39.120
<v Speaker 1>I mean, yeah, that statement is true, but it's also,

0:44:39.239 --> 0:44:41.440
<v Speaker 1>you know, not saying anything, or at least not anything

0:44:41.440 --> 0:44:45.760
<v Speaker 1>that isn't already evident. All that being said, the blog

0:44:45.800 --> 0:44:48.880
<v Speaker 1>post impresses the heck out of me. And that's because

0:44:48.920 --> 0:44:53.360
<v Speaker 1>the paragraphs follow in a logical pattern. It's not well written,

0:44:53.800 --> 0:44:56.759
<v Speaker 1>but there's so much bad writing out there that it

0:44:56.840 --> 0:44:59.960
<v Speaker 1>also doesn't stand out. If I had read this without

0:45:00.200 --> 0:45:03.399
<v Speaker 1>knowing a computer generated it, I'm not certain I would

0:45:03.440 --> 0:45:06.400
<v Speaker 1>pick up on it. Again, not because it's great writing,

0:45:06.440 --> 0:45:09.560
<v Speaker 1>but because I've read a lot of really bad writing

0:45:09.560 --> 0:45:13.080
<v Speaker 1>out there. Heck, I've probably written some of it. Think

0:45:13.120 --> 0:45:16.240
<v Speaker 1>of some of the content farms out there that post

0:45:16.680 --> 0:45:19.839
<v Speaker 1>thousands of blog posts a day. There's not as many

0:45:19.880 --> 0:45:22.680
<v Speaker 1>as there were maybe you know, five years ago, but

0:45:22.760 --> 0:45:25.279
<v Speaker 1>there's still quite a few. Well, a lot of that

0:45:25.320 --> 0:45:29.280
<v Speaker 1>content is written in a very quick, slapdash style,

0:45:29.640 --> 0:45:32.840
<v Speaker 1>and no, no shade being thrown at the writers.

0:45:32.840 --> 0:45:35.680
<v Speaker 1>They're trying to make a living, but it's not exactly

0:45:35.800 --> 0:45:39.919
<v Speaker 1>well crafted work. This piece could have passed for one

0:45:39.920 --> 0:45:44.520
<v Speaker 1>of those, and the piece does actually seem to build

0:45:44.719 --> 0:45:48.000
<v Speaker 1>on itself. New paragraphs reference a point made in an

0:45:48.040 --> 0:45:51.160
<v Speaker 1>earlier paragraph, something that you didn't see so much of

0:45:51.280 --> 0:45:55.279
<v Speaker 1>in other systems. New paragraphs build on those earlier ones,

0:45:55.360 --> 0:45:59.400
<v Speaker 1>not in substantial ways, but there is a coherent link

0:45:59.600 --> 0:46:01.759
<v Speaker 1>from one paragraph to the next. It's not as free

0:46:01.800 --> 0:46:05.520
<v Speaker 1>form and absurd as other generative texts that I've seen.

0:46:06.480 --> 0:46:09.640
<v Speaker 1>As for the autocorrect on our phones, those get more

0:46:09.640 --> 0:46:11.799
<v Speaker 1>individualized as we use them. Like I said, if I

0:46:11.840 --> 0:46:14.520
<v Speaker 1>type a proper name like my dog Tibolt, my

0:46:14.600 --> 0:46:17.120
<v Speaker 1>phone starts to pick up on this that it's a

0:46:17.120 --> 0:46:20.040
<v Speaker 1>word that has a particular meaning to me, that it's

0:46:20.040 --> 0:46:23.279
<v Speaker 1>also a proper noun because I always capitalize it, and

0:46:23.320 --> 0:46:25.920
<v Speaker 1>that it's not a typo, it's not a misspelling. So

0:46:26.200 --> 0:46:28.719
<v Speaker 1>while the name wasn't in my phone's dictionary when I

0:46:28.719 --> 0:46:31.919
<v Speaker 1>first got it, it has been added to that now

0:46:31.920 --> 0:46:33.640
<v Speaker 1>that I've been using it so much, and it can

0:46:33.719 --> 0:46:36.360
<v Speaker 1>even auto complete the name as I start to type.
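
NOTE
The way a phone picks up a personal word can be sketched as a counter with a cutoff: once an unknown word has been typed often enough, it joins the dictionary and becomes available for completion. This is a toy model under stated assumptions, not any vendor's implementation; the class name, the threshold of three, and the sample name are hypothetical.
```python
class LearningKeyboard:
    """Toy keyboard dictionary that adopts words the user keeps typing."""
    def __init__(self, dictionary, threshold=3):
        self.dictionary = set(dictionary)
        self.threshold = threshold  # hypothetical cutoff, chosen for illustration
        self.seen = {}
    #
    def type_word(self, word):
        """Record a typed word; learn it once it has been typed often enough."""
        if word in self.dictionary:
            return
        self.seen[word] = self.seen.get(word, 0) + 1
        if self.seen[word] >= self.threshold:
            self.dictionary.add(word)
    #
    def complete(self, prefix):
        """Return known words starting with the prefix, in sorted order."""
        return sorted(w for w in self.dictionary if w.startswith(prefix))
#
kb = LearningKeyboard({"quick", "quiet"})
for _ in range(3):
    kb.type_word("Quillon")  # made-up proper name, always capitalized
print(kb.complete("Qui"))  # prints ['Quillon']
```
Matching is case-sensitive here, so the capitalized name is stored and completed exactly as it was typed, which loosely mirrors the point about the phone treating it as a proper noun.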

0:46:36.360 --> 0:46:40.200
<v Speaker 1>Now we have some really impressive examples of generated text

0:46:40.320 --> 0:46:44.200
<v Speaker 1>or generated language applications in AI. A couple of years ago,

0:46:44.280 --> 0:46:47.800
<v Speaker 1>Google demonstrated how the Google Assistant could make a phone

0:46:47.800 --> 0:46:51.560
<v Speaker 1>call to a real, human-operated business and make

0:46:51.560 --> 0:46:55.440
<v Speaker 1>an appointment for you. In a demonstration, the assistant called

0:46:55.520 --> 0:46:58.239
<v Speaker 1>a hair salon and had a brief conversation with the

0:46:58.239 --> 0:47:02.600
<v Speaker 1>salon employee to book a haircut appointment, and it all sounded,

0:47:02.680 --> 0:47:06.600
<v Speaker 1>you know, fairly natural. This approach to natural language recognition

0:47:06.680 --> 0:47:10.480
<v Speaker 1>and generative language is really powerful stuff. In this case,

0:47:10.640 --> 0:47:14.640
<v Speaker 1>the assistant was relying upon certain parameters, right? The assistant

0:47:14.680 --> 0:47:18.560
<v Speaker 1>knew which salon the user wanted to call. It knew

0:47:18.560 --> 0:47:22.799
<v Speaker 1>the time frame that the user had outlined as being appropriate.

0:47:22.920 --> 0:47:26.800
<v Speaker 1>In this particular demonstration, it was an appointment slot anytime

0:47:26.840 --> 0:47:29.960
<v Speaker 1>between ten am and twelve pm. It knew what day

0:47:30.080 --> 0:47:33.360
<v Speaker 1>the user wanted an appointment and had all the basics,

0:47:33.560 --> 0:47:37.239
<v Speaker 1>and then the assistant could respond to questions and statements

0:47:37.280 --> 0:47:41.239
<v Speaker 1>from the salon employee on the phone and book the appointment,

0:47:41.400 --> 0:47:45.920
<v Speaker 1>all without obviously revealing that it was an AI program.

0:47:46.000 --> 0:47:48.799
<v Speaker 1>The appearance is that the assistant is able to have

0:47:49.040 --> 0:47:53.840
<v Speaker 1>persistent knowledge, but that's more of an illusion than anything else.

0:47:54.320 --> 0:47:56.920
<v Speaker 1>It does show that computer scientists are making a lot

0:47:56.920 --> 0:48:00.280
<v Speaker 1>of progress towards building systems that can generate language that,

0:48:00.320 --> 0:48:04.320
<v Speaker 1>if it's not deeply meaningful, can at least be useful.

0:48:05.040 --> 0:48:07.160
<v Speaker 1>I'll close out with something that I covered at the

0:48:07.200 --> 0:48:11.400
<v Speaker 1>IBM Think Conference back in twenty nineteen. To demonstrate the

0:48:11.400 --> 0:48:14.920
<v Speaker 1>power of the Watson platform, which is a foundation for

0:48:15.040 --> 0:48:19.719
<v Speaker 1>various applications that all tap into deep AI processes, IBM

0:48:19.840 --> 0:48:24.360
<v Speaker 1>organized a debate between a debate champion and a system

0:48:24.400 --> 0:48:27.560
<v Speaker 1>called Project Debater, and the debate was on the

0:48:27.640 --> 0:48:32.480
<v Speaker 1>topic of subsidizing preschools. IBM had drawn the pro side

0:48:32.719 --> 0:48:35.480
<v Speaker 1>of the argument, and I got to watch this debate

0:48:35.600 --> 0:48:38.600
<v Speaker 1>live in person, and it was impressive. Not that I

0:48:38.640 --> 0:48:42.400
<v Speaker 1>felt that Watson was able to outmaneuver the skilled, logical,

0:48:42.680 --> 0:48:46.759
<v Speaker 1>eloquent human champion, but it was able to construct a

0:48:46.840 --> 0:48:51.600
<v Speaker 1>pretty sound and consistent argument. It wasn't as strong in rhetoric,

0:48:52.080 --> 0:48:55.000
<v Speaker 1>but it appeared to parse the flow of the debate

0:48:55.280 --> 0:48:58.520
<v Speaker 1>properly for the most part, constructing arguments and supporting them

0:48:58.560 --> 0:49:03.080
<v Speaker 1>with information wherever possible. It didn't come across as quite human,

0:49:03.560 --> 0:49:06.359
<v Speaker 1>but it was still really impressive. I think it will

0:49:06.400 --> 0:49:09.319
<v Speaker 1>be quite some time before machines can generate text or

0:49:09.400 --> 0:49:13.680
<v Speaker 1>speech at a level that compares with skilled humans, you know,

0:49:14.000 --> 0:49:18.200
<v Speaker 1>humans who incorporate so many things from creativity to insight

0:49:18.320 --> 0:49:22.160
<v Speaker 1>to intelligence in order to build communication. But progress is

0:49:22.200 --> 0:49:24.759
<v Speaker 1>being made all the time, and thanks to a surplus

0:49:24.800 --> 0:49:27.839
<v Speaker 1>of you know, not so great communication out there, we're

0:49:27.880 --> 0:49:31.399
<v Speaker 1>more likely to not notice the computer generated stuff as

0:49:31.440 --> 0:49:35.719
<v Speaker 1>it improves. This opens up a lot of thorny problems.

0:49:36.080 --> 0:49:39.160
<v Speaker 1>We've already got a problem with fake news. In a

0:49:39.200 --> 0:49:43.160
<v Speaker 1>world where computer systems could generate endless blog posts and

0:49:43.320 --> 0:49:47.719
<v Speaker 1>articles supporting narratives that don't reflect the truth, we're really

0:49:47.719 --> 0:49:50.320
<v Speaker 1>going to be in trouble. And I think that's why

0:49:50.320 --> 0:49:54.040
<v Speaker 1>this news about the blog post passing for a real

0:49:54.160 --> 0:49:58.319
<v Speaker 1>article should scare platforms like Facebook. If we reach a

0:49:58.360 --> 0:50:02.440
<v Speaker 1>point where computers can flood Facebook with fake news and

0:50:02.560 --> 0:50:06.760
<v Speaker 1>other computers are running bots that interact with that fake news,

0:50:07.520 --> 0:50:11.120
<v Speaker 1>fewer people are going to stick around on that platform.

0:50:11.160 --> 0:50:13.840
<v Speaker 1>They're going to, it's just gonna get turned into

0:50:13.920 --> 0:50:18.799
<v Speaker 1>a cesspit of total nonsense. You know, some

0:50:18.840 --> 0:50:20.520
<v Speaker 1>people stick around, but a lot of people are just

0:50:20.560 --> 0:50:23.080
<v Speaker 1>gonna bail. People have been bailing already. We're gonna see

0:50:23.080 --> 0:50:26.160
<v Speaker 1>a lot more leave, and once the advertisers get wind

0:50:26.320 --> 0:50:30.360
<v Speaker 1>that the majority of activity on Facebook isn't even human

0:50:31.040 --> 0:50:35.759
<v Speaker 1>and therefore doesn't represent actual potential customers, advertising money will

0:50:35.760 --> 0:50:39.319
<v Speaker 1>start to dry up, and then even a behemoth like

0:50:39.400 --> 0:50:42.960
<v Speaker 1>Facebook could crumble. Now I'm not saying this is going

0:50:43.000 --> 0:50:46.200
<v Speaker 1>to happen quickly, but I think it definitely could and

0:50:46.280 --> 0:50:49.840
<v Speaker 1>probably will happen at least in some respect over the

0:50:49.880 --> 0:50:55.840
<v Speaker 1>course of the next few years. So hey, Facebook, maybe

0:50:55.840 --> 0:51:00.200
<v Speaker 1>think about your oncoming existential crisis and you know, get

0:51:00.200 --> 0:51:04.200
<v Speaker 1>ahead of it. It would be good for everybody, including

0:51:04.239 --> 0:51:09.000
<v Speaker 1>your shareholders, and I know you really care about those. All right.

0:51:09.080 --> 0:51:11.160
<v Speaker 1>That wraps up this episode of TechStuff and how

0:51:11.320 --> 0:51:15.200
<v Speaker 1>artificial intelligence and machine learning and predictive text are all

0:51:15.239 --> 0:51:20.600
<v Speaker 1>evolving rapidly in ways that are both cool and you know, concerning,

0:51:20.800 --> 0:51:23.600
<v Speaker 1>if we're being totally honest, But I want to know

0:51:23.640 --> 0:51:25.520
<v Speaker 1>what you guys think. I also want to know if

0:51:25.520 --> 0:51:28.040
<v Speaker 1>you have any suggestions for future episodes of TechStuff.

0:51:28.360 --> 0:51:31.280
<v Speaker 1>Reach out to me on Twitter. The handle is

0:51:31.280 --> 0:51:34.320
<v Speaker 1>TechStuffHSW, and I'll talk to you again

0:51:35.080 --> 0:51:43.360
<v Speaker 1>really soon. TechStuff is an iHeartRadio production.

0:51:43.600 --> 0:51:46.400
<v Speaker 1>For more podcasts from iHeartRadio, visit the

0:51:46.520 --> 0:51:49.759
<v Speaker 1>iHeartRadio app, Apple Podcasts, or wherever you listen to

0:51:49.800 --> 0:51:50.720
<v Speaker 1>your favorite shows.