WEBVTT - Rerun: Machine Learning 101

0:00:04.400 --> 0:00:07.800
<v Speaker 1>Welcome to TechStuff, a production from iHeartRadio.

0:00:11.840 --> 0:00:14.040
<v Speaker 1>Hey there, and welcome to TechStuff. I'm your host,

0:00:14.120 --> 0:00:16.759
<v Speaker 1>Jonathan Strickland. I'm an executive producer with iHeartRadio. And

0:00:16.800 --> 0:00:20.479
<v Speaker 1>how the tech are you? All right, well, I'm still on vacation.

0:00:20.600 --> 0:00:24.040
<v Speaker 1>I'll be coming back soon, so tomorrow you should expect

0:00:24.040 --> 0:00:27.920
<v Speaker 1>a brand new episode unless something goes wrong while I'm

0:00:27.920 --> 0:00:31.000
<v Speaker 1>trying to get back. Hopefully nothing like that happens. And

0:00:31.440 --> 0:00:33.920
<v Speaker 1>so we thought we'd have a little rerun. This episode

0:00:33.960 --> 0:00:38.080
<v Speaker 1>originally published in April one, so just last year. It

0:00:38.200 --> 0:00:42.400
<v Speaker 1>is titled Machine Learning 101. And I wanted

0:00:42.440 --> 0:00:45.360
<v Speaker 1>to do this one because, as always, we hear a

0:00:45.440 --> 0:00:48.919
<v Speaker 1>lot about artificial intelligence and machine learning in the news

0:00:48.960 --> 0:00:54.040
<v Speaker 1>and in media, and often those topics get a little confusing.

0:00:54.120 --> 0:00:59.520
<v Speaker 1>They can come across more broad than some people intend,

0:00:59.840 --> 0:01:04.640
<v Speaker 1>or they can be somewhat misguided in their interpretations.

0:01:04.680 --> 0:01:06.480
<v Speaker 1>So I thought it would be useful to have a

0:01:06.480 --> 0:01:10.399
<v Speaker 1>little refresher course on machine learning and artificial intelligence. I

0:01:10.400 --> 0:01:13.920
<v Speaker 1>hope you enjoy, and I will be back at

0:01:13.920 --> 0:01:20.200
<v Speaker 1>the end. Back in the nineteen eighties, there was a comedy science fiction

0:01:20.280 --> 0:01:24.200
<v Speaker 1>film that I saw in the theater about a robot

0:01:24.600 --> 0:01:28.560
<v Speaker 1>that gains sentience and becomes a total goofball.

0:01:28.600 --> 0:01:31.039
<v Speaker 1>It hit theaters in eighty-six and it was called

0:01:31.640 --> 0:01:36.080
<v Speaker 1>Short Circuit. The movie starred Steve Guttenberg, Ally Sheedy, and

0:01:36.360 --> 0:01:40.440
<v Speaker 1>lamentably a white actor named Fisher Stevens playing a non

0:01:40.480 --> 0:01:44.520
<v Speaker 1>white character, someone who is Indian. I should add that's

0:01:44.520 --> 0:01:48.240
<v Speaker 1>not Stevens's fault. I mean, he auditioned to be in

0:01:48.240 --> 0:01:50.640
<v Speaker 1>a movie and he got a gig. He didn't cast

0:01:50.720 --> 0:01:53.080
<v Speaker 1>himself in the film, and he has since talked about

0:01:53.120 --> 0:01:56.720
<v Speaker 1>his experiences, realizing the problems with a white man playing

0:01:56.720 --> 0:01:59.760
<v Speaker 1>a non-white character, but setting aside all the problematic

0:01:59.760 --> 0:02:04.080
<v Speaker 1>whitewashing, the movie showed this robot, who in the

0:02:04.080 --> 0:02:08.440
<v Speaker 1>course of the film names itself Johnny 5, learning. It

0:02:08.560 --> 0:02:11.560
<v Speaker 1>learns about the world around it, it learns about people,

0:02:12.080 --> 0:02:16.960
<v Speaker 1>it learns about human concepts like humor and emotion, and

0:02:17.000 --> 0:02:20.919
<v Speaker 1>the general idea was pretty cute. Now, the nifty thing

0:02:21.040 --> 0:02:25.680
<v Speaker 1>is machines actually can learn. In fact, machine learning is

0:02:25.720 --> 0:02:29.320
<v Speaker 1>a really important field of study these days, complete with

0:02:29.360 --> 0:02:32.959
<v Speaker 1>its own challenges and risks. I've talked about machine learning

0:02:33.240 --> 0:02:35.040
<v Speaker 1>a few times in the past, but I figured we

0:02:35.040 --> 0:02:38.400
<v Speaker 1>could do a deeper dive to understand what machine learning

0:02:38.560 --> 0:02:42.160
<v Speaker 1>is, what it isn't, how people are leveraging machine learning,

0:02:42.240 --> 0:02:45.919
<v Speaker 1>and why I said that it does come with risks.

0:02:45.919 --> 0:02:53.280
<v Speaker 1>So let's learn about machines learning. It will be impossible

0:02:53.360 --> 0:02:56.800
<v Speaker 1>to talk about machine learning without also talking about artificial

0:02:56.840 --> 0:03:01.840
<v Speaker 1>intelligence or AI. And this term artificial intelligence is a

0:03:02.000 --> 0:03:06.520
<v Speaker 1>real doozy. It trips people up, even people who have

0:03:06.680 --> 0:03:11.560
<v Speaker 1>dedicated their lives to researching and developing artificial intelligence. You

0:03:11.600 --> 0:03:16.200
<v Speaker 1>can get two experts in AI talking about AI and

0:03:16.240 --> 0:03:19.000
<v Speaker 1>find out that because they have slightly different takes on

0:03:19.160 --> 0:03:24.680
<v Speaker 1>what AI is, there are some communication issues. It's not

0:03:24.760 --> 0:03:27.480
<v Speaker 1>as simple as Red vs. Blue would have you think:

0:03:28.080 --> 0:03:33.680
<v Speaker 1>"What does the A stand for?" So when you really

0:03:34.120 --> 0:03:36.440
<v Speaker 1>boil it down, it comes as no big

0:03:36.480 --> 0:03:39.480
<v Speaker 1>surprise that there's a lot of ambiguity here. After all,

0:03:39.840 --> 0:03:44.880
<v Speaker 1>how would you define intelligence, just intelligence, not artificial intelligence,

0:03:45.240 --> 0:03:49.880
<v Speaker 1>just intelligence? Well, would it be the ability to learn,

0:03:50.240 --> 0:03:54.480
<v Speaker 1>that is, to acquire skills and knowledge? Or is it

0:03:54.560 --> 0:03:57.920
<v Speaker 1>the application of learning? Is it problem solving? Is it

0:03:58.400 --> 0:04:01.680
<v Speaker 1>being able to think ahead and make plans in order

0:04:01.720 --> 0:04:05.960
<v Speaker 1>to achieve a specific goal? Is it the ability to

0:04:06.240 --> 0:04:09.800
<v Speaker 1>examine a problem and deconstruct it in order to figure out

0:04:09.840 --> 0:04:12.840
<v Speaker 1>the best solution, a more specific version of problem solving?

0:04:13.480 --> 0:04:18.800
<v Speaker 1>Is it the ability to recognize, understand, and navigate emotional scenarios? Now,

0:04:18.920 --> 0:04:24.200
<v Speaker 1>arguably it's all of these things and more. We all

0:04:24.240 --> 0:04:28.640
<v Speaker 1>have kind of an intuitive grasp on what intelligence is,

0:04:29.560 --> 0:04:34.240
<v Speaker 1>but defining it in a simple way tends to feel

0:04:34.240 --> 0:04:37.680
<v Speaker 1>reductive and it leaves out a lot of important details.

0:04:37.720 --> 0:04:43.440
<v Speaker 1>So if defining just general intelligence is hard, it stands

0:04:43.440 --> 0:04:46.720
<v Speaker 1>to reason that defining artificial intelligence is also a

0:04:46.760 --> 0:04:50.600
<v Speaker 1>tough job. Heck, even coming up with a number of

0:04:50.640 --> 0:04:54.680
<v Speaker 1>different types of AI is tricky. And if you

0:04:54.720 --> 0:04:59.159
<v Speaker 1>don't believe me, just google the phrase different types of

0:04:59.279 --> 0:05:03.400
<v Speaker 1>artificial intelligence. Never mind, you don't. You don't really actually

0:05:03.440 --> 0:05:06.119
<v Speaker 1>have to do that. I already did it, though. Feel

0:05:06.160 --> 0:05:08.640
<v Speaker 1>free to do it yourself and check my work if

0:05:08.680 --> 0:05:13.360
<v Speaker 1>you like. When I googled that phrase different types of AI,

0:05:13.520 --> 0:05:16.400
<v Speaker 1>some of the top results included a blog post on

0:05:16.600 --> 0:05:21.480
<v Speaker 1>BMC Software titled Four Types of Artificial Intelligence. But then

0:05:21.520 --> 0:05:24.279
<v Speaker 1>there was also an article on Codebots that was

0:05:24.320 --> 0:05:27.680
<v Speaker 1>titled what are the three types of AI? And then

0:05:27.720 --> 0:05:31.440
<v Speaker 1>there was an article from Forbes titled seven types of

0:05:31.520 --> 0:05:35.600
<v Speaker 1>artificial intelligence. See, we can't even agree on how many

0:05:35.720 --> 0:05:39.200
<v Speaker 1>versions of AI there are, because defining AI

0:05:40.080 --> 0:05:44.040
<v Speaker 1>is really hard. It largely depends upon how you view

0:05:44.200 --> 0:05:46.720
<v Speaker 1>AI and then how you break it down into different

0:05:46.760 --> 0:05:51.599
<v Speaker 1>realms of intelligence. Now we could go super high level,

0:05:51.920 --> 0:05:55.159
<v Speaker 1>because a classic way to look at AI is strong

0:05:55.760 --> 0:06:02.240
<v Speaker 1>versus weak artificial intelligence. Strong AI, sometimes called

0:06:02.440 --> 0:06:08.760
<v Speaker 1>artificial general intelligence, would be a machine that processes information

0:06:09.040 --> 0:06:13.400
<v Speaker 1>and at least appears to have some form of consciousness

0:06:13.480 --> 0:06:17.440
<v Speaker 1>and self awareness and the ability to both have experiences

0:06:17.480 --> 0:06:21.359
<v Speaker 1>and to be aware that it is having experiences. It

0:06:21.440 --> 0:06:25.599
<v Speaker 1>might even feel emotion, though maybe not emotions that we

0:06:25.680 --> 0:06:29.480
<v Speaker 1>could easily identify or sympathize with. So this would be

0:06:30.080 --> 0:06:33.840
<v Speaker 1>the kind of machine that would think in a way

0:06:34.000 --> 0:06:36.840
<v Speaker 1>similar to humans. It would be able to sense its

0:06:36.920 --> 0:06:40.640
<v Speaker 1>environment and not just react, but really process what is

0:06:40.680 --> 0:06:43.839
<v Speaker 1>going on and build an understanding. It's the type of

0:06:43.880 --> 0:06:46.880
<v Speaker 1>AI that we see a lot in science fiction. It's

0:06:46.920 --> 0:06:50.000
<v Speaker 1>the type of AI of Johnny 5 from Short Circuit

0:06:50.480 --> 0:06:53.719
<v Speaker 1>or HAL from 2001, or the droids in

0:06:53.800 --> 0:06:57.880
<v Speaker 1>Star Wars. It's also a type of artificial intelligence that

0:06:57.960 --> 0:07:01.480
<v Speaker 1>we have yet to actually achieve in the real world.

0:07:02.000 --> 0:07:06.520
<v Speaker 1>So then what is weak AI? Well, you could say

0:07:06.520 --> 0:07:10.120
<v Speaker 1>it's everything else, or you could say it's the building

0:07:10.160 --> 0:07:16.080
<v Speaker 1>blocks that maybe collectively will lead to strong AI. Weak

0:07:16.240 --> 0:07:21.160
<v Speaker 1>AI involves processes that allow machines to complete tasks, So,

0:07:21.240 --> 0:07:25.640
<v Speaker 1>for example, image recognition software could fall into this category.

0:07:25.960 --> 0:07:29.640
<v Speaker 1>Once upon a time, in order to search photos effectively,

0:07:30.160 --> 0:07:34.680
<v Speaker 1>you needed to actually add metadata like tags to

0:07:34.880 --> 0:07:40.040
<v Speaker 1>those photos. So, for example, I might tag pictures of

0:07:40.080 --> 0:07:44.080
<v Speaker 1>my dog with the meta tag dog, and then if

0:07:44.080 --> 0:07:46.920
<v Speaker 1>I wanted to see photos of my pooch, then I

0:07:46.920 --> 0:07:49.920
<v Speaker 1>would pull up my photo app and search the term dog,

0:07:50.440 --> 0:07:52.920
<v Speaker 1>and all the photos that I had tagged with the

0:07:52.960 --> 0:07:55.320
<v Speaker 1>word dog would show up. But if I had failed

0:07:55.480 --> 0:07:59.520
<v Speaker 1>to tag some pictures of my dog, those pictures wouldn't

0:07:59.560 --> 0:08:02.200
<v Speaker 1>pop up in search because the computer program wasn't actually

0:08:02.280 --> 0:08:05.200
<v Speaker 1>looking for dogs in my photos. It was just looking

0:08:05.200 --> 0:08:08.720
<v Speaker 1>for photos that had that particular meta tag attached to it.

0:08:09.480 --> 0:08:12.320
<v Speaker 1>But now we've reached a point where at least some

0:08:12.400 --> 0:08:16.720
<v Speaker 1>photo apps are using image recognition to analyze photos, and

0:08:16.760 --> 0:08:20.120
<v Speaker 1>these will return results that the algorithm has identified as

0:08:20.160 --> 0:08:23.560
<v Speaker 1>having a reasonable chance of meeting your search query. So

0:08:23.840 --> 0:08:26.280
<v Speaker 1>if I used an app like that and I put

0:08:26.320 --> 0:08:29.480
<v Speaker 1>in dog as my search term, it could pull up

0:08:29.480 --> 0:08:32.640
<v Speaker 1>photos that had no meta tags attached to them at all.

0:08:33.120 --> 0:08:36.520
<v Speaker 1>Because the search is relying on image recognition. Now, this

0:08:36.640 --> 0:08:40.680
<v Speaker 1>also means that if the image recognition algorithm isn't very good,

0:08:40.720 --> 0:08:42.960
<v Speaker 1>I could get some images that don't have a dog

0:08:43.000 --> 0:08:46.480
<v Speaker 1>in them at all, or it might miss other images

0:08:46.520 --> 0:08:48.960
<v Speaker 1>that have my dog in them. But my point is

0:08:49.000 --> 0:08:52.080
<v Speaker 1>that the ability to identify whether or not a dog

0:08:52.160 --> 0:08:56.000
<v Speaker 1>is in a particular photo represents a kind of weak

0:08:56.160 --> 0:09:01.560
<v Speaker 1>artificial intelligence. You wouldn't say that the photo search tool

0:09:01.720 --> 0:09:05.560
<v Speaker 1>possesses humanlike intelligence, because really it only does one thing.

0:09:06.120 --> 0:09:10.200
<v Speaker 1>It analyzes photos and looks for matches to specific search queries,

0:09:10.559 --> 0:09:14.360
<v Speaker 1>but it can't do anything outside of that use case. However,

0:09:14.400 --> 0:09:17.080
<v Speaker 1>that's just one little example. There are all sorts of

0:09:17.080 --> 0:09:23.120
<v Speaker 1>other ones, like voice recognition, environmental sensing, course plotting, that

0:09:23.200 --> 0:09:25.760
<v Speaker 1>kind of thing, and in some circles, as we get

0:09:25.800 --> 0:09:30.320
<v Speaker 1>better at making machines and systems that can do these things,

0:09:31.120 --> 0:09:34.120
<v Speaker 1>those elements seem to kind of drift away from the

0:09:34.200 --> 0:09:38.960
<v Speaker 1>ongoing conversation about artificial intelligence. A guy named Larry Tesler,

0:09:39.160 --> 0:09:41.320
<v Speaker 1>who was a computer scientist who worked at lots of

0:09:41.320 --> 0:09:46.320
<v Speaker 1>really important places like Xerox PARC and Amazon and Apple,

0:09:46.840 --> 0:09:52.200
<v Speaker 1>he once observed, quote, intelligence is whatever machines haven't done yet.

0:09:52.559 --> 0:09:55.920
<v Speaker 1>End quote. So his point was that the reason that

0:09:56.000 --> 0:09:58.560
<v Speaker 1>AI is really hard to talk about is that the

0:09:58.600 --> 0:10:04.160
<v Speaker 1>goalpost for what actually is artificial intelligence is constantly moving.

0:10:06.000 --> 0:10:08.560
<v Speaker 1>Now this pretty much mirrors how we think about things

0:10:08.600 --> 0:10:13.439
<v Speaker 1>like consciousness. Lots of people study consciousness, and the general

0:10:13.480 --> 0:10:16.040
<v Speaker 1>sense I get is that it's a lot easier for

0:10:16.080 --> 0:10:20.160
<v Speaker 1>people to talk about what isn't consciousness rather than what

0:10:20.520 --> 0:10:25.080
<v Speaker 1>consciousness actually is. And it seems like artificial intelligence is

0:10:25.120 --> 0:10:28.640
<v Speaker 1>in a similar place, which really isn't that big of

0:10:28.640 --> 0:10:33.640
<v Speaker 1>a surprise as we closely associate intelligence with consciousness. Now

0:10:33.679 --> 0:10:36.959
<v Speaker 1>this leads us to why there are so many different

0:10:37.040 --> 0:10:41.000
<v Speaker 1>takes on how many types of AI there are. It

0:10:41.000 --> 0:10:45.400
<v Speaker 1>all depends on how you classify different disciplines in artificial intelligence,

0:10:45.720 --> 0:10:48.920
<v Speaker 1>and over time, a lot of disciplines that were previously

0:10:49.080 --> 0:10:53.480
<v Speaker 1>distinct from AI have sort of converged into becoming part

0:10:53.600 --> 0:10:56.840
<v Speaker 1>of the AI discussion. Machine learning, as it turns out,

0:10:57.360 --> 0:11:00.880
<v Speaker 1>was part of the AI discussion, branched off from it,

0:11:01.120 --> 0:11:05.480
<v Speaker 1>and then rejoined the AI discussion years later. So I

0:11:05.520 --> 0:11:08.000
<v Speaker 1>am not going to go down all the different approaches

0:11:08.040 --> 0:11:10.640
<v Speaker 1>to classification because I don't know that they would be

0:11:10.760 --> 0:11:13.840
<v Speaker 1>that valuable to us. They would really just illustrate that

0:11:13.880 --> 0:11:16.280
<v Speaker 1>there are a lot of different ways to look at

0:11:16.320 --> 0:11:21.560
<v Speaker 1>the subject. So if you ever find yourself in a

0:11:21.600 --> 0:11:25.760
<v Speaker 1>conversation about AI, it might be a good idea to

0:11:25.800 --> 0:11:29.400
<v Speaker 1>set a few ground rules as to what everyone means

0:11:29.840 --> 0:11:33.320
<v Speaker 1>when they use the term artificial intelligence. That can help

0:11:33.559 --> 0:11:38.360
<v Speaker 1>with expectations and understanding. Or you could just run for

0:11:38.400 --> 0:11:41.560
<v Speaker 1>the nearest exit, which is what people tend to do

0:11:41.640 --> 0:11:48.120
<v Speaker 1>whenever I start talking about it. Anyway, what about machine learning? Well,

0:11:48.200 --> 0:11:51.240
<v Speaker 1>from one perspective, you could say machine learning is a

0:11:51.360 --> 0:11:55.520
<v Speaker 1>subdiscipline of artificial intelligence, although like I said, it

0:11:55.600 --> 0:11:59.679
<v Speaker 1>hasn't always been viewed as such. I think most people

0:11:59.760 --> 0:12:02.880
<v Speaker 1>would say that the ability to learn, that is, to

0:12:03.200 --> 0:12:07.520
<v Speaker 1>take information and experience and then have some form of

0:12:07.640 --> 0:12:11.120
<v Speaker 1>understanding of those things so that you can apply that

0:12:11.200 --> 0:12:15.200
<v Speaker 1>to future tasks, potentially getting better over time. I would

0:12:15.240 --> 0:12:18.880
<v Speaker 1>say most people would call that part of intelligence. But

0:12:19.480 --> 0:12:21.400
<v Speaker 1>you could also be a bit more wishy-washy and

0:12:21.440 --> 0:12:25.000
<v Speaker 1>say it's related to, you know, artificial intelligence, as opposed

0:12:25.040 --> 0:12:28.080
<v Speaker 1>to being part of AI, since the definition of AI

0:12:28.240 --> 0:12:33.320
<v Speaker 1>is, let's say, fluid. Either way of classifying machine learning works,

0:12:33.360 --> 0:12:37.960
<v Speaker 1>as far as I'm concerned. Machine learning boils down to

0:12:38.000 --> 0:12:41.520
<v Speaker 1>the idea of creating a system that can learn as

0:12:41.559 --> 0:12:45.360
<v Speaker 1>it performs a task. It can learn what works and

0:12:45.520 --> 0:12:49.280
<v Speaker 1>more importantly, what does not work. You may have heard

0:12:49.360 --> 0:12:51.920
<v Speaker 1>that we learn a lot more from our mistakes than

0:12:51.960 --> 0:12:56.320
<v Speaker 1>we do from our successes, which is pretty much true

0:12:56.360 --> 0:13:00.480
<v Speaker 1>in my experience. When something goes wrong, it's usually, but

0:13:00.800 --> 0:13:05.640
<v Speaker 1>not always, possible to trace the event or events that

0:13:05.800 --> 0:13:09.920
<v Speaker 1>led to the failure. You can identify decisions that were

0:13:09.960 --> 0:13:13.400
<v Speaker 1>probably the wrong ones or that led to a bad outcome,

0:13:14.120 --> 0:13:17.640
<v Speaker 1>But if you have a success, it's hard to figure

0:13:17.679 --> 0:13:22.600
<v Speaker 1>out which decisions were key to that successful outcome. Did

0:13:22.640 --> 0:13:25.199
<v Speaker 1>your decision at step two set you on the right path,

0:13:25.600 --> 0:13:28.720
<v Speaker 1>or was your choice at step three so good that

0:13:28.800 --> 0:13:31.840
<v Speaker 1>it helped correct a mistake that you made at step two?

0:13:32.360 --> 0:13:35.319
<v Speaker 1>But a good approach to machine learning involves a system

0:13:35.480 --> 0:13:38.560
<v Speaker 1>that can adjust things on its own to reduce mistakes

0:13:38.960 --> 0:13:41.839
<v Speaker 1>and increase the success rate. And another way of putting

0:13:41.880 --> 0:13:44.959
<v Speaker 1>it is that instead of programming a system to arrive

0:13:45.000 --> 0:13:48.920
<v Speaker 1>at a specific outcome, you are training the system to

0:13:49.080 --> 0:13:52.480
<v Speaker 1>learn how to do it by itself. And that sounds

0:13:52.480 --> 0:13:55.240
<v Speaker 1>a bit magical when you put it that way, doesn't it?

0:13:55.800 --> 0:13:59.040
<v Speaker 1>It sounds like someone just took a computer and showed

0:13:59.040 --> 0:14:01.840
<v Speaker 1>it pictures of cats and then expected the computer to

0:14:01.880 --> 0:14:05.200
<v Speaker 1>know what a cat was. And this actually does mirror

0:14:05.360 --> 0:14:09.000
<v Speaker 1>an actual project that really did do that, But I'm

0:14:09.080 --> 0:14:13.320
<v Speaker 1>leaving out some big important information in the middle. Now,

0:14:13.840 --> 0:14:17.679
<v Speaker 1>one big step is that computers and machines can't just

0:14:17.800 --> 0:14:20.880
<v Speaker 1>magically learn by default. People first had to come up

0:14:20.920 --> 0:14:24.240
<v Speaker 1>with a methodology that allows machines to go through the

0:14:24.280 --> 0:14:27.960
<v Speaker 1>process of completing a task, then making adjustments to the

0:14:28.080 --> 0:14:32.920
<v Speaker 1>process of doing that task, which would then improve future results.

0:14:33.440 --> 0:14:36.960
<v Speaker 1>We have to lay the groundwork in architecture and theory

0:14:37.160 --> 0:14:41.160
<v Speaker 1>and algorithms. We have to build the logical pathways that

0:14:41.200 --> 0:14:44.760
<v Speaker 1>computers can follow in order for them to learn. A

0:14:44.800 --> 0:14:49.680
<v Speaker 1>lot of machine learning revolves around patterns and pattern recognition.

0:14:50.080 --> 0:14:52.400
<v Speaker 1>So what do I mean by patterns? Well, I mean

0:14:52.560 --> 0:14:58.680
<v Speaker 1>some form of regularity and predictability. Machine learning models analyze

0:14:58.720 --> 0:15:03.040
<v Speaker 1>patterns and attempt to draw conclusions based on those patterns.

0:15:03.760 --> 0:15:07.120
<v Speaker 1>This in itself is tricky stuff. So why is that? Well,

0:15:07.160 --> 0:15:11.720
<v Speaker 1>it's because sometimes we might think there's a pattern when

0:15:11.720 --> 0:15:17.040
<v Speaker 1>in reality there is not. We humans are pretty good

0:15:17.320 --> 0:15:22.160
<v Speaker 1>at recognizing patterns, which makes sense. It's a survival mechanism.

0:15:22.200 --> 0:15:25.280
<v Speaker 1>If you were to look at tall grass and you

0:15:25.480 --> 0:15:28.800
<v Speaker 1>see patterns that suggest the presence of a predator like

0:15:29.000 --> 0:15:33.200
<v Speaker 1>a tiger, well you would know that danger is nearby,

0:15:33.240 --> 0:15:36.120
<v Speaker 1>and you would have the opportunity to do something about

0:15:36.160 --> 0:15:40.200
<v Speaker 1>that to help your chances of survival. If, however, you

0:15:40.320 --> 0:15:44.400
<v Speaker 1>remained blissfully unaware of the danger, you'd be far more

0:15:44.480 --> 0:15:48.240
<v Speaker 1>likely to fall prey to that hungry tiger. So recognizing

0:15:48.320 --> 0:15:51.280
<v Speaker 1>patterns is one of the abilities that gave humans a

0:15:51.360 --> 0:15:55.080
<v Speaker 1>chance to live another day, and, from an evolutionary standpoint,

0:15:55.120 --> 0:16:00.240
<v Speaker 1>a chance to make more humans. But sometimes we humans

0:16:00.280 --> 0:16:05.360
<v Speaker 1>will perceive a pattern where none actually exists. A simple

0:16:05.360 --> 0:16:08.760
<v Speaker 1>example of this is the fun exercise of laying on

0:16:08.800 --> 0:16:13.000
<v Speaker 1>your back outside, looking up at the clouds and saying,

0:16:13.040 --> 0:16:16.600
<v Speaker 1>what does that cloud remind you of? The shapes of clouds,

0:16:16.680 --> 0:16:21.120
<v Speaker 1>which have no significance and are the product of environmental factors,

0:16:21.560 --> 0:16:25.040
<v Speaker 1>can seem to suggest patterns to us. We might see

0:16:25.040 --> 0:16:28.840
<v Speaker 1>a dog, or a car or a face, but we

0:16:28.920 --> 0:16:32.880
<v Speaker 1>know that what we're really seeing is just the appearance

0:16:33.000 --> 0:16:35.400
<v Speaker 1>of a pattern. It's not evidence of a pattern

0:16:35.480 --> 0:16:40.000
<v Speaker 1>actually being there. It's noise, not signal. But it could

0:16:40.040 --> 0:16:44.200
<v Speaker 1>be misinterpreted as signal. Well, it turns out that in

0:16:44.280 --> 0:16:47.440
<v Speaker 1>machine learning applications this is also an issue. I'll talk

0:16:47.480 --> 0:16:50.520
<v Speaker 1>about it more towards the end of this episode. Computers

0:16:50.560 --> 0:16:55.400
<v Speaker 1>can sometimes misinterpret data and determine something represents a pattern

0:16:55.480 --> 0:16:58.760
<v Speaker 1>when it really doesn't. When that happens, a system relying

0:16:58.760 --> 0:17:02.760
<v Speaker 1>on machine learning can produce false positives, and the consequences

0:17:02.800 --> 0:17:06.159
<v Speaker 1>can sometimes be funny, like hey, this image recognition software

0:17:06.200 --> 0:17:09.119
<v Speaker 1>thinks this coffee mug is actually a kitty cat. Or

0:17:09.160 --> 0:17:12.640
<v Speaker 1>they can be really serious and potentially harmful. Hey, this

0:17:12.800 --> 0:17:17.120
<v Speaker 1>facial recognition software has misidentified a person, marking them as, say,

0:17:17.200 --> 0:17:20.240
<v Speaker 1>a person of interest in a criminal case. And it's

0:17:20.240 --> 0:17:23.280
<v Speaker 1>all because this facial recognition software isn't very good at

0:17:23.320 --> 0:17:29.040
<v Speaker 1>differentiating people of color. That's a real problem that really happens. Now,

0:17:29.040 --> 0:17:31.800
<v Speaker 1>when we come back, I'll give a little overview of

0:17:31.880 --> 0:17:35.080
<v Speaker 1>the evolution of machine learning. But before we do that,

0:17:35.720 --> 0:17:46.560
<v Speaker 1>let's take a quick break. To talk about the history

0:17:46.760 --> 0:17:50.080
<v Speaker 1>of machine learning. We first have to look back much

0:17:50.560 --> 0:17:54.080
<v Speaker 1>much earlier, long before the era of computers, and talk

0:17:54.160 --> 0:17:58.480
<v Speaker 1>about how thinkers like Thomas Bayes thought about the act

0:17:58.720 --> 0:18:03.400
<v Speaker 1>of problem solving. Bayes was born way back in 1702,

0:18:03.440 --> 0:18:06.320
<v Speaker 1>so quite a bit before we were thinking about machine learning,

0:18:06.720 --> 0:18:11.400
<v Speaker 1>but he was interested in problem solving for problems involving probabilities,

0:18:11.840 --> 0:18:16.480
<v Speaker 1>and specifically the relationship between different probabilities. I think it's

0:18:16.520 --> 0:18:19.440
<v Speaker 1>easier to talk about if I give you an example.

0:18:20.040 --> 0:18:22.520
<v Speaker 1>So let's make a silly one, all right, So let's

0:18:22.560 --> 0:18:27.200
<v Speaker 1>say we got ourselves a plucky podcaster. Hey there, everybody,

0:18:27.440 --> 0:18:31.960
<v Speaker 1>It's Jonathan Strickland, and it's Tuesday as I record this,

0:18:32.160 --> 0:18:35.040
<v Speaker 1>And because of who I am, you know who this

0:18:35.119 --> 0:18:39.800
<v Speaker 1>podcaster is. And because it's Tuesday, there is a chance

0:18:39.960 --> 0:18:42.840
<v Speaker 1>I am wearing a They Might Be Giants T-shirt.

0:18:43.320 --> 0:18:48.080
<v Speaker 1>And we also know that if this podcaster is wearing

0:18:48.280 --> 0:18:51.800
<v Speaker 1>a They Might Be Giants T-shirt on a Tuesday,

0:18:52.000 --> 0:18:55.639
<v Speaker 1>there's a sixty percent chance that I'm going to end up

0:18:55.640 --> 0:18:59.720
<v Speaker 1>wearing pajamas on Wednesday. But we also know that if

0:18:59.760 --> 0:19:04.280
<v Speaker 1>I did not wear a They Might Be Giants shirt on Tuesday,

0:19:04.480 --> 0:19:08.359
<v Speaker 1>and remember there's a six chance I didn't, then we

0:19:08.440 --> 0:19:10.879
<v Speaker 1>know there's an eighty percent chance I'm going to be

0:19:10.920 --> 0:19:15.359
<v Speaker 1>wearing pajamas on Wednesday. Well, Bayes worked out a way

0:19:15.440 --> 0:19:20.240
<v Speaker 1>that described the sort of probability relationship between different discrete

0:19:20.320 --> 0:19:24.320
<v Speaker 1>events and using his reasoning, you can work forward or

0:19:24.440 --> 0:19:29.000
<v Speaker 1>backward based on probabilities. Bayes would describe wearing a They

0:19:29.080 --> 0:19:32.240
<v Speaker 1>Might Be Giants shirt on Tuesday as one event and

0:19:32.280 --> 0:19:36.360
<v Speaker 1>wearing pajamas on Wednesday as a separate event, and then

0:19:36.400 --> 0:19:39.399
<v Speaker 1>describe the two not only determining how likely it is

0:19:39.440 --> 0:19:43.760
<v Speaker 1>I'll wear pajamas on Wednesday, but if we start with

0:19:43.880 --> 0:19:46.439
<v Speaker 1>the later event, in other words, if we start with

0:19:46.480 --> 0:19:50.199
<v Speaker 1>the fact that it's Wednesday and I'm wearing pajamas, we

0:19:50.240 --> 0:19:55.360
<v Speaker 1>could work out how likely it was that yesterday, on Tuesday,

0:19:55.440 --> 0:19:58.719
<v Speaker 1>I was wearing a They Might Be Giants shirt. That was

0:19:58.800 --> 0:20:01.240
<v Speaker 1>his contribution, that you can work this in either

0:20:01.359 --> 0:20:04.919
<v Speaker 1>direction if you know these different variables. Now, Bayes

0:20:05.000 --> 0:20:08.480
<v Speaker 1>never published his thoughts, but rather sent an essay explaining

0:20:08.520 --> 0:20:11.280
<v Speaker 1>it to a friend of his, who then made sure

0:20:11.359 --> 0:20:13.879
<v Speaker 1>that the work was published after Bayes had passed away,

0:20:14.160 --> 0:20:18.280
<v Speaker 1>and a few decades later, Pierre-Simon Laplace would take

0:20:18.359 --> 0:20:20.800
<v Speaker 1>this work that Bayes had done and flesh it out

0:20:20.840 --> 0:20:25.520
<v Speaker 1>into an actual formal theorem. It's an important example of

0:20:25.600 --> 0:20:30.080
<v Speaker 1>conditional probability, and a lot of what machine learning is

0:20:30.880 --> 0:20:36.000
<v Speaker 1>really boiled down to is dealing with different probabilities, not certainties, which,

0:20:36.040 --> 0:20:37.399
<v Speaker 1>when you get down to it, is what most of

0:20:37.440 --> 0:20:39.360
<v Speaker 1>us are doing most of the time. Right. We make

0:20:39.400 --> 0:20:44.720
<v Speaker 1>decisions based on at least perceived probabilities. Sometimes these decisions

0:20:44.800 --> 0:20:48.200
<v Speaker 1>might feel like they're a coin flip situation, that any

0:20:48.320 --> 0:20:51.639
<v Speaker 1>choice is equally likely to precipitate a good outcome or

0:20:51.680 --> 0:20:54.640
<v Speaker 1>a bad outcome. Other times we might make a choice

0:20:54.680 --> 0:20:58.240
<v Speaker 1>because we feel the probabilities are stacked favorably one way

0:20:58.320 --> 0:21:02.080
<v Speaker 1>over another. Sometimes we will make a choice to back

0:21:02.240 --> 0:21:07.720
<v Speaker 1>the least probable outcome, because well, humans are not always super rational.

0:21:07.760 --> 0:21:10.960
<v Speaker 1>And heck, sometimes the long shot does pay off, so

0:21:11.920 --> 0:21:16.120
<v Speaker 1>that keeps Vegas in business. Bayes' theorem is just one

0:21:16.160 --> 0:21:19.639
<v Speaker 1>example of how mathematicians and philosophers figured out ways

0:21:19.680 --> 0:21:24.639
<v Speaker 1>to mathematically express problem solving and decision making, and a

0:21:24.680 --> 0:21:26.879
<v Speaker 1>lot of this was figuring out if there were a

0:21:26.920 --> 0:21:29.880
<v Speaker 1>way to boil down things that most of us approached

0:21:29.960 --> 0:21:34.359
<v Speaker 1>through intuition and experience. So it's kind of neat, and

0:21:34.480 --> 0:21:37.080
<v Speaker 1>also the more you look into it, the more likely

0:21:37.119 --> 0:21:39.879
<v Speaker 1>you might find it's a little spooky, because it's weird to

0:21:39.880 --> 0:21:43.960
<v Speaker 1>consider that our approaches to making choices and solving problems

0:21:44.240 --> 0:21:50.440
<v Speaker 1>can be reduced down to mathematical expressions. But let's leave

0:21:50.520 --> 0:21:53.840
<v Speaker 1>the potential existential crises alone for now, shall we. So

0:21:53.960 --> 0:21:57.280
<v Speaker 1>moving on, we have another smarty pants we need to

0:21:57.320 --> 0:22:03.240
<v Speaker 1>talk about: Andrey Markov, a mathematician. In the early twentieth century,

0:22:03.320 --> 0:22:07.159
<v Speaker 1>he began studying the nature of certain random processes that

0:22:07.240 --> 0:22:10.040
<v Speaker 1>follow a particular type of rule, which we now call

0:22:10.240 --> 0:22:15.400
<v Speaker 1>the Markov property. That rule says that for this particular process,

0:22:15.440 --> 0:22:19.640
<v Speaker 1>the next stage of the process only depends upon the

0:22:19.680 --> 0:22:23.960
<v Speaker 1>current stage, but not any stages that came before then.

0:22:24.400 --> 0:22:28.480
<v Speaker 1>So let's take my ridiculous T shirt example and let's

0:22:28.480 --> 0:22:30.880
<v Speaker 1>build it out a little bit further. Let's say that

0:22:31.000 --> 0:22:33.680
<v Speaker 1>I've got three T shirts to my name. One of

0:22:33.720 --> 0:22:36.320
<v Speaker 1>them is that They Might Be Giants shirt. One is

0:22:36.359 --> 0:22:40.040
<v Speaker 1>a plain blue T shirt, and the third is a

0:22:40.119 --> 0:22:43.159
<v Speaker 1>shirt that has the tech Stuff logo on it. And

0:22:43.960 --> 0:22:48.879
<v Speaker 1>it's based off of long observation that you've determined these

0:22:48.920 --> 0:22:53.040
<v Speaker 1>following facts. If I am wearing that They Might Be

0:22:53.119 --> 0:22:57.639
<v Speaker 1>Giants shirt today, I definitely will not wear it tomorrow.

0:22:58.040 --> 0:23:01.199
<v Speaker 1>But there's a fifty fifty shot I'll wear either the

0:23:01.200 --> 0:23:05.000
<v Speaker 1>blue shirt or the tech Stuff shirt. Now, if I'm

0:23:05.040 --> 0:23:09.040
<v Speaker 1>wearing the blue shirt today, there's a ten percent chance I'm

0:23:09.040 --> 0:23:12.520
<v Speaker 1>going to wear the same blue shirt tomorrow. Don't worry,

0:23:12.800 --> 0:23:16.840
<v Speaker 1>I'll wash it first. There's a sixty percent chance that I'll

0:23:16.880 --> 0:23:19.560
<v Speaker 1>wear the tech Stuff shirt, and there's a thirty percent

0:23:19.640 --> 0:23:22.879
<v Speaker 1>chance I'll wear the They Might Be Giants shirt. But

0:23:23.800 --> 0:23:26.439
<v Speaker 1>if I'm wearing the tech stuff shirt today, there's a

0:23:26.440 --> 0:23:29.639
<v Speaker 1>seventy percent chance I'll wear it again tomorrow because I like

0:23:29.720 --> 0:23:33.000
<v Speaker 1>to promote myself. But there's a thirty percent chance I'll

0:23:33.000 --> 0:23:35.439
<v Speaker 1>wear the They Might Be Giants shirt, and there is

0:23:35.520 --> 0:23:38.160
<v Speaker 1>no chance that I'm going to wear the blue one

0:23:38.520 --> 0:23:42.760
<v Speaker 1>in this case. So those are our various scenarios, right?

0:23:43.080 --> 0:23:47.800
<v Speaker 1>Which shirt I will wear tomorrow depends only upon which

0:23:47.880 --> 0:23:51.359
<v Speaker 1>shirt I am wearing today. What I wore yesterday has

0:23:51.400 --> 0:23:55.359
<v Speaker 1>no bearing on the outcome for tomorrow, So today is

0:23:55.400 --> 0:23:59.119
<v Speaker 1>all that matters. And depending on which shirt I wear,

0:23:59.560 --> 0:24:02.879
<v Speaker 1>you can make some probability predictions for tomorrow. So we

0:24:02.920 --> 0:24:05.840
<v Speaker 1>can actually use this approach to figure out the probability

0:24:05.920 --> 0:24:09.080
<v Speaker 1>that I might wear the tech Stuff shirt, say, ten

0:24:09.200 --> 0:24:12.359
<v Speaker 1>days in a row, since there's a better than even

0:24:12.480 --> 0:24:16.000
<v Speaker 1>chance that if I'm wearing tech Stuff today, I'll end

0:24:16.080 --> 0:24:19.280
<v Speaker 1>up wearing it again tomorrow, and if I wear it tomorrow,

0:24:19.480 --> 0:24:22.119
<v Speaker 1>then there's a better than fifty percent chance that I'm going

0:24:22.160 --> 0:24:25.840
<v Speaker 1>to wear it the following day. But at some point

0:24:25.960 --> 0:24:29.119
<v Speaker 1>you're going to see that the odds are starting to

0:24:29.200 --> 0:24:33.600
<v Speaker 1>be against you, for you know, increasingly long strings of

0:24:33.640 --> 0:24:37.240
<v Speaker 1>wearing the tech stuff shirt. Anyway, Markov chains would become

0:24:37.320 --> 0:24:40.159
<v Speaker 1>one of the types of processes that machine learning models

0:24:40.200 --> 0:24:43.760
<v Speaker 1>would incorporate, with some models looking at the current state

0:24:43.880 --> 0:24:46.879
<v Speaker 1>of a given process and then making predictions on what

0:24:47.160 --> 0:24:50.679
<v Speaker 1>the next state will be with no need to look

0:24:50.800 --> 0:24:56.720
<v Speaker 1>back at the previous decisions. The Markov chain is memoryless.
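
NOTE
The shirt example above can be sketched as a small Markov chain in Python.
This is a minimal illustration, not something from the episode: the names
TRANSITIONS, next_shirt, and streak_probability are made up here, and the
probabilities are the ones quoted in the example.
```python
import random
# Keys are today's shirt; values map tomorrow's shirt to its probability.
TRANSITIONS = {
    "tmbg": {"tmbg": 0.0, "blue": 0.5, "techstuff": 0.5},
    "blue": {"tmbg": 0.3, "blue": 0.1, "techstuff": 0.6},
    "techstuff": {"tmbg": 0.3, "blue": 0.0, "techstuff": 0.7},
}
def next_shirt(today, rng=random):
    """Sample tomorrow's shirt using only today's shirt (the Markov property)."""
    options = TRANSITIONS[today]
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]
def streak_probability(shirt, extra_days):
    """Probability of repeating `shirt` for `extra_days` more days in a row."""
    return TRANSITIONS[shirt][shirt] ** extra_days
```
Under these numbers, starting in the tech Stuff shirt, ten days in a row
means nine more repeats, so streak_probability("techstuff", 9) is 0.7 to the
ninth power, or roughly four percent.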

0:24:57.640 --> 0:25:00.960
<v Speaker 1>Now that's just a couple of the mathematicians whose work

0:25:01.080 --> 0:25:05.399
<v Speaker 1>underlies elements of machine learning. There's also structure we need

0:25:05.440 --> 0:25:09.800
<v Speaker 1>to talk about. In nineteen forty nine, a man named Donald Hebb wrote

0:25:09.800 --> 0:25:13.520
<v Speaker 1>a book titled The Organization of Behavior, and in that book,

0:25:14.080 --> 0:25:18.560
<v Speaker 1>Hebb gave a hypothesis on how neurons, that is, how

0:25:18.640 --> 0:25:22.840
<v Speaker 1>brain cells interact with one another. His ideas included the

0:25:22.840 --> 0:25:27.119
<v Speaker 1>notion that if two neurons interact with one another regularly,

0:25:27.640 --> 0:25:31.000
<v Speaker 1>that is, if one fires, that the second one is

0:25:31.040 --> 0:25:35.280
<v Speaker 1>also likely to fire, they end up forming a tighter

0:25:35.320 --> 0:25:40.399
<v Speaker 1>communicative relationship with each other. Not long after his expression

0:25:40.400 --> 0:25:44.199
<v Speaker 1>of this hypothesis, computer scientists began to think of a

0:25:44.200 --> 0:25:48.480
<v Speaker 1>potential way to do this artificially, with machines, creating the

0:25:48.560 --> 0:25:54.440
<v Speaker 1>equivalent of artificial neurons. The relative strength in relationship between

0:25:54.720 --> 0:25:59.560
<v Speaker 1>artificial neurons is something we describe by weight. That's going

0:25:59.600 --> 0:26:02.919
<v Speaker 1>to be an important part of machine learning. Weight, by

0:26:02.920 --> 0:26:06.120
<v Speaker 1>the way, is W E I G H T, as

0:26:06.160 --> 0:26:11.439
<v Speaker 1>in this relationship is weighted more heavily than that relationship.

0:26:12.200 --> 0:26:16.080
<v Speaker 1>In the early nineteen fifties, an IBM researcher named Arthur

0:26:16.280 --> 0:26:19.919
<v Speaker 1>Samuel created a program designed to win at checkers. The

0:26:19.960 --> 0:26:22.920
<v Speaker 1>program would do a quick analysis of where pieces were

0:26:23.160 --> 0:26:27.120
<v Speaker 1>on a checkerboard and whose move it was, and then

0:26:27.200 --> 0:26:30.520
<v Speaker 1>calculate the chances of each side winning the game based

0:26:30.560 --> 0:26:33.280
<v Speaker 1>on those positions. And it did this with a minimax

0:26:33.320 --> 0:26:38.000
<v Speaker 1>approach. Alright, so checkers is a two player turn

0:26:38.080 --> 0:26:41.160
<v Speaker 1>based game. Player one makes a move, then player two

0:26:41.160 --> 0:26:43.560
<v Speaker 1>can make a move. There are a finite number of

0:26:43.600 --> 0:26:47.439
<v Speaker 1>moves that can be made, a finite number of possibilities,

0:26:47.480 --> 0:26:51.760
<v Speaker 1>though admittedly it's a pretty good number of possibilities. But

0:26:51.880 --> 0:26:54.159
<v Speaker 1>let's say a game has been going on for a

0:26:54.200 --> 0:26:57.080
<v Speaker 1>few moves, and you've got your two sides you've got

0:26:57.080 --> 0:26:59.639
<v Speaker 1>the red checkers over on player one side and the

0:26:59.720 --> 0:27:02.639
<v Speaker 1>black checkers for player two. Let's say it's player

0:27:02.720 --> 0:27:06.080
<v Speaker 1>one's move. For the purposes of this example, we'll say

0:27:06.080 --> 0:27:08.880
<v Speaker 1>that player one really just has one piece that they

0:27:09.520 --> 0:27:12.800
<v Speaker 1>can actually move on this turn, and it can move

0:27:12.840 --> 0:27:17.160
<v Speaker 1>into one of two open spaces. So player one has

0:27:17.200 --> 0:27:20.280
<v Speaker 1>to make a choice. After that choice, it's going to

0:27:20.320 --> 0:27:23.720
<v Speaker 1>be player two's turn, so we can create a decision

0:27:23.800 --> 0:27:28.399
<v Speaker 1>tree illustrating the possible choices and the possible outcomes of

0:27:28.440 --> 0:27:32.440
<v Speaker 1>those choices. These choices are the children of the starting

0:27:32.440 --> 0:27:35.880
<v Speaker 1>position for player one, so player one's starting position has

0:27:36.119 --> 0:27:39.960
<v Speaker 1>two children. Player two will have their own choices to

0:27:40.040 --> 0:27:43.760
<v Speaker 1>make after that decision has been made, but those choices

0:27:43.760 --> 0:27:48.400
<v Speaker 1>are going to depend upon whatever move player one ultimately takes.

0:27:48.440 --> 0:27:51.720
<v Speaker 1>So we can extend out our decision tree, showing the

0:27:51.800 --> 0:27:56.120
<v Speaker 1>branching possible moves that player two might make, and these

0:27:56.160 --> 0:28:00.639
<v Speaker 1>are the children of the two possible outcomes of our choice.

0:28:01.160 --> 0:28:04.960
<v Speaker 1>After player two's turn, it's player one's turn again, which

0:28:04.960 --> 0:28:08.760
<v Speaker 1>means we need to branch those decisions out even further.

0:28:09.359 --> 0:28:12.000
<v Speaker 1>And this is all before player one has even made

0:28:12.240 --> 0:28:16.840
<v Speaker 1>that first choice. We're just evaluating possibilities. At some point,

0:28:17.080 --> 0:28:19.560
<v Speaker 1>either when we have plotted far enough out that we

0:28:19.640 --> 0:28:23.760
<v Speaker 1>know all possible outcomes of the game, or we're just

0:28:24.240 --> 0:28:26.919
<v Speaker 1>reaching a point where it would be unmanageable for us

0:28:26.920 --> 0:28:29.879
<v Speaker 1>to go any further, we need to actually analyze what

0:28:29.960 --> 0:28:35.639
<v Speaker 1>our options are. The endpoints represent either a win, a loss,

0:28:35.920 --> 0:28:39.720
<v Speaker 1>or a draw for player one, or, if we haven't

0:28:39.760 --> 0:28:41.959
<v Speaker 1>extended out the tree all the way to the end

0:28:41.960 --> 0:28:45.040
<v Speaker 1>of the game, at least a change in advantage, whether

0:28:45.240 --> 0:28:47.840
<v Speaker 1>it would be in player one's advantage to make that

0:28:47.920 --> 0:28:52.680
<v Speaker 1>move or disadvantage. We could actually assign numerical values to

0:28:52.760 --> 0:28:56.760
<v Speaker 1>each end point, with positive values representing an advantage for

0:28:56.840 --> 0:29:00.120
<v Speaker 1>player one and a negative value representing an advantage for

0:29:00.120 --> 0:29:03.040
<v Speaker 1>player two, and once we do that, we can

0:29:03.080 --> 0:29:06.840
<v Speaker 1>see which pathways tend to lead to better outcomes for

0:29:07.040 --> 0:29:11.360
<v Speaker 1>player one. We work backward through the decision tree, so

0:29:11.680 --> 0:29:15.120
<v Speaker 1>on all the decisions that end in an advantage for

0:29:15.200 --> 0:29:18.080
<v Speaker 1>player one, we can say this is the choice that

0:29:18.120 --> 0:29:21.640
<v Speaker 1>player one would take. But then we know that

0:29:21.640 --> 0:29:25.200
<v Speaker 1>player two is always going to choose whichever

0:29:25.320 --> 0:29:29.360
<v Speaker 1>choice has the greatest advantage for that player, so we

0:29:29.440 --> 0:29:32.400
<v Speaker 1>have to actually take that into account as we're working backward,

0:29:33.400 --> 0:29:36.840
<v Speaker 1>and this is how we can finally get to the

0:29:36.840 --> 0:29:39.120
<v Speaker 1>point where we decide which move we're going to make.

0:29:39.200 --> 0:29:42.760
<v Speaker 1>Because these decisions as you go backward up the tree,

0:29:43.560 --> 0:29:47.480
<v Speaker 1>they ultimately inform you which of those two choices is

0:29:47.520 --> 0:29:51.280
<v Speaker 1>going to give you the best result. Those values, well,

0:29:51.440 --> 0:29:54.280
<v Speaker 1>those are weights. So for player one, the goal is

0:29:54.320 --> 0:29:57.640
<v Speaker 1>to pick the path that has the highest positive value.

0:29:58.040 --> 0:30:00.680
<v Speaker 1>For player two, it's to pick the path that has

0:30:00.720 --> 0:30:04.320
<v Speaker 1>the lowest possible value or the highest negative value if

0:30:04.360 --> 0:30:06.800
<v Speaker 1>you prefer. So, in other words, player one might be

0:30:06.840 --> 0:30:09.960
<v Speaker 1>thinking something like, if I move to Spot A, my

0:30:10.080 --> 0:30:13.160
<v Speaker 1>chance of winning this game, But if I moved to

0:30:13.160 --> 0:30:17.960
<v Speaker 1>Spot B, it's only so. Of course, those percentages will

0:30:18.000 --> 0:30:19.960
<v Speaker 1>also depend on what player two is going to do

0:30:20.000 --> 0:30:22.880
<v Speaker 1>in response. Some moves that player two might do could

0:30:23.000 --> 0:30:26.520
<v Speaker 1>end up guaranteeing a win for player one. This is

0:30:26.560 --> 0:30:30.080
<v Speaker 1>the minimax approach, and there's an algorithm that guides it.
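
NOTE
The backward pass through the decision tree described above can be sketched
in a few lines of Python. This is an illustration under assumptions, not
Samuel's checkers program: the function names minimax and best_move, the
nested-list tree, and the endpoint scores are all made up for the example.
```python
def minimax(node, maximizing):
    """Return the best value the player to move can force from `node`.
    A node is either a number (an endpoint scored from player one's point
    of view) or a list of child nodes (the moves available from here)."""
    if isinstance(node, (int, float)):
        return node
    child_values = [minimax(child, not maximizing) for child in node]
    # Player one picks the highest value, player two the lowest.
    return max(child_values) if maximizing else min(child_values)
def best_move(children):
    """Index of the move player one should pick; player two replies next."""
    values = [minimax(child, maximizing=False) for child in children]
    return max(range(len(values)), key=values.__getitem__)
# Hypothetical position: player one can move to spot A or spot B,
# and player two has two replies to each.
tree = [[3, -2], [-1, -4]]
```
With these made-up scores, player two would answer spot A with the reply
worth -2 and spot B with the reply worth -4, so both options are bad for
player one and best_move(tree) settles on spot A, the least bad one.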

0:30:30.080 --> 0:30:33.800
<v Speaker 1>It depends upon the current position within a game and

0:30:33.920 --> 0:30:36.680
<v Speaker 1>how many moves or how much depth it has to

0:30:36.720 --> 0:30:40.240
<v Speaker 1>take into account, and which player it is actually

0:30:40.280 --> 0:30:44.440
<v Speaker 1>helping out. What happens is if player one does this

0:30:44.480 --> 0:30:48.720
<v Speaker 1>evaluation and finds that both options are negative, well, then

0:30:49.560 --> 0:30:51.760
<v Speaker 1>this is something that happens in games, right, Sometimes you

0:30:51.840 --> 0:30:54.880
<v Speaker 1>find out there is no good move, like any move

0:30:54.920 --> 0:30:56.880
<v Speaker 1>you make is going to be a losing move. Well,

0:30:56.920 --> 0:30:59.040
<v Speaker 1>the only option at that point is to choose the

0:30:59.160 --> 0:31:01.920
<v Speaker 1>least bad one, so it would be whatever the

0:31:01.960 --> 0:31:06.360
<v Speaker 1>smallest negative value choice was. Our next big development that

0:31:06.400 --> 0:31:10.720
<v Speaker 1>I need to mention is Frank Rosenblatt's artificial neural network

0:31:10.840 --> 0:31:15.480
<v Speaker 1>called Perceptron. Its purpose was to recognize shapes and patterns,

0:31:15.840 --> 0:31:18.400
<v Speaker 1>and it was originally going to be its own machine

0:31:18.520 --> 0:31:23.040
<v Speaker 1>like actual hardware, but the first incarnation of Perceptron would

0:31:23.080 --> 0:31:26.000
<v Speaker 1>actually be in the form of software rather than hardware.

0:31:26.320 --> 0:31:29.880
<v Speaker 1>There was a purpose built Perceptron later, but the original

0:31:29.880 --> 0:31:34.360
<v Speaker 1>one was software. Despite some early excitement, the Perceptron proved

0:31:34.400 --> 0:31:37.960
<v Speaker 1>to be somewhat limited in its capabilities, and interest in

0:31:38.040 --> 0:31:41.320
<v Speaker 1>artificial neural networks died down for a while as a result.

0:31:42.320 --> 0:31:45.080
<v Speaker 1>In a way, you could kind of compare this to

0:31:45.280 --> 0:31:48.320
<v Speaker 1>some other technologies that got a big hype cycle and

0:31:48.360 --> 0:31:52.440
<v Speaker 1>then later deflated. Virtual reality is the one I always

0:31:52.480 --> 0:31:54.920
<v Speaker 1>go with. Back in the nineteen nineties, the world was

0:31:55.000 --> 0:32:00.000
<v Speaker 1>really hyped for virtual reality. People had incredibly unrealistic x

0:32:00.000 --> 0:32:03.320
<v Speaker 1>spectations for what VR actually meant and what it could do,

0:32:04.000 --> 0:32:06.720
<v Speaker 1>and when it turned out that VR wasn't nearly as

0:32:06.720 --> 0:32:10.600
<v Speaker 1>sophisticated as people were imagining, a lot of enthusiasm dropped

0:32:10.640 --> 0:32:15.320
<v Speaker 1>out for the entire field, and with that dropped funding

0:32:15.440 --> 0:32:18.480
<v Speaker 1>and support, and as a result, development in VR hit

0:32:18.520 --> 0:32:21.560
<v Speaker 1>a real wall, with only a fraction of the people

0:32:21.600 --> 0:32:24.640
<v Speaker 1>who had been working in the field sticking around, and

0:32:25.200 --> 0:32:27.600
<v Speaker 1>they had to scramble just to find funding to keep

0:32:27.640 --> 0:32:30.680
<v Speaker 1>their projects going. So VR was effectively put on the

0:32:30.720 --> 0:32:34.520
<v Speaker 1>shelf and wouldn't make much progress for nearly twenty years. Well,

0:32:34.640 --> 0:32:39.120
<v Speaker 1>artificial neural networks had a very similar issue, but other

0:32:39.160 --> 0:32:43.680
<v Speaker 1>computer scientists eventually found ways to design artificial neural networks

0:32:43.960 --> 0:32:47.240
<v Speaker 1>that could do some pretty amazing things if they had

0:32:47.280 --> 0:32:50.680
<v Speaker 1>access to enough data. When we come back, I'll talk

0:32:50.720 --> 0:32:53.560
<v Speaker 1>a little bit more about that and what it all means,

0:32:53.600 --> 0:33:04.800
<v Speaker 1>but first let's take another quick break. So we left

0:33:04.840 --> 0:33:07.800
<v Speaker 1>off with the AI field going into hibernation for a

0:33:07.840 --> 0:33:11.720
<v Speaker 1>little bit. Theory and mathematics were bumping up against the

0:33:11.760 --> 0:33:15.280
<v Speaker 1>limitations of technology, which wasn't quite at the level to

0:33:15.840 --> 0:33:19.040
<v Speaker 1>put all that theory to the test. Plus there needed

0:33:19.040 --> 0:33:22.000
<v Speaker 1>to be some tweaks to the approaches, but those came

0:33:22.120 --> 0:33:26.200
<v Speaker 1>with time and more mathematicians found new ways to create

0:33:26.280 --> 0:33:30.720
<v Speaker 1>artificial neural networks capable of stuff like pattern recognition and learning.

0:33:31.320 --> 0:33:36.400
<v Speaker 1>So let's imagine another decision tree. We've got our starting position.

0:33:37.160 --> 0:33:40.000
<v Speaker 1>This is probably where we put some input. We would

0:33:40.120 --> 0:33:44.200
<v Speaker 1>feed data into a system, and let's say from that

0:33:44.360 --> 0:33:47.600
<v Speaker 1>starting position, we have a process that's going to transform

0:33:47.720 --> 0:33:52.080
<v Speaker 1>that input into one of two possible ways. So we've

0:33:52.120 --> 0:33:57.240
<v Speaker 1>got two potential outputs for that first step. Like our

0:33:57.320 --> 0:34:00.560
<v Speaker 1>minimax example, we can go down several layers of

0:34:00.640 --> 0:34:04.800
<v Speaker 1>possible choices, and we can weight the relationships between these

0:34:04.800 --> 0:34:08.600
<v Speaker 1>different choices. So if the incoming value is higher than

0:34:08.760 --> 0:34:12.760
<v Speaker 1>a certain amount, maybe the node sends it down one pathway,

0:34:12.800 --> 0:34:15.880
<v Speaker 1>But if the value is lower than that arbitrary amount,

0:34:16.200 --> 0:34:19.399
<v Speaker 1>the node will send it down a different pathway. This

0:34:19.520 --> 0:34:23.480
<v Speaker 1>is drastically oversimplifying, but I hope you kind of get

0:34:23.520 --> 0:34:26.960
<v Speaker 1>the idea. It's like a big sorting system, and the

0:34:27.000 --> 0:34:30.479
<v Speaker 1>goal is that at the very end whatever comes out

0:34:30.600 --> 0:34:35.640
<v Speaker 1>as output is correct or true. Ideally, you've got a

0:34:35.680 --> 0:34:40.840
<v Speaker 1>system that is self improving. It trains itself to be better.

0:34:41.320 --> 0:34:44.560
<v Speaker 1>But how the heck does that happen? Well, let's consider

0:34:44.920 --> 0:34:50.000
<v Speaker 1>cats for a bit, not the musical, and good heavens,

0:34:50.120 --> 0:34:56.000
<v Speaker 1>definitely not the movie musical. That is a subject that

0:34:56.239 --> 0:34:59.000
<v Speaker 1>deserves its own episode. Maybe one day I'll figure out

0:34:59.280 --> 0:35:01.000
<v Speaker 1>a way to tackle that film with some

0:35:01.040 --> 0:35:04.080
<v Speaker 1>sort of tech capacity, But honestly, I'm just not ready

0:35:04.120 --> 0:35:07.480
<v Speaker 1>to do that yet, from, like, an emotional standpoint as

0:35:07.520 --> 0:35:11.760
<v Speaker 1>well as a research one. No, let's say you're teaching

0:35:11.800 --> 0:35:16.480
<v Speaker 1>a computer system to recognize cats, pictures of cats, and

0:35:16.480 --> 0:35:20.240
<v Speaker 1>the system has an artificial neural network that accepts input

0:35:20.600 --> 0:35:23.920
<v Speaker 1>pictures of cats and then filters that input through the

0:35:23.960 --> 0:35:27.920
<v Speaker 1>network to make the determination: does this picture include a

0:35:28.000 --> 0:35:31.320
<v Speaker 1>cat in it? And you start feeding it lots of images.

0:35:31.719 --> 0:35:34.279
<v Speaker 1>The neural network acts on the data according to the

0:35:34.400 --> 0:35:39.640
<v Speaker 1>weighted relationship between the artificial neurons, and it produces an output.

0:35:40.440 --> 0:35:43.759
<v Speaker 1>Now here's the thing. We already know what we want

0:35:43.800 --> 0:35:46.880
<v Speaker 1>the output to be, because we can recognize if a

0:35:46.920 --> 0:35:50.040
<v Speaker 1>picture has a cat in it or not. Maybe we've got

0:35:50.200 --> 0:35:53.560
<v Speaker 1>one thousand pictures. This is the training data we're going

0:35:53.600 --> 0:35:57.040
<v Speaker 1>to use for this machine learning process. We also know

0:35:57.120 --> 0:35:59.759
<v Speaker 1>that eight hundred of those pictures have a cat in

0:35:59.800 --> 0:36:03.399
<v Speaker 1>them and two hundred don't, so we know what we want

0:36:03.400 --> 0:36:06.400
<v Speaker 1>the results to be. We've got an artificial neural network

0:36:06.600 --> 0:36:10.000
<v Speaker 1>in which some neurons or nodes will accept input and

0:36:10.040 --> 0:36:12.680
<v Speaker 1>perform a function based on that input, and then the

0:36:12.719 --> 0:36:16.759
<v Speaker 1>weighted connections that neuron has to other neurons will determine

0:36:16.880 --> 0:36:19.719
<v Speaker 1>where it passes the information down until we get to

0:36:19.760 --> 0:36:23.040
<v Speaker 1>an output. And this happens until we get that conclusion.

0:36:23.680 --> 0:36:27.319
<v Speaker 1>So what happens if the computer's answer is wrong? What

0:36:27.520 --> 0:36:30.400
<v Speaker 1>if we feed those one thousand photos to it and

0:36:30.480 --> 0:36:33.719
<v Speaker 1>says only three hundred of them have cats in them?

0:36:33.719 --> 0:36:37.719
<v Speaker 1>Well, we have to go back and adjust those weighted connections,

0:36:37.719 --> 0:36:42.080
<v Speaker 1>because clearly something didn't go right. The connections within the

0:36:42.120 --> 0:36:47.080
<v Speaker 1>network need to be readjusted. We would likely start closest

0:36:47.120 --> 0:36:51.120
<v Speaker 1>to our output and see which neurons seem to contribute

0:36:51.120 --> 0:36:55.239
<v Speaker 1>to the mistake, which neurons were responsible, in other words,

0:36:55.280 --> 0:36:58.080
<v Speaker 1>for it to say, oh, only three hundred of these pictures had

0:36:58.440 --> 0:37:01.920
<v Speaker 1>cats in them, and then we would adjust the weights,

0:37:01.960 --> 0:37:06.120
<v Speaker 1>the incoming weights of connections to those neurons in order

0:37:06.160 --> 0:37:10.160
<v Speaker 1>to try and favor pathways that lead to correct answers.

0:37:10.680 --> 0:37:13.640
<v Speaker 1>Then we feed it the one thousand pictures again and

0:37:13.719 --> 0:37:16.720
<v Speaker 1>we look at those results. Then we do this again

0:37:16.920 --> 0:37:20.239
<v Speaker 1>and again and again, every time, tweaking the network a

0:37:20.280 --> 0:37:24.520
<v Speaker 1>little bit so that it gets a bit better. Eventually,

0:37:24.760 --> 0:37:28.239
<v Speaker 1>when we have trained the system, we can start to

0:37:28.400 --> 0:37:32.960
<v Speaker 1>feed brand new data to the network, not the stuff

0:37:33.000 --> 0:37:36.920
<v Speaker 1>we've trained it on, but pictures that we and the

0:37:36.960 --> 0:37:40.440
<v Speaker 1>system have never seen before. And if our network is

0:37:40.440 --> 0:37:42.719
<v Speaker 1>a good one, if we have trained it well, it

0:37:42.760 --> 0:37:46.520
<v Speaker 1>will sort through these new photos and it will count

0:37:46.560 --> 0:37:49.560
<v Speaker 1>up the ones that have the cat pictures lickety split.
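
NOTE
The grade-and-adjust loop described above can be sketched with a single
artificial neuron in Python. This is a deliberately simplified stand-in, not
the multi-layer network from the episode: it uses the classic perceptron
learning rule, and the names predict and train, the two features, and the
training examples are all made up for illustration.
```python
def predict(weights, bias, features):
    """Fire (return 1) when the weighted sum of the inputs is above zero."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0
def train(examples, epochs=20, learning_rate=0.1):
    """Repeatedly grade the neuron on labeled examples and nudge its weights."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            # error is 0 when the answer was right, +1 or -1 when it was wrong.
            error = label - predict(weights, bias, features)
            weights = [w + learning_rate * error * x for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias
# Made-up training data: (features, label), where label 1 means "cat".
# The two features are hypothetical stand-ins for measurements of a photo.
examples = [([1.0, 1.0], 1), ([0.9, 0.8], 1), ([0.1, 0.2], 0), ([0.0, 0.1], 0)]
weights, bias = train(examples)
```
After training, predict(weights, bias, features) can be called on feature
pairs that were not in the training list, which mirrors feeding the trained
network pictures it has never seen before.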

0:37:50.040 --> 0:37:54.080
<v Speaker 1>This approach is called supervised learning because it involves kind

0:37:54.120 --> 0:37:58.120
<v Speaker 1>of grading the network on its homework and then working

0:37:58.160 --> 0:38:02.000
<v Speaker 1>with it to get better. Heck, with the right algorithm,

0:38:02.000 --> 0:38:05.759
<v Speaker 1>a neural network can learn to recognize and differentiate patterns

0:38:06.200 --> 0:38:09.759
<v Speaker 1>even if we never explicitly told the system what it

0:38:09.840 --> 0:38:13.960
<v Speaker 1>was looking for. Google discovered this several years ago when

0:38:14.000 --> 0:38:18.280
<v Speaker 1>it fed several thousand YouTube videos to an enormous artificial

0:38:18.320 --> 0:38:22.600
<v Speaker 1>neural network. The system analyzed the videos that were fed

0:38:22.640 --> 0:38:26.800
<v Speaker 1>to it and gradually recognized patterns that represented different types

0:38:26.800 --> 0:38:32.399
<v Speaker 1>of stuff, like people or like cats, because there are

0:38:32.440 --> 0:38:35.760
<v Speaker 1>a lot of cat videos on YouTube, and the network

0:38:36.120 --> 0:38:38.360
<v Speaker 1>got to the point where it could identify an image

0:38:38.360 --> 0:38:42.239
<v Speaker 1>of a cat fairly reliably, better than seventy percent of the time,

0:38:42.680 --> 0:38:46.480
<v Speaker 1>even though it was never told how to do that,

0:38:47.200 --> 0:38:51.080
<v Speaker 1>or it was never even told what a cat was. So,

0:38:51.120 --> 0:38:54.360
<v Speaker 1>as Google representatives put it, they said, it had to

0:38:54.520 --> 0:38:57.960
<v Speaker 1>invent the concept of a cat. It had to recognize

0:38:58.480 --> 0:39:02.960
<v Speaker 1>that cats are not the same as people, which I

0:39:03.000 --> 0:39:07.360
<v Speaker 1>think is a big slap in the face to some cats. Really,

0:39:08.000 --> 0:39:11.800
<v Speaker 1>what it said was that I recognized this particular pattern

0:39:11.840 --> 0:39:16.319
<v Speaker 1>of features, and I recognized that these other instances of

0:39:16.400 --> 0:39:20.080
<v Speaker 1>creatures that have a similar pattern seemed to match that,

0:39:20.320 --> 0:39:24.160
<v Speaker 1>and so I draw the conclusion that this instance of

0:39:24.200 --> 0:39:28.360
<v Speaker 1>a thing belongs with all these other instances of things

0:39:28.440 --> 0:39:32.880
<v Speaker 1>that are similar in characteristics. So this was more of

0:39:32.920 --> 0:39:36.719
<v Speaker 1>an example of unsupervised learning, in that the system, when

0:39:36.719 --> 0:39:39.879
<v Speaker 1>fed enough data, began to categorize stuff all on its

0:39:39.880 --> 0:39:43.920
<v Speaker 1>own through its own parameters. Now, one neat way that

0:39:43.960 --> 0:39:47.120
<v Speaker 1>computer scientists will train up systems for certain types of

0:39:47.160 --> 0:39:53.640
<v Speaker 1>applications is through a generative adversarial network, which I admit

0:39:53.760 --> 0:39:56.440
<v Speaker 1>sounds kind of sinister, doesn't it, And I mean it

0:39:56.520 --> 0:39:59.879
<v Speaker 1>can be, but it doesn't have to be. Essentially, you're

0:40:00.120 --> 0:40:04.320
<v Speaker 1>using two different artificial neural networks. One of the networks

0:40:04.320 --> 0:40:08.240
<v Speaker 1>has a specific job. It's to fool the other network.

0:40:08.520 --> 0:40:11.480
<v Speaker 1>So the other network's job is to detect attempts to

0:40:11.560 --> 0:40:16.240
<v Speaker 1>fool it versus legitimate data. So let's use an example.

0:40:16.440 --> 0:40:18.399
<v Speaker 1>Let's say you're trying to create a system that can

0:40:18.440 --> 0:40:25.400
<v Speaker 1>make realistic but entirely computer generated, that is, fabricated photographs

0:40:25.440 --> 0:40:28.680
<v Speaker 1>of people. So, in other words, these are computer generated

0:40:28.719 --> 0:40:32.040
<v Speaker 1>images that don't actually represent a real person at all.

0:40:32.680 --> 0:40:36.359
<v Speaker 1>We've got one artificial neural network, the generator, and its

0:40:36.440 --> 0:40:41.160
<v Speaker 1>job is to create images of people that can pass

0:40:41.360 --> 0:40:44.640
<v Speaker 1>as real photographs. Then we've got our other network, which

0:40:44.680 --> 0:40:48.360
<v Speaker 1>is the discriminator. This is trying to sort out real

0:40:48.400 --> 0:40:52.960
<v Speaker 1>photos of actual people from pictures that have been generated

0:40:52.960 --> 0:40:57.640
<v Speaker 1>by the generative system. And we pit these two networks

0:40:57.680 --> 0:41:01.880
<v Speaker 1>against each other. The idea here is that both systems

0:41:02.000 --> 0:41:05.759
<v Speaker 1>get better as they test one another out. If the

0:41:05.840 --> 0:41:10.440
<v Speaker 1>generator network is falling behind because the discriminator can suss

0:41:10.480 --> 0:41:13.040
<v Speaker 1>out the fakes too easily, well, then it's time to

0:41:13.040 --> 0:41:17.240
<v Speaker 1>tweak some weights in that neural network that are leading

0:41:17.280 --> 0:41:22.560
<v Speaker 1>to dissatisfactory computer generated images and try it again. But then,

0:41:22.600 --> 0:41:27.799
<v Speaker 1>if the discriminator is starting to miss fakes, well, it's

0:41:27.800 --> 0:41:31.480
<v Speaker 1>time to tweak the discriminator network. So it's better at

0:41:31.600 --> 0:41:36.080
<v Speaker 1>spotting the false pictures. Now along the way, some pretty

0:41:36.080 --> 0:41:40.760
<v Speaker 1>extraordinary stuff can happen. There are photos of computer generated faces,

0:41:41.120 --> 0:41:45.400
<v Speaker 1>not altered pictures, not ones created by a human artist,

0:41:45.760 --> 0:41:50.120
<v Speaker 1>but entirely composed via a computer, and they can look

0:41:50.520 --> 0:41:56.000
<v Speaker 1>absolutely realistic, complete with consistent lighting and shadows. This is

0:41:56.080 --> 0:42:00.759
<v Speaker 1>only after lots of training sessions. The networks learn what

0:42:00.840 --> 0:42:04.920
<v Speaker 1>the giveaways are, like, what is it that leads the

0:42:04.920 --> 0:42:08.040
<v Speaker 1>discriminator to say, no, this is a fake photo, and

0:42:08.080 --> 0:42:10.600
<v Speaker 1>how can you fix that? It reminds me a bit

0:42:10.640 --> 0:42:14.080
<v Speaker 1>of how photo experts used to point out really bad

0:42:14.160 --> 0:42:18.560
<v Speaker 1>Photoshop jobs and explain how certain elements like shadows or

0:42:18.680 --> 0:42:22.120
<v Speaker 1>edges or whatever were a dead giveaway that someone had

0:42:22.160 --> 0:42:26.280
<v Speaker 1>altered an image. Well, similar rules exist for generated images,

0:42:26.640 --> 0:42:30.480
<v Speaker 1>and through training, the generator gets better at making really

0:42:30.560 --> 0:42:34.600
<v Speaker 1>convincing examples that don't fall into the traps that would

0:42:34.600 --> 0:42:39.239
<v Speaker 1>reveal it as a fake. Over time, generative networks can

0:42:39.280 --> 0:42:42.279
<v Speaker 1>get good enough to produce stuff that would be very

0:42:42.320 --> 0:42:44.600
<v Speaker 1>difficult for a human to tell apart from the quote

0:42:44.640 --> 0:42:48.400
<v Speaker 1>unquote real thing, and discriminators can get good enough to

0:42:48.440 --> 0:42:52.680
<v Speaker 1>detect fakes that would otherwise pass human inspection. So an

0:42:52.719 --> 0:42:57.240
<v Speaker 1>example of this is the current ongoing battle with deepfakes.

0:42:57.280 --> 0:43:00.960
<v Speaker 1>These are computer generated videos that appear to be legit.

0:43:01.360 --> 0:43:04.800
<v Speaker 1>If they're done well enough, they can have famous people

0:43:04.880 --> 0:43:07.160
<v Speaker 1>in them. Doesn't have to be a famous person, but

0:43:07.239 --> 0:43:09.680
<v Speaker 1>it can show a video of someone doing something that

0:43:09.719 --> 0:43:13.799
<v Speaker 1>they absolutely never did, but according to the video, they did,

0:43:14.360 --> 0:43:16.840
<v Speaker 1>and it can be really convincing if it's done well.

0:43:17.320 --> 0:43:21.680
<v Speaker 1>A good deepfake can fool people if you aren't

0:43:21.719 --> 0:43:23.879
<v Speaker 1>paying too much attention. Some of the really good ones

0:43:23.920 --> 0:43:29.000
<v Speaker 1>can pass pretty deep scrutiny. So this requires researchers to

0:43:29.000 --> 0:43:32.520
<v Speaker 1>come up with solutions that are pretty subtle and beyond

0:43:32.520 --> 0:43:35.640
<v Speaker 1>the average person's ability to replicate, like looking at the

0:43:35.719 --> 0:43:39.720
<v Speaker 1>reflections in the person's eyes and whether or not they

0:43:39.760 --> 0:43:43.600
<v Speaker 1>seem realistic or computer generated. But that really just

0:43:43.680 --> 0:43:47.800
<v Speaker 1>represents another hurdle for the generative side. So in other words,

0:43:48.680 --> 0:43:53.799
<v Speaker 1>this is a seesaw approach, right? It's creating fakes on

0:43:53.800 --> 0:43:57.160
<v Speaker 1>one side and detecting them on the other side. It's

0:43:57.200 --> 0:44:00.000
<v Speaker 1>something we see in artificial intelligence in general. A similar

0:44:00.000 --> 0:44:03.520
<v Speaker 1>story played out with the old CAPTCHA systems, where

0:44:04.040 --> 0:44:06.440
<v Speaker 1>you know, we saw back and forth between methods to

0:44:06.520 --> 0:44:10.799
<v Speaker 1>try and weed out bots by using CAPTCHA images that

0:44:10.840 --> 0:44:15.000
<v Speaker 1>only humans could really parse, and then we saw improved

0:44:15.040 --> 0:44:19.040
<v Speaker 1>bots that could analyze these images and return correct results,

0:44:19.520 --> 0:44:22.840
<v Speaker 1>which meant it was necessary to create more difficult CAPTCHAs.

0:44:22.960 --> 0:44:25.600
<v Speaker 1>Eventually you get to a point where the CAPTCHAs are difficult

0:44:25.719 --> 0:44:28.239
<v Speaker 1>enough where the average person can't even pass them, and

0:44:28.239 --> 0:44:30.799
<v Speaker 1>then you have to go to a different method. We

0:44:30.880 --> 0:44:33.720
<v Speaker 1>also see this play out in the cyber security realm,

0:44:33.760 --> 0:44:36.960
<v Speaker 1>where you might say the thieves get better at lock picking,

0:44:37.360 --> 0:44:40.800
<v Speaker 1>and then security experts make better locks, and the cycle

0:44:40.880 --> 0:44:46.080
<v Speaker 1>just repeats endlessly. One thing that has really fueled machine

0:44:46.160 --> 0:44:50.040
<v Speaker 1>learning recently is the era of big data. Being able

0:44:50.080 --> 0:44:54.680
<v Speaker 1>to harvest information on a truly massive scale provides the

0:44:54.680 --> 0:44:59.560
<v Speaker 1>opportunity to feed that data into various machine learning systems

0:45:00.200 --> 0:45:04.680
<v Speaker 1>to search for meaning within that data. These systems might

0:45:04.840 --> 0:45:08.560
<v Speaker 1>scour the information to look for stuff like criminal activity

0:45:08.920 --> 0:45:13.120
<v Speaker 1>like financial crimes or the attempt to move some money

0:45:13.160 --> 0:45:17.120
<v Speaker 1>around from various criminal exploits. Or it could be used

0:45:17.160 --> 0:45:20.640
<v Speaker 1>to look for trends like market trends, or it might

0:45:20.640 --> 0:45:24.879
<v Speaker 1>be used to plot possible spikes in COVID-19 transmission

0:45:25.280 --> 0:45:28.440
<v Speaker 1>where those might occur where people should really be focusing

0:45:28.480 --> 0:45:31.759
<v Speaker 1>their attention. But now we got to think back on

0:45:31.840 --> 0:45:35.080
<v Speaker 1>what I said earlier about looking up at the sky

0:45:35.200 --> 0:45:39.600
<v Speaker 1>and seeing shapes in the clouds. There's a risk that

0:45:39.680 --> 0:45:42.319
<v Speaker 1>comes along with machine learning. Actually, technically there are a

0:45:42.320 --> 0:45:45.120
<v Speaker 1>lot of risks, but this one is a biggie. It

0:45:45.239 --> 0:45:49.680
<v Speaker 1>is possible for machines like humans, to detect a pattern

0:45:49.840 --> 0:45:54.480
<v Speaker 1>where there really isn't a pattern. Systems might interpret noise

0:45:54.760 --> 0:45:57.279
<v Speaker 1>to be signal, and depending on what you're using the

0:45:57.320 --> 0:46:01.240
<v Speaker 1>system to do, that could lead you to some seriously dangerous,

0:46:01.360 --> 0:46:05.799
<v Speaker 1>incorrect conclusions. In some cases, it could just be inconvenient,

0:46:05.840 --> 0:46:09.000
<v Speaker 1>but depending on what you're working toward, it could be catastrophic.

0:46:09.120 --> 0:46:12.000
<v Speaker 1>And so computer scientists know they have to do a

0:46:12.000 --> 0:46:15.600
<v Speaker 1>lot of analysis to make sure that patterns that are

0:46:15.640 --> 0:46:21.440
<v Speaker 1>identified through machine learning processes are actually real before acting

0:46:21.640 --> 0:46:28.320
<v Speaker 1>on that information. Likewise, bias is something that we humans have. Well,

0:46:28.440 --> 0:46:31.719
<v Speaker 1>it's also something that machine learning systems have too. Now,

0:46:31.800 --> 0:46:35.319
<v Speaker 1>sometimes bias is intentional. It can take the form of

0:46:35.360 --> 0:46:42.000
<v Speaker 1>those weighted relationships between artificial neurons. Other times, a system's architects,

0:46:42.080 --> 0:46:44.080
<v Speaker 1>you know, the people who put it together, they might

0:46:44.200 --> 0:46:48.879
<v Speaker 1>have introduced bias, not through conscious effort, but merely through

0:46:49.400 --> 0:46:52.480
<v Speaker 1>the approach they took and that approach might have been

0:46:52.560 --> 0:46:56.120
<v Speaker 1>too narrow. We've seen this pop up a lot again

0:46:56.160 --> 0:46:59.840
<v Speaker 1>with facial recognition technologies, many of which have a sliding

0:47:00.200 --> 0:47:04.560
<v Speaker 1>scale of efficacy. They might be more reliable with certain

0:47:04.600 --> 0:47:09.000
<v Speaker 1>ethnicities, like white people, over others. That points to a

0:47:09.120 --> 0:47:12.920
<v Speaker 1>likely problem with the way those systems were trained. This

0:47:13.040 --> 0:47:15.600
<v Speaker 1>is one of the reasons why many companies have made

0:47:15.640 --> 0:47:19.760
<v Speaker 1>a choice to stop supplying certain parties like police forces

0:47:19.800 --> 0:47:24.360
<v Speaker 1>and military branches with facial recognition systems. The systems aren't

0:47:24.400 --> 0:47:28.600
<v Speaker 1>reliable for all demographic groups and thus could cause disproportionate

0:47:28.680 --> 0:47:32.360
<v Speaker 1>harm to certain populations. It would be a technological approach

0:47:32.400 --> 0:47:36.040
<v Speaker 1>to systemic racism, and this stuff is already out there

0:47:36.080 --> 0:47:38.959
<v Speaker 1>in the wild. You might think a computer system can't

0:47:38.960 --> 0:47:43.640
<v Speaker 1>be biased or prejudiced or racist, and sure, we're still

0:47:43.800 --> 0:47:46.120
<v Speaker 1>not at the point where these systems are thinking in

0:47:46.160 --> 0:47:49.239
<v Speaker 1>the way that humans do, but the outcome is still

0:47:49.360 --> 0:47:53.920
<v Speaker 1>disproportionately harmful to some groups. That's not to say that

0:47:53.960 --> 0:47:58.040
<v Speaker 1>machine learning itself is bad. It's not bad. It's a tool,

0:47:58.360 --> 0:48:02.520
<v Speaker 1>just as all technology is a tool. Used properly, with

0:48:02.640 --> 0:48:05.960
<v Speaker 1>a careful hand to make sure that bias is understood and,

0:48:06.040 --> 0:48:10.600
<v Speaker 1>where needed, mitigated, and where work can be double or

0:48:10.640 --> 0:48:14.840
<v Speaker 1>triple checked before being acted upon, it is a remarkably useful tool,

0:48:15.040 --> 0:48:18.759
<v Speaker 1>one that will power and design and improve elements in

0:48:18.800 --> 0:48:23.040
<v Speaker 1>our lives if it's under the correct stewardship. But it

0:48:23.160 --> 0:48:26.560
<v Speaker 1>does require a bit more hands on work. We can't

0:48:27.120 --> 0:48:32.520
<v Speaker 1>just leave it to the machines just yet. Well, that

0:48:32.560 --> 0:48:35.960
<v Speaker 1>wraps up this look at the concept of machine learning

0:48:36.000 --> 0:48:39.720
<v Speaker 1>and some of the thought that underlies it. This really

0:48:39.840 --> 0:48:44.160
<v Speaker 1>is a very high level treatment of machine learning. There

0:48:44.200 --> 0:48:47.080
<v Speaker 1>are plenty of resources online if you want to dive

0:48:47.120 --> 0:48:50.040
<v Speaker 1>in and learn more. A lot of them get very

0:48:50.120 --> 0:48:52.760
<v Speaker 1>heavy into the math, so if that's not your bag,

0:48:53.560 --> 0:48:56.000
<v Speaker 1>it might be a little challenging to navigate. It certainly

0:48:56.080 --> 0:48:59.279
<v Speaker 1>is for me. I love learning about this stuff, but

0:49:00.160 --> 0:49:03.239
<v Speaker 1>a lot of it requires me to look up a term,

0:49:03.560 --> 0:49:06.359
<v Speaker 1>then look up a term that explains that term, and

0:49:06.400 --> 0:49:09.600
<v Speaker 1>so on, and I go down a rabbit hole. I

0:49:09.640 --> 0:49:13.000
<v Speaker 1>hope you enjoyed that classic episode. I guess not classic,

0:49:13.040 --> 0:49:15.759
<v Speaker 1>that rerun episode of tech stuff. You can't call it

0:49:15.800 --> 0:49:19.080
<v Speaker 1>a classic if it's just a year old, right, So anyway,

0:49:19.120 --> 0:49:22.440
<v Speaker 1>I will be back again tomorrow hopefully, and we will

0:49:22.480 --> 0:49:25.000
<v Speaker 1>have a new episode, y'all. If you want to get

0:49:25.040 --> 0:49:26.480
<v Speaker 1>in touch with me and let me know what you

0:49:26.480 --> 0:49:28.640
<v Speaker 1>would like me to cover in future episodes. There are

0:49:28.640 --> 0:49:30.320
<v Speaker 1>a couple of ways of doing that. You can drop

0:49:30.400 --> 0:49:32.359
<v Speaker 1>a note on Twitter. Several of you have been doing

0:49:32.360 --> 0:49:35.760
<v Speaker 1>that recently, and I've got a list of topics.

0:49:35.800 --> 0:49:39.120
<v Speaker 1>So thank you so much. That's fantastic. I really appreciate it.

0:49:39.520 --> 0:49:43.840
<v Speaker 1>Keep them coming. The handle for the podcast Twitter

0:49:43.920 --> 0:49:47.759
<v Speaker 1>feed is TechStuffHSW. If, however, you would

0:49:47.760 --> 0:49:50.040
<v Speaker 1>like to leave me a voice message, you can go

0:49:50.120 --> 0:49:52.520
<v Speaker 1>to the iHeartRadio app and go to the Tech

0:49:52.600 --> 0:49:55.759
<v Speaker 1>Stuff page. There's a little microphone icon. You click on

0:49:55.840 --> 0:49:58.600
<v Speaker 1>that, and you can leave a message of up to thirty

0:49:58.680 --> 0:50:01.480
<v Speaker 1>seconds, and if you'd like me to include that message

0:50:01.480 --> 0:50:04.120
<v Speaker 1>in an upcoming episode, just let me know in the message,

0:50:04.120 --> 0:50:06.719
<v Speaker 1>because I'm an opt-in kind of guy. That's it.

0:50:06.960 --> 0:50:09.120
<v Speaker 1>Hope you all are doing well and I'll talk to

0:50:09.120 --> 0:50:17.440
<v Speaker 1>you again really soon. Tech Stuff is an

0:50:17.560 --> 0:50:21.040
<v Speaker 1>iHeartRadio production. For more podcasts from iHeartRadio,

0:50:21.400 --> 0:50:24.560
<v Speaker 1>visit the iHeartRadio app, Apple Podcasts, or wherever

0:50:24.640 --> 0:50:26.160
<v Speaker 1>you listen to your favorite shows.