WEBVTT - Rerun: Machine Learning and Catastrophic Forgetting

0:00:02.920 --> 0:00:11.080
<v Speaker 1>Welcome to TechStuff, a production from iHeartRadio. Hey there,

0:00:11.119 --> 0:00:14.560
<v Speaker 1>and welcome to TechStuff. I'm your host, Jonathan Strickland.

0:00:14.640 --> 0:00:18.040
<v Speaker 1>I'm an executive producer with iHeart Podcasts. And how the

0:00:18.120 --> 0:00:21.960
<v Speaker 1>tech are you? Well, I just got back from celebrating

0:00:22.079 --> 0:00:25.000
<v Speaker 1>my birthday. Thanks to all of you who are

0:00:25.000 --> 0:00:28.040
<v Speaker 1>wishing me a happy birthday. And here in the United States,

0:00:28.120 --> 0:00:32.640
<v Speaker 1>we're about to have our national holiday celebrating the fourth

0:00:32.680 --> 0:00:36.320
<v Speaker 1>of July. I realize Fourth of July happens everywhere, not

0:00:36.520 --> 0:00:38.879
<v Speaker 1>just in the US, but we celebrate it here in

0:00:38.920 --> 0:00:42.680
<v Speaker 1>the US, and as such, there's very limited time to

0:00:42.680 --> 0:00:45.600
<v Speaker 1>get everything done, and I really wasn't able to pull

0:00:45.640 --> 0:00:48.320
<v Speaker 1>an episode together in time, and I apologize for that,

0:00:48.720 --> 0:00:52.239
<v Speaker 1>but I thought I would bring an older episode to

0:00:52.320 --> 0:00:55.480
<v Speaker 1>y'all so that we can still have an episode to

0:00:55.560 --> 0:00:59.800
<v Speaker 1>listen to today. And typically I would have one of my

0:01:00.080 --> 0:01:04.120
<v Speaker 1>fireworks episodes play on this day, because fireworks have a

0:01:04.240 --> 0:01:06.240
<v Speaker 1>very close association with the Fourth of July here in

0:01:06.240 --> 0:01:09.400
<v Speaker 1>the United States. But I've done that for several years

0:01:09.440 --> 0:01:11.560
<v Speaker 1>in a row, and I've thought it might be nice

0:01:11.600 --> 0:01:14.760
<v Speaker 1>to have a break from fireworks. Instead, I thought I

0:01:14.760 --> 0:01:17.240
<v Speaker 1>would focus on something that continues to be a very

0:01:17.280 --> 0:01:21.120
<v Speaker 1>important topic in tech, and that is artificial intelligence. And

0:01:21.600 --> 0:01:26.280
<v Speaker 1>AI is incredibly impressive, but there are also lots of

0:01:26.480 --> 0:01:31.400
<v Speaker 1>challenges with AI, and those are ranging from the technological

0:01:31.480 --> 0:01:35.800
<v Speaker 1>side to the social side, right, and how we implement AI.

0:01:36.240 --> 0:01:38.560
<v Speaker 1>One thing I thought that we don't really get to

0:01:38.600 --> 0:01:43.959
<v Speaker 1>talk about very much is the concept of forgetting with AI.

0:01:44.240 --> 0:01:46.440
<v Speaker 1>We have a lot of generative AI out there that

0:01:46.959 --> 0:01:51.120
<v Speaker 1>is drawing upon huge resources of information, but AI can

0:01:51.240 --> 0:01:56.440
<v Speaker 1>also quote unquote forget. So this episode originally published on

0:01:56.520 --> 0:01:59.520
<v Speaker 1>July thirty first of twenty twenty three. It is called

0:01:59.600 --> 0:02:03.640
<v Speaker 1>Machine Learning and Catastrophic Forgetting. And I think it's a

0:02:03.760 --> 0:02:07.080
<v Speaker 1>useful thing to reflect upon as we see more and

0:02:07.160 --> 0:02:14.720
<v Speaker 1>more headlines about tech companies and their investment, increasingly astronomical

0:02:14.840 --> 0:02:21.520
<v Speaker 1>investment, in artificial intelligence. I hope you enjoy. So, over

0:02:21.560 --> 0:02:24.920
<v Speaker 1>this past weekend, I was listening to the podcast The

0:02:24.919 --> 0:02:27.920
<v Speaker 1>Skeptics Guide to the Universe, which I have no connection to.

0:02:28.200 --> 0:02:31.480
<v Speaker 1>I just listened to it, and it included a section

0:02:31.720 --> 0:02:36.280
<v Speaker 1>on AI that referenced something I don't think I had

0:02:36.400 --> 0:02:39.560
<v Speaker 1>heard of before, which is really talking more about my

0:02:39.680 --> 0:02:43.440
<v Speaker 1>oversight than anything else. Maybe I did hear about it

0:02:43.600 --> 0:02:47.160
<v Speaker 1>but then I forgot about it, you know, catastrophically. So

0:02:47.560 --> 0:02:52.280
<v Speaker 1>the thing they talked about was catastrophic forgetting in artificial intelligence,

0:02:52.280 --> 0:02:57.200
<v Speaker 1>specifically in machine learning systems built on artificial neural networks. Now,

0:02:57.200 --> 0:03:01.760
<v Speaker 1>before we talk about catastrophic forgetting, which as I mentioned,

0:03:01.800 --> 0:03:04.960
<v Speaker 1>is related to neural networks and machine learning, we really

0:03:05.000 --> 0:03:07.360
<v Speaker 1>need to do a quick reminder, not a quick reminder.

0:03:07.360 --> 0:03:09.280
<v Speaker 1>We need to do a full reminder on how all

0:03:09.360 --> 0:03:12.040
<v Speaker 1>this works. And that's going to require us to do

0:03:12.240 --> 0:03:15.560
<v Speaker 1>a whole lot of remembering. Not a catastrophic amount, but

0:03:15.639 --> 0:03:19.280
<v Speaker 1>a lot. So the history of artificial intelligence as a

0:03:19.320 --> 0:03:25.120
<v Speaker 1>discipline is one of intense and important debates in fields

0:03:25.160 --> 0:03:28.040
<v Speaker 1>like computer science. Now, I have often talked about how

0:03:28.120 --> 0:03:31.480
<v Speaker 1>AI can be seen as the convergence of several other

0:03:31.600 --> 0:03:35.600
<v Speaker 1>disciplines into its own field. And there's more than one

0:03:35.600 --> 0:03:40.680
<v Speaker 1>way to approach the challenge of artificial intelligence. And in

0:03:40.760 --> 0:03:43.440
<v Speaker 1>the history of AI, we actually saw that play out,

0:03:44.080 --> 0:03:47.680
<v Speaker 1>and some would argue the way it played out means

0:03:47.720 --> 0:03:51.200
<v Speaker 1>that we're actually just now playing catch up. So different

0:03:51.240 --> 0:03:56.200
<v Speaker 1>schools of thought pushed these different approaches forward as this

0:03:56.400 --> 0:04:01.920
<v Speaker 1>should be the prevailing methodology we use to develop artificial intelligence.

0:04:02.360 --> 0:04:05.440
<v Speaker 1>This is important because the development of AI does not

0:04:05.560 --> 0:04:09.680
<v Speaker 1>exist in a vacuum, right? It exists in our real world.

0:04:10.320 --> 0:04:16.760
<v Speaker 1>Research requires funding, and when you've got different sides arguing

0:04:16.800 --> 0:04:21.160
<v Speaker 1>that their approach to artificial intelligence is superior and that

0:04:21.200 --> 0:04:25.400
<v Speaker 1>the alternatives are not just inferior, but potentially limited to

0:04:25.440 --> 0:04:28.360
<v Speaker 1>the point of being useless, well you've got a metaphorical

0:04:28.440 --> 0:04:31.760
<v Speaker 1>wrestling match going on. The winner takes home the big

0:04:31.800 --> 0:04:36.000
<v Speaker 1>prize of getting funding for their research, and the loser

0:04:36.120 --> 0:04:38.839
<v Speaker 1>has to scrabble for whatever they can find, and often

0:04:39.080 --> 0:04:42.840
<v Speaker 1>they will see their work languish as a result. By

0:04:42.880 --> 0:04:45.960
<v Speaker 1>the way, this is why I often bring stuff up

0:04:46.000 --> 0:04:49.520
<v Speaker 1>in this podcast that is outside the realm of tech.

0:04:50.480 --> 0:04:52.720
<v Speaker 1>I've received a lot of messages over the years from

0:04:52.720 --> 0:04:55.400
<v Speaker 1>folks saying that I should leave out stuff like money

0:04:55.880 --> 0:04:58.640
<v Speaker 1>or politics. Politics is the big one. But to me,

0:04:58.760 --> 0:05:04.720
<v Speaker 1>that doesn't make sense because tech exists within our world,

0:05:04.839 --> 0:05:08.640
<v Speaker 1>a world that is largely shaped by money and politics.

0:05:09.040 --> 0:05:12.000
<v Speaker 1>I don't think we can separate the tech from all

0:05:12.040 --> 0:05:14.440
<v Speaker 1>of that because I believe that if you were to

0:05:14.480 --> 0:05:18.839
<v Speaker 1>somehow magically remove those influences, if somehow money and politics

0:05:18.880 --> 0:05:22.600
<v Speaker 1>never played a part in the development of technology, our

0:05:22.640 --> 0:05:25.479
<v Speaker 1>tech would look very different from what it does today.

0:05:25.960 --> 0:05:29.960
<v Speaker 1>Not necessarily better or worse, but different. I mean, think

0:05:29.960 --> 0:05:36.040
<v Speaker 1>about Thomas Edison. He was very much driven by financial success,

0:05:36.120 --> 0:05:40.200
<v Speaker 1>like his work in tech was really mostly about making

0:05:40.320 --> 0:05:43.520
<v Speaker 1>lots of money. And without the making lots of money part,

0:05:43.920 --> 0:05:47.480
<v Speaker 1>you don't really have his drive to really bring together

0:05:47.560 --> 0:05:50.800
<v Speaker 1>the brightest minds of his generation and set them to

0:05:50.880 --> 0:05:55.080
<v Speaker 1>work on creating incredible technology. So I think we have

0:05:55.240 --> 0:05:58.440
<v Speaker 1>to take all these things into consideration. Anyway, that's a

0:05:58.480 --> 0:06:00.720
<v Speaker 1>total rabbit trail, and I apologize. Let's get back to

0:06:00.760 --> 0:06:05.200
<v Speaker 1>our story. It really begins around nineteen forty three when

0:06:05.200 --> 0:06:08.360
<v Speaker 1>a pair of researchers at the University of Chicago first

0:06:08.640 --> 0:06:13.080
<v Speaker 1>proposed the concept of the basic unit of a neural network.

0:06:13.400 --> 0:06:18.279
<v Speaker 1>Those researchers were Warren McCulloch and Walter Pitts. And in fact,

0:06:18.320 --> 0:06:22.839
<v Speaker 1>they demonstrated their idea by showing a simple electrical circuit,

0:06:23.040 --> 0:06:25.839
<v Speaker 1>the very basis for what would become a neural network.

0:06:26.320 --> 0:06:29.679
<v Speaker 1>So their proposal was a system that would use those

0:06:29.720 --> 0:06:33.880
<v Speaker 1>simple circuits to mimic the neurons that we have in

0:06:33.880 --> 0:06:37.720
<v Speaker 1>our noggins. So our brain consists of a bunch of

0:06:37.760 --> 0:06:40.719
<v Speaker 1>these neurons, and you might wonder how much is a bunch. Well,

0:06:41.600 --> 0:06:45.159
<v Speaker 1>we're talking about on average, around one hundred billion neurons

0:06:45.320 --> 0:06:48.920
<v Speaker 1>in the human brain. These neurons interconnect with each other.

0:06:49.040 --> 0:06:51.640
<v Speaker 1>It's not just a one to one, right, You've got

0:06:51.640 --> 0:06:55.839
<v Speaker 1>these interconnections between all these different neurons, not with every

0:06:55.839 --> 0:06:58.880
<v Speaker 1>neuron connected to every other neuron, but lots of interconnections.

0:06:58.880 --> 0:07:01.680
<v Speaker 1>And if we're looking at just the connections, you would

0:07:01.720 --> 0:07:04.839
<v Speaker 1>count more than one hundred trillion of them in the

0:07:04.880 --> 0:07:08.560
<v Speaker 1>typical human brain. And these connections in our brains make

0:07:08.640 --> 0:07:13.320
<v Speaker 1>up neural circuits. Those circuits light up, and that represents

0:07:13.400 --> 0:07:16.640
<v Speaker 1>us doing lots of different stuff, from experiencing the world

0:07:16.680 --> 0:07:20.840
<v Speaker 1>around us so perception to thinking about a past memory.

0:07:21.000 --> 0:07:24.000
<v Speaker 1>You know, that typically is like recreating the same pathway

0:07:24.080 --> 0:07:28.440
<v Speaker 1>over and over, and sometimes we don't recreate it exactly correctly,

0:07:28.920 --> 0:07:32.920
<v Speaker 1>and our memory ends up not being a perfect representation

0:07:33.080 --> 0:07:35.880
<v Speaker 1>of the thing that we actually experienced. This is why

0:07:36.120 --> 0:07:39.360
<v Speaker 1>things like eyewitness testimony are not always very reliable, because

0:07:39.400 --> 0:07:44.520
<v Speaker 1>our memories aren't infallible. They can trick us and we

0:07:44.560 --> 0:07:47.040
<v Speaker 1>can have all those pathways light up. When we learn

0:07:47.080 --> 0:07:50.200
<v Speaker 1>a new skill, we start forming new pathways, and then

0:07:50.360 --> 0:07:54.800
<v Speaker 1>as we practice this skill, we start to reinforce those pathways.

0:07:55.160 --> 0:07:58.800
<v Speaker 1>So McCulloch and Pitts proposed that we create machines capable

0:07:58.880 --> 0:08:03.320
<v Speaker 1>of doing essentially a similar thing that our brains do,

0:08:03.440 --> 0:08:08.680
<v Speaker 1>so kind of a neuromimicry, not exactly one to one

0:08:08.720 --> 0:08:12.600
<v Speaker 1>the way our brains work, but inspired by the way

0:08:12.760 --> 0:08:17.080
<v Speaker 1>our brains work. Now, we would be limited by what

0:08:17.360 --> 0:08:19.920
<v Speaker 1>the technology of the day would be able to do,

0:08:20.360 --> 0:08:23.640
<v Speaker 1>because there's no feasible way we could create a massive

0:08:24.160 --> 0:08:29.640
<v Speaker 1>electrical system with one hundred billion individual simple circuits with

0:08:29.760 --> 0:08:33.240
<v Speaker 1>more than one hundred trillion connections between them. That would

0:08:33.240 --> 0:08:37.199
<v Speaker 1>be beyond our capability. It would be beyond our resources.

0:08:37.559 --> 0:08:40.840
<v Speaker 1>We could, however, create systems that used interconnected circuits to

0:08:40.920 --> 0:08:45.480
<v Speaker 1>process information and to teach such a system to do

0:08:45.559 --> 0:08:50.920
<v Speaker 1>specific tasks. Now, in nineteen forty nine, Donald Hebb wrote

0:08:50.960 --> 0:08:55.080
<v Speaker 1>a book about biological neurons, and he titled this book

0:08:55.320 --> 0:08:59.960
<v Speaker 1>the Organization of Behavior and suggested neural pathways get stronger

0:09:00.520 --> 0:09:03.320
<v Speaker 1>with additional use, kind of like you know, if you

0:09:03.559 --> 0:09:06.520
<v Speaker 1>exercise your muscles, you build strength over time. Well, it

0:09:06.720 --> 0:09:10.640
<v Speaker 1>is the same with neural pathways, and if you don't

0:09:10.720 --> 0:09:13.240
<v Speaker 1>use those muscles, well, then your muscles get weaker. Well,

0:09:13.320 --> 0:09:16.760
<v Speaker 1>same with neural pathways. If you end up learning a skill,

0:09:17.480 --> 0:09:21.600
<v Speaker 1>but then over a great amount of time you no

0:09:21.640 --> 0:09:24.560
<v Speaker 1>longer practice that skill, you're going to lose some of

0:09:24.600 --> 0:09:27.400
<v Speaker 1>your ability, maybe not all of it, but at least

0:09:27.400 --> 0:09:29.240
<v Speaker 1>some of it. And you have to you know, like

0:09:29.559 --> 0:09:33.240
<v Speaker 1>I think about wrestlers who come back from retirement,

0:09:33.360 --> 0:09:36.520
<v Speaker 1>professional wrestlers, they call it ring rust. You got to

0:09:36.600 --> 0:09:39.120
<v Speaker 1>knock off the ring rust and get back into step

0:09:39.200 --> 0:09:41.320
<v Speaker 1>and kind of get back into your groove. And it

0:09:41.360 --> 0:09:45.880
<v Speaker 1>takes a little time, typically. Sometimes, you know, you can

0:09:46.000 --> 0:09:48.280
<v Speaker 1>get back into the game faster than others, but you

0:09:48.400 --> 0:09:53.040
<v Speaker 1>get the idea. And also Hebb ended up proposing the

0:09:53.080 --> 0:09:58.080
<v Speaker 1>concept of cells that fire together wire together, meaning that

0:09:58.800 --> 0:10:02.800
<v Speaker 1>neurons that fire at the same time end up strengthening

0:10:02.880 --> 0:10:08.160
<v Speaker 1>faster than other neurons do. So when you get into

0:10:08.240 --> 0:10:14.040
<v Speaker 1>that system, you can actually reinforce those pathways. And for

0:10:14.160 --> 0:10:17.120
<v Speaker 1>AI this would be really important. And it wasn't very

0:10:17.160 --> 0:10:20.599
<v Speaker 1>long after Donald Hebb had published this work that researchers

0:10:20.600 --> 0:10:23.679
<v Speaker 1>in the field of AI tried to apply that concept

0:10:23.760 --> 0:10:28.480
<v Speaker 1>that philosophy to computer science. By the mid nineteen fifties,

0:10:28.520 --> 0:10:32.040
<v Speaker 1>the burgeoning computer science lab and AI lab at MIT

0:10:32.880 --> 0:10:38.400
<v Speaker 1>was building out neural networks based on Hebb's ideas. Meanwhile,

0:10:38.840 --> 0:10:43.680
<v Speaker 1>another computer scientist named Frank Rosenblatt was looking at primitive

0:10:43.679 --> 0:10:48.079
<v Speaker 1>neural systems and he started with flies like house flies.

0:10:49.040 --> 0:10:52.160
<v Speaker 1>He wanted to explore systems that were involved when a

0:10:52.200 --> 0:10:56.560
<v Speaker 1>fly would quickly move away after detecting a possible threat,

0:10:57.000 --> 0:11:01.439
<v Speaker 1>like instantly, or at least appear to us to instantly

0:11:01.520 --> 0:11:05.480
<v Speaker 1>react to something. So, for example, a fly swatter coming

0:11:05.480 --> 0:11:07.640
<v Speaker 1>at it, like you might be moving the fly swatter

0:11:07.720 --> 0:11:09.840
<v Speaker 1>very quickly, and yet the fly is able to move

0:11:10.400 --> 0:11:15.640
<v Speaker 1>super fast with no perceivable delay. Right, we know that

0:11:15.679 --> 0:11:18.200
<v Speaker 1>we have a delay from when we perceive something to

0:11:18.240 --> 0:11:20.520
<v Speaker 1>when we can act on something. Like if you've ever

0:11:20.559 --> 0:11:23.000
<v Speaker 1>been in a fender bender in a car accident, you

0:11:23.040 --> 0:11:25.920
<v Speaker 1>know that there's a delay between when you see

0:11:25.920 --> 0:11:28.680
<v Speaker 1>the issue and when you can hit the brake, and that

0:11:28.920 --> 0:11:32.240
<v Speaker 1>can lead to accidents. Well, with flies, that delay seems

0:11:32.280 --> 0:11:36.600
<v Speaker 1>to be super super small. So Rosenblatt was really interested

0:11:36.960 --> 0:11:40.960
<v Speaker 1>in exploring the neurological reasons for that. How can that happen?

0:11:41.000 --> 0:11:43.520
<v Speaker 1>It has to be really simple, right, There has to

0:11:43.559 --> 0:11:48.199
<v Speaker 1>be a simple and more or less direct pathway that

0:11:48.360 --> 0:11:52.800
<v Speaker 1>exists to allow a fly to react to detecting a

0:11:52.800 --> 0:11:57.160
<v Speaker 1>potential threat like that, and if you could replicate that

0:11:57.920 --> 0:12:02.040
<v Speaker 1>with electronics, you could have a very simple but potentially

0:12:02.200 --> 0:12:07.240
<v Speaker 1>powerful artificial intelligence system. So he came up with this

0:12:07.440 --> 0:12:10.160
<v Speaker 1>system that would be based off that very simple direct

0:12:10.200 --> 0:12:12.240
<v Speaker 1>pathway that you would see in something like a fly,

0:12:12.760 --> 0:12:16.120
<v Speaker 1>and he called it the perceptron. So he went back

0:12:16.200 --> 0:12:18.680
<v Speaker 1>to the simple circuit design that was proposed by Pitts

0:12:18.679 --> 0:12:22.520
<v Speaker 1>and McCulloch, and he built out the Mark one perceptron

0:12:23.480 --> 0:12:25.920
<v Speaker 1>or perceptron, I guess I should say. So let's talk

0:12:25.920 --> 0:12:28.920
<v Speaker 1>about a perceptron, like not big P, but a little

0:12:29.040 --> 0:12:31.840
<v Speaker 1>P perceptron. This is probably what we would call a

0:12:31.920 --> 0:12:35.680
<v Speaker 1>neural node in a modern neural network. So the purpose

0:12:35.800 --> 0:12:40.000
<v Speaker 1>of the perceptron was to accept inputs and produce an

0:12:40.040 --> 0:12:44.679
<v Speaker 1>output based on some threshold. Like, if the inputs meet

0:12:44.720 --> 0:12:47.640
<v Speaker 1>a certain threshold, one output would be produced. If they

0:12:47.720 --> 0:12:49.880
<v Speaker 1>failed to do so, a different output would be produced.
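
NOTE
The threshold behavior just described can be sketched in a few lines of Python.
This is a minimal illustration, not anything from the original hardware: the
function name `threshold_unit` and the choice of binary (0 or 1) inputs are
assumptions made for the example.
```python
def threshold_unit(inputs, threshold):
    """Return 1 if enough inputs are active, otherwise 0."""
    # Each input is 0 (off) or 1 (on); the unit "fires" when the
    # number of active inputs reaches the threshold.
    return 1 if sum(inputs) >= threshold else 0
# With two inputs, a threshold of 2 behaves like AND and a threshold of 1 like OR.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", threshold_unit([a, b], 2), "OR:", threshold_unit([a, b], 1))
```
The same unit gives different behavior just by moving the threshold, which is
the "one output or a different output" idea in its simplest form.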

0:12:50.720 --> 0:12:54.880
<v Speaker 1>The inputs, in turn, would be assigned weights, which would

0:12:54.880 --> 0:12:58.240
<v Speaker 1>factor into the output the perceptron would generate. So when

0:12:58.240 --> 0:13:04.760
<v Speaker 1>we're talking weights, I mean weights as in like how

0:13:04.840 --> 0:13:08.079
<v Speaker 1>heavy something is or in this case, how much impact

0:13:08.520 --> 0:13:12.200
<v Speaker 1>that thing has, So we're talking about how much impact

0:13:12.280 --> 0:13:15.920
<v Speaker 1>one input has relative to other inputs. Let me use

0:13:15.960 --> 0:13:19.440
<v Speaker 1>a really mundane human example to kind of explain what

0:13:19.520 --> 0:13:22.640
<v Speaker 1>this means. Let's say that your friend asks you to

0:13:22.679 --> 0:13:24.760
<v Speaker 1>go see a movie with them, and it's going to

0:13:24.800 --> 0:13:27.760
<v Speaker 1>be playing tonight at nine pm. But you've had a

0:13:27.880 --> 0:13:30.880
<v Speaker 1>really busy day and you might not be able to

0:13:30.920 --> 0:13:34.320
<v Speaker 1>even eat dinner until around nine pm. And if you

0:13:34.360 --> 0:13:36.280
<v Speaker 1>go see this movie, it might mean having to skip

0:13:36.320 --> 0:13:40.000
<v Speaker 1>dinner or to try and eat something really fast and

0:13:40.120 --> 0:13:43.599
<v Speaker 1>unhealthy before you go to the movie. What's more, you

0:13:43.679 --> 0:13:46.680
<v Speaker 1>got a really big day tomorrow and you feel like

0:13:46.720 --> 0:13:49.480
<v Speaker 1>you really need to be well rested for it. However,

0:13:49.600 --> 0:13:53.320
<v Speaker 1>at the same time, you haven't seen this friend in ages,

0:13:53.360 --> 0:13:55.800
<v Speaker 1>and you really like this person and you've wanted to

0:13:55.800 --> 0:13:59.000
<v Speaker 1>hang with them for a really long time. Plus the

0:13:59.040 --> 0:14:01.600
<v Speaker 1>movie they're suggesting is one you've really wanted to see

0:14:01.600 --> 0:14:04.840
<v Speaker 1>and you haven't gone yet. Well, you would likely assign

0:14:04.960 --> 0:14:09.360
<v Speaker 1>at least unconsciously weights to each of these factors before

0:14:09.360 --> 0:14:11.439
<v Speaker 1>you make your decision. You know, if getting some dinner

0:14:11.480 --> 0:14:14.440
<v Speaker 1>without having to rush, and also to be really well

0:14:14.480 --> 0:14:17.720
<v Speaker 1>rested for tomorrow are really important to you, you'll probably

0:14:18.000 --> 0:14:21.880
<v Speaker 1>reluctantly decline the offer. But if you really crave some

0:14:21.960 --> 0:14:24.000
<v Speaker 1>time with your friend and you really want to see

0:14:24.000 --> 0:14:26.360
<v Speaker 1>that movie before all the spoilers come out on Facebook

0:14:26.400 --> 0:14:30.440
<v Speaker 1>or whatever, maybe you'll say yes. Your decision depends upon

0:14:30.480 --> 0:14:34.520
<v Speaker 1>the weights you assign those factors, those inputs, even if

0:14:34.520 --> 0:14:38.000
<v Speaker 1>you don't consciously think about it that way. Well, the

0:14:38.040 --> 0:14:41.920
<v Speaker 1>perceptron system worked in a similar way: it produced outputs by

0:14:41.920 --> 0:14:46.800
<v Speaker 1>taking the inputs into consideration, including each input's weight. Moreover,

0:14:47.080 --> 0:14:49.560
<v Speaker 1>the more you submitted inputs, the more the system would

0:14:49.640 --> 0:14:53.280
<v Speaker 1>quote unquote learn how to weight each of those inputs,

0:14:53.560 --> 0:14:56.600
<v Speaker 1>all with the goal of bringing the actual output that

0:14:56.640 --> 0:15:00.360
<v Speaker 1>the processor, you know, generates closer to the one

0:15:00.560 --> 0:15:04.920
<v Speaker 1>you want it to generate. Okay, I just said a

0:15:04.920 --> 0:15:07.280
<v Speaker 1>lot there. We've got some more to get through. But

0:15:07.320 --> 0:15:09.320
<v Speaker 1>before we get to that, let's take a quick break.

0:15:18.520 --> 0:15:20.920
<v Speaker 1>All right. Before the break, we were talking about inputs

0:15:21.080 --> 0:15:25.240
<v Speaker 1>and weights and the idea of getting an output that

0:15:25.520 --> 0:15:28.240
<v Speaker 1>is close to what you want the system to do.
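
NOTE
The idea of weights being adjusted until the actual output matches the wanted
output can be sketched as follows. The update used here is the textbook
perceptron learning rule, which the episode does not spell out, so treat it as
one plausible illustration. The names `predict` and `train`, the learning rate,
the epoch count, and the toy task (logical OR) are all assumptions.
```python
def predict(weights, bias, inputs):
    # Weighted sum of the inputs, compared against a threshold of zero.
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0
def train(samples, epochs=20, rate=0.1):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # Gap between the wanted output and the actual output.
            error = target - predict(weights, bias, inputs)
            # Nudge each weight in the direction that shrinks the gap.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias
# Toy task (an assumption for illustration): learn logical OR.
samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train(samples)
print([predict(weights, bias, x) for x, _ in samples])
```
Nothing is adjusted when the output is already right (the error is zero), so
the weights only move on mistakes.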

0:15:28.960 --> 0:15:31.720
<v Speaker 1>That's not a guarantee, right, The system could generate an

0:15:31.720 --> 0:15:35.800
<v Speaker 1>output that's quote unquote wrong, you know, depending on whatever

0:15:35.880 --> 0:15:41.080
<v Speaker 1>task you've set this machine learning system to learn, and

0:15:41.160 --> 0:15:43.280
<v Speaker 1>that gets a bit conceptual. So let's talk about a

0:15:43.320 --> 0:15:45.840
<v Speaker 1>simple example that I love to use. If you've been

0:15:45.840 --> 0:15:48.400
<v Speaker 1>listening to TechStuff for a while, you've heard this before,

0:15:49.400 --> 0:15:53.000
<v Speaker 1>and that's talking about pictures of cats. Because cats ruled

0:15:53.160 --> 0:15:55.440
<v Speaker 1>the Internet. I don't know if they still do. They

0:15:55.480 --> 0:15:58.960
<v Speaker 1>won't talk to me, they just knock things off shelves. Anyway.

0:15:58.960 --> 0:16:01.320
<v Speaker 1>If your goal is to teach a computer system

0:16:01.720 --> 0:16:06.360
<v Speaker 1>to differentiate photos that include a cat from photos that

0:16:06.440 --> 0:16:10.000
<v Speaker 1>do not include a cat, well, you would need to

0:16:10.040 --> 0:16:13.400
<v Speaker 1>train the system, and part of that includes feeding the

0:16:13.480 --> 0:16:18.200
<v Speaker 1>system a whole bunch of photographs. Some of those would

0:16:18.240 --> 0:16:21.960
<v Speaker 1>have cats in them, some would not, and chances are

0:16:22.040 --> 0:16:25.840
<v Speaker 1>the system would misidentify photos. Maybe a significant number of

0:16:25.840 --> 0:16:28.680
<v Speaker 1>those photos. You would probably have false positives where the

0:16:28.720 --> 0:16:31.560
<v Speaker 1>system thinks there's a cat there and there's not, and

0:16:31.600 --> 0:16:34.280
<v Speaker 1>false negatives where it doesn't think there's a cat there

0:16:34.560 --> 0:16:37.680
<v Speaker 1>but there is. At that point, your goal is to

0:16:37.680 --> 0:16:41.120
<v Speaker 1>try and teach the system to close the gap between

0:16:41.360 --> 0:16:44.800
<v Speaker 1>the actual results it produces and what you want it

0:16:44.920 --> 0:16:47.760
<v Speaker 1>to produce. In some systems, that means you might have

0:16:47.840 --> 0:16:51.320
<v Speaker 1>to go in manually to adjust the input weights to

0:16:51.440 --> 0:16:53.880
<v Speaker 1>increase the weight of one input versus another in an

0:16:53.920 --> 0:16:59.360
<v Speaker 1>effort to cut down on mistakes. So the perceptron was interesting,

0:16:59.760 --> 0:17:03.080
<v Speaker 1>but it was very limited in complexity. It was essentially

0:17:03.160 --> 0:17:05.560
<v Speaker 1>a single layer where you'd feed a bunch of inputs

0:17:05.560 --> 0:17:07.879
<v Speaker 1>in and you would get an output. So it was

0:17:07.920 --> 0:17:11.959
<v Speaker 1>suitable for a subset of computational challenges, but anything beyond

0:17:12.000 --> 0:17:16.119
<v Speaker 1>that was well beyond its own reach as a single

0:17:16.200 --> 0:17:19.719
<v Speaker 1>layer network. By the late nineteen fifties, other researchers had

0:17:19.760 --> 0:17:23.879
<v Speaker 1>created new neural networks that were multi layered. So a

0:17:23.960 --> 0:17:28.160
<v Speaker 1>node or neuron didn't just accept inputs, it would generate

0:17:28.200 --> 0:17:32.600
<v Speaker 1>outputs that then would become inputs for another layer down.
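
NOTE
A rough sketch of one layer's outputs becoming the next layer's inputs. The
weights are hand-picked rather than learned, and the task (exclusive-or, a
standard example of something a single threshold unit cannot compute) is an
assumption chosen for illustration. The names `unit` and `two_layer_xor` are
hypothetical.
```python
def unit(inputs, weights, threshold):
    # A single node: weighted sum of its inputs compared against a threshold.
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0
def two_layer_xor(a, b):
    # Hidden layer: one node acts like OR, the other like AND.
    h_or = unit([a, b], [1, 1], 1)
    h_and = unit([a, b], [1, 1], 2)
    # Output layer: its inputs are the hidden layer's outputs.
    # It fires when OR is on and AND is off, which is exclusive-or.
    return unit([h_or, h_and], [1, -1], 1)
for a in (0, 1):
    for b in (0, 1):
        print(a, b, two_layer_xor(a, b))
```
The output node never sees the raw inputs, only what the layer above it
produced.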

0:17:33.000 --> 0:17:36.399
<v Speaker 1>So instead of just having one layer of nodes, you

0:17:36.400 --> 0:17:38.840
<v Speaker 1>would have multiple layers of nodes. Typically you would have

0:17:39.280 --> 0:17:43.119
<v Speaker 1>one at the quote unquote top of the network, and

0:17:43.160 --> 0:17:44.880
<v Speaker 1>you would have outputs at the bottom, and the ones

0:17:44.880 --> 0:17:47.920
<v Speaker 1>in between would be often referred to as hidden layers,

0:17:48.400 --> 0:17:51.640
<v Speaker 1>and who knows how many there would be. So anyway

0:17:52.040 --> 0:17:54.840
<v Speaker 1>you would feed data to the system, the initial nodes

0:17:54.880 --> 0:17:58.879
<v Speaker 1>would generate information as outputs that would become inputs for

0:17:58.960 --> 0:18:03.680
<v Speaker 1>the next layer down, which would then continue the process

0:18:03.720 --> 0:18:05.679
<v Speaker 1>and so on and so forth until you get to

0:18:05.720 --> 0:18:08.760
<v Speaker 1>the output. So now you had artificial neural networks that

0:18:08.800 --> 0:18:13.199
<v Speaker 1>could tackle more complex challenges, and you would have multiple

0:18:13.200 --> 0:18:17.120
<v Speaker 1>steps in the process. Didn't necessarily mean they were automatically

0:18:17.200 --> 0:18:21.280
<v Speaker 1>better than the perceptron; it was just that they were able

0:18:21.320 --> 0:18:27.119
<v Speaker 1>to tackle more complicated tasks. What followed is something that

0:18:27.160 --> 0:18:30.680
<v Speaker 1>will probably sound really familiar to you if you ever

0:18:30.840 --> 0:18:35.919
<v Speaker 1>follow technology or fads. The hype around machine learning and

0:18:36.000 --> 0:18:38.800
<v Speaker 1>artificial intelligence, and keep in mind this is like the

0:18:38.920 --> 0:18:43.920
<v Speaker 1>nineteen sixties, grew beyond the technology's actual capabilities at

0:18:43.920 --> 0:18:47.840
<v Speaker 1>that time. People started to project what this technology would

0:18:47.880 --> 0:18:50.239
<v Speaker 1>be able to do, and they did so thinking it

0:18:50.280 --> 0:18:53.520
<v Speaker 1>was going to be in a very short turnaround, like

0:18:53.560 --> 0:18:58.080
<v Speaker 1>we're right on the very precipice of a monstrous breakthrough

0:18:58.119 --> 0:19:00.960
<v Speaker 1>that will bring the science fiction future into the present.

0:19:01.880 --> 0:19:06.719
<v Speaker 1>So when it was realized that we weren't at that, like,

0:19:06.800 --> 0:19:10.639
<v Speaker 1>that's not how progress typically works. It's usually much more

0:19:11.119 --> 0:19:16.200
<v Speaker 1>gradual and humble than that, well, then enthusiasm around AI

0:19:16.280 --> 0:19:18.800
<v Speaker 1>began to take a hit. And as I mentioned already,

0:19:18.840 --> 0:19:22.440
<v Speaker 1>a big part of AI research really comes down to funding,

0:19:23.000 --> 0:19:26.360
<v Speaker 1>and it gets really challenging to secure funding when public

0:19:26.480 --> 0:19:31.200
<v Speaker 1>opinion dims on a technology. We've seen this happen lots

0:19:31.200 --> 0:19:35.000
<v Speaker 1>of times, right, like three D television was a fad

0:19:35.080 --> 0:19:37.720
<v Speaker 1>that was pushed. Now, granted, that one, you could argue

0:19:37.800 --> 0:19:41.120
<v Speaker 1>was more of an example of manufacturing companies that make

0:19:41.200 --> 0:19:44.800
<v Speaker 1>televisions trying to push a technology on consumers and the

0:19:44.800 --> 0:19:47.520
<v Speaker 1>consumers just weren't interested. You could argue that was the

0:19:47.560 --> 0:19:51.000
<v Speaker 1>case there. But virtual reality in the nineteen nineties definitely

0:19:51.040 --> 0:19:54.639
<v Speaker 1>followed this pathway. There was this excitement around virtual reality.

0:19:55.640 --> 0:19:59.480
<v Speaker 1>Then that excitement faded to almost nothing when people realized

0:19:59.480 --> 0:20:02.800
<v Speaker 1>that the actual state of the art of the technology

0:20:03.000 --> 0:20:06.480
<v Speaker 1>was far below where they expected it to be. And

0:20:06.560 --> 0:20:10.040
<v Speaker 1>suddenly people who are working in VR couldn't get funding

0:20:10.200 --> 0:20:12.400
<v Speaker 1>for their work and they kind of had to scrounge

0:20:12.440 --> 0:20:16.359
<v Speaker 1>around in order to keep the development going at all.

0:20:17.040 --> 0:20:19.879
<v Speaker 1>And then eventually we would see that come back around again.

0:20:20.480 --> 0:20:24.040
<v Speaker 1>You could argue that NFTs recently went through this too,

0:20:24.080 --> 0:20:27.560
<v Speaker 1>where the hype went well beyond what NFTs could actually do.

0:20:28.640 --> 0:20:31.920
<v Speaker 1>I've been really down on NFTs in general. I do

0:20:31.960 --> 0:20:37.080
<v Speaker 1>think that there are potential legitimate uses for NFTs, but

0:20:37.160 --> 0:20:43.399
<v Speaker 1>I think the early examples were frivolous and almost solely

0:20:43.480 --> 0:20:49.400
<v Speaker 1>centered around speculation, as in like financial speculation and as

0:20:49.400 --> 0:20:51.320
<v Speaker 1>a result, there was nothing for it to do other

0:20:51.400 --> 0:20:54.520
<v Speaker 1>than to create a bubble that would ultimately burst, which

0:20:54.560 --> 0:20:58.199
<v Speaker 1>is what happened. And maybe NFTs will recover from that

0:20:58.320 --> 0:21:02.440
<v Speaker 1>and become something that's more fundamentally useful in the Internet

0:21:02.520 --> 0:21:05.560
<v Speaker 1>in the future or in digital commerce in the future.

0:21:06.920 --> 0:21:10.879
<v Speaker 1>But it's going to have to get over the catastrophe

0:21:10.920 --> 0:21:13.680
<v Speaker 1>that happened when the rug was pulled out from underneath

0:21:13.760 --> 0:21:19.520
<v Speaker 1>NFTs. And that was all predictable and preventable. But

0:21:21.000 --> 0:21:23.919
<v Speaker 1>like I've said before, like I've lifted the joke from

0:21:23.960 --> 0:21:26.440
<v Speaker 1>Peter Cook, we've learned from our mistakes. We can repeat

0:21:26.480 --> 0:21:31.040
<v Speaker 1>them almost exactly. Anyway, this same sort of hype cycle

0:21:31.119 --> 0:21:35.800
<v Speaker 1>activity happened with neural networks and machine learning in the

0:21:35.880 --> 0:21:41.639
<v Speaker 1>nineteen sixties. Then enter Marvin Minsky and Seymour Papert of

0:21:41.800 --> 0:21:44.920
<v Speaker 1>MIT's AI lab. They were leading that lab at the time.

0:21:45.280 --> 0:21:49.800
<v Speaker 1>In nineteen sixty nine, they co-authored a book titled Perceptrons.

0:21:50.720 --> 0:21:55.040
<v Speaker 1>They were actually critical of that artificial neural network approach

0:21:55.080 --> 0:21:58.080
<v Speaker 1>to AI and machine learning. They were concerned that the

0:21:58.119 --> 0:22:01.040
<v Speaker 1>limitations of the technology meant that you would need an

0:22:01.160 --> 0:22:06.399
<v Speaker 1>unrealistically huge system of artificial neurons, perhaps then using that

0:22:06.400 --> 0:22:10.639
<v Speaker 1>system to compute an infinite number of variations of the

0:22:10.680 --> 0:22:14.399
<v Speaker 1>same process or task if you wanted to train the

0:22:14.400 --> 0:22:18.879
<v Speaker 1>weights so that they were of the optimal value. So,

0:22:18.920 --> 0:22:22.920
<v Speaker 1>in other words, they thought, it's too impractical and it's

0:22:22.960 --> 0:22:24.960
<v Speaker 1>going to take too much compute time, and you're never

0:22:25.040 --> 0:22:27.360
<v Speaker 1>going to achieve the result you want. You're never going

0:22:27.400 --> 0:22:32.600
<v Speaker 1>to get to that most perfect system. And they believed

0:22:33.119 --> 0:22:37.760
<v Speaker 1>it just had fundamental inescapable flaws. They had different systems

0:22:37.800 --> 0:22:42.120
<v Speaker 1>in mind. Now, Minsky and Papert tried to push their

0:22:42.160 --> 0:22:44.680
<v Speaker 1>systems forward, and I could do a full episode about

0:22:44.720 --> 0:22:48.800
<v Speaker 1>them too, and their ideas were not bad. They were different.

0:22:49.160 --> 0:22:51.520
<v Speaker 1>It was a different approach. But this also meant that

0:22:51.600 --> 0:22:54.520
<v Speaker 1>researchers who had been pushing the development of artificial

0:22:54.560 --> 0:22:58.919
<v Speaker 1>neural networks felt forced to move on to different projects

0:22:59.000 --> 0:23:03.600
<v Speaker 1>because financial support for anything connected to the concept of

0:23:03.640 --> 0:23:09.120
<v Speaker 1>neural networks effectively disappeared, right like funding just dropped for that.

0:23:09.200 --> 0:23:13.359
<v Speaker 1>Because here you had these experts in computer science saying, yeah,

0:23:13.560 --> 0:23:19.159
<v Speaker 1>this approach, while interesting, has already hit an insurmountable obstacle

0:23:19.200 --> 0:23:20.960
<v Speaker 1>and it's not going to go any further. It's gone

0:23:21.000 --> 0:23:23.880
<v Speaker 1>as far as it can go. And so a lot

0:23:23.920 --> 0:23:29.640
<v Speaker 1>of computer scientists blamed Minsky and Papert for essentially demolishing

0:23:29.720 --> 0:23:33.680
<v Speaker 1>funding for neural networks for more than a decade, and

0:23:33.680 --> 0:23:37.320
<v Speaker 1>in fact, this would become an era that retrospectively, computer

0:23:37.400 --> 0:23:41.680
<v Speaker 1>scientists would reference as the AI Winter. Got all Game

0:23:41.720 --> 0:23:44.800
<v Speaker 1>of Thrones up in here. Now, in nineteen eighty two,

0:23:45.240 --> 0:23:49.200
<v Speaker 1>there was a hint of spring thawing out that AI

0:23:49.240 --> 0:23:54.120
<v Speaker 1>Winter. Researchers in Japan were starting to resurrect work on

0:23:54.280 --> 0:23:58.640
<v Speaker 1>neural network projects, and meanwhile, a scientist named John Hopfield

0:23:59.080 --> 0:24:02.080
<v Speaker 1>submitted a research paper to the National Academy of Sciences

0:24:02.560 --> 0:24:05.280
<v Speaker 1>that brought neural networks back into discussion here in the

0:24:05.359 --> 0:24:10.800
<v Speaker 1>United States. And because Japan was actively investing in developing

0:24:10.800 --> 0:24:15.000
<v Speaker 1>that technology, institutions in the United States began to open

0:24:15.119 --> 0:24:17.359
<v Speaker 1>up the purse strings a bit because there was a

0:24:17.400 --> 0:24:21.280
<v Speaker 1>concern that if there were something to this artificial neural

0:24:21.320 --> 0:24:25.920
<v Speaker 1>network concept, if in fact those obstacles weren't insurmountable, as

0:24:25.960 --> 0:24:30.480
<v Speaker 1>Minsky and Papert had suggested, the US could potentially fall

0:24:30.720 --> 0:24:35.320
<v Speaker 1>behind another country because it would fail to fund its development. So,

0:24:35.920 --> 0:24:38.760
<v Speaker 1>in a desire not to have Japan take the ball

0:24:38.800 --> 0:24:41.439
<v Speaker 1>and run with it, the United States began to invest

0:24:41.680 --> 0:24:45.479
<v Speaker 1>again in artificial neural network research and development. In the

0:24:45.480 --> 0:24:50.920
<v Speaker 1>mid nineteen eighties, computer scientists essentially rediscovered the usefulness of

0:24:51.480 --> 0:24:55.639
<v Speaker 1>a process called backpropagation. And I've already talked about

0:24:56.160 --> 0:24:58.159
<v Speaker 1>nodes and weights and stuff, but this is going to

0:24:58.160 --> 0:25:00.479
<v Speaker 1>require a little bit more explanation to understand what

0:25:00.560 --> 0:25:03.760
<v Speaker 1>backpropagation is all about. So let's kind of try

0:25:03.800 --> 0:25:07.560
<v Speaker 1>to visualize a neural network. So you've got your input nodes.

0:25:07.920 --> 0:25:10.240
<v Speaker 1>Just think of a bunch of circles. If you were

0:25:10.359 --> 0:25:12.160
<v Speaker 1>drawing it from top to bottom, this would be your

0:25:12.200 --> 0:25:15.679
<v Speaker 1>top layer. This is like the funnels where you're going

0:25:15.760 --> 0:25:19.639
<v Speaker 1>to feed data into the system. Now you've got a

0:25:19.640 --> 0:25:21.400
<v Speaker 1>whole bunch of these at the top and they can

0:25:21.440 --> 0:25:25.240
<v Speaker 1>accept the data that you're feeding in. They process that data,

0:25:25.640 --> 0:25:30.480
<v Speaker 1>and then based upon some operation, they will then send

0:25:30.760 --> 0:25:35.240
<v Speaker 1>an output to a node one layer down. So there's

0:25:35.280 --> 0:25:38.440
<v Speaker 1>lots of other nodes in the layers below, or maybe

0:25:38.480 --> 0:25:40.600
<v Speaker 1>not as many as you have initial layers. You might

0:25:40.600 --> 0:25:45.919
<v Speaker 1>actually have fewer, and the layers above will send to

0:25:46.760 --> 0:25:48.800
<v Speaker 1>you know, data to a specific node depending upon what

0:25:48.960 --> 0:25:54.280
<v Speaker 1>the outcome is, whatever the output is. So these nodes

0:25:54.400 --> 0:25:58.199
<v Speaker 1>accept the input. These inputs have a bias and a

0:25:58.240 --> 0:26:01.520
<v Speaker 1>weight to them, and this is one of the hidden layers.

0:26:01.560 --> 0:26:04.320
<v Speaker 1>They will then create an output and send that on

0:26:04.480 --> 0:26:09.399
<v Speaker 1>to nodes another layer down. So this goes on until

0:26:09.440 --> 0:26:11.840
<v Speaker 1>you get to your output layer, where you get your

0:26:11.880 --> 0:26:15.399
<v Speaker 1>final result, and then you can determine whether or not

0:26:15.400 --> 0:26:18.800
<v Speaker 1>the final result matches what you were hoping for. So

0:26:18.840 --> 0:26:21.920
<v Speaker 1>did your system properly identify which photos do and don't

0:26:21.960 --> 0:26:24.800
<v Speaker 1>have cats in them? Now, as I mentioned earlier, you

0:26:24.840 --> 0:26:28.000
<v Speaker 1>typically get results that aren't perfect, but we want to

0:26:28.080 --> 0:26:32.760
<v Speaker 1>train the system to improve with every test. Backpropagation

0:26:33.400 --> 0:26:36.800
<v Speaker 1>is one way to do this. So with backpropagation,

0:26:37.359 --> 0:26:40.040
<v Speaker 1>you actually start with the final output. You've already done

0:26:40.040 --> 0:26:43.480
<v Speaker 1>a test run, right, and you've got your output, and

0:26:44.080 --> 0:26:49.160
<v Speaker 1>maybe your test has five possible final outcomes, but only

0:26:49.200 --> 0:26:52.119
<v Speaker 1>one of those is the outcome you actually want. Okay,

0:26:52.160 --> 0:26:55.359
<v Speaker 1>we'll say it's outcome number one. We're saying I want

0:26:55.359 --> 0:26:59.159
<v Speaker 1>this system to more often than not come to the

0:26:59.160 --> 0:27:01.840
<v Speaker 1>conclusion that's outcome number one. But you run your test.

0:27:02.200 --> 0:27:08.320
<v Speaker 1>It's got, you know, one thousand little tasks in it, and

0:27:08.359 --> 0:27:11.840
<v Speaker 1>you run your test. You find out that it only

0:27:11.920 --> 0:27:14.399
<v Speaker 1>arrives at outcome number one five percent of the time,

0:27:14.640 --> 0:27:17.080
<v Speaker 1>which is actually worse than random chance. Right, it should

0:27:17.080 --> 0:27:19.359
<v Speaker 1>be twenty percent for random chance, but it's only getting

0:27:19.359 --> 0:27:22.359
<v Speaker 1>there five percent of the time. Something is going really

0:27:22.400 --> 0:27:26.199
<v Speaker 1>wrong with your system for it to mistakenly go to

0:27:26.240 --> 0:27:29.760
<v Speaker 1>one of the other options and very rarely go to

0:27:29.800 --> 0:27:32.960
<v Speaker 1>the correct one. So let's say you also noticed that

0:27:32.960 --> 0:27:36.080
<v Speaker 1>outcome number three. It goes to that one forty percent

0:27:36.080 --> 0:27:38.359
<v Speaker 1>of the time. So it's making this mistake forty percent

0:27:38.359 --> 0:27:40.400
<v Speaker 1>of the time and only getting it right five percent

0:27:40.440 --> 0:27:43.080
<v Speaker 1>of the time. So things are seriously out of whack.
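
NOTE
A minimal sketch of the test run described above, assuming a made-up network
(4 input nodes, 6 hidden nodes, 5 possible outcomes) with random, untrained weights.
The names make_layer, forward, predict and outcome_rates are illustrative only and
do not come from the episode. It pushes 1,000 inputs down through the layers and
tallies how often each outcome comes up, so each rate can be compared with the
20 percent you would expect from random chance.
```python
import math
import random
from collections import Counter
# Hypothetical sizes: 4 input nodes, 6 hidden nodes, 5 possible outcomes.
def make_layer(n_in, n_out, rng):
    # Each node gets one weight per incoming connection plus one bias.
    return [([rng.uniform(-1, 1) for _ in range(n_in)], rng.uniform(-1, 1)) for _ in range(n_out)]
def forward(layer, inputs):
    # Weighted sum plus bias, squashed by tanh, for every node in the layer.
    return [math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias) for weights, bias in layer]
def predict(network, inputs):
    # Pass the data down layer by layer; the strongest output node wins.
    for layer in network:
        inputs = forward(layer, inputs)
    return inputs.index(max(inputs)) + 1  # outcomes are numbered from 1
def outcome_rates(network, samples):
    # Fraction of test inputs that land on each of the five outcomes.
    counts = Counter(predict(network, s) for s in samples)
    return {k: counts.get(k, 0) / len(samples) for k in range(1, 6)}
rng = random.Random(0)
network = [make_layer(4, 6, rng), make_layer(6, 5, rng)]
samples = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(1000)]
rates = outcome_rates(network, samples)
print(rates)  # compare each rate with the 0.20 expected from random chance
```
An untrained network like this one will usually favor some outcomes heavily,
which is the kind of lopsided result the training step is meant to correct.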

0:27:43.160 --> 0:27:47.240
<v Speaker 1>You need to find which connections which would involve the

0:27:47.280 --> 0:27:51.159
<v Speaker 1>biases and the weights that are within your system that

0:27:51.200 --> 0:27:55.359
<v Speaker 1>are leading it to mistakenly arrive at the wrong outcome

0:27:55.560 --> 0:27:59.960
<v Speaker 1>so frequently. You want to reduce those factors, and simultaneously

0:28:00.119 --> 0:28:03.119
<v Speaker 1>you need to boost the ones that lead the system

0:28:03.280 --> 0:28:05.919
<v Speaker 1>to arrive at outcome number one, because that's the answer

0:28:06.000 --> 0:28:09.560
<v Speaker 1>you actually want the system to get to. All right,

0:28:10.640 --> 0:28:12.720
<v Speaker 1>I've been droning on for a bit. Let's take another

0:28:12.800 --> 0:28:15.320
<v Speaker 1>quick break. When we come back, I'll finish up explaining

0:28:15.359 --> 0:28:27.840
<v Speaker 1>this and then we'll move on to catastrophic forgetting. Okay,

0:28:28.280 --> 0:28:31.280
<v Speaker 1>so we were talking about how you are looking at

0:28:31.320 --> 0:28:35.439
<v Speaker 1>a system that is coming to the wrong conclusion ninety

0:28:35.520 --> 0:28:38.560
<v Speaker 1>five percent of the time. It is a broken system.

0:28:38.880 --> 0:28:43.120
<v Speaker 1>You have to then figure out what factors are causing

0:28:43.120 --> 0:28:46.960
<v Speaker 1>this to happen, and they are numerous, right, They extend

0:28:47.000 --> 0:28:49.480
<v Speaker 1>all the way up to the very top of your

0:28:49.520 --> 0:28:52.160
<v Speaker 1>neural network, the other end where the input comes in.

0:28:52.520 --> 0:28:55.120
<v Speaker 1>But you can't just change everything all at once. You've

0:28:55.120 --> 0:28:58.480
<v Speaker 1>got to figure this out systematically, and that's what backpropagation

0:28:58.600 --> 0:29:03.240
<v Speaker 1>is really all about: which links one layer up from

0:29:03.280 --> 0:29:07.200
<v Speaker 1>the output have the greatest impact on the outcome? Right,

0:29:07.880 --> 0:29:10.720
<v Speaker 1>changing everything would be tedious, it would be impractical. You

0:29:10.800 --> 0:29:14.120
<v Speaker 1>might even make things worse. Some of these neural networks

0:29:14.160 --> 0:29:19.320
<v Speaker 1>are confoundingly complicated, so it's not really a feasible solution.

0:29:19.680 --> 0:29:22.480
<v Speaker 1>So instead you look at the connections that are having

0:29:22.560 --> 0:29:25.760
<v Speaker 1>the biggest impact on your outcome. So you want things

0:29:25.800 --> 0:29:28.160
<v Speaker 1>where if you make a small change in either the

0:29:28.160 --> 0:29:31.080
<v Speaker 1>bias or the weight, or maybe both, you'll see a

0:29:31.200 --> 0:29:35.040
<v Speaker 1>larger end effect on the outcome. All the connections are

0:29:35.160 --> 0:29:39.040
<v Speaker 1>arguably important, but some are more important than others. Backpropagation

0:29:39.160 --> 0:29:41.880
<v Speaker 1>works backwards from the result toward the other end of

0:29:41.920 --> 0:29:44.720
<v Speaker 1>the network to tweak those connections. It boosts ones that

0:29:44.840 --> 0:29:48.840
<v Speaker 1>lead to the correct or desired response, and it reduces

0:29:48.880 --> 0:29:53.360
<v Speaker 1>the values of those that lead to incorrect or undesired responses.
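
NOTE
A rough sketch of the idea above, assuming a tiny invented network (2 inputs,
2 hidden nodes, 1 output, sigmoid nodes, squared-error loss) and hand-picked
starting weights. None of these choices come from the episode. It starts from the
error at the output, works one layer back using the chain rule, and nudges each
weight and bias in proportion to its share of the blame.
```python
import math
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))
def forward(params, x):
    # Hidden layer: two nodes, each with two weights and a bias.
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b) for ws, b in zip(params["w_hidden"], params["b_hidden"])]
    # Output layer: one node fed by both hidden nodes.
    y = sigmoid(sum(w * hi for w, hi in zip(params["w_out"], h)) + params["b_out"])
    return h, y
def backprop_step(params, x, target, lr=0.5):
    h, y = forward(params, x)
    loss = 0.5 * (y - target) ** 2
    # Start at the output: how does the error change with the output node's input?
    delta_out = (y - target) * y * (1.0 - y)
    # Step one layer back: share the blame among hidden nodes via the output weights.
    delta_hidden = [delta_out * w * hi * (1.0 - hi) for w, hi in zip(params["w_out"], h)]
    # Gradient for a connection = blame at the receiving node * signal it carried.
    grad_w_out = [delta_out * hi for hi in h]
    grad_w_hidden = [[d * xi for xi in x] for d in delta_hidden]
    # Nudge every weight and bias against its gradient.
    params["w_out"] = [w - lr * g for w, g in zip(params["w_out"], grad_w_out)]
    params["b_out"] -= lr * delta_out
    params["w_hidden"] = [[w - lr * g for w, g in zip(ws, gs)] for ws, gs in zip(params["w_hidden"], grad_w_hidden)]
    params["b_hidden"] = [b - lr * d for b, d in zip(params["b_hidden"], delta_hidden)]
    return loss, grad_w_out, grad_w_hidden
params = {"w_hidden": [[0.5, -0.4], [0.3, 0.8]], "b_hidden": [0.0, 0.0], "w_out": [0.7, -0.6], "b_out": 0.0}
losses = [backprop_step(params, [1.0, 0.5], 1.0)[0] for _ in range(200)]
print(losses[0], losses[-1])  # the error should shrink as the steps repeat
```
The gradients returned by backprop_step are the "which connections matter most"
signal: a larger magnitude means a small change to that weight moves the outcome more.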

0:29:53.720 --> 0:29:55.520
<v Speaker 1>If we were to think of this like the classic

0:29:55.600 --> 0:30:00.000
<v Speaker 1>example in chaos theory, this could potentially involve us studying a

0:30:00.200 --> 0:30:02.840
<v Speaker 1>hurricane as it hits land and tracing its history back

0:30:02.880 --> 0:30:06.280
<v Speaker 1>as it moved through the ocean, and we would eventually

0:30:06.320 --> 0:30:09.240
<v Speaker 1>arrive at the point where it was a tropical storm,

0:30:09.320 --> 0:30:12.040
<v Speaker 1>and then we would go further back and see the

0:30:12.040 --> 0:30:14.960
<v Speaker 1>factors that led to the creation of that storm. And

0:30:15.000 --> 0:30:16.720
<v Speaker 1>maybe if we tracked it all the way back, we

0:30:16.760 --> 0:30:20.200
<v Speaker 1>would even find that one of a billion factors that

0:30:20.280 --> 0:30:23.480
<v Speaker 1>made the storm was, in fact, a butterfly flapping

0:30:23.560 --> 0:30:25.080
<v Speaker 1>its wings on the other side of the world and

0:30:25.120 --> 0:30:28.400
<v Speaker 1>that contributed to it. Maybe we find out that butterfly

0:30:28.400 --> 0:30:32.200
<v Speaker 1>flap of its wings had an impact, but it was negligible,

0:30:32.240 --> 0:30:33.960
<v Speaker 1>and that if the butterfly hadn't flapped its wings, the

0:30:34.040 --> 0:30:36.640
<v Speaker 1>hurricane still would have happened. That would be an example

0:30:36.680 --> 0:30:40.080
<v Speaker 1>of well, we don't bother adjusting the weight of the

0:30:40.400 --> 0:30:43.360
<v Speaker 1>impact of that butterfly flapping its wings because

0:30:43.360 --> 0:30:46.440
<v Speaker 1>it doesn't matter for the end result. But what if

0:30:46.480 --> 0:30:48.880
<v Speaker 1>we were to discover that that butterfly flap of its

0:30:48.920 --> 0:30:53.600
<v Speaker 1>wings is the only reason the hurricane happened, or

0:30:53.640 --> 0:30:56.040
<v Speaker 1>at least was the primary reason that all the other

0:30:56.120 --> 0:30:59.120
<v Speaker 1>factors pale in comparison. Well, then we'd want to make

0:30:59.120 --> 0:31:04.040
<v Speaker 1>sure we boost the weight of that input, because clearly

0:31:04.080 --> 0:31:09.040
<v Speaker 1>that butterfly is fundamental for hurricanes. I think hurricanes are

0:31:09.080 --> 0:31:12.080
<v Speaker 1>really dangerous, and I would ask butterflies to kind of chill,

0:31:12.600 --> 0:31:16.000
<v Speaker 1>all right. I mean, I don't want butterflies to go away,

0:31:16.760 --> 0:31:20.560
<v Speaker 1>just you know, maybe stop flapping so much. Anyway, the

0:31:20.600 --> 0:31:24.760
<v Speaker 1>formula for backpropagation gets into some calculus that is well

0:31:24.760 --> 0:31:27.920
<v Speaker 1>beyond my knowledge and skill. So rather than attempt to

0:31:28.040 --> 0:31:32.040
<v Speaker 1>stumble my way through an explanation that I don't actually understand,

0:31:33.000 --> 0:31:34.760
<v Speaker 1>I think it's best to leave the concept at the

0:31:34.840 --> 0:31:37.760
<v Speaker 1>high level that I have described right now. So just

0:31:37.840 --> 0:31:39.960
<v Speaker 1>know that it gets way more granular than what I've

0:31:40.000 --> 0:31:44.160
<v Speaker 1>talked about. But essentially, you're looking at those factors that

0:31:44.320 --> 0:31:47.920
<v Speaker 1>led to the ultimate decision and saying which ones of

0:31:47.960 --> 0:31:51.440
<v Speaker 1>these had the greatest impact, and how can I tweak

0:31:51.520 --> 0:31:54.880
<v Speaker 1>them so that I can shape the outcome to one

0:31:54.920 --> 0:31:57.040
<v Speaker 1>I wanted. If we were thinking about that example I

0:31:57.080 --> 0:31:59.600
<v Speaker 1>gave about whether or not you go to the movies.

0:32:00.640 --> 0:32:06.280
<v Speaker 1>Maybe in present day you start thinking about past experiences

0:32:06.320 --> 0:32:08.520
<v Speaker 1>where you made a decision to go out when you

0:32:08.560 --> 0:32:11.120
<v Speaker 1>had a big day the following day, and how

0:32:11.680 --> 0:32:14.560
<v Speaker 1>that impacted you, perhaps negatively. Maybe you're like, man, I

0:32:14.560 --> 0:32:17.720
<v Speaker 1>should have gotten a promotion by now, and then you think, well,

0:32:17.760 --> 0:32:20.440
<v Speaker 1>I do go to the movies an awful lot. You

0:32:20.520 --> 0:32:23.200
<v Speaker 1>might say, I need to adjust some of the factors

0:32:23.240 --> 0:32:27.959
<v Speaker 1>that affect my decision making process and perhaps prioritize my career.

0:32:28.480 --> 0:32:33.280
<v Speaker 1>Or if you've decided that late stage capitalism is terrible

0:32:33.360 --> 0:32:35.960
<v Speaker 1>evil and that you're going to try and live a

0:32:36.000 --> 0:32:40.840
<v Speaker 1>hedonistic lifestyle of a wandering soul, maybe you say, I'm

0:32:40.880 --> 0:32:42.680
<v Speaker 1>going to go and see my movie with my friend,

0:32:43.080 --> 0:32:45.640
<v Speaker 1>and yeah, that's just how it is, because that's the

0:32:45.640 --> 0:32:47.560
<v Speaker 1>most important thing to me. You only go around this

0:32:47.640 --> 0:32:50.560
<v Speaker 1>crazy world once. After all, I'm not telling you which

0:32:50.560 --> 0:32:54.440
<v Speaker 1>way to go. I'm still finding my own way. But yeah,

0:32:54.480 --> 0:32:57.640
<v Speaker 1>backpropagation would be how you would go back and say,

0:32:57.680 --> 0:33:00.880
<v Speaker 1>all right, well, because I don't like the outcome that happened,

0:33:01.360 --> 0:33:05.440
<v Speaker 1>I need to change the way these factors weigh in

0:33:05.760 --> 0:33:09.400
<v Speaker 1>on the decision making process that goes through the whole system. Now,

0:33:09.440 --> 0:33:13.080
<v Speaker 1>the advancements in the science of neural networks proved that

0:33:13.120 --> 0:33:16.600
<v Speaker 1>the technology no longer operated under the constraints that concerned

0:33:16.720 --> 0:33:19.920
<v Speaker 1>Minsky and Papert in the late sixties, so once again

0:33:20.320 --> 0:33:24.520
<v Speaker 1>funding found its way to neural network research and development projects.

0:33:25.280 --> 0:33:29.840
<v Speaker 1>Now let's finally talk about forgetting and what makes it catastrophic.

0:33:30.640 --> 0:33:34.360
<v Speaker 1>So you could, in theory, develop an artificial neural network

0:33:34.720 --> 0:33:38.480
<v Speaker 1>and have a library of training data, and the only

0:33:38.560 --> 0:33:41.240
<v Speaker 1>thing you ever do with this network is you feed

0:33:41.320 --> 0:33:45.960
<v Speaker 1>that same set of training data to that same neural

0:33:46.040 --> 0:33:50.440
<v Speaker 1>network over and over in an effort to get performance

0:33:50.480 --> 0:33:53.720
<v Speaker 1>as close to perfect as you possibly can. Just you know,

0:33:53.840 --> 0:33:55.640
<v Speaker 1>it's kind of like if you have a car and

0:33:55.680 --> 0:33:59.400
<v Speaker 1>you're constantly tweaking it so it will perform better, and

0:33:59.480 --> 0:34:02.320
<v Speaker 1>maybe you change one thing and it boosts performance in

0:34:02.360 --> 0:34:06.120
<v Speaker 1>one area, but it kind of negatively impacts performance in

0:34:06.160 --> 0:34:09.080
<v Speaker 1>another area, so then you got to tweak something else.

0:34:09.440 --> 0:34:11.480
<v Speaker 1>You could be doing that with an artificial neural network

0:34:11.520 --> 0:34:13.879
<v Speaker 1>forever and just be using the same set of training data.

0:34:14.320 --> 0:34:16.320
<v Speaker 1>And all you're trying to do is make a system

0:34:16.640 --> 0:34:19.399
<v Speaker 1>that could handle that training data better than any other

0:34:19.440 --> 0:34:21.920
<v Speaker 1>system in the world, and that would be interesting, but

0:34:22.000 --> 0:34:25.560
<v Speaker 1>it would be useless from a practical standpoint. You could say, like, hey,

0:34:25.600 --> 0:34:27.359
<v Speaker 1>you want to see my machine that can sort through

0:34:27.560 --> 0:34:31.360
<v Speaker 1>only this collection of photographs and pick out the ones

0:34:31.400 --> 0:34:33.200
<v Speaker 1>that have cats in them and the ones that don't,

0:34:33.520 --> 0:34:37.799
<v Speaker 1>pretty, pretty darn effectively, but not perfectly. It's not really

0:34:37.800 --> 0:34:41.560
<v Speaker 1>an interesting value proposition, right, So more likely you are

0:34:41.600 --> 0:34:44.400
<v Speaker 1>eventually going to start feeding lots of different kinds of

0:34:44.480 --> 0:34:48.759
<v Speaker 1>data to this neural network. And, you know, yeah, you train

0:34:48.840 --> 0:34:51.759
<v Speaker 1>the network on certain data sets, but your goal is

0:34:51.800 --> 0:34:54.440
<v Speaker 1>to feed new sets of data, data the system has

0:34:54.480 --> 0:34:57.440
<v Speaker 1>never encountered before and rely on the system's ability to

0:34:57.560 --> 0:35:00.760
<v Speaker 1>process this information correctly to get the result you want.

0:35:01.320 --> 0:35:04.040
<v Speaker 1>And we might even be talking about stuff that human

0:35:04.080 --> 0:35:07.880
<v Speaker 1>beings can't easily do, right, But see, the training data

0:35:08.480 --> 0:35:10.200
<v Speaker 1>is going to mean that the network will start to

0:35:10.239 --> 0:35:15.000
<v Speaker 1>create and reinforce certain pathways, and those pathways will over

0:35:15.080 --> 0:35:17.359
<v Speaker 1>time get stronger and stronger, just as we said at

0:35:17.360 --> 0:35:20.520
<v Speaker 1>the beginning of this episode. But new data is going

0:35:20.560 --> 0:35:25.120
<v Speaker 1>to necessitate new pathways. Sometimes when the system begins to

0:35:25.160 --> 0:35:29.960
<v Speaker 1>form these new pathways, it forgets the old pathways. So

0:35:30.000 --> 0:35:32.880
<v Speaker 1>it's possible for a neural network to actually get worse

0:35:33.120 --> 0:35:36.080
<v Speaker 1>at the task it had previously been trained to do

0:35:37.080 --> 0:35:41.120
<v Speaker 1>with the actual training material. In fact, in a true catastrophe,

0:35:41.160 --> 0:35:45.440
<v Speaker 1>the system might forget the objective and doesn't recognize what

0:35:45.480 --> 0:35:48.400
<v Speaker 1>the desired outcome is meant to be, so the results

0:35:48.440 --> 0:35:51.480
<v Speaker 1>can appear random and meaningless. It's as if the system

0:35:51.520 --> 0:35:55.440
<v Speaker 1>has developed some form of amnesia. So this is prevalent,

0:35:56.000 --> 0:36:00.600
<v Speaker 1>most prevalent anyway, in systems that rely on unguided learning.
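
NOTE
A toy illustration of the forgetting described above, not a real experiment.
It assumes a single logistic node with two inputs and two invented tasks:
task A asks whether the first input is positive, task B asks whether the second
input is positive. The node is trained on A, then retrained on B alone, and its
accuracy on A is measured before and after.
```python
import math
import random
def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp in range
    return 1.0 / (1.0 + math.exp(-z))
def train(weights, data, epochs=20, lr=0.5):
    # Plain per-example gradient descent on one logistic node: [w0, w1, bias].
    for _ in range(epochs):
        for (x0, x1), label in data:
            error = sigmoid(weights[0] * x0 + weights[1] * x1 + weights[2]) - label
            weights[0] -= lr * error * x0
            weights[1] -= lr * error * x1
            weights[2] -= lr * error
    return weights
def accuracy(weights, data):
    hits = sum((sigmoid(weights[0] * x0 + weights[1] * x1 + weights[2]) > 0.5) == bool(label) for (x0, x1), label in data)
    return hits / len(data)
rng = random.Random(1)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
task_a = [(p, int(p[0] > 0)) for p in points]  # hypothetical task A
task_b = [(p, int(p[1] > 0)) for p in points]  # hypothetical task B
weights = train([0.0, 0.0, 0.0], task_a)
acc_a_before = accuracy(weights, task_a)
weights = train(weights, task_b)  # retrain on B only, with no reminder of A
acc_a_after = accuracy(weights, task_a)
acc_b_after = accuracy(weights, task_b)
print(acc_a_before, acc_a_after, acc_b_after)
```
Retraining on B drags the weight that task A depended on back toward zero,
so accuracy on A falls toward coin-flip territory even though nothing about A changed.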

0:36:01.200 --> 0:36:06.120
<v Speaker 1>With guided learning, you have engineers who are carefully selecting

0:36:06.160 --> 0:36:10.839
<v Speaker 1>the data that gets fed into a system. An unguided

0:36:10.840 --> 0:36:15.160
<v Speaker 1>system would collect raw data from wherever and attempt to

0:36:15.200 --> 0:36:18.560
<v Speaker 1>deliver desired results, and those are the kinds of

0:36:19.280 --> 0:36:23.000
<v Speaker 1>neural networks that are more prone to catastrophic forgetting. But

0:36:23.080 --> 0:36:27.359
<v Speaker 1>as I said, machine learning systems tackle new data, maybe

0:36:27.360 --> 0:36:31.239
<v Speaker 1>even new tasks, and then you get the risk of

0:36:31.280 --> 0:36:34.080
<v Speaker 1>the system forgetting stuff. So I jokingly say, it's kind

0:36:34.080 --> 0:36:36.200
<v Speaker 1>of like when I learn something new, it has to

0:36:36.200 --> 0:36:39.680
<v Speaker 1>push out something old, like you know, my friend's phone

0:36:39.760 --> 0:36:42.160
<v Speaker 1>number or something. Suddenly I can no longer remember it

0:36:42.239 --> 0:36:45.640
<v Speaker 1>because I learned some new interesting fact, as if I

0:36:45.719 --> 0:36:49.000
<v Speaker 1>have met my capacity for being able to know things.

0:36:49.200 --> 0:36:52.759
<v Speaker 1>So learning anything new necessitates having to forget something I

0:36:52.880 --> 0:36:56.160
<v Speaker 1>used to know, like Gotye, because now Gotye

0:36:56.320 --> 0:36:59.439
<v Speaker 1>is just somebody that I used to know. But wait,

0:36:59.680 --> 0:37:04.680
<v Speaker 1>there's more. Just as a system can experience catastrophic forgetting,

0:37:05.400 --> 0:37:10.920
<v Speaker 1>it can also experience catastrophic remembering. This is when a

0:37:10.960 --> 0:37:15.200
<v Speaker 1>system mistakenly believes it is doing one process, a task

0:37:15.320 --> 0:37:19.160
<v Speaker 1>it had previously been trained to do, rather than the

0:37:19.200 --> 0:37:23.040
<v Speaker 1>one it's actually trying to do. So let's say we've

0:37:23.040 --> 0:37:26.359
<v Speaker 1>got an artificial neural network, and originally we taught it

0:37:26.400 --> 0:37:28.920
<v Speaker 1>to recognize the photos that have cats in them versus

0:37:28.920 --> 0:37:31.960
<v Speaker 1>the ones that don't. But now we have retrained the

0:37:32.080 --> 0:37:36.480
<v Speaker 1>same artificial neural network to try and recognize handwritten text.

0:37:37.239 --> 0:37:41.080
<v Speaker 1>Except when we feed handwritten text to the system, suddenly

0:37:41.120 --> 0:37:44.560
<v Speaker 1>the system believes it's trying to determine where the cats are.

0:37:45.160 --> 0:37:47.719
<v Speaker 1>This is something that can happen with machine learning systems too,

0:37:47.719 --> 0:37:50.560
<v Speaker 1>and you still get bad results out of it. So

0:37:50.680 --> 0:37:55.120
<v Speaker 1>this is a real problem. Now, these are not insurmountable problems.

0:37:55.680 --> 0:37:59.520
<v Speaker 1>There are some solutions that are actually intuitive. For example,

0:38:00.120 --> 0:38:03.480
<v Speaker 1>any gamer out there knows that it's best to save

0:38:03.560 --> 0:38:06.400
<v Speaker 1>your game just before you head into a big boss battle,

0:38:06.680 --> 0:38:09.879
<v Speaker 1>just in case things don't go the way you planned.

0:38:09.880 --> 0:38:13.279
<v Speaker 1>Well, with artificial neural networks, it's maybe not a bad idea

0:38:13.360 --> 0:38:16.640
<v Speaker 1>to make a copy of a network before you retrain

0:38:16.719 --> 0:38:19.200
<v Speaker 1>it to do something new. Then you still have the

0:38:19.200 --> 0:38:23.080
<v Speaker 1>backup if things do go pear-shaped. There are other

0:38:23.120 --> 0:38:27.799
<v Speaker 1>approaches to decreasing the risk of catastrophic forgetting or catastrophic remembering.
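
NOTE
The save-before-the-boss-battle idea mentioned a moment ago can be sketched as
follows. This assumes the network is an ordinary Python object that can be deep
copied; retrain_with_backup, bad_retrain and the dict-of-weights "network" are
hypothetical stand-ins for illustration, not a real library's API.
```python
import copy
def retrain_with_backup(network, retrain, still_good):
    # Save a full copy first, like saving the game before the boss battle.
    backup = copy.deepcopy(network)
    candidate = retrain(network)  # may modify the network in place
    # Keep the retrained network only if it still passes the old checks.
    return candidate if still_good(candidate) else backup
# Hypothetical stand-in: a "network" that is just a dict of weights.
cat_network = {"weights": [0.9, -0.2, 0.4]}
def bad_retrain(net):
    net["weights"] = [0.0, 0.0, 0.0]  # pretend retraining wiped out the old skill
    return net
restored = retrain_with_backup(cat_network, bad_retrain, lambda net: any(net["weights"]))
print(restored["weights"])
```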

0:38:28.280 --> 0:38:32.400
<v Speaker 1>An article in Applied Mathematics titled Overcoming Catastrophic Forgetting in

0:38:32.440 --> 0:38:36.560
<v Speaker 1>Neural Networks describes a system in which the researchers purposefully

0:38:36.600 --> 0:38:41.640
<v Speaker 1>slowed down the network's ability to change the weights involved

0:38:41.719 --> 0:38:47.359
<v Speaker 1>in important tasks from previous training cycles. So this makes

0:38:47.400 --> 0:38:50.239
<v Speaker 1>teaching the system to do new tasks a little more

0:38:50.320 --> 0:38:57.360
<v Speaker 1>challenging because it's protecting these weights. It's preventing the system's

0:38:57.360 --> 0:39:02.600
<v Speaker 1>ability to be completely plastic, which means the system has

0:39:02.640 --> 0:39:05.160
<v Speaker 1>to work around these constraints and still learn how to

0:39:05.160 --> 0:39:08.279
<v Speaker 1>do the new task, but in the process it means

0:39:08.320 --> 0:39:12.120
<v Speaker 1>it doesn't forget how to do the previous tasks. This

0:39:12.239 --> 0:39:15.920
<v Speaker 1>article is interesting because of the tasks the researchers actually used

0:39:16.040 --> 0:39:18.600
<v Speaker 1>for the purposes of training. Like, what were they teaching the

0:39:18.680 --> 0:39:21.440
<v Speaker 1>artificial neural network to do? Well, they were teaching it

0:39:21.640 --> 0:39:24.680
<v Speaker 1>how to play Atari twenty six hundred games. So they

0:39:24.680 --> 0:39:27.879
<v Speaker 1>would start with one game and train the system on

0:39:27.960 --> 0:39:31.640
<v Speaker 1>how to play the game. Then they would give the

0:39:31.680 --> 0:39:36.160
<v Speaker 1>system a new game with different game mechanics, and the

0:39:36.200 --> 0:39:38.840
<v Speaker 1>system would have to learn how to play this new game,

0:39:39.440 --> 0:39:41.520
<v Speaker 1>but they wanted to see if it could still remember

0:39:41.520 --> 0:39:43.840
<v Speaker 1>how to play the original game. That was kind of

0:39:43.880 --> 0:39:46.320
<v Speaker 1>the system they were working on. They were tweaking things

0:39:46.920 --> 0:39:51.360
<v Speaker 1>so that the machine learning artificial neural network as a

0:39:51.360 --> 0:39:54.400
<v Speaker 1>whole could learn how to play multiple Atari twenty six

0:39:54.480 --> 0:39:57.400
<v Speaker 1>hundred games without forgetting how to do the previous ones.
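
NOTE
A rough sketch of the weight-protection idea described above. The paper calls its method elastic weight consolidation; the toy below is only an illustration under stated assumptions, not the researchers' code. It uses two weights, a made-up importance score per weight, and a hypothetical constant gradient coming from the new task.
```python
def ewc_penalty(weights, old_weights, importance, strength=1.0):
    """Return (strength / 2) * sum(importance_i * (w_i - old_w_i) ** 2)."""
    return 0.5 * strength * sum(
        f * (w - w_old) ** 2
        for w, w_old, f in zip(weights, old_weights, importance)
    )
#
def penalized_step(weights, new_task_grad, old_weights, importance, lr=0.1, strength=1.0):
    """One gradient-descent step on the new-task loss plus the penalty above."""
    return [
        w - lr * (g + strength * f * (w - w_old))
        for w, g, w_old, f in zip(weights, new_task_grad, old_weights, importance)
    ]
#
old = [1.0, 1.0]           # weights after learning the first game
importance = [10.0, 0.0]   # assumed: first weight mattered for the old game, second did not
grad = [1.0, 1.0]          # hypothetical new-task gradient, the same pull on both weights
#
w = list(old)
for _ in range(20):
    w = penalized_step(w, grad, old, importance)
```
In this toy run the protected weight settles close to its old value while the unprotected one keeps moving with the new task, which is the trade-off described above: learning the new game gets harder, but the old one is not wiped out.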

0:39:57.840 --> 0:40:00.000
<v Speaker 1>This is a non-trivial task. I mean, it takes

0:40:00.040 --> 0:40:03.120
<v Speaker 1>a lot of work to see exactly how to preserve

0:40:03.200 --> 0:40:05.960
<v Speaker 1>things so that you're not slowing down the learning process

0:40:05.960 --> 0:40:08.640
<v Speaker 1>too much, but you're also not inviting the possibility of

0:40:08.640 --> 0:40:13.480
<v Speaker 1>catastrophic forgetting. Now, that's just one example of how researchers

0:40:13.480 --> 0:40:17.080
<v Speaker 1>are looking to mitigate the problem of catastrophic forgetting and

0:40:17.080 --> 0:40:20.719
<v Speaker 1>catastrophic remembering. There are other methods as well, and maybe

0:40:20.760 --> 0:40:23.719
<v Speaker 1>I'll do another episode where I'll go into more detail

0:40:23.960 --> 0:40:27.480
<v Speaker 1>on some of those. They do get pretty complicated, and

0:40:27.480 --> 0:40:31.359
<v Speaker 1>in fact, eventually, really not even eventually, pretty early

0:40:31.400 --> 0:40:35.520
<v Speaker 1>on I hit my limit for as far as I

0:40:35.560 --> 0:40:39.319
<v Speaker 1>can understand the actual mechanics of the system. So rather

0:40:39.400 --> 0:40:43.880
<v Speaker 1>than you know, try and punch above my weight, I

0:40:43.920 --> 0:40:46.640
<v Speaker 1>think it's best to kind of be a little more general,

0:40:47.360 --> 0:40:49.400
<v Speaker 1>but just to have that understanding to kind of get

0:40:49.400 --> 0:40:52.920
<v Speaker 1>a better appreciation of some of the challenges relating to

0:40:53.120 --> 0:40:58.240
<v Speaker 1>artificial intelligence in general and machine learning in particular. And again,

0:40:58.400 --> 0:41:01.560
<v Speaker 1>like this machine learning issue, it's really a bigger

0:41:01.640 --> 0:41:06.120
<v Speaker 1>problem with more sophisticated systems that are meant to do

0:41:06.400 --> 0:41:09.840
<v Speaker 1>unsupervised and unguided learning, right, those are the ones that

0:41:09.880 --> 0:41:12.520
<v Speaker 1>are going to be more prone to these issues. If

0:41:12.520 --> 0:41:17.640
<v Speaker 1>we're talking about supervised and guided learning, where engineers are

0:41:18.239 --> 0:41:20.880
<v Speaker 1>being very careful with the data being fed to a system,

0:41:21.400 --> 0:41:25.560
<v Speaker 1>it's less likely to happen. But the whole promise, or

0:41:26.200 --> 0:41:29.080
<v Speaker 1>at least the you know, not the promise of the

0:41:29.080 --> 0:41:31.120
<v Speaker 1>technology itself, but the promise of the people who are

0:41:31.520 --> 0:41:34.680
<v Speaker 1>funding it, is that this technology is going to reach

0:41:34.680 --> 0:41:38.160
<v Speaker 1>a point where it's able to learn on its own

0:41:38.239 --> 0:41:41.640
<v Speaker 1>and be able to do things better than people can do,

0:41:41.760 --> 0:41:44.000
<v Speaker 1>to free us up to doing, you know, stuff we

0:41:44.040 --> 0:41:45.799
<v Speaker 1>want to do instead of stuff we have to do.

0:41:46.560 --> 0:41:49.919
<v Speaker 1>That's like the science fiction dream version of AI. As

0:41:49.960 --> 0:41:53.759
<v Speaker 1>we all know, getting there is much more painful. It's

0:41:53.800 --> 0:41:57.600
<v Speaker 1>not like a simple process of Hey, we've made everything

0:41:57.920 --> 0:42:00.759
<v Speaker 1>easy to do now and you don't have to work

0:42:00.800 --> 0:42:03.160
<v Speaker 1>all day. You can enjoy your life and pursue your

0:42:03.239 --> 0:42:07.279
<v Speaker 1>dreams and develop your hobbies and your interests, and you

0:42:07.320 --> 0:42:10.880
<v Speaker 1>can have fulfillment and somehow money isn't important anymore. Like

0:42:10.960 --> 0:42:13.000
<v Speaker 1>that seems to be the Star Trek version of the

0:42:13.000 --> 0:42:15.040
<v Speaker 1>future that people want it to go in. But as

0:42:15.040 --> 0:42:17.839
<v Speaker 1>we have seen, the process of getting there is way

0:42:17.840 --> 0:42:20.719
<v Speaker 1>more painful. As you know, people face a reality of

0:42:21.320 --> 0:42:25.800
<v Speaker 1>potentially being out of work because of AI, or maybe

0:42:25.840 --> 0:42:30.000
<v Speaker 1>being paid way less to do work because the AI

0:42:30.160 --> 0:42:33.319
<v Speaker 1>is doing most of it. These are not that's not

0:42:33.400 --> 0:42:36.800
<v Speaker 1>Star Trek future. That's getting like into Blade Runner future,

0:42:37.239 --> 0:42:41.520
<v Speaker 1>So we don't want that one. By the way, the

0:42:41.600 --> 0:42:44.120
<v Speaker 1>tears in the rain speech is fantastic, but you do

0:42:44.120 --> 0:42:46.200
<v Speaker 1>not want to live in the Blade Runner world. Trust me.

0:42:47.440 --> 0:42:48.799
<v Speaker 1>You might not want to live in the Star Trek

0:42:48.800 --> 0:42:51.680
<v Speaker 1>world either, because those outfits don't look that comfortable anyway.

0:42:52.520 --> 0:42:57.080
<v Speaker 1>That's my little discussion about AI, machine learning, and catastrophic

0:42:57.120 --> 0:43:02.120
<v Speaker 1>forgetting and catastrophic remembering. This is just one of

0:43:02.200 --> 0:43:05.439
<v Speaker 1>the challenges associated with AI and machine learning. I don't

0:43:05.520 --> 0:43:08.160
<v Speaker 1>mean to suggest it's the one and only, or even

0:43:08.160 --> 0:43:11.239
<v Speaker 1>that it's the most important one, but it is one

0:43:11.280 --> 0:43:13.560
<v Speaker 1>that I had not really heard of until I listened

0:43:13.560 --> 0:43:16.480
<v Speaker 1>to that Skeptics' Guide to the Universe episode over the weekend,

0:43:17.040 --> 0:43:20.720
<v Speaker 1>and it was really interesting to dive into the material

0:43:20.800 --> 0:43:22.600
<v Speaker 1>and read up about it and to get a better

0:43:22.680 --> 0:43:25.960
<v Speaker 1>understanding of what it means and how it works. I

0:43:26.000 --> 0:43:28.600
<v Speaker 1>hope you liked that episode from last year, twenty twenty three,

0:43:28.719 --> 0:43:32.600
<v Speaker 1>Machine Learning and Catastrophic Forgetting. I am working on other

0:43:32.640 --> 0:43:35.440
<v Speaker 1>episodes that relate to AI. I also want to do

0:43:35.480 --> 0:43:40.000
<v Speaker 1>an episode about companies that claim to be part of

0:43:40.040 --> 0:43:43.439
<v Speaker 1>the artificial intelligence space but in fact use little if

0:43:43.560 --> 0:43:47.919
<v Speaker 1>any AI technology, because that has become a thing. As

0:43:47.920 --> 0:43:51.680
<v Speaker 1>we all know, when there is the combination of huge

0:43:51.680 --> 0:43:55.640
<v Speaker 1>amounts of money and low amounts of understanding, you have

0:43:56.120 --> 0:44:00.399
<v Speaker 1>the perfect breeding ground for scams and con artists and

0:44:00.400 --> 0:44:03.640
<v Speaker 1>that kind of thing. So I do plan on doing

0:44:03.680 --> 0:44:08.560
<v Speaker 1>an episode about various startups that claim at some level

0:44:08.680 --> 0:44:12.680
<v Speaker 1>to be part of artificial intelligence, but when you really

0:44:12.880 --> 0:44:16.160
<v Speaker 1>start to examine them, have little to no connection to

0:44:16.200 --> 0:44:18.879
<v Speaker 1>that world. So be on the lookout for that. It's

0:44:19.040 --> 0:44:21.040
<v Speaker 1>going to take me some time to do some research

0:44:21.040 --> 0:44:23.839
<v Speaker 1>because there's lots of different sources to go through on

0:44:23.880 --> 0:44:26.840
<v Speaker 1>that one. But that's what I'm working on for probably

0:44:26.920 --> 0:44:29.920
<v Speaker 1>next week. I'm hoping next week. In the meantime, for

0:44:30.040 --> 0:44:32.359
<v Speaker 1>those of you here in the United States, I hope

0:44:32.360 --> 0:44:36.680
<v Speaker 1>you have a safe Fourth of July celebration. Make sure

0:44:37.000 --> 0:44:39.839
<v Speaker 1>that you spend time with friends and loved ones, and

0:44:40.280 --> 0:44:42.440
<v Speaker 1>you know, be very careful if you're going to be

0:44:42.480 --> 0:44:45.840
<v Speaker 1>around fireworks. Those things are very dangerous. For everyone else

0:44:45.880 --> 0:44:48.520
<v Speaker 1>out there who's not celebrating a holiday on the Fourth of July,

0:44:48.719 --> 0:44:53.360
<v Speaker 1>I hope you have an excellent Fourth of July wherever

0:44:53.400 --> 0:44:56.759
<v Speaker 1>you are, and that whatever you enjoy doing, you get

0:44:56.760 --> 0:44:59.520
<v Speaker 1>to do a lot of it on the Fourth of July.

0:45:00.000 --> 0:45:02.080
<v Speaker 1>As long as it's you know, not hurting yourself or

0:45:02.120 --> 0:45:05.360
<v Speaker 1>other people. That's it for me. I will talk to

0:45:05.400 --> 0:45:16.120
<v Speaker 1>you again really soon. Tech Stuff is an iHeartRadio production.

0:45:16.440 --> 0:45:21.480
<v Speaker 1>For more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts,

0:45:21.600 --> 0:45:23.600
<v Speaker 1>or wherever you listen to your favorite shows.