WEBVTT - What is Apple's Neural Engine?

0:00:04.120 --> 0:00:07.160
<v Speaker 1>Get in touch with technology with TechStuff from How

0:00:07.200 --> 0:00:13.880
<v Speaker 1>Stuff Works dot com. Hey there, and welcome to TechStuff.

0:00:13.960 --> 0:00:16.680
<v Speaker 1>My name is Jonathan Strickland. I happen to be the

0:00:16.680 --> 0:00:19.239
<v Speaker 1>host of this show. I'm also an executive producer at

0:00:19.239 --> 0:00:23.000
<v Speaker 1>How Stuff Works. And hey, I love all things tech,

0:00:23.480 --> 0:00:26.960
<v Speaker 1>and today we're doing a little listener mail request. Dan

0:00:27.240 --> 0:00:29.600
<v Speaker 1>wrote in and asked if I might do an episode

0:00:29.840 --> 0:00:34.800
<v Speaker 1>about Apple's so-called Neural Engine in its more recent iPhones.

0:00:35.200 --> 0:00:38.800
<v Speaker 1>So today we are going to learn what a neural

0:00:38.920 --> 0:00:42.360
<v Speaker 1>engine is and what it does. And if you guys,

0:00:42.479 --> 0:00:45.440
<v Speaker 1>by the way, have any requests for topics you've always thought, Hey,

0:00:45.479 --> 0:00:49.600
<v Speaker 1>I want to have an episode about this particular tech topic. Remember,

0:00:49.640 --> 0:00:52.040
<v Speaker 1>you can send those to me by sending an email

0:00:52.080 --> 0:00:54.920
<v Speaker 1>to TechStuff at How Stuff Works dot com. And

0:00:54.960 --> 0:00:58.920
<v Speaker 1>now let's talk about this neural engine. Well, the general

0:00:58.960 --> 0:01:04.240
<v Speaker 1>public first heard about this topic back in September two

0:01:04.240 --> 0:01:08.640
<v Speaker 1>thousand seventeen, when Apple CEO Tim Cook presented at what

0:01:08.760 --> 0:01:12.000
<v Speaker 1>has become an annual tradition for Apple at around that

0:01:12.080 --> 0:01:16.160
<v Speaker 1>time of year, pretty much every September is when Apple

0:01:16.200 --> 0:01:18.800
<v Speaker 1>will come out and unveil the latest in its line

0:01:18.840 --> 0:01:23.880
<v Speaker 1>of iPhone smartphones, and in two thousand seventeen that would have been the

0:01:24.080 --> 0:01:28.120
<v Speaker 1>iconic iPhone X, the tenth anniversary edition of the iPhone,

0:01:28.160 --> 0:01:32.440
<v Speaker 1>also the one that's been discontinued now. Cook listed off

0:01:32.480 --> 0:01:35.399
<v Speaker 1>a lot of features when he went to that presentation,

0:01:35.440 --> 0:01:38.360
<v Speaker 1>but the one we're really interested in today is part

0:01:38.400 --> 0:01:42.840
<v Speaker 1>of the phone's A eleven microprocessor, also called the

0:01:42.920 --> 0:01:47.680
<v Speaker 1>A eleven Bionic CPU. The most recent iPhones as of

0:01:47.680 --> 0:01:51.720
<v Speaker 1>the recording of this podcast now have the next generation

0:01:52.040 --> 0:01:55.520
<v Speaker 1>of that chip, the A twelve, But in both cases,

0:01:55.720 --> 0:01:58.280
<v Speaker 1>the neural engine is one of the elements that gets

0:01:58.320 --> 0:02:00.800
<v Speaker 1>a lot of coverage. So let's go to the A

0:02:00.880 --> 0:02:03.000
<v Speaker 1>eleven since that was the first one to have it.

0:02:03.000 --> 0:02:07.520
<v Speaker 1>It's more than just a CPU. It's technically a system

0:02:07.600 --> 0:02:11.640
<v Speaker 1>on a chip, or SoC. It's an

0:02:11.840 --> 0:02:15.320
<v Speaker 1>ARM sixty four bit chip. But that doesn't really tell

0:02:15.360 --> 0:02:18.160
<v Speaker 1>you anything if you're not, you know, deep into the

0:02:18.200 --> 0:02:21.120
<v Speaker 1>world of microprocessors. So what does that actually mean? Well,

0:02:21.360 --> 0:02:24.600
<v Speaker 1>the ARM based part means that it's based on

0:02:24.800 --> 0:02:29.519
<v Speaker 1>the ARM microarchitecture in chip design. So for our

0:02:29.560 --> 0:02:33.760
<v Speaker 1>purposes we can simplify this to say, the chip's components,

0:02:33.880 --> 0:02:38.240
<v Speaker 1>the stuff that's actually on the microprocessor are laid out

0:02:38.480 --> 0:02:42.000
<v Speaker 1>in a way that was developed by ARM Holdings, that's

0:02:42.000 --> 0:02:47.040
<v Speaker 1>the company behind ARM processors. Now that is different from

0:02:47.040 --> 0:02:49.560
<v Speaker 1>the layout you would find in a chip that was

0:02:49.639 --> 0:02:54.119
<v Speaker 1>made by Intel, for example. So the architecture part literally

0:02:54.200 --> 0:02:58.400
<v Speaker 1>refers to the layout of components in the microprocessor and

0:02:58.440 --> 0:03:02.519
<v Speaker 1>how they interact with each other. And generally speaking, companies

0:03:02.560 --> 0:03:08.080
<v Speaker 1>that make microprocessors develop an architecture. They do so in

0:03:08.120 --> 0:03:11.240
<v Speaker 1>a way that is supposed to maximize the efficiency of

0:03:11.320 --> 0:03:13.320
<v Speaker 1>the chips. So if you get the most power for

0:03:13.400 --> 0:03:17.679
<v Speaker 1>the least amount of energy input you can with the

0:03:17.760 --> 0:03:19.600
<v Speaker 1>least amount of waste, really is the best way of

0:03:19.600 --> 0:03:21.440
<v Speaker 1>putting it. You don't want to waste too much and

0:03:22.000 --> 0:03:25.680
<v Speaker 1>produce too much heat. And then you typically would then

0:03:25.880 --> 0:03:29.880
<v Speaker 1>reduce the size of the various components. And then after

0:03:29.919 --> 0:03:32.040
<v Speaker 1>you reduce the size of the components, you might figure

0:03:32.040 --> 0:03:35.720
<v Speaker 1>out a new architecture that makes better use of these

0:03:36.080 --> 0:03:39.720
<v Speaker 1>smaller components. And this process goes on and on. Intel

0:03:39.760 --> 0:03:44.320
<v Speaker 1>calls this the tick-tock methodology. So that's what the

0:03:44.400 --> 0:03:47.720
<v Speaker 1>ARM based part means. It's from this particular company following

0:03:47.720 --> 0:03:51.440
<v Speaker 1>this particular layout. As for that sixty four bit part,

0:03:51.520 --> 0:03:54.360
<v Speaker 1>what does that mean, Well, that refers to the data

0:03:54.480 --> 0:03:59.840
<v Speaker 1>width of the arithmetic logic unit, or ALU. The

0:04:00.040 --> 0:04:02.680
<v Speaker 1>ALU is the part of the processor that actually carries out

0:04:03.120 --> 0:04:08.640
<v Speaker 1>those operations on data from computer instructions. So data width

0:04:08.840 --> 0:04:12.160
<v Speaker 1>essentially tells you how much information the ALU

0:04:12.360 --> 0:04:16.919
<v Speaker 1>can accept or handle at a given time, and it

0:04:16.960 --> 0:04:20.240
<v Speaker 1>tells you this in bits. Now, a bit, just to

0:04:20.279 --> 0:04:24.760
<v Speaker 1>remind you, is a single unit of computational information, and

0:04:24.800 --> 0:04:29.560
<v Speaker 1>it is binary, meaning it has two states, which we designate

0:04:29.640 --> 0:04:33.799
<v Speaker 1>as being either a zero or a one. Some people

0:04:33.839 --> 0:04:38.120
<v Speaker 1>say off and on, or false and true, but it's zero

0:04:38.279 --> 0:04:42.200
<v Speaker 1>and one. The number of bits tells you how big

0:04:42.640 --> 0:04:46.919
<v Speaker 1>these actual numbers can get before the ALU

0:04:47.120 --> 0:04:49.960
<v Speaker 1>can't handle them anymore. So let's say you have an

0:04:49.960 --> 0:04:52.320
<v Speaker 1>eight bit chip, because that's a lot easier to talk about.

0:04:53.000 --> 0:04:56.360
<v Speaker 1>You would be able to add, subtract, multiply, divide, you know,

0:04:56.400 --> 0:05:02.320
<v Speaker 1>the basic arithmetic and logical operations, on eight bit numbers

0:05:02.480 --> 0:05:05.800
<v Speaker 1>with an eight bit chip. Now, a single bit is

0:05:05.800 --> 0:05:08.960
<v Speaker 1>a zero or a one, and an eight bit number you

0:05:09.000 --> 0:05:14.000
<v Speaker 1>can represent as a string of eight numbers, either

0:05:14.120 --> 0:05:17.359
<v Speaker 1>zeros or ones, So you could have eight zeros in

0:05:17.360 --> 0:05:20.200
<v Speaker 1>a row, up to eight ones in a row and

0:05:20.240 --> 0:05:23.200
<v Speaker 1>everything in between. So it could be seven zeros and

0:05:23.240 --> 0:05:25.680
<v Speaker 1>then a one or it could be six zeros and

0:05:25.720 --> 0:05:28.159
<v Speaker 1>then a one and then another zero. You get the point.

0:05:28.880 --> 0:05:32.320
<v Speaker 1>With that many combinations, that means you would be able

0:05:32.400 --> 0:05:36.680
<v Speaker 1>to go from the typical numbers of zero to two

0:05:36.760 --> 0:05:41.120
<v Speaker 1>hundred fifty five. That's with eight bit. However, we're not

0:05:41.200 --> 0:05:44.840
<v Speaker 1>talking eight bit. We're talking about a sixty four bit chip.

0:05:44.839 --> 0:05:48.960
<v Speaker 1>So now you have sixty four digits in a row

0:05:49.000 --> 0:05:52.240
<v Speaker 1>that can be either a zero or one. That provides

0:05:52.279 --> 0:05:57.640
<v Speaker 1>you a lot more combinations, which means you could range

0:05:57.720 --> 0:06:04.040
<v Speaker 1>in number from zero to nine quintillion, two hundred

0:06:04.320 --> 0:06:10.160
<v Speaker 1>twenty three quadrillion, three hundred seventy two trillion, thirty six billion,

0:06:10.360 --> 0:06:15.000
<v Speaker 1>eight hundred fifty four million, seven hundred seventy five thousand,

0:06:15.080 --> 0:06:19.880
<v Speaker 1>eight hundred seven. That's a pretty big range. It can

0:06:19.920 --> 0:06:23.760
<v Speaker 1>handle way way larger numbers than an eight bit chip.
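
NOTE
The ranges quoted here can be checked with a short Python sketch. The function names are illustrative and not from the episode. One thing it shows: nine quintillion, two hundred twenty three quadrillion and so on is 2**63 - 1, the largest value of a signed sixty four bit integer. Counting up from zero with all sixty four bits (unsigned) reaches 2**64 - 1, roughly eighteen quintillion.
```python
def unsigned_range(bits):
    """Smallest and largest values an unsigned integer of this width can hold."""
    return 0, 2**bits - 1
def signed_range(bits):
    """Smallest and largest values of a two's-complement signed integer of this width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
if __name__ == "__main__":
    # Compare the eight bit and sixty four bit cases discussed in the episode.
    for bits in (8, 64):
        print(bits, "unsigned:", unsigned_range(bits), "signed:", signed_range(bits))
```
Running the script prints both ranges for eight bits and for sixty four bits, so the eight bit case can be compared against the zero to two hundred fifty five figure.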

0:06:24.120 --> 0:06:28.200
<v Speaker 1>So that tells you the type of architecture this chip

0:06:28.279 --> 0:06:30.440
<v Speaker 1>has and the amount of data it can handle at

0:06:30.440 --> 0:06:35.680
<v Speaker 1>a time. The A eleven has six cores, so processors

0:06:35.680 --> 0:06:40.400
<v Speaker 1>with multiple cores can work on parts of a problem simultaneously.

0:06:40.520 --> 0:06:43.200
<v Speaker 1>If you have something that's called a parallel problem, you

0:06:43.240 --> 0:06:45.839
<v Speaker 1>can divide that problem up into different segments and have

0:06:45.960 --> 0:06:49.440
<v Speaker 1>different cores tackle it. Two of those six cores are

0:06:49.480 --> 0:06:53.240
<v Speaker 1>what Apple calls high performance cores. They have a clock

0:06:53.320 --> 0:06:57.160
<v Speaker 1>speed of two point three nine gigahertz

0:06:57.200 --> 0:06:59.640
<v Speaker 1>in the A eleven. So the clock speed tells you

0:06:59.680 --> 0:07:03.360
<v Speaker 1>how many clock cycles a CPU can perform per second.

0:07:03.760 --> 0:07:06.920
<v Speaker 1>Two point three nine gigahertz means that these cores

0:07:06.960 --> 0:07:11.480
<v Speaker 1>can each perform two point three nine billion clock cycles

0:07:11.480 --> 0:07:16.320
<v Speaker 1>per second. Now, clock cycles do not easily translate over

0:07:16.440 --> 0:07:21.000
<v Speaker 1>into actions. It's not necessarily one clock cycle per action.

0:07:21.400 --> 0:07:24.960
<v Speaker 1>But generally these numbers tell you how much a core

0:07:26.080 --> 0:07:28.840
<v Speaker 1>of the processor is able to handle per second, how

0:07:28.880 --> 0:07:32.280
<v Speaker 1>many tasks it can do per second, assuming a certain

0:07:32.320 --> 0:07:36.640
<v Speaker 1>number of clock cycles per task. Now, these two cores

0:07:36.680 --> 0:07:41.080
<v Speaker 1>are referred to as Monsoon. The other four cores are

0:07:41.200 --> 0:07:44.440
<v Speaker 1>what Apple refers to as energy efficient cores. They are

0:07:44.480 --> 0:07:47.440
<v Speaker 1>not at that same high clock speed. They are meant

0:07:47.480 --> 0:07:52.440
<v Speaker 1>to handle more routine tasks. They are called Mistral. So

0:07:52.520 --> 0:07:57.760
<v Speaker 1>you have Monsoon and Mistral, two Monsoon cores, four Mistral cores.

0:07:57.800 --> 0:08:00.360
<v Speaker 1>But the A eleven is not just a CPU. It also

0:08:00.440 --> 0:08:05.760
<v Speaker 1>has a three core graphics processing unit or GPU incorporated

0:08:05.840 --> 0:08:08.600
<v Speaker 1>into this chip. And then there are the two processing

0:08:08.640 --> 0:08:14.119
<v Speaker 1>cores dedicated specifically to handling tasks related to machine learning algorithms.

0:08:14.520 --> 0:08:18.400
<v Speaker 1>This pair of processors are the neural engine. They are

0:08:18.640 --> 0:08:23.720
<v Speaker 1>essentially an artificial neural network. And I've talked a little

0:08:23.760 --> 0:08:27.440
<v Speaker 1>bit about artificial neural networks before, but we're really going

0:08:27.480 --> 0:08:29.400
<v Speaker 1>to try and get an understanding of what makes them

0:08:29.400 --> 0:08:34.160
<v Speaker 1>special today, because that's really why neural engine means anything

0:08:34.160 --> 0:08:37.160
<v Speaker 1>in the first place. So this means we get to

0:08:37.160 --> 0:08:40.080
<v Speaker 1>do a quick history lesson because this is TechStuff,

0:08:40.160 --> 0:08:42.960
<v Speaker 1>and of course we have to go into the history.

0:08:43.040 --> 0:08:47.559
<v Speaker 1>So here we go back in the nineteen forties and

0:08:47.640 --> 0:08:51.800
<v Speaker 1>the nineteen fifties, there were some smart guys named Warren

0:08:51.840 --> 0:08:56.040
<v Speaker 1>McCulloch, who was a neurophysiologist, and another guy named Walter

0:08:56.200 --> 0:08:58.960
<v Speaker 1>Pitts who was a computer scientist and a logician, and

0:08:59.000 --> 0:09:04.560
<v Speaker 1>they began developing theories that brought together computational science and neuroscience,

0:09:04.600 --> 0:09:08.200
<v Speaker 1>in other words, the way machines process information and the

0:09:08.240 --> 0:09:13.599
<v Speaker 1>way brains process information, which is different. McCulloch wrote a

0:09:13.640 --> 0:09:16.679
<v Speaker 1>couple of papers about this, and he asserted that the

0:09:16.720 --> 0:09:19.840
<v Speaker 1>basic unit of logic in the brain is the neuron.

0:09:20.320 --> 0:09:24.400
<v Speaker 1>So the nerve cell, the brain cell, is your

0:09:24.440 --> 0:09:28.040
<v Speaker 1>basic unit of logic in a brain, so it would

0:09:28.040 --> 0:09:30.840
<v Speaker 1>act kind of like a gate or a transistor in

0:09:30.880 --> 0:09:35.000
<v Speaker 1>a circuit. And so you might have a transistor being

0:09:35.040 --> 0:09:40.280
<v Speaker 1>the smallest unit, not metric, of logic, but the

0:09:40.320 --> 0:09:43.640
<v Speaker 1>smallest unit to allow this to happen in a circuit

0:09:44.080 --> 0:09:47.920
<v Speaker 1>and neurons in the brain. Pitts and McCulloch began developing computer

0:09:48.000 --> 0:09:51.280
<v Speaker 1>algorithms that attempted to guide machines to process information in

0:09:51.280 --> 0:09:54.320
<v Speaker 1>a way that was at least conceptually similar to the

0:09:54.320 --> 0:09:57.640
<v Speaker 1>way our brains process information. McCulloch had proposed that by

0:09:57.640 --> 0:10:00.400
<v Speaker 1>doing this, you could train a machine to recognize

0:10:00.520 --> 0:10:06.000
<v Speaker 1>handwritten characters like numbers or letters, even if those representations

0:10:06.280 --> 0:10:09.520
<v Speaker 1>varied in size or style. And I've talked about this

0:10:09.600 --> 0:10:13.480
<v Speaker 1>being a challenge in the past as well, that training

0:10:13.559 --> 0:10:18.760
<v Speaker 1>a computer to recognize a specific type of image or

0:10:18.800 --> 0:10:22.800
<v Speaker 1>a specific thing in an image is challenging. So I

0:10:22.800 --> 0:10:25.200
<v Speaker 1>always use coffee mugs as an example. I don't know why,

0:10:25.400 --> 0:10:27.800
<v Speaker 1>but I like that particular one. So we're gonna

0:10:27.840 --> 0:10:30.880
<v Speaker 1>go with it again. If you were to create a

0:10:30.920 --> 0:10:34.200
<v Speaker 1>computer program where you feed an image of a coffee

0:10:34.280 --> 0:10:36.680
<v Speaker 1>mug to the computer program, and you tell the computer

0:10:36.760 --> 0:10:42.679
<v Speaker 1>program this image corresponds with this concept called coffee mug.

0:10:43.240 --> 0:10:47.080
<v Speaker 1>And the image shows a blue coffee mug and its

0:10:47.120 --> 0:10:50.280
<v Speaker 1>handle is pointed toward the right of the perspective of

0:10:50.320 --> 0:10:54.280
<v Speaker 1>the viewer. And then you were to feed a different image,

0:10:54.360 --> 0:10:56.360
<v Speaker 1>maybe of that same coffee mug, but now at a

0:10:56.360 --> 0:11:01.040
<v Speaker 1>different angle. Well, the machine is looking at this as

0:11:01.080 --> 0:11:05.360
<v Speaker 1>if it's a totally new thing. It cannot just

0:11:05.559 --> 0:11:08.120
<v Speaker 1>extrapolate that information and say, oh, this is also a

0:11:08.120 --> 0:11:11.120
<v Speaker 1>coffee mug, or maybe it's a different coffee mug. It's

0:11:11.120 --> 0:11:13.800
<v Speaker 1>a different color or a different size or different shape.

0:11:14.840 --> 0:11:18.920
<v Speaker 1>The computer doesn't understand the concept of coffee mug. So

0:11:18.960 --> 0:11:21.720
<v Speaker 1>how can you teach it this concept? How can you

0:11:21.760 --> 0:11:25.480
<v Speaker 1>train it so it recognizes coffee mugs? That was what

0:11:25.600 --> 0:11:29.400
<v Speaker 1>McCulloch was looking at. Then you have another guy who

0:11:29.400 --> 0:11:33.480
<v Speaker 1>came along, Frank Rosenblatt, very smart man, who built on

0:11:33.520 --> 0:11:37.760
<v Speaker 1>this work. He developed an artificial neuron called the perceptron. Now,

0:11:37.760 --> 0:11:41.160
<v Speaker 1>a perceptron's job is, from a very high level, pretty simple.

0:11:41.280 --> 0:11:46.400
<v Speaker 1>It accepts multiple binary inputs. So it accepts inputs that

0:11:46.440 --> 0:11:49.920
<v Speaker 1>are either zeros or ones, and then it produces a

0:11:50.000 --> 0:11:54.440
<v Speaker 1>single binary output either a zero or a one based

0:11:54.520 --> 0:11:58.199
<v Speaker 1>upon processing that information. So let's say you want to

0:11:58.240 --> 0:12:01.199
<v Speaker 1>create a program that can help you decide which restaurant

0:12:01.320 --> 0:12:03.920
<v Speaker 1>you want to go to, and you've come up with

0:12:04.040 --> 0:12:07.560
<v Speaker 1>three criteria that you think are really important in order

0:12:07.559 --> 0:12:10.560
<v Speaker 1>for you to make this decision. And the three criteria

0:12:10.760 --> 0:12:14.160
<v Speaker 1>you have are is the restaurant within a twenty minute

0:12:14.240 --> 0:12:19.439
<v Speaker 1>drive or less? So, is it relatively close? Will a

0:12:19.440 --> 0:12:23.080
<v Speaker 1>meal cost less than fifty dollars for two people to

0:12:23.160 --> 0:12:27.360
<v Speaker 1>have dinner there? And does the restaurant serve tacos? Those

0:12:27.400 --> 0:12:30.200
<v Speaker 1>are your three points of criteria, and you can represent

0:12:30.280 --> 0:12:33.800
<v Speaker 1>each of those variables with a binary figure. So, for example,

0:12:34.360 --> 0:12:37.880
<v Speaker 1>you could say that if the restaurant is closer than

0:12:37.920 --> 0:12:41.160
<v Speaker 1>a twenty minute drive, if it is nearby, you represent

0:12:41.240 --> 0:12:44.120
<v Speaker 1>that variable with a one. If it is further away

0:12:44.160 --> 0:12:47.440
<v Speaker 1>than that, it's a zero. If the dinner for two

0:12:47.520 --> 0:12:50.439
<v Speaker 1>is cheaper than fifty dollars, that's a one. If it's

0:12:50.440 --> 0:12:54.920
<v Speaker 1>more expensive, it's a zero. And if it serves tacos,

0:12:54.920 --> 0:12:57.840
<v Speaker 1>it's a one. And if it does not serve tacos,

0:12:57.960 --> 0:13:00.680
<v Speaker 1>it's a big fat zero. Then you have a list

0:13:00.720 --> 0:13:04.319
<v Speaker 1>of various restaurants. You could feed each restaurant through your

0:13:04.320 --> 0:13:07.480
<v Speaker 1>criteria and see how they do. And then you

0:13:07.480 --> 0:13:10.000
<v Speaker 1>could narrow your choices this way, and perhaps there is

0:13:10.040 --> 0:13:13.280
<v Speaker 1>no single restaurant that meets all those criteria, so you

0:13:13.320 --> 0:13:17.640
<v Speaker 1>really should take another step. And that's where Rosenblatt introduces

0:13:17.679 --> 0:13:23.679
<v Speaker 1>the concept of weights, where you change how important

0:13:23.720 --> 0:13:26.400
<v Speaker 1>each of the criteria are in relation to each other.

0:13:26.480 --> 0:13:31.160
<v Speaker 1>Weights are real numbers that indicate the importance of a particular criterion.

0:13:31.720 --> 0:13:36.520
<v Speaker 1>So you want, let's say all those three criteria you've identified,

0:13:36.720 --> 0:13:39.400
<v Speaker 1>the distance, the cost, and whether or not they have tacos.

0:13:39.880 --> 0:13:43.560
<v Speaker 1>You have decided the most critical piece of information is

0:13:43.600 --> 0:13:47.360
<v Speaker 1>whether or not the restaurant serves tacos. So you could

0:13:47.400 --> 0:13:52.000
<v Speaker 1>then assign a greater weight to that criterion, saying this

0:13:52.080 --> 0:13:54.679
<v Speaker 1>is more important to me, and that will influence the

0:13:54.720 --> 0:13:58.440
<v Speaker 1>output of the neuron. You must also determine a threshold

0:13:58.520 --> 0:14:02.160
<v Speaker 1>value for the decision. In other words, you say, in

0:14:02.280 --> 0:14:05.920
<v Speaker 1>order to produce a positive result to tell me, yes,

0:14:05.960 --> 0:14:08.640
<v Speaker 1>this is a restaurant you should go to, you must

0:14:08.720 --> 0:14:13.240
<v Speaker 1>at least meet this threshold. That's the minimum value the

0:14:13.280 --> 0:14:15.880
<v Speaker 1>calculation has to meet or exceed in order to produce

0:14:15.960 --> 0:14:19.280
<v Speaker 1>a go to this restaurant result. I'll explain a bit

0:14:19.280 --> 0:14:21.720
<v Speaker 1>more about this in just a second, But first I'm

0:14:21.720 --> 0:14:24.440
<v Speaker 1>going to take a quick break and thank our sponsors.

0:14:32.320 --> 0:14:35.240
<v Speaker 1>That threshold value that I mentioned before the break is

0:14:35.280 --> 0:14:38.360
<v Speaker 1>really important because it tells your model what sort of

0:14:38.400 --> 0:14:42.920
<v Speaker 1>results count as valid versus not valid. So let's say

0:14:43.040 --> 0:14:45.920
<v Speaker 1>I've weighted the criteria so that the distance to the

0:14:46.000 --> 0:14:49.040
<v Speaker 1>restaurant and the expense of the meal each have a

0:14:49.120 --> 0:14:53.120
<v Speaker 1>weight of two, but the presence of tacos is a six.

0:14:53.680 --> 0:14:56.240
<v Speaker 1>That's how important I think tacos are. And I've set

0:14:56.240 --> 0:14:59.240
<v Speaker 1>a threshold of four. Well, that means that if the

0:14:59.280 --> 0:15:04.240
<v Speaker 1>restaurant is relatively close and it's relatively inexpensive, it's going

0:15:04.280 --> 0:15:06.400
<v Speaker 1>to pass my criteria because I've given a weight of

0:15:06.440 --> 0:15:09.560
<v Speaker 1>two for both of those and added together that's four.

0:15:09.640 --> 0:15:12.320
<v Speaker 1>It equals the threshold. Good to go. But even if

0:15:12.360 --> 0:15:16.040
<v Speaker 1>the restaurant is far away and even if it's expensive,

0:15:16.720 --> 0:15:20.520
<v Speaker 1>if it serves tacos, it still passes my criteria because

0:15:20.520 --> 0:15:23.760
<v Speaker 1>I gave the tacos a weight of six. Raising the

0:15:23.800 --> 0:15:29.160
<v Speaker 1>threshold value reduces the number of valid restaurants. So if

0:15:29.200 --> 0:15:32.920
<v Speaker 1>I make the threshold eight instead of four, now the

0:15:33.080 --> 0:15:36.160
<v Speaker 1>only way I can get a valid result a result

0:15:36.200 --> 0:15:39.240
<v Speaker 1>of yes, go to this restaurant is if the restaurant

0:15:39.360 --> 0:15:44.760
<v Speaker 1>has tacos and it's either close by, or it's inexpensive,

0:15:45.160 --> 0:15:47.800
<v Speaker 1>or both. And if I set the threshold at ten,

0:15:48.680 --> 0:15:51.840
<v Speaker 1>all three criteria would need to be met for this

0:15:51.880 --> 0:15:55.720
<v Speaker 1>option to be valid. Now, in artificial intelligence, for the

0:15:55.760 --> 0:15:59.200
<v Speaker 1>purposes of notation, many people will move the threshold value

0:15:59.440 --> 0:16:01.880
<v Speaker 1>to the other side of the equation, and in this

0:16:01.920 --> 0:16:04.760
<v Speaker 1>case we now call it a bias, and a bias

0:16:04.880 --> 0:16:07.360
<v Speaker 1>essentially is a measurement to tell you how easy or

0:16:07.440 --> 0:16:10.640
<v Speaker 1>difficult it is to get the perceptron to fire off

0:16:10.720 --> 0:16:14.520
<v Speaker 1>a positive value. If you have a big positive bias,

0:16:14.640 --> 0:16:17.640
<v Speaker 1>that means it's easier for the perceptron to produce a

0:16:17.680 --> 0:16:22.400
<v Speaker 1>positive result a one. A large negative bias does the opposite,

0:16:22.840 --> 0:16:25.400
<v Speaker 1>and thus you would get a zero. So we can

0:16:25.480 --> 0:16:29.480
<v Speaker 1>write out the perceptron's rules like this. Take the value

0:16:29.680 --> 0:16:32.440
<v Speaker 1>of a variable which is either going to be a

0:16:32.520 --> 0:16:36.120
<v Speaker 1>zero or a one. It will be binary. You multiply

0:16:36.440 --> 0:16:40.680
<v Speaker 1>the value of this variable by the weight of that variable,

0:16:41.280 --> 0:16:47.800
<v Speaker 1>and weights can be different values. Let's say that the

0:16:49.000 --> 0:16:52.000
<v Speaker 1>distance and expense are both weighted at two. Tacos gets

0:16:52.000 --> 0:16:56.560
<v Speaker 1>a big hefty six. You're going to add your various

0:16:56.760 --> 0:17:00.040
<v Speaker 1>weighted variable results together, and then you add the bias

0:17:00.320 --> 0:17:03.160
<v Speaker 1>for the perceptron. And in our example, the bias

0:17:03.360 --> 0:17:08.000
<v Speaker 1>is a minus six. That's to tell us that in

0:17:08.200 --> 0:17:11.840
<v Speaker 1>order for this perceptron to fire, you

0:17:11.840 --> 0:17:14.240
<v Speaker 1>have to be able to factor in that minus six

0:17:14.359 --> 0:17:17.600
<v Speaker 1>and beat it. So if after adding these elements together,

0:17:18.200 --> 0:17:21.800
<v Speaker 1>you get a result that is zero or lower, the

0:17:21.880 --> 0:17:24.439
<v Speaker 1>output is a zero or a negative, saying, don't go

0:17:24.480 --> 0:17:27.280
<v Speaker 1>to this restaurant. So after adding that negative six, if

0:17:27.320 --> 0:17:29.920
<v Speaker 1>you have a zero or less, you don't go. If

0:17:29.920 --> 0:17:32.000
<v Speaker 1>you get a result that's greater than zero, it's a

0:17:32.040 --> 0:17:34.840
<v Speaker 1>positive result, it says, go to that restaurant. So here

0:17:34.840 --> 0:17:38.000
<v Speaker 1>in our hypothetical perceptron, we've decided on a bias of

0:17:38.000 --> 0:17:40.760
<v Speaker 1>minus six, and we take our three variables as we

0:17:40.840 --> 0:17:44.240
<v Speaker 1>examine a single restaurant. So this restaurant is twenty five

0:17:44.240 --> 0:17:47.800
<v Speaker 1>minutes away. So that means for our first variable, which

0:17:47.880 --> 0:17:51.159
<v Speaker 1>is all about distance, it gets a zero because it

0:17:51.240 --> 0:17:53.919
<v Speaker 1>is further than twenty minutes away. So that variable is

0:17:53.920 --> 0:17:57.160
<v Speaker 1>a zero. And we multiply the variable times the weight.

0:17:57.560 --> 0:18:00.720
<v Speaker 1>The weight is two for that particular variable. Two times

0:18:00.840 --> 0:18:05.040
<v Speaker 1>zero is zero. Then I look and I see that

0:18:05.119 --> 0:18:07.240
<v Speaker 1>dinner for two at that restaurant's gonna set me back

0:18:07.359 --> 0:18:10.640
<v Speaker 1>thirty dollars, but that's below the limit we had set

0:18:10.680 --> 0:18:13.240
<v Speaker 1>of fifty dollars. So that means the value of the

0:18:13.320 --> 0:18:16.160
<v Speaker 1>variable is one. It is cheaper than fifty dollars, so

0:18:16.200 --> 0:18:19.720
<v Speaker 1>that gets a one. The weight for this variable is

0:18:19.760 --> 0:18:22.840
<v Speaker 1>two, so multiply the weight times the variable: two times

0:18:22.880 --> 0:18:26.800
<v Speaker 1>one is two. Then we have the question does the

0:18:26.880 --> 0:18:29.800
<v Speaker 1>restaurant serve tacos? And I know you're dying to know this.

0:18:30.240 --> 0:18:34.560
<v Speaker 1>I'm glad to report the restaurant does in fact serve tacos,

0:18:35.040 --> 0:18:38.080
<v Speaker 1>And that means that the variable is a one. It's positive,

0:18:38.600 --> 0:18:41.159
<v Speaker 1>and we weighted this variable very heavily with a six,

0:18:41.359 --> 0:18:44.800
<v Speaker 1>So six times one is six. Now we have to

0:18:44.840 --> 0:18:49.119
<v Speaker 1>add all of those results together, so we have zero

0:18:49.280 --> 0:18:52.280
<v Speaker 1>from the first one, two from the second one, six

0:18:52.480 --> 0:18:55.200
<v Speaker 1>from the third one. Add that together you get eight.

0:18:55.840 --> 0:18:58.199
<v Speaker 1>Now we have to add in the bias, and the

0:18:58.240 --> 0:19:02.680
<v Speaker 1>bias for this perceptron is a minus six. Eight plus

0:19:02.720 --> 0:19:06.240
<v Speaker 1>minus six gives us a final value of two. Two

0:19:06.359 --> 0:19:09.640
<v Speaker 1>is greater than zero. So by the rules we have established,

0:19:09.680 --> 0:19:13.120
<v Speaker 1>the perceptron says this is a positive result and fires

0:19:13.119 --> 0:19:15.439
<v Speaker 1>off a one. So the restaurant we fed to the

0:19:15.440 --> 0:19:18.800
<v Speaker 1>perceptron met the criteria based on that bias. Now, if

0:19:18.840 --> 0:19:24.520
<v Speaker 1>our bias had been minus ten or minus nine, we

0:19:24.520 --> 0:19:28.159
<v Speaker 1>would have not produced this positive result. We would have gotten

0:19:28.640 --> 0:19:31.199
<v Speaker 1>a zero or negative number and it would have said no.
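
NOTE
The worked example above can be sketched in a few lines of Python. This is a minimal illustration, assuming the rule as stated in the episode (fire a one only when the weighted sum plus the bias is greater than zero). The function and variable names are hypothetical and not from the episode.
```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum of the inputs plus the bias is greater than zero, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0
# Inputs in order: nearby (25 minutes away, so 0), cheap (30 dollars, so 1), tacos (yes, so 1).
restaurant = [0, 1, 1]
# Weights from the episode: distance 2, expense 2, tacos 6.
weights = [2, 2, 6]
if __name__ == "__main__":
    # Weighted sum is 0 + 2 + 6 = 8, so only the bias of minus six leaves a positive total.
    for bias in (-6, -9, -10):
        print("bias", bias, "output", perceptron(restaurant, weights, bias))
```
With a bias of minus six the total is two and the perceptron fires. With minus nine or minus ten the total drops below zero and it does not.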

0:19:31.840 --> 0:19:34.840
<v Speaker 1>So that bias is very important, as is the weight

0:19:35.080 --> 0:19:38.600
<v Speaker 1>of the various variables. And that is one neuron. Now

0:19:38.640 --> 0:19:41.439
<v Speaker 1>you can actually create layers of neurons. That's why we

0:19:41.480 --> 0:19:45.240
<v Speaker 1>call it an artificial neural network, not just an artificial neuron.

0:19:45.920 --> 0:19:49.400
<v Speaker 1>And by doing that you can have results from one

0:19:49.640 --> 0:19:55.480
<v Speaker 1>neuron's decisions feed directly into another neuron. Also, a perceptron

0:19:55.600 --> 0:19:59.560
<v Speaker 1>can perform as a type of logical gate called a

0:19:59.760 --> 0:20:04.359
<v Speaker 1>NAND gate, N-A-N-D. That stands for

0:20:04.800 --> 0:20:08.480
<v Speaker 1>NOT-AND. It's a type of logical gate that can

0:20:08.560 --> 0:20:13.000
<v Speaker 1>produce a false or negative output if all its inputs

0:20:13.200 --> 0:20:16.720
<v Speaker 1>are true or positive. So, in other words, with the

0:20:16.800 --> 0:20:20.840
<v Speaker 1>right weights and biases, a perceptron will produce an output

0:20:20.920 --> 0:20:24.679
<v Speaker 1>of zero if all of its inputs are ones. The

0:20:24.800 --> 0:20:29.240
<v Speaker 1>NAND gate in computer science is a universal gate because

0:20:29.600 --> 0:20:33.520
<v Speaker 1>you can use different creations and combinations of NAND gates

0:20:34.080 --> 0:20:36.840
<v Speaker 1>and build any kind of computation. You just have to

0:20:36.880 --> 0:20:39.280
<v Speaker 1>link them together properly in order to do it. It's

0:20:39.280 --> 0:20:41.679
<v Speaker 1>not always the most efficient way to do this, but

0:20:41.800 --> 0:20:45.600
<v Speaker 1>it does work. So if you had perceptrons that accepted

0:20:45.720 --> 0:20:48.760
<v Speaker 1>two variables, each with a weight of minus two, and

0:20:48.800 --> 0:20:51.680
<v Speaker 1>the perceptron had a bias of three, it would act

0:20:51.760 --> 0:20:55.480
<v Speaker 1>like a NAND gate. That's because if both variables are one,

0:20:56.080 --> 0:20:58.560
<v Speaker 1>then the final equation you'd get to determine the output

0:20:58.560 --> 0:21:02.600
<v Speaker 1>would be minus two because you multiply the weight of

0:21:02.680 --> 0:21:05.560
<v Speaker 1>minus two times the variable of one, and then you

0:21:05.720 --> 0:21:09.040
<v Speaker 1>have to add a second minus two because the second

0:21:09.080 --> 0:21:12.359
<v Speaker 1>variable is the same way. And then you would add

0:21:12.359 --> 0:21:15.080
<v Speaker 1>the bias, which is three. But minus two plus minus

0:21:15.119 --> 0:21:18.600
<v Speaker 1>two is minus four. You add in plus three, you

0:21:18.640 --> 0:21:21.320
<v Speaker 1>get minus one as the result. Minus one is

0:21:21.400 --> 0:21:23.760
<v Speaker 1>less than zero, which means the output for the perceptron

0:21:23.960 --> 0:21:27.000
<v Speaker 1>must be zero as opposed to one. You get a

0:21:27.040 --> 0:21:31.560
<v Speaker 1>false or an off or a zero result. Two positive

0:21:31.560 --> 0:21:34.800
<v Speaker 1>inputs create a negative output. It's one of the few times you

0:21:34.840 --> 0:21:37.639
<v Speaker 1>can say two positives make a negative. Now that means

0:21:37.920 --> 0:21:42.760
<v Speaker 1>we can ask progressively more complicated questions, with each perceptron

0:21:42.840 --> 0:21:46.480
<v Speaker 1>handling one aspect of that question and feeding into another

0:21:46.560 --> 0:21:50.320
<v Speaker 1>layer of perceptrons. Each perceptron will produce either a positive

0:21:50.400 --> 0:21:52.440
<v Speaker 1>or a negative result, so you either get a one

0:21:52.560 --> 0:21:55.680
<v Speaker 1>or a zero, and these results will feed into other

0:21:55.760 --> 0:21:58.400
<v Speaker 1>neurons in the network, which will use them to perform

0:21:58.560 --> 0:22:02.280
<v Speaker 1>their own calculations of their own weights and their own biases.
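
NOTE
Editorial sketch of the NAND perceptron described above (two inputs, each
with a weight of minus two, and a bias of three), plus one way to feed
perceptron outputs into other perceptrons. The four-NAND XOR wiring is a
standard textbook construction and is not taken from the episode; the
function names are assumptions made for illustration.
```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum plus bias is greater than zero, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0
def nand(a, b):
    # Weights of -2 and a bias of 3, as in the episode's example.
    return perceptron([a, b], [-2, -2], 3)
def xor(a, b):
    # Standard four-NAND construction of XOR (an illustration, not from the episode).
    n1 = nand(a, b)
    return nand(nand(a, n1), nand(b, n1))
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "NAND:", nand(a, b), "XOR:", xor(a, b))
```
Each call to nand is one perceptron, so xor is a small network in which the
outputs of earlier perceptrons become the inputs of later ones.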

0:22:02.800 --> 0:22:05.480
<v Speaker 1>All of this is to feed those questions through a

0:22:05.560 --> 0:22:07.840
<v Speaker 1>network to produce a result, and I should be clear

0:22:08.320 --> 0:22:11.000
<v Speaker 1>the weights for each variable along this path can change

0:22:11.400 --> 0:22:13.880
<v Speaker 1>from one part of the decision making process to the next.

0:22:13.920 --> 0:22:18.360
<v Speaker 1>We're not just talking about identical perceptrons all through the network,

0:22:18.760 --> 0:22:21.320
<v Speaker 1>and that last bit is the most important part, because

0:22:21.520 --> 0:22:24.359
<v Speaker 1>if this were just a matter of setting biases and

0:22:24.400 --> 0:22:27.600
<v Speaker 1>weights and building out a network of perceptrons, there'd be

0:22:27.640 --> 0:22:30.879
<v Speaker 1>nothing special about it, because we already have NAND gates.

0:22:31.760 --> 0:22:35.240
<v Speaker 1>They existed before perceptrons. It would just mean that we

0:22:35.320 --> 0:22:39.119
<v Speaker 1>have a different way to implement something we could already do,

0:22:39.440 --> 0:22:41.320
<v Speaker 1>and finding a new way to do something you were

0:22:41.359 --> 0:22:45.480
<v Speaker 1>already doing is rarely super transformative. You might be able

0:22:45.520 --> 0:22:48.760
<v Speaker 1>to make it a better way of doing the same thing,

0:22:48.840 --> 0:22:52.080
<v Speaker 1>but in this case it might be less efficient than

0:22:52.119 --> 0:22:54.680
<v Speaker 1>the old way. However, there is something else that makes

0:22:54.680 --> 0:22:58.600
<v Speaker 1>these perceptrons special, and that's by pairing them with those

0:22:58.640 --> 0:23:02.280
<v Speaker 1>special algorithms that McCulloch and Pitts were proposing back in

0:23:02.320 --> 0:23:06.040
<v Speaker 1>the forties and fifties. These would be learning algorithms. These

0:23:06.040 --> 0:23:10.800
<v Speaker 1>algorithms are instructions that can, based upon external stimuli, dynamically

0:23:10.880 --> 0:23:15.240
<v Speaker 1>and automatically tune the weights and biases of perceptrons in

0:23:15.320 --> 0:23:18.800
<v Speaker 1>a neural network. In other words, a program can guide

0:23:18.960 --> 0:23:22.840
<v Speaker 1>the network so that it learns how to solve problems.

0:23:22.880 --> 0:23:26.639
<v Speaker 1>But how? Well, it all comes down to making small

0:23:26.800 --> 0:23:30.399
<v Speaker 1>changes in those weights and biases in order to fine

0:23:30.440 --> 0:23:33.680
<v Speaker 1>tune outputs. So let's say we're working on an image

0:23:33.720 --> 0:23:37.119
<v Speaker 1>recognition algorithm. That's one of the big things that the

0:23:37.160 --> 0:23:40.879
<v Speaker 1>neural engine in Apple's iPhones does. That's one of

0:23:40.920 --> 0:23:43.920
<v Speaker 1>its main purposes. So in our example, let's say we're

0:23:43.960 --> 0:23:49.479
<v Speaker 1>training the neural network to recognize handwritten printed lowercase letters.

0:23:49.520 --> 0:23:52.240
<v Speaker 1>It's very similar to what McCulloch was talking about. But

0:23:52.359 --> 0:23:55.960
<v Speaker 1>let's say our model is having trouble differentiating a lowercase

0:23:56.280 --> 0:24:00.480
<v Speaker 1>L from a lowercase I. It's just having issues

0:24:00.880 --> 0:24:04.280
<v Speaker 1>being able to tell those two apart in particular. Now

0:24:04.440 --> 0:24:07.320
<v Speaker 1>we've got a specific example in which our model is

0:24:07.400 --> 0:24:10.679
<v Speaker 1>misidentifying an L as an I, let's say, in this

0:24:10.760 --> 0:24:14.199
<v Speaker 1>hypothetical situation, and so we decide we're gonna make some

0:24:14.280 --> 0:24:18.280
<v Speaker 1>minor tweaks in the weights and biases earlier on in

0:24:18.320 --> 0:24:22.920
<v Speaker 1>the artificial neural network to guide our network so that

0:24:22.960 --> 0:24:27.000
<v Speaker 1>it can more readily tell the difference between

0:24:27.000 --> 0:24:29.239
<v Speaker 1>lowercase Ls and lowercase Is. And we get our

0:24:29.280 --> 0:24:31.760
<v Speaker 1>model closer to being able to tell that difference. We

0:24:31.880 --> 0:24:35.400
<v Speaker 1>keep making these small adjustments until we get more consistent output.

0:24:35.840 --> 0:24:38.320
<v Speaker 1>The network as a whole is said to quote unquote

0:24:38.520 --> 0:24:41.639
<v Speaker 1>learn through this process. It's getting better at creating an

0:24:41.680 --> 0:24:45.359
<v Speaker 1>output that's more reflective of reality. But there's a bit of

0:24:45.359 --> 0:24:47.920
<v Speaker 1>a problem, and anyone who has worked in QA has

0:24:47.960 --> 0:24:52.200
<v Speaker 1>probably already spotted what it is. For everybody else, I'm

0:24:52.200 --> 0:24:54.840
<v Speaker 1>gonna explain it in just a minute, but first let's

0:24:54.840 --> 0:25:05.280
<v Speaker 1>take another quick break to thank our sponsor. So what

0:25:05.359 --> 0:25:08.280
<v Speaker 1>was that problem I was talking about before the break? Well,

0:25:08.520 --> 0:25:11.200
<v Speaker 1>if you've ever worked in any sort of programming environment,

0:25:11.240 --> 0:25:14.760
<v Speaker 1>you know that when you introduce changes in code, you

0:25:14.840 --> 0:25:17.679
<v Speaker 1>might fix whatever problem you're focusing on at the moment,

0:25:18.080 --> 0:25:21.840
<v Speaker 1>but you might also break something else that's already in

0:25:21.880 --> 0:25:25.000
<v Speaker 1>the code. With perceptrons, that happens when you start tweaking

0:25:25.080 --> 0:25:28.679
<v Speaker 1>weights and biases, because a small change in one spot

0:25:28.760 --> 0:25:31.040
<v Speaker 1>in a network can have sort of a ripple effect

0:25:31.040 --> 0:25:36.480
<v Speaker 1>with unintended consequences. So, for example, in our little hypothetical situation,

0:25:36.520 --> 0:25:39.840
<v Speaker 1>maybe your new model can better tell the difference between

0:25:40.000 --> 0:25:42.560
<v Speaker 1>a lower case L and a lower case I, but

0:25:42.680 --> 0:25:46.360
<v Speaker 1>now the lowercase J is giving it problems. The way

0:25:46.400 --> 0:25:50.320
<v Speaker 1>perceptrons work, small changes in the network can produce much

0:25:50.520 --> 0:25:54.119
<v Speaker 1>larger variations in output, so it's sort of like the

0:25:54.200 --> 0:25:59.359
<v Speaker 1>butterfly effect in action. Computer scientists created a different type

0:25:59.440 --> 0:26:03.600
<v Speaker 1>of artificial neural network that addresses this problem, and this

0:26:03.640 --> 0:26:06.640
<v Speaker 1>type is called a sigmoid neuron. Really, I should say

0:26:06.680 --> 0:26:09.720
<v Speaker 1>they created a different type of artificial neuron. So, the

0:26:09.760 --> 0:26:12.040
<v Speaker 1>sigmoid neuron: what the heck is this? Well, from a

0:26:12.119 --> 0:26:16.520
<v Speaker 1>high level, sigmoid neurons look kind of like perceptrons, but

0:26:17.200 --> 0:26:19.800
<v Speaker 1>while you'd use either a zero or a one

0:26:19.920 --> 0:26:23.359
<v Speaker 1>as the value for an input into a perceptron, a

0:26:23.480 --> 0:26:27.920
<v Speaker 1>sigmoid neuron can accept a zero, a one, or any

0:26:28.040 --> 0:26:32.200
<v Speaker 1>number in between zero and one. The output a sigmoid

0:26:32.240 --> 0:26:36.920
<v Speaker 1>neuron produces is called the logistic function or sigmoid function.
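
NOTE
Editorial sketch of a sigmoid neuron next to a perceptron-style step neuron.
It assumes the usual definition, in which the neuron's output is the logistic
function 1 / (1 + e^(-z)) applied to z, the weighted sum plus bias. The
inputs, weights, and bias values are hypothetical, picked so that a tiny
change in the bias crosses the step neuron's threshold.
```python
import math
def sigmoid(z):
    """Logistic function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))
def step_neuron(inputs, weights, bias):
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if z > 0 else 0
def sigmoid_neuron(inputs, weights, bias):
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)
# Hypothetical values: the weighted sum is exactly 0 before the bias is added.
inputs, weights = [1.0, 0.5], [2.0, -4.0]
for bias in (-0.01, 0.01):
    print(bias, step_neuron(inputs, weights, bias), sigmoid_neuron(inputs, weights, bias))
```
Nudging the bias from -0.01 to 0.01 flips the step neuron from 0 to 1, while
the sigmoid neuron's output moves only slightly on either side of 0.5.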

0:26:37.640 --> 0:26:41.080
<v Speaker 1>This gets a bit complicated on a surface level, particularly

0:26:41.320 --> 0:26:44.639
<v Speaker 1>if like me, you're a little rusty on your algebra

0:26:44.720 --> 0:26:48.640
<v Speaker 1>and calculus, but generally speaking, the end result is that

0:26:48.920 --> 0:26:52.280
<v Speaker 1>using this type of artificial neuron, you can make small

0:26:52.359 --> 0:26:56.320
<v Speaker 1>changes to weights and biases and not create a larger

0:26:56.400 --> 0:26:59.840
<v Speaker 1>effect on the ultimate output. You'll still make small adjustments

0:26:59.840 --> 0:27:03.080
<v Speaker 1>to the output. There are a lot of resources online

0:27:03.080 --> 0:27:05.639
<v Speaker 1>that go into greater detail about sigmoid neurons. I'm not

0:27:05.680 --> 0:27:08.960
<v Speaker 1>going to go into more detail here because without visual

0:27:09.000 --> 0:27:12.240
<v Speaker 1>aids and being able to go into algebraic functions, it

0:27:12.320 --> 0:27:15.080
<v Speaker 1>gets a little hard for me to explain. But in

0:27:15.119 --> 0:27:18.399
<v Speaker 1>your typical neural network, you would have an input layer

0:27:18.760 --> 0:27:21.000
<v Speaker 1>and you would have an output layer, So you have

0:27:21.040 --> 0:27:23.879
<v Speaker 1>a layer where information comes in, and you would have

0:27:23.920 --> 0:27:27.199
<v Speaker 1>the output layer where new information comes out. But between

0:27:27.240 --> 0:27:30.360
<v Speaker 1>those two you would have what are called hidden layers.
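
NOTE
Editorial sketch of an input layer, one hidden layer, and an output layer,
using sigmoid neurons. Every weight and bias here is a hypothetical number
with no meaning beyond illustration, and the layer sizes (two inputs, three
hidden neurons, one output) are likewise assumptions.
```python
import math
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))
def layer(inputs, weight_rows, biases):
    """One layer: each neuron sees every input, with its own weights and bias."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weight_rows, biases)]
def feed_forward(inputs, layers):
    """Pass values through each (weights, biases) layer in order."""
    values = inputs
    for weight_rows, biases in layers:
        values = layer(values, weight_rows, biases)
    return values
# Hypothetical network: 2 inputs, 3 hidden neurons, 1 output neuron.
network = [
    ([[0.5, -0.6], [0.1, 0.8], [-0.3, 0.2]], [0.0, -0.1, 0.2]),
    ([[1.0, -1.5, 0.7]], [0.1]),
]
print(feed_forward([0.9, 0.1], network))
```
The hidden layer is simply the first entry in the list: its three outputs
are never seen directly and only serve as inputs to the output neuron.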

0:27:31.000 --> 0:27:34.040
<v Speaker 1>That just really means that they're not input or output;

0:27:34.400 --> 0:27:37.520
<v Speaker 1>they're in the middle. Hidden makes it sound like they're

0:27:37.520 --> 0:27:41.960
<v Speaker 1>super clandestine and spy worthy and cool, but really they're

0:27:41.960 --> 0:27:45.199
<v Speaker 1>just in between input and output. They perform processes on

0:27:45.280 --> 0:27:48.320
<v Speaker 1>the inputs they receive, and they pass them on as

0:27:48.359 --> 0:27:53.520
<v Speaker 1>outputs to other neurons to have more processes put on

0:27:53.560 --> 0:27:56.440
<v Speaker 1>them until you finally get the output. The sort of

0:27:56.480 --> 0:28:00.840
<v Speaker 1>networks I've described so far are called feed forward networks,

0:28:01.440 --> 0:28:03.840
<v Speaker 1>and that means pretty much what it sounds like. You plug

0:28:03.880 --> 0:28:08.120
<v Speaker 1>in inputs, the information passes one way through the network,

0:28:08.920 --> 0:28:11.960
<v Speaker 1>and you eventually get output as the information continues to move,

0:28:12.000 --> 0:28:15.200
<v Speaker 1>and we typically visualize this in a left to right

0:28:15.600 --> 0:28:20.000
<v Speaker 1>kind of display, so you imagine input coming in

0:28:20.040 --> 0:28:23.199
<v Speaker 1>from the left side, passing through this network, having various

0:28:23.200 --> 0:28:26.840
<v Speaker 1>processes put on it as each of these neurons, uh,

0:28:27.040 --> 0:28:31.760
<v Speaker 1>decides if it counts as a positive or a negative response,

0:28:32.200 --> 0:28:35.960
<v Speaker 1>or with sigmoid neurons, some degree in between, and then

0:28:35.960 --> 0:28:38.360
<v Speaker 1>plugging that into the next neuron until you finally get

0:28:38.360 --> 0:28:41.200
<v Speaker 1>to the output. It always gets fed forward. But that's

0:28:41.480 --> 0:28:44.320
<v Speaker 1>not the only type of artificial neural network. There are

0:28:44.360 --> 0:28:48.880
<v Speaker 1>also things called recurrent neural networks, in which neurons fire

0:28:48.920 --> 0:28:52.000
<v Speaker 1>for some predetermined amount of time. Then they typically settle

0:28:52.080 --> 0:28:55.160
<v Speaker 1>down, they're not firing at all, but the next group

0:28:55.240 --> 0:28:57.400
<v Speaker 1>of neurons starts to fire. This creates kind of a

0:28:57.400 --> 0:29:01.560
<v Speaker 1>cascade effect through the network, and occasionally there could

0:29:01.560 --> 0:29:06.000
<v Speaker 1>be neurons that feed back into previous neurons. There's a

0:29:06.040 --> 0:29:10.720
<v Speaker 1>feedback loop. It's more challenging to make a powerful learning

0:29:10.760 --> 0:29:14.959
<v Speaker 1>algorithm with recurrent neural networks because it gets super duper complicated.

0:29:15.400 --> 0:29:20.360
<v Speaker 1>But recurrent neural networks offer potentially huge utility in the future.

0:29:20.560 --> 0:29:24.080
<v Speaker 1>So an artificial neural network can be made up of

0:29:24.480 --> 0:29:28.120
<v Speaker 1>as few as a few dozen artificial neurons all

0:29:28.160 --> 0:29:31.520
<v Speaker 1>the way up to millions of artificial neurons, and we

0:29:31.640 --> 0:29:36.280
<v Speaker 1>train them through various processes such as backpropagation. Now

0:29:36.320 --> 0:29:39.520
<v Speaker 1>that's when you take the actual output of the process

0:29:39.560 --> 0:29:42.000
<v Speaker 1>and you compare it to what you wanted it to produce,

0:29:42.640 --> 0:29:45.520
<v Speaker 1>and then you use the difference between those two results

0:29:45.520 --> 0:29:48.520
<v Speaker 1>to make changes to the weights and biases in the network.
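
NOTE
Editorial sketch of the compare-and-adjust idea just described, reduced to a
single sigmoid neuron learning the NAND truth table. This is a deliberate
simplification: full backpropagation also passes the error back through
hidden layers, which is not shown here. The learning rate, the number of
passes, and the starting weights are all hypothetical choices.
```python
import math
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))
def predict(inputs, weights, bias):
    return sigmoid(sum(x * w for x, w in zip(inputs, weights)) + bias)
# Training data: the NAND truth table as (inputs, desired output) pairs.
examples = [([0, 0], 1), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
weights, bias = [0.0, 0.0], 0.0
learning_rate = 0.5  # hypothetical step size
for _ in range(5000):
    for inputs, target in examples:
        error = target - predict(inputs, weights, bias)  # desired minus actual
        # Nudge each weight and the bias a little in the direction that shrinks the error.
        weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
        bias += learning_rate * error
for inputs, target in examples:
    print(inputs, target, round(predict(inputs, weights, bias), 3))
```
After training, the neuron's outputs should sit close to one for the first
three rows and close to zero for the last, without anyone setting the weights
of minus two and bias of three by hand.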

0:29:48.640 --> 0:29:53.560
<v Speaker 1>So here's an example: we're training our network to

0:29:53.640 --> 0:29:56.360
<v Speaker 1>recognize pictures of cats, because this has actually been done.

0:29:56.440 --> 0:30:00.480
<v Speaker 1>Google famously did this. So you're training your network to

0:30:00.520 --> 0:30:04.640
<v Speaker 1>recognize what a cat is based upon a picture, and

0:30:04.720 --> 0:30:07.520
<v Speaker 1>you use a picture that you know is a picture

0:30:07.600 --> 0:30:10.080
<v Speaker 1>of a cat, so you already know the answer to this.

0:30:10.360 --> 0:30:12.880
<v Speaker 1>You're teaching the computer to learn the answer to this.

0:30:13.320 --> 0:30:16.320
<v Speaker 1>You know that the answer is cat, and you feed

0:30:16.640 --> 0:30:20.360
<v Speaker 1>the image through this system. It analyzes the data, it

0:30:20.400 --> 0:30:23.960
<v Speaker 1>gives you an output, and you see how well it did.

0:30:24.160 --> 0:30:27.400
<v Speaker 1>Did it correctly identify the image as a cat, did

0:30:27.440 --> 0:30:33.520
<v Speaker 1>it assign a certain level of certainty to its conclusion,

0:30:34.120 --> 0:30:37.000
<v Speaker 1>and if it's far off, you could start making changes

0:30:37.040 --> 0:30:40.560
<v Speaker 1>to those weights and biases to help guide the system

0:30:40.640 --> 0:30:44.000
<v Speaker 1>into determining, oh, yes, that is a cat. Training a

0:30:44.040 --> 0:30:47.320
<v Speaker 1>network multiple times refines this process to the point where

0:30:47.800 --> 0:30:51.080
<v Speaker 1>you can start to introduce brand new inputs to the system,

0:30:51.200 --> 0:30:54.200
<v Speaker 1>inputs that the system has never encountered before, and get

0:30:54.240 --> 0:30:58.960
<v Speaker 1>a reliable result. So with Google's example, you might feed

0:30:59.000 --> 0:31:03.160
<v Speaker 1>it thousands or tens of thousands, or hundreds of thousands

0:31:03.280 --> 0:31:07.400
<v Speaker 1>or more images of cats, and each time the system

0:31:07.520 --> 0:31:10.320
<v Speaker 1>is told now that there is a cat in this image,

0:31:10.800 --> 0:31:14.240
<v Speaker 1>and it begins to refine its approach, figuring out which

0:31:14.280 --> 0:31:16.720
<v Speaker 1>weights and biases it needs to tweak in order to

0:31:16.760 --> 0:31:20.320
<v Speaker 1>get to that result. And then you feed it a

0:31:20.400 --> 0:31:23.280
<v Speaker 1>whole group of new images and you don't tell it

0:31:23.480 --> 0:31:26.240
<v Speaker 1>if there are cats in those images or not. Then

0:31:26.280 --> 0:31:28.440
<v Speaker 1>you leave it to the system to determine are there

0:31:28.520 --> 0:31:32.360
<v Speaker 1>cats in these pictures? And if you have trained it properly,

0:31:32.720 --> 0:31:37.840
<v Speaker 1>if those weights and biases are actually well tweaked, then

0:31:38.080 --> 0:31:40.640
<v Speaker 1>the system should be able to reliably pick out the

0:31:40.680 --> 0:31:45.000
<v Speaker 1>pictures that have cats in them. That's the idea. Now,

0:31:45.040 --> 0:31:47.840
<v Speaker 1>there's tons more to be said about artificial neural networks,

0:31:47.840 --> 0:31:51.520
<v Speaker 1>but I've given you a quick overview. Let's

0:31:51.600 --> 0:31:54.120
<v Speaker 1>jump back over to Apple for a second, because

0:31:54.240 --> 0:31:56.160
<v Speaker 1>that was the whole purpose of this episode. So what

0:31:56.360 --> 0:32:00.440
<v Speaker 1>is a neural engine actually used for? Well, for the iPhone,

0:32:00.760 --> 0:32:04.400
<v Speaker 1>it's used mainly in processing speech and image data. It's

0:32:04.440 --> 0:32:07.120
<v Speaker 1>the neural engine that can analyze your face, for example,

0:32:07.440 --> 0:32:11.000
<v Speaker 1>and then translate your expressions into animated form. You can

0:32:11.000 --> 0:32:15.320
<v Speaker 1>create animated emoji this way, So you could use the

0:32:15.440 --> 0:32:21.320
<v Speaker 1>little application and create a customized surprise emoji that copies

0:32:21.320 --> 0:32:22.800
<v Speaker 1>the way you look when you make a sort of

0:32:22.800 --> 0:32:26.600
<v Speaker 1>an exaggerated surprise face. You could do that. The neural

0:32:26.600 --> 0:32:30.600
<v Speaker 1>engine takes the incoming data the images it's pulling from

0:32:30.680 --> 0:32:34.200
<v Speaker 1>the camera, analyzes it, and then helps create an animated

0:32:34.240 --> 0:32:37.440
<v Speaker 1>image that mirrors what you did. The neural engine also

0:32:37.480 --> 0:32:41.280
<v Speaker 1>analyzes visual data for the purposes of augmented reality. That's

0:32:41.320 --> 0:32:44.240
<v Speaker 1>when you overlay digital information on top of a view

0:32:44.360 --> 0:32:48.000
<v Speaker 1>of the physical world around you. So with smartphones like

0:32:48.000 --> 0:32:50.680
<v Speaker 1>the iPhone, it means holding your phone up and looking

0:32:51.000 --> 0:32:54.280
<v Speaker 1>at the world through your phone screen. So the camera

0:32:54.600 --> 0:32:57.920
<v Speaker 1>on your phone is giving you a live video feed

0:32:58.480 --> 0:33:01.120
<v Speaker 1>of whatever you're pointing the phone at, and then

0:33:02.040 --> 0:33:04.520
<v Speaker 1>you use an augmented reality app, and on top of

0:33:04.560 --> 0:33:07.400
<v Speaker 1>that video image your phone will overlay some sort of

0:33:07.440 --> 0:33:10.800
<v Speaker 1>digital information. Could be a game, it could be information

0:33:10.840 --> 0:33:14.480
<v Speaker 1>about your surroundings. The digital information can appear to be

0:33:14.520 --> 0:33:18.800
<v Speaker 1>anchored to the physical space itself. So you could have

0:33:18.840 --> 0:33:21.680
<v Speaker 1>an augmented reality application that lets you view a virtual

0:33:21.680 --> 0:33:24.240
<v Speaker 1>piece of furniture in your house. And so when you

0:33:24.240 --> 0:33:27.080
<v Speaker 1>hold up the phone, you use the app to place

0:33:27.240 --> 0:33:30.560
<v Speaker 1>a virtual chair, let's say, in a specific location in

0:33:30.560 --> 0:33:33.400
<v Speaker 1>a room, and you can walk around this virtual chair

0:33:33.440 --> 0:33:35.680
<v Speaker 1>holding your phone up, and it looks like the chair

0:33:35.760 --> 0:33:38.800
<v Speaker 1>is actually there, even as your perspective changes. You can

0:33:38.840 --> 0:33:41.680
<v Speaker 1>circle around it and view the chair from all the

0:33:41.720 --> 0:33:44.560
<v Speaker 1>different angles as if it were actually sitting there in

0:33:44.600 --> 0:33:48.200
<v Speaker 1>the room. It's anchored to its place that you've put

0:33:48.240 --> 0:33:51.320
<v Speaker 1>it within the view of the room. The neural engine

0:33:51.360 --> 0:33:54.040
<v Speaker 1>is analyzing all this information that's coming in from the

0:33:54.080 --> 0:33:56.920
<v Speaker 1>camera and helping the app create the image of the chair,

0:33:57.360 --> 0:34:00.880
<v Speaker 1>keeping it the appropriate size and orientation with respect to your

0:34:00.960 --> 0:34:04.360
<v Speaker 1>viewing angle. And the neural engine can use this ability

0:34:04.400 --> 0:34:07.080
<v Speaker 1>to help you go through stuff like your photos. Let's

0:34:07.120 --> 0:34:10.720
<v Speaker 1>say you've got an adorable pet, like my doggie Tibolt.

0:34:10.920 --> 0:34:15.640
<v Speaker 1>He is adorable. The iPhone can use its neural engine

0:34:15.680 --> 0:34:19.480
<v Speaker 1>and image recognition algorithms to return the pictures of your

0:34:19.520 --> 0:34:22.200
<v Speaker 1>pet in response to a search query. So my wife,

0:34:22.200 --> 0:34:24.760
<v Speaker 1>who has an iPhone, could do this with our dogs.

0:34:24.800 --> 0:34:27.440
<v Speaker 1>She could search for the word dog in her photo

0:34:27.520 --> 0:34:31.440
<v Speaker 1>app and then she would get countless images of Tibolt.

0:34:31.719 --> 0:34:35.440
<v Speaker 1>And I know it works because she's done it. Apple

0:34:35.480 --> 0:34:38.760
<v Speaker 1>has included access to the neural engine so that app

0:34:38.840 --> 0:34:41.799
<v Speaker 1>developers can actually take advantage of that technology as well.

0:34:41.800 --> 0:34:45.040
<v Speaker 1>They'll doubtlessly create new ways to leverage this tech, so

0:34:45.200 --> 0:34:46.840
<v Speaker 1>we'll have to keep our eyes open to see what

0:34:46.920 --> 0:34:49.640
<v Speaker 1>comes out of it. Neural networks in general are becoming

0:34:49.719 --> 0:34:54.480
<v Speaker 1>increasingly important in machine learning and artificial intelligence, so it's

0:34:54.480 --> 0:34:57.000
<v Speaker 1>likely to grow as a branch of computer science for

0:34:57.120 --> 0:35:00.480
<v Speaker 1>the next several years. And that wraps up this episode.

0:35:00.840 --> 0:35:04.080
<v Speaker 1>If you have suggestions for future episodes of tech Stuff,

0:35:04.080 --> 0:35:07.560
<v Speaker 1>maybe it's a technology, a person in tech, a company, anything

0:35:07.600 --> 0:35:10.400
<v Speaker 1>like that, send me an email. The address is tech Stuff

0:35:10.440 --> 0:35:12.880
<v Speaker 1>at how stuff works dot com. You can drop me

0:35:12.920 --> 0:35:15.879
<v Speaker 1>a line on Facebook or Twitter. The handle for both

0:35:15.880 --> 0:35:19.680
<v Speaker 1>of those is tech Stuff H s W. Don't forget,

0:35:20.239 --> 0:35:23.799
<v Speaker 1>we have a merchandise store over at TeePublic dot

0:35:23.800 --> 0:35:27.080
<v Speaker 1>com slash tech stuff. That's T E E public dot

0:35:27.120 --> 0:35:30.479
<v Speaker 1>com slash tech stuff. You can go and get your uh,

0:35:30.520 --> 0:35:35.280
<v Speaker 1>your CAPTCHA test, the prove you're not a robot sticker

0:35:35.400 --> 0:35:37.759
<v Speaker 1>or T shirt or tote bag or whatever type of

0:35:37.840 --> 0:35:40.160
<v Speaker 1>thing you would like that on. It's pretty cool, So

0:35:40.200 --> 0:35:42.440
<v Speaker 1>go check that out, and don't forget to follow us

0:35:42.440 --> 0:35:45.680
<v Speaker 1>on Instagram and I'll talk to you again really soon.

0:35:51.480 --> 0:35:54.000
<v Speaker 1>For more on this and thousands of other topics, visit

0:35:54.040 --> 0:36:02.080
<v Speaker 1>how stuff works dot com