Speaker 1: Get in touch with technology with TechStuff from HowStuffWorks.com. Hey there, and welcome to TechStuff. My name is Jonathan Strickland. I happen to be the host of this show. I'm also an executive producer at HowStuffWorks, and hey, I love all things tech. Today we're doing a little listener mail request. Dan wrote in and asked if I might do an episode about Apple's so-called neural engine in its more recent iPhones. So today we are going to learn what a neural engine is and what it does. And by the way, if you have any requests for topics, if you've ever thought, "Hey, I want an episode about this particular tech topic," remember that you can send those to me by emailing techstuff at howstuffworks dot com. And now let's talk about this neural engine.
Speaker 1: Well, the general public first heard about this topic back in September 2017, when Apple CEO Tim Cook presented at what has become an annual tradition for Apple. Pretty much every September, Apple comes out and unveils the latest in its line of iPhone smartphones, and in 2017 that would have been the iconic iPhone X, the tenth-anniversary edition of the iPhone, and also the one that has since been discontinued. Cook listed off a lot of features in that presentation, but the one we're really interested in today is part of the phone's A11 microprocessor, also called the A11 Bionic. The most recent iPhones as of the recording of this podcast have the next generation of that chip, the A12, but in both cases the neural engine is one of the elements that gets a lot of coverage. So let's go to the A11, since that was the first one to have it. It's more than just a CPU. It's technically a system on a chip, or SoC. It's an ARM-based 64-bit chip.
Speaker 1: But that doesn't really tell you anything if you're not, you know, deep into the world of microprocessors. So what does that actually mean? Well, the ARM-based part means it's based on the ARM microarchitecture in chip design. For our purposes, we can simplify this to say the chip's components, the stuff that's actually on the microprocessor, are laid out in a way that was developed by Arm Holdings, the company behind ARM processors. That is different from the layout you would find in a chip made by Intel, for example. So the architecture part literally refers to the layout of components in the microprocessor and how they interact with each other. Generally speaking, companies that make microprocessors develop an architecture in a way that is supposed to maximize the efficiency of the chips: you want to get the most performance for the least amount of energy input, with the least amount of waste. You don't want to waste too much energy and produce too much heat.
Speaker 1: And then you typically reduce the size of the various components, and after you reduce the size of the components, you might figure out a new architecture that makes better use of those smaller components. This process goes on and on; Intel calls it the tick-tock methodology. So that's what the ARM-based part means: it's from this particular company, following this particular layout. As for that 64-bit part, what does that mean? Well, that refers to the data width of the arithmetic logic unit, or ALU. That's the part of the processor that actually carries out operations on data from computer instructions. Data width essentially tells you how much information the ALU can accept or handle at a given time, and it tells you this in bits. Now, a bit, just to remind you, is a single unit of computational information, and it is binary, meaning it has two states, which we designate as either a zero or a one. Some people say off and on, or false and true, but it's zero and one. The number of bits tells you how big the numbers can get before the ALU can't handle them anymore. So let's say you have an eight-bit chip, because that's a lot easier to talk about. You would be able to add, subtract, multiply, divide, you know, perform the basic arithmetic and logical operations on eight-bit numbers with an eight-bit chip. Now, a single bit is a zero or a one, and an eight-bit number you can represent as a string of eight digits, each either a zero or a one. So you could have eight zeros in a row, up to eight ones in a row, and everything in between. It could be seven zeros and then a one, or it could be six zeros, then a one, then another zero. You get the point. With that many combinations, you would be able to represent the numbers from zero up to 255. That's with eight bits. However, we're not talking eight bits; we're talking about a 64-bit chip. So now you have 64 digits in a row that can each be either a zero or a one.
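As a quick aside, you can check those data-width ranges yourself. Here's a minimal Python sketch (mine, not anything from the episode or from Apple's toolchain) of how bit width maps to value range:

```python
# An n-bit string has 2**n distinct combinations, so an unsigned
# n-bit integer ranges from 0 up to 2**n - 1.
def unsigned_range(bits):
    return 0, 2 ** bits - 1

lo, hi = unsigned_range(8)
print(lo, hi)        # 0 255, the eight-bit range

lo, hi = unsigned_range(64)
print(hi)            # 18446744073709551615, the unsigned 64-bit maximum
print(2 ** 63 - 1)   # 9223372036854775807; this is the signed 64-bit
                     # maximum, which is the figure quoted in the episode
```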
Speaker 1: That provides you a lot more combinations, which means you could range in number from zero up to 9,223,372,036,854,775,807, roughly nine quintillion. That's a pretty big range; it can handle way, way larger numbers than an eight-bit chip. So that tells you the type of architecture this chip has and the amount of data it can handle at a time. The A11 has six cores. Processors with multiple cores can work on parts of a problem simultaneously: if you have what's called a parallel problem, you can divide it up into different segments and have different cores tackle them. Two of those six cores are what Apple calls high-performance cores. They have a clock speed of 2.39 gigahertz in the A11, and the clock speed tells you how many clock cycles a CPU can perform per second.
Speaker 1: 2.39 gigahertz means that these cores can each perform 2.39 billion clock cycles per second. Now, clock cycles do not easily translate into actions; it's not necessarily one clock cycle per action. But generally these numbers tell you how much a core of the processor is able to handle per second, how many tasks it can do per second, assuming a certain number of clock cycles per task. These two cores are referred to as Monsoon. The other four cores are what Apple refers to as energy-efficient cores. They don't run at that same high clock speed; they are meant to handle more routine tasks, and they are called Mistral. So you have Monsoon and Mistral: two Monsoon cores and four Mistral cores. But the A11 is not just a CPU. It also has a three-core graphics processing unit, or GPU, incorporated into the chip. And then there are two processing cores dedicated specifically to handling tasks related to machine learning algorithms. This pair of processors is the neural engine. They are essentially an artificial neural network.
Speaker 1: And I've talked a little bit about artificial neural networks before, but today we're really going to try to get an understanding of what makes them special, because that's why "neural engine" means anything in the first place. This means we get to do a quick history lesson, because this is TechStuff, and of course we have to go into the history. So here we go. Back in the 1940s and 1950s, there were some smart guys: Warren McCulloch, who was a neurophysiologist, and Walter Pitts, who was a computer scientist and a logician. They began developing theories that brought together computational science and neuroscience, in other words, the way machines process information and the way brains process information, which is different. McCulloch wrote a couple of papers about this, and he asserted that the basic unit of logic in the brain is the neuron. The nerve cell, the brain cell, is your basic unit of logic in a brain, so it would act kind of like a gate or a transistor in a circuit.
Speaker 1: And so you might think of the transistor as the smallest unit that allows logic to happen in a circuit, and the neuron as its counterpart in the brain. Pitts and McCulloch began developing computer algorithms that attempted to guide machines to process information in a way that was at least conceptually similar to the way our brains process information. McCulloch had proposed that by doing this, you could train a machine to recognize handwritten characters like numbers or letters, even if those representations varied in size or style. And I've talked about this being a challenge in the past as well: training a computer to recognize a specific type of image, or a specific thing in an image, is hard. I always use coffee mugs as an example. I don't know why, but I like that particular one, so we're gonna go with it again. Say you create a computer program where you feed an image of a coffee mug to the program, and you tell the program that this image corresponds with this concept called "coffee mug."
Speaker 1: The image shows a blue coffee mug, with its handle pointed toward the right from the viewer's perspective. Then you feed it a different image, maybe of that same coffee mug but at a different angle. Well, the machine looks at this as if it's a totally new thing. It cannot just extrapolate from that information and say, oh, this is also a coffee mug, or maybe it's a different coffee mug, a different color or size or shape. The computer doesn't understand the concept of a coffee mug. So how can you teach it this concept? How can you train it so it recognizes coffee mugs? That was what McCulloch was looking at. Then another guy came along, Frank Rosenblatt, a very smart man who built on this work. He developed an artificial neuron called the perceptron. Now, a perceptron's job is, from a very high level, pretty simple. It accepts multiple binary inputs, inputs that are either zeros or ones, and then it produces a single binary output, either a zero or a one, based upon processing that information.
Speaker 1: So let's say you want to create a program that can help you decide which restaurant to go to, and you've come up with three criteria that you think are really important for making this decision. The three criteria are: Is the restaurant within a twenty-minute drive or less, that is, is it relatively close? Will a meal cost less than fifty dollars for two people to have dinner there? And does the restaurant serve tacos? Those are your three criteria, and you can represent each of those variables with a binary figure. So, for example, you could say that if the restaurant is closer than a twenty-minute drive, if it is nearby, you represent that variable with a one; if it is further away than that, it's a zero. If dinner for two is cheaper than fifty dollars, that's a one; if it's more expensive, it's a zero. If it serves tacos, it's a one, and if it does not serve tacos, it's a big fat zero. Then you have a list of various restaurants; you could feed each restaurant through your criteria and see how they do.
Speaker 1: And then you could narrow your choices this way, though perhaps there is no single restaurant that meets all those criteria, so you really should take another step. That's where Rosenblatt introduced the concept of weights, where you change how important each of the criteria is in relation to the others. Weights are real numbers that indicate the importance of a particular criterion. So let's say that of the three criteria you've identified, the distance, the cost, and whether or not they have tacos, you have decided the most critical piece of information is whether or not the restaurant serves tacos. You could then assign a greater weight to that criterion, saying this is more important to me, and that will influence the output of the neuron. You must also determine a threshold value for the decision. In other words, you say that in order to produce a positive result, to tell me yes, this is a restaurant you should go to, the calculation must at least meet this threshold. That's the minimum value the calculation has to meet or exceed in order to produce a "go to this restaurant" result.
Speaker 1: I'll explain a bit more about this in just a second, but first I'm going to take a quick break and thank our sponsors. That threshold value I mentioned before the break is really important, because it tells your model what sort of results count as valid versus not valid. So let's say I've weighted the criteria so that the distance to the restaurant and the expense of the meal each have a weight of two, but the presence of tacos is a six. That's how important I think tacos are. And I've set a threshold of four. That means that if the restaurant is relatively close and relatively inexpensive, it's going to pass my criteria, because I've given a weight of two to both of those, and added together that's four. It equals the threshold; good to go. But even if the restaurant is far away, and even if it's expensive, if it serves tacos it still passes my criteria, because I gave tacos a weight of six. Raising the threshold value reduces the number of valid restaurants.
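The weighted-sum-versus-threshold rule just described can be sketched in a few lines of Python. This is my own illustration, and the restaurant values are made up:

```python
# Criteria order: (within 20 minutes, under $50, serves tacos),
# each 0 or 1, with the weights from the example: 2, 2, and 6.
WEIGHTS = (2, 2, 6)

def passes(criteria, threshold):
    # Weighted sum of the binary criteria, compared to the threshold.
    score = sum(w * c for w, c in zip(WEIGHTS, criteria))
    return score >= threshold

close_and_cheap = (1, 1, 0)   # score 2 + 2 + 0 = 4
far_pricey_tacos = (0, 0, 1)  # score 0 + 0 + 6 = 6

print(passes(close_and_cheap, threshold=4))   # True: 4 meets the threshold
print(passes(far_pricey_tacos, threshold=4))  # True: tacos alone clear it
print(passes(close_and_cheap, threshold=8))   # False: raising the threshold
                                              # rules this one out
```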
Speaker 1: So if I make the threshold eight instead of four, now the only way I can get a valid result, a result of "yes, go to this restaurant," is if the restaurant has tacos and it's either close by, or inexpensive, or both. And if I set the threshold to ten, all three criteria would need to be met for the option to be valid. Now, in artificial intelligence, for purposes of notation, many people will move the threshold value to the other side of the equation, and in that case we call it a bias. A bias is essentially a measurement that tells you how easy or difficult it is to get the perceptron to fire off a positive value. If you have a big positive bias, that means it's easier for the perceptron to produce a positive result, a one. A large negative bias does the opposite, and thus you would get a zero. So we can write out the perceptron's rules like this. Take the value of a variable, which is going to be either a zero or a one; it will be binary.
Speaker 1: You multiply the value of this variable by the weight of that variable, and weights can be different values. Let's say that distance and expense are both weighted at two, and tacos gets a big hefty six. You're going to add your various weighted variable results together, and then you add the bias for the perceptron. In our example, the bias is minus six. That tells us that in order for this perceptron to fire, you have to be able to factor in that minus six and beat it. If, after adding these elements together, you get a result that is zero or lower, the output is a zero, a negative result saying, don't go to this restaurant. So after adding that negative six, if you have zero or less, you don't go. If you get a result that's greater than zero, it's a positive result, and it says, go to that restaurant. So here in our hypothetical perceptron, we've decided on a bias of minus six, and we take our three variables as we examine a single restaurant. This restaurant is twenty-five minutes away.
Speaker 1: So that means our first variable, which is all about distance, gets a zero, because the restaurant is further than twenty minutes away. That variable is a zero, and we multiply the variable by the weight: the weight is two for that particular variable, and two times zero is zero. Then I look and see that dinner for two at that restaurant is going to set me back thirty dollars, but that's below the limit we set of fifty dollars. So the value of that variable is one; it is cheaper than fifty dollars, so it gets a one. The weight for this variable is two, so multiply the weight by the variable: two times one is two. Then we have the question, does the restaurant serve tacos? And I know you're dying to know this. I'm glad to report the restaurant does in fact serve tacos, and that means the variable is a one. It's positive, and we weighted this variable very heavily with a six, so six times one is six. Now we have to add all of those results together: we have zero from the first one, two from the second one, and six from the third one. Add that together and you get eight.
Speaker 1: Now we have to add in the bias, and the bias for this perceptron is minus six. Eight plus minus six gives us a final value of two. Two is greater than zero, so by the rules we have established, the perceptron says this is a positive result and fires off a one. The restaurant we fed to the perceptron met the criteria, given that bias. Now, if our bias had been minus ten or minus nine, we would not have produced a positive result; we would have gotten a zero or a negative number, and it would have said no. So that bias is very important, as is the weight of the various variables. And that is one neuron. Now, you can actually create layers of neurons. That's why we call it an artificial neural network, not just an artificial neuron. And by doing that, you can have the results of one neuron's decisions feed directly into another neuron.
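Here is the same arithmetic in the bias form, written out as a short Python sketch. It's mine, not Rosenblatt's notation, and it just mirrors the numbers walked through above:

```python
def perceptron(inputs, weights, bias):
    # Fire (output 1) if the weighted sum plus the bias is above zero.
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

weights = (2, 2, 6)     # distance, cost, tacos
restaurant = (0, 1, 1)  # 25 minutes away, under $50 for two, serves tacos

print(perceptron(restaurant, weights, bias=-6))  # 1: 0 + 2 + 6 - 6 = 2 > 0, go
print(perceptron(restaurant, weights, bias=-9))  # 0: 8 - 9 = -1, don't go
```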
Speaker 1: Also, a perceptron can perform as a type of logic gate called a NAND gate, N-A-N-D, which stands for "not and." It's a type of logic gate that produces a false, or negative, output only if all its inputs are true, or positive. In other words, with the right weights and biases, a perceptron will produce an output of zero if all of its inputs are ones, and a one otherwise. The NAND gate in computer science is a universal gate, because you can use different combinations of NAND gates to build any kind of computation. You just have to link them together properly in order to do it. It's not always the most efficient way to do things, but it does work. So if you had a perceptron that accepted two variables, each with a weight of minus two, and the perceptron had a bias of three, it would act like a NAND gate.
That's because if both variables are one, 334 00:20:56,080 --> 00:20:58,560 Speaker 1: then the first term in the equation you'd use to determine the output 335 00:20:58,560 --> 00:21:02,600 Speaker 1: would be minus two, because you multiply the weight of 336 00:21:02,680 --> 00:21:05,560 Speaker 1: minus two times the variable of one, and then you 337 00:21:05,720 --> 00:21:09,040 Speaker 1: have to add a second minus two because the second 338 00:21:09,080 --> 00:21:12,359 Speaker 1: variable works the same way. And then you would add 339 00:21:12,359 --> 00:21:15,080 Speaker 1: the bias, which is three. But minus two plus minus 340 00:21:15,119 --> 00:21:18,600 Speaker 1: two is minus four. You add in plus three, you 341 00:21:18,640 --> 00:21:21,320 Speaker 1: get minus one as the result. Minus one is 342 00:21:21,400 --> 00:21:23,760 Speaker 1: less than zero, which means the output of the perceptron 343 00:21:23,960 --> 00:21:27,000 Speaker 1: must be zero as opposed to one. You get a 344 00:21:27,040 --> 00:21:31,560 Speaker 1: false or an off or a zero result. Two positive 345 00:21:31,560 --> 00:21:34,800 Speaker 1: inputs create a negative output, one of the few times you 346 00:21:34,840 --> 00:21:37,639 Speaker 1: can say two positives make a negative. Now that means 347 00:21:37,920 --> 00:21:42,760 Speaker 1: we can ask progressively more complicated questions, with each perceptron 348 00:21:42,840 --> 00:21:46,480 Speaker 1: handling one aspect of that question and feeding into another 349 00:21:46,560 --> 00:21:50,320 Speaker 1: layer of perceptrons.
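Plugging the numbers from that walkthrough into code, a perceptron with two weights of minus two and a bias of three reproduces the full NAND truth table. This is just the arithmetic above, checked for all four input combinations:

```python
def nand_gate(x1, x2):
    # Perceptron with weights of -2 and -2 and a bias of 3.
    total = (-2 * x1) + (-2 * x2) + 3
    return 1 if total > 0 else 0

# Output is 0 only when both inputs are 1, exactly like a NAND gate.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", nand_gate(x1, x2))
# 0 0 -> 1   (0 + 0 + 3 = 3)
# 0 1 -> 1   (0 - 2 + 3 = 1)
# 1 0 -> 1   (-2 + 0 + 3 = 1)
# 1 1 -> 0   (-2 - 2 + 3 = -1)
```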
Each perceptron will produce either a positive 350 00:21:50,400 --> 00:21:52,440 Speaker 1: or a negative result, so you either get a one 351 00:21:52,560 --> 00:21:55,680 Speaker 1: or a zero, and these results will feed into other 352 00:21:55,760 --> 00:21:58,400 Speaker 1: neurons in the network, which will use them to perform 353 00:21:58,560 --> 00:22:02,280 Speaker 1: their own calculations with their own weights and their own biases. 354 00:22:02,800 --> 00:22:05,480 Speaker 1: All of this is to feed those questions through a 355 00:22:05,560 --> 00:22:07,840 Speaker 1: network to produce a result, and I should be clear, 356 00:22:08,320 --> 00:22:11,000 Speaker 1: the weights for each variable along this path can change 357 00:22:11,400 --> 00:22:13,880 Speaker 1: from one part of the decision making process to the next. 358 00:22:13,920 --> 00:22:18,360 Speaker 1: We're not just talking about identical perceptrons all through the network, 359 00:22:18,760 --> 00:22:21,320 Speaker 1: and that last bit is the most important part, because 360 00:22:21,520 --> 00:22:24,359 Speaker 1: if this were just a matter of setting biases and 361 00:22:24,400 --> 00:22:27,600 Speaker 1: weights and building out a network of perceptrons, there'd be 362 00:22:27,640 --> 00:22:30,879 Speaker 1: nothing special about it, because we already have NAND gates. 363 00:22:31,760 --> 00:22:35,240 Speaker 1: They existed before perceptrons. It would just mean that we 364 00:22:35,320 --> 00:22:39,119 Speaker 1: have a different way to implement something we could already do, 365 00:22:39,440 --> 00:22:41,320 Speaker 1: and finding a new way to do something you were 366 00:22:41,359 --> 00:22:45,480 Speaker 1: already doing is rarely super transformative.
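Because NAND is universal, layering these perceptron "gates" lets a network compute things no single perceptron can. As an illustration, here is XOR, a classic function a lone perceptron cannot compute, built from four NAND-acting perceptrons feeding into one another. The wiring is the standard NAND-based XOR construction, shown here only to make the layering idea concrete:

```python
def nand(x1, x2):
    # A perceptron with weights of -2 and -2 and a bias of 3 acts as NAND.
    return 1 if (-2 * x1) + (-2 * x2) + 3 > 0 else 0

def xor(a, b):
    # First layer: one perceptron. Later layers: its result feeds others.
    n1 = nand(a, b)
    return nand(nand(a, n1), nand(b, n1))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))
# 0 0 -> 0
# 0 1 -> 1
# 1 0 -> 1
# 1 1 -> 0
```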
You might be able 367 00:22:45,520 --> 00:22:48,760 Speaker 1: to make it a better way of doing the same thing, 368 00:22:48,840 --> 00:22:52,080 Speaker 1: but in this case it might be less efficient than 369 00:22:52,119 --> 00:22:54,680 Speaker 1: the old way. However, there is something else that makes 370 00:22:54,680 --> 00:22:58,600 Speaker 1: these perceptrons special, and that's by pairing them with those 371 00:22:58,640 --> 00:23:02,280 Speaker 1: special algorithms that McCulloch and Pitts were proposing back in 372 00:23:02,320 --> 00:23:06,040 Speaker 1: the forties and fifties. These would be learning algorithms. These 373 00:23:06,040 --> 00:23:10,800 Speaker 1: algorithms are instructions that can, based upon external stimuli, dynamically 374 00:23:10,880 --> 00:23:15,240 Speaker 1: and automatically tune the weights and biases of perceptrons in 375 00:23:15,320 --> 00:23:18,800 Speaker 1: a neural network. In other words, a program can guide 376 00:23:18,960 --> 00:23:22,840 Speaker 1: the network so that it learns how to solve problems. 377 00:23:22,880 --> 00:23:26,639 Speaker 1: But how? Well, it all comes down to making small 378 00:23:26,800 --> 00:23:30,399 Speaker 1: changes in those weights and biases in order to fine 379 00:23:30,440 --> 00:23:33,680 Speaker 1: tune outputs. So let's say we're working on an image 380 00:23:33,720 --> 00:23:37,119 Speaker 1: recognition algorithm. That's one of the big things that the 381 00:23:37,160 --> 00:23:40,879 Speaker 1: neural engine in Apple's iPhones does. That's one of 382 00:23:40,920 --> 00:23:43,920 Speaker 1: its main purposes. So in our example, let's say we're 383 00:23:43,960 --> 00:23:49,479 Speaker 1: training the neural network to recognize handwritten printed lowercase letters. 384 00:23:49,520 --> 00:23:52,240 Speaker 1: It's very similar to what McCulloch was talking about.
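The idea of a learning algorithm nudging weights and biases can be sketched with the classic perceptron learning rule. This is a textbook illustration, not the specific algorithms McCulloch and Pitts described, and the AND-function training data is a made-up stand-in for a real task like letter recognition:

```python
def train(samples, epochs=10, lr=0.1):
    # Perceptron learning rule: nudge weights and bias toward the target
    # whenever the perceptron gets an example wrong.
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - out        # -1, 0, or +1
            w[0] += lr * error * x1     # small adjustment per mistake
            w[1] += lr * error * x2
            b += lr * error
    return w, b

# Hypothetical task: learn the AND function from labeled examples.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train(data, epochs=20)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in data])  # [0, 0, 0, 1]
```

Each pass makes only small adjustments, and over repeated passes the weights and bias settle on values that classify every example correctly.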
But 385 00:23:52,359 --> 00:23:55,960 Speaker 1: let's say our model is having trouble differentiating a lowercase 386 00:23:56,280 --> 00:24:00,480 Speaker 1: l from a lowercase i. It was just having issues 387 00:24:00,880 --> 00:24:04,280 Speaker 1: being able to tell those two apart in particular. Now 388 00:24:04,440 --> 00:24:07,320 Speaker 1: we've got a specific example in which our model is 389 00:24:07,400 --> 00:24:10,679 Speaker 1: misidentifying an l as an i, let's say, in this 390 00:24:10,760 --> 00:24:14,199 Speaker 1: hypothetical situation, and so we decide we're gonna make some 391 00:24:14,280 --> 00:24:18,280 Speaker 1: minor tweaks in the weights and biases earlier on in 392 00:24:18,320 --> 00:24:22,920 Speaker 1: the artificial neural network to guide our network so that 393 00:24:22,960 --> 00:24:27,000 Speaker 1: it can more readily tell the difference between lower 394 00:24:27,000 --> 00:24:29,239 Speaker 1: case l's and lower case i's. And we get our 395 00:24:29,280 --> 00:24:31,760 Speaker 1: model closer to being able to tell that difference. We 396 00:24:31,880 --> 00:24:35,400 Speaker 1: keep making these small adjustments until we get more consistent output. 397 00:24:35,840 --> 00:24:38,320 Speaker 1: The network as a whole is said to quote unquote 398 00:24:38,520 --> 00:24:41,639 Speaker 1: learn through this process. It's getting better at creating an 399 00:24:41,680 --> 00:24:45,359 Speaker 1: output that's more reflective of reality. But there's a bit of 400 00:24:45,359 --> 00:24:47,920 Speaker 1: a problem, and anyone who has worked in QA has 401 00:24:47,960 --> 00:24:52,200 Speaker 1: probably already spotted what it is. For everybody else, I'm 402 00:24:52,200 --> 00:24:54,840 Speaker 1: gonna explain it in just a minute, but first let's 403 00:24:54,840 --> 00:25:05,280 Speaker 1: take another quick break to thank our sponsor.
So what 404 00:25:05,359 --> 00:25:08,280 Speaker 1: was that problem I was talking about before the break? Well, 405 00:25:08,520 --> 00:25:11,200 Speaker 1: if you've ever worked in any sort of programming environment, 406 00:25:11,240 --> 00:25:14,760 Speaker 1: you know that when you introduce changes in code, you 407 00:25:14,840 --> 00:25:17,679 Speaker 1: might fix whatever problem you're focusing on at the moment, 408 00:25:18,080 --> 00:25:21,840 Speaker 1: but you might also break something else that's already in 409 00:25:21,880 --> 00:25:25,000 Speaker 1: the code. With perceptrons, that happens when you start tweaking 410 00:25:25,080 --> 00:25:28,679 Speaker 1: weights and biases, because a small change in one spot 411 00:25:28,760 --> 00:25:31,040 Speaker 1: in a network can have sort of a ripple effect 412 00:25:31,040 --> 00:25:36,480 Speaker 1: with unintended consequences. So, for example, in our little hypothetical situation, 413 00:25:36,520 --> 00:25:39,840 Speaker 1: maybe your new model can better tell the difference between 414 00:25:40,000 --> 00:25:42,560 Speaker 1: a lower case l and a lower case i, but 415 00:25:42,680 --> 00:25:46,360 Speaker 1: now the lowercase j is giving it problems. The way 416 00:25:46,400 --> 00:25:50,320 Speaker 1: perceptrons work, small changes in the network can produce much 417 00:25:50,520 --> 00:25:54,119 Speaker 1: larger variations in output, so it's sort of like the 418 00:25:54,200 --> 00:25:59,359 Speaker 1: butterfly effect in action. Computer scientists created a different type 419 00:25:59,440 --> 00:26:03,600 Speaker 1: of artificial neural network that addresses this problem, and this 420 00:26:03,640 --> 00:26:06,640 Speaker 1: type is called a sigmoid neuron. Really, I should say 421 00:26:06,680 --> 00:26:09,720 Speaker 1: they created a different type of artificial neuron. So, the 422 00:26:09,760 --> 00:26:12,040 Speaker 1: sigmoid neuron. What the heck is this?
Well, from a 423 00:26:12,119 --> 00:26:16,520 Speaker 1: high level, sigmoid neurons look kind of like perceptrons, but 424 00:26:17,200 --> 00:26:19,800 Speaker 1: while you'd use either a zero or a one 425 00:26:19,920 --> 00:26:23,359 Speaker 1: as the value for an input into a perceptron, a 426 00:26:23,480 --> 00:26:27,920 Speaker 1: sigmoid neuron can accept a zero, a one, or any 427 00:26:28,040 --> 00:26:32,200 Speaker 1: number in between zero and one. The output a sigmoid 428 00:26:32,240 --> 00:26:36,920 Speaker 1: neuron produces comes from the logistic function, or sigmoid function. 429 00:26:37,640 --> 00:26:41,080 Speaker 1: This gets a bit complicated on a surface level, particularly 430 00:26:41,320 --> 00:26:44,639 Speaker 1: if, like me, you're a little rusty on your algebra 431 00:26:44,720 --> 00:26:48,640 Speaker 1: and calculus, but generally speaking, the end result is that 432 00:26:48,920 --> 00:26:52,280 Speaker 1: using this type of artificial neuron, you can make small 433 00:26:52,359 --> 00:26:56,320 Speaker 1: changes to weights and biases and not create a larger 434 00:26:56,400 --> 00:26:59,840 Speaker 1: effect on the ultimate output. You'll still make small adjustments 435 00:26:59,840 --> 00:27:03,080 Speaker 1: to the output. There are a lot of resources online 436 00:27:03,080 --> 00:27:05,639 Speaker 1: that go into greater detail about sigmoid neurons. I'm not 437 00:27:05,680 --> 00:27:08,960 Speaker 1: going to go into more detail here because without visual 438 00:27:09,000 --> 00:27:12,240 Speaker 1: aids and being able to go into algebraic functions, it 439 00:27:12,320 --> 00:27:15,080 Speaker 1: gets a little hard for me to explain.
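The logistic, or sigmoid, function mentioned here is simple to write down: it squashes any number into the range between zero and one, which is what makes the neuron's output change smoothly instead of snapping between zero and one. A minimal sketch, with made-up weights and inputs:

```python
import math

def sigmoid(z):
    # Logistic function: maps any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_neuron(inputs, weights, bias):
    # Same weighted-sum-plus-bias as a perceptron, but a smooth output.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Unlike a perceptron's hard 0-or-1 step, a tiny change in a weight
# produces only a tiny change in the output.
print(sigmoid_neuron([0.5, 0.8], [1.00, -2.0], 0.5))  # ~0.354
print(sigmoid_neuron([0.5, 0.8], [1.01, -2.0], 0.5))  # ~0.355, barely moved
```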
But in 440 00:27:15,119 --> 00:27:18,399 Speaker 1: your typical neural network, you would have an input layer 441 00:27:18,760 --> 00:27:21,000 Speaker 1: and you would have an output layer, So you have 442 00:27:21,040 --> 00:27:23,879 Speaker 1: a layer where information comes in, and you would have 443 00:27:23,920 --> 00:27:27,199 Speaker 1: the output layer where new information comes out. But between 444 00:27:27,240 --> 00:27:30,360 Speaker 1: those two you would have what are called hidden layers. 445 00:27:31,000 --> 00:27:34,040 Speaker 1: That just really means that they're not input or output, 446 00:27:34,400 --> 00:27:37,520 Speaker 1: they're in the middle. Hidden makes it sound like they're 447 00:27:37,520 --> 00:27:41,960 Speaker 1: super clandestine and spy worthy and cool, but really they're 448 00:27:41,960 --> 00:27:45,199 Speaker 1: just in between input and output. They perform processes on 449 00:27:45,280 --> 00:27:48,320 Speaker 1: the inputs they receive, and they pass them on as 450 00:27:48,359 --> 00:27:53,520 Speaker 1: outputs to other neurons to have more processes put on 451 00:27:53,560 --> 00:27:56,440 Speaker 1: them until you finally get the output. The sort of 452 00:27:56,480 --> 00:28:00,840 Speaker 1: networks I've described so far are called feed forward networks, 453 00:28:01,440 --> 00:28:03,840 Speaker 1: and that means pretty much what it sounds like.
You plug 454 00:28:03,880 --> 00:28:08,120 Speaker 1: in inputs, the information passes one way through the network, 455 00:28:08,920 --> 00:28:11,960 Speaker 1: and you eventually get output as the information continues to move, 456 00:28:12,000 --> 00:28:15,200 Speaker 1: and we typically visualize this in a left to right 457 00:28:15,600 --> 00:28:20,000 Speaker 1: kind of display, so you imagine input coming in 458 00:28:20,040 --> 00:28:23,199 Speaker 1: from the left side, passing through this network, having various 459 00:28:23,200 --> 00:28:26,840 Speaker 1: processes put on it as each of these neurons 460 00:28:27,040 --> 00:28:31,760 Speaker 1: decides if it counts as a positive or a negative response, 461 00:28:32,200 --> 00:28:35,960 Speaker 1: or with sigmoid neurons, some degree in between, and then 462 00:28:35,960 --> 00:28:38,360 Speaker 1: plugging that into the next neuron until you finally get 463 00:28:38,360 --> 00:28:41,200 Speaker 1: to the output. It always gets fed forward. But that's 464 00:28:41,480 --> 00:28:44,320 Speaker 1: not the only type of artificial neural network. There are 465 00:28:44,360 --> 00:28:48,880 Speaker 1: also things called recurrent neural networks, in which neurons fire 466 00:28:48,920 --> 00:28:52,000 Speaker 1: for some predetermined amount of time. Then they typically settle 467 00:28:52,080 --> 00:28:55,160 Speaker 1: down and stop firing, but the next group 468 00:28:55,240 --> 00:28:57,400 Speaker 1: of neurons start to fire. This creates kind of a 469 00:28:57,400 --> 00:29:01,560 Speaker 1: cascade effect through the network, and occasionally there could 470 00:29:01,560 --> 00:29:06,000 Speaker 1: be neurons that feed back into previous neurons. There's a 471 00:29:06,040 --> 00:29:10,720 Speaker 1: feedback loop.
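A feed-forward network of sigmoid neurons, input layer on the left, a hidden layer in the middle, an output layer on the right, can be sketched like this. The layer sizes, weights, and biases are arbitrary made-up numbers, just to show information passing one way through the layers with no feedback:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # One layer: each neuron takes a weighted sum plus bias, then sigmoid.
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def feedforward(inputs, network):
    # Pass activations left to right through each layer in turn.
    activations = inputs
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations

# Hypothetical shape: 2 inputs -> 3 hidden neurons -> 1 output neuron.
network = [
    ([[0.5, -0.6], [1.2, 0.3], [-0.4, 0.9]], [0.1, -0.2, 0.0]),  # hidden
    ([[1.0, -1.5, 0.7]], [0.2]),                                 # output
]
print(feedforward([0.8, 0.4], network))  # one value between 0 and 1
```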
It's more challenging to make a powerful learning 472 00:29:10,760 --> 00:29:14,959 Speaker 1: algorithm with recurrent neural networks because it gets super duper complicated. 473 00:29:15,400 --> 00:29:20,360 Speaker 1: But recurrent neural networks offer potentially huge utility in the future. 474 00:29:20,560 --> 00:29:24,080 Speaker 1: So an artificial neural network can be made up 475 00:29:24,480 --> 00:29:28,120 Speaker 1: of as few as a few dozen artificial neurons all 476 00:29:28,160 --> 00:29:31,520 Speaker 1: the way up to millions of artificial neurons, and we 477 00:29:31,640 --> 00:29:36,280 Speaker 1: train them through various processes such as back propagation. Now 478 00:29:36,320 --> 00:29:39,520 Speaker 1: that's when you take the actual output of the process 479 00:29:39,560 --> 00:29:42,000 Speaker 1: and you compare it to what you wanted it to produce, 480 00:29:42,640 --> 00:29:45,520 Speaker 1: and then you use the difference between those two results 481 00:29:45,520 --> 00:29:48,520 Speaker 1: to make changes to the weights and biases in the network. 482 00:29:48,640 --> 00:29:53,560 Speaker 1: So here's an example where we're training our network to 483 00:29:53,640 --> 00:29:56,360 Speaker 1: recognize pictures of cats, because this has actually been done. 484 00:29:56,440 --> 00:30:00,480 Speaker 1: Google famously did this. So you're training your network to 485 00:30:00,520 --> 00:30:04,640 Speaker 1: recognize what a cat is based upon a picture, and 486 00:30:04,720 --> 00:30:07,520 Speaker 1: you use a picture that you know is a picture 487 00:30:07,600 --> 00:30:10,080 Speaker 1: of a cat, so you already know the answer to this. 488 00:30:10,360 --> 00:30:12,880 Speaker 1: You're teaching the computer to learn the answer to this.
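Before going on with the cat example, the training idea just described, compare the actual output with the desired output and use the difference to adjust the weights and biases, can be sketched for a single sigmoid neuron. Full backpropagation extends this same error-driven update backward through many layers; this one-neuron version, with made-up inputs and a made-up target, only shows the core step:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(inputs, weights, bias, target, lr=0.5):
    # Compute the actual output, compare it with the target, and move
    # each weight and the bias slightly to shrink that difference.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    out = sigmoid(z)
    # Gradient of the squared error (out - target)^2 with respect to z.
    grad = (out - target) * out * (1 - out)
    new_weights = [w - lr * grad * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * grad
    return new_weights, new_bias, out

w, b = [0.3, -0.1], 0.0
for _ in range(200):
    w, b, out = train_step([1.0, 0.5], w, b, target=0.9)
print(round(out, 2))  # the output has climbed close to the 0.9 target
```

Each step changes the output only a little, which is exactly why sigmoid neurons make this kind of gradual tuning possible.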
489 00:30:13,320 --> 00:30:16,320 Speaker 1: You know that the answer is cat, and you feed 490 00:30:16,640 --> 00:30:20,360 Speaker 1: the image through this system. It analyzes the data, it 491 00:30:20,400 --> 00:30:23,960 Speaker 1: gives you an output, and you see how well it did. 492 00:30:24,160 --> 00:30:27,400 Speaker 1: Did it correctly identify the image as a cat? Did 493 00:30:27,440 --> 00:30:33,520 Speaker 1: it assign a certain level of certainty to its conclusion? 494 00:30:34,120 --> 00:30:37,000 Speaker 1: And if it's far off, you could start making changes 495 00:30:37,040 --> 00:30:40,560 Speaker 1: to those weights and biases to help guide the system 496 00:30:40,640 --> 00:30:44,000 Speaker 1: into determining, oh, yes, that is a cat. Training a 497 00:30:44,040 --> 00:30:47,320 Speaker 1: network multiple times refines this process to the point where 498 00:30:47,800 --> 00:30:51,080 Speaker 1: you can start to introduce brand new inputs to the system, 499 00:30:51,200 --> 00:30:54,200 Speaker 1: inputs that the system has never encountered before, and get 500 00:30:54,240 --> 00:30:58,960 Speaker 1: a reliable result. So with Google's example, you might feed 501 00:30:59,000 --> 00:31:03,160 Speaker 1: it thousands or tens of thousands, or hundreds of thousands 502 00:31:03,280 --> 00:31:07,400 Speaker 1: or more images of cats, and each time the system 503 00:31:07,520 --> 00:31:10,320 Speaker 1: is told that there is a cat in this image, 504 00:31:10,800 --> 00:31:14,240 Speaker 1: and it begins to refine its approach, figuring out which 505 00:31:14,280 --> 00:31:16,720 Speaker 1: weights and biases it needs to tweak in order to 506 00:31:16,760 --> 00:31:20,320 Speaker 1: get to that result. And then you feed it a 507 00:31:20,400 --> 00:31:23,280 Speaker 1: whole group of new images and you don't tell it 508 00:31:23,480 --> 00:31:26,240 Speaker 1: if there are cats in those images or not.
Then 509 00:31:26,280 --> 00:31:28,440 Speaker 1: you leave it to the system to determine are there 510 00:31:28,520 --> 00:31:32,360 Speaker 1: cats in these pictures? And if you have trained it properly, 511 00:31:32,720 --> 00:31:37,840 Speaker 1: if those weights and biases are actually well tweaked, then 512 00:31:38,080 --> 00:31:40,640 Speaker 1: the system should be able to reliably pick out the 513 00:31:40,680 --> 00:31:45,000 Speaker 1: pictures that have cats in them. That's the idea. Now, 514 00:31:45,040 --> 00:31:47,840 Speaker 1: there's tons more to be said about artificial neural networks, 515 00:31:47,840 --> 00:31:51,520 Speaker 1: but I've given a quick overview. Let's 516 00:31:51,600 --> 00:31:54,120 Speaker 1: jump back over to Apple for a second, because 517 00:31:54,240 --> 00:31:56,160 Speaker 1: that was the whole purpose of this episode. So what 518 00:31:56,360 --> 00:32:00,440 Speaker 1: is a neural engine actually used for? Well, for the iPhone, 519 00:32:00,760 --> 00:32:04,400 Speaker 1: it's used mainly in processing speech and image data. It's 520 00:32:04,440 --> 00:32:07,120 Speaker 1: the neural engine that can analyze your face, for example, 521 00:32:07,440 --> 00:32:11,000 Speaker 1: and then translate your expressions into animated form. You can 522 00:32:11,000 --> 00:32:15,320 Speaker 1: create animated emoji this way. So you could use the 523 00:32:15,440 --> 00:32:21,320 Speaker 1: little application and create a customized surprised emoji that copies 524 00:32:21,320 --> 00:32:22,800 Speaker 1: the way you look when you make a sort of 525 00:32:22,800 --> 00:32:26,600 Speaker 1: exaggerated surprised face. You could do that.
The neural 526 00:32:26,600 --> 00:32:30,600 Speaker 1: engine takes the incoming data, the images it's pulling from 527 00:32:30,680 --> 00:32:34,200 Speaker 1: the camera, analyzes it, and then helps create an animated 528 00:32:34,240 --> 00:32:37,440 Speaker 1: image that mirrors what you did. The neural engine also 529 00:32:37,480 --> 00:32:41,280 Speaker 1: analyzes visual data for the purposes of augmented reality. That's 530 00:32:41,320 --> 00:32:44,240 Speaker 1: when you overlay digital information on top of a view 531 00:32:44,360 --> 00:32:48,000 Speaker 1: of the physical world around you. So with smartphones like 532 00:32:48,000 --> 00:32:50,680 Speaker 1: the iPhone, it means holding your phone up and looking 533 00:32:51,000 --> 00:32:54,280 Speaker 1: at the world through your phone screen. So the camera 534 00:32:54,600 --> 00:32:57,920 Speaker 1: on your phone is giving you a live video feed 535 00:32:58,480 --> 00:33:01,120 Speaker 1: of whatever you're pointing the phone at, and then 536 00:33:02,040 --> 00:33:04,520 Speaker 1: you use an augmented reality app, and on top of 537 00:33:04,560 --> 00:33:07,400 Speaker 1: that video image your phone will overlay some sort of 538 00:33:07,440 --> 00:33:10,800 Speaker 1: digital information. Could be a game, it could be information 539 00:33:10,840 --> 00:33:14,480 Speaker 1: about your surroundings. The digital information can appear to be 540 00:33:14,520 --> 00:33:18,800 Speaker 1: anchored to the physical space itself. So you could have 541 00:33:18,840 --> 00:33:21,680 Speaker 1: an augmented reality application that lets you view a virtual 542 00:33:21,680 --> 00:33:24,240 Speaker 1: piece of furniture in your house.
And so when you 543 00:33:24,240 --> 00:33:27,080 Speaker 1: hold up the phone, you use the app to place 544 00:33:27,240 --> 00:33:30,560 Speaker 1: a virtual chair, let's say, in a specific location in 545 00:33:30,560 --> 00:33:33,400 Speaker 1: a room, and you can walk around this virtual chair 546 00:33:33,440 --> 00:33:35,680 Speaker 1: holding your phone up, and it looks like the chair 547 00:33:35,760 --> 00:33:38,800 Speaker 1: is actually there, even as your perspective changes. You can 548 00:33:38,840 --> 00:33:41,680 Speaker 1: circle around it and view the chair from all the 549 00:33:41,720 --> 00:33:44,560 Speaker 1: different angles as if it were actually sitting there in 550 00:33:44,600 --> 00:33:48,200 Speaker 1: the room. It's anchored to the place you've put 551 00:33:48,240 --> 00:33:51,320 Speaker 1: it within the view of the room. The neural engine 552 00:33:51,360 --> 00:33:54,040 Speaker 1: is analyzing all this information that's coming in from the 553 00:33:54,080 --> 00:33:56,920 Speaker 1: camera and helping the app create the image of the chair, 554 00:33:57,360 --> 00:34:00,880 Speaker 1: keeping it the appropriate size and orientation with respect to your 555 00:34:00,960 --> 00:34:04,360 Speaker 1: viewing angle. And the neural engine can use this ability 556 00:34:04,400 --> 00:34:07,080 Speaker 1: to help you go through stuff like your photos. Let's 557 00:34:07,120 --> 00:34:10,720 Speaker 1: say you've got an adorable pet, like my doggie Tibolt. 558 00:34:10,920 --> 00:34:15,640 Speaker 1: He is adorable. The iPhone can use its neural engine 559 00:34:15,680 --> 00:34:19,480 Speaker 1: and image recognition algorithms to return the pictures of your 560 00:34:19,520 --> 00:34:22,200 Speaker 1: pet in response to a search query. So my wife, 561 00:34:22,200 --> 00:34:24,760 Speaker 1: who has an iPhone, could do this with our dogs.
562 00:34:24,800 --> 00:34:27,440 Speaker 1: She could search for the word dog in her photo 563 00:34:27,520 --> 00:34:31,440 Speaker 1: app and then she would get countless images of Tibolt. 564 00:34:31,719 --> 00:34:35,440 Speaker 1: And I know it works because she's done it. Apple 565 00:34:35,480 --> 00:34:38,760 Speaker 1: has included access to the neural engine so that app 566 00:34:38,840 --> 00:34:41,799 Speaker 1: developers can actually take advantage of that technology as well. 567 00:34:41,800 --> 00:34:45,040 Speaker 1: They'll doubtlessly create new ways to leverage this tech, so 568 00:34:45,200 --> 00:34:46,840 Speaker 1: we'll have to keep our eyes open to see what 569 00:34:46,920 --> 00:34:49,640 Speaker 1: comes out of it. Neural networks in general are becoming 570 00:34:49,719 --> 00:34:54,480 Speaker 1: increasingly important in machine learning and artificial intelligence, so it's 571 00:34:54,480 --> 00:34:57,000 Speaker 1: likely to grow as a branch of computer science for 572 00:34:57,120 --> 00:35:00,480 Speaker 1: the next several years. And that wraps up this episode. 573 00:35:00,840 --> 00:35:04,080 Speaker 1: If you have suggestions for future episodes of tech Stuff, 574 00:35:04,080 --> 00:35:07,560 Speaker 1: maybe it's a technology, a person in tech, a company, anything 575 00:35:07,600 --> 00:35:10,400 Speaker 1: like that, send me an email. The address is tech Stuff 576 00:35:10,440 --> 00:35:12,880 Speaker 1: at how stuff works dot com. You can drop me 577 00:35:12,920 --> 00:35:15,879 Speaker 1: a line on Facebook or Twitter. The handle for both 578 00:35:15,880 --> 00:35:19,680 Speaker 1: of those is tech Stuff H S W. Don't forget, 579 00:35:20,239 --> 00:35:23,799 Speaker 1: we have a merchandise store over at t public dot 580 00:35:23,800 --> 00:35:27,080 Speaker 1: com slash tech stuff. That's T E E public dot 581 00:35:27,120 --> 00:35:30,479 Speaker 1: com slash tech stuff.
You can go and get your 582 00:35:30,520 --> 00:35:35,280 Speaker 1: CAPTCHA test, the prove you're not a robot sticker 583 00:35:35,400 --> 00:35:37,759 Speaker 1: or T shirt or tote bag or whatever type of 584 00:35:37,840 --> 00:35:40,160 Speaker 1: thing you would like that on. It's pretty cool, so 585 00:35:40,200 --> 00:35:42,440 Speaker 1: go check that out, and don't forget to follow us 586 00:35:42,440 --> 00:35:45,680 Speaker 1: on Instagram and I'll talk to you again really soon. 587 00:35:51,480 --> 00:35:54,000 Speaker 1: For more on this and thousands of other topics, visit 588 00:35:54,040 --> 00:36:02,080 Speaker 1: how stuff works dot com.