1 00:00:05,120 --> 00:00:07,440 Speaker 1: AI seems like it burst out of the gate a 2 00:00:07,480 --> 00:00:10,920 Speaker 1: few years ago, but is it actually the latest chapter 3 00:00:11,119 --> 00:00:16,040 Speaker 1: in a three hundred year trajectory to turn thought into math? 4 00:00:16,680 --> 00:00:19,400 Speaker 2: Can the mind be captured with equations? 5 00:00:19,960 --> 00:00:23,400 Speaker 1: Why do current AI models need petabytes of data but 6 00:00:23,480 --> 00:00:26,759 Speaker 1: a child can learn from just a few examples? Why 7 00:00:26,760 --> 00:00:31,440 Speaker 1: does AI have jagged intelligence, meaning it looks brilliant in 8 00:00:31,480 --> 00:00:36,080 Speaker 1: one moment and then it does something totally nonsensical? In physics, 9 00:00:36,440 --> 00:00:39,479 Speaker 1: we have various laws, like the law of gravity or 10 00:00:39,520 --> 00:00:43,320 Speaker 1: the laws of motion. And today we're joined by cognitive 11 00:00:43,360 --> 00:00:47,080 Speaker 1: scientist Tom Griffiths from Princeton to talk about whether we 12 00:00:47,159 --> 00:00:55,400 Speaker 1: are moving towards nailing down laws of thought. Welcome to 13 00:00:55,400 --> 00:00:58,880 Speaker 1: Inner Cosmos with me, David Eagleman. I'm a neuroscientist and 14 00:00:58,920 --> 00:01:02,400 Speaker 1: author at Stanford, and in these episodes we sail deeply 15 00:01:02,440 --> 00:01:05,960 Speaker 1: into our three pound universe to understand why and how 16 00:01:05,800 --> 00:01:07,880 Speaker 2: our lives look the way they do. 17 00:01:23,920 --> 00:01:27,000 Speaker 1: One thing that distinguishes Homo sapiens from all our cousins 18 00:01:27,040 --> 00:01:30,119 Speaker 1: in the animal kingdom is that we watch the world 19 00:01:30,120 --> 00:01:34,160 Speaker 1: around us and we try to abstract patterns from it. 20 00:01:34,600 --> 00:01:37,679 Speaker 1: For example, you might watch the way that a stone 21 00:01:37,760 --> 00:01:41,039 Speaker 1: falls to the ground, and maybe you see a tree 22 00:01:41,120 --> 00:01:44,200 Speaker 1: branch fall, and maybe you see a glacier and one 23 00:01:44,240 --> 00:01:46,959 Speaker 1: day a huge wall of ice falls off it, and 24 00:01:47,040 --> 00:01:51,000 Speaker 1: pretty soon you start seeing an underlying similarity to the 25 00:01:51,040 --> 00:01:54,600 Speaker 1: way that things move. And eventually someone very, very smart 26 00:01:54,680 --> 00:01:58,360 Speaker 1: comes along, like Isaac Newton, and summarizes all this in 27 00:01:58,600 --> 00:02:02,080 Speaker 1: the law of gravity. And then the same smart guy, 28 00:02:02,160 --> 00:02:05,600 Speaker 1: Newton, comes up with the three laws of motion. And 29 00:02:05,640 --> 00:02:08,560 Speaker 1: then another smart person is Einstein. He figures out the 30 00:02:08,919 --> 00:02:12,840 Speaker 1: conservation of mass and energy, which seems to be another 31 00:02:13,040 --> 00:02:16,919 Speaker 1: ironclad law, and then we have the laws of thermodynamics 32 00:02:16,960 --> 00:02:20,320 Speaker 1: and electrostatic laws, and all of this speaks to the 33 00:02:20,360 --> 00:02:24,079 Speaker 1: great success that we've had as a species in figuring 34 00:02:24,120 --> 00:02:28,160 Speaker 1: out the lowest level of code that's running in the universe.
35 00:02:28,880 --> 00:02:32,120 Speaker 1: But for most of human history, the concept of a 36 00:02:32,600 --> 00:02:35,840 Speaker 1: thought has felt like the most intimate thing we experience 37 00:02:35,960 --> 00:02:38,560 Speaker 1: and the least tractable thing to study. 38 00:02:39,200 --> 00:02:41,840 Speaker 2: What a thought is and how it occurs, 39 00:02:42,280 --> 00:02:46,200 Speaker 1: that seems to live in a different category of mystery 40 00:02:46,240 --> 00:02:49,560 Speaker 1: from how an object falls. Why? Well, it's because the 41 00:02:49,639 --> 00:02:53,600 Speaker 1: thought pops into your head and somehow it carries memory 42 00:02:53,680 --> 00:02:57,760 Speaker 1: and expectation and language and often a feeling. But it 43 00:02:57,840 --> 00:03:02,440 Speaker 1: feels vaporous and private. It feels like the one thing 44 00:03:02,960 --> 00:03:07,880 Speaker 1: that will forever escape formal description. But what's interesting is 45 00:03:07,919 --> 00:03:11,680 Speaker 1: that for centuries people have tried. There's always been a 46 00:03:11,800 --> 00:03:13,399 Speaker 1: deep human urge 47 00:03:13,120 --> 00:03:16,480 Speaker 2: to ask whether thought has laws to it. 48 00:03:16,560 --> 00:03:19,720 Speaker 1: In other words, does the mind have principles that you 49 00:03:19,760 --> 00:03:23,760 Speaker 1: can write down? Does reasoning have a grammar to it? 50 00:03:24,120 --> 00:03:27,960 Speaker 1: Can you describe intelligence in a language that's precise enough 51 00:03:28,360 --> 00:03:31,440 Speaker 1: that once you understand the rules, you can begin to 52 00:03:31,480 --> 00:03:35,800 Speaker 1: build with them, like build artificial intelligence? Most of us 53 00:03:35,840 --> 00:03:38,400 Speaker 1: are old enough to remember that this question of AI 54 00:03:38,960 --> 00:03:42,840 Speaker 1: once lived in philosophy seminars and math departments, but now 55 00:03:42,880 --> 00:03:45,320 Speaker 1: it's sitting at the center of our economy. 56 00:03:46,280 --> 00:03:48,200 Speaker 2: Okay, so what is thought? 57 00:03:48,440 --> 00:03:52,640 Speaker 1: Can we capture it in formal systems like laws or equations? 58 00:03:53,040 --> 00:03:57,600 Speaker 1: Do different parts of intelligence come from logic, from learning, 59 00:03:57,720 --> 00:04:03,840 Speaker 1: from uncertainty, from memory, from prior knowledge, from living inside bodies, 60 00:04:03,920 --> 00:04:06,120 Speaker 1: from living inside our cultures? 61 00:04:06,640 --> 00:04:08,040 Speaker 2: From the particular 62 00:04:07,600 --> 00:04:11,120 Speaker 1: constraints of being a human animal with a short lifespan 63 00:04:11,520 --> 00:04:15,320 Speaker 1: and limited bandwidth? Our guest today is someone who lives 64 00:04:15,400 --> 00:04:19,240 Speaker 1: right at the intersection of all these questions. Tom Griffiths 65 00:04:19,320 --> 00:04:22,880 Speaker 1: is a professor at Princeton, where he directs the Computational 66 00:04:22,960 --> 00:04:27,680 Speaker 1: Cognitive Science Lab and the Princeton Laboratory for Artificial Intelligence. 67 00:04:28,160 --> 00:04:31,719 Speaker 1: He has spent years asking how minds work through the 68 00:04:31,839 --> 00:04:35,760 Speaker 1: different lenses of math and computation and learning.
And he's 69 00:04:35,760 --> 00:04:38,919 Speaker 1: the author of a wonderful new book called The Laws 70 00:04:38,960 --> 00:04:43,719 Speaker 1: of Thought, which traces the long history of thinkers asking 71 00:04:44,080 --> 00:04:46,719 Speaker 1: are there rules to this? Can we understand what human 72 00:04:46,760 --> 00:04:50,839 Speaker 1: thinking is? In his book we get the lengthy arc 73 00:04:51,040 --> 00:04:55,839 Speaker 1: of minds trying to understand mind. This begins millennia ago 74 00:04:55,920 --> 00:05:00,520 Speaker 1: with Aristotle, who wondered whether logic itself could be mathematized, 75 00:05:00,960 --> 00:05:05,520 Speaker 1: and Tom follows the trail through the architects of symbolic reasoning, 76 00:05:05,960 --> 00:05:09,240 Speaker 1: through the birth of computation, through the rise of neural networks, 77 00:05:09,640 --> 00:05:13,680 Speaker 1: through the realization that probability theory might serve as a 78 00:05:13,800 --> 00:05:17,719 Speaker 1: language for our beliefs about things. Along the way, in 79 00:05:17,760 --> 00:05:20,880 Speaker 1: his book, a picture emerges that there may not be 80 00:05:21,320 --> 00:05:24,440 Speaker 1: just a single tool for capturing the mind, but instead 81 00:05:24,440 --> 00:05:27,760 Speaker 1: there are different ways of trying to tackle the problem, 82 00:05:28,120 --> 00:05:32,520 Speaker 1: and each one sheds light on a different aspect of cognition. 83 00:05:33,240 --> 00:05:36,360 Speaker 1: So we're going to talk about ourselves, human minds, and 84 00:05:36,440 --> 00:05:40,160 Speaker 1: we'll talk about AI: what kind of intelligence is this, 85 00:05:40,440 --> 00:05:47,480 Speaker 1: and what is missing? Here's my interview with Tom Griffiths. 86 00:05:48,160 --> 00:05:50,320 Speaker 3: As soon as you turn thought into math, it becomes 87 00:05:50,320 --> 00:05:52,480 Speaker 3: something that machines would be able to do. And so 88 00:05:52,600 --> 00:05:57,160 Speaker 3: our modern AI systems are really a consequence of, you know, 89 00:05:57,200 --> 00:06:00,520 Speaker 3: that thought that people were having hundreds of years ago, 90 00:06:01,080 --> 00:06:04,160 Speaker 3: of being able to turn thought into something that can 91 00:06:04,160 --> 00:06:05,800 Speaker 3: be expressed in mathematical terms. 92 00:06:05,920 --> 00:06:07,440 Speaker 1: And so one of the things that I loved about 93 00:06:07,440 --> 00:06:09,400 Speaker 1: your book, by the way, is that you really tell 94 00:06:09,560 --> 00:06:11,640 Speaker 1: stories of all the thinkers. 95 00:06:12,240 --> 00:06:14,880 Speaker 2: You dive into the lives, you tell them with real color. 96 00:06:15,160 --> 00:06:17,440 Speaker 1: If you were going to start with one thinker that 97 00:06:17,480 --> 00:06:19,160 Speaker 1: you think is the most important, who would that be? 98 00:06:19,400 --> 00:06:21,160 Speaker 3: There are a couple of people who have this sort 99 00:06:21,160 --> 00:06:23,960 Speaker 3: of enduring influence throughout the book. One of them is Leibniz, 100 00:06:24,040 --> 00:06:26,680 Speaker 3: who kind of started this enterprise in some sense. He 101 00:06:26,760 --> 00:06:30,360 Speaker 3: was really trying to take the idea of logic as 102 00:06:30,400 --> 00:06:33,280 Speaker 3: expressed by Aristotle and turn it into math, but ultimately 103 00:06:33,279 --> 00:06:35,960 Speaker 3: failed in doing that.
But along the way he also 104 00:06:36,120 --> 00:06:38,799 Speaker 3: discovered the calculus, which turned out to be really important 105 00:06:38,800 --> 00:06:41,440 Speaker 3: when people wanted to make neural networks that could learn 106 00:06:41,520 --> 00:06:44,480 Speaker 3: from data. It turns out that the trick for doing 107 00:06:44,560 --> 00:06:46,840 Speaker 3: that is actually a trick that Leibniz had figured out 108 00:06:46,880 --> 00:06:50,599 Speaker 3: all that time ago. And then another key figure here, 109 00:06:50,760 --> 00:06:53,320 Speaker 3: as might be suggested by the title of the book, 110 00:06:53,400 --> 00:06:57,800 Speaker 3: is George Boole, who was a nineteenth century mathematician. He 111 00:06:57,920 --> 00:06:59,600 Speaker 3: was a school teacher for most of his life and 112 00:06:59,680 --> 00:07:01,600 Speaker 3: did a lot of, like, serious math on the side 113 00:07:01,640 --> 00:07:03,680 Speaker 3: and, you know, had a big effect on the 114 00:07:03,720 --> 00:07:08,080 Speaker 3: history of mathematics. But he was really the person who 115 00:07:08,160 --> 00:07:11,560 Speaker 3: then first solved that problem that Leibniz had posed. And 116 00:07:11,600 --> 00:07:15,480 Speaker 3: in addition to the impact that that work had, he's 117 00:07:15,560 --> 00:07:19,000 Speaker 3: also the great-great-grandfather of Geoff Hinton, who was one 118 00:07:19,000 --> 00:07:21,400 Speaker 3: of the people who played an important role in developing 119 00:07:21,440 --> 00:07:24,040 Speaker 3: these algorithms for learning in neural networks. And so 120 00:07:24,200 --> 00:07:26,240 Speaker 3: you could make an argument that without Boole we would 121 00:07:26,240 --> 00:07:29,480 Speaker 3: be a fair way back from where we are today. 122 00:07:29,920 --> 00:07:30,080 Speaker 2: You know. 123 00:07:30,200 --> 00:07:33,160 Speaker 1: Interestingly, when most people think about Boole, they only know 124 00:07:33,320 --> 00:07:38,360 Speaker 1: about Boolean numbers. They know about zero and one, binary numbers, 125 00:07:38,680 --> 00:07:41,520 Speaker 1: and that's essentially the extent of the thing. But he 126 00:07:41,560 --> 00:07:43,760 Speaker 1: was quite celebrated in his life, right, even though he 127 00:07:43,840 --> 00:07:48,440 Speaker 1: was a headmaster and not formally involved as a professor. 128 00:07:48,560 --> 00:07:51,000 Speaker 1: Am I correct about this? He nonetheless was quite recognized 129 00:07:51,000 --> 00:07:51,800 Speaker 1: as a mathematician. 130 00:07:52,400 --> 00:07:55,080 Speaker 3: Yeah, he became a university professor later in his life, 131 00:07:55,120 --> 00:07:57,080 Speaker 3: but spent most of his life as a teacher and 132 00:07:57,120 --> 00:08:01,080 Speaker 3: a headmaster. But yeah, he won a gold medal 133 00:08:01,080 --> 00:08:04,080 Speaker 3: in mathematics from the Royal Society, which was a very prestigious award, 134 00:08:05,080 --> 00:08:08,960 Speaker 3: and you know, was this amazing person who was having 135 00:08:09,000 --> 00:08:11,880 Speaker 3: these high-level correspondences with the leading mathematicians of the 136 00:08:11,960 --> 00:08:16,360 Speaker 3: day while holding down his job running a small school. 137 00:08:16,960 --> 00:08:17,640 Speaker 2: Yeah. 138 00:08:17,800 --> 00:08:22,840 Speaker 1: Now, in the book, you essentially use three different frameworks. 139 00:08:22,920 --> 00:08:26,440 Speaker 1: What phenomenon does each framework explain
140 00:08:26,560 --> 00:08:27,960 Speaker 2: unusually well? 141 00:08:27,840 --> 00:08:29,880 Speaker 3: The three frameworks I talk about in the book are what 142 00:08:29,920 --> 00:08:32,360 Speaker 3: I call rules and symbols, which is what we've been 143 00:08:32,400 --> 00:08:34,800 Speaker 3: talking about, this kind of, like, approach that stems out 144 00:08:34,840 --> 00:08:37,120 Speaker 3: of logic, where the idea is that you're going to 145 00:08:37,200 --> 00:08:39,719 Speaker 3: be able to write down some rules that characterize the 146 00:08:39,720 --> 00:08:41,960 Speaker 3: structure of thought, and by following those rules, you end 147 00:08:42,000 --> 00:08:47,319 Speaker 3: up with interesting consequences. The second approach is networks, features 148 00:08:47,320 --> 00:08:47,840 Speaker 3: and spaces. 149 00:08:47,920 --> 00:08:48,079 Speaker 4: Right. 150 00:08:48,120 --> 00:08:50,640 Speaker 3: This is neural networks, which you can kind of think 151 00:08:50,640 --> 00:08:54,160 Speaker 3: about as a system for doing computation when you start 152 00:08:54,200 --> 00:08:56,839 Speaker 3: representing things as points in a space. Right, So if 153 00:08:56,840 --> 00:09:01,600 Speaker 3: you start to think about, you know, every object that 154 00:09:01,640 --> 00:09:03,360 Speaker 3: you could see in the world as not being something 155 00:09:03,400 --> 00:09:05,920 Speaker 3: that's described by rules, but being described by a location 156 00:09:06,040 --> 00:09:08,560 Speaker 3: along some dimensions, you need to have a way of 157 00:09:08,600 --> 00:09:11,400 Speaker 3: talking about how to map between those spaces, and 158 00:09:11,440 --> 00:09:14,120 Speaker 3: neural networks solve that problem. And then the third is 159 00:09:14,960 --> 00:09:20,439 Speaker 3: probability and statistics. And probability theory is really powerful because 160 00:09:20,720 --> 00:09:24,120 Speaker 3: it is the complement to logic, where logic tells us 161 00:09:24,160 --> 00:09:26,120 Speaker 3: how to go from things that we know to be 162 00:09:26,200 --> 00:09:28,959 Speaker 3: true to other things that we're equally certain are true. 163 00:09:29,360 --> 00:09:31,680 Speaker 3: Probability theory tells us what to do when we're uncertain. 164 00:09:32,160 --> 00:09:34,920 Speaker 3: So if we get some information we want to draw 165 00:09:34,920 --> 00:09:37,319 Speaker 3: a conclusion, but we're not able to draw that conclusion 166 00:09:37,360 --> 00:09:40,559 Speaker 3: with perfect certainty, probability theory tells us how to do that, 167 00:09:40,920 --> 00:09:44,000 Speaker 3: and it tells us how to combine our sort of 168 00:09:44,400 --> 00:09:49,320 Speaker 3: background beliefs, the other sources of information we have, our 169 00:09:49,360 --> 00:09:52,120 Speaker 3: biases, in with the data that we see in a 170 00:09:52,120 --> 00:09:54,160 Speaker 3: way that helps us to explain how it's possible to 171 00:09:54,240 --> 00:09:56,440 Speaker 3: learn from small amounts of data. And that's one thing 172 00:09:56,480 --> 00:09:59,199 Speaker 3: which is still something that discriminates human learning from the 173 00:09:59,280 --> 00:10:01,120 Speaker 3: learning that's done by AI systems today. 174 00:10:01,520 --> 00:10:03,560 Speaker 1: Okay, great, so we're going to dive into each of 175 00:10:03,600 --> 00:10:06,560 Speaker 1: these three lenses.
But just before we do, do you 176 00:10:06,640 --> 00:10:11,680 Speaker 1: see the AI conversation today over-indexing on one of 177 00:10:11,720 --> 00:10:13,160 Speaker 1: these lenses over the others? 178 00:10:14,760 --> 00:10:17,520 Speaker 3: I think there's a lot of emphasis on neural networks, 179 00:10:17,559 --> 00:10:21,240 Speaker 3: which are fundamentally the sort of engineering technology which is 180 00:10:21,280 --> 00:10:25,360 Speaker 3: making possible the creation of our chatbots and the other 181 00:10:25,520 --> 00:10:29,600 Speaker 3: sort of big AI systems that are deployed. I think 182 00:10:30,200 --> 00:10:34,360 Speaker 3: that potentially misses out on the importance of these other threads, 183 00:10:34,559 --> 00:10:37,679 Speaker 3: right, where one thing that's important to remember is that 184 00:10:37,720 --> 00:10:40,680 Speaker 3: those neural networks are being trained on what is essentially 185 00:10:40,720 --> 00:10:43,200 Speaker 3: a system of rules and symbols. They're being trained on 186 00:10:43,720 --> 00:10:47,480 Speaker 3: human language, which is symbolic and rule-like in various ways, 187 00:10:48,160 --> 00:10:50,680 Speaker 3: and they're being trained on code, which is even more 188 00:10:50,720 --> 00:10:54,240 Speaker 3: symbolic and even more rule-like. And those things together 189 00:10:54,280 --> 00:10:56,559 Speaker 3: provide some of the substrate for developing the kind of 190 00:10:56,600 --> 00:10:59,280 Speaker 3: intelligence that they demonstrate. And then the way that they're 191 00:10:59,320 --> 00:11:03,079 Speaker 3: trained is by learning to predict the next token, right, 192 00:11:03,120 --> 00:11:05,080 Speaker 3: the next word or part of a word, based on what 193 00:11:05,080 --> 00:11:07,760 Speaker 3: they've seen so far. And that way of training them 194 00:11:07,800 --> 00:11:12,040 Speaker 3: is actually using probability theory. So that's a probabilistic problem 195 00:11:12,040 --> 00:11:14,079 Speaker 3: because you're making a guess about what the next thing 196 00:11:14,120 --> 00:11:15,920 Speaker 3: is going to be based on the things that you see, 197 00:11:16,200 --> 00:11:18,520 Speaker 3: and so that's an important ingredient in their success as well, 198 00:11:18,559 --> 00:11:22,000 Speaker 3: is that they're essentially learning to approximate a big probability distribution. 199 00:11:22,360 --> 00:11:25,079 Speaker 1: So let's dive into the first one, rules and symbols. 200 00:11:25,280 --> 00:11:28,600 Speaker 1: So take us back to the original urge. Why did 201 00:11:28,679 --> 00:11:33,559 Speaker 1: early thinkers believe that this could be used to explain thinking?
202 00:11:35,200 --> 00:11:38,240 Speaker 3: I think a lot of the draw of rules and 203 00:11:38,280 --> 00:11:41,760 Speaker 3: symbols was that that really was, in some way, what 204 00:11:42,120 --> 00:11:45,400 Speaker 3: mathematics was to people, right. So Leibniz, part of the 205 00:11:45,400 --> 00:11:47,400 Speaker 3: reason why he wasn't able to solve this problem of 206 00:11:47,440 --> 00:11:50,160 Speaker 3: figuring out how to turn thought into math is that 207 00:11:50,679 --> 00:11:53,560 Speaker 3: what he thought math was, or the kind of math 208 00:11:53,600 --> 00:11:55,440 Speaker 3: that he was trying to use to solve that problem, 209 00:11:56,000 --> 00:11:59,080 Speaker 3: was really arithmetic, right. And arithmetic was kind of like 210 00:11:59,120 --> 00:12:01,560 Speaker 3: the model that they had for a mathematical system. So 211 00:12:01,600 --> 00:12:04,120 Speaker 3: you can think about ideas being added together or subtracting 212 00:12:04,120 --> 00:12:07,080 Speaker 3: one idea from another, and really thinking about the operators 213 00:12:07,120 --> 00:12:09,160 Speaker 3: that you're using as being the things that are sort 214 00:12:09,160 --> 00:12:11,280 Speaker 3: of coming from this familiar mathematical language. 215 00:12:11,360 --> 00:12:13,640 Speaker 4: And so I think part of 216 00:12:13,640 --> 00:12:16,240 Speaker 3: the reason that we end up with that approach is 217 00:12:16,280 --> 00:12:19,120 Speaker 3: because of the kind of math that has been successful 218 00:12:19,120 --> 00:12:22,920 Speaker 3: in other settings, right, where we need to do arithmetic, 219 00:12:23,040 --> 00:12:25,320 Speaker 3: you know, that's a good description of certain 220 00:12:25,160 --> 00:12:26,320 Speaker 4: kinds of things that human minds do. 221 00:12:27,720 --> 00:12:30,120 Speaker 3: Boole had the insight that you needed a different kind 222 00:12:30,120 --> 00:12:32,640 Speaker 3: of algebra in order to describe thought, and then that's 223 00:12:32,679 --> 00:12:36,200 Speaker 3: what leads to modern mathematical logic. But it's still in 224 00:12:36,240 --> 00:12:39,560 Speaker 3: this kind of symbolic language, although Boole also talked about 225 00:12:39,559 --> 00:12:42,840 Speaker 3: probability theory as being important for capturing thought as well. 226 00:12:42,920 --> 00:12:45,360 Speaker 3: So I think it's really more about what are the 227 00:12:45,440 --> 00:12:48,640 Speaker 3: kinds of mathematical systems that it was sort of straightforward 228 00:12:48,640 --> 00:12:51,199 Speaker 3: to formalize, and that gave us something that we could 229 00:12:51,240 --> 00:12:53,600 Speaker 3: try to map thought onto. And that's what we do 230 00:12:53,679 --> 00:12:58,320 Speaker 3: as scientists: often taking mathematical systems that mathematicians have 231 00:12:58,360 --> 00:13:00,840 Speaker 3: defined for us and then saying, oh, I think this 232 00:13:01,000 --> 00:13:03,800 Speaker 3: mathematical system maps onto the thing that I want to understand, 233 00:13:04,200 --> 00:13:06,400 Speaker 3: and so trying to establish that correspondence, and that can 234 00:13:06,520 --> 00:13:08,199 Speaker 3: then allow us to derive its consequences. 235 00:13:09,160 --> 00:13:12,840 Speaker 1: So speaking of rules and symbols, thinkers like Newell 236 00:13:12,920 --> 00:13:17,040 Speaker 1: and Simon, they popularized this idea of goals and subgoals.
237 00:13:17,480 --> 00:13:21,920 Speaker 1: What did that viewpoint get exactly right about human problem solving? 238 00:13:23,440 --> 00:13:26,760 Speaker 3: So now we're fast-forwarding a bit, right? From we 239 00:13:26,840 --> 00:13:30,800 Speaker 3: have Boole figuring out the structure of logic. That turns 240 00:13:30,840 --> 00:13:32,480 Speaker 3: into, you know, lots of people then sort of turn 241 00:13:32,520 --> 00:13:34,840 Speaker 3: that into a sort of mature theory of logic. You 242 00:13:34,920 --> 00:13:39,160 Speaker 3: get Alan Turing kind of turning this into a theory of computation, 243 00:13:39,480 --> 00:13:41,960 Speaker 3: thinking about what an abstract mathematician is doing when they're 244 00:13:42,000 --> 00:13:43,880 Speaker 3: doing something like logic, and thinking about how you can 245 00:13:43,920 --> 00:13:48,240 Speaker 3: make a machine do that. And then we have people 246 00:13:48,520 --> 00:13:51,560 Speaker 3: starting to realize that, you know, as digital computers are 247 00:13:51,559 --> 00:13:55,560 Speaker 3: being developed, maybe those provide a good model for how 248 00:13:55,600 --> 00:13:59,880 Speaker 3: thinking works in general, and then trying to use a 249 00:14:00,000 --> 00:14:03,200 Speaker 3: computer as a sort of foundation for, you know, thinking 250 00:14:03,200 --> 00:14:05,280 Speaker 3: about things like how people might solve problems. And so 251 00:14:06,040 --> 00:14:09,800 Speaker 3: Allen Newell and Herbert Simon were influential cognitive scientists who 252 00:14:10,600 --> 00:14:14,679 Speaker 3: did exactly that. They had this idea that maybe there 253 00:14:14,760 --> 00:14:17,400 Speaker 3: is a way that you could make computers smarter by 254 00:14:17,440 --> 00:14:20,720 Speaker 3: using insights from human cognition, but also get a better 255 00:14:20,800 --> 00:14:23,160 Speaker 3: understanding of what humans are doing when they're solving problems 256 00:14:23,200 --> 00:14:25,400 Speaker 3: by using the sort of ideas that come from things 257 00:14:25,440 --> 00:14:29,360 Speaker 3: like computer programming, and so they set up, you know, this: 258 00:14:29,640 --> 00:14:31,240 Speaker 3: you know, when we're trying to solve a problem or 259 00:14:31,240 --> 00:14:33,360 Speaker 3: prove a mathematical theorem or play a game of chess, 260 00:14:33,720 --> 00:14:35,520 Speaker 3: they set this up as a problem of searching through 261 00:14:35,560 --> 00:14:40,600 Speaker 3: a tree of possibilities, where what you're doing is making choices, 262 00:14:40,880 --> 00:14:42,480 Speaker 3: and then each of those choices gives you a new 263 00:14:42,520 --> 00:14:44,240 Speaker 3: set of choices, and each of those choices gives you 264 00:14:44,240 --> 00:14:46,360 Speaker 3: a new set of choices, and the hard thing is 265 00:14:46,880 --> 00:14:49,160 Speaker 3: finding a path through those choices that leads to the 266 00:14:49,160 --> 00:14:51,160 Speaker 3: point that you want to end up at. And so 267 00:14:51,720 --> 00:14:54,120 Speaker 3: that's something where you can take inspiration from how human 268 00:14:54,120 --> 00:14:57,720 Speaker 3: mathematicians solve problems. You can take inspiration from the kind 269 00:14:57,760 --> 00:14:59,960 Speaker 3: of, you know, tricks like working backwards from the end 270 00:15:00,080 --> 00:15:03,320 Speaker 3: towards the start.
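To make the "searching through a tree of possibilities" idea concrete, here is a minimal sketch in Python. The toy problem, the function names, and the depth-first strategy are illustrative assumptions, not a reconstruction of Newell and Simon's actual programs: each state offers a set of choices, each choice opens new choices, and the search looks for a path of choices that reaches a goal.

# Minimal sketch of problem solving as search through a tree of choices.
# The states and moves here are invented; real problem-solving programs
# (like the General Problem Solver) were far more elaborate.

def depth_first_search(state, goal_test, successors, path=None, seen=None):
    """Return a list of states from `state` to a goal, or None if none is found."""
    path = [state] if path is None else path + [state]
    seen = set() if seen is None else seen
    if goal_test(state):
        return path
    seen.add(state)
    for nxt in successors(state):          # each choice opens a new set of choices
        if nxt not in seen:
            result = depth_first_search(nxt, goal_test, successors, path, seen)
            if result is not None:
                return result
    return None                            # dead end: back up and try another branch

# Toy example: reach 10 starting from 1 using the moves "+1" and "*2".
if __name__ == "__main__":
    solution = depth_first_search(
        1,
        goal_test=lambda n: n == 10,
        successors=lambda n: [n + 1, n * 2] if n < 10 else [],
    )
    print(solution)  # one path of choices (not necessarily the shortest)

The "working backwards from the end towards the start" trick mentioned here would just mean running the same kind of search from the goal state back toward the starting state.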
Right, those were principles that they were 271 00:15:03,360 --> 00:15:05,280 Speaker 3: able to use to try and explain these aspects of 272 00:15:05,360 --> 00:15:07,800 Speaker 3: human cognition as well as making the machines work better. 273 00:15:08,040 --> 00:15:09,960 Speaker 1: Okay, but then one of the things that happened is 274 00:15:10,000 --> 00:15:13,120 Speaker 1: that at least one of these attempts had ballooned into 275 00:15:13,200 --> 00:15:17,320 Speaker 1: twenty five million rules. And so what does that teach 276 00:15:17,400 --> 00:15:20,520 Speaker 1: us about the shape of human intelligence? 277 00:15:21,960 --> 00:15:23,800 Speaker 4: This rules and symbols enterprise, 278 00:15:24,040 --> 00:15:24,160 Speaker 2: right, 279 00:15:24,240 --> 00:15:27,000 Speaker 4: the sort of appeal that this had was that maybe 280 00:15:27,040 --> 00:15:27,520 Speaker 4: one day 281 00:15:27,400 --> 00:15:29,560 Speaker 3: you could just write down all of the rules that 282 00:15:29,560 --> 00:15:31,720 Speaker 3: you need to write down, and then you've characterized how 283 00:15:31,760 --> 00:15:34,320 Speaker 3: intelligence works. Right? So it's just a matter of getting 284 00:15:34,400 --> 00:15:37,440 Speaker 3: enough rules, in a way that's very reminiscent today, right, 285 00:15:37,520 --> 00:15:40,920 Speaker 3: of, you know, the way that our modern AI systems 286 00:15:40,920 --> 00:15:43,520 Speaker 3: are being made is by training them on more and 287 00:15:43,600 --> 00:15:46,640 Speaker 3: more data, right, feeding in more and more language. There 288 00:15:46,680 --> 00:15:48,520 Speaker 3: was a hope that you could just, like, yeah, like 289 00:15:48,800 --> 00:15:50,760 Speaker 3: document all of the rules that you need to capture 290 00:15:51,080 --> 00:15:54,360 Speaker 3: the structure of human knowledge. And so that led to, 291 00:15:54,960 --> 00:15:57,240 Speaker 3: you know, companies being started to try and engage in 292 00:15:57,280 --> 00:16:01,880 Speaker 3: that enterprise, ultimately I would say, unsuccessfully, but giving us 293 00:16:01,920 --> 00:16:06,440 Speaker 3: some kind of characterization of, like, particular subsets of human knowledge. 294 00:16:06,520 --> 00:16:08,640 Speaker 3: And so I think the thing that came out of 295 00:16:08,920 --> 00:16:12,280 Speaker 3: that enterprise was revealing that maybe you need something more 296 00:16:12,360 --> 00:16:16,600 Speaker 3: than just rules, right, that maybe thinking about logic as 297 00:16:16,640 --> 00:16:19,480 Speaker 3: a basis for our model of intelligence was missing something. 298 00:16:20,040 --> 00:16:22,240 Speaker 3: It's an approach that worked really well for certain kinds 299 00:16:22,280 --> 00:16:26,200 Speaker 3: of problems like doing arithmetic, playing games like chess, but 300 00:16:26,240 --> 00:16:28,520 Speaker 3: it didn't work very well for other kinds of problems 301 00:16:28,640 --> 00:16:31,520 Speaker 3: like figuring out what you're seeing in the world, or 302 00:16:31,880 --> 00:16:34,360 Speaker 3: actually learning language or these other kinds of things. 303 00:16:34,400 --> 00:16:37,080 Speaker 1: And so this is what leads to your second lens, 304 00:16:37,240 --> 00:16:40,800 Speaker 1: which is neural networks. And you talk about these as 305 00:16:40,880 --> 00:16:44,080 Speaker 1: having, you know, a boom-and-bust history.
So, first, 306 00:16:44,120 --> 00:16:46,720 Speaker 1: what happened in the last decade that allowed them to 307 00:16:46,760 --> 00:16:48,880 Speaker 1: turn into the dominant paradigm? 308 00:16:49,320 --> 00:16:53,440 Speaker 3: The big breakthrough in the last decade was really about 309 00:16:53,760 --> 00:16:56,680 Speaker 3: being able to make bigger neural networks that could 310 00:16:56,720 --> 00:17:01,840 Speaker 3: be trained on more data in a way that could scale, right, 311 00:17:01,880 --> 00:17:05,159 Speaker 3: and so what does bigger here mean? What these are: an artificial 312 00:17:05,240 --> 00:17:08,840 Speaker 3: neural network is a set of units that are communicating 313 00:17:08,840 --> 00:17:13,120 Speaker 3: with one another. They're communicating along weighted connections, sort 314 00:17:13,160 --> 00:17:15,639 Speaker 3: of, you know, imagine how neurons are connected in 315 00:17:15,680 --> 00:17:17,840 Speaker 3: your brain, and those neurons are connected to one another 316 00:17:17,880 --> 00:17:20,359 Speaker 3: and sending each other signals. An artificial neural network is 317 00:17:20,400 --> 00:17:23,719 Speaker 3: basically simulating that kind of structure inside a computer. And 318 00:17:23,760 --> 00:17:26,679 Speaker 3: so for a long time, the sort of the history 319 00:17:26,680 --> 00:17:30,040 Speaker 3: of neural networks has been one of people figuring out 320 00:17:30,119 --> 00:17:33,880 Speaker 3: how to make bigger neural networks work. So the very 321 00:17:33,920 --> 00:17:37,800 Speaker 3: first, you know, learning neural networks had a learning 322 00:17:37,840 --> 00:17:40,719 Speaker 3: algorithm that worked for one layer of weights, and then 323 00:17:40,760 --> 00:17:42,560 Speaker 3: there was a breakthrough in the nineteen eighties that meant 324 00:17:42,600 --> 00:17:44,280 Speaker 3: now you had a learning algorithm that could work for 325 00:17:44,359 --> 00:17:46,600 Speaker 3: multiple layers of weights, but it didn't work for very 326 00:17:46,760 --> 00:17:48,920 Speaker 3: deep neural networks with lots of layers of weights, 327 00:17:48,960 --> 00:17:52,160 Speaker 3: because, I can sort of explain the technical reasons 328 00:17:52,200 --> 00:17:54,159 Speaker 3: behind it, but, you know, sort of like the basic 329 00:17:54,160 --> 00:17:57,760 Speaker 3: algorithm didn't quite work. And so the big breakthroughs of 330 00:17:57,840 --> 00:18:00,000 Speaker 3: the last, you know, ten to fifteen years have been 331 00:18:00,080 --> 00:18:03,480 Speaker 3: about coming up with ways to take those algorithms and 332 00:18:03,520 --> 00:18:05,440 Speaker 3: actually make them work for neural networks that are bigger 333 00:18:05,480 --> 00:18:07,679 Speaker 3: and bigger and deeper and deeper, that are able to 334 00:18:07,840 --> 00:18:11,840 Speaker 3: easily learn more complex functions and can do so from 335 00:18:12,160 --> 00:18:14,879 Speaker 3: massive amounts of data in a way that means that 336 00:18:14,920 --> 00:18:18,000 Speaker 3: they're able to discover sort of complex relationships between things 337 00:18:18,000 --> 00:18:19,960 Speaker 3: that are necessary to produce intelligent behavior. 338 00:18:20,280 --> 00:18:24,080 Speaker 1: And so, what do these neural networks capture about cognition 339 00:18:24,760 --> 00:18:29,200 Speaker 1: that symbols missed, especially in terms of things like similarity 340 00:18:29,240 --> 00:18:32,040 Speaker 1: and fuzziness and graded concepts?
341 00:18:32,560 --> 00:18:35,000 Speaker 3: Fuzziness is a really good way of describing it. It's 342 00:18:35,040 --> 00:18:39,080 Speaker 3: that, you know, if you ask somebody, you know, whether 343 00:18:39,119 --> 00:18:41,800 Speaker 3: something is a piece of furniture, they're going to say, 344 00:18:42,119 --> 00:18:43,840 Speaker 3: you know, if you show them a chair, they'll say, yes, 345 00:18:43,920 --> 00:18:46,879 Speaker 3: definitely a piece of furniture. If you show them a rug, 346 00:18:47,440 --> 00:18:51,119 Speaker 3: they'll say, yeah, maybe a piece of furniture. Right, it 347 00:18:51,119 --> 00:18:52,960 Speaker 3: doesn't sort of fit. We sort 348 00:18:52,960 --> 00:18:56,240 Speaker 3: of have a prototypical idea of what furniture is, which 349 00:18:56,240 --> 00:18:58,479 Speaker 3: contains things like chairs and tables and ottomans and these 350 00:18:58,480 --> 00:19:02,119 Speaker 3: other kinds of things, and then rugs and treadmills and, you 351 00:19:02,160 --> 00:19:04,040 Speaker 4: know, like, these are things that maybe 352 00:19:03,760 --> 00:19:06,320 Speaker 3: are in this category, but maybe aren't, right. And so 353 00:19:07,160 --> 00:19:09,520 Speaker 3: we need to have a way of thinking about concepts 354 00:19:09,520 --> 00:19:11,760 Speaker 3: that's not just the sort of yes or no, true 355 00:19:11,800 --> 00:19:14,760 Speaker 3: or false, one or zero that logic would give us. 356 00:19:14,800 --> 00:19:16,840 Speaker 3: We need to have something which has that fuzziness in it. 357 00:19:17,160 --> 00:19:19,399 Speaker 3: One way of getting fuzziness is by thinking about a 358 00:19:19,480 --> 00:19:22,760 Speaker 3: concept in terms of points in space, right, where you 359 00:19:22,760 --> 00:19:26,399 Speaker 3: could think chairs are here in one location, rugs are 360 00:19:26,440 --> 00:19:29,040 Speaker 3: here in another location, and maybe what it is to 361 00:19:29,040 --> 00:19:30,439 Speaker 3: be a piece of furniture is to just be in 362 00:19:30,440 --> 00:19:32,560 Speaker 3: some part of that space, and how close you are 363 00:19:32,600 --> 00:19:34,280 Speaker 3: to that part of the space is like how good 364 00:19:34,320 --> 00:19:36,560 Speaker 3: you are as an example of that kind of furniture. 365 00:19:37,359 --> 00:19:39,800 Speaker 3: And so as soon as you think in those terms, 366 00:19:39,800 --> 00:19:42,160 Speaker 3: you have a new problem, which is: with our rules 367 00:19:42,160 --> 00:19:44,400 Speaker 3: and symbols, we knew how to do computation, we knew 368 00:19:44,400 --> 00:19:47,000 Speaker 3: how to describe thinking. Thinking was a matter of applying 369 00:19:47,040 --> 00:19:49,240 Speaker 3: the rules and sort of, you know, repeating that process. 370 00:19:50,040 --> 00:19:52,640 Speaker 3: But we don't have a way of doing computation in spaces. 371 00:19:52,720 --> 00:19:54,439 Speaker 3: And that's what neural networks give us. So you can 372 00:19:54,520 --> 00:19:58,760 Speaker 3: kind of think about a space corresponding to the activation 373 00:19:58,920 --> 00:20:00,639 Speaker 3: of the units inside this neural network. 374 00:20:00,680 --> 00:20:03,040 Speaker 4: How much, you know, how much input 375 00:20:02,720 --> 00:20:05,600 Speaker 3: each neural unit in that neural network is receiving, and 376 00:20:04,920 --> 00:20:08,600 Speaker 3: how much of a response it's making, that characterizes some 377 00:20:08,680 --> 00:20:10,639 Speaker 3: kind of space.
And then the neural network gives us a 378 00:20:10,640 --> 00:20:13,359 Speaker 3: way of mapping from the inputs that it's getting to 379 00:20:13,440 --> 00:20:14,080 Speaker 3: some output. 380 00:20:14,200 --> 00:20:16,040 Speaker 4: So you could put in, you know, 381 00:20:16,000 --> 00:20:18,239 Speaker 3: your picture of a chair, and it maps that to 382 00:20:18,240 --> 00:20:20,560 Speaker 3: some point in space, and then it sort of 383 00:20:20,560 --> 00:20:23,440 Speaker 3: produces an output that corresponds to, yes, this is 384 00:20:23,480 --> 00:20:26,080 Speaker 3: a piece of furniture. And because those outputs can now 385 00:20:26,119 --> 00:20:29,199 Speaker 3: be continuous values, you can capture the fuzziness and other 386 00:20:29,280 --> 00:20:31,360 Speaker 3: kinds of things that you want for your concepts. 387 00:20:31,640 --> 00:20:34,960 Speaker 1: And so, in what sense are these modern systems, these 388 00:20:35,040 --> 00:20:38,360 Speaker 1: artificial neural networks, learning, and in what sense are they 389 00:20:38,400 --> 00:20:42,720 Speaker 1: doing something that's maybe categorically different from how children learn? 390 00:20:44,520 --> 00:20:48,320 Speaker 3: This is a fundamental question, right? That's the kind of 391 00:20:48,320 --> 00:20:50,440 Speaker 3: thing that we cognitive scientists think about a lot, and 392 00:20:50,800 --> 00:20:53,080 Speaker 3: I think that AI researchers are starting to care about 393 00:20:53,119 --> 00:20:55,359 Speaker 3: a lot too, which is, you know, what are these 394 00:20:55,440 --> 00:20:59,720 Speaker 3: sort of meaningful differences between human minds, human brains, and 395 00:20:59,720 --> 00:21:01,680 Speaker 3: what we're building in these AI systems or these sort 396 00:21:01,680 --> 00:21:07,240 Speaker 3: of artificial brains. I think one very salient difference is 397 00:21:07,359 --> 00:21:10,399 Speaker 3: the amount of data which is needed for a human 398 00:21:10,640 --> 00:21:12,639 Speaker 3: to learn language compared to the amount of data you 399 00:21:12,640 --> 00:21:15,159 Speaker 3: need to put into a neural network. So if you 400 00:21:15,200 --> 00:21:18,640 Speaker 3: take a system like ChatGPT, right, one of these chatbots, 401 00:21:19,000 --> 00:21:21,639 Speaker 3: those systems are trained on the equivalent of something like 402 00:21:21,720 --> 00:21:24,920 Speaker 3: five thousand to fifty thousand years of continuous speech. There's 403 00:21:24,920 --> 00:21:26,800 Speaker 3: sort of massive amounts of data that are going into 404 00:21:26,800 --> 00:21:28,919 Speaker 3: that system. So it's like on the order of a 405 00:21:28,960 --> 00:21:31,159 Speaker 3: thousand or ten thousand times as much data as a 406 00:21:31,240 --> 00:21:33,879 Speaker 3: human child might get in order to learn language. And 407 00:21:33,960 --> 00:21:37,280 Speaker 3: the reason is that those artificial neural networks are really 408 00:21:37,359 --> 00:21:40,399 Speaker 3: kind of like undifferentiated learning machines. You can take that 409 00:21:40,440 --> 00:21:41,960 Speaker 3: same kind of neural network, you can get it to 410 00:21:42,040 --> 00:21:44,400 Speaker 3: learn all sorts of different kinds of things. It works 411 00:21:44,400 --> 00:21:46,119 Speaker 3: really well for learning language, but you can use it 412 00:21:46,160 --> 00:21:47,920 Speaker 3: to learn something about vision or something.
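A small illustration of the "concepts as locations in a space" idea described in this exchange, as a hedged Python sketch: the two features, the prototype, and all the numbers are invented for illustration, and a real neural network would learn such a space from data rather than having it written down by hand. Graded membership falls off with distance from a prototype instead of being a strict yes or no.

import math

# Illustrative sketch: graded category membership as closeness to a prototype
# in a feature space. The features and numbers are invented.

def membership(item, prototype, sharpness=1.0):
    """Return a value in (0, 1]: 1.0 at the prototype, falling off with distance."""
    distance = math.dist(item, prototype)
    return math.exp(-sharpness * distance)

# Pretend feature space: (how much you sit or put things on it, how fixed in place it is)
furniture_prototype = (0.9, 0.6)
examples = {
    "chair":     (0.95, 0.7),  # close to the prototype: clearly furniture
    "rug":       (0.40, 0.8),  # further away: "yeah, maybe furniture"
    "treadmill": (0.20, 0.2),  # further still: a marginal member at best
}

for name, features in examples.items():
    print(f"{name:10s} membership = {membership(features, furniture_prototype):.2f}")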
You know, you 413 00:21:47,960 --> 00:21:49,280 Speaker 3: can sort of take all sorts of problems and give 414 00:21:49,280 --> 00:21:50,520 Speaker 3: them to it and it can learn how to do that. 415 00:21:51,040 --> 00:21:56,000 Speaker 3: And so as a consequence, they have what we call 416 00:21:56,000 --> 00:21:59,760 Speaker 3: in cognitive science and machine learning weak inductive biases. They're not biased 417 00:21:59,800 --> 00:22:04,120 Speaker 3: towards any particular solution to the learning problem, and 418 00:22:04,400 --> 00:22:09,399 Speaker 3: human brains have stronger inductive biases for things like learning language. Right, 419 00:22:09,440 --> 00:22:12,200 Speaker 3: we're sort of disposed towards certain kinds of things, which 420 00:22:12,200 --> 00:22:14,679 Speaker 3: are human languages. The things that we call human languages 421 00:22:14,680 --> 00:22:16,880 Speaker 3: are the things that we're disposed to learn. And as 422 00:22:16,880 --> 00:22:19,480 Speaker 3: a consequence, you know, we're able to sort of narrow 423 00:22:19,560 --> 00:22:21,320 Speaker 3: down the space of possibilities in a way that means 424 00:22:21,320 --> 00:22:22,680 Speaker 3: that we're able to learn from less data. 425 00:22:36,960 --> 00:22:39,399 Speaker 1: Okay, this makes a great segue to the third lens. 426 00:22:39,440 --> 00:22:42,880 Speaker 1: So you talked about rules and symbols, and you talked 427 00:22:42,920 --> 00:22:45,560 Speaker 1: about artificial neural networks. The third part of your book 428 00:22:45,640 --> 00:22:51,200 Speaker 1: is about probabilities and statistics. So why did probability become 429 00:22:51,240 --> 00:22:56,000 Speaker 1: an attractive candidate language for thinking about cognition? 430 00:22:56,920 --> 00:22:59,520 Speaker 3: Probability theory is a good way of answering certain kinds 431 00:22:59,520 --> 00:23:02,840 Speaker 3: of why questions that we might have, right, and 432 00:23:02,880 --> 00:23:05,520 Speaker 3: the reason is that it's a way of characterizing how 433 00:23:05,560 --> 00:23:08,960 Speaker 3: a rational agent should make an inference. So all the 434 00:23:08,960 --> 00:23:13,159 Speaker 3: way back in the eighteenth century, the British nonconformist minister, the 435 00:23:13,160 --> 00:23:16,440 Speaker 3: Reverend Thomas Bayes, had this radical idea that you could 436 00:23:17,119 --> 00:23:19,800 Speaker 3: talk about, you know, again, like, take a mathematical system, 437 00:23:19,880 --> 00:23:22,399 Speaker 3: probability theory, which we would use for describing what happens 438 00:23:22,440 --> 00:23:24,639 Speaker 3: when you roll dice or flip coins, right, sort of, 439 00:23:24,720 --> 00:23:27,400 Speaker 3: you know, the language of gambling, and say, oh, 440 00:23:27,480 --> 00:23:30,240 Speaker 3: in fact, that mathematical system might also be a really 441 00:23:30,280 --> 00:23:33,560 Speaker 3: good system for describing how beliefs work.
And so what 442 00:23:33,640 --> 00:23:35,399 Speaker 3: he was interested in was, if you think about a 443 00:23:35,400 --> 00:23:38,280 Speaker 3: belief as, you know, a degree of belief, right, you 444 00:23:38,320 --> 00:23:40,160 Speaker 3: can say, oh, I think it's going to rain tomorrow, 445 00:23:40,600 --> 00:23:42,800 Speaker 3: and I'll put that on a scale which goes 446 00:23:42,800 --> 00:23:45,479 Speaker 3: from zero to one, where zero is, you know, not 447 00:23:45,520 --> 00:23:47,359 Speaker 3: going to rain, and one is one hundred percent it's 448 00:23:47,359 --> 00:23:47,760 Speaker 3: going to rain 449 00:23:47,800 --> 00:23:48,200 Speaker 4: tomorrow. 450 00:23:48,280 --> 00:23:52,040 Speaker 3: Right, that is a belief that you've expressed in the 451 00:23:52,080 --> 00:23:54,960 Speaker 3: form of a probability. And now if you, you know, 452 00:23:55,119 --> 00:23:56,639 Speaker 3: wake up in the morning and look out the window 453 00:23:56,640 --> 00:23:59,239 Speaker 3: and you see gray storm clouds, you've got a new 454 00:23:59,240 --> 00:24:02,480 Speaker 3: piece of data. You need to revise your beliefs, and 455 00:24:02,560 --> 00:24:05,119 Speaker 3: probability theory actually tells you how to do that. It says, 456 00:24:06,320 --> 00:24:09,200 Speaker 3: you know, for each hypothesis, right, so our hypotheses here 457 00:24:09,240 --> 00:24:11,040 Speaker 3: are it's going 458 00:24:10,960 --> 00:24:12,080 Speaker 4: to rain or it's not going to rain, 459 00:24:12,440 --> 00:24:15,360 Speaker 3: right, you're going to modify that belief based on how 460 00:24:15,440 --> 00:24:18,639 Speaker 3: likely the data is that you saw if that hypothesis 461 00:24:18,680 --> 00:24:21,359 Speaker 3: were true. So, because gray storm clouds are more likely 462 00:24:21,440 --> 00:24:23,920 Speaker 3: if it's going to rain that day, we should increase 463 00:24:23,960 --> 00:24:27,600 Speaker 3: our belief that it's going to rain. And as a consequence, well, 464 00:24:27,680 --> 00:24:29,040 Speaker 3: we'll end up with a number that's a little bit 465 00:24:29,119 --> 00:24:30,359 Speaker 3: higher than the number we had before. 466 00:24:30,600 --> 00:24:32,280 Speaker 4: And probability theory tells us how to do that. 467 00:24:32,520 --> 00:24:35,240 Speaker 3: There's a principle of probability theory called Bayes' rule, after 468 00:24:35,240 --> 00:24:37,680 Speaker 3: the Reverend Thomas Bayes, that tells you how to take 469 00:24:37,760 --> 00:24:40,960 Speaker 3: your original beliefs and then turn them into the beliefs 470 00:24:40,960 --> 00:24:43,680 Speaker 3: that you get after seeing data. And that turns out 471 00:24:43,720 --> 00:24:45,600 Speaker 3: to be exactly the tool that you need to answer 472 00:24:45,640 --> 00:24:49,560 Speaker 3: these kinds of questions about how inductive biases work. Right, So, 473 00:24:50,160 --> 00:24:53,320 Speaker 3: how is it that children are able to learn from 474 00:24:53,600 --> 00:24:56,880 Speaker 3: less data than neural networks? Well, it's a consequence of, 475 00:24:57,440 --> 00:25:00,000 Speaker 3: you know, these things that we can describe using different 476 00:25:00,160 --> 00:25:03,399 Speaker 3: probabilities being assigned to different hypotheses, where the hypotheses correspond to 477 00:25:03,400 --> 00:25:05,080 Speaker 3: the structure of the languages that are being learned.
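Here is a minimal worked version of the rain example in Python, with invented numbers: the prior and the two likelihoods are assumptions chosen purely for illustration. Bayes' rule combines the prior degree of belief with how likely the observed data (the grey storm clouds) would be under each hypothesis.

# Bayes' rule on the rain example, with invented numbers:
# P(rain | clouds) = P(clouds | rain) * P(rain) / P(clouds)

prior_rain = 0.30              # last night's degree of belief that it will rain
p_clouds_given_rain = 0.80     # grey storm clouds are likely if rain is coming
p_clouds_given_no_rain = 0.20  # and less likely otherwise

# Total probability of seeing the clouds under either hypothesis
p_clouds = (p_clouds_given_rain * prior_rain
            + p_clouds_given_no_rain * (1 - prior_rain))

posterior_rain = p_clouds_given_rain * prior_rain / p_clouds
print(f"belief in rain: {prior_rain:.2f} -> {posterior_rain:.2f} after seeing the clouds")
# belief in rain: 0.30 -> 0.63 after seeing the clouds

The same arithmetic, with the prior spread over hypotheses about the structure of a language instead of over rain versus no rain, is what lets a strong prior make up for a small amount of data.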
478 00:25:05,320 --> 00:25:10,320 Speaker 1: And when people call humans irrational, what changes if we 479 00:25:10,440 --> 00:25:16,120 Speaker 1: look at mistakes as resource-limited inferences? 480 00:25:17,640 --> 00:25:20,280 Speaker 3: This is one of the things that I explore in 481 00:25:19,760 --> 00:25:23,399 Speaker 3: my own research, this question of how we should 482 00:25:23,400 --> 00:25:26,919 Speaker 3: actually think about rationality for real agents, right. And this 483 00:25:27,000 --> 00:25:28,919 Speaker 3: is relevant if you're building an AI system or if 484 00:25:28,920 --> 00:25:31,760 Speaker 3: you're just trying to understand human behavior. So I think, 485 00:25:33,040 --> 00:25:35,720 Speaker 3: like I said, probability theory gives us a characterization of 486 00:25:35,800 --> 00:25:38,679 Speaker 3: what you should do as an ideal rational agent, but 487 00:25:38,760 --> 00:25:42,640 Speaker 3: that assumes that you have infinite computational resources. We mere 488 00:25:42,760 --> 00:25:46,440 Speaker 3: humans don't have infinite computational resources, nor do our AI systems. 489 00:25:46,560 --> 00:25:49,280 Speaker 3: And so you can ask what should a rational agent 490 00:25:49,320 --> 00:25:52,280 Speaker 3: do if they don't have all of the computational resources 491 00:25:52,280 --> 00:25:54,359 Speaker 3: that you might need, and then out of that, the 492 00:25:54,400 --> 00:25:57,160 Speaker 3: answer you get is that, you know, you should follow 493 00:25:57,480 --> 00:26:02,320 Speaker 3: an algorithm, follow a strategy that makes sense given the resources 494 00:26:02,359 --> 00:26:04,280 Speaker 3: that you have. That's what it means to be rational 495 00:26:04,280 --> 00:26:06,840 Speaker 3: in those circumstances, where you're sort of doing the best 496 00:26:06,920 --> 00:26:12,159 Speaker 3: job you can of approximating probabilistic inference given those resource constraints. 497 00:26:12,200 --> 00:26:14,479 Speaker 3: And so some of the things that people do, when 498 00:26:14,520 --> 00:26:16,400 Speaker 3: people do weird things, and we do lots of weird things 499 00:26:16,440 --> 00:26:19,000 Speaker 3: and we don't always follow probability theory, are things that we 500 00:26:19,000 --> 00:26:22,320 Speaker 3: can understand as us, you know, running into those resource 501 00:26:22,359 --> 00:26:25,200 Speaker 3: limitations and then coming up with, you know, reasonable strategies 502 00:26:25,200 --> 00:26:26,719 Speaker 3: for trying to approximate the right answer. 503 00:26:28,040 --> 00:26:29,080 Speaker 2: So, if we look 504 00:26:28,960 --> 00:26:33,800 Speaker 1: at probability being the grammar of uncertainty, one thing we 505 00:26:33,880 --> 00:26:36,840 Speaker 1: know is that our prior expectations matter. And one of 506 00:26:36,880 --> 00:26:39,280 Speaker 1: the things I've been obsessed with and doing a lot 507 00:26:39,320 --> 00:26:41,000 Speaker 1: of research on and talking a lot about on the 508 00:26:41,000 --> 00:26:44,200 Speaker 1: podcast is the way that all of us drop into 509 00:26:44,200 --> 00:26:48,520 Speaker 1: the world and our cultures influence us and our language 510 00:26:48,560 --> 00:26:51,160 Speaker 1: and our moment in time and our neighborhood, and this 511 00:26:51,320 --> 00:26:55,760 Speaker 1: leads to people being quite different on the inside.
Is 512 00:26:55,800 --> 00:26:58,159 Speaker 1: this something that you think about sometimes, about how we 513 00:26:58,240 --> 00:27:03,760 Speaker 1: develop our priors differently based on, you know, where 514 00:27:03,760 --> 00:27:04,360 Speaker 1: we grow up? 515 00:27:05,600 --> 00:27:08,679 Speaker 3: Yeah. So priors is the Bayesian language, right, for talking 516 00:27:08,680 --> 00:27:11,240 Speaker 3: about the beliefs that you have before you get data 517 00:27:11,440 --> 00:27:14,840 Speaker 3: that you then update into what we call posterior probabilities 518 00:27:14,840 --> 00:27:17,440 Speaker 3: that are informed by those data. And so yeah, I 519 00:27:18,359 --> 00:27:20,720 Speaker 3: spend a lot of my research time thinking about these 520 00:27:20,800 --> 00:27:23,920 Speaker 3: kinds of questions of, you know, what are these sort 521 00:27:23,960 --> 00:27:28,880 Speaker 3: of prior distributions for humans? How do we acquire good 522 00:27:28,960 --> 00:27:31,959 Speaker 3: prior distributions for solving different kinds of problems? One thing 523 00:27:32,000 --> 00:27:34,520 Speaker 3: there is that calling it a prior makes it sound 524 00:27:34,560 --> 00:27:36,840 Speaker 3: like maybe it's something you're born with, but in fact 525 00:27:36,880 --> 00:27:37,680 Speaker 3: it just means 526 00:27:37,720 --> 00:27:38,960 Speaker 4: it's before you get data. 527 00:27:39,040 --> 00:27:41,439 Speaker 3: Right. So when you're seeing that storm cloud in the morning, 528 00:27:42,040 --> 00:27:44,520 Speaker 3: you had a prior probability from last night, and then 529 00:27:44,640 --> 00:27:47,560 Speaker 3: that prior probability was informed by everything else that you know, right? 530 00:27:47,680 --> 00:27:49,920 Speaker 3: The priors are all of the biases and knowledge that 531 00:27:49,960 --> 00:27:51,880 Speaker 3: we bring to bear when we're trying to make an inference. 532 00:27:51,960 --> 00:27:55,080 Speaker 3: And so yeah, I think understanding that is 533 00:27:55,119 --> 00:27:57,280 Speaker 3: a big part of the project of understanding human cognition. 534 00:27:57,600 --> 00:27:59,120 Speaker 2: So let's zoom the camera out. 535 00:27:59,200 --> 00:28:01,639 Speaker 1: We've talked about the three lenses that you describe in 536 00:28:01,640 --> 00:28:04,119 Speaker 1: the book. Now, you also point out that we have 537 00:28:04,160 --> 00:28:08,199 Speaker 1: a lot of constraints, like finite lives and limited compute 538 00:28:08,200 --> 00:28:11,760 Speaker 1: and limited bandwidth, and so how do these constraints sculpt 539 00:28:12,040 --> 00:28:13,320 Speaker 1: human intelligence? 540 00:28:13,800 --> 00:28:16,520 Speaker 3: I think this is really important to just thinking about 541 00:28:16,560 --> 00:28:18,399 Speaker 3: the moment that we're in, where there's a lot of 542 00:28:18,400 --> 00:28:21,520 Speaker 3: anxiety around AI, right. And I think if you think 543 00:28:21,520 --> 00:28:24,439 Speaker 3: about intelligence as a kind of one-dimensional quantity, you 544 00:28:24,440 --> 00:28:26,560 Speaker 3: can kind of imagine that, you know, humans are somewhere, 545 00:28:26,920 --> 00:28:28,320 Speaker 3: our AI systems are somewhere. 546 00:28:28,359 --> 00:28:29,560 Speaker 4: It seems like they're approaching us.
547 00:28:29,600 --> 00:28:31,560 Speaker 3: Maybe they're going to overtake us, and then, oh my god, 548 00:28:31,560 --> 00:28:33,120 Speaker 3: what is going to happen when that happens? Right, we're 549 00:28:33,160 --> 00:28:34,600 Speaker 3: just going to become redundant. There's not going to be 550 00:28:34,600 --> 00:28:36,520 Speaker 3: any jobs. Everything is going to fall apart. And so 551 00:28:37,160 --> 00:28:39,520 Speaker 3: that's a consequence of having a particular conception of what 552 00:28:39,640 --> 00:28:41,880 Speaker 3: intelligence is, which is this kind of one-dimensional way 553 00:28:41,880 --> 00:28:44,040 Speaker 3: of thinking about it. And I think there's a different 554 00:28:44,080 --> 00:28:46,200 Speaker 3: way of thinking about it which gives us a little 555 00:28:46,200 --> 00:28:48,000 Speaker 3: more flexibility and maybe a little more hope in the 556 00:28:48,040 --> 00:28:50,120 Speaker 3: way that we think about what's going to happen with AI, 557 00:28:50,560 --> 00:28:55,840 Speaker 3: and that is thinking about intelligence as being an adaptation 558 00:28:56,040 --> 00:28:59,240 Speaker 3: to the kinds of computational problems that a system has 559 00:28:59,360 --> 00:29:02,520 Speaker 3: evolved or been trained to solve, right. And so 560 00:29:03,360 --> 00:29:06,240 Speaker 3: for human beings, those computational problems are shaped by the 561 00:29:06,280 --> 00:29:08,440 Speaker 3: constraints that we operate under. And a lot of those 562 00:29:08,440 --> 00:29:11,080 Speaker 3: constraints come from our biology, right, that we, as you said, 563 00:29:11,480 --> 00:29:14,880 Speaker 3: have limited lives, have, you know, limited compute resources, what 564 00:29:14,880 --> 00:29:17,560 Speaker 3: we can carry around inside our heads, have limited bandwidth 565 00:29:17,600 --> 00:29:20,480 Speaker 3: for communication. We have to, like, make noises at each other, 566 00:29:20,800 --> 00:29:22,800 Speaker 3: or wiggle our fingers, or, you know, somehow use our 567 00:29:22,840 --> 00:29:26,280 Speaker 3: bodies to transfer data from one human mind to another 568 00:29:26,360 --> 00:29:30,440 Speaker 3: human mind. It's very inefficient. And so those constraints are 569 00:29:30,520 --> 00:29:34,200 Speaker 3: things that mean that human intelligence takes a particular shape, 570 00:29:34,440 --> 00:29:37,920 Speaker 3: which is, we're able to learn from limited data because 571 00:29:37,920 --> 00:29:40,080 Speaker 3: we have to, because we don't live that long. Right, 572 00:29:40,280 --> 00:29:44,120 Speaker 3: you can't rely on getting five thousand years of language 573 00:29:44,160 --> 00:29:47,600 Speaker 3: data or multiple human lifetimes of, you know, chess playing 574 00:29:47,680 --> 00:29:51,400 Speaker 3: or whatever it is. Right, you have to be good 575 00:29:51,400 --> 00:29:53,960 Speaker 3: at using the resources that you have in ways that 576 00:29:53,960 --> 00:29:56,000 Speaker 3: are efficient.
And so that's kind of like deciding what 577 00:29:56,080 --> 00:29:59,160 Speaker 3: to think about, being able to recognize when a problem 578 00:29:59,200 --> 00:30:01,800 Speaker 3: has a structure that you've seen before, being able 579 00:30:01,840 --> 00:30:04,280 Speaker 3: to, you know, sort of become automatic 580 00:30:04,320 --> 00:30:07,160 Speaker 3: in using certain kinds of patterns of thinking and strategies 581 00:30:07,160 --> 00:30:11,120 Speaker 3: for solving problems, really trying to make it as easy as 582 00:30:11,000 --> 00:30:13,320 Speaker 4: possible for us to use the resources that we have. 583 00:30:13,800 --> 00:30:18,760 Speaker 3: And then you need to develop capacities for trying to 584 00:30:18,800 --> 00:30:22,000 Speaker 3: circumvent those bandwidth constraints in order to be able to 585 00:30:22,040 --> 00:30:24,320 Speaker 3: do things that transcend what any individual human can do, 586 00:30:24,640 --> 00:30:30,600 Speaker 3: and that means developing things like language, writing, societies, LLCs, 587 00:30:30,920 --> 00:30:34,920 Speaker 3: you know, all of the sort of libraries, right, institutions, 588 00:30:35,240 --> 00:30:37,960 Speaker 3: all of the theory of mind, right, for reasoning about 589 00:30:37,960 --> 00:30:40,360 Speaker 3: what someone else might be trying to communicate to you. 590 00:30:40,840 --> 00:30:43,120 Speaker 3: All of this stuff is actually sort of like human 591 00:30:43,160 --> 00:30:46,920 Speaker 3: stuff that's a consequence of these constraints. And so as 592 00:30:46,920 --> 00:30:50,800 Speaker 3: we make AI systems that are smarter, those AI systems 593 00:30:50,800 --> 00:30:53,400 Speaker 3: are in turn being shaped by what they're being trained 594 00:30:53,400 --> 00:30:56,280 Speaker 3: to do and what constraints they operate under. But those 595 00:30:56,320 --> 00:30:58,760 Speaker 3: constraints are different from the ones that humans have. They 596 00:30:58,800 --> 00:31:02,320 Speaker 3: can, you know, get more data, they can get access 597 00:31:02,320 --> 00:31:04,520 Speaker 3: to more compute, they don't have bandwidth limitations. 598 00:31:04,520 --> 00:31:05,520 Speaker 4: You can just copy 599 00:31:05,240 --> 00:31:09,280 Speaker 3: a, you know, a state of an AI system across machines. 600 00:31:09,360 --> 00:31:12,440 Speaker 3: You can use the same data to train multiple AI systems, 601 00:31:13,000 --> 00:31:15,280 Speaker 3: and all of those things mean that I think, rather 602 00:31:15,320 --> 00:31:17,600 Speaker 3: than being sort of on one axis where we're sort 603 00:31:17,600 --> 00:31:20,800 Speaker 3: of thinking about better and worse, it's more that there 604 00:31:20,800 --> 00:31:23,560 Speaker 3: are many axes that we can think about intelligent systems 605 00:31:23,600 --> 00:31:25,680 Speaker 3: developing along, and we're just going to end up in 606 00:31:25,680 --> 00:31:28,080 Speaker 3: a state where we have human intelligence and we have AIs, 607 00:31:28,520 --> 00:31:30,640 Speaker 3: and they're going to be meaningfully different from one another, 608 00:31:31,160 --> 00:31:33,200 Speaker 3: rather than things that are sort of directly competing in 609 00:31:33,320 --> 00:31:34,600 Speaker 3: terms of the capacities that they have. 610 00:31:35,280 --> 00:31:36,640 Speaker 2: Yeah, I agree with you on that.
611 00:31:37,280 --> 00:31:41,480 Speaker 1: When you think about the way that humans beat machines 612 00:31:41,640 --> 00:31:44,400 Speaker 1: on data efficiency, what do you think that means is 613 00:31:44,520 --> 00:31:47,320 Speaker 1: missing architecturally from our AI systems? 614 00:31:48,880 --> 00:31:52,360 Speaker 3: I think it's actually a great question, and the 615 00:31:52,360 --> 00:31:55,040 Speaker 3: way I would express it is not in terms of architecture. 616 00:31:55,720 --> 00:31:58,280 Speaker 3: So it's actually in terms of a different part of 617 00:31:58,320 --> 00:32:01,000 Speaker 3: a neural network. So when we think about this problem 618 00:32:01,000 --> 00:32:05,520 Speaker 3: of inductive bias, right, which is, you know, what 619 00:32:06,160 --> 00:32:09,120 Speaker 3: a system is sort of disposed towards learning. As I said, 620 00:32:09,240 --> 00:32:11,840 Speaker 3: our neural networks, the way we normally set them up, 621 00:32:12,040 --> 00:32:14,040 Speaker 3: have pretty weak inductive biases. They can learn all sorts 622 00:32:14,040 --> 00:32:17,320 Speaker 3: of things. The inductive bias that a neural network has 623 00:32:17,560 --> 00:32:20,640 Speaker 3: is constrained by its architecture, but it's also constrained 624 00:32:20,640 --> 00:32:23,080 Speaker 3: by where it starts out in the space of the 625 00:32:23,120 --> 00:32:26,680 Speaker 3: settings of all of those weights. And normally the default 626 00:32:26,720 --> 00:32:28,360 Speaker 3: is that you set up your neural networks so those 627 00:32:28,360 --> 00:32:31,000 Speaker 3: weights start out really small, close to zero, and then 628 00:32:31,000 --> 00:32:32,960 Speaker 3: they sort of grow away from that as it starts 629 00:32:33,000 --> 00:32:36,040 Speaker 3: to learn how to do things. We've had success in 630 00:32:36,120 --> 00:32:40,719 Speaker 3: taking neural networks that are architecturally identical but setting them 631 00:32:40,720 --> 00:32:43,720 Speaker 3: up with different initial weights in order to create an inductive 632 00:32:43,760 --> 00:32:47,000 Speaker 3: bias that enables rapid learning. So we actually use 633 00:32:47,040 --> 00:32:49,720 Speaker 3: a technique called meta-learning, which is a method from 634 00:32:49,720 --> 00:32:53,440 Speaker 3: machine learning where you take the same neural network architecture 635 00:32:53,600 --> 00:32:55,960 Speaker 3: and the same initial weights, and you use it to 636 00:32:56,040 --> 00:32:57,960 Speaker 3: learn to solve lots of different problems, like you can 637 00:32:58,040 --> 00:33:00,800 Speaker 3: use it to learn lots of different languages, say, from 638 00:33:00,840 --> 00:33:04,000 Speaker 3: limited data, and then you try and optimize the initial 639 00:33:04,040 --> 00:33:05,800 Speaker 3: weights of the neural network to make it so it 640 00:33:05,800 --> 00:33:09,000 Speaker 3: can learn all of those languages better, using the same 641 00:33:09,080 --> 00:33:11,000 Speaker 3: kinds of algorithms we use for training the weights of 642 00:33:11,040 --> 00:33:12,880 Speaker 3: the neural network when we have these giant data sets.
643 00:33:12,880 --> 00:33:15,560 Speaker 3: You can instead use those algorithms to train the initial 644 00:33:15,560 --> 00:33:17,680 Speaker 3: weights of a neural network for a small data set, 645 00:33:17,720 --> 00:33:20,760 Speaker 3: for lots of small data sets. And when you do that, 646 00:33:20,800 --> 00:33:22,320 Speaker 3: you end up with a neural network that has an 647 00:33:22,360 --> 00:33:25,400 Speaker 3: inductive bias that makes it possible to learn from small 648 00:33:25,400 --> 00:33:26,960 Speaker 3: amounts of data. And so that's the kind of thing 649 00:33:27,000 --> 00:33:29,920 Speaker 3: we've been exploring in my lab, is can we find 650 00:33:29,960 --> 00:33:33,080 Speaker 3: a way of taking exactly these same neural network architectures 651 00:33:33,600 --> 00:33:36,000 Speaker 3: and just starting them out in a different place that 652 00:33:36,120 --> 00:33:39,120 Speaker 3: maybe aligns better with the kinds of things that humans do. 653 00:33:39,160 --> 00:33:40,880 Speaker 1: Okay, well, this is a really good segue to what I wanted 654 00:33:40,880 --> 00:33:43,600 Speaker 1: to ask you, which is, if you're looking at rules 655 00:33:43,640 --> 00:33:46,880 Speaker 1: and symbols as one sort of math to describe the mind, 656 00:33:46,880 --> 00:33:49,320 Speaker 1: and artificial neural networks as another kind of math, and 657 00:33:49,360 --> 00:33:54,200 Speaker 1: probability as another, what does an optimal hybrid look like? 658 00:33:54,320 --> 00:33:58,920 Speaker 1: Given that no single one of these describes everything 659 00:33:58,960 --> 00:34:02,240 Speaker 1: about what's going on with minds, what does the 660 00:34:02,360 --> 00:34:04,120 Speaker 1: hybrid for an AI system look like 661 00:34:04,200 --> 00:34:07,120 Speaker 2: in twenty twenty six? 662 00:34:06,920 --> 00:34:09,759 Speaker 3: The place where I end up in the book is saying that 663 00:34:10,080 --> 00:34:13,600 Speaker 3: these different kinds of math really do all fit together 664 00:34:13,640 --> 00:34:16,239 Speaker 3: in an interesting way, and in order to understand that, 665 00:34:16,280 --> 00:34:18,560 Speaker 3: we can talk about different levels of analysis when we're 666 00:34:18,560 --> 00:34:21,040 Speaker 3: trying to make sense of an information processing system. So 667 00:34:21,920 --> 00:34:23,880 Speaker 3: the most abstract level, this is an idea that was 668 00:34:23,880 --> 00:34:27,640 Speaker 3: introduced by the computational neuroscientist David Marr. He said, the 669 00:34:27,680 --> 00:34:29,560 Speaker 3: most abstract level is just thinking about the problem that 670 00:34:29,560 --> 00:34:32,279 Speaker 3: the system is solving and its ideal solution, right. And 671 00:34:32,960 --> 00:34:36,279 Speaker 3: I think logic and symbolic systems and probability theory give 672 00:34:36,360 --> 00:34:39,239 Speaker 3: us a good way of characterizing the kinds of problems 673 00:34:39,280 --> 00:34:43,279 Speaker 3: that minds have to solve, right? They're, you know, probabilistic 674 00:34:43,280 --> 00:34:45,960 Speaker 3: inference, because we have to make these uncertain inferences.
And 675 00:34:46,000 --> 00:34:49,400 Speaker 3: then logic as a way of characterizing the kinds of 676 00:34:49,440 --> 00:34:50,960 Speaker 3: things that are in the world that have this rich 677 00:34:51,000 --> 00:34:54,480 Speaker 3: structure of, you know, like a sort of combinatorial structure 678 00:34:54,520 --> 00:34:57,400 Speaker 3: that you get from having symbols and rules that 679 00:34:57,440 --> 00:35:00,160 Speaker 3: combine together with things like language and dance and all 680 00:35:00,160 --> 00:35:02,400 Speaker 3: of these, you know, structured things. Even, you know, if 681 00:35:02,400 --> 00:35:03,960 Speaker 3: you look at trees, you can see they have like 682 00:35:04,120 --> 00:35:06,759 Speaker 3: recursive structures that are expressed in them. Right, So these 683 00:35:06,840 --> 00:35:08,600 Speaker 3: kind of occur in nature and are important to be 684 00:35:08,600 --> 00:35:11,640 Speaker 3: able to understand. And then at the level below that 685 00:35:11,840 --> 00:35:17,160 Speaker 3: you have how the system solves those problems, right, like 686 00:35:17,480 --> 00:35:20,279 Speaker 3: what algorithms it might use, what representations it might use. 687 00:35:20,400 --> 00:35:22,920 Speaker 3: And then below that it's, you know, how that's actually 688 00:35:22,920 --> 00:35:25,960 Speaker 3: implemented in some kind of physical system, right, and artificial 689 00:35:25,960 --> 00:35:27,799 Speaker 3: neural networks give us a kind of story at those 690 00:35:27,880 --> 00:35:31,720 Speaker 3: levels, where we can think about them as being a 691 00:35:31,760 --> 00:35:35,439 Speaker 3: good general purpose system for learning to approximate the things 692 00:35:35,440 --> 00:35:37,799 Speaker 3: that probabilistic inference tells you to do, and learning 693 00:35:37,840 --> 00:35:40,560 Speaker 3: to approximate the structure that's contained within those symbolic systems. 694 00:35:40,960 --> 00:35:45,160 Speaker 3: So I actually think, you know, the kind of story 695 00:35:45,200 --> 00:35:47,319 Speaker 3: that we have right now that's emerged out of these 696 00:35:47,400 --> 00:35:50,120 Speaker 3: advances in AI is actually a pretty good story for 697 00:35:50,200 --> 00:35:52,759 Speaker 3: how we could think about human minds working. The thing 698 00:35:52,800 --> 00:35:55,279 Speaker 3: that's missing, the most important thing that's missing, is this kind 699 00:35:55,320 --> 00:35:59,279 Speaker 3: of aspect of inductive bias, right, where we haven't been 700 00:35:59,320 --> 00:36:01,440 Speaker 3: able to capture what human inductive biases are like in 701 00:36:01,520 --> 00:36:03,799 Speaker 3: machines, and so you have these meaningful differences that come 702 00:36:03,680 --> 00:36:04,080 Speaker 4: out of that. 703 00:36:05,440 --> 00:36:08,600 Speaker 3: But it's not a bad place for thinking about how 704 00:36:08,640 --> 00:36:11,080 Speaker 3: these pieces might fit together to give us an explanation 705 00:36:11,120 --> 00:36:12,279 Speaker 3: for how it is that minds work. 706 00:36:12,680 --> 00:36:18,399 Speaker 1: So along these lines, which AI benchmarks feel misleading to you? 707 00:36:18,520 --> 00:36:21,080 Speaker 2: And how would you make better benchmarks?
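A minimal sketch of the meta-learning idea Tom describes a little earlier: keep the architecture fixed and instead optimize where the weights start, by adapting to many small tasks and nudging the shared starting point toward each task's solution. The Reptile-style update, the sine-wave task family, the random-feature model, and all hyperparameters here are illustrative assumptions, not details from the conversation or from the lab's actual work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed random features: the "architecture" never changes;
# only where the weights start out is meta-learned.
FREQS = rng.normal(0.0, 1.0, size=(1, 64))
PHASES = rng.uniform(0.0, 2.0 * np.pi, size=64)

def features(x):
    return np.cos(x @ FREQS + PHASES)             # shape (n, 64)

def sample_task(n=10):
    """One small task: 10 points from y = a * sin(x + b) with random a and b."""
    a, b = rng.uniform(0.5, 2.0), rng.uniform(0.0, np.pi)
    x = rng.uniform(-3.0, 3.0, size=(n, 1))
    return features(x), a * np.sin(x + b)

def adapt(w0, phi, y, steps=25, lr=0.05):
    """Few-shot learning: a handful of gradient steps starting from w0."""
    w = w0.copy()
    for _ in range(steps):
        grad = 2.0 * phi.T @ (phi @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

# Meta-training (Reptile-style): after adapting to each small task, pull the
# shared initial weights a little toward that task's adapted weights, so the
# starting point itself becomes an inductive bias for this family of problems.
w_init = np.zeros((64, 1))                            # the usual near-zero start
for _ in range(500):
    phi, y = sample_task()
    w_task = adapt(w_init, phi, y)
    w_init += 0.1 * (w_task - w_init)

# On a brand-new task, a few gradient steps from the meta-learned start
# typically beat the same few steps from the default zero start.
phi_new, y_new = sample_task()
for name, w0 in [("meta-learned init", w_init), ("zero init", np.zeros((64, 1)))]:
    w = adapt(w0, phi_new, y_new, steps=5)
    print(f"{name}: loss {np.mean((phi_new @ w - y_new) ** 2):.4f}")
```

The point of the sketch is only that nothing about the model changes except its starting weights, which is the sense in which the relevant difference lives in the initialization rather than the architecture.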
708 00:36:22,640 --> 00:36:26,239 Speaker 3: So, in general, I'm not a huge fan of benchmarks, 709 00:36:26,360 --> 00:36:29,279 Speaker 3: because I think benchmarks are useful as an engineering tool, 710 00:36:29,880 --> 00:36:33,560 Speaker 3: but I, as a cognitive scientist, don't just want to 711 00:36:33,600 --> 00:36:36,120 Speaker 3: know how well something is doing something. I want to 712 00:36:36,120 --> 00:36:38,960 Speaker 3: know how it's doing that thing, right, and how it 713 00:36:39,040 --> 00:36:41,719 Speaker 3: might be sort of messing that up, right. So when 714 00:36:41,800 --> 00:36:46,160 Speaker 3: we are designing experiments as cognitive scientists, we don't just say, oh, 715 00:36:46,200 --> 00:36:48,120 Speaker 3: here's one hundred math problems, go do those one hundred 716 00:36:48,160 --> 00:36:50,720 Speaker 3: math problems, and we'll get a score. We say, let's 717 00:36:50,840 --> 00:36:54,480 Speaker 3: choose a set of math problems so that the answers 718 00:36:54,640 --> 00:36:57,960 Speaker 3: people give us tell us about the misconceptions that they 719 00:36:58,000 --> 00:37:00,319 Speaker 3: have, in a way that we can then diagnose, oh, 720 00:37:00,440 --> 00:37:02,080 Speaker 3: you know, this is why this person is thinking this 721 00:37:02,120 --> 00:37:04,160 Speaker 3: particular thing. And so I think there's lots of room 722 00:37:04,200 --> 00:37:06,520 Speaker 3: for coming up with better ways of evaluating our AI 723 00:37:06,560 --> 00:37:09,600 Speaker 3: systems that look more like cognitive science experiments, where we're really 724 00:37:09,640 --> 00:37:12,719 Speaker 3: targeting understanding what's going on rather than just trying to 725 00:37:12,719 --> 00:37:15,480 Speaker 3: get some brute sort of, you know, performance score. 726 00:37:16,320 --> 00:37:19,920 Speaker 1: Okay, good. And you have talked about curiosity as a 727 00:37:19,960 --> 00:37:25,800 Speaker 1: computational problem. So how do you think about what curiosity 728 00:37:25,960 --> 00:37:30,320 Speaker 1: is and how we might measure real curiosity in a machine? 729 00:37:30,680 --> 00:37:34,400 Speaker 3: What problem is curiosity trying to solve? Yeah, this is 730 00:37:34,440 --> 00:37:36,000 Speaker 3: a good question. You can ask 731 00:37:36,040 --> 00:37:38,480 Speaker 3: this kind of question using what we call rational analysis. Right, 732 00:37:38,520 --> 00:37:42,640 Speaker 3: if you have a system that's solving a problem, you 733 00:37:42,680 --> 00:37:44,000 Speaker 3: know, what's the problem? 734 00:37:44,000 --> 00:37:44,800 Speaker 4: What's the ideal solution? 735 00:37:44,960 --> 00:37:48,279 Speaker 3: Okay. So for curiosity, we've argued, and this is work 736 00:37:48,320 --> 00:37:49,399 Speaker 3: with Rachit Dubey, 737 00:37:49,400 --> 00:37:52,200 Speaker 4: who is now at UCLA, that 738 00:37:53,800 --> 00:37:56,960 Speaker 3: one way I think about curiosity is that you're trying 739 00:37:56,960 --> 00:38:02,760 Speaker 3: to find things that are good at increasing your long 740 00:38:03,120 --> 00:38:06,880 Speaker 3: run probability of being able to solve problems in the future. Right, 741 00:38:07,000 --> 00:38:10,120 Speaker 3: So, you know, it's sort of like you want data 742 00:38:10,200 --> 00:38:13,840 Speaker 3: for which the derivative of your total knowledge is 743 00:38:13,960 --> 00:38:18,960 Speaker 3: high relative to that particular data point.
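One toy way to make the "derivative of your total knowledge" idea concrete is to score a stimulus by the product of an estimate that it will recur, which grows with how often it has been seen, and a rough measure of how much there is still to learn about it, which shrinks with exposure. The factorization and the specific formulas below are illustrative assumptions of mine, not the model from the interview; they simply reproduce the inverted-U pattern Tom goes on to describe.

```python
def p_recurs(count, prior_strength=3.0):
    """Rough posterior-style estimate that something seen `count` times will recur."""
    return count / (count + prior_strength)

def remaining_uncertainty(count):
    """Crude stand-in for how much there is still to learn; shrinks with exposure."""
    return 1.0 / (1.0 + count)

def curiosity_score(count):
    """Expected payoff of attending now: chance of recurrence times what is left to learn."""
    return p_recurs(count) * remaining_uncertainty(count)

for count in [1, 2, 3, 5, 10, 50]:
    print(f"seen {count:>2} times -> curiosity {curiosity_score(count):.3f}")

# The scores form an inverted U: a one-off oddity scores lower than something
# seen a couple of times, and a very familiar event scores lowest of all.
```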
And so that 744 00:38:19,040 --> 00:38:21,800 Speaker 3: explanation captures some of the things that happen in human cognition, 745 00:38:22,000 --> 00:38:25,239 Speaker 3: where, you know, in some circumstances, we're interested in the 746 00:38:25,280 --> 00:38:29,239 Speaker 3: newest thing, something we've never seen before, right. But in 747 00:38:29,280 --> 00:38:31,560 Speaker 3: a lot of circumstances, those things aren't the things that 748 00:38:31,600 --> 00:38:34,520 Speaker 3: grab our attention. It's more things that maybe we've seen 749 00:38:34,520 --> 00:38:38,640 Speaker 3: a few times, and, you know, we just sort of 750 00:38:38,680 --> 00:38:41,319 Speaker 3: noticed that they're starting to occur. If something happens once, 751 00:38:41,360 --> 00:38:43,000 Speaker 3: you just say, okay, that was weird, and you sort 752 00:38:43,000 --> 00:38:45,160 Speaker 3: of dismiss it. But when something happens a few times 753 00:38:45,200 --> 00:38:47,279 Speaker 3: and it's unfamiliar to you, you say, okay, maybe I 754 00:38:47,320 --> 00:38:49,520 Speaker 3: need to figure that out, right. And something that happens 755 00:38:49,520 --> 00:38:51,120 Speaker 3: all the time, you're not that curious about. That's just 756 00:38:51,120 --> 00:38:52,520 Speaker 3: the thing that happens all the time. And you can 757 00:38:52,560 --> 00:38:55,120 Speaker 3: explain that by thinking about this sort of derivative, right, 758 00:38:55,239 --> 00:38:58,920 Speaker 3: where if something just happens once, you shouldn't be interested 759 00:38:58,960 --> 00:39:00,480 Speaker 3: in it, because it just happened once; it's probably never 760 00:39:00,480 --> 00:39:03,160 Speaker 3: going to happen again. If something happens a few times, 761 00:39:03,719 --> 00:39:06,640 Speaker 3: that's a clue that it's probably going to happen again 762 00:39:06,640 --> 00:39:09,160 Speaker 3: in the future, and you've not seen it enough to 763 00:39:09,200 --> 00:39:11,719 Speaker 3: actually know what's going on. And so paying attention to 764 00:39:11,760 --> 00:39:13,520 Speaker 3: that is good in terms of that derivative of your 765 00:39:13,520 --> 00:39:16,759 Speaker 3: future knowledge. And if something happens a lot, then it's 766 00:39:16,760 --> 00:39:18,680 Speaker 3: probably happened enough that you know something about what's going 767 00:39:18,680 --> 00:39:21,160 Speaker 3: on and it's not that interesting, right. And so 768 00:39:21,280 --> 00:39:23,120 Speaker 3: that sort of sweet spot ends up being around the 769 00:39:23,120 --> 00:39:25,359 Speaker 3: things that are sort of like just happening to you 770 00:39:25,480 --> 00:39:27,719 Speaker 3: enough times that you're starting to realize, oh, this might be 771 00:39:27,640 --> 00:39:28,880 Speaker 4: a thing that I need to pay attention to. 772 00:39:42,800 --> 00:39:45,880 Speaker 1: If you had to bet on one capability that's going to 773 00:39:46,120 --> 00:39:50,319 Speaker 1: unlock a broader intelligence, unlock a jump to that, 774 00:39:50,360 --> 00:39:51,240 Speaker 2: what's your candidate? 775 00:39:52,160 --> 00:39:54,319 Speaker 3: I actually think the biggest obstacle at the moment is 776 00:39:54,400 --> 00:39:59,120 Speaker 3: more about generalizability of intelligence rather than any specific capacity, right.
777 00:39:59,239 --> 00:40:03,920 Speaker 3: And so people in the AI world talk about jagged intelligence, right, 778 00:40:03,920 --> 00:40:06,720 Speaker 3: the sort of phenomenon where you have an AI system 779 00:40:06,760 --> 00:40:08,799 Speaker 3: that can do something that's really smart and impresses you, 780 00:40:09,280 --> 00:40:11,480 Speaker 3: and then five minutes later does something that's really dumb 781 00:40:11,520 --> 00:40:13,839 Speaker 3: on a problem that's like right next to it, and like, 782 00:40:13,880 --> 00:40:15,479 Speaker 3: if it's able to solve that first problem, it seems 783 00:40:15,480 --> 00:40:17,160 Speaker 3: obvious that it should be able to solve the second problem. 784 00:40:17,160 --> 00:40:19,120 Speaker 4: And you're just like, what happened? You know, why did 785 00:40:19,120 --> 00:40:19,799 Speaker 4: it go wrong there? 786 00:40:20,200 --> 00:40:24,680 Speaker 3: And so that lack of generalizability is also a consequence 787 00:40:24,680 --> 00:40:27,680 Speaker 3: of these kinds of inductive biases, right. So these human 788 00:40:27,719 --> 00:40:30,719 Speaker 3: inductive biases that steer us towards a solution and let 789 00:40:30,840 --> 00:40:34,640 Speaker 3: us learn from limited amounts of data, they constrain the 790 00:40:34,680 --> 00:40:36,160 Speaker 3: kinds of solutions that we find. The kinds of 791 00:40:36,200 --> 00:40:37,719 Speaker 3: solutions that we find are the ones that are sort 792 00:40:37,719 --> 00:40:40,640 Speaker 3: of like generalizable, at least to us, right. They are 793 00:40:40,640 --> 00:40:42,239 Speaker 3: things that kind of make sense, where if someone's able 794 00:40:42,280 --> 00:40:43,239 Speaker 3: to do one thing, they'll be able to do the 795 00:40:43,320 --> 00:40:45,799 Speaker 3: other thing. And because the AI systems are approaching these 796 00:40:45,840 --> 00:40:48,440 Speaker 3: problems just in a completely different way, from a different 797 00:40:48,440 --> 00:40:50,800 Speaker 3: starting point, and then getting tons of data that's allowing 798 00:40:50,800 --> 00:40:52,919 Speaker 3: them to sort of approximate what the human solutions are, 799 00:40:53,120 --> 00:40:55,600 Speaker 3: but they're coming at it from another angle, that's the 800 00:40:55,600 --> 00:40:58,160 Speaker 3: thing that makes them jagged. It's that they don't 801 00:40:58,160 --> 00:41:00,400 Speaker 3: have sort of these same compatible inductive biases that 802 00:41:00,400 --> 00:41:03,560 Speaker 3: we have, that are informed by having evolved in certain 803 00:41:03,640 --> 00:41:06,279 Speaker 3: environments and having had experience of the world, and, you know, 804 00:41:06,440 --> 00:41:08,000 Speaker 3: all of these other things that are part of what 805 00:41:08,040 --> 00:41:11,239 Speaker 3: it means to, you know, sort of learn anything as 806 00:41:11,239 --> 00:41:15,080 Speaker 3: a human being. And so because they are coming with 807 00:41:15,160 --> 00:41:17,640 Speaker 3: this different set of inductive biases, they're very influenced by 808 00:41:17,680 --> 00:41:20,560 Speaker 3: their training data, they end up doing things that are 809 00:41:20,560 --> 00:41:24,800 Speaker 3: sort of inscrutable to us because they are, you know, yeah, 810 00:41:24,840 --> 00:41:26,720 Speaker 3: like coming at these problems in a way that doesn't 811 00:41:27,400 --> 00:41:29,759 Speaker 3: make sense to us.
You know, from the starting point 812 00:41:29,800 --> 00:41:30,680 Speaker 3: that humans come from. 813 00:41:30,960 --> 00:41:33,960 Speaker 1: After writing this book, what do you think we understand 814 00:41:34,080 --> 00:41:36,960 Speaker 1: now about minds that we didn't understand, let's say, a 815 00:41:37,040 --> 00:41:37,879 Speaker 1: decade or two ago? 816 00:41:39,239 --> 00:41:43,480 Speaker 3: So it's funny, because when I have taught this material 817 00:41:43,680 --> 00:41:47,680 Speaker 3: for, you know, twenty years at this point, I 818 00:41:47,800 --> 00:41:51,200 Speaker 3: normally start my cognitive science classes saying, you know, welcome 819 00:41:51,239 --> 00:41:53,279 Speaker 3: to cognitive science. This is going to be different from 820 00:41:53,320 --> 00:41:55,480 Speaker 3: your other science classes. Normally, when you take a science class, 821 00:41:55,480 --> 00:41:57,160 Speaker 3: someone is going to stand up and say, Okay, here's 822 00:41:57,200 --> 00:41:58,520 Speaker 3: all the things that we figured out. Here are the 823 00:41:58,520 --> 00:42:02,319 Speaker 3: answers to the questions. And in cognitive science, it's more 824 00:42:02,320 --> 00:42:04,640 Speaker 3: that we figured out how to get better at asking 825 00:42:04,680 --> 00:42:05,240 Speaker 3: the questions. 826 00:42:05,239 --> 00:42:06,359 Speaker 4: We haven't answered them. We don't. 827 00:42:06,360 --> 00:42:08,200 Speaker 3: It's not like you have a consensus across the whole 828 00:42:08,200 --> 00:42:10,520 Speaker 3: field about what those answers look like. And so I 829 00:42:10,520 --> 00:42:13,040 Speaker 3: think that's important, that we're still very much, and this 830 00:42:13,120 --> 00:42:14,759 Speaker 3: is what got me interested in cognitive science in the 831 00:42:14,760 --> 00:42:17,120 Speaker 3: first place, you know, still a field that has deep 832 00:42:17,160 --> 00:42:20,640 Speaker 3: mysteries and lots of opportunities to learn and discover interesting things. 833 00:42:21,200 --> 00:42:25,560 Speaker 3: But I think over the last ten years, like, so, 834 00:42:25,640 --> 00:42:28,560 Speaker 3: as I was working on this book, I wrote the 835 00:42:28,640 --> 00:42:32,919 Speaker 3: first chapter, and I had that disclaimer in the first 836 00:42:32,960 --> 00:42:37,520 Speaker 3: chapter and said, okay, look, you know I'm not promising 837 00:42:37,560 --> 00:42:40,080 Speaker 3: you answers. Well, we're going to see if we 838 00:42:40,120 --> 00:42:42,839 Speaker 3: can get a good handle on the questions.
But by 839 00:42:42,880 --> 00:42:44,440 Speaker 3: the time I got to the end of the book, 840 00:42:44,520 --> 00:42:46,680 Speaker 3: right after that sort of process of working on it 841 00:42:46,680 --> 00:42:49,160 Speaker 3: for years, I felt like, actually, you know, 842 00:42:49,360 --> 00:42:51,480 Speaker 3: me going through the process of writing it and exploring 843 00:42:51,480 --> 00:42:53,120 Speaker 3: all these things and thinking about how they fit together, 844 00:42:53,440 --> 00:42:56,040 Speaker 3: but also just where the field was, you know, having 845 00:42:56,040 --> 00:42:58,399 Speaker 3: moved forward, I actually started to feel like these 846 00:42:58,400 --> 00:43:00,480 Speaker 3: things do fit together in a way where you can 847 00:43:00,480 --> 00:43:03,040 Speaker 3: see the glimpses of what answers are going to look 848 00:43:03,080 --> 00:43:05,040 Speaker 3: like, in a way that I think really wasn't there 849 00:43:05,040 --> 00:43:08,200 Speaker 3: ten years ago. And it's that story of, okay, we 850 00:43:08,239 --> 00:43:11,200 Speaker 3: sort of know what the goals are, right, we know 851 00:43:11,239 --> 00:43:13,880 Speaker 3: what the right mathematical systems are for describing what intelligent 852 00:43:13,880 --> 00:43:16,520 Speaker 3: systems should be doing. You have these ingredients of symbolic 853 00:43:16,560 --> 00:43:19,920 Speaker 3: systems and probabilistic inference, and we've discovered that, in fact, 854 00:43:19,960 --> 00:43:22,760 Speaker 3: you can get a remarkably long way just using these artificial 855 00:43:22,760 --> 00:43:25,680 Speaker 3: neural networks to learn to approximate those things. And so 856 00:43:26,560 --> 00:43:29,839 Speaker 3: that demonstration I think has shown, first of all, that 857 00:43:30,440 --> 00:43:36,120 Speaker 3: language is an extremely good substrate for intelligence, right, in 858 00:43:36,160 --> 00:43:38,319 Speaker 3: a way that I think people had not anticipated before 859 00:43:38,400 --> 00:43:41,520 Speaker 3: large language models, and that you can make big neural 860 00:43:41,520 --> 00:43:44,799 Speaker 3: networks that can learn to approximate really complex probability distributions. 861 00:43:45,360 --> 00:43:48,080 Speaker 3: And so it gives us some of these ingredients for 862 00:43:48,360 --> 00:43:52,080 Speaker 3: seeing how what originally were three very different views of 863 00:43:52,120 --> 00:43:54,719 Speaker 3: the mind might start to fit together to make something 864 00:43:54,760 --> 00:43:56,239 Speaker 3: that's a little bit more of a unified whole. 865 00:43:57,440 --> 00:44:00,839 Speaker 1: Excellent. And when you wrote the book, what struck you 866 00:44:00,960 --> 00:44:04,920 Speaker 1: as the most beautiful idea in the whole quest, in 867 00:44:04,960 --> 00:44:07,480 Speaker 1: the whole history of this? 868 00:44:07,560 --> 00:44:10,960 Speaker 3: Gosh. Okay, I mean, I'm a big probability theory fan, 869 00:44:11,120 --> 00:44:16,560 Speaker 3: so you're gonna get me endorsing Bayes' rule, 870 00:44:16,560 --> 00:44:19,040 Speaker 3: which I really do think is, like, when 871 00:44:19,080 --> 00:44:21,000 Speaker 3: you learn it, taking a probability class, it's just 872 00:44:21,040 --> 00:44:23,359 Speaker 3: a dumb principle of probability theory.
But 873 00:44:23,400 --> 00:44:26,279 Speaker 3: when you make this move of saying probability theory isn't 874 00:44:26,360 --> 00:44:30,360 Speaker 3: just about dice and cards, it's about, you know, beliefs, 875 00:44:30,640 --> 00:44:34,440 Speaker 3: it suddenly becomes a very deep and insightful sort of principle. 876 00:44:34,480 --> 00:44:37,640 Speaker 3: And in the book, I also show probability theory kind 877 00:44:37,640 --> 00:44:40,480 Speaker 3: of subsumes logic, like everything that's a valid logical inference 878 00:44:40,520 --> 00:44:43,040 Speaker 3: is also a valid inference in probability theory. Probability theory 879 00:44:43,080 --> 00:44:45,879 Speaker 3: just kind of extends the semantics of logic to these 880 00:44:45,920 --> 00:44:49,200 Speaker 3: cases of uncertainty. So to me, I think that's a 881 00:44:49,320 --> 00:44:50,840 Speaker 3: big one. That's kind of 882 00:44:50,880 --> 00:44:51,520 Speaker 3: where I live. 883 00:44:51,640 --> 00:44:55,080 Speaker 1: Yeah, excellent. And if we do have a 884 00:44:55,480 --> 00:44:58,520 Speaker 1: mature physics of thought, let's say fifty years from now, 885 00:44:58,560 --> 00:45:01,400 Speaker 1: what does that change for us in terms of education, 886 00:45:02,040 --> 00:45:04,040 Speaker 1: in terms of the way we build machines? 887 00:45:05,000 --> 00:45:07,640 Speaker 3: So I think this is exactly where we 888 00:45:07,680 --> 00:45:11,400 Speaker 3: can go, right, which is, once you figure out the 889 00:45:11,440 --> 00:45:13,880 Speaker 3: scientific principles of a domain, you can start to think 890 00:45:13,920 --> 00:45:16,680 Speaker 3: about how to do engineering, right. So, like, you know, 891 00:45:17,040 --> 00:45:19,680 Speaker 3: when you're an engineer and you go to engineering school, 892 00:45:20,000 --> 00:45:22,480 Speaker 3: you take physics, right, and then you learn in your 893 00:45:22,480 --> 00:45:24,799 Speaker 3: physics class what these principles are, and then you take 894 00:45:24,840 --> 00:45:27,280 Speaker 3: your applied engineering classes, which are like taking those physical 895 00:45:27,320 --> 00:45:29,040 Speaker 3: principles and telling you how to build a bridge, right, 896 00:45:29,080 --> 00:45:32,239 Speaker 3: and explaining that, you know, not in terms of heuristics 897 00:45:32,760 --> 00:45:34,520 Speaker 3: for what makes a good bridge, but in terms of 898 00:45:34,600 --> 00:45:37,520 Speaker 3: those fundamental physical principles. So I think the thing 899 00:45:37,520 --> 00:45:40,640 Speaker 3: that's incredibly exciting here is that, as we start to 900 00:45:40,719 --> 00:45:45,440 Speaker 3: converge on what these laws of thought look like, it 901 00:45:45,480 --> 00:45:48,359 Speaker 3: gives us the opportunity to do a much more sort 902 00:45:48,400 --> 00:45:54,719 Speaker 3: of science-based form of engineering applied to human cognition, 903 00:45:55,000 --> 00:45:58,480 Speaker 3: thinking about how do we make an optimal, you know, 904 00:45:58,680 --> 00:46:02,239 Speaker 3: sort of learning environment, how do we support human decision making?
905 00:46:02,280 --> 00:46:03,840 Speaker 3: That's something that I work on in my lab, which is, like, 906 00:46:04,239 --> 00:46:08,480 Speaker 3: how do we put computation into human environments to overcome 907 00:46:08,520 --> 00:46:12,440 Speaker 3: whatever computational constraints we have as individual decision makers and 908 00:46:12,480 --> 00:46:17,160 Speaker 3: help us make better decisions? And how do we understand, 909 00:46:17,320 --> 00:46:21,359 Speaker 3: you know, the kinds of things that people are doing 910 00:46:21,360 --> 00:46:23,319 Speaker 3: in a way that allows us to then sort of 911 00:46:23,360 --> 00:46:27,160 Speaker 3: like make suggestions about, you know, how they might do 912 00:46:27,239 --> 00:46:30,160 Speaker 3: them better, right. And so I think there's 913 00:46:30,160 --> 00:46:32,680 Speaker 3: a lot of potential for, you know, sort of human 914 00:46:32,760 --> 00:46:35,440 Speaker 3: upside as we start to be able to answer these 915 00:46:35,440 --> 00:46:36,400 Speaker 3: scientific questions. 916 00:46:41,200 --> 00:46:44,480 Speaker 1: That was my interview with Tom Griffiths. To quickly summarize 917 00:46:44,520 --> 00:46:49,120 Speaker 1: his framework, Tom sees three major scientific approaches that all 918 00:46:49,120 --> 00:46:53,000 Speaker 1: try to capture the mind. You've got rules and symbols, 919 00:46:53,440 --> 00:46:57,240 Speaker 1: you've got artificial neural networks, and you've got probability theory. 920 00:46:57,600 --> 00:47:01,239 Speaker 1: These are very different approaches, and each one has delivered 921 00:47:01,280 --> 00:47:05,759 Speaker 1: something a little different. Rules and symbols give us language-like 922 00:47:05,920 --> 00:47:10,040 Speaker 1: machinery, where pieces can be assembled and reassembled into 923 00:47:10,080 --> 00:47:15,759 Speaker 1: complex ideas. Artificial neural networks, they give us graded concepts, 924 00:47:15,880 --> 00:47:20,160 Speaker 1: meaning ideas can be fuzzy, and probability theory gives us 925 00:47:20,160 --> 00:47:24,160 Speaker 1: a language for dealing with uncertainty. Now, what's interesting is 926 00:47:24,200 --> 00:47:28,000 Speaker 1: that human minds seem to traffic in all of these modes. 927 00:47:28,120 --> 00:47:32,719 Speaker 1: We use structured symbols, we also use graded concepts. We 928 00:47:32,760 --> 00:47:36,480 Speaker 1: also revise our beliefs as new evidence comes in. And 929 00:47:36,560 --> 00:47:38,400 Speaker 1: part of that is that we move through the world 930 00:47:38,400 --> 00:47:43,040 Speaker 1: with prior beliefs shaped by our history, our culture, our language, 931 00:47:43,040 --> 00:47:46,799 Speaker 1: our neighborhood, our moment in time. So none of these 932 00:47:46,840 --> 00:47:50,719 Speaker 1: models by themselves are the final answer. And what this 933 00:47:50,840 --> 00:47:54,120 Speaker 1: means is that, like most scientific stories, this is one 934 00:47:54,160 --> 00:47:58,880 Speaker 1: about humility. Tom's book illustrates how every generation arrives with 935 00:47:59,280 --> 00:48:03,319 Speaker 1: some new formalism, some new piece of math, some new 936 00:48:03,719 --> 00:48:08,240 Speaker 1: model that's powerful enough to illuminate an area of mental 937 00:48:08,280 --> 00:48:10,880 Speaker 1: life, and for a moment it feels like, hey, the 938 00:48:10,880 --> 00:48:15,600 Speaker 1: whole mystery is finally collapsing.
But then the spotlight widens 939 00:48:15,680 --> 00:48:19,160 Speaker 1: and we see more terrain. So what I love about 940 00:48:19,200 --> 00:48:21,640 Speaker 1: this conversation is that it can leave us with a 941 00:48:21,840 --> 00:48:25,120 Speaker 1: sense of progress and a sense of wonder at the 942 00:48:25,160 --> 00:48:29,239 Speaker 1: same time. We feel a convergence of different fields, and 943 00:48:29,320 --> 00:48:34,280 Speaker 1: we can also feel how large this subject remains. Cognition 944 00:48:34,440 --> 00:48:37,560 Speaker 1: is still a field in motion. So let's look at 945 00:48:37,560 --> 00:48:40,799 Speaker 1: the big picture. When the field of physics matured, we 946 00:48:40,880 --> 00:48:45,399 Speaker 1: could then build bridges and airplanes and power grids because 947 00:48:45,440 --> 00:48:50,520 Speaker 1: we had firm principles to build on. So once the 948 00:48:50,640 --> 00:48:55,640 Speaker 1: laws of thought come into clearer view, what becomes possible 949 00:48:55,840 --> 00:49:00,520 Speaker 1: for education and for decision making, and for rules that 950 00:49:00,600 --> 00:49:04,839 Speaker 1: help us reason more effectively? So here we are at 951 00:49:04,880 --> 00:49:08,000 Speaker 1: a very cool moment in history where the old dream 952 00:49:08,640 --> 00:49:13,040 Speaker 1: of formalizing thought has escaped the library and shown 953 00:49:13,120 --> 00:49:17,439 Speaker 1: up in everyone's laptop. The big thinkers of centuries ago 954 00:49:17,920 --> 00:49:21,239 Speaker 1: could sort of squint and see the outline of the project, 955 00:49:21,840 --> 00:49:24,839 Speaker 1: and now we're living much more squarely right in the 956 00:49:24,880 --> 00:49:27,799 Speaker 1: middle of it. If there truly are laws of thought, 957 00:49:27,840 --> 00:49:30,960 Speaker 1: they're going to teach us about our machines, but more importantly, 958 00:49:30,960 --> 00:49:35,399 Speaker 1: they're going to teach us about ourselves, because although it's 959 00:49:35,400 --> 00:49:39,320 Speaker 1: sometimes tempting to view the mind as a ghostly exception 960 00:49:39,440 --> 00:49:43,640 Speaker 1: to the universe, the mind, presumably, is part of the universe, 961 00:49:44,120 --> 00:49:48,800 Speaker 1: and it is lawful and wondrous and discoverable, and every 962 00:49:48,880 --> 00:49:56,840 Speaker 1: step towards understanding it enlarges the human story. Go to 963 00:49:56,880 --> 00:49:59,680 Speaker 1: eagleman dot com slash podcast for more information and to 964 00:49:59,719 --> 00:50:03,520 Speaker 1: find further reading. Join the weekly discussions on my substack, 965 00:50:03,800 --> 00:50:06,800 Speaker 1: and check out and subscribe to Inner Cosmos on YouTube 966 00:50:06,800 --> 00:50:10,160 Speaker 1: for videos of each episode and to leave comments. Until 967 00:50:10,200 --> 00:50:13,640 Speaker 1: next time, I'm David Eagleman, and this is Inner Cosmos.