WEBVTT - Ep147 "Can we engineer human thought?" with Tom Griffiths

0:00:05.120 --> 0:00:07.440
<v Speaker 1>AI seems like it burst out of the gate a

0:00:07.480 --> 0:00:10.920
<v Speaker 1>few years ago. But is it actually the latest chapter

0:00:11.119 --> 0:00:16.040
<v Speaker 1>in a three-hundred-year trajectory to turn thought into math?

0:00:16.680 --> 0:00:19.400
<v Speaker 2>Can the mind be captured with equations?

0:00:19.960 --> 0:00:23.400
<v Speaker 1>Why do current AI models need petabytes of data but

0:00:23.480 --> 0:00:26.759
<v Speaker 1>a child can learn from just a few examples? Why

0:00:26.760 --> 0:00:31.440
<v Speaker 1>does AI have jagged intelligence, meaning it looks brilliant in

0:00:31.480 --> 0:00:36.080
<v Speaker 1>one moment and then it does something totally nonsensical? In physics,

0:00:36.440 --> 0:00:39.479
<v Speaker 1>we have various laws, like the law of gravity or

0:00:39.520 --> 0:00:43.320
<v Speaker 1>the laws of motion. And today we're joined by cognitive

0:00:43.360 --> 0:00:47.080
<v Speaker 1>scientist Tom Griffiths from Princeton to talk about whether we

0:00:47.159 --> 0:00:55.400
<v Speaker 1>are moving towards nailing down laws of thought. Welcome to

0:00:55.400 --> 0:00:58.880
<v Speaker 1>Inner Cosmos with me, David Eagleman. I'm a neuroscientist and

0:00:58.920 --> 0:01:02.400
<v Speaker 1>author at Stanford, and in these episodes we sail deeply

0:01:02.440 --> 0:01:05.960
<v Speaker 1>into our three-pound universe to understand why and how

0:01:05.800 --> 0:01:07.880
<v Speaker 2>our lives look the way they do.

0:01:23.920 --> 0:01:27.000
<v Speaker 1>One thing that distinguishes Homo sapiens from all our cousins

0:01:27.040 --> 0:01:30.119
<v Speaker 1>in the animal kingdom is that we watch the world

0:01:30.120 --> 0:01:34.160
<v Speaker 1>around us and we try to abstract patterns from it.

0:01:34.600 --> 0:01:37.679
<v Speaker 1>For example, you might watch the way that a stone

0:01:37.760 --> 0:01:41.039
<v Speaker 1>falls to the ground and maybe you see a tree

0:01:41.120 --> 0:01:44.200
<v Speaker 1>branch fall, and maybe you see a glacier and one

0:01:44.240 --> 0:01:46.959
<v Speaker 1>day a huge wall of ice falls off it, and

0:01:47.040 --> 0:01:51.000
<v Speaker 1>pretty soon you start seeing an underlying similarity to the

0:01:51.040 --> 0:01:54.600
<v Speaker 1>way that things move. And eventually someone very very smart

0:01:54.680 --> 0:01:58.360
<v Speaker 1>comes along, like Isaac Newton, and summarizes all this in

0:01:58.600 --> 0:02:02.080
<v Speaker 1>the law of gravity. And then the same smart guy,

0:02:02.160 --> 0:02:05.600
<v Speaker 1>Newton, comes up with the three laws of motion. And

0:02:05.640 --> 0:02:08.560
<v Speaker 1>then another smart person is Einstein. He figures out the

0:02:08.919 --> 0:02:12.840
<v Speaker 1>conservation of mass and energy, which seems to be another

0:02:13.040 --> 0:02:16.919
<v Speaker 1>ironclad law, and then we have the laws of thermodynamics

0:02:16.960 --> 0:02:20.320
<v Speaker 1>and electrostatic laws, and all of this speaks to the

0:02:20.360 --> 0:02:24.079
<v Speaker 1>great success that we've had as a species in figuring

0:02:24.120 --> 0:02:28.160
<v Speaker 1>out the lowest level of code that's running in the universe.

0:02:28.880 --> 0:02:32.120
<v Speaker 1>But for most of human history, the concept of a

0:02:32.600 --> 0:02:35.840
<v Speaker 1>thought has felt like the most intimate thing we experience

0:02:35.960 --> 0:02:38.560
<v Speaker 1>and the least tractable thing to study.

0:02:39.200 --> 0:02:41.840
<v Speaker 2>What a thought is and how it occurs,

0:02:42.280 --> 0:02:46.200
<v Speaker 1>that seems to live in a different category of mystery

0:02:46.240 --> 0:02:49.560
<v Speaker 1>from how an object falls. Why? Well, it's because the

0:02:49.639 --> 0:02:53.600
<v Speaker 1>thought pops into your head and somehow it carries memory

0:02:53.680 --> 0:02:57.760
<v Speaker 1>and expectation and language and often a feeling. But it

0:02:57.840 --> 0:03:02.440
<v Speaker 1>feels vaporous and private. It feels like the one thing

0:03:02.960 --> 0:03:07.880
<v Speaker 1>that will forever escape formal description. But what's interesting is

0:03:07.919 --> 0:03:11.680
<v Speaker 1>that for centuries people have tried. There's always been a

0:03:11.800 --> 0:03:13.399
<v Speaker 1>deep human urge

0:03:13.120 --> 0:03:16.480
<v Speaker 2>to ask whether thought has laws to it.

0:03:16.560 --> 0:03:19.720
<v Speaker 1>In other words, does the mind have principles that you

0:03:19.760 --> 0:03:23.760
<v Speaker 1>can write down? Does reasoning have a grammar to it?

0:03:24.120 --> 0:03:27.960
<v Speaker 1>Can you describe intelligence in a language that's precise enough

0:03:28.360 --> 0:03:31.440
<v Speaker 1>that once you understand the rules, you can begin to

0:03:31.480 --> 0:03:35.800
<v Speaker 1>build with them, like build artificial intelligence? Most of us

0:03:35.840 --> 0:03:38.400
<v Speaker 1>are old enough to remember that this question of AI

0:03:38.960 --> 0:03:42.840
<v Speaker 1>once lived in philosophy seminars and math departments, but now

0:03:42.880 --> 0:03:45.320
<v Speaker 1>it's sitting at the center of our economy.

0:03:46.280 --> 0:03:48.200
<v Speaker 2>Okay, so what is thought?

0:03:48.440 --> 0:03:52.640
<v Speaker 1>Can we capture it in formal systems like laws or equations?

0:03:53.040 --> 0:03:57.600
<v Speaker 1>Do different parts of intelligence come from logic, from learning,

0:03:57.720 --> 0:04:03.840
<v Speaker 1>from uncertainty, from memory, from prior knowledge, from living inside bodies,

0:04:03.920 --> 0:04:06.120
<v Speaker 1>from living inside our cultures,

0:04:06.640 --> 0:04:08.040
<v Speaker 2>from the particular

0:04:07.600 --> 0:04:11.120
<v Speaker 1>constraints of being a human animal with a short lifespan

0:04:11.520 --> 0:04:15.320
<v Speaker 1>and limited bandwidth? Our guest today is someone who lives

0:04:15.400 --> 0:04:19.240
<v Speaker 1>right at the intersection of all these questions. Tom Griffiths

0:04:19.320 --> 0:04:22.880
<v Speaker 1>is a professor at Princeton, where he directs the Computational

0:04:22.960 --> 0:04:27.680
<v Speaker 1>Cognitive Science Lab and the Princeton Laboratory for Artificial Intelligence.

0:04:28.160 --> 0:04:31.719
<v Speaker 1>He has spent years asking how minds work through the

0:04:31.839 --> 0:04:35.760
<v Speaker 1>different lenses of math and computation and learning. And he's

0:04:35.760 --> 0:04:38.919
<v Speaker 1>the author of a wonderful new book called The Laws

0:04:38.960 --> 0:04:43.719
<v Speaker 1>of Thought, which traces the long history of thinkers asking

0:04:44.080 --> 0:04:46.719
<v Speaker 1>are there rules to this? Can we understand what human

0:04:46.760 --> 0:04:50.839
<v Speaker 1>thinking is? In his book we get the lengthy arc

0:04:51.040 --> 0:04:55.839
<v Speaker 1>of minds trying to understand mind. This begins millennia ago

0:04:55.920 --> 0:05:00.520
<v Speaker 1>with Aristotle, who wondered whether logic itself could be mathematized,

0:05:00.960 --> 0:05:05.520
<v Speaker 1>and Tom follows the trail through the architects of symbolic reasoning,

0:05:05.960 --> 0:05:09.240
<v Speaker 1>through the birth of computation, through the rise of neural networks,

0:05:09.640 --> 0:05:13.680
<v Speaker 1>through the realization that probability theory might serve as a

0:05:13.800 --> 0:05:17.719
<v Speaker 1>language for our beliefs about things. Along the way, in

0:05:17.760 --> 0:05:20.880
<v Speaker 1>his book, a picture emerges that there may not be

0:05:21.320 --> 0:05:24.440
<v Speaker 1>just a single tool for capturing the mind, but instead

0:05:24.440 --> 0:05:27.760
<v Speaker 1>there are different ways of trying to tackle the problem,

0:05:28.120 --> 0:05:32.520
<v Speaker 1>and each one sheds light on a different aspect of cognition.

0:05:33.240 --> 0:05:36.360
<v Speaker 1>So we're going to talk about ourselves, human minds, and

0:05:36.440 --> 0:05:40.160
<v Speaker 1>we'll talk about AI: what kind of intelligence is this

0:05:40.440 --> 0:05:47.480
<v Speaker 1>and what is missing? Here's my interview with Tom Griffiths.

0:05:48.160 --> 0:05:50.320
<v Speaker 3>As soon as you turn thought into math, it becomes

0:05:50.320 --> 0:05:52.480
<v Speaker 3>something that machines would be able to do. And so

0:05:52.600 --> 0:05:57.160
<v Speaker 3>our modern AI systems are really a consequence of, you know,

0:05:57.200 --> 0:06:00.520
<v Speaker 3>that thought that people were having hundreds of years ago,

0:06:01.080 --> 0:06:04.160
<v Speaker 3>of being able to turn thought into something that can

0:06:04.160 --> 0:06:05.800
<v Speaker 3>be expressed in mathematical terms.

0:06:05.920 --> 0:06:07.440
<v Speaker 1>And so one of the things that I loved about

0:06:07.440 --> 0:06:09.400
<v Speaker 1>your book, by the way, is that you really tell

0:06:09.560 --> 0:06:11.640
<v Speaker 1>stories of all the thinkers.

0:06:12.240 --> 0:06:14.880
<v Speaker 2>You dive into the lives, you tell them with real color.

0:06:15.160 --> 0:06:17.440
<v Speaker 1>If you were going to start with one thinker that

0:06:17.480 --> 0:06:19.160
<v Speaker 1>you think is the most important, who would that be?

0:06:19.400 --> 0:06:21.160
<v Speaker 3>There are a couple of people who have this sort

0:06:21.160 --> 0:06:23.960
<v Speaker 3>of enduring influence throughout the book. One of them is Leibniz,

0:06:24.040 --> 0:06:26.680
<v Speaker 3>who kind of started this enterprise in some sense. He

0:06:26.760 --> 0:06:30.360
<v Speaker 3>was really trying to take the idea of logic as

0:06:30.400 --> 0:06:33.280
<v Speaker 3>expressed by Aristotle and turn it into math, but ultimately

0:06:33.279 --> 0:06:35.960
<v Speaker 3>failed in doing that. But along the way he also

0:06:36.120 --> 0:06:38.799
<v Speaker 3>discovered the calculus, which turned out to be really important

0:06:38.800 --> 0:06:41.440
<v Speaker 3>when people wanted to make neural networks that could learn

0:06:41.520 --> 0:06:44.480
<v Speaker 3>from data. It turns out that the trick for doing

0:06:44.560 --> 0:06:46.840
<v Speaker 3>that is actually a trick that Leibniz had figured out

0:06:46.880 --> 0:06:50.599
<v Speaker 3>all that time ago. And then another key figure here,

0:06:50.760 --> 0:06:53.320
<v Speaker 3>as might be suggested by the title of the book,

0:06:53.400 --> 0:06:57.800
<v Speaker 3>is George Boole, who was a nineteenth-century mathematician. He

0:06:57.920 --> 0:06:59.600
<v Speaker 3>was a school teacher for most of his life and

0:06:59.680 --> 0:07:01.600
<v Speaker 3>did a lot of like serious math on the side

0:07:01.640 --> 0:07:03.680
<v Speaker 3>and, you know, had a big effect on the

0:07:03.720 --> 0:07:08.080
<v Speaker 3>history of mathematics. But he was really the person who

0:07:08.160 --> 0:07:11.560
<v Speaker 3>then first solved that problem that Leibniz had posed. And

0:07:11.600 --> 0:07:15.480
<v Speaker 3>in addition to the impact that that work had, he's

0:07:15.560 --> 0:07:19.000
<v Speaker 3>also the great-great-grandfather of Geoff Hinton, who was one

0:07:19.000 --> 0:07:21.400
<v Speaker 3>of the people who played an important role in developing

0:07:21.440 --> 0:07:24.040
<v Speaker 3>these algorithms for learning in neural networks. And so

0:07:24.200 --> 0:07:26.240
<v Speaker 3>you could make an argument that without Boole we would

0:07:26.240 --> 0:07:29.480
<v Speaker 3>be a fair way back from where we are today.

0:07:29.920 --> 0:07:30.080
<v Speaker 2>You know.

0:07:30.200 --> 0:07:33.160
<v Speaker 1>Interestingly, when most people think about Boole, they only know

0:07:33.320 --> 0:07:38.360
<v Speaker 1>about Boolean numbers. They know about zero and one binary numbers,

0:07:38.680 --> 0:07:41.520
<v Speaker 1>and that's essentially the extent of it. But he

0:07:41.560 --> 0:07:43.760
<v Speaker 1>was quite celebrated in his life, right, even though he

0:07:43.840 --> 0:07:48.440
<v Speaker 1>was a headmaster and not formally involved as a professor.

0:07:48.560 --> 0:07:51.000
<v Speaker 1>Am I correct about this? He nonetheless was quite recognized

0:07:51.000 --> 0:07:51.800
<v Speaker 1>as a mathematician.

0:07:52.400 --> 0:07:55.080
<v Speaker 3>Yeah, he became a university professor later in his life,

0:07:55.120 --> 0:07:57.080
<v Speaker 3>but spent most of his life as a teacher and

0:07:57.120 --> 0:08:01.080
<v Speaker 3>a headmaster. But yeah, he won a gold medal

0:08:01.080 --> 0:08:04.080
<v Speaker 3>in mathematics from the Royal Society, which was a very prestigious award,

0:08:05.080 --> 0:08:08.960
<v Speaker 3>and you know, was this amazing person who was having

0:08:09.000 --> 0:08:11.880
<v Speaker 3>these high-level correspondences with the leading mathematicians of the

0:08:11.960 --> 0:08:16.360
<v Speaker 3>day while holding down his job running a small school.

0:08:16.960 --> 0:08:17.640
<v Speaker 2>Yeah.

0:08:17.800 --> 0:08:22.840
<v Speaker 1>Now, in the book, you essentially use three different frameworks.

0:08:22.920 --> 0:08:26.440
<v Speaker 1>What phenomenon does each framework explain

0:08:26.560 --> 0:08:27.960
<v Speaker 2>unusually well?

0:08:27.840 --> 0:08:29.880
<v Speaker 3>The three frameworks I talk about in the book are what

0:08:29.920 --> 0:08:32.360
<v Speaker 3>I call rules and symbols, which is what we've been

0:08:32.400 --> 0:08:34.800
<v Speaker 3>talking about, this kind of approach that stems out

0:08:34.840 --> 0:08:37.120
<v Speaker 3>of logic, where the idea is that you're going to

0:08:37.200 --> 0:08:39.719
<v Speaker 3>be able to write down some rules that characterize the

0:08:39.720 --> 0:08:41.960
<v Speaker 3>structure of thought, and by following those rules, you end

0:08:42.000 --> 0:08:47.319
<v Speaker 3>up with interesting consequences. The second approach is networks, features

0:08:47.320 --> 0:08:47.840
<v Speaker 3>and spaces.

0:08:47.920 --> 0:08:48.079
<v Speaker 4>Right.

0:08:48.120 --> 0:08:50.640
<v Speaker 3>This is neural networks, which you can kind of think

0:08:50.640 --> 0:08:54.160
<v Speaker 3>about as a system for doing computation when you start

0:08:54.200 --> 0:08:56.839
<v Speaker 3>representing things as points in a space. Right, So if

0:08:56.840 --> 0:09:01.600
<v Speaker 3>you start to think about you know, every object that

0:09:01.640 --> 0:09:03.360
<v Speaker 3>you could see in the world as not being something

0:09:03.400 --> 0:09:05.920
<v Speaker 3>that's described by rules, but being described by a location

0:09:06.040 --> 0:09:08.560
<v Speaker 3>along some dimensions. You need to have a way of

0:09:08.600 --> 0:09:11.400
<v Speaker 3>talking about how to map between those spaces, and neural

0:09:11.440 --> 0:09:14.120
<v Speaker 3>networks solve that problem. And then the third is

0:09:14.960 --> 0:09:20.439
<v Speaker 3>probability and statistics. And probability theory is really powerful because

0:09:20.720 --> 0:09:24.120
<v Speaker 3>it is the complement to logic, where logic tells us

0:09:24.160 --> 0:09:26.120
<v Speaker 3>how to go from things that we know to be

0:09:26.200 --> 0:09:28.959
<v Speaker 3>true to other things that we're equally certain are true.

0:09:29.360 --> 0:09:31.680
<v Speaker 3>Probability theory tells us what to do when we're uncertain.

0:09:32.160 --> 0:09:34.920
<v Speaker 3>So if we get some information we want to draw

0:09:34.920 --> 0:09:37.319
<v Speaker 3>a conclusion, but we're not able to draw that conclusion

0:09:37.360 --> 0:09:40.559
<v Speaker 3>with perfect certainty, Probability theory tells us how to do that,

0:09:40.920 --> 0:09:44.000
<v Speaker 3>and it tells us how to combine our sort of

0:09:44.400 --> 0:09:49.320
<v Speaker 3>background beliefs, the other sources of information we have, our

0:09:49.360 --> 0:09:52.120
<v Speaker 3>biases, in with the data that we see in a

0:09:52.120 --> 0:09:54.160
<v Speaker 3>way that helps us to explain how it's possible to

0:09:54.240 --> 0:09:56.440
<v Speaker 3>learn from small amounts of data. And that's one thing

0:09:56.480 --> 0:09:59.199
<v Speaker 3>which is still something that discriminates human learning from the

0:09:59.280 --> 0:10:01.120
<v Speaker 3>learning that's done by AI systems today.

0:10:01.520 --> 0:10:03.560
<v Speaker 1>Okay, great, so we're going to dive into each of

0:10:03.600 --> 0:10:06.560
<v Speaker 1>these three lenses. But just before we do, do you

0:10:06.640 --> 0:10:11.680
<v Speaker 1>see the AI conversation today over-indexing on one of

0:10:11.720 --> 0:10:13.160
<v Speaker 1>these lenses over the others?

0:10:14.760 --> 0:10:17.520
<v Speaker 3>I think there's a lot of emphasis on neural networks,

0:10:17.559 --> 0:10:21.240
<v Speaker 3>which are fundamentally the sort of engineering technology which is

0:10:21.280 --> 0:10:25.360
<v Speaker 3>making possible the creation of our chatbots and the other

0:10:25.520 --> 0:10:29.600
<v Speaker 3>sort of big AI systems that are deployed. I think

0:10:30.200 --> 0:10:34.360
<v Speaker 3>that potentially misses out the importance of these other threads

0:10:34.559 --> 0:10:37.679
<v Speaker 3>right, where one thing that's important to remember is that

0:10:37.720 --> 0:10:40.680
<v Speaker 3>those neural networks are being trained on what is essentially

0:10:40.720 --> 0:10:43.200
<v Speaker 3>a system of rules and symbols. They're being trained on

0:10:43.720 --> 0:10:47.480
<v Speaker 3>human language, which is symbolic and rule-like in various ways,

0:10:48.160 --> 0:10:50.680
<v Speaker 3>and they're being trained on code, which is even more

0:10:50.720 --> 0:10:54.240
<v Speaker 3>symbolic and even more rule-like. And those things together

0:10:54.280 --> 0:10:56.559
<v Speaker 3>provide some of the substrate for developing the kind of

0:10:56.600 --> 0:10:59.280
<v Speaker 3>intelligence that they demonstrate. And then the way that they're

0:10:59.320 --> 0:11:03.079
<v Speaker 3>trained is by learning to predict the next token, right,

0:11:03.120 --> 0:11:05.080
<v Speaker 3>the next word or part of word, based on what

0:11:05.080 --> 0:11:07.760
<v Speaker 3>they've seen so far. And that way of training them

0:11:07.800 --> 0:11:12.040
<v Speaker 3>is actually using probability theory. So that's a probabilistic problem

0:11:12.040 --> 0:11:14.079
<v Speaker 3>because you're making a guess about what the next thing

0:11:14.120 --> 0:11:15.920
<v Speaker 3>is going to be based on the things that you see,

0:11:16.200 --> 0:11:18.520
<v Speaker 3>and so that's an important ingredient in their success as well,

0:11:18.559 --> 0:11:22.000
<v Speaker 3>is that they're essentially learning to approximate a big probability distribution.

0:11:22.360 --> 0:11:25.079
<v Speaker 1>So let's dive into the first one, rules and symbols.

0:11:25.280 --> 0:11:28.600
<v Speaker 1>So take us back to the original urge. Why did

0:11:28.679 --> 0:11:33.559
<v Speaker 1>early thinkers believe that this could be used to explain thinking.

0:11:35.200 --> 0:11:38.240
<v Speaker 3>I think a lot of the draw of rules and

0:11:38.280 --> 0:11:41.760
<v Speaker 3>symbols was that that really was, in some way what

0:11:42.120 --> 0:11:45.400
<v Speaker 3>mathematics was to people, right? So Leibniz, part of the

0:11:45.400 --> 0:11:47.400
<v Speaker 3>reason why he wasn't able to solve this problem of

0:11:47.440 --> 0:11:50.160
<v Speaker 3>figuring out how to turn thought into math is that

0:11:50.679 --> 0:11:53.560
<v Speaker 3>what he thought math was, or the kind of math

0:11:53.600 --> 0:11:55.440
<v Speaker 3>that he was trying to use to solve that problem,

0:11:56.000 --> 0:11:59.080
<v Speaker 3>was really arithmetic, right, And arithmetic was kind of like

0:11:59.120 --> 0:12:01.560
<v Speaker 3>the model that they had for a mathematical system. So

0:12:01.600 --> 0:12:04.120
<v Speaker 3>you can think about ideas being added together or subtracting

0:12:04.120 --> 0:12:07.080
<v Speaker 3>one idea from another, and really thinking about the operators

0:12:07.120 --> 0:12:09.160
<v Speaker 3>that you're using as being the things that are sort

0:12:09.160 --> 0:12:11.280
<v Speaker 3>of coming from this familiar mathematical language.

0:12:11.360 --> 0:12:13.640
<v Speaker 4>And so I think part of

0:12:13.640 --> 0:12:16.240
<v Speaker 3>the reason that we end up with that approach is

0:12:16.280 --> 0:12:19.120
<v Speaker 3>because of the kind of math that has been successful

0:12:19.120 --> 0:12:22.920
<v Speaker 3>in other settings, right where we need to do arithmetic

0:12:23.040 --> 0:12:25.320
<v Speaker 3>to, you know, that's a good description of certain

0:12:25.160 --> 0:12:26.320
<v Speaker 4>kinds of things that human minds do.

0:12:27.720 --> 0:12:30.120
<v Speaker 3>Boole had the insight that you needed a different kind

0:12:30.120 --> 0:12:32.640
<v Speaker 3>of algebra in order to describe thought, and then that's

0:12:32.679 --> 0:12:36.200
<v Speaker 3>what leads to modern mathematical logic. But it's still in

0:12:36.240 --> 0:12:39.560
<v Speaker 3>this kind of symbolic language, although Boole also talked about

0:12:39.559 --> 0:12:42.840
<v Speaker 3>probability theory as being important for capturing languages as well.

0:12:42.920 --> 0:12:45.360
<v Speaker 3>So I think it's really more about what are the

0:12:45.440 --> 0:12:48.640
<v Speaker 3>kinds of mathematical systems that it was sort of straightforward

0:12:48.640 --> 0:12:51.199
<v Speaker 3>to formalize, and that gave us something that we could

0:12:51.240 --> 0:12:53.600
<v Speaker 3>try to map thought onto. And that's what we do

0:12:53.679 --> 0:12:58.320
<v Speaker 3>as scientists is often taking mathematical systems that mathematicians have

0:12:58.360 --> 0:13:00.840
<v Speaker 3>defined for us and then saying, oh, I think this

0:13:01.000 --> 0:13:03.800
<v Speaker 3>mathematical system maps onto the thing that I want to understand,

0:13:04.200 --> 0:13:06.400
<v Speaker 3>and so trying to establish that correspondence, and that

0:13:06.520 --> 0:13:08.199
<v Speaker 3>then allows us to derive its consequences.

0:13:09.160 --> 0:13:12.840
<v Speaker 1>So speaking of rules and symbols, thinkers like Newell

0:13:12.920 --> 0:13:17.040
<v Speaker 1>and Simon, they popularized this idea of goals and subgoals.

0:13:17.480 --> 0:13:21.920
<v Speaker 1>What did that viewpoint get exactly right about human problem solving?

0:13:23.440 --> 0:13:26.760
<v Speaker 3>So now we're fast forwarding a bit right from we

0:13:26.840 --> 0:13:30.800
<v Speaker 3>have Boole figuring out the structure of logic. That turns

0:13:30.840 --> 0:13:32.480
<v Speaker 3>into you know, lots of people then sort of turn

0:13:32.520 --> 0:13:34.840
<v Speaker 3>that into a sort of mature theory of logic. You

0:13:34.920 --> 0:13:39.160
<v Speaker 3>get Alan Turing kind of turning this into a theory of computation,

0:13:39.480 --> 0:13:41.960
<v Speaker 3>thinking about what an abstract mathematician is doing when they're

0:13:42.000 --> 0:13:43.880
<v Speaker 3>doing something like logic, and thinking about how you can

0:13:43.920 --> 0:13:48.240
<v Speaker 3>make a machine do that. And then we have people

0:13:48.520 --> 0:13:51.560
<v Speaker 3>starting to realize that, you know, as digital computers are

0:13:51.559 --> 0:13:55.560
<v Speaker 3>being developed, maybe those provide a good model for how

0:13:55.600 --> 0:13:59.880
<v Speaker 3>thinking works in general, and then trying to use a

0:14:00.000 --> 0:14:03.200
<v Speaker 3>computer as a sort of foundation for you know, thinking

0:14:03.200 --> 0:14:05.280
<v Speaker 3>about things like how people might solve problems. And so

0:14:06.040 --> 0:14:09.800
<v Speaker 3>Allen Newell and Herbert Simon were influential cognitive scientists who

0:14:10.600 --> 0:14:14.679
<v Speaker 3>did exactly that. They had this idea that maybe there

0:14:14.760 --> 0:14:17.400
<v Speaker 3>is a way that you could make computers smarter by

0:14:17.440 --> 0:14:20.720
<v Speaker 3>using insights from human cognition, but also get a better

0:14:20.800 --> 0:14:23.160
<v Speaker 3>understanding of what humans are doing when they're solving problems

0:14:23.200 --> 0:14:25.400
<v Speaker 3>by using the sort of ideas that come from things

0:14:25.440 --> 0:14:29.360
<v Speaker 3>like computer programming, and so they set up you know this,

0:14:29.640 --> 0:14:31.240
<v Speaker 3>you know, when we're trying to solve a problem or

0:14:31.240 --> 0:14:33.360
<v Speaker 3>prove a mathematical theorem or play a game of chess,

0:14:33.720 --> 0:14:35.520
<v Speaker 3>they set this up as a problem of searching through

0:14:35.560 --> 0:14:40.600
<v Speaker 3>a tree of possibilities, where what you're doing is making choices,

0:14:40.880 --> 0:14:42.480
<v Speaker 3>and then each of those choices gives you a new

0:14:42.520 --> 0:14:44.240
<v Speaker 3>set of choices, and each of those choices gives you

0:14:44.240 --> 0:14:46.360
<v Speaker 3>a new set of choices, and the hard thing is

0:14:46.880 --> 0:14:49.160
<v Speaker 3>finding a path through those choices that leads to the

0:14:49.160 --> 0:14:51.160
<v Speaker 3>point that you want to end up at. And so

0:14:51.720 --> 0:14:54.120
<v Speaker 3>that's something where you can take inspiration from how human

0:14:54.120 --> 0:14:57.720
<v Speaker 3>mathematicians solve problems. You can take inspiration from the kind

0:14:57.760 --> 0:14:59.960
<v Speaker 3>of you know, tricks like working backwards from the end

0:15:00.080 --> 0:15:03.320
<v Speaker 3>towards the start. Right, Those were principles that they were

0:15:03.360 --> 0:15:05.280
<v Speaker 3>able to use to try and explain these aspects of

0:15:05.360 --> 0:15:07.800
<v Speaker 3>human cognition as well as making the machines work better.

0:15:08.040 --> 0:15:09.960
<v Speaker 1>Okay, but then one of the things that happened is

0:15:10.000 --> 0:15:13.120
<v Speaker 1>that at least one of these attempts had ballooned into

0:15:13.200 --> 0:15:17.320
<v Speaker 1>twenty five million rules. And so what does that teach

0:15:17.400 --> 0:15:20.520
<v Speaker 1>us about the shape of human intelligence.

0:15:21.960 --> 0:15:23.800
<v Speaker 4>This rules and symbols enterprise.

0:15:24.040 --> 0:15:24.160
<v Speaker 2>Right.

0:15:24.240 --> 0:15:27.000
<v Speaker 4>The sort of appeal that this had was that maybe

0:15:27.040 --> 0:15:27.520
<v Speaker 4>one day

0:15:27.400 --> 0:15:29.560
<v Speaker 3>you could just write down all of the rules that

0:15:29.560 --> 0:15:31.720
<v Speaker 3>you need to write down, and then you've characterized how

0:15:31.760 --> 0:15:34.320
<v Speaker 3>intelligence works. Right, So it's just a matter of getting

0:15:34.400 --> 0:15:37.440
<v Speaker 3>enough rules in a way that's very reminiscent today, right

0:15:37.520 --> 0:15:40.920
<v Speaker 3>of you know, the way that our modern AI systems

0:15:40.920 --> 0:15:43.520
<v Speaker 3>are being made is by training them on more and

0:15:43.600 --> 0:15:46.640
<v Speaker 3>more data, right, feeding in more and more language. There

0:15:46.680 --> 0:15:48.520
<v Speaker 3>was a hope that you could just like, yeah, like

0:15:48.800 --> 0:15:50.760
<v Speaker 3>document all of the rules that you need to capture

0:15:51.080 --> 0:15:54.360
<v Speaker 3>the structure of human knowledge. And so that led to

0:15:54.960 --> 0:15:57.240
<v Speaker 3>you know, companies being started to try and engage in

0:15:57.280 --> 0:16:01.880
<v Speaker 3>that enterprise, ultimately I would say, unsuccessfully, but giving us

0:16:01.920 --> 0:16:06.440
<v Speaker 3>some kind of characterization of like particular subsets of human knowledge.

0:16:06.520 --> 0:16:08.640
<v Speaker 3>And so I think the thing that came out of

0:16:08.920 --> 0:16:12.280
<v Speaker 3>that enterprise was revealing that maybe you need something more

0:16:12.360 --> 0:16:16.600
<v Speaker 3>than just rules, right, that maybe thinking about logic as

0:16:16.640 --> 0:16:19.480
<v Speaker 3>a basis for our model of intelligence was missing something.

0:16:20.040 --> 0:16:22.240
<v Speaker 3>It's an approach that worked really well for certain kinds

0:16:22.280 --> 0:16:26.200
<v Speaker 3>of problems like doing arithmetic, playing games like chess, but

0:16:26.240 --> 0:16:28.520
<v Speaker 3>it didn't work very well for other kinds of problems

0:16:28.640 --> 0:16:31.520
<v Speaker 3>like figuring out what you're seeing in the world, or

0:16:31.880 --> 0:16:34.360
<v Speaker 3>actually learning language or these other kinds of things.

0:16:34.400 --> 0:16:37.080
<v Speaker 1>And so this is what leads to your second lens,

0:16:37.240 --> 0:16:40.800
<v Speaker 1>which is neural networks. And you talk about these as

0:16:40.880 --> 0:16:44.080
<v Speaker 1>having you know, a boom and bust history. So, first,

0:16:44.120 --> 0:16:46.720
<v Speaker 1>what happened in the last decade that allowed them to

0:16:46.760 --> 0:16:48.880
<v Speaker 1>turn into the dominant paradigm.

0:16:49.320 --> 0:16:53.440
<v Speaker 3>The big breakthrough in the last decade was really about

0:16:53.760 --> 0:16:56.680
<v Speaker 3>being able to make bigger neural networks that could

0:16:56.720 --> 0:17:01.840
<v Speaker 3>be trained on more data in a way that could scale, right,

0:17:01.880 --> 0:17:05.159
<v Speaker 3>and so what bigger here means is this: an artificial

0:17:05.240 --> 0:17:08.840
<v Speaker 3>neural network is a set of units that are communicating

0:17:08.840 --> 0:17:13.120
<v Speaker 3>with one another. They're communicating along weighted connections, a sort

0:17:13.160 --> 0:17:15.639
<v Speaker 3>of you know, imagine like how neurons are connected in

0:17:15.680 --> 0:17:17.840
<v Speaker 3>your brain, and those neurons are connected to one another

0:17:17.880 --> 0:17:20.359
<v Speaker 3>and sending each other signals. An artificial neural network is

0:17:20.400 --> 0:17:23.719
<v Speaker 3>basically simulating that kind of structure inside a computer. And

0:17:23.760 --> 0:17:26.679
<v Speaker 3>so for a long time, the sort of the history

0:17:26.680 --> 0:17:30.040
<v Speaker 3>of neural networks has been one of people figuring out

0:17:30.119 --> 0:17:33.880
<v Speaker 3>how to make bigger neural networks work. So the very

0:17:33.920 --> 0:17:37.800
<v Speaker 3>first you know, learning neural networks. They had a learning

0:17:37.840 --> 0:17:40.719
<v Speaker 3>algorithm that worked for one layer of weights, and then

0:17:40.760 --> 0:17:42.560
<v Speaker 3>there was a breakthrough in the nineteen eighties that meant,

0:17:42.600 --> 0:17:44.280
<v Speaker 3>now you had a learning algorithm that could work for

0:17:44.359 --> 0:17:46.600
<v Speaker 3>multiple layers of weights, but it didn't work for very

0:17:46.760 --> 0:17:48.920
<v Speaker 3>deep neural networks with lots of layers of weights

0:17:48.960 --> 0:17:52.160
<v Speaker 3>because, I can sort of explain the technical reasons

0:17:52.200 --> 0:17:54.159
<v Speaker 3>behind it, but you know, sort of like the basic

0:17:54.160 --> 0:17:57.760
<v Speaker 3>algorithm didn't quite work. And so the big breakthroughs of

0:17:57.840 --> 0:18:00.000
<v Speaker 3>the last you know, ten to fifteen years have been

0:18:00.080 --> 0:18:03.480
<v Speaker 3>about coming up with ways to take those algorithms and

0:18:03.520 --> 0:18:05.440
<v Speaker 3>actually make them work for neural networks that are bigger

0:18:05.480 --> 0:18:07.679
<v Speaker 3>and bigger and deeper and deeper, that are able to

0:18:07.840 --> 0:18:11.840
<v Speaker 3>easily learn more complex functions and can do so from

0:18:12.160 --> 0:18:14.879
<v Speaker 3>massive amounts of data in a way that means that

0:18:14.920 --> 0:18:18.000
<v Speaker 3>they're able to discover sort of complex relationships between things

0:18:18.000 --> 0:18:19.960
<v Speaker 3>that are necessary to produce intelligent behavior.

0:18:20.280 --> 0:18:24.080
<v Speaker 1>And so, what do these neural networks capture about cognition

0:18:24.760 --> 0:18:29.200
<v Speaker 1>that symbols missed, especially in terms of things like similarity

0:18:29.240 --> 0:18:32.040
<v Speaker 1>and fuzziness and graded concepts.

0:18:32.560 --> 0:18:35.000
<v Speaker 3>Fuzziness is a really good way of describing it. It's

0:18:35.040 --> 0:18:39.080
<v Speaker 3>that you know, if you ask somebody, you know, whether

0:18:39.119 --> 0:18:41.800
<v Speaker 3>something is a piece of furniture, they're going to say,

0:18:42.119 --> 0:18:43.840
<v Speaker 3>you know, if you show them a chair, they'll say, yes,

0:18:43.920 --> 0:18:46.879
<v Speaker 3>definitely a piece of furniture. If you show them a rug,

0:18:47.440 --> 0:18:51.119
<v Speaker 3>they'll say, yeah, maybe a piece of furniture. Right, it

0:18:51.119 --> 0:18:52.960
<v Speaker 3>doesn't sort of fit with our you know, we sort

0:18:52.960 --> 0:18:56.240
<v Speaker 3>of have a prototypical idea of what furniture is, which

0:18:56.240 --> 0:18:58.479
<v Speaker 3>contains things like chairs and tables and ottomans and these

0:18:58.480 --> 0:19:02.119
<v Speaker 3>other kinds of things, and then rugs and treadmills, and you

0:19:02.160 --> 0:19:04.040
<v Speaker 4>know, like these are things that maybe

0:19:03.760 --> 0:19:06.320
<v Speaker 3>are in this category, but maybe aren't, right? And so

0:19:07.160 --> 0:19:09.520
<v Speaker 3>we need to have a way of thinking about concepts

0:19:09.520 --> 0:19:11.760
<v Speaker 3>that's not just the sort of yes or no, true

0:19:11.800 --> 0:19:14.760
<v Speaker 3>or false one or zero that logic would give us.

0:19:14.800 --> 0:19:16.840
<v Speaker 3>We need to have something which has that fuzziness in it.

0:19:17.160 --> 0:19:19.399
<v Speaker 3>One way of getting fuzziness is by thinking about a

0:19:19.480 --> 0:19:22.760
<v Speaker 3>concept in terms of points in space, right where you

0:19:22.760 --> 0:19:26.399
<v Speaker 3>could think chairs are here in one location, rugs are

0:19:26.440 --> 0:19:29.040
<v Speaker 3>here in another location, and maybe what it is to

0:19:29.040 --> 0:19:30.439
<v Speaker 3>be a piece of furniture is to just be in

0:19:30.440 --> 0:19:32.560
<v Speaker 3>some part of that space, and how close you are

0:19:32.600 --> 0:19:34.280
<v Speaker 3>to that part of the space is like how good

0:19:34.320 --> 0:19:36.560
<v Speaker 3>you are as an example of that kind of furniture.

0:19:37.359 --> 0:19:39.800
<v Speaker 3>And so as soon as you think in those terms,

0:19:39.800 --> 0:19:42.160
<v Speaker 3>you have a new problem, which is with our rules

0:19:42.160 --> 0:19:44.400
<v Speaker 3>and symbols. We knew how to do computation, we knew

0:19:44.400 --> 0:19:47.000
<v Speaker 3>how to describe thinking. Thinking was a matter of applying

0:19:47.040 --> 0:19:49.240
<v Speaker 3>the rules and sort of, you know, repeating that process.

0:19:50.040 --> 0:19:52.640
<v Speaker 3>But we don't have a way of doing computation in spaces.

0:19:52.720 --> 0:19:54.439
<v Speaker 3>And that's what neural networks give us. So you can

0:19:54.520 --> 0:19:58.760
<v Speaker 3>kind of think about a space corresponding to the activation

0:19:58.920 --> 0:20:00.639
<v Speaker 3>of the units inside this neural network.

0:20:00.680 --> 0:20:03.040
<v Speaker 4>How much, you know, how much input

0:20:02.720 --> 0:20:05.600
<v Speaker 3>each neural unit in that neural network is receiving, and

0:20:04.920 --> 0:20:08.600
<v Speaker 3>how much of a response it's making that characterizes some

0:20:08.680 --> 0:20:10.639
<v Speaker 3>kind of space. And then the neural network gives us a

0:20:10.640 --> 0:20:13.359
<v Speaker 3>way of mapping from the inputs that it's getting to

0:20:13.440 --> 0:20:14.080
<v Speaker 3>some output.

0:20:14.200 --> 0:20:16.040
<v Speaker 4>So you could put in, you know,

0:20:16.000 --> 0:20:18.239
<v Speaker 3>your picture of a chair, and it maps that to

0:20:18.240 --> 0:20:20.560
<v Speaker 3>some point in space, and then it put sort of

0:20:20.560 --> 0:20:23.440
<v Speaker 3>produces out an output that corresponds to, yes, this is

0:20:23.480 --> 0:20:26.080
<v Speaker 3>a piece of furniture. And because those outputs can now

0:20:26.119 --> 0:20:29.199
<v Speaker 3>be continuous values, you can capture the fuzziness and other

0:20:29.280 --> 0:20:31.360
<v Speaker 3>kinds of things that you want for your concepts.
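
NOTE
A minimal sketch of the "points in space" idea described above, in which graded category membership falls off with distance from a prototype. The 2-D coordinates, the choice of core members, and the sharpness value are all assumptions invented for illustration; they are not taken from the speaker's models or from any dataset.
```python
import math
# Hypothetical 2-D feature coordinates, invented purely for illustration.
ITEMS = {
    "chair": (0.9, 0.8),
    "table": (0.8, 0.9),
    "ottoman": (0.7, 0.7),
    "rug": (0.3, 0.5),
    "treadmill": (0.2, 0.2),
}
# Assumption: these three items define the "core" of the furniture category.
CORE_MEMBERS = ("chair", "table", "ottoman")
def prototype(names):
    """Average the coordinates of the named items to get a category centre."""
    points = [ITEMS[name] for name in names]
    return tuple(sum(axis) / len(points) for axis in zip(*points))
def membership(name, centre, sharpness=4.0):
    """Map distance from the prototype to a graded score in (0, 1]."""
    distance = math.dist(ITEMS[name], centre)
    return math.exp(-sharpness * distance)
FURNITURE = prototype(CORE_MEMBERS)
if __name__ == "__main__":
    for name in ITEMS:
        print(f"{name:10s} {membership(name, FURNITURE):.2f}")
```
Under these made-up coordinates the chair scores higher than the rug, which in turn scores higher than the treadmill, so the output is a continuous value rather than a yes-or-no answer.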

0:20:31.640 --> 0:20:34.960
<v Speaker 1>And so, in what sense are these modern systems, these

0:20:35.040 --> 0:20:38.360
<v Speaker 1>artificial neural networks learning, and in what sense are they

0:20:38.400 --> 0:20:42.720
<v Speaker 1>doing something that's maybe categorically different from how children learn.

0:20:44.520 --> 0:20:48.320
<v Speaker 3>This is a fundamental question, right, That's the kind of

0:20:48.320 --> 0:20:50.440
<v Speaker 3>thing that we cognitive scientists think about a lot, and

0:20:50.800 --> 0:20:53.080
<v Speaker 3>I think that AI researchers are starting to care about

0:20:53.119 --> 0:20:55.359
<v Speaker 3>a lot too, which is, you know, what are these

0:20:55.440 --> 0:20:59.720
<v Speaker 3>sort of meaningful differences between human minds, human brains and

0:20:59.720 --> 0:21:01.680
<v Speaker 3>what we're building in these AI systems or these sort

0:21:01.680 --> 0:21:07.240
<v Speaker 3>of artificial brains. I think one very salient difference is

0:21:07.359 --> 0:21:10.399
<v Speaker 3>the amount of data which is needed for a human

0:21:10.640 --> 0:21:12.639
<v Speaker 3>to learn language compared to the amount of data you

0:21:12.640 --> 0:21:15.159
<v Speaker 3>need to put into a neural network. So if you

0:21:15.200 --> 0:21:18.640
<v Speaker 3>take a system like ChatGPT, right, one of these chatbots,

0:21:19.000 --> 0:21:21.639
<v Speaker 3>those systems are trained on the equivalent of something like

0:21:21.720 --> 0:21:24.920
<v Speaker 3>five thousand to fifty thousand years of continuous speech. There's

0:21:24.920 --> 0:21:26.800
<v Speaker 3>sort of massive amounts of data that are going into

0:21:26.800 --> 0:21:28.919
<v Speaker 3>that system. So it's like on the order of a

0:21:28.960 --> 0:21:31.159
<v Speaker 3>thousand or ten thousand times as much data as a

0:21:31.240 --> 0:21:33.879
<v Speaker 3>human child might get in order to learn language. And

0:21:33.960 --> 0:21:37.280
<v Speaker 3>the reason is that those artificial neural networks are really

0:21:37.359 --> 0:21:40.399
<v Speaker 3>kind of like undifferentiated learning machines. You can take that

0:21:40.440 --> 0:21:41.960
<v Speaker 3>same kind of neural network, you can get it to

0:21:42.040 --> 0:21:44.400
<v Speaker 3>learn all sorts of different kinds of things. It works

0:21:44.400 --> 0:21:46.119
<v Speaker 3>really well for learning language, but you can use it

0:21:46.160 --> 0:21:47.920
<v Speaker 3>to learn something about vision or something. You know, you

0:21:47.960 --> 0:21:49.280
<v Speaker 3>can sort of take all sorts of problems and give

0:21:49.280 --> 0:21:50.520
<v Speaker 3>it to them and it can learn how to do that.

0:21:51.040 --> 0:21:56.000
<v Speaker 3>And so as a consequence, they have what we call

0:21:56.000 --> 0:21:59.760
<v Speaker 3>in cognitive science and machine learning weak inductive biases. They're not biased

0:21:59.800 --> 0:22:04.120
<v Speaker 3>towards any particular solution to the learning problem, and

0:22:04.400 --> 0:22:09.399
<v Speaker 3>human brains have stronger inductive biases for things like learning language. Right,

0:22:09.440 --> 0:22:12.200
<v Speaker 3>we're sort of disposed towards certain kinds of things, which

0:22:12.200 --> 0:22:14.679
<v Speaker 3>are human languages. The things that we call human languages

0:22:14.680 --> 0:22:16.880
<v Speaker 3>are the things that we're disposed to learn. And as

0:22:16.880 --> 0:22:19.480
<v Speaker 3>a consequence, you know, we're able to sort of narrow

0:22:19.560 --> 0:22:21.320
<v Speaker 3>down the space of possibilities in a way that means

0:22:21.320 --> 0:22:22.680
<v Speaker 3>that we're able to learn from less data.

0:22:36.960 --> 0:22:39.399
<v Speaker 1>Okay, this makes a great segue to the third lens.

0:22:39.440 --> 0:22:42.880
<v Speaker 1>So you talked about rules and symbols, and you talked

0:22:42.920 --> 0:22:45.560
<v Speaker 1>about artificial neural networks. The third part of your book

0:22:45.640 --> 0:22:51.200
<v Speaker 1>is about probabilities and statistics. So why did probability become

0:22:51.240 --> 0:22:56.000
<v Speaker 1>an attractive candidate language for thinking about cognition?

0:22:56.920 --> 0:22:59.520
<v Speaker 3>Probability theory is a good way of answering certain kinds

0:22:59.520 --> 0:23:02.840
<v Speaker 3>of why questions that we might have, right so, and

0:23:02.880 --> 0:23:05.520
<v Speaker 3>the reason is that it's a way of characterizing how

0:23:05.560 --> 0:23:08.960
<v Speaker 3>a rational agent should make an inference. So all the

0:23:08.960 --> 0:23:13.159
<v Speaker 3>way back in the eighteenth century, British nonconformist minister, the

0:23:13.160 --> 0:23:16.440
<v Speaker 3>Reverend Thomas Bayes had this radical idea that you could

0:23:17.119 --> 0:23:19.800
<v Speaker 3>talk about, you know, again, like take a mathematical system

0:23:19.880 --> 0:23:22.399
<v Speaker 3>probability theory, which we would use for describing what happens

0:23:22.440 --> 0:23:24.639
<v Speaker 3>when you roll dice or flip coins right, sort of

0:23:24.720 --> 0:23:27.400
<v Speaker 3>you know, sort of language of gambling and saying, oh,

0:23:27.480 --> 0:23:30.240
<v Speaker 3>in fact, that mathematical system might also be a really

0:23:30.280 --> 0:23:33.560
<v Speaker 3>good system for describing how beliefs work. And so what

0:23:33.640 --> 0:23:35.399
<v Speaker 3>he was interested in was if you think about a

0:23:35.400 --> 0:23:38.280
<v Speaker 3>belief as you know, a degree of belief, right, you

0:23:38.320 --> 0:23:40.160
<v Speaker 3>can say, oh, I think it's going to rain tomorrow,

0:23:40.600 --> 0:23:42.800
<v Speaker 3>and I'll put that on a scale which goes

0:23:42.800 --> 0:23:45.479
<v Speaker 3>from zero to one, where zero is you know, not

0:23:45.520 --> 0:23:47.359
<v Speaker 3>going to rain, and one is one hundred percent it's

0:23:47.359 --> 0:23:47.760
<v Speaker 3>going to rain

0:23:47.800 --> 0:23:48.200
<v Speaker 4>tomorrow.

0:23:48.280 --> 0:23:52.040
<v Speaker 3>Right, That is a belief that you've expressed in the

0:23:52.080 --> 0:23:54.960
<v Speaker 3>form of a probability. And now if you, you know,

0:23:55.119 --> 0:23:56.639
<v Speaker 3>wake up in the morning and look out the window

0:23:56.640 --> 0:23:59.239
<v Speaker 3>and you see gray storm clouds, you've got a new

0:23:59.240 --> 0:24:02.480
<v Speaker 3>piece of data. You need to revise your beliefs, and

0:24:02.560 --> 0:24:05.119
<v Speaker 3>probability theory actually tells you how to do that. It says,

0:24:06.320 --> 0:24:09.200
<v Speaker 3>you know, for each hypothesis, right, so our hypotheses here

0:24:09.240 --> 0:24:11.040
<v Speaker 3>are it's going

0:24:10.960 --> 0:24:12.080
<v Speaker 4>to rain or it's not going to rain.

0:24:12.440 --> 0:24:15.360
<v Speaker 3>Right, You're going to modify that belief based on how

0:24:15.440 --> 0:24:18.639
<v Speaker 3>likely the data is that you saw if that hypothesis

0:24:18.680 --> 0:24:21.359
<v Speaker 3>were true. So, because gray storm clouds are more likely

0:24:21.440 --> 0:24:23.920
<v Speaker 3>if it's going to rain that day, we should increase

0:24:23.960 --> 0:24:27.600
<v Speaker 3>our belief that it's going to rain. And as a consequence, well,

0:24:27.680 --> 0:24:29.040
<v Speaker 3>we'll end up with a number that's a little bit

0:24:29.119 --> 0:24:30.359
<v Speaker 3>higher than the number we had before.

0:24:30.600 --> 0:24:32.280
<v Speaker 4>And probability theory tells us how to do that.

0:24:32.520 --> 0:24:35.240
<v Speaker 3>There's a principle of probability theory called Bayes' rule after

0:24:35.240 --> 0:24:37.680
<v Speaker 3>the Reverend Thomas Bayes that tells you how to take

0:24:37.760 --> 0:24:40.960
<v Speaker 3>your original beliefs and then turn them into the beliefs

0:24:40.960 --> 0:24:43.680
<v Speaker 3>that you get after seeing data. And that turns out

0:24:43.720 --> 0:24:45.600
<v Speaker 3>to be exactly the tool that you need to answer

0:24:45.640 --> 0:24:49.560
<v Speaker 3>these kinds of questions about how inductive biases work. Right, So,

0:24:50.160 --> 0:24:53.320
<v Speaker 3>how is it that children are able to learn from

0:24:53.600 --> 0:24:56.880
<v Speaker 3>less data than neural networks. Well, it's a consequence of

0:24:57.440 --> 0:25:00.000
<v Speaker 3>you know, these things that we can describe using different

0:25:00.160 --> 0:25:03.399
<v Speaker 3>probabilities being assigned to different hypotheses, where the hypotheses correspond to

0:25:03.400 --> 0:25:05.080
<v Speaker 3>the structure of the languages that are being learned.
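
NOTE
A minimal sketch of the belief update described above, using Bayes' rule for a yes/no hypothesis: P(rain | clouds) = P(clouds | rain) * P(rain) / P(clouds). The prior and both likelihoods are assumed numbers chosen only for illustration; the conversation gives no specific values.
```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(hypothesis | data) for a yes/no hypothesis via Bayes' rule."""
    # Total probability of the data under both hypotheses.
    evidence = prior * likelihood_if_true + (1.0 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence
# Assumed numbers, chosen only for illustration:
PRIOR_RAIN = 0.3          # belief in rain the night before
P_CLOUDS_IF_RAIN = 0.8    # chance of gray storm clouds on a rainy day
P_CLOUDS_IF_DRY = 0.2     # chance of the same clouds on a dry day
posterior_rain = bayes_update(PRIOR_RAIN, P_CLOUDS_IF_RAIN, P_CLOUDS_IF_DRY)
if __name__ == "__main__":
    print(f"prior {PRIOR_RAIN:.2f}, posterior {posterior_rain:.2f}")
```
With these assumed numbers the belief in rain rises from 0.30 to roughly 0.63, because storm clouds are more likely on rainy days than on dry ones. A stronger prior would play the role of an inductive bias: less data is needed to reach a confident posterior.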

0:25:05.320 --> 0:25:10.320
<v Speaker 1>And when people call humans irrational, what changes if we

0:25:10.440 --> 0:25:16.120
<v Speaker 1>look at mistakes as resource limited inferences.

0:25:17.640 --> 0:25:20.280
<v Speaker 3>This is one of the things that I explore in

0:25:19.760 --> 0:25:23.399
<v Speaker 3>my own research is this question of how we should

0:25:23.400 --> 0:25:26.919
<v Speaker 3>actually think about rationality for real agents, right, And this

0:25:27.000 --> 0:25:28.919
<v Speaker 3>is relevant if you're building an AI system or if

0:25:28.920 --> 0:25:31.760
<v Speaker 3>you're just trying to understand human behavior. So I think,

0:25:33.040 --> 0:25:35.720
<v Speaker 3>like I said, probability theory gives us a characterization of

0:25:35.800 --> 0:25:38.679
<v Speaker 3>what you should do as an ideal rational agent, But

0:25:38.760 --> 0:25:42.640
<v Speaker 3>that assumes that you have infinite computational resources. We mere

0:25:42.760 --> 0:25:46.440
<v Speaker 3>humans don't have infinite computational resources, nor do our AI systems,

0:25:46.560 --> 0:25:49.280
<v Speaker 3>And so you can ask what should a rational agent

0:25:49.320 --> 0:25:52.280
<v Speaker 3>do if they don't have all of the computational resources

0:25:52.280 --> 0:25:54.359
<v Speaker 3>that you might need and then out of that you

0:25:54.400 --> 0:25:57.160
<v Speaker 3>get the answer is that you know, you should follow

0:25:57.480 --> 0:26:02.320
<v Speaker 3>an algorithm, follow a strategy that makes sense given the resources

0:26:02.359 --> 0:26:04.280
<v Speaker 3>that you have. That's what it means to be rational

0:26:04.280 --> 0:26:06.840
<v Speaker 3>in those circumstances where you're sort of doing the best

0:26:06.920 --> 0:26:12.159
<v Speaker 3>job you can of approximating probabilistic inference given those resource constraints.

0:26:12.200 --> 0:26:14.479
<v Speaker 3>And so some of the things that people do when

0:26:14.520 --> 0:26:16.400
<v Speaker 3>people do weird things, and we do lots of weird things,

0:26:16.440 --> 0:26:19.000
<v Speaker 3>and we don't always follow probability theory, things that we

0:26:19.000 --> 0:26:22.320
<v Speaker 3>can understand as us, you know, running into those resource

0:26:22.359 --> 0:26:25.200
<v Speaker 3>limitations and then coming up with, you know, reasonable strategies

0:26:25.200 --> 0:26:26.719
<v Speaker 3>for trying to approximate the right answer.

0:26:28.040 --> 0:26:29.080
<v Speaker 2>So, if we look

0:26:28.960 --> 0:26:33.800
<v Speaker 1>at probability being the grammar of uncertainty, one thing we

0:26:33.880 --> 0:26:36.840
<v Speaker 1>know is that our prior expectations matter. And one of

0:26:36.880 --> 0:26:39.280
<v Speaker 1>the things I've been obsessed with and doing a lot

0:26:39.320 --> 0:26:41.000
<v Speaker 1>of research on and talking a lot about on the

0:26:41.000 --> 0:26:44.200
<v Speaker 1>podcast is the way that all of us drop into

0:26:44.200 --> 0:26:48.520
<v Speaker 1>the world and our cultures influence us and our language

0:26:48.560 --> 0:26:51.160
<v Speaker 1>and our moment in time and our neighborhood, and this

0:26:51.320 --> 0:26:55.760
<v Speaker 1>leads to people being quite different on the inside. Is

0:26:55.800 --> 0:26:58.159
<v Speaker 1>this something that you think about sometimes about how we

0:26:58.240 --> 0:27:03.760
<v Speaker 1>develop our priors differently based on you know what where

0:27:03.760 --> 0:27:04.360
<v Speaker 1>we grow up.

0:27:05.600 --> 0:27:08.679
<v Speaker 3>Yeah, so priors is that Bayesian language, right, for talking

0:27:08.680 --> 0:27:11.240
<v Speaker 3>about the beliefs that you have before you get data

0:27:11.440 --> 0:27:14.840
<v Speaker 3>that you then update into what we call posterior probabilities

0:27:14.840 --> 0:27:17.440
<v Speaker 3>that are informed by those data. And so yeah, I

0:27:18.359 --> 0:27:20.720
<v Speaker 3>spend a lot of my research time thinking about these

0:27:20.800 --> 0:27:23.920
<v Speaker 3>kinds of questions of you know what are these sort

0:27:23.960 --> 0:27:28.880
<v Speaker 3>of prior distributions for humans? How do we acquire good

0:27:28.960 --> 0:27:31.959
<v Speaker 3>prior distributions for solving different kinds of problems. One thing

0:27:32.000 --> 0:27:34.520
<v Speaker 3>there is that calling it a prior makes it sound

0:27:34.560 --> 0:27:36.840
<v Speaker 3>like maybe it's something you're born with, but in fact,

0:27:36.880 --> 0:27:37.680
<v Speaker 3>it just means

0:27:37.720 --> 0:27:38.960
<v Speaker 4>it's before you get data.

0:27:39.040 --> 0:27:41.439
<v Speaker 3>Right, So when you're seeing that storm cloud in the morning,

0:27:42.040 --> 0:27:44.520
<v Speaker 3>you had a prior probability from last night, and then

0:27:44.640 --> 0:27:47.560
<v Speaker 3>that prior probability was informed by everything else that you know, Right,

0:27:47.680 --> 0:27:49.920
<v Speaker 3>The priors are all of the biases and knowledge that

0:27:49.960 --> 0:27:51.880
<v Speaker 3>we bring to bear when we're trying to make an inference.

0:27:51.960 --> 0:27:55.080
<v Speaker 3>And so yeah, I think I think understanding that is

0:27:55.119 --> 0:27:57.280
<v Speaker 3>a big part of the project of understanding human cognition.

0:27:57.600 --> 0:27:59.120
<v Speaker 2>So let's zoom the camera out.

0:27:59.200 --> 0:28:01.639
<v Speaker 1>We've talked about the three lenses that you describe in

0:28:01.640 --> 0:28:04.119
<v Speaker 1>the book. Now, you also point out that we have

0:28:04.160 --> 0:28:08.199
<v Speaker 1>a lot of constraints like finite lives and limited compute,

0:28:08.200 --> 0:28:11.760
<v Speaker 1>and limited bandwidth, and so how do these constraints sculpt

0:28:12.040 --> 0:28:13.320
<v Speaker 1>human intelligence?

0:28:13.800 --> 0:28:16.520
<v Speaker 3>I think this is really important to just thinking about

0:28:16.560 --> 0:28:18.399
<v Speaker 3>the moment that we're in where there's a lot of

0:28:18.400 --> 0:28:21.520
<v Speaker 3>anxiety around AI, right, And I think if you think

0:28:21.520 --> 0:28:24.439
<v Speaker 3>about intelligence as a kind of one dimensional quantity, you

0:28:24.440 --> 0:28:26.560
<v Speaker 3>can kind of imagine that you know, humans are somewhere,

0:28:26.920 --> 0:28:28.320
<v Speaker 3>our AI systems are somewhere.

0:28:28.359 --> 0:28:29.560
<v Speaker 4>It seems like they're approaching us.

0:28:29.600 --> 0:28:31.560
<v Speaker 3>Maybe they're going to overtake us, and then, oh my god,

0:28:31.560 --> 0:28:33.120
<v Speaker 3>what is going to happen when that happens, Right, We're

0:28:33.160 --> 0:28:34.600
<v Speaker 3>just going to become redundant. There's not going to be

0:28:34.600 --> 0:28:36.520
<v Speaker 3>any jobs. Everything is going to fall apart. And so

0:28:37.160 --> 0:28:39.520
<v Speaker 3>that's a consequence of having a particular conception of what

0:28:39.640 --> 0:28:41.880
<v Speaker 3>intelligence is, which is this kind of one dimensional way

0:28:41.880 --> 0:28:44.040
<v Speaker 3>of thinking about it. And I think there's a different

0:28:44.080 --> 0:28:46.200
<v Speaker 3>way of thinking about it which gives us a little

0:28:46.200 --> 0:28:48.000
<v Speaker 3>more flexibility and maybe a little more hope in the

0:28:48.040 --> 0:28:50.120
<v Speaker 3>way that we think about what's going to happen with AI,

0:28:50.560 --> 0:28:55.840
<v Speaker 3>and that is thinking about intelligence as being an adaptation

0:28:56.040 --> 0:28:59.240
<v Speaker 3>to the kinds of computational problems that a system has

0:28:59.360 --> 0:29:02.520
<v Speaker 3>either evolved or been trained to solve, right? And so

0:29:03.360 --> 0:29:06.240
<v Speaker 3>for human beings, those computational problems are shaped by the

0:29:06.280 --> 0:29:08.440
<v Speaker 3>constraints that we operate under. And a lot of those

0:29:08.440 --> 0:29:11.080
<v Speaker 3>constraints come from our biology, right that we, as you said,

0:29:11.480 --> 0:29:14.880
<v Speaker 3>have limited lives, have you know, limited compute resources what

0:29:14.880 --> 0:29:17.560
<v Speaker 3>we can carry around inside our heads, have limited bandwidth

0:29:17.600 --> 0:29:20.480
<v Speaker 3>for communication. We have to like make noises at each other,

0:29:20.800 --> 0:29:22.800
<v Speaker 3>or wiggle our fingers or you know, somehow use our

0:29:22.840 --> 0:29:26.280
<v Speaker 3>bodies to transfer data from one human mind to another

0:29:26.360 --> 0:29:30.440
<v Speaker 3>human mind. It's very inefficient. And so those constraints are

0:29:30.520 --> 0:29:34.200
<v Speaker 3>things that mean that human intelligence takes a particular shape,

0:29:34.440 --> 0:29:37.920
<v Speaker 3>which is we're able to learn from limited data because

0:29:37.920 --> 0:29:40.080
<v Speaker 3>we have to because we don't live that long. Right,

0:29:40.280 --> 0:29:44.120
<v Speaker 3>You can't rely on getting five thousand years of language

0:29:44.160 --> 0:29:47.600
<v Speaker 3>data or multiple human lifetimes of you know, chess playing

0:29:47.680 --> 0:29:51.400
<v Speaker 3>or whatever it is. Right, you have to be good

0:29:51.400 --> 0:29:53.960
<v Speaker 3>at using the resources that you have in ways that

0:29:53.960 --> 0:29:56.000
<v Speaker 3>are efficient. And so that's kind of like deciding what

0:29:56.080 --> 0:29:59.160
<v Speaker 3>to think about being able to recognize when a problem

0:29:59.200 --> 0:30:01.800
<v Speaker 3>has a structure that you've seen before, being able

0:30:01.840 --> 0:30:04.280
<v Speaker 3>to you know, sort of like become sort of automatic

0:30:04.320 --> 0:30:07.160
<v Speaker 3>in using certain kinds of patterns of thinking and strategies

0:30:07.160 --> 0:30:11.120
<v Speaker 3>for solving problems, really trying to make it as easy as

0:30:11.000 --> 0:30:13.320
<v Speaker 4>possible for us to use the resources that we have.

0:30:13.800 --> 0:30:18.760
<v Speaker 3>And then you need to develop capacities for trying to

0:30:18.800 --> 0:30:22.000
<v Speaker 3>circumvent those bandwidth constraints in order to be able to

0:30:22.040 --> 0:30:24.320
<v Speaker 3>do things that transcend what any individual human can do,

0:30:24.640 --> 0:30:30.600
<v Speaker 3>and that means developing things like language, writing, societies, LLCs,

0:30:30.920 --> 0:30:34.920
<v Speaker 3>you know, all of the sort of libraries, right institutions,

0:30:35.240 --> 0:30:37.960
<v Speaker 3>all of the theory of mind, right for reasoning about

0:30:37.960 --> 0:30:40.360
<v Speaker 3>what someone else might be trying to communicate to you.

0:30:40.840 --> 0:30:43.120
<v Speaker 3>All of this stuff is actually sort of like human

0:30:43.160 --> 0:30:46.920
<v Speaker 3>stuff that's a consequence of these constraints. And so as

0:30:46.920 --> 0:30:50.800
<v Speaker 3>we make AI systems that are smarter, those AI systems

0:30:50.800 --> 0:30:53.400
<v Speaker 3>are in turn being shaped by what they're being trained

0:30:53.400 --> 0:30:56.280
<v Speaker 3>to do and what constraints they operate under. But those

0:30:56.320 --> 0:30:58.760
<v Speaker 3>constraints are different from the ones that humans have. They

0:30:58.800 --> 0:31:02.320
<v Speaker 3>can you know, get more data, they can get access

0:31:02.320 --> 0:31:04.520
<v Speaker 3>to more compute, they don't have bandwidth limitations.

0:31:04.520 --> 0:31:05.520
<v Speaker 4>You can just copy

0:31:05.240 --> 0:31:09.280
<v Speaker 3>a, you know, a state of an AI system across machines.

0:31:09.360 --> 0:31:12.440
<v Speaker 3>You can use the same data to train multiple AI systems,

0:31:13.000 --> 0:31:15.280
<v Speaker 3>and all of those things mean that I think, rather

0:31:15.320 --> 0:31:17.600
<v Speaker 3>than being sort of on one axis where we're sort

0:31:17.600 --> 0:31:20.800
<v Speaker 3>of thinking about better and worse, it's more that there

0:31:20.800 --> 0:31:23.560
<v Speaker 3>are many axes that we can think about intelligent systems

0:31:23.600 --> 0:31:25.680
<v Speaker 3>developing along, and we're just going to end up in

0:31:25.680 --> 0:31:28.080
<v Speaker 3>a state where we have human intelligence and we have AIs,

0:31:28.520 --> 0:31:30.640
<v Speaker 3>and they're going to be meaningfully different from one another,

0:31:31.160 --> 0:31:33.200
<v Speaker 3>rather than things that are sort of directly competing in

0:31:33.320 --> 0:31:34.600
<v Speaker 3>terms of the capacities that they have.

0:31:35.280 --> 0:31:36.640
<v Speaker 2>Yeah, I agree with you on that.

0:31:37.280 --> 0:31:41.480
<v Speaker 1>When you think about the way that humans beat machines

0:31:41.640 --> 0:31:44.400
<v Speaker 1>on data efficiency, what do you think that means is

0:31:44.520 --> 0:31:47.320
<v Speaker 1>missing architecturally from our AI systems.

0:31:48.880 --> 0:31:52.360
<v Speaker 3>I think it's actually it's a great question, and the

0:31:52.360 --> 0:31:55.040
<v Speaker 3>way I would express it is not in terms of architecture.

0:31:55.720 --> 0:31:58.280
<v Speaker 3>So it's actually in terms of a different part of

0:31:58.320 --> 0:32:01.000
<v Speaker 3>a neural network. So when we think about this problem

0:32:01.000 --> 0:32:05.520
<v Speaker 3>of inductive bias, right, which is we're you know, what

0:32:06.160 --> 0:32:09.120
<v Speaker 3>a system is sort of disposed towards learning. As I said,

0:32:09.240 --> 0:32:11.840
<v Speaker 3>our neural networks, the way we normally set them up,

0:32:12.040 --> 0:32:14.040
<v Speaker 3>have pretty weak inductive biases. They can learn all sorts

0:32:14.040 --> 0:32:17.320
<v Speaker 3>of things. The inductive bias that a neural network has

0:32:17.560 --> 0:32:20.640
<v Speaker 3>it is constrained by its architecture, but it's also constrained

0:32:20.640 --> 0:32:23.080
<v Speaker 3>by where it starts out in the space of the

0:32:23.120 --> 0:32:26.680
<v Speaker 3>settings of all of those weights. And normally the default

0:32:26.720 --> 0:32:28.360
<v Speaker 3>is that you set up your neural networks so those

0:32:28.360 --> 0:32:31.000
<v Speaker 3>weights start out really small, close to zero, and then

0:32:31.000 --> 0:32:32.960
<v Speaker 3>they sort of grow away from that as it starts

0:32:33.000 --> 0:32:36.040
<v Speaker 3>to learn how to do things. We've had success in

0:32:36.120 --> 0:32:40.719
<v Speaker 3>taking neural networks that are architecturally identical but setting them

0:32:40.720 --> 0:32:43.720
<v Speaker 3>with different initial weights in order to create an inductive

0:32:43.760 --> 0:32:47.000
<v Speaker 3>bias that enables rapid learning. So we actually use

0:32:47.040 --> 0:32:49.720
<v Speaker 3>a technique called meta learning, which is a method from

0:32:49.720 --> 0:32:53.440
<v Speaker 3>machine learning where you take the same neural network architecture

0:32:53.600 --> 0:32:55.960
<v Speaker 3>and the same initial weights, and you use it to

0:32:56.040 --> 0:32:57.960
<v Speaker 3>learn to solve lots of different problems, like you can

0:32:58.040 --> 0:33:00.800
<v Speaker 3>use it to learn lots of different languages, say, from

0:33:00.840 --> 0:33:04.000
<v Speaker 3>limited data, and then you try and optimize the initial

0:33:04.040 --> 0:33:05.800
<v Speaker 3>weights of the neural network to make it so it

0:33:05.800 --> 0:33:09.000
<v Speaker 3>can learn all of those languages better using the same

0:33:09.080 --> 0:33:11.000
<v Speaker 3>kinds of algorithms we use for training the weights of

0:33:11.040 --> 0:33:12.880
<v Speaker 3>the neural network. When we have these giant data sets.

0:33:12.880 --> 0:33:15.560
<v Speaker 3>You can instead use those algorithms to train the initial

0:33:15.560 --> 0:33:17.680
<v Speaker 3>weights of a neural network for a small data set,

0:33:17.720 --> 0:33:20.760
<v Speaker 3>for lots of small data sets, And when you do that,

0:33:20.800 --> 0:33:22.320
<v Speaker 3>you end up with a neural network that has an

0:33:22.360 --> 0:33:25.400
<v Speaker 3>inductive bias that makes it possible to learn from small

0:33:25.400 --> 0:33:26.960
<v Speaker 3>amounts of data. And so that's the kind of thing
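
NOTE
A minimal sketch of optimizing initial weights across many small tasks, as described above. It swaps in a simple first-order, Reptile-style update on a one-parameter model y = w * x. That choice, the task slopes, the data points, and the learning rates are all assumptions for illustration, and this is not claimed to be the method used in the speaker's lab.
```python
# Hypothetical inputs shared by every small task.
XS = (-1.0, 0.5, 1.0)
def adapt(weight, slope, steps=3, lr=0.1):
    """Run a few gradient steps of least squares on one small task y = slope * x."""
    for _ in range(steps):
        grad = sum(2.0 * (weight * x - slope * x) * x for x in XS) / len(XS)
        weight -= lr * grad
    return weight
def task_loss(weight, slope):
    """Mean squared error of the model y = weight * x on that task."""
    return sum((weight * x - slope * x) ** 2 for x in XS) / len(XS)
def meta_train(task_slopes, init=0.0, meta_lr=0.5, epochs=50):
    """Reptile-style outer loop: nudge the initial weight toward each adapted weight."""
    for _ in range(epochs):
        for slope in task_slopes:
            init += meta_lr * (adapt(init, slope) - init)
    return init
TRAIN_SLOPES = (2.5, 3.0, 3.5, 2.8, 3.2)  # hypothetical family of related tasks
NEW_SLOPE = 3.1                            # held-out task from the same family
learned_init = meta_train(TRAIN_SLOPES)
# Same architecture, same few adaptation steps, different starting point.
loss_from_zero = task_loss(adapt(0.0, NEW_SLOPE), NEW_SLOPE)
loss_from_learned = task_loss(adapt(learned_init, NEW_SLOPE), NEW_SLOPE)
if __name__ == "__main__":
    print(f"learned init {learned_init:.2f}")
    print(f"loss from zero init {loss_from_zero:.4f}")
    print(f"loss from learned init {loss_from_learned:.4f}")
```
The model and the number of adaptation steps are identical in both runs; only the starting weight differs. In this toy setting, starting from the learned initial weight leaves a much smaller error on the held-out task than starting from zero.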

0:33:27.000 --> 0:33:29.920
<v Speaker 3>we've been exploring in my lab is can we find

0:33:29.960 --> 0:33:33.080
<v Speaker 3>a way of taking exactly these same neural network architectures

0:33:33.600 --> 0:33:36.000
<v Speaker 3>and just starting them out in a different place that

0:33:36.120 --> 0:33:39.120
<v Speaker 3>maybe aligns better with the kinds of things that humans do.</v> <v Speaker 1>Okay, well,</v>

0:33:39.160 --> 0:33:40.880
<v Speaker 1>this is a really good segue to what I wanted

0:33:40.880 --> 0:33:43.600
<v Speaker 1>to ask you, which is, if you're looking at rules

0:33:43.640 --> 0:33:46.880
<v Speaker 1>and symbols as one sort of math to describe the mind,

0:33:46.880 --> 0:33:49.320
<v Speaker 1>and artificial neural networks as another kind of math, and

0:33:49.360 --> 0:33:54.200
<v Speaker 1>probability as another, what does an optimal hybrid look like,

0:33:54.320 --> 0:33:58.920
<v Speaker 1>given that no single, no one of these describes everything

0:33:58.960 --> 0:34:02.240
<v Speaker 1>about what's going on with minds? So what does the

0:34:02.360 --> 0:34:04.120
<v Speaker 1>hybrid for an AI system look like

0:34:04.200 --> 0:34:07.120
<v Speaker 2>in twenty twenty six?</v> <v Speaker 3>The place</v>

0:34:06.920 --> 0:34:09.759
<v Speaker 3>where I end up in the book is saying that

0:34:10.080 --> 0:34:13.600
<v Speaker 3>these different kinds of math really do all fit together

0:34:13.640 --> 0:34:16.239
<v Speaker 3>in an interesting way, and in order to understand that

0:34:16.280 --> 0:34:18.560
<v Speaker 3>we can talk about different levels of analysis when we're

0:34:18.560 --> 0:34:21.040
<v Speaker 3>trying to make sense of an information processing system. So

0:34:21.920 --> 0:34:23.880
<v Speaker 3>the most abstract level, this is an idea that was

0:34:23.880 --> 0:34:27.640
<v Speaker 3>introduced by the computational neuroscientist David Marr. He said, the

0:34:27.680 --> 0:34:29.560
<v Speaker 3>most abstract level is just thinking about the problem that

0:34:29.560 --> 0:34:32.279
<v Speaker 3>the system is solving and its ideal solution, right. And

0:34:32.960 --> 0:34:36.279
<v Speaker 3>I think logic and symbolic systems and probability theory give

0:34:36.360 --> 0:34:39.239
<v Speaker 3>us a good way of characterizing the kinds of problems

0:34:39.280 --> 0:34:43.279
<v Speaker 3>that minds have to solve, right. They, you know, probabilistic

0:34:43.280 --> 0:34:45.960
<v Speaker 3>inference because we have to make these uncertain inferences. And

0:34:46.000 --> 0:34:49.400
<v Speaker 3>then logic as a way of characterizing the kinds of

0:34:49.440 --> 0:34:50.960
<v Speaker 3>things that are in the world that have this rich

0:34:51.000 --> 0:34:54.480
<v Speaker 3>structure of you know, like a sort of combinatorial structure

0:34:54.520 --> 0:34:57.400
<v Speaker 3>that you get from having symbols and rules that

0:34:57.440 --> 0:35:00.160
<v Speaker 3>combine together with things like language and dance and all

0:35:00.160 --> 0:35:02.400
<v Speaker 3>of these you know, structured things. Even you know, if

0:35:02.400 --> 0:35:03.960
<v Speaker 3>you look at trees, you can see they have like

0:35:04.120 --> 0:35:06.759
<v Speaker 3>recursive structures that are expressed in them. Right, So these

0:35:06.840 --> 0:35:08.600
<v Speaker 3>kind of occur in nature and are important to be

0:35:08.600 --> 0:35:11.640
<v Speaker 3>able to understand. And then at the level below that

0:35:11.840 --> 0:35:17.160
<v Speaker 3>you have how the system solves those problems, right, like

0:35:17.480 --> 0:35:20.279
<v Speaker 3>what algorithms it might use, what representations it might use,

0:35:20.400 --> 0:35:22.920
<v Speaker 3>And then below that it's you know, how that's actually

0:35:22.920 --> 0:35:25.960
<v Speaker 3>implemented in some kind of physical system, right, and artificial

0:35:25.960 --> 0:35:27.799
<v Speaker 3>neural networks give us a kind of story at those

0:35:27.880 --> 0:35:31.720
<v Speaker 3>levels where we can think about them as being a

0:35:31.760 --> 0:35:35.439
<v Speaker 3>good general purpose system for learning to approximate the things

0:35:35.440 --> 0:35:37.799
<v Speaker 3>that probabilistic inference tells you to do, and learning

0:35:37.840 --> 0:35:40.560
<v Speaker 3>to approximate the structure that's contained within those symbolic systems.

0:35:40.960 --> 0:35:45.160
<v Speaker 3>So I actually think, you know, the kind of story

0:35:45.200 --> 0:35:47.319
<v Speaker 3>that we have right now that's emerged out of these

0:35:47.400 --> 0:35:50.120
<v Speaker 3>advances in AI is actually a pretty good story for

0:35:50.200 --> 0:35:52.759
<v Speaker 3>how we could think about human minds working. The thing

0:35:52.800 --> 0:35:55.279
<v Speaker 3>that's missing, the most important thing that's missing, is this kind

0:35:55.320 --> 0:35:59.279
<v Speaker 3>of aspect of inductive bias, right where we haven't been

0:35:59.320 --> 0:36:01.440
<v Speaker 3>able to capture what human inductive biases are like in

0:36:01.520 --> 0:36:03.799
<v Speaker 3>machines, and so you have these meaningful differences that come

0:36:03.680 --> 0:36:04.080
<v Speaker 4>out of that.

0:36:05.440 --> 0:36:08.600
<v Speaker 3>But it's not a bad place for thinking about how

0:36:08.640 --> 0:36:11.080
<v Speaker 3>these pieces might fit together to give us an explanation

0:36:11.120 --> 0:36:12.279
<v Speaker 3>for how it is that minds work.

0:36:12.680 --> 0:36:18.399
<v Speaker 1>So along these lines, which AI benchmarks feel to you misleading?

0:36:18.520 --> 0:36:21.080
<v Speaker 2>And how would you make better benchmarks?

0:36:22.640 --> 0:36:26.239
<v Speaker 3>So, in general, I'm not a huge fan of benchmarks,

0:36:26.360 --> 0:36:29.279
<v Speaker 3>because I think benchmarks are useful as an engineering tool,

0:36:29.880 --> 0:36:33.560
<v Speaker 3>but I, as a cognitive scientist, don't just want to

0:36:33.600 --> 0:36:36.120
<v Speaker 3>know how well something is doing something. I want to

0:36:36.120 --> 0:36:38.960
<v Speaker 3>know how it's doing that thing right and how it

0:36:39.040 --> 0:36:41.719
<v Speaker 3>might be sort of messing that up right. So when

0:36:41.800 --> 0:36:46.160
<v Speaker 3>we are designing experiments as cognitive scientists, we don't just say, oh,

0:36:46.200 --> 0:36:48.120
<v Speaker 3>here's one hundred math problems. Go do these one hundred

0:36:48.160 --> 0:36:50.720
<v Speaker 3>math problems and we'll get a score. We say, let's

0:36:50.840 --> 0:36:54.480
<v Speaker 3>choose a set of math problems so that which answers

0:36:54.640 --> 0:36:57.960
<v Speaker 3>people give us tell us about the misconceptions that they

0:36:58.000 --> 0:37:00.319
<v Speaker 3>have in a way that we can then diagnose, oh,

0:37:00.440 --> 0:37:02.080
<v Speaker 3>you know, this is why this person is thinking this

0:37:02.120 --> 0:37:04.160
<v Speaker 3>particular thing. And so I think there's lots of room

0:37:04.200 --> 0:37:06.520
<v Speaker 3>for coming up with better ways of evaluating our AI

0:37:06.560 --> 0:37:09.600
<v Speaker 3>systems that look more like cognitive science experiments, where we're really

0:37:09.640 --> 0:37:12.719
<v Speaker 3>targeting understanding what's going on rather than just trying to

0:37:12.719 --> 0:37:15.480
<v Speaker 3>get some brute sort of you know, performance score.

0:37:16.320 --> 0:37:19.920
<v Speaker 1>Okay, good, And you have talked about curiosity as a

0:37:19.960 --> 0:37:25.800
<v Speaker 1>computational problem. So how do you think about what curiosity

0:37:25.960 --> 0:37:30.320
<v Speaker 1>is and how we might measure real curiosity in a machine?

0:37:30.680 --> 0:37:34.400
<v Speaker 3>What problem is curiosity trying to solve? Yeah, this is

0:37:34.440 --> 0:37:36.000
<v Speaker 3>this is a good question. You can you can ask

0:37:36.040 --> 0:37:38.480
<v Speaker 3>this kind of question that we call rational analysis. Right,

0:37:38.520 --> 0:37:42.640
<v Speaker 3>if you have a system that's solving a problem, you

0:37:42.680 --> 0:37:44.000
<v Speaker 3>know what's what's the problem?

0:37:44.000 --> 0:37:44.800
<v Speaker 4>What's the ideal solution?

0:37:44.960 --> 0:37:48.279
<v Speaker 3>Okay, So for curiosity, we've argued and this is work

0:37:48.320 --> 0:37:49.399
<v Speaker 3>with Rachit Dubey,

0:37:49.400 --> 0:37:52.200
<v Speaker 4>who is at UCLA, that

0:37:53.800 --> 0:37:56.960
<v Speaker 3>one way to think about curiosity is that you're trying

0:37:56.960 --> 0:38:02.760
<v Speaker 3>to find things that are good in increasing your long

0:38:03.120 --> 0:38:06.880
<v Speaker 3>run probability of being able to solve problems in the future. Right,

0:38:07.000 --> 0:38:10.120
<v Speaker 3>So you know, it's sort of like you want data

0:38:10.200 --> 0:38:13.840
<v Speaker 3>for which the derivative of your total knowledge is

0:38:13.960 --> 0:38:18.960
<v Speaker 3>high relative to that particular data point. And so that

0:38:19.040 --> 0:38:21.800
<v Speaker 3>explanation captures some of the things that happen in human cognition,

0:38:22.000 --> 0:38:25.239
<v Speaker 3>where you know, in some circumstances, we're interested in the

0:38:25.280 --> 0:38:29.239
<v Speaker 3>newest thing, something we've never seen before, Right, But in

0:38:29.280 --> 0:38:31.560
<v Speaker 3>a lot of circumstances, those things aren't the things that

0:38:31.600 --> 0:38:34.520
<v Speaker 3>grab our attention. It's more things that maybe we've seen

0:38:34.520 --> 0:38:38.640
<v Speaker 3>a few times, and you know, we just sort of

0:38:38.680 --> 0:38:41.319
<v Speaker 3>notice that they're starting to occur. If something happens once,

0:38:41.360 --> 0:38:43.000
<v Speaker 3>you just say, okay, that was weird, and you sort

0:38:43.000 --> 0:38:45.160
<v Speaker 3>of dismiss it. But when something happens a few times

0:38:45.200 --> 0:38:47.279
<v Speaker 3>and it's unfamiliar to you, you say, okay, maybe I

0:38:47.320 --> 0:38:49.520
<v Speaker 3>need to figure that out. Right, And something that happens

0:38:49.520 --> 0:38:51.120
<v Speaker 3>all the time, you're not that curious about. That's just

0:38:51.120 --> 0:38:52.520
<v Speaker 3>the thing that happens all the time. And you can

0:38:52.560 --> 0:38:55.120
<v Speaker 3>explain that by thinking about this sort of derivative, right,

0:38:55.239 --> 0:38:58.920
<v Speaker 3>where if something just happens once, you shouldn't be interested

0:38:58.960 --> 0:39:00.480
<v Speaker 3>in it. Because it just happened once, it's probably never

0:39:00.480 --> 0:39:03.160
<v Speaker 3>going to happen again. If something happens a few times,

0:39:03.719 --> 0:39:06.640
<v Speaker 3>that's a clue that it's probably going to happen again

0:39:06.640 --> 0:39:09.160
<v Speaker 3>in the future, and you've not seen it enough to

0:39:09.200 --> 0:39:11.719
<v Speaker 3>actually know what's going on, And so paying attention to

0:39:11.760 --> 0:39:13.520
<v Speaker 3>that is good in terms of that derivative of your

0:39:13.520 --> 0:39:16.759
<v Speaker 3>future knowledge. And if something happens a lot, then it's

0:39:16.760 --> 0:39:18.680
<v Speaker 3>probably happened enough that you know something about what's going

0:39:18.680 --> 0:39:21.160
<v Speaker 3>on and it's not that interesting, right, And so so

0:39:21.280 --> 0:39:23.120
<v Speaker 3>that sort of sweet spot ends up being around the

0:39:23.120 --> 0:39:25.359
<v Speaker 3>things that are sort of like just happening to you

0:39:25.480 --> 0:39:27.719
<v Speaker 3>enough times that you're starting to realize, oh, this might be

0:39:27.640 --> 0:39:28.880
<v Speaker 4>A thing that I need to pay attention to.

0:39:42.800 --> 0:39:45.880
<v Speaker 1>If you had to bet on one capability that's going to

0:39:46.120 --> 0:39:50.319
<v Speaker 1>unlock a broader intelligence, unlock a jump to that.

0:39:50.360 --> 0:39:51.240
<v Speaker 2>What's your candidate?

0:39:52.160 --> 0:39:54.319
<v Speaker 3>I actually think the biggest obstacle at the moment is

0:39:54.400 --> 0:39:59.120
<v Speaker 3>more about generalizability of intelligence rather than any specific capacity, right.

0:39:59.239 --> 0:40:03.920
<v Speaker 3>And so people in the AI world talk about jagged intelligence, right,

0:40:03.920 --> 0:40:06.720
<v Speaker 3>the sort of phenomenon where you have an AI system

0:40:06.760 --> 0:40:08.799
<v Speaker 3>that can do something that's really smart and impress you,

0:40:09.280 --> 0:40:11.480
<v Speaker 3>and then five minutes later does something that's really dumb

0:40:11.520 --> 0:40:13.839
<v Speaker 3>on a problem that's like right next to it, and like,

0:40:13.880 --> 0:40:15.479
<v Speaker 3>if it's able to solve that first problem, it seems

0:40:15.480 --> 0:40:17.160
<v Speaker 3>obvious that it should be able to solve the second problem.

0:40:17.160 --> 0:40:19.120
<v Speaker 4>And you're just like, what happened? You know, why did

0:40:19.120 --> 0:40:19.799
<v Speaker 4>it go wrong there?

0:40:20.200 --> 0:40:24.680
<v Speaker 3>And so that lack of generalizability is also a consequence

0:40:24.680 --> 0:40:27.680
<v Speaker 3>of these kinds of inductive biases, right, So these human

0:40:27.719 --> 0:40:30.719
<v Speaker 3>inductive biases that steer us towards a solution and let

0:40:30.840 --> 0:40:34.640
<v Speaker 3>us learn from limited amounts of data, they constrain the

0:40:34.680 --> 0:40:36.160
<v Speaker 3>kinds of solutions that we find. The kinds of

0:40:36.200 --> 0:40:37.719
<v Speaker 3>solutions that we find are the ones that are sort

0:40:37.719 --> 0:40:40.640
<v Speaker 3>of like generalizable at least to us, right. They are

0:40:40.640 --> 0:40:42.239
<v Speaker 3>things that kind of make sense where if someone's able

0:40:42.280 --> 0:40:43.239
<v Speaker 3>to do one thing, they'll be able to do the

0:40:43.320 --> 0:40:45.799
<v Speaker 3>other thing. And because the AI systems are approaching these

0:40:45.840 --> 0:40:48.440
<v Speaker 3>problems just in a completely different way from a different

0:40:48.440 --> 0:40:50.800
<v Speaker 3>starting point and then getting tons of data that's allowing

0:40:50.800 --> 0:40:52.919
<v Speaker 3>them to sort of approximate what the human solutions are.

0:40:53.120 --> 0:40:55.600
<v Speaker 3>But they're coming at it from another angle. That's the

0:40:55.600 --> 0:40:58.160
<v Speaker 3>thing that makes them jagged. It's that they don't

0:40:58.160 --> 0:41:00.400
<v Speaker 3>have sort of these same compatible inductive biases that

0:41:00.400 --> 0:41:03.560
<v Speaker 3>we have that are informed by having evolved in certain

0:41:03.640 --> 0:41:06.279
<v Speaker 3>environments and having had experience of the world, and you know,

0:41:06.440 --> 0:41:08.000
<v Speaker 3>all of these other things that are part of what

0:41:08.040 --> 0:41:11.239
<v Speaker 3>it means to you know, sort of learn anything as

0:41:11.239 --> 0:41:15.080
<v Speaker 3>a human being. And so because they are coming with

0:41:15.160 --> 0:41:17.640
<v Speaker 3>this different set of inductive biases, they're very influenced by

0:41:17.680 --> 0:41:20.560
<v Speaker 3>their training data, they end up doing things that are

0:41:20.560 --> 0:41:24.800
<v Speaker 3>sort of inscrutable to us because they are, you know, yeah,

0:41:24.840 --> 0:41:26.720
<v Speaker 3>like coming at these problems in a way that doesn't

0:41:27.400 --> 0:41:29.759
<v Speaker 3>make sense to us. You know, from the starting point

0:41:29.800 --> 0:41:30.680
<v Speaker 3>that humans come from.

0:41:30.960 --> 0:41:33.960
<v Speaker 1>After writing this book, what do you think we understand

0:41:34.080 --> 0:41:36.960
<v Speaker 1>now about minds that we didn't understand let's say, a

0:41:37.040 --> 0:41:37.879
<v Speaker 1>decade or two ago.

0:41:39.239 --> 0:41:43.480
<v Speaker 3>So it's funny because when I have taught this material

0:41:43.680 --> 0:41:47.680
<v Speaker 3>for, you know, twenty years at this point, I

0:41:47.800 --> 0:41:51.200
<v Speaker 3>normally start my cognitive science classes saying, you know, welcome

0:41:51.239 --> 0:41:53.279
<v Speaker 3>to cognitive science. This is going to be different from

0:41:53.320 --> 0:41:55.480
<v Speaker 3>your other science classes. Normally, when you take a science class,

0:41:55.480 --> 0:41:57.160
<v Speaker 3>someone is going to stand up and say, Okay, here's

0:41:57.200 --> 0:41:58.520
<v Speaker 3>all the things that we figured out. Here are the

0:41:58.520 --> 0:42:02.319
<v Speaker 3>answers to the questions. And in cognitive science, it's more

0:42:02.320 --> 0:42:04.640
<v Speaker 3>that we figured out how to get better at asking

0:42:04.680 --> 0:42:05.240
<v Speaker 3>the questions.

0:42:05.239 --> 0:42:06.359
<v Speaker 4>We haven't answered them. We don't.

0:42:06.360 --> 0:42:08.200
<v Speaker 3>It's not like you have a consensus across the whole

0:42:08.200 --> 0:42:10.520
<v Speaker 3>field about what those answers look like. And so I

0:42:10.520 --> 0:42:13.040
<v Speaker 3>think that's important that we're still very much And this

0:42:13.120 --> 0:42:14.759
<v Speaker 3>is what got me interested in cognitive science in the

0:42:14.760 --> 0:42:17.120
<v Speaker 3>first place. You know, still a field that has deep

0:42:17.160 --> 0:42:20.640
<v Speaker 3>mysteries and lots of opportunities to learn and discover interesting things.

0:42:21.200 --> 0:42:25.560
<v Speaker 3>But I think over the last ten years, like so,

0:42:25.640 --> 0:42:28.560
<v Speaker 3>as I was working on this book, I wrote the

0:42:28.640 --> 0:42:32.919
<v Speaker 3>first chapter, and I had that disclaimer in the first

0:42:32.960 --> 0:42:37.520
<v Speaker 3>chapter and said, okay, look, you know I'm not promising

0:42:37.560 --> 0:42:40.080
<v Speaker 3>you answers. Well, we're going to see if we

0:42:40.120 --> 0:42:42.839
<v Speaker 3>can get a good handle on the questions. But by

0:42:42.880 --> 0:42:44.440
<v Speaker 3>the time I got to the end of the book,

0:42:44.520 --> 0:42:46.680
<v Speaker 3>right after that sort of process of working on it

0:42:46.680 --> 0:42:49.160
<v Speaker 3>for years, I felt like things that actually, you know,

0:42:49.360 --> 0:42:51.480
<v Speaker 3>me going through the process of writing it and exploring

0:42:51.480 --> 0:42:53.120
<v Speaker 3>all these things and thinking about how they fit together,

0:42:53.440 --> 0:42:56.040
<v Speaker 3>but also just where the field was, you know, having

0:42:56.040 --> 0:42:58.399
<v Speaker 3>moved forward, I actually started to feel like, actually, these

0:42:58.400 --> 0:43:00.480
<v Speaker 3>things do fit together in a way where you can

0:43:00.480 --> 0:43:03.040
<v Speaker 3>see the glimpses of what answers are going to look

0:43:03.080 --> 0:43:05.040
<v Speaker 3>like in a way that I think really wasn't there

0:43:05.040 --> 0:43:08.200
<v Speaker 3>ten years ago. And it's that story of Okay, we

0:43:08.239 --> 0:43:11.200
<v Speaker 3>sort of know what the goals are, right, we know

0:43:11.239 --> 0:43:13.880
<v Speaker 3>what the right mathematical systems are for describing what intelligent

0:43:13.880 --> 0:43:16.520
<v Speaker 3>systems should be doing. You have these ingredients of symbolic

0:43:16.560 --> 0:43:19.920
<v Speaker 3>systems and probabilistic inference, and we've discovered that in fact,

0:43:19.960 --> 0:43:22.760
<v Speaker 3>you can get a remarkably long way just using these artificial

0:43:22.760 --> 0:43:25.680
<v Speaker 3>neural networks to learn to approximate those things, and so

0:43:26.560 --> 0:43:29.839
<v Speaker 3>that demonstration I think has shown first of all, that

0:43:30.440 --> 0:43:36.120
<v Speaker 3>language is an extremely good substrate for intelligence, right, in

0:43:36.160 --> 0:43:38.319
<v Speaker 3>a way that I think people had not anticipated before

0:43:38.400 --> 0:43:41.520
<v Speaker 3>large language models, and that you can make big neural

0:43:41.520 --> 0:43:44.799
<v Speaker 3>networks that can learn to approximate really complex probability distributions.

0:43:45.360 --> 0:43:48.080
<v Speaker 3>And so it gives us some of these ingredients for

0:43:48.360 --> 0:43:52.080
<v Speaker 3>seeing how what originally were three very different views of

0:43:52.120 --> 0:43:54.719
<v Speaker 3>the mind might start to fit together to make something

0:43:54.760 --> 0:43:56.239
<v Speaker 3>that's a little bit more of a unified whole.

0:43:57.440 --> 0:44:00.839
<v Speaker 1>Excellent. And when you wrote the book, what struck you

0:44:00.960 --> 0:44:04.920
<v Speaker 1>as the most beautiful idea in the whole quest, in

0:44:04.960 --> 0:44:07.480
<v Speaker 1>the whole history of this?

0:44:07.560 --> 0:44:10.960
<v Speaker 3>Gosh. Okay, I mean, I'm a big probability theory fan,

0:44:11.120 --> 0:44:16.560
<v Speaker 3>so you're gonna get me endorsing Bayes' rule,

0:44:16.560 --> 0:44:19.040
<v Speaker 3>which I really do think is, like, when

0:44:19.080 --> 0:44:21.000
<v Speaker 3>you learn it, taking a probability class, it's just

0:44:21.040 --> 0:44:23.359
<v Speaker 3>like it's just a dumb principle of probability theory. But

0:44:23.400 --> 0:44:26.279
<v Speaker 3>when you make this move of saying probability theory isn't

0:44:26.360 --> 0:44:30.360
<v Speaker 3>just about dice and cards, it's about you know, beliefs,

0:44:30.640 --> 0:44:34.440
<v Speaker 3>it suddenly becomes a very deep and insightful sort of principle,

0:44:34.480 --> 0:44:37.640
<v Speaker 3>And in the book, I also show probability theory kind

0:44:37.640 --> 0:44:40.480
<v Speaker 3>of subsumes logic, like everything that's a valid logical inference

0:44:40.520 --> 0:44:43.040
<v Speaker 3>is also a valid inference in probability theory. Probability theory

0:44:43.080 --> 0:44:45.879
<v Speaker 3>just kind of extends the semantics of logic to these

0:44:45.920 --> 0:44:49.200
<v Speaker 3>cases of uncertainty. So to me, I think that's a

0:44:49.320 --> 0:44:50.840
<v Speaker 3>that's a that's a big one. I kind of like

0:44:50.880 --> 0:44:51.520
<v Speaker 3>that's where I live.

0:44:51.640 --> 0:44:55.080
<v Speaker 1>Yeah, excellent. And someday, if we do have a

0:44:55.480 --> 0:44:58.520
<v Speaker 1>mature physics of thought, let's say fifty years from now,

0:44:58.560 --> 0:45:01.400
<v Speaker 1>what does that change for us in terms of education,

0:45:02.040 --> 0:45:04.040
<v Speaker 1>in terms of the way we build machines.

0:45:05.000 --> 0:45:07.640
<v Speaker 3>So I think this is this is exactly where we

0:45:07.680 --> 0:45:11.400
<v Speaker 3>can go, right, which is, once you figure out the

0:45:11.440 --> 0:45:13.880
<v Speaker 3>scientific principles of a domain, you can start to think

0:45:13.920 --> 0:45:16.680
<v Speaker 3>about how to do engineering right. So like you know,

0:45:17.040 --> 0:45:19.680
<v Speaker 3>when you're an engineer and you go to engineering school,

0:45:20.000 --> 0:45:22.480
<v Speaker 3>you take physics, right, and then you learn in your

0:45:22.480 --> 0:45:24.799
<v Speaker 3>physics class what these principles are, and then you take

0:45:24.840 --> 0:45:27.280
<v Speaker 3>your applied engineering classes, which are like taking those physical

0:45:27.320 --> 0:45:29.040
<v Speaker 3>principles and telling you how to build a bridge right

0:45:29.080 --> 0:45:32.239
<v Speaker 3>and explaining that you know not in terms of heuristics

0:45:32.760 --> 0:45:34.520
<v Speaker 3>for what makes a good bridge, but in terms of

0:45:34.600 --> 0:45:37.520
<v Speaker 3>those fundamental physical principles. So I think that's a thing

0:45:37.520 --> 0:45:40.640
<v Speaker 3>that's incredibly exciting here is that as we start to

0:45:40.719 --> 0:45:45.440
<v Speaker 3>converge on what these laws of thought look like, it

0:45:45.480 --> 0:45:48.359
<v Speaker 3>gives us the opportunity to do a much more sort

0:45:48.400 --> 0:45:54.719
<v Speaker 3>of science based form of engineering applied to human cognition,

0:45:55.000 --> 0:45:58.480
<v Speaker 3>thinking about how do we make an optimal you know,

0:45:58.680 --> 0:46:02.239
<v Speaker 3>sort of learning environment, how do we support human decision making.

0:46:02.280 --> 0:46:03.840
<v Speaker 3>That's something that I work on in my lab is like,

0:46:04.239 --> 0:46:08.480
<v Speaker 3>how do we put computation into human environments to overcome

0:46:08.520 --> 0:46:12.440
<v Speaker 3>whatever computational constraints we have as individual decision makers and

0:46:12.480 --> 0:46:17.160
<v Speaker 3>help us make better decisions. And how do we understand,

0:46:17.320 --> 0:46:21.359
<v Speaker 3>you know, the kinds of things that people are doing

0:46:21.360 --> 0:46:23.319
<v Speaker 3>in a way that allows us to then sort of

0:46:23.360 --> 0:46:27.160
<v Speaker 3>like make suggestions about, you know, how they might do

0:46:27.239 --> 0:46:30.160
<v Speaker 3>them better. Right, And so I think there's a there's

0:46:30.160 --> 0:46:32.680
<v Speaker 3>a lot of potential for you know, sort of human

0:46:32.760 --> 0:46:35.440
<v Speaker 3>upside as we start to be able to answer these

0:46:35.440 --> 0:46:36.400
<v Speaker 3>scientific questions.

0:46:41.200 --> 0:46:44.480
<v Speaker 1>That was my interview with Tom Griffiths. To quickly summarize

0:46:44.520 --> 0:46:49.120
<v Speaker 1>his framework, Tom sees three major scientific approaches that all

0:46:49.120 --> 0:46:53.000
<v Speaker 1>try to capture the mind. You've got rules and symbols,

0:46:53.440 --> 0:46:57.240
<v Speaker 1>you've got artificial neural networks, and you've got probability theory.

0:46:57.600 --> 0:47:01.239
<v Speaker 1>These are very different approaches, and each one has delivered

0:47:01.280 --> 0:47:05.759
<v Speaker 1>something a little different. Rules and symbols give us language-like

0:47:05.920 --> 0:47:10.040
<v Speaker 1>machinery where pieces can be assembled and reassembled into

0:47:10.080 --> 0:47:15.759
<v Speaker 1>complex ideas. Artificial neural networks they give us graded concepts,

0:47:15.880 --> 0:47:20.160
<v Speaker 1>meaning ideas can be fuzzy, and probability theory gives us

0:47:20.160 --> 0:47:24.160
<v Speaker 1>a language for dealing with uncertainty. Now, what's interesting is

0:47:24.200 --> 0:47:28.000
<v Speaker 1>that human minds seem to traffic in all of these modes.

0:47:28.120 --> 0:47:32.719
<v Speaker 1>We use structured symbols, we also use graded concepts. We

0:47:32.760 --> 0:47:36.480
<v Speaker 1>also revise our beliefs as new evidence comes in. And

0:47:36.560 --> 0:47:38.400
<v Speaker 1>part of that is that we move through the world

0:47:38.400 --> 0:47:43.040
<v Speaker 1>with prior beliefs shaped by our history, our culture, our language,

0:47:43.040 --> 0:47:46.799
<v Speaker 1>our neighborhood, our moment in time. So none of these

0:47:46.840 --> 0:47:50.719
<v Speaker 1>models by themselves are the final answer. And what this

0:47:50.840 --> 0:47:54.120
<v Speaker 1>means is that, like most scientific stories, this is one

0:47:54.160 --> 0:47:58.880
<v Speaker 1>about humility. Tom's book illustrates how every generation arrives with

0:47:59.280 --> 0:48:03.319
<v Speaker 1>some new formalism, some new piece of math, some new

0:48:03.719 --> 0:48:08.240
<v Speaker 1>model that's powerful enough to illuminate an area of mental

0:48:08.280 --> 0:48:10.880
<v Speaker 1>life and for a moment it feels like, hey, the

0:48:10.880 --> 0:48:15.600
<v Speaker 1>whole mystery is finally collapsing. But then the spotlight widens

0:48:15.680 --> 0:48:19.160
<v Speaker 1>and we see more terrain. So what I love about

0:48:19.200 --> 0:48:21.640
<v Speaker 1>this conversation is that it can leave us with a

0:48:21.840 --> 0:48:25.120
<v Speaker 1>sense of progress and a sense of wonder. At the

0:48:25.160 --> 0:48:29.239
<v Speaker 1>same time, we feel a convergence of different fields, and

0:48:29.320 --> 0:48:34.280
<v Speaker 1>we can also feel how large this subject remains. Cognition

0:48:34.440 --> 0:48:37.560
<v Speaker 1>is still a field in motion, So let's look at

0:48:37.560 --> 0:48:40.799
<v Speaker 1>the big picture. When the field of physics matured, we

0:48:40.880 --> 0:48:45.399
<v Speaker 1>could then build bridges and airplanes and power grids because

0:48:45.440 --> 0:48:50.520
<v Speaker 1>we had firm principles to build on. So once the

0:48:50.640 --> 0:48:55.640
<v Speaker 1>laws of thought come into clearer view, what becomes possible

0:48:55.840 --> 0:49:00.520
<v Speaker 1>for education and for decision making, and for tools that

0:49:00.600 --> 0:49:04.839
<v Speaker 1>help us reason more effectively? So here we are at

0:49:04.880 --> 0:49:08.000
<v Speaker 1>a very cool moment in history where the old dream

0:49:08.640 --> 0:49:13.040
<v Speaker 1>of formalizing thought has escaped the library and shown

0:49:13.120 --> 0:49:17.439
<v Speaker 1>up in everyone's laptop. The big thinkers of centuries ago

0:49:17.920 --> 0:49:21.239
<v Speaker 1>could sort of squint and see the outline of the project,

0:49:21.840 --> 0:49:24.839
<v Speaker 1>and now we're living much more squarely right in the

0:49:24.880 --> 0:49:27.799
<v Speaker 1>middle of it. If there truly are laws of thought,

0:49:27.840 --> 0:49:30.960
<v Speaker 1>they're going to teach us about our machines, but more importantly,

0:49:30.960 --> 0:49:35.399
<v Speaker 1>they're going to teach us about ourselves, because although it's

0:49:35.400 --> 0:49:39.320
<v Speaker 1>sometimes tempting to view the mind as a ghostly exception

0:49:39.440 --> 0:49:43.640
<v Speaker 1>to the universe, the mind, presumably is part of the universe,

0:49:44.120 --> 0:49:48.800
<v Speaker 1>and it is lawful and wondrous and discoverable, and every

0:49:48.880 --> 0:49:56.840
<v Speaker 1>step towards understanding it enlarges the human story. Go to

0:49:56.880 --> 0:49:59.680
<v Speaker 1>eagleman dot com slash podcast for more information and to

0:49:59.719 --> 0:50:03.520
<v Speaker 1>find further reading. Join the weekly discussions on my substack,

0:50:03.800 --> 0:50:06.800
<v Speaker 1>and check out and subscribe to Inner Cosmos on YouTube

0:50:06.800 --> 0:50:10.160
<v Speaker 1>for videos of each episode and to leave comments. Until

0:50:10.200 --> 0:50:13.640
<v Speaker 1>next time, I'm David Eagleman, and this is Inner Cosmos.