WEBVTT - Ep139 "What does alignment look like in a society of AIs?" with Danielle Perszyk 0:00:05.160 --> 0:00:08.760 Is it possible that we're thinking about intelligence in the 0:00:08.840 --> 0:00:14.080 wrong way? Instead of being something inside individual brains, is 0:00:14.120 --> 0:00:19.400 intelligence instead something that emerges from lots of brains that 0:00:19.440 --> 0:00:23.800 are constantly working to align with one another? And if 0:00:23.840 --> 0:00:26.960 we take on that lens, what does this mean about 0:00:26.960 --> 0:00:31.040 the way that we can build AI agents or the 0:00:31.040 --> 0:00:33.960 way that they can make us better? What is the 0:00:34.000 --> 0:00:38.959 difference between information and information with a purpose? Today we're 0:00:38.960 --> 0:00:42.040 going to speak with Danielle Perszyk, a cognitive scientist who 0:00:42.120 --> 0:00:47.080 leads the Human Computer Interaction team at Amazon's AGI Lab. 0:00:47.440 --> 0:00:54.240 So get ready for a great brain stretch. Welcome to 0:00:54.240 --> 0:00:57.680 Inner Cosmos with me, David Eagleman. I'm a neuroscientist and an 0:00:57.680 --> 0:01:00.640 author at Stanford, and in these episodes we sail 0:01:00.800 --> 0:01:04.160 deeply into our three pound universe to understand how we 0:01:04.240 --> 0:01:08.520 see the world and soon how AI might come to 0:01:08.640 --> 0:01:25.040 understand the world with us. Let's think about the word intelligence. 0:01:25.760 --> 0:01:30.440 You might justifiably assume that neuroscientists have an agreed upon 0:01:30.640 --> 0:01:34.800 definition for this, but we actually don't. However one thinks 0:01:34.800 --> 0:01:38.880 about intelligence, I think it's a fair assumption that most 0:01:38.880 --> 0:01:42.360 of us, when we think about it, assume that intelligence 0:01:42.480 --> 0:01:47.000 is something that happens inside a single head, in other words, 0:01:47.400 --> 0:01:52.400 a brain processing information. 
This statement seems so obvious that 0:01:52.520 --> 0:01:56.680 it hardly invites inspection, but if you step back and 0:01:56.720 --> 0:02:01.480 look at how intelligence actually unfolds in a human life, 0:02:01.520 --> 0:02:06.240 a different picture can start to emerge. Our thinking is 0:02:06.280 --> 0:02:09.919 shaped by other people from the very beginning. We learn 0:02:10.080 --> 0:02:15.200 by watching, by imitating, by trying to communicate, and eventually 0:02:15.240 --> 0:02:19.680 by negotiating meaning with the people around us. Even our 0:02:19.680 --> 0:02:24.840 most private thoughts are built from tools that are fundamentally social, 0:02:25.040 --> 0:02:30.239 things like language and symbols and shared concepts and cultural norms. 0:02:30.560 --> 0:02:32.400 So this may sound strange, but this is what we're 0:02:32.400 --> 0:02:34.760 going to talk about today, and the idea will become 0:02:34.880 --> 0:02:39.720 very clear. Most of humanity's greatest achievements didn't come from 0:02:39.840 --> 0:02:45.440 lone geniuses working in isolation, but from really dense networks 0:02:45.600 --> 0:02:50.320 of minds interacting over time. When we look at things 0:02:50.400 --> 0:02:54.440 like science or art, or morality or technology, it almost 0:02:54.639 --> 0:02:59.800 never makes sense to interpret these as products of individual intelligence, 0:02:59.800 --> 0:03:04.400 but instead they are collective processes that allow ideas to 0:03:04.919 --> 0:03:09.520 collide and to form into something and to continuously evolve. 0:03:10.280 --> 0:03:14.320 So intelligence in this sense may be less like a 0:03:14.440 --> 0:03:20.600 thing we possess and more like something that emerges between us. Now, 0:03:20.800 --> 0:03:26.320 this broader perspective becomes especially important as we find ourselves 0:03:26.600 --> 0:03:31.760 flinging headlong into the era of artificial intelligence. 
With every 0:03:31.840 --> 0:03:37.360 passing week, we're getting AI acting more like a participant. 0:03:37.360 --> 0:03:41.360 We're getting systems that communicate but also agents that act 0:03:41.440 --> 0:03:43.720 on our behalf to do things in the world. And 0:03:43.960 --> 0:03:48.880 soon these agents will collaborate with each other at their 0:03:49.000 --> 0:03:54.920 time scales and spatial scales. So if intelligence is social 0:03:55.040 --> 0:03:59.560 by nature, then building the future world of AI might 0:03:59.680 --> 0:04:04.440 end up requiring more than just dumping billions into scaling 0:04:04.600 --> 0:04:08.000 up the training data for these systems. It may instead 0:04:08.080 --> 0:04:13.560 require understanding how minds relate to one another in the 0:04:13.600 --> 0:04:17.840 first place. And that's where today's conversation begins. Today I'm 0:04:17.880 --> 0:04:22.040 joined by Danielle Perszyk. She's a cognitive scientist who leads 0:04:22.080 --> 0:04:27.799 the Human Computer Interaction team at Amazon's AGI Lab. Danielle 0:04:27.880 --> 0:04:32.680 uses insights from the evolution and development of human intelligence 0:04:33.120 --> 0:04:36.480 to inform how we can not only make AI smarter, 0:04:36.920 --> 0:04:41.320 but build AI that also makes us smarter. Here's my 0:04:41.400 --> 0:04:43.120 conversation with Danielle Perszyk. 0:04:47.680 --> 0:04:52.640 Intelligence in humans is really social, and that is the 0:04:52.680 --> 0:04:56.120 thing that differentiates our intelligence from other species. Even other 0:04:56.160 --> 0:04:58.960 species that are closely related to us have similar brain 0:04:59.000 --> 0:05:03.360 structures and function, similar genetics. And what we are really 0:05:03.800 --> 0:05:09.560 optimizing for is representing other minds. 
So not only are 0:05:10.160 --> 0:05:14.680 human infants inferring the existence of other minds, but 0:05:15.279 --> 0:05:21.279 once this thing exists, we are optimized for aligning our minds. Evolutionarily, 0:05:21.400 --> 0:05:26.479 we had to cooperate to survive. Infants need to be 0:05:26.560 --> 0:05:29.600 able to have their caretaker's attention on them to survive, 0:05:30.160 --> 0:05:33.000 and in terms of being able to learn about the world, 0:05:33.240 --> 0:05:35.480 once infants have a model of other minds, then they 0:05:35.480 --> 0:05:38.839 can manipulate it. They can direct their caretaker's attention, point, 0:05:39.080 --> 0:05:41.640 "what's that?", and magically they'll have a label for 0:05:41.680 --> 0:05:43.720 this thing that they're looking at in their environment. 0:05:43.880 --> 0:05:46.480 So they're doing prompt engineering. 0:05:47.960 --> 0:05:50.680 Great technology. Yeah, okay, so we know that, 0:05:50.800 --> 0:05:54.440 you know, throughout the course of human evolution, we became 0:05:54.640 --> 0:05:59.320 increasingly dependent upon cooperating to stay alive and adapt 0:05:59.440 --> 0:06:02.080 to new environments. So it makes sense that there'd be 0:06:02.160 --> 0:06:06.080 this extreme pressure on being able to predict each other's 0:06:06.120 --> 0:06:09.920 behaviors, to understand our minds, and then with infants, developmentally, 0:06:10.000 --> 0:06:13.479 we have also the benefit of being able to learn 0:06:13.839 --> 0:06:19.280 much more efficiently, even language itself, from representing other minds. 0:06:19.720 --> 0:06:21.840 Okay, so it turns out that we can do a 0:06:22.000 --> 0:06:26.760 much better job of predicting if we can imagine what 0:06:26.800 --> 0:06:29.440 it's like to be inside other people's heads. 
Right. So, 0:06:29.720 --> 0:06:33.920 if I want to know what some non player character 0:06:34.000 --> 0:06:35.680 is going to do in a video game, whatever, they 0:06:35.720 --> 0:06:37.440 have certain behaviors. But if I want to know, let's 0:06:37.440 --> 0:06:39.599 say, what you're going to do next, or say next, 0:06:39.880 --> 0:06:42.160 if I have a model of your mind and what 0:06:42.200 --> 0:06:43.680 you know and you don't know and all that stuff, 0:06:43.720 --> 0:06:44.919 I can make a better prediction. 0:06:45.800 --> 0:06:50.360 And so you've said that there's information and then information 0:06:50.440 --> 0:06:53.400 with a purpose, and that information with a purpose really matters. 0:06:53.720 --> 0:06:57.000 So you've used the example of, like, the 0:06:56.720 --> 0:06:59.960 rover on Mars not being able to fix itself, 0:07:00.400 --> 0:07:02.880 and, like, a wolf that gets its leg in a trap. 0:07:03.080 --> 0:07:05.560 Right, the Curiosity Rover went up to Mars. We had 0:07:05.600 --> 0:07:08.160 spent like a billion something dollars on it. It did 0:07:08.160 --> 0:07:10.520 a great job on Mars, but eventually it got its 0:07:10.600 --> 0:07:13.400 right front wheel stuck in the Martian soil and it 0:07:13.480 --> 0:07:17.000 died; it couldn't get out. But if you contrast that 0:07:17.040 --> 0:07:20.080 with a wolf who gets its leg caught in a trap, 0:07:20.400 --> 0:07:22.600 it'll chew its leg off and then figure out how 0:07:22.640 --> 0:07:24.920 to walk on three legs, which is extraordinary because a 0:07:24.960 --> 0:07:27.600 wolf's brain didn't evolve for three legs. But it can 0:07:27.680 --> 0:07:30.040 figure it out because it's live wired. It has brain 0:07:30.080 --> 0:07:33.680 plasticity and can figure out, okay, how do I adjust everything 0:07:33.560 --> 0:07:34.640 so that I can... 0:07:34.680 --> 0:07:37.920 Survival depends upon it. Exactly. That's the key. It has relevance 0:07:38.040 --> 0:07:39.360 to the animal. 
0:07:39.200 --> 0:07:42.480 Right, So all animals have a drive to survive, a 0:07:42.560 --> 0:07:46.800 drive to reproduce, But humans also have a drive to 0:07:47.080 --> 0:07:51.320 align our minds because it helps us cooperate, it helps 0:07:51.400 --> 0:07:55.560 us survive, and it helps us to learn extremely efficiently. 0:07:56.000 --> 0:07:59.400 So we don't just model other minds. That would just 0:07:59.440 --> 0:08:03.720 be the information part. We are optimized for aligning our minds. 0:08:03.720 --> 0:08:05.520 So it's information with a purpose. 0:08:05.760 --> 0:08:08.440 Okay, so aligning our minds this is the key thing 0:08:09.640 --> 0:08:12.440 and at the center of your interests. And so then 0:08:12.560 --> 0:08:15.280 you went into looking into AGI. So first of all, 0:08:15.320 --> 0:08:18.520 tell us what artificial general intelligence is to you. 0:08:18.800 --> 0:08:22.120 Well, I think most of the labs that are trying 0:08:22.160 --> 0:08:26.160 to build something like AGI, they all have their own definitions. 0:08:27.120 --> 0:08:30.040 None of them are really very good. But the one 0:08:30.080 --> 0:08:32.400 thing that unifies all of them is that they are 0:08:32.480 --> 0:08:35.840 all benchmarked to human intelligence. And this goes all the 0:08:35.840 --> 0:08:39.160 way back to the origin of the field of AI. 0:08:39.640 --> 0:08:42.000 So in nineteen fifty six, a group of these engineers 0:08:42.000 --> 0:08:44.480 and mathematicians got together. They were going to solve intelligence 0:08:44.480 --> 0:08:46.640 and build thinking machines, and the idea is that these 0:08:46.640 --> 0:08:48.240 thinking machines would think like us. 0:08:49.040 --> 0:08:50.160 It obviously took a. 0:08:50.200 --> 0:08:53.280 Very long time to realize, Oh, that's a lot harder 0:08:53.800 --> 0:08:56.199 than we thought that it was. 
But now we are 0:08:56.320 --> 0:08:59.120 back to aiming for something like that original goal of 0:08:59.120 --> 0:09:01.720 building thinking machines that think like us. 0:09:01.760 --> 0:09:02.800 We call it AGI. 0:09:03.280 --> 0:09:08.360 Again, labs have slightly different operationalizations. But I think that we're 0:09:08.360 --> 0:09:12.080 all running towards the wrong thing. And that's because I 0:09:12.120 --> 0:09:17.079 don't think that intelligence can exist in a machine. It 0:09:17.120 --> 0:09:21.800 doesn't exist in individual humans. It's something that emerges from 0:09:21.960 --> 0:09:27.079 our interactions because we have this drive to align our representations, 0:09:27.600 --> 0:09:30.480 and of course we all have very different representations. 0:09:30.559 --> 0:09:31.840 Right? When I used to teach 0:09:31.679 --> 0:09:35.160 cognitive science, I would teach about this condition called aphantasia, 0:09:35.800 --> 0:09:39.640 and once every couple of classes a student would come 0:09:39.720 --> 0:09:40.040 up to 0:09:40.000 --> 0:09:43.280 me... Quick: aphantasia is where you can't imagine, you don't 0:09:43.320 --> 0:09:45.280 have any visual representation on the... Yes. 0:09:45.400 --> 0:09:47.640 Yes, a student would come up to me and say, wait, 0:09:47.800 --> 0:09:51.000 you mean there are people who can actually imagine things? 0:09:51.080 --> 0:09:53.280 Their mind's eye is not just a metaphor? It's a 0:09:53.320 --> 0:09:53.880 thing that 0:09:53.800 --> 0:09:57.360 people experience? And they wouldn't know because they don't suffer 0:09:57.480 --> 0:10:02.000 from other types of deficits. It's just one of the 0:10:02.480 --> 0:10:06.920 many ways in which human cognition and experience can vary. 0:10:07.920 --> 0:10:10.839 And when I imagine an apple, it's different than when 0:10:10.880 --> 0:10:15.560 you imagine an apple. We all have different associations. 
So 0:10:15.600 --> 0:10:20.359 when we come together and we have to use words 0:10:20.880 --> 0:10:24.079 to try to align our minds, there's necessarily going to 0:10:24.120 --> 0:10:29.320 be friction, especially when we're trying to talk about abstract things, 0:10:29.960 --> 0:10:32.880 especially when we're talking about things at the bleeding edge 0:10:32.880 --> 0:10:34.240 of our knowledge, like science. 0:10:34.800 --> 0:10:37.400 How do you align. 0:10:37.160 --> 0:10:39.600 Your representations when there's not even a word for something. 0:10:39.679 --> 0:10:44.720 So intelligence emerges as a function of trying to align 0:10:44.760 --> 0:10:50.679 our minds and oftentimes creating new concepts to achieve that. 0:10:51.120 --> 0:10:53.320 Okay, so when you're talking about aligning minds, it's because 0:10:53.480 --> 0:10:56.440 I've got my whole internal world. You've got your whole 0:10:56.480 --> 0:11:00.240 internal world that is built by each of our our 0:11:00.320 --> 0:11:03.959 trajectories through space time. We've had different experiences all these things. 0:11:04.480 --> 0:11:07.800 So we come together and we've got completely different worlds 0:11:07.840 --> 0:11:10.280 running on the inside. And that's what conversation is about. 0:11:10.320 --> 0:11:12.280 We're trying to align things that way. 
0:11:12.440 --> 0:11:16.280 And there are neuroscientists who measure when people are either 0:11:16.400 --> 0:11:19.520 communicating in real time or if they're listening to a story, 0:11:19.640 --> 0:11:23.200 if they're watching something on a screen, you can measure 0:11:23.280 --> 0:11:27.080 the degree of neural synchrony, how close they are to being 0:11:27.120 --> 0:11:29.200 on the same wavelength, and that predicts all sorts of 0:11:29.240 --> 0:11:32.400 things like how much they like each other, how much 0:11:32.480 --> 0:11:37.199 they understood the story, and how much they liked the story, 0:11:37.320 --> 0:11:39.480 how similarly they remember things. 0:11:39.880 --> 0:11:42.640 Okay, so this is what humans do. We get together 0:11:42.679 --> 0:11:44.960 in conversation all the time and we try to achieve 0:11:45.000 --> 0:11:49.480 that synchrony in terms of, oh, okay, wait, you have 0:11:49.480 --> 0:11:51.880 a different view than I do on this, here's how 0:11:51.880 --> 0:11:54.640 we can make progress. This is the Socratic dialectic, right? 0:11:54.640 --> 0:11:56.719 This is what Socrates loved to do, is have these 0:11:56.720 --> 0:12:00.680 conversations where the truth emerges, something bigger than either person 0:12:00.760 --> 0:12:03.199 knew when they started the conversation. 0:12:03.440 --> 0:12:05.280 And on that point too, I think a lot of 0:12:05.360 --> 0:12:10.920 us think that we know things, but actually when we're 0:12:10.960 --> 0:12:13.880 forced to describe something we realize we don't. 0:12:14.160 --> 0:12:16.439 Yeah. Actually, in my next book, I'm talking about this 0:12:16.480 --> 0:12:20.520 as a Potemkin village. Yeah, so, you know, the Potemkin village, 0:12:20.559 --> 0:12:23.360 for anyone who doesn't remember, is when Catherine the 0:12:23.400 --> 0:12:26.680 Great of Russia was heading down the river with a 0:12:26.679 --> 0:12:28.680 bunch of dignitaries that she was trying to impress. 
She 0:12:28.720 --> 0:12:31.679 hired this guy Potemkin. Actually he was her lover as 0:12:31.679 --> 0:12:34.640 well as a military general, but she got him to 0:12:34.679 --> 0:12:38.760 go down the river a long way and build what 0:12:38.960 --> 0:12:42.120 looked like a facade of a village so that when 0:12:42.200 --> 0:12:44.440 the ship went by, all the dignitaries would be impressed 0:12:44.440 --> 0:12:47.120 that there was this village. And he got all these 0:12:47.160 --> 0:12:49.800 peasants to, like, walk around happily and stuff. But there were 0:12:49.840 --> 0:12:53.560 no buildings. It was just the front face of the building. 0:12:53.559 --> 0:12:57.320 And then when the ship passed, he deconstructed this and 0:12:57.800 --> 0:13:00.679 went ahead and built another village so that they passed 0:13:00.720 --> 0:13:02.840 another great village, so it looked like things were really 0:13:02.880 --> 0:13:07.200 happening there. Anyway, cognition is often like this, where we think, oh, yeah, 0:13:07.200 --> 0:13:10.520 I got it. Here's an example that I often use: 0:13:10.600 --> 0:13:13.200 for anybody listening, take out a piece of paper, 0:13:13.240 --> 0:13:15.560 and draw a bicycle, draw a bike. 0:13:16.080 --> 0:13:19.080 I've tried this. Yeah, it's hard. Yeah. 0:13:19.000 --> 0:13:20.840 Exactly, it turns out, I mean, something as simple as 0:13:20.840 --> 0:13:24.679 a bike, which you see every day. Yeah, you start realizing, wait, 0:13:24.720 --> 0:13:26.640 actually I don't know exactly where this goes and what's 0:13:26.679 --> 0:13:30.679 the thing. And so anyway, yes, this is an example 0:13:30.720 --> 0:13:33.280 of where we think we have deep knowledge and sometimes 0:13:33.320 --> 0:13:34.960 it's just the facade of something that we know. 0:13:35.280 --> 0:13:37.360 Yeah, and you can apply it on all different levels. 
0:13:37.360 --> 0:13:40.160 So you're describing, like the visual imagery might not be 0:13:40.360 --> 0:13:45.600 very stable, but a lot of concepts are not stable either, 0:13:45.800 --> 0:13:50.480 and we invent ways of making them more stable. Words 0:13:50.640 --> 0:13:53.600 are a classic example of that. Once you have a 0:13:53.640 --> 0:13:57.440 word for something, you can more easily trigger it, you 0:13:57.440 --> 0:13:59.280 can more easily remember it, you can use it and 0:13:59.320 --> 0:14:02.280 manipulate it and apply it to different things. But there's 0:14:02.320 --> 0:14:06.840 a whole class of things that we're constantly inventing to 0:14:06.960 --> 0:14:12.000 better align our minds. They're called cognitive technologies. So writing 0:14:12.080 --> 0:14:17.840 would be one of the original ones. But symbols like math, logic, Yeah. 0:14:17.600 --> 0:14:19.360 So unpack that. What's an example of this? 0:14:19.920 --> 0:14:22.640 So literally, any word that you learn. Let's go back 0:14:22.640 --> 0:14:23.240 to apples. 0:14:23.360 --> 0:14:28.520 So children see apples, they don't have a word associated 0:14:28.560 --> 0:14:32.560 with it, so the likelihood that it's going to spontaneously 0:14:33.120 --> 0:14:36.440 sort of emerge in their mind is very low. Maybe 0:14:36.480 --> 0:14:38.760 if they've seen a couple then there will be some 0:14:38.800 --> 0:14:44.400 sort of increased likelihood or lowered threshold. But once they 0:14:44.440 --> 0:14:48.600 have a word for that thing, that they reliably associate 0:14:48.600 --> 0:14:51.480 it with it, then anybody who says the word anytime 0:14:51.520 --> 0:14:54.280 they hear it, now their brain will elicit that activity 0:14:54.320 --> 0:14:59.160 and it becomes a more stable representation. That's an obvious 0:14:59.280 --> 0:15:02.200 example where where there's actually a physical thing that the 0:15:02.200 --> 0:15:03.040 word can refer to. 0:15:03.160 --> 0:15:04.040 But what about. 
0:15:04.080 --> 0:15:08.120 concepts like love and justice that you can't see? You're 0:15:08.120 --> 0:15:12.040 saying, by assigning a word to it, then we make 0:15:12.120 --> 0:15:15.000 that stable, and then it can become associated 0:15:15.080 --> 0:15:18.400 with a whole web of other concepts, and that web 0:15:18.480 --> 0:15:22.359 becomes increasingly stable when we can 0:15:22.560 --> 0:15:25.040 make the associations 0:15:24.280 --> 0:15:29.280 more robust, more reliable, and then further when we can 0:15:29.320 --> 0:15:34.960 invent things like science where we can really validate causal 0:15:35.000 --> 0:15:39.040 relationships between things, and that makes our representations even more stable. 0:15:39.280 --> 0:15:42.240 I see, so human brains interact with one another and 0:15:43.600 --> 0:15:46.640 work on how do we make these representations stable? How 0:15:46.640 --> 0:15:50.800 do we get knowledge coming out like a Socratic dialectic, 0:15:50.840 --> 0:15:54.240 but with everybody all involved and so on. And 0:15:54.280 --> 0:15:58.600 so your idea when you moved into this field of AGI, 0:15:58.720 --> 0:16:01.520 however one wants to define it, what was your idea? 0:16:01.880 --> 0:16:05.600 Well, so I left academia, which I absolutely loved, but 0:16:05.800 --> 0:16:11.440 I felt an urgency to validate this theory. 0:16:11.560 --> 0:16:13.680 I mean, it's just a theory at this point. 0:16:14.000 --> 0:16:17.920 How do we know whether we are optimized to align 0:16:17.920 --> 0:16:18.240 our minds? 0:16:18.280 --> 0:16:20.640 There's so much evidence to suggest that we are. 0:16:20.800 --> 0:16:23.560 But as Richard Feynman said, you don't really know if 0:16:23.560 --> 0:16:24.920 you understand something until you can build it. 0:16:24.960 --> 0:16:27.280 And so I thought, well, maybe I could build this thing. 
0:16:27.320 --> 0:16:30.480 And the moment is just right because AI is taking 0:16:30.480 --> 0:16:31.000 off again. 0:16:31.200 --> 0:16:35.280 It's waking up from one of the winters, and I 0:16:35.360 --> 0:16:41.280 was watching the scaling have really impressive results with a 0:16:41.320 --> 0:16:43.520 deep learning, which felt really good because as somebody who 0:16:43.520 --> 0:16:46.800 had a background in neuroscience, just like, oh yeah, inspiration 0:16:47.320 --> 0:16:52.440 from brains is actually proving to be really effective. So 0:16:53.160 --> 0:16:57.720 I moved into tech and started collaborating with the engineers 0:16:57.760 --> 0:17:02.320 who were trying to build ever more capable intelligence. I'm 0:17:02.320 --> 0:17:06.480 now in one of these frontier AGI labs and the 0:17:06.600 --> 0:17:08.520 thing that we are going to be doing, which I 0:17:08.520 --> 0:17:12.159 think is really differentiated from other approaches, is try to 0:17:12.160 --> 0:17:15.680 build the communicative drive. Can we build agents that are 0:17:15.800 --> 0:17:21.720 optimized for understanding each other's perspectives? And from that, can 0:17:21.760 --> 0:17:26.600 we get emergent behaviors, emergent capabilities that we wouldn't get 0:17:26.640 --> 0:17:29.160 from a single model on its own. 0:17:29.320 --> 0:17:32.520 So I just want to slow that down. So communicative drive, 0:17:32.960 --> 0:17:34.800 that's the first time we've heard the term, So tell 0:17:34.840 --> 0:17:35.720 us what that means. 0:17:36.040 --> 0:17:40.800 So communicative drive is the phrase that I use to 0:17:41.080 --> 0:17:45.399 describe this compulsion that we have to align our minds 0:17:45.480 --> 0:17:51.440 to establish representational alignment. You can imagine how the communicative 0:17:51.560 --> 0:17:57.760 drive would interact with other dispositions that humans have. And importantly, 0:17:57.760 --> 0:17:59.800 you have to think at the population level. 
So again 0:18:00.080 --> 0:18:04.840 we have variation for every trait, and some of us 0:18:04.840 --> 0:18:08.080 are more open, some of us are more closed to experience. 0:18:08.280 --> 0:18:10.000 But you can imagine. Okay, so in the case of 0:18:10.040 --> 0:18:14.240 somebody who's really closed, Let's say that they are. 0:18:14.000 --> 0:18:21.480 In some communicative exchange and they detect a mismatch. So 0:18:21.520 --> 0:18:26.600 somebody is clearly not understanding what they are saying. 0:18:27.320 --> 0:18:31.560 They have two choices. They can update. 0:18:31.359 --> 0:18:35.080 Their perspectives to the other person's, or they can try 0:18:35.080 --> 0:18:38.240 to get the other person's perspective to look more like theirs. 0:18:38.920 --> 0:18:41.840 What would it take to get another person to come 0:18:41.840 --> 0:18:42.840 to your perspective. 0:18:43.119 --> 0:18:46.360 You'd have to create an artifact. You'd have to create. 0:18:46.280 --> 0:18:50.280 A word or a piece of art or a theory 0:18:50.560 --> 0:18:53.680 to get them to really understand and take on your 0:18:53.760 --> 0:18:54.720 perspective and. 0:18:54.680 --> 0:18:55.560 Close that gap. 0:18:55.880 --> 0:18:59.000 But if you're a very open person, if you're creative, 0:18:59.520 --> 0:19:02.359 that might be your default. But if you're a little 0:19:02.359 --> 0:19:05.440 bit more reserved, maybe you just take on the other 0:19:05.560 --> 0:19:06.760 person's perspective. 0:19:07.640 --> 0:19:09.359 Is it that way or the other way? Sorry? If 0:19:09.400 --> 0:19:11.199 I'm very open, I feel like I would take the 0:19:11.240 --> 0:19:12.600 other person's perspective, so you. 0:19:12.520 --> 0:19:15.719 Can actually imagine both situations. 
Yes. So I score very 0:19:15.800 --> 0:19:18.600 high on openness, and when I'm in communicative exchanges, I 0:19:18.640 --> 0:19:22.640 often feel like, oh, wow, yeah, that's... I've never thought 0:19:22.680 --> 0:19:24.240 of it that way, or maybe I have, and I 0:19:24.240 --> 0:19:26.000 want to add all these things, and, like, it's a 0:19:26.119 --> 0:19:29.800 very cooperative thing. I'm more thinking about the dynamics of 0:19:29.800 --> 0:19:34.000 people who want to maintain tradition and status quo versus 0:19:34.080 --> 0:19:37.000 people who want to challenge that. So oftentimes that maps 0:19:37.000 --> 0:19:41.879 onto the dimension of openness. So if you see that 0:19:42.040 --> 0:19:46.320 everybody around you seems to hold a different perspective than 0:19:46.400 --> 0:19:51.160 you do, you're more likely to conform to their perspectives 0:19:51.160 --> 0:19:54.520 if you're somebody who might be a little bit more conservative, 0:19:54.800 --> 0:19:57.520 not wanting to ruffle feathers, that kind of thing. 0:19:57.760 --> 0:19:59.080 Well, I'm just trying to understand why you use the 0:19:59.080 --> 0:20:04.760 word conservative there, because conservative meaning are like iinin exactly. 0:20:05.320 --> 0:20:08.440 Oh, but you're saying, maintain the group traditions. Yeah, okay, 0:20:08.560 --> 0:20:09.119 got it. 0:20:09.119 --> 0:20:11.360 As opposed to being iconoclastic and innovative. 0:20:11.560 --> 0:20:13.680 I see, I see how you're using it. Okay, great. 0:20:13.880 --> 0:20:16.359 So the idea is that people are always talking, 0:20:16.400 --> 0:20:19.119 and depending on your personality type, what you're trying to 0:20:19.160 --> 0:20:21.159 do is either align yourself with them or them with 0:20:21.200 --> 0:20:24.399 you or whatever, or meet in the middle. But this, 0:20:25.080 --> 0:20:28.840 this you feel, is the key to what human societies 0:20:29.119 --> 0:20:32.359 bring as opposed to looking at individual brains. 
You know, 0:20:32.400 --> 0:20:35.640 the history of neuroscience is all about looking at individual brains. Oh, 0:20:35.640 --> 0:20:37.399 this is how the visual system works, this is how decision 0:20:37.400 --> 0:20:41.240 making works, how hearing works, whatever. But there's this new 0:20:41.280 --> 0:20:43.320 field that's been growing for the last twenty or thirty years, 0:20:43.320 --> 0:20:46.480 which is called social neuroscience, which is all about, gosh, 0:20:46.480 --> 0:20:48.760 we've got a lot of circuitry in our brains that 0:20:48.840 --> 0:20:51.520 cares about other brains. So this is the heart of 0:20:51.560 --> 0:20:55.000 your interest: what happens when people are talking and aligning? 0:20:55.400 --> 0:20:58.639 And why are we so driven to communicate instead of... 0:20:58.680 --> 0:21:00.480 Let's imagine that you and I sat down on a 0:21:00.520 --> 0:21:03.840 bus next to each other, we'd probably chat as opposed 0:21:03.880 --> 0:21:06.440 to just sit there and deal with our own brains. Okay, 0:21:06.520 --> 0:21:09.919 so how does this map onto what you're interested in 0:21:09.920 --> 0:21:10.639 doing in AI? 0:21:11.640 --> 0:21:16.720 Yes, so I am concerned about building something that resembles 0:21:16.880 --> 0:21:20.240 our own intelligence, or something that resembles us, because we 0:21:20.359 --> 0:21:23.480 have all sorts of flaws and biases. 0:21:23.920 --> 0:21:25.600 The variability, I 0:21:25.520 --> 0:21:28.359 think, is very useful, and we wouldn't be intelligent in 0:21:28.359 --> 0:21:30.439 the way that we are without the variability. And you 0:21:30.520 --> 0:21:32.840 might call some of that variability the bias, the unique 0:21:32.880 --> 0:21:33.800 biases that we have. 0:21:34.320 --> 0:21:37.000 But I think if we try to reproduce 
0:21:36.480 --> 0:21:39.320 all of that, we're going to get a mirror of ourselves, 0:21:39.320 --> 0:21:44.640 and that's not always the most effective way to augment 0:21:44.800 --> 0:21:46.960 our intelligence. And I should back up and say, why 0:21:47.000 --> 0:21:49.200 are we doing any of this? Why do we want 0:21:49.240 --> 0:21:51.840 to build intelligence that looks like us? I think the 0:21:51.920 --> 0:21:55.080 assumption that a lot of the people, the engineers, 0:21:55.080 --> 0:21:57.560 in these labs have is that, oh, of course it's 0:21:57.560 --> 0:22:01.359 going to be extremely useful for us. It's going to 0:22:01.560 --> 0:22:06.080 unlock this unprecedented era of human flourishing. But the assumption 0:22:06.160 --> 0:22:08.720 that it's going to be really useful for us, I 0:22:08.760 --> 0:22:11.400 think, is taken for granted, and if you really think 0:22:11.400 --> 0:22:15.000 about it, well, how? Because a lot of the examples 0:22:15.040 --> 0:22:19.440 that we have from recent technology and algorithms are that 0:22:19.920 --> 0:22:24.920 they actually take away our agency. We lose hours to scrolling, 0:22:24.960 --> 0:22:29.400 we get stuck in echo chambers, we have autocomplete take away 0:22:29.560 --> 0:22:31.600 our thinking, and we're starting to see 0:22:31.440 --> 0:22:34.680 the same kinds of things with chatbots. 0:22:35.600 --> 0:22:38.639 We're also seeing that people are using these technologies and 0:22:39.359 --> 0:22:44.320 very much augmenting their own intelligence. I feel sometimes 0:22:44.480 --> 0:22:48.440 like I'm having entirely new thoughts at an unprecedented pace 0:22:48.800 --> 0:22:51.359 when I'm going back and forth, just like when you're 0:22:51.400 --> 0:22:54.480 having amazing conversations with other people. We use each 0:22:54.480 --> 0:22:56.560 other's minds as tools, but you can just do that 0:22:56.600 --> 0:22:59.840 at a more rapid pace. 
So it's not a foregone 0:23:00.000 --> 0:23:06.159 conclusion that giving the AI more capabilities, making it smarter 0:23:06.240 --> 0:23:09.240 and giving it more agency is going to be good 0:23:09.240 --> 0:23:11.840 for us. I think we have to turn that on 0:23:11.920 --> 0:23:14.200 its head and say, what would it take to make 0:23:14.480 --> 0:23:18.359 AI that makes us smarter and gives us more agency? 0:23:19.200 --> 0:23:22.960 And that would be, by definition, something that is good 0:23:23.040 --> 0:23:27.160 for us. So how do we do that? I don't 0:23:27.200 --> 0:23:29.880 think that we want to have agents that have their 0:23:29.920 --> 0:23:33.560 own drives to survive and manipulate us and have all 0:23:33.600 --> 0:23:39.000 of the status-seeking situations that we have. But 0:23:39.480 --> 0:23:44.640