WEBVTT - Ep105 "What if AI is not actually intelligent?" (with Alison Gopnik) 0:00:05.160 --> 0:00:09.040 Is AI an intelligent agent or is there a totally 0:00:09.039 --> 0:00:12.080 different way that we should be thinking about this? Perhaps 0:00:12.080 --> 0:00:16.520 it's more like a piece of cultural technology. What in 0:00:16.560 --> 0:00:21.360 the world is cultural technology? And how would rethinking this 0:00:21.960 --> 0:00:25.160 change the way we approach what to do next? And 0:00:25.200 --> 0:00:26.880 what does any of this have to do with the 0:00:26.920 --> 0:00:31.479 myth of the golem or Socrates, or the printing press 0:00:31.720 --> 0:00:39.000 or Martin Luther or the story of stone soup? Welcome 0:00:39.000 --> 0:00:42.199 to Inner Cosmos with me, David Eagleman. I'm a neuroscientist and 0:00:42.240 --> 0:00:45.919 an author at Stanford, and in these episodes we examine 0:00:46.000 --> 0:00:49.720 brains and the world around us to understand who we 0:00:49.840 --> 0:01:09.319 are and where we're going. So let's start with the 0:01:09.520 --> 0:01:12.280 appreciation that we are smack in the middle of the 0:01:12.280 --> 0:01:17.160 most dramatic technological shift in human history. Every few weeks, 0:01:17.200 --> 0:01:21.119 a new AI system is released that can answer questions 0:01:21.160 --> 0:01:25.479 of increasing complexity. As we've all seen, it can write 0:01:25.520 --> 0:01:29.800 beautiful prose. It can punch out incredibly good code for software. 0:01:30.319 --> 0:01:34.720 It composes music. It mimics voices. It produces images so 0:01:34.880 --> 0:01:38.160 realistic that we've long ago lost the ability to tell 0:01:38.200 --> 0:01:39.679 if a photo is real or not. 0:01:40.160 --> 0:01:41.760 And this increasingly applies 0:01:41.440 --> 0:01:45.560 to video as well. So with every leap forward, the 0:01:45.600 --> 0:01:51.440 same questions become louder. Is this an intelligent agent? Is 0:01:51.480 --> 0:01:55.520 it conscious?
Will it one day surpass us? And if so, 0:01:55.600 --> 0:01:59.200 what happens to us? Today's episode is about how to 0:01:59.280 --> 0:02:01.720 think about this from an angle that 0:02:01.640 --> 0:02:03.080 will probably surprise you. 0:02:03.440 --> 0:02:06.880 So as it stands now, in the public conversation about AI, 0:02:07.200 --> 0:02:10.680 we really have just one metaphor, which is that AI 0:02:10.960 --> 0:02:16.040 is an intelligent agent and of increasing intelligence. We all 0:02:16.080 --> 0:02:20.320 talk about these systems as digital minds that can reason 0:02:20.360 --> 0:02:24.240 and plan and act and perhaps even, at some point, desire. 0:02:25.000 --> 0:02:28.320 This narrative of an AI with its own mind has 0:02:28.360 --> 0:02:30.880 always been with us in science fiction, of course, but 0:02:30.919 --> 0:02:34.440 today we hear it constantly in policy conversations and in 0:02:34.520 --> 0:02:38.680 media headlines. And whether the tone is optimistic or anxious, 0:02:39.040 --> 0:02:43.600 the underlying premise is the same: that these are minds 0:02:43.760 --> 0:02:46.079 in the making, that we are witnessing the birth of 0:02:46.120 --> 0:02:50.799 a new kind of intelligence. But what if that metaphor 0:02:51.400 --> 0:02:55.680 is misleading, so much so that it's sending our conversations, 0:02:56.000 --> 0:03:01.400 our policy, our research priorities off course? So today's episode 0:03:01.639 --> 0:03:06.720 is about reframing what large AI models really are and 0:03:06.760 --> 0:03:10.720 what they aren't. My guest today is Alison Gopnik. She's 0:03:10.760 --> 0:03:13.880 a professor of psychology at Berkeley, very well known in 0:03:13.919 --> 0:03:18.040 the areas of cognitive and language development. She studies infants 0:03:18.080 --> 0:03:21.760 and young children to understand how learning takes place. And 0:03:21.800 --> 0:03:23.799 she was just, by the way, elected to the National 0:03:23.800 --> 0:03:26.919 Academy of Sciences.
But I'm talking with her today about 0:03:26.919 --> 0:03:29.560 a new paper she co-authored in the journal Science 0:03:29.840 --> 0:03:35.040 about AI with colleagues Henry Farrell, Cosma Shalizi, and James Evans. 0:03:35.400 --> 0:03:37.120 The paper argues that 0:03:37.120 --> 0:03:40.920 we should stop thinking of large models as intelligent agents 0:03:41.480 --> 0:03:45.480 and instead see them as a new kind of cultural 0:03:45.560 --> 0:03:47.160 and social technology. 0:03:47.520 --> 0:03:48.440 Now, what does that mean? 0:03:48.960 --> 0:03:50.560 Well, I'll give you a quick preview and then we'll 0:03:50.600 --> 0:03:54.520 jump into the interview. All throughout history, humans have built 0:03:54.520 --> 0:03:59.480 tools to organize information and transmit it. Think of spoken 0:03:59.600 --> 0:04:06.480 language and then writing, and then printing, libraries, television, the Internet. 0:04:06.720 --> 0:04:11.280 Each of these systems reshaped human culture, not because they 0:04:11.280 --> 0:04:16.120 were intelligent in themselves, but because they allowed information to 0:04:16.279 --> 0:04:21.279 be shared and transformed and coordinated in new ways. Just 0:04:21.279 --> 0:04:25.280 think of how the printing press amplified voices, or how 0:04:25.320 --> 0:04:30.440 something like markets distill the messy complexity of economies into 0:04:30.760 --> 0:04:32.440 a single price 0:04:32.279 --> 0:04:34.760 signal: how much does this thing cost? 0:04:35.520 --> 0:04:39.400 Or how bureaucracies take the chaos of signals and sort 0:04:39.440 --> 0:04:43.840 it into categories. These are not minds, but they are 0:04:43.880 --> 0:04:48.880 powerful technologies of culture, technologies that change how we all 0:04:49.000 --> 0:04:50.120 think and how we 0:04:50.040 --> 0:04:50.960 act and how we live. 0:04:51.240 --> 0:04:54.080 So the argument we'll hear today is that large AI 0:04:54.200 --> 0:04:59.680 models are best understood in this lineage.
They don't think, 0:05:00.120 --> 0:05:05.160 but they process the vast collective output of human thought. 0:05:05.760 --> 0:05:09.480 They are trained on millions of texts and images and voices, 0:05:09.839 --> 0:05:13.839 everything from Shakespeare to Reddit threads to government paperwork, and 0:05:13.880 --> 0:05:19.159 they summarize and reorganize and remix that cultural data. And 0:05:19.200 --> 0:05:22.400 they also surface patterns in the data that maybe hadn't 0:05:22.440 --> 0:05:25.479 been seen before. And so when you're interacting with such 0:05:25.480 --> 0:05:28.320 a piece of technology, let's say asking it to write 0:05:28.360 --> 0:05:31.600 you a poem or explain a concept, you're not talking 0:05:31.640 --> 0:05:36.520 to a mind. You're participating with a kind of cultural 0:05:36.640 --> 0:05:40.520 compression and recombination machine. So see what you think of 0:05:40.560 --> 0:05:44.000 the perspective that you hear today, because it can change 0:05:44.080 --> 0:05:48.479 our concerns and our eventual legislative approaches. If we stop 0:05:48.560 --> 0:05:52.040 assuming that these are minds and instead treat them as 0:05:52.600 --> 0:05:58.240 cultural infrastructures, like search engines or even democratic institutions, then 0:05:58.279 --> 0:06:00.839 we can start asking the questions. 0:06:01.200 --> 0:06:03.240 One quick thing before we jump into the interview. 0:06:03.760 --> 0:06:07.840 You've heard of large language models, LLMs, and more recently 0:06:08.000 --> 0:06:11.400 large multimodal models that are trained on words and images 0:06:11.800 --> 0:06:14.880 and increasingly other data as well. So nowadays we just 0:06:14.960 --> 0:06:18.640 refer to these as large models. So here's my interview 0:06:18.640 --> 0:06:25.560 with Alison Gopnik.
So, Alison, before we get started talking 0:06:25.560 --> 0:06:29.120 about AI, you have built a very wonderful career studying 0:06:29.720 --> 0:06:32.640 scientists who are unusually small and spend most of their 0:06:32.680 --> 0:06:34.320 time lying down. So tell us about that. 0:06:35.080 --> 0:06:39.000 So what I've always been most interested in is how 0:06:39.080 --> 0:06:41.440 is it that people can figure out the world around them? 0:06:41.480 --> 0:06:44.039 How is it that human beings, with just a bunch 0:06:44.120 --> 0:06:47.440 of photons hitting our eyes and little disturbances of air 0:06:47.560 --> 0:06:50.000 at our ears, nevertheless know about a world of 0:06:50.040 --> 0:06:54.360 people and objects and ultimately quarks and distant planets? How 0:06:54.360 --> 0:06:56.000 could we ever do that? How could we ever learn 0:06:56.040 --> 0:06:58.359 so much from so little? And of course the people 0:06:58.360 --> 0:07:01.400 who are doing that more than anyone else are little children. 0:07:01.760 --> 0:07:04.920 So for the past forty years, what I've been doing 0:07:04.960 --> 0:07:07.240 is trying to figure out how is it that even 0:07:07.279 --> 0:07:10.480 little children can learn so much so quickly from such 0:07:10.920 --> 0:07:14.600 little information. And one of the questions is what kinds 0:07:14.600 --> 0:07:17.080 of computations, what's going on in their brains? What are 0:07:17.120 --> 0:07:21.400 their brains and minds doing that lets them solve these 0:07:21.520 --> 0:07:25.119 really deep problems so quickly and so effectively? And that's 0:07:25.160 --> 0:07:28.239 been the central idea in my career. And it's turned 0:07:28.240 --> 0:07:31.679 out that by looking at kids empirically, by actually studying 0:07:31.680 --> 0:07:34.960 them as scientists, we've discovered that they both know more 0:07:35.000 --> 0:07:37.160 and learn more than we ever would have thought before.
0:07:37.200 --> 0:07:38.480 They're the best learners 0:07:38.120 --> 0:07:39.320 that we know of in the universe. 0:07:39.600 --> 0:07:40.080 Amazing. 0:07:40.400 --> 0:07:44.520 So we're in this quite remarkable time where for both 0:07:44.520 --> 0:07:48.360 of us we've been doing, you know, the same research 0:07:48.440 --> 0:07:51.640 that we've been doing for many decades, and one might 0:07:51.680 --> 0:07:54.200 have thought five years ago, okay, we'll probably be doing 0:07:54.240 --> 0:07:57.080 that in twenty twenty five. But suddenly the world has 0:07:57.120 --> 0:08:02.120 really changed around us because of AI, and so you 0:08:02.160 --> 0:08:03.800 and I both are spending a lot of our time 0:08:03.920 --> 0:08:07.080 writing about that and thinking about how to position AI, 0:08:07.160 --> 0:08:09.680 how to understand what it does and does not mean. 0:08:10.440 --> 0:08:13.040 So a lot of people, of course, are concerned about 0:08:14.360 --> 0:08:17.560 superintelligence and the alignment problem, and so on. But 0:08:17.640 --> 0:08:20.880 you and your colleagues have a quite different take that 0:08:20.920 --> 0:08:24.520 you just wrote up in the journal Science in March, 0:08:25.040 --> 0:08:27.200 and I thought it was a really lovely paper. So 0:08:27.240 --> 0:08:29.480 that's what I want to ask you about. So you 0:08:29.760 --> 0:08:33.000 are saying the right way to look at large 0:08:33.040 --> 0:08:36.480 models is as a social and cultural technology. 0:08:36.559 --> 0:08:38.840 So let's unpack that. Right. 0:08:38.960 --> 0:08:41.400 So, as I said, you know, my career has been 0:08:41.440 --> 0:08:43.040 about how could we learn as much as we do? 0:08:43.160 --> 0:08:44.880 And how do children learn as much as they do?
0:08:45.320 --> 0:08:48.400 And part of that has always been, if we wanted 0:08:48.400 --> 0:08:51.040 to design a computer or design an artificial system that 0:08:51.040 --> 0:08:53.360 could learn the way children do, what would that system 0:08:53.440 --> 0:08:55.720 look like, what could we put in, what kinds of 0:08:55.760 --> 0:08:59.440 things would it have to do? So for twenty years 0:08:59.520 --> 0:09:02.560 I've been collaborating with computer scientists about what would that 0:09:02.640 --> 0:09:05.360 kind of artificial system look like? But as you say, 0:09:05.760 --> 0:09:07.920 even though this has been a long project, in the 0:09:08.000 --> 0:09:11.880 last five years or so, these advances in AI have 0:09:12.000 --> 0:09:14.480 really made us think about that in a different way. 0:09:14.720 --> 0:09:17.480 And one of the interesting things is the big advances 0:09:17.520 --> 0:09:20.559 have been in machine learning. They've actually been in designing 0:09:20.640 --> 0:09:23.960 systems that don't just know things, but can learn things. 0:09:24.040 --> 0:09:26.160 And as I say, children are the best learners we 0:09:26.160 --> 0:09:28.480 know of in the universe. So there's been a really 0:09:28.520 --> 0:09:31.120 interesting development, which is a lot of the people in 0:09:31.160 --> 0:09:35.720 AI have been turning to developmental psychologists like me to say, look, 0:09:35.760 --> 0:09:38.199 could we get some clues from how children are learning 0:09:38.440 --> 0:09:41.640 to design systems that could learn in 0:09:41.640 --> 0:09:45.400 the same way?
Now, the interesting thing is that what 0:09:45.679 --> 0:09:49.079 actually has happened in AI, specifically in the last five 0:09:49.160 --> 0:09:51.760 years or so, are these large models, these large language 0:09:51.800 --> 0:09:55.120 models and more recently large language and vision models, and 0:09:55.160 --> 0:09:57.880 they are the things that have really revolutionized our everyday 0:09:57.920 --> 0:10:02.000 interactions with AI. It's important to say those are really 0:10:02.160 --> 0:10:05.480 really different from children, and in fact, they're different from humans. 0:10:05.520 --> 0:10:08.440 They're doing something that I think is really really different 0:10:08.480 --> 0:10:15.280 from human intelligence. And it's very natural for people to think, oh, okay, 0:10:15.320 --> 0:10:18.000 look, I talked to ChatGPT and I get an answer back, 0:10:18.320 --> 0:10:20.720 it must have the same kind of intelligence that my 0:10:20.800 --> 0:10:23.320 friend does or my child does. And it turns out 0:10:23.400 --> 0:10:26.480 that that's not true. Those systems are really really different. 0:10:26.760 --> 0:10:28.600 So before we go on, let's unpack that a little 0:10:28.640 --> 0:10:30.080 bit. In what ways are they different? 0:10:30.320 --> 0:10:34.920 So a very common kind of model for how 0:10:34.920 --> 0:10:37.640 AI works is to think of it as if something 0:10:37.679 --> 0:10:40.880 like ChatGPT is an agent, an intelligent agent in the world, 0:10:41.000 --> 0:10:43.880 like a person or even an animal that you know about. 0:10:45.160 --> 0:10:48.960 But that's actually, I think, an illusion. That's what we've argued. 0:10:49.480 --> 0:10:51.720 A better way to think about it is that as 0:10:51.800 --> 0:10:55.600 long as we've been human, we've learned from other people, 0:10:55.800 --> 0:10:59.120 and we've had great technological advances that have helped us 0:10:59.440 --> 0:11:02.400 to learn more effectively from more and more people.
So 0:11:02.440 --> 0:11:07.000 if you think about language itself, or writing or print, 0:11:07.360 --> 0:11:11.000 those are all examples of technological changes that made us 0:11:11.080 --> 0:11:13.880 able to get more information from others. And what the 0:11:13.960 --> 0:11:16.600 large models do is not go out into the world 0:11:16.600 --> 0:11:19.560 and learn and think the way that babies do. What 0:11:19.600 --> 0:11:24.960 they do is summarize information that human beings have actually 0:11:25.000 --> 0:11:28.000 already discovered. So what they do is take all the 0:11:28.040 --> 0:11:30.440 information and knowledge that human beings have put out on 0:11:30.840 --> 0:11:34.800 the web, essentially, and then summarize that in a way 0:11:34.840 --> 0:11:38.440 that lets other people access it more efficiently. So 0:11:38.600 --> 0:11:42.520 the technological development is much more like something 0:11:42.600 --> 0:11:45.400 like writing, that lets you find out what other people 0:11:45.400 --> 0:11:48.760 are thinking, than it is creating a system that could 0:11:48.800 --> 0:11:49.800 learn and think itself. 0:11:50.160 --> 0:11:50.439 Yeah. 0:11:50.720 --> 0:11:53.200 In a previous episode of Inner Cosmos, they did a calculation 0:11:53.280 --> 0:11:56.920 showing that the amount of information that one of these 0:11:57.000 --> 0:12:01.440 LLMs consumes would take you one thousand lifetimes 0:12:00.880 --> 0:12:01.679 for you to read. 0:12:02.600 --> 0:12:07.080 And so it's consumed more than you could ever imagine. 0:12:07.280 --> 0:12:11.280 And what it's doing fundamentally is when you ask it 0:12:11.320 --> 0:12:13.959 a question, it's giving you an echo of the human 0:12:14.000 --> 0:12:16.200 intelligence that's already in there. So I call this the 0:12:16.240 --> 0:12:19.320 echo intelligence illusion, where we feel like, wow, that thing's 0:12:19.360 --> 0:12:23.600 really smart. But it's not smart.
It's not smart in 0:12:23.640 --> 0:12:27.640 the same way that a human is. It's taking advantage 0:12:27.679 --> 0:12:29.480 of all the things that are already out there. 0:12:29.520 --> 0:12:31.760 So here's a way I like 0:12:31.840 --> 0:12:35.440 to convey this. I think, you know, storytelling, as you know, 0:12:35.600 --> 0:12:38.720 is really important. So here's two stories you could tell 0:12:39.040 --> 0:12:42.840 about how current AI works. So one story is sort 0:12:42.880 --> 0:12:45.800 of the story of the golem, right, the Rabbi of 0:12:45.880 --> 0:12:49.319 Prague and the golem. You create this artificial system and 0:12:49.960 --> 0:12:53.000 it's magical and you put special magic in it, and 0:12:53.040 --> 0:12:55.080 then it turns into something that's almost alive. 0:12:55.200 --> 0:12:57.839 And it's interesting for anyone who doesn't know about the 0:12:57.840 --> 0:13:00.320 story of the golem. That was a figure made of clay. 0:13:01.240 --> 0:13:04.959 He was brought to life and defended the community. 0:13:05.760 --> 0:13:09.440 But then, well, it turns out that these stories about 0:13:09.480 --> 0:13:12.319 what would happen if you had something that wasn't human, 0:13:12.360 --> 0:13:15.120 that was artificial, that you brought to life, are really 0:13:15.160 --> 0:13:18.760 ancient. They're way before even the Industrial Revolution. And I 0:13:18.760 --> 0:13:21.200 can tell you right now it never ends well. The 0:13:21.320 --> 0:13:26.040 story always, invariably, the end of the story is that 0:13:26.280 --> 0:13:29.719 some terrible thing happens and the golem goes mad and 0:13:30.640 --> 0:13:32.240 causes trouble and chaos. 0:13:32.200 --> 0:13:36.200 And that inspired Frankenstein some hundreds of years later. They don't 0:13:36.200 --> 0:13:36.960 have the same character.
0:13:37.200 --> 0:13:41.400 So there's this very basic human fear about what would 0:13:41.440 --> 0:13:44.240 it be like if there was something that wasn't actually 0:13:44.360 --> 0:13:46.920 living that you treated as if it was living, 0:13:47.000 --> 0:13:47.760 that you treated as 0:13:47.640 --> 0:13:48.400 if it was an agent. 0:13:48.440 --> 0:13:51.920 And I think that basic picture, that's the sci-fi picture, 0:13:52.080 --> 0:13:54.920 that's the picture that a lot of people, including people 0:13:54.920 --> 0:13:57.680 in the AI world themselves, have about what's happened in AI. 0:13:58.320 --> 0:14:01.560 Here's a really different story, also an ancient story. 0:14:01.600 --> 0:14:04.240 This is the story of stone soup. So what's the 0:14:04.280 --> 0:14:06.280 story of stone soup? The story of stone soup is 0:14:06.280 --> 0:14:09.960 there's visitors who come to a village and they say, 0:14:10.000 --> 0:14:12.280 we'd like some food, and the villagers say, no, we 0:14:12.320 --> 0:14:14.360 don't have any extra food. And they say, it's okay, 0:14:14.360 --> 0:14:16.319 we're going to make stone soup. And they take out 0:14:16.320 --> 0:14:18.880 a big pot. They put a couple of stones in it, 0:14:19.160 --> 0:14:21.360 they put some water in, they start to boil it 0:14:21.480 --> 0:14:23.240 up and they say, this will be delicious. We're going 0:14:23.280 --> 0:14:25.240 to make stone soup just with these stones, and the 0:14:25.280 --> 0:14:28.760 villagers say, really? They say, yeah, it would be even 0:14:28.840 --> 0:14:30.960 better if we had an onion and a carrot in it, 0:14:31.000 --> 0:14:32.360 but if we don't, we don't. 0:14:32.400 --> 0:14:33.520 And the villager says,
0:14:33.440 --> 0:14:35.360 I think I have an onion and a carrot somewhere, 0:14:35.440 --> 0:14:38.360 and they go and put it in and then they say, 0:14:38.480 --> 0:14:40.320 you know, when we made this for the rich people, 0:14:40.400 --> 0:14:42.800 we put barley and buttermilk in it, which makes it 0:14:42.840 --> 0:14:45.800 even better. But it's okay, it'll still be good stone soup. 0:14:45.840 --> 0:14:48.360 And another villager goes and gets the barley and buttermilk, 0:14:48.400 --> 0:14:49.360 and you can imagine 0:14:49.080 --> 0:14:49.640 how this goes. 0:14:49.680 --> 0:14:52.320 And they say, the king said that we should put 0:14:52.320 --> 0:14:54.480 a chicken in it, which would make it really royal, 0:14:54.960 --> 0:14:57.760 but we don't have any chicken. So another villager goes 0:14:57.760 --> 0:14:59.760 and gets the chicken from the back, and by the 0:14:59.800 --> 0:15:02.480 time they're done, of course, they have this really wonderful 0:15:02.520 --> 0:15:05.440 soup with all the contributions from all the villagers, and 0:15:05.600 --> 0:15:07.720 they go to eat it, and the villagers say, this 0:15:07.880 --> 0:15:10.680 is amazing. There's this wonderful soup and it was just 0:15:10.800 --> 0:15:13.600 made from stones. Okay, here's the modern version of this. 0:15:14.000 --> 0:15:17.280 There's a bunch of tech guys and they go to 0:15:17.320 --> 0:15:20.600 the village of computer users and they say, we're going 0:15:20.680 --> 0:15:25.360 to make artificial general intelligence just from gradient descent and 0:15:25.600 --> 0:15:30.280 transformers and a few algorithms. And the computer users say, 0:15:30.320 --> 0:15:33.120 that sounds great. We're gonna have artificial general intelligence. And 0:15:33.160 --> 0:15:34.920 they say, yeah, but it would be better if we 0:15:34.960 --> 0:15:37.000 had more data. What we need for this, as you 0:15:37.120 --> 0:15:39.360 just said, David, is lots and lots of data.
0:15:39.440 --> 0:15:41.520 Could you guys put all of 0:15:41.440 --> 0:15:44.120 your texts and pictures on the internet for us and 0:15:44.160 --> 0:15:45.840 then let us use them to 0:15:45.800 --> 0:15:48.240 train our systems? And the computer users say, oh, 0:15:48.280 --> 0:15:49.000 that sounds good. 0:15:49.080 --> 0:15:51.960 We'll just keep putting more of our pictures and our 0:15:52.440 --> 0:15:56.680 writings and our books on the internet, and I guess 0:15:56.720 --> 0:15:59.360 you can just use them all for free. And then 0:15:59.480 --> 0:16:02.840 the tech person says, oh, this is really good. 0:16:02.840 --> 0:16:04.600 This is getting to be more intelligent. But you know, 0:16:04.640 --> 0:16:06.840 it still says really stupid things. A lot of the 0:16:06.840 --> 0:16:10.760 time it says weird things. So what we could do 0:16:10.880 --> 0:16:13.360 is reinforcement learning from human feedback, which is actually a 0:16:13.440 --> 0:16:16.360 really important part of these systems. What we'll do is 0:16:16.400 --> 0:16:19.760 we'll give them to humans and then people can say 0:16:19.800 --> 0:16:21.880 whether what they're saying is good or not, and then 0:16:21.920 --> 0:16:23.280 we'll use that for the training. 0:16:24.120 --> 0:16:27.000 The computer users say, oh, okay, we're happy to 0:16:27.080 --> 0:16:27.440 do that. 0:16:27.560 --> 0:16:29.640 We'll actually go out and say whether this is good 0:16:29.760 --> 0:16:30.960 or not. There's a whole, and this 0:16:31.000 --> 0:16:31.840 is literally true, 0:16:31.880 --> 0:16:34.920 there are whole villages in Kenya that will do this 0:16:35.080 --> 0:16:36.880 for very small amounts of money. 0:16:37.840 --> 0:16:39.560 And the tech bro said, oh, 0:16:39.440 --> 0:16:42.760 look, see, it's even smarter, but it's still saying really 0:16:42.760 --> 0:16:46.440 stupid things sometimes. How about if you did prompt engineering?
0:16:46.520 --> 0:16:49.640 So think really hard about exactly how to ask it 0:16:49.680 --> 0:16:52.720 the right questions so that you can get the right answers, 0:16:52.720 --> 0:16:54.760 because otherwise it's going to say stupid things. And the 0:16:54.880 --> 0:16:57.320 users say, oh, okay, we'll do that. We'll sit down 0:16:57.320 --> 0:17:00.120 and we'll figure out how to do prompt engineering. At 0:17:00.160 --> 0:17:04.160 the end of this process, the tech bros say, see, 0:17:04.280 --> 0:17:07.320 we told you we made artificial general intelligence and it 0:17:07.400 --> 0:17:10.439 was just from a few algorithms. And the computer users 0:17:10.560 --> 0:17:13.439 say, that's amazing, that's amazing, we're going to have 0:17:13.560 --> 0:17:17.240 artificial intelligence, and it's just you, brilliant tech guys, who 0:17:17.400 --> 0:17:20.080 invented it. So, of course, the point of this is 0:17:20.119 --> 0:17:23.040 that it's a sort of debunking story, but it's also, 0:17:23.400 --> 0:17:26.239 in both versions, a positive story, because the point is 0:17:26.280 --> 0:17:30.159 when you have a combination of lots and lots of contributions 0:17:30.200 --> 0:17:34.159 of lots and lots of intelligent people, lots of humans 0:17:34.160 --> 0:17:36.760 who we know are intelligent, both in terms of the 0:17:36.840 --> 0:17:39.280 data they provide and in terms of things like reinforcement 0:17:39.359 --> 0:17:42.840 learning from human feedback and prompt engineering, you've got 0:17:42.920 --> 0:17:46.600 something that's bigger than any individual human could have. 0:17:47.080 --> 0:17:49.520 But it's not that what you've got is a golem. 0:17:49.760 --> 0:17:51.600 It's not that what you've got is an agent that's 0:17:51.640 --> 0:17:54.479 gone out and been intelligent itself. It's really just a 0:17:54.520 --> 0:17:58.080 system for putting together the thoughts of other agents.
0:17:58.320 --> 0:18:01.800 So the lesson that surfaces here is that although we 0:18:01.960 --> 0:18:07.280 humans love to anthropomorphize things and love to make inanimate 0:18:07.320 --> 0:18:12.000 objects into agents in our minds, it's probably not the 0:18:12.080 --> 0:18:15.199 right way to think about these language models. Or let 0:18:15.240 --> 0:18:19.040 me say, these large models: language, vision, multimodal. So what 0:18:19.160 --> 0:18:21.280 is the right way to think about it? So let's 0:18:21.320 --> 0:18:24.080 really unpack this issue about what a social or cultural 0:18:24.119 --> 0:18:25.080 technology is. 0:18:25.400 --> 0:18:31.800 Yeah, so, ever since we've been human, we've made progress 0:18:32.040 --> 0:18:36.040 by learning from other humans, and a number of people, 0:18:36.640 --> 0:18:39.320 like Joseph Henrich, for example, and Rob Boyd, have argued 0:18:39.359 --> 0:18:41.760 that that's kind of our great human gift, 0:18:42.680 --> 0:18:45.520 that really our secret sauce is not so much that 0:18:45.560 --> 0:18:49.520 we can individually learn things that other creatures can't, although 0:18:49.520 --> 0:18:51.400 I think that's part of it, but that we can 0:18:51.440 --> 0:18:54.119 take advantage of all the things that other humans have 0:18:54.240 --> 0:18:57.440 done over many, many, many generations. I like to think 0:18:57.440 --> 0:19:00.960 of this in terms of the postmenopausal grandmother. I think, 0:19:01.200 --> 0:19:03.040 you know, one of the distinctive human things is that 0:19:03.040 --> 0:19:05.320 we have these postmenopausal grandmothers, and a lot of what 0:19:05.359 --> 0:19:08.040 they do is tell us about the things that they've 0:19:08.400 --> 0:19:12.440 learned in their long, wise lives.
And by taking advantage 0:19:12.480 --> 0:19:15.719 of what granny says, you can make progress, even if 0:19:15.760 --> 0:19:18.000 what you're doing is now finding new things that you 0:19:18.000 --> 0:19:22.560 will tell your grandchildren. And that capacity is really the 0:19:22.600 --> 0:19:26.600 capacity that makes us special. And we've had special technologies 0:19:26.680 --> 0:19:30.159 ever since we evolved that tuned up that capacity, that 0:19:30.200 --> 0:19:33.720 made it more powerful. So language itself, of course, which 0:19:33.760 --> 0:19:36.520 is one of our distinctive things about humans, lets us 0:19:36.600 --> 0:19:39.280 learn from others. But even more if you think about 0:19:39.280 --> 0:19:42.679 something like the invention of writing, that enabled us to 0:19:42.760 --> 0:19:45.040 learn from, you know, not just our own granny, but 0:19:45.080 --> 0:19:47.679 grannies who were far away in space and in time. 0:19:48.080 --> 0:19:52.080 And it's fascinating. Socrates famously has a whole section about 0:19:52.080 --> 0:19:55.800 why he thinks writing is a terrible idea, exactly 0:19:55.880 --> 0:19:58.480 because he thinks people will read something in a book 0:19:58.520 --> 0:20:00.879 and they'll think it's actually a person. They'll think that 0:20:00.960 --> 0:20:04.520 it's a person who said this, and it's not. It's 0:20:04.640 --> 0:20:06.600 just something that's written in a book. You won't be 0:20:06.640 --> 0:20:09.560 able to have Socratic dialogues with something that's written in 0:20:09.600 --> 0:20:13.120 a book. It's not really a person, but because it's language, 0:20:13.440 --> 0:20:16.720 we'll treat it as if it's a person. So writing 0:20:16.880 --> 0:20:20.000 is a good example of something that, even though the 0:20:20.040 --> 0:20:24.040 books aren't intelligent, the books in some sense don't know things, 0:20:24.160 --> 0:20:28.000 in another sense, we know things because of books. A way
0:20:28.040 --> 0:20:30.119 I think of this sometimes is, suppose someone asks you, 0:20:30.760 --> 0:20:34.680 who knows more, me or the UC Berkeley library? Well, 0:20:35.000 --> 0:20:37.600 the library has much more knowledge in it. It's got 0:20:37.800 --> 0:20:40.159 vast amounts of knowledge that I could never actually have 0:20:40.280 --> 0:20:43.160 in my head. But it's not the sort of thing 0:20:43.200 --> 0:20:45.639 that knows. I'm the sort of person who knows, and 0:20:45.880 --> 0:20:48.920 I know things because I can do things like consult 0:20:48.960 --> 0:20:52.000 the library. And then you have print, which is even 0:20:52.119 --> 0:20:55.680 more powerful and has even more powerful effects. You have 0:20:56.800 --> 0:21:00.600 video and film. You have pictures, which I think are 0:21:00.640 --> 0:21:03.560 a really important medium that we don't pay enough attention to. 0:21:03.920 --> 0:21:07.040 So when we talk about vision models, for example, they're 0:21:07.040 --> 0:21:10.000 not actually using vision. What they're using is all the 0:21:10.000 --> 0:21:12.199 pictures that we put on the Internet, and pictures are 0:21:12.240 --> 0:21:16.400 a really important source of communication too. So to say 0:21:16.440 --> 0:21:19.119 that it's a cultural technology, to say it's one of 0:21:19.160 --> 0:21:22.199 these technologies that lets humans learn from other humans, is 0:21:22.240 --> 0:21:24.840 not at all to dismiss it. Those cultural technologies are 0:21:24.840 --> 0:21:27.480 the things that have led, for better or for worse, 0:21:27.600 --> 0:21:30.000 to the world that we have now. But it's just 0:21:30.160 --> 0:21:32.920 a really different thing.
It's what philosophers would call a 0:21:32.960 --> 0:21:36.800 category mistake to think that it's like an intelligent agent, 0:21:36.800 --> 0:21:38.359 which is not to say that at some point in 0:21:38.400 --> 0:21:42.560 the future AI might not develop intelligent agents, but that's 0:21:42.600 --> 0:21:44.600 not what the large models are doing, and that's a 0:21:44.680 --> 0:21:47.880