1 00:00:03,520 --> 00:00:06,320 Speaker 1: In a few years, we will use helpful and harmless 2 00:00:06,320 --> 00:00:11,200 Speaker 1: AI systems. That's the premise of today's conversation. I'm Azeem Azhar. 3 00:00:11,640 --> 00:00:18,680 Speaker 1: Welcome to the Exponentially podcast. Today's AI models are complex, 4 00:00:18,760 --> 00:00:21,880 Speaker 1: with hundreds of billions of virtual moving parts. We don't 5 00:00:21,960 --> 00:00:25,239 Speaker 1: so much build them as nurture and nudge them. 6 00:00:25,760 --> 00:00:29,240 Speaker 1: As this technology improves exponentially, how can we trust it? 7 00:00:29,480 --> 00:00:32,159 Speaker 1: Can we really design these systems to be harmless and 8 00:00:32,320 --> 00:00:35,720 Speaker 1: honest as well as helpful? I've come to San Francisco, 9 00:00:35,920 --> 00:00:38,600 Speaker 1: the epicenter of the AI revolution, to talk to a 10 00:00:38,640 --> 00:00:40,879 Speaker 1: man who has staked a lot on being able to 11 00:00:40,880 --> 00:00:45,919 Speaker 1: do just that: Dario Amodei, the founder and CEO of Anthropic. 12 00:00:52,440 --> 00:00:52,640 Speaker 2: Well. 13 00:00:53,200 --> 00:00:56,000 Speaker 1: Dario, it's wonderful to have you here. You're a bona 14 00:00:56,000 --> 00:00:59,400 Speaker 1: fide researcher with papers on AI and AI safety that 15 00:00:59,400 --> 00:01:01,760 Speaker 1: have been cited more than thirty thousand times in just 16 00:01:01,800 --> 00:01:05,800 Speaker 1: the last seven years. But you are at the epicenter 17 00:01:06,240 --> 00:01:09,920 Speaker 1: of an enormous explosion in the field of AI today. 18 00:01:10,080 --> 00:01:10,959 Speaker 1: What does that feel like? 19 00:01:11,360 --> 00:01:15,880 Speaker 2: It feels like a mixture of excitement and concern at 20 00:01:15,880 --> 00:01:18,800 Speaker 2: how fast things are going. I generally alternate between the two.
21 00:01:18,880 --> 00:01:21,800 Speaker 2: You know, on one hand, there's something new and exciting 22 00:01:21,840 --> 00:01:24,520 Speaker 2: every day that comes from us or that comes from 23 00:01:24,640 --> 00:01:26,800 Speaker 2: one of the other many players in the space. I 24 00:01:26,840 --> 00:01:28,520 Speaker 2: always look at things and I say, wow, this is 25 00:01:28,560 --> 00:01:31,360 Speaker 2: so cool, this could be so useful. And then I 26 00:01:31,400 --> 00:01:32,800 Speaker 2: look at the other side of it and I'm going, 27 00:01:32,880 --> 00:01:35,800 Speaker 2: this is all happening so fast that it's hard for 28 00:01:35,880 --> 00:01:39,360 Speaker 2: us to adapt. It's hard for me, running a company 29 00:01:39,400 --> 00:01:41,560 Speaker 2: that makes these things, to keep up with all the 30 00:01:41,560 --> 00:01:43,760 Speaker 2: innovations that we've done, even within the company. 31 00:01:43,920 --> 00:01:45,280 Speaker 3: I totally concur with you. 32 00:01:45,480 --> 00:01:48,440 Speaker 1: I've been in the tech industry since the early nineties. 33 00:01:48,440 --> 00:01:53,040 Speaker 1: I've been through the dot-com bubble, mobile, social. Nothing 34 00:01:53,160 --> 00:01:55,520 Speaker 1: has been as significant as this. 35 00:01:55,840 --> 00:01:58,360 Speaker 2: It runs the gamut all the way from this tiny 36 00:01:58,440 --> 00:02:02,240 Speaker 2: detail in the computer code to, you know, well, 37 00:02:02,240 --> 00:02:04,480 Speaker 2: what does that mean for the way the model interacts 38 00:02:04,520 --> 00:02:05,960 Speaker 2: in this particular use case? 39 00:02:06,200 --> 00:02:09,519 Speaker 1: And also given the interest in AI today, what will 40 00:02:09,520 --> 00:02:12,720 Speaker 1: this mean for truth on the internet? What would it 41 00:02:12,800 --> 00:02:16,120 Speaker 1: mean for jobs for white collar and office workers?
What 42 00:02:16,160 --> 00:02:20,560 Speaker 1: would it mean for national productivity and national competition? I mean, 43 00:02:20,560 --> 00:02:23,520 Speaker 1: these are all questions that people are asking and that 44 00:02:23,639 --> 00:02:25,880 Speaker 1: they're turning to the AI developers in a sense for 45 00:02:25,919 --> 00:02:26,480 Speaker 1: those answers. 46 00:02:26,639 --> 00:02:29,880 Speaker 2: Yeah, I think the multifaceted nature of the technology, 47 00:02:29,919 --> 00:02:33,720 Speaker 2: the generality, means that on one hand, there's this almost 48 00:02:33,919 --> 00:02:37,840 Speaker 2: endless set of possible positive applications to the technology, but 49 00:02:38,040 --> 00:02:40,760 Speaker 2: also when you go to list what are the concerns 50 00:02:40,760 --> 00:02:43,480 Speaker 2: with the technology, that list is also very long. 51 00:02:43,680 --> 00:02:47,160 Speaker 1: There's this challenge that we face though, because the technologies 52 00:02:47,200 --> 00:02:50,440 Speaker 1: are accelerating away along this curve that we're all familiar 53 00:02:50,480 --> 00:02:53,680 Speaker 1: with now, the exponential curve. But the way that human 54 00:02:53,760 --> 00:02:57,360 Speaker 1: dynamics work, human institutions, the way that our laws work, 55 00:02:57,360 --> 00:03:00,520 Speaker 1: our families work, the relationships we have in school and 56 00:03:00,600 --> 00:03:03,160 Speaker 1: at work, they move much slower, they move at a 57 00:03:03,240 --> 00:03:07,320 Speaker 1: more traditional pace, and there is a gap that is emerging. 58 00:03:07,360 --> 00:03:08,160 Speaker 3: Does that worry you? 59 00:03:08,639 --> 00:03:10,840 Speaker 2: Yes, I think that's a good way of describing it.
60 00:03:10,919 --> 00:03:14,800 Speaker 2: We're pouring exponentially more compute into these systems, we're technically 61 00:03:14,840 --> 00:03:17,000 Speaker 2: able to do it, and we're getting better and better 62 00:03:17,040 --> 00:03:19,440 Speaker 2: performance when we do that. But then when we look 63 00:03:19,480 --> 00:03:23,880 Speaker 2: at what that means for society, for disruptions to business, 64 00:03:23,880 --> 00:03:28,760 Speaker 2: disruptions to economic and governmental structures, it's happening faster than 65 00:03:28,760 --> 00:03:32,000 Speaker 2: we can adapt, and so I think on the technical side, 66 00:03:32,040 --> 00:03:36,000 Speaker 2: we need to do more to try and control, measure, 67 00:03:36,160 --> 00:03:39,120 Speaker 2: steer these models more so than we're able to do today. 68 00:03:39,520 --> 00:03:41,760 Speaker 2: And I think on kind of the business and legal 69 00:03:41,800 --> 00:03:45,080 Speaker 2: and regulatory side, we need to find ways for societal 70 00:03:45,120 --> 00:03:48,320 Speaker 2: institutions to adapt faster to the changing technology. 71 00:03:48,400 --> 00:03:50,000 Speaker 1: AI is one of those terms where it means so 72 00:03:50,000 --> 00:03:52,920 Speaker 1: many different things to different people. So what does AI 73 00:03:53,080 --> 00:03:55,680 Speaker 1: mean to you when you say you want to build 74 00:03:55,680 --> 00:03:56,520 Speaker 1: an AI system? 75 00:03:56,560 --> 00:03:59,760 Speaker 2: What do you think of? The systems that we primarily 76 00:03:59,800 --> 00:04:03,600 Speaker 2: work on building are large language models, which are systems 77 00:04:03,920 --> 00:04:05,880 Speaker 2: that you can talk to and they talk back, and 78 00:04:05,920 --> 00:04:08,680 Speaker 2: they can perform tasks for you.
They can program, they 79 00:04:08,680 --> 00:04:12,200 Speaker 2: can answer questions about legal matters, medical matters, and any 80 00:04:12,280 --> 00:04:14,960 Speaker 2: number of topics. So our model Claude is an example 81 00:04:15,000 --> 00:04:15,160 Speaker 2: of this. 82 00:04:15,280 --> 00:04:17,279 Speaker 1: Well, tell me about Claude, though, because I heard the 83 00:04:17,360 --> 00:04:19,960 Speaker 1: name and it's a really cute name, and a friend 84 00:04:19,960 --> 00:04:22,800 Speaker 1: of mine from school had a fluffy skunk that was 85 00:04:22,839 --> 00:04:25,919 Speaker 1: called Claude, so it always has this sweet association for me. 86 00:04:26,080 --> 00:04:28,080 Speaker 2: I think we just wanted a name that sounds like 87 00:04:28,120 --> 00:04:30,720 Speaker 2: it was a friendly assistant or someone that would help you. 88 00:04:30,880 --> 00:04:34,000 Speaker 2: The term we use is helpful, honest, and harmless. So 89 00:04:34,160 --> 00:04:36,839 Speaker 2: what can I reasonably ask of someone that I'm asking 90 00:04:36,880 --> 00:04:38,880 Speaker 2: for help on something? I want them to be helpful 91 00:04:38,880 --> 00:04:40,839 Speaker 2: in the task. I don't want them to do anything 92 00:04:40,960 --> 00:04:43,719 Speaker 2: dangerous or harmful, and I don't want them to mislead me. 93 00:04:43,760 --> 00:04:46,640 Speaker 2: I want them to be honest. And if someone manages 94 00:04:46,680 --> 00:04:48,640 Speaker 2: to do those three things, then, you know, I generally 95 00:04:48,720 --> 00:04:50,920 Speaker 2: feel like they've done a good job being an assistant. 96 00:04:51,000 --> 00:04:54,440 Speaker 1: I want to bring that back to your definition of 97 00:04:54,960 --> 00:04:58,719 Speaker 1: what an AI system is.
Is it a system that 98 00:04:58,760 --> 00:05:01,719 Speaker 1: exhibits those types of human personality characteristics, or is it 99 00:05:01,800 --> 00:05:03,000 Speaker 1: something a little bit different? 100 00:05:03,400 --> 00:05:06,480 Speaker 2: Like, the overall definition of AI, right, for the whole field, 101 00:05:06,560 --> 00:05:10,240 Speaker 2: can be any system that performs any intelligent or pattern 102 00:05:10,320 --> 00:05:13,760 Speaker 2: matching task. So it's possible to build AIs with all 103 00:05:13,880 --> 00:05:16,720 Speaker 2: kinds of different properties. But I think our vision and 104 00:05:16,800 --> 00:05:19,000 Speaker 2: our picture is that we want to build these systems 105 00:05:19,040 --> 00:05:22,680 Speaker 2: to be helpful, honest, and harmless, and if we can 106 00:05:23,040 --> 00:05:29,559 Speaker 2: achieve those consistently, then systems will be beneficial to society. 107 00:05:32,560 --> 00:05:35,159 Speaker 1: So I've used Claude a little bit. I'll let you 108 00:05:35,200 --> 00:05:36,680 Speaker 1: in on a secret. I use it to help me 109 00:05:36,760 --> 00:05:39,560 Speaker 1: with my research for the interviews that I do. What 110 00:05:39,600 --> 00:05:42,120 Speaker 1: are the kind of use cases that you would like 111 00:05:42,200 --> 00:05:44,760 Speaker 1: people to be using the technology for now? 112 00:05:44,880 --> 00:05:47,880 Speaker 2: I think on the helpful side, people often find that 113 00:05:47,960 --> 00:05:51,479 Speaker 2: Claude is more friendly and creative than other models, so 114 00:05:51,560 --> 00:05:55,400 Speaker 2: that's the helpful side. I think honest and harmless are 115 00:05:55,440 --> 00:05:59,520 Speaker 2: often connected to some business use cases that we think 116 00:05:59,520 --> 00:06:02,320 Speaker 2: are important.
So by harmless we mean that we don't 117 00:06:02,360 --> 00:06:06,240 Speaker 2: want Claude to be willing to kind of engage in 118 00:06:06,480 --> 00:06:10,120 Speaker 2: aiding with dangerous or illegal activities. We don't want Claude 119 00:06:10,120 --> 00:06:13,880 Speaker 2: to have prejudices or biases in either direction, really. Right, 120 00:06:13,960 --> 00:06:16,120 Speaker 2: if I present a model that, you know, serves 121 00:06:16,160 --> 00:06:19,240 Speaker 2: as a lawyer or serves in some medical function, it's 122 00:06:19,400 --> 00:06:21,480 Speaker 2: very important that the model be, you know, for a 123 00:06:21,520 --> 00:06:24,560 Speaker 2: human we would call it neutral and professional. For the model, 124 00:06:24,600 --> 00:06:28,120 Speaker 2: we call it harmless. One problem that models often have, 125 00:06:28,320 --> 00:06:31,159 Speaker 2: and I'll be honest, this is still an unsolved 126 00:06:31,200 --> 00:06:33,160 Speaker 2: problem every model has to some extent, is what we 127 00:06:33,200 --> 00:06:34,719 Speaker 2: call the hallucination problem. 128 00:06:34,839 --> 00:06:37,160 Speaker 1: Right, so I find sometimes if I ask these systems 129 00:06:37,440 --> 00:06:41,920 Speaker 1: to give my biography, it'll switch my university, and then 130 00:06:41,960 --> 00:06:43,800 Speaker 1: it'll switch the first place I worked. 131 00:06:44,160 --> 00:06:45,960 Speaker 3: It still looks really credible, but it's wrong. 132 00:06:46,160 --> 00:06:48,440 Speaker 2: Yeah, this is the insidious nature of kind of the 133 00:06:48,520 --> 00:06:52,200 Speaker 2: imperfect systems, right, where, you know, the nightmare is, you know, 134 00:06:52,240 --> 00:06:54,200 Speaker 2: you ask the system for a set of ten facts, 135 00:06:54,680 --> 00:06:58,320 Speaker 2: and all of it sounds professional and credible.
Nine of 136 00:06:58,320 --> 00:07:00,760 Speaker 2: the facts are right, and one of them is wrong 137 00:07:00,839 --> 00:07:04,280 Speaker 2: in some very, very important way. Making it good enough 138 00:07:04,320 --> 00:07:06,240 Speaker 2: so that you can really trust it is one of 139 00:07:06,279 --> 00:07:09,600 Speaker 2: our top priorities at Anthropic. We have a significant, probably 140 00:07:09,640 --> 00:07:12,800 Speaker 2: about a quarter of the team at Anthropic focuses on it. 141 00:07:12,840 --> 00:07:15,800 Speaker 2: But still no one is perfect. We, like everyone else, 142 00:07:15,840 --> 00:07:18,640 Speaker 2: are still, to some extent, plagued by this problem. 143 00:07:18,760 --> 00:07:22,239 Speaker 1: But this problem exists because of the way that these 144 00:07:22,440 --> 00:07:25,800 Speaker 1: large language models are structured. It's the way that they 145 00:07:26,480 --> 00:07:28,440 Speaker 1: I think we don't even say that they're built. They're 146 00:07:28,480 --> 00:07:30,360 Speaker 1: sort of grown in a funny sort of way. I 147 00:07:30,400 --> 00:07:31,120 Speaker 1: think that's something. 148 00:07:31,040 --> 00:07:33,840 Speaker 2: Baked like a cake or something, right, like just stated. 149 00:07:33,560 --> 00:07:36,680 Speaker 1: They're not built like scaffolding is built, or built like 150 00:07:36,760 --> 00:07:40,320 Speaker 1: a car engine is built, where you assemble component after component. 151 00:07:40,920 --> 00:07:43,480 Speaker 2: No, there's in fact two stages of the training. So 152 00:07:43,520 --> 00:07:46,640 Speaker 2: in the first stage you just train the model on 153 00:07:46,880 --> 00:07:49,280 Speaker 2: a huge amount of text, like a huge amount of 154 00:07:49,280 --> 00:07:50,800 Speaker 2: the text on the internet. 155 00:07:51,040 --> 00:07:52,280 Speaker 3: But it's billions of words.
156 00:07:53,080 --> 00:07:56,840 Speaker 2: It's some large fraction of what's available, and literally we 157 00:07:57,000 --> 00:07:59,680 Speaker 2: just train the model to be good at predicting the 158 00:07:59,680 --> 00:08:01,880 Speaker 2: next word in the sentence, predicting each word in the 159 00:08:01,880 --> 00:08:04,760 Speaker 2: sentence one after another. So the model learns a lot 160 00:08:04,800 --> 00:08:08,200 Speaker 2: about the world when you do this. But honestly, one 161 00:08:08,200 --> 00:08:10,600 Speaker 2: thing it doesn't learn is that it shouldn't make things up. 162 00:08:10,720 --> 00:08:13,200 Speaker 2: It's basically trying to predict what would be plausible if 163 00:08:13,240 --> 00:08:15,960 Speaker 2: it came next, not necessarily what is true. So then 164 00:08:16,000 --> 00:08:19,600 Speaker 2: there's a second stage of training done in different ways. 165 00:08:19,640 --> 00:08:21,840 Speaker 2: For example, the state of the art in the field 166 00:08:21,880 --> 00:08:24,520 Speaker 2: is a method called RLHF, reinforcement learning from 167 00:08:24,560 --> 00:08:25,320 Speaker 2: human feedback. 168 00:08:25,440 --> 00:08:27,840 Speaker 1: It's a little bit like how I might train my 169 00:08:28,160 --> 00:08:31,680 Speaker 1: young puppy. You give it rewards when it does well, 170 00:08:31,920 --> 00:08:35,440 Speaker 1: and you may treat it slightly differently if it doesn't 171 00:08:35,480 --> 00:08:36,200 Speaker 1: behave correctly. 172 00:08:36,520 --> 00:08:38,439 Speaker 3: Is it like that or is it more sophisticated? 173 00:08:38,520 --> 00:08:42,040 Speaker 2: Yeah, it's actually quite a lot like that, where instead 174 00:08:42,080 --> 00:08:44,000 Speaker 2: of the owner speaking to the puppy, you just have 175 00:08:44,080 --> 00:08:46,520 Speaker 2: a human rate how well the models are doing. 176 00:08:46,320 --> 00:08:50,080 Speaker 1: And who is the you in that you teach it?
177 00:08:50,120 --> 00:08:51,040 Speaker 1: Is it people like you? 178 00:08:51,080 --> 00:08:51,680 Speaker 3: People like me? 179 00:08:51,880 --> 00:08:54,720 Speaker 2: Yeah. To get a little bit into the details in 180 00:08:54,760 --> 00:08:56,559 Speaker 2: the state of the art method, which, as I said, is RL 181 00:08:56,679 --> 00:08:59,959 Speaker 2: from human feedback. In that method, some number of 182 00:09:00,120 --> 00:09:03,240 Speaker 2: people will be hired. Usually it's contractors who look at 183 00:09:03,240 --> 00:09:06,520 Speaker 2: the model and say, okay, I saw these two responses. 184 00:09:06,600 --> 00:09:08,520 Speaker 2: This one is better than that one. One of the 185 00:09:08,600 --> 00:09:13,240 Speaker 2: reasons why we invented constitutional AI is that that's fairly opaque. 186 00:09:13,240 --> 00:09:15,760 Speaker 2: If someone says, you know, I might ask my model, 187 00:09:15,960 --> 00:09:18,840 Speaker 2: say, a political question, right, and it expresses an opinion, 188 00:09:18,840 --> 00:09:20,600 Speaker 2: and someone gets mad, they say, why does the model 189 00:09:20,600 --> 00:09:23,000 Speaker 2: have this opinion? Or that opinion? All you can 190 00:09:23,040 --> 00:09:27,440 Speaker 2: really say with RLHF is, okay, well, that was the 191 00:09:27,559 --> 00:09:31,120 Speaker 2: average opinion of the thousand contractors that I hired, which 192 00:09:31,160 --> 00:09:32,320 Speaker 2: is not very satisfying. 193 00:09:32,400 --> 00:09:35,520 Speaker 1: The other thing that strikes me about that approach is 194 00:09:35,559 --> 00:09:39,160 Speaker 1: that your model has billions and billions and billions of words 195 00:09:39,200 --> 00:09:42,720 Speaker 1: in it, and it can throw out billions, umpteen billions 196 00:09:42,760 --> 00:09:46,320 Speaker 1: of different sentences. So that's a lot of stuff for 197 00:09:46,440 --> 00:09:49,320 Speaker 1: humans to look at.
I mean, are there even enough 198 00:09:50,000 --> 00:09:51,280 Speaker 1: humans to give feedback? 199 00:09:51,400 --> 00:09:55,079 Speaker 2: The big first stage of training does involve billions of words, 200 00:09:55,400 --> 00:09:58,840 Speaker 2: but actually the second stage of training typically it's maybe, 201 00:09:58,880 --> 00:10:01,080 Speaker 2: I don't know, a thousand humans, each of whom 202 00:10:01,120 --> 00:10:04,240 Speaker 2: gives a thousand ratings over a few days or something. Wow. 203 00:10:04,280 --> 00:10:08,240 Speaker 2: So that is hundreds of billions and then millions. It's 204 00:10:08,320 --> 00:10:11,320 Speaker 2: very difficult conceptually, but it actually doesn't take all that 205 00:10:11,400 --> 00:10:12,080 Speaker 2: much data. 206 00:10:12,240 --> 00:10:16,839 Speaker 1: But you've moved on from this RLHF, reinforcement learning with 207 00:10:16,920 --> 00:10:21,280 Speaker 1: human feedback, to constitutional AI, which introduces a second AI 208 00:10:21,400 --> 00:10:24,439 Speaker 1: system to help train the first one. 209 00:10:24,559 --> 00:10:29,400 Speaker 2: Yes. So basically, in constitutional AI, you write a constitution 210 00:10:29,920 --> 00:10:32,720 Speaker 2: which could be anywhere between one page and ten pages, 211 00:10:33,080 --> 00:10:35,840 Speaker 2: and it basically states the rules that the AI system 212 00:10:35,840 --> 00:10:36,360 Speaker 2: could follow. 213 00:10:36,480 --> 00:10:38,920 Speaker 3: What are the things that are in your constitution for Claude?
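As a rough illustration of the human-feedback stage described above, the sketch below tallies "this one is better than that one" judgments into a win rate per response and works out the scale quoted in the conversation (about a thousand raters giving a thousand ratings each). The response labels and the ratings list are invented for illustration, and a plain win rate is a simplification chosen here; the conversation does not say how the ratings are actually aggregated.

```python
from collections import Counter

# Hypothetical pairwise ratings: (preferred_response, rejected_response).
ratings = [
    ("A", "B"), ("A", "B"), ("B", "A"),
    ("A", "C"), ("C", "B"), ("A", "C"),
]

def win_rates(pairs):
    """Return the fraction of its comparisons that each response won."""
    wins, seen = Counter(), Counter()
    for preferred, rejected in pairs:
        wins[preferred] += 1
        seen[preferred] += 1
        seen[rejected] += 1
    return {name: wins[name] / seen[name] for name in seen}

rates = win_rates(ratings)

# Scale mentioned in the conversation: roughly a thousand raters,
# each giving roughly a thousand ratings.
total_ratings = 1_000 * 1_000

print(rates)
print(total_ratings)
```

The point of the arithmetic is the contrast drawn in the interview: about a million ratings in the second stage, against billions of words in the first.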
214 00:10:39,000 --> 00:10:41,520 Speaker 2: It's evolved over time, but you know, from the beginning, 215 00:10:41,600 --> 00:10:44,280 Speaker 2: I think we started with some things from like the 216 00:10:44,400 --> 00:10:46,840 Speaker 2: UN Charter of Human Rights, things that are hard to 217 00:10:46,840 --> 00:10:49,840 Speaker 2: disagree with, and then we added some things about Claude 218 00:10:49,920 --> 00:10:53,640 Speaker 2: being responsive to the user and various things. There are 219 00:10:53,720 --> 00:10:57,199 Speaker 2: various kinds of harms that we were particularly concerned with, 220 00:10:57,679 --> 00:10:59,839 Speaker 2: kinds of information that are dangerous or illegal. 221 00:10:59,760 --> 00:11:03,680 Speaker 1: But how can you measure then whether Claude 222 00:11:03,800 --> 00:11:06,600 Speaker 1: is behaving as you have trained it? 223 00:11:06,760 --> 00:11:10,319 Speaker 2: That's actually a very difficult and subtle problem, right, because 224 00:11:10,880 --> 00:11:12,880 Speaker 2: I think one of the things about these models is 225 00:11:12,880 --> 00:11:15,559 Speaker 2: that they're incredibly broad. One of the things I've said 226 00:11:15,800 --> 00:11:18,680 Speaker 2: is, you know, often a model might know something, or 227 00:11:18,760 --> 00:11:21,839 Speaker 2: not know something, or have an opinion on something, and 228 00:11:21,880 --> 00:11:24,760 Speaker 2: you don't necessarily know about it until a million people 229 00:11:24,800 --> 00:11:26,720 Speaker 2: have used it or something. To be clear, I think 230 00:11:26,760 --> 00:11:29,280 Speaker 2: this is a bad thing. We shouldn't have to deploy 231 00:11:29,320 --> 00:11:32,080 Speaker 2: the model to a million people to discover that it 232 00:11:32,160 --> 00:11:34,599 Speaker 2: happens to be an expert on some particular type of 233 00:11:34,679 --> 00:11:36,839 Speaker 2: weapons that I would rather not talk about.
234 00:11:37,040 --> 00:11:37,319 Speaker 3: Yes. 235 00:11:37,400 --> 00:11:40,040 Speaker 2: Another example is I don't know the first thing about cricket, 236 00:11:40,080 --> 00:11:42,520 Speaker 2: but Claude is an expert on cricket. Claude is also 237 00:11:42,520 --> 00:11:44,640 Speaker 2: an expert on Japanese history. I don't know the first 238 00:11:44,679 --> 00:11:45,439 Speaker 2: thing about Japanese history. 239 00:11:45,480 --> 00:11:48,439 Speaker 3: I can help you with one of those two. 240 00:11:48,640 --> 00:11:53,680 Speaker 2: Yeah, and so one of our main areas of research 241 00:11:53,840 --> 00:11:56,839 Speaker 2: is trying to detect ahead of time all the things 242 00:11:56,840 --> 00:11:59,800 Speaker 2: that the model is capable of. So it's this very 243 00:12:00,080 --> 00:12:02,760 Speaker 2: open-ended problem. Then we're constantly trying to build up 244 00:12:03,240 --> 00:12:06,199 Speaker 2: kind of evaluations and standards for measuring our model. 245 00:12:06,600 --> 00:12:11,520 Speaker 1: Software and engineering has been very deterministic. Yes, you buy 246 00:12:11,559 --> 00:12:13,880 Speaker 1: a hammer, you know what a hammer does, yeah, clunk. 247 00:12:14,400 --> 00:12:15,160 Speaker 3: You get a piece of 248 00:12:15,120 --> 00:12:18,880 Speaker 1: software like the calculator on your smartphone. It calculates and 249 00:12:18,880 --> 00:12:22,000 Speaker 1: will always give you the same results. And the words 250 00:12:22,040 --> 00:12:24,520 Speaker 1: that you use are that you're working with the model 251 00:12:24,920 --> 00:12:27,559 Speaker 1: so that it doesn't do things that you would rather 252 00:12:27,840 --> 00:12:31,160 Speaker 1: it not do. A bit like a kind uncle talking 253 00:12:31,200 --> 00:12:35,880 Speaker 1: to their slightly difficult nephew.
Are you translating technical language 254 00:12:36,120 --> 00:12:40,560 Speaker 1: into normal English for my benefit, or is this process 255 00:12:40,640 --> 00:12:43,880 Speaker 1: one of rathers and maybes and would-be-betters? 256 00:12:44,400 --> 00:12:46,280 Speaker 2: So when you go to train the system, right, you 257 00:12:46,280 --> 00:12:50,280 Speaker 2: know it requires thousands of computer chips all working in sync. 258 00:12:50,600 --> 00:12:53,640 Speaker 2: There's an incredible precision to the engineering. You know exactly 259 00:12:53,640 --> 00:12:56,280 Speaker 2: what you're making, you know exactly what data is going 260 00:12:56,320 --> 00:12:59,319 Speaker 2: into it, you know exactly how much it costs per hour. 261 00:12:59,400 --> 00:13:03,199 Speaker 2: It has all the hallmarks of precision engineering, same as 262 00:13:03,240 --> 00:13:06,240 Speaker 2: making a semiconductor chip, but on the output it has 263 00:13:06,280 --> 00:13:09,439 Speaker 2: exactly the properties that you talk about. It's much more 264 00:13:09,480 --> 00:13:12,320 Speaker 2: of an art than a science when you look past 265 00:13:12,520 --> 00:13:15,720 Speaker 2: the form and the container into which you're pouring things. 266 00:13:15,760 --> 00:13:18,960 Speaker 2: The pouring process is very predictable, but what you get 267 00:13:18,960 --> 00:13:22,160 Speaker 2: out at the other end is inherently very hard to predict. 268 00:13:22,520 --> 00:13:24,560 Speaker 2: And we're trying to turn it into more of a science, 269 00:13:24,600 --> 00:13:26,840 Speaker 2: but it's not inherently so. It doesn't start that way. 270 00:13:26,880 --> 00:13:28,280 Speaker 2: That's a problem for us to solve. 271 00:13:28,640 --> 00:13:33,360 Speaker 1: I think of this analogy of the first stereo system 272 00:13:33,400 --> 00:13:35,360 Speaker 1: that we had at home, and it had a bass 273 00:13:35,480 --> 00:13:38,160 Speaker 1: dial and a treble dial.
It had two dials that 274 00:13:38,200 --> 00:13:41,040 Speaker 1: you could use to adjust the sound. And when I 275 00:13:41,080 --> 00:13:43,880 Speaker 1: look at these large language models, they have ten billion, 276 00:13:43,960 --> 00:13:46,440 Speaker 1: one hundred billion, five hundred billion dials. You guys call 277 00:13:46,480 --> 00:13:50,160 Speaker 1: them parameters. Does the fuzziness come out of that complexity? 278 00:13:50,559 --> 00:13:53,000 Speaker 2: Yeah, I think it comes out of that complexity. And 279 00:13:53,040 --> 00:13:55,000 Speaker 2: we're not manually turning each of the 280 00:13:55,000 --> 00:13:56,920 Speaker 3: dials, right, it's tedious. 281 00:13:57,400 --> 00:14:00,640 Speaker 2: We have an automated process that kind of decides which 282 00:14:00,640 --> 00:14:02,920 Speaker 2: dials should be turned and how much, based on 283 00:14:03,000 --> 00:14:04,240 Speaker 2: the data that it receives. 284 00:14:04,960 --> 00:14:06,600 Speaker 1: So a lot of people have said over the last 285 00:14:06,640 --> 00:14:10,080 Speaker 1: five or six years, the problem with neural networks, and 286 00:14:10,080 --> 00:14:12,080 Speaker 1: a large language model is a type of neural network, 287 00:14:12,480 --> 00:14:15,720 Speaker 1: is that they are black boxes. And the point being 288 00:14:15,840 --> 00:14:17,760 Speaker 1: that you can't look into them and see what the 289 00:14:17,800 --> 00:14:19,920 Speaker 1: process is, in the same way you can't look into 290 00:14:19,960 --> 00:14:22,600 Speaker 1: my brain at the moment, not without hurting me anyway, 291 00:14:22,880 --> 00:14:26,480 Speaker 1: and see what the process is. So you're developing methods 292 00:14:26,520 --> 00:14:29,240 Speaker 1: of peering into that black box. You're developing the instruments 293 00:14:29,240 --> 00:14:30,480 Speaker 1: and the tools to do that.
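The "automated process" that turns the dials is not named in the conversation. A common technique for this in neural-network training is gradient descent, and the sketch below assumes that technique on a toy model with just two dials, `w` and `b`, echoing the two-dial stereo analogy above. The data, step count, and learning rate are made up for illustration.

```python
# Toy data generated by y = 2x + 1; the two "dials" are w and b.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

def train(points, steps=2000, lr=0.05):
    """Fit y = w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(steps):
        # How the error changes as each dial is nudged (the gradient).
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        # Turn each dial a little in the direction that reduces the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = train(data)
print(round(w, 4), round(b, 4))
```

Nobody sets `w` or `b` by hand here; the data decides which dial moves and by how much at every step. A large language model applies the same idea to billions of parameters rather than two.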
294 00:14:30,680 --> 00:14:32,680 Speaker 2: Yes, this is an area that we've worked on 295 00:14:32,880 --> 00:14:34,840 Speaker 2: since the beginning of Anthropic. This was one of our 296 00:14:34,840 --> 00:14:37,680 Speaker 2: first teams and it's grown over time. We're looking at 297 00:14:37,680 --> 00:14:41,640 Speaker 2: methods to try and understand when a particular element of 298 00:14:41,680 --> 00:14:44,520 Speaker 2: the network, which we call a neuron in analogy 299 00:14:44,600 --> 00:14:47,960 Speaker 2: to human neurons, turns on or fires, what is 300 00:14:48,000 --> 00:14:51,240 Speaker 2: associated with it, and we've found some interesting things that 301 00:14:51,320 --> 00:14:53,880 Speaker 2: actually parallel what we've seen in the human brain. I 302 00:14:53,960 --> 00:14:56,480 Speaker 2: used to be a neuroscientist, so you can see the 303 00:14:56,520 --> 00:15:00,680 Speaker 2: network often using very human-like concepts. But we're really 304 00:15:00,760 --> 00:15:02,640 Speaker 2: just at the beginning of that. Right. We can decode 305 00:15:02,640 --> 00:15:04,760 Speaker 2: some of what the network does and understand some of 306 00:15:04,760 --> 00:15:07,160 Speaker 2: the principles behind it, but I think it's going to 307 00:15:07,200 --> 00:15:09,240 Speaker 2: be years before that science matures. 308 00:15:09,640 --> 00:15:14,000 Speaker 1: Is it important to make some breakthroughs in those particular 309 00:15:14,040 --> 00:15:17,920 Speaker 1: fields in order to deliver verifiably safe AI systems?
310 00:15:18,040 --> 00:15:20,440 Speaker 2: Yeah, I think that's going to be one important component 311 00:15:21,080 --> 00:15:23,480 Speaker 2: because of the fuzziness that we talked about before, and 312 00:15:23,560 --> 00:15:26,680 Speaker 2: if you understand something about what's going on inside the network, 313 00:15:27,000 --> 00:15:29,920 Speaker 2: why it does what it does, then you can maybe 314 00:15:29,960 --> 00:15:33,080 Speaker 2: predict what it's going to do in circumstances you've never 315 00:15:33,120 --> 00:15:33,800 Speaker 2: seen before. 316 00:15:34,040 --> 00:15:36,800 Speaker 1: There are behaviors that come out of these networks that 317 00:15:37,120 --> 00:15:40,320 Speaker 1: weren't designed in, that are emerging, and it's given almost 318 00:15:40,360 --> 00:15:43,400 Speaker 1: a sort of a mystical sense around it. What do 319 00:15:43,520 --> 00:15:46,360 Speaker 1: you understand by this idea of emergent behavior? 320 00:15:46,600 --> 00:15:49,880 Speaker 2: Yeah, so I wouldn't attach anything mystical to it, any more 321 00:15:49,920 --> 00:15:52,040 Speaker 2: than I would attach anything mystical to, you know, as 322 00:15:52,120 --> 00:15:54,960 Speaker 2: humans grow up, they start to understand the world and 323 00:15:55,000 --> 00:15:57,040 Speaker 2: they have realizations. But I think, you know, as the 324 00:15:57,040 --> 00:16:00,480 Speaker 2: model starts to see something in its training data, it learns 325 00:16:00,520 --> 00:16:03,840 Speaker 2: to concatenate that training data, to put together the puzzle 326 00:16:03,840 --> 00:16:09,440 Speaker 2: pieces in different ways. Writing semantically correct computer code, or 327 00:16:09,560 --> 00:16:12,800 Speaker 2: being able to do a particular type of math, or 328 00:16:13,080 --> 00:16:17,440 Speaker 2: understanding the concept of what's legal versus what's illegal.
Right, 329 00:16:17,560 --> 00:16:20,320 Speaker 2: all of these are things that appear at some stage. 330 00:16:20,320 --> 00:16:23,440 Speaker 2: They're not magical, they're not mystical. They're in the training data. 331 00:16:23,600 --> 00:16:25,680 Speaker 2: But the model at some point learns to put together 332 00:16:25,720 --> 00:16:27,760 Speaker 2: the pieces when it wasn't able to before. 333 00:16:28,200 --> 00:16:32,280 Speaker 1: It's such a complex set of trade-offs because if 334 00:16:32,320 --> 00:16:34,120 Speaker 1: I know the thing is wrong half the time, I 335 00:16:34,160 --> 00:16:36,800 Speaker 1: will double check every answer. But if it's only wrong 336 00:16:36,840 --> 00:16:40,520 Speaker 1: once in one hundred, I'm not going to. And I 337 00:16:40,600 --> 00:16:44,800 Speaker 1: wonder about whether you will see some almost chasm that 338 00:16:44,840 --> 00:16:48,480 Speaker 1: you'd have to leap of safety before these things really 339 00:16:48,680 --> 00:16:49,720 Speaker 1: can feel safe. 340 00:16:50,120 --> 00:16:52,440 Speaker 2: Yeah, so I think that's an important problem, and we 341 00:16:52,480 --> 00:16:55,400 Speaker 2: really want to avoid this situation where the models are 342 00:16:55,480 --> 00:16:57,840 Speaker 2: kind of, you know, we become dependent on them or 343 00:16:57,880 --> 00:17:00,640 Speaker 2: come to rely on them, while they may still be 344 00:17:00,720 --> 00:17:03,480 Speaker 2: sometimes making mistakes that we would be able to catch. 345 00:17:03,960 --> 00:17:06,040 Speaker 2: So I think one of the important things is for 346 00:17:06,160 --> 00:17:08,919 Speaker 2: models to know what they don't know. And so the
And so the 347 00:17:08,960 --> 00:17:12,760 Speaker 2: great thing, a much more usable AI system 348 00:17:12,760 --> 00:17:15,320 Speaker 2: than the one you described, is one where ninety nine 349 00:17:15,320 --> 00:17:16,919 Speaker 2: percent of the time it gets the right answer, and 350 00:17:16,920 --> 00:17:19,160 Speaker 2: that one percent of the time it says I don't 351 00:17:19,200 --> 00:17:21,920 Speaker 2: actually know. Here are some guesses. They might be wrong, 352 00:17:22,240 --> 00:17:25,159 Speaker 2: but if it's able to signal or signpost that it 353 00:17:25,240 --> 00:17:27,960 Speaker 2: might not be confident, it's a lot more useful. In fact, 354 00:17:28,280 --> 00:17:31,520 Speaker 2: I would probably prefer a system that's right ninety percent 355 00:17:31,520 --> 00:17:33,679 Speaker 2: of the time and says I don't know ten percent 356 00:17:33,680 --> 00:17:36,720 Speaker 2: of the time, than one that's right ninety nine percent 357 00:17:36,760 --> 00:17:39,159 Speaker 2: of the time and kind of silently lies to me 358 00:17:39,200 --> 00:17:41,480 Speaker 2: the other one percent. Right, this is getting back to the 359 00:17:41,520 --> 00:17:44,240 Speaker 2: honest thing, right? Like, it's okay not to know sometimes, 360 00:17:44,600 --> 00:17:46,080 Speaker 2: but I don't want you to make things up. 361 00:17:52,400 --> 00:17:54,640 Speaker 1: These are pretty powerful technologies, and I'll put my cards 362 00:17:54,680 --> 00:17:56,520 Speaker 1: on the table. I think they will be the most 363 00:17:56,560 --> 00:18:00,040 Speaker 1: powerful technologies we'll see in our lifetimes. How do we 364 00:18:00,080 --> 00:18:05,240 Speaker 1: get them into society more broadly in ways that are very, 365 00:18:05,359 --> 00:18:06,400 Speaker 1: very beneficial?
366 00:18:06,760 --> 00:18:08,920 Speaker 2: So I think there's kind of two sides to that, right, 367 00:18:08,960 --> 00:18:13,159 Speaker 2: there's preventing the harms and achieving the benefits. So I 368 00:18:13,160 --> 00:18:16,800 Speaker 2: think on the preventing the harm side, I mean, this helpful, honest, 369 00:18:16,840 --> 00:18:20,600 Speaker 2: harmless, and looking inside the model: these are both important areas. 370 00:18:20,960 --> 00:18:23,800 Speaker 2: There's another area I haven't talked about yet, which is 371 00:18:23,920 --> 00:18:27,920 Speaker 2: ensuring that models stay under effective human control, that we're 372 00:18:27,920 --> 00:18:30,680 Speaker 2: able to supervise them even as they get smarter than 373 00:18:30,680 --> 00:18:33,399 Speaker 2: we are. You know, when the models start to know 374 00:18:33,560 --> 00:18:36,040 Speaker 2: much more than humans do, how do we make sure 375 00:18:36,080 --> 00:18:38,600 Speaker 2: that humans are able to check and verify their work 376 00:18:38,880 --> 00:18:41,440 Speaker 2: and that they don't lie to us in ways that 377 00:18:41,520 --> 00:18:42,440 Speaker 2: we can't detect? 378 00:18:42,720 --> 00:18:46,640 Speaker 1: In a world where AI systems are prevalent, and many 379 00:18:46,680 --> 00:18:49,800 Speaker 1: of these systems perhaps are built by Anthropic and therefore 380 00:18:49,840 --> 00:18:55,760 Speaker 1: they're guided by your constitution in your constitutional AI, are 381 00:18:55,840 --> 00:18:59,280 Speaker 1: you the right person to set the rules for that constitution? 382 00:18:59,359 --> 00:19:02,399 Speaker 1: Because the US has a constitution, Germany has a constitution, 383 00:19:02,960 --> 00:19:06,120 Speaker 1: but that constitution was built by a sense of consensus, 384 00:19:06,160 --> 00:19:09,240 Speaker 1: a sense of accountability and legitimacy. You seem like a 385 00:19:09,280 --> 00:19:13,560 Speaker 1: really trustworthy guy.
But is it fair for that power 386 00:19:13,640 --> 00:19:14,280 Speaker 1: to reside with you? 387 00:19:14,560 --> 00:19:17,439 Speaker 2: I think actually mostly not. So I think the way 388 00:19:17,560 --> 00:19:20,879 Speaker 2: we envision it is there may be a base model 389 00:19:21,359 --> 00:19:24,200 Speaker 2: that has a very basic constitution, right, and we talk 390 00:19:24,240 --> 00:19:26,919 Speaker 2: about things like the UN Charter of Human Rights, but 391 00:19:27,200 --> 00:19:31,280 Speaker 2: we're actually developing a process to allow different use cases 392 00:19:31,359 --> 00:19:34,760 Speaker 2: or different customers to write almost an addendum or to 393 00:19:34,840 --> 00:19:37,840 Speaker 2: extend the constitution on top of the basic things. So 394 00:19:37,840 --> 00:19:40,760 Speaker 2: the idea would be all versions of Claude have these 395 00:19:40,920 --> 00:19:43,240 Speaker 2: very basic rules, right? They're not going to commit things 396 00:19:43,280 --> 00:19:45,960 Speaker 2: that or help with things that almost all of human 397 00:19:46,040 --> 00:19:50,080 Speaker 2: society agrees is bad. But then let's say I wanted 398 00:19:50,080 --> 00:19:54,440 Speaker 2: to make an agent that helped with something medical versus 399 00:19:54,480 --> 00:19:57,320 Speaker 2: an agent that served as your lawyer, versus a customer 400 00:19:57,359 --> 00:20:02,040 Speaker 2: service agent versus a therapist. Rules for that are very different. Basically, 401 00:20:02,080 --> 00:20:04,760 Speaker 2: my answer is that for ninety percent of things, it's 402 00:20:04,760 --> 00:20:06,800 Speaker 2: not up to us to decide. It's only the ten 403 00:20:06,840 --> 00:20:09,679 Speaker 2: percent of things where we think most people would agree 404 00:20:09,680 --> 00:20:11,680 Speaker 2: and where we defer as much as we can to 405 00:20:12,160 --> 00:20:13,720 Speaker 2: societal processes.
406 00:20:13,359 --> 00:20:16,680 Speaker 1: And there are so many great processes. We know, for example, 407 00:20:16,720 --> 00:20:19,719 Speaker 1: that cars are safer in twenty twenty three than they 408 00:20:19,720 --> 00:20:22,359 Speaker 1: were in the nineteen sixties because of rules around seat 409 00:20:22,359 --> 00:20:25,320 Speaker 1: belts and braking systems and crash testing. We know that 410 00:20:25,400 --> 00:20:29,040 Speaker 1: when radium was first discovered by Marie Curie, anyone could 411 00:20:29,119 --> 00:20:32,120 Speaker 1: make a medical product with radium, radium cough sweets for babies. 412 00:20:32,359 --> 00:20:36,960 Speaker 1: So what's the process that we should use for AI systems? 413 00:20:36,960 --> 00:20:41,000 Speaker 1: Should it look like drug approvals or should it look 414 00:20:41,119 --> 00:20:43,720 Speaker 1: like perhaps a much lighter weight system of the type 415 00:20:43,760 --> 00:20:44,920 Speaker 1: we have in the auto industry? 416 00:20:45,200 --> 00:20:48,520 Speaker 2: I think maybe of like cars and airplanes or something 417 00:20:48,600 --> 00:20:52,359 Speaker 2: like that as good examples of kind of powerful technologies 418 00:20:52,400 --> 00:20:55,520 Speaker 2: that are safety critical, where lives are on the line. So the 419 00:20:55,600 --> 00:20:58,520 Speaker 2: kind of early Wild West of all these technologies, I 420 00:20:58,520 --> 00:21:00,840 Speaker 2: think we're in that period today, and we need to 421 00:21:00,880 --> 00:21:02,240 Speaker 2: move as quickly as possible. 422 00:21:03,280 --> 00:21:05,280 Speaker 3: Moved through that period, right, We've moved. 423 00:21:05,040 --> 00:21:09,640 Speaker 2: Through that period rather quickly, rather soon. Where rules of. 424 00:21:09,560 --> 00:21:10,960 Speaker 3: the road, why say quickly?
425 00:21:11,040 --> 00:21:15,399 Speaker 2: I think it's the exponential. With another technology, I might say, look, 426 00:21:15,440 --> 00:21:18,239 Speaker 2: we don't understand the costs and benefits that well, like 427 00:21:18,520 --> 00:21:20,119 Speaker 2: we need to have these things play out in the 428 00:21:20,160 --> 00:21:22,840 Speaker 2: market a little bit before we start to step in 429 00:21:22,960 --> 00:21:25,760 Speaker 2: and set regulation that might be too rigid. But that's 430 00:21:25,760 --> 00:21:28,600 Speaker 2: not my view for AI. Because it's moving so fast, 431 00:21:28,920 --> 00:21:32,199 Speaker 2: because the implications are happening so fast, I suspect that 432 00:21:32,240 --> 00:21:35,120 Speaker 2: this is a case where we're going to need very 433 00:21:35,160 --> 00:21:37,440 Speaker 2: soon some kinds of rules of the road. 434 00:21:37,800 --> 00:21:39,960 Speaker 1: Is it that the systems are getting faster? Is it that they 435 00:21:40,080 --> 00:21:44,080 Speaker 1: are getting measurably more powerful? Is it being used more 436 00:21:44,200 --> 00:21:46,440 Speaker 1: frequently in business? What is this exponential that you're... 437 00:21:46,600 --> 00:21:51,679 Speaker 2: Yes, yes to all. So the exponential is basically 438 00:21:52,160 --> 00:21:55,800 Speaker 2: the amount of computation: number of chips, times the time 439 00:21:55,880 --> 00:21:58,560 Speaker 2: we run them for, times the speed of the chips, 440 00:21:59,359 --> 00:22:01,840 Speaker 2: and each of those factors is getting faster. But 441 00:22:02,480 --> 00:22:05,320 Speaker 2: used to be five or ten years ago, the amount 442 00:22:05,400 --> 00:22:07,679 Speaker 2: of money that you would put into training one of 443 00:22:07,680 --> 00:22:10,520 Speaker 2: these AI systems was the size of an academic research grant, 444 00:22:10,840 --> 00:22:13,600 Speaker 2: So one hundred thousand to a million dollars.
We're now 445 00:22:13,640 --> 00:22:16,320 Speaker 2: in an era where I would say companies spend ten 446 00:22:16,560 --> 00:22:18,840 Speaker 2: to one hundred million dollars, But I think we're going 447 00:22:18,920 --> 00:22:21,840 Speaker 2: to enter an era because the economic value is so 448 00:22:21,960 --> 00:22:23,959 Speaker 2: great where it's going to be, you know, a billion 449 00:22:24,000 --> 00:22:26,120 Speaker 2: dollars or ten billion dollars, and. 450 00:22:26,080 --> 00:22:30,919 Speaker 1: We should convert that spend into the amount of processing 451 00:22:30,960 --> 00:22:34,320 Speaker 1: that these big AI supercomputers are doing, exactly, and they're 452 00:22:34,359 --> 00:22:38,880 Speaker 1: doing that processing to produce systems that are even more 453 00:22:38,920 --> 00:22:40,080 Speaker 1: powerful exactly. 454 00:22:40,080 --> 00:22:42,040 Speaker 2: And at the same time as that's happening, the chips 455 00:22:42,040 --> 00:22:44,760 Speaker 2: are getting faster, and more money is going also into 456 00:22:44,760 --> 00:22:48,960 Speaker 2: making the chips faster because there are so many useful things 457 00:22:49,000 --> 00:22:51,959 Speaker 2: that the models can do. And then of course engineers 458 00:22:51,960 --> 00:22:54,679 Speaker 2: are working on how to squeeze every possible drop of 459 00:22:54,680 --> 00:22:57,359 Speaker 2: efficiency out of the compute that we have once we 460 00:22:57,480 --> 00:22:57,920 Speaker 2: spend it. 461 00:22:58,000 --> 00:23:01,040 Speaker 1: And companies are desperate. I've spoken to the bosses of 462 00:23:01,560 --> 00:23:05,520 Speaker 1: many very large firms and it's really high up on 463 00:23:05,600 --> 00:23:08,240 Speaker 1: their agenda to figure out how they use these technologies 464 00:23:08,720 --> 00:23:09,560 Speaker 1: in their businesses. 465 00:23:09,640 --> 00:23:11,199 Speaker 3: And walking around San Francisco the.
466 00:23:11,240 --> 00:23:14,639 Speaker 1: last few days, I can feel the palpable buzz of 467 00:23:14,680 --> 00:23:17,560 Speaker 1: people just wanting to build on AI the way they 468 00:23:17,600 --> 00:23:19,640 Speaker 1: wanted to build on the iPhone fifteen years ago. 469 00:23:19,920 --> 00:23:22,720 Speaker 2: I think, on one hand, that's really exciting, and we 470 00:23:22,800 --> 00:23:25,240 Speaker 2: benefit from it and others benefit from it, and I 471 00:23:25,240 --> 00:23:28,359 Speaker 2: don't want to do anything to slow down the excitement 472 00:23:28,480 --> 00:23:31,439 Speaker 2: or the positive benefits. But everyone understands that you need 473 00:23:31,480 --> 00:23:33,000 Speaker 2: to make these things safe, and that there is no 474 00:23:33,160 --> 00:23:34,720 Speaker 2: industry if you don't make these things safe. 475 00:23:34,760 --> 00:23:38,080 Speaker 1: Absolutely, we're building these AI systems using large language models 476 00:23:38,080 --> 00:23:41,520 Speaker 1: that are on an exponential, but exponentials are really, they're 477 00:23:41,560 --> 00:23:43,520 Speaker 1: really S-curves, so they go up and then they 478 00:23:43,600 --> 00:23:47,399 Speaker 1: tail off. How long does this exponential run for before 479 00:23:47,440 --> 00:23:49,200 Speaker 1: it tails off? In other words, is this the last 480 00:23:49,240 --> 00:23:51,240 Speaker 1: set of innovations that we're going to need for AI? 481 00:23:51,600 --> 00:23:53,760 Speaker 2: I would say we have at least a few years 482 00:23:53,800 --> 00:23:56,679 Speaker 2: of the current exponential, and then people have ways of 483 00:23:56,720 --> 00:23:59,720 Speaker 2: coming up with new innovations that continue things after that.
484 00:24:00,240 --> 00:24:02,720 Speaker 2: I think a few years from now we may get 485 00:24:02,760 --> 00:24:07,119 Speaker 2: to the point where AI systems can perform these feats 486 00:24:07,119 --> 00:24:09,439 Speaker 2: that humans aren't capable of. And we've seen with the 487 00:24:09,560 --> 00:24:12,639 Speaker 2: AI systems they're already broader than humans. So if we 488 00:24:12,640 --> 00:24:14,880 Speaker 2: could get them to the point where they're broader and 489 00:24:15,040 --> 00:24:18,280 Speaker 2: they're more creative than we are, or as creative and 490 00:24:18,400 --> 00:24:21,720 Speaker 2: able to see all the connections, I really have this 491 00:24:21,920 --> 00:24:27,280 Speaker 2: hope that human scientists assisted by AI could make progress 492 00:24:27,320 --> 00:24:30,400 Speaker 2: on these complex diseases as fast as we've made progress 493 00:24:30,440 --> 00:24:32,760 Speaker 2: on the simple diseases. And my hope is, if we 494 00:24:32,840 --> 00:24:34,639 Speaker 2: really get this right, could we actually get to the 495 00:24:34,640 --> 00:24:37,760 Speaker 2: point where this particular cancer is just not a problem anymore. 496 00:24:37,800 --> 00:24:42,440 Speaker 1: And of course, beyond the medical applications, into climate change, 497 00:24:42,560 --> 00:24:45,840 Speaker 1: into poverty elimination, into all sorts of problems that we 498 00:24:45,920 --> 00:24:47,000 Speaker 1: as humans are found. 499 00:24:46,800 --> 00:24:50,280 Speaker 2: Problems of complexity beyond human scope. 500 00:24:50,520 --> 00:24:52,919 Speaker 1: Right, So let's look forward a little bit. The premise 501 00:24:52,960 --> 00:24:55,600 Speaker 1: of our discussion is that in five years we could 502 00:24:55,680 --> 00:25:00,680 Speaker 1: all be using good, trustworthy AI systems just as part 503 00:25:00,720 --> 00:25:03,600 Speaker 1: of normal life. Do you think that could become reality? 
504 00:25:03,800 --> 00:25:05,800 Speaker 2: Yeah, I think that could. So, you know, as soon 505 00:25:05,880 --> 00:25:08,119 Speaker 2: as we get right all the kind of rules of the road, 506 00:25:08,720 --> 00:25:12,000 Speaker 2: safety, helpful, honest, harmless. If we solve all those 507 00:25:12,040 --> 00:25:15,080 Speaker 2: problems which we've talked about a fair amount, I do 508 00:25:15,119 --> 00:25:18,240 Speaker 2: think that everyone could have an AI assistant that they 509 00:25:18,320 --> 00:25:21,040 Speaker 2: really trust, and your whole way of interacting with the 510 00:25:21,080 --> 00:25:24,080 Speaker 2: world could be done through this AI assistant. It can 511 00:25:24,119 --> 00:25:26,520 Speaker 2: help you make better decisions and say, hey, like, you know, 512 00:25:26,600 --> 00:25:28,879 Speaker 2: I think you'd be happier if you did X instead 513 00:25:28,920 --> 00:25:31,000 Speaker 2: of Y, tailored to the way you want it to 514 00:25:31,040 --> 00:25:33,200 Speaker 2: be, that helps you to be the best version of yourself. 515 00:25:33,359 --> 00:25:37,320 Speaker 1: Well, maybe in five years time, my AI assistant can 516 00:25:37,359 --> 00:25:40,960 Speaker 1: meet your AI assistant right here and we can see 517 00:25:40,960 --> 00:25:41,800 Speaker 1: how well the two of us did. 518 00:25:41,800 --> 00:25:44,520 Speaker 2: Same place, the same time. Let's see if we 519 00:25:44,560 --> 00:25:45,560 Speaker 2: can fulfill that bet. 520 00:25:51,920 --> 00:25:54,960 Speaker 1: Reflecting on my conversation with Dario, I'm struck by how 521 00:25:54,960 --> 00:25:57,879 Speaker 1: he acknowledges that the pace of change is so quick, 522 00:25:57,960 --> 00:26:01,639 Speaker 1: it's exponential, and he's really attentive to the problem of harm. 523 00:26:01,800 --> 00:26:05,399 Speaker 1: He's very thoughtful about it. That made me much more comfortable.
524 00:26:05,480 --> 00:26:07,600 Speaker 1: But it's also clear that the way we define what 525 00:26:07,680 --> 00:26:10,720 Speaker 1: we want from these systems cannot be left to AI developers. 526 00:26:11,000 --> 00:26:13,840 Speaker 1: It really needs to be led by ordinary citizens and 527 00:26:13,920 --> 00:26:21,359 Speaker 1: by their legitimate governments. Thanks for listening to the Exponentially podcast. 528 00:26:21,560 --> 00:26:24,560 Speaker 1: If you enjoy the show, please leave a review or rating. 529 00:26:24,720 --> 00:26:28,440 Speaker 1: It really does help others find us. The Exponentially podcast 530 00:26:28,520 --> 00:26:31,879 Speaker 1: is presented by me, Azeem Azhar. The sound designer is 531 00:26:31,920 --> 00:26:34,760 Speaker 1: Will Horricks. The research was led by Chloe Ippah and 532 00:26:34,840 --> 00:26:38,680 Speaker 1: music composed by Emily Green and John Zarcone. The show 533 00:26:38,760 --> 00:26:42,600 Speaker 1: is produced by Frederick Cassella, Maria Garrilov and me, Azeem Azhar. 534 00:26:43,000 --> 00:26:46,399 Speaker 1: Special thanks to Sage Bauman, Jeff Grocott, and Magnus Henrikson. 535 00:26:46,720 --> 00:26:50,920 Speaker 1: The executive producers are Andrew Barden, Adam Kamiski, and Kyle Kramer. 536 00:26:51,200 --> 00:26:54,840 Speaker 1: David Ravella is the managing editor. Exponentially was created by 537 00:26:54,920 --> 00:26:57,400 Speaker 1: Frederick Cassella and is an E to the Pi i Plus One 538 00:26:57,440 --> 00:27:01,000 Speaker 1: Limited production in association with Bloomberg LLC