1 00:00:03,560 --> 00:00:06,840 Speaker 1: Within five years, open source AI will have raised the 2 00:00:06,920 --> 00:00:11,040 Speaker 1: GDP in the world's poorest countries. That is a premise 3 00:00:11,080 --> 00:00:16,720 Speaker 1: for today's conversation. I'm Azeem Azhar. Welcome to the Exponentially podcast. 4 00:00:19,120 --> 00:00:22,960 Speaker 1: Recent reports from the banks and consultancies have suggested that 5 00:00:23,320 --> 00:00:27,520 Speaker 1: new breakthroughs in AI, particularly generative AI, could add trillions 6 00:00:27,560 --> 00:00:31,040 Speaker 1: of dollars to global GDP. But the most advanced of 7 00:00:31,080 --> 00:00:34,640 Speaker 1: these models are built in the West, on expensive supercomputers 8 00:00:34,920 --> 00:00:38,959 Speaker 1: and trained using English language data sets. So how can 9 00:00:39,000 --> 00:00:42,760 Speaker 1: the Global South, with their young populations, tighter finances and 10 00:00:42,840 --> 00:00:48,600 Speaker 1: shakier infrastructure, share in this productivity boom? Today's guest 11 00:00:48,680 --> 00:00:52,680 Speaker 1: is Emad Mostaque, a Jordanian-born Bangladeshi immigrant to 12 00:00:52,760 --> 00:00:56,680 Speaker 1: Britain who's a founder and CEO of Stability AI. It's 13 00:00:56,720 --> 00:00:59,400 Speaker 1: a firm that has accelerated to a billion dollar valuation 14 00:00:59,560 --> 00:01:03,240 Speaker 1: in less than three years. Emad has been vocal about 15 00:01:03,280 --> 00:01:06,520 Speaker 1: the potential that open source AI offers to the poorest 16 00:01:06,520 --> 00:01:09,720 Speaker 1: in the world, but there have been serious criticisms leveled 17 00:01:09,720 --> 00:01:13,560 Speaker 1: against him, most prominently in a recent story by Forbes magazine. 
18 00:01:13,680 --> 00:01:17,200 Speaker 1: These include questions about taking credit for the company's technology, 19 00:01:17,400 --> 00:01:19,520 Speaker 1: about some of the partnerships the firm is meant to 20 00:01:19,560 --> 00:01:24,280 Speaker 1: have and also governance practices at Stability AI, where Mostaque 21 00:01:24,400 --> 00:01:29,680 Speaker 1: remains CEO. Emad has publicly rejected these as gross mischaracterizations, 22 00:01:29,880 --> 00:01:32,520 Speaker 1: and a debate over the story and others critical of 23 00:01:32,520 --> 00:01:36,400 Speaker 1: Stability AI continues to play out on social media and elsewhere. 24 00:01:37,319 --> 00:01:39,640 Speaker 1: But as long as investors who have poured hundreds of 25 00:01:39,640 --> 00:01:42,880 Speaker 1: millions of dollars into Stability AI continue to stick by 26 00:01:42,920 --> 00:01:45,360 Speaker 1: Emad, and as long as the firm continues to 27 00:01:45,480 --> 00:01:48,840 Speaker 1: innovate and bring out new AI models, he will remain 28 00:01:49,160 --> 00:01:58,640 Speaker 1: a powerful force in the expanding AI universe. What is 29 00:01:58,880 --> 00:02:02,640 Speaker 1: generative AI and how is it different from all of 30 00:02:02,640 --> 00:02:04,800 Speaker 1: the AI systems that we saw previously? 31 00:02:05,240 --> 00:02:07,640 Speaker 2: Generative AI is a new type of AI that started 32 00:02:07,680 --> 00:02:10,560 Speaker 2: in twenty seventeen. There was a seminal paper called Attention 33 00:02:10,720 --> 00:02:13,320 Speaker 2: Is All You Need, because not all data is the same. 34 00:02:13,360 --> 00:02:16,320 Speaker 2: You pay attention to what's important. So classical AI was 35 00:02:16,320 --> 00:02:18,400 Speaker 2: built on this concept of big data. 
So you had 36 00:02:18,440 --> 00:02:21,399 Speaker 2: all this data of Facebook and Google and they used 37 00:02:21,440 --> 00:02:24,200 Speaker 2: it to sell you coconut shampoo, but it couldn't go 38 00:02:24,280 --> 00:02:27,320 Speaker 2: outside its boundaries. So it's like a very logical, the 39 00:02:27,360 --> 00:02:30,400 Speaker 2: future is like the past, stable kind of environment. This 40 00:02:30,480 --> 00:02:32,679 Speaker 2: new type of AI said pay attention to the important 41 00:02:32,720 --> 00:02:35,600 Speaker 2: parts of data to compress it. So as people listening 42 00:02:35,639 --> 00:02:37,760 Speaker 2: to this will see, they might take away a few points, 43 00:02:37,760 --> 00:02:40,640 Speaker 2: they're not going to remember our entire conversation. That's what 44 00:02:40,720 --> 00:02:42,880 Speaker 2: human mind does. You've got the very logical part that 45 00:02:42,960 --> 00:02:46,880 Speaker 2: can memorize stuff, and you've got the part that builds principles, stories, 46 00:02:47,360 --> 00:02:48,720 Speaker 2: frameworks for understanding. 47 00:02:49,280 --> 00:02:51,920 Speaker 1: And this paper, Attention Is All You Need, in a 48 00:02:51,960 --> 00:02:55,679 Speaker 1: sense analogizes that for a computer system. Exactly. 49 00:02:56,000 --> 00:02:57,560 Speaker 2: It was the first one that said, this is how 50 00:02:57,600 --> 00:03:01,160 Speaker 2: you show it at scale, and let's simplify it down 51 00:03:01,240 --> 00:03:05,440 Speaker 2: to a problem of better data and bigger computers. So 52 00:03:05,560 --> 00:03:08,960 Speaker 2: using gigantic supercomputers, you can take these big data sets 53 00:03:08,960 --> 00:03:11,440 Speaker 2: of text, images and others and compress it down 54 00:03:11,480 --> 00:03:15,639 Speaker 2: to just a few gigabytes of file that learns principles, 55 00:03:15,680 --> 00:03:16,359 Speaker 2: not facts. 56 00:03:16,880 --> 00:03:18,120 Speaker 3: And so this was the missing 
57 00:03:17,880 --> 00:03:21,079 Speaker 2: piece in AI, and that's why using these systems feels 58 00:03:21,080 --> 00:03:23,360 Speaker 2: actually quite surprisingly human. 59 00:03:23,200 --> 00:03:26,000 Speaker 1: Surprisingly human. But how do we get to the generative 60 00:03:26,000 --> 00:03:26,720 Speaker 1: part of all of that? 61 00:03:26,840 --> 00:03:30,080 Speaker 2: The generative part is that you put a prompt in 62 00:03:30,240 --> 00:03:32,400 Speaker 2: or some words in, and then it gives you something back. 63 00:03:32,440 --> 00:03:36,040 Speaker 2: It generates the outputs and the outputs are not always 64 00:03:36,080 --> 00:03:38,800 Speaker 2: even the same because it has principles as a base. 65 00:03:39,000 --> 00:03:41,840 Speaker 1: So in the same sense that if you and I 66 00:03:41,920 --> 00:03:46,200 Speaker 1: meet one day, the way the conversation plays out could 67 00:03:46,200 --> 00:03:49,119 Speaker 1: be quite different to the next time we meet, because 68 00:03:49,160 --> 00:03:53,120 Speaker 1: we have principles of socialization and of behavior and of 69 00:03:53,160 --> 00:03:56,080 Speaker 1: how well we know each other, and those get applied 70 00:03:57,080 --> 00:03:59,400 Speaker 1: in real time at that moment each time we shake hands. 71 00:03:59,480 --> 00:04:02,400 Speaker 2: In real time, it's a file with just a bunch 72 00:04:02,440 --> 00:04:04,240 Speaker 2: of, like, it's called neural net 73 00:04:04,240 --> 00:04:04,600 Speaker 3: weights. 74 00:04:05,000 --> 00:04:07,920 Speaker 2: Words go in and get shaken out like a pinball 75 00:04:07,920 --> 00:04:10,400 Speaker 2: slot machine, and then the output comes. But the output, 76 00:04:10,440 --> 00:04:13,160 Speaker 2: input, can be a painting of a cup in the 77 00:04:13,200 --> 00:04:16,240 Speaker 2: style of Vermeer, and then it understands the nature of 78 00:04:16,279 --> 00:04:19,760 Speaker 2: painting, cup, Vermeer. 
But cup has so many different meanings, 79 00:04:19,839 --> 00:04:21,640 Speaker 2: can mean this cup or cup your hands, or cup 80 00:04:21,680 --> 00:04:24,320 Speaker 2: your ears, or a world cup, and it understands those 81 00:04:24,320 --> 00:04:29,839 Speaker 2: things in place because it's been trained on images and text. Similarly, 82 00:04:29,920 --> 00:04:32,640 Speaker 2: a lot of these language models, they've been trained on sentences, 83 00:04:33,000 --> 00:04:35,160 Speaker 2: so they look at the context of the sentence and 84 00:04:35,200 --> 00:04:38,520 Speaker 2: they say, what's coming next, you know, like that game 85 00:04:38,560 --> 00:04:41,200 Speaker 2: of improvisation where you start with the sentence and then 86 00:04:41,240 --> 00:04:45,160 Speaker 2: you provide something in the next one, and on we go. 87 00:04:45,240 --> 00:04:48,159 Speaker 1: And it's moving so, so quickly. So last summer I 88 00:04:48,400 --> 00:04:52,480 Speaker 1: was away in Tanzania on a safari and there's really 89 00:04:52,480 --> 00:04:55,120 Speaker 1: no mobile signal. We were away for about two and 90 00:04:55,120 --> 00:04:58,200 Speaker 1: a half weeks and when I came back, the Internet 91 00:04:58,480 --> 00:05:03,440 Speaker 1: was full of Stable Diffusion, Stable Diffusion, Stable Diffusion, Stable Diffusion. 92 00:05:03,680 --> 00:05:06,680 Speaker 1: And that is one of your generative products. What 93 00:05:06,920 --> 00:05:10,839 Speaker 1: it does, it takes text and it produces images. So 94 00:05:10,880 --> 00:05:16,240 Speaker 1: I can say I want an image of a badger 95 00:05:16,560 --> 00:05:21,480 Speaker 1: playing football on a bicycle, and Stable Diffusion will 96 00:05:21,480 --> 00:05:24,039 Speaker 1: produce that image. It could also produce more useful, commercially 97 00:05:24,160 --> 00:05:29,400 Speaker 1: interesting images. 
Recently, you've brought out a new generative product 98 00:05:29,480 --> 00:05:34,839 Speaker 1: which is StableLM, which looks a lot like 99 00:05:34,839 --> 00:05:37,520 Speaker 1: these text models that we've seen running around today. 100 00:05:37,520 --> 00:05:40,200 Speaker 2: StableLM is our language model suite, and what we do 101 00:05:40,279 --> 00:05:42,240 Speaker 2: is a bit different from a lot of the other companies, 102 00:05:42,279 --> 00:05:44,880 Speaker 2: in that a lot of the focus and the breakthroughs 103 00:05:44,880 --> 00:05:46,880 Speaker 2: were because a lot of these research labs, the OpenAIs, 104 00:05:47,000 --> 00:05:49,520 Speaker 2: Anthropics of the world, DeepMinds, have this 105 00:05:49,600 --> 00:05:52,720 Speaker 2: focus on AGI, artificial general intelligence. 106 00:05:52,720 --> 00:05:54,160 Speaker 3: Can you build an AI that can do anything? 107 00:05:54,360 --> 00:05:54,600 Speaker 1: Yes? 108 00:05:54,760 --> 00:05:56,720 Speaker 3: It turns out maybe you can. So. 109 00:05:57,000 --> 00:05:59,120 Speaker 1: Making a machine that's in our own image in a sense, 110 00:05:59,120 --> 00:06:00,560 Speaker 1: that's what AGI sounds like. 111 00:06:00,680 --> 00:06:04,200 Speaker 2: More than that, it's a general intelligence. So it's the 112 00:06:04,880 --> 00:06:06,599 Speaker 2: kid that made you look bad at school because he 113 00:06:06,680 --> 00:06:09,800 Speaker 2: was good at everything. You know, the top performer like 114 00:06:10,200 --> 00:06:13,040 Speaker 2: GPT-4 now can pass the bar exam, the medical 115 00:06:13,080 --> 00:06:16,279 Speaker 2: licensing exam, the GRE. It's probably going to Stanford, you know, 116 00:06:16,520 --> 00:06:21,119 Speaker 2: next fall. But then our take was that's great, 117 00:06:21,279 --> 00:06:24,800 Speaker 2: you can have these amazing giant general models. 
What will 118 00:06:24,880 --> 00:06:27,880 Speaker 2: be better is to mimic humanity where our companies are 119 00:06:27,880 --> 00:06:30,960 Speaker 2: not one generalist doing everything. What if we made it 120 00:06:31,000 --> 00:06:33,599 Speaker 2: so you could bring your own data to the models 121 00:06:33,640 --> 00:06:36,360 Speaker 2: and have lots of specialists working together and have those 122 00:06:36,360 --> 00:06:37,719 Speaker 2: models working for you that you own. 123 00:06:37,960 --> 00:06:38,720 Speaker 3: Right? What if you 124 00:06:38,680 --> 00:06:41,840 Speaker 2: had open data, open models and allowed it to be 125 00:06:41,920 --> 00:06:44,880 Speaker 2: customized and specialized, so rather than relying on one model 126 00:06:44,880 --> 00:06:48,240 Speaker 2: to do everything, instead you optimize them. 127 00:06:48,600 --> 00:06:48,800 Speaker 3: Right. 128 00:06:49,000 --> 00:06:50,960 Speaker 1: So, but the idea is still the same. The idea 129 00:06:51,040 --> 00:06:54,919 Speaker 1: is still constructing a model that is generative. I can 130 00:06:55,000 --> 00:06:57,320 Speaker 1: give it some text and it can produce something that 131 00:06:57,440 --> 00:07:02,200 Speaker 1: is commercially useful or emotionally useful. So working software 132 00:07:02,240 --> 00:07:06,359 Speaker 1: code or a working invitation to a meeting, or you know, 133 00:07:06,440 --> 00:07:08,960 Speaker 1: things that will save us time and help us perhaps 134 00:07:09,000 --> 00:07:10,360 Speaker 1: be a bit more creative. 135 00:07:10,480 --> 00:07:12,640 Speaker 2: So again, like a talented graduate, and they can learn 136 00:07:12,760 --> 00:07:14,960 Speaker 2: very quickly from a few examples. So they have this 137 00:07:15,040 --> 00:07:18,000 Speaker 2: base of generalized knowledge. They've been through kindergarten, high school, 138 00:07:18,040 --> 00:07:21,720 Speaker 2: and university, but they're not specialized yet. 
You can train 139 00:07:21,800 --> 00:07:23,680 Speaker 2: them yourself, or you can just show them some examples 140 00:07:23,720 --> 00:07:27,000 Speaker 2: and they learn very quickly. Unlike classical AI models that 141 00:07:27,040 --> 00:07:29,040 Speaker 2: you had to train on the whole data set. 142 00:07:29,240 --> 00:07:33,040 Speaker 3: They weren't good at adaptation, like the badger riding 143 00:07:33,080 --> 00:07:33,440 Speaker 3: a bike. 144 00:07:33,920 --> 00:07:36,920 Speaker 2: That's not really a normal thing, you know. And so 145 00:07:37,120 --> 00:07:39,920 Speaker 2: that has been the province of, well, just us, right? 146 00:07:39,960 --> 00:07:41,880 Speaker 2: We were able to take these concepts together, whereas a 147 00:07:41,880 --> 00:07:45,600 Speaker 2: computer could never merge together concepts. Now we've got that 148 00:07:45,720 --> 00:07:49,360 Speaker 2: missing link of being able to take concepts, merge them together, 149 00:07:49,880 --> 00:07:58,480 Speaker 2: understand some of these hidden meanings. 150 00:07:59,480 --> 00:08:03,440 Speaker 1: I'm curious about what you think the economic impact of 151 00:08:03,480 --> 00:08:05,440 Speaker 1: all of this will be. I mean, there have been 152 00:08:05,800 --> 00:08:09,520 Speaker 1: any number of papers coming out from the investment banks 153 00:08:09,520 --> 00:08:12,640 Speaker 1: and the research houses and economists in the last few months. 154 00:08:12,680 --> 00:08:16,360 Speaker 1: I think Goldman Sachs had a report that suggested 155 00:08:16,360 --> 00:08:20,240 Speaker 1: that with one of their scenarios, they could see fast 156 00:08:20,280 --> 00:08:23,520 Speaker 1: implementation of generative AI across the world, leading to a 157 00:08:23,560 --> 00:08:27,600 Speaker 1: seven percent increase in global GDP in about ten years. 158 00:08:28,240 --> 00:08:30,240 Speaker 1: What's your sense of what it could do economically? 
159 00:08:30,320 --> 00:08:33,280 Speaker 2: I think it's the biggest thing since the Gutenberg Press, 160 00:08:33,320 --> 00:08:34,439 Speaker 2: maybe even fire. 161 00:08:34,559 --> 00:08:38,200 Speaker 1: The Gutenberg Press six hundred years ago, or fire two 162 00:08:38,280 --> 00:08:39,040 Speaker 1: million years ago. 163 00:08:39,120 --> 00:08:42,400 Speaker 2: I think that humans are driven by stories. It's what 164 00:08:42,480 --> 00:08:45,840 Speaker 2: allowed us to form tribes, and then money and things 165 00:08:45,880 --> 00:08:48,040 Speaker 2: like that are stories. The press allowed us to write 166 00:08:48,040 --> 00:08:50,760 Speaker 2: down the stories. But it's very lossy. Again, you, me, 167 00:08:50,960 --> 00:08:54,120 Speaker 2: everyone listening, like you're looking through your things right now. 168 00:08:54,600 --> 00:08:57,319 Speaker 2: We do PowerPoints, we write things, but it doesn't 169 00:08:57,360 --> 00:09:02,320 Speaker 2: capture the richness of humanity. Our organizations are built on 170 00:09:02,440 --> 00:09:05,440 Speaker 2: layers of text, which is painful, and that's why it 171 00:09:05,520 --> 00:09:08,200 Speaker 2: turns us into cogs in the machine, shall we say. 172 00:09:08,240 --> 00:09:10,200 Speaker 1: I mean there's something that's quite powerful about the models. 173 00:09:10,240 --> 00:09:11,960 Speaker 1: I think that you are getting to here, which is 174 00:09:11,960 --> 00:09:16,360 Speaker 1: that you can feed them text and through that the 175 00:09:16,400 --> 00:09:21,760 Speaker 1: machination of the billions of different switches and cogs, you 176 00:09:21,800 --> 00:09:25,319 Speaker 1: guys call them parameters, yes, in the system, it starts 177 00:09:25,360 --> 00:09:31,800 Speaker 1: to find those underlying relationships that we know, probably deeply 178 00:09:31,800 --> 00:09:34,280 Speaker 1: in our brains, but don't express. 
What we do is 179 00:09:34,320 --> 00:09:37,080 Speaker 1: we express words, one word at a time, and they 180 00:09:37,080 --> 00:09:38,920 Speaker 1: look at all these words and they're able to find 181 00:09:39,320 --> 00:09:44,199 Speaker 1: some representations of reality that actually humans use but can't 182 00:09:44,520 --> 00:09:45,360 Speaker 1: touch and describe. 183 00:09:45,440 --> 00:09:47,240 Speaker 2: I think there's that part of it. Another part of 184 00:09:47,280 --> 00:09:51,120 Speaker 2: it is just being able, again, AI is about information classification. 185 00:09:51,800 --> 00:09:54,199 Speaker 2: When you're writing, the hardest thing is to write something 186 00:09:54,880 --> 00:09:57,600 Speaker 2: terse and compressed. It's a bit easier to write big, 187 00:09:57,640 --> 00:09:59,240 Speaker 2: but it's still difficult. The easiest thing for us to 188 00:09:59,280 --> 00:09:59,800 Speaker 2: do is talk. 189 00:10:00,280 --> 00:10:02,959 Speaker 1: Now there's the old adage. You know, I couldn't send 190 00:10:02,960 --> 00:10:05,080 Speaker 1: you a short letter, so I've sent you a long one. 191 00:10:05,280 --> 00:10:09,160 Speaker 2: Right. Now anyone anywhere can create any image, soon any 192 00:10:09,360 --> 00:10:13,080 Speaker 2: PowerPoint slide, almost instantly, that looks beautiful. So the fact 193 00:10:13,080 --> 00:10:15,400 Speaker 2: that it understands concepts is a big deal because the 194 00:10:15,440 --> 00:10:17,800 Speaker 2: barriers to information flow are reduced, so information can 195 00:10:17,840 --> 00:10:20,079 Speaker 2: flow better around our organizations, our systems. 196 00:10:20,200 --> 00:10:24,480 Speaker 1: As you eliminate barriers to information flow, you're taking friction 197 00:10:25,000 --> 00:10:27,960 Speaker 1: out of systems. 
You're taking friction out of daily life, 198 00:10:28,040 --> 00:10:31,360 Speaker 1: you're taking friction out of business processes, you're taking friction 199 00:10:31,360 --> 00:10:34,760 Speaker 1: out of the economy, and so we would hope to 200 00:10:34,880 --> 00:10:40,120 Speaker 1: see improvements in productivity and with that improvements in prosperity. 201 00:10:40,240 --> 00:10:42,800 Speaker 2: All of finance can be broken down into two things, 202 00:10:42,960 --> 00:10:48,480 Speaker 2: securitization and leverage. Securitization is a representation of an asset 203 00:10:48,520 --> 00:10:52,000 Speaker 2: of some sort. It's money, the trust of the American government, 204 00:10:52,480 --> 00:10:54,640 Speaker 2: it is a bond, it is a property deed, something 205 00:10:54,679 --> 00:10:55,000 Speaker 2: like that. 206 00:10:56,000 --> 00:10:57,720 Speaker 3: But you can only have so much information on that. 207 00:10:58,400 --> 00:11:00,440 Speaker 2: You and I, we have our credit score based on 208 00:11:00,480 --> 00:11:02,800 Speaker 2: the information of who we are, what we do, our 209 00:11:02,880 --> 00:11:03,960 Speaker 2: functional identities. 210 00:11:04,320 --> 00:11:06,120 Speaker 3: Most of the world is invisible. 211 00:11:05,880 --> 00:11:08,560 Speaker 1: The Global South. This is people who are underbanked, 212 00:11:08,679 --> 00:11:11,679 Speaker 1: people who perhaps don't have formal IDs and so on. 213 00:11:11,760 --> 00:11:14,480 Speaker 2: You need identity and you need information to allow for banking. 214 00:11:15,360 --> 00:11:17,959 Speaker 2: You need that for finance, and our financial systems are 215 00:11:18,040 --> 00:11:20,959 Speaker 2: quite slow. As you get increases in information flow, you 216 00:11:21,000 --> 00:11:24,199 Speaker 2: get increases in prosperity because you can direct assets to 217 00:11:24,240 --> 00:11:26,760 Speaker 2: where they're needed. You can direct resources to where they're needed. 
218 00:11:27,200 --> 00:11:29,800 Speaker 2: It's like I always tell people in the team: roadmaps, 219 00:11:29,800 --> 00:11:32,240 Speaker 2: are they resource constrained? They're story constrained. Because if 220 00:11:32,280 --> 00:11:34,520 Speaker 2: it's a good idea, as a leader, I hope 221 00:11:34,559 --> 00:11:36,160 Speaker 2: I will find you the resources. But you have to 222 00:11:36,160 --> 00:11:36,640 Speaker 2: convince me. 223 00:11:36,679 --> 00:11:40,920 Speaker 1: First, I can imagine these models in rich, advanced economies 224 00:11:40,960 --> 00:11:42,880 Speaker 1: where there's a huge service sector. There are lots of 225 00:11:42,880 --> 00:11:46,120 Speaker 1: people who sit behind computers typing away creating spreadsheets and 226 00:11:46,120 --> 00:11:50,400 Speaker 1: PowerPoint slides. You can imagine these models helping economies like that. 227 00:11:51,200 --> 00:11:55,439 Speaker 1: But how can we see them helping the Global South 228 00:11:55,559 --> 00:11:57,280 Speaker 1: or poorer, less advanced economies? 229 00:11:57,400 --> 00:11:59,199 Speaker 2: One of the reasons these models have got everywhere is 230 00:11:59,240 --> 00:12:01,880 Speaker 2: they've become good enough, fast enough, and cheap enough. Stuff 231 00:12:01,920 --> 00:12:04,680 Speaker 2: that used to cost dollars, tens of dollars, hundreds of dollars, 232 00:12:04,679 --> 00:12:06,960 Speaker 2: thousands of dollars you can now do with a few 233 00:12:07,000 --> 00:12:10,280 Speaker 2: simple prompts. Now, I think it will remove a lot 234 00:12:10,320 --> 00:12:12,920 Speaker 2: of the basic tasks and make people more productive as 235 00:12:12,960 --> 00:12:15,000 Speaker 2: opposed to leading to mass unemployment. 236 00:12:15,080 --> 00:12:18,160 Speaker 1: And we're not seeing demand for coders drop off, right? 237 00:12:18,240 --> 00:12:21,400 Speaker 1: Coders can still get work pretty quickly as they need it. 
238 00:12:21,400 --> 00:12:24,240 Speaker 2: Because you will write better code and again a productivity increase. 239 00:12:24,520 --> 00:12:27,079 Speaker 2: Smartphones can take amazing pictures, but there are more 240 00:12:27,480 --> 00:12:30,640 Speaker 2: employed photographers in the world now than ever. You know, again, 241 00:12:30,679 --> 00:12:34,000 Speaker 2: we adapt, we improve, we use the technology. However, the 242 00:12:34,000 --> 00:12:37,280 Speaker 2: Global South is a very interesting phenomenon. We had mobile phones, 243 00:12:37,320 --> 00:12:38,880 Speaker 2: you remember it used to be for the rich, only 244 00:12:38,920 --> 00:12:39,880 Speaker 2: these big, big things. 245 00:12:40,200 --> 00:12:43,199 Speaker 1: And now in the Global South there are mobile phones everywhere. 246 00:12:42,960 --> 00:12:43,120 Speaker 3: There. 247 00:12:43,440 --> 00:12:47,319 Speaker 2: They leapt over the PC to mobile. Yeah, they leapt 248 00:12:47,360 --> 00:12:50,360 Speaker 2: over to instant payments. Whereas we took a while to 249 00:12:50,360 --> 00:12:52,680 Speaker 2: catch up. In certain Western countries, you still haven't 250 00:12:52,679 --> 00:12:54,720 Speaker 2: caught up to instant payments. I think what will happen 251 00:12:54,800 --> 00:12:56,760 Speaker 2: is these models become good enough, fast enough and cheap 252 00:12:56,840 --> 00:12:58,640 Speaker 2: enough just within the next few years that they will 253 00:12:58,720 --> 00:13:01,679 Speaker 2: leap forward to intelligence augmentation. 254 00:13:01,559 --> 00:13:11,400 Speaker 1: To the benefit of these emerging markets. Yes. So let's 255 00:13:11,400 --> 00:13:13,560 Speaker 1: think about where we've got to. We've got this very 256 00:13:13,679 --> 00:13:18,680 Speaker 1: powerful technology that we've characterized. It's extremely helpful in many, 257 00:13:18,679 --> 00:13:21,280 Speaker 1: many different ways. 
But it is the case that these 258 00:13:21,280 --> 00:13:26,960 Speaker 1: systems are extremely expensive to build, to train, as it's called. 259 00:13:27,000 --> 00:13:30,240 Speaker 1: They require lots and lots of data. The British government 260 00:13:30,240 --> 00:13:33,280 Speaker 1: has allocated more than a billion dollars to build a 261 00:13:33,400 --> 00:13:36,719 Speaker 1: supercomputer just to train these models. The rumors are that 262 00:13:37,200 --> 00:13:40,959 Speaker 1: the GPT-4 model from OpenAI cost hundreds of 263 00:13:41,000 --> 00:13:45,440 Speaker 1: millions of dollars to train. But they're also trained on 264 00:13:46,200 --> 00:13:48,400 Speaker 1: pretty much everything that you can find on the internet, 265 00:13:48,840 --> 00:13:53,720 Speaker 1: a large part of which will be Western, American, English language, 266 00:13:53,720 --> 00:13:56,960 Speaker 1: a strong cultural bias. So it sounds like that not 267 00:13:57,040 --> 00:14:01,240 Speaker 1: only can poorer countries not afford this, but even if 268 00:14:01,280 --> 00:14:05,240 Speaker 1: they could, the technologies wouldn't necessarily be suitable for 269 00:14:05,320 --> 00:14:11,199 Speaker 1: the economic requirements or the cultural requirements of Tanzania or Bangladesh. 270 00:14:11,400 --> 00:14:13,160 Speaker 2: Yeah, I think this is a real problem. I think 271 00:14:13,200 --> 00:14:16,800 Speaker 2: the quality of data we're feeding to these incredible models 272 00:14:17,160 --> 00:14:19,640 Speaker 2: is poor. It's scraped from the whole Internet. We need 273 00:14:19,680 --> 00:14:23,280 Speaker 2: better data, we need that as infrastructure. There is a 274 00:14:23,360 --> 00:14:25,680 Speaker 2: monetary equation if we need giant supercomputers, but more is 275 00:14:25,720 --> 00:14:28,240 Speaker 2: a question of talent and expertise. It's complicated to build 276 00:14:28,280 --> 00:14:30,480 Speaker 2: these things. 
This is one of the reasons, again, kind 277 00:14:30,480 --> 00:14:33,120 Speaker 2: of, we had Stability, to do an open version and 278 00:14:33,160 --> 00:14:36,000 Speaker 2: build these data sets for each country on an open basis. 279 00:14:36,160 --> 00:14:38,360 Speaker 1: So what do you actually mean by an open 280 00:14:38,400 --> 00:14:42,520 Speaker 1: model and how does that solve the computational problem? 281 00:14:42,640 --> 00:14:45,000 Speaker 2: We got the giant supercomputers, and then we got, we, 282 00:14:45,120 --> 00:14:48,520 Speaker 2: Stability, yes, and then we made it available, and 283 00:14:48,560 --> 00:14:50,760 Speaker 2: then we released it open source so people could take 284 00:14:50,760 --> 00:14:52,440 Speaker 2: these models as a base and then extend them. 285 00:14:52,560 --> 00:14:57,040 Speaker 1: But I'm familiar with open source in software. It's a, 286 00:14:57,120 --> 00:15:00,280 Speaker 1: whereas with closed source, if you're getting a Microsoft Word, 287 00:15:00,560 --> 00:15:04,920 Speaker 1: you buy it from Microsoft, and you can't inspect the 288 00:15:04,960 --> 00:15:07,800 Speaker 1: source code, the instructions that make it run. You just 289 00:15:07,920 --> 00:15:10,960 Speaker 1: run it, so you are fundamentally a consumer of it. But 290 00:15:11,160 --> 00:15:15,200 Speaker 1: with an open source project like LibreOffice, which is 291 00:15:15,200 --> 00:15:18,720 Speaker 1: an open source office product, you just download the code. 292 00:15:18,760 --> 00:15:21,120 Speaker 1: You can look at the code, you can inspect the code, 293 00:15:21,200 --> 00:15:23,760 Speaker 1: you can modify the code, and you can tailor it 294 00:15:23,760 --> 00:15:26,680 Speaker 1: to your own requirement. So that's open source in software. 295 00:15:27,240 --> 00:15:29,760 Speaker 1: What's open source in a model? 
296 00:15:29,960 --> 00:15:32,000 Speaker 2: So in a model, you can inspect the code, you 297 00:15:32,000 --> 00:15:34,480 Speaker 2: can inspect the data sets, and the model weights themselves 298 00:15:34,560 --> 00:15:37,680 Speaker 2: are freely available as a fresh trained graduate, as it were. 299 00:15:37,880 --> 00:15:41,840 Speaker 2: And by releasing this openly so you could take it 300 00:15:41,880 --> 00:15:45,000 Speaker 2: and adapt it, it's a massive development boom where you 301 00:15:45,000 --> 00:15:46,040 Speaker 2: start seeing it everywhere. 302 00:15:46,240 --> 00:15:48,680 Speaker 1: Help me understand the mechanics of all this, because I 303 00:15:48,720 --> 00:15:55,560 Speaker 1: think it's important. Stability has its own machine learning AI supercomputer. 304 00:15:56,040 --> 00:15:59,440 Speaker 1: So you run up the cost of training these models 305 00:15:59,440 --> 00:16:02,960 Speaker 1: for the first time. Yeah, you then release them as models, 306 00:16:03,200 --> 00:16:07,920 Speaker 1: data and weights which any developer can take and use. 307 00:16:07,960 --> 00:16:10,960 Speaker 1: And when the developer runs them, they run them on 308 00:16:11,040 --> 00:16:15,480 Speaker 1: their own computing hardware, and then they're paying for that. 309 00:16:15,760 --> 00:16:17,440 Speaker 3: Yes, in some sense, they don't pay for it. 310 00:16:17,440 --> 00:16:19,520 Speaker 2: But if you want enterprise support, then you work with 311 00:16:19,600 --> 00:16:22,240 Speaker 2: us and our partners, or if you want customized versions, 312 00:16:22,280 --> 00:16:25,520 Speaker 2: because it's our view that every enterprise will want their 313 00:16:25,560 --> 00:16:28,360 Speaker 2: own versions with their own data sets underlying it. Every 314 00:16:28,400 --> 00:16:31,360 Speaker 2: country will want their own version because this is the 315 00:16:31,400 --> 00:16:34,520 Speaker 2: next generation of infrastructure. 
The actual comparison is 5G. 316 00:16:35,520 --> 00:16:39,000 Speaker 2: This is 5G, the phone networks, right? Yes, this is 317 00:16:39,040 --> 00:16:41,160 Speaker 2: 5G for creativity as it were, it's 5G 318 00:16:41,280 --> 00:16:45,040 Speaker 2: for information flow. And a trillion dollars has been spent 319 00:16:45,080 --> 00:16:45,520 Speaker 2: on 5G. 320 00:16:46,480 --> 00:16:49,880 Speaker 1: Right, so we can spend a lot more on AI 321 00:16:49,960 --> 00:16:54,160 Speaker 1: systems across our economies. If we come back to your 322 00:16:54,600 --> 00:16:58,520 Speaker 1: open source models, it strikes me that one of the 323 00:16:58,520 --> 00:17:02,720 Speaker 1: things that you can do with them is you could 324 00:17:02,760 --> 00:17:05,639 Speaker 1: make them very culturally relevant. And I think back to 325 00:17:06,320 --> 00:17:09,080 Speaker 1: this idea that sort of Western values get exported 326 00:17:09,160 --> 00:17:13,440 Speaker 1: for every country regardless. Back in the day, when you 327 00:17:13,600 --> 00:17:17,000 Speaker 1: registered for Facebook, it would ask you what your marital 328 00:17:17,000 --> 00:17:19,720 Speaker 1: status was, and it was sort of single, divorced, married 329 00:17:19,800 --> 00:17:20,840 Speaker 1: or it's complicated. 330 00:17:21,080 --> 00:17:21,320 Speaker 3: Yes. 331 00:17:21,520 --> 00:17:23,199 Speaker 1: And my mum, who was in her late seventies at 332 00:17:23,240 --> 00:17:26,359 Speaker 1: the time, is registering for Facebook and is on the 333 00:17:26,359 --> 00:17:28,199 Speaker 1: phone to me going, what does "it's complicated" mean? 334 00:17:28,240 --> 00:17:28,760 Speaker 3: Because it just 335 00:17:28,720 --> 00:17:32,760 Speaker 1: didn't exist within her sort of mental space. 
And it 336 00:17:32,760 --> 00:17:36,080 Speaker 1: seems like given how important AI is going to be 337 00:17:36,119 --> 00:17:38,240 Speaker 1: as infrastructure, I mean, it is going to be the 338 00:17:38,320 --> 00:17:41,680 Speaker 1: layer between me and the services that I access as 339 00:17:41,720 --> 00:17:45,600 Speaker 1: a consumer, as a citizen or as an employee. It's 340 00:17:45,640 --> 00:17:50,720 Speaker 1: a really critical gatekeeper. So is that part of your vision? 341 00:17:50,800 --> 00:17:53,399 Speaker 1: An Indian version of ChatGPT, a Brazilian version of 342 00:17:53,480 --> 00:17:55,879 Speaker 1: ChatGPT, an Indonesian version of ChatGPT. 343 00:17:56,440 --> 00:17:56,680 Speaker 3: Yes. 344 00:17:56,760 --> 00:18:00,760 Speaker 2: My vision is that every person, company, country, culture has 345 00:18:00,800 --> 00:18:04,440 Speaker 2: their own models that they themselves build and have the 346 00:18:04,520 --> 00:18:08,800 Speaker 2: data sets for, because this is vital infrastructure to represent themselves, 347 00:18:08,880 --> 00:18:10,440 Speaker 2: to extend their abilities. 348 00:18:10,800 --> 00:18:12,719 Speaker 1: How much of what you're saying is theory? Do you 349 00:18:12,760 --> 00:18:18,879 Speaker 1: actually have national models being built across the global South? 350 00:18:19,480 --> 00:18:21,280 Speaker 2: A lot of this stuff is still in the research 351 00:18:21,320 --> 00:18:24,040 Speaker 2: phase and now is only just entering the engineering phase, 352 00:18:24,480 --> 00:18:25,920 Speaker 2: and so there's a lot we still don't know about 353 00:18:25,920 --> 00:18:27,680 Speaker 2: these models. But they're good enough, fast enough, and cheap 354 00:18:27,800 --> 00:18:28,320 Speaker 2: enough to do it. 
355 00:18:28,640 --> 00:18:31,320 Speaker 1: Okay, even if the models are cheaper, you still have 356 00:18:31,400 --> 00:18:35,480 Speaker 1: to get the relevant data because you know, the Internet 357 00:18:35,520 --> 00:18:40,119 Speaker 1: doesn't necessarily have lots of information about Pakistani culture in 358 00:18:40,160 --> 00:18:43,040 Speaker 1: Pakistani broadcasts, in Pakistani media. Exactly. 359 00:18:43,160 --> 00:18:45,280 Speaker 2: And so what you need to have is you need 360 00:18:45,320 --> 00:18:50,680 Speaker 2: to have Pakistani newspapers, Pakistani broadcasts, and then have Pakistanis 361 00:18:50,720 --> 00:18:53,720 Speaker 2: come together to build better data sets that teach these models. 362 00:18:54,080 --> 00:18:56,840 Speaker 2: We know the technology now, but we lack the data 363 00:18:57,320 --> 00:18:59,800 Speaker 2: and so that is the key blocking point. 364 00:18:59,600 --> 00:19:03,240 Speaker 1: To get access to the data as is needed. 365 00:19:03,480 --> 00:19:06,200 Speaker 2: No, to enable it so that as these data sets 366 00:19:06,240 --> 00:19:08,719 Speaker 2: are built, the models can then come from there and 367 00:19:08,760 --> 00:19:11,600 Speaker 2: then people can build on those models for their own people. 368 00:19:11,880 --> 00:19:15,040 Speaker 1: It almost sounds philanthropic, right, So what is your model 369 00:19:15,359 --> 00:19:17,960 Speaker 1: for making money from these open source models that you 370 00:19:18,000 --> 00:19:20,480 Speaker 1: are effectively giving away after all your hard work. 371 00:19:21,000 --> 00:19:25,240 Speaker 2: The whole of the infrastructure here in London in the 372 00:19:25,240 --> 00:19:27,960 Speaker 2: West is all based on open source. 
And the model 373 00:19:27,960 --> 00:19:29,879 Speaker 2: for open source is that you have an open version 374 00:19:30,280 --> 00:19:33,600 Speaker 2: that anyone can use, they can start experimenting with, and 375 00:19:33,640 --> 00:19:37,360 Speaker 2: then there are variant enterprise versions that you provide full 376 00:19:37,400 --> 00:19:41,600 Speaker 2: support around other services and facilities integration, and you're making 377 00:19:41,680 --> 00:19:44,280 Speaker 2: money that way with these models. We have our open 378 00:19:44,320 --> 00:19:47,440 Speaker 2: models based on open data and we have open models 379 00:19:47,480 --> 00:19:51,000 Speaker 2: based on licensed data from our partners. Because when you 380 00:19:51,040 --> 00:19:54,280 Speaker 2: talk to regulated industries and others they want models they own. 381 00:19:54,440 --> 00:19:57,520 Speaker 2: They don't want to send their data away to other people, 382 00:19:57,880 --> 00:19:59,960 Speaker 2: and they want to know every single piece of data 383 00:20:00,080 --> 00:20:01,600 Speaker 2: in there, and they want it to be the best 384 00:20:01,680 --> 00:20:02,159 Speaker 2: data you know. 385 00:20:02,320 --> 00:20:05,919 Speaker 1: Essentially, your average young developer or a small startup can 386 00:20:06,520 --> 00:20:09,200 Speaker 1: download your models and use them, but if there's an issue, 387 00:20:09,400 --> 00:20:11,320 Speaker 1: there's not a lot of support. But if you're a 388 00:20:11,320 --> 00:20:14,960 Speaker 1: big company, you're a national government, you might enter into 389 00:20:15,000 --> 00:20:18,640 Speaker 1: a more detailed contract where there is support and advice 390 00:20:18,960 --> 00:20:22,240 Speaker 1: and potentially even data yes coming through. 
We know that 391 00:20:22,280 --> 00:20:26,440 Speaker 1: AI is a very powerful technology, and Stability has taken 392 00:20:26,480 --> 00:20:29,520 Speaker 1: a different path to other firms by going through an 393 00:20:29,560 --> 00:20:34,840 Speaker 1: open source approach, which could democratize the technology, making it culturally, linguistically, 394 00:20:35,320 --> 00:20:41,760 Speaker 1: locally relevant for any nation, any business, any region, any individual. 395 00:20:42,320 --> 00:20:46,119 Speaker 1: But you're up against firms like Microsoft and OpenAI 396 00:20:46,600 --> 00:20:50,280 Speaker 1: and Google and DeepMind and others. How do you 397 00:20:50,280 --> 00:20:51,080 Speaker 1: plan to compete? 398 00:20:51,880 --> 00:20:54,760 Speaker 2: I think there's a cab career on addressable market. Our 399 00:20:54,760 --> 00:20:57,040 Speaker 2: addressable market is all the private data in the world. 400 00:20:57,800 --> 00:21:00,280 Speaker 2: Data you can't send anyone but your personal data or 401 00:21:00,359 --> 00:21:03,920 Speaker 2: enterprise data, financial regulated data, and so our models will 402 00:21:03,920 --> 00:21:07,399 Speaker 2: go in and transform that into knowledge and we'll have 403 00:21:07,400 --> 00:21:09,800 Speaker 2: a hybrid AI think, we've got our models on your 404 00:21:09,840 --> 00:21:11,879 Speaker 2: private data, and we standardize all of that, make it 405 00:21:12,000 --> 00:21:14,800 Speaker 2: very predictable, loads of support, and then you use these 406 00:21:14,800 --> 00:21:17,800 Speaker 2: proprietary systems for the best outcomes. You'll have your own 407 00:21:17,840 --> 00:21:21,280 Speaker 2: graduates that you hire, and you'll hire from McKinsey, and. 408 00:21:21,280 --> 00:21:22,200 Speaker 1: You'll put them together. 409 00:21:22,240 --> 00:21:23,040 Speaker 3: You'll put them together. 
410 00:21:23,320 --> 00:21:28,200 Speaker 1: But I'm curious because other companies are taking a closed 411 00:21:28,359 --> 00:21:33,399 Speaker 1: source approach to proprietary data. So there are companies like 412 00:21:33,560 --> 00:21:39,680 Speaker 1: Cohere and Anthropic who will build a powerful generative AI 413 00:21:39,760 --> 00:21:43,920 Speaker 1: model just on a company's own private data. They're competing 414 00:21:43,920 --> 00:21:45,280 Speaker 1: with you as well, right, and so why is your 415 00:21:45,280 --> 00:21:46,119 Speaker 1: approach better than that? 416 00:21:46,440 --> 00:21:49,159 Speaker 2: They will not give that company ownership of that model. 417 00:21:50,680 --> 00:21:53,440 Speaker 2: They will not share the detail of every single piece 418 00:21:53,440 --> 00:21:54,600 Speaker 2: of data that's in that model. 419 00:21:55,040 --> 00:21:57,879 Speaker 1: But that's the case today with lots of the technology 420 00:21:57,880 --> 00:21:59,800 Speaker 1: that we use. You know, when I'm running my e 421 00:22:00,000 --> 00:22:04,280 Speaker 1: commerce application on the cloud, I don't know the details 422 00:22:04,320 --> 00:22:07,520 Speaker 1: of every configuration of the servers that I'm renting from 423 00:22:08,040 --> 00:22:11,160 Speaker 1: Amazon or Microsoft Azure, so businesses are used to that. 424 00:22:11,080 --> 00:22:13,960 Speaker 2: One hundred percent, and you have data that you can 425 00:22:14,000 --> 00:22:16,639 Speaker 2: share with people, but there is a core of regulated 426 00:22:16,680 --> 00:22:19,440 Speaker 2: data and other things, HIPAA-compliant data, medical data 427 00:22:19,480 --> 00:22:21,240 Speaker 2: that you cannot send to other companies, and you have 428 00:22:21,240 --> 00:22:25,119 Speaker 2: to build your own systems for inside regulated environments. 
The 429 00:22:25,160 --> 00:22:28,480 Speaker 2: feedback we've got from regulated entities and again from policymakers 430 00:22:28,520 --> 00:22:32,160 Speaker 2: and others is open, transparent models, even if it's licensed 431 00:22:32,200 --> 00:22:34,440 Speaker 2: data, are something that we would like a lot, and 432 00:22:34,440 --> 00:22:36,520 Speaker 2: we would like to own this technology if it's going 433 00:22:36,560 --> 00:22:37,920 Speaker 2: to be vital infrastructure. 434 00:22:38,280 --> 00:22:41,600 Speaker 1: Right, we're in London. Now there's a deep bench of 435 00:22:41,640 --> 00:22:46,000 Speaker 1: AI skills. How do we expand this and democratize it 436 00:22:46,320 --> 00:22:51,080 Speaker 1: out to countries where there's just less talent in these 437 00:22:51,160 --> 00:22:52,080 Speaker 1: breakthrough areas. 438 00:22:52,240 --> 00:22:54,280 Speaker 3: I think there's talent, it just hasn't been accessed. 439 00:22:54,320 --> 00:22:56,640 Speaker 2: And these models are very interesting and you can use 440 00:22:56,680 --> 00:22:59,879 Speaker 2: AI to help you develop applications of the AI. 441 00:23:00,119 --> 00:23:01,760 Speaker 3: Quite a funny kind of recursion there. 442 00:23:01,680 --> 00:23:06,040 Speaker 1: Right, So you effectively will start to support places where 443 00:23:06,040 --> 00:23:09,280 Speaker 1: perhaps the workforce doesn't have the depth of San Francisco's 444 00:23:09,320 --> 00:23:11,199 Speaker 1: AI talent with the tools themselves. 445 00:23:11,320 --> 00:23:14,399 Speaker 2: Yes, because the AI models are pre-computed, the actual 446 00:23:14,480 --> 00:23:17,960 Speaker 2: running of the AI is very computationally non-intensive. The 447 00:23:18,040 --> 00:23:21,399 Speaker 2: creation of the AI is ridiculously intensive, right, So you've 448 00:23:21,400 --> 00:23:23,360 Speaker 2: got all the energy at the start rather than the end. 
449 00:23:23,960 --> 00:23:27,919 Speaker 1: So that takes us back to what Stability will do. 450 00:23:28,000 --> 00:23:31,679 Speaker 1: Stability will take the cost upfront, and then it'll find 451 00:23:32,320 --> 00:23:35,880 Speaker 1: rich companies, rich nations, rich clients to tailor the models, 452 00:23:36,520 --> 00:23:39,879 Speaker 1: which allows you to continue to make the base foundation model, 453 00:23:40,080 --> 00:23:44,960 Speaker 1: which can then be given as open source to anyone 454 00:23:45,040 --> 00:23:45,760 Speaker 1: else who yes. 455 00:23:45,840 --> 00:23:48,080 Speaker 2: It stimulates demand, and then as people go up and 456 00:23:48,119 --> 00:23:50,439 Speaker 2: they need the support, they come to us and our partners. 457 00:23:50,480 --> 00:23:53,240 Speaker 2: For any customization, they come to us and our partners, right, 458 00:23:53,280 --> 00:23:56,840 Speaker 2: And it's open models for private data, and then other 459 00:23:56,920 --> 00:23:59,959 Speaker 2: models are for data that is either semi-private 460 00:24:00,080 --> 00:24:02,080 Speaker 2: or that you don't mind, and you combine the two so 461 00:24:02,119 --> 00:24:03,399 Speaker 2: you have models of both types. 462 00:24:04,440 --> 00:24:08,120 Speaker 1: There are, of course, concerns with the safety of AI systems, 463 00:24:08,160 --> 00:24:12,480 Speaker 1: and one argument is that with closed models that are controlled, 464 00:24:12,880 --> 00:24:15,359 Speaker 1: there's a lot more safety because if I'm accessing it 465 00:24:15,440 --> 00:24:19,240 Speaker 1: over the web, the organization that's running the AI system 466 00:24:19,320 --> 00:24:21,680 Speaker 1: can stop me if I'm trying to do something 467 00:24:21,720 --> 00:24:24,720 Speaker 1: bad with it. 
And with open source models, of course 468 00:24:24,720 --> 00:24:27,359 Speaker 1: they're just available for anyone to download, so the cat 469 00:24:27,440 --> 00:24:29,840 Speaker 1: is literally out of the bag many many times over, 470 00:24:29,960 --> 00:24:35,360 Speaker 1: millions of times over. Is your approach less safe than 471 00:24:35,800 --> 00:24:36,680 Speaker 1: the closed approach? 472 00:24:37,280 --> 00:24:39,600 Speaker 2: I think it's more safe. There's a reason that our 473 00:24:39,640 --> 00:24:42,720 Speaker 2: infrastructure is based on open source databases and servers and 474 00:24:42,760 --> 00:24:45,560 Speaker 2: others because it can be checked, it can be tested, 475 00:24:45,600 --> 00:24:48,760 Speaker 2: and it can be fully audited and battle tested. Our 476 00:24:48,800 --> 00:24:51,000 Speaker 2: approach at Stability is to create the standard around this, 477 00:24:51,080 --> 00:24:54,640 Speaker 2: so there aren't thousands of different models. There is an 478 00:24:54,760 --> 00:24:58,159 Speaker 2: entity in a partnership and an ecosystem that standardizes around 479 00:24:58,160 --> 00:25:02,919 Speaker 2: this principles line safety, watermarking, and other things, so 480 00:25:02,960 --> 00:25:04,040 Speaker 2: it becomes predictable. 481 00:25:04,440 --> 00:25:06,600 Speaker 1: But you've put the models out for any bad actor, 482 00:25:06,760 --> 00:25:13,000 Speaker 1: any hacker, any annoyed employee to build something difficult with. 483 00:25:13,480 --> 00:25:15,280 Speaker 2: We weren't the ones that came up with open models. 484 00:25:15,280 --> 00:25:19,479 Speaker 2: We're standardizing it. We're supporting open innovation for detection and 485 00:25:19,560 --> 00:25:20,959 Speaker 2: prevention as well as creation. 
486 00:25:21,320 --> 00:25:24,520 Speaker 1: But it does sound like bad actors will end up 487 00:25:24,560 --> 00:25:26,800 Speaker 1: having a little bit of a field day, which creates 488 00:25:26,880 --> 00:25:31,080 Speaker 1: I suspect an enormous opportunity for an AI driven security 489 00:25:31,119 --> 00:25:32,840 Speaker 1: and resilience industry. 490 00:25:33,000 --> 00:25:36,240 Speaker 2: The reality is that we're stronger together when things are open, 491 00:25:36,320 --> 00:25:40,600 Speaker 2: and open is required for all the private, regulated and 492 00:25:40,640 --> 00:25:43,280 Speaker 2: other data out there. If you don't have open systems, 493 00:25:43,359 --> 00:25:46,080 Speaker 2: then you will only have proprietary entities, and they become 494 00:25:46,119 --> 00:25:48,560 Speaker 2: the choke points on the Internet, and that's far more 495 00:25:48,640 --> 00:25:51,879 Speaker 2: dangerous than the other side. Open is there anyway, But 496 00:25:51,920 --> 00:25:54,280 Speaker 2: like I said, let's standardize it, let's make it safer, 497 00:25:54,320 --> 00:25:57,800 Speaker 2: and let's work together to combat the bad as opposed 498 00:25:57,840 --> 00:26:00,320 Speaker 2: to leaving it with a few unelected giant companies. 499 00:26:00,640 --> 00:26:04,880 Speaker 1: Part of the story of technology is that technology has 500 00:26:04,960 --> 00:26:10,439 Speaker 1: been exported from one place really for everyone else to use. 
501 00:26:10,560 --> 00:26:13,200 Speaker 1: I think one exciting opportunity now is the idea that 502 00:26:14,200 --> 00:26:17,000 Speaker 1: the people on whom this technology is going to operate 503 00:26:17,040 --> 00:26:20,560 Speaker 1: could potentially build their own. Now, many of those people 504 00:26:20,640 --> 00:26:23,000 Speaker 1: are going to be in the Global South, and the 505 00:26:23,040 --> 00:26:26,639 Speaker 1: premise of our conversation is that within five years, safe 506 00:26:26,680 --> 00:26:30,960 Speaker 1: and open source generative AI could make a significant contribution 507 00:26:31,480 --> 00:26:36,760 Speaker 1: to increase the GDP of the world's poorest nations. How 508 00:26:36,920 --> 00:26:39,480 Speaker 1: likely is it that this vision could become reality? I 509 00:26:39,560 --> 00:26:43,240 Speaker 2: think it's incredibly likely. The desire, talent and passion to 510 00:26:43,320 --> 00:26:47,080 Speaker 2: adopt technology like this is huge within the Global South, 511 00:26:47,280 --> 00:26:49,280 Speaker 2: and it is where it can have the most impact, 512 00:26:49,320 --> 00:26:51,800 Speaker 2: the highest ROI. So I think they'll take the building 513 00:26:51,840 --> 00:26:54,760 Speaker 2: blocks that we and others provide and they'll build some 514 00:26:54,840 --> 00:26:59,000 Speaker 1: amazing things to activate their potential. Emad, it's a great vision. 515 00:26:59,280 --> 00:27:01,399 Speaker 1: Thank you so much. My pleasure, thank you for having me. 516 00:27:08,920 --> 00:27:12,440 Speaker 1: Reflecting on my conversation with Emad, I'm reminded that much 517 00:27:12,480 --> 00:27:15,200 Speaker 1: of the software that powers the Internet today, used by 518 00:27:15,240 --> 00:27:18,840 Speaker 1: billions of us, is actually open source. It's proven to 519 00:27:18,880 --> 00:27:24,320 Speaker 1: be resilient, stable, and importantly affordable. 
The open approach is 520 00:27:24,440 --> 00:27:28,800 Speaker 1: one reason why the Internet is today ubiquitous. So why 521 00:27:28,800 --> 00:27:31,840 Speaker 1: wouldn't that be true for generative AI? And if the 522 00:27:31,880 --> 00:27:35,440 Speaker 1: technology can live up to its promise of improving productivity, 523 00:27:35,800 --> 00:27:38,720 Speaker 1: wouldn't the open approach make it more widely accessible to 524 00:27:38,760 --> 00:27:41,399 Speaker 1: the poorest countries in the world? That seems to make 525 00:27:41,480 --> 00:27:53,720 Speaker 1: sense to me. Thanks for listening to the Exponentially podcast. 526 00:27:53,920 --> 00:27:56,919 Speaker 1: If you enjoy the show, please leave a review or rating. 527 00:27:57,080 --> 00:28:00,960 Speaker 1: It really does help others find us. The podcast is 528 00:28:01,000 --> 00:28:04,920 Speaker 1: presented by me, Azeem Azhar. The sound designer is Will Horrocks. 529 00:28:05,200 --> 00:28:08,120 Speaker 1: The research was led by Chloe Ippah and music composed 530 00:28:08,119 --> 00:28:11,560 Speaker 1: by Emily Green and John Zarcone. The show is produced 531 00:28:11,560 --> 00:28:15,719 Speaker 1: by Frederick Cassella, Maria Garrilov and me, Azeem Azhar. Special 532 00:28:15,760 --> 00:28:19,159 Speaker 1: thanks to Sage Bauman, Jeff Grocott and Magnus Henrikson. The 533 00:28:19,200 --> 00:28:23,280 Speaker 1: executive producers are Andrew Barden, Adam Kamiski, and Kyle Kramer. 534 00:28:23,560 --> 00:28:27,239 Speaker 1: David Rivella is the managing editor. Exponentially was created by 535 00:28:27,240 --> 00:28:29,199 Speaker 1: Frederick Cassella and is an e to the pi i 536 00:28:29,280 --> 00:28:33,400 Speaker 1: plus one Limited production in association with Bloomberg LLC