1 00:00:00,960 --> 00:00:10,520 Speaker 1: Okay, make a photo of a CEO, and enter. All 2 00:00:10,600 --> 00:00:18,160 Speaker 1: right, results. Oh, it made four different pictures and all 3 00:00:18,280 --> 00:00:23,440 Speaker 1: of them appear to be light skinned men in suits. Okay, 4 00:00:23,560 --> 00:00:30,960 Speaker 1: do that again. Photo of a CEO. And four 5 00:00:31,360 --> 00:00:36,239 Speaker 1: more people who look like men to me. This is 6 00:00:36,280 --> 00:00:40,800 Speaker 1: my very first time using an AI image generator. It's 7 00:00:40,840 --> 00:00:43,960 Speaker 1: through a free website on my laptop. Essentially, it's an 8 00:00:44,000 --> 00:00:48,000 Speaker 1: AI model like ChatGPT, but instead of creating AI 9 00:00:48,120 --> 00:00:53,400 Speaker 1: generated text, it produces AI generated images. I type into 10 00:00:53,400 --> 00:00:55,240 Speaker 1: a search box what I want to see, and a 11 00:00:55,240 --> 00:00:59,120 Speaker 1: few seconds later it produces pictures that the model believes 12 00:00:59,200 --> 00:01:02,760 Speaker 1: are what I'm asking for. All right, let's do 13 00:01:02,840 --> 00:01:11,120 Speaker 1: this again. Photo of a physician, and looks to me 14 00:01:11,319 --> 00:01:16,240 Speaker 1: like men, but in lab coats and scrubs. Oh, and 15 00:01:16,319 --> 00:01:19,400 Speaker 1: you know they're doctors because they all have stethoscopes around 16 00:01:19,440 --> 00:01:24,800 Speaker 1: their necks. I requested images for several different jobs and 17 00:01:24,920 --> 00:01:27,920 Speaker 1: repeated the requests at least ten times for each one. 18 00:01:28,360 --> 00:01:32,440 Speaker 1: The results were eye-opening. Almost all the images of 19 00:01:32,560 --> 00:01:35,560 Speaker 1: CEOs and doctors the model produced, at least when I 20 00:01:35,680 --> 00:01:39,160 Speaker 1: was using it, appeared to me to be men. 
All 21 00:01:39,280 --> 00:01:43,480 Speaker 1: nurses and almost all teachers appeared to be women. By 22 00:01:43,520 --> 00:01:45,840 Speaker 1: the way, I'm saying appeared to be women or men, 23 00:01:45,920 --> 00:01:49,880 Speaker 1: because these images are of people who don't actually exist, and 24 00:01:49,920 --> 00:01:54,760 Speaker 1: so identifiers like gender, race, and ethnicity are subjective, and 25 00:01:54,800 --> 00:01:58,600 Speaker 1: we'll talk more about that a bit later. Most images 26 00:01:58,640 --> 00:02:01,360 Speaker 1: of attorneys the model generated looked to me to be light 27 00:02:01,440 --> 00:02:05,680 Speaker 1: skinned men. Images of scientists appeared to be more diverse, 28 00:02:05,760 --> 00:02:09,360 Speaker 1: but they also mostly looked like men. And this is 29 00:02:09,440 --> 00:02:13,079 Speaker 1: really weird: when the model did generate pictures of attorneys 30 00:02:13,160 --> 00:02:16,239 Speaker 1: or scientists who looked to me like women, they were 31 00:02:16,400 --> 00:02:20,520 Speaker 1: very often shown dressed in traditional men's clothing, like business 32 00:02:20,520 --> 00:02:24,200 Speaker 1: suits and wearing neckties, as though the model couldn't fully 33 00:02:24,280 --> 00:02:28,519 Speaker 1: accept the concept of a woman in those professions. AI 34 00:02:28,600 --> 00:02:32,560 Speaker 1: sometimes generates a distorted version of reality that doesn't look 35 00:02:32,639 --> 00:02:34,919 Speaker 1: like the world we live in, and it can perpetuate 36 00:02:35,040 --> 00:02:39,280 Speaker 1: gender and racial stereotypes. This matters because, as we know, 37 00:02:39,600 --> 00:02:42,440 Speaker 1: AI is fast working its way into our lives, and 38 00:02:42,480 --> 00:02:46,360 Speaker 1: AI generated images that can make us believe something artificial 39 00:02:46,520 --> 00:02:52,160 Speaker 1: is actually something real may be especially influential and potentially harmful. 
40 00:02:53,160 --> 00:02:56,400 Speaker 2: So we're looking at a situation where we're generating more 41 00:02:56,440 --> 00:02:59,200 Speaker 2: and more content via AI, more and more of these 42 00:02:59,200 --> 00:03:02,959 Speaker 2: synthetic images. Those images become a part of the body 43 00:03:03,000 --> 00:03:05,560 Speaker 2: of images, the body of work that's on the internet. 44 00:03:05,919 --> 00:03:08,760 Speaker 2: They are more biased than reality. And then in the 45 00:03:08,800 --> 00:03:12,840 Speaker 2: future those images get fed back into future AI systems, 46 00:03:13,200 --> 00:03:16,360 Speaker 2: so that you end up in this nasty cycle where 47 00:03:16,440 --> 00:03:18,680 Speaker 2: the bias is getting worse and worse and being fed 48 00:03:18,720 --> 00:03:22,240 Speaker 2: back into future systems which are then less diverse. 49 00:03:22,800 --> 00:03:26,359 Speaker 3: So there was a recent Europol report that suggested that 50 00:03:26,440 --> 00:03:30,440 Speaker 3: by twenty twenty six, ninety percent of all online content 51 00:03:30,600 --> 00:03:34,520 Speaker 3: could be artificially generated. What happens when ninety percent of 52 00:03:34,639 --> 00:03:39,040 Speaker 3: all online images are images reinforcing those stereotypes? 53 00:03:39,480 --> 00:03:43,800 Speaker 1: Bloomberg's Dina Bass and Leonardo Nicoletti dug deep into the 54 00:03:43,880 --> 00:03:47,000 Speaker 1: data to find out why the results look like this 55 00:03:47,440 --> 00:03:50,240 Speaker 1: and what can be done to fix the shortcomings of 56 00:03:50,280 --> 00:03:58,960 Speaker 1: this rapidly emerging technology. I'm Wes Kosova. Today on The 57 00:03:58,960 --> 00:04:02,760 Speaker 1: Big Take: You can't trust your eyes when it comes 58 00:04:02,760 --> 00:04:08,320 Speaker 1: to AI. Leo, maybe you can start by just giving 59 00:04:08,360 --> 00:04:12,440 Speaker 1: us an overview of what you found in this investigation. 
60 00:04:13,560 --> 00:04:17,240 Speaker 3: This investigation essentially looks at generative AI, which is a 61 00:04:17,279 --> 00:04:20,919 Speaker 3: new type of AI. It's like ChatGPT, where you 62 00:04:21,400 --> 00:04:23,840 Speaker 3: ask it a question and it just answers you or 63 00:04:23,920 --> 00:04:26,479 Speaker 3: gives you the information you need. In our case, we 64 00:04:26,600 --> 00:04:30,200 Speaker 3: use Stable Diffusion, which is similar to ChatGPT, but 65 00:04:30,279 --> 00:04:33,120 Speaker 3: instead of text to text, which means like using 66 00:04:33,160 --> 00:04:36,240 Speaker 3: text to generate more text, it's text to image, where 67 00:04:36,279 --> 00:04:39,520 Speaker 3: you ask it a question or give it a description, 68 00:04:39,760 --> 00:04:42,920 Speaker 3: and then it will generate an image for you of 69 00:04:43,680 --> 00:04:47,920 Speaker 3: what you're looking for. And while, you know, that gives 70 00:04:48,000 --> 00:04:51,640 Speaker 3: us lots of possibilities and opens lots of doors for work, 71 00:04:51,720 --> 00:04:56,480 Speaker 3: for design, for artistic advertising, lots of purposes, what we 72 00:04:56,600 --> 00:05:00,279 Speaker 3: found is that it also has very strong biases 73 00:05:00,560 --> 00:05:05,880 Speaker 3: against people of color, women in general. And so what 74 00:05:05,920 --> 00:05:09,040 Speaker 3: we wanted to show through this piece is really how 75 00:05:09,160 --> 00:05:14,480 Speaker 3: biased is generative AI, and specifically generative AI that creates 76 00:05:14,520 --> 00:05:19,279 Speaker 3: images and visual representations. And, you know, to what extent 77 00:05:19,440 --> 00:05:23,240 Speaker 3: are these biases ingrained in this technology? And, you know, 78 00:05:23,279 --> 00:05:25,080 Speaker 3: what are the potential implications of that? 
79 00:05:25,960 --> 00:05:28,800 Speaker 1: This is significant because it's different from the kind of 80 00:05:28,920 --> 00:05:32,560 Speaker 1: facial recognition that we already know about, is that right? 81 00:05:33,279 --> 00:05:33,600 Speaker 4: It is. 82 00:05:33,680 --> 00:05:37,440 Speaker 2: So somewhere around twenty eighteen we started finding out that 83 00:05:37,520 --> 00:05:42,159 Speaker 2: facial recognition software had significant racial and gender biases. And 84 00:05:42,480 --> 00:05:45,680 Speaker 2: what that software is: you have a picture, an image, 85 00:05:46,000 --> 00:05:49,239 Speaker 2: and the AI scans it and tries to predict what's 86 00:05:49,360 --> 00:05:49,680 Speaker 2: in it. 87 00:05:49,880 --> 00:05:51,680 Speaker 4: You know, what am I looking at? Is it a 88 00:05:51,720 --> 00:05:54,960 Speaker 4: black cat? Is it a cheeseburger? Is it a white woman? 89 00:05:55,040 --> 00:05:56,080 Speaker 4: What am I looking at? 90 00:05:56,920 --> 00:06:01,480 Speaker 2: In twenty eighteen, a couple of researchers, Joy Buolamwini, Timnit Gebru, 91 00:06:01,640 --> 00:06:05,200 Speaker 2: and then Deb Raji, combined to do some work called Gender Shades, 92 00:06:05,200 --> 00:06:08,400 Speaker 2: where they ran a bunch of the popular facial recognition programs, 93 00:06:08,680 --> 00:06:10,880 Speaker 2: ran tests on them and found that their performance was 94 00:06:10,920 --> 00:06:14,240 Speaker 2: significantly worse on people of color and significantly worse on 95 00:06:14,279 --> 00:06:16,680 Speaker 2: women of color. So we've known that that's an issue, 96 00:06:16,800 --> 00:06:18,880 Speaker 2: and it's wrapped up in real world scenarios. There have 97 00:06:18,960 --> 00:06:21,760 Speaker 2: been situations in the US where black men have been 98 00:06:21,800 --> 00:06:25,680 Speaker 2: mistakenly arrested because they were flagged by facial recognition software. 
99 00:06:25,839 --> 00:06:29,080 Speaker 2: It turns out it was some completely different person. So 100 00:06:29,120 --> 00:06:32,479 Speaker 2: we know that's a problem. Generative AI is a new 101 00:06:32,520 --> 00:06:35,080 Speaker 2: type of AI, and it's a new wrinkle. So instead 102 00:06:35,080 --> 00:06:38,800 Speaker 2: of AI that scans existing pictures, it's creating new ones, 103 00:06:39,360 --> 00:06:44,000 Speaker 2: and that, we found, also has significant racial and gender biases. 104 00:06:44,279 --> 00:06:47,320 Speaker 2: So the additional, you know, significant issue that this raises 105 00:06:47,400 --> 00:06:51,200 Speaker 2: is we're now using artificial intelligence to create massive volumes 106 00:06:51,240 --> 00:06:54,960 Speaker 2: of new content, then put out into the world for use. 107 00:06:55,720 --> 00:06:59,160 Speaker 2: That new content is demonstrating racial and gender bias, and 108 00:06:59,320 --> 00:07:02,120 Speaker 2: we're adding to a body of content out there, using 109 00:07:02,160 --> 00:07:06,640 Speaker 2: it for reports, using it for clip art, for presentations, 110 00:07:06,680 --> 00:07:08,440 Speaker 2: and it is significantly biased. 111 00:07:11,040 --> 00:07:14,480 Speaker 1: And Leo, how did you go about finding this bias 112 00:07:14,520 --> 00:07:16,960 Speaker 1: in this new form of generative AI? 113 00:07:18,160 --> 00:07:23,960 Speaker 3: As a half reporter but also half former scientist, academic 114 00:07:24,120 --> 00:07:27,160 Speaker 3: and just coder, 
The fact that many of these models, 115 00:07:27,920 --> 00:07:32,400 Speaker 3: generative AI models, are open source was actually very useful 116 00:07:32,560 --> 00:07:36,880 Speaker 3: for just researchers in general, but also reporters, because it 117 00:07:36,920 --> 00:07:41,480 Speaker 3: gives the possibility for anybody to download the generative AI model, 118 00:07:41,560 --> 00:07:46,560 Speaker 3: in our case Stable Diffusion, and ask it to generate images. 119 00:07:47,280 --> 00:07:49,560 Speaker 3: And so what I did is I simply went on 120 00:07:49,760 --> 00:07:53,680 Speaker 3: the Hugging Face platform, which is this really interesting and 121 00:07:53,840 --> 00:07:57,840 Speaker 3: very useful platform that has come out recently that hosts 122 00:07:57,920 --> 00:08:02,440 Speaker 3: all of these models, including open source versions of GPT, for example, 123 00:08:02,760 --> 00:08:07,440 Speaker 3: and Stable Diffusion, and I downloaded the model, and then 124 00:08:07,560 --> 00:08:12,080 Speaker 3: I wrote some code to basically iterate through a series 125 00:08:12,200 --> 00:08:17,360 Speaker 3: of very well known high paying and low paying jobs 126 00:08:17,920 --> 00:08:23,960 Speaker 3: and also different criminalized activities, and just ask the model 127 00:08:24,080 --> 00:08:29,040 Speaker 3: a very simple question: can you generate a color photograph of blank? 128 00:08:29,560 --> 00:08:34,480 Speaker 3: And blank is a judge, an engineer, and a janitor, 129 00:08:34,800 --> 00:08:39,160 Speaker 3: a housekeeper, a fast food worker, for professions, for example. 130 00:08:39,520 --> 00:08:42,760 Speaker 3: And for criminalized activities, we looked at three of them, 131 00:08:42,840 --> 00:08:46,040 Speaker 3: so blank would be a terrorist, a drug dealer, or 132 00:08:46,200 --> 00:08:49,880 Speaker 3: an inmate. 
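[Editor's note: The prompt loop described here can be sketched roughly as follows. This is a minimal illustration, not the code used in the investigation, which is not shown in this transcript. The subject lists contain only the examples named in the interview, the prompt wording is a paraphrase of "color photograph of blank", and `build_jobs`, `run_experiment`, and the `generate` callable are hypothetical names. In practice `generate` would wrap a call to a downloaded text-to-image model; here a stand-in function is used so the sketch runs without one.]

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Subjects named in the interview; the full list used by the reporters
# is not given in the transcript.
PROFESSIONS = ["a judge", "an engineer", "a janitor",
               "a housekeeper", "a fast food worker"]
CRIMINALIZED = ["a terrorist", "a drug dealer", "an inmate"]

# Paraphrase of the prompt described in the interview (an assumption).
PROMPT_TEMPLATE = "a color photograph of {subject}"


def build_jobs(subjects: List[str],
               images_per_subject: int) -> Iterator[Tuple[str, str, int]]:
    """Yield one (subject, prompt, seed) triple per image to generate."""
    for subject in subjects:
        prompt = PROMPT_TEMPLATE.format(subject=subject)
        for seed in range(images_per_subject):
            yield subject, prompt, seed


def run_experiment(generate: Callable[[str, int], object],
                   subjects: List[str],
                   images_per_subject: int) -> Dict[str, list]:
    """Call the (hypothetical) generator once per job, grouped by subject."""
    results: Dict[str, list] = {subject: [] for subject in subjects}
    for subject, prompt, seed in build_jobs(subjects, images_per_subject):
        results[subject].append(generate(prompt, seed))
    return results


if __name__ == "__main__":
    # Stand-in generator: returns a label instead of a real image.
    def fake_generate(prompt: str, seed: int) -> str:
        return f"{prompt} (seed {seed})"

    out = run_experiment(fake_generate, PROFESSIONS + CRIMINALIZED, 3)
    print(len(out), "subjects,", sum(len(v) for v in out.values()), "images")
```

[Repeating the same prompt with a different seed each time is one way to get many distinct images per keyword, which is what turns a handful of anecdotes into a data set that can be analyzed for patterns.]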
I let my computer run for actually an 133 00:08:50,040 --> 00:08:54,120 Speaker 3: entire month, because it's very computationally heavy to generate thousands 134 00:08:54,120 --> 00:08:57,040 Speaker 3: of images for each of those keywords. So that was 135 00:08:57,080 --> 00:08:57,760 Speaker 3: the first step. 136 00:08:58,520 --> 00:09:00,760 Speaker 1: So you would just say to it, make me a 137 00:09:00,800 --> 00:09:04,760 Speaker 1: picture of a CEO and then see what it came 138 00:09:04,840 --> 00:09:05,160 Speaker 1: up with. 139 00:09:06,960 --> 00:09:10,040 Speaker 3: Exactly. But the idea was to do that exact same 140 00:09:10,080 --> 00:09:14,520 Speaker 3: thing thousands of times, so that instead of, you know, 141 00:09:14,720 --> 00:09:20,160 Speaker 3: having anecdotal evidence that the AI might be biased, we 142 00:09:20,200 --> 00:09:24,679 Speaker 3: would actually gather a database of images of the same 143 00:09:24,720 --> 00:09:29,040 Speaker 3: thing over and over and over. Basically that would allow 144 00:09:29,400 --> 00:09:33,520 Speaker 3: us as reporters and as data scientists to then analyze 145 00:09:33,559 --> 00:09:36,920 Speaker 3: those thousands of images and actually find a pattern across 146 00:09:36,960 --> 00:09:39,040 Speaker 3: those images. So that's exactly what we did. 147 00:09:39,600 --> 00:09:41,480 Speaker 1: And what is the pattern that you found when you 148 00:09:41,520 --> 00:09:44,240 Speaker 1: typed in ceo, when you typed in fast food worker 149 00:09:44,280 --> 00:09:47,079 Speaker 1: all of the other things you mentioned, and then asked 150 00:09:47,120 --> 00:09:50,480 Speaker 1: it to show you pictures thousands of times? What did 151 00:09:50,480 --> 00:09:51,160 Speaker 1: it turn up. 152 00:09:51,880 --> 00:09:56,160 Speaker 3: So the pattern is a very stark pattern. 
It's that 153 00:09:56,400 --> 00:10:03,240 Speaker 3: for high paying professions, the generative AI model is overwhelmingly 154 00:10:03,360 --> 00:10:07,439 Speaker 3: generating pictures of white men, and for low paying professions 155 00:10:07,520 --> 00:10:12,679 Speaker 3: it's overwhelmingly generating more pictures of women and darker skinned people. 156 00:10:12,720 --> 00:10:16,920 Speaker 3: So in our analysis we couldn't really talk about race, 157 00:10:17,040 --> 00:10:21,760 Speaker 3: because race is very hard to quantify in images, especially 158 00:10:21,800 --> 00:10:25,480 Speaker 3: when you have images of fake people, essentially, that can't 159 00:10:25,480 --> 00:10:28,400 Speaker 3: really self identify. So you can't say this is a 160 00:10:28,400 --> 00:10:31,160 Speaker 3: black person or this is an Asian person. But what 161 00:10:31,200 --> 00:10:36,520 Speaker 3: you can do is rigorous scientific analysis where you do 162 00:10:36,600 --> 00:10:39,920 Speaker 3: things like average all the pixels of a person's skin 163 00:10:40,280 --> 00:10:43,760 Speaker 3: across all of the images of one profession, for example. And 164 00:10:44,080 --> 00:10:47,120 Speaker 3: doing that, what we found is the pattern of 165 00:10:47,440 --> 00:10:52,839 Speaker 3: darker skinned subjects being overrepresented in low paying professions and 166 00:10:53,000 --> 00:10:58,360 Speaker 3: lighter skinned subjects being overrepresented in high paying professions. And 167 00:10:58,440 --> 00:11:02,840 Speaker 3: the same goes for criminalized activities, where you have darker 168 00:11:02,880 --> 00:11:09,040 Speaker 3: skin tones constantly and systematically being represented in criminalized activities. 169 00:11:11,800 --> 00:11:16,520 Speaker 1: Dina, Leo was talking about Stable Diffusion. Exactly what is 170 00:11:16,559 --> 00:11:17,640 Speaker 1: that and how does it work? 
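[Editor's note: The skin-tone averaging Leo describes above, averaging the skin pixels across all images of one profession, might look something like the sketch below. This is an illustration under stated assumptions, not the reporters' method: the transcript does not say how skin pixels were identified, so the sketch assumes a per-image boolean mask is already available. Images are plain nested lists of RGB tuples, and `average_skin_color` is a hypothetical name.]

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]          # (red, green, blue), 0-255
Image = Sequence[Sequence[Pixel]]     # rows of pixels
Mask = Sequence[Sequence[bool]]       # True where a pixel is skin (assumed given)


def average_skin_color(images: Sequence[Image],
                       masks: Sequence[Mask]) -> Tuple[float, float, float]:
    """Mean RGB over every masked (skin) pixel in all images of one group."""
    totals = [0, 0, 0]
    count = 0
    for image, mask in zip(images, masks):
        for pixel_row, mask_row in zip(image, mask):
            for pixel, is_skin in zip(pixel_row, mask_row):
                if is_skin:
                    for channel in range(3):
                        totals[channel] += pixel[channel]
                    count += 1
    if count == 0:
        raise ValueError("no skin pixels selected")
    return (totals[0] / count, totals[1] / count, totals[2] / count)


if __name__ == "__main__":
    # Two tiny 1x2 "images" for a single profession.
    images = [
        [[(200, 150, 100), (0, 0, 0)]],
        [[(100, 50, 0), (0, 100, 50)]],
    ]
    masks = [
        [[True, False]],   # only the first pixel is skin
        [[True, True]],
    ]
    print(average_skin_color(images, masks))
```

[Pooling pixels this way weights each image by how many skin pixels it contains. Averaging per image first and then across images would weight every image equally; the transcript does not say which choice was made.]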
171 00:11:18,440 --> 00:11:22,360 Speaker 2: So Stable Diffusion is a text to image program that 172 00:11:22,559 --> 00:11:26,360 Speaker 2: is open source. It's distributed by a company called Stability AI, 173 00:11:27,000 --> 00:11:29,559 Speaker 2: and the version that we used is, as Leo mentioned, hosted 174 00:11:29,600 --> 00:11:32,120 Speaker 2: on Hugging Face, which is basically a repository of open 175 00:11:32,160 --> 00:11:34,240 Speaker 2: source AI models. So some of your listeners may have 176 00:11:34,280 --> 00:11:37,880 Speaker 2: heard of GitHub, which is a repository of programming code. 177 00:11:38,360 --> 00:11:40,440 Speaker 2: Hugging Face tries to be sort of like a version 178 00:11:40,480 --> 00:11:43,240 Speaker 2: of that for AI models, and a lot of your 179 00:11:43,240 --> 00:11:46,120 Speaker 2: listeners may have actually heard of a different image generation program, 180 00:11:46,200 --> 00:11:49,640 Speaker 2: which is OpenAI's DALL-E. DALL-E 2, the second version 181 00:11:49,640 --> 00:11:52,719 Speaker 2: of it, came out in wider distribution last year in 182 00:11:52,760 --> 00:11:55,040 Speaker 2: around July. It was announced a bit earlier than that, 183 00:11:55,520 --> 00:11:58,439 Speaker 2: and that was also very popular and attracted a lot 184 00:11:58,440 --> 00:12:01,720 Speaker 2: of attention. Stable Diffusion followed that and came out as 185 00:12:01,760 --> 00:12:04,240 Speaker 2: an open source version, and because it was open source, 186 00:12:04,320 --> 00:12:07,880 Speaker 2: it's been very widely used. In order to use the 187 00:12:07,920 --> 00:12:11,760 Speaker 2: OpenAI version, you know, for commercial applications, you 188 00:12:11,800 --> 00:12:13,000 Speaker 2: have to work with OpenAI. 189 00:12:13,120 --> 00:12:14,920 Speaker 4: You have to pay for that, and so it's a 190 00:12:14,960 --> 00:12:15,560 Speaker 4: little different. 
191 00:12:16,080 --> 00:12:17,520 Speaker 2: I just want to talk for a minute about why 192 00:12:17,520 --> 00:12:20,040 Speaker 2: we did not look at OpenAI's DALL-E, and that's 193 00:12:20,040 --> 00:12:23,559 Speaker 2: because it's not open source, so we can't tell what 194 00:12:23,679 --> 00:12:24,840 Speaker 2: is in the training data 195 00:12:24,880 --> 00:12:26,559 Speaker 4: for DALL-E in a way that we 196 00:12:26,600 --> 00:12:29,280 Speaker 2: can for Stable Diffusion, and there are greater limits on 197 00:12:29,320 --> 00:12:31,080 Speaker 2: what you can do with it. 198 00:12:31,080 --> 00:12:33,160 Speaker 4: It was sort of difficult to look at the bias 199 00:12:33,320 --> 00:12:34,160 Speaker 4: there. 200 00:12:34,200 --> 00:12:36,079 Speaker 1: And Dina, by open source, what exactly do you mean? 201 00:12:37,040 --> 00:12:39,920 Speaker 2: So it's basically the opposite of proprietary software. 202 00:12:40,040 --> 00:12:42,640 Speaker 4: It's freely distributed. It's openly distributed. 203 00:12:42,720 --> 00:12:45,560 Speaker 2: Anyone can download it, use it, and in the case 204 00:12:45,559 --> 00:12:47,800 Speaker 2: of AI models, you have greater freedom to play with 205 00:12:47,880 --> 00:12:50,880 Speaker 2: it, to tweak different parts of the AI model to 206 00:12:50,960 --> 00:12:51,640 Speaker 2: what you need. 207 00:12:52,400 --> 00:12:56,320 Speaker 3: You can see exactly the code or the data that 208 00:12:56,640 --> 00:13:00,319 Speaker 3: is going behind an AI model, and you can see 209 00:13:00,320 --> 00:13:04,200 Speaker 3: the different versions of the model over time, and that's 210 00:13:04,360 --> 00:13:07,240 Speaker 3: very important for people who are trying to improve these 211 00:13:07,280 --> 00:13:13,000 Speaker 3: things because you can basically have some sort of version control, 212 00:13:13,080 --> 00:13:16,079 Speaker 3: so control fork. 
The previous version used to be like this, 213 00:13:16,320 --> 00:13:18,480 Speaker 3: and now we've improved it, and now we can see 214 00:13:18,559 --> 00:13:22,720 Speaker 3: clearly the difference between the new version and the previous version. Actually, 215 00:13:22,760 --> 00:13:26,720 Speaker 3: for this story, we did interview prominent academics within this field, 216 00:13:27,040 --> 00:13:30,599 Speaker 3: and they've all really stressed this point that one of 217 00:13:30,640 --> 00:13:34,720 Speaker 3: the only ways to address the problem of bias is 218 00:13:34,840 --> 00:13:38,160 Speaker 3: to start by having open source models, because then those 219 00:13:38,200 --> 00:13:41,800 Speaker 3: models can be taken by other academics or other organizations 220 00:13:41,800 --> 00:13:45,640 Speaker 3: that are also transparent, and whatever they do to them 221 00:13:46,200 --> 00:13:50,040 Speaker 3: to quote unquote improve them is now again made transparent, 222 00:13:50,240 --> 00:13:55,840 Speaker 3: made very publicly known, and available to yet more academics 223 00:13:55,840 --> 00:13:56,880 Speaker 3: to improve upon it. 224 00:13:56,920 --> 00:14:01,360 Speaker 4: Again, there's also greater auditability. 225 00:14:01,760 --> 00:14:04,720 Speaker 2: The reason that we were able to run this experiment 226 00:14:04,720 --> 00:14:07,480 Speaker 2: on Stable Diffusion is that it's open source. So, you know, 227 00:14:07,520 --> 00:14:12,320 Speaker 2: we obviously found some significant problems, but there is that auditability. 228 00:14:12,400 --> 00:14:15,240 Speaker 2: You don't have that with OpenAI's DALL-E. And so again, 
229 00:14:15,320 --> 00:14:18,119 Speaker 2: OpenAI has said that they're taking steps to address 230 00:14:18,360 --> 00:14:21,880 Speaker 2: representation and make sure that the outputs are representative, but 231 00:14:22,040 --> 00:14:24,200 Speaker 2: you kind of have to trust them because you don't 232 00:14:24,200 --> 00:14:24,920 Speaker 2: know what they're doing. 233 00:14:26,760 --> 00:14:29,640 Speaker 1: After the break, what's the data set behind these AI 234 00:14:29,760 --> 00:14:42,160 Speaker 1: generated images? We know from everything we've been hearing all 235 00:14:42,200 --> 00:14:46,800 Speaker 1: about ChatGPT, now GPT-4, that it takes as 236 00:14:46,840 --> 00:14:51,120 Speaker 1: its source enormous amounts of data that exists on the Internet. 237 00:14:51,240 --> 00:14:55,880 Speaker 1: What is the source material for generative AI when it 238 00:14:55,920 --> 00:14:57,400 Speaker 1: comes to images? 239 00:14:58,080 --> 00:15:03,000 Speaker 3: So, the source material for most generative AI models, these 240 00:15:03,120 --> 00:15:06,960 Speaker 3: so called large language models, it's basically the entire Internet. 241 00:15:07,640 --> 00:15:11,040 Speaker 3: In simple terms, it's everything that's been posted on the 242 00:15:11,040 --> 00:15:14,720 Speaker 3: Internet in the past ten, fifteen years. The way that 243 00:15:14,800 --> 00:15:18,840 Speaker 3: works is that there is a data set called LAION, 244 00:15:19,400 --> 00:15:24,600 Speaker 3: which basically collected URLs to images or texts for the 245 00:15:24,640 --> 00:15:27,000 Speaker 3: past fifteen years all over the Internet. 246 00:15:27,800 --> 00:15:30,800 Speaker 2: When you're training on data from across the entire Internet, 247 00:15:31,000 --> 00:15:32,640 Speaker 2: as most of us know, there's a fair amount of 248 00:15:32,760 --> 00:15:35,600 Speaker 2: unsavory stuff out there on the Internet. 
And you know, 249 00:15:35,680 --> 00:15:38,800 Speaker 2: there's been some academic work done on the earlier version 250 00:15:38,840 --> 00:15:41,000 Speaker 2: of this data set, this LAION data set, that 251 00:15:41,200 --> 00:15:47,040 Speaker 2: found pornography, violence, again, racial and gender bias. When certain 252 00:15:47,200 --> 00:15:50,240 Speaker 2: terms that were associated with certain races were used, it 253 00:15:50,280 --> 00:15:53,200 Speaker 2: was much more likely to bring up an image that 254 00:15:53,400 --> 00:15:54,360 Speaker 2: was sexualized. 255 00:15:54,840 --> 00:15:56,440 Speaker 4: So there are a lot of problems 256 00:15:56,040 --> 00:15:59,240 Speaker 2: within that data set, and it is an openly available, 257 00:15:59,280 --> 00:16:01,640 Speaker 2: open source data set, and so the viewpoint of the 258 00:16:01,640 --> 00:16:03,880 Speaker 2: people behind it is, look, you know, you should use 259 00:16:03,920 --> 00:16:05,960 Speaker 2: this for academic work. If you're using this in a 260 00:16:05,960 --> 00:16:09,080 Speaker 2: commercial product, you've got to actually take some responsibility for 261 00:16:09,160 --> 00:16:11,520 Speaker 2: the content, and we've made some not safe 262 00:16:11,280 --> 00:16:14,000 Speaker 4: for work filters. There are steps you can take, but, you 263 00:16:13,920 --> 00:16:16,200 Speaker 2: know, when you're training a model on a large volume 264 00:16:16,240 --> 00:16:18,640 Speaker 2: of data from across the entire Internet, there is a 265 00:16:18,640 --> 00:16:21,120 Speaker 2: lot of unsavory stuff in there, and there are way, 266 00:16:21,440 --> 00:16:23,920 Speaker 2: way too many images in this data set for anybody 267 00:16:24,000 --> 00:16:26,160 Speaker 2: to go through it and make sure that they're cleaning 268 00:16:26,200 --> 00:16:26,520 Speaker 2: it up. 
269 00:16:28,640 --> 00:16:32,400 Speaker 1: Dina, what does Stable Diffusion say about your findings about 270 00:16:32,560 --> 00:16:34,640 Speaker 1: this bias in their data? 271 00:16:35,280 --> 00:16:37,800 Speaker 2: We reached out to Stable Diffusion and explained what we 272 00:16:37,800 --> 00:16:40,320 Speaker 2: were finding, and they sent us an emailed statement 273 00:16:40,360 --> 00:16:43,600 Speaker 2: from a spokesperson saying that, quote, all AI models have inherent 274 00:16:43,680 --> 00:16:46,560 Speaker 2: biases that are representative of the data sets they're trained on, 275 00:16:46,880 --> 00:16:49,480 Speaker 2: and by open sourcing our models, we aim to support 276 00:16:49,520 --> 00:16:53,160 Speaker 2: the AI community and collaborate to improve bias evaluation techniques 277 00:16:53,200 --> 00:16:56,760 Speaker 2: and develop solutions beyond the basic prompt modification. The company 278 00:16:56,800 --> 00:16:59,080 Speaker 2: also told us that, you know, they have sort of 279 00:16:59,080 --> 00:17:02,000 Speaker 2: an initiative to develop some open source models that will 280 00:17:02,000 --> 00:17:04,679 Speaker 2: be trained on data sets that are specific to different 281 00:17:04,680 --> 00:17:07,679 Speaker 2: countries and cultures, and so part of the argument the 282 00:17:07,680 --> 00:17:10,240 Speaker 2: company was making is that the open source nature of 283 00:17:10,359 --> 00:17:12,920 Speaker 2: what they're doing will enable them to address some of 284 00:17:12,960 --> 00:17:15,800 Speaker 2: these issues by getting more and more data that is 285 00:17:16,040 --> 00:17:17,439 Speaker 2: more diverse than what they currently have. 286 00:17:18,359 --> 00:17:21,960 Speaker 1: Dina, we can see why bias would be so harmful, 287 00:17:22,359 --> 00:17:25,639 Speaker 1: especially when it comes to images, which are very powerful. 
288 00:17:25,840 --> 00:17:28,800 Speaker 1: What are some of the real world downsides that we 289 00:17:29,000 --> 00:17:33,480 Speaker 1: see with the possibility of fake images, bias in images, 290 00:17:34,359 --> 00:17:36,480 Speaker 1: being proliferated all over the world? 291 00:17:38,280 --> 00:17:40,439 Speaker 2: There is an issue of deep fakes, things that are 292 00:17:40,480 --> 00:17:43,280 Speaker 2: meant to mislead people, misinformation that you can't tell is 293 00:17:43,320 --> 00:17:47,199 Speaker 2: AI generated. With the specific issue of bias, there's a 294 00:17:47,280 --> 00:17:49,560 Speaker 2: number of issues that crop up here. One is a 295 00:17:49,600 --> 00:17:52,920 Speaker 2: representation one. So if we're going to start using all 296 00:17:52,920 --> 00:17:58,240 Speaker 2: of these synthetic generated images for brochures, for advertisements, for 297 00:17:58,320 --> 00:18:02,399 Speaker 2: marketing materials, and we're already seeing what happens when the 298 00:18:02,480 --> 00:18:06,359 Speaker 2: marketing materials have all the CEOs be white men, doesn't 299 00:18:06,400 --> 00:18:09,679 Speaker 2: that worsen the situation that we already have? You know, 300 00:18:09,800 --> 00:18:12,320 Speaker 2: one of the things that we found in this experiment 301 00:18:12,440 --> 00:18:16,040 Speaker 2: was that the bias in Stable Diffusion was actually 302 00:18:16,119 --> 00:18:17,960 Speaker 2: worse than the real world. So we know that there 303 00:18:17,960 --> 00:18:21,800 Speaker 2: are fewer female CEOs, but the number of female CEOs 304 00:18:21,800 --> 00:18:24,679 Speaker 2: that were being generated in these experiments was even lower 305 00:18:24,720 --> 00:18:28,040 Speaker 2: than the real world. So we're looking at a situation 306 00:18:28,240 --> 00:18:31,760 Speaker 2: where we're generating more and more content via AI, more 307 00:18:31,800 --> 00:18:35,000 Speaker 2: and more of these synthetic images. 
Those images become a 308 00:18:35,040 --> 00:18:37,520 Speaker 2: part of the body of images, the body of work 309 00:18:37,600 --> 00:18:40,879 Speaker 2: that's on the internet. They are more biased than reality, 310 00:18:41,320 --> 00:18:44,240 Speaker 2: and then in the future those images get fed back 311 00:18:44,280 --> 00:18:47,280 Speaker 2: into future AI systems, so that you end up in 312 00:18:47,320 --> 00:18:50,399 Speaker 2: this nasty cycle where the bias is getting worse and 313 00:18:50,480 --> 00:18:53,880 Speaker 2: worse and being fed back into future systems which are 314 00:18:53,920 --> 00:18:54,920 Speaker 2: then less diverse. 315 00:18:57,200 --> 00:19:00,800 Speaker 3: So there was a recent Europol report that suggested that 316 00:19:00,880 --> 00:19:04,840 Speaker 3: by twenty twenty six, ninety percent of all online content 317 00:19:05,000 --> 00:19:08,959 Speaker 3: could be artificially generated. What happens when ninety percent of 318 00:19:09,040 --> 00:19:14,119 Speaker 3: all online images are images reinforcing those stereotypes? One of 319 00:19:14,200 --> 00:19:18,560 Speaker 3: the main impacts can really affect people's mental health and 320 00:19:18,600 --> 00:19:21,560 Speaker 3: how they project themselves into the world and, you know, 321 00:19:21,600 --> 00:19:25,399 Speaker 3: what kind of jobs that they see themselves doing in life. 322 00:19:25,440 --> 00:19:29,119 Speaker 3: So that's a really big issue that can definitely be 323 00:19:29,520 --> 00:19:34,040 Speaker 3: reinforced by this problem. 324 00:19:32,560 --> 00:19:37,080 Speaker 1: When we come back, how can artificial intelligence become more intelligent? 325 00:19:45,960 --> 00:19:50,080 Speaker 1: Dina, earlier Leo said that these open source models have 326 00:19:50,160 --> 00:19:53,000 Speaker 1: one advantage, which is that everybody is able to kind 327 00:19:53,000 --> 00:19:56,240 Speaker 1: of work on them and improve them. And if you are, say, 
And if you, say, 328 00:19:56,840 --> 00:20:00,440 Speaker 1: had an advertising agency making a brochure and you ask 329 00:20:00,520 --> 00:20:03,040 Speaker 1: it to create a CEO and it's a white male, 330 00:20:03,400 --> 00:20:05,600 Speaker 1: can't you say, no, that's not the image I'm looking 331 00:20:05,600 --> 00:20:08,200 Speaker 1: for, that there's a certain amount of responsibility of people 332 00:20:08,240 --> 00:20:12,240 Speaker 1: who are generating these images not to just simply accept 333 00:20:12,400 --> 00:20:15,960 Speaker 1: what the generative AI bot spits out? 334 00:20:16,560 --> 00:20:18,760 Speaker 2: When we talk about AI bias, a lot of the 335 00:20:18,880 --> 00:20:21,919 Speaker 2: quote unquote blame for it gets put on the data sets. 336 00:20:22,320 --> 00:20:26,120 Speaker 2: There needs to also be accountability from users at all levels, 337 00:20:26,160 --> 00:20:28,320 Speaker 2: and that includes the people that are creating the models, 338 00:20:28,400 --> 00:20:31,119 Speaker 2: the researchers that are working on the models, who have 339 00:20:31,200 --> 00:20:34,240 Speaker 2: their own biases that get kind of imprinted on these models, 340 00:20:34,480 --> 00:20:36,320 Speaker 2: and it includes the people that are using them. 341 00:20:36,480 --> 00:20:39,560 Speaker 2: At the end of the day, it's not totally clear 342 00:20:39,320 --> 00:20:41,439 Speaker 2: to me that you can currently use these models that 343 00:20:41,520 --> 00:20:44,399 Speaker 2: effectively to even specify in that way and get the 344 00:20:44,400 --> 00:20:45,359 Speaker 2: output you want. 345 00:20:46,280 --> 00:20:48,240 Speaker 1: So do you know what can actually be done to 346 00:20:48,240 --> 00:20:50,520 Speaker 1: fix this? We talked earlier about how there's a lot 347 00:20:50,560 --> 00:20:53,360 Speaker 1: of work being done to improve these models.
348 00:20:54,960 --> 00:20:56,879 Speaker 2: One of the things that needs to be done is 349 00:20:57,160 --> 00:21:00,560 Speaker 2: increased diversification of the data set. There needs to be a way 350 00:21:00,600 --> 00:21:05,200 Speaker 2: to get data from other countries, other cultures, and there 351 00:21:05,200 --> 00:21:06,720 Speaker 2: needs to be, to be clear, a way 352 00:21:06,560 --> 00:21:08,240 Speaker 2: to do that that's ethical. 353 00:21:08,520 --> 00:21:10,479 Speaker 2: There have been projects or companies that have tried to 354 00:21:10,560 --> 00:21:12,760 Speaker 2: source a more diverse set of data, but they've done 355 00:21:12,760 --> 00:21:15,560 Speaker 2: it in unethical ways. They've tried to get images of people, 356 00:21:15,560 --> 00:21:17,639 Speaker 2: and they've done it without consent. This has sort of 357 00:21:17,640 --> 00:21:20,119 Speaker 2: cropped up in the facial recognition era when people are 358 00:21:20,119 --> 00:21:23,240 Speaker 2: trying to fix those systems. There's also a question about 359 00:21:23,280 --> 00:21:25,760 Speaker 2: the largeness of all of these models. So the current 360 00:21:25,800 --> 00:21:28,560 Speaker 2: trend in AI is that bigger is better, that the 361 00:21:28,600 --> 00:21:31,320 Speaker 2: only way to do these kind of foundational models is 362 00:21:31,359 --> 00:21:34,040 Speaker 2: to have the sum total of the Internet dumped into 363 00:21:34,160 --> 00:21:37,240 Speaker 2: the training data. There are people working on ways to 364 00:21:37,359 --> 00:21:40,840 Speaker 2: do better, smaller models, in which case you have greater 365 00:21:40,840 --> 00:21:42,879 Speaker 2: control over what is in the data set and you 366 00:21:42,880 --> 00:21:45,360 Speaker 2: can do things that are more targeted.
If we move 367 00:21:45,440 --> 00:21:48,199 Speaker 2: to optimizing the technology where you don't just have to 368 00:21:48,200 --> 00:21:50,840 Speaker 2: add more volume in order to have a better performing 369 00:21:51,000 --> 00:21:56,160 Speaker 2: algorithm or model, that could help as well. 370 00:21:56,280 --> 00:22:00,680 Speaker 1: Leo, as somebody who is deep in this data and 371 00:22:00,960 --> 00:22:04,240 Speaker 1: watching how it's developing very rapidly, what are you watching 372 00:22:04,359 --> 00:22:07,119 Speaker 1: for as this keeps unfolding? 373 00:22:07,800 --> 00:22:12,280 Speaker 3: One of the most interesting developments is really the open 374 00:22:12,320 --> 00:22:17,400 Speaker 3: source versus closed source models, and, you know, which are 375 00:22:17,440 --> 00:22:20,719 Speaker 3: going to become the status quo, because, you know, it's 376 00:22:20,800 --> 00:22:24,479 Speaker 3: not clear right now. It's very easy to use closed 377 00:22:24,520 --> 00:22:28,600 Speaker 3: source models in some way because they have better user interfaces, 378 00:22:28,640 --> 00:22:32,960 Speaker 3: and they market it better, and, you know, it's 379 00:22:32,960 --> 00:22:37,280 Speaker 3: for profit, so they have all these ways to kind 380 00:22:37,280 --> 00:22:41,200 Speaker 3: of like get really mainstream. But at the same time, 381 00:22:41,680 --> 00:22:44,720 Speaker 3: the open source models are being used by millions of people, 382 00:22:45,240 --> 00:22:48,280 Speaker 3: not just people, you know, that are using them like 383 00:22:48,400 --> 00:22:52,560 Speaker 3: as developers or researchers.
And then we also see 384 00:22:52,600 --> 00:22:56,560 Speaker 3: private companies adapting open source models as opposed to closed 385 00:22:56,560 --> 00:22:59,840 Speaker 3: source models because they actually recognize the fact that they 386 00:22:59,840 --> 00:23:02,879 Speaker 3: can build on top of those models within their systems. 387 00:23:03,240 --> 00:23:06,879 Speaker 3: It's very unclear and it will be interesting to see 388 00:23:06,920 --> 00:23:10,240 Speaker 3: if, five or ten years from now, generative AI has become 389 00:23:10,359 --> 00:23:14,120 Speaker 3: completely a closed source thing because it's easier to regulate, 390 00:23:14,200 --> 00:23:17,080 Speaker 3: you can just regulate private companies and tell them what 391 00:23:17,119 --> 00:23:20,120 Speaker 3: to do, or it's become an open source thing because 392 00:23:20,240 --> 00:23:23,760 Speaker 3: there's more transparency and it's easier to see if things 393 00:23:23,760 --> 00:23:24,760 Speaker 3: are getting better or not. 394 00:23:26,119 --> 00:23:29,240 Speaker 1: Leo, Dina, thanks so much for coming on the show. 395 00:23:30,119 --> 00:23:32,280 Speaker 4: Thank you, Wes, thank you for having us. 396 00:23:33,080 --> 00:23:35,040 Speaker 1: Thanks for listening to us here at the Big Take. 397 00:23:35,160 --> 00:23:38,440 Speaker 1: It's a daily podcast from Bloomberg and iHeartRadio. For more 398 00:23:38,440 --> 00:23:42,560 Speaker 1: shows from iHeartRadio, visit the iHeartRadio app, Apple Podcasts or 399 00:23:42,600 --> 00:23:45,440 Speaker 1: wherever you listen. And we'd love to hear from you. 400 00:23:45,520 --> 00:23:48,960 Speaker 1: Email us questions or comments to Big Take at Bloomberg 401 00:23:48,960 --> 00:23:52,040 Speaker 1: dot net. The supervising producer of The Big Take is 402 00:23:52,160 --> 00:23:56,040 Speaker 1: Vicki Vergolina. Our senior producer is Catherine Fink.
Frederica 403 00:23:56,119 --> 00:24:01,120 Speaker 1: Romanello is our producer. Our associate producer is Zaynab Siddiqui. Raphael 404 00:24:01,200 --> 00:24:04,680 Speaker 1: M Seely is our engineer. Our original music was composed 405 00:24:04,680 --> 00:24:08,080 Speaker 1: by Leo Sidrin. I'm Wes Kasova. We'll be back tomorrow 406 00:24:08,200 --> 00:24:09,400 Speaker 1: with another Big Take.