1
00:00:00,960 --> 00:00:10,520
Speaker 1: Okay, make a photo of a CEO, and enter. All

2
00:00:10,600 --> 00:00:18,160
Speaker 1: right, results. Oh, it made four different pictures and all

3
00:00:18,280 --> 00:00:23,440
Speaker 1: of them appear to be light skinned men in suits. Okay,

4
00:00:23,560 --> 00:00:30,960
Speaker 1: do that again. Photo of a CEO. And four

5
00:00:31,360 --> 00:00:36,239
Speaker 1: more people who look like men to me. This is

6
00:00:36,280 --> 00:00:40,800
Speaker 1: my very first time using an AI image generator. It's

7
00:00:40,840 --> 00:00:43,960
Speaker 1: through a free website on my laptop. Essentially, it's an

8
00:00:44,000 --> 00:00:48,000
Speaker 1: AI model like ChatGPT, but instead of creating AI

9
00:00:48,120 --> 00:00:53,400
Speaker 1: generated text, it produces AI generated images. I type into

10
00:00:53,400 --> 00:00:55,240
Speaker 1: a search box what I want to see, and a

11
00:00:55,240 --> 00:00:59,120
Speaker 1: few seconds later it produces pictures that the model believes

12
00:00:59,200 --> 00:01:02,760
Speaker 1: are what I'm asking for. All right, let's do

13
00:01:02,840 --> 00:01:11,120
Speaker 1: this again. Photo of a physician and looks to me

14
00:01:11,319 --> 00:01:16,240
Speaker 1: like men, but in lab coats and scrubs. Oh, and

15
00:01:16,319 --> 00:01:19,400
Speaker 1: you know they're doctors because they all have stethoscopes around

16
00:01:19,440 --> 00:01:24,800
Speaker 1: their necks. I requested images for several different jobs and

17
00:01:24,920 --> 00:01:27,920
Speaker 1: repeated the requests at least ten times for each one.

18
00:01:28,360 --> 00:01:32,440
Speaker 1: The results were eye opening. Almost all the images of

19
00:01:32,560 --> 00:01:35,560
Speaker 1: CEOs and doctors the model produced, at least when I

20
00:01:35,680 --> 00:01:39,160
Speaker 1: was using it, appeared to me to be men. All

21
00:01:39,280 --> 00:01:43,480
Speaker 1: nurses and almost all teachers appeared to be women. By

22
00:01:43,520 --> 00:01:45,840
Speaker 1: the way, I'm saying appeared to be women or men,

23
00:01:45,920 --> 00:01:49,880
Speaker 1: because these images are people who don't actually exist, and

24
00:01:49,920 --> 00:01:54,760
Speaker 1: so identifiers like gender, race, and ethnicity are subjective and

25
00:01:54,800 --> 00:01:58,600
Speaker 1: we'll talk more about that a bit later. Most images

26
00:01:58,640 --> 00:02:01,360
Speaker 1: of attorneys the model produced looked to me to be light

27
00:02:01,440 --> 00:02:05,680
Speaker 1: skinned men. Images of scientists appeared to be more diverse,

28
00:02:05,760 --> 00:02:09,360
Speaker 1: but they also mostly looked like men, and this is

29
00:02:09,440 --> 00:02:13,079
Speaker 1: really weird. When the model did generate pictures of attorneys

30
00:02:13,160 --> 00:02:16,239
Speaker 1: or scientists who looked to me like women, they were

31
00:02:16,400 --> 00:02:20,520
Speaker 1: very often shown dressed in traditional men's clothing, like business

32
00:02:20,520 --> 00:02:24,200
Speaker 1: suits and wearing neckties, as though the model couldn't fully

33
00:02:24,280 --> 00:02:28,519
Speaker 1: accept the concept of a woman in those professions. AI

34
00:02:28,600 --> 00:02:32,560
Speaker 1: sometimes generates a distorted version of reality that doesn't look

35
00:02:32,639 --> 00:02:34,919
Speaker 1: like the world we live in, and it can perpetuate

36
00:02:35,040 --> 00:02:39,280
Speaker 1: gender and racial stereotypes. This matters because, as we know,

37
00:02:39,600 --> 00:02:42,440
Speaker 1: AI is fast working its way into our lives, and

38
00:02:42,480 --> 00:02:46,360
Speaker 1: AI generated images that can make us believe something artificial

39
00:02:46,520 --> 00:02:52,160
Speaker 1: is actually something real, may be especially influential and potentially harmful.

40
00:02:53,160 --> 00:02:56,400
Speaker 2: So we're looking at a situation where we're generating more

41
00:02:56,440 --> 00:02:59,200
Speaker 2: and more content via AI, more and more of these

42
00:02:59,200 --> 00:03:02,959
Speaker 2: synthetic images. Those images become a part of the body

43
00:03:03,000 --> 00:03:05,560
Speaker 2: of images, the body of work that's on the internet,

44
00:03:05,919 --> 00:03:08,760
Speaker 2: they are more biased than reality. And then in the

45
00:03:08,800 --> 00:03:12,840
Speaker 2: future those images get fed back into future AI systems,

46
00:03:13,200 --> 00:03:16,360
Speaker 2: so that you end up in this nasty cycle where

47
00:03:16,440 --> 00:03:18,680
Speaker 2: the bias is getting worse and worse and being fed

48
00:03:18,720 --> 00:03:22,240
Speaker 2: back into future systems which are then less diverse.

49
00:03:22,800 --> 00:03:26,359
Speaker 3: So there was a recent Europol report that suggested that

50
00:03:26,440 --> 00:03:30,440
Speaker 3: by twenty twenty six, ninety percent of all online content

51
00:03:30,600 --> 00:03:34,520
Speaker 3: could be artificially generated. What happens when ninety percent of

52
00:03:34,639 --> 00:03:39,040
Speaker 3: all online images are images reinforcing those stereotypes?

53
00:03:39,480 --> 00:03:43,800
Speaker 1: Bloomberg's Dina Bass and Leonardo Nicoletti dug deep into the

54
00:03:43,880 --> 00:03:47,000
Speaker 1: data to find out why the results look like this

55
00:03:47,440 --> 00:03:50,240
Speaker 1: and what can be done to fix the shortcomings of

56
00:03:50,280 --> 00:03:58,960
Speaker 1: this rapidly emerging technology. I'm Wes Kosova. Today on the

57
00:03:58,960 --> 00:04:02,760
Speaker 1: Big Take: you can't trust your eyes when it comes

58
00:04:02,760 --> 00:04:08,320
Speaker 1: to AI. Leo, maybe you can start by just giving

59
00:04:08,360 --> 00:04:12,440
Speaker 1: us an overview of what you found in this investigation.

60
00:04:13,560 --> 00:04:17,240
Speaker 3: This investigation essentially looks at generative AI, which is a

61
00:04:17,279 --> 00:04:20,919
Speaker 3: new type of AI. It's like ChatGPT, where you

62
00:04:21,400 --> 00:04:23,840
Speaker 3: ask it a question and it just answers you or

63
00:04:23,920 --> 00:04:26,479
Speaker 3: gives you the information you need. In our case, we

64
00:04:26,600 --> 00:04:30,200
Speaker 3: use Stable Diffusion, which is similar to ChatGPT, but

65
00:04:30,279 --> 00:04:33,120
Speaker 3: it's instead of text to text, which means like using

66
00:04:33,160 --> 00:04:36,240
Speaker 3: texts to generate more text, it's text to image where

67
00:04:36,279 --> 00:04:39,520
Speaker 3: you ask it a question or give it a description,

68
00:04:39,760 --> 00:04:42,920
Speaker 3: and then it will generate an image for you of

69
00:04:43,680 --> 00:04:47,920
Speaker 3: what you're looking for. And well, you know that gives

70
00:04:48,000 --> 00:04:51,640
Speaker 3: us lots of possibilities and opens lots of doors for work,

71
00:04:51,720 --> 00:04:56,480
Speaker 3: for design, for artistic advertising, lots of purposes. What we

72
00:04:56,600 --> 00:05:00,279
Speaker 3: found is that it also has very strong biases

73
00:05:00,560 --> 00:05:05,880
Speaker 3: against people of color, women in general. And so what

74
00:05:05,920 --> 00:05:09,040
Speaker 3: we wanted to show through this piece is really how

75
00:05:09,160 --> 00:05:14,480
Speaker 3: biased is generative AI, and specifically generative AI that creates

76
00:05:14,520 --> 00:05:19,279
Speaker 3: images and visual representations. And you know to what extent

77
00:05:19,440 --> 00:05:23,240
Speaker 3: are these biases ingrained in this technology? And you know

78
00:05:23,279 --> 00:05:25,080
Speaker 3: what are the potential implications of that.

79
00:05:25,960 --> 00:05:28,800
Speaker 1: This is significant because it's different from the kind of

80
00:05:28,920 --> 00:05:32,560
Speaker 1: facial recognition that we already know about. Is that right?

81
00:05:33,279 --> 00:05:33,600
Speaker 4: It is.

82
00:05:33,680 --> 00:05:37,440
Speaker 2: So somewhere around twenty eighteen we started finding out that

83
00:05:37,520 --> 00:05:42,159
Speaker 2: facial recognition software had significant racial and gender biases. And

84
00:05:42,480 --> 00:05:45,680
Speaker 2: what that software is: you have a picture, an image,

85
00:05:46,000 --> 00:05:49,239
Speaker 2: and the AI scans it and tries to predict what's

86
00:05:49,360 --> 00:05:49,680
Speaker 2: in it.

87
00:05:49,880 --> 00:05:51,680
Speaker 4: You know, what am I looking at? Is it a

88
00:05:51,720 --> 00:05:54,960
Speaker 4: black cat? Is it a cheeseburger? Is it a white woman?

89
00:05:55,040 --> 00:05:56,080
Speaker 4: What am I looking at?

90
00:05:56,920 --> 00:06:01,480
Speaker 2: In twenty eighteen, a couple of researchers, Joy Buolamwini, Timnit Gebru,

91
00:06:01,640 --> 00:06:05,200
Speaker 2: and then Deborah Raji, combined to do some work called Gender Shades,

92
00:06:05,200 --> 00:06:08,400
Speaker 2: where they ran a bunch of the popular facial recognition programs,

93
00:06:08,680 --> 00:06:10,880
Speaker 2: ran tests on them and found that their performance was

94
00:06:10,920 --> 00:06:14,240
Speaker 2: significantly worse on people of color and significantly worse on

95
00:06:14,279 --> 00:06:16,680
Speaker 2: women of color. So we've known that that's an issue,

96
00:06:16,800 --> 00:06:18,880
Speaker 2: and it's wrapped up in real world scenarios. There have

97
00:06:18,960 --> 00:06:21,760
Speaker 2: been situations in the US where black men have been

98
00:06:21,800 --> 00:06:25,680
Speaker 2: mistakenly arrested because they were flagged by facial recognition software.

99
00:06:25,839 --> 00:06:29,080
Speaker 2: It turns out it was some completely different person. So

100
00:06:29,120 --> 00:06:32,479
Speaker 2: we know that's a problem. Generative AI is a new

101
00:06:32,520 --> 00:06:35,080
Speaker 2: type of AI, and it's a new wrinkle. So instead

102
00:06:35,080 --> 00:06:38,800
Speaker 2: of AI that scans existing pictures, it's creating new ones

103
00:06:39,360 --> 00:06:44,000
Speaker 2: and that we found also has significant racial and gender biases.

104
00:06:44,279 --> 00:06:47,320
Speaker 2: So the additional, you know, significant issue that this raises

105
00:06:47,400 --> 00:06:51,200
Speaker 2: is we're now using artificial intelligence to create massive volumes

106
00:06:51,240 --> 00:06:54,960
Speaker 2: of new content, then put out into the world for use.

107
00:06:55,720 --> 00:06:59,160
Speaker 2: That new content is demonstrating racial and gender bias, and

108
00:06:59,320 --> 00:07:02,120
Speaker 2: we're adding to a body of content out there, using

109
00:07:02,160 --> 00:07:06,640
Speaker 2: it for reports, using it for clip art, for presentations,

110
00:07:06,680 --> 00:07:08,440
Speaker 2: and it is significantly biased.

111
00:07:11,040 --> 00:07:14,480
Speaker 1: And Leo, how did you go about finding this bias

112
00:07:14,520 --> 00:07:16,960
Speaker 1: in this new form of generative AI?

113
00:07:18,160 --> 00:07:23,960
Speaker 3: As a half reporter but also half former scientist, academic,

114
00:07:24,120 --> 00:07:27,160
Speaker 3: and just coder. The fact that many of these models

115
00:07:27,920 --> 00:07:32,400
Speaker 3: generative AI models are open source was actually very useful

116
00:07:32,560 --> 00:07:36,880
Speaker 3: for just researchers in general, but also reporters because it

117
00:07:36,920 --> 00:07:41,480
Speaker 3: gives the possibility for anybody to download the generative AI model,

118
00:07:41,560 --> 00:07:46,560
Speaker 3: in our case, Stable Diffusion and ask it to generate images.

119
00:07:47,280 --> 00:07:49,560
Speaker 3: And so what I did is I simply went on

120
00:07:49,760 --> 00:07:53,680
Speaker 3: the Hugging Face platform, which is this really interesting and

121
00:07:53,840 --> 00:07:57,840
Speaker 3: very useful platform that has come out recently that hosts

122
00:07:57,920 --> 00:08:02,440
Speaker 3: all of these models, including open source versions of GPT, for example,

123
00:08:02,760 --> 00:08:07,440
Speaker 3: and Stable Diffusion, and I downloaded the model, and then

124
00:08:07,560 --> 00:08:12,080
Speaker 3: I wrote some code to basically iterate through a series

125
00:08:12,200 --> 00:08:17,360
Speaker 3: of very well known high paying and low paying jobs

126
00:08:17,920 --> 00:08:23,960
Speaker 3: and also different criminalized activities, and just ask the model

127
00:08:24,080 --> 00:08:29,040
Speaker 3: a very simple question: can you generate a color photograph of blank?

128
00:08:29,560 --> 00:08:34,480
Speaker 3: And Blank is a judge, an engineer, and a janitor,

129
00:08:34,800 --> 00:08:39,160
Speaker 3: a housekeeper, a fast food worker. For professions, for example,

130
00:08:39,520 --> 00:08:42,760
Speaker 3: and for criminalized activities, we looked at three of them,

131
00:08:42,840 --> 00:08:46,040
Speaker 3: so Blank would be a terrorist, a drug dealer, or

132
00:08:46,200 --> 00:08:49,880
Speaker 3: an inmate. I let my computer run for actually an

133
00:08:50,040 --> 00:08:54,120
Speaker 3: entire month, because it's very computationally heavy to generate thousands

134
00:08:54,120 --> 00:08:57,040
Speaker 3: of images for each of those keywords. So that was

135
00:08:57,080 --> 00:08:57,760
Speaker 3: the first step.
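
Note: the loop Leo describes here can be sketched roughly as below. This is a minimal illustration, not the reporters' actual code. The exact prompt wording, the function names (`build_prompt`, `run_experiment`, `fake_generate`) and the number of images per subject are assumptions; only the listed subjects come from the conversation. The `generate` argument stands in for whatever text-to-image model is used, so the sketch runs without downloading one.

```python
from typing import Callable, Dict, List

# Subjects named in the conversation; the full list used in the
# investigation is not given, so these are only examples.
PROFESSIONS = ["judge", "engineer", "janitor", "housekeeper", "fast food worker"]
CRIMINALIZED = ["terrorist", "drug dealer", "inmate"]


def article(noun: str) -> str:
    """Pick 'an' before a vowel letter, 'a' otherwise (enough for these lists)."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def build_prompt(subject: str) -> str:
    """Fill the 'color photograph of blank' template (wording is an assumption)."""
    return f"A color photograph of {article(subject)} {subject}"


def run_experiment(
    generate: Callable[[str], object],
    subjects: List[str],
    images_per_subject: int,
) -> Dict[str, list]:
    """Send the same prompt to the model many times for each subject."""
    results: Dict[str, list] = {}
    for subject in subjects:
        prompt = build_prompt(subject)
        results[subject] = [generate(prompt) for _ in range(images_per_subject)]
    return results


def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for a real text-to-image model call."""
    return f"<image for: {prompt}>"


if __name__ == "__main__":
    images = run_experiment(fake_generate, PROFESSIONS + CRIMINALIZED, images_per_subject=3)
    for subject, batch in images.items():
        print(subject, len(batch))
```

Swapping `fake_generate` for a function that calls a real image model would turn this into the kind of long-running batch job described above; repeating each prompt many times is what moves the result from anecdote to a data set.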

136
00:08:58,520 --> 00:09:00,760
Speaker 1: So you would just say to it, make me a

137
00:09:00,800 --> 00:09:04,760
Speaker 1: picture of a CEO and then see what it came

138
00:09:04,840 --> 00:09:05,160
Speaker 1: up with.

139
00:09:06,960 --> 00:09:10,040
Speaker 3: Exactly. But the idea was to do that exact same

140
00:09:10,080 --> 00:09:14,520
Speaker 3: thing thousands of times, so that instead of, you know,

141
00:09:14,720 --> 00:09:20,160
Speaker 3: having anecdotal evidence that the AI might be biased, we

142
00:09:20,200 --> 00:09:24,679
Speaker 3: would actually gather a database of images of the same

143
00:09:24,720 --> 00:09:29,040
Speaker 3: thing over and over and over. Basically that would allow

144
00:09:29,400 --> 00:09:33,520
Speaker 3: us as reporters and as data scientists to then analyze

145
00:09:33,559 --> 00:09:36,920
Speaker 3: those thousands of images and actually find a pattern across

146
00:09:36,960 --> 00:09:39,040
Speaker 3: those images. So that's exactly what we did.

147
00:09:39,600 --> 00:09:41,480
Speaker 1: And what is the pattern that you found when you

148
00:09:41,520 --> 00:09:44,240
Speaker 1: typed in CEO, when you typed in fast food worker,

149
00:09:44,280 --> 00:09:47,079
Speaker 1: all of the other things you mentioned, and then asked

150
00:09:47,120 --> 00:09:50,480
Speaker 1: it to show you pictures thousands of times? What did

151
00:09:50,480 --> 00:09:51,160
Speaker 1: it turn up?

152
00:09:51,880 --> 00:09:56,160
Speaker 3: So the pattern is a very stark pattern. It's that

153
00:09:56,400 --> 00:10:03,240
Speaker 3: for high paying professions, the generative AI model is overwhelmingly

154
00:10:03,360 --> 00:10:07,439
Speaker 3: generating pictures of white men, and for low paying professions

155
00:10:07,520 --> 00:10:12,679
Speaker 3: it's overwhelmingly generating more pictures of women and darker skinned people.

156
00:10:12,720 --> 00:10:16,920
Speaker 3: So in our analysis we couldn't really talk about race

157
00:10:17,040 --> 00:10:21,760
Speaker 3: because race is very hard to quantify in images, especially

158
00:10:21,800 --> 00:10:25,480
Speaker 3: when you have images of fake people essentially that can't

159
00:10:25,480 --> 00:10:28,400
Speaker 3: really self identify, So you can't say this is a

160
00:10:28,400 --> 00:10:31,160
Speaker 3: black person or this is an Asian person. But what

161
00:10:31,200 --> 00:10:36,520
Speaker 3: you can do is rigorous scientific analysis where you do

162
00:10:36,600 --> 00:10:39,920
Speaker 3: things like average all the pixels of a person's skin

163
00:10:40,280 --> 00:10:43,760
Speaker 3: across all of the images of one profession, and for example,

164
00:10:44,080 --> 00:10:47,120
Speaker 3: doing that, what we found is that the pattern of

165
00:10:47,440 --> 00:10:52,839
Speaker 3: darker skinned subjects being overrepresented in low paying professions and

166
00:10:53,000 --> 00:10:58,360
Speaker 3: lighter skin subjects being overrepresented in high paying professions. And

167
00:10:58,440 --> 00:11:02,840
Speaker 3: the same goes for criminalized activities, where you have darker

168
00:11:02,880 --> 00:11:09,040
Speaker 3: skin tones constantly and systematically being represented in criminalized activities.
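
Note: the pixel-averaging idea Leo mentions can be illustrated with a short sketch. It assumes the skin pixels of each image have already been isolated by some earlier step, which the conversation does not describe. Averaging per image first and then across images, and comparing tones with a standard luma weighting (0.299, 0.587, 0.114), are choices made for this illustration and not necessarily the reporters' method. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]  # (red, green, blue), each 0-255


def mean_color(colors: Sequence[Color]) -> Color:
    """Average a non-empty sequence of RGB values channel by channel."""
    n = len(colors)
    return (
        sum(c[0] for c in colors) / n,
        sum(c[1] for c in colors) / n,
        sum(c[2] for c in colors) / n,
    )


def average_skin_color(images_skin_pixels: List[List[Color]]) -> Color:
    """Average skin color for one profession.

    Each inner list holds the skin pixels of one image. Images with no
    skin pixels are skipped. Averaging per image first keeps images with
    large faces from dominating the result (an assumed design choice).
    """
    per_image = [mean_color(pixels) for pixels in images_skin_pixels if pixels]
    if not per_image:
        raise ValueError("no skin pixels found in any image")
    return mean_color(per_image)


def brightness(color: Color) -> float:
    """Approximate perceived brightness using common luma weights."""
    r, g, b = color
    return 0.299 * r + 0.587 * g + 0.114 * b


if __name__ == "__main__":
    # Toy data: two images with skin pixels, one image without any.
    toy = [[(200, 150, 100), (100, 50, 0)], [(50, 50, 50)], []]
    avg = average_skin_color(toy)
    print(avg, brightness(avg))
```

Computing one average per profession and comparing the brightness values gives a way to talk about lighter and darker skin tones without assigning a race to people who do not exist.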

169
00:11:11,800 --> 00:11:16,520
Speaker 1: Dina, Leo was talking about Stable Diffusion. Exactly what is

170
00:11:16,559 --> 00:11:17,640
Speaker 1: that and how does it work?

171
00:11:18,440 --> 00:11:22,360
Speaker 2: So Stable Diffusion is a text to image program that

172
00:11:22,559 --> 00:11:26,360
Speaker 2: is open source. It's distributed by a company called Stability AI,

173
00:11:27,000 --> 00:11:29,559
Speaker 2: and the version that we used is, as Leo mentioned, hosted

174
00:11:29,600 --> 00:11:32,120
Speaker 2: on Hugging Face, which is basically a repository of open

175
00:11:32,160 --> 00:11:34,240
Speaker 2: source AI models. So some of your listeners may have

176
00:11:34,280 --> 00:11:37,880
Speaker 2: heard of GitHub, which is a repository of programming code.

177
00:11:38,360 --> 00:11:40,440
Speaker 2: Hugging Face tries to be sort of like a version

178
00:11:40,480 --> 00:11:43,240
Speaker 2: of that for AI models, and a lot of your

179
00:11:43,240 --> 00:11:46,120
Speaker 2: listeners may have actually heard of a different image generation program,

180
00:11:46,200 --> 00:11:49,640
Speaker 2: which is OpenAI's DALL-E. DALL-E 2, the second version

181
00:11:49,640 --> 00:11:52,719
Speaker 2: of it, came out in wider distribution last year,

182
00:11:52,760 --> 00:11:55,040
Speaker 2: around July. It was announced a bit earlier than that,

183
00:11:55,520 --> 00:11:58,439
Speaker 2: and that was also very popular and attracted a lot

184
00:11:58,440 --> 00:12:01,720
Speaker 2: of attention. Stable Diffusion followed that and came out as

185
00:12:01,760 --> 00:12:04,240
Speaker 2: an open source version, and because it was open source,

186
00:12:04,320 --> 00:12:07,880
Speaker 2: it's been very widely used. In order to use the

187
00:12:07,920 --> 00:12:11,760
Speaker 2: OpenAI version for, you know, commercial applications, you

188
00:12:11,800 --> 00:12:13,000
Speaker 2: have to work with OpenAI.

189
00:12:13,120 --> 00:12:14,920
Speaker 4: You have to pay for that, and so it's a

190
00:12:14,960 --> 00:12:15,560
Speaker 4: little different.

191
00:12:16,080 --> 00:12:17,520
Speaker 2: I just want to talk for a minute about why

192
00:12:17,520 --> 00:12:20,040
Speaker 2: we did not look at OpenAI's DALL-E, and that's

193
00:12:20,040 --> 00:12:23,559
Speaker 2: because it's not open source, so we can't tell what

194
00:12:23,679 --> 00:12:24,840
Speaker 2: is in the training data

195
00:12:24,880 --> 00:12:26,559
Speaker 4: for DALL-E in a way that we

196
00:12:26,600 --> 00:12:29,280
Speaker 2: can for Stable Diffusion, and there are greater limits on

197
00:12:29,320 --> 00:12:31,080
Speaker 2: what you can do with it.

198
00:12:31,080 --> 00:12:33,160
Speaker 4: It was sort of difficult to look at the bias

199
00:12:33,320 --> 00:12:34,160
Speaker 4: there.

200
00:12:34,200 --> 00:12:36,079
Speaker 1: And Dina, by open source, what exactly do you mean?

201
00:12:37,040 --> 00:12:39,920
Speaker 2: So it's basically the opposite of proprietary software.

202
00:12:40,040 --> 00:12:42,640
Speaker 4: It's freely distributed. It's openly distributed.

203
00:12:42,720 --> 00:12:45,560
Speaker 2: Anyone can download it, use it, and in the case

204
00:12:45,559 --> 00:12:47,800
Speaker 2: of AI models, you have greater freedom to play with

205
00:12:47,880 --> 00:12:50,880
Speaker 2: it to tweak different parts of the AI model to

206
00:12:50,960 --> 00:12:51,640
Speaker 2: what you need.

207
00:12:52,400 --> 00:12:56,320
Speaker 3: You can see exactly the code or the data that

208
00:12:56,640 --> 00:13:00,319
Speaker 3: is going behind an AI model, and you can see

209
00:13:00,320 --> 00:13:04,200
Speaker 3: the different versions of the model over time, and that's

210
00:13:04,360 --> 00:13:07,240
Speaker 3: very important for people who are trying to improve these

211
00:13:07,280 --> 00:13:13,000
Speaker 3: things because you can basically have some sort of version control,

212
00:13:13,080 --> 00:13:16,079
Speaker 3: so control fork. The previous version used to be like this,

213
00:13:16,320 --> 00:13:18,480
Speaker 3: and now we've improved it, and now we can see

214
00:13:18,559 --> 00:13:22,720
Speaker 3: clearly the difference between the new version and the previous version. Actually,

215
00:13:22,760 --> 00:13:26,720
Speaker 3: for this story, we did interview prominent academics within this field,

216
00:13:27,040 --> 00:13:30,599
Speaker 3: and they've all really stressed this point that one of

217
00:13:30,640 --> 00:13:34,720
Speaker 3: the only ways to address the problem of bias is

218
00:13:34,840 --> 00:13:38,160
Speaker 3: to start by having open source models, because then those

219
00:13:38,200 --> 00:13:41,800
Speaker 3: models can be taken by other academics or other organizations

220
00:13:41,800 --> 00:13:45,640
Speaker 3: that are also transparent, and whatever they do to them

221
00:13:46,200 --> 00:13:50,040
Speaker 3: to quote unquote improve them is now again made transparent,

222
00:13:50,240 --> 00:13:55,840
Speaker 3: made very publicly known, and available to yet more academics

223
00:13:55,840 --> 00:13:56,880
Speaker 3: to improve upon it again.

224
00:13:56,920 --> 00:14:01,360
Speaker 4: There's also greater auditability.

225
00:14:01,760 --> 00:14:04,720
Speaker 2: The reason that we were able to run this experiment

226
00:14:04,720 --> 00:14:07,480
Speaker 2: on stable diffusion is that it's open source. So you know,

227
00:14:07,520 --> 00:14:12,320
Speaker 2: we obviously found some significant problems, but there is that auditability.

228
00:14:12,400 --> 00:14:15,240
Speaker 2: You don't have that with OpenAI's DALL-E. And so, again,

229
00:14:15,320 --> 00:14:18,119
Speaker 2: OpenAI has said that they're taking steps to address

230
00:14:18,360 --> 00:14:21,880
Speaker 2: representation and make sure that the outputs are representative, but

231
00:14:22,040 --> 00:14:24,200
Speaker 2: you kind of have to trust them because you don't

232
00:14:24,200 --> 00:14:24,920
Speaker 2: know what they're doing.

233
00:14:26,760 --> 00:14:29,640
Speaker 1: After the break, what's the data set behind these AI

234
00:14:29,760 --> 00:14:42,160
Speaker 1: generated images? We know from everything we've been hearing all

235
00:14:42,200 --> 00:14:46,800
Speaker 1: about ChatGPT, now GPT-4, that it takes as

236
00:14:46,840 --> 00:14:51,120
Speaker 1: its source enormous amounts of data that exists on the Internet.

237
00:14:51,240 --> 00:14:55,880
Speaker 1: What is the source material for generative AI when it

238
00:14:55,920 --> 00:14:57,400
Speaker 1: comes to images.

239
00:14:58,080 --> 00:15:03,000
Speaker 3: So, the source material for most generative AI models, these

240
00:15:03,120 --> 00:15:06,960
Speaker 3: so called large language models, it's basically the entire Internet.

241
00:15:07,640 --> 00:15:11,040
Speaker 3: In simple terms, it's everything that's been posted on the

242
00:15:11,040 --> 00:15:14,720
Speaker 3: Internet in the past ten fifteen years. The way that

243
00:15:14,800 --> 00:15:18,840
Speaker 3: works is that there is a data set called LAION,

244
00:15:19,400 --> 00:15:24,600
Speaker 3: which basically collected URLs to images or texts for the

245
00:15:24,640 --> 00:15:27,000
Speaker 3: past fifteen years all over the Internet.

246
00:15:27,800 --> 00:15:30,800
Speaker 2: When you're training on data from across the entire Internet,

247
00:15:31,000 --> 00:15:32,640
Speaker 2: as most of us know, there's a fair amount of

248
00:15:32,760 --> 00:15:35,600
Speaker 2: unsavory stuff out there on the Internet. And you know,

249
00:15:35,680 --> 00:15:38,800
Speaker 2: there's been some academic work done on the earlier version

250
00:15:38,840 --> 00:15:41,000
Speaker 2: of this data set, this LAION data set, that

251
00:15:41,200 --> 00:15:47,040
Speaker 2: found pornography, violence, again, racial and gender bias. When certain

252
00:15:47,200 --> 00:15:50,240
Speaker 2: terms that were associated with certain races were used, it

253
00:15:50,280 --> 00:15:53,200
Speaker 2: was much more likely to bring up an image that

254
00:15:53,400 --> 00:15:54,360
Speaker 2: was sexualized.

255
00:15:54,840 --> 00:15:56,440
Speaker 4: So there are a lot of problems

256
00:15:56,040 --> 00:15:59,240
Speaker 2: within that data set, and it is an openly available,

257
00:15:59,280 --> 00:16:01,640
Speaker 2: open source data set, and so the viewpoint of the

258
00:16:01,640 --> 00:16:03,880
Speaker 2: people behind it is, look, you know, you should use

259
00:16:03,920 --> 00:16:05,960
Speaker 2: this for academic work. If you're using this in a

260
00:16:05,960 --> 00:16:09,080
Speaker 2: commercial product, you've got to actually take some responsibility for

261
00:16:09,160 --> 00:16:11,520
Speaker 2: the content, and we've made some not safe

262
00:16:11,280 --> 00:16:14,000
Speaker 4: for work filters. There are steps you can take, but, you

263
00:16:13,920 --> 00:16:16,200
Speaker 2: know, when you're training a model on a large volume

264
00:16:16,240 --> 00:16:18,640
Speaker 2: of data from across the entire Internet, there is a

265
00:16:18,640 --> 00:16:21,120
Speaker 2: lot of unsavory stuff in there, and there are way

266
00:16:21,440 --> 00:16:23,920
Speaker 2: way too many images in this data set for anybody

267
00:16:24,000 --> 00:16:26,160
Speaker 2: to go through it and make sure that they're cleaning

268
00:16:26,200 --> 00:16:26,520
Speaker 2: it up.

269
00:16:28,640 --> 00:16:32,400
Speaker 1: Dina, what does Stable Diffusion say about your findings about

270
00:16:32,560 --> 00:16:34,640
Speaker 1: this bias in their data?

271
00:16:35,280 --> 00:16:37,800
Speaker 2: We reached out to Stable Diffusion and explained what we

272
00:16:37,800 --> 00:16:40,320
Speaker 2: were finding, and they sent us an email statement

273
00:16:40,360 --> 00:16:43,600
Speaker 2: from a spokesperson saying that, quote, all AI models have inherent

274
00:16:43,680 --> 00:16:46,560
Speaker 2: biases that are representative of the data sets they're trained on,

275
00:16:46,880 --> 00:16:49,480
Speaker 2: and by open sourcing our models, we aim to support

276
00:16:49,520 --> 00:16:53,160
Speaker 2: the AI community and collaborate to improve bias evaluation techniques

277
00:16:53,200 --> 00:16:56,760
Speaker 2: and develop solutions beyond the basic prompt modification. The company

278
00:16:56,800 --> 00:16:59,080
Speaker 2: also told us that, you know, they have sort of

279
00:16:59,080 --> 00:17:02,000
Speaker 2: an initiative to develop some open source models that will

280
00:17:02,000 --> 00:17:04,679
Speaker 2: be trained on data sets that are specific to different

281
00:17:04,680 --> 00:17:07,679
Speaker 2: countries and cultures, and so part of the argument the

282
00:17:07,680 --> 00:17:10,240
Speaker 2: company was making is that the open source nature of

283
00:17:10,359 --> 00:17:12,920
Speaker 2: what they're doing will enable them to address some of

284
00:17:12,960 --> 00:17:15,800
Speaker 2: these issues by getting more and more data that is

285
00:17:16,040 --> 00:17:17,439
Speaker 2: more diverse than what they currently have.

286
00:17:18,359 --> 00:17:21,960
Speaker 1: Dina, we can see why bias would be so harmful,

287
00:17:22,359 --> 00:17:25,639
Speaker 1: especially when it comes to images, which are very powerful.

288
00:17:25,840 --> 00:17:28,800
Speaker 1: What are some of the real world downsides that we

289
00:17:29,000 --> 00:17:33,480
Speaker 1: see with the possibility of fake images, bias in images,

290
00:17:34,359 --> 00:17:36,480
Speaker 1: being proliferated all over the world.

291
00:17:38,280 --> 00:17:40,439
Speaker 2: There is an issue of deep fakes, things that are

292
00:17:40,480 --> 00:17:43,280
Speaker 2: meant to mislead people, misinformation that you can't tell is

293
00:17:43,320 --> 00:17:47,199
Speaker 2: AI generated. With the specific issue of bias, there's a

294
00:17:47,280 --> 00:17:49,560
Speaker 2: number of issues that crop up here. One is a

295
00:17:49,600 --> 00:17:52,920
Speaker 2: representation one. So if we're going to start using all

296
00:17:52,920 --> 00:17:58,240
Speaker 2: of these synthetic generated images for brochures, for advertisements, for

297
00:17:58,320 --> 00:18:02,399
Speaker 2: marketing materials, and we're already seeing what happens when the

298
00:18:02,480 --> 00:18:06,359
Speaker 2: marketing materials have all the CEOs be white men, doesn't

299
00:18:06,400 --> 00:18:09,679
Speaker 2: that worsen the situation that we already have? You know,

300
00:18:09,800 --> 00:18:12,320
Speaker 2: one of the things that we found in this experiment

301
00:18:12,440 --> 00:18:16,040
Speaker 2: was that the bias in Stable Diffusion was actually

302
00:18:16,119 --> 00:18:17,960
Speaker 2: worse than the real world. So we know that there

303
00:18:17,960 --> 00:18:21,800
Speaker 2: are fewer female CEOs, but the number of female CEOs

304
00:18:21,800 --> 00:18:24,679
Speaker 2: that were being generated in these experiments was even lower

305
00:18:24,720 --> 00:18:28,040
Speaker 2: than the real world. So we're looking at a situation

306
00:18:28,240 --> 00:18:31,760
Speaker 2: where we're generating more and more content via AI, more

307
00:18:31,800 --> 00:18:35,000
Speaker 2: and more of these synthetic images. Those images become a

308
00:18:35,040 --> 00:18:37,520
Speaker 2: part of the body of images, the body of work

309
00:18:37,600 --> 00:18:40,879
Speaker 2: that's on the internet, they are more biased than reality,

310
00:18:41,320 --> 00:18:44,240
Speaker 2: and then in the future those images get fed back

311
00:18:44,280 --> 00:18:47,280
Speaker 2: into future AI systems, so that you end up in

312
00:18:47,320 --> 00:18:50,399
Speaker 2: this nasty cycle where the bias is getting worse and

313
00:18:50,480 --> 00:18:53,880
Speaker 2: worse and being fed back into future systems which are

314
00:18:53,920 --> 00:18:54,920
Speaker 2: then less diverse.

315
00:18:57,200 --> 00:19:00,800
Speaker 3: So there was a recent Europol report that suggested that

316
00:19:00,880 --> 00:19:04,840
Speaker 3: by twenty twenty six, ninety percent of all online content

317
00:19:05,000 --> 00:19:08,959
Speaker 3: could be artificially generated. What happens when ninety percent of

318
00:19:09,040 --> 00:19:14,119
Speaker 3: all online images are images reinforcing those stereotypes? One of

319
00:19:14,200 --> 00:19:18,560
Speaker 3: the main impacts can really affect people's mental health and

320
00:19:18,600 --> 00:19:21,560
Speaker 3: how they project themselves into the world and you know,

321
00:19:21,600 --> 00:19:25,399
Speaker 3: what kind of jobs that they see themselves doing in life.

322
00:19:25,440 --> 00:19:29,119
Speaker 3: So that's a really big issue that can definitely be

323
00:19:29,520 --> 00:19:34,040
Speaker 3: reinforced by this problem.

324
00:19:32,560 --> 00:19:37,080
Speaker 1: When we come back, how can artificial intelligence become more intelligent?

325
00:19:45,960 --> 00:19:50,080
Speaker 1: Dina, earlier Leo said that these open source models have

326
00:19:50,160 --> 00:19:53,000
Speaker 1: one advantage, which is that everybody is able to kind

327
00:19:53,000 --> 00:19:56,240
Speaker 1: of work on them and improve them. And if you are, say,

328
00:19:56,840 --> 00:20:00,440
Speaker 1: at an advertising agency making a brochure and you ask

329
00:20:00,520 --> 00:20:03,040
Speaker 1: it to create a CEO and it's a white male,

330
00:20:03,400 --> 00:20:05,600
Speaker 1: can't you say, no, that's not the image I'm looking

331
00:20:05,600 --> 00:20:08,200
Speaker 1: for? That there's a certain amount of responsibility of people

332
00:20:08,240 --> 00:20:12,240
Speaker 1: who are generating these images not to just simply accept

333
00:20:12,400 --> 00:20:15,960
Speaker 1: what the generative AI bot spits out.

334
00:20:16,560 --> 00:20:18,760
Speaker 2: When we talk about AI bias, a lot of the

335
00:20:18,880 --> 00:20:21,919
Speaker 2: quote unquote blame for it gets put on the data sets.

336
00:20:22,320 --> 00:20:26,120
Speaker 2: There needs to also be accountability from users at all levels,

337
00:20:26,160 --> 00:20:28,320
Speaker 2: and that includes the people that are creating the models,

338
00:20:28,400 --> 00:20:31,119
Speaker 2: the researchers that are working on the models, who have

339
00:20:31,200 --> 00:20:34,240
Speaker 2: their own biases that get kind of imprinted on these models,

340
00:20:34,480 --> 00:20:36,320
Speaker 2: and it includes the people that are using them.

341
00:20:36,480 --> 00:20:39,560
Speaker 2: At the end of the day, it's not totally clear

342
00:20:39,320 --> 00:20:41,439
Speaker 2: to me that you can currently use these models that

343
00:20:41,520 --> 00:20:44,399
Speaker 2: effectively to even specify in that way and get the

344
00:20:44,400 --> 00:20:45,359
Speaker 2: output you want.

345
00:20:46,280 --> 00:20:48,240
Speaker 1: So do you know what can actually be done to

346
00:20:48,240 --> 00:20:50,520
Speaker 1: fix this? We talked earlier about how there's a lot

347
00:20:50,560 --> 00:20:53,360
Speaker 1: of work being done to improve these models.

348
00:20:54,960 --> 00:20:56,879
Speaker 2: One of the things that needs to be done is

349
00:20:57,160 --> 00:21:00,560
Speaker 2: increased diversification of the data set. There needs to be a way

350
00:21:00,600 --> 00:21:05,200
Speaker 2: to get data from other countries, other cultures, and there

351
00:21:05,200 --> 00:21:06,720
Speaker 2: needs to be, to be clear, a way

352
00:21:06,560 --> 00:21:08,240
Speaker 2: to do that that's ethical.

353
00:21:08,520 --> 00:21:10,479
Speaker 2: There have been projects or companies that have tried to

354
00:21:10,560 --> 00:21:12,760
Speaker 2: source a more diverse set of data, but they've done

355
00:21:12,760 --> 00:21:15,560
Speaker 2: it in unethical ways. They've tried to get images of people,

356
00:21:15,560 --> 00:21:17,639
Speaker 2: and they've done it without consent. This is sort of

357
00:21:17,640 --> 00:21:20,119
Speaker 2: cropped up in the facial recognition era when people are

358
00:21:20,119 --> 00:21:23,240
Speaker 2: trying to fix those systems. There's also a question about

359
00:21:23,280 --> 00:21:25,760
Speaker 2: the largeness of all of these models. So the current

360
00:21:25,800 --> 00:21:28,560
Speaker 2: trend in AI is that bigger is better, that the

361
00:21:28,600 --> 00:21:31,320
Speaker 2: only way to do these kind of foundational models is

362
00:21:31,359 --> 00:21:34,040
Speaker 2: to have the sum total of the Internet dumped into

363
00:21:34,160 --> 00:21:37,240
Speaker 2: the training data. There are people working on ways to

364
00:21:37,359 --> 00:21:40,840
Speaker 2: do better smaller models, in which case you have greater

365
00:21:40,840 --> 00:21:42,879
Speaker 2: control over what is in the data set and you

366
00:21:42,880 --> 00:21:45,360
Speaker 2: can do things that are more targeted. If we move

367
00:21:45,440 --> 00:21:48,199
Speaker 2: to optimizing the technology where you don't just have to

368
00:21:48,200 --> 00:21:50,840
Speaker 2: add more volume in order to have a better performing

369
00:21:51,000 --> 00:21:56,160
Speaker 2: algorithm or model, that could help as well.

370
00:21:56,280 --> 00:22:00,680
Speaker 1: Leo, as somebody who is deep in this data and

371
00:22:00,960 --> 00:22:04,240
Speaker 1: watching how it's developing very rapidly, what are you watching

372
00:22:04,359 --> 00:22:07,119
Speaker 1: for as this keeps unfolding?

373
00:22:07,800 --> 00:22:12,280
Speaker 3: One of the most interesting developments is really the open

374
00:22:12,320 --> 00:22:17,400
Speaker 3: source versus closed source models, and, you know, which are

375
00:22:17,440 --> 00:22:20,719
Speaker 3: going to become the status quo because you know it's

376
00:22:20,800 --> 00:22:24,479
Speaker 3: not clear right now. It's very easy to use closed

377
00:22:24,520 --> 00:22:28,600
Speaker 3: source models in some way because they have better user interfaces,

378
00:22:28,640 --> 00:22:32,960
Speaker 3: and they market it better, and, you know, it's

379
00:22:32,960 --> 00:22:37,280
Speaker 3: for profit, so they have all these ways to kind

380
00:22:37,280 --> 00:22:41,200
Speaker 3: of like get really mainstream. But at the same time,

381
00:22:41,680 --> 00:22:44,720
Speaker 3: the open source models are being used by millions of people,

382
00:22:45,240 --> 00:22:48,280
Speaker 3: not just people you know that are using them like

383
00:22:48,400 --> 00:22:52,560
Speaker 3: as developers or researchers. And then we also see

384
00:22:52,600 --> 00:22:56,560
Speaker 3: private companies adapting open source models as opposed to closed

385
00:22:56,560 --> 00:22:59,840
Speaker 3: source models because they actually recognize the fact that they

386
00:22:59,840 --> 00:23:02,879
Speaker 3: can build on top of those models within their systems.

387
00:23:03,240 --> 00:23:06,879
Speaker 3: It's very unclear and it will be interesting to see

388
00:23:06,920 --> 00:23:10,240
Speaker 3: if, five, ten years from now, generative AI has become

389
00:23:10,359 --> 00:23:14,120
Speaker 3: completely a closed source thing because it's easier to regulate.

390
00:23:14,200 --> 00:23:17,080
Speaker 3: You can just regulate private companies and tell them what

391
00:23:17,119 --> 00:23:20,120
Speaker 3: to do, or it's become an open source thing because

392
00:23:20,240 --> 00:23:23,760
Speaker 3: there's more transparency and it's easier to see if things

393
00:23:23,760 --> 00:23:24,760
Speaker 3: are getting better or not.

394
00:23:26,119 --> 00:23:29,240
Speaker 1: Leo, Dina, thanks so much for coming on the show.

395
00:23:30,119 --> 00:23:32,280
Speaker 4: Thank you, Wes, thank you for having us.

396
00:23:33,080 --> 00:23:35,040
Speaker 1: Thanks for listening to us here at the Big Take.

397
00:23:35,160 --> 00:23:38,440
Speaker 1: It's a daily podcast from Bloomberg and iHeartRadio. For more

398
00:23:38,440 --> 00:23:42,560
Speaker 1: shows from iHeartRadio, visit the iHeartRadio app, Apple Podcasts or

399
00:23:42,600 --> 00:23:45,440
Speaker 1: wherever you listen, and we'd love to hear from you.

400
00:23:45,520 --> 00:23:48,960
Speaker 1: Email us questions or comments to Big Take at Bloomberg

401
00:23:48,960 --> 00:23:52,040
Speaker 1: dot net. The supervising producer of The Big Take is

402
00:23:52,160 --> 00:23:56,040
Speaker 1: Vicki Vergolina. Our senior producer is Katherine Fink. Federica

403
00:23:56,119 --> 00:24:01,120
Speaker 1: Romaniello is our producer. Our associate producer is Zaynab Siddiqui. Raphael

404
00:24:01,200 --> 00:24:04,680
Speaker 1: Amsili is our engineer. Our original music was composed

405
00:24:04,680 --> 00:24:08,080
Speaker 1: by Leo Sidran. I'm Wes Kosova. We'll be back tomorrow

406
00:24:08,200 --> 00:24:09,400
Speaker 1: with another Big Take.