WEBVTT - AI Model Collapse and the Dangers of AI-Generated Content 0:00:04.480 --> 0:00:12.319 Welcome to TechStuff, a production from iHeartRadio. Hey there, 0:00:12.360 --> 0:00:15.560 and welcome to TechStuff. I'm your host, Jonathan Strickland. 0:00:15.560 --> 0:00:18.919 I'm an executive producer with iHeart Podcasts. And how the 0:00:18.960 --> 0:00:23.520 tech are you? So, imagine for a moment that you 0:00:23.720 --> 0:00:27.200 are in school. Some of y'all might actually be in school, 0:00:27.400 --> 0:00:30.600 but others, like me, we have to satisfy ourselves by 0:00:30.640 --> 0:00:33.919 having that occasional stress dream where we imagine that we're 0:00:33.960 --> 0:00:35.919 in school and it's time to take a final and 0:00:35.960 --> 0:00:38.960 we haven't gone to class all year, and also we 0:00:39.000 --> 0:00:41.479 can't remember our locker combination. I don't know about you, 0:00:41.520 --> 0:00:44.320 but I still occasionally get those dreams. And I'm almost 0:00:44.400 --> 0:00:47.280 fifty years old at this point. Anyway, you're in school, 0:00:47.760 --> 0:00:50.879 you're in English class, and you've been given the dreaded 0:00:51.120 --> 0:00:53.760 term paper assignment. You're told you need to go to 0:00:53.800 --> 0:00:57.040 the library and you have to gather resources and read 0:00:57.120 --> 0:01:01.080 up and form your thesis and write your paper while 0:01:01.080 --> 0:01:06.080 making verifiable citations all the way through. So off you 0:01:06.120 --> 0:01:09.680 go to the library. However, you discover, horror of horrors, 0:01:09.959 --> 0:01:14.120 that all the resource books have disappeared. They're gone. In 0:01:14.160 --> 0:01:17.280 their place are other student term papers. Now, some of 0:01:17.280 --> 0:01:21.040 those term papers are pretty good, some of them are terrible. 
0:01:21.440 --> 0:01:23.880 Nearly all of them do have a list of references 0:01:23.920 --> 0:01:25.839 at the end, But the problem is that you don't 0:01:25.840 --> 0:01:29.319 have access to those references. You only have access to 0:01:29.440 --> 0:01:32.640 the term papers, which, in a way, you could say 0:01:32.800 --> 0:01:36.520 is a filtered view of those references. But you have 0:01:36.600 --> 0:01:39.319 no way of knowing if the student who wrote the 0:01:39.440 --> 0:01:42.880 term papers you've pulled out did a proper citation. You 0:01:42.920 --> 0:01:46.319 don't know if the student understood the source material. You 0:01:46.360 --> 0:01:49.520 don't know if they have made a valid reference using 0:01:49.560 --> 0:01:52.400 that source. You don't know if the student didn't understand 0:01:52.400 --> 0:01:56.200 the source and thus misconstrued the information, either accidentally or 0:01:56.240 --> 0:01:59.600 on purpose, or if the student is just outright plagiarizing 0:01:59.640 --> 0:02:03.280 the source material or making stuff up. So how do 0:02:03.360 --> 0:02:06.880 you think your own term paper would turn out? Probably 0:02:07.240 --> 0:02:10.359 it'd be a challenge to write a good term paper. 0:02:10.360 --> 0:02:13.200 It definitely would be difficult or almost impossible to support 0:02:13.200 --> 0:02:16.200 your thesis using citations, because all you would have access 0:02:16.200 --> 0:02:19.880 to would be other term papers. Chances are you'd have 0:02:19.919 --> 0:02:24.600 a pretty lousy grade by the end of that assignment. Now, 0:02:24.680 --> 0:02:28.200 I started off this episode with that analogy because today 0:02:28.200 --> 0:02:31.920 we're going to talk about what happens when AI models 0:02:32.040 --> 0:02:35.600 train off stuff that was generated by other or sometimes 0:02:35.639 --> 0:02:40.040 even the same but earlier versions of AI models. 
So 0:02:40.120 --> 0:02:44.160 when bots make stuff that other bots consume, and then 0:02:44.200 --> 0:02:47.920 those other bots make new stuff and the cycle goes on, 0:02:48.560 --> 0:02:51.320 where are the humans in this picture? Maybe they're in 0:02:51.360 --> 0:02:55.720 an actual library, because the online resources will all have 0:02:55.840 --> 0:02:59.160 become practically useless. So if we want to actually learn anything, 0:02:59.160 --> 0:03:02.160 we're gonna need to go back to the basics. So 0:03:02.200 --> 0:03:05.720 we're going to talk about an idea called model collapse, 0:03:05.880 --> 0:03:10.440 as in large language models, LLMs, and other types of 0:03:10.480 --> 0:03:14.320 AI models. We're going to build to that. However, first up, 0:03:14.480 --> 0:03:18.079 let's explore the tendency of AI models to produce wrong 0:03:18.400 --> 0:03:21.880 or misleading results, regardless of whether the material used to 0:03:21.960 --> 0:03:26.000 train that AI model came from AI or humans. This 0:03:26.040 --> 0:03:28.440 is something I've talked about in past episodes, but it's 0:03:28.480 --> 0:03:32.280 an important part to kind of build toward our understanding 0:03:32.280 --> 0:03:35.680 of what model collapse is. Now, in past episodes, I've 0:03:35.680 --> 0:03:40.640 talked about the issue of AI hallucinations, also sometimes called confabulations. 0:03:40.760 --> 0:03:45.360 Some people prefer confabulations to hallucinations. This is the tendency 0:03:45.640 --> 0:03:51.320 for generative AI to mistakenly include untrue or misleading information, 0:03:51.960 --> 0:03:56.600 or to insert stuff that does not belong into whatever 0:03:56.640 --> 0:03:59.480 it is that it's creating, whether that's an image or text 0:03:59.600 --> 0:04:02.640 or what have you. 
One fairly recent example of this 0:04:03.160 --> 0:04:07.400 was when Google's AI augmented search tool suggested that you 0:04:07.480 --> 0:04:11.240 add a non toxic glue to your pizza ingredients if 0:04:11.280 --> 0:04:14.440 you want to solve the irritating issue of cheese slip 0:04:14.520 --> 0:04:18.760 slidin' away off your ding dang dern pizza. Clearly this 0:04:18.800 --> 0:04:22.800 answer is not acceptable. Adding glue, non toxic or otherwise, 0:04:23.160 --> 0:04:25.560 is not a way of making good eats. I'm pretty 0:04:25.560 --> 0:04:28.600 sure Alton Brown would agree with me, and actually I 0:04:28.600 --> 0:04:31.400 would argue this is one of the less egregious cases 0:04:31.440 --> 0:04:34.240 of AI providing a bad answer. It's famous because it 0:04:34.320 --> 0:04:36.800 got a lot of traction. It went viral for how 0:04:36.880 --> 0:04:39.479 bad the answer was. But in the grand scheme of things, 0:04:39.480 --> 0:04:43.400 there are other examples that were far more potentially harmful. 0:04:43.680 --> 0:04:47.520 So why does AI do this sometimes? Well, there are 0:04:47.560 --> 0:04:51.479 a few different contributing factors that lead AI to making 0:04:51.480 --> 0:04:54.040 these mistakes. By the way, here's the reason why some people 0:04:54.240 --> 0:04:59.520 prefer confabulations as opposed to hallucinations: hallucination sounds like the 0:04:59.680 --> 0:05:03.880 AI has somehow been tricked into thinking something is 0:05:04.279 --> 0:05:07.800 what it isn't, right, like the idea that when you hallucinate 0:05:07.839 --> 0:05:11.960 you're seeing or hearing or experiencing something that's not really there. 0:05:12.400 --> 0:05:17.400 Confabulation suggests that the AI is inventing something. It is confabulating, 0:05:17.440 --> 0:05:20.560 it is creating an answer where there was none, and 0:05:20.640 --> 0:05:23.560 so some people prefer the second one because 0:05:23.680 --> 0:05:26.520 it puts more of the onus on the AI model itself. 
0:05:26.920 --> 0:05:30.799 So here's one of the factors that contributes to AI making mistakes. 0:05:31.520 --> 0:05:34.800 You know, large language models and the like are in 0:05:34.880 --> 0:05:39.720 part focused on pattern recognition, and this can lead to issues. Now, 0:05:39.760 --> 0:05:43.680 recognizing patterns is what gives these models the ability to 0:05:43.960 --> 0:05:48.559 form relevant and coherent responses to queries, and obviously pattern 0:05:48.600 --> 0:05:53.240 recognition is important, otherwise you're just gonna perceive everything as 0:05:53.279 --> 0:05:58.040 being random and meaningless, and then really, this whole conversation 0:05:58.240 --> 0:06:02.000 doesn't mean anything either, or if the whole universe is meaningless, 0:06:02.440 --> 0:06:05.200 then what are we even doing here? But I don't 0:06:05.200 --> 0:06:07.479 want you to go down that path of existential dread. 0:06:07.920 --> 0:06:12.239 So sometimes AI will detect a pattern where there really 0:06:12.360 --> 0:06:15.760 isn't a pattern. And we humans do this too, you know, 0:06:15.800 --> 0:06:19.160 we sometimes experience, like, pareidolia, for example. That's when we 0:06:19.200 --> 0:06:24.680 perceive something meaningful within an otherwise meaningless thing, like we 0:06:24.760 --> 0:06:28.040 see a pattern where there is none. So if you 0:06:28.080 --> 0:06:30.880 were to look at the clouds and you say that 0:06:31.160 --> 0:06:34.680 one of them looks very like a whale, that's pareidolia. 0:06:34.880 --> 0:06:39.680 It's also a reference to Hamlet. The infamous face on Mars, 0:06:39.880 --> 0:06:42.440 which was really just a hill with some shadows cast 0:06:42.480 --> 0:06:46.080 on it because of the angle of the image, that was 0:06:46.080 --> 0:06:48.960 another example of pareidolia. People began to think that there 0:06:49.040 --> 0:06:52.279 was actually a big sculpted face on Mars. There's not. 0:06:52.880 --> 0:06:55.080 It's a hill. 
The shadows hit the hill in a 0:06:55.080 --> 0:06:57.560 specific way that made it look kind of like the 0:06:57.600 --> 0:07:01.560 face of an enormous statue, something like the Sphinx, something 0:07:01.600 --> 0:07:04.320 along those lines. But in fact it was just a hill. 0:07:04.600 --> 0:07:07.679 And if you took another image from a different angle, 0:07:07.680 --> 0:07:12.000 which people have done, the illusion of a face disappears. 0:07:12.320 --> 0:07:15.680 So again, that was us inventing a pattern where there 0:07:15.880 --> 0:07:19.080 was none. Now, much of the time we humans can 0:07:19.120 --> 0:07:22.320 recognize when the things we see, you know, the shapes 0:07:22.360 --> 0:07:26.320 of faces or whatever it may be, aren't actually there. Right, 0:07:26.360 --> 0:07:29.960 we can recognize, oh, that looks like a blah, blah blah, 0:07:30.000 --> 0:07:33.000 but we know it's not actually a real image of that. 0:07:33.160 --> 0:07:38.200 It just happens. Now, sometimes we don't recognize this. Sometimes 0:07:38.200 --> 0:07:41.640 there are times where people will assume that what they're 0:07:41.680 --> 0:07:46.680 seeing is an actual image made with intent and intelligence, 0:07:46.760 --> 0:07:49.560 perhaps not by humans but by something. So there are 0:07:49.560 --> 0:07:52.119 all those stories of people going bonkers because they believe 0:07:52.120 --> 0:07:54.280 they saw an image of like the Virgin Mary in 0:07:54.320 --> 0:07:58.120 a potato chip or whatever. And machines don't necessarily have 0:07:58.160 --> 0:08:02.400 any checks against false hits when it comes to pattern recognition, 0:08:02.800 --> 0:08:06.640 and then they might act on a perceived pattern, which 0:08:06.680 --> 0:08:10.360 means the machines produce bad results. What's more, machines can see 0:08:10.400 --> 0:08:13.720 patterns where we can't. 
Like sometimes there are patterns present 0:08:13.800 --> 0:08:17.720 that we cannot perceive because maybe the dataset is far 0:08:17.800 --> 0:08:22.400 too large or far too complicated, and so we can't 0:08:22.440 --> 0:08:26.560 perceive where the pattern is. It's just beyond our abilities 0:08:27.080 --> 0:08:31.360 to do so. But sometimes machines can detect those patterns, 0:08:31.360 --> 0:08:34.720 and sometimes they are meaningful. So it can be really tricky. 0:08:34.840 --> 0:08:37.400 If a machine thinks it's found a pattern, it can 0:08:37.440 --> 0:08:42.079 be hard for people to verify or discredit that because 0:08:42.400 --> 0:08:44.600 it's on a scale that we humans are not really 0:08:44.640 --> 0:08:48.320 well equipped to handle. With generative AI, this can mean 0:08:48.320 --> 0:08:52.240 that the AI model correctly identifies that it needs to 0:08:52.320 --> 0:08:56.319 use a specific syntax to craft a response to whatever 0:08:56.520 --> 0:09:01.240 query or direction it was given, and it can thus 0:09:01.559 --> 0:09:06.600 put together a sentence that grammatically makes sense. What's happening 0:09:06.640 --> 0:09:11.360 is it's essentially statistically analyzing the structure of hundreds of 0:09:11.520 --> 0:09:14.760 millions of sentences, as well as the role that certain 0:09:14.800 --> 0:09:18.120 words play within those sentences, so that it quote unquote 0:09:18.280 --> 0:09:22.120 knows how to write a grammatically correct response, and ultimately 0:09:22.480 --> 0:09:25.439 it's using statistics to pick what should be the most 0:09:25.520 --> 0:09:30.000 correct word in each position of that sentence. 
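As a rough illustration of the word-by-word statistical picking described above, here is a minimal Python sketch. It is a toy bigram counter, not how any production large language model is actually built, and the three-sentence corpus and the helper names (`most_likely_next`, `complete`) are invented for this example. It only shows that choosing the statistically likeliest next word yields fluent-looking text without any check on whether that text is true.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, purely for illustration (an assumption, not real training data).
corpus = [
    "the cheese slides off the pizza",
    "the cheese melts on the pizza",
    "the sauce stays on the pizza",
]

# Count how often each word follows each other word (a bigram table).
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1


def most_likely_next(word):
    """Return the word seen most often after `word`, or None if it was never seen."""
    counts = following.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def complete(start, max_words=5):
    """Extend `start` by repeatedly appending the likeliest next word."""
    words = [start]
    while len(words) < max_words:
        nxt = most_likely_next(words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)


print(complete("on"))
print(most_likely_next("glue"))
```

Nothing in this loop knows what a pizza is. It only knows which words tended to follow which, and that is the sense in which a response can be grammatically plausible and still factually wrong.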
So ideally, 0:09:30.440 --> 0:09:34.000 it's pulling information from various sources that are related to 0:09:34.000 --> 0:09:38.240 whatever it is you're asking about and pulling the words 0:09:38.240 --> 0:09:43.439 together in a way that makes logical sense and is accurate, 0:09:43.559 --> 0:09:45.880 and it's a correct answer to whatever your question is. 0:09:46.000 --> 0:09:49.440 But that doesn't always happen, right? Sometimes it can't find 0:09:49.880 --> 0:09:52.840 the right word. Sometimes it finds a different word that 0:09:52.920 --> 0:09:56.080 it thinks is right, but it's not. And the real 0:09:56.160 --> 0:10:00.680 problem is it will present this to you authoritatively, as 0:10:00.720 --> 0:10:04.160 if the AI is absolutely certain this is the right answer, 0:10:04.360 --> 0:10:07.560 when in fact it's wrong and the AI has no 0:10:07.600 --> 0:10:10.600 way of knowing it's wrong. It's not purposefully trying to 0:10:10.600 --> 0:10:13.800 mislead you, at least not necessarily. Maybe it was 0:10:13.800 --> 0:10:17.000 given direction to try and do that, but that's another matter. 0:10:17.400 --> 0:10:22.360 It's just trying to complete its task and failing to 0:10:22.400 --> 0:10:26.640 do so accurately. Sometimes the word or a series of 0:10:26.640 --> 0:10:30.400 words can be wrong, therefore. Now, grammatically it could be correct, 0:10:30.480 --> 0:10:33.679 but factually it could be completely made up. And why 0:10:33.720 --> 0:10:36.640 this all happens, it does get really complicated. It's not 0:10:36.720 --> 0:10:40.440 necessarily due to just one specific flaw. It's not always 0:10:40.480 --> 0:10:43.840 the case that, oh, that data point didn't appear in 0:10:43.880 --> 0:10:47.120 the data set for some reason, and so the computer 0:10:47.440 --> 0:10:50.400 made something up. There are other issues that could also 0:10:50.440 --> 0:10:53.120 be at play. 
So, for example, one possible reason for 0:10:53.160 --> 0:10:57.679 hallucinations is something that's called overfitting. IBM defines this as 0:10:57.720 --> 0:11:01.600 what happens quote when an algorithm fits too closely 0:11:01.800 --> 0:11:05.280 or even exactly to its training data, resulting in a 0:11:05.320 --> 0:11:08.920 model that can't make accurate predictions or conclusions from any 0:11:09.040 --> 0:11:12.440 data other than the training data. End quote. That's from 0:11:12.440 --> 0:11:16.439 a piece on IBM dot com. It's titled What Is Overfitting? 0:11:16.800 --> 0:11:21.440 Sometimes models get so complex or they're trained so closely 0:11:21.520 --> 0:11:24.800 on a specific data set that they start to pick 0:11:24.880 --> 0:11:30.320 up more noise than signal. They give significance to insignificant things. 0:11:30.600 --> 0:11:32.800 I think of this kind of like the character Drax 0:11:33.000 --> 0:11:36.640 in the Guardians of the Galaxy movies. Drax takes things literally, 0:11:37.000 --> 0:11:40.120 so if you use a saying or an idiom on him, 0:11:40.480 --> 0:11:43.959 he's likely to interpret what you're saying as being what 0:11:44.040 --> 0:11:47.760 you mean. So if you say, oh, that's like throwing 0:11:47.800 --> 0:11:50.839 the baby out with the bathwater, he would assume you're 0:11:50.880 --> 0:11:54.080 talking about something you have literally done before in your life, 0:11:54.080 --> 0:11:56.880 that you have literally thrown out a baby with bathwater, 0:11:57.320 --> 0:12:00.000 and he would not understand you were using an analogy 0:12:00.600 --> 0:12:03.960 to describe getting rid of important stuff along with the 0:12:04.040 --> 0:12:06.640 unimportant stuff you want to get rid of. 
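To make the overfitting idea above a little more concrete, here is a small Python sketch under stated assumptions. The numbers are synthetic and invented for this example (the underlying rule is assumed to be y = 2x, with a bit of noise added to the training labels), and `memorizer` and `line` are hypothetical stand-ins for an overfit model and a simpler one. It is an illustration of the general idea, not IBM's method or any particular library's API.

```python
# Synthetic data, invented for illustration: the true rule is y = 2x,
# but the training labels carry a little measurement noise.
train = [(1, 2.3), (2, 3.8), (3, 6.4), (4, 7.7), (5, 10.2)]
held_out = [(1.5, 3.0), (2.5, 5.0), (3.5, 7.0), (4.5, 9.0)]


def memorizer(x):
    """Overfit model: return the label of the closest training point,
    so the training data (noise included) is reproduced exactly."""
    _, nearest_y = min(train, key=lambda point: abs(point[0] - x))
    return nearest_y


# Simpler model: a single slope fitted by least squares through the origin.
slope = sum(x * y for x, y in train) / sum(x * x for x, _ in train)


def line(x):
    """Simple model: one straight line that ignores the individual wiggles."""
    return slope * x


def mean_abs_error(model, data):
    """Average absolute gap between the model's output and the label."""
    return sum(abs(model(x) - y) for x, y in data) / len(data)


for name, model in [("memorizer", memorizer), ("line", line)]:
    print(name,
          "train error:", round(mean_abs_error(model, train), 3),
          "held-out error:", round(mean_abs_error(model, held_out), 3))
```

The memorizer is perfect on the data it has already seen and noticeably worse on the points it has not, while the simple line misses a little everywhere but holds up on the held-out points. That gap between seen and unseen data is the signature of a model that has treated noise as signal.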
If a 0:12:06.720 --> 0:12:09.719 model has been overfitted, if it's been trained too much 0:12:09.840 --> 0:12:12.679 on a relatively narrow set of data, it might have 0:12:12.720 --> 0:12:16.800 trouble taking what it has learned and generalizing those learnings 0:12:16.800 --> 0:12:20.440 towards something else that's outside the data set. And rather 0:12:20.520 --> 0:12:23.080 than saying I'm sorry, I don't know the answer to that, 0:12:23.440 --> 0:12:27.000 it could produce an answer that follows the statistical rules 0:12:27.240 --> 0:12:29.640 that the model is set to. In other words, it'll 0:12:29.800 --> 0:12:33.640 create something that grammatically makes sense, but it won't necessarily 0:12:33.640 --> 0:12:38.079 be relevant or, you know, thematically or in relevance make sense. 0:12:38.640 --> 0:12:41.360 So in this way, an AI model can become like 0:12:41.400 --> 0:12:44.760 that stereotypical person in the car who absolutely refuses to 0:12:44.800 --> 0:12:47.080 pull over and ask for directions when they get lost, 0:12:47.400 --> 0:12:50.160 because that would be showing weakness. No, gosh darn it, 0:12:50.200 --> 0:12:53.079 we'll somehow reason our way out of taking that wrong 0:12:53.200 --> 0:12:56.320 turn forty five minutes ago. That'll fix everything. Except it 0:12:56.320 --> 0:12:59.360 doesn't fix everything, and it can make things worse. But 0:12:59.400 --> 0:13:02.480 it's not just pattern recognition that can trip up AI models. 0:13:02.760 --> 0:13:07.119 Another issue is bias. I've talked about bias in other episodes, 0:13:07.280 --> 0:13:10.319 but it's really important that we understand what we mean 0:13:10.360 --> 0:13:13.559 when we're talking about bias and how it can happen, because 0:13:14.000 --> 0:13:16.520 I think a lot of people get tripped up. They 0:13:16.559 --> 0:13:22.120 think, it's a machine, right? It doesn't possess opinions. How 0:13:22.160 --> 0:13:26.640 can it have bias? 
Well, we'll explore that in just 0:13:26.800 --> 0:13:29.679 a couple of moments, but first let's take a quick 0:13:29.720 --> 0:13:43.520 break to thank our sponsors. How can an AI model 0:13:43.920 --> 0:13:47.880 have bias? Well, the answer is that the machines that 0:13:47.920 --> 0:13:51.640 AI runs on, the algorithms that AI is built upon, 0:13:51.920 --> 0:13:55.839 all this stuff, it didn't just pop out of nowhere. Ultimately, 0:13:55.920 --> 0:13:59.200 this stuff was designed, built, and programmed by human beings. 0:13:59.280 --> 0:14:01.880 Even if you have had a piece of software that 0:14:02.080 --> 0:14:06.160 was designed by AI, well, the AI that designed it 0:14:06.280 --> 0:14:08.959 in turn had been designed by humans at least somewhere 0:14:09.000 --> 0:14:11.560 down the line once you trace it back far enough. So 0:14:12.080 --> 0:14:16.360 human beings absolutely do have biases, and those biases can 0:14:16.400 --> 0:14:20.920 make their way into the routines and processes of machines. 0:14:21.480 --> 0:14:25.280 MIT has a great introduction to AI hallucinations and bias 0:14:25.360 --> 0:14:28.040 on a web page that has the fitting title When 0:14:28.200 --> 0:14:32.800 AI Gets It Wrong: Addressing AI Hallucinations and Bias. Now, 0:14:32.840 --> 0:14:35.600 in that article, the author points out that AI has 0:14:35.640 --> 0:14:39.440 had issues with bias for years and uses the example 0:14:39.600 --> 0:14:45.720 of image analysis. The author cites a project called Gender Shades. 0:14:46.040 --> 0:14:51.440 This was led by Joy Adowaa Buolamwini, and I apologize 0:14:51.760 --> 0:14:56.080 for my pronunciation of the name. But the project examined 0:14:56.320 --> 0:15:02.280 how an AI powered gender classification tool performed when presented 0:15:02.320 --> 0:15:06.880 with subjects of varying genders, ethnicities, and skin tones from 0:15:06.960 --> 0:15:14.280 the IARPA Janus Benchmark A data set, or IJB-A. 
This 0:15:14.320 --> 0:15:17.360 is a database of facial images taken from various angles 0:15:17.400 --> 0:15:20.880 and lighting conditions of lots of different people. It's used 0:15:20.880 --> 0:15:25.440 as a government benchmark for testing stuff like facial recognition technologies. Now, 0:15:25.480 --> 0:15:30.640 the project also used a gender classification benchmark from Adience, 0:15:31.240 --> 0:15:35.560 and this was in part to try and address shortcomings 0:15:35.600 --> 0:15:40.840 with the IJB-A benchmark set. Plus, due to 0:15:40.880 --> 0:15:43.360 the limitations of both of these data sets, which I'll 0:15:43.360 --> 0:15:46.480 talk about in just a moment, the project also outlines 0:15:46.520 --> 0:15:49.640 a process to create a better data set for the 0:15:49.640 --> 0:15:54.160 purposes of training technologies like facial recognition and gender classification. 0:15:54.720 --> 0:15:59.480 The project aimed to test several gender classifier programs from 0:15:59.480 --> 0:16:04.120 companies Microsoft and IBM, among others, all with regard to 0:16:04.320 --> 0:16:08.640 quote gender, skin type, and the intersection of skin type 0:16:08.680 --> 0:16:12.640 and gender end quote. So Joy found that the data 0:16:12.680 --> 0:16:17.360 set from IJB-A skewed male and toward lighter skin 0:16:17.480 --> 0:16:21.440 tones, skewed heavily male and toward lighter skin tones. 
In fact, 0:16:21.480 --> 0:16:24.040 she said between seventy nine point six percent and eighty 0:16:24.040 --> 0:16:26.640 six point twenty four percent of all the images in 0:16:26.680 --> 0:16:31.040 the database were of people with lighter skin tones, and 0:16:31.520 --> 0:16:34.440 fewer than twenty five percent of all the images were 0:16:34.480 --> 0:16:38.480 of women or female presenting people. Worse yet, only four 0:16:38.520 --> 0:16:41.760 point four percent of all the images were of female 0:16:41.840 --> 0:16:46.880 presenting people who had dark skin. Adience's data set had 0:16:46.920 --> 0:16:50.840 a better distribution of photos, at least between genders. Female 0:16:50.840 --> 0:16:54.120 presenting people made up fifty two percent of the images 0:16:54.160 --> 0:16:58.480 in Adience's data set, but again, lighter skin tones made 0:16:58.520 --> 0:17:02.320 up the majority of these images. Less than fifteen percent 0:17:02.360 --> 0:17:05.000 of all the images in that data set contained people 0:17:05.080 --> 0:17:08.840 of darker skin tones. So I'm sure you can already 0:17:08.920 --> 0:17:12.800 see where this is going. If you train an AI 0:17:12.840 --> 0:17:18.159 model on data that has a disproportionate emphasis on certain factors, 0:17:18.440 --> 0:17:23.720 such as certain genders or certain skin tones, then you 0:17:23.760 --> 0:17:27.280 would expect the AI to be better at handling cases 0:17:27.280 --> 0:17:31.760 that fall into those categories, right? Like, if most of 0:17:31.800 --> 0:17:34.800 the data you've fed to your AI model is of 0:17:35.040 --> 0:17:37.760 men who have a lighter skin tone, then when you 0:17:37.800 --> 0:17:43.040 are serving the AI model a picture of someone who's 0:17:43.320 --> 0:17:46.000 male presenting and has a lighter skin tone, chances are 0:17:46.080 --> 0:17:49.600 the tool's going to work better. 
If you are instead 0:17:50.160 --> 0:17:55.800 feeding it images of people who fall outside those majority cases, 0:17:56.080 --> 0:17:59.000 the AI tool is probably not going to work as 0:17:59.040 --> 0:18:02.159 well with them, and that's exactly what Joy found in 0:18:02.200 --> 0:18:06.679 her research. She discovered that gender classification tools from all 0:18:06.840 --> 0:18:10.920 of the providers performed better with lighter skinned men than 0:18:10.960 --> 0:18:14.600 with any other group. They perform the worst with darker 0:18:14.640 --> 0:18:17.959 skinned women. Thus we have a bias in the system. 0:18:18.320 --> 0:18:21.160 The data that folks use to train these systems had 0:18:21.200 --> 0:18:25.320 that bias, and it unsurprisingly affects how the AI does 0:18:25.359 --> 0:18:29.639 its job. Now, this isn't just a curiosity for research labs. 0:18:29.680 --> 0:18:34.800 Of course, around the world, various organizations and companies are 0:18:34.840 --> 0:18:38.720 making use of facial recognition tools and gender classification tools. 0:18:39.119 --> 0:18:42.440 There are numerous stories of law enforcement agencies getting into 0:18:42.480 --> 0:18:45.719 hot water for relying on this kind of technology. So 0:18:46.000 --> 0:18:51.120 we know that this technology isn't reliable, particularly if someone 0:18:51.240 --> 0:18:55.280 belongs to a group that's outside of lighter skinned men, 0:18:55.840 --> 0:18:59.000 and the data being used to train these tools is limited. 0:18:59.320 --> 0:19:02.760 That's why we're having these issues, or one of the 0:19:02.840 --> 0:19:05.760 main reasons why we're having these issues. So it stands 0:19:05.800 --> 0:19:09.080 to reason we should not employ those tools for anything 0:19:09.560 --> 0:19:13.480 really at all, other than maybe working to make them better. 0:19:13.680 --> 0:19:16.040 But we definitely shouldn't be using them for things like 0:19:16.200 --> 0:19:19.160 law enforcement, for example. 
At least we should not use 0:19:19.200 --> 0:19:23.320 them until we can address the problem of bias. Generative 0:19:23.359 --> 0:19:27.399 AI can actually have similar issues with bias. That MIT 0:19:27.640 --> 0:19:30.760 article that I mentioned earlier in this episode cites another 0:19:30.880 --> 0:19:36.720 article by Leonardo Nicoletti and Dina Bass titled Humans Are Biased. 0:19:36.920 --> 0:19:41.520 Generative AI Is Even Worse. This piece appeared in Bloomberg. 0:19:41.960 --> 0:19:46.000 So this article explores how a generative AI platform called 0:19:46.080 --> 0:19:50.280 Stable Diffusion had a tendency to make assumptions based on 0:19:50.440 --> 0:19:57.359 racial and gender stereotypes, thus repeating and even amplifying those stereotypes. 0:19:57.760 --> 0:20:01.880 Nicoletti and Bass performed an informal test with Stable Diffusion, 0:20:02.040 --> 0:20:05.520 a pretty thorough one, but still informal. They asked Stable 0:20:05.560 --> 0:20:10.760 Diffusion to generate images of people who were working one 0:20:10.800 --> 0:20:14.520 of fourteen different jobs. Now, half of those jobs belonged 0:20:14.560 --> 0:20:18.119 to what they called high paying positions, like things that 0:20:18.200 --> 0:20:21.720 you would typically associate as a high paying job. The 0:20:21.800 --> 0:20:26.800 other half typically were low paying jobs, and actually 0:20:26.840 --> 0:20:28.879 a little less than half of them were low paying jobs. 0:20:28.920 --> 0:20:31.480 Three of them actually fell into the category of crime, 0:20:31.880 --> 0:20:34.679 so like, you know, thief or something like that. The 0:20:34.840 --> 0:20:38.720 two had Stable Diffusion generate more than five thousand images 0:20:38.760 --> 0:20:42.640 total so that they could really compare. They didn't want 0:20:42.680 --> 0:20:46.040 to just create, you know, a single image each. That's 0:20:46.040 --> 0:20:48.359 a terrible test. 
They wanted to see, all right, is 0:20:48.400 --> 0:20:51.960 this something that's actually appearing over and over again when 0:20:52.000 --> 0:20:55.000 we make use of this tool, or is it possible 0:20:55.080 --> 0:20:58.119 that you know, you run fourteen tests and it just 0:20:58.320 --> 0:21:03.840 happens to go along with racial stereotypes. Nope. They classified 0:21:03.880 --> 0:21:07.720 the generated images based off of the Fitzpatrick's skin scale. 0:21:08.240 --> 0:21:12.040 This is actually a skin pigmentation metric that's used by 0:21:12.440 --> 0:21:16.359 dermatologists as well as like other researchers, and the scale 0:21:16.440 --> 0:21:19.440 goes from one to six, so one would be very 0:21:19.520 --> 0:21:23.080 light skinned and six would be very dark skinned. The 0:21:23.440 --> 0:21:27.679 researchers found that stable diffusion was far more likely to 0:21:27.680 --> 0:21:31.360 create a person with a lighter skin tone for positions 0:21:31.400 --> 0:21:36.159 that traditionally fall into the higher paid categories, and that 0:21:36.200 --> 0:21:38.760 it was more likely to generate someone with a darker 0:21:38.800 --> 0:21:44.040 skin tone for lower paid or criminal categories. What's more, 0:21:44.280 --> 0:21:48.080 stable diffusion generated images of people appearing to be men 0:21:48.240 --> 0:21:52.159 or male presenting for most of those higher paid positions. 0:21:52.280 --> 0:21:55.080 It was very rare for it to generate the image 0:21:55.080 --> 0:21:58.560 of a female presenting person in the role of one 0:21:58.600 --> 0:22:04.000 of these traditionally higher paid jobs. So the AI was 0:22:04.040 --> 0:22:09.280 perpetuating and amplifying these racial and gender stereotypes. This actually 0:22:09.280 --> 0:22:11.720 reminds me of a classic riddle that was intended to 0:22:11.760 --> 0:22:14.040 reveal bias. I'm sure most of you have heard this 0:22:14.119 --> 0:22:17.480 before or some variation. 
So the riddle typically goes something 0:22:17.560 --> 0:22:20.080 like this. A father and a son are in a 0:22:20.200 --> 0:22:23.680 terrible car accident, and the father tragically dies at the scene. 0:22:24.040 --> 0:22:27.480 The son is badly injured. EMTs arrived. They rushed the 0:22:27.480 --> 0:22:30.600 boy to a surgical ward. The surgeon on duty looks 0:22:30.600 --> 0:22:32.960 at the boy and says, I can't operate on him, 0:22:33.520 --> 0:22:37.360 he's my son. Well, how could that be true? Now? 0:22:37.400 --> 0:22:41.080 The obvious answer is the surgeon is the boy's mother. 0:22:41.400 --> 0:22:43.399 And I think a lot of people arrive at that 0:22:43.880 --> 0:22:47.600 conclusion much more easily today than they did when I 0:22:47.760 --> 0:22:50.000 was a kid. Like when I was a kid, the 0:22:50.160 --> 0:22:55.159 sexist stereotype was that all real quote unquote real doctors 0:22:55.200 --> 0:23:00.760 and surgeons were men and women they were nurses or administrators. Right, 0:23:00.840 --> 0:23:04.960 That was the stereotype that people kind of believed in. 0:23:05.320 --> 0:23:08.320 But I'm sure most of y'all understood this answer, or 0:23:08.440 --> 0:23:11.200 you've been exposed to this riddle numerous times. I mean, 0:23:11.240 --> 0:23:13.479 it is a meme at this point, but again, back 0:23:13.560 --> 0:23:15.200 in my day, a lot of folks would likely get 0:23:15.240 --> 0:23:18.240 stumped by this, or they would say something dumb like, oh, 0:23:18.240 --> 0:23:21.600 it turns out the surgeon was the real dad and 0:23:21.680 --> 0:23:25.119 the father who died at the scene had been the 0:23:25.160 --> 0:23:28.520 adopted father he adopted the boy, or something along those lines, 0:23:28.520 --> 0:23:32.240 which reveals the bias of the listener. It reminds the 0:23:32.320 --> 0:23:36.040 listener to think critically and be aware of sexist stereotypes. 
0:23:36.359 --> 0:23:39.760 So AI can produce the wrong results due to bias 0:23:39.800 --> 0:23:44.320 built into the underlying model and end up making these 0:23:44.320 --> 0:23:47.320 same mistakes right, Like if you say surgeon, it may 0:23:47.359 --> 0:23:51.439 mistakenly just believe ah, you meant man. It has to 0:23:51.440 --> 0:23:55.240 be a man that I generate in this image because 0:23:55.960 --> 0:23:59.399 the user said surgeon, so that means man. That's a 0:23:59.400 --> 0:24:02.639 real problem. With enough work and attention, we can actually 0:24:02.680 --> 0:24:07.240 create training materials that minimize bias and can help reverse 0:24:07.359 --> 0:24:12.080 this trend. But even doing that is not enough to 0:24:12.560 --> 0:24:17.119 eliminate errors in generative AI. There are other problems we 0:24:17.200 --> 0:24:20.879 have to look out for. So what happens when you 0:24:21.040 --> 0:24:26.120 have an AI model, like a large language model, for example, 0:24:26.520 --> 0:24:30.280 and part of the massive amount of material that it's 0:24:30.359 --> 0:24:34.560 training itself on includes data sets that were generated by 0:24:34.680 --> 0:24:38.720 other AI. When an AI image generator is pulling images 0:24:38.760 --> 0:24:41.760 that were made by other image generators and then training 0:24:41.800 --> 0:24:44.800 itself on that, or you know, even if it's pulling 0:24:45.240 --> 0:24:49.000 images that an earlier version of that very same generator 0:24:49.040 --> 0:24:53.879 had created, the mistakes that exist in those AI generated images, 0:24:54.440 --> 0:24:56.840 or you know, it's if we're not talking images like 0:24:56.920 --> 0:25:01.000 in text or whatever, those things can become like you 0:25:01.000 --> 0:25:05.080 would argue, oh, those things are noise, right, that's those 0:25:05.080 --> 0:25:09.280 are mistakes. But AI doesn't know that they're mistakes. They don't. 0:25:09.280 --> 0:25:11.840 It doesn't know that it's noise. 
If you're training it 0:25:11.880 --> 0:25:14.359 on the data, it thinks it's significant. And if it 0:25:14.400 --> 0:25:18.240 thinks it's significant, it's going to incorporate it and perhaps 0:25:18.520 --> 0:25:22.359 even dial it up quite a bit. So a great 0:25:22.400 --> 0:25:25.800 way of illustrating this, in my opinion, is to talk 0:25:25.840 --> 0:25:29.280 about fingers. I mean, I'm sure all of you out 0:25:29.320 --> 0:25:34.720