Speaker 1: Pushkin. You're listening to Brave New Planet, a podcast about amazing new technologies that could dramatically improve our world, or, if we don't make wise choices, could leave us a lot worse off. Utopia or dystopia. It's up to us.

On November eleventh, twenty sixteen, the Babelfish burst from fiction into reality. The Babelfish was conceived forty years ago in Douglas Adams's science fiction classic The Hitchhiker's Guide to the Galaxy. In the story, a hapless Earthling finds himself a stowaway on a Vogon spaceship. When the alien captain starts an announcement over the loudspeaker, his companion tells him to stick a small yellow fish in his ear. Listen, it's important. It's a... I can't just put this in your ear. Suddenly he's able to understand the language. The Babelfish is small, yellow, leech-like, and probably the oddest thing in the universe. It feeds on brainwave energy, absorbing all unconscious frequencies, the practical upshot of which is that if you stick one in your ear, you instantly understand anything said to you in any form of language. At the time, the idea of sticking an instantaneous universal translator in your ear seemed charmingly absurd. But a couple of years ago, Google and other companies announced plans to start selling Babelfish. Well, not fish actually, but earbuds that do the same thing.

The key breakthrough came in November twenty sixteen, when Google replaced the technology behind its Translate program. Overnight, the Internet realized that something extraordinary had happened. A Japanese computer scientist ran a quick test. He dashed off his own Japanese translation of the opening lines of Ernest Hemingway's short story The Snows of Kilimanjaro, and dared Google Translate to turn it back into English. Here's the opening passage from the Simon and Schuster audiobook. Kilimanjaro is a snow covered mountain nineteen thousand, seven hundred and ten feet high, and is said to be the highest mountain in Africa. Its western summit is called the Masai "Ngaje Ngai," the House of God.
Speaker 1: Close to the western summit there is the dried and frozen carcass of a leopard. No one has explained what the leopard was seeking at that altitude.

Let's just consider that last sentence: No one has explained what the leopard was seeking at that altitude. One day earlier, Google had mangled the back translation, quote: Whether the leopard had what the demand at that altitude? There is no that nobody explained. But now Google Translate returned, quote: No one has ever explained what leopard wanted at that altitude. It was perfect, except for a missing "the."

What explained the great leap? Well, Google had built a predictive algorithm that taught itself how to translate between English and Japanese by training on a vast library of examples and tweaking its connections to get better and better at predicting the right answer. In many ways, the algorithm was a black box. No one understood precisely how it worked, but it did amazingly well.

Predictive algorithms turn out to be remarkably general. They can be applied to predict which movies a Netflix user will want to see next, or whether an eye exam or a mammogram indicates disease. But it doesn't stop there. Predictive algorithms are also being trained to make societal decisions: who to hire for a job, whether to approve a mortgage application, what students to let into a college, which arrestees to let out on bail. But what exactly are these big black boxes learning from massive data sets? Are they gaining deep new insights about people? Or might they sometimes be automating systemic biases?

Today's big question: when should predictive algorithms be allowed to make big decisions about people? And before they judge us, should we have the right to know what's inside the black box?

My name is Eric Lander. I'm a scientist who works on ways to improve human health. I helped lead the Human Genome Project, and today I lead the Broad Institute of MIT and Harvard.
Speaker 1: In the twenty first century, powerful technologies have been appearing at a breathtaking pace, related to the Internet, artificial intelligence, genetic engineering, and more. They have amazing potential upsides, but we can't ignore the risks that come with them. The decisions aren't just up to scientists or politicians. Whether we like it or not, we, all of us, are the stewards of a brave new planet. This generation's choices will shape the future as never before.

Coming up on today's episode of Brave New Planet: predictive algorithms. We hear from a physician at Google about how this technology might help keep millions of people with diabetes from going blind. And the idea was, well, if you could retrain the model, you could get to more patients to screen them for disease. The first iteration of the model was on par with US board-certified ophthalmologists. I speak with an AI researcher about how predictive algorithms sometimes learn to be sexist and racist. If you typed in I am a white man, you would get positive sentiment. If you typed in I am a black lesbian, for example, negative sentiment. We hear how algorithms are affecting the criminal justice system. For black defendants, it was much more likely to incorrectly predict that they were going to go on to commit a future crime when they didn't, and for white defendants it was much more likely to predict that they were going to go on to not commit a future crime when they did. And we hear from a policy expert about whether these systems should be regulated. A lot of the horror stories are about fully implemented tools that were in use for years. There's never a pause button to reevaluate or look at how a system is working in real time. Stay with us.

Chapter one: The Big Black Box. To better understand these algorithms, I decided to speak with one of the creators of the technology that transformed Google Translate. My name is Greg Corrado, and I'm a distinguished scientist at Google Research.
Speaker 1: Early in his career, Greg had trained in neuroscience, but he soon shifted his focus from organic intelligence to artificial, and that turned out to be really a very lucky moment, because I was becoming interested in artificial intelligence at exactly the moment that artificial intelligence was changing so much.

Ever since the field of artificial intelligence started more than sixty years ago, there have been two warring approaches about how to teach machines to do human tasks. We might call them human rules versus machine learning. The way that we used to try to get computers to recognize patterns was to program into them specific rules. So we would say, oh, well, you can tell the difference between a cat and a dog by how long its whiskers are and what kind of fur it has and does it have stripes, and trying to put these rules into computers. It kind of worked, but it made for a lot of mistakes.

The other approach was machine learning: let the computer figure everything out for itself, somewhat like the biological brain. The machine learning system is actually built of tiny little decision makers, or neurons. They start out connected very much in random ways, but we give the system feedback. So, for example, if it's guessing between a cat and a dog and it gets one wrong, we tell the system that it got one wrong, and we make little changes inside so that it's much more likely to recognize that cat as a cat and not mistake it for a dog. Over time, the system gets better and better and better.

Machine learning had been around for decades with rather unimpressive results. The number of connections and neurons in those early systems was pretty small. We didn't realize until about twenty ten that computers had gotten fast enough and the data sets were big enough that these systems could actually learn from patterns and learn from data better than we could describe rules manually. Machine learning made huge leaps.
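Corrado's "tiny decision makers" can be made concrete with a minimal sketch. The Python below is only an illustration of the feedback loop he describes, not Google's actual systems: one artificial neuron starts with random connection weights and nudges them a little every time it mislabels a training example. The whisker and weight numbers are invented for the toy cat-versus-dog task.

import random

# Toy training set: (whisker_length_cm, body_weight_kg) -> 1 for cat, 0 for dog.
# All numbers are invented purely for illustration.
examples = [
    ((6.0, 4.0), 1), ((7.0, 3.5), 1), ((5.5, 5.0), 1),    # cats
    ((2.0, 20.0), 0), ((1.5, 30.0), 0), ((2.5, 25.0), 0),  # dogs
]

# One "tiny decision maker": two connection weights plus a bias,
# all starting out random.
random.seed(0)
weights = [random.uniform(-1, 1), random.uniform(-1, 1)]
bias = random.uniform(-1, 1)

def guess(features):
    # Guess "cat" (1) if the weighted evidence crosses zero, else "dog" (0).
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0

# The feedback loop: whenever the guess is wrong, make small changes to the
# weights so the same mistake becomes less likely next time.
learning_rate = 0.01
for epoch in range(20):
    for features, label in examples:
        error = label - guess(features)   # 0 if right, +1 or -1 if wrong
        if error != 0:
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error

print([guess(f) for f, _ in examples])    # after training: [1, 1, 1, 0, 0, 0]

Real systems differ only in scale: millions of these little units, millions of examples, and the same basic loop of guess, compare, and adjust.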
Speaker 1: Google itself became the leading driver of machine learning. In twenty eleven, Corrado joined with two colleagues to form a unit called Google Brain. Among other things, they applied a machine learning approach to language translation. The strategy turned out to be remarkably effective.

It doesn't learn French the way you would learn French in high school. It learns French the way you would learn French at home, much more like the way that a child learns the language. We give the machine the English sentence, and then we give it an example of a French translation of that whole sentence. We show a whole lot of them, probably more French and English sentences than you could read in your whole life. And by seeing so many examples of entire sentences, the system is able to learn, oh, this is how I would say this in French. That's actually, at this point, about as good as a bilingual human would produce.

Soon Google was training predictive algorithms for all sorts of purposes. We use neural network predictors to help rank search results, to help people organize their photos, to recognize speech, to find driving directions, to help complete emails. Really, anything that you can think of where there's some notion of finding a pattern or making a prediction, artificial intelligence might be at play.

Predictive algorithms would become ubiquitous in commerce. They let Netflix know which movies to recommend to each customer, Amazon to suggest products users might be interested in purchasing, and much more. While they're shockingly useful, they can also be inscrutable. Modern neural networks are like a black box. Understanding how they make their predictions can be surprisingly difficult. When you build an artificial neural network, you do not necessarily understand exactly the final state of how it works. Figuring out how it works becomes its own science project.

One thing we do know: predictive algorithms are especially sensitive to the choice of examples used to train them.
Speaker 1: The systems learn to imitate the examples in the data that they see. You don't know how well they will do on things that are very different. So, for example, if you train a system to recognize cats and dogs, but you only ever show it border collies and tabby cats, it's not clear what it will do when you show it a picture of a chihuahua. All it's ever seen is border collies; it may not get the right answer. So its concept of dog is going to be limited by the dogs it's seen. That's right, and this is why diversity of data in machine learning systems is so important. You have to have a data set that represents the entire spectrum of possibilities that you expect the system to work under. Teaching algorithms turns out to be not so different than teaching people. They learn what they see.

Chapter two: Retinal Fundoscopy. It's cool that predictive algorithms can learn to translate languages and suggest movies, but what about more life-changing applications? My name is Lily Peng. I am a physician by training, and I am a product manager at Google. I went to visit Doctor Peng because she and her colleagues are using predictive algorithms to help millions of people avoid going blind.

So, diabetic retinopathy is a complication of diabetes that affects the back of the eye, the retina. One of the devastating complications is vision loss. All patients that have diabetes need to be screened once a year for diabetic retinopathy. This is an asymptomatic disease, which means that you do not feel the symptoms. You don't experience vision loss until it's too late. Now, diabetes is epidemic around the world. How many diabetics are there, though? By most estimates, there are over four hundred million patients in the world with diabetes. How do you screen a patient to see whether they have diabetic retinopathy? You need to have a special camera called a fundus camera, and it takes a picture through the pupil of the back of the eye.
Speaker 1: We have a very small supply of retina specialists and eye doctors, and they do a lot more than reading images, so they needed to scale the reading of these images. Four hundred million people with diabetes. There just aren't enough specialists for all the retinal images that need reading, especially in some countries in Asia where resources are limited and the incidence of diabetes is skyrocketing. Two hospitals in southern India recognized the problem and reached out to Google for help.

At that point, Google was already sort of well known for image recognition. We were classifying cats and dogs in consumer images, and the idea was, well, if you could retrain the model to recognize diabetic retinopathy, you could potentially help the hospitals in India get to more patients to screen them for disease.

How did you and your colleagues set out to attack this problem? So when I first started the project, we had about one hundred thirty thousand images from eye hospitals in India as well as a screening program in the US. Also, we gathered an army of ophthalmologists to grade them. Eight hundred eighty thousand diagnoses were rendered on one hundred thirty thousand images. So we took this training data and we put it in a machine learning model. And how did it do? The first iteration of the model was on par with US board-certified ophthalmologists. Since then, we've made some improvements to the model. And the initial training took about how long? The first time we trained a model, it may have taken a couple of weeks, but then the second time you train the next models and next models, it's just shorter and shorter, sometimes overnight. Sometimes overnight? Well, yes. All right. And by contrast, how long does it take to train a board-certified ophthalmologist? So that usually takes at least five years, and then you also have additional fellowship years to specialize in the retina. And at the end of that you only have one board-certified ophthalmologist.
Speaker 1: Yes, at the end of that you'd have one very, very well trained doctor, but that's not scaled. Yes. So by contrast, a model like this scales worldwide and never fatigues. It consistently gives the same diagnosis on the same image, and it obviously takes a much shorter time to train. That being said, it does a very, very narrow task that is just a very small portion of what that doctor can do.

The retina screening tool is already being used in India. It was recently approved in Europe, and it's under review in the United States. Groups around the world are now working on other challenges in medical imaging, like detecting breast cancers at earlier stages. But I was particularly struck by a surprising discovery by Lily's team: that unexpected information about patients was hiding in their retinal pictures.

In the fundus image, there are blood vessels, and so one of the thoughts that we had was, because you can see these vessels, I wonder if we can predict cardiovascular disease from the same image. So we did an experiment where we took fundus images and we trained a model to predict whether or not that patient would have a heart attack in five years. We found that we could tell whether or not this patient may have a vascular event much better than doctors. It speaks to what might be in this data that we've overlooked. The model could make predictions that doctors couldn't from the same type of data.

It turned out the computer could also do a reasonable job of predicting a patient's sex, age, and smoking status. The first time I did this with an ophthalmologist, I think she thought I was trolling her. I said, well, here are pictures. Guess which one is a woman, guess which one is a man, guess which one's a smoker, guess which one is young. Right, these are all tasks that doctors don't generally do with these images. It turns out the model was right ninety eight, ninety nine percent of the time.
Speaker 1: That being said, there are much easier ways of getting the sex of a patient. So, while scientifically interesting, this is one of the most useless clinical predictions ever. So how far can it go? If you gave it preference for rock music or not, what do you think? You know, we tried predicting happiness. That didn't work. So I'm guessing rock music? Oh, probably not, but who knows.

So, predictive algorithms can learn a remarkable range of tasks, and they can even discover hidden patterns that humans miss. We just have to give them enough training data to learn from. Sounds pretty fantastic. What could possibly go wrong?

Chapter three: What Could Possibly Go Wrong? If predictive algorithms can use massive data to discover unexpected connections between your eye and your heart, what might they be learning about, say, human society? To answer this question, I took a trip to speak with Kate Crawford, the co-founder and co-director of the AI Now Institute at New York University. When we began, we were the world's first AI institute dedicated to studying the social implications of these tools. To me, these are the biggest challenges that we face right now, simply because we've spent decades looking at these questions through a technical lens at the expense of looking at them through a social and an ethical lens.

I knew about Kate's work because we served together on a working group about artificial intelligence for the US National Institutes of Health. I also knew she had an interesting background. I grew up in Australia. I studied a really strange grab bag of disciplines. I studied law, I studied philosophy, and then I got really interested in computer science. And this was happening at the same time as I was writing electronic music on large-scale modular synthesizers, and that's still a thing that I do today. It's almost like the opposite of artificial intelligence because it's so analog, so I absolutely love it for that reason.
Speaker 1: In the year two thousand, Kate's band released an album entitled Twenty Twenty that included a prescient song called Machines Work: machines work so that people have time to think. It's funny, because we use a sample from an early IBM promotional film that was made in the nineteen sixties, which says machines can do the work so that people have time to think, and we actually ended up sort of cutting it and splicing it in the track, so it ends up saying that people can do the work so that machines have time to think. And strangely, the more that I've been working in the sort of machine learning space, I think, yeah, there's a lot of ways in which actually people are doing the work so that machines can do all the thinking.

Kate gave me a crash course on how predictive algorithms not only teach themselves language skills, but also, in the process, acquire human prejudices, even in something as seemingly benign as language translation. So in many cases, if you say, translate a sentence like she is a doctor into a language like Turkish, and then you translate it back into English. And you're saying Turkish because Turkish has pronouns that are not gendered? Precisely. And so you would expect that you would get the same sentence back, but you do not. It will say he is a doctor. So she is a doctor was translated into gender-neutral Turkish as o bir doktor, which was then back-translated into English as he is a doctor.

In fact, you could see how much the predictive algorithms had learned about gender roles just by giving Google Translate a bunch of gender-neutral sentences in Turkish. You got: he is an engineer, she is a cook. He is a soldier, but she is a teacher. He is a friend, but she is a lover. He is happy, and she is unhappy. I find that one particularly odd. And it's not just language translation that's problematic. The same sort of issues arise in language understanding.
Speaker 1: When predictive algorithms were trained to learn analogies by reading lots of text, they concluded that dog is to puppy as cat is to kitten, and man is to king as woman is to queen. But they also automatically inferred that man is to computer programmer as woman is to homemaker.

And with the rise of social media, Google used text on the Internet to train predictive algorithms to infer the sentiment of tweets and online reviews. Is it a positive sentiment? Is it a negative sentiment? I believe it was Google who released their sentiment engine, and you could just try it online, you know, put in a sentence and see what you'd get. And again, similar problems emerged. If you typed in I am a white man, you would get positive sentiment. If you typed in I am a black lesbian, for example, negative sentiment.

Just as Greg Corrado explained with chihuahuas and border collies, the predictive algorithms were learning from the examples they found in the world, and those examples reflected a lot about past practices and prejudices. If we think about where you might be scraping large amounts of text from, say Reddit, for example, and you're not thinking about how that sentiment might be biased against certain groups, then you're just basically importing that directly into your tool.

But it's not just conversations on Reddit. There's the cautionary tale of what happened when Amazon let a computer teach itself how to sift through mountains of resumes for computer programming jobs to find the best candidates to interview. So they set up this system, they designed it, and what they found was that very quickly this system had learned to discard and really demote the applications from women. Typically, if you had a women's college mentioned, or even if you had the word women's on your resume, your application would go to the bottom of the pile. All right, so how does it learn that?
Speaker 1: So, first of all, we take a look at who is generally hired by Amazon, and of course they have a very heavily skewed male workforce, and so the system is learning that these are the sorts of people who will tend to be hired and promoted. And it is not a surprise, then, that they actually found it impossible to really retrain this system. They ended up abandoning this tool, because simply correcting for a bias is very hard to do when all of your ground truth data is so profoundly skewed in a particular direction.

So Amazon dropped this particular machine learning project, and Google fixed the Turkish-to-English problem. Today, Google Translate gives both he is a doctor and she is a doctor as translation options. But biases keep popping up in predictive algorithms in many settings. There's no systematic way to prevent them. Instead, spotting and fixing biases has become a game of whack-a-mole.

Chapter four: Quarterbacks. Perhaps it's no surprise that algorithms trained in the wild west of the Internet or on tech industry hiring practices learned serious biases. But what about more sober settings, like a hospital? I talked with someone who recently discovered similar problems, with potentially life-threatening consequences. Hi, I am Christine Vogeli. I'm the director of evaluation research at Partners HealthCare here in Boston. Partners HealthCare, recently rebranded as Mass General Brigham, is the largest healthcare provider in Massachusetts, a system that has six thousand doctors and a dozen hospitals and serves more than a million patients.

As Christine explained to me, the role of healthcare providers in the US has been shifting. The responsibility for controlling costs and ensuring high quality services is now being pushed down onto the hospitals and the doctors. And to me, this makes a lot of sense, right? We really should be the ones responsible for ensuring that there's good quality care and that we're doing it efficiently. Healthcare providers are especially focusing their attention on what they call high-risk patients.
Speaker 1: Really, what it means is that they have both multiple chronic illnesses and relatively acute chronic illnesses. So give me a set of conditions that a patient might have. Right, so somebody, for example, with cardiovascular disease co-occurring with diabetes, and you know, maybe they also have depression. They're just kind of suffering and trying to get used to having that complex illness and how to manage it.

Partners HealthCare offers a program to help these complex patients. We have a nurse or social worker who works as a care manager, who helps with everything from education to care coordination services. But really, that care manager works essentially as a quarterback, arranges everything, but also provides hands-on care to the patient and the caregiver. Yeah, I think it's a wonder how we expect patients to go figure out all the things they're supposed to be doing and how to interact with the medical system without a quarterback. It's incredibly complex. These patients have multiple specialists who are interacting with the primary care physician. They need somebody to be able to tie it together and be able to create a care plan for them that they can follow, and it pulls everything together from all those specialists.

Partners HealthCare found that providing complex patients with quarterbacks both saved money and improved patients' health. For example, they had fewer emergency visits each year. So Partners developed a program to identify the top three percent of patients with the greatest need for the service. Most were recommended by their physicians, but they also used a predictive algorithm, provided by a major health insurance company, that assigns each patient a risk score. What does the algorithm do? When you look at the web page, it really describes itself as a tool to help identify high-risk patients. And that term is a really interesting term to me. What makes a patient high risk?
Speaker 1: So I think from an insurance perspective, risk means these patients are going to be expensive. From a healthcare organization perspective, these are patients who we think we could help, and that's the fundamental challenge on this one.

When the team began to look closely at the results, they noticed that people recommended by the algorithm were strikingly different than those recommended by their doctor. We noticed that black patients overall were underrepresented. Patients with similar numbers of chronic illnesses, if they were black, they had a lower risk score than if they were white, and that didn't make sense. In fact, black patients identified by the algorithm turned out to have twenty six percent more chronic illnesses than white patients with the same risk scores.

So what was wrong with the algorithm? It was because, given a certain level of illness, black and minority patients tend to use fewer healthcare services, and whites tend to use more, even if they have the same level of chronic conditions. That's right. So in some sense, the algorithm is correctly predicting the cost associated with the patient, but not the need. Exactly. It predicts costs very well, but we're interested in understanding patients who are sick and have needs.

It's important to say that the algorithm only used information about insurance claims and medical costs. It didn't use any information about a patient's race. But of course these factors are correlated with race, due to longstanding issues in American society. Frankly, we have fewer minority physicians than we do white physicians, so the level of trust minorities have with the healthcare system, we've observed, is lower. And we also know that there are just systematic barriers to care that certain groups of patients experience more. So, for example, race and poverty go together, and job flexibility. So all these issues with scheduling, being able to come in, being able to access services, are just heightened for minority populations relative to white populations.
Speaker 1: So someone who just has less economic resources might not be able to get off work, might not have the flexibility with childcare to be able to come in for a visit when they need to. Exactly. So it means that if one only relied on the algorithm, you wouldn't be targeting the right people. Yes, we would be targeting more advantaged patients who tend to use a lot of healthcare services.

When they corrected the problem, the proportion of black patients in the high-risk group jumped from eighteen percent to forty seven percent. Christine, together with colleagues from several other institutions, wrote up a paper describing their findings. It was published in Science, the nation's leading research journal, in twenty nineteen. It made a big splash, not least because many other hospital systems were using the algorithm and others like it. We've since changed the algorithm that we use to one that uses exclusively information about chronic illness and not healthcare utilization. And has that worked? We're still testing. We think it's going to work, but as in all of these things, you really need to test it. You need to understand and see if there's actually any biases. In the end, you can't just adopt an algorithm. It's very important to be very conscious about what you're predicting. It's also very important to think about what are the factors you're putting into that prediction algorithm. Even if you believe the ingredients are right, you do actually have to see how it works in practice. Anything that has to do with people's lives, you know, you have to be transparent about it.

Chapter five: Compass Transparency. Christine Vogeli and her colleagues were able to get to the bottom of the issue with the medical risk prediction because they had ready access to the Partners HealthCare data and could test the algorithm. Unfortunately, that's not always the case.
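Before turning to that story, Christine's distinction between predicting cost and predicting need is easy to see in a toy simulation. The sketch below uses entirely invented numbers, not the vendor's actual model or the Partners data: it builds a synthetic population in which one group uses less care at the same level of illness, then flags the "top three percent" two ways, by spending and by chronic-condition count.

import random

random.seed(1)

# Synthetic patients, purely illustrative: each has a count of chronic
# conditions and last year's spending. Group "B" patients spend less than
# group "A" patients with the same illness burden, standing in for the
# access barriers Christine Vogeli describes.
patients = []
for i in range(1000):
    group = "A" if random.random() < 0.5 else "B"
    conditions = random.randint(0, 8)
    access = 1.0 if group == "A" else 0.6          # assumed utilization gap
    cost = conditions * 1000 * access + random.gauss(0, 500)
    patients.append({"id": i, "group": group,
                     "conditions": conditions, "cost": cost})

def top_3_percent(field):
    # Flag the highest-ranked 3 percent of patients by the chosen field.
    ranked = sorted(patients, key=lambda p: p[field], reverse=True)
    return ranked[: len(ranked) * 3 // 100]

for field in ("cost", "conditions"):
    flagged = top_3_percent(field)
    share_b = sum(p["group"] == "B" for p in flagged) / len(flagged)
    print(f"ranked by {field}: {share_b:.0%} of flagged patients are group B")

# Ranking by cost (a proxy for need) under-selects group B; ranking by the
# illness burden itself does not.

The point of the toy is only that the choice of label matters: a model can be an excellent predictor of spending and still systematically miss the sickest patients in a group that faces barriers to care.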
Speaker 1: I traveled to New York to speak with a person who's arguably done more than anyone to focus attention on the consequences of algorithmic bias. My name is Julia Angwin. I'm a journalist. I've been writing about technology for twenty five years, mostly at the Wall Street Journal and ProPublica. Julia grew up in Silicon Valley as the child of a mathematician and a chemist. She studied math at the University of Chicago, but decided to pursue a career in journalism. Her quantitative skills gave her a unique lens to report on the societal implications of technology, and she eventually became interested in investigating high-stakes algorithms.

When I learned that there was actually an algorithm that judges used to help decide how to sentence people, I was stunned. I thought, this is shocking. I can't believe this exists, and I'm going to investigate it. What we're talking about is a score that is assigned to criminal defendants in many jurisdictions in this country that aims to predict whether they will go on to commit a future crime. It's known as a risk assessment score, and the one that we chose to look at was called the Compass risk assessment score.

Based on the answers to a long list of questions, Compass gives defendants a risk score from one to ten. In some jurisdictions, judges use the Compass score to decide whether defendants should be released on bail before trial. In others, judges use it to decide the length of sentence to impose on defendants who plead guilty or who were convicted at trial. Julia had a suspicion that the algorithm might reflect bias against black defendants. Attorney General Eric Holder had actually given a big speech saying he was concerned about the use of these scores and whether they were exacerbating racial bias, and so that was one of the reasons we wanted to investigate. But investigating wasn't easy. Unlike Christine Vogeli at Partners HealthCare, Julia couldn't inspect the Compass algorithm itself.
Speaker 1: Now, Compass isn't a modern neural network. It was developed by a company that's now called Equivant, and it's a much simpler algorithm. It's basically a linear equation that should be easy to understand. But it's a black box of a different sort. The algorithm is opaque because, to date, Equivant has insisted on keeping it a trade secret.

Julia also had no way to download defendants' Compass scores from a website, so she had to gather the data herself. Her team decided to focus on Broward County, Florida. Florida has great public records laws, and so we filed a public records request, and we did end up getting eighteen thousand scores. We got scores for everyone who was arrested for a two year period. Eighteen thousand scores. All right, so then what did you do to evaluate these scores? Well, the first thing we did when we got the eighteen thousand scores was actually we just threw them into a bar chart, black and white defendants. We immediately noticed there were really different looking distributions. For black defendants, the scores were evenly distributed, meaning one through ten, lowest risk to highest risk, there's equal numbers of black defendants in every one of those buckets. For white defendants, the scores were heavily clustered in the low risk range. And so we thought, there's two options. All the white people getting scored in Broward County are legitimately really low risk, they're all Mother Teresa, or there's something weird going on.

Julia sorted the defendants into those who were rearrested over the next two years and those who weren't. She compared the Compass scores that had been assigned to each group. For black defendants, it was much more likely to incorrectly predict that they were going to go on to commit a future crime when they didn't, and for white defendants, it was much more likely to predict that they were going to go on to not commit a future crime when they did.
There were twice as many false positives 566 00:37:51,650 --> 00:37:54,490 Speaker 1: for black defendants as white and twice as many false 567 00:37:54,530 --> 00:37:57,970 Speaker 1: negatives for white defendants as black defendants. Julia described the 568 00:37:58,010 --> 00:38:02,210 Speaker 1: story of two people whose arrest histories illustrate this difference. 569 00:38:02,730 --> 00:38:05,410 Speaker 1: A young eighteen year old black girl named Brisha Borden, 570 00:38:05,930 --> 00:38:10,730 Speaker 1: who had been arrested after picking up a kid's bicycle from 571 00:38:10,730 --> 00:38:13,770 Speaker 1: their front yard and riding it a few blocks. The mom 572 00:38:13,850 --> 00:38:16,090 Speaker 1: came out and yelled at her, that's my kid's bike. 573 00:38:16,690 --> 00:38:20,490 Speaker 1: She gave it back, but actually by then the neighbor 574 00:38:20,570 --> 00:38:22,770 Speaker 1: had called the police, and so she was arrested for that. 575 00:38:23,290 --> 00:38:26,730 Speaker 1: And we compared her with a white man who had 576 00:38:26,770 --> 00:38:30,450 Speaker 1: stolen about eighty dollars worth of stuff from a drug 577 00:38:30,490 --> 00:38:35,170 Speaker 1: store, Vernon Prater. When teenager Brisha Borden got booked into jail, 578 00:38:35,850 --> 00:38:40,410 Speaker 1: she got a high COMPAS score, an eight, predicting a 579 00:38:40,490 --> 00:38:44,330 Speaker 1: high risk that she'd get rearrested. And Vernon Prater, 580 00:38:44,930 --> 00:38:49,890 Speaker 1: he got a low score, a three. Now he had 581 00:38:49,890 --> 00:38:54,370 Speaker 1: already committed two armed robberies and had served time. She 582 00:38:54,930 --> 00:38:58,970 Speaker 1: was eighteen. She had given back the bike. And of course 583 00:38:59,010 --> 00:39:01,010 Speaker 1: these scores turned out to be completely wrong. She did 584 00:39:01,050 --> 00:39:02,690 Speaker 1: not go on to commit a future crime in the 585 00:39:02,690 --> 00:39:05,090 Speaker 1: next two years, and he actually went on to break 586 00:39:05,090 --> 00:39:08,010 Speaker 1: into a warehouse, steal thousands of dollars of electronics, and 587 00:39:08,050 --> 00:39:13,450 Speaker 1: he's serving a ten year term. And so that's what 588 00:39:13,490 --> 00:39:15,610 Speaker 1: the difference between a false positive and a false negative 589 00:39:15,650 --> 00:39:18,330 Speaker 1: looks like. It looks like Brisha Borden and Vernon Prater. 590 00:39:24,730 --> 00:39:30,770 Speaker 1: Chapter six, Criminal Attitudes. Julia Angwin and her team spent 591 00:39:30,850 --> 00:39:35,130 Speaker 1: over a year doing research. In May twenty sixteen, 592 00:39:35,250 --> 00:39:42,250 Speaker 1: ProPublica published their article headlined Machine Bias. The subtitle, quote, 593 00:39:42,650 --> 00:39:46,610 Speaker 1: There's software used across the country to predict future criminals, 594 00:39:46,970 --> 00:39:52,130 Speaker 1: and it's biased against blacks. Julia's team released all the 595 00:39:52,250 --> 00:39:55,650 Speaker 1: data they had collected so that anyone could check or 596 00:39:55,730 --> 00:40:02,050 Speaker 1: dispute their conclusions. What happened next was truly remarkable. The 597 00:40:02,130 --> 00:40:06,810 Speaker 1: ProPublica article provoked an outcry from some statisticians, who 598 00:40:06,970 --> 00:40:10,970 Speaker 1: argued that the data actually proved COMPAS wasn't biased. 599 00:40:11,810 --> 00:40:15,490 Speaker 1: How could they reach the opposite conclusion?
It turned out 600 00:40:15,530 --> 00:40:21,290 Speaker 1: the answer depended on how you define bias. ProPublica 601 00:40:21,370 --> 00:40:25,090 Speaker 1: had analyzed the COMPAS scores by looking backward, after 602 00:40:25,170 --> 00:40:29,170 Speaker 1: the outcomes were known. Among people who were not rearrested, 603 00:40:29,610 --> 00:40:32,530 Speaker 1: they found that black people had been assigned much higher 604 00:40:32,610 --> 00:40:36,890 Speaker 1: risk scores than white people. That seemed pretty unfair. But 605 00:40:37,050 --> 00:40:41,850 Speaker 1: statisticians use the word bias to describe how a predictor 606 00:40:41,930 --> 00:40:47,490 Speaker 1: performs when looking forward, before the outcomes happen. It turns 607 00:40:47,490 --> 00:40:50,690 Speaker 1: out that black people and white people who received the 608 00:40:50,810 --> 00:40:55,010 Speaker 1: same risk score had roughly the same chance of being rearrested. 609 00:40:55,930 --> 00:41:00,330 Speaker 1: That seems pretty fair. So whether COMPAS was fair or 610 00:41:00,450 --> 00:41:06,090 Speaker 1: unfair depended on your definition of fairness. This sparked an 611 00:41:06,130 --> 00:41:11,450 Speaker 1: explosion of academic research. Mathematicians showed there's no way 612 00:41:11,530 --> 00:41:15,170 Speaker 1: out of the problem. They proved a theorem saying it's 613 00:41:15,330 --> 00:41:19,850 Speaker 1: impossible to build a risk predictor that's fair when looking 614 00:41:19,930 --> 00:41:24,170 Speaker 1: both backward and forward unless the arrest rates for black 615 00:41:24,210 --> 00:41:28,810 Speaker 1: people and white people are identical, which they aren't. The 616 00:41:28,890 --> 00:41:32,250 Speaker 1: ProPublica article also focused attention on many other 617 00:41:32,330 --> 00:41:36,650 Speaker 1: ways in which COMPAS scores are biased. Like the healthcare 618 00:41:36,730 --> 00:41:41,730 Speaker 1: algorithm that Christine Vogeli studied, COMPAS doesn't explicitly ask 619 00:41:41,810 --> 00:41:45,490 Speaker 1: about a person's race, but race is closely correlated with 620 00:41:45,530 --> 00:41:49,770 Speaker 1: both the training data and the inputs to the algorithm. First, 621 00:41:49,810 --> 00:41:53,490 Speaker 1: the training data. COMPAS isn't actually trained to predict the 622 00:41:53,490 --> 00:41:58,170 Speaker 1: probability that a person will commit another crime. Instead, it's 623 00:41:58,170 --> 00:42:01,330 Speaker 1: trained to predict whether a person will be arrested for 624 00:42:01,410 --> 00:42:05,850 Speaker 1: committing another crime. The problem is there's abundant evidence that 625 00:42:06,290 --> 00:42:10,010 Speaker 1: in situations where black people and white people commit crimes 626 00:42:10,050 --> 00:42:13,970 Speaker 1: at the same rate, for example, illegal drug use, black 627 00:42:14,010 --> 00:42:17,810 Speaker 1: people are much more likely to get arrested, so COMPAS 628 00:42:17,930 --> 00:42:23,290 Speaker 1: is being trained on an unfair outcome. Second, the questionnaire 629 00:42:23,450 --> 00:42:28,810 Speaker 1: used to calculate COMPAS scores is pretty revealing. Some sections 630 00:42:28,930 --> 00:42:35,530 Speaker 1: assess peers, work, and social environment. The questions include how 631 00:42:35,530 --> 00:42:38,930 Speaker 1: many of your friends and acquaintances have ever been arrested? 632 00:42:39,490 --> 00:42:42,890 Speaker 1: How many have been crime victims?
How often do you 633 00:42:42,890 --> 00:42:48,730 Speaker 1: have trouble paying bills? Other sections are titled criminal personality 634 00:42:48,770 --> 00:42:53,130 Speaker 1: and criminal attitude. They ask people to agree or disagree 635 00:42:53,170 --> 00:42:57,330 Speaker 1: with such statements as the law doesn't help average people, 636 00:42:58,250 --> 00:43:01,930 Speaker 1: or many people get into trouble because society has given 637 00:43:01,970 --> 00:43:06,490 Speaker 1: them no education, jobs, or future. In a nutshell, the 638 00:43:06,570 --> 00:43:10,770 Speaker 1: predictor penalizes defendants who are honest enough to admit they 639 00:43:10,810 --> 00:43:13,970 Speaker 1: live in high crime neighborhoods or they don't fully trust 640 00:43:14,010 --> 00:43:18,050 Speaker 1: the system. From the questionnaire, it's not hard to guess 641 00:43:18,090 --> 00:43:22,250 Speaker 1: how a teenage black girl arrested for something so minor 642 00:43:22,330 --> 00:43:26,290 Speaker 1: as riding someone else's bicycle a few blocks and returning 643 00:43:26,290 --> 00:43:30,490 Speaker 1: it might have received a COMPAS score of eight. And 644 00:43:30,570 --> 00:43:35,330 Speaker 1: it's not hard to imagine why racially correlated questions would 645 00:43:35,410 --> 00:43:39,610 Speaker 1: do a good job of predicting racially correlated arrest rates. 646 00:43:40,570 --> 00:43:43,370 Speaker 1: ProPublica didn't win a Pulitzer Prize for its article, 647 00:43:43,890 --> 00:43:52,610 Speaker 1: but it was a remarkable public service. Chapter seven, Minority Report. 648 00:43:54,290 --> 00:43:57,610 Speaker 1: Putting aside the details of COMPAS, I wanted to find 649 00:43:57,610 --> 00:44:00,530 Speaker 1: out more about the role of predictive algorithms in courts. 650 00:44:01,170 --> 00:44:03,850 Speaker 1: I reached out to one of the leading legal scholars 651 00:44:03,890 --> 00:44:06,970 Speaker 1: in the country. I'm Martha Minow. I'm a law professor 652 00:44:07,010 --> 00:44:12,690 Speaker 1: at Harvard, and I have recently immersed myself in issues 653 00:44:12,770 --> 00:44:18,170 Speaker 1: of algorithmic fairness. Martha Minow has a remarkable resume. From 654 00:44:18,210 --> 00:44:21,810 Speaker 1: two thousand and nine to twenty seventeen, she served as 655 00:44:21,930 --> 00:44:25,530 Speaker 1: dean of the Harvard Law School, following now Supreme Court 656 00:44:25,610 --> 00:44:29,410 Speaker 1: Justice Elena Kagan. Martha also served on the board of 657 00:44:29,410 --> 00:44:34,410 Speaker 1: the government sponsored Legal Services Corporation, which provides legal assistance 658 00:44:34,450 --> 00:44:37,850 Speaker 1: to low income Americans. She was appointed by her former 659 00:44:37,970 --> 00:44:42,570 Speaker 1: law student, President Barack Obama. I became very interested in 660 00:44:42,650 --> 00:44:46,970 Speaker 1: and concerned about the increasing use of algorithms in worlds 661 00:44:47,010 --> 00:44:51,930 Speaker 1: that touch on my preoccupations with equal protection, due process, 662 00:44:52,290 --> 00:44:58,210 Speaker 1: constitutional rights, fairness, anti discrimination. Martha recently cosigned a 663 00:44:58,290 --> 00:45:02,890 Speaker 1: statement with twenty six other lawyers and scientists raising quote 664 00:45:03,050 --> 00:45:07,210 Speaker 1: grave concerns about the use of predictive algorithms for pre 665 00:45:07,330 --> 00:45:11,610 Speaker 1: trial risk assessment.
I asked her how courts had gotten 666 00:45:11,650 --> 00:45:16,690 Speaker 1: involved in the business of prediction. The criminal justice system has 667 00:45:16,770 --> 00:45:21,770 Speaker 1: flirted with the use of prediction forever, including discussions from 668 00:45:21,770 --> 00:45:24,970 Speaker 1: the nineteenth century on in this country about dangerousness and 669 00:45:25,090 --> 00:45:30,410 Speaker 1: whether people should be detained preventively. So far, that's not 670 00:45:30,530 --> 00:45:33,810 Speaker 1: permitted in the United States. It appears in Minority Report 671 00:45:33,850 --> 00:45:38,730 Speaker 1: and other interesting movies. The movie starring Tom Cruise tells 672 00:45:38,850 --> 00:45:42,010 Speaker 1: the story of a future in which the PreCrime division 673 00:45:42,170 --> 00:45:46,770 Speaker 1: of the police arrests people for crimes they haven't yet committed. 674 00:45:48,170 --> 00:45:50,530 Speaker 1: I'm placing you under arrest for the future murder of Sarah Marks. 675 00:45:50,610 --> 00:45:54,170 Speaker 1: We are arresting individuals who've broken no law, but they will. 676 00:45:54,770 --> 00:45:59,330 Speaker 1: The use of prediction in the context of sentencing is 677 00:45:59,410 --> 00:46:04,330 Speaker 1: part of this rather large sphere of discretion that judges 678 00:46:04,410 --> 00:46:07,890 Speaker 1: have to decide what kind of sentence fits the crime. 679 00:46:08,570 --> 00:46:13,410 Speaker 1: You're saying in sentencing, one is allowed to use essentially 680 00:46:13,490 --> 00:46:16,970 Speaker 1: information from the pre crime division about crimes that haven't 681 00:46:17,010 --> 00:46:21,690 Speaker 1: been committed yet? Well, I am horrified by that suggestion, 682 00:46:21,770 --> 00:46:24,290 Speaker 1: but I think it's fair to raise it as a concern. 683 00:46:25,010 --> 00:46:30,050 Speaker 1: The problem is if we actually acknowledge the purposes of the 684 00:46:30,050 --> 00:46:33,050 Speaker 1: criminal justice system, some of them start to get into 685 00:46:33,210 --> 00:46:39,530 Speaker 1: the future. So if one purpose is simply incapacitation, prevent 686 00:46:39,610 --> 00:46:43,010 Speaker 1: this person from walking the streets because they might hurt 687 00:46:43,050 --> 00:46:47,130 Speaker 1: someone else, there's a prediction built in. So judges have 688 00:46:47,210 --> 00:46:50,690 Speaker 1: been factoring in predictions about a defendant's future behavior for 689 00:46:50,770 --> 00:46:55,130 Speaker 1: a long time. And judges certainly aren't perfect. They can 690 00:46:55,170 --> 00:47:00,370 Speaker 1: be biased or sometimes just cranky. There are even studies 691 00:47:00,370 --> 00:47:04,570 Speaker 1: showing that judges hand down harsher sentences before lunch breaks 692 00:47:04,810 --> 00:47:09,770 Speaker 1: than after. Now, the defenders of risk prediction scores will say, well, 693 00:47:09,770 --> 00:47:12,890 Speaker 1: it's not what's the ideal, but compared to what? 694 00:47:13,730 --> 00:47:17,450 Speaker 1: And if the alternative is we're relying entirely on the 695 00:47:17,650 --> 00:47:23,650 Speaker 1: individual judges and their prejudices, their lack of education, what 696 00:47:23,770 --> 00:47:27,130 Speaker 1: they had for lunch, isn't this better in that it will 697 00:47:27,170 --> 00:47:33,530 Speaker 1: provide some kind of scaffold for more consistency?
Journalist Julia 698 00:47:33,610 --> 00:47:37,490 Speaker 1: Angwin has heard the same arguments. Some good friends, right, 699 00:47:37,530 --> 00:47:40,890 Speaker 1: who really believe in the use of these criminal risk 700 00:47:40,890 --> 00:47:43,650 Speaker 1: score algorithms, have said to me, look, Julia, the fact 701 00:47:43,770 --> 00:47:48,130 Speaker 1: is judges are terribly biased, and this is an improvement. 702 00:47:48,290 --> 00:47:51,850 Speaker 1: And my feeling is that's probably true for some judges 703 00:47:51,890 --> 00:47:55,610 Speaker 1: and maybe less true for other judges. But I don't 704 00:47:55,610 --> 00:47:59,450 Speaker 1: think it is a reason to automate bias, right? Like 705 00:47:59,490 --> 00:48:02,490 Speaker 1: I don't understand why you say, okay, humans are flawed, 706 00:48:02,530 --> 00:48:05,290 Speaker 1: so why don't we make a flawed algorithm and bake 707 00:48:05,370 --> 00:48:09,730 Speaker 1: it into every decision, because then it's really intractable. Martha 708 00:48:09,890 --> 00:48:14,450 Speaker 1: also worries that numerical risk scores are misleading. The judges 709 00:48:14,570 --> 00:48:18,090 Speaker 1: think high numbers mean people are very likely to commit 710 00:48:18,170 --> 00:48:21,890 Speaker 1: violent crime. In fact, the actual probability of violence is 711 00:48:22,010 --> 00:48:26,810 Speaker 1: very low, about eight percent according to a public assessment. 712 00:48:27,770 --> 00:48:31,930 Speaker 1: And she thinks numerical scores can lull judges into a 713 00:48:32,090 --> 00:48:36,690 Speaker 1: false sense of certainty. There's an appearance of objectivity because 714 00:48:36,690 --> 00:48:40,930 Speaker 1: it's math, but is it really? Then for lawyers, they 715 00:48:40,970 --> 00:48:45,170 Speaker 1: may have had no math, no numeracy education since high school. 716 00:48:46,090 --> 00:48:49,250 Speaker 1: Many people go into law in part because they 717 00:48:49,290 --> 00:48:53,050 Speaker 1: don't want to do anything with numbers. And there is 718 00:48:53,730 --> 00:48:58,810 Speaker 1: a larger problem, which is the deference to expertise, particularly 719 00:48:58,810 --> 00:49:03,610 Speaker 1: scientific expertise. Finally, I wanted to ask Martha if defendants 720 00:49:03,610 --> 00:49:07,410 Speaker 1: have a constitutional right to know what's inside the black 721 00:49:07,450 --> 00:49:11,210 Speaker 1: box that's helping to determine their fate. I confess 722 00:49:11,290 --> 00:49:15,090 Speaker 1: I thought the answer was an obvious yes until I 723 00:49:15,130 --> 00:49:19,890 Speaker 1: read a twenty sixteen decision by Wisconsin's Supreme Court. The 724 00:49:20,010 --> 00:49:24,210 Speaker 1: defendant in that case, Eric Loomis, pled guilty to operating 725 00:49:24,210 --> 00:49:28,290 Speaker 1: a car without the owner's permission and fleeing a traffic officer. 726 00:49:29,210 --> 00:49:32,610 Speaker 1: When Loomis was sentenced, the presentencing report given to the 727 00:49:32,690 --> 00:49:36,850 Speaker 1: judge included a COMPAS score that predicted Loomis had a 728 00:49:36,970 --> 00:49:41,530 Speaker 1: high risk for committing future crimes. He was sentenced to 729 00:49:41,690 --> 00:49:46,850 Speaker 1: six years in prison. Loomis appealed, arguing that his inability 730 00:49:46,890 --> 00:49:51,970 Speaker 1: to inspect the COMPAS algorithm violated his constitutional right to 731 00:49:52,050 --> 00:49:57,130 Speaker 1: due process.
Wisconsin's Supreme Court ultimately decided that Loomis had 732 00:49:57,290 --> 00:50:02,090 Speaker 1: no right to know how COMPAS worked. Why? First, the 733 00:50:02,090 --> 00:50:05,890 Speaker 1: Wisconsin court said the score was just one of several 734 00:50:05,970 --> 00:50:10,850 Speaker 1: inputs to the judge's sentencing decision. Second, the court said 735 00:50:11,370 --> 00:50:13,850 Speaker 1: even if Loomis didn't know how the score was determined, 736 00:50:14,210 --> 00:50:17,970 Speaker 1: he could still dispute its accuracy. Loomis appealed to the 737 00:50:18,050 --> 00:50:21,610 Speaker 1: US Supreme Court, but it declined to hear the case. 738 00:50:22,490 --> 00:50:26,650 Speaker 1: I find that troubling and not persuasive. If it was up 739 00:50:26,690 --> 00:50:31,050 Speaker 1: to you, how would you change the law? I actually 740 00:50:31,170 --> 00:50:37,890 Speaker 1: would require transparency for any use of any algorithm by 741 00:50:37,930 --> 00:50:43,730 Speaker 1: a government agency or court that has the consequence of 742 00:50:43,890 --> 00:50:50,250 Speaker 1: influencing, not just deciding, but influencing decisions about individuals' rights. 743 00:50:50,410 --> 00:50:54,410 Speaker 1: And those rights could be rights to liberty, property, opportunities. 744 00:50:54,850 --> 00:50:58,050 Speaker 1: So transparency, transparency meaning be able to see what 745 00:50:58,090 --> 00:51:01,410 Speaker 1: this algorithm does? Absolutely, and have the code and be 746 00:51:01,490 --> 00:51:03,410 Speaker 1: able to give it to your own lawyer and your 747 00:51:03,450 --> 00:51:07,410 Speaker 1: own experts. But should a state be able to buy 748 00:51:07,810 --> 00:51:11,650 Speaker 1: a computer program that's proprietary? I mean it would say, well, 749 00:51:11,850 --> 00:51:13,850 Speaker 1: I'd love to give it to you, but it's proprietary, 750 00:51:13,930 --> 00:51:16,410 Speaker 1: I can't. Should that be okay? I think not, because 751 00:51:16,450 --> 00:51:20,170 Speaker 1: if that then limits the transparency, that seems a breach. 752 00:51:20,490 --> 00:51:23,610 Speaker 1: But you know, this is a major problem, the outsourcing 753 00:51:23,610 --> 00:51:28,650 Speaker 1: of government activity that has the effect of bypassing restrictions. 754 00:51:28,970 --> 00:51:35,370 Speaker 1: Take another example, when the US government hires private contractors 755 00:51:35,410 --> 00:51:40,010 Speaker 1: to engage in war activities, they are not governed by 756 00:51:40,050 --> 00:51:43,090 Speaker 1: the same rules that govern the US military. She's saying 757 00:51:43,130 --> 00:51:47,690 Speaker 1: that government can get around constitutional limitations on the government 758 00:51:48,290 --> 00:51:51,130 Speaker 1: by just outsourcing it to somebody who's not the government. 759 00:51:51,210 --> 00:51:54,890 Speaker 1: It's currently the case, and I think that's wrong. For 760 00:51:54,930 --> 00:51:59,090 Speaker 1: her part, journalist Julia Angwin is baffled by the Wisconsin 761 00:51:59,170 --> 00:52:02,330 Speaker 1: court's ruling. I mean, we have this idea that you 762 00:52:02,410 --> 00:52:06,210 Speaker 1: should be able to argue against whatever accusations are made. 763 00:52:06,650 --> 00:52:08,970 Speaker 1: But I don't know how you make an argument against 764 00:52:09,490 --> 00:52:13,170 Speaker 1: a score, like the score says you're a seven, but you 765 00:52:13,210 --> 00:52:15,330 Speaker 1: think you're a four.
How do you make that argument 766 00:52:15,370 --> 00:52:18,770 Speaker 1: if you don't know how that seven was calculated? You 767 00:52:18,770 --> 00:52:25,370 Speaker 1: can't make an argument that you're a four. Chapter eight, 768 00:52:28,530 --> 00:52:32,410 Speaker 1: Robo Recruiter. Even if you never find yourself in a 769 00:52:32,450 --> 00:52:36,010 Speaker 1: criminal court filling out a COMPAS questionnaire, that doesn't mean 770 00:52:36,050 --> 00:52:39,370 Speaker 1: you won't be judged by a predictive algorithm. There's actually 771 00:52:39,410 --> 00:52:41,410 Speaker 1: a good chance it will happen the next time you 772 00:52:41,490 --> 00:52:44,730 Speaker 1: go looking for a job. I spoke to a scientist 773 00:52:44,730 --> 00:52:49,170 Speaker 1: at a high tech company that screens job applicants. My 774 00:52:49,250 --> 00:52:53,610 Speaker 1: name is Lindsey Zuloaga, and I'm actually educated as a physicist, 775 00:52:53,650 --> 00:52:57,970 Speaker 1: but now working for a company called HireVue. HireVue 776 00:52:58,050 --> 00:53:02,410 Speaker 1: is a video interviewing platform. Companies create an interview, 777 00:53:02,450 --> 00:53:05,370 Speaker 1: candidates can take it at any time that's convenient for them. 778 00:53:05,650 --> 00:53:09,010 Speaker 1: So they go through the questions and they record themselves answering. 779 00:53:09,650 --> 00:53:12,370 Speaker 1: So it's really a great substitute for kind of the 780 00:53:12,450 --> 00:53:18,690 Speaker 1: resume phone screening part of the process. When a candidate 781 00:53:18,730 --> 00:53:22,890 Speaker 1: takes a video interview, they're creating thousands of unique points 782 00:53:22,890 --> 00:53:27,410 Speaker 1: of data. A candidate's verbal and nonverbal cues give us 783 00:53:27,450 --> 00:53:32,410 Speaker 1: insight into their emotional engagement, thinking, and problem solving style. 784 00:53:34,250 --> 00:53:38,890 Speaker 1: This combination of cutting edge AI and validated science is 785 00:53:38,890 --> 00:53:43,410 Speaker 1: the perfect partner for making data driven talent decisions. HireVue. 786 00:53:49,530 --> 00:53:52,450 Speaker 1: You know, we'll have a customer and they are hiring 787 00:53:52,530 --> 00:53:55,450 Speaker 1: for something like a call center, say it's sales calls. 788 00:53:55,850 --> 00:53:58,130 Speaker 1: And what we do is we look at past employees 789 00:53:58,170 --> 00:54:00,890 Speaker 1: that applied, and we look at their video interviews. We 790 00:54:01,010 --> 00:54:04,090 Speaker 1: look at the words they said, tone of voice, pauses, 791 00:54:04,370 --> 00:54:07,810 Speaker 1: and facial expressions, things like that, and we look for 792 00:54:07,890 --> 00:54:11,810 Speaker 1: patterns in how those people with good sales numbers behave 793 00:54:12,130 --> 00:54:14,930 Speaker 1: as compared to people with low sales numbers. And then 794 00:54:14,970 --> 00:54:17,770 Speaker 1: we have this algorithm that scores new candidates as they 795 00:54:17,810 --> 00:54:19,850 Speaker 1: come in, and so we help kind of get those 796 00:54:19,890 --> 00:54:22,570 Speaker 1: more promising candidates to the top of the pile so 797 00:54:22,650 --> 00:54:27,610 Speaker 1: they're seen more quickly. So HireVue trains a predictive 798 00:54:27,610 --> 00:54:32,250 Speaker 1: algorithm on video interviews of past applicants who turned out 799 00:54:32,250 --> 00:54:36,090 Speaker 1: to be successful employees.
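As a rough illustration of the screen-and-audit pattern described here and in what follows, consider the sketch below. Everything in it is hypothetical: the file names, the interview-derived features, the 0.5 screening threshold, and the choice of logistic regression are assumptions for illustration, not HireVue's actual model or pipeline. The adverse-impact check follows the spirit of the "four-fifths rule" from US employment-selection guidance, which compares selection rates across groups.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one row per past applicant, with features
# extracted from their recorded interview and a label for later performance.
past = pd.read_csv("past_applicants.csv")
features = ["speech_rate", "pause_length", "positive_word_share", "smile_fraction"]

model = LogisticRegression(max_iter=1000)
model.fit(past[features], past["was_strong_performer"])

# Score a new pool of candidates and apply a screening threshold.
new = pd.read_csv("new_candidates.csv")
new["score"] = model.predict_proba(new[features])[:, 1]
new["passes_screen"] = new["score"] >= 0.5

# Adverse-impact audit: compare pass rates across groups and flag large gaps.
pass_rates = new.groupby("gender")["passes_screen"].mean()
print(pass_rates)
print("ratio of lowest to highest pass rate:", pass_rates.min() / pass_rates.max())

# One crude mitigation mentioned in the interview: find input features most
# correlated with the protected attribute and consider dropping them.
is_female = past["gender"].eq("female").astype(float)
print(past[features].corrwith(is_female).abs().sort_values(ascending=False))
```

Note that dropping features correlated with a protected attribute, the mitigation Lindsey describes next, is a blunt instrument: the remaining features can still act together as proxies, which is one reason outside audits of outcomes keep coming up in this chapter.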
But how does HireVue know 800 00:54:36,210 --> 00:54:41,370 Speaker 1: its program isn't learning sexism or racism or other similar biases? 801 00:54:41,850 --> 00:54:45,610 Speaker 1: There are lots of reasons to worry. For example, studies 802 00:54:45,650 --> 00:54:48,810 Speaker 1: from MIT have shown that facial recognition algorithms 803 00:54:49,010 --> 00:54:52,890 Speaker 1: can have a hard time reading emotions from black people's faces. 804 00:54:53,370 --> 00:54:56,810 Speaker 1: And how would HireVue's program evaluate videos from people 805 00:54:56,890 --> 00:55:00,930 Speaker 1: who might look or sound different from the average employee, say, 806 00:55:00,930 --> 00:55:04,050 Speaker 1: people who don't speak English as a native language, who 807 00:55:04,050 --> 00:55:08,250 Speaker 1: are disabled, who are on the autism spectrum, or even 808 00:55:08,530 --> 00:55:12,370 Speaker 1: people who are just a little quirky? Well, Lindsey says 809 00:55:12,570 --> 00:55:16,490 Speaker 1: HireVue tests for certain kinds of bias. So we 810 00:55:17,050 --> 00:55:20,210 Speaker 1: audit the algorithm after the fact and see if it's 811 00:55:20,250 --> 00:55:23,450 Speaker 1: scoring different groups differently in terms of age, race, and gender. 812 00:55:23,890 --> 00:55:26,970 Speaker 1: So if we do see that happening, a lot of times 813 00:55:27,050 --> 00:55:29,850 Speaker 1: that's probably coming from the training data. So maybe there 814 00:55:29,930 --> 00:55:32,450 Speaker 1: is only one female software engineer in this data set; 815 00:55:32,650 --> 00:55:35,490 Speaker 1: the model might mimic that bias. If we do see 816 00:55:35,490 --> 00:55:39,850 Speaker 1: any of that adverse impact, we simply remove the features 817 00:55:39,890 --> 00:55:42,810 Speaker 1: that are causing it. So we can say this model 818 00:55:43,090 --> 00:55:46,690 Speaker 1: is being sexist. How does the model even know what 819 00:55:46,770 --> 00:55:49,370 Speaker 1: gender the person is? So we look at all the features, 820 00:55:49,370 --> 00:55:51,650 Speaker 1: and we find the features that are the most correlated 821 00:55:51,730 --> 00:55:54,250 Speaker 1: to gender. If there are, we simply remove some of 822 00:55:54,250 --> 00:55:58,490 Speaker 1: those features. I asked Lindsey why people should believe HireVue's 823 00:55:58,570 --> 00:56:03,770 Speaker 1: or any company's assurances, or whether something more was needed. 824 00:56:04,210 --> 00:56:07,050 Speaker 1: You seem thoughtful about this, but there will be many 825 00:56:07,090 --> 00:56:10,010 Speaker 1: people coming into the industry over time who might not be 826 00:56:10,050 --> 00:56:13,570 Speaker 1: as thoughtful or as sophisticated as you are. Do you 827 00:56:13,570 --> 00:56:15,490 Speaker 1: think it would be a good idea to have third 828 00:56:15,610 --> 00:56:21,490 Speaker 1: parties come in to certify the audits for bias? I 829 00:56:21,570 --> 00:56:30,210 Speaker 1: know that's a hard question. I guess I kind 830 00:56:30,210 --> 00:56:33,650 Speaker 1: of lean towards no. So you're talking about having a 831 00:56:33,650 --> 00:56:38,650 Speaker 1: third party entity that comes in and assesses and certifies 832 00:56:38,730 --> 00:56:40,930 Speaker 1: the audit. You know, because you've described what I think 833 00:56:41,010 --> 00:56:43,810 Speaker 1: is a really impressive process. But of course, how do 834 00:56:43,850 --> 00:56:46,370 Speaker 1: we know it's true?
You know, you could reveal all 835 00:56:46,410 --> 00:56:49,330 Speaker 1: your algorithms, but that's probably not the thing you want to do. 836 00:56:49,850 --> 00:56:53,410 Speaker 1: And so the next best thing is a certifier says yes, 837 00:56:53,810 --> 00:56:56,570 Speaker 1: this audit has been done. You know, your financials 838 00:56:56,610 --> 00:57:01,650 Speaker 1: presumably get audited. Why not the results of the algorithm? 839 00:57:01,810 --> 00:57:04,570 Speaker 1: I guess the reason I'm not sure 840 00:57:04,610 --> 00:57:06,810 Speaker 1: about the certification is mostly just because 841 00:57:06,810 --> 00:57:09,010 Speaker 1: I feel like I don't know how it would work exactly. 842 00:57:09,210 --> 00:57:13,370 Speaker 1: Like, you're right totally that finances are audited. I haven't 843 00:57:13,410 --> 00:57:15,890 Speaker 1: thought about it enough to have like a strong opinion 844 00:57:15,890 --> 00:57:17,810 Speaker 1: that it should happen, because it's like, okay, we have 845 00:57:17,850 --> 00:57:22,330 Speaker 1: all these different models, it's constantly changing. How do 846 00:57:22,370 --> 00:57:26,770 Speaker 1: they audit every single model all the time? I was 847 00:57:26,850 --> 00:57:30,690 Speaker 1: impressed with Lindsey's willingness as a scientist to think in 848 00:57:30,810 --> 00:57:34,330 Speaker 1: real time about a hard question, and it turns out 849 00:57:34,730 --> 00:57:38,330 Speaker 1: she kept thinking about it afterwards. A few months later, 850 00:57:38,770 --> 00:57:41,770 Speaker 1: she wrote back to me to say that she had changed 851 00:57:41,770 --> 00:57:45,210 Speaker 1: her mind. We do have a lot of private information, 852 00:57:46,010 --> 00:57:47,970 Speaker 1: but if we don't share it, people tend to assume 853 00:57:48,010 --> 00:57:52,050 Speaker 1: the worst. So I've decided, after thinking about it quite 854 00:57:52,090 --> 00:57:55,090 Speaker 1: a bit, that I definitely support the third party auditing 855 00:57:55,090 --> 00:58:00,010 Speaker 1: of algorithms. Sometimes people assume we're doing horrible, horrible things, 856 00:58:00,290 --> 00:58:03,010 Speaker 1: and that can be frustrating. But I do think the 857 00:58:03,090 --> 00:58:05,290 Speaker 1: more transparent we can be about what we are doing, 858 00:58:05,810 --> 00:58:10,930 Speaker 1: the better. Several months later, Lindsey emailed again to say 859 00:58:10,970 --> 00:58:14,530 Speaker 1: that HireVue was now undergoing a third party audit. 860 00:58:15,290 --> 00:58:23,290 Speaker 1: She says she's excited to learn from the results. Chapter nine, 861 00:58:23,410 --> 00:58:29,330 Speaker 1: Confronting the Black Box. So HireVue, at first reluctant, 862 00:58:29,730 --> 00:58:35,650 Speaker 1: says it's now engaging external auditors. What about Equivant, whose 863 00:58:35,690 --> 00:58:39,490 Speaker 1: COMPAS scores can heavily influence prison sentences, but which has 864 00:58:39,530 --> 00:58:43,610 Speaker 1: steadfastly refused to let anyone even see how its simple 865 00:58:43,650 --> 00:58:47,930 Speaker 1: algorithm works? Well, just before we released this podcast, I 866 00:58:48,010 --> 00:58:51,930 Speaker 1: checked back with them.
A company spokesperson wrote that Equivant 867 00:58:52,010 --> 00:58:55,970 Speaker 1: now agrees that the COMPAS scoring process quote should be 868 00:58:56,010 --> 00:59:00,490 Speaker 1: made available for third party examination, but they weren't releasing 869 00:59:00,530 --> 00:59:04,210 Speaker 1: it yet because they first wanted to file for copyright 870 00:59:04,290 --> 00:59:09,610 Speaker 1: protection on their simple algorithm. So we're still waiting. You 871 00:59:09,730 --> 00:59:13,170 Speaker 1: might ask, should it be up to the companies to decide? 872 00:59:13,690 --> 00:59:19,530 Speaker 1: Aren't there laws or regulations? The answer is there's not much. 873 00:59:20,370 --> 00:59:23,370 Speaker 1: Governments are just now waking up to the idea that 874 00:59:23,410 --> 00:59:26,530 Speaker 1: they have a role to play. I traveled back to 875 00:59:26,570 --> 00:59:29,610 Speaker 1: New York City to talk to someone who's been involved 876 00:59:29,650 --> 00:59:33,650 Speaker 1: in this question. My name's Rashida Richardson, and I'm a 877 00:59:33,730 --> 00:59:37,330 Speaker 1: civil rights lawyer who focuses on the social implications of 878 00:59:37,450 --> 00:59:41,650 Speaker 1: artificial intelligence. Rashida served as the director of policy research 879 00:59:41,690 --> 00:59:44,890 Speaker 1: at the AI Now Institute at NYU, where she worked 880 00:59:44,890 --> 00:59:49,170 Speaker 1: with Kate Crawford, the Australian expert on algorithmic bias that 881 00:59:49,210 --> 00:59:52,850 Speaker 1: I spoke to earlier in the episode. In twenty eighteen, 882 00:59:53,490 --> 00:59:56,450 Speaker 1: New York City became the first jurisdiction in the US 883 00:59:56,490 --> 01:00:00,010 Speaker 1: to create a task force to come up with recommendations 884 01:00:00,050 --> 01:00:04,250 Speaker 1: about government use of predictive algorithms, or, as they call them, 885 01:00:04,650 --> 01:00:10,130 Speaker 1: automated decision systems. Unfortunately, the task force bogged down in 886 01:00:10,210 --> 01:00:16,010 Speaker 1: details and wasn't very productive. In response, Rashida led a 887 01:00:16,010 --> 01:00:18,970 Speaker 1: group of twenty seven experts that wrote a fifty six 888 01:00:19,050 --> 01:00:26,330 Speaker 1: page shadow report entitled Confronting Black Boxes that offered concrete proposals. 889 01:00:27,570 --> 01:00:30,570 Speaker 1: New York City, it turns out, uses quite a few 890 01:00:30,610 --> 01:00:37,290 Speaker 1: algorithms to make major decisions. You have the school matching algorithms. 891 01:00:37,330 --> 01:00:42,090 Speaker 1: You have an algorithm used by the child welfare agency here. 892 01:00:42,650 --> 01:00:46,450 Speaker 1: You have public benefits algorithms that are used to determine 893 01:00:46,530 --> 01:00:50,770 Speaker 1: who will qualify for, or have terminated, their public benefits, whether that's 894 01:00:50,850 --> 01:00:56,210 Speaker 1: Medicaid or temporary food assistance, or whether they'll receive 895 01:00:56,250 --> 01:01:00,410 Speaker 1: access to those benefits. You have a gang database which 896 01:01:00,450 --> 01:01:03,650 Speaker 1: tries to identify who is likely to be in a gang, 897 01:01:03,690 --> 01:01:06,250 Speaker 1: and that's used by both the DA's office and the 898 01:01:06,290 --> 01:01:10,370 Speaker 1: police department.
If you had to make a guess, how 899 01:01:10,370 --> 01:01:14,810 Speaker 1: many predictive algorithms are used by the City of New York? 900 01:01:15,890 --> 01:01:21,410 Speaker 1: I'd say upwards of thirty, and I'm underestimating with that number. 901 01:01:22,370 --> 01:01:28,290 Speaker 1: How many of these thirty plus algorithms are transparent about 902 01:01:28,290 --> 01:01:33,170 Speaker 1: how they work, about their code? None. So what should 903 01:01:33,210 --> 01:01:36,210 Speaker 1: New York do? If it was up to you, what should 904 01:01:36,250 --> 01:01:39,770 Speaker 1: be the behavior of a responsible city with respect to 905 01:01:40,170 --> 01:01:43,250 Speaker 1: the algorithms it uses? I think the first step is 906 01:01:43,370 --> 01:01:49,410 Speaker 1: creating greater transparency, some annual acknowledgement of what is being used, 907 01:01:49,410 --> 01:01:52,090 Speaker 1: how it's being used, whether it's been tested or had 908 01:01:52,090 --> 01:01:56,890 Speaker 1: a validation study. And then you would also want general 909 01:01:56,930 --> 01:01:59,930 Speaker 1: information about the inputs or factors that are used by 910 01:01:59,930 --> 01:02:03,050 Speaker 1: these systems to make predictions, because in some cases you 911 01:02:03,170 --> 01:02:07,610 Speaker 1: have factors that are just discriminatory or proxies for protected 912 01:02:07,690 --> 01:02:11,450 Speaker 1: statuses like race, gender, ability status. All right, so 913 01:02:11,570 --> 01:02:16,770 Speaker 1: step one, disclose what systems you're using. Yes. And then 914 01:02:17,530 --> 01:02:21,770 Speaker 1: the second step, I think, is creating a system of audits, 915 01:02:21,770 --> 01:02:26,250 Speaker 1: both prior to procurement and then, once procured, ongoing auditing 916 01:02:26,330 --> 01:02:29,290 Speaker 1: of the system to at least have a gauge on 917 01:02:29,330 --> 01:02:32,090 Speaker 1: what it's doing in real time. A lot of the horror 918 01:02:32,170 --> 01:02:34,970 Speaker 1: stories we hear are about fully implemented tools that were 919 01:02:35,050 --> 01:02:38,890 Speaker 1: in the works for years. There's never a pause button to 920 01:02:39,370 --> 01:02:43,050 Speaker 1: reevaluate or look at how a system is working in real time. 921 01:02:43,570 --> 01:02:46,130 Speaker 1: And even when I did studies on the use of 922 01:02:46,170 --> 01:02:50,650 Speaker 1: predictive policing systems, I looked at thirteen jurisdictions, only one 923 01:02:50,690 --> 01:02:54,690 Speaker 1: of them actually did a retrospective review of their system. 924 01:02:54,890 --> 01:02:57,010 Speaker 1: So what's your theory about how you get the 925 01:02:57,010 --> 01:03:00,650 Speaker 1: auditing done?
If you are going to outsource to third parties, 926 01:03:01,130 --> 01:03:03,650 Speaker 1: I think there's going to have to be some approval 927 01:03:03,690 --> 01:03:07,650 Speaker 1: process to assess their level of independence, but also any 928 01:03:07,730 --> 01:03:10,690 Speaker 1: conflict of interest issues that may come up, and 929 01:03:10,770 --> 01:03:13,610 Speaker 1: then also doing some thinking about what types of expertise 930 01:03:13,610 --> 01:03:16,650 Speaker 1: are needed, because I think if you don't necessarily have 931 01:03:16,690 --> 01:03:20,210 Speaker 1: someone who understands that social context or even the history 932 01:03:20,810 --> 01:03:25,170 Speaker 1: of a certain government sector, then you could have a 933 01:03:25,210 --> 01:03:28,090 Speaker 1: tool that is technically accurate and meets all of the 934 01:03:28,130 --> 01:03:31,370 Speaker 1: technical standards, but is still reproducing harm because it's not 935 01:03:31,450 --> 01:03:34,890 Speaker 1: paying attention to that social context. Should a government be 936 01:03:35,450 --> 01:03:42,610 Speaker 1: permitted to purchase an automated decision system where the code 937 01:03:42,810 --> 01:03:48,130 Speaker 1: can't be disclosed by contract? No, and in fact, there's 938 01:03:48,250 --> 01:03:53,290 Speaker 1: movement around creating more provisions that vendors must waive trade 939 01:03:53,330 --> 01:03:56,850 Speaker 1: secrecy claims once they enter a contract with the government. 940 01:03:57,530 --> 01:04:00,290 Speaker 1: Rashida says we need laws to regulate the use of 941 01:04:00,290 --> 01:04:04,570 Speaker 1: predictive algorithms, both by governments and by private companies like 942 01:04:04,690 --> 01:04:08,250 Speaker 1: HireVue. We're beginning to see bills being explored in 943 01:04:08,290 --> 01:04:14,050 Speaker 1: different states. Massachusetts, Vermont, and Washington DC are considering setting 944 01:04:14,090 --> 01:04:17,610 Speaker 1: up commissions to look at the government use of predictive algorithms. 945 01:04:18,530 --> 01:04:23,010 Speaker 1: Idaho recently passed a first in the nation law requiring 946 01:04:23,050 --> 01:04:27,610 Speaker 1: that pretrial risk algorithms be free of bias and transparent. 947 01:04:28,330 --> 01:04:33,290 Speaker 1: It blocks manufacturers of tools like COMPAS from claiming trade 948 01:04:33,330 --> 01:04:37,290 Speaker 1: secret protection. And at the national level, a bill was 949 01:04:37,410 --> 01:04:42,850 Speaker 1: recently introduced in the US Congress, the Algorithmic Accountability Act. 950 01:04:43,650 --> 01:04:47,090 Speaker 1: The bill would require that private companies ensure certain types 951 01:04:47,130 --> 01:04:53,370 Speaker 1: of algorithms are audited for bias. Unfortunately, it doesn't require 952 01:04:53,450 --> 01:04:56,730 Speaker 1: that the results of the audit are made public, so 953 01:04:56,770 --> 01:04:59,850 Speaker 1: there's still a long way to go. Rashida thinks it's 954 01:04:59,850 --> 01:05:04,410 Speaker 1: important that regulations don't just focus on technical issues. They 955 01:05:04,490 --> 01:05:07,810 Speaker 1: need to look at the larger context.
Part of the 956 01:05:07,890 --> 01:05:10,730 Speaker 1: problems that we're identifying with these systems is that 957 01:05:10,770 --> 01:05:14,490 Speaker 1: they're amplifying and reproducing a lot of the historical and 958 01:05:14,570 --> 01:05:18,010 Speaker 1: current discrimination that we see in society. There are large 959 01:05:18,090 --> 01:05:21,050 Speaker 1: questions we've been unable to answer as a society of 960 01:05:21,290 --> 01:05:24,450 Speaker 1: how do you deal with the compounded effect of fifty 961 01:05:24,570 --> 01:05:27,050 Speaker 1: years of discrimination? And we don't have a simple answer, 962 01:05:27,050 --> 01:05:29,850 Speaker 1: and there's not necessarily going to be a technical solution. 963 01:05:30,210 --> 01:05:33,170 Speaker 1: But I think having access to more data and an 964 01:05:33,290 --> 01:05:36,050 Speaker 1: understanding of how these systems are working will help us 965 01:05:36,050 --> 01:05:39,330 Speaker 1: evaluate whether these tools are even being evaluated and 966 01:05:39,370 --> 01:05:45,530 Speaker 1: addressing the larger social questions. Finally, Kate Crawford says laws 967 01:05:45,610 --> 01:05:49,890 Speaker 1: alone likely won't be enough. There's another thing we need 968 01:05:49,890 --> 01:05:53,530 Speaker 1: to focus on. In the end, it really matters who 969 01:05:53,610 --> 01:05:57,090 Speaker 1: is in the room designing these systems. If you have 970 01:05:57,370 --> 01:06:00,490 Speaker 1: people sitting around a conference table and they all look the same. 971 01:06:00,570 --> 01:06:03,290 Speaker 1: Perhaps they all did the same type of engineering degree. 972 01:06:03,370 --> 01:06:06,130 Speaker 1: Perhaps they're all men. Perhaps they're all pretty middle class 973 01:06:06,210 --> 01:06:08,770 Speaker 1: or pretty well off. They're going to be designing systems 974 01:06:08,810 --> 01:06:11,930 Speaker 1: that reflect their worldview. What we're learning is that the 975 01:06:11,970 --> 01:06:14,290 Speaker 1: more diverse those rooms are, and the more we can 976 01:06:14,370 --> 01:06:17,490 Speaker 1: question those kinds of assumptions, the better we can actually 977 01:06:17,530 --> 01:06:27,290 Speaker 1: design systems for a diverse world. Conclusion, Choose Your Planet. 978 01:06:30,370 --> 01:06:33,330 Speaker 1: So there you have it, two sides of the Brave New Planet. 979 01:06:33,850 --> 01:06:39,410 Speaker 1: Predictive algorithms. A sixty year old dream of artificial intelligence, 980 01:06:39,890 --> 01:06:45,090 Speaker 1: machines making human like decisions, has finally become a reality. 981 01:06:45,970 --> 01:06:49,050 Speaker 1: If a task can be turned into a prediction problem, 982 01:06:49,090 --> 01:06:52,450 Speaker 1: and if you've got a mountain of training data, algorithms 983 01:06:52,610 --> 01:06:57,050 Speaker 1: can learn to do the job. Countless applications are possible: 984 01:06:57,610 --> 01:07:03,890 Speaker 1: translating languages instantaneously, providing expert medical diagnoses for eye diseases 985 01:07:03,930 --> 01:07:09,370 Speaker 1: and cancer to patients anywhere, improving drug development, all at 986 01:07:09,450 --> 01:07:13,410 Speaker 1: levels comparable to or better than human experts.
But it's 987 01:07:13,450 --> 01:07:17,890 Speaker 1: also letting governments and companies make automatic decisions about you: 988 01:07:19,050 --> 01:07:21,850 Speaker 1: whether you should get admitted to college, be hired for 989 01:07:21,890 --> 01:07:26,130 Speaker 1: a job, get a loan, get housing assistance, be granted bail, 990 01:07:26,850 --> 01:07:31,330 Speaker 1: or get medical attention. The problem is that algorithms that 991 01:07:31,490 --> 01:07:35,490 Speaker 1: learn to make human like decisions based on past human 992 01:07:35,530 --> 01:07:42,170 Speaker 1: outcomes can acquire a lot of human biases about gender, race, class, 993 01:07:42,250 --> 01:07:48,810 Speaker 1: and more, often masquerading as objective judgment. Even worse, you 994 01:07:48,970 --> 01:07:52,130 Speaker 1: usually don't have a right even to know you're being 995 01:07:52,250 --> 01:07:55,970 Speaker 1: judged by a machine, or what's inside the black box, 996 01:07:56,610 --> 01:08:00,850 Speaker 1: or whether the algorithms are accurate or fair. Should laws 997 01:08:00,930 --> 01:08:04,810 Speaker 1: require that automated decision systems used by governments or companies 998 01:08:05,050 --> 01:08:09,650 Speaker 1: be transparent? Should they require public auditing for accuracy 999 01:08:09,810 --> 01:08:16,090 Speaker 1: and fairness? And what exactly is fairness, anyway? Governments are 1000 01:08:16,130 --> 01:08:18,730 Speaker 1: just beginning to wake up to these issues, and they're 1001 01:08:18,770 --> 01:08:22,050 Speaker 1: not sure what they should do. In the coming years, 1002 01:08:22,330 --> 01:08:26,490 Speaker 1: they'll decide what rules to set, or perhaps to do 1003 01:08:26,570 --> 01:08:30,490 Speaker 1: nothing at all. So what can you do? A lot, 1004 01:08:30,570 --> 01:08:33,770 Speaker 1: it turns out. You don't have to be an expert 1005 01:08:33,810 --> 01:08:36,810 Speaker 1: and you don't have to do it alone. Start by 1006 01:08:36,930 --> 01:08:41,690 Speaker 1: learning a bit more. Invite friends over, virtually or in 1007 01:08:41,770 --> 01:08:45,410 Speaker 1: person when it's safe, for dinner and debate about what 1008 01:08:45,450 --> 01:08:49,090 Speaker 1: we should do. Or organize a conversation at a book club, 1009 01:08:49,410 --> 01:08:53,690 Speaker 1: a faith group, or a campus event. And then email 1010 01:08:53,730 --> 01:08:57,570 Speaker 1: your city or state representatives to ask what they're doing 1011 01:08:57,610 --> 01:09:02,010 Speaker 1: about the issue, maybe even proposing first steps like setting 1012 01:09:02,050 --> 01:09:06,810 Speaker 1: up a task force. When people get engaged, action happens. 1013 01:09:07,650 --> 01:09:10,770 Speaker 1: You'll find lots of resources and ideas at our website, 1014 01:09:11,170 --> 01:09:15,970 Speaker 1: Brave New Planet dot org. It's time to choose our planet. 1015 01:09:16,650 --> 01:09:31,330 Speaker 1: The future is up to us. Brave New Planet 1016 01:09:31,450 --> 01:09:33,610 Speaker 1: is a co production of the Broad Institute of MIT 1017 01:09:33,730 --> 01:09:37,530 Speaker 1: and Harvard, Pushkin Industries, and The Boston Globe, with support 1018 01:09:37,610 --> 01:09:40,930 Speaker 1: from the Alfred P. Sloan Foundation.
Our show is produced 1019 01:09:40,930 --> 01:09:44,970 Speaker 1: by Rebecca Lee Douglas with Mary Doo, theme song composed 1020 01:09:44,970 --> 01:09:48,770 Speaker 1: by Ned Porter, mastering and sound design by James Garver, 1021 01:09:49,410 --> 01:09:53,090 Speaker 1: fact checking by Joseph Fridman, and a Stitt and Enchant. 1022 01:09:53,970 --> 01:09:58,170 Speaker 1: Special thanks to Christine Heenan and Rachel Roberts at Clarendon Communications, 1023 01:09:58,730 --> 01:10:02,290 Speaker 1: to Lee McGuire, Kristen Zarelli, and Justine Levin Allerhand at 1024 01:10:02,290 --> 01:10:06,450 Speaker 1: the Broad, to Mia Lobel and Heather Fain at Pushkin, and 1025 01:10:07,010 --> 01:10:10,370 Speaker 1: to Eli and Edythe Broad, who made the Broad Institute possible. 1026 01:10:11,010 --> 01:10:14,330 Speaker 1: This is Brave New Planet. I'm Eric Lander.