WEBVTT - What Algorithms Say About You 0:00:15.250 --> 0:00:28.890 Pushkin. You're listening to Brave New Planet, a podcast about 0:00:28.930 --> 0:00:33.090 amazing new technologies that could dramatically improve our world. Or 0:00:33.490 --> 0:00:36.010 if we don't make wise choices, could leave us a 0:00:36.050 --> 0:00:40.690 lot worse off. Utopia or dystopia? It's up to us. 0:00:46.970 --> 0:00:52.850 On November eleventh, twenty sixteen, the Babel fish burst from fiction 0:00:53.290 --> 0:00:58.450 into reality. The Babel fish was conceived forty years ago in 0:00:58.610 --> 0:01:03.450 Douglas Adams's science fiction classic The Hitchhiker's Guide to the Galaxy. 0:01:04.130 --> 0:01:08.410 In the story, a hapless Earthling finds himself a stowaway 0:01:08.690 --> 0:01:13.010 on a Vogon spaceship. When the alien captain starts an 0:01:13.010 --> 0:01:17.370 announcement over the loudspeaker, his companion tells him to stick 0:01:17.410 --> 0:01:23.090 a small yellow fish in his ear. Listen, it's important, 0:01:24.170 --> 0:01:27.730 it's a I can't just put this in your ear. 0:01:28.570 --> 0:01:35.450 Suddenly he's able to understand the language. The Babel fish is small, yellow, 0:01:35.850 --> 0:01:39.490 leech-like and probably the oddest thing in the universe. 0:01:40.930 --> 0:01:45.530 It feeds on brainwave energy, absorbing all unconscious frequencies, 0:01:46.010 --> 0:01:48.570 the practical upshot of which is that if you stick 0:01:48.610 --> 0:01:51.770 one in your ear, you instantly understand anything said to 0:01:51.810 --> 0:01:54.370 you in any form of language. At the time, the 0:01:54.450 --> 0:01:58.810 idea of sticking an instantaneous universal translator in your ear 0:01:59.330 --> 0:02:04.490 seemed charmingly absurd. But a couple of years ago, Google 0:02:04.570 --> 0:02:08.890 and other companies announced plans to start selling Babel fish. Well, 0:02:09.130 --> 0:02:12.570 not fish actually, but earbuds that do the same thing. 
0:02:13.490 --> 0:02:17.050 The key breakthrough came in November twenty sixteen, when Google 0:02:17.170 --> 0:02:23.010 replaced the technology behind its translate program. Overnight, the Internet 0:02:23.090 --> 0:02:28.490 realized that something extraordinary had happened. A Japanese computer scientist 0:02:28.570 --> 0:02:31.890 ran a quick test. He dashed off his own Japanese 0:02:31.930 --> 0:02:36.250 translation of the opening lines of Ernest Hemingway's short story 0:02:36.690 --> 0:02:41.290 The Snows of Kilimanjaro, and dared Google Translate to turn 0:02:41.330 --> 0:02:45.210 it back into English. Here's the opening passage from the 0:02:45.250 --> 0:02:49.370 Simon and Schuster audiobook. Kilimanjaro is a snow-covered 0:02:49.410 --> 0:02:53.050 mountain nineteen thousand, seven hundred and ten feet high and 0:02:53.210 --> 0:02:56.090 is said to be the highest mountain in Africa. Its 0:02:56.090 --> 0:03:00.130 western summit is called the Masai Ngàje Ngài, the House 0:03:00.170 --> 0:03:03.530 of God. Close to the western summit there is the 0:03:03.650 --> 0:03:07.690 dried and frozen carcass of a leopard. No one has 0:03:07.730 --> 0:03:11.810 explained what the leopard was seeking at that altitude. Let's just 0:03:11.890 --> 0:03:15.970 consider that last sentence. No one has explained what the 0:03:16.090 --> 0:03:21.330 leopard was seeking at that altitude. One day earlier, Google 0:03:21.410 --> 0:03:26.650 had mangled the back translation, quote: Whether the leopard had 0:03:26.690 --> 0:03:30.890 what the demand at that altitude? There is no that 0:03:30.970 --> 0:03:37.850 nobody explained. But now Google Translate returned, quote: No one 0:03:37.930 --> 0:03:43.130 has ever explained what leopard wanted at that altitude. It 0:03:43.250 --> 0:03:49.570 was perfect except for a missing "the." What explained 0:03:49.610 --> 0:03:54.970 the great leap? 
Well, Google had built a predictive algorithm 0:03:55.010 --> 0:03:59.330 that taught itself how to translate between English and Japanese 0:03:59.810 --> 0:04:03.690 by training on a vast library of examples and tweaking 0:04:03.730 --> 0:04:07.210 its connections to get better and better at predicting the 0:04:07.330 --> 0:04:11.330 right answer. In many ways, the algorithm was a black box. 0:04:11.970 --> 0:04:15.730 No one understood precisely how it worked, but it did 0:04:15.770 --> 0:04:21.890 amazingly well. Predictive algorithms turn out to be remarkably general. 0:04:22.570 --> 0:04:25.250 They can be applied to predict which movies a Netflix 0:04:25.370 --> 0:04:27.970 user will want to see next, or whether an eye 0:04:28.050 --> 0:04:32.690 exam or a mammogram indicates disease. But it doesn't stop there. 0:04:33.330 --> 0:04:37.610 Predictive algorithms are also being trained to make societal decisions: 0:04:38.450 --> 0:04:41.490 who to hire for a job, whether to approve a 0:04:41.570 --> 0:04:45.890 mortgage application, what students to let into a college, what 0:04:46.050 --> 0:04:49.530 arrestees to let out on bail. But what 0:04:49.650 --> 0:04:54.170 exactly are these big black boxes learning from massive data sets? 0:04:54.770 --> 0:04:58.330 Are they gaining deep new insights about people? Or might 0:04:58.370 --> 0:05:07.490 they sometimes be automating systemic biases? Today's big question: when 0:05:07.530 --> 0:05:11.690 should predictive algorithms be allowed to make big decisions about people? 0:05:12.490 --> 0:05:15.970 And before they judge us, should we have the right 0:05:16.010 --> 0:05:21.090 to know what's inside the black box? My name is 0:05:21.170 --> 0:05:23.570 Eric Lander. I'm a scientist who works on ways to 0:05:23.610 --> 0:05:27.010 improve human health. I helped lead the Human Genome Project, 0:05:27.210 --> 0:05:30.370 and today I lead the Broad Institute of MIT and Harvard. 
0:05:31.090 --> 0:05:34.890 In the twenty-first century, powerful technologies have been appearing 0:05:34.930 --> 0:05:39.290 at a breathtaking pace, related to the Internet, artificial intelligence, 0:05:39.330 --> 0:05:44.610 genetic engineering, and more. They have amazing potential upsides, but 0:05:44.730 --> 0:05:47.410 we can't ignore the risks that come with them. The 0:05:47.530 --> 0:05:52.210 decisions aren't just up to scientists or politicians. Whether we 0:05:52.290 --> 0:05:55.130 like it or not, we, all of us, are the 0:05:55.210 --> 0:05:59.130 stewards of a brave new planet. This generation's choices will 0:05:59.130 --> 0:06:11.570 shape the future as never before. Coming up on today's 0:06:11.570 --> 0:06:20.210 episode of Brave New Planet, predictive algorithms. We hear from 0:06:20.210 --> 0:06:23.650 a physician at Google about how this technology might help 0:06:23.770 --> 0:06:27.410 keep millions of people with diabetes from going blind. And 0:06:27.490 --> 0:06:30.010 the idea was, well, if you could retrain the model, 0:06:30.450 --> 0:06:33.650 you could get to more patients to screen them for disease. 0:06:34.010 --> 0:06:37.410 The first iteration of the model was on par with 0:06:37.690 --> 0:06:41.450 US board-certified ophthalmologists. I speak with an AI 0:06:41.610 --> 0:06:46.930 researcher about how predictive algorithms sometimes learn to be sexist 0:06:47.010 --> 0:06:50.690 and racist. If you typed in I am a white man, 0:06:50.730 --> 0:06:53.090 you would get positive sentiment. If you typed in I 0:06:53.130 --> 0:06:56.810 am a black lesbian, for example, negative sentiment. We hear 0:06:56.850 --> 0:07:01.330 how algorithms are affecting the criminal justice system. 
For black defendants, 0:07:01.570 --> 0:07:05.330 it was much more likely to incorrectly predict that they 0:07:05.330 --> 0:07:06.730 were going to go on to commit a future 0:07:06.730 --> 0:07:09.610 crime when they didn't, and for white defendants, it was 0:07:09.810 --> 0:07:12.490 much more likely to predict that they were going to 0:07:12.530 --> 0:07:14.770 go on to not commit a future crime when they did. 0:07:15.330 --> 0:07:18.370 And we hear from a policy expert about whether these 0:07:18.410 --> 0:07:21.650 systems should be regulated. A lot of the horror stories 0:07:21.690 --> 0:07:25.610 are about fully implemented tools that were in work for years. 0:07:25.690 --> 0:07:29.650 There's never a pause button to reevaluate or look at 0:07:29.650 --> 0:07:32.690 how a system is working in real time. Stay with us. 0:07:36.330 --> 0:07:42.130 Chapter one, The Big Black Box. To better understand these algorithms, 0:07:42.170 --> 0:07:44.370 I decided to speak with one of the creators of 0:07:44.410 --> 0:07:49.090 the technology that transformed Google Translate. My name is Greg Corrado, 0:07:49.250 --> 0:07:52.330 and I'm a distinguished scientist at Google Research. Early in 0:07:52.370 --> 0:07:56.370 his career, Greg had trained in neuroscience, but he soon shifted 0:07:56.410 --> 0:08:01.210 his focus from organic intelligence to artificial. And that turned 0:08:01.210 --> 0:08:04.290 out to be really a very lucky moment, because I 0:08:04.410 --> 0:08:08.570 was becoming interested in artificial intelligence at exactly the moment 0:08:08.610 --> 0:08:12.010 that artificial intelligence was changing so much. Ever since the 0:08:12.050 --> 0:08:16.090 field of artificial intelligence started more than sixty years ago, 0:08:16.610 --> 0:08:19.930 there have been two warring approaches about how to teach 0:08:20.010 --> 0:08:24.210 machines to do human tasks. We might call them human 0:08:24.290 --> 0:08:28.010 rules versus machine learning. 
The way that we used to 0:08:28.050 --> 0:08:32.610 try to get computers to recognize patterns was to program 0:08:32.650 --> 0:08:36.850 into them specific rules. So we would say, oh, well, 0:08:36.850 --> 0:08:38.810 you can tell the difference between a cat and a 0:08:38.930 --> 0:08:42.530 dog by how long its whiskers are and what kind 0:08:42.570 --> 0:08:45.090 of fur it has and does it have stripes? And 0:08:45.210 --> 0:08:48.930 trying to put these rules into computers, it kind of worked, 0:08:49.450 --> 0:08:52.650 but it made for a lot of mistakes. The other 0:08:52.690 --> 0:08:56.330 approach was machine learning: let the computer figure everything out 0:08:56.330 --> 0:09:01.290 for itself, somewhat like the biological brain. The machine learning 0:09:01.290 --> 0:09:05.930 system is actually built of tiny little decision makers or neurons. 0:09:06.290 --> 0:09:09.730 They start out connected very much in random ways, but 0:09:10.250 --> 0:09:13.250 we give the system feedback. So, for example, if it's 0:09:13.250 --> 0:09:16.090 guessing between a cat and a dog and it gets 0:09:16.090 --> 0:09:18.890 one wrong, we tell the system that it got one wrong, 0:09:18.970 --> 0:09:21.810 and we make little changes inside so that it's much 0:09:21.850 --> 0:09:25.210 more likely to recognize that cat as a cat and 0:09:25.370 --> 0:09:29.050 not mistake it for a dog. Over time, the system 0:09:29.090 --> 0:09:32.650 gets better and better and better. Machine learning had been 0:09:32.690 --> 0:09:37.450 around for decades with rather unimpressive results. The number of 0:09:37.610 --> 0:09:42.570 connections and neurons in those early systems was pretty small. 
0:09:42.890 --> 0:09:47.690 We didn't realize until about two thousand ten that computers 0:09:47.730 --> 0:09:51.250 had gotten fast enough and the data sets were big 0:09:51.370 --> 0:09:55.850 enough that these systems could actually learn from patterns and 0:09:55.930 --> 0:10:01.930 learn from data better than we could describe rules manually. 0:10:02.170 --> 0:10:07.570 Machine learning made huge leaps. Google itself became the leading 0:10:07.690 --> 0:10:12.570 driver of machine learning. In twenty eleven, Corrado joined with 0:10:12.650 --> 0:10:17.490 two colleagues to form a unit called Google Brain. Among 0:10:17.570 --> 0:10:22.090 other things, they applied a machine learning approach to language translation. 0:10:22.970 --> 0:10:28.290 The strategy turned out to be remarkably effective. It doesn't 0:10:28.410 --> 0:10:31.130 learn French the way you would learn French in high school. 0:10:31.530 --> 0:10:34.290 It learns French the way you would learn French at home, 0:10:34.730 --> 0:10:38.050 much more like the way that a child learns the language. 0:10:38.250 --> 0:10:41.930 We give the machine the English sentence, and then we 0:10:41.970 --> 0:10:45.370 give it an example of a French translation of that 0:10:45.530 --> 0:10:49.730 whole sentence. We show a whole lot of them, probably 0:10:50.170 --> 0:10:53.330 more French and English sentences than you could read in 0:10:53.370 --> 0:10:57.690 your whole life. And by seeing so many examples of 0:10:58.050 --> 0:11:02.170 entire sentences, the system is able to learn, oh, this 0:11:02.250 --> 0:11:05.170 is how I would say this in French. That's actually, 0:11:05.210 --> 0:11:09.530 at this point, about as good as a bilingual human 0:11:09.570 --> 0:11:13.890 would produce. Soon Google was training predictive algorithms for all 0:11:14.090 --> 0:11:17.890 sorts of purposes. 
We use neural network predictors to help 0:11:18.090 --> 0:11:22.370 rank search results, help people organize their photos, to recognize speech, 0:11:22.450 --> 0:11:27.370 to find driving directions, to help complete emails. Really anything 0:11:27.410 --> 0:11:30.010 that you can think of where there's some notion of 0:11:30.130 --> 0:11:34.370 finding a pattern or making a prediction, artificial intelligence might 0:11:34.410 --> 0:11:39.330 be at play. Predictive algorithms would become ubiquitous in commerce. 0:11:39.770 --> 0:11:43.410 They let Netflix know which movies to recommend to each customer, 0:11:43.850 --> 0:11:47.890 Amazon to suggest products users might be interested in purchasing, 0:11:48.210 --> 0:11:53.050 and much more. While they're shockingly useful, they can also 0:11:53.130 --> 0:11:57.810 be inscrutable. Modern neural networks are like a black box. 0:11:58.370 --> 0:12:02.850 Understanding how they make their predictions can be surprisingly difficult. 0:12:03.210 --> 0:12:05.530 When you build an artificial neural network, you do not 0:12:05.810 --> 0:12:10.210 necessarily understand exactly the final state of how it works. 0:12:10.730 --> 0:12:15.530 Figuring out how it works becomes its own science project. 0:12:16.010 --> 0:12:21.130 One thing we do know: predictive algorithms are especially sensitive 0:12:21.410 --> 0:12:24.530 to the choice of examples used to train them. The 0:12:24.730 --> 0:12:28.490 systems learn to imitate the examples in the data that 0:12:28.530 --> 0:12:31.170 they see. You don't know how well they will do 0:12:31.210 --> 0:12:34.090 on things that are very different. So, for example, if 0:12:34.130 --> 0:12:38.090 you train a system to recognize cats and dogs, but 0:12:38.250 --> 0:12:43.290 you only ever show it border collies and tabby cats, it's 0:12:43.370 --> 0:12:45.570 not clear what it will do. 
When you show it 0:12:45.610 --> 0:12:50.250 a picture of a chihuahua, when all it's ever seen is border collies, 0:12:50.690 --> 0:12:53.930 it may not get the right answer. So its concept 0:12:53.970 --> 0:12:56.890 of dog is going to be limited by the dogs 0:12:56.930 --> 0:13:00.130 it's seen. That's right, and this is why diversity of 0:13:00.290 --> 0:13:04.850 data in machine learning systems is so important. You have 0:13:04.930 --> 0:13:08.810 to have a data set that represents the entire spectrum 0:13:08.810 --> 0:13:12.170 of possibilities that you expect the system to work under. 0:13:12.730 --> 0:13:15.970 Teaching algorithms turns out to be not so different than 0:13:16.010 --> 0:13:25.490 teaching people. They learn what they see. Chapter two, Retina Fundoscopy. 0:13:27.170 --> 0:13:30.650 It's cool that predictive algorithms can learn to translate languages 0:13:30.730 --> 0:13:35.130 and suggest movies, but what about more life-changing applications? 0:13:35.850 --> 0:13:38.970 My name is Lily Peng. I am a physician by training, 0:13:39.370 --> 0:13:42.170 and I am a product manager at Google. I went 0:13:42.250 --> 0:13:45.050 to visit Dr. Peng because she and her colleagues are 0:13:45.130 --> 0:13:50.330 using predictive algorithms to help millions of people avoid going blind. So 0:13:50.490 --> 0:13:55.210 diabetic retinopathy is a complication of diabetes that affects the 0:13:55.250 --> 0:13:58.170 back of the eye, the retina. One of the devastating 0:13:58.210 --> 0:14:02.530 complications is vision loss. All patients that have diabetes need 0:14:02.650 --> 0:14:06.370 to be screened once a year for diabetic retinopathy. 0:14:06.410 --> 0:14:08.810 This is an asymptomatic disease, which means that you do 0:14:08.850 --> 0:14:12.170 not feel the symptoms. You don't experience vision loss until 0:14:12.210 --> 0:14:16.330 it's too late. Now, diabetes is epidemic around the world. 0:14:16.370 --> 0:14:19.930 How many diabetics are there? 
So by most estimates, there 0:14:19.930 --> 0:14:23.370 are over four hundred million patients in the world with diabetes. 0:14:23.610 --> 0:14:26.810 How do you screen a patient to see whether they 0:14:26.850 --> 0:14:30.930 have diabetic retinopathy? You need to have a special camera 0:14:31.050 --> 0:14:34.050 called a fundus camera, and it takes a picture through 0:14:34.090 --> 0:14:36.410 the pupil of the back of the eye. We have 0:14:36.450 --> 0:14:39.850 a very small supply of retina specialists and eye doctors 0:14:40.130 --> 0:14:42.730 and they do a lot more than reading images, so 0:14:42.970 --> 0:14:47.250 they needed to scale the reading of these images. Four 0:14:47.490 --> 0:14:52.370 hundred million people with diabetes. There just aren't enough specialists 0:14:52.370 --> 0:14:55.770 for all the retinal images that need reading, especially in 0:14:55.810 --> 0:14:59.330 some countries in Asia where resources are limited and the 0:14:59.410 --> 0:15:04.290 incidence of diabetes is skyrocketing. Two hospitals in southern India 0:15:04.370 --> 0:15:07.930 recognized the problem and reached out to Google for help. 0:15:09.210 --> 0:15:12.410 At that point, Google was already sort of well known for 0:15:12.450 --> 0:15:17.850 image recognition. We were classifying cats and dogs in consumer images, 0:15:18.090 --> 0:15:20.490 and the idea was, well, if you could retrain the 0:15:20.530 --> 0:15:26.410 model to recognize diabetic retinopathy, you could potentially help the 0:15:26.450 --> 0:15:30.290 hospitals in India get to more patients to screen them 0:15:30.410 --> 0:15:33.090 for disease. How did you and your colleagues set out 0:15:33.090 --> 0:15:36.210 to attack this problem? So when I first started the project, 0:15:36.570 --> 0:15:40.370 we had about one hundred thirty thousand images from eye 0:15:40.410 --> 0:15:43.690 hospitals in India as well as a screening program in 0:15:43.690 --> 0:15:48.170 the US. 
Also, we gathered an army of ophthalmologists to 0:15:48.290 --> 0:15:52.370 grade them. Eight hundred eighty thousand diagnoses were rendered on 0:15:52.370 --> 0:15:55.610 one hundred thirty thousand images. So we took this training 0:15:55.690 --> 0:15:58.090 data and we put it in a machine learning model. 0:15:58.250 --> 0:16:00.770 And how did it do? The first iteration of the model 0:16:01.210 --> 0:16:06.090 was on par with US board-certified ophthalmologists. Since then, 0:16:06.130 --> 0:16:09.570 we've made some improvements to the model. And the initial 0:16:09.610 --> 0:16:13.610 training took about how long? The first time we trained 0:16:13.650 --> 0:16:16.290 a model, it may have taken a couple of weeks, 0:16:16.330 --> 0:16:18.410 but then the second time you train the next models 0:16:18.410 --> 0:16:21.530 and next models, it's just, it's shorter and shorter, sometimes overnight. 0:16:22.050 --> 0:16:24.930 Sometimes overnight? Well, yes. All right. And by contrast, how 0:16:24.970 --> 0:16:28.250 long does it take to train a board-certified ophthalmologist? 0:16:29.570 --> 0:16:33.650 So that usually takes at least five years, and then 0:16:33.690 --> 0:16:37.730 you also have additional fellowship years to specialize in the retina. 0:16:37.770 --> 0:16:39.450 And at the end of that you only have one 0:16:39.770 --> 0:16:42.890 board-certified ophthalmologist. Yes, at the end of that you'd 0:16:42.930 --> 0:16:48.250 have one very, very well trained doctor, but that's not scaled. Yes. 0:16:48.410 --> 0:16:53.930 So by contrast, a model like this scales worldwide and 0:16:54.050 --> 0:16:58.730 never fatigues. It consistently gives the same diagnosis on the 0:16:58.730 --> 0:17:02.850 same image, and it obviously takes a much shorter time 0:17:02.930 --> 0:17:07.010 to train. That being said, it does a very, very 0:17:07.130 --> 0:17:10.050 narrow task that is just a very small portion of 0:17:10.050 --> 0:17:14.050 what that doctor can do. 
The retina screening tool is already 0:17:14.090 --> 0:17:17.490 being used in India. It was recently approved in Europe 0:17:17.530 --> 0:17:21.450 and it's under review in the United States. Groups around 0:17:21.490 --> 0:17:24.890 the world are now working on other challenges in medical imaging, 0:17:25.370 --> 0:17:29.130 like detecting breast cancers at earlier stages. But I was 0:17:29.210 --> 0:17:33.970 particularly struck by a surprising discovery by Lily's team: that 0:17:34.210 --> 0:17:39.410 unexpected information about patients was hiding in their retinal pictures. 0:17:40.010 --> 0:17:43.450 In the fundus image, there are blood vessels, and so 0:17:43.650 --> 0:17:46.570 one of the thoughts that we had was, because you 0:17:46.610 --> 0:17:49.250 can see these vessels, I wonder if we can predict 0:17:49.450 --> 0:17:53.170 cardiovascular disease from the same image. So we did an 0:17:53.170 --> 0:17:58.250 experiment where we took fundus images and we trained a 0:17:58.290 --> 0:18:01.650 model to predict whether or not that patient would have 0:18:01.690 --> 0:18:04.370 a heart attack in five years. We found that we 0:18:04.450 --> 0:18:09.130 could tell whether or not this patient may have a 0:18:09.130 --> 0:18:13.170 vascular event much better than doctors. It speaks to 0:18:13.410 --> 0:18:16.850 what might be in this data that we've overlooked. The 0:18:16.930 --> 0:18:21.730 model could make predictions that doctors couldn't from the same 0:18:21.890 --> 0:18:26.090 type of data. It turned out the computer could also 0:18:26.130 --> 0:18:30.130 do a reasonable job of predicting a patient's sex, age, 0:18:30.170 --> 0:18:32.930 and smoking status. The first time I did this with 0:18:32.970 --> 0:18:35.570 an ophthalmologist, I think she thought I was trolling her. 0:18:35.850 --> 0:18:38.570 I said, well, here are pictures. Guess which one is a woman, 0:18:38.970 --> 0:18:41.410 guess which one is a man. 
Guess which one's a smoker, 0:18:41.850 --> 0:18:44.330 guess which one is young. Right, these are all tasks 0:18:44.370 --> 0:18:48.090 that doctors don't generally do with these images. It turns 0:18:48.130 --> 0:18:51.970 out the model was right ninety-eight, ninety-nine percent 0:18:51.970 --> 0:18:55.610 of the time. That being said, there are much easier 0:18:55.650 --> 0:18:59.290 ways of getting the sex of a patient. So, so, 0:18:59.450 --> 0:19:03.890 while scientifically interesting, this is one of the most useless 0:19:03.890 --> 0:19:07.250 clinical predictions ever. So how far can it go? If 0:19:07.250 --> 0:19:11.930 you gave preference for rock music or not? What do 0:19:11.970 --> 0:19:15.850 you think? You know, we tried predicting happiness. That didn't work. 0:19:15.850 --> 0:19:22.290 So I'm guessing rock music, oh, probably not, but who knows. So 0:19:22.450 --> 0:19:26.610 predictive algorithms can learn a remarkable range of tasks, and 0:19:26.650 --> 0:19:30.890 they can even discover hidden patterns that humans miss. We 0:19:31.010 --> 0:19:33.730 just have to give them enough training data to learn from. 0:19:34.450 --> 0:19:43.970 Sounds pretty fantastic. What could possibly go wrong? Chapter three, 0:19:44.290 --> 0:19:49.130 What Could Possibly Go Wrong? If predictive algorithms can use 0:19:49.210 --> 0:19:53.410 massive data to discover unexpected connections between your eye and 0:19:53.490 --> 0:19:58.410 your heart, what might they be learning about, say, human society? 0:19:58.930 --> 0:20:01.170 To answer this question, I took a trip to speak 0:20:01.210 --> 0:20:04.330 with Kate Crawford, the co-founder and co-director of 0:20:04.370 --> 0:20:08.770 the AI Now Institute at New York University. When we 0:20:08.890 --> 0:20:12.930 began, we were the world's first AI institute dedicated 0:20:12.970 --> 0:20:16.850 to studying the social implications of these tools. 
To me, 0:20:17.290 --> 0:20:20.170 these are the biggest challenges that we face right now, 0:20:20.210 --> 0:20:23.650 simply because we've spent decades looking at these questions from 0:20:23.650 --> 0:20:26.770 a technical lens at the expense of looking at them 0:20:26.850 --> 0:20:29.490 through a social and an ethical lens. I knew about 0:20:29.570 --> 0:20:32.410 Kate's work because we served together on a working group 0:20:32.450 --> 0:20:36.330 about artificial intelligence for the US National Institutes of Health. 0:20:36.970 --> 0:20:40.770 I also knew she had an interesting background. I grew 0:20:40.810 --> 0:20:44.930 up in Australia. I studied a really strange grab bag 0:20:44.970 --> 0:20:49.370 of disciplines. I studied law, I studied philosophy, and then 0:20:49.370 --> 0:20:52.570 I got really interested in computer science, and this was 0:20:52.570 --> 0:20:56.210 happening at the same time as I was writing electronic 0:20:56.330 --> 0:21:00.090 music on large-scale modular synthesizers, and that's still a 0:21:00.170 --> 0:21:03.250 thing that I do today. It's almost like the opposite 0:21:03.250 --> 0:21:06.730 of artificial intelligence because it's so analog, so I absolutely 0:21:06.730 --> 0:21:09.010 love it for that reason. In the year two thousand, 0:21:09.090 --> 0:21:14.090 Kate's band released an album entitled twenty twenty that 0:21:14.210 --> 0:21:20.930 included a prescient song called Machines Work So That People 0:21:21.050 --> 0:21:27.210 Have Time to Think. 
It's funny because we use a 0:21:27.330 --> 0:21:30.930 sample from an early IBM promotional film that was made 0:21:30.930 --> 0:21:34.130 in the nineteen sixties, which says machines can do the 0:21:34.170 --> 0:21:37.250 work so that people have time to think, and we 0:21:37.370 --> 0:21:40.210 actually ended up sort of cutting it and splicing it 0:21:40.250 --> 0:21:42.050 in the track, so it ends up saying that people 0:21:42.050 --> 0:21:44.370 can do the work so that machines have time to think. 0:21:44.690 --> 0:21:47.290 And strangely, the more that I've been working in the 0:21:47.330 --> 0:21:49.930 sort of machine learning space, I think, yeah, there's a 0:21:49.930 --> 0:21:52.050 lot of ways in which actually people are doing the 0:21:52.130 --> 0:22:00.050 work so that machines can do all the thinking. Kate 0:22:00.210 --> 0:22:03.890 gave me a crash course on how predictive algorithms not 0:22:04.010 --> 0:22:08.170 only teach themselves language skills, but also in the process 0:22:08.450 --> 0:22:13.730 acquire human prejudices, even in something as seemingly benign as 0:22:13.850 --> 0:22:18.090 language translation. So in many cases, if you say, translate 0:22:18.090 --> 0:22:21.930 a sentence like she is a doctor into a language 0:22:21.930 --> 0:22:24.890 like Turkish, and then you translate it back into English, 0:22:25.170 --> 0:22:28.130 and you're saying Turkish because Turkish has pronouns that are 0:22:28.130 --> 0:22:32.170 not gendered. Precisely, and so you would expect that you 0:22:32.170 --> 0:22:34.130 would get the same sentence back, but you do not. 0:22:34.370 --> 0:22:37.410 It will say he is a doctor. So she is 0:22:37.410 --> 0:22:41.810 a doctor was translated into gender-neutral Turkish as o 0:22:41.970 --> 0:22:46.770 bir doktor, which was then back translated into English as 0:22:46.930 --> 0:22:49.650 he is a doctor. 
In fact, you could see how 0:22:49.770 --> 0:22:53.170 much the predictive algorithms had learned about gender roles just 0:22:53.250 --> 0:22:57.610 by giving Google Translate a bunch of gender-neutral sentences 0:22:57.650 --> 0:23:01.730 in Turkish. You got: he is an engineer, she is 0:23:01.730 --> 0:23:04.930 a cook. He is a soldier, but she is a teacher. 0:23:05.290 --> 0:23:08.010 He is a friend, but she is a lover. He 0:23:08.290 --> 0:23:11.170 is happy and she is unhappy. I find that one 0:23:11.290 --> 0:23:15.810 particularly odd, and it's not just language translation that's problematic. 0:23:16.250 --> 0:23:20.650 The same sort of issues arise in language understanding. Predictive 0:23:20.690 --> 0:23:24.850 algorithms were trained to learn analogies by reading lots of text. 0:23:25.210 --> 0:23:28.570 They concluded that dog is to puppy as cat is 0:23:28.570 --> 0:23:31.530 to kitten, and man is to king as woman is 0:23:31.570 --> 0:23:36.490 to queen. But they also automatically inferred that man is 0:23:36.490 --> 0:23:41.490 to computer programmer as woman is to homemaker. And with 0:23:41.570 --> 0:23:44.890 the rise of social media, Google used text on the 0:23:44.890 --> 0:23:49.530 Internet to train predictive algorithms to infer the sentiment of 0:23:49.650 --> 0:23:53.690 tweets and online reviews. Is it a positive sentiment? Is 0:23:53.690 --> 0:23:56.490 it a negative sentiment? I believe it was Google who 0:23:56.490 --> 0:23:59.690 released their sentiment engine, and you could just try it online, 0:23:59.730 --> 0:24:01.530 you know, put in a sentiment and see what you'd get. 0:24:02.010 --> 0:24:05.730 And again, similar problems emerged. If you typed in I 0:24:05.810 --> 0:24:08.130 am a white man, you would get positive sentiment. If 0:24:08.130 --> 0:24:11.690 you typed in I am a black lesbian, for example, negative sentiment. 
0:24:12.530 --> 0:24:16.370 Just as Greg Corrado explained with chihuahuas and border collies, 0:24:16.850 --> 0:24:20.410 the predictive algorithms were learning from the examples they found 0:24:20.410 --> 0:24:24.690 in the world, and those examples reflected a lot about 0:24:24.770 --> 0:24:28.490 past practices and prejudices. If we think about where you 0:24:28.570 --> 0:24:32.090 might be scraping large amounts of text from, say Reddit, 0:24:32.170 --> 0:24:35.450 for example, and you're not thinking about how that sentiment 0:24:35.530 --> 0:24:39.250 might be biased against certain groups, then you're just basically 0:24:39.290 --> 0:24:43.130 importing that directly into your tool. But it's not just 0:24:43.250 --> 0:24:47.090 conversations on Reddit. There's the cautionary tale of what happened 0:24:47.130 --> 0:24:50.450 when Amazon let a computer teach itself how to sift 0:24:50.450 --> 0:24:54.930 through mountains of resumes for computer programming jobs to find 0:24:55.010 --> 0:24:59.530 the best candidates to interview. So they set up this system, 0:24:59.650 --> 0:25:01.730 they designed it, and what they found was that very 0:25:01.810 --> 0:25:06.170 quickly this system had learned to discard and really demote 0:25:06.490 --> 0:25:10.570 the applications from women. And typically if you had a 0:25:10.570 --> 0:25:13.410 women's college mentioned, and even if you had the word 0:25:13.770 --> 0:25:18.010 women's on your resume, your application would go to the 0:25:18.050 --> 0:25:20.610 bottom of the pile. All right, so how does it 0:25:20.770 --> 0:25:23.570 learn that? So, first of all, we take a look 0:25:23.570 --> 0:25:26.770 at who is generally hired by Amazon, and of course 0:25:26.850 --> 0:25:30.450 they have a very heavily skewed male workforce, and so 0:25:30.490 --> 0:25:32.850 the system is learning that these are the sorts of 0:25:32.890 --> 0:25:36.210 people who will tend to be hired and promoted. 
And 0:25:36.330 --> 0:25:38.770 it is not a surprise then that they actually found 0:25:38.770 --> 0:25:41.730 it impossible to really retrain this system. They ended up 0:25:41.770 --> 0:25:45.650 abandoning this tool because simply correcting for a bias is 0:25:45.770 --> 0:25:48.370 very hard to do when all of your ground truth 0:25:48.450 --> 0:25:52.890 data is so profoundly skewed in a particular direction. So 0:25:53.010 --> 0:25:57.250 Amazon dropped this particular machine learning project and Google fixed 0:25:57.290 --> 0:26:01.210 the Turkish to English problem. Today, Google Translate gives both 0:26:01.330 --> 0:26:04.250 he is a doctor and she is a doctor as 0:26:04.290 --> 0:26:09.090 translation options. But biases keep popping up in predictive algorithms 0:26:09.170 --> 0:26:14.170 in many settings. There's no systematic way to prevent them. Instead, 0:26:14.650 --> 0:26:18.850 spotting and fixing biases has become a game of whack-a-mole. 0:26:22.570 --> 0:26:28.930 Chapter four, Quarterbacks. Perhaps it's no surprise that algorithms trained 0:26:28.930 --> 0:26:32.090 in the wild west of the Internet or on tech 0:26:32.130 --> 0:26:37.330 industry hiring practices learned serious biases. But what about more 0:26:37.450 --> 0:26:42.250 sober settings like a hospital? I talked with someone who recently 0:26:42.330 --> 0:26:47.850 discovered similar problems with potentially life-threatening consequences. Hi, I'm 0:26:47.930 --> 0:26:52.250 Christine Vogeli. I'm the director of evaluation research at Partners 0:26:52.370 --> 0:26:57.930 HealthCare here in Boston. Partners HealthCare, recently rebranded as Mass 0:26:58.010 --> 0:27:02.570 General Brigham, is the largest healthcare provider in Massachusetts, a 0:27:02.650 --> 0:27:06.410 system that has six thousand doctors and a dozen hospitals 0:27:06.410 --> 0:27:10.370 and serves more than a million patients. 
As Christine explained 0:27:10.410 --> 0:27:13.530 to me, the role of healthcare providers in the US 0:27:13.530 --> 0:27:18.210 has been shifting. The responsibility for controlling costs and ensuring 0:27:18.290 --> 0:27:21.770 high quality services is now being put down on the 0:27:21.810 --> 0:27:24.690 hospitals and the doctors. And to me, this makes a 0:27:24.690 --> 0:27:27.010 lot of sense, Right, we really should be the ones 0:27:27.090 --> 0:27:29.690 responsible for ensuring that there's good quality care and that 0:27:29.730 --> 0:27:34.410 we're doing it efficiently. Healthcare providers are especially focusing their 0:27:34.450 --> 0:27:38.770