1 00:00:15,316 --> 00:00:20,836 Speaker 1: Pushkin. Hi, it's Jacob Goldstein and I'm here today with 2 00:00:20,916 --> 00:00:23,596 Speaker 1: another podcast I think you might like. The show is 3 00:00:23,636 --> 00:00:27,636 Speaker 1: called Unsung Science and it's hosted by David Pogue. You 4 00:00:27,716 --> 00:00:31,036 Speaker 1: might know David from CBS Sunday Morning, where he's a 5 00:00:31,116 --> 00:00:35,276 Speaker 1: correspondent covering topics like science, tech, and innovation, topics like 6 00:00:35,316 --> 00:00:37,716 Speaker 1: the ones we talk about here on What's Your Problem. 7 00:00:37,796 --> 00:00:40,196 Speaker 1: In the episode you're about to hear, David chats with 8 00:00:40,356 --> 00:00:44,036 Speaker 1: Luis von Ahn, the founder and CEO of the popular language 9 00:00:44,036 --> 00:00:47,036 Speaker 1: app Duolingo. You might recall I talked with Luis earlier 10 00:00:47,036 --> 00:00:50,676 Speaker 1: this year about Duolingo and language and the current limits 11 00:00:50,676 --> 00:00:53,636 Speaker 1: of artificial intelligence. But the show you're about to hear 12 00:00:53,796 --> 00:00:57,276 Speaker 1: is about what Luis did before he started Duolingo. He 13 00:00:57,316 --> 00:01:00,436 Speaker 1: invented this thing called CAPTCHA. CAPTCHA is that test that 14 00:01:00,476 --> 00:01:02,396 Speaker 1: you have to take all the time on the Internet 15 00:01:02,436 --> 00:01:05,556 Speaker 1: to prove that you're not a robot. And yes, Luis 16 00:01:05,676 --> 00:01:08,196 Speaker 1: knows that the test is super annoying. But the story 17 00:01:08,236 --> 00:01:11,156 Speaker 1: of CAPTCHA and what happened with it is really interesting. 18 00:01:11,916 --> 00:01:17,916 Speaker 1: It's got some great twists.
By the year two thousand, 19 00:01:18,116 --> 00:01:22,556 Speaker 1: the Internet was already becoming a cesspool. Software bots were 20 00:01:22,596 --> 00:01:26,156 Speaker 1: signing up for millions of fake email accounts for sending 21 00:01:26,156 --> 00:01:31,436 Speaker 1: out spam. Luis von Ahn stopped them. He invented the CAPTCHA, 22 00:01:31,956 --> 00:01:35,436 Speaker 1: the website login test where you have to decipher the 23 00:01:35,476 --> 00:01:38,436 Speaker 1: distorted image of a word, or where you have to find 24 00:01:38,556 --> 00:01:41,556 Speaker 1: the traffic lights in a grid of nine blurry photos. 25 00:01:42,156 --> 00:01:45,356 Speaker 1: The only problem: we hate that test. I would be 26 00:01:45,396 --> 00:01:47,036 Speaker 1: at a party and, you know, people would ask me 27 00:01:47,036 --> 00:01:48,316 Speaker 1: what I did, and I would tell them that I 28 00:01:48,316 --> 00:01:50,076 Speaker 1: helped invent that thing, and people would tell me, oh, 29 00:01:50,116 --> 00:01:54,076 Speaker 1: I hate you. I'm David Pogue, and this is Unsung Science, 30 00:02:01,476 --> 00:02:06,356 Speaker 1: Season one, episode fourteen: The man who stopped the spammers. 31 00:02:08,236 --> 00:02:10,676 Speaker 1: In his forty three years on this earth so far, 32 00:02:11,316 --> 00:02:17,516 Speaker 1: Luis von Ahn has had three ingenious, innovative, world-changing ideas. 33 00:02:18,796 --> 00:02:22,996 Speaker 1: I guarantee that you've encountered his second one, probably hundreds 34 00:02:22,996 --> 00:02:27,196 Speaker 1: of times. Actually, most of us have zero world-changing ideas. 35 00:02:27,596 --> 00:02:33,996 Speaker 1: Occasionally somebody has one, but three times? His first idea 36 00:02:34,076 --> 00:02:37,836 Speaker 1: came to him in Guatemala, where he grew up.
I 37 00:02:37,916 --> 00:02:41,516 Speaker 1: wanted to start a gym where, instead of charging people 38 00:02:41,876 --> 00:02:44,116 Speaker 1: to show up, let people just show up for free. 39 00:02:44,316 --> 00:02:46,716 Speaker 1: We're going to connect all the machines to kind of 40 00:02:46,756 --> 00:02:48,916 Speaker 1: the power grid, and we're going to use the kinetic 41 00:02:48,996 --> 00:02:53,076 Speaker 1: energy that people had whenever they were exercising to generate power. 42 00:02:53,716 --> 00:02:55,276 Speaker 1: And I thought we could make a lot of money 43 00:02:55,276 --> 00:02:57,356 Speaker 1: from that. Now, you will note that I did not 44 00:02:57,476 --> 00:03:00,956 Speaker 1: say that all three of his world-changing ideas actually 45 00:03:00,996 --> 00:03:03,836 Speaker 1: succeeded in changing the world. I thought I was the 46 00:03:03,836 --> 00:03:05,396 Speaker 1: first person to have this idea. It turns out it's 47 00:03:05,396 --> 00:03:07,716 Speaker 1: a very old idea. It also turns out it doesn't work. 48 00:03:07,836 --> 00:03:11,756 Speaker 1: That's right, the pedal-power gym idea flopped. It turns 49 00:03:11,756 --> 00:03:13,836 Speaker 1: out this is not a good idea for many reasons, 50 00:03:13,836 --> 00:03:15,436 Speaker 1: the biggest one of which is that humans are just 51 00:03:15,516 --> 00:03:18,316 Speaker 1: not very good at creating energy. You just 52 00:03:18,356 --> 00:03:20,116 Speaker 1: don't make a lot of money from this. There's another 53 00:03:20,156 --> 00:03:22,396 Speaker 1: reason why this doesn't work: it turns out 54 00:03:22,476 --> 00:03:24,476 Speaker 1: gyms make most of their money from people who don't 55 00:03:24,476 --> 00:03:26,956 Speaker 1: show up. Of course, here you kind of need people 56 00:03:26,996 --> 00:03:30,596 Speaker 1: to show up. To be fair, he was pretty new 57 00:03:30,636 --> 00:03:33,676 Speaker 1: at the game when he had this first idea.
And 58 00:03:33,796 --> 00:03:36,196 Speaker 1: how old were you at this point? Twelve years old, 59 00:03:36,196 --> 00:03:39,396 Speaker 1: eleven years old. Things started going better six years later, 60 00:03:39,676 --> 00:03:42,996 Speaker 1: when he came to the United States to attend Duke University. 61 00:03:43,636 --> 00:03:46,716 Speaker 1: As the year two thousand dawned, Luis was at Carnegie 62 00:03:46,756 --> 00:03:49,476 Speaker 1: Mellon in his first year of working toward a PhD 63 00:03:49,796 --> 00:03:53,916 Speaker 1: in computer science, and one fateful day he went to 64 00:03:53,996 --> 00:03:58,236 Speaker 1: a talk by an Israeli computer scientist named Udi Manber, 65 00:03:58,676 --> 00:04:01,916 Speaker 1: who at this point was the chief scientist at Yahoo. 66 00:04:02,076 --> 00:04:03,756 Speaker 1: By the way, in the year two thousand, Yahoo 67 00:04:03,836 --> 00:04:07,076 Speaker 1: was the biggest tech company in the world, like 68 00:04:07,076 --> 00:04:09,996 Speaker 1: the Google of today. And, you know, he was giving 69 00:04:10,276 --> 00:04:12,756 Speaker 1: a talk about ten problems that they didn't know how 70 00:04:12,756 --> 00:04:15,916 Speaker 1: to solve inside the company. And one of those 71 00:04:16,156 --> 00:04:18,996 Speaker 1: ten problems that the greatest minds at Yahoo could not 72 00:04:19,116 --> 00:04:24,356 Speaker 1: solve was automated software, spam bots, signing up for free 73 00:04:24,436 --> 00:04:28,916 Speaker 1: Yahoo Mail accounts by the millions. Yahoo gave out free 74 00:04:28,916 --> 00:04:30,796 Speaker 1: email accounts, and there were people who wanted to send 75 00:04:30,836 --> 00:04:34,636 Speaker 1: spam from Yahoo accounts. But each Yahoo account only allowed 76 00:04:34,636 --> 00:04:37,316 Speaker 1: you to send like five hundred messages a day.
If 77 00:04:37,316 --> 00:04:39,796 Speaker 1: you wanted to send millions of spam emails per day, 78 00:04:40,636 --> 00:04:42,996 Speaker 1: then what these people did is they wrote programs to 79 00:04:43,036 --> 00:04:46,636 Speaker 1: obtain millions of Yahoo accounts every day, and they didn't 80 00:04:46,676 --> 00:04:47,996 Speaker 1: know how to solve that problem, how to stop that. 81 00:04:48,436 --> 00:04:51,116 Speaker 1: So I started talking about it with a person who 82 00:04:51,236 --> 00:04:54,276 Speaker 1: had just become my PhD advisor. His name was Manuel 83 00:04:54,316 --> 00:04:57,516 Speaker 1: Blum, or is Manuel Blum; he's most definitely still alive. 84 00:04:57,836 --> 00:05:00,196 Speaker 1: And, you know, we started thinking, and this is where 85 00:05:00,196 --> 00:05:02,756 Speaker 1: this idea of a CAPTCHA came up. The idea was this: 86 00:05:03,556 --> 00:05:06,556 Speaker 1: anytime you tried to sign up for a Yahoo Mail account, 87 00:05:06,956 --> 00:05:10,596 Speaker 1: you'd encounter a little puzzle, something easy for a person 88 00:05:10,636 --> 00:05:13,796 Speaker 1: to solve, but hard for a spambot. The way to 89 00:05:13,876 --> 00:05:16,676 Speaker 1: stop these spammers was to have a test that can 90 00:05:16,756 --> 00:05:20,036 Speaker 1: distinguish between whether you're a human or a computer. If 91 00:05:20,116 --> 00:05:23,196 Speaker 1: you are a human, then presumably you can't get millions 92 00:05:23,236 --> 00:05:26,276 Speaker 1: of email accounts because you get bored, whereas if you're 93 00:05:26,276 --> 00:05:28,076 Speaker 1: a computer, you can get millions. So if the only 94 00:05:28,276 --> 00:05:30,796 Speaker 1: entities that we were giving email accounts to were humans, 95 00:05:31,036 --> 00:05:34,436 Speaker 1: then that would stop the spam. CAPTCHA, the name he 96 00:05:34,476 --> 00:05:38,556 Speaker 1: gave his online mini puzzle, is an acronym.
It stands 97 00:05:38,596 --> 00:05:44,636 Speaker 1: for completely automated public Turing test to tell computers and 98 00:05:44,836 --> 00:05:49,476 Speaker 1: humans apart, more or less. Not sure if you've heard 99 00:05:49,516 --> 00:05:52,436 Speaker 1: of the Turing test, but it is incredibly famous among 100 00:05:52,476 --> 00:05:57,196 Speaker 1: computer scientists. It's this experiment proposed by British mathematician and 101 00:05:57,236 --> 00:06:01,036 Speaker 1: computer scientist Alan Turing, who's known as the father of 102 00:06:01,156 --> 00:06:04,836 Speaker 1: artificial intelligence. There was actually a movie about him called 103 00:06:04,836 --> 00:06:09,796 Speaker 1: The Imitation Game, where Benedict Cumberbatch played Alan Turing. Would 104 00:06:09,836 --> 00:06:14,996 Speaker 1: you like to play? Play? It's a game, a test 105 00:06:15,036 --> 00:06:20,636 Speaker 1: of sorts for determining whether something is a machine or a 106 00:06:20,756 --> 00:06:24,796 Speaker 1: human being. Anyway, the Turing test is intended to set 107 00:06:24,796 --> 00:06:29,156 Speaker 1: a standard for determining if a computer has achieved true 108 00:06:29,476 --> 00:06:33,436 Speaker 1: artificial intelligence. When can we tell that a computer is 109 00:06:33,436 --> 00:06:35,996 Speaker 1: actually intelligent? This is kind of like a philosophical test 110 00:06:35,996 --> 00:06:38,756 Speaker 1: that said, like, look, we're going to have a human 111 00:06:38,876 --> 00:06:42,316 Speaker 1: judge ask questions to two entities. One is the computer, 112 00:06:42,396 --> 00:06:44,916 Speaker 1: one is the human. The computer and the human are 113 00:06:44,996 --> 00:06:48,876 Speaker 1: hidden behind two curtains. The judge can't see them. The 114 00:06:48,996 --> 00:06:51,876 Speaker 1: judge types in questions and then looks at the text 115 00:06:51,916 --> 00:06:55,956 Speaker 1: of the responses.
If it's impossible to tell which answer 116 00:06:55,996 --> 00:06:58,676 Speaker 1: came from the person and which from the computer, the 117 00:06:58,676 --> 00:07:02,316 Speaker 1: computer has passed the Turing test. The judge can just 118 00:07:02,356 --> 00:07:04,676 Speaker 1: ask whatever questions they want, and if we really can't 119 00:07:04,676 --> 00:07:07,236 Speaker 1: distinguish them, we'll say the computer is really intelligent. To 120 00:07:07,356 --> 00:07:09,716 Speaker 1: this day, we have not made a computer that can 121 00:07:09,716 --> 00:07:13,276 Speaker 1: actually pass the Turing test successfully. It's just 122 00:07:13,316 --> 00:07:15,756 Speaker 1: too hard. The funny thing is, if you really think 123 00:07:15,756 --> 00:07:19,476 Speaker 1: about it, the CAPTCHA problem is the opposite of the 124 00:07:19,516 --> 00:07:23,196 Speaker 1: Turing test. The Turing test is successful if the judge 125 00:07:23,516 --> 00:07:26,476 Speaker 1: can't tell the difference between a person and a machine. 126 00:07:27,036 --> 00:07:29,796 Speaker 1: The whole point of Luis von Ahn's project was to create 127 00:07:29,836 --> 00:07:33,596 Speaker 1: a test that can tell the difference. There's another difference 128 00:07:33,596 --> 00:07:36,596 Speaker 1: between the two tests, too. Here's the key. In this case, 129 00:07:36,636 --> 00:07:40,116 Speaker 1: the judge was a human. In our case, for the CAPTCHA, 130 00:07:40,236 --> 00:07:41,716 Speaker 1: what we needed to do is we needed the judge 131 00:07:41,756 --> 00:07:44,276 Speaker 1: to be a computer, because we need the 132 00:07:44,276 --> 00:07:46,636 Speaker 1: computer to determine whether it's talking to a human or a computer, 133 00:07:46,796 --> 00:07:49,436 Speaker 1: which is much harder in some sense, at 134 00:07:49,476 --> 00:07:51,316 Speaker 1: least to grade it.
So I think the hardest 135 00:07:51,316 --> 00:07:53,916 Speaker 1: thing was just coming up with this general idea that, like, okay, 136 00:07:54,036 --> 00:07:56,036 Speaker 1: what we need is a test that can distinguish humans 137 00:07:56,116 --> 00:07:59,796 Speaker 1: from computers, but that computers need to be able to grade. 138 00:07:59,916 --> 00:08:02,116 Speaker 1: Then after that we started coming up with, like, okay, 139 00:08:02,116 --> 00:08:03,756 Speaker 1: what are the things that computers are not very good at? 140 00:08:04,316 --> 00:08:07,996 Speaker 1: In the year two thousand, the answer was obvious: computers 141 00:08:08,036 --> 00:08:12,636 Speaker 1: are not very good at identifying what's in pictures. We 142 00:08:12,876 --> 00:08:15,756 Speaker 1: quickly honed in on images and just doing, you know, 143 00:08:15,916 --> 00:08:19,316 Speaker 1: images of text, images of flowers, images of stuff. And 144 00:08:19,356 --> 00:08:21,716 Speaker 1: then after a while, the images of text were the 145 00:08:21,716 --> 00:08:23,556 Speaker 1: ones that seemed like the best idea. And then I 146 00:08:23,596 --> 00:08:27,276 Speaker 1: just went and developed a program that distorted random text, 147 00:08:27,916 --> 00:08:29,916 Speaker 1: and that was the first version of a CAPTCHA. 148 00:08:30,396 --> 00:08:33,236 Speaker 1: That's right. The test they came up with presents you 149 00:08:33,236 --> 00:08:36,396 Speaker 1: with the image of a typed word, but the letters 150 00:08:36,396 --> 00:08:40,516 Speaker 1: are all, like, twisted, bent, and distorted, as though the 151 00:08:40,556 --> 00:08:44,716 Speaker 1: typist were severely drunk and typing on Saran wrap. You 152 00:08:44,756 --> 00:08:47,996 Speaker 1: are supposed to interpret what that word is and type 153 00:08:47,996 --> 00:08:52,476 Speaker 1: it into a box on the website. Actually, computers in 154 00:08:52,516 --> 00:08:56,076 Speaker 1: the early two thousands were pretty good at OCR.
That's 155 00:08:56,196 --> 00:09:00,596 Speaker 1: optical character recognition, meaning looking at a picture of text 156 00:09:00,836 --> 00:09:03,556 Speaker 1: and figuring out what the letters are. But the added 157 00:09:03,636 --> 00:09:07,796 Speaker 1: challenge of the twisty distortion really threw those OCR programs 158 00:09:07,876 --> 00:09:11,396 Speaker 1: off the track. Behind the scenes, I mean, what is it? 159 00:09:11,436 --> 00:09:13,596 Speaker 1: I mean, there's got to be some, I don't know, 160 00:09:13,636 --> 00:09:18,236 Speaker 1: SQL database or massive bank of little images. I mean, 161 00:09:18,396 --> 00:09:20,916 Speaker 1: actually there was no database at first. We would just 162 00:09:21,276 --> 00:09:23,116 Speaker 1: write a program that, what it would do is it 163 00:09:23,116 --> 00:09:26,436 Speaker 1: would first pick some random characters, would put them on 164 00:09:26,436 --> 00:09:28,316 Speaker 1: an image, then it would distort them, and then we 165 00:09:28,356 --> 00:09:31,236 Speaker 1: would save that image. And then we just had, I 166 00:09:31,236 --> 00:09:32,956 Speaker 1: don't know, a couple of million of those saved, not 167 00:09:32,996 --> 00:09:34,676 Speaker 1: even in a SQL database. They were just there, 168 00:09:34,716 --> 00:09:39,596 Speaker 1: saved as files. It worked brilliantly. The spambots didn't 169 00:09:39,636 --> 00:09:43,676 Speaker 1: have a chance. At the time, von Ahn had no idea 170 00:09:43,796 --> 00:09:46,836 Speaker 1: if his invention would be of any commercial use. But 171 00:09:46,956 --> 00:09:50,916 Speaker 1: one guy, he knew, would be interested: Udi Manber, that 172 00:09:51,156 --> 00:09:54,396 Speaker 1: Yahoo chief scientist who'd given the talk that started this 173 00:09:54,436 --> 00:09:57,116 Speaker 1: whole affair.
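The generator described in the interview (pick random characters, put them on an image, distort them, save the result as a file) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the original Carnegie Mellon program: the 3x5 digit font, the sine-wave warp, and every function name here are hypothetical, and the image is emitted as plain-text PBM so that no imaging library is needed.

```python
import math
import random

# Hypothetical 3x5 bitmap font (digits only) so the sketch needs no image library.
FONT = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "2": ["111", "001", "111", "100", "111"],
    "3": ["111", "001", "111", "001", "111"],
    "4": ["101", "101", "111", "001", "001"],
    "5": ["111", "100", "111", "001", "111"],
    "6": ["111", "100", "111", "101", "111"],
    "7": ["111", "001", "001", "001", "001"],
    "8": ["111", "101", "111", "101", "111"],
    "9": ["111", "101", "111", "001", "111"],
}


def random_text(rng, length=6):
    """Step 1: pick some random characters."""
    return "".join(rng.choice("0123456789") for _ in range(length))


def render(text, scale=2, margin=8):
    """Step 2: put the characters on an image (a grid of 0/1 pixels)."""
    width = 2 * margin + len(text) * 4 * scale   # 3 glyph columns + 1 gap
    height = 2 * margin + 5 * scale
    grid = [[0] * width for _ in range(height)]
    for i, ch in enumerate(text):
        for r, row in enumerate(FONT[ch]):
            for c, bit in enumerate(row):
                if bit == "1":
                    for dy in range(scale):
                        for dx in range(scale):
                            y = margin + r * scale + dy
                            x = margin + (i * 4 + c) * scale + dx
                            grid[y][x] = 1
    return grid


def distort(grid, rng, amplitude=2, period=12.0):
    """Step 3: warp the image. The sine-wave shift is an assumed distortion."""
    height, width = len(grid), len(grid[0])
    phase_x = rng.uniform(0, 2 * math.pi)
    phase_y = rng.uniform(0, 2 * math.pi)
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if grid[y][x]:
                nx = x + round(amplitude * math.sin(2 * math.pi * y / period + phase_x))
                ny = y + round(amplitude * math.sin(2 * math.pi * x / period + phase_y))
                if 0 <= nx < width and 0 <= ny < height:
                    out[ny][nx] = 1
    return out


def to_pbm(grid):
    """Step 4: serialize as plain-text PBM, ready to be saved as a file."""
    rows = ["".join(str(p) for p in row) for row in grid]
    return "P1\n{} {}\n{}\n".format(len(grid[0]), len(grid), "\n".join(rows))


def make_captcha(rng):
    """Return (answer, image_text) for one freshly generated puzzle."""
    answer = random_text(rng)
    return answer, to_pbm(distort(render(answer), rng))
```

Calling `make_captcha(random.Random())` and writing the second value to a `.pbm` file gives one stored puzzle; the first value is the answer the grading computer keeps. Repeating that a couple of million times would mirror the "saved as files" approach from the interview.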
We sent him an email saying, hey, we 174 00:09:57,196 --> 00:10:00,156 Speaker 1: think we can solve your problem, and he said, oh, 175 00:10:00,196 --> 00:10:01,996 Speaker 1: that seems like it solves the problem. And then 176 00:10:02,036 --> 00:10:03,956 Speaker 1: in fact, pretty soon after that it was being used 177 00:10:03,956 --> 00:10:06,836 Speaker 1: by Yahoo, and then basically every website started using it, 178 00:10:07,676 --> 00:10:11,476 Speaker 1: and, you know, there were millions of websites out there 179 00:10:11,276 --> 00:10:18,516 Speaker 1: that were using it. Well, how wonderful. Luis von Ahn's ingenuity: one. Spammers: 180 00:10:18,636 --> 00:10:23,436 Speaker 1: zero. Internet: saved. And at first I was very kind 181 00:10:23,476 --> 00:10:25,636 Speaker 1: of proud of myself because, okay, look at the impact 182 00:10:25,676 --> 00:10:28,716 Speaker 1: that my work has had. Basically we stopped spam; it's being 183 00:10:28,796 --> 00:10:31,796 Speaker 1: used by a lot of people. There was only one problem now: 184 00:10:32,956 --> 00:10:36,596 Speaker 1: people hated his invention. How many of you have had 185 00:10:36,636 --> 00:10:38,156 Speaker 1: to fill out some sort of web form where you've been 186 00:10:38,156 --> 00:10:40,756 Speaker 1: asked to read a distorted sequence of characters like this? Yeah. 187 00:10:40,836 --> 00:10:43,876 Speaker 1: How many of you found it really, really annoying? Okay, 188 00:10:43,916 --> 00:10:48,876 Speaker 1: outstanding. So I invented that. That's how he introduces 189 00:10:48,956 --> 00:10:52,196 Speaker 1: himself in a twenty eleven TEDx talk at Carnegie Mellon. 190 00:10:52,476 --> 00:10:54,036 Speaker 1: I would be at a party and, you know, people 191 00:10:54,036 --> 00:10:55,436 Speaker 1: would ask me what I did, and I would tell 192 00:10:55,476 --> 00:10:57,236 Speaker 1: them that I helped invent that thing. And people would 193 00:10:57,236 --> 00:10:59,796 Speaker 1: tell me, oh, I hate you.
That's right. The inventor 194 00:10:59,836 --> 00:11:03,316 Speaker 1: of CAPTCHA is fully aware that people hate the thing. 195 00:11:03,836 --> 00:11:08,716 Speaker 1: I say either, well, I'm sorry, or, I find it 196 00:11:08,716 --> 00:11:12,156 Speaker 1: annoying too. You've heard it right here, folks. Even he 197 00:11:12,316 --> 00:11:15,956 Speaker 1: finds them annoying. In fact, Luis can tell you exactly 198 00:11:16,036 --> 00:11:18,916 Speaker 1: how much of your time they waste. I did a 199 00:11:18,956 --> 00:11:21,356 Speaker 1: little back-of-the-envelope calculation. At the time, about 200 00:11:21,356 --> 00:11:23,836 Speaker 1: two hundred million times a day, somebody typed one of 201 00:11:23,836 --> 00:11:28,316 Speaker 1: these CAPTCHAs. Two hundred million times ten seconds, which 202 00:11:28,396 --> 00:11:30,076 Speaker 1: is how long it takes to type one of these: 203 00:11:30,556 --> 00:11:32,876 Speaker 1: humanity as a whole was wasting about five hundred thousand 204 00:11:32,916 --> 00:11:36,116 Speaker 1: hours every day typing these annoying CAPTCHAs. Great. So I 205 00:11:36,116 --> 00:11:39,436 Speaker 1: started feeling bad about that, and that's when I started thinking, okay, 206 00:11:39,436 --> 00:11:41,636 Speaker 1: can we do something good with that time? See, the 207 00:11:41,676 --> 00:11:44,636 Speaker 1: thing is kind of similar to the gym idea. Can 208 00:11:44,676 --> 00:11:48,116 Speaker 1: we get millions of people to do something during that 209 00:11:48,156 --> 00:11:51,316 Speaker 1: time that is actually valuable? I'll give you a hint: 210 00:11:51,716 --> 00:11:54,956 Speaker 1: we're only at the halfway point in this story. After 211 00:11:54,996 --> 00:11:56,876 Speaker 1: the break, we'll tell you what he came up with 212 00:11:57,076 --> 00:12:00,676 Speaker 1: to make those half a million hours every day useful 213 00:12:00,716 --> 00:12:11,796 Speaker 1: to humanity. And one more plug here.
I'm the author 214 00:12:11,836 --> 00:12:14,316 Speaker 1: of a book called How to Prepare for Climate Change. 215 00:12:14,596 --> 00:12:17,716 Speaker 1: It's a six hundred page paperback that's designed to be 216 00:12:17,756 --> 00:12:20,556 Speaker 1: a field guide to the new climate. It tells you 217 00:12:20,596 --> 00:12:23,636 Speaker 1: where to live, where to invest, what to grow, how 218 00:12:23,636 --> 00:12:26,556 Speaker 1: to reinforce your home, how to insure, how to talk 219 00:12:26,596 --> 00:12:30,716 Speaker 1: to your kids, and how to ride out wildfires, hurricanes, heatwaves, 220 00:12:30,716 --> 00:12:33,356 Speaker 1: and so on. If you live in a state whose 221 00:12:33,436 --> 00:12:36,996 Speaker 1: name contains a vowel, then you've been affected by climate 222 00:12:37,076 --> 00:12:39,756 Speaker 1: change already, and you should check out this book to 223 00:12:39,796 --> 00:12:42,996 Speaker 1: protect your health, your family, your home, and your finances. 224 00:12:43,316 --> 00:12:46,556 Speaker 1: It's How to Prepare for Climate Change, the book that's 225 00:12:46,636 --> 00:12:54,036 Speaker 1: exactly what it sounds like. Welcome back. By two thousand 226 00:12:54,076 --> 00:12:57,916 Speaker 1: and five, Luis von Ahn's invention, the CAPTCHA test, was 227 00:12:57,956 --> 00:13:02,436 Speaker 1: a huge hit. It reduced the world's scumbag spammers to 228 00:13:02,556 --> 00:13:06,716 Speaker 1: blubbering losers. No longer could they bombard websites with phony 229 00:13:06,796 --> 00:13:13,196 Speaker 1: sign-ups for the purpose of pursuing their pathetic spammy schemes. Unfortunately, 230 00:13:13,436 --> 00:13:18,316 Speaker 1: he had achieved this success by transferring the burden onto us, 231 00:13:18,356 --> 00:13:21,156 Speaker 1: treating us as though we were guilty until proven innocent. 232 00:13:21,876 --> 00:13:25,236 Speaker 1: Now we were the ones being challenged.
We were losing 233 00:13:25,436 --> 00:13:30,556 Speaker 1: ten seconds per website typing in those stupid distorted letters. Now, 234 00:13:30,716 --> 00:13:33,636 Speaker 1: to be fair, history is full of examples like that, 235 00:13:34,036 --> 00:13:37,636 Speaker 1: where the actions of a few selfish, greedy idiots wind 236 00:13:37,676 --> 00:13:41,196 Speaker 1: up inconveniencing billions of innocent people for the rest of 237 00:13:41,196 --> 00:13:44,796 Speaker 1: our lives. You know, some dirtbag tries to put poison 238 00:13:44,876 --> 00:13:47,476 Speaker 1: into drugstore Tylenol bottles, and now the 239 00:13:47,476 --> 00:13:51,076 Speaker 1: rest of us are stuck with frustrating, plastic, wasteful bottle 240 00:13:51,116 --> 00:13:54,316 Speaker 1: lids forever. Some delinquent tries to blow up a plane 241 00:13:54,316 --> 00:13:56,436 Speaker 1: with a shoe bomb, and now we all have to 242 00:13:56,476 --> 00:14:00,076 Speaker 1: walk through the TSA scanners in our socks. Luis felt 243 00:14:00,116 --> 00:14:04,116 Speaker 1: bad that his hacker blockade wasted everybody's time, but at 244 00:14:04,196 --> 00:14:06,836 Speaker 1: least he could do something about it. So that's 245 00:14:06,916 --> 00:14:09,076 Speaker 1: very valuable time, so can we use it for something? 246 00:14:09,236 --> 00:14:11,396 Speaker 1: And then I ended up coming up with this idea 247 00:14:11,596 --> 00:14:14,356 Speaker 1: that while you were typing a CAPTCHA, you could be 248 00:14:14,396 --> 00:14:18,916 Speaker 1: helping digitize books. And here's kind of how that works. 249 00:14:19,036 --> 00:14:20,756 Speaker 1: So at the time, this is the year maybe two 250 00:14:20,756 --> 00:14:22,316 Speaker 1: thousand and five, two thousand and six, there were a 251 00:14:22,356 --> 00:14:24,116 Speaker 1: lot of projects trying to digitize all of the world's 252 00:14:24,156 --> 00:14:26,196 Speaker 1: books.
The way that worked is 253 00:14:26,356 --> 00:14:28,156 Speaker 1: you start with a physical book and you want to 254 00:14:28,156 --> 00:14:30,156 Speaker 1: put it on the internet. And the way you do 255 00:14:30,196 --> 00:14:33,476 Speaker 1: that is you basically take a digital photograph of every 256 00:14:33,476 --> 00:14:36,396 Speaker 1: page of the book. Now these are pictures of text. 257 00:14:37,076 --> 00:14:38,836 Speaker 1: The next step in the process is that the computer 258 00:14:38,916 --> 00:14:41,716 Speaker 1: needs to decipher what's the text in there. In other words, 259 00:14:42,036 --> 00:14:45,236 Speaker 1: computers had to perform, come on, you know this term, 260 00:14:45,596 --> 00:14:52,836 Speaker 1: OCR: optical character recognition. And unfortunately, for books that are 261 00:14:52,836 --> 00:14:55,916 Speaker 1: older, where maybe the ink has faded, computers could not 262 00:14:55,916 --> 00:14:59,436 Speaker 1: recognize many of the words. So the idea was, 263 00:14:59,516 --> 00:15:01,476 Speaker 1: let's take all those words that the computers could not 264 00:15:01,516 --> 00:15:04,356 Speaker 1: recognize while books are being digitized, and let's get people 265 00:15:04,396 --> 00:15:06,516 Speaker 1: to read them for us while they're typing a CAPTCHA. 266 00:15:06,596 --> 00:15:08,796 Speaker 1: So what we started giving people were these words that 267 00:15:08,796 --> 00:15:11,996 Speaker 1: the computer was not able to digitize, or 268 00:15:12,116 --> 00:15:15,876 Speaker 1: to recognize. So yeah, all this time you thought you 269 00:15:15,916 --> 00:15:20,156 Speaker 1: were typing random words. In fact, you were helping companies 270 00:15:20,356 --> 00:15:24,436 Speaker 1: digitize old books and articles and, by the way, helping 271 00:15:24,516 --> 00:15:27,876 Speaker 1: Luis's little company make money.
The idea is we made a CAPTCHA, 272 00:15:28,076 --> 00:15:30,596 Speaker 1: a system, a whole system that would help your website 273 00:15:30,716 --> 00:15:33,316 Speaker 1: be protected against spam, and we gave that away for free. 274 00:15:33,356 --> 00:15:36,716 Speaker 1: And, for example, Facebook used our CAPTCHA, and we gave 275 00:15:36,716 --> 00:15:39,236 Speaker 1: it away for free, etc. But always with a caveat 276 00:15:39,276 --> 00:15:42,316 Speaker 1: that if they are going to do that, then we 277 00:15:42,356 --> 00:15:44,956 Speaker 1: can see the answers that users are typing, so that 278 00:15:44,956 --> 00:15:47,996 Speaker 1: we helped digitize something. And the way we made money 279 00:15:48,116 --> 00:15:52,356 Speaker 1: is by charging people who needed digitization stuff. For example, 280 00:15:52,476 --> 00:15:54,996 Speaker 1: The New York Times was our client. The New York 281 00:15:54,996 --> 00:15:57,636 Speaker 1: Times had this old archive of all the editions of 282 00:15:57,676 --> 00:15:59,156 Speaker 1: The New York Times from, you know, one hundred and 283 00:15:59,236 --> 00:16:00,756 Speaker 1: thirty years of The New York Times or something like that, 284 00:16:00,796 --> 00:16:05,116 Speaker 1: from the eighteen hundreds, and they needed this to help 285 00:16:05,156 --> 00:16:07,316 Speaker 1: digitize their whole archive. They were sending us all the 286 00:16:07,396 --> 00:16:10,356 Speaker 1: scans they had scanned already, and 287 00:16:10,476 --> 00:16:12,636 Speaker 1: we were taking all the words that the computer could not recognize, 288 00:16:12,716 --> 00:16:15,156 Speaker 1: and we were getting them read, through the CAPTCHAs, by people who were, 289 00:16:15,156 --> 00:16:18,036 Speaker 1: for example, signing up for Facebook or Twitter or a 290 00:16:18,036 --> 00:16:19,676 Speaker 1: lot of websites that were using our CAPTCHA.
They were 291 00:16:19,676 --> 00:16:21,516 Speaker 1: helping us digitize The New York Times, and we would 292 00:16:21,516 --> 00:16:23,996 Speaker 1: make money from The New York Times. It became very successful, 293 00:16:24,276 --> 00:16:26,956 Speaker 1: and then Google bought it to help their whole book digitization 294 00:16:27,116 --> 00:16:31,276 Speaker 1: project. The new system, called reCAPTCHA, became an even 295 00:16:31,356 --> 00:16:34,396 Speaker 1: bigger hit. Here's how he described the aftermath in his 296 00:16:34,516 --> 00:16:37,276 Speaker 1: TEDx talk. So every time you buy tickets on Ticketmaster, 297 00:16:37,396 --> 00:16:39,796 Speaker 1: you help to digitize a book. Facebook: every time you 298 00:16:39,796 --> 00:16:41,876 Speaker 1: add a friend, you help to digitize a book. Twitter 299 00:16:42,236 --> 00:16:44,396 Speaker 1: and about three hundred and fifty thousand other sites are 300 00:16:44,436 --> 00:16:46,516 Speaker 1: all using reCAPTCHA. And in fact, the number of sites 301 00:16:46,556 --> 00:16:48,436 Speaker 1: that are using reCAPTCHA is so high that the number of 302 00:16:48,516 --> 00:16:51,196 Speaker 1: words that we're digitizing per day is really, really large. 303 00:16:51,276 --> 00:16:53,116 Speaker 1: It's about one hundred million a day, which is the 304 00:16:53,156 --> 00:16:56,396 Speaker 1: equivalent of about two and a half million books a year. 305 00:16:56,796 --> 00:16:58,476 Speaker 1: And this is all being done one word at a 306 00:16:58,516 --> 00:17:05,076 Speaker 1: time by just people typing CAPTCHAs on the Internet. There 307 00:17:05,116 --> 00:17:07,276 Speaker 1: are some people who are a little nervous about Google 308 00:17:07,956 --> 00:17:11,916 Speaker 1: being the owner of one of the most widely used 309 00:17:12,556 --> 00:17:15,996 Speaker 1: CAPTCHA systems. I'm sure you've been asked about that.
Yeah, 310 00:17:15,996 --> 00:17:17,436 Speaker 1: there are people who are nervous about that. I mean, 311 00:17:17,476 --> 00:17:20,596 Speaker 1: I understand. I think, you know, these are 312 00:17:20,676 --> 00:17:25,316 Speaker 1: very, very tricky questions. I mean, personally, I think the 313 00:17:25,396 --> 00:17:28,916 Speaker 1: privacy fight is over. I mean, I've given 314 00:17:28,996 --> 00:17:31,676 Speaker 1: up on my privacy against large companies a while ago. Wow. 315 00:17:31,996 --> 00:17:35,236 Speaker 1: Not only that, I also think, after having been inside Google, 316 00:17:35,276 --> 00:17:37,796 Speaker 1: I saw with how much respect they treat user data, 317 00:17:37,836 --> 00:17:40,956 Speaker 1: because they know that they are, you know, a few 318 00:17:40,996 --> 00:17:45,036 Speaker 1: scandals away from being in deep trouble, so they take 319 00:17:45,076 --> 00:17:47,716 Speaker 1: it with a lot of care, I think. And we 320 00:17:47,716 --> 00:17:50,796 Speaker 1: should point out that Google has said, we do not 321 00:17:50,956 --> 00:17:54,876 Speaker 1: use data collected for advertising purposes. Yeah, that's the case, 322 00:17:54,916 --> 00:17:57,716 Speaker 1: and I actually believe them. Now, remember, Luis 323 00:17:57,756 --> 00:18:00,116 Speaker 1: said that the hard part was finding a test that 324 00:18:00,276 --> 00:18:03,276 Speaker 1: was too hard for a computer to pass, but easy 325 00:18:03,356 --> 00:18:06,996 Speaker 1: enough for a computer to judge whether the test had 326 00:18:07,036 --> 00:18:11,956 Speaker 1: been passed. That's been bugging me. If the computer chooses 327 00:18:12,156 --> 00:18:15,156 Speaker 1: a word that's so distorted that it itself cannot do 328 00:18:15,196 --> 00:18:19,196 Speaker 1: the OCR, then how does it know if we're right? Yeah, 329 00:18:19,196 --> 00:18:21,676 Speaker 1: that's a great question.
When we try to digitize books, 330 00:18:22,076 --> 00:18:24,476 Speaker 1: here's what we do. We take a word that 331 00:18:24,476 --> 00:18:28,156 Speaker 1: the computer does not know. We actually pair it with 332 00:18:28,196 --> 00:18:30,236 Speaker 1: another word for which the computer does know the answer, 333 00:18:30,436 --> 00:18:32,596 Speaker 1: and we actually give people both words, and we say 334 00:18:33,196 --> 00:18:35,116 Speaker 1: please type both, and we don't tell them which one's which. 335 00:18:35,116 --> 00:18:37,116 Speaker 1: We just say, hey, please type both. If they type 336 00:18:37,116 --> 00:18:39,396 Speaker 1: the word for which we know the answer, if they 337 00:18:39,396 --> 00:18:42,516 Speaker 1: type that one correctly, we assume that they're human, and 338 00:18:42,556 --> 00:18:44,636 Speaker 1: we also get some confidence that they typed the other 339 00:18:44,676 --> 00:18:47,476 Speaker 1: word correctly. And then what we do is, okay, so 340 00:18:47,556 --> 00:18:49,316 Speaker 1: now we have a guess for what that other word is. 341 00:18:49,436 --> 00:18:51,956 Speaker 1: We give it to like ten other different people and 342 00:18:51,996 --> 00:18:53,916 Speaker 1: we see if they type the same thing, and if 343 00:18:53,916 --> 00:18:55,676 Speaker 1: they all type the same thing, we get with very 344 00:18:55,716 --> 00:18:59,756 Speaker 1: high accuracy what that word really is, and that works. 345 00:19:00,396 --> 00:19:03,516 Speaker 1: One hallmark of the reCAPTCHA system, in other words, is 346 00:19:03,556 --> 00:19:06,996 Speaker 1: that you have to type in two words. There are sometimes 347 00:19:07,156 --> 00:19:11,036 Speaker 1: also funny combinations of words that happen, especially 348 00:19:11,036 --> 00:19:14,236 Speaker 1: because we are showing two words at a time.
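The pairing-and-agreement scheme Luis describes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual reCAPTCHA code: the function names, the case-insensitive comparison, and the threshold of ten unanimous votes (taken from his "like ten other different people") are all hypothetical.

```python
from collections import Counter

# Hypothetical threshold, based on "like ten other different people".
AGREEMENT_THRESHOLD = 10


def normalize(word: str) -> str:
    """Compare answers case-insensitively and ignore stray whitespace (an assumption)."""
    return word.strip().lower()


def submit(known_answer: str, typed_known: str, typed_unknown: str, votes: Counter) -> bool:
    """Grade one two-word challenge.

    The user passes only if the control word (whose answer is known) is typed
    correctly. Only then is the guess for the unknown word recorded as a vote.
    """
    if normalize(typed_known) != normalize(known_answer):
        return False  # failed the control word: treat as a bot, discard the guess
    votes[normalize(typed_unknown)] += 1
    return True


def resolved_word(votes: Counter, threshold: int = AGREEMENT_THRESHOLD):
    """Return the agreed transcription once enough users have all typed the same thing."""
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    unanimous = count == sum(votes.values())
    return word if unanimous and count >= threshold else None
```

A run with ten users who all pass the control word and agree on the unknown word would resolve it; a user who misses the control word is rejected and contributes no vote.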
Oh boy, 349 00:19:14,676 --> 00:19:16,636 Speaker 1: I mean, you know, there's been all kinds of really 350 00:19:16,676 --> 00:19:19,636 Speaker 1: funny examples where it's just like, you know, a website 351 00:19:19,636 --> 00:19:23,316 Speaker 1: of a church that says, like, bad Christians. And it's 352 00:19:23,356 --> 00:19:26,396 Speaker 1: just... But these are just two randomly chosen words, so 353 00:19:26,436 --> 00:19:30,276 Speaker 1: we shouldn't infer any evil on your part. No, they're random. Now, 354 00:19:30,276 --> 00:19:33,796 Speaker 1: a lot has happened since two thousand, when CAPTCHA came along, 355 00:19:34,196 --> 00:19:37,796 Speaker 1: and since two thousand and six, when you started unsuspectingly 356 00:19:37,836 --> 00:19:41,276 Speaker 1: helping Google and The New York Times digitize their old pages. 357 00:19:41,756 --> 00:19:43,596 Speaker 1: You know, early on, in the first version of 358 00:19:43,676 --> 00:19:47,556 Speaker 1: CAPTCHA, computers were pretty bad at recognizing distorted text, 359 00:19:47,756 --> 00:19:50,116 Speaker 1: so they didn't have to be that distorted. But you know, 360 00:19:50,236 --> 00:19:52,716 Speaker 1: over time, computers got better and better, and in fact, 361 00:19:52,716 --> 00:19:55,916 Speaker 1: by now computers are in many cases about as good 362 00:19:55,916 --> 00:19:58,196 Speaker 1: as humans. Because of that, we have to make them 363 00:19:58,196 --> 00:20:01,236 Speaker 1: harder and harder. A lot of times, the puzzles are 364 00:20:01,316 --> 00:20:05,156 Speaker 1: so hard that even the human can't pass the challenge. 365 00:20:05,236 --> 00:20:08,356 Speaker 1: I'm sure you've been sent screenshots of words that are 366 00:20:08,436 --> 00:20:11,756 Speaker 1: so distorted that no one can tell what they say.
Yes, 367 00:20:12,396 --> 00:20:17,756 Speaker 1: that happens. I mean, it's rare that that happens, and 368 00:20:17,876 --> 00:20:21,996 Speaker 1: that's why the CAPTCHA itself, in true arms race fashion, 369 00:20:22,476 --> 00:20:26,436 Speaker 1: has evolved. So what has happened is that for the 370 00:20:26,516 --> 00:20:30,036 Speaker 1: more secure things, the CAPTCHAs have moved away from these 371 00:20:30,076 --> 00:20:33,036 Speaker 1: distorted characters. And what is being used now, 372 00:20:33,396 --> 00:20:35,676 Speaker 1: the puzzles are now things like you see a bunch 373 00:20:35,676 --> 00:20:37,196 Speaker 1: of pictures and you have to click the ones that 374 00:20:37,316 --> 00:20:41,956 Speaker 1: contain a stop sign. Right, the traffic lights, the fire hydrants. Yeah, 375 00:20:42,036 --> 00:20:45,116 Speaker 1: it's exactly the same idea as reCAPTCHA, except we're not 376 00:20:45,156 --> 00:20:47,436 Speaker 1: trying to digitize books. This a 377 00:20:47,436 --> 00:20:49,996 Speaker 1: lot of times comes from things like all 378 00:20:49,996 --> 00:20:53,996 Speaker 1: the mapping cars or the self driving cars. Basically, these 379 00:20:54,076 --> 00:20:56,956 Speaker 1: are cars that are driving around that are capturing images 380 00:20:56,956 --> 00:20:58,876 Speaker 1: of the whole world. They're trying to figure out what's 381 00:20:58,916 --> 00:21:01,796 Speaker 1: around them. Sometimes they cannot recognize what's in an image. 382 00:21:01,876 --> 00:21:04,356 Speaker 1: So it's a similar case.
It takes the things like, 383 00:21:04,476 --> 00:21:06,316 Speaker 1: is this a stop sign? I'm not sure. Okay, send it 384 00:21:06,316 --> 00:21:08,316 Speaker 1: to a human. And then when you get it and 385 00:21:08,356 --> 00:21:10,916 Speaker 1: you click on the stop sign, you're actually helping either 386 00:21:10,956 --> 00:21:14,036 Speaker 1: the self driving car or the mapping software or whatever 387 00:21:14,196 --> 00:21:16,356 Speaker 1: know that there is actually a stop sign right here. 388 00:21:16,596 --> 00:21:18,876 Speaker 1: Oh, so we're still doing good for the world as 389 00:21:18,876 --> 00:21:21,156 Speaker 1: we do this. Still doing good for the world, or 390 00:21:21,316 --> 00:21:23,956 Speaker 1: for a company. Or for a company, but maybe not 391 00:21:23,956 --> 00:21:26,516 Speaker 1: digitizing books. But it's a similar idea: something that a 392 00:21:26,556 --> 00:21:30,036 Speaker 1: computer cannot do. You've just solved a mystery for hundreds 393 00:21:30,036 --> 00:21:33,716 Speaker 1: of millions of people: why it's always traffic lights and 394 00:21:33,756 --> 00:21:37,436 Speaker 1: fire hydrants we're supposed to choose and not bananas and puppies. 395 00:21:37,596 --> 00:21:40,316 Speaker 1: It has to do with both self driving cars 396 00:21:40,316 --> 00:21:43,556 Speaker 1: and also mapping software. Okay, so now we kind of 397 00:21:43,596 --> 00:21:45,636 Speaker 1: get why we have to put up with these challenges, 398 00:21:46,276 --> 00:21:50,036 Speaker 1: or we did twenty years ago. But really, nothing better 399 00:21:50,076 --> 00:21:53,916 Speaker 1: has come along since? Are we sure that there's nothing 400 00:21:54,116 --> 00:21:58,316 Speaker 1: less annoying that we could do to thwart these spammers? Yes, 401 00:21:58,396 --> 00:22:00,676 Speaker 1: there is. By now, it did become a lot less annoying.
402 00:22:00,716 --> 00:22:02,996 Speaker 1: I don't know if you've seen that of late, where, 403 00:22:03,116 --> 00:22:05,236 Speaker 1: you know, there's a thing that says reCAPTCHA. We're just 404 00:22:05,236 --> 00:22:07,956 Speaker 1: trying to figure out whether you're a human, and they 405 00:22:07,996 --> 00:22:10,796 Speaker 1: just ask you to click somewhere, just click on this box. 406 00:22:11,516 --> 00:22:13,996 Speaker 1: That is much less annoying. So sometimes you don't see 407 00:22:13,996 --> 00:22:17,116 Speaker 1: anything except I'm not a robot. Yeah, yeah, yeah, 408 00:22:17,236 --> 00:22:19,196 Speaker 1: I'm not a robot. This is something that 409 00:22:19,316 --> 00:22:22,356 Speaker 1: is done by Google. This actually comes from, you know, 410 00:22:22,396 --> 00:22:24,436 Speaker 1: the original team, that is the company that they bought 411 00:22:24,516 --> 00:22:27,116 Speaker 1: from me. When you get that one, that means that, 412 00:22:27,756 --> 00:22:31,716 Speaker 1: in this particular case, probably means Google has figured out that, yeah, 413 00:22:31,716 --> 00:22:33,756 Speaker 1: you know what, we know you because you've been around 414 00:22:33,756 --> 00:22:37,316 Speaker 1: since twenty sixteen on this computer, and yeah, you have 415 00:22:37,356 --> 00:22:40,596 Speaker 1: a lot of Gmail emails, and you've done a lot 416 00:22:40,636 --> 00:22:43,516 Speaker 1: of Google search queries. You're a normal person. You're not 417 00:22:43,556 --> 00:22:46,236 Speaker 1: a spammer. So they just do a little thing that 418 00:22:46,316 --> 00:22:48,236 Speaker 1: just tries to double check that, you know, you can 419 00:22:48,276 --> 00:22:51,196 Speaker 1: move the mouse or whatever.
So one thing that has 420 00:22:51,276 --> 00:22:53,276 Speaker 1: changed from the year two thousand and five to now 421 00:22:53,516 --> 00:22:56,036 Speaker 1: is that there are companies like Google or like Facebook 422 00:22:56,396 --> 00:22:59,836 Speaker 1: that, for the majority of people on the Internet, 423 00:22:59,876 --> 00:23:01,796 Speaker 1: kind of know who you are. If you have a 424 00:23:01,796 --> 00:23:05,316 Speaker 1: fresh computer that you've never used before, then you would 425 00:23:05,316 --> 00:23:08,716 Speaker 1: have to do the annoying CAPTCHA. But for most of us, 426 00:23:09,156 --> 00:23:11,356 Speaker 1: you're unlikely to have to type these as much as 427 00:23:11,396 --> 00:23:13,516 Speaker 1: you were back in, say, the year two thousand 428 00:23:13,516 --> 00:23:15,836 Speaker 1: and five. It has become a lot better, you know, 429 00:23:16,116 --> 00:23:18,716 Speaker 1: probably a little bit at the cost of your privacy. Okay, 430 00:23:18,716 --> 00:23:21,876 Speaker 1: but wait a minute, we now know that computers eventually 431 00:23:21,876 --> 00:23:25,556 Speaker 1: got too smart for the distorted text reading Turing tests. 432 00:23:26,036 --> 00:23:28,676 Speaker 1: Won't they eventually get good enough to identify a few 433 00:23:28,956 --> 00:23:31,956 Speaker 1: stupid stop signs in a photo grid? It is, it's 434 00:23:31,956 --> 00:23:34,596 Speaker 1: a cat-and-mouse game. Now, probably there's a bunch 435 00:23:34,636 --> 00:23:37,796 Speaker 1: of people working on making better recognition of stop signs 436 00:23:37,876 --> 00:23:40,836 Speaker 1: or something like that eventually. But eventually computers are going 437 00:23:40,876 --> 00:23:42,836 Speaker 1: to be able to do everything humans can, and so 438 00:23:43,116 --> 00:23:45,076 Speaker 1: at some point there won't be a test that can 439 00:23:45,076 --> 00:23:47,556 Speaker 1: distinguish humans from computers.
Well, wait a minute, does 440 00:23:47,596 --> 00:23:50,836 Speaker 1: that mean the end of the internet? I mean, what 441 00:23:50,956 --> 00:23:53,836 Speaker 1: happens if there's no sort of Turing test 442 00:23:53,876 --> 00:23:56,196 Speaker 1: that works anymore? I don't think it's the end of 443 00:23:56,196 --> 00:24:00,316 Speaker 1: the internet, particularly because, like I said, more and more 444 00:24:00,636 --> 00:24:02,476 Speaker 1: these companies are going to know more and more about you, 445 00:24:02,956 --> 00:24:06,116 Speaker 1: and I just don't think there will be a huge problem. Okay, well, 446 00:24:06,236 --> 00:24:10,156 Speaker 1: whatever the end game is, why can't we do it today? 447 00:24:10,276 --> 00:24:12,636 Speaker 1: Since we know it's an arms race, since we know 448 00:24:12,756 --> 00:24:17,356 Speaker 1: that eventually we'll lose it to AI and computers, why 449 00:24:17,356 --> 00:24:22,396 Speaker 1: can't we jump to whatever will follow it now? I'll 450 00:24:22,396 --> 00:24:23,996 Speaker 1: tell you why. This, by the way, it's like ninety 451 00:24:23,996 --> 00:24:25,956 Speaker 1: five percent of the way there. I mean, really, for 452 00:24:26,076 --> 00:24:28,436 Speaker 1: most of us, you know, Facebook knows who we are, 453 00:24:28,476 --> 00:24:30,876 Speaker 1: and Google knows who we are. So it's ninety percent 454 00:24:30,916 --> 00:24:32,556 Speaker 1: of the way there. The reason it's not one hundred 455 00:24:32,556 --> 00:24:33,956 Speaker 1: percent of the way there is because there are some 456 00:24:33,956 --> 00:24:37,756 Speaker 1: people who really care about privacy, and you know, 457 00:24:37,796 --> 00:24:39,556 Speaker 1: there's always going to be a kind of a way 458 00:24:39,596 --> 00:24:42,756 Speaker 1: to browse privately. So for example, Chrome has 459 00:24:42,756 --> 00:24:45,596 Speaker 1: private browsing.
So it's all the stuff when people care 460 00:24:45,596 --> 00:24:48,796 Speaker 1: about privacy. I mean, there's a trade off here, right? Well, 461 00:24:48,836 --> 00:24:52,036 Speaker 1: the irony is it seems like most of the websites 462 00:24:52,076 --> 00:24:54,076 Speaker 1: that present me with a CAPTCHA I'm trying to get 463 00:24:54,116 --> 00:24:57,876 Speaker 1: to in order to supply my name and address, like 464 00:24:57,876 --> 00:25:00,356 Speaker 1: I'm signing up for something. Yes, it's funny. Why 465 00:25:00,356 --> 00:25:02,756 Speaker 1: do I need privacy when the whole purpose is to 466 00:25:02,796 --> 00:25:07,956 Speaker 1: supply my information? Yeah, that's funny. Now. I mentioned at 467 00:25:07,996 --> 00:25:12,356 Speaker 1: the beginning that Luis has had three world changing ideas. 468 00:25:12,876 --> 00:25:15,716 Speaker 1: You've heard about the gym membership that powers the grid, 469 00:25:16,076 --> 00:25:18,916 Speaker 1: and you now know about CAPTCHA. But what about his 470 00:25:19,036 --> 00:25:24,636 Speaker 1: third creation? It's Duolingo, the language training app. At 471 00:25:24,636 --> 00:25:29,356 Speaker 1: this moment, it has half a billion registered users learning 472 00:25:29,396 --> 00:25:38,836 Speaker 1: forty different languages, all for free. And from the very 473 00:25:38,876 --> 00:25:42,716 Speaker 1: beginning you could see the fingerprints of Luis von Ahn, master 474 00:25:42,836 --> 00:25:46,956 Speaker 1: of crowdsourcing, all over it. In early Duolingo, as 475 00:25:46,996 --> 00:25:49,196 Speaker 1: you were learning a language on Duolingo, you were actually 476 00:25:49,236 --> 00:25:52,276 Speaker 1: helping us to translate stuff that computers could not translate. 477 00:25:52,556 --> 00:25:55,116 Speaker 1: In fact, CNN was a client, so CNN would send 478 00:25:55,196 --> 00:25:58,156 Speaker 1: us their news in English.
We would then give it 479 00:25:58,196 --> 00:26:00,196 Speaker 1: to people who were Spanish speakers who were learning English, 480 00:26:00,196 --> 00:26:01,716 Speaker 1: and we would say, hey, you want to practice your English? 481 00:26:01,756 --> 00:26:05,596 Speaker 1: Help us translate this CNN article into your native language 482 00:26:05,596 --> 00:26:08,116 Speaker 1: of Spanish. And so they would do it, and they 483 00:26:08,156 --> 00:26:11,836 Speaker 1: would be learning English, and then we would get that translation, 484 00:26:11,956 --> 00:26:13,876 Speaker 1: and then we would send it back to CNN and 485 00:26:13,916 --> 00:26:16,156 Speaker 1: they would pay us for the translation. That was the 486 00:26:16,356 --> 00:26:19,796 Speaker 1: very first version of Duolingo. It turned out that, 487 00:26:20,036 --> 00:26:21,956 Speaker 1: just like the gym, it ends up being that you 488 00:26:22,796 --> 00:26:25,796 Speaker 1: just can't make much money from this, and so we decided, okay, 489 00:26:25,916 --> 00:26:28,076 Speaker 1: just go to a business model where we actually 490 00:26:28,076 --> 00:26:30,476 Speaker 1: give you ads, and the way we make money is by, 491 00:26:30,636 --> 00:26:33,756 Speaker 1: you know, showing you ads. The dude just keeps doing that. 492 00:26:34,236 --> 00:26:36,916 Speaker 1: He keeps coming up with ideas that make the world 493 00:26:36,916 --> 00:26:40,396 Speaker 1: a better place, thwart the bad guys, and make a 494 00:26:40,396 --> 00:26:43,276 Speaker 1: lot of money. It's really a shame he gave up 495 00:26:43,316 --> 00:26:47,076 Speaker 1: that electrical grid gym thing. Are there ever things that 496 00:26:47,276 --> 00:26:49,436 Speaker 1: come to you in the shower that might be your 497 00:26:49,796 --> 00:26:53,716 Speaker 1: big third act?
I mean, honestly, to have the impact 498 00:26:53,716 --> 00:26:57,156 Speaker 1: you've had twice is astonishing, but it makes me think 499 00:26:57,196 --> 00:27:01,156 Speaker 1: there's something in you that just has great ideas that 500 00:27:01,276 --> 00:27:05,556 Speaker 1: can go really wide. You know, as time passes, I 501 00:27:05,596 --> 00:27:08,716 Speaker 1: am a lot more interested in literacy and teaching people 502 00:27:08,756 --> 00:27:11,436 Speaker 1: how to read. I think with a computer, we should 503 00:27:11,476 --> 00:27:12,916 Speaker 1: be able to teach the whole world how to read 504 00:27:12,996 --> 00:27:15,036 Speaker 1: significantly better than humans can teach you how to read. 505 00:27:15,356 --> 00:27:17,636 Speaker 1: You know, the US is fine, most adults 506 00:27:17,636 --> 00:27:20,316 Speaker 1: in the US know how to read, but in many countries in the 507 00:27:20,356 --> 00:27:22,836 Speaker 1: world there's a significant fraction of people who don't know 508 00:27:22,876 --> 00:27:24,716 Speaker 1: how to read. In fact, there's about a billion adults 509 00:27:24,716 --> 00:27:27,676 Speaker 1: in the world that are illiterate. And I think we 510 00:27:27,716 --> 00:27:29,436 Speaker 1: can, I think we can make a big dent, you know, 511 00:27:29,476 --> 00:27:31,476 Speaker 1: with a system to teach people how to read. So 512 00:27:31,476 --> 00:27:34,596 Speaker 1: we're working on that in the meantime. Now you know 513 00:27:34,636 --> 00:27:38,396 Speaker 1: why you have to encounter those infernal website challenges. You 514 00:27:38,436 --> 00:27:41,316 Speaker 1: know how they came about, and you now consider them 515 00:27:41,596 --> 00:27:45,836 Speaker 1: a necessary evil. Well, maybe you do. Just for people who 516 00:27:45,836 --> 00:27:47,476 Speaker 1: are like, I don't know what it is, I just 517 00:27:47,476 --> 00:27:49,516 Speaker 1: don't like doing it.
I can't even tell what's a 518 00:27:49,516 --> 00:27:52,996 Speaker 1: freaking traffic light. Let's just lay out what would happen 519 00:27:53,156 --> 00:27:56,516 Speaker 1: if all these challenges went away tomorrow. What would happen 520 00:27:56,556 --> 00:28:00,556 Speaker 1: to the Internet? Most likely, you would get a lot 521 00:28:00,796 --> 00:28:06,796 Speaker 1: more spam, either email spam, or you'd get 522 00:28:06,836 --> 00:28:09,756 Speaker 1: more kind of random Facebook followers that are 523 00:28:09,796 --> 00:28:13,156 Speaker 1: not, you know, real people. These fake accounts can start 524 00:28:13,276 --> 00:28:17,836 Speaker 1: boosting up political messages. There would be probably more 525 00:28:17,836 --> 00:28:21,236 Speaker 1: fake news. There would probably be, you know, more spam. Right, 526 00:28:21,396 --> 00:28:28,116 Speaker 1: and from spam, phishing and spyware. And yeah, more spyware. Yeah. 527 00:28:28,356 --> 00:28:30,636 Speaker 1: The web would be a less safe place. All right. 528 00:28:30,676 --> 00:28:33,956 Speaker 1: So when you do explain this to someone at the 529 00:28:33,996 --> 00:28:38,916 Speaker 1: proverbial party, are they generally satisfied with the notion that? Yeah, 530 00:28:38,996 --> 00:28:41,636 Speaker 1: I think most people, I think most people realize that 531 00:28:41,676 --> 00:28:43,436 Speaker 1: it's like, you know, these things are kind of like 532 00:28:43,476 --> 00:28:46,636 Speaker 1: a key. Nobody likes it. It's not like 533 00:28:46,676 --> 00:28:48,916 Speaker 1: I love opening my door with the key. It's kind 534 00:28:48,916 --> 00:28:51,836 Speaker 1: of annoying, but yeah, it's there, and I understand it 535 00:28:51,876 --> 00:28:54,316 Speaker 1: just makes my house safer. In this case, 536 00:28:54,356 --> 00:28:56,436 Speaker 1: it kind of just makes the whole Internet safer. I 537 00:28:56,756 --> 00:29:00,316 Speaker 1: kind of gotta do it.
Thanks for listening to this 538 00:29:00,436 --> 00:29:03,036 Speaker 1: second to last episode of the season. If you have 539 00:29:03,076 --> 00:29:06,516 Speaker 1: any interest in a second season, please spread the word, 540 00:29:06,836 --> 00:29:10,956 Speaker 1: subscribe to this podcast, leave a review on Apple Podcasts, 541 00:29:11,076 --> 00:29:14,636 Speaker 1: or a rating on Spotify. Unsung Science with David Pogue 542 00:29:14,636 --> 00:29:18,116 Speaker 1: is presented by Simon and Schuster and CBS News and 543 00:29:18,276 --> 00:29:22,316 Speaker 1: produced by PRX Productions. The executive producers for Simon and 544 00:29:22,316 --> 00:29:26,396 Speaker 1: Schuster are Richard Rohrer and Chris Lynch. The PRX production 545 00:29:26,436 --> 00:29:30,996 Speaker 1: team is Jocelyn Gonzalez, Morgan Flannery, Pedro, Raphael Rosatto, and 546 00:29:31,196 --> 00:29:35,636 Speaker 1: Ian Fox. Project manager Jesse Nelson composed the Unsung Science 547 00:29:35,676 --> 00:29:39,476 Speaker 1: theme music and Christina Robello fact checked my script. At 548 00:29:39,556 --> 00:29:43,036 Speaker 1: Unsung Science dot com, you can listen to every episode 549 00:29:43,036 --> 00:29:46,996 Speaker 1: we've ever made and read complete transcripts. For more of 550 00:29:47,036 --> 00:29:50,316 Speaker 1: my stuff, visit David Pogue dot com or follow me 551 00:29:50,356 --> 00:29:58,716 Speaker 1: on Twitter at Pogue Pogue. Thanks for the scene that 552 00:29:58,916 --> 00:30:01,876 Speaker 1: was an episode of Unsung Science from our friends at 553 00:30:01,876 --> 00:30:05,236 Speaker 1: CBS News. You can find more episodes of Unsung Science 554 00:30:05,636 --> 00:30:07,596 Speaker 1: wherever you get your podcasts.