WEBVTT - From Unsung Science with David Pogue: The Man Who Stopped the Spammers

0:00:15.316 --> 0:00:20.836
<v Speaker 1>Pushkin. Hi, it's Jacob Goldstein, and I'm here today with

0:00:20.916 --> 0:00:23.596
<v Speaker 1>another podcast I think you might like. The show is

0:00:23.636 --> 0:00:27.636
<v Speaker 1>called Unsung Science and it's hosted by David Pogue. You

0:00:27.716 --> 0:00:31.036
<v Speaker 1>might know David from CBS Sunday Morning, where he's a

0:00:31.116 --> 0:00:35.276
<v Speaker 1>correspondent covering topics like science, tech, and innovation, topics like

0:00:35.316 --> 0:00:37.716
<v Speaker 1>the ones we talk about here on What's Your Problem.

0:00:37.796 --> 0:00:40.196
<v Speaker 1>In the episode you're about to hear, David chats with

0:00:40.356 --> 0:00:44.036
<v Speaker 1>Luis von Ahn, the founder and CEO of the popular language

0:00:44.036 --> 0:00:47.036
<v Speaker 1>app Duolingo. You might recall I talked with Luis earlier

0:00:47.036 --> 0:00:50.676
<v Speaker 1>this year about Duolingo and language and the current limits

0:00:50.676 --> 0:00:53.636
<v Speaker 1>of artificial intelligence. But the show you're about to hear

0:00:53.796 --> 0:00:57.276
<v Speaker 1>is about what Luis did before he started Duolingo. He

0:00:57.316 --> 0:01:00.436
<v Speaker 1>invented this thing called CAPTCHA. CAPTCHA is that test that

0:01:00.476 --> 0:01:02.396
<v Speaker 1>you have to take all the time on the Internet

0:01:02.436 --> 0:01:05.556
<v Speaker 1>to prove that you're not a robot. And yes, Luis

0:01:05.676 --> 0:01:08.196
<v Speaker 1>knows that the test is super annoying. But the story

0:01:08.236 --> 0:01:11.156
<v Speaker 1>of CAPTCHA and what happened with it is really interesting.

0:01:11.916 --> 0:01:17.916
<v Speaker 1>It's got some great twists. By the year two thousand,

0:01:18.116 --> 0:01:22.556
<v Speaker 1>the Internet was already becoming a cesspool. Software bots were

0:01:22.596 --> 0:01:26.156
<v Speaker 1>signing up for millions of fake email accounts for sending

0:01:26.156 --> 0:01:31.436
<v Speaker 1>out spam. Luis von Ahn stopped them. He invented the CAPTCHA,

0:01:31.956 --> 0:01:35.436
<v Speaker 1>the website login test where you have to decipher the

0:01:35.476 --> 0:01:38.436
<v Speaker 1>distorted image of a word, where you have to find

0:01:38.556 --> 0:01:41.556
<v Speaker 1>the traffic lights in a grid of nine blurry photos.

0:01:42.156 --> 0:01:45.356
<v Speaker 1>The only problem? We hate that test. I would be

0:01:45.396 --> 0:01:47.036
<v Speaker 1>at a party and you know, people would ask me

0:01:47.036 --> 0:01:48.316
<v Speaker 1>what I did, and I would tell them that I

0:01:48.316 --> 0:01:50.076
<v Speaker 1>helped invent that thing, and people would tell me, oh,

0:01:50.116 --> 0:01:54.076
<v Speaker 1>I hate you. I'm David Pogue, and this is Unsung Science,

0:02:01.476 --> 0:02:06.356
<v Speaker 1>Season one, episode fourteen: The Man Who Stopped the Spammers.

0:02:08.236 --> 0:02:10.676
<v Speaker 1>In his forty-three years on this earth so far,

0:02:11.316 --> 0:02:17.516
<v Speaker 1>Luis von Ahn has had three ingenious, innovative, world-changing ideas.

0:02:18.796 --> 0:02:22.996
<v Speaker 1>I guarantee that you've encountered his second one, probably hundreds

0:02:22.996 --> 0:02:27.196
<v Speaker 1>of times. Actually, most of us have zero world-changing ideas.

0:02:27.596 --> 0:02:33.996
<v Speaker 1>Occasionally somebody has one, but three times? His first idea

0:02:34.076 --> 0:02:37.836
<v Speaker 1>came to him in Guatemala, where he grew up. They

0:02:37.916 --> 0:02:41.516
<v Speaker 1>wanted to start a gym where instead of charging people

0:02:41.876 --> 0:02:44.116
<v Speaker 1>to show up, let people just show up for free.

0:02:44.316 --> 0:02:46.716
<v Speaker 1>We're going to connect all the machines to kind of

0:02:46.756 --> 0:02:48.916
<v Speaker 1>the power grid, and we're going to use the kinetic

0:02:48.996 --> 0:02:53.076
<v Speaker 1>energy that people had whenever they were exercising to generate power.

0:02:53.716 --> 0:02:55.276
<v Speaker 1>And I thought we could make a lot of money

0:02:55.276 --> 0:02:57.356
<v Speaker 1>from that. Now you will note that I did not

0:02:57.476 --> 0:03:00.956
<v Speaker 1>say that all three of his world-changing ideas actually

0:03:00.996 --> 0:03:03.836
<v Speaker 1>succeeded in changing the world. I thought I was the

0:03:03.836 --> 0:03:05.396
<v Speaker 1>first person to have this idea. It turns out it's

0:03:05.396 --> 0:03:07.716
<v Speaker 1>a very old idea. It also turns out it doesn't work.

0:03:07.836 --> 0:03:11.756
<v Speaker 1>That's right, the pedal-power gym idea flopped. It turns

0:03:11.756 --> 0:03:13.836
<v Speaker 1>out this is not a good idea for many reasons,

0:03:13.836 --> 0:03:15.436
<v Speaker 1>the biggest one of which is that humans are just

0:03:15.516 --> 0:03:18.316
<v Speaker 1>not very good at creating energy. Oh, you just

0:03:18.356 --> 0:03:20.116
<v Speaker 1>don't make a lot of money from this. There's another

0:03:20.156 --> 0:03:22.396
<v Speaker 1>reason why this doesn't work a lot. It turns out

0:03:22.476 --> 0:03:24.476
<v Speaker 1>gyms make most of their money from people who don't

0:03:24.476 --> 0:03:26.956
<v Speaker 1>show up. Of course, here you kind of need people

0:03:26.996 --> 0:03:30.596
<v Speaker 1>to show up. To be fair, he was pretty new

0:03:30.636 --> 0:03:33.676
<v Speaker 1>at the game when he had this first idea. And

0:03:33.796 --> 0:03:36.196
<v Speaker 1>how old were you at this point? Twelve years old,

0:03:36.196 --> 0:03:39.396
<v Speaker 1>eleven years old. Things started going better six years later,

0:03:39.676 --> 0:03:42.996
<v Speaker 1>when he came to the United States to attend Duke University.

0:03:43.636 --> 0:03:46.716
<v Speaker 1>As the year two thousand dawned, Luis was at Carnegie

0:03:46.756 --> 0:03:49.476
<v Speaker 1>Mellon in his first year of working toward a PhD

0:03:49.796 --> 0:03:53.916
<v Speaker 1>in computer science, and one fateful day he went to

0:03:53.996 --> 0:03:58.236
<v Speaker 1>a talk by an Israeli computer scientist named Udi Manber,

0:03:58.676 --> 0:04:01.916
<v Speaker 1>who at this point was the chief scientist at Yahoo.

0:04:02.076 --> 0:04:03.756
<v Speaker 1>By the way, in the year two thousand, Yahoo

0:04:03.836 --> 0:04:07.076
<v Speaker 1>was the biggest tech company in the world, like

0:04:07.076 --> 0:04:09.996
<v Speaker 1>the Google of today. And you know, he was giving

0:04:10.276 --> 0:04:12.756
<v Speaker 1>a talk about ten problems that they didn't know how

0:04:12.756 --> 0:04:15.916
<v Speaker 1>to solve inside the company. And one of those

0:04:16.156 --> 0:04:18.996
<v Speaker 1>ten problems that the greatest minds at Yahoo could not

0:04:19.116 --> 0:04:24.356
<v Speaker 1>solve was automated software spam bots signing up for free

0:04:24.436 --> 0:04:28.916
<v Speaker 1>Yahoo Mail accounts by the millions. Yahoo gave out free

0:04:28.916 --> 0:04:30.796
<v Speaker 1>email accounts, and there were people who wanted to send

0:04:30.836 --> 0:04:34.636
<v Speaker 1>spam from Yahoo accounts. But each Yahoo account only allowed

0:04:34.636 --> 0:04:37.316
<v Speaker 1>you to send like five hundred messages a day. If

0:04:37.316 --> 0:04:39.796
<v Speaker 1>you wanted to send millions of spam emails per day,

0:04:40.636 --> 0:04:42.996
<v Speaker 1>then what these people did is they wrote programs to

0:04:43.036 --> 0:04:46.636
<v Speaker 1>obtain millions of Yahoo accounts every day, and they didn't

0:04:46.676 --> 0:04:47.996
<v Speaker 1>know how to solve that problem, how to stop that.

0:04:48.436 --> 0:04:51.116
<v Speaker 1>So I started talking about it with a person who

0:04:51.236 --> 0:04:54.276
<v Speaker 1>had just become my PhD advisor. His name was Manuel

0:04:54.316 --> 0:04:57.516
<v Speaker 1>Blum, or is Manuel Blum. He's still, he's most definitely still alive.

0:04:57.836 --> 0:05:00.196
<v Speaker 1>And you know, we started thinking, and this is where

0:05:00.196 --> 0:05:02.756
<v Speaker 1>this idea of a CAPTCHA came up. The idea was this:

0:05:03.556 --> 0:05:06.556
<v Speaker 1>anytime you tried to sign up for a Yahoo Mail account,

0:05:06.956 --> 0:05:10.596
<v Speaker 1>you'd encounter a little puzzle, something easy for a person

0:05:10.636 --> 0:05:13.796
<v Speaker 1>to solve, but hard for a spambot. The way to

0:05:13.876 --> 0:05:16.676
<v Speaker 1>stop these spammers was to have a test that can

0:05:16.756 --> 0:05:20.036
<v Speaker 1>distinguish between whether you're a human or a computer. If

0:05:20.116 --> 0:05:23.196
<v Speaker 1>you are a human, then presumably you can't get millions

0:05:23.236 --> 0:05:26.276
<v Speaker 1>of email accounts because you get bored, whereas if you're

0:05:26.276 --> 0:05:28.076
<v Speaker 1>a computer, you can get millions. So if the only

0:05:28.276 --> 0:05:30.796
<v Speaker 1>entities that we were giving email accounts to were humans,

0:05:31.036 --> 0:05:34.436
<v Speaker 1>then that would stop the spam. CAPTCHA, the name he

0:05:34.476 --> 0:05:38.556
<v Speaker 1>gave his online mini puzzle, is an acronym. It stands

0:05:38.596 --> 0:05:44.636
<v Speaker 1>for Completely Automated Public Turing test to tell Computers and

0:05:44.836 --> 0:05:49.476
<v Speaker 1>Humans Apart, more or less. Not sure if you've heard

0:05:49.516 --> 0:05:52.436
<v Speaker 1>of the Turing test, but it is incredibly famous among

0:05:52.476 --> 0:05:57.196
<v Speaker 1>computer scientists. It's this experiment proposed by British mathematician and

0:05:57.236 --> 0:06:01.036
<v Speaker 1>computer scientist Alan Turing, who's known as the father of

0:06:01.156 --> 0:06:04.836
<v Speaker 1>artificial intelligence. There was actually a movie about him called

0:06:04.836 --> 0:06:09.796
<v Speaker 1>The Imitation Game, where Benedict Cumberbatch played Alan Turing. Would

0:06:09.836 --> 0:06:14.996
<v Speaker 1>you like to play? Play? It's a game, a test

0:06:15.036 --> 0:06:20.636
<v Speaker 1>of sorts for determining whether something is a machine or a

0:06:20.756 --> 0:06:24.796
<v Speaker 1>human being. Anyway, the Turing test is intended to set

0:06:24.796 --> 0:06:29.156
<v Speaker 1>a standard for determining if a computer has achieved true

0:06:29.476 --> 0:06:33.436
<v Speaker 1>artificial intelligence. When can we tell that a computer is

0:06:33.436 --> 0:06:35.996
<v Speaker 1>actually intelligent? This is kind of like a philosophical test

0:06:35.996 --> 0:06:38.756
<v Speaker 1>that said, like, look, we're going to have a human

0:06:38.876 --> 0:06:42.316
<v Speaker 1>judge ask questions to two entities. One is the computer,

0:06:42.396 --> 0:06:44.916
<v Speaker 1>one is the human. The computer and the human are

0:06:44.996 --> 0:06:48.876
<v Speaker 1>hidden behind two curtains. The judge can't see them. The

0:06:48.996 --> 0:06:51.876
<v Speaker 1>judge types in questions and then looks at the text

0:06:51.916 --> 0:06:55.956
<v Speaker 1>of the responses. If it's impossible to tell which answer

0:06:55.996 --> 0:06:58.676
<v Speaker 1>came from the person and which from the computer, the

0:06:58.676 --> 0:07:02.316
<v Speaker 1>computer has passed the Turing test. The judge can just

0:07:02.356 --> 0:07:04.676
<v Speaker 1>ask whatever questions they want, and if we really can't

0:07:04.676 --> 0:07:07.236
<v Speaker 1>distinguish them, we'll say the computer is really intelligent. To

0:07:07.356 --> 0:07:09.716
<v Speaker 1>this day, we have not made a computer that can

0:07:09.716 --> 0:07:13.276
<v Speaker 1>actually pass the Turing test successfully. It's just, it's just

0:07:13.316 --> 0:07:15.756
<v Speaker 1>too hard. The funny thing is, if you really think

0:07:15.756 --> 0:07:19.476
<v Speaker 1>about it, the CAPTCHA problem is the opposite of the

0:07:19.516 --> 0:07:23.196
<v Speaker 1>Turing test. The Turing test is successful if the judge

0:07:23.516 --> 0:07:26.476
<v Speaker 1>can't tell the difference between a person and a machine.

0:07:27.036 --> 0:07:29.796
<v Speaker 1>The whole point of Luis von Ahn's project was to create

0:07:29.836 --> 0:07:33.596
<v Speaker 1>a test that can tell the difference. There's another difference

0:07:33.596 --> 0:07:36.596
<v Speaker 1>between the two tests too. Here's the key. In this case,

0:07:36.636 --> 0:07:40.116
<v Speaker 1>the judge was a human. In our case, for the CAPTCHA,

0:07:40.236 --> 0:07:41.716
<v Speaker 1>what we needed to do is we needed the judge

0:07:41.756 --> 0:07:44.276
<v Speaker 1>to be a computer, because we need the

0:07:44.276 --> 0:07:46.636
<v Speaker 1>computer to determine whether it's talking to a human or a computer,

0:07:46.796 --> 0:07:49.436
<v Speaker 1>which is much harder in some sense, at

0:07:49.476 --> 0:07:51.316
<v Speaker 1>least for it to grade. So I think the hardest

0:07:51.316 --> 0:07:53.916
<v Speaker 1>thing was just coming up with this general idea that like, okay,

0:07:54.036 --> 0:07:56.036
<v Speaker 1>what we need is a test that can distinguish humans

0:07:56.116 --> 0:07:59.796
<v Speaker 1>from computers, but that computers need to be able to grade.

0:07:59.916 --> 0:08:02.116
<v Speaker 1>Then after that we started coming up with like, okay,

0:08:02.116 --> 0:08:03.756
<v Speaker 1>what are the things that computers are not very good at?

0:08:04.316 --> 0:08:07.996
<v Speaker 1>In the year two thousand, the answer was obvious: computers

0:08:08.036 --> 0:08:12.636
<v Speaker 1>are not very good at identifying what's in pictures. We

0:08:12.876 --> 0:08:15.756
<v Speaker 1>quickly honed in on images and just doing, you know,

0:08:15.916 --> 0:08:19.316
<v Speaker 1>images of text, images of flowers, images of stuff. And

0:08:19.356 --> 0:08:21.716
<v Speaker 1>then after a while, the images of text were the

0:08:21.716 --> 0:08:23.556
<v Speaker 1>ones that seemed like the best idea. And then I

0:08:23.596 --> 0:08:27.276
<v Speaker 1>just went and developed a program that distorted random text

0:08:27.916 --> 0:08:29.916
<v Speaker 1>and that was the first version of a CAPTCHA.

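NOTE
A minimal sketch of the generate-and-grade idea described in the preceding cues, not the original program. The function names, the six-letter length, and the lowercase alphabet are assumptions for illustration, and the step that renders the text as a distorted image is left out.
```python
import secrets
import string
# Hypothetical alphabet; the episode does not say which characters were used.
ALPHABET = string.ascii_lowercase
def new_challenge(length: int = 6) -> str:
    """Pick random characters; drawing them as a distorted image is omitted here."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
def grade(expected: str, typed: str) -> bool:
    """Compare the typed answer with the stored one, ignoring case and outer spaces."""
    return typed.strip().lower() == expected
challenge = new_challenge()
print(grade(challenge, challenge.upper()))  # True
print(grade(challenge, challenge + "x"))    # False
```
The grader never has to read the image: it already holds the text it generated, so it only compares strings.
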
0:08:30.396 --> 0:08:33.236
<v Speaker 1>That's right. The test they came up with presents you

0:08:33.236 --> 0:08:36.396
<v Speaker 1>with the image of a typed word, but the letters

0:08:36.396 --> 0:08:40.516
<v Speaker 1>are all like twisted, bent and distorted, as though the

0:08:40.556 --> 0:08:44.716
<v Speaker 1>typist were severely drunk and typing on Saran Wrap. You

0:08:44.756 --> 0:08:47.996
<v Speaker 1>are supposed to interpret what that word is and type

0:08:47.996 --> 0:08:52.476
<v Speaker 1>it into a box on the website. Actually, computers in

0:08:52.516 --> 0:08:56.076
<v Speaker 1>the early two thousands were pretty good at OCR. That's

0:08:56.196 --> 0:09:00.596
<v Speaker 1>optical character recognition, meaning looking at a picture of text

0:09:00.836 --> 0:09:03.556
<v Speaker 1>and figuring out what the letters are. But the added

0:09:03.636 --> 0:09:07.796
<v Speaker 1>challenge of the twisty distortion really threw those OCR programs

0:09:07.876 --> 0:09:11.396
<v Speaker 1>off the track. Behind the scenes, I mean, what is it.

0:09:11.436 --> 0:09:13.596
<v Speaker 1>I mean there's got to be some I don't know,

0:09:13.636 --> 0:09:18.236
<v Speaker 1>SQL database or massive bank of little images. I mean,

0:09:18.396 --> 0:09:20.916
<v Speaker 1>actually there was no database at first. We would just

0:09:21.276 --> 0:09:23.116
<v Speaker 1>write a program that what it would do is it

0:09:23.116 --> 0:09:26.436
<v Speaker 1>would first pick some random characters, would put them on

0:09:26.436 --> 0:09:28.316
<v Speaker 1>an image, then it would distort them, and then we

0:09:28.356 --> 0:09:31.236
<v Speaker 1>would save that image. And then we just had I

0:09:31.236 --> 0:09:32.956
<v Speaker 1>don't know, a couple of million of those saved, not

0:09:32.996 --> 0:09:34.676
<v Speaker 1>even in a SQL database. They were just there,

0:09:34.716 --> 0:09:39.596
<v Speaker 1>saved as files. It worked brilliantly. The spambots didn't

0:09:39.636 --> 0:09:43.676
<v Speaker 1>have a chance. At the time, von Ahn had no idea

0:09:43.796 --> 0:09:46.836
<v Speaker 1>if his invention would be of any commercial use. But

0:09:46.956 --> 0:09:50.916
<v Speaker 1>one guy he knew would be interested: Udi Manber, that

0:09:51.156 --> 0:09:54.396
<v Speaker 1>Yahoo chief scientist who'd given the talk that started this

0:09:54.436 --> 0:09:57.116
<v Speaker 1>whole affair. We sent him an email saying, hey, we

0:09:57.196 --> 0:10:00.156
<v Speaker 1>think we can solve your problem, and he said, oh,

0:10:00.196 --> 0:10:01.996
<v Speaker 1>that seems like it solves the problem. And then

0:10:02.036 --> 0:10:03.956
<v Speaker 1>in fact, pretty soon after that it was being used

0:10:03.956 --> 0:10:06.836
<v Speaker 1>by Yahoo, and then basically every website started using it,

0:10:07.676 --> 0:10:11.476
<v Speaker 1>and you know, there were millions of websites out there

0:10:11.276 --> 0:10:18.516
<v Speaker 1>that were using it. Well, how wonderful! Luis von Ahn's ingenuity: one. Spammers:

0:10:18.636 --> 0:10:23.436
<v Speaker 1>zero. Internet: saved. And at first I was very kind

0:10:23.476 --> 0:10:25.636
<v Speaker 1>of proud of myself because, okay, look at the impact

0:10:25.676 --> 0:10:28.716
<v Speaker 1>that my work has had. Basically we stopped spam. It's being

0:10:28.796 --> 0:10:31.796
<v Speaker 1>used by a lot of people. There was only one problem now:

0:10:32.956 --> 0:10:36.596
<v Speaker 1>people hated his invention. How many of you have had

0:10:36.636 --> 0:10:38.156
<v Speaker 1>to fill out some sort of web form where you've

0:10:38.156 --> 0:10:40.756
<v Speaker 1>had to read a distorted sequence of characters like this? Yeah,

0:10:40.836 --> 0:10:43.876
<v Speaker 1>how many of you found it really, really annoying? Okay,

0:10:43.916 --> 0:10:48.876
<v Speaker 1>outstanding. So I invented that. That's how he introduces

0:10:48.956 --> 0:10:52.196
<v Speaker 1>himself in a twenty eleven TEDx talk at Carnegie Mellon.

0:10:52.476 --> 0:10:54.036
<v Speaker 1>I would be at a party, and you know, people

0:10:54.036 --> 0:10:55.436
<v Speaker 1>would ask me what I did, and I would tell

0:10:55.476 --> 0:10:57.236
<v Speaker 1>them that I helped invent that thing. And people would

0:10:57.236 --> 0:10:59.796
<v Speaker 1>tell me, oh, I hate you. That's right. The inventor

0:10:59.836 --> 0:11:03.316
<v Speaker 1>of CAPTCHA is fully aware that people hate the thing.

0:11:03.836 --> 0:11:08.716
<v Speaker 1>I say, either well, I'm sorry, or I find it

0:11:08.716 --> 0:11:12.156
<v Speaker 1>annoying too. You've heard it right here, folks. Even he

0:11:12.316 --> 0:11:15.956
<v Speaker 1>finds them annoying. In fact, Luis can tell you exactly

0:11:16.036 --> 0:11:18.916
<v Speaker 1>how much of your time they waste. I did a

0:11:18.956 --> 0:11:21.356
<v Speaker 1>little back of the envelope calculation at the time, about

0:11:21.356 --> 0:11:23.836
<v Speaker 1>two hundred million times a day somebody typed one of

0:11:23.836 --> 0:11:28.316
<v Speaker 1>these CAPTCHAs. Two hundred million times ten seconds, which

0:11:28.396 --> 0:11:30.076
<v Speaker 1>is how long it takes to type one of these.

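NOTE
Checking the back-of-the-envelope arithmetic in this passage, using only the two figures quoted (two hundred million CAPTCHAs a day, about ten seconds each). The variable names are illustrative.
```python
# Figures quoted in the episode.
CAPTCHAS_PER_DAY = 200_000_000
SECONDS_EACH = 10
# Total seconds spent per day, converted to hours (3600 seconds per hour).
hours_per_day = CAPTCHAS_PER_DAY * SECONDS_EACH / 3600
print(round(hours_per_day))  # 555556
```
That is a little over the "about five hundred thousand hours" a day quoted in the episode.
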
0:11:30.556 --> 0:11:32.876
<v Speaker 1>Humanity as a whole was wasting about five hundred thousand

0:11:32.916 --> 0:11:36.116
<v Speaker 1>hours every day typing these annoying CAPTCHAs. Great. So I

0:11:36.116 --> 0:11:39.436
<v Speaker 1>started feeling bad about that, and that's when I started thinking, Okay,

0:11:39.436 --> 0:11:41.636
<v Speaker 1>can we do something good with that time? See, the

0:11:41.676 --> 0:11:44.636
<v Speaker 1>thing is kind of similar to the gym idea. Can

0:11:44.676 --> 0:11:48.116
<v Speaker 1>we get millions of people to do something during that

0:11:48.156 --> 0:11:51.316
<v Speaker 1>time that is actually valuable. I'll give you a hint.

0:11:51.716 --> 0:11:54.956
<v Speaker 1>We're only at the halfway point in this story. After

0:11:54.996 --> 0:11:56.876
<v Speaker 1>the break, we'll tell you what he came up with

0:11:57.076 --> 0:12:00.676
<v Speaker 1>to make those half a million hours every day useful

0:12:00.716 --> 0:12:11.796
<v Speaker 1>to humanity. And one more plug here. I'm the author

0:12:11.836 --> 0:12:14.316
<v Speaker 1>of a book called How to Prepare for Climate Change.

0:12:14.596 --> 0:12:17.716
<v Speaker 1>It's a six hundred page paperback that's designed to be

0:12:17.756 --> 0:12:20.556
<v Speaker 1>a field guide to the new climate. It tells you

0:12:20.596 --> 0:12:23.636
<v Speaker 1>where to live, where to invest, what to grow, how

0:12:23.636 --> 0:12:26.556
<v Speaker 1>to reinforce your home, how to insure, how to talk

0:12:26.596 --> 0:12:30.716
<v Speaker 1>to your kids, and how to ride out wildfires, hurricanes, heatwaves,

0:12:30.716 --> 0:12:33.356
<v Speaker 1>and so on. If you live in a state whose

0:12:33.436 --> 0:12:36.996
<v Speaker 1>name contains a vowel, then you've been affected by climate

0:12:37.076 --> 0:12:39.756
<v Speaker 1>change already, and you should check out this book to

0:12:39.796 --> 0:12:42.996
<v Speaker 1>protect your health, your family, your home, and your finances.

0:12:43.316 --> 0:12:46.556
<v Speaker 1>It's How to Prepare for Climate Change. The book that's

0:12:46.636 --> 0:12:54.036
<v Speaker 1>exactly what it sounds like. Welcome back. By two thousand

0:12:54.076 --> 0:12:57.916
<v Speaker 1>and five, Luis von Ahn's invention, the CAPTCHA test, was

0:12:57.956 --> 0:13:02.436
<v Speaker 1>a huge hit. It reduced the world's scumbag spammers to

0:13:02.556 --> 0:13:06.716
<v Speaker 1>blubbering losers. No longer could they bombard websites with phony

0:13:06.796 --> 0:13:13.196
<v Speaker 1>sign ups for the purpose of pursuing their pathetic spammy schemes. Unfortunately,

0:13:13.436 --> 0:13:18.316
<v Speaker 1>he had achieved this success by transferring the burden onto us,

0:13:18.356 --> 0:13:21.156
<v Speaker 1>treating us as though we were guilty until proven innocent.

0:13:21.876 --> 0:13:25.236
<v Speaker 1>Now we were the ones being challenged. We were losing

0:13:25.436 --> 0:13:30.556
<v Speaker 1>ten seconds per website typing in those stupid distorted letters. Now.

0:13:30.716 --> 0:13:33.636
<v Speaker 1>To be fair, history is full of examples like that,

0:13:34.036 --> 0:13:37.636
<v Speaker 1>where the actions of a few selfish, greedy idiots wind

0:13:37.676 --> 0:13:41.196
<v Speaker 1>up inconveniencing billions of innocent people for the rest of

0:13:41.196 --> 0:13:44.796
<v Speaker 1>our lives. You know, some dirtbag tries to put poison

0:13:44.876 --> 0:13:47.476
<v Speaker 1>into drugstore Tylenol bottles, and now the

0:13:47.476 --> 0:13:51.076
<v Speaker 1>rest of us are stuck with frustrating, plastic, wasteful bottle

0:13:51.116 --> 0:13:54.316
<v Speaker 1>lids forever. Some delinquent tries to blow up a plane

0:13:54.316 --> 0:13:56.436
<v Speaker 1>with a shoe bomb, and now we all have to

0:13:56.476 --> 0:14:00.076
<v Speaker 1>walk through the TSA scanners in our socks. Luis felt

0:14:00.116 --> 0:14:04.116
<v Speaker 1>bad that his hacker blockade wasted everybody's time, but at

0:14:04.196 --> 0:14:06.836
<v Speaker 1>least he could do something about it. So that's a

0:14:06.916 --> 0:14:09.076
<v Speaker 1>very valuable time, So can we use it for something?

0:14:09.236 --> 0:14:11.396
<v Speaker 1>And then I ended up coming up with this idea

0:14:11.596 --> 0:14:14.356
<v Speaker 1>that while you were typing a CAPTCHA, you could be

0:14:14.396 --> 0:14:18.916
<v Speaker 1>helping digitize books. And here's kind of how that works.

0:14:19.036 --> 0:14:20.756
<v Speaker 1>So at the time, this is the year maybe two

0:14:20.756 --> 0:14:22.316
<v Speaker 1>thousand and five, two thousand and six, there were a

0:14:22.356 --> 0:14:24.116
<v Speaker 1>lot of projects trying to digitize all of the world's

0:14:24.156 --> 0:14:26.196
<v Speaker 1>books, where, you know, the way that worked is

0:14:26.356 --> 0:14:28.156
<v Speaker 1>you start with a physical book and you want to

0:14:28.156 --> 0:14:30.156
<v Speaker 1>put it on the internet. And the way you do

0:14:30.196 --> 0:14:33.476
<v Speaker 1>that is you basically take a digital photograph of every

0:14:33.476 --> 0:14:36.396
<v Speaker 1>page of the book. Now these are pictures of text.

0:14:37.076 --> 0:14:38.836
<v Speaker 1>The next step in the process is that the computer

0:14:38.916 --> 0:14:41.716
<v Speaker 1>needs to decipher what's the text in there. In other words,

0:14:42.036 --> 0:14:45.236
<v Speaker 1>computers had to perform, come on, you know this term,

0:14:45.596 --> 0:14:52.836
<v Speaker 1>OCR, optical character recognition. And unfortunately, for books that are

0:14:52.836 --> 0:14:55.916
<v Speaker 1>older where maybe the ink has faded, computers could not

0:14:55.916 --> 0:14:59.436
<v Speaker 1>recognize many of the words. So the thought, the idea was,

0:14:59.516 --> 0:15:01.476
<v Speaker 1>let's take all those words that the computers could not

0:15:01.516 --> 0:15:04.356
<v Speaker 1>recognize while books are being digitized, and let's get people

0:15:04.396 --> 0:15:06.516
<v Speaker 1>to read them for us while they're typing a CAPTCHA.

0:15:06.596 --> 0:15:08.796
<v Speaker 1>So what we started giving people were these words that

0:15:08.796 --> 0:15:11.996
<v Speaker 1>the computer was not able to digitize, or

0:15:12.116 --> 0:15:15.876
<v Speaker 1>to recognize. So yeah, all this time you thought you

0:15:15.916 --> 0:15:20.156
<v Speaker 1>were typing random words. In fact, you were helping companies

0:15:20.356 --> 0:15:24.436
<v Speaker 1>digitize old books and articles and, by the way, helping

0:15:24.516 --> 0:15:27.876
<v Speaker 1>Luis's little company make money. The idea is we made a CAPTCHA,

0:15:28.076 --> 0:15:30.596
<v Speaker 1>a system, a whole system that would help your website

0:15:30.716 --> 0:15:33.316
<v Speaker 1>be protected against spam, and we gave that away for free.

0:15:33.356 --> 0:15:36.716
<v Speaker 1>And for example, Facebook used our CAPTCHA and we gave

0:15:36.716 --> 0:15:39.236
<v Speaker 1>it away for free, etc. But always with a caveat

0:15:39.276 --> 0:15:42.316
<v Speaker 1>that if they are going to do that, then we

0:15:42.356 --> 0:15:44.956
<v Speaker 1>can see the answers that users are typing, so that

0:15:44.956 --> 0:15:47.996
<v Speaker 1>we helped digitize something. And the way we made money

0:15:48.116 --> 0:15:52.356
<v Speaker 1>is by charging people who needed digitization stuff. For example,

0:15:52.476 --> 0:15:54.996
<v Speaker 1>the New York Times was our client. The New York

0:15:54.996 --> 0:15:57.636
<v Speaker 1>Times had this old archive of all the editions of

0:15:57.676 --> 0:15:59.156
<v Speaker 1>the New York Times from you know, one hundred and

0:15:59.236 --> 0:16:00.756
<v Speaker 1>thirty years of the New York Times or something like that,

0:16:00.796 --> 0:16:05.116
<v Speaker 1>from the eighteen hundreds, and they needed this to help

0:16:05.156 --> 0:16:07.316
<v Speaker 1>digitize their whole archive. They were sending us all the

0:16:07.396 --> 0:16:10.356
<v Speaker 1>scans they had scanned already and we were sending them.

0:16:10.476 --> 0:16:12.636
<v Speaker 1>We were taking all the words that the computer could not recognize,

0:16:12.716 --> 0:16:15.156
<v Speaker 1>and we were getting, through the CAPTCHAs, people who were,

0:16:15.156 --> 0:16:18.036
<v Speaker 1>for example, signing up for Facebook or Twitter or a

0:16:18.036 --> 0:16:19.676
<v Speaker 1>lot of websites that were using our CAPTCHA. They were

0:16:19.676 --> 0:16:21.516
<v Speaker 1>helping us digitize the New York Times, and we would

0:16:21.516 --> 0:16:23.996
<v Speaker 1>make money from The New York Times. It became very successful,

0:16:24.276 --> 0:16:26.956
<v Speaker 1>and then Google bought it to help their book digitization

0:16:27.116 --> 0:16:31.276
<v Speaker 1>whole project. The new system, called reCAPTCHA, became an even

0:16:31.356 --> 0:16:34.396
<v Speaker 1>bigger hit. Here's how he described the aftermath in his

0:16:34.516 --> 0:16:37.276
<v Speaker 1>TEDx talk. So every time you buy tickets on Ticketmaster,

0:16:37.396 --> 0:16:39.796
<v Speaker 1>you help to digitize a book. Facebook, every time you

0:16:39.796 --> 0:16:41.876
<v Speaker 1>add a friend, you help to digitize a book. Twitter,

0:16:42.236 --> 0:16:44.396
<v Speaker 1>and about three hundred and fifty thousand other sites are

0:16:44.436 --> 0:16:46.516
<v Speaker 1>all using reCAPTCHA. And in fact, the number of sites

0:16:46.556 --> 0:16:48.436
<v Speaker 1>that are using reCAPTCHA is so high that the number of

0:16:48.516 --> 0:16:51.196
<v Speaker 1>words that we're digitizing per day is really really large.

0:16:51.276 --> 0:16:53.116
<v Speaker 1>It's about one hundred million a day, which is the

0:16:53.156 --> 0:16:56.396
<v Speaker 1>equivalent of about two and a half million books a year.

0:16:56.796 --> 0:16:58.476
<v Speaker 1>And this is all being done one word at a

0:16:58.516 --> 0:17:05.076
<v Speaker 1>time by just people typing CAPTCHAs on the Internet. There

0:17:05.116 --> 0:17:07.276
<v Speaker 1>are some people who are a little nervous about Google

0:17:07.956 --> 0:17:11.916
<v Speaker 1>being the owner of one of the most widely used

0:17:12.556 --> 0:17:15.996
<v Speaker 1>CAPTCHA systems. I'm sure you've been asked about that. Yeah,

0:17:15.996 --> 0:17:17.436
<v Speaker 1>there are people who are nervous about that. I mean,

0:17:17.476 --> 0:17:20.596
<v Speaker 1>I understand, I think you know this is these are

0:17:20.676 --> 0:17:25.316
<v Speaker 1>very very tricky questions. I mean, personally, I think the

0:17:25.396 --> 0:17:28.916
<v Speaker 1>privacy fight is over. I mean, I've given

0:17:28.996 --> 0:17:31.676
<v Speaker 1>up on my privacy against large companies a while ago. Wow.

0:17:31.996 --> 0:17:35.236
<v Speaker 1>Not only that, I also think after having been inside Google,

0:17:35.276 --> 0:17:37.796
<v Speaker 1>I saw with how much respect they treat user data

0:17:37.836 --> 0:17:40.956
<v Speaker 1>because they know that they are, you know, a few

0:17:40.996 --> 0:17:45.036
<v Speaker 1>scandals away from being in deep trouble, so they take

0:17:45.076 --> 0:17:47.716
<v Speaker 1>it with a lot of care, I think. And we

0:17:47.716 --> 0:17:50.796
<v Speaker 1>should point out that Google has said we do not

0:17:50.956 --> 0:17:54.876
<v Speaker 1>use data collected for advertising purposes. Yeah, that's the case,

0:17:54.916 --> 0:17:57.716
<v Speaker 1>and so, and I actually believe them. Now, remember Luis

0:17:57.756 --> 0:18:00.116
<v Speaker 1>said that the hard part was finding a test that

0:18:00.276 --> 0:18:03.276
<v Speaker 1>was too hard for a computer to pass, but easy

0:18:03.356 --> 0:18:06.996
<v Speaker 1>enough for a computer to judge whether the test had

0:18:07.036 --> 0:18:11.956
<v Speaker 1>been passed. That's been bugging me. If the computer chooses

0:18:12.156 --> 0:18:15.156
<v Speaker 1>a word that's so distorted that it itself cannot do

0:18:15.196 --> 0:18:19.196
<v Speaker 1>the OCR, then how does it know if we're right? Yeah,

0:18:19.196 --> 0:18:21.676
<v Speaker 1>that's a great question. When we try to digitize books,

0:18:22.076 --> 0:18:24.476
<v Speaker 1>Here's here's what we do. We take a word that

0:18:24.476 --> 0:18:28.156
<v Speaker 1>the computer does not know. We actually pair it with

0:18:28.196 --> 0:18:30.236
<v Speaker 1>another word for which the computer does know the answer,

0:18:30.436 --> 0:18:32.596
<v Speaker 1>and we actually give people both words, and we say

0:18:33.196 --> 0:18:35.116
<v Speaker 1>please type both, and we don't tell them which one's which.

0:18:35.116 --> 0:18:37.116
<v Speaker 1>We just say, hey, please type both. If they type

0:18:37.116 --> 0:18:39.396
<v Speaker 1>the word for which we know the answer, if they

0:18:39.396 --> 0:18:42.516
<v Speaker 1>type that one correctly, we assume that they're human, and

0:18:42.556 --> 0:18:44.636
<v Speaker 1>we also get some confidence that they type the other

0:18:44.676 --> 0:18:47.476
<v Speaker 1>word correctly, and then what we do is okay, so

0:18:47.556 --> 0:18:49.316
<v Speaker 1>now we have a guess for what that other word is.

0:18:49.436 --> 0:18:51.956
<v Speaker 1>We give it to like ten other different people and

0:18:51.996 --> 0:18:53.916
<v Speaker 1>we see if they type the same thing, and if

0:18:53.916 --> 0:18:55.676
<v Speaker 1>they all type the same thing, we get with very

0:18:55.716 --> 0:18:59.756
<v Speaker 1>high accuracy what that word really is, and that works.

0:19:00.396 --> 0:19:03.516
<v Speaker 1>One hallmark of the reCAPTCHA system, in other words, is

0:19:03.556 --> 0:19:06.996
<v Speaker 1>that you have to type in two words. There's sometimes

0:19:07.156 --> 0:19:11.036
<v Speaker 1>also funny words, funny combinations that happen, especially

0:19:11.036 --> 0:19:14.236
<v Speaker 1>because we are showing two words at a time. Oh boy,

0:19:14.676 --> 0:19:16.636
<v Speaker 1>I mean, you know, there's been all kinds of really

0:19:16.676 --> 0:19:19.636
<v Speaker 1>funny examples where it's just like, you know, a website

0:19:19.636 --> 0:19:23.316
<v Speaker 1>of a church that says like bad Christians and it's

0:19:23.356 --> 0:19:26.396
<v Speaker 1>just but these are just two randomly chosen words, so

0:19:26.436 --> 0:19:30.276
<v Speaker 1>we shouldn't infer any evil on your part. No, they're random. Now,

0:19:30.276 --> 0:19:33.796
<v Speaker 1>a lot has happened since two thousand when CAPTCHA came along,

0:19:34.196 --> 0:19:37.796
<v Speaker 1>and since two thousand and six when you started unsuspectingly

0:19:37.836 --> 0:19:41.276
<v Speaker 1>helping Google and the New York Times digitize their old pages.

0:19:41.756 --> 0:19:43.596
<v Speaker 1>You know, early on in the first version of a

0:19:43.676 --> 0:19:47.556
<v Speaker 1>CAPTCHA, computers were pretty bad at recognizing distorted text,

0:19:47.756 --> 0:19:50.116
<v Speaker 1>so they didn't have to be that distorted. But you know,

0:19:50.236 --> 0:19:52.716
<v Speaker 1>over time, computers got better and better, and in fact,

0:19:52.716 --> 0:19:55.916
<v Speaker 1>by now computers are in many cases about as good

0:19:55.916 --> 0:19:58.196
<v Speaker 1>as humans. Because of that, we have to make them

0:19:58.196 --> 0:20:01.236
<v Speaker 1>harder and harder. A lot of times, the puzzles are

0:20:01.316 --> 0:20:05.156
<v Speaker 1>so hard that even the human can't pass the challenge.

0:20:05.236 --> 0:20:08.356
<v Speaker 1>I'm sure you've been sent screenshots of words that are

0:20:08.436 --> 0:20:11.756
<v Speaker 1>so smushed no one can tell what it is. Yes,

0:20:12.396 --> 0:20:17.756
<v Speaker 1>that happens, I mean, it's rare that that happens, and

0:20:17.876 --> 0:20:21.996
<v Speaker 1>that's why the CAPTCHA itself, in true arms-race fashion,

0:20:22.476 --> 0:20:26.436
<v Speaker 1>has evolved. So what has happened is that for the

0:20:26.516 --> 0:20:30.036
<v Speaker 1>more secure things, the CAPTCHAs have moved away from these

0:20:30.076 --> 0:20:33.036
<v Speaker 1>distorted characters. And what is being used now are these

0:20:33.396 --> 0:20:35.676
<v Speaker 1>the puzzles are now things like you see a bunch

0:20:35.676 --> 0:20:37.196
<v Speaker 1>of pictures and you have to click the ones that

0:20:37.316 --> 0:20:41.956
<v Speaker 1>contain a stop sign. Right, the traffic lights, the fire hydrants. Yeah,

0:20:42.036 --> 0:20:45.116
<v Speaker 1>it's exactly the same idea as reCAPTCHA, except we're not

0:20:45.156 --> 0:20:47.436
<v Speaker 1>the story. We're not trying to digitize books. This a

0:20:47.436 --> 0:20:49.996
<v Speaker 1>lot of times comes from things like all the all

0:20:49.996 --> 0:20:53.996
<v Speaker 1>the mapping cars or the self driving cars. Basically, these

0:20:54.076 --> 0:20:56.956
<v Speaker 1>are cars that are driving around that are capturing images

0:20:56.956 --> 0:20:58.876
<v Speaker 1>of the whole world. They're trying to figure out what's

0:20:58.916 --> 0:21:01.796
<v Speaker 1>around them. Sometimes they cannot recognize what's in an image.

0:21:01.876 --> 0:21:04.356
<v Speaker 1>So it's a similar case. It takes the things like

0:21:04.476 --> 0:21:06.316
<v Speaker 1>is this a stop sign? I'm not sure. Okay, send it

0:21:06.316 --> 0:21:08.316
<v Speaker 1>to a human, and then when you get it and

0:21:08.356 --> 0:21:10.916
<v Speaker 1>you click on the stop sign, you're actually helping either

0:21:10.956 --> 0:21:14.036
<v Speaker 1>the self driving car or the mapping software or whatever

0:21:14.196 --> 0:21:16.356
<v Speaker 1>know that there is actually a stop sign right here.

0:21:16.596 --> 0:21:18.876
<v Speaker 1>Oh so we're still doing good for the world as

0:21:18.876 --> 0:21:21.156
<v Speaker 1>we do this, still doing good for the world, or

0:21:21.316 --> 0:21:23.956
<v Speaker 1>for a company or for a company, but maybe not

0:21:23.956 --> 0:21:26.516
<v Speaker 1>digitizing books. But it's a similar idea: something that a

0:21:26.556 --> 0:21:30.036
<v Speaker 1>computer cannot do. You've just solved a mystery for hundreds

0:21:30.036 --> 0:21:33.716
<v Speaker 1>of millions of people. Why it's always traffic lights and

0:21:33.756 --> 0:21:37.436
<v Speaker 1>fire hydrants we're supposed to choose and not bananas and puppies,

0:21:37.596 --> 0:21:40.316
<v Speaker 1>or it has to do with both self driving cars

0:21:40.316 --> 0:21:43.556
<v Speaker 1>and also mapping software. Okay, so now we kind of

0:21:43.596 --> 0:21:45.636
<v Speaker 1>get why we have to put up with these challenges,

0:21:46.276 --> 0:21:50.036
<v Speaker 1>or we did twenty years ago, but really nothing better

0:21:50.076 --> 0:21:53.916
<v Speaker 1>has come along since. Are we sure that there's nothing

0:21:54.116 --> 0:21:58.316
<v Speaker 1>less annoying that we could do to thwart these spammers? Yes,

0:21:58.396 --> 0:22:00.676
<v Speaker 1>there is. By now, it did become a lot less annoying.

0:22:00.716 --> 0:22:02.996
<v Speaker 1>I don't know if you've seen that of late, where

0:22:03.116 --> 0:22:05.236
<v Speaker 1>you know, there's a thing that says reCAPTCHA. We're just

0:22:05.236 --> 0:22:07.956
<v Speaker 1>trying to figure out whether you're a human, and they

0:22:07.996 --> 0:22:10.796
<v Speaker 1>just ask you to click somewhere, just click on this box.

0:22:11.516 --> 0:22:13.996
<v Speaker 1>That is much less annoying. So sometimes you don't see

0:22:13.996 --> 0:22:17.116
<v Speaker 1>anything except "I'm not a robot." Yeah, yeah, yeah,

0:22:17.236 --> 0:22:19.196
<v Speaker 1>I'm not a robot. This is something that is that

0:22:19.316 --> 0:22:22.356
<v Speaker 1>is done by Google. This actually comes from you know,

0:22:22.396 --> 0:22:24.436
<v Speaker 1>the original team, that is the company that they bought

0:22:24.516 --> 0:22:27.116
<v Speaker 1>from me. When you get that one, that means that,

0:22:27.756 --> 0:22:31.716
<v Speaker 1>in this particular case, probably means Google has figured out that, yeah,

0:22:31.716 --> 0:22:33.756
<v Speaker 1>you know what, we know you because you've been around

0:22:33.756 --> 0:22:37.316
<v Speaker 1>since twenty sixteen on this computer, and yeah, you have

0:22:37.356 --> 0:22:40.596
<v Speaker 1>a lot of Gmail emails, and you've done a lot

0:22:40.636 --> 0:22:43.516
<v Speaker 1>of Google search queries. You're a normal person, You're not

0:22:43.556 --> 0:22:46.236
<v Speaker 1>a spammer. So they just do a little thing that

0:22:46.316 --> 0:22:48.236
<v Speaker 1>just tries to double-check that, you know, you can

0:22:48.276 --> 0:22:51.196
<v Speaker 1>move the mouse or whatever. So one thing that has

0:22:51.276 --> 0:22:53.276
<v Speaker 1>changed from the year two thousand and five to now

0:22:53.516 --> 0:22:56.036
<v Speaker 1>is that there are companies like Google or like Facebook

0:22:56.396 --> 0:22:59.836
<v Speaker 1>that for the majority of people on the Internet, they

0:22:59.876 --> 0:23:01.796
<v Speaker 1>kind of know who you are. If you have a

0:23:01.796 --> 0:23:05.316
<v Speaker 1>fresh computer that you've never used before, then you would

0:23:05.316 --> 0:23:08.716
<v Speaker 1>have to do the annoying CAPTCHA. But for most of us,

0:23:09.156 --> 0:23:11.356
<v Speaker 1>you're unlikely to have to type these as much as

0:23:11.396 --> 0:23:13.516
<v Speaker 1>you were back in, say, the year two thousand

0:23:13.516 --> 0:23:15.836
<v Speaker 1>and five. It has become a lot better, you know,

0:23:16.116 --> 0:23:18.716
<v Speaker 1>probably a little bit at the cost of your privacy. Okay,

0:23:18.716 --> 0:23:21.876
<v Speaker 1>but wait a minute, we now know that computers eventually

0:23:21.876 --> 0:23:25.556
<v Speaker 1>got too smart for the distorted-text-reading Turing tests.

0:23:26.036 --> 0:23:28.676
<v Speaker 1>Won't they eventually get good enough to identify a few

0:23:28.956 --> 0:23:31.956
<v Speaker 1>stupid stop signs in a photo grid? It is, it's

0:23:31.956 --> 0:23:34.596
<v Speaker 1>a cat-and-mouse game. Now, probably there's a bunch

0:23:34.636 --> 0:23:37.796
<v Speaker 1>of people working on making better recognition of stop signs

0:23:37.876 --> 0:23:40.836
<v Speaker 1>or something like that eventually. But eventually computers are going

0:23:40.876 --> 0:23:42.836
<v Speaker 1>to be able to do everything humans can, and so

0:23:43.116 --> 0:23:45.076
<v Speaker 1>at some point there won't be a test that can

0:23:45.076 --> 0:23:47.556
<v Speaker 1>distinguish humans from computers. Well, wait a minute, does

0:23:47.596 --> 0:23:50.836
<v Speaker 1>that mean the end of the internet? I mean, what

0:23:50.956 --> 0:23:53.836
<v Speaker 1>happens if that, if there's no sort of Turing test

0:23:53.876 --> 0:23:56.196
<v Speaker 1>that works anymore. I don't think it's the end of

0:23:56.196 --> 0:24:00.316
<v Speaker 1>the internet, particularly because, like I said, more and more

0:24:00.636 --> 0:24:02.476
<v Speaker 1>these companies are going to know more and more about you,

0:24:02.956 --> 0:24:06.116
<v Speaker 1>and I just don't think there will be a huge problem. Okay, well,

0:24:06.236 --> 0:24:10.156
<v Speaker 1>whatever the end game is, why can't we do today,

0:24:10.276 --> 0:24:12.636
<v Speaker 1>Since we know it's an arms race, Since we know

0:24:12.756 --> 0:24:17.356
<v Speaker 1>that eventually we'll lose it to AI and computers, why

0:24:17.356 --> 0:24:22.396
<v Speaker 1>can't we jump to whatever will follow it now? I'll

0:24:22.396 --> 0:24:23.996
<v Speaker 1>tell you why. This, by the way, is like ninety

0:24:23.996 --> 0:24:25.956
<v Speaker 1>five percent of the way there. I mean, really, for

0:24:26.076 --> 0:24:28.436
<v Speaker 1>most of us, you know, Facebook knows who we are,

0:24:28.476 --> 0:24:30.876
<v Speaker 1>and Google knows who we are, So it's ninety percent

0:24:30.916 --> 0:24:32.556
<v Speaker 1>of the way there. The reason it's not one hundred

0:24:32.556 --> 0:24:33.956
<v Speaker 1>percent of the way there is because there are some

0:24:33.956 --> 0:24:37.756
<v Speaker 1>people who really care about privacy, and you know, there's

0:24:37.796 --> 0:24:39.556
<v Speaker 1>there's always going to be a kind of a way

0:24:39.596 --> 0:24:42.756
<v Speaker 1>to browse privately. So, for example, Chrome has

0:24:42.756 --> 0:24:45.596
<v Speaker 1>private browsing. So it's all the stuff when people care

0:24:45.596 --> 0:24:48.796
<v Speaker 1>about privacy, I mean there's there's a trade off here, right. Well,

0:24:48.836 --> 0:24:52.036
<v Speaker 1>the irony is it seems like most of the websites

0:24:52.076 --> 0:24:54.076
<v Speaker 1>that present me with a CAPTCHA I'm trying to get

0:24:54.116 --> 0:24:57.876
<v Speaker 1>to in order to supply my name and address, like

0:24:57.876 --> 0:25:00.356
<v Speaker 1>like I'm signing up for something. Yes, it's funny. Why

0:25:00.356 --> 0:25:02.756
<v Speaker 1>do I need privacy when the whole purpose is to

0:25:02.796 --> 0:25:07.956
<v Speaker 1>supply my information? Yeah, that's funny. Now. I mentioned at

0:25:07.996 --> 0:25:12.356
<v Speaker 1>the beginning that Luis has had three world-changing ideas.

0:25:12.876 --> 0:25:15.716
<v Speaker 1>You've heard about the gym membership that powers the grid,

0:25:16.076 --> 0:25:18.916
<v Speaker 1>and you now know about CAPTCHA. But what about his

0:25:19.036 --> 0:25:24.636
<v Speaker 1>third creation? It's Duolingo, the language training app. At

0:25:24.636 --> 0:25:29.356
<v Speaker 1>this moment, it has half a billion registered users learning

0:25:29.396 --> 0:25:38.836
<v Speaker 1>forty different languages, all for free. And from the very

0:25:38.876 --> 0:25:42.716
<v Speaker 1>beginning you could see the fingerprints of Luis von Ahn, master

0:25:42.836 --> 0:25:46.956
<v Speaker 1>of crowdsourcing, all over it. In early Duolingo, as

0:25:46.996 --> 0:25:49.196
<v Speaker 1>you were learning a language on Duolingo, you were actually

0:25:49.236 --> 0:25:52.276
<v Speaker 1>helping us to translate stuff that computers could not translate.

0:25:52.556 --> 0:25:55.116
<v Speaker 1>In fact, CNN was a client, so CNN would send

0:25:55.196 --> 0:25:58.156
<v Speaker 1>us their news in English. We would then give it

0:25:58.196 --> 0:26:00.196
<v Speaker 1>to people who were Spanish speakers who were learning English,

0:26:00.196 --> 0:26:01.716
<v Speaker 1>and we would say, hey, you want to practice your English,

0:26:01.756 --> 0:26:05.596
<v Speaker 1>help us translate this CNN article into your native language

0:26:05.596 --> 0:26:08.116
<v Speaker 1>of Spanish. And so they would do it, and they

0:26:08.156 --> 0:26:11.836
<v Speaker 1>would be learning English and then we would get that translation,

0:26:11.956 --> 0:26:13.876
<v Speaker 1>and then we would send it back to CNN and

0:26:13.916 --> 0:26:16.156
<v Speaker 1>they would pay us for the translation. That was the

0:26:16.356 --> 0:26:19.796
<v Speaker 1>very first version of Duolingo. It turned out that,

0:26:20.036 --> 0:26:21.956
<v Speaker 1>just like the gym, it ends up being that you

0:26:22.796 --> 0:26:25.796
<v Speaker 1>just can't make much money from this, and so we decided, okay,

0:26:25.916 --> 0:26:28.076
<v Speaker 1>just go to a business model where we actually

0:26:28.076 --> 0:26:30.476
<v Speaker 1>give you ads, and the way we make money is by,

0:26:30.636 --> 0:26:33.756
<v Speaker 1>you know, showing you ads. The dude just keeps doing that.

0:26:34.236 --> 0:26:36.916
<v Speaker 1>He keeps coming up with ideas that make the world

0:26:36.916 --> 0:26:40.396
<v Speaker 1>a better place, thwart the bad guys, and make a

0:26:40.396 --> 0:26:43.276
<v Speaker 1>lot of money. It's really a shame he gave up

0:26:43.316 --> 0:26:47.076
<v Speaker 1>that electrical grid gym thing. Are there ever things that

0:26:47.276 --> 0:26:49.436
<v Speaker 1>come to you in the shower that might be your

0:26:49.796 --> 0:26:53.716
<v Speaker 1>big third act? I mean, honestly, to have the impact

0:26:53.716 --> 0:26:57.156
<v Speaker 1>you've had twice is astonishing, But it makes me think

0:26:57.196 --> 0:27:01.156
<v Speaker 1>there's something in you that just has great ideas that

0:27:01.276 --> 0:27:05.556
<v Speaker 1>can go really wide. You know, as time passes, I

0:27:05.596 --> 0:27:08.716
<v Speaker 1>am a lot more interested in literacy and teaching people

0:27:08.756 --> 0:27:11.436
<v Speaker 1>how to read. I think with a computer, we should

0:27:11.476 --> 0:27:12.916
<v Speaker 1>be able to teach the whole world how to read

0:27:12.996 --> 0:27:15.036
<v Speaker 1>significantly better than humans can teach you how to read.

0:27:15.356 --> 0:27:17.636
<v Speaker 1>You know, the US, the US is fine; most adults

0:27:17.636 --> 0:27:20.316
<v Speaker 1>in the US know how to read, but in many countries in the

0:27:20.356 --> 0:27:22.836
<v Speaker 1>world there's a significant fraction of people who don't know

0:27:22.876 --> 0:27:24.716
<v Speaker 1>how to read. In fact, there's about a billion adults

0:27:24.716 --> 0:27:27.676
<v Speaker 1>in the world that are illiterate. And I think we

0:27:27.716 --> 0:27:29.436
<v Speaker 1>can I think we can make a big dent, you know,

0:27:29.476 --> 0:27:31.476
<v Speaker 1>with a system to teach people how to read. So

0:27:31.476 --> 0:27:34.596
<v Speaker 1>we're working on that in the meantime. Now you know

0:27:34.636 --> 0:27:38.396
<v Speaker 1>why you have to encounter those infernal website challenges. You

0:27:38.436 --> 0:27:41.316
<v Speaker 1>know how they came about, and you now consider them

0:27:41.596 --> 0:27:45.836
<v Speaker 1>a necessary evil. Well, maybe you do. Just for people who

0:27:45.836 --> 0:27:47.476
<v Speaker 1>are like, I don't know what it is. I just

0:27:47.476 --> 0:27:49.516
<v Speaker 1>don't like doing it. I can't even tell what's a

0:27:49.516 --> 0:27:52.996
<v Speaker 1>freaking traffic light, let's just lay out what would happen

0:27:53.156 --> 0:27:56.516
<v Speaker 1>if all these challenges went away tomorrow. What would happen

0:27:56.556 --> 0:28:00.556
<v Speaker 1>to the Internet. Most likely, you would get a lot

0:28:00.796 --> 0:28:06.796
<v Speaker 1>more spam in either your email spam or you'd get

0:28:06.836 --> 0:28:09.756
<v Speaker 1>more kind of random Facebook followers that are

0:28:09.796 --> 0:28:13.156
<v Speaker 1>not you know, real people. These fake accounts can start

0:28:13.276 --> 0:28:17.836
<v Speaker 1>boosting up political messages. There would be probably more

0:28:17.836 --> 0:28:21.236
<v Speaker 1>fake news. There would probably be, you know, more spam. Right,

0:28:21.396 --> 0:28:28.116
<v Speaker 1>and from spam, phishing and spyware. And yeah, more spyware. Yeah.

0:28:28.356 --> 0:28:30.636
<v Speaker 1>The web would be a less safe place. All right.

0:28:30.676 --> 0:28:33.956
<v Speaker 1>So when you do explain this to someone at the

0:28:33.996 --> 0:28:38.916
<v Speaker 1>proverbial party, are they generally satisfied with the notion that? Yeah,

0:28:38.996 --> 0:28:41.636
<v Speaker 1>I think most people. I think most people realize that

0:28:41.676 --> 0:28:43.436
<v Speaker 1>it's like, you know, these things are kind of like

0:28:43.476 --> 0:28:46.636
<v Speaker 1>a, like a key. Nobody likes it. It's not like

0:28:46.676 --> 0:28:48.916
<v Speaker 1>I love opening my door with the key. It's kind

0:28:48.916 --> 0:28:51.836
<v Speaker 1>of annoying, but yeah, that's there, and I understand it

0:28:51.876 --> 0:28:54.316
<v Speaker 1>just makes it makes my house safer. In this case,

0:28:54.356 --> 0:28:56.436
<v Speaker 1>it kind of just makes the whole Internet safer. I

0:28:56.756 --> 0:29:00.316
<v Speaker 1>kind of gotta do it. Thanks for listening to this

0:29:00.436 --> 0:29:03.036
<v Speaker 1>second to last episode of the season. If you have

0:29:03.076 --> 0:29:06.516
<v Speaker 1>any interest in a second season, please spread the word,

0:29:06.836 --> 0:29:10.956
<v Speaker 1>subscribe to this podcast, leave a review on Apple Podcasts,

0:29:11.076 --> 0:29:14.636
<v Speaker 1>or a rating on Spotify. Unsung Science with David Pogue

0:29:14.636 --> 0:29:18.116
<v Speaker 1>is presented by Simon and Schuster and CBS News and

0:29:18.276 --> 0:29:22.316
<v Speaker 1>produced by PRX Productions. The executive producers for Simon and

0:29:22.316 --> 0:29:26.396
<v Speaker 1>Schuster are Richard Rohrer and Chris Lynch. The PRX production

0:29:26.436 --> 0:29:30.996
<v Speaker 1>team is Jocelyn Gonzales, Morgan Flannery, Pedro Rafael Rosado, and

0:29:31.196 --> 0:29:35.636
<v Speaker 1>Ian Fox. Project manager Jesse Nelson composed the Unsung Science

0:29:35.676 --> 0:29:39.476
<v Speaker 1>theme music and Christina Robello fact checked my script. At

0:29:39.556 --> 0:29:43.036
<v Speaker 1>Unsung Science dot com, you can listen to every episode

0:29:43.036 --> 0:29:46.996
<v Speaker 1>we've ever made and read complete transcripts. For more of

0:29:47.036 --> 0:29:50.316
<v Speaker 1>my stuff, visit David Pogue dot com or follow me

0:29:50.356 --> 0:29:58.716
<v Speaker 1>on Twitter at Pogue. Thanks for listening. That

0:29:58.916 --> 0:30:01.876
<v Speaker 1>was an episode of Unsung Science from our friends at

0:30:01.876 --> 0:30:05.236
<v Speaker 1>CBS News. You can find more episodes of Unsung Science

0:30:05.636 --> 0:30:07.596
<v Speaker 1>wherever you get your podcasts.