WEBVTT - Deep Background Presents: Brave New Planet

0:00:18.396 --> 0:00:21.116
<v Speaker 1>Noah Feldman here. I'm excited to tell you about a

0:00:21.156 --> 0:00:25.236
<v Speaker 1>special five part Deep Background mini series called Deep Bench.

0:00:25.956 --> 0:00:28.556
<v Speaker 1>The first episodes will appear in your feed on Saturday,

0:00:28.596 --> 0:00:32.316
<v Speaker 1>October seventeenth. The battle for the Supreme Court has become

0:00:32.356 --> 0:00:35.956
<v Speaker 1>a huge issue in the presidential election. In many ways,

0:00:36.236 --> 0:00:40.676
<v Speaker 1>it's a culmination of a conservative legal revolution spearheaded by

0:00:40.676 --> 0:00:44.956
<v Speaker 1>the Federalist Society. Deep Bench is the inside story of

0:00:45.036 --> 0:00:48.476
<v Speaker 1>how these legal conservatives gained power and how, at the

0:00:48.516 --> 0:00:52.556
<v Speaker 1>height of their influence, they're actually in danger of splitting apart.

0:00:53.436 --> 0:00:57.156
<v Speaker 1>But first, we're presenting an episode from Pushkin Industries' newest show,

0:00:57.556 --> 0:01:01.756
<v Speaker 1>Brave New Planet. Every day we see how powerful technologies

0:01:01.796 --> 0:01:06.156
<v Speaker 1>are advancing at a breathtaking pace. They have amazing potential upsides,

0:01:06.636 --> 0:01:09.356
<v Speaker 1>but if we're not careful, some might leave us a

0:01:09.356 --> 0:01:13.556
<v Speaker 1>lot worse off. In Brave New Planet, Doctor Eric Lander

0:01:13.596 --> 0:01:16.036
<v Speaker 1>and his guests weigh the pros and cons of a

0:01:16.116 --> 0:01:20.916
<v Speaker 1>wide range of powerful innovations in science and technology. Doctor

0:01:20.996 --> 0:01:24.516
<v Speaker 1>Lander directs the Broad Institute of MIT and Harvard. He

0:01:24.636 --> 0:01:27.796
<v Speaker 1>was a leader of the Human Genome Project. For eight years,

0:01:27.796 --> 0:01:31.156
<v Speaker 1>he served as a science advisor to President Obama's White House.

0:01:31.676 --> 0:01:36.596
<v Speaker 1>In this episode, Doctor Lander explores deep fakes. Deep fakes

0:01:36.636 --> 0:01:40.156
<v Speaker 1>can be useful in art, education and therapy, but could

0:01:40.156 --> 0:01:43.596
<v Speaker 1>they be weaponized to provoke international conflicts or swing elections?

0:01:44.036 --> 0:01:46.036
<v Speaker 1>And where does the right to free speech fit in?

0:01:46.996 --> 0:01:50.716
<v Speaker 1>Every episode of Brave New Planet will grapple with opportunities

0:01:50.716 --> 0:01:53.316
<v Speaker 1>and challenges that are too big to fit in a tweet,

0:01:53.436 --> 0:01:57.316
<v Speaker 1>but will shape our future. You can subscribe in Apple Podcasts.

0:01:57.836 --> 0:02:08.796
<v Speaker 1>Here's the brilliant Eric Lander and Brave New Planet. You're listening

0:02:09.396 --> 0:02:13.436
<v Speaker 1>to Brave New Planet, a podcast about amazing new technologies

0:02:13.516 --> 0:02:17.076
<v Speaker 1>that could dramatically improve our world, or, if we don't

0:02:17.076 --> 0:02:19.796
<v Speaker 1>make wise choices, could leave us a lot worse off.

0:02:20.596 --> 0:02:35.236
<v Speaker 1>Utopia or dystopia? It's up to us. On July sixteenth,

0:02:35.396 --> 0:02:40.476
<v Speaker 1>nineteen sixty nine, Apollo eleven blasted off from the Kennedy

0:02:40.516 --> 0:02:45.756
<v Speaker 1>Space Center near Cape Canaveral, Florida. Twenty five million Americans

0:02:45.836 --> 0:02:49.956
<v Speaker 1>watched on television as the spacecraft ascended toward the heavens,

0:02:50.476 --> 0:02:55.996
<v Speaker 1>carrying Commander Neil Armstrong, Lunar Module pilot Buzz Aldrin, and

0:02:56.076 --> 0:03:00.716
<v Speaker 1>Command Module pilot Michael Collins. Their mission: to be the

0:03:00.796 --> 0:03:04.556
<v Speaker 1>first humans in history to set foot on the Moon.

0:03:05.636 --> 0:03:10.236
<v Speaker 1>Four days later, on Sunday, July twentieth, the lunar module

0:03:10.356 --> 0:03:14.716
<v Speaker 1>separated from the command ship and soon fired its rockets

0:03:15.076 --> 0:03:21.636
<v Speaker 1>to begin its lunar descent. Five minutes later, disaster struck

0:03:22.556 --> 0:03:26.236
<v Speaker 1>about a mile above the Moon's surface. Program alarms twelve

0:03:26.276 --> 0:03:29.556
<v Speaker 1>oh one and twelve oh two sounded loudly, indicating that

0:03:29.596 --> 0:03:36.996
<v Speaker 1>the mission computer was overloaded, and then, well, every American

0:03:37.076 --> 0:03:51.396
<v Speaker 1>knows what happened next. Lost date of five. Good evening,

0:03:51.676 --> 0:03:56.196
<v Speaker 1>my fellow Americans. President Richard Nixon addressed a grieving nation.

0:03:57.316 --> 0:04:00.276
<v Speaker 1>Fate has ordained that the men who went to the

0:04:00.276 --> 0:04:04.516
<v Speaker 1>Moon to explore in peace will stay on the Moon

0:04:05.036 --> 0:04:13.556
<v Speaker 1>to rest in peace. These men, Neil Armstrong and Edwin Aldrin,

0:04:14.916 --> 0:04:19.756
<v Speaker 1>know that there is no hope for their recovery, but

0:04:19.876 --> 0:04:23.996
<v Speaker 1>they also know that there is hope for mankind in

0:04:24.076 --> 0:04:29.236
<v Speaker 1>their sacrifice. He ended with the now famous words: For

0:04:29.396 --> 0:04:31.996
<v Speaker 1>every human being who looks up at the Moon in

0:04:32.036 --> 0:04:35.156
<v Speaker 1>the nights to come will know that there is some

0:04:35.276 --> 0:04:44.156
<v Speaker 1>corner of another world that is forever mankind. Wait a minute,

0:04:44.676 --> 0:04:48.636
<v Speaker 1>that never happened. The Moon mission was a historic success.

0:04:49.236 --> 0:04:53.316
<v Speaker 1>The three astronauts returned safely to ticker tape parades and

0:04:53.356 --> 0:04:58.316
<v Speaker 1>a celebratory thirty eight day World Tour. Those alarms actually

0:04:58.436 --> 0:05:02.836
<v Speaker 1>did sound, but they turned out to be harmless. Nixon

0:05:02.916 --> 0:05:06.716
<v Speaker 1>never delivered that speech. His speech writer had written it,

0:05:07.036 --> 0:05:10.476
<v Speaker 1>but it sat in a folder labeled In Event of

0:05:10.716 --> 0:05:16.596
<v Speaker 1>Moon Disaster, until now. The Nixon you just heard is

0:05:16.596 --> 0:05:20.156
<v Speaker 1>a deep fake, part of a seven minute film created

0:05:20.156 --> 0:05:25.276
<v Speaker 1>by artificial intelligence deep learning algorithms. The fake was made

0:05:25.356 --> 0:05:29.476
<v Speaker 1>by the Center for Advanced Virtuality at the Massachusetts Institute

0:05:29.476 --> 0:05:32.716
<v Speaker 1>of Technology as part of an art exhibit to raise

0:05:32.756 --> 0:05:36.636
<v Speaker 1>awareness about the power of synthesized media. Not long ago,

0:05:37.276 --> 0:05:40.316
<v Speaker 1>something like this would have taken a lot of time

0:05:40.316 --> 0:05:43.836
<v Speaker 1>and money, but now it's getting easy. You can make

0:05:43.916 --> 0:05:47.436
<v Speaker 1>new paintings in the style of French Impressionism, revive dead

0:05:47.556 --> 0:05:52.316
<v Speaker 1>movie stars, help patients with neurodegenerative disease, or soon

0:05:52.436 --> 0:05:55.396
<v Speaker 1>maybe take a class on a tour of ancient Rome.

0:05:55.836 --> 0:05:59.596
<v Speaker 1>But as the technology quickly becomes democratized, we're getting to

0:05:59.636 --> 0:06:03.396
<v Speaker 1>the point where almost anyone can create a fake video

0:06:03.436 --> 0:06:06.876
<v Speaker 1>of a friend, an ex lover, a stranger, or a

0:06:06.916 --> 0:06:12.316
<v Speaker 1>public figure that's embarrassing, pornographic, or perhaps capable of causing

0:06:12.436 --> 0:06:17.196
<v Speaker 1>international chaos. Some argue that in a culture where fake

0:06:17.196 --> 0:06:22.196
<v Speaker 1>news spreads like wildfire and political leaders deny the veracity

0:06:22.236 --> 0:06:25.996
<v Speaker 1>of hard facts, deep faked media may do a lot

0:06:26.116 --> 0:06:33.396
<v Speaker 1>more harm than good. Today's big question: will synthesized media

0:06:33.556 --> 0:06:38.276
<v Speaker 1>unleash a new wave of creativity or will it erode

0:06:38.316 --> 0:06:42.476
<v Speaker 1>the already tenuous role of truth in our democracy? And

0:06:43.596 --> 0:06:46.316
<v Speaker 1>is there anything we can do to keep it in check?

0:06:55.196 --> 0:06:57.556
<v Speaker 1>My name is Eric Lander. I'm a scientist who works

0:06:57.596 --> 0:07:00.316
<v Speaker 1>on ways to improve human health. I helped lead the

0:07:00.396 --> 0:07:03.756
<v Speaker 1>Human Genome Project, and today I lead the Broad Institute

0:07:03.756 --> 0:07:08.036
<v Speaker 1>of MIT and Harvard. In the twenty first century, powerful

0:07:08.116 --> 0:07:11.956
<v Speaker 1>technologies have been appearing at a breathtaking pace related to

0:07:11.956 --> 0:07:16.476
<v Speaker 1>the Internet, artificial intelligence, genetic engineering, and more. They have

0:07:16.716 --> 0:07:20.836
<v Speaker 1>amazing potential upsides, but we can't ignore the risks that

0:07:20.956 --> 0:07:24.716
<v Speaker 1>come with them. The decisions aren't just up to scientists

0:07:25.036 --> 0:07:28.756
<v Speaker 1>or politicians. Whether we like it or not, we, all

0:07:28.756 --> 0:07:31.596
<v Speaker 1>of us, are the stewards of a brave new planet.

0:07:32.116 --> 0:07:35.916
<v Speaker 1>This generation's choices will shape the future as never before.

0:07:38.476 --> 0:07:42.436
<v Speaker 1>Coming up on today's episode of Brave New Planet, I

0:07:42.556 --> 0:07:46.916
<v Speaker 1>speak with some of the leaders behind advances in synthesized media.

0:07:47.116 --> 0:07:50.236
<v Speaker 1>You could, certainly, by the way, generate stories that could

0:07:50.636 --> 0:07:54.276
<v Speaker 1>be fresh and interesting and new and personal for every child.

0:07:54.636 --> 0:07:58.476
<v Speaker 1>We got emails from people who were quadriplegic and they

0:07:58.476 --> 0:08:01.156
<v Speaker 1>asked us if we could make them dance. We hear

0:08:01.236 --> 0:08:04.876
<v Speaker 1>from experts about some of the frightening ways that bad

0:08:04.876 --> 0:08:08.756
<v Speaker 1>actors can use deep fakes. Redditors would chime in and say,

0:08:09.196 --> 0:08:11.996
<v Speaker 1>you can absolutely make a deep fake sex video of

0:08:12.036 --> 0:08:14.836
<v Speaker 1>your ex with thirty pictures. I've done it with twenty.

0:08:15.076 --> 0:08:16.876
<v Speaker 1>Here's the things that keep me up at night, right?

0:08:17.556 --> 0:08:20.436
<v Speaker 1>A video of Donald Trump saying I've launched nuclear weapons

0:08:20.436 --> 0:08:23.836
<v Speaker 1>against Iran. And before anybody gets around to figuring out

0:08:23.836 --> 0:08:25.956
<v Speaker 1>whether this is real or not, we have global nuclear

0:08:26.036 --> 0:08:30.476
<v Speaker 1>meltdown. And we explore how we might prevent the worst abuses.

0:08:31.316 --> 0:08:37.076
<v Speaker 1>It's important that younger people advocate for the Internet that

0:08:37.116 --> 0:08:39.716
<v Speaker 1>they want. We have to fight for it. We have

0:08:39.836 --> 0:08:49.676
<v Speaker 1>to ask for different things. Stay with us. Chapter one:

0:08:50.276 --> 0:08:55.196
<v Speaker 1>Abraham Lincoln's Head. To begin to understand the significance of

0:08:55.276 --> 0:08:58.956
<v Speaker 1>deep fake technology, I went to San Francisco to speak

0:08:58.956 --> 0:09:02.476
<v Speaker 1>with a world expert on synthetic media. My name is

0:09:02.716 --> 0:09:07.796
<v Speaker 1>Alexei, or sometimes called Alyosha, Efros, and I'm a professor

0:09:07.836 --> 0:09:11.716
<v Speaker 1>at UC Berkeley in the Computer Science and Electrical Engineering Department.

0:09:12.236 --> 0:09:18.116
<v Speaker 1>My research is on computer vision, computer graphics, machine learning,

0:09:18.836 --> 0:09:24.636
<v Speaker 1>various aspects of artificial intelligence. Where'd you grow up? I

0:09:24.716 --> 0:09:28.596
<v Speaker 1>grew up in Saint Petersburg in Russia. I was one

0:09:28.636 --> 0:09:33.036
<v Speaker 1>of those geeky kids playing around with computers or dreaming

0:09:33.036 --> 0:09:40.196
<v Speaker 1>about computers. My first computer was actually the first Soviet

0:09:40.716 --> 0:09:44.676
<v Speaker 1>personal computer. So you actually are involved in making sort

0:09:44.676 --> 0:09:49.916
<v Speaker 1>of synthetic content, synthetic media. That's right. Alexei has invented

0:09:49.996 --> 0:09:53.876
<v Speaker 1>powerful artificial intelligence tools, but his lab also has a

0:09:53.916 --> 0:09:58.196
<v Speaker 1>wonderful ability to use computers to enhance the human experience.

0:09:58.876 --> 0:10:02.276
<v Speaker 1>I was struck by a remarkable video on YouTube created

0:10:02.316 --> 0:10:06.116
<v Speaker 1>by his team at Berkeley. So this was a project

0:10:06.196 --> 0:10:11.556
<v Speaker 1>that actually was done by my students who didn't even

0:10:11.596 --> 0:10:15.356
<v Speaker 1>think of this as anything but a silly little toy

0:10:15.436 --> 0:10:19.396
<v Speaker 1>project of trying to see if we could get a

0:10:19.516 --> 0:10:24.316
<v Speaker 1>geeky computer science student to move like a ballerina. In

0:10:24.356 --> 0:10:28.116
<v Speaker 1>the video, one of the students, Caroline Chan, dances with

0:10:28.156 --> 0:10:31.796
<v Speaker 1>the skill and grace of a professional, despite never having

0:10:31.796 --> 0:10:35.956
<v Speaker 1>studied ballet. The idea is you take a source actor

0:10:36.036 --> 0:10:40.156
<v Speaker 1>like a ballerina. There is a way to detect the

0:10:40.316 --> 0:10:43.596
<v Speaker 1>limbs of the dancer, have a kind of a skeleton

0:10:44.236 --> 0:10:48.956
<v Speaker 1>extracted, and also have my student just move around and

0:10:49.036 --> 0:10:52.716
<v Speaker 1>do some geeky moves. And now we're basically just going

0:10:52.796 --> 0:10:58.596
<v Speaker 1>to try to synthesize the appearance of my student driven

0:10:58.956 --> 0:11:01.716
<v Speaker 1>by the skeleton of the ballerina. Put it all together,

0:11:01.876 --> 0:11:05.716
<v Speaker 1>and then we have our grad student dancing pirouettes like

0:11:05.876 --> 0:11:12.276
<v Speaker 1>a ballerina. Through artificial intelligence, Caroline's body is puppeteered by

0:11:12.316 --> 0:11:15.396
<v Speaker 1>the dancer. We weren't even going to publish it, but

0:11:15.476 --> 0:11:19.916
<v Speaker 1>we just released a video on YouTube called Everybody Dance Now,

0:11:20.356 --> 0:11:25.516
<v Speaker 1>and somehow it really touched the nerve. While there's been

0:11:25.556 --> 0:11:29.596
<v Speaker 1>an explosion recently of new ways to manipulate media, Alexei

0:11:29.676 --> 0:11:33.196
<v Speaker 1>notes that the idea itself isn't new; it has a

0:11:33.276 --> 0:11:37.236
<v Speaker 1>long history. I can't help but ask, given that you

0:11:37.236 --> 0:11:41.916
<v Speaker 1>come from Russia. One of the premier users of doctoring

0:11:41.956 --> 0:11:47.196
<v Speaker 1>photographs I think was Stalin, who used the ability to

0:11:47.236 --> 0:11:51.716
<v Speaker 1>manipulate images for political effect. How did they do that?

0:11:52.276 --> 0:11:54.236
<v Speaker 1>Can you think of examples of this and like, what

0:11:54.316 --> 0:12:01.156
<v Speaker 1>was the technology then? The urge to change photographs has

0:12:01.236 --> 0:12:05.196
<v Speaker 1>been around basically since the invention of photography. For example,

0:12:05.196 --> 0:12:08.796
<v Speaker 1>there is a photograph of Abraham Lincoln that still hangs

0:12:08.796 --> 0:12:13.836
<v Speaker 1>in many classrooms. That's fake. It's actually Calhoun with Lincoln's

0:12:13.836 --> 0:12:17.956
<v Speaker 1>head attached to it. Alexei's referring to John C. Calhoun, the

0:12:17.996 --> 0:12:22.396
<v Speaker 1>South Carolina senator and champion of slavery. A Civil War

0:12:22.516 --> 0:12:27.636
<v Speaker 1>portrait artist superimposed a photo of Lincoln's head onto an

0:12:27.636 --> 0:12:32.396
<v Speaker 1>engraving of Calhoun's body because they thought Lincoln's gangly frame

0:12:32.956 --> 0:12:35.836
<v Speaker 1>wasn't dignified enough, and so they just said, okay, we

0:12:35.916 --> 0:12:39.436
<v Speaker 1>can use Calhoun. Let's slap the Lincoln's head on his body.

0:12:39.676 --> 0:12:42.956
<v Speaker 1>And then, of course, as soon as you go into

0:12:43.276 --> 0:12:46.796
<v Speaker 1>the twentieth century, as soon as you get to dictatorships,

0:12:47.196 --> 0:12:50.876
<v Speaker 1>this is a wonderful toy for a dictator to use.

0:12:51.356 --> 0:12:55.156
<v Speaker 1>So again, Stalin was a big fan of this. He

0:12:55.236 --> 0:12:59.716
<v Speaker 1>would get rid of people in photographs once they were

0:12:59.716 --> 0:13:02.676
<v Speaker 1>out of favor or once they got jailed or killed.

0:13:03.156 --> 0:13:09.436
<v Speaker 1>He would just basically get them scratched out with reasonably crude techniques.

0:13:10.076 --> 0:13:13.316
<v Speaker 1>Hitler did it, Mao did it, Castro did it, Brezhnev

0:13:13.436 --> 0:13:16.316
<v Speaker 1>did it. I'm sure US agencies have done it also.

0:13:16.836 --> 0:13:20.876
<v Speaker 1>We have always manipulated images with a desire to change history.

0:13:21.356 --> 0:13:24.356
<v Speaker 1>This is Hany Farid. He's also a professor at

0:13:24.356 --> 0:13:27.276
<v Speaker 1>Berkeley and a friend of Alexei's. I'm a professor of

0:13:27.316 --> 0:13:31.196
<v Speaker 1>computer science and I'm an expert in digital forensics. Where

0:13:31.196 --> 0:13:35.436
<v Speaker 1>Alexei works on making synthetic media, Hany has devoted his

0:13:35.516 --> 0:13:39.476
<v Speaker 1>career to identifying when synthetic media is being used to

0:13:39.556 --> 0:13:45.196
<v Speaker 1>fool people, that is, spotting fakes. He regularly collaborates on

0:13:45.236 --> 0:13:49.356
<v Speaker 1>this mission with Alexei. So I met Alyosha Efros ten,

0:13:49.436 --> 0:13:54.036
<v Speaker 1>twenty years ago. He is a really incredibly creative and

0:13:54.316 --> 0:13:58.196
<v Speaker 1>clever guy, and he has done what I consider some

0:13:58.236 --> 0:14:00.716
<v Speaker 1>of the most interesting work in computer vision and computer

0:14:00.796 --> 0:14:04.476
<v Speaker 1>graphics over the last two decades. And if you really

0:14:04.516 --> 0:14:06.996
<v Speaker 1>want to do forensics, well, you have to partner with

0:14:06.996 --> 0:14:10.116
<v Speaker 1>somebody like Alyosha. You have to partner with a world class

0:14:10.196 --> 0:14:13.756
<v Speaker 1>mind who knows how to think about the synthesis side so

0:14:13.956 --> 0:14:17.236
<v Speaker 1>that you can synthesize the absolute best content and then

0:14:17.356 --> 0:14:20.036
<v Speaker 1>think about how to detect it. I think it's interesting

0:14:20.076 --> 0:14:22.516
<v Speaker 1>that if you're somebody on the synthesis side and developing

0:14:22.596 --> 0:14:24.836
<v Speaker 1>the forensics, there's a little bit of a Jekyll and Hyde there,

0:14:24.836 --> 0:14:27.676
<v Speaker 1>and I think it's really fascinating. You know, the idea

0:14:27.836 --> 0:14:32.396
<v Speaker 1>of altering photos, it's not entirely new. How far back

0:14:32.436 --> 0:14:35.676
<v Speaker 1>does this go? So we used to have in the

0:14:35.716 --> 0:14:40.796
<v Speaker 1>days of Stalin, highly talented, highly skilled, time consuming, difficult

0:14:40.876 --> 0:14:46.116
<v Speaker 1>process of manipulating images, removing somebody, erasing something from the image,

0:14:46.436 --> 0:14:51.036
<v Speaker 1>splicing faces together. And then we moved into the digital age,

0:14:51.316 --> 0:14:55.036
<v Speaker 1>where now a highly talented digital artist could remove one

0:14:55.036 --> 0:14:57.356
<v Speaker 1>face and add another face, but it was still a

0:14:57.436 --> 0:15:01.636
<v Speaker 1>time consuming and required skill. In nineteen ninety four, the

0:15:01.676 --> 0:15:04.916
<v Speaker 1>makers of the movie Forrest Gump won an Oscar for

0:15:05.036 --> 0:15:09.676
<v Speaker 1>Visual Effects for their representations of the title character interacting

0:15:09.676 --> 0:15:14.476
<v Speaker 1>with historical figures like President John F. Kennedy. Congratulations. How

0:15:14.516 --> 0:15:18.316
<v Speaker 1>does it feel to be an All-American? Very good. Congratulations.

0:15:18.516 --> 0:15:26.276
<v Speaker 1>How do you feel? I believe it. Now computers are

0:15:26.316 --> 0:15:28.236
<v Speaker 1>doing all of the heavy lifting of what used to

0:15:28.236 --> 0:15:31.836
<v Speaker 1>be relegated to talented artists. The average person now can

0:15:31.916 --> 0:15:35.276
<v Speaker 1>use sophisticated technology to not just capture the recording, but

0:15:35.356 --> 0:15:38.676
<v Speaker 1>also manipulate it and then distribute it. The tools used

0:15:38.676 --> 0:15:42.316
<v Speaker 1>to create synthetic media have grown by leaps and bounds,

0:15:42.476 --> 0:15:45.396
<v Speaker 1>especially in the past few years, and so now we

0:15:45.436 --> 0:15:49.356
<v Speaker 1>have technology broadly called deep fake, but more specifically should

0:15:49.356 --> 0:15:53.556
<v Speaker 1>be called synthesized content where you point an image or

0:15:53.596 --> 0:15:56.916
<v Speaker 1>a video or an audio to an AI or machine

0:15:56.996 --> 0:16:00.276
<v Speaker 1>learning system, and it will replace the face for you.

0:16:00.516 --> 0:16:01.916
<v Speaker 1>I mean it can do that in an image, it

0:16:01.916 --> 0:16:04.276
<v Speaker 1>can do that in a video, or it can synthesize

0:16:04.276 --> 0:16:10.956
<v Speaker 1>audio for you in a particular person's voice. It's become

0:16:10.996 --> 0:16:15.836
<v Speaker 1>straightforward to swap people's faces. There's a popular YouTube video

0:16:16.236 --> 0:16:20.236
<v Speaker 1>that features tech pioneer Elon Musk's adult face on a

0:16:20.276 --> 0:16:24.316
<v Speaker 1>baby's body, and there's a famous meme where actor Nicolas

0:16:24.396 --> 0:16:28.636
<v Speaker 1>Cage's face replaces those of leading movie actors, both male

0:16:28.676 --> 0:16:32.156
<v Speaker 1>and female. You can put words into people's mouths and

0:16:32.276 --> 0:16:35.996
<v Speaker 1>make them jump and dance and run. You can even

0:16:36.076 --> 0:16:40.356
<v Speaker 1>resurrect powerful figures and have them deliver a fake speech

0:16:40.836 --> 0:16:49.596
<v Speaker 1>about a fake tragedy from an altered history. Chapter two:

0:16:50.596 --> 0:16:55.916
<v Speaker 1>Creating Nixon. The text of Nixon's Moon disaster speech that

0:16:55.956 --> 0:16:58.356
<v Speaker 1>we heard at the top of the show is actually

0:16:58.396 --> 0:17:01.396
<v Speaker 1>not fake. As I mentioned, it was written for President

0:17:01.476 --> 0:17:05.676
<v Speaker 1>Nixon as a contingency speech and thankfully never had to

0:17:05.716 --> 0:17:08.956
<v Speaker 1>be delivered. It's an amazing piece of writing. It was

0:17:09.156 --> 0:17:13.156
<v Speaker 1>written by Bill Safire, who was one of Nixon's speech writers.

0:17:13.556 --> 0:17:17.116
<v Speaker 1>This is artist and journalist Francesca Panetta. She's the co

0:17:17.316 --> 0:17:21.916
<v Speaker 1>director of the Nixon fake, or MIT's Moon Disaster team.

0:17:22.556 --> 0:17:27.476
<v Speaker 1>She's also the creative director in MIT's Center for Advanced Virtuality.

0:17:27.836 --> 0:17:32.396
<v Speaker 1>I was doing experimental journalism at the Guardian newspaper. I

0:17:32.476 --> 0:17:35.556
<v Speaker 1>ran the Guardian's virtual reality studio for the last three years.

0:17:35.836 --> 0:17:38.316
<v Speaker 1>The second half of the Moon Disaster team is sound

0:17:38.436 --> 0:17:42.796
<v Speaker 1>artist Halsey Burgund. My name is Halsey Burgund. I'm a

0:17:42.956 --> 0:17:45.796
<v Speaker 1>sound artist and technologist, and I've had a lot of

0:17:45.836 --> 0:17:49.956
<v Speaker 1>experience with lots of sorts of audio enhanced with technology,

0:17:50.236 --> 0:17:53.596
<v Speaker 1>though this is my first experience with synthetic media, especially

0:17:53.676 --> 0:17:57.156
<v Speaker 1>since I typically focus on authenticity of voice and now

0:17:57.196 --> 0:18:00.636
<v Speaker 1>I'm kind of doing the opposite. So together Halsey and

0:18:00.716 --> 0:18:04.556
<v Speaker 1>Francesca chose to automate a tragic moment in history that

0:18:04.836 --> 0:18:07.996
<v Speaker 1>never actually happened. I think it all started with it

0:18:08.036 --> 0:18:10.396
<v Speaker 1>being the fiftieth anniversary of the Moon landing last year,

0:18:10.556 --> 0:18:13.356
<v Speaker 1>and add on top of that an election cycle in

0:18:13.396 --> 0:18:16.556
<v Speaker 1>this country, and dealing with disinformation, which is obviously

0:18:17.196 --> 0:18:21.276
<v Speaker 1>very important in election cycles. It was like light bulbs

0:18:21.596 --> 0:18:25.076
<v Speaker 1>went on and we got very excited about pursuing it.

0:18:25.076 --> 0:18:28.436
<v Speaker 1>It's possible to make mediocre fakes pretty quickly and cheaply,

0:18:28.796 --> 0:18:32.596
<v Speaker 1>but Francesca and Halsey wanted high production values. So how

0:18:32.596 --> 0:18:36.476
<v Speaker 1>does one go about making a first rate fake presidential address.

0:18:36.956 --> 0:18:40.116
<v Speaker 1>There are two components. There's the visuals and there's the audio,

0:18:40.156 --> 0:18:44.396
<v Speaker 1>and they are completely different processes. So we decided to

0:18:44.436 --> 0:18:48.716
<v Speaker 1>go with a video dialogue replacement company called Canny AI,

0:18:49.156 --> 0:18:51.716
<v Speaker 1>who would do the visuals for us, and then we

0:18:51.796 --> 0:18:56.436
<v Speaker 1>decided to go with Respeecher, who are a dialogue replacement

0:18:56.476 --> 0:19:01.556
<v Speaker 1>company for the voice of Nixon. They tackled the voice first,

0:19:01.716 --> 0:19:04.476
<v Speaker 1>the more challenging of the two mediums. What we were

0:19:04.516 --> 0:19:07.596
<v Speaker 1>told to do was to get two to three hours

0:19:07.676 --> 0:19:10.876
<v Speaker 1>worth of Nixon talking. That was pretty easy because the

0:19:10.996 --> 0:19:14.996
<v Speaker 1>Nixon Library has hours and hours of Nixon, mainly giving

0:19:15.076 --> 0:19:19.276
<v Speaker 1>Vietnam speeches. The Communist armies of North Vietnam launched a

0:19:19.356 --> 0:19:23.036
<v Speaker 1>massive invasion of South Vietnam. That audio was then chopped

0:19:23.116 --> 0:19:26.916
<v Speaker 1>up into chunks between one and three seconds long. We

0:19:27.036 --> 0:19:32.676
<v Speaker 1>found this incredibly patient actor called Lewis D. Wheeler. Lewis

0:19:32.716 --> 0:19:36.396
<v Speaker 1>would listen to the one second clip and then he

0:19:37.036 --> 0:19:45.796
<v Speaker 1>would repeat that and do what I believe was right. Respeecher

0:19:45.876 --> 0:19:47.996
<v Speaker 1>would say to us things like, we need to change

0:19:48.036 --> 0:19:52.636
<v Speaker 1>the diagonal attention, which meant nothing to us. Yes, we

0:19:52.676 --> 0:19:59.196
<v Speaker 1>have a whole lot of potential band names going forward. Yeah,

0:19:59.276 --> 0:20:02.436
<v Speaker 1>Synthetic Nixon is another good one. So once we have

0:20:02.836 --> 0:20:06.236
<v Speaker 1>our Nixon model made out of these thousands of tiny clips,

0:20:06.596 --> 0:20:10.956
<v Speaker 1>it means that whatever our actor says will come out

0:20:11.076 --> 0:20:14.476
<v Speaker 1>then in Nixon's voice. So then what we did was

0:20:14.596 --> 0:20:19.196
<v Speaker 1>record the contingency speech of Nixon, and it meant that

0:20:19.236 --> 0:20:24.796
<v Speaker 1>we got Lewis's actual performance, but in Nixon's voice. What

0:20:24.876 --> 0:20:27.996
<v Speaker 1>about the video part? I mean, the video was much easier.

0:20:27.996 --> 0:20:30.036
<v Speaker 1>We're talking a couple of days here and a tiny

0:20:30.076 --> 0:20:35.676
<v Speaker 1>amount of data just with Lewis's iPhone. We filmed him

0:20:35.716 --> 0:20:39.436
<v Speaker 1>reading the contingency speech once a couple of minutes of

0:20:39.516 --> 0:20:42.956
<v Speaker 1>him just chatting to camera, and that was it. Fate

0:20:44.196 --> 0:20:47.316
<v Speaker 1>that the men who went to the Moon to explore

0:20:48.636 --> 0:20:52.796
<v Speaker 1>will stay on the moon. You know, we were told

0:20:52.796 --> 0:20:56.116
<v Speaker 1>by Canny AI that everything would be the same in

0:20:56.156 --> 0:20:59.476
<v Speaker 1>the video apart from just the area around the mouth.

0:20:59.916 --> 0:21:03.076
<v Speaker 1>So every gesture of the hand, every blink, every time

0:21:03.076 --> 0:21:05.876
<v Speaker 1>he moved his face, all of that would stay the same,

0:21:06.396 --> 0:21:11.036
<v Speaker 1>but just the mouth basically would change. So we used

0:21:11.276 --> 0:21:15.196
<v Speaker 1>Nixon's resignation speech. To have served in this office

0:21:15.356 --> 0:21:19.916
<v Speaker 1>is to have felt a very personal sense of... It

0:21:19.996 --> 0:21:22.876
<v Speaker 1>was the speech of Nixon that looked the most

0:21:22.916 --> 0:21:25.316
<v Speaker 1>somber; he seemed to have the most emotion in

0:21:25.356 --> 0:21:28.636
<v Speaker 1>his face. So what actually went on in the computer?

0:21:29.516 --> 0:21:34.996
<v Speaker 1>Artificial intelligence sometimes sounds inscrutable, but the basic ideas are

0:21:35.076 --> 0:21:38.316
<v Speaker 1>quite simple. In this case, it uses a type of

0:21:38.356 --> 0:21:42.716
<v Speaker 1>computer program called an autoencoder. It's trained to take

0:21:42.876 --> 0:21:48.476
<v Speaker 1>complicated things, say spoken sentences or pictures, encode them in

0:21:48.516 --> 0:21:52.676
<v Speaker 1>a much simpler form, and then decode them to recover

0:21:52.716 --> 0:21:56.356
<v Speaker 1>the original as best it can. The encoder tries to

0:21:56.396 --> 0:22:00.036
<v Speaker 1>reduce things to their essence, throwing away most of the

0:22:00.116 --> 0:22:03.036
<v Speaker 1>information but keeping enough to do a good job of

0:22:03.076 --> 0:22:06.596
<v Speaker 1>reconstructing it. To make a deep fake, here's the trick.

0:22:07.316 --> 0:22:11.116
<v Speaker 1>Train a speech autoencoder for Nixon to Nixon, and

0:22:11.236 --> 0:22:15.476
<v Speaker 1>a speech autoencoder for actor to actor, but force

0:22:15.596 --> 0:22:20.956
<v Speaker 1>them to use the same encoder. Then you can input

0:22:21.076 --> 0:22:26.036
<v Speaker 1>actor and decode it as Nixon. If you have enough data,

0:22:26.236 --> 0:22:32.156
<v Speaker 1>it's a piece of cake. Around their carefully created video,
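
NOTE
A minimal sketch of the shared-encoder trick described in the narration, under loud assumptions: each audio frame is reduced to a made-up pair (content, timbre), the shared encoder is hand-set to keep only the content, and each speaker's decoder is a simple least-squares line fit. Real systems learn both parts as neural networks on real audio; every name and number below is hypothetical.
```python
# Toy sketch of the shared-encoder swap; all names and numbers are made up.
# Each audio "frame" is a pair (content, timbre): content is what is said,
# timbre is a constant that identifies the speaker.
def encode(frame):
    """Shared encoder (hand-set here, learned in practice): keep content only."""
    content, _timbre = frame
    return content
def fit_decoder(frames):
    """Fit one speaker's decoder: a least-squares line from code to each feature."""
    codes = [encode(f) for f in frames]
    mean_z = sum(codes) / len(codes)
    var_z = sum((z - mean_z) * (z - mean_z) for z in codes)
    params = []
    for j in range(2):
        xs = [f[j] for f in frames]
        mean_x = sum(xs) / len(xs)
        slope = sum((z - mean_z) * (x - mean_x) for z, x in zip(codes, xs)) / var_z
        params.append((slope, mean_x - slope * mean_z))
    return params
def decode(params, code):
    """Rebuild a full (content, timbre) frame from a code with one speaker's decoder."""
    return tuple(slope * code + offset for slope, offset in params)
NIXON_TIMBRE, ACTOR_TIMBRE = 1.0, -1.0
nixon_clips = [(c, NIXON_TIMBRE) for c in (0.0, 1.0, 2.0, 3.0)]
actor_clips = [(c, ACTOR_TIMBRE) for c in (0.5, 1.5, 2.5)]
# Both decoders are fit on codes from the same encoder, one speaker each.
nixon_decoder = fit_decoder(nixon_clips)
actor_decoder = fit_decoder(actor_clips)
# The swap: encode the actor's new performance, decode with Nixon's decoder.
performance = [(0.2, ACTOR_TIMBRE), (2.7, ACTOR_TIMBRE), (1.1, ACTOR_TIMBRE)]
fake_nixon = [decode(nixon_decoder, encode(f)) for f in performance]
```
In this toy, fake_nixon keeps the actor's content values but carries Nixon's timbre, because the encoder discards the speaker and only the chosen decoder puts one back.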

0:22:32.516 --> 0:22:36.916
<v Speaker 1>the Moon Disaster team created an entire art installation a

0:22:36.996 --> 0:22:40.916
<v Speaker 1>nineteen sixties living room with a fake vintage newspaper sharing

0:22:40.916 --> 0:22:45.756
<v Speaker 1>the fake tragic news while a fake Nixon speaks solemnly

0:22:46.036 --> 0:22:49.916
<v Speaker 1>on a vintage black and white television. Some people, when

0:22:49.996 --> 0:22:52.996
<v Speaker 1>they were watching the installation, they watched a number of times.

0:22:53.036 --> 0:22:54.996
<v Speaker 1>You'd see them, they'd watch it once, then they would

0:22:54.996 --> 0:22:57.916
<v Speaker 1>watch it again, staring at the lips to see if

0:22:57.956 --> 0:23:01.356
<v Speaker 1>they could see any lack of synchronicity. We had some

0:23:01.396 --> 0:23:05.316
<v Speaker 1>people who thought that perhaps Nixon had actually recorded this

0:23:05.556 --> 0:23:09.076
<v Speaker 1>speech as a contingency speech for it to go onto television.

0:23:09.516 --> 0:23:12.916
<v Speaker 1>Lots of folks who were listening, viewing, and even press

0:23:12.996 --> 0:23:15.636
<v Speaker 1>folks just immediately said, oh, the voice is real or whatever,

0:23:15.636 --> 0:23:19.076
<v Speaker 1>you know, said these things that weren't accurate because they

0:23:19.156 --> 0:23:21.596
<v Speaker 1>just felt like there wasn't even a question. I suppose

0:23:22.076 --> 0:23:24.036
<v Speaker 1>that is what we wanted to achieve. But at the

0:23:24.076 --> 0:23:26.676
<v Speaker 1>same time, it was a little bit eye opening

0:23:26.676 --> 0:23:29.436
<v Speaker 1>and like a little scary, you know that that could happen.

0:23:33.076 --> 0:23:39.876
<v Speaker 1>Chapter three: Everybody Dance. What do you see as just

0:23:39.956 --> 0:23:45.636
<v Speaker 1>the wonderful upsides of having technologies like this? Yeah, I

0:23:45.676 --> 0:23:49.916
<v Speaker 1>mean, AI art is becoming a whole field in itself,

0:23:50.396 --> 0:23:54.996
<v Speaker 1>so creatively there is enormous potential. One of the potential

0:23:55.156 --> 0:23:58.396
<v Speaker 1>positive educational uses of deep fake technology would be to

0:23:58.516 --> 0:24:02.796
<v Speaker 1>bring historical figures back to life to make learning more durable.

0:24:02.876 --> 0:24:05.916
<v Speaker 1>I think one could do that with bringing Abraham Lincoln

0:24:05.916 --> 0:24:08.636
<v Speaker 1>back to life and having him deliver speeches. Film companies

0:24:08.636 --> 0:24:11.676
<v Speaker 1>are very excited about reenactments. We're already beginning to

0:24:11.716 --> 0:24:15.796
<v Speaker 1>see this in films like Star Wars, when we're bringing

0:24:15.836 --> 0:24:18.556
<v Speaker 1>people like Carrie Fisher back to life. I mean that

0:24:18.716 --> 0:24:22.156
<v Speaker 1>is at the moment not being done through deep fake technologies.

0:24:22.156 --> 0:24:25.716
<v Speaker 1>This is using fairly traditional techniques of CGI at the moment.

0:24:25.916 --> 0:24:28.596
<v Speaker 1>So we still have to see our first deep fake

0:24:28.836 --> 0:24:32.396
<v Speaker 1>big cinema screen release, but this is just to come.

0:24:32.516 --> 0:24:35.516
<v Speaker 1>Like the technology is getting better and better. Not only

0:24:35.556 --> 0:24:38.596
<v Speaker 1>will we be able to potentially bring back actors and

0:24:38.636 --> 0:24:42.236
<v Speaker 1>actresses who are no longer alive and have them star in movies,

0:24:42.276 --> 0:24:45.036
<v Speaker 1>but an actor could make a model of their own

0:24:45.116 --> 0:24:47.956
<v Speaker 1>voice and then sell the use of that voice to

0:24:47.996 --> 0:24:52.236
<v Speaker 1>anybody to do a voiceover of whatever is wanted, and

0:24:52.276 --> 0:24:54.716
<v Speaker 1>so they could have twenty of these going on at

0:24:54.716 --> 0:24:57.156
<v Speaker 1>the same time, and the sort of restriction of their

0:24:57.196 --> 0:25:00.716
<v Speaker 1>physical presence is no longer there. And that might mean

0:25:00.756 --> 0:25:03.756
<v Speaker 1>that you know, Brad Pitt is in everything, or it

0:25:03.876 --> 0:25:07.116
<v Speaker 1>might just mean that lower budget films can afford to

0:25:07.156 --> 0:25:10.036
<v Speaker 1>have some of the higher cost talent. At that point, you know,

0:25:10.076 --> 0:25:13.356
<v Speaker 1>the top twenty actors could just do everything. Yes, there's

0:25:13.396 --> 0:25:15.796
<v Speaker 1>no doubt that there will be winners and losers from

0:25:16.076 --> 0:25:19.596
<v Speaker 1>these technologies, but the potential of synthetic media goes way

0:25:19.636 --> 0:25:23.876
<v Speaker 1>beyond the arts. There are possible medical and therapeutic applications.

0:25:24.236 --> 0:25:27.596
<v Speaker 1>There are companies that are working very hard to allow

0:25:27.676 --> 0:25:30.196
<v Speaker 1>people who have either lost their voice or who never

0:25:30.236 --> 0:25:32.716
<v Speaker 1>had a voice to be able to speak in a

0:25:32.996 --> 0:25:35.836
<v Speaker 1>way that is either how they used to speak or

0:25:36.556 --> 0:25:39.556
<v Speaker 1>in a way that isn't a canned voice that everybody has.

0:25:39.956 --> 0:25:43.796
<v Speaker 1>Alexei Efros and his students discovered potential uses of synthetic

0:25:43.836 --> 0:25:48.996
<v Speaker 1>media in medicine quite unintentionally while working on their Everybody

0:25:49.116 --> 0:25:53.276
<v Speaker 1>Dance Now project that could turn anyone into a ballerina.

0:25:53.476 --> 0:25:57.836
<v Speaker 1>We were kind of surprised for all the positive feedback.

0:25:57.876 --> 0:26:01.556
<v Speaker 1>We got, we've got emails from people who were quadriplegic

0:26:01.636 --> 0:26:03.916
<v Speaker 1>and they asked us if we could make them dance,

0:26:03.996 --> 0:26:08.116
<v Speaker 1>and it was very unexpected. So now we are trying

0:26:08.156 --> 0:26:10.316
<v Speaker 1>to get the software to be in a state

0:26:10.316 --> 0:26:13.996
<v Speaker 1>where people can use that because yeah, it's somehow it

0:26:14.036 --> 0:26:20.756
<v Speaker 1>did hit a nerve with folks. Chapter four: Unicorns in

0:26:20.796 --> 0:26:26.116
<v Speaker 1>the Andes. The past few years have seen amazing advances

0:26:26.196 --> 0:26:30.196
<v Speaker 1>in the creation of synthetic media through artificial intelligence. The

0:26:30.236 --> 0:26:33.836
<v Speaker 1>technology now goes far beyond fitting one face over another

0:26:33.876 --> 0:26:37.276
<v Speaker 1>face in a video. A recent breakthrough has made it

0:26:37.276 --> 0:26:43.196
<v Speaker 1>possible to create entirely new and very convincing content out

0:26:43.236 --> 0:26:49.356
<v Speaker 1>of thin air. The breakthrough, called generative adversarial networks or GANs,

0:26:49.956 --> 0:26:53.796
<v Speaker 1>came from a machine learning researcher at Google named Ian Goodfellow.

0:26:54.596 --> 0:26:58.916
<v Speaker 1>Like auto encoders, the basic idea is simple but brilliant.

0:26:59.836 --> 0:27:03.596
<v Speaker 1>Suppose you want to create amazingly realistic photos of people

0:27:03.636 --> 0:27:07.356
<v Speaker 1>who don't exist. Well, you build a GAN consisting of

0:27:07.476 --> 0:27:12.276
<v Speaker 1>two computer programs, a photo generator that learns to generate

0:27:12.316 --> 0:27:17.436
<v Speaker 1>fake photos and a photo discriminator that learns to discriminate

0:27:17.516 --> 0:27:22.596
<v Speaker 1>or identify fake photos from a vast collection of real photos.

0:27:23.276 --> 0:27:26.996
<v Speaker 1>You then let the two programs compete, continually tweaking their

0:27:26.996 --> 0:27:30.636
<v Speaker 1>code to outsmart each other. By the time they're done,

0:27:31.036 --> 0:27:35.556
<v Speaker 1>the GAN can generate amazingly convincing fakes. You can see

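NOTE
Editor's sketch, not part of the episode: the generator-versus-discriminator
contest described above can be shown on a deliberately tiny problem. This is
a rough sketch under loud assumptions: the "photos" are single numbers near 4,
the generator and discriminator are one-line formulas instead of neural
networks, and all names (REAL_MEAN, gen, disc, generate, discriminate, train)
are hypothetical. Toy GANs like this tend to circle around a solution rather
than settle on it, so no particular final values are claimed.
```python
import math
import random
random.seed(0)
def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))
REAL_MEAN = 4.0              # hypothetical "real photos": numbers clustered near 4
gen = {"a": 1.0, "b": 0.0}   # generator: noise z becomes the fake sample a*z + b
disc = {"w": 0.0, "c": 0.0}  # discriminator: sample x becomes a probability of "real"
def generate(z):
    return gen["a"] * z + gen["b"]
def discriminate(x):
    return sigmoid(disc["w"] * x + disc["c"])
def train(steps=3000, lr=0.01):
    for _ in range(steps):
        real = random.gauss(REAL_MEAN, 0.5)
        z = random.gauss(0.0, 1.0)
        fake = generate(z)
        # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        d_real, d_fake = discriminate(real), discriminate(fake)
        disc["w"] -= lr * ((d_real - 1.0) * real + d_fake * fake)
        disc["c"] -= lr * ((d_real - 1.0) + d_fake)
        # Generator step: nudge the fake so the updated discriminator rates it more real.
        d_fake = discriminate(fake)
        grad_fake = (d_fake - 1.0) * disc["w"]
        gen["a"] -= lr * grad_fake * z
        gen["b"] -= lr * grad_fake
train()
samples = [generate(random.gauss(0.0, 1.0)) for _ in range(5)]
print(gen, disc, samples)
```
Each loop pass is one round of the contest: the discriminator adjusts first,
then the generator responds to the discriminator it now faces.
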
0:27:35.596 --> 0:27:38.596
<v Speaker 1>for yourself if you go to the website this person

0:27:38.716 --> 0:27:43.476
<v Speaker 1>does not exist dot com. Every time you refresh the page,

0:27:43.836 --> 0:27:47.716
<v Speaker 1>you're shown a new uncanny image of a person who,

0:27:48.076 --> 0:27:51.836
<v Speaker 1>as the website says, does not and never did exist.

0:27:52.636 --> 0:28:00.796
<v Speaker 1>Francesca and I actually tried out the website. This young Asian woman.

0:28:00.956 --> 0:28:04.436
<v Speaker 1>She's got great complexion, envious of that, neat black hair,

0:28:04.516 --> 0:28:08.276
<v Speaker 1>with a fringe, pink lipstick and a slightly dreamy look

0:28:08.396 --> 0:28:14.116
<v Speaker 1>as she's kind of gazing off to her left. Oh,

0:28:14.156 --> 0:28:16.316
<v Speaker 1>here's a woman who looks like she could be a

0:28:16.356 --> 0:28:20.836
<v Speaker 1>neighbor of mine in Cambridge, probably about sixty five. She's

0:28:20.836 --> 0:28:26.276
<v Speaker 1>got nice wire framed glasses, layered hair. Her earrings don't

0:28:26.316 --> 0:28:30.196
<v Speaker 1>actually match, but that could just be her distinctive style.

0:28:30.996 --> 0:28:35.636
<v Speaker 1>I mean, of course, she doesn't really exist. It's hard

0:28:35.676 --> 0:28:40.236
<v Speaker 1>to argue that GANs aren't creating original art. In fact,

0:28:40.636 --> 0:28:44.356
<v Speaker 1>an artist collective recently used a GAN to create a

0:28:44.396 --> 0:28:50.036
<v Speaker 1>French Impressionist style portrait. When Christie's sold it at auction,

0:28:50.596 --> 0:28:53.836
<v Speaker 1>it fetched an eye popping four hundred and thirty two

0:28:54.076 --> 0:28:59.996
<v Speaker 1>thousand dollars. Alexei Efros, the Berkeley professor, recently pushed GANs a

0:29:00.076 --> 0:29:05.516
<v Speaker 1>step further, creating something called CycleGANs by connecting two

0:29:05.636 --> 0:29:10.236
<v Speaker 1>GANs together in a clever way. CycleGANs can transform

0:29:10.276 --> 0:29:14.156
<v Speaker 1>a Monet painting into what's seemingly a photograph of the

0:29:14.196 --> 0:29:18.396
<v Speaker 1>same scene, or turn a summer landscape into a winter

0:29:18.556 --> 0:29:23.356
<v Speaker 1>landscape of the same view. Alexei's CycleGANs seem like magic.

0:29:23.796 --> 0:29:27.836
<v Speaker 1>If you were to add in virtual reality, the possibilities

0:29:28.196 --> 0:29:35.236
<v Speaker 1>become mind blowing. You may be reminiscing about walking down Saint

0:29:35.316 --> 0:29:38.756
<v Speaker 1>Germain in Paris, and with a few clicks, you are there,

0:29:38.876 --> 0:29:41.596
<v Speaker 1>and you're walking down the boulevard and you're looking at

0:29:41.596 --> 0:29:45.356
<v Speaker 1>all the buildings, and maybe you can even switch to

0:29:45.396 --> 0:29:49.596
<v Speaker 1>a different year. And I think that is I think

0:29:49.756 --> 0:29:54.796
<v Speaker 1>very exciting as a way to mentally travel to different places.

0:29:55.276 --> 0:29:57.676
<v Speaker 1>So if you do this in VR, I mean imagine

0:29:58.076 --> 0:30:01.876
<v Speaker 1>classes going on a class visit to ancient Rome. That's right.

0:30:02.196 --> 0:30:06.996
<v Speaker 1>You could imagine from how a particular city like Rome

0:30:07.356 --> 0:30:11.196
<v Speaker 1>looks now, trying to extrapolate how it looked in the past. It

0:30:11.276 --> 0:30:15.396
<v Speaker 1>turns out that GANs aren't just transforming images. I spoke

0:30:15.436 --> 0:30:19.156
<v Speaker 1>with a friend who's very familiar with another remarkable application

0:30:19.196 --> 0:30:22.716
<v Speaker 1>of the technology. My name is Reid Hoffman. I'm a

0:30:22.716 --> 0:30:25.676
<v Speaker 1>podcaster of Masters of Scale. I'm a partner at Greylock,

0:30:25.716 --> 0:30:28.236
<v Speaker 1>which is where we're sitting right now, co founder of

0:30:28.276 --> 0:30:32.716
<v Speaker 1>LinkedIn and then a variety of other eccentric hobbies. Reid

0:30:32.836 --> 0:30:37.236
<v Speaker 1>is a board member of an unusual organization called OpenAI.

0:30:37.676 --> 0:30:40.876
<v Speaker 1>OpenAI is highly concerned with artificial general intelligence,

0:30:40.956 --> 0:30:45.036
<v Speaker 1>human level intelligence. I helped Sam Altman and Elon Musk

0:30:45.156 --> 0:30:50.436
<v Speaker 1>stand it up. The basic concern was that if one company

0:30:50.516 --> 0:30:55.076
<v Speaker 1>created and deployed that that could be disbalancing in all

0:30:55.156 --> 0:30:58.116
<v Speaker 1>kinds of ways. And so the thought is, if it

0:30:58.196 --> 0:31:00.956
<v Speaker 1>could be created, we should make sure that there is

0:31:01.036 --> 0:31:04.356
<v Speaker 1>essentially a nonprofit that is creating this and that can

0:31:04.436 --> 0:31:09.756
<v Speaker 1>make that technology available at selective time slices to industry

0:31:09.756 --> 0:31:13.716
<v Speaker 1>as a whole, government, et cetera. Last year, OpenAI

0:31:13.916 --> 0:31:18.556
<v Speaker 1>released a program that uses GANs to write language from

0:31:18.556 --> 0:31:23.516
<v Speaker 1>a short opening prompt. The system, called GPT-2, can

0:31:23.556 --> 0:31:27.156
<v Speaker 1>spin a convincing article or story. Instead of a deep

0:31:27.196 --> 0:31:32.076
<v Speaker 1>fake video, it's deep fake text. It's pretty amazing, actually.

0:31:32.516 --> 0:31:37.836
<v Speaker 1>For example, OpenAI researchers gave the program the following prompt.

0:31:38.716 --> 0:31:42.036
<v Speaker 1>In a shocking finding, scientists discovered a herd of unicorns

0:31:42.076 --> 0:31:45.676
<v Speaker 1>living in a remote, previously unexplored valley in the Andes Mountains.

0:31:46.316 --> 0:31:49.316
<v Speaker 1>Even more surprising to the researchers was the fact that

0:31:49.316 --> 0:31:53.756
<v Speaker 1>the unicorns spoke perfect English. GPT-2 took it from there,

0:31:53.956 --> 0:31:59.356
<v Speaker 1>delivering nine crisp paragraphs on the landmark discovery. I asked

0:31:59.396 --> 0:32:02.956
<v Speaker 1>Fran to read a bit from the story. Doctor Jorge Perez,

0:32:03.156 --> 0:32:06.956
<v Speaker 1>an evolutionary biologist from the University of La Paz, and several

0:32:06.996 --> 0:32:10.796
<v Speaker 1>companions were exploring the Andes Mountains when they found a

0:32:10.836 --> 0:32:15.156
<v Speaker 1>small valley with no other animals or humans. Perez noticed

0:32:15.196 --> 0:32:17.596
<v Speaker 1>that the valley had what appeared to be a natural

0:32:17.676 --> 0:32:21.236
<v Speaker 1>fountain, surrounded by two peaks of rock and silver snow.

0:32:22.036 --> 0:32:25.076
<v Speaker 1>Perez and the others then ventured further into the valley.

0:32:25.596 --> 0:32:27.436
<v Speaker 1>By the time we reached the top of one peak,

0:32:27.516 --> 0:32:30.556
<v Speaker 1>the water looked blue with some crystals on top, said Perez.

0:32:30.916 --> 0:32:34.076
<v Speaker 1>Perez and his friends were astonished to see the unicorn herd.

0:32:39.156 --> 0:32:40.556
<v Speaker 1>So tell me some of the great things you can

0:32:40.596 --> 0:32:45.236
<v Speaker 1>do with language generation. Well, say, for example, entertainment. Generate

0:32:45.356 --> 0:32:48.796
<v Speaker 1>stories that could be fresh and interesting and new and

0:32:48.836 --> 0:32:53.596
<v Speaker 1>personal for every child. Embed educational things in those stories

0:32:53.596 --> 0:32:56.356
<v Speaker 1>so they're drawn into the fact that the story is

0:32:56.396 --> 0:32:59.916
<v Speaker 1>involving them and their friends, but also now brings in

0:33:00.076 --> 0:33:04.716
<v Speaker 1>grammar and math and other kinds of things as they're

0:33:04.756 --> 0:33:09.716
<v Speaker 1>doing it. Generate explanatory material of this kind of education

0:33:10.276 --> 0:33:13.316
<v Speaker 1>that works best for this audience, for this kind of people,

0:33:13.356 --> 0:33:14.836
<v Speaker 1>like we want to have this kind of math or

0:33:14.876 --> 0:33:17.076
<v Speaker 1>this kind of physics, or this kind of history or

0:33:17.076 --> 0:33:19.996
<v Speaker 1>this kind of poetry explained in the right way. And

0:33:20.076 --> 0:33:23.596
<v Speaker 1>also the style of language, right, like, you know, native city

0:33:23.796 --> 0:33:27.596
<v Speaker 1>X language. When OpenAI announced its breakthrough program for

0:33:27.676 --> 0:33:31.796
<v Speaker 1>text generation, it took the unusual step of not releasing

0:33:31.796 --> 0:33:34.556
<v Speaker 1>the full powered version because it was worried about the

0:33:34.556 --> 0:33:38.476
<v Speaker 1>possible consequences. Now, part of the OpenAI decision to

0:33:38.516 --> 0:33:41.916
<v Speaker 1>say we're going to release a smaller model than the

0:33:41.916 --> 0:33:44.836
<v Speaker 1>one we did is because we think that the deep

0:33:44.876 --> 0:33:47.676
<v Speaker 1>fake problem hasn't been solved. And by the way, some

0:33:47.716 --> 0:33:49.876
<v Speaker 1>people complained about that because they said, well, you're slowing

0:33:49.916 --> 0:33:52.516
<v Speaker 1>down our ability to do progress and so forth. The

0:33:52.516 --> 0:33:55.036
<v Speaker 1>answer is to say, look, when these are released to the

0:33:55.276 --> 0:33:58.396
<v Speaker 1>entire public, we cannot control the downside as well as

0:33:58.436 --> 0:34:06.556
<v Speaker 1>the upsides. From art to therapy to virtual time travel,

0:34:06.876 --> 0:34:12.756
<v Speaker 1>personalized stories and education, synthetic media has amazing upsides. What

0:34:13.476 --> 0:34:20.316
<v Speaker 1>could possibly go wrong? Chapter five: What could possibly go wrong?

0:34:21.876 --> 0:34:25.556
<v Speaker 1>The downsides are actually not hard to find. The ability

0:34:25.556 --> 0:34:30.836
<v Speaker 1>to reshape reality brings extraordinary power, and people inevitably use

0:34:30.956 --> 0:34:35.476
<v Speaker 1>power to control other people. It should be no surprise, therefore,

0:34:35.516 --> 0:34:39.236
<v Speaker 1>that ninety six percent of fake videos posted online are

0:34:39.356 --> 0:34:45.236
<v Speaker 1>non consensual pornography videos, almost always of women manipulated to

0:34:45.356 --> 0:34:49.796
<v Speaker 1>depict sex acts that never actually occurred. I spoke with

0:34:49.836 --> 0:34:53.716
<v Speaker 1>a professor who studies deep fakes, including digital attempts to

0:34:53.836 --> 0:34:57.676
<v Speaker 1>control women's bodies. I'm Danielle Citron, and I am a

0:34:57.756 --> 0:35:01.036
<v Speaker 1>law professor at Boston University School of Law. I write

0:35:01.076 --> 0:35:06.396
<v Speaker 1>about privacy, technology, automation. My newest work and my next

0:35:06.396 --> 0:35:09.636
<v Speaker 1>book is going to be about sexual privacy. So I've

0:35:09.676 --> 0:35:13.196
<v Speaker 1>worked in and around consumer privacy, individual rights, civil rights.

0:35:13.236 --> 0:35:16.756
<v Speaker 1>I write a lot about free speech and then automated systems.

0:35:17.116 --> 0:35:20.676
<v Speaker 1>When did you first become aware of deep fakes? Do

0:35:20.676 --> 0:35:23.076
<v Speaker 1>you remember when this crossed your radar? I did so.

0:35:23.596 --> 0:35:26.796
<v Speaker 1>There was a Reddit thread devoted to, you know, fake

0:35:27.156 --> 0:35:31.596
<v Speaker 1>pornography movies of Gal Gadot, Emma Watson. But the Reddit thread

0:35:32.036 --> 0:35:35.716
<v Speaker 1>sort of spooled not just from celebrities but ordinary people,

0:35:36.236 --> 0:35:38.636
<v Speaker 1>and so you had redditors asking each other, how do

0:35:38.676 --> 0:35:40.956
<v Speaker 1>I make a deep fake sex video of my ex girlfriend?

0:35:40.956 --> 0:35:43.836
<v Speaker 1>I have thirty pictures, and then other redditors would chime

0:35:43.836 --> 0:35:46.876
<v Speaker 1>in and say, look at this YouTube tutorial. You can

0:35:46.956 --> 0:35:49.916
<v Speaker 1>absolutely make a deep fake sex video of your ex

0:35:50.356 --> 0:35:53.316
<v Speaker 1>with thirty pictures. I've done it with twenty. In November

0:35:53.436 --> 0:35:59.276
<v Speaker 1>twenty seventeen, an anonymous redditor began posting synthesized porn videos

0:35:59.476 --> 0:36:03.356
<v Speaker 1>under the pseudonym deep fakes, perhaps a nod to the

0:36:03.436 --> 0:36:06.516
<v Speaker 1>deep learning technology used to create them as well

0:36:06.556 --> 0:36:11.996
<v Speaker 1>as the nineteen seventies porn film Deep Throat. The Internet quickly

0:36:12.036 --> 0:36:16.716
<v Speaker 1>adopted the term deep fakes and broadened its meaning beyond pornography.

0:36:17.356 --> 0:36:20.676
<v Speaker 1>To create the videos, he used celebrity faces from Google

0:36:20.756 --> 0:36:25.276
<v Speaker 1>image search and YouTube videos, and then trained an algorithm

0:36:25.276 --> 0:36:29.876
<v Speaker 1>on that content together with pornographic videos. Have you seen

0:36:30.556 --> 0:36:34.916
<v Speaker 1>deep fake pornography videos? Yes, so still pretty crude, so

0:36:34.996 --> 0:36:38.556
<v Speaker 1>you probably can tell that it's a fake, But for

0:36:38.596 --> 0:36:43.156
<v Speaker 1>the person who's inserted into pornography, it's devastating. You use

0:36:43.276 --> 0:36:48.756
<v Speaker 1>the neural network technology, the artificial intelligence technology to create

0:36:48.956 --> 0:36:54.596
<v Speaker 1>out of digital whole cloth pornography videos using probably real

0:36:54.636 --> 0:36:58.356
<v Speaker 1>pornography and then inserting the person in the pornography so

0:36:58.436 --> 0:37:01.196
<v Speaker 1>they become the female actress. If it's a female, it's

0:37:01.276 --> 0:37:06.276
<v Speaker 1>usually a female in that video. My name is Noelle

0:37:06.476 --> 0:37:14.236
<v Speaker 1>Martin and I am an activist and law reform campaigner in Australia.

0:37:14.476 --> 0:37:18.876
<v Speaker 1>Noelle is twenty six years old and she lives in Perth, Australia.

0:37:19.156 --> 0:37:25.796
<v Speaker 1>So the first time that I discovered myself on pornographic

0:37:25.876 --> 0:37:31.956
<v Speaker 1>sites was when I was eighteen and out of curiosity,

0:37:32.036 --> 0:37:36.516
<v Speaker 1>decided to Google image reverse search myself. In an instant,

0:37:37.276 --> 0:37:41.796
<v Speaker 1>like in a less than a millisecond, my life completely changed.

0:37:42.196 --> 0:37:45.916
<v Speaker 1>At first, it started with photos, still images stolen from

0:37:45.996 --> 0:37:51.396
<v Speaker 1>Noelle's social media accounts. They were then doctoring my face

0:37:51.756 --> 0:37:57.996
<v Speaker 1>from ordinary images and superimposing those onto the bodies of

0:37:58.036 --> 0:38:02.516
<v Speaker 1>women depicting me having sexual intercourse. It proved impossible to

0:38:02.556 --> 0:38:06.316
<v Speaker 1>identify who was manipulating Noelle's image in this way. It's

0:38:06.356 --> 0:38:09.316
<v Speaker 1>still unclear today, which made it difficult for her to

0:38:09.356 --> 0:38:13.516
<v Speaker 1>seek legal action. I went to the police soon after,

0:38:14.236 --> 0:38:20.636
<v Speaker 1>I contacted government agencies, tried getting a private investigator. Essentially,

0:38:20.956 --> 0:38:24.356
<v Speaker 1>there's nothing that they could do. The sites are hosted overseas,

0:38:24.676 --> 0:38:29.076
<v Speaker 1>the perpetrators are probably overseas. The reaction was at the

0:38:29.156 --> 0:38:31.836
<v Speaker 1>end of the day, I think you can contact the

0:38:31.916 --> 0:38:34.836
<v Speaker 1>webmasters to try and get things deleted. You know, you

0:38:34.876 --> 0:38:38.156
<v Speaker 1>can adjust your privacy setting so that you know nothing

0:38:38.276 --> 0:38:43.436
<v Speaker 1>is available to anyone publicly. It was an unwinnable situation.

0:38:44.036 --> 0:38:48.076
<v Speaker 1>Then things started to escalate. In twenty eighteen, who all

0:38:48.156 --> 0:38:52.316
<v Speaker 1>saw a synthesized pornographic video of herself And I believe

0:38:52.396 --> 0:38:56.556
<v Speaker 1>that it was done for the purposes of silencing me

0:38:57.356 --> 0:39:02.836
<v Speaker 1>because I've been very public about my story and advocating

0:39:02.916 --> 0:39:07.356
<v Speaker 1>for change. So I had actually gotten an email from a

0:39:07.436 --> 0:39:11.076
<v Speaker 1>fake email address, and you know, I clicked the link

0:39:11.156 --> 0:39:15.076
<v Speaker 1>I was actually at work. It was a video of

0:39:15.116 --> 0:39:19.996
<v Speaker 1>me having sexual intercourse. The title had my name, the

0:39:20.036 --> 0:39:23.756
<v Speaker 1>face of the woman in it was edited so that

0:39:23.836 --> 0:39:27.196
<v Speaker 1>it was my face, and you know, all the tags

0:39:27.236 --> 0:39:34.076
<v Speaker 1>were like Noelle Martin Australia, Feminist, and it didn't look real,

0:39:34.836 --> 0:39:39.156
<v Speaker 1>but the context of everything, with the title my face,

0:39:39.396 --> 0:39:43.996
<v Speaker 1>with the tags all points to me being depicted in

0:39:44.036 --> 0:39:47.436
<v Speaker 1>this video. The fakes were of poor quality, but porn

0:39:47.516 --> 0:39:52.116
<v Speaker 1>consumers aren't a discriminating lot, and many people reacted

0:39:52.116 --> 0:39:54.396
<v Speaker 1>to them as if they were real. The public reaction

0:39:54.556 --> 0:39:57.916
<v Speaker 1>was horrifying to me. I was victim blamed and

0:39:57.996 --> 0:40:01.836
<v Speaker 1>slut shamed, and it's definitely limited the course of where

0:40:01.876 --> 0:40:06.476
<v Speaker 1>I can go in terms of career and employment. Noelle

0:40:06.596 --> 0:40:09.796
<v Speaker 1>finished a degree in law and began campaigning to

0:40:09.876 --> 0:40:14.716
<v Speaker 1>criminalize this sort of content. My advocacy and my activism

0:40:14.876 --> 0:40:17.756
<v Speaker 1>started off because I had a lived experience of this,

0:40:17.916 --> 0:40:21.596
<v Speaker 1>and I experienced it at a time where it wasn't

0:40:21.636 --> 0:40:28.516
<v Speaker 1>criminalized in Australia, the distribution of altered intimate images or

0:40:28.676 --> 0:40:34.156
<v Speaker 1>altered intimate videos, and so I had to petition, meet

0:40:34.196 --> 0:40:38.236
<v Speaker 1>with my politicians in my area. I wrote a number

0:40:38.276 --> 0:40:41.036
<v Speaker 1>of articles, I spoke to the media, and I was

0:40:41.516 --> 0:40:45.156
<v Speaker 1>involved in the law reform in Australia in a number

0:40:45.196 --> 0:40:49.116
<v Speaker 1>of jurisdictions in Western Australia and New South Wales, and

0:40:49.516 --> 0:40:53.356
<v Speaker 1>I ended up being involved in two press conferences with

0:40:53.436 --> 0:40:57.596
<v Speaker 1>the Attorney generals of each state at the announcement of

0:40:57.636 --> 0:41:03.556
<v Speaker 1>the law that was criminalizing this abuse. Today, in part

0:41:03.636 --> 0:41:07.156
<v Speaker 1>because of Noelle's activism, it is illegal in Australia to

0:41:07.196 --> 0:41:11.676
<v Speaker 1>distribute intimate images without consent, including intimate images and

0:41:11.836 --> 0:41:15.836
<v Speaker 1>videos that have been altered. Although it doesn't encompass all

0:41:15.996 --> 0:41:24.676
<v Speaker 1>malicious synthetic media, Noelle has made a solid start. Chapter six:

0:41:25.236 --> 0:41:30.836
<v Speaker 1>Scissors and Glue. The videos depicting Noelle Martin were nowhere

0:41:30.956 --> 0:41:34.876
<v Speaker 1>near as sophisticated as those made by the Moon disaster team.

0:41:35.356 --> 0:41:39.436
<v Speaker 1>They were more cheap fakes than deep fakes, and yet

0:41:39.556 --> 0:41:41.996
<v Speaker 1>the porn didn't have to be perfect to be devastating.

0:41:42.756 --> 0:41:45.756
<v Speaker 1>The same turns out to be true in politics. To

0:41:45.836 --> 0:41:50.516
<v Speaker 1>understand the power of fakes, you have to understand human psychology.

0:41:50.636 --> 0:41:53.316
<v Speaker 1>It turns out that people are pretty easy to fool.

0:41:53.916 --> 0:41:56.676
<v Speaker 1>John Kerry was running for president of the US.

0:41:57.116 --> 0:42:01.076
<v Speaker 1>His stance on the Vietnam War was controversial. Jane Fonda,

0:42:01.076 --> 0:42:03.636
<v Speaker 1>of course, was a very controversial figure back then because

0:42:03.636 --> 0:42:06.596
<v Speaker 1>of her anti war stand. What have we become as

0:42:06.596 --> 0:42:08.556
<v Speaker 1>a nation if we call the men heroes that were

0:42:08.636 --> 0:42:10.516
<v Speaker 1>used by the Pentagon to try to exterminate

0:42:10.516 --> 0:42:12.476
<v Speaker 1>an entire people? What business have we to try to

0:42:12.476 --> 0:42:15.156
<v Speaker 1>exterminate a people? And somebody had created a photo of

0:42:15.156 --> 0:42:17.356
<v Speaker 1>the two of them sharing a stage at an anti

0:42:17.396 --> 0:42:21.196
<v Speaker 1>war rally with the hopes of damaging the Kerry campaign.

0:42:21.316 --> 0:42:23.756
<v Speaker 1>The photo was fake. They had never shared a stage together.

0:42:24.196 --> 0:42:26.756
<v Speaker 1>They just took two images, probably put it into some

0:42:26.796 --> 0:42:30.196
<v Speaker 1>standard photo editing software like Photoshop, and just put

0:42:30.196 --> 0:42:32.956
<v Speaker 1>a headline around it, and out to the world it went.

0:42:33.316 --> 0:42:35.916
<v Speaker 1>And I will tell you I remember the most fascinating

0:42:36.076 --> 0:42:39.156
<v Speaker 1>interview I've heard in a long time was right after

0:42:39.476 --> 0:42:42.956
<v Speaker 1>the election, Kerry, of course, lost, and a voter was

0:42:43.036 --> 0:42:45.916
<v Speaker 1>being interviewed and asked how they voted, and he said

0:42:45.956 --> 0:42:48.196
<v Speaker 1>he couldn't vote for Kerry, and the interviewer said, well

0:42:48.196 --> 0:42:50.956
<v Speaker 1>why not? And the gentleman said, I couldn't get that

0:42:50.996 --> 0:42:53.236
<v Speaker 1>photo of John Kerry and Jane Fonda out of my head.

0:42:53.476 --> 0:42:56.196
<v Speaker 1>And the interviewer said, well, you know that photo is fake,

0:42:56.436 --> 0:42:59.316
<v Speaker 1>and the guy said, much to my surprise, yes, but

0:42:59.396 --> 0:43:02.036
<v Speaker 1>I couldn't get it out of my mind. And this

0:43:02.156 --> 0:43:05.196
<v Speaker 1>shows you the power of visual imagery. Like, even

0:43:05.236 --> 0:43:07.916
<v Speaker 1>after I tell you something is fake, it still had

0:43:07.916 --> 0:43:10.836
<v Speaker 1>an impact on somebody. And I thought, wow, we're in

0:43:10.876 --> 0:43:14.156
<v Speaker 1>a lot of trouble because it's very very hard to

0:43:14.196 --> 0:43:16.476
<v Speaker 1>put the cat back into the bag. Once that content

0:43:16.596 --> 0:43:20.636
<v Speaker 1>is out there, you can't undo it. So seeing is believing,

0:43:21.076 --> 0:43:23.916
<v Speaker 1>even above thinking, Yeah, that seems to be the rule.

0:43:24.276 --> 0:43:27.356
<v Speaker 1>There is very good evidence from the social science literature

0:43:27.436 --> 0:43:30.556
<v Speaker 1>that it's very very difficult to correct the record after

0:43:30.596 --> 0:43:34.436
<v Speaker 1>the mistakes are out there. Law professor Danielle Citron also

0:43:34.516 --> 0:43:38.276
<v Speaker 1>notes that humans tend to pass on information without thinking,

0:43:38.796 --> 0:43:43.956
<v Speaker 1>which triggers what she calls information cascades. Information cascades is

0:43:43.996 --> 0:43:47.356
<v Speaker 1>a phenomenon where we have so much information overload that

0:43:47.476 --> 0:43:50.236
<v Speaker 1>when someone sends us something, some information, and we trust

0:43:50.276 --> 0:43:52.676
<v Speaker 1>that person, we pass it on. We don't even check

0:43:52.836 --> 0:43:57.116
<v Speaker 1>its veracity, and so information can go viral fairly quickly

0:43:57.756 --> 0:44:02.476
<v Speaker 1>because we're not terribly reflective, because we act on impulse.

0:44:03.676 --> 0:44:07.276
<v Speaker 1>Danielle says that information cascades have been given new life

0:44:07.316 --> 0:44:10.836
<v Speaker 1>in the twenty first century through social media. Think about

0:44:10.836 --> 0:44:13.516
<v Speaker 1>the twentieth century phenomenon where we get most of our

0:44:13.556 --> 0:44:20.476
<v Speaker 1>information from trusted sources, trusted newspapers, trusted major couple of

0:44:20.636 --> 0:44:22.756
<v Speaker 1>TV channels. Growing up, we only had two, you know,

0:44:22.796 --> 0:44:27.036
<v Speaker 1>we didn't have a million, and they were adhering to

0:44:27.116 --> 0:44:31.156
<v Speaker 1>journalistic ethics and commitments to truth and neutrality and notion

0:44:31.236 --> 0:44:34.756
<v Speaker 1>that you can't publish something without checking it. Now we

0:44:35.636 --> 0:44:38.236
<v Speaker 1>are publishing information that most people say. We're relying on

0:44:38.276 --> 0:44:42.236
<v Speaker 1>our peers and our friends. Social media platforms are designed

0:44:42.636 --> 0:44:46.516
<v Speaker 1>to tailor our information diet to what we want and

0:44:46.636 --> 0:44:49.356
<v Speaker 1>to our pre-existing views, so we're locked in a

0:44:49.436 --> 0:44:53.076
<v Speaker 1>digital echo chamber. We think everybody agrees with us. We

0:44:53.236 --> 0:44:57.316
<v Speaker 1>pass on that information we haven't checked the veracity, it

0:44:57.356 --> 0:45:00.476
<v Speaker 1>goes wild. And we're especially likely to pass it on

0:45:00.836 --> 0:45:04.196
<v Speaker 1>if it's negative and novel. Why's that? It's just like

0:45:04.436 --> 0:45:07.956
<v Speaker 1>it's one of our weaknesses. We know how gossip goes

0:45:07.956 --> 0:45:12.116
<v Speaker 1>like wildfire online. So, like, Hillary Clinton is running a

0:45:12.916 --> 0:45:16.716
<v Speaker 1>sex ring. That's crazy. Oh my god, Eric, did you

0:45:16.756 --> 0:45:19.316
<v Speaker 1>hear about that? I'll post it on Facebook. Eric, you

0:45:19.396 --> 0:45:22.876
<v Speaker 1>pass it on. We just can't help ourselves. And it

0:45:22.956 --> 0:45:25.836
<v Speaker 1>is much in the way that we love sweets and

0:45:25.956 --> 0:45:30.036
<v Speaker 1>fats and pizza. You know, we indulge. We don't think.

0:45:30.916 --> 0:45:34.636
<v Speaker 1>In some sense, this phenomenon is an old phenomenon, right,

0:45:34.676 --> 0:45:39.316
<v Speaker 1>there's the famous observation by Mark Twain about how a

0:45:39.396 --> 0:45:42.196
<v Speaker 1>lie gets halfway around the world before the truth gets

0:45:42.236 --> 0:45:43.756
<v Speaker 1>its pants on. Yeah, the truth is still in the

0:45:43.756 --> 0:45:47.796
<v Speaker 1>bedroom getting dressed, and we often will see the lie,

0:45:47.996 --> 0:45:53.476
<v Speaker 1>but the rebuttal is not seen. It's often lost in

0:45:53.556 --> 0:45:57.276
<v Speaker 1>the noise of the defamatory statements. That is not new.

0:45:57.756 --> 0:46:00.996
<v Speaker 1>But what is new is a number of things about

0:46:00.996 --> 0:46:10.356
<v Speaker 1>our information ecosystem are force multipliers. Chapter seven, Truth Decay.

0:46:15.116 --> 0:46:18.676
<v Speaker 1>Many experts are worried that the rapid advances in making fakes,

0:46:19.116 --> 0:46:23.716
<v Speaker 1>combined with the catalyst of information cascades, will undermine democracy.

0:46:24.356 --> 0:46:29.276
<v Speaker 1>The biggest concerns have focused on elections. Globally, we are

0:46:29.316 --> 0:46:35.676
<v Speaker 1>looking at highly polarized situations where this kind of manipulated

0:46:35.756 --> 0:46:37.796
<v Speaker 1>media can be used as a weapon. One of the

0:46:37.836 --> 0:46:41.756
<v Speaker 1>main reasons Francesca and Halsey made their Nixon deep fake

0:46:42.356 --> 0:46:46.076
<v Speaker 1>was to spread awareness about the risks of misinformation campaigns

0:46:46.556 --> 0:46:50.756
<v Speaker 1>before the twenty twenty US presidential election. Similarly, a group

0:46:50.796 --> 0:46:54.116
<v Speaker 1>showcased the power of deep fakes by making videos in

0:46:54.116 --> 0:46:57.516
<v Speaker 1>the run up to the UK parliamentary election showing the

0:46:57.556 --> 0:47:02.396
<v Speaker 1>two bitter rivals Boris Johnson and Jeremy Corbyn, each endorsing

0:47:02.396 --> 0:47:05.596
<v Speaker 1>the other. I wish to rise above this divide and

0:47:05.756 --> 0:47:09.436
<v Speaker 1>endorse my worthy opponent, the Right Honorable Jeremy Corbyn, to

0:47:09.716 --> 0:47:13.796
<v Speaker 1>be Prime Minister of our United Kingdom. Back Boris Johnson

0:47:13.876 --> 0:47:16.916
<v Speaker 1>to continue as our Prime Minister. But you know what,

0:47:17.676 --> 0:47:20.156
<v Speaker 1>don't listen to me. I think I may be one

0:47:20.196 --> 0:47:23.636
<v Speaker 1>of the thousands of deep fakes on the Internet, using

0:47:23.756 --> 0:47:28.156
<v Speaker 1>powerful technologies to tell stories that aren't so. This just

0:47:28.236 --> 0:47:34.076
<v Speaker 1>kind of indicates how candidates and political figures can be misrepresented,

0:47:34.636 --> 0:47:38.956
<v Speaker 1>and you just need to feed them into people's social

0:47:38.956 --> 0:47:41.876
<v Speaker 1>media feeds for them to be seeing this at times

0:47:41.876 --> 0:47:44.956
<v Speaker 1>when the stakes are pretty high. So far, we haven't

0:47:45.036 --> 0:47:49.716
<v Speaker 1>yet seen sophisticated deep fakes in US or UK politics.

0:47:49.756 --> 0:47:53.156
<v Speaker 1>That might be because fakes will be most effective if

0:47:53.156 --> 0:47:56.916
<v Speaker 1>they're timed for maximum chaos, say close to election day,

0:47:56.956 --> 0:48:00.356
<v Speaker 1>when newsrooms won't have the time to investigate and debunk them.

0:48:01.076 --> 0:48:04.756
<v Speaker 1>But another reason might be that, well, cheap fakes made

0:48:04.756 --> 0:48:09.676
<v Speaker 1>with basic video editing software are actually pretty effective. Remember

0:48:09.716 --> 0:48:13.116
<v Speaker 1>the video that surfaced of House Speaker Nancy Pelosi, in

0:48:13.116 --> 0:48:17.356
<v Speaker 1>which she appeared intoxicated and confused? We want to give

0:48:17.436 --> 0:48:25.396
<v Speaker 1>this president the opportunity to do something historic for our country.

0:48:25.956 --> 0:48:29.276
<v Speaker 1>Both President Trump and Rudy Giuliani shared the video as

0:48:29.356 --> 0:48:32.636
<v Speaker 1>fact on Twitter. The video is just a cheap fake,

0:48:32.996 --> 0:48:37.036
<v Speaker 1>just slowed down Pelosi's speech to make her seem incompetent.

0:48:37.436 --> 0:48:42.196
<v Speaker 1>But maybe elections won't be the biggest targets. Some people

0:48:42.276 --> 0:48:47.356
<v Speaker 1>worry that deep fakes could be weaponized to foment international conflict.

0:48:47.876 --> 0:48:50.796
<v Speaker 1>Berkeley professor Hany Farid has been working with the US

0:48:50.836 --> 0:48:55.356
<v Speaker 1>government's Media Forensics program to address this issue. DARPA,

0:48:55.436 --> 0:48:58.236
<v Speaker 1>the Defense Department's Research arm, has been pouring a lot

0:48:58.276 --> 0:49:00.516
<v Speaker 1>of money over the last five years into this program.

0:49:01.116 --> 0:49:05.636
<v Speaker 1>They are very concerned about how this technology can be

0:49:05.876 --> 0:49:08.876
<v Speaker 1>a threat to national security and also how when we

0:49:08.876 --> 0:49:11.156
<v Speaker 1>get images and videos from around the world in areas

0:49:11.156 --> 0:49:12.916
<v Speaker 1>of conflict, do we know if they're real or not?

0:49:13.436 --> 0:49:15.556
<v Speaker 1>Is this really an image of a US soldier who

0:49:15.596 --> 0:49:19.196
<v Speaker 1>has been taken hostage? How do we know? So? What

0:49:19.276 --> 0:49:22.196
<v Speaker 1>do you see as some of the worst case scenarios.

0:49:22.556 --> 0:49:24.396
<v Speaker 1>Here's the thing that keeps me up at night, right:

0:49:25.116 --> 0:49:27.996
<v Speaker 1>a video of Donald Trump saying I've launched nuclear weapons

0:49:27.996 --> 0:49:31.396
<v Speaker 1>against Iran, and before anybody gets around to figuring out

0:49:31.396 --> 0:49:33.076
<v Speaker 1>whether this is real or not, we have global

0:49:33.196 --> 0:49:35.956
<v Speaker 1>nuclear meltdown. And here's the thing. I don't think that

0:49:35.956 --> 0:49:39.916
<v Speaker 1>that's likely, but I also don't think that the probability

0:49:39.956 --> 0:49:43.836
<v Speaker 1>of that is zero. And that should worry us because

0:49:44.196 --> 0:49:49.196
<v Speaker 1>while it's not likely, the consequences are spectacularly bad. Lawyer

0:49:49.276 --> 0:49:54.516
<v Speaker 1>Danielle Citron worries about an even more plausible scenario. Imagine

0:49:54.556 --> 0:49:58.356
<v Speaker 1>a deep fake of a well known American general burning

0:49:58.356 --> 0:50:02.596
<v Speaker 1>a Koran, and it is timed at a very tense

0:50:02.716 --> 0:50:09.556
<v Speaker 1>moment in a particular Muslim country, whether it's Afghanistan.

0:50:09.796 --> 0:50:13.316
<v Speaker 1>It could then lead to physical violence. And you think

0:50:13.356 --> 0:50:16.836
<v Speaker 1>this could be made? No general, no Koran actually used

0:50:16.876 --> 0:50:20.476
<v Speaker 1>in the video, just programmed? You can use the technology

0:50:21.076 --> 0:50:24.316
<v Speaker 1>to mine existing photographs. Kind of easy, especially with someone

0:50:24.476 --> 0:50:27.996
<v Speaker 1>could take Jim Mattis when he was our defense secretary.

0:50:27.996 --> 0:50:31.076
<v Speaker 1>Of Jim Mattis, you know, actually taking a Koran and

0:50:31.156 --> 0:50:33.836
<v Speaker 1>ripping it in half and say all Muslims should die.

0:50:34.396 --> 0:50:40.116
<v Speaker 1>Imagine the chaos in diplomacy, the chaos of our soldiers

0:50:40.116 --> 0:50:46.156
<v Speaker 1>abroad in Muslim countries. It would be inciting violence without question. While

0:50:46.196 --> 0:50:50.116
<v Speaker 1>we haven't yet seen spectacular fake videos used to disrupt

0:50:50.196 --> 0:50:55.636
<v Speaker 1>elections or create international chaos, we have seen increasingly sophisticated

0:50:55.676 --> 0:51:00.116
<v Speaker 1>attacks on public policymaking. So we've got an example in

0:51:00.276 --> 0:51:04.916
<v Speaker 1>twenty seventeen where the FCC solicited public comment on the

0:51:04.956 --> 0:51:09.436
<v Speaker 1>proposal to repeal net neutrality. Net neutrality is the principle

0:51:09.476 --> 0:51:13.516
<v Speaker 1>that Internet service providers should be a neutral public utility.

0:51:14.116 --> 0:51:18.996
<v Speaker 1>They shouldn't discriminate between websites, say slowing down Netflix streaming

0:51:19.276 --> 0:51:22.716
<v Speaker 1>to encourage you to purchase a different online video service.

0:51:23.556 --> 0:51:27.476
<v Speaker 1>As President Barack Obama described in twenty fourteen, there are

0:51:27.476 --> 0:51:30.916
<v Speaker 1>no gatekeepers deciding which sites you get to access. There

0:51:30.916 --> 0:51:34.276
<v Speaker 1>are no toll roads on the information superhighway. Federal

0:51:34.316 --> 0:51:39.676
<v Speaker 1>communications policy had long supported net neutrality, but in twenty seventeen,

0:51:40.076 --> 0:51:44.396
<v Speaker 1>the Trump administration favored repealing the policy. There were twenty

0:51:44.396 --> 0:51:48.996
<v Speaker 1>two million comments that the FCC received, but ninety

0:51:49.036 --> 0:51:53.956
<v Speaker 1>six percent of those were actually fake. The interesting thing

0:51:54.196 --> 0:51:58.676
<v Speaker 1>is the real comments were opposed to repeal, whereas the

0:51:58.756 --> 0:52:02.476
<v Speaker 1>fake comments were in favor. A Wall Street Journal investigation

0:52:02.716 --> 0:52:07.076
<v Speaker 1>exposed that the fake public comments were generated by bots.

0:52:07.716 --> 0:52:11.196
<v Speaker 1>It found similar problems with public comments about payday lending.

0:52:12.076 --> 0:52:16.236
<v Speaker 1>The bots varied their comments in a combinatorial fashion so

0:52:16.276 --> 0:52:20.036
<v Speaker 1>that the content wasn't identical. With a little sleuthing, though,

0:52:20.076 --> 0:52:23.516
<v Speaker 1>you could see that they were generated by computers. But

0:52:23.556 --> 0:52:28.076
<v Speaker 1>with technology increasingly able to generate completely original writing,

0:52:28.636 --> 0:52:32.036
<v Speaker 1>like OpenAI's program that wrote the story about unicorns

0:52:32.036 --> 0:52:35.316
<v Speaker 1>in the Andes, it's going to become hard to spot

0:52:35.356 --> 0:52:38.836
<v Speaker 1>the fakes. So there was this Harvard student, Max Weiss,

0:52:38.876 --> 0:52:42.476
<v Speaker 1>who used GPT-2 to kind of demonstrate this. And

0:52:42.556 --> 0:52:44.516
<v Speaker 1>I went on his site yesterday and he's got this

0:52:44.556 --> 0:52:49.516
<v Speaker 1>little test where you need to decide whether a comment

0:52:50.156 --> 0:52:53.036
<v Speaker 1>is real or fake. So you go on and you

0:52:53.076 --> 0:52:55.356
<v Speaker 1>read it, and you decide whether it's been written by

0:52:55.356 --> 0:52:58.316
<v Speaker 1>a bot or by a human. So I did this,

0:52:58.476 --> 0:53:02.196
<v Speaker 1>and the ones that seemed to be really well written

0:53:02.236 --> 0:53:05.036
<v Speaker 1>and quite narrative and discursive, generally, I was picking them

0:53:05.036 --> 0:53:07.596
<v Speaker 1>as human. I was wrong almost all the time. It

0:53:07.716 --> 0:53:12.116
<v Speaker 1>was amazing and alarming. In our democracy, public comments

0:53:12.276 --> 0:53:14.756
<v Speaker 1>have been an important way in which citizens can make

0:53:14.796 --> 0:53:19.076
<v Speaker 1>their voices heard. But now it's becoming easy to drown

0:53:19.116 --> 0:53:23.356
<v Speaker 1>out those voices with millions of fake opinions. Now the

0:53:23.436 --> 0:53:27.076
<v Speaker 1>downfall of truth likely won't come with a bang, but

0:53:27.196 --> 0:53:33.276
<v Speaker 1>a whimper, a slow, steady erosion that some call truth decay.

0:53:33.476 --> 0:53:35.556
<v Speaker 1>If you can't believe anything you read, or hear or

0:53:35.556 --> 0:53:37.996
<v Speaker 1>see anymore, I don't know how you have a democracy,

0:53:38.036 --> 0:53:40.796
<v Speaker 1>and I don't know, frankly, how we have civilized society

0:53:41.156 --> 0:53:43.476
<v Speaker 1>if everybody's going to live in an echo chamber, believing

0:53:43.476 --> 0:53:46.116
<v Speaker 1>their own version of events. How do we have a

0:53:46.116 --> 0:53:49.476
<v Speaker 1>dialogue if we can't agree on basic facts. In the end,

0:53:49.756 --> 0:53:53.476
<v Speaker 1>the most insidious impact of deep fakes may not be

0:53:53.516 --> 0:53:57.396
<v Speaker 1>the deep fake content itself, but the ability to claim

0:53:57.476 --> 0:54:01.596
<v Speaker 1>that real content is fake. It's something that Danielle Citron

0:54:01.836 --> 0:54:06.236
<v Speaker 1>refers to as the liar's dividend. The liar's dividend is

0:54:06.276 --> 0:54:08.876
<v Speaker 1>that the more you educate people about the phenomenon of

0:54:08.916 --> 0:54:13.436
<v Speaker 1>deep fakes, the more the wrongdoer can disclaim reality. Think

0:54:13.476 --> 0:54:17.596
<v Speaker 1>about what President Trump did with the Access Hollywood tape.

0:54:17.916 --> 0:54:19.836
<v Speaker 1>You know, I'm automatically attracted to beautiful.

0:54:19.876 --> 0:54:21.916
<v Speaker 1>I just start kissing them. It's like a magnet. You

0:54:22.036 --> 0:54:24.956
<v Speaker 1>just... I don't even wait. And when you're a star,

0:54:25.036 --> 0:54:27.276
<v Speaker 1>they let you do it. You can do anything whatever

0:54:27.276 --> 0:54:32.516
<v Speaker 1>you want. Grab them by the... You can do anything. Initially,

0:54:32.756 --> 0:54:36.796
<v Speaker 1>Trump apologized for the remarks. Anyone who knows me knows

0:54:36.876 --> 0:54:40.716
<v Speaker 1>these words don't reflect who I am. I said it.

0:54:41.196 --> 0:54:45.276
<v Speaker 1>I was wrong, and I apologize. But in twenty seventeen,

0:54:45.796 --> 0:54:49.356
<v Speaker 1>a year after his initial apology and with the idea

0:54:49.396 --> 0:54:53.636
<v Speaker 1>of deep fake content starting to gain attention, Trump changed

0:54:53.716 --> 0:54:56.836
<v Speaker 1>his tune. Upon reflection, he said, they're not real. That

0:54:56.956 --> 0:54:59.396
<v Speaker 1>wasn't me. I don't think that was my voice. That's

0:54:59.396 --> 0:55:03.836
<v Speaker 1>the liar's dividend in practice. The Trump comment about Access

0:55:03.876 --> 0:55:09.036
<v Speaker 1>Hollywood was remarkable. Slightly more subtle than that. He said,

0:55:09.556 --> 0:55:12.356
<v Speaker 1>I'm not sure that was me. Right. Well, that's the

0:55:12.396 --> 0:55:27.396
<v Speaker 1>corrosive gaslighting. Chapter eight, A Life Stored in the Cloud.

0:55:29.356 --> 0:55:33.676
<v Speaker 1>Deep fakes have the potential to devastate individuals and harm society.

0:55:34.276 --> 0:55:38.116
<v Speaker 1>The question is can we stop them from spreading before

0:55:38.116 --> 0:55:41.396
<v Speaker 1>they get out of control. To do so, we need

0:55:41.476 --> 0:55:45.596
<v Speaker 1>reliable ways to spot deep fakes. So the good news

0:55:45.716 --> 0:55:48.596
<v Speaker 1>is there are still artifacts in the synthesized content, whether

0:55:48.596 --> 0:55:51.196
<v Speaker 1>those are images, audio, or a video, that we as

0:55:51.236 --> 0:55:54.196
<v Speaker 1>the experts, can tell apart. So when for example, the

0:55:54.196 --> 0:55:56.396
<v Speaker 1>New York Times wants to run a story with a video,

0:55:56.956 --> 0:55:59.676
<v Speaker 1>we can help them validate it. What are the real

0:55:59.756 --> 0:56:03.996
<v Speaker 1>sophisticated experts looking at? So the eyes are really wonderful

0:56:04.156 --> 0:56:07.876
<v Speaker 1>forensically because they reflect back to you what is in

0:56:07.916 --> 0:56:11.356
<v Speaker 1>the scene. I'm sitting now right now in a studio.

0:56:11.476 --> 0:56:13.836
<v Speaker 1>There's maybe about a dozen or so lights around me,

0:56:13.876 --> 0:56:16.236
<v Speaker 1>and you can see this very complex set of reflections

0:56:16.236 --> 0:56:20.276
<v Speaker 1>in my eyes. So we can analyze fairly complex lighting patterns,

0:56:20.276 --> 0:56:23.236
<v Speaker 1>for example, to determine if this is one person's head

0:56:23.276 --> 0:56:26.236
<v Speaker 1>spliced onto another person's body, or if the two people

0:56:26.236 --> 0:56:30.596
<v Speaker 1>standing next to each other were digitally inserted from another photograph.

0:56:30.676 --> 0:56:33.316
<v Speaker 1>I could spend another hour telling you about the many

0:56:33.316 --> 0:56:36.796
<v Speaker 1>different forensic techniques that we've developed. There's no silver bullet here.

0:56:37.196 --> 0:56:40.116
<v Speaker 1>Really is a sort of a time consuming and deliberate

0:56:40.156 --> 0:56:43.476
<v Speaker 1>and thoughtful and it requires many many tools, and it

0:56:43.516 --> 0:56:45.636
<v Speaker 1>requires people with a fair amount of skill to do this.

0:56:46.156 --> 0:56:49.436
<v Speaker 1>Hany Farid also has quite a few detection techniques that

0:56:49.516 --> 0:56:52.076
<v Speaker 1>he won't speak about publicly for fear that deep

0:56:52.116 --> 0:56:55.316
<v Speaker 1>fake creators will learn how to beat his tests. I

0:56:55.316 --> 0:56:57.636
<v Speaker 1>don't create a GitHub repository and give my code to

0:56:57.676 --> 0:57:00.956
<v Speaker 1>all my adversaries. I don't have just one forensic technique.

0:57:01.036 --> 0:57:03.796
<v Speaker 1>I have a couple dozen of them. So that means you,

0:57:04.036 --> 0:57:06.276
<v Speaker 1>as the person creating this now have to go back

0:57:06.316 --> 0:57:09.636
<v Speaker 1>and implement twenty different techniques. You have to do it

0:57:09.716 --> 0:57:12.516
<v Speaker 1>just perfectly, and that makes the landscape a little bit

0:57:12.556 --> 0:57:15.756
<v Speaker 1>more tricky for you to manage. As technology makes it

0:57:15.796 --> 0:57:19.236
<v Speaker 1>easier to create deep fakes, a big problem will be

0:57:19.276 --> 0:57:22.836
<v Speaker 1>the sheer amounts of content to review. So the average

0:57:22.876 --> 0:57:26.476
<v Speaker 1>person can download software repositories, and so it's getting to

0:57:26.476 --> 0:57:29.756
<v Speaker 1>the point now where the average person can just run

0:57:29.796 --> 0:57:32.196
<v Speaker 1>these as if they're running any standard piece of software.

0:57:32.236 --> 0:57:35.116
<v Speaker 1>There's also websites that have popped up where you can

0:57:35.116 --> 0:57:37.556
<v Speaker 1>pay them twenty bucks and you tell them, please put

0:57:37.596 --> 0:57:39.956
<v Speaker 1>this person's face into this person's video, and they will

0:57:39.956 --> 0:57:42.396
<v Speaker 1>do that for you. And so it doesn't take a

0:57:42.396 --> 0:57:44.916
<v Speaker 1>lot to get access to these tools. Now, I will

0:57:44.956 --> 0:57:47.236
<v Speaker 1>say that the output of those are not quite as

0:57:47.276 --> 0:57:49.756
<v Speaker 1>good as what we can create inside the lab. And

0:57:49.796 --> 0:57:51.556
<v Speaker 1>you just know what the trend is. You just know

0:57:51.596 --> 0:57:53.876
<v Speaker 1>it's going to get better and cheaper and faster and

0:57:53.956 --> 0:57:57.196
<v Speaker 1>easier to use. Detecting deep fakes will be a never

0:57:57.396 --> 0:58:02.236
<v Speaker 1>ending cat and mouse game. Remember how generative adversarial networks

0:58:02.316 --> 0:58:06.716
<v Speaker 1>or GANs, are built by training a fake generator to

0:58:06.796 --> 0:58:12.316
<v Speaker 1>outsmart a detector. Well, as detectors get better, fake generators

0:58:12.396 --> 0:58:16.956
<v Speaker 1>will be trained to keep pace. Still, detectives like Hany

0:58:17.036 --> 0:58:21.116
<v Speaker 1>and platforms like Facebook are working to develop automated ways

0:58:21.156 --> 0:58:25.876
<v Speaker 1>to spot deep fakes rapidly and reliably. That's important because

0:58:25.916 --> 0:58:30.036
<v Speaker 1>more than five hundred additional hours of video are being

0:58:30.116 --> 0:58:33.756
<v Speaker 1>uploaded to YouTube every minute. I don't mean to sound

0:58:33.796 --> 0:58:36.796
<v Speaker 1>defeatist about this, but I'm going to lose this war.

0:58:37.036 --> 0:58:39.916
<v Speaker 1>I know this because it's always going to be easier

0:58:39.916 --> 0:58:42.196
<v Speaker 1>to create content than it is to detect it. But

0:58:43.156 --> 0:58:45.276
<v Speaker 1>here's where I will win. I will take it out

0:58:45.316 --> 0:58:48.076
<v Speaker 1>of the hands of the average person. So think about,

0:58:48.156 --> 0:58:51.716
<v Speaker 1>for example, the creation of counterfeit currency. With the latest

0:58:51.836 --> 0:58:55.156
<v Speaker 1>innovations brought on by the Treasury Department, it is hard

0:58:55.156 --> 0:58:57.356
<v Speaker 1>for the average person to take their inkjet printer and

0:58:57.396 --> 0:59:00.756
<v Speaker 1>create compelling fake currency, and I think that's going to

0:59:00.836 --> 0:59:03.196
<v Speaker 1>be the same trend here is that if you're using

0:59:03.196 --> 0:59:05.396
<v Speaker 1>some off the shelf tool, if you're paying somebody on

0:59:05.396 --> 0:59:07.156
<v Speaker 1>the website, we're going to find you, and we're going

0:59:07.196 --> 0:59:09.556
<v Speaker 1>to find you quickly. But if you are dedicated,

0:59:09.716 --> 0:59:12.476
<v Speaker 1>highly skilled, have the time and the effort to create it,

0:59:12.756 --> 0:59:14.596
<v Speaker 1>we are going to have to work really hard to

0:59:14.596 --> 0:59:19.276
<v Speaker 1>detect those. Given the challenges of detecting fake content, some

0:59:19.356 --> 0:59:23.236
<v Speaker 1>people envision a different kind of techno fix. They propose

0:59:23.316 --> 0:59:27.876
<v Speaker 1>developing airtight ways for content creators to mark their own

0:59:27.996 --> 0:59:33.436
<v Speaker 1>original video as real. That way, we could instantly recognize

0:59:33.476 --> 0:59:36.916
<v Speaker 1>an altered version if it wasn't identical. Now, there's ways

0:59:36.916 --> 0:59:39.476
<v Speaker 1>of authenticating at the point of recording, and these are

0:59:39.516 --> 0:59:43.036
<v Speaker 1>what are called controlled capture systems. So here's the idea.

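NOTE A minimal sketch, not part of the episode: the controlled capture idea described in this passage (sign the recording at the point of capture, then log that signature in a tamper-evident ledger) might look roughly like the Python below. Every name here is an assumption for illustration: DEVICE_KEY, sign_capture, entry_hash, append_entry, matches_ledger and chain_is_intact are hypothetical, HMAC from the standard library stands in for a real public-key signature, and a hash-chained list stands in for a blockchain.
```python
import hashlib
import hmac
import json
# Hypothetical per-device secret. HMAC is only a stdlib stand-in for a real digital signature.
DEVICE_KEY = b"demo-device-key"
GENESIS = "0" * 64
def sign_capture(media):
    """Hash the media bytes and sign the digest at the moment of capture."""
    digest = hashlib.sha256(media).hexdigest()
    signature = hmac.new(DEVICE_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return {"digest": digest, "signature": signature}
def entry_hash(prev_hash, record):
    """Hash a record together with the previous entry's hash, chaining the entries."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()
def append_entry(ledger, record):
    """Append a record to a toy hash-chained list (a stand-in for a blockchain)."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"prev": prev_hash, "record": record, "hash": entry_hash(prev_hash, record)})
def matches_ledger(ledger, media):
    """Return True if these exact bytes were signed and logged at capture time."""
    signature = sign_capture(media)["signature"]
    return any(hmac.compare_digest(entry["record"]["signature"], signature) for entry in ledger)
def chain_is_intact(ledger):
    """Recompute every hash; editing any earlier entry breaks the chain."""
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev"] != prev_hash or entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
ledger = []
original = b"raw video bytes from the camera"
append_entry(ledger, sign_capture(original))
```
Under these assumptions, changing even one byte of the recording yields a different digest and signature, so an altered copy no longer matches the logged entry.
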
0:59:43.316 --> 0:59:46.396
<v Speaker 1>You use a special app on your mobile device that

0:59:46.476 --> 0:59:50.316
<v Speaker 1>at the point of capture cryptographically signs the image

0:59:50.316 --> 0:59:52.996
<v Speaker 1>or the video or the audio. It puts that signature

0:59:53.036 --> 0:59:54.876
<v Speaker 1>onto the blockchain. And the only thing you have to

0:59:54.876 --> 0:59:57.156
<v Speaker 1>know about the blockchain is that it is an immutable

0:59:57.436 --> 1:00:01.236
<v Speaker 1>distributed ledger, which means that that signature is essentially impossible

1:00:01.276 --> 1:00:05.036
<v Speaker 1>to manipulate, and now all of that happened at the

1:00:05.036 --> 1:00:08.076
<v Speaker 1>point of recording. If I was running a campaign today

1:00:08.076 --> 1:00:12.036
<v Speaker 1>and I was worried about my candidate's likeness being misused, absolutely

1:00:12.116 --> 1:00:14.596
<v Speaker 1>every public event that they were at, I would record

1:00:14.596 --> 1:00:16.396
<v Speaker 1>with a controlled capture system, and I'd be able to

1:00:16.436 --> 1:00:19.676
<v Speaker 1>prove what they actually said or did at any point

1:00:19.716 --> 1:00:22.796
<v Speaker 1>in the future. So this approach would shift the burden

1:00:22.796 --> 1:00:27.196
<v Speaker 1>of authentication to the people creating the videos rather than

1:00:27.276 --> 1:00:31.956
<v Speaker 1>publishers or consumers. Law professor Danielle Citron has explored how

1:00:31.956 --> 1:00:36.116
<v Speaker 1>this solution could quickly become dystopian. We might see the

1:00:36.116 --> 1:00:39.276
<v Speaker 1>emergence of an essentially an audit trail of everything you

1:00:39.396 --> 1:00:41.956
<v Speaker 1>do and say all of the time. Danielle refers to

1:00:41.956 --> 1:00:46.716
<v Speaker 1>the business model as immutable lifelogs in the cloud. In

1:00:46.716 --> 1:00:48.836
<v Speaker 1>a way we've sort of already seen it. There are

1:00:48.876 --> 1:00:51.716
<v Speaker 1>health plans that if you wear a Fitbit all the

1:00:51.756 --> 1:00:54.556
<v Speaker 1>time and you let yourself be monitored, it lowers your

1:00:54.556 --> 1:00:57.956
<v Speaker 1>insurance your health insurance rates. But you can see how

1:00:58.116 --> 1:01:02.076
<v Speaker 1>if the incentives are there in the market to self surveil,

1:01:02.356 --> 1:01:06.956
<v Speaker 1>whether it's for health insurance, life insurance, car insurance, we're

1:01:06.996 --> 1:01:10.716
<v Speaker 1>going to see the unraveling of privacy by ourselves.

1:01:11.076 --> 1:01:15.916
<v Speaker 1>You know, corporations may very well, because the CEO is

1:01:15.956 --> 1:01:20.196
<v Speaker 1>so valuable, they may say, you've got to have a log,

1:01:20.396 --> 1:01:22.556
<v Speaker 1>an immutable audit trail of everything you do and say

1:01:22.596 --> 1:01:24.796
<v Speaker 1>so when that deep fake comes up the night before

1:01:24.836 --> 1:01:29.036
<v Speaker 1>the IPO, you can say, look, the CEO wasn't taking

1:01:29.036 --> 1:01:32.316
<v Speaker 1>the bribe, wasn't having sex with a prostitute. And so

1:01:32.356 --> 1:01:36.076
<v Speaker 1>we have proof, we have an audit trail, we have

1:01:36.076 --> 1:01:39.276
<v Speaker 1>a log. So when we were imagining, we were imagining

1:01:39.276 --> 1:01:43.676
<v Speaker 1>a business model that hasn't quite come up, but we

1:01:43.716 --> 1:01:48.236
<v Speaker 1>have gotten a number of requests from insurance companies as

1:01:48.276 --> 1:01:51.796
<v Speaker 1>well as companies to say we're interested in this idea.

1:01:51.916 --> 1:01:53.556
<v Speaker 1>So how much has to be in that log? Does

1:01:53.556 --> 1:01:55.636
<v Speaker 1>this have to be a whole video of your life?

1:01:55.716 --> 1:01:58.636
<v Speaker 1>That is a great question, one that terrifies us. So

1:01:58.676 --> 1:02:04.156
<v Speaker 1>it may be that you're logging locate geolocation, you're logging videos.

1:02:04.236 --> 1:02:07.116
<v Speaker 1>You see people talking and who they're interacting with, and

1:02:07.196 --> 1:02:09.516
<v Speaker 1>that might be good enough to prevent the mischief

1:02:09.996 --> 1:02:14.556
<v Speaker 1>that would hijack the IPO. Your whole life online? Yes,

1:02:14.756 --> 1:02:19.716
<v Speaker 1>stored securely, locked down, protected in the cloud. It is

1:02:20.116 --> 1:02:22.956
<v Speaker 1>at least for a privacy scholar. There are so many

1:02:22.956 --> 1:02:26.396
<v Speaker 1>reasons why we ought to have privacy that aren't about

1:02:26.516 --> 1:02:31.756
<v Speaker 1>hiding things. It's about creating spaces and managing boundaries around

1:02:31.836 --> 1:02:35.636
<v Speaker 1>ourselves and our intimates and our loved ones. So I

1:02:35.876 --> 1:02:39.876
<v Speaker 1>worry that if we entirely unravel privacy, (a) in the

1:02:39.916 --> 1:02:45.436
<v Speaker 1>wrong hands is very dangerous, right? (b) It changes how

1:02:45.556 --> 1:02:54.316
<v Speaker 1>we think about ourselves and humanity. Chapter nine, Section two thirty.

1:02:55.676 --> 1:02:59.716
<v Speaker 1>So techno fixes are complicated. What about passing laws to

1:03:00.076 --> 1:03:03.076
<v Speaker 1>ban deep fakes or at least deep fakes that don't

1:03:03.116 --> 1:03:07.196
<v Speaker 1>disclose they're fake? So video and audio is speech.

1:03:07.596 --> 1:03:10.436
<v Speaker 1>And our First Amendment doctrine is very much protective

1:03:10.956 --> 1:03:13.956
<v Speaker 1>of free speech, and the Supreme Court has explained that

1:03:14.756 --> 1:03:19.036
<v Speaker 1>lies, just lies themselves without harm, are protected speech. When

1:03:19.116 --> 1:03:21.596
<v Speaker 1>lies cause certain kinds of harm, we can regulate it:

1:03:21.876 --> 1:03:29.716
<v Speaker 1>defamation of private people, threats, incitement, fraud, impersonation of government officials.

1:03:29.876 --> 1:03:35.716
<v Speaker 1>What about lies concerning public figures like politicians? California and Texas,

1:03:35.796 --> 1:03:40.076
<v Speaker 1>for instance, recently passed laws making it illegal to publish

1:03:40.116 --> 1:03:42.876
<v Speaker 1>deep fakes of a candidate in the weeks leading up

1:03:42.916 --> 1:03:46.156
<v Speaker 1>to an election. It's not clear yet whether the laws

1:03:46.196 --> 1:03:50.916
<v Speaker 1>will pass constitutional muster. As you're saying, in an American context,

1:03:51.716 --> 1:03:55.716
<v Speaker 1>we are just not going to be able to outlaw fakes. Yeah,

1:03:55.716 --> 1:03:57.436
<v Speaker 1>we can't have a flat ban, and I don't think

1:03:57.436 --> 1:04:01.396
<v Speaker 1>we should. It would fail on doctrinal grounds, but ultimately

1:04:01.636 --> 1:04:08.796
<v Speaker 1>it would prevent the positive uses. Interestingly, in January twenty twenty, China,

1:04:09.196 --> 1:04:14.036
<v Speaker 1>which has no First Amendment protecting free speech, promulgated regulations

1:04:14.436 --> 1:04:18.516
<v Speaker 1>banning deep fakes. The use of AI or virtual reality now

1:04:18.596 --> 1:04:21.716
<v Speaker 1>needs to be clearly marked in a prominent manner, and

1:04:21.796 --> 1:04:25.156
<v Speaker 1>the failure to do so is considered a criminal offense.

1:04:25.956 --> 1:04:29.036
<v Speaker 1>To explore other options for the US, I went to

1:04:29.076 --> 1:04:32.676
<v Speaker 1>speak with a public policy expert. My name is Joan Donovan,

1:04:32.836 --> 1:04:36.396
<v Speaker 1>and I work at Harvard Kennedy's Shorenstein Center, where I

1:04:36.516 --> 1:04:39.756
<v Speaker 1>lead a team of researchers looking at media manipulation and

1:04:39.836 --> 1:04:43.596
<v Speaker 1>disinformation campaigns. Joan is head of the Technology and Social

1:04:43.676 --> 1:04:48.236
<v Speaker 1>Change Research Project, and her staff studies how social media

1:04:48.596 --> 1:04:52.676
<v Speaker 1>gives rise to hoaxes and scams. Her team is particularly

1:04:52.676 --> 1:04:58.516
<v Speaker 1>interested in precisely how misinformation spreads across the Internet. Ultimately,

1:04:58.636 --> 1:05:01.756
<v Speaker 1>underneath all of this is the distribution mechanism, which is

1:05:02.036 --> 1:05:08.356
<v Speaker 1>social media and platforms, and platforms have to rethink the

1:05:08.756 --> 1:05:12.596
<v Speaker 1>openness of their design because that has now become a

1:05:12.676 --> 1:05:17.516
<v Speaker 1>territory for information warfare. In early twenty twenty, Facebook announced

1:05:17.516 --> 1:05:23.396
<v Speaker 1>a major policy change about synthesized content. Facebook has issued policies

1:05:23.516 --> 1:05:26.556
<v Speaker 1>now on deep fakes, saying that if it is an

1:05:26.596 --> 1:05:31.676
<v Speaker 1>AI generated video and it's misleading in some other contextual way,

1:05:32.476 --> 1:05:37.996
<v Speaker 1>then they will remove it. Interestingly, Facebook banned the Moon

1:05:38.116 --> 1:05:41.436
<v Speaker 1>Disaster Team's Nixon video even though it was made for

1:05:41.596 --> 1:05:45.836
<v Speaker 1>educational purposes, but didn't remove the slowed down version of

1:05:45.956 --> 1:05:50.756
<v Speaker 1>Nancy Pelosi, which was made to mislead the public. Why?

1:05:50.876 --> 1:05:55.676
<v Speaker 1>Because the Pelosi video wasn't created with artificial intelligence. For now,

1:05:56.236 --> 1:05:59.996
<v Speaker 1>Facebook is choosing to target deep fakes, but not cheap fakes.

1:06:00.476 --> 1:06:03.236
<v Speaker 1>One way to push platforms to take a stronger stance

1:06:03.356 --> 1:06:06.196
<v Speaker 1>might be to remove some of the legal protections that

1:06:06.316 --> 1:06:11.196
<v Speaker 1>they currently enjoy. Under Section two thirty of the Communications Decency

1:06:11.276 --> 1:06:16.236
<v Speaker 1>Act, passed in nineteen ninety six, platforms aren't legally liable

1:06:16.476 --> 1:06:20.716
<v Speaker 1>for content posted by their users. The fact that platforms

1:06:20.756 --> 1:06:24.636
<v Speaker 1>have no responsibility for the content they host has an upside.

1:06:25.076 --> 1:06:28.196
<v Speaker 1>It's led to the massive diversity of online content we

1:06:28.316 --> 1:06:32.916
<v Speaker 1>enjoy today. But it also allows a dangerous escalation of

1:06:32.956 --> 1:06:37.356
<v Speaker 1>fake news. Is it time to change Section two thirty

1:06:37.516 --> 1:06:42.236
<v Speaker 1>to create incentives for platforms to police false content? I

1:06:42.316 --> 1:06:45.556
<v Speaker 1>asked the former head of a major platform, LinkedIn

1:06:45.716 --> 1:06:49.476
<v Speaker 1>co-founder Reid Hoffman. For example, let's take my view of

1:06:49.516 --> 1:06:52.636
<v Speaker 1>what the response to the Christchurch shooting should be

1:06:52.876 --> 1:06:55.996
<v Speaker 1>is to say, well, we want you to solve not

1:06:56.156 --> 1:07:02.036
<v Speaker 1>having terrorism, murder, or murderers displayed to people. So we're

1:07:02.036 --> 1:07:04.676
<v Speaker 1>simply going to do a fine of ten thousand dollars

1:07:04.676 --> 1:07:08.836
<v Speaker 1>per view. Two shootings occurred at mosques in Christchurch, New

1:07:08.916 --> 1:07:13.076
<v Speaker 1>Zealand in March twenty nineteen. Graphic videos of the event

1:07:13.516 --> 1:07:17.276
<v Speaker 1>were soon posted online. Five people saw it, that's fifty

1:07:17.276 --> 1:07:20.116
<v Speaker 1>thousand dollars. But if it becomes a meme and a

1:07:20.196 --> 1:07:24.756
<v Speaker 1>million people see it, that's ten billion dollars. Yes, right,

1:07:24.956 --> 1:07:27.516
<v Speaker 1>So what it's really trying to do is get you to say,

1:07:28.076 --> 1:07:30.956
<v Speaker 1>let's make sure that the meme never happens. Okay, So

1:07:30.996 --> 1:07:35.796
<v Speaker 1>that's a governance mechanism there: you fine the channel,

1:07:35.836 --> 1:07:39.196
<v Speaker 1>the platform, based on number of views. It would be a

1:07:39.476 --> 1:07:42.356
<v Speaker 1>very general way to say, now you guys have to solve.

1:07:42.516 --> 1:07:46.396
<v Speaker 1>Now you solve, you figure it out. What about other solutions?

1:07:46.676 --> 1:07:50.076
<v Speaker 1>If we are to make regulation, it should be about

1:07:50.156 --> 1:07:54.436
<v Speaker 1>the amount of staff in proportion to the amount of

1:07:54.556 --> 1:07:58.116
<v Speaker 1>users so that they can get a handle on the content.

1:07:58.596 --> 1:08:02.076
<v Speaker 1>But can they be fast enough? Maybe the viral spread

1:08:02.116 --> 1:08:06.076
<v Speaker 1>should be slowed down enough to allow them to moderate.

1:08:06.156 --> 1:08:10.236
<v Speaker 1>Let's put it this way. The stock market has certain

1:08:10.916 --> 1:08:14.036
<v Speaker 1>governors built in. When there are massive changes in a

1:08:14.076 --> 1:08:17.796
<v Speaker 1>stock price, there are decelerators that kick in, brakes that

1:08:17.996 --> 1:08:20.956
<v Speaker 1>kick in. Should the platforms have brakes that kick in

1:08:21.356 --> 1:08:26.076
<v Speaker 1>before something can go fully viral? So in terms of deceleration,

1:08:26.876 --> 1:08:29.516
<v Speaker 1>there are things that they do already that accelerate the

1:08:29.556 --> 1:08:33.156
<v Speaker 1>process that they need to think differently about, especially when

1:08:33.196 --> 1:08:37.596
<v Speaker 1>it comes to something turning into a trending topic. So

1:08:37.636 --> 1:08:41.876
<v Speaker 1>there needs to be an intervening moment before things get

1:08:41.916 --> 1:08:45.076
<v Speaker 1>to the homepage and get to trending, where there is

1:08:45.076 --> 1:08:49.076
<v Speaker 1>a content review. So much to say here, but I

1:08:49.156 --> 1:08:52.676
<v Speaker 1>want to think particularly about listeners who are in their

1:08:52.716 --> 1:08:56.876
<v Speaker 1>twenties and thirties and are very tech savvy. They're going

1:08:56.956 --> 1:09:00.076
<v Speaker 1>to be part of the solution here. What would you

1:09:00.116 --> 1:09:04.476
<v Speaker 1>say to them about what they can do? I think

1:09:05.156 --> 1:09:11.156
<v Speaker 1>it's important that younger people advocate for the Internet that

1:09:11.196 --> 1:09:13.596
<v Speaker 1>they want. We have to fight for it. We have

1:09:13.716 --> 1:09:18.036
<v Speaker 1>to ask for different things, and that kind of agitation

1:09:18.476 --> 1:09:22.716
<v Speaker 1>can come in the form of posting on the platform,

1:09:22.756 --> 1:09:27.156
<v Speaker 1>writing letters, joining groups like Fight for the Future, and

1:09:27.236 --> 1:09:32.836
<v Speaker 1>trying to work on getting platforms to do better and

1:09:32.916 --> 1:09:35.796
<v Speaker 1>to advocate for the kind of content that you want

1:09:35.836 --> 1:09:40.636
<v Speaker 1>to see more of. The important thing is that our

1:09:40.676 --> 1:09:45.436
<v Speaker 1>society is shaped by these platforms, and so we're not

1:09:45.516 --> 1:09:48.276
<v Speaker 1>going to do away with them, but we don't have

1:09:48.396 --> 1:09:59.396
<v Speaker 1>to make do with them either. Conclusion, choose your planet.

1:10:01.636 --> 1:10:04.596
<v Speaker 1>So there you have it, Stewards of the Brave New Planet.

1:10:05.156 --> 1:10:10.916
<v Speaker 1>Synthetic media, or deep fakes, have been manipulating content for

1:10:11.036 --> 1:10:14.836
<v Speaker 1>more than a hundred years, but recent advances in AI

1:10:14.956 --> 1:10:18.236
<v Speaker 1>have taken it to a whole new level of verisimilitude.

1:10:18.836 --> 1:10:23.596
<v Speaker 1>The technology could transform movies and television: favored actors from

1:10:23.676 --> 1:10:27.316
<v Speaker 1>years past starring in new narratives, along with actors who

1:10:27.356 --> 1:10:31.516
<v Speaker 1>never existed, patients regaining the ability to speak in their

1:10:31.556 --> 1:10:37.316
<v Speaker 1>own voices, personalized stories created on demand for any child

1:10:37.396 --> 1:10:41.396
<v Speaker 1>around the globe, matching their interests, written in their dialect,

1:10:41.756 --> 1:10:46.716
<v Speaker 1>representing their communities. But there's also great potential for harm.

1:10:47.356 --> 1:10:52.276
<v Speaker 1>The ability to cast anyone in a pornographic video, weaponized

1:10:52.356 --> 1:10:56.836
<v Speaker 1>media dropping days before an election, or provoking international conflicts.

1:10:57.716 --> 1:11:00.676
<v Speaker 1>Are we going to be able to tell fact from fiction?

1:11:01.236 --> 1:11:06.956
<v Speaker 1>Will truth survive? And what does it mean for our democracy? Better

1:11:06.956 --> 1:11:10.036
<v Speaker 1>fake detection may help, but it'll be hard for it

1:11:10.076 --> 1:11:13.636
<v Speaker 1>to keep up, and logging our lives in blockchain to

1:11:13.716 --> 1:11:19.996
<v Speaker 1>protect against misrepresentation doesn't sound like an attractive idea. Outright

1:11:20.156 --> 1:11:22.916
<v Speaker 1>bans on deep fakes are being tried in some countries,

1:11:23.236 --> 1:11:26.596
<v Speaker 1>but they're tricky in the US given our constitutional protections

1:11:26.636 --> 1:11:30.516
<v Speaker 1>for free speech. Maybe the best solution is to put

1:11:30.516 --> 1:11:35.116
<v Speaker 1>the liability on platforms like Facebook and YouTube, if we

1:11:35.276 --> 1:11:39.156
<v Speaker 1>can. Joan Donovan's right: to get the future you want,

1:11:39.556 --> 1:11:42.076
<v Speaker 1>you're going to have to fight for it. You don't

1:11:42.116 --> 1:11:44.476
<v Speaker 1>have to be an expert, and you don't have to

1:11:44.476 --> 1:11:47.916
<v Speaker 1>do it alone. When enough people get engaged, we make

1:11:47.956 --> 1:11:52.236
<v Speaker 1>wise choices. Deep fakes are a problem that everyone can

1:11:52.276 --> 1:11:56.116
<v Speaker 1>engage with. Brainstorm with your friends about what should be done.

1:11:56.556 --> 1:12:00.356
<v Speaker 1>Use social media. Tweet at your elected representatives to ask

1:12:00.396 --> 1:12:03.796
<v Speaker 1>if they're working on laws like those in California and Texas.

1:12:04.556 --> 1:12:08.276
<v Speaker 1>And if you work for a tech company, ask yourself

1:12:08.516 --> 1:12:12.316
<v Speaker 1>and your colleagues if you're doing enough. You can find

1:12:12.476 --> 1:12:16.116
<v Speaker 1>lots of resources and ideas at our website Brave New

1:12:16.196 --> 1:12:20.876
<v Speaker 1>Planet dot org. It's time to choose our planet. The

1:12:20.996 --> 1:12:34.916
<v Speaker 1>future is up to us. Brave New Planet is a

1:12:34.956 --> 1:12:37.636
<v Speaker 1>co-production of the Broad Institute of MIT and Harvard,

1:12:37.716 --> 1:12:41.156
<v Speaker 1>Pushkin Industries, and the Boston Globe, with support from the

1:12:41.196 --> 1:12:44.876
<v Speaker 1>Alfred P. Sloan Foundation. Our show is produced by Rebecca

1:12:44.916 --> 1:12:49.196
<v Speaker 1>Lee Douglas with Mary Doo. Theme song composed by Ned Porter,

1:12:49.796 --> 1:12:53.916
<v Speaker 1>mastering and sound design by James Garver, fact checking by

1:12:53.996 --> 1:12:58.316
<v Speaker 1>Joseph Fridman, and a Stitt and Enchant. Special Thanks to

1:12:58.436 --> 1:13:03.076
<v Speaker 1>Christine Heenan and Rachel Roberts at Clarendon Communications, to Lee McGuire,

1:13:03.236 --> 1:13:06.596
<v Speaker 1>Kristen Zarelli and Justine Levin Allerhand at the Broad, to

1:13:06.756 --> 1:13:10.956
<v Speaker 1>Mia Lobel and Heather Fain at Pushkin, and to Eli and

1:13:11.116 --> 1:13:15.196
<v Speaker 1>Edye Broad, who made the Broad Institute possible. This is

1:13:15.276 --> 1:13:17.716
<v Speaker 1>Brave New Planet. I am Eric Lander.