WEBVTT - Ep6 "What will AI mean for artists?"

0:00:05.559 --> 0:00:11.240
<v Speaker 1>Will writers and artists and musicians become unemployed by AI?

0:00:11.920 --> 0:00:15.280
<v Speaker 1>What are the new capabilities that we're seeing all around us,

0:00:15.360 --> 0:00:16.120
<v Speaker 1>and what is this

0:00:16.120 --> 0:00:18.840
<v Speaker 2>going to mean for human creativity?

0:00:19.200 --> 0:00:22.000
<v Speaker 1>And what does this have to do with diamonds and

0:00:22.079 --> 0:00:27.120
<v Speaker 1>Westworld and effort and Frankenstein and Beethoven and the Stark

0:00:27.200 --> 0:00:33.000
<v Speaker 1>family in Game of Thrones? Welcome to Inner Cosmos with

0:00:33.080 --> 0:00:37.639
<v Speaker 1>me David Eagleman. I'm a neuroscientist and an author at

0:00:37.680 --> 0:00:41.480
<v Speaker 1>Stanford University, and in this episode, I get to dive

0:00:41.520 --> 0:00:46.520
<v Speaker 1>into something that's right at the intersection of science and creativity.

0:00:51.479 --> 0:00:55.720
<v Speaker 1>Most of my podcasts are about evergreen topics about our

0:00:55.760 --> 0:00:59.920
<v Speaker 1>brains and our psychology, but there's something so extraordinary happening

0:01:00.440 --> 0:01:01.080
<v Speaker 1>right now.

0:01:01.560 --> 0:01:05.600
<v Speaker 2>We're in the middle of a revolution with AI, and

0:01:05.640 --> 0:01:09.640
<v Speaker 2>what's called generative AI in particular. So I'm going to

0:01:09.720 --> 0:01:12.960
<v Speaker 2>do a two part episode on this. For today, I'm

0:01:12.959 --> 0:01:15.880
<v Speaker 2>going to dig into what generative AI is and what

0:01:16.000 --> 0:01:20.080
<v Speaker 2>it means for human creativity, and then in the next episode,

0:01:20.120 --> 0:01:24.360
<v Speaker 2>I'm going to tackle the question of sentience. Are these

0:01:24.400 --> 0:01:28.720
<v Speaker 2>AIs conscious, and if not now, could they be soon?

0:01:29.400 --> 0:01:31.760
<v Speaker 2>And how would we know when we get there?

0:01:35.000 --> 0:01:39.240
<v Speaker 1>So let's start in twenty seventeen when almost no one

0:01:39.280 --> 0:01:42.440
<v Speaker 1>in the world paid attention when a team at Google

0:01:42.520 --> 0:01:47.280
<v Speaker 1>Brain introduced a new way of building an artificial neural network.

0:01:47.920 --> 0:01:51.040
<v Speaker 1>So this was different than the architectures that came before it,

0:01:51.480 --> 0:01:55.560
<v Speaker 1>which were called things like convolutional neural networks and recurrent

0:01:55.640 --> 0:01:58.800
<v Speaker 1>neural networks. Instead, they presented a new model that was

0:01:58.840 --> 0:02:03.520
<v Speaker 1>called a transformer. Now, transformer is not one of those

0:02:03.640 --> 0:02:06.760
<v Speaker 1>robots that shapeshift into trucks and helicopters.

0:02:07.280 --> 0:02:10.040
<v Speaker 2>Instead, a transformer model is

0:02:10.000 --> 0:02:14.160
<v Speaker 1>a way to tackle sequential data like the words that

0:02:14.200 --> 0:02:16.280
<v Speaker 1>are in a sentence or the frames in a video.

0:02:16.800 --> 0:02:20.080
<v Speaker 1>And a transformer model takes in everything at once, and

0:02:20.120 --> 0:02:23.239
<v Speaker 1>it essentially pays attention to different parts of the data.

0:02:23.800 --> 0:02:29.400
<v Speaker 1>And this allows training on enormous data sets, bigger than

0:02:29.440 --> 0:02:33.560
<v Speaker 1>what was trained on before. Like now it's essentially everything

0:02:33.680 --> 0:02:36.960
<v Speaker 1>that has been written by humans that is on the Internet,

0:02:37.160 --> 0:02:41.840
<v Speaker 1>which is petabytes of data. So these models they digest

0:02:41.919 --> 0:02:44.840
<v Speaker 1>all of that, and what do they do? They essentially

0:02:44.919 --> 0:02:48.200
<v Speaker 1>look at a sequence of inputs, like the words in

0:02:48.240 --> 0:02:52.280
<v Speaker 1>a sentence, and they ask what word is most likely

0:02:52.360 --> 0:02:56.440
<v Speaker 1>to come next in that sequence. Now we'll come back

0:02:56.480 --> 0:02:58.000
<v Speaker 1>to that in a second, but I just want to

0:02:58.040 --> 0:03:04.600
<v Speaker 1>note that this transformer model is finding uses way beyond text. So,

0:03:04.680 --> 0:03:08.160
<v Speaker 1>for example, a recent Nature paper used this kind of

0:03:08.200 --> 0:03:11.720
<v Speaker 1>model to look at amino acids, which run in a

0:03:11.720 --> 0:03:14.800
<v Speaker 1>sequence to make proteins, and they looked at these chains

0:03:14.800 --> 0:03:18.200
<v Speaker 1>of amino acids like text strings, and they set a

0:03:18.360 --> 0:03:21.840
<v Speaker 1>major new water mark in determining how proteins fold, which

0:03:21.880 --> 0:03:25.400
<v Speaker 1>is a very difficult problem. And people are using transformers

0:03:25.400 --> 0:03:30.800
<v Speaker 1>for everything from making music to reading giant reams of

0:03:30.840 --> 0:03:35.040
<v Speaker 1>medical records and so on. These transformer models are built

0:03:35.080 --> 0:03:37.440
<v Speaker 1>into search already, and soon they're going to be in

0:03:37.520 --> 0:03:40.120
<v Speaker 1>your phone and in your car, and in your bank

0:03:40.160 --> 0:03:48.080
<v Speaker 1>and in your doctor's office. So what everyone in Silicon

0:03:48.160 --> 0:03:50.800
<v Speaker 1>Valley is talking about is how this new kind of

0:03:50.880 --> 0:03:55.120
<v Speaker 1>AI is going to disrupt the workforce. And a lot

0:03:55.160 --> 0:03:58.320
<v Speaker 1>of people are thinking about white collar jobs that have

0:03:58.440 --> 0:04:04.200
<v Speaker 1>traditionally required memorization of long textbooks, and these jobs, whether

0:04:04.280 --> 0:04:08.440
<v Speaker 1>they're legal or medical, suddenly seem to be kind of outmoded.

0:04:09.000 --> 0:04:11.880
<v Speaker 1>And so we're all thinking about what this means for

0:04:11.960 --> 0:04:15.120
<v Speaker 1>the economy because so many jobs are going to be

0:04:15.160 --> 0:04:19.920
<v Speaker 1>displaced by this new technology. Now, there's nothing totally new

0:04:19.960 --> 0:04:24.039
<v Speaker 1>about this kind of worry, because every generation sees new

0:04:24.080 --> 0:04:28.240
<v Speaker 1>technologies take over old jobs. That's natural, and we don't

0:04:28.320 --> 0:04:32.840
<v Speaker 1>lament the fact that we don't have elevator operators anymore,

0:04:33.120 --> 0:04:38.000
<v Speaker 1>or switchboard operators at telephone companies, or factories that make

0:04:38.240 --> 0:04:43.800
<v Speaker 1>VCRs or eight track tape players, because new technologies continuously

0:04:43.880 --> 0:04:48.560
<v Speaker 1>replace the old, and industries change and people adapt. But

0:04:48.680 --> 0:04:52.120
<v Speaker 1>the concern that we're seeing with the AI revolution is

0:04:52.200 --> 0:04:55.559
<v Speaker 1>the speed of it. It's probably the case that we've

0:04:56.040 --> 0:05:00.479
<v Speaker 1>never before had a move forward in technology that's so

0:05:00.960 --> 0:05:07.080
<v Speaker 1>unbelievably rapid. So this is why everyone's talking about this

0:05:07.240 --> 0:05:09.919
<v Speaker 1>with a different point of view than we did with

0:05:10.080 --> 0:05:13.080
<v Speaker 1>previous innovations. But I want to zoom in on something

0:05:13.120 --> 0:05:15.360
<v Speaker 1>a little different for this episode. I want to know

0:05:15.400 --> 0:05:19.719
<v Speaker 1>what this all means for human creativity, because the thing

0:05:19.800 --> 0:05:23.080
<v Speaker 1>to note is these models have been trained up not

0:05:23.240 --> 0:05:26.960
<v Speaker 1>just on the handful of novels and conversations and schoolwork

0:05:27.040 --> 0:05:31.599
<v Speaker 1>that you have experienced on your thin trajectory through space

0:05:31.640 --> 0:05:34.880
<v Speaker 1>and time, but they have been trained with everything that's

0:05:34.960 --> 0:05:40.479
<v Speaker 1>ever been written by humans. Every textbook, every article, every poem,

0:05:40.520 --> 0:05:46.800
<v Speaker 1>every blog post, every novel. We're talking seventy one billion

0:05:46.880 --> 0:05:52.719
<v Speaker 1>web pages and hundreds of trillions of words. It's something

0:05:52.760 --> 0:05:57.680
<v Speaker 1>that's so far beyond any human's capacity to consume even

0:05:57.720 --> 0:06:01.360
<v Speaker 1>a fraction of it, or to really imagine a corpus

0:06:01.480 --> 0:06:04.839
<v Speaker 1>of text that large. Oh and by the way, it

0:06:04.880 --> 0:06:08.400
<v Speaker 1>has a perfect memory for every word that it's read.

0:06:08.520 --> 0:06:12.080
<v Speaker 1>So now you're talking about a system that's not the

0:06:12.120 --> 0:06:17.560
<v Speaker 1>same as a brain, but is incredibly powerful at generating

0:06:17.720 --> 0:06:22.159
<v Speaker 1>text or visual art or music and soon video. And

0:06:22.200 --> 0:06:25.000
<v Speaker 1>so while we'll talk about sentience next week, this week,

0:06:25.080 --> 0:06:28.080
<v Speaker 1>I want to address a social point that has quickly

0:06:28.160 --> 0:06:30.839
<v Speaker 1>risen to the surface, which is what will all this

0:06:31.040 --> 0:06:36.599
<v Speaker 1>mean for human art and human creativity? Personally, I'm working

0:06:36.640 --> 0:06:39.800
<v Speaker 1>on my next several books right now, and these are

0:06:39.839 --> 0:06:44.720
<v Speaker 1>all projects that have spanned years, and so I'm fascinated

0:06:44.839 --> 0:06:49.560
<v Speaker 1>and terrified about whether AI is going to replace me

0:06:49.640 --> 0:06:51.960
<v Speaker 1>as a writer. What does this kind of new AI

0:06:52.600 --> 0:06:57.640
<v Speaker 1>mean for writers, for visual artists, for musicians who studied

0:06:57.640 --> 0:07:00.680
<v Speaker 1>their whole lives to be able to compose beautiful pieces

0:07:00.720 --> 0:07:06.200
<v Speaker 1>of music? Is human creativity destined for the dust bin

0:07:06.440 --> 0:07:11.160
<v Speaker 1>of history? So let's start with the downside of these models.

0:07:11.680 --> 0:07:14.640
<v Speaker 1>So in my book Livewired, I talked about how

0:07:14.720 --> 0:07:20.320
<v Speaker 1>AI algorithms don't care about relevance; they memorize whatever we

0:07:20.560 --> 0:07:23.080
<v Speaker 1>ask them to. So, now this is a very useful

0:07:23.120 --> 0:07:26.600
<v Speaker 1>feature of AI, but it's also the reason AI is

0:07:26.720 --> 0:07:31.640
<v Speaker 1>not particularly human like, because AI models don't have any

0:07:31.680 --> 0:07:35.400
<v Speaker 1>sort of internal model of the world. They have no

0:07:35.520 --> 0:07:38.880
<v Speaker 1>idea what it is to be a human and have

0:07:39.040 --> 0:07:44.080
<v Speaker 1>drives and concerns. They don't care which problems are interesting

0:07:44.600 --> 0:07:48.640
<v Speaker 1>or germane. Instead, they memorize whatever we feed them. So

0:07:48.720 --> 0:07:51.960
<v Speaker 1>whether that's distinguishing a horse from a zebra in a

0:07:52.000 --> 0:07:56.400
<v Speaker 1>billion photographs, or tracking flight data from every airport on

0:07:56.440 --> 0:08:01.160
<v Speaker 1>the planet, or composing music in the style of Brian Eno,

0:08:01.640 --> 0:08:06.360
<v Speaker 1>they have no sense of importance except in a statistical sense,

0:08:06.960 --> 0:08:09.679
<v Speaker 1>which is to say, which signals occur more often.

0:08:10.440 --> 0:08:13.160
<v Speaker 2>So contemporary AI could never

0:08:13.080 --> 0:08:17.560
<v Speaker 1>by itself decide that it finds irresistible a particular kind

0:08:17.600 --> 0:08:21.600
<v Speaker 1>of ice cream, or that it abhors a particular kind

0:08:21.600 --> 0:08:26.120
<v Speaker 1>of music, or that it's heartbroken by King Lear's speech

0:08:26.320 --> 0:08:29.760
<v Speaker 1>over his dead daughter. So AI can dispatch, you know,

0:08:29.840 --> 0:08:34.000
<v Speaker 1>ten thousand hours of intense practice in ten thousand nanoseconds,

0:08:34.360 --> 0:08:38.200
<v Speaker 1>but it doesn't care about any zeros and ones over

0:08:38.240 --> 0:08:43.680
<v Speaker 1>any others. As a result, AI can accomplish incredibly impressive feats,

0:08:43.760 --> 0:08:48.760
<v Speaker 1>but not the feat of being quite like a human.

0:08:49.200 --> 0:08:52.320
<v Speaker 1>And so some critics of AI say, look, it's like

0:08:52.360 --> 0:08:55.760
<v Speaker 1>you want a sandwich, and what this transformer model does

0:08:56.200 --> 0:08:58.720
<v Speaker 1>is it looks at all the billions of sandwiches out

0:08:58.760 --> 0:09:02.079
<v Speaker 1>there in the world, and it gives you a slurry

0:09:02.480 --> 0:09:03.880
<v Speaker 1>and it pours it out in

0:09:03.840 --> 0:09:05.240
<v Speaker 2>the shape of a sandwich.

0:09:05.559 --> 0:09:08.079
<v Speaker 1>A fellow writer gave me that analogy the other day,

0:09:08.120 --> 0:09:12.839
<v Speaker 1>and that doesn't sound particularly appealing, right, And yet these

0:09:13.000 --> 0:09:16.719
<v Speaker 1>ais have massively surprised us.

0:09:17.080 --> 0:09:20.280
<v Speaker 2>The text generation is so good, it's

0:09:20.120 --> 0:09:24.319
<v Speaker 1>so complete, it's so human like that we find ourselves

0:09:24.360 --> 0:09:27.960
<v Speaker 1>not so much in the phase of invention like with

0:09:28.040 --> 0:09:31.239
<v Speaker 1>all the machines we've made before. Instead, the whole scientific

0:09:31.240 --> 0:09:36.760
<v Speaker 1>community is finding itself in a process of discovery. Everyone

0:09:36.880 --> 0:09:41.880
<v Speaker 1>is exploring to find out what these enormous models are

0:09:41.960 --> 0:09:46.240
<v Speaker 1>capable of, because nobody quite knows. They keep blowing our

0:09:46.280 --> 0:09:50.200
<v Speaker 1>minds with things they're able to do which weren't pre

0:09:50.240 --> 0:09:56.280
<v Speaker 1>programmed and not even foreseen. I have a friend who works

0:09:56.280 --> 0:09:59.800
<v Speaker 1>with a big city symphony, and she's trying to plan

0:10:00.160 --> 0:10:03.520
<v Speaker 1>a program for the symphony several months out, which is

0:10:03.559 --> 0:10:07.760
<v Speaker 1>a typical timescale for symphony planning, but she's scheduling to

0:10:07.800 --> 0:10:11.520
<v Speaker 1>put on a program with music composed by AI, and

0:10:11.600 --> 0:10:14.240
<v Speaker 1>she's at a loss for how to plan this because

0:10:14.720 --> 0:10:18.360
<v Speaker 1>she's well aware that things are moving so fast that

0:10:18.400 --> 0:10:22.479
<v Speaker 1>the musical world and the skill level of AI composition

0:10:23.000 --> 0:10:25.320
<v Speaker 1>is going to be entirely different in a few months.

0:10:25.320 --> 0:10:28.240
<v Speaker 1>It's going to be more advanced. So she was telling me

0:10:28.320 --> 0:10:31.880
<v Speaker 1>that she doesn't quite know how to nail down plans

0:10:31.920 --> 0:10:36.600
<v Speaker 1>for this, because unlike every symphony planner who has come before,

0:10:36.760 --> 0:10:39.600
<v Speaker 1>she's now in a world where if she nails down

0:10:39.679 --> 0:10:43.080
<v Speaker 1>a choice of music and trains up the musicians, it

0:10:43.240 --> 0:10:47.360
<v Speaker 1>is guaranteed to be badly outdated some months from now.

0:10:47.760 --> 0:10:51.160
<v Speaker 1>And this is the world we're operating in now. So generative

0:10:51.160 --> 0:10:54.920
<v Speaker 1>AI is moving so rapidly that we have entered this

0:10:55.120 --> 0:10:58.959
<v Speaker 1>massive revolution without most of us realizing that we were

0:10:58.960 --> 0:10:59.440
<v Speaker 1>going there.

0:11:00.280 --> 0:11:03.920
<v Speaker 2>Art and writing and music aren't

0:11:03.679 --> 0:11:06.960
<v Speaker 1>going away, but they're going to completely change from how

0:11:07.000 --> 0:11:08.079
<v Speaker 1>we know them today.

0:11:09.360 --> 0:11:09.640
<v Speaker 2>Now,

0:11:09.960 --> 0:11:13.000
<v Speaker 1>I told you earlier that AI doesn't have any idea

0:11:13.120 --> 0:11:14.800
<v Speaker 1>of what it is to be a

0:11:14.880 --> 0:11:18.200
<v Speaker 2>human, but I think it doesn't matter.

0:11:18.880 --> 0:11:23.239
<v Speaker 1>AI doesn't need to feel anything to write great literature

0:11:23.320 --> 0:11:26.040
<v Speaker 1>or great art or great music, because while you can

0:11:26.120 --> 0:11:29.800
<v Speaker 1>think of it as a sandwich slurry. You can also

0:11:29.840 --> 0:11:34.160
<v Speaker 1>think of ChatGPT as a remix of every human

0:11:34.200 --> 0:11:39.520
<v Speaker 1>writer that has come before. Its training set is humankind,

0:11:39.720 --> 0:11:43.640
<v Speaker 1>and so even if it's just statistical, it's generating the

0:11:43.720 --> 0:11:47.880
<v Speaker 1>expressions and the passions and the fears and the hopes

0:11:48.440 --> 0:11:51.760
<v Speaker 1>of millions of people. So it doesn't matter if it

0:11:51.880 --> 0:11:54.719
<v Speaker 1>feels or knows or has theory of mind, or if

0:11:54.760 --> 0:11:59.760
<v Speaker 1>it cries at King Lear's speech, because it can convincingly

0:12:00.559 --> 0:12:03.400
<v Speaker 1>tell you a story that breaks your heart. And it

0:12:03.440 --> 0:12:07.240
<v Speaker 1>does this by drawing on the best of human writing

0:12:07.440 --> 0:12:11.160
<v Speaker 1>over the centuries. So as a result, it's incredibly good

0:12:11.240 --> 0:12:14.600
<v Speaker 1>and it puts together things in a new way. And

0:12:14.679 --> 0:12:20.160
<v Speaker 1>I think part of understanding this requires acknowledging a really

0:12:20.200 --> 0:12:23.640
<v Speaker 1>important point, which is that the AI is really good,

0:12:23.840 --> 0:12:31.440
<v Speaker 1>but also that humans are so easily hackable. The phrase

0:12:31.880 --> 0:12:34.920
<v Speaker 1>humans are hackable is a phrase that I first started

0:12:34.920 --> 0:12:37.880
<v Speaker 1>hearing from my friend Lisa Joy Nolan, who with her

0:12:37.960 --> 0:12:42.440
<v Speaker 1>husband Jonah Nolan, created the television show Westworld, and that

0:12:42.559 --> 0:12:44.840
<v Speaker 1>was a big theme in that show. The humans could

0:12:44.880 --> 0:12:48.400
<v Speaker 1>so easily get seduced by the robots, or convinced to

0:12:48.440 --> 0:12:51.679
<v Speaker 1>do bad actions or act violently and the robots were

0:12:51.720 --> 0:12:54.760
<v Speaker 1>just running AI. But if they say the right thing,

0:12:54.880 --> 0:12:57.559
<v Speaker 1>then they can get humans to do things, whether that's

0:12:57.760 --> 0:13:00.920
<v Speaker 1>fighting or fornicating or whatever. It's like turning the key

0:13:00.960 --> 0:13:03.000
<v Speaker 1>in the lock. Now, there's a point that I want

0:13:03.040 --> 0:13:06.600
<v Speaker 1>to dig into here. If you saw Westworld, you may

0:13:06.640 --> 0:13:09.520
<v Speaker 1>remember the scene from the first episode where a man

0:13:09.640 --> 0:13:13.560
<v Speaker 1>named William has just arrived to Westworld and he's greeted

0:13:13.679 --> 0:13:16.440
<v Speaker 1>in a room by a beautiful woman who guides him

0:13:16.520 --> 0:13:19.440
<v Speaker 1>to pick out his cowboy outfit and his gun and

0:13:19.440 --> 0:13:22.520
<v Speaker 1>his hat, and she makes it clear that she's available

0:13:22.559 --> 0:13:28.720
<v Speaker 1>for him sexually, and he uncomfortably asks her, are you real?

0:13:29.400 --> 0:13:34.200
<v Speaker 1>And she says, if you can't tell, does it matter?

0:13:35.480 --> 0:13:35.680
<v Speaker 2>Now,

0:13:35.760 --> 0:13:40.079
<v Speaker 1>this is a major theme throughout Westworld. Humans are hackable,

0:13:40.360 --> 0:13:43.520
<v Speaker 1>and if you can't tell the difference between something that

0:13:43.600 --> 0:13:47.199
<v Speaker 1>has evolutionary importance to you and a fake version of it,

0:13:47.600 --> 0:13:49.960
<v Speaker 1>then it makes no difference. And this is what we

0:13:50.080 --> 0:13:52.800
<v Speaker 1>see when we look at the text that is spit

0:13:52.880 --> 0:13:58.200
<v Speaker 1>out from ChatGPT. It is statistically sound, meaning it

0:13:58.360 --> 0:14:01.559
<v Speaker 1>falls in the orders and rhythms of millions of people

0:14:01.559 --> 0:14:04.240
<v Speaker 1>who have written things like it before, and so we

0:14:04.320 --> 0:14:09.000
<v Speaker 1>can be just as compelled by the text, and therefore

0:14:09.080 --> 0:14:12.800
<v Speaker 1>the fact that AI can write a story that moves

0:14:12.880 --> 0:14:17.160
<v Speaker 1>us and impresses us is no surprise. It's easy to

0:14:17.280 --> 0:14:20.000
<v Speaker 1>move and impress us. In a sense, it's no more

0:14:20.080 --> 0:14:24.280
<v Speaker 1>surprising than drawing a pornographic cartoon that turns someone on.

0:14:24.720 --> 0:14:29.920
<v Speaker 1>You're just plugging into deeply carved programs. A human can't

0:14:29.920 --> 0:14:33.920
<v Speaker 1>mate with the cartoon. But nonetheless, it's easy enough to

0:14:34.120 --> 0:14:38.920
<v Speaker 1>activate the biological programs, so a story can make you

0:14:39.160 --> 0:14:43.240
<v Speaker 1>shed tears or laugh even if the transformer is just

0:14:43.320 --> 0:14:46.880
<v Speaker 1>pushing around zeros and ones. And therefore we shouldn't be

0:14:47.000 --> 0:14:51.600
<v Speaker 1>surprised that AI can write these really great pieces of prose.

0:14:51.680 --> 0:14:56.280
<v Speaker 1>It doesn't have to be real and it doesn't matter.

0:14:57.160 --> 0:15:00.440
<v Speaker 1>So now that we can write beautiful prose with AI,

0:15:00.720 --> 0:15:04.360
<v Speaker 1>what does this mean for the future of books. Well,

0:15:04.440 --> 0:15:07.200
<v Speaker 1>I think we can imagine a pretty cool future for

0:15:07.720 --> 0:15:14.240
<v Speaker 1>AI generated literature. We can imagine generating infinite, wonderful material.

0:15:15.040 --> 0:15:16.720
<v Speaker 2>And you know what? Back in the day,

0:15:17.080 --> 0:15:22.400
<v Speaker 1>kings and emperors had poems written that were bespoke. The

0:15:22.560 --> 0:15:25.080
<v Speaker 1>poems were written just for them. And now it's going

0:15:25.120 --> 0:15:28.640
<v Speaker 1>to be trivial for us to all live as royalty,

0:15:29.160 --> 0:15:33.240
<v Speaker 1>having bespoke literature written just for us as much as

0:15:33.280 --> 0:15:36.880
<v Speaker 1>we want, as often as we want, in seconds, and

0:15:36.960 --> 0:15:41.120
<v Speaker 1>maybe we'll come to enjoy dynamic novels, by which I

0:15:41.160 --> 0:15:44.120
<v Speaker 1>mean a piece of literature that's not pre written, but

0:15:44.240 --> 0:15:48.480
<v Speaker 1>instead is written on the fly depending on the decisions

0:15:48.480 --> 0:15:51.720
<v Speaker 1>that you make, like a choose your own adventure. So

0:15:51.800 --> 0:15:54.200
<v Speaker 1>you say this is a good book so far. Now

0:15:54.240 --> 0:15:56.600
<v Speaker 1>I want to see what happens if I go in

0:15:56.640 --> 0:15:58.840
<v Speaker 1>the neighbor's door and get a view on his life,

0:15:58.960 --> 0:16:01.720
<v Speaker 1>or the mailman life who just passed by, or the

0:16:01.760 --> 0:16:05.360
<v Speaker 1>traffic cop and the book just keeps writing itself on

0:16:05.400 --> 0:16:09.320
<v Speaker 1>the fly, thousands of pages that end up being

0:16:09.120 --> 0:16:14.000
<v Speaker 2>unique for me, for you, for everyone as they go

0:16:14.080 --> 0:16:15.120
<v Speaker 2>on their own adventure.

0:16:15.600 --> 0:16:19.240
<v Speaker 1>Instead of having some poor author who has to write

0:16:19.480 --> 0:16:23.520
<v Speaker 1>every possible branching path, now there's no need to do that.

0:16:23.560 --> 0:16:25.120
<v Speaker 2>You just generate it on the fly.

0:16:26.040 --> 0:16:30.200
<v Speaker 1>So now we'll all get to experience literary worlds that

0:16:30.240 --> 0:16:35.200
<v Speaker 1>are infinite in all directions. So in that light, it

0:16:35.280 --> 0:16:40.320
<v Speaker 1>certainly seems that AI is going to replace human creatives.

0:16:40.720 --> 0:16:43.600
<v Speaker 1>It can do things better and millions of times faster,

0:16:44.120 --> 0:16:46.680
<v Speaker 1>and it can be there to write the next pages

0:16:46.720 --> 0:16:51.400
<v Speaker 1>according to your wishes. So it looks like writers are

0:16:51.520 --> 0:16:53.680
<v Speaker 1>going the way of the mastodon?

0:16:54.720 --> 0:16:56.000
<v Speaker 2>Or are they?

0:16:56.600 --> 0:17:00.720
<v Speaker 1>I think the real story is not so simple. I'm

0:17:00.800 --> 0:17:05.320
<v Speaker 1>fairly sure that while AI will augment human told stories,

0:17:05.920 --> 0:17:09.080
<v Speaker 1>there's essentially zero danger that it's going to do a

0:17:09.119 --> 0:17:12.720
<v Speaker 1>wholesale replacement of human creatives. And I'm going to argue

0:17:12.720 --> 0:17:16.640
<v Speaker 1>this for four reasons. The first is that we care

0:17:16.840 --> 0:17:20.640
<v Speaker 1>about the overarching arc of a story, and at least

0:17:20.640 --> 0:17:24.400
<v Speaker 1>at the moment, AI can't even come close to constructing this.

0:17:24.800 --> 0:17:28.800
<v Speaker 1>And this is because of a fundamental limitation in its architecture.

0:17:29.240 --> 0:17:31.760
<v Speaker 1>And this isn't just a question of pouring more money

0:17:31.800 --> 0:17:34.560
<v Speaker 1>in and getting more massive computers on the job. It

0:17:34.640 --> 0:17:40.199
<v Speaker 1>has to do with the exponentially increasing computational cost of

0:17:40.320 --> 0:17:45.760
<v Speaker 1>representing longer pieces of work. So currently with ChatGPT four,

0:17:46.320 --> 0:17:50.120
<v Speaker 1>it looks at the past four thousand ninety six tokens, which

0:17:50.160 --> 0:17:53.240
<v Speaker 1>is about three thousand words, and it decides what the

0:17:53.280 --> 0:17:57.280
<v Speaker 1>most likely next word is. But without getting into the

0:17:57.320 --> 0:17:59.719
<v Speaker 1>details of the math, I want to point out that

0:17:59.760 --> 0:18:03.920
<v Speaker 1>this requires a matrix. Think about it like a big

0:18:03.960 --> 0:18:07.000
<v Speaker 1>spreadsheet that has four thousand ninety six rows and four

0:18:07.040 --> 0:18:07.919
<v Speaker 1>thousand ninety

0:18:07.640 --> 0:18:10.160
<v Speaker 2>six columns and an entry in every cell

0:18:10.200 --> 0:18:13.760
<v Speaker 1>that represents something about the probability of those words going

0:18:13.800 --> 0:18:14.360
<v Speaker 1>with each other.
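
NOTE
The spreadsheet picture above can be checked with a little arithmetic.
This is a minimal sketch, assuming a square table with one row and one
column per token; the name table_cells is hypothetical, and real models
involve far more than this single table.
```python
def table_cells(context_tokens):
    # One cell for every (row token, column token) pair.
    return context_tokens * context_tokens
window = 4096  # the window size quoted in the episode
cells = table_cells(window)
doubled = table_cells(2 * window)
print(f"{window} tokens: {cells:,} cells")
print(f"{2 * window} tokens: {doubled:,} cells, {doubled // cells}x as many")
```
Under this assumption, doubling the window quadruples the table, so the
cost of a longer context climbs quickly.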

0:18:14.840 --> 0:18:17.680
<v Speaker 2>Now, this matrix will grow larger

0:18:17.280 --> 0:18:20.560
<v Speaker 1>with time, but the size of the output is inherently

0:18:20.640 --> 0:18:25.160
<v Speaker 1>constrained by this structure, and as a result, ChatGPT

0:18:25.359 --> 0:18:28.760
<v Speaker 1>is perfect for poems or blog posts or small articles,

0:18:29.240 --> 0:18:33.280
<v Speaker 1>but not something the size of a novel. Why? Because

0:18:33.320 --> 0:18:38.680
<v Speaker 1>a novel has arcs and plot twists and cleverly planted

0:18:38.880 --> 0:18:42.639
<v Speaker 1>clues and cliffhangers, and all of these operate at a

0:18:42.720 --> 0:18:48.080
<v Speaker 1>longer timescale. So a human author mentally zooms in and

0:18:48.119 --> 0:18:52.720
<v Speaker 1>out such that their stories have this sweeping arc to them. So,

0:18:52.840 --> 0:18:55.159
<v Speaker 1>for example, in a mystery novel, we get to the

0:18:55.400 --> 0:18:58.560
<v Speaker 1>end and we realize that all the clues and the

0:18:58.600 --> 0:19:02.560
<v Speaker 1>red herrings we saw were subservient to the solution to

0:19:02.600 --> 0:19:05.440
<v Speaker 1>the mystery, which of course the author knew from the beginning,

0:19:05.680 --> 0:19:08.439
<v Speaker 1>and the author was just spooling out clues to you

0:19:08.480 --> 0:19:11.080
<v Speaker 1>one at a time. In writing, you often have to

0:19:11.200 --> 0:19:14.639
<v Speaker 1>know the end to structure the beginning and the middle.

0:19:14.920 --> 0:19:18.320
<v Speaker 1>And this is, by the way, why ChatGPT can't

0:19:18.359 --> 0:19:20.720
<v Speaker 1>make up a new joke, even though it can repeat

0:19:20.800 --> 0:19:22.160
<v Speaker 1>jokes that are already made.

0:19:22.320 --> 0:19:25.760
<v Speaker 2>It's because to construct a joke, just like a

0:19:25.800 --> 0:19:29.159
<v Speaker 2>mystery novel, you have to know the punchline first, and

0:19:29.200 --> 0:19:33.119
<v Speaker 2>then you construct the joke backwards. But these large language

0:19:33.119 --> 0:19:37.639
<v Speaker 2>models are simply constructing everything in the forward direction. It

0:19:37.680 --> 0:19:41.879
<v Speaker 2>does statistical calculations on what the most probable word to

0:19:41.920 --> 0:19:45.359
<v Speaker 2>come next is given all the words before it. So,

0:19:45.480 --> 0:19:49.240
<v Speaker 2>coming back to the long arc, if you watched all

0:19:49.280 --> 0:19:52.119
<v Speaker 2>eight seasons of Game of Thrones, for example, or you

0:19:52.160 --> 0:19:55.480
<v Speaker 2>read those books, you come to care about these characters

0:19:55.560 --> 0:19:58.879
<v Speaker 2>because you've been with them through so many trials and

0:19:58.920 --> 0:20:01.600
<v Speaker 2>you feel like you know them and understand them, and

0:20:01.640 --> 0:20:05.200
<v Speaker 2>you can predict things about their behavior, and you're invested

0:20:05.320 --> 0:20:09.399
<v Speaker 2>in their long term trajectories. So all the children of

0:20:09.440 --> 0:20:13.359
<v Speaker 2>the Stark family end up scattered in different directions in

0:20:13.400 --> 0:20:17.320
<v Speaker 2>the world, and then in the final season, they end

0:20:17.400 --> 0:20:21.639
<v Speaker 2>up reconvening. After what seems like a lifetime of adventure.

0:20:21.680 --> 0:20:26.119
<v Speaker 2>They're all back together for the final big showdown with

0:20:26.280 --> 0:20:29.280
<v Speaker 2>the Night King. And when we watch the series and

0:20:29.320 --> 0:20:32.399
<v Speaker 2>we get to season eight, we think, wow, I didn't

0:20:32.480 --> 0:20:35.359
<v Speaker 2>see that coming, that they're all back together now, and

0:20:35.400 --> 0:20:38.640
<v Speaker 2>now this story has a beautiful shape to it.

0:20:39.119 --> 0:20:42.800
<v Speaker 1>I'm really in the hands of a professional here. At

0:20:42.840 --> 0:20:46.960
<v Speaker 1>least with our current AI architectures today, it's impossible to

0:20:47.080 --> 0:20:50.679
<v Speaker 1>achieve that, except possibly in a few thousand word version,

0:20:51.119 --> 0:20:54.800
<v Speaker 1>because ChatGPT is playing its statistical game, and of

0:20:54.800 --> 0:20:57.320
<v Speaker 1>course it's playing it extremely well and successfully.

0:20:57.560 --> 0:20:59.960
<v Speaker 2>But the trick to recognize here is

0:21:00.000 --> 0:21:02.800
<v Speaker 1>that it is amazing at the level of paragraphs and

0:21:02.840 --> 0:21:07.000
<v Speaker 1>possibly a few pages, but not at the level of

0:21:07.080 --> 0:21:10.119
<v Speaker 1>thinking about the details of a five hundred page novel,

0:21:10.480 --> 0:21:15.160
<v Speaker 1>or a two hour movie screenplay or an eight season epic.

0:21:15.920 --> 0:21:18.800
<v Speaker 1>It's great at this small stuff because it can do

0:21:18.840 --> 0:21:22.560
<v Speaker 1>that with statistics, but it's fundamentally limited for the longer

0:21:22.600 --> 0:21:26.320
<v Speaker 1>stuff because it has no way to zoom out and

0:21:26.480 --> 0:21:30.159
<v Speaker 1>think about the crops that it wants to plant for

0:21:30.240 --> 0:21:34.120
<v Speaker 1>the long game. Okay, you might say, fine, maybe we'll

0:21:34.160 --> 0:21:36.760
<v Speaker 1>get there at some point, but even for now, couldn't

0:21:36.800 --> 0:21:40.800
<v Speaker 1>you build a big story out of smaller chunks? So

0:21:41.240 --> 0:21:44.720
<v Speaker 1>one idea is to make this form of storytelling in

0:21:44.760 --> 0:21:46.760
<v Speaker 1>which the world is infinitely big.

0:21:47.280 --> 0:21:48.639
<v Speaker 2>Let's come back to this picture

0:21:48.680 --> 0:21:51.520
<v Speaker 1>I painted a moment ago of a choose your own

0:21:51.560 --> 0:21:56.199
<v Speaker 1>adventure in which the AI generates plot points on the

0:21:56.240 --> 0:21:59.800
<v Speaker 1>fly for you. So I say, okay, open that door

0:21:59.840 --> 0:22:03.480
<v Speaker 1>to my left and the story continues as though it

0:22:03.560 --> 0:22:08.320
<v Speaker 1>were all prescripted, as though I have an author, let's say,

0:22:08.320 --> 0:22:12.720
<v Speaker 1>in the style of Hemingway or Nabokov or Morrison, who

0:22:12.760 --> 0:22:16.560
<v Speaker 1>has pre written every possibility. In certain ways, this would

0:22:16.560 --> 0:22:20.320
<v Speaker 1>be amazingly cool, but I think the problem here is

0:22:20.320 --> 0:22:25.680
<v Speaker 1>that a story like that would just equal randomness, and

0:22:25.720 --> 0:22:29.040
<v Speaker 1>that's not actually what we want in a story. Instead,

0:22:29.080 --> 0:22:32.159
<v Speaker 1>we want to feel like we're putting our trust into an

0:22:32.040 --> 0:22:33.840
<v Speaker 2>author who sees the big picture.

0:22:33.880 --> 0:22:38.040
<v Speaker 1>We want the Stark children to reconvene such that we

0:22:38.119 --> 0:22:40.720
<v Speaker 1>feel the overarching pattern of the story and we have

0:22:40.760 --> 0:22:44.960
<v Speaker 1>a sense of completeness. If you just wanted randomness, you'd

0:22:45.240 --> 0:22:47.760
<v Speaker 1>go out into the world and find it there. You

0:22:47.760 --> 0:22:51.480
<v Speaker 1>wouldn't sit on your couch and read about meaningless characters

0:22:51.520 --> 0:22:57.320
<v Speaker 1>who are just in Brownian motion. And I think this

0:22:57.400 --> 0:23:00.800
<v Speaker 1>is the same issue with AI music, at least as

0:23:00.840 --> 0:23:01.520
<v Speaker 1>it stands now.

0:23:02.240 --> 0:23:04.080
<v Speaker 2>Recent examples show

0:23:03.840 --> 0:23:07.560
<v Speaker 1>that it can compose incredible sounding music moment to moment,

0:23:07.720 --> 0:23:10.119
<v Speaker 2>but the reason it doesn't beat out

0:23:09.920 --> 0:23:13.560
<v Speaker 1>a real human composer, at least today, is because it

0:23:13.560 --> 0:23:16.880
<v Speaker 1>doesn't have any long term vision, and so the whole

0:23:16.920 --> 0:23:21.240
<v Speaker 1>piece of music just hangs together statistically, moment to moment,

0:23:21.640 --> 0:23:25.840
<v Speaker 1>and that's perfectly good for composing things like elevator music,

0:23:25.880 --> 0:23:28.760
<v Speaker 1>which is for a short ride, or commercial music which

0:23:28.800 --> 0:23:31.960
<v Speaker 1>only needs to be twenty seconds. But it won't for

0:23:32.040 --> 0:23:36.359
<v Speaker 1>now replace a human composer who writes with the long

0:23:36.640 --> 0:23:39.439
<v Speaker 1>arc in mind. For example, I was just talking with

0:23:39.480 --> 0:23:42.800
<v Speaker 1>my friend Tony Brandt, who's a composer, and he was

0:23:42.840 --> 0:23:46.040
<v Speaker 1>explaining to me that when Ludwig van Beethoven died,

0:23:46.520 --> 0:23:50.679
<v Speaker 1>he left behind sketches for a tenth symphony. So a

0:23:50.720 --> 0:23:55.280
<v Speaker 1>few years ago some computer scientists used AI to complete

0:23:55.359 --> 0:23:59.479
<v Speaker 1>the symphony, to finish what was unfinished. Now did they

0:23:59.520 --> 0:24:02.320
<v Speaker 1>do a good job? In one sense, it was an

0:24:02.440 --> 0:24:08.320
<v Speaker 1>incredible feat. They extracted the statistics of Beethoven's choices and

0:24:08.480 --> 0:24:12.040
<v Speaker 1>preferences from everything he'd written, and they used that to

0:24:12.640 --> 0:24:16.320
<v Speaker 1>statistically guess what moves he would have made next had

0:24:16.359 --> 0:24:20.920
<v Speaker 1>he lived: what notes, what chords, what instruments. But even

0:24:20.960 --> 0:24:23.920
<v Speaker 1>with this feat, it was clear that the AI didn't

0:24:23.960 --> 0:24:27.960
<v Speaker 1>know how to think long term. For example, Beethoven's Ninth

0:24:28.040 --> 0:24:31.840
<v Speaker 1>Symphony ends with a chorus, which was such a surprise

0:24:31.960 --> 0:24:34.000
<v Speaker 1>to end a symphony this way. It had not ever

0:24:34.040 --> 0:24:37.960
<v Speaker 1>been done before, so the team training the AI decided

0:24:38.000 --> 0:24:41.560
<v Speaker 1>Beethoven would have found a similar novelty to end his

0:24:41.720 --> 0:24:45.960
<v Speaker 1>tenth Symphony, so they instructed the AI to include an organ,

0:24:46.480 --> 0:24:49.040
<v Speaker 1>a church instrument that had also never been used in

0:24:49.040 --> 0:24:52.440
<v Speaker 1>a symphony before. So at the start of the last movement,

0:24:52.720 --> 0:24:54.679
<v Speaker 1>the AI generates an organ.

0:24:55.119 --> 0:24:58.639
<v Speaker 2>But when we zoom in, we see the difference.

0:25:01.119 --> 0:25:05.000
<v Speaker 1>The real Beethoven laid all sorts of clues in the

0:25:05.119 --> 0:25:09.040
<v Speaker 1>Ninth Symphony to set the groundwork for the chorus. Like

0:25:09.160 --> 0:25:12.640
<v Speaker 1>the orchestra plays a type of music called a recitative

0:25:13.200 --> 0:25:18.680
<v Speaker 1>before the choir enters. Why? Because recitatives are found in opera,

0:25:18.720 --> 0:25:22.640
<v Speaker 1>and opera has voices, So he was laying clues down.

0:25:22.800 --> 0:25:26.200
<v Speaker 1>But in the AI tenth Symphony, there was no build

0:25:26.240 --> 0:25:29.320
<v Speaker 1>up to the organ. There was no suspense, no hidden

0:25:29.400 --> 0:25:30.399
<v Speaker 1>clues about

0:25:30.119 --> 0:25:30.919
<v Speaker 2>what was coming.

0:25:31.480 --> 0:25:35.840
<v Speaker 1>The AI didn't know how to prepare the organ's arrival,

0:25:36.240 --> 0:25:39.760
<v Speaker 1>how to give it the significance that's there for experts

0:25:39.760 --> 0:25:46.080
<v Speaker 1>who listen for arcs that build through time. So, at

0:25:46.160 --> 0:25:50.120
<v Speaker 1>least for now, AI is useful at writing brief articles

0:25:50.200 --> 0:25:53.680
<v Speaker 1>and composing short ditties, but it doesn't have the architecture

0:25:53.720 --> 0:25:59.960
<v Speaker 1>to write the long pieces that humans love to create and consume.

0:26:13.480 --> 0:26:14.159
<v Speaker 2>So as I'm

0:26:14.000 --> 0:26:17.840
<v Speaker 1>writing my next books, these large language models don't feel

0:26:17.840 --> 0:26:21.640
<v Speaker 1>to me like a real threat, at least not yet.

0:26:21.920 --> 0:26:26.080
<v Speaker 1>But let's imagine that we cut to ten years from

0:26:26.119 --> 0:26:29.840
<v Speaker 1>now and some hardworking programmers have figured out how to

0:26:29.880 --> 0:26:33.639
<v Speaker 1>build an AI with the right sort of architecture that

0:26:33.880 --> 0:26:36.200
<v Speaker 1>zooms in and out on the scope of a story,

0:26:36.480 --> 0:26:41.080
<v Speaker 1>and it can successfully generate a novel with cliffhangers and

0:26:41.200 --> 0:26:44.520
<v Speaker 1>overarching themes and so on. It's certainly not impossible that

0:26:44.560 --> 0:26:47.439
<v Speaker 1>we're going to get there, and it'll probably happen sooner

0:26:47.480 --> 0:26:50.440
<v Speaker 1>than we expect. So let's imagine we get there in

0:26:50.480 --> 0:26:53.160
<v Speaker 1>a year or five or ten. An AI can generate

0:26:53.200 --> 0:26:58.359
<v Speaker 1>a million good novels in an hour. Then what? Well,

0:26:58.400 --> 0:27:01.560
<v Speaker 1>there are several directions in which things can go, And

0:27:01.600 --> 0:27:04.520
<v Speaker 1>the possibility that I mentioned earlier is that novels might

0:27:04.560 --> 0:27:10.000
<v Speaker 1>become bespoke, totally personalized to you. So you prompt your

0:27:10.080 --> 0:27:13.440
<v Speaker 1>AI to make an adventure story of exactly the type

0:27:13.480 --> 0:27:15.960
<v Speaker 1>that you might like. So you say, tell me a

0:27:16.200 --> 0:27:20.240
<v Speaker 1>murder mystery about a basketball player who's killed by someone

0:27:20.280 --> 0:27:23.480
<v Speaker 1>who appears to be his girlfriend, but then it turns

0:27:23.480 --> 0:27:26.960
<v Speaker 1>out it's actually a CIA plot that opens the door

0:27:27.000 --> 0:27:30.840
<v Speaker 1>to a cover up involving a pharmaceutical company. Let's assume

0:27:30.880 --> 0:27:33.280
<v Speaker 1>that the AI then spits out a book to your

0:27:33.320 --> 0:27:36.800
<v Speaker 1>exact specification, and it does an amazing job, and it

0:27:36.800 --> 0:27:39.440
<v Speaker 1>gives you a colorful story just how you wanted it,

0:27:39.720 --> 0:27:42.000
<v Speaker 1>and you can enjoy that on the beach seconds later.

0:27:42.280 --> 0:27:45.720
<v Speaker 1>Well, that's cool, but I assert that this is never

0:27:45.920 --> 0:27:49.600
<v Speaker 1>going to replace literature. And this is my second point

0:27:49.640 --> 0:27:53.120
<v Speaker 1>why artists don't need to worry, because when you define

0:27:53.160 --> 0:27:56.920
<v Speaker 1>your own plot, the surprise is diluted.

0:27:57.440 --> 0:27:59.520
<v Speaker 2>The joy of literature is diluted.

0:28:00.080 --> 0:28:03.800
<v Speaker 1>After all, even if you are a creative prompter, you

0:28:03.880 --> 0:28:07.399
<v Speaker 1>are limited to versions of what you have experienced or

0:28:07.520 --> 0:28:10.600
<v Speaker 1>read before. And much of what we love in literature

0:28:10.720 --> 0:28:14.200
<v Speaker 1>is this surprise that comes from a particular point of

0:28:14.280 --> 0:28:18.800
<v Speaker 1>view that you have never considered, like characters or plot

0:28:18.880 --> 0:28:22.720
<v Speaker 1>points that would never be generated by your own limited

0:28:22.800 --> 0:28:26.000
<v Speaker 1>point of view. In the end, I think we don't

0:28:26.119 --> 0:28:30.199
<v Speaker 1>want to be limited by the parochial fence lines of

0:28:30.240 --> 0:28:31.440
<v Speaker 1>our own imagination.

0:28:31.960 --> 0:28:32.960
<v Speaker 2>I suspect that

0:28:33.040 --> 0:28:35.919
<v Speaker 1>no matter how far in the future we look, we

0:28:35.960 --> 0:28:39.720
<v Speaker 1>are still going to want stories that surprise us, plot

0:28:39.800 --> 0:28:44.120
<v Speaker 1>twists that we don't see coming. Okay, fine, you might say,

0:28:44.160 --> 0:28:46.719
<v Speaker 1>so you agree that it's more exciting if we go

0:28:46.800 --> 0:28:49.920
<v Speaker 1>on rides that we didn't predefine. But you might point

0:28:49.960 --> 0:28:53.040
<v Speaker 1>out there's another thing that AI can do. So let's

0:28:53.120 --> 0:28:56.760
<v Speaker 1>address the next issue, the idea that AI could someday

0:28:56.880 --> 0:29:02.000
<v Speaker 1>generate millions of highly creative versions of a single story,

0:29:02.240 --> 0:29:04.560
<v Speaker 1>so there'd be no need to stick with just one

0:29:04.680 --> 0:29:08.440
<v Speaker 1>version of stories anymore. Instead of George R. R. Martin

0:29:08.840 --> 0:29:13.200
<v Speaker 1>writing Game of Thrones over decades, future AI could generate

0:29:13.360 --> 0:29:17.160
<v Speaker 1>thousands of fascinating versions in a second, and we wouldn't

0:29:17.200 --> 0:29:21.520
<v Speaker 1>depend on him for the next slow novel. But I

0:29:21.600 --> 0:29:24.120
<v Speaker 1>suggest that's not going to catch on either.

0:29:24.760 --> 0:29:25.120
<v Speaker 2>Why?

0:29:25.480 --> 0:29:30.440
<v Speaker 1>It's because we care about shared adventure. Would Game of

0:29:30.520 --> 0:29:33.840
<v Speaker 1>Thrones have been so popular if we each saw our

0:29:33.920 --> 0:29:38.240
<v Speaker 1>own version of it? In my version, Jon Snow dies early,

0:29:38.440 --> 0:29:42.680
<v Speaker 1>and in your version, Daenerys marries Tyrion Lannister, and in

0:29:42.720 --> 0:29:45.920
<v Speaker 1>your neighbor's version, Arya marries into a royal family in

0:29:46.000 --> 0:29:49.200
<v Speaker 1>some subplot island that never even appears in my version.

0:29:49.720 --> 0:29:53.560
<v Speaker 1>If this sounds less appealing to you, to have mutually

0:29:53.720 --> 0:29:57.360
<v Speaker 1>exclusive worlds, it illustrates the point that I want to make,

0:29:57.400 --> 0:30:01.600
<v Speaker 1>which is a big part of story is this social aspect,

0:30:01.720 --> 0:30:06.040
<v Speaker 1>the shared experience. We certainly could use AI to generate

0:30:06.080 --> 0:30:09.520
<v Speaker 1>a million different versions of Westeros, and in the

0:30:09.560 --> 0:30:12.640
<v Speaker 1>future we can generate instant video around these plots with

0:30:12.800 --> 0:30:17.360
<v Speaker 1>terrific special effects. But as a society, I think we

0:30:17.440 --> 0:30:22.160
<v Speaker 1>wouldn't want to each consume our own version. You want

0:30:22.200 --> 0:30:25.680
<v Speaker 1>your Jon Snow to do the same thing as my

0:30:25.880 --> 0:30:28.680
<v Speaker 1>Jon Snow. And this is because a huge part of

0:30:28.800 --> 0:30:34.840
<v Speaker 1>story is this shared experience. We enjoy sharing fantasy worlds

0:30:34.880 --> 0:30:37.959
<v Speaker 1>because we talk about them. This is why we do

0:30:38.000 --> 0:30:41.200
<v Speaker 1>book clubs, so we can sit around and discuss something

0:30:41.240 --> 0:30:44.400
<v Speaker 1>we all shared together. All the time, I hear people say, hey,

0:30:44.440 --> 0:30:47.720
<v Speaker 1>did you see the latest episode of The Peripheral or

0:30:47.800 --> 0:30:51.320
<v Speaker 1>Jack Ryan or Severance or Star Trek or whatever. And

0:30:51.400 --> 0:30:55.840
<v Speaker 1>our love of communal stories stems partially from our need

0:30:56.360 --> 0:31:00.080
<v Speaker 1>for shared references. For example, I'm always making references

0:31:00.360 --> 0:31:04.040
<v Speaker 1>to how Neo in The Matrix saw in slow motion,

0:31:04.200 --> 0:31:07.040
<v Speaker 1>and that's decades after that movie came out, but it

0:31:07.120 --> 0:31:10.440
<v Speaker 1>serves as a quick, culturally shared way that we can

0:31:10.480 --> 0:31:14.800
<v Speaker 1>talk about concepts. We all have quick cultural references for

0:31:15.040 --> 0:31:18.680
<v Speaker 1>time travel, where people say beam me up, Scotty when

0:31:18.680 --> 0:31:22.920
<v Speaker 1>they're talking about teleportation, or we reference Obi-Wan Kenobi

0:31:22.960 --> 0:31:25.240
<v Speaker 1>when we say may the force be with you, or

0:31:25.360 --> 0:31:29.520
<v Speaker 1>we reference Ex Machina or Westworld as a shorthand for

0:31:29.640 --> 0:31:30.680
<v Speaker 1>AI going bad.

0:31:31.120 --> 0:31:33.040
<v Speaker 2>And take this as an example.

0:31:32.920 --> 0:31:37.400
<v Speaker 1>Imagine that you could generate a fantasy football game with

0:31:37.440 --> 0:31:41.080
<v Speaker 1>your favorite players from any decade on one team versus

0:31:41.160 --> 0:31:43.880
<v Speaker 1>players on another team, and you can now watch a

0:31:43.960 --> 0:31:47.680
<v Speaker 1>full football game from stem to stern. But would you

0:31:48.360 --> 0:31:52.000
<v Speaker 1>if no one else ever saw that game? In other words,

0:31:52.320 --> 0:31:55.840
<v Speaker 1>would you follow teams all the way through the World

0:31:55.920 --> 0:32:00.000
<v Speaker 1>Series if it was purely AI generated plays and games?

0:32:00.760 --> 0:32:02.800
<v Speaker 1>I know that people might have different opinions on this,

0:32:02.880 --> 0:32:06.120
<v Speaker 1>but to me, that sounds not the least bit appealing.

0:32:06.560 --> 0:32:09.880
<v Speaker 1>Why? It's because a giant part of sports is the

0:32:09.920 --> 0:32:12.960
<v Speaker 1>culture of talking about the game. Hey did you see

0:32:12.960 --> 0:32:15.560
<v Speaker 1>that play last night? Can you believe that shot he took?

0:32:15.800 --> 0:32:19.400
<v Speaker 1>Can you believe the call that ref made? And stories are

0:32:19.480 --> 0:32:22.200
<v Speaker 1>analogous to sports in this way. We come to our

0:32:22.240 --> 0:32:25.160
<v Speaker 1>book clubs to take the world that we read in

0:32:25.320 --> 0:32:28.840
<v Speaker 1>solitude and find a community with other people who were

0:32:28.880 --> 0:32:32.160
<v Speaker 1>there with us from their own living rooms. So I

0:32:32.200 --> 0:32:34.800
<v Speaker 1>suggest that as a culture, we are always going to

0:32:35.440 --> 0:32:40.360
<v Speaker 1>desire and need a shared vocabulary, and the only way

0:32:40.360 --> 0:32:43.440
<v Speaker 1>to grow that is to watch the same movies and

0:32:43.520 --> 0:32:45.280
<v Speaker 1>read the same stories.

0:32:45.760 --> 0:32:47.360
<v Speaker 2>And that's why I predict that

0:32:47.400 --> 0:32:53.240
<v Speaker 1>while individualized stories might find niche audiences, they won't replace

0:32:53.440 --> 0:32:58.000
<v Speaker 1>our need for shared stories. This is an interesting dimension

0:32:58.040 --> 0:33:05.280
<v Speaker 1>of literature that's not typically considered. Story gives us social glue. Okay, fine,

0:33:05.320 --> 0:33:08.160
<v Speaker 1>so let's assume that at some point AI could write

0:33:08.160 --> 0:33:12.520
<v Speaker 1>a story that's so evocative and beautiful that it becomes

0:33:12.560 --> 0:33:17.400
<v Speaker 1>a shared story, an adventure which everyone taps into and enjoys.

0:33:17.800 --> 0:33:21.640
<v Speaker 1>And now we arrive at my fourth point about why

0:33:21.680 --> 0:33:25.880
<v Speaker 1>AI won't totally displace creatives, and that is the question

0:33:25.960 --> 0:33:28.840
<v Speaker 1>of whether we get something more out of a piece

0:33:28.880 --> 0:33:32.360
<v Speaker 1>of literature or art if we feel there's

0:33:32.120 --> 0:33:34.120
<v Speaker 2>a heartbeat behind it.

0:33:34.760 --> 0:33:37.640
<v Speaker 1>I read a beautiful quotation in The Atlantic about a

0:33:37.680 --> 0:33:42.080
<v Speaker 1>decade ago quote one of the only requirements for literature

0:33:42.640 --> 0:33:46.240
<v Speaker 1>is that the reader can feel a heart pulsing back

0:33:46.280 --> 0:33:49.600
<v Speaker 1>from them on the other side of the page. The

0:33:49.840 --> 0:33:54.120
<v Speaker 1>heartbeat matters because when we read, we consider the intention

0:33:54.240 --> 0:33:57.719
<v Speaker 1>of the author. We think, oh, this is Mary Shelley,

0:33:57.760 --> 0:34:00.400
<v Speaker 1>whose mother died a couple of weeks after she was born,

0:34:00.480 --> 0:34:03.560
<v Speaker 1>and she had a troubled childhood, and her father homeschooled her.

0:34:03.560 --> 0:34:07.600
<v Speaker 1>And she married the romantic poet Percy Bysshe Shelley, and

0:34:08.000 --> 0:34:10.880
<v Speaker 1>he was already married and his wife committed suicide, and

0:34:10.880 --> 0:34:13.160
<v Speaker 1>they moved to France, and she came back pregnant, and

0:34:13.200 --> 0:34:16.040
<v Speaker 1>they were destitute, and their daughter died. And then they

0:34:16.040 --> 0:34:18.839
<v Speaker 1>went to spend a summer in Geneva with friends, and

0:34:18.880 --> 0:34:21.000
<v Speaker 1>they each set out to write a ghost story, and

0:34:21.080 --> 0:34:23.120
<v Speaker 1>she ended up writing Frankenstein.

0:34:23.640 --> 0:34:24.800
<v Speaker 2>So we read her

0:34:24.719 --> 0:34:28.200
<v Speaker 1>novel and we think, this is her voice, and this

0:34:28.360 --> 0:34:31.080
<v Speaker 1>is her viewpoint on the world, and these were the

0:34:31.120 --> 0:34:33.839
<v Speaker 1>things that she knew and the things she didn't know,

0:34:33.880 --> 0:34:35.279
<v Speaker 1>and the things she couldn't know.

0:34:35.800 --> 0:34:38.640
<v Speaker 2>It isn't just the piece of art itself.

0:34:38.719 --> 0:34:43.760
<v Speaker 1>It is the artist behind the art that colors our experience.

0:34:44.239 --> 0:34:48.480
<v Speaker 1>So imagine we get ChatGPT to adopt Mary Shelley's

0:34:48.600 --> 0:34:52.720
<v Speaker 1>style and write a story involving cell phones and electric cars.

0:34:52.960 --> 0:34:56.200
<v Speaker 1>It might be interesting and amazing, but I suggest we

0:34:56.239 --> 0:35:00.239
<v Speaker 1>wouldn't enjoy it as much because we would recognize there's

0:35:00.360 --> 0:35:05.239
<v Speaker 1>no unique human, no unique beating heart who had the

0:35:05.360 --> 0:35:09.760
<v Speaker 1>experiences and slaved over the words. Now, you could argue

0:35:09.760 --> 0:35:14.120
<v Speaker 1>that almost all of the authors we enjoy, we live

0:35:14.160 --> 0:35:17.080
<v Speaker 1>apart from them in space or time, and we'll never

0:35:17.160 --> 0:35:19.520
<v Speaker 1>meet them, and we just have the vaguest sense of

0:35:19.560 --> 0:35:20.400
<v Speaker 1>their existence.

0:35:20.719 --> 0:35:23.320
<v Speaker 2>And that might be true, but it's still worth

0:35:23.120 --> 0:35:27.600
<v Speaker 1>noting that we know fundamentally that they are human and

0:35:27.640 --> 0:35:30.480
<v Speaker 1>they are like us in some way. They may be

0:35:30.840 --> 0:35:34.400
<v Speaker 1>more successful, or more impoverished, or maybe from a different country,

0:35:34.800 --> 0:35:38.879
<v Speaker 1>but we know that fundamentally they are fellow travelers with

0:35:38.960 --> 0:35:55.600
<v Speaker 1>us on the human journey. Now, obviously we love a

0:35:55.600 --> 0:35:59.000
<v Speaker 1>lot of things that aren't real, like Spider Man or Batman,

0:35:59.560 --> 0:36:02.399
<v Speaker 1>but we also love the actors behind them.

0:36:02.440 --> 0:36:04.759
<v Speaker 1>If you had a chance to have dinner with or

0:36:04.800 --> 0:36:07.719
<v Speaker 1>even to shake the hand of the actor behind some

0:36:07.920 --> 0:36:11.080
<v Speaker 1>fantasy character that you love, you'd be thrilled about this.

0:36:11.640 --> 0:36:13.040
<v Speaker 2>Now, I think that leads

0:36:12.760 --> 0:36:17.000
<v Speaker 1>to an interesting open question about some of these new

0:36:17.400 --> 0:36:20.600
<v Speaker 1>avatars that are hitting the scene with hundreds of thousands

0:36:20.600 --> 0:36:24.600
<v Speaker 1>of followers on Twitter, even though they're fake. They're just avatars,

0:36:24.600 --> 0:36:27.360
<v Speaker 1>they're not real people. The part that strikes me is

0:36:27.400 --> 0:36:30.480
<v Speaker 1>as really interesting is that the ones who get all the

0:36:30.520 --> 0:36:34.080
<v Speaker 1>attention are the creators behind the avatar. In other words,

0:36:34.360 --> 0:36:37.360
<v Speaker 1>if I told you there was an avatar on Twitter,

0:36:37.360 --> 0:36:39.359
<v Speaker 1>with one hundred thousand followers, and you could get

0:36:39.360 --> 0:36:42.040
<v Speaker 1>the chance to meet the young woman behind all this,

0:36:42.520 --> 0:36:45.040
<v Speaker 1>you'd be thrilled. What this tells me is that we

0:36:45.080 --> 0:36:49.600
<v Speaker 1>are compelled by the heartbeat that is just behind the

0:36:49.640 --> 0:36:54.120
<v Speaker 1>actor or the avatar. In many ways, that's more interesting

0:36:54.280 --> 0:36:58.080
<v Speaker 1>to us than the actor or the avatar themselves. Now,

0:36:58.640 --> 0:37:00.719
<v Speaker 1>I don't think this goes on in so let me

0:37:00.760 --> 0:37:03.680
<v Speaker 1>just address the counterpoint. You might say, well, does that

0:37:03.719 --> 0:37:07.200
<v Speaker 1>mean that if AI generated a thousand novels in a second,

0:37:07.239 --> 0:37:09.920
<v Speaker 1>that I'd be really interested in meeting the team of

0:37:10.000 --> 0:37:13.840
<v Speaker 1>young programmers behind that. I don't think so, because meeting

0:37:13.880 --> 0:37:18.319
<v Speaker 1>the programmers doesn't expand your understanding of the story. But

0:37:18.440 --> 0:37:21.880
<v Speaker 1>meeting an author who poured her heart into the story

0:37:21.960 --> 0:37:27.200
<v Speaker 1>for years, that does shape and color and expand your understanding.

0:37:27.560 --> 0:37:29.680
<v Speaker 2>And by the way, beyond writing, I think

0:37:29.480 --> 0:37:33.239
<v Speaker 1>this applies to musical composers and visual artists in the

0:37:33.280 --> 0:37:38.120
<v Speaker 1>same way, and in fact, to all human endeavors. I

0:37:38.239 --> 0:37:40.600
<v Speaker 1>was just talking with a neighbor of mine. He and

0:37:40.640 --> 0:37:43.759
<v Speaker 1>I spend a lot of time on airplanes flying to

0:37:43.840 --> 0:37:47.040
<v Speaker 1>some city in the world to give a talk. He

0:37:47.160 --> 0:37:51.000
<v Speaker 1>just got a three D scan and a high resolution

0:37:51.280 --> 0:37:54.680
<v Speaker 1>avatar of himself made and he can combine that with

0:37:54.800 --> 0:37:58.920
<v Speaker 1>ChatGPT to make his avatar give little speeches. And

0:37:58.960 --> 0:38:01.240
<v Speaker 1>so he and I were really chewing on this because

0:38:01.280 --> 0:38:04.400
<v Speaker 1>the question is, the next time he gets invited to

0:38:04.480 --> 0:38:08.400
<v Speaker 1>speak on some stage and some random city around the world,

0:38:08.800 --> 0:38:12.160
<v Speaker 1>can he just have the avatar give the speech online instead?

0:38:12.560 --> 0:38:15.880
<v Speaker 1>Will conferences still want him to fly across

0:38:15.520 --> 0:38:16.960
<v Speaker 2>the globe to give a talk,

0:38:17.040 --> 0:38:19.719
<v Speaker 1>or will the avatar be good enough and save a

0:38:19.760 --> 0:38:24.280
<v Speaker 1>lot of expense and plane fuel? Possibly. But the flip

0:38:24.320 --> 0:38:29.040
<v Speaker 1>side is: do people value going to the talk because

0:38:29.080 --> 0:38:30.640
<v Speaker 1>of the beating heart

0:38:30.520 --> 0:38:31.440
<v Speaker 2>on the stage?

0:38:32.040 --> 0:38:35.960
<v Speaker 1>And my long bet is that conferences will continue to

0:38:36.080 --> 0:38:40.879
<v Speaker 1>invite flesh and blood humans because audiences are humans who

0:38:41.200 --> 0:38:46.239
<v Speaker 1>care about other humans. So when it comes to legal documents,

0:38:46.280 --> 0:38:48.879
<v Speaker 1>if AI can do it better, awesome. When it comes

0:38:48.920 --> 0:38:52.000
<v Speaker 1>to medical diagnoses, if AI can do it better, awesome.

0:38:52.600 --> 0:38:56.800
<v Speaker 1>When it comes to hearing a speaker on the stage

0:38:57.239 --> 0:39:01.840
<v Speaker 1>with his or her imperfections and limited knowledge and fundamentally

0:39:02.280 --> 0:39:05.600
<v Speaker 1>human nature, I'm going to take the bet that that

0:39:06.040 --> 0:39:10.600
<v Speaker 1>is going to last. And beyond just appreciating the reality

0:39:10.719 --> 0:39:13.560
<v Speaker 1>of another human, this may be for another reason as well:

0:39:14.120 --> 0:39:17.920
<v Speaker 1>an interesting psychological effect that I think is going to

0:39:17.920 --> 0:39:20.279
<v Speaker 1>be at play here. This is what I'm going to

0:39:20.280 --> 0:39:23.880
<v Speaker 1>call the effort phenomenon. I'll give you an example of this.

0:39:24.320 --> 0:39:27.320
<v Speaker 1>A well known colleague of mine here in Silicon Valley

0:39:27.360 --> 0:39:31.240
<v Speaker 1>recently announced that he had published a book half written

0:39:31.280 --> 0:39:34.680
<v Speaker 1>by him and half written by AI. And when I

0:39:34.719 --> 0:39:37.960
<v Speaker 1>first heard about this, I thought, I wish I wanted

0:39:38.000 --> 0:39:41.839
<v Speaker 1>to read this, but I don't. Now, I did take

0:39:41.880 --> 0:39:44.640
<v Speaker 1>a look at the book, and there are clever insights,

0:39:44.680 --> 0:39:48.680
<v Speaker 1>and it's well written. But I'm simply not that inspired

0:39:48.760 --> 0:39:53.040
<v Speaker 1>to read something that's even half written by AI, because

0:39:53.400 --> 0:39:56.600
<v Speaker 1>it makes me feel, perhaps unfairly, that.

0:39:56.640 --> 0:39:58.800
<v Speaker 2>He didn't put in the normal amount of effort.

0:39:59.400 --> 0:40:02.719
<v Speaker 1>My analogy would be if Picasso said, hey, will

0:40:02.760 --> 0:40:05.719
<v Speaker 1>you buy this painting? My students painted most of it,

0:40:05.760 --> 0:40:07.600
<v Speaker 1>but then I finished it off and put my signature

0:40:07.640 --> 0:40:10.200
<v Speaker 1>on it. It feels like it would be slightly less valuable.

0:40:10.800 --> 0:40:14.000
<v Speaker 1>So let's return to that scene in Westworld where William

0:40:14.120 --> 0:40:18.000
<v Speaker 1>asks the host are you real? And she says if

0:40:18.040 --> 0:40:21.600
<v Speaker 1>you can't tell, does it matter? Because this is the question

0:40:21.640 --> 0:40:22.480
<v Speaker 1>that comes up

0:40:22.600 --> 0:40:24.040
<v Speaker 2>about a novel.

0:40:24.320 --> 0:40:27.680
<v Speaker 1>If I spend seven years writing a novel, and if

0:40:28.000 --> 0:40:31.439
<v Speaker 1>ChatGPT or Google Bard spits out a novel that's

0:40:31.520 --> 0:40:33.000
<v Speaker 1>word for word equivalent,

0:40:33.719 --> 0:40:34.560
<v Speaker 2>does it matter?

0:40:35.120 --> 0:40:39.680
<v Speaker 1>And I think, perhaps surprisingly, the answer is yes, it matters.

0:40:40.200 --> 0:40:43.279
<v Speaker 1>We care about the effort that went into it. If

0:40:43.320 --> 0:40:45.440
<v Speaker 1>I were to show you two pieces of artwork that

0:40:45.560 --> 0:40:48.280
<v Speaker 1>someone had done, and one of them just involves painting

0:40:48.320 --> 0:40:51.440
<v Speaker 1>a single dot on the middle of a big white canvas,

0:40:51.480 --> 0:40:55.920
<v Speaker 1>and the other one is the person carefully gluing marbles

0:40:55.960 --> 0:40:58.399
<v Speaker 1>one on top of each other until they balance eight

0:40:58.440 --> 0:41:01.200
<v Speaker 1>feet high. You may have a preference for looking at

0:41:01.239 --> 0:41:03.279
<v Speaker 1>one or the other, but just think about how much

0:41:03.320 --> 0:41:06.000
<v Speaker 1>money you would, in theory, be willing to pay for

0:41:06.080 --> 0:41:08.919
<v Speaker 1>each of these. If you're like most people, you think

0:41:08.960 --> 0:41:12.439
<v Speaker 1>the thing that took a lot of effort is worth more.

0:41:13.080 --> 0:41:16.640
<v Speaker 1>There have been psychology studies on this since the nineteen fifties.

0:41:17.000 --> 0:41:20.040
<v Speaker 1>It's difficult for people to separate out the effort that

0:41:20.120 --> 0:41:24.080
<v Speaker 1>went into something from its value. In other words, the

0:41:24.160 --> 0:41:29.640
<v Speaker 1>effort is used as a shortcut for understanding quality. For example,

0:41:29.640 --> 0:41:33.200
<v Speaker 1>in one paper done by Kruger et al., they had

0:41:33.400 --> 0:41:37.200
<v Speaker 1>people rate a poem, or rate a painting, or rate

0:41:37.239 --> 0:41:40.239
<v Speaker 1>a suit of armor, and the people generally thought it

0:41:40.320 --> 0:41:43.480
<v Speaker 1>was better quality and worth more money, and they liked

0:41:43.520 --> 0:41:46.839
<v Speaker 1>it better if they thought it took more time and

0:41:46.920 --> 0:41:50.160
<v Speaker 1>effort to produce. A friend of mine uses the example

0:41:50.280 --> 0:41:54.040
<v Speaker 1>of diamonds. People will pay much more money for a

0:41:54.200 --> 0:41:58.640
<v Speaker 1>real diamond with flaws than they will for a synthetically

0:41:58.800 --> 0:42:02.799
<v Speaker 1>grown diamond from a laboratory that has no flaws at all. Now,

0:42:02.800 --> 0:42:06.759
<v Speaker 1>why would you pay extra money for flaws? Part of

0:42:06.800 --> 0:42:09.239
<v Speaker 1>this has to do with the notion of effort. The

0:42:09.280 --> 0:42:13.480
<v Speaker 1>real diamond was produced by mother nature over millions of

0:42:13.680 --> 0:42:17.719
<v Speaker 1>years of compression, so it's a very special thing that

0:42:17.800 --> 0:42:20.680
<v Speaker 1>took quote unquote effort on the part of mother nature.

0:42:21.160 --> 0:42:23.920
<v Speaker 2>But the lab grown diamond that can be done in

0:42:23.960 --> 0:42:25.040
<v Speaker 2>a day and a half.

0:42:25.440 --> 0:42:28.640
<v Speaker 1>And so even though it's more perfect, it is less

0:42:28.719 --> 0:42:31.480
<v Speaker 1>valuable because it just took less time to make it.

0:42:32.000 --> 0:42:33.840
<v Speaker 2>We actually pay for flaws.

0:42:34.600 --> 0:42:37.000
<v Speaker 1>Now, I'm not arguing that we can't be fooled at

0:42:37.000 --> 0:42:41.760
<v Speaker 1>some point into loving AI generated literature. It seems quite

0:42:41.800 --> 0:42:44.000
<v Speaker 1>possible to me that in the future there will be

0:42:44.120 --> 0:42:48.080
<v Speaker 1>novels written by AI, and we might not always know it,

0:42:48.360 --> 0:42:53.080
<v Speaker 1>because the AI will also generate a false story about

0:42:53.200 --> 0:42:57.480
<v Speaker 1>the author, complete with a biography and a generated photograph.

0:42:57.800 --> 0:43:00.920
<v Speaker 1>My assertion is simply that faking it is going to

0:43:00.920 --> 0:43:03.680
<v Speaker 1>be an important part of what the AI will need

0:43:03.719 --> 0:43:08.080
<v Speaker 1>to do, because it's more difficult to become invested in

0:43:08.160 --> 0:43:12.360
<v Speaker 1>something that we think is simply doing massive statistical calculations

0:43:12.920 --> 0:43:18.320
<v Speaker 1>rather than having a private, limited internal life. We care

0:43:18.600 --> 0:43:23.240
<v Speaker 1>about other humans. So what's the big picture? My friend

0:43:23.520 --> 0:43:27.400
<v Speaker 1>Kevin Kelly suggested to me the other day that generative

0:43:27.480 --> 0:43:31.400
<v Speaker 1>AI may play a role that's analogous to the invention

0:43:31.560 --> 0:43:35.120
<v Speaker 1>of the camera. What happened at that moment in history

0:43:35.239 --> 0:43:39.200
<v Speaker 1>was that painters lamented that this was the end of

0:43:39.320 --> 0:43:43.799
<v Speaker 1>painting because you could now capture anything instantly with the

0:43:43.840 --> 0:43:46.080
<v Speaker 1>click of a button, and you could capture it with

0:43:46.160 --> 0:43:48.960
<v Speaker 1>zero mistakes. So why would you sit there with a

0:43:49.000 --> 0:43:53.920
<v Speaker 1>paintbrush and painstakingly try to capture every detail by hand?

0:43:54.480 --> 0:43:58.839
<v Speaker 1>At that moment in history, it seemed clear that painters

0:43:59.280 --> 0:44:03.319
<v Speaker 1>were done for. But as it turns out, photographs ended

0:44:03.400 --> 0:44:05.200
<v Speaker 1>up filling a different niche.

0:44:05.960 --> 0:44:09.879
<v Speaker 2>Absolute realism wasn't the only end goal of art.

0:44:10.360 --> 0:44:15.360
<v Speaker 1>People didn't only want a maximally realistic print of a scene.

0:44:15.440 --> 0:44:19.480
<v Speaker 1>They also wanted swirls and amazing color, and more importantly,

0:44:19.600 --> 0:44:23.360
<v Speaker 1>things that didn't exist in the outside world. So canvas

0:44:23.400 --> 0:44:28.560
<v Speaker 1>painting remained an active field, even while photography grew and

0:44:28.719 --> 0:44:33.720
<v Speaker 1>ended up flowering on a neighboring field. So one possibility

0:44:34.280 --> 0:44:38.560
<v Speaker 1>is that AI generated literature will not foment a takeover,

0:44:39.040 --> 0:44:42.719
<v Speaker 1>but instead it's going to fill a new niche, one

0:44:42.760 --> 0:44:45.319
<v Speaker 1>that we don't quite see yet, but it isn't the

0:44:45.360 --> 0:44:48.839
<v Speaker 1>same plot of land. And I think there's one more

0:44:48.880 --> 0:44:51.879
<v Speaker 1>possibility for where this could go for writers, not now,

0:44:51.960 --> 0:44:54.920
<v Speaker 1>but in the coming years. And for that, I want

0:44:54.960 --> 0:44:57.919
<v Speaker 1>to tell you what happened with the world champion Go

0:44:58.080 --> 0:45:02.000
<v Speaker 1>player Ke Jie. He was the world's number one player

0:45:02.160 --> 0:45:04.799
<v Speaker 1>at Go, which is the game in which you use

0:45:04.880 --> 0:45:08.680
<v Speaker 1>those small black or white rocks to define your territory

0:45:08.680 --> 0:45:11.520
<v Speaker 1>and try to surround your opponent. So in May of

0:45:11.600 --> 0:45:17.080
<v Speaker 1>twenty seventeen, he faced off against an AI program called

0:45:17.280 --> 0:45:21.160
<v Speaker 1>AlphaGo, which was designed by DeepMind, and AlphaGo

0:45:21.200 --> 0:45:24.239
<v Speaker 1>had been trained on millions and millions of games

0:45:24.280 --> 0:45:28.400
<v Speaker 1>of Go, so it had deeply absorbed the statistics of

0:45:28.600 --> 0:45:33.960
<v Speaker 1>possible plays. So they played the first game and Jie lost.

0:45:34.520 --> 0:45:38.960
<v Speaker 1>AlphaGo had pulled moves that none of his human

0:45:39.000 --> 0:45:42.799
<v Speaker 1>opponents had ever thought of, and then Jie lost the

0:45:42.880 --> 0:45:46.319
<v Speaker 1>second game. The AI had won over a human in

0:45:46.360 --> 0:45:50.279
<v Speaker 1>a game that's way more complex than chess, and subsequent

0:45:50.440 --> 0:45:53.480
<v Speaker 1>versions of the AI are no doubt going to continue

0:45:53.520 --> 0:45:56.759
<v Speaker 1>to win evermore. But that's not the interesting part of

0:45:56.800 --> 0:46:01.640
<v Speaker 1>the story. The interesting part is what happened next. So

0:46:01.960 --> 0:46:06.799
<v Speaker 1>Jie got over his embarrassment and he became mesmerized by

0:46:06.960 --> 0:46:11.920
<v Speaker 1>what had just transpired, and he studied the games

0:46:11.560 --> 0:46:12.320
<v Speaker 2>that he lost.

0:46:13.400 --> 0:46:17.520
<v Speaker 1>Before he played AlphaGo, Jie had won a majority

0:46:17.680 --> 0:46:21.920
<v Speaker 1>of the games against his human opponents, but afterwards he

0:46:22.000 --> 0:46:25.240
<v Speaker 1>found he was able to beat his human opponents even

0:46:25.360 --> 0:46:31.160
<v Speaker 1>more easily. After his species-shaming defeats in twenty seventeen,

0:46:31.520 --> 0:46:35.160
<v Speaker 1>he went on to play twelve straight matches against humans, and

0:46:35.160 --> 0:46:38.560
<v Speaker 2>he won them all in a row. So what had happened?

0:46:39.360 --> 0:46:43.400
<v Speaker 3>He had been exposed to new kinds of moves and

0:46:43.600 --> 0:46:47.320
<v Speaker 3>strategies that had been pulled by AlphaGo, and these

0:46:47.600 --> 0:46:51.279
<v Speaker 3>all lay outside of traditional ways of doing it.

0:46:51.600 --> 0:46:54.080
<v Speaker 2>All these moves that AlphaGo had done

0:46:53.920 --> 0:46:57.719
<v Speaker 1>were legal and possible, but they were just different from

0:46:57.719 --> 0:47:01.040
<v Speaker 1>what had been played over the last twenty five hundred years.

0:47:01.400 --> 0:47:02.799
<v Speaker 2>If you're a Go aficionado,

0:47:02.840 --> 0:47:07.000
<v Speaker 1>this included things like playing a stone directly diagonal to

0:47:07.520 --> 0:47:12.319
<v Speaker 1>your opponent's lone stone, or playing six space extensions, while

0:47:12.400 --> 0:47:13.680
<v Speaker 1>humans tend to prefer

0:47:13.600 --> 0:47:15.080
<v Speaker 2>five space. Anyway,

0:47:15.440 --> 0:47:21.320
<v Speaker 1>Jie reported that playing against the AI was like opening

0:47:20.920 --> 0:47:22.560
<v Speaker 2>a door to another world.

0:47:22.840 --> 0:47:27.080
<v Speaker 1>Once he was exposed to these alien game plays, he

0:47:27.200 --> 0:47:33.120
<v Speaker 1>incorporated them, and this story, I suspect, typifies the future

0:47:33.719 --> 0:47:37.480
<v Speaker 1>as humans and machines interface. Some people are worried that

0:47:37.560 --> 0:47:41.160
<v Speaker 1>AI is going to take over, but we will continue

0:47:41.200 --> 0:47:45.359
<v Speaker 1>to adapt as well. We will become better writers as

0:47:45.400 --> 0:47:48.880
<v Speaker 1>we see examples that are allowed by the language but

0:47:49.120 --> 0:47:52.960
<v Speaker 1>no one had ever tried it, or visual art techniques

0:47:52.960 --> 0:47:56.759
<v Speaker 1>that involve moves that are allowable, but culturally we just

0:47:56.880 --> 0:47:59.879
<v Speaker 1>never thought to do it, or musical moves that are

0:48:00.160 --> 0:48:00.839
<v Speaker 1>possible to

0:48:00.840 --> 0:48:03.799
<v Speaker 2>do with notes, but no one does

0:48:03.560 --> 0:48:06.800
<v Speaker 1>them because traditionally we just wouldn't think of going there.

0:48:06.920 --> 0:48:10.160
<v Speaker 1>Because fundamentally, as a writer, I think I'm doing all

0:48:10.280 --> 0:48:13.640
<v Speaker 1>kinds of original things, but there's a very real sense

0:48:13.680 --> 0:48:18.799
<v Speaker 1>in which I'm simply remixing what I've absorbed before. I

0:48:18.880 --> 0:48:22.960
<v Speaker 1>interpolate between examples that I've seen. So even if AI

0:48:23.160 --> 0:48:28.280
<v Speaker 1>is just interpolating, it's read billions of times more texts

0:48:28.320 --> 0:48:33.600
<v Speaker 1>than I have, and so it can do very clever interpolations,

0:48:33.640 --> 0:48:37.080
<v Speaker 1>and I can learn from that. A lot of people

0:48:37.080 --> 0:48:40.840
<v Speaker 1>are worried that AI is going to leave humans far behind,

0:48:40.960 --> 0:48:44.720
<v Speaker 1>and in many respects that's true. But as computers improve,

0:48:45.560 --> 0:48:49.840
<v Speaker 1>so will we. In the battle of man and machine,

0:48:50.600 --> 0:48:53.600
<v Speaker 1>both are going to get better, and as we continue

0:48:53.640 --> 0:48:58.399
<v Speaker 1>to adapt in parallel, the future definition of AI may

0:48:58.440 --> 0:49:04.880
<v Speaker 1>well shift from artificial intelligence to augmented intelligence. In

0:49:04.920 --> 0:49:07.680
<v Speaker 1>the best case scenario, this isn't going to be a war,

0:49:08.160 --> 0:49:12.440
<v Speaker 1>but a collaboration. It's going to be an ongoing, guided

0:49:12.600 --> 0:49:19.560
<v Speaker 1>tour into areas that were previously just beyond our view.

0:49:22.920 --> 0:49:24.080
<v Speaker 2>That's all for this week.

0:49:24.360 --> 0:49:26.839
<v Speaker 1>To find out more and to share your thoughts, head

0:49:26.840 --> 0:49:30.680
<v Speaker 1>over to eagleman.com/podcasts, and you can

0:49:30.719 --> 0:49:34.280
<v Speaker 1>also watch full episodes of Inner Cosmos on YouTube.

0:49:34.640 --> 0:49:36.480
<v Speaker 2>Subscribe to my channel so you can

0:49:36.320 --> 0:49:40.000
<v Speaker 1>follow along each week for new updates. Until next time,

0:49:40.360 --> 0:49:43.719
<v Speaker 2>I'm David Eagleman, and this is Inner Cosmos.