WEBVTT - Ep31 "Why do we see #TheDress differently?"

0:00:05.559 --> 0:00:08.959
<v Speaker 1>What's up with those illusions on the Internet where you

0:00:09.000 --> 0:00:12.559
<v Speaker 1>can hear the same sound one of two different ways

0:00:12.600 --> 0:00:15.640
<v Speaker 1>depending on the word that you're looking at? And why

0:00:15.640 --> 0:00:19.159
<v Speaker 1>do electrical outlets sometimes look like a face to you?

0:00:19.960 --> 0:00:23.440
<v Speaker 1>How can you have full, rich visual experience with your

0:00:23.480 --> 0:00:27.080
<v Speaker 1>eyes closed? And when you want to cross a street

0:00:27.160 --> 0:00:30.120
<v Speaker 1>and you hit that crosswalk button, are some of those

0:00:30.120 --> 0:00:32.680
<v Speaker 1>buttons fake and they don't actually do anything?

0:00:33.320 --> 0:00:34.519
<v Speaker 1>And why are there some

0:00:34.640 --> 0:00:38.040
<v Speaker 1>pictures that you can only see once you're told what

0:00:38.080 --> 0:00:41.920
<v Speaker 1>you're looking at? And although brains are often celebrated for

0:00:41.960 --> 0:00:46.200
<v Speaker 1>their parallel processing, what should they really be celebrated for?

0:00:49.880 --> 0:00:53.000
<v Speaker 1>Welcome to Inner Cosmos with me, David Eagleman. I'm a

0:00:53.080 --> 0:00:57.560
<v Speaker 1>neuroscientist and author at Stanford and in these episodes we

0:00:57.680 --> 0:01:02.160
<v Speaker 1>sail deeply into our three-pound universe to understand why

0:01:02.200 --> 0:01:05.040
<v Speaker 1>we perceive the world in the ways that we do.

0:01:13.520 --> 0:01:18.080
<v Speaker 1>Today's episode is about expectations and what that has to

0:01:18.120 --> 0:01:24.480
<v Speaker 1>do with perception. Unless you were living in outer space

0:01:24.720 --> 0:01:28.000
<v Speaker 1>or off the grid in twenty fifteen, your life was

0:01:28.120 --> 0:01:33.240
<v Speaker 1>touched by a very tiny, specific event that happened on

0:01:33.280 --> 0:01:37.679
<v Speaker 1>a small island in Scotland. Two young people were going

0:01:37.720 --> 0:01:40.680
<v Speaker 1>to get married there, and a week before the wedding,

0:01:41.080 --> 0:01:43.720
<v Speaker 1>the mother of the bride was shopping around for what

0:01:43.840 --> 0:01:47.680
<v Speaker 1>she was going to wear. So she finds some outfits

0:01:47.760 --> 0:01:50.720
<v Speaker 1>at a store down in Chester, England that she thinks

0:01:50.760 --> 0:01:54.080
<v Speaker 1>will look nice, and while she's making the decision, she

0:01:54.360 --> 0:01:57.480
<v Speaker 1>snaps pictures of each of them and she buys one

0:01:57.520 --> 0:01:57.800
<v Speaker 1>of them.

0:01:58.520 --> 0:01:59.880
<v Speaker 1>So she's driving home

0:01:59.720 --> 0:02:03.560
<v Speaker 1>afterwards and she texts the pictures of the three

0:02:03.600 --> 0:02:07.320
<v Speaker 1>outfits to her daughter and she tells her that she

0:02:07.480 --> 0:02:10.560
<v Speaker 1>had bought the third one, and no one could have

0:02:10.720 --> 0:02:14.280
<v Speaker 1>ever guessed that this particular piece of clothing that she

0:02:14.400 --> 0:02:18.680
<v Speaker 1>sent a picture of, this one-piece garment, is about

0:02:18.760 --> 0:02:22.440
<v Speaker 1>to become the most famous outfit that ever existed in

0:02:22.480 --> 0:02:27.080
<v Speaker 1>the history of humankind, because the daughter writes back to

0:02:27.280 --> 0:02:32.680
<v Speaker 1>clarify which outfit the mother had bought, and she texts, oh,

0:02:32.919 --> 0:02:37.200
<v Speaker 1>the white and gold one, and the mother texts back, no,

0:02:37.760 --> 0:02:43.240
<v Speaker 1>it's blue and black, and the daughter replies, Mom, if

0:02:43.280 --> 0:02:45.600
<v Speaker 1>you think that's blue and black, you need to go

0:02:45.639 --> 0:02:49.400
<v Speaker 1>and see the doctor. So the mother shows the phone

0:02:49.440 --> 0:02:52.359
<v Speaker 1>to her partner in the car, who, despite having been

0:02:52.400 --> 0:02:54.519
<v Speaker 1>there and bought the dress with her, looks at the

0:02:54.560 --> 0:02:57.000
<v Speaker 1>photo and says, yeah, I think it's white and gold.

0:02:57.680 --> 0:02:59.400
<v Speaker 1>So when they get home, they show the picture to

0:02:59.440 --> 0:03:02.600
<v Speaker 1>their younger, who agrees with the mother that the photo

0:03:02.639 --> 0:03:08.600
<v Speaker 1>looks blue and black. So, given this funny disagreement, the

0:03:08.760 --> 0:03:11.880
<v Speaker 1>bride-to-be posts the photo to her friends on

0:03:11.960 --> 0:03:16.560
<v Speaker 1>Facebook to settle this, and to her surprise, she doesn't

0:03:16.600 --> 0:03:20.840
<v Speaker 1>find consensus. Some think it's black and blue, others think

0:03:20.919 --> 0:03:25.800
<v Speaker 1>it's white and gold, and each person feels totally certain

0:03:25.880 --> 0:03:29.160
<v Speaker 1>about what they see. So for about a week, this

0:03:29.280 --> 0:03:34.200
<v Speaker 1>debate bubbles around in this small island community. The day

0:03:34.200 --> 0:03:36.920
<v Speaker 1>of the wedding arrives and the mother wears the dress

0:03:36.960 --> 0:03:40.280
<v Speaker 1>to the event, and the issue about the photo becomes

0:03:40.480 --> 0:03:43.080
<v Speaker 1>such a point of discussion that the musicians in the

0:03:43.120 --> 0:03:46.280
<v Speaker 1>band allegedly almost didn't make it onto the stage to

0:03:46.360 --> 0:03:48.720
<v Speaker 1>play because they were wrapped up in the debate.

0:03:49.720 --> 0:03:51.240
<v Speaker 1>So a few days after

0:03:50.960 --> 0:03:53.200
<v Speaker 1>the wedding, one of the band members, who was a

0:03:53.280 --> 0:03:57.080
<v Speaker 1>friend of the happy couple, she posts the photo to

0:03:57.200 --> 0:04:00.520
<v Speaker 1>her blog on Tumblr, and by the end of the

0:04:00.600 --> 0:04:05.400
<v Speaker 1>day it gets five thousand comments, and soon enough, the

0:04:05.640 --> 0:04:09.839
<v Speaker 1>data scientists at Tumblr are examining this post because it's

0:04:09.880 --> 0:04:15.080
<v Speaker 1>getting fourteen thousand views each second. That's close to a

0:04:15.160 --> 0:04:20.600
<v Speaker 1>million views each minute. So a woman on the BuzzFeed

0:04:20.720 --> 0:04:24.360
<v Speaker 1>social media team sets up a poll about the color

0:04:24.600 --> 0:04:27.680
<v Speaker 1>for Tumblr users, and then she packs up and goes

0:04:27.720 --> 0:04:29.960
<v Speaker 1>home on the subway. And by the time she gets

0:04:30.080 --> 0:04:34.440
<v Speaker 1>off the subway, her phone is overwhelmed, and soon enough

0:04:34.480 --> 0:04:38.840
<v Speaker 1>the BuzzFeed page hits new records for how many unique

0:04:38.920 --> 0:04:41.360
<v Speaker 1>visitors were on the page at the same time, hitting

0:04:41.400 --> 0:04:45.599
<v Speaker 1>almost seven hundred thousand. The number of comments on the

0:04:45.640 --> 0:04:50.200
<v Speaker 1>original post increases tenfold that night. By late that night,

0:04:50.279 --> 0:04:56.120
<v Speaker 1>there are five thousand tweets per minute using hashtag the dress,

0:04:56.320 --> 0:04:58.640
<v Speaker 1>and by the middle of that night it's grown to

0:04:58.760 --> 0:05:02.560
<v Speaker 1>eleven thousand tweets per minute. Within the week, more

0:05:02.600 --> 0:05:06.400
<v Speaker 1>than ten million tweets are talking about the dress. This

0:05:06.680 --> 0:05:11.840
<v Speaker 1>was the dress that, as they say, broke the Internet. Now,

0:05:11.920 --> 0:05:15.360
<v Speaker 1>if you were, say, a space alien, you might look

0:05:15.400 --> 0:05:18.920
<v Speaker 1>at all this human activity and think, wait, what, why

0:05:19.080 --> 0:05:22.440
<v Speaker 1>is the world stopping over a simple picture of a

0:05:22.560 --> 0:05:26.479
<v Speaker 1>piece of clothing in the UK? Now, the answer, as

0:05:26.520 --> 0:05:28.640
<v Speaker 1>you know, is that none of us humans would have

0:05:28.680 --> 0:05:33.040
<v Speaker 1>found it interesting either, except that someone that you loved

0:05:33.040 --> 0:05:35.720
<v Speaker 1>and trusted said, what do you mean you're seeing it

0:05:35.800 --> 0:05:39.880
<v Speaker 1>that color? It's so clearly the other color. And you said, wait,

0:05:39.960 --> 0:05:42.599
<v Speaker 1>what are you being serious? And they asked you the

0:05:42.640 --> 0:05:48.080
<v Speaker 1>same and then the awe sets in. You both realize

0:05:48.120 --> 0:05:51.320
<v Speaker 1>that you're looking at the same thing in the outside world,

0:05:51.680 --> 0:05:58.960
<v Speaker 1>and you're having different perceptions, a different experience on the inside. Now,

0:05:59.160 --> 0:06:02.880
<v Speaker 1>no one was more excited about the dress than neuroscientists,

0:06:02.920 --> 0:06:08.000
<v Speaker 1>because for neuroscientists this was a terrific demonstration of what

0:06:08.040 --> 0:06:11.400
<v Speaker 1>we're going to talk about today. So to start things off,

0:06:11.520 --> 0:06:16.920
<v Speaker 1>let's just point out how important these kinds of perceptual

0:06:17.000 --> 0:06:20.760
<v Speaker 1>oddities are to neuroscience. I've spent a big chunk of

0:06:20.800 --> 0:06:26.760
<v Speaker 1>my career studying illusions. I've published scientific papers about illusions

0:06:26.800 --> 0:06:30.280
<v Speaker 1>in journals like Science and Nature. And some years ago

0:06:30.360 --> 0:06:33.960
<v Speaker 1>I wrote a review article in the journal Nature Reviews Neuroscience,

0:06:34.240 --> 0:06:38.400
<v Speaker 1>and I titled it Visual Illusions and the Brain. And

0:06:38.440 --> 0:06:42.760
<v Speaker 1>in that article I laid out how powerful illusions are

0:06:43.000 --> 0:06:46.599
<v Speaker 1>for figuring out what is under the hood. Sometimes I

0:06:46.600 --> 0:06:49.520
<v Speaker 1>feel like illusions are interesting only to ten-year-olds,

0:06:49.560 --> 0:06:53.880
<v Speaker 1>and for most people they become nothing but entertainment. But truthfully,

0:06:54.040 --> 0:06:59.040
<v Speaker 1>illusions are microscopes for understanding what is happening in the brain.

0:07:00.200 --> 0:07:04.640
<v Speaker 1>With them, we can reveal the systematic differences between what is

0:07:04.839 --> 0:07:07.719
<v Speaker 1>actually out there in the world and what we believe

0:07:08.160 --> 0:07:12.600
<v Speaker 1>is out there. And by dialing the illusion around carefully,

0:07:12.960 --> 0:07:16.840
<v Speaker 1>we can usually put constraints on how the network of

0:07:16.960 --> 0:07:22.440
<v Speaker 1>neurons must be operating. Now, most illusions are the type

0:07:22.440 --> 0:07:26.080
<v Speaker 1>in which we measure what's being presented in the outside world,

0:07:26.240 --> 0:07:30.080
<v Speaker 1>like two lines of identical lengths and you see it

0:07:30.160 --> 0:07:33.440
<v Speaker 1>as two different lengths, and we say, ah, there's a

0:07:33.480 --> 0:07:37.400
<v Speaker 1>systematic difference between what's on the page and what you perceive.

0:07:38.040 --> 0:07:40.000
<v Speaker 1>Or maybe I show you two

0:07:39.720 --> 0:07:43.520
<v Speaker 1>parallel lines against some background and you don't see them

0:07:43.640 --> 0:07:47.160
<v Speaker 1>as parallel. Or you look at a totally static picture

0:07:47.200 --> 0:07:50.920
<v Speaker 1>on a page and you swear that it's moving. But

0:07:51.000 --> 0:07:54.320
<v Speaker 1>the dress was interesting because it wasn't that traditional kind

0:07:54.360 --> 0:07:58.800
<v Speaker 1>of illusion. Instead, one person sees one thing and the

0:07:58.880 --> 0:08:01.480
<v Speaker 1>person standing right next to them sees another.

0:08:02.640 --> 0:08:04.160
<v Speaker 1>Now, what all

0:08:03.880 --> 0:08:06.800
<v Speaker 1>illusions, including the dress, tell us right away is a

0:08:07.000 --> 0:08:11.160
<v Speaker 1>foundational point that's not always intuitive, which is that we

0:08:11.240 --> 0:08:15.640
<v Speaker 1>don't simply look at the world and passively receive what's

0:08:15.680 --> 0:08:22.240
<v Speaker 1>out there. Instead, our brains actively construct our perception, and

0:08:22.640 --> 0:08:26.480
<v Speaker 1>different brains can do so differently. So now let's move

0:08:26.600 --> 0:08:30.680
<v Speaker 1>deeper into this mystery by turning to a different illusion

0:08:31.000 --> 0:08:33.480
<v Speaker 1>that took over the Internet a few years later, in

0:08:33.559 --> 0:08:35.240
<v Speaker 1>May of twenty eighteen.

0:08:36.120 --> 0:08:40.040
<v Speaker 3>Laurel, Laurel, Laurel.

0:08:41.080 --> 0:08:44.520
<v Speaker 1>Now, this was an audio file that was originally recorded

0:08:44.559 --> 0:08:47.800
<v Speaker 1>by a reader in two thousand and seven for

0:08:47.840 --> 0:08:51.360
<v Speaker 1>Vocabulary.com, and some students apparently re-recorded that file

0:08:51.440 --> 0:08:54.679
<v Speaker 1>while there was some background noise in a room. So

0:08:55.280 --> 0:08:59.360
<v Speaker 1>a fifteen-year-old freshman in Georgia named Katie was

0:08:59.440 --> 0:09:03.120
<v Speaker 1>listening to that recording and she realized that she was

0:09:03.200 --> 0:09:07.560
<v Speaker 1>hearing some funny ambiguity, and she posted this little audio

0:09:07.600 --> 0:09:10.440
<v Speaker 1>clip on Instagram, and the next day her friend posted

0:09:10.440 --> 0:09:12.400
<v Speaker 1>it on Reddit, and then it got picked up on

0:09:12.440 --> 0:09:13.280
<v Speaker 1>Twitter and

0:09:13.320 --> 0:09:14.800
<v Speaker 1>soon it went nuts.

0:09:15.440 --> 0:09:20.120
<v Speaker 1>Why? Because, just like the dress, people can have a

0:09:20.320 --> 0:09:25.040
<v Speaker 1>different perception of the same item presented to their senses.

0:09:25.559 --> 0:09:29.240
<v Speaker 1>About half the people hear the word Yanny and the

0:09:29.400 --> 0:09:32.200
<v Speaker 1>other half hear the word Laurel.

0:09:32.800 --> 0:09:38.600
<v Speaker 3>Laurel, Laurel, Laurel, Laurel.

0:09:39.840 --> 0:09:44.640
<v Speaker 1>Now, how can people hear different things? So hang tight,

0:09:44.720 --> 0:09:46.360
<v Speaker 1>I'll tell you in a minute. But what I want

0:09:46.400 --> 0:09:48.720
<v Speaker 1>to point out for now is that, just like the dress,

0:09:48.760 --> 0:09:52.040
<v Speaker 1>some people have one experience, some people have another, same

0:09:52.160 --> 0:09:57.920
<v Speaker 1>sound recording, different experiences. Now, the Yanny Laurel clip made

0:09:57.960 --> 0:10:00.960
<v Speaker 1>its rounds on the internet, but at about the exact

0:10:01.040 --> 0:10:05.040
<v Speaker 1>same time, in May of twenty eighteen, something even better

0:10:05.120 --> 0:10:09.920
<v Speaker 1>surfaced on YouTube. A guy had posted a video where

0:10:09.960 --> 0:10:14.199
<v Speaker 1>he was reviewing a children's toy from the Ben 10 franchise,

0:10:14.679 --> 0:10:18.600
<v Speaker 1>and the toy lights up and says something. And here's

0:10:18.640 --> 0:10:22.320
<v Speaker 1>what it sounds like. It says the word green needle.

0:10:22.600 --> 0:10:34.080
<v Speaker 1>So listen carefully for green needle. Okay, well, that's not

0:10:34.280 --> 0:10:37.079
<v Speaker 1>actually what the toy was saying. It was actually saying

0:10:37.120 --> 0:10:41.600
<v Speaker 1>the word brainstorm, which is the toy character's name. So

0:10:41.760 --> 0:10:52.320
<v Speaker 1>listen for the word brainstorm.

0:10:52.440 --> 0:10:53.079
<v Speaker 1>Now, I just

0:10:53.040 --> 0:10:56.840
<v Speaker 1>played the exact same audio file in both cases, but

0:10:56.960 --> 0:11:01.800
<v Speaker 1>depending on your expectation, what you were listening for, you'll

0:11:01.920 --> 0:11:05.400
<v Speaker 1>hear different things. So I'm going to play this file again,

0:11:05.840 --> 0:11:08.160
<v Speaker 1>over and over for about twenty seconds, and I want

0:11:08.160 --> 0:11:13.240
<v Speaker 1>you to think about brainstorm or think about green needle.

0:11:13.840 --> 0:11:16.840
<v Speaker 1>Try to go back and forth about which one you're hearing.

0:11:17.000 --> 0:11:19.720
<v Speaker 1>Switch your thinking from one to the other at any point.

0:11:39.200 --> 0:11:40.840
<v Speaker 1>So, what the heck's going on here?

0:11:41.000 --> 0:11:45.480
<v Speaker 1>How can a single audio file be heard two completely

0:11:45.480 --> 0:11:51.160
<v Speaker 1>different ways? Seems like magic, but it's actually neuroscience. All

0:11:51.240 --> 0:11:56.679
<v Speaker 1>these internet memes actually give deep insight into a fundamental

0:11:57.080 --> 0:12:00.640
<v Speaker 1>and rarely appreciated property of the brain. So I'm going

0:12:00.679 --> 0:12:04.640
<v Speaker 1>to unpack these illusions in a few steps. The first

0:12:04.720 --> 0:12:07.920
<v Speaker 1>clue to the mystery is that the brain does not

0:12:08.240 --> 0:12:13.280
<v Speaker 1>tolerate ambiguity. It really wants to come to a conclusion

0:12:13.440 --> 0:12:17.800
<v Speaker 1>about exactly what's out there. Now, that's a major daily

0:12:17.920 --> 0:12:20.560
<v Speaker 1>challenge for the brain because so much of what you

0:12:20.640 --> 0:12:25.080
<v Speaker 1>see or hear is ambiguous. You have data points that

0:12:25.120 --> 0:12:28.400
<v Speaker 1>come streaming into the brain through the eyes, or the ears,

0:12:28.480 --> 0:12:32.920
<v Speaker 1>or the fingertips, but often they could be interpreted more

0:12:32.960 --> 0:12:36.120
<v Speaker 1>than one way. So what does the brain do in

0:12:36.160 --> 0:12:41.520
<v Speaker 1>this circumstance? It locks onto a single way of understanding it.

0:12:42.320 --> 0:12:46.320
<v Speaker 1>In other words, if there are multiple possibilities, it'll force

0:12:46.440 --> 0:12:49.840
<v Speaker 1>an answer. Now let's pause for just a moment to

0:12:49.880 --> 0:12:53.880
<v Speaker 1>appreciate something here. When you read about the brain, you

0:12:53.960 --> 0:12:57.920
<v Speaker 1>always see it celebrated for its parallel processing. It can

0:12:58.000 --> 0:13:01.280
<v Speaker 1>do lots of things at once. But what it should

0:13:01.280 --> 0:13:04.240
<v Speaker 1>be equally celebrated for, the thing that no one ever

0:13:04.320 --> 0:13:10.280
<v Speaker 1>bothers to highlight, is serialization. It takes lots of the

0:13:10.360 --> 0:13:13.840
<v Speaker 1>activity and it squeezes it down to one thing.

0:13:14.080 --> 0:13:15.559
<v Speaker 1>It serializes it.

0:13:15.559 --> 0:13:19.240
<v Speaker 1>It takes information that could be interpreted in lots

0:13:19.240 --> 0:13:22.160
<v Speaker 1>of different ways, and it crunches it down to a

0:13:22.320 --> 0:13:23.520
<v Speaker 1>single interpretation.
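
NOTE
The collapse to a single interpretation described here can be sketched in a few lines of Python. This is an illustrative toy, not a model from the episode: the function name `collapse`, the evidence scores, and the expectation weights are all hypothetical.
```python
def collapse(evidence, expectation):
    """Return the single interpretation with the highest combined score."""
    # Weight each candidate's sensory evidence by how strongly it is expected.
    scores = {name: evidence[name] * expectation.get(name, 1.0) for name in evidence}
    # Serialization step: keep one winner instead of a blend of candidates.
    return max(scores, key=scores.get)
# Hypothetical, perfectly ambiguous evidence for the two phrases in the toy clip.
evidence = {"brainstorm": 0.5, "green needle": 0.5}
heard_a = collapse(evidence, {"brainstorm": 2.0})
heard_b = collapse(evidence, {"green needle": 2.0})
```
With identical evidence, the winner flips between the two phrases purely because the expectation weights change; the function never returns a blend of both.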

0:13:24.800 --> 0:13:28.760
<v Speaker 1>Now, why is it so good at serializing, at

0:13:28.600 --> 0:13:33.800
<v Speaker 1>getting possibilities down to a single answer? Because, fundamentally, your

0:13:33.800 --> 0:13:38.240
<v Speaker 1>brain has the challenge of controlling a giant body made

0:13:38.280 --> 0:13:41.800
<v Speaker 1>of trillions of cells, and when you come to a

0:13:42.320 --> 0:13:45.880
<v Speaker 1>tree in the path, it has to go either left

0:13:45.960 --> 0:13:48.719
<v Speaker 1>or right around the tree. Because of the physics of

0:13:48.760 --> 0:13:51.040
<v Speaker 1>the world, it cannot do both, and

0:13:50.960 --> 0:13:53.360
<v Speaker 1>so it has to make a single

0:13:53.400 --> 0:13:57.520
<v Speaker 1>decision, go right or go left, and drag all those

0:13:57.559 --> 0:14:00.560
<v Speaker 1>trillions of cells with it. Your brain has to

0:14:00.640 --> 0:14:04.960
<v Speaker 1>be good at taking possibilities and crushing them down to

0:14:05.080 --> 0:14:10.880
<v Speaker 1>a single decision. And it's the same with your perceptual life.

0:14:11.360 --> 0:14:14.600
<v Speaker 1>Your brain is used to dealing with a world where

0:14:14.640 --> 0:14:18.120
<v Speaker 1>it has to come to conclusions, having to say, look,

0:14:18.160 --> 0:14:21.760
<v Speaker 1>there are lots of possibilities here, but for me to

0:14:21.840 --> 0:14:24.520
<v Speaker 1>function in the world, I have to make an assumption

0:14:25.000 --> 0:14:27.600
<v Speaker 1>that what I am looking at is a piece of

0:14:27.640 --> 0:14:30.960
<v Speaker 1>food or a boulder, or a bear at a distance

0:14:31.120 --> 0:14:36.000
<v Speaker 1>or whatever. So the brain doesn't tolerate ambiguity, but it

0:14:36.080 --> 0:14:40.480
<v Speaker 1>always says, all right, this is my answer. Okay, so

0:14:40.680 --> 0:14:45.120
<v Speaker 1>now let's introduce one more perceptual illusion of this flavor,

0:14:45.600 --> 0:14:48.000
<v Speaker 1>and then we're going to unpack what's going on.

0:14:49.480 --> 0:14:51.080
<v Speaker 1>So surely you've seen this one before.

0:14:51.200 --> 0:14:54.080
<v Speaker 1>You draw the outline of a cube on a piece

0:14:54.120 --> 0:14:57.240
<v Speaker 1>of paper. You just draw a square, and then an

0:14:57.320 --> 0:15:00.720
<v Speaker 1>offset square, and then lines connecting the corners of one

0:15:00.760 --> 0:15:03.320
<v Speaker 1>to the corners of the other, so it's twelve lines.

0:15:03.400 --> 0:15:07.400
<v Speaker 1>It's the outline of a cube. This little wireframe drawing

0:15:07.600 --> 0:15:09.920
<v Speaker 1>is known as the Necker cube.

0:15:10.320 --> 0:15:13.479
<v Speaker 1>Now you've seen this before, but as you know, if you've

0:15:13.280 --> 0:15:18.080
<v Speaker 1>stared at one, it's perceptually ambiguous because if you stare

0:15:18.120 --> 0:15:21.320
<v Speaker 1>at this little wireframe, it looks like it's coming out

0:15:21.400 --> 0:15:25.400
<v Speaker 1>one way from the page, even though you could perceive

0:15:25.080 --> 0:15:27.400
<v Speaker 1>the same drawing in two different ways.

0:15:27.720 --> 0:15:30.600
<v Speaker 1>Either the lower square is the face of the cube

0:15:30.640 --> 0:15:33.720
<v Speaker 1>coming toward you, or the upper square is the one

0:15:33.760 --> 0:15:38.280
<v Speaker 1>coming out toward you, but your brain makes a choice. Now,

0:15:38.320 --> 0:15:42.360
<v Speaker 1>you could imagine a space alien who looks at this

0:15:42.400 --> 0:15:45.600
<v Speaker 1>little drawing of the wireframe cube and says, okay, well,

0:15:46.040 --> 0:15:50.240
<v Speaker 1>both configurations of the cube are equally probable, so I'll

0:15:50.240 --> 0:15:53.640
<v Speaker 1>see it both ways at once. But we can't do that.

0:15:54.200 --> 0:15:56.960
<v Speaker 1>We have to see it one way or the other.

0:15:57.120 --> 0:16:01.800
<v Speaker 1>Your brain forces a single interpretation, and this is the

0:16:01.840 --> 0:16:06.280
<v Speaker 1>same thing that's happening with the other illusions. With the dress,

0:16:06.400 --> 0:16:09.640
<v Speaker 1>you don't see it as both blue and black and

0:16:09.880 --> 0:16:12.520
<v Speaker 1>white and gold. And in a minute we'll see why.

0:16:13.160 --> 0:16:14.680
<v Speaker 1>The part I just want to say now is that

0:16:14.720 --> 0:16:17.960
<v Speaker 1>your brain concludes that it is one or the other,

0:16:18.080 --> 0:16:22.440
<v Speaker 1>and then it sticks with that. And likewise with Yanny Laurel.

0:16:23.080 --> 0:16:26.720
<v Speaker 1>Both sounds are present in the audio file, but you

0:16:26.840 --> 0:16:30.640
<v Speaker 1>don't hear Yanny and Laurel at the same time, stacked

0:16:30.680 --> 0:16:33.800
<v Speaker 1>on one another. And it's exactly the same thing with

0:16:33.960 --> 0:16:39.440
<v Speaker 1>brainstorm and green needle. Both interpretations are possible, but your

0:16:39.480 --> 0:16:43.920
<v Speaker 1>brain won't do both at once. It collapses the possibilities

0:16:43.960 --> 0:16:48.600
<v Speaker 1>to a single answer. In all these cases, even though

0:16:48.640 --> 0:16:52.160
<v Speaker 1>the data is consistent with either interpretation, your brain makes

0:16:52.160 --> 0:16:55.280
<v Speaker 1>a call. It goes left or right around the tree.

0:16:55.560 --> 0:16:58.800
<v Speaker 1>You very clearly perceive one or the other. And this

0:16:58.920 --> 0:17:03.040
<v Speaker 1>is because the brain isn't passively receiving the world. It's

0:17:03.200 --> 0:17:25.000
<v Speaker 1>making choices. Okay, but how does your brain know how

0:17:25.040 --> 0:17:30.119
<v Speaker 1>to collapse ambiguous data to a single interpretation? It does

0:17:30.160 --> 0:17:35.760
<v Speaker 1>so by leveraging assumptions. So let's go a level deeper

0:17:35.880 --> 0:17:38.920
<v Speaker 1>with the dress. Why does it happen that some people

0:17:38.960 --> 0:17:41.359
<v Speaker 1>see it one way and some people the other. It

0:17:41.520 --> 0:17:44.560
<v Speaker 1>happens because your brain sees a picture of a dress

0:17:44.560 --> 0:17:50.120
<v Speaker 1>in the shop and it makes dozens of assumptions totally unconsciously. Now,

0:17:50.160 --> 0:17:54.600
<v Speaker 1>what's amazing is that the assumptions aren't directly about the dress,

0:17:55.240 --> 0:17:58.760
<v Speaker 1>but about things you didn't even know you were thinking about.

0:17:58.800 --> 0:18:03.280
<v Speaker 1>What is the light source in the photograph? Is the

0:18:03.440 --> 0:18:07.840
<v Speaker 1>dress mostly being lit by fluorescent lights or by sunlight?

0:18:08.760 --> 0:18:12.240
<v Speaker 1>Is the dress facing a window or is the window

0:18:12.280 --> 0:18:15.840
<v Speaker 1>behind it? What time of day is it, what season

0:18:16.000 --> 0:18:20.320
<v Speaker 1>is it? Your brain is considering all of these questions,

0:18:20.840 --> 0:18:23.720
<v Speaker 1>and fundamentally, this all has to do with a computation

0:18:23.840 --> 0:18:30.080
<v Speaker 1>that it does known as color constancy. Color constancy is

0:18:30.160 --> 0:18:34.960
<v Speaker 1>this sophisticated ability of our visual systems to perceive the

0:18:35.000 --> 0:18:39.000
<v Speaker 1>color of something as constant even when the light source,

0:18:39.040 --> 0:18:43.119
<v Speaker 1>the illumination, changes. So let's say I'm wearing a white

0:18:43.280 --> 0:18:46.160
<v Speaker 1>T-shirt and we're standing outside talking in the sunlight.

0:18:46.560 --> 0:18:50.120
<v Speaker 1>You will see my shirt as white. Now we go

0:18:50.280 --> 0:18:54.479
<v Speaker 1>indoors into the coffee shop and the illuminant changes. In

0:18:54.480 --> 0:18:57.720
<v Speaker 1>other words, the light that's bouncing off my T-shirt changes.

0:18:58.359 --> 0:19:03.280
<v Speaker 1>Now it's fluorescent light compared to sunlight. The fluorescent light

0:19:03.359 --> 0:19:06.679
<v Speaker 1>has a different spectrum of colors coming out, and so

0:19:06.720 --> 0:19:09.760
<v Speaker 1>when those bounce off my shirt, you have a different

0:19:09.960 --> 0:19:14.600
<v Speaker 1>spectrum of colors hitting your eyes, and yet you still

0:19:14.600 --> 0:19:17.800
<v Speaker 1>see it as white. And then that night we go

0:19:17.920 --> 0:19:21.680
<v Speaker 1>into a dance club and the lighting is blue, and

0:19:21.760 --> 0:19:25.280
<v Speaker 1>yet you have no problem seeing the shirt as white,

0:19:25.600 --> 0:19:28.960
<v Speaker 1>even though it's mostly blue light reflecting off the shirt

0:19:29.240 --> 0:19:32.520
<v Speaker 1>into your eyes. And then afterwards we go sit by

0:19:32.640 --> 0:19:37.399
<v Speaker 1>a campfire and my shirt still looks white. Your brain

0:19:37.520 --> 0:19:41.359
<v Speaker 1>retains a constant perception of the color of the shirt

0:19:41.800 --> 0:19:45.240
<v Speaker 1>even though the wavelengths bouncing off of it are very different.

0:19:46.320 --> 0:19:50.159
<v Speaker 1>So what does this tell us? Well, it means that

0:19:50.240 --> 0:19:53.560
<v Speaker 1>the way your brain determines the color is not just

0:19:53.680 --> 0:19:56.560
<v Speaker 1>about the colors hitting your eye from the shirt. It

0:19:56.600 --> 0:19:59.720
<v Speaker 1>has to do with something else. And that something else

0:20:00.119 --> 0:20:04.560
<v Speaker 1>is everything else in the scene. So when you're looking

0:20:04.600 --> 0:20:08.480
<v Speaker 1>at my shirt, your eyes are drinking in everything else.

0:20:09.119 --> 0:20:12.840
<v Speaker 1>The background, the color of the skin on my arms,

0:20:12.880 --> 0:20:17.000
<v Speaker 1>the color of the floors and walls, the colors of

0:20:17.359 --> 0:20:18.120
<v Speaker 1>all the other

0:20:18.359 --> 0:20:20.680
<v Speaker 1>jeans and shirts and signs in the

0:20:20.640 --> 0:20:24.720
<v Speaker 1>whole scene, and it uses all of that to estimate

0:20:24.800 --> 0:20:30.160
<v Speaker 1>the background illumination and then make the right computation about

0:20:30.200 --> 0:20:34.199
<v Speaker 1>the color of the shirt in the sunlight and the

0:20:34.280 --> 0:20:36.959
<v Speaker 1>coffee shop, at the dance club, at the campfire.

0:20:37.359 --> 0:20:38.520
<v Speaker 1>It's doing all of

0:20:38.480 --> 0:20:43.080
<v Speaker 1>these computations, and this is what allows it to subtract

0:20:43.320 --> 0:20:46.840
<v Speaker 1>off the background lighting so that it can see what

0:20:47.040 --> 0:20:52.040
<v Speaker 1>color things are most likely to actually be. That's the

0:20:52.080 --> 0:20:56.760
<v Speaker 1>phenomenon of color constancy. The color of the shirt remains

0:20:56.840 --> 0:21:01.920
<v Speaker 1>constant even under different illumination, and that's what allows us

0:21:01.920 --> 0:21:05.720
<v Speaker 1>to see the colors of objects in the world consistently,

0:21:05.760 --> 0:21:10.120
<v Speaker 1>whether we're looking under sunlight or moonlight or firelight or whatever.
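
NOTE
One way to picture the computation described above is the gray-world heuristic from computer vision, swapped in here as a stand-in: it assumes the average of the whole scene reflects the color of the light, then rescales each channel to cancel it (a division, where the episode speaks loosely of subtracting). The pixel values and function names are hypothetical, and nothing here claims that the visual system works this way.
```python
def estimate_illuminant(scene):
    """Average each RGB channel over the whole scene (the gray-world assumption)."""
    n = len(scene)
    return tuple(sum(pixel[c] for pixel in scene) / n for c in range(3))
def discount_illuminant(pixel, illuminant):
    """Rescale a pixel so the estimated illuminant maps to neutral gray."""
    mean = sum(illuminant) / 3
    return tuple(round(pixel[c] * mean / illuminant[c]) for c in range(3))
# Hypothetical scene under bluish club lighting: every surface is shifted toward blue.
scene = [(100, 100, 200), (50, 50, 100), (80, 80, 160), (90, 90, 180)]
shirt = (100, 100, 200)
illuminant = estimate_illuminant(scene)
corrected = discount_illuminant(shirt, illuminant)
```
Because every surface in this made-up scene carries the same blue cast, dividing it out leaves the shirt with equal red, green, and blue values, which is a neutral color rather than a blue one.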

0:21:11.000 --> 0:21:14.480
<v Speaker 1>So the first lesson is you're not just seeing what's

0:21:14.600 --> 0:21:19.359
<v Speaker 1>out there. Your brain is actively interpreting information and serving

0:21:19.480 --> 0:21:22.040
<v Speaker 1>up a story to you. And I'll go into this

0:21:22.080 --> 0:21:24.399
<v Speaker 1>more in a future episode. But this is why we

0:21:24.440 --> 0:21:29.600
<v Speaker 1>can see strawberries as red. For example, when we change

0:21:29.600 --> 0:21:32.920
<v Speaker 1>the background color such that the actual light bouncing off

0:21:32.920 --> 0:21:37.440
<v Speaker 1>the strawberries is gray light, your brain can nonetheless say, okay, well,

0:21:37.640 --> 0:21:40.880
<v Speaker 1>given that everything else in the scene is now greenish,

0:21:41.400 --> 0:21:44.160
<v Speaker 1>I can subtract that off and know that I'm looking

0:21:44.200 --> 0:21:48.080
<v Speaker 1>at something red. Now, in order to do all of

0:21:48.119 --> 0:21:50.960
<v Speaker 1>this that I've been talking about, your brain has to

0:21:51.040 --> 0:21:55.439
<v Speaker 1>make lots of assumptions about what the color should be,

0:21:56.200 --> 0:22:01.239
<v Speaker 1>and different brains do it differently. With the dress, you

0:22:01.320 --> 0:22:05.679
<v Speaker 1>see it as either white and gold or blue and black,

0:22:06.240 --> 0:22:10.680
<v Speaker 1>depending on the assumptions your brain is making. When you

0:22:10.840 --> 0:22:14.000
<v Speaker 1>glance at the photo on your phone, you have no

0:22:14.119 --> 0:22:17.720
<v Speaker 1>idea that your brain is doing all those sophisticated computations

0:22:17.840 --> 0:22:21.359
<v Speaker 1>under the hood to tell you what is the actual

0:22:21.440 --> 0:22:25.760
<v Speaker 1>color of this garment, given my assumptions about all the

0:22:25.880 --> 0:22:27.040
<v Speaker 1>lighting details.

0:22:27.760 --> 0:22:29.240
<v Speaker 2>The issue is that your.

0:22:29.000 --> 0:22:32.520
<v Speaker 1>Brain grew up in a particular environment, maybe with a

0:22:32.560 --> 0:22:35.439
<v Speaker 1>lot of snow or a lot of sunlight or fog,

0:22:36.119 --> 0:22:39.359
<v Speaker 1>and your brain makes assumptions about the time of day

0:22:39.720 --> 0:22:43.199
<v Speaker 1>and the season and the balance of artificial lighting to

0:22:43.320 --> 0:22:47.720
<v Speaker 1>natural lighting. To make sense of this little photo, what

0:22:48.040 --> 0:22:52.199
<v Speaker 1>hues does the lighting contain? If your brain ignores a

0:22:52.200 --> 0:22:54.920
<v Speaker 1>bit of the blue side, you'll see the dress as

0:22:55.040 --> 0:22:58.159
<v Speaker 1>white and gold. If your brain pays less attention to

0:22:58.200 --> 0:23:00.840
<v Speaker 1>the yellow side of the spectrum, you'll see it as

0:23:01.040 --> 0:23:05.000
<v Speaker 1>blue and black. You have no insight into the fact

0:23:05.240 --> 0:23:08.560
<v Speaker 1>that your brain is making all these assumptions under the hood.

0:23:09.280 --> 0:23:11.919
<v Speaker 1>Was the photo of the dress taken with the window

0:23:11.960 --> 0:23:13.120
<v Speaker 1>facing it or behind it?

0:23:13.200 --> 0:23:15.080
<v Speaker 2>Was it morning light or afternoon light?

0:23:15.760 --> 0:23:18.119
<v Speaker 1>And is your experience of the world based on the

0:23:18.160 --> 0:23:21.919
<v Speaker 1>fact that you are a morning lark or you are.

0:23:21.800 --> 0:23:22.800
<v Speaker 2>A night owl?

0:23:23.080 --> 0:23:26.600
<v Speaker 1>One of my colleagues, Pascal Wallisch, showed that people who

0:23:26.640 --> 0:23:30.119
<v Speaker 1>were early risers were more likely to think that the

0:23:30.200 --> 0:23:33.600
<v Speaker 1>dress was lit by natural light, and so they saw

0:23:33.640 --> 0:23:37.840
<v Speaker 1>it as white and gold, but night owls presumably had

0:23:37.880 --> 0:23:41.720
<v Speaker 1>more assumptions about artificial lighting, and they were more likely

0:23:41.760 --> 0:23:45.399
<v Speaker 1>to see the dress as blue and black. Your brain

0:23:45.560 --> 0:23:48.159
<v Speaker 1>is determining the color of the dress by comparing it

0:23:48.280 --> 0:23:51.240
<v Speaker 1>against the other objects of the background of the photo

0:23:51.640 --> 0:23:56.520
<v Speaker 1>and making its best guess at all these parameters. So

0:23:56.920 --> 0:24:00.320
<v Speaker 1>your brain relies on the answers to questions that

0:24:00.359 --> 0:24:03.439
<v Speaker 1>you didn't even think it was asking and the idea

0:24:03.600 --> 0:24:07.600
<v Speaker 1>of imposing assumptions. This is the same with Yanny and

0:24:07.680 --> 0:24:11.840
<v Speaker 1>Laurel in the auditory domain, or with green needle and brainstorm.

0:24:12.040 --> 0:24:16.840
<v Speaker 1>Your brain is imposing an interpretation. But what's interesting in

0:24:16.880 --> 0:24:20.640
<v Speaker 1>this case is that the assumption can be changed more easily,

0:24:21.000 --> 0:24:26.280
<v Speaker 1>typically by just staring at the word visually. Because your

0:24:26.400 --> 0:24:31.280
<v Speaker 1>brain is trying to disambiguate what it's hearing, and suddenly

0:24:31.520 --> 0:24:34.400
<v Speaker 1>it has lots of help from the visual system because

0:24:34.600 --> 0:24:39.800
<v Speaker 1>it sees a word. So the frequencies of both words

0:24:39.880 --> 0:24:43.800
<v Speaker 1>Yanny and Laurel or green needle, brainstorm, they're contained in

0:24:43.920 --> 0:24:48.240
<v Speaker 1>the audio file, so just depending on how you listen

0:24:48.359 --> 0:24:52.560
<v Speaker 1>for it, you can hear one or the other. So

0:24:52.600 --> 0:24:57.120
<v Speaker 1>the brain constantly nails down its world by making assumptions,

0:24:57.640 --> 0:25:00.640
<v Speaker 1>and we see this with everything. And even though these

0:25:01.200 --> 0:25:04.199
<v Speaker 1>internet memes get all of our attention, the fact is

0:25:04.600 --> 0:25:07.560
<v Speaker 1>that our brains have to make assumptions all the time.

0:25:07.880 --> 0:25:10.840
<v Speaker 1>And this is because most of the inputs from the

0:25:10.880 --> 0:25:16.160
<v Speaker 1>world are quite noisy. For example, you can still understand

0:25:16.280 --> 0:25:20.120
<v Speaker 1>me even if my speech is choppy, or if I'm

0:25:20.160 --> 0:25:23.399
<v Speaker 1>speaking and there's lots of background noise like at a restaurant.

0:25:24.160 --> 0:25:27.200
<v Speaker 1>What's actually hitting your ears in these scenarios is a

0:25:27.320 --> 0:25:31.960
<v Speaker 1>very messy signal, but the brain imposes an interpretation about

0:25:32.200 --> 0:25:34.920
<v Speaker 1>what must have been said, and that's what you perceive,

0:25:35.440 --> 0:25:38.800
<v Speaker 1>what you believe you heard. A lot of your cell

0:25:38.880 --> 0:25:43.320
<v Speaker 1>phone conversations are super noisy, but you typically don't realize

0:25:43.320 --> 0:25:48.919
<v Speaker 1>it because you keep making your reasonable interpretations. Now, this

0:25:49.040 --> 0:25:52.880
<v Speaker 1>is true of most of what is hitting your eyes

0:25:52.920 --> 0:25:56.399
<v Speaker 1>and ears. You catch only a fraction of the data,

0:25:56.800 --> 0:25:59.439
<v Speaker 1>but your brain fills in the details to put together

0:25:59.480 --> 0:26:01.959
<v Speaker 1>a story. And this, by the way, is what's at

0:26:02.000 --> 0:26:04.680
<v Speaker 1>the heart of a lot of art and graphic design.

0:26:05.040 --> 0:26:08.000
<v Speaker 1>You just see a few curves and you interpret it

0:26:08.080 --> 0:26:11.720
<v Speaker 1>as a face, or a series of segmented lines and

0:26:11.800 --> 0:26:16.440
<v Speaker 1>you interpret that as a body. We are always operating

0:26:16.480 --> 0:26:19.919
<v Speaker 1>off thin data, but that doesn't stop us from coming

0:26:19.960 --> 0:26:24.440
<v Speaker 1>to clear conclusions. And before I explain how our neural

0:26:24.480 --> 0:26:28.040
<v Speaker 1>networks go about making these assumptions, let's just take a

0:26:28.080 --> 0:26:31.919
<v Speaker 1>second to look at how your brain is so imperfect

0:26:32.040 --> 0:26:36.919
<v Speaker 1>at this. Take pareidolia, which is when you perceive a

0:26:37.119 --> 0:26:41.160
<v Speaker 1>meaningful pattern where none exists, like when you look at

0:26:41.200 --> 0:26:44.760
<v Speaker 1>an electrical outlet and you see a face made up

0:26:44.760 --> 0:26:48.080
<v Speaker 1>of little eyes and a sort of surprised mouth. You

0:26:48.119 --> 0:26:51.760
<v Speaker 1>can't help but see that. Your brain imposes that interpretation

0:26:51.880 --> 0:26:54.920
<v Speaker 1>on it. Or you see a face in the clouds,

0:26:55.359 --> 0:26:58.840
<v Speaker 1>or someone sees the face of their local deity in

0:26:58.920 --> 0:27:02.960
<v Speaker 1>a piece of toast. Why does this happen? Well, your

0:27:03.000 --> 0:27:06.199
<v Speaker 1>brain is really wired up to see faces, and so

0:27:06.359 --> 0:27:10.960
<v Speaker 1>it triggers that interpretation whenever it sees three blobs in

0:27:11.040 --> 0:27:15.560
<v Speaker 1>the approximately right configuration, and the same thing can happen

0:27:15.640 --> 0:27:18.600
<v Speaker 1>with sounds, like when there's some weird sound and your

0:27:18.600 --> 0:27:22.680
<v Speaker 1>brain thinks it's a person shouting or someone calling your

0:27:22.760 --> 0:27:26.600
<v Speaker 1>name or whatever. This is the brain working to make

0:27:26.760 --> 0:27:29.560
<v Speaker 1>sense of the world around it. All it ever does

0:27:30.160 --> 0:27:34.800
<v Speaker 1>is look for meaning from data in the world. In fact,

0:27:35.640 --> 0:27:39.440
<v Speaker 1>typically the brain will try to impose an interpretation even

0:27:39.480 --> 0:27:44.080
<v Speaker 1>if you have random noise. That's the idea with Rorschach

0:27:44.200 --> 0:27:47.560
<v Speaker 1>ink blots. You have these blobs on a page, and

0:27:47.640 --> 0:27:51.760
<v Speaker 1>your brain reaches for some way of explaining them. Oh,

0:27:51.800 --> 0:27:54.800
<v Speaker 1>that looks like a rabbit or an airplane, or an

0:27:54.800 --> 0:27:58.640
<v Speaker 1>emperor on a throne or whatever. And generally a lot

0:27:58.680 --> 0:28:03.800
<v Speaker 1>of life involves forcing patterns on random noise. Here's an

0:28:03.880 --> 0:28:07.520
<v Speaker 1>auditory example from my colleague Diana Deutsch, who has spent

0:28:07.600 --> 0:28:12.480
<v Speaker 1>her career pioneering auditory illusions. So here's an experiment where

0:28:12.520 --> 0:28:16.639
<v Speaker 1>she plays mixed up audio that doesn't say anything, but

0:28:17.080 --> 0:28:21.320
<v Speaker 1>it sounds like speech, and people will generally impose the

0:28:21.400 --> 0:28:23.880
<v Speaker 1>interpretation of words on these.

0:28:57.280 --> 0:29:01.600
<v Speaker 4>Come come, come, come, come, come, come, come, come come.

0:29:33.000 --> 0:29:38.160
<v Speaker 1>This is essentially the sound version of Rorschach blots. Different

0:29:38.280 --> 0:29:41.520
<v Speaker 1>people will generally hear different things, and it seems to

0:29:41.560 --> 0:29:44.520
<v Speaker 1>be related to what they are thinking about or what's

0:29:44.560 --> 0:29:47.640
<v Speaker 1>on their mind. So this all reminds us of the

0:29:47.880 --> 0:29:52.720
<v Speaker 1>power of the brain to impose meaning. Just think about

0:29:52.760 --> 0:29:56.360
<v Speaker 1>the situation when you're expecting a friend and you're looking

0:29:56.440 --> 0:30:00.240
<v Speaker 1>around for him at a crowded mall. Everyone looks like him.

0:30:00.240 --> 0:30:02.560
<v Speaker 1>For just a fraction of a second. You look at

0:30:02.600 --> 0:30:05.160
<v Speaker 1>someone's face and you think, oh, that's him, and then

0:30:05.560 --> 0:30:09.560
<v Speaker 1>five hundred milliseconds later, your visual system takes in more

0:30:09.640 --> 0:30:13.480
<v Speaker 1>information and decides, oh, never mind, false alarm. And then

0:30:13.520 --> 0:30:16.880
<v Speaker 1>we typically forget that we even thought that. But we

0:30:17.040 --> 0:30:20.240
<v Speaker 1>are expecting to see our friend, and so our brains

0:30:20.720 --> 0:30:26.240
<v Speaker 1>impose that expectation on lots of faces. Okay, so we've

0:30:26.400 --> 0:30:31.480
<v Speaker 1>established that brains take ambiguous signals and squish them down

0:30:31.520 --> 0:30:35.800
<v Speaker 1>to a single interpretation by use of assumptions. And that's

0:30:35.840 --> 0:30:38.520
<v Speaker 1>why we see the dress as one color or the other,

0:30:38.960 --> 0:30:42.520
<v Speaker 1>or we hear brainstorm or a green needle, but not both.

0:30:43.760 --> 0:30:46.520
<v Speaker 1>But how do our brains actually make their choice? How

0:30:46.560 --> 0:30:50.800
<v Speaker 1>do they pull this off? Neurally speaking, they do it

0:30:51.000 --> 0:30:57.360
<v Speaker 1>by combining bottom up information with top down information. Now,

0:30:57.760 --> 0:31:01.720
<v Speaker 1>bottom up means information from the outside, from the world.

0:31:02.080 --> 0:31:04.360
<v Speaker 1>What are the air compression waves hitting my ear

0:31:04.440 --> 0:31:06.560
<v Speaker 1>drums or what are the photons hitting my retina?

0:31:07.080 --> 0:31:09.000
<v Speaker 2>Those are the signals that I am receiving.

0:31:10.000 --> 0:31:12.800
<v Speaker 1>But we don't interpret those bottom up signals at face

0:31:12.960 --> 0:31:18.120
<v Speaker 1>value because they're usually not sufficient. Instead, your brain melds

0:31:18.200 --> 0:31:23.880
<v Speaker 1>this with top down information, which means your expectations what

0:31:24.040 --> 0:31:26.600
<v Speaker 1>we think is likely to be true in the outside

0:31:26.680 --> 0:31:30.640
<v Speaker 1>world given our experience with it, and it's only in

0:31:30.920 --> 0:31:36.240
<v Speaker 1>combination the data plus our expectations that we see anything

0:31:36.400 --> 0:31:40.520
<v Speaker 1>in the world. And the surprise, I think the counterintuitive

0:31:40.600 --> 0:31:45.000
<v Speaker 1>part is that your prior assumptions, your expectations, the.

0:31:45.080 --> 0:31:45.840
<v Speaker 2>Top down part.

0:31:46.240 --> 0:31:50.680
<v Speaker 1>This is the overwhelming majority of what determines what you see.

0:31:51.880 --> 0:31:54.880
<v Speaker 1>For example, it seems like you just open your eyes

0:31:54.920 --> 0:31:57.400
<v Speaker 1>and there's the world, but in fact, when you look

0:31:57.520 --> 0:32:00.720
<v Speaker 1>at the visual cortex at the back of the head, which

0:32:00.800 --> 0:32:03.200
<v Speaker 1>is the place that receives the information from the eyes,

0:32:04.040 --> 0:32:07.400
<v Speaker 1>you find that only five percent of the input there

0:32:07.880 --> 0:32:09.280
<v Speaker 1>is coming from the eyes.

0:32:09.360 --> 0:32:12.280
<v Speaker 2>And the rest is all feedback activity.

0:32:12.640 --> 0:32:15.400
<v Speaker 1>In other words, ninety five percent of the data is

0:32:15.480 --> 0:32:19.120
<v Speaker 1>coming from higher levels of the visual system and other

0:32:19.400 --> 0:32:23.480
<v Speaker 1>areas of the brain. In fact, what is so crazy

0:32:23.680 --> 0:32:27.320
<v Speaker 1>is that you don't even need your eyes to have full,

0:32:27.960 --> 0:32:32.200
<v Speaker 1>rich visual experience. You can have this with your eyes closed.

0:32:32.680 --> 0:32:37.040
<v Speaker 1>And this is what we call dreams. And what's happening

0:32:37.080 --> 0:32:40.640
<v Speaker 1>here is that this is all internally generated activity and

0:32:40.720 --> 0:32:43.000
<v Speaker 1>none of it's entering through the eyes when you're asleep,

0:32:43.480 --> 0:32:46.440
<v Speaker 1>and it's not that much different from your normal vision.

0:32:47.120 --> 0:32:49.640
<v Speaker 1>So your perception of the world when you're walking around

0:32:49.800 --> 0:32:52.520
<v Speaker 1>is something like an awake dream.

0:32:53.600 --> 0:32:53.719
<v Speaker 5>Now.

0:32:53.760 --> 0:32:56.160
<v Speaker 1>I'm going to return to this issue in future episodes,

0:32:56.600 --> 0:32:59.000
<v Speaker 1>but what we want to concentrate on right now is

0:32:59.080 --> 0:33:02.320
<v Speaker 1>that your eyes are not simply a camera and your

0:33:02.480 --> 0:33:05.640
<v Speaker 1>ears are not simply a microphone. For those of you

0:33:05.720 --> 0:33:08.000
<v Speaker 1>who have been listening for a while to this podcast,

0:33:08.440 --> 0:33:12.040
<v Speaker 1>you know this is a major theme. Your brain is

0:33:12.200 --> 0:33:16.040
<v Speaker 1>locked in silence and darkness and needs to make assumptions

0:33:16.360 --> 0:33:20.800
<v Speaker 1>based on very thin data. So when I ask you

0:33:20.920 --> 0:33:24.880
<v Speaker 1>to think about the words green needle, that is top

0:33:24.960 --> 0:33:29.840
<v Speaker 1>down information that shapes how you interpret the bottom up data.

0:33:37.120 --> 0:33:40.440
<v Speaker 1>In contrast, imagine that you stare at the word brainstorm

0:33:40.640 --> 0:33:44.280
<v Speaker 1>while listening. You lock that in as your top down expectation,

0:33:44.440 --> 0:33:46.760
<v Speaker 1>and then that shapes your bottom up data.

0:33:46.840 --> 0:33:47.760
<v Speaker 2>And that's what you hear.

0:33:55.360 --> 0:34:00.200
<v Speaker 1>Even though both interpretations are available, your brain surfaces

0:34:00.280 --> 0:34:04.280
<v Speaker 1>those features out of the landscape of data that match

0:34:04.440 --> 0:34:08.920
<v Speaker 1>what you're looking for. In other words, your expectations. What

0:34:09.120 --> 0:34:13.839
<v Speaker 1>you listen for is what you hear. And by the way,

0:34:13.920 --> 0:34:16.719
<v Speaker 1>all this is related to why lip reading works. When

0:34:16.760 --> 0:34:20.000
<v Speaker 1>you're in a noisy environment, you watch somebody's mouth while

0:34:20.000 --> 0:34:23.239
<v Speaker 1>they're talking, and in this way you combine a bit

0:34:23.400 --> 0:34:27.120
<v Speaker 1>of noisy auditory data with a bit of noisy visual

0:34:27.680 --> 0:34:30.560
<v Speaker 1>and that sharpens your guess for what was just said.
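
NOTE
Editor's sketch, not from the episode: combining a noisy auditory estimate with a noisy visual estimate is often modeled as a reliability-weighted average, where each cue is weighted by the inverse of its variance. This assumes both cues are independent and Gaussian. The function name and the numbers are hypothetical.
```python
# Inverse-variance cue combination (a standard textbook model, assumed here):
# the less noisy cue gets the larger weight, and the combined estimate is
# less variable than either cue on its own.
def combine_cues(mean_a, var_a, mean_v, var_v):
    precision_a = 1 / var_a
    precision_v = 1 / var_v
    weight_a = precision_a / (precision_a + precision_v)
    mean = weight_a * mean_a + (1 - weight_a) * mean_v
    var = 1 / (precision_a + precision_v)
    return mean, var
# Hypothetical values: hearing is noisy (variance 4), lip reading is sharper (variance 1).
mean, var = combine_cues(2.0, 4.0, 3.0, 1.0)
print(mean, var)
```
The combined variance comes out smaller than either input variance, which is one way to express "sharpening your guess" numerically.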

0:34:31.320 --> 0:34:35.440
<v Speaker 1>During the pandemic, a lot of conversations went misunderstood because

0:34:35.800 --> 0:34:39.120
<v Speaker 1>people were wearing masks and lip reading went out the window.

0:34:40.280 --> 0:34:44.840
<v Speaker 1>Now amazingly, this top down information is so important that

0:34:45.000 --> 0:34:48.279
<v Speaker 1>sometimes you can set up a picture where you don't

0:34:48.360 --> 0:34:52.480
<v Speaker 1>have any real prior assumptions and there's not enough information

0:34:52.640 --> 0:34:55.680
<v Speaker 1>in the picture to see anything. And only when I

0:34:55.800 --> 0:35:00.360
<v Speaker 1>tell you some interpretation does the bottom up information

0:35:00.360 --> 0:35:01.319
<v Speaker 1>make any sense at all.

0:35:01.880 --> 0:35:03.120
<v Speaker 2>You can only see.

0:35:03.160 --> 0:35:06.480
<v Speaker 1>What's in front of you if you're given top down direction.

0:35:07.520 --> 0:35:10.880
<v Speaker 1>For example, I've put a cool picture on my website

0:35:10.920 --> 0:35:14.440
<v Speaker 1>at eagleman dot com slash podcast. Take a look at

0:35:14.480 --> 0:35:17.480
<v Speaker 1>this field of black and white blobs and see what

0:35:17.600 --> 0:35:20.520
<v Speaker 1>it looks like to you, And presumably it looks really

0:35:20.680 --> 0:35:23.200
<v Speaker 1>like nothing at all, just a bunch of blobs. But

0:35:23.320 --> 0:35:25.600
<v Speaker 1>if I tell you what it is while you stare

0:35:25.600 --> 0:35:29.600
<v Speaker 1>at it, then you suddenly see it. It seems immediately

0:35:29.760 --> 0:35:32.840
<v Speaker 1>obvious and you cannot see anything other than that, And

0:35:32.920 --> 0:35:35.440
<v Speaker 1>the only thing that's changed is that you now have

0:35:36.080 --> 0:35:40.600
<v Speaker 1>a top down expectation about what you're seeing, and suddenly

0:35:40.760 --> 0:35:43.680
<v Speaker 1>all these blobs make clear sense. I'm not going to

0:35:43.719 --> 0:35:45.360
<v Speaker 1>tell you what the blobs are here, but if you

0:35:45.400 --> 0:35:47.719
<v Speaker 1>go to the website and scroll all the way to

0:35:47.840 --> 0:35:49.799
<v Speaker 1>the bottom of the page, I'll give you a hint

0:35:49.880 --> 0:35:53.240
<v Speaker 1>there so you can enjoy the experience of not knowing

0:35:53.719 --> 0:35:57.560
<v Speaker 1>and then knowing. And because this is a podcast, I'll

0:35:57.600 --> 0:36:01.360
<v Speaker 1>give you an auditory example of this, again from Diana Deutsch.

0:36:01.840 --> 0:36:04.400
<v Speaker 1>So I'm going to take a piece of music that

0:36:04.560 --> 0:36:08.279
<v Speaker 1>you know, but I'm going to shift each note up

0:36:08.520 --> 0:36:11.960
<v Speaker 1>or down an octave, so one note might be played

0:36:12.000 --> 0:36:14.320
<v Speaker 1>an octave higher and the next note might be played

0:36:14.320 --> 0:36:17.520
<v Speaker 1>an octave lower. And I want you to identify the

0:36:17.600 --> 0:36:28.279
<v Speaker 1>piece of music. It's definitely one that you know. Now

0:36:28.360 --> 0:36:32.319
<v Speaker 1>I assume that you couldn't identify that piece. Now I'm

0:36:32.360 --> 0:36:34.760
<v Speaker 1>going to play it for you without the notes shifted

0:36:34.880 --> 0:36:45.880
<v Speaker 1>up or down in octaves. Now that you know the tune,

0:36:46.000 --> 0:36:47.920
<v Speaker 1>I'm just going to play that first one again and

0:36:48.000 --> 0:36:51.560
<v Speaker 1>you should have little or no trouble hearing the correct melody.

0:36:59.760 --> 0:37:01.920
<v Speaker 2>The only difference between the first.

0:37:01.680 --> 0:37:04.080
<v Speaker 1>Time I played it and the last time is that

0:37:04.320 --> 0:37:08.120
<v Speaker 1>now you have a top down expectation, and so it

0:37:08.239 --> 0:37:13.359
<v Speaker 1>switches from random noise to a tune. And so these

0:37:13.400 --> 0:37:18.080
<v Speaker 1>are all examples in which top down expectations are critical.

0:37:18.160 --> 0:37:21.880
<v Speaker 1>Without them, you don't have any interpretation at all. And

0:37:22.080 --> 0:37:26.640
<v Speaker 1>once you build an expectation, then the data have meaning.

0:37:27.640 --> 0:37:29.360
<v Speaker 1>You need to be told what to see in the

0:37:29.440 --> 0:37:31.600
<v Speaker 1>picture or to hear in the tune to get it.

0:37:31.880 --> 0:37:35.680
<v Speaker 1>And the only difference between before and after is whether

0:37:35.760 --> 0:37:39.680
<v Speaker 1>you have something to match it to some top down expectation,

0:37:40.200 --> 0:37:44.960
<v Speaker 1>and as soon as you do, then you perceive. Now,

0:37:45.160 --> 0:37:47.680
<v Speaker 1>just to be clear, this doesn't mean you can impose

0:37:47.920 --> 0:37:52.680
<v Speaker 1>any top down interpretation. It has to match sufficiently well.

0:37:53.280 --> 0:37:56.960
<v Speaker 1>The thing about brainstorm green needle is that the bottom

0:37:57.040 --> 0:38:01.000
<v Speaker 1>up data can match either one of the top down expectations

0:38:01.080 --> 0:38:05.680
<v Speaker 1>for either word. You can hear green needle or brainstorm.

0:38:06.120 --> 0:38:10.600
<v Speaker 1>Because these are possible words that can roughly match the

0:38:10.680 --> 0:38:13.600
<v Speaker 1>bottom up stimulus with all of its noise, but you

0:38:13.760 --> 0:38:17.920
<v Speaker 1>can't hear something totally different like blue reader or my

0:38:18.239 --> 0:38:22.320
<v Speaker 1>penguin because you can't make a good enough match between

0:38:22.800 --> 0:38:26.440
<v Speaker 1>data and expectation. So there has to be a sufficient

0:38:26.560 --> 0:38:30.120
<v Speaker 1>match between the top down and the bottom up for

0:38:30.320 --> 0:38:34.040
<v Speaker 1>perception to happen. Okay, so let's come back.

0:38:33.880 --> 0:38:36.799
<v Speaker 2>To this issue about the assumptions that we make. How

0:38:36.920 --> 0:38:39.880
<v Speaker 2>do we know what to assume about the world.

0:38:40.480 --> 0:38:46.920
<v Speaker 1>Well, this relies almost entirely on our prior experience. For example,

0:38:46.960 --> 0:38:50.680
<v Speaker 1>when you're judging depth, like how far different things are

0:38:50.760 --> 0:38:54.000
<v Speaker 1>from you, which is again a totally unconscious process, you

0:38:54.080 --> 0:38:57.000
<v Speaker 1>can do this by comparing the images from your two eyes,

0:38:57.440 --> 0:38:59.680
<v Speaker 1>but this is only useful out to about thirty meters.

0:39:00.120 --> 0:39:03.000
<v Speaker 1>So it turns out the brain has other ways to

0:39:03.160 --> 0:39:06.840
<v Speaker 1>determine depth, and one of the main ones simply pivots

0:39:06.920 --> 0:39:11.400
<v Speaker 1>on its experience with the world. The visual system builds

0:39:11.520 --> 0:39:16.759
<v Speaker 1>up prior expectations about the relative sizes of objects. So

0:39:16.840 --> 0:39:19.200
<v Speaker 1>if you're standing outside and you see a dog in

0:39:19.280 --> 0:39:21.880
<v Speaker 1>the distance, and it takes up about as much space

0:39:21.960 --> 0:39:25.480
<v Speaker 1>on your retina as the truck over there, you can

0:39:25.560 --> 0:39:28.640
<v Speaker 1>assume that the dog is closer and the truck is

0:39:28.880 --> 0:39:32.719
<v Speaker 1>farther away. Why because a close dog will look a

0:39:32.880 --> 0:39:35.680
<v Speaker 1>certain size and a far away truck will end up

0:39:35.719 --> 0:39:38.360
<v Speaker 1>looking about that same size, and so your brain is

0:39:38.400 --> 0:39:42.759
<v Speaker 1>able to instantly make the proper assumption about how far

0:39:42.920 --> 0:39:45.839
<v Speaker 1>away things are. And you might be wrong, by the way,

0:39:45.920 --> 0:39:48.439
<v Speaker 1>maybe it's a miniature model of a truck that's really

0:39:48.560 --> 0:39:51.960
<v Speaker 1>close and a monstrously huge dog that's really far away.

0:39:52.560 --> 0:39:55.880
<v Speaker 1>But most of the time your assumptions are fine. So

0:39:57.160 --> 0:40:00.759
<v Speaker 1>data doesn't just come in from the world and get seen. Instead,

0:40:01.320 --> 0:40:06.960
<v Speaker 1>your visual system capitalizes on prior expectations. And although this

0:40:07.160 --> 0:40:11.080
<v Speaker 1>idea isn't always intuitive, it's not a new idea. In

0:40:11.160 --> 0:40:15.919
<v Speaker 1>the nineteenth century, the German physician and physicist Hermann von

0:40:16.000 --> 0:40:18.759
<v Speaker 1>Helmholtz was one of the first people to entertain this

0:40:19.000 --> 0:40:23.880
<v Speaker 1>model of perception. He suspected that the small amounts of

0:40:23.960 --> 0:40:27.600
<v Speaker 1>information dribbling in through the eyes were just too slight

0:40:27.800 --> 0:40:32.200
<v Speaker 1>to account for the rich experience of vision. So he

0:40:32.320 --> 0:40:36.439
<v Speaker 1>deduced that the brain makes assumptions about the incoming data

0:40:36.640 --> 0:40:40.800
<v Speaker 1>based on previous experiences, and he correctly surmised that this

0:40:40.960 --> 0:40:44.080
<v Speaker 1>is how the brain can use its best guesses to

0:40:44.320 --> 0:40:48.920
<v Speaker 1>rapidly turn a little trickle of information into a full picture.

0:40:49.960 --> 0:40:51.360
<v Speaker 1>By the way, if you want to look this up

0:40:51.440 --> 0:40:56.040
<v Speaker 1>in more depth, look up Helmholtz's notion of unconscious inference.

0:40:56.480 --> 0:40:59.400
<v Speaker 1>We infer what's out there, and it all happens unconsciously.
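
NOTE
Editor's sketch, not from the episode: unconscious inference is often formalized with Bayes' theorem, in which belief in each hypothesis is proportional to its prior probability times how well it explains the data. The toy example reuses the dog scenario from earlier. All probabilities and names are hypothetical and chosen only for illustration.
```python
# Bayes' theorem over a small set of hypotheses:
# posterior(h) = prior(h) * likelihood(h) / sum over all hypotheses.
def posterior(priors, likelihoods):
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}
# Made-up priors: ordinary dogs are far more common than monstrously huge ones.
priors = {"ordinary dog nearby": 0.999, "giant dog far away": 0.001}
# The retinal image is equally consistent with both, so the data cannot decide.
likelihoods = {"ordinary dog nearby": 0.5, "giant dog far away": 0.5}
post = posterior(priors, likelihoods)
best = max(post, key=post.get)
print(best)
```
Because the image fits both hypotheses equally well, the prior does all the work, which is the sense in which expectations settle an ambiguous input.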

0:41:00.000 --> 0:41:02.839
<v Speaker 1>You can also look up Bayes' theorem as a way

0:41:02.920 --> 0:41:05.800
<v Speaker 1>of approaching this mathematically. One way to think about this

0:41:06.080 --> 0:41:09.160
<v Speaker 1>is that our judgments often rely not only on what's

0:41:09.200 --> 0:41:11.120
<v Speaker 1>in front of us, but also on all of our

0:41:11.239 --> 0:41:31.600
<v Speaker 1>prior experiences. So where we are so far is that

0:41:31.719 --> 0:41:35.719
<v Speaker 1>the process of perceiving the world, of interpreting what we

0:41:35.760 --> 0:41:40.160
<v Speaker 1>see or we hear, it's influenced by our past experiences,

0:41:40.239 --> 0:41:45.000
<v Speaker 1>which shape our current expectations, and that's what determines what

0:41:45.160 --> 0:41:49.000
<v Speaker 1>we think we see and hear. Now, it's sometimes the

0:41:49.120 --> 0:41:52.960
<v Speaker 1>case that your brain has more than one prior expectation.

0:41:53.120 --> 0:41:55.160
<v Speaker 1>It could be this, or it could be that, And

0:41:55.280 --> 0:41:58.680
<v Speaker 1>in this case it's easier to witness something very interesting,

0:41:58.680 --> 0:42:00.799
<v Speaker 1>which I want to tell you about now. So let's

0:42:00.880 --> 0:42:05.279
<v Speaker 1>return to the Necker cube, that little wireframe drawing. So

0:42:05.360 --> 0:42:09.160
<v Speaker 1>it's a very simple drawing, but it exposes something amazing,

0:42:09.400 --> 0:42:13.879
<v Speaker 1>which is a competition that is always raging under the hood.

0:42:14.880 --> 0:42:17.440
<v Speaker 1>Your brain is always trying to figure out what is

0:42:17.560 --> 0:42:20.320
<v Speaker 1>going on out there, and the way it does this

0:42:20.920 --> 0:42:24.759
<v Speaker 1>is by assessing probabilities. So this simply means you have

0:42:25.000 --> 0:42:28.320
<v Speaker 1>some networks that are saying, yes, it's definitely this, and

0:42:28.400 --> 0:42:31.240
<v Speaker 1>you have other networks that are saying, yes, it's definitely

0:42:31.320 --> 0:42:31.960
<v Speaker 1>this other thing.

0:42:32.800 --> 0:42:34.040
<v Speaker 2>In the case of the Necker.

0:42:33.880 --> 0:42:36.680
<v Speaker 1>Cube, you have one network saying the cube comes out

0:42:36.719 --> 0:42:40.000
<v Speaker 1>of the page this way, and the other network insisting

0:42:40.040 --> 0:42:42.080
<v Speaker 1>the cube comes out of the page the other way.

0:42:43.120 --> 0:42:45.920
<v Speaker 1>And in other illusions you sometimes have even more networks,

0:42:45.960 --> 0:42:49.000
<v Speaker 1>each voting for their thing. But the key to understand

0:42:49.560 --> 0:42:53.560
<v Speaker 1>is that it's a competition. All these networks are screaming

0:42:53.680 --> 0:42:57.160
<v Speaker 1>against each other and trying to dominate each other, and it's a

0:42:57.719 --> 0:42:59.960
<v Speaker 1>winner take all competition.

0:43:00.360 --> 0:43:01.960
<v Speaker 2>It's like king of the hill.

0:43:02.040 --> 0:43:04.520
<v Speaker 1>Whichever kid is able to get to the top of

0:43:04.600 --> 0:43:07.719
<v Speaker 1>the hill gets to push everyone else down. In the

0:43:07.840 --> 0:43:12.760
<v Speaker 1>case of local neural networks, when one is successfully firing

0:43:12.840 --> 0:43:16.600
<v Speaker 1>on all cylinders, it's able to inhibit the neighboring networks.

0:43:17.080 --> 0:43:21.440
<v Speaker 1>It releases neurotransmitters that keep itself propped up and at

0:43:21.480 --> 0:43:25.920
<v Speaker 1>the same time inhibiting the activity of the competitors, and

0:43:26.040 --> 0:43:27.800
<v Speaker 1>whichever network.

0:43:27.520 --> 0:43:29.800
<v Speaker 2>Is king is what you perceive.

0:43:30.840 --> 0:43:33.960
<v Speaker 1>And because it's a winner take all competition, there's only

0:43:34.080 --> 0:43:35.239
<v Speaker 1>one king at any time.

0:43:35.800 --> 0:43:37.120
<v Speaker 2>That's why you don't.

0:43:36.960 --> 0:43:41.600
<v Speaker 1>See all the possibilities at once. You only see the winner.
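
NOTE
Editor's sketch, not from the episode: a winner-take-all competition can be imitated with two units that each excite themselves and inhibit the other. This is a toy rate model with hypothetical weights and inputs, not a description of real cortical circuitry.
```python
# Two competing units with self-excitation and mutual inhibition.
# Activities are clamped to the range [0, 1]; all parameters are made up.
def clamp(x):
    return max(0.0, min(1.0, x))
def compete(input_a, input_b, steps=500, dt=0.1, w_self=0.5, w_inhibit=1.0):
    a = b = 0.0
    for _ in range(steps):
        drive_a = clamp(input_a + w_self * a - w_inhibit * b)
        drive_b = clamp(input_b + w_self * b - w_inhibit * a)
        # Simultaneous Euler update: each activity relaxes toward its drive.
        a, b = a + dt * (drive_a - a), b + dt * (drive_b - b)
    return a, b
# Unit A gets a slightly stronger input, so it ends up suppressing unit B.
a, b = compete(0.55, 0.50)
print(round(a, 3), round(b, 3))
```
A small difference in input is amplified until one unit is fully active and the other is silenced, so only one interpretation is left standing at any moment.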

0:43:42.640 --> 0:43:45.360
<v Speaker 1>But here's the wacky thing with the Necker cube. It

0:43:45.520 --> 0:43:49.640
<v Speaker 1>really could be either way. It's equally probable that these

0:43:49.760 --> 0:43:52.399
<v Speaker 1>lines represent a cube this way or represent a cube

0:43:52.400 --> 0:43:55.239
<v Speaker 1>the other way. There's a fifty percent chance of either

0:43:55.280 --> 0:43:58.440
<v Speaker 1>of these. This is known as equiprobable. So your

0:43:58.520 --> 0:44:03.000
<v Speaker 1>brain takes this equiprobable stimulus and nails it down

0:44:03.080 --> 0:44:06.120
<v Speaker 1>to one choice or the other. But if you have

0:44:06.320 --> 0:44:09.000
<v Speaker 1>stared at one of these drawings for more than ten seconds,

0:44:09.120 --> 0:44:14.080
<v Speaker 1>you know that your brain changes its interpretation. If you

0:44:14.200 --> 0:44:16.680
<v Speaker 1>stare at this wireframe, it looks like it's coming out

0:44:16.719 --> 0:44:19.280
<v Speaker 1>one way from the page, but if you keep staring,

0:44:19.320 --> 0:44:21.920
<v Speaker 1>it'll switch so that it looks like it's coming out

0:44:21.960 --> 0:44:24.279
<v Speaker 1>the other way. And if you stare at this for

0:44:24.320 --> 0:44:27.040
<v Speaker 1>a little while, you'll see that it switches back and forth.

0:44:27.239 --> 0:44:29.440
<v Speaker 1>You see it one way then the other way. Your

0:44:29.480 --> 0:44:32.319
<v Speaker 1>brain will stick with one interpretation for a little while

0:44:32.440 --> 0:44:35.239
<v Speaker 1>and tell you that's what's in the world, and then

0:44:35.280 --> 0:44:40.680
<v Speaker 1>it will suddenly change its claim. Why? Because, as I said,

0:44:40.719 --> 0:44:43.200
<v Speaker 1>there's a fifty percent chance of interpreting the cube one

0:44:43.239 --> 0:44:46.040
<v Speaker 1>way or the other, and the brain cannot see both

0:44:46.080 --> 0:44:50.080
<v Speaker 1>interpretations at the same time, so it switches between them.

0:44:50.200 --> 0:44:53.600
<v Speaker 2>It's the king of the hill game, but the king

0:44:53.719 --> 0:44:54.600
<v Speaker 2>never lasts.

0:44:55.080 --> 0:44:57.919
<v Speaker 1>Someone always manages to knock that kid off the top,

0:44:58.040 --> 0:45:00.480
<v Speaker 1>and then the new kid has to defend the

0:45:00.600 --> 0:45:04.719
<v Speaker 1>throne against other invaders. And that's precisely what happens with

0:45:04.880 --> 0:45:09.320
<v Speaker 1>these neural network competitions. One network wins, but it doesn't

0:45:09.440 --> 0:45:12.880
<v Speaker 1>last that long before it gets unseated. And then the

0:45:12.960 --> 0:45:16.680
<v Speaker 1>other network is active in a loop of self-reinforcing

0:45:16.760 --> 0:45:20.560
<v Speaker 1>neurons firing. It gets to keep control, be king of

0:45:20.640 --> 0:45:22.720
<v Speaker 1>the North for a little bit, but then the first

0:45:22.800 --> 0:45:26.239
<v Speaker 1>one unseats it again. So what you see with the

0:45:26.360 --> 0:45:31.520
<v Speaker 1>simple drawing is the ever present, active battle in your

0:45:31.600 --> 0:45:35.640
<v Speaker 1>skull to control perception. So, in other words, if you

0:45:35.800 --> 0:45:40.040
<v Speaker 1>have two possible top-down models, either

0:45:39.800 --> 0:45:44.080
<v Speaker 5>of which could equally be right, they'll fight and you'll

0:45:44.200 --> 0:45:48.800
<v Speaker 5>believe whoever the temporary winner is, and then you'll believe

0:45:48.840 --> 0:45:51.000
<v Speaker 5>the next guy when he's back in power, and then

0:45:51.040 --> 0:45:51.760
<v Speaker 5>the first network

0:45:51.800 --> 0:45:52.080
<v Speaker 2>again.

0:45:53.520 --> 0:45:57.080
<v Speaker 1>Now the dress tends not to switch, and this is

0:45:57.160 --> 0:46:02.400
<v Speaker 1>because it's not equiprobable. Our brain has developed very clear

0:46:02.960 --> 0:46:07.120
<v Speaker 1>prior expectations about lighting and fabric and windows and so on.

0:46:07.800 --> 0:46:10.960
<v Speaker 1>So my brain makes an interpretation and your brain makes

0:46:11.000 --> 0:46:14.360
<v Speaker 1>an interpretation, and there's no reason for either one of

0:46:14.480 --> 0:46:17.560
<v Speaker 1>them to question it. It's like playing king of the

0:46:17.640 --> 0:46:20.359
<v Speaker 1>hill against some small puppies. No one's going to knock

0:46:20.400 --> 0:46:24.320
<v Speaker 1>you off the throne. And that's why it's so difficult

0:46:24.600 --> 0:46:27.600
<v Speaker 1>to change your interpretation of the dress, even when you're

0:46:27.719 --> 0:46:30.840
<v Speaker 1>told that some other interpretation is possible.

0:46:31.400 --> 0:46:32.920
<v Speaker 2>Your brain relies on

0:46:33.200 --> 0:46:37.120
<v Speaker 1>deep assumptions about the world, and it's generally just too

0:46:37.239 --> 0:46:41.040
<v Speaker 1>hard to unseat the monarch. But what the Necker cube

0:46:41.160 --> 0:46:45.160
<v Speaker 1>reveals is that our brain's interpretation of the world can

0:46:45.239 --> 0:46:49.160
<v Speaker 1>be quite active if there are other equally likely interpretations

0:46:49.239 --> 0:46:51.960
<v Speaker 1>to be had. So the way we see the world

0:46:52.440 --> 0:46:56.160
<v Speaker 1>can change from moment to moment. Now, as just a

0:46:56.239 --> 0:46:59.640
<v Speaker 1>one-minute tangent, the funny thing is that you think

0:46:59.800 --> 0:47:03.560
<v Speaker 1>you are making the cube switch interpretations by yourself. In

0:47:03.640 --> 0:47:06.279
<v Speaker 1>other words, you feel like you're doing it consciously when

0:47:06.320 --> 0:47:07.840
<v Speaker 1>the cube switches back and forth.

0:47:08.560 --> 0:47:11.520
<v Speaker 2>But let's say we measure this. You stare at the

0:47:11.560 --> 0:47:15.160
<v Speaker 2>little cube and you hold down one key when you

0:47:15.239 --> 0:47:17.799
<v Speaker 1>see it in this configuration, and as soon as your

0:47:17.840 --> 0:47:20.520
<v Speaker 1>perception switches and it now looks the other way, you hold

0:47:20.600 --> 0:47:23.120
<v Speaker 1>down the other key. And you do this for a while,

0:47:23.520 --> 0:47:26.120
<v Speaker 1>back and forth and back and forth. You hold down

0:47:26.160 --> 0:47:29.399
<v Speaker 1>a key to let me know which perception you are seeing.

0:47:30.200 --> 0:47:32.720
<v Speaker 1>And remember how amazing this is because nothing is changing

0:47:32.800 --> 0:47:34.239
<v Speaker 1>on the page. It's only in your head.

0:47:35.040 --> 0:47:35.440
<v Speaker 2>Anyway.

0:47:35.680 --> 0:47:38.319
<v Speaker 1>When we look at the data, it's clear that your

0:47:38.400 --> 0:47:42.240
<v Speaker 1>results follow a particular mathematical distribution called a gamma distribution,

0:47:42.840 --> 0:47:46.719
<v Speaker 1>which comes from a random process. For the aficionados, this

0:47:46.840 --> 0:47:50.719
<v Speaker 1>is consistent with a Poisson process. All this means is

0:47:50.760 --> 0:47:54.320
<v Speaker 1>that this switching is random, and this is exactly the

0:47:54.400 --> 0:47:58.680
<v Speaker 1>distribution you would expect from randomness. Sometimes you have the

0:47:58.800 --> 0:48:01.640
<v Speaker 1>winning network holding on to the throne for a long time,

0:48:01.719 --> 0:48:04.000
<v Speaker 1>sometimes for a short time, and on average it

0:48:04.120 --> 0:48:07.600
<v Speaker 1>lasts this medium amount of time before it switches. The

0:48:07.800 --> 0:48:12.840
<v Speaker 1>point is you think you're switching consciously, but it's just random.

0:48:13.480 --> 0:48:16.920
<v Speaker 1>The reason you take credit is because you think, Okay,

0:48:16.920 --> 0:48:19.160
<v Speaker 1>I'm seeing it this way, and I really want to

0:48:19.160 --> 0:48:21.280
<v Speaker 1>make it switch the other way, and I'm going

0:48:21.239 --> 0:48:23.279
<v Speaker 2>to consciously work to switch it.

0:48:23.719 --> 0:48:28.480
<v Speaker 1>Okay, almost there, not quite working, still trying, and then

0:48:28.520 --> 0:48:32.720
<v Speaker 1>it randomly switches and you take credit for it. Here's

0:48:32.719 --> 0:48:35.000
<v Speaker 1>an analogy to help us understand that. You know those

0:48:35.320 --> 0:48:38.200
<v Speaker 1>pedestrian crossing buttons that you push when you want to

0:48:38.320 --> 0:48:41.480
<v Speaker 1>cross the street, and the little walk signal eventually shows

0:48:41.560 --> 0:48:43.120
<v Speaker 1>up and lets you know that you're safe to walk?

0:48:44.000 --> 0:48:47.800
<v Speaker 2>Some fraction of those buttons are placebos. They're fake. You

0:48:48.000 --> 0:48:50.400
<v Speaker 2>hit them, but they don't do anything.

0:48:50.680 --> 0:48:52.920
<v Speaker 1>You wait for exactly the same amount of time that

0:48:53.080 --> 0:48:56.360
<v Speaker 1>you would have waited anyway, but you have a sense

0:48:56.400 --> 0:49:00.200
<v Speaker 1>of control, an illusion of power over the light, even

0:49:00.239 --> 0:49:02.560
<v Speaker 1>though the timing doesn't change one bit.

0:49:03.200 --> 0:49:06.400
<v Speaker 2>And this is exactly the situation with this switching of

0:49:06.640 --> 0:49:07.440
<v Speaker 2>the Necker cube.

0:49:07.960 --> 0:49:11.759
<v Speaker 1>You consciously try to change it, and when it eventually

0:49:11.960 --> 0:49:14.880
<v Speaker 1>changes on its own, you think, yeah, that was because

0:49:14.920 --> 0:49:18.400
<v Speaker 1>of me. But when we measure the switching times, it

0:49:18.520 --> 0:49:21.919
<v Speaker 1>doesn't change anything at all, whether you're trying or not trying,

0:49:22.000 --> 0:49:26.560
<v Speaker 1>whether you're banging on that button or ignoring it. Okay,

0:49:26.640 --> 0:49:28.560
<v Speaker 1>so now I want to zoom back up to the

0:49:28.640 --> 0:49:30.960
<v Speaker 1>big picture about what we've been talking about, which is

0:49:31.040 --> 0:49:33.800
<v Speaker 1>how your brain makes assumptions about things, and how in

0:49:33.920 --> 0:49:38.080
<v Speaker 1>some circumstances these assumptions can fight it out. So we

0:49:38.200 --> 0:49:41.880
<v Speaker 1>see this in language often. Take the example of puns.

0:49:42.440 --> 0:49:45.720
<v Speaker 1>Puns strike us as funny because we're able to switch

0:49:45.840 --> 0:49:48.839
<v Speaker 1>back and forth and see the same thing in two

0:49:49.000 --> 0:49:52.040
<v Speaker 1>different ways. What do you get when you drop a

0:49:52.239 --> 0:49:57.560
<v Speaker 1>piano down a mine shaft? A flat minor. The point

0:49:57.560 --> 0:50:00.359
<v Speaker 1>about puns is that we know from the smile

0:50:00.440 --> 0:50:02.719
<v Speaker 1>on the other person's face that there's some joke to

0:50:02.800 --> 0:50:06.239
<v Speaker 1>be had, and so we search for other interpretations, and

0:50:06.360 --> 0:50:09.200
<v Speaker 1>we can switch back and forth between them, just like

0:50:09.280 --> 0:50:12.719
<v Speaker 1>a Necker cube. But something I found interesting is that

0:50:13.160 --> 0:50:16.160
<v Speaker 1>brains can be lazy, and we don't always bother

0:50:16.280 --> 0:50:19.759
<v Speaker 1>seeking other interpretations. If you don't have a strong enough

0:50:19.880 --> 0:50:24.399
<v Speaker 1>reason to have more than one interpretation, then you stick

0:50:24.480 --> 0:50:28.120
<v Speaker 1>with what you've got. And this is often true in language,

0:50:28.280 --> 0:50:32.600
<v Speaker 1>which is very low bandwidth and depends enormously on assumptions.

0:50:33.120 --> 0:50:35.279
<v Speaker 1>So the other night I was at a party and

0:50:35.440 --> 0:50:38.600
<v Speaker 1>somehow the conversation moved in a direction where I mentioned

0:50:39.080 --> 0:50:44.040
<v Speaker 1>the famous book by Rachel Carson called Silent Spring. It

0:50:44.200 --> 0:50:45.800
<v Speaker 1>just so happened that no one there had heard of

0:50:45.840 --> 0:50:48.880
<v Speaker 1>this book. So in a sentence, I explained that the

0:50:49.120 --> 0:50:53.320
<v Speaker 1>author had argued that if pesticide use continued, there wouldn't

0:50:53.320 --> 0:50:56.759
<v Speaker 1>be any more birds, and so the spring season would

0:50:56.800 --> 0:50:59.560
<v Speaker 1>come around and we would hear no more chirping. It

0:50:59.560 --> 0:51:03.879
<v Speaker 1>would be silent. And I was sort of surprised when everyone said, oh,

0:51:04.880 --> 0:51:07.440
<v Speaker 1>like I had just cleared up some confusion for them,

0:51:08.080 --> 0:51:11.760
<v Speaker 1>because it turns out that when I had said silent spring,

0:51:12.440 --> 0:51:14.760
<v Speaker 1>the person to my left thought I was talking about

0:51:14.840 --> 0:51:17.720
<v Speaker 1>a spring like a creek, so she interpreted the title

0:51:17.840 --> 0:51:21.479
<v Speaker 1>as silent river, and the person to my right thought

0:51:21.520 --> 0:51:26.120
<v Speaker 1>of spring like boing boing spring. And the person across

0:51:26.160 --> 0:51:28.520
<v Speaker 1>from me thought I was talking about the word spring

0:51:29.120 --> 0:51:32.880
<v Speaker 1>like the verb to jump, so he pictured silent spring

0:51:33.440 --> 0:51:37.239
<v Speaker 1>as a lion springing on him silently. And this is

0:51:37.320 --> 0:51:39.800
<v Speaker 1>typical of the way that we take in little bits

0:51:39.840 --> 0:51:43.120
<v Speaker 1>of data and impose an interpretation on them, and then

0:51:43.160 --> 0:51:47.560
<v Speaker 1>we're done. Our brains aren't generally incentivized to keep looking

0:51:47.640 --> 0:51:51.400
<v Speaker 1>for interpretations. You pick one and that's it. And by

0:51:51.440 --> 0:51:54.600
<v Speaker 1>the way, that's typically what happens with Laurel and Yanny.

0:51:54.960 --> 0:51:57.919
<v Speaker 1>If you didn't know to listen hard for something else,

0:51:58.120 --> 0:52:02.920
<v Speaker 1>you probably wouldn't. And with green needle and brainstorm, unless

0:52:02.960 --> 0:52:07.080
<v Speaker 1>you were told to switch your perception, you probably wouldn't

0:52:07.120 --> 0:52:10.440
<v Speaker 1>have even thought to try it. And so I often

0:52:10.640 --> 0:52:13.520
<v Speaker 1>wonder about the ways that we do this with many

0:52:13.680 --> 0:52:17.560
<v Speaker 1>things around us. We pick some top-down model and

0:52:17.719 --> 0:52:20.080
<v Speaker 1>that seems to match the bottom-up data, and it

0:52:20.200 --> 0:52:24.000
<v Speaker 1>doesn't strike us to examine further because we're pretty sure

0:52:24.080 --> 0:52:26.759
<v Speaker 1>we have a match. I'll leave this as an open

0:52:26.840 --> 0:52:29.320
<v Speaker 1>question for all of us to think about places in

0:52:29.400 --> 0:52:32.520
<v Speaker 1>our life that maybe we haven't even thought to

0:52:32.840 --> 0:52:37.759
<v Speaker 1>re-examine more deeply. So to wrap this up, when this

0:52:37.960 --> 0:52:41.120
<v Speaker 1>woman in the UK sent a little cell phone photo

0:52:41.239 --> 0:52:44.600
<v Speaker 1>to her daughter about her dress, it not only broke

0:52:44.719 --> 0:52:48.800
<v Speaker 1>the Internet, but more importantly, it breaks for us a

0:52:48.960 --> 0:52:53.360
<v Speaker 1>critical assumption that almost everyone carries around, the assumption that

0:52:53.480 --> 0:52:55.719
<v Speaker 1>when I look at the world and you look at

0:52:55.760 --> 0:53:00.319
<v Speaker 1>the world, we see the same thing. The naive assumption

0:53:00.600 --> 0:53:04.200
<v Speaker 1>is that there is simply truth out there and it's

0:53:04.320 --> 0:53:07.120
<v Speaker 1>just a matter of opening your eyes. But the dress

0:53:07.440 --> 0:53:11.479
<v Speaker 1>and hundreds of other illusions reveal that we don't see

0:53:11.560 --> 0:53:18.120
<v Speaker 1>the world out there directly. Everything is interpretation. We only

0:53:18.239 --> 0:53:22.520
<v Speaker 1>have a bit of data dribbling in through our peripheral devices,

0:53:22.600 --> 0:53:26.240
<v Speaker 1>our sensory organs, and that data enters into a brain

0:53:26.760 --> 0:53:30.440
<v Speaker 1>that's already churning and bubbling with its own activity, its

0:53:30.480 --> 0:53:35.400
<v Speaker 1>own expectations, and so all we ever perceive is the

0:53:35.680 --> 0:53:39.319
<v Speaker 1>best guess from our neural networks about what is going

0:53:39.480 --> 0:53:43.640
<v Speaker 1>on out there, given a little rough data and a

0:53:43.800 --> 0:53:48.759
<v Speaker 1>lot of assumptions shaped by our past experiences. So the

0:53:48.920 --> 0:53:52.440
<v Speaker 1>next time you see a face in an electrical outlet,

0:53:52.680 --> 0:53:55.719
<v Speaker 1>or you see a shape in a Rorschach blot, or you

0:53:55.880 --> 0:53:59.480
<v Speaker 1>see the dress and feel certain about its color, just

0:53:59.600 --> 0:54:03.279
<v Speaker 1>remember you are not seeing the world as it is.

0:54:03.960 --> 0:54:06.319
<v Speaker 2>You are seeing it as you are.

0:54:10.560 --> 0:54:13.520
<v Speaker 1>Go to Eagleman dot com slash podcast for more information

0:54:13.840 --> 0:54:17.680
<v Speaker 1>and to find further reading. Send me an email at

0:54:17.880 --> 0:54:21.880
<v Speaker 1>podcast at eagleman dot com with questions or discussions, and

0:54:22.000 --> 0:54:24.120
<v Speaker 1>I'm going to be making episodes in which I address

0:54:24.200 --> 0:54:26.279
<v Speaker 1>those. Reaching

0:54:25.960 --> 0:54:29.440
<v Speaker 2>out on a narrow road from my internal world to yours,

0:54:29.800 --> 0:54:32.000
<v Speaker 2>this is David Eagleman, and thank you for joining me

0:54:32.239 --> 0:54:33.640
<v Speaker 2>in the inner cosmos.