WEBVTT - Ep23 "How can we learn to speak alien?"

0:00:04.720 --> 0:00:08.079
<v Speaker 1>Imagine that one of these centuries we make contact with

0:00:08.240 --> 0:00:09.760
<v Speaker 1>an alien civilization.

0:00:10.039 --> 0:00:11.600
<v Speaker 2>It's a big

0:00:11.400 --> 0:00:15.880
<v Speaker 1>cosmos with quintillions of planets, so it's bound to happen

0:00:15.920 --> 0:00:18.279
<v Speaker 1>at some point. But how the heck are we going

0:00:18.320 --> 0:00:22.320
<v Speaker 1>to understand what they're saying? How are we going to

0:00:22.400 --> 0:00:26.960
<v Speaker 1>decode their language? After all, they might not communicate with

0:00:27.240 --> 0:00:30.280
<v Speaker 1>air compression waves. Maybe they do something visual, but in

0:00:30.360 --> 0:00:32.640
<v Speaker 1>ranges of light we can't even pick up with our eyes.

0:00:33.320 --> 0:00:37.000
<v Speaker 1>We won't have a Rosetta Stone, so how are we

0:00:37.080 --> 0:00:40.199
<v Speaker 1>going to decipher what they are trying to say to us?

0:00:40.680 --> 0:00:43.120
<v Speaker 1>And this might seem speculative, but what I want to

0:00:43.240 --> 0:00:46.600
<v Speaker 1>draw our attention to is that we currently are in

0:00:46.640 --> 0:00:50.480
<v Speaker 1>the same position right now, right here at home, which

0:00:50.520 --> 0:00:54.400
<v Speaker 1>is that we can't tell what a single one of

0:00:54.520 --> 0:00:58.800
<v Speaker 1>the two million species on our planet are saying, not

0:00:58.920 --> 0:01:03.080
<v Speaker 1>even the sixty-six hundred species of mammals, which are

0:01:03.160 --> 0:01:07.319
<v Speaker 1>presumably kind of like us. We're not having conversations with

0:01:07.640 --> 0:01:12.360
<v Speaker 1>anyone but ourselves. With all these species, I'm willing to

0:01:12.400 --> 0:01:15.000
<v Speaker 1>bet you don't listen to a single podcast not

0:01:15.120 --> 0:01:16.479
<v Speaker 2>made by a human.

0:01:17.360 --> 0:01:21.640
<v Speaker 1>But today we're going to see some hope, some pathways

0:01:21.680 --> 0:01:27.080
<v Speaker 1>for how we might get to animal translation and relatively soon.

0:01:31.120 --> 0:01:34.480
<v Speaker 1>Welcome to Inner Cosmos with me, David Eagleman. I'm a

0:01:34.520 --> 0:01:38.600
<v Speaker 1>neuroscientist and an author at Stanford, and in these episodes

0:01:39.040 --> 0:01:44.120
<v Speaker 1>I examine the intersection of science and our lives, and

0:01:44.200 --> 0:01:54.360
<v Speaker 1>today we're going to talk about understanding animals. When I

0:01:54.400 --> 0:01:57.440
<v Speaker 1>was a kid, I saw some episodes of Star Trek,

0:01:57.520 --> 0:02:00.920
<v Speaker 1>the original one with Kirk and Spock, and the thing

0:02:00.960 --> 0:02:04.200
<v Speaker 1>that always struck me was how every week on schedule

0:02:04.680 --> 0:02:10.040
<v Speaker 1>they discovered new alien civilizations, which is not so crazy

0:02:10.160 --> 0:02:14.839
<v Speaker 1>given that the universe is presumably teeming with life. There

0:02:14.880 --> 0:02:18.399
<v Speaker 1>are about one hundred billion galaxies, and each of these

0:02:18.440 --> 0:02:22.960
<v Speaker 1>has about one hundred billion stars, and most stars have

0:02:23.080 --> 0:02:28.200
<v Speaker 1>some planets rolling around them, so it's extraordinarily unlikely that

0:02:28.280 --> 0:02:31.040
<v Speaker 1>we are the only planet with life on it. But

0:02:31.160 --> 0:02:35.440
<v Speaker 1>before I get into alien communication, let's quickly address something first.

0:02:35.480 --> 0:02:39.800
<v Speaker 1>You've probably heard of the Fermi paradox, and if you haven't,

0:02:39.840 --> 0:02:43.519
<v Speaker 1>it's a very important question. It's the question of why,

0:02:43.680 --> 0:02:46.160
<v Speaker 1>if there's all this life in the cosmos, why have

0:02:46.280 --> 0:02:50.480
<v Speaker 1>we not heard a peep from anyone? This paradox is

0:02:50.560 --> 0:02:54.680
<v Speaker 1>named after the physicist Enrico Fermi, who raised this question:

0:02:54.840 --> 0:02:59.800
<v Speaker 1>if there are so many potential alien civilizations, why haven't

0:02:59.840 --> 0:03:03.400
<v Speaker 1>we detected any signals or encountered any of them yet?

0:03:03.880 --> 0:03:06.160
<v Speaker 1>We're living in a moment of history where there seems

0:03:06.160 --> 0:03:10.919
<v Speaker 1>to be this very strange contradiction between the high probability

0:03:11.040 --> 0:03:16.160
<v Speaker 1>of extraterrestrial civilizations and the lack of any shred of

0:03:16.240 --> 0:03:19.920
<v Speaker 1>evidence for them. So over the decades, people have proposed

0:03:19.960 --> 0:03:25.320
<v Speaker 1>all kinds of possible explanations for the Fermi paradox. The

0:03:25.360 --> 0:03:28.920
<v Speaker 1>first is that maybe aliens don't exist, which is the

0:03:28.960 --> 0:03:34.080
<v Speaker 1>simplest explanation but presumably not terribly likely given the size

0:03:34.120 --> 0:03:37.280
<v Speaker 1>of the cosmos. So some people point out that maybe

0:03:37.320 --> 0:03:41.440
<v Speaker 1>the problem just has to do with the enormous distances,

0:03:41.840 --> 0:03:46.360
<v Speaker 1>the vastness of space, and the limitations of our current

0:03:46.360 --> 0:03:50.360
<v Speaker 1>technology that might make it hard to detect other civilizations

0:03:50.440 --> 0:03:54.119
<v Speaker 1>even if they exist, because the distances between stars are

0:03:54.800 --> 0:03:59.080
<v Speaker 1>enormous and signals may take thousands or millions of years

0:03:59.120 --> 0:04:02.240
<v Speaker 1>to reach us. Okay, so that's a possibility. Or a

0:04:02.360 --> 0:04:06.640
<v Speaker 1>related idea is what's called the Rare Earth hypothesis, which

0:04:06.680 --> 0:04:11.400
<v Speaker 1>is that Earth-like planets, which are capable of supporting

0:04:11.560 --> 0:04:16.400
<v Speaker 1>complex life, are exceedingly uncommon in the universe, and that

0:04:16.440 --> 0:04:22.760
<v Speaker 1>makes the emergence of intelligent civilizations a rare event. But again,

0:04:23.240 --> 0:04:27.840
<v Speaker 1>given that there are something like seventy quintillion planets, that's

0:04:27.880 --> 0:04:33.039
<v Speaker 1>a seven followed by nineteen zeros, Earth-like planets can't

0:04:33.080 --> 0:04:39.120
<v Speaker 1>be too rare. So another idea is the technological singularity idea.

0:04:39.720 --> 0:04:44.640
<v Speaker 1>Some thinkers have proposed that advanced civilizations might always end

0:04:44.720 --> 0:04:51.840
<v Speaker 1>up reaching a technological singularity when their technology suddenly accelerates rapidly,

0:04:52.400 --> 0:04:56.080
<v Speaker 1>and one possible outcome of this is that these civilizations

0:04:56.480 --> 0:05:00.640
<v Speaker 1>tend to self-destruct, and a related hypothesis is that

0:05:00.960 --> 0:05:04.919
<v Speaker 1>when other civilizations hit this singularity, this leads them to

0:05:05.080 --> 0:05:10.640
<v Speaker 1>a post-biological existence. They're no longer products of nature,

0:05:10.680 --> 0:05:14.320
<v Speaker 1>but instead they build themselves into other sorts of devices,

0:05:14.800 --> 0:05:17.800
<v Speaker 1>which would be hard for us to detect given the

0:05:17.800 --> 0:05:22.120
<v Speaker 1>ways that we're searching. And other people suggest that advanced

0:05:22.160 --> 0:05:28.279
<v Speaker 1>civilizations intentionally avoid broadcasting their signals or their presence for

0:05:28.400 --> 0:05:33.560
<v Speaker 1>fear of attracting unwanted attention or causing conflicts with less

0:05:33.600 --> 0:05:38.440
<v Speaker 1>advanced civilizations. Or maybe they're just not interested in contacting us.

0:05:38.440 --> 0:05:40.760
<v Speaker 1>They might be too busy with their own problems, or

0:05:40.800 --> 0:05:44.600
<v Speaker 1>they simply don't see us as an interesting threat or ally,

0:05:44.680 --> 0:05:46.920
<v Speaker 1>so there's no reason to pick up the phone. And

0:05:46.960 --> 0:05:52.080
<v Speaker 1>then there's the possibility that they use extremely different communication

0:05:52.279 --> 0:05:56.200
<v Speaker 1>methods than we do, so different that we can't currently

0:05:56.600 --> 0:05:57.560
<v Speaker 1>understand them.

0:05:57.520 --> 0:05:59.719
<v Speaker 2>Or even know what we should be looking for.

0:06:00.480 --> 0:06:03.800
<v Speaker 1>So ultimately we don't know why we haven't heard from

0:06:03.880 --> 0:06:07.760
<v Speaker 1>anyone yet. There may be other reasons, or maybe multiple

0:06:07.800 --> 0:06:10.680
<v Speaker 1>of the reasons I mentioned are all at play, but

0:06:10.720 --> 0:06:12.719
<v Speaker 1>for now we just have to live with the fact

0:06:12.800 --> 0:06:16.920
<v Speaker 1>that we haven't yet heard from anyone. So this was

0:06:17.000 --> 0:06:19.960
<v Speaker 1>part of the appeal of Star Trek. Every week they're

0:06:20.480 --> 0:06:22.960
<v Speaker 1>space-warping off to some new coordinates, and

0:06:22.920 --> 0:06:26.320
<v Speaker 2>everywhere they go they meet new civilizations.

0:06:26.960 --> 0:06:29.560
<v Speaker 1>Now, the thing that always struck me as funny and

0:06:29.720 --> 0:06:33.360
<v Speaker 1>the point of this episode is that each week they

0:06:33.480 --> 0:06:37.320
<v Speaker 1>end up meeting these new aliens. And often these aliens

0:06:37.360 --> 0:06:40.840
<v Speaker 1>look like a female movie star in a cool jumpsuit,

0:06:40.960 --> 0:06:44.400
<v Speaker 1>but with subtle differences like pointy ears and green skin.

0:06:44.760 --> 0:06:48.800
<v Speaker 1>But the key thing is that all these aliens speak English.

0:06:49.360 --> 0:06:52.719
<v Speaker 1>Usually it's a slightly broken English with a difficult to

0:06:52.800 --> 0:06:57.920
<v Speaker 1>discern accent, but nonetheless pretty easily understandable, which is of

0:06:57.920 --> 0:07:01.479
<v Speaker 1>course very lucky for these Star Trek voyagers, who happen

0:07:01.560 --> 0:07:05.119
<v Speaker 1>several hundred years from now to speak English themselves. Now,

0:07:05.920 --> 0:07:08.279
<v Speaker 1>why did the writers of Star Trek choose to make

0:07:08.360 --> 0:07:09.960
<v Speaker 1>everyone speak just like we do?

0:07:10.680 --> 0:07:10.920
<v Speaker 2>Well,

0:07:11.000 --> 0:07:14.920
<v Speaker 1>this is a basic constraint of storytelling. It's the only

0:07:14.960 --> 0:07:18.760
<v Speaker 1>thing that will work for telling a narrative that people

0:07:18.800 --> 0:07:22.440
<v Speaker 1>will tune into. It's hard to tell a story if

0:07:22.520 --> 0:07:26.080
<v Speaker 1>the alien is some kind of weird fungus thing that

0:07:26.160 --> 0:07:30.280
<v Speaker 1>doesn't speak or lives at a different timescale than we do,

0:07:30.480 --> 0:07:34.120
<v Speaker 1>like a tree. If we land on a planet of mute,

0:07:34.160 --> 0:07:37.680
<v Speaker 1>slow fungus creatures, it's not going to make good television.

0:07:38.360 --> 0:07:41.920
<v Speaker 1>So the stories we tell will always have aliens that

0:07:41.960 --> 0:07:44.840
<v Speaker 1>we can talk with and that serve as not so

0:07:45.040 --> 0:07:49.560
<v Speaker 1>distant reflections of ourselves. Okay, so no problem, that's what

0:07:49.920 --> 0:07:54.920
<v Speaker 1>storytelling requires. But in real life, it's much more likely

0:07:55.320 --> 0:07:58.400
<v Speaker 1>that we're going to have a very very difficult time

0:07:58.920 --> 0:08:03.760
<v Speaker 1>communicating much of anything to aliens when we find them.

0:08:04.160 --> 0:08:06.840
<v Speaker 1>You might think that we can get by with something

0:08:06.960 --> 0:08:09.480
<v Speaker 1>like take me to your leader or some sort of

0:08:09.520 --> 0:08:13.240
<v Speaker 1>hand signals, but in fact none of that's going to work. Now,

0:08:13.280 --> 0:08:17.040
<v Speaker 1>what's the reason that I say this? Why should we

0:08:17.080 --> 0:08:20.400
<v Speaker 1>think that communication is going to be so difficult? Well,

0:08:20.440 --> 0:08:23.320
<v Speaker 1>the aliens we find on other planets are going to

0:08:23.360 --> 0:08:27.440
<v Speaker 1>have a totally different evolutionary history. They may not be

0:08:27.520 --> 0:08:31.640
<v Speaker 1>based on DNA like all earthly creatures are, but instead

0:08:31.720 --> 0:08:35.959
<v Speaker 1>may have found a completely different way of encoding information

0:08:36.320 --> 0:08:41.240
<v Speaker 1>and managing replication and building societies. And it's possible they

0:08:41.280 --> 0:08:44.719
<v Speaker 1>won't even be carbon based like all creatures on Earth are,

0:08:45.040 --> 0:08:48.560
<v Speaker 1>but instead based on something like the element silicon, which

0:08:48.600 --> 0:08:51.640
<v Speaker 1>also has a tetrahedral structure and can make lots of

0:08:51.760 --> 0:08:54.959
<v Speaker 1>useful compounds and so on. So there's a whole field

0:08:55.000 --> 0:08:59.800
<v Speaker 1>of this called astrobiology, where astro refers to star,

0:09:00.080 --> 0:09:05.040
<v Speaker 1>also known as exobiology, where exo means outside, and the

0:09:05.080 --> 0:09:08.480
<v Speaker 1>idea with this area of study is to search for

0:09:08.760 --> 0:09:13.679
<v Speaker 1>naturally evolved life in the universe, mostly on other planets

0:09:13.720 --> 0:09:17.800
<v Speaker 1>in the habitable zone. That's the Goldilocks zone of planets

0:09:17.800 --> 0:09:21.080
<v Speaker 1>that orbit at just the right distance from their star, so

0:09:21.120 --> 0:09:24.000
<v Speaker 1>it's not too hot, not too cold. I'll also mention

0:09:24.080 --> 0:09:29.080
<v Speaker 1>there's a closely related field called xenobiology, which means alien

0:09:29.200 --> 0:09:33.320
<v Speaker 1>or foreign biology, and that term is usually reserved to

0:09:33.440 --> 0:09:37.640
<v Speaker 1>refer to biology that is synthetic, not found in nature,

0:09:38.200 --> 0:09:41.760
<v Speaker 1>that science has no clue about yet. The general idea

0:09:41.920 --> 0:09:47.520
<v Speaker 1>is that astrobiologists try to detect and eventually analyze life

0:09:47.559 --> 0:09:52.439
<v Speaker 1>elsewhere in the universe, while xenobiologists will attempt to design

0:09:52.880 --> 0:09:56.560
<v Speaker 1>forms of life with a totally different biochemistry or different

0:09:56.559 --> 0:10:00.360
<v Speaker 1>genetic code than on planet Earth. Now, when we search

0:10:00.440 --> 0:10:03.280
<v Speaker 1>for life in the universe, I don't really see any

0:10:03.480 --> 0:10:08.000
<v Speaker 1>reason to imagine a distinction between naturally evolved life on

0:10:08.040 --> 0:10:12.400
<v Speaker 1>other planets and synthetic forms of life, because a planet

0:10:12.440 --> 0:10:15.000
<v Speaker 1>ten thousand years ahead of us would be well on

0:10:15.080 --> 0:10:17.840
<v Speaker 1>its way to building other species in the same way

0:10:17.840 --> 0:10:22.480
<v Speaker 1>that our species existence might simply serve as the spark

0:10:22.800 --> 0:10:26.559
<v Speaker 1>to build an artificial species that colonizes the Solar System.

0:10:27.200 --> 0:10:29.320
<v Speaker 1>And so for this reason we should always be thinking

0:10:29.360 --> 0:10:34.000
<v Speaker 1>about new biologies, new animals that could use a completely

0:10:34.000 --> 0:10:38.319
<v Speaker 1>different kind of biochemistry. On this planet, all we've ever

0:10:38.440 --> 0:10:43.640
<v Speaker 1>seen for coding information is DNA and RNA. We have

0:10:43.800 --> 0:10:47.200
<v Speaker 1>twenty amino acids from which we build all our proteins,

0:10:47.240 --> 0:10:50.559
<v Speaker 1>which are like little molecular machines. But instead of DNA,

0:10:51.400 --> 0:10:56.480
<v Speaker 1>we might find elsewhere what we'll call XNA, xeno nucleic acid.

0:10:56.520 --> 0:11:00.560
<v Speaker 1>You might find a massively expanded genetic code that uses

0:11:00.640 --> 0:11:03.880
<v Speaker 1>other amino acids to build totally new kinds of proteins,

0:11:04.240 --> 0:11:07.160
<v Speaker 1>or perhaps most likely, you could have something that's not

0:11:07.320 --> 0:11:10.760
<v Speaker 1>like our genetic code at all. Something to consider is that

0:11:10.800 --> 0:11:13.959
<v Speaker 1>we only discovered the genetic code in nineteen fifty three,

0:11:14.000 --> 0:11:16.400
<v Speaker 1>which is not that long ago. I actually worked with

0:11:16.480 --> 0:11:20.280
<v Speaker 1>Francis Crick, the co-discoverer of the structure of DNA,

0:11:20.400 --> 0:11:24.600
<v Speaker 1>So this is massively recent and we don't know what

0:11:24.720 --> 0:11:29.000
<v Speaker 1>we haven't thought of yet. So for this reason, I

0:11:29.000 --> 0:11:34.160
<v Speaker 1>think of the whole endeavor of space biology as xenobiology,

0:11:34.240 --> 0:11:37.240
<v Speaker 1>thinking about and looking for ways that life could be

0:11:37.320 --> 0:11:41.200
<v Speaker 1>built that we haven't yet imagined yet. And by the way,

0:11:41.240 --> 0:11:44.280
<v Speaker 1>it's thought exercises like this that make me wish I

0:11:44.320 --> 0:11:48.520
<v Speaker 1>could see the textbooks that our descendants will read

0:11:49.000 --> 0:11:51.440
<v Speaker 1>five hundred years from now or one thousand years from now.

0:11:51.960 --> 0:11:55.959
<v Speaker 1>There's going to be so much known that we currently

0:11:56.040 --> 0:11:59.720
<v Speaker 1>can't imagine. That's totally in our dark zone. Things we

0:11:59.840 --> 0:12:03.600
<v Speaker 1>haven't even realized that we don't know. Okay,

0:12:03.840 --> 0:12:06.920
<v Speaker 1>So all of this is to say life might be

0:12:07.040 --> 0:12:10.760
<v Speaker 1>massively different than what we have here on Earth, and

0:12:10.880 --> 0:12:14.480
<v Speaker 1>the question is will we figure out how to be

0:12:14.800 --> 0:12:18.040
<v Speaker 1>like Captain Kirk and beam onto their planet and have

0:12:18.160 --> 0:12:19.800
<v Speaker 1>a conversation with them.

0:12:20.600 --> 0:12:22.760
<v Speaker 2>Now, maybe all this talk

0:12:22.600 --> 0:12:27.400
<v Speaker 1>about extraterrestrials seems abstract because we haven't discovered any yet.

0:12:28.000 --> 0:12:31.160
<v Speaker 1>But I want to point out that we are surrounded

0:12:31.440 --> 0:12:34.679
<v Speaker 1>by aliens. They are all around us, and they're making

0:12:34.880 --> 0:12:40.880
<v Speaker 1>constant sounds. We have measured this alien language with our recorders,

0:12:41.280 --> 0:12:45.440
<v Speaker 1>and at the moment we have no idea what they're saying.

0:12:52.120 --> 0:12:53.320
<v Speaker 2>Now, that's the

0:12:53.360 --> 0:12:58.760
<v Speaker 1>sound of whales on our planet, and lots of animals

0:12:59.160 --> 0:13:03.080
<v Speaker 1>make noise. And the question is, are these animals that

0:13:03.120 --> 0:13:04.720
<v Speaker 1>we're surrounded

0:13:04.080 --> 0:13:06.480
<v Speaker 2>with speaking a language?

0:13:07.320 --> 0:13:09.920
<v Speaker 1>These are the aliens that we don't have to travel

0:13:10.080 --> 0:13:14.120
<v Speaker 1>very far to listen to, And the question is can

0:13:14.160 --> 0:13:15.200
<v Speaker 1>we understand them?

0:13:15.640 --> 0:13:16.920
<v Speaker 2>So maybe so, maybe not.

0:13:17.080 --> 0:13:21.920
<v Speaker 1>After all, these sounds could be like belches or like grunting,

0:13:22.000 --> 0:13:24.240
<v Speaker 1>the way you do when you're home alone and you

0:13:24.320 --> 0:13:26.480
<v Speaker 1>bang your knee the night before and you're going up

0:13:26.480 --> 0:13:29.320
<v Speaker 1>the stairs and you make a sound, but it's not

0:13:29.400 --> 0:13:31.240
<v Speaker 1>really meant for anyone in particular.

0:13:31.280 --> 0:13:32.720
<v Speaker 2>It's just a noise that you're making.

0:13:33.360 --> 0:13:37.559
<v Speaker 1>So how would we know if animals are actually implementing

0:13:37.920 --> 0:13:43.720
<v Speaker 1>language and communicating meaning to one another? Well, this is

0:13:43.760 --> 0:13:47.360
<v Speaker 1>an unanswered question right now, and it probably differs species

0:13:47.400 --> 0:13:50.640
<v Speaker 1>by species. One thing biologists look at to try to

0:13:50.679 --> 0:13:54.480
<v Speaker 1>address this is things like turn taking. Does one animal

0:13:54.520 --> 0:13:57.120
<v Speaker 1>make some sound and then the other animal goes, and

0:13:57.120 --> 0:13:59.480
<v Speaker 1>then the first one again, and then the second? That

0:13:59.559 --> 0:14:03.640
<v Speaker 1>feels more like language, right? Something is said, there's some response.

0:14:04.040 --> 0:14:07.200
<v Speaker 1>It feels like there's at least the possibility for some

0:14:07.280 --> 0:14:10.880
<v Speaker 1>real meaningful conversations that way. But there are a lot

0:14:10.920 --> 0:14:14.160
<v Speaker 1>of questions here. Even if we found that some species

0:14:14.200 --> 0:14:19.520
<v Speaker 1>were speaking language, would we be able to understand the meaning?

0:14:20.320 --> 0:14:22.840
<v Speaker 1>And I don't mean this in terms of this call

0:14:23.000 --> 0:14:26.200
<v Speaker 1>means this thing, but in terms of what does that

0:14:26.280 --> 0:14:31.080
<v Speaker 1>thing mean for a human? Imagine if a bee is

0:14:31.160 --> 0:14:33.760
<v Speaker 1>talking about some experience that you really have to see

0:14:33.800 --> 0:14:38.160
<v Speaker 1>in ultraviolet to understand, or your dog is experiencing something

0:14:38.200 --> 0:14:42.080
<v Speaker 1>about smell that we couldn't possibly get from our experience,

0:14:42.920 --> 0:14:45.840
<v Speaker 1>or a dolphin is talking about the joy of that

0:14:45.960 --> 0:14:49.000
<v Speaker 1>moment where there's no more little fish, and so the

0:14:49.040 --> 0:14:52.760
<v Speaker 1>whole pod suddenly turns upward and rockets through their world

0:14:53.120 --> 0:14:57.320
<v Speaker 1>and breaks through some surface where everything is different. Might

0:14:57.360 --> 0:15:00.320
<v Speaker 1>it be the case that there's simply no way we

0:15:00.360 --> 0:15:03.480
<v Speaker 1>could totally understand what they mean? There may be things

0:15:03.480 --> 0:15:06.440
<v Speaker 1>we can identify that they are talking about, but I

0:15:06.440 --> 0:15:10.720
<v Speaker 1>think there's a spectrum of how close we would actually

0:15:10.840 --> 0:15:14.120
<v Speaker 1>be in our interpretation. And by the way, I'll just

0:15:14.200 --> 0:15:17.080
<v Speaker 1>note this is true with our fellow humans as well.

0:15:17.440 --> 0:15:21.480
<v Speaker 1>Someone might tell you about an experience with hang gliding

0:15:21.760 --> 0:15:26.040
<v Speaker 1>or stamp collecting, or psychedelic drugs or whatever, and even

0:15:26.080 --> 0:15:28.680
<v Speaker 1>though you nod and you say, oh, I gotcha, I

0:15:28.680 --> 0:15:33.000
<v Speaker 1>can relate that to my own experience. There's a spectrum

0:15:33.080 --> 0:15:37.480
<v Speaker 1>of how closely you are actually capturing what they are describing.

0:15:37.920 --> 0:15:41.920
<v Speaker 1>Sometimes you might have an analogous experience that puts you close,

0:15:41.960 --> 0:15:45.920
<v Speaker 1>and sometimes your assumptions may be pretty distant. So back

0:15:45.960 --> 0:15:50.119
<v Speaker 1>to animals, how much could we even understand in a translation,

0:15:50.320 --> 0:15:55.240
<v Speaker 1>given that animals have such different sensory windows on the world,

0:15:55.600 --> 0:16:00.160
<v Speaker 1>and so their concepts might be very different from ours? Now,

0:16:00.200 --> 0:16:03.720
<v Speaker 1>even with all these caveats, how amazing would it be

0:16:03.760 --> 0:16:08.440
<v Speaker 1>if we could get even a low dimensional, blurry glimpse

0:16:08.680 --> 0:16:12.640
<v Speaker 1>of what they were talking about? People have always wanted

0:16:12.680 --> 0:16:15.800
<v Speaker 1>to understand animals, and they've tried in the past. But

0:16:15.920 --> 0:16:19.720
<v Speaker 1>we are at an amazing moment in history where I

0:16:19.760 --> 0:16:22.840
<v Speaker 1>think the time is right around the corner. And I'm

0:16:22.840 --> 0:16:24.840
<v Speaker 1>not saying this as a general claim. I'm saying this

0:16:24.920 --> 0:16:29.880
<v Speaker 1>for two specific reasons. First, we have incredible technology now

0:16:30.280 --> 0:16:34.600
<v Speaker 1>which makes it possible to do biologging. What is biologging?

0:16:34.840 --> 0:16:39.160
<v Speaker 1>This is about collecting data from an animal with a small,

0:16:39.280 --> 0:16:40.160
<v Speaker 1>lightweight device.

0:16:40.320 --> 0:16:41.680
<v Speaker 2>These are called biologgers.

0:16:42.240 --> 0:16:44.320
<v Speaker 1>You hook these up to an animal and that way

0:16:44.360 --> 0:16:48.360
<v Speaker 1>you can collect data for long windows of time without

0:16:48.520 --> 0:16:53.160
<v Speaker 1>humans being around. You can record sounds, and measure physiology

0:16:53.240 --> 0:16:55.960
<v Speaker 1>and track movements, and this is how you get a

0:16:56.200 --> 0:17:00.640
<v Speaker 1>secret window into an animal's world. And the thing is that

0:17:00.680 --> 0:17:04.320
<v Speaker 1>it gives new, rich data that you just can't get

0:17:04.359 --> 0:17:08.639
<v Speaker 1>otherwise about animals in their natural environment. Now, it's not

0:17:08.680 --> 0:17:12.280
<v Speaker 1>always easy to attach the biologgers to animals, especially

0:17:12.280 --> 0:17:15.560
<v Speaker 1>if they're small and fast. But the main problem is

0:17:15.600 --> 0:17:19.560
<v Speaker 1>that the data collected by the biologgers can be really

0:17:19.680 --> 0:17:24.360
<v Speaker 1>complicated and difficult to analyze. So we humans have amassed

0:17:24.440 --> 0:17:27.760
<v Speaker 1>a ton of rich data that we're sitting on. And

0:17:27.800 --> 0:17:31.000
<v Speaker 1>that leads to the second specific reason why we're just

0:17:31.200 --> 0:17:35.680
<v Speaker 1>on the verge of something amazing, and that's artificial intelligence.

0:17:51.880 --> 0:17:56.040
<v Speaker 1>AI has been used for years to translate and decode

0:17:56.119 --> 0:18:00.000
<v Speaker 1>human languages, and now we have this incredible opportunity to

0:18:00.320 --> 0:18:05.080
<v Speaker 1>leverage it for understanding animal communication. And I'm lucky enough

0:18:05.080 --> 0:18:07.080
<v Speaker 1>to be friends with one of the people at the

0:18:07.240 --> 0:18:10.679
<v Speaker 1>lead of the effort to decrypt these alien languages that

0:18:10.720 --> 0:18:14.720
<v Speaker 1>were surrounded with So how does our modern technology give

0:18:14.840 --> 0:18:18.080
<v Speaker 1>us hope for decoding animal language.

0:18:19.119 --> 0:18:23.080
<v Speaker 3>So the insight originally came in twenty thirteen when I

0:18:23.119 --> 0:18:27.560
<v Speaker 3>was listening actually to NPR and there was a researcher

0:18:27.800 --> 0:18:29.440
<v Speaker 3>describing gelada monkeys.

0:18:29.920 --> 0:18:33.400
<v Speaker 1>That's Aza Raskin. He's a writer and entrepreneur and inventor,

0:18:33.840 --> 0:18:36.879
<v Speaker 1>and for the purposes of today, he founded the Earth

0:18:36.960 --> 0:18:42.280
<v Speaker 1>Species Project, ESP, which is a nonprofit focused on using

0:18:42.440 --> 0:18:45.879
<v Speaker 1>AI to decode non-human communication.

0:18:50.040 --> 0:18:53.119
<v Speaker 3>And these animals are, they're incredible. They live in the

0:18:53.160 --> 0:18:57.639
<v Speaker 3>Ethiopian highlands. They have, like, huge manes, red patches on

0:18:57.680 --> 0:19:01.879
<v Speaker 3>their chests. And what I didn't realize is that, according

0:19:01.920 --> 0:19:04.520
<v Speaker 3>to the researcher, they had one of the largest vocabularies

0:19:04.600 --> 0:19:08.119
<v Speaker 3>of any primate except for us humans. In fact, the

0:19:08.119 --> 0:19:12.480
<v Speaker 3>researchers swear that these animals talk about them behind their back.

0:19:13.280 --> 0:19:15.520
<v Speaker 3>And so the thought back then was like, well, if

0:19:15.560 --> 0:19:17.840
<v Speaker 3>people are just out there. The researchers are just out

0:19:17.840 --> 0:19:22.879
<v Speaker 3>there trying to understand what these beings are saying using

0:19:23.080 --> 0:19:25.800
<v Speaker 3>like a hand recorder and hand transcribing, couldn't there be

0:19:25.840 --> 0:19:28.520
<v Speaker 3>a better way? Couldn't we use, like, machine learning, AI,

0:19:28.960 --> 0:19:32.159
<v Speaker 3>large-scale microphone arrays? But of course, in twenty thirteen,

0:19:33.000 --> 0:19:36.560
<v Speaker 3>that wasn't yet possible because machine learning couldn't do something

0:19:36.880 --> 0:19:40.040
<v Speaker 3>that human beings couldn't already do. They couldn't translate a

0:19:40.119 --> 0:19:45.199
<v Speaker 3>language without a Rosetta Stone, without any examples. And that

0:19:45.280 --> 0:19:49.480
<v Speaker 3>really changed in twenty seventeen when, in the machine learning community,

0:19:49.480 --> 0:19:51.160
<v Speaker 3>there were two papers that came out back to back,

0:19:51.200 --> 0:19:53.480
<v Speaker 3>in the way this thing happens, that showed that you

0:19:53.520 --> 0:19:57.920
<v Speaker 3>could translate between any two human languages without the need

0:19:58.080 --> 0:20:01.680
<v Speaker 3>for examples or Rosetta Stones. And I can dive into

0:20:01.840 --> 0:20:03.800
<v Speaker 3>how that works, but that was the moment that we

0:20:03.880 --> 0:20:07.080
<v Speaker 3>said we should get going. And along the journey, I

0:20:07.119 --> 0:20:09.520
<v Speaker 3>just should say, like, all right, is there even a

0:20:09.600 --> 0:20:13.080
<v Speaker 3>there there? Like we say animal language, What does that mean?

0:20:13.160 --> 0:20:15.680
<v Speaker 3>What is a rich, complex communication structure? What would that

0:20:15.760 --> 0:20:18.000
<v Speaker 3>look like? I just want to give a couple examples

0:20:18.040 --> 0:20:21.880
<v Speaker 3>for your listeners. So off the coast of Norway, every

0:20:21.960 --> 0:20:26.280
<v Speaker 3>year there's a group of false killer whales that all

0:20:26.560 --> 0:20:29.919
<v Speaker 3>phenomenologically speak one way, and a group of dolphins that

0:20:29.960 --> 0:20:33.359
<v Speaker 3>all speak another way, and they come together and they

0:20:33.440 --> 0:20:37.920
<v Speaker 3>hunt in a superpod. And when they do this, they

0:20:37.960 --> 0:20:43.200
<v Speaker 3>speak a third different way, which is just sort of crazy,

0:20:43.280 --> 0:20:45.560
<v Speaker 3>right? And it turns out that whales, you know,

0:20:45.640 --> 0:20:48.679
<v Speaker 3>have a culture extending back thirty-four million years. They

0:20:48.720 --> 0:20:51.399
<v Speaker 3>have dialects that sort of split off, which they can

0:20:51.520 --> 0:20:53.720
<v Speaker 3>understand each other with, and that can split all the

0:20:53.720 --> 0:21:00.560
<v Speaker 3>way into mutually unintelligible languages. Another example is I learned

0:21:00.560 --> 0:21:03.159
<v Speaker 3>this in twenty fourteen that for Campbell's monkeys, hawk for

0:21:03.200 --> 0:21:06.520
<v Speaker 3>them means eagle, crack means leopard, and hawk ooh means

0:21:06.520 --> 0:21:09.359
<v Speaker 3>predator that's up, crack ooh means predator that's down. So

0:21:09.400 --> 0:21:12.119
<v Speaker 3>now we have a simple syntax. And then one of

0:21:12.119 --> 0:21:16.119
<v Speaker 3>my favorite studies is from the University of Hawaii in

0:21:16.280 --> 0:21:19.480
<v Speaker 3>nineteen ninety four, and here they taught dolphins two gestures,

0:21:19.800 --> 0:21:23.199
<v Speaker 3>and the first gesture was do something you've never done before,

0:21:24.000 --> 0:21:25.480
<v Speaker 3>which is sort of a crazy thing to be able

0:21:25.480 --> 0:21:27.680
<v Speaker 3>to communicate, but the dolphins will do it. And to

0:21:27.680 --> 0:21:29.960
<v Speaker 3>remember to do that, that means they have to remember

0:21:30.040 --> 0:21:33.399
<v Speaker 3>every single thing they've done before that session, understand the

0:21:33.400 --> 0:21:36.440
<v Speaker 3>concept of negation, not one of those things, then invent, whole cloth,

0:21:36.560 --> 0:21:39.479
<v Speaker 3>some new thing that they've never done before, but they

0:21:39.480 --> 0:21:41.440
<v Speaker 3>can do it. And then they'll teach the dolphins a

0:21:41.480 --> 0:21:45.399
<v Speaker 3>second gesture, do something you haven't done before together, and

0:21:45.440 --> 0:21:47.760
<v Speaker 3>they'll say, at the same time, do something you haven't

0:21:47.800 --> 0:21:51.280
<v Speaker 3>done before together. The dolphins go down, exchange sonic information,

0:21:51.359 --> 0:21:53.320
<v Speaker 3>come up and do the same trick they've never done

0:21:53.320 --> 0:21:57.240
<v Speaker 3>before at the same time. And while that doesn't prove

0:21:57.320 --> 0:22:01.560
<v Speaker 3>representational language, it certainly places, I think, Occam's razor on

0:22:01.640 --> 0:22:04.159
<v Speaker 3>the other foot. It certainly seems that way. How do

0:22:04.240 --> 0:22:08.520
<v Speaker 3>you know, though, when you're approaching these that there exist languages?

0:22:08.600 --> 0:22:11.600
<v Speaker 3>For example, you said that whale culture or civilization is

0:22:11.600 --> 0:22:14.120
<v Speaker 3>thirty four million years old, But how do we know

0:22:14.200 --> 0:22:16.840
<v Speaker 3>that they're speaking a language that has the kind of

0:22:16.960 --> 0:22:20.560
<v Speaker 3>structure that we have that is capable of passing on

0:22:20.600 --> 0:22:25.680
<v Speaker 3>a culture or civilization. Yeah, great question, And of course

0:22:25.720 --> 0:22:29.439
<v Speaker 3>it's hard if you can't listen in and understand. But

0:22:29.520 --> 0:22:33.119
<v Speaker 3>you know, there are two hallmarks of language that human

0:22:33.160 --> 0:22:36.439
<v Speaker 3>beings have, one of which is to be able to

0:22:36.480 --> 0:22:40.600
<v Speaker 3>talk about something that isn't here and something that isn't now,

0:22:40.760 --> 0:22:42.720
<v Speaker 3>can you refer to things that are in a different time,

0:22:42.720 --> 0:22:46.080
<v Speaker 3>in a different place. And we can actually see already

0:22:46.240 --> 0:22:49.760
<v Speaker 3>from the research that the answer is. It appears to

0:22:49.800 --> 0:22:52.040
<v Speaker 3>be the case that at least some animals can do this.

0:22:52.160 --> 0:22:56.679
<v Speaker 3>So Adriano Lameira is a researcher on great apes, and

0:22:56.720 --> 0:22:59.800
<v Speaker 3>he's discovered in the last year or so that

0:23:00.240 --> 0:23:03.000
<v Speaker 3>orangutans do have a version of a past tense. They

0:23:03.040 --> 0:23:06.000
<v Speaker 3>can talk about things that are not now. And then

0:23:06.200 --> 0:23:10.800
<v Speaker 3>dolphins have names that they call each other by, and

0:23:11.240 --> 0:23:14.280
<v Speaker 3>Ian Yannick in twenty sixteen discovered that they will use

0:23:14.320 --> 0:23:17.120
<v Speaker 3>those names in the third person. They can talk about

0:23:17.880 --> 0:23:20.480
<v Speaker 3>one of their own that is not here. So now

0:23:20.480 --> 0:23:25.680
<v Speaker 3>we have two of the hallmarks not here not now. Now,

0:23:25.920 --> 0:23:28.560
<v Speaker 3>when we say language, we only have one example of

0:23:28.600 --> 0:23:31.800
<v Speaker 3>a species that speaks what we call language, humans, and

0:23:31.880 --> 0:23:33.960
<v Speaker 3>almost always like if you only know one color and

0:23:34.000 --> 0:23:36.480
<v Speaker 3>then you learn a second color, you discover an entire

0:23:36.560 --> 0:23:40.280
<v Speaker 3>rainbow in between. Like when we say language, you know

0:23:40.320 --> 0:23:43.120
<v Speaker 3>we're using that to be a catch all for rich

0:23:43.440 --> 0:23:47.240
<v Speaker 3>communication systems that can pass on cultural information.

0:23:47.720 --> 0:23:48.520
<v Speaker 2>Yeah, agreed.

0:23:49.800 --> 0:23:51.720
<v Speaker 1>Let me jump back to the orangutans for one second.

0:23:51.840 --> 0:23:54.399
<v Speaker 1>Is there any evidence that they have future tense?

0:23:54.880 --> 0:24:00.480
<v Speaker 3>That's a great question. We do not know. That's what's

0:24:00.480 --> 0:24:04.400
<v Speaker 3>so exciting about this field right now, is I think

0:24:04.480 --> 0:24:06.920
<v Speaker 3>of this it's sort of like the invention of the

0:24:06.960 --> 0:24:10.119
<v Speaker 3>Hubble telescope, right, It's like, and if you remember, I

0:24:10.119 --> 0:24:14.199
<v Speaker 3>think it was back in nineteen ninety five they pointed

0:24:14.200 --> 0:24:16.720
<v Speaker 3>the Hubble telescope at an empty patch of sky and

0:24:16.760 --> 0:24:19.200
<v Speaker 3>what they discovered was the most galaxies that have ever

0:24:19.240 --> 0:24:23.840
<v Speaker 3>existed in one spot. That's essentially what we're discovering here

0:24:23.960 --> 0:24:26.920
<v Speaker 3>is that we just haven't had the tools to look

0:24:27.480 --> 0:24:31.200
<v Speaker 3>and when we do, what we're discovering is much more

0:24:31.200 --> 0:24:33.159
<v Speaker 3>than everything we discovered. What we're discovering is everything.

0:24:33.280 --> 0:24:36.879
<v Speaker 1>Yeah, exactly right, the deep field experiment. So how do

0:24:36.960 --> 0:24:40.080
<v Speaker 1>we actually do it? How do we apply all the

0:24:40.119 --> 0:24:43.080
<v Speaker 1>modern tools of science to see if we can decode

0:24:43.119 --> 0:24:43.879
<v Speaker 1>a language?

0:24:44.440 --> 0:24:44.720
<v Speaker 2>Yeah?

0:24:44.800 --> 0:24:47.560
<v Speaker 3>Great question. So I'm going to start with this twenty

0:24:47.600 --> 0:24:50.600
<v Speaker 3>seventeen technology, and I just want your audience to remember

0:24:50.720 --> 0:24:54.000
<v Speaker 3>that twenty seventeen is essentially the Stone Age in AI.

0:24:55.080 --> 0:24:57.760
<v Speaker 3>But I think it's a really useful conceptual tool to

0:24:57.840 --> 0:25:00.720
<v Speaker 3>understand how it might work. So, how do you translate

0:25:00.800 --> 0:25:04.960
<v Speaker 3>between languages that don't have Rosetta stones. And it turns

0:25:04.960 --> 0:25:06.960
<v Speaker 3>out what you can ask AI to do is build

0:25:07.040 --> 0:25:09.920
<v Speaker 3>a shape that represents a language. So you say, feed

0:25:09.960 --> 0:25:12.520
<v Speaker 3>in all of Wikipedia, a whole bunch of text, and

0:25:12.560 --> 0:25:16.560
<v Speaker 3>the AI generates a shape that represents a language. Imagine

0:25:16.600 --> 0:25:20.800
<v Speaker 3>a galaxy where every star is a word, and words

0:25:20.840 --> 0:25:23.719
<v Speaker 3>that mean similar things are placed near each other, and

0:25:23.760 --> 0:25:28.520
<v Speaker 3>then words that share a sort of conceptual relationship get

0:25:28.560 --> 0:25:31.360
<v Speaker 3>turned into sharing a geometric relationship. What does that mean?

0:25:31.560 --> 0:25:35.000
<v Speaker 3>That means if you imagine king is to man as

0:25:35.040 --> 0:25:37.800
<v Speaker 3>woman is to queen, then in this shape, king is

0:25:37.840 --> 0:25:42.119
<v Speaker 3>the same distance and direction to man as woman is to queen.

0:25:42.160 --> 0:25:44.119
<v Speaker 3>And so you actually just subtract king minus man. That

0:25:44.200 --> 0:25:47.080
<v Speaker 3>gives you a distance and direction. You add that to

0:25:47.320 --> 0:25:49.600
<v Speaker 3>boy and that'll equal prince. You add that to girl

0:25:49.640 --> 0:25:52.320
<v Speaker 3>equal princess. You add that to woman and equal queen.
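
NOTE
Editor's sketch of the vector arithmetic described in the cues above, assuming Python with NumPy.
The three-dimensional vectors are hand-made toys (axes: royalty, female, young), not learned embeddings,
and the names vocab, royalty and nearest are hypothetical helpers chosen for illustration only.
```python
import numpy as np
# Hand-made toy vectors (not learned). Axes: [royalty, female, young].
vocab = {
    "man": np.array([0.0, 0.0, 0.0]),
    "woman": np.array([0.0, 1.0, 0.0]),
    "boy": np.array([0.0, 0.0, 1.0]),
    "girl": np.array([0.0, 1.0, 1.0]),
    "king": np.array([1.0, 0.0, 0.0]),
    "queen": np.array([1.0, 1.0, 0.0]),
    "prince": np.array([1.0, 0.0, 1.0]),
    "princess": np.array([1.0, 1.0, 1.0]),
}
# "King minus man": the distance and direction that the speaker calls a relationship.
royalty = vocab["king"] - vocab["man"]
def nearest(vec, exclude):
    """Return the vocabulary word closest to vec, skipping the words in exclude."""
    candidates = [w for w in vocab if w not in exclude]
    return min(candidates, key=lambda w: np.linalg.norm(vocab[w] - vec))
# Add the same offset to boy, girl and woman and look up the closest word.
results = {w: nearest(vocab[w] + royalty, {w}) for w in ("boy", "girl", "woman")}
print(results)
```
With these toy vectors the lookups come out as boy to prince, girl to princess and woman to queen.
Learned embeddings only approximate this behaviour, so a nearest-neighbour search is what real systems rely on.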

0:25:53.800 --> 0:25:56.520
<v Speaker 3>And so if you think about all of the relationships,

0:25:56.560 --> 0:25:58.840
<v Speaker 3>the internal relationships of a language, think about the

0:25:58.840 --> 0:26:01.520
<v Speaker 3>word dog. Dog has relationship to man and to howl

0:26:01.680 --> 0:26:04.920
<v Speaker 3>and to wolf and to fur. It sort

0:26:04.920 --> 0:26:06.920
<v Speaker 3>of fixes in a point in space, and if you

0:26:07.640 --> 0:26:11.440
<v Speaker 3>solve this massive multi dimensional Sudoku puzzle of how every

0:26:11.560 --> 0:26:14.080
<v Speaker 3>concept relates to every other concept that gets turned into

0:26:14.119 --> 0:26:18.200
<v Speaker 3>a geometry, and out pops a rigid structure that represents

0:26:18.200 --> 0:26:20.439
<v Speaker 3>a language. Now the computer doesn't know what anything means.

0:26:20.760 --> 0:26:22.600
<v Speaker 3>It just knows how they all relate to each other.

0:26:22.680 --> 0:26:26.840
<v Speaker 3>The shape represents all of the internal relationships of a language,

0:26:26.840 --> 0:26:29.359
<v Speaker 3>which is of course just a model of the world.

0:26:29.600 --> 0:26:32.000
<v Speaker 3>All right, So you have this shape for English, and

0:26:32.119 --> 0:26:34.360
<v Speaker 3>this is what the machine learners asked in twenty seventeen.

0:26:35.480 --> 0:26:38.760
<v Speaker 3>Is it possible? They said that the shape which is

0:26:38.800 --> 0:26:41.399
<v Speaker 3>English might be similar to or the same as, the

0:26:41.440 --> 0:26:45.000
<v Speaker 3>shape which is German. And if you ask anthropologists, they'd

0:26:45.000 --> 0:26:47.719
<v Speaker 3>be like, no, that's a silly thing to think, like.

0:26:47.840 --> 0:26:50.680
<v Speaker 3>They have different ways of viewing the world, different cosmologies.

0:26:50.880 --> 0:26:52.800
<v Speaker 3>But the machine is like, whatever, let's give it a try.

0:26:53.359 --> 0:26:55.400
<v Speaker 3>And it turns out that it works. You can take

0:26:55.440 --> 0:26:58.359
<v Speaker 3>the shape which is English and the shape which is German,

0:26:58.520 --> 0:27:01.320
<v Speaker 3>and literally rotate one shape on top of the other.
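
NOTE
Editor's sketch of "rotating one shape on top of the other", assuming Python with NumPy.
It uses the orthogonal Procrustes solution on synthetic data where the word pairing is known in advance.
That is a simplification: the twenty seventeen work the speaker refers to found the rotation without paired examples.
All names (english, german, true_rotation, w) are hypothetical stand-ins, and the "languages" are random points.
```python
import numpy as np
rng = np.random.default_rng(0)
n_words, dim = 50, 5
english = rng.normal(size=(n_words, dim))  # toy "English" shape, one row per word
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # hidden orthogonal map
german = english @ true_rotation + 0.01 * rng.normal(size=(n_words, dim))  # same shape, rotated, slightly noisy
# Orthogonal Procrustes: the orthogonal w minimising ||english @ w - german|| is u @ vt,
# where u and vt come from the SVD of english.T @ german.
u, _, vt = np.linalg.svd(english.T @ german)
w = u @ vt
mapped = english @ w  # the English shape laid on top of the German one
# "Translate" each word by finding its nearest neighbour in the other space.
dists = np.linalg.norm(mapped[:, None, :] - german[None, :, :], axis=2)
matches = dists.argmin(axis=1)
accuracy = (matches == np.arange(n_words)).mean()
print(accuracy)
```
The design point is that w is constrained to be orthogonal, so distances and angles inside the shape are preserved.
Only the orientation changes, which is what "rotate" means in the cue above.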

0:27:01.480 --> 0:27:03.119
<v Speaker 3>And even though there are words in one language that

0:27:03.119 --> 0:27:04.919
<v Speaker 3>don't appear in the other, if you blur your eyes,

0:27:05.760 --> 0:27:09.440
<v Speaker 3>the shapes are roughly the same, and the point, which

0:27:09.480 --> 0:27:12.120
<v Speaker 3>is dog ends up in the same spot in both. Now

0:27:12.160 --> 0:27:14.000
<v Speaker 3>you might be saying okay, but that's because English and

0:27:14.040 --> 0:27:18.159
<v Speaker 3>German are very similar languages. But it turns out this

0:27:18.200 --> 0:27:22.240
<v Speaker 3>works for Finnish, which is a really weird language, Turkish, Aramaic, Urdu.

0:27:22.920 --> 0:27:26.320
<v Speaker 3>Pretty much every human language fits in a kind of

0:27:26.760 --> 0:27:31.240
<v Speaker 3>universal human meaning shape, and the point, which is dog

0:27:31.440 --> 0:27:33.480
<v Speaker 3>ends up in the same spot in all of them,

0:27:34.400 --> 0:27:37.439
<v Speaker 3>and this lets you do translation without the need for

0:27:37.480 --> 0:27:41.600
<v Speaker 3>any examples. And this is I think, such a beautiful,

0:27:42.119 --> 0:27:48.080
<v Speaker 3>profound realization that there is a hidden structure underlying all

0:27:48.119 --> 0:27:52.400
<v Speaker 3>of us that unites our way of seeing. So that

0:27:52.520 --> 0:27:55.920
<v Speaker 3>was the sort of the core insight that said, well,

0:27:56.000 --> 0:27:59.480
<v Speaker 3>maybe now it's time to start building that shape for

0:27:59.840 --> 0:28:02.640
<v Speaker 3>animal communication, which by the way, is very hard because

0:28:02.680 --> 0:28:05.879
<v Speaker 3>it takes denoising and working with many partners to collect

0:28:05.920 --> 0:28:09.400
<v Speaker 3>the years' worth of data that's required. But that's

0:28:09.400 --> 0:28:12.440
<v Speaker 3>sort of what we started to do. Now I'll pause

0:28:12.440 --> 0:28:14.040
<v Speaker 3>for a second, but there are a couple other techniques

0:28:14.080 --> 0:28:15.160
<v Speaker 3>that can add to the top of this.

0:28:15.560 --> 0:28:17.280
<v Speaker 1>Great so let me jump in for one second. So

0:28:17.320 --> 0:28:20.200
<v Speaker 1>the fact that all the human languages have a similar

0:28:20.320 --> 0:28:25.240
<v Speaker 1>structure to them is in part because humans radiated out

0:28:25.280 --> 0:28:29.679
<v Speaker 1>of Africa sort of yesterday and as a result, you know,

0:28:29.720 --> 0:28:33.600
<v Speaker 1>we all have the same brain and it's not so surprising.

0:28:33.640 --> 0:28:37.240
<v Speaker 1>And the question is what do we expect when we're

0:28:37.240 --> 0:28:39.960
<v Speaker 1>looking at animal languages, which I'll come back to some

0:28:40.000 --> 0:28:41.680
<v Speaker 1>more questions on that a second, But what do we

0:28:41.760 --> 0:28:45.560
<v Speaker 1>expect in terms of the similarity there given that animals

0:28:45.560 --> 0:28:48.280
<v Speaker 1>are picking up on different signals from the world. Their

0:28:48.440 --> 0:28:51.960
<v Speaker 1>umwelt is different, the signals they can get, and their

0:28:52.000 --> 0:28:53.320
<v Speaker 1>concepts might be very different.

0:28:53.640 --> 0:28:55.120
<v Speaker 2>How do you think about that?

0:28:55.200 --> 0:28:57.720
<v Speaker 3>This is a great question. It's just to repeat what

0:28:57.760 --> 0:29:01.400
<v Speaker 3>you're saying, is that the sensorium, the way that animals

0:29:01.560 --> 0:29:03.600
<v Speaker 3>perceive the world, like what it is like to be

0:29:03.640 --> 0:29:06.360
<v Speaker 3>a bat, may be so completely different than what it

0:29:06.440 --> 0:29:08.400
<v Speaker 3>is like to be a human because they're seeing in

0:29:08.520 --> 0:29:11.880
<v Speaker 3>three D sound that we can never translate anything. And

0:29:11.880 --> 0:29:16.720
<v Speaker 3>that may turn out to be the case, but you know,

0:29:16.760 --> 0:29:19.400
<v Speaker 3>I think there's reason to believe that there may be

0:29:19.480 --> 0:29:22.880
<v Speaker 3>some kind of overlap with our experience. And to just

0:29:22.920 --> 0:29:29.000
<v Speaker 3>give a couple examples. You know, lemurs, for example, are

0:29:29.040 --> 0:29:33.320
<v Speaker 3>known to bite down on centipedes, literally to take a

0:29:33.400 --> 0:29:36.479
<v Speaker 3>hit off of centipedes to get high. They enter this

0:29:36.600 --> 0:29:39.960
<v Speaker 3>very trance like state, they get super cuddly. It looks

0:29:40.000 --> 0:29:43.400
<v Speaker 3>sort of like a scene from Burning Man. Dolphins too,

0:29:43.640 --> 0:29:48.280
<v Speaker 3>are known to intentionally inflate pufferfish to get high after

0:29:48.320 --> 0:29:51.400
<v Speaker 3>their venom and then pass them around literally puff pass.

0:29:52.320 --> 0:29:54.640
<v Speaker 3>Great apes are known to like hang off of vines

0:29:54.680 --> 0:29:58.280
<v Speaker 3>and spin to get dizzy. There is something about a

0:29:58.400 --> 0:30:03.200
<v Speaker 3>transcendent state of consciousness, altering our state, that is at

0:30:03.240 --> 0:30:07.320
<v Speaker 3>least shared amongst the mammals, and so if they're communicating,

0:30:07.320 --> 0:30:10.640
<v Speaker 3>they may well communicate about that, and that's something we'd share.

0:30:10.720 --> 0:30:14.320
<v Speaker 3>Another example is something known as the mirror test. This

0:30:14.480 --> 0:30:18.400
<v Speaker 3>is a test where you take an animal, you paint

0:30:18.400 --> 0:30:20.320
<v Speaker 3>a dot on them where they can't see it. You

0:30:20.360 --> 0:30:22.800
<v Speaker 3>give them a mirror. They look in the mirror, they

0:30:23.200 --> 0:30:25.520
<v Speaker 3>see the dot, and they turn to the dot and

0:30:25.560 --> 0:30:28.320
<v Speaker 3>they try to brush it off of themselves or investigate it.

0:30:28.840 --> 0:30:30.719
<v Speaker 3>And in order for an animal to do that, they

0:30:30.760 --> 0:30:33.880
<v Speaker 3>have to associate the image that's in the mirror with themselves.

0:30:33.920 --> 0:30:35.120
<v Speaker 3>They have to look in the mirror and say like

0:30:35.200 --> 0:30:39.200
<v Speaker 3>that's me. So that means there's a rich sense of interiority,

0:30:39.640 --> 0:30:44.560
<v Speaker 3>like a self awareness. Dolphins pass this test, elephants pass

0:30:44.640 --> 0:30:47.080
<v Speaker 3>the test. A number of other species pass this test.

0:30:47.280 --> 0:30:51.640
<v Speaker 3>So even the concept so profound as me self awareness

0:30:51.760 --> 0:30:55.479
<v Speaker 3>that seems to be shared. You know, examples of people

0:30:56.400 --> 0:31:01.640
<v Speaker 3>showing orangutans magic tricks and they go crazy. It's worth

0:31:01.720 --> 0:31:03.760
<v Speaker 3>just looking them up on YouTube to see these kinds

0:31:03.760 --> 0:31:08.400
<v Speaker 3>of videos. Pilot whales carry their dead young for three

0:31:08.560 --> 0:31:12.680
<v Speaker 3>four weeks, like grief is a shared part of the experience.

0:31:12.720 --> 0:31:15.600
<v Speaker 3>So if you imagine these shapes, where one of the

0:31:15.640 --> 0:31:17.680
<v Speaker 3>shapes is like human language and one of these is

0:31:17.760 --> 0:31:20.600
<v Speaker 3>animal communication, I think we should expect to see some

0:31:20.720 --> 0:31:23.440
<v Speaker 3>part of those shapes overlap, and that should be the

0:31:23.480 --> 0:31:27.000
<v Speaker 3>part where we can do direct translation. But then there's going

0:31:27.040 --> 0:31:28.960
<v Speaker 3>to be a huge portion of the shape that can

0:31:29.000 --> 0:31:31.640
<v Speaker 3>never be directly translated to human experience, and you'd sort

0:31:31.640 --> 0:31:33.760
<v Speaker 3>of expect that to be sticking out, like where we

0:31:33.760 --> 0:31:36.360
<v Speaker 3>can see complexity, but we don't know how to translate it.

0:31:36.760 --> 0:31:38.200
<v Speaker 3>And I still don't know which one of these two

0:31:38.280 --> 0:31:39.920
<v Speaker 3>is going to be more fascinating, the part where we

0:31:39.960 --> 0:31:43.280
<v Speaker 3>can directly translate or the part we can't, because, as I'm saying,

0:31:43.360 --> 0:31:46.800
<v Speaker 3>human beings have been communicating vocally for one hundred thousand

0:31:46.840 --> 0:31:49.720
<v Speaker 3>to three hundred thousand years, passing down culture. Whales and

0:31:49.760 --> 0:31:53.440
<v Speaker 3>dolphins have been doing this for thirty four million years,

0:31:53.480 --> 0:31:56.320
<v Speaker 3>and that which is oldest correlates with that which is wisest.

0:31:56.760 --> 0:32:00.400
<v Speaker 3>So for something to survive thirty four million years, there

0:32:00.440 --> 0:32:03.480
<v Speaker 3>has to be some deep kernel of adaptive truth in there.

0:32:04.040 --> 0:32:07.920
<v Speaker 3>And whatever it is that is the solution to humanity's problems, like,

0:32:07.960 --> 0:32:10.280
<v Speaker 3>it's not in our imagination, because if it is, we'd

0:32:10.280 --> 0:32:12.640
<v Speaker 3>probably be trying to do it. So this is a

0:32:12.680 --> 0:32:15.880
<v Speaker 3>way of starting to get the first polaroid sort of

0:32:15.920 --> 0:32:19.840
<v Speaker 3>blurry image pictures of that which is beyond our imagination.

0:32:20.920 --> 0:32:22.960
<v Speaker 2>Now, let me ask you this. If we're just looking

0:32:23.000 --> 0:32:25.280
<v Speaker 2>at the auditory.

0:32:24.760 --> 0:32:27.320
<v Speaker 1>Information that we get from animals, we can do this

0:32:27.400 --> 0:32:30.520
<v Speaker 1>kind of technique where we're looking to match one galaxy

0:32:30.560 --> 0:32:32.920
<v Speaker 1>of stars to the other galaxy of stars and see

0:32:32.920 --> 0:32:35.360
<v Speaker 1>what parts are sticking out and so on. But we

0:32:35.400 --> 0:32:38.600
<v Speaker 1>may well need much more than just the audio right

0:32:39.600 --> 0:32:42.800
<v Speaker 1>to understand the context of what the animal is saying

0:32:42.840 --> 0:32:47.560
<v Speaker 1>in a particular situation. So how are people pursuing that?

0:32:47.560 --> 0:32:51.440
<v Speaker 3>That's a great question. Even for humans, we know that

0:32:51.600 --> 0:32:53.840
<v Speaker 3>so much of the information that we convey. If you've

0:32:53.840 --> 0:32:56.720
<v Speaker 3>ever had to try to order food in a country

0:32:56.760 --> 0:32:58.840
<v Speaker 3>where you don't speak the language and somehow you can

0:32:58.880 --> 0:33:03.560
<v Speaker 3>do it, you can have communication without words the same

0:33:03.560 --> 0:33:08.080
<v Speaker 3>thing may be true for animals. So in fact, chimpanzees

0:33:08.120 --> 0:33:10.640
<v Speaker 3>are known to have sixty plus hand and feet gestures,

0:33:10.640 --> 0:33:12.560
<v Speaker 3>which seems to be at least as far as we know,

0:33:12.600 --> 0:33:16.120
<v Speaker 3>their predominant form of more symbolic communication.

0:33:16.560 --> 0:33:19.320
<v Speaker 1>Plus, we have indefinite references to things all the time, right,

0:33:19.400 --> 0:33:22.520
<v Speaker 1>So when I say she, I might be talking about

0:33:22.560 --> 0:33:25.200
<v Speaker 1>Marie Curie, or I might be talking about Michelle Obama

0:33:25.680 --> 0:33:28.000
<v Speaker 1>or something. But once I've introduced who I'm talking about

0:33:28.000 --> 0:33:30.240
<v Speaker 1>I can just use the word she. But somebody trying

0:33:30.280 --> 0:33:34.200
<v Speaker 1>to decode what I'm saying who doesn't speak English might

0:33:34.240 --> 0:33:36.320
<v Speaker 1>have a hard time understanding what the reference is to.

0:33:37.080 --> 0:33:40.920
<v Speaker 3>Yeah, that's exactly right, and what you're speaking to is

0:33:41.080 --> 0:33:45.280
<v Speaker 3>the importance of context. In order to understand what someone

0:33:45.560 --> 0:33:48.120
<v Speaker 3>is saying or what an animal is meaning, we have

0:33:48.160 --> 0:33:51.120
<v Speaker 3>to understand the context. Otherwise, like the same grunt may

0:33:51.160 --> 0:33:53.840
<v Speaker 3>mean like that monkey, or it may mean like I'm upset,

0:33:54.000 --> 0:33:57.200
<v Speaker 3>and it all sort of depends on social context. So

0:33:57.400 --> 0:34:00.000
<v Speaker 3>a lot of what we do now is multimodal.

0:34:00.360 --> 0:34:05.880
<v Speaker 3>That is to say, we work with biologists that have

0:34:06.640 --> 0:34:12.600
<v Speaker 3>tags on animals, that record often video, audio, and motion,

0:34:13.400 --> 0:34:16.319
<v Speaker 3>and that lets us begin to translate between all of

0:34:16.360 --> 0:34:19.000
<v Speaker 3>these different modalities, and in fact, with some of the

0:34:19.040 --> 0:34:22.600
<v Speaker 3>species we work with, these tags are on multiple animals

0:34:22.800 --> 0:34:26.440
<v Speaker 3>in the same group, so we can get social context.

0:34:26.760 --> 0:34:30.000
<v Speaker 3>And I want to return for a second to this

0:34:30.040 --> 0:34:33.080
<v Speaker 3>really interesting question you pose. You're like, well, maybe all

0:34:33.200 --> 0:34:36.000
<v Speaker 3>human languages fit in the same shape because we share

0:34:36.000 --> 0:34:39.840
<v Speaker 3>the same physical substrate, the same brains, in the same ears,

0:34:39.840 --> 0:34:43.399
<v Speaker 3>and the same eyes. But there's something deeper going on

0:34:43.520 --> 0:34:47.400
<v Speaker 3>in machine learning than just the ability to match the

0:34:47.440 --> 0:34:51.880
<v Speaker 3>shapes of languages. Maybe your audience has heard of or

0:34:51.960 --> 0:34:56.279
<v Speaker 3>seen DALL-E or Midjourney or image diffusion where you

0:34:56.600 --> 0:35:00.359
<v Speaker 3>type in text and out comes an image that has never

0:35:00.400 --> 0:35:05.160
<v Speaker 3>been seen before. How does that work? Well, these shapes

0:35:05.160 --> 0:35:06.960
<v Speaker 3>are actually really helpful to have in your mind to

0:35:07.000 --> 0:35:10.120
<v Speaker 3>understand how it works. So let's build now a shape

0:35:10.640 --> 0:35:13.640
<v Speaker 3>on human faces, and once again you end up with

0:35:13.719 --> 0:35:16.319
<v Speaker 3>a galaxy where every star now isn't a word, but

0:35:16.440 --> 0:35:21.399
<v Speaker 3>is a human face. Faces that share similar relationships share

0:35:21.480 --> 0:35:24.840
<v Speaker 3>geometric relationships. So if I take a picture of your face, David,

0:35:25.719 --> 0:35:27.680
<v Speaker 3>and then I take a picture of your face that's smiling,

0:35:28.040 --> 0:35:30.640
<v Speaker 3>there's this distance and direction that takes me between your

0:35:30.680 --> 0:35:32.839
<v Speaker 3>face and your face that's smiling. I subtract those two

0:35:32.840 --> 0:35:35.719
<v Speaker 3>and I get smilingness as a relationship. I can now

0:35:35.840 --> 0:35:39.160
<v Speaker 3>add that to any other face in the shape and

0:35:39.160 --> 0:35:40.840
<v Speaker 3>I'll get the smiling version of that face.
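
NOTE
Editor's sketch of the "smilingness" direction, assuming Python with NumPy and synthetic vectors in place of real face embeddings.
The speaker subtracts a single pair of pictures; this sketch averages over many pairs, a common way to reduce noise.
The names true_smile, neutral, smiling and smile_direction are hypothetical, and no face model is involved.
```python
import numpy as np
rng = np.random.default_rng(2)
dim, n_people = 16, 40
true_smile = rng.normal(size=dim)  # hidden "smiling" direction in the toy face space
neutral = rng.normal(size=(n_people, dim))  # stand-ins for embeddings of neutral faces
smiling = neutral + true_smile + 0.05 * rng.normal(size=(n_people, dim))  # the same faces, smiling
# Subtract each neutral face from its smiling version, then average the differences.
smile_direction = (smiling - neutral).mean(axis=0)
new_face = rng.normal(size=dim)  # a face that was not used to estimate the direction
new_face_smiling = new_face + smile_direction  # predicted embedding of that face smiling
# Check how well the estimated direction lines up with the hidden one.
cosine = smile_direction @ true_smile / (np.linalg.norm(smile_direction) * np.linalg.norm(true_smile))
print(cosine)
```
In a real image model the last step would be to decode new_face_smiling back into pixels, which this sketch leaves out.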

0:35:41.200 --> 0:35:41.359
<v Speaker 2>Right.

0:35:41.400 --> 0:35:43.719
<v Speaker 3>So now there's a direction that represents smiling, there's a

0:35:43.719 --> 0:35:48.280
<v Speaker 3>direction that represents frowning, that represents age, that represents gender,

0:35:48.360 --> 0:35:51.600
<v Speaker 3>more male, more female. You end up with a map

0:35:51.640 --> 0:35:55.120
<v Speaker 3>of all the semantic relationships, and you can now do

0:35:55.160 --> 0:35:57.200
<v Speaker 3>that not just for faces, you can do that for

0:35:57.320 --> 0:35:59.759
<v Speaker 3>all of images. And now you have a shape that

0:35:59.800 --> 0:36:02.520
<v Speaker 3>represents images, a shape that represents languages. You look

0:36:02.520 --> 0:36:05.120
<v Speaker 3>at image caption pairs on the Internet and you can

0:36:05.480 --> 0:36:09.239
<v Speaker 3>align these two shapes. So now you have a way

0:36:09.239 --> 0:36:14.080
<v Speaker 3>of translating between text, language and images. So now you

0:36:14.160 --> 0:36:17.120
<v Speaker 3>just type in something like image I don't know, like

0:36:17.160 --> 0:36:22.080
<v Speaker 3>portrait of the country Chile as a woman. It goes

0:36:22.120 --> 0:36:25.359
<v Speaker 3>into the language space, gets translated to the image space,

0:36:25.440 --> 0:36:27.719
<v Speaker 3>the computer generates the image that's there, and you get that.
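
NOTE
Editor's sketch of aligning a text shape with an image shape from paired examples, assuming Python with NumPy.
It fits a plain least-squares linear map on synthetic caption and image vectors, then retrieves the nearest image.
This is a toy stand-in: production text-to-image systems use learned encoders and a generative model.
The names text_vecs, image_vecs, hidden_map and text_to_image are hypothetical.
```python
import numpy as np
rng = np.random.default_rng(1)
n_pairs, text_dim, image_dim = 200, 8, 6
text_vecs = rng.normal(size=(n_pairs, text_dim))  # stand-ins for caption embeddings
hidden_map = rng.normal(size=(text_dim, image_dim))  # unknown relation between the two spaces
image_vecs = text_vecs @ hidden_map + 0.01 * rng.normal(size=(n_pairs, image_dim))  # stand-ins for image embeddings
train, held_out = slice(0, 150), slice(150, 200)
# Fit a linear map from text space to image space using only the paired training examples.
text_to_image, *_ = np.linalg.lstsq(text_vecs[train], image_vecs[train], rcond=None)
# Send unseen captions into image space and pick the closest held-out image for each.
predicted = text_vecs[held_out] @ text_to_image
dists = np.linalg.norm(predicted[:, None, :] - image_vecs[held_out][None, :, :], axis=2)
retrieval_accuracy = (dists.argmin(axis=1) == np.arange(50)).mean()
print(retrieval_accuracy)
```
The sketch only retrieves an existing image. Generating an image that has never been seen before needs a decoder on top.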

0:36:27.760 --> 0:36:31.720
<v Speaker 3>So that's how that technology works. There's something really deep

0:36:31.800 --> 0:36:34.960
<v Speaker 3>actually happening because it's not just working on language. It

0:36:34.960 --> 0:36:39.239
<v Speaker 3>seems to work on almost any modality out there. And

0:36:39.280 --> 0:36:42.960
<v Speaker 3>I think just like there is the unreasonable effectiveness of

0:36:43.239 --> 0:36:46.880
<v Speaker 3>mathematics where it seems very strange. You go out on

0:36:46.960 --> 0:36:49.640
<v Speaker 3>some branch of abstract mathematics, it seems like it has

0:36:49.719 --> 0:36:51.319
<v Speaker 3>nothing to do with the world, and then it has

0:36:51.719 --> 0:36:54.240
<v Speaker 3>something profound to say about the world. You invent complex

0:36:54.280 --> 0:36:57.080
<v Speaker 3>numbers somehow that describes everything you're going to need to

0:36:57.480 --> 0:37:02.120
<v Speaker 3>deal with electricity. Who knew? That's going on in deep learning,

0:37:02.160 --> 0:37:04.680
<v Speaker 3>where there's an unreasonable effectiveness of deep learning, where the

0:37:04.800 --> 0:37:10.840
<v Speaker 3>same techniques are working across every modality from DNA to

0:37:11.040 --> 0:37:16.000
<v Speaker 3>fMRIs to language to audio to video to images to

0:37:16.840 --> 0:37:20.200
<v Speaker 3>computer code. There's nothing that says that had to have worked,

0:37:20.239 --> 0:37:23.360
<v Speaker 3>and yet it seems to be working. So we're learning

0:37:23.400 --> 0:37:26.880
<v Speaker 3>something I think profound about the structure of our universe.

0:37:27.320 --> 0:37:30.160
<v Speaker 3>But what that means for us specifically is that that

0:37:30.200 --> 0:37:32.800
<v Speaker 3>means we can build these kinds of shapes and embed

0:37:32.960 --> 0:37:37.319
<v Speaker 3>and translate between how an animal behaves and how it

0:37:37.480 --> 0:37:39.560
<v Speaker 3>sounds and what its body pose is. So we can

0:37:39.600 --> 0:37:42.520
<v Speaker 3>say we're not quite there yet, but we're moving towards it.

0:37:42.640 --> 0:37:45.799
<v Speaker 3>Generate me the audio of two elephants coming together, and

0:37:46.239 --> 0:37:48.680
<v Speaker 3>that's going to give you a distribution of calls, some of which

0:37:48.760 --> 0:37:50.839
<v Speaker 3>might mean like hello, some of which might mean, this

0:37:50.920 --> 0:37:52.799
<v Speaker 3>is my name, some of which might mean I've missed you,

0:37:52.800 --> 0:37:54.400
<v Speaker 3>you don't know, but it has something to do with affiliation.

0:37:54.480 --> 0:37:56.360
<v Speaker 3>Then you say, okay, now generate me the audio of

0:37:56.400 --> 0:37:58.480
<v Speaker 3>two elephants coming together, but where one of them's flapping

0:37:58.480 --> 0:38:00.839
<v Speaker 3>its ears and the other one's running quickly. What kind

0:38:00.840 --> 0:38:02.759
<v Speaker 3>of sounds does that make? And you can see that

0:38:02.840 --> 0:38:07.480
<v Speaker 3>this becomes a laboratory that lets you very quickly iterate

0:38:07.560 --> 0:38:10.719
<v Speaker 3>to understand what animals are saying. When you get to

0:38:10.960 --> 0:38:14.000
<v Speaker 3>do this in combination with the incredible biologists that are

0:38:14.000 --> 0:38:15.719
<v Speaker 3>out in the field and have already built up a

0:38:15.760 --> 0:38:20.040
<v Speaker 3>lot of that context from time, blood, sweat, and tears.

0:38:20.160 --> 0:38:23.960
<v Speaker 1>Excellent, and I think the biologging is becoming more sophisticated,

0:38:24.000 --> 0:38:28.120
<v Speaker 1>even right where they're looking at temperature and weather patterns

0:38:28.160 --> 0:38:31.160
<v Speaker 1>and gyroscopes and accelerometers and so on, where you get

0:38:31.239 --> 0:38:35.000
<v Speaker 1>all of this data from the animals, and in theory,

0:38:35.000 --> 0:38:37.560
<v Speaker 1>there's no limit to how much we can biolog, as

0:38:37.600 --> 0:38:39.640
<v Speaker 1>long as we can make it small and portable and

0:38:39.719 --> 0:38:43.560
<v Speaker 1>get it on the animals, and then we're able to discover, hey,

0:38:44.239 --> 0:38:47.520
<v Speaker 1>these are the contextual cues that the animal is responding

0:38:47.560 --> 0:38:50.160
<v Speaker 1>to with the language.

0:38:50.840 --> 0:38:53.760
<v Speaker 3>Yeah, that's exactly right. And in fact, a big shift

0:38:53.800 --> 0:38:58.560
<v Speaker 3>that's happened in biology and conservation and ethology in the

0:38:58.640 --> 0:39:04.799
<v Speaker 3>last five years is because of cell phones driving the

0:39:04.840 --> 0:39:08.439
<v Speaker 3>cost of sensors lower. Biologists have gone from a world

0:39:08.480 --> 0:39:11.760
<v Speaker 3>where they're often data starved to where they're data drowned,

0:39:11.880 --> 0:39:15.880
<v Speaker 3>where they have access to terabytes of data, but they

0:39:15.880 --> 0:39:19.799
<v Speaker 3>don't yet have the tools to understand them. And so

0:39:20.960 --> 0:39:25.840
<v Speaker 3>our goal is to decode non human communication, translate animal language,

0:39:26.040 --> 0:39:29.399
<v Speaker 3>and use that to transform our relationship with the rest

0:39:29.440 --> 0:39:31.560
<v Speaker 3>of nature. But it's sort of like you're trying to

0:39:31.560 --> 0:39:33.320
<v Speaker 3>go to the moon, and along the way you invent Velcro.

0:39:33.880 --> 0:39:38.640
<v Speaker 3>We're building the foundational tools that every biologist needs to

0:39:39.440 --> 0:39:42.919
<v Speaker 3>understand the data that they have now. And our hope

0:39:43.040 --> 0:39:47.080
<v Speaker 3>is that by building those foundational tools, we're nonprofit, we're

0:39:47.160 --> 0:39:49.600
<v Speaker 3>open source, we try to give back as much as

0:39:49.600 --> 0:39:53.200
<v Speaker 3>we can that that can broad scale accelerate all of

0:39:53.320 --> 0:39:57.360
<v Speaker 3>like conservation science, which we hope can also accelerate conservation itself.

0:39:57.719 --> 0:40:00.799
<v Speaker 1>Now, I have a technical question, which is if you

0:40:00.880 --> 0:40:04.919
<v Speaker 1>are looking at human languages and making this high dimensional

0:40:05.280 --> 0:40:09.120
<v Speaker 1>space of all the words when you're listening, when you're

0:40:09.160 --> 0:40:13.759
<v Speaker 1>eavesdropping on whales or lemurs or whatever species. How do

0:40:13.800 --> 0:40:16.239
<v Speaker 1>you know what a word is, what a unit of

0:40:16.400 --> 0:40:17.160
<v Speaker 1>meaning is.

0:40:18.280 --> 0:40:22.960
<v Speaker 3>Yeah, a hard problem and there's no one easy solution.

0:40:23.960 --> 0:40:25.960
<v Speaker 3>But one of the things that we can ask the

0:40:26.000 --> 0:40:32.520
<v Speaker 3>AI to do is try chopping up the audio in many, many,

0:40:32.560 --> 0:40:36.400
<v Speaker 3>many different ways and then see which one of those

0:40:36.719 --> 0:40:40.759
<v Speaker 3>ends up making good predictions for what comes next. And

0:40:40.800 --> 0:40:42.560
<v Speaker 3>so you can see if you're trying and varying and

0:40:43.600 --> 0:40:46.040
<v Speaker 3>you're not saying, well, which thing contains meaning, but which

0:40:46.040 --> 0:40:49.640
<v Speaker 3>things help make good predictions. When you try this on humans,

0:40:49.680 --> 0:40:53.960
<v Speaker 3>you end up with phonemes that you get out and

0:40:54.000 --> 0:40:57.560
<v Speaker 3>then those are then combinatorily built into words. So we're

0:40:57.560 --> 0:41:00.719
<v Speaker 3>playing with those kinds of techniques, but we don't have

0:41:00.840 --> 0:41:02.480
<v Speaker 3>like one surefire way yet.

0:41:02.600 --> 0:41:05.400
<v Speaker 1>And when you're thinking about predictions, one of the ways

0:41:05.480 --> 0:41:08.839
<v Speaker 1>that you could test a prediction is with playback, right,

0:41:08.880 --> 0:41:09.759
<v Speaker 1>So tell us about that.

0:41:11.239 --> 0:41:15.000
<v Speaker 3>Yeah, So this is the classic way that you test

0:41:15.040 --> 0:41:18.839
<v Speaker 3>your predictions in the field, where they'll just go out

0:41:18.880 --> 0:41:22.320
<v Speaker 3>and they will play a sound, often from the animal,

0:41:22.360 --> 0:41:24.840
<v Speaker 3>and they'll see whether the animal looks and for how long.

0:41:26.080 --> 0:41:28.839
<v Speaker 3>What we are starting to be able to do is

0:41:29.280 --> 0:41:32.800
<v Speaker 3>just like you can build a chat bot in text

0:41:33.160 --> 0:41:37.279
<v Speaker 3>that speaks Chinese without needing to speak Chinese. We are

0:41:37.320 --> 0:41:40.840
<v Speaker 3>on the cusp of being able to build these kinds

0:41:40.880 --> 0:41:44.319
<v Speaker 3>of chat bots, but that just directly speak in the

0:41:44.480 --> 0:41:47.040
<v Speaker 3>language of animals. So it's sort of like, imagine you

0:41:47.080 --> 0:41:49.920
<v Speaker 3>had a superpower, and your superpower was to go out

0:41:50.680 --> 0:41:52.880
<v Speaker 3>meet someone whose language you don't understand. You sort of

0:41:52.880 --> 0:41:54.680
<v Speaker 3>cock your head to the side and you listen for

0:41:54.680 --> 0:41:56.560
<v Speaker 3>a little bit and you're like, I don't know what

0:41:56.600 --> 0:41:58.880
<v Speaker 3>anything means, but I see that this sound pattern follows

0:41:58.920 --> 0:42:01.600
<v Speaker 3>this sound pattern in this context. You just start to

0:42:01.640 --> 0:42:04.440
<v Speaker 3>babble and you have no idea what you're saying, but

0:42:04.520 --> 0:42:07.440
<v Speaker 3>the other person, like, crosses their arms, like, yeah, wow,

0:42:07.520 --> 0:42:10.680
<v Speaker 3>that's so meaningful. And at the end the person walks away

0:42:10.680 --> 0:42:12.440
<v Speaker 3>thinking they've had a great conversation. You're like, I

0:42:12.480 --> 0:42:14.280
<v Speaker 3>have no idea what I just said. I was just babbling.

0:42:15.280 --> 0:42:18.640
<v Speaker 3>But that's literally what ChatGPT does, and that's

0:42:18.680 --> 0:42:20.320
<v Speaker 3>the kind of thing that we are going to be

0:42:20.360 --> 0:42:23.320
<v Speaker 3>able to build in the next you know, like twelve months.

0:42:24.400 --> 0:42:25.239
<v Speaker 3>And what does that mean?

0:42:25.680 --> 0:42:27.719
<v Speaker 1>Just so it's clear, can you give an example of

0:42:28.400 --> 0:42:30.560
<v Speaker 1>playback and the kind of things that people are doing

0:42:30.640 --> 0:42:33.040
<v Speaker 1>right now with that?

0:42:33.160 --> 0:42:36.520
<v Speaker 3>Yeah. So an example of a playback might be one of

0:42:36.520 --> 0:42:40.200
<v Speaker 3>our partners, Michelle Fournet, and you can, actually your listeners

0:42:40.239 --> 0:42:45.200
<v Speaker 3>can go watch her incredible documentary Fathom. She was trying

0:42:45.239 --> 0:42:49.040
<v Speaker 3>to determine how do you say hello to a humpback whale

0:42:49.600 --> 0:42:53.480
<v Speaker 3>and possibly include their name, So to say hello in

0:42:53.560 --> 0:42:57.040
<v Speaker 3>humpback, it turns out, is something like whoop. And

0:42:57.880 --> 0:43:02.280
<v Speaker 3>to test this, she recorded many different they're called whoop calls,

0:43:02.360 --> 0:43:06.359
<v Speaker 3>but they're hellos, many different whoop calls, and then went

0:43:06.400 --> 0:43:10.520
<v Speaker 3>out to Alaska, set up speakers underwater and would play

0:43:11.080 --> 0:43:14.520
<v Speaker 3>the hellos in a very controlled condition and would see

0:43:14.640 --> 0:43:19.759
<v Speaker 3>do the humpbacks respond? And the answer is yes, yes

0:43:19.880 --> 0:43:23.120
<v Speaker 3>they do. When she said hello, they would respond in

0:43:23.200 --> 0:43:26.840
<v Speaker 3>greater numbers, saying hello back. So that's an example

0:43:27.080 --> 0:43:28.760
<v Speaker 3>of a playback experiment.

0:43:44.960 --> 0:43:48.120
<v Speaker 1>I know you've thought a lot about the ethics involved

0:43:48.120 --> 0:43:51.320
<v Speaker 1>in this. So far, what we've been talking about sounds amazing,

0:43:51.360 --> 0:43:53.400
<v Speaker 1>and the question is what are the ethical things that

0:43:53.440 --> 0:43:54.360
<v Speaker 1>we need to keep an eye on.

0:43:55.560 --> 0:44:00.839
<v Speaker 3>Yeah, that is a great question because you know we

0:44:00.880 --> 0:44:05.080
<v Speaker 3>are going to be crossing this barrier very very soon,

0:44:05.320 --> 0:44:08.399
<v Speaker 3>which is I mean, this is the plot twist that

0:44:08.560 --> 0:44:12.440
<v Speaker 3>we will be able to communicate before we fully understand

0:44:12.480 --> 0:44:17.799
<v Speaker 3>what we're saying. That's again very surprising. I would

0:44:17.840 --> 0:44:20.080
<v Speaker 3>not have guessed this if we rewound the clock three

0:44:20.239 --> 0:44:26.560
<v Speaker 3>or four years. What does this mean? This means that

0:44:26.640 --> 0:44:31.080
<v Speaker 3>if you're working with a species which has vocal learning,

0:44:31.840 --> 0:44:35.040
<v Speaker 3>well you might inject something that they say that then

0:44:35.440 --> 0:44:38.440
<v Speaker 3>changes their culture. So, to give an example, humpback whales

0:44:39.080 --> 0:44:43.160
<v Speaker 3>off the coast of Australia. For whatever reason, the Australian

0:44:43.239 --> 0:44:45.960
<v Speaker 3>humpbacks seem to be like the K-pop singers, and

0:44:46.880 --> 0:44:49.400
<v Speaker 3>because they can sing halfway across an ocean basin and

0:44:49.400 --> 0:44:54.040
<v Speaker 3>they migrate across the world, often the songs that are

0:44:54.080 --> 0:44:57.600
<v Speaker 3>sung off the coasts of Australia will catch on and

0:44:57.640 --> 0:44:59.920
<v Speaker 3>be sung by much of the world population within a

0:45:00.000 --> 0:45:04.239
<v Speaker 3>couple of seasons. So it's, you know, the ultimate pop tune. Now,

0:45:04.480 --> 0:45:08.600
<v Speaker 3>we don't know as humans what truly the function of

0:45:09.000 --> 0:45:12.520
<v Speaker 3>humpback whale song is and how that culture works. So

0:45:12.600 --> 0:45:16.960
<v Speaker 3>if we just create a synthetic whale that sings, we

0:45:17.040 --> 0:45:20.200
<v Speaker 3>may infect a thirty four million year old wisdom tradition,

0:45:20.520 --> 0:45:24.800
<v Speaker 3>you know, create some kind of viral meme a whale QAnon.

0:45:25.200 --> 0:45:30.000
<v Speaker 3>We just don't know. So we have to be very

0:45:30.160 --> 0:45:35.560
<v Speaker 3>careful as we approach this new responsibility of what does

0:45:35.560 --> 0:45:39.880
<v Speaker 3>it mean to truly communicate with the other cultures of Earth?

0:45:40.200 --> 0:45:43.760
<v Speaker 3>And that means we should not go out and start

0:45:43.800 --> 0:45:47.719
<v Speaker 3>playing like two way communication real time. We should not

0:45:47.760 --> 0:45:51.879
<v Speaker 3>do those kinds of experiments with wild populations that vocally learn.

0:45:51.920 --> 0:45:53.719
<v Speaker 3>We have to think about what is it to have

0:45:53.840 --> 0:45:57.240
<v Speaker 3>like a Prime Directive or a Geneva Convention for cross

0:45:57.239 --> 0:46:00.759
<v Speaker 3>species communication. And this is of course terrifying, and I

0:46:00.760 --> 0:46:03.759
<v Speaker 3>should say, everything that Earth Species does we do with

0:46:03.800 --> 0:46:08.480
<v Speaker 3>biology partners and institutions. We are starting to work on

0:46:08.800 --> 0:46:11.759
<v Speaker 3>what are the ground rules before we even have the

0:46:11.800 --> 0:46:16.440
<v Speaker 3>technology for knowing when and how it is okay to

0:46:17.080 --> 0:46:21.760
<v Speaker 3>have these prime directive first contact moments because first contacts

0:46:21.760 --> 0:46:25.279
<v Speaker 3>have often not gone well for the beings being first contacted.

0:46:25.920 --> 0:46:30.319
<v Speaker 3>So I think the change in the relationship for how

0:46:30.360 --> 0:46:33.800
<v Speaker 3>we relate to nature is the point of Earth Species

0:46:33.960 --> 0:46:35.920
<v Speaker 3>and it's exciting that we're getting to the place where

0:46:36.760 --> 0:46:38.000
<v Speaker 3>that becomes a necessity.

0:46:38.840 --> 0:46:42.120
<v Speaker 1>And what's your prediction for how long it'll be, what

0:46:42.280 --> 0:46:46.719
<v Speaker 1>year we'll have a meaningful conversation back and forth with

0:46:46.800 --> 0:46:50.239
<v Speaker 1>the species? And which do you think will be the

0:46:50.280 --> 0:46:51.000
<v Speaker 1>first species?

0:46:52.200 --> 0:46:52.480
<v Speaker 2>Yeah?

0:46:52.520 --> 0:46:54.920
<v Speaker 3>I mean, this is science, so it's always very hard

0:46:54.960 --> 0:46:58.880
<v Speaker 3>to make predictions like this, and different people on my

0:46:58.880 --> 0:47:01.480
<v Speaker 3>team have different predictions, so I can just say mine,

0:47:01.920 --> 0:47:06.680
<v Speaker 3>but know that answers vary. I think certainly by twenty

0:47:06.800 --> 0:47:09.560
<v Speaker 3>thirty we will have had two-way back and forth.

0:47:09.640 --> 0:47:14.400
<v Speaker 3>To what degree we understand: unknown. But I think we

0:47:14.440 --> 0:47:17.560
<v Speaker 3>will have a really good handle on it by then.

0:47:19.200 --> 0:47:24.680
<v Speaker 3>It's just so exciting. My personal favorite is belugas, and

0:47:24.719 --> 0:47:27.359
<v Speaker 3>again everyone has their own personal favorite. But when you

0:47:27.400 --> 0:47:31.120
<v Speaker 3>listen to belugas communicate, it sounds like an alien modem.

0:47:31.400 --> 0:47:35.000
<v Speaker 3>It sounds digital. There are lots of whistles in there.

0:47:35.360 --> 0:47:37.440
<v Speaker 3>It turns out that, you know, dolphins, they say their name,

0:47:37.480 --> 0:47:41.160
<v Speaker 3>their signature whistle, in a whistle. It's like a single band.

0:47:41.440 --> 0:47:44.359
<v Speaker 3>This is like full modem pack. It encodes their name,

0:47:44.480 --> 0:47:49.520
<v Speaker 3>it encodes their clan identity. And Dr. Valeria Vergara, with

0:47:49.520 --> 0:47:52.279
<v Speaker 3>whom we work on beluga communication, she's sort of like

0:47:52.320 --> 0:47:55.000
<v Speaker 3>one of the preeminent scholars. It was her work that

0:47:55.080 --> 0:47:56.960
<v Speaker 3>showed that they have names that they call each other by.

0:47:57.719 --> 0:48:00.360
<v Speaker 3>And what blew my mind is that when she talks

0:48:00.360 --> 0:48:03.800
<v Speaker 3>about her data, she's like she had to throw away

0:48:04.680 --> 0:48:08.120
<v Speaker 3>ninety seven percent of her data in those studies because

0:48:08.120 --> 0:48:11.960
<v Speaker 3>she couldn't tell which beluga was speaking or disentangle them.

0:48:12.280 --> 0:48:14.520
<v Speaker 3>And that's because there are like forty belugas in a

0:48:14.560 --> 0:48:17.799
<v Speaker 3>tight mass that are moving around super fast. It's very

0:48:17.840 --> 0:48:22.120
<v Speaker 3>hard from a computational perspective. But that's where your listeners'

0:48:22.200 --> 0:48:24.319
<v Speaker 3>ears should perk up, because here we have the most

0:48:24.400 --> 0:48:28.880
<v Speaker 3>vocal underwater species with the largest vocabulary that we know of,

0:48:29.520 --> 0:48:32.400
<v Speaker 3>and the super majority of data, like ninety seven percent

0:48:32.480 --> 0:48:35.080
<v Speaker 3>is unknown. The ocean is, what, five percent explored? Beluga

0:48:35.080 --> 0:48:37.840
<v Speaker 3>communication, or at least these data sets, are three percent explored.

0:48:38.200 --> 0:48:38.920
<v Speaker 2>Like this is.

0:48:38.840 --> 0:48:41.640
<v Speaker 3>Where you get like brand new discoveries. This is the

0:48:41.680 --> 0:48:42.400
<v Speaker 3>next frontier.

0:48:42.840 --> 0:48:45.319
<v Speaker 1>And do belugas do turn taking, by the way, which

0:48:45.360 --> 0:48:47.880
<v Speaker 1>is one of the signatures of an actual language as

0:48:47.880 --> 0:48:49.560
<v Speaker 1>opposed to just broadcasting noise.

0:48:50.640 --> 0:48:53.719
<v Speaker 3>Yeah, a number of species do turn taking, from

0:48:53.800 --> 0:48:58.040
<v Speaker 3>parrots to geladas to many of the whale species.

0:48:58.719 --> 0:48:59.160
<v Speaker 2>Yeah, I know.

0:48:59.239 --> 0:49:02.439
<v Speaker 1>This is one of the signs that people look at

0:49:02.440 --> 0:49:04.120
<v Speaker 1>to try to figure out, how would we know if

0:49:04.120 --> 0:49:07.480
<v Speaker 1>this is actually a language versus they're just singing songs

0:49:07.560 --> 0:49:10.280
<v Speaker 1>or they're doing whatever, but they're not listening back and forth,

0:49:10.480 --> 0:49:14.799
<v Speaker 1>which leads to this question about if we find alien species,

0:49:15.000 --> 0:49:18.920
<v Speaker 1>eventually we find life on other planets. The question is

0:49:19.320 --> 0:49:22.440
<v Speaker 1>how much do we have to share with another species

0:49:22.440 --> 0:49:27.080
<v Speaker 1>for us to have some meaningful interpretation of the language,

0:49:27.440 --> 0:49:31.640
<v Speaker 1>Because fundamentally we're trapped in our internal model and a

0:49:31.719 --> 0:49:36.000
<v Speaker 1>species that's so different, we will impose an interpretation on

0:49:36.080 --> 0:49:40.719
<v Speaker 1>what they must mean by it. But I wonder when

0:49:40.800 --> 0:49:44.000
<v Speaker 1>we find an alien species, how we will ever be

0:49:44.080 --> 0:49:47.960
<v Speaker 1>able to know whether we understand enough of their language

0:49:48.000 --> 0:49:51.120
<v Speaker 1>to say that we have a meaningful interpretation of it.

0:49:52.840 --> 0:49:54.640
<v Speaker 3>I mean, when you say that, it just makes me wonder,

0:49:54.680 --> 0:49:57.279
<v Speaker 3>how do we ever know that when we're communicating with

0:49:57.320 --> 0:49:59.719
<v Speaker 3>each other as humans, that we truly understand each other.

0:50:00.000 --> 0:50:05.920
<v Speaker 3>There's almost this undeniably huge and yet invisible gulf, like

0:50:05.960 --> 0:50:08.200
<v Speaker 3>the myth of communication is that it ever happened in

0:50:08.239 --> 0:50:11.359
<v Speaker 3>the first place. We never truly know. We can only

0:50:11.400 --> 0:50:15.400
<v Speaker 3>have clues that we are getting closer, that we're approaching knowing.

0:50:17.120 --> 0:50:20.359
<v Speaker 3>I think it's really important to call out how much

0:50:20.400 --> 0:50:23.320
<v Speaker 3>of our language is built on the metaphor of bodies.

0:50:23.960 --> 0:50:28.000
<v Speaker 3>Almost all of it is body and space, right Like,

0:50:28.120 --> 0:50:30.600
<v Speaker 3>even the things that we might think are really abstract,

0:50:30.840 --> 0:50:33.719
<v Speaker 3>like cursor on your computer. What is the root of

0:50:33.719 --> 0:50:36.520
<v Speaker 3>cursor? In Latin, it's cursor, the one who runs. It's

0:50:36.880 --> 0:50:41.400
<v Speaker 3>the man who runs. Impeded: impeded, against foot. It's like

0:50:42.280 --> 0:50:46.160
<v Speaker 3>the deeper you look into language, the more you realize.

0:50:46.320 --> 0:50:48.879
<v Speaker 3>And George Lakoff does an incredible job in a book

0:50:48.920 --> 0:50:53.360
<v Speaker 3>called Metaphors We Live By, really deconstructing all of the

0:50:53.400 --> 0:50:55.960
<v Speaker 3>ways that what we think of as our most abstract

0:50:56.000 --> 0:50:59.040
<v Speaker 3>ideas can be traced back to a root of us

0:50:59.120 --> 0:51:01.840
<v Speaker 3>having bodies and talking about our bodies in a physical world.

0:51:01.719 --> 0:51:05.520
<v Speaker 2>And particular bodies, particular bodies. That that is true.

0:51:05.560 --> 0:51:07.280
<v Speaker 1>What I mean is when we find an alien species,

0:51:07.320 --> 0:51:09.400
<v Speaker 1>let's say they're more like slime mold or something. That

0:51:09.480 --> 0:51:14.080
<v Speaker 1>might make it very difficult for us to understand their metaphors.

0:51:14.239 --> 0:51:17.399
<v Speaker 3>That is exactly right. And I think the hope here

0:51:18.239 --> 0:51:22.239
<v Speaker 3>is that because we are conditioned on and live in

0:51:22.280 --> 0:51:24.880
<v Speaker 3>a physical world, that to the extent that there is

0:51:24.920 --> 0:51:29.399
<v Speaker 3>an outside world which we share, that

0:51:29.440 --> 0:51:32.759
<v Speaker 3>will give the kind of grounding that's needed to do

0:51:33.000 --> 0:51:36.520
<v Speaker 3>some kind of translation. But I think it would be

0:51:36.560 --> 0:51:39.239
<v Speaker 3>wrong to say that the translations are going to look

0:51:39.280 --> 0:51:41.120
<v Speaker 3>like Google Translate. You're going to get word forward to

0:51:41.160 --> 0:51:43.880
<v Speaker 3>English it might end up looking like a translation is

0:51:43.920 --> 0:51:47.520
<v Speaker 3>more like a piece of art or poetry, where the

0:51:47.600 --> 0:51:51.000
<v Speaker 3>translation is very ambiguous, but if you spend enough time

0:51:51.040 --> 0:51:53.000
<v Speaker 3>with it, you start to get a felt sense of

0:51:53.040 --> 0:51:55.680
<v Speaker 3>what it's like. Or maybe you're right, maybe it'll be

0:51:55.760 --> 0:51:57.759
<v Speaker 3>so different that we'll never... There's just some things we

0:51:57.840 --> 0:51:58.760
<v Speaker 3>will never be able to work out.

0:51:58.560 --> 0:52:01.600
<v Speaker 1>Right, and in between, we'll impose an interpretation

0:52:01.840 --> 0:52:03.480
<v Speaker 1>even though it will be incorrect.

0:52:04.400 --> 0:52:07.760
<v Speaker 3>Yeah, that is true, And at least here on Earth.

0:52:08.120 --> 0:52:11.600
<v Speaker 3>There are sort of two failure modes. One is anthropomorphizing,

0:52:11.640 --> 0:52:15.160
<v Speaker 3>which is what you're talking about, is assuming that we can, well,

0:52:15.239 --> 0:52:17.560
<v Speaker 3>we can only relate to the experience of others through

0:52:17.600 --> 0:52:19.680
<v Speaker 3>our own experience. That's the only way it ever happens.

0:52:19.719 --> 0:52:21.640
<v Speaker 3>It's very simple to say, but it's actually profound when

0:52:21.640 --> 0:52:25.440
<v Speaker 3>you think about it. So there's an over projection of

0:52:25.480 --> 0:52:29.040
<v Speaker 3>ourselves onto others. And then the other side is human exceptionalism,

0:52:29.320 --> 0:52:32.320
<v Speaker 3>where we assume that our experiences are completely unique to

0:52:32.360 --> 0:52:35.200
<v Speaker 3>us and we share nothing with other animals. And obviously the

0:52:35.440 --> 0:52:38.840
<v Speaker 3>answer, the truth, is in between the two. And

0:52:38.880 --> 0:52:42.120
<v Speaker 3>then we have to have the self honesty to understand

0:52:42.160 --> 0:52:45.520
<v Speaker 3>and the way of asking questions that lets us determine

0:52:45.560 --> 0:52:47.320
<v Speaker 3>when we are over projecting.

0:52:47.160 --> 0:52:50.400
<v Speaker 1>Yes, exactly, And I'm really interested when and this might

0:52:50.440 --> 0:52:54.600
<v Speaker 1>not happen in our lifetimes, but when we discover completely

0:52:54.640 --> 0:52:59.239
<v Speaker 1>alien life, I mean as in living on other planets

0:52:59.280 --> 0:53:00.520
<v Speaker 1>that might be so different.

0:53:00.520 --> 0:53:02.480
<v Speaker 2>Maybe they don't have DNA, maybe they.

0:53:02.320 --> 0:53:06.280
<v Speaker 1>Have a different coding system, maybe they have very different bodies.

0:53:07.320 --> 0:53:13.799
<v Speaker 1>The question is how much is Earth exceptionalism true? You know,

0:53:13.880 --> 0:53:16.680
<v Speaker 1>in Star Trek they go around and they communicate with

0:53:16.719 --> 0:53:18.480
<v Speaker 1>all these aliens and they have a good time, and

0:53:18.640 --> 0:53:21.839
<v Speaker 1>you know, really understand each other to some degree. And

0:53:22.160 --> 0:53:25.000
<v Speaker 1>the question is whether that will be the case or not.

0:53:26.760 --> 0:53:30.239
<v Speaker 3>Yeah, I mean if you start thinking about I think

0:53:30.280 --> 0:53:32.799
<v Speaker 3>in Star Trek they have the crystalline entity, which is

0:53:33.280 --> 0:53:36.560
<v Speaker 3>a giant being the size of a whole planet. And

0:53:36.760 --> 0:53:39.080
<v Speaker 3>at that point, I think I'd come a little more

0:53:39.440 --> 0:53:41.960
<v Speaker 3>along your lines that the scale at which that being

0:53:42.160 --> 0:53:45.239
<v Speaker 3>is feeling and sensing is so broad, we probably share

0:53:45.400 --> 0:53:48.920
<v Speaker 3>very little. But anything of roughly our size, and if

0:53:48.960 --> 0:53:51.800
<v Speaker 3>they have family structures, like then there is like hunger,

0:53:53.000 --> 0:54:01.400
<v Speaker 3>there's being tired, there's like safety, there's like familial relationships,

0:54:01.640 --> 0:54:05.400
<v Speaker 3>there is gossip, and those things are probably conserved across many,

0:54:05.560 --> 0:54:09.520
<v Speaker 3>many different types of beings.

0:54:11.880 --> 0:54:15.719
<v Speaker 1>So we're entering a really exciting time, but the challenges

0:54:15.800 --> 0:54:18.400
<v Speaker 1>are real and there are still a lot of question marks.

0:54:18.960 --> 0:54:22.560
<v Speaker 1>For example, in the scientific literature, there's an ongoing debate

0:54:22.640 --> 0:54:27.719
<v Speaker 1>about which species might have languages. Some researchers listen to

0:54:27.760 --> 0:54:30.600
<v Speaker 1>a particular species and say that seems like that could

0:54:30.640 --> 0:54:33.399
<v Speaker 1>be language, and others listen and they say, no, way,

0:54:33.480 --> 0:54:37.200
<v Speaker 1>that's not language because there's no turn taking, and also

0:54:37.280 --> 0:54:40.560
<v Speaker 1>because the order of the sounds doesn't seem to make

0:54:40.600 --> 0:54:43.719
<v Speaker 1>any difference. And these are all valid debates because we

0:54:43.800 --> 0:54:47.920
<v Speaker 1>don't actually know what qualifies as a language and what doesn't.

0:54:48.719 --> 0:54:53.520
<v Speaker 1>Some species, for example, some songbirds do what's called dueting,

0:54:53.960 --> 0:54:56.560
<v Speaker 1>where they're singing at the same time. Does this mean

0:54:56.600 --> 0:55:00.520
<v Speaker 1>they're not doing language, or is it possible there are very

0:55:00.640 --> 0:55:04.600
<v Speaker 1>different ways of doing language? I'll give you a concrete

0:55:04.600 --> 0:55:07.799
<v Speaker 1>example of a different way of doing language, which is

0:55:08.080 --> 0:55:12.000
<v Speaker 1>sign language. It turns out that the temporal order doesn't

0:55:12.040 --> 0:55:14.600
<v Speaker 1>matter very much. In sign language. You can switch up

0:55:14.640 --> 0:55:17.080
<v Speaker 1>the order of the words and it can still mean

0:55:17.120 --> 0:55:20.200
<v Speaker 1>the same thing. And there are aspects of it that

0:55:20.239 --> 0:55:23.960
<v Speaker 1>are spatial. So, for example, in American Sign Language, you

0:55:24.000 --> 0:55:27.600
<v Speaker 1>can indicate that something happened in the past by doing

0:55:27.640 --> 0:55:30.600
<v Speaker 1>the signs slightly to your left, and if you make

0:55:30.640 --> 0:55:33.520
<v Speaker 1>the same signs over on your right side, that means

0:55:33.520 --> 0:55:37.520
<v Speaker 1>you're talking about the future. So it's the same signs

0:55:37.920 --> 0:55:41.359
<v Speaker 1>with this subtly different spatial position, and it can mean

0:55:41.400 --> 0:55:44.640
<v Speaker 1>different things. And I call it subtle because if someone

0:55:44.719 --> 0:55:48.759
<v Speaker 1>didn't know to watch for a slight spatial change, they

0:55:48.800 --> 0:55:51.839
<v Speaker 1>wouldn't even notice it. And even a language that only

0:55:51.960 --> 0:55:56.200
<v Speaker 1>uses sounds can be very difficult to decode because so

0:55:56.320 --> 0:56:00.400
<v Speaker 1>much of it depends on shared assumptions about meaning. So

0:56:00.640 --> 0:56:04.200
<v Speaker 1>just as an example, if I'm talking about someone named Aviva,

0:56:04.560 --> 0:56:07.280
<v Speaker 1>I use her name once, and in the next sentence

0:56:07.320 --> 0:56:10.600
<v Speaker 1>I just say her, and you know who I'm talking about.

0:56:10.640 --> 0:56:14.240
<v Speaker 1>I'm referencing Aviva. But if you are an alien working

0:56:14.280 --> 0:56:17.839
<v Speaker 1>to decode my language, you might be confused because one

0:56:17.840 --> 0:56:21.120
<v Speaker 1>minute later you hear me use the same utterance her,

0:56:21.520 --> 0:56:23.600
<v Speaker 1>but now I'm referring to someone else entirely.

0:56:23.640 --> 0:56:25.520
<v Speaker 2>I'm now talking about Sarah, but I.

0:56:25.480 --> 0:56:29.319
<v Speaker 1>Still use the word her, So the same word can

0:56:29.400 --> 0:56:32.480
<v Speaker 1>refer to totally different things. And the alien would be

0:56:32.600 --> 0:56:37.200
<v Speaker 1>very confused if it had concluded that her was the

0:56:37.280 --> 0:56:40.000
<v Speaker 1>word for Aviva, and in the same way when we

0:56:40.120 --> 0:56:43.680
<v Speaker 1>hear a whale make the same sound that we always hear,

0:56:43.760 --> 0:56:46.440
<v Speaker 1>it might be talking about something totally different than the

0:56:46.560 --> 0:56:51.239
<v Speaker 1>last time it used that sound. The context matters, and

0:56:51.440 --> 0:56:54.400
<v Speaker 1>this issue of context, in other words, what's going on

0:56:54.520 --> 0:56:58.759
<v Speaker 1>around the animal. This is why biologgers are interested in

0:56:58.800 --> 0:57:03.239
<v Speaker 1>collecting things beyond just the audio data. Good biologging

0:57:03.320 --> 0:57:08.120
<v Speaker 1>now uses video and gyroscope and altimeter and GPS and

0:57:08.480 --> 0:57:11.239
<v Speaker 1>any other measure they can get their hands on. And

0:57:11.280 --> 0:57:16.040
<v Speaker 1>this matters because so much of communication is about context,

0:57:16.280 --> 0:57:18.640
<v Speaker 1>and by the way, a lot of it is nonverbal.

0:57:19.320 --> 0:57:22.320
<v Speaker 1>Consider how you pick up stress from someone else even

0:57:22.400 --> 0:57:27.000
<v Speaker 1>without words, body language, the tightness of their facial muscles,

0:57:27.040 --> 0:57:31.160
<v Speaker 1>the way they're walking, and so on, and animals presumably

0:57:31.240 --> 0:57:36.560
<v Speaker 1>have many equivalents to this. Just think about smells and pheromones.

0:57:37.120 --> 0:57:39.439
<v Speaker 1>Take a close look at your dog the next time

0:57:39.440 --> 0:57:42.480
<v Speaker 1>you're on a walk. It's obvious that a lot of

0:57:42.520 --> 0:57:47.560
<v Speaker 1>your dog's language is happening silently. So all this is

0:57:47.600 --> 0:57:51.120
<v Speaker 1>to say that language can be complicated, and much of

0:57:51.160 --> 0:57:55.400
<v Speaker 1>it can be nonverbal, and this is why the challenge

0:57:55.440 --> 0:57:59.600
<v Speaker 1>of decoding animal language is a big one. And you know,

0:57:59.640 --> 0:58:02.080
<v Speaker 1>one of the things that I'm always on the lookout

0:58:02.160 --> 0:58:05.320
<v Speaker 1>for is whether we can see any evidence that animals

0:58:05.360 --> 0:58:09.600
<v Speaker 1>engage in something like storytelling. One of the classes that

0:58:09.640 --> 0:58:13.240
<v Speaker 1>I teach at Stanford is the Brain and Literature, and

0:58:13.280 --> 0:58:15.920
<v Speaker 1>I teach how weird it is that we go to

0:58:15.960 --> 0:58:18.600
<v Speaker 1>the theater or the movies or a lecture and someone

0:58:18.800 --> 0:58:23.160
<v Speaker 1>speaks and whoosh, we get immediately transported into a different

0:58:23.520 --> 0:58:24.480
<v Speaker 1>space and time.

0:58:24.560 --> 0:58:26.760
<v Speaker 2>It's like a guided dream.

0:58:27.280 --> 0:58:29.280
<v Speaker 1>I'm going to do an episode on this issue soon,

0:58:29.320 --> 0:58:31.479
<v Speaker 1>but for now, I just want to point out that

0:58:31.920 --> 0:58:35.920
<v Speaker 1>we don't see bears congregating like hundreds of them on

0:58:35.960 --> 0:58:40.320
<v Speaker 1>a Saturday night listening to one bear grunt along. And

0:58:40.400 --> 0:58:42.480
<v Speaker 1>I'm not certain that we see that in any species,

0:58:42.560 --> 0:58:45.040
<v Speaker 1>but I don't know. But these are the kinds of

0:58:45.080 --> 0:58:48.600
<v Speaker 1>clues we would look for as we move forward. These

0:58:48.600 --> 0:58:51.000
<v Speaker 1>are the questions of not just do they have some

0:58:51.120 --> 0:58:54.720
<v Speaker 1>simple language, but what they can do with their language.

0:58:54.880 --> 0:58:57.160
<v Speaker 1>This is a tougher problem and one that we need

0:58:57.200 --> 0:59:01.440
<v Speaker 1>to keep our eye on. So plenty of remaining question

0:59:01.520 --> 0:59:05.320
<v Speaker 1>marks all around us, but what's clear is that technology

0:59:05.600 --> 0:59:10.160
<v Speaker 1>like biologgers and artificial intelligence are leveling us up into

0:59:10.200 --> 0:59:14.120
<v Speaker 1>a very exciting time. Not all species are going to

0:59:14.120 --> 0:59:18.640
<v Speaker 1>have something interesting to say, but many might. And if

0:59:18.680 --> 0:59:22.080
<v Speaker 1>we find we can decode animal language due to the

0:59:22.200 --> 0:59:26.040
<v Speaker 1>labors of Aza Raskin and his co-founders Katie and Britt

0:59:26.200 --> 0:59:28.720
<v Speaker 1>and dozens of other people in this exciting field of

0:59:28.720 --> 0:59:33.040
<v Speaker 1>animal communication, that will give us a very different view

0:59:33.240 --> 0:59:38.520
<v Speaker 1>of ourselves and our species on this planet. Our grandchildren

0:59:38.560 --> 0:59:42.080
<v Speaker 1>will grow up and they'll feel amazed that we considered

0:59:42.120 --> 0:59:46.680
<v Speaker 1>ourselves the only ones, and it wasn't even necessarily because

0:59:46.720 --> 0:59:50.960
<v Speaker 1>of species chauvinism, but instead because we can only hear

0:59:51.000 --> 0:59:53.920
<v Speaker 1>our own voices, and therefore we thought we were the

0:59:53.920 --> 0:59:57.840
<v Speaker 1>only ones in the room. And with enough time, maybe

0:59:57.920 --> 1:00:02.520
<v Speaker 1>we'll have enough technology and practice at decoding animal languages

1:00:02.880 --> 1:00:06.880
<v Speaker 1>that eventually, in the more distant future, we can tackle

1:00:07.440 --> 1:00:12.320
<v Speaker 1>extraplanetary communication, and our great-great-grandkids will be

1:00:12.360 --> 1:00:14.920
<v Speaker 1>amazed that there was a time when we thought we

1:00:15.000 --> 1:00:17.000
<v Speaker 1>were the only ones in the galaxy.

1:00:18.040 --> 1:00:19.880
<v Speaker 2>We may be looked back upon

1:00:19.960 --> 1:00:25.080
<v Speaker 1>as the era of loneliness, surrounded by voices of all

1:00:25.160 --> 1:00:29.200
<v Speaker 1>types that we just didn't know how to hear.

1:00:34.720 --> 1:00:36.360
<v Speaker 2>Please join me at eagleman

1:00:36.080 --> 1:00:40.160
<v Speaker 1>dot com slash podcasts for more information and links to various

1:00:40.160 --> 1:00:44.280
<v Speaker 1>animal communication projects and further reading. Send me an email

1:00:44.320 --> 1:00:47.520
<v Speaker 1>at podcasts at eagleman dot com with questions or discussion,

1:00:47.880 --> 1:00:50.360
<v Speaker 1>and I'll be making an episode soon in which I

1:00:50.400 --> 1:00:54.439
<v Speaker 1>address those. Until next time, I'm David Eagleman, and this

1:00:54.760 --> 1:01:03.840
<v Speaker 1>is Inner Cosmos.