WEBVTT - EP127 "What happens when we marry brains to machines?" with Sergey Stavisky

0:00:05.200 --> 0:00:10.000
<v Speaker 1>What is a brain computer interface? How far along is

0:00:10.160 --> 0:00:14.120
<v Speaker 1>this field? Can we eavesdrop on the brain so that

0:00:14.160 --> 0:00:18.040
<v Speaker 1>a person who has lost the ability to move can

0:00:18.200 --> 0:00:22.120
<v Speaker 1>use their brain to control a computer cursor or a

0:00:22.239 --> 0:00:25.880
<v Speaker 1>robotic arm? Can someone who has lost the ability to

0:00:26.040 --> 0:00:30.760
<v Speaker 1>speak send brain signals to a decoder and hear their

0:00:30.880 --> 0:00:36.360
<v Speaker 1>voice again? Can we restore autonomy and dignity and eventually

0:00:36.800 --> 0:00:41.720
<v Speaker 1>do so so seamlessly that the technology disappears and the

0:00:41.760 --> 0:00:46.920
<v Speaker 1>person reappears? In the future, where will the ethical boundaries

0:00:46.960 --> 0:00:52.320
<v Speaker 1>be between restoring function and spying on private thought? And

0:00:52.400 --> 0:00:57.400
<v Speaker 1>who owns the stream of neural data that represents you?

0:01:00.640 --> 0:01:03.880
<v Speaker 1>Welcome to Inner Cosmos with me, David Eagleman. I'm a

0:01:03.920 --> 0:01:07.720
<v Speaker 1>neuroscientist and author at Stanford and in these episodes we

0:01:07.840 --> 0:01:12.560
<v Speaker 1>sail deeply into our three pound universe to understand why

0:01:12.680 --> 0:01:31.839
<v Speaker 1>and how our lives look the way they do. This week,

0:01:31.840 --> 0:01:36.760
<v Speaker 1>we're talking about technology for reading the brain. Now, one

0:01:36.760 --> 0:01:40.480
<v Speaker 1>thing that I find fascinating is that ancient cultures didn't

0:01:40.520 --> 0:01:44.160
<v Speaker 1>care at all about the brain. They generally would just

0:01:44.680 --> 0:01:48.720
<v Speaker 1>throw it out at autopsy, and it's understandable why it

0:01:48.880 --> 0:01:53.360
<v Speaker 1>just looks and feels like a huge, squishy walnut. If

0:01:53.360 --> 0:01:57.200
<v Speaker 1>you could sit and stare at a brain in action,

0:01:57.880 --> 0:02:03.200
<v Speaker 1>you wouldn't see anything happening. So it's taken centuries and

0:02:03.240 --> 0:02:06.160
<v Speaker 1>a lot of technology to realize that, in fact, the

0:02:06.200 --> 0:02:11.680
<v Speaker 1>brain is alive with lots of tiny cells, microscopically tiny,

0:02:12.040 --> 0:02:15.960
<v Speaker 1>and these cells are transmitting electrical signals tens or one

0:02:16.040 --> 0:02:18.920
<v Speaker 1>hundred times every second for each cell. And you have

0:02:19.080 --> 0:02:23.880
<v Speaker 1>eighty six billion of these cells. So this big, squishy

0:02:23.919 --> 0:02:27.799
<v Speaker 1>walnut is one of the busiest things on the planet.

0:02:28.680 --> 0:02:32.560
<v Speaker 1>But because it is so fragile, Mother Nature surrounds the

0:02:32.600 --> 0:02:36.839
<v Speaker 1>brain with armored bunker plating, the skull, and that

0:02:36.919 --> 0:02:40.080
<v Speaker 1>provides a huge challenge if you want to go in

0:02:40.120 --> 0:02:44.600
<v Speaker 1>there and eavesdrop on what the cells are doing. Now,

0:02:44.639 --> 0:02:47.400
<v Speaker 1>why would you want to spy on these cells? Well,

0:02:47.840 --> 0:02:52.800
<v Speaker 1>imagine if your thoughts could exit the skull as easily

0:02:53.160 --> 0:02:57.440
<v Speaker 1>as words leave your mouth. Now, there's a sense in

0:02:57.480 --> 0:03:00.840
<v Speaker 1>which we always do this. We use keyboards, touch screens,

0:03:00.919 --> 0:03:05.079
<v Speaker 1>and voice assistants, but all of those are detours. They

0:03:05.160 --> 0:03:09.160
<v Speaker 1>force the brain to route its intentions through muscle, and

0:03:09.240 --> 0:03:13.520
<v Speaker 1>that's fine if your muscles work. The problem is that

0:03:13.720 --> 0:03:17.680
<v Speaker 1>lots of people, millions of our neighbors and friends don't

0:03:17.680 --> 0:03:21.000
<v Speaker 1>have a way to get the information out of their

0:03:21.040 --> 0:03:24.959
<v Speaker 1>brain because something about the brain or the brain's pathways

0:03:25.080 --> 0:03:28.639
<v Speaker 1>or the muscles are not working, and therefore their brain

0:03:28.800 --> 0:03:31.640
<v Speaker 1>knows what they want to do or say, but there's

0:03:31.639 --> 0:03:35.280
<v Speaker 1>no way to get that information out. And this is

0:03:35.320 --> 0:03:39.800
<v Speaker 1>where the idea of a brain computer interface comes in.

0:03:40.160 --> 0:03:44.520
<v Speaker 1>What you'll hear referred to as a BCI, brain computer interface.

0:03:45.000 --> 0:03:48.360
<v Speaker 1>The idea of a BCI is to listen directly to

0:03:48.440 --> 0:03:52.320
<v Speaker 1>the neural patterns that mean move or speak or select,

0:03:52.680 --> 0:03:56.800
<v Speaker 1>and then you use some device to translate those patterns

0:03:56.880 --> 0:04:01.840
<v Speaker 1>directly into activation in the outside world. Now, as I said,

0:04:01.840 --> 0:04:04.160
<v Speaker 1>this is a huge deal for all the people for

0:04:04.200 --> 0:04:09.480
<v Speaker 1>whom the path from intention to movement has been interrupted

0:04:09.520 --> 0:04:12.920
<v Speaker 1>by disease or injury. The intent is still alive and

0:04:12.960 --> 0:04:16.800
<v Speaker 1>well in the cortex, and BCIs are the bridge back.

0:04:17.279 --> 0:04:22.239
<v Speaker 1>They turn silent plans into text or voice or cursor

0:04:22.320 --> 0:04:27.120
<v Speaker 1>control or reaching and grasping. But the story will, at

0:04:27.200 --> 0:04:30.839
<v Speaker 1>least in theory, reach beyond the medical because once you

0:04:30.920 --> 0:04:34.240
<v Speaker 1>can read out the programs for say this word or

0:04:34.320 --> 0:04:38.400
<v Speaker 1>press that key, now you've built a communication channel between

0:04:38.520 --> 0:04:43.919
<v Speaker 1>biological tissue and silicon, and that opens new forms of

0:04:44.080 --> 0:04:49.760
<v Speaker 1>interaction that our species has barely begun to imagine. Now,

0:04:49.839 --> 0:04:51.960
<v Speaker 1>let me not get ahead of myself yet, because as

0:04:52.000 --> 0:04:54.599
<v Speaker 1>we're going to see today, we are still at the

0:04:54.760 --> 0:04:58.240
<v Speaker 1>earliest stages of this technology. But this is what we're

0:04:58.279 --> 0:05:01.279
<v Speaker 1>going to talk about at the end. Now, you can

0:05:01.360 --> 0:05:05.160
<v Speaker 1>build BCIs in lots of flavors. Some rest on the scalp.

0:05:05.480 --> 0:05:08.840
<v Speaker 1>Others sit on the surface of the brain. Others poke

0:05:09.160 --> 0:05:12.440
<v Speaker 1>tiny wires called electrodes into the surface of the brain

0:05:12.880 --> 0:05:15.480
<v Speaker 1>or even down deep into the brain for some purposes.

0:05:16.240 --> 0:05:20.520
<v Speaker 1>Some of these BCIs only read the electrical activity. Others

0:05:20.560 --> 0:05:25.120
<v Speaker 1>will also write with electrical patterns that the brain experiences

0:05:25.200 --> 0:05:28.200
<v Speaker 1>as touch or sound or sight. In every case, the

0:05:28.240 --> 0:05:32.360
<v Speaker 1>principle is the same. Brains issue commands in their very

0:05:32.520 --> 0:05:36.760
<v Speaker 1>fast and complex internal language of electrical spikes. This is

0:05:36.800 --> 0:05:41.000
<v Speaker 1>a language that we haven't nearly decoded yet, but machines

0:05:41.120 --> 0:05:44.159
<v Speaker 1>can learn to translate that language through a lot of

0:05:44.360 --> 0:05:48.839
<v Speaker 1>trial and error. Huge populations of neurons are playing some

0:05:49.240 --> 0:05:54.000
<v Speaker 1>symphony piece, and these decoders learn how to hear the

0:05:54.080 --> 0:05:58.240
<v Speaker 1>music and route the commands to a cursor or a

0:05:58.279 --> 0:06:02.040
<v Speaker 1>speaker or a robotic arm or whatever. Now, the issue

0:06:02.080 --> 0:06:04.120
<v Speaker 1>is that when we talk about it, it all seems

0:06:04.200 --> 0:06:08.240
<v Speaker 1>very straightforward and easy, but actually getting in there and

0:06:08.320 --> 0:06:12.320
<v Speaker 1>getting technology that can record from these microscopic little cells,

0:06:12.560 --> 0:06:16.520
<v Speaker 1>having these little changes in their electrical potential of tens

0:06:16.520 --> 0:06:20.640
<v Speaker 1>of millivolts, and making a system that lasts, and then

0:06:20.720 --> 0:06:23.960
<v Speaker 1>putting all the data together to understand what this very

0:06:24.120 --> 0:06:28.320
<v Speaker 1>tiny sampling of neurons, maybe a few hundred out of

0:06:28.640 --> 0:06:32.200
<v Speaker 1>hundreds of billions of neurons. It turns out this is

0:06:32.240 --> 0:06:37.640
<v Speaker 1>a massive engineering challenge and there are a million practical questions.

0:06:38.000 --> 0:06:41.640
<v Speaker 1>How reliable are these systems outside the lab? Can they

0:06:41.680 --> 0:06:46.480
<v Speaker 1>survive infection and signal drift? What about battery life? What's

0:06:46.560 --> 0:06:50.559
<v Speaker 1>the surgical risk? When does insurance cover these? So there's

0:06:50.800 --> 0:06:55.520
<v Speaker 1>a huge gap between a beautiful proof of principle and

0:06:55.800 --> 0:07:00.440
<v Speaker 1>a device that changes lives every day, and crossing that

0:07:00.560 --> 0:07:03.440
<v Speaker 1>gap is the real work of the field right now.

0:07:04.160 --> 0:07:06.320
<v Speaker 1>Now there's also a second issue. As soon as we

0:07:06.360 --> 0:07:10.840
<v Speaker 1>start talking about reading the brain, the questions start to surface,

0:07:11.000 --> 0:07:14.880
<v Speaker 1>what exactly are we reading? Is it intended movements? That's

0:07:14.920 --> 0:07:18.360
<v Speaker 1>one thing. Is that inner speech? Is it where you

0:07:18.520 --> 0:07:22.120
<v Speaker 1>place your attention? You can imagine situations in which there

0:07:22.160 --> 0:07:25.760
<v Speaker 1>are things that you don't want everyone knowing. We're used

0:07:25.760 --> 0:07:29.640
<v Speaker 1>to the skull having some sort of sanctity. So where

0:07:29.680 --> 0:07:36.080
<v Speaker 1>will the ethical boundaries be between restoring function and eavesdropping

0:07:36.120 --> 0:07:40.040
<v Speaker 1>on private thought? Who's going to own the stream of

0:07:40.120 --> 0:07:44.320
<v Speaker 1>data that is literally you? How do we guarantee consent

0:07:44.440 --> 0:07:48.680
<v Speaker 1>and security and dignity when the interface is not on

0:07:48.720 --> 0:07:52.280
<v Speaker 1>your desk but inside your skull? So, even in the

0:07:52.280 --> 0:07:54.800
<v Speaker 1>face of all the tough questions coming down the pike,

0:07:55.320 --> 0:07:59.840
<v Speaker 1>it's hard not to feel awe at what's already possible.

0:07:59.840 --> 0:08:04.040
<v Speaker 1>People who have been locked inside their bodies are communicating again.

0:08:04.360 --> 0:08:07.080
<v Speaker 1>They're talking with their loved ones for the first time

0:08:07.160 --> 0:08:12.360
<v Speaker 1>in years. And the technology keeps improving every month, smarter algorithms,

0:08:12.440 --> 0:08:17.960
<v Speaker 1>better sensors, cleaner signals, and crucially designs that move from

0:08:17.960 --> 0:08:20.960
<v Speaker 1>the hospital to the home. So today I want to

0:08:21.000 --> 0:08:23.600
<v Speaker 1>explore what that looks like and where we are in

0:08:23.600 --> 0:08:26.480
<v Speaker 1>the process and where things are going. So I sat

0:08:26.520 --> 0:08:29.800
<v Speaker 1>down with my colleague Sergey Stavisky. Sergey is at the

0:08:29.840 --> 0:08:34.280
<v Speaker 1>UC Davis Neuroprosthetics Lab, which he co-directs with neurosurgeon

0:08:34.480 --> 0:08:38.720
<v Speaker 1>David Brandman. With their collaborators, they work on BCIs that

0:08:38.840 --> 0:08:43.400
<v Speaker 1>restore communication and they're pushing towards systems that are fast

0:08:43.520 --> 0:08:47.600
<v Speaker 1>and expressive and practical for everyday life. So here's my

0:08:47.679 --> 0:08:50.280
<v Speaker 1>interview with Sergey Stavisky.

0:08:53.920 --> 0:08:58.120
<v Speaker 2>A brain computer interface is a device that interacts between

0:08:58.200 --> 0:09:01.000
<v Speaker 2>technology and a brain. You have the brain, you have

0:09:01.240 --> 0:09:04.200
<v Speaker 2>some way of getting information in or out, and you

0:09:04.280 --> 0:09:07.560
<v Speaker 2>have some computation that's happening. And that computation it could

0:09:07.559 --> 0:09:09.240
<v Speaker 2>be happening inside the body, so it could be a

0:09:09.320 --> 0:09:12.240
<v Speaker 2>chip that does everything in the brain, or it could

0:09:12.240 --> 0:09:15.800
<v Speaker 2>be sending that information to a laptop next to the person,

0:09:15.880 --> 0:09:18.079
<v Speaker 2>or even to the cloud for more computation.

0:09:18.480 --> 0:09:21.080
<v Speaker 1>Now, one of your interests is that you know, over

0:09:21.120 --> 0:09:23.440
<v Speaker 1>a century ago people figured out you could dunk an

0:09:23.480 --> 0:09:27.480
<v Speaker 1>electrode into the brain, a thin wire, and because cells

0:09:27.480 --> 0:09:33.320
<v Speaker 1>are communicating with little electrical signals, you can eavesdrop

0:09:33.440 --> 0:09:36.440
<v Speaker 1>on that and you can also stimulate the cell to

0:09:36.480 --> 0:09:39.800
<v Speaker 1>do whatever. So tell us about the history of this,

0:09:41.080 --> 0:09:43.880
<v Speaker 1>how people have thought about, let's eavesdrop on the brain

0:09:43.960 --> 0:09:45.240
<v Speaker 1>and turn that into something.

0:09:45.480 --> 0:09:49.440
<v Speaker 2>So starting in the sixties and seventies and eighties, especially

0:09:49.480 --> 0:09:52.800
<v Speaker 2>working in animal models, people realized, yeah, you can put

0:09:52.800 --> 0:09:55.720
<v Speaker 2>electrodes into the brain, and you can get up close

0:09:55.760 --> 0:09:58.079
<v Speaker 2>next to an individual brain cell a neuron, and when

0:09:58.080 --> 0:10:01.199
<v Speaker 2>that neuron's firing, it's generating a big electric field, a

0:10:01.240 --> 0:10:03.520
<v Speaker 2>tiny electric field, but big relative to the electrode right

0:10:03.559 --> 0:10:05.160
<v Speaker 2>next to it, And so.

0:10:05.080 --> 0:10:06.520
<v Speaker 3>We know that that neuron is firing.

0:10:06.559 --> 0:10:09.679
<v Speaker 2>And then there was a whole decades of systems neuroscience

0:10:09.679 --> 0:10:13.240
<v Speaker 2>which was relating those patterns of activity to what typically

0:10:13.280 --> 0:10:16.560
<v Speaker 2>the animal was doing. So a classic example from the

0:10:16.559 --> 0:10:20.240
<v Speaker 2>eighties would be a monkey is moving his arm up

0:10:20.320 --> 0:10:22.920
<v Speaker 2>or down, or left or right, and you can see

0:10:22.920 --> 0:10:26.240
<v Speaker 2>that maybe a neuron fires more when the arm is

0:10:26.280 --> 0:10:28.360
<v Speaker 2>moving to the left, and say, okay, that neuron has

0:10:28.360 --> 0:10:30.960
<v Speaker 2>a leftward preferred direction. We're starting to build some

0:10:31.400 --> 0:10:34.800
<v Speaker 2>mental map of how that brain activity relates to movements.

0:10:34.800 --> 0:10:37.240
<v Speaker 2>Of course, it's much more complicated, and the whole field

0:10:37.240 --> 0:10:40.679
<v Speaker 2>of neuroscience is trying to understand how individual neurons and

0:10:40.760 --> 0:10:44.920
<v Speaker 2>hundreds of neurons and whole large assemblies of neurons generate behavior.

0:10:45.320 --> 0:10:50.160
<v Speaker 2>Starting around the two thousands, the field had felt that

0:10:50.240 --> 0:10:53.280
<v Speaker 2>we had enough of a rudimentary understanding of how movement

0:10:53.520 --> 0:10:57.200
<v Speaker 2>is encoded in the brain that this could be used

0:10:57.360 --> 0:10:58.719
<v Speaker 2>for a medical application.

0:10:59.520 --> 0:11:01.240
<v Speaker 3>And kind of in my world.

0:11:01.040 --> 0:11:04.440
<v Speaker 2>That's been focused on restoring movement to people with paralysis.

0:11:04.480 --> 0:11:05.400
<v Speaker 3>So in two.

0:11:05.280 --> 0:11:07.600
<v Speaker 2>Thousand and four it was a big landmark event that

0:11:07.760 --> 0:11:10.319
<v Speaker 2>was the original BrainGate trial. So this was

0:11:10.400 --> 0:11:13.720
<v Speaker 2>led by John Donoghue and Leigh Hochberg at Brown University

0:11:13.720 --> 0:11:16.240
<v Speaker 2>and Massachusetts General Hospital. They put what was called a

0:11:16.240 --> 0:11:18.880
<v Speaker 2>multielectrode array, so instead of a single wire like you

0:11:19.040 --> 0:11:21.600
<v Speaker 2>mentioned in the beginning, now imagine a hundred of those

0:11:21.600 --> 0:11:24.959
<v Speaker 2>little wires kind of all stacked together, recording from thus

0:11:25.040 --> 0:11:29.240
<v Speaker 2>about one hundred neurons. And they showed that these arrays

0:11:29.280 --> 0:11:31.480
<v Speaker 2>could be put in a person with paralysis, and even

0:11:31.520 --> 0:11:34.400
<v Speaker 2>though that person hadn't moved in a decade. I think

0:11:34.600 --> 0:11:36.559
<v Speaker 2>the first guy was a young man in his twenties

0:11:36.600 --> 0:11:39.559
<v Speaker 2>who had been paralyzed from the neck down due to

0:11:39.600 --> 0:11:42.560
<v Speaker 2>a knife wound from like a bar fight. So he

0:11:42.600 --> 0:11:46.000
<v Speaker 2>hadn't moved in many, many years. But they put that

0:11:46.040 --> 0:11:48.600
<v Speaker 2>electrode array in the motor cortex, the part of the

0:11:48.600 --> 0:11:52.199
<v Speaker 2>brain that normally sends commands to the arm, and when

0:11:52.240 --> 0:11:54.680
<v Speaker 2>he tried to move his arm, lo and behold, those

0:11:54.720 --> 0:11:57.960
<v Speaker 2>neurons fired away. And so kind of the main risk

0:11:58.080 --> 0:12:02.080
<v Speaker 2>had been solved, which is would the brain even still

0:12:02.120 --> 0:12:05.040
<v Speaker 2>try to generate movements because you might think, well, use

0:12:05.080 --> 0:12:07.800
<v Speaker 2>it or lose it. Right, the person's paralyzed, why would

0:12:07.800 --> 0:12:10.880
<v Speaker 2>their brain still generate movement commands. Fortunately it still does,

0:12:11.679 --> 0:12:14.640
<v Speaker 2>and people were able to decode those signals.

0:12:14.320 --> 0:12:16.680
<v Speaker 1>And just as a quick reminder to everybody, the brain

0:12:16.800 --> 0:12:18.920
<v Speaker 1>is saying, okay, I want you to make these movements,

0:12:19.000 --> 0:12:21.880
<v Speaker 1>and then those shoot down down the spinal cord and

0:12:21.880 --> 0:12:24.440
<v Speaker 1>out to the peripheral nervous system and move the muscles.

0:12:24.840 --> 0:12:28.200
<v Speaker 1>And so in this case you're hearing the original command,

0:12:28.720 --> 0:12:33.120
<v Speaker 1>but there's some break in the roadway plunging down the

0:12:33.160 --> 0:12:36.120
<v Speaker 1>spinal cord and out such that the body never gets

0:12:36.160 --> 0:12:37.720
<v Speaker 1>the signals correctly.

0:12:37.760 --> 0:12:39.880
<v Speaker 2>Exactly. We're bypassing the injury. We're going to the source. So

0:12:39.920 --> 0:12:41.320
<v Speaker 2>where's the command coming from?

0:12:41.360 --> 0:12:43.320
<v Speaker 1>So this was back in two thousand and four, what

0:12:43.320 --> 0:12:46.360
<v Speaker 1>was his name, Matt Nagle, that researchers were able

0:12:46.400 --> 0:12:49.400
<v Speaker 1>to listen to what the neurons are intending, and then

0:12:49.760 --> 0:12:51.760
<v Speaker 1>the field has really taken off since then in the

0:12:51.800 --> 0:12:56.120
<v Speaker 1>past two decades. For example, with motor movement, originally it

0:12:56.200 --> 0:12:58.680
<v Speaker 1>was just on a computer screen you could move a

0:12:58.679 --> 0:13:03.079
<v Speaker 1>cursor around. Nowadays people are thinking about Hey, could you

0:13:03.160 --> 0:13:06.719
<v Speaker 1>actually use an exoskeleton to move the arm physically?

0:13:07.120 --> 0:13:09.840
<v Speaker 3>Yeah, or even stimulate those paralyzed muscles.

0:13:09.880 --> 0:13:14.880
<v Speaker 2>So there's these functional electrical stimulation systems or epidural spinal stimulation,

0:13:15.000 --> 0:13:17.959
<v Speaker 2>both for walking and for the arm. So you can

0:13:18.320 --> 0:13:20.800
<v Speaker 2>really close the loop. You can decode what movement the

0:13:20.840 --> 0:13:21.559
<v Speaker 2>person's trying to make.

0:13:21.520 --> 0:13:21.559
<v Speaker 3>It.

0:13:21.600 --> 0:13:23.960
<v Speaker 2>Oh, they're trying to move their arm forward to grab something,

0:13:24.559 --> 0:13:26.960
<v Speaker 2>and then you can have that move a robotic arm.

0:13:27.240 --> 0:13:29.880
<v Speaker 2>You could have that move an exoskeleton, or if they

0:13:29.920 --> 0:13:33.480
<v Speaker 2>also have a stimulator that's implanted under the skin with

0:13:33.559 --> 0:13:36.480
<v Speaker 2>wires going to the muscles or going outside of the spine,

0:13:36.679 --> 0:13:39.880
<v Speaker 2>you can stimulate the body and actually have the person's

0:13:39.880 --> 0:13:44.200
<v Speaker 2>own formerly paralyzed muscles make that movement. It's not at

0:13:44.240 --> 0:13:46.280
<v Speaker 2>the level that you or I, like a healthy person,

0:13:46.320 --> 0:13:48.560
<v Speaker 2>is moving their arm, but it does work. There's been

0:13:48.559 --> 0:13:51.280
<v Speaker 2>some really amazing studies in the last decade doing that.

0:13:51.480 --> 0:13:54.080
<v Speaker 1>Yeah, exactly right, Okay, great, So that's how people have

0:13:54.160 --> 0:13:58.679
<v Speaker 1>been using brain computer interfaces to move a paralyzed body. Now,

0:13:58.760 --> 0:14:01.800
<v Speaker 1>something that several groups have gotten interested in in recent

0:14:01.880 --> 0:14:05.480
<v Speaker 1>years is what if somebody can't speak anymore? So, what

0:14:05.520 --> 0:14:08.040
<v Speaker 1>are the reasons. First of all, that somebody can't speak.

0:14:08.360 --> 0:14:11.960
<v Speaker 2>So one common one is neurodegenerative diseases like ALS. So

0:14:12.040 --> 0:14:16.000
<v Speaker 2>ALS is a terrible disease, amyotrophic lateral sclerosis, right? And

0:14:16.080 --> 0:14:18.839
<v Speaker 2>right now there's no cure. We can't stop it with

0:14:19.240 --> 0:14:21.240
<v Speaker 2>a drug or other therapy.

0:14:21.120 --> 0:14:22.560
<v Speaker 1>Also known as Lou Gehrig's disease.

0:14:22.600 --> 0:14:26.200
<v Speaker 2>That's right, yeah, and almost everyone who has ALS will

0:14:26.240 --> 0:14:28.960
<v Speaker 2>gradually lose the ability to move their body. But also

0:14:29.080 --> 0:14:32.640
<v Speaker 2>that means what we call the speech articulators, so their lips,

0:14:32.680 --> 0:14:35.760
<v Speaker 2>their jaw, their tongue, their diaphragm, and so their speech

0:14:35.800 --> 0:14:39.120
<v Speaker 2>becomes harder and harder to understand, and eventually you wind

0:14:39.200 --> 0:14:41.480
<v Speaker 2>up what's called locked in, so really not able to

0:14:41.520 --> 0:14:44.840
<v Speaker 2>move at all. And of course this is a terrible situation.

0:14:45.680 --> 0:14:48.800
<v Speaker 2>And if there were a way to restore the ability

0:14:48.840 --> 0:14:53.480
<v Speaker 2>to communicate, so like before, decoding now not the

0:14:53.560 --> 0:14:55.480
<v Speaker 2>arm movements they're trying to make, or the leg movements,

0:14:55.520 --> 0:14:57.280
<v Speaker 2>but what are the words they're trying to make, or

0:14:57.280 --> 0:14:59.160
<v Speaker 2>what are the movements of those articulators that they're trying

0:14:59.160 --> 0:15:02.600
<v Speaker 2>to make? What sounds are they trying to produce? Then we

0:15:02.640 --> 0:15:05.680
<v Speaker 2>can have this person communicate again and talk again through

0:15:05.720 --> 0:15:06.160
<v Speaker 2>a computer.

0:15:06.440 --> 0:15:08.520
<v Speaker 1>If you want to figure out what somebody is trying

0:15:08.560 --> 0:15:11.120
<v Speaker 1>to say, where do you put the electrodes?

0:15:11.360 --> 0:15:13.400
<v Speaker 3>Yeah, and that is the big question. So there are

0:15:13.400 --> 0:15:14.200
<v Speaker 3>a lot of ideas.

0:15:14.240 --> 0:15:16.720
<v Speaker 2>One idea would be Broca's area, which was thought

0:15:16.760 --> 0:15:21.200
<v Speaker 2>to plan speech. Another idea would be the motor cortex,

0:15:21.240 --> 0:15:26.440
<v Speaker 2>which would be kind of the last planning to command generation.

0:15:26.520 --> 0:15:28.440
<v Speaker 2>So the part of the brain that's really sending signals

0:15:28.480 --> 0:15:32.640
<v Speaker 2>to the muscles. And then there's a wide part of

0:15:32.720 --> 0:15:34.880
<v Speaker 2>the brain that's called the language network.

0:15:34.920 --> 0:15:36.200
<v Speaker 3>So this is the temporal lobe.

0:15:36.800 --> 0:15:39.760
<v Speaker 2>It's canonically thought of for perceiving language, but also heavily

0:15:39.760 --> 0:15:41.840
<v Speaker 2>involved in producing language. So there are a lot of

0:15:41.920 --> 0:15:46.400
<v Speaker 2>possible choices. One of the challenges for developing a speech

0:15:46.400 --> 0:15:49.840
<v Speaker 2>neuroprosthesis is there's no animal model. So when

0:15:50.240 --> 0:15:52.760
<v Speaker 2>the field was trying to have people walk again or

0:15:52.760 --> 0:15:55.360
<v Speaker 2>people move their arms again, we had a huge head

0:15:55.360 --> 0:15:58.160
<v Speaker 2>start because you could say, okay, where can you decode

0:15:58.440 --> 0:16:01.040
<v Speaker 2>the walking or the arm movement of a rat or

0:16:01.080 --> 0:16:04.720
<v Speaker 2>a monkey or another animal. Well, animals don't talk, they

0:16:04.720 --> 0:16:09.360
<v Speaker 2>don't have language, so we don't have that kind of

0:16:09.400 --> 0:16:12.960
<v Speaker 2>guidance for us, and what we do have are less

0:16:13.120 --> 0:16:16.520
<v Speaker 2>precise measurements from other humans. A lot of the really

0:16:16.600 --> 0:16:19.080
<v Speaker 2>important work from the last decade or twenty years was

0:16:19.440 --> 0:16:23.480
<v Speaker 2>done with electrocorticography. So people with epilepsy often will have

0:16:23.840 --> 0:16:26.760
<v Speaker 2>electrodes put under their skull, typically on top of their

0:16:26.800 --> 0:16:30.400
<v Speaker 2>brain or even in their brain, for the neurologists

0:16:30.400 --> 0:16:31.280
<v Speaker 2>to identify

0:16:30.880 --> 0:16:32.160
<v Speaker 3>where the seizure is coming from.

0:16:32.440 --> 0:16:34.040
<v Speaker 2>But these people are then in the hospital for a

0:16:34.040 --> 0:16:36.560
<v Speaker 2>couple of weeks, and this is a gold mine for

0:16:36.720 --> 0:16:39.520
<v Speaker 2>human neuroscience. A lot of what we know about direct

0:16:39.520 --> 0:16:42.760
<v Speaker 2>brain recordings and how they relate to human specific behaviors,

0:16:42.800 --> 0:16:46.480
<v Speaker 2>whether that's speaking or language, or imagination or memory.

0:16:46.760 --> 0:16:48.280
<v Speaker 3>Or mood, all of these things.

0:16:48.440 --> 0:16:51.080
<v Speaker 2>A lot of that comes from this sort of opportunistic

0:16:51.160 --> 0:16:53.240
<v Speaker 2>recording of people who are in the hospital anyway, they're

0:16:53.320 --> 0:16:55.960
<v Speaker 2>kind of bored, they're waiting for the neurologists to have

0:16:56.120 --> 0:16:58.560
<v Speaker 2>enough data, and so it's very easy to ask them, hey, do.

0:16:58.560 --> 0:17:00.680
<v Speaker 3>You want to read a sentence off a screen.

0:17:00.760 --> 0:17:03.960
<v Speaker 2>So from that we already knew that the sensorimotor cortex,

0:17:04.080 --> 0:17:08.879
<v Speaker 2>so the motor and the sensory cortex, was a prime area,

0:17:08.960 --> 0:17:12.000
<v Speaker 2>and in our BrainGate clinical trial, that's where we

0:17:12.080 --> 0:17:15.359
<v Speaker 2>ended up putting electrodes, so in the motor part, basically

0:17:15.680 --> 0:17:17.879
<v Speaker 2>the part of the brain that would typically send commands

0:17:17.920 --> 0:17:18.679
<v Speaker 2>to the muscles.

0:17:18.920 --> 0:17:23.359
<v Speaker 1>Great, so it's essentially like the last train station before

0:17:23.400 --> 0:17:27.440
<v Speaker 1>it plunges down towards the muscles. Okay, so you're eavesdropping

0:17:27.480 --> 0:17:31.679
<v Speaker 1>there and you're sticking these little electrode arrays, these

0:17:31.680 --> 0:17:34.280
<v Speaker 1>little square jobs where they have sixty four electrodes on

0:17:34.280 --> 0:17:35.960
<v Speaker 1>each one, and four of those.

0:17:35.920 --> 0:17:38.560
<v Speaker 2>We used four of them, so yeah, four all along

0:17:38.600 --> 0:17:40.680
<v Speaker 2>this precentral gyrus.

0:17:40.760 --> 0:17:44.640
<v Speaker 1>So you're listening to these neurons and you're trying to

0:17:44.840 --> 0:17:49.760
<v Speaker 1>decode what the person is intending to say from that.

0:17:50.280 --> 0:17:53.600
<v Speaker 1>And one question, were you worried at the beginning that

0:17:53.600 --> 0:17:56.720
<v Speaker 1>that wouldn't be enough data or did you feel like, look,

0:17:56.760 --> 0:17:59.640
<v Speaker 1>with two hundred fifty six neurons, we can figure out

0:17:59.680 --> 0:18:02.240
<v Speaker 1>what's going on in terms of what was trying to

0:18:02.320 --> 0:18:03.080
<v Speaker 1>be articulated.

0:18:03.480 --> 0:18:06.359
<v Speaker 2>When I started the project, I was pretty worried. So

0:18:07.200 --> 0:18:09.360
<v Speaker 2>kind of the prior work is we had shown that

0:18:09.400 --> 0:18:11.679
<v Speaker 2>with about one hundred electrodes in a different part of

0:18:11.720 --> 0:18:14.800
<v Speaker 2>the brain, the hand part of motor cortex, we could

0:18:14.800 --> 0:18:18.479
<v Speaker 2>decode speech, but very poorly. There I was classifying between

0:18:18.480 --> 0:18:22.040
<v Speaker 2>the thirty nine phonemes in American English, if I recall,

0:18:22.119 --> 0:18:25.760
<v Speaker 2>with about thirty three percent accuracy. So that's way better than chance.

0:18:25.800 --> 0:18:27.960
<v Speaker 2>It showed there's information, but that is not good enough

0:18:27.960 --> 0:18:29.280
<v Speaker 2>to understand

0:18:28.880 --> 0:18:29.440
<v Speaker 3>what someone's saying.

0:18:29.480 --> 0:18:30.679
<v Speaker 1>Tell us what a phoneme is.

0:18:31.240 --> 0:18:33.720
<v Speaker 3>A phoneme is a building block of speech.

0:18:33.800 --> 0:18:36.240
<v Speaker 2>So I think most people are familiar with the syllables,

0:18:36.560 --> 0:18:38.320
<v Speaker 2>think of a phoneme as a little bit smaller than that.

0:18:38.440 --> 0:18:43.200
<v Speaker 2>So good, ooh E. Right, there's consonants, there's vowels. Different

0:18:43.280 --> 0:18:47.159
<v Speaker 2>languages have different phonemes, but in English, depending on the

0:18:47.160 --> 0:18:50.880
<v Speaker 2>dialect or accent, between thirty nine and forty one. These are

0:18:50.960 --> 0:18:53.959
<v Speaker 2>the typical ways we break down English.

0:18:54.000 --> 0:18:57.760
<v Speaker 1>Got it. So you're recording from these neurons, and you were saying,

0:18:57.760 --> 0:19:00.720
<v Speaker 1>can I figure out what phoneme the person is trying to

0:19:00.760 --> 0:19:02.919
<v Speaker 1>say right now and right now just from looking at

0:19:02.960 --> 0:19:04.520
<v Speaker 1>this array of neural activity?

0:19:04.720 --> 0:19:05.600
<v Speaker 3>That's exactly right.

0:19:05.680 --> 0:19:09.040
<v Speaker 2>And a little bit before that, my colleagues at Stanford,

0:19:09.080 --> 0:19:10.720
<v Speaker 2>and that was also the lab where I did my

0:19:10.760 --> 0:19:13.800
<v Speaker 2>postdoctoral training, and so I started that project then

0:19:13.840 --> 0:19:17.600
<v Speaker 2>moved on. They had implanted one hundred and twenty eight

0:19:17.720 --> 0:19:22.320
<v Speaker 2>electrodes in the motor cortex of a woman with ALS,

0:19:22.840 --> 0:19:26.000
<v Speaker 2>and with that they were able to decode what words

0:19:26.000 --> 0:19:29.639
<v Speaker 2>she was saying with about seventy five percent accuracy with

0:19:29.680 --> 0:19:31.920
<v Speaker 2>a large vocabulary of one hundred and twenty five thousand words.

0:19:32.080 --> 0:19:35.520
<v Speaker 2>So that was a really really exciting moment for the

0:19:35.520 --> 0:19:38.000
<v Speaker 2>field because that was really banging at the door of

0:19:38.040 --> 0:19:42.639
<v Speaker 2>making this useful for general communication. Now, three out of

0:19:42.640 --> 0:19:45.719
<v Speaker 2>four words correct is amazing. It was way better than

0:19:45.720 --> 0:19:48.320
<v Speaker 2>anything that had ever been done before. But you can't have

0:19:48.359 --> 0:19:50.919
<v Speaker 2>a conversation that way. It's just too frustrating. There's too

0:19:50.920 --> 0:19:51.640
<v Speaker 2>many mistakes.

0:19:52.520 --> 0:19:54.399
<v Speaker 1>And so, well, give us a sense of

0:19:54.400 --> 0:19:57.199
<v Speaker 1>the type of mistake, So the person is intending to

0:19:57.240 --> 0:20:01.119
<v Speaker 1>say the word brain, but the neural activity is decoded

0:20:01.160 --> 0:20:03.440
<v Speaker 1>by the computer, and the computer says, oh, he's trying

0:20:03.440 --> 0:20:05.159
<v Speaker 1>to say panda bear or whatever.

0:20:05.359 --> 0:20:07.800
<v Speaker 3>Well, it could be panda bear. It's more likely...

0:20:07.880 --> 0:20:10.480
<v Speaker 1>So the, the

0:20:11.320 --> 0:20:14.600
<v Speaker 2>way that these systems work is, well, one way they work.

0:20:14.680 --> 0:20:17.280
<v Speaker 2>The way our systems work is we're decoding from neural

0:20:17.280 --> 0:20:20.600
<v Speaker 2>activity to phonemes and then those phonemes get assembled into

0:20:20.640 --> 0:20:22.840
<v Speaker 2>words using a dictionary

0:20:22.440 --> 0:20:23.439
<v Speaker 3>and a language model.

0:20:23.760 --> 0:20:25.720
<v Speaker 2>And in fact, if you look at a dictionary, there's

0:20:25.760 --> 0:20:28.160
<v Speaker 2>that phonetic spelling, which most people don't use, but if

0:20:28.160 --> 0:20:30.520
<v Speaker 2>you want to figure out how to actually pronounce a word,

0:20:30.520 --> 0:20:31.199
<v Speaker 3>you can look at that.

0:20:31.280 --> 0:20:34.120
<v Speaker 2>So the types of mistakes it would more likely make

0:20:34.240 --> 0:20:36.600
<v Speaker 2>would be similar sounding words.

0:20:36.600 --> 0:20:39.800
<v Speaker 3>So if someone's trying to say brain, maybe they'd get barn.

0:20:40.480 --> 0:20:40.920
<v Speaker 1>Yeah.

0:20:40.960 --> 0:20:44.280
<v Speaker 2>And in some contexts you can understand, oh, I hurt

0:20:44.320 --> 0:20:46.720
<v Speaker 2>my barn. I think maybe, you know, you got in

0:20:46.760 --> 0:20:49.240
<v Speaker 2>an accident, you hurt your brain. But if there's enough

0:20:49.280 --> 0:20:51.560
<v Speaker 2>of those, it just kind of breaks down. And the

0:20:51.560 --> 0:20:54.320
<v Speaker 2>analogy I'd give is when you're typing on your smartphone.

0:20:54.320 --> 0:20:56.560
<v Speaker 2>Most of us are a little bit clumsy. We make

0:20:56.560 --> 0:20:59.760
<v Speaker 2>a lot of typos. The autocorrect can help up to

0:20:59.800 --> 0:21:02.879
<v Speaker 2>a point, but there's this sort of steep cliff where

0:21:03.160 --> 0:21:06.200
<v Speaker 2>if we're making too many typos, the autocorrect, so the

0:21:06.280 --> 0:21:08.440
<v Speaker 2>language model cannot keep up, and all of a sudden

0:21:08.720 --> 0:21:10.200
<v Speaker 2>you just get gibberish coming out.

0:21:10.680 --> 0:21:12.920
<v Speaker 3>So that's kind of where things were.

0:21:13.080 --> 0:21:15.280
<v Speaker 2>It wasn't gibberish, right, that's overstating it, but

0:21:15.680 --> 0:21:33.400
<v Speaker 2>it was not there for communication day to day.

0:21:33.520 --> 0:21:36.719
<v Speaker 1>So you worked with a man who is forty five

0:21:36.800 --> 0:21:40.000
<v Speaker 1>years old, if I'm remembering correctly, and he had ALS

0:21:40.240 --> 0:21:43.760
<v Speaker 1>and hadn't articulated in about five years. Is that right?

0:21:43.960 --> 0:21:47.480
<v Speaker 2>Yes, he was severely dysarthric, meaning most people couldn't understand him,

0:21:47.840 --> 0:21:51.080
<v Speaker 2>and he volunteered for this BrainGate2 clinical trial

0:21:51.200 --> 0:21:55.200
<v Speaker 2>that we are one of four sites of, which meant

0:21:55.359 --> 0:21:59.600
<v Speaker 2>that after a bunch of tests and imaging scans and

0:21:59.640 --> 0:22:02.600
<v Speaker 2>other things, once we determined that it was a good

0:22:02.640 --> 0:22:04.800
<v Speaker 2>fit and it was safe to move forward, he had

0:22:04.800 --> 0:22:08.560
<v Speaker 2>this surgery where Doctor Brandman, my collaborator, put these four

0:22:08.960 --> 0:22:11.600
<v Speaker 2>multielectrode arrays into his speech motor cortex.

0:22:12.400 --> 0:22:14.240
<v Speaker 3>We waited a couple of weeks

0:22:13.920 --> 0:22:16.720
<v Speaker 2>for everything to heal up, and then we went to

0:22:16.760 --> 0:22:19.280
<v Speaker 2>his house where all of our equipment was already pre staged.

0:22:19.840 --> 0:22:23.320
<v Speaker 2>We literally plugged him in. So this system is wired,

0:22:23.400 --> 0:22:26.480
<v Speaker 2>so it's not wireless yet. And the way we started

0:22:26.520 --> 0:22:29.320
<v Speaker 2>it was we needed what's called training data in the

0:22:29.359 --> 0:22:32.640
<v Speaker 2>machine learning sense, so we needed the algorithms to see

0:22:33.040 --> 0:22:35.479
<v Speaker 2>a bunch of examples of him trying to say words,

0:22:35.480 --> 0:22:37.600
<v Speaker 2>and then what the neural activity looked like, and what

0:22:37.680 --> 0:22:40.240
<v Speaker 2>this actually looked like in the room was picture a

0:22:40.240 --> 0:22:43.399
<v Speaker 2>person in a wheelchair looking at a computer screen. We

0:22:43.520 --> 0:22:46.480
<v Speaker 2>put up what seemed like random sentences. The text would appear,

0:22:46.480 --> 0:22:48.879
<v Speaker 2>it would turn green, he would try to speak, and

0:22:48.920 --> 0:22:50.639
<v Speaker 2>then he would stop. And we just did this for

0:22:50.640 --> 0:22:53.199
<v Speaker 2>about thirty minutes. And one of the big questions at

0:22:53.240 --> 0:22:55.040
<v Speaker 2>the time was how much data do you need to

0:22:55.040 --> 0:22:58.560
<v Speaker 2>make this work? And the conventional wisdom was

0:22:58.560 --> 0:23:01.000
<v Speaker 2>that it would take a lot of data. Previous studies

0:23:01.600 --> 0:23:04.919
<v Speaker 2>had waited many, many weeks before they tried to decode

0:23:04.920 --> 0:23:08.560
<v Speaker 2>what someone was trying to say. The AI fields that

0:23:08.600 --> 0:23:12.240
<v Speaker 2>we were borrowing tools from, for example, automated dictation when

0:23:12.240 --> 0:23:14.760
<v Speaker 2>you talk to your smartphone, those models are trained with

0:23:15.160 --> 0:23:20.280
<v Speaker 2>millions of hours, so huge scraped data sets, to get

0:23:20.280 --> 0:23:24.600
<v Speaker 2>them to be able to understand speech. But it turned

0:23:24.640 --> 0:23:26.720
<v Speaker 2>out that because we had these electrodes in the part

0:23:26.760 --> 0:23:29.600
<v Speaker 2>of the brain that's controlling speech movements, it

0:23:29.640 --> 0:23:31.720
<v Speaker 2>has what's called a very high signal to noise ratio.

0:23:31.800 --> 0:23:35.800
<v Speaker 2>There's a really clear signal about what movements the body's

0:23:35.840 --> 0:23:38.600
<v Speaker 2>trying to make and thus what sounds it's trying to produce.

0:23:39.040 --> 0:23:42.080
<v Speaker 2>And so after just thirty minutes of him reading these sentences,

0:23:42.480 --> 0:23:44.680
<v Speaker 2>we were looking at our little dashboard on the side

0:23:44.680 --> 0:23:46.800
<v Speaker 2>on our computers and it was showing us what we

0:23:46.880 --> 0:23:48.879
<v Speaker 2>call the word error rate, or the phoneme error rate,

0:23:49.000 --> 0:23:51.920
<v Speaker 2>so how many words or phonemes were being incorrectly decoded.

0:23:52.359 --> 0:23:54.360
<v Speaker 2>And we saw that that was at the point where

0:23:54.359 --> 0:23:56.159
<v Speaker 2>we thought, okay, this thing can actually work, and so

0:23:56.200 --> 0:23:58.399
<v Speaker 2>we said, okay, now we're gonna do something very special.

0:23:58.480 --> 0:24:01.399
<v Speaker 2>We're gonna kind of flip the switch, so to speak, and

0:24:01.480 --> 0:24:03.480
<v Speaker 2>now as you try to speak, you're going to see

0:24:03.480 --> 0:24:05.800
<v Speaker 2>words hopefully appearing at the bottom of the screen. And

0:24:05.840 --> 0:24:08.960
<v Speaker 2>we have a cool video of this, and so everyone's

0:24:09.000 --> 0:24:12.920
<v Speaker 2>kind of holding their breath and very excited, and the

0:24:12.960 --> 0:24:15.439
<v Speaker 2>prompt appeared, and he tries to speak, and the first

0:24:15.440 --> 0:24:19.560
<v Speaker 2>two words appeared correctly, and actually, at that point everyone

0:24:19.800 --> 0:24:22.480
<v Speaker 2>broke out in tears and laughter and clapping.

0:24:22.520 --> 0:24:23.720
<v Speaker 3>We actually paused

0:24:23.359 --> 0:24:26.720
<v Speaker 2>for a few minutes and hugs, and his family was

0:24:26.720 --> 0:24:29.160
<v Speaker 2>there to watch it. It was a really amazing moment, and

0:24:29.200 --> 0:24:31.520
<v Speaker 2>then we said, all right, let's get back to work,

0:24:31.880 --> 0:24:34.520
<v Speaker 2>and we kept going. And on that day we had

0:24:34.520 --> 0:24:36.840
<v Speaker 2>set a relatively modest goal. So we were using what's

0:24:36.840 --> 0:24:40.120
<v Speaker 2>called a fifty word vocabulary, meaning the sentences he could

0:24:40.119 --> 0:24:43.199
<v Speaker 2>say with it were restricted to fifty words, and you

0:24:43.200 --> 0:24:46.439
<v Speaker 2>can still say a few things, and that's obviously not

0:24:46.760 --> 0:24:49.480
<v Speaker 2>pragmatically useful, but that was just to get going.

0:24:50.000 --> 0:24:52.960
<v Speaker 2>We had less than a one percent error rate using

0:24:53.040 --> 0:24:55.720
<v Speaker 2>this fifty word vocabulary, so almost every word was correct.

0:24:56.359 --> 0:24:56.960
<v Speaker 3>That was huge.

0:24:56.960 --> 0:25:01.280
<v Speaker 2>So we'd already established that, like some previous clinical trial participants,

0:25:01.640 --> 0:25:03.800
<v Speaker 2>his brain was still active when he was trying to speak.

0:25:03.880 --> 0:25:05.879
<v Speaker 2>So good, all right, that was one of

0:25:05.920 --> 0:25:09.240
<v Speaker 2>the bigger risks. Were we getting good neural signals

0:25:09.240 --> 0:25:12.320
<v Speaker 2>from these electrode arrays? Yes, we were getting beautiful neural signals,

0:25:12.359 --> 0:25:14.399
<v Speaker 2>in fact, some of the best I've seen in my career.

0:25:14.640 --> 0:25:16.840
<v Speaker 2>And then did we need a ton of data? And

0:25:17.119 --> 0:25:19.320
<v Speaker 2>the answer was no, we were getting enough that we

0:25:19.359 --> 0:25:22.840
<v Speaker 2>could train these machine learning algorithms to map the neural

0:25:22.880 --> 0:25:24.919
<v Speaker 2>activity patterns to the words.

0:25:24.920 --> 0:25:27.320
<v Speaker 1>Okay. And for the listeners, I'm going to link the video

0:25:27.600 --> 0:25:30.240
<v Speaker 1>which shows when the family started to cry and so

0:25:30.320 --> 0:25:33.720
<v Speaker 1>I found that very moving. And so how long will

0:25:33.760 --> 0:25:39.400
<v Speaker 1>these electrodes last? And you'd be able to get good

0:25:39.480 --> 0:25:40.439
<v Speaker 1>signal out of this?

0:25:40.600 --> 0:25:44.480
<v Speaker 2>For Casey, that is a key question, and the answer is

0:25:44.520 --> 0:25:47.600
<v Speaker 2>we just don't know. So at this point he has

0:25:47.640 --> 0:25:50.240
<v Speaker 2>had this for about two years. We just had a

0:25:50.240 --> 0:25:53.760
<v Speaker 2>preprint a few months ago showing that out past six

0:25:53.840 --> 0:25:56.760
<v Speaker 2>hundred and fifty days the system is still going strong.

0:25:56.880 --> 0:26:00.959
<v Speaker 2>So this is huge because there was always some concern

0:26:01.000 --> 0:26:03.639
<v Speaker 2>that maybe these electrodes would stop recording neurons after a

0:26:03.680 --> 0:26:05.680
<v Speaker 2>few months or so.

0:26:06.080 --> 0:26:09.000
<v Speaker 1>And why? Is it because of scar tissue building up around

0:26:09.040 --> 0:26:09.800
<v Speaker 1>the electrode?

0:26:09.880 --> 0:26:12.520
<v Speaker 2>There are a lot of potential factors. So yeah, whenever

0:26:12.560 --> 0:26:15.720
<v Speaker 2>you have a foreign body in the brain, the body

0:26:15.760 --> 0:26:19.280
<v Speaker 2>and the brain do not want that thing. So scar

0:26:19.359 --> 0:26:22.240
<v Speaker 2>tissue can form. It can be at the microscale, just around

0:26:22.280 --> 0:26:25.800
<v Speaker 2>the electrode tip, which makes it harder to record individual neurons.

0:26:25.680 --> 0:26:28.720
<v Speaker 2>Think of it like you're moving further

0:26:28.760 --> 0:26:31.879
<v Speaker 2>away from someone you're listening to, or there's padding between

0:26:31.920 --> 0:26:33.600
<v Speaker 2>you and them. It kind of muffles the signal.

0:26:34.200 --> 0:26:36.000
<v Speaker 2>It could be at more of a macro scale

0:26:36.000 --> 0:26:38.680
<v Speaker 2>where it can actually pull the electrodes out of the brain,

0:26:38.720 --> 0:26:40.360
<v Speaker 2>and that's happened in some other studies.

0:26:40.440 --> 0:26:42.440
<v Speaker 1>The way that your skin pushes a splinter out.

0:26:42.600 --> 0:26:45.679
<v Speaker 2>Yeah, I think that's a good analogy. So that's on

0:26:45.760 --> 0:26:49.960
<v Speaker 2>the biological response. Also, these are electrodes, so the materials

0:26:50.000 --> 0:26:53.240
<v Speaker 2>can fail, The insulation can fail over time, the metal

0:26:53.280 --> 0:26:55.760
<v Speaker 2>can get kind of chipped away or eaten away,

0:26:56.119 --> 0:27:01.000
<v Speaker 2>the wires could disconnect, and there's a lot of failure modes,

0:27:01.359 --> 0:27:05.120
<v Speaker 2>but in this case, the record so far is really, really encouraging.

0:27:05.160 --> 0:27:08.639
<v Speaker 2>So two years out, it's working great. The accuracy has

0:27:08.640 --> 0:27:10.920
<v Speaker 2>actually gotten better, and in our preprint it's now ninety nine

0:27:10.920 --> 0:27:13.560
<v Speaker 2>percent accurate, both because we have more data and we've

0:27:13.600 --> 0:27:15.760
<v Speaker 2>had more time to just improve the algorithms and keep

0:27:15.840 --> 0:27:18.800
<v Speaker 2>trying new things. And he is now using this as

0:27:18.800 --> 0:27:20.280
<v Speaker 2>his primary means of communication.

0:27:20.560 --> 0:27:22.600
<v Speaker 1>And so a couple of things. One is, when you

0:27:22.680 --> 0:27:25.359
<v Speaker 1>decode the neural activity, you could just print that as

0:27:25.480 --> 0:27:27.879
<v Speaker 1>words on the screen, but you guys went a step further.

0:27:28.520 --> 0:27:32.640
<v Speaker 2>Yeah, So in our first few months, what we did

0:27:32.720 --> 0:27:34.919
<v Speaker 2>is called text to speech, So the words would appear

0:27:34.960 --> 0:27:38.040
<v Speaker 2>as text on the screen initially, and then when a

0:27:38.080 --> 0:27:40.199
<v Speaker 2>whole utterance, so a sentence or it could be

0:27:40.200 --> 0:27:43.440
<v Speaker 2>a whole paragraph, he would use his eyes to look

0:27:43.440 --> 0:27:45.440
<v Speaker 2>at a button on the screen and basically there's a

0:27:45.480 --> 0:27:48.320
<v Speaker 2>done button, and after he hits the done button, the

0:27:48.440 --> 0:27:51.600
<v Speaker 2>computer will read out loud what he said, and we

0:27:51.680 --> 0:27:53.720
<v Speaker 2>basically made a deepfake of his voice, so it

0:27:53.800 --> 0:27:56.560
<v Speaker 2>sounds a lot like he did before he got ALS.

0:27:56.840 --> 0:27:59.440
<v Speaker 2>It's not perfect, but it really does sound quite a

0:27:59.440 --> 0:28:02.280
<v Speaker 2>lot like him. Technology has progressed a lot, even in

0:28:02.280 --> 0:28:04.879
<v Speaker 2>the last couple of years. Most of the time people

0:28:04.880 --> 0:28:08.400
<v Speaker 2>worry about all the ill uses of faking someone's voice,

0:28:08.400 --> 0:28:10.640
<v Speaker 2>but this is maybe one of the few cases where

0:28:10.640 --> 0:28:12.000
<v Speaker 2>it's actually a really wonderful thing.

0:28:12.400 --> 0:28:15.560
<v Speaker 1>So you got his voice from videos when he was younger,

0:28:15.560 --> 0:28:17.159
<v Speaker 1>before the ALS had set in.

0:28:17.480 --> 0:28:19.479
<v Speaker 2>Yeah, we asked him and his family and they provided

0:28:19.560 --> 0:28:21.200
<v Speaker 2>us a bunch of things. And actually he had done

0:28:21.200 --> 0:28:25.440
<v Speaker 2>a podcast before, so we had really good material.

0:28:25.640 --> 0:28:29.440
<v Speaker 1>So when he thinks of a sentence, the neural activity is decoded,

0:28:29.480 --> 0:28:34.440
<v Speaker 1>the sentence gets reconstructed, and then you turn it into

0:28:34.520 --> 0:28:37.200
<v Speaker 1>his voice. Yes. Now, that's what you showed in twenty

0:28:37.240 --> 0:28:39.600
<v Speaker 1>twenty four, and you just recently had a paper five

0:28:39.600 --> 0:28:41.680
<v Speaker 1>months ago or so. Tell us about that.

0:28:42.120 --> 0:28:45.360
<v Speaker 2>Yeah, So everything before, even though it could be said

0:28:45.400 --> 0:28:48.920
<v Speaker 2>out loud, ultimately the information is in the form of text.

0:28:49.880 --> 0:28:52.320
<v Speaker 2>And I think we can all appreciate that a lot

0:28:52.400 --> 0:28:54.360
<v Speaker 2>gets lost just through text.

0:28:55.600 --> 0:28:56.959
<v Speaker 3>There's no intonation.

0:28:57.200 --> 0:29:02.239
<v Speaker 2>You can't indicate that maybe you're being sarcastic. It's less expressive. Right,

0:29:02.240 --> 0:29:05.120
<v Speaker 2>There's a lot of rich nuance that we all convey

0:29:05.520 --> 0:29:08.400
<v Speaker 2>in our voice and through text that's lost, and the

0:29:08.440 --> 0:29:11.960
<v Speaker 2>other problem is the latency or the immediacy. So if

0:29:12.040 --> 0:29:14.600
<v Speaker 2>I was talking to you and I could only write,

0:29:15.240 --> 0:29:18.040
<v Speaker 2>it would be very easy for you to accidentally interrupt me,

0:29:18.520 --> 0:29:20.480
<v Speaker 2>or for me to just not be able

0:29:20.480 --> 0:29:23.160
<v Speaker 2>to get a word in, because by the time I've

0:29:23.360 --> 0:29:25.800
<v Speaker 2>finished a sentence and selected a button to speak it

0:29:25.800 --> 0:29:28.360
<v Speaker 2>out loud, maybe you've already moved on to the next topic.

0:29:28.440 --> 0:29:31.880
<v Speaker 2>Maybe if there's other people in the room, they're talking right. So,

0:29:32.240 --> 0:29:34.400
<v Speaker 2>for all of these reasons, we really wanted to do

0:29:34.760 --> 0:29:36.240
<v Speaker 2>not what we call brain to text, but what we

0:29:36.280 --> 0:29:39.200
<v Speaker 2>call brain to voice, and that means go immediately from

0:29:39.240 --> 0:29:42.880
<v Speaker 2>neural activity to sound. This is a hard problem for a

0:29:42.880 --> 0:29:45.000
<v Speaker 2>lot of reasons, one of which is it has to

0:29:45.000 --> 0:29:48.160
<v Speaker 2>be super fast. You want sound to happen within

0:29:48.200 --> 0:29:52.160
<v Speaker 2>about thirty milliseconds. That's kind of matching the natural latency

0:29:52.200 --> 0:29:56.120
<v Speaker 2>of brain to moving the muscles to vibrating air that

0:29:56.600 --> 0:30:00.520
<v Speaker 2>someone can hear. And so because of that, first of all,

0:30:00.520 --> 0:30:03.200
<v Speaker 2>we had to decode these neural signals very quickly. It

0:30:03.320 --> 0:30:06.000
<v Speaker 2>limits the kind of algorithms we can use. We have

0:30:06.120 --> 0:30:08.400
<v Speaker 2>less data to work with. Right, you can't look into

0:30:08.440 --> 0:30:11.520
<v Speaker 2>the future, there's no autocorrect. You can't look at the

0:30:11.640 --> 0:30:15.200
<v Speaker 2>entire sentence to figure out based on context, like, Oh,

0:30:15.200 --> 0:30:17.959
<v Speaker 2>I reached down to pet the cot. No, you probably

0:30:17.960 --> 0:30:20.960
<v Speaker 2>meant cat because you don't usually pet a cot. You

0:30:21.000 --> 0:30:23.720
<v Speaker 2>can't do that if you're doing brain to voice. As

0:30:23.720 --> 0:30:25.640
<v Speaker 2>soon as you try to say I, you need to

0:30:25.640 --> 0:30:29.160
<v Speaker 2>have the sound I, reached. Right. It just has to

0:30:29.360 --> 0:30:33.640
<v Speaker 2>flow constantly. But we were able to, through a bunch

0:30:33.640 --> 0:30:38.200
<v Speaker 2>of complicated engineering work, get really far in there. And

0:30:38.400 --> 0:30:40.240
<v Speaker 2>where the state of the art in that paper that

0:30:40.280 --> 0:30:43.719
<v Speaker 2>you're referring to is, it is very immediate. So

0:30:43.760 --> 0:30:49.200
<v Speaker 2>the latency is under thirty milliseconds, and it's mostly intelligible,

0:30:49.200 --> 0:30:51.920
<v Speaker 2>but not consistently intelligible. So about fifty six percent of

0:30:51.960 --> 0:30:56.120
<v Speaker 2>words could be understood by someone. It's a big step forward,

0:30:56.160 --> 0:30:58.720
<v Speaker 2>but it's not good enough for daily use. Right. I

0:30:58.760 --> 0:31:01.000
<v Speaker 2>already said earlier that three out of four words is

0:31:01.040 --> 0:31:03.440
<v Speaker 2>not good enough, So you know, one out of two

0:31:03.480 --> 0:31:04.840
<v Speaker 2>words is definitely not good enough.

0:31:05.040 --> 0:31:07.440
<v Speaker 1>So when there's a mistake, what kind of mistake is it?

0:31:07.480 --> 0:31:11.920
<v Speaker 1>Is it barn for brain and therefore sort of intelligible,

0:31:12.000 --> 0:31:13.080
<v Speaker 1>or is it worse than that?

0:31:13.720 --> 0:31:16.800
<v Speaker 2>Yeah, it tends to sound like slurred speech, or maybe

0:31:16.840 --> 0:31:20.480
<v Speaker 2>like if someone's mumbling, so sometimes you can get the

0:31:20.560 --> 0:31:23.040
<v Speaker 2>gist of it. The length tends to be the same

0:31:23.040 --> 0:31:26.120
<v Speaker 2>because it's still capturing what we call the envelope of speech.

0:31:26.200 --> 0:31:28.440
<v Speaker 2>So if you're saying a short word or a long word,

0:31:28.640 --> 0:31:31.800
<v Speaker 2>that comes through very clearly, but maybe some of

0:31:31.800 --> 0:31:33.640
<v Speaker 2>the phonemes are a little garbled, and so you can't

0:31:33.840 --> 0:31:35.680
<v Speaker 2>tell exactly what's being said.

0:31:35.920 --> 0:31:39.960
<v Speaker 1>Got it, Because each phoneme that the brain is encoding for,

0:31:40.160 --> 0:31:43.040
<v Speaker 1>you're translating that right away. Thirty milliseconds later that's

0:31:43.080 --> 0:31:44.080
<v Speaker 1>coming out of the speaker.

0:31:44.360 --> 0:31:47.080
<v Speaker 2>Yeah, we just don't have enough signal to noise ratio.

0:31:47.080 --> 0:31:49.160
<v Speaker 2>We don't have enough precision. So it's like if you

0:31:49.200 --> 0:31:52.640
<v Speaker 2>have a really bad digital camera, really grainy camera, and

0:31:52.680 --> 0:31:55.120
<v Speaker 2>you're trying to parse the scene. You know, sometimes you

0:31:55.160 --> 0:31:56.920
<v Speaker 2>can see what's going on, and other times you just

0:31:57.080 --> 0:32:00.040
<v Speaker 2>can't quite make it out. You know, is that a

0:32:00.080 --> 0:32:01.640
<v Speaker 2>person or a ball?

0:32:01.760 --> 0:32:01.959
<v Speaker 3>Is that,

0:32:02.040 --> 0:32:05.560
<v Speaker 2>you know, what does that word say? If it's really grainy,

0:32:05.880 --> 0:32:07.720
<v Speaker 2>you just can't see so well. And although we have

0:32:07.760 --> 0:32:10.040
<v Speaker 2>two hundred and fifty six electrodes, which sounds like a lot,

0:32:10.680 --> 0:32:14.000
<v Speaker 2>the brain has almost one hundred billion neurons. There's probably

0:32:14.120 --> 0:32:17.320
<v Speaker 2>multiple billions that are involved in just speech and language.

0:32:17.360 --> 0:32:20.120
<v Speaker 2>So in some ways it's a miracle that it works at all,

0:32:20.160 --> 0:32:23.120
<v Speaker 2>that we're sampling from such a small number of neurons

0:32:23.360 --> 0:32:26.040
<v Speaker 2>and able to reconstruct the sounds that the person's trying

0:32:26.040 --> 0:32:26.280
<v Speaker 2>to make.

0:32:27.200 --> 0:32:30.280
<v Speaker 1>And if I'm remembering in that paper, you also showed

0:32:31.440 --> 0:32:32.800
<v Speaker 1>sort of short singing.

0:32:33.120 --> 0:32:37.240
<v Speaker 2>Yeah, So we wanted to demonstrate that this approach could

0:32:37.320 --> 0:32:41.480
<v Speaker 2>do more than just transmit the words, because we kind

0:32:41.480 --> 0:32:44.000
<v Speaker 2>of already had that with brain to text. Now it

0:32:44.040 --> 0:32:46.520
<v Speaker 2>could do it immediately, so that solves that interruption or

0:32:46.560 --> 0:32:49.040
<v Speaker 2>being heard right away problem. But we wanted to provide

0:32:49.040 --> 0:32:51.480
<v Speaker 2>a proof of concept that this could also be expressive,

0:32:51.600 --> 0:32:54.479
<v Speaker 2>so we had a couple experiments that did that. In

0:32:54.520 --> 0:32:56.400
<v Speaker 2>one of them, he was asked to say sentences as

0:32:56.440 --> 0:32:59.440
<v Speaker 2>either a question or a statement. And in English, when

0:32:59.440 --> 0:33:01.520
<v Speaker 2>we ask a question, we increase the pitch at

0:33:01.560 --> 0:33:03.720
<v Speaker 2>the end, So he was able to do that. We

0:33:03.760 --> 0:33:06.400
<v Speaker 2>had him emphasize specific words, and you know, you use

0:33:06.480 --> 0:33:09.000
<v Speaker 2>that to change the meaning of what you're saying. So

0:33:09.160 --> 0:33:12.360
<v Speaker 2>this is a classic sentence, from a different study, that you

0:33:12.360 --> 0:33:14.560
<v Speaker 2>can say in seven different ways, which is I never

0:33:14.600 --> 0:33:17.480
<v Speaker 2>said she stole my money. Now I can say I

0:33:17.520 --> 0:33:20.440
<v Speaker 2>never said she stole my money. I never said she

0:33:20.560 --> 0:33:23.880
<v Speaker 2>stole my money. Right, I'm slightly changing the connotation depending

0:33:23.920 --> 0:33:25.920
<v Speaker 2>on which word I'm stressing. And so we had a

0:33:25.960 --> 0:33:28.800
<v Speaker 2>task where he said that sentence emphasizing all the different

0:33:28.800 --> 0:33:30.760
<v Speaker 2>words and lo and behold.

0:33:30.800 --> 0:33:30.960
<v Speaker 1>Yes.

0:33:31.000 --> 0:33:34.200
<v Speaker 2>From the neural activity, we could identify which word he was stressing.

0:33:34.240 --> 0:33:36.280
<v Speaker 2>And so then we had another task where we would

0:33:36.320 --> 0:33:38.120
<v Speaker 2>give him a sentence and we would capitalize a word

0:33:38.400 --> 0:33:40.080
<v Speaker 2>and he was supposed to emphasize that. And then the

0:33:40.120 --> 0:33:42.640
<v Speaker 2>last one, which is what you were referring to, is what we

0:33:42.720 --> 0:33:47.080
<v Speaker 2>call a simple singing task. So it was only three notes,

0:33:47.200 --> 0:33:49.640
<v Speaker 2>but basically he could say whatever he wanted to say,

0:33:49.640 --> 0:33:52.000
<v Speaker 2>but at three different pitch levels, so you could say,

0:33:52.000 --> 0:33:54.960
<v Speaker 2>you know, like bah bah bah or like you know,

0:33:55.320 --> 0:34:00.280
<v Speaker 2>la law da. So that task he was able to

0:34:00.360 --> 0:34:03.680
<v Speaker 2>do quite well. He's not going to be singing in

0:34:03.720 --> 0:34:06.880
<v Speaker 2>the opera yet, but it shows the path forward and

0:34:07.520 --> 0:34:10.440
<v Speaker 2>where our lab and many others are working now is

0:34:10.800 --> 0:34:12.560
<v Speaker 2>how do we build on this? So does that mean

0:34:12.960 --> 0:34:17.360
<v Speaker 2>better algorithms? There's always new innovations in the artificial intelligence

0:34:17.360 --> 0:34:20.200
<v Speaker 2>world and just neuroscience making sense of these signals.

0:34:20.440 --> 0:34:21.960
<v Speaker 3>Does that mean putting more electrodes

0:34:22.000 --> 0:34:22.080
<v Speaker 1>in?

0:34:22.200 --> 0:34:24.480
<v Speaker 2>Certainly that's of interest, and there's a lot of really

0:34:24.480 --> 0:34:28.320
<v Speaker 2>exciting work happening in there. Does that mean maybe putting

0:34:28.320 --> 0:34:32.040
<v Speaker 2>electrodes in additional parts of the brain, so kind of

0:34:32.040 --> 0:34:35.160
<v Speaker 2>at a simplistic level, people think of left versus right

0:34:35.200 --> 0:34:37.600
<v Speaker 2>brain as having some differences with maybe more of these

0:34:37.760 --> 0:34:41.680
<v Speaker 2>what are called paralinguistic elements of voice encoded more on

0:34:41.719 --> 0:34:44.239
<v Speaker 2>the right side of the brain. That's something we'd like

0:34:44.320 --> 0:34:46.120
<v Speaker 2>to find out and we hope to in the future,

0:34:46.880 --> 0:34:48.799
<v Speaker 2>or do we need to put it in other parts

0:34:48.840 --> 0:34:50.160
<v Speaker 2>of the speech network.

0:34:50.200 --> 0:34:53.040
<v Speaker 1>By the way, just to flesh that out for listeners.

0:34:53.719 --> 0:34:55.160
<v Speaker 1>You know, on the left side of the brain, you've

0:34:55.200 --> 0:34:58.880
<v Speaker 1>got a lot involved with language. When people get damage there,

0:34:59.239 --> 0:35:03.680
<v Speaker 1>they let's say, lose the ability to articulate, to produce sentences,

0:35:03.680 --> 0:35:07.560
<v Speaker 1>to understand sentences. But when people get damage in equivalent

0:35:07.600 --> 0:35:10.239
<v Speaker 1>areas mirror images on the right side, they can get

0:35:10.239 --> 0:35:12.840
<v Speaker 1>what's called amusia, which is the inability to understand

0:35:12.960 --> 0:35:16.319
<v Speaker 1>music anymore. Because as you say, that's where intonation, the

0:35:16.400 --> 0:35:20.839
<v Speaker 1>prosody of language seems to be encoded. So good, this

0:35:20.920 --> 0:35:23.040
<v Speaker 1>is a good segue into the future, then, which is

0:35:24.040 --> 0:35:27.600
<v Speaker 1>first of all, I'm curious what you think is the

0:35:27.680 --> 0:35:31.440
<v Speaker 1>answer you just posed. Is it getting better electrodes, more electrodes,

0:35:31.520 --> 0:35:34.319
<v Speaker 1>is it getting better algorithms? Is there a limitation in

0:35:34.360 --> 0:35:39.880
<v Speaker 1>the signal-to-noise ratio? Where's the lowest hanging fruit

0:35:39.960 --> 0:35:41.239
<v Speaker 1>for getting improvements here?

0:35:41.760 --> 0:35:44.279
<v Speaker 3>Can I go with D, all of the above? I

0:35:44.320 --> 0:35:46.000
<v Speaker 3>think we do need all of these things.

0:35:46.239 --> 0:35:50.560
<v Speaker 2>So already we are seeing with our data and this

0:35:50.600 --> 0:35:54.439
<v Speaker 2>current participant that with the same electrodes, we are able

0:35:54.480 --> 0:35:57.279
<v Speaker 2>to squeeze more information out with better algorithms and just

0:35:57.480 --> 0:35:59.600
<v Speaker 2>better understanding what the brain is doing. And there's a

0:35:59.640 --> 0:36:02.399
<v Speaker 2>lot going on there. It's not just the movements. We're

0:36:02.400 --> 0:36:07.480
<v Speaker 2>seeing things like neural error signals. We're seeing prosody and

0:36:07.520 --> 0:36:10.160
<v Speaker 2>intonation encoded. Right. All of these things are kind of

0:36:10.520 --> 0:36:14.560
<v Speaker 2>mixed together in these brain signals we're measuring, and there's

0:36:14.560 --> 0:36:17.239
<v Speaker 2>a lot of science that goes into disentangling them and

0:36:17.239 --> 0:36:19.000
<v Speaker 2>figuring out what they mean. What are you trying to

0:36:19.000 --> 0:36:22.640
<v Speaker 2>pay attention to for a given application? So that's all moving forward,

0:36:23.320 --> 0:36:25.200
<v Speaker 2>and so we're just learning a ton about how the

0:36:25.239 --> 0:36:28.920
<v Speaker 2>human brain produces speech because we didn't have this opportunity

0:36:28.960 --> 0:36:31.880
<v Speaker 2>at this precision before. There's now only a handful of

0:36:31.960 --> 0:36:34.719
<v Speaker 2>humans in the whole world that have had electrodes that

0:36:34.760 --> 0:36:37.359
<v Speaker 2>measure individual neurons as they try to speak. So we're

0:36:37.400 --> 0:36:41.160
<v Speaker 2>learning a lot, but certainly more electrodes is better, So

0:36:41.360 --> 0:36:43.400
<v Speaker 2>in our trial as we move forward, we intend to

0:36:43.400 --> 0:36:45.880
<v Speaker 2>put more electrodes in. There are now multiple companies that

0:36:45.920 --> 0:36:49.719
<v Speaker 2>are building fully implanted intracortical electrodes, so similar type of

0:36:49.719 --> 0:36:53.200
<v Speaker 2>electrodes that go right up to the neurons, but they

0:36:53.200 --> 0:36:56.600
<v Speaker 2>all have a thousand or more electrodes or recording sites.

0:36:57.080 --> 0:36:59.000
<v Speaker 2>So we're talking about at least a four x if

0:36:59.040 --> 0:37:03.120
<v Speaker 2>not more improvement in the density or the count of electrodes.

0:37:03.120 --> 0:37:05.400
<v Speaker 2>And I think that's going to make everything work just

0:37:05.600 --> 0:37:06.400
<v Speaker 2>so much better.

0:37:06.800 --> 0:37:09.480
<v Speaker 1>And of course companies are working on making this wireless

0:37:09.520 --> 0:37:12.960
<v Speaker 1>as well, Neuralink being I guess the first one to

0:37:13.040 --> 0:37:15.800
<v Speaker 1>do it, but other companies moving that way as well,

0:37:16.360 --> 0:37:19.480
<v Speaker 1>so that you could have something that's fully packaged and

0:37:19.520 --> 0:37:23.040
<v Speaker 1>a person can just speak with no wires hanging out.

0:37:23.360 --> 0:37:25.400
<v Speaker 3>Yeah, that is very important.

0:37:25.400 --> 0:37:29.200
<v Speaker 2>So the wired systems we have now, they are what

0:37:29.320 --> 0:37:32.800
<v Speaker 2>is available. They're good for research; they're in some ways simpler.

0:37:33.360 --> 0:37:37.000
<v Speaker 2>They've been shown to be safe for quite a long time,

0:37:37.400 --> 0:37:39.799
<v Speaker 2>but they're limiting, right? Fully implanted is the way to go,

0:37:39.840 --> 0:37:42.879
<v Speaker 2>and we can look at other medical devices. So there's

0:37:42.880 --> 0:37:47.240
<v Speaker 2>these wild photos of pacemakers in the fifties and it

0:37:47.320 --> 0:37:50.480
<v Speaker 2>was basically like a car battery on a cart with

0:37:50.640 --> 0:37:53.880
<v Speaker 2>you know, some amplifiers and kind of primitive. They're not computers,

0:37:53.920 --> 0:37:56.760
<v Speaker 2>they're electronics, and then there's a wire going to someone's chest.

0:37:57.520 --> 0:37:59.880
<v Speaker 3>It kept them alive and it showed that this worked.

0:38:00.400 --> 0:38:03.080
<v Speaker 2>But of course today millions and millions of people are

0:38:03.080 --> 0:38:07.160
<v Speaker 2>walking around very healthy with pacemakers that are small and

0:38:07.200 --> 0:38:10.680
<v Speaker 2>they're packaged in titanium or other very inert safe materials.

0:38:11.640 --> 0:38:12.440
<v Speaker 3>They have batteries.

0:38:12.600 --> 0:38:15.319
<v Speaker 2>Some of them now can be wirelessly recharged. So I

0:38:15.320 --> 0:38:18.640
<v Speaker 2>think this is a well trodden path and we're going

0:38:18.680 --> 0:38:21.200
<v Speaker 2>to absolutely see this with brain computer interfaces. They're going

0:38:21.239 --> 0:38:23.680
<v Speaker 2>to be fully implanted, they're going to be wireless. Data

0:38:23.719 --> 0:38:26.160
<v Speaker 2>is going to come out through radio or lasers or

0:38:26.160 --> 0:38:28.920
<v Speaker 2>other means to get data out of the brain, and

0:38:29.160 --> 0:38:31.279
<v Speaker 2>power is going to go in and it's going to

0:38:31.280 --> 0:38:31.960
<v Speaker 2>be great.

0:38:32.280 --> 0:38:34.120
<v Speaker 1>Great. Now, okay, let me ask you this. A lot of

0:38:34.160 --> 0:38:36.799
<v Speaker 1>people are very familiar with Neuralink. They've heard about it.

0:38:36.880 --> 0:38:38.839
<v Speaker 1>Even though as I mentioned, this idea of recording from

0:38:38.840 --> 0:38:40.640
<v Speaker 1>brains has been happening for a very long time.

0:38:40.960 --> 0:38:41.120
<v Speaker 2>Now.

0:38:41.120 --> 0:38:45.839
<v Speaker 1>What Neuralink is doing is implanting very tiny electrodes robotically,

0:38:46.040 --> 0:38:49.040
<v Speaker 1>and it's fully implantable, and so that's part of why

0:38:49.040 --> 0:38:50.880
<v Speaker 1>it's famous. But also part of why it's famous

0:38:50.920 --> 0:38:55.040
<v Speaker 1>is because it's Elon and there's this mystique about it,

0:38:55.080 --> 0:38:59.640
<v Speaker 1>the sort of idea that everyone will someday get a Neuralink.

0:39:00.280 --> 0:39:03.080
<v Speaker 1>Now I have my doubts because it's an open head

0:39:03.080 --> 0:39:06.280
<v Speaker 1>surgery still, even though it's with the robot. But let's

0:39:06.280 --> 0:39:11.359
<v Speaker 1>look towards the future in terms of what use would

0:39:11.400 --> 0:39:14.720
<v Speaker 1>it be to have a brain computer interface for somebody

0:39:14.760 --> 0:39:16.920
<v Speaker 1>without a problem speaking or moving.

0:39:17.320 --> 0:39:21.080
<v Speaker 2>Yeah, I don't think that application, the killer app so

0:39:21.200 --> 0:39:22.960
<v Speaker 2>to speak, has been discovered yet.

0:39:23.040 --> 0:39:25.719
<v Speaker 3>You know, there's times where I'm lying

0:39:25.480 --> 0:39:27.080
<v Speaker 2>in bed and I kind of wish I could send

0:39:27.120 --> 0:39:29.000
<v Speaker 2>a text message without having to reach for my phone.

0:39:29.040 --> 0:39:30.759
<v Speaker 2>But I'm not going to get a brain surgery to

0:39:30.800 --> 0:39:32.640
<v Speaker 2>do that. I'm going to just reach for my phone.

0:39:32.920 --> 0:39:36.160
<v Speaker 2>So what I think we're going to see is a

0:39:36.200 --> 0:39:39.680
<v Speaker 2>widening of the medical applications. So I think there's gonna

0:39:39.680 --> 0:39:43.320
<v Speaker 2>be many, many more medical needs that can be addressed

0:39:43.320 --> 0:39:48.440
<v Speaker 2>with brain technology, whether stroke, things like sustaining memory in

0:39:48.480 --> 0:39:52.120
<v Speaker 2>the longer term, or dealing with age related decline or

0:39:52.120 --> 0:39:54.520
<v Speaker 2>even Alzheimer's. So there's going to be different types of

0:39:54.600 --> 0:39:59.000
<v Speaker 2>BCIs for different problems. But in terms of fully implanted,

0:39:59.080 --> 0:40:03.520
<v Speaker 2>kind of invasive BCIs for really healthy people, no one

0:40:03.560 --> 0:40:09.280
<v Speaker 2>has yet shown a benefit that I think is worthwhile. Now,

0:40:09.400 --> 0:40:12.920
<v Speaker 2>could I imagine it? Certainly one could imagine it. So,

0:40:13.600 --> 0:40:15.520
<v Speaker 2>you know, if you could have a device in your brain,

0:40:15.680 --> 0:40:19.160
<v Speaker 2>let's say it would allow you to feel more alert

0:40:19.280 --> 0:40:21.640
<v Speaker 2>or to sleep less, right, so kind of modulating some

0:40:22.120 --> 0:40:26.120
<v Speaker 2>circadian rhythms or energy level or attention. One could imagine

0:40:26.120 --> 0:40:28.799
<v Speaker 2>that that kind of like a performance enhancing drug that

0:40:28.840 --> 0:40:33.040
<v Speaker 2>could be done with a neurotechnology or neural interface. But

0:40:33.120 --> 0:40:35.680
<v Speaker 2>no one's done that yet in a way that's compelling.

0:40:36.560 --> 0:40:38.680
<v Speaker 2>People have talked about could it be kind of like

0:40:38.680 --> 0:40:41.279
<v Speaker 2>a coprocessor for your brain, like you know, somehow you

0:40:41.360 --> 0:40:45.400
<v Speaker 2>just know things. It's like having a smart AI assistant,

0:40:45.440 --> 0:40:48.040
<v Speaker 2>but it's inside your mind and it's much more seamless.

0:40:49.280 --> 0:40:51.040
<v Speaker 3>But that is a really long way away.

0:40:51.080 --> 0:40:53.640
<v Speaker 2>I mean, we have we're struggling to get you know,

0:40:54.040 --> 0:40:57.040
<v Speaker 2>crude vision in so people can read a page. Now,

0:40:57.080 --> 0:40:59.759
<v Speaker 2>I mean, that's amazing, that's like very state of the art.

0:41:00.120 --> 0:41:04.160
<v Speaker 2>Or someone can slowly walk who has a spinal cord injury,

0:41:04.640 --> 0:41:08.680
<v Speaker 2>or someone can talk but not as eloquently as before

0:41:08.719 --> 0:41:11.200
<v Speaker 2>their ALS or before their stroke. So, given where we

0:41:11.239 --> 0:41:13.760
<v Speaker 2>are now, I think we're quite a ways away from

0:41:13.800 --> 0:41:15.640
<v Speaker 2>like beaming information in.

0:41:15.719 --> 0:41:32.479
<v Speaker 1>Oh, I totally agree with you on that. I do wonder

0:41:32.560 --> 0:41:35.440
<v Speaker 1>twenty five years from now, let's say, right if you

0:41:35.560 --> 0:41:37.400
<v Speaker 1>just took a shortcut and said, okay, look, I

0:41:37.440 --> 0:41:40.279
<v Speaker 1>want to listen to your covert speech, things you're not

0:41:40.320 --> 0:41:42.239
<v Speaker 1>saying out loud, and then I want to plug the

0:41:42.280 --> 0:41:44.719
<v Speaker 1>answer right back into your auditory cortex as though

0:41:44.760 --> 0:41:47.600
<v Speaker 1>you're hearing it, and then you know, beam wirelessly to

0:41:47.800 --> 0:41:50.719
<v Speaker 1>OpenAI or whatever exists in twenty five years from now. Yeah,

0:41:50.760 --> 0:41:53.480
<v Speaker 1>the question is could you ask a question and hear

0:41:53.520 --> 0:41:55.360
<v Speaker 1>the answer that way?

0:41:55.719 --> 0:41:58.880
<v Speaker 2>My prediction is yes, I think that could be done.

0:41:59.080 --> 0:42:00.319
<v Speaker 2>I mean also, I think that could be done the

0:42:00.360 --> 0:42:03.840
<v Speaker 2>next five years. It just would still require a surgery

0:42:04.040 --> 0:42:06.880
<v Speaker 2>to be done accurately, And so would anyone want it?

0:42:07.000 --> 0:42:10.600
<v Speaker 2>Would we as a society choose to allow it?

0:42:10.600 --> 0:42:13.160
<v Speaker 3>It gets into debates of people's agency over their health.

0:42:13.320 --> 0:42:15.319
<v Speaker 1>Are there moral or ethical questions about that.

0:42:15.480 --> 0:42:18.759
<v Speaker 2>I think these are just general kind of medical and

0:42:18.840 --> 0:42:23.920
<v Speaker 2>societal questions of do we allow people to take medical

0:42:24.040 --> 0:42:27.560
<v Speaker 2>risks to get certain abilities that they otherwise wouldn't have.

0:42:28.120 --> 0:42:30.840
<v Speaker 1>One of the issues is about brain privacy, right, the

0:42:30.960 --> 0:42:34.640
<v Speaker 1>question of let's say I'm doing something that's recording my

0:42:34.880 --> 0:42:37.239
<v Speaker 1>covert thoughts, by which I mean, you know something that

0:42:37.280 --> 0:42:39.719
<v Speaker 1>I'm thinking, but I haven't actually pushed it out to

0:42:39.760 --> 0:42:43.080
<v Speaker 1>my motor cortex to say it yet. Who's the company

0:42:43.080 --> 0:42:48.520
<v Speaker 1>who has access to that? Do I want anybody accessing that?

0:42:49.080 --> 0:42:51.440
<v Speaker 2>I think that's yeah, that's a real concern. We're not

0:42:51.520 --> 0:42:54.400
<v Speaker 2>there yet, so to be clear, there's no BCI that

0:42:54.400 --> 0:42:56.960
<v Speaker 2>can decode covert thought yet.

0:42:57.000 --> 0:42:59.839
<v Speaker 1>Exactly. I'm talking twenty five years from now. Yeah, I mean,

0:43:00.000 --> 0:43:03.080
<v Speaker 1>this is one of the conundrums about where this is heading.

0:43:03.440 --> 0:43:06.920
<v Speaker 2>Well, we're already dealing with inklings of that. So, for example,

0:43:06.960 --> 0:43:10.279
<v Speaker 2>in our system, because our participant is using this for

0:43:10.320 --> 0:43:12.520
<v Speaker 2>his day to day life. For example, one thing that

0:43:12.520 --> 0:43:15.600
<v Speaker 2>we implemented was a privacy mode where if he toggles

0:43:15.600 --> 0:43:19.120
<v Speaker 2>a button, it no longer saves that data. This is

0:43:19.120 --> 0:43:22.239
<v Speaker 2>an academic clinical trial. In general, we're really loath to

0:43:22.239 --> 0:43:24.359
<v Speaker 2>give up any data I mean, it's so precious and

0:43:24.360 --> 0:43:28.359
<v Speaker 2>then these people are making these commitments to science, but

0:43:28.520 --> 0:43:30.239
<v Speaker 2>we also want to be respectful that he might need

0:43:30.280 --> 0:43:32.759
<v Speaker 2>to have a really private conversation and we don't want

0:43:32.800 --> 0:43:35.520
<v Speaker 2>to even have any ability to access that. So that's

0:43:35.560 --> 0:43:38.160
<v Speaker 2>already something we're dealing with in the context of a

0:43:38.239 --> 0:43:41.480
<v Speaker 2>medical trial from an academic medical center. I think this

0:43:41.520 --> 0:43:44.640
<v Speaker 2>is a very high trust scenario. Of course, when you

0:43:44.640 --> 0:43:47.200
<v Speaker 2>have companies that are building these, we're going to want

0:43:47.200 --> 0:43:49.360
<v Speaker 2>to think about what rights do, in that

0:43:49.440 --> 0:43:53.080
<v Speaker 2>case patients or customers have to the data? Can the

0:43:53.160 --> 0:43:55.799
<v Speaker 2>data be used to improve the algorithms? Who owns the

0:43:55.840 --> 0:43:59.320
<v Speaker 2>benefit of that? What happens if a government subpoenas

0:43:59.360 --> 0:44:02.000
<v Speaker 3>it? Right now, we have

0:44:02.000 --> 0:44:05.720
<v Speaker 2>this speech BCI for people with vocal tract paralysis, meaning

0:44:05.760 --> 0:44:08.239
<v Speaker 2>that they know exactly what they're trying to say. The

0:44:08.280 --> 0:44:10.720
<v Speaker 2>words are clearly formed in their mind. They are trying

0:44:10.719 --> 0:44:14.880
<v Speaker 2>to speak it. Those commands are not reaching the muscles. Okay,

0:44:15.000 --> 0:44:18.520
<v Speaker 2>So we've shown that there is a very compelling therapy there.

0:44:19.120 --> 0:44:22.400
<v Speaker 2>Industry is going to come in and kind of productize it.

0:44:22.480 --> 0:44:24.480
<v Speaker 2>I think this is going to turn into a medical device

0:44:24.840 --> 0:44:27.680
<v Speaker 2>in the next five years. There is a much larger

0:44:27.960 --> 0:44:31.920
<v Speaker 2>patient population though with aphasia due to stroke, So there

0:44:32.360 --> 0:44:35.360
<v Speaker 2>the problem is one step further upstream, meaning.

0:44:35.160 --> 0:44:36.799
<v Speaker 1>I mean they can't speak language, by the way. Aphasia.

0:44:36.960 --> 0:44:38.040
<v Speaker 3>Yes, well, there's different types.

0:44:38.080 --> 0:44:41.520
<v Speaker 2>So sometimes within aphasia that means they can't understand language,

0:44:41.560 --> 0:44:45.320
<v Speaker 2>but with expressive aphasia that means in many patients' cases

0:44:45.440 --> 0:44:49.359
<v Speaker 2>they want to communicate, they really know what they're trying

0:44:49.400 --> 0:44:51.560
<v Speaker 2>to say in sort of in a meaning sense, but

0:44:51.640 --> 0:44:53.799
<v Speaker 2>they can't find the right words for it. It's almost like,

0:44:54.600 --> 0:44:57.000
<v Speaker 2>you know, sometimes I can't remember a word, but that's

0:44:57.120 --> 0:44:59.320
<v Speaker 2>rare and I can usually remember it or explain in

0:44:59.320 --> 0:45:02.160
<v Speaker 2>other words. But if I couldn't remember most of the words,

0:45:02.480 --> 0:45:04.520
<v Speaker 2>that would be really frustrating and debilitating.

0:45:04.520 --> 0:45:05.600
<v Speaker 3>And there's millions of

0:45:05.520 --> 0:45:09.160
<v Speaker 2>people that have strokes and partially recover but never fully recover.

0:45:09.960 --> 0:45:12.880
<v Speaker 2>They have a language disorder. Many of them have perfectly

0:45:12.920 --> 0:45:17.200
<v Speaker 2>normal intelligence and their personality is preserved and kind of everything

0:45:17.200 --> 0:45:19.840
<v Speaker 2>else is there, but they just can't form words.

0:45:21.200 --> 0:45:22.000
<v Speaker 3>Can we help them?

0:45:22.040 --> 0:45:24.840
<v Speaker 2>And this is something that our lab and many others

0:45:24.880 --> 0:45:27.520
<v Speaker 2>are starting to think about. The idea is, can we

0:45:27.560 --> 0:45:30.160
<v Speaker 2>basically do this thing that we've done with a speech BCI,

0:45:30.239 --> 0:45:33.200
<v Speaker 2>but now make a language BCI? Can we put electrodes

0:45:33.600 --> 0:45:36.080
<v Speaker 2>somewhere in the language network and that is a lot

0:45:36.120 --> 0:45:38.359
<v Speaker 2>of the brain that's both a good and a bad thing.

0:45:39.239 --> 0:45:41.799
<v Speaker 3>Could we decode the meaning? And this

0:45:41.800 --> 0:45:43.439
<v Speaker 2>is kind of getting close to this idea of a thought,

0:45:43.440 --> 0:45:45.799
<v Speaker 2>which is not a very well defined term, but could

0:45:45.800 --> 0:45:47.800
<v Speaker 2>we decode the semantic meaning of what they're trying to

0:45:47.840 --> 0:45:50.719
<v Speaker 2>communicate and have let's say, a tablet in front of

0:45:50.760 --> 0:45:53.680
<v Speaker 2>them print out a sentence or speak a sentence where

0:45:53.680 --> 0:45:56.320
<v Speaker 2>they're saying, I'm happy to see you, or could you

0:45:56.400 --> 0:45:59.319
<v Speaker 2>hand me some water? Or my nose itches or I'm

0:45:59.320 --> 0:46:02.880
<v Speaker 2>not feeling well. Right, that thought, that communication intent

0:46:03.040 --> 0:46:06.440
<v Speaker 2>is still in there for many of these patients. We're

0:46:06.520 --> 0:46:10.120
<v Speaker 2>trying to develop a medical technology to help them, but

0:46:10.719 --> 0:46:13.400
<v Speaker 2>that starts getting pretty close to sounding like mind reading.

0:46:14.239 --> 0:46:17.960
<v Speaker 2>And so yeah, I think as an ethical question this

0:46:18.040 --> 0:46:22.279
<v Speaker 2>will potentially become relevant in the coming years if this

0:46:22.600 --> 0:46:24.120
<v Speaker 2>medical project succeeds.

0:46:24.360 --> 0:46:26.799
<v Speaker 1>It's interesting because we mean different things by mind reading.

0:46:26.840 --> 0:46:29.160
<v Speaker 1>There are all these different levels of it, so even

0:46:29.200 --> 0:46:33.400
<v Speaker 1>what somebody is trying to say often masks what they're thinking.

0:46:33.719 --> 0:46:36.480
<v Speaker 1>I'm trying to remember this quotation from the poet Oliver Goldsmith,

0:46:36.480 --> 0:46:39.279
<v Speaker 1>who said something like I think the real purpose of

0:46:39.400 --> 0:46:43.239
<v Speaker 1>language is not to communicate intent but to hide it.

0:46:44.480 --> 0:46:49.239
<v Speaker 1>So anyway, so if somebody says, hey, you know, I'm

0:46:49.239 --> 0:46:51.279
<v Speaker 1>happy to see you, or I you know, whatever the

0:46:51.320 --> 0:46:53.120
<v Speaker 1>thing is they're saying, it may or may not be

0:46:53.200 --> 0:46:55.960
<v Speaker 1>what their thoughts actually are. That's what their language is.

0:46:56.200 --> 0:46:59.040
<v Speaker 2>Yeah, so we're still talking. I'm still talking about decoding

0:46:59.040 --> 0:47:02.000
<v Speaker 2>communication intent, and that's sort of, I think, we

0:47:02.040 --> 0:47:04.440
<v Speaker 2>find it a little bit reassuring because it's an active process.

0:47:04.480 --> 0:47:08.120
<v Speaker 2>It's not like, right now we're nowhere close, no

0:47:08.160 --> 0:47:09.680
<v Speaker 2>one even has an inkling of how to make a

0:47:09.680 --> 0:47:13.239
<v Speaker 2>device that can like read everything you know. You know,

0:47:13.239 --> 0:47:15.319
<v Speaker 2>you're not actively thinking about it, but it just knows

0:47:15.400 --> 0:47:18.160
<v Speaker 2>your whole childhood and all your deepest secrets and you

0:47:18.160 --> 0:47:21.040
<v Speaker 2>know what you think about everyone around you. That I

0:47:21.080 --> 0:47:22.880
<v Speaker 2>would not even know how to start to do that,

0:47:23.440 --> 0:47:26.879
<v Speaker 2>But for thinking what you're thinking actively or what you're

0:47:26.880 --> 0:47:31.560
<v Speaker 2>trying to communicate, that seems plausible. And there's some studies

0:47:31.680 --> 0:47:34.560
<v Speaker 2>using imaging that kind of you know, can do above

0:47:34.640 --> 0:47:37.520
<v Speaker 2>chance decoding of what someone's trying to communicate. We have

0:47:37.560 --> 0:47:39.879
<v Speaker 2>some preliminary data others do as well, So I think

0:47:40.160 --> 0:47:40.880
<v Speaker 2>that might happen.

0:47:41.080 --> 0:47:43.160
<v Speaker 1>So let me ask you a few things. When will

0:47:43.200 --> 0:47:44.880
<v Speaker 1>paralysis be solved?

0:47:44.960 --> 0:47:50.279
<v Speaker 2>I think there will be approved BCIs for paralysis in

0:47:50.360 --> 0:47:53.640
<v Speaker 2>about five years. That doesn't mean they'll be available everywhere.

0:47:53.960 --> 0:47:57.040
<v Speaker 2>They might be only available in certain markets. Maybe only

0:47:57.040 --> 0:48:00.200
<v Speaker 2>a few hospitals will initially be providing them, but that

0:48:00.200 --> 0:48:01.000
<v Speaker 2>will grow rapidly.

0:48:01.200 --> 0:48:01.960
<v Speaker 3>Will it mean

0:48:01.920 --> 0:48:05.360
<v Speaker 2>paralysis is cured? I think that's too strong a term.

0:48:06.080 --> 0:48:08.520
<v Speaker 2>Maybe that means you can walk slowly, you can move

0:48:08.560 --> 0:48:10.719
<v Speaker 2>your arm, but you maybe can't tie your shoelace

0:48:10.800 --> 0:48:11.280
<v Speaker 3>initially.

0:48:11.680 --> 0:48:14.239
<v Speaker 2>You can move a computer cursor really well, but that's

0:48:14.239 --> 0:48:15.720
<v Speaker 2>not the same thing as playing the piano.

0:48:16.120 --> 0:48:18.240
<v Speaker 3>So I think the capabilities will keep getting better.

0:48:18.600 --> 0:48:23.600
<v Speaker 1>And with ALS and dysarthria, where someone can't articulate well,

0:48:24.600 --> 0:48:25.680
<v Speaker 1>what are we looking at? Your prediction?

0:48:26.040 --> 0:48:28.240
<v Speaker 3>It's actually the same.

0:48:28.360 --> 0:48:32.439
<v Speaker 2>I think that the speech brain computer interfaces are going

0:48:32.480 --> 0:48:36.839
<v Speaker 2>to move very fast. I think that and cursor will

0:48:36.880 --> 0:48:39.239
<v Speaker 2>probably be one of the first approved systems, even though

0:48:39.239 --> 0:48:42.920
<v Speaker 2>people have been trying to move robot arms or paralyzed limbs

0:48:42.760 --> 0:48:43.640
<v Speaker 3>for much longer.

0:48:43.880 --> 0:48:46.720
<v Speaker 2>So if you're trying to decode what someone's trying to say,

0:48:47.200 --> 0:48:49.600
<v Speaker 2>or decode them trying to move a computer cursor or

0:48:49.719 --> 0:48:52.200
<v Speaker 2>write with a keyboard, the thing that they're trying to

0:48:52.200 --> 0:48:55.200
<v Speaker 2>control is a computer, and those are ubiquitous, they're everywhere, they're

0:48:55.040 --> 0:48:56.480
<v Speaker 3>cheap, they work really well.

0:48:56.760 --> 0:48:58.919
<v Speaker 2>If you're trying to decode what someone's trying to move

0:48:59.000 --> 0:49:02.400
<v Speaker 2>with their arm, you either need to move a robot arm.

0:49:02.680 --> 0:49:06.319
<v Speaker 2>Robot arms are hard, they break often, they're not as

0:49:06.480 --> 0:49:07.680
<v Speaker 2>precise as people are.

0:49:07.960 --> 0:49:09.520
<v Speaker 3>You know, where does it go? Does it go on

0:49:09.520 --> 0:49:10.239
<v Speaker 3>your wheelchair?

0:49:10.320 --> 0:49:13.000
<v Speaker 2>Is it there with you in the shower, if it's

0:49:13.040 --> 0:49:16.239
<v Speaker 2>mounted on, like if you have an amputation, is

0:49:16.160 --> 0:49:19.279
<v Speaker 3>it mounted on your stump or on your shoulder? That

0:49:19.400 --> 0:49:20.920
<v Speaker 3>is hard. There's a lot of challenges there.

0:49:22.080 --> 0:49:25.719
<v Speaker 2>So kind of the readout part for speech is very

0:49:25.719 --> 0:49:27.720
<v Speaker 2>hard because it's very fast. There's a lot of information

0:49:27.800 --> 0:49:31.839
<v Speaker 2>per second. But once you have that solved, making use

0:49:31.880 --> 0:49:33.840
<v Speaker 2>of it is actually really easy. You just send texts

0:49:33.840 --> 0:49:35.919
<v Speaker 2>to their computer or their phone, or you have their

0:49:36.200 --> 0:49:40.000
<v Speaker 2>tablet talk, make sound, and that's something you can carry

0:49:40.000 --> 0:49:41.759
<v Speaker 2>with you all the time and it's really reliable. So

0:49:42.120 --> 0:49:44.080
<v Speaker 2>because for all those reasons, I think we're going to

0:49:44.120 --> 0:49:49.880
<v Speaker 2>have speech and also computer use BCIs hopefully starting to

0:49:49.960 --> 0:49:51.360
<v Speaker 2>hit the market in the next five years.

0:49:51.760 --> 0:49:54.440
<v Speaker 1>Great and when you think about fifty years from now,

0:49:54.480 --> 0:49:58.239
<v Speaker 1>when you think about as you're retiring and you look

0:49:58.280 --> 0:50:00.160
<v Speaker 1>around the field, what do you see?

0:50:00.880 --> 0:50:03.560
<v Speaker 2>I think BCIs will be well, the term may not

0:50:03.600 --> 0:50:06.120
<v Speaker 2>even mean anything because it's going to be so wide.

0:50:06.880 --> 0:50:09.640
<v Speaker 2>I think many of the diseases that we struggle with

0:50:09.680 --> 0:50:12.360
<v Speaker 2>today are going to be treated with some sort of

0:50:12.400 --> 0:50:15.040
<v Speaker 2>technology inside the head or interacting with the head.

0:50:15.120 --> 0:50:16.560
<v Speaker 3>Maybe it's somehow not

0:50:16.600 --> 0:50:20.279
<v Speaker 2>invasive, whether that's paralysis, which is going to be I

0:50:20.280 --> 0:50:24.240
<v Speaker 2>think much faster than that. Or will we have systems

0:50:24.239 --> 0:50:27.960
<v Speaker 2>that help us regulate our mood, Will they treat psychiatric issues,

0:50:28.040 --> 0:50:31.440
<v Speaker 2>Will they perhaps reconnect parts of the brain that have

0:50:31.520 --> 0:50:35.400
<v Speaker 2>been disconnected due to aging or damage, or injury or stroke.

0:50:36.200 --> 0:50:38.840
<v Speaker 2>If we're talking about fifty years, a lot can happen

0:50:38.840 --> 0:50:41.880
<v Speaker 2>in fifty years, right, I mean technology is moving very quickly.

0:50:42.480 --> 0:50:45.400
<v Speaker 2>The interfaces will get better. So instead of talking about

0:50:45.800 --> 0:50:47.960
<v Speaker 2>instead of me being right now excited about recording from

0:50:48.000 --> 0:50:51.560
<v Speaker 2>a thousand neurons, in fifty years, could we be interfacing

0:50:51.560 --> 0:50:53.600
<v Speaker 2>with one hundred thousand or a million neurons.

0:50:53.880 --> 0:50:55.160
<v Speaker 3>I think that's really plausible.

0:50:56.320 --> 0:51:01.719
<v Speaker 2>Through tiny nano wires or biohybrids or focused beams that

0:51:01.760 --> 0:51:02.600
<v Speaker 2>are non invasive.

0:51:02.840 --> 0:51:03.640
<v Speaker 3>A lot can happen

0:51:03.640 --> 0:51:05.840
<v Speaker 2>in fifty years. Our neuroscience, I think, will be a

0:51:05.840 --> 0:51:06.600
<v Speaker 2>lot more advanced.

0:51:06.800 --> 0:51:09.359
<v Speaker 3>We will not be limited. Right now,

0:51:09.400 --> 0:51:12.480
<v Speaker 2>we mostly understand the periphery. We understand movement, we understand

0:51:12.480 --> 0:51:15.840
<v Speaker 2>the senses really well because it's really easy to experimentally

0:51:15.960 --> 0:51:16.760
<v Speaker 2>manipulate those.

0:51:17.239 --> 0:51:18.279
<v Speaker 3>As soon as you get

0:51:18.160 --> 0:51:22.360
<v Speaker 2>into the kind of the inside, the center: cognition, intelligence,

0:51:22.400 --> 0:51:26.400
<v Speaker 2>how do we problem solve, creativity, we don't understand that

0:51:26.440 --> 0:51:29.000
<v Speaker 2>really well, but I think in fifty years we will.

0:51:29.480 --> 0:51:31.640
<v Speaker 2>And part of that is because as we make these

0:51:31.840 --> 0:51:36.000
<v Speaker 2>medical systems, we will have access to human brains. So

0:51:36.200 --> 0:51:38.200
<v Speaker 2>think of this as a flywheel. So let's say someone

0:51:38.239 --> 0:51:40.880
<v Speaker 2>has a few thousand electrodes because they have a stroke

0:51:40.920 --> 0:51:44.000
<v Speaker 2>and they want to communicate. Maybe these are spread across

0:51:44.040 --> 0:51:46.360
<v Speaker 2>several different brain areas because you get different pieces of it.

0:51:46.480 --> 0:51:49.320
<v Speaker 2>Or maybe you get the prosody in one area primarily

0:51:49.400 --> 0:51:51.359
<v Speaker 2>and you get what they're trying to say in the

0:51:51.400 --> 0:51:54.960
<v Speaker 2>motor cortex. But you get some planning benefit and language

0:51:54.960 --> 0:51:56.960
<v Speaker 2>benefit from the temporal lobe. Okay, so let's say you

0:51:56.960 --> 0:52:00.879
<v Speaker 2>have four or five six areas that you're recording from. Well,

0:52:00.880 --> 0:52:02.920
<v Speaker 2>now you have a wealth of information that you can

0:52:03.000 --> 0:52:04.960
<v Speaker 2>use for other things. So some of these patients are

0:52:04.960 --> 0:52:10.000
<v Speaker 2>going to develop dementia over time, or they might be depressed,

0:52:10.440 --> 0:52:14.719
<v Speaker 2>or they might have OCD, And instead of having to

0:52:14.800 --> 0:52:17.120
<v Speaker 2>do a new brain implant with all the new risks

0:52:17.120 --> 0:52:18.600
<v Speaker 2>of that, you can just look at the data you're

0:52:18.600 --> 0:52:21.360
<v Speaker 2>already collecting and try to relate that to their mood

0:52:21.520 --> 0:52:23.719
<v Speaker 2>or what are they looking at? What are they trying

0:52:23.719 --> 0:52:26.759
<v Speaker 2>to remember? Oh, they're trying to remember where they put

0:52:26.800 --> 0:52:30.880
<v Speaker 2>their keys. Hey, Actually, because we have electrodes in the

0:52:30.920 --> 0:52:34.040
<v Speaker 2>temporal lobe, it's close to the hippocampus, it's cortex, it's

0:52:34.080 --> 0:52:36.400
<v Speaker 2>part of the memory system as well, everything's kind of

0:52:36.440 --> 0:52:40.120
<v Speaker 2>spread out. Well, maybe now we're seeing some neural correlate of

0:52:40.400 --> 0:52:44.640
<v Speaker 2>that memory process. Maybe we can even ask if they're

0:52:44.640 --> 0:52:48.080
<v Speaker 2>willing to do another clinical trial where we stimulate and

0:52:48.520 --> 0:52:50.759
<v Speaker 2>try to boost that memory, try to kind of help

0:52:50.840 --> 0:52:53.759
<v Speaker 2>nudge it to be remembered correctly. I think when we're talking about

0:52:53.760 --> 0:52:56.520
<v Speaker 2>fifty years that's going to happen. And so through this

0:52:56.560 --> 0:52:59.600
<v Speaker 2>process we're going to learn a lot more about how

0:52:59.640 --> 0:53:01.680
<v Speaker 2>the human mind works and thus how to fix it.

0:53:06.200 --> 0:53:09.360
<v Speaker 1>That was my interview with Sergey Stavisky, a neuroscientist at

0:53:09.480 --> 0:53:13.400
<v Speaker 1>UC Davis and co-director of the Neuroprosthetics Lab.

0:53:13.840 --> 0:53:17.120
<v Speaker 1>We talked about what BCIs can do, what they might

0:53:17.160 --> 0:53:20.759
<v Speaker 1>do soon, and how we'll navigate the human questions that

0:53:20.800 --> 0:53:23.759
<v Speaker 1>they raise. What we talked about today was how a

0:53:24.000 --> 0:53:28.480
<v Speaker 1>person's intention can find its way back into the world

0:53:28.920 --> 0:53:33.240
<v Speaker 1>when bodies have lost function. Brain computer interfaces are opening

0:53:33.320 --> 0:53:36.960
<v Speaker 1>a new lane right now. These technologies are crude in

0:53:37.040 --> 0:53:39.759
<v Speaker 1>some ways, but they're getting better fast. Each year they

0:53:39.760 --> 0:53:42.520
<v Speaker 1>get a little faster and more expressive. So this is

0:53:42.600 --> 0:53:48.040
<v Speaker 1>how BCIs can restore autonomy and intimacy and dignity. And

0:53:48.120 --> 0:53:51.279
<v Speaker 1>when it's done right, you don't see the technology at all.

0:53:51.560 --> 0:53:54.520
<v Speaker 1>You just see the person again. So here's how I

0:53:54.560 --> 0:53:57.440
<v Speaker 1>see it. In the next five years, BCIs are going

0:53:57.520 --> 0:54:02.440
<v Speaker 1>to start looking less like research projects and more like appliances.

0:54:02.680 --> 0:54:06.560
<v Speaker 1>We're going to have fully implantable systems for communication. In

0:54:06.600 --> 0:54:08.239
<v Speaker 1>other words, at some point in the future, we'll be

0:54:08.280 --> 0:54:12.759
<v Speaker 1>looking at a small surgery, a wireless puck that goes in,

0:54:13.239 --> 0:54:16.440
<v Speaker 1>and a setup that takes minutes instead of hours. You'll

0:54:16.680 --> 0:54:20.480
<v Speaker 1>turn on your speech BCI or your BCI that controls

0:54:20.480 --> 0:54:24.520
<v Speaker 1>a computer cursor, and the key thing will be reliability,

0:54:24.880 --> 0:54:30.080
<v Speaker 1>these decoders will hold steady through years, and also identity.

0:54:30.680 --> 0:54:33.760
<v Speaker 1>The voice is going to sound just like you, your cadence,

0:54:33.840 --> 0:54:36.520
<v Speaker 1>your prosody, your humor at the end of a sentence.

0:54:36.960 --> 0:54:40.880
<v Speaker 1>Maybe rehab teams will have a neural therapist who tunes

0:54:40.920 --> 0:54:44.520
<v Speaker 1>your decoder the way that an audiologist tunes a cochlear implant.

0:54:44.760 --> 0:54:46.720
<v Speaker 1>And if I had to guess, this will all become

0:54:47.200 --> 0:54:52.319
<v Speaker 1>normal rather than newsworthy. Now around ten years out, we'll

0:54:52.320 --> 0:54:56.080
<v Speaker 1>get good feedback of signals moving in both directions. So

0:54:56.440 --> 0:55:00.360
<v Speaker 1>a person who is suffering from paralysis can control

0:55:00.400 --> 0:55:04.400
<v Speaker 1>her hand through say electrodes in her motor cortex, and

0:55:04.440 --> 0:55:08.080
<v Speaker 1>you have another interface, say electrodes in her somatosensory cortex,

0:55:08.520 --> 0:55:12.520
<v Speaker 1>that's inputting information so that she feels a push back

0:55:12.640 --> 0:55:17.280
<v Speaker 1>with electrically evoked touch, and that loop makes the movements

0:55:17.640 --> 0:55:20.920
<v Speaker 1>smooth and automatic. This is all going to continue getting

0:55:20.960 --> 0:55:25.160
<v Speaker 1>smaller and better. Soon we'll have thin-film options to

0:55:25.280 --> 0:55:30.399
<v Speaker 1>reduce the surgical footprint. The decoders will auto-calibrate, they'll

0:55:30.440 --> 0:55:34.640
<v Speaker 1>borrow tricks from language models, and they'll figure out how

0:55:34.680 --> 0:55:38.759
<v Speaker 1>to adjust to your neural dynamics when you're tired or

0:55:38.800 --> 0:55:43.760
<v Speaker 1>stressed or boosted on caffeine. Eventually your BCI will speak

0:55:43.800 --> 0:55:47.839
<v Speaker 1>the same API language as your phone and home devices,

0:55:48.000 --> 0:55:51.040
<v Speaker 1>so that you can text or adjust the lights or

0:55:51.080 --> 0:55:56.160
<v Speaker 1>turn on appliances without moving a limb or making a sound.

0:55:56.440 --> 0:56:02.560
<v Speaker 1>And crucially, the privacy architecture has to evolve, like inner

0:56:02.560 --> 0:56:06.960
<v Speaker 1>speech stays off limits by default, and your neural stream

0:56:07.080 --> 0:56:10.799
<v Speaker 1>lives behind consent gates. We'll need to have a kind

0:56:10.840 --> 0:56:14.960
<v Speaker 1>of airplane mode for the mind. Okay, and if I

0:56:15.000 --> 0:56:18.120
<v Speaker 1>were going to speculate on a quarter century from now,

0:56:18.640 --> 0:56:21.600
<v Speaker 1>I'm thinking that what we're looking at is very high

0:56:21.640 --> 0:56:26.040
<v Speaker 1>bandwidth arrays. These might be micro needles or flexible meshes,

0:56:26.600 --> 0:56:31.200
<v Speaker 1>or electrode stents living on the inside of the blood vessels.

0:56:31.480 --> 0:56:34.920
<v Speaker 1>Whatever the technology, it's going to give us coverage that

0:56:35.080 --> 0:56:41.120
<v Speaker 1>approaches the dexterousness of natural hand control. Imagine playing a

0:56:41.160 --> 0:56:45.880
<v Speaker 1>piano with one of these. Imagine prosthetics and exoskeletons that

0:56:46.000 --> 0:56:49.840
<v Speaker 1>feel less like machines and more like natural limbs because

0:56:49.880 --> 0:56:53.399
<v Speaker 1>the brain sees and feels them just as part of

0:56:53.480 --> 0:56:57.280
<v Speaker 1>the body. And for communication, we'll get the full richness

0:56:57.320 --> 0:57:00.759
<v Speaker 1>of natural speech. Just imagine talking with a person with

0:57:00.800 --> 0:57:04.680
<v Speaker 1>a BCI and you hear the emphasis of ups and

0:57:04.719 --> 0:57:08.560
<v Speaker 1>downs of speech, and their laughter and their little half

0:57:08.600 --> 0:57:14.200
<v Speaker 1>swallowed syllables when people are negotiating turn-taking, and singing.

0:57:15.080 --> 0:57:17.560
<v Speaker 1>And soon enough, I think, in our lifetimes for sure,

0:57:18.080 --> 0:57:21.640
<v Speaker 1>the science fiction edge of this all is going to

0:57:21.680 --> 0:57:24.680
<v Speaker 1>start to glow. So imagine a scene like this when

0:57:24.720 --> 0:57:27.640
<v Speaker 1>you step onto a train maybe thirty-five years from now.

0:57:28.160 --> 0:57:32.240
<v Speaker 1>People are sitting there. It's crowded, and they're all speaking

0:57:32.440 --> 0:57:36.040
<v Speaker 1>private messages to their friends who are somewhere else. There's

0:57:36.080 --> 0:57:40.920
<v Speaker 1>no sound, the train is quiet. Each person's decoder is

0:57:41.040 --> 0:57:45.120
<v Speaker 1>locked onto their attempted speech, not their idle thoughts, and

0:57:45.280 --> 0:57:48.960
<v Speaker 1>every message is signed with a cryptographic watermark that

0:57:49.040 --> 0:57:52.520
<v Speaker 1>proves it came from that person's neural key. So you're

0:57:52.720 --> 0:57:57.160
<v Speaker 1>looking at a silent train car, but it's filled with conversations.

0:57:57.640 --> 0:58:01.960
<v Speaker 1>Or just imagine something simpler. Here's a carpenter who lost

0:58:01.960 --> 0:58:04.960
<v Speaker 1>his hand, but he's back at work with a prosthetic

0:58:05.000 --> 0:58:10.680
<v Speaker 1>hand that streams touch information into the brain: pressure and temperature.

0:58:10.840 --> 0:58:13.960
<v Speaker 1>But also he can feel the details of the grain.

0:58:14.080 --> 0:58:17.280
<v Speaker 1>He can tell the difference between pine and oak just

0:58:17.320 --> 0:58:21.800
<v Speaker 1>by running his sensor-packed robotic fingers over it. And

0:58:21.840 --> 0:58:24.280
<v Speaker 1>the key is that he doesn't think about the device

0:58:24.440 --> 0:58:28.560
<v Speaker 1>at all. He just builds, just like you use the

0:58:29.120 --> 0:58:33.040
<v Speaker 1>high bandwidth sensory devices on your own hand, and you

0:58:33.160 --> 0:58:36.760
<v Speaker 1>rarely stop to think about it. Eventually, there'll be a

0:58:36.800 --> 0:58:39.640
<v Speaker 1>lot of legislation in place, because there are going to

0:58:39.680 --> 0:58:43.240
<v Speaker 1>be hard lines we choose as a society not to cross.

0:58:43.680 --> 0:58:47.080
<v Speaker 1>Not all thoughts should be digitized. We're going to need

0:58:47.480 --> 0:58:51.520
<v Speaker 1>neurorights with teeth. We'll need on-device processing that

0:58:51.640 --> 0:58:55.840
<v Speaker 1>keeps data local where maybe you have your own descendant

0:58:55.960 --> 0:58:59.920
<v Speaker 1>of modern-day LLMs living with you in your brain.

0:59:00.680 --> 0:59:05.640
<v Speaker 1>Whatever the case, we'll presumably keep asking philosophical questions about

0:59:05.680 --> 0:59:09.280
<v Speaker 1>our brains and ourselves, but we'll get to do it

0:59:09.320 --> 0:59:13.400
<v Speaker 1>with better and better tools than we have now. And

0:59:13.560 --> 0:59:17.200
<v Speaker 1>I think what this means is that we have more

0:59:17.280 --> 0:59:21.440
<v Speaker 1>in common with our ancestors of a thousand years ago

0:59:22.000 --> 0:59:28.760
<v Speaker 1>than we do with our descendants a century from now.

0:59:29.880 --> 0:59:32.840
<v Speaker 1>Go to Eagleman dot com slash podcast for more information

0:59:32.920 --> 0:59:36.960
<v Speaker 1>and to find further reading. Send me an email at

0:59:37.040 --> 0:59:41.320
<v Speaker 1>podcasts at eagleman dot com with questions or discussion and

0:59:41.400 --> 0:59:44.360
<v Speaker 1>check out and subscribe to Inner Cosmos on YouTube for videos

0:59:44.400 --> 0:59:48.880
<v Speaker 1>of each episode and to leave comments. Until next time.

0:59:48.960 --> 1:00:03.480
<v Speaker 1>I'm David Eagleman, and this is Inner Cosmos.