WEBVTT - The AI Pioneer Developing New Kinds of Medicine

0:00:15.356 --> 0:00:15.796
<v Speaker 1>Pushkin.

0:00:20.156 --> 0:00:22.156
<v Speaker 2>If I were going to pick one paper from the

0:00:22.196 --> 0:00:25.316
<v Speaker 2>past decade that had the biggest impact on the world,

0:00:25.876 --> 0:00:28.676
<v Speaker 2>I would choose one called Attention Is All You Need,

0:00:28.916 --> 0:00:34.836
<v Speaker 2>published in twenty seventeen. That paper basically invented transformer models.

0:00:35.356 --> 0:00:38.836
<v Speaker 2>You've almost certainly used a transformer model if you have

0:00:39.036 --> 0:00:42.676
<v Speaker 2>used ChatGPT or Gemini or Claude or DeepSeek.

0:00:42.996 --> 0:00:47.236
<v Speaker 2>In fact, the T in ChatGPT stands for transformer,

0:00:48.076 --> 0:00:51.196
<v Speaker 2>and transformer models have turned out to be wildly useful,

0:00:51.276 --> 0:00:54.476
<v Speaker 2>not just at generating language, but also at everything from

0:00:54.516 --> 0:00:57.596
<v Speaker 2>generating images to predicting what proteins will look like.

0:00:58.516 --> 0:01:00.716
<v Speaker 1>In fact, transformers

0:01:00.076 --> 0:01:03.156
<v Speaker 2>are so ubiquitous and so powerful that it's easy to

0:01:03.196 --> 0:01:05.556
<v Speaker 2>forget that some guy just thought them up.

0:01:06.316 --> 0:01:08.036
<v Speaker 1>But in fact, some guy did

0:01:08.116 --> 0:01:10.876
<v Speaker 2>just think up transformers, and I'm talking to him today

0:01:11.276 --> 0:01:19.396
<v Speaker 2>on the show. I'm Jacob Goldstein and this is What's

0:01:19.436 --> 0:01:21.396
<v Speaker 2>Your Problem, the show where I talk to people who

0:01:21.396 --> 0:01:24.956
<v Speaker 2>are trying to make technological progress. My guest today is

0:01:25.036 --> 0:01:28.436
<v Speaker 2>Jakob Uszkoreit. And just to be clear, Jakob was one

0:01:28.476 --> 0:01:31.796
<v Speaker 2>of several co-authors on that transformer paper, and on

0:01:31.836 --> 0:01:34.596
<v Speaker 2>top of that, lots of other researchers were working on

0:01:34.716 --> 0:01:36.756
<v Speaker 2>related things at the same time, so a lot of

0:01:36.796 --> 0:01:39.596
<v Speaker 2>people were working on this, but the key idea did

0:01:39.636 --> 0:01:44.076
<v Speaker 2>seem to come from Jakob. Today, Jakob is the CEO

0:01:44.236 --> 0:01:47.476
<v Speaker 2>of Inceptive. That's a company that he co-founded to

0:01:47.636 --> 0:01:51.076
<v Speaker 2>use AI to develop new kinds of medicine, and the

0:01:51.116 --> 0:01:54.876
<v Speaker 2>company is particularly focused on RNA. We talked about his

0:01:54.916 --> 0:01:57.476
<v Speaker 2>work at Inceptive in the second part of our conversation.

0:01:58.236 --> 0:02:00.196
<v Speaker 2>In the first part, we talked about his work on

0:02:00.316 --> 0:02:03.956
<v Speaker 2>transformer models. At the time he started working on the

0:02:04.036 --> 0:02:07.236
<v Speaker 2>idea for transformers, this is around a decade ago now,

0:02:07.556 --> 0:02:10.916
<v Speaker 2>there were a couple of big problems with existing language models.

0:02:11.436 --> 0:02:13.956
<v Speaker 2>For one thing, they were slow. They were in fact

0:02:13.996 --> 0:02:16.276
<v Speaker 2>so slow that they could not even keep up with

0:02:16.356 --> 0:02:19.996
<v Speaker 2>all the new training data that was becoming available. A

0:02:20.036 --> 0:02:24.716
<v Speaker 2>second problem, they struggled with what are called long-range dependencies.

0:02:25.196 --> 0:02:29.156
<v Speaker 2>Basically in language, that's relationships between words that are far

0:02:29.236 --> 0:02:32.196
<v Speaker 2>apart from each other in a sentence. So to start,

0:02:32.356 --> 0:02:34.756
<v Speaker 2>I asked Jakob for an example we could use to

0:02:34.796 --> 0:02:37.756
<v Speaker 2>discuss these problems and also how he came up with

0:02:37.836 --> 0:02:41.076
<v Speaker 2>his big idea for how to solve them. So, pick

0:02:41.116 --> 0:02:43.916
<v Speaker 2>a sentence that's going to be a good object lesson

0:02:43.956 --> 0:02:44.276
<v Speaker 2>for us.

0:02:44.356 --> 0:02:47.516
<v Speaker 1>Okay, so we could have the frog didn't cross the

0:02:47.556 --> 0:02:50.356
<v Speaker 1>road because it was too tired. Okay, so we got

0:02:50.356 --> 0:02:51.556
<v Speaker 1>our sentence. Yep.

0:02:52.236 --> 0:02:55.236
<v Speaker 2>How would the sort of big, powerful but slow to

0:02:55.316 --> 0:02:57.836
<v Speaker 2>train algorithm in twenty fifteen

0:02:59.036 --> 0:03:02.596
<v Speaker 1>have processed that sentence? So basically it would have walked

0:03:02.636 --> 0:03:06.476
<v Speaker 1>through that sentence word by word, and so it would

0:03:06.556 --> 0:03:11.876
<v Speaker 1>walk through the sentence left to right. The frog did

0:03:12.076 --> 0:03:15.036
<v Speaker 1>not cross the road because it was too tired.

0:03:15.076 --> 0:03:17.796
<v Speaker 2>Which is logical, which is how I would think a

0:03:17.836 --> 0:03:18.636
<v Speaker 2>system would work.

0:03:18.676 --> 0:03:21.676
<v Speaker 1>It's more or less how we read, right, it's how

0:03:21.716 --> 0:03:25.076
<v Speaker 1>we read, but it's not necessarily how we understand. Uh huh.

0:03:25.116 --> 0:03:28.476
<v Speaker 1>That is actually one of the integral I would say

0:03:29.036 --> 0:03:31.876
<v Speaker 1>for what we then how we then went about trying

0:03:31.916 --> 0:03:32.716
<v Speaker 1>to speed this all up.

0:03:32.876 --> 0:03:34.796
<v Speaker 2>Well, I love that. I want you to say more

0:03:34.836 --> 0:03:37.236
<v Speaker 2>about it. When you say it's not how we understand,

0:03:37.276 --> 0:03:37.876
<v Speaker 2>what do you mean?

0:03:38.516 --> 0:03:43.156
<v Speaker 1>So, on one hand, right, linearity of time forces us

0:03:43.236 --> 0:03:48.636
<v Speaker 1>to almost always feel that we're communicating language in order

0:03:48.836 --> 0:03:53.876
<v Speaker 1>and just linearly. It actually turns out that that's not

0:03:53.916 --> 0:03:56.956
<v Speaker 1>really how we read, not even in terms of our saccades,

0:03:56.956 --> 0:03:59.236
<v Speaker 1>in terms of our eye movements. We actually do jump

0:03:59.276 --> 0:04:01.916
<v Speaker 1>back and forth quite a bit while reading, and if

0:04:01.956 --> 0:04:06.756
<v Speaker 1>you look at conversations, you also have highly nonlinear elements

0:04:06.796 --> 0:04:11.076
<v Speaker 1>where there's repetition, there's reference, there's basically different flavors of interruption.

0:04:11.556 --> 0:04:14.036
<v Speaker 1>But sure, by and large right, we would say we

0:04:14.156 --> 0:04:17.356
<v Speaker 1>certainly write them left to right, right? So if you

0:04:17.516 --> 0:04:20.556
<v Speaker 1>write a proper text, you don't write it as you

0:04:20.556 --> 0:04:22.276
<v Speaker 1>would read it, and you also don't write it as

0:04:22.316 --> 0:04:24.196
<v Speaker 1>you would talk about it. You do write it in

0:04:24.796 --> 0:04:28.196
<v Speaker 1>one linear order. Now, as we read this and as

0:04:28.196 --> 0:04:34.196
<v Speaker 1>we understand this, we actually form groups of words that

0:04:34.316 --> 0:04:38.156
<v Speaker 1>then form meaning. Right. So an example of that is

0:04:38.596 --> 0:04:42.556
<v Speaker 1>you know adjective noun, right, it's or say, in this

0:04:42.596 --> 0:04:45.756
<v Speaker 1>case an article noun, it's not a frog, it's the frog. Right.

0:04:45.916 --> 0:04:48.716
<v Speaker 1>We could have also said it's the green frog or

0:04:49.156 --> 0:04:49.956
<v Speaker 1>the lazy frog.

0:04:50.356 --> 0:04:54.076
<v Speaker 2>Right. Language has a structure, right, and there things can

0:04:54.116 --> 0:04:58.076
<v Speaker 2>modify other things, and things can modify the modifiers. Exactly, exactly.

0:04:58.156 --> 0:05:02.156
<v Speaker 1>But the interesting thing now is that structure,

0:05:01.956 --> 0:05:05.636
<v Speaker 1>as a tree-structured clean hierarchy, only tells you

0:05:05.676 --> 0:05:10.876
<v Speaker 1>half the story. There's so many exceptions where statistical dependencies,

0:05:10.956 --> 0:05:13.836
<v Speaker 1>where modification actually happens at a distance.

0:05:14.116 --> 0:05:16.116
<v Speaker 2>So, okay, so just to bring this back to your

0:05:16.156 --> 0:05:19.076
<v Speaker 2>sample sentence, The frog didn't cross the road because it

0:05:19.156 --> 0:05:22.956
<v Speaker 2>was too tired. That word it is actually quite far

0:05:22.996 --> 0:05:25.996
<v Speaker 2>from the word frog. And if you're an AI going

0:05:25.996 --> 0:05:28.956
<v Speaker 2>from left to right, you may well get confused there, right,

0:05:28.996 --> 0:05:33.196
<v Speaker 2>You may think it refers to road instead of to frog.

0:05:34.276 --> 0:05:37.076
<v Speaker 2>So this is one of the problems you were trying

0:05:37.116 --> 0:05:39.236
<v Speaker 2>to solve. And then the other one you were mentioning before,

0:05:39.556 --> 0:05:43.836
<v Speaker 2>which is these models were just slow because after each word,

0:05:43.876 --> 0:05:46.956
<v Speaker 2>the model just recalculates what everything means, and that just

0:05:47.156 --> 0:05:48.156
<v Speaker 2>takes a long time.

0:05:48.476 --> 0:05:51.396
<v Speaker 1>They can't go fast enough exactly. It takes a long time,

0:05:51.476 --> 0:05:55.116
<v Speaker 1>and it doesn't play to the strengths of the computers,

0:05:55.356 --> 0:05:57.356
<v Speaker 1>of the accelerators that we're using there.

0:05:57.756 --> 0:05:59.916
<v Speaker 2>And when you say accelerators, I know Google has their

0:05:59.956 --> 0:06:02.956
<v Speaker 2>own chips, but basically we mean GPUs.

0:06:02.356 --> 0:06:04.756
<v Speaker 1>Now, right? We mean GPUs. We mean

0:06:04.636 --> 0:06:08.276
<v Speaker 2>the chips that Nvidia sells. What is the nature of

0:06:08.196 --> 0:06:10.396
<v Speaker 1>those particular chips? Yeah, so the nature of those particular

0:06:10.516 --> 0:06:16.596
<v Speaker 1>chips is that instead of doing a broad variety of

0:06:16.756 --> 0:06:22.636
<v Speaker 1>complex computations in sequence, they are incredibly good. They excel

0:06:23.116 --> 0:06:27.516
<v Speaker 1>at performing many, many, many simple computations in parallel. And

0:06:27.556 --> 0:06:32.396
<v Speaker 1>so what this hierarchical or semi-hierarchical nature of language

0:06:32.956 --> 0:06:38.356
<v Speaker 1>enables you to do is instead of having, so to speak,

0:06:38.476 --> 0:06:41.956
<v Speaker 1>one place where you read the current word, you could

0:06:42.036 --> 0:06:46.876
<v Speaker 1>now imagine you actually read every You look at everything

0:06:46.876 --> 0:06:52.196
<v Speaker 1>at the same time, and you apply many simple operations

0:06:52.556 --> 0:06:55.516
<v Speaker 1>at the same time to each position in your sentence.

0:06:55.796 --> 0:06:58.276
<v Speaker 2>Huh. So this is the big idea. I just want

0:06:58.276 --> 0:07:01.876
<v Speaker 2>to because this is it, right, this is the breakthrough happening. Yes,

0:07:02.756 --> 0:07:05.836
<v Speaker 2>it's basically, what if instead of reading the sentence one

0:07:05.836 --> 0:07:08.276
<v Speaker 2>word at a time from left to right, we read

0:07:08.276 --> 0:07:10.036
<v Speaker 2>the whole thing all at once.

0:07:10.396 --> 0:07:13.676
<v Speaker 1>All at once. And now the problem is clearly something's

0:07:13.676 --> 0:07:17.076
<v Speaker 1>got to give, right, so there's no free lunch in

0:07:17.116 --> 0:07:20.116
<v Speaker 1>that sense. You have to now simplify what you can

0:07:20.156 --> 0:07:22.956
<v Speaker 1>do at every position when you do this all in parallel,

0:07:24.436 --> 0:07:26.716
<v Speaker 1>but you can now afford to do this a bunch

0:07:26.716 --> 0:07:29.916
<v Speaker 1>of times after another and revise it over time or

0:07:29.956 --> 0:07:32.796
<v Speaker 1>over these steps. And so instead of walking through the

0:07:32.836 --> 0:07:35.836
<v Speaker 1>sentence from beginning to end, where an average sentence has

0:07:35.876 --> 0:07:39.076
<v Speaker 1>like twenty words or so, an average sentence in prose, instead

0:07:39.116 --> 0:07:41.716
<v Speaker 1>of walking those twenty positions, what you're doing is you're

0:07:41.756 --> 0:07:46.116
<v Speaker 1>looking at every word at the same time, but in

0:07:46.156 --> 0:07:49.036
<v Speaker 1>a simpler way. But now you can do that maybe

0:07:49.156 --> 0:07:53.556
<v Speaker 1>five or six times, revising your understanding, and that turns

0:07:53.556 --> 0:07:59.596
<v Speaker 1>out is faster, way faster on GPUs and because of

0:07:59.636 --> 0:08:01.916
<v Speaker 1>this hierarchical nature of language, it's also better.

0:08:02.996 --> 0:08:06.076
<v Speaker 2>So you have this idea, and as I read the

0:08:06.076 --> 0:08:08.076
<v Speaker 2>little note on the paper, it was in fact your idea.

0:08:08.076 --> 0:08:09.516
<v Speaker 2>I know you were working with a team, but the

0:08:09.556 --> 0:08:12.796
<v Speaker 2>paper credits you with the idea. So let's let's take

0:08:12.836 --> 0:08:15.476
<v Speaker 2>this idea, this basic idea of look at the whole

0:08:15.476 --> 0:08:18.796
<v Speaker 2>input sentence all at once, yep, a few times, and

0:08:19.876 --> 0:08:22.436
<v Speaker 2>apply it to our frog sentence. Give me, give me

0:08:22.436 --> 0:08:23.436
<v Speaker 2>that frog sentence again.

0:08:23.996 --> 0:08:26.396
<v Speaker 1>The frog did not cross the road because it was

0:08:26.436 --> 0:08:27.916
<v Speaker 1>too tired. Good.

0:08:28.276 --> 0:08:30.156
<v Speaker 2>Tired is good because that's unambiguous.

0:08:30.196 --> 0:08:31.716
<v Speaker 1>Hot could be either one. It could be the road

0:08:31.836 --> 0:08:33.756
<v Speaker 1>or the frog, right? Hot could be, hot could be

0:08:33.836 --> 0:08:36.596
<v Speaker 1>either one, exactly. In fact, hot could

0:08:36.596 --> 0:08:40.476
<v Speaker 1>actually be either one, and non-referential. And non-referential, because

0:08:40.516 --> 0:08:42.396
<v Speaker 1>it was too hot outside.

0:08:41.996 --> 0:08:44.076
<v Speaker 2>Outside. It could be any of three things: the weather,

0:08:44.396 --> 0:08:47.396
<v Speaker 2>or the frog or the road exactly. I love that

0:08:47.636 --> 0:08:52.956
<v Speaker 2>tired solves the problem. So your model, this new way

0:08:52.996 --> 0:08:57.276
<v Speaker 2>of doing things, how does it parse that sentence, what

0:08:57.316 --> 0:08:57.716
<v Speaker 2>does it do?

0:08:58.356 --> 0:09:02.636
<v Speaker 1>So basically, let's look at the word it and look

0:09:02.636 --> 0:09:04.956
<v Speaker 1>at it in every single step of these you know,

0:09:05.076 --> 0:09:10.356
<v Speaker 1>say a handful of times repeated operation. Imagine you're looking

0:09:10.396 --> 0:09:12.516
<v Speaker 1>at this word it, that's the one that you are

0:09:12.556 --> 0:09:16.076
<v Speaker 1>now trying to understand better, and you now compare it

0:09:16.276 --> 0:09:18.796
<v Speaker 1>to every other word in the sentence. Okay, so you

0:09:18.836 --> 0:09:22.196
<v Speaker 1>compare it to "the," to "frog," to "did," "not," "cross,"

0:09:22.316 --> 0:09:28.356
<v Speaker 1>"the," "road," "because," "too," and "tired," there, "was," "too," and

0:09:28.516 --> 0:09:36.076
<v Speaker 1>"tired." And initially, in the first pass, already a very

0:09:36.596 --> 0:09:40.276
<v Speaker 1>simple insight the model can fairly easily learn is that

0:09:40.356 --> 0:09:49.196
<v Speaker 1>it could be strongly informed by frog, by road, by nothing,

0:09:50.196 --> 0:09:54.276
<v Speaker 1>but not so by "too" or by "the," or maybe

0:09:54.276 --> 0:09:56.916
<v Speaker 1>only to a certain extent by "was." But if you

0:09:56.916 --> 0:10:00.676
<v Speaker 1>want to know more about what it denotes, then it

0:10:00.716 --> 0:10:03.876
<v Speaker 1>could be, you know, it could be informed by by

0:10:03.876 --> 0:10:04.476
<v Speaker 1>all of these.

0:10:04.876 --> 0:10:08.036
<v Speaker 2>And just to be clear, that sort of understanding arises

0:10:08.076 --> 0:10:09.476
<v Speaker 2>because it has trained in.

0:10:09.396 --> 0:10:11.116
<v Speaker 1>This way on lots of data.

0:10:11.156 --> 0:10:14.956
<v Speaker 2>It's encountering a new sentence after reading lots of other

0:10:15.036 --> 0:10:19.236
<v Speaker 2>sentences with lots of pronouns with different possible antecedents.

0:10:19.316 --> 0:10:23.156
<v Speaker 1>Yeah, exactly, exactly. So Now the interesting thing is that

0:10:23.516 --> 0:10:29.836
<v Speaker 1>which of the two it actually refers to, doesn't depend

0:10:30.116 --> 0:10:33.156
<v Speaker 1>only on what those other two words are. And

0:10:33.196 --> 0:10:36.396
<v Speaker 1>this is why you need these subsequent steps because so

0:10:36.996 --> 0:10:39.476
<v Speaker 1>let's start with the first step. So what now happens

0:10:39.556 --> 0:10:44.116
<v Speaker 1>is that, say the model identifies frog and road could

0:10:44.116 --> 0:10:46.916
<v Speaker 1>have a lot to do with the word it. So

0:10:47.156 --> 0:10:51.396
<v Speaker 1>now you basically copy some information from both frog and

0:10:51.636 --> 0:10:55.916
<v Speaker 1>road over to it, and you don't just copy it,

0:10:55.956 --> 0:10:59.076
<v Speaker 1>you kind of transform it also on the way, but

0:10:59.196 --> 0:11:02.956
<v Speaker 1>you refine your understanding of it. And this is all learned,

0:11:02.996 --> 0:11:06.076
<v Speaker 1>it's not given by rules or, you know, in any

0:11:06.116 --> 0:11:07.916
<v Speaker 1>way prespecified.

0:11:07.436 --> 0:11:11.276
<v Speaker 2>Right, just by training on language, just by training, this emerges,

0:11:11.276 --> 0:11:13.636
<v Speaker 2>and so that sort of the meaning of it after

0:11:13.676 --> 0:11:17.676
<v Speaker 2>this first step is kind of influenced by both frog

0:11:17.756 --> 0:11:18.196
<v Speaker 2>and road.

0:11:18.316 --> 0:11:23.356
<v Speaker 1>Yes, both frog and road. Okay, so now we repeat

0:11:23.396 --> 0:11:27.596
<v Speaker 1>this operation again and we now know that it is

0:11:27.796 --> 0:11:30.876
<v Speaker 1>unsure or the model basically now has this kind of superposition. Right,

0:11:30.916 --> 0:11:34.076
<v Speaker 1>it could be road, it could be frog. But now

0:11:34.196 --> 0:11:36.836
<v Speaker 1>in the next step it also looks at tired, and

0:11:36.876 --> 0:11:41.116
<v Speaker 1>somehow the model has learned that when it means something inanimate,

0:11:41.276 --> 0:11:46.556
<v Speaker 1>that tired is not the thing. And so maybe in

0:11:46.676 --> 0:11:50.116
<v Speaker 1>context of tired, it is more likely to refer to frog,

0:11:50.836 --> 0:11:54.716
<v Speaker 1>and now you know, well, it is more likely and

0:11:54.796 --> 0:11:57.036
<v Speaker 1>now maybe the model has figured out already, maybe needs

0:11:57.076 --> 0:12:00.516
<v Speaker 1>a bit more, a few more iterations that it is

0:12:00.596 --> 0:12:03.636
<v Speaker 1>most likely to refer to frog because of the presence

0:12:04.196 --> 0:12:07.036
<v Speaker 1>of tired. So it has solved the problem. But it

0:12:07.036 --> 0:12:08.036
<v Speaker 1>has solved the problem.
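
NOTE
Editor's sketch, not part of the conversation: the steps described above (compare "it" to every other word, copy over a weighted blend of information, then repeat) loosely resemble scaled dot-product self-attention. Everything below is an illustrative assumption: the 2-d vectors are hand-picked rather than learned, the function names are hypothetical, and adding the "tired" vector to the second query is only a crude stand-in for the context "it" would absorb in a real model.
```python
import math
#
def softmax(scores):
    """Turn raw similarity scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
#
def attend(query, keys, values):
    """One attention step: score the query against every key, then blend the values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    blended = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, blended
#
# Hand-picked toy vectors (assumption): axis 0 ~ "noun-like", axis 1 ~ "animate".
words = ["the", "frog", "road", "tired"]
vectors = {"the": [0.0, 0.0], "frog": [2.0, 2.0], "road": [2.0, -2.0], "tired": [0.0, 2.0]}
keys = [vectors[w] for w in words]
#
# First pass: "it" only looks for something noun-like, so frog and road tie.
first_weights, it_after_first = attend([2.0, 0.0], keys, keys)
#
# Second pass: the query now also carries the "animate" hint from "tired".
second_query = [a + b for a, b in zip(it_after_first, vectors["tired"])]
second_weights, it_after_second = attend(second_query, keys, keys)
#
for word, w1, w2 in zip(words, first_weights, second_weights):
    print(f"{word:>5}: pass 1 weight {w1:.2f}, pass 2 weight {w2:.2f}")
```
In this toy setup the first pass weights "frog" and "road" equally, and the second pass shifts most of the weight to "frog."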

0:12:08.556 --> 0:12:12.436
<v Speaker 2>So you do, you have this idea, you try it out.

0:12:12.876 --> 0:12:14.996
<v Speaker 2>There's a detail that you mentioned that's kind of fun,

0:12:15.036 --> 0:12:17.596
<v Speaker 2>and we kind of skipped it, but you mentioned that

0:12:17.996 --> 0:12:20.116
<v Speaker 2>another one of the co authors, who has also gone

0:12:20.156 --> 0:12:22.476
<v Speaker 2>on to do very big things, was about to leave

0:12:22.556 --> 0:12:25.276
<v Speaker 2>Google when you sort of want to test this idea,

0:12:25.356 --> 0:12:27.476
<v Speaker 2>and and that fact that he was about to leave

0:12:27.476 --> 0:12:29.956
<v Speaker 2>Google was actually important to the history of this idea.

0:12:29.996 --> 0:12:33.316
<v Speaker 1>Tell me about that. It was important. So this is Illia Polosukhin,

0:12:34.636 --> 0:12:39.476
<v Speaker 1>he was at the time that this started to gain

0:12:39.716 --> 0:12:44.036
<v Speaker 1>any kind of speed, Illia was managing a good chunk

0:12:44.116 --> 0:12:47.676
<v Speaker 1>of my organization. And the moment he really made the

0:12:47.676 --> 0:12:51.396
<v Speaker 1>decision to leave the company, he had to wait ultimately

0:12:51.556 --> 0:12:55.436
<v Speaker 1>for his co, for his co-founder, and for them

0:12:55.476 --> 0:12:58.156
<v Speaker 1>to then actually get going together in earnest and so

0:12:58.236 --> 0:13:00.716
<v Speaker 1>he had a few months where he knew and I

0:13:00.796 --> 0:13:04.716
<v Speaker 1>also knew that he was about to leave and where

0:13:04.836 --> 0:13:06.716
<v Speaker 1>you know, the right thing would of course be to

0:13:06.756 --> 0:13:11.236
<v Speaker 1>transition his team to another manager, which we did immediately,

0:13:11.716 --> 0:13:14.436
<v Speaker 1>but where he then suddenly was in a position of

0:13:14.476 --> 0:13:18.356
<v Speaker 1>having nothing to lose and yet quite some time left

0:13:18.436 --> 0:13:21.556
<v Speaker 1>to play with Google's resources and do cool stuff with interesting,

0:13:21.796 --> 0:13:25.756
<v Speaker 1>interesting people. And and so that's one of those moments

0:13:25.756 --> 0:13:31.716
<v Speaker 1>where suddenly your appetite for risk as a researcher just spikes, right, huh,

0:13:32.636 --> 0:13:34.556
<v Speaker 1>because you have, for for a few more months, you

0:13:34.556 --> 0:13:38.796
<v Speaker 1>have these resources at your disposal, you've transitioned your responsibilities.

0:13:38.876 --> 0:13:41.836
<v Speaker 1>At that stage, you're just like, Okay, let's try this

0:13:41.956 --> 0:13:46.236
<v Speaker 1>crazy shit and and and it's and that's literally in

0:13:46.356 --> 0:13:49.676
<v Speaker 1>so many ways, was was one of the integral catalysts

0:13:50.796 --> 0:13:55.036
<v Speaker 1>because that also enabled, right, this kind of mindset of

0:13:55.276 --> 0:13:58.796
<v Speaker 1>we're going for this now, whatever the reason. It still

0:13:58.956 --> 0:14:02.956
<v Speaker 1>you know affects other people. And so there were others

0:14:02.956 --> 0:14:06.796
<v Speaker 1>who joined that collaboration really really early on, who I

0:14:06.916 --> 0:14:10.236
<v Speaker 1>feel were much more excited as a result, much more likely

0:14:10.276 --> 0:14:12.396
<v Speaker 1>to really work on this and to really give it

0:14:12.396 --> 0:14:17.196
<v Speaker 1>their all because of his, you know, nothing left to lose,

0:14:17.636 --> 0:14:19.796
<v Speaker 1>I'm going to go for this attitude at this

0:14:19.716 --> 0:14:23.556
<v Speaker 2>point. Right. Was there a moment when you realized it worked?

0:14:23.796 --> 0:14:27.396
<v Speaker 1>There were actually a few moments. And it's interesting because

0:14:29.396 --> 0:14:32.596
<v Speaker 1>on one hand, right, it's a very gradual thing, right,

0:14:32.636 --> 0:14:35.636
<v Speaker 1>And initially, actually it took us many months to get

0:14:35.676 --> 0:14:39.156
<v Speaker 1>to the point where we saw significant first signs of

0:14:39.196 --> 0:14:41.836
<v Speaker 1>life of this not just being a curiosity but really

0:14:41.876 --> 0:14:45.196
<v Speaker 1>being something that would end up being competitive. So there

0:14:45.276 --> 0:14:48.036
<v Speaker 1>certainly was a moment when that started. There was another

0:14:48.116 --> 0:14:51.796
<v Speaker 1>moment when we for the for the first time had

0:14:51.916 --> 0:14:56.956
<v Speaker 1>one machine translation challenge, one language pair of the WMT

0:14:57.116 --> 0:15:00.796
<v Speaker 1>task, as it's called, where our score, our

0:15:00.836 --> 0:15:05.116
<v Speaker 1>model performed better than any other single model. The point

0:15:05.116 --> 0:15:07.796
<v Speaker 1>in time when I think all of us realized this

0:15:07.876 --> 0:15:12.436
<v Speaker 1>is special was when we not only had the best

0:15:12.436 --> 0:15:16.036
<v Speaker 1>one in one of these tasks, but in multiple and

0:15:17.276 --> 0:15:19.556
<v Speaker 1>we didn't just have the best number. We also at

0:15:19.556 --> 0:15:22.316
<v Speaker 1>that point were able to establish that we've gotten there

0:15:22.356 --> 0:15:27.236
<v Speaker 1>with about ten times less energy or training compute spend.

0:15:27.716 --> 0:15:30.116
<v Speaker 2>Wow, So you do one tenth the work and you

0:15:30.156 --> 0:15:31.076
<v Speaker 2>get a better result.

0:15:31.316 --> 0:15:33.276
<v Speaker 1>One tenth the work and you get a better result

0:15:33.316 --> 0:15:37.116
<v Speaker 1>not just across one specific challenge, but across multiple including

0:15:37.156 --> 0:15:39.876
<v Speaker 1>the hardest or one of the harder ones. Right.

0:15:40.316 --> 0:15:45.476
<v Speaker 1>And then at that stage we were still improving rapidly,

0:15:46.716 --> 0:15:50.596
<v Speaker 1>and then you realize, okay, this is for real. There's

0:15:51.036 --> 0:15:53.396
<v Speaker 1>because there right, It wasn't like we it wasn't that

0:15:53.436 --> 0:15:56.116
<v Speaker 1>we had to squeeze those last little bits and pieces

0:15:56.156 --> 0:15:59.756
<v Speaker 1>of gain out of it. It was still improving fairly rapidly,

0:16:00.796 --> 0:16:03.476
<v Speaker 1>to the point where actually, by the time we actually

0:16:03.476 --> 0:16:08.196
<v Speaker 1>published the paper, we again reduced the compute requirements, not

0:16:08.276 --> 0:16:11.356
<v Speaker 1>quite by an entire order of magnitude, but almost right,

0:16:11.476 --> 0:16:14.516
<v Speaker 1>so it still was getting faster and better at a

0:16:14.556 --> 0:16:17.076
<v Speaker 1>pretty rapid rate. Wow, so we had in the paper

0:16:17.116 --> 0:16:19.836
<v Speaker 1>we had some results that were those roughly ten x

0:16:19.836 --> 0:16:23.396
<v Speaker 1>faster on eight GPUs, and what we demonstrated in terms of

0:16:23.516 --> 0:16:26.596
<v Speaker 1>quality on those eight GPUs by the time we actually

0:16:26.676 --> 0:16:29.396
<v Speaker 1>published the paper properly we were able to do with one.

0:16:29.276 --> 0:16:32.636
<v Speaker 2>GPU, one GPU meaning one chip of the kind that

0:16:32.676 --> 0:16:35.556
<v Speaker 2>people buy one hundred thousand of now to build a

0:16:35.636 --> 0:16:39.116
<v Speaker 2>data center. Exactly. So the paper actually at the end

0:16:39.876 --> 0:16:45.596
<v Speaker 2>mentions other possible uses beyond language for this technology. It

0:16:45.716 --> 0:16:50.996
<v Speaker 2>mentions images, audio, and video, I think explicitly. How much

0:16:51.036 --> 0:16:52.836
<v Speaker 2>were you thinking about that at the time. Was that

0:16:52.956 --> 0:16:55.036
<v Speaker 2>just like an afterthought or were you like, hey, wait

0:16:55.076 --> 0:16:57.356
<v Speaker 2>a minute, it's not just language.

0:16:57.596 --> 0:16:59.956
<v Speaker 1>By the time it was actually published at a conference,

0:16:59.996 --> 0:17:04.076
<v Speaker 1>not just the preprint. By December, we had initial models

0:17:04.236 --> 0:17:07.436
<v Speaker 1>on other modalities on generating images. We had the first

0:17:07.716 --> 0:17:09.836
<v Speaker 1>the first, at that stage. At that time they were

0:17:10.116 --> 0:17:12.636
<v Speaker 1>not performing that well yet, but you know, they were

0:17:12.716 --> 0:17:15.916
<v Speaker 1>rapidly getting better. We had the first prototypes actually of

0:17:16.036 --> 0:17:20.676
<v Speaker 1>models working on genomic data, working on protein structure. That's

0:17:20.676 --> 0:17:24.116
<v Speaker 1>good foreshadowing. Good foreshadowing, exactly. But then we

0:17:24.236 --> 0:17:27.236
<v Speaker 1>ended up for a variety of reasons, we ended up

0:17:27.756 --> 0:17:30.716
<v Speaker 1>at first focusing on applications in computer vision.

0:17:31.116 --> 0:17:33.716
<v Speaker 2>The paper comes out, you know, you're working on these

0:17:33.756 --> 0:17:37.756
<v Speaker 2>other applications, you're presenting the paper, it's published in various forms.

0:17:38.356 --> 0:17:42.636
<v Speaker 1>What's the response like. It was interesting because the response

0:17:43.076 --> 0:17:49.836
<v Speaker 1>built in deep learning AI circles basically between the

0:17:49.916 --> 0:17:52.236
<v Speaker 1>preprint that I think came out in, I want to

0:17:52.276 --> 0:17:56.436
<v Speaker 1>say June twenty seventeen, and then the actual publication,

0:17:56.876 --> 0:17:59.916
<v Speaker 1>to the extent that by the time the poster session

0:17:59.956 --> 0:18:03.676
<v Speaker 1>happened at the conference, there was quite a crowd at

0:18:03.676 --> 0:18:07.036
<v Speaker 1>the poster so we had to be shoved out of

0:18:07.116 --> 0:18:10.316
<v Speaker 1>the out of the hall in which the poster session happened.

0:18:10.316 --> 0:18:14.276
<v Speaker 1>by security and had very hoarse voices by the end

0:18:14.316 --> 0:18:18.676
<v Speaker 1>of the evening. You guys were like the Beatles of

0:18:18.756 --> 0:18:23.556
<v Speaker 1>the AI conference. I wouldn't say that because we weren't

0:18:23.556 --> 0:18:26.396
<v Speaker 1>the Beatles, because it was really it was still very specific.

0:18:26.436 --> 0:18:27.836
<v Speaker 2>You were more that you were more of the cool

0:18:27.916 --> 0:18:30.036
<v Speaker 2>hipster band. You were the hipster.

0:18:29.756 --> 0:18:32.156
<v Speaker 1>Band, certainly more the cool hipster band. But it was

0:18:32.196 --> 0:18:35.396
<v Speaker 1>an interesting experience because there were some folks and including

0:18:35.396 --> 0:18:38.916
<v Speaker 1>some greats in the field, who came by and said, Wow,

0:18:39.036 --> 0:18:40.396
<v Speaker 1>this is this is cool.

0:18:40.716 --> 0:18:44.156
<v Speaker 2>What has happened since has been wild.

0:18:44.596 --> 0:18:48.036
<v Speaker 1>It seems wild to say the least. Yes, Is it

0:18:48.116 --> 0:18:52.196
<v Speaker 1>surprising to you? Of course, many aspects are surprising. For sure.

0:18:53.796 --> 0:18:57.956
<v Speaker 1>We definitely saw pretty early on already back in twenty eighteen,

0:18:57.996 --> 0:19:04.076
<v Speaker 1>twenty nineteen, that something really exciting was happening here. Now

0:19:05.316 --> 0:19:08.836
<v Speaker 1>I'm still surprised by with the advent of chat GPT,

0:19:10.276 --> 0:19:15.076
<v Speaker 1>something that didn't go way beyond those language models that

0:19:15.116 --> 0:19:19.036
<v Speaker 1>we had already seen a few years before, was suddenly

0:19:20.436 --> 0:19:23.876
<v Speaker 1>the world's fastest growing consumer product.

0:19:23.836 --> 0:19:25.436
<v Speaker 2>Ever, right, I think ever?

0:19:25.676 --> 0:19:26.716
<v Speaker 1>Ever? Yes?

0:19:27.076 --> 0:19:31.956
<v Speaker 2>And by the way, GPT stands for generative pre-trained transformer, right,

0:19:31.996 --> 0:19:35.636
<v Speaker 2>transformer is your word, that's right? So there's an interesting

0:19:36.956 --> 0:19:39.996
<v Speaker 2>I don't know, business side to this right, which is,

0:19:40.356 --> 0:19:42.196
<v Speaker 2>you were working for Google when you came up with this.

0:19:42.356 --> 0:19:48.316
<v Speaker 2>Google presumably owned the idea, had intellectual property around.

0:19:47.996 --> 0:19:49.956
<v Speaker 1>The idea has filed many a patent.

0:19:50.116 --> 0:19:52.436
<v Speaker 2>Was it just a choice Google made to let everybody

0:19:52.556 --> 0:19:56.196
<v Speaker 2>use it? Like when you see the fastest growing consumer

0:19:56.356 --> 0:19:58.716
<v Speaker 2>product in the history of the world not only built

0:19:58.716 --> 0:20:02.076
<v Speaker 2>on this idea, but using the name like and it's

0:20:02.116 --> 0:20:04.356
<v Speaker 2>a different company that was five years later.

0:20:04.236 --> 0:20:04.836
<v Speaker 1>Five years later.

0:20:04.876 --> 0:20:07.436
<v Speaker 2>But a patent's good for more than five years? Is

0:20:07.476 --> 0:20:08.276
<v Speaker 2>that a choice?

0:20:08.356 --> 0:20:10.516
<v Speaker 1>Is that a strategic choice? What's going on there?

0:20:11.036 --> 0:20:14.196
<v Speaker 1>So the choice to do it in the first place,

0:20:15.036 --> 0:20:19.396
<v Speaker 1>to publish it in the first place, is really based

0:20:19.476 --> 0:20:23.916
<v Speaker 1>on and and rooted in a deep conviction of Google

0:20:23.956 --> 0:20:26.636
<v Speaker 1>at the time, And I'm actually pretty sure it still

0:20:26.676 --> 0:20:31.596
<v Speaker 1>is the case that it is. Actually these developments are

0:20:31.676 --> 0:20:34.356
<v Speaker 1>the tide that floats all boats, that lifts

0:20:33.876 --> 0:20:38.796
<v Speaker 2>all boats, like a belief in progress, a belief in progress,

0:20:38.796 --> 0:20:40.596
<v Speaker 2>a good old fashioned Now.

0:20:40.996 --> 0:20:45.436
<v Speaker 1>It's also the case that at the time, organizationally, that

0:20:45.556 --> 0:20:51.596
<v Speaker 1>specific research arm was unusually separated from the product organizations.

0:20:51.756 --> 0:20:56.476
<v Speaker 1>And the reason why Brain or in general, the deep

0:20:56.556 --> 0:21:02.436
<v Speaker 1>learning groups were more separated was in part historical, namely

0:21:02.676 --> 0:21:05.716
<v Speaker 1>that when they started out there were no applications and

0:21:05.876 --> 0:21:08.996
<v Speaker 1>the technology was not ready for being applied, and so

0:21:09.356 --> 0:21:13.996
<v Speaker 1>it's completely understandable and just you know a consequence of

0:21:14.156 --> 0:21:20.196
<v Speaker 1>organic developments that when this technology suddenly is on the

0:21:20.236 --> 0:21:24.836
<v Speaker 1>cusp of being incredibly impactful, you're probably still under utilizing

0:21:24.876 --> 0:21:29.796
<v Speaker 1>it internally and potentially also not yet treating it in

0:21:29.876 --> 0:21:32.756
<v Speaker 1>the same way as you would have maybe otherwise treated

0:21:32.836 --> 0:21:35.036
<v Speaker 1>previous trade secrets.

0:21:34.716 --> 0:21:39.516
<v Speaker 2>For example, it feels like this out-there research project,

0:21:39.716 --> 0:21:42.276
<v Speaker 2>not like what's going to be this consumer.

0:21:42.476 --> 0:21:47.316
<v Speaker 1>Product exactly exactly, And to be fair, it took Open

0:21:47.356 --> 0:21:49.836
<v Speaker 1>AI in this case a fair amount of time

0:21:50.116 --> 0:21:54.396
<v Speaker 1>and to then turn this into this product, and most

0:21:54.396 --> 0:21:57.876
<v Speaker 1>of that time it also from their vantage point, wasn't

0:21:57.876 --> 0:22:01.036
<v Speaker 1>a product. Right. So up until all the way through

0:22:01.676 --> 0:22:06.836
<v Speaker 1>ChatGPT, OpenAI have published all of their GPT developments,

0:22:07.356 --> 0:22:10.036
<v Speaker 1>maybe not all, but you know, a large fraction of

0:22:10.676 --> 0:22:11.316
<v Speaker 1>their work on this.

0:22:11.516 --> 0:22:12.516
<v Speaker 2>Yeah, their early models.

0:22:12.516 --> 0:22:15.676
<v Speaker 1>The whole models were open exactly. They were more true

0:22:15.676 --> 0:22:19.716
<v Speaker 1>to their name really also believing in the same thing.

0:22:19.716 --> 0:22:22.396
<v Speaker 1>And it was only really after ChatGPT and after

0:22:22.476 --> 0:22:27.156
<v Speaker 1>this, to them also surprising to a certain extent, success,

0:22:27.676 --> 0:22:31.556
<v Speaker 1>that they started to become more closed as well when

0:22:31.596 --> 0:22:37.676
<v Speaker 1>it comes to scientific developments in this space. We'll be

0:22:37.716 --> 0:22:54.836
<v Speaker 1>back in just a minute. Let's talk about your company.

0:22:55.236 --> 0:22:58.156
<v Speaker 1>When'd you decide to start Inceptive? The decision took a

0:22:58.156 --> 0:23:03.196
<v Speaker 1>while and was influenced by events that happened over the

0:23:03.236 --> 0:23:07.116
<v Speaker 1>course of about three months two to three months in

0:23:07.196 --> 0:23:12.196
<v Speaker 1>late twenty twenty, starting with the birth of my first child.

0:23:13.116 --> 0:23:16.916
<v Speaker 1>So when am I was born, two things happened. Number one,

0:23:17.516 --> 0:23:21.316
<v Speaker 1>witnessing a pregnancy and a birth during a pandemic where

0:23:21.516 --> 0:23:24.916
<v Speaker 1>there's a pathogen that's rapidly spreading, and so all of

0:23:24.956 --> 0:23:29.196
<v Speaker 1>that was a pretty daunting experience, and everything went great,

0:23:30.276 --> 0:23:35.436
<v Speaker 1>But having this new human in my arms also really

0:23:35.676 --> 0:23:41.636
<v Speaker 1>made me question if I couldn't more directly affect people's

0:23:41.676 --> 0:23:45.956
<v Speaker 1>lives positively with my work. And so I was at

0:23:46.036 --> 0:23:49.836
<v Speaker 1>the time quite confident that indirectly it would have effect

0:23:49.916 --> 0:23:53.556
<v Speaker 1>also on things like medicine, biology, etc. But I was wondering,

0:23:53.916 --> 0:23:58.196
<v Speaker 1>couldn't this happen more directly if I focused more on it.

0:23:58.436 --> 0:24:00.916
<v Speaker 1>The next thing that happened was that AlphaFold 2

0:24:01.636 --> 0:24:05.156
<v Speaker 1>results at CASP14 were published. CASP14 is this

0:24:05.636 --> 0:24:09.596
<v Speaker 1>biennial challenge for protein structure prediction and some other related problems.

0:24:09.716 --> 0:24:11.636
<v Speaker 1>This is the protein folding problem, and this is the

0:24:11.636 --> 0:24:13.036
<v Speaker 1>protein folding problem exactly.

0:24:13.076 --> 0:24:15.836
<v Speaker 2>The machine learning solving the protein folding problem, which had

0:24:15.876 --> 0:24:18.676
<v Speaker 2>been a problem for decades: given a chain of amino

0:24:18.716 --> 0:24:21.476
<v Speaker 2>acids, predict the 3D structure of a protein. Precisely. And

0:24:22.276 --> 0:24:24.676
<v Speaker 2>humans failed and machine learning succeeded.

0:24:24.756 --> 0:24:29.356
<v Speaker 1>Just amazing. Yes, it's a great example. Humans failed despite

0:24:29.396 --> 0:24:33.156
<v Speaker 1>the fact that we actually understand the physics fundamentally, but

0:24:33.276 --> 0:24:37.276
<v Speaker 1>we still couldn't create models that were good enough using

0:24:37.316 --> 0:24:39.756
<v Speaker 1>our conceptual understanding of the processes involved.

0:24:39.876 --> 0:24:42.636
<v Speaker 2>You would think an algorithm would work on that one, right,

0:24:42.676 --> 0:24:45.116
<v Speaker 2>You would just think an old school set of rules,

0:24:45.196 --> 0:24:48.196
<v Speaker 2>like we know what the molecules look like, we know

0:24:48.316 --> 0:24:51.516
<v Speaker 2>the laws of physics. It's amazing that we couldn't predict

0:24:51.556 --> 0:24:53.276
<v Speaker 2>it that way. Right. All you want to know is

0:24:53.316 --> 0:24:55.396
<v Speaker 2>what shape is the protein going to be? You know

0:24:55.516 --> 0:24:57.916
<v Speaker 2>all of the constituent parts, you know every atom in it,

0:24:57.956 --> 0:25:00.276
<v Speaker 2>and you still couldn't predict it with a set of rules,

0:25:00.276 --> 0:25:02.796
<v Speaker 2>but AI machine learning could.

0:25:03.556 --> 0:25:07.036
<v Speaker 1>Amazing, Yes, and it is amazing. Actually, when you put

0:25:07.036 --> 0:25:09.156
<v Speaker 1>it like this, it's important to point out that and

0:25:09.676 --> 0:25:12.756
<v Speaker 1>when we say we understand it, we make massive oversimplifying

0:25:12.796 --> 0:25:16.596
<v Speaker 1>assumptions because we ignore all the other players that are

0:25:16.636 --> 0:25:19.876
<v Speaker 1>present when a protein folds. We ignore a lot of

0:25:19.916 --> 0:25:22.796
<v Speaker 1>the kinetics of it because we say we know the structure,

0:25:22.996 --> 0:25:25.156
<v Speaker 1>but the truth is, we don't know all the wiggling

0:25:25.236 --> 0:25:27.516
<v Speaker 1>and all the shenanigans that happen on the way there, right,

0:25:27.556 --> 0:25:31.996
<v Speaker 1>and we don't know about uh, you know, chaperone proteins

0:25:31.996 --> 0:25:34.076
<v Speaker 1>that are there to influence the folding. We don't know

0:25:34.316 --> 0:25:36.636
<v Speaker 1>around all sorts of other I'm doing the physics one.

0:25:36.676 --> 0:25:40.796
<v Speaker 2>I'm doing the assume-a-frictionless-plane version of protein folding. Precisely.

0:25:40.436 --> 0:25:43.556
<v Speaker 1>Precisely, precisely. And the beauty is that deep learning doesn't

0:25:43.556 --> 0:25:45.556
<v Speaker 1>need to make this assumption. AI doesn't need to make

0:25:45.556 --> 0:25:48.396
<v Speaker 1>this assumption. AI just looks at data, and it

0:25:48.436 --> 0:25:51.356
<v Speaker 1>can look at more data than any human or even

0:25:51.516 --> 0:25:55.796
<v Speaker 1>humanity eventually could look at together. It's such a good

0:25:55.836 --> 0:25:59.076
<v Speaker 1>example problem to demonstrate that these models are ready for

0:25:59.156 --> 0:26:02.476
<v Speaker 1>prime time in this field and ready for lots of applications,

0:26:02.476 --> 0:26:04.676
<v Speaker 1>not just one or two, but many. Sold. And so

0:26:04.876 --> 0:26:08.156
<v Speaker 1>that happens, so sold exactly. And then the third thing

0:26:08.316 --> 0:26:14.036
<v Speaker 1>was that the COVID mRNA vaccines came out with astonishing

0:26:14.196 --> 0:26:17.196
<v Speaker 1>ninety plus percent out of.

0:26:17.156 --> 0:26:21.996
<v Speaker 2>the gate. They're still so underrated. At the

0:26:21.996 --> 0:26:24.756
<v Speaker 2>beginning of the pandemic, people were like, it'll be two

0:26:24.836 --> 0:26:27.596
<v Speaker 2>or three years, and if they're sixty percent effective, that'll be

0:26:27.516 --> 0:26:30.836
<v Speaker 1>Great, exactly exactly, And so everybody forgets. Everybody forgets it.

0:26:30.996 --> 0:26:33.676
<v Speaker 1>And when you look at it, this is a molecule

0:26:33.756 --> 0:26:35.996
<v Speaker 1>family that was for you know, most of the time

0:26:35.996 --> 0:26:38.116
<v Speaker 1>that we've known about it since the sixties, I suppose

0:26:38.756 --> 0:26:43.796
<v Speaker 1>we've treated it like a neglected stepchild of molecular biology,

0:26:43.996 --> 0:26:47.156
<v Speaker 1>because... You're talking about RNA in general? In general.

0:26:47.876 --> 0:26:49.916
<v Speaker 2>Everybody loves DNA, right, DNA.

0:26:50.036 --> 0:26:53.716
<v Speaker 3>Everybody loves DNA movie star, Yeah, exactly, exactly, even though

0:26:53.756 --> 0:26:57.316
<v Speaker 3>now looking back, DNA is merely you know, the place

0:26:57.356 --> 0:27:00.716
<v Speaker 3>where life takes its notes, maybe the hard drive and

0:27:00.796 --> 0:27:01.356
<v Speaker 3>the memory.

0:27:01.556 --> 0:27:04.356
<v Speaker 1>It's the book, right, it's the book. So but but

0:27:04.436 --> 0:27:07.076
<v Speaker 1>at the end of the day, it was this molecule

0:27:07.076 --> 0:27:10.196
<v Speaker 1>family that was about to save, you know, depending on them,

0:27:10.396 --> 0:27:14.276
<v Speaker 1>tens of millions of lives and in rapid time. So

0:27:14.356 --> 0:27:16.836
<v Speaker 1>all these things hold, but we have no training data

0:27:16.956 --> 0:27:21.596
<v Speaker 1>to apply anything like alpha fold to this specific molecule family,

0:27:21.676 --> 0:27:24.356
<v Speaker 1>no training data to speak of. We had two hundred

0:27:24.396 --> 0:27:28.876
<v Speaker 1>thousand known protein structures at the time, I believe, maybe optimistically,

0:27:28.916 --> 0:27:31.996
<v Speaker 1>we had maybe twelve hundred known RNA structures. And on

0:27:32.036 --> 0:27:34.636
<v Speaker 1>top of that, it was also fairly clear that for

0:27:34.796 --> 0:27:38.236
<v Speaker 1>RNA going directly to function would be much much more important,

0:27:38.276 --> 0:27:41.796
<v Speaker 1>because it's in a certain sense a less strongly structured molecule,

0:27:41.876 --> 0:27:44.916
<v Speaker 1>and other aspects of the molecule might play a bigger role.

0:27:45.756 --> 0:27:49.196
<v Speaker 1>And then on top of that, the attention that generative

0:27:49.236 --> 0:27:53.476
<v Speaker 1>AI was receiving overall, also now in the field of

0:27:53.716 --> 0:27:58.756
<v Speaker 1>pharma or of medicine, was building, And so I ended

0:27:58.836 --> 0:28:02.996
<v Speaker 1>up finding myself in a conversation where very I would

0:28:02.996 --> 0:28:07.956
<v Speaker 1>say wise longtime mentor of mine pointed out that, you know,

0:28:08.036 --> 0:28:11.316
<v Speaker 1>maybe ten years from now or so, somebody could tell

0:28:11.396 --> 0:28:14.836
<v Speaker 1>my daughter that there was this perfect storm where this

0:28:14.956 --> 0:28:17.516
<v Speaker 1>macromolecule with no training data was about to save

0:28:17.556 --> 0:28:20.076
<v Speaker 1>the world and could do so much more in the

0:28:20.076 --> 0:28:23.996
<v Speaker 1>direction of positively impacting people's lives. We didn't have training data,

0:28:24.396 --> 0:28:27.676
<v Speaker 1>it would be very expensive to create it, but using the

0:28:27.796 --> 0:28:30.196
<v Speaker 1>technology that I've been or technologies that I'd been working

0:28:30.276 --> 0:28:32.076
<v Speaker 1>on for the last I don't know, ten plus years,

0:28:32.516 --> 0:28:36.076
<v Speaker 1>and the ability because of the attention that people were

0:28:36.716 --> 0:28:40.316
<v Speaker 1>now giving to AI in this field the ability to

0:28:40.356 --> 0:28:43.116
<v Speaker 1>raise quite a bit of money. I, in that position,

0:28:43.276 --> 0:28:47.956
<v Speaker 1>chose to stay back at my cushy dream job in

0:28:47.956 --> 0:28:53.436
<v Speaker 1>big tech and not actually take this opportunity to really

0:28:53.436 --> 0:28:56.716
<v Speaker 1>positively impact people's lives, And that idea was not one

0:28:56.796 --> 0:28:58.556
<v Speaker 1>I was willing to entertain.

0:28:59.036 --> 0:29:00.956
<v Speaker 2>You couldn't just coast it out at Google and let

0:29:01.036 --> 0:29:03.316
<v Speaker 2>somebody else go figure out RNA.

0:29:03.516 --> 0:29:06.556
<v Speaker 1>Yeah, and it's not just RNA. I think RNA is

0:29:06.556 --> 0:29:08.356
<v Speaker 1>a great starting point at the end of the day,

0:29:08.676 --> 0:29:14.916
<v Speaker 1>but building models that learn from first of all, all

0:29:14.916 --> 0:29:17.076
<v Speaker 1>the publicly available data that we can possibly get our

0:29:17.116 --> 0:29:19.596
<v Speaker 1>hands on, but also from data that we can reasonably

0:29:19.636 --> 0:29:24.516
<v Speaker 1>effectively create in our own lab. How to design molecules

0:29:24.596 --> 0:29:28.356
<v Speaker 1>for specific functions is something that now is within reach

0:29:28.636 --> 0:29:32.156
<v Speaker 1>and that will in the next years, in the years

0:29:32.196 --> 0:29:36.116
<v Speaker 1>to come, have completely transformational impact on how we even

0:29:36.156 --> 0:29:41.716
<v Speaker 1>think about what medicines are. That any opportunity to speed

0:29:41.756 --> 0:29:44.596
<v Speaker 1>this up, to make this happen, even just a day

0:29:44.676 --> 0:29:48.116
<v Speaker 1>sooner than it could have otherwise happened, is incredibly valuable

0:29:48.196 --> 0:29:48.836
<v Speaker 1>in my opinion.

0:29:49.116 --> 0:29:52.116
<v Speaker 2>As you're talking about this idea that the absence of

0:29:52.276 --> 0:29:55.116
<v Speaker 2>training data is kind of seems to be at the

0:29:55.116 --> 0:29:56.716
<v Speaker 2>center of it, right, It seems to be the core

0:29:57.396 --> 0:30:01.156
<v Speaker 2>yeah problem, which makes sense, right, Like the reason language

0:30:01.156 --> 0:30:03.356
<v Speaker 2>works so well is basically because of the Internet. I know,

0:30:03.436 --> 0:30:05.796
<v Speaker 2>now we're going beyond it, but like it just happened

0:30:05.796 --> 0:30:08.556
<v Speaker 2>to be that there was this incredibly giant set of

0:30:08.636 --> 0:30:12.556
<v Speaker 2>natural language that became available. We don't have anything

0:30:12.636 --> 0:30:14.676
<v Speaker 2>like that for RNA, so are you. I mean, it's

0:30:14.756 --> 0:30:19.916
<v Speaker 2>kind of step one at Inceptive, creating the data. Is

0:30:19.956 --> 0:30:21.476
<v Speaker 2>that kind of what's happening?

0:30:22.356 --> 0:30:25.156
<v Speaker 1>So step one at Inceptive is learning to use all

0:30:25.196 --> 0:30:27.036
<v Speaker 1>the data or was I think we've made a lot

0:30:27.036 --> 0:30:28.796
<v Speaker 1>of focus in that direction, learning to use all the

0:30:28.876 --> 0:30:33.676
<v Speaker 1>data that is available already and identify what other data

0:30:33.716 --> 0:30:36.276
<v Speaker 1>we're missing, and then see how far we can get

0:30:36.316 --> 0:30:39.556
<v Speaker 1>with just the publicly available data and at the same

0:30:39.596 --> 0:30:42.996
<v Speaker 1>time scale up generating our own data. And it turns

0:30:42.996 --> 0:30:46.916
<v Speaker 1>out that actually, because of the nature of evolution, because

0:30:46.916 --> 0:30:51.676
<v Speaker 1>of how evolution isn't actually incentivized to really explore the

0:30:51.876 --> 0:30:58.516
<v Speaker 1>entire space of possibilities. It is almost always a given that

0:30:58.676 --> 0:31:02.716
<v Speaker 1>if you are trying to design exceptional molecules, especially ones

0:31:02.836 --> 0:31:08.156
<v Speaker 1>that are not say, you know, natural formats, you are

0:31:08.396 --> 0:31:11.596
<v Speaker 1>basically guaranteed to need novel training data.

0:31:11.916 --> 0:31:15.276
<v Speaker 2>Yeah, basically you're saying you build RNAs that don't exist

0:31:15.356 --> 0:31:17.796
<v Speaker 2>in the world that have therapeutic uses, and there's no

0:31:17.996 --> 0:31:19.556
<v Speaker 2>kind of definitionally no training data.

0:31:19.636 --> 0:31:21.916
<v Speaker 1>Yes, that exist. The funny thing is we have a

0:31:21.956 --> 0:31:25.596
<v Speaker 1>few of them, and so we have existence proofs of

0:31:25.796 --> 0:31:32.956
<v Speaker 1>RNA molecules, for example, RNA viruses that actually exhibit incredibly

0:31:32.996 --> 0:31:38.316
<v Speaker 1>complex different functions in our cells, that do all sorts of

0:31:38.436 --> 0:31:40.876
<v Speaker 1>things that we don't usually like. But if we could

0:31:40.956 --> 0:31:43.836
<v Speaker 1>use those, you know, for good, If we could use those,

0:31:44.236 --> 0:31:48.396
<v Speaker 1>you know, in ways that would actually be aimed at

0:31:48.396 --> 0:31:51.676
<v Speaker 1>fighting disease rather than creating them, those kinds of functions,

0:31:51.716 --> 0:31:55.516
<v Speaker 1>even just a small subset of them, would really transform

0:31:55.556 --> 0:31:58.076
<v Speaker 1>medicine already. And so we know it's possible. What are

0:31:58.076 --> 0:31:59.516
<v Speaker 1>you dreaming of when you say that, what are you

0:31:59.556 --> 0:32:03.516
<v Speaker 1>thinking of, specifically? Okay, so, for example, right, one estimate

0:32:03.676 --> 0:32:07.756
<v Speaker 1>is that in order for COVID to infect you, you

0:32:07.796 --> 0:32:13.436
<v Speaker 1>would need potentially as few as five COVID genomes inside

0:32:13.436 --> 0:32:16.436
<v Speaker 1>your organism that's already in five five viral particles. Five

0:32:16.516 --> 0:32:21.356
<v Speaker 1>viral particles. Yeah, you inhale those, you wouldn't have to

0:32:21.516 --> 0:32:24.516
<v Speaker 1>inject it you wouldn't even have to swallow it, you

0:32:24.516 --> 0:32:25.076
<v Speaker 1>inhale them.

0:32:25.316 --> 0:32:27.156
<v Speaker 2>If we could have a medicine that worked as well

0:32:27.156 --> 0:32:29.076
<v Speaker 2>as a disease is a version of your.

0:32:29.076 --> 0:32:31.996
<v Speaker 1>Truth, exactly exactly so at the end of the day, right,

0:32:32.076 --> 0:32:36.716
<v Speaker 1>this medicine is able to spread in your body only

0:32:36.876 --> 0:32:40.156
<v Speaker 1>into certain types of organs and tissues and cells. It

0:32:40.196 --> 0:32:42.876
<v Speaker 1>does certain things there that are really quite complex, right,

0:32:43.076 --> 0:32:46.916
<v Speaker 1>changing the cell's behavior again not usually in this case

0:32:46.996 --> 0:32:50.676
<v Speaker 1>in favorable ways, but still in ways that wouldn't have

0:32:50.716 --> 0:32:53.276
<v Speaker 1>to be modified that much in order to potentially be

0:32:53.436 --> 0:32:56.636
<v Speaker 1>exactly what you would need for complex multifactorial medicine. And

0:32:56.676 --> 0:32:58.756
<v Speaker 1>if you could make all of that happen by just

0:32:58.796 --> 0:33:02.756
<v Speaker 1>inhaling five of those molecules, then again, that would completely

0:33:02.836 --> 0:33:05.516
<v Speaker 1>change how you think about medicine. Right, you have viruses

0:33:05.756 --> 0:33:09.116
<v Speaker 1>that aren't immediately active, but that are inactive for long

0:33:09.116 --> 0:33:12.956
<v Speaker 1>periods of time in your organism, and only under certain conditions,

0:33:13.036 --> 0:33:19.516
<v Speaker 1>say under certain immune conditions, really start being reactivated. Why

0:33:19.516 --> 0:33:23.236
<v Speaker 1>can't we have medicines that work in a similar way

0:33:23.276 --> 0:33:26.756
<v Speaker 1>where you actually not only in a vaccination sense, but

0:33:26.876 --> 0:33:29.836
<v Speaker 1>where you take a medicine for a genetic predisposition for

0:33:29.836 --> 0:33:31.916
<v Speaker 1>a certain disease that you are able to take a

0:33:31.956 --> 0:33:33.676
<v Speaker 1>medicine, design a medicine that you can take and that

0:33:33.796 --> 0:33:36.876
<v Speaker 1>waits until the disease actually starts to develop, and only

0:33:36.876 --> 0:33:39.676
<v Speaker 1>then and only where that disease then starts to develop, becomes

0:33:39.716 --> 0:33:43.236
<v Speaker 1>active and actually affects it and potentially also then alarms

0:33:43.316 --> 0:33:44.916
<v Speaker 1>the doctor through a blood test.

0:33:45.916 --> 0:33:48.436
<v Speaker 2>Like for cancer cells or something. So you have some

0:33:48.956 --> 0:33:51.396
<v Speaker 2>kind of prophylactic medicine in your body and it is

0:33:51.516 --> 0:33:54.756
<v Speaker 2>encoded in such a way that it just hangs out there,

0:33:55.116 --> 0:33:58.476
<v Speaker 2>like herpes, to take a pathological example for example, and

0:33:58.516 --> 0:34:02.196
<v Speaker 2>only in certain settings does it do anything. And those

0:34:02.236 --> 0:34:05.076
<v Speaker 2>settings are if you see a cancer cell, destroy it,

0:34:05.116 --> 0:34:06.876
<v Speaker 2>otherwise just sit there. Precisely.

0:34:07.316 --> 0:34:09.476
<v Speaker 1>And if you can design those also in ways where

0:34:09.476 --> 0:34:12.356
<v Speaker 1>you can just make them all go away. When you know,

0:34:12.476 --> 0:34:15.956
<v Speaker 1>you take a say a completely harmless small molecule, and

0:34:15.996 --> 0:34:17.636
<v Speaker 1>that's again entirely feasible.

0:34:17.836 --> 0:34:21.036
<v Speaker 2>Sure, So, I mean you're dreaming big. These are wonderful

0:34:21.076 --> 0:34:23.276
<v Speaker 2>big, you know, science-fiction-y dreams that I hope

0:34:23.316 --> 0:34:27.076
<v Speaker 2>you figure them out. On a practical level. What's happening

0:34:27.076 --> 0:34:29.036
<v Speaker 2>at the company right now? How many people work there,

0:34:29.116 --> 0:34:30.716
<v Speaker 2>what are they doing, and what have they figured out

0:34:30.756 --> 0:34:31.036
<v Speaker 2>so far?

0:34:31.116 --> 0:34:35.636
<v Speaker 1>We're around forty. What we're doing is really exactly what

0:34:35.676 --> 0:34:40.796
<v Speaker 1>we just talked about. We're basically scaling data generation experiments

0:34:40.836 --> 0:34:44.516
<v Speaker 1>in our lab that allow us to assess a variety

0:34:44.556 --> 0:34:50.396
<v Speaker 1>of different functions of different mostly RNA molecules actually mostly

0:34:50.516 --> 0:34:54.876
<v Speaker 1>mRNA molecules at the moment, that are relevant to

0:34:55.116 --> 0:34:58.236
<v Speaker 1>a pretty broad variety of different diseases. And so this

0:34:58.356 --> 0:35:03.636
<v Speaker 1>ranges from things like infectious disease vaccines to cell therapies

0:35:03.676 --> 0:35:06.636
<v Speaker 1>that can be applied in oncology or an auto or

0:35:06.676 --> 0:35:12.076
<v Speaker 1>against autoimmune disease. We have mRNAs that we hope will

0:35:12.116 --> 0:35:16.076
<v Speaker 1>eventually be effective in enzyme replacement as enzyme replacement therapies

0:35:16.436 --> 0:35:20.396
<v Speaker 1>for families of a large family of rare diseases, and

0:35:20.436 --> 0:35:23.836
<v Speaker 1>the list goes on. And so we're creating this or

0:35:23.956 --> 0:35:28.836
<v Speaker 1>growing this training data set that eventually, on top of

0:35:30.236 --> 0:35:33.436
<v Speaker 1>foundation models that we pretrained on all publicly

0:35:33.476 --> 0:35:39.036
<v Speaker 1>available data, allow us to tune those foundation models towards

0:35:39.116 --> 0:35:44.636
<v Speaker 1>designing exceptional molecules for exactly those applications and many more

0:35:44.676 --> 0:35:45.956
<v Speaker 1>sharing similar properties.

0:35:45.996 --> 0:35:50.556
<v Speaker 2>So you basically build new mRNA molecules

0:35:50.596 --> 0:35:52.796
<v Speaker 2>and test them, and then you give that data to

0:35:52.876 --> 0:35:56.356
<v Speaker 2>your model and presumably it tells you what to build next,

0:35:56.396 --> 0:35:58.116
<v Speaker 2>or it helps you figure out what to build next.

0:35:58.116 --> 0:35:59.436
<v Speaker 2>It's sort of a loop in that way.

0:35:59.516 --> 0:36:03.236
<v Speaker 1>The models are definitely one interesting source for proposals if

0:36:03.276 --> 0:36:07.596
<v Speaker 1>you wish for what to synthesize and test next, they're

0:36:07.636 --> 0:36:11.036
<v Speaker 1>not the only such source, so we basically also explore

0:36:11.196 --> 0:36:14.916
<v Speaker 1>kind of and maybe less guided or heuristically guided ways,

0:36:15.316 --> 0:36:18.236
<v Speaker 1>but exactly so in some of the cases, it's really

0:36:18.316 --> 0:36:21.076
<v Speaker 1>quite iterative. For some of those functions and for some

0:36:21.156 --> 0:36:25.956
<v Speaker 1>of those modalities and diseases or disease targets, we're actually

0:36:25.956 --> 0:36:29.316
<v Speaker 1>already at a point where our models can spit out

0:36:29.476 --> 0:36:33.116
<v Speaker 1>entirely novel molecules that really are unlike anything they've ever

0:36:33.156 --> 0:36:37.756
<v Speaker 1>seen or we've ever seen in nature, that very consistently

0:36:38.676 --> 0:36:43.156
<v Speaker 1>perform quite favorably compared to pretty strong baselines by incumbents

0:36:43.156 --> 0:36:43.636
<v Speaker 1>in the field.

0:36:44.076 --> 0:36:48.716
<v Speaker 2>When you say perform quite favorably compared to baselines by

0:36:48.716 --> 0:36:50.916
<v Speaker 2>incumbents in the field, and does that on some level

0:36:50.996 --> 0:36:54.196
<v Speaker 2>mean better than what experts would think.

0:36:54.076 --> 0:36:56.756
<v Speaker 1>Up, better than what experts can think up, and also

0:36:56.876 --> 0:37:00.876
<v Speaker 1>better than more traditional machine learning tools can easily produce.

0:37:01.276 --> 0:37:03.796
<v Speaker 2>It's like that famous moment in the Go match when

0:37:03.836 --> 0:37:07.476
<v Speaker 2>AlphaGo made some move that like no human being

0:37:07.716 --> 0:37:08.796
<v Speaker 2>ever would have thought of.

0:37:09.796 --> 0:37:14.276
<v Speaker 1>Yes, so I would say we've long passed the move

0:37:14.356 --> 0:37:18.556
<v Speaker 1>thirty seven in the sense that our understanding of the

0:37:18.676 --> 0:37:23.116
<v Speaker 1>underlying biological phenomena is so incomplete that for most of

0:37:23.196 --> 0:37:26.756
<v Speaker 1>the things that we're able to design for, we don't

0:37:26.756 --> 0:37:28.396
<v Speaker 1>really understand why they happen.

0:37:28.836 --> 0:37:31.196
<v Speaker 2>Huh, when you say we, you mean at Inceptive or

0:37:31.236 --> 0:37:32.916
<v Speaker 2>do you mean just medicine in general?

0:37:33.516 --> 0:37:34.916
<v Speaker 1>I would say just medicine in general.

0:37:35.316 --> 0:37:39.156
<v Speaker 2>Okay, So Inceptive is doing this very kind of high

0:37:39.276 --> 0:37:42.836
<v Speaker 2>level work, right, I mean building what will hopefully be

0:37:42.876 --> 0:37:46.236
<v Speaker 2>the foundation. What's the right amount of time in the

0:37:46.236 --> 0:37:48.716
<v Speaker 2>future to ask about when will we know if it works?

0:37:48.836 --> 0:37:49.956
<v Speaker 2>You think five years?

0:37:50.316 --> 0:37:55.276
<v Speaker 1>So the general idea of using generative AI and similar

0:37:55.436 --> 0:38:02.156
<v Speaker 1>techniques to generate therapeutics, there are some things in clinical

0:38:02.196 --> 0:38:07.076
<v Speaker 1>trials that were largely designed with AI. As far as

0:38:07.076 --> 0:38:11.276
<v Speaker 1>I know, we're still maybe now we have the first

0:38:11.356 --> 0:38:15.756
<v Speaker 1>trials just now starting for molecules that were truly entirely

0:38:15.796 --> 0:38:18.076
<v Speaker 1>designed by AI.

0:38:17.356 --> 0:38:20.436
<v Speaker 2>As opposed to sort of selected from a library.

0:38:20.036 --> 0:38:24.716
<v Speaker 1>Or selected, influenced, exactly. Selected, adjusted, tuned, and tweaked,

0:38:25.156 --> 0:38:29.076
<v Speaker 1>et cetera. Right, So that's really still only happening just now,

0:38:29.876 --> 0:38:33.556
<v Speaker 1>but we will see I believe, the first success or

0:38:33.636 --> 0:38:37.236
<v Speaker 1>a first success of such molecules, certainly within the next

0:38:37.276 --> 0:38:37.876
<v Speaker 1>five years.

0:38:38.116 --> 0:38:41.076
<v Speaker 2>What about more narrowly, the project at Inceptive?

0:38:41.156 --> 0:38:44.316
<v Speaker 1>It's a similar timeframe. We should be able to get

0:38:44.876 --> 0:38:49.356
<v Speaker 1>molecules into the clinic in the next few years, certainly

0:38:49.396 --> 0:38:51.476
<v Speaker 1>in the next handful of years. Now. These will not

0:38:51.556 --> 0:38:57.196
<v Speaker 1>be molecules where the objective that we used in

0:38:57.236 --> 0:39:01.756
<v Speaker 1>their design is you know, even remotely as complex or

0:39:02.236 --> 0:39:05.516
<v Speaker 1>you know, kind of the different functions that we're designing

0:39:05.556 --> 0:39:09.436
<v Speaker 1>for are are not going to be even remotely as

0:39:09.476 --> 0:39:12.276
<v Speaker 1>diverse as say what you would find because we used

0:39:12.276 --> 0:39:15.676
<v Speaker 1>this example earlier in an RNA virus. These will really be

0:39:15.876 --> 0:39:21.236
<v Speaker 1>much simpler. Those will be molecules that don't do things

0:39:21.276 --> 0:39:24.076
<v Speaker 1>that we couldn't possibly have done before, but that do

0:39:24.236 --> 0:39:28.276
<v Speaker 1>them much better in ways that are more accessible, in

0:39:28.356 --> 0:39:30.916
<v Speaker 1>ways that come with less side effects.

0:39:30.996 --> 0:39:34.916
<v Speaker 2>What biotech largely is is they make protein drugs. And

0:39:34.956 --> 0:39:37.836
<v Speaker 2>so if you could make an mRNA drug where you

0:39:37.836 --> 0:39:39.836
<v Speaker 2>put the mRNA into the body and the body

0:39:39.836 --> 0:39:42.436
<v Speaker 2>makes the protein, it wouldn't be some crazy sleeper cell

0:39:42.436 --> 0:39:44.196
<v Speaker 2>that sits in your body for twenty years or whatever,

0:39:44.436 --> 0:39:48.996
<v Speaker 2>but it might be a more practical alternative to today's biotech drugs.

0:39:49.116 --> 0:39:49.596
<v Speaker 1>Absolutely.

0:39:50.396 --> 0:39:53.756
<v Speaker 2>So you've had a kind of crash course in biology

0:39:53.796 --> 0:39:55.676
<v Speaker 2>in the last few years, yes, And I'm curious, like,

0:39:56.276 --> 0:40:00.156
<v Speaker 2>what is what is something that has been particularly compelling

0:40:00.276 --> 0:40:02.876
<v Speaker 2>or surprising or interesting to you that you have learned

0:40:02.876 --> 0:40:03.676
<v Speaker 2>about biology.

0:40:03.956 --> 0:40:07.596
<v Speaker 1>There are countless things. The biggest one, or the red thread

0:40:08.116 --> 0:40:15.196
<v Speaker 1>across many of them is really just how effective life

0:40:16.396 --> 0:40:22.676
<v Speaker 1>is at finding solutions to problems that on one hand

0:40:22.916 --> 0:40:27.756
<v Speaker 1>are incredibly robust, surprisingly robust, and on the other hand,

0:40:28.436 --> 0:40:34.476
<v Speaker 1>are so different from how we would design solutions to

0:40:34.636 --> 0:40:35.596
<v Speaker 1>similar problems.

0:40:36.356 --> 0:40:37.156
<v Speaker 2>Uh huh.

0:40:37.516 --> 0:40:40.116
<v Speaker 1>That really this comes back to this idea that we

0:40:40.196 --> 0:40:43.516
<v Speaker 1>might just not be particularly well equipped in terms of

0:40:43.516 --> 0:40:48.356
<v Speaker 1>cognitive capabilities to understand biology that basically, you know we

0:40:49.236 --> 0:40:53.276
<v Speaker 1>are we would never think to do it this way,

0:40:53.356 --> 0:40:57.196
<v Speaker 1>and how we think to do it is oftentimes much

0:40:57.276 --> 0:40:57.876
<v Speaker 1>more brittle.

0:40:58.556 --> 0:41:01.956
<v Speaker 2>Uh huh. Brittle is an interesting word, less, less resilient,

0:41:02.076 --> 0:41:04.036
<v Speaker 2>less able to persist under different conditions.

0:41:03.756 --> 0:41:06.796
<v Speaker 1>Exactly, exactly. I mean, you know, we still haven't

0:41:06.796 --> 0:41:08.916
<v Speaker 1>built machines that can fix themselves, for one.

0:41:09.116 --> 0:41:11.916
<v Speaker 2>Which is fundamentally the miracle of being a human being.

0:41:12.036 --> 0:41:17.516
<v Speaker 1>Just fundamentally exactly, exactly exactly and so and of course

0:41:17.516 --> 0:41:20.796
<v Speaker 1>this is true across the scales, right from from you know,

0:41:21.196 --> 0:41:24.996
<v Speaker 1>single cells all the way to complex organisms like ourselves

0:41:25.436 --> 0:41:33.116
<v Speaker 1>and and really just how many also very different kinds

0:41:33.116 --> 0:41:37.796
<v Speaker 1>of solutions life has found and or and or constantly

0:41:37.876 --> 0:41:40.956
<v Speaker 1>is finding. Uh. And you see this all over the place,

0:41:40.956 --> 0:41:47.716
<v Speaker 1>and it's both daunting, humbling, but also incredibly inspiring when

0:41:47.716 --> 0:41:51.316
<v Speaker 1>it comes to applying AI in this area, because again

0:41:51.356 --> 0:41:54.396
<v Speaker 1>I think that at least so far, it's the best

0:41:54.436 --> 0:41:58.636
<v Speaker 1>tool and maybe actually the only tool we have so

0:41:58.796 --> 0:42:02.716
<v Speaker 1>far, in the face of this kind of complexity, to really design

0:42:02.756 --> 0:42:07.356
<v Speaker 1>interventions, medicines, that go way beyond what we were

0:42:07.396 --> 0:42:09.556
<v Speaker 1>able to do or are able to do, just based

0:42:09.596 --> 0:42:10.916
<v Speaker 1>on our own conceptual understanding.

0:42:14.436 --> 0:42:16.636
<v Speaker 2>We'll be back in a minute with the lightning round.

0:42:18.196 --> 0:42:33.116
<v Speaker 2>Let's finish with the lightning round. As an

0:42:33.156 --> 0:42:38.476
<v Speaker 2>inventor of the Transformer model, are there particular possible uses

0:42:38.516 --> 0:42:41.836
<v Speaker 2>of it that worry you slash make you sad?

0:42:42.596 --> 0:42:48.396
<v Speaker 1>I am quite concerned about the p(doom), doomerism, whatever

0:42:48.436 --> 0:42:53.836
<v Speaker 1>you want to call it, existential-fear-instilling rhetoric that

0:42:53.956 --> 0:42:57.916
<v Speaker 1>is in some cases actually also promoted by people by

0:42:58.396 --> 0:42:59.956
<v Speaker 1>entities in the space.

0:43:00.436 --> 0:43:02.436
<v Speaker 2>So just to be clear, you're you're not worried about

0:43:02.436 --> 0:43:05.556
<v Speaker 2>the existential risk. You're worried about people talking.

0:43:05.676 --> 0:43:10.276
<v Speaker 1>I'm worried about the about the existential risk being inflated

0:43:10.356 --> 0:43:17.196
<v Speaker 1>or the perception being inflated to the extent that we

0:43:17.316 --> 0:43:20.996
<v Speaker 1>actually don't look enough at some of the much more

0:43:21.076 --> 0:43:23.996
<v Speaker 1>concrete and much more immediate risks. Right. I'm not going

0:43:24.076 --> 0:43:27.436
<v Speaker 1>to say that the existential risk is zero. That would

0:43:27.436 --> 0:43:27.836
<v Speaker 1>be silly.

0:43:27.956 --> 0:43:31.276
<v Speaker 2>What is a concrete and immediate risk that is, you

0:43:31.316 --> 0:43:32.236
<v Speaker 2>think, underdiscussed?

0:43:32.396 --> 0:43:37.396
<v Speaker 1>These large scale models are such effective tools in

0:43:37.636 --> 0:43:42.556
<v Speaker 1>manipulating people in large numbers already today, and it's happening

0:43:42.756 --> 0:43:47.436
<v Speaker 1>everywhere for many, many different purposes by in some cases

0:43:47.476 --> 0:43:52.516
<v Speaker 1>benevolent and in many cases malevolent actors that I really

0:43:53.156 --> 0:43:56.516
<v Speaker 1>firmly believe we need to look much more at things

0:43:56.636 --> 0:44:03.116
<v Speaker 1>like enabling cryptographic certification of human generated content, because doing

0:44:03.156 --> 0:44:05.196
<v Speaker 1>that with the machine generated content is not going to work.

0:44:05.316 --> 0:44:09.716
<v Speaker 1>But we definitely can cryptographically certify human generated content as such.

0:44:09.556 --> 0:44:12.996
<v Speaker 2>Basically watermarking or something, some way to say this

0:44:13.196 --> 0:44:14.036
<v Speaker 2>a human made this.

0:44:14.316 --> 0:44:15.716
<v Speaker 1>Exactly.

0:44:15.716 --> 0:44:19.156
<v Speaker 2>What would you be working on if you were not working in biology on

0:44:19.236 --> 0:44:19.956
<v Speaker 2>drug development?

0:44:20.516 --> 0:44:26.076
<v Speaker 1>Education using using artificial intelligence to democratize access to education.

0:44:26.836 --> 0:44:30.876
<v Speaker 2>What have you seen that has been impressive or compelling

0:44:30.916 --> 0:44:31.716
<v Speaker 2>to you in that regard?

0:44:31.916 --> 0:44:35.396
<v Speaker 1>There are lots of little examples so far and really countless.

0:44:36.116 --> 0:44:39.236
<v Speaker 1>It's what's happening at the Khan Academy. There are many

0:44:39.276 --> 0:44:44.076
<v Speaker 1>examples of AI applied to education problems in places like China,

0:44:44.156 --> 0:44:47.316
<v Speaker 1>for example. You have a bunch of very compelling examples

0:44:47.316 --> 0:44:50.556
<v Speaker 1>in fiction. A book I really like, by

0:44:50.636 --> 0:44:54.676
<v Speaker 1>Neal Stephenson, The Diamond Age, or A Young Lady's Illustrated

0:44:54.676 --> 0:44:56.756
<v Speaker 1>Primer, that I recommend if you.

0:44:56.796 --> 0:44:59.516
<v Speaker 2>Just everybody in AI talks about that, Well now they do.

0:44:59.596 --> 0:45:01.436
<v Speaker 1>Yeah, it's yeah, well.

0:45:01.316 --> 0:45:01.676
<v Speaker 2>Now they do.

0:45:01.796 --> 0:45:02.516
<v Speaker 2>You liked it before

0:45:02.516 --> 0:45:02.996
<v Speaker 2>it was cool?

0:45:03.076 --> 0:45:05.836
<v Speaker 1>I'm sure at one point I thought it was really

0:45:05.836 --> 0:45:09.516
<v Speaker 1>really important to ensure that Neal Stephenson knows that

0:45:09.836 --> 0:45:13.876
<v Speaker 1>we are about to be able to build the Primer,

0:45:13.956 --> 0:45:16.516
<v Speaker 1>and so I ended up having coffee with him to

0:45:16.516 --> 0:45:19.316
<v Speaker 1>tell him, oh, that's great. So at the end of

0:45:19.356 --> 0:45:24.756
<v Speaker 1>the day, maybe the biggest inspiration there is my daughter.

0:45:25.076 --> 0:45:28.476
<v Speaker 1>She's four and a half now, and I think she

0:45:28.796 --> 0:45:34.036
<v Speaker 1>could today read. She can read read okay, but she

0:45:34.076 --> 0:45:38.076
<v Speaker 1>could read, you know, grade school level if she had

0:45:38.116 --> 0:45:41.276
<v Speaker 1>access to you know, an AI tutor teaching her how

0:45:41.276 --> 0:45:41.596
<v Speaker 1>to read?

0:45:41.676 --> 0:45:45.276
<v Speaker 2>Does your daughter use AI, use, you know, AI chatbots?

0:45:45.316 --> 0:45:49.796
<v Speaker 1>Not directly without me, but we've

0:45:49.596 --> 0:45:54.396
<v Speaker 1>actually used ChatGPT to implement an AI reading tutor

0:45:55.236 --> 0:45:58.036
<v Speaker 1>that works reasonably well. I mean we basically, you know,

0:45:58.156 --> 0:46:01.476
<v Speaker 1>kind of as I call it now, vibe coding, vibe coded.

0:46:02.316 --> 0:46:04.956
<v Speaker 1>And she wasn't there for all of it. Took some time,

0:46:04.996 --> 0:46:06.396
<v Speaker 1>but she was there for some of it. Oh, you

0:46:06.556 --> 0:46:09.076
<v Speaker 1>vibe coded it with her? Yeah, well, I mean she was,

0:46:09.196 --> 0:46:11.996
<v Speaker 1>she was there. You know, she witnessed a good chunk

0:46:12.036 --> 0:46:14.196
<v Speaker 1>of it, Yes, although she was more interested in the

0:46:14.196 --> 0:46:16.636
<v Speaker 1>image generation parts. But yeah, we have a sketch of

0:46:16.676 --> 0:46:19.716
<v Speaker 1>one that she quite enjoys. So that's kind of like

0:46:19.756 --> 0:46:23.596
<v Speaker 1>the extent of her, at this age, using AI directly.

0:46:30.036 --> 0:46:32.676
<v Speaker 2>Jakob Uszkoreit is the CEO and

0:46:32.636 --> 0:46:36.156
<v Speaker 2>co-founder of Inceptive and a co-author of the

0:46:36.196 --> 0:46:40.076
<v Speaker 2>paper Attention Is All You Need. Just a quick note,

0:46:40.596 --> 0:46:42.996
<v Speaker 2>This is our last episode before a break of a

0:46:43.036 --> 0:46:45.996
<v Speaker 2>couple of weeks, and then we'll be back with more episodes.

0:46:46.676 --> 0:46:49.996
<v Speaker 2>Please email us at problem@pushkin.fm. We

0:46:50.036 --> 0:46:53.756
<v Speaker 2>are always looking for new guests for the show. Today's

0:46:53.756 --> 0:46:57.556
<v Speaker 2>show was produced by Trina Menino and Gabriel Hunter Chang. It

0:46:57.756 --> 0:47:01.756
<v Speaker 2>was edited by Alexandra Garaton and engineered by Sarah Bruguiere