WEBVTT - Predicting Human Health with AI 0:00:15.356 --> 0:00:24.076 Pushkin. Imagine something that is sort of like ChatGPT, 0:00:24.636 --> 0:00:28.076 but for the human body. ChatGPT looks at a 0:00:28.116 --> 0:00:31.236 sentence and predicts what words are likely to come next. 0:00:31.756 --> 0:00:34.316 This thing would look at a human body and predict 0:00:34.356 --> 0:00:38.156 what diseases are likely to come next. The body is 0:00:38.316 --> 0:00:41.836 wildly complex and unpredictable. This seems like a very, very 0:00:41.876 --> 0:00:45.076 hard problem, but it is a problem people are working on, 0:00:45.516 --> 0:00:48.996 and at least in some circumstances, they're figuring out how 0:00:49.036 --> 0:00:57.716 to make predictions that are truly useful. I'm Jacob Goldstein, 0:00:57.756 --> 0:00:59.636 and this is What's Your Problem, the show where I 0:00:59.716 --> 0:01:02.796 talk to people who are trying to make technological progress. 0:01:03.156 --> 0:01:06.476 My guest today is Charles Fisher, co-founder and CEO 0:01:06.636 --> 0:01:10.836 of Unlearn. Charles' problem is how do you build an 0:01:10.836 --> 0:01:14.836 AI model that can predict human health. Charles and his 0:01:14.876 --> 0:01:17.796 colleagues have built a predictive model of human health that's 0:01:17.876 --> 0:01:21.076 already being used in clinical trials for new drugs and 0:01:21.196 --> 0:01:24.356 new medical devices. But we started out talking about the 0:01:24.396 --> 0:01:27.956 big picture, about the very idea of trying to predict 0:01:27.996 --> 0:01:31.836 what's going to happen to a human body. 0:01:32.996 --> 0:01:36.356 It's funny when I talk about trying to quantify biology 0:01:36.356 --> 0:01:38.756 and make it predictable. I often get hit with this 0:01:40.796 --> 0:01:45.716 critique that biology isn't physics. Biology is complex, biology is 0:01:45.716 --> 0:01:47.476 not physics. We're not going to be able to do that.
0:01:47.876 --> 0:01:49.276 Less deterministic. 0:01:50.396 --> 0:01:54.636 Right. So for physics, for two thousand years, right, people 0:01:54.676 --> 0:01:58.476 started working on physics in ancient Greece. And for two 0:01:58.476 --> 0:02:03.596 thousand years, physics wasn't physics. Physics was unpredictable. Physics was 0:02:03.756 --> 0:02:07.676 too complex to understand until something was invented. And that 0:02:07.676 --> 0:02:09.076 thing was calculus. 0:02:10.116 --> 0:02:10.796 Until Newton, right. 0:02:10.916 --> 0:02:15.236 Yeah. So once calculus was invented, all of a sudden, 0:02:15.276 --> 0:02:18.036 we had a new language. And this language, this new 0:02:18.116 --> 0:02:21.276 kind of mathematics, allowed us to really easily describe lots 0:02:21.276 --> 0:02:25.036 of physical phenomena. And so now physics has become this 0:02:25.116 --> 0:02:28.676 thing that's very predictable and well understood. And that's what 0:02:28.676 --> 0:02:31.116 we've been waiting for in biology. We've been waiting for 0:02:31.156 --> 0:02:33.996 a new tool, a new language, a new mathematics that 0:02:34.036 --> 0:02:37.276 will allow us to understand these complex systems. And that's 0:02:37.396 --> 0:02:39.476 really what I think these new tools are. 0:02:39.796 --> 0:02:41.796 So I think, so your hope, your hope is that 0:02:41.916 --> 0:02:48.156 machine learning, generative AI, will do for medicine, biology, what 0:02:48.236 --> 0:02:49.756 calculus did for physics. 0:02:50.236 --> 0:02:53.516 Exactly. That is big, big. It's exactly what I hope. 0:02:53.916 --> 0:02:54.996 That's exactly what I hope. 0:02:55.036 --> 0:02:57.916 So okay, so this is your hope. You're starting this 0:02:58.036 --> 0:03:03.796 company to test your hypothesis. Uh, what do you do? 0:03:04.796 --> 0:03:06.116 What do you mean? What do I do? What I 0:03:06.156 --> 0:03:08.836 do on day one? Or like, what are we doing? No?
0:03:08.836 --> 0:03:12.276 No, no, we're back to twenty seventeen. You have this 0:03:12.516 --> 0:03:16.036 big up in the sky, two thousand year, thirty thousand 0:03:16.036 --> 0:03:18.636 foot idea. But you got to make a thing that 0:03:18.676 --> 0:03:20.876 somebody is going to pay you for that will hopefully 0:03:20.996 --> 0:03:22.916 use AI in medicine in some way. So what do 0:03:22.956 --> 0:03:23.156 you do? 0:03:23.836 --> 0:03:27.436 So we didn't know what would work, so we focused 0:03:27.516 --> 0:03:33.196 on two different problems at the time. So one problem is, 0:03:33.636 --> 0:03:36.596 let's imagine we're going to have a bunch of data 0:03:36.956 --> 0:03:40.356 from some, maybe a big, large collection of patients. We're 0:03:40.356 --> 0:03:43.276 gonna have this data all over time, so the symptoms 0:03:43.276 --> 0:03:46.996 that a patient might have every week for a year or 0:03:47.036 --> 0:03:49.716 something like that. And our goal is to be able 0:03:49.756 --> 0:03:53.596 to create a simulator of a patient's future health. So, 0:03:53.756 --> 0:03:55.876 given what I know about a patient in the past, 0:03:56.316 --> 0:03:59.196 can I simulate what will happen to them in the future? 0:03:59.916 --> 0:04:03.636 And presumably that is sort of probabilistic. I mean, what 0:04:03.756 --> 0:04:05.476 we know about health, like, you can say there's an 0:04:05.636 --> 0:04:08.476 x percent chance that in y years this person will 0:04:08.476 --> 0:04:09.876 have a heart attack, something like that. 0:04:10.236 --> 0:04:13.716 Exactly. Yeah, we want to, yes, because so many things 0:04:13.756 --> 0:04:16.996 are undetermined in that, you know, maybe, yeah, exactly. 0:04:16.836 --> 0:04:18.996 Right, and it's just the nature of the world, right? 0:04:19.356 --> 0:04:20.036 One hundred percent. 0:04:20.196 --> 0:04:21.036 Yeah.
0:04:21.556 --> 0:04:24.636 So okay, so you have this idea of basically where 0:04:24.836 --> 0:04:28.036 ChatGPT, which didn't exist yet, but predicts the next 0:04:28.036 --> 0:04:31.316 word with some probability, you want to predict the next 0:04:31.356 --> 0:04:32.476 health outcome. 0:04:32.076 --> 0:04:34.716 For... exactly, that is the big idea. Yeah. So that, 0:04:34.916 --> 0:04:36.676 that was one of them. The other, that was not 0:04:36.756 --> 0:04:38.516 the only one, that was the one that is what 0:04:38.516 --> 0:04:40.956 we do. The one that we didn't do is we 0:04:40.956 --> 0:04:43.676 were interested as well potentially, so that's at a very 0:04:43.716 --> 0:04:47.756 macroscopic scale, that's at the scale of the person, whereas 0:04:47.796 --> 0:04:49.956 the other thing we were interested in was potentially could 0:04:49.996 --> 0:04:52.116 we go at the micro scale and look at what's 0:04:52.156 --> 0:04:55.356 happening inside individual cells. We were interested in this at 0:04:55.396 --> 0:04:58.036 the beginning. Basically, the way we figured this out is 0:04:58.116 --> 0:05:01.116 we signed a few deals with pharma companies to try 0:05:01.156 --> 0:05:06.356 these things, and we found that the technology worked 0:05:06.396 --> 0:05:11.196 really well in this simulating health outcomes, and it didn't 0:05:11.196 --> 0:05:13.676 work very well when it comes down to simulating what's 0:05:13.716 --> 0:05:15.996 inside the cell. And I think this comes down to data, 0:05:16.356 --> 0:05:18.676 which is that we get a ton of data on 0:05:18.916 --> 0:05:21.516 human health outcomes, like literally every time you go to 0:05:21.556 --> 0:05:24.436 the doctor, there's data there on your health outcomes. But 0:05:24.476 --> 0:05:28.036 the data from the things inside the cell, there is 0:05:28.076 --> 0:05:31.636 a lot of it, but it's much more difficult to 0:05:31.676 --> 0:05:34.116 work with.
So I think that was a lot of 0:05:34.156 --> 0:05:37.596 what drove us in this direction, is really the focus 0:05:37.636 --> 0:05:39.556 on what we think we have the data to solve 0:05:39.636 --> 0:05:40.596 these kinds of problems. 0:05:40.676 --> 0:05:44.476 So, okay, you go in the direction of simulating health 0:05:44.516 --> 0:05:48.756 outcomes for patients, and in particular, sort of where you 0:05:48.796 --> 0:05:52.596 get to is working with companies that are running clinical trials. 0:05:52.596 --> 0:05:54.796 And I know eventually you get to a point where 0:05:54.836 --> 0:05:57.596 companies can use your model, use your software to run 0:05:57.636 --> 0:06:01.396 clinical trials with fewer patients. So just tell me about that 0:06:02.156 --> 0:06:03.476 arc, tell me how you get there. 0:06:04.076 --> 0:06:08.076 Clinical trials are, well, they take forever, and they're 0:06:08.076 --> 0:06:10.916 really, really expensive. Something might take like five years and 0:06:10.956 --> 0:06:14.796 cost one hundred million dollars to run a clinical trial. Yeah, 0:06:14.836 --> 0:06:17.996 in the way that these are hundreds or thousands of patients, right? Oh, 0:06:18.036 --> 0:06:21.996 thousands of patients typically. Right. Yeah. And typically half of 0:06:22.036 --> 0:06:24.516 the patients in a clinical trial are receiving a placebo. 0:06:25.436 --> 0:06:27.796 So you're going to randomly assign half to receive an 0:06:27.796 --> 0:06:30.476 experimental treatment, half to receive a placebo. And the reason 0:06:30.596 --> 0:06:33.916 is that every clinical trial is ultimately just doing a comparison. 0:06:34.396 --> 0:06:36.956 You're comparing how a patient responds to the new treatment 0:06:36.996 --> 0:06:38.796 to how they respond if they don't get that treatment.
0:06:38.836 --> 0:06:40.716 And let me just give a shout out to the 0:06:40.796 --> 0:06:44.836 randomized controlled trial as like a really beautiful construct, right, 0:06:45.636 --> 0:06:47.956 not that old? Not that old. I learned that 0:06:48.076 --> 0:06:51.636 preparing for this interview, like less than one hundred years old, amazingly. 0:06:52.676 --> 0:06:56.356 But it's a perfect way to assess, not perfect, it's 0:06:56.356 --> 0:06:59.956 a very, very good way to assess causality. It's really elegant. 0:07:00.156 --> 0:07:03.076 It is an elegant idea. But if you're a patient, 0:07:04.356 --> 0:07:06.996 why are you participating in a clinical trial at all? What's 0:07:07.036 --> 0:07:09.716 the number one reason people participate in clinical trials? They 0:07:09.716 --> 0:07:12.116 participate in clinical trials because they want access to this 0:07:12.196 --> 0:07:14.796 experimental treatment that you can't get any other way. That's 0:07:14.796 --> 0:07:17.716 the number one reason why patients are participating in clinical trials. 0:07:17.796 --> 0:07:19.156 Number one. Now they 0:07:19.076 --> 0:07:21.316 don't, they don't want to be randomized to the placebo. 0:07:21.516 --> 0:07:23.636 No, no, no, they don't. 0:07:23.716 --> 0:07:27.156 I can certainly understand that. It is the case, right, 0:07:27.276 --> 0:07:33.076 that most trials fail, meaning the drug is not helping 0:07:33.076 --> 0:07:36.836 you and possibly hurting you, meaning on average, you're better 0:07:36.916 --> 0:07:39.396 off being in the placebo arm. Like, that is true, right? 0:07:39.396 --> 0:07:42.516 Yeah, there's a principle of equipoise. But that's an academic, 0:07:42.636 --> 0:07:43.956 ivory-tower principle. 0:07:44.156 --> 0:07:48.636 I mean, it also is true. Just sue, that's fine, that's.
0:07:48.476 --> 0:07:52.716 Fine, but in the end, that's like, in the end, 0:07:52.836 --> 0:07:56.956 patients choose not to participate in clinical trials because they 0:07:56.996 --> 0:08:00.516 don't want to get a placebo. Patients drop out of 0:08:00.596 --> 0:08:03.196 clinical trials when they think they are getting a placebo. 0:08:03.796 --> 0:08:07.436 Those are also true. The number one reason those things happen 0:08:07.556 --> 0:08:08.516 are those reasons. Fair? 0:08:08.676 --> 0:08:08.916 Okay. 0:08:08.956 --> 0:08:12.636 Right. So, and in fact, twenty percent of clinical trials 0:08:12.636 --> 0:08:14.996 failed not because the drug didn't work, but because they 0:08:15.036 --> 0:08:19.476 just couldn't find enough people to participate, okay. And what 0:08:19.516 --> 0:08:23.916 we realized, though, is that there was a way for 0:08:24.076 --> 0:08:29.156 us not to try to replace the randomized controlled trial, 0:08:29.196 --> 0:08:31.716 but to make it better, and that what we are 0:08:31.756 --> 0:08:35.916 doing is we could take what we call digital twins 0:08:35.636 --> 0:08:38.516 of the patients, so these are these simulations of their, 0:08:38.596 --> 0:08:42.076 of their future outcomes, and that we could incorporate those 0:08:42.156 --> 0:08:48.836 data into RCTs directly, randomized controlled trials. We call 0:08:48.876 --> 0:08:51.276 it just kind of like a reimagining of RCTs. 0:08:51.396 --> 0:08:54.996 It's, it's, you're going to have an RCT that is 0:08:55.676 --> 0:09:01.156 more accurate, that requires fewer patients, and as 0:09:01.196 --> 0:09:03.356 a result, you get a lot of the benefits of 0:09:03.996 --> 0:09:06.756 faster trials, of things that are better for the patients. 0:09:06.996 --> 0:09:09.756 We can talk about that in a minute, but you 0:09:09.876 --> 0:09:11.476 keep all of the same scientific rigor. 0:09:12.716 --> 0:09:17.356 So specifically, okay, that's a good, like, big picture.
Specifically, 0:09:18.596 --> 0:09:19.116 how does it 0:09:19.076 --> 0:09:25.196 work? Right now, we build one model per disease. So, 0:09:25.276 --> 0:09:28.716 for example, we have a model for patients with Alzheimer's disease. 0:09:28.756 --> 0:09:31.236 We have a separate model for patients with ALS, we 0:09:31.276 --> 0:09:33.476 have a separate model for multiple sclerosis, et cetera. 0:09:33.876 --> 0:09:36.436 Let's pick one model and talk about it. What's the 0:09:36.436 --> 0:09:38.516 one that's farthest along? Which is the one that works 0:09:38.516 --> 0:09:38.876 the best? 0:09:39.076 --> 0:09:41.996 Yeah. So our Alzheimer's disease model is, that was our 0:09:41.996 --> 0:09:44.916 first one that we've published scientific papers on and things 0:09:44.956 --> 0:09:47.076 like this, so that one's our most well known. 0:09:47.356 --> 0:09:51.156 Okay, so you're setting out to build a model that 0:09:51.236 --> 0:09:55.196 will predict whether, what's going to happen, presumably to a 0:09:55.196 --> 0:09:58.196 patient who has the early stages of Alzheimer's disease. How 0:09:58.196 --> 0:10:00.876 will their disease progress? A hard thing to know in 0:10:00.916 --> 0:10:04.636 the real world. How do you build that? What do 0:10:04.676 --> 0:10:04.956 you do? 0:10:05.796 --> 0:10:07.836 So the first thing is that you need data to 0:10:07.916 --> 0:10:11.956 learn from. Yeah, it's kind of obvious. So our first 0:10:11.996 --> 0:10:14.076 step was like, oh, we say, okay, we want to 0:10:14.116 --> 0:10:16.236 have data sets where we get a ton of information 0:10:16.276 --> 0:10:19.436 about each patient. What's that mean? That means that at any 0:10:19.476 --> 0:10:22.436 individual time, I want to have a lot of 0:10:22.996 --> 0:10:25.276 different measurements made on that patient at each time. 0:10:26.156 --> 0:10:29.116 And presumably you want to have a lot of moments 0:10:29.196 --> 0:10:31.156 when lots of information... Exactly.
0:10:31.156 --> 0:10:32.156 You also want to have lots of. 0:10:32.196 --> 0:10:34.516 Lots of times over a long period of time, over 0:10:34.556 --> 0:10:35.276 a long period. 0:10:35.316 --> 0:10:37.316 Yeah, and so you know these are going to be 0:10:37.476 --> 0:10:39.756 for Alzheimer's. You're looking at a bunch of things related 0:10:39.836 --> 0:10:45.356 to the patient's cognitive performance on different assessments. Just also 0:10:45.396 --> 0:10:48.236 there's things about just their daily life. How are they 0:10:48.276 --> 0:10:51.076 able to function in their daily life. There's things related 0:10:51.116 --> 0:10:55.996 to their caregivers actually, like how does their caregiver rate 0:10:56.316 --> 0:11:00.796 their behavior? Brain imaging, blood tests, all that kind of information. 0:11:00.916 --> 0:11:02.796 You want to have as much of it about each patient. 0:11:02.876 --> 0:11:05.276 You want to have it as many times as possible. Sure, 0:11:05.516 --> 0:11:07.276 and we'll try to get that for you know, like 0:11:07.356 --> 0:11:11.516 fifty thousand people. And that's the kind of data set 0:11:11.596 --> 0:11:13.036 that we that we're starting with. 0:11:13.396 --> 0:11:16.836 And like, is there one repository that when you get that, 0:11:16.876 --> 0:11:18.276 you're like jackpot or what. 0:11:19.236 --> 0:11:23.316 No, we we have to aggregate data from lots and 0:11:23.356 --> 0:11:25.156 lots of different places to be able to build a 0:11:25.156 --> 0:11:25.996 big enough data set. 0:11:26.916 --> 0:11:29.196 Okay, so now you got the data, what do you 0:11:29.236 --> 0:11:29.676 do next? 0:11:30.436 --> 0:11:33.716 Then we got to train a model to to to 0:11:33.836 --> 0:11:37.036 be able to learn from those data how to simulate things. 0:11:37.396 --> 0:11:38.676 And now actually what we do. 
0:11:38.876 --> 0:11:42.556 In particular, in this case, how to predict, given some 0:11:42.596 --> 0:11:45.036 set of inputs for a patient, what's going to happen 0:11:45.076 --> 0:11:46.276 next. Exactly. 0:11:46.316 --> 0:11:48.956 And so this does look, you were using that analogy 0:11:49.036 --> 0:11:52.036 of like a language model predicts the next word. So 0:11:52.436 --> 0:11:55.036 given these words I've seen before, predict the next word. 0:11:55.316 --> 0:11:57.676 And that's, that is similar to how our models in 0:11:57.716 --> 0:11:59.956 these diseases work. So we're going to say, given I've 0:11:59.996 --> 0:12:02.956 observed these things in the past about a patient, what 0:12:03.036 --> 0:12:06.876 will happen to them next? That is very analogous 0:12:06.916 --> 0:12:07.876 to kind of what we're doing. 0:12:08.516 --> 0:12:11.276 Okay, so you build the model, how does it work? 0:12:11.276 --> 0:12:14.636 How does it work in a clinical trial, specifically so 0:12:14.676 --> 0:12:17.236 that, you know, the people running the trial can, can 0:12:17.316 --> 0:12:18.756 do it with fewer patients? 0:12:18.996 --> 0:12:25.836 Sure. So in a typical case, we're involved at the 0:12:25.876 --> 0:12:29.916 beginning of the clinical trial in the design of the protocol. Okay. 0:12:30.316 --> 0:12:34.556 So there's a question of how many patients should you 0:12:34.716 --> 0:12:37.716 randomize to your control group, how many patients do you 0:12:37.756 --> 0:12:40.076 need overall, and how many should be in the treatment, 0:12:40.076 --> 0:12:40.716 how many should be in 0:12:40.716 --> 0:12:42.876 the control. It's not always fifty to fifty. 0:12:43.196 --> 0:12:46.316 It's not always fifty to fifty in our studies. Our 0:12:46.396 --> 0:12:49.356 typical goal is to try to minimize the number of 0:12:49.396 --> 0:12:51.876 people that you need to put in the control group.
Okay, 0:12:52.996 --> 0:12:56.156 And so we're involved in doing, helping to do that 0:12:56.756 --> 0:12:58.796 calculation to say, here's how big your trial should be. 0:12:58.996 --> 0:13:03.476 And so then as patients enroll in the study, we 0:13:03.556 --> 0:13:08.436 take data from their first visit, before they receive whatever 0:13:08.596 --> 0:13:12.956 new treatment they're going to receive, and we take those data, 0:13:13.076 --> 0:13:15.796 we input them into our pre-trained model. So I 0:13:15.916 --> 0:13:17.876 like to think about, you know, ChatGPT: you give it a 0:13:17.916 --> 0:13:20.476 prompt and it gives you output. Same thing. We take 0:13:20.516 --> 0:13:22.716 the data from the patient, we prompt the model and 0:13:22.756 --> 0:13:25.276 it outputs their predictions for what will happen. 0:13:24.996 --> 0:13:26.596 In the... And to be clear, you do that for 0:13:26.716 --> 0:13:28.916 all of the patients in both arms, the treatments. 0:13:29.476 --> 0:13:32.356 Yes, yeah, and we don't know, right, it's blinded, blind, 0:13:32.356 --> 0:13:35.716 it's, you, it's blinded to us. We don't know what. Yeah, 0:13:35.756 --> 0:13:37.356 So we do that for one hundred percent of the 0:13:37.396 --> 0:13:42.156 patients and then we give those data to the customer, 0:13:42.756 --> 0:13:44.116 to the pharma company. 0:13:44.316 --> 0:13:46.796 So then what happens next? What happens next? 0:13:46.916 --> 0:13:50.076 We wait around for a while. Yeah. And then when 0:13:50.116 --> 0:13:53.196 the study is actually completed, right, and they, they, they 0:13:53.196 --> 0:13:57.596 do unblind the data. We have to help to, to 0:13:57.956 --> 0:14:01.356 say here's how you now can incorporate these, these predicted 0:14:01.396 --> 0:14:03.396 outcomes into the analysis. 0:14:02.956 --> 0:14:04.836 Like, so this is, this is it. Now we're at 0:14:04.836 --> 0:14:07.716 the moment now when the thing you have built is useful.
0:14:07.796 --> 0:14:11.476 So, so now it's, it's, they have done the study, 0:14:11.836 --> 0:14:14.796 they have the outcomes for the real human beings and 0:14:14.836 --> 0:14:17.956 they have the predicted outcomes from your model. How is 0:14:17.996 --> 0:14:19.716 your system, how's your model useful? 0:14:20.396 --> 0:14:22.996 So the very first thing that we're basically going to 0:14:22.996 --> 0:14:24.156 do is what I'm going to say, we're going to 0:14:24.196 --> 0:14:28.956 recalibrate our model. Recalibrate, and you're going to figure out 0:14:29.036 --> 0:14:33.236 a relationship between your predicted outcomes and your observed outcomes 0:14:33.276 --> 0:14:36.796 for the patients who really received the placebo, for 0:14:36.876 --> 0:14:39.116 the patients in the placebo group. And basically you're going 0:14:39.156 --> 0:14:40.436 to see how you did, how did we do. 0:14:40.876 --> 0:14:43.436 Yes, and in particular you're going to find out not just, 0:14:43.836 --> 0:14:45.716 it's not like a measure of was it good or bad, 0:14:45.756 --> 0:14:47.956 you're going to find out exactly how are they related. 0:14:48.916 --> 0:14:53.076 And then you can take that information and adjust your predictions. 0:14:53.636 --> 0:14:57.676 Okay, for everybody. So you can say, let's imagine that 0:14:57.956 --> 0:15:03.156 I find out, well, on average, I'm underestimating how 0:15:03.236 --> 0:15:05.436 much a patient would progress by one point per year. 0:15:05.476 --> 0:15:08.236 I'm on average underestimating it. Well, then I'll go through 0:15:08.236 --> 0:15:09.836 and I'll take my prediction and I'll be like, well, 0:15:10.476 --> 0:15:13.516 add one point, add one point per year.
And then 0:15:13.916 --> 0:15:15.876 now you have said, okay, well, now I've taken the 0:15:15.916 --> 0:15:18.236 model and I've been able to do it in such 0:15:18.236 --> 0:15:21.036 a way where I've fixed these mistakes by looking at 0:15:21.076 --> 0:15:23.556 the actual patients who got placebo. And now I'm 0:15:23.596 --> 0:15:25.596 going to apply that model to the patients in the 0:15:25.636 --> 0:15:28.996 treatment group, and I'm going to look at, now I 0:15:29.156 --> 0:15:31.676 just look at that difference between the patients in the 0:15:31.676 --> 0:15:33.636 treatment group and their predictions from the model, and I 0:15:33.676 --> 0:15:36.156 average that and I get an estimate for the treatment effect. 0:15:36.596 --> 0:15:39.996 Now, that is described in a two stage procedure. 0:15:40.236 --> 0:15:43.236 It's not actually a two stage procedure. It's one mathematical 0:15:43.236 --> 0:15:47.796 analysis that you do. But the thing that's really, 0:15:48.316 --> 0:15:53.036 I think, quite amazing actually is that this has a 0:15:53.596 --> 0:15:57.916 bunch of mathematical guarantees to it. We can actually prove 0:15:58.956 --> 0:16:01.596 that the estimate that you get for how effective the 0:16:01.636 --> 0:16:06.236 treatment is is still unbiased. So it's not an overestimate, 0:16:06.236 --> 0:16:09.836 it's not an underestimate, it's on average correct. We can prove 0:16:10.076 --> 0:16:12.636 that if you compute a P value from the analysis 0:16:12.636 --> 0:16:15.236 like you would typically do, that it has exactly the 0:16:15.316 --> 0:16:17.596 right properties as it does out of a regular RCT. 0:16:17.756 --> 0:16:20.516 P value is roughly the probability that the finding was 0:16:20.516 --> 0:16:20.916 a fluke. 0:16:22.156 --> 0:16:25.756 Yeah, right. Yeah. If you compute an error bar, the error bar 0:16:25.876 --> 0:16:27.996 you get from our analysis, the error bar you would 0:16:27.996 --> 0:16:31.916 get from a normal there.
They all have exactly identical statistics. 0:16:31.956 --> 0:16:35.476 This is not intuitive, but, but you're saying the mathematical 0:16:35.596 --> 0:16:39.076 fact is that it works. Yes. And just to be clear, 0:16:40.036 --> 0:16:42.716 what this allows you or the people running the trial 0:16:42.836 --> 0:16:46.796 to do is to enroll fewer people in the placebo 0:16:46.916 --> 0:16:49.636 arm, not none, but fewer than they otherwise would have 0:16:49.716 --> 0:16:52.236 had to, to get the same amount of statistical power. Right, 0:16:52.316 --> 0:16:55.076 that is the bottom line thing that you are delivering. Yes, 0:16:55.156 --> 0:16:57.956 that's correct. And it's something like a quarter or a 0:16:58.076 --> 0:17:00.036 third less, is that right? Yeah. 0:17:00.156 --> 0:17:03.956 So it depends on how accurate our models are. The 0:17:04.076 --> 0:17:06.516 more accurate the model is, the fewer patients you need 0:17:06.556 --> 0:17:10.676 in your placebo group. Sure. So typically right now, it's 0:17:10.796 --> 0:17:13.716 somewhere between like a quarter and like fifty percent. It depends 0:17:13.836 --> 0:17:15.796 on the specific details. 0:17:15.956 --> 0:17:19.236 So tell me what is the effect of that at 0:17:19.236 --> 0:17:21.476 a macro scale? What does it mean to say a 0:17:21.596 --> 0:17:26.036 drug company can get the same statistical power by enrolling 0:17:26.156 --> 0:17:30.196 twenty five percent fewer people in their study, specifically in 0:17:30.276 --> 0:17:30.876 the placebo arm? 0:17:31.916 --> 0:17:34.476 Well, I think that there are two things.
First is 0:17:35.356 --> 0:17:39.316 I think people don't always understand how expensive clinical trials 0:17:39.356 --> 0:17:43.116 are. You know, companies are paying one hundred, sometimes two 0:17:43.236 --> 0:17:46.036 hundred thousand dollars per patient in one of their clinical trials. 0:17:46.116 --> 0:17:49.196 So finding and enrolling and monitoring a patient for all 0:17:49.236 --> 0:17:52.116 that time is very, very expensive. It also just takes 0:17:52.116 --> 0:17:54.556 a long time to find people who are willing to participate. 0:17:55.316 --> 0:17:58.036 And so if you're talking about a large phase three trial, 0:17:58.276 --> 0:18:01.996 reducing the size of the control group by twenty five percent, 0:18:02.076 --> 0:18:04.156 that might mean like one hundred fewer patients that you 0:18:04.236 --> 0:18:06.916 need to actually recruit and enroll in your study, and 0:18:07.276 --> 0:18:09.516 that, that could be like a year. But you know, 0:18:09.676 --> 0:18:11.996 so you can save six months to a year off 0:18:12.036 --> 0:18:15.396 of your total clinical trial timeline. That means a lot, right, 0:18:16.116 --> 0:18:19.436 but both for patients, if the drug is actually successful, 0:18:19.876 --> 0:18:24.636 that's a year faster it gets to market. And you know, 0:18:24.716 --> 0:18:27.276 for the pharma company, that's obviously a big value proposition, 0:18:27.396 --> 0:18:29.116 being able to get the drug to market a year faster. 0:18:35.716 --> 0:18:39.836 In a minute, moving from clinical trials to individual patients. 0:18:47.396 --> 0:18:53.236 Now back to the show. What is the, what's the 0:18:53.276 --> 0:18:55.076 big picture? Where are you trying to get to, and 0:18:55.516 --> 0:19:01.076 you know, in the medium term and in the long term? So. 0:19:02.476 --> 0:19:06.796 The ability to understand what a person's health outcome is 0:19:06.836 --> 0:19:09.356 going to be under different scenarios.
This is, I think, 0:19:09.396 --> 0:19:12.396 what's really important. It's not just, hey, given that 0:19:12.436 --> 0:19:14.396 they would get a placebo, what's going to happen to 0:19:14.436 --> 0:19:16.636 the health outcomes? That's nice for clinical trials, but we 0:19:16.716 --> 0:19:19.476 want to know, hey, there's ten different treatment options for 0:19:19.556 --> 0:19:22.116 this patient, and if I were to give them each 0:19:22.156 --> 0:19:24.436 one of these different treatment options, what would their health 0:19:24.476 --> 0:19:26.356 outcomes look like in those different scenarios? 0:19:27.276 --> 0:19:30.036 So there you're also moving out of the clinical trial 0:19:30.596 --> 0:19:32.876 into the realm of like a doctor seeing a patient. 0:19:32.996 --> 0:19:35.756 Let's just be very clear, like, that, that's a huge leap, 0:19:36.076 --> 0:19:37.556 and like, that's what you're talking about. 0:19:37.796 --> 0:19:42.556 I think that there's a really good pathway to being 0:19:42.676 --> 0:19:47.596 able to build these things and make them useful for 0:19:47.996 --> 0:19:50.196 problems that are at the individual patient level. 0:19:50.396 --> 0:19:52.516 And is the narrow way to think about it, like, 0:19:53.236 --> 0:19:56.876 before you get to the magical computer that can predict 0:19:56.916 --> 0:19:59.316 everything for everybody, that you get to a very, very 0:19:59.396 --> 0:20:04.196 good model that can predict for individuals in certain circumstances 0:20:04.236 --> 0:20:06.116 a certain set of outcomes? So, for example, you might 0:20:06.156 --> 0:20:09.636 have a very, very good Alzheimer's model for certain patients 0:20:10.156 --> 0:20:12.996 at a certain stage of disease. This model is very 0:20:13.156 --> 0:20:15.396 powerful at the level of the individual. Is that the 0:20:15.436 --> 0:20:18.036 way to think about it? Yeah, the way... I'll tell you 0:20:18.036 --> 0:20:19.956 the way I think about it.
I think that the 0:20:20.076 --> 0:20:23.316 most important thing that models can do, which actually things 0:20:23.396 --> 0:20:26.716 like ChatGPT are not good at, is that 0:20:26.796 --> 0:20:33.476 they can give you really well calibrated estimates of their 0:20:33.556 --> 0:20:37.956 own confidence. That's the most important thing that a model 0:20:37.996 --> 0:20:43.196 can do, because, like we said earlier, health is stochastic. 0:20:43.556 --> 0:20:49.356 There are all kinds of things that happen. Fundamentally. Exactly, right. 0:20:50.356 --> 0:20:52.796 And so, you know, we're going to make a prediction 0:20:53.036 --> 0:20:56.196 about somebody in the future, and sometimes we're going to 0:20:56.196 --> 0:20:58.636 be really confident in that prediction and then it's actionable, 0:20:59.836 --> 0:21:02.836 but sometimes you're not. It's not, you're not confident, and 0:21:02.956 --> 0:21:06.356 maybe it's not actionable because you're really unconfident. And the 0:21:06.476 --> 0:21:08.396 most, we're never going to get to the point that's 0:21:08.396 --> 0:21:10.396 going to say, hey, you're going to have a heart 0:21:10.436 --> 0:21:15.196 attack on July seventeenth of twenty thirty seven. It's like, 0:21:15.236 --> 0:21:17.636 it's never going to be like that detail. But the 0:21:17.876 --> 0:21:21.996 point, question is, can you believe the model's estimates of 0:21:22.076 --> 0:21:25.036 its own confidence? And if you can, then when 0:21:25.076 --> 0:21:27.396 it is confident, you can act on it, and when 0:21:27.436 --> 0:21:29.756 it's not confident, you can do other things. And that's 0:21:29.836 --> 0:21:32.756 the, that's, so it's actually a really key technical thing, 0:21:32.836 --> 0:21:34.156