WEBVTT - Two Researchers Explain How Quants Are Going To Revolutionize Long-Term Investing 0:00:08.680 --> 0:00:12.799 Hello, and welcome to another episode of the Odd Lots Podcast. 0:00:12.840 --> 0:00:16.760 I'm Joe Weisenthal. And I'm Tracy Alloway. Tracy, I really 0:00:16.800 --> 0:00:21.000 liked last week's episode with Andrew Lo talking about quant 0:00:21.120 --> 0:00:25.320 stuff and his sort of evolution of the efficient market 0:00:25.400 --> 0:00:28.920 hypothesis and where that might go. Yeah, I did too. 0:00:29.000 --> 0:00:32.839 I really like the ecosystem analogy, the idea that you 0:00:32.840 --> 0:00:35.640 have all these different players with different motivations and they're 0:00:35.680 --> 0:00:40.200 constantly evolving and adapting to the market. That point about 0:00:40.200 --> 0:00:43.000 it adapting (and of course that's the name of his book, 0:00:43.080 --> 0:00:47.360 Adaptive Markets, and the name of his hypothesis) is really 0:00:47.520 --> 0:00:50.600 key because one of the main points that he made 0:00:50.920 --> 0:00:54.000 that I loved was this idea of hedge funds as 0:00:54.080 --> 0:00:57.920 sort of the R and D laboratory for all of 0:00:57.960 --> 0:01:02.160 the financial industry. Right, the funds are where innovative new 0:01:02.240 --> 0:01:05.720 techniques get to be sort of hashed out without doing 0:01:06.040 --> 0:01:09.280 usually too much damage, I guess, to the wider ecosystem. 0:01:09.280 --> 0:01:13.680 Hopefully, right? Not always. There are certainly examples of hedge funds 0:01:13.680 --> 0:01:17.919 actually having done major damage from time to time. But ideally, 0:01:18.160 --> 0:01:22.160 you know, what the evolution seems to be is that some 0:01:22.200 --> 0:01:24.759 new idea sort of starts in the hedge fund world 0:01:25.040 --> 0:01:27.600 and eventually makes its way to the broader world. 
I 0:01:27.600 --> 0:01:29.759 think the most obvious example of that that we could 0:01:29.760 --> 0:01:32.160 cite these days is a lot of these sort of 0:01:32.520 --> 0:01:36.360 popular smart beta strategy ETFs that are 0:01:36.360 --> 0:01:41.119 built on things like momentum or value or other factors, 0:01:41.440 --> 0:01:44.640 sort of quantitative ideas that for many years were only 0:01:44.680 --> 0:01:48.920 available to, you know, researchers at hedge funds. Right. So, 0:01:48.960 --> 0:01:53.160 these sorts of quantitative investment or trading methods were usually 0:01:53.240 --> 0:01:56.040 the purview of sophisticated hedge funds who had the time 0:01:56.080 --> 0:01:58.880 and resources to develop them. And then you had a 0:01:58.920 --> 0:02:01.120 bunch of ETFs who kind of caught 0:02:01.160 --> 0:02:03.800 on and managed to replicate them. And now we can 0:02:03.840 --> 0:02:08.679 all trade like hedge funds for zero percent fees, right? Exactly right. 0:02:09.040 --> 0:02:12.280 And of course, once everyone can do it for very 0:02:12.320 --> 0:02:15.560 few fees, I think it's safe to say those strategies 0:02:15.560 --> 0:02:19.840 aren't going to produce the same returns, and hence the 0:02:19.919 --> 0:02:23.560 market is forced to adapt again. Right. Presumably the hedge 0:02:23.560 --> 0:02:26.400 funds are always trying to stay one step ahead as well, right? 0:02:27.120 --> 0:02:30.360 Exactly so, which raises the question of, like, what will 0:02:30.400 --> 0:02:34.960 be the next thing. If anyone can sort of invest 0:02:35.000 --> 0:02:38.720 in a crude momentum strategy for virtually no fees, then 0:02:38.760 --> 0:02:41.440 that requires the people on the cutting edge, the people 0:02:41.520 --> 0:02:44.160 doing the R and D of this industry, to, 0:02:44.240 --> 0:02:46.000 you know, figure out what the next big thing is 0:02:46.000 --> 0:02:48.359 gonna be. 
Do you know what the next big thing 0:02:48.400 --> 0:02:50.239 is going to be, Joe? Can you share it with 0:02:50.680 --> 0:02:55.360 your fellow partner at Odd Thoughts LLC? Sadly and unfortunately 0:02:56.000 --> 0:02:59.760 for all of the Odd Lots fans, I myself do 0:02:59.800 --> 0:03:03.480 not know what the next big thing in quantitative strategy 0:03:03.639 --> 0:03:06.480 or sort of advanced investing is going to be. But 0:03:06.560 --> 0:03:09.960 I'm hoping that our guests on today's episode might be 0:03:10.040 --> 0:03:13.960 able to shed some light. Who are they? Okay, so 0:03:14.040 --> 0:03:17.040 today we're going to be talking to John Alberg. He 0:03:17.160 --> 0:03:21.240 is the founder of Euclidean Technologies, a quant firm, as 0:03:21.240 --> 0:03:24.919 well as Zach Lipton, a professor at Carnegie Mellon University 0:03:24.960 --> 0:03:28.519 in the Business School and expert on machine learning. They 0:03:28.639 --> 0:03:33.480 recently published a paper titled Improving Factor-Based Quantitative Investing 0:03:34.000 --> 0:03:39.520 by Forecasting Company Fundamentals. So what I think that means, 0:03:39.560 --> 0:03:41.000 and we'll talk to them, is, you know, we talk 0:03:41.080 --> 0:03:44.800 about all this stuff about price and computers and algorithms figuring 0:03:44.840 --> 0:03:47.880 out what signal we can get from price, but maybe 0:03:48.480 --> 0:03:51.200 the next generation can actually tell us something about the 0:03:51.240 --> 0:03:54.640 fundamental workings of the company itself, and maybe this could 0:03:54.680 --> 0:03:57.880 be sort of the next wave of where quant investing goes. 0:03:58.440 --> 0:04:02.080 And this sounds absolutely fascinating, Joe, let's bring them on. 0:04:12.280 --> 0:04:15.240 John and Zach, thank you very much for joining us. 0:04:15.920 --> 0:04:19.800 Was that a reasonable characterization of sort of where your 0:04:20.160 --> 0:04:24.080 paper and where your research is taking things? 
Yeah, I 0:04:24.120 --> 0:04:27.280 think it is. So, first of all, machine learning 0:04:27.320 --> 0:04:30.160 has been kind of on a rocket ship of innovation 0:04:30.240 --> 0:04:33.159 for the last ten years or so, and with the 0:04:33.240 --> 0:04:36.800 advent of deep learning, you know, computers and machine learning 0:04:36.800 --> 0:04:39.120 have been able to do things that, you know, historically 0:04:39.120 --> 0:04:44.400 have been very challenging, like image captioning and language translation. 0:04:45.080 --> 0:04:47.840 So we, Zach and I, you know, a couple of 0:04:47.880 --> 0:04:52.080 years back, thought of the idea of collaborating to apply 0:04:52.560 --> 0:04:56.480 deep learning to the problem of long term investing. So 0:04:56.520 --> 0:04:59.240 how did you actually go about doing that and what 0:04:59.400 --> 0:05:03.680 exactly do you mean by deep learning? That's exactly what 0:05:03.720 --> 0:05:07.159 I wanted to know. So deep learning is sort of the 0:05:07.200 --> 0:05:10.560 rebranding of neural networks research. So say I 0:05:10.600 --> 0:05:13.279 had some data about a company, right? Like I had what in 0:05:13.680 --> 0:05:16.000 machine learning we call a vector of features. But what 0:05:16.240 --> 0:05:18.080 we mean is just like a list of attributes, each 0:05:18.120 --> 0:05:20.880 of which can somehow be made into a numerical quantity, 0:05:21.000 --> 0:05:24.240 whether it's like their income, their number of assets, whatever. 
0:05:24.640 --> 0:05:28.039 One way of deciding how to predict, say, 0:05:28.120 --> 0:05:30.360 what the price will be or something, is we say, well, 0:05:30.600 --> 0:05:33.400 we're going to have this long vector of features, and 0:05:33.440 --> 0:05:36.200 then we're going to have, for every single company, you know, 0:05:36.279 --> 0:05:38.640 at every single time, this vector of features corresponding 0:05:38.640 --> 0:05:40.360 to the state of the company at some period of time, 0:05:40.880 --> 0:05:44.640 and then we'll have some target that we want to predict. 0:05:44.680 --> 0:05:47.600 This could be a binary quantity, like will the stock 0:05:47.640 --> 0:05:50.280 go up or down in the next, you know, time 0:05:50.400 --> 0:05:53.320 unit of your choice, whether it's the next day or 0:05:53.320 --> 0:05:55.400 the next month or the next year. Or 0:05:55.440 --> 0:05:58.599 you could try to directly predict, say, the relative price, 0:05:58.680 --> 0:06:02.520 so like, you know, the percent increase or decrease, based 0:06:02.560 --> 0:06:05.440 on sort of the available features. So one of the 0:06:05.480 --> 0:06:07.680 simplest ways you can make a model is you say, hey, 0:06:07.720 --> 0:06:09.120 I've got a bunch of features. What I'm gonna do is 0:06:09.120 --> 0:06:11.479 I'm gonna take a weighted sum of these features, the 0:06:11.520 --> 0:06:14.000 way you'd calculate a score to see, like, what's 0:06:14.000 --> 0:06:16.240 your risk of heart disease. Maybe you take, you know, 0:06:16.240 --> 0:06:20.120 four times your cholesterol plus two times your age minus 0:06:20.120 --> 0:06:23.440 one times, you know, your amount of good cholesterol, or 0:06:23.480 --> 0:06:25.800 something like this. If you come up with some formula 0:06:25.920 --> 0:06:28.800 that's expressed simply as a weighted sum, that would 0:06:28.800 --> 0:06:31.719 be a linear model. 
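The weighted sum described here can be written out in a few lines. This is only a minimal sketch: the weights (4, 2, -1) come from the spoken example, while the function name `linear_score`, the feature names, and the patient values are made up for illustration.

```python
def linear_score(features, weights):
    """Return the weighted sum of the features: a linear model with no intercept."""
    return sum(weights[name] * value for name, value in features.items())

# Weights from the spoken example: 4 x cholesterol + 2 x age - 1 x good cholesterol.
weights = {"cholesterol": 4.0, "age": 2.0, "good_cholesterol": -1.0}

# Hypothetical patient, chosen only to make the arithmetic easy to follow.
patient = {"cholesterol": 200.0, "age": 50.0, "good_cholesterol": 60.0}

score = linear_score(patient, weights)
print(score)  # 4*200 + 2*50 - 1*60 = 840.0
```

In a stock setting the features would be company attributes and the weights would be fitted from data rather than chosen by hand.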
Where deep learning makes things different 0:06:31.839 --> 0:06:35.479 is that you have many different layers of computation, so that 0:06:35.800 --> 0:06:38.720 you basically are learning very complex patterns that maybe couldn't 0:06:38.760 --> 0:06:41.120 be expressed as a weighted sum. So maybe 0:06:41.120 --> 0:06:44.960 you're uncovering interactions between all of your features. So, 0:06:45.000 --> 0:06:48.159 for example, if you want to learn to recognize a 0:06:48.240 --> 0:06:50.839 dog versus a cat in an image, there's no weighted 0:06:50.880 --> 0:06:53.000 sum of pixel values that's actually going to tell you 0:06:53.000 --> 0:06:55.359 this, because the pattern's just too complicated. So in 0:06:55.360 --> 0:06:58.239 that case you need some more, like, heavy duty machinery. 0:06:58.360 --> 0:07:00.800 So what you do in deep learning essentially is that 0:07:00.839 --> 0:07:05.320 you learn multiple successive transformations of your data such that 0:07:05.480 --> 0:07:09.640 after applying many such transformations, you know, could be two, 0:07:09.760 --> 0:07:12.480 four, five, ten, whatever, you come out at the 0:07:12.600 --> 0:07:15.240 end with a representation of your data where you actually 0:07:15.280 --> 0:07:17.680 can learn a very simple model on top of that. 0:07:17.920 --> 0:07:21.280 So we sometimes call deep learning representation learning, because 0:07:21.320 --> 0:07:24.200 what we're doing is we're both learning how to featurize 0:07:24.200 --> 0:07:26.600 our data, essentially how to transform it, and how 0:07:26.600 --> 0:07:29.360 to classify it at the same time. So one of 0:07:29.400 --> 0:07:32.720 the things in sort of traditional quant... a lot of 0:07:32.800 --> 0:07:37.600 quantitative investing focuses a lot on price, and sort of 0:07:37.960 --> 0:07:43.240 listening to your characterization. 
It seems like price (and this 0:07:43.320 --> 0:07:46.200 is relatively speaking, of course) is a fairly, you know, 0:07:46.240 --> 0:07:49.240 sort of easy idea to capture. So you can 0:07:49.280 --> 0:07:53.720 come up with some definition of what momentum is and 0:07:53.760 --> 0:07:56.640 then sort of say, okay, these stocks are experiencing momentum 0:07:56.760 --> 0:08:00.600 right now, or these stocks aren't, and then what does history 0:08:00.720 --> 0:08:02.560 tell us these stocks are going to do next if 0:08:02.560 --> 0:08:07.120 they sort of meet these characterizations? Your paper really looks 0:08:07.200 --> 0:08:10.440 at what you can do with this technology for sort 0:08:10.440 --> 0:08:14.080 of looking at future fundamentals, so looking at the sort 0:08:14.120 --> 0:08:17.080 of characteristics of the company and not just trying to 0:08:17.560 --> 0:08:21.440 see where price is going, but where those characteristics are going. 0:08:21.480 --> 0:08:25.960 So explain sort of what your research specifically attempts to uncover. 0:08:26.680 --> 0:08:30.160 So one thing that deep learning allows a researcher to 0:08:30.240 --> 0:08:35.280 do is look at kind of more raw features. 0:08:35.320 --> 0:08:38.520 Like Zach explained, in the image case you're looking at 0:08:38.640 --> 0:08:42.840 raw pixels. Now, if you think about most quant funds 0:08:43.000 --> 0:08:46.240 and most quant models, the features that go into 0:08:46.280 --> 0:08:49.160 the model are highly engineered, and they include things like 0:08:49.240 --> 0:08:53.800 price and maybe book value, price divided by book value, 0:08:53.840 --> 0:08:57.520 price divided by earnings, and then maybe some momentum features. 0:08:58.000 --> 0:09:01.000 The interesting thing about deep learning is it allows you 0:09:01.080 --> 0:09:05.720 to potentially let it uncover what the best features are. 
0:09:05.760 --> 0:09:09.240 If you over-engineer features, you may not find the 0:09:09.240 --> 0:09:12.040 ones that are best to predict what you're interested in predicting. 0:09:12.600 --> 0:09:16.360 So that, you know, allows you to potentially find features 0:09:16.360 --> 0:09:20.199 in the data that you wouldn't find through a traditional 0:09:20.240 --> 0:09:24.640 feature engineering process. Yeah, and you know, to directly address 0:09:24.679 --> 0:09:27.080 your question, your point is that the very most obvious 0:09:27.080 --> 0:09:28.920 thing you could say, now, if I have this, I 0:09:28.960 --> 0:09:31.199 have this learning machine, I have a bunch of features, 0:09:31.200 --> 0:09:32.720 and I have to choose what I am going to predict, 0:09:33.080 --> 0:09:35.000 the very most obvious thing to try to predict is 0:09:35.000 --> 0:09:38.040 the price, because if you can actually do that perfectly, 0:09:38.360 --> 0:09:41.240 then you're done, right? If you actually know which 0:09:41.240 --> 0:09:43.280 way the price is going to move in the next year, 0:09:43.440 --> 0:09:46.679 then you can make the perfect choice. So the problem 0:09:46.760 --> 0:09:49.440 is that that's not so easy, because the markets 0:09:49.440 --> 0:09:53.120 are quite capricious, right? So one problem that we 0:09:53.200 --> 0:09:55.280 found is, we actually did these models where we were 0:09:55.280 --> 0:09:58.280 trying to predict price directly. But among the other problems 0:09:58.280 --> 0:10:00.520 that you have is that, one, it's hard to learn 0:10:00.640 --> 0:10:03.280 models that do a good job of this that are 0:10:03.360 --> 0:10:06.480 sort of robust across different time periods. So you might 0:10:06.559 --> 0:10:08.600 have, like, hey, I'm going to train on these, like, 0:10:08.720 --> 0:10:11.080 decades of data and I'm going to try to directly 0:10:11.160 --> 0:10:13.800 predict the price. 
But then I come into periods of 0:10:13.840 --> 0:10:16.760 time where the market's behaving a little bit differently, and 0:10:16.800 --> 0:10:20.080 we call this nonstationarity. Basically, like, your model does a 0:10:20.080 --> 0:10:23.280 great job of uncovering the pattern that's present in the 0:10:23.440 --> 0:10:25.760 data that you gave to the model, but that data 0:10:25.880 --> 0:10:28.079 is anchored to some period of time, and in the future 0:10:28.160 --> 0:10:30.840 data that comes in, you know, the pattern's changed a 0:10:30.840 --> 0:10:33.680 little bit, and so the kind of, like, function that 0:10:33.679 --> 0:10:35.840 you've learned no longer does a great job. So 0:10:35.880 --> 0:10:38.120 what we do instead of directly trying to predict the price, 0:10:38.480 --> 0:10:40.719 the idea that we had was to think, well, the 0:10:40.880 --> 0:10:43.880 core idea behind a factor model, generally, right, is to 0:10:43.960 --> 0:10:45.840 just say, hey, I'm going to sort all the stocks 0:10:46.320 --> 0:10:49.360 according to some reasonable idea. Hey, the price of the 0:10:49.360 --> 0:10:52.440 company should be tied to its income, for any company, and 0:10:52.520 --> 0:10:55.040 somehow it is justified by, like, its long term 0:10:55.120 --> 0:10:58.760 discounted cash flows. Well, let's just say a factor strategy 0:10:58.800 --> 0:11:00.600 is just something very simple. It says, well, let's just look 0:11:00.600 --> 0:11:03.760 at the current income divided by, say, the current price, 0:11:03.880 --> 0:11:06.640 or current income divided by the current, you know, market 0:11:06.640 --> 0:11:10.880 cap or enterprise value, some notion of income, 0:11:10.920 --> 0:11:13.880 some notion of financial performance, divided by some 0:11:13.920 --> 0:11:16.640 notion of company size, and then I'm going 0:11:16.679 --> 0:11:18.640 to sort the stocks according to this. 
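The factor sort just described (some notion of income divided by some notion of company size, then sort) can be sketched as follows. This is an illustrative sketch only: the tickers and figures are invented, and `rank_by_value_factor` is a hypothetical helper, not something taken from the paper.

```python
def rank_by_value_factor(companies):
    """Sort companies from highest to lowest income / enterprise value."""
    return sorted(
        companies,
        key=lambda c: c["income"] / c["enterprise_value"],
        reverse=True,
    )

# Made-up universe; figures are in the same arbitrary currency unit.
companies = [
    {"ticker": "AAA", "income": 10.0, "enterprise_value": 200.0},  # factor 0.05
    {"ticker": "BBB", "income": 30.0, "enterprise_value": 300.0},  # factor 0.10
    {"ticker": "CCC", "income": 5.0, "enterprise_value": 250.0},   # factor 0.02
]

ranked = rank_by_value_factor(companies)
tickers = [c["ticker"] for c in ranked]
print(tickers)  # ['BBB', 'AAA', 'CCC']
```

The names at the top of the list have the most income per unit of enterprise value, which is the sense in which the strategy treats them as cheaply priced.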
The ones that 0:11:18.720 --> 0:11:22.200 come out highest are, like, most cheaply priced, so let's 0:11:22.200 --> 0:11:25.200 buy those. So the idea is to say, hey, well, what 0:11:25.240 --> 0:11:27.800 if I told you... So we actually know that this 0:11:27.840 --> 0:11:29.720 does pretty well in backtesting, whether or not the 0:11:29.720 --> 0:11:32.600 patterns will hold in the future. But, you know, many 0:11:32.600 --> 0:11:34.199 people have made a lot of money for many years, 0:11:34.360 --> 0:11:36.680 so there's an idea that if you knew the income, 0:11:36.720 --> 0:11:39.520 this is a good thing, a reasonable thing to try 0:11:39.559 --> 0:11:43.200 to do. The question that we asked... fortunately, John, 0:11:43.440 --> 0:11:46.840 because he's actually in finance and I'm not, has this 0:11:47.240 --> 0:11:50.360 really great set of, like, industry grade tools, so that unlike 0:11:50.360 --> 0:11:52.600 most academic papers that look at, like, one stock over 0:11:52.640 --> 0:11:54.280 a short period of time or something, we actually had, 0:11:54.360 --> 0:11:57.439 you know, forty plus years of financial data and can 0:11:57.480 --> 0:12:00.920 actually simulate, like, a plausible guess at what's going on. 0:12:01.320 --> 0:12:04.720 We said, well, what if you did a factor model, 0:12:05.000 --> 0:12:08.120 but someone gave you a crystal ball? So basically, instead 0:12:08.120 --> 0:12:11.240 of dividing the current income by the current enterprise value, 0:12:11.600 --> 0:12:15.160 someone gave you next year's income, and so you sorted 0:12:15.160 --> 0:12:18.080 the stocks according to next year's income divided by the 0:12:18.120 --> 0:12:22.080 current enterprise value, something like this. So you're able 0:12:22.120 --> 0:12:24.560 to peek into the future. You know how the company 0:12:24.600 --> 0:12:28.160 will be performing next year, and you're asking: how 0:12:28.240 --> 0:12:31.360 is its next year's performance? 
Based on next 0:12:31.440 --> 0:12:34.280 year's performance, is it currently priced 0:12:34.320 --> 0:12:36.640 cheaply or not? So it's what we call, like, a 0:12:36.720 --> 0:12:39.959 clairvoyant factor model. Like, you don't actually have such a 0:12:39.960 --> 0:12:41.880 crystal ball, but if you, you know, give us some 0:12:41.960 --> 0:12:44.360 license and imagine that you did, what would have 0:12:44.360 --> 0:12:46.320 happened if you went back in history and you had 0:12:46.320 --> 0:12:49.080 this crystal ball and you traded based on a clairvoyant 0:12:49.120 --> 0:12:51.800 factor model? And it turns out that the clairvoyant factor 0:12:51.880 --> 0:12:55.000 model just crushes it. So it does really, really well, 0:12:55.040 --> 0:12:58.640 and, not surprisingly, more so the more clairvoyant the model is. 0:12:58.679 --> 0:13:01.480 So if it knows the performance of the 0:13:01.480 --> 0:13:05.440 company six months out versus now, or twelve months out 0:13:05.520 --> 0:13:08.040 versus six months out, it keeps getting better and better 0:13:08.080 --> 0:13:12.280 and better. So what we decided was, well, maybe trying 0:13:12.280 --> 0:13:15.160 to predict price directly is a bit, you know, subject 0:13:15.520 --> 0:13:18.720 to, you know, a kind of fickle market, but the 0:13:18.760 --> 0:13:23.280 patterns present in the fundamental reporting data itself are more stable. 0:13:23.840 --> 0:13:25.880 So in our method what we do is, instead of 0:13:25.920 --> 0:13:28.480 just trying to predict a return, we try to predict 0:13:28.640 --> 0:13:32.680 actually the fundamental reporting data itself. So we're given 0:13:33.360 --> 0:13:36.920 these features for, like, a trailing window 0:13:36.920 --> 0:13:42.040 of time corresponding to the company's, like, financial reporting, and 0:13:42.080 --> 0:13:44.199 then we're trying to predict what they're going to report 0:13:44.240 --> 0:13:46.600 next year. 
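To make the clairvoyant thought experiment concrete, the sketch below ranks the same two made-up companies twice: once on current income and once on next year's income, both divided by current enterprise value. All names and numbers are hypothetical; in a real backtest the "next year" column would be historical data that was not yet known at the time of the trade.

```python
def factor_ranking(companies, income_key):
    """Rank tickers by the chosen income figure divided by current enterprise value."""
    ordered = sorted(
        companies,
        key=lambda c: c[income_key] / c["enterprise_value"],
        reverse=True,
    )
    return [c["ticker"] for c in ordered]

# Hypothetical universe: AAA's income is about to triple, BBB's is about to halve.
universe = [
    {"ticker": "AAA", "income_now": 10.0, "income_next_year": 30.0, "enterprise_value": 200.0},
    {"ticker": "BBB", "income_now": 30.0, "income_next_year": 15.0, "enterprise_value": 300.0},
]

standard = factor_ranking(universe, "income_now")           # 0.05 vs 0.10
clairvoyant = factor_ranking(universe, "income_next_year")  # 0.15 vs 0.05

print(standard)     # ['BBB', 'AAA']
print(clairvoyant)  # ['AAA', 'BBB']
```

The two rankings disagree exactly when fundamentals are about to change, which is where knowing (or forecasting) next year's report would matter.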
And then based on what we 0:13:46.640 --> 0:13:49.040 think they're going to report next year, we sort the 0:13:49.040 --> 0:13:52.920 companies according to a value factor. So in essence, you 0:13:52.960 --> 0:13:56.760 can pick out of that future prediction the components of 0:13:56.800 --> 0:13:59.960 the factor model. Let's say it's 0:14:00.000 --> 0:14:04.480 future predicted earnings: you can take that out of 0:14:04.520 --> 0:14:09.079 the future predicted fundamentals, divide that by current enterprise value, 0:14:09.679 --> 0:14:12.160 and sort, and then you have basically a factor 0:14:12.240 --> 0:14:17.719 model in which, instead of using trailing twelve months earnings, 0:14:18.120 --> 0:14:21.560 you're using the future earnings predicted by 0:14:21.760 --> 0:14:24.680 the deep neural network. So, as I understand it, the 0:14:25.400 --> 0:14:28.960 deep learning or the neural networks are used primarily to 0:14:29.000 --> 0:14:34.960 forecast the future fundamentals based on historic performance. Is that right? 0:14:35.800 --> 0:14:40.680 Historic fundamentals, yeah. Okay, so walk us through how you 0:14:40.720 --> 0:14:45.040 actually develop an application that's able to do that. Like, 0:14:45.320 --> 0:14:48.400 what are those neural networks looking at and what sort 0:14:48.440 --> 0:14:51.360 of information are they drawing in, other than, you know, 0:14:51.440 --> 0:14:56.880 past predictive data, to make those forecasts? There's two parts 0:14:56.880 --> 0:14:59.120 to that. One is the data that we use, and 0:14:59.120 --> 0:15:02.360 then two is the technology we use to build the 0:15:02.400 --> 0:15:06.720 deep, you know, neural network models. 
So on the data side, 0:15:06.960 --> 0:15:11.680 what you use is historical fundamentals on all companies, you 0:15:11.720 --> 0:15:13.880 know, that have ever, you know, been listed in the 0:15:14.000 --> 0:15:17.720 US for the past fifty years. And so what do 0:15:17.800 --> 0:15:21.880 historical fundamentals mean? It means earnings, book value, anything 0:15:21.880 --> 0:15:24.080 you can find on an income statement and balance sheet 0:15:24.120 --> 0:15:27.880 going back in time. In addition to fundamentals, we also 0:15:28.600 --> 0:15:32.080 use as inputs to the model, you know, 0:15:32.240 --> 0:15:37.080 momentum over, you know, one month, six months, twelve months. 0:15:37.120 --> 0:15:38.920 So then, you know, if you think of it as 0:15:39.000 --> 0:15:43.040 like a big, you know, spreadsheet table where each row 0:15:43.600 --> 0:15:47.760 is a point in time for a specific company, 0:15:47.800 --> 0:15:52.120 then you can think of sequences going back through time. 0:15:52.400 --> 0:15:57.400 You know, IBM in March of and then all of 0:15:57.440 --> 0:16:00.720 its fundamentals in one row, plus its momentum, 0:16:00.760 --> 0:16:03.920 and then that going back five years in time. So 0:16:03.920 --> 0:16:07.880 those sequences, both the fundamentals and the momentum, are fed 0:16:08.080 --> 0:16:12.680 into a neural network, and all of 0:16:12.720 --> 0:16:15.840 those sequences for all companies and all time are fed 0:16:15.880 --> 0:16:19.840 into a neural network that is trained to predict what 0:16:20.000 --> 0:16:22.800 the fundamentals will be, you know, one time step out 0:16:22.840 --> 0:16:36.440 in the future. 
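The spreadsheet picture above (one row per company per point in time, sequences running back through time, target one step ahead) might be turned into training examples roughly as follows. This is a sketch under stated assumptions: the row layout, the field names, the window length, and the helper `build_training_pairs` are all hypothetical, and the actual network architecture is not shown.

```python
from collections import defaultdict

def build_training_pairs(rows, window):
    """Turn per-company rows into (input sequence, next-step fundamentals) pairs."""
    by_company = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r["ticker"], r["period"])):
        by_company[row["ticker"]].append(row)

    pairs = []
    for history in by_company.values():
        for end in range(window, len(history)):
            # Inputs: fundamentals plus momentum for each step in the trailing window.
            inputs = [r["fundamentals"] + r["momentum"] for r in history[end - window:end]]
            # Target: only the fundamentals reported one time step later.
            target = history[end]["fundamentals"]
            pairs.append((inputs, target))
    return pairs

# Made-up rows for one company: fundamentals = [earnings, book value],
# momentum = [1-month, 6-month, 12-month].
rows = [
    {"ticker": "AAA", "period": 1, "fundamentals": [1.0, 10.0], "momentum": [0.01, 0.05, 0.10]},
    {"ticker": "AAA", "period": 2, "fundamentals": [1.2, 10.5], "momentum": [0.02, 0.04, 0.12]},
    {"ticker": "AAA", "period": 3, "fundamentals": [1.1, 11.0], "momentum": [-0.01, 0.03, 0.08]},
    {"ticker": "AAA", "period": 4, "fundamentals": [1.4, 11.6], "momentum": [0.03, 0.06, 0.11]},
]

pairs = build_training_pairs(rows, window=3)
print(len(pairs))   # 1
print(pairs[0][1])  # [1.4, 11.6]
```

With five years of history per example, as mentioned in the conversation, `window` would simply be the number of reporting periods in five years.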
So just to sort of sum it 0:16:36.480 --> 0:16:39.360 all up, you know, it's like, if you have all 0:16:39.400 --> 0:16:43.440 these strategies, if you have all these funds chasing things 0:16:43.480 --> 0:16:49.040 like earnings quality, earnings growth, momentum, all kinds of stuff 0:16:49.080 --> 0:16:53.600 like that, your goal is to anticipate today what those 0:16:53.640 --> 0:16:56.160 funds are going to be buying tomorrow. Is that a 0:16:56.160 --> 0:16:58.320 fair way to characterize it? I think that's a fair 0:16:58.440 --> 0:17:01.280 way to characterize it. I think what we're really 0:17:01.320 --> 0:17:03.720 just doing is trying to build a better 0:17:03.760 --> 0:17:06.840 factor model. A better factor model in the sense that, 0:17:06.880 --> 0:17:09.560 you know, as Zach explained, if you had a clairvoyant 0:17:09.560 --> 0:17:13.679 model where you actually knew what future fundamentals were and 0:17:13.760 --> 0:17:16.680 could plug that into a factor model, you'd do substantially 0:17:16.760 --> 0:17:21.399 better than what you could achieve with a value factor model today. 0:17:21.840 --> 0:17:25.040 We're not, like, directly considering the psychology of the other 0:17:25.040 --> 0:17:28.440 players in the market in this particular approach, right? No, sure, 0:17:28.920 --> 0:17:32.920 but it's essentially saying, like, maybe the way to characterize 0:17:32.920 --> 0:17:36.600 it is, if you want to invest on some fundamental 0:17:36.680 --> 0:17:40.680 factor like earnings quality or earnings growth, bottom line, it's 0:17:41.320 --> 0:17:44.679 better to look at future twelve month results rather than 0:17:44.720 --> 0:17:48.840 trailing twelve months. You look at the trailing, but you're 0:17:48.960 --> 0:17:51.959 trying to predict the future. 
Like, so those two components, right, 0:17:52.000 --> 0:17:54.720 you could say, like, one is we have the component 0:17:54.800 --> 0:17:58.080 that is trying to predict the future fundamentals. You know, 0:17:58.119 --> 0:18:01.520 imagine that I came from the future, and I got 0:18:01.520 --> 0:18:03.919 out of my time machine, and I gave you the 0:18:03.960 --> 0:18:07.320 earnings reports from the future. Right, so the first 0:18:07.320 --> 0:18:09.520 thing you need is, how do I get an approximate 0:18:09.600 --> 0:18:12.200 time machine, right, which in our case is a predictive 0:18:12.200 --> 0:18:14.440 model that has a good guess about what the future 0:18:14.440 --> 0:18:17.040 will look like. The second thing is you still need 0:18:17.080 --> 0:18:20.080 a way of executing on the strategy. You 0:18:20.119 --> 0:18:23.560 still need a way to decide which stocks to buy, right? So, 0:18:24.119 --> 0:18:27.520 based on this future information, like, it's possible 0:18:27.560 --> 0:18:29.359 that if I come from the future and 0:18:29.400 --> 0:18:32.160 I give you the earnings report, and I tell you 0:18:32.200 --> 0:18:35.199 what the future income will be, it's possible that 0:18:35.200 --> 0:18:36.600 the income is going to go up but the stock 0:18:36.640 --> 0:18:38.960 price is going to go down. You know, like, say 0:18:39.119 --> 0:18:41.720 it's Apple, and, like, they made a lot more money, 0:18:41.760 --> 0:18:44.600 but it was also, like, announced that they had a 0:18:44.640 --> 0:18:48.920 major plant failure and the iPhone fourteen, or whatever they're 0:18:49.000 --> 0:18:52.040 up to, is going to be delayed. So these two 0:18:52.040 --> 0:18:54.640 components are a little bit modular. Like, we could 0:18:54.640 --> 0:18:57.320 come up with... John, I think, is more the 0:18:57.359 --> 0:18:59.680 domain expert; I'm more the machine learning guy. 
0:18:59.760 --> 0:19:02.119 Like, I'm sure John could come up with, you know, 0:19:02.400 --> 0:19:04.920 a million other ways that you might imagine that someone 0:19:04.960 --> 0:19:08.000 would try to execute on this information. In our case, 0:19:08.040 --> 0:19:10.320 what we're doing is we've adapted a factor model to 0:19:10.560 --> 0:19:14.840 work with this kind of future guess. So one other example. 0:19:15.400 --> 0:19:17.439 So again, in our case, what we're doing is 0:19:17.480 --> 0:19:20.879 taking the predicted future fundamentals and feeding that into a 0:19:20.960 --> 0:19:25.440 value factor model. But you could imagine, let's say 0:19:25.600 --> 0:19:29.480 the deep neural network said, you know, a company is 0:19:29.480 --> 0:19:33.080 going to do a hundred million 0:19:33.080 --> 0:19:36.560 in earnings, let's say, but consensus estimates said it's 0:19:36.560 --> 0:19:41.359 gonna do seventy five million in earnings. Well, you 0:19:41.400 --> 0:19:43.640 know, you can imagine devising 0:19:43.640 --> 0:19:46.280 a strategy around that where you'd want to go, you know, 0:19:46.359 --> 0:19:49.919 bet on those guys, and ones where consensus estimates are 0:19:49.960 --> 0:19:53.120 above what the deep neural network is predicting, you'd want 0:19:53.119 --> 0:19:56.679 to bet against. Right, assuming the current price is pricing 0:19:56.760 --> 0:19:59.520 that in. You know, John, you shouldn't 0:19:59.520 --> 0:20:03.040 give away the goods. That's a really good idea. 0:20:03.920 --> 0:20:08.800 So are these kinds of machine learning driven predictive models 0:20:09.080 --> 0:20:11.320 the future of investing? Do you think that's the way 0:20:11.320 --> 0:20:14.199 that we're heading? I think what this paper showed is 0:20:14.240 --> 0:20:17.560 that there's a lot of potential in applying deep learning 0:20:17.680 --> 0:20:21.400 to long term investing. 
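The consensus-versus-model idea floated a moment earlier (bet on companies the network expects to beat consensus, bet against the ones it expects to miss) reduces to a simple comparison. The sketch below is purely illustrative: `consensus_signal` and the "long"/"short" labels are hypothetical, it is not the strategy tested in the paper, and it ignores what the current price already reflects.

```python
def consensus_signal(model_forecast, consensus_estimate):
    """Compare a model's earnings forecast with the consensus estimate."""
    if model_forecast > consensus_estimate:
        return "long"    # model expects earnings above consensus
    if model_forecast < consensus_estimate:
        return "short"   # model expects earnings below consensus
    return "neutral"

# Numbers from the spoken example: model says 100 million, consensus says 75 million.
signal = consensus_signal(100e6, 75e6)
print(signal)  # long
```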
I think that there's been some 0:20:21.480 --> 0:20:25.520 debate about whether, you know, deep learning, which requires a 0:20:25.560 --> 0:20:29.760 lot of data to build successful models, 0:20:30.400 --> 0:20:34.320 whether in finance there's enough data, or whether you 0:20:34.359 --> 0:20:38.680 even need these kinds of complex models in finance. 0:20:38.720 --> 0:20:40.960 I mean, a lot of quant people feel, you know, 0:20:41.119 --> 0:20:44.679 linear, simple factor models are the best route to go. 0:20:45.520 --> 0:20:47.919 And I think what we showed here is that if 0:20:47.920 --> 0:20:51.080 you're trying to predict price changes, that might be true. 0:20:51.760 --> 0:20:55.800 But if you decompose the problem into first trying to 0:20:55.840 --> 0:20:59.040 predict fundamentals and then later, you know, through a factor 0:20:59.119 --> 0:21:01.920 model or some other method, trying to use those predicted 0:21:01.920 --> 0:21:06.720 fundamentals to predict price, deep learning has a lot of 0:21:06.720 --> 0:21:12.200 potential and does substantially better at predicting future fundamentals 0:21:12.200 --> 0:21:14.280 than what you could do with a linear model. 0:21:14.600 --> 0:21:17.560 There's sort of a technical reason to recommend the 0:21:17.600 --> 0:21:20.760 way we've cast the problem also, without going too far 0:21:20.760 --> 0:21:24.520 into the weeds. Basically, you think about really, really powerful 0:21:24.600 --> 0:21:27.639 machine learning models, deep neural networks. 
The thing that you 0:21:27.720 --> 0:21:29.879 worry about... John was talking about how people 0:21:29.960 --> 0:21:32.240 agonize over what you can bring to bear on 0:21:32.280 --> 0:21:34.520 long term investing, because you don't have as much data, 0:21:34.960 --> 0:21:38.119 right? If you were looking at the, you know, 0:21:38.400 --> 0:21:42.679 microsecond kind of trade frequency, then you'd have, you know, 0:21:42.960 --> 0:21:45.399 trillions of trade examples or something to go on. But 0:21:45.800 --> 0:21:47.760 if you're looking at, you know, your 0:21:47.760 --> 0:21:49.879 time tick is I have a data point, you know, 0:21:49.960 --> 0:21:53.000 once per month or once per year, suddenly, and I 0:21:53.040 --> 0:21:56.439 only have thousands of stocks, not millions of stocks, you 0:21:56.480 --> 0:22:00.199 don't have such a huge amount of data. So 0:22:00.840 --> 0:22:03.080 what you worry about is that, given 0:22:03.119 --> 0:22:05.919 a super powerful model, like a super overpowered model, and 0:22:05.960 --> 0:22:08.399 then not too much data, there's a propensity for 0:22:08.440 --> 0:22:10.840 the models to do what we call overfitting, which is 0:22:10.880 --> 0:22:13.000 the model basically does a really good job of 0:22:13.040 --> 0:22:16.600 memorizing the training data it's seen, but it learns kind 0:22:16.600 --> 0:22:19.639 of a spurious pattern that doesn't generalize to future data 0:22:19.680 --> 0:22:23.040 that it hasn't seen. So one cool thing about the 0:22:23.080 --> 0:22:25.800 way that we're casting the problem is that we're not 0:22:25.880 --> 0:22:28.600 just trying to predict the factor of interest. We're actually 0:22:28.600 --> 0:22:31.119 trying to predict all the factors in the future. And 0:22:31.160 --> 0:22:35.159 this means that the model has to simultaneously get the 0:22:35.200 --> 0:22:37.679