1
00:00:00,040 --> 00:00:03,240
S1: I'm not completely sure I'm right about this, but I've

2
00:00:03,240 --> 00:00:06,360
S1: never been a big believer in training custom models. I've

3
00:00:06,360 --> 00:00:09,479
S1: also never believed in fine-tuning, going all the way

4
00:00:09,480 --> 00:00:12,640
S1: back to 2023. My intuition has always pushed me towards

5
00:00:12,640 --> 00:00:16,000
S1: the best state-of-the-art model possible, combined with

6
00:00:16,000 --> 00:00:20,959
S1: context management. I just finally crystallized my reasoning around this.

7
00:00:21,720 --> 00:00:24,520
S1: Anytime you think you're using a small model for a

8
00:00:24,520 --> 00:00:27,920
S1: small task, there's usually a whole lot more going into

9
00:00:27,960 --> 00:00:31,560
S1: a given decision than just that individual area of expertise.

10
00:00:32,520 --> 00:00:37,560
S1: For example, labeling emails, writing reports, processing security events, searching

11
00:00:37,560 --> 00:00:40,360
S1: for threats on a network. On one hand, I think

12
00:00:40,360 --> 00:00:42,680
S1: these are specialized, but the fact is, the smarter and

13
00:00:42,680 --> 00:00:46,080
S1: more experienced a human is who has this expertise, the

14
00:00:46,080 --> 00:00:49,159
S1: better job they're going to do. This is because most

15
00:00:49,159 --> 00:00:53,560
S1: specialized tasks still benefit from the general life experience of

16
00:00:53,560 --> 00:00:56,440
S1: the person doing the execution. This is why I think

17
00:00:56,440 --> 00:00:59,040
S1: the future is not a whole bunch of extremely small,

18
00:00:59,040 --> 00:01:03,280
S1: specialized models throughout the enterprise. I think what's far more

19
00:01:03,280 --> 00:01:07,600
S1: likely is more of an Opus, Sonnet, Haiku model, where

20
00:01:07,600 --> 00:01:10,000
S1: the best of the best just keeps coming down in price,

21
00:01:10,360 --> 00:01:14,440
S1: including going into open source. And those smaller models are

22
00:01:14,440 --> 00:01:17,160
S1: used in conjunction with context to perform all the different

23
00:01:17,160 --> 00:01:21,160
S1: tasks in an organization at much lower cost. But I

24
00:01:21,160 --> 00:01:24,960
S1: think they'll still be extremely general models, not tiny and

25
00:01:24,959 --> 00:01:28,920
S1: narrow custom ones. I think the TL;DR here is when

26
00:01:28,920 --> 00:01:32,160
S1: you think you're doing a narrow task, that narrow task

27
00:01:32,160 --> 00:01:36,440
S1: is actually benefiting from a ton of general experience. And

28
00:01:36,440 --> 00:01:38,720
S1: I think this applies to humans, and I think it

29
00:01:38,720 --> 00:01:42,720
S1: also applies to models. I'm not completely convinced of this.

30
00:01:42,760 --> 00:01:46,840
S1: I'm about 70% sure. But yeah, I think this is

31
00:01:46,840 --> 00:01:47,720
S1: the way it's going to go.