1
00:00:00,040 --> 00:00:03,240
S1: I'm not completely sure I'm right about this, but I've

2
00:00:03,240 --> 00:00:06,360
S1: never been a big believer in training custom models. I've

3
00:00:06,360 --> 00:00:09,479
S1: also never believed in fine-tuning, going all the way

4
00:00:09,480 --> 00:00:12,640
S1: back to 2023. My intuition has always pushed me towards

5
00:00:12,640 --> 00:00:16,000
S1: the best state-of-the-art model possible, combined with

6
00:00:16,000 --> 00:00:20,959
S1: context management. I just finally crystallized my reasoning around this.

7
00:00:21,720 --> 00:00:24,520
S1: Anytime you think you're using a small model for a

8
00:00:24,520 --> 00:00:27,920
S1: small task, there's usually a whole lot more going into

9
00:00:27,960 --> 00:00:31,560
S1: a given decision than just that individual area of expertise.

10
00:00:32,520 --> 00:00:37,560
S1: For example, labeling emails, writing reports, processing security events, searching

11
00:00:37,560 --> 00:00:40,360
S1: for threats on a network. On one hand, I think

12
00:00:40,360 --> 00:00:42,680
S1: these are specialized, but the fact is, the smarter and

13
00:00:42,680 --> 00:00:46,080
S1: more experienced a human is who has this expertise, the

14
00:00:46,080 --> 00:00:49,159
S1: better job they're going to do. This is because most

15
00:00:49,159 --> 00:00:53,560
S1: specialized tasks still benefit from the general life experience of

16
00:00:53,560 --> 00:00:56,440
S1: the person doing the execution. This is why I think

17
00:00:56,440 --> 00:00:59,040
S1: the future is not a whole bunch of extremely small,

18
00:00:59,040 --> 00:01:03,280
S1: specialized models throughout the enterprise. I think what's far more

19
00:01:03,280 --> 00:01:07,600
S1: likely is more of an Opus, Sonnet, Haiku model, where

20
00:01:07,600 --> 00:01:10,000
S1: the best of the best just keeps coming down in price,

21
00:01:10,360 --> 00:01:14,440
S1: including going into open source. And those smaller models are

22
00:01:14,440 --> 00:01:17,160
S1: used in conjunction with context to perform all the different

23
00:01:17,160 --> 00:01:21,160
S1: tasks in an organization at much lower cost. But I

24
00:01:21,160 --> 00:01:24,960
S1: think they'll still be extremely general models, not tiny and

25
00:01:24,960 --> 00:01:28,920
S1: narrow custom ones. I think the TL;DR here is: when

26
00:01:28,920 --> 00:01:32,160
S1: you think you're doing a narrow task, that narrow task

27
00:01:32,160 --> 00:01:36,440
S1: is actually benefiting from a ton of general experience. And

28
00:01:36,440 --> 00:01:38,720
S1: I think this applies to humans, and I think it

29
00:01:38,720 --> 00:01:42,720
S1: also applies to models. I'm not completely convinced of this.

30
00:01:42,760 --> 00:01:46,840
S1: I'm about 70% sure. But yeah, I think this is

31
00:01:46,840 --> 00:01:47,720
S1: the way it's going to go.