WEBVTT - We're All Building a Single Digital Assistant

0:00:02.200 --> 0:00:03.800
<v S1>Hey, what's up? So I want to talk about where

0:00:03.800 --> 0:00:06.120
<v S1>I think all this personal AI stuff is going. We've

0:00:06.120 --> 0:00:10.280
<v S1>been talking about agents since 2025, and now we're talking

0:00:10.280 --> 0:00:13.280
<v S1>about harnesses. And I think this is all heading in

0:00:13.280 --> 0:00:16.200
<v S1>the exact same direction. And I initially talked about this

0:00:16.239 --> 0:00:19.040
<v S1>in 2016, which we could talk about later. But I

0:00:19.040 --> 0:00:21.680
<v S1>think the direction this is all heading is into a

0:00:21.680 --> 0:00:27.080
<v S1>single interface, a single interface for handling everything AI related.

0:00:27.080 --> 0:00:29.320
<v S1>And there are a few pieces that are missing here.

0:00:29.320 --> 0:00:31.160
<v S1>I think the main thing that we're missing right now

0:00:31.160 --> 0:00:34.480
<v S1>is that our AI system doesn't have a single entity,

0:00:34.520 --> 0:00:37.040
<v S1>it doesn't have a single identity, it doesn't have a

0:00:37.040 --> 0:00:40.159
<v S1>single personality. And I think that is the interface that

0:00:40.159 --> 0:00:42.640
<v S1>we will move to. I think a bunch of people

0:00:42.640 --> 0:00:45.159
<v S1>have sort of figured this out. I mean, I believe

0:00:45.159 --> 0:00:48.640
<v S1>OpenAI is heading in this direction with some sort of device, right?

0:00:48.680 --> 0:00:52.000
<v S1>They hired Jony Ive to work on some sort of wearable.

0:00:52.000 --> 0:00:54.360
<v S1>And the idea for them is they want to bypass

0:00:54.360 --> 0:00:56.960
<v S1>the mobile infrastructure, right? They don't want to deal with

0:00:57.000 --> 0:00:59.120
<v S1>Apple and the iPhone anymore. They want to have their

0:00:59.120 --> 0:01:02.970
<v S1>own OS, essentially like an AI OS that basically everything

0:01:02.970 --> 0:01:05.649
<v S1>goes through and then all their infrastructure and stuff on

0:01:05.650 --> 0:01:08.569
<v S1>the back end. But I think when we talk about harnesses,

0:01:08.569 --> 0:01:13.690
<v S1>context engineering, prompt engineering, agents especially, we get stuck

0:01:13.690 --> 0:01:16.490
<v S1>in the weeds, right? We're talking about, okay, you know,

0:01:16.530 --> 0:01:19.250
<v S1>what's the best agent framework? What are the best agents,

0:01:19.250 --> 0:01:21.370
<v S1>what are the best skills? And I think the best

0:01:21.370 --> 0:01:24.130
<v S1>way to think about this is to imagine all that

0:01:24.130 --> 0:01:26.850
<v S1>stuff abstracted away. And the way I like to think

0:01:26.850 --> 0:01:29.410
<v S1>about this is to and I learned this when I

0:01:29.410 --> 0:01:31.850
<v S1>was at Apple and we got this DNA, I believe

0:01:31.850 --> 0:01:34.530
<v S1>we stole a lot of it from Amazon, actually.

0:01:34.530 --> 0:01:38.290
<v S1>It's the concept of reversing backwards to go to like

0:01:38.330 --> 0:01:41.170
<v S1>into the future where you believe you see some sort

0:01:41.170 --> 0:01:43.530
<v S1>of outcome that you want, a product that you want

0:01:43.530 --> 0:01:45.690
<v S1>or a future that you believe is going to happen.

0:01:45.690 --> 0:01:48.850
<v S1>And you basically articulate that and you say, okay, this

0:01:48.850 --> 0:01:50.890
<v S1>is the thing that I think people want. This is

0:01:50.890 --> 0:01:53.690
<v S1>the thing that I think people will, you know, resonate

0:01:53.690 --> 0:01:56.850
<v S1>with and ultimately really enjoy. And then you say, okay,

0:01:56.890 --> 0:01:59.950
<v S1>what is a, they call it a PR, a public release.

0:01:59.950 --> 0:02:02.230
<v S1>What does a release look like? Right. And then they

0:02:02.270 --> 0:02:06.230
<v S1>work backwards. So I've been thinking this way since 2014

0:02:06.230 --> 0:02:09.030
<v S1>or something. So in 2016, I wrote this shitty book,

0:02:09.030 --> 0:02:10.630
<v S1>which you don't have to read because I turned it

0:02:10.630 --> 0:02:13.190
<v S1>into a blog post, but I basically said that everything

0:02:13.190 --> 0:02:15.350
<v S1>is heading in this direction of you're going to have

0:02:15.350 --> 0:02:18.510
<v S1>a single DA, which is an AI, a single AI

0:02:18.669 --> 0:02:21.510
<v S1>digital assistant, which is going to be your conduit. It's

0:02:21.510 --> 0:02:24.190
<v S1>going to be your buddy, your friend, and most importantly,

0:02:24.190 --> 0:02:27.389
<v S1>your digital assistant knows everything about you. Okay. That is

0:02:27.389 --> 0:02:31.190
<v S1>like the primary concept from this book in 2016. This

0:02:31.190 --> 0:02:34.669
<v S1>thing knows everything about you. It knows your work, your life,

0:02:34.669 --> 0:02:37.549
<v S1>your relationships. It knows what you struggle with. It knows

0:02:37.550 --> 0:02:39.950
<v S1>what your strengths are. It knows what your weaknesses are.

0:02:39.990 --> 0:02:42.389
<v S1>It knows what you're trying to accomplish, right? So if

0:02:42.389 --> 0:02:45.350
<v S1>you look at all my various projects, well, first being

0:02:45.510 --> 0:02:49.070
<v S1>that thing in 2016, but also you look at Substrate,

0:02:49.070 --> 0:02:52.470
<v S1>most importantly, you look at Telos, if you're familiar with

0:02:52.470 --> 0:02:55.230
<v S1>that at all. So Telos is essentially a system for

0:02:55.230 --> 0:02:58.560
<v S1>defining yourself, just defining what your goals are, what your

0:02:58.560 --> 0:03:01.760
<v S1>problems are that you're working on, and essentially having all

0:03:01.800 --> 0:03:03.840
<v S1>that in one place. Like what are my challenges? What

0:03:03.840 --> 0:03:06.720
<v S1>are my projects? You know, what is my budget? You know,

0:03:06.760 --> 0:03:09.079
<v S1>if this is a company versus a person, what is

0:03:09.080 --> 0:03:11.200
<v S1>the team that I'm dealing with? What are the different

0:03:11.200 --> 0:03:14.320
<v S1>dynamics there? What is the active work that's going on?

0:03:14.320 --> 0:03:17.960
<v S1>So the idea is to have all this clearly articulated

0:03:17.960 --> 0:03:21.359
<v S1>inside of a system, right? And then with that context,

0:03:21.400 --> 0:03:25.120
<v S1>your digital assistant can then basically monitor this. The other

0:03:25.120 --> 0:03:27.919
<v S1>crazy concept here that's very much related that I've talked

0:03:27.919 --> 0:03:30.360
<v S1>about a bunch in the last year, is this concept

0:03:30.360 --> 0:03:34.800
<v S1>of ideal state and the concept of current state. So

0:03:34.800 --> 0:03:36.840
<v S1>the cool thing about all this AI stuff and all

0:03:36.840 --> 0:03:39.600
<v S1>these agents is that they can constantly gather, they can

0:03:39.600 --> 0:03:44.960
<v S1>constantly go and collect context, you know, research, knowledge, facts,

0:03:45.000 --> 0:03:49.640
<v S1>activities happening in the world, news signal from different sources.

0:03:49.640 --> 0:03:53.600
<v S1>It can always be gathering, but here is the central concept. Okay,

0:03:53.640 --> 0:03:56.660
<v S1>your DA, I'm just going to use mine. Mine's name

0:03:56.660 --> 0:04:00.860
<v S1>is Kai. Kai for me is constantly going and collecting things.

0:04:00.860 --> 0:04:04.940
<v S1>He is constantly collecting and organizing knowledge for me inside

0:04:04.940 --> 0:04:08.100
<v S1>of my system. My harness is called PAI. It's

0:04:08.140 --> 0:04:11.580
<v S1>actually a public repo. It's an open source project releasing

0:04:11.660 --> 0:04:14.140
<v S1>5.0 about to come out, so you should get that

0:04:14.140 --> 0:04:16.300
<v S1>very soon. Might actually be out by the time you

0:04:16.339 --> 0:04:20.099
<v S1>watch this, but this infrastructure is not designed to be agents.

0:04:20.100 --> 0:04:22.940
<v S1>It's not designed to be AI tools. It's not designed

0:04:22.940 --> 0:04:25.420
<v S1>to be workflows. It's not designed to be any of that.

0:04:25.420 --> 0:04:29.659
<v S1>It's designed to be the back end infrastructure for context

0:04:29.700 --> 0:04:35.980
<v S1>collection and management for my specific, individual, unitary digital

0:04:35.980 --> 0:04:39.740
<v S1>assistant, whose name is Kai. I intend to interact with Kai.

0:04:40.180 --> 0:04:44.260
<v S1>Kai then understands everything about me, all my preferences, everything

0:04:44.260 --> 0:04:47.060
<v S1>I'm trying to do in my world, in my life,

0:04:47.060 --> 0:04:50.539
<v S1>in my work, with my friends, with my relationships. And

0:04:50.540 --> 0:04:54.990
<v S1>this is the backplane. This is the foundation of everything

0:04:54.990 --> 0:04:57.950
<v S1>that is happening in my AI life. Okay, this is

0:04:57.950 --> 0:05:00.750
<v S1>the direction I think everyone is going to go. I

0:05:00.750 --> 0:05:02.390
<v S1>don't know if it's going to happen, really, if it's

0:05:02.390 --> 0:05:05.469
<v S1>going to kick off in 2026, because we still seem

0:05:05.470 --> 0:05:08.270
<v S1>pretty obsessed with like agents, but it's starting to happen.

0:05:08.270 --> 0:05:10.230
<v S1>I feel like it's starting to happen because there's more

0:05:10.230 --> 0:05:13.109
<v S1>and more talk of harnesses. And I think the next

0:05:13.110 --> 0:05:15.909
<v S1>thing we figure out after harnesses is that we

0:05:15.910 --> 0:05:19.630
<v S1>actually just want a single person, a single entity to

0:05:19.670 --> 0:05:22.630
<v S1>interact with. And because we are humans, we want that

0:05:22.630 --> 0:05:25.710
<v S1>to be someone with a personality and a memory. Now,

0:05:25.750 --> 0:05:28.469
<v S1>OpenClaw kind of helped push this along a decent

0:05:28.470 --> 0:05:31.990
<v S1>amount because it was somebody who was proactive. This is

0:05:31.990 --> 0:05:35.349
<v S1>a major, major feature that's required that previous agents didn't

0:05:35.350 --> 0:05:38.270
<v S1>have: proactivity. Right. The fact that you can give it

0:05:38.270 --> 0:05:40.670
<v S1>some things that you care about to some degree, right?

0:05:40.710 --> 0:05:42.430
<v S1>And it puts it in a text file or whatever.

0:05:42.430 --> 0:05:44.550
<v S1>And it could just like check on them regularly. You

0:05:44.550 --> 0:05:47.470
<v S1>could, like, check scheduled tasks, right? This kind of

0:05:47.510 --> 0:05:49.910
<v S1>moved it forward a little bit. Yeah. So I put

0:05:49.910 --> 0:05:54.250
<v S1>together this personal AI maturity model. And this, ironically, was

0:05:54.250 --> 0:05:56.929
<v S1>a couple of weeks right before OpenClaw came out, which

0:05:56.970 --> 0:05:59.250
<v S1>I can't even remember the first name. It went through like

0:05:59.250 --> 0:06:02.050
<v S1>four names, but this was right before that happened, which

0:06:02.050 --> 0:06:04.409
<v S1>I was very happy to see it come out after this.

0:06:04.410 --> 0:06:07.490
<v S1>And I'm like, okay, cool. This is definitely catching on. So,

0:06:07.490 --> 0:06:10.450
<v S1>so check this out. The concept here is you move

0:06:10.450 --> 0:06:13.170
<v S1>through these stages and I've got three different stages for this.

0:06:13.170 --> 0:06:16.850
<v S1>I've got chat bots, which is like you're talking to ChatGPT.

0:06:16.890 --> 0:06:20.050
<v S1>That's level one. Level two is agents. And then everyone

0:06:20.050 --> 0:06:22.970
<v S1>knows what agents are. I mean, that's 2025 is when

0:06:22.970 --> 0:06:25.770
<v S1>that kicked off like full steam. And who knows where

0:06:25.770 --> 0:06:27.690
<v S1>we are right now. This is a little bit out

0:06:27.730 --> 0:06:30.729
<v S1>of date. I would say we're still around AG2, although

0:06:30.770 --> 0:06:33.490
<v S1>it's not clean lines, right? So you have like in

0:06:33.490 --> 0:06:37.850
<v S1>some sense we're already AG3 and moving into AS1. Again,

0:06:37.850 --> 0:06:41.330
<v S1>OpenClaw really helped us try to move into AS1, but

0:06:41.330 --> 0:06:44.289
<v S1>the key point is that we're moving from agents to

0:06:44.330 --> 0:06:46.970
<v S1>the next thing, which is assistants. And these are the

0:06:46.970 --> 0:06:49.090
<v S1>different levels. So you have three different levels of the

0:06:49.089 --> 0:06:51.250
<v S1>chat like phase that we went through. We got three

0:06:51.300 --> 0:06:54.539
<v S1>different levels of agents that we went through. At first

0:06:54.540 --> 0:06:58.820
<v S1>it's just basic CLI, basic web interfaces, transactional, and there's

0:06:58.820 --> 0:07:00.740
<v S1>kind of no memory. You just ask a question, it

0:07:00.740 --> 0:07:02.780
<v S1>comes back, it goes and does stuff. It's cool. Then

0:07:02.779 --> 0:07:07.219
<v S1>you've got basic voice interaction. AG3, you've got extensive voice interaction, right?

0:07:07.260 --> 0:07:10.420
<v S1>So the features ramp up and, you know, as they

0:07:10.420 --> 0:07:13.020
<v S1>get ready to move into the next phase. And then

0:07:13.020 --> 0:07:15.180
<v S1>what we're moving into, which is the whole point of

0:07:15.180 --> 0:07:19.900
<v S1>this video, is we're moving towards assistants, right? Personality, can

0:07:19.900 --> 0:07:25.100
<v S1>see and hear around you, goal monitoring, and personal, persistent personality, right?

0:07:25.140 --> 0:07:27.140
<v S1>We don't really have that yet. We've got a bunch

0:07:27.140 --> 0:07:30.420
<v S1>of startups who have made like, you know, virtual boyfriends

0:07:30.420 --> 0:07:34.180
<v S1>and virtual girlfriends and, you know, virtual best friends. And

0:07:34.180 --> 0:07:37.180
<v S1>that's a whole space that's like taking off massively. But

0:07:37.180 --> 0:07:40.020
<v S1>guess what? Those tend to not have the functionality of

0:07:40.020 --> 0:07:43.100
<v S1>the whole professional system that we work in, which is,

0:07:43.140 --> 0:07:45.700
<v S1>you know, the agent system up here, right? So my

0:07:45.700 --> 0:07:48.300
<v S1>whole point to you is that this is the direction

0:07:48.300 --> 0:07:51.520
<v S1>it's heading: towards assistants, but it's combining all of this

0:07:51.520 --> 0:07:54.480
<v S1>agent stuff into the assistant stuff. So I think that's

0:07:54.480 --> 0:07:57.080
<v S1>the direction that we're going. And I think it's just

0:07:57.080 --> 0:07:59.800
<v S1>going to be extremely powerful. Right. And I want to

0:07:59.800 --> 0:08:02.720
<v S1>talk just about like a bunch of these different things

0:08:02.720 --> 0:08:06.440
<v S1>that are happening around this. So the main concept is

0:08:06.440 --> 0:08:10.000
<v S1>that the human life, the life of the principal, which

0:08:10.000 --> 0:08:12.640
<v S1>is like the, the center, the human that is running

0:08:12.640 --> 0:08:14.920
<v S1>all this, I call it the principal kind of like

0:08:14.960 --> 0:08:17.720
<v S1>executive protection or whatever. My concept here is that the

0:08:17.720 --> 0:08:21.160
<v S1>principal is at the center of the, the whole system, right?

0:08:21.200 --> 0:08:23.800
<v S1>None of this tech matters. None of this AI matters.

0:08:23.800 --> 0:08:26.760
<v S1>None of these agents matter if they're not doing something

0:08:26.760 --> 0:08:29.120
<v S1>for a human, right? The whole purpose of all of

0:08:29.120 --> 0:08:32.280
<v S1>this is to have human things happen, right? And human

0:08:32.280 --> 0:08:35.800
<v S1>things be optimized. And, you know, it's improving the life

0:08:35.800 --> 0:08:38.080
<v S1>of the principal, the human at the center of

0:08:38.080 --> 0:08:41.360
<v S1>all of this. Okay, so back in late 23, I

0:08:41.360 --> 0:08:44.400
<v S1>think I finally posted this thing in early 24. I've

0:08:44.400 --> 0:08:47.440
<v S1>been talking about this concept of augmented, right? So I

0:08:47.440 --> 0:08:49.490
<v S1>had this augmented course, I believe it was one of

0:08:49.490 --> 0:08:52.330
<v S1>the first AI courses out there. And essentially what I

0:08:52.330 --> 0:08:56.050
<v S1>wanted to talk about was how AI to me has

0:08:56.050 --> 0:08:58.530
<v S1>never been about the tools and the websites and all

0:08:58.530 --> 0:09:01.690
<v S1>the different, you know, chatbots or whatever. It's about this,

0:09:01.730 --> 0:09:03.850
<v S1>it's about a human life. It's about what you could

0:09:03.850 --> 0:09:06.370
<v S1>do in your life and how you can improve your

0:09:06.370 --> 0:09:09.570
<v S1>life using all these tools as like the back end, right?

0:09:09.610 --> 0:09:12.530
<v S1>So what I encourage everyone to do is basically come

0:09:12.530 --> 0:09:15.570
<v S1>up with, what are your actual challenges? What are the components?

0:09:15.570 --> 0:09:18.250
<v S1>What are the different workflows and little pieces that you

0:09:18.250 --> 0:09:21.650
<v S1>can turn into a granular problem that AI can help

0:09:21.650 --> 0:09:24.610
<v S1>you solve, right? So over here, there's security things, there's

0:09:24.610 --> 0:09:28.170
<v S1>personal things, there's knowledge things, education things, right? And then

0:09:28.210 --> 0:09:31.089
<v S1>you basically turn that into workflows that you can then

0:09:31.090 --> 0:09:34.250
<v S1>execute AI on, right? So I've got like ideas here.

0:09:34.250 --> 0:09:36.730
<v S1>You have like an idea, a random idea for like

0:09:36.770 --> 0:09:38.690
<v S1>an essay or something, right? And you can turn it

0:09:38.690 --> 0:09:41.170
<v S1>into an essay. You could then turn that into a

0:09:41.210 --> 0:09:44.849
<v S1>LinkedIn post, right? And this is all using those individual components,

0:09:44.850 --> 0:09:47.050
<v S1>but it all comes back to something that you care

0:09:47.150 --> 0:09:50.270
<v S1>about as a human. Here's another example. Since my background

0:09:50.270 --> 0:09:53.510
<v S1>is in security, like, you can do Analyze Incident, and

0:09:53.510 --> 0:09:55.750
<v S1>this just becomes a workflow that you can do. In

0:09:55.750 --> 0:09:57.949
<v S1>this case, this is inside of a tool called Fabric,

0:09:57.950 --> 0:10:00.750
<v S1>which is now part of my PAI project. But you

0:10:00.750 --> 0:10:03.670
<v S1>can essentially go and analyze an incident. You just paste

0:10:03.670 --> 0:10:06.310
<v S1>in content and boom, it puts out an incident with

0:10:06.309 --> 0:10:09.030
<v S1>great formatting like this. And this is kind of well

0:10:09.070 --> 0:10:11.829
<v S1>known now, but it wasn't really known back in 2023.

0:10:11.870 --> 0:10:14.190
<v S1>This was one of my favorite workflows. This one's called

0:10:14.190 --> 0:10:16.589
<v S1>Extract Wisdom. I still use it to this day. You

0:10:16.590 --> 0:10:19.630
<v S1>can basically take any video and pull out like the

0:10:19.630 --> 0:10:22.270
<v S1>most interesting content from it, and you could then do

0:10:22.270 --> 0:10:24.630
<v S1>something with it. Again, you can go and write something,

0:10:24.630 --> 0:10:27.030
<v S1>you could create a video about it, do whatever. So

0:10:27.070 --> 0:10:29.630
<v S1>see something you like, capture it for later in a

0:10:29.630 --> 0:10:31.990
<v S1>format that you can share, right? And all of those

0:10:31.990 --> 0:10:34.949
<v S1>can be little individual workflows. So all of this to

0:10:34.990 --> 0:10:37.989
<v S1>say that this concept of the human at the center

0:10:37.990 --> 0:10:41.030
<v S1>doing their life things to me has always been the

0:10:41.030 --> 0:10:44.710
<v S1>point of any technology and especially AI. So this is

0:10:44.710 --> 0:10:48.720
<v S1>why I believe and why I have not stopped believing

0:10:48.720 --> 0:10:51.439
<v S1>that the direction for all of this is this single

0:10:51.440 --> 0:10:55.560
<v S1>interaction point with a personality that knows us better than anyone,

0:10:55.559 --> 0:10:59.800
<v S1>probably better than our significant others, and can constantly help

0:10:59.800 --> 0:11:02.319
<v S1>us advocate for that stuff. Okay, so let me show

0:11:02.320 --> 0:11:04.600
<v S1>you what that looked like originally when I wrote this

0:11:04.640 --> 0:11:07.000
<v S1>in 2016. Again, this is the book which you don't

0:11:07.000 --> 0:11:08.880
<v S1>have to read because guess what? It's a blog post.

0:11:08.880 --> 0:11:10.920
<v S1>It's free. But if we click up here and we

0:11:10.920 --> 0:11:15.280
<v S1>go to digital assistants, most visible and significant role for

0:11:15.280 --> 0:11:18.480
<v S1>synthetic intelligence, I don't know why I called it synthetic. Basically,

0:11:18.480 --> 0:11:21.679
<v S1>I made the argument that it's not artificial, okay? It's

0:11:21.679 --> 0:11:24.080
<v S1>just different from us. So we should call it synthetic

0:11:24.080 --> 0:11:27.160
<v S1>instead of artificial. Kind of stupid. A computer system that

0:11:27.160 --> 0:11:31.200
<v S1>can monitor human context, intentions and commands, interpret them, take

0:11:31.240 --> 0:11:34.560
<v S1>action as well or better than a human professional personal assistant.

0:11:34.559 --> 0:11:36.600
<v S1>So that was the idea for a DA. And if

0:11:36.600 --> 0:11:38.880
<v S1>we just look down here, not just that there will

0:11:38.880 --> 0:11:41.560
<v S1>be intelligent, but that they'll know our preferences, they'll be

0:11:41.600 --> 0:11:44.280
<v S1>able to adjust the world to our liking. So the

0:11:44.380 --> 0:11:47.020
<v S1>other concept here that goes with this is kind of

0:11:47.059 --> 0:11:50.179
<v S1>everything getting APIs. That was like the second idea from

0:11:50.179 --> 0:11:52.940
<v S1>this set of concepts from this book. So it's like

0:11:52.980 --> 0:11:56.420
<v S1>you talk to your AI, your DA, who is named...

0:11:56.420 --> 0:11:59.660
<v S1>So I talk to Kai. The world is full of APIs.

0:11:59.700 --> 0:12:02.699
<v S1>All the companies are APIs, all the different services and

0:12:02.700 --> 0:12:06.699
<v S1>products and the restaurants and the companies. They're all different APIs.

0:12:06.700 --> 0:12:09.059
<v S1>And when I want to do something like I want to, hey,

0:12:09.059 --> 0:12:10.579
<v S1>I want to buy this product, you know, I want

0:12:10.580 --> 0:12:13.220
<v S1>to schedule a trip or whatever. I tell Kai. Kai

0:12:13.380 --> 0:12:16.140
<v S1>then goes and scours, looks at a bunch of lists

0:12:16.140 --> 0:12:18.939
<v S1>of like the best restaurant or the best Thai food

0:12:18.940 --> 0:12:21.580
<v S1>or the best like place to buy a shirt or

0:12:21.620 --> 0:12:23.740
<v S1>the best place to do a vacation for the cheapest

0:12:23.740 --> 0:12:26.660
<v S1>amount of, you know, money or whatever. And he can

0:12:26.660 --> 0:12:28.620
<v S1>go and research all of that and come back to

0:12:28.620 --> 0:12:30.740
<v S1>me and present it to me somehow, which the third

0:12:30.740 --> 0:12:33.980
<v S1>piece was, okay, custom interfaces. So we're not going to

0:12:33.980 --> 0:12:37.179
<v S1>be using the interfaces provided by the service provider. Our

0:12:37.220 --> 0:12:39.660
<v S1>DA is going to custom-make us interfaces. So that

0:12:39.660 --> 0:12:41.860
<v S1>was another of the big ideas here. And what I

0:12:41.860 --> 0:12:45.309
<v S1>said here was this will change how we interact with everything.

0:12:45.309 --> 0:12:49.390
<v S1>So instead of physically manipulating technology, much of which has

0:12:49.390 --> 0:12:53.429
<v S1>widely varied interfaces, you have to find it, learn the interface, whatever,

0:12:53.429 --> 0:12:56.429
<v S1>start using it in some way. We switch to, guess

0:12:56.429 --> 0:13:01.470
<v S1>what: voice, gestures, text, part of our natural human communication paradigm.

0:13:01.470 --> 0:13:04.309
<v S1>So this is why I think this stuff is predictable.

0:13:04.309 --> 0:13:06.270
<v S1>And I talked about this in the beginning of the book.

0:13:06.309 --> 0:13:10.030
<v S1>You can't predict tech, you can't predict implementations, you can't

0:13:10.030 --> 0:13:12.710
<v S1>predict who's going to win. You can't predict what company

0:13:12.750 --> 0:13:15.190
<v S1>is going to like beat out the other company, what

0:13:15.230 --> 0:13:17.069
<v S1>order the tech is going to come out. But what

0:13:17.070 --> 0:13:19.310
<v S1>I think you can predict, and this was kind of

0:13:19.309 --> 0:13:21.550
<v S1>the basis of the whole thing and the basis of

0:13:21.550 --> 0:13:24.310
<v S1>this whole video as well, is you can predict to

0:13:24.350 --> 0:13:27.710
<v S1>some degree what humans want because it doesn't really change.

0:13:27.710 --> 0:13:30.069
<v S1>What we want is we want to feel seen, we

0:13:30.070 --> 0:13:33.430
<v S1>want to feel understood, we want to have a close relationship.

0:13:33.429 --> 0:13:35.670
<v S1>We like to have a trusted relationship. And one of

0:13:35.670 --> 0:13:38.470
<v S1>the cool things is that billionaires or, you know, people

0:13:38.470 --> 0:13:40.070
<v S1>who make a lot of money, one of the things

0:13:40.070 --> 0:13:42.970
<v S1>they say helps them more than anything is having a really,

0:13:42.970 --> 0:13:45.690
<v S1>really high quality assistant. So I think this has always

0:13:45.690 --> 0:13:48.090
<v S1>been a thing that we would want when we could

0:13:48.090 --> 0:13:50.610
<v S1>have it. So this is why I'm so locked onto

0:13:50.610 --> 0:13:54.250
<v S1>this model of a single DA that you do

0:13:54.250 --> 0:13:58.530
<v S1>all interaction through, right? So instead of interacting with technology directly,

0:13:58.530 --> 0:14:01.090
<v S1>we will interact with our DA. Our DA will work

0:14:01.090 --> 0:14:04.610
<v S1>out the details with the necessary daemon API. We speak,

0:14:04.650 --> 0:14:07.770
<v S1>things happen, we gesture, things happen, we text, things happen.

0:14:07.770 --> 0:14:10.970
<v S1>No need to find, understand, or master new tech. That's

0:14:10.970 --> 0:14:14.330
<v S1>for the service and your DA, my Kai, to work

0:14:14.370 --> 0:14:17.610
<v S1>out amongst themselves, right? Digital assistants will become the preferred

0:14:17.610 --> 0:14:21.090
<v S1>interface between humans and the world in a disruptive and

0:14:21.090 --> 0:14:25.130
<v S1>foundational way, right? So that's, that's what I'm essentially talking

0:14:25.130 --> 0:14:27.250
<v S1>about still to this day. All right. I want to

0:14:27.250 --> 0:14:30.050
<v S1>give some more examples of just like real world stuff, right?

0:14:30.090 --> 0:14:32.610
<v S1>So if you're watching this, you're probably somebody who likes

0:14:32.610 --> 0:14:36.330
<v S1>to optimize. You're probably already deep into the agent ecosystem

0:14:36.330 --> 0:14:39.130
<v S1>or thinking about going in that direction. And I was

0:14:39.130 --> 0:14:41.260
<v S1>just sitting, I was having a conversation with my buddy

0:14:41.260 --> 0:14:44.100
<v S1>Will recently, and we were just talking, and I was

0:14:44.100 --> 0:14:46.260
<v S1>trying to explain this whole concept to him of like,

0:14:46.300 --> 0:14:48.420
<v S1>why I think this is the direction everything's going to

0:14:48.420 --> 0:14:51.180
<v S1>go and why I think all this agent stuff and

0:14:51.180 --> 0:14:54.340
<v S1>the harness stuff just fades into the background because it's

0:14:54.340 --> 0:14:58.260
<v S1>just infrastructure that's going to be used by your DA. Okay.

0:14:58.300 --> 0:15:01.580
<v S1>So as we were talking, I'm like, okay, so what

0:15:01.580 --> 0:15:04.380
<v S1>cool ideas have we had in the course of this conversation?

0:15:04.380 --> 0:15:06.460
<v S1>And what is going to be done with those ideas?

0:15:06.460 --> 0:15:08.860
<v S1>And the answer is, well, did we write them down?

0:15:08.860 --> 0:15:11.740
<v S1>Do we have a recording device? Is that recording device

0:15:11.740 --> 0:15:14.940
<v S1>somehow capturing things? Okay, that's fine. If it captured it, now

0:15:14.940 --> 0:15:17.820
<v S1>what can it do with it? If I care about

0:15:17.820 --> 0:15:22.420
<v S1>conversations where we have really cool interactions and he gives

0:15:22.420 --> 0:15:25.140
<v S1>me ideas, I give him ideas. I want that to

0:15:25.180 --> 0:15:28.180
<v S1>be always captured. Okay. If I'm reading a book and

0:15:28.180 --> 0:15:30.460
<v S1>I'm like, oh, this is really cool. I want my

0:15:30.500 --> 0:15:33.100
<v S1>DA to be watching over my shoulder, looking at the

0:15:33.100 --> 0:15:35.700
<v S1>book and when I say, oh, wow, this is really cool.

0:15:35.740 --> 0:15:38.740
<v S1>My DA is looking at the page and can extract

0:15:38.840 --> 0:15:40.640
<v S1>what I meant and make a note of it. If

0:15:40.640 --> 0:15:43.120
<v S1>I say, oh man, that's really cool, well, that should

0:15:43.120 --> 0:15:45.000
<v S1>be extracted out to make a note of it. That

0:15:45.000 --> 0:15:48.120
<v S1>should go into some sort of infrastructure which is waiting

0:15:48.160 --> 0:15:50.520
<v S1>to be followed up on, or I'm going to write

0:15:50.520 --> 0:15:52.480
<v S1>a piece about it, or I'm going to make a

0:15:52.480 --> 0:15:54.440
<v S1>video about it, or I'm just going to think about it.

0:15:54.440 --> 0:15:56.000
<v S1>Maybe I'm not going to do anything. I just want

0:15:56.000 --> 0:15:58.280
<v S1>to be able to remember that we had the conversation

0:15:58.280 --> 0:16:01.080
<v S1>and this useful thing came out of it, right? So

0:16:01.120 --> 0:16:04.160
<v S1>there are just a million examples of this. Okay, let's

0:16:04.160 --> 0:16:07.040
<v S1>just break down a person's life. Let's say they do

0:16:07.040 --> 0:16:10.000
<v S1>lots of research. Let's say they do security research. Again,

0:16:10.000 --> 0:16:12.640
<v S1>because my background is security. So it's like I'm thinking about

0:16:12.640 --> 0:16:16.440
<v S1>vulnerability research. I'm thinking about threat intelligence. I'm thinking about

0:16:16.440 --> 0:16:19.560
<v S1>what are the latest attackers doing all that kind of stuff,

0:16:19.560 --> 0:16:23.479
<v S1>that stuff Kai should be collecting. Kai actually is collecting

0:16:23.480 --> 0:16:26.720
<v S1>all of that stuff, collecting, organizing, trying to see if

0:16:26.720 --> 0:16:28.760
<v S1>it applies to any of the tech stacks that we

0:16:28.760 --> 0:16:31.640
<v S1>work with. Does it apply to any of the customers

0:16:31.680 --> 0:16:34.800
<v S1>that I advise for? Right. Oh, there's a million different

0:16:34.800 --> 0:16:38.330
<v S1>things that can be done at any given moment that

0:16:38.330 --> 0:16:40.570
<v S1>if that was the only thing that I had time

0:16:40.570 --> 0:16:42.170
<v S1>to do, I would do it. If I had the

0:16:42.170 --> 0:16:44.850
<v S1>time to do these things, I would physically be doing

0:16:44.850 --> 0:16:47.290
<v S1>them myself. I don't have time for them because there

0:16:47.330 --> 0:16:49.730
<v S1>are thousands of them. Guess what can do it? A

0:16:49.730 --> 0:16:52.250
<v S1>whole army of agents, right? But I can't talk to

0:16:52.290 --> 0:16:53.890
<v S1>an army of agents. How am I going to talk

0:16:53.890 --> 0:16:56.130
<v S1>to an army of agents? I just talk to Kai.

0:16:56.170 --> 0:16:59.970
<v S1>Kai has all the context. Kai knows what my ideal

0:16:59.970 --> 0:17:02.450
<v S1>state is. Kai knows what I'm trying to do in

0:17:02.450 --> 0:17:05.210
<v S1>the world, what all my goals are. And check this out.

0:17:05.210 --> 0:17:07.890
<v S1>This is like one of the primary ideas here, okay?

0:17:07.930 --> 0:17:11.010
<v S1>This is like the centerpiece of the entire tech stack

0:17:11.010 --> 0:17:13.410
<v S1>in my opinion. This is where this is going. Your

0:17:13.410 --> 0:17:18.010
<v S1>single DA will have basically one prime directive. Know what

0:17:18.010 --> 0:17:21.330
<v S1>your current state is from reading all these APIs, from

0:17:21.330 --> 0:17:24.970
<v S1>pulling all this context. It can see your heartbeat, right?

0:17:25.010 --> 0:17:27.290
<v S1>It can see your depression level. It could hear the

0:17:27.290 --> 0:17:29.290
<v S1>tone of your voice. It could see if you've worked

0:17:29.290 --> 0:17:31.290
<v S1>out recently, it could see if you're fighting with your

0:17:31.290 --> 0:17:33.210
<v S1>significant other. It could see if you haven't talked to

0:17:33.210 --> 0:17:35.530
<v S1>your friends. What is your current state? What is your

0:17:35.710 --> 0:17:38.630
<v S1>desired state? What is your ideal state that is captured

0:17:38.630 --> 0:17:41.310
<v S1>in your Telos, that is captured as part of, in

0:17:41.350 --> 0:17:44.869
<v S1>my case, my Pi, my personal AI infrastructure. Right. That is

0:17:44.869 --> 0:17:49.030
<v S1>very clear. Kai is watching this constantly. When my Diet

0:17:49.030 --> 0:17:52.590
<v S1>Coke runs out, if the restaurant has a /menu

0:17:52.590 --> 0:17:55.710
<v S1>and a /order API, I could request a Diet Coke.

0:17:55.750 --> 0:17:59.430
<v S1>It beeps in the back and my friend the waiter

0:17:59.430 --> 0:18:02.109
<v S1>comes over and brings it over. Or a robot rolls

0:18:02.109 --> 0:18:04.430
<v S1>out and brings it over, or it drops from the ceiling.
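
NOTE
A minimal sketch of the loop described above: compare current state with ideal state and close the gap through whatever API the venue exposes.
The /menu and /order paths come from the talk. The Restaurant class, its get and post methods, and close_gaps are hypothetical names invented for illustration, not part of Pi.
```python
from dataclasses import dataclass, field
@dataclass
class Restaurant:
    """Hypothetical in-memory stand-in for a venue exposing /menu and /order."""
    menu: dict = field(default_factory=lambda: {"diet coke": 3.50, "water": 0.0})
    orders: list = field(default_factory=list)
    def get(self, path):
        # Only /menu is readable in this sketch.
        if path == "/menu":
            return dict(self.menu)
        raise KeyError(path)
    def post(self, path, item):
        # /order accepts any item that is on the menu.
        if path != "/order" or item not in self.menu:
            raise ValueError(f"cannot order {item!r} via {path}")
        self.orders.append(item)
        return {"status": "accepted", "item": item}
def close_gaps(current, ideal, venue):
    """Order every item whose current level is below the ideal level."""
    placed = []
    available = venue.get("/menu")
    for item, wanted in ideal.items():
        if current.get(item, 0) < wanted and item in available:
            placed.append(venue.post("/order", item)["item"])
    return placed
venue = Restaurant()
placed = close_gaps(current={"diet coke": 0}, ideal={"diet coke": 1}, venue=venue)
print(placed)
```
How the drink physically arrives stays out of scope here, as in the talk; the sketch only covers noticing the gap and placing the request.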

0:18:04.430 --> 0:18:06.350
<v S1>Who fucking knows how it's going to happen, right? Who

0:18:06.350 --> 0:18:08.150
<v S1>knows who's going to get there? This type of stuff

0:18:08.150 --> 0:18:11.869
<v S1>is not predictable. What is predictable is when my Diet

0:18:11.869 --> 0:18:15.070
<v S1>Coke runs out. Humans would like their Diet Coke not

0:18:15.070 --> 0:18:17.270
<v S1>to have run out. They would like to have more

0:18:17.310 --> 0:18:19.990
<v S1>Diet Coke. That's the way it works, right? So some

0:18:19.990 --> 0:18:23.629
<v S1>of these things are so predictable. What's predictable? The conversation

0:18:23.630 --> 0:18:25.990
<v S1>I'm having with Will, we want to record that and

0:18:25.990 --> 0:18:28.870
<v S1>extract cool stuff out of it. That is extremely predictable.

0:18:28.910 --> 0:18:32.070
<v S1>Having the best meal experience, having the best food at

0:18:32.070 --> 0:18:35.000
<v S1>the best restaurant. Okay. The other example that I have,

0:18:35.000 --> 0:18:37.240
<v S1>I don't have kids myself, but I was painting a

0:18:37.240 --> 0:18:40.040
<v S1>picture for him of like, okay, my daughter, she is

0:18:40.040 --> 0:18:43.040
<v S1>in college in New York City. It's late there. And

0:18:43.080 --> 0:18:45.280
<v S1>you know, she's walking home right now. So Kai is

0:18:45.280 --> 0:18:47.760
<v S1>telling me in my ear while I'm talking to Will,

0:18:47.800 --> 0:18:50.880
<v S1>that she's walking down the street. There's actually someone following her.

0:18:50.920 --> 0:18:53.359
<v S1>How do I know that? How does Kai know that?

0:18:53.359 --> 0:18:57.080
<v S1>Because her DA reports this kind of stuff when she

0:18:57.080 --> 0:19:00.080
<v S1>goes for a walk by herself, or she's walking back

0:19:00.080 --> 0:19:03.199
<v S1>to her apartment from her dorm or whatever. Again, I

0:19:03.200 --> 0:19:05.480
<v S1>don't have kids. I'm just making this up. But she's

0:19:05.480 --> 0:19:08.679
<v S1>wearing some sort of necklace or EarPods or whatever it

0:19:08.680 --> 0:19:11.280
<v S1>is that allows her DA to see in front of

0:19:11.280 --> 0:19:13.639
<v S1>her and see behind her, maybe to the sides or whatever,

0:19:13.640 --> 0:19:16.439
<v S1>and that kind of stuff. Her DA knows to report

0:19:16.440 --> 0:19:19.200
<v S1>that to Kai right now. Kai is watching that all

0:19:19.200 --> 0:19:21.520
<v S1>the time, right? Because her DA is sending it all

0:19:21.520 --> 0:19:23.840
<v S1>the time to Kai, right? Because that's an agreement I

0:19:23.840 --> 0:19:26.280
<v S1>have with my daughter. She's okay with that, right? Doesn't

0:19:26.280 --> 0:19:28.480
<v S1>normally interrupt me, but right now, in the middle of

0:19:28.480 --> 0:19:32.360
<v S1>this conversation with Will, it's like, bing. Hey, what's going on? Yeah.

0:19:32.359 --> 0:19:35.820
<v S1>So Julie, she's currently being followed. I just called, you know,

0:19:35.859 --> 0:19:38.540
<v S1>neighborhood watch, and someone just came out and she's being

0:19:38.540 --> 0:19:42.620
<v S1>escorted now. It's all okay. Think about what is possible here. Okay.
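
NOTE
A minimal sketch of the DA-to-DA reporting described above: one DA streams reports to another, which logs routine ones and interrupts its principal only for urgent ones.
The Severity levels, the Report fields, the PrincipalDA class, and the "julie-da" sender id are all hypothetical names chosen for illustration. The sharing agreement is modeled as a simple trusted-sender list.
```python
from dataclasses import dataclass
from enum import IntEnum
class Severity(IntEnum):
    ROUTINE = 1
    NOTABLE = 2
    URGENT = 3
@dataclass
class Report:
    sender: str
    subject: str
    severity: Severity
    detail: str
class PrincipalDA:
    """Receives reports from other DAs; interrupts only at or above a threshold."""
    def __init__(self, name, trusted_senders, interrupt_at=Severity.URGENT):
        self.name = name
        self.trusted_senders = set(trusted_senders)
        self.interrupt_at = interrupt_at
        self.log = []
    def receive(self, report):
        # Reports from senders outside the agreed sharing list are dropped.
        if report.sender not in self.trusted_senders:
            return "ignored"
        self.log.append(report)
        if report.severity >= self.interrupt_at:
            return f"interrupt: {report.subject} - {report.detail}"
        return "logged"
kai = PrincipalDA("Kai", trusted_senders={"julie-da"})
print(kai.receive(Report("julie-da", "Julie", Severity.ROUTINE, "left class, walking home")))
print(kai.receive(Report("julie-da", "Julie", Severity.URGENT, "being followed; escort requested")))
```
The threshold is what keeps the stream quiet during a normal conversation while still letting the one report that matters break through.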

0:19:42.660 --> 0:19:44.860
<v S1>This is where it starts to get a little bit crazy,

0:19:44.859 --> 0:19:49.620
<v S1>but it's extremely not crazy. It's extremely tangible and possible.

0:19:49.619 --> 0:19:53.060
<v S1>And this is like literally what I'm moving towards with Pi.

0:19:53.100 --> 0:19:56.780
<v S1>I'm literally building this now instead of building like more

0:19:56.780 --> 0:20:00.020
<v S1>and more optimizations because I'm super guilty of like deep

0:20:00.020 --> 0:20:02.300
<v S1>diving on all the agents and how they talk to

0:20:02.300 --> 0:20:04.620
<v S1>each other, blah, blah, blah. No, this is where things

0:20:04.619 --> 0:20:07.540
<v S1>are going. This is where everyone will be in two years.

0:20:07.540 --> 0:20:09.620
<v S1>So kind of the whole point of this video is

0:20:09.619 --> 0:20:11.900
<v S1>to like let you know this. So you could just

0:20:11.900 --> 0:20:14.900
<v S1>be like, oh crap, and like jump way ahead in

0:20:14.900 --> 0:20:17.860
<v S1>this timeline, right? But think about this. Okay, guess what

0:20:17.859 --> 0:20:20.500
<v S1>else is spinning up right now. It's nowhere near

0:20:20.500 --> 0:20:23.859
<v S1>as advanced as like the agent stuff, but robotics. Okay,

0:20:23.900 --> 0:20:26.820
<v S1>what if she had a dog with her, a little robotic dog?

0:20:26.820 --> 0:20:28.740
<v S1>And by the way, I'm still afraid of those because

0:20:28.780 --> 0:20:30.900
<v S1>of the Black Mirror episode, but she's got a little

0:20:30.900 --> 0:20:33.590
<v S1>robotic dog. It has all the cameras. What if the

0:20:33.590 --> 0:20:36.910
<v S1>moment she left, because the sun was down, her little drone,

0:20:36.910 --> 0:20:39.670
<v S1>it's like this big, just flies over her head. So

0:20:39.790 --> 0:20:43.430
<v S1>Kai is watching a top down view of her going from,

0:20:43.470 --> 0:20:45.830
<v S1>you know, the dorm or whatever class she was in

0:20:45.830 --> 0:20:48.190
<v S1>at the college, walking to the apartment. It could just

0:20:48.190 --> 0:20:50.910
<v S1>watch her the whole way, could see people coming and going.

0:20:50.950 --> 0:20:54.109
<v S1>It's watching the police radio. Her DA is watching the

0:20:54.109 --> 0:20:57.109
<v S1>police radio. Her DA is also watching the drone. This

0:20:57.109 --> 0:21:01.230
<v S1>is awareness, okay? There's nothing humans care more about than

0:21:01.270 --> 0:21:05.350
<v S1>like protecting their loved ones. Like being attractive, being interesting,

0:21:05.510 --> 0:21:10.470
<v S1>essentially survival and reproduction. If you reverse engineer survival and reproduction,

0:21:10.470 --> 0:21:14.070
<v S1>you will very quickly get to a trusted DA that

0:21:14.070 --> 0:21:17.550
<v S1>can do everything for you and understands everything about your life,

0:21:17.550 --> 0:21:20.830
<v S1>and can monitor and listen to and see everything in

0:21:20.830 --> 0:21:24.470
<v S1>your life. Okay, so eventually, you know, way in the future,

0:21:24.470 --> 0:21:26.910
<v S1>ten years in the future, 15 years in the future,

0:21:26.910 --> 0:21:30.110
<v S1>when she's walking, she'll have like four robots, four like

0:21:30.290 --> 0:21:34.090
<v S1>Optimuses or whoever's making robots at the time walking with her. Right.

0:21:34.090 --> 0:21:35.369
<v S1>I would have to make a lot of money to

0:21:35.410 --> 0:21:38.130
<v S1>be able to afford four robots, presumably. But she could

0:21:38.130 --> 0:21:40.170
<v S1>have an escort. Okay, if she's going through a bad

0:21:40.170 --> 0:21:43.409
<v S1>part of town or whatever. Plus the drones, right? I mean,

0:21:43.450 --> 0:21:45.969
<v S1>this is the type of situation where these DAs are

0:21:45.970 --> 0:21:50.290
<v S1>going to be able to deploy extra sensors, extra intel gathering,

0:21:50.330 --> 0:21:54.610
<v S1>use extra tokens to do whatever matters to that principal.

0:21:54.609 --> 0:21:56.609
<v S1>In this case, I'm the principal. I have things I

0:21:56.650 --> 0:21:59.690
<v S1>care about, which includes my daughter. And I am, you know,

0:21:59.730 --> 0:22:04.050
<v S1>implementing Kai on my behalf. And her DA is implementing

0:22:04.050 --> 0:22:07.410
<v S1>the resources that they have to protect this resource, which

0:22:07.410 --> 0:22:10.850
<v S1>is my daughter. Right? So that extends to everything that

0:22:10.850 --> 0:22:15.050
<v S1>extends to a company protecting its resources and its employees. And,

0:22:15.090 --> 0:22:17.410
<v S1>you know, everything you're trying to do in life. Just

0:22:17.410 --> 0:22:21.770
<v S1>think about this. 24/7, using a giant army of agents,

0:22:21.770 --> 0:22:24.170
<v S1>which is all this harness stuff that we're doing, all

0:22:24.170 --> 0:22:27.409
<v S1>the intel gathering, all the different skills and capabilities that

0:22:27.410 --> 0:22:31.619
<v S1>tech has, except it's yours. It is your DA

0:22:31.820 --> 0:22:35.500
<v S1>watching millions of things at once. Thousands of things

0:22:35.500 --> 0:22:38.020
<v S1>at once, right? It has to scale up, right? With

0:22:38.020 --> 0:22:40.540
<v S1>time and, you know, having enough compute and all this stuff.

0:22:40.540 --> 0:22:43.860
<v S1>This is ultimately what we're building. And I am sprinting

0:22:43.900 --> 0:22:46.780
<v S1>to this. This is what Pi is all about, right?

0:22:46.820 --> 0:22:51.300
<v S1>Is essentially building this harness with a named personal assistant

0:22:51.300 --> 0:22:53.500
<v S1>that is monitoring all this stuff for you. And it

0:22:53.500 --> 0:22:57.540
<v S1>starts with having extraordinarily good context about yourself and what

0:22:57.540 --> 0:22:59.820
<v S1>you want. All right. So let me just show you

0:22:59.820 --> 0:23:02.700
<v S1>a couple things from Pi. So this is my Pi system.

0:23:02.700 --> 0:23:04.460
<v S1>Let me just open a new session here.

0:23:04.500 --> 0:23:05.580
<v S2>Kai here, ready to go.

0:23:05.580 --> 0:23:08.619
<v S1>So this is Kai. This is Kai here, my current

0:23:08.619 --> 0:23:10.940
<v S1>form of Kai. Now I can talk to Kai via

0:23:10.940 --> 0:23:14.859
<v S1>Telegram because I have basically a chat system built in. None

0:23:14.859 --> 0:23:17.300
<v S1>of it is OpenClaw. It's all Pi-native. I

0:23:17.300 --> 0:23:21.060
<v S1>could talk via Telegram. I could talk via iMessage. Most importantly,

0:23:21.060 --> 0:23:25.020
<v S1>I could just talk right here. Right. So magnifying human capabilities, Pi,

0:23:25.060 --> 0:23:27.460
<v S1>that's what this is. Now if you look here, I've

0:23:27.600 --> 0:23:31.880
<v S1>got 51 public skills and 43 private skills. I've got

0:23:31.880 --> 0:23:35.680
<v S1>418 workflows. All the stuff that I've been telling you about,

0:23:35.680 --> 0:23:38.879
<v S1>I've been building all this stuff since 2023. All these

0:23:38.880 --> 0:23:42.080
<v S1>skills are all the different capabilities that Kai has to

0:23:42.119 --> 0:23:46.000
<v S1>move me from current state to ideal state, right? That

0:23:46.000 --> 0:23:48.640
<v S1>is what we're building here. Okay? This is what an

0:23:48.640 --> 0:23:52.800
<v S1>agent harness ultimately is becoming: an advocate for you, to

0:23:52.840 --> 0:23:56.160
<v S1>move you towards your ideal state. And for most people,

0:23:56.160 --> 0:23:58.080
<v S1>they don't know what their ideal state is. It's a

0:23:58.080 --> 0:24:00.400
<v S1>silly question. It's a silly concept. What are you even

0:24:00.400 --> 0:24:03.080
<v S1>talking about? It comes down to who are you actually,

0:24:03.080 --> 0:24:05.440
<v S1>what are you actually about? What are you actually trying

0:24:05.440 --> 0:24:07.959
<v S1>to do? Yeah. So we could do Telos. So we

0:24:07.960 --> 0:24:11.560
<v S1>have a Telos skill. We can actually interview you and figure

0:24:11.560 --> 0:24:14.520
<v S1>out exactly what you are doing with your Telos. So

0:24:14.520 --> 0:24:17.040
<v S1>if I go cd and I go to skills

0:24:17.080 --> 0:24:19.600
<v S1>or actually I go to user, this is like the,

0:24:19.600 --> 0:24:22.000
<v S1>the structure here, right? So I've got a whole bunch

0:24:22.000 --> 0:24:24.760
<v S1>of different stuff in here. I've got personal, I've got life,

0:24:24.760 --> 0:24:28.369
<v S1>I've got finance, I've got business, I've got health stuff,

0:24:28.369 --> 0:24:31.370
<v S1>I've got everything in here. I've got my writing style,

0:24:31.369 --> 0:24:33.929
<v S1>I've got Kai's writing style in here. I've got a

0:24:33.930 --> 0:24:37.409
<v S1>whole bunch of stuff about Kai's identity, which ebbs and

0:24:37.410 --> 0:24:41.010
<v S1>flows, right, as we have this relationship, he and I.

0:24:41.090 --> 0:24:44.770
<v S1>Pronunciations for different things. I've got contacts in here for

0:24:44.810 --> 0:24:47.930
<v S1>like my particular contacts that are important to me in

0:24:47.930 --> 0:24:50.650
<v S1>my life. I've got contact details so I could say

0:24:50.690 --> 0:24:53.490
<v S1>email them. I could say, hey, go do this research

0:24:53.490 --> 0:24:56.330
<v S1>on this and send it over to Jason or Sasha

0:24:56.330 --> 0:25:01.010
<v S1>or whoever. Right? The point is, I don't need to describe. Oh,

0:25:01.050 --> 0:25:04.570
<v S1>there's a person, they have this email, and re-explain

0:25:04.570 --> 0:25:07.650
<v S1>myself every single time. I can literally, because I have

0:25:07.650 --> 0:25:08.409
<v S1>this system.

0:25:08.450 --> 0:25:09.250
<v S2>Kai here, ready to go.

0:25:09.290 --> 0:25:13.010
<v S1>Because I have this system, I could say very, very little.

0:25:13.010 --> 0:25:15.730
<v S1>And this thing will just start going and working, right?

0:25:15.770 --> 0:25:17.929
<v S1>I would show you a bunch of stuff, but I

0:25:17.970 --> 0:25:19.850
<v S1>need to do that in a separate video. I'll do

0:25:19.850 --> 0:25:22.689
<v S1>that in the 5.0 announcement video describing all the

0:25:22.690 --> 0:25:25.190
<v S1>different stuff that you can do with the Pi system.

0:25:25.190 --> 0:25:27.470
<v S1>But the point is, I shouldn't be in here. I

0:25:27.470 --> 0:25:29.870
<v S1>shouldn't be in here on a command line, like typing

0:25:29.869 --> 0:25:32.390
<v S1>in things. Or like, I do everything through voice, but

0:25:32.390 --> 0:25:35.150
<v S1>even that I'm still inside of a terminal, right? And

0:25:35.150 --> 0:25:37.230
<v S1>if I'm inside of an app, that's no better. I

0:25:37.230 --> 0:25:40.070
<v S1>should be talking to my agent who lives in my ear,

0:25:40.109 --> 0:25:42.590
<v S1>who can see what I see, right? Right now, I'm

0:25:42.590 --> 0:25:44.830
<v S1>sitting in front of four screens here. Kai should be

0:25:44.830 --> 0:25:46.790
<v S1>able to see everything on my screen. Should be able

0:25:46.790 --> 0:25:50.149
<v S1>to control everything on my screen. What we're moving to is Jarvis.

0:25:50.190 --> 0:25:54.470
<v S1>What we're moving to is Minority Report gestures. Hey, do this, hey,

0:25:54.470 --> 0:25:57.030
<v S1>do that, whatever. And we're getting pretty close. I mean,

0:25:57.070 --> 0:25:59.470
<v S1>I can already do that pretty well with Pi, just

0:25:59.470 --> 0:26:03.550
<v S1>with voice activation of like going to do something like, hey,

0:26:03.590 --> 0:26:05.790
<v S1>run the Pi upgrade skill and see what's new out

0:26:05.790 --> 0:26:08.629
<v S1>there for us. So this one will actually go and research

0:26:08.630 --> 0:26:11.710
<v S1>everything new that's happened in AI. And it'll look at

0:26:11.710 --> 0:26:15.110
<v S1>our entire harness and see, hey, what's going on with

0:26:15.109 --> 0:26:17.830
<v S1>the latest in AI, what new skills came out, what

0:26:17.869 --> 0:26:21.070
<v S1>new blog posts from Anthropic, what new videos came out

0:26:21.070 --> 0:26:23.919
<v S1>from my favorite YouTube people who talk about AI. It

0:26:23.920 --> 0:26:26.480
<v S1>goes and collects all that. It watches all the videos.

0:26:26.480 --> 0:26:29.040
<v S1>It pulls all the transcripts together. And then guess what

0:26:29.040 --> 0:26:32.160
<v S1>it does? It looks at our entire harness. It looks

0:26:32.160 --> 0:26:35.240
<v S1>at the entire Pi system and all the context that

0:26:35.240 --> 0:26:38.560
<v S1>he knows about me. And it comes back and says, hey,

0:26:38.600 --> 0:26:42.160
<v S1>I recommend we implement this. OpenAI just did this cool thing.

0:26:42.160 --> 0:26:45.400
<v S1>I recommend we do this. This one woman, she made

0:26:45.400 --> 0:26:48.320
<v S1>a new skill that does research much better and it

0:26:48.320 --> 0:26:51.359
<v S1>has better research agents, and it has a fact checker

0:26:51.359 --> 0:26:53.520
<v S1>at the end, which would be a good upgrade to

0:26:53.560 --> 0:26:56.480
<v S1>our fact checker, right? And I'm like, yeah, cool. Sounds good. Boom.

0:26:56.520 --> 0:26:58.920
<v S1>It goes and does it. Do you know how much

0:26:58.920 --> 0:27:01.399
<v S1>research this thing is actually doing? I mean, this is

0:27:01.440 --> 0:27:04.000
<v S1>thousands of tokens this thing is using. I mean, this

0:27:04.000 --> 0:27:07.240
<v S1>thing is just getting started. We're currently inside of the

0:27:07.280 --> 0:27:10.200
<v S1>Pi upgrade skill running a workflow, which is upgrade, but

0:27:10.200 --> 0:27:13.560
<v S1>there's actually multiple workflows in there. It's looking at so

0:27:13.560 --> 0:27:16.400
<v S1>many different sources which are customized for me. The point

0:27:16.400 --> 0:27:18.360
<v S1>is all I have to do. I didn't even have

0:27:18.359 --> 0:27:20.879
<v S1>to say run the Pi upgrade skill. I could just say,

0:27:20.880 --> 0:27:23.460
<v S1>what's the latest out there that we should be thinking about,

0:27:23.460 --> 0:27:26.300
<v S1>and Kai would know to go and run the Pi

0:27:26.340 --> 0:27:30.220
<v S1>upgrade skill. So look at this. YouTube channels. GitHub trending.

0:27:30.220 --> 0:27:32.820
<v S1>Claude Code freshness check. So this is looking at the

0:27:32.820 --> 0:27:36.300
<v S1>latest releases, the engineering blog, the red team blog, all

0:27:36.300 --> 0:27:39.380
<v S1>these different sources. It brings all that stuff up, and maybe

0:27:39.380 --> 0:27:41.260
<v S1>I'll show you what that looks like when it comes back.
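
NOTE
A minimal sketch of the routing idea described above: a plain request such as "what's the latest out there" is mapped to a skill, which then fans out over its sources.
Keyword matching is an assumption made to keep the example runnable; the talk does not say how Kai picks a skill. The SKILLS table, its trigger words, and the route and plan helpers are hypothetical.
```python
SKILLS = {
    "upgrade": {
        "triggers": ("latest", "new", "upgrade"),
        "sources": ("youtube channels", "github trending", "release notes", "engineering blog"),
    },
    "research": {
        "triggers": ("research", "look into", "find out"),
        "sources": ("web search",),
    },
}
def route(request):
    """Return the skill whose trigger words match the request most often, or None."""
    text = request.lower()
    scores = {name: sum(t in text for t in spec["triggers"]) for name, spec in SKILLS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
def plan(request):
    """List the (skill, source) collection steps a matched skill would run."""
    skill = route(request)
    if skill is None:
        return []
    return [(skill, source) for source in SKILLS[skill]["sources"]]
print(route("What's the latest out there that we should be thinking about?"))
print(plan("Run the upgrade skill and see what's new"))
```
Each (skill, source) pair stands for one collection step; comparing the findings against the existing harness and recommending changes would follow, and is not modeled here.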

0:27:41.300 --> 0:27:42.900
<v S1>All right, so let me show you what it looks

0:27:42.900 --> 0:27:45.780
<v S1>like inside of Pi as well. Okay, so this is

0:27:45.780 --> 0:27:49.260
<v S1>another interface to Pi that is now in version five.

0:27:49.260 --> 0:27:52.699
<v S1>So because I'm going so heavy into this DA-centered

0:27:53.060 --> 0:27:57.780
<v S1>single interface point with all this scaffolding and agents

0:27:57.780 --> 0:28:01.500
<v S1>and everything hidden behind your DA, your named DA with

0:28:01.500 --> 0:28:05.219
<v S1>a personality, your single point of contact for AI,

0:28:05.260 --> 0:28:09.380
<v S1>because that is the case, I've basically brought Pi into

0:28:09.540 --> 0:28:12.180
<v S1>kind of like a web interface world, which makes it

0:28:12.180 --> 0:28:15.060
<v S1>clear that this is not just an agent harness, but

0:28:15.060 --> 0:28:19.060
<v S1>it is a life optimization system. This is a life OS, right?

0:28:19.100 --> 0:28:23.510
<v S1>Based on AI, which is context focused with a single DA,

0:28:23.550 --> 0:28:26.230
<v S1>with personality basically in charge of all of it, right?

0:28:26.270 --> 0:28:29.470
<v S1>So I've got life stuff here. Work. Telos. Right? Look

0:28:29.470 --> 0:28:31.430
<v S1>at this. I've got my different Telos stuff and I've

0:28:31.430 --> 0:28:34.669
<v S1>got the observer thing turned off, so anything sensitive gets

0:28:34.710 --> 0:28:36.710
<v S1>hidden out. But yeah, I've got my health stuff here.

0:28:36.710 --> 0:28:40.430
<v S1>I've got finances, I've got business stuff. Work life. Right.

0:28:40.470 --> 0:28:42.950
<v S1>This is the view that I care about. This is

0:28:42.950 --> 0:28:45.990
<v S1>what I care about from AI. So this pulse system

0:28:45.990 --> 0:28:48.550
<v S1>is new in 5.0. And again this is not some

0:28:48.550 --> 0:28:51.270
<v S1>product pitch. This is all completely open source. I want

0:28:51.270 --> 0:28:54.110
<v S1>everyone to have this. That's literally the purpose of Pi,

0:28:54.150 --> 0:28:57.430
<v S1>is to enable people, activate people is what I call

0:28:57.430 --> 0:29:00.630
<v S1>it with all this stuff to have these capabilities and

0:29:00.630 --> 0:29:03.070
<v S1>more importantly, like kind of the point of this video

0:29:03.070 --> 0:29:05.510
<v S1>is to think about AI in this way. AI is

0:29:05.510 --> 0:29:07.590
<v S1>not here to be an agent for you. You're not

0:29:07.590 --> 0:29:11.510
<v S1>here to manage AI harnesses like, fuck all that. That's ridiculous.

0:29:11.510 --> 0:29:14.830
<v S1>We are here to live human lives, enhanced human lives.

0:29:14.830 --> 0:29:18.469
<v S1>And AI is a capability that we've never had to

0:29:18.510 --> 0:29:20.930
<v S1>be able to have it go and research and do

0:29:20.930 --> 0:29:24.969
<v S1>all this stuff for us, collect knowledge, constantly optimize. It's

0:29:24.970 --> 0:29:28.450
<v S1>literally stressing constantly about how to make your life better, right?

0:29:28.490 --> 0:29:31.170
<v S1>What better use of technology is there than that? So

0:29:31.170 --> 0:29:33.570
<v S1>that's what this looks like. Click on agents. You actually

0:29:33.570 --> 0:29:37.170
<v S1>see look it's currently running the Pi upgrade skill. And yeah,

0:29:37.170 --> 0:29:39.250
<v S1>if it goes into the algorithm you see a lot

0:29:39.250 --> 0:29:41.370
<v S1>more content there. Oh look at this. This is a

0:29:41.410 --> 0:29:44.050
<v S1>knowledge base. Look at this. We've got companies. Look, this

0:29:44.050 --> 0:29:47.010
<v S1>is like every idea I've ever had. So Karpathy came

0:29:47.010 --> 0:29:50.050
<v S1>out with this concept of an LLM wiki. So I

0:29:50.050 --> 0:29:53.090
<v S1>basically took because Pi has been doing this already, I

0:29:53.090 --> 0:29:56.010
<v S1>basically have every idea I've ever had kind of brought

0:29:56.010 --> 0:29:59.130
<v S1>into this concept. It's actually listed, right? You have full

0:29:59.130 --> 0:30:01.770
<v S1>data on the knowledge. Look at this. This is every

0:30:01.770 --> 0:30:05.450
<v S1>document for how Pi works. It's all in here. And

0:30:05.450 --> 0:30:07.850
<v S1>guess what? Kai can read it and everyone can read it.

0:30:07.850 --> 0:30:10.010
<v S1>Look at all the different skills and what they do.

0:30:10.010 --> 0:30:12.530
<v S1>All of our different hooks that we have implemented are bold.

0:30:12.530 --> 0:30:15.210
<v S1>That's a little bit sensitive. Can't show that one. Security

0:30:15.250 --> 0:30:19.780
<v S1>system, won't show that one. Performance. Yeah. Just absolutely insane. Oh,

0:30:19.780 --> 0:30:21.980
<v S1>this is like, how much it's costing me. That's frightening.

0:30:21.980 --> 0:30:25.700
<v S1>And here's the assistant. Here's Kai. He's online. He's doing tasks.

0:30:25.740 --> 0:30:28.500
<v S1>We've got different, you know, scheduled tasks and stuff like that.

0:30:28.500 --> 0:30:31.860
<v S1>This is just like absolutely insane. I can't tell you

0:30:31.860 --> 0:30:35.100
<v S1>how powerful it is to have this all in one place,

0:30:35.100 --> 0:30:39.820
<v S1>organized in this way, oriented around your life, your human life,

0:30:39.820 --> 0:30:42.620
<v S1>instead of around the tech itself. The tech is not

0:30:42.620 --> 0:30:44.540
<v S1>the point. The human is the point. All right, so

0:30:44.540 --> 0:30:46.860
<v S1>that's what I wanted to talk about in this video.

0:30:46.900 --> 0:30:50.700
<v S1>Essentially combination of multiple ideas. But the central theme is

0:30:50.700 --> 0:30:53.820
<v S1>that I think it's very clear where all of this

0:30:53.820 --> 0:30:57.060
<v S1>personal AI tech is going. And the reason I like

0:30:57.060 --> 0:31:00.219
<v S1>to offer this is because it can be very stressful

0:31:00.220 --> 0:31:04.380
<v S1>to constantly be tracking harnesses, constantly be tracking different platforms,

0:31:04.380 --> 0:31:07.740
<v S1>different chatbots, different agents. Oh, which platform should I use?

0:31:07.740 --> 0:31:11.740
<v S1>Blah blah blah. Think further out. Think: who is my DA?

0:31:11.780 --> 0:31:15.460
<v S1>What capabilities am I giving my DA? Have I defined

0:31:15.460 --> 0:31:19.280
<v S1>everything about myself and given that context to my DA,

0:31:19.280 --> 0:31:22.719
<v S1>and have I defined what my ideal life looks like?

0:31:22.720 --> 0:31:25.560
<v S1>Have I defined what ideal state looks like for me,

0:31:25.560 --> 0:31:29.560
<v S1>for my businesses, for my finance, for my health, for

0:31:29.560 --> 0:31:32.920
<v S1>my relationships? Define what ideal state looks like. Make it

0:31:32.920 --> 0:31:36.000
<v S1>very clear to your DA that their goal is to

0:31:36.040 --> 0:31:38.880
<v S1>use all these capabilities, all these skills. I've got, what,

0:31:38.920 --> 0:31:42.720
<v S1>what is it, over 100 skills? Got all these skills, proactively,

0:31:42.720 --> 0:31:47.000
<v S1>not reactively. Proactively monitoring for the current state. Am I

0:31:47.040 --> 0:31:49.680
<v S1>surrounded by someone that's like looking over my shoulder and

0:31:49.680 --> 0:31:52.120
<v S1>trying to steal something? Am I surrounded by someone who's

0:31:52.120 --> 0:31:54.520
<v S1>about to steal my phone? Is someone about to attack me?

0:31:54.520 --> 0:31:56.320
<v S1>Is someone following me on the street in the middle

0:31:56.320 --> 0:31:59.760
<v S1>of the night? There's safety issues. There's enhancement issues. All

0:31:59.760 --> 0:32:03.480
<v S1>of this encapsulated into a structure that is visible and

0:32:03.480 --> 0:32:07.200
<v S1>understandable by your DA. I think it's incredibly powerful. I

0:32:07.200 --> 0:32:09.640
<v S1>think it's a really powerful way to think about where

0:32:09.640 --> 0:32:11.760
<v S1>things are going. And I really look forward to seeing

0:32:11.760 --> 0:32:14.360
<v S1>what you do with this idea. Respond with feedback. Let

0:32:14.360 --> 0:32:15.760
<v S1>me know what you think and we'll see you in

0:32:15.760 --> 0:32:16.200
<v S1>the next one.