1 00:00:15,356 --> 00:00:15,796 Speaker 1: Pushkin. 2 00:00:22,036 --> 00:00:24,476 Speaker 2: After every interview we do for the show, we upload 3 00:00:24,516 --> 00:00:28,236 Speaker 2: the audio to a piece of software called Descript. Descript 4 00:00:28,316 --> 00:00:31,476 Speaker 2: turns the audio into a transcript, and then I can 5 00:00:31,756 --> 00:00:35,556 Speaker 2: edit that transcript, cut out the boring parts, move sections around, 6 00:00:35,916 --> 00:00:39,876 Speaker 2: and when I do that, Descript edits the underlying audio 7 00:00:40,036 --> 00:00:45,436 Speaker 2: to match. As software, Descript is pretty janky, buggy. It's 8 00:00:45,476 --> 00:00:47,996 Speaker 2: constantly changing in ways that can make it hard to use, 9 00:00:48,356 --> 00:00:51,556 Speaker 2: and sometimes it just blows stuff up. But we use 10 00:00:51,596 --> 00:00:55,716 Speaker 2: it anyway because Descript is an incredible 11 00:00:54,956 --> 00:00:56,676 Speaker 1: advance over what came before. 12 00:00:57,396 --> 00:01:01,956 Speaker 2: Before Descript, audio software represented audio files not as words 13 00:01:01,996 --> 00:01:05,436 Speaker 2: that you can read and edit, but as waveforms, as 14 00:01:05,676 --> 00:01:07,676 Speaker 2: squiggly lines presented 15 00:01:07,316 --> 00:01:09,876 Speaker 1: on a timeline. So when Descript came along, 16 00:01:09,996 --> 00:01:12,396 Speaker 2: being able to edit audio by editing words on a 17 00:01:12,436 --> 00:01:15,956 Speaker 2: screen was this huge advance, and it was an advance 18 00:01:15,956 --> 00:01:21,876 Speaker 2: made possible by artificial intelligence. Eventually, Descript expanded to allow 19 00:01:21,916 --> 00:01:25,036 Speaker 2: people to edit not just audio but also video, and 20 00:01:25,116 --> 00:01:28,636 Speaker 2: last fall, OpenAI, the company that makes ChatGPT, 21 00:01:29,196 --> 00:01:30,956 Speaker 2: led a fifty million dollar
22 00:01:30,796 --> 00:01:32,396 Speaker 1: investment round in Descript. 23 00:01:32,876 --> 00:01:35,636 Speaker 2: It's a sign that Descript is moving out to the 24 00:01:35,676 --> 00:01:40,036 Speaker 2: new AI frontier, the frontier of generative AI, AI that 25 00:01:40,076 --> 00:01:44,076 Speaker 2: creates words and pictures. This is of immediate interest to me, 26 00:01:44,796 --> 00:01:47,836 Speaker 2: as in, is AI gonna help me do my job? 27 00:01:48,236 --> 00:01:50,756 Speaker 2: Is AI gonna do my job? 28 00:01:51,276 --> 00:01:54,116 Speaker 1: But there is also a bigger question here: what is 29 00:01:54,196 --> 00:01:55,076 Speaker 1: AI going to mean 30 00:01:55,236 --> 00:01:58,956 Speaker 2: more broadly for people whose jobs involve writing things and 31 00:01:58,996 --> 00:02:02,116 Speaker 2: creating visuals, which is to say, what is AI going 32 00:02:02,196 --> 00:02:11,316 Speaker 2: to mean for almost all white-collar workers? I'm Jacob 33 00:02:11,356 --> 00:02:13,676 Speaker 2: Goldstein and this is What's Your Problem, a show about 34 00:02:13,716 --> 00:02:17,556 Speaker 2: people trying to make technological progress. My guest today is 35 00:02:17,636 --> 00:02:21,516 Speaker 2: Andrew Mason, founder and CEO of Descript, or maybe it's 36 00:02:21,836 --> 00:02:25,156 Speaker 2: Descript. By the way, I've always said Descript and I'm 37 00:02:25,156 --> 00:02:28,356 Speaker 2: pretty sure that's wrong, right? It's Descript, like Detour. 38 00:02:28,436 --> 00:02:30,196 Speaker 3: We're noncommittal on the issue. 39 00:02:30,316 --> 00:02:31,756 Speaker 2: Let's do the subjective version. 40 00:02:31,756 --> 00:02:33,676 Speaker 1: You're just one man. How do you say the name 41 00:02:33,716 --> 00:02:34,356 Speaker 1: of your company? 42 00:02:34,676 --> 00:02:37,836 Speaker 3: Yeah, I've kind of cultivated the ability to flip between 43 00:02:37,876 --> 00:02:38,876 Speaker 3: them as I speak.
44 00:02:39,716 --> 00:02:40,396 Speaker 1: You're killing me. 45 00:02:40,556 --> 00:02:42,036 Speaker 3: The world still needs a little mystery. 46 00:02:42,196 --> 00:02:44,436 Speaker 1: Okay, how about this? Say your name and your job. 47 00:02:45,156 --> 00:02:48,476 Speaker 3: My name is Andrew Mason. I work at Descript. That's 48 00:02:49,036 --> 00:02:53,316 Speaker 3: Descript, Descript. 49 00:02:53,556 --> 00:02:54,476 Speaker 1: Well played. 50 00:02:55,836 --> 00:02:58,516 Speaker 2: Earlier in his career, Andrew Mason was the co-founder 51 00:02:58,556 --> 00:03:01,596 Speaker 2: of Groupon. He took the company public and then got 52 00:03:01,636 --> 00:03:05,076 Speaker 2: fired after its stock fell by something like seventy-five percent. 53 00:03:05,596 --> 00:03:08,676 Speaker 2: After that, he started a company called Detour, or maybe 54 00:03:08,676 --> 00:03:12,236 Speaker 2: it's, I don't know. The company made these highly produced 55 00:03:12,316 --> 00:03:15,356 Speaker 2: audio walking tours that you could listen to on your phone. 56 00:03:15,396 --> 00:03:18,276 Speaker 2: In that job, Andrew saw the challenges of working with 57 00:03:18,356 --> 00:03:20,236 Speaker 2: the old waveform- 58 00:03:19,716 --> 00:03:21,276 Speaker 1: based audio editing software. 59 00:03:21,596 --> 00:03:25,116 Speaker 2: At the same time, AI-generated transcripts were getting better 60 00:03:25,276 --> 00:03:28,476 Speaker 2: and cheaper, and new technology was making it possible to 61 00:03:28,636 --> 00:03:32,476 Speaker 2: automatically match a transcript to an audio file.
Andrew looked 62 00:03:32,476 --> 00:03:35,076 Speaker 2: at those two developments and thought, we should make an 63 00:03:35,116 --> 00:03:38,396 Speaker 2: audio editor that works like a word processor, which he 64 00:03:38,516 --> 00:03:41,036 Speaker 2: admits was a distraction from what he was supposed to 65 00:03:41,036 --> 00:03:43,276 Speaker 2: be doing, which was making walking tours. 66 00:03:44,916 --> 00:03:47,396 Speaker 3: If I'm being honest, it was a bit of an indulgence. 67 00:03:47,636 --> 00:03:50,436 Speaker 3: It just felt like an incredibly cool problem to work on. 68 00:03:51,076 --> 00:03:53,956 Speaker 3: I went to school for music technology and worked in 69 00:03:53,996 --> 00:03:58,156 Speaker 3: a recording studio after I graduated, and just always loved tools, 70 00:03:58,236 --> 00:04:02,436 Speaker 3: and audiovisual tools in particular. It was just so 71 00:04:02,756 --> 00:04:05,076 Speaker 3: fun to start thinking about this puzzle. 72 00:04:05,636 --> 00:04:06,316 Speaker 1: Uh huh. 73 00:04:06,356 --> 00:04:10,676 Speaker 3: So we told ourselves it was kind of a way of diversifying, 74 00:04:10,716 --> 00:04:13,636 Speaker 3: but that's just like a ridiculous way for 75 00:04:13,676 --> 00:04:15,796 Speaker 3: a startup that just hasn't even found 76 00:04:16,236 --> 00:04:19,716 Speaker 3: product-market fit in their core product to be thinking 77 00:04:19,796 --> 00:04:22,196 Speaker 3: about the world. You know, all of the advice textbooks 78 00:04:22,236 --> 00:04:24,836 Speaker 3: will tell you not to do that, and it's probably 79 00:04:24,876 --> 00:04:28,956 Speaker 3: generally good advice, but it was just irresistible, 80 00:04:28,596 --> 00:04:28,796 Speaker 1: you know. 81 00:04:28,876 --> 00:04:31,516 Speaker 2: So I am a fan of Descript. I started using 82 00:04:31,596 --> 00:04:35,036 Speaker 2: it around when it came out several years ago.
Certainly 83 00:04:35,636 --> 00:04:38,996 Speaker 2: I think it's great. It is kind of janky, and 84 00:04:39,076 --> 00:04:42,836 Speaker 2: it's always kind of janky, right? And my guess is 85 00:04:43,396 --> 00:04:46,556 Speaker 2: janky meaning like a little bit unstable, things don't quite work. 86 00:04:46,596 --> 00:04:48,156 Speaker 2: It's always telling you to restart. 87 00:04:48,676 --> 00:04:50,356 Speaker 3: By the way, I'm not sure if I have 88 00:04:50,516 --> 00:04:52,476 Speaker 3: your side, so I may ask you to send me 89 00:04:52,596 --> 00:04:54,676 Speaker 3: this entire portion of the interview so I can share 90 00:04:54,676 --> 00:04:55,236 Speaker 3: it with the team. 91 00:04:55,756 --> 00:04:58,156 Speaker 2: So the thing is, like, I wonder, why is it 92 00:04:58,156 --> 00:05:00,436 Speaker 2: always kind of janky? Why is it never just like 93 00:05:00,476 --> 00:05:02,916 Speaker 2: stable and it works? And my guess is it's because 94 00:05:02,916 --> 00:05:05,436 Speaker 2: you're pushing forward really fast, right? You're trying to make 95 00:05:05,476 --> 00:05:08,476 Speaker 2: it better and better and better, and presumably there is 96 00:05:08,516 --> 00:05:11,236 Speaker 2: some trade-off, right? Like the faster you try and 97 00:05:11,276 --> 00:05:14,796 Speaker 2: push it forward, the more janky it's gonna be.
You could, 98 00:05:14,836 --> 00:05:17,236 Speaker 2: I'm sure, just perfect the way it was four years ago, 99 00:05:17,316 --> 00:05:18,836 Speaker 2: but then it would never get better, but it would 100 00:05:18,836 --> 00:05:21,076 Speaker 2: be stable, right? And so this is like a big 101 00:05:21,116 --> 00:05:25,236 Speaker 2: whatever startup-founder-type question, like, is that some balance 102 00:05:25,236 --> 00:05:27,276 Speaker 2: you're always trying to figure out, how fast do we 103 00:05:27,756 --> 00:05:29,756 Speaker 2: iterate versus how much do we try and make it 104 00:05:29,876 --> 00:05:31,036 Speaker 2: just stable and work? 105 00:05:31,876 --> 00:05:36,796 Speaker 3: Yeah, that's an astute observation, not the fact that it's janky. 106 00:05:36,996 --> 00:05:38,596 Speaker 3: That doesn't take a genius. 107 00:05:38,316 --> 00:05:41,236 Speaker 2: Respectful, but respectfully, as a fan, 108 00:05:41,076 --> 00:05:41,916 Speaker 1: I'm telling you it's. 109 00:05:43,396 --> 00:05:45,716 Speaker 3: But I think like your attempt to make sense 110 00:05:45,756 --> 00:05:48,116 Speaker 3: out of it, I think like a good story to 111 00:05:48,196 --> 00:05:51,116 Speaker 3: tell here is maybe like going back to the 112 00:05:51,196 --> 00:05:56,116 Speaker 3: very beginning of Descript. So when we became Descript, 113 00:05:56,476 --> 00:06:01,236 Speaker 3: we sold off Detour to Bose and we decided to 114 00:06:01,356 --> 00:06:05,796 Speaker 3: just focus on building out this media word processor thing. 115 00:06:06,996 --> 00:06:10,956 Speaker 3: And some of the public radio producers who had worked 116 00:06:10,956 --> 00:06:14,916 Speaker 3: at Detour went on back into public radio and they 117 00:06:14,996 --> 00:06:21,596 Speaker 3: became some of the earliest customers of Descript.
And what 118 00:06:21,756 --> 00:06:27,316 Speaker 3: we found was that they pushed it so much farther 119 00:06:27,876 --> 00:06:29,756 Speaker 3: than we were ready for. 120 00:06:30,236 --> 00:06:32,876 Speaker 1: Ah, so quickly, what do you mean by that? Like, 121 00:06:32,916 --> 00:06:34,036 Speaker 1: what is an example of that. 122 00:06:34,316 --> 00:06:37,076 Speaker 3: Yeah, I mean specifically in the case of like some 123 00:06:37,116 --> 00:06:40,836 Speaker 3: of these shows, it means putting together three to five 124 00:06:40,916 --> 00:06:45,756 Speaker 3: hour cuts of tape from many different files, with lots 125 00:06:45,876 --> 00:06:49,716 Speaker 3: like tons and tons of edits and notes mixed into 126 00:06:49,716 --> 00:06:54,916 Speaker 3: the edits, and just like stuff that we hadn't pressure 127 00:06:54,956 --> 00:06:58,396 Speaker 3: tested from a performance just giant. 128 00:06:58,476 --> 00:07:01,276 Speaker 2: The files are really big, right, Like a three hour 129 00:07:01,636 --> 00:07:04,276 Speaker 2: audio file is actually a giant file, right, And if 130 00:07:04,276 --> 00:07:05,836 Speaker 2: you're stacking up a bunch of those, so you have 131 00:07:05,836 --> 00:07:08,556 Speaker 2: all these giant files and you're making tons of cuts, 132 00:07:08,636 --> 00:07:14,196 Speaker 2: that's just like computationally intensive, that kind of thing storage intensive. 133 00:07:15,036 --> 00:07:17,396 Speaker 3: Yeah, it was just something that we hadn't that we 134 00:07:17,436 --> 00:07:20,876 Speaker 3: hadn't optimized for. It's it's an eminently solvable problem, but 135 00:07:20,916 --> 00:07:25,396 Speaker 3: it was something that in the earliest versions we hadn't done. 
136 00:07:26,036 --> 00:07:28,676 Speaker 3: And so that is kind of in many ways been 137 00:07:28,796 --> 00:07:32,316 Speaker 3: the story of descript up to this point, where there's 138 00:07:32,396 --> 00:07:35,476 Speaker 3: there's been that element of it, and there were kind 139 00:07:35,556 --> 00:07:39,156 Speaker 3: of realities of needing to make quick progress that we 140 00:07:39,196 --> 00:07:46,636 Speaker 3: had to balance against stability and what we had for 141 00:07:46,716 --> 00:07:50,436 Speaker 3: our customers in terms of like the core product idea 142 00:07:50,996 --> 00:07:53,756 Speaker 3: of being able to edit by text was still for 143 00:07:53,916 --> 00:07:58,476 Speaker 3: them so much better than the alternative that there was 144 00:07:59,276 --> 00:08:04,516 Speaker 3: just a tolerance of the stability issues that honestly made 145 00:08:04,596 --> 00:08:06,596 Speaker 3: us sick to our stomachs that we had to put 146 00:08:06,636 --> 00:08:10,036 Speaker 3: people through. And it's not like wegn but it was 147 00:08:10,116 --> 00:08:13,596 Speaker 3: like we had to make trade offs there. So all 148 00:08:13,636 --> 00:08:16,516 Speaker 3: of this pushing kind of culminated with this release of 149 00:08:16,556 --> 00:08:20,316 Speaker 3: a pretty major overhaul that we did at the end 150 00:08:20,316 --> 00:08:25,996 Speaker 3: of last year and since then, since last November and 151 00:08:26,556 --> 00:08:29,996 Speaker 3: really like through the first half of this year, is 152 00:08:30,036 --> 00:08:32,036 Speaker 3: when we think we start to get to a good place. 
153 00:08:32,796 --> 00:08:35,396 Speaker 3: Our goal is that if we're having this conversation, like 154 00:08:35,436 --> 00:08:38,516 Speaker 3: we're not going to be having the same conversation in 155 00:08:38,716 --> 00:08:41,436 Speaker 3: say July, for sure at the very latest, like the 156 00:08:41,436 --> 00:08:44,196 Speaker 3: conversation we'll be having with someone like you will be, wow, 157 00:08:44,236 --> 00:08:46,756 Speaker 3: it's gotten, it's like not an issue anymore. 158 00:08:46,876 --> 00:08:50,516 Speaker 2: So you say all that, but also, you just got 159 00:08:50,516 --> 00:08:53,916 Speaker 2: this big investment from OpenAI. You got a thing 160 00:08:54,116 --> 00:08:57,756 Speaker 2: on Descript that says sign up to try GPT-4 161 00:08:57,876 --> 00:09:00,836 Speaker 2: with Descript, which I just signed up for and I'm 162 00:09:00,916 --> 00:09:04,636 Speaker 2: very curious about. That doesn't sound like, oh, we've arrived 163 00:09:04,676 --> 00:09:06,316 Speaker 2: and now we've got our product and we've just got 164 00:09:06,316 --> 00:09:08,316 Speaker 2: to hone it. That sounds like there's this whole giant 165 00:09:08,396 --> 00:09:10,356 Speaker 2: new universe of things you are about to try and 166 00:09:10,356 --> 00:09:10,916 Speaker 2: figure out. 167 00:09:11,556 --> 00:09:13,916 Speaker 3: That's true, and that's the funny thing about all of 168 00:09:13,916 --> 00:09:16,996 Speaker 3: this, is that at the same time that we're 169 00:09:16,996 --> 00:09:21,076 Speaker 3: turning to focus on quality, it's a moment where generative 170 00:09:21,076 --> 00:09:25,076 Speaker 3: AI has arrived at a scale and with a force 171 00:09:25,316 --> 00:09:28,156 Speaker 3: that no one really saw coming this quickly.
172 00:09:28,676 --> 00:09:31,396 Speaker 2: So okay, I know from the beginning Descript was 173 00:09:31,396 --> 00:09:35,076 Speaker 2: built on top of AI, you know, the technology for 174 00:09:35,116 --> 00:09:39,516 Speaker 2: transcription, for matching audio to text, but was Descript itself 175 00:09:39,596 --> 00:09:40,716 Speaker 2: an AI company? 176 00:09:41,516 --> 00:09:44,236 Speaker 3: So we had some really smart people on the team 177 00:09:44,596 --> 00:09:47,636 Speaker 3: with machine learning experience, but I wouldn't say 178 00:09:47,636 --> 00:09:50,196 Speaker 3: in the early days we were like a company that 179 00:09:50,676 --> 00:09:53,956 Speaker 3: was with anybody that was doing like original AI research 180 00:09:54,036 --> 00:09:59,436 Speaker 3: or anything like that. We saw that as a gap 181 00:09:59,836 --> 00:10:05,036 Speaker 3: that we wanted to solve. And so I forget exactly 182 00:10:05,636 --> 00:10:07,956 Speaker 3: what year it was, it was maybe about four years 183 00:10:07,956 --> 00:10:12,916 Speaker 3: ago, we saw this company called Lyrebird. It was a 184 00:10:12,916 --> 00:10:16,436 Speaker 3: company out of Y Combinator with some really smart 185 00:10:17,076 --> 00:10:21,036 Speaker 3: PhD candidates. They had built a model that would build a 186 00:10:21,076 --> 00:10:23,836 Speaker 3: clone of your voice based on I think about three 187 00:10:24,076 --> 00:10:27,476 Speaker 3: minutes or five minutes of training data, of just talking 188 00:10:27,556 --> 00:10:27,796 Speaker 3: to it. 189 00:10:27,876 --> 00:10:31,756 Speaker 2: Let me just say, I know Lyrebird is spelled L-Y-R-E, 190 00:10:32,596 --> 00:10:34,476 Speaker 2: but I assume they're aware 191 00:10:34,236 --> 00:10:35,036 Speaker 1: of the homonym.
192 00:10:35,156 --> 00:10:37,036 Speaker 2: Right, this is a thing that is cloning your voice 193 00:10:37,076 --> 00:10:39,636 Speaker 2: so that you can make it sound like you're talking 194 00:10:39,756 --> 00:10:42,876 Speaker 2: even if you're not talking. And the company is called Lyrebird, 195 00:10:43,276 --> 00:10:47,276 Speaker 2: and this is a somewhat fraught thing, right? Like, I 196 00:10:47,276 --> 00:10:49,156 Speaker 2: feel like they're throwing it in my face that this 197 00:10:49,196 --> 00:10:53,636 Speaker 2: is a sketchy product that they're developing. 198 00:10:54,436 --> 00:10:55,716 Speaker 1: Did it cross your mind? 199 00:10:56,196 --> 00:10:59,276 Speaker 3: Did it cross my mind? Is like the ethical quandary 200 00:10:59,276 --> 00:11:01,956 Speaker 3: that we were getting into, or like the branding implications 201 00:11:01,996 --> 00:11:02,596 Speaker 3: of the name? 202 00:11:03,636 --> 00:11:04,836 Speaker 1: More the ethical quandary. 203 00:11:05,396 --> 00:11:10,236 Speaker 3: Yeah, the ethical quandary absolutely entered our mind. And our 204 00:11:10,276 --> 00:11:12,756 Speaker 3: point of view on that, and has been our point 205 00:11:12,796 --> 00:11:16,316 Speaker 3: of view on these things in general, has been that 206 00:11:17,356 --> 00:11:22,196 Speaker 3: we don't want to be like out there paving the 207 00:11:22,236 --> 00:11:25,756 Speaker 3: way for any new paths to the apocalypse, so to speak. 208 00:11:27,756 --> 00:11:31,996 Speaker 3: We actually think, like have always felt like not really 209 00:11:32,036 --> 00:11:36,236 Speaker 3: sure how society was going to put the brakes on 210 00:11:36,236 --> 00:11:38,116 Speaker 3: this sort of thing. We just knew that we didn't 211 00:11:38,116 --> 00:11:40,156 Speaker 3: want to be part of it, and we tried to 212 00:11:40,156 --> 00:11:42,996 Speaker 3: put guardrails in place on our product
that would make 213 00:11:43,036 --> 00:11:46,316 Speaker 3: it easy to stay off the slippery slope. So in 214 00:11:46,356 --> 00:11:50,996 Speaker 3: the case of Lyrebird, which once we bought them, we 215 00:11:51,276 --> 00:11:53,596 Speaker 3: integrated their technology and released it as something that we 216 00:11:53,636 --> 00:11:55,996 Speaker 3: call Overdub. It's a way that you can clone your voice. 217 00:11:56,636 --> 00:11:59,556 Speaker 3: We require you to authenticate that it's actually you, and 218 00:11:59,596 --> 00:12:02,756 Speaker 3: we only let you clone your own voice, and that's 219 00:12:02,796 --> 00:12:05,476 Speaker 3: worked really well. We're now in a world where there's 220 00:12:05,476 --> 00:12:08,876 Speaker 3: other people that have similar models and they're not putting 221 00:12:08,876 --> 00:12:11,756 Speaker 3: those protections in place. And the use case that we've 222 00:12:11,756 --> 00:12:14,876 Speaker 3: always been the most excited about is making it possible 223 00:12:14,876 --> 00:12:18,916 Speaker 3: to edit your natural recordings, so going in and changing 224 00:12:18,956 --> 00:12:22,236 Speaker 3: an individual word, and we've built some special stuff that 225 00:12:22,316 --> 00:12:24,676 Speaker 3: will kind of listen to the audio on either side 226 00:12:24,716 --> 00:12:27,396 Speaker 3: and make sure that it blends in from an intonation perspective. 227 00:12:27,996 --> 00:12:30,196 Speaker 3: We started with the ability to delete stuff and move 228 00:12:30,236 --> 00:12:32,796 Speaker 3: stuff around. Now you can just type and really make 229 00:12:32,836 --> 00:12:34,156 Speaker 3: it feel like it's a word processor. 230 00:12:34,316 --> 00:12:39,316 Speaker 2: Presumably the better you get, the better the technology you 231 00:12:39,436 --> 00:12:43,156 Speaker 2: use to clone a voice gets, the more words it can do.
Right, 232 00:12:43,196 --> 00:12:46,196 Speaker 2: I mean, every week, for What's Your Problem, I write 233 00:12:46,196 --> 00:12:50,676 Speaker 2: a little introduction and then I read it. But presumably 234 00:12:50,716 --> 00:12:53,236 Speaker 2: at some point Overdub will be good enough that no 235 00:12:53,276 --> 00:12:55,556 Speaker 2: one will know whether it's me reading it or 236 00:12:55,596 --> 00:12:56,836 Speaker 2: I'm just typing it, right? 237 00:12:57,316 --> 00:13:00,196 Speaker 3: We have a new version of Overdub that will release 238 00:13:01,116 --> 00:13:04,156 Speaker 3: in the next couple of months, and it's the first 239 00:13:04,196 --> 00:13:08,236 Speaker 3: time that I've heard my own voice doing a narration 240 00:13:08,356 --> 00:13:12,076 Speaker 3: of something that made me say, like, this sounds so 241 00:13:12,236 --> 00:13:16,116 Speaker 3: much like me in a way that it's not distracting 242 00:13:16,196 --> 00:13:18,156 Speaker 3: or the AI does not get in the way. 243 00:13:18,556 --> 00:13:23,236 Speaker 2: Can I try that new version now, like, not this minute, 244 00:13:23,276 --> 00:13:24,156 Speaker 2: but like for the show? 245 00:13:24,516 --> 00:13:25,196 Speaker 1: Yeah, for the show. 246 00:13:26,196 --> 00:13:27,956 Speaker 3: I bet we could find a way to do it. 247 00:13:27,956 --> 00:13:30,396 Speaker 3: It's just so you could hear it and stuff. 248 00:13:30,796 --> 00:13:33,116 Speaker 2: There's a universe where I say, at this moment in 249 00:13:33,156 --> 00:13:37,236 Speaker 2: the show, guess what? Today, that voice, me reading the 250 00:13:37,236 --> 00:13:39,756 Speaker 2: intro at the top of the show, that was Overdubbed. 251 00:13:39,756 --> 00:13:40,676 Speaker 1: It wasn't really me. 252 00:13:40,916 --> 00:13:45,756 Speaker 3: Yeah, we tried Overdub for the voice doing the intro 253 00:13:45,836 --> 00:13:47,196 Speaker 3: at the top of the show.
254 00:13:47,516 --> 00:13:49,316 Speaker 1: And we decided it wasn't quite 255 00:13:49,076 --> 00:13:51,676 Speaker 2: good enough, but we decided it would work for this 256 00:13:51,796 --> 00:13:52,636 Speaker 2: part of the show. 257 00:13:53,156 --> 00:13:57,036 Speaker 1: What you're hearing right now, it's not really me. It's Overdubbed. 258 00:13:58,116 --> 00:14:01,756 Speaker 2: In a minute, what Overdub and ChatGPT and generative 259 00:14:01,796 --> 00:14:04,116 Speaker 2: AI will mean for Descript and for the 260 00:14:04,076 --> 00:14:06,396 Speaker 1: world and also for me. 261 00:14:12,436 --> 00:14:16,316 Speaker 2: Now back to the show. Descript is expanding from podcasts 262 00:14:16,316 --> 00:14:18,716 Speaker 2: to video, and it just took a big investment from 263 00:14:18,836 --> 00:14:22,636 Speaker 2: OpenAI, the company that makes ChatGPT, and also 264 00:14:22,676 --> 00:14:26,516 Speaker 2: this system called DALL-E that uses AI to generate images. 265 00:14:26,916 --> 00:14:30,116 Speaker 2: So Descript is clearly pointing toward a future where it's 266 00:14:30,156 --> 00:14:33,276 Speaker 2: going to be software for creating AI-generated or at 267 00:14:33,356 --> 00:14:34,316 Speaker 2: least AI- 268 00:14:34,236 --> 00:14:35,956 Speaker 1: enhanced audio and video. 269 00:14:36,556 --> 00:14:39,356 Speaker 2: And I asked Andrew, what does that future look like? 270 00:14:39,796 --> 00:14:42,596 Speaker 2: How is generative AI going to work in Descript? 271 00:14:43,236 --> 00:14:45,676 Speaker 3: I don't think we know entirely yet. In a lot 272 00:14:45,676 --> 00:14:47,836 Speaker 3: of ways, it feels to me like you're letting this 273 00:14:48,116 --> 00:14:51,876 Speaker 3: alien into your app. You're just giving it the 274 00:14:51,956 --> 00:14:56,716 Speaker 3: keys and then the interfaces.
How do you find, how 275 00:14:56,756 --> 00:15:00,916 Speaker 3: do you find a way to kind of give the 276 00:15:00,956 --> 00:15:04,556 Speaker 3: aliens some buttons in your UI, give them the ability 277 00:15:04,636 --> 00:15:06,796 Speaker 3: to press the buttons, and then how do you talk 278 00:15:06,836 --> 00:15:07,476 Speaker 3: to the alien? 279 00:15:07,796 --> 00:15:08,796 Speaker 1: What do you mean? Like, 280 00:15:08,836 --> 00:15:13,396 Speaker 2: that is a striking metaphor, a little scarier, right? It 281 00:15:13,476 --> 00:15:17,036 Speaker 2: suggests a certain level of uncertainty and potential downside. It's 282 00:15:17,076 --> 00:15:18,716 Speaker 2: not like, oh, this is great, this is going to 283 00:15:18,716 --> 00:15:20,556 Speaker 2: solve a problem. Like, why do you say it's like 284 00:15:20,636 --> 00:15:21,716 Speaker 2: letting an alien 285 00:15:21,396 --> 00:15:24,116 Speaker 3: in as opposed to letting a human in? 286 00:15:25,276 --> 00:15:27,836 Speaker 1: It's a really interesting choice of words. Tell me more 287 00:15:27,836 --> 00:15:28,276 Speaker 1: about it. 288 00:15:30,996 --> 00:15:35,436 Speaker 3: So let's start by just saying, like, very specifically, what 289 00:15:35,476 --> 00:15:40,276 Speaker 3: I mean. I think, when implemented well, what this will 290 00:15:40,276 --> 00:15:43,596 Speaker 3: feel like is as if you had a co-editor 291 00:15:44,236 --> 00:15:47,196 Speaker 3: in a document with you, in our case, in a 292 00:15:47,276 --> 00:15:52,436 Speaker 3: video or a podcast that you're working on, that is smart, 293 00:15:52,596 --> 00:15:54,956 Speaker 3: knows how to do everything, definitely knows how to do 294 00:15:54,996 --> 00:15:59,316 Speaker 3: the tedious busy work, and you can kind 295 00:15:59,356 --> 00:16:03,156 Speaker 3: of guide or direct through giving these tasks.
You know, 296 00:16:03,716 --> 00:16:07,596 Speaker 3: it's almost like it's the production assistant or something like that, 297 00:16:07,636 --> 00:16:10,716 Speaker 3: and you're the director and you're able to just guide 298 00:16:10,756 --> 00:16:13,076 Speaker 3: it and give it feedback on how it's doing and 299 00:16:13,076 --> 00:16:15,196 Speaker 3: what it's doing well and what it's not doing well. 300 00:16:15,676 --> 00:16:18,076 Speaker 2: There's a version of it where it's like we've gotten 301 00:16:18,156 --> 00:16:21,716 Speaker 2: used to the graphical user interface, right? We've been trained 302 00:16:21,756 --> 00:16:24,796 Speaker 2: since the Macintosh computer in the mid nineteen eighties that 303 00:16:24,876 --> 00:16:26,836 Speaker 2: the way you interact with a computer is like there's 304 00:16:26,876 --> 00:16:28,996 Speaker 2: little pictures and little folders and you point 305 00:16:28,756 --> 00:16:30,676 Speaker 1: and click one way or another, right? And 306 00:16:31,276 --> 00:16:35,436 Speaker 2: one possibility here is the new standard interface is chat. 307 00:16:35,476 --> 00:16:38,996 Speaker 2: You just type in like whatever, please trim all the 308 00:16:39,116 --> 00:16:42,436 Speaker 2: ums from this file, or even please turn this thirty- 309 00:16:42,476 --> 00:16:44,836 Speaker 2: minute interview into a twenty-minute interview in the way 310 00:16:44,836 --> 00:16:47,596 Speaker 2: that makes it most interesting, right? And you just type 311 00:16:47,596 --> 00:16:48,516 Speaker 2: that in and it happens. 312 00:16:48,556 --> 00:16:50,396 Speaker 1: I mean that's a version of what I hear you 313 00:16:50,396 --> 00:16:50,916 Speaker 1: saying there. 314 00:16:50,956 --> 00:16:53,316 Speaker 3: I think some people believe that chat or a 315 00:16:53,396 --> 00:16:57,796 Speaker 3: text field will become the primary interface for making things.
316 00:16:58,476 --> 00:17:01,156 Speaker 3: I think of it more as like it's the primary 317 00:17:01,196 --> 00:17:04,276 Speaker 3: interface for interacting with the alien, and then you and 318 00:17:04,316 --> 00:17:07,396 Speaker 3: the alien are still going to be working, like have 319 00:17:07,556 --> 00:17:11,356 Speaker 3: other buttons that they can press. You still, sometimes you 320 00:17:11,436 --> 00:17:13,556 Speaker 3: just want to take the thing in your hands and 321 00:17:13,556 --> 00:17:14,236 Speaker 3: do it yourself. 322 00:17:14,796 --> 00:17:17,316 Speaker 2: The alien metaphor, I mean there's a real like do 323 00:17:17,356 --> 00:17:21,316 Speaker 2: we welcome our alien overlords question when you choose that metaphor. 324 00:17:21,396 --> 00:17:22,196 Speaker 2: It makes 325 00:17:21,996 --> 00:17:24,356 Speaker 3: me, I mean, maybe it feels that way. 326 00:17:24,396 --> 00:17:26,316 Speaker 2: It doesn't, it doesn't make me feel better. 327 00:17:26,396 --> 00:17:29,996 Speaker 3: I'll say that I think it feels the way that 328 00:17:30,036 --> 00:17:33,916 Speaker 3: an alien arrival would probably feel, where you know, maybe 329 00:17:33,916 --> 00:17:37,476 Speaker 3: you shake its hand and immediately it has something in 330 00:17:37,516 --> 00:17:41,476 Speaker 3: its skin that cures your cancer, and you feel hopeful, 331 00:17:42,996 --> 00:17:45,516 Speaker 3: but you also want to know what they're up 332 00:17:45,436 --> 00:17:48,876 Speaker 2: to. And yeah, and cure your cancer is definitely the 333 00:17:48,916 --> 00:17:49,636 Speaker 2: happy version. 334 00:17:49,676 --> 00:17:51,476 Speaker 1: Not usually, in the alien movie, 335 00:17:51,276 --> 00:17:53,236 Speaker 2: what happens, but I guess that could happen. 336 00:17:53,396 --> 00:17:55,356 Speaker 3: Well, there's a good part, right? But 337 00:17:55,396 --> 00:17:57,636 Speaker 3: you never really know, I think is the point.
And 338 00:17:57,956 --> 00:18:00,516 Speaker 3: I think we're all living in this kind of like 339 00:18:00,596 --> 00:18:03,916 Speaker 3: pushing forward in this mystery, kind of stuck 340 00:18:03,956 --> 00:18:05,876 Speaker 3: between awe and terror. 341 00:18:06,276 --> 00:18:10,156 Speaker 2: You sound more ambivalent than I might have thought. Why 342 00:18:10,196 --> 00:18:12,836 Speaker 2: is that? Because you just took a giant investment from 343 00:18:12,876 --> 00:18:13,356 Speaker 2: OpenAI. 344 00:18:14,876 --> 00:18:17,516 Speaker 3: I think like at moments like this, you have a 345 00:18:17,596 --> 00:18:23,436 Speaker 3: choice between either renunciation and just like stopping 346 00:18:23,476 --> 00:18:27,236 Speaker 3: from a place of fear, which maybe that's right, 347 00:18:27,356 --> 00:18:32,836 Speaker 3: you know, maybe fulfillment and happiness, everything we have for 348 00:18:32,876 --> 00:18:36,796 Speaker 3: that is already here, and we should focus our 349 00:18:37,076 --> 00:18:39,756 Speaker 3: energies on making peace with our inevitable death. 350 00:18:41,356 --> 00:18:42,676 Speaker 1: In any case, we should do that. 351 00:18:42,836 --> 00:18:47,436 Speaker 3: But go on. The other way to think of it 352 00:18:47,516 --> 00:18:52,076 Speaker 3: is to just forge ahead and realize that the potential 353 00:18:52,116 --> 00:18:54,356 Speaker 3: of what's on the other end of this might make 354 00:18:54,476 --> 00:18:59,196 Speaker 3: us feel in retrospect like we were just in the 355 00:18:59,236 --> 00:19:05,476 Speaker 3: earliest possible innings of the human experiment. So 356 00:19:06,756 --> 00:19:09,396 Speaker 3: you know, I feel like we're all going to die 357 00:19:09,436 --> 00:19:12,396 Speaker 3: one way or another, might as well forge ahead.
It's 358 00:19:12,396 --> 00:19:15,756 Speaker 3: not ambivalence, but it's more just being clear-eyed about 359 00:19:15,796 --> 00:19:18,716 Speaker 3: the fact that not trying to pretend that there's parts 360 00:19:18,716 --> 00:19:20,156 Speaker 3: of it that don't seem scary. 361 00:19:21,236 --> 00:19:23,996 Speaker 2: I mean, one of the things that's really striking to 362 00:19:24,156 --> 00:19:29,756 Speaker 2: me with AI, and that seems quite different from other 363 00:19:30,476 --> 00:19:34,636 Speaker 2: technologies in the past, is the people who are working 364 00:19:34,676 --> 00:19:38,036 Speaker 2: on it, the people who really understand it, seem more 365 00:19:38,076 --> 00:19:40,836 Speaker 2: scared than everybody else. 366 00:19:41,516 --> 00:19:44,716 Speaker 3: I'm not a first time founder. I went through the 367 00:19:44,796 --> 00:19:50,396 Speaker 3: experience of being a young person building Groupon, 368 00:19:51,356 --> 00:19:54,396 Speaker 3: telling myself a story about how it was going to 369 00:19:54,876 --> 00:19:59,156 Speaker 3: revolutionize local commerce and all the good stuff, and it 370 00:19:59,276 --> 00:20:01,116 Speaker 3: just didn't turn out that way. And I think we've 371 00:20:01,156 --> 00:20:06,156 Speaker 3: seen a generation of tech companies that just like didn't 372 00:20:06,556 --> 00:20:10,756 Speaker 3: turn out the way that the super rose-colored 373 00:20:10,756 --> 00:20:15,356 Speaker 3: glasses mission statement would have suggested.
And I think we're 374 00:20:15,396 --> 00:20:19,316 Speaker 3: just trying to be we just have that experience, that 375 00:20:19,356 --> 00:20:24,196 Speaker 3: recent experience at top of mind, and are trying to 376 00:20:24,316 --> 00:20:27,836 Speaker 3: think about it in a way that has guardrails around 377 00:20:27,996 --> 00:20:30,356 Speaker 3: repeating that history and just make sure we're really 378 00:20:30,396 --> 00:20:33,476 Speaker 3: proud of what we build. Does that make sense? 379 00:20:33,596 --> 00:20:34,236 Speaker 1: It makes sense. 380 00:20:34,596 --> 00:20:36,196 Speaker 3: Am I going to regret saying all this? 381 00:20:37,156 --> 00:20:37,836 Speaker 1: I don't think so. 382 00:20:37,996 --> 00:20:40,676 Speaker 2: You haven't said anything like incriminating as far as I 383 00:20:40,676 --> 00:20:42,996 Speaker 2: can tell. You know, I heard somebody saying the other day, like, 384 00:20:43,076 --> 00:20:46,076 Speaker 2: it's an interesting question to ask somebody like, what was 385 00:20:46,116 --> 00:20:49,996 Speaker 2: the first thing you asked ChatGPT to do? 386 00:20:50,596 --> 00:20:53,156 Speaker 2: And the first thing I asked ChatGPT to do 387 00:20:53,596 --> 00:20:57,236 Speaker 2: was write an episode of Planet Money podcast I used 388 00:20:57,236 --> 00:20:59,236 Speaker 2: to host, of which there are you know, a thousand 389 00:20:59,276 --> 00:21:01,956 Speaker 2: transcripts on the internet.
Write an episode of Planet Money 390 00:21:01,996 --> 00:21:04,276 Speaker 2: about whether the Fed is going to raise interest rates 391 00:21:04,276 --> 00:21:06,756 Speaker 2: by twenty five basis points or leave them unchanged, right, 392 00:21:07,396 --> 00:21:10,516 Speaker 2: And it wrote something that was pretty good, like not 393 00:21:10,596 --> 00:21:13,716 Speaker 2: a whole show, it's not there now, but at the 394 00:21:13,796 --> 00:21:16,556 Speaker 2: rate of current improvement, you could definitely imagine it writing 395 00:21:16,596 --> 00:21:20,716 Speaker 2: that episode pretty well in whatever a year or two 396 00:21:20,836 --> 00:21:22,956 Speaker 2: years or some amount of time when I will still 397 00:21:23,036 --> 00:21:26,716 Speaker 2: want to be gainfully employed. And like I do wonder 398 00:21:26,756 --> 00:21:29,796 Speaker 2: on this one, is there a day, slash, how far 399 00:21:29,916 --> 00:21:30,716 Speaker 2: are we from. 400 00:21:30,556 --> 00:21:34,596 Speaker 1: The day when generative AI can just make a 401 00:21:34,636 --> 00:21:36,276 Speaker 1: podcast without me? 402 00:21:36,836 --> 00:21:37,876 Speaker 3: How does that make you feel? 403 00:21:39,436 --> 00:21:43,876 Speaker 2: I mean somewhat afraid, also like interested in figuring out 404 00:21:43,916 --> 00:21:46,596 Speaker 2: how to use it, right, Like it feels like a steamroller. 405 00:21:46,636 --> 00:21:49,676 Speaker 2: It's like, oh, maybe I should go get in that steamroller. 406 00:21:49,716 --> 00:21:51,676 Speaker 2: If my choices are get in the steamroller or get 407 00:21:51,756 --> 00:21:52,476 Speaker 2: run over by it.
408 00:21:53,156 --> 00:21:56,796 Speaker 3: Yeah, I think, like before I comment on it, I 409 00:21:56,796 --> 00:22:01,956 Speaker 3: think it's important that people understand, Like it's very true that, 410 00:22:03,596 --> 00:22:07,236 Speaker 3: like it's easy to think that I'll have a bullshitty 411 00:22:07,276 --> 00:22:10,276 Speaker 3: answer to a question like this because I work at 412 00:22:10,316 --> 00:22:12,316 Speaker 3: a tech company that's working on a lot of this stuff. 413 00:22:12,356 --> 00:22:15,956 Speaker 3: But you have to remember that, like, if that's true, 414 00:22:16,316 --> 00:22:19,676 Speaker 3: we're out of jobs as soon as like a human 415 00:22:19,836 --> 00:22:23,676 Speaker 3: is no longer in the loop. That's really bad for us. 416 00:22:24,116 --> 00:22:26,516 Speaker 3: Like, does that make sense? Do you buy that? 417 00:22:26,836 --> 00:22:27,996 Speaker 1: At some margin? 418 00:22:28,276 --> 00:22:30,996 Speaker 2: Right, there's a long way between all the people who 419 00:22:30,996 --> 00:22:33,236 Speaker 2: are doing it now and zero people. There's a lot 420 00:22:33,236 --> 00:22:36,636 Speaker 2: of intermediate cases between the way it is now and 421 00:22:36,676 --> 00:22:40,156 Speaker 2: like a fully AI generated podcast, right, and like we're 422 00:22:40,156 --> 00:22:43,076 Speaker 2: already starting down the road, right, getting AI to write 423 00:22:43,116 --> 00:22:47,116 Speaker 2: show notes or something that's basically has happened now. And 424 00:22:48,356 --> 00:22:51,476 Speaker 2: you know, like I know the history of technology and 425 00:22:51,556 --> 00:22:54,836 Speaker 2: the labor market pretty well, you know, from the Industrial 426 00:22:54,876 --> 00:22:55,516 Speaker 2: Revolution on. 427 00:22:55,916 --> 00:22:56,796 Speaker 1: I'm pro. 428 00:22:58,276 --> 00:23:02,556 Speaker 2: Technological innovation. I believe in productivity gains and efficiency gains.
429 00:23:03,156 --> 00:23:06,196 Speaker 2: I'm also aware that there are instances when highly skilled 430 00:23:06,236 --> 00:23:09,276 Speaker 2: craftspeople are displaced by technology. Right, that is definitely 431 00:23:09,276 --> 00:23:11,036 Speaker 2: a thing that happens. And I recognize that the pie 432 00:23:11,076 --> 00:23:13,156 Speaker 2: gets bigger and everybody's better off in the long run, 433 00:23:13,516 --> 00:23:16,396 Speaker 2: But like, I just want to not get pinched, right, 434 00:23:16,476 --> 00:23:18,636 Speaker 2: I just want to be you know, you don't want 435 00:23:18,676 --> 00:23:19,116 Speaker 2: to be the one. 436 00:23:19,556 --> 00:23:21,676 Speaker 1: I don't want to be the one. And you know. 437 00:23:21,956 --> 00:23:26,716 Speaker 2: I'm not out on using it. It's getting really good, 438 00:23:26,756 --> 00:23:28,316 Speaker 2: really fast. It's doing a lot of the things that 439 00:23:28,356 --> 00:23:28,796 Speaker 2: I can do. 440 00:23:29,596 --> 00:23:31,756 Speaker 3: There's one other thing I wanted to say, just about 441 00:23:31,756 --> 00:23:35,476 Speaker 3: the fear for your job thing, which is something we 442 00:23:35,516 --> 00:23:38,836 Speaker 3: say around here a lot, is that you should struggle 443 00:23:38,916 --> 00:23:41,956 Speaker 3: with your story and not your tools. That's almost like 444 00:23:42,036 --> 00:23:44,956 Speaker 3: a guiding light for us, is we want to take 445 00:23:45,356 --> 00:23:47,996 Speaker 3: all of the cognitive friction away from using the tools. 446 00:23:48,916 --> 00:23:51,716 Speaker 3: The funny thing about all of these things is like 447 00:23:51,996 --> 00:23:54,236 Speaker 3: there's a brief moment in time where you feel like 448 00:23:54,276 --> 00:23:59,196 Speaker 3: you have superpowers, but then everybody has them, and humans 449 00:23:59,236 --> 00:24:02,516 Speaker 3: once again become the differentiator.
And we really think to 450 00:24:02,636 --> 00:24:05,756 Speaker 3: make like making great stuff is always going to be 451 00:24:05,796 --> 00:24:09,116 Speaker 3: a thing, and great is always going to be determined 452 00:24:09,596 --> 00:24:11,316 Speaker 3: by the human that's in the loop. 453 00:24:11,556 --> 00:24:14,396 Speaker 2: I mean, you know, there's this story about chess, right, 454 00:24:14,516 --> 00:24:18,036 Speaker 2: a computer chess program beat a person a long time ago, 455 00:24:18,116 --> 00:24:22,396 Speaker 2: decades ago now. But then after that people pointed out 456 00:24:22,436 --> 00:24:25,956 Speaker 2: the fact optimistically from my point of view, that a 457 00:24:25,996 --> 00:24:30,636 Speaker 2: computer plus a person could still beat any computer. Right, 458 00:24:30,636 --> 00:24:32,916 Speaker 2: a person working with a computer was better than the 459 00:24:32,956 --> 00:24:34,756 Speaker 2: best computer in the world. And that was like the 460 00:24:34,796 --> 00:24:37,676 Speaker 2: metaphor for like, yes, if we work with machines, we 461 00:24:37,716 --> 00:24:40,636 Speaker 2: can be better. That is no longer true now. The 462 00:24:40,636 --> 00:24:43,476 Speaker 2: computers kept getting better, and now people can't make them better. 463 00:24:43,516 --> 00:24:46,396 Speaker 2: Even a person plus a computer cannot beat a computer. 464 00:24:47,076 --> 00:24:49,276 Speaker 2: And I know that chess is less complex than the 465 00:24:49,276 --> 00:24:52,276 Speaker 2: real world, and so perhaps still a reason for optimism. 466 00:24:52,996 --> 00:24:56,196 Speaker 2: I certainly think I'm clever and good at making podcasts 467 00:24:56,236 --> 00:24:58,516 Speaker 2: and hope that I can do that. I hope that 468 00:24:58,556 --> 00:25:01,076 Speaker 2: I can work with AI to make something better than 469 00:25:01,076 --> 00:25:03,396 Speaker 2: any AI or more like me or something.
470 00:25:05,556 --> 00:25:08,516 Speaker 3: It's it might not be true, though, but here's the 471 00:25:08,556 --> 00:25:13,156 Speaker 3: amazing thing. People are still playing chess. Right. It's like true, 472 00:25:13,836 --> 00:25:18,756 Speaker 3: there's some separation. Some separation happens where the machines become 473 00:25:18,876 --> 00:25:20,956 Speaker 3: so good and we just say, okay, you you machines, 474 00:25:20,996 --> 00:25:23,396 Speaker 3: you go off and do your thing, and we're going 475 00:25:23,476 --> 00:25:28,316 Speaker 3: to be here kind of reveling in our humanity with 476 00:25:28,396 --> 00:25:31,276 Speaker 3: each other. I think what we'll see is there's there's 477 00:25:31,316 --> 00:25:34,236 Speaker 3: going to be a certain category of content that's really 478 00:25:34,356 --> 00:25:37,636 Speaker 3: just about like the transmission of bits of information from 479 00:25:37,836 --> 00:25:40,676 Speaker 3: your brain to my brain, and that's all that it's about. 480 00:25:41,396 --> 00:25:41,516 Speaker 1: That. 481 00:25:42,796 --> 00:25:44,796 Speaker 3: Maybe we do one day see humans taken out of 482 00:25:44,796 --> 00:25:47,516 Speaker 3: the loop, but I really do believe there will always 483 00:25:47,796 --> 00:25:53,796 Speaker 3: be space for like at the core great content, storytelling, 484 00:25:53,836 --> 00:25:57,236 Speaker 3: whatever you call it, it's it's about feeling connected to 485 00:25:57,516 --> 00:26:00,796 Speaker 3: the humans and other people. And as soon as machines 486 00:26:01,116 --> 00:26:04,116 Speaker 3: play to have too heavy a hand, it's just not 487 00:26:04,196 --> 00:26:05,116 Speaker 3: interesting anymore. 488 00:26:08,236 --> 00:26:10,356 Speaker 2: We'll be back in a minute with the Lightning Round, 489 00:26:10,836 --> 00:26:13,516 Speaker 2: which includes a message from Andrew to. 490 00:26:13,596 --> 00:26:24,276 Speaker 1: His future self. That's the end of the ads. 
Now 491 00:26:24,276 --> 00:26:25,276 Speaker 1: we're going back to the show. 492 00:26:25,756 --> 00:26:28,196 Speaker 2: Okay, so this is the Lightning Round, now you ready. 493 00:26:28,676 --> 00:26:31,476 Speaker 2: It's just a bunch of questions. Do you use generative 494 00:26:31,596 --> 00:26:33,836 Speaker 2: AI in your life outside of work? 495 00:26:34,356 --> 00:26:34,596 Speaker 1: Now? 496 00:26:35,196 --> 00:26:38,516 Speaker 3: You know what's interesting. I did something this morning where 497 00:26:38,516 --> 00:26:40,596 Speaker 3: I was actually like, I don't I don't even care 498 00:26:40,636 --> 00:26:42,836 Speaker 3: if it's wrong. I don't even care if it's. 499 00:26:44,036 --> 00:26:46,876 Speaker 2: Like the test of a theory is not is it correct? 500 00:26:46,916 --> 00:26:47,836 Speaker 1: But is it interesting? 501 00:26:48,796 --> 00:26:52,556 Speaker 3: Yeah? Exactly. I was asking it about I think, like 502 00:26:52,796 --> 00:26:54,516 Speaker 3: my son got hit in the head with a baseball, 503 00:26:54,556 --> 00:26:57,116 Speaker 3: and I was trying to I really should care about this. Actually, 504 00:26:57,796 --> 00:26:58,556 Speaker 3: you should. 505 00:26:58,276 --> 00:27:01,556 Speaker 1: Not ask chat GPT anything significant. 506 00:27:00,996 --> 00:27:02,996 Speaker 2: About that, not to give your parents advice. 507 00:27:04,996 --> 00:27:08,516 Speaker 3: It's stuff like that, like I've I've pretty quickly been. 508 00:27:08,356 --> 00:27:10,356 Speaker 2: Able to like that you should not be asking for 509 00:27:10,396 --> 00:27:11,996 Speaker 2: medical advice about your child. 510 00:27:15,876 --> 00:27:17,956 Speaker 3: I know. But when I say stuff like that, like 511 00:27:18,316 --> 00:27:20,956 Speaker 3: I would have googled it and probably just done what 512 00:27:20,996 --> 00:27:22,876 Speaker 3: I was going to do anyway. So it was almost 513 00:27:22,916 --> 00:27:25,396 Speaker 3: just a curiosity.
He was fine, He didn't. 514 00:27:25,156 --> 00:27:29,996 Speaker 2: Need to go see a doctor, not according to ChatGPT. No, 515 00:27:31,956 --> 00:27:36,436 Speaker 2: I'm curious about your time working in a recording studio, right, 516 00:27:36,476 --> 00:27:38,956 Speaker 2: You worked in a recording studio where musicians came in 517 00:27:39,236 --> 00:27:42,996 Speaker 2: and recorded. Did you see there any like moments of 518 00:27:43,076 --> 00:27:43,956 Speaker 2: musical genius? 519 00:27:44,196 --> 00:27:44,596 Speaker 1: Is there one? 520 00:27:44,596 --> 00:27:45,236 Speaker 2: In particular? 521 00:27:46,156 --> 00:27:51,236 Speaker 3: I worked for this guy named Steve Albini, who is 522 00:27:51,836 --> 00:27:56,796 Speaker 3: a pretty well known engineer producer that was in some 523 00:27:57,676 --> 00:28:00,236 Speaker 3: popular kind of punk rock bands in the 524 00:28:00,276 --> 00:28:05,036 Speaker 3: eighties and currently and definitely saw some cool bands. But 525 00:28:05,076 --> 00:28:08,596 Speaker 3: I think also I really feel like I learned a 526 00:28:08,636 --> 00:28:15,636 Speaker 3: ton from watching him work. He's so talented, so articulate, 527 00:28:15,756 --> 00:28:18,956 Speaker 3: so smart in many ways, like an example of what 528 00:28:18,996 --> 00:28:23,236 Speaker 3: I aspired to be at the time, and so seeing 529 00:28:23,316 --> 00:28:25,956 Speaker 3: that output, but then also seeing him every day and 530 00:28:26,036 --> 00:28:30,356 Speaker 3: how hard he worked, it was a real like, oh, 531 00:28:30,556 --> 00:28:33,236 Speaker 3: this is how it happens kind of moment for me, 532 00:28:34,196 --> 00:28:39,316 Speaker 3: and it kind of inspired me. It inspired within me 533 00:28:39,796 --> 00:28:41,596 Speaker 3: a kind of work ethic that I'm not sure I 534 00:28:41,596 --> 00:28:42,996 Speaker 3: would have gotten to otherwise.
535 00:28:44,676 --> 00:28:46,636 Speaker 2: What's the best deal you ever got from Groupon? 536 00:28:53,396 --> 00:28:56,836 Speaker 3: Man? You know, it's so funny because, like obviously I 537 00:28:56,876 --> 00:28:58,396 Speaker 3: was asked. I used to be asked that question all 538 00:28:58,436 --> 00:29:02,116 Speaker 3: the time. I think it was a sensory deprivation tank. 539 00:29:02,716 --> 00:29:07,716 Speaker 3: They had a sensory deprivation tank center in somewhere in Chicago. 540 00:29:08,436 --> 00:29:09,996 Speaker 3: Had never tried. It was really cool. 541 00:29:11,356 --> 00:29:14,036 Speaker 2: This is a descript question. Now, how will you know 542 00:29:14,076 --> 00:29:15,476 Speaker 2: when it's time to do something else? 543 00:29:15,996 --> 00:29:18,196 Speaker 1: But leave? Dude? 544 00:29:18,396 --> 00:29:19,476 Speaker 3: I don't know if I want to say this on 545 00:29:19,516 --> 00:29:21,996 Speaker 3: a podcast, because if I do decide to take the 546 00:29:22,036 --> 00:29:25,156 Speaker 3: company public, it'll come back to haunt me. But I 547 00:29:25,156 --> 00:29:28,836 Speaker 3: almost want to say it specifically for that reason, Andrew, 548 00:29:29,076 --> 00:29:31,916 Speaker 3: I'm talking to future Andrew right now. You do not 549 00:29:31,956 --> 00:29:35,236 Speaker 3: want to be a public company CEO again, Okay, hire 550 00:29:35,276 --> 00:29:38,996 Speaker 3: someone else to do that. I know you're talking yourself 551 00:29:38,996 --> 00:29:40,636 Speaker 3: into it and saying it's going to be different this time. 552 00:29:40,676 --> 00:29:44,676 Speaker 3: It's okay, but you hate it. It's the things that 553 00:29:45,196 --> 00:29:47,916 Speaker 3: those people are good at is and are interested in 554 00:29:48,356 --> 00:29:51,476 Speaker 3: is different than you go do something else. 555 00:29:53,596 --> 00:29:54,196 Speaker 1: Amazing.
556 00:29:54,396 --> 00:29:57,476 Speaker 2: I've never had someone leave themselves at time. Tast a 557 00:29:57,556 --> 00:29:58,756 Speaker 2: lot of podcasts before. 558 00:30:02,756 --> 00:30:04,276 Speaker 1: I'm going to send that to you. If you go public, 559 00:30:04,276 --> 00:30:05,596 Speaker 1: I'm going to have you back on the show and I'm 560 00:30:05,596 --> 00:30:06,356 Speaker 1: going to play it to you. 561 00:30:09,116 --> 00:30:11,276 Speaker 2: Thank you, Thank you for being so generous with your time. 562 00:30:11,916 --> 00:30:14,076 Speaker 2: I appreciate your candor and I'm grateful for that. 563 00:30:14,636 --> 00:30:17,036 Speaker 3: I appreciate that I had. I had fun too. You're 564 00:30:17,076 --> 00:30:21,276 Speaker 3: good at your job in the sense that, like uh you, 565 00:30:21,276 --> 00:30:22,116 Speaker 3: you bring it out in me. 566 00:30:22,876 --> 00:30:25,956 Speaker 1: I'm better than a machine for now. It's gonna that's 567 00:30:26,036 --> 00:30:28,876 Speaker 1: my motto, better than a machine for now. 568 00:30:34,436 --> 00:30:38,316 Speaker 2: Andrew Mason is the founder and CEO of Descript. Today's 569 00:30:38,356 --> 00:30:41,996 Speaker 2: show was edited by Sarah Nix, produced by Edith Russolo, and. 570 00:30:42,036 --> 00:30:45,516 Speaker 1: Engineered by Amanda k Wong. I'm Jacob Goldstein. 571 00:30:45,596 --> 00:30:48,316 Speaker 2: We'll be back next week with another episode of What's 572 00:30:48,316 --> 00:30:52,596 Speaker 2: Your Problem? And here, finally is the top of today's show. 573 00:30:52,636 --> 00:30:55,836 Speaker 2: The intro to the show as read if that's what 574 00:30:55,916 --> 00:31:01,276 Speaker 2: you'd call it, as generated by Overdub, Descript's AI-powered 575 00:31:01,396 --> 00:31:06,836 Speaker 2: voice whatever emulator.
After every interview we do for the show, 576 00:31:06,956 --> 00:31:09,316 Speaker 2: we upload the audio to a piece of software 577 00:31:09,356 --> 00:31:13,516 Speaker 2: called Descript. Descript turns the audio into a transcript, 578 00:31:13,956 --> 00:31:16,316 Speaker 2: and then I can edit the transcript, cut out the 579 00:31:16,356 --> 00:31:19,276 Speaker 2: boring parts, move sections around, and when I do that, 580 00:31:19,756 --> 00:31:24,156 Speaker 2: descript edits the underlying audio to match. As software, descript 581 00:31:24,196 --> 00:31:27,756 Speaker 2: is pretty janky, it's buggy, it's constantly changing in ways 582 00:31:27,756 --> 00:31:30,276 Speaker 2: that can make it hard to use, and sometimes it 583 00:31:30,436 --> 00:31:34,116 Speaker 2: just blows stuff up. But we use it anyway because 584 00:31:34,156 --> 00:31:38,236 Speaker 2: descript is an incredible advance over what came before. Before 585 00:31:38,276 --> 00:31:42,356 Speaker 2: descript audio software represented audio files not as words, but 586 00:31:42,436 --> 00:31:46,556 Speaker 2: as waveforms, squiggly lines presented on a timeline. So when 587 00:31:46,596 --> 00:31:50,076 Speaker 2: descript came along, being able to edit audio by editing 588 00:31:50,156 --> 00:31:52,756 Speaker 2: words on a screen was a huge advance, and it 589 00:31:52,796 --> 00:31:55,836 Speaker 2: was an advance made possible by artificial intelligence.