1 00:00:08,119 --> 00:00:10,720 Speaker 1: From The Australian. This is the weekend edition of The Front. 2 00:00:10,840 --> 00:00:13,960 Speaker 1: I'm Claire Harvey. 3 00:00:15,160 --> 00:00:20,639 Speaker 2: These people aren't real. The creative underclass. That's a powerful phrase, 4 00:00:21,400 --> 00:00:24,080 Speaker 2: but doesn't that risk romanticizing things a bit? 5 00:00:24,320 --> 00:00:26,319 Speaker 3: That's a really crucial point and you're right to push 6 00:00:26,360 --> 00:00:26,880 Speaker 3: back on it. 7 00:00:27,600 --> 00:00:30,800 Speaker 1: But they're coming to a podcast near you very soon. 8 00:00:31,320 --> 00:00:34,600 Speaker 1: And so, just as in art and music, the most 9 00:00:34,760 --> 00:00:39,080 Speaker 1: human of art forms is being replicated by artificial intelligence 10 00:00:39,120 --> 00:00:42,840 Speaker 1: models that aren't satisfied with just helping humans. They want 11 00:00:42,880 --> 00:00:46,320 Speaker 1: to imitate us too. Here at The Australian, we will 12 00:00:46,360 --> 00:00:50,800 Speaker 1: always make handmade human journalism, but some of our peers 13 00:00:50,800 --> 00:00:54,560 Speaker 1: in the podcast world aren't so sure. So what happens 14 00:00:54,600 --> 00:00:58,000 Speaker 1: when AI steps out from behind the curtain and tries 15 00:00:58,080 --> 00:01:02,080 Speaker 1: to claim the spotlight? The Australian's audio lead, Jasper Leak, 16 00:01:02,160 --> 00:01:05,280 Speaker 1: joins me today and we look at podcasts that can 17 00:01:05,319 --> 00:01:14,000 Speaker 1: be created with the click of a button. Jasper, 18 00:01:14,160 --> 00:01:16,600 Speaker 1: you are a very chilled out dude. I don't often 19 00:01:16,640 --> 00:01:20,160 Speaker 1: see you lose it, but one moment in the past 20 00:01:20,240 --> 00:01:23,479 Speaker 1: year when I have seen you rattled is when we 21 00:01:23,480 --> 00:01:26,720 Speaker 1: were doing a great AI boot camp here at work. 
22 00:01:26,760 --> 00:01:28,680 Speaker 1: We were learning all about the AI tools that can 23 00:01:28,720 --> 00:01:31,440 Speaker 1: help us do our journalism. One of them is called 24 00:01:31,520 --> 00:01:36,120 Speaker 1: NotebookLM, Google's large language model, and the way we 25 00:01:36,240 --> 00:01:39,039 Speaker 1: use it here at News is that we might upload 26 00:01:39,240 --> 00:01:43,760 Speaker 1: a very long research report or ten council meeting minutes 27 00:01:44,319 --> 00:01:46,479 Speaker 1: and ask it to analyze that text for us, give 28 00:01:46,520 --> 00:01:48,600 Speaker 1: us some dot points that might help us in researching 29 00:01:48,600 --> 00:01:51,840 Speaker 1: a story. We never actually publish anything that NotebookLM 30 00:01:51,880 --> 00:01:54,600 Speaker 1: spits out for us, but it's useful in helping us 31 00:01:54,600 --> 00:01:56,400 Speaker 1: get our heads around a concept so that we can 32 00:01:56,440 --> 00:02:00,000 Speaker 1: then do the journalism to check out what it's telling us. 33 00:02:00,000 --> 00:02:03,080 Speaker 1: The thing that got you riled up was NotebookLM has 34 00:02:03,200 --> 00:02:07,640 Speaker 1: an audio summary feature. What is that and why did 35 00:02:07,640 --> 00:02:08,240 Speaker 1: it annoy you? 36 00:02:08,400 --> 00:02:08,720 Speaker 4: Well, 
37 00:02:08,919 --> 00:02:13,200 Speaker 5: NotebookLM has an audio overview feature that is essentially 38 00:02:13,280 --> 00:02:17,919 Speaker 5: the text-to-voice component of NotebookLM. We at The 39 00:02:17,960 --> 00:02:22,519 Speaker 5: Australian use text-to-audio technology in our articles sometimes, 40 00:02:23,000 --> 00:02:24,799 Speaker 5: which is a feature I like, you know, it's something 41 00:02:24,840 --> 00:02:27,840 Speaker 5: handy for, you know, if you're driving somewhere and you 42 00:02:27,919 --> 00:02:30,200 Speaker 5: really want to read that feature, but you don't have 43 00:02:30,240 --> 00:02:32,440 Speaker 5: the twenty minutes to set aside to actually sit there 44 00:02:32,480 --> 00:02:34,280 Speaker 5: and read it, you can hit play and you can 45 00:02:34,320 --> 00:02:35,160 Speaker 5: listen to it on the go. 46 00:02:35,400 --> 00:02:36,560 Speaker 1: And this is what it sounds like. 47 00:02:37,440 --> 00:02:41,400 Speaker 4: This article is read by an automated voice. Real wage 48 00:02:41,400 --> 00:02:44,480 Speaker 4: growth is going backwards for the first time since September 49 00:02:44,480 --> 00:02:48,760 Speaker 4: twenty twenty three, after inflation outpaced the latest annual nominal 50 00:02:48,800 --> 00:02:50,960 Speaker 4: wage growth figures published on Wednesday. 51 00:02:52,240 --> 00:02:56,440 Speaker 1: So it's an audio reading out of this newspaper story. 52 00:02:56,520 --> 00:02:59,000 Speaker 1: And while your hands are full, or you're cooking, or 53 00:02:59,000 --> 00:03:00,919 Speaker 1: you're driving, you can read the paper. 54 00:03:01,040 --> 00:03:01,519 Speaker 2: That's right. 55 00:03:01,800 --> 00:03:05,359 Speaker 5: That's a great use of that particular technology. But what 56 00:03:05,440 --> 00:03:08,720 Speaker 5: NotebookLM does is it creates what's called an audio overview, 57 00:03:09,360 --> 00:03:14,400 Speaker 5: which is a simulation of a podcast. 
You can upload 58 00:03:14,400 --> 00:03:17,440 Speaker 5: a huge amount of information to NotebookLM, and instead 59 00:03:17,440 --> 00:03:19,800 Speaker 5: of sitting there and sifting through it all yourself, you 60 00:03:19,840 --> 00:03:22,120 Speaker 5: can say, Hey, create me a ten-minute podcast on 61 00:03:22,160 --> 00:03:26,320 Speaker 5: everything that I've just uploaded, and the end result is 62 00:03:26,600 --> 00:03:30,760 Speaker 5: an engaging two-host podcast on the theme of your choice. 63 00:03:31,160 --> 00:03:34,040 Speaker 5: And this technology we're seeing outside of NotebookLM 64 00:03:34,120 --> 00:03:36,240 Speaker 5: now and they all sort of get advertised in the 65 00:03:36,240 --> 00:03:39,400 Speaker 5: same way, which is essentially a podcast created with the 66 00:03:39,400 --> 00:03:40,120 Speaker 5: click of a button. 67 00:03:40,280 --> 00:03:43,600 Speaker 1: There's two aspects to the way we pull our podcast together. First, 68 00:03:43,600 --> 00:03:48,200 Speaker 1: we do the journalism, which is reporting, writing, fact checking, 69 00:03:48,480 --> 00:03:51,480 Speaker 1: getting things legalled through our lawyers. And then it comes 70 00:03:51,480 --> 00:03:55,720 Speaker 1: to the audio production and polishing. That means putting all 71 00:03:55,760 --> 00:03:59,200 Speaker 1: the audio together and then making it sound good. What 72 00:03:59,360 --> 00:04:01,920 Speaker 1: are the tasks involved in making it sound good? 73 00:04:02,320 --> 00:04:04,800 Speaker 5: A huge chunk of my time every day is spent 74 00:04:04,920 --> 00:04:08,640 Speaker 5: just cleaning audio, removing all the gross mouth sounds that 75 00:04:09,360 --> 00:04:14,000 Speaker 5: nobody really wants to listen to. It's like photoshopping or 76 00:04:14,320 --> 00:04:18,359 Speaker 5: retouching audio. 
These tasks add up because, I mean, in 77 00:04:18,400 --> 00:04:20,280 Speaker 5: fifteen minutes of audio, there's a lot of clicks, and 78 00:04:20,320 --> 00:04:22,200 Speaker 5: there's a lot of plosives and a lot of loud breaths, 79 00:04:22,200 --> 00:04:24,159 Speaker 5: and there's a lot of false starts, and there's a 80 00:04:24,160 --> 00:04:29,040 Speaker 5: lot of semi-complete thoughts that need to be edited 81 00:04:29,080 --> 00:04:31,839 Speaker 5: down into a way that they can be better understood 82 00:04:31,880 --> 00:04:34,960 Speaker 5: by an audience. And then there's also balancing and then 83 00:04:35,080 --> 00:04:38,840 Speaker 5: pairing journalism with music or sound design. There are a 84 00:04:38,880 --> 00:04:41,000 Speaker 5: lot of moving parts and it's not an easy job. 85 00:04:41,320 --> 00:04:43,080 Speaker 5: But I have a lot of pride in doing these 86 00:04:43,120 --> 00:04:45,200 Speaker 5: things in the way that I think they should be done. 87 00:04:45,480 --> 00:04:47,920 Speaker 1: One of your rules of thumb, which our whole team 88 00:04:47,960 --> 00:04:52,400 Speaker 1: abides by, is that five minutes of published audio will 89 00:04:52,440 --> 00:04:55,960 Speaker 1: take roughly an hour in terms of cleaning and polishing 90 00:04:56,000 --> 00:04:58,360 Speaker 1: to make it sound as good as we want it to 91 00:04:58,400 --> 00:04:59,240 Speaker 1: here at The Australian. 92 00:04:59,400 --> 00:05:01,240 Speaker 5: And the idea of course, is that the audience would 93 00:05:01,279 --> 00:05:02,839 Speaker 5: never even notice. And there is a bit of a 94 00:05:02,839 --> 00:05:05,560 Speaker 5: sleight of hand going on sometimes when you see a 95 00:05:05,640 --> 00:05:10,080 Speaker 5: show that looks very casual and informal, but it's not 96 00:05:10,120 --> 00:05:13,280 Speaker 5: like it just so happens to sound really clean and 97 00:05:13,360 --> 00:05:16,400 Speaker 5: balanced and there are no gross mouth sounds. 
It's like 98 00:05:17,000 --> 00:05:19,800 Speaker 5: the illusion is that it's just this easy thing that's 99 00:05:19,839 --> 00:05:22,680 Speaker 5: been thrown together. There is always a lot of work 100 00:05:22,720 --> 00:05:25,360 Speaker 5: that then happens on the back end to create that illusion. 101 00:05:25,640 --> 00:05:27,560 Speaker 1: So Jasper, you did a bit of an experiment with 102 00:05:27,680 --> 00:05:30,400 Speaker 1: NotebookLM and it involved being on the treadmill at 103 00:05:30,440 --> 00:05:30,919 Speaker 1: the gym. 104 00:05:31,000 --> 00:05:34,000 Speaker 5: Yes, yes. What happened? So I was at the gym 105 00:05:34,080 --> 00:05:35,960 Speaker 5: recently and I was sick of the music in my 106 00:05:36,040 --> 00:05:39,320 Speaker 5: playlist and I couldn't find a podcast that I wanted 107 00:05:39,360 --> 00:05:41,719 Speaker 5: to listen to. What I was wanting to listen to 108 00:05:41,839 --> 00:05:45,040 Speaker 5: doesn't really exist. I wanted to listen to a short 109 00:05:45,160 --> 00:05:48,080 Speaker 5: ten to fifteen minute episode on the town where I live, 110 00:05:48,640 --> 00:05:51,640 Speaker 5: so I thought, well, this could be a perfect opportunity 111 00:05:51,680 --> 00:05:54,520 Speaker 5: to take NotebookLM for a spin. As you said before, 112 00:05:55,440 --> 00:05:58,839 Speaker 5: NotebookLM doesn't scan the Internet for its information. It 113 00:05:59,040 --> 00:06:03,039 Speaker 5: only responds to information that you give it. So understanding 114 00:06:03,080 --> 00:06:05,800 Speaker 5: that NotebookLM is a walled garden, I went to 115 00:06:05,920 --> 00:06:09,960 Speaker 5: Google's generative AI tool called Gemini to create a script 116 00:06:10,000 --> 00:06:12,559 Speaker 5: for me. 
And the prompt I gave it was pretty 117 00:06:12,560 --> 00:06:15,080 Speaker 5: embarrassing in hindsight, but it was something like, I want 118 00:06:15,120 --> 00:06:18,040 Speaker 5: a short podcast on North Katoomba. Any facts that you can 119 00:06:18,080 --> 00:06:22,800 Speaker 5: share with me? Is it low income? Go. And it 120 00:06:22,839 --> 00:06:25,560 Speaker 5: only took Gemini a matter of seconds really to whip 121 00:06:25,680 --> 00:06:27,680 Speaker 5: up a script, and then I copied and pasted that 122 00:06:27,720 --> 00:06:30,400 Speaker 5: script into NotebookLM, hit the audio overview 123 00:06:30,480 --> 00:06:33,480 Speaker 5: feature and within a couple of minutes I had my podcast. 124 00:06:34,040 --> 00:06:34,799 Speaker 1: Let's have a listen. 125 00:06:36,480 --> 00:06:39,880 Speaker 2: Welcome back to the Deep Dive. If you've ever been 126 00:06:39,920 --> 00:06:43,520 Speaker 2: to the Blue Mountains, you know that feeling, the damp, 127 00:06:43,680 --> 00:06:47,120 Speaker 2: cool air, the smell of eucalyptus, and just the sheer 128 00:06:47,200 --> 00:06:50,520 Speaker 2: scale of it all. It's breathtaking, it really is. We're 129 00:06:50,600 --> 00:06:53,440 Speaker 2: up here a thousand metres above sea level in Katoomba, 130 00:06:53,480 --> 00:06:56,039 Speaker 2: which most people know for the postcard view. 131 00:06:56,160 --> 00:06:58,080 Speaker 3: The Three Sisters, the cliff 132 00:06:57,760 --> 00:07:01,919 Speaker 2: walks. Exactly, the Carrington Hotel, but that whole picture, that 133 00:07:02,120 --> 00:07:05,080 Speaker 2: entire tourist world, it's pretty much all on one side 134 00:07:05,080 --> 00:07:07,200 Speaker 2: of the train tracks, South Katoomba. 135 00:07:07,240 --> 00:07:10,240 Speaker 3: And that's why today we're actually turning our backs on 136 00:07:10,280 --> 00:07:13,800 Speaker 3: those cliffs. 
Okay, we are, you know, physically crossing the 137 00:07:13,880 --> 00:07:17,800 Speaker 3: railway line to dive into the story of North Katoomba. 138 00:07:17,200 --> 00:07:19,000 Speaker 2: The other side of the tracks. Literally. 139 00:07:19,280 --> 00:07:22,080 Speaker 3: Literally, our whole mission here is to look at the 140 00:07:22,080 --> 00:07:25,400 Speaker 3: sources that tell this, this tale of two cities that 141 00:07:25,440 --> 00:07:26,880 Speaker 3: really defines the community. 142 00:07:27,360 --> 00:07:29,560 Speaker 5: I'm going to stop it there because I find that 143 00:07:29,760 --> 00:07:34,280 Speaker 5: bit really interesting where you hear the female voice pause 144 00:07:34,480 --> 00:07:37,000 Speaker 5: and she says um, and then it's almost like she's 145 00:07:37,080 --> 00:07:39,920 Speaker 5: checking her notes or something. Yeah, but it's like, of 146 00:07:39,960 --> 00:07:42,480 Speaker 5: course there are no notes. This whole thing is a simulation. 147 00:07:43,240 --> 00:07:44,760 Speaker 1: It's quite eerie. It is. 148 00:07:44,920 --> 00:07:49,360 Speaker 5: I also, I hate it too. To me, it's a 149 00:07:49,400 --> 00:07:53,640 Speaker 5: gimmick that doesn't really have much value, nested in a 150 00:07:53,680 --> 00:07:55,560 Speaker 5: product that has immense value. 151 00:07:56,360 --> 00:08:00,920 Speaker 1: So there's a huge convenience opportunity here. You wanted something 152 00:08:01,160 --> 00:08:05,200 Speaker 1: very niche and very bespoke about North Katoomba to listen 153 00:08:05,240 --> 00:08:07,840 Speaker 1: to while you're on the treadmill in North Katoomba. Yes. You 154 00:08:07,920 --> 00:08:09,880 Speaker 1: got these two robots to talk about it for you, 155 00:08:10,600 --> 00:08:12,280 Speaker 1: and then they pretended to be humans. 156 00:08:12,360 --> 00:08:15,240 Speaker 5: That's right. 
If I'm going to go to a sort 157 00:08:15,240 --> 00:08:18,400 Speaker 5: of an artificial product like this, my preference would be 158 00:08:18,480 --> 00:08:20,880 Speaker 5: to have the AI version of 159 00:08:20,760 --> 00:08:23,840 Speaker 1: that. To have it sound artificial? To have it sound artificial. 160 00:08:24,040 --> 00:08:27,040 Speaker 1: I did my own experiment with NotebookLM. I had 161 00:08:27,080 --> 00:08:30,120 Speaker 1: to do some pre-reading for a meeting. I uploaded 162 00:08:30,120 --> 00:08:32,199 Speaker 1: the documents into NotebookLM. I got it to give 163 00:08:32,240 --> 00:08:35,000 Speaker 1: me an audio summary, a little podcast episode, so I 164 00:08:35,040 --> 00:08:37,480 Speaker 1: didn't have to do the reading for this meeting. I 165 00:08:37,559 --> 00:08:41,160 Speaker 1: then thought, Wow, that was sort of alarmingly warm and chatty. 166 00:08:41,280 --> 00:08:43,800 Speaker 1: But I wonder what it's like with some confronting material. 167 00:08:44,240 --> 00:08:48,120 Speaker 1: So I went and downloaded the Charter of Hamas, which 168 00:08:48,200 --> 00:08:50,600 Speaker 1: is out there on the internet. And I was hoping 169 00:08:50,679 --> 00:08:55,520 Speaker 1: that the robots would take that same kind of chummy tone, 170 00:08:55,720 --> 00:09:00,120 Speaker 1: but actually it disappointed me by taking quite an appropriate tone. 171 00:09:00,440 --> 00:09:01,000 Speaker 1: Let's listen. 172 00:09:02,280 --> 00:09:04,440 Speaker 2: It's really a foundational text, isn't it? It gives you a 173 00:09:04,440 --> 00:09:07,880 Speaker 2: snapshot of their stated principles, their goals, right back when 174 00:09:07,880 --> 00:09:08,920 Speaker 2: it was written. Exactly. 175 00:09:09,000 --> 00:09:11,600 Speaker 3: And our mission here, as always, is just to understand 176 00:09:11,600 --> 00:09:15,560 Speaker 3: what the source material says, especially with something this politically charged. 
177 00:09:15,840 --> 00:09:16,040 Speaker 4: Yeah. 178 00:09:16,080 --> 00:09:20,679 Speaker 2: Absolutely, we're just extracting the key ideas from this specific document. 179 00:09:21,240 --> 00:09:24,400 Speaker 2: Impartially, just helping you get informed about what's actually in it. 180 00:09:24,720 --> 00:09:27,400 Speaker 2: We're reporting, not endorsing anything here. 181 00:09:27,480 --> 00:09:30,280 Speaker 3: So let's just bring in how does the charter define itself? 182 00:09:30,320 --> 00:09:30,840 Speaker 3: What's the fund? 183 00:09:31,040 --> 00:09:34,079 Speaker 1: Again with the ums and the ahs and the hesitations 184 00:09:34,080 --> 00:09:37,240 Speaker 1: and the checking of the notes. Yes, it's pretending to 185 00:09:37,280 --> 00:09:41,520 Speaker 1: be considering this information. So while it's able to recognize 186 00:09:41,559 --> 00:09:45,160 Speaker 1: that this is confronting material, or it's at least very serious, 187 00:09:45,160 --> 00:09:49,760 Speaker 1: it's not appropriate to joke, I find the fake humanity 188 00:09:50,600 --> 00:09:52,000 Speaker 1: really depressing. 189 00:09:52,160 --> 00:09:55,920 Speaker 5: Yeah, I do too. The big difference with your podcast 190 00:09:55,960 --> 00:10:00,760 Speaker 5: and mine was that I uploaded a script copied and 191 00:10:00,800 --> 00:10:05,920 Speaker 5: pasted from Google's Gemini that had been compiled using various 192 00:10:05,960 --> 00:10:09,280 Speaker 5: bits of information from the web. Bearing in mind, I 193 00:10:09,320 --> 00:10:11,600 Speaker 5: asked it for a podcast on the town where I live, 194 00:10:11,640 --> 00:10:14,280 Speaker 5: which is North Katoomba. The first thing that the hosts 195 00:10:14,280 --> 00:10:18,080 Speaker 5: started to talk about was a brief history of the gully. 
196 00:10:18,240 --> 00:10:21,400 Speaker 5: It was an Indigenous settlement home to quite a proud 197 00:10:21,480 --> 00:10:27,080 Speaker 5: Indigenous community that were ultimately moved off their land when 198 00:10:27,440 --> 00:10:31,400 Speaker 5: a decision was made to build a racetrack in the 199 00:10:31,400 --> 00:10:34,520 Speaker 5: middle of the gully. The racetrack was ultimately a bust, 200 00:10:34,679 --> 00:10:37,880 Speaker 5: and the whole thing really just amounts to a sad 201 00:10:37,960 --> 00:10:39,280 Speaker 5: chapter in Katoomba's history. 202 00:10:39,320 --> 00:10:40,840 Speaker 1: This must have made the robots sad. 203 00:10:41,160 --> 00:10:42,240 Speaker 5: They were very empathetic. 204 00:10:44,200 --> 00:10:46,040 Speaker 1: Do you hear a bit of it? Yeah? Sure, hear 205 00:10:46,040 --> 00:10:47,280 Speaker 1: a bit of the robots speak sad. 206 00:10:47,480 --> 00:10:48,640 Speaker 5: Yeah. 207 00:10:49,040 --> 00:10:49,720 Speaker 2: What happened? 208 00:10:49,760 --> 00:10:53,200 Speaker 3: A group of local businessmen decided Katoomba needed a big, 209 00:10:53,280 --> 00:10:56,560 Speaker 3: modern attraction, not just lookouts. They wanted something loud. They wanted 210 00:10:56,360 --> 00:10:58,200 Speaker 2: a racing circuit and they chose the gully. 211 00:10:58,480 --> 00:11:01,920 Speaker 3: They chose the gully. The irony, the cruelty of that decision 212 00:11:02,000 --> 00:11:03,920 Speaker 3: is just, it's hard to process. 213 00:11:04,160 --> 00:11:06,320 Speaker 2: So what did that look like for the families living there? 214 00:11:06,600 --> 00:11:09,640 Speaker 3: The documents from the time are heartbreaking. 
They were given 215 00:11:09,679 --> 00:11:13,800 Speaker 3: almost no notice and then they were forced to watch, 216 00:11:13,920 --> 00:11:18,640 Speaker 3: watch the bulldozers roll in, systematically destroying their homes, their churches, 217 00:11:18,760 --> 00:11:21,360 Speaker 3: everything they had built, to clear the ground for the 218 00:11:21,400 --> 00:11:22,440 Speaker 3: Catalina racetrack. 219 00:11:22,559 --> 00:11:27,960 Speaker 5: Wow. Claire, huge issue with this part of the podcast. 220 00:11:28,920 --> 00:11:36,000 Speaker 5: What is it? The gully's in South Katoomba. That's a 221 00:11:36,040 --> 00:11:40,120 Speaker 5: real problem and that's not NotebookLM's fault. That's Google 222 00:11:40,160 --> 00:11:43,199 Speaker 5: Gemini's fault. But I just thought, I can't trust this. 223 00:11:44,000 --> 00:11:49,480 Speaker 1: The people who have trained these artificial intelligence models think 224 00:11:49,559 --> 00:11:55,120 Speaker 1: that what audiences want is to hear computer generated voices 225 00:11:55,679 --> 00:11:57,880 Speaker 1: having feelings. I don't think they do. 226 00:11:58,520 --> 00:12:01,200 Speaker 5: It does make me wonder about the bubble that these 227 00:12:01,640 --> 00:12:06,920 Speaker 5: people work in. Silicon Valley's obviously in California. The sound 228 00:12:06,960 --> 00:12:11,000 Speaker 5: of this AI podcast is quite clearly modeled on the 229 00:12:11,080 --> 00:12:15,920 Speaker 5: Southern Californian sound of NPR, but by sort of foisting 230 00:12:15,960 --> 00:12:18,720 Speaker 5: that on the rest of us, there's an arrogance to that. 231 00:12:18,800 --> 00:12:23,600 Speaker 1: I think podcasting is a relatively new medium. This was 232 00:12:23,640 --> 00:12:27,400 Speaker 1: the great disruption to radio, and what we learned about 233 00:12:27,480 --> 00:12:32,319 Speaker 1: audiences is actually they wanted to hear natural voices, candid conversations. 
234 00:12:32,679 --> 00:12:35,800 Speaker 1: They didn't need news readers to sound like this anymore. Yes, sure, 235 00:12:36,200 --> 00:12:44,000 Speaker 1: and now AI models are aping that to create products 236 00:12:44,040 --> 00:12:46,560 Speaker 1: in audio that have no humans involved at all. 237 00:12:46,960 --> 00:12:48,240 Speaker 2: It's very Black Mirror. 238 00:12:48,760 --> 00:12:52,760 Speaker 5: Another thing I noticed was that there's almost an artificial 239 00:12:52,880 --> 00:12:54,839 Speaker 5: instinct to please. Very much 240 00:12:54,880 --> 00:12:58,400 Speaker 1: so. They call that hallucination in tools like Google's AI 241 00:12:58,440 --> 00:13:01,920 Speaker 1: summary or Gemini, where it gives you the answer it 242 00:13:01,960 --> 00:13:04,120 Speaker 1: thinks you want. I had an example of this when 243 00:13:04,120 --> 00:13:06,760 Speaker 1: there were lots of shark attacks happening in Sydney. We 244 00:13:06,760 --> 00:13:08,680 Speaker 1: were down on the south coast of New South Wales. 245 00:13:09,160 --> 00:13:12,200 Speaker 1: I googled the local estuary where we swim: have there 246 00:13:12,240 --> 00:13:15,880 Speaker 1: been any bull sharks in this estuary? And it said, yes, 247 00:13:15,920 --> 00:13:19,080 Speaker 1: there have been bull sharks in this estuary. And when 248 00:13:19,080 --> 00:13:21,040 Speaker 1: I looked at the links that it was using to 249 00:13:21,080 --> 00:13:24,640 Speaker 1: give me that answer, they were all related to Queensland, 250 00:13:24,840 --> 00:13:27,880 Speaker 1: or South Africa or somewhere else. So it thought I 251 00:13:27,960 --> 00:13:30,439 Speaker 1: wanted it to say yes. In fact, I really did 252 00:13:30,440 --> 00:13:32,360 Speaker 1: not want it to say yes. Sure. So it gave 253 00:13:32,400 --> 00:13:35,040 Speaker 1: me the answer it thought I wanted, and it hallucinated 254 00:13:36,000 --> 00:13:36,920 Speaker 1: that into being. 
255 00:13:37,160 --> 00:13:38,960 Speaker 5: The truth is, there's not a huge amount to talk 256 00:13:39,000 --> 00:13:41,400 Speaker 5: about in North Katoomba. There's not a lot going on. 257 00:13:41,679 --> 00:13:45,600 Speaker 5: You know, it's, you know, it's quiet, it's residential. There 258 00:13:45,640 --> 00:13:48,120 Speaker 5: are some industrial pockets and there's a beautiful waterfall on 259 00:13:48,120 --> 00:13:50,600 Speaker 5: one side of town. There's one cafe, and there's a 260 00:13:50,640 --> 00:13:53,200 Speaker 5: gin distillery and there's an op shop. Instead of just 261 00:13:53,240 --> 00:13:58,400 Speaker 5: saying, look, no, I can't do that, or you've asked 262 00:13:58,400 --> 00:14:01,880 Speaker 5: for fifteen minutes, I can give you the fact that 263 00:14:01,920 --> 00:14:06,800 Speaker 5: it just padded out the podcast with a lot of 264 00:14:06,840 --> 00:14:09,959 Speaker 5: information that wasn't really connected to North Katoomba at all. 265 00:14:10,559 --> 00:14:12,320 Speaker 5: Made me feel good about the work that we do here. 266 00:14:12,679 --> 00:14:14,960 Speaker 1: One of the interesting things about audio as a medium, 267 00:14:15,080 --> 00:14:18,040 Speaker 1: especially in this new, very warm, kind of candid era 268 00:14:18,120 --> 00:14:21,120 Speaker 1: of podcasting, is that it's very emotional. You want to 269 00:14:21,200 --> 00:14:23,600 Speaker 1: laugh along with the podcast hosts, you want to be 270 00:14:23,640 --> 00:14:27,080 Speaker 1: outraged by, for example, when Hedley Thomas tells you that 271 00:14:27,160 --> 00:14:30,160 Speaker 1: a woman's been murdered and the cops didn't care in 272 00:14:30,200 --> 00:14:32,600 Speaker 1: one of his long form investigations like, for example, The 273 00:14:32,640 --> 00:14:35,760 Speaker 1: Teacher's Pet. So for a fake podcast to be trying 274 00:14:35,800 --> 00:14:40,560 Speaker 1: to push those same emotional buttons feels ultra manipulative and 
275 00:14:40,520 --> 00:14:41,360 Speaker 5: it doesn't work. 276 00:14:42,040 --> 00:14:43,360 Speaker 2: The ability to 277 00:14:44,920 --> 00:14:48,920 Speaker 5: write in a way that elicits real buy-in from 278 00:14:48,960 --> 00:14:53,840 Speaker 5: a listener is a distinctly human attribute. AI can mimic formulas, 279 00:14:53,880 --> 00:14:56,520 Speaker 5: and it can mimic the way that people talk, but 280 00:14:56,560 --> 00:14:59,920 Speaker 5: I think coming up with stories that elicit real human 281 00:15:00,040 --> 00:15:02,960 Speaker 5: emotion is best left to us humans. 282 00:15:03,320 --> 00:15:05,800 Speaker 1: This is relevant now because this is not just a 283 00:15:05,800 --> 00:15:08,360 Speaker 1: tool that we're using in our living rooms or on 284 00:15:08,360 --> 00:15:12,840 Speaker 1: the treadmill to inform us about something. AI podcasts are 285 00:15:12,880 --> 00:15:17,080 Speaker 1: being published by real publishers who are making money from them, 286 00:15:17,200 --> 00:15:20,000 Speaker 1: and that seems to be their main goal. The Hollywood Reporter, 287 00:15:20,120 --> 00:15:23,160 Speaker 1: which is a venerable news source employing lots of real 288 00:15:23,240 --> 00:15:26,520 Speaker 1: journalists in America, did an article about this. Let's hear 289 00:15:26,560 --> 00:15:28,320 Speaker 1: a bit of it, and this is read by their 290 00:15:28,520 --> 00:15:29,560 Speaker 1: text-to-speech function. 291 00:15:30,840 --> 00:15:33,640 Speaker 6: Amid the high costs for producing narrative podcasts and 292 00:15:33,760 --> 00:15:37,520 Speaker 6: pricey short term contracts for popular hosts, the idea here 293 00:15:37,560 --> 00:15:40,160 Speaker 6: is being able to own, scale and control the talent, 294 00:15:40,360 --> 00:15:42,960 Speaker 6: unlike those off the cuff humans, and produce shows at 295 00:15:43,000 --> 00:15:46,080 Speaker 6: a minimal cost. 
The company builds a stable of AI 296 00:15:46,080 --> 00:15:47,520 Speaker 6: talent to host podcasts. 297 00:15:48,640 --> 00:15:51,960 Speaker 1: It referenced the company there. That's Inception Point AI. That's 298 00:15:52,000 --> 00:15:55,560 Speaker 1: a company run by a woman named Jeanine Wright. She 299 00:15:55,680 --> 00:15:59,000 Speaker 1: was previously a senior executive at a very established podcasting 300 00:15:59,000 --> 00:16:02,640 Speaker 1: company called Wondery. Wondery is actually famous for making beautiful 301 00:16:02,680 --> 00:16:04,400 Speaker 1: handmade human podcasts, 302 00:16:04,000 --> 00:16:06,120 Speaker 5: isn't it? Teaming up with the LA Times, they made 303 00:16:06,120 --> 00:16:10,840 Speaker 5: Dirty John. They are the team behind Dr. Death, widely 304 00:16:10,840 --> 00:16:14,080 Speaker 5: seen as one of the best podcast distributors. Inception Point 305 00:16:14,120 --> 00:16:17,960 Speaker 5: AI's way of creating podcasts isn't actually too different to 306 00:16:18,000 --> 00:16:20,840 Speaker 5: how I created my podcast on North Katoomba. They go 307 00:16:20,920 --> 00:16:24,920 Speaker 5: to AI for script generation. They dump the script into 308 00:16:24,960 --> 00:16:27,600 Speaker 5: a text-to-voice tool, they hit a button and bang, 309 00:16:27,640 --> 00:16:30,440 Speaker 5: you've got your podcast. The company says that it takes them 310 00:16:30,440 --> 00:16:34,360 Speaker 5: about an hour to produce a podcast with very little 311 00:16:34,640 --> 00:16:36,120 Speaker 5: human involvement. 312 00:16:36,280 --> 00:16:38,800 Speaker 1: And they say in that Hollywood Reporter article that they 313 00:16:38,880 --> 00:16:43,000 Speaker 1: choose generic topics that people might be googling, like, for example, whales. 314 00:16:43,480 --> 00:16:46,360 Speaker 1: So they churn out an episode about blue whales. 
Twenty 315 00:16:46,400 --> 00:16:49,840 Speaker 1: people listen, they hear some ads, and Inception Point AI 316 00:16:50,000 --> 00:16:51,160 Speaker 1: makes money for them. 317 00:16:51,160 --> 00:16:51,640 Speaker 5: That's a win. 318 00:16:52,160 --> 00:16:55,040 Speaker 1: So one way that we could respond to this, being 319 00:16:55,080 --> 00:16:57,880 Speaker 1: emotional human beings, is just go, cool, no worries, you guys 320 00:16:57,920 --> 00:17:01,240 Speaker 1: do that. We'll do what we do, real journalism, handmade 321 00:17:01,280 --> 00:17:05,200 Speaker 1: audio productions, and we'll have faith that our audience wants this. Yes. 322 00:17:05,640 --> 00:17:07,959 Speaker 1: What's the other way to think about it? 323 00:17:08,000 --> 00:17:12,600 Speaker 5: Cry in the shower. To be totally frank, this company 324 00:17:12,600 --> 00:17:15,639 Speaker 5: clearly isn't looking at their audience as people. They're looking 325 00:17:15,680 --> 00:17:18,560 Speaker 5: at them as numbers in a spreadsheet, and they think, well, 326 00:17:18,560 --> 00:17:20,720 Speaker 5: if we can get twenty people to listen to one 327 00:17:20,720 --> 00:17:22,640 Speaker 5: of these podcasts, then we've done our job and we've 328 00:17:22,680 --> 00:17:23,760 Speaker 5: made our company money. 329 00:17:24,200 --> 00:17:26,520 Speaker 1: It really reminded me of when you get into a 330 00:17:26,600 --> 00:17:29,359 Speaker 1: lift in a shopping center, or you're shopping on a 331 00:17:29,480 --> 00:17:31,639 Speaker 1: quiet Wednesday, when you can actually hear the music that the 332 00:17:31,640 --> 00:17:36,560 Speaker 1: shopping center is playing. It's often Muzak-ised or stripped 333 00:17:36,600 --> 00:17:39,959 Speaker 1: back versions of real songs, so you might hear Nirvana's 334 00:17:40,040 --> 00:17:42,760 Speaker 1: Smells Like Teen Spirit played on something that sounds like 335 00:17:43,000 --> 00:17:52,320 Speaker 1: the glockenspiel. Yes. It's a joke 
now. Muzak, it's a 336 00:17:52,400 --> 00:17:55,240 Speaker 1: byword for something kind of cheap and tacky and cynical. 337 00:17:56,240 --> 00:17:57,720 Speaker 1: It's kind of like that, isn't it? 338 00:17:58,480 --> 00:18:02,000 Speaker 5: Muzak itself is a corruption of a lovely idea that 339 00:18:02,520 --> 00:18:07,200 Speaker 5: music can be wallpaper, but it can be beautiful wallpaper. 340 00:18:07,840 --> 00:18:10,760 Speaker 5: One of the pioneers of wallpaper music was French composer 341 00:18:10,880 --> 00:18:13,800 Speaker 5: Erik Satie, who was active in France at the turn 342 00:18:13,840 --> 00:18:19,080 Speaker 5: of the last century, who very intentionally wrote music to 343 00:18:19,160 --> 00:18:23,520 Speaker 5: create a mood. Satie was famously an eccentric man. There 344 00:18:23,520 --> 00:18:29,119 Speaker 5: are stories that he would attend soirees in Paris where 345 00:18:29,240 --> 00:18:32,600 Speaker 5: his music was being played. It wasn't being performed, it 346 00:18:32,640 --> 00:18:34,439 Speaker 5: was just being played in the background to set a 347 00:18:34,480 --> 00:18:37,280 Speaker 5: mood for the event, and he would walk through the 348 00:18:37,320 --> 00:18:40,359 Speaker 5: crowd and if he noticed people listening to his music, 349 00:18:40,440 --> 00:18:43,080 Speaker 5: stopping to appreciate it because it was very beautiful music, 350 00:18:43,440 --> 00:18:45,280 Speaker 5: he would hit them with his umbrella and say, don't 351 00:18:45,280 --> 00:18:47,360 Speaker 5: listen to the music. You're just meant to be enjoying yourself. 352 00:18:47,400 --> 00:18:49,480 Speaker 5: Just let the music sort of wash over you, 353 00:18:50,040 --> 00:18:51,960 Speaker 5: don't listen to it. So it's sort of this idea 354 00:18:52,000 --> 00:18:56,119 Speaker 5: of like subliminal music, which is actually, as an artistic statement, 355 00:18:56,160 --> 00:19:00,080 Speaker 5: it's a really interesting thing to do. 
But then, like 356 00:19:00,119 --> 00:19:03,560 Speaker 5: you said, that quite pure and interesting idea was eventually 357 00:19:03,640 --> 00:19:07,199 Speaker 5: hijacked and turned into a big business idea that we 358 00:19:07,280 --> 00:19:09,520 Speaker 5: all need to have The Girl from Ipanema playing in every 359 00:19:09,560 --> 00:19:10,720 Speaker 5: elevator around the world. 360 00:19:11,200 --> 00:19:14,040 Speaker 1: Let's hear a little bit of Gymnopédie No. 1 361 00:19:14,160 --> 00:19:25,120 Speaker 1: by Erik Satie. It's funny that he thought that this 362 00:19:25,400 --> 00:19:29,040 Speaker 1: was just to be heard in the background, because it's 363 00:19:29,080 --> 00:19:30,680 Speaker 1: very arresting. It's very different. 364 00:19:35,920 --> 00:19:38,080 Speaker 5: I think if it was just the chords, you could 365 00:19:38,080 --> 00:19:40,920 Speaker 5: ignore that. I think a melody like that comes in, 366 00:19:41,240 --> 00:19:41,879 Speaker 5: how are you not going to 367 00:19:41,880 --> 00:19:42,359 Speaker 7: listen to that? 368 00:19:42,440 --> 00:19:44,679 Speaker 1: It's a beautiful work of art. Yeah, it's amazing, just 369 00:19:44,720 --> 00:19:45,560 Speaker 1: like our podcasts. 370 00:19:46,720 --> 00:19:59,960 Speaker 7: Thanks, Jasper. Thank you. 371 00:20:01,040 --> 00:20:03,919 Speaker 1: Jasper Leak is The Australian's audio lead. You can read his 372 00:20:04,040 --> 00:20:07,600 Speaker 1: article about AI podcasts at theaustralian.com.au. 373 00:20:08,200 --> 00:20:10,639 Speaker 1: This episode of The Front was hosted by me, Claire 374 00:20:10,640 --> 00:20:13,879 Speaker 1: Harvey, and produced by Jasper Leak, who edited the episode and 375 00:20:13,920 --> 00:20:16,520 Speaker 1: also wrote our theme. Thanks for joining us on The 376 00:20:16,520 --> 00:20:20,560 Speaker 1: Front this week. 
Our team also includes Kristen Amiet, Lia Tsamoglou, 377 00:20:20,840 --> 00:20:22,679 Speaker 1: Tiffany Dimmack and Joshua Burton.