1 00:00:04,240 --> 00:00:07,240 Speaker 1: Welcome to TechStuff, a production of iHeartRadio's 2 00:00:07,320 --> 00:00:13,880 Speaker 1: HowStuffWorks. Hey there, and welcome to TechStuff. 3 00:00:13,880 --> 00:00:17,400 Speaker 1: I'm your host, Jonathan Strickland. I'm an executive producer with 4 00:00:17,600 --> 00:00:19,560 Speaker 1: iHeartRadio and HowStuffWorks, and I love 5 00:00:19,600 --> 00:00:24,200 Speaker 1: all things tech. And I'm sitting in the audience of 6 00:00:24,239 --> 00:00:28,240 Speaker 1: a local theater, like, stage theater, not long ago. I'm 7 00:00:28,240 --> 00:00:31,440 Speaker 1: waiting for the show to start, and there's a song 8 00:00:31,720 --> 00:00:34,440 Speaker 1: that's playing over the sound system, and I'm really kind 9 00:00:34,440 --> 00:00:37,479 Speaker 1: of digging the song, but I totally don't recognize it. 10 00:00:38,040 --> 00:00:40,920 Speaker 1: And I glance down at my phone and I see 11 00:00:41,240 --> 00:00:44,320 Speaker 1: that on the phone, below the time on the locked 12 00:00:44,400 --> 00:00:48,600 Speaker 1: phone screen, it says that the song is Danger! High 13 00:00:48,680 --> 00:00:52,280 Speaker 1: Voltage by Electric Six. Now this is obviously a hypothetical 14 00:00:52,280 --> 00:00:54,680 Speaker 1: example because I would recognize that song anywhere, but you 15 00:00:54,720 --> 00:00:57,959 Speaker 1: get the point. Anyway, I'm thinking, that's so cool. My 16 00:00:58,000 --> 00:01:01,640 Speaker 1: phone knows what songs are playing around me. That's so neat. 17 00:01:02,360 --> 00:01:05,000 Speaker 1: I didn't even have to tell it to do anything.
And 18 00:01:05,040 --> 00:01:07,760 Speaker 1: then a couple of hours later, as I think back 19 00:01:07,800 --> 00:01:11,560 Speaker 1: on this moment, uncertainty and dread start to seep in. 20 00:01:11,680 --> 00:01:15,240 Speaker 1: Wait a minute, if my phone can identify a song 21 00:01:15,440 --> 00:01:18,400 Speaker 1: that's playing around me, that means my phone is actually 22 00:01:18,440 --> 00:01:21,319 Speaker 1: listening to stuff. It wouldn't be able to tell me 23 00:01:21,680 --> 00:01:23,920 Speaker 1: the song title otherwise. It has to be able to 24 00:01:23,959 --> 00:01:26,959 Speaker 1: pick up the audio. I didn't activate any app. I 25 00:01:26,959 --> 00:01:30,880 Speaker 1: didn't turn on Shazam or ask my phone or anything. 26 00:01:30,920 --> 00:01:33,560 Speaker 1: My phone did it by itself. So my phone is 27 00:01:33,600 --> 00:01:36,800 Speaker 1: detecting the sounds around it even when it's not in 28 00:01:36,920 --> 00:01:41,280 Speaker 1: an active mode. Now, on a similar note, I'm sure 29 00:01:41,440 --> 00:01:45,640 Speaker 1: we all have had these personal assistant experiences out there. 30 00:01:45,680 --> 00:01:48,520 Speaker 1: Whether we use one ourselves or we've been around when someone 31 00:01:48,520 --> 00:01:52,880 Speaker 1: else uses them, things like Google Assistant or Alexa or 32 00:01:52,920 --> 00:01:56,120 Speaker 1: Siri or Cortana. There's more of them out there. You 33 00:01:56,160 --> 00:01:59,200 Speaker 1: can activate these assistants with a specific word or phrase, 34 00:01:59,560 --> 00:02:01,640 Speaker 1: and then you speak to them to carry out some 35 00:02:01,680 --> 00:02:04,560 Speaker 1: sort of task or to get you some sort of 36 00:02:04,560 --> 00:02:07,400 Speaker 1: information or something along those lines.
We've got a Google 37 00:02:07,440 --> 00:02:10,200 Speaker 1: Home device in our house, so we might use it 38 00:02:10,240 --> 00:02:13,480 Speaker 1: to get a quick rundown on the weather report. We 39 00:02:13,560 --> 00:02:15,360 Speaker 1: might ask it to play a track off an album 40 00:02:15,360 --> 00:02:19,000 Speaker 1: by the jazz fusion band Weather Report. But wait, that 41 00:02:19,080 --> 00:02:22,120 Speaker 1: means that device is listening too. We didn't have to 42 00:02:22,120 --> 00:02:24,280 Speaker 1: take any physical action. We didn't have to push a 43 00:02:24,320 --> 00:02:27,560 Speaker 1: button to make it work. We just spoke the keyword 44 00:02:27,720 --> 00:02:31,160 Speaker 1: or a key phrase, and off it goes. And then 45 00:02:31,200 --> 00:02:34,760 Speaker 1: we get into stuff that seems super creepy. And I'm 46 00:02:34,800 --> 00:02:37,240 Speaker 1: sure most of you have had some sort of experience 47 00:02:37,280 --> 00:02:40,840 Speaker 1: like this. Say you're chatting with friends, maybe you're at 48 00:02:40,880 --> 00:02:44,400 Speaker 1: a restaurant or you're just hanging out, and you're talking 49 00:02:44,440 --> 00:02:47,480 Speaker 1: about this new snack food you just heard about, and 50 00:02:47,520 --> 00:02:50,519 Speaker 1: this is just one part of a conversation that rambles 51 00:02:50,560 --> 00:02:55,200 Speaker 1: all over the place. But then you talk a little 52 00:02:55,200 --> 00:02:56,840 Speaker 1: bit about the snack food for a couple of minutes. 53 00:02:56,840 --> 00:02:58,760 Speaker 1: You're like, you've heard about it, you wanted to try it, 54 00:02:58,880 --> 00:03:01,080 Speaker 1: you haven't tried it yet. Later on, you pop on 55 00:03:01,120 --> 00:03:03,079 Speaker 1: over to Facebook, and as you're scrolling through your feed, 56 00:03:03,160 --> 00:03:06,440 Speaker 1: there it is.
There's an ad for the very same 57 00:03:06,480 --> 00:03:09,480 Speaker 1: snack food you mentioned to your friends just a little 58 00:03:09,480 --> 00:03:13,240 Speaker 1: earlier that day. You've never purchased the snack as far 59 00:03:13,280 --> 00:03:15,520 Speaker 1: as you remember, you haven't even searched for it on 60 00:03:15,560 --> 00:03:19,240 Speaker 1: the web, and there's the ad. So is Facebook listening 61 00:03:19,280 --> 00:03:22,200 Speaker 1: in on your conversation in an effort to serve up 62 00:03:22,240 --> 00:03:26,680 Speaker 1: a laser focused targeted ad? In this episode, we're gonna 63 00:03:26,680 --> 00:03:29,840 Speaker 1: take a look at the technology that allows our devices 64 00:03:29,880 --> 00:03:33,320 Speaker 1: to listen in on us, and we'll explore the studies 65 00:03:33,320 --> 00:03:36,200 Speaker 1: about whether or not anything hinky is going on and 66 00:03:36,200 --> 00:03:40,400 Speaker 1: try to separate fact from FUD, F-U-D, that's fear, 67 00:03:40,520 --> 00:03:44,240 Speaker 1: uncertainty and doubt. And we'll also chat about some recent 68 00:03:44,320 --> 00:03:47,120 Speaker 1: news stories about how big companies have been handing over 69 00:03:47,160 --> 00:03:51,280 Speaker 1: audio messages to third party human contractors and what that 70 00:03:51,360 --> 00:03:55,680 Speaker 1: means in terms of privacy and ethics. Now, first, let's 71 00:03:55,720 --> 00:04:00,160 Speaker 1: address a big reason why devices aren't constantly recording or 72 00:04:00,200 --> 00:04:05,520 Speaker 1: broadcasting all the sounds within an environment that's reachable by microphone. 73 00:04:06,320 --> 00:04:10,840 Speaker 1: It's because that's truly enormous, like, that's a huge amount 74 00:04:10,960 --> 00:04:14,040 Speaker 1: of data. So let's just take Facebook as an example. 75 00:04:14,680 --> 00:04:18,360 Speaker 1: There are more than two billion people using Facebook every month.
76 00:04:18,880 --> 00:04:21,080 Speaker 1: At least one and a half billion people pop on 77 00:04:21,080 --> 00:04:24,400 Speaker 1: Facebook every single day. Now that's not necessarily the same 78 00:04:24,880 --> 00:04:27,680 Speaker 1: one and a half billion people every day, but every 79 00:04:27,760 --> 00:04:31,640 Speaker 1: day one point five billion people check Facebook, and out 80 00:04:31,640 --> 00:04:35,400 Speaker 1: of that number, nearly one billion of them are accessing 81 00:04:35,440 --> 00:04:40,360 Speaker 1: Facebook on mobile devices. So, just from a data management standpoint, 82 00:04:41,040 --> 00:04:45,240 Speaker 1: there's no way any company, even one as large as Facebook, 83 00:04:45,400 --> 00:04:49,279 Speaker 1: could be actively monitoring, recording, or even analyzing all that 84 00:04:49,360 --> 00:04:54,080 Speaker 1: audio that would be coming in from a billion mobile handsets. 85 00:04:54,960 --> 00:04:56,960 Speaker 1: We are in the age of big data, but we 86 00:04:57,040 --> 00:04:59,640 Speaker 1: still have our limits. Plus you'd have to figure out 87 00:05:00,240 --> 00:05:03,520 Speaker 1: that you know that that large amount of data, most 88 00:05:03,560 --> 00:05:06,640 Speaker 1: of it wouldn't be useful to Facebook. Now, don't get 89 00:05:06,640 --> 00:05:08,880 Speaker 1: me wrong. At the end of the day, you and 90 00:05:08,960 --> 00:05:14,000 Speaker 1: I are the products being bought and sold on Facebook 91 00:05:14,080 --> 00:05:19,240 Speaker 1: and Google and other providers out there. We're potential customers 92 00:05:19,279 --> 00:05:22,720 Speaker 1: for all of the advertisers that use those companies like 93 00:05:22,760 --> 00:05:26,839 Speaker 1: Facebook as a platform. So it benefits the advertisers and 94 00:05:27,040 --> 00:05:31,120 Speaker 1: Facebook and sometimes even us as customers to match the 95 00:05:31,200 --> 00:05:34,360 Speaker 1: right ads to the right people. 
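To put a rough number on the "huge amount of data" point made above, here is a back-of-the-envelope sketch. The one billion mobile users figure comes from the episode; the 16 kbps bitrate (a low-quality compressed speech stream) and the assumption of round-the-clock recording are illustrative guesses, not figures from the episode or from Facebook.

```python
# Back-of-the-envelope estimate of always-on audio volume.
MOBILE_DAILY_USERS = 1_000_000_000  # "nearly one billion" mobile users, per the episode
BITRATE_BITS_PER_SECOND = 16_000    # assumption: low-bitrate compressed speech
SECONDS_PER_DAY = 24 * 60 * 60      # assumption: microphones stream all day

def daily_audio_bytes(users, bitrate_bps, seconds):
    """Total bytes produced if every user streams audio for `seconds`."""
    return users * bitrate_bps * seconds // 8  # 8 bits per byte

total = daily_audio_bytes(MOBILE_DAILY_USERS, BITRATE_BITS_PER_SECOND, SECONDS_PER_DAY)
print(f"{total / 1e15:.1f} petabytes per day")  # prints "172.8 petabytes per day"
```

Changing the bitrate or the hours of recording moves the total proportionally, so treat the result as an order-of-magnitude illustration only.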
So there's definitely an 96 00:05:34,400 --> 00:05:37,880 Speaker 1: incentive to learn as much about users as possible to 97 00:05:38,000 --> 00:05:42,200 Speaker 1: leverage their interests and potentially convert them into paying customers 98 00:05:42,240 --> 00:05:45,960 Speaker 1: to an advertiser. Now, this is the very basic foundation 99 00:05:46,080 --> 00:05:50,520 Speaker 1: of Facebook's business model. So if Facebook could do this 100 00:05:50,839 --> 00:05:54,160 Speaker 1: from a technical standpoint, and if the company could get 101 00:05:54,200 --> 00:05:58,400 Speaker 1: away with it from a public perception standpoint, I think 102 00:05:58,400 --> 00:06:03,000 Speaker 1: there's little doubt that Facebook would do it. But honestly, 103 00:06:03,000 --> 00:06:05,440 Speaker 1: it's just way too much information to process and to 104 00:06:05,480 --> 00:06:09,200 Speaker 1: boil down into actionable plans. We talk about a lot 105 00:06:09,200 --> 00:06:12,080 Speaker 1: of stuff in our day, you know, and some of 106 00:06:12,120 --> 00:06:14,159 Speaker 1: it we may not really be interested in. We're just 107 00:06:14,200 --> 00:06:17,839 Speaker 1: talking about something. So it wouldn't do Facebook any good 108 00:06:17,839 --> 00:06:20,239 Speaker 1: to serve up ads for stuff that we weren't actually 109 00:06:20,279 --> 00:06:22,880 Speaker 1: really interested in, so it has to pick and choose 110 00:06:22,880 --> 00:06:27,360 Speaker 1: its moments. Facebook has denied using phone microphones in this way. 111 00:06:27,720 --> 00:06:30,320 Speaker 1: In a June 2, 2016 blog post on 112 00:06:30,360 --> 00:06:34,280 Speaker 1: the Facebook newsroom site, a company representative wrote this, and 113 00:06:34,320 --> 00:06:39,720 Speaker 1: here's a quote.
Facebook does not use your phone's microphone 114 00:06:39,760 --> 00:06:42,359 Speaker 1: to inform ads or to change what you see in 115 00:06:42,440 --> 00:06:45,800 Speaker 1: news feed. Some recent articles have suggested that we must 116 00:06:45,839 --> 00:06:48,280 Speaker 1: be listening to people's conversations in order to show them 117 00:06:48,279 --> 00:06:52,360 Speaker 1: relevant ads. This is not true. We show ads based 118 00:06:52,400 --> 00:06:56,400 Speaker 1: on people's interests and other profile information, not what you're 119 00:06:56,400 --> 00:07:00,160 Speaker 1: talking out loud about. We only access your microphone if 120 00:07:00,200 --> 00:07:02,560 Speaker 1: you have given our app permission, and if you are 121 00:07:02,600 --> 00:07:06,560 Speaker 1: actively using a specific feature that requires audio. This might 122 00:07:06,600 --> 00:07:09,600 Speaker 1: include recording a video or using an optional feature we 123 00:07:09,640 --> 00:07:12,560 Speaker 1: introduced two years ago to include music or other audio 124 00:07:12,600 --> 00:07:18,240 Speaker 1: in your status updates. End quote. Now, it's understandable that 125 00:07:18,320 --> 00:07:22,200 Speaker 1: people would be a bit skeptical regarding Facebook's claims of innocence. 126 00:07:22,520 --> 00:07:25,840 Speaker 1: In this regard. The company has had several high profile 127 00:07:25,920 --> 00:07:29,840 Speaker 1: scandals and issues with privacy and security. Zuckerberg himself once 128 00:07:29,960 --> 00:07:35,240 Speaker 1: famously declared that privacy is dead. Also, he simultaneously does 129 00:07:35,280 --> 00:07:38,400 Speaker 1: his best to preserve his own privacy. But that's commentary 130 00:07:38,440 --> 00:07:42,400 Speaker 1: for another episode. 
So I don't blame people for thinking 131 00:07:42,440 --> 00:07:45,480 Speaker 1: that Facebook might actually be listening in on conversations because 132 00:07:45,480 --> 00:07:48,880 Speaker 1: the company has already proven it hasn't been the best 133 00:07:49,000 --> 00:07:52,640 Speaker 1: steward of user privacy in the past. But that doesn't 134 00:07:52,680 --> 00:07:56,040 Speaker 1: mean the company has actually been spying on people. It 135 00:07:56,080 --> 00:08:00,480 Speaker 1: doesn't have to, at least not in that way. And 136 00:08:00,720 --> 00:08:03,680 Speaker 1: this is where we get into some troubling territory because 137 00:08:03,720 --> 00:08:06,200 Speaker 1: it's where we start to learn how services like Google 138 00:08:06,280 --> 00:08:10,880 Speaker 1: and Facebook and others can glean information about us, whether 139 00:08:10,960 --> 00:08:14,240 Speaker 1: we have consciously shared that information or not, and it 140 00:08:14,240 --> 00:08:17,840 Speaker 1: helps explain how these companies can advertise to us so effectively. 141 00:08:18,640 --> 00:08:22,200 Speaker 1: One way Facebook does this is with an innovation called 142 00:08:22,360 --> 00:08:26,640 Speaker 1: Facebook Pixel. Now, this is a piece of code that 143 00:08:27,000 --> 00:08:32,320 Speaker 1: Facebook's clients advertisers really can put on their own websites. 144 00:08:32,720 --> 00:08:35,600 Speaker 1: So it's the type of code you would insert into 145 00:08:35,640 --> 00:08:38,040 Speaker 1: the website for a business. So let's say you own 146 00:08:38,080 --> 00:08:42,359 Speaker 1: a specialty niche marketing shop. We'll say you sell figurines 147 00:08:42,400 --> 00:08:46,200 Speaker 1: based off of iconic horror movie monsters and characters, and 148 00:08:46,240 --> 00:08:49,200 Speaker 1: you're going to advertise on Facebook. The pixel code is 149 00:08:49,240 --> 00:08:52,920 Speaker 1: one way Facebook can optimize that experience. 
The code pulls 150 00:08:52,960 --> 00:08:57,320 Speaker 1: information off of user behavior on your website and sends 151 00:08:57,320 --> 00:09:00,760 Speaker 1: it to Facebook. If people click over to your site 152 00:09:00,760 --> 00:09:03,560 Speaker 1: because of an ad on Facebook, pixel will register it. 153 00:09:04,000 --> 00:09:07,120 Speaker 1: This helps you see how effective or ineffective your ads 154 00:09:07,200 --> 00:09:10,800 Speaker 1: are on the site. It also can target your ads 155 00:09:10,920 --> 00:09:13,520 Speaker 1: to people on Facebook who would be most likely to 156 00:09:13,600 --> 00:09:17,160 Speaker 1: click on those ads. It might analyze the traits common 157 00:09:17,200 --> 00:09:19,600 Speaker 1: to people who are interacting with your ads, and then 158 00:09:19,640 --> 00:09:22,760 Speaker 1: extrapolate that to target people who have similar traits and 159 00:09:22,880 --> 00:09:27,920 Speaker 1: behaviors but they haven't yet seen your advertisements. Facebook, meanwhile, 160 00:09:28,040 --> 00:09:30,360 Speaker 1: can also use that data to serve up ads from 161 00:09:30,400 --> 00:09:33,559 Speaker 1: other companies to users based on similar findings, and it 162 00:09:33,640 --> 00:09:36,400 Speaker 1: can track other stuff too. Let's say you click over 163 00:09:36,480 --> 00:09:38,880 Speaker 1: to an article on a blog or news site that 164 00:09:38,960 --> 00:09:42,680 Speaker 1: incorporates Facebook pixel in the site's code. Facebook can see 165 00:09:42,679 --> 00:09:45,160 Speaker 1: how long you were on that article, which in turn 166 00:09:45,200 --> 00:09:48,600 Speaker 1: indicates your interest and investment level in that topic. Then 167 00:09:48,640 --> 00:09:51,640 Speaker 1: Facebook can serve up ads related to the contents of 168 00:09:51,679 --> 00:09:54,920 Speaker 1: that article to you. 
In the end, it's all about 169 00:09:54,920 --> 00:09:58,760 Speaker 1: analyzing user behavior to get the biggest return on investment, 170 00:09:59,080 --> 00:10:01,800 Speaker 1: and it doesn't require using the microphone to do it. 171 00:10:02,160 --> 00:10:05,000 Speaker 1: They can just look at who you are, where you've been, 172 00:10:05,440 --> 00:10:09,280 Speaker 1: both in real life if it's tracking your location and 173 00:10:09,360 --> 00:10:12,720 Speaker 1: on the Internet if it's tracking your browsing, and 174 00:10:12,800 --> 00:10:15,600 Speaker 1: who your friends are. And all of this information combined 175 00:10:16,000 --> 00:10:19,240 Speaker 1: gives Facebook a ton of data about what kind of 176 00:10:19,280 --> 00:10:21,920 Speaker 1: ads to target towards you. Now, on top of that, 177 00:10:22,200 --> 00:10:26,120 Speaker 1: Facebook can purchase information from data brokers to supplement its 178 00:10:26,120 --> 00:10:29,400 Speaker 1: own gargantuan database. There are companies that manage 179 00:10:29,400 --> 00:10:33,160 Speaker 1: stuff like loyalty programs, which also track what you buy. 180 00:10:33,360 --> 00:10:36,000 Speaker 1: They have to for the loyalty programs to work, and 181 00:10:36,040 --> 00:10:39,400 Speaker 1: those purchases are linked to you as a person. They know, oh, 182 00:10:39,480 --> 00:10:42,480 Speaker 1: Jonathan goes to Starbucks all the time and he always 183 00:10:42,480 --> 00:10:45,520 Speaker 1: gets those Nitro cold brews, so let's put an ad 184 00:10:46,000 --> 00:10:49,720 Speaker 1: that targets him based on that information. Now, that data 185 00:10:49,800 --> 00:10:51,920 Speaker 1: isn't just being used to help you get the best 186 00:10:52,200 --> 00:10:56,080 Speaker 1: deal on whatever it happens to be. That information is valuable.
187 00:10:56,559 --> 00:11:00,480 Speaker 1: So companies that manage these loyalty programs can and do 188 00:11:00,840 --> 00:11:03,600 Speaker 1: buy and sell that data. You know, our spending 189 00:11:03,640 --> 00:11:07,400 Speaker 1: habits are part of this sort of encyclopedia entry about 190 00:11:07,400 --> 00:11:11,080 Speaker 1: our interests, priorities, and behaviors. Now, none of this needs 191 00:11:11,200 --> 00:11:15,200 Speaker 1: to use a microphone to spy on us. So in 192 00:11:15,240 --> 00:11:17,800 Speaker 1: the case of seeing that snack food pop up on 193 00:11:17,800 --> 00:11:20,480 Speaker 1: the Facebook feed, it could simply be that you exhibit 194 00:11:20,559 --> 00:11:23,520 Speaker 1: behaviors similar to ones that people who have bought that 195 00:11:23,600 --> 00:11:26,200 Speaker 1: snack food tend to have as well. You've liked the 196 00:11:26,240 --> 00:11:29,480 Speaker 1: same sort of pages. You may even have a lot 197 00:11:29,520 --> 00:11:32,080 Speaker 1: of friends who have already bought this stuff. You may 198 00:11:32,120 --> 00:11:34,959 Speaker 1: live in a region where it has recently been introduced. 199 00:11:35,360 --> 00:11:37,600 Speaker 1: These are the kinds of points of data that Facebook 200 00:11:37,679 --> 00:11:39,320 Speaker 1: might use in order to serve that ad up to 201 00:11:39,360 --> 00:11:41,840 Speaker 1: you that have nothing to do with your microphone. So 202 00:11:41,880 --> 00:11:44,640 Speaker 1: you got the ad not because you talked about the 203 00:11:44,640 --> 00:11:47,760 Speaker 1: snack food, but because Facebook has sussed out you're the 204 00:11:47,760 --> 00:11:50,640 Speaker 1: type of person who would like that snack food, because, 205 00:11:51,400 --> 00:11:54,360 Speaker 1: spoiler alert, you're not as special as you think you are, 206 00:11:54,880 --> 00:11:57,600 Speaker 1: and I'm not as special as I think I am.
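The "you behave like people who already bought it" idea can be pictured with a toy similarity score. This is only an illustration of the general concept; it is not how Facebook's ad system works, and the signal names (`spicy_snacks_page`, `region:atlanta`, and so on) and both function names are made up for the example.

```python
def jaccard(a, b):
    """Overlap between two sets of signals: 0.0 (nothing shared) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def lookalike_score(user_signals, buyer_profiles):
    """Average similarity between one user and a group of known buyers."""
    if not buyer_profiles:
        return 0.0
    return sum(jaccard(user_signals, p) for p in buyer_profiles) / len(buyer_profiles)

# Hypothetical signals: liked pages, region, and whether a friend bought the snack.
buyers = [
    {"spicy_snacks_page", "hot_sauce_page", "region:atlanta"},
    {"spicy_snacks_page", "region:atlanta", "friend_bought"},
]
you = {"spicy_snacks_page", "region:atlanta", "friend_bought"}
someone_else = {"knitting_page", "region:oslo"}

print(lookalike_score(you, buyers))           # prints 0.75
print(lookalike_score(someone_else, buyers))  # prints 0.0
```

A high score would suggest showing the ad, and no microphone is involved: everything comes from page likes, location, and the friend graph.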
207 00:11:57,640 --> 00:12:00,080 Speaker 1: Now you could argue, and I would agree with you 208 00:12:00,160 --> 00:12:03,480 Speaker 1: on this, that what Facebook is doing is at least 209 00:12:03,559 --> 00:12:06,520 Speaker 1: as creepy as listening in on a microphone, perhaps even 210 00:12:06,600 --> 00:12:10,760 Speaker 1: more so. Facebook has filed patents that focus on technologies 211 00:12:10,840 --> 00:12:13,200 Speaker 1: meant to predict where you're going to go next 212 00:12:13,559 --> 00:12:16,400 Speaker 1: based on your history of location data. So, in other words, 213 00:12:16,640 --> 00:12:19,160 Speaker 1: Facebook is trying to figure out where you're going to 214 00:12:19,240 --> 00:12:23,000 Speaker 1: go before you go there. And it's not just you, 215 00:12:23,160 --> 00:12:25,680 Speaker 1: it's all the people you know who are using Facebook 216 00:12:25,720 --> 00:12:29,440 Speaker 1: too. And so it's not just predicting where you'll go, 217 00:12:30,120 --> 00:12:33,600 Speaker 1: it's also predicting which people you may be running into, 218 00:12:33,679 --> 00:12:35,800 Speaker 1: because it's predicting those people are going to go to 219 00:12:35,840 --> 00:12:38,560 Speaker 1: that same place and whether or not you might encounter 220 00:12:38,679 --> 00:12:41,199 Speaker 1: one another. It can also use that to make suggestions 221 00:12:41,240 --> 00:12:44,480 Speaker 1: to add people on Facebook who are going to those 222 00:12:44,520 --> 00:12:48,240 Speaker 1: same places so that they become your friends online. Now 223 00:12:48,240 --> 00:12:51,400 Speaker 1: why does Facebook care who your friends are? Because the 224 00:12:51,440 --> 00:12:55,120 Speaker 1: more people who use Facebook and the more interconnected they become, 225 00:12:55,640 --> 00:12:59,480 Speaker 1: the more useful the information they generate for Facebook.
That 226 00:12:59,720 --> 00:13:03,640 Speaker 1: ends up becoming more valuable to the company. So 227 00:13:05,040 --> 00:13:07,480 Speaker 1: it is pretty creepy and invasive, and it doesn't have 228 00:13:07,520 --> 00:13:10,439 Speaker 1: to use the microphone. But when we come back, I'll 229 00:13:10,440 --> 00:13:13,040 Speaker 1: talk a bit more about these sound activated features and 230 00:13:13,080 --> 00:13:15,439 Speaker 1: what's actually going on, because there is some stuff we've 231 00:13:15,480 --> 00:13:17,760 Speaker 1: got to be worried about. But first, let's take a 232 00:13:17,880 --> 00:13:28,240 Speaker 1: quick break. When I opened this show, I talked about 233 00:13:28,240 --> 00:13:30,920 Speaker 1: how my phone could listen in on music and identify 234 00:13:31,000 --> 00:13:34,320 Speaker 1: the song even when the phone was in its locked mode. 235 00:13:34,800 --> 00:13:38,200 Speaker 1: Now that's because I have a Pixel 2 XL phone. 236 00:13:38,240 --> 00:13:41,839 Speaker 1: It's an Android phone. It's actually a flagship Google phone, 237 00:13:42,160 --> 00:13:45,400 Speaker 1: and there's a feature on the Pixel 2 that's called 238 00:13:45,640 --> 00:13:48,560 Speaker 1: Now Playing. You have to activate this feature, you have 239 00:13:48,600 --> 00:13:51,679 Speaker 1: to choose to opt into it. So I want to make 240 00:13:51,720 --> 00:13:54,679 Speaker 1: that clear. I chose to activate this feature. It's not 241 00:13:54,760 --> 00:13:59,240 Speaker 1: just active by default, and with it active, the phone 242 00:13:59,240 --> 00:14:01,920 Speaker 1: can identify music that's playing, and it can tell me 243 00:14:01,960 --> 00:14:04,720 Speaker 1: the title even when the phone is in its locked position.
244 00:14:04,800 --> 00:14:08,360 Speaker 1: So what gives? Well, this is not as creepy and 245 00:14:08,440 --> 00:14:12,040 Speaker 1: invasive as it sounds at first glance, because this feature, 246 00:14:12,480 --> 00:14:16,480 Speaker 1: this is incredible to me, is actually entirely local to 247 00:14:16,600 --> 00:14:21,320 Speaker 1: the Pixel 2 phones. It works on the phone itself. 248 00:14:21,360 --> 00:14:24,320 Speaker 1: It's not consulting the cloud at all, it's not sending 249 00:14:24,360 --> 00:14:28,760 Speaker 1: any information. So how can that be possible? How can 250 00:14:29,320 --> 00:14:32,400 Speaker 1: all this information exist on the phone already? Well, let's 251 00:14:32,440 --> 00:14:35,960 Speaker 1: boil it down. First, if you've ever played with any 252 00:14:36,000 --> 00:14:40,920 Speaker 1: digital sound recording software, you've likely seen sound recorded as 253 00:14:40,920 --> 00:14:44,880 Speaker 1: a waveform, a visualization of sound, and typically it's 254 00:14:44,880 --> 00:14:47,120 Speaker 1: pretty simple stuff. Like, if you're using a very basic 255 00:14:47,240 --> 00:14:51,920 Speaker 1: sound recording system, you're mostly looking at changes in amplitude, 256 00:14:52,280 --> 00:14:55,119 Speaker 1: or volume, in other words. So you see a continuous 257 00:14:55,200 --> 00:14:57,520 Speaker 1: series of peaks and valleys over the course of a 258 00:14:57,560 --> 00:15:02,200 Speaker 1: sound recording. Those represent the loudest and the quietest parts 259 00:15:02,240 --> 00:15:05,200 Speaker 1: of the recording, the changes in volume. You can also 260 00:15:05,240 --> 00:15:09,480 Speaker 1: graph frequency or pitch, and you can, if you zoom 261 00:15:09,520 --> 00:15:12,480 Speaker 1: way in, see shapes in the waveform that indicate 262 00:15:12,480 --> 00:15:17,080 Speaker 1: specific phonetics and sounds.
Anyone who has worked in audio 263 00:15:17,240 --> 00:15:20,760 Speaker 1: editing for a while can identify at a glance certain 264 00:15:20,800 --> 00:15:26,000 Speaker 1: distinctive sounds. Tari, my producer, can probably tell you just 265 00:15:26,160 --> 00:15:29,520 Speaker 1: by looking at a waveform of my recording which moments 266 00:15:29,560 --> 00:15:34,400 Speaker 1: represent the irritating mouth sounds she removes before publishing an episode. 267 00:15:35,080 --> 00:15:37,680 Speaker 1: It doesn't take long before you can do this yourself. 268 00:15:38,040 --> 00:15:40,560 Speaker 1: It's actually pretty easy to identify, say, like, a 269 00:15:40,640 --> 00:15:46,000 Speaker 1: hi-hat cymbal in a music recording, because it's very distinctive. Now, 270 00:15:46,080 --> 00:15:49,200 Speaker 1: that means that songs have these distinctive features, like a 271 00:15:49,240 --> 00:15:53,400 Speaker 1: fingerprint, that represent the sound of the song, and if 272 00:15:53,440 --> 00:15:56,800 Speaker 1: you can recognize the fingerprint, you can identify the song 273 00:15:57,040 --> 00:15:59,600 Speaker 1: even if you're not listening to the song at that moment. 274 00:16:00,040 --> 00:16:03,000 Speaker 1: And you could look at a printout of a 275 00:16:03,000 --> 00:16:06,280 Speaker 1: waveform of a song and you can try and 276 00:16:06,360 --> 00:16:10,760 Speaker 1: match it against a library of printouts. That's essentially 277 00:16:10,840 --> 00:16:14,280 Speaker 1: what the Pixel 2 is doing. The program runs in 278 00:16:14,320 --> 00:16:17,960 Speaker 1: the background. It activates when the sound profile indicates that 279 00:16:18,000 --> 00:16:22,160 Speaker 1: there's music present, so it then analyzes the sound that's 280 00:16:22,160 --> 00:16:24,800 Speaker 1: coming in through the microphone and it creates one of 281 00:16:24,800 --> 00:16:28,400 Speaker 1: these digital fingerprints that I was just describing.
Then, just 282 00:16:28,440 --> 00:16:31,040 Speaker 1: like you would with a crime scene fingerprint, the Pixel 283 00:16:31,080 --> 00:16:34,760 Speaker 1: 2 will compare the digital analysis of the song that's 284 00:16:34,760 --> 00:16:38,560 Speaker 1: playing against a local database on the phone of fingerprints 285 00:16:38,600 --> 00:16:42,640 Speaker 1: that represent thousands of popular songs for your region. Now 286 00:16:42,680 --> 00:16:45,920 Speaker 1: exactly how many hasn't really been released, but supposedly it's in 287 00:16:45,960 --> 00:16:49,560 Speaker 1: the tens of thousands of songs range. And if the 288 00:16:49,560 --> 00:16:51,920 Speaker 1: Pixel 2 finds a match between the song that is 289 00:16:51,960 --> 00:16:55,200 Speaker 1: currently playing and the one that's in the database, it 290 00:16:55,280 --> 00:16:58,200 Speaker 1: returns the result. This works even if the phone has 291 00:16:58,200 --> 00:17:01,840 Speaker 1: cellular and WiFi data turned off, because again it's all local. 292 00:17:02,440 --> 00:17:06,480 Speaker 1: Now the Now Playing feature doesn't run constantly because that 293 00:17:06,520 --> 00:17:10,119 Speaker 1: would drain battery life like crazy. Instead, it samples the 294 00:17:10,160 --> 00:17:14,600 Speaker 1: audio approximately every sixty seconds, and it takes time to 295 00:17:14,680 --> 00:17:17,560 Speaker 1: match a song to an entry in the database. The 296 00:17:17,600 --> 00:17:20,959 Speaker 1: cleaner the audio, in other words, the less background noise 297 00:17:21,040 --> 00:17:24,800 Speaker 1: and less interference that's present, the faster this process tends 298 00:17:24,800 --> 00:17:28,440 Speaker 1: to be. This means that when songs transition from one 299 00:17:28,480 --> 00:17:31,200 Speaker 1: song to another, it can take a little bit before 300 00:17:31,240 --> 00:17:33,879 Speaker 1: the phone registers the change.
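The fingerprint-and-lookup idea described above can be sketched in a few lines. This is a toy version under loud assumptions: the "fingerprint" here is just the loudest frequency bin in each short frame, the "songs" are synthetic tones, and every name (`fingerprint`, `best_match`, `synth`, `FRAME_SIZE`) is made up for illustration. It is not the algorithm Google or Shazam actually uses.

```python
import math
import random

FRAME_SIZE = 64  # samples per analysis frame (hypothetical value)

def dominant_bin(frame):
    """Return the frequency bin with the most energy, using a naive DFT."""
    n = len(frame)
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin

def fingerprint(samples):
    """Reduce audio to a tuple holding the loudest bin of each full frame."""
    return tuple(
        dominant_bin(samples[start:start + FRAME_SIZE])
        for start in range(0, len(samples) - FRAME_SIZE + 1, FRAME_SIZE)
    )

def synth(bins, noise=0.0, seed=0):
    """Build a fake 'song': one pure tone per frame, plus optional noise."""
    rng = random.Random(seed)
    return [
        math.sin(2 * math.pi * k * i / FRAME_SIZE) + rng.gauss(0.0, noise)
        for k in bins
        for i in range(FRAME_SIZE)
    ]

def best_match(query_fp, database):
    """Slide the query along each stored fingerprint; return (title, best score)."""
    best_title, best_score = None, 0.0
    for title, fp in database.items():
        for offset in range(len(fp) - len(query_fp) + 1):
            window = fp[offset:offset + len(query_fp)]
            score = sum(a == b for a, b in zip(query_fp, window)) / len(query_fp)
            if score > best_score:
                best_title, best_score = title, score
    return best_title, best_score

# The on-device "database": fingerprints only, no audio and no network access.
database = {
    "song_a": fingerprint(synth([3, 7, 5, 9, 4, 8])),
    "song_b": fingerprint(synth([6, 2, 10, 3, 11, 5])),
}
heard = synth([7, 5, 9], noise=0.2)  # a noisy excerpt from the middle of song_a
title, score = best_match(fingerprint(heard), database)
print(title, score)
```

The design point the sketch is meant to show: only compact fingerprints need to live on the phone, so the lookup can happen with the radios off. Noisier audio would flip more frames and lower the score, which fits the observation that cleaner audio matches faster.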
It all depends on the 301 00:17:33,920 --> 00:17:38,040 Speaker 1: acoustic quality of the environment and where in this sampling 302 00:17:38,160 --> 00:17:42,440 Speaker 1: cycle the phone is at any given time. So that's 303 00:17:42,480 --> 00:17:45,840 Speaker 1: not quite as creepy because everything's local on the device. 304 00:17:45,920 --> 00:17:49,159 Speaker 1: It's not sending any data out anywhere else. It's not 305 00:17:49,280 --> 00:17:52,240 Speaker 1: listening to what I'm listening to and alerting Google 306 00:17:52,400 --> 00:17:55,359 Speaker 1: to let them know, hey, Jonathan's once again listening to 307 00:17:55,400 --> 00:17:59,960 Speaker 1: the soundtrack to Be More Chill, which would be an 308 00:18:00,040 --> 00:18:03,000 Speaker 1: accurate suggestion that it would make because I do listen 309 00:18:03,040 --> 00:18:05,840 Speaker 1: to that a lot. Anyway, you can use this feature 310 00:18:06,520 --> 00:18:09,560 Speaker 1: to learn more about the track, the artist, the album, 311 00:18:09,600 --> 00:18:13,320 Speaker 1: including potentially purchasing that music. And those features do connect 312 00:18:13,359 --> 00:18:16,679 Speaker 1: to the outside world through WiFi or cellular connections, but 313 00:18:16,760 --> 00:18:20,639 Speaker 1: that requires an extra step on the part of the user. Also, 314 00:18:20,680 --> 00:18:23,520 Speaker 1: Google pushes out updates to this database with the most 315 00:18:23,520 --> 00:18:27,560 Speaker 1: popular songs, and these are regionalized to reflect the country 316 00:18:27,560 --> 00:18:31,240 Speaker 1: you're in, because you're less likely to run into, say, 317 00:18:31,600 --> 00:18:35,320 Speaker 1: a Peruvian pop song when you're in Scotland. The push 318 00:18:35,440 --> 00:18:39,320 Speaker 1: updates do happen over WiFi or cellular local connections.
But 319 00:18:39,960 --> 00:18:42,920 Speaker 1: this is just the reference data that analyzed music 320 00:18:42,960 --> 00:18:47,080 Speaker 1: gets compared against. An app like Shazam, on the other hand, 321 00:18:47,520 --> 00:18:50,400 Speaker 1: connects to the cloud, but you also have to activate 322 00:18:50,440 --> 00:18:52,760 Speaker 1: the app to have it listen to the audio, so 323 00:18:53,160 --> 00:18:56,439 Speaker 1: it's a user choice to have the app listen. So 324 00:18:56,480 --> 00:18:59,040 Speaker 1: this is more like a push to talk device, except 325 00:18:59,040 --> 00:19:02,439 Speaker 1: it's push to listen. Shazam is also analyzing music to 326 00:19:02,480 --> 00:19:05,399 Speaker 1: suss out a digital fingerprint for the audio, but it 327 00:19:05,480 --> 00:19:09,480 Speaker 1: can compare the sampled audio against a much larger database 328 00:19:09,800 --> 00:19:13,239 Speaker 1: consisting of millions of songs, rather than the tens of 329 00:19:13,280 --> 00:19:16,439 Speaker 1: thousands you would find on the Pixel 2 Now Playing feature. 330 00:19:17,040 --> 00:19:20,320 Speaker 1: More importantly, I think it's fair to say this isn't 331 00:19:20,359 --> 00:19:23,679 Speaker 1: a creepy use of the technology, since the listening feature 332 00:19:23,760 --> 00:19:27,240 Speaker 1: only activates on the user's command rather than just being 333 00:19:27,320 --> 00:19:30,320 Speaker 1: on by default. Now, this isn't that much different than 334 00:19:30,359 --> 00:19:34,440 Speaker 1: what virtual assistants are doing when you use them.
Clearly, 335 00:19:35,000 --> 00:19:38,359 Speaker 1: the microphone on a virtual assistant like Google Home or 336 00:19:38,440 --> 00:19:41,960 Speaker 1: Siri or whatever, it has to be active all the time, 337 00:19:42,040 --> 00:19:44,879 Speaker 1: otherwise you wouldn't get a response when you used whatever 338 00:19:44,920 --> 00:19:48,800 Speaker 1: the keyword or phrase was to activate the assistant. I'm 339 00:19:48,800 --> 00:19:52,440 Speaker 1: going to try and avoid saying any of those phrases, 340 00:19:52,520 --> 00:19:54,399 Speaker 1: by the way, because I don't want those of you 341 00:19:54,520 --> 00:19:57,280 Speaker 1: who have those devices to deal with the frustration of 342 00:19:57,320 --> 00:20:01,200 Speaker 1: them going off in response to something I say. Now, 343 00:20:01,200 --> 00:20:05,000 Speaker 1: those words or phrases have a specific sound, just like 344 00:20:05,240 --> 00:20:09,040 Speaker 1: music does. In this case, we're talking about phonemes, which 345 00:20:09,040 --> 00:20:12,440 Speaker 1: are recognizable sounds found in language. So in English there 346 00:20:12,480 --> 00:20:16,560 Speaker 1: are forty four phonemes. The order and combination of those 347 00:20:16,560 --> 00:20:19,560 Speaker 1: phonemes are the key. So if you say something that 348 00:20:19,680 --> 00:20:23,000 Speaker 1: has those phonemes in the right order, or if it's 349 00:20:23,119 --> 00:20:26,440 Speaker 1: close enough, if it's in a noisy environment, this can 350 00:20:26,480 --> 00:20:30,560 Speaker 1: activate the virtual assistant. It's like a key fitting into 351 00:20:30,600 --> 00:20:33,640 Speaker 1: a lock. Now, if you're saying other stuff, it's like 352 00:20:33,680 --> 00:20:37,000 Speaker 1: the wrong key is inserted and nothing happens. It's only 353 00:20:37,000 --> 00:20:39,720 Speaker 1: when you say something that fits the lock that the 354 00:20:39,760 --> 00:20:45,000 Speaker 1: assistant activates.
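The key-and-lock idea, including the "close enough" part, can be sketched in a few lines. This is only an illustration under stated assumptions: the phoneme labels are made-up placeholders for a made-up wake phrase, and real assistants use trained acoustic models rather than a simple edit-distance check like this one.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, free if the phonemes match
            ))
        prev = cur
    return prev[-1]


# Hypothetical phoneme sequence for a hypothetical wake phrase (the "lock").
WAKE_PHONEMES = ["S", "K", "AY", "N", "EH", "T"]


def should_wake(heard, max_errors=1):
    """Wake only if the heard phonemes are within max_errors edits of the wake phrase."""
    return edit_distance(heard, WAKE_PHONEMES) <= max_errors


if __name__ == "__main__":
    print(should_wake(["S", "K", "AY", "N", "EH", "T"]))  # exact key
    print(should_wake(["S", "K", "AY", "N", "IH", "T"]))  # one phoneme off, still close enough
    print(should_wake(["HH", "AH", "L", "OW"]))           # wrong key, nothing happens
```

Raising `max_errors` makes the assistant more forgiving in noisy rooms, and also more likely to go off when nobody asked it to, which is the trade-off behind accidental activations.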
This process continues after activation. When you talk 355 00:20:45,080 --> 00:20:48,960 Speaker 1: to the virtual assistant, it analyzes your speech by phonemes. 356 00:20:49,920 --> 00:20:53,000 Speaker 1: Software processes those to figure out what words you are 357 00:20:53,080 --> 00:20:56,520 Speaker 1: actually saying. Well, for the first step, that is, because 358 00:20:56,560 --> 00:21:00,199 Speaker 1: it's actually more complicated than that. So, for example, there 359 00:21:00,240 --> 00:21:03,440 Speaker 1: are homonyms. These are words that have a similar sound 360 00:21:03,760 --> 00:21:08,480 Speaker 1: but different meanings and often different spellings. An easy example 361 00:21:08,600 --> 00:21:12,080 Speaker 1: is the number eight and the past tense of to eat, 362 00:21:12,520 --> 00:21:16,520 Speaker 1: such as I ate an entire bowl of cao. Mm 363 00:21:16,600 --> 00:21:22,840 Speaker 1: hmm okay. So those two words, eight and ate, sound 364 00:21:22,920 --> 00:21:26,199 Speaker 1: exactly the same, but they have different meanings. Now that 365 00:21:26,240 --> 00:21:29,400 Speaker 1: means the software can't rely on just the sounds you're 366 00:21:29,440 --> 00:21:32,000 Speaker 1: making when you speak to figure out what you mean. 367 00:21:32,480 --> 00:21:36,120 Speaker 1: It has to actually analyze syntax and context and make judgment 368 00:21:36,160 --> 00:21:38,960 Speaker 1: calls about what you are actually meaning when you say 369 00:21:38,960 --> 00:21:43,040 Speaker 1: these things. Sometimes it gets things right, sometimes it gets 370 00:21:43,040 --> 00:21:45,840 Speaker 1: things wrong. But don't be too hard on it, because 371 00:21:46,160 --> 00:21:50,000 Speaker 1: humans misunderstand other humans all the time. Even when we 372 00:21:50,040 --> 00:21:52,719 Speaker 1: are both communicating in the same language, we 373 00:21:52,760 --> 00:21:56,600 Speaker 1: can misunderstand each other.
Now, this is still just the 374 00:21:56,680 --> 00:22:00,000 Speaker 1: first step. You can think of this as essentially speech 375 00:22:00,000 --> 00:22:02,960 Speaker 1: to text. From there, you have to determine what 376 00:22:03,160 --> 00:22:06,320 Speaker 1: is actually being asked by the speaker, what is the 377 00:22:06,400 --> 00:22:11,600 Speaker 1: intent behind the words. If someone speaks French very slowly 378 00:22:11,640 --> 00:22:14,199 Speaker 1: to me, I might be able to spell out what 379 00:22:14,359 --> 00:22:17,400 Speaker 1: is being said phonetically, but that doesn't mean I understand 380 00:22:17,440 --> 00:22:21,360 Speaker 1: the actual content of what was spoken. And to complicate matters, 381 00:22:21,640 --> 00:22:23,560 Speaker 1: there are a lot of different ways to ask for 382 00:22:23,600 --> 00:22:27,199 Speaker 1: the same information. I might say what's the weather for 383 00:22:27,240 --> 00:22:30,280 Speaker 1: this week? Or will I need an umbrella today, or 384 00:22:30,320 --> 00:22:32,879 Speaker 1: one of a dozen other ways to inquire about the weather. 385 00:22:33,359 --> 00:22:36,479 Speaker 1: The software has to be able to determine what the 386 00:22:36,560 --> 00:22:40,960 Speaker 1: intent was behind my question, and then there's another step, 387 00:22:41,280 --> 00:22:45,280 Speaker 1: which is matching intent with action. The assistant has to 388 00:22:45,359 --> 00:22:48,679 Speaker 1: respond to my request, and hopefully it does so in 389 00:22:48,680 --> 00:22:51,320 Speaker 1: a way that's relevant to whatever I was asking about 390 00:22:51,320 --> 00:22:53,840 Speaker 1: in the first place. So if I ask my virtual 391 00:22:53,880 --> 00:22:56,720 Speaker 1: assistant for an update on the weather, I'm not going 392 00:22:56,760 --> 00:22:59,679 Speaker 1: to be impressed if it instead tells me about the 393 00:22:59,720 --> 00:23:03,720 Speaker 1: traffic or vice versa.
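The two steps described here, working out intent from already-transcribed text and then matching intent with action, can be sketched as follows. This is a deliberately simple keyword approach and not how any particular assistant does it. The intent names, keyword lists, and canned responses are all assumptions for illustration.

```python
# Hypothetical keyword sets: several different phrasings map to the same intent.
INTENT_KEYWORDS = {
    "weather": {"weather", "umbrella", "rain", "forecast"},
    "traffic": {"traffic", "commute", "congestion"},
}

# Hypothetical actions: each intent is tied to one thing the assistant would do.
ACTIONS = {
    "weather": lambda: "Fetching the forecast...",
    "traffic": lambda: "Checking traffic...",
}


def detect_intent(text):
    """Pick the intent whose keywords overlap most with the words in the text."""
    words = set(text.lower().replace("?", "").replace("'", " ").split())
    scores = {intent: len(words & keywords) for intent, keywords in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def respond(text):
    """Match the detected intent with its action, or admit the request was not understood."""
    intent = detect_intent(text)
    if intent is None:
        return "Sorry, I didn't catch that."
    return ACTIONS[intent]()


if __name__ == "__main__":
    print(respond("What's the weather for this week?"))
    print(respond("Will I need an umbrella today?"))
    print(respond("How is the traffic?"))
```

Both weather questions land on the same intent even though they share no keywords with each other, which is the point of separating "what words were said" from "what was being asked".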
And as assistants get connected 394 00:23:03,760 --> 00:23:08,320 Speaker 1: into more systems like security systems, lights, apps, and more, 395 00:23:08,760 --> 00:23:12,520 Speaker 1: the software has to send appropriate commands to these other 396 00:23:12,600 --> 00:23:16,679 Speaker 1: elements to produce the expected results. Now, this is all impressive, 397 00:23:17,000 --> 00:23:20,040 Speaker 1: and because it's impressive, it could be a little scary 398 00:23:20,160 --> 00:23:23,639 Speaker 1: when we think about assistants as hanging on our every word. 399 00:23:23,760 --> 00:23:27,440 Speaker 1: Are they always listening? Are they always paying attention? Now, 400 00:23:27,480 --> 00:23:30,760 Speaker 1: they're always monitoring sound, but they're not doing so in 401 00:23:30,800 --> 00:23:34,520 Speaker 1: an effort to broadcast or record information. They are on 402 00:23:34,720 --> 00:23:39,399 Speaker 1: alert for that initiating phrase or word. They ignore everything else. 403 00:23:40,200 --> 00:23:43,399 Speaker 1: More on that a little bit later. Now that being said, 404 00:23:43,800 --> 00:23:47,280 Speaker 1: there are ways in which someone could hack an assistant 405 00:23:47,560 --> 00:23:51,199 Speaker 1: or a phone, or really any connected device that has 406 00:23:51,240 --> 00:23:55,719 Speaker 1: a microphone in order to eavesdrop using that device's microphone. 407 00:23:56,359 --> 00:23:59,280 Speaker 1: Edward Snowden revealed that the NSA used such 408 00:23:59,320 --> 00:24:03,520 Speaker 1: tactics in the agency's surveillance efforts. Apps that have access 409 00:24:03,560 --> 00:24:06,600 Speaker 1: to your phone's camera and microphone for the purposes of 410 00:24:06,640 --> 00:24:10,680 Speaker 1: sharing video, audio, and related features can do some disturbing 411 00:24:10,720 --> 00:24:13,800 Speaker 1: stuff if they're compromised.
They can also do some disturbing 412 00:24:13,800 --> 00:24:16,520 Speaker 1: stuff if they're not compromised, but if the party behind 413 00:24:16,560 --> 00:24:22,240 Speaker 1: it is malicious. Felix Krause made such an app as 414 00:24:22,280 --> 00:24:26,159 Speaker 1: a proof of concept for iOS devices. The app, like 415 00:24:26,240 --> 00:24:29,679 Speaker 1: many others, asked the user for permission to access the camera. 416 00:24:30,040 --> 00:24:32,639 Speaker 1: Krause stated that once a user agreed to this, the 417 00:24:32,640 --> 00:24:36,240 Speaker 1: app could access both the front and back camera anytime 418 00:24:36,280 --> 00:24:38,800 Speaker 1: the app was in the foreground of the iOS device. 419 00:24:39,160 --> 00:24:42,159 Speaker 1: It could take videos and pictures with no indication to 420 00:24:42,200 --> 00:24:44,560 Speaker 1: the user that such a thing was happening, and it 421 00:24:44,600 --> 00:24:47,360 Speaker 1: could upload that data to a remote server. It could 422 00:24:47,400 --> 00:24:51,639 Speaker 1: even run real time facial recognition software. Now does this 423 00:24:51,720 --> 00:24:56,360 Speaker 1: mean apps like Facebook's Messenger or YouTube are doing this? Well, 424 00:24:56,359 --> 00:24:59,480 Speaker 1: not necessarily, but it does mean it's at least possible 425 00:24:59,600 --> 00:25:03,639 Speaker 1: to do, and nothing is stopping a more, let's say, 426 00:25:03,680 --> 00:25:08,399 Speaker 1: ethically unconcerned app from doing just that. So what can 427 00:25:08,440 --> 00:25:12,480 Speaker 1: you do to protect yourself from bad actors? Uh, here's 428 00:25:12,520 --> 00:25:16,160 Speaker 1: the bad news. Not much. You could go without using 429 00:25:16,160 --> 00:25:19,480 Speaker 1: such devices and apps in the first place. That's pretty 430 00:25:19,560 --> 00:25:23,520 Speaker 1: darn restrictive.
Krause recommended using camera covers to obscure the 431 00:25:23,520 --> 00:25:27,440 Speaker 1: phone's cameras when you weren't actively using them, or revoking 432 00:25:27,520 --> 00:25:30,800 Speaker 1: camera access to the various apps on the phone. And 433 00:25:30,920 --> 00:25:35,000 Speaker 1: that's about it. Yikes. Now, when we come back, I'll 434 00:25:35,040 --> 00:25:38,479 Speaker 1: cover a related topic that's been in the news lately. 435 00:25:38,520 --> 00:25:49,280 Speaker 1: But first let's take another quick break. Okay, so we 436 00:25:49,400 --> 00:25:52,720 Speaker 1: know it's possible to use cameras and microphones against people, 437 00:25:52,960 --> 00:25:56,560 Speaker 1: either with malware or what amounts to a security loophole 438 00:25:56,680 --> 00:26:00,240 Speaker 1: between handset hardware and apps. But there's something else we 439 00:26:00,240 --> 00:26:03,760 Speaker 1: need to chat about, and that's humans listening in on 440 00:26:03,840 --> 00:26:08,160 Speaker 1: what were assumed to be private conversations and messages. Now 441 00:26:08,160 --> 00:26:12,440 Speaker 1: here's the context. In August two thousand nineteen, several major 442 00:26:12,480 --> 00:26:17,480 Speaker 1: media outlets reported an upsetting revelation, namely that Facebook had 443 00:26:17,480 --> 00:26:20,520 Speaker 1: been sending out audio files that users were creating in 444 00:26:20,720 --> 00:26:24,760 Speaker 1: Facebook Messenger, for example. And these were audio clips sent 445 00:26:24,960 --> 00:26:28,720 Speaker 1: through Messenger itself, so it's akin to a private text 446 00:26:28,840 --> 00:26:32,000 Speaker 1: to a friend. And Facebook was sending these audio files 447 00:26:32,040 --> 00:26:36,359 Speaker 1: to a third party contractor to transcribe that audio.
So 448 00:26:36,400 --> 00:26:40,159 Speaker 1: imagine having a private text message thread sent to a 449 00:26:40,320 --> 00:26:43,600 Speaker 1: complete stranger for review. It was similar to that, except 450 00:26:43,600 --> 00:26:47,080 Speaker 1: it was audio, not text. So what's actually going on? Well, 451 00:26:47,320 --> 00:26:49,520 Speaker 1: Facebook said this all had to do with users who 452 00:26:49,560 --> 00:26:54,200 Speaker 1: had opted into having their audio messages transcribed automatically. Essentially, 453 00:26:54,960 --> 00:26:59,360 Speaker 1: it was all about using the voice to text option 454 00:26:59,800 --> 00:27:06,320 Speaker 1: in Facebook. Now, according to Express Computer, this option didn't 455 00:27:06,359 --> 00:27:09,720 Speaker 1: really have a warning that let you know that those 456 00:27:10,359 --> 00:27:13,560 Speaker 1: audio files you were creating through this voice to text 457 00:27:13,640 --> 00:27:18,040 Speaker 1: feature would go to be heard by any humans out there. 458 00:27:18,560 --> 00:27:21,760 Speaker 1: In fact, they said that the warning that would pop up, 459 00:27:21,840 --> 00:27:25,800 Speaker 1: or the notification that popped up, said, "Turn on voice 460 00:27:25,840 --> 00:27:31,199 Speaker 1: to text in this chat" using Facebook Messenger, and above 461 00:27:31,280 --> 00:27:34,119 Speaker 1: the no and yes buttons where you would choose one 462 00:27:34,160 --> 00:27:38,040 Speaker 1: of these options, Facebook further would describe the option: "Display 463 00:27:38,200 --> 00:27:41,720 Speaker 1: text of voice clips you send and receive. You can 464 00:27:41,720 --> 00:27:45,240 Speaker 1: control whether text is visible to you for each chat." 465 00:27:46,359 --> 00:27:49,520 Speaker 1: So again it makes it sound like, oh, this is 466 00:27:49,520 --> 00:27:52,080 Speaker 1: all automated. If I use voice to text, I just 467 00:27:52,320 --> 00:27:55,760 Speaker 1: say a phrase, the text shows up.
I might have 468 00:27:55,800 --> 00:27:58,840 Speaker 1: to make some adjustments to the text, maybe it has 469 00:27:58,960 --> 00:28:01,560 Speaker 1: misinterpreted one of the words or whatever. But it's sort of 470 00:28:01,600 --> 00:28:06,520 Speaker 1: a hands free approach to sending messages in Messenger. Lots 471 00:28:06,560 --> 00:28:09,520 Speaker 1: of apps use voice to text features, and in theory 472 00:28:10,000 --> 00:28:12,760 Speaker 1: it's a pretty great feature. You can dictate a message 473 00:28:12,800 --> 00:28:15,280 Speaker 1: to be sent to your friend without having to stare 474 00:28:15,359 --> 00:28:18,520 Speaker 1: at the screen and type or swipe on a keyboard. 475 00:28:19,200 --> 00:28:22,800 Speaker 1: Tons of folks use features like this if they want 476 00:28:22,840 --> 00:28:25,680 Speaker 1: to interact with an app while they're driving, for example, 477 00:28:25,720 --> 00:28:29,440 Speaker 1: to minimize the distractions they have as they putter around. 478 00:28:30,000 --> 00:28:34,200 Speaker 1: But you'll notice those messages don't seem to indicate anywhere 479 00:28:34,960 --> 00:28:37,800 Speaker 1: that the voice to text recordings could be sent to 480 00:28:38,000 --> 00:28:42,959 Speaker 1: a human being for review. Express Computer further explains that 481 00:28:43,160 --> 00:28:47,200 Speaker 1: even on a supplemental page explaining the voice to text feature, 482 00:28:48,040 --> 00:28:51,280 Speaker 1: Facebook fails to mention that human beings will be reviewing 483 00:28:51,320 --> 00:28:56,040 Speaker 1: that material. Instead, the supplemental page talks about how voice 484 00:28:56,040 --> 00:28:59,680 Speaker 1: to text uses machine learning to get better at interpreting 485 00:28:59,680 --> 00:29:02,160 Speaker 1: what you're saying, so that it becomes more useful to 486 00:29:02,200 --> 00:29:05,840 Speaker 1: you the more you actually use the feature.
So the 487 00:29:05,880 --> 00:29:10,520 Speaker 1: concept here was that some voice recognition software would transcribe 488 00:29:10,560 --> 00:29:13,880 Speaker 1: this audio. Google Voice also used to do this for 489 00:29:14,000 --> 00:29:17,760 Speaker 1: voice messages. I remember getting voicemails from my mother, who 490 00:29:17,840 --> 00:29:21,600 Speaker 1: has a Southern US dialect as do I, but hers 491 00:29:21,720 --> 00:29:25,520 Speaker 1: is more pronounced. The Google Voice speech to text program 492 00:29:25,640 --> 00:29:30,840 Speaker 1: had problems interpreting my mother's messages, and frequently the transcription 493 00:29:30,880 --> 00:29:34,520 Speaker 1: would be hilariously off track, and most of the time 494 00:29:34,720 --> 00:29:37,200 Speaker 1: I wouldn't even be able to guess what the original 495 00:29:37,240 --> 00:29:40,800 Speaker 1: message was based off the transcription. It meant that I 496 00:29:40,840 --> 00:29:43,240 Speaker 1: would listen to the voicemail and then I would shake 497 00:29:43,280 --> 00:29:46,240 Speaker 1: my head a lot as I would read the transcription 498 00:29:46,320 --> 00:29:48,520 Speaker 1: at the same time and just see how far off 499 00:29:48,600 --> 00:29:53,320 Speaker 1: it was. This is a big challenge for voice recognition programs. 500 00:29:53,560 --> 00:29:57,280 Speaker 1: There are a lot of different dialects and accents. People 501 00:29:57,320 --> 00:30:01,080 Speaker 1: from different regions within the same country can sound very 502 00:30:01,160 --> 00:30:04,680 Speaker 1: different even if they're speaking the exact same language. 
If 503 00:30:04,680 --> 00:30:08,760 Speaker 1: you get someone from Savannah, Georgia, a native of Savannah, Georgia, 504 00:30:09,000 --> 00:30:12,960 Speaker 1: and a native from Boston, Massachusetts, they're going to be 505 00:30:13,000 --> 00:30:15,600 Speaker 1: able to have a conversation with each other, but they 506 00:30:15,640 --> 00:30:19,280 Speaker 1: will end up saying the same words very differently from 507 00:30:19,280 --> 00:30:22,880 Speaker 1: one another. And that's before you even start talking about 508 00:30:22,960 --> 00:30:26,760 Speaker 1: people who have a different native language, who have learned 509 00:30:26,800 --> 00:30:30,560 Speaker 1: English and have a foreign accent on top of the 510 00:30:30,560 --> 00:30:34,120 Speaker 1: English they speak. There's no hard and fast rule you 511 00:30:34,160 --> 00:30:37,640 Speaker 1: can create for a voice recognition program to follow to 512 00:30:37,800 --> 00:30:42,040 Speaker 1: interpret speech correctly throughout a language. Because there's so much 513 00:30:42,120 --> 00:30:45,000 Speaker 1: variation in how the words in that language are said, 514 00:30:45,600 --> 00:30:49,479 Speaker 1: training the model becomes a challenge. So one thing you 515 00:30:49,560 --> 00:30:53,960 Speaker 1: can do is you have a human being transcribe spoken 516 00:30:54,000 --> 00:30:59,600 Speaker 1: words and then compare the human transcription against the machine 517 00:30:59,680 --> 00:31:03,120 Speaker 1: produced transcription in an effort to train your model to 518 00:31:03,200 --> 00:31:07,840 Speaker 1: be more effective. Humans are pretty good, though not perfect, 519 00:31:08,000 --> 00:31:11,800 Speaker 1: at figuring out what some other human says, assuming both 520 00:31:11,840 --> 00:31:15,200 Speaker 1: parties are fluent in the same language.
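One common way to put a number on how far a machine transcription is from a human one is word error rate. Whether Facebook or its contractors used this exact measure isn't stated in the episode, so treat the sketch below as a generic illustration. The function name and the example sentences are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two transcripts, divided by the reference length.

    reference  -- the human transcription, treated as the correct answer
    hypothesis -- the machine-produced transcription being checked
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Standard dynamic-programming edit distance, computed over words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # word missing from the machine transcript
                cur[j - 1] + 1,          # extra word in the machine transcript
                prev[j - 1] + (r != h),  # wrong word, free if the words match
            ))
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    human = "I ate an entire bowl"
    machine = "I eight an entire bowl"
    # One wrong word out of five.
    print(word_error_rate(human, machine))
```

A score like this only says how wrong the machine was. Someone still has to produce the human transcript to compare against, which is where the privacy question in this episode comes from.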
By comparing these 521 00:31:15,200 --> 00:31:17,800 Speaker 1: two records against each other and then making corrections to 522 00:31:17,840 --> 00:31:21,560 Speaker 1: the model, computer scientists can tweak their voice recognition software 523 00:31:21,560 --> 00:31:25,479 Speaker 1: models to be more accurate. Now, ideally you would do 524 00:31:25,520 --> 00:31:29,440 Speaker 1: this before unleashing such a system on the public, but 525 00:31:29,760 --> 00:31:33,360 Speaker 1: that's not really that practical. There is no in lab 526 00:31:33,520 --> 00:31:36,280 Speaker 1: project that is going to come close to generating the 527 00:31:36,360 --> 00:31:39,800 Speaker 1: amount of data and the sheer variety that you will 528 00:31:39,880 --> 00:31:43,360 Speaker 1: encounter out in the real world. Improving the model would 529 00:31:43,360 --> 00:31:47,360 Speaker 1: happen much faster with a larger sample of subjects using 530 00:31:47,480 --> 00:31:50,520 Speaker 1: the model, and a billion or so people is a 531 00:31:50,560 --> 00:31:55,400 Speaker 1: pretty darn big sample size. But that means sending these 532 00:31:55,440 --> 00:31:59,320 Speaker 1: audio files to humans in the first place. And Facebook 533 00:31:59,320 --> 00:32:02,520 Speaker 1: has said that the files were anonymized so that there 534 00:32:02,560 --> 00:32:06,240 Speaker 1: was no identifiable name or anything associated with each of 535 00:32:06,240 --> 00:32:09,440 Speaker 1: the audio files being sent for human review. But hey, 536 00:32:09,600 --> 00:32:12,360 Speaker 1: I hear you say. Earlier in this episode, you pointed 537 00:32:12,360 --> 00:32:14,480 Speaker 1: out how it's possible to really get an idea about 538 00:32:14,480 --> 00:32:18,640 Speaker 1: a person just from the other data they provide, and 539 00:32:18,720 --> 00:32:22,520 Speaker 1: you'd be right. 
These audio files had all sorts of 540 00:32:22,560 --> 00:32:25,480 Speaker 1: different types of content in them, some of it was 541 00:32:25,600 --> 00:32:30,719 Speaker 1: likely upsetting disturbing or inappropriate. Contractors who had been hired 542 00:32:30,760 --> 00:32:34,320 Speaker 1: to do the transcription came forward anonymously, I might add, 543 00:32:34,320 --> 00:32:36,520 Speaker 1: because they didn't want to get fired from their jobs, 544 00:32:36,920 --> 00:32:40,040 Speaker 1: and said they felt that the practice was an unethical one. 545 00:32:40,280 --> 00:32:42,680 Speaker 1: And media outlets looked into it and their conclusions were 546 00:32:42,680 --> 00:32:45,480 Speaker 1: pretty much the same. Right down the board, Facebook was 547 00:32:45,600 --> 00:32:49,440 Speaker 1: not transparent about what was happening with this audio, and 548 00:32:49,440 --> 00:32:52,680 Speaker 1: there were no clear indications to users that their audio 549 00:32:52,680 --> 00:32:55,480 Speaker 1: files might get sent to some stranger for the purposes 550 00:32:55,520 --> 00:32:59,280 Speaker 1: of transcription. Now, for its part, Facebook said it halted 551 00:32:59,280 --> 00:33:03,080 Speaker 1: the practice in early August two thousand nineteen, and third 552 00:33:03,120 --> 00:33:06,280 Speaker 1: party contractors have said that that is true that they 553 00:33:06,320 --> 00:33:09,480 Speaker 1: no longer are doing this work for Facebook. Facebook isn't 554 00:33:09,480 --> 00:33:11,680 Speaker 1: the only company to come under scrutiny for this kind 555 00:33:11,720 --> 00:33:15,320 Speaker 1: of thing. Google, Apple, and Microsoft have also been under 556 00:33:15,320 --> 00:33:18,880 Speaker 1: the microscope for very similar practices. Now, on the one hand, 557 00:33:19,320 --> 00:33:22,160 Speaker 1: it's understandable that these companies want to improve their voice 558 00:33:22,200 --> 00:33:26,280 Speaker 1: recognition capabilities. 
It's what makes these apps and products useful 559 00:33:26,720 --> 00:33:29,640 Speaker 1: and makes it more useful to a wider variety of 560 00:33:29,680 --> 00:33:33,120 Speaker 1: people by training the models on this stuff. But the 561 00:33:33,160 --> 00:33:37,040 Speaker 1: privacy concerns remain and it's something that isn't just troubling 562 00:33:37,080 --> 00:33:39,640 Speaker 1: to users, but to the people actually being paid to 563 00:33:39,720 --> 00:33:42,480 Speaker 1: transcribe the stuff in the first place. Now, it would 564 00:33:42,520 --> 00:33:46,160 Speaker 1: be another matter if the companies were transparent about this practice. 565 00:33:46,480 --> 00:33:50,040 Speaker 1: If users knew that there's a chance a real, live 566 00:33:50,120 --> 00:33:52,200 Speaker 1: human being would be listening in on some of those 567 00:33:52,240 --> 00:33:55,680 Speaker 1: voice messages for the purposes of quality control for the 568 00:33:55,760 --> 00:33:59,000 Speaker 1: voice to text feature, maybe they wouldn't opt into using 569 00:33:59,000 --> 00:34:01,239 Speaker 1: the voice to text in the first place, or they 570 00:34:01,320 --> 00:34:05,080 Speaker 1: might opt in and not care. In some cases, I'm 571 00:34:05,080 --> 00:34:07,120 Speaker 1: sure there'd be no shortage of people who would actually 572 00:34:07,160 --> 00:34:11,680 Speaker 1: say truly terrible things, hoping that some poor contractor would 573 00:34:11,719 --> 00:34:13,760 Speaker 1: have to listen to it all and check the audio 574 00:34:13,800 --> 00:34:18,480 Speaker 1: against the automated transcription, because some people would just play nasty. 575 00:34:18,880 --> 00:34:21,480 Speaker 1: Don't be nasty. By the way, there are better ways 576 00:34:21,480 --> 00:34:24,759 Speaker 1: to entertain yourself than by making some other person's life miserable. 
577 00:34:25,560 --> 00:34:30,480 Speaker 1: Facebook could potentially face some serious charges based on this practice. 578 00:34:30,880 --> 00:34:34,279 Speaker 1: The company had settled with the Federal Trade Commission, or FTC, 579 00:34:35,000 --> 00:34:38,320 Speaker 1: earlier in the summer of two thousand nineteen. The settlement 580 00:34:38,400 --> 00:34:43,040 Speaker 1: was for an incredible five billion dollars, and it largely 581 00:34:43,040 --> 00:34:47,400 Speaker 1: revolved around the company's rather abysmal record with privacy. The 582 00:34:47,520 --> 00:34:50,520 Speaker 1: charges date all the way back to two thousand twelve, 583 00:34:50,800 --> 00:34:55,440 Speaker 1: when the FTC brought eight privacy related allegations against Facebook. 584 00:34:55,920 --> 00:34:59,239 Speaker 1: And again, this isn't a big surprise. Zuckerberg had already 585 00:34:59,360 --> 00:35:03,759 Speaker 1: cavalierly proclaimed privacy dead a couple of years before that. Now, 586 00:35:03,760 --> 00:35:07,120 Speaker 1: in the settlement, Facebook agreed to adhere to some rules. 587 00:35:07,400 --> 00:35:11,440 Speaker 1: Those rules said that Facebook was prohibited from making misrepresentations 588 00:35:11,520 --> 00:35:15,920 Speaker 1: about the privacy or security of consumers information, prohibited from 589 00:35:15,960 --> 00:35:20,120 Speaker 1: misrepresenting the extent to which it shares personal data, and 590 00:35:20,239 --> 00:35:24,560 Speaker 1: it required Facebook to implement a reasonable privacy program. Now 591 00:35:24,600 --> 00:35:28,319 Speaker 1: I'm no legal expert, not by a long shot, but 592 00:35:28,400 --> 00:35:32,200 Speaker 1: it seems to me that Facebook's failure to alert users 593 00:35:32,280 --> 00:35:34,640 Speaker 1: that their voice to text data could be sent to 594 00:35:34,760 --> 00:35:39,440 Speaker 1: non Facebook employees for review is in violation of this agreement. 
595 00:35:39,880 --> 00:35:43,080 Speaker 1: That Facebook agreed to these terms in July two thousand nineteen, 596 00:35:43,520 --> 00:35:47,640 Speaker 1: and then continued the practice into August is a big problem. 597 00:35:47,680 --> 00:35:50,160 Speaker 1: Whether or not it will result in further legal action 598 00:35:50,480 --> 00:35:53,840 Speaker 1: against this company is unknown as I record this episode, 599 00:35:54,040 --> 00:35:57,440 Speaker 1: but it seems like it's at least possible. So I'm 600 00:35:57,440 --> 00:36:00,160 Speaker 1: gonna wrap this up. We know that microphones can listen 601 00:36:00,239 --> 00:36:02,440 Speaker 1: in on us without our knowledge. The NSA 602 00:36:02,560 --> 00:36:05,759 Speaker 1: worked on programs in the United States that did exactly that. 603 00:36:06,239 --> 00:36:09,120 Speaker 1: And while companies with virtual personal assistants tell us that 604 00:36:09,160 --> 00:36:13,399 Speaker 1: those assistants only activate when certain phrases are spoken, it's 605 00:36:13,440 --> 00:36:16,760 Speaker 1: also possible that that list of phrases could go well 606 00:36:16,840 --> 00:36:20,480 Speaker 1: beyond the ones published by the company. So, in other words, 607 00:36:20,880 --> 00:36:24,799 Speaker 1: I might know that to wake up my hypothetical virtual assistant, 608 00:36:25,080 --> 00:36:28,759 Speaker 1: I would have to say the alert phrase sky net awaken, 609 00:36:29,200 --> 00:36:31,520 Speaker 1: and then it pays attention. But what if there's a 610 00:36:31,560 --> 00:36:35,680 Speaker 1: whole laundry list of other words or phrases that could 611 00:36:35,719 --> 00:36:38,880 Speaker 1: wake it up so that it records or transcribes whatever 612 00:36:38,960 --> 00:36:43,040 Speaker 1: audio follows.
What if, for example, the phrase shopping or 613 00:36:43,280 --> 00:36:48,240 Speaker 1: going shopping activates it so that whatever follows gets registered 614 00:36:48,280 --> 00:36:50,320 Speaker 1: by the device. So if I tell a friend tomorrow, 615 00:36:50,360 --> 00:36:53,839 Speaker 1: I'm going shopping for some new sneakers, the device has 616 00:36:53,880 --> 00:36:57,279 Speaker 1: registered the phrase new sneakers because it paid attention once 617 00:36:57,320 --> 00:37:00,200 Speaker 1: I said the words going shopping, and then I start seeing 618 00:37:00,200 --> 00:37:03,359 Speaker 1: ads pop up everywhere I go online for sneakers. Now, 619 00:37:03,440 --> 00:37:08,759 Speaker 1: is that something that's possible? Well, yeah, it's possible. That 620 00:37:08,800 --> 00:37:12,399 Speaker 1: doesn't mean it's happening, but it could be. It's also 621 00:37:12,440 --> 00:37:15,440 Speaker 1: possible that my other behaviors have indicated that I'm on 622 00:37:15,480 --> 00:37:19,160 Speaker 1: the lookout for some new kicks. Coincidence is a thing, 623 00:37:19,480 --> 00:37:23,319 Speaker 1: and it's frustrating because without seeing behind the scenes, it's 624 00:37:23,360 --> 00:37:28,120 Speaker 1: hard to draw any firm conclusions. Most of us, myself included, 625 00:37:28,400 --> 00:37:32,000 Speaker 1: have a limited understanding of exactly how much data we're 626 00:37:32,040 --> 00:37:34,719 Speaker 1: generating in our day to day lives and how that 627 00:37:34,840 --> 00:37:38,719 Speaker 1: data can be analyzed for patterns and predictions. We may 628 00:37:38,760 --> 00:37:42,080 Speaker 1: not even be aware that we're heading toward a particular 629 00:37:42,120 --> 00:37:46,840 Speaker 1: decision before an algorithm draws that conclusion, and it's spooky 630 00:37:46,960 --> 00:37:50,080 Speaker 1: and disturbing. But it doesn't necessarily mean that we're being 631 00:37:50,160 --> 00:37:53,440 Speaker 1: spied on by a microphone.
It may mean we're just 632 00:37:53,520 --> 00:37:57,880 Speaker 1: broadcasting our decisions before we've known that we've made a decision, 633 00:37:58,600 --> 00:38:01,640 Speaker 1: and it does indicate that there is some sort of 634 00:38:02,000 --> 00:38:05,800 Speaker 1: eavesdropping going on, just not necessarily audio eavesdropping. 635 00:38:05,800 --> 00:38:09,800 Speaker 1: It's more about all of our other behaviors that humans 636 00:38:09,840 --> 00:38:11,919 Speaker 1: don't pick up on, so we've never had to worry 637 00:38:11,960 --> 00:38:14,840 Speaker 1: about it before, but machines can analyze it at a 638 00:38:14,920 --> 00:38:19,080 Speaker 1: level that is disturbing. In fact, an actual study at 639 00:38:19,120 --> 00:38:22,560 Speaker 1: Northeastern University looked into the possibility of whether or not 640 00:38:22,719 --> 00:38:26,960 Speaker 1: phones were getting activated by clandestine phrases and listening in 641 00:38:27,000 --> 00:38:30,400 Speaker 1: on conversations, and it found that there was no evidence 642 00:38:30,480 --> 00:38:32,920 Speaker 1: that this was happening. They did find that a lot 643 00:38:33,000 --> 00:38:36,360 Speaker 1: of apps were taking screenshots of stuff on phones and 644 00:38:36,400 --> 00:38:39,080 Speaker 1: sending those screenshots to third parties, though, so you know, 645 00:38:39,560 --> 00:38:44,600 Speaker 1: that's also disturbing. But it doesn't appear that these devices 646 00:38:44,600 --> 00:38:48,320 Speaker 1: are actively listening to you all the time and recording 647 00:38:48,400 --> 00:38:54,120 Speaker 1: or transcribing or broadcasting that information anywhere. There's a lot 648 00:38:54,200 --> 00:38:59,600 Speaker 1: to lose from taking that approach.
The problem is it 649 00:38:59,800 --> 00:39:03,239 Speaker 1: is something that is possible, and the other problem is 650 00:39:03,280 --> 00:39:06,239 Speaker 1: that there are other behaviors we're doing that are just 651 00:39:06,320 --> 00:39:09,719 Speaker 1: as revealing, if not more so, than recording what it 652 00:39:09,840 --> 00:39:13,919 Speaker 1: is we're saying, and that without being aware of that, 653 00:39:14,360 --> 00:39:18,040 Speaker 1: we are just giving away more and more information about 654 00:39:18,040 --> 00:39:21,200 Speaker 1: ourselves and more and more control over our own lives. 655 00:39:21,360 --> 00:39:23,360 Speaker 1: And we're going to see more and more targeted ads 656 00:39:23,360 --> 00:39:27,400 Speaker 1: that seem super creepy because they're mentioning things that we 657 00:39:27,400 --> 00:39:31,359 Speaker 1: didn't think anyone knew about, because most people wouldn't pick 658 00:39:31,440 --> 00:39:35,080 Speaker 1: up on it. Fun times. So I don't think this 659 00:39:35,160 --> 00:39:39,800 Speaker 1: was a particularly, you know, um, I don't think this 660 00:39:39,880 --> 00:39:44,440 Speaker 1: show really helps allay any fears. It may just switch 661 00:39:44,520 --> 00:39:48,759 Speaker 1: fears from microphones to everything else. But I did want 662 00:39:48,760 --> 00:39:50,920 Speaker 1: to cover this because a lot of people have been 663 00:39:50,960 --> 00:39:53,319 Speaker 1: talking about it for the last few years, and with 664 00:39:53,520 --> 00:39:59,560 Speaker 1: these transcription services that has brought the whole conversation back 665 00:39:59,640 --> 00:40:02,120 Speaker 1: into the forefront. So I wanted to take an 666 00:40:02,160 --> 00:40:05,080 Speaker 1: opportunity to really tackle it here on the show.
If 667 00:40:05,120 --> 00:40:08,080 Speaker 1: you have a suggestion for a future episode of Tech Stuff, 668 00:40:08,320 --> 00:40:10,920 Speaker 1: send me an email. The address is tech stuff at how 669 00:40:11,000 --> 00:40:13,319 Speaker 1: stuff works dot com, or drop me a line by 670 00:40:13,640 --> 00:40:16,760 Speaker 1: going to tech stuff podcast dot com. You will find 671 00:40:16,920 --> 00:40:20,239 Speaker 1: there a link to all of our archived episodes, as 672 00:40:20,280 --> 00:40:23,120 Speaker 1: well as links to our presence on social media where 673 00:40:23,160 --> 00:40:25,120 Speaker 1: you can get in touch with us, and also a 674 00:40:25,160 --> 00:40:27,640 Speaker 1: link to our online store, where every purchase you make 675 00:40:27,760 --> 00:40:30,880 Speaker 1: goes to help the show. We greatly appreciate your support 676 00:40:31,400 --> 00:40:39,359 Speaker 1: and I will talk to you again really soon. Tech 677 00:40:39,400 --> 00:40:42,040 Speaker 1: Stuff is a production of I Heart Radio's How Stuff Works. 678 00:40:42,200 --> 00:40:45,040 Speaker 1: For more podcasts from I Heart Radio, visit the I 679 00:40:45,160 --> 00:40:48,360 Speaker 1: Heart Radio app, Apple Podcasts, or wherever you listen to 680 00:40:48,400 --> 00:40:49,360 Speaker 1: your favorite shows.