1 00:00:04,440 --> 00:00:12,479 Speaker 1: Welcome to TechStuff, a production from iHeartRadio. Hey there, 2 00:00:12,520 --> 00:00:15,720 Speaker 1: and welcome to TechStuff. I'm your host, Jonathan Strickland. 3 00:00:15,720 --> 00:00:18,799 Speaker 1: I'm an executive producer with iHeart Podcasts, and how the 4 00:00:18,880 --> 00:00:22,239 Speaker 1: tech are you? So I'm getting ready to go on vacation, 5 00:00:22,560 --> 00:00:26,280 Speaker 1: which means we've got some classic episodes lined up for you. Actually, 6 00:00:26,280 --> 00:00:30,040 Speaker 1: these aren't that classic. These came out last year, and 7 00:00:30,080 --> 00:00:33,959 Speaker 1: today I thought I would bring a short one for you. 8 00:00:34,440 --> 00:00:38,840 Speaker 1: This one was published originally on June seventh, twenty twenty three. 9 00:00:39,200 --> 00:00:42,479 Speaker 1: It's a fun little episode. It is titled What Was 10 00:00:42,960 --> 00:00:46,560 Speaker 1: the First MP3? This is like one of those 11 00:00:46,600 --> 00:00:50,200 Speaker 1: pub trivia style TechStuff topics. I hope you enjoy. 12 00:00:52,280 --> 00:00:54,240 Speaker 1: It's time for a TechStuff Tidbits. I'm going to 13 00:00:54,320 --> 00:00:58,720 Speaker 1: answer the question: what was the first MP3? Well, 14 00:00:58,800 --> 00:01:01,960 Speaker 1: here's the too long, didn't read answer. It was Tom's Diner 15 00:01:02,040 --> 00:01:06,360 Speaker 1: by Suzanne Vega. It's a song I personally do not like. 16 00:01:07,000 --> 00:01:09,800 Speaker 1: It's not to say it's a bad song. Just because 17 00:01:09,920 --> 00:01:12,679 Speaker 1: I don't like something doesn't mean it's bad. I just 18 00:01:12,720 --> 00:01:16,000 Speaker 1: mean I personally do not find this song at all appealing. 19 00:01:16,640 --> 00:01:19,520 Speaker 1: But it was, in fact, the first MP3. Now, 20 00:01:19,520 --> 00:01:23,320 Speaker 1: if you don't know Tom's Diner. 
It features Vega giving 21 00:01:23,360 --> 00:01:26,360 Speaker 1: a little slice-of-life moment from the perspective of 22 00:01:26,400 --> 00:01:29,240 Speaker 1: a man sitting in a diner who feels kind of 23 00:01:29,280 --> 00:01:32,960 Speaker 1: distanced from the world around him. In case you need 24 00:01:33,000 --> 00:01:38,000 Speaker 1: a reminder, here's the first verse of the song. I 25 00:01:38,040 --> 00:01:41,240 Speaker 1: am sitting in the morning at the diner on the corner. 26 00:01:41,480 --> 00:01:44,120 Speaker 1: I am waiting at the counter for the man to 27 00:01:44,240 --> 00:01:47,840 Speaker 1: pour the coffee, and he fills it only halfway and 28 00:01:47,920 --> 00:01:51,400 Speaker 1: before I even argue, he is looking out the window 29 00:01:51,840 --> 00:01:58,480 Speaker 1: at somebody coming in. Now, that song doesn't work for me. 30 00:01:59,160 --> 00:02:01,880 Speaker 1: I get that it got really popular, especially after someone 31 00:02:01,920 --> 00:02:04,800 Speaker 1: did an unauthorized remix of it, which is the version 32 00:02:04,920 --> 00:02:08,160 Speaker 1: most people know. But it turned out to be an 33 00:02:08,320 --> 00:02:12,919 Speaker 1: absolutely perfect song to test the MP3 compression algorithm. 34 00:02:13,360 --> 00:02:16,800 Speaker 1: To understand why, we need to learn about the purpose 35 00:02:16,880 --> 00:02:20,040 Speaker 1: of the MP3 compression algorithm in the first place. 36 00:02:20,360 --> 00:02:23,440 Speaker 1: So in this case, the compression we're talking about is 37 00:02:23,520 --> 00:02:27,200 Speaker 1: related to file size. There's an interesting side note. There's 38 00:02:27,240 --> 00:02:30,960 Speaker 1: a different kind of audio compression. 
This refers to the 39 00:02:31,000 --> 00:02:35,360 Speaker 1: reduction of dynamic range in a recording, and by that 40 00:02:35,480 --> 00:02:40,720 Speaker 1: I mean reducing the volume distance between the loudest and 41 00:02:40,800 --> 00:02:44,240 Speaker 1: the softest parts of a recording. That can actually take 42 00:02:44,600 --> 00:02:49,639 Speaker 1: a part in file compression as well, but we're 43 00:02:49,639 --> 00:02:52,080 Speaker 1: going to set that aside. Just put a pin in that, 44 00:02:52,280 --> 00:02:54,920 Speaker 1: take a look at it later on. But with file 45 00:02:55,000 --> 00:02:58,840 Speaker 1: compression generally, the whole goal is to find ways to 46 00:02:58,960 --> 00:03:03,800 Speaker 1: pack information into smaller file sizes. That makes those files 47 00:03:03,840 --> 00:03:07,600 Speaker 1: easier to manage. That's important if you are dealing with 48 00:03:07,639 --> 00:03:11,000 Speaker 1: a limited amount of storage, or maybe you want to 49 00:03:11,080 --> 00:03:13,639 Speaker 1: send the file from one machine to another and you've 50 00:03:13,639 --> 00:03:17,200 Speaker 1: got limited bandwidth, so you need smaller file sizes, or 51 00:03:17,240 --> 00:03:19,760 Speaker 1: else the process is going to take way too long. 52 00:03:20,200 --> 00:03:23,200 Speaker 1: But how do you do it? Well, one approach to 53 00:03:23,520 --> 00:03:27,160 Speaker 1: file compression is to take a real good look at 54 00:03:27,160 --> 00:03:31,480 Speaker 1: the file you're trying to compress, and you ask the question, 55 00:03:32,360 --> 00:03:35,560 Speaker 1: is all the information that is inside this file necessary? 
56 00:03:36,080 --> 00:03:38,840 Speaker 1: Or could I get rid of some of that information 57 00:03:39,400 --> 00:03:43,200 Speaker 1: and still have a usable file on the other side 58 00:03:43,200 --> 00:03:47,080 Speaker 1: of it? With music, that means figuring out which bits 59 00:03:47,160 --> 00:03:50,640 Speaker 1: of data you can drop without it having a noticeable 60 00:03:50,680 --> 00:03:55,640 Speaker 1: effect on the audio quality. Ideally the compressed file would 61 00:03:55,680 --> 00:04:00,600 Speaker 1: be indistinguishable from the original raw audio, but since you're tossing 62 00:04:00,680 --> 00:04:05,160 Speaker 1: out information, that's not necessarily a guarantee. This is what 63 00:04:05,320 --> 00:04:10,520 Speaker 1: makes the MP3 a lossy file format. 64 00:04:10,600 --> 00:04:14,120 Speaker 1: MP3 is just one example of a lossy file format. 65 00:04:14,160 --> 00:04:17,159 Speaker 1: There are others, and the word lossy means just 66 00:04:17,320 --> 00:04:21,039 Speaker 1: exactly what you think. It means that some information is 67 00:04:21,160 --> 00:04:25,359 Speaker 1: tossed aside or lost in the process of compressing the 68 00:04:25,360 --> 00:04:28,520 Speaker 1: file to a smaller size. The folks who worked on 69 00:04:28,560 --> 00:04:32,360 Speaker 1: the MP3 format had to figure out which information 70 00:04:32,920 --> 00:04:35,359 Speaker 1: was most likely to have little to no impact on 71 00:04:35,480 --> 00:04:39,600 Speaker 1: audio quality within an audio file. To do that, they 72 00:04:39,640 --> 00:04:43,880 Speaker 1: had to take into account human psychology and the limitations 73 00:04:44,000 --> 00:04:49,159 Speaker 1: of human hearing. So psychoacoustics played a big part in 74 00:04:49,240 --> 00:04:54,359 Speaker 1: determining the MP3 compression algorithm. 
So for example, by that, 75 00:04:54,480 --> 00:04:58,039 Speaker 1: I mean, let's think of the range of human hearing 76 00:04:58,080 --> 00:05:01,160 Speaker 1: in terms of frequencies for a second. Your typical 77 00:05:01,240 --> 00:05:06,360 Speaker 1: human is able to hear frequencies as low as twenty 78 00:05:06,440 --> 00:05:11,000 Speaker 1: hertz and as high as twenty thousand hertz, or twenty 79 00:05:11,080 --> 00:05:15,760 Speaker 1: kilohertz. Hertz in this case refers to an oscillation per 80 00:05:15,760 --> 00:05:19,039 Speaker 1: second or a vibration per second, so twenty hertz means 81 00:05:19,560 --> 00:05:24,479 Speaker 1: that something is effectively vibrating twenty times per second. So 82 00:05:24,520 --> 00:05:27,640 Speaker 1: if you had a string that, when you plucked it, 83 00:05:27,680 --> 00:05:31,160 Speaker 1: would vibrate twenty times per second, that string is vibrating 84 00:05:31,200 --> 00:05:35,360 Speaker 1: at twenty hertz. That would be a very, very low note. 85 00:05:35,960 --> 00:05:38,880 Speaker 1: The higher the frequency, the higher the pitch, and as 86 00:05:38,920 --> 00:05:41,200 Speaker 1: we age, we tend to lose the ability to hear 87 00:05:41,240 --> 00:05:44,200 Speaker 1: some of those higher pitches, which is why you would 88 00:05:44,240 --> 00:05:48,120 Speaker 1: hear about some convenience stores experimenting with playing very high- 89 00:05:48,200 --> 00:05:52,480 Speaker 1: pitched noises to discourage young punks who wanted to loiter 90 00:05:52,640 --> 00:05:56,960 Speaker 1: in the joint. So human hearing has limitations, and in 91 00:05:57,040 --> 00:06:01,880 Speaker 1: theory you can eliminate sounds that would fall outside of 92 00:06:01,920 --> 00:06:05,960 Speaker 1: those limitations. 
If a sound file contains frequencies that are 93 00:06:06,000 --> 00:06:09,520 Speaker 1: at twenty one kilohertz, but your typical person can't 94 00:06:09,520 --> 00:06:14,560 Speaker 1: hear anything above twenty kilohertz, well, at least theoretically, 95 00:06:14,600 --> 00:06:17,599 Speaker 1: you can just toss that information and it won't change anything. 96 00:06:17,880 --> 00:06:21,240 Speaker 1: If a sound file contains a sound but no one 97 00:06:21,320 --> 00:06:24,279 Speaker 1: has the capacity to hear it, does a tree fall 98 00:06:24,360 --> 00:06:28,440 Speaker 1: in the forest? I might be getting a little lost in 99 00:06:28,480 --> 00:06:32,760 Speaker 1: the woods here. Anyway, that frequency example, that's just one 100 00:06:32,839 --> 00:06:35,520 Speaker 1: example of a sound that humans would have trouble hearing. 101 00:06:35,920 --> 00:06:40,120 Speaker 1: Another is when we hear a very soft sound 102 00:06:40,200 --> 00:06:44,400 Speaker 1: that immediately follows a very loud sound. We don't actually 103 00:06:44,440 --> 00:06:48,200 Speaker 1: perceive the soft one. The loud sound we hear eclipses 104 00:06:48,279 --> 00:06:51,400 Speaker 1: the soft sound, and it turns out we can't hear 105 00:06:51,440 --> 00:06:54,839 Speaker 1: the soft one at all. So again, if we can't 106 00:06:54,839 --> 00:06:58,120 Speaker 1: hear that soft sound that played immediately after a loud one, 107 00:06:58,680 --> 00:07:00,640 Speaker 1: why would you keep it? You know, you might as 108 00:07:00,680 --> 00:07:02,480 Speaker 1: well just get rid of the information. You can't hear 109 00:07:02,520 --> 00:07:05,720 Speaker 1: it anyway. Just get rid of it, save the space. 
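To make those two ideas concrete, here is a toy sketch in Python. It is not how an MP3 encoder actually works; a real encoder runs a far more detailed psychoacoustic model. The `Component` type, the 50 millisecond masking window, and the 30 decibel gap are all hypothetical values chosen for illustration. Only the twenty hertz and twenty kilohertz limits come from the episode.

```python
from dataclasses import dataclass

# Typical limits of human hearing quoted in the episode.
MIN_AUDIBLE_HZ = 20.0
MAX_AUDIBLE_HZ = 20_000.0

# Illustrative assumptions only; these are NOT values from the MP3 standard.
MASK_WINDOW_S = 0.05  # how long a loud sound is assumed to hide what follows
MASK_GAP_DB = 30.0    # how much quieter a sound must be to count as hidden


@dataclass
class Component:
    """A hypothetical, highly simplified 'sound event' in a recording."""
    time_s: float
    freq_hz: float
    level_db: float


def audible(components):
    """Return only the components a typical listener would plausibly hear."""
    # Rule 1: drop anything outside the range of human hearing.
    in_range = sorted(
        (c for c in components if MIN_AUDIBLE_HZ <= c.freq_hz <= MAX_AUDIBLE_HZ),
        key=lambda c: c.time_s,
    )
    kept = []
    for i, c in enumerate(in_range):
        # Rule 2: drop a soft sound that closely follows a much louder one.
        masked = any(
            0 < c.time_s - prev.time_s <= MASK_WINDOW_S
            and prev.level_db - c.level_db >= MASK_GAP_DB
            for prev in in_range[:i]
        )
        if not masked:
            kept.append(c)
    return kept


example = [
    Component(0.00, 440.0, 90.0),     # loud note
    Component(0.02, 880.0, 40.0),     # soft note right after it: masked
    Component(0.50, 21_000.0, 60.0),  # above twenty kilohertz: inaudible
    Component(1.00, 1_000.0, 50.0),   # soft note on its own: kept
]
print([c.freq_hz for c in audible(example)])
```

In this made-up example, only the loud 440 hertz note and the isolated 1,000 hertz note survive. The other two components are the kind of information a lossy format would try to throw away.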
110 00:07:06,520 --> 00:07:10,120 Speaker 1: This psychoacoustic approach to sound would lead the developers of 111 00:07:10,120 --> 00:07:13,360 Speaker 1: the MP3 format to create a strategy regarding what 112 00:07:13,520 --> 00:07:17,600 Speaker 1: information to keep and what information to ditch. On top 113 00:07:17,680 --> 00:07:22,000 Speaker 1: of that, the algorithm had sort of a sliding scale. 114 00:07:22,640 --> 00:07:25,680 Speaker 1: So maybe you want to keep as much information as possible, 115 00:07:25,720 --> 00:07:27,920 Speaker 1: so you select that when you create the MP3. 116 00:07:28,400 --> 00:07:32,000 Speaker 1: So you're losing less information in the process. You're still 117 00:07:32,000 --> 00:07:34,160 Speaker 1: compressing the file, but not to the extent that you 118 00:07:34,320 --> 00:07:38,640 Speaker 1: could if you chose. Or maybe the most important thing to 119 00:07:38,720 --> 00:07:41,080 Speaker 1: you is that you reduce the file size as much 120 00:07:41,080 --> 00:07:44,440 Speaker 1: as you can, so you crank the compression up. Now, 121 00:07:44,480 --> 00:07:47,880 Speaker 1: obviously the harder you go, the more likely you're going 122 00:07:47,920 --> 00:07:50,680 Speaker 1: to lose information that will make a noticeable difference in 123 00:07:50,720 --> 00:07:54,760 Speaker 1: the playback of the audio file, and you would say, oh, 124 00:07:54,800 --> 00:07:57,400 Speaker 1: the quality here is not as good as I thought 125 00:07:57,440 --> 00:08:01,000 Speaker 1: it would be. This is where Tom's Diner comes in. 126 00:08:01,840 --> 00:08:05,160 Speaker 1: Karlheinz Brandenburg, who was one of the leads on 127 00:08:05,280 --> 00:08:09,800 Speaker 1: creating the MP3 format, used Tom's Diner to listen 128 00:08:09,840 --> 00:08:13,920 Speaker 1: back to compressed files and determine how the compression was 129 00:08:13,960 --> 00:08:18,960 Speaker 1: affecting the audio quality. 
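To put rough numbers on the sliding scale between file size and quality described in this part of the episode, here is a small back-of-the-envelope calculation in Python. In MP3 encoders that scale is usually expressed as a bitrate. The episode does not name any bitrates, so the 128 and 320 kilobit per second settings and the three-minute track length are assumptions picked for illustration. The CD audio figure follows from the standard 44,100 samples per second, 16 bits per sample, and two channels.

```python
def size_megabytes(bitrate_kbps, duration_s):
    """Approximate size in megabytes (10**6 bytes) at a constant bitrate."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


# Uncompressed CD audio: 44,100 samples/s x 16 bits x 2 channels = 1411.2 kbps.
CD_KBPS = 44_100 * 16 * 2 / 1000

DURATION_S = 180  # hypothetical three-minute track

for label, kbps in [
    ("uncompressed CD audio", CD_KBPS),
    ("MP3 at 320 kbps (less compression)", 320),
    ("MP3 at 128 kbps (more compression)", 128),
]:
    print(f"{label}: {size_megabytes(kbps, DURATION_S):.2f} MB")
```

Under those assumptions, the three-minute track comes to roughly 31.75 megabytes uncompressed, 7.2 megabytes at 320 kilobits per second, and 2.88 megabytes at 128. The lower the bitrate, the more information the encoder has to discard to hit it.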
So it was a great track 130 00:08:19,040 --> 00:08:23,920 Speaker 1: to use because the actual qualities of the recording itself 131 00:08:24,600 --> 00:08:27,920 Speaker 1: were such that it was easy to detect if something 132 00:08:28,160 --> 00:08:32,200 Speaker 1: was not quite right. The original recording of Tom's Diner 133 00:08:32,320 --> 00:08:36,200 Speaker 1: is not the one that has the catchy beat and 134 00:08:36,240 --> 00:08:38,720 Speaker 1: the horns in it. It's a very simple a cappella 135 00:08:38,800 --> 00:08:42,200 Speaker 1: recording of Suzanne Vega singing her tale of looking at 136 00:08:42,200 --> 00:08:44,760 Speaker 1: the world from a male perspective through a sense of 137 00:08:44,800 --> 00:08:49,800 Speaker 1: distance and detachment. Brandenburg would use that track while tweaking 138 00:08:49,800 --> 00:08:52,920 Speaker 1: the algorithm, trying to create the thin line between an 139 00:08:53,000 --> 00:08:57,440 Speaker 1: effective data compression technique and a minimal impact on sound quality. 140 00:08:57,559 --> 00:09:00,000 Speaker 1: And for her contributions to the effort, although she made 141 00:09:00,160 --> 00:09:04,600 Speaker 1: them unknowingly, Brandenburg would name Suzanne Vega the mother 142 00:09:04,880 --> 00:09:08,800 Speaker 1: of the MP3. Interestingly, Ryan Maguire decided to take 143 00:09:08,840 --> 00:09:12,280 Speaker 1: a sort of negative image of the compressed Tom's Diner. 144 00:09:12,320 --> 00:09:15,960 Speaker 1: He identified sounds that were deleted in the process of 145 00:09:16,000 --> 00:09:19,280 Speaker 1: creating a lossy version of Tom's Diner, and then he 146 00:09:19,360 --> 00:09:22,480 Speaker 1: created a new recording that contained only the bits that 147 00:09:22,679 --> 00:09:26,640 Speaker 1: had been cut from the file. And it's almost like 148 00:09:26,760 --> 00:09:29,439 Speaker 1: listening to the ghost of a song. 
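The "negative image" idea can be sketched in a few lines of Python: run audio through a lossy step, then subtract the result from the original to see what was thrown away. This is only a guess at the general principle, not Maguire's actual method, which the episode does not describe. The `lossy_round_trip` function below is a hypothetical stand-in that crudely rounds samples to a coarse grid; it is not MP3 compression.

```python
import math


def lossy_round_trip(samples, step=0.25):
    """Stand-in for a lossy codec: snap each sample to a coarse grid.

    This is NOT MP3; it only guarantees that some detail is discarded.
    """
    return [round(s / step) * step for s in samples]


def residual(original, decoded):
    """Everything the lossy step removed, sample by sample."""
    return [o - d for o, d in zip(original, decoded)]


# A made-up test signal: 100 samples of a sine wave.
original = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
decoded = lossy_round_trip(original)
ghost = residual(original, decoded)

# Adding the "ghost" back to the lossy version recovers the original.
rebuilt = [d + g for d, g in zip(decoded, ghost)]
print(max(abs(o - r) for o, r in zip(original, rebuilt)))
```

The `ghost` list plays the role of the recording that contains only the bits that were cut. Played on its own, a residual like this is what would sound like the ghost of the song.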
In fact, I 149 00:09:29,480 --> 00:09:33,160 Speaker 1: think they called the project the Ghost of the MP3. 150 00:09:33,240 --> 00:09:35,640 Speaker 1: It's pretty creepy stuff. It would not be out of 151 00:09:35,679 --> 00:09:38,480 Speaker 1: place in a horror movie. The fact that lossy files, 152 00:09:38,480 --> 00:09:41,479 Speaker 1: by definition, lose information in the process of data compression 153 00:09:42,120 --> 00:09:45,480 Speaker 1: meant that audiophiles dismissed the MP3 format as 154 00:09:45,559 --> 00:09:48,800 Speaker 1: inherently inferior to others, at least as far as listening 155 00:09:48,880 --> 00:09:52,440 Speaker 1: experiences go, and there are arguments that some of the 156 00:09:52,520 --> 00:09:56,720 Speaker 1: lost information, while potentially being imperceptible within the song itself, 157 00:09:57,040 --> 00:10:00,720 Speaker 1: helped shape the overall sound and tone of the piece. So 158 00:10:00,760 --> 00:10:05,040 Speaker 1: though you can't directly hear the stuff that's being cut, 159 00:10:05,600 --> 00:10:09,480 Speaker 1: that stuff actually influences how you perceive other things, so 160 00:10:09,960 --> 00:10:13,840 Speaker 1: you still change the experience of hearing the finished audio. 161 00:10:14,080 --> 00:10:17,120 Speaker 1: But the MP3 format created the opportunity to store 162 00:10:17,200 --> 00:10:20,120 Speaker 1: and transfer audio files without having to deal with massive 163 00:10:20,200 --> 00:10:23,960 Speaker 1: raw audio formats, and back in the day that was 164 00:10:24,000 --> 00:10:27,840 Speaker 1: not a trivial thing. And so that is the answer 165 00:10:27,840 --> 00:10:32,320 Speaker 1: to the question. Tom's Diner: the first MP3. I hope 166 00:10:32,320 --> 00:10:36,400 Speaker 1: you're all well, and I'll talk to you again really soon. 167 00:10:42,880 --> 00:10:47,520 Speaker 1: TechStuff is an iHeartRadio production. 
For more podcasts from iHeartRadio, 168 00:10:47,840 --> 00:10:51,559 Speaker 1: visit the iHeartRadio app, Apple Podcasts, or wherever you listen 169 00:10:51,600 --> 00:10:56,200 Speaker 1: to your favorite shows.