1 00:00:04,440 --> 00:00:12,360 Speaker 1: Welcome to Tech Stuff, a production from iHeartRadio. Hey there, 2 00:00:12,360 --> 00:00:15,840 Speaker 1: and welcome to Tech Stuff. I'm your host, Jonathan Strickland. 3 00:00:15,840 --> 00:00:18,920 Speaker 1: I'm an executive producer with iHeartRadio. And how the tech 4 00:00:18,960 --> 00:00:22,000 Speaker 1: are you? It's time for a Tech Stuff Tidbits. I'm 5 00:00:22,000 --> 00:00:26,680 Speaker 1: going to answer the question: what was the first MP three? Well, 6 00:00:26,760 --> 00:00:29,520 Speaker 1: here's the too long, didn't listen answer. It was Tom's 7 00:00:29,560 --> 00:00:33,680 Speaker 1: Diner by Suzanne Vega. It's a song I personally do 8 00:00:33,760 --> 00:00:36,400 Speaker 1: not like. That's not to say it's a bad song. 9 00:00:37,320 --> 00:00:39,520 Speaker 1: Just because I don't like something doesn't mean it's bad. 10 00:00:40,320 --> 00:00:42,840 Speaker 1: I just mean I personally do not find this song 11 00:00:43,120 --> 00:00:46,640 Speaker 1: at all appealing. But it was, in fact, the first 12 00:00:46,720 --> 00:00:49,839 Speaker 1: MP three. Now, if you don't know Tom's Diner, it 13 00:00:49,920 --> 00:00:53,040 Speaker 1: features Vega giving a little slice-of-life moment from 14 00:00:53,120 --> 00:00:56,240 Speaker 1: the perspective of a man sitting in a diner who 15 00:00:56,280 --> 00:01:00,400 Speaker 1: feels kind of distanced from the world around him. In 16 00:01:00,440 --> 00:01:04,800 Speaker 1: case you need a reminder, here's the first verse of 17 00:01:04,840 --> 00:01:08,039 Speaker 1: the song. I am sitting in the morning at the 18 00:01:08,120 --> 00:01:10,840 Speaker 1: diner on the corner.
I am waiting at the counter 19 00:01:11,319 --> 00:01:14,279 Speaker 1: for the man to pour the coffee, and he fills 20 00:01:14,319 --> 00:01:18,240 Speaker 1: it only halfway and before I even argue he is 21 00:01:18,280 --> 00:01:24,520 Speaker 1: looking out the window at somebody coming in. Now that 22 00:01:24,920 --> 00:01:28,000 Speaker 1: song doesn't work for me. I get that it got 23 00:01:28,000 --> 00:01:31,720 Speaker 1: really popular, especially after someone did an unauthorized remix of it, 24 00:01:31,760 --> 00:01:35,160 Speaker 1: which is the version most people know. But it turned 25 00:01:35,200 --> 00:01:38,800 Speaker 1: out to be an absolutely perfect song to test the 26 00:01:38,959 --> 00:01:43,520 Speaker 1: MP three compression algorithm. To understand why, we need to 27 00:01:43,600 --> 00:01:47,119 Speaker 1: learn about the purpose of the MP three compression algorithm 28 00:01:47,200 --> 00:01:50,240 Speaker 1: in the first place. So in this case, the compression 29 00:01:50,280 --> 00:01:53,880 Speaker 1: we're talking about is relating to file size. There's an 30 00:01:53,920 --> 00:01:57,360 Speaker 1: interesting side note. There's a different kind of audio compression. 31 00:01:57,800 --> 00:02:01,840 Speaker 1: This refers to the reduction of dynamic range in a recording, 32 00:02:02,960 --> 00:02:07,760 Speaker 1: and by that I mean reducing the volume distance between 33 00:02:07,800 --> 00:02:11,240 Speaker 1: the loudest and the softest parts of a recording. That 34 00:02:11,280 --> 00:02:16,400 Speaker 1: can actually play a part in file compression as well, 35 00:02:16,440 --> 00:02:19,399 Speaker 1: but we're going to set that aside. Just put 36 00:02:19,400 --> 00:02:21,360 Speaker 1: a pin in that, take a look at it later on.
37 00:02:22,040 --> 00:02:25,960 Speaker 1: But with file compression generally, the whole goal is to 38 00:02:26,040 --> 00:02:30,760 Speaker 1: find ways to pack information into smaller file sizes. That 39 00:02:30,800 --> 00:02:34,880 Speaker 1: makes those files easier to manage. That's important if you 40 00:02:34,960 --> 00:02:38,400 Speaker 1: are dealing with a limited amount of storage, or maybe 41 00:02:38,400 --> 00:02:41,160 Speaker 1: you want to send the file from one machine to another 42 00:02:41,240 --> 00:02:45,000 Speaker 1: and you've got limited bandwidth, so you need smaller file sizes, 43 00:02:45,080 --> 00:02:47,680 Speaker 1: or else the process is going to take way too long. 44 00:02:48,120 --> 00:02:51,160 Speaker 1: But how do you do it? Well, one approach to 45 00:02:51,440 --> 00:02:55,080 Speaker 1: file compression is to take a real good look at 46 00:02:55,120 --> 00:02:59,400 Speaker 1: the file you're trying to compress, and you ask the question, 47 00:03:00,280 --> 00:03:03,480 Speaker 1: is all the information that is inside this file necessary? 48 00:03:04,000 --> 00:03:06,800 Speaker 1: Or could I get rid of some of that information 49 00:03:07,320 --> 00:03:11,160 Speaker 1: and still have a usable file on the other side 50 00:03:11,160 --> 00:03:15,080 Speaker 1: of it? With music, that means figuring out which bits 51 00:03:15,080 --> 00:03:18,560 Speaker 1: of data you can drop without it having a noticeable 52 00:03:18,600 --> 00:03:23,600 Speaker 1: effect on the audio quality. Ideally, the compressed file would 53 00:03:23,600 --> 00:03:28,040 Speaker 1: be indistinguishable from the original raw audio, but since you're 54 00:03:28,120 --> 00:03:32,919 Speaker 1: tossing out information, that's not necessarily a guarantee. This is 55 00:03:33,000 --> 00:03:37,760 Speaker 1: what makes the MP three a lossy file format.
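To put rough numbers on the storage and bandwidth point above, here is a small back-of-the-envelope sketch. None of these figures come from the episode: the CD-style raw audio parameters, the 128 kbps compressed bitrate, the three-minute song length and the 56 kbps link are all assumptions picked for illustration.

```python
# Rough file-size arithmetic for one track.
# Every constant below is an assumption for illustration, not a figure from the episode.

SAMPLE_RATE_HZ = 44_100        # samples per second (CD-style raw audio)
BITS_PER_SAMPLE = 16           # bits stored for each sample
CHANNELS = 2                   # stereo
COMPRESSED_BITRATE = 128_000   # bits per second, a commonly used MP3 setting


def raw_size_bytes(seconds):
    """Size of uncompressed audio of the given length."""
    bits = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS * seconds
    return bits // 8


def compressed_size_bytes(seconds, bitrate=COMPRESSED_BITRATE):
    """Size of a constant-bitrate compressed file of the given length."""
    return bitrate * seconds // 8


def transfer_seconds(size_bytes, link_bits_per_second):
    """How long the file takes to send over a link of the given speed."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    length = 180      # hypothetical three-minute song
    modem = 56_000    # hypothetical slow link, in bits per second
    raw = raw_size_bytes(length)
    small = compressed_size_bytes(length)
    print(f"raw:        {raw:>10} bytes, {transfer_seconds(raw, modem):.0f} s to send")
    print(f"compressed: {small:>10} bytes, {transfer_seconds(small, modem):.0f} s to send")
```

Under these assumptions the raw file is roughly eleven times larger than the compressed one, which is the kind of gap that matters when storage or bandwidth is tight.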
56 00:03:38,320 --> 00:03:41,120 Speaker 1: MP three is just one example of a lossy 57 00:03:41,320 --> 00:03:44,360 Speaker 1: file format. There are others, and the word lossy 58 00:03:44,600 --> 00:03:47,600 Speaker 1: means just exactly what you think. It means that some 59 00:03:47,800 --> 00:03:52,480 Speaker 1: information is tossed aside or lost in the process of 60 00:03:52,560 --> 00:03:56,080 Speaker 1: compressing the file to a smaller size. The folks who 61 00:03:56,120 --> 00:03:58,960 Speaker 1: worked on the MP three format had to figure out 62 00:03:59,320 --> 00:04:02,480 Speaker 1: which information was most likely to have little to no 63 00:04:02,680 --> 00:04:06,840 Speaker 1: impact on audio quality within an audio file. To do that, 64 00:04:07,400 --> 00:04:10,840 Speaker 1: they had to take into account human psychology and the 65 00:04:10,880 --> 00:04:16,920 Speaker 1: limitations of human hearing. So psychoacoustics played a big part 66 00:04:17,040 --> 00:04:21,520 Speaker 1: in determining the MP three compression algorithm. So for example, 67 00:04:22,000 --> 00:04:25,320 Speaker 1: by that, I mean, let's think of the range of 68 00:04:25,400 --> 00:04:28,240 Speaker 1: human hearing in terms of frequencies for a second. So 69 00:04:28,440 --> 00:04:33,520 Speaker 1: your typical human is able to hear frequencies as low 70 00:04:33,680 --> 00:04:38,480 Speaker 1: as twenty hertz and as high as twenty thousand hertz 71 00:04:38,560 --> 00:04:42,760 Speaker 1: or twenty kilohertz. Hertz in this case references an 72 00:04:42,800 --> 00:04:46,240 Speaker 1: oscillation per second or a vibration per second, so twenty 73 00:04:46,360 --> 00:04:52,160 Speaker 1: hertz means that something is effectively vibrating twenty times per second.
74 00:04:52,360 --> 00:04:55,480 Speaker 1: So if you had a string that when you plucked 75 00:04:55,520 --> 00:04:58,560 Speaker 1: it would vibrate twenty times per second, that string is 76 00:04:58,640 --> 00:05:02,559 Speaker 1: vibrating at twenty hertz. That would be a very very 77 00:05:02,640 --> 00:05:05,800 Speaker 1: low note. The higher the frequency, the higher the pitch, 78 00:05:06,360 --> 00:05:08,800 Speaker 1: and as we age we tend to lose the ability 79 00:05:08,839 --> 00:05:11,800 Speaker 1: to hear some of those higher pitches, which is why 80 00:05:11,839 --> 00:05:15,520 Speaker 1: you would hear about some convenience stores experimenting with playing 81 00:05:15,640 --> 00:05:19,800 Speaker 1: very high pitched noises to discourage young punks who wanted 82 00:05:19,800 --> 00:05:24,400 Speaker 1: to loiter in the joint. So human hearing has limitations, 83 00:05:24,480 --> 00:05:28,559 Speaker 1: and in theory you can eliminate sounds that would fall 84 00:05:28,680 --> 00:05:33,279 Speaker 1: outside of those limitations. If a sound file contains frequencies 85 00:05:33,640 --> 00:05:36,800 Speaker 1: that are at twenty-one kilohertz, but your typical 86 00:05:36,839 --> 00:05:41,640 Speaker 1: person can't hear anything above twenty kilohertz, well, at 87 00:05:41,720 --> 00:05:44,479 Speaker 1: least theoretically, you can just toss that information and it 88 00:05:44,560 --> 00:05:47,920 Speaker 1: won't change anything. If a sound file contains a sound 89 00:05:48,560 --> 00:05:51,479 Speaker 1: but no one has the capacity to hear it, does 90 00:05:51,480 --> 00:05:55,720 Speaker 1: a tree fall in the forest? Might be getting a 91 00:05:55,720 --> 00:05:59,839 Speaker 1: little lost in the woods here. Anyway, that frequency example, 92 00:05:59,839 --> 00:06:02,479 Speaker 1: that's just one example of a sound that humans would 93 00:06:02,480 --> 00:06:06,920 Speaker 1: have trouble hearing.
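The twenty-one kilohertz example above can be tried out on a toy signal. What follows is only a sketch of the idea of throwing away frequencies above a cutoff, using a plain discrete Fourier transform; it is not the MP three encoder, and the 48,000 Hz sample rate, the 96-sample block and the two test tones are assumptions chosen so the numbers line up neatly.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (fine for short blocks)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(bins):
    """Inverse of dft(); returns real-valued samples."""
    n = len(bins)
    return [
        sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def drop_above(samples, sample_rate, cutoff_hz):
    """Zero every frequency bin above cutoff_hz and rebuild the signal."""
    n = len(samples)
    bins = dft(samples)
    for k in range(n):
        # Bins past the midpoint mirror the lower ones, so fold them back.
        freq = min(k, n - k) * sample_rate / n
        if freq > cutoff_hz:
            bins[k] = 0
    return inverse_dft(bins)


def tone(freq_hz, sample_rate, n):
    """n samples of a sine wave at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


if __name__ == "__main__":
    rate, n = 48_000, 96  # assumed values; bins land on multiples of 500 Hz
    audible = tone(1_000, rate, n)
    inaudible = tone(21_000, rate, n)
    mixed = [a + b for a, b in zip(audible, inaudible)]
    filtered = drop_above(mixed, rate, 20_000)
    worst = max(abs(f - a) for f, a in zip(filtered, audible))
    print(f"largest difference from the 1 kHz tone alone: {worst:.2e}")
```

After the cut, what is left is essentially the one kilohertz tone on its own, which is the "toss it and nothing changes" claim in miniature.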
So another is when we hear a 94 00:06:07,080 --> 00:06:11,240 Speaker 1: very soft sound that immediately follows a very loud sound, 95 00:06:11,680 --> 00:06:14,560 Speaker 1: we don't actually perceive the soft one. The loud sound 96 00:06:14,640 --> 00:06:18,360 Speaker 1: we hear eclipses the soft sound, and it turns out 97 00:06:18,760 --> 00:06:21,640 Speaker 1: we can't hear the soft one at all. So again, 98 00:06:22,160 --> 00:06:25,120 Speaker 1: if we can't hear that soft sound that played immediately 99 00:06:25,200 --> 00:06:28,200 Speaker 1: after a loud one, why would you keep it? You know, 100 00:06:28,240 --> 00:06:29,880 Speaker 1: you might as well just get rid of that information. 101 00:06:29,960 --> 00:06:32,520 Speaker 1: You can't hear it anyway. Just get rid of it. 102 00:06:32,839 --> 00:06:37,320 Speaker 1: Save the space. This psychoacoustic approach to sound would lead 103 00:06:37,360 --> 00:06:39,640 Speaker 1: the developers of the MP three format to create a 104 00:06:39,680 --> 00:06:44,200 Speaker 1: strategy regarding what information to keep and what information to ditch. 105 00:06:45,160 --> 00:06:48,520 Speaker 1: On top of that, the algorithm had sort of a 106 00:06:48,560 --> 00:06:52,480 Speaker 1: sliding scale, so maybe you want to keep as much 107 00:06:52,520 --> 00:06:55,080 Speaker 1: information as possible, so you select that when you create 108 00:06:55,120 --> 00:06:59,480 Speaker 1: the MP three, so you're losing less information in the process. 109 00:06:59,520 --> 00:07:01,800 Speaker 1: You're still compressing the file, but not to the extent 110 00:07:01,839 --> 00:07:06,320 Speaker 1: that you could if you chose. Maybe the most important 111 00:07:06,320 --> 00:07:08,640 Speaker 1: thing to you is that you reduce the file size 112 00:07:08,680 --> 00:07:12,400 Speaker 1: as much as you can, so you crank the compression up.
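The "soft sound right after a loud sound" rule described above can be sketched as a simple filter over a list of sound events. This is a deliberately crude stand-in: the function name, the 100 millisecond window and the 30 decibel gap are hypothetical values made up for the example, and real masking depends on frequency and level in ways this ignores.

```python
def audible_events(events, mask_window_ms=100.0, mask_gap_db=30.0):
    """Keep only the events a listener would plausibly notice.

    events is a list of (time_ms, level_db) pairs sorted by time.
    An event is dropped when some earlier event happened within
    mask_window_ms before it and was at least mask_gap_db louder.
    Both thresholds are hypothetical, chosen only for illustration.
    """
    kept = []
    for index, (time_ms, level_db) in enumerate(events):
        masked = any(
            0 < time_ms - earlier_time <= mask_window_ms
            and earlier_level - level_db >= mask_gap_db
            for earlier_time, earlier_level in events[:index]
        )
        if not masked:
            kept.append((time_ms, level_db))
    return kept


if __name__ == "__main__":
    events = [
        (0, 90),    # loud hit
        (40, 50),   # soft sound right after it: dropped
        (300, 50),  # same soft level, but long after the hit: kept
        (320, 45),  # slightly softer neighbour, gap too small to mask: kept
    ]
    for time_ms, level_db in audible_events(events):
        print(f"{time_ms:>4} ms  {level_db} dB")
```

Raising or lowering the two thresholds is a loose analogy for the sliding scale mentioned above: a more aggressive setting throws away more events and saves more space, at a higher risk of dropping something you would have heard.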
Now, 113 00:07:12,440 --> 00:07:15,800 Speaker 1: obviously the harder you go, the more likely you're going 114 00:07:15,840 --> 00:07:18,600 Speaker 1: to lose information that will make a noticeable difference in 115 00:07:18,640 --> 00:07:22,680 Speaker 1: the playback of the audio file, and you would say, oh, 116 00:07:22,720 --> 00:07:25,360 Speaker 1: the quality here is not as good as I thought 117 00:07:25,360 --> 00:07:28,920 Speaker 1: it would be. This is where Tom's Diner comes in. 118 00:07:29,760 --> 00:07:33,080 Speaker 1: Karlheinz Brandenburg, who is one of the leads on 119 00:07:33,240 --> 00:07:37,720 Speaker 1: creating the MP three format, used Tom's Diner to listen 120 00:07:37,760 --> 00:07:41,840 Speaker 1: back to compressed files and determine how the compression was 121 00:07:41,880 --> 00:07:46,920 Speaker 1: affecting the audio quality. So it was a great track 122 00:07:47,000 --> 00:07:51,840 Speaker 1: to use because the actual qualities of the recording itself 123 00:07:52,520 --> 00:07:55,880 Speaker 1: were such that it was easy to detect if something 124 00:07:56,120 --> 00:07:59,960 Speaker 1: was not quite right. The original recording of Tom's Diner 125 00:08:00,240 --> 00:08:04,120 Speaker 1: is not the one that has the catchy beat and 126 00:08:04,160 --> 00:08:06,640 Speaker 1: the horns in it. It's a very simple a cappella 127 00:08:06,720 --> 00:08:10,160 Speaker 1: recording of Suzanne Vega singing her tale of looking at 128 00:08:10,160 --> 00:08:12,680 Speaker 1: the world from a male perspective through a sense of 129 00:08:12,760 --> 00:08:17,320 Speaker 1: distance and detachment. Brandenburg would use that track while 130 00:08:17,320 --> 00:08:20,720 Speaker 1: tweaking the algorithm, trying to create the thin line between 131 00:08:20,760 --> 00:08:24,440 Speaker 1: an effective data compression technique and a minimal impact on 132 00:08:24,560 --> 00:08:27,679 Speaker 1: sound quality.
And for her contributions to the effort, although 133 00:08:27,680 --> 00:08:32,200 Speaker 1: she made them unknowingly, Brandenburg would name Suzanne Vega the 134 00:08:32,280 --> 00:08:36,559 Speaker 1: mother of the MP three. Interestingly, Ryan Maguire decided to 135 00:08:36,600 --> 00:08:40,200 Speaker 1: take a sort of negative image of the compressed Tom's Diner. 136 00:08:40,280 --> 00:08:43,920 Speaker 1: He identified sounds that were deleted in the process of 137 00:08:43,920 --> 00:08:47,240 Speaker 1: creating a lossy version of Tom's Diner, and then he 138 00:08:47,320 --> 00:08:50,440 Speaker 1: created a new recording that contained only the bits that 139 00:08:50,600 --> 00:08:54,600 Speaker 1: had been cut from the file. And it's almost like 140 00:08:54,720 --> 00:08:57,400 Speaker 1: listening to the ghost of a song. In fact, I 141 00:08:57,400 --> 00:09:01,040 Speaker 1: think they called the project the Ghost in the MP three. 142 00:09:01,160 --> 00:09:03,520 Speaker 1: It's pretty creepy stuff. It would not be out of 143 00:09:03,600 --> 00:09:06,360 Speaker 1: place in a horror movie. The fact that lossy files 144 00:09:06,400 --> 00:09:09,400 Speaker 1: by definition lose information in the process of data compression 145 00:09:10,040 --> 00:09:13,440 Speaker 1: meant that audiophiles dismiss the MP three format as 146 00:09:13,480 --> 00:09:16,800 Speaker 1: inherently inferior to others, at least as far as listening 147 00:09:16,840 --> 00:09:20,360 Speaker 1: experiences go. And there are arguments that some of the 148 00:09:20,440 --> 00:09:24,679 Speaker 1: lost information, while potentially being imperceptible within the song itself, 149 00:09:25,000 --> 00:09:28,520 Speaker 1: helps shape the overall sound and tone of the piece.
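The "negative image" idea mentioned above boils down to subtracting what survived compression from the original, so that only the discarded part remains. Here is a minimal sketch of that subtraction. The lossy step is plain coarse rounding, swapped in as a stand-in because a real MP three encoder is far more involved; it is not a description of how Maguire built his recording, and the sample values and step size are made up.

```python
def lossy_round_trip(samples, step):
    """Stand-in for encode-then-decode: snap each sample to a coarse grid."""
    return [round(sample / step) * step for sample in samples]


def residual(original, decoded):
    """What the lossy step threw away, sample by sample."""
    return [o - d for o, d in zip(original, decoded)]


if __name__ == "__main__":
    original = [0, 7, 13, -5, 30]   # made-up integer samples
    decoded = lossy_round_trip(original, step=8)
    ghost = residual(original, decoded)
    print("decoded:", decoded)
    print("ghost:  ", ghost)
    # Adding the ghost back to the decoded signal recovers the original.
    print("rebuilt:", [d + g for d, g in zip(decoded, ghost)])
```

The residual on its own sounds nothing like the source, which fits the description of a ghostly leftover, yet together with the decoded signal it accounts for every sample of the original.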
150 00:09:28,559 --> 00:09:33,000 Speaker 1: So though you can't directly hear the stuff that's being cut, 151 00:09:33,559 --> 00:09:37,400 Speaker 1: that stuff actually influences how you perceive other things, so 152 00:09:37,920 --> 00:09:41,760 Speaker 1: you still change the experience of hearing the finished audio. 153 00:09:42,000 --> 00:09:45,079 Speaker 1: But the MP three format created the opportunity to store 154 00:09:45,120 --> 00:09:48,040 Speaker 1: and transfer audio files without having to deal with massive 155 00:09:48,160 --> 00:09:51,880 Speaker 1: raw audio formats, and back in the day that was 156 00:09:51,960 --> 00:09:55,720 Speaker 1: not a trivial thing. And so that is the answer 157 00:09:55,800 --> 00:10:00,280 Speaker 1: to the question: Tom's Diner, the first MP three. Hope 158 00:10:00,280 --> 00:10:04,360 Speaker 1: you're all well, and I'll talk to you again really soon. 159 00:10:10,800 --> 00:10:15,480 Speaker 1: Tech Stuff is an iHeartRadio production. For more podcasts from iHeartRadio, 160 00:10:15,800 --> 00:10:19,520 Speaker 1: visit the iHeartRadio app, Apple Podcasts, or wherever you listen 161 00:10:19,520 --> 00:10:20,600 Speaker 1: to your favorite shows.