1
00:00:04,440 --> 00:00:12,360
Speaker 1: Welcome to TechStuff, a production from iHeartRadio. Hey there,

2
00:00:12,360 --> 00:00:15,840
Speaker 1: and welcome to TechStuff. I'm your host, Jonathan Strickland.

3
00:00:15,840 --> 00:00:18,920
Speaker 1: I'm an executive producer with iHeartRadio. And how the tech

4
00:00:18,960 --> 00:00:22,000
Speaker 1: are you? It's time for a TechStuff Tidbits. I'm

5
00:00:22,000 --> 00:00:26,680
Speaker 1: going to answer the question: what was the first MP3? Well,

6
00:00:26,760 --> 00:00:29,520
Speaker 1: here's the too long, didn't listen answer. It was Tom's

7
00:00:29,560 --> 00:00:33,680
Speaker 1: Diner by Suzanne Vega. It's a song I personally do

8
00:00:33,760 --> 00:00:36,400
Speaker 1: not like. That's not to say it's a bad song.

9
00:00:37,320 --> 00:00:39,520
Speaker 1: Just because I don't like something doesn't mean it's bad.

10
00:00:40,320 --> 00:00:42,840
Speaker 1: I just mean I personally do not find this song

11
00:00:43,120 --> 00:00:46,640
Speaker 1: at all appealing. But it was, in fact the first

12
00:00:46,720 --> 00:00:49,839
Speaker 1: MP3. Now, if you don't know Tom's Diner, it

13
00:00:49,920 --> 00:00:53,040
Speaker 1: features Vega giving a little slice-of-life moment from

14
00:00:53,120 --> 00:00:56,240
Speaker 1: the perspective of a man sitting in a diner who

15
00:00:56,280 --> 00:01:00,400
Speaker 1: feels kind of distanced from the world around him. In

16
00:01:00,440 --> 00:01:04,800
Speaker 1: case you need a reminder, here's the first verse of

17
00:01:04,840 --> 00:01:08,039
Speaker 1: the song. I am sitting in the morning at the

18
00:01:08,120 --> 00:01:10,840
Speaker 1: diner on the corner. I am waiting at the counter

19
00:01:11,319 --> 00:01:14,279
Speaker 1: for the man to pour the coffee, and he fills

20
00:01:14,319 --> 00:01:18,240
Speaker 1: it only halfway and before I even argue he is

21
00:01:18,280 --> 00:01:24,520
Speaker 1: looking out the window at somebody coming in. Now that

22
00:01:24,920 --> 00:01:28,000
Speaker 1: song doesn't work for me. I get that it got

23
00:01:28,000 --> 00:01:31,720
Speaker 1: really popular, especially after someone did an unauthorized remix of it,

24
00:01:31,760 --> 00:01:35,160
Speaker 1: which is the version most people know. But it turned

25
00:01:35,200 --> 00:01:38,800
Speaker 1: out to be an absolute perfect song to test the

26
00:01:38,959 --> 00:01:43,520
Speaker 1: MP3 compression algorithm. To understand why, we need to

27
00:01:43,600 --> 00:01:47,119
Speaker 1: learn about the purpose of the MP3 compression algorithm

28
00:01:47,200 --> 00:01:50,240
Speaker 1: in the first place. So in this case, the compression

29
00:01:50,280 --> 00:01:53,880
Speaker 1: we're talking about is relating to file size. There's an

30
00:01:53,920 --> 00:01:57,360
Speaker 1: interesting side note. There's a different kind of audio compression.

31
00:01:57,800 --> 00:02:01,840
Speaker 1: This refers to the reduction of dynamic range in a recording,

32
00:02:02,960 --> 00:02:07,760
Speaker 1: and by that I mean reducing the volume distance between

33
00:02:07,800 --> 00:02:11,240
Speaker 1: the loudest and the softest parts of a recording. That

34
00:02:11,280 --> 00:02:16,400
Speaker 1: can actually play a part in file compression as well,

35
00:02:16,440 --> 00:02:19,399
Speaker 1: but we're going to set that aside. Just put

36
00:02:19,400 --> 00:02:21,360
Speaker 1: a pin in that, take a look at it later on.

37
00:02:22,040 --> 00:02:25,960
Speaker 1: But with file compression generally, the whole goal is to

38
00:02:26,040 --> 00:02:30,760
Speaker 1: find ways to pack information into smaller file sizes. That

39
00:02:30,800 --> 00:02:34,880
Speaker 1: makes those files easier to manage. That's important if you

40
00:02:34,960 --> 00:02:38,400
Speaker 1: are dealing with a limited amount of storage, or maybe

41
00:02:38,400 --> 00:02:41,160
Speaker 1: you want to send the file from one machine to another.

42
00:02:41,240 --> 00:02:45,000
Speaker 1: And you've got limited bandwidth, so you need smaller file sizes,

43
00:02:45,080 --> 00:02:47,680
Speaker 1: or else the process is going to take way too long.

44
00:02:48,120 --> 00:02:51,160
Speaker 1: But how do you do it? Well, one approach to

45
00:02:51,440 --> 00:02:55,080
Speaker 1: file compression is to take a real good look at

46
00:02:55,120 --> 00:02:59,400
Speaker 1: the file you're trying to compress, and you ask the question,

47
00:03:00,280 --> 00:03:03,480
Speaker 1: is all the information that is inside this file necessary?

48
00:03:04,000 --> 00:03:06,800
Speaker 1: Or could I get rid of some of that information

49
00:03:07,320 --> 00:03:11,160
Speaker 1: and still have a usable file on the other side

50
00:03:11,160 --> 00:03:15,080
Speaker 1: of it? With music, that means figuring out which bits

51
00:03:15,080 --> 00:03:18,560
Speaker 1: of data you can drop without it having a noticeable

52
00:03:18,600 --> 00:03:23,600
Speaker 1: effect on the audio quality. Ideally, the compressed file would

53
00:03:23,600 --> 00:03:28,040
Speaker 1: be indistinguishable from the original raw audio, but since you're

54
00:03:28,120 --> 00:03:32,919
Speaker 1: tossing out information, that's not necessarily a guarantee. This is

55
00:03:33,000 --> 00:03:37,760
Speaker 1: what makes the MP3 a lossy file format.

56
00:03:38,320 --> 00:03:41,120
Speaker 1: MP3 is just one example of a lossy

57
00:03:41,320 --> 00:03:44,360
Speaker 1: file format. There are others, and the word lossy

58
00:03:44,600 --> 00:03:47,600
Speaker 1: means just exactly what you think. It means that some

59
00:03:47,800 --> 00:03:52,480
Speaker 1: information is tossed aside or lost in the process of

60
00:03:52,560 --> 00:03:56,080
Speaker 1: compressing the file to a smaller size. The folks who

61
00:03:56,120 --> 00:03:58,960
Speaker 1: worked on the MP3 format had to figure out

62
00:03:59,320 --> 00:04:02,480
Speaker 1: which information was most likely to have little to no

63
00:04:02,680 --> 00:04:06,840
Speaker 1: impact on audio quality within an audio file. To do that,

64
00:04:07,400 --> 00:04:10,840
Speaker 1: they had to take into account human psychology and the

65
00:04:10,880 --> 00:04:16,920
Speaker 1: limitations of human hearing. So psychoacoustics played a big part

66
00:04:17,040 --> 00:04:21,520
Speaker 1: in determining the MP3 compression algorithm. So for example,

67
00:04:22,000 --> 00:04:25,320
Speaker 1: by that, I mean, let's think of the range of

68
00:04:25,400 --> 00:04:28,240
Speaker 1: human hearing in terms of frequencies for a second. So

69
00:04:28,440 --> 00:04:33,520
Speaker 1: your typical human is able to hear frequencies as low

70
00:04:33,680 --> 00:04:38,480
Speaker 1: as twenty hertz and as high as twenty thousand hertz

71
00:04:38,560 --> 00:04:42,760
Speaker 1: or twenty kilohertz. Hertz in this case refers to an

72
00:04:42,800 --> 00:04:46,240
Speaker 1: oscillation per second or a vibration per second. So twenty

73
00:04:46,360 --> 00:04:52,160
Speaker 1: hertz means that something is effectively vibrating twenty times per second.

74
00:04:52,360 --> 00:04:55,480
Speaker 1: So if you had a string that when you plucked,

75
00:04:55,520 --> 00:04:58,560
Speaker 1: it would vibrate twenty times per second, that string is

76
00:04:58,640 --> 00:05:02,559
Speaker 1: vibrating at twenty hertz. That would be a very, very

77
00:05:02,640 --> 00:05:05,800
Speaker 1: low note. The higher the frequency, the higher the pitch,

78
00:05:06,360 --> 00:05:08,800
Speaker 1: and as we age we tend to lose the ability

79
00:05:08,839 --> 00:05:11,800
Speaker 1: to hear some of those higher pitches, which is why

80
00:05:11,839 --> 00:05:15,520
Speaker 1: you would hear about some convenience stores experimenting with playing

81
00:05:15,640 --> 00:05:19,800
Speaker 1: very high pitched noises to discourage young punks who wanted

82
00:05:19,800 --> 00:05:24,400
Speaker 1: to loiter in the joint. So human hearing has limitations,

83
00:05:24,480 --> 00:05:28,559
Speaker 1: and in theory you can eliminate sounds that would fall

84
00:05:28,680 --> 00:05:33,279
Speaker 1: outside of those limitations. If a sound file contains frequencies

85
00:05:33,640 --> 00:05:36,800
Speaker 1: that are at twenty-one kilohertz, but your typical

86
00:05:36,839 --> 00:05:41,640
Speaker 1: person can't hear anything above twenty kilohertz, well, at

87
00:05:41,720 --> 00:05:44,479
Speaker 1: least theoretically, you can just toss that information and it

88
00:05:44,560 --> 00:05:47,920
Speaker 1: won't change anything. If a sound file contains a sound

89
00:05:48,560 --> 00:05:51,479
Speaker 1: but no one has the capacity to hear it, does

90
00:05:51,480 --> 00:05:55,720
Speaker 1: a tree fall in the forest? Might be getting a

91
00:05:55,720 --> 00:05:59,839
Speaker 1: little lost in the woods here. Anyway, that frequency example,

92
00:05:59,839 --> 00:06:02,479
Speaker 1: that's just one example of a sound that humans would

93
00:06:02,480 --> 00:06:06,920
Speaker 1: have trouble hearing. So another is when we hear a

94
00:06:07,080 --> 00:06:11,240
Speaker 1: very soft sound that immediately follows a very loud sound,

95
00:06:11,680 --> 00:06:14,560
Speaker 1: we don't actually perceive the soft one. The loud sound

96
00:06:14,640 --> 00:06:18,360
Speaker 1: we hear eclipses the soft sound, and it turns out

97
00:06:18,760 --> 00:06:21,640
Speaker 1: we can't hear the soft one at all. So again,

98
00:06:22,160 --> 00:06:25,120
Speaker 1: if we can't hear that soft sound that played immediately

99
00:06:25,200 --> 00:06:28,200
Speaker 1: after a loud one, why would you keep it? You know,

100
00:06:28,240 --> 00:06:29,880
Speaker 1: you might as well just get rid of that information.

101
00:06:29,960 --> 00:06:32,520
Speaker 1: You can't hear it anyway. Just get rid of it.

102
00:06:32,839 --> 00:06:37,320
Speaker 1: Save the space. This psychoacoustic approach to sound would lead

103
00:06:37,360 --> 00:06:39,640
Speaker 1: the developers of the MP3 format to create a

104
00:06:39,680 --> 00:06:44,200
Speaker 1: strategy regarding what information to keep and what information to ditch.

105
00:06:45,160 --> 00:06:48,520
Speaker 1: On top of that, the algorithm had sort of a

106
00:06:48,560 --> 00:06:52,480
Speaker 1: sliding scale, so maybe you want to keep as much

107
00:06:52,520 --> 00:06:55,080
Speaker 1: information as possible, so you select that when you create

108
00:06:55,120 --> 00:06:59,480
Speaker 1: the MP3. So you're losing less information in the process.

109
00:06:59,520 --> 00:07:01,800
Speaker 1: You're still compressing the file, but not to the extent

110
00:07:01,839 --> 00:07:06,320
Speaker 1: that you could if you chose. Maybe the most important

111
00:07:06,320 --> 00:07:08,640
Speaker 1: thing to you is that you reduce the file size

112
00:07:08,680 --> 00:07:12,400
Speaker 1: as much as you can, so you crank the compression up. Now,

113
00:07:12,440 --> 00:07:15,800
Speaker 1: obviously the harder you go, the more likely you're going

114
00:07:15,840 --> 00:07:18,600
Speaker 1: to lose information that will make a noticeable difference in

115
00:07:18,640 --> 00:07:22,680
Speaker 1: the playback of the audio file, and you would say, oh,

116
00:07:22,720 --> 00:07:25,360
Speaker 1: the quality here is not as good as I thought

117
00:07:25,360 --> 00:07:28,920
Speaker 1: it would be. This is where Tom's Diner comes in.

118
00:07:29,760 --> 00:07:33,080
Speaker 1: Karlheinz Brandenburg, who is one of the leads on

119
00:07:33,240 --> 00:07:37,720
Speaker 1: creating the MP3 format, used Tom's Diner to listen

120
00:07:37,760 --> 00:07:41,840
Speaker 1: back to compressed files and determine how the compression was

121
00:07:41,880 --> 00:07:46,920
Speaker 1: affecting the audio quality. So it was a great track

122
00:07:47,000 --> 00:07:51,840
Speaker 1: to use because the actual qualities of the recording itself

123
00:07:52,520 --> 00:07:55,880
Speaker 1: were such that it was easy to detect if something

124
00:07:56,120 --> 00:07:59,960
Speaker 1: was not quite right. The original recording of Tom's Diner

125
00:08:00,240 --> 00:08:04,120
Speaker 1: is not the one that has the catchy beat and

126
00:08:04,160 --> 00:08:06,640
Speaker 1: the horns in it. It's a very simple a cappella

127
00:08:06,720 --> 00:08:10,160
Speaker 1: recording of Suzanne Vega singing her tale of looking at

128
00:08:10,160 --> 00:08:12,680
Speaker 1: the world from a male perspective through a sense of

129
00:08:12,760 --> 00:08:17,320
Speaker 1: distance and detachment. Brandenburg would use that track while

130
00:08:17,320 --> 00:08:20,720
Speaker 1: tweaking the algorithm, trying to create the thin line between

131
00:08:20,760 --> 00:08:24,440
Speaker 1: an effective data compression technique and a minimal impact on

132
00:08:24,560 --> 00:08:27,679
Speaker 1: sound quality. And for her contributions to the effort, although

133
00:08:27,680 --> 00:08:32,200
Speaker 1: she made them unknowingly, Brandenburg would name Suzanne Vega the

134
00:08:32,280 --> 00:08:36,559
Speaker 1: mother of the MP3. Interestingly, Ryan Maguire decided to

135
00:08:36,600 --> 00:08:40,200
Speaker 1: take a sort of negative image of the compressed Tom's Diner.

136
00:08:40,280 --> 00:08:43,920
Speaker 1: He identified sounds that were deleted in the process of

137
00:08:43,920 --> 00:08:47,240
Speaker 1: creating a lossy version of Tom's Diner, and then he

138
00:08:47,320 --> 00:08:50,440
Speaker 1: created a new recording that contained only the bits that

139
00:08:50,600 --> 00:08:54,600
Speaker 1: had been cut from the file. And it's almost like

140
00:08:54,720 --> 00:08:57,400
Speaker 1: listening to the ghost of a song. In fact, I

141
00:08:57,400 --> 00:09:01,040
Speaker 1: think they called the project the Ghost in the MP3.

142
00:09:01,160 --> 00:09:03,520
Speaker 1: It's pretty creepy stuff. It would not be out of

143
00:09:03,600 --> 00:09:06,360
Speaker 1: place in a horror movie. The fact that lossy files

144
00:09:06,400 --> 00:09:09,400
Speaker 1: by definition lose information in the process of data compression

145
00:09:10,040 --> 00:09:13,440
Speaker 1: meant that audiophiles dismiss the MP3 format as

146
00:09:13,480 --> 00:09:16,800
Speaker 1: inherently inferior to others, at least as far as listening

147
00:09:16,840 --> 00:09:20,360
Speaker 1: experiences go. And there are arguments that some of the

148
00:09:20,440 --> 00:09:24,679
Speaker 1: lost information, while potentially being imperceptible within the song itself,

149
00:09:25,000 --> 00:09:28,520
Speaker 1: helps shape the overall sound and tone of the piece.

150
00:09:28,559 --> 00:09:33,000
Speaker 1: So though you can't directly hear the stuff that's being cut,

151
00:09:33,559 --> 00:09:37,400
Speaker 1: that stuff actually influences how you perceive other things, so

152
00:09:37,920 --> 00:09:41,760
Speaker 1: you still change the experience of hearing the finished audio.

153
00:09:42,000 --> 00:09:45,079
Speaker 1: But the MP3 format created the opportunity to store

154
00:09:45,120 --> 00:09:48,040
Speaker 1: and transfer audio files without having to deal with massive

155
00:09:48,160 --> 00:09:51,880
Speaker 1: raw audio formats, and back in the day that was

156
00:09:51,960 --> 00:09:55,720
Speaker 1: not a trivial thing. And so that is the answer

157
00:09:55,800 --> 00:10:00,280
Speaker 1: to the question: Tom's Diner, the first MP3. Hope

158
00:10:00,280 --> 00:10:04,360
Speaker 1: you're all well, and I'll talk to you again really soon.

159
00:10:10,800 --> 00:10:15,480
Speaker 1: TechStuff is an iHeartRadio production. For more podcasts from iHeartRadio,

160
00:10:15,800 --> 00:10:19,520
Speaker 1: visit the iHeartRadio app, Apple Podcasts, or wherever you listen

161
00:10:19,520 --> 00:10:20,600
Speaker 1: to your favorite shows.