WEBVTT - Techstuff Classic: The Dirt on Digital Audio

0:00:04.160 --> 0:00:07.200
<v Speaker 1>Get in touch with technology with TechStuff from

0:00:07.240 --> 0:00:14.720
<v Speaker 1>HowStuffWorks.com. Hey everybody, it's Jonathan Strickland here

0:00:14.920 --> 0:00:19.120
<v Speaker 1>with TechStuff classic episodes. We're doing some Saturday morning

0:00:19.239 --> 0:00:22.439
<v Speaker 1>reruns for you guys. This is a special series where

0:00:22.440 --> 0:00:25.440
<v Speaker 1>we're going to dig up some classic episodes of

0:00:25.520 --> 0:00:28.400
<v Speaker 1>TechStuff and present them to you guys who may not

0:00:28.520 --> 0:00:30.480
<v Speaker 1>have had a chance to listen to them, especially if

0:00:30.480 --> 0:00:33.400
<v Speaker 1>you're a brand new listener. First of all, welcome. If

0:00:33.440 --> 0:00:36.480
<v Speaker 1>that's the case, I hope you enjoy these episodes. This

0:00:36.520 --> 0:00:39.800
<v Speaker 1>one is called The Dirt on Digital Audio, and it

0:00:39.880 --> 0:00:43.960
<v Speaker 1>is an episode all about the actual technical process of

0:00:44.040 --> 0:00:48.200
<v Speaker 1>recording audio into a digital format and what that requires

0:00:48.240 --> 0:00:51.680
<v Speaker 1>because it's very different from the analog style. I hope

0:00:51.720 --> 0:00:55.280
<v Speaker 1>you guys enjoy it. This episode was originally published on

0:00:55.360 --> 0:00:58.960
<v Speaker 1>November twenty three, two thousand sixteen. And just in case

0:00:58.960 --> 0:01:01.120
<v Speaker 1>you're listening to this one in the far future, I'm

0:01:01.160 --> 0:01:04.440
<v Speaker 1>recording this in two thousand eighteen, so we're gonna time

0:01:04.440 --> 0:01:08.000
<v Speaker 1>travel a bit and listen to this classic episode The

0:01:08.080 --> 0:01:12.080
<v Speaker 1>Dirt on Digital Audio. So to start it all off,

0:01:12.880 --> 0:01:15.840
<v Speaker 1>we all have to take a quick trip to Germany,

0:01:16.000 --> 0:01:19.520
<v Speaker 1>So anyone who is not in Germany, get your passport.

0:01:20.240 --> 0:01:22.280
<v Speaker 1>I was actually in Germany not that long ago. I

0:01:22.319 --> 0:01:25.800
<v Speaker 1>got to visit Berlin and had a wonderful time. And

0:01:25.880 --> 0:01:29.279
<v Speaker 1>in Germany there's a company called Fraunhofer-Gesellschaft.

0:01:30.080 --> 0:01:32.000
<v Speaker 1>And you might wonder, well, what does this company do?

0:01:32.440 --> 0:01:38.040
<v Speaker 1>They think. I joke that my profession, that my title

0:01:38.200 --> 0:01:40.800
<v Speaker 1>that I should put on my business card it should

0:01:40.800 --> 0:01:44.399
<v Speaker 1>say professional smart person. Well, no joke, that's what these

0:01:44.400 --> 0:01:49.920
<v Speaker 1>people are. They specialize in research and development, applied research.

0:01:51.080 --> 0:01:54.520
<v Speaker 1>It's a whole company that specializes in applied research. And

0:01:54.640 --> 0:01:58.960
<v Speaker 1>it's huge. It encompasses sixty seven institutes and research units

0:01:59.040 --> 0:02:04.480
<v Speaker 1>across Germany. Well, back in the eighties, there was

0:02:04.520 --> 0:02:10.320
<v Speaker 1>a researcher named Karlheinz Brandenburg, and Karlheinz made

0:02:10.720 --> 0:02:16.720
<v Speaker 1>a breakthrough, uh, and came up with this clever

0:02:16.880 --> 0:02:21.360
<v Speaker 1>idea about encoding audio. He was actually working towards creating

0:02:21.360 --> 0:02:25.640
<v Speaker 1>a way that would allow for high audio quality transfer

0:02:26.040 --> 0:02:30.239
<v Speaker 1>but having a low bit rate sampling, so that file

0:02:30.360 --> 0:02:34.160
<v Speaker 1>sizes and transfer times wouldn't get out of control. Because

0:02:34.160 --> 0:02:36.040
<v Speaker 1>you got to remember, this is the eighties, this is

0:02:36.120 --> 0:02:39.200
<v Speaker 1>before the World Wide Web was a thing. That

0:02:39.360 --> 0:02:42.400
<v Speaker 1>wouldn't happen until the early nineties, so the Internet

0:02:42.440 --> 0:02:44.280
<v Speaker 1>was very young. In fact, they weren't even looking at

0:02:44.320 --> 0:02:47.320
<v Speaker 1>the Internet as a method of distribution for this particular

0:02:47.400 --> 0:02:50.639
<v Speaker 1>type of encoded audio. They were looking at using this

0:02:51.000 --> 0:02:54.880
<v Speaker 1>to transmit across telephone lines. So they needed to have

0:02:54.960 --> 0:02:58.320
<v Speaker 1>something that was going to be high quality but low space.

0:02:59.520 --> 0:03:01.600
<v Speaker 1>So what the heck does that mean? All right. Well,

0:03:02.560 --> 0:03:06.960
<v Speaker 1>digital audio and analog audio are very different things. So

0:03:07.000 --> 0:03:10.320
<v Speaker 1>to understand that, we need to look at how sound

0:03:10.360 --> 0:03:14.679
<v Speaker 1>works and how we describe sound, because that informs how

0:03:14.720 --> 0:03:19.440
<v Speaker 1>we can capture sound and replicate those qualities digitally. So

0:03:19.520 --> 0:03:22.560
<v Speaker 1>stick with me. We're gonna go back to school for

0:03:22.720 --> 0:03:27.400
<v Speaker 1>some basic sound science. And this goes back to the

0:03:27.400 --> 0:03:31.720
<v Speaker 1>way sound physically moves through a medium, whether that's a

0:03:31.800 --> 0:03:36.640
<v Speaker 1>solid or through the air or through water. Sound is vibration.

0:03:37.480 --> 0:03:42.640
<v Speaker 1>Now we sense this primarily through hearing it or sometimes

0:03:42.880 --> 0:03:45.720
<v Speaker 1>feeling it. If it's the right frequency and the right amplitude,

0:03:45.720 --> 0:03:49.240
<v Speaker 1>we can actually feel sound. Anyone who's stood close to,

0:03:49.280 --> 0:03:52.320
<v Speaker 1>say, a subwoofer that was really blasting out bass notes,

0:03:52.400 --> 0:03:54.600
<v Speaker 1>you know what I'm talking about. You can feel it

0:03:54.720 --> 0:03:58.720
<v Speaker 1>pressing against you. Well, sound travels through the air when

0:03:58.760 --> 0:04:03.320
<v Speaker 1>molecules vibrate against each other, and this creates instances of

0:04:03.560 --> 0:04:07.840
<v Speaker 1>increased pressure and decreased pressure at what is a hyper

0:04:08.000 --> 0:04:11.000
<v Speaker 1>local level. We're not talking about weather maps here, We're

0:04:11.040 --> 0:04:15.160
<v Speaker 1>talking about tiny little areas. So this increase and decrease

0:04:15.160 --> 0:04:17.760
<v Speaker 1>in pressure is something that we can sense as sound.

0:04:18.320 --> 0:04:21.520
<v Speaker 1>When those changes in pressure affect a diaphragm, such as

0:04:21.640 --> 0:04:25.640
<v Speaker 1>one that's in a microphone or maybe your eardrum,

0:04:25.839 --> 0:04:30.040
<v Speaker 1>for example, it causes the diaphragm to actually move. So

0:04:30.200 --> 0:04:36.839
<v Speaker 1>increased pressure pushes the diaphragm in and decreased pressure doesn't

0:04:36.920 --> 0:04:39.440
<v Speaker 1>really pull the diaphragm out. I mean, you could say

0:04:39.440 --> 0:04:42.080
<v Speaker 1>it pulls the diaphragm out, but to be more accurate,

0:04:42.120 --> 0:04:46.080
<v Speaker 1>the diaphragm actually pushes outward because the pressure on the

0:04:46.080 --> 0:04:48.640
<v Speaker 1>outside is lower than the pressure on the inside. But

0:04:48.680 --> 0:04:51.839
<v Speaker 1>you get what I'm saying. The diaphragm begins to

0:04:52.440 --> 0:04:56.200
<v Speaker 1>flex inward and outward depending upon the amount of pressure

0:04:56.279 --> 0:04:59.720
<v Speaker 1>that it's encountering. You can imagine this being kind

0:04:59.720 --> 0:05:01.919
<v Speaker 1>of like a drum. Not an eardrum, but

0:05:01.920 --> 0:05:04.800
<v Speaker 1>an actual drum and striking it. Uh, that's the same

0:05:04.839 --> 0:05:08.920
<v Speaker 1>sort of thing. So sound is the fluctuations of pressure,

0:05:09.480 --> 0:05:12.760
<v Speaker 1>which we can diagram as a wave, or a

0:05:12.839 --> 0:05:16.440
<v Speaker 1>waveform, on an x-y axis. So

0:05:16.480 --> 0:05:21.560
<v Speaker 1>the horizontal line, that axis, represents time that has passed,

0:05:21.960 --> 0:05:26.000
<v Speaker 1>and the vertical axis represents the amplitude or the volume

0:05:26.560 --> 0:05:29.880
<v Speaker 1>of the sound wave. The wavelength of the sound,

0:05:30.240 --> 0:05:32.760
<v Speaker 1>which is the distance between successive points on a wave,

0:05:32.839 --> 0:05:36.359
<v Speaker 1>such as the successive crests on a wave. That

0:05:36.440 --> 0:05:39.600
<v Speaker 1>tells you a lot about the frequency. So sound moves

0:05:39.640 --> 0:05:43.280
<v Speaker 1>at a constant rate through a given medium, but it

0:05:43.320 --> 0:05:47.200
<v Speaker 1>moves at different rates through different media. So in other words,

0:05:47.440 --> 0:05:49.760
<v Speaker 1>it moves at a different speed through a solid than it

0:05:49.800 --> 0:05:53.400
<v Speaker 1>does through air. If the crests of each sound wave

0:05:53.480 --> 0:05:57.479
<v Speaker 1>are really close together, that's a high frequency sound. More

0:05:57.520 --> 0:06:00.920
<v Speaker 1>waves will pass through an arbitrary point within a second.

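NOTE
Editor's sketch, not part of the spoken audio. The link between crest spacing and frequency described here can be written as wavelength = speed / frequency.
The 343 m/s figure for room-temperature air and the name wavelength_m are assumptions added for illustration only.
```python
# Rough speed of sound in room-temperature air, in meters per second (assumed value).
SPEED_OF_SOUND_AIR = 343.0
def wavelength_m(frequency_hz: float, speed_m_s: float = SPEED_OF_SOUND_AIR) -> float:
    """Distance between successive crests for a tone of the given frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_s / frequency_hz
# Closer crests (a shorter wavelength) go with a higher frequency.
low = wavelength_m(20.0)       # a 20 Hz tone
high = wavelength_m(20_000.0)  # a 20 kHz tone
print(f"20 Hz: {low:.2f} m, 20 kHz: {high * 1000:.2f} mm")
```
Passing a different speed_m_s models a different medium, which changes the wavelength for the same frequency.
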
0:06:01.360 --> 0:06:04.240
<v Speaker 1>The waves that are spaced further apart, that would be

0:06:04.279 --> 0:06:07.559
<v Speaker 1>a lower frequency sound. Higher frequency sounds have a higher

0:06:07.600 --> 0:06:10.719
<v Speaker 1>pitch than lower frequency sounds. So if you hold a

0:06:10.800 --> 0:06:14.360
<v Speaker 1>single note at a constant frequency, you'll have what is

0:06:14.400 --> 0:06:18.520
<v Speaker 1>called a simple harmonic motion. That means the vibrations are

0:06:18.600 --> 0:06:21.880
<v Speaker 1>moving at a constant rate inward and outward. The cycle

0:06:22.120 --> 0:06:25.680
<v Speaker 1>is constant. A tuning fork is a good example of this.

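NOTE
Editor's sketch, not part of the spoken audio. A tone in simple harmonic motion can be modeled as a sine wave with a fixed frequency and amplitude.
The 440 Hz pitch, the function name sine_wave, and the number of points per second are illustrative assumptions.
```python
import math
def sine_wave(frequency_hz: float, amplitude: float, duration_s: float, points_per_second: int) -> list[float]:
    """Evenly spaced values of amplitude * sin(2 * pi * f * t)."""
    count = round(duration_s * points_per_second)
    return [amplitude * math.sin(2 * math.pi * frequency_hz * n / points_per_second) for n in range(count)]
# 440 Hz is a common tuning-fork pitch; it is used here only as an example.
tone = sine_wave(frequency_hz=440.0, amplitude=1.0, duration_s=0.01, points_per_second=44_000)
# Every crest reaches the same height and the spacing between crests never changes.
print(len(tone), max(tone), min(tone))
```
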
0:06:26.800 --> 0:06:31.080
<v Speaker 1>So if you hear a clear C note played on

0:06:31.120 --> 0:06:34.360
<v Speaker 1>a musical instrument, that could be a simple harmonic motion.

0:06:34.600 --> 0:06:36.720
<v Speaker 1>It won't be, but it could be. I'll tell you

0:06:36.720 --> 0:06:39.080
<v Speaker 1>why it won't be in a minute. So the frequency

0:06:39.120 --> 0:06:42.240
<v Speaker 1>of vibration doesn't change, and so you would get this

0:06:42.480 --> 0:06:44.840
<v Speaker 1>very clear note as a result, And if you were

0:06:44.839 --> 0:06:49.760
<v Speaker 1>to diagram it, you would have very regular crests and troughs,

0:06:49.800 --> 0:06:53.600
<v Speaker 1>all of the same amplitude and distance from each other.

0:06:53.640 --> 0:06:58.240
<v Speaker 1>The frequency and volume would remain constant, assuming of course,

0:06:58.320 --> 0:07:02.480
<v Speaker 1>that you're not trying to change the frequency or volume. Now,

0:07:02.520 --> 0:07:05.320
<v Speaker 1>this is where I point out most musical instruments don't

0:07:05.400 --> 0:07:09.920
<v Speaker 1>produce a single clear note, even if played expertly. They

0:07:09.920 --> 0:07:15.360
<v Speaker 1>actually create several resonant frequencies. So every physical object resonates

0:07:15.400 --> 0:07:19.400
<v Speaker 1>at several different frequencies. You've probably seen this in various programs.

0:07:19.440 --> 0:07:22.840
<v Speaker 1>MythBusters did one about bridges, the idea being that if

0:07:22.880 --> 0:07:25.840
<v Speaker 1>you were to have a group of people marching on

0:07:25.880 --> 0:07:28.960
<v Speaker 1>a bridge at the bridge's resonant frequency, it could cause

0:07:29.000 --> 0:07:33.600
<v Speaker 1>the bridge to start to vibrate and swing out of control. Well,

0:07:33.640 --> 0:07:35.480
<v Speaker 1>there's a reason for this. You may have also seen

0:07:35.560 --> 0:07:39.160
<v Speaker 1>videos of people singing a certain note and causing a

0:07:39.240 --> 0:07:43.640
<v Speaker 1>crystal glass to shatter. That's because that crystal glass does

0:07:43.680 --> 0:07:45.880
<v Speaker 1>have a resonant frequency, and if you can hit that

0:07:45.920 --> 0:07:49.200
<v Speaker 1>resonant frequency at the right volume, you can cause the

0:07:49.240 --> 0:07:52.360
<v Speaker 1>glass to start to deform, or the crystal in this case,

0:07:52.440 --> 0:07:55.160
<v Speaker 1>to deform to a point where it loses integrity and

0:07:55.160 --> 0:08:00.840
<v Speaker 1>it shatters as a result. Well, the resonance of an

0:08:00.840 --> 0:08:04.560
<v Speaker 1>object is dependent upon lots of different factors, and in fact,

0:08:04.720 --> 0:08:09.760
<v Speaker 1>most stuff will resonate at different frequencies, but at different intensities.

0:08:10.040 --> 0:08:14.239
<v Speaker 1>Like there might be one sweet spot, one specific frequency

0:08:14.320 --> 0:08:18.600
<v Speaker 1>that will have the greatest effect, but other related frequencies

0:08:18.640 --> 0:08:20.480
<v Speaker 1>may also have an effect. It will just be to

0:08:20.520 --> 0:08:24.040
<v Speaker 1>a lesser extent. Well, if you were to pluck a

0:08:24.040 --> 0:08:28.240
<v Speaker 1>guitar string, say you've tuned it to whatever note, doesn't matter.

0:08:28.320 --> 0:08:31.640
<v Speaker 1>Let's say you tuned it to G and

0:08:31.880 --> 0:08:35.920
<v Speaker 1>you play the G string on your guitar. The note

0:08:35.960 --> 0:08:38.960
<v Speaker 1>that you will hear really over all others will be

0:08:39.000 --> 0:08:40.640
<v Speaker 1>G. That is going to be the one that

0:08:40.640 --> 0:08:43.320
<v Speaker 1>will sound the loudest, But it will also play resonant

0:08:43.320 --> 0:08:47.240
<v Speaker 1>frequencies at a decreased amplitude. In other words, at decreased

0:08:47.360 --> 0:08:51.440
<v Speaker 1>volume so you still hear the intended note above everything else,

0:08:51.480 --> 0:08:54.600
<v Speaker 1>above all the other resonant frequencies. This is called a

0:08:54.679 --> 0:08:58.800
<v Speaker 1>complex tone, and that collection of frequencies and their amplitudes

0:08:59.000 --> 0:09:03.720
<v Speaker 1>is called the spectrum of sound. You get a full spectrum. Now,

0:09:03.760 --> 0:09:09.280
<v Speaker 1>some of the components of that complex tone will be uh,

0:09:09.320 --> 0:09:12.360
<v Speaker 1>imperceptible to you. They'll be so quiet that you

0:09:12.400 --> 0:09:15.640
<v Speaker 1>wouldn't really notice them. They might affect the overall quality

0:09:15.640 --> 0:09:17.280
<v Speaker 1>of the sound, but in such a subtle way that

0:09:17.320 --> 0:09:19.160
<v Speaker 1>it may be difficult for you to even put it

0:09:19.240 --> 0:09:23.200
<v Speaker 1>into words. Each of those little components is called a partial.

0:09:23.640 --> 0:09:25.959
<v Speaker 1>So in the example of a guitar string, the partials

0:09:26.000 --> 0:09:30.080
<v Speaker 1>are all integer multiples of the same fundamental frequency, and the

0:09:30.160 --> 0:09:34.680
<v Speaker 1>sound has a harmonic spectrum. But as you get further

0:09:34.760 --> 0:09:39.600
<v Speaker 1>away from that fundamental frequency, the amplitude decreases significantly. So,

0:09:39.679 --> 0:09:42.719
<v Speaker 1>like I said, you get far enough away, they are

0:09:42.760 --> 0:09:47.040
<v Speaker 1>technically there, but they might be imperceptible to you. Now,

0:09:47.080 --> 0:09:51.520
<v Speaker 1>some sounds have frequencies that aren't integer multiples of a fundamental

0:09:51.559 --> 0:09:55.320
<v Speaker 1>frequency and are inharmonic. Uh, certain bells, like if you

0:09:55.320 --> 0:09:57.240
<v Speaker 1>hear a bell ring, you can probably pick out a

0:09:57.240 --> 0:10:00.840
<v Speaker 1>couple of different frequencies there that are not harmonic frequencies.

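NOTE
Editor's sketch, not part of the spoken audio. In a harmonic spectrum each partial is a whole-number multiple of the fundamental. In an inharmonic one, such as some bells, it is not.
The function names, the 196 Hz fundamental (roughly an open G string), the tolerance, and the bell-like ratios are made-up assumptions for illustration.
```python
def harmonic_partials(fundamental_hz: float, count: int) -> list[float]:
    """First `count` partials of a harmonic spectrum: 1x, 2x, 3x, ... the fundamental."""
    return [fundamental_hz * k for k in range(1, count + 1)]
def is_harmonic(partials_hz: list[float], fundamental_hz: float, tolerance: float = 0.01) -> bool:
    """True if every partial sits within `tolerance` of a whole-number multiple (>= 1) of the fundamental."""
    for partial in partials_hz:
        ratio = partial / fundamental_hz
        if round(ratio) < 1 or abs(ratio - round(ratio)) > tolerance:
            return False
    return True
g_string = harmonic_partials(196.0, 5)          # plucked-string style spectrum
bell_like = [196.0, 196.0 * 2.76, 196.0 * 5.4]  # invented non-integer ratios
print(g_string, is_harmonic(g_string, 196.0), is_harmonic(bell_like, 196.0))
```
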
0:10:01.679 --> 0:10:04.439
<v Speaker 1>These are very complex sounds, and to our perception, if

0:10:04.480 --> 0:10:07.480
<v Speaker 1>it's complex enough, it can seem like there's no single

0:10:07.559 --> 0:10:12.480
<v Speaker 1>discernible pitch, like there's no fundamental frequency over all

0:10:12.559 --> 0:10:16.640
<v Speaker 1>the others. If it's complex enough, we call it noise.

0:10:17.360 --> 0:10:21.040
<v Speaker 1>That is the technical term. It is noise. Now, the

0:10:21.160 --> 0:10:26.319
<v Speaker 1>unit we use to measure frequency is the hertz, uh,

0:10:26.559 --> 0:10:29.839
<v Speaker 1>H-E-R-T-Z. Typical human hearing ranges from

0:10:29.880 --> 0:10:33.840
<v Speaker 1>twenty hertz, which means a wave will pass a given

0:10:33.960 --> 0:10:37.400
<v Speaker 1>arbitrary point twenty times within a second, all the way

0:10:37.480 --> 0:10:40.439
<v Speaker 1>up to twenty kilohertz, which means a wave will

0:10:40.440 --> 0:10:44.520
<v Speaker 1>pass a particular point in time twenty thousand times in

0:10:44.520 --> 0:10:47.320
<v Speaker 1>a second, or a particular point on your waveform twenty

0:10:47.320 --> 0:10:50.880
<v Speaker 1>thousand times in a second. And most of our sensitivity

0:10:51.040 --> 0:10:54.800
<v Speaker 1>tends to be between one or two kilohertz up

0:10:54.840 --> 0:10:58.000
<v Speaker 1>to four or five kilohertz. That's generally where we

0:10:58.240 --> 0:11:02.280
<v Speaker 1>have human voices, and we've really gotten good at picking

0:11:02.280 --> 0:11:04.800
<v Speaker 1>those out over everything else. So our sensitivity of

0:11:04.880 --> 0:11:07.520
<v Speaker 1>hearing is really concentrated between one kilohertz and four

0:11:07.600 --> 0:11:10.640
<v Speaker 1>kilohertz, or two and five, depending upon whom you ask.

0:11:12.840 --> 0:11:16.200
<v Speaker 1>Now we get back over to amplitude, that is referring

0:11:16.240 --> 0:11:18.520
<v Speaker 1>to the height of the wave. It also refers to

0:11:18.559 --> 0:11:23.679
<v Speaker 1>the volume, the loudness of something. Amplitude means bigness. So

0:11:23.720 --> 0:11:27.199
<v Speaker 1>how big is the sound? Well, the greater the amplitude,

0:11:27.240 --> 0:11:30.319
<v Speaker 1>the louder it is. And amplitudes can have an enormous

0:11:30.480 --> 0:11:34.080
<v Speaker 1>range and affect how we perceive sounds. So, for example,

0:11:34.559 --> 0:11:38.720
<v Speaker 1>take a really complicated classical piece of music. It's just

0:11:38.840 --> 0:11:42.120
<v Speaker 1>easy to explain it in that term. You might have

0:11:42.160 --> 0:11:45.760
<v Speaker 1>a stretch in that classical piece of music in which

0:11:45.840 --> 0:11:48.360
<v Speaker 1>all the instruments are more or less playing at a

0:11:48.440 --> 0:11:52.000
<v Speaker 1>similar volume, so the sound from each instrument section has

0:11:52.040 --> 0:11:55.319
<v Speaker 1>a similar amplitude. But then there might be one segment

0:11:55.400 --> 0:11:58.920
<v Speaker 1>where an instrument group or maybe even a single soloist

0:11:59.559 --> 0:12:03.600
<v Speaker 1>has an increased amplitude and increased volume. It rises over

0:12:03.640 --> 0:12:06.960
<v Speaker 1>the rest of the orchestra, and that peak of the

0:12:07.000 --> 0:12:10.600
<v Speaker 1>amplitude is called the attack of the sound, and the

0:12:10.880 --> 0:12:16.040
<v Speaker 1>entire range of amplitudes is called the amplitude envelope. Now

0:12:16.040 --> 0:12:18.920
<v Speaker 1>this is important when we get to MP3s

0:12:18.960 --> 0:12:23.640
<v Speaker 1>because the way we perceive these sounds, uh, that

0:12:23.679 --> 0:12:26.240
<v Speaker 1>has everything to do with the way the MP3

0:12:26.360 --> 0:12:29.600
<v Speaker 1>was designed. The whole point of the MP3 was

0:12:29.640 --> 0:12:34.800
<v Speaker 1>to try and create a small file size to represent

0:12:34.880 --> 0:12:37.880
<v Speaker 1>what we can hear and kind of ignore everything else.

0:12:38.120 --> 0:12:40.760
<v Speaker 1>We'll get to that in a little bit more time.

0:12:40.920 --> 0:12:43.880
<v Speaker 1>So this is really interesting to me. If you take

0:12:44.240 --> 0:12:49.679
<v Speaker 1>a sound and you double its amplitude, you increase the

0:12:49.720 --> 0:12:54.080
<v Speaker 1>amplitude by twofold, a listener would not necessarily feel that

0:12:54.120 --> 0:12:59.400
<v Speaker 1>the sound is twice as loud. Human hearing is incredibly subjective,

0:13:00.040 --> 0:13:04.079
<v Speaker 1>and typically for most listeners, it would require much more

0:13:04.880 --> 0:13:08.760
<v Speaker 1>than doubling the sound's amplitude for them to feel that

0:13:08.880 --> 0:13:12.400
<v Speaker 1>the sound itself was twice as loud. This perception of

0:13:12.480 --> 0:13:14.920
<v Speaker 1>volume is important when we get to the lossy formats

0:13:14.920 --> 0:13:19.839
<v Speaker 1>for audio files. Now I've given you all this information,

0:13:20.040 --> 0:13:22.760
<v Speaker 1>and I know everyone is probably thinking, you know, I

0:13:23.120 --> 0:13:26.480
<v Speaker 1>learned this in primary school, elementary school. All of this

0:13:26.559 --> 0:13:29.800
<v Speaker 1>is really familiar to me, and you're maybe rolling your

0:13:29.800 --> 0:13:32.840
<v Speaker 1>eyes because it's so basic. But I think it's important

0:13:33.280 --> 0:13:36.560
<v Speaker 1>to have that refresher so that you can understand the

0:13:36.600 --> 0:13:41.240
<v Speaker 1>difference between sound as we experience it and sound as

0:13:41.320 --> 0:13:45.960
<v Speaker 1>the way we encode it digitally and replicate it digitally.

0:13:46.840 --> 0:13:49.840
<v Speaker 1>For one thing, this illustrates how sound in the real

0:13:49.880 --> 0:13:54.640
<v Speaker 1>world is a continuum. It's a continuum both in frequency

0:13:54.679 --> 0:13:59.920
<v Speaker 1>and amplitude. You can have sound changing in frequency very

0:14:00.280 --> 0:14:04.520
<v Speaker 1>smoothly from one pitch to another. You can also have

0:14:04.600 --> 0:14:09.199
<v Speaker 1>sound increase or decrease in amplitude in a very smooth way.

0:14:09.360 --> 0:14:14.240
<v Speaker 1>And it is continuous, it's unbroken. It can have smooth transitions.

0:14:14.240 --> 0:14:17.240
<v Speaker 1>And these qualities provide challenges when we want to describe

0:14:17.280 --> 0:14:22.960
<v Speaker 1>something digitally because at the heart of digital information is

0:14:23.400 --> 0:14:28.120
<v Speaker 1>the bit, the basic unit of information. It is a

0:14:28.240 --> 0:14:31.840
<v Speaker 1>unit of information that only has two states, zero or

0:14:32.000 --> 0:14:36.120
<v Speaker 1>one, essentially off or on. When you get down

0:14:36.160 --> 0:14:41.040
<v Speaker 1>to defining information in just two states, then you start

0:14:41.080 --> 0:14:44.040
<v Speaker 1>to look at something that is continuous and you realize

0:14:44.560 --> 0:14:46.240
<v Speaker 1>this is going to be a challenge. How do I

0:14:46.320 --> 0:14:52.160
<v Speaker 1>describe a continuous experience in very discrete amounts of information.

0:14:53.160 --> 0:14:57.280
<v Speaker 1>And that's when we get to the methodology we've developed

0:14:57.920 --> 0:15:01.280
<v Speaker 1>to digitally encode sound. I'm going to get into that

0:15:01.320 --> 0:15:04.680
<v Speaker 1>in just a minute, but before I do that, let's

0:15:04.720 --> 0:15:16.240
<v Speaker 1>take a quick break to thank our sponsor. All right,

0:15:16.360 --> 0:15:20.400
<v Speaker 1>let's get back into it. So we've talked about the

0:15:20.480 --> 0:15:23.400
<v Speaker 1>nature of sound. Analog sound, by the way, tries to

0:15:23.440 --> 0:15:27.560
<v Speaker 1>replicate exactly what we would experience in nature. It tries

0:15:27.560 --> 0:15:32.640
<v Speaker 1>to create this continuous experience, so you get these smooth

0:15:32.720 --> 0:15:38.320
<v Speaker 1>waves of frequencies and amplitudes. And that's why some people

0:15:38.480 --> 0:15:43.920
<v Speaker 1>argue that analog styles of sound recordings are

0:15:44.000 --> 0:15:48.800
<v Speaker 1>superior to digital ones. I don't necessarily think they're right,

0:15:49.360 --> 0:15:52.800
<v Speaker 1>but they often feel that way. So something like a

0:15:52.920 --> 0:15:58.000
<v Speaker 1>vinyl album, which is an analog format of digital, or sorry,

0:15:58.040 --> 0:16:02.280
<v Speaker 1>an analog format of music storage, I should say sound storage. Uh,

0:16:02.320 --> 0:16:04.960
<v Speaker 1>they think that that is superior to, say, a CD,

0:16:05.320 --> 0:16:10.320
<v Speaker 1>which is a digital storage format. Uh, and who's to say?

0:16:10.400 --> 0:16:14.440
<v Speaker 1>I mean, like, if your sense of hearing is incredibly

0:16:14.720 --> 0:16:18.040
<v Speaker 1>well tuned, you might be able to pick up on

0:16:18.120 --> 0:16:22.160
<v Speaker 1>some differences. Or if someone did a really terrible job

0:16:22.680 --> 0:16:28.000
<v Speaker 1>encoding music digitally, then that might reveal itself to you

0:16:28.040 --> 0:16:30.760
<v Speaker 1>as well. Uh. But this is one of those things

0:16:30.800 --> 0:16:32.960
<v Speaker 1>that I think a lot of people feel they can

0:16:32.960 --> 0:16:34.760
<v Speaker 1>tell the difference, but if they would do a double

0:16:34.800 --> 0:16:39.360
<v Speaker 1>blind test, they might be surprised at how difficult it is.

0:16:39.840 --> 0:16:43.200
<v Speaker 1>If everything's working the way it should, then

0:16:43.440 --> 0:16:48.000
<v Speaker 1>there shouldn't be a perceptible difference. At any rate, digital

0:16:48.040 --> 0:16:54.360
<v Speaker 1>audio has two really important factors: sample rate and bit depth,

0:16:55.160 --> 0:16:57.640
<v Speaker 1>or to another extent, bit rate. We'll talk about bit

0:16:57.760 --> 0:17:02.280
<v Speaker 1>rate as well. So the sample rate refers to how

0:17:02.320 --> 0:17:05.919
<v Speaker 1>many times you reference an analog sound to create the

0:17:05.960 --> 0:17:09.760
<v Speaker 1>digital version. So sound, like I said, is uninterrupted in

0:17:09.800 --> 0:17:14.840
<v Speaker 1>the analog world. You've got that nice waveform

0:17:14.920 --> 0:17:18.040
<v Speaker 1>in the analog world. That's not how the digital world works.

0:17:18.119 --> 0:17:21.320
<v Speaker 1>In the digital world, we have to describe that sound in a

0:17:21.400 --> 0:17:27.600
<v Speaker 1>series of discrete snippets of sound. It's probably easiest to

0:17:27.640 --> 0:17:33.840
<v Speaker 1>describe this with an analogy to movies on film. If

0:17:33.880 --> 0:17:37.359
<v Speaker 1>you work with film, like you're creating a movie on film,

0:17:37.840 --> 0:17:41.040
<v Speaker 1>then you know that you're not looking at a real

0:17:41.240 --> 0:17:44.240
<v Speaker 1>moving picture when you see the film played out at

0:17:44.280 --> 0:17:47.560
<v Speaker 1>the cinema. Instead, what you're looking at is a series

0:17:47.640 --> 0:17:52.160
<v Speaker 1>of photographs. If you take a film strip and you

0:17:52.240 --> 0:17:56.280
<v Speaker 1>look at it under a light, you'll see it's one

0:17:56.359 --> 0:18:00.760
<v Speaker 1>after another photograph. It's just a series of pictures. It's

0:18:00.760 --> 0:18:02.920
<v Speaker 1>only when you play them back at the right speed

0:18:03.520 --> 0:18:05.800
<v Speaker 1>and you project them onto a screen that you get the

0:18:05.880 --> 0:18:10.520
<v Speaker 1>illusion of continuous motion. But it's not really continuous. It's

0:18:10.560 --> 0:18:13.800
<v Speaker 1>just this series of photographs played at twenty four frames

0:18:13.800 --> 0:18:18.840
<v Speaker 1>per second in the case of actual film. So that

0:18:19.040 --> 0:18:22.200
<v Speaker 1>ends up being very analogous to the way we encode

0:18:22.200 --> 0:18:26.040
<v Speaker 1>digital audio. You take the analog recording and you take

0:18:26.359 --> 0:18:31.840
<v Speaker 1>snapshots of sound. The more frequently you take those snapshots,

0:18:32.240 --> 0:18:34.480
<v Speaker 1>the higher your sample rate. So in other words, if

0:18:34.480 --> 0:18:37.639
<v Speaker 1>you did one a second, your sample rate would be awful.

0:18:38.400 --> 0:18:40.600
<v Speaker 1>You would have a sample rate of one. But the

0:18:40.680 --> 0:18:43.680
<v Speaker 1>higher the sample rate, the closer your digital representation will

0:18:43.680 --> 0:18:47.280
<v Speaker 1>be to the frequency in the analog sound format. Actually,

0:18:47.760 --> 0:18:50.000
<v Speaker 1>what's really important to remember is that your sample rate

0:18:50.040 --> 0:18:52.480
<v Speaker 1>has to be about twice, actually it does have to be

0:18:52.520 --> 0:18:56.920
<v Speaker 1>twice what the highest frequency sound is in your recording.

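NOTE
Editor's sketch, not part of the spoken audio. It illustrates why the sample rate needs to be at least twice the highest frequency you want to capture.
With too few snapshots per second, a high tone yields exactly the same sample values as a lower one, so the two cannot be told apart.
The 8000-samples-per-second rate, the test tones, and the function names are assumptions chosen to keep the arithmetic easy.
```python
import math
def sample_tone(frequency_hz: float, sample_rate_hz: int, count: int) -> list[float]:
    """Take `count` snapshots of a cosine tone at the given sample rate."""
    return [math.cos(2 * math.pi * frequency_hz * n / sample_rate_hz) for n in range(count)]
def min_sample_rate_hz(highest_frequency_hz: float) -> float:
    """Rule of thumb from the episode: twice the highest frequency in the recording."""
    return 2 * highest_frequency_hz
rate = 8_000                             # half of this rate is 4000 Hz
too_high = sample_tone(5_000, rate, 16)  # tone above half the sample rate
alias = sample_tone(3_000, rate, 16)     # 8000 - 5000 = 3000 Hz
# The snapshots match, so after sampling the 5000 Hz tone looks like a 3000 Hz tone.
same = all(abs(a - b) < 1e-9 for a, b in zip(too_high, alias))
print(min_sample_rate_hz(20_000), same)
```
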
0:18:58.440 --> 0:19:01.480
<v Speaker 1>It has to be, because if it's not, it

0:19:01.640 --> 0:19:07.720
<v Speaker 1>cannot encode that sound accurately. It's kind of interesting and

0:19:07.840 --> 0:19:09.919
<v Speaker 1>you might wonder, how do we take these snapshots in

0:19:09.920 --> 0:19:12.920
<v Speaker 1>the first place. Well, if you're capturing audio, let's say

0:19:12.960 --> 0:19:16.360
<v Speaker 1>we're recording to digital, So we've got a microphone set

0:19:16.440 --> 0:19:20.960
<v Speaker 1>up and we're recording to a digital media storage. Like

0:19:21.040 --> 0:19:23.280
<v Speaker 1>let's just say we're recording straight to someone's hard drive.

0:19:23.440 --> 0:19:26.760
<v Speaker 1>So we're talking into a microphone recording to a hard drive.

0:19:27.720 --> 0:19:31.440
<v Speaker 1>So you're using an analog microphone, let's say. You would

0:19:31.440 --> 0:19:35.760
<v Speaker 1>need an analog-to-digital converter. Now, this particular component

0:19:36.040 --> 0:19:40.800
<v Speaker 1>can receive discrete voltages from another device like your microphone.

0:19:41.040 --> 0:19:47.920
<v Speaker 1>So your microphone is converting sound into, uh, differences in voltage.

0:19:48.000 --> 0:19:50.880
<v Speaker 1>That's essentially how it communicates, so that it can then

0:19:51.040 --> 0:19:54.080
<v Speaker 1>send that to some other element. In this case, it's

0:19:54.119 --> 0:19:57.720
<v Speaker 1>sending it to the analog-to-digital converter so

0:19:57.760 --> 0:20:00.400
<v Speaker 1>that it can be stored digitally on your hard drive.

0:20:01.480 --> 0:20:08.560
<v Speaker 1>So this analog-to-digital converter references, or samples, the discrete

0:20:08.680 --> 0:20:12.240
<v Speaker 1>voltage many times every second in order to create a

0:20:12.280 --> 0:20:16.760
<v Speaker 1>digital representation of the analog sound. It converts the voltages

0:20:16.840 --> 0:20:21.399
<v Speaker 1>into numbers in a process called quantization, and we express

0:20:21.480 --> 0:20:24.480
<v Speaker 1>those numbers in bits. So these are zeros and ones.

0:20:25.080 --> 0:20:27.760
<v Speaker 1>When you want to play the digital audio, a digital

0:20:27.800 --> 0:20:31.840
<v Speaker 1>to analog converter does the same process in reverse. So

0:20:32.080 --> 0:20:35.800
<v Speaker 1>it takes this digital information, these zeros and ones and

0:20:35.880 --> 0:20:39.600
<v Speaker 1>converts it into a series of discrete voltages, which then

0:20:39.840 --> 0:20:43.520
<v Speaker 1>can be amplified and sent to a speaker and create sound.

0:20:44.760 --> 0:20:47.360
<v Speaker 1>So all of that's really important. But now let's

0:20:47.359 --> 0:20:49.960
<v Speaker 1>talk about some concrete examples, and the best way to

0:20:49.960 --> 0:20:53.240
<v Speaker 1>do this is to go with compact discs. Because we

0:20:53.320 --> 0:20:57.119
<v Speaker 1>have a standard sample rate for compact discs, and that

0:20:57.280 --> 0:21:00.560
<v Speaker 1>standard sample rate is forty four point one kilohertz

0:21:00.680 --> 0:21:04.200
<v Speaker 1>to create CD-quality audio. That means that the audio

0:21:04.280 --> 0:21:10.000
<v Speaker 1>is sampled forty four thousand, one hundred times every second.

0:21:10.880 --> 0:21:12.840
<v Speaker 1>But wait, I hear you say, the range of human

0:21:12.880 --> 0:21:15.359
<v Speaker 1>hearing, you said, only goes from twenty hertz to twenty

0:21:15.440 --> 0:21:18.280
<v Speaker 1>kilohertz. If it only goes up to twenty kilohertz,

0:21:18.280 --> 0:21:21.080
<v Speaker 1>why are you sampling at forty four thousand, one hundred

0:21:21.160 --> 0:21:25.560
<v Speaker 1>times every second? If it's twenty thousand times a second

0:21:25.600 --> 0:21:28.919
<v Speaker 1>for the frequency, why go up to forty four thousand, one

0:21:29.000 --> 0:21:31.520
<v Speaker 1>hundred? Is there some relationship between that and the CD

0:21:31.640 --> 0:21:34.680
<v Speaker 1>sample rate? And the answer is yes. So there

0:21:34.800 --> 0:21:40.000
<v Speaker 1>is a theorem called the Nyquist-Shannon sampling theorem, and

0:21:40.080 --> 0:21:42.760
<v Speaker 1>that states that the sample rate must be at least twice the

0:21:42.840 --> 0:21:46.000
<v Speaker 1>maximum frequency of a recording in order to describe the

0:21:46.040 --> 0:21:50.240
<v Speaker 1>frequency properly. So the general thought is the maximum frequency

0:21:50.320 --> 0:21:52.919
<v Speaker 1>most humans can hear is twenty kilohertz. And for that reason,

0:21:52.960 --> 0:21:55.760
<v Speaker 1>Philips and Sony when they were working to create the

0:21:55.960 --> 0:21:59.879
<v Speaker 1>CD format to make it a standard, they decided on

0:22:00.040 --> 0:22:02.879
<v Speaker 1>forty four point one kilohertz as that standard sample

0:22:03.000 --> 0:22:05.399
<v Speaker 1>rate for CD audio. It was more than double

0:22:05.440 --> 0:22:08.040
<v Speaker 1>the top frequency generally considered to be in the upper

0:22:08.119 --> 0:22:11.200
<v Speaker 1>level of human hearing. But what happens if you were

0:22:11.200 --> 0:22:14.400
<v Speaker 1>to lower the sampling rate. What if you didn't sample

0:22:14.480 --> 0:22:19.600
<v Speaker 1>at... What if you sampled at, let's say, sixteen kilohertz,

0:22:19.600 --> 0:22:23.120
<v Speaker 1>so sixteen thousand times a second you sample it? Well,

0:22:23.400 --> 0:22:25.560
<v Speaker 1>that means you would only be able to record and

0:22:25.600 --> 0:22:29.200
<v Speaker 1>replicate any sound with a frequency up to eight kilohertz

0:22:29.280 --> 0:22:34.280
<v Speaker 1>or less, so eight thousand hertz or less. But

0:22:34.440 --> 0:22:37.640
<v Speaker 1>if you had any sound that was greater than eight

0:22:37.640 --> 0:22:42.080
<v Speaker 1>thousand hertz or eight kilohertz, anything higher than that,

0:22:43.080 --> 0:22:46.400
<v Speaker 1>it would be folded down to fit below the eight

0:22:46.480 --> 0:22:50.200
<v Speaker 1>kilohertz limit. Perceptually, that means the sounds you would

0:22:50.240 --> 0:22:53.199
<v Speaker 1>hear in the playback could include frequencies that were not

0:22:53.359 --> 0:22:58.160
<v Speaker 1>present in the original performance of that sound. So let's

0:22:58.160 --> 0:23:02.600
<v Speaker 1>say that I'm using a sample rate of sixteen uh,

0:23:02.640 --> 0:23:06.359
<v Speaker 1>you know, kilohertz, and someone is playing a musical

0:23:06.440 --> 0:23:09.200
<v Speaker 1>instrument and they play a note that's at a nine

0:23:09.280 --> 0:23:14.760
<v Speaker 1>kilohertz frequency. Well, because I'm sampling at sixteen kilohertz,

0:23:15.400 --> 0:23:19.679
<v Speaker 1>my limit for frequencies is eight kilohertz. If you

0:23:19.720 --> 0:23:22.600
<v Speaker 1>play something at nine kilohertz, what happens is

0:23:22.920 --> 0:23:27.280
<v Speaker 1>the recording seems to fold the sound back, and it

0:23:27.400 --> 0:23:31.879
<v Speaker 1>folds it back by the same amount that the sound

0:23:31.960 --> 0:23:37.119
<v Speaker 1>goes over the sample rate, or rather the Nyquist limit, I

0:23:37.119 --> 0:23:39.639
<v Speaker 1>should say, not the sample rate itself, but the Nyquist limit.

0:23:40.760 --> 0:23:45.760
<v Speaker 1>So a nine kilohertz sound is played. My limit is eight

0:23:45.840 --> 0:23:49.000
<v Speaker 1>kilohertz. Well, nine kilohertz is one kilohertz

0:23:49.000 --> 0:23:52.040
<v Speaker 1>more than eight, so it folds it back and the

0:23:52.080 --> 0:23:55.359
<v Speaker 1>sound you would hear on the recording would be seven

0:23:55.440 --> 0:23:59.040
<v Speaker 1>kilohertz. So the original sound is nine kilohertz.

0:23:59.119 --> 0:24:03.480
<v Speaker 1>The playback sound is seven kilohertz, and you would

0:24:03.560 --> 0:24:07.719
<v Speaker 1>hear something recorded that wasn't actually played. That's why you

0:24:07.760 --> 0:24:10.840
<v Speaker 1>have to have a really high sample rate so that

0:24:10.880 --> 0:24:14.720
<v Speaker 1>you don't have these instances where sound gets folded back

0:24:15.520 --> 0:24:20.399
<v Speaker 1>into the frequency range, because otherwise what you were hearing

0:24:20.560 --> 0:24:24.560
<v Speaker 1>is not an accurate representation of what was actually generated

0:24:24.800 --> 0:24:28.960
<v Speaker 1>what you were trying to record. This whole phenomenon, by

0:24:29.000 --> 0:24:32.320
<v Speaker 1>the way, is called foldover, or sometimes aliasing.
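
NOTE
The fold-back arithmetic above can be sketched as a small function. The name aliased_frequency is hypothetical, and the sketch assumes a single pure tone with no anti-aliasing filter in front of the converter.
```python
def aliased_frequency(freq_hz, sample_rate_hz):
    """Return the frequency that appears in the recording for a pure tone of freq_hz."""
    nyquist = sample_rate_hz / 2  # highest frequency the sample rate can describe
    folded = freq_hz % sample_rate_hz  # sampled spectra repeat every sample_rate_hz
    if folded > nyquist:
        folded = sample_rate_hz - folded  # reflect the overshoot back below the limit
    return folded
print(aliased_frequency(9000, 16000))  # 7000: one kilohertz over the limit lands one kilohertz under it
print(aliased_frequency(5000, 16000))  # 5000: already below the limit, so unchanged
```
At the CD rate of 44100 samples per second, the same function leaves a 20000 hertz tone unchanged, since it sits below the 22050 hertz limit.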

0:24:33.720 --> 0:24:36.880
<v Speaker 1>So that's sample rate. But then we've got bit depth. Now,

0:24:36.920 --> 0:24:41.159
<v Speaker 1>this is all about measuring the volume or amplitude of

0:24:41.160 --> 0:24:44.440
<v Speaker 1>a sound. So you have a range. You just make

0:24:44.440 --> 0:24:48.280
<v Speaker 1>an arbitrary range to say, like we're gonna go quietest

0:24:48.320 --> 0:24:51.320
<v Speaker 1>to loudest, and you just define what that range is.

0:24:51.440 --> 0:24:54.160
<v Speaker 1>It could literally be any range. Let's say you say

0:24:54.240 --> 0:24:58.360
<v Speaker 1>zero to one hundred. Zero is dead silence, no sound at all.

0:24:58.840 --> 0:25:02.560
<v Speaker 1>One hundred is as loud as the sound ever gets.

0:25:02.680 --> 0:25:06.480
<v Speaker 1>It's the peak volume of sound. That means you can

0:25:06.560 --> 0:25:11.399
<v Speaker 1>describe all the different volumes within that recording at a

0:25:11.520 --> 0:25:15.359
<v Speaker 1>number between zero and one hundred. But let's say you

0:25:15.440 --> 0:25:18.800
<v Speaker 1>take that same recording and instead of making the range

0:25:19.000 --> 0:25:22.840
<v Speaker 1>zero to one hundred, you say it's zero to two thousand.

0:25:23.240 --> 0:25:26.840
<v Speaker 1>You haven't made the volume louder. The volume is still

0:25:27.080 --> 0:25:29.679
<v Speaker 1>the exact same as it was when you called the

0:25:29.760 --> 0:25:32.800
<v Speaker 1>range zero to one hundred. But what you have done

0:25:33.240 --> 0:25:38.160
<v Speaker 1>is added more units. You've created more precise steps between

0:25:38.400 --> 0:25:43.520
<v Speaker 1>absolute silence and as loud as it gets. So you've

0:25:43.560 --> 0:25:45.359
<v Speaker 1>just increased the size of the range so that you

0:25:45.400 --> 0:25:48.959
<v Speaker 1>can be more precise in the differences in volume. And

0:25:48.960 --> 0:25:52.440
<v Speaker 1>this is really important. So let's say that you've got

0:25:52.480 --> 0:25:55.280
<v Speaker 1>a sound that you rank at seventy eight and another

0:25:55.320 --> 0:25:58.800
<v Speaker 1>sound that you rank at seventy nine, and that's gonna

0:25:58.800 --> 0:26:01.159
<v Speaker 1>be the same for both of these ranges. Uh, just

0:26:01.240 --> 0:26:04.000
<v Speaker 1>two different examples. Actually, So you've got your zero to

0:26:04.040 --> 0:26:08.000
<v Speaker 1>one hundred range, and a seventy eight would be seventy eight

0:26:08.080 --> 0:26:12.399
<v Speaker 1>percent of the loudest sound in the entire recording, and

0:26:12.440 --> 0:26:15.320
<v Speaker 1>a seventy nine would be seventy nine percent of the

0:26:15.400 --> 0:26:19.320
<v Speaker 1>loudest sound in the entire recording. That's an actually pretty

0:26:19.320 --> 0:26:21.920
<v Speaker 1>hefty jump. But let's say we instead went with that

0:26:22.119 --> 0:26:25.320
<v Speaker 1>zero to two thousand range and you still had seventy

0:26:25.320 --> 0:26:29.600
<v Speaker 1>eight and seventy nine. Well, seventy eight would represent three

0:26:29.640 --> 0:26:33.160
<v Speaker 1>point nine percent of the full volume and seventy nine

0:26:33.160 --> 0:26:37.199
<v Speaker 1>would represent three point nine five percent of full volume.

0:26:37.400 --> 0:26:40.520
<v Speaker 1>In other words, you'd be able to mark much more

0:26:40.640 --> 0:26:44.600
<v Speaker 1>subtle differences in volume, and that means you can have

0:26:44.680 --> 0:26:49.440
<v Speaker 1>more nuance in your recording. And since we're talking about

0:26:49.560 --> 0:26:52.280
<v Speaker 1>a natural sound to start off with, so you're taking

0:26:52.320 --> 0:26:55.560
<v Speaker 1>a natural sound and you're trying to digitize it. Smooth

0:26:55.640 --> 0:26:59.879
<v Speaker 1>changes in amplitude are possible in natural sound. Using a

0:27:00.000 --> 0:27:03.159
<v Speaker 1>broader range to describe the volume is best if you

0:27:03.200 --> 0:27:08.040
<v Speaker 1>want to get an accurate representation or resolution of that sound.
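
NOTE
The two ranges compared above can be checked with a few lines of arithmetic. The helper name percent_of_full_scale is an assumption made for this sketch.
```python
def percent_of_full_scale(level, top_of_range):
    """Express a volume level as a percentage of the loudest value in the range."""
    return 100 * level / top_of_range
for top in (100, 2000):
    low = percent_of_full_scale(78, top)
    high = percent_of_full_scale(79, top)
    # One step is 1 percent of full volume on the 0-100 range, 0.05 percent on 0-2000
    print(top, low, high, round(high - low, 2))
```
The step between neighboring levels shrinks from 1 percent to 0.05 percent of full volume, which is the extra nuance being described.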

0:27:08.680 --> 0:27:11.679
<v Speaker 1>Going back to that zero to one hundred range, changes in

0:27:11.720 --> 0:27:15.000
<v Speaker 1>volume would be more chunky. Two sounds that have slight

0:27:15.080 --> 0:27:18.960
<v Speaker 1>differences in amplitude would end up being defined as being

0:27:18.960 --> 0:27:22.840
<v Speaker 1>identical because you wouldn't have the precision. You know, you

0:27:22.840 --> 0:27:25.320
<v Speaker 1>couldn't say this one's seventy eight and a half. It

0:27:25.320 --> 0:27:27.680
<v Speaker 1>would either be seventy eight or seventy nine. So you

0:27:27.720 --> 0:27:31.520
<v Speaker 1>could have two sounds that in greater precision you could

0:27:31.520 --> 0:27:35.080
<v Speaker 1>tell the difference between their volumes. But if you have

0:27:35.359 --> 0:27:39.720
<v Speaker 1>that lower, that more shallow bit depth, you wouldn't be

0:27:39.760 --> 0:27:41.440
<v Speaker 1>able to tell the difference of it. You would lose

0:27:41.520 --> 0:27:44.760
<v Speaker 1>that nuance, that subtlety. This is part of the reason

0:27:44.880 --> 0:27:48.960
<v Speaker 1>why people say, like a lot of the modern music

0:27:49.080 --> 0:27:53.280
<v Speaker 1>has uh lower ranges and changes in volume, like the

0:27:53.280 --> 0:27:56.719
<v Speaker 1>loudest loud parts and the softest soft parts. That

0:27:56.880 --> 0:28:00.800
<v Speaker 1>range has decreased over time, which a lot of people

0:28:00.800 --> 0:28:04.800
<v Speaker 1>have argued has meant that music has gotten less complex

0:28:05.040 --> 0:28:09.280
<v Speaker 1>and therefore, in some minds, less interesting. That's on a

0:28:09.359 --> 0:28:13.760
<v Speaker 1>related uh kind of philosophy to what I'm talking about here.

0:28:15.400 --> 0:28:19.480
<v Speaker 1>So you want to have those smaller steps between each

0:28:19.560 --> 0:28:23.879
<v Speaker 1>unit so you can create greater resolution, more smoothness to

0:28:23.920 --> 0:28:28.840
<v Speaker 1>the recorded audio. And it's actually the bit rate in

0:28:28.920 --> 0:28:32.880
<v Speaker 1>CD audio that will help make the sound seem smooth.

0:28:33.600 --> 0:28:36.159
<v Speaker 1>So if you ever listened to eight bit music, you know,

0:28:36.320 --> 0:28:39.040
<v Speaker 1>like the kind from old video game consoles. That sound

0:28:39.080 --> 0:28:42.520
<v Speaker 1>is really harsh and sort of chunky. It has an appeal,

0:28:42.800 --> 0:28:46.320
<v Speaker 1>but it's not you know, it's not smooth at all.

0:28:47.120 --> 0:28:49.680
<v Speaker 1>It can create an amazing effect, but if you want

0:28:49.720 --> 0:28:54.160
<v Speaker 1>to represent true analog sound, it's not awesome. But if

0:28:54.200 --> 0:28:58.800
<v Speaker 1>you went up to sixteen bit, that's CD quality bit depth,

0:28:59.480 --> 0:29:04.080
<v Speaker 1>it's much better. Uh, professional recording studios will do twenty

0:29:04.120 --> 0:29:07.400
<v Speaker 1>four bit or thirty two bit because they're gonna do

0:29:07.440 --> 0:29:10.840
<v Speaker 1>a lot of post processing work on those audio files.

0:29:11.120 --> 0:29:13.040
<v Speaker 1>And when you do that post processing work, if you

0:29:13.080 --> 0:29:16.880
<v Speaker 1>do it at sixteen bit, the stuff you're doing, the

0:29:16.960 --> 0:29:19.880
<v Speaker 1>changes you make, can become noticeable, and most times you

0:29:19.920 --> 0:29:22.480
<v Speaker 1>don't want that. You don't want it to be you know,

0:29:22.720 --> 0:29:24.640
<v Speaker 1>you don't want it to stand out from the rest

0:29:24.680 --> 0:29:27.360
<v Speaker 1>of the audio file. But that's the only reason they

0:29:27.400 --> 0:29:29.400
<v Speaker 1>go up to twenty four bit or thirty two bit.

0:29:29.760 --> 0:29:33.360
<v Speaker 1>There'd be no point in playing it back at that rate,

0:29:33.520 --> 0:29:39.240
<v Speaker 1>that bit depth, because human hearing is not so adept

0:29:39.360 --> 0:29:42.320
<v Speaker 1>to tell the difference, at least not for most humans.

0:29:43.200 --> 0:29:46.280
<v Speaker 1>So if you played back a recording at sixteen bit

0:29:46.640 --> 0:29:48.400
<v Speaker 1>and another one at twenty four bit, and it's the

0:29:48.480 --> 0:29:51.640
<v Speaker 1>same piece, most people would not be able to tell

0:29:51.640 --> 0:29:55.120
<v Speaker 1>the difference because you've already reached a resolution that equals

0:29:55.240 --> 0:29:59.120
<v Speaker 1>the precision of human hearing. Keeping in mind again, human

0:29:59.160 --> 0:30:02.080
<v Speaker 1>hearing is subjective. Not everyone is equal. There's some

0:30:02.120 --> 0:30:05.640
<v Speaker 1>people who have incredible hearing who may be able to

0:30:05.680 --> 0:30:08.880
<v Speaker 1>pick out that difference. I am not one of those people,

0:30:09.480 --> 0:30:11.240
<v Speaker 1>but I am a person who's going to tell you.

0:30:11.760 --> 0:30:14.240
<v Speaker 1>We'll get to the last section in just a bit,

0:30:14.720 --> 0:30:18.440
<v Speaker 1>but first let's take another quick break to thank our sponsor.

0:30:27.120 --> 0:30:29.800
<v Speaker 1>All right, so bit depth, what we just talked about,

0:30:30.120 --> 0:30:32.400
<v Speaker 1>that can be thought of as how well the sound

0:30:32.480 --> 0:30:36.960
<v Speaker 1>is described, and the sampling rate is how frequently or

0:30:37.000 --> 0:30:41.640
<v Speaker 1>how much the sound is described. And CD Audio quality

0:30:41.680 --> 0:30:45.280
<v Speaker 1>has sixteen bit audio. That means that they actually have

0:30:45.520 --> 0:30:50.280
<v Speaker 1>sixty five thousand, five hundred thirty six different levels of

0:30:50.360 --> 0:30:55.120
<v Speaker 1>volume that they can describe within an audio track. So

0:30:55.200 --> 0:30:59.520
<v Speaker 1>my example of zero to two thousand that is primitive

0:30:59.640 --> 0:31:01.800
<v Speaker 1>compared to CD audio because it has the

0:31:01.960 --> 0:31:06.120
<v Speaker 1>sixteen bit style, sixty five thousand, five hundred thirty six different levels.
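
NOTE
As a quick check on the figure quoted here: n bits can represent two to the power n distinct values. The helper name quantization_levels is hypothetical.
```python
def quantization_levels(bit_depth):
    """Number of distinct values that bit_depth bits can represent."""
    return 2 ** bit_depth
for bits in (8, 16, 24):
    print(bits, quantization_levels(bits))  # 8 gives 256, 16 gives 65536, 24 gives 16777216
```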

0:31:06.880 --> 0:31:11.240
<v Speaker 1>And how is that possible? Well, when we say sixteen bit,

0:31:11.920 --> 0:31:15.320
<v Speaker 1>remember a bit represents two states zero or one. So

0:31:15.360 --> 0:31:18.720
<v Speaker 1>you take the number two and then you raise it

0:31:18.760 --> 0:31:23.680
<v Speaker 1>to the power of sixteen. Uh, so you multiply two

0:31:24.000 --> 0:31:27.200
<v Speaker 1>by itself sixteen times and you get sixty five thousand,

0:31:27.200 --> 0:31:30.240
<v Speaker 1>five hundred thirty six. So that's where that number

0:31:30.280 --> 0:31:34.440
<v Speaker 1>comes from. Now, with your digital sample, you have a

0:31:34.480 --> 0:31:37.640
<v Speaker 1>collection of points that roughly replicate the shape of an

0:31:37.680 --> 0:31:40.680
<v Speaker 1>analog sound wave. It's gonna look a little funky, but

0:31:41.120 --> 0:31:44.720
<v Speaker 1>you'll be able to see what the frequency and amplitude

0:31:45.120 --> 0:31:48.240
<v Speaker 1>generally was of the original recording if you were to

0:31:48.400 --> 0:31:51.960
<v Speaker 1>plot this on an X y axis. But if you

0:31:52.000 --> 0:31:55.480
<v Speaker 1>were just to connect each successive point with a straight line,

0:31:56.080 --> 0:31:58.680
<v Speaker 1>even as close together as they would be, because you're

0:31:58.720 --> 0:32:02.160
<v Speaker 1>looking at forty four thousand one hundred times a second, it

0:32:02.200 --> 0:32:05.560
<v Speaker 1>would sound pretty awful. So we actually use an algorithm

0:32:05.640 --> 0:32:10.240
<v Speaker 1>called interpolation to join the points smoothly to imitate a

0:32:10.320 --> 0:32:13.600
<v Speaker 1>sound wave form, and that gives a musical playback program

0:32:13.640 --> 0:32:17.280
<v Speaker 1>the ability to replicate an analog wave form. And that's

0:32:17.320 --> 0:32:22.240
<v Speaker 1>actually called pulse code modulation or PCM. And if

0:32:22.280 --> 0:32:27.760
<v Speaker 1>you store audio uh intact this way, you would have

0:32:27.840 --> 0:32:31.920
<v Speaker 1>what we call a lossless audio file, which means exactly

0:32:31.920 --> 0:32:34.880
<v Speaker 1>what it sounds like. None of that data would ever

0:32:34.920 --> 0:32:37.800
<v Speaker 1>get filtered out of the file, even if the sounds

0:32:37.880 --> 0:32:40.320
<v Speaker 1>were beyond the range of human hearing, they would be

0:32:40.360 --> 0:32:45.120
<v Speaker 1>recorded and you would have a lossless file format. Those

0:32:45.160 --> 0:32:47.840
<v Speaker 1>files tend to be quite big, depending upon how long

0:32:47.840 --> 0:32:51.600
<v Speaker 1>a recording you make, of course. All right, so now

0:32:52.040 --> 0:32:53.720
<v Speaker 1>here's where it gets a little confusing. And I think

0:32:53.720 --> 0:32:55.520
<v Speaker 1>I even said bit rate a couple of times when

0:32:55.520 --> 0:32:58.720
<v Speaker 1>I really meant bit depth earlier. But up to this point,

0:32:58.760 --> 0:33:02.680
<v Speaker 1>I really was talking bit depth. So my apologies to

0:33:02.720 --> 0:33:05.280
<v Speaker 1>all of you out there if a bit rate slipped through,

0:33:05.760 --> 0:33:07.360
<v Speaker 1>because I did not mean it. Now I'm going to

0:33:07.440 --> 0:33:10.480
<v Speaker 1>talk about bit rate and show you how it's different

0:33:10.520 --> 0:33:14.920
<v Speaker 1>than bit depth. Bit rate refers to the amount of

0:33:15.040 --> 0:33:20.000
<v Speaker 1>data audio uses per second or requires per second of recording,

0:33:20.560 --> 0:33:23.920
<v Speaker 1>and you derive bit rate from the bit depth and

0:33:24.000 --> 0:33:28.880
<v Speaker 1>the sampling rate. It's represented as bits per second. So again,

0:33:28.920 --> 0:33:31.120
<v Speaker 1>let's go to CD-quality sound. That makes it easy.

0:33:31.400 --> 0:33:36.560
<v Speaker 1>You have forty four thousand one hundred samples per second. You've got sixteen

0:33:36.640 --> 0:33:40.959
<v Speaker 1>bits or two bytes, because remember a byte is eight bits,

0:33:41.760 --> 0:33:45.480
<v Speaker 1>so you've got two bytes to describe each sample. So

0:33:45.680 --> 0:33:51.239
<v Speaker 1>two bytes for forty four thousand one hundred samples per second. Uh, plus you

0:33:51.400 --> 0:33:54.360
<v Speaker 1>probably are gonna have to multiply that by two because

0:33:54.400 --> 0:33:57.120
<v Speaker 1>you're probably recording in stereo, so you have to do

0:33:57.160 --> 0:34:01.560
<v Speaker 1>that once for each track, so you get that number, then

0:34:01.600 --> 0:34:04.120
<v Speaker 1>you have to multiply that by sixty seconds to determine

0:34:04.160 --> 0:34:07.360
<v Speaker 1>how much data per minute you are creating when you're recording,

0:34:07.840 --> 0:34:10.440
<v Speaker 1>and with CD-quality audio, that ends up being about

0:34:10.480 --> 0:34:14.480
<v Speaker 1>ten megabytes of data per minute. Now these days that's

0:34:14.640 --> 0:34:17.960
<v Speaker 1>not really that big a deal because we're dealing with

0:34:18.080 --> 0:34:22.640
<v Speaker 1>super fast internet speeds and enormous hard drives. But just

0:34:22.880 --> 0:34:25.640
<v Speaker 1>a few years ago, that was considered to be a

0:34:25.840 --> 0:34:29.880
<v Speaker 1>really sizeable file, I mean an enormous file, and so

0:34:30.040 --> 0:34:32.480
<v Speaker 1>if you wanted to find a way to distribute digital

0:34:32.520 --> 0:34:35.560
<v Speaker 1>audio so it didn't take up too much space, you

0:34:35.719 --> 0:34:39.680
<v Speaker 1>had to figure out how you could compress those files

0:34:40.239 --> 0:34:43.799
<v Speaker 1>and make them smaller, make them more manageable. And now

0:34:43.840 --> 0:34:48.319
<v Speaker 1>we can finally get back to Germany and Herr Brandenburg.
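
NOTE
The CD bit rate arithmetic from the preceding passage can be sketched as follows, assuming uncompressed stereo PCM and ignoring any file header or disc-level overhead. The function name pcm_bytes_per_minute is hypothetical.
```python
def pcm_bytes_per_minute(sample_rate_hz, bit_depth, channels):
    """Bytes of raw PCM audio produced by one minute of recording."""
    bits_per_second = sample_rate_hz * bit_depth * channels  # this product is the bit rate
    bytes_per_second = bits_per_second // 8  # eight bits per byte
    return bytes_per_second * 60  # sixty seconds per minute
print(44100 * 16 * 2)  # 1411200 bits per second for CD-quality stereo
print(pcm_bytes_per_minute(44100, 16, 2))  # 10584000 bytes, roughly ten megabytes per minute
```
Dropping to one channel halves the figure, which is why the stereo doubling matters in the estimate.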

0:34:49.000 --> 0:34:52.120
<v Speaker 1>You thought we left him behind, We didn't. He was

0:34:52.200 --> 0:34:55.480
<v Speaker 1>just part of a flashback. So let's go to the

0:34:55.560 --> 0:34:58.719
<v Speaker 1>MP three. First of all, it gets its name from

0:34:58.760 --> 0:35:02.120
<v Speaker 1>the Moving Picture Experts Group, also known as MPEG.

0:35:03.160 --> 0:35:06.640
<v Speaker 1>It was part of a project that MPEG was doing

0:35:06.680 --> 0:35:10.080
<v Speaker 1>that was looking at ways of compressing audio. Along with

0:35:10.840 --> 0:35:14.160
<v Speaker 1>the work that they were doing with video files. It's

0:35:14.160 --> 0:35:18.600
<v Speaker 1>actually named after the process that they developed, called MPEG

0:35:18.640 --> 0:35:21.759
<v Speaker 1>Audio Layer three. So yes, there was a layer one

0:35:21.800 --> 0:35:25.120
<v Speaker 1>and a layer two. Layer three was a refinement of

0:35:25.160 --> 0:35:27.880
<v Speaker 1>the approach and was the one that was actually successful

0:35:27.920 --> 0:35:32.560
<v Speaker 1>in the market. Now, Brandenburg was working with an instructor.

0:35:32.800 --> 0:35:36.040
<v Speaker 1>He was pursuing, Brandenburg was pursuing, a PhD at the

0:35:36.080 --> 0:35:38.680
<v Speaker 1>time and trying to come up with a practical means

0:35:38.680 --> 0:35:42.160
<v Speaker 1>of transmitting digital audio across phone lines, and in the

0:35:42.200 --> 0:35:45.600
<v Speaker 1>process he began to experiment with algorithms that could take

0:35:45.760 --> 0:35:51.280
<v Speaker 1>digital audio information and determine which bits are significant. Anything

0:35:51.320 --> 0:35:55.480
<v Speaker 1>that was deemed insignificant could be discarded. So the thinking

0:35:55.600 --> 0:35:59.440
<v Speaker 1>was that information we cannot perceive as human beings is worthless.

0:35:59.480 --> 0:36:03.320
<v Speaker 1>There's no point in preserving it in an audio file format.

0:36:03.360 --> 0:36:06.120
<v Speaker 1>It's just taking up space that we can't even perceive

0:36:06.200 --> 0:36:08.840
<v Speaker 1>when we play it back, So there's no reason to

0:36:08.880 --> 0:36:11.719
<v Speaker 1>replicate it, there's no reason to record it. Leave it out,

0:36:12.200 --> 0:36:15.560
<v Speaker 1>and that way you could compress digital audio files. Or

0:36:15.640 --> 0:36:18.800
<v Speaker 1>to put it another way, if the algorithm determined that

0:36:18.880 --> 0:36:21.440
<v Speaker 1>a sound was outside the range of human hearing, it

0:36:21.480 --> 0:36:24.399
<v Speaker 1>would drop it from the encoding process, so you get

0:36:24.440 --> 0:36:29.120
<v Speaker 1>a sound file much smaller than the more accurate representative version.

0:36:29.400 --> 0:36:32.520
<v Speaker 1>So the lossless version would be more accurate to the

0:36:32.560 --> 0:36:36.239
<v Speaker 1>original sound. But this new version, what we would call

0:36:36.280 --> 0:36:39.759
<v Speaker 1>a lossy version, a compressed file, would be able to

0:36:39.800 --> 0:36:44.480
<v Speaker 1>replicate it pretty well if it's designed properly, and maybe

0:36:44.880 --> 0:36:47.440
<v Speaker 1>to a point if you design it well enough that

0:36:47.520 --> 0:36:50.200
<v Speaker 1>you couldn't tell the difference between the two. Uh. That

0:36:50.280 --> 0:36:54.000
<v Speaker 1>took some time. That was not easy to do. So

0:36:55.360 --> 0:36:58.440
<v Speaker 1>the new file, the new version, the compressed one, the

0:36:58.480 --> 0:37:02.640
<v Speaker 1>lossy format, would only have the actual relevant data, and

0:37:02.719 --> 0:37:05.799
<v Speaker 1>from that point forward, the challenge was to determine what

0:37:05.920 --> 0:37:10.400
<v Speaker 1>are the benchmarks to figure out what is relevant versus

0:37:10.440 --> 0:37:13.719
<v Speaker 1>what is irrelevant, Because if you lose too much information,

0:37:13.800 --> 0:37:16.640
<v Speaker 1>you change the quality of the recording, meaning it's no

0:37:16.719 --> 0:37:20.440
<v Speaker 1>longer an accurate representation of the original sound. So you

0:37:20.520 --> 0:37:24.480
<v Speaker 1>might say that any sound below twenty hertz isn't relevant

0:37:24.520 --> 0:37:28.240
<v Speaker 1>because it's below the range of your typical human's

0:37:28.280 --> 0:37:31.800
<v Speaker 1>ability to hear. You might say that anything above twenty

0:37:31.840 --> 0:37:37.280
<v Speaker 1>thousand hertz or twenty kilohertz is irrelevant because humans

0:37:37.280 --> 0:37:41.600
<v Speaker 1>typically can't hear sounds above that frequency. You might say

0:37:41.640 --> 0:37:46.120
<v Speaker 1>that sounds at a certain amplitude or lower are irrelevant

0:37:46.160 --> 0:37:50.360
<v Speaker 1>because they're so quiet that humans wouldn't hear them. Or

0:37:50.480 --> 0:37:53.560
<v Speaker 1>you might say that if a certain sound is at

0:37:53.600 --> 0:37:56.239
<v Speaker 1>a lower amplitude and a different sound is at a

0:37:56.320 --> 0:38:00.440
<v Speaker 1>higher amplitude, the higher amplitude sound is drowning out the

0:38:00.520 --> 0:38:04.120
<v Speaker 1>lower amplitude sound, and so we humans don't really perceive

0:38:04.200 --> 0:38:08.319
<v Speaker 1>the lower amplitude sound. This is where we get into psychoacoustics.
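
NOTE
A toy sketch of the kinds of rules listed above, assuming a sound has already been broken into (frequency, amplitude) pairs. This is not the MP3 algorithm: the function name keep_audible, the amplitude floor, the masking ratio, and the masking width are all invented for illustration.
```python
def keep_audible(components, floor=0.001, mask_ratio=100.0, mask_width_hz=200.0):
    """Filter (frequency_hz, amplitude) pairs using three crude audibility rules."""
    kept = []
    for freq, amp in components:
        if freq < 20 or freq > 20000:
            continue  # outside the typical range of human hearing
        if amp < floor:
            continue  # assumed to be too quiet to notice
        masked = any(
            other_amp >= amp * mask_ratio and abs(other_freq - freq) <= mask_width_hz
            for other_freq, other_amp in components
        )
        if masked:
            continue  # drowned out by a much louder nearby sound
        kept.append((freq, amp))
    return kept
tones = [(15, 0.5), (440, 0.8), (500, 0.005), (3000, 0.0005), (25000, 0.3)]
print(keep_audible(tones))  # [(440, 0.8)]
```
Only the loud 440 hertz tone survives here. A real encoder makes these calls with a far more detailed model of hearing.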

0:38:08.400 --> 0:38:10.880
<v Speaker 1>It's not just what we hear, but how we perceive

0:38:11.320 --> 0:38:15.120
<v Speaker 1>the sound itself. And a lot of that went into

0:38:15.320 --> 0:38:18.160
<v Speaker 1>formulating the algorithms to figure out how to compress this

0:38:18.280 --> 0:38:21.680
<v Speaker 1>music in a way where you get a recording that

0:38:22.120 --> 0:38:27.200
<v Speaker 1>represents the original without you know, compromising too much and

0:38:27.280 --> 0:38:31.400
<v Speaker 1>still getting the file size to a manageable size. And

0:38:31.440 --> 0:38:33.560
<v Speaker 1>these are the decisions you have to make to figure

0:38:33.560 --> 0:38:35.879
<v Speaker 1>out which bits of information you keep and which ones

0:38:35.920 --> 0:38:40.160
<v Speaker 1>you ditch. Brandenburg and a team were working on refining

0:38:40.160 --> 0:38:44.120
<v Speaker 1>this approach in the late eighties and early nineties, And

0:38:44.120 --> 0:38:46.040
<v Speaker 1>he said, at one point he thought he had nailed it,

0:38:46.360 --> 0:38:49.720
<v Speaker 1>and then he heard an a cappella song. It was Tom's

0:38:49.760 --> 0:38:54.400
<v Speaker 1>Diner by Suzanne Vega, And then he listened to the

0:38:54.400 --> 0:38:58.400
<v Speaker 1>compressed MP three version of that song using the

0:38:58.600 --> 0:39:00.920
<v Speaker 1>version of MP three that had been developed up to

0:39:01.080 --> 0:39:04.800
<v Speaker 1>that point, and he said, it ruined the song. It

0:39:05.000 --> 0:39:09.800
<v Speaker 1>trashed it. It sounded terrible. He said that other representations

0:39:09.800 --> 0:39:13.160
<v Speaker 1>of music seemed fine with this particular approach, but when

0:39:13.200 --> 0:39:16.239
<v Speaker 1>they went with this stripped down a cappella song with this

0:39:16.400 --> 0:39:19.840
<v Speaker 1>particular kind of you're in the middle of a space

0:39:19.920 --> 0:39:24.480
<v Speaker 1>listening to Suzanne Vega sing, it ruined her voice. And

0:39:24.560 --> 0:39:27.080
<v Speaker 1>so the team began to tweak the compression algorithms to

0:39:27.160 --> 0:39:30.080
<v Speaker 1>correct for this problem, and it took a lot of

0:39:30.120 --> 0:39:33.520
<v Speaker 1>work to figure out, Okay, well, what are the elements

0:39:33.560 --> 0:39:37.440
<v Speaker 1>of sound that we messed with that have created this issue,

0:39:37.600 --> 0:39:39.839
<v Speaker 1>and ultimately they were finally able to create an MP

0:39:39.920 --> 0:39:43.400
<v Speaker 1>three file that didn't distort or ruin the recording. Brandenburg

0:39:43.480 --> 0:39:46.160
<v Speaker 1>said he listened to that song somewhere between five hundred

0:39:46.200 --> 0:39:49.400
<v Speaker 1>and a thousand times, and then he saw Suzanne Vega

0:39:49.480 --> 0:39:53.640
<v Speaker 1>perform live and he was able to recognize all of

0:39:53.680 --> 0:39:58.760
<v Speaker 1>those subtle changes in her voice because he had paid

0:39:58.920 --> 0:40:01.720
<v Speaker 1>so close attention to it during the process of tweaking

0:40:01.719 --> 0:40:05.880
<v Speaker 1>this algorithm. He said, ultimately, the real telling thing is

0:40:06.040 --> 0:40:10.960
<v Speaker 1>he still enjoyed the song, which says a lot about him. Me.

0:40:11.120 --> 0:40:14.520
<v Speaker 1>I can't stand that song, but maybe it's just because

0:40:14.520 --> 0:40:16.400
<v Speaker 1>to me there's a point where it just sounds like

0:40:16.400 --> 0:40:18.960
<v Speaker 1>someone is just singing about what they're doing. And I

0:40:19.040 --> 0:40:22.960
<v Speaker 1>do that every day. No one gave me a record deal. All right.

0:40:22.960 --> 0:40:27.360
<v Speaker 1>So getting back to MP three, they had finalized the

0:40:27.760 --> 0:40:31.359
<v Speaker 1>file format and created the standard, but it was just

0:40:31.520 --> 0:40:35.240
<v Speaker 1>one of several possibilities for encoding audio and it didn't

0:40:35.280 --> 0:40:42.040
<v Speaker 1>immediately take off. It wasn't immediately adopted by consumers. The

0:40:42.080 --> 0:40:45.799
<v Speaker 1>team had identified the Internet as a possible distribution

0:40:45.840 --> 0:40:48.920
<v Speaker 1>method for MP three files, rather than just over telephone lines.

0:40:48.960 --> 0:40:51.520
<v Speaker 1>They said, well, technically we could send MP

0:40:51.680 --> 0:40:56.240
<v Speaker 1>threes across the Internet, so you could send manageable sized

0:40:56.320 --> 0:41:02.120
<v Speaker 1>files across this network. On July fourteenth, they created the

0:41:02.160 --> 0:41:06.759
<v Speaker 1>file extension dot MP three. Now it would take a

0:41:06.840 --> 0:41:09.960
<v Speaker 1>little bit longer for software to take advantage of this.

0:41:10.080 --> 0:41:13.359
<v Speaker 1>One of the early programs was Winamp, which made

0:41:13.440 --> 0:41:16.479
<v Speaker 1>MP three decoding accessible, and from that point the file

0:41:16.560 --> 0:41:20.560
<v Speaker 1>format began to take off. To follow would be dedicated

0:41:20.640 --> 0:41:23.760
<v Speaker 1>MP three players and sites that allowed people to upload

0:41:23.800 --> 0:41:28.760
<v Speaker 1>and download compressed audio files, which also indicated a rise

0:41:28.880 --> 0:41:33.440
<v Speaker 1>in piracy, and then in response to the rise in piracy,

0:41:33.520 --> 0:41:36.400
<v Speaker 1>we saw an increase in DRM strategies, digital

0:41:36.480 --> 0:41:40.960
<v Speaker 1>rights management or copy protection if you prefer, and that

0:41:41.120 --> 0:41:44.120
<v Speaker 1>all really ended up shaping a lot of the policies

0:41:44.320 --> 0:41:48.680
<v Speaker 1>and strategies that affect the Internet today. So you could

0:41:48.680 --> 0:41:51.720
<v Speaker 1>say that the MP three is one of the reasons

0:41:51.760 --> 0:41:54.440
<v Speaker 1>why the Internet is the way it is right now,

0:41:54.480 --> 0:41:58.799
<v Speaker 1>and why arguments both for and against net neutrality have

0:41:59.560 --> 0:42:02.080
<v Speaker 1>formulated in certain ways. A lot of it is

0:42:02.120 --> 0:42:06.080
<v Speaker 1>shaped by the MP three. So that kind of wraps

0:42:06.160 --> 0:42:10.000
<v Speaker 1>up this discussion about digital audio in general and a

0:42:10.040 --> 0:42:12.640
<v Speaker 1>little bit on MP three files. In the next episode

0:42:12.640 --> 0:42:15.880
<v Speaker 1>of this series, I will dive into a more technical

0:42:15.960 --> 0:42:18.919
<v Speaker 1>explanation of what is actually going on with the MP

0:42:19.040 --> 0:42:23.239
<v Speaker 1>three compression algorithms, and I bet you can't wait to

0:42:23.320 --> 0:42:27.440
<v Speaker 1>learn all about fast Fourier transforms. I know I can't.

0:42:28.040 --> 0:42:31.040
<v Speaker 1>And like I said, I have other episodes to sprinkle

0:42:31.080 --> 0:42:33.440
<v Speaker 1>in between this one and the next one and then

0:42:33.520 --> 0:42:36.000
<v Speaker 1>the third one, so that way you won't just get

0:42:36.080 --> 0:42:39.560
<v Speaker 1>digital audio overload. And if you guys have any comments

0:42:39.760 --> 0:42:43.000
<v Speaker 1>or questions or suggestions for show topics or people I

0:42:43.040 --> 0:42:45.879
<v Speaker 1>should interview, or maybe people I should have on as

0:42:45.880 --> 0:42:49.560
<v Speaker 1>a guest host, shoot them my way. My email is

0:42:49.640 --> 0:42:53.040
<v Speaker 1>tech stuff at how stuff works dot com, or you

0:42:53.040 --> 0:42:55.600
<v Speaker 1>can always drop me a line on Facebook or Twitter

0:42:55.760 --> 0:42:59.080
<v Speaker 1>with the handle TechStuffHSW, and I'll talk

0:42:59.120 --> 0:43:05.719
<v Speaker 1>to you guys again really soon. For more on this

0:43:05.880 --> 0:43:08.399
<v Speaker 1>and thousands of other topics, visit how stuff works

0:43:08.400 --> 0:43:18.600
<v Speaker 1>dot com