WEBVTT - Audio Wars: Analog vs Digital

0:00:04.400 --> 0:00:07.800
<v Speaker 1>Welcome to TechStuff, a production from iHeartRadio.

0:00:12.160 --> 0:00:15.200
<v Speaker 1>Hey there, and welcome to TechStuff. I'm your host,

0:00:15.360 --> 0:00:18.240
<v Speaker 1>Jonathan Strickland. I'm an executive producer with iHeartRadio

0:00:18.320 --> 0:00:21.640
<v Speaker 1>and I love all things tech and recently I received

0:00:21.720 --> 0:00:26.200
<v Speaker 1>a tweet from Twitter user Salvatore del Knock, a.k.a.

0:00:26.400 --> 0:00:30.320
<v Speaker 1>non juror, asking if I would do a breakdown

0:00:30.720 --> 0:00:33.840
<v Speaker 1>on how analog-to-digital and digital-to-analog audio

0:00:33.920 --> 0:00:38.040
<v Speaker 1>converters work. And that's a great request. Um, it is

0:00:38.200 --> 0:00:41.440
<v Speaker 1>incredibly technical when you really get down to it. So

0:00:41.560 --> 0:00:45.360
<v Speaker 1>I'm going to do a very high level view of

0:00:45.400 --> 0:00:49.360
<v Speaker 1>the concept because otherwise we're gonna have to get into

0:00:49.960 --> 0:00:54.560
<v Speaker 1>the various methodologies by which DACs and ADCs work,

0:00:55.120 --> 0:00:58.560
<v Speaker 1>and uh, it would quickly become like a technical manual.

0:00:58.680 --> 0:01:01.600
<v Speaker 1>But if people want that, then I can do a

0:01:01.680 --> 0:01:04.880
<v Speaker 1>subsequent episode and go into more detail. But one of

0:01:04.880 --> 0:01:08.119
<v Speaker 1>the things about this is that lets us talk about

0:01:08.160 --> 0:01:13.160
<v Speaker 1>the differences between analog and digital audio and why converters

0:01:13.200 --> 0:01:15.720
<v Speaker 1>are necessary in the first place, and to open up

0:01:15.760 --> 0:01:19.640
<v Speaker 1>the eternal argument about whether one is inherently better than

0:01:19.800 --> 0:01:22.640
<v Speaker 1>the other. This one goes out to all you

0:01:22.720 --> 0:01:26.120
<v Speaker 1>audiophiles out there, so get ready to send me angry messages,

0:01:26.120 --> 0:01:28.679
<v Speaker 1>because no matter what I say, some of y'all are

0:01:28.720 --> 0:01:32.040
<v Speaker 1>going to get upset. Anyway, let's start with what it

0:01:32.080 --> 0:01:35.920
<v Speaker 1>means to be analog versus digital. Now, when I was

0:01:35.959 --> 0:01:38.640
<v Speaker 1>a young boy, nobody loved me. I was a poor

0:01:38.680 --> 0:01:43.280
<v Speaker 1>boy from a poor family. No, hang on, that's not,

0:01:43.280 --> 0:01:46.440
<v Speaker 1>that's Queen's Bohemian Rhapsody. Now, when I was a young boy,

0:01:46.560 --> 0:01:51.200
<v Speaker 1>analog was the standard. Digital did not even enter into

0:01:51.200 --> 0:01:54.720
<v Speaker 1>my awareness until I was a teenager, when compact discs

0:01:54.720 --> 0:01:56.640
<v Speaker 1>were starting to become popular. They had been around for

0:01:56.640 --> 0:01:59.160
<v Speaker 1>a while before I was a teenager, but I was

0:01:59.240 --> 0:02:02.680
<v Speaker 1>not really aware of them, because I mean, I grew

0:02:02.720 --> 0:02:05.120
<v Speaker 1>up in rural Georgia. We would get technology a few

0:02:05.200 --> 0:02:09.560
<v Speaker 1>years behind everybody else. Anyway, I grew up thinking analog

0:02:09.680 --> 0:02:13.680
<v Speaker 1>essentially meant old and digital meant new, Like that was

0:02:13.760 --> 0:02:16.280
<v Speaker 1>the sort of the abstract distinction between the two in

0:02:16.360 --> 0:02:20.000
<v Speaker 1>my head. But the differences are obviously more complicated than that,

0:02:20.400 --> 0:02:23.560
<v Speaker 1>and we need to understand how sound works, which I

0:02:23.600 --> 0:02:26.720
<v Speaker 1>know I've covered many many times, but it's important so

0:02:26.800 --> 0:02:29.760
<v Speaker 1>that we know how the analog and digital methods of

0:02:29.840 --> 0:02:35.120
<v Speaker 1>recording and thus reproducing and eventually playing back sound. You

0:02:35.160 --> 0:02:38.760
<v Speaker 1>know how they work with relation to the original sound

0:02:39.000 --> 0:02:43.280
<v Speaker 1>that existed. So sound is, when you really get down

0:02:43.320 --> 0:02:48.720
<v Speaker 1>to it, vibration or pressure waves. Now, we mostly experience

0:02:48.720 --> 0:02:52.320
<v Speaker 1>sound by hearing these vibrations travel through the air, but

0:02:52.520 --> 0:02:55.840
<v Speaker 1>you can also experience this underwater. Sound can move through

0:02:55.880 --> 0:02:59.600
<v Speaker 1>different media, including solid material. Like if you put your

0:03:00.040 --> 0:03:02.639
<v Speaker 1>ear against a table, a really long table, and

0:03:02.720 --> 0:03:06.280
<v Speaker 1>someone on the other end is tapping very lightly on

0:03:06.360 --> 0:03:09.560
<v Speaker 1>that table, you'll hear it. And it's not because the

0:03:09.560 --> 0:03:12.400
<v Speaker 1>sound is traveling effectively through the air, though it is

0:03:12.760 --> 0:03:15.480
<v Speaker 1>doing a little bit of that too, but that it

0:03:15.560 --> 0:03:19.320
<v Speaker 1>travels through the table to you. Sound also travels at

0:03:19.360 --> 0:03:23.000
<v Speaker 1>different speeds through different media, and in fact, stuff like

0:03:23.080 --> 0:03:26.520
<v Speaker 1>air temperature can affect how quickly sound travels, which is

0:03:26.560 --> 0:03:30.240
<v Speaker 1>why when we talk about the speed of sound, we

0:03:30.400 --> 0:03:33.440
<v Speaker 1>technically actually need to be a little more specific than that.

0:03:33.800 --> 0:03:36.480
<v Speaker 1>So the standard way of describing the speed of sound

0:03:36.680 --> 0:03:39.960
<v Speaker 1>is to say that it moves at 343 meters per second

0:03:40.160 --> 0:03:44.840
<v Speaker 1>in dry air at twenty degrees Celsius, that's about sixty-eight degrees Fahrenheit.

0:03:45.240 --> 0:03:47.960
<v Speaker 1>And if you start changing those parameters, you know, if

0:03:47.960 --> 0:03:50.480
<v Speaker 1>you introduce, say a lot of humidity into the air,

0:03:50.960 --> 0:03:53.840
<v Speaker 1>or you change the air temperature like it goes up

0:03:53.960 --> 0:03:56.520
<v Speaker 1>or it goes down. Well, sound will travel at a

0:03:56.600 --> 0:04:00.840
<v Speaker 1>slightly different speed than at that standard I was talking about.
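
NOTE
Editor's sketch, not from the episode: the temperature dependence described above can be approximated for dry air with the ideal-gas relation c = 331.3 * sqrt(1 + T / 273.15) meters per second, where T is in degrees Celsius.
The function name and the constant 331.3 (the approximate speed at 0 degrees Celsius) are assumptions of this sketch, and humidity is ignored entirely.
```python
import math
def speed_of_sound_dry_air(temp_celsius: float) -> float:
    """Approximate speed of sound in dry air, in meters per second."""
    # 331.3 m/s is the approximate speed at 0 degrees Celsius (assumed constant).
    return 331.3 * math.sqrt(1.0 + temp_celsius / 273.15)
# Compare a cold day, the usual reference temperature, and a hot day.
for t in (0.0, 20.0, 35.0):
    print(f"{t:5.1f} C: {speed_of_sound_dry_air(t):6.1f} m/s")
```
At 20 degrees Celsius this lands close to the 343 meters per second figure usually quoted.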

0:04:01.160 --> 0:04:04.160
<v Speaker 1>Now I could get into how the vibrations cause air

0:04:04.160 --> 0:04:06.880
<v Speaker 1>molecules to move back and forth, creating little changes in

0:04:06.960 --> 0:04:11.480
<v Speaker 1>air pressure. And it's these pressure waves, these air fluctuation changes,

0:04:11.520 --> 0:04:13.520
<v Speaker 1>that our ear drums pick up and transfer to our

0:04:13.560 --> 0:04:17.200
<v Speaker 1>inner ears. That's where special nerves pick up these fluctuations

0:04:17.200 --> 0:04:19.760
<v Speaker 1>in our inner ears, and then our brains process those

0:04:20.200 --> 0:04:23.120
<v Speaker 1>those nerve signals as sound. But most of this isn't

0:04:23.160 --> 0:04:27.280
<v Speaker 1>important for the rest of this episode, so instead, let's

0:04:27.320 --> 0:04:31.960
<v Speaker 1>talk about sound waves, all right. So we can think

0:04:31.960 --> 0:04:34.599
<v Speaker 1>of a vibration as something in which a particle is

0:04:34.640 --> 0:04:37.960
<v Speaker 1>moved out of its usual place and then it snaps

0:04:38.040 --> 0:04:40.640
<v Speaker 1>back to its usual place, and it might do this

0:04:40.880 --> 0:04:44.279
<v Speaker 1>several times. Think of a guitar string. If you pluck

0:04:44.320 --> 0:04:47.280
<v Speaker 1>a guitar string, you're pulling the string out of where

0:04:47.320 --> 0:04:50.719
<v Speaker 1>it usually sits, and then it snaps back and forth

0:04:51.000 --> 0:04:55.880
<v Speaker 1>and oscillates around its normal position until it settles down again.

0:04:56.240 --> 0:04:58.919
<v Speaker 1>So we can describe the number of times that a

0:04:59.000 --> 0:05:02.440
<v Speaker 1>particle does this as a frequency, you know, or the number

0:05:02.440 --> 0:05:05.800
<v Speaker 1>of times a string goes from one point all the

0:05:05.839 --> 0:05:08.960
<v Speaker 1>way across and back to that starting point over the

0:05:08.960 --> 0:05:11.520
<v Speaker 1>course of a second. So with sound, we usually use

0:05:11.560 --> 0:05:16.080
<v Speaker 1>the unit hertz to measure frequency. If a particle only

0:05:16.120 --> 0:05:19.680
<v Speaker 1>did one cycle of vibration per second, if it took

0:05:19.680 --> 0:05:22.560
<v Speaker 1>a full second for it to go from the you know,

0:05:23.120 --> 0:05:29.240
<v Speaker 1>the one crest to the next crest, uh, then it

0:05:29.279 --> 0:05:31.640
<v Speaker 1>would be one hertz. That would also, by the way,

0:05:31.680 --> 0:05:33.560
<v Speaker 1>be a frequency that was way too low for us

0:05:33.600 --> 0:05:36.760
<v Speaker 1>to hear. Typical human hearing has a range of around

0:05:36.839 --> 0:05:40.480
<v Speaker 1>twenty hertz at the low end, to twenty thousand hertz

0:05:40.560 --> 0:05:43.160
<v Speaker 1>or twenty kilohertz, in other words, on the high end.

0:05:43.640 --> 0:05:46.880
<v Speaker 1>So for stuff vibrating in a cycle that's twenty times

0:05:46.880 --> 0:05:50.119
<v Speaker 1>a second all the way up to twenty thousand times

0:05:50.120 --> 0:05:53.320
<v Speaker 1>a second, that's something we could potentially hear. Now. I

0:05:53.360 --> 0:05:57.280
<v Speaker 1>say potentially because that is typical human hearing. There are

0:05:57.279 --> 0:05:59.640
<v Speaker 1>people who can hear outside of that range a little bit,

0:06:00.120 --> 0:06:02.760
<v Speaker 1>and then there are some of us, especially as we

0:06:02.800 --> 0:06:07.080
<v Speaker 1>get older, who can hear a more narrow range of frequencies.
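
NOTE
Editor's sketch, not from the episode: frequency in hertz is cycles per second, so the time for one cycle is simply 1 / frequency.
The names below are hypothetical, and the 20 Hz to 20 kHz limits are the nominal figures mentioned above, not a measurement of any one listener.
```python
LOW_HZ = 20.0       # nominal lower limit of typical human hearing
HIGH_HZ = 20_000.0  # nominal upper limit (20 kHz)
def period_seconds(frequency_hz: float) -> float:
    """Time taken by one full cycle of vibration, in seconds."""
    return 1.0 / frequency_hz
def is_typically_audible(frequency_hz: float) -> bool:
    """True if the frequency falls inside the nominal audible range."""
    return LOW_HZ <= frequency_hz <= HIGH_HZ
# A 1 Hz vibration takes a full second per cycle and is far too low to hear.
for f in (1.0, 20.0, 440.0, 20_000.0):
    print(f, period_seconds(f), is_typically_audible(f))
```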

0:06:08.120 --> 0:06:11.080
<v Speaker 1>But frequency is just one part of how we describe sound.

0:06:11.440 --> 0:06:14.600
<v Speaker 1>We can also describe sound by how loud it is.

0:06:15.160 --> 0:06:18.640
<v Speaker 1>The volume of sound. So from a physics perspective, we

0:06:18.680 --> 0:06:21.159
<v Speaker 1>can think of this as how much pressure the sound

0:06:21.200 --> 0:06:24.200
<v Speaker 1>places upon our ear drums. You know how dramatic those

0:06:24.200 --> 0:06:28.240
<v Speaker 1>fluctuations in air pressure are, in other words. But loudness

0:06:28.400 --> 0:06:31.680
<v Speaker 1>isn't just down to physics. The way we experience loudness

0:06:31.680 --> 0:06:35.880
<v Speaker 1>depends not just on that sound pressure itself, but stuff

0:06:35.920 --> 0:06:39.479
<v Speaker 1>like psychoacoustics. That's how our brains perceive sound in the

0:06:39.520 --> 0:06:43.360
<v Speaker 1>first place. But now we've got two criteria we can

0:06:43.480 --> 0:06:46.200
<v Speaker 1>use to assign to any sound, correct? Like, we can

0:06:46.279 --> 0:06:49.440
<v Speaker 1>talk about the frequency of that sound, you know, how

0:06:49.480 --> 0:06:53.120
<v Speaker 1>frequently that those particles are vibrating, And then we can

0:06:53.160 --> 0:06:56.200
<v Speaker 1>also talk about the displacement of those particles vibrating, or

0:06:56.240 --> 0:06:58.839
<v Speaker 1>what we might think of as the loudness or volume

0:06:58.920 --> 0:07:02.520
<v Speaker 1>of that sound. We could then plot a sound wave

0:07:02.800 --> 0:07:06.200
<v Speaker 1>as a transverse wave on a graph, and we could

0:07:06.200 --> 0:07:09.400
<v Speaker 1>have the X axis, you know, the horizontal axis of

0:07:09.440 --> 0:07:13.080
<v Speaker 1>this graph representing the passage of time. So on the

0:07:13.160 --> 0:07:15.800
<v Speaker 1>left side we might say zero, and we say time

0:07:15.840 --> 0:07:18.679
<v Speaker 1>increases as you go to the right. The y axis

0:07:18.760 --> 0:07:22.400
<v Speaker 1>we could have being displacement, which kind of you know,

0:07:22.440 --> 0:07:25.960
<v Speaker 1>amplitude or volume in other words, and we could then

0:07:26.000 --> 0:07:29.360
<v Speaker 1>plot all the points where a particular vibrating particle would

0:07:29.360 --> 0:07:33.400
<v Speaker 1>occupy over a given span of time. If we had

0:07:33.560 --> 0:07:36.880
<v Speaker 1>a sound of a steady frequency, then we would end

0:07:36.920 --> 0:07:38.440
<v Speaker 1>up with a wave that would look a lot like

0:07:38.480 --> 0:07:42.240
<v Speaker 1>a sine or cosine wave. The distance between two consecutive

0:07:42.320 --> 0:07:47.000
<v Speaker 1>crests of this wave would be the wavelength for that sound,

0:07:47.280 --> 0:07:52.080
<v Speaker 1>and the sound's wavelength has an inversely proportional relationship with

0:07:52.160 --> 0:07:56.120
<v Speaker 1>the sound's frequency. So the higher the frequency of sound,

0:07:56.760 --> 0:08:00.160
<v Speaker 1>the shorter the wavelength will be. So deep bass

0:08:00.280 --> 0:08:03.920
<v Speaker 1>notes would have sound waves that have much longer wavelengths

0:08:04.240 --> 0:08:09.120
<v Speaker 1>than very high pitched high frequency notes. Uh frequency relates

0:08:09.160 --> 0:08:12.840
<v Speaker 1>to pitch. There isn't like an easy mathematical way we

0:08:12.920 --> 0:08:17.000
<v Speaker 1>can kind of relate pitch, by the way, There are

0:08:17.120 --> 0:08:20.160
<v Speaker 1>easy ways we can relate frequencies, but it gets a

0:08:20.160 --> 0:08:25.640
<v Speaker 1>little tricky anyway. The reason I even talk about plotting

0:08:25.720 --> 0:08:28.560
<v Speaker 1>sound waves at all is that it makes it easier

0:08:28.600 --> 0:08:31.720
<v Speaker 1>for us to consider the differences between analog and digital

0:08:31.760 --> 0:08:34.760
<v Speaker 1>audio recording. Keep in mind, if we plotted that sound wave,

0:08:35.200 --> 0:08:38.640
<v Speaker 1>that's not that's not the physical sound wave that we've

0:08:38.679 --> 0:08:43.160
<v Speaker 1>just plotted. That's our description of that sound wave, its frequency,

0:08:43.200 --> 0:08:47.439
<v Speaker 1>and its loudness. Um, the classic sine-wave-like depiction

0:08:47.480 --> 0:08:49.840
<v Speaker 1>of the sound wave shows us that there's a continuous

0:08:49.960 --> 0:08:55.199
<v Speaker 1>representation of sound across time. It is unbroken. We can

0:08:55.480 --> 0:08:58.360
<v Speaker 1>plot, you know, even complicated sounds with changes in

0:08:58.400 --> 0:09:01.640
<v Speaker 1>amplitude and frequency, and the shape of the waves tells

0:09:01.720 --> 0:09:05.320
<v Speaker 1>us a little bit about the timbre or quality of sound. Now,

0:09:05.320 --> 0:09:08.440
<v Speaker 1>by quality, I don't mean, oh, this sound is very

0:09:08.480 --> 0:09:12.440
<v Speaker 1>good quality or this sound is really bad quality. Instead,

0:09:12.480 --> 0:09:17.960
<v Speaker 1>I'm talking about the elements that differentiate say piano playing

0:09:18.160 --> 0:09:22.480
<v Speaker 1>middle C from a guitar playing that same note middle C.

0:09:23.040 --> 0:09:26.880
<v Speaker 1>Both instruments are producing the same note at the same frequency,

0:09:27.000 --> 0:09:29.719
<v Speaker 1>assuming both instruments are you know, properly tuned, and both

0:09:29.800 --> 0:09:33.000
<v Speaker 1>of them are using the same pitch tuning, but you

0:09:33.040 --> 0:09:36.560
<v Speaker 1>would hear a difference in the type of sound between them, right,

0:09:36.640 --> 0:09:41.199
<v Speaker 1>A piano and a guitar sound different. Otherwise all instruments

0:09:41.200 --> 0:09:44.560
<v Speaker 1>would produce exactly the same kind of sound as each other.
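
NOTE
Editor's sketch, not from the episode: one common way to picture timbre is as a different mix of overtones layered on the same fundamental frequency.
The harmonic amplitudes below are hypothetical values chosen for illustration, not measurements of a real piano or guitar, and 261.63 Hz for middle C assumes A4 = 440 Hz tuning.
```python
import math
MIDDLE_C_HZ = 261.63  # approximate, assuming A4 = 440 Hz tuning
def tone(t: float, harmonic_amplitudes: list[float], fundamental_hz: float = MIDDLE_C_HZ) -> float:
    """Displacement at time t (seconds) for a sum of harmonics of one fundamental."""
    return sum(
        amp * math.sin(2.0 * math.pi * fundamental_hz * (k + 1) * t)
        for k, amp in enumerate(harmonic_amplitudes)
    )
# Hypothetical recipes: the same note, with a different blend of overtones.
BRIGHT = [1.0, 0.6, 0.4, 0.3]
MELLOW = [1.0, 0.2, 0.05, 0.0]
t = 0.001  # one millisecond into the note
print(tone(t, BRIGHT), tone(t, MELLOW))
```
Both recipes repeat once every 1 / 261.63 seconds, so they share a pitch, but the shape of the wave within each cycle differs.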

0:09:45.080 --> 0:09:46.959
<v Speaker 1>But you know, you can tell the difference between a

0:09:47.000 --> 0:09:50.439
<v Speaker 1>piano and a guitar, or a clarinet or a flute

0:09:50.520 --> 0:09:54.480
<v Speaker 1>or whatever. The timbre is different, even if the instruments

0:09:54.480 --> 0:09:58.000
<v Speaker 1>are all producing you know, technically the same frequency, even

0:09:58.040 --> 0:10:01.040
<v Speaker 1>at the same volume. This leads us to the fact

0:10:01.120 --> 0:10:05.040
<v Speaker 1>that sound is this continuous thing for us. It isn't

0:10:05.080 --> 0:10:08.720
<v Speaker 1>happening in discrete units. It's kind of like the difference

0:10:08.760 --> 0:10:12.480
<v Speaker 1>between jumping into a pool filled with water, which

0:10:12.559 --> 0:10:15.320
<v Speaker 1>is you know, continuous to us because we can't you know,

0:10:15.600 --> 0:10:19.480
<v Speaker 1>experience it down on the molecular level, or jumping into

0:10:19.480 --> 0:10:23.199
<v Speaker 1>a pool that's filled with plastic balls. So to us,

0:10:23.600 --> 0:10:27.440
<v Speaker 1>sound is kind of like a fluid, and analog recording

0:10:27.600 --> 0:10:33.200
<v Speaker 1>captures that. The analog approach to recording is older than digital.

0:10:33.400 --> 0:10:37.320
<v Speaker 1>So way way back in the nineteenth century, folks like

0:10:37.360 --> 0:10:39.480
<v Speaker 1>Alexander Graham Bell were trying to figure out how to

0:10:39.520 --> 0:10:43.600
<v Speaker 1>transmit the human voice across great distances using electricity, and

0:10:43.720 --> 0:10:46.679
<v Speaker 1>the microphone was one half of what was needed to

0:10:46.720 --> 0:10:50.000
<v Speaker 1>do this, the loudspeaker being the other half. And

0:10:50.200 --> 0:10:54.240
<v Speaker 1>the basic way a standard microphone works is to convert

0:10:54.559 --> 0:10:59.160
<v Speaker 1>sound, that continuous, you know, phenomenon of pressure wave changes,

0:11:00.080 --> 0:11:03.600
<v Speaker 1>to a varying electric signal, an electric signal that has

0:11:04.280 --> 0:11:08.880
<v Speaker 1>varying voltage. This is another continuous phenomenon, right, it's unbroken,

0:11:09.000 --> 0:11:12.440
<v Speaker 1>it's it's like another wave. Here's how it works. So

0:11:12.520 --> 0:11:17.559
<v Speaker 1>inside an analog microphone is a tiny little diaphragm, typically

0:11:17.640 --> 0:11:20.000
<v Speaker 1>made of very thin plastic, and it behaves in a

0:11:20.040 --> 0:11:23.960
<v Speaker 1>way similar to how our ear drums work in our ears.

0:11:23.960 --> 0:11:29.480
<v Speaker 1>So when sound, you know, these pressure waves hit that microphone,

0:11:29.840 --> 0:11:33.240
<v Speaker 1>it moves the diaphragm back and forth, and the diaphragm

0:11:33.280 --> 0:11:37.320
<v Speaker 1>is actually attached to an electromagnet. A simple microphone

0:11:37.400 --> 0:11:40.679
<v Speaker 1>could have a permanent magnet inside it, and wrapped around

0:11:40.679 --> 0:11:43.680
<v Speaker 1>this permanent magnet is a little coil of metal wire

0:11:43.920 --> 0:11:47.160
<v Speaker 1>that connects to the diaphragm. So the diaphragm moves the coil,

0:11:47.320 --> 0:11:50.560
<v Speaker 1>which then moves along the length of this permanent magnet.

0:11:51.280 --> 0:11:54.640
<v Speaker 1>That introduces a fluctuating magnetic field, or rather, you know

0:11:54.720 --> 0:11:58.240
<v Speaker 1>the effect of a fluctuating magnetic field. The permanent magnet's

0:11:58.280 --> 0:12:01.920
<v Speaker 1>magnetic field is stable, but moving a coil through a

0:12:01.960 --> 0:12:04.520
<v Speaker 1>magnetic field, it's the same thing as if you were

0:12:04.559 --> 0:12:08.680
<v Speaker 1>to fluctuate a magnetic field around a you know, non

0:12:08.720 --> 0:12:12.640
<v Speaker 1>moving coil, you get the same effect. Now, the laws

0:12:12.640 --> 0:12:16.160
<v Speaker 1>of electromagnetism tell us that if you have a conductive

0:12:16.200 --> 0:12:21.200
<v Speaker 1>material and it encounters a fluctuating magnetic field, that field

0:12:21.640 --> 0:12:25.840
<v Speaker 1>will then induce an electric current in the conductive material.
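
NOTE
Editor's sketch, not from the episode: for a wire moving at right angles to a magnetic field, the induced voltage is roughly field strength times wire length times velocity.
The function names and every numeric value below (field strength, wire length, diaphragm travel) are hypothetical and chosen only to show the shape of the calculation.
```python
import math
def coil_voltage(flux_density_t: float, wire_length_m: float, velocity_m_s: float) -> float:
    """Motional EMF in volts for a wire moving at right angles to a magnetic field."""
    return flux_density_t * wire_length_m * velocity_m_s
def diaphragm_velocity(t: float, amplitude_m: float, frequency_hz: float) -> float:
    """Velocity of a diaphragm whose displacement is amplitude * sin(2*pi*f*t)."""
    omega = 2.0 * math.pi * frequency_hz
    return amplitude_m * omega * math.cos(omega * t)
# Hypothetical values: 1 tesla, 5 m of coil wire, 1 micrometer of travel, 1 kHz tone.
B, WIRE_LENGTH, AMPLITUDE, FREQ = 1.0, 5.0, 1e-6, 1000.0
peak = coil_voltage(B, WIRE_LENGTH, diaphragm_velocity(0.0, AMPLITUDE, FREQ))
print(f"peak output: {peak * 1000:.1f} mV")
```
The output voltage follows the diaphragm's velocity continuously, which is why the electric signal tracks the sound hitting the microphone.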

0:12:25.960 --> 0:12:29.920
<v Speaker 1>So now you've got the microphone producing an electric current,

0:12:30.400 --> 0:12:33.640
<v Speaker 1>and again the voltage of this current varies depending upon

0:12:33.720 --> 0:12:37.679
<v Speaker 1>the sound hitting the microphone. That means the microphone is

0:12:37.720 --> 0:12:41.040
<v Speaker 1>a type of transducer. That's a device that converts one

0:12:41.120 --> 0:12:44.880
<v Speaker 1>form of energy, in this case acoustic pressure, into another

0:12:44.920 --> 0:12:49.040
<v Speaker 1>form electric signals. Now, you could send this electric current

0:12:49.280 --> 0:12:52.680
<v Speaker 1>with varying voltage somewhere to do something else interesting, like

0:12:53.160 --> 0:12:57.040
<v Speaker 1>you could have it go directly to a loudspeaker for playback. Now,

0:12:57.080 --> 0:13:01.560
<v Speaker 1>of course, this electric current is really uh there are

0:13:01.880 --> 0:13:04.880
<v Speaker 1>you know, very small elements in your microphone, right, so

0:13:05.840 --> 0:13:10.600
<v Speaker 1>it cannot produce an incredibly strong electric current. So typically

0:13:11.200 --> 0:13:14.520
<v Speaker 1>you would first pass this electric current through an amplifier,

0:13:15.040 --> 0:13:17.840
<v Speaker 1>which increases the strength of the signal. I'm not going

0:13:17.920 --> 0:13:20.560
<v Speaker 1>to go into how amplifiers work. I've talked about in

0:13:20.640 --> 0:13:23.720
<v Speaker 1>other episodes, and it would mean that this this episode

0:13:23.760 --> 0:13:25.760
<v Speaker 1>would go like an hour and a half long if

0:13:25.760 --> 0:13:28.640
<v Speaker 1>I were to to dive into that. The important thing

0:13:28.679 --> 0:13:32.839
<v Speaker 1>to think of is that amplifiers take incoming weak signals

0:13:33.200 --> 0:13:36.680
<v Speaker 1>and then push out a stronger version of that same signal.
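
NOTE
Editor's sketch, not from the episode: an idealized amplifier multiplies the incoming signal by a fixed gain and leaves its shape alone.
The list of voltages is a hypothetical stand-in for a continuous signal, and the gain of 100 is an arbitrary example; real amplifiers add noise and distortion that this ignores.
```python
import math
def amplify(signal: list[float], voltage_gain: float) -> list[float]:
    """Scale every value of the signal by the same factor, keeping its shape."""
    return [voltage_gain * v for v in signal]
def gain_in_decibels(voltage_gain: float) -> float:
    """Express a voltage gain ratio in decibels."""
    return 20.0 * math.log10(voltage_gain)
weak = [0.000, 0.002, 0.003, 0.002, 0.000, -0.002]  # volts, hypothetical
strong = amplify(weak, 100.0)
print(strong, gain_in_decibels(100.0))
```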

0:13:36.760 --> 0:13:40.760
<v Speaker 1>Assuming the amplifier's working properly, then that signal could go

0:13:40.880 --> 0:13:44.360
<v Speaker 1>to a speaker and you would have the same process

0:13:44.400 --> 0:13:47.000
<v Speaker 1>that you had with the microphone, only in reverse. The

0:13:47.080 --> 0:13:51.600
<v Speaker 1>speaker also has a voice coil inside it, a coil

0:13:51.760 --> 0:13:56.040
<v Speaker 1>of, you know, conductive metal wire, and also a

0:13:56.040 --> 0:13:59.920
<v Speaker 1>magnet inside the loudspeaker. So the incoming current goes to

0:14:00.120 --> 0:14:02.920
<v Speaker 1>the wire, and we know by the laws of

0:14:02.960 --> 0:14:06.960
<v Speaker 1>electromagnetism that this means the flowing current through the wire

0:14:07.000 --> 0:14:09.280
<v Speaker 1>will also produce a magnetic field. I mean, this is

0:14:09.320 --> 0:14:12.839
<v Speaker 1>how electromagnetism works, and that this magnetic field will

0:14:12.880 --> 0:14:16.439
<v Speaker 1>then pull and push against the magnetic field generated by

0:14:16.480 --> 0:14:20.080
<v Speaker 1>the permanent magnet that's already inside the speaker, and this

0:14:20.160 --> 0:14:24.240
<v Speaker 1>in turn creates the force that pushes and pulls the

0:14:24.400 --> 0:14:28.600
<v Speaker 1>cone inside the speaker that connects to another diaphragm. This

0:14:28.680 --> 0:14:31.480
<v Speaker 1>is a much larger diaphragm than the one that's on

0:14:31.520 --> 0:14:34.280
<v Speaker 1>the microphone on the other side. Right, Because you've boosted

0:14:34.280 --> 0:14:37.360
<v Speaker 1>the electric signal, it can then have enough power to

0:14:37.520 --> 0:14:41.120
<v Speaker 1>move this larger diaphragm. So this larger diaphragm begins to

0:14:41.120 --> 0:14:43.640
<v Speaker 1>move in and out, and it's pushing and pulling air

0:14:44.200 --> 0:14:49.960
<v Speaker 1>and it's just recreating the acoustic pressure waves that were

0:14:50.120 --> 0:14:53.040
<v Speaker 1>used to go into the microphone and generate the electric

0:14:53.040 --> 0:14:55.360
<v Speaker 1>signal in the first place, so you've kind of preserved

0:14:55.560 --> 0:15:00.800
<v Speaker 1>this experience from sound going into a microphone. The microphone

0:15:01.080 --> 0:15:06.120
<v Speaker 1>as a transducer, transforming that acoustic pressure into an electric

0:15:06.160 --> 0:15:10.520
<v Speaker 1>current with varying voltage, sending that to an amplifier, and

0:15:10.560 --> 0:15:13.760
<v Speaker 1>then a speaker, which then does the opposite. It's also

0:15:13.760 --> 0:15:17.200
<v Speaker 1>a transducer. It takes this electric current with varying voltage

0:15:17.480 --> 0:15:20.520
<v Speaker 1>and converts it back into acoustic pressure and we get

0:15:20.520 --> 0:15:24.680
<v Speaker 1>the playback. That's an analog chain from start to finish. Now,

0:15:24.720 --> 0:15:27.160
<v Speaker 1>if you've got a good quality microphone and a good

0:15:27.200 --> 0:15:31.160
<v Speaker 1>amplifier and a good speaker, you can transmit sound pretty effectively.

0:15:31.560 --> 0:15:34.120
<v Speaker 1>And because the whole process is using that continuous and

0:15:34.200 --> 0:15:39.120
<v Speaker 1>varying signal, it is analogous to the experience of hearing

0:15:39.160 --> 0:15:43.200
<v Speaker 1>the sound itself. We've transformed the energy from one kind

0:15:43.240 --> 0:15:49.200
<v Speaker 1>to another, but apart from that, it is an unbroken chain. Now,

0:15:49.240 --> 0:15:53.640
<v Speaker 1>analog media includes stuff like magnetic tape and vinyl records,

0:15:54.160 --> 0:15:58.360
<v Speaker 1>which are produced in a way where you are transmitting

0:15:58.400 --> 0:16:02.520
<v Speaker 1>analog signals and they are effectively carved into a surface

0:16:03.320 --> 0:16:07.000
<v Speaker 1>that then can be picked up with a stylus on

0:16:07.120 --> 0:16:10.920
<v Speaker 1>a turntable and then converted back into an electric signal

0:16:11.000 --> 0:16:13.800
<v Speaker 1>that then can be sent to speakers. So either way

0:16:14.240 --> 0:16:20.000
<v Speaker 1>you are preserving that analog signal with magnetic tape. You've

0:16:20.000 --> 0:16:22.320
<v Speaker 1>got a recording device set up that takes that varying

0:16:22.320 --> 0:16:26.320
<v Speaker 1>electric signal from the recording and then creates a magnetic

0:16:26.440 --> 0:16:30.920
<v Speaker 1>field with the, the writer, the write head. Uh, and

0:16:31.000 --> 0:16:33.760
<v Speaker 1>you've got a little electromagnet in this thing, and

0:16:33.840 --> 0:16:38.120
<v Speaker 1>that magnetic field rearranges particles that are on a strip of

0:16:38.320 --> 0:16:42.000
<v Speaker 1>plastic tape. That's how cassette tapes work. That's how VHS

0:16:42.080 --> 0:16:46.160
<v Speaker 1>tapes work. So attached to this strip of plastic that

0:16:46.400 --> 0:16:50.040
<v Speaker 1>is the actual tape in a tape, are these tiny

0:16:50.120 --> 0:16:53.920
<v Speaker 1>magnetic particles that are bound to that plastic. And by

0:16:53.960 --> 0:16:56.640
<v Speaker 1>applying the magnetic field to the tape, using a

0:16:56.640 --> 0:16:59.520
<v Speaker 1>tiny electromagnet, you can change the direction that these

0:16:59.560 --> 0:17:03.840
<v Speaker 1>particles are facing on the tape itself. So this process

0:17:03.920 --> 0:17:07.280
<v Speaker 1>arranges particles on magnetic tape in a specific way to

0:17:07.440 --> 0:17:11.360
<v Speaker 1>record that original electric signal you were using. The magnetic

0:17:11.400 --> 0:17:15.000
<v Speaker 1>particles represent the original signal, and that in turn represents

0:17:15.000 --> 0:17:18.320
<v Speaker 1>the sound that was used to generate the electric signal

0:17:18.480 --> 0:17:21.200
<v Speaker 1>during the recording process. So when you play a tape back,

0:17:22.160 --> 0:17:26.400
<v Speaker 1>the tape passes underneath an electromagnet at a distance

0:17:26.440 --> 0:17:29.280
<v Speaker 1>that's close enough that the electromagnet is picking up

0:17:29.280 --> 0:17:32.600
<v Speaker 1>the magnetic fields of all those tiny particles, and the

0:17:32.680 --> 0:17:35.960
<v Speaker 1>particles have been arranged in patterns because of that, you know,

0:17:36.040 --> 0:17:39.960
<v Speaker 1>recording process, right. So the fluctuating magnetic field that is

0:17:40.040 --> 0:17:43.080
<v Speaker 1>created because these particles are now passing by an

0:17:43.160 --> 0:17:47.800
<v Speaker 1>electromagnet is again reversing that process. The electromagnet starts

0:17:47.840 --> 0:17:50.919
<v Speaker 1>to generate an electric signal because of that magnetic field,

0:17:51.400 --> 0:17:53.600
<v Speaker 1>and then can go to an amplifier and then go

0:17:53.640 --> 0:17:55.960
<v Speaker 1>out to speakers. So again we use a lot of

0:17:56.000 --> 0:18:00.480
<v Speaker 1>transformational processes to record this sound, right, because you're in

0:18:00.520 --> 0:18:05.080
<v Speaker 1>this case, we took pressure waves, vibrations, The sound went

0:18:05.080 --> 0:18:08.360
<v Speaker 1>into a microphone, creates an electric current with varying voltage.

0:18:08.480 --> 0:18:12.840
<v Speaker 1>That electric current then goes to a tape recorder essentially

0:18:13.359 --> 0:18:17.560
<v Speaker 1>that uses magnetic fields to record onto tape. We take

0:18:17.560 --> 0:18:20.639
<v Speaker 1>that tape, we put that tape into a tape player,

0:18:21.160 --> 0:18:25.879
<v Speaker 1>and that magnetic record then produces an electric current in

0:18:25.920 --> 0:18:28.760
<v Speaker 1>our tape player, which goes to an amplifier and then

0:18:28.800 --> 0:18:31.240
<v Speaker 1>goes to drive speakers and replicate the sound that we

0:18:31.320 --> 0:18:33.920
<v Speaker 1>recorded in the first place. So again we transformed things

0:18:34.000 --> 0:18:41.200
<v Speaker 1>multiple times, but the analogous sound process has remained stable. Now,

0:18:41.400 --> 0:18:44.560
<v Speaker 1>there's a lot in this process that I have not covered.

0:18:44.760 --> 0:18:47.480
<v Speaker 1>The equipment and methods you use in recording and playback

0:18:48.080 --> 0:18:50.240
<v Speaker 1>determine whether or not the copy you have is a

0:18:50.320 --> 0:18:54.200
<v Speaker 1>really, like, accurate representation of the original sound. Like, does

0:18:54.240 --> 0:18:57.560
<v Speaker 1>it sound like you were actually there? Or is the

0:18:57.640 --> 0:19:00.320
<v Speaker 1>nuance lost? And the same is true for playback.

0:19:00.400 --> 0:19:04.119
<v Speaker 1>Playback on a really sophisticated system will likely sound better

0:19:04.320 --> 0:19:07.920
<v Speaker 1>than one that's played on some super cheap stereo. Though

0:19:08.280 --> 0:19:11.080
<v Speaker 1>pretty quickly you do reach a point where the returns

0:19:11.240 --> 0:19:14.520
<v Speaker 1>are harder to detect, right like where you might listen

0:19:14.560 --> 0:19:16.640
<v Speaker 1>to something on a good system, and then you might

0:19:16.680 --> 0:19:19.720
<v Speaker 1>listen to that same thing on what's considered like the

0:19:19.800 --> 0:19:22.800
<v Speaker 1>highest of high end systems, and you might not be

0:19:22.920 --> 0:19:26.000
<v Speaker 1>able to tell a whole lot of difference. But the

0:19:26.040 --> 0:19:29.119
<v Speaker 1>basics for analog recording and playback are all there. Now,

0:19:29.200 --> 0:19:32.200
<v Speaker 1>when we come back, we'll talk about the digital approach,

0:19:32.480 --> 0:19:42.720
<v Speaker 1>but first let's take a quick break. Okay, so now,

0:19:42.760 --> 0:19:45.959
<v Speaker 1>we've got an idea of how the analog process of

0:19:46.040 --> 0:19:49.880
<v Speaker 1>recording and playback works. We transform stuff, but we still

0:19:49.920 --> 0:19:53.960
<v Speaker 1>have a continuous signal that represents sound, which is, you know,

0:19:54.000 --> 0:19:57.959
<v Speaker 1>a continuous phenomenon. As sound changes, as the pitch and

0:19:58.040 --> 0:20:01.120
<v Speaker 1>the frequency shifts, or as the volume changes, or as

0:20:01.160 --> 0:20:05.080
<v Speaker 1>different instruments or voices produce sounds, all those subtle and

0:20:05.160 --> 0:20:08.680
<v Speaker 1>maybe not so subtle shifts are part of that recording method.

0:20:09.040 --> 0:20:13.840
<v Speaker 1>It's an unbroken wave. Digital recording uses a different approach.

0:20:14.119 --> 0:20:17.760
<v Speaker 1>In a way, digital recording is like taking snapshots of

0:20:17.800 --> 0:20:21.480
<v Speaker 1>what is going on during a recording session. And I

0:20:21.560 --> 0:20:23.720
<v Speaker 1>thought of a kind of goofy analogy to sort of

0:20:23.760 --> 0:20:27.119
<v Speaker 1>explain what I mean. So imagine for a moment that

0:20:27.200 --> 0:20:30.720
<v Speaker 1>you are in a soundproofed room and you cannot hear

0:20:30.760 --> 0:20:34.760
<v Speaker 1>anything that's going on outside of this room. However, you

0:20:34.760 --> 0:20:37.200
<v Speaker 1>do have a little panel like almost like a hatch

0:20:37.440 --> 0:20:39.959
<v Speaker 1>in this room, and it happens to be facing a

0:20:40.000 --> 0:20:43.840
<v Speaker 1>really big orchestra pit, and the orchestra is playing. And

0:20:43.880 --> 0:20:45.600
<v Speaker 1>you know this because there's a light in the room

0:20:45.640 --> 0:20:47.439
<v Speaker 1>that lights up when the orchestra is playing. But you

0:20:47.480 --> 0:20:50.960
<v Speaker 1>can't hear anything because the room's soundproofed. However, next

0:20:51.000 --> 0:20:52.840
<v Speaker 1>to the panel is a button, and if you press

0:20:52.880 --> 0:20:55.280
<v Speaker 1>the button, the panel opens up, but only for a

0:20:55.320 --> 0:20:58.560
<v Speaker 1>split second. Next to the panel, you have a table,

0:20:58.680 --> 0:21:01.080
<v Speaker 1>you get some paper, you got a pen, and your

0:21:01.200 --> 0:21:04.560
<v Speaker 1>job is to press the button, listen for that split second,

0:21:05.000 --> 0:21:06.959
<v Speaker 1>and then write down what you think is going on

0:21:07.040 --> 0:21:10.480
<v Speaker 1>in the orchestra. You know, like you could write down

0:21:10.560 --> 0:21:15.840
<v Speaker 1>everything from the specific instruments that you're hearing, the relative

0:21:15.920 --> 0:21:19.359
<v Speaker 1>volume of those instruments, any sort of harmonies you're hearing.

0:21:19.640 --> 0:21:23.080
<v Speaker 1>Maybe you're even just trying to play name that tune. Now,

0:21:23.160 --> 0:21:26.360
<v Speaker 1>let's say there's some other rules in place too. If

0:21:26.359 --> 0:21:28.440
<v Speaker 1>you push the button, you are not allowed to push

0:21:28.440 --> 0:21:31.680
<v Speaker 1>it again until five seconds have passed. So every five

0:21:31.680 --> 0:21:34.919
<v Speaker 1>seconds you get another instant of sound as the panel

0:21:34.960 --> 0:21:39.080
<v Speaker 1>opens and closes. This is that little snapshot of what's happening.

0:21:39.080 --> 0:21:43.240
<v Speaker 1>It would be really hard to accurately describe the music

0:21:43.320 --> 0:21:46.640
<v Speaker 1>because you wouldn't have a lot of information to go by, right,

0:21:47.119 --> 0:21:50.360
<v Speaker 1>you would just have this instant of sound every five seconds.

0:21:50.520 --> 0:21:53.119
<v Speaker 1>It might as well be noise at that point. But

0:21:53.240 --> 0:21:56.480
<v Speaker 1>then let's say we start to decrease the delay, where

0:21:56.600 --> 0:21:59.000
<v Speaker 1>you get to have the panel open so that you're

0:21:59.040 --> 0:22:03.960
<v Speaker 1>getting these instants of sound closer together. As

0:22:04.040 --> 0:22:06.200
<v Speaker 1>that gets closer and closer, it will start to sound

0:22:06.200 --> 0:22:10.600
<v Speaker 1>more like uninterrupted music. Maybe we even rig up the button.

0:22:10.640 --> 0:22:13.120
<v Speaker 1>We tape down the button so it's always pressed down,

0:22:13.560 --> 0:22:15.600
<v Speaker 1>and the panel still has to open and close, but

0:22:16.040 --> 0:22:19.240
<v Speaker 1>it can open immediately after it shuts, so it's effectively

0:22:19.280 --> 0:22:22.840
<v Speaker 1>a shutter. At a fast enough rate, you wouldn't necessarily

0:22:22.920 --> 0:22:26.359
<v Speaker 1>even notice the shutter's effect on the music. To you,

0:22:26.560 --> 0:22:30.240
<v Speaker 1>it would sound unbroken if it were fast enough. And

0:22:30.280 --> 0:22:33.600
<v Speaker 1>then you could accurately describe the music. You could write down,

0:22:34.040 --> 0:22:35.919
<v Speaker 1>you know, depending on how quickly you can write, you

0:22:35.920 --> 0:22:39.080
<v Speaker 1>can write down a really accurate explanation of what is

0:22:39.080 --> 0:22:42.080
<v Speaker 1>going on with the music, or maybe you're just identifying

0:22:42.320 --> 0:22:46.040
<v Speaker 1>what piece is playing. But, uh, you know, in this case,

0:22:46.200 --> 0:22:49.000
<v Speaker 1>if you've got that shutter going at a high enough rate,

0:22:49.720 --> 0:22:53.080
<v Speaker 1>it's almost like you're not in a soundproof room at all. Well,

0:22:53.160 --> 0:22:57.800
<v Speaker 1>this kind of is how digital recording works. So rather

0:22:57.840 --> 0:23:02.760
<v Speaker 1>than preserving an unbroken signal, the digital process breaks

0:23:02.840 --> 0:23:07.160
<v Speaker 1>up a signal into discrete units. It has to because digital,

0:23:07.160 --> 0:23:10.040
<v Speaker 1>when we get down to it, we're talking about binary

0:23:10.160 --> 0:23:14.320
<v Speaker 1>data zeros and ones. You cannot use zeros and ones

0:23:14.880 --> 0:23:18.560
<v Speaker 1>to uh to to do anything other than talk about

0:23:18.760 --> 0:23:22.320
<v Speaker 1>discrete units. It can't be a continuous thing. Now, as

0:23:22.359 --> 0:23:24.400
<v Speaker 1>I mentioned earlier in this episode, there are a lot

0:23:24.440 --> 0:23:27.680
<v Speaker 1>of quantifiable elements we can look at when it comes

0:23:27.720 --> 0:23:30.800
<v Speaker 1>to sound. We can describe how loud it is, or

0:23:30.840 --> 0:23:34.320
<v Speaker 1>what frequency or pitch it is. We can describe the

0:23:34.359 --> 0:23:36.760
<v Speaker 1>timbre or quality of the sound. That that kind of

0:23:36.760 --> 0:23:39.360
<v Speaker 1>gets us into areas that are a little less concrete

0:23:39.400 --> 0:23:43.480
<v Speaker 1>at least in human language. And digital equipment like computers

0:23:44.119 --> 0:23:48.359
<v Speaker 1>are pretty good at handling things that are discrete and quantifiable.

0:23:48.520 --> 0:23:52.480
<v Speaker 1>This is the realm of computers. And remember, ultimately computers

0:23:52.480 --> 0:23:55.360
<v Speaker 1>are relying on those zeros and ones to describe everything.

0:23:55.680 --> 0:23:58.159
<v Speaker 1>Just to be clear, to get to this point, we

0:23:58.200 --> 0:24:01.600
<v Speaker 1>would need to use an analog to digital converter, but

0:24:01.640 --> 0:24:04.720
<v Speaker 1>I'm actually gonna circle round back to that later on.

0:24:05.000 --> 0:24:07.040
<v Speaker 1>For now, we're just going to focus on the basics

0:24:07.040 --> 0:24:11.119
<v Speaker 1>of digital recording because understanding that makes the whole you know,

0:24:11.320 --> 0:24:14.600
<v Speaker 1>a D C and d a C stuff way more

0:24:14.600 --> 0:24:18.960
<v Speaker 1>easy to understand. So, the way digital recording systems work

0:24:19.440 --> 0:24:23.960
<v Speaker 1>is that they take snapshots of a continuous wave. They're

0:24:24.000 --> 0:24:28.960
<v Speaker 1>measuring precisely all the elements at that moment in time

0:24:29.600 --> 0:24:33.359
<v Speaker 1>of the wave, or signal. Signal is probably

0:24:33.359 --> 0:24:35.760
<v Speaker 1>a better word than wave. Really, we're talking about the

0:24:35.800 --> 0:24:40.320
<v Speaker 1>electric signal generated as you're using a transducer to pick

0:24:40.400 --> 0:24:45.399
<v Speaker 1>up sound from you know, wherever. So in this way,

0:24:45.440 --> 0:24:48.480
<v Speaker 1>they're like that panel in that soundproof room. If the

0:24:48.520 --> 0:24:51.760
<v Speaker 1>sample rate is too low, if you are not sampling

0:24:51.880 --> 0:24:55.520
<v Speaker 1>the signal frequently enough, then you do not get an

0:24:55.520 --> 0:24:59.480
<v Speaker 1>accurate representation of that original signal. You know, You're you're

0:24:59.480 --> 0:25:01.640
<v Speaker 1>having to make a lot of guesses of what's happening

0:25:01.680 --> 0:25:05.400
<v Speaker 1>between each snapshot. Just like if you had a camera

0:25:05.880 --> 0:25:10.280
<v Speaker 1>and you were taking pictures of a fast moving, you know,

0:25:10.440 --> 0:25:14.000
<v Speaker 1>scenario in front of you. If the rate at which

0:25:14.000 --> 0:25:17.119
<v Speaker 1>you're taking pictures is pretty slow, you've got to make

0:25:17.160 --> 0:25:19.720
<v Speaker 1>a lot of interpretation of what happened between picture one

0:25:19.760 --> 0:25:22.440
<v Speaker 1>and picture two, and picture two and picture three. Same

0:25:22.480 --> 0:25:26.439
<v Speaker 1>thing with these digital recording systems. If you were to

0:25:26.480 --> 0:25:29.240
<v Speaker 1>try and play a recording like that back, it would

0:25:29.240 --> 0:25:32.199
<v Speaker 1>not sound very good because it would not be a

0:25:32.240 --> 0:25:35.520
<v Speaker 1>good representation of the original signal. So you need a

0:25:35.560 --> 0:25:39.280
<v Speaker 1>really fast sample rate to get an accurate representation of

0:25:39.320 --> 0:25:43.679
<v Speaker 1>what was really happening. This is the major difference between

0:25:43.720 --> 0:25:49.560
<v Speaker 1>analog and digital. Analog is continuous and unbroken. Digital is discrete.
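
NOTE
To make the snapshot idea concrete, here is a minimal sketch in Python. It is an
illustration under stated assumptions, not how a real converter works: the function
names, the 1,000 Hz test tone, and the three sample rates are all made up for the
example, and the straight-line guess between snapshots is a stand-in for the
guessing described above (real playback hardware smooths the signal with filters).
```python
import math
def sample_tone(freq_hz, sample_rate_hz, duration_s):
    """Take evenly spaced snapshots of a sine wave at freq_hz."""
    count = int(round(duration_s * sample_rate_hz))
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]
def worst_gap_error(freq_hz, sample_rate_hz, duration_s=0.01):
    """Largest miss when guessing the halfway point between neighboring snapshots."""
    samples = sample_tone(freq_hz, sample_rate_hz, duration_s)
    worst = 0.0
    for n in range(len(samples) - 1):
        guess = (samples[n] + samples[n + 1]) / 2  # straight-line guess (an assumption)
        t_mid = (n + 0.5) / sample_rate_hz          # time halfway between the two snapshots
        truth = math.sin(2 * math.pi * freq_hz * t_mid)
        worst = max(worst, abs(truth - guess))
    return worst
# Hypothetical rates chosen only to show the trend: slow, medium, fast.
for rate in (2_500, 8_000, 48_000):
    print(rate, round(worst_gap_error(1_000, rate), 4))
```
The printed error shrinks as the sample rate climbs, which is the shutter opening
and closing faster and faster.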

0:25:50.160 --> 0:25:53.600
<v Speaker 1>But if you are using a very fast sample rate,

0:25:53.800 --> 0:25:56.919
<v Speaker 1>you can create a digital record of a continuous signal

0:25:57.680 --> 0:26:02.240
<v Speaker 1>that to human ears appears to be continuous itself. Again,

0:26:02.760 --> 0:26:05.720
<v Speaker 1>if that shutter is opening and closing fast enough, it's

0:26:05.720 --> 0:26:09.119
<v Speaker 1>almost like it's not even there. Now, let's imagine that

0:26:09.200 --> 0:26:14.280
<v Speaker 1>we've got two graphs that are showing the same signal, right,

0:26:14.440 --> 0:26:18.240
<v Speaker 1>and on the left side we've got the analog signal represented,

0:26:18.600 --> 0:26:21.480
<v Speaker 1>and on the right side we've got the digital signal. Now,

0:26:22.080 --> 0:26:25.080
<v Speaker 1>let's say at first glance, these two are identical. They

0:26:25.080 --> 0:26:27.960
<v Speaker 1>both look like, you know, a typical sine wave. But

0:26:28.000 --> 0:26:31.080
<v Speaker 1>then you zoom into the analog representation, and no matter

0:26:31.119 --> 0:26:33.000
<v Speaker 1>how far you zoom in, you see it's just

0:26:33.040 --> 0:26:39.120
<v Speaker 1>a continuous, unbroken line that's representing this sine wave. Now,

0:26:39.200 --> 0:26:41.920
<v Speaker 1>let's say we take the digital one and we zoom

0:26:41.960 --> 0:26:43.760
<v Speaker 1>way in. Well, as we zoom way in and

0:26:43.840 --> 0:26:46.639
<v Speaker 1>we get closer, we see that rather than being continuous,

0:26:46.680 --> 0:26:50.679
<v Speaker 1>it's actually a series of discrete moments, like almost like

0:26:50.760 --> 0:26:54.880
<v Speaker 1>steps or stairs. That's kind of what we're talking about here.

0:26:54.920 --> 0:26:57.640
<v Speaker 1>The question is how many stairs do we use? Like,

0:26:57.680 --> 0:27:01.960
<v Speaker 1>what's the resolution that we're using here. You can kind

0:27:01.960 --> 0:27:05.280
<v Speaker 1>of think of it like megapixels in a picture. If

0:27:05.280 --> 0:27:08.080
<v Speaker 1>you don't have a lot of megapixels, then you might

0:27:08.160 --> 0:27:11.480
<v Speaker 1>see some blockiness in a photo. Once you get to

0:27:11.520 --> 0:27:15.359
<v Speaker 1>a certain density, depending on you know, the size of

0:27:15.400 --> 0:27:17.440
<v Speaker 1>the image you're looking at. Like, if you're looking at it

0:27:17.520 --> 0:27:19.119
<v Speaker 1>on the side of a building, you're gonna need a

0:27:19.160 --> 0:27:22.240
<v Speaker 1>lot of megapixels so it doesn't look blocky. But depending

0:27:22.280 --> 0:27:25.200
<v Speaker 1>on that, uh, it may look really smooth. Same sort

0:27:25.240 --> 0:27:28.359
<v Speaker 1>of thing with sound. Now, if you've ever played with

0:27:28.480 --> 0:27:33.440
<v Speaker 1>digital audio recorders, you've probably seen something labeled sample rate

0:27:33.760 --> 0:27:37.320
<v Speaker 1>or project rate. This refers to the number of samples

0:27:37.359 --> 0:27:41.000
<v Speaker 1>that the recording is taking every second, and to record

0:27:41.000 --> 0:27:43.840
<v Speaker 1>a sound, that sample rate has to be fast enough

0:27:44.080 --> 0:27:48.600
<v Speaker 1>to take two samples within one wavelength of every sound

0:27:48.680 --> 0:27:52.359
<v Speaker 1>that's appearing in that in that recording. And remember I

0:27:52.400 --> 0:27:56.520
<v Speaker 1>said that a sound's wavelength is inversely proportional, or has

0:27:56.560 --> 0:28:00.919
<v Speaker 1>an inversely proportional relationship, to the sound's frequency. So the

0:28:01.040 --> 0:28:05.240
<v Speaker 1>higher frequency sounds have shorter wavelengths, and you do need

0:28:05.320 --> 0:28:09.320
<v Speaker 1>two samples per wavelength to capture the data necessary to

0:28:09.440 --> 0:28:12.040
<v Speaker 1>have a recording of that sound. If the wavelength is

0:28:12.080 --> 0:28:15.080
<v Speaker 1>too small, then your sample rate will not be sufficient

0:28:15.119 --> 0:28:17.680
<v Speaker 1>to get all, you know, the full information about that

0:28:17.880 --> 0:28:20.080
<v Speaker 1>sound wave. You won't be able to record it, at

0:28:20.119 --> 0:28:23.960
<v Speaker 1>least not accurately. So remember I said the typical range

0:28:23.960 --> 0:28:27.840
<v Speaker 1>of human hearing is between twenty hertz to twenty

0:28:27.920 --> 0:28:32.399
<v Speaker 1>kilohertz, or twenty thousand hertz. That's twenty thousand cycles per second,

0:28:32.720 --> 0:28:35.040
<v Speaker 1>and you have to gather two samples per wavelength or

0:28:35.320 --> 0:28:37.760
<v Speaker 1>or cycle. So that means you need a sampling rate

0:28:37.800 --> 0:28:41.040
<v Speaker 1>of at least forty thousand times per second or forty

0:28:41.160 --> 0:28:44.040
<v Speaker 1>kilohertz to be able to sample everything that's within

0:28:44.120 --> 0:28:48.000
<v Speaker 1>the typical human hearing range. Well, a basic sample rate

0:28:48.040 --> 0:28:50.960
<v Speaker 1>that a lot of people will use for various recording

0:28:50.960 --> 0:28:55.160
<v Speaker 1>projects is forty four point one kilohertz, uh, and

0:28:55.200 --> 0:28:57.000
<v Speaker 1>then they go up from there. In fact, we use

0:28:57.200 --> 0:29:00.600
<v Speaker 1>forty eight kilohertz when we're recording our episodes. I'm

0:29:00.680 --> 0:29:02.880
<v Speaker 1>using forty eight kilohertz right now. I had to

0:29:02.960 --> 0:29:06.280
<v Speaker 1>check because I did accidentally do forty four point one

0:29:06.360 --> 0:29:09.600
<v Speaker 1>for an episode a few weeks back, and Tari needed

0:29:09.600 --> 0:29:12.040
<v Speaker 1>to gently remind me that I need to fix that.
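
NOTE
The two-samples-per-cycle arithmetic above can be written out as a small sketch.
The function names are made up for this example; the numbers are the ones
mentioned in the episode (twenty kilohertz hearing limit, 44.1 and 48 kilohertz
sample rates).
```python
def min_sample_rate_hz(highest_freq_hz):
    """Two samples per cycle: the rate must be at least double the highest frequency."""
    return 2 * highest_freq_hz
def highest_capturable_hz(sample_rate_hz):
    """The highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2
print(min_sample_rate_hz(20_000))      # 40000
print(highest_capturable_hz(44_100))   # 22050.0
print(highest_capturable_hz(48_000))   # 24000.0
```
That half-the-sample-rate ceiling is commonly called the Nyquist frequency. Both
common rates leave a little room above the twenty kilohertz hearing limit.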

0:29:12.480 --> 0:29:14.960
<v Speaker 1>So it's forty eight kilohertz. So I also

0:29:15.040 --> 0:29:18.719
<v Speaker 1>mentioned that we're quantifying all those elements about the sound.

0:29:18.800 --> 0:29:22.440
<v Speaker 1>We want as accurate a representation of the original sound

0:29:23.120 --> 0:29:25.920
<v Speaker 1>as possible. That means we're not just concerned with the

0:29:26.040 --> 0:29:30.000
<v Speaker 1>number of snapshots that we're taking every second. We're also

0:29:30.160 --> 0:29:33.960
<v Speaker 1>concerned with the quality of each of those snapshots. If

0:29:34.000 --> 0:29:36.640
<v Speaker 1>we were using a literal camera to take pictures, we

0:29:36.680 --> 0:29:38.880
<v Speaker 1>would want stuff like the lighting and the lens to

0:29:38.920 --> 0:29:42.080
<v Speaker 1>be perfect so that every single photo we got was

0:29:42.120 --> 0:29:45.320
<v Speaker 1>an accurate representation of what we were seeing when we

0:29:45.320 --> 0:29:48.400
<v Speaker 1>were there. Well, with digital recording, you know, we're not

0:29:48.440 --> 0:29:51.120
<v Speaker 1>talking about lights and cameras. We're talking about how much

0:29:51.280 --> 0:29:55.760
<v Speaker 1>data we're using to describe the original signal. This is

0:29:55.800 --> 0:29:59.920
<v Speaker 1>called bit depth. Bit depth refers to how many potential

0:30:00.160 --> 0:30:03.440
<v Speaker 1>values we can assign to a signal in an effort

0:30:03.480 --> 0:30:06.480
<v Speaker 1>to describe it. The more potential values we can use,

0:30:07.000 --> 0:30:10.640
<v Speaker 1>the more accurately we can describe the signal. So let's

0:30:10.640 --> 0:30:13.920
<v Speaker 1>do another analogy. All right, Let's say that we're in

0:30:13.960 --> 0:30:16.960
<v Speaker 1>a room. It's you and your best friend. Your best

0:30:16.960 --> 0:30:18.920
<v Speaker 1>friend's all the way across the other side of

0:30:18.920 --> 0:30:22.800
<v Speaker 1>the room, and I hand you a picture. Your job

0:30:23.040 --> 0:30:26.520
<v Speaker 1>is to describe that picture to your best friend who's

0:30:26.520 --> 0:30:28.959
<v Speaker 1>across the room. Your best friend cannot see the picture,

0:30:29.280 --> 0:30:32.400
<v Speaker 1>They can only hear your description. Their job is to

0:30:32.480 --> 0:30:35.520
<v Speaker 1>try and recreate the picture, to draw it as you

0:30:35.560 --> 0:30:38.640
<v Speaker 1>describe it. However, I give you some more restrictions. I say,

0:30:39.080 --> 0:30:42.040
<v Speaker 1>you can only use five adjectives. Uh, you can only

0:30:42.120 --> 0:30:44.720
<v Speaker 1>use five sentences, and they have to be simple sentences.

0:30:44.720 --> 0:30:48.680
<v Speaker 1>They can't be compound or complex or anything. Five simple

0:30:48.720 --> 0:30:52.680
<v Speaker 1>short sentences with a maximum of five adjectives to describe

0:30:52.720 --> 0:30:56.720
<v Speaker 1>that picture. Well, chances are your best friend would draw

0:30:56.760 --> 0:30:59.800
<v Speaker 1>something that's kind of similar to the picture I gave you,

0:31:00.360 --> 0:31:02.840
<v Speaker 1>but it wouldn't be an accurate copy of it. Right.

0:31:02.960 --> 0:31:04.480
<v Speaker 1>You might be like, Okay, I can see where you

0:31:04.560 --> 0:31:07.520
<v Speaker 1>got that based upon the description. But let's say we

0:31:07.560 --> 0:31:09.880
<v Speaker 1>repeat this task, and each time we repeat it, I

0:31:09.960 --> 0:31:12.320
<v Speaker 1>give you a little more freedom in how you can

0:31:12.360 --> 0:31:15.320
<v Speaker 1>describe the picture you're looking at to your friend. So

0:31:15.360 --> 0:31:17.760
<v Speaker 1>you get to use more adjectives, you get to use

0:31:17.760 --> 0:31:21.280
<v Speaker 1>more complex sentences, and each time you're given a larger

0:31:21.360 --> 0:31:24.320
<v Speaker 1>set of potential values that you can express to your

0:31:24.360 --> 0:31:27.880
<v Speaker 1>best friend. Well, that's kind of like bit depth. If

0:31:27.880 --> 0:31:32.080
<v Speaker 1>you're using sixteen-bit bit depth, that means you're using

0:31:32.120 --> 0:31:35.239
<v Speaker 1>sixteen bits to determine the range of values that can

0:31:35.320 --> 0:31:39.040
<v Speaker 1>describe the signal. So a bit is either a zero

0:31:39.160 --> 0:31:42.160
<v Speaker 1>or a one. With sixteen bits, you can represent up

0:31:42.160 --> 0:31:46.400
<v Speaker 1>to sixty five thousand, five hundred thirty six values. However,

0:31:46.880 --> 0:31:48.640
<v Speaker 1>let's say you were to go to thirty two bit,

0:31:48.960 --> 0:31:52.680
<v Speaker 1>so sixteen to thirty two, you would think, oh, you

0:31:52.680 --> 0:31:55.400
<v Speaker 1>could do twice as many. That's not that's not the case.
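
NOTE
A quick sketch of why doubling the bits does far more than double the values:
every added bit doubles the count, so the count is two raised to the number of
bits. The function name and the list of bit depths are chosen just for this
illustration.
```python
def value_count(bits):
    """Number of distinct values that a given number of bits can represent."""
    return 2 ** bits
for bits in (8, 16, 24, 32):
    print(bits, value_count(bits))
# Doubling the bit count squares the number of values rather than doubling it.
assert value_count(32) == value_count(16) ** 2
```
Sixteen bits gives 65,536 values and thirty two bits gives 4,294,967,296.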

0:31:55.640 --> 0:31:58.160
<v Speaker 1>With thirty two bit depth, you wouldn't be talking about

0:31:58.200 --> 0:32:01.040
<v Speaker 1>twice as many as sixteen bit. With thirty two bits,

0:32:01.080 --> 0:32:04.040
<v Speaker 1>you would be able to describe up to four billion,

0:32:04.200 --> 0:32:08.360
<v Speaker 1>two hundred ninety four million, nine hundred sixty seven thousand, two hundred ninety

0:32:08.440 --> 0:32:12.400
<v Speaker 1>six values. So the greater the bit depth, the more

0:32:12.560 --> 0:32:18.960
<v Speaker 1>accurately you can describe something. Essentially, uh So, it's both

0:32:19.000 --> 0:32:22.320
<v Speaker 1>the sample rate and bit depth together that can allow

0:32:22.400 --> 0:32:26.880
<v Speaker 1>a digital system to create a digital recording that represents

0:32:27.320 --> 0:32:32.160
<v Speaker 1>that continuous signal it was sampling. Again, the digital recording

0:32:32.520 --> 0:32:35.200
<v Speaker 1>is not continuous. If we zoomed way in, we would

0:32:35.200 --> 0:32:36.959
<v Speaker 1>see it's a bunch of these little steps that are

0:32:37.000 --> 0:32:39.560
<v Speaker 1>all linked together. But if the sample rate is high

0:32:39.680 --> 0:32:42.080
<v Speaker 1>enough and the bit depth is great enough, we can

0:32:42.120 --> 0:32:44.920
<v Speaker 1>reach a point where the human ear really can't discern

0:32:45.000 --> 0:32:48.800
<v Speaker 1>the difference. Does this mean at lower settings we would

0:32:48.800 --> 0:32:52.640
<v Speaker 1>actually notice a difference? If you go low enough, yeah,

0:32:52.760 --> 0:32:56.120
<v Speaker 1>but really, most of the time even sixteen bit is

0:32:56.160 --> 0:33:00.520
<v Speaker 1>sufficient for just plain old recording and playback. However, if

0:33:00.560 --> 0:33:02.520
<v Speaker 1>you want to work on a project. Let's say you're

0:33:02.840 --> 0:33:06.880
<v Speaker 1>an editor and you're you're trying to edit together music

0:33:06.960 --> 0:33:10.760
<v Speaker 1>files or audio files, larger bit depth gives you much

0:33:10.760 --> 0:33:14.160
<v Speaker 1>more space to work in without introducing stuff like distortion.
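
NOTE
To show how bit depth affects accuracy, here is a rough sketch that rounds a sine
wave onto a limited set of levels and reports the biggest rounding miss. It is a
simplification under stated assumptions: the function names are hypothetical, the
signal is assumed to sit between -1.0 and 1.0, and the levels are evenly spaced,
which is not necessarily how any particular converter lays them out.
```python
import math
def quantize(value, bits):
    """Snap a value in the range -1.0 to 1.0 onto the nearest of 2**bits evenly spaced levels."""
    steps = 2 ** bits - 1
    index = round((value + 1.0) / 2.0 * steps)  # which level is closest
    return index / steps * 2.0 - 1.0            # map the level back into -1.0 to 1.0
def worst_rounding_error(bits, points=1000):
    """Largest gap between a sine wave and its quantized copy over one cycle."""
    worst = 0.0
    for n in range(points):
        x = math.sin(2 * math.pi * n / points)
        worst = max(worst, abs(x - quantize(x, bits)))
    return worst
# Hypothetical bit depths picked to show the trend.
for bits in (4, 8, 16):
    print(bits, worst_rounding_error(bits))
```
The miss can never be bigger than half the spacing between levels, so more bits
means smaller errors baked into each snapshot.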

0:33:14.680 --> 0:33:18.040
<v Speaker 1>This is called headroom. And if you remember the character

0:33:18.120 --> 0:33:21.440
<v Speaker 1>Max Headroom, that name is a pun on this very

0:33:21.480 --> 0:33:25.880
<v Speaker 1>sort of thing. Technically, at the lower rates you get

0:33:26.000 --> 0:33:30.719
<v Speaker 1>deviations from the true sound. You're essentially inserting errors into

0:33:31.000 --> 0:33:35.160
<v Speaker 1>the digital file. Uh, as you increase sample rate and

0:33:35.200 --> 0:33:38.320
<v Speaker 1>bit depth, you can decrease those errors until you reach

0:33:38.360 --> 0:33:42.160
<v Speaker 1>a point where any errors that exist are are impossible

0:33:42.200 --> 0:33:45.840
<v Speaker 1>to detect, at least with our natural equipment. Maybe you

0:33:45.840 --> 0:33:49.720
<v Speaker 1>could detect them if you had supersensitive electronic equipment to

0:33:49.920 --> 0:33:52.680
<v Speaker 1>indicate it, but it wouldn't be something that would be

0:33:52.760 --> 0:33:57.600
<v Speaker 1>necessarily perceptible to human ears. One other interesting thing, or

0:33:57.600 --> 0:34:00.080
<v Speaker 1>a couple of interesting things that I should mention with

0:34:00.120 --> 0:34:03.040
<v Speaker 1>sample rates. So I said, like, your sample rate has

0:34:03.080 --> 0:34:05.360
<v Speaker 1>to be fast enough to capture two points of data

0:34:05.560 --> 0:34:08.279
<v Speaker 1>along the wavelength of every sound, and for most of us,

0:34:08.320 --> 0:34:11.560
<v Speaker 1>that hearing range caps out at twenty kilohertz. That

0:34:11.680 --> 0:34:13.600
<v Speaker 1>might lead you to the question, well, why would you

0:34:13.600 --> 0:34:16.560
<v Speaker 1>bother to go higher than forty kilohertz? Now, if

0:34:16.600 --> 0:34:19.560
<v Speaker 1>twenty kilohertz is the limit of human hearing, typical

0:34:19.600 --> 0:34:23.360
<v Speaker 1>human hearing, why go to forty four point one? Well,

0:34:24.280 --> 0:34:26.160
<v Speaker 1>there are some other things that we need to think

0:34:26.200 --> 0:34:28.960
<v Speaker 1>about that play a factor in this. One of those

0:34:28.960 --> 0:34:33.200
<v Speaker 1>is harmonics. Uh, now, harmonics are way too complicated for

0:34:33.239 --> 0:34:36.120
<v Speaker 1>me to really fully get into in this episode. But

0:34:36.239 --> 0:34:39.800
<v Speaker 1>harmonics can actually exist above the range of human hearing

0:34:40.120 --> 0:34:43.960
<v Speaker 1>and yet still shape how we experience a sound. You

0:34:43.960 --> 0:34:48.319
<v Speaker 1>can almost think of it as the harmonics are sculpting

0:34:49.080 --> 0:34:51.440
<v Speaker 1>the sounds we hear. So even harmonics that are outside

0:34:51.440 --> 0:34:53.880
<v Speaker 1>of our hearing range might be affecting the sounds we

0:34:54.000 --> 0:34:57.640
<v Speaker 1>still can hear. So we're not hearing the harmonics directly,

0:34:58.040 --> 0:35:00.960
<v Speaker 1>We're rather experiencing how they are affecting the rest of

0:35:01.000 --> 0:35:04.080
<v Speaker 1>the stuff we can perceive, if that makes sense. Well,

0:35:04.920 --> 0:35:07.279
<v Speaker 1>if you're sampling at a rate that's too low to

0:35:07.360 --> 0:35:10.640
<v Speaker 1>capture those harmonics, those harmonics are not going to be

0:35:10.680 --> 0:35:13.160
<v Speaker 1>in the digital recording, so they won't be in the playback.
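
NOTE
Here is a small sketch of which harmonics survive a given sample rate. It assumes
the usual definition of harmonics as whole-number multiples of a fundamental
frequency. The function names and the 6,000 hertz fundamental are made up for the
example.
```python
def harmonics_hz(fundamental_hz, count):
    """Whole-number multiples of the fundamental, starting with the fundamental itself."""
    return [fundamental_hz * k for k in range(1, count + 1)]
def split_by_sample_rate(freqs_hz, sample_rate_hz):
    """Separate frequencies a sample rate can capture from those it cannot."""
    limit = sample_rate_hz / 2  # two samples per cycle
    kept = [f for f in freqs_hz if f < limit]
    lost = [f for f in freqs_hz if f >= limit]
    return kept, lost
kept, lost = split_by_sample_rate(harmonics_hz(6_000, 6), 44_100)
print(kept)  # [6000, 12000, 18000]
print(lost)  # [24000, 30000, 36000]
```
At 44.1 kilohertz, everything from the fourth harmonic up never makes it into the
recording in this example.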

0:35:13.200 --> 0:35:16.279
<v Speaker 1>When you listen to it, you lose that sound. So

0:35:16.560 --> 0:35:19.040
<v Speaker 1>when you do listen back, you're gonna be losing those

0:35:19.080 --> 0:35:23.879
<v Speaker 1>effects and you're not going to experience the sound as

0:35:23.920 --> 0:35:25.840
<v Speaker 1>you would had you been in the place when it

0:35:25.920 --> 0:35:28.839
<v Speaker 1>was being recorded. Also, one thing that we can do

0:35:28.880 --> 0:35:31.600
<v Speaker 1>with recordings is we can change the pitch when we

0:35:31.640 --> 0:35:34.200
<v Speaker 1>record stuff. You know, like if you have a digital recording,

0:35:34.760 --> 0:35:39.319
<v Speaker 1>you can digitally change the pitch. In fact, Tari, if

0:35:39.360 --> 0:35:43.160
<v Speaker 1>you would like to digitally alter the pitch of my voice,

0:35:43.280 --> 0:35:47.479
<v Speaker 1>maybe increase the pitch so that I get that kind

0:35:47.520 --> 0:35:51.160
<v Speaker 1>of chipmunk sound to it. That's you know, boosting the

0:35:51.239 --> 0:35:55.960
<v Speaker 1>frequency up or maybe bringing that frequency way down and

0:35:56.000 --> 0:36:00.759
<v Speaker 1>giving me that deep, bass, booming voice that I know

0:36:00.840 --> 0:36:04.720
<v Speaker 1>I'll never have and I'll never be able to really

0:36:05.160 --> 0:36:10.279
<v Speaker 1>play like a baritone in a musical. Feel free to

0:36:10.320 --> 0:36:13.960
<v Speaker 1>do it. The world is your plaything. So you can

0:36:14.080 --> 0:36:16.759
<v Speaker 1>record audio with a sample rate of forty four point

0:36:16.760 --> 0:36:20.000
<v Speaker 1>one kilohertz. Then on playback, maybe you decide you

0:36:20.040 --> 0:36:23.960
<v Speaker 1>want to pitch everything down. Well, you'll hit a ceiling

0:36:24.320 --> 0:36:27.319
<v Speaker 1>of the sounds that you'll have in that recording once

0:36:27.320 --> 0:36:29.719
<v Speaker 1>you get to twenty two kilohertz or so. So, if there

0:36:29.760 --> 0:36:34.200
<v Speaker 1>were sounds that were above twenty two kilohertz, you're not really

0:36:34.200 --> 0:36:36.000
<v Speaker 1>going to be able to hear them with the pitched

0:36:36.080 --> 0:36:39.680
<v Speaker 1>down recording. Remember that pitch down recording will bring stuff

0:36:39.719 --> 0:36:43.200
<v Speaker 1>that is outside human hearing into the range of human

0:36:43.200 --> 0:36:46.000
<v Speaker 1>hearing because you've pitched it down. But if your sample

0:36:46.120 --> 0:36:50.160
<v Speaker 1>rate is too slow, too low, in other words, you

0:36:50.160 --> 0:36:55.040
<v Speaker 1>won't have captured those higher pitches. So let's say that

0:36:55.080 --> 0:36:58.520
<v Speaker 1>you're recording something that's in a very very high frequency,

0:36:58.680 --> 0:37:01.239
<v Speaker 1>like beyond the range of human hearing. But then you

0:37:01.280 --> 0:37:03.799
<v Speaker 1>want to do a pitch adjustment so that people can

0:37:03.840 --> 0:37:08.640
<v Speaker 1>actually hear a sound, even though you know normally they

0:37:08.640 --> 0:37:10.000
<v Speaker 1>wouldn't be able to hear it at all because it

0:37:10.000 --> 0:37:11.759
<v Speaker 1>would be outside their range. Maybe you're doing like a

0:37:11.840 --> 0:37:16.160
<v Speaker 1>nature documentary and there's a critter that makes sounds that

0:37:16.200 --> 0:37:18.600
<v Speaker 1>typically we cannot hear, but by pitching it down, you

0:37:18.600 --> 0:37:20.920
<v Speaker 1>can say this is what it sounds like once we

0:37:20.960 --> 0:37:23.200
<v Speaker 1>reduce the pitch. Well, you have to have a sample

0:37:23.280 --> 0:37:27.520
<v Speaker 1>rate that's high enough so that you capture that range

0:37:27.520 --> 0:37:30.319
<v Speaker 1>of sound in the first place. Right, So that's one

0:37:30.400 --> 0:37:33.120
<v Speaker 1>reason why you might want a very high sample rate.
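
NOTE
The nature documentary example can be sketched with a little arithmetic. Everything
specific here is hypothetical: the 50 kilohertz animal call, the two-octave shift,
and the function names are made up for illustration. The only outside fact used is
that dropping the pitch by one octave halves the frequency.
```python
def pitched_down_hz(freq_hz, octaves_down):
    """Each octave down halves the frequency."""
    return freq_hz / (2 ** octaves_down)
def sample_rate_needed_hz(highest_source_hz):
    """Two samples per cycle of the highest sound you want to keep."""
    return 2 * highest_source_hz
call_hz = 50_000  # hypothetical ultrasonic call, well above the 20 kHz hearing limit
print(pitched_down_hz(call_hz, 2))     # 12500.0
print(sample_rate_needed_hz(call_hz))  # 100000
print(call_hz < 44_100 / 2)            # False
```
Pitched down two octaves, the call lands comfortably inside human hearing, but only
a recording made at 100 kilohertz or faster would have captured it in the first
place. A 44.1 kilohertz recording would not.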

0:37:34.000 --> 0:37:36.239
<v Speaker 1>I just thought that was neat. All right, we need

0:37:36.280 --> 0:37:38.799
<v Speaker 1>to take another break. When we come back, we'll talk

0:37:38.840 --> 0:37:40.759
<v Speaker 1>about the process we need to follow in order to

0:37:40.800 --> 0:37:43.799
<v Speaker 1>go from analog to digital and back again. It's gonna

0:37:43.840 --> 0:37:45.160
<v Speaker 1>be a lot of us talking about some of the

0:37:45.160 --> 0:37:47.919
<v Speaker 1>stuff we just chatted about, and we'll also talk about

0:37:48.000 --> 0:37:51.160
<v Speaker 1>audiophiles a little bit. But first let's take another

0:37:51.239 --> 0:38:03.080
<v Speaker 1>quick break. Now, before I dive into the converters part,

0:38:03.160 --> 0:38:06.719
<v Speaker 1>I should add there are some outliers, right? There are

0:38:06.719 --> 0:38:10.560
<v Speaker 1>digital microphones, for example. Now, there are some digital microphones that

0:38:10.719 --> 0:38:13.399
<v Speaker 1>are analog at the front end, so in other words,

0:38:13.440 --> 0:38:16.440
<v Speaker 1>they still have the diaphragm, they still have the electromagnet,

0:38:17.360 --> 0:38:21.560
<v Speaker 1>they're still generating an electric current with varying voltage. But

0:38:21.680 --> 0:38:25.160
<v Speaker 1>then they'll have an analog to digital converter built into

0:38:25.200 --> 0:38:27.400
<v Speaker 1>the microphone itself. So you have an ADC

0:38:28.080 --> 0:38:31.400
<v Speaker 1>and it's right there in the device, and then you

0:38:31.480 --> 0:38:36.480
<v Speaker 1>have the signal go to other elements of your recording studio.

0:38:37.400 --> 0:38:41.720
<v Speaker 1>There are other digital microphones that use the pressure waves

0:38:41.800 --> 0:38:47.600
<v Speaker 1>to move elements that immediately convert into digital data. Getting

0:38:47.600 --> 0:38:50.960
<v Speaker 1>into that is pretty complicated. They are not super common.

0:38:51.080 --> 0:38:53.840
<v Speaker 1>It's not like that's the type of microphone that everyone

0:38:53.960 --> 0:38:59.239
<v Speaker 1>is using. Um, they're important, but you could argue that

0:38:59.280 --> 0:39:02.600
<v Speaker 1>it's a microphone and an ADC all in

0:39:02.680 --> 0:39:07.880
<v Speaker 1>one because you're taking audio, which is an analog, you know, signal,

0:39:08.840 --> 0:39:12.680
<v Speaker 1>and you're converting it immediately into binary or digital information.

0:39:13.719 --> 0:39:17.320
<v Speaker 1>But we're really going to talk about analog to digital

0:39:17.360 --> 0:39:21.279
<v Speaker 1>and digital to analog, which is what most equipment is

0:39:21.440 --> 0:39:24.360
<v Speaker 1>dealing with. When we're speaking about this kind of stuff,

0:39:24.360 --> 0:39:28.160
<v Speaker 1>We're not gonna worry about stuff that's native digital because

0:39:28.200 --> 0:39:32.600
<v Speaker 1>it's just it's not that common. Um like digital speakers

0:39:33.760 --> 0:39:37.399
<v Speaker 1>are a different thing altogether as well, and um yeah,

0:39:37.480 --> 0:39:40.120
<v Speaker 1>we're just gonna wipe those out. We're gonna look at

0:39:40.160 --> 0:39:42.480
<v Speaker 1>what most people use, which is that you know, your

0:39:42.480 --> 0:39:48.640
<v Speaker 1>typical stereo system or your typical audio recording setup. So again,

0:39:48.719 --> 0:39:53.960
<v Speaker 1>typically the end equipment that you use to either record

0:39:54.080 --> 0:39:56.640
<v Speaker 1>or listen to audio, the stuff at the very ends

0:39:56.640 --> 0:40:01.080
<v Speaker 1>of that chain are typically analog in nature. Again, there

0:40:01.080 --> 0:40:03.600
<v Speaker 1>are outliers, but for the vast majority of cases, we're

0:40:03.640 --> 0:40:08.280
<v Speaker 1>talking about an analog device that generates an analog signal

0:40:08.480 --> 0:40:12.279
<v Speaker 1>or plays back an analog signal. So we take an

0:40:12.280 --> 0:40:15.759
<v Speaker 1>analog phenomenon, the pressure waves that make up sound. We

0:40:15.920 --> 0:40:18.960
<v Speaker 1>feed that through a transducer to create a different but

0:40:19.120 --> 0:40:22.080
<v Speaker 1>still analog signal, in this case, an electric current with

0:40:22.200 --> 0:40:25.680
<v Speaker 1>variable voltage. But now we get to a point where

0:40:25.680 --> 0:40:28.000
<v Speaker 1>we say, all right, we want to transform that into

0:40:28.040 --> 0:40:33.800
<v Speaker 1>a digital file that quantifies this signal. When we play

0:40:33.880 --> 0:40:37.400
<v Speaker 1>the digital file back, that signal ultimately needs to go

0:40:37.480 --> 0:40:40.520
<v Speaker 1>through some kind of loudspeaker for us to hear it.

0:40:41.160 --> 0:40:44.200
<v Speaker 1>Maybe that loudspeaker's in our headphones, maybe it's a

0:40:44.239 --> 0:40:47.640
<v Speaker 1>stereo system, maybe it's you know, the speaker on your smartphone.

0:40:48.600 --> 0:40:50.839
<v Speaker 1>Maybe it's a sound system in a stadium. But we

0:40:50.880 --> 0:40:53.759
<v Speaker 1>need a way to transform that digital information, all those

0:40:53.800 --> 0:40:57.560
<v Speaker 1>zeros and ones into an electric signal with variable voltage,

0:40:58.000 --> 0:41:00.640
<v Speaker 1>and we probably have to amplify that signal so that

0:41:00.680 --> 0:41:03.719
<v Speaker 1>it's strong enough to drive whatever speakers we're using to

0:41:03.760 --> 0:41:08.360
<v Speaker 1>create the sound, which again we experience as an analog phenomenon. Now,

0:41:08.640 --> 0:41:11.520
<v Speaker 1>if there was some way that we can interface directly

0:41:11.600 --> 0:41:15.640
<v Speaker 1>with machines and have those digital signals interact with our brains,

0:41:16.640 --> 0:41:18.879
<v Speaker 1>maybe we wouldn't need to do this kind of transformation.

0:41:19.160 --> 0:41:21.600
<v Speaker 1>But as it stands we do have to do this,

0:41:21.880 --> 0:41:25.520
<v Speaker 1>and this is where converters come into play. The converters

0:41:25.719 --> 0:41:29.839
<v Speaker 1>could be standalone devices, or frequently they're worked into the

0:41:29.880 --> 0:41:34.040
<v Speaker 1>design of various pieces of equipment. So for example, a

0:41:34.160 --> 0:41:36.480
<v Speaker 1>USB microphone, if you have one of those that you

0:41:36.480 --> 0:41:39.040
<v Speaker 1>plug into your computer, like I'm using one right now

0:41:39.080 --> 0:41:44.080
<v Speaker 1>to record this, they have that ADC converter

0:41:44.320 --> 0:41:47.080
<v Speaker 1>built into them. And that I'm being repetitive because that's

0:41:47.160 --> 0:41:50.360
<v Speaker 1>analog to digital converter. And then I said ADC converter.

0:41:50.400 --> 0:41:53.560
<v Speaker 1>It's like saying ATM machine. Anyway, the microphone

0:41:53.600 --> 0:41:56.920
<v Speaker 1>still acts just as a traditional analog mic on that end,

0:41:57.360 --> 0:41:59.279
<v Speaker 1>but then the electric signal has to go through a

0:41:59.280 --> 0:42:02.840
<v Speaker 1>converter, which converts it into a digital signal, and that's what transmits

0:42:02.880 --> 0:42:04.960
<v Speaker 1>through the USB cable to, you know, whatever you've got it

0:42:04.960 --> 0:42:08.239
<v Speaker 1>hooked up to, like in my case, it's my work laptop.

0:42:08.719 --> 0:42:13.000
<v Speaker 1>Now here's the thing. There's more than one way to

0:42:13.120 --> 0:42:18.319
<v Speaker 1>convert an analog signal into a digital one. All of

0:42:18.360 --> 0:42:22.640
<v Speaker 1>these ways get pretty technical talking about the way it's sampled,

0:42:23.120 --> 0:42:28.439
<v Speaker 1>the way it ends up taking these measurements of the signal. So,

0:42:28.560 --> 0:42:31.200
<v Speaker 1>for example, with analog to digital converters, or

0:42:31.280 --> 0:42:34.720
<v Speaker 1>ADCs, there are several popular methodologies, but generally speaking,

0:42:35.600 --> 0:42:39.720
<v Speaker 1>they all do the same thing on a big picture scale.

0:42:39.840 --> 0:42:43.200
<v Speaker 1>They all sample a signal. This is the snapshots that

0:42:43.239 --> 0:42:45.920
<v Speaker 1>I was talking about earlier. They look at a signal

0:42:46.440 --> 0:42:49.680
<v Speaker 1>at a specific frequency, like a specific... They look at

0:42:49.680 --> 0:42:52.960
<v Speaker 1>the signal a specific number of times every second, and

0:42:53.000 --> 0:42:57.200
<v Speaker 1>they quantify the signal. They measure the signal, which determines

0:42:57.239 --> 0:43:01.239
<v Speaker 1>the resolution that you get of the signal. Obviously, if

0:43:01.239 --> 0:43:03.640
<v Speaker 1>you want high quality sound, you need both a good

0:43:03.640 --> 0:43:06.759
<v Speaker 1>sample rate and a good resolution, which we can think

0:43:06.800 --> 0:43:09.720
<v Speaker 1>of as you know, the accuracy in capturing the nature

0:43:09.880 --> 0:43:13.120
<v Speaker 1>of that signal. You can think of it as an

0:43:13.120 --> 0:43:17.120
<v Speaker 1>ADC is measuring the electric current many, many

0:43:17.160 --> 0:43:21.760
<v Speaker 1>times per second and quantifies that measurement as digital data.
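
NOTE
Editor's sketch, not part of the spoken episode: one way to picture the sampling and measuring just described.
A sine wave stands in for the microphone's voltage. The function name, the 1 kHz tone and the 8 kHz sample rate are illustrative assumptions, not details of any real converter.
```python
import math
def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, num_samples):
    """Take snapshots of a sine wave (range -1..1) and round each one to an integer code."""
    max_code = 2 ** (bit_depth - 1) - 1  # largest code, e.g. 32767 for 16-bit signed audio
    codes = []
    for n in range(num_samples):
        t = n / sample_rate_hz  # time of this snapshot, in seconds
        voltage = math.sin(2 * math.pi * freq_hz * t)
        codes.append(round(voltage * max_code))  # the "quantify" step
    return codes
coarse = sample_and_quantize(1_000, 8_000, 3, 8)  # 3-bit: only a handful of levels
fine = sample_and_quantize(1_000, 8_000, 16, 8)   # 16-bit: much finer steps
print(coarse)
print(fine)
```
Both lists describe the same eight snapshots. The 16-bit codes simply have far more levels to choose from, which is the resolution half of the quality picture.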

0:43:21.960 --> 0:43:26.960
<v Speaker 1>And it's not just like how important is this signal

0:43:27.080 --> 0:43:30.520
<v Speaker 1>at this specific moment in time, but also how important

0:43:30.560 --> 0:43:35.480
<v Speaker 1>are the changes in that signal over greater lengths of time. Now,

0:43:35.520 --> 0:43:37.680
<v Speaker 1>the bit depth we can think of as how detailed

0:43:37.719 --> 0:43:40.759
<v Speaker 1>these measurements can be. So the number of measurements and

0:43:40.800 --> 0:43:43.759
<v Speaker 1>the detail we get together determine the quality of the

0:43:43.800 --> 0:43:48.240
<v Speaker 1>digital signal compared to the original analog signal. And again

0:43:48.960 --> 0:43:52.760
<v Speaker 1>we're talking about digitally describing an electric current. At this point,

0:43:52.800 --> 0:43:57.680
<v Speaker 1>we're not talking about describing the sound necessarily. We're describing

0:43:57.680 --> 0:44:01.640
<v Speaker 1>the electric current that the transducer created after the sound

0:44:01.760 --> 0:44:04.920
<v Speaker 1>went through the transducer. Now, if the sample rate of

0:44:04.920 --> 0:44:07.359
<v Speaker 1>an ADC is too low, you get what's

0:44:07.360 --> 0:44:10.720
<v Speaker 1>called aliasing. Now, this means that the digital signal

0:44:10.760 --> 0:44:14.680
<v Speaker 1>will differ greatly from the original signal. Uh. And that

0:44:14.760 --> 0:44:16.959
<v Speaker 1>means that you're not going to have a good representation

0:44:17.040 --> 0:44:20.000
<v Speaker 1>of what was originally creating that signal in the first place,

0:44:20.160 --> 0:44:24.560
<v Speaker 1>in this case, whatever the sound was. Uh, so

0:44:24.560 --> 0:44:28.000
<v Speaker 1>that's what aliasing means in this context. Now, a

0:44:28.000 --> 0:44:30.440
<v Speaker 1>DAC, or D-A-C, is a digital to

0:44:30.480 --> 0:44:33.440
<v Speaker 1>analog converter, and it's basically the same thing we just

0:44:33.480 --> 0:44:37.160
<v Speaker 1>talked about, but in reverse. The DAC takes

0:44:37.480 --> 0:44:42.520
<v Speaker 1>digital information, which essentially is describing an analog signal, an

0:44:42.600 --> 0:44:47.840
<v Speaker 1>electric current of variable voltage. Then it produces that analog signal.

0:44:48.760 --> 0:44:51.560
<v Speaker 1>The way it does that, again, depends upon the type of

0:44:51.719 --> 0:44:54.520
<v Speaker 1>DAC. Just as ADCs have

0:44:54.640 --> 0:44:58.400
<v Speaker 1>different methodologies, so do DACs. I

0:44:58.520 --> 0:45:01.239
<v Speaker 1>might do an episode that goes into more detail, like

0:45:01.280 --> 0:45:03.719
<v Speaker 1>I mentioned at the top of this episode, But honestly,

0:45:03.800 --> 0:45:07.719
<v Speaker 1>once you really start diving in there, it gets incredibly

0:45:07.760 --> 0:45:13.200
<v Speaker 1>technical very quickly. Generally speaking, we're talking about sophisticated circuit

0:45:13.239 --> 0:45:16.520
<v Speaker 1>boards that are designed to convert digital to analog or

0:45:16.600 --> 0:45:20.239
<v Speaker 1>vice versa, to switch between the data made up of

0:45:20.360 --> 0:45:25.439
<v Speaker 1>zeros and ones and a continuous electric signal. And again,

0:45:25.480 --> 0:45:27.760
<v Speaker 1>if there's interest, I'll go into more about how that works,

0:45:27.800 --> 0:45:31.200
<v Speaker 1>but believe me, it gets really complicated, and without visual

0:45:31.480 --> 0:45:37.200
<v Speaker 1>aids it's really hard to kind of get it across. Anyway,

0:45:37.440 --> 0:45:40.040
<v Speaker 1>Now let's talk about audiophiles. Also, I should mention

0:45:40.320 --> 0:45:42.319
<v Speaker 1>there's a ton of stuff I did not talk about, right,

0:45:42.360 --> 0:45:46.600
<v Speaker 1>I didn't talk about multiplexing or anything like that, So

0:45:46.680 --> 0:45:48.600
<v Speaker 1>there is a lot more to it than just the

0:45:49.360 --> 0:45:52.880
<v Speaker 1>general information I'm giving. Anyway, audiophiles. So, back in

0:45:52.920 --> 0:45:56.279
<v Speaker 1>the day when CDs were fairly new, there

0:45:56.280 --> 0:46:00.000
<v Speaker 1>were audiophiles who just despised digital media. The process

0:46:00.040 --> 0:46:03.880
<v Speaker 1>of converting an analog signal into a digital file

0:46:04.239 --> 0:46:08.239
<v Speaker 1>and then back again to analog. Well, that represented a

0:46:08.280 --> 0:46:12.080
<v Speaker 1>potential loss in quality, right, the playback experience might not

0:46:12.239 --> 0:46:16.520
<v Speaker 1>be as vibrant. Audiophiles typically use words like warm

0:46:16.680 --> 0:46:20.080
<v Speaker 1>or full to describe sound. These are words that are

0:46:20.120 --> 0:46:24.719
<v Speaker 1>hard to quantify. They are experiential, I guess, and they

0:46:24.719 --> 0:46:29.600
<v Speaker 1>would lament that digitization removed some of those elements from recordings.

0:46:30.239 --> 0:46:33.840
<v Speaker 1>The thing is, depending upon how you're digitally sampling a signal,

0:46:34.000 --> 0:46:37.360
<v Speaker 1>some of that could be actually happening. You could be

0:46:37.480 --> 0:46:40.800
<v Speaker 1>losing harmonics. And this isn't even touching on the issue

0:46:40.840 --> 0:46:43.600
<v Speaker 1>that you start getting if you're if you're doing stuff

0:46:43.640 --> 0:46:47.279
<v Speaker 1>like compression, file compression in this sense. I'll talk about

0:46:47.280 --> 0:46:51.400
<v Speaker 1>audio compression in a bit, but file compression can involve

0:46:51.520 --> 0:46:56.239
<v Speaker 1>using what are called lossy formats. A lossy format discards

0:46:56.440 --> 0:47:01.600
<v Speaker 1>part of a digital file that describes a signal, and

0:47:01.719 --> 0:47:04.480
<v Speaker 1>typically the way it does this is that the encoding

0:47:04.600 --> 0:47:08.439
<v Speaker 1>process is getting rid of information that it deems as

0:47:08.480 --> 0:47:13.040
<v Speaker 1>being irrelevant. So let me explain that last bit. I

0:47:13.040 --> 0:47:15.840
<v Speaker 1>did a full series of episodes about MP3s that

0:47:15.920 --> 0:47:18.879
<v Speaker 1>goes into this in far more detail, but I'll give

0:47:18.880 --> 0:47:21.480
<v Speaker 1>the down and dirty version for this episode. So, the

0:47:21.600 --> 0:47:25.880
<v Speaker 1>MP3 method of compressing a file takes a

0:47:25.960 --> 0:47:30.200
<v Speaker 1>psychoacoustic approach, in part, when figuring out how to make

0:47:30.239 --> 0:47:34.920
<v Speaker 1>an audio file size smaller, because raw audio files can

0:47:34.960 --> 0:47:40.000
<v Speaker 1>be huge if you're really using a very high sample

0:47:40.120 --> 0:47:44.040
<v Speaker 1>rate and a big bit depth during your recording process,

0:47:44.080 --> 0:47:48.680
<v Speaker 1>you're generating enormous files, right because the system is taking

0:47:48.760 --> 0:47:53.160
<v Speaker 1>data many many many times, many thousands of times every

0:47:53.200 --> 0:47:56.480
<v Speaker 1>second and using an enormous amount of information to try

0:47:56.520 --> 0:48:01.879
<v Speaker 1>and describe that signal each time, every single snapshot. That's

0:48:01.920 --> 0:48:05.839
<v Speaker 1>a lot of information and that isn't really convenient if

0:48:05.880 --> 0:48:08.480
<v Speaker 1>you want to store that file on like an old

0:48:08.560 --> 0:48:11.160
<v Speaker 1>MP three player. You know, if you remember those where

0:48:11.200 --> 0:48:13.440
<v Speaker 1>you had to like in the old old days, you

0:48:13.480 --> 0:48:15.839
<v Speaker 1>had to connect them physically to your computer. You would

0:48:15.920 --> 0:48:20.279
<v Speaker 1>download or rip music and you would then send that

0:48:20.680 --> 0:48:24.480
<v Speaker 1>music file to your device. These devices had very limited

0:48:24.760 --> 0:48:27.719
<v Speaker 1>storage space on them, so you couldn't really hold a

0:48:27.719 --> 0:48:30.640
<v Speaker 1>lot of raw audio. Like a single file might end

0:48:30.680 --> 0:48:33.000
<v Speaker 1>up taking up the entire storage on your MP3

0:48:33.120 --> 0:48:36.640
<v Speaker 1>player. And maybe you really like Journey's Don't Stop Believin',

0:48:36.920 --> 0:48:39.200
<v Speaker 1>but you might want some other songs on there too.
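
NOTE
Editor's sketch, not part of the spoken episode: a rough way to see why raw audio gets so large.
Uncompressed audio size is just snapshots per second times bits per snapshot times channels times duration. The function name and the 4-minute song length are illustrative assumptions.
```python
def raw_audio_bytes(sample_rate_hz: int, bit_depth: int, channels: int, seconds: float) -> float:
    """Size of uncompressed audio: samples/second x bits/sample x channels x duration."""
    bits = sample_rate_hz * bit_depth * channels * seconds
    return bits / 8  # 8 bits per byte
# CD-quality settings (44.1 kHz, 16-bit, stereo) for a hypothetical 4-minute song.
cd_song = raw_audio_bytes(44_100, 16, 2, 4 * 60)
# A higher-resolution setting, chosen only for comparison.
studio_song = raw_audio_bytes(192_000, 24, 2, 4 * 60)
print(f"CD quality: {cd_song / 1e6:.1f} MB")
print(f"192 kHz / 24-bit: {studio_song / 1e6:.1f} MB")
```
Raising either the sample rate or the bit depth scales the file size in direct proportion, which is why small players and slow connections needed compression.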

0:48:39.320 --> 0:48:41.120
<v Speaker 1>This is also more complicated if you want to do

0:48:41.200 --> 0:48:45.440
<v Speaker 1>something like stream music. You don't want to have enormous

0:48:45.480 --> 0:48:49.719
<v Speaker 1>files that would require like a gigabit Internet connection in

0:48:49.800 --> 0:48:52.200
<v Speaker 1>order to be able to stream it, So you have

0:48:52.280 --> 0:48:55.040
<v Speaker 1>to have a way to compress files down to sizes

0:48:55.040 --> 0:48:57.879
<v Speaker 1>that are easier to handle. Well, the way the

0:48:58.080 --> 0:49:01.840
<v Speaker 1>MP3 algorithm does this is that once you set some

0:49:01.920 --> 0:49:05.719
<v Speaker 1>general parameters, like you decide how compressed you want to

0:49:05.760 --> 0:49:10.040
<v Speaker 1>make this file, essentially you're telling the MP3 algorithm

0:49:10.080 --> 0:49:13.080
<v Speaker 1>how hard it needs to go when it's starting to

0:49:13.120 --> 0:49:16.239
<v Speaker 1>cut stuff. Well, then the algorithm begins to toss out

0:49:16.320 --> 0:49:19.040
<v Speaker 1>data that, at least in theory, should not affect your

0:49:19.120 --> 0:49:23.600
<v Speaker 1>experience when you listen back to the audio playback. So,

0:49:23.640 --> 0:49:27.239
<v Speaker 1>for example, let's say you've got a sound file and

0:49:27.320 --> 0:49:29.960
<v Speaker 1>in that sound file you have a very soft sound

0:49:30.400 --> 0:49:34.000
<v Speaker 1>that immediately follows a very loud sound. So loud sound happens,

0:49:34.239 --> 0:49:38.840
<v Speaker 1>soft sound happens immediately after that. Well, you wouldn't actually

0:49:38.840 --> 0:49:41.719
<v Speaker 1>hear that really soft sound, and that's just because of

0:49:41.719 --> 0:49:46.200
<v Speaker 1>how our ears work. The loud sound effectively masks the

0:49:46.239 --> 0:49:49.120
<v Speaker 1>softer one, so it's almost like the soft one didn't

0:49:49.120 --> 0:49:51.640
<v Speaker 1>exist at all. Well, if it's like the soft one

0:49:51.719 --> 0:49:55.799
<v Speaker 1>didn't exist, then there's no reason to keep it, right?

0:49:55.960 --> 0:49:58.920
<v Speaker 1>If you couldn't hear it anyway, there's no reason that

0:49:58.920 --> 0:50:02.719
<v Speaker 1>that should be in, uh, the file, right? So the

0:50:02.840 --> 0:50:08.080
<v Speaker 1>algorithm effectively, through analyzing this data says, ah, that doesn't

0:50:08.120 --> 0:50:09.879
<v Speaker 1>need to be in there. No one would hear it,

0:50:09.960 --> 0:50:12.560
<v Speaker 1>so it tosses the data out. That's why it's a

0:50:12.640 --> 0:50:17.000
<v Speaker 1>lossy file format. The same goes for frequencies that would

0:50:17.040 --> 0:50:20.439
<v Speaker 1>be outside the range of human hearing. The logic is, well,

0:50:20.480 --> 0:50:24.080
<v Speaker 1>you can't hear something that's at kilohertz, so we're

0:50:24.120 --> 0:50:26.560
<v Speaker 1>just gonna get rid of anything that's occurring at that

0:50:26.600 --> 0:50:31.120
<v Speaker 1>frequency because there's no reason to keep it. However, depending

0:50:31.120 --> 0:50:33.200
<v Speaker 1>on how much you want to compress that file, those

0:50:33.239 --> 0:50:36.920
<v Speaker 1>cuts can really start to affect the quality of the

0:50:37.000 --> 0:50:40.400
<v Speaker 1>playback audio when you put it back through you know,

0:50:40.440 --> 0:50:42.279
<v Speaker 1>a decoder and you get the audio on the

0:50:42.320 --> 0:50:46.719
<v Speaker 1>other end. By the way, as I mentioned, file compression

0:50:46.880 --> 0:50:50.640
<v Speaker 1>is not the same thing as audio compression. I'll explain

0:50:51.239 --> 0:50:53.640
<v Speaker 1>what I mean by that, but first let's take one

0:50:53.760 --> 0:51:04.360
<v Speaker 1>last break. Okay, before the break, I said that audio

0:51:04.400 --> 0:51:06.879
<v Speaker 1>compression and file compression are two different things. It does

0:51:06.960 --> 0:51:10.439
<v Speaker 1>get confusing, and I myself have been guilty of kind

0:51:10.440 --> 0:51:13.960
<v Speaker 1>of interchanging the words or not clarifying enough while talking

0:51:14.000 --> 0:51:18.000
<v Speaker 1>about compression and uh, thus I have been guilty of

0:51:18.040 --> 0:51:21.279
<v Speaker 1>confusing it even more so. My apologies for that, but

0:51:21.360 --> 0:51:25.680
<v Speaker 1>let's get to it. Audio compression refers to reducing the

0:51:25.760 --> 0:51:30.680
<v Speaker 1>dynamic range of volume in a recording. Uh So, in

0:51:30.719 --> 0:51:34.960
<v Speaker 1>other words, it's about reducing the volume distance between the

0:51:35.080 --> 0:51:38.480
<v Speaker 1>softest sounds and the loudest sounds. Now, this can be

0:51:38.520 --> 0:51:41.600
<v Speaker 1>really important for certain types of recording. I'll give you

0:51:41.600 --> 0:51:44.840
<v Speaker 1>an example that I frequently run into that drives me nuts.

0:51:44.880 --> 0:51:48.520
<v Speaker 1>And this happens a lot with like streaming media for me,

0:51:49.120 --> 0:51:52.440
<v Speaker 1>so movies and television. Have you ever watched like an

0:51:52.440 --> 0:51:56.120
<v Speaker 1>action movie where you can barely hear some of the dialogue,

0:51:56.200 --> 0:51:59.160
<v Speaker 1>especially if people are speaking in like low voices and

0:51:59.239 --> 0:52:01.759
<v Speaker 1>you know they're trying to be secretive or whatever. And

0:52:01.800 --> 0:52:03.440
<v Speaker 1>then so you turn the volume up so that you

0:52:03.440 --> 0:52:05.439
<v Speaker 1>can hear what people are saying. But then the next

0:52:05.480 --> 0:52:09.359
<v Speaker 1>time something explodes, you're worried that you've just destroyed all

0:52:09.400 --> 0:52:12.880
<v Speaker 1>your speakers, or maybe you've caused yourself permanent hearing damage.

0:52:13.320 --> 0:52:16.160
<v Speaker 1>This happens to me all the time, where the softest

0:52:16.200 --> 0:52:20.799
<v Speaker 1>sounds and the loudest sounds are so far apart that

0:52:20.880 --> 0:52:24.480
<v Speaker 1>there's no comfortable volume I can select where I can

0:52:24.480 --> 0:52:28.399
<v Speaker 1>hear everything and not feel like, one, I'm missing out

0:52:28.440 --> 0:52:31.239
<v Speaker 1>on some dialogue, or two, my neighbors are going to

0:52:31.280 --> 0:52:33.919
<v Speaker 1>come over and complain that I've got my volume turned

0:52:34.000 --> 0:52:37.080
<v Speaker 1>up too loud. So compression in a case like that

0:52:37.239 --> 0:52:40.799
<v Speaker 1>can narrow the gap between the softest parts and the

0:52:40.840 --> 0:52:43.840
<v Speaker 1>loudest parts so that you can find that kind of

0:52:43.880 --> 0:52:49.600
<v Speaker 1>comfortable volume where you can hear everything. However, going overboard

0:52:50.000 --> 0:52:53.400
<v Speaker 1>with audio compression will reduce the dynamic range in a

0:52:53.440 --> 0:52:56.399
<v Speaker 1>recorded piece of audio, and if you do that too much,

0:52:56.640 --> 0:52:59.959
<v Speaker 1>it can make the audio sound flat and uninteresting, where

0:53:00.040 --> 0:53:03.000
<v Speaker 1>everything is just coming out at exactly the same volume.
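
NOTE
Editor's sketch, not part of the spoken episode: a bare-bones picture of audio (dynamic range) compression.
Levels above a threshold are scaled down by a ratio. The threshold, the ratio and the example levels are illustrative assumptions, and real compressors add attack, release and make-up gain.
```python
def compress_db(level_db: float, threshold_db: float = -20.0, ratio: float = 4.0) -> float:
    """Return the output level in dB; anything above the threshold is reduced by the ratio."""
    if level_db <= threshold_db:
        return level_db  # quiet material passes through unchanged
    return threshold_db + (level_db - threshold_db) / ratio
whisper_db, explosion_db = -40.0, 0.0  # hypothetical quiet dialogue and loud blast
before = explosion_db - whisper_db  # gap between softest and loudest, uncompressed
after = compress_db(explosion_db) - compress_db(whisper_db)  # gap after compression
print(before, after)
```
Pushing the ratio very high squeezes everything above the threshold toward the same level, which is the flat sound described above.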

0:53:03.719 --> 0:53:07.120
<v Speaker 1>If there's no real volume range, then your ears just

0:53:07.200 --> 0:53:10.120
<v Speaker 1>kind of get tired of hearing everything played back at

0:53:10.160 --> 0:53:15.719
<v Speaker 1>that same level. Some digital recordings really suffered from this

0:53:16.160 --> 0:53:19.520
<v Speaker 1>kind of processing, Like there was an era of music

0:53:20.000 --> 0:53:24.480
<v Speaker 1>where audiophiles in particular were really complaining that everything

0:53:24.520 --> 0:53:27.480
<v Speaker 1>that was being laid down had so much compression in

0:53:27.520 --> 0:53:30.920
<v Speaker 1>it that there was no real dynamic range in the audio,

0:53:31.920 --> 0:53:34.320
<v Speaker 1>and it just meant that the music wasn't as interesting,

0:53:34.440 --> 0:53:39.640
<v Speaker 1>like there wasn't there wasn't enough variation, and it makes

0:53:39.920 --> 0:53:43.680
<v Speaker 1>music kind of boring. Uh, it wouldn't matter if you

0:53:43.760 --> 0:53:47.600
<v Speaker 1>had an analog pressing of a digital recording session, because

0:53:47.640 --> 0:53:52.160
<v Speaker 1>analog does not magically fix the problems of the recording process.

0:53:52.200 --> 0:53:55.640
<v Speaker 1>So if you're recording digitally, and then you make a

0:53:55.719 --> 0:53:59.839
<v Speaker 1>vinyl record pressing of that digital recording. I mean, all

0:53:59.840 --> 0:54:02.640
<v Speaker 1>that processing you did on the digital side, that's still

0:54:02.680 --> 0:54:06.960
<v Speaker 1>going to be part of what ends up being recorded

0:54:07.000 --> 0:54:11.240
<v Speaker 1>on the vinyl. It's it's not like vinyl suddenly cures

0:54:11.280 --> 0:54:15.280
<v Speaker 1>all sins of digital. So even if you were to

0:54:15.280 --> 0:54:19.400
<v Speaker 1>to go with analog audio media, you would still have

0:54:19.440 --> 0:54:23.920
<v Speaker 1>the same problems that were introduced in the digital processing. Now,

0:54:24.360 --> 0:54:27.040
<v Speaker 1>this does not mean that all digital audio is

0:54:27.080 --> 0:54:31.560
<v Speaker 1>inherently flawed. Even if we just look at the analog chain,

0:54:31.840 --> 0:54:35.200
<v Speaker 1>we have to acknowledge that the process of recording and

0:54:35.280 --> 0:54:38.880
<v Speaker 1>playback means, you know, you're taking pressure waves of the

0:54:38.880 --> 0:54:42.160
<v Speaker 1>original sound, you pass them through a system that converts

0:54:42.200 --> 0:54:45.520
<v Speaker 1>those pressure waves into an analog electric signal, and then

0:54:45.520 --> 0:54:48.880
<v Speaker 1>you've got to reverse that process during playback, and stuff

0:54:48.960 --> 0:54:52.840
<v Speaker 1>can happen along that pathway that could affect either the

0:54:52.880 --> 0:54:56.600
<v Speaker 1>recording process or the playback process or both. So, in

0:54:56.640 --> 0:55:01.800
<v Speaker 1>other words, analog does not necessarily mean better, because flaws

0:55:01.880 --> 0:55:05.000
<v Speaker 1>can exist in the analog approach just as they can

0:55:05.080 --> 0:55:08.400
<v Speaker 1>with the digital approach. And there are other elements as well,

0:55:08.520 --> 0:55:12.359
<v Speaker 1>such as low level noise. Analog systems can introduce a

0:55:12.400 --> 0:55:17.600
<v Speaker 1>low level noise into a signal. Digital avoids that. Now,

0:55:17.640 --> 0:55:20.200
<v Speaker 1>that does not mean that digital is better, mind you,

0:55:20.280 --> 0:55:23.120
<v Speaker 1>because there are other ways to reduce and eliminate noise

0:55:23.120 --> 0:55:27.239
<v Speaker 1>in analog systems, and digital can introduce other artifacts that

0:55:27.360 --> 0:55:30.560
<v Speaker 1>didn't exist in the original signal, and then that comes

0:55:30.560 --> 0:55:34.719
<v Speaker 1>across as like errors in your playback, Like you might

0:55:34.760 --> 0:55:37.480
<v Speaker 1>hear some weird blip noise and think, what the heck

0:55:37.560 --> 0:55:40.840
<v Speaker 1>was that, and it wasn't necessarily present in the original

0:55:40.880 --> 0:55:45.359
<v Speaker 1>recording session, but was introduced as a digital artifact. This

0:55:45.440 --> 0:55:49.359
<v Speaker 1>is just another example of how one format is not

0:55:49.600 --> 0:55:52.839
<v Speaker 1>necessarily superior to the other. It depends on way too

0:55:52.840 --> 0:55:58.040
<v Speaker 1>many other factors. They're just different. And honestly, I'm fairly

0:55:58.160 --> 0:56:02.040
<v Speaker 1>confident that if you were to do a double blind test,

0:56:02.600 --> 0:56:05.840
<v Speaker 1>and just in case you're unfamiliar with that term, double

0:56:05.840 --> 0:56:09.560
<v Speaker 1>blind is a type of scientific test where neither the

0:56:09.600 --> 0:56:12.560
<v Speaker 1>subject that's going through the test nor the person who

0:56:12.680 --> 0:56:18.280
<v Speaker 1>is in charge of administering the test knows which version

0:56:18.480 --> 0:56:20.480
<v Speaker 1>anyone is getting. So if you have a control group,

0:56:20.719 --> 0:56:24.680
<v Speaker 1>the person administering the test doesn't know if that's a

0:56:24.719 --> 0:56:27.920
<v Speaker 1>control group or if it's an actual test group. That

0:56:28.000 --> 0:56:31.240
<v Speaker 1>they're testing at any given time. That way, the person

0:56:31.280 --> 0:56:35.000
<v Speaker 1>administering the test does not give bias to the person

0:56:35.040 --> 0:56:38.040
<v Speaker 1>who's experiencing the test. The thought is, if I know

0:56:38.200 --> 0:56:41.960
<v Speaker 1>as an administrator, then I might give a tell to

0:56:42.080 --> 0:56:44.839
<v Speaker 1>the test subject. Right. So let's say it's a double

0:56:44.840 --> 0:56:47.520
<v Speaker 1>blind test and the audiophiles are going into a

0:56:47.600 --> 0:56:51.279
<v Speaker 1>room and they're going to experience the same piece of

0:56:51.320 --> 0:56:56.440
<v Speaker 1>audio recording, but it's over different systems. So like some

0:56:56.520 --> 0:57:00.279
<v Speaker 1>of the pieces of the same stretch of audio are

0:57:00.360 --> 0:57:04.120
<v Speaker 1>analog sources, some of them are digital sources. They might

0:57:04.160 --> 0:57:07.520
<v Speaker 1>even include different like systems like premium systems, like super

0:57:07.680 --> 0:57:11.000
<v Speaker 1>super high end systems that cost maybe upwards of hundreds

0:57:11.040 --> 0:57:14.040
<v Speaker 1>of thousands of dollars, and maybe just on some that

0:57:14.040 --> 0:57:18.040
<v Speaker 1>are really good systems, like you know, they're still expensive.

0:57:18.120 --> 0:57:21.280
<v Speaker 1>Maybe it's a few thousand dollars, but they're not, you know,

0:57:21.440 --> 0:57:25.880
<v Speaker 1>monumentally expensive. My bet is that most audiophiles would

0:57:25.880 --> 0:57:30.200
<v Speaker 1>have trouble picking out which ones are analog systems versus

0:57:30.240 --> 0:57:33.680
<v Speaker 1>digital systems unless there's some giveaway, like if you hear

0:57:33.720 --> 0:57:38.080
<v Speaker 1>the scratch of a needle hitting record, then that's kind

0:57:38.120 --> 0:57:40.400
<v Speaker 1>of a dead giveaway. But let's say you know you're

0:57:40.440 --> 0:57:43.560
<v Speaker 1>you're talking about, like the highest of high ends. I

0:57:43.600 --> 0:57:45.280
<v Speaker 1>don't think they would be able to tell the difference

0:57:45.480 --> 0:57:48.800
<v Speaker 1>very easily. And that's because our approach to digital processing

0:57:48.880 --> 0:57:52.480
<v Speaker 1>has become sophisticated enough that to our human ears, it's

0:57:52.480 --> 0:57:57.840
<v Speaker 1>pretty close to an analog signal. And that, you know... Also,

0:57:57.840 --> 0:58:01.000
<v Speaker 1>I want to mention the returns on those high end

0:58:01.080 --> 0:58:05.880
<v Speaker 1>audio equipment, like the differences that you start to see

0:58:05.920 --> 0:58:09.920
<v Speaker 1>when you're really hitting that upper echelon of audio equipment.

0:58:10.280 --> 0:58:13.440
<v Speaker 1>Some of those returns are so minor that after you

0:58:13.480 --> 0:58:17.840
<v Speaker 1>reach a certain point, they are largely meaningless. Like like,

0:58:18.160 --> 0:58:20.720
<v Speaker 1>as far as perception goes, you wouldn't be able to

0:58:20.760 --> 0:58:24.320
<v Speaker 1>tell the difference. Uh, and for people like me, people

0:58:24.360 --> 0:58:27.400
<v Speaker 1>who have had some hearing loss, it matters even less

0:58:27.400 --> 0:58:30.920
<v Speaker 1>than that, right? Because, like, for me, like, you could

0:58:30.920 --> 0:58:33.760
<v Speaker 1>almost say, it's like I have an unsophisticated palate. Like,

0:58:33.920 --> 0:58:36.960
<v Speaker 1>you could serve me an amazing meal, but I'm not

0:58:37.080 --> 0:58:39.920
<v Speaker 1>likely to notice it being any better than you know,

0:58:40.400 --> 0:58:46.360
<v Speaker 1>a cheeseburger. But again, this is my hypothesis. I I

0:58:46.440 --> 0:58:50.760
<v Speaker 1>believe this is probably true. It is entirely possible, and

0:58:50.840 --> 0:58:53.840
<v Speaker 1>I admit this, that if I actually were to conduct

0:58:54.040 --> 0:58:56.600
<v Speaker 1>this kind of study, I might find that I'm totally

0:58:56.600 --> 0:58:59.240
<v Speaker 1>wrong, that the audiophiles are like, no, that is

0:58:59.280 --> 0:59:02.880
<v Speaker 1>clearly the premium, and maybe the differences are subtle, but

0:59:03.000 --> 0:59:06.120
<v Speaker 1>maybe they're detectable. Right? It could be that I'm wrong

0:59:06.160 --> 0:59:09.120
<v Speaker 1>about that. Uh, I just think that there gets to

0:59:09.160 --> 0:59:14.919
<v Speaker 1>be a point where people start to buy into a philosophy,

0:59:15.080 --> 0:59:20.840
<v Speaker 1>especially with audiophiles, that isn't necessarily supportable by, you know,

0:59:21.160 --> 0:59:25.360
<v Speaker 1>quantifiable evidence. It becomes so subjective that once you remove

0:59:25.400 --> 0:59:28.240
<v Speaker 1>the subjective element, like you remove the ability for them

0:59:28.320 --> 0:59:31.040
<v Speaker 1>to know whether or not they're listening to their preferred

0:59:31.080 --> 0:59:35.000
<v Speaker 1>setup, that it starts to disappear. Maybe I'm wrong

0:59:35.040 --> 0:59:38.400
<v Speaker 1>about that. I don't think I am, But that's the

0:59:38.480 --> 0:59:41.560
<v Speaker 1>overview of analog and digital and why you have to

0:59:41.600 --> 0:59:44.200
<v Speaker 1>have the converters. As I said, to get into specifics

0:59:44.640 --> 0:59:47.320
<v Speaker 1>would take more time, you know, talking about delta-sigma

0:59:47.360 --> 0:59:50.640
<v Speaker 1>processing and that kind of stuff. But if you want it,

0:59:51.160 --> 0:59:53.600
<v Speaker 1>let me know and I will try and put that

0:59:53.680 --> 0:59:58.080
<v Speaker 1>episode together. It will just be far more niche oriented

0:59:58.120 --> 1:00:02.000
<v Speaker 1>than even this one was. But it is a fascinating subject,

1:00:02.120 --> 1:00:06.160
<v Speaker 1>like there's some really cool technology that goes into making

1:00:06.200 --> 1:00:10.400
<v Speaker 1>this all work, and the fact that that technology does

1:00:10.480 --> 1:00:12.920
<v Speaker 1>work and that it has become so sophisticated is why

1:00:12.960 --> 1:00:16.880
<v Speaker 1>I feel pretty confident in saying that with a sufficiently

1:00:16.960 --> 1:00:19.280
<v Speaker 1>good system, you wouldn't be able to tell the difference.

1:00:19.840 --> 1:00:22.920
<v Speaker 1>But that's it for this episode. If you would like

1:00:23.080 --> 1:00:25.760
<v Speaker 1>me to cover any kind of topic, whatever it might

1:00:25.800 --> 1:00:28.200
<v Speaker 1>be in the tech world, let me know. The best

1:00:28.200 --> 1:00:30.560
<v Speaker 1>way to get in touch is on Twitter. The handle

1:00:30.600 --> 1:00:34.040
<v Speaker 1>for the show is TechStuffHSW, and I

1:00:34.040 --> 1:00:37.439
<v Speaker 1>greatly appreciate it. I'm getting some wonderful suggestions. Really makes

1:00:37.440 --> 1:00:39.640
<v Speaker 1>my job easier because I know exactly what people want

1:00:39.680 --> 1:00:43.360
<v Speaker 1>to hear. Um, so yeah, reach out and let me

1:00:43.360 --> 1:00:46.360
<v Speaker 1>know what you think and I'll talk to you again

1:00:47.240 --> 1:00:55.160
<v Speaker 1>really soon. TechStuff is an iHeartRadio production.

1:00:55.400 --> 1:00:58.240
<v Speaker 1>For more podcasts from iHeartRadio, visit the

1:00:58.320 --> 1:01:01.439
<v Speaker 1>iHeartRadio app, Apple Podcasts, or wherever you listen

1:01:01.480 --> 1:01:02.520
<v Speaker 1>to your favorite shows.