WEBVTT - Web Analytics and Your Data

0:00:04.120 --> 0:00:07.160
<v Speaker 1>Get in touch with technology with TechStuff from

0:00:07.200 --> 0:00:13.720
<v Speaker 1>HowStuffWorks dot com. Hey there, and welcome to TechStuff.

0:00:13.760 --> 0:00:16.439
<v Speaker 1>I'm your host, Jonathan Strickland. I'm an executive producer with

0:00:16.480 --> 0:00:19.960
<v Speaker 1>HowStuffWorks, and I love all things tech. And

0:00:19.960 --> 0:00:22.720
<v Speaker 1>in our last episode, I talked about how web analytics

0:00:22.800 --> 0:00:25.400
<v Speaker 1>work in general and why they are important both for

0:00:25.520 --> 0:00:30.240
<v Speaker 1>people visiting a website and owners of websites and the

0:00:30.400 --> 0:00:35.159
<v Speaker 1>advertisers who support websites and the companies that advertise through

0:00:35.200 --> 0:00:39.600
<v Speaker 1>these advertisers. They really help website designers also get a

0:00:39.600 --> 0:00:43.159
<v Speaker 1>better understanding of how their users navigate and consume stuff

0:00:43.159 --> 0:00:46.760
<v Speaker 1>on their sites, and allow the web administrators to tweak

0:00:46.800 --> 0:00:49.760
<v Speaker 1>things to make the experience better. So it's not just

0:00:50.040 --> 0:00:53.360
<v Speaker 1>about advertising. It's also about how can I make this

0:00:53.479 --> 0:00:58.520
<v Speaker 1>website easier to navigate, more intuitive, more interesting, more exciting

0:00:58.560 --> 0:01:01.400
<v Speaker 1>to use, or more useful or whatever the purpose of

0:01:01.440 --> 0:01:04.600
<v Speaker 1>the website is. That benefits the visitor, makes the experience

0:01:04.640 --> 0:01:08.720
<v Speaker 1>a more satisfying one, and it helps the website administrator also

0:01:08.840 --> 0:01:11.560
<v Speaker 1>monetize through web advertising. But now let's get to the

0:01:11.600 --> 0:01:15.959
<v Speaker 1>other side of the coin. Tracking information obviously brings with

0:01:16.000 --> 0:01:20.920
<v Speaker 1>it some very nasty potential problems like threats to privacy

0:01:20.959 --> 0:01:27.119
<v Speaker 1>and security. Information is incredibly valuable. It is the currency

0:01:27.160 --> 0:01:30.040
<v Speaker 1>of the Internet. You might have thought it was Bitcoin. It's not.

0:01:30.760 --> 0:01:35.199
<v Speaker 1>Data is your currency. And generally speaking, the more data

0:01:35.240 --> 0:01:39.600
<v Speaker 1>a company can get about people who are using the web,

0:01:40.040 --> 0:01:43.200
<v Speaker 1>the better it is for that company, not necessarily better

0:01:43.280 --> 0:01:47.360
<v Speaker 1>for the people, but better for that company. Knowing information

0:01:47.360 --> 0:01:50.960
<v Speaker 1>about a person means being able to sell to that

0:01:51.040 --> 0:01:54.280
<v Speaker 1>person more effectively, or it might mean being able to

0:01:54.360 --> 0:01:59.480
<v Speaker 1>exploit that person in less legal or ethical ways, and

0:01:59.560 --> 0:02:02.040
<v Speaker 1>so the data gathered about users can become a

0:02:02.040 --> 0:02:05.560
<v Speaker 1>tool or a weapon, depending upon the type of information

0:02:05.600 --> 0:02:09.160
<v Speaker 1>gathered and the will of the person who has access

0:02:09.160 --> 0:02:14.520
<v Speaker 1>to that information. So ideally you don't have any bad

0:02:14.560 --> 0:02:18.360
<v Speaker 1>actors out there, and even if people are gathering a

0:02:18.360 --> 0:02:21.760
<v Speaker 1>lot of information about users, they're not trying to put

0:02:21.760 --> 0:02:24.680
<v Speaker 1>it to any malicious purpose. Before I dive into a

0:02:24.720 --> 0:02:28.119
<v Speaker 1>detailed account of web analytics and privacy, I should say

0:02:28.160 --> 0:02:30.840
<v Speaker 1>that not everyone is out to scrape every bit of

0:02:30.919 --> 0:02:33.959
<v Speaker 1>data off of users or to figure out the identity

0:02:34.000 --> 0:02:37.040
<v Speaker 1>of a specific user. Many analyses are more focused on

0:02:37.120 --> 0:02:41.800
<v Speaker 1>identifying emerging trends rather than singling out one specific user.

0:02:42.240 --> 0:02:46.160
<v Speaker 1>So the goal is not to look at that data

0:02:46.560 --> 0:02:50.160
<v Speaker 1>like a browser's history, like looking at the cookie information

0:02:50.200 --> 0:02:53.600
<v Speaker 1>and saying, oh, this person went from X website to

0:02:53.919 --> 0:02:56.880
<v Speaker 1>Y website to Z website and then come to the

0:02:56.919 --> 0:03:02.320
<v Speaker 1>conclusion that that must be Jonathan Strickland. Instead, more often

0:03:02.320 --> 0:03:06.120
<v Speaker 1>than not, these analytics companies are looking at aggregated data

0:03:07.160 --> 0:03:11.040
<v Speaker 1>that is, at least on the surface level, anonymous, and

0:03:11.120 --> 0:03:13.880
<v Speaker 1>the purpose is to see more valuable information, such as

0:03:15.120 --> 0:03:18.119
<v Speaker 1>rose gold is so totally in right now, so put

0:03:18.120 --> 0:03:21.079
<v Speaker 1>all your rose gold products on your main page because

0:03:21.120 --> 0:03:23.400
<v Speaker 1>people are gonna go nuts. Right? This really dates this

0:03:23.440 --> 0:03:28.480
<v Speaker 1>podcast, because I'm about two years out of touch,

0:03:29.080 --> 0:03:30.840
<v Speaker 1>so it tells you this one should have come out

0:03:30.840 --> 0:03:35.040
<v Speaker 1>two years ago. Anyway, this concept makes sense when you're

0:03:35.080 --> 0:03:38.440
<v Speaker 1>thinking of big sweeping strategies, like which products you want

0:03:38.440 --> 0:03:42.520
<v Speaker 1>to feature on an online store's homepage, or which news

0:03:42.560 --> 0:03:44.320
<v Speaker 1>stories are likely to be thought of as the most

0:03:44.360 --> 0:03:47.200
<v Speaker 1>important and relevant on any given day. So you might

0:03:47.240 --> 0:03:49.920
<v Speaker 1>look at something like Google Trends and say, oh, well,

0:03:50.080 --> 0:03:54.040
<v Speaker 1>a lot of people are searching this particular term. Let's

0:03:54.120 --> 0:03:57.640
<v Speaker 1>create an article about this thing. We can inform people,

0:03:57.680 --> 0:04:00.080
<v Speaker 1>we can make sure it's a really good article, but

0:04:00.120 --> 0:04:01.880
<v Speaker 1>we can also take advantage of the fact that people

0:04:01.880 --> 0:04:06.440
<v Speaker 1>are interested in this idea right now, so it's kind

0:04:06.440 --> 0:04:11.000
<v Speaker 1>of a mutually beneficial experience in the ideal. But it

0:04:11.040 --> 0:04:14.720
<v Speaker 1>would be silly to say that no one's interested in

0:04:14.800 --> 0:04:19.320
<v Speaker 1>your individual preferences, because that's not true. There are people

0:04:19.320 --> 0:04:22.280
<v Speaker 1>who are very interested in your individual preferences. For one thing,

0:04:22.760 --> 0:04:27.159
<v Speaker 1>it can help identify what different groups of people like,

0:04:27.640 --> 0:04:32.039
<v Speaker 1>so a company could present those different groups with distinct

0:04:32.120 --> 0:04:36.279
<v Speaker 1>experiences that were meant to appeal to that group. Right?

0:04:36.560 --> 0:04:40.160
<v Speaker 1>That's targeted marketing or targeted advertising. So let me give

0:04:40.160 --> 0:04:44.200
<v Speaker 1>an example. Let's say I run an online store, and

0:04:44.240 --> 0:04:47.640
<v Speaker 1>I've coded my home page in such a way that

0:04:47.760 --> 0:04:51.680
<v Speaker 1>it can dynamically display different products based off the information

0:04:51.720 --> 0:04:55.599
<v Speaker 1>I glean from analyzing a user's behaviors. And my site

0:04:55.680 --> 0:05:00.479
<v Speaker 1>uses cookies and JavaScript, and those analyze the user, and

0:05:00.640 --> 0:05:05.520
<v Speaker 1>it presents the most appropriate products for return visitors. So

0:05:05.560 --> 0:05:07.960
<v Speaker 1>when you pop into my store, I happen to know

0:05:08.360 --> 0:05:11.599
<v Speaker 1>that you recently searched for Star Wars toys because the

0:05:11.640 --> 0:05:15.120
<v Speaker 1>cookie information that I've installed on your browser from your

0:05:15.120 --> 0:05:19.080
<v Speaker 1>previous visit has told me this. And so I have

0:05:19.240 --> 0:05:22.560
<v Speaker 1>some Star Wars-related products that I want to prominently

0:05:22.640 --> 0:05:25.360
<v Speaker 1>show to you on my homepage. Now when I say

0:05:25.400 --> 0:05:28.719
<v Speaker 1>I want to, all of this is done automatically. You've

0:05:28.800 --> 0:05:32.400
<v Speaker 1>got all this meta information, these tags that computers can

0:05:32.520 --> 0:05:37.440
<v Speaker 1>use to sort through and select to present what appears

0:05:37.440 --> 0:05:41.240
<v Speaker 1>to be the most appropriate products that will appeal to

0:05:41.600 --> 0:05:44.560
<v Speaker 1>the visitor. Now, let's say your buddy shows up and

0:05:44.600 --> 0:05:46.720
<v Speaker 1>your buddy is not as into Star Wars as you are.

0:05:46.880 --> 0:05:50.880
<v Speaker 1>Your buddy's like a big Cloverfield fan, and your

0:05:50.920 --> 0:05:53.760
<v Speaker 1>buddy visits my online store and sees a totally

0:05:53.839 --> 0:05:57.120
<v Speaker 1>different selection of products than you do when they pop on.

0:05:57.520 --> 0:05:59.880
<v Speaker 1>Maybe your buddy is visiting my store for the first time,

0:06:00.080 --> 0:06:02.880
<v Speaker 1>in which case I don't have any information

0:06:03.080 --> 0:06:05.720
<v Speaker 1>about him or her. I don't know anything about this

0:06:05.760 --> 0:06:08.080
<v Speaker 1>person because they've just come to my website for the

0:06:08.120 --> 0:06:12.039
<v Speaker 1>first time. Now, they come there and I decide to

0:06:12.040 --> 0:06:16.320
<v Speaker 1>pop a cookie on their web browser, so I'll know

0:06:16.400 --> 0:06:18.839
<v Speaker 1>the next time they come through. But this first time,

0:06:18.880 --> 0:06:22.760
<v Speaker 1>it's a blank slate. That means that my store is

0:06:22.760 --> 0:06:26.680
<v Speaker 1>probably gonna show them a pretty neutral selection of products.

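NOTE
A minimal sketch of the store example in this segment: a returning visitor
whose cookie records past interests sees matching tagged products, and a
first-time visitor with no cookie gets the neutral, popular selection.
The catalog, the tag names, and the cookie field "interests" are all
hypothetical; the episode does not describe a specific implementation.
```python
# Hypothetical catalog: product name mapped to interest tags (the "meta information").
CATALOG = {
    "Lightsaber toy": {"star_wars", "toys"},
    "Droid model kit": {"star_wars", "models"},
    "Monster movie poster": {"cloverfield", "posters"},
    "Bestselling headphones": {"popular"},
    "Top-rated backpack": {"popular"},
}
def parse_cookie(header: str) -> dict[str, str]:
    """Split a 'name=value; name=value' Cookie header into a dict."""
    pairs = (part.strip().split("=", 1) for part in header.split(";") if "=" in part)
    return {name: value for name, value in pairs}
def pick_products(cookie_header: str) -> list[str]:
    """Return tagged matches for a known visitor, or popular items otherwise."""
    raw = parse_cookie(cookie_header).get("interests", "")
    interests = {tag for tag in raw.split("|") if tag}
    matches = [name for name, tags in CATALOG.items() if tags & interests]
    # First-time visitor, or nothing matched: fall back to the neutral selection.
    return matches or [name for name, tags in CATALOG.items() if "popular" in tags]
print(pick_products("visitor_id=abc123; interests=star_wars|toys"))
print(pick_products(""))
```
The first call stands in for the returning Star Wars shopper, the second
for the buddy who arrives with no cookie set.
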
0:06:26.920 --> 0:06:29.239
<v Speaker 1>Maybe there will be some of the most popular products

0:06:29.240 --> 0:06:32.039
<v Speaker 1>that happen to appeal to a broad spectrum of people,

0:06:32.600 --> 0:06:35.839
<v Speaker 1>but they aren't targeted toward that specific person yet, because

0:06:35.839 --> 0:06:39.880
<v Speaker 1>I don't know what that person's preferences are. But as

0:06:39.920 --> 0:06:44.120
<v Speaker 1>your friend navigates through my site, I'm collecting more and

0:06:44.160 --> 0:06:47.839
<v Speaker 1>more information about what they like based upon their behaviors,

0:06:48.600 --> 0:06:51.240
<v Speaker 1>and then I can make sure the next time they

0:06:51.240 --> 0:06:54.200
<v Speaker 1>come to my website that it serves up a more

0:06:54.279 --> 0:06:59.680
<v Speaker 1>appropriate landing page for them based upon their preferences. Again,

0:07:00.000 --> 0:07:04.000
<v Speaker 1>when I say I decide, this is all automatic. Let's

0:07:04.000 --> 0:07:06.960
<v Speaker 1>go a step further. Let's say that you are running

0:07:06.960 --> 0:07:10.640
<v Speaker 1>a blog that has online advertising on it. So you've

0:07:10.640 --> 0:07:14.080
<v Speaker 1>got spaces on your blog that are reserved for advertising,

0:07:14.760 --> 0:07:19.120
<v Speaker 1>and the ads themselves are tracking users with cookies and JavaScript.

0:07:20.040 --> 0:07:23.680
<v Speaker 1>Most ads come from brokers who have numerous clients, right,

0:07:23.760 --> 0:07:25.880
<v Speaker 1>So let's say that you go to a blog and

0:07:25.920 --> 0:07:29.480
<v Speaker 1>you see an ad for a popular soft drink company.

0:07:30.360 --> 0:07:33.680
<v Speaker 1>That ad did not come directly from the soft drink company.

0:07:34.200 --> 0:07:37.280
<v Speaker 1>More likely than not, it came through an advertising company

0:07:37.360 --> 0:07:40.680
<v Speaker 1>that has that soft drink company as one of its clients.

0:07:40.720 --> 0:07:44.600
<v Speaker 1>So the brokers, these companies that have thousands of clients

0:07:44.640 --> 0:07:48.360
<v Speaker 1>representing all these different industries, can use this tracking information

0:07:48.600 --> 0:07:51.760
<v Speaker 1>in cookies and JavaScript to determine what stuff you're most

0:07:51.800 --> 0:07:55.480
<v Speaker 1>likely to respond to based upon your browsing history, so

0:07:56.640 --> 0:08:00.520
<v Speaker 1>that means the broker could potentially serve up ads based

0:08:00.560 --> 0:08:02.960
<v Speaker 1>on the information to help improve the chances that you'll

0:08:03.000 --> 0:08:06.800
<v Speaker 1>find any given ad more useful and click on it.

0:08:07.320 --> 0:08:12.280
<v Speaker 1>In these cases, the experiences are personalized, but that personalization

0:08:12.400 --> 0:08:18.000
<v Speaker 1>still is not dependent upon your identity per se. I mean,

0:08:18.000 --> 0:08:22.080
<v Speaker 1>it's based upon what you like and what your behaviors

0:08:22.120 --> 0:08:26.320
<v Speaker 1>have indicated you find valuable or interesting. But it's not

0:08:26.960 --> 0:08:31.240
<v Speaker 1>like that specific data is identifiable stuff like your name

0:08:31.480 --> 0:08:35.679
<v Speaker 1>or your address or anything like that. Although they can

0:08:35.720 --> 0:08:38.800
<v Speaker 1>at least get an approximation of your address based upon

0:08:39.640 --> 0:08:42.920
<v Speaker 1>your IP address, so they could at least

0:08:42.920 --> 0:08:47.360
<v Speaker 1>know generally where you are, maybe more specifically if

0:08:47.400 --> 0:08:49.960
<v Speaker 1>you happen to use a mobile device and you have

0:08:50.480 --> 0:08:53.960
<v Speaker 1>location tracking on, or as it turns out, you don't

0:08:54.000 --> 0:08:56.559
<v Speaker 1>necessarily have to have location tracking turned on. There was

0:08:56.559 --> 0:09:01.040
<v Speaker 1>a recent story from AP that looked into this

0:09:01.200 --> 0:09:04.920
<v Speaker 1>and said that Google Android devices would check in with

0:09:05.000 --> 0:09:10.840
<v Speaker 1>Google an average of fourteen times an hour, giving information

0:09:11.240 --> 0:09:15.040
<v Speaker 1>about location even with location services turned off. So that's

0:09:15.040 --> 0:09:19.319
<v Speaker 1>a kind of tracking information that definitely rubs people the

0:09:19.360 --> 0:09:24.000
<v Speaker 1>wrong way. Very valuable information if Google wants to serve

0:09:24.080 --> 0:09:26.480
<v Speaker 1>up ads to you that are based on your

0:09:26.720 --> 0:09:30.839
<v Speaker 1>locale, but not very comforting if you're thinking about

0:09:30.960 --> 0:09:33.440
<v Speaker 1>I'm just carrying my phone around. I don't need my

0:09:33.520 --> 0:09:38.000
<v Speaker 1>phone telling Google everywhere I'm going throughout the day. Now,

0:09:38.000 --> 0:09:40.640
<v Speaker 1>there are instances where a company, an agency, or a

0:09:40.679 --> 0:09:45.240
<v Speaker 1>government might want to identify someone based upon their browsing behavior.

0:09:45.840 --> 0:09:48.360
<v Speaker 1>For example, let's say that there's a crime that's been

0:09:48.400 --> 0:09:51.600
<v Speaker 1>committed and law enforcement has come into possession of a

0:09:51.640 --> 0:09:56.000
<v Speaker 1>computer that they believe belonged to the perpetrator of that crime,

0:09:56.600 --> 0:09:59.520
<v Speaker 1>but they still don't know who that perpetrator is. They've

0:09:59.520 --> 0:10:01.840
<v Speaker 1>got the computer, his or her computer, but

0:10:01.880 --> 0:10:04.400
<v Speaker 1>they don't know who that person is yet, and there's

0:10:04.400 --> 0:10:09.199
<v Speaker 1>no overtly identifiable information on the computer's hard drive, no fingerprints,

0:10:09.240 --> 0:10:11.360
<v Speaker 1>that kind of thing. Would it be possible for an

0:10:11.400 --> 0:10:14.240
<v Speaker 1>investigator or an analyst to be able to figure out

0:10:14.240 --> 0:10:18.520
<v Speaker 1>the identity of the computer's owner just through that person's

0:10:18.720 --> 0:10:23.080
<v Speaker 1>browsing history? If you looked at the information of what

0:10:23.200 --> 0:10:25.840
<v Speaker 1>websites they went to, would you be able to figure

0:10:25.880 --> 0:10:30.360
<v Speaker 1>out who it was that owned that computer? Well, setting

0:10:30.400 --> 0:10:34.000
<v Speaker 1>aside the possibility that the perpetrator had remained signed into

0:10:34.000 --> 0:10:37.839
<v Speaker 1>any services that would link back to his or her identity,

0:10:38.000 --> 0:10:40.160
<v Speaker 1>the task would require the analyst to look at the

0:10:40.200 --> 0:10:43.080
<v Speaker 1>patterns of behaviors in the browser history to figure out

0:10:43.240 --> 0:10:47.120
<v Speaker 1>what the person at that computer's keyboard had been doing.

0:10:47.559 --> 0:10:49.719
<v Speaker 1>It's kind of scary to think about this, but this

0:10:49.760 --> 0:10:52.000
<v Speaker 1>is totally possible to do. It's built upon the same

0:10:52.000 --> 0:10:55.360
<v Speaker 1>principles that were used to support e-commerce. Back in

0:10:55.400 --> 0:10:59.120
<v Speaker 1>two thousand six, there were some Russian analysts who proposed a

0:10:59.200 --> 0:11:03.840
<v Speaker 1>method of user profiling that would create profiles of users

0:11:03.840 --> 0:11:08.600
<v Speaker 1>based on their browser history. So you would get shoveled

0:11:08.640 --> 0:11:13.280
<v Speaker 1>into progressively smaller groups based on your behavior. So you know,

0:11:14.240 --> 0:11:18.000
<v Speaker 1>initial analysis might put you in one of several broad categories,

0:11:18.600 --> 0:11:26.079
<v Speaker 1>but the more specific behaviors you exhibit, the more specific

0:11:26.120 --> 0:11:28.840
<v Speaker 1>the groups could be that you would be sorted into,

0:11:29.160 --> 0:11:32.280
<v Speaker 1>and that would represent profiles as word vectors. That's a

0:11:32.440 --> 0:11:36.240
<v Speaker 1>method to assign context to words that ties into natural

0:11:36.280 --> 0:11:39.240
<v Speaker 1>language processing. I did a couple episodes on those a

0:11:39.280 --> 0:11:42.960
<v Speaker 1>little while back. The researchers use those word vectors to

0:11:42.960 --> 0:11:47.000
<v Speaker 1>create clusters of topics in a hierarchy to determine, or

0:11:47.360 --> 0:11:52.000
<v Speaker 1>determined by, rather, user behavior. And the stuff that users valued

0:11:52.040 --> 0:11:55.480
<v Speaker 1>more, as demonstrated in their behavior by following links or

0:11:55.520 --> 0:11:58.000
<v Speaker 1>staying on certain pages for a longer time, or making

0:11:58.040 --> 0:12:01.080
<v Speaker 1>searches, would occupy a higher place in that hierarchy, and

0:12:01.120 --> 0:12:04.760
<v Speaker 1>that was one way of identifying users, at least by interest.

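NOTE
A rough sketch of the ranking idea only: topics a user shows more
interest in (following links, searching, staying on a page) rise toward
the top. It does not reproduce the word vectors or the clustering the
researchers used. The signal names and weights are assumptions made up
for illustration, and the sample history is invented.
```python
from collections import defaultdict
# Assumed weight for each behavior signal; the episode gives no actual numbers.
SIGNAL_WEIGHTS = {"link_followed": 1.0, "search": 2.0, "dwell_minutes": 0.5}
def rank_interests(events: list[tuple[str, str, float]]) -> list[tuple[str, float]]:
    """Rank topics by weighted behavior. Each event is (topic, signal, amount)."""
    scores: dict[str, float] = defaultdict(float)
    for topic, signal, amount in events:
        scores[topic] += SIGNAL_WEIGHTS[signal] * amount
    # Highest score first: the topics the user appears to value most.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
history = [
    ("science fiction", "search", 3),
    ("science fiction", "dwell_minutes", 10),
    ("cooking", "link_followed", 2),
    ("cooking", "dwell_minutes", 4),
    ("sports", "link_followed", 1),
]
for topic, score in rank_interests(history):
    print(f"{topic}: {score}")
```
Sorting by a single score is the simplest stand-in for "a higher place
in that hierarchy"; a real profile would nest subtopics under topics.
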
0:12:04.800 --> 0:12:07.000
<v Speaker 1>Now again that didn't assign a name yet, but that

0:12:07.080 --> 0:12:09.600
<v Speaker 1>was a building block towards this. There's a two thousand

0:12:09.679 --> 0:12:12.319
<v Speaker 1>seven paper I read that described a different approach that

0:12:12.360 --> 0:12:15.800
<v Speaker 1>could predict a user's gender and age based on his

0:12:16.000 --> 0:12:19.680
<v Speaker 1>or her web browsing behavior. The researchers created a model

0:12:19.840 --> 0:12:23.840
<v Speaker 1>that relied on users reporting their age and their gender,

0:12:24.160 --> 0:12:27.400
<v Speaker 1>so it's a self-reporting kind of thing, and they

0:12:27.440 --> 0:12:30.320
<v Speaker 1>would also give up access to their browsing history to

0:12:30.440 --> 0:12:34.640
<v Speaker 1>this model, and the model would learn to

0:12:34.720 --> 0:12:38.080
<v Speaker 1>associate certain behaviors with respect to age and gender and

0:12:38.160 --> 0:12:42.520
<v Speaker 1>draw general conclusions based on that. And once it learned

0:12:42.720 --> 0:12:45.840
<v Speaker 1>through this training process, it could then analyze an unknown

0:12:46.080 --> 0:12:50.000
<v Speaker 1>user's browser history and then predict that person's gender and age.

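NOTE
A generic train-then-predict sketch of the idea described here: learn
from users who self-report a label and share their history, then guess
the label for an unknown history. This is a small naive Bayes
classifier chosen for illustration, not the method from the two
thousand seven paper, and the labels and site categories are invented.
```python
import math
from collections import Counter, defaultdict
def train(labeled_histories: list[tuple[str, list[str]]]) -> dict:
    """Count site-category visits per self-reported label."""
    visits: dict[str, Counter] = defaultdict(Counter)
    users: Counter = Counter()
    for label, categories in labeled_histories:
        users[label] += 1
        visits[label].update(categories)
    vocabulary = {c for counts in visits.values() for c in counts}
    return {"visits": visits, "users": users, "vocabulary": vocabulary}
def predict(model: dict, categories: list[str]) -> str:
    """Return the label whose training histories best explain the visits."""
    total_users = sum(model["users"].values())
    vocab_size = len(model["vocabulary"])
    def log_score(label: str) -> float:
        counts = model["visits"][label]
        total = sum(counts.values())
        score = math.log(model["users"][label] / total_users)
        for category in categories:
            # Add-one smoothing keeps an unseen category from ruling a label out.
            score += math.log((counts[category] + 1) / (total + vocab_size))
        return score
    return max(model["users"], key=log_score)
# Invented training data: (self-reported age bracket, categories browsed).
training = [
    ("under_30", ["gaming", "streaming", "social"]),
    ("under_30", ["gaming", "social", "music"]),
    ("over_50", ["news", "gardening", "finance"]),
    ("over_50", ["news", "finance", "travel"]),
]
model = train(training)
print(predict(model, ["gaming", "music"]))
print(predict(model, ["news", "finance"]))
```
With four training users this is only a toy; how accurate any real
model of this kind is depends entirely on the data behind it.
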
0:12:50.720 --> 0:12:52.840
<v Speaker 1>I don't know how accurate it was. I came across

0:12:52.960 --> 0:12:56.240
<v Speaker 1>this information while reading a totally different but related paper,

0:12:56.440 --> 0:12:58.840
<v Speaker 1>didn't have time to track down the two thousand seven document.

0:12:59.760 --> 0:13:04.000
<v Speaker 1>But this does lead to the way law enforcement might

0:13:04.240 --> 0:13:09.240
<v Speaker 1>use user profiling to identify someone based on their browser behavior.

0:13:09.720 --> 0:13:11.840
<v Speaker 1>I'll explain more in just a second, but first let's

0:13:11.880 --> 0:13:22.480
<v Speaker 1>take a quick break to thank our sponsor. Before the break,

0:13:22.520 --> 0:13:25.800
<v Speaker 1>I mentioned a paper. That related paper was specifically about

0:13:26.520 --> 0:13:29.760
<v Speaker 1>identifying a suspect based on their web behavior and it

0:13:29.840 --> 0:13:34.880
<v Speaker 1>has the title Web User Profiling Based on Browsing Behavior Analysis.

0:13:35.400 --> 0:13:38.000
<v Speaker 1>And in that paper, the researchers describe a method in

0:13:38.000 --> 0:13:41.440
<v Speaker 1>which a computer believed to belong to a suspect is

0:13:41.480 --> 0:13:46.800
<v Speaker 1>compared to other computers that have known users. So law

0:13:46.880 --> 0:13:49.280
<v Speaker 1>enforcement gets hold of a computer, they know that this

0:13:49.320 --> 0:13:52.240
<v Speaker 1>computer was used by the perpetrator of a crime. They

0:13:52.280 --> 0:13:54.640
<v Speaker 1>don't have an identity yet, but they do have some suspects.

0:13:55.360 --> 0:13:58.040
<v Speaker 1>They don't know if any of the suspects actually were

0:13:58.080 --> 0:14:01.960
<v Speaker 1>the perpetrator. So the goal is to compare this target computer,

0:14:02.280 --> 0:14:06.880
<v Speaker 1>the one that was involved with the actual perpetrator, with

0:14:07.160 --> 0:14:11.559
<v Speaker 1>candidate computers, the ones that suspects are using. Factors

0:14:11.600 --> 0:14:14.800
<v Speaker 1>such as the specific sites that were visited, the time

0:14:14.960 --> 0:14:19.560
<v Speaker 1>spent on every site, the order that the user would

0:14:19.560 --> 0:14:22.480
<v Speaker 1>browse the sites, all of these things are taken into consideration,

0:14:23.120 --> 0:14:24.760
<v Speaker 1>and at the heart of the matter is the idea

0:14:24.800 --> 0:14:27.560
<v Speaker 1>that we humans tend to be creatures of habit. So

0:14:27.680 --> 0:14:30.720
<v Speaker 1>here's how it would work. Investigators take that target computer

0:14:31.720 --> 0:14:34.920
<v Speaker 1>and they perform a data extraction on the computer. They

0:14:34.920 --> 0:14:37.760
<v Speaker 1>pull all the information they can off of it to

0:14:37.800 --> 0:14:40.240
<v Speaker 1>get a lead on the identity, and that includes the browser

0:14:40.320 --> 0:14:44.600
<v Speaker 1>history and browser behaviors, and they analyze this. They have

0:14:44.720 --> 0:14:47.920
<v Speaker 1>identified some suspects and those suspects may be using other

0:14:47.960 --> 0:14:52.200
<v Speaker 1>computers to access online services, and those are the candidate computers.

0:14:52.200 --> 0:14:56.840
<v Speaker 1>So law enforcement gets possession of those candidate computers, presumably

0:14:57.000 --> 0:15:00.160
<v Speaker 1>through a warrant, and they do the same

0:15:00.200 --> 0:15:02.040
<v Speaker 1>sort of thing. They do a data extraction on each

0:15:02.080 --> 0:15:07.360
<v Speaker 1>of those computers. Then they process all that information and

0:15:07.400 --> 0:15:11.600
<v Speaker 1>they analyze it, and investigators determine which factors are domains

0:15:11.600 --> 0:15:16.120
<v Speaker 1>of interest, like, what are the things in the

0:15:16.160 --> 0:15:21.520
<v Speaker 1>target computer that could potentially be identifiers for somebody,

0:15:22.200 --> 0:15:25.240
<v Speaker 1>and they break this down into a vector representation. They

0:15:25.320 --> 0:15:27.960
<v Speaker 1>weight each of the factors to assign each one a

0:15:28.040 --> 0:15:32.520
<v Speaker 1>relative importance. So, for example, a weighting might represent that the

0:15:32.560 --> 0:15:36.080
<v Speaker 1>activity on the target computer showed the perpetrator repeatedly visited

0:15:36.120 --> 0:15:40.040
<v Speaker 1>the same five websites, so those websites would be weighted

0:15:40.480 --> 0:15:44.040
<v Speaker 1>heavier than others because the perpetrator had gone to them

0:15:44.120 --> 0:15:48.640
<v Speaker 1>multiple times. And within those five websites, each

0:15:48.680 --> 0:15:51.560
<v Speaker 1>of those websites might have their own weighting that is

0:15:51.560 --> 0:15:55.160
<v Speaker 1>based upon the amount of time spent on those sites

0:15:55.680 --> 0:15:58.200
<v Speaker 1>and the number of times that the perpetrator had logged

0:15:58.200 --> 0:16:01.640
<v Speaker 1>into them that are recorded in that browser history. These

0:16:01.720 --> 0:16:06.560
<v Speaker 1>indicate trends and behaviors. Then you would compare that with

0:16:06.920 --> 0:16:09.520
<v Speaker 1>the information you found from the candidate computers, and if

0:16:09.560 --> 0:16:13.600
<v Speaker 1>you found one that demonstrated a similar browsing behavior as

0:16:13.640 --> 0:16:16.040
<v Speaker 1>the one that was on the target computer, you can

0:16:16.080 --> 0:16:19.280
<v Speaker 1>make an argument that the respective suspect may well be

0:16:19.400 --> 0:16:22.480
<v Speaker 1>your criminal. Then you can consider them a lead. It's

0:16:22.480 --> 0:16:27.320
<v Speaker 1>not exactly a smoking gun, but it certainly says this

0:16:27.400 --> 0:16:32.840
<v Speaker 1>person browses on the Internet exactly the same way as

0:16:32.880 --> 0:16:36.040
<v Speaker 1>the person who owned this computer, and we know the

0:16:36.040 --> 0:16:39.360
<v Speaker 1>person who owned this computer committed the crime, and it

0:16:39.400 --> 0:16:42.920
<v Speaker 1>can lead you into a more specific investigation. In two

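NOTE
A minimal sketch of the target-versus-candidate comparison described in
this segment: each computer's history becomes a weighted vector over
sites, and the candidate whose vector points the same way as the
target's is flagged as a lead. The weighting formula and the use of
cosine similarity are assumptions for illustration, not necessarily the
paper's method; the site names and numbers are invented.
```python
import math
def profile(visits: dict[str, tuple[int, float]]) -> dict[str, float]:
    """Turn {site: (visit_count, minutes_spent)} into a weighted vector."""
    # Assumed weighting: repeat visits and time on site both raise importance.
    return {site: count * math.log1p(minutes) for site, (count, minutes) in visits.items()}
def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors (1.0 means same direction)."""
    dot = sum(weight * b.get(site, 0.0) for site, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
target = profile({"forum.example": (40, 300.0), "news.example": (25, 90.0), "shop.example": (5, 10.0)})
candidates = {
    "suspect_a": profile({"forum.example": (35, 280.0), "news.example": (20, 100.0), "mail.example": (8, 30.0)}),
    "suspect_b": profile({"video.example": (50, 600.0), "shop.example": (12, 45.0)}),
}
# Most similar candidate first; a high score is a lead, not proof.
ranked = sorted(candidates, key=lambda name: cosine_similarity(target, candidates[name]), reverse=True)
for name in ranked:
    print(name, round(cosine_similarity(target, candidates[name]), 3))
```
The sketch ignores the order in which sites were browsed, which the
episode also lists as a factor.
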
0:16:42.960 --> 0:16:46.680
<v Speaker 1>thousand and seventeen, Gizmodo ran a piece titled Here's All

0:16:46.760 --> 0:16:49.680
<v Speaker 1>the Data Collected From You as You Browse the Web,

0:16:50.000 --> 0:16:52.760
<v Speaker 1>and it was written by David Nield, and I really

0:16:52.800 --> 0:16:56.880
<v Speaker 1>recommend checking out this article. Again, it's called Here's All

0:16:56.880 --> 0:16:59.360
<v Speaker 1>the Data Collected From You as You Browse the Web.

0:16:59.480 --> 0:17:01.760
<v Speaker 1>It's a great piece. I'm gonna kind of go over it

0:17:01.840 --> 0:17:04.560
<v Speaker 1>here a little bit. Nield points out the type of

0:17:04.640 --> 0:17:07.520
<v Speaker 1>data your computer can share with sites on the Internet,

0:17:08.600 --> 0:17:12.040
<v Speaker 1>and as he mentions, it can include all of the following.

0:17:12.480 --> 0:17:15.720
<v Speaker 1>Your IP address. Now that makes sense. The IP address

0:17:15.760 --> 0:17:21.280
<v Speaker 1>corresponds to your computer or your router.

0:17:21.440 --> 0:17:24.159
<v Speaker 1>It's necessary so that a site knows where to send

0:17:24.440 --> 0:17:28.320
<v Speaker 1>the data that you've requested. So if you visit a

0:17:28.359 --> 0:17:31.439
<v Speaker 1>website, you're technically sending a request to a

0:17:31.480 --> 0:17:34.359
<v Speaker 1>web server. The server has to know where to send

0:17:34.400 --> 0:17:39.040
<v Speaker 1>that site; otherwise, you'll never get anything back. But an

0:17:39.080 --> 0:17:42.600
<v Speaker 1>IP address can provide information that gives the site owners

0:17:42.640 --> 0:17:46.920
<v Speaker 1>a general idea of your location, not specifically where you are,

0:17:47.080 --> 0:17:50.119
<v Speaker 1>but generally where you are. Then there's the type of

0:17:50.240 --> 0:17:53.040
<v Speaker 1>system you're using, such as whether or not you're on

0:17:53.119 --> 0:17:55.679
<v Speaker 1>a phone or a tablet, or a computer or a

0:17:55.720 --> 0:18:00.480
<v Speaker 1>gaming console. This will also typically include

0:18:00.480 --> 0:18:04.920
<v Speaker 1>information like the operating system that you're using, the display

0:18:04.960 --> 0:18:08.760
<v Speaker 1>resolution on the device you have, what processors your machine

0:18:08.840 --> 0:18:12.440
<v Speaker 1>might have like CPU and GPU, and the specific types

0:18:13.160 --> 0:18:15.840
<v Speaker 1>like how many cores, how much processing power, that

0:18:15.880 --> 0:18:19.080
<v Speaker 1>kind of stuff. Which browser you might be using, what

0:18:19.280 --> 0:18:24.040
<v Speaker 1>plugins you have installed in that browser, your device's battery

0:18:24.160 --> 0:18:27.720
<v Speaker 1>charge could be part of the information. All of that

0:18:28.160 --> 0:18:31.280
<v Speaker 1>is part of the information that your machine is

0:18:31.359 --> 0:18:35.200
<v Speaker 1>handing over in this exchange. Nield also mentions a web

0:18:35.200 --> 0:18:37.400
<v Speaker 1>page that will let you know all the data your

0:18:37.400 --> 0:18:41.160
<v Speaker 1>browser sends to pages by default. That site is called

0:18:41.359 --> 0:18:45.920
<v Speaker 1>webkay dot robinlinus dot com, or Linus if

0:18:45.920 --> 0:18:49.439
<v Speaker 1>you prefer. It's W E B K A Y dot

0:18:50.080 --> 0:18:53.800
<v Speaker 1>R O B I N L I N U S dot com.

0:18:53.840 --> 0:18:55.359
<v Speaker 1>So I went ahead and checked it out just to

0:18:55.359 --> 0:18:57.560
<v Speaker 1>see what it would say about my connection here at work.

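NOTE
A small sketch of the part of this list a server can read from a plain
request with no script running: the sender's IP address and the
User-Agent header, which names the operating system and browser. The
other details mentioned, such as battery level or the graphics
processor, are typically gathered by JavaScript running in the page and
are not shown. The function name, the matching rules, and the IP
address are illustrative assumptions.
```python
def describe_request(client_ip: str, headers: dict[str, str]) -> dict[str, str]:
    """Summarize what a server can read from one request, before any script runs."""
    agent = headers.get("User-Agent", "")
    # Order matters: Android agents also mention Linux, and Chrome agents mention Safari.
    systems = ("Windows", "Android", "iPhone", "Macintosh", "Linux")
    browsers = (("Edg/", "Edge"), ("Firefox/", "Firefox"), ("Chrome/", "Chrome"), ("Safari/", "Safari"))
    system = next((name for name in systems if name in agent), "unknown")
    browser = next((label for token, label in browsers if token in agent), "unknown")
    return {"ip": client_ip, "system": system, "browser": browser}
# A User-Agent string in the usual format for Chrome on Windows 7.
example_agent = (
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36"
)
print(describe_request("203.0.113.7", {"User-Agent": example_agent}))
```
Substring matching like this is deliberately crude; real analytics
tools use far more thorough User-Agent parsing.
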
0:18:58.080 --> 0:19:01.760
<v Speaker 1>So it knew my work computer is running Windows 7. Yeah,

0:19:01.800 --> 0:19:05.280
<v Speaker 1>I know. It also knew that I was using Chrome

0:19:05.480 --> 0:19:09.480
<v Speaker 1>as my browser. It identified the GPU and the CPU

0:19:09.680 --> 0:19:13.960
<v Speaker 1>for my computer. It knew what resolution I had set

0:19:14.040 --> 0:19:16.320
<v Speaker 1>my screen to. It knew my laptop's battery was at a

0:19:17.119 --> 0:19:19.639
<v Speaker 1>charge because it was plugged into a docking station at

0:19:19.640 --> 0:19:23.639
<v Speaker 1>the time. It identified the ISP my office uses.

0:19:24.160 --> 0:19:27.679
<v Speaker 1>It identified the download speed I had available to me.

0:19:27.840 --> 0:19:30.680
<v Speaker 1>It estimated my location. It was off by a couple

0:19:30.720 --> 0:19:33.240
<v Speaker 1>of blocks, but it was in the general area. It

0:19:33.359 --> 0:19:37.560
<v Speaker 1>identified which social media accounts I was logged into at

0:19:37.600 --> 0:19:41.560
<v Speaker 1>that time. If it had been a mobile device, um,

0:19:41.600 --> 0:19:44.360
<v Speaker 1>it would have also told me about my device's orientation,

0:19:44.480 --> 0:19:46.600
<v Speaker 1>like whether it was in portrait or landscape mode, and

0:19:46.680 --> 0:19:50.199
<v Speaker 1>more information like that. And then Neild linked to another

0:19:50.240 --> 0:19:54.359
<v Speaker 1>site called Click. That one can monitor mouse movements and

0:19:54.680 --> 0:19:57.600
<v Speaker 1>mouse clicks and how active you are with a site.

0:19:57.640 --> 0:19:59.600
<v Speaker 1>I visited this one too, and it was kind of creepy.

0:19:59.600 --> 0:20:01.600
<v Speaker 1>It's designed in a way to actually reveal to

0:20:01.640 --> 0:20:06.040
<v Speaker 1>you how much information is being sent to a website.

0:20:06.040 --> 0:20:10.360
<v Speaker 1>So there's actually a voice that talks to you, prerecorded

0:20:10.400 --> 0:20:14.359
<v Speaker 1>stuff that's meant to be a little unsettling, and it

0:20:14.480 --> 0:20:17.240
<v Speaker 1>sends you information telling you, oh, you just moved the

0:20:17.280 --> 0:20:19.520
<v Speaker 1>mouse to the right, you just moved it to the left,

0:20:20.320 --> 0:20:22.760
<v Speaker 1>You've sat still for thirty seconds, You've been viewing this

0:20:22.800 --> 0:20:26.080
<v Speaker 1>page for a minute. So this is all information that

0:20:26.160 --> 0:20:28.600
<v Speaker 1>could be sent to a site. Like, they could actually

0:20:28.600 --> 0:20:32.920
<v Speaker 1>monitor where your mouse is moving across a web page,

0:20:33.280 --> 0:20:35.760
<v Speaker 1>which again gets a little creepy, right? Now, there are

0:20:35.880 --> 0:20:39.400
<v Speaker 1>legitimate uses for that kind of information. From a website

0:20:39.400 --> 0:20:41.440
<v Speaker 1>design perspective, it could tell you a lot about the

0:20:41.480 --> 0:20:45.200
<v Speaker 1>sort of things users find attractive or interesting about your website,

0:20:46.240 --> 0:20:50.679
<v Speaker 1>but there are also potential misuses. Legit analytics firms

0:20:50.680 --> 0:20:54.439
<v Speaker 1>won't use information to compromise users' privacy, but not everyone's legit.

0:20:54.920 --> 0:20:59.520
<v Speaker 1>Here's another example. Let's say that you are a

0:20:59.640 --> 0:21:02.520
<v Speaker 1>nefarious person. Actually, I'm not gonna say that. You're a

0:21:02.600 --> 0:21:05.399
<v Speaker 1>nice person, you're not nefarious. Let's say there is a

0:21:05.480 --> 0:21:09.639
<v Speaker 1>nefarious person out there, and this nefarious person has installed

0:21:09.720 --> 0:21:13.920
<v Speaker 1>some rogue JavaScript on a website, then has tricked people

0:21:13.920 --> 0:21:17.840
<v Speaker 1>into going to it, and is able to get certain

0:21:17.960 --> 0:21:24.160
<v Speaker 1>bits of information that appear to include compromising information about

0:21:24.200 --> 0:21:28.879
<v Speaker 1>the user, and they're able to contact the user to

0:21:29.080 --> 0:21:32.119
<v Speaker 1>send a message out to that user, perhaps to their email

0:21:32.160 --> 0:21:35.919
<v Speaker 1>address or something along those lines, and through this method

0:21:35.960 --> 0:21:39.240
<v Speaker 1>of contact, they are trying to blackmail the user, saying

0:21:39.720 --> 0:21:42.800
<v Speaker 1>I have dirt on you because I know that you've

0:21:42.880 --> 0:21:46.400
<v Speaker 1>visited such and such website. Maybe it's an adult content website,

0:21:46.440 --> 0:21:50.520
<v Speaker 1>maybe it's a website that's about a sensitive subject. And

0:21:50.560 --> 0:21:54.080
<v Speaker 1>they're able to tell this from the cookies or the JavaScript,

0:21:54.720 --> 0:21:58.440
<v Speaker 1>and so they're sending a message that's essentially saying, if

0:21:58.480 --> 0:22:01.159
<v Speaker 1>you don't cooperate with me, I'm going to reveal the

0:22:01.200 --> 0:22:03.280
<v Speaker 1>information I have about you. Now, it may not be

0:22:03.359 --> 0:22:06.880
<v Speaker 1>that they have any real information about you, anything that's

0:22:07.000 --> 0:22:13.920
<v Speaker 1>of any real damaging worth. But they're trading on people's

0:22:14.040 --> 0:22:18.960
<v Speaker 1>natural fears, and they know that even if not all

0:22:19.960 --> 0:22:22.760
<v Speaker 1>of their attacks are going to be successful, at least

0:22:22.840 --> 0:22:24.560
<v Speaker 1>enough of them will be for it to be worthwhile.

0:22:25.280 --> 0:22:28.199
<v Speaker 1>So that's one way someone might make nefarious use of

0:22:28.200 --> 0:22:30.840
<v Speaker 1>this kind of data. I'll talk a little bit about

0:22:31.000 --> 0:22:38.119
<v Speaker 1>some ways that governments and companies and individuals have tried

0:22:38.160 --> 0:22:42.560
<v Speaker 1>to protect themselves and others from this kind of abuse

0:22:42.720 --> 0:22:44.880
<v Speaker 1>in just a second, but first let's take another quick

0:22:44.880 --> 0:22:55.320
<v Speaker 1>break to thank our sponsor. Now, there are some laws

0:22:55.440 --> 0:22:59.119
<v Speaker 1>in place that help protect people from predatory use of

0:22:59.240 --> 0:23:03.200
<v Speaker 1>their data. In the United States, it gets a little loosey-goosey.

0:23:03.320 --> 0:23:07.040
<v Speaker 1>There are some state-level laws in some places, but obviously

0:23:07.080 --> 0:23:10.800
<v Speaker 1>those apply within a state, not across the entire country.

0:23:11.119 --> 0:23:13.440
<v Speaker 1>There are a few federal protections that are in place.

0:23:13.560 --> 0:23:16.879
<v Speaker 1>In Europe, the protections are way more extensive. The

0:23:17.040 --> 0:23:19.560
<v Speaker 1>GDPR regulation is an example of that, but it's

0:23:19.600 --> 0:23:23.360
<v Speaker 1>just one example of that. So in Europe people generally

0:23:23.480 --> 0:23:27.679
<v Speaker 1>enjoy a better level of protection as far as uh

0:23:28.359 --> 0:23:31.040
<v Speaker 1>their data security is concerned, and there are a lot

0:23:31.040 --> 0:23:33.480
<v Speaker 1>of analytics companies out there that have tried to address

0:23:33.520 --> 0:23:37.320
<v Speaker 1>these issues because they want to know. They want people

0:23:37.320 --> 0:23:39.560
<v Speaker 1>to know, Hey, what we do is valuable. What we

0:23:39.640 --> 0:23:43.000
<v Speaker 1>do actually is part of what makes the Internet work.

0:23:43.720 --> 0:23:47.639
<v Speaker 1>As long as we do it with accountability and we

0:23:47.680 --> 0:23:50.439
<v Speaker 1>do it with respect to your privacy, everything should be

0:23:50.480 --> 0:23:53.080
<v Speaker 1>fine and everyone should benefit. So one of the big

0:23:53.119 --> 0:23:56.919
<v Speaker 1>pushes in the industry is to be more transparent about

0:23:57.359 --> 0:24:01.560
<v Speaker 1>which data points these sites are collecting and to

0:24:01.720 --> 0:24:04.800
<v Speaker 1>what purpose, Like why are they collecting all this information?

0:24:05.720 --> 0:24:07.800
<v Speaker 1>And it can't just be transparent. It needs to be

0:24:07.840 --> 0:24:10.879
<v Speaker 1>worded in a way that makes sense. It's not buried

0:24:10.920 --> 0:24:14.920
<v Speaker 1>in jargon and legalese, because then most people

0:24:14.920 --> 0:24:17.000
<v Speaker 1>just skip over it and they don't get angry until

0:24:17.080 --> 0:24:21.160
<v Speaker 1>something goes wrong. So being able to explain in plain language, hey,

0:24:21.240 --> 0:24:26.000
<v Speaker 1>we are collecting these data points about people. This is

0:24:26.080 --> 0:24:29.080
<v Speaker 1>how we're using that data. Here's how you will benefit

0:24:29.160 --> 0:24:32.720
<v Speaker 1>from that use, and here's how we benefit from that use.

0:24:32.760 --> 0:24:37.439
<v Speaker 1>If it's completely transparent, everyone is much less likely to

0:24:37.480 --> 0:24:41.920
<v Speaker 1>get upset because they're less likely to misinterpret what is

0:24:41.960 --> 0:24:47.160
<v Speaker 1>happening or to make assumptions about the worst, right? So

0:24:47.200 --> 0:24:50.240
<v Speaker 1>tracking in itself might not be malicious. It's meant to

0:24:50.240 --> 0:24:54.280
<v Speaker 1>make things better for everybody, but it's also very easy

0:24:54.320 --> 0:24:58.600
<v Speaker 1>to misuse the information. And data is valuable, right? So

0:24:59.040 --> 0:25:02.080
<v Speaker 1>it has actual real value to it. That means bad

0:25:02.080 --> 0:25:04.679
<v Speaker 1>actors will go after it too. So what can you

0:25:04.800 --> 0:25:09.160
<v Speaker 1>do on a personal level to protect yourself. One thing

0:25:09.440 --> 0:25:12.880
<v Speaker 1>is that browsers have a Do Not Track setting that

0:25:12.920 --> 0:25:17.600
<v Speaker 1>you can enact. You can enable Do Not Track.

0:25:17.840 --> 0:25:21.359
<v Speaker 1>In theory, that protocol would mean that sites would agree

0:25:21.520 --> 0:25:24.520
<v Speaker 1>not to track you. Now, I say in theory because

0:25:24.520 --> 0:25:28.000
<v Speaker 1>there's nothing legally requiring sites to obey that protocol, so

0:25:28.040 --> 0:25:32.760
<v Speaker 1>they might track you anyway. The more reputable ones probably won't,

0:25:33.160 --> 0:25:36.960
<v Speaker 1>but other sites might not really give it any mind,

0:25:37.280 --> 0:25:41.960
<v Speaker 1>so it's not really the safest approach. You can try

0:25:41.960 --> 0:25:45.840
<v Speaker 1>to browse in private or incognito mode in a browser.

0:25:45.960 --> 0:25:49.679
<v Speaker 1>Lots of browsers allow for this, and usually what that

0:25:49.720 --> 0:25:52.800
<v Speaker 1>means is it will only load cookies for that current session,

0:25:53.040 --> 0:25:56.160
<v Speaker 1>so you're not gonna have cookies saved to the browser

0:25:56.200 --> 0:26:01.199
<v Speaker 1>in this way, so that reduces a site's ability to

0:26:01.240 --> 0:26:04.280
<v Speaker 1>track your information. Although the longer you stay on a

0:26:04.320 --> 0:26:07.359
<v Speaker 1>site and the more you click around, the more information

0:26:07.400 --> 0:26:11.640
<v Speaker 1>you are giving that site. Uh, Incognito mode really only

0:26:12.320 --> 0:26:15.280
<v Speaker 1>kind of erases traces of your activities on that

0:26:15.359 --> 0:26:19.480
<v Speaker 1>local device. So the computer you're using, the mobile device

0:26:19.480 --> 0:26:22.919
<v Speaker 1>you're using, whatever that may be. Incognito mode really just

0:26:23.040 --> 0:26:25.960
<v Speaker 1>keeps it from being, you know, your activities being left

0:26:26.080 --> 0:26:29.520
<v Speaker 1>on that device. Your Internet service provider will still see

0:26:29.520 --> 0:26:32.919
<v Speaker 1>where you're going, because it has to in order to

0:26:32.960 --> 0:26:35.160
<v Speaker 1>be able to send you the information that you're requesting

0:26:35.160 --> 0:26:37.760
<v Speaker 1>through the web browser. Um, you still have an IP

0:26:37.880 --> 0:26:40.640
<v Speaker 1>address that can still narrow down where you live or

0:26:40.640 --> 0:26:44.360
<v Speaker 1>where you're accessing the information from. If you log into

0:26:44.440 --> 0:26:46.880
<v Speaker 1>a service like Facebook or Twitter or something like that,

0:26:46.880 --> 0:26:50.280
<v Speaker 1>that's a dead giveaway. So this is of limited help.

0:26:50.800 --> 0:26:53.760
<v Speaker 1>Another thing you might do is install browser extensions that

0:26:53.840 --> 0:26:57.680
<v Speaker 1>limit active scripts from running on websites without your authorization.

0:26:58.440 --> 0:27:01.960
<v Speaker 1>So there are extensions like NoScript Security Suite, that's

0:27:02.000 --> 0:27:06.080
<v Speaker 1>for Firefox. Uh, there's ScriptSafe, that's for Chrome. These

0:27:06.080 --> 0:27:09.640
<v Speaker 1>are extensions that put the control in your hands. So

0:27:09.920 --> 0:27:11.680
<v Speaker 1>when you access a site that has one of these

0:27:11.680 --> 0:27:14.879
<v Speaker 1>sort of invisible trackers on it or whatever, it'll pop

0:27:15.000 --> 0:27:17.879
<v Speaker 1>up and alert you and you can choose to either

0:27:18.000 --> 0:27:20.639
<v Speaker 1>allow it or to prevent it from being able to

0:27:20.680 --> 0:27:24.280
<v Speaker 1>track you, uh, at least in the JavaScript approach. If

0:27:24.320 --> 0:27:28.520
<v Speaker 1>people are looking at their access logs, that's still gonna

0:27:28.520 --> 0:27:31.080
<v Speaker 1>show that you've visited the site, but it won't give

0:27:31.119 --> 0:27:34.520
<v Speaker 1>the kind of tiny amounts of data that JavaScript would.

0:27:34.800 --> 0:27:37.800
<v Speaker 1>Tiny as in focused; there's actually quite a lot

0:27:37.840 --> 0:27:41.640
<v Speaker 1>of data. The Electronic Frontier Foundation offers up an extension

0:27:41.680 --> 0:27:46.280
<v Speaker 1>for Firefox, Opera, and Android called Privacy Badger. This

0:27:46.320 --> 0:27:50.720
<v Speaker 1>add-on blocks trackers and spyware. Specifically, it, quote, stops

0:27:50.800 --> 0:27:54.359
<v Speaker 1>advertisers and other third-party trackers from secretly tracking where

0:27:54.400 --> 0:27:56.960
<v Speaker 1>you go and what pages you look at on the web.

0:27:57.560 --> 0:28:00.000
<v Speaker 1>If an advertiser seems to be tracking you across multiple

0:28:00.080 --> 0:28:04.560
<v Speaker 1>websites without your permission, Privacy Badger automatically blocks that advertiser

0:28:04.600 --> 0:28:07.719
<v Speaker 1>from loading any more content in your browser. To the advertiser,

0:28:07.720 --> 0:28:10.840
<v Speaker 1>it looks like you suddenly disappeared. End quote. So it

0:28:10.880 --> 0:28:15.280
<v Speaker 1>does this by identifying which content sources are registering your

0:28:15.280 --> 0:28:18.199
<v Speaker 1>presence on a web page, including the ads that are

0:28:18.280 --> 0:28:21.000
<v Speaker 1>loaded on that web page, and as you go from

0:28:21.000 --> 0:28:23.639
<v Speaker 1>one page to another, if it keeps picking up the

0:28:23.720 --> 0:28:26.720
<v Speaker 1>same sources, that's an indication that you're being tracked, and

0:28:26.760 --> 0:28:30.159
<v Speaker 1>those are the ones that, um, it will stop

0:28:30.200 --> 0:28:33.800
<v Speaker 1>loading into your web browser, and since it stops loading it,

0:28:34.200 --> 0:28:37.480
<v Speaker 1>the source can no longer get information about your activities,

0:28:37.480 --> 0:28:39.640
<v Speaker 1>and it's like you just disappeared into thin air. But

0:28:39.760 --> 0:28:43.080
<v Speaker 1>what about virtual private networks? I'm gonna have to

0:28:43.080 --> 0:28:45.920
<v Speaker 1>do a full episode about VPNs and why they exist

0:28:45.960 --> 0:28:49.240
<v Speaker 1>and why they're important and when you should use one.

0:28:49.960 --> 0:28:53.080
<v Speaker 1>I'll do one of those in the future, but generally,

0:28:53.560 --> 0:28:58.800
<v Speaker 1>in this context, they're mostly good for hiding your physical location. Uh,

0:28:58.840 --> 0:29:02.440
<v Speaker 1>the location will appear to correspond to that of

0:29:02.520 --> 0:29:05.720
<v Speaker 1>the virtual private network, not to you, not to your

0:29:05.800 --> 0:29:11.760
<v Speaker 1>real world location, because the web browser will be acting

0:29:11.840 --> 0:29:17.479
<v Speaker 1>like the VPN is the source of the traffic,

0:29:17.480 --> 0:29:21.200
<v Speaker 1>not your computer, and the VPN handles it from that

0:29:21.240 --> 0:29:24.240
<v Speaker 1>point to get it to you. So you would still

0:29:24.280 --> 0:29:26.280
<v Speaker 1>get cookies from sites. They'd still be able to track

0:29:26.320 --> 0:29:30.040
<v Speaker 1>your activities, but they would do it through the

0:29:30.080 --> 0:29:34.080
<v Speaker 1>context of the VPN. And since your behaviors

0:29:34.080 --> 0:29:36.360
<v Speaker 1>are filtering through the VPN instead of your normal

0:29:36.560 --> 0:29:39.160
<v Speaker 1>ISP, what you're really doing is trading one entity

0:29:39.240 --> 0:29:42.160
<v Speaker 1>for another. Instead of having the ISP be

0:29:42.280 --> 0:29:45.600
<v Speaker 1>the one monitoring all the stuff you're doing, the VPN

0:29:46.360 --> 0:29:49.360
<v Speaker 1>could technically monitor all the stuff you're doing, so I

0:29:49.400 --> 0:29:51.200
<v Speaker 1>guess then it just comes down to who do you

0:29:51.240 --> 0:29:54.760
<v Speaker 1>trust more, the VPN or the ISP? Um.

0:29:54.840 --> 0:29:57.440
<v Speaker 1>The answer is going to be very dependent upon which

0:29:57.480 --> 0:30:00.520
<v Speaker 1>of those entities you're making use of at any

0:30:00.560 --> 0:30:04.760
<v Speaker 1>given time. So one last little bit about the pros

0:30:04.800 --> 0:30:09.280
<v Speaker 1>and cons of tracking. Tracking is what makes online advertising work.

0:30:10.040 --> 0:30:13.560
<v Speaker 1>So it's somewhat infuriating because online tracking gives us a

0:30:13.600 --> 0:30:16.480
<v Speaker 1>really granular view of which ads work on which sites,

0:30:16.520 --> 0:30:20.120
<v Speaker 1>and which ones don't. We learn about how different form

0:30:20.200 --> 0:30:23.240
<v Speaker 1>factors can be more or less effective. You might find

0:30:23.240 --> 0:30:25.960
<v Speaker 1>out that Ad A tests really well on Site One,

0:30:26.600 --> 0:30:29.280
<v Speaker 1>but it fails miserably on Site Two. But Ad B,

0:30:30.240 --> 0:30:32.080
<v Speaker 1>which is for the exact same product as Ad A,

0:30:32.240 --> 0:30:34.760
<v Speaker 1>but it's a different design, that one works great on

0:30:34.800 --> 0:30:37.360
<v Speaker 1>Site Two. Or maybe you find out just by changing

0:30:37.360 --> 0:30:40.360
<v Speaker 1>where an ad displays on a page, it drives more engagement.

0:30:41.040 --> 0:30:43.800
<v Speaker 1>The reason this is important is because running a website

0:30:44.000 --> 0:30:47.240
<v Speaker 1>is not free. If it were, the world would be

0:30:47.240 --> 0:30:51.280
<v Speaker 1>a very different place. So companies like HowStuffWorks

0:30:51.320 --> 0:30:54.960
<v Speaker 1>dot com have costs associated with them, right, and those

0:30:54.960 --> 0:30:58.240
<v Speaker 1>are significant costs, not just like web hosting, but other

0:30:58.280 --> 0:31:02.320
<v Speaker 1>stuff like office space, like salaries, healthcare, lots and

0:31:02.400 --> 0:31:06.360
<v Speaker 1>lots of costs. So if there's no money coming in

0:31:06.560 --> 0:31:09.120
<v Speaker 1>to cover those costs, you won't stay in business. You

0:31:09.240 --> 0:31:14.160
<v Speaker 1>go into debt. Eventually you go into bankruptcy. Uh. So

0:31:14.320 --> 0:31:16.520
<v Speaker 1>you want to make money to pay off the costs,

0:31:16.560 --> 0:31:18.360
<v Speaker 1>and you really want to make enough to make a profit.

0:31:18.400 --> 0:31:20.040
<v Speaker 1>I mean, that's what a business is all about, is

0:31:20.040 --> 0:31:24.560
<v Speaker 1>making profits. So without profit, businesses don't really exist. And

0:31:24.600 --> 0:31:27.720
<v Speaker 1>then the content goes away. So unless we move to

0:31:28.000 --> 0:31:32.720
<v Speaker 1>a totally different model of the web, which would probably be

0:31:32.800 --> 0:31:34.720
<v Speaker 1>one where we have to pay for everything we want

0:31:34.760 --> 0:31:38.680
<v Speaker 1>to access, everything would be behind a paywall, it would

0:31:38.680 --> 0:31:42.800
<v Speaker 1>be really hard to continue to have web content. We

0:31:42.880 --> 0:31:46.840
<v Speaker 1>have to have some financial means to support the content

0:31:47.720 --> 0:31:50.000
<v Speaker 1>or else the content goes away. Same thing is true

0:31:50.040 --> 0:31:53.920
<v Speaker 1>for podcasts. I mean, the reason we have sponsors is

0:31:54.000 --> 0:31:59.480
<v Speaker 1>to, uh, to pay off the costs of producing these

0:31:59.480 --> 0:32:04.520
<v Speaker 1>shows and posting the shows and continue to develop shows

0:32:04.520 --> 0:32:09.560
<v Speaker 1>and make new shows. The ads support that, and hopefully

0:32:10.040 --> 0:32:14.800
<v Speaker 1>the ads that we are choosing to place with shows

0:32:15.440 --> 0:32:18.600
<v Speaker 1>are meaningful to our listeners, because if they're not, then

0:32:18.600 --> 0:32:22.360
<v Speaker 1>it's not really doing anyone any good. And ultimately, you

0:32:22.440 --> 0:32:29.440
<v Speaker 1>want the best possible relationship between content, advertising, and users.

0:32:29.520 --> 0:32:33.680
<v Speaker 1>You want something where everybody is happy with it, because otherwise,

0:32:33.680 --> 0:32:36.280
<v Speaker 1>what's the point. The same thing is true with the website,

0:32:37.280 --> 0:32:39.680
<v Speaker 1>so the tracking is very important to get that kind

0:32:39.680 --> 0:32:42.960
<v Speaker 1>of information. It's kind of funny to me because classic media,

0:32:43.040 --> 0:32:47.760
<v Speaker 1>your traditional media, things like television, magazines, newspapers, that kind

0:32:47.800 --> 0:32:51.960
<v Speaker 1>of stuff, everything that has advertising in it, Uh, it's

0:32:51.960 --> 0:32:55.160
<v Speaker 1>a lot harder to tell how well that advertising works,

0:32:56.320 --> 0:32:59.000
<v Speaker 1>how much impact that advertising has. With the exception of

0:32:59.040 --> 0:33:02.200
<v Speaker 1>stuff like the Super Bowl in the United States, where

0:33:02.200 --> 0:33:06.000
<v Speaker 1>people famously will tune in just to watch commercials, you

0:33:06.000 --> 0:33:09.560
<v Speaker 1>really don't know how much attention is being directed toward commercials.

0:33:09.640 --> 0:33:12.719
<v Speaker 1>You might be able to get some general ratings about

0:33:12.920 --> 0:33:16.160
<v Speaker 1>how well a certain television show has done, but that

0:33:16.200 --> 0:33:20.880
<v Speaker 1>doesn't really tell you anything about the ads themselves. So

0:33:22.720 --> 0:33:27.440
<v Speaker 1>it's funny to me that the traditional media, the advertising world,

0:33:27.480 --> 0:33:30.960
<v Speaker 1>is very comfortable in that space, and in the online

0:33:30.960 --> 0:33:33.600
<v Speaker 1>space where we can actually see how well an ad

0:33:33.640 --> 0:33:36.760
<v Speaker 1>does because we can see how many people click on it,

0:33:36.840 --> 0:33:41.000
<v Speaker 1>how many people actually went through and said this is interesting,

0:33:41.120 --> 0:33:43.000
<v Speaker 1>I want to know more, I want to be able

0:33:43.000 --> 0:33:46.040
<v Speaker 1>to buy this. We can actually see how effective that is,

0:33:46.120 --> 0:33:52.160
<v Speaker 1>and somehow that makes it less valuable, uh in some cases,

0:33:52.200 --> 0:33:55.920
<v Speaker 1>like the CPMs that are demanded in direct mail,

0:33:56.400 --> 0:33:59.960
<v Speaker 1>like sending stuff out in magazines and things, that's way

0:34:00.200 --> 0:34:05.680
<v Speaker 1>higher than what you typically see for most online advertising. Um,

0:34:05.720 --> 0:34:08.800
<v Speaker 1>one of those things where a little knowledge can be dangerous,

0:34:08.840 --> 0:34:13.799
<v Speaker 1>I guess. Very fascinating topic. And while you can go

0:34:13.880 --> 0:34:18.280
<v Speaker 1>through and do those extensions and use VPNs and things

0:34:18.320 --> 0:34:21.920
<v Speaker 1>and turn off a lot of the the elements that

0:34:22.040 --> 0:34:25.879
<v Speaker 1>will allow sites to track you, if you do that,

0:34:26.360 --> 0:34:31.000
<v Speaker 1>you also lose the benefits that tracking gives

0:34:31.200 --> 0:34:34.680
<v Speaker 1>to users. That might be a worthy trade off for

0:34:34.760 --> 0:34:37.800
<v Speaker 1>you if you really value your privacy and you don't

0:34:37.840 --> 0:34:41.560
<v Speaker 1>want sites to get access to that kind of information.

0:34:42.520 --> 0:34:46.040
<v Speaker 1>But, um, you know, it's, it's just kind

0:34:46.040 --> 0:34:49.120
<v Speaker 1>of the way our online world works, and without some

0:34:49.200 --> 0:34:53.479
<v Speaker 1>sort of transformative change, I don't see that being any

0:34:53.520 --> 0:34:57.520
<v Speaker 1>different anytime soon. But it is an interesting subject. If

0:34:57.520 --> 0:35:00.880
<v Speaker 1>you guys have any ideas for future episodes, any

0:35:00.920 --> 0:35:02.719
<v Speaker 1>sort of topic you want me to cover, whether it's

0:35:02.760 --> 0:35:06.160
<v Speaker 1>a technology, a company, a person in tech. Maybe there's

0:35:06.160 --> 0:35:08.600
<v Speaker 1>someone I should interview or have on as a guest host.

0:35:09.160 --> 0:35:11.839
<v Speaker 1>Send me a message. The email address is techstuff at

0:35:11.880 --> 0:35:14.479
<v Speaker 1>howstuffworks dot com, or drop me a line

0:35:14.480 --> 0:35:16.359
<v Speaker 1>on Facebook or Twitter. The handle at both of those

0:35:16.480 --> 0:35:19.640
<v Speaker 1>is TechStuffHSW. Don't forget, head on over

0:35:19.680 --> 0:35:23.279
<v Speaker 1>to TeePublic dot com slash techstuff. That's T

0:35:23.560 --> 0:35:26.800
<v Speaker 1>E E Public dot com slash techstuff to get

0:35:26.840 --> 0:35:30.719
<v Speaker 1>all your tech stuff merchandise needs. You know, maybe maybe

0:35:30.760 --> 0:35:34.160
<v Speaker 1>you're sitting there thinking, I have a cup of hot

0:35:34.200 --> 0:35:37.359
<v Speaker 1>coffee sitting here, but I have no mug to put

0:35:37.400 --> 0:35:40.680
<v Speaker 1>it in. Get yourself a tech stuff mug. They're pretty awesome.

0:35:40.719 --> 0:35:44.360
<v Speaker 1>I've got two of them myself, And don't forget to

0:35:44.400 --> 0:35:49.120
<v Speaker 1>follow us on Instagram. I'll talk to you again really

0:35:49.200 --> 0:35:57.760
<v Speaker 1>soon. For more on this and thousands of other topics,

0:35:57.800 --> 0:36:04.759
<v Speaker 1>visit howstuffworks dot com.