1 00:00:01,920 --> 00:00:06,520 Speaker 1: Welcome to BrainStuff, a production of iHeartRadio. Hey 2 00:00:06,559 --> 00:00:10,160 Speaker 1: BrainStuff, Lauren Vogelbaum here. If a tree falls 3 00:00:10,160 --> 00:00:13,280 Speaker 1: in a forest, does it really make a sound? And if 4 00:00:13,280 --> 00:00:17,040 Speaker 1: a website changes overnight, did its previous homepage ever really 5 00:00:17,040 --> 00:00:20,160 Speaker 1: exist in the first place? Because so much of our 6 00:00:20,160 --> 00:00:23,720 Speaker 1: world is increasingly digital and ephemeral, it's not just a 7 00:00:23,720 --> 00:00:28,040 Speaker 1: philosophical question, it's also a simple matter of history. That's 8 00:00:28,080 --> 00:00:30,680 Speaker 1: why the Wayback Machine, which features snapshots of 9 00:00:30,680 --> 00:00:33,520 Speaker 1: websites as they age and change, is such a fascinating 10 00:00:33,560 --> 00:00:36,400 Speaker 1: glimpse into the dusty corners of the web. The Wayback 11 00:00:36,440 --> 00:00:39,360 Speaker 1: Machine is a massive digital archive meant to preserve 12 00:00:39,400 --> 00:00:42,400 Speaker 1: web pages that would otherwise be permanently lost to time. 13 00:00:43,240 --> 00:00:45,519 Speaker 1: Without this hoard of data, every time a page was 14 00:00:45,600 --> 00:00:48,280 Speaker 1: updated or deleted, it would simply vanish, as if it 15 00:00:48,280 --> 00:00:51,839 Speaker 1: had never been there. Mark Graham, the director of the 16 00:00:51,840 --> 00:00:56,080 Speaker 1: Wayback Machine, noted in an Entrepreneur article that the average 17 00:00:56,080 --> 00:00:58,720 Speaker 1: life expectancy of a web page is about a hundred days. 18 00:00:59,240 --> 00:01:02,240 Speaker 1: There are a multitude of reasons why these web pages disappear.
19 00:01:02,760 --> 00:01:05,520 Speaker 1: Site creators move on to other projects, web hosting 20 00:01:05,600 --> 00:01:09,120 Speaker 1: companies go bankrupt, or maybe the page is moved or replaced 21 00:01:09,120 --> 00:01:12,600 Speaker 1: with new data and content. One place you may have 22 00:01:12,640 --> 00:01:15,440 Speaker 1: seen the Wayback Machine's work: more than eleven million 23 00:01:15,480 --> 00:01:18,959 Speaker 1: web pages referenced in Wikipedia articles have gone bad over 24 00:01:18,959 --> 00:01:21,440 Speaker 1: the years. In other words, they now return a 25 00:01:21,480 --> 00:01:25,480 Speaker 1: 404, or page not found, error. Because they've been archived 26 00:01:25,480 --> 00:01:27,880 Speaker 1: in the Wayback Machine, technicians there were able to 27 00:01:28,000 --> 00:01:30,959 Speaker 1: edit those Wikipedia pages, so the references now point to 28 00:01:31,040 --> 00:01:34,800 Speaker 1: archived versions of those defunct URLs. The 29 00:01:34,840 --> 00:01:37,560 Speaker 1: Wayback Machine is the brainchild of Brewster Kahle and 30 00:01:37,560 --> 00:01:40,880 Speaker 1: Bruce Gilliat, who also founded the Internet Archive, which is 31 00:01:40,920 --> 00:01:44,160 Speaker 1: a digital library of websites, books, audio and video recordings, 32 00:01:44,200 --> 00:01:49,240 Speaker 1: and software. Both projects are San Francisco-based nonprofits. Kahle 33 00:01:49,240 --> 00:01:52,920 Speaker 1: and Gilliat also created Alexa Internet, which analyzes web traffic 34 00:01:52,960 --> 00:01:57,720 Speaker 1: patterns and was sold to Amazon.
Project director Graham said 35 00:01:57,800 --> 00:02:00,840 Speaker 1: via email that they, Kahle and Gilliat, had started to 36 00:02:01,000 --> 00:02:04,880 Speaker 1: archive web pages in and in two thousand one launched 37 00:02:04,880 --> 00:02:07,640 Speaker 1: the Wayback Machine to support discovery and playback of 38 00:02:07,760 --> 00:02:11,680 Speaker 1: those archived web resources. And yes, the name was inspired 39 00:02:11,680 --> 00:02:14,840 Speaker 1: by the nineteen sixties cartoon series The Rocky and Bullwinkle Show. 40 00:02:15,480 --> 00:02:18,959 Speaker 1: In the cartoon, the WABAC, W-A-B-A-C, 41 00:02:19,320 --> 00:02:22,839 Speaker 1: Machine was a plot device used to transport the characters Mr. 42 00:02:22,880 --> 00:02:25,720 Speaker 1: Peabody and Sherman back in time to visit important events 43 00:02:25,720 --> 00:02:29,320 Speaker 1: in human history. In a world where there are more 44 00:02:29,360 --> 00:02:32,480 Speaker 1: than one point seven billion websites, with the number climbing 45 00:02:32,560 --> 00:02:35,760 Speaker 1: dramatically by the day, how can anyone possibly hope to 46 00:02:35,840 --> 00:02:38,960 Speaker 1: catalog so many web pages? The Wayback Machine uses 47 00:02:39,000 --> 00:02:42,119 Speaker 1: what are called crawlers, a type of software that automatically 48 00:02:42,200 --> 00:02:45,120 Speaker 1: moves through the web, taking snapshots of billions of sites 49 00:02:45,160 --> 00:02:48,640 Speaker 1: as it goes. Some of the process is automated, but 50 00:02:48,720 --> 00:02:51,440 Speaker 1: many of the requests are generated manually by a network 51 00:02:51,480 --> 00:02:54,799 Speaker 1: of librarians who prioritize certain types of sites that they 52 00:02:54,840 --> 00:02:58,840 Speaker 1: think are important to preserve for posterity and for future generations. 53 00:03:00,120 --> 00:03:04,000 Speaker 1: The crawlers don't capture every iteration of sites.
The frequency 54 00:03:04,000 --> 00:03:07,720 Speaker 1: of snapshots differs by the sites' importance. Very significant sites 55 00:03:07,800 --> 00:03:10,959 Speaker 1: might be recorded every few hours. Others might be logged 56 00:03:11,000 --> 00:03:14,520 Speaker 1: weeks or months apart. Most aren't logged at all. So 57 00:03:14,800 --> 00:03:17,519 Speaker 1: don't worry: that embarrassing fan website you made in high 58 00:03:17,520 --> 00:03:20,880 Speaker 1: school is probably long gone by now. The Wayback 59 00:03:20,919 --> 00:03:24,680 Speaker 1: Machine aims to capture snapshots of important content, say the 60 00:03:24,800 --> 00:03:29,520 Speaker 1: breaking news headlines created by major media companies. Furthermore, it 61 00:03:29,560 --> 00:03:33,120 Speaker 1: doesn't necessarily recreate the entire site, and it doesn't preserve 62 00:03:33,160 --> 00:03:35,080 Speaker 1: the data in a way that you'd experience it with 63 00:03:35,120 --> 00:03:38,520 Speaker 1: your browser. It may only capture a few images of 64 00:03:38,560 --> 00:03:41,640 Speaker 1: a few pages and not preserve content that's linked to 65 00:03:41,680 --> 00:03:45,720 Speaker 1: other sites outside of the domain. But on a more 66 00:03:45,720 --> 00:03:49,080 Speaker 1: practical level, you've probably had the experience of clicking on 67 00:03:49,080 --> 00:03:50,720 Speaker 1: a link on a web page and getting a 68 00:03:50,760 --> 00:03:53,720 Speaker 1: 404, or page not found, notation, and now you're 69 00:03:53,720 --> 00:03:56,760 Speaker 1: wondering what was on the page originally. That's where the 70 00:03:56,760 --> 00:04:00,000 Speaker 1: Wayback Machine can help.
To use the Wayback Machine, 71 00:04:00,280 --> 00:04:04,160 Speaker 1: go to archive.org/web. Type the 72 00:04:04,320 --> 00:04:06,000 Speaker 1: URL of the site you want to investigate in the 73 00:04:06,120 --> 00:04:09,080 Speaker 1: Browse History search bar, and in the results you'll see a 74 00:04:09,160 --> 00:04:11,920 Speaker 1: chronological bar graph that shows how many times the site was 75 00:04:11,960 --> 00:04:15,760 Speaker 1: crawled and saved in a given year. Click the year 76 00:04:15,840 --> 00:04:18,440 Speaker 1: and below you'll see a twelve-month calendar with various 77 00:04:18,520 --> 00:04:21,680 Speaker 1: dates highlighted. Blue highlights mean the site was saved properly, 78 00:04:21,920 --> 00:04:24,839 Speaker 1: red means it was not. Click one of the highlighted 79 00:04:24,920 --> 00:04:27,599 Speaker 1: dates and the site's snapshots will appear. Click on 80 00:04:27,600 --> 00:04:30,359 Speaker 1: one of those snapshots, and just like that, you've traveled 81 00:04:30,360 --> 00:04:32,400 Speaker 1: back in time to that older version of the site. 82 00:04:33,400 --> 00:04:35,280 Speaker 1: If you want to make sure that a particular site 83 00:04:35,320 --> 00:04:37,760 Speaker 1: is recorded to the archive, you can do so manually. 84 00:04:38,360 --> 00:04:41,120 Speaker 1: Use the Save Page Now option to save a specific 85 00:04:41,120 --> 00:04:44,200 Speaker 1: page once, but realize that doing so only saves that 86 00:04:44,320 --> 00:04:47,440 Speaker 1: one page, not an entire website, and it doesn't guarantee 87 00:04:47,440 --> 00:04:50,279 Speaker 1: that the site will be crawled in the future. And 88 00:04:50,720 --> 00:04:53,920 Speaker 1: if content owners want their material excluded from the Wayback Machine, 89 00:04:54,160 --> 00:04:56,320 Speaker 1: they can submit a request by sending an email to 90 00:04:56,400 --> 00:05:00,560 Speaker 1: info@archive.org.
Graham says that the most 91 00:05:00,600 --> 00:05:02,640 Speaker 1: amazing thing about the Wayback Machine is that it 92 00:05:02,720 --> 00:05:04,920 Speaker 1: exists at all, and how much of the public web 93 00:05:04,960 --> 00:05:07,120 Speaker 1: it's able to preserve, given that it has such a 94 00:05:07,120 --> 00:05:10,039 Speaker 1: small budget and team. They do use volunteers as well. 95 00:05:11,240 --> 00:05:13,840 Speaker 1: He said, "With more support, we can do an even 96 00:05:13,920 --> 00:05:16,080 Speaker 1: better job of backing up more of the public web. 97 00:05:16,640 --> 00:05:19,040 Speaker 1: Funding for the Internet Archive and the Wayback Machine 98 00:05:19,240 --> 00:05:22,040 Speaker 1: comes from a combination of earned income from our subscription- 99 00:05:22,080 --> 00:05:25,400 Speaker 1: based web archiving service, archive-it.org, major donors 100 00:05:25,400 --> 00:05:27,880 Speaker 1: and foundations, as well as contributions from more than a 101 00:05:27,960 --> 00:05:31,280 Speaker 1: hundred thousand individual donors. We love being able to give 102 00:05:31,279 --> 00:05:33,960 Speaker 1: away our services and don't run ads on our web pages." 103 00:05:35,200 --> 00:05:37,040 Speaker 1: He's sure that the Wayback Machine will become even 104 00:05:37,120 --> 00:05:40,599 Speaker 1: more important in the future. Quote: As the nature of 105 00:05:40,600 --> 00:05:44,320 Speaker 1: how people communicate and share information evolves, so too we 106 00:05:44,360 --> 00:05:48,120 Speaker 1: will need to build technologies, processes, and partnerships to continue 107 00:05:48,160 --> 00:05:50,080 Speaker 1: to do the best job we can to preserve as 108 00:05:50,120 --> 00:05:53,440 Speaker 1: much of this public information as possible.
All in support 109 00:05:53,440 --> 00:05:55,960 Speaker 1: of the Wayback Machine's mission to help make the 110 00:05:55,960 --> 00:05:59,400 Speaker 1: web more useful and reliable, and in particular, to help 111 00:05:59,440 --> 00:06:04,279 Speaker 1: support journalists, activists, academics, historians, researchers, and the general public. 112 00:06:09,560 --> 00:06:11,960 Speaker 1: Today's episode was written by Nathan Chandler and produced by 113 00:06:11,960 --> 00:06:14,719 Speaker 1: Tyler Klang. BrainStuff is a production of iHeartRadio's 114 00:06:14,720 --> 00:06:16,599 Speaker 1: HowStuffWorks. For more on this and lots of 115 00:06:16,600 --> 00:06:19,400 Speaker 1: other well-archived topics, visit our home planet, howstuffworks.com, 116 00:06:19,400 --> 00:06:21,880 Speaker 1: and for more podcasts from iHeartRadio, 117 00:06:21,960 --> 00:06:24,400 Speaker 1: visit the iHeartRadio app, Apple Podcasts, 118 00:06:24,440 --> 00:06:26,240 Speaker 1: or wherever you listen to your favorite shows.