1 00:00:04,400 --> 00:00:07,800 Speaker 1: Welcome to TechStuff, a production from iHeartRadio. 2 00:00:12,160 --> 00:00:15,120 Speaker 1: Hey there, and welcome to TechStuff. I'm your host, 3 00:00:15,280 --> 00:00:18,400 Speaker 1: Jonathan Strickland. I'm an executive producer with iHeartRadio 4 00:00:18,440 --> 00:00:21,560 Speaker 1: and I love all things tech. And last week, 5 00:00:21,960 --> 00:00:27,360 Speaker 1: Amazon's US-East-1 cloud region had a bit of 6 00:00:27,400 --> 00:00:32,360 Speaker 1: an outage, and the effects were widespread. Amazon delivery services 7 00:00:32,400 --> 00:00:35,680 Speaker 1: were affected. A lot of deliveries just couldn't be made 8 00:00:35,720 --> 00:00:40,240 Speaker 1: because the whole system that underlies that, the computer system, 9 00:00:40,320 --> 00:00:45,919 Speaker 1: was affected. Computer games like PlayerUnknown's Battlegrounds became unavailable. 10 00:00:46,320 --> 00:00:49,520 Speaker 1: People discovered that some of their home automation devices weren't 11 00:00:49,520 --> 00:00:53,600 Speaker 1: working properly. Roombas went berserk and rose up against 12 00:00:53,600 --> 00:00:57,720 Speaker 1: their human owners. Even down at Walt Disney World, guests 13 00:00:57,800 --> 00:01:01,360 Speaker 1: found themselves struggling with systems like Genie Plus, or even 14 00:01:01,400 --> 00:01:04,080 Speaker 1: just making a park reservation so that they could visit 15 00:01:04,160 --> 00:01:06,720 Speaker 1: a theme park. Also, I kind of made up the 16 00:01:06,800 --> 00:01:11,319 Speaker 1: Roomba thing. 
So today I thought I would talk a 17 00:01:11,360 --> 00:01:15,399 Speaker 1: little bit about the history of Amazon Web Services, what 18 00:01:15,600 --> 00:01:19,280 Speaker 1: it actually does, why it's such a big deal for 19 00:01:19,319 --> 00:01:23,520 Speaker 1: Amazon the company, and why when there's an outage it 20 00:01:23,640 --> 00:01:28,040 Speaker 1: has such a widespread effect. Now, the history of Amazon 21 00:01:28,120 --> 00:01:32,119 Speaker 1: Web Services, or AWS, goes back a couple of decades 22 00:01:32,360 --> 00:01:35,560 Speaker 1: and it is tied closely with the general rise of 23 00:01:35,640 --> 00:01:39,800 Speaker 1: cloud computing. So first, let's define cloud computing just so 24 00:01:39,840 --> 00:01:43,080 Speaker 1: that we have a common language. Now, if you were 25 00:01:43,160 --> 00:01:47,919 Speaker 1: to go to Google and query the terms cloud computing definition, 26 00:01:48,520 --> 00:01:52,320 Speaker 1: you would likely get something like the following, quote: The 27 00:01:52,360 --> 00:01:56,040 Speaker 1: practice of using a network of remote servers hosted on 28 00:01:56,080 --> 00:02:00,360 Speaker 1: the Internet to store, manage, and process data, rather than 29 00:02:00,400 --> 00:02:04,520 Speaker 1: a local server or a personal computer, end quote. So, 30 00:02:05,320 --> 00:02:09,919 Speaker 1: in its simplest form, cloud computing is when you 31 00:02:10,000 --> 00:02:14,839 Speaker 1: access computational resources that are on someone else's computer and 32 00:02:14,960 --> 00:02:17,440 Speaker 1: you use the Internet to do it. 
So, if you 33 00:02:17,560 --> 00:02:21,400 Speaker 1: use any sort of cloud storage like OneDrive or 34 00:02:21,480 --> 00:02:24,520 Speaker 1: Dropbox or any of a thousand others, what you 35 00:02:24,560 --> 00:02:29,000 Speaker 1: are actually doing is saving files to special data servers 36 00:02:29,040 --> 00:02:33,200 Speaker 1: that are in some massive server farm somewhere in the world, 37 00:02:33,320 --> 00:02:37,280 Speaker 1: probably not too far from where you are. Or maybe 38 00:02:37,639 --> 00:02:40,560 Speaker 1: you're actually saving that one file to servers that are 39 00:02:40,560 --> 00:02:44,560 Speaker 1: in a few different massive server farms, though you wouldn't 40 00:02:44,600 --> 00:02:47,400 Speaker 1: necessarily be aware of any of this, because that would 41 00:02:47,400 --> 00:02:49,720 Speaker 1: be going on in the background, and it would be 42 00:02:49,760 --> 00:02:52,799 Speaker 1: a matter of redundancy to make sure that your file 43 00:02:53,440 --> 00:02:56,680 Speaker 1: remains available even if something should happen to any one 44 00:02:56,840 --> 00:03:00,960 Speaker 1: particular machine. So when you access that file, what you're 45 00:03:01,000 --> 00:03:04,440 Speaker 1: doing is connecting back to one of those servers that 46 00:03:04,560 --> 00:03:08,160 Speaker 1: holds that particular file, and you might download the file 47 00:03:08,200 --> 00:03:11,200 Speaker 1: to your local machine, so you're just retrieving it, or 48 00:03:11,320 --> 00:03:14,560 Speaker 1: depending on the type of file you're accessing and the 49 00:03:14,560 --> 00:03:16,760 Speaker 1: type of service you're using, you might be able to 50 00:03:16,800 --> 00:03:19,720 Speaker 1: do stuff like make changes to that file through a 51 00:03:19,760 --> 00:03:22,720 Speaker 1: web-based client. 
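The redundancy idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Server` and `ReplicatedStore` classes, the server names, and the "write everywhere, read from the first reachable copy" policy are all hypothetical, and real cloud storage services use far more sophisticated replication.

```python
class Server:
    """Hypothetical stand-in for one storage machine in a server farm."""

    def __init__(self, name):
        self.name = name
        self.files = {}      # filename -> bytes held on this machine
        self.online = True   # flip to False to simulate a failure


class ReplicatedStore:
    """Toy storage service that keeps a copy of each file on every server."""

    def __init__(self, servers):
        self.servers = servers

    def save(self, filename, data):
        # Write the same file to every server that is currently reachable.
        for server in self.servers:
            if server.online:
                server.files[filename] = data

    def load(self, filename):
        # Return the file from the first reachable server that holds a copy.
        for server in self.servers:
            if server.online and filename in server.files:
                return server.files[filename]
        raise FileNotFoundError(filename)


servers = [Server("farm-a"), Server("farm-b"), Server("farm-c")]
store = ReplicatedStore(servers)
store.save("notes.txt", b"hello cloud")

servers[0].online = False            # one machine goes down
recovered = store.load("notes.txt")  # the file is still available
print(recovered)
```

The user of the store never needs to know which machine answered, which matches the point that all of this happens in the background.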
So if you were to create a 52 00:03:22,800 --> 00:03:26,720 Speaker 1: document in Google Docs, for example, that would follow that 53 00:03:26,840 --> 00:03:30,040 Speaker 1: kind of cloud computing model. It's one of the simplest 54 00:03:30,080 --> 00:03:34,840 Speaker 1: manifestations of cloud computing and is effectively cloud storage with 55 00:03:34,960 --> 00:03:38,040 Speaker 1: a little bit of editing thrown in. But cloud computing 56 00:03:38,080 --> 00:03:41,840 Speaker 1: can go far beyond just storing files. There are cloud- 57 00:03:41,840 --> 00:03:45,920 Speaker 1: based services that allow developers to build out an app environment. 58 00:03:46,520 --> 00:03:49,440 Speaker 1: They might do this so that a distributed team, you know, 59 00:03:49,520 --> 00:03:52,400 Speaker 1: people who aren't working all in the same location, can 60 00:03:52,520 --> 00:03:56,560 Speaker 1: simultaneously work on the same code and create test environments 61 00:03:56,600 --> 00:04:00,000 Speaker 1: to make sure that the app performs as expected before 62 00:04:00,120 --> 00:04:03,760 Speaker 1: they deploy the app to end users, you know, 63 00:04:03,800 --> 00:04:09,000 Speaker 1: to customers. Other cloud services serve as an actual deployment platform, 64 00:04:09,080 --> 00:04:12,080 Speaker 1: so not just to develop, but to deploy. That gives 65 00:04:12,120 --> 00:04:14,720 Speaker 1: developers the assets that they need to push out an 66 00:04:14,760 --> 00:04:20,120 Speaker 1: app and handle user interactions. So some apps might quote 67 00:04:20,160 --> 00:04:24,919 Speaker 1: unquote live natively on your device. 
Right? You download a 68 00:04:25,040 --> 00:04:28,560 Speaker 1: file to whatever you're using, whether it's a computer or 69 00:04:28,680 --> 00:04:31,800 Speaker 1: smartphone or tablet or whatever it is, and then all 70 00:04:31,839 --> 00:04:35,400 Speaker 1: the processes and all the data could be contained right 71 00:04:35,400 --> 00:04:39,200 Speaker 1: there locally on your machine. Um, that's like the old 72 00:04:39,279 --> 00:04:43,719 Speaker 1: form of computing. But increasingly we're seeing apps that rely 73 00:04:43,880 --> 00:04:47,800 Speaker 1: on the cloud for functionality. So games could have stuff 74 00:04:47,839 --> 00:04:51,280 Speaker 1: like leaderboards or ways that you can compete or cooperate 75 00:04:51,360 --> 00:04:55,400 Speaker 1: with other players in real time. A weather app needs 76 00:04:55,440 --> 00:04:58,240 Speaker 1: to fetch data from servers to tell you what the 77 00:04:58,279 --> 00:05:01,240 Speaker 1: weather will be like. Your phone doesn't magically know 78 00:05:01,279 --> 00:05:03,560 Speaker 1: what the weather is going to be. Even a lot 79 00:05:03,560 --> 00:05:07,520 Speaker 1: of home automation apps will communicate back with a web 80 00:05:07,560 --> 00:05:11,320 Speaker 1: server somewhere rather than handle everything right there in your home. 81 00:05:11,360 --> 00:05:14,000 Speaker 1: In fact, that's a sticking point for a lot of 82 00:05:14,040 --> 00:05:18,440 Speaker 1: home automation folks, right? They don't necessarily want to have 83 00:05:18,640 --> 00:05:21,080 Speaker 1: the cloud part of the infrastructure. They would prefer to 84 00:05:21,080 --> 00:05:23,799 Speaker 1: have their home be kind of a self-contained system. 
85 00:05:24,400 --> 00:05:27,280 Speaker 1: You see this a lot with people who have security 86 00:05:27,320 --> 00:05:30,839 Speaker 1: systems where they would prefer to have something that was 87 00:05:30,920 --> 00:05:34,400 Speaker 1: completely contained within their own home, as opposed to having 88 00:05:34,440 --> 00:05:38,600 Speaker 1: their security system become a surveillance tool for a company 89 00:05:38,640 --> 00:05:41,560 Speaker 1: that may or may not be working in conjunction with, say, 90 00:05:41,839 --> 00:05:45,120 Speaker 1: law enforcement. That's become a big issue, but that's a 91 00:05:45,120 --> 00:05:48,559 Speaker 1: matter for a different podcast. Now. Building out these kinds 92 00:05:48,560 --> 00:05:53,719 Speaker 1: of systems is expensive because you need the physical facilities right, 93 00:05:53,760 --> 00:05:55,919 Speaker 1: You need the actual buildings, and they have to be 94 00:05:56,000 --> 00:05:59,240 Speaker 1: large enough to hold all the servers that are designed 95 00:05:59,560 --> 00:06:03,279 Speaker 1: to make your app work right, and then the facilities 96 00:06:03,320 --> 00:06:06,039 Speaker 1: also have to be designed themselves to allow those servers 97 00:06:06,080 --> 00:06:09,479 Speaker 1: to operate. That means building out stuff like cooling systems 98 00:06:09,760 --> 00:06:12,320 Speaker 1: so that your machines don't overheat. So you know, it's 99 00:06:12,320 --> 00:06:15,480 Speaker 1: not just enough to have a place where you store 100 00:06:15,480 --> 00:06:19,360 Speaker 1: all the computers. You've gotta have it be appropriate for that. 101 00:06:19,400 --> 00:06:22,760 Speaker 1: You know, it needs to be dry, free of dust, cooled, 102 00:06:22,880 --> 00:06:26,280 Speaker 1: that kind of stuff. You also need to maintain those systems. 
103 00:06:26,400 --> 00:06:29,359 Speaker 1: You have to repair or replace components as they fail, 104 00:06:29,440 --> 00:06:32,279 Speaker 1: because we all know technology does fail at some point 105 00:06:32,360 --> 00:06:35,760 Speaker 1: for a variety of reasons. That's also why you need 106 00:06:35,800 --> 00:06:39,880 Speaker 1: to make more than the bare minimum to run your operations, right, 107 00:06:39,920 --> 00:06:42,760 Speaker 1: you need to do more than just the basics. You 108 00:06:42,800 --> 00:06:47,240 Speaker 1: need backups for redundancy so that if and when a 109 00:06:47,400 --> 00:06:51,039 Speaker 1: specific machine goes down, others can take its place seamlessly 110 00:06:51,160 --> 00:06:54,720 Speaker 1: without affecting the end user. So in our Google Docs experience, 111 00:06:55,400 --> 00:06:58,240 Speaker 1: like if you were to create a document that's not 112 00:06:58,320 --> 00:07:01,080 Speaker 1: just sitting on one server that Google owns, it's actually 113 00:07:01,120 --> 00:07:05,840 Speaker 1: on multiple servers um multiple machines, and if one of 114 00:07:05,880 --> 00:07:08,320 Speaker 1: those machines goes down, you can still get access to 115 00:07:08,360 --> 00:07:11,360 Speaker 1: your file. Also when you make changes to it. Essentially 116 00:07:11,400 --> 00:07:14,560 Speaker 1: you're making changes on one file on one machine that 117 00:07:14,560 --> 00:07:16,640 Speaker 1: that machine then sends out a message to all the 118 00:07:16,680 --> 00:07:19,440 Speaker 1: other machines that have that same file so that they 119 00:07:19,440 --> 00:07:22,640 Speaker 1: can all be updated with the newest version. UM. I've 120 00:07:22,680 --> 00:07:25,600 Speaker 1: done episodes about you know, that kind of background stuff 121 00:07:25,720 --> 00:07:29,040 Speaker 1: in in the Google Docs world. 
So moving on, Essentially, 122 00:07:29,080 --> 00:07:35,040 Speaker 1: cloud computing utilizes network connections to allow you, other people, organizations, 123 00:07:35,120 --> 00:07:38,160 Speaker 1: companies to rely on machines that are hosted somewhere else, 124 00:07:38,400 --> 00:07:41,440 Speaker 1: and that frees you up considerably. You don't have to 125 00:07:41,520 --> 00:07:43,960 Speaker 1: invest in buying hard drives so that you can save 126 00:07:44,000 --> 00:07:46,840 Speaker 1: all your files. You can just subscribe to a service 127 00:07:46,920 --> 00:07:49,800 Speaker 1: to get some cloud storage companies don't have to keep 128 00:07:49,880 --> 00:07:53,600 Speaker 1: themselves out with massive computer systems, uh complete with an 129 00:07:53,600 --> 00:07:56,400 Speaker 1: I T department to support those computer systems. They can 130 00:07:56,440 --> 00:07:58,760 Speaker 1: just you know, spend some money to use a cloud 131 00:07:58,800 --> 00:08:02,240 Speaker 1: computing service owned by someone else and then host all 132 00:08:02,280 --> 00:08:05,160 Speaker 1: their operations through that. Though, I should add a lot 133 00:08:05,160 --> 00:08:07,760 Speaker 1: of companies take a more hybrid approach, so they have 134 00:08:07,880 --> 00:08:12,880 Speaker 1: some systems typically really like mission critical systems are sometimes 135 00:08:12,880 --> 00:08:15,200 Speaker 1: ones that require a great deal of privacy and security. 136 00:08:15,520 --> 00:08:19,400 Speaker 1: They might run those on premises or on prem and 137 00:08:19,440 --> 00:08:22,800 Speaker 1: then rely on other stuff, like the more administrative stuff 138 00:08:23,440 --> 00:08:28,560 Speaker 1: for cloud systems. 
Cloud computing started becoming a term, a 139 00:08:28,600 --> 00:08:32,360 Speaker 1: buzz term really, around, I mean, the idea was older 140 00:08:32,360 --> 00:08:34,960 Speaker 1: than that, but it was starting to really get circulation 141 00:08:35,000 --> 00:08:37,040 Speaker 1: around twenty ten or so. But the seeds, like I said, 142 00:08:37,080 --> 00:08:40,679 Speaker 1: were planted earlier. So let's take a look at 143 00:08:40,679 --> 00:08:44,199 Speaker 1: Amazon specifically, because it played a big part in this. 144 00:08:45,160 --> 00:08:48,360 Speaker 1: So way back in two thousand, Amazon was scrambling to 145 00:08:48,440 --> 00:08:51,720 Speaker 1: keep up with some scaling issues, and this is something 146 00:08:51,760 --> 00:08:54,719 Speaker 1: that you hear about with startups pretty frequently. A new 147 00:08:54,760 --> 00:08:58,920 Speaker 1: startup is usually a fairly small company, and it's nimble, 148 00:08:59,000 --> 00:09:01,440 Speaker 1: and it's agile, and it might offer a small range 149 00:09:01,440 --> 00:09:04,400 Speaker 1: of services or products, or it might only serve a 150 00:09:04,440 --> 00:09:08,600 Speaker 1: relatively small region, or both. I think companies like Lyft 151 00:09:08,640 --> 00:09:11,600 Speaker 1: and Uber, those launched in just a couple of cities 152 00:09:11,720 --> 00:09:15,680 Speaker 1: early on, right, so they were able to grow in 153 00:09:15,720 --> 00:09:19,360 Speaker 1: a controlled manner. Well, if customer demand is high and 154 00:09:19,520 --> 00:09:22,920 Speaker 1: investors are pouring money into the startup, it makes some 155 00:09:23,040 --> 00:09:26,400 Speaker 1: sense to try and grow the company and expand operations. 
156 00:09:26,440 --> 00:09:29,640 Speaker 1: But growing adds new challenges, and making sure that the 157 00:09:29,679 --> 00:09:32,600 Speaker 1: things you offer are able to scale up and meet 158 00:09:32,640 --> 00:09:37,360 Speaker 1: demand is a non-trivial matter. That's the situation Amazon 159 00:09:37,520 --> 00:09:40,920 Speaker 1: was in around two thousand. One of the things the 160 00:09:40,920 --> 00:09:44,520 Speaker 1: company was exploring was building out merchant sites for other 161 00:09:44,600 --> 00:09:48,680 Speaker 1: companies but still using the Amazon platform. So, for example, 162 00:09:48,880 --> 00:09:52,640 Speaker 1: Amazon might partner with a retail company like Target to 163 00:09:52,720 --> 00:09:57,360 Speaker 1: provide an online store, but use Amazon's infrastructure underlying that store, 164 00:09:57,920 --> 00:10:00,000 Speaker 1: and this would bring in a new stream of revenue 165 00:10:00,240 --> 00:10:03,080 Speaker 1: for Amazon, and it would mean these retail companies could 166 00:10:03,080 --> 00:10:06,480 Speaker 1: rely on Amazon's platform rather than having to build out 167 00:10:06,559 --> 00:10:09,680 Speaker 1: an online store all of their own. So Amazon 168 00:10:09,760 --> 00:10:13,560 Speaker 1: called this merchant dot com. But it turned out building 169 00:10:13,600 --> 00:10:16,800 Speaker 1: merchant dot com was pretty challenging. It was one thing 170 00:10:16,840 --> 00:10:20,040 Speaker 1: to manage Amazon's rapid growth, but it was another to 171 00:10:20,080 --> 00:10:23,160 Speaker 1: build out products that could immediately scale to fit the 172 00:10:23,200 --> 00:10:27,160 Speaker 1: needs of established companies like Target. 
The initial result was 173 00:10:27,200 --> 00:10:30,840 Speaker 1: a product that had so many interconnected moving parts and 174 00:10:30,960 --> 00:10:33,960 Speaker 1: features that it was difficult for a user to navigate 175 00:10:34,040 --> 00:10:37,240 Speaker 1: and actually use. And I'm sure all of you out 176 00:10:37,320 --> 00:10:40,280 Speaker 1: there know that if a tool is hard to use, 177 00:10:41,120 --> 00:10:43,680 Speaker 1: most people don't bother with it. Right? You might get 178 00:10:43,720 --> 00:10:46,199 Speaker 1: it and try and think this is too much hassle, 179 00:10:46,280 --> 00:10:48,439 Speaker 1: so you would rather go without or find some 180 00:10:48,480 --> 00:10:52,320 Speaker 1: other alternative. Well, in two thousand two, Amazon began building 181 00:10:52,320 --> 00:10:56,439 Speaker 1: out Amazon dot Com Web Service. Now this would not 182 00:10:56,679 --> 00:11:00,000 Speaker 1: quite be the same thing as Amazon Web Services, despite 183 00:11:00,040 --> 00:11:02,760 Speaker 1: the similar names. It was much more simple than that. 184 00:11:03,240 --> 00:11:08,079 Speaker 1: It used a SOAP and XML interface. And by SOAP, 185 00:11:08,120 --> 00:11:11,280 Speaker 1: I don't mean the stuff you use to get clean. Now, 186 00:11:11,440 --> 00:11:14,160 Speaker 1: if you're not a developer, those things probably sound a 187 00:11:14,160 --> 00:11:17,520 Speaker 1: little confusing, so let's clear it up. SOAP is a 188 00:11:17,559 --> 00:11:22,280 Speaker 1: messaging protocol which originally stood for Simple Object Access Protocol, 189 00:11:22,760 --> 00:11:29,400 Speaker 1: and XML means Extensible Markup Language. It is a language, 190 00:11:30,480 --> 00:11:32,880 Speaker 1: so weird to say this, so it's like a machine- 191 00:11:32,920 --> 00:11:36,439 Speaker 1: readable and human-readable language that's used to create sets 192 00:11:36,480 --> 00:11:40,800 Speaker 1: of rules for document encoding. 
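To make SOAP and XML a little more concrete, here is a minimal sketch of what a SOAP-style message looks like when built with Python's standard library. The envelope namespace is the standard SOAP 1.1 one, but everything inside the body is an assumption for illustration: the `urn:example:catalog` namespace and the `KeywordSearch` and `Keyword` element names are hypothetical and are not Amazon's actual interface.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
CATALOG_NS = "urn:example:catalog"                      # hypothetical, for illustration only


def build_search_request(keyword):
    """Build a SOAP envelope wrapping a made-up catalog search request."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation and its parameter are plain XML elements inside the body.
    search = ET.SubElement(body, f"{{{CATALOG_NS}}}KeywordSearch")
    ET.SubElement(search, f"{{{CATALOG_NS}}}Keyword").text = keyword
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_search_request("robot vacuum")
print(request_xml)

# Any machine that can parse XML can read the same message back.
parsed = ET.fromstring(request_xml)
keyword = parsed.find(f".//{{{CATALOG_NS}}}Keyword").text
print(keyword)
```

Because the message is just structured text, it can be read by both people and programs, and it would typically be carried over HTTP to whatever service understands it.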
So this is a 193 00:11:41,800 --> 00:11:46,320 Speaker 1: language we use to define rules as opposed to, you know, 194 00:11:46,440 --> 00:11:51,280 Speaker 1: programming something together. These allowed developers to create processes that 195 00:11:51,320 --> 00:11:53,960 Speaker 1: can run on pretty much any machine that has HTTP 196 00:11:54,160 --> 00:11:57,760 Speaker 1: installed on it. Uh, that way, you could create a 197 00:11:57,880 --> 00:12:00,719 Speaker 1: process that can run on Windows devices or Mac 198 00:12:00,800 --> 00:12:03,280 Speaker 1: OS or Linux, all that kind of stuff, without having 199 00:12:03,280 --> 00:12:06,880 Speaker 1: to program a specific version for each operating system. So 200 00:12:07,320 --> 00:12:10,520 Speaker 1: Amazon's version of this allowed for a pretty limited amount 201 00:12:10,559 --> 00:12:14,120 Speaker 1: of development around creating processes that could access the Amazon 202 00:12:14,200 --> 00:12:18,240 Speaker 1: product catalog. This would allow web developers to create an 203 00:12:18,280 --> 00:12:22,720 Speaker 1: interface on their own web page that would utilize Amazon's store, 204 00:12:23,360 --> 00:12:25,800 Speaker 1: with the idea that people could buy a product right 205 00:12:25,840 --> 00:12:29,360 Speaker 1: there from that web page instead of having to navigate 206 00:12:29,440 --> 00:12:33,199 Speaker 1: over to Amazon dot com itself, and the developers would 207 00:12:33,200 --> 00:12:36,760 Speaker 1: earn a small commission on every sale made through that, 208 00:12:37,240 --> 00:12:39,960 Speaker 1: you know, web page based point of sale. It's just 209 00:12:40,040 --> 00:12:44,120 Speaker 1: a tiny dip of the toe in the cloud-based infrastructure. Also, 210 00:12:44,160 --> 00:12:47,240 Speaker 1: Amazon noticed that developers were, I mean, this happens all 211 00:12:47,280 --> 00:12:50,680 Speaker 1: the time. 
Developers were taking that tool and making stuff 212 00:12:50,679 --> 00:12:55,160 Speaker 1: that Amazon had not anticipated or intended, and nothing necessarily bad, 213 00:12:55,200 --> 00:12:58,559 Speaker 1: but like some were making games where they would show 214 00:12:59,240 --> 00:13:02,360 Speaker 1: use this this methodology to show a picture of an 215 00:13:02,360 --> 00:13:04,360 Speaker 1: Amazon product, and it was up to you to guess 216 00:13:04,400 --> 00:13:07,959 Speaker 1: what that product was. That kind of thing. So they 217 00:13:07,960 --> 00:13:11,360 Speaker 1: were gamifying certain elements of this and that kind of 218 00:13:11,360 --> 00:13:15,240 Speaker 1: got wheels turning over at Amazon. This happens all the time. 219 00:13:15,280 --> 00:13:17,440 Speaker 1: Whenever you create anything and you give it to developers, 220 00:13:17,480 --> 00:13:20,280 Speaker 1: they immediately figure out ways to misuse it, I mean 221 00:13:20,400 --> 00:13:25,479 Speaker 1: use it creatively anyway. Around the same time, Amazon executives 222 00:13:25,520 --> 00:13:28,480 Speaker 1: began to realize that their various development teams were running 223 00:13:28,480 --> 00:13:32,120 Speaker 1: into the same problems over and over. Namely, each team 224 00:13:32,120 --> 00:13:35,600 Speaker 1: working on a different internal project would need to go 225 00:13:35,679 --> 00:13:38,120 Speaker 1: through the same basic steps before they could do any 226 00:13:38,160 --> 00:13:41,040 Speaker 1: serious work on the project itself, which involved things like 227 00:13:41,640 --> 00:13:46,240 Speaker 1: establishing systems to handle compute operations, UH, storage solutions to 228 00:13:46,280 --> 00:13:49,920 Speaker 1: hold all the data, and also database solutions to organize everything, 229 00:13:50,280 --> 00:13:53,600 Speaker 1: and a clear picture began to emerge. 
Amazon's teams were 230 00:13:53,640 --> 00:13:57,360 Speaker 1: having to reinvent the wheel with every new project, and 231 00:13:57,400 --> 00:13:59,880 Speaker 1: the original projection for seeing a project go from start 232 00:13:59,880 --> 00:14:01,959 Speaker 1: to finish was supposed to be three months. That was 233 00:14:02,000 --> 00:14:04,720 Speaker 1: the goal for Amazon, but it turned out that just 234 00:14:04,800 --> 00:14:07,760 Speaker 1: building out the infrastructure to allow a project team to 235 00:14:07,800 --> 00:14:12,320 Speaker 1: actually start developing their project would take three months, so 236 00:14:12,440 --> 00:14:16,119 Speaker 1: everything was running behind schedule. The lesson that the executives 237 00:14:16,120 --> 00:14:18,360 Speaker 1: took from this is that it would be a worthwhile 238 00:14:18,440 --> 00:14:23,280 Speaker 1: endeavor to establish a centralized internal system that could support 239 00:14:23,320 --> 00:14:27,120 Speaker 1: the compute, database, and storage needs of all these different 240 00:14:27,120 --> 00:14:29,480 Speaker 1: project teams. So it would need to be a system 241 00:14:29,560 --> 00:14:33,120 Speaker 1: that could compartmentalize and contain each project so that every 242 00:14:33,120 --> 00:14:35,880 Speaker 1: one of them would have the resources that the teams needed. 243 00:14:36,320 --> 00:14:39,480 Speaker 1: It meant building out virtual machines and figuring out ways 244 00:14:39,520 --> 00:14:42,760 Speaker 1: to create redundancy, and it was a matter of necessity 245 00:14:42,840 --> 00:14:45,400 Speaker 1: for Amazon in order for those internal teams to get 246 00:14:45,440 --> 00:14:48,880 Speaker 1: out of that, you know, three month projection goal. 
But 247 00:14:48,960 --> 00:14:51,600 Speaker 1: it also meant Amazon was building up something that could 248 00:14:51,640 --> 00:14:54,280 Speaker 1: potentially end up being a service that the company could 249 00:14:54,280 --> 00:14:57,160 Speaker 1: offer to others. It would take a little bit longer 250 00:14:57,200 --> 00:14:59,880 Speaker 1: for that to come about. Over time, folks at Amazon 251 00:15:00,000 --> 00:15:02,680 Speaker 1: began to think of this effort as creating something almost 252 00:15:02,760 --> 00:15:06,440 Speaker 1: like an operating system, but for the Internet rather than 253 00:15:06,520 --> 00:15:10,640 Speaker 1: for a computer or a mobile device. These ideas began 254 00:15:10,680 --> 00:15:13,160 Speaker 1: to first take shape around two thousand three, when Amazon 255 00:15:13,240 --> 00:15:16,480 Speaker 1: executives were attending a company retreat. It would be another 256 00:15:16,560 --> 00:15:19,760 Speaker 1: few years before the earliest version of Amazon's web services 257 00:15:19,840 --> 00:15:22,560 Speaker 1: would launch. All right, we're gonna take a quick break. 258 00:15:22,880 --> 00:15:34,400 Speaker 1: When we come back, we'll talk more about Amazon Web Services. Okay, 259 00:15:34,400 --> 00:15:36,800 Speaker 1: we left off in two thousand three. Let's put this 260 00:15:36,880 --> 00:15:39,800 Speaker 1: in perspective. If you're anything like me, you might say, 261 00:15:39,840 --> 00:15:41,840 Speaker 1: all right, well that's less than twenty years ago. I 262 00:15:41,880 --> 00:15:44,480 Speaker 1: get it. But let's think about other things that 263 00:15:44,520 --> 00:15:46,920 Speaker 1: were going on. Right, so, two thousand three was a 264 00:15:47,000 --> 00:15:51,360 Speaker 1: year before Facebook would launch at Harvard, let alone expand 265 00:15:51,400 --> 00:15:53,560 Speaker 1: beyond it. 
In fact, it was about three years before 266 00:15:53,560 --> 00:15:55,360 Speaker 1: Facebook would get out of the phase where it was 267 00:15:55,480 --> 00:15:59,040 Speaker 1: only available to college students. Two thousand three was two 268 00:15:59,160 --> 00:16:02,360 Speaker 1: years before YouTube launched. It was four years before 269 00:16:02,360 --> 00:16:05,760 Speaker 1: Apple would introduce the iPhone, and it was just two 270 00:16:05,840 --> 00:16:09,680 Speaker 1: years after we had had the dot com crash that 271 00:16:09,760 --> 00:16:13,680 Speaker 1: had wiped out numerous web-based companies. So this was 272 00:16:13,800 --> 00:16:18,880 Speaker 1: very early on in thinking about cloud computing and operations 273 00:16:18,880 --> 00:16:21,600 Speaker 1: at this kind of scale. The company began to invest 274 00:16:21,640 --> 00:16:24,640 Speaker 1: in building out data centers, you know, these huge facilities 275 00:16:24,640 --> 00:16:29,040 Speaker 1: that hold thousands of servers, and engineers developed and tweaked 276 00:16:29,120 --> 00:16:34,800 Speaker 1: database management services to coordinate and partition these machines effectively. Meanwhile, 277 00:16:35,200 --> 00:16:38,040 Speaker 1: the product development teams would work on new products to 278 00:16:38,120 --> 00:16:41,840 Speaker 1: expand what Amazon could do for customers. So in two 279 00:16:41,880 --> 00:16:45,120 Speaker 1: thousand three, Andy Jassy, who would go on to become 280 00:16:45,200 --> 00:16:48,760 Speaker 1: the CEO of Amazon as of July five of this year, 281 00:16:49,400 --> 00:16:52,960 Speaker 1: he became the project lead for Amazon Web Services. He 282 00:16:53,040 --> 00:16:56,320 Speaker 1: had suggested to Jeff Bezos that Amazon could take the 283 00:16:56,440 --> 00:16:59,680 Speaker 1: systems the company had been developing for internal use and 284 00:16:59,720 --> 00:17:03,040 Speaker 1: then open those up as a product for other companies. 
285 00:17:03,080 --> 00:17:06,640 Speaker 1: And he was essentially pitching cloud computing to Jeff Bezos, 286 00:17:07,040 --> 00:17:09,600 Speaker 1: and he got the go-ahead. In two thousand four, 287 00:17:09,680 --> 00:17:12,639 Speaker 1: Jassy's team had a beta version of this product that 288 00:17:12,720 --> 00:17:16,280 Speaker 1: was ready for testing, and over the following two years 289 00:17:16,560 --> 00:17:19,440 Speaker 1: they would refine and tweak that product until, in two 290 00:17:19,480 --> 00:17:23,920 Speaker 1: thousand six, AWS was ready to launch its first product. 291 00:17:24,440 --> 00:17:27,800 Speaker 1: Now this would not be Amazon Web Services as a 292 00:17:27,840 --> 00:17:33,160 Speaker 1: cohesive whole, but rather a single product called Simple Storage 293 00:17:33,200 --> 00:17:37,680 Speaker 1: Service, or S three, which debuted on March fourteen, two 294 00:17:37,680 --> 00:17:42,280 Speaker 1: thousand six. Now, Amazon described S three as a tool 295 00:17:42,320 --> 00:17:46,119 Speaker 1: that would let developers save and retrieve, quote, any amount 296 00:17:46,160 --> 00:17:49,480 Speaker 1: of data at any time from anywhere on the web, 297 00:17:49,800 --> 00:17:53,639 Speaker 1: end quote. So this was a cloud storage product. It 298 00:17:53,840 --> 00:17:57,159 Speaker 1: is a cloud storage product. It still exists. The experience 299 00:17:57,200 --> 00:18:00,359 Speaker 1: of merchant dot com had, however, taught Amazon developers 300 00:18:00,400 --> 00:18:04,800 Speaker 1: a pretty valuable lesson, which I would summarize as just 301 00:18:05,000 --> 00:18:08,600 Speaker 1: because you can doesn't mean you should. Now. 
Granted, I 302 00:18:08,720 --> 00:18:12,439 Speaker 1: usually use that phrase to criticize vocalists who do irritating 303 00:18:12,520 --> 00:18:17,840 Speaker 1: vocal runs during their songs, um, Mariah Carey, but in 304 00:18:17,880 --> 00:18:22,440 Speaker 1: this case, I'm talking about the issue of feature creep. Now, 305 00:18:22,560 --> 00:18:26,119 Speaker 1: feature creep is this tendency to throw in extra features 306 00:18:26,160 --> 00:18:30,040 Speaker 1: and options into a product just because you can. These 307 00:18:30,040 --> 00:18:34,359 Speaker 1: features don't necessarily contribute to the usefulness of that product. 308 00:18:34,680 --> 00:18:37,119 Speaker 1: In fact, more often than not, they can cause a 309 00:18:37,119 --> 00:18:39,840 Speaker 1: product to be janky and hard to navigate. The 310 00:18:39,880 --> 00:18:43,360 Speaker 1: Amazon developers didn't want S three to fall into that trap, 311 00:18:43,400 --> 00:18:46,240 Speaker 1: and so early on the team decided that the only 312 00:18:46,240 --> 00:18:48,680 Speaker 1: thing that needed to be done was to make sure 313 00:18:48,720 --> 00:18:51,520 Speaker 1: the storage service was as good as it could be 314 00:18:51,760 --> 00:18:56,159 Speaker 1: and just avoid including any extraneous options. Their motto was, 315 00:18:56,280 --> 00:19:00,000 Speaker 1: quote, the system should be made as simple as possible, 316 00:19:00,720 --> 00:19:04,200 Speaker 1: but no simpler, end quote. That's also a good point. 317 00:19:04,560 --> 00:19:08,040 Speaker 1: A bare bones approach is sometimes the best one, but 318 00:19:08,240 --> 00:19:11,639 Speaker 1: you do still need the bones to be there. The 319 00:19:11,760 --> 00:19:15,480 Speaker 1: architecture of the product can be described as objects, buckets, 320 00:19:15,520 --> 00:19:20,240 Speaker 1: and keys. Objects are essentially data, and that data could 321 00:19:20,240 --> 00:19:23,040 Speaker 1: be just about anything. 
S three doesn't care what the 322 00:19:23,160 --> 00:19:25,600 Speaker 1: data is. It could be video files, it could be 323 00:19:26,160 --> 00:19:29,760 Speaker 1: a game, it could be a database, it could be music, 324 00:19:29,800 --> 00:19:33,560 Speaker 1: it could be whatever. The objects have metadata that describes 325 00:19:33,680 --> 00:19:38,280 Speaker 1: what the object is and when it was last modified. Next, 326 00:19:38,359 --> 00:19:40,639 Speaker 1: you've got your buckets, and this is a kind of 327 00:19:40,680 --> 00:19:45,760 Speaker 1: classification system. So imagine you've got these objects, that is, files, 328 00:19:46,040 --> 00:19:48,120 Speaker 1: and you've got a lot of different types of them 329 00:19:48,280 --> 00:19:51,199 Speaker 1: that belong to a lot of different things. So you 330 00:19:51,280 --> 00:19:54,320 Speaker 1: might have a bucket for a specific kind of file, like 331 00:19:54,480 --> 00:19:58,720 Speaker 1: music files, or more likely, you might organize buckets according 332 00:19:58,720 --> 00:20:02,040 Speaker 1: to specific projects. So one project might have all its 333 00:20:02,080 --> 00:20:05,560 Speaker 1: objects sorted into one or more buckets that belong to 334 00:20:05,600 --> 00:20:09,280 Speaker 1: that project alone. Now, keys are a kind of I 335 00:20:09,440 --> 00:20:13,240 Speaker 1: D for each object inside a bucket, and each object has 336 00:20:13,480 --> 00:20:17,240 Speaker 1: one key, so you can find any object inside S 337 00:20:17,280 --> 00:20:20,400 Speaker 1: three if you have two pieces of information, the bucket 338 00:20:20,480 --> 00:20:23,320 Speaker 1: it is in and the key for the object. So 339 00:20:23,480 --> 00:20:26,960 Speaker 1: keys are used mainly for retrieval and, you know, that 340 00:20:27,080 --> 00:20:31,080 Speaker 1: kind of thing.
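To make the objects, buckets, and keys idea a little more concrete, here is a minimal sketch in Python. Fair warning: this is a toy in-memory model, not the actual S three API. The names ToyStore, StoredObject, create_bucket, put, and get are all made up for illustration, and the bucket and key names are hypothetical too.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StoredObject:
    """An object: the raw data plus a little metadata about it."""
    data: bytes
    content_type: str  # metadata: what the object is
    last_modified: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )  # metadata: when it was last modified


class ToyStore:
    """Hypothetical in-memory stand-in for a bucket/key object store."""

    def __init__(self):
        # bucket name -> {key -> StoredObject}
        self._buckets = {}

    def create_bucket(self, bucket):
        self._buckets.setdefault(bucket, {})

    def put(self, bucket, key, data, content_type="application/octet-stream"):
        # Each key identifies exactly one object within its bucket,
        # so writing to an existing key replaces the old object.
        self._buckets[bucket][key] = StoredObject(data, content_type)

    def get(self, bucket, key):
        # Two pieces of information are enough to find an object:
        # the bucket it is in and its key.
        return self._buckets[bucket][key]


store = ToyStore()
store.create_bucket("podcast-project")
store.put("podcast-project", "episodes/aws-outage.mp3",
          b"fake audio bytes", "audio/mpeg")

episode = store.get("podcast-project", "episodes/aws-outage.mp3")
print(episode.content_type, len(episode.data))
```

Notice that the store never looks inside the data. It only needs the bucket and the key, which fits the keep-it-simple design described above.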
The Amazon developers created a storage system 341 00:20:31,119 --> 00:20:35,159 Speaker 1: that was priced at fifteen cents per gigabyte of storage 342 00:20:35,240 --> 00:20:39,440 Speaker 1: space per month, at least at launch. It is significantly 343 00:20:39,520 --> 00:20:42,040 Speaker 1: cheaper than that now, and this tells you that Amazon 344 00:20:42,119 --> 00:20:46,000 Speaker 1: has scaled the service dramatically, and considering we're well into 345 00:20:46,040 --> 00:20:48,480 Speaker 1: the era of big data, that's a good thing for developers. 346 00:20:48,840 --> 00:20:54,080 Speaker 1: So today, Amazon's S three standard storage has three different 347 00:20:54,119 --> 00:20:57,760 Speaker 1: tiers of cost, which depends on how much storage you're 348 00:20:57,760 --> 00:20:59,560 Speaker 1: actually using, like how much data do you have in 349 00:20:59,600 --> 00:21:03,560 Speaker 1: the system. So let's say that you have fifty terabytes 350 00:21:03,720 --> 00:21:07,320 Speaker 1: or less in S three standard, that would mean that 351 00:21:07,359 --> 00:21:10,919 Speaker 1: you are looking at two point three cents per gigabyte 352 00:21:10,920 --> 00:21:13,959 Speaker 1: per month. Uh, if you've got more than five hundred 353 00:21:14,040 --> 00:21:16,760 Speaker 1: terabytes stored, that's on the other end of the scale, 354 00:21:17,400 --> 00:21:21,240 Speaker 1: then you're paying two point one cents per gigabyte per month. 355 00:21:21,640 --> 00:21:23,600 Speaker 1: And yeah, that adds up for companies that need to 356 00:21:23,600 --> 00:21:26,560 Speaker 1: store a lot of data. Anyway, I bring it up 357 00:21:26,560 --> 00:21:29,520 Speaker 1: to help illustrate how much things have changed. Fifteen cents 358 00:21:29,560 --> 00:21:32,760 Speaker 1: per gig per month is way, way, way, way, way 359 00:21:32,800 --> 00:21:35,880 Speaker 1: more expensive than two point three cents per gig per month.
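If you want to see that gap in dollars, here is a quick back-of-the-envelope sketch in Python. The two prices come straight from the figures above. Everything else is an assumption for illustration: I'm treating one terabyte as one thousand gigabytes, applying a single flat rate, and ignoring the tiering and any other charges. The function name monthly_cost is made up.

```python
LAUNCH_PRICE_PER_GB = 0.15    # dollars per GB per month at launch (figure from the episode)
QUOTED_PRICE_PER_GB = 0.023   # dollars per GB per month, lowest-usage tier (figure from the episode)
GB_PER_TB = 1000              # assumption: decimal terabytes


def monthly_cost(terabytes, price_per_gb):
    """Flat-rate monthly storage cost in dollars (ignores tiering)."""
    return terabytes * GB_PER_TB * price_per_gb


then = monthly_cost(1, LAUNCH_PRICE_PER_GB)
now = monthly_cost(1, QUOTED_PRICE_PER_GB)

print(f"1 TB at launch pricing: ${then:.2f} per month")
print(f"1 TB at the later price: ${now:.2f} per month")
print(f"Launch pricing was about {then / now:.1f} times higher")
```

Under those assumptions, a single terabyte goes from roughly one hundred fifty dollars a month to roughly twenty three dollars a month, which is about six and a half times cheaper.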
360 00:21:36,359 --> 00:21:39,040 Speaker 1: Oh and I should also mention that S three today 361 00:21:39,080 --> 00:21:43,000 Speaker 1: offers several other storage products that have other features and 362 00:21:43,119 --> 00:21:46,560 Speaker 1: costs associated with them. But this is not meant to 363 00:21:46,560 --> 00:21:48,600 Speaker 1: be an ad for S three, so we're just gonna 364 00:21:49,119 --> 00:21:52,639 Speaker 1: leave that for now. Anyway, S three, right out of the 365 00:21:52,640 --> 00:21:56,520 Speaker 1: gate, was successful. In fact, just two months after launch, 366 00:21:56,800 --> 00:22:00,120 Speaker 1: Amazon saw that demand had exceeded their projections by a factor 367 00:22:00,280 --> 00:22:03,760 Speaker 1: of one hundred. Today, there are more than one hundred 368 00:22:03,880 --> 00:22:07,320 Speaker 1: trillion objects stored in buckets in S three, and the 369 00:22:07,359 --> 00:22:10,080 Speaker 1: fact that the product could scale up to accommodate that 370 00:22:10,200 --> 00:22:13,800 Speaker 1: number of objects attests to good design decisions that were 371 00:22:13,840 --> 00:22:17,960 Speaker 1: made early on. Yeah, the organization system is simple, but 372 00:22:18,040 --> 00:22:21,440 Speaker 1: that simplicity also meant that S three could grow on demand, 373 00:22:21,680 --> 00:22:25,679 Speaker 1: which it did. In August two thousand six, Amazon launched 374 00:22:25,680 --> 00:22:28,159 Speaker 1: a new cloud based service, and this one was called 375 00:22:28,280 --> 00:22:31,639 Speaker 1: and still is called Amazon Elastic Compute Cloud or E 376 00:22:31,840 --> 00:22:36,240 Speaker 1: C two, and as the name suggests, this product offers 377 00:22:36,320 --> 00:22:41,040 Speaker 1: up a different element of computing, the actual compute part.
378 00:22:41,600 --> 00:22:44,600 Speaker 1: That is, this is a system that would allow customers 379 00:22:44,640 --> 00:22:48,800 Speaker 1: the chance to tap into on demand computing power. Now, 380 00:22:48,800 --> 00:22:51,080 Speaker 1: developers who had a great idea but who lacked the 381 00:22:51,119 --> 00:22:54,160 Speaker 1: money or space or both to build out a computer 382 00:22:54,200 --> 00:22:58,239 Speaker 1: facility could subscribe to E C two and lean on 383 00:22:58,280 --> 00:23:02,120 Speaker 1: Amazon's systems to do the work for them. Like S three, 384 00:23:02,359 --> 00:23:05,040 Speaker 1: this idea had its roots back in two thousand three. 385 00:23:05,080 --> 00:23:08,760 Speaker 1: A couple of Amazon engineers, Chris Pinkham and Benjamin Black, 386 00:23:08,920 --> 00:23:12,520 Speaker 1: had authored a memo suggesting a product that could give 387 00:23:12,560 --> 00:23:15,960 Speaker 1: developers the chance to run software on Amazon computer systems 388 00:23:16,040 --> 00:23:21,240 Speaker 1: specifically designated for that task. Around this same time, Amazon 389 00:23:21,400 --> 00:23:25,880 Speaker 1: introduced Simple Queue Service or Amazon S Q S. This 390 00:23:26,040 --> 00:23:29,040 Speaker 1: is a type of message queue, and by message I 391 00:23:29,040 --> 00:23:32,440 Speaker 1: mean the kinds of communications that go from service to service. 392 00:23:32,920 --> 00:23:35,080 Speaker 1: So let's say you're running an app on your phone 393 00:23:35,400 --> 00:23:38,240 Speaker 1: and the app might in the background send a request 394 00:23:38,359 --> 00:23:41,840 Speaker 1: to a remote server to get access to some data, 395 00:23:41,960 --> 00:23:44,399 Speaker 1: and that would be a message.
So S Q S 396 00:23:44,480 --> 00:23:47,320 Speaker 1: is a platform that queues up messages so that the 397 00:23:47,400 --> 00:23:50,800 Speaker 1: back end of a system can respond appropriately to requests. 398 00:23:51,359 --> 00:23:55,280 Speaker 1: That should give the end user a seamless experience. Now 399 00:23:55,320 --> 00:23:57,600 Speaker 1: there's a lot more to S Q S than that, 400 00:23:58,080 --> 00:24:00,760 Speaker 1: but I think that simple explanation will serve 401 00:24:00,840 --> 00:24:04,879 Speaker 1: us well enough for this episode. So these products, S three, 402 00:24:05,320 --> 00:24:08,639 Speaker 1: E C two and S Q S, kind of became 403 00:24:08,680 --> 00:24:12,240 Speaker 1: the backbone for what would grow into Amazon Web Services 404 00:24:12,320 --> 00:24:14,560 Speaker 1: as a whole. And there are a lot of other 405 00:24:14,720 --> 00:24:18,960 Speaker 1: focused products in that suite, but generally speaking, each one 406 00:24:19,040 --> 00:24:21,840 Speaker 1: is meant to be really good at doing something specific 407 00:24:22,000 --> 00:24:25,640 Speaker 1: without having that feature creep issue come into play. Amazon 408 00:24:25,760 --> 00:24:28,720 Speaker 1: got the jump on other big companies like Google and 409 00:24:28,760 --> 00:24:32,080 Speaker 1: Microsoft when it came to offering up cloud based computing products. 410 00:24:32,119 --> 00:24:34,760 Speaker 1: This gave Amazon the chance to establish a dominant position 411 00:24:34,840 --> 00:24:37,480 Speaker 1: in the market. I mean, when you're effectively the only 412 00:24:37,520 --> 00:24:40,920 Speaker 1: game in town, it's, you know, not hard to become dominant. 413 00:24:41,240 --> 00:24:45,160 Speaker 1: But today these other companies, Microsoft and Google and lots 414 00:24:45,160 --> 00:24:49,520 Speaker 1: more, have their own cloud computing services available.
Still, Amazon's 415 00:24:49,560 --> 00:24:52,000 Speaker 1: head start meant that the company still has a very 416 00:24:52,040 --> 00:24:56,800 Speaker 1: strong presence. According to Synergy Research Group, Amazon's share of 417 00:24:56,840 --> 00:25:00,760 Speaker 1: the cloud computing market is thirty two percent, or nearly 418 00:25:00,880 --> 00:25:04,760 Speaker 1: one third of the entire market. That's more than Microsoft 419 00:25:04,800 --> 00:25:08,600 Speaker 1: and Google's products combined. Together, those companies make up about 420 00:25:09,640 --> 00:25:12,919 Speaker 1: of the market. So about a third of all the 421 00:25:12,920 --> 00:25:15,920 Speaker 1: cloud computing business that's going on out there is going 422 00:25:15,920 --> 00:25:19,359 Speaker 1: through Amazon. And like I said, that includes tons of 423 00:25:19,400 --> 00:25:22,359 Speaker 1: different things from apps on your phone to video games 424 00:25:22,359 --> 00:25:26,719 Speaker 1: to Walt Disney World's virtual ticketing system. Now, I'm not 425 00:25:26,760 --> 00:25:29,600 Speaker 1: going to say that as long as AWS is 426 00:25:29,680 --> 00:25:33,800 Speaker 1: running smoothly, everything should go well, because all the products 427 00:25:33,840 --> 00:25:36,200 Speaker 1: that are built on top of AWS still 428 00:25:36,240 --> 00:25:38,360 Speaker 1: need to have a good design. I mean, it's possible 429 00:25:38,400 --> 00:25:42,640 Speaker 1: to make a really lousy product that's using AWS, and 430 00:25:42,840 --> 00:25:45,960 Speaker 1: it's not the fault of AWS if that product is lousy. 431 00:25:46,000 --> 00:25:48,480 Speaker 1: But as we saw last week, when things get hairy 432 00:25:48,880 --> 00:25:52,639 Speaker 1: on AWS, all the products that rely on those services, 433 00:25:52,720 --> 00:25:56,280 Speaker 1: they can be affected.
So last week, at approximately ten 434 00:25:56,359 --> 00:26:00,680 Speaker 1: thirty am on Tuesday, December seven, twenty twenty one, AWS 435 00:26:01,000 --> 00:26:03,280 Speaker 1: had what we in the tech biz call a 436 00:26:03,280 --> 00:26:06,600 Speaker 1: whoopsie. It was a whoopsie that lasted between 437 00:26:06,680 --> 00:26:09,400 Speaker 1: five and seven hours, depending upon the services you were 438 00:26:09,440 --> 00:26:14,200 Speaker 1: relying upon. And because AWS has this massive presence 439 00:26:14,240 --> 00:26:17,080 Speaker 1: in the market, and because so many big companies rely 440 00:26:17,240 --> 00:26:20,119 Speaker 1: on it in order to make their stuff work, that 441 00:26:20,160 --> 00:26:23,360 Speaker 1: whoopsie had a pretty big footprint. According to Amazon, 442 00:26:23,480 --> 00:26:26,000 Speaker 1: the issue was that there was a glitch in some 443 00:26:26,080 --> 00:26:29,840 Speaker 1: crucial networking hardware. And this hardware is in charge of 444 00:26:29,880 --> 00:26:34,159 Speaker 1: hosting what Amazon called foundational services, including stuff like E 445 00:26:34,359 --> 00:26:38,960 Speaker 1: C two, but also it handled stuff like Amazon's Domain 446 00:26:39,040 --> 00:26:42,000 Speaker 1: name service. Now, this service is kind of like the 447 00:26:42,040 --> 00:26:46,239 Speaker 1: liaison that connects human readable URL addresses with 448 00:26:46,440 --> 00:26:52,040 Speaker 1: machine readable addresses, and without it, you can tell your 449 00:26:52,240 --> 00:26:55,080 Speaker 1: browser to go to that particular website all you like, 450 00:26:55,280 --> 00:26:58,560 Speaker 1: but it ain't happening because the liaison is on like 451 00:26:58,600 --> 00:27:01,440 Speaker 1: a five to seven hour coffee break, and the machines 452 00:27:01,480 --> 00:27:05,440 Speaker 1: have no idea what you're on about.
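Here is a small Python sketch of that liaison idea, just to show the shape of it. The call to socket.gethostbyname is a real standard library function that turns a host name into an IPv4 address. The rest is illustration: the resolve wrapper and the resolver_on_coffee_break function are names I made up, and the second one only simulates a name service that isn't answering. It is not a model of what actually failed inside AWS.

```python
import socket


def resolve(hostname, lookup=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if the lookup fails."""
    try:
        return lookup(hostname)
    except OSError:
        # socket.gaierror, raised when a name can't be resolved,
        # is a subclass of OSError.
        return None


def resolver_on_coffee_break(hostname):
    """Hypothetical stand-in for a name service that is not answering."""
    raise socket.gaierror("name resolution unavailable")


address = resolve("example.com", lookup=resolver_on_coffee_break)
if address is None:
    print("No address found, so the browser has nowhere to connect.")
```

Passing the lookup function in as a parameter is only there so the failure can be simulated without touching the network. In normal use you would leave it at its default.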
Anyway, the AWS 453 00:27:05,520 --> 00:27:10,399 Speaker 1: internal system became overwhelmed, and that's something that usually doesn't happen. 454 00:27:10,640 --> 00:27:13,800 Speaker 1: Usually there's this cross network scaling system that kicks in 455 00:27:13,920 --> 00:27:17,439 Speaker 1: and meets increased demand. But this glitch essentially caused a 456 00:27:17,480 --> 00:27:21,399 Speaker 1: massive game of telephone within the AWS system, and it 457 00:27:21,520 --> 00:27:25,800 Speaker 1: overloaded all the circuits, to use a somewhat flimsy analogy. 458 00:27:25,840 --> 00:27:29,080 Speaker 1: So the glitch triggered what Amazon called quote a large 459 00:27:29,119 --> 00:27:32,960 Speaker 1: surge of connection activity that overwhelmed the networking devices between 460 00:27:32,960 --> 00:27:37,480 Speaker 1: the internal network and the main AWS network, resulting in 461 00:27:37,520 --> 00:27:41,280 Speaker 1: delays for communication between these networks end quote. So 462 00:27:41,320 --> 00:27:44,800 Speaker 1: it's almost like a classic denial of service attack, only 463 00:27:44,840 --> 00:27:48,920 Speaker 1: Amazon kind of did it to itself. I guess if we're 464 00:27:48,920 --> 00:27:51,880 Speaker 1: being fair, we would say the glitch caused it. Now, 465 00:27:51,960 --> 00:27:54,719 Speaker 1: a delay in communication normally just means you have an 466 00:27:54,720 --> 00:27:59,680 Speaker 1: irritating experience like lag, right, and you can manage that, 467 00:27:59,800 --> 00:28:02,800 Speaker 1: usually, but you know, it just makes whatever you're 468 00:28:02,840 --> 00:28:05,800 Speaker 1: doing more difficult.
Except a lot of systems have timeout 469 00:28:05,800 --> 00:28:08,400 Speaker 1: features, in which if there is a long enough 470 00:28:08,440 --> 00:28:11,800 Speaker 1: delay between sending a message and getting a response, you 471 00:28:11,880 --> 00:28:16,200 Speaker 1: reach a failed state, and that happened a lot last Tuesday. 472 00:28:17,119 --> 00:28:20,040 Speaker 1: What made matters more difficult was that Amazon's own real 473 00:28:20,119 --> 00:28:25,119 Speaker 1: time monitoring services rely on those internal AWS systems. I 474 00:28:25,119 --> 00:28:27,520 Speaker 1: mean, that's how AWS even got started, right? I mean, 475 00:28:27,560 --> 00:28:30,640 Speaker 1: it was Amazon building out its own infrastructure and then 476 00:28:30,680 --> 00:28:34,200 Speaker 1: offering up those capabilities to other companies. So that meant 477 00:28:34,400 --> 00:28:38,040 Speaker 1: the mitigation teams who were working to fix stuff didn't 478 00:28:38,080 --> 00:28:41,520 Speaker 1: have all their real time monitoring tools available as they 479 00:28:41,520 --> 00:28:44,200 Speaker 1: were tackling the problem, so that slowed down the recovery 480 00:28:44,280 --> 00:28:48,400 Speaker 1: quite a bit. Amazon has since apologized to customers for 481 00:28:48,480 --> 00:28:51,320 Speaker 1: this outage, and the reps now say that the company 482 00:28:51,400 --> 00:28:55,360 Speaker 1: is working to distribute its Service Health Dashboard across multiple regions, 483 00:28:55,360 --> 00:28:58,680 Speaker 1: so that should something similar happen in the future, the 484 00:28:58,760 --> 00:29:03,160 Speaker 1: fix should theoretically happen much more quickly. Uh, so, yeah, 485 00:29:03,200 --> 00:29:05,800 Speaker 1: this is another way for us to realize that we 486 00:29:05,880 --> 00:29:09,920 Speaker 1: have put a tremendous amount of trust and dependence upon 487 00:29:10,040 --> 00:29:15,120 Speaker 1: cloud services.
And it's another reminder that sometimes, like, you 488 00:29:15,160 --> 00:29:18,000 Speaker 1: could have designed everything yourself as good as it can 489 00:29:18,040 --> 00:29:21,000 Speaker 1: possibly be. You can have an incredible app, but if 490 00:29:21,040 --> 00:29:24,560 Speaker 1: the technology that powers that app goes down, it doesn't 491 00:29:24,560 --> 00:29:27,120 Speaker 1: matter how good your product is, right? And, you know, 492 00:29:27,240 --> 00:29:29,800 Speaker 1: since you don't control that, since you are dependent 493 00:29:29,880 --> 00:29:35,280 Speaker 1: upon a cloud, uh, provider, then if the cloud provider 494 00:29:35,320 --> 00:29:38,440 Speaker 1: has problems, that's really a big blow to your own 495 00:29:38,800 --> 00:29:43,160 Speaker 1: business plans. It's one of the reasons why companies really 496 00:29:43,200 --> 00:29:45,680 Speaker 1: debate on what services they want to put on the 497 00:29:45,720 --> 00:29:49,680 Speaker 1: cloud versus on premises. Um, it's a complicated thing too, 498 00:29:49,720 --> 00:29:54,200 Speaker 1: because scaling is such a tricky issue. Most companies, you 499 00:29:54,240 --> 00:29:57,960 Speaker 1: know, that aren't, like, huge Fortune five hundred companies don't have 500 00:29:58,040 --> 00:30:00,840 Speaker 1: the assets necessary to be able to scale, at 501 00:30:00,920 --> 00:30:03,600 Speaker 1: least not to the massive scales that we're seeing in 502 00:30:03,680 --> 00:30:08,000 Speaker 1: the global Internet space. Anyway, I hope you found this 503 00:30:08,040 --> 00:30:10,840 Speaker 1: episode interesting as we talked about AWS and what 504 00:30:10,960 --> 00:30:13,400 Speaker 1: happened last week. If you have suggestions for topics I 505 00:30:13,400 --> 00:30:16,360 Speaker 1: should cover on future episodes of tech Stuff, please reach 506 00:30:16,360 --> 00:30:17,960 Speaker 1: out to me.
The best way to do that is 507 00:30:18,000 --> 00:30:20,760 Speaker 1: on Twitter. The handle for the show is tech Stuff 508 00:30:21,160 --> 00:30:25,680 Speaker 1: H S W and I'll talk to you again really soon. 509 00:30:30,760 --> 00:30:33,800 Speaker 1: Tech Stuff is an I Heart Radio production. For more 510 00:30:33,880 --> 00:30:37,280 Speaker 1: podcasts from I Heart Radio, visit the I Heart Radio app, 511 00:30:37,400 --> 00:30:40,560 Speaker 1: Apple podcasts, or wherever you listen to your favorite shows.