WEBVTT - Could we make a sarcastic supercomputer? 0:00:04.120 --> 0:00:07.160 Get in touch with technology with TechStuff from 0:00:07.200 --> 0:00:14.080 HowStuffWorks.com. Hey there, and welcome to TechStuff. 0:00:14.120 --> 0:00:16.960 I'm your host, Jonathan Strickland. I'm an executive producer with 0:00:17.000 --> 0:00:20.560 HowStuffWorks and I love all things tech. Today, 0:00:21.079 --> 0:00:24.000 I want to talk to you about an interesting topic 0:00:24.160 --> 0:00:26.599 that I got to explore a couple of years ago 0:00:27.120 --> 0:00:31.760 with Joe McCormick and Lauren Vogelbaum, as we debated the 0:00:31.840 --> 0:00:37.800 possibilities of computers learning how to understand sarcasm. We did 0:00:37.840 --> 0:00:41.160 it for a podcast called Forward Thinking, which was around 0:00:41.200 --> 0:00:42.920 for a couple of years. It was a lot of 0:00:42.920 --> 0:00:46.040 fun to work on that. That show is over, but 0:00:46.080 --> 0:00:48.640 I thought I would revisit the topic and talk about 0:00:48.680 --> 0:00:52.120 it for you guys and kind of go over what 0:00:52.280 --> 0:00:54.800 it would take to have a computer that could actually 0:00:54.880 --> 0:00:59.400 understand when someone's being sarcastic. Now, to understand why this 0:00:59.440 --> 0:01:02.360 is a big deal, it helps to have a refresher 0:01:02.400 --> 0:01:05.679 course on how computers process information. And I know I 0:01:05.720 --> 0:01:08.560 talk about this a lot, but I still think it's 0:01:08.560 --> 0:01:11.200 important to cover the basics when you want to talk 0:01:11.240 --> 0:01:14.840 about something as advanced as being able to detect and 0:01:15.040 --> 0:01:21.200 understand sarcasm. So computers understand machine code or assembly language. 
0:01:21.480 --> 0:01:25.320 This is a language that corresponds with the actual physical 0:01:25.600 --> 0:01:30.319 architecture of the computers, so the way the computer is built, 0:01:30.680 --> 0:01:33.880 that's how this language interacts. It's it's essentially how the 0:01:33.959 --> 0:01:39.119 physical components of the computer are able to handle electric 0:01:39.200 --> 0:01:45.560 current or voltage differences in order to process information, and 0:01:45.880 --> 0:01:51.680 computers can interpret this and execute upon this language very quickly. 0:01:52.240 --> 0:01:56.160 It is the basic language of those physical components. However, 0:01:57.000 --> 0:02:00.880 it is almost impossible for human to work with this, 0:02:01.200 --> 0:02:04.040 at least on a way that is at all efficient, 0:02:04.480 --> 0:02:10.800 because it ultimately for most computers boils down to binary language, right, 0:02:11.000 --> 0:02:16.120 zeros and ones, So you see a huge block of 0:02:16.200 --> 0:02:18.799 zeros and ones, and unless you are neo from the matrix, 0:02:18.840 --> 0:02:22.920 it means nothing to you. So we speak in natural 0:02:23.240 --> 0:02:27.480 language to one another. Natural language, however, is filled with 0:02:27.520 --> 0:02:31.640 a lot of components that make it very very challenging 0:02:31.680 --> 0:02:36.160 for machines to interpret, like ambiguity, or there might be 0:02:36.200 --> 0:02:39.200 double meanings in a phrase and you may mean both 0:02:39.280 --> 0:02:43.960 meanings at the same time, and that is too complicated 0:02:44.000 --> 0:02:46.200 for most machines to be able to process. They just 0:02:46.240 --> 0:02:50.560 can't deal with that. So to bridge the gap between 0:02:50.760 --> 0:02:54.480 the way we humans communicate and the way that computers 0:02:54.600 --> 0:03:00.440 process language. We have created programming languages and compilers. 
Now, 0:03:00.800 --> 0:03:04.760 programming languages fall into two broad categories. It's more like 0:03:05.080 --> 0:03:07.920 a spectrum, and you could be further on one end 0:03:08.000 --> 0:03:11.320 than the other, and we typically call them high level 0:03:11.560 --> 0:03:15.960 programming languages and low level programming languages. The lower the 0:03:16.120 --> 0:03:19.920 level of programming language, the closer it is to machine code, 0:03:20.560 --> 0:03:23.399 and the easier it is for a computer to understand, 0:03:23.800 --> 0:03:26.040 but the harder it is to work with if you 0:03:26.080 --> 0:03:29.200 happen to be, you know, a human being. High level 0:03:29.240 --> 0:03:33.480 programming languages are easier for humans to understand. Now, if 0:03:33.520 --> 0:03:36.960 you have never taken any courses in programming and you're 0:03:37.000 --> 0:03:41.040 looking at a page of code, it can seem indecipherable 0:03:41.040 --> 0:03:46.360 to you. It is just meaningless strings of characters. But 0:03:47.240 --> 0:03:50.720 once you learn the rules of that programming language, how 0:03:50.800 --> 0:03:54.800 you construct an instruction and a series of instructions, how 0:03:54.840 --> 0:03:57.640 you go from one instruction to the next. Once you 0:03:57.720 --> 0:04:00.920 understand the rules, it actually becomes quite easy to use 0:04:01.160 --> 0:04:03.200 in the grand scheme of things, much more easy than 0:04:03.280 --> 0:04:06.880 machine language would be. But again, the problem here is 0:04:06.920 --> 0:04:11.960 that computers don't understand programming languages, not natively. Even though 0:04:12.480 --> 0:04:15.280 this is not exactly the same as human natural language, 0:04:15.280 --> 0:04:18.039 it's also not the same as machine language. That's why 0:04:18.040 --> 0:04:23.719 you need compilers. A compiler is essentially a translator. 
It 0:04:23.880 --> 0:04:28.479 takes this high level programming language or higher level anyway, 0:04:28.560 --> 0:04:32.080 and then converts it into a machine readable language for 0:04:32.120 --> 0:04:35.400 the computer to actually execute upon. And this is all 0:04:35.440 --> 0:04:39.080 in the design of the programming languages and the compilers. 0:04:40.040 --> 0:04:44.039 So this is the way that for decades we have 0:04:44.120 --> 0:04:46.760 interacted with computers, when you're talking about it on a 0:04:46.839 --> 0:04:49.680 on a direct level, not just executing a program, but 0:04:49.839 --> 0:04:54.599 creating code, creating programs for computers to run. Over the 0:04:54.720 --> 0:04:58.960 last few decades, we've had some very very smart people 0:04:59.520 --> 0:05:05.599 working on natural language systems for machines which would allow 0:05:05.839 --> 0:05:12.560 a computer to interpret natural language in a way that 0:05:12.560 --> 0:05:14.920 would make some sort of sense and for the computer 0:05:14.960 --> 0:05:17.320 to be able to act upon that language. And we've 0:05:17.360 --> 0:05:22.479 seen this in plenty of examples recently. Most smartphones have 0:05:22.680 --> 0:05:26.560 some sort of smart assistant. You have standalone products like 0:05:26.720 --> 0:05:31.000 Amazon's Echo, you have Google Home, You've got tons of 0:05:31.080 --> 0:05:37.080 devices that can interact with people. It can be activated 0:05:37.120 --> 0:05:39.800 by typically an alert phrase, which I'm not going to 0:05:39.880 --> 0:05:41.680 say because I don't want any of you guys to 0:05:41.720 --> 0:05:43.880 have to deal with that. 
I know how irritating it 0:05:43.960 --> 0:05:47.400 is when I'm watching a video and someone activates their 0:05:48.760 --> 0:05:52.920 specific system and then mine begins to respond and all 0:05:52.920 --> 0:05:55.640 my lights start going on and off because the people 0:05:55.640 --> 0:05:58.560 on YouTube were talking funny. I know how irritating that is. 0:05:58.600 --> 0:06:01.680 But you use that to activate it, and then you can speak, 0:06:02.080 --> 0:06:06.400 and typically you can say the same thing several different 0:06:06.400 --> 0:06:11.520 ways and the device appears to understand you no matter 0:06:11.560 --> 0:06:14.279 how you word it. And this is a real challenge 0:06:14.279 --> 0:06:17.120 because we human beings can find lots of different ways 0:06:17.560 --> 0:06:20.360 to say the same thing. For example, if I say, 0:06:20.400 --> 0:06:23.560 "What is the weather today?" it could be very similar 0:06:23.600 --> 0:06:25.640 to if I ask the question, "Is it 0:06:25.720 --> 0:06:29.120 going to rain today?" Both of those are asking for 0:06:29.160 --> 0:06:32.560 information about the weather, but are very different ways of 0:06:32.600 --> 0:06:36.760 saying that. A good natural language recognition program will be 0:06:36.800 --> 0:06:42.360 able to parse that information and then return the appropriate response. 0:06:43.600 --> 0:06:46.760 This is not an easy thing to do. Typically it 0:06:46.800 --> 0:06:50.880 involves creating a neural network structure, and I've talked about 0:06:50.960 --> 0:06:55.640 artificial neural networks recently. That's typically a network that 0:06:55.720 --> 0:07:01.440 can accept multiple binary inputs, so either a zero or 0:07:01.520 --> 0:07:06.640 a one input that represents something, uh, some sort of yes 0:07:06.720 --> 0:07:10.440 or no, on or off kind of feature. 
It can accept 0:07:10.560 --> 0:07:14.760 multiple inputs of that nature, so multiple zeros or 0:07:14.840 --> 0:07:18.920 ones that all factor into making a decision, and then 0:07:18.920 --> 0:07:22.720 it has a weighting for each of those components, and 0:07:22.760 --> 0:07:26.400 then it produces a single output that's also binary in nature, 0:07:26.440 --> 0:07:28.920 either a zero or a one, and it passes that on to 0:07:29.240 --> 0:07:33.440 other artificial neurons further down the chain. Sometimes that will 0:07:33.480 --> 0:07:37.080 come back around and you have a recurrent artificial neural network. 0:07:37.440 --> 0:07:42.920 The goal here is for this process to ultimately result 0:07:43.760 --> 0:07:49.080 in a response that is reasonably certain to meet the 0:07:49.120 --> 0:07:52.800 requirements of the person asking the question. This tends to 0:07:52.800 --> 0:07:56.720 be talked about in the realm of probabilities. We 0:07:56.760 --> 0:08:00.280 talk about how certain the machine is that the response 0:08:00.400 --> 0:08:03.240 is the appropriate one, and if it falls below a 0:08:03.280 --> 0:08:07.800 certain threshold, then the machine would typically respond with "I'm sorry, 0:08:07.840 --> 0:08:10.040 I don't know what you're asking for," or something similar 0:08:10.080 --> 0:08:13.840 to that. There are cases where you just get misinterpreted 0:08:13.960 --> 0:08:16.559 and you'll get a response that does not reflect whatever 0:08:16.600 --> 0:08:18.760 you asked. That's a little different. That's where the machine 0:08:18.760 --> 0:08:22.760 has drawn a conclusion, has been reasonably certain that it 0:08:22.800 --> 0:08:24.680 came to the right conclusion, and it turns out it was 0:08:24.720 --> 0:08:29.240 wrong the whole way. But that's the process. 
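The artificial neuron described here can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple threshold rule for turning the weighted sum into a zero or a one; the function name `neuron`, the example features, and the weights are all made up for the example and do not come from any particular assistant or library.

```python
def neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the binary inputs reaches the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Hypothetical yes/no features for the question "is it going to rain today?"
inputs = [1, 0, 1]          # cloudy? windy? forecast mentions rain?
weights = [0.4, 0.1, 0.6]   # how much each feature counts (assumed values)

# The single binary output would be passed on to neurons further down the chain.
print(neuron(inputs, weights, threshold=0.8))
```

Real networks usually learn the weights from examples instead of having them written by hand, and many use smoother outputs than a hard zero or one, but the weigh-then-decide shape is the same.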
Now, when 0:08:29.280 --> 0:08:36.559 it comes to sarcasm, that adds yet another layer of difficulty, 0:08:37.320 --> 0:08:42.120 because now a machine isn't just parsing what you are saying. 0:08:42.520 --> 0:08:46.520 It has to understand what you mean, the meaning of 0:08:46.559 --> 0:08:51.480 your words and the meaning of the way you deliver them. 0:08:51.480 --> 0:08:54.120 It could be different. So if I were to just 0:08:54.240 --> 0:08:59.360 write out a phrase with no tone, no body language, uh, 0:08:59.600 --> 0:09:03.920 not emphasizing any one word over another, it might be 0:09:04.040 --> 0:09:08.319 very difficult to detect what my intent was. It may 0:09:08.360 --> 0:09:11.559 seem like I'm being sincere, when in fact I'm being insincere. 0:09:11.840 --> 0:09:16.280 For example, uh, if I were to say "that guy 0:09:16.400 --> 0:09:22.040 is super tall," but I'm being sarcastic, then just in 0:09:22.080 --> 0:09:25.440 that phrase the way I write it out, you would think, oh, well, 0:09:25.480 --> 0:09:29.959 that person he's looking at must be super tall. How 0:09:30.000 --> 0:09:34.120 do you recognize sarcasm? How can you detect that this 0:09:34.200 --> 0:09:37.280 is in place and then understand what the meaning underneath 0:09:37.320 --> 0:09:41.760 it is? One of the approaches that has been put 0:09:41.800 --> 0:09:48.480 forward relates to IBM's Watson platform. Now, Watson first made 0:09:48.480 --> 0:09:52.440 headlines back when it was a contestant on Jeopardy. It 0:09:52.720 --> 0:09:56.880 went up against two former champions, including Ken Jennings, who 0:09:57.000 --> 0:10:00.240 shows up on a HowStuffWorks podcast. Anyway, Watson 0:10:00.280 --> 0:10:03.840 went up against these two former champions and it was 0:10:03.920 --> 0:10:07.160 able to interpret natural language. It had to in order 0:10:07.200 --> 0:10:09.120 to play the game of Jeopardy. 
And for those who 0:10:09.200 --> 0:10:11.920 do not know what Jeopardy is or are not familiar 0:10:11.920 --> 0:10:15.120 with the game show, Jeopardy is a game where you 0:10:15.160 --> 0:10:21.079 are presented with categories of trivia and each category has 0:10:21.200 --> 0:10:27.679 multiple, uh, questions or multiple entries in it, and they 0:10:27.800 --> 0:10:33.360 range in dollar value, and the lower dollar value ones 0:10:33.400 --> 0:10:37.000 are easier to answer than the higher dollar value ones, 0:10:38.120 --> 0:10:41.680 and, uh, typically the way Jeopardy works is that 0:10:41.720 --> 0:10:44.600 you're given quote unquote the answer and you have 0:10:44.679 --> 0:10:49.840 to provide the question. So, uh, if the answer were 0:10:51.360 --> 0:10:57.440 "this film that detailed the adventures of a young playwright 0:10:57.640 --> 0:11:01.920 in sixteenth century England won Best Picture," you would say, "What 0:11:02.080 --> 0:11:06.240 was Shakespeare in Love?" So this computer is playing against 0:11:06.240 --> 0:11:08.920 these two former champions. This was sort of an exhibition 0:11:09.480 --> 0:11:14.160 series of games. It wasn't meant for, uh, a competition 0:11:14.200 --> 0:11:16.480 in the way the typical Jeopardy games were. There was 0:11:16.559 --> 0:11:19.960 money on the line. It was an exhibition and Watson 0:11:20.000 --> 0:11:23.160 won. It beat both of the champions, and it did 0:11:23.160 --> 0:11:26.440 what I was telling you. It would analyze the 0:11:26.600 --> 0:11:30.719 clue that was given, the answer that was given, it 0:11:30.760 --> 0:11:33.959 would try and generate a question to correspond with that answer, 0:11:34.360 --> 0:11:37.480 and only if the question met a certain threshold of 0:11:37.520 --> 0:11:40.600 confidence would Watson buzz in. If it did not meet 0:11:40.960 --> 0:11:45.040 that level of confidence, Watson would remain quiet. 
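The buzz-in rule is the same confidence threshold idea that voice assistants use when they fall back to "I'm sorry, I don't know what you're asking for." Here is a rough sketch of that decision in Python. The function name `choose_response`, the 0.7 cutoff, and the confidence numbers are hypothetical; this is not how Watson was actually built, only the shape of the decision as described.

```python
def choose_response(candidates, threshold=0.7):
    """Return the most confident answer, or None when nothing clears the threshold.

    `candidates` maps each possible answer to a confidence between 0 and 1.
    """
    if not candidates:
        return None
    answer, confidence = max(candidates.items(), key=lambda item: item[1])
    return answer if confidence >= threshold else None


# Made-up confidence scores for a single clue.
scores = {"What is Shakespeare in Love?": 0.92, "What is Hamlet?": 0.05}
print(choose_response(scores))                      # confident enough to buzz in
print(choose_response({"What is Hamlet?": 0.30}))   # stays quiet, returns None
```

Notice that the rule only guards against low confidence. A system that is confidently wrong still answers, which is the misinterpretation case mentioned earlier.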
And most importantly, 0:11:45.320 --> 0:11:47.920 Watson was not at all connected to the Internet. All 0:11:48.000 --> 0:11:53.640 the information was contained within a massive series of servers, 0:11:54.559 --> 0:11:57.080 more than, gosh, I can't even remember. There's a ton 0:11:57.160 --> 0:12:02.440 of processors attached to it. Um, so a very powerful machine, 0:12:03.520 --> 0:12:09.640 but it still wasn't exactly able to detect sarcasm. It 0:12:09.720 --> 0:12:14.040 could work with wordplay, and it could work with riddles, 0:12:14.040 --> 0:12:16.960 so that was really impressive. But what it really did 0:12:17.000 --> 0:12:19.560 was it gave IBM the opportunity to say, we have 0:12:19.720 --> 0:12:24.360 this platform here, and we're welcoming developers to create applications 0:12:24.400 --> 0:12:28.160 that tap into this platform and make use of this 0:12:28.880 --> 0:12:32.640 in order to do interesting stuff with it. And IBM 0:12:32.720 --> 0:12:35.319 was largely working with the medical industry at that point 0:12:35.360 --> 0:12:41.600 to try and help doctors treat and diagnose patients, and 0:12:41.679 --> 0:12:43.760 it was sort of computer guidance. It wasn't that you 0:12:43.840 --> 0:12:47.960 had an automatic doctor, but rather the doctor had what 0:12:48.320 --> 0:12:53.480 equates to a medical expert to confer with when trying 0:12:53.520 --> 0:12:56.760 to determine what's the best course of action for a patient. 0:12:57.800 --> 0:13:01.120 IBM put up an application programming interface, or API, 0:13:01.640 --> 0:13:06.320 and let developers create their own cognitive computing applications built 0:13:06.400 --> 0:13:10.600 on top of Watson. One of those was called the 0:13:10.640 --> 0:13:14.680 Tone Analyzer. It still exists, as it did back when we were doing 0:13:14.679 --> 0:13:18.120 this episode for Forward Thinking. 
It was in the form 0:13:18.400 --> 0:13:21.520 of analyzing some text and telling you whether or not 0:13:22.040 --> 0:13:26.120 that text would come across as agreeable or argumentative, or 0:13:26.200 --> 0:13:31.439 positive or negative, and it would assign tone to those pieces. 0:13:32.040 --> 0:13:35.040 I'll explain more about how it did it and what it 0:13:35.120 --> 0:13:37.560 did in just a minute, but first let's take a 0:13:37.640 --> 0:13:48.360 quick break to thank our sponsor. So how did this 0:13:48.440 --> 0:13:53.920 tone analyzer work? It would search for cues in any 0:13:54.080 --> 0:13:59.480 written text, social cues, written cues, emotional cues, in order 0:13:59.520 --> 0:14:02.760 to determine the overall tone of a piece, which 0:14:02.800 --> 0:14:07.640 actually meant that the analyzer would tag individual words within 0:14:07.960 --> 0:14:13.160 a text, words that it recognized and had already 0:14:13.280 --> 0:14:17.319 pre-labeled as falling into various categories. So words that might 0:14:17.360 --> 0:14:23.880 have a positive meaning like happy, glad, joy, things like that, 0:14:23.880 --> 0:14:27.480 those would get tagged as cheerful. It would 0:14:27.480 --> 0:14:31.040 then assign all the individual words tags and then tally 0:14:31.120 --> 0:14:33.680 everything up. So let's say you've got a bunch of 0:14:33.680 --> 0:14:39.000 sentences and it starts individually labeling certain words as being 0:14:39.120 --> 0:14:44.240 cheerful or sad or angry or helpful, and then it 0:14:44.280 --> 0:14:46.680 adds it all up and then would give you a percentage. 0:14:47.120 --> 0:14:52.880 So a message might be agreeable or thirty percent conscientious. You 0:14:52.880 --> 0:14:55.760 would actually get multiples of these, and that would just 0:14:55.800 --> 0:14:59.600 really indicate the density of those types of words within 0:14:59.640 --> 0:15:04.240 the message itself. 
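The tag-and-tally approach described here is simple enough to sketch directly. The following is a toy version in Python, assuming a tiny hand-written lexicon; `TONE_LEXICON` and `tone_density` are made-up names for illustration and are not part of IBM's service, whose real word lists and scoring are not described in this episode.

```python
import re
from collections import Counter

# Hypothetical pre-labeled words. A real lexicon would be far larger.
TONE_LEXICON = {
    "happy": "cheerful", "glad": "cheerful", "joy": "cheerful",
    "sad": "sad", "miserable": "sad",
    "furious": "angry", "hate": "angry",
}


def tone_density(text):
    """Tag each known word with its tone, then report each tone as a percentage of all words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    counts = Counter(TONE_LEXICON[w] for w in words if w in TONE_LEXICON)
    return {tone: 100 * n / len(words) for tone, n in counts.items()}


# Two of the six words are tagged cheerful, so cheerful comes out at about 33 percent.
print(tone_density("I am glad and happy today"))
```

Because each percentage is just a count of tagged words divided by the total, several tones can show up for the same message, which matches the "multiples of these" behavior described above.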
Now, in an ideal world, if 0:15:04.320 --> 0:15:08.960 language were very simple to understand and interpret by machines, 0:15:09.480 --> 0:15:12.960 this would help you gauge how people would respond to 0:15:13.080 --> 0:15:17.360 your work. Right, So, you could write a message. Before 0:15:17.400 --> 0:15:20.400 you send it, you put it through the tone analyzer 0:15:20.800 --> 0:15:25.000 and it tells you what sort of a tone you 0:15:25.040 --> 0:15:28.360 are setting. So if you wanted to create a business letter, 0:15:28.960 --> 0:15:30.840 you could send it through this tone analyzer, and if 0:15:30.840 --> 0:15:33.760 it came back as saying it's coming across as as 0:15:33.840 --> 0:15:37.320 a indecisive, you might want to go back in and 0:15:37.480 --> 0:15:40.680 edit that message so that you can make a more 0:15:41.080 --> 0:15:46.640 straightforward and decisive message and not give the wrong impression 0:15:46.720 --> 0:15:50.320 before you send the message out to your actual human recipient, 0:15:50.680 --> 0:15:53.280 and come up with alternate word choices in order to 0:15:53.280 --> 0:15:55.200 make sure that your message is received the way you 0:15:55.240 --> 0:15:58.560 intended it. And anyone who has communicated over the internet 0:15:58.600 --> 0:16:01.280 can think of ways that this might have been helpful 0:16:01.320 --> 0:16:05.400 in the past, because again, language depends on so many 0:16:05.520 --> 0:16:09.800 different elements to get your meaning across, and when you 0:16:09.840 --> 0:16:14.520 reduce it to the written form, especially the written form online, 0:16:14.560 --> 0:16:19.239 where we tend to be very short with our our communication, 0:16:19.400 --> 0:16:22.880 it comes in very quick bursts, a couple of sentences 0:16:22.880 --> 0:16:25.960 here or there. We lack all that body language, we 0:16:26.040 --> 0:16:29.320 lack that tone. It's very easy to misinterpret. 
I'm sure 0:16:29.360 --> 0:16:32.440 there's been an example in your life where either you 0:16:32.520 --> 0:16:35.080 got offended from receiving something that was meant in a 0:16:35.120 --> 0:16:38.360 way that was different from the way you interpreted it, 0:16:38.480 --> 0:16:40.920 or the reverse happened where you sent a message and 0:16:41.000 --> 0:16:45.320 somebody had a reaction you did not anticipate because they 0:16:45.360 --> 0:16:48.240 could not tell what tone you were using just from 0:16:48.280 --> 0:16:51.960 the words you were using. Machines have that same problem. 0:16:52.200 --> 0:16:55.760 In the future, an analyzer like this tone analyzer 0:16:55.760 --> 0:17:00.280 could be incorporated into word processors or email servers, 0:17:00.360 --> 0:17:03.920 or email services, I should say, or social media platforms. 0:17:04.240 --> 0:17:06.879 So you start typing in your message, and before you 0:17:06.960 --> 0:17:11.159 hit publish or post or send, you could analyze that text. 0:17:11.680 --> 0:17:13.560 It could tell you what the tone is, and then 0:17:13.600 --> 0:17:16.440 you could say, oh, no, that's gonna come across totally 0:17:16.600 --> 0:17:18.840 the wrong way, and you could actually fix it before 0:17:18.920 --> 0:17:21.000 you posted it or sent it, and then you wouldn't 0:17:21.040 --> 0:17:24.680 have that awkward decision of whether or not to edit something, or, 0:17:24.720 --> 0:17:27.639 in the case of Twitter, which continues to refuse to 0:17:27.680 --> 0:17:30.919 allow you to edit tweets, to delete a tweet. 
I 0:17:31.000 --> 0:17:33.960 deleted a tweet the other day when I posted a 0:17:34.040 --> 0:17:36.679 link to a news story, and I had done a 0:17:36.760 --> 0:17:40.080 rookie mistake, one that I try to avoid, but I 0:17:40.640 --> 0:17:43.800 did it this pastime, which is that I didn't think 0:17:43.840 --> 0:17:46.040 to look at the date when the news item had 0:17:46.080 --> 0:17:49.240 been published, and had been published a full year earlier, 0:17:49.600 --> 0:17:51.919 so it was not new news, it was old news. 0:17:52.440 --> 0:17:55.240 And uh then deleted the tweet and it wasn't up 0:17:55.280 --> 0:17:57.520 for long, but I still felt dumb about it. It 0:17:57.520 --> 0:17:59.239 would have been nice to have been able to check that. 0:17:59.440 --> 0:18:02.119 Although that's not tone obviously, that's but similar in the 0:18:02.840 --> 0:18:06.200 and the idea that you want to check before you 0:18:06.920 --> 0:18:10.240 end up offending someone, unless you're one of those jerk 0:18:10.320 --> 0:18:13.000 faces that just sets out to offend people, in which case, 0:18:14.000 --> 0:18:16.960 rethink your strategy. There are better things to do. It's 0:18:17.080 --> 0:18:19.240 just as you can make just as big an impact 0:18:19.320 --> 0:18:21.960 being a positive person as you can being a jerk face. 0:18:22.320 --> 0:18:23.960 I know it can seem like it's more work, but 0:18:24.000 --> 0:18:27.600 it's also more rewarding in the long run. Okay, soapbox done. So. 0:18:27.960 --> 0:18:31.440 There is a demo of the tone analyzer that's available online, 0:18:32.080 --> 0:18:36.080 and back when we were recording Forward Thinking, the demo 0:18:36.480 --> 0:18:39.240 worked in a way where it would tell you about 0:18:39.280 --> 0:18:42.760 emotional tone and break it down by percentage. 
It's a 0:18:42.760 --> 0:18:46.199 little different now, but I want to tell you the 0:18:46.920 --> 0:18:50.639 what words and the results we got in the past 0:18:50.760 --> 0:18:53.840 because they were so much fun. Granted you would get 0:18:53.880 --> 0:18:56.520 a different result now because the tone analyzer has been 0:18:56.560 --> 0:19:00.000 tweaked since we recorded that episode. So when we recorded 0:19:00.040 --> 0:19:03.680 that episode, one of my co hosts decided to put 0:19:03.760 --> 0:19:08.560 a sentence that is somewhat known in literary circles into 0:19:08.560 --> 0:19:10.879 this tone analyzer and find out what it said. And 0:19:10.960 --> 0:19:14.879 the sentence used was it is a truth universally acknowledged 0:19:15.080 --> 0:19:17.640 that a single man in possession of a good fortune 0:19:17.960 --> 0:19:21.240 must be in want of a wife. Now, the analyzer 0:19:21.800 --> 0:19:26.560 said that this emotional tone was cheerful, the social tone 0:19:26.680 --> 0:19:31.000 was seventy six percent open and fifty agreeable, and the 0:19:31.080 --> 0:19:35.760 writing tone was analytical. You can also view the sentence 0:19:35.840 --> 0:19:38.520 in terms of word count as opposed to the weighted 0:19:38.600 --> 0:19:41.840 value of individual words, and using that view, five percent 0:19:41.960 --> 0:19:46.440 of the sentence sentences were in an emotional tone, in 0:19:46.480 --> 0:19:49.879 a social tone, and five percent in a writing tone. Now, 0:19:50.280 --> 0:19:54.240 the analyzer highlights each word according to how it classifies them, 0:19:54.680 --> 0:19:58.520 So emotional words would be highlighted in red or pink 0:19:58.600 --> 0:20:01.439 in that older version of the tone analyzer, social words 0:20:01.680 --> 0:20:05.280 would show up in blue, and writing tones would be 0:20:05.359 --> 0:20:07.879 in green. 
And you could click on any word and 0:20:07.880 --> 0:20:10.720 the analyzer would offer alternative words that you might want 0:20:10.720 --> 0:20:14.159 to use and classify those words in the tones that 0:20:14.320 --> 0:20:16.639 they are associated with, such that you could shape your message 0:20:16.680 --> 0:20:19.439 to meet the tone you wish to convey. Also, the 0:20:19.560 --> 0:20:24.320 tone analyzer demo used the business letter format as the 0:20:24.320 --> 0:20:28.440 means of comparison. So, in other words, we compared Jane 0:20:28.480 --> 0:20:32.320 Austen to a business letter. Presumably if you were to 0:20:32.480 --> 0:20:34.960 use a full version of the analyzer, not just the 0:20:34.960 --> 0:20:37.720 demo version, you would have other options so you could 0:20:38.080 --> 0:20:42.160 compare it with other models, not just a business letter. 0:20:42.600 --> 0:20:49.640 Joe McCormick, he included an excerpt from Dostoyevsky's Notes from Underground. 0:20:49.680 --> 0:20:53.639 That excerpt was, "I could not become anything, neither good 0:20:53.680 --> 0:20:57.280 nor bad, neither a scoundrel nor an honest man, neither 0:20:57.359 --> 0:21:00.800 a hero nor an insect. And now I am eking out 0:21:00.920 --> 0:21:04.760 my days in my corner, taunting myself with the bitter 0:21:04.960 --> 0:21:09.879 and entirely useless consolation that an intelligent man cannot seriously 0:21:09.960 --> 0:21:14.600 become anything, that only a fool can become something." The 0:21:14.640 --> 0:21:19.480 feedback was that the emotional tone had anger at cheerfulness 0:21:19.560 --> 0:21:24.879 at so happy anger negative at. The social tone was 0:21:25.880 --> 0:21:31.080 agreeable zero percent conscientious, zero percent open. The writing tone 0:21:31.119 --> 0:21:36.600 was analytical, zero percent confident and tentative. 
Joe would actually 0:21:36.720 --> 0:21:39.760 end up highlighting some of the words to find out 0:21:39.920 --> 0:21:42.359 which words were the ones that ended up giving that 0:21:43.600 --> 0:21:47.920 cheerfulness result. Those four words were good, honest, hero, 0:21:48.200 --> 0:21:55.040 and intelligent, and that's important 0:21:55.280 --> 0:21:59.399 because those words, the way they are used, uh, in 0:21:59.480 --> 0:22:03.680 that passage, are not used in a positive sense. They 0:22:03.720 --> 0:22:09.000 are positive words, but they're meant to show kind of 0:22:09.040 --> 0:22:15.280 a negation there, not an assertion. So that 0:22:15.359 --> 0:22:18.720 really highlights a big problem in this tone analyzer, which 0:22:18.760 --> 0:22:24.719 is that it's tagging these words individually without context. So 0:22:24.800 --> 0:22:28.680 if I wrote the phrase "I am not glad," it 0:22:28.720 --> 0:22:31.520 would tag the word glad and say that's a cheerful word. 0:22:32.200 --> 0:22:35.879 But I said "I am not glad." If I 0:22:35.960 --> 0:22:38.960 told you I am not glad, you would not think, oh, well, 0:22:38.960 --> 0:22:40.919 that's a cheerful thing to say or a positive thing 0:22:40.960 --> 0:22:44.560 to say. But according to the tone analyzer, it would 0:22:44.600 --> 0:22:47.920 come across as a cheerful statement because it had tagged 0:22:47.920 --> 0:22:50.119 that word as being cheerful, and the other words 0:22:50.359 --> 0:22:53.880 are not that strong, they don't warrant being 0:22:53.880 --> 0:22:58.280 tagged in a way like that. Now, over time, we 0:22:58.359 --> 0:23:01.360 might have a tone analyzer that can actually take context 0:23:01.600 --> 0:23:05.879 into account, and then you would learn a lot more 0:23:05.920 --> 0:23:09.679 about the actual meaning behind a phrase. It would be 0:23:09.720 --> 0:23:12.520 more than just tone. 
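The "I am not glad" problem is easy to reproduce. The sketch below, in Python, compares a context-free tagger with one that takes a tiny amount of context into account by checking the couple of words before each hit for a negation. The word lists, the function names, and the two-word window are all assumptions made for illustration. Real context-aware systems are far more sophisticated, and this trick does nothing for sarcasm.

```python
CHEERFUL = {"glad", "happy", "good", "honest", "hero", "intelligent"}
NEGATORS = {"not", "no", "never", "neither", "nor", "cannot"}


def _words(text):
    """Lowercase the text and strip simple punctuation from each word."""
    return [w.strip(".,!?;:") for w in text.lower().split()]


def naive_hits(text):
    """Context-free tagging: every cheerful word counts, whatever surrounds it."""
    return [w for w in _words(text) if w in CHEERFUL]


def negation_aware_hits(text, window=2):
    """Skip a cheerful word if one of the `window` words before it is a negator."""
    words = _words(text)
    hits = []
    for i, word in enumerate(words):
        preceding = words[max(0, i - window):i]
        if word in CHEERFUL and not any(p in NEGATORS for p in preceding):
            hits.append(word)
    return hits


print(naive_hits("I am not glad"))            # tags "glad" as cheerful
print(negation_aware_hits("I am not glad"))   # finds nothing cheerful
```

The same check also drops "good", "honest", and "hero" from the Dostoyevsky excerpt, since each follows "neither" or "nor".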
So if you were trying to 0:23:12.520 --> 0:23:18.240 get across tone by using more complicated and subtle word choice, 0:23:18.760 --> 0:23:23.520 where you're sort of being kind of, uh, poetic in 0:23:23.560 --> 0:23:28.200 your expression, you're trying to get across a feeling by 0:23:28.280 --> 0:23:33.399 using irony or sarcasm, then a tone analyzer like this 0:23:33.440 --> 0:23:36.040 would totally miss it because it would just be counting 0:23:36.040 --> 0:23:40.280 the hits and not understanding the usage there, the hidden 0:23:40.359 --> 0:23:44.520 meaning, the wordplay. So that is going to be 0:23:44.960 --> 0:23:49.880 a real challenge. So it's kind of another interesting use 0:23:49.880 --> 0:23:52.120 of IBM's Watson. There are a lot of other ones 0:23:52.160 --> 0:23:54.600 that we could talk about, like Chef Watson, which was 0:23:54.680 --> 0:23:58.960 my favorite. Chef Watson would generate new recipes based upon 0:23:59.160 --> 0:24:01.600 ingredients that you would tell it that you had on hand, 0:24:02.040 --> 0:24:07.000 and it wouldn't go and reference old recipes 0:24:07.040 --> 0:24:09.800 and pull one up for you. Instead, it would make 0:24:09.840 --> 0:24:13.520 flavor profiles based upon all the different combinations of food 0:24:13.560 --> 0:24:16.280 that were found in various recipe books and generate a 0:24:16.280 --> 0:24:18.879 brand new recipe for you, right there on the spot. 0:24:19.240 --> 0:24:24.000 And sometimes they were wackadoodle crazy, y'all. So in 0:24:24.040 --> 0:24:26.240 a way you could say that Chef Watson was 0:24:26.640 --> 0:24:29.760 another way of seeing how IBM's Watson has a 0:24:29.800 --> 0:24:33.480 lot of promise, but it requires a ton of work 0:24:34.000 --> 0:24:37.600 on the app level in order to leverage it and 0:24:37.640 --> 0:24:40.440 make actual practical use out of it. I have more 0:24:40.480 --> 0:24:45.280 to say about computers detecting sarcasm. 
But first let's take 0:24:45.960 --> 0:24:58.520 a quick word from our sponsor. So back in twent 0:24:59.600 --> 0:25:03.240 there were some researchers at the Hebrew University in Israel 0:25:03.359 --> 0:25:08.760 who designed a system called the Semi-Supervised Algorithm for 0:25:08.800 --> 0:25:15.639 Sarcasm Identification, or SASI, and they used SASI to analyze 0:25:15.640 --> 0:25:20.520 collections of nearly six million tweets and also around sixty 0:25:20.600 --> 0:25:25.680 six thousand product reviews from Amazon. They wanted to find 0:25:26.480 --> 0:25:31.160 rich treasure troves of sarcasm, and it turns out reviews and 0:25:31.200 --> 0:25:37.119 tweets, they fit the bill. Sarcasm is, really, it's typically 0:25:37.200 --> 0:25:40.960 conveyed in some vocal tone, right, and nonverbal cues. 0:25:41.760 --> 0:25:45.840 So you have to first go someplace where sarcasm 0:25:45.840 --> 0:25:49.240 is rampant in text form to be able to really 0:25:49.400 --> 0:25:54.280 fine-tune how you can identify sarcasm versus something that's 0:25:54.320 --> 0:25:57.400 meant exactly the way it's written on the surface level. 0:25:57.760 --> 0:26:03.120 So they started to map out the various features that 0:26:03.200 --> 0:26:07.520 were common in sarcastic comments online. So they were looking 0:26:07.520 --> 0:26:11.520 for things like hyperbolic words, and if you're using a 0:26:11.520 --> 0:26:15.440 lot of exaggeration, that could be a key. Excessive punctuation 0:26:15.760 --> 0:26:19.040 was another one, especially ellipses, which I tend to use 0:26:19.160 --> 0:26:21.480 a lot, though I don't know if I use it 0:26:21.520 --> 0:26:24.680 so much for sarcasm as I do for just timing purposes, 0:26:24.720 --> 0:26:27.399 to indicate this is the beat I would take if 0:26:27.400 --> 0:26:30.159 I were saying this out loud. I guess that's just 0:26:30.240 --> 0:26:34.560 as irritating, though. Also, how straightforward is the sentence structure? 
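To make those surface cues concrete, here is a small Python sketch that counts hyperbolic words, ellipses, and runs of repeated punctuation in a piece of text. It is only an illustration of the kinds of features mentioned here. The word list and the function name `surface_features` are invented for the example, and this is not the researchers' actual feature set or algorithm.

```python
import re

# Hypothetical list of exaggeration words, chosen only for the example.
HYPERBOLE = {"best", "worst", "ever", "totally", "absolutely", "amazing", "literally"}


def surface_features(text):
    """Count a few surface cues that might hint at sarcasm in written text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "hyperbolic_words": sum(w in HYPERBOLE for w in words),
        "ellipses": len(re.findall(r"\.{3,}|…", text)),
        "repeated_punctuation": len(re.findall(r"[!?]{2,}", text)),
    }


print(surface_features("Oh great... best purchase EVER!!!"))
```

Counts like these would not decide anything on their own. They would be inputs to a classifier trained on examples that people have already labeled as sarcastic or sincere.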
0:26:35.040 --> 0:26:37.600 And they gave it examples of sarcasm. They fed it 0:26:37.680 --> 0:26:43.919