WEBVTT - Could we make a sarcastic supercomputer?

0:00:04.120 --> 0:00:07.160
<v Speaker 1>Get in touch with technology with TechStuff from How

0:00:07.200 --> 0:00:14.080
<v Speaker 1>Stuff Works dot com. Hey there, and welcome to TechStuff.

0:00:14.120 --> 0:00:16.960
<v Speaker 1>I'm your host, Jonathan Strickland. I'm an executive producer with

0:00:17.000 --> 0:00:20.560
<v Speaker 1>How Stuff Works and I love all things tech. Today,

0:00:21.079 --> 0:00:24.000
<v Speaker 1>I want to talk to you about an interesting topic

0:00:24.160 --> 0:00:26.599
<v Speaker 1>that I got to explore a couple of years ago

0:00:27.120 --> 0:00:31.760
<v Speaker 1>with Joe McCormick and Lauren Vogelbaum as we debated the

0:00:31.840 --> 0:00:37.800
<v Speaker 1>possibilities of computers learning how to understand sarcasm. We did

0:00:37.840 --> 0:00:41.160
<v Speaker 1>it for a podcast called Forward Thinking, which was around

0:00:41.200 --> 0:00:42.920
<v Speaker 1>for a couple of years. It was a lot of

0:00:42.920 --> 0:00:46.040
<v Speaker 1>fun to work on. That show is over, but

0:00:46.080 --> 0:00:48.640
<v Speaker 1>I thought I would revisit the topic and talk about

0:00:48.680 --> 0:00:52.120
<v Speaker 1>it for you guys and kind of go over what

0:00:52.280 --> 0:00:54.800
<v Speaker 1>would it take to have a computer that could actually

0:00:54.880 --> 0:00:59.400
<v Speaker 1>understand when someone's being sarcastic. Now to understand why this

0:00:59.440 --> 0:01:02.360
<v Speaker 1>is a big deal, it helps to have a refresher

0:01:02.400 --> 0:01:05.679
<v Speaker 1>course on how computers process information. And I know I

0:01:05.720 --> 0:01:08.560
<v Speaker 1>talked about this a lot, but I still think it's

0:01:08.560 --> 0:01:11.200
<v Speaker 1>important to cover the basics when you want to talk

0:01:11.240 --> 0:01:14.840
<v Speaker 1>about something as advanced as being able to detect and

0:01:15.040 --> 0:01:21.200
<v Speaker 1>understand sarcasm. So computers understand machine code or assembly language.

0:01:21.480 --> 0:01:25.320
<v Speaker 1>This is a language that corresponds with the actual physical

0:01:25.600 --> 0:01:30.319
<v Speaker 1>architecture of the computers, so the way the computer is built,

0:01:30.680 --> 0:01:33.880
<v Speaker 1>that's how this language interacts. It's essentially how the

0:01:33.959 --> 0:01:39.119
<v Speaker 1>physical components of the computer are able to handle electric

0:01:39.200 --> 0:01:45.560
<v Speaker 1>current or voltage differences in order to process information, and

0:01:45.880 --> 0:01:51.680
<v Speaker 1>computers can interpret this and execute upon this language very quickly.

0:01:52.240 --> 0:01:56.160
<v Speaker 1>It is the basic language of those physical components. However,

0:01:57.000 --> 0:02:00.880
<v Speaker 1>it is almost impossible for a human to work with this,

0:02:01.200 --> 0:02:04.040
<v Speaker 1>at least in a way that is at all efficient,

0:02:04.480 --> 0:02:10.800
<v Speaker 1>because it ultimately for most computers boils down to binary language, right,

0:02:11.000 --> 0:02:16.120
<v Speaker 1>zeros and ones. So you see a huge block of

0:02:16.200 --> 0:02:18.799
<v Speaker 1>zeros and ones, and unless you are Neo from The Matrix,

0:02:18.840 --> 0:02:22.920
<v Speaker 1>it means nothing to you. So we speak in natural

0:02:23.240 --> 0:02:27.480
<v Speaker 1>language to one another. Natural language, however, is filled with

0:02:27.520 --> 0:02:31.640
<v Speaker 1>a lot of components that make it very very challenging

0:02:31.680 --> 0:02:36.160
<v Speaker 1>for machines to interpret, like ambiguity, or there might be

0:02:36.200 --> 0:02:39.200
<v Speaker 1>double meanings in a phrase and you may mean both

0:02:39.280 --> 0:02:43.960
<v Speaker 1>meanings at the same time, and that is too complicated

0:02:44.000 --> 0:02:46.200
<v Speaker 1>for most machines to be able to process. They just

0:02:46.240 --> 0:02:50.560
<v Speaker 1>can't deal with that. So to bridge the gap between

0:02:50.760 --> 0:02:54.480
<v Speaker 1>the way we humans communicate and the way that computers

0:02:54.600 --> 0:03:00.440
<v Speaker 1>process language, we have created programming languages and compilers. Now,

0:03:00.800 --> 0:03:04.760
<v Speaker 1>programming languages fall into two broad categories. It's more like

0:03:05.080 --> 0:03:07.920
<v Speaker 1>a spectrum, and you could be further on one end

0:03:08.000 --> 0:03:11.320
<v Speaker 1>than the other, and we typically call them high level

0:03:11.560 --> 0:03:15.960
<v Speaker 1>programming languages and low level programming languages. The lower the

0:03:16.120 --> 0:03:19.920
<v Speaker 1>level of programming language, the closer it is to machine code,

0:03:20.560 --> 0:03:23.399
<v Speaker 1>and the easier it is for a computer to understand,

0:03:23.800 --> 0:03:26.040
<v Speaker 1>but the harder it is to work with if you

0:03:26.080 --> 0:03:29.200
<v Speaker 1>happen to be, you know, a human being. High level

0:03:29.240 --> 0:03:33.480
<v Speaker 1>programming languages are easier for humans to understand. Now, if

0:03:33.520 --> 0:03:36.960
<v Speaker 1>you have never taken any courses in programming and you're

0:03:37.000 --> 0:03:41.040
<v Speaker 1>looking at a page of code, it can seem indecipherable

0:03:41.040 --> 0:03:46.360
<v Speaker 1>to you. It is just meaningless strings of characters. But

0:03:47.240 --> 0:03:50.720
<v Speaker 1>once you learn the rules of that programming language, how

0:03:50.800 --> 0:03:54.800
<v Speaker 1>you construct an instruction and a series of instructions, how

0:03:54.840 --> 0:03:57.640
<v Speaker 1>you go from one instruction to the next. Once you

0:03:57.720 --> 0:04:00.920
<v Speaker 1>understand the rules, it actually becomes quite easy to use

0:04:01.160 --> 0:04:03.200
<v Speaker 1>in the grand scheme of things, much easier than

0:04:03.280 --> 0:04:06.880
<v Speaker 1>machine language would be. But again, the problem here is

0:04:06.920 --> 0:04:11.960
<v Speaker 1>that computers don't understand programming languages, not natively. Even though

0:04:12.480 --> 0:04:15.280
<v Speaker 1>this is not exactly the same as human natural language,

0:04:15.280 --> 0:04:18.039
<v Speaker 1>it's also not the same as machine language. That's why

0:04:18.040 --> 0:04:23.719
<v Speaker 1>you need compilers. A compiler is essentially a translator. It

0:04:23.880 --> 0:04:28.479
<v Speaker 1>takes this high level programming language or higher level anyway,

0:04:28.560 --> 0:04:32.080
<v Speaker 1>and then converts it into a machine readable language for

0:04:32.120 --> 0:04:35.400
<v Speaker 1>the computer to actually execute upon. And this is all

0:04:35.440 --> 0:04:39.080
<v Speaker 1>in the design of the programming languages and the compilers.

0:04:40.040 --> 0:04:44.039
<v Speaker 1>So this is the way that for decades we have

0:04:44.120 --> 0:04:46.760
<v Speaker 1>interacted with computers, when you're talking about it on a

0:04:46.839 --> 0:04:49.680
<v Speaker 1>direct level, not just executing a program, but

0:04:49.839 --> 0:04:54.599
<v Speaker 1>creating code, creating programs for computers to run. Over the

0:04:54.720 --> 0:04:58.960
<v Speaker 1>last few decades, we've had some very very smart people

0:04:59.520 --> 0:05:05.599
<v Speaker 1>working on natural language systems for machines which would allow

0:05:05.839 --> 0:05:12.560
<v Speaker 1>a computer to interpret natural language in a way that

0:05:12.560 --> 0:05:14.920
<v Speaker 1>would make some sort of sense and for the computer

0:05:14.960 --> 0:05:17.320
<v Speaker 1>to be able to act upon that language. And we've

0:05:17.360 --> 0:05:22.479
<v Speaker 1>seen this in plenty of examples recently. Most smartphones have

0:05:22.680 --> 0:05:26.560
<v Speaker 1>some sort of smart assistant. You have standalone products like

0:05:26.720 --> 0:05:31.000
<v Speaker 1>Amazon's Echo, you have Google Home, You've got tons of

0:05:31.080 --> 0:05:37.080
<v Speaker 1>devices that can interact with people. They can be activated

0:05:37.120 --> 0:05:39.800
<v Speaker 1>by typically an alert phrase, which I'm not going to

0:05:39.880 --> 0:05:41.680
<v Speaker 1>say because I don't want any of you guys to

0:05:41.720 --> 0:05:43.880
<v Speaker 1>have to deal with that. I know how irritating it

0:05:43.960 --> 0:05:47.400
<v Speaker 1>is when I'm watching a video and someone activates their

0:05:48.760 --> 0:05:52.920
<v Speaker 1>specific system and then mine begins to respond and all

0:05:52.920 --> 0:05:55.640
<v Speaker 1>my lights start going on and off because the people

0:05:55.640 --> 0:05:58.560
<v Speaker 1>on YouTube were talking funny. I know how irritating that is.

0:05:58.600 --> 0:06:01.680
<v Speaker 1>But you use that to activate it, and then you can speak

0:06:02.080 --> 0:06:06.400
<v Speaker 1>and typically you can say the same thing several different

0:06:06.400 --> 0:06:11.520
<v Speaker 1>ways and the device appears to understand you no matter

0:06:11.560 --> 0:06:14.279
<v Speaker 1>how you word it. And this is a real challenge

0:06:14.279 --> 0:06:17.120
<v Speaker 1>because we human beings can find lots of different ways

0:06:17.560 --> 0:06:20.360
<v Speaker 1>to say the same thing. For example, if I say

0:06:20.400 --> 0:06:23.560
<v Speaker 1>what is the weather today, it could be very similar

0:06:23.600 --> 0:06:25.640
<v Speaker 1>to if I ask the question, is it

0:06:25.720 --> 0:06:29.120
<v Speaker 1>going to rain today? Both of those are asking for

0:06:29.160 --> 0:06:32.560
<v Speaker 1>information about the weather, but are very different ways of

0:06:32.600 --> 0:06:36.760
<v Speaker 1>saying that. A good natural language recognition program will be

0:06:36.800 --> 0:06:42.360
<v Speaker 1>able to parse that information and then return the appropriate response.

0:06:43.600 --> 0:06:46.760
<v Speaker 1>This is not an easy thing to do. Typically it

0:06:46.800 --> 0:06:50.880
<v Speaker 1>involves creating a neural network structure, and I've talked about

0:06:50.960 --> 0:06:55.640
<v Speaker 1>artificial neural networks recently. That's typically a network that

0:06:55.720 --> 0:07:01.440
<v Speaker 1>can accept multiple binary inputs, so either a zero or

0:07:01.520 --> 0:07:06.640
<v Speaker 1>a one input that represents something, some sort of yes,

0:07:06.720 --> 0:07:10.440
<v Speaker 1>no or on off kind of feature. It can accept

0:07:10.560 --> 0:07:14.760
<v Speaker 1>multiple inputs of that nature, so multiple zeros or

0:07:14.840 --> 0:07:18.920
<v Speaker 1>ones that all factor into making a decision, and then

0:07:18.920 --> 0:07:22.720
<v Speaker 1>it has a weighting for each of those components, and

0:07:22.760 --> 0:07:26.400
<v Speaker 1>then it produces a single output that's also binary in nature,

0:07:26.440 --> 0:07:28.920
<v Speaker 1>either a zero or a one, and it passes that on to

0:07:29.240 --> 0:07:33.440
<v Speaker 1>other artificial neurons further down the chain. Sometimes that will

0:07:33.480 --> 0:07:37.080
<v Speaker 1>come back around and you have a recurrent artificial neural network.
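
NOTE
A minimal sketch of the artificial neuron described above, written in Python.
The feature names, weights, and threshold are invented for illustration; they
do not come from any real assistant or library.
```python
# Hypothetical sketch of one artificial neuron with binary inputs and a binary output.
def neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the 0/1 inputs reaches the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0
#
# Made-up yes/no features for deciding whether to answer with a weather report:
# [mentions_weather, mentions_rain, mentions_today]
WEIGHTS = [3, 2, 1]
THRESHOLD = 4
#
print(neuron([1, 0, 1], WEIGHTS, THRESHOLD))  # prints 1 (3 + 1 reaches the threshold)
print(neuron([0, 1, 1], WEIGHTS, THRESHOLD))  # prints 0 (2 + 1 falls short)
```
The output of one such neuron would be fed, as a zero or a one, into other
neurons further down the chain.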

0:07:37.440 --> 0:07:42.920
<v Speaker 1>The goal here is for this process to ultimately result

0:07:43.760 --> 0:07:49.080
<v Speaker 1>in a response that is reasonably certain to meet the

0:07:49.120 --> 0:07:52.800
<v Speaker 1>requirements of the person asking the question. This tends to

0:07:52.800 --> 0:07:56.720
<v Speaker 1>be talked about in the realm of probabilities. We

0:07:56.760 --> 0:08:00.280
<v Speaker 1>talk about how certain the machine is that the response

0:08:00.400 --> 0:08:03.240
<v Speaker 1>is the appropriate one, and if it falls below a

0:08:03.280 --> 0:08:07.800
<v Speaker 1>certain threshold, then the machine would typically respond with I'm sorry,

0:08:07.840 --> 0:08:10.040
<v Speaker 1>I don't know what you're asking for, or something similar

0:08:10.080 --> 0:08:13.840
<v Speaker 1>to that. There are cases where you just get misinterpreted

0:08:13.960 --> 0:08:16.559
<v Speaker 1>and you'll get a response that does not reflect whatever

0:08:16.600 --> 0:08:18.760
<v Speaker 1>you asked. That's a little different. That's where the machine

0:08:18.760 --> 0:08:22.760
<v Speaker 1>has drawn a conclusion, has been reasonably certain that it

0:08:22.800 --> 0:08:24.680
<v Speaker 1>came to the right conclusion, but it turns out it was

0:08:24.720 --> 0:08:29.240
<v Speaker 1>wrong the whole way. But that's the process. Now, when

0:08:29.280 --> 0:08:36.559
<v Speaker 1>it comes to sarcasm, that adds yet another layer of difficulty,

0:08:37.320 --> 0:08:42.120
<v Speaker 1>because now a machine isn't just parsing what you are saying.

0:08:42.520 --> 0:08:46.520
<v Speaker 1>It has to understand what you mean, the meaning of

0:08:46.559 --> 0:08:51.480
<v Speaker 1>your words and the meaning of the way you deliver them.

0:08:51.480 --> 0:08:54.120
<v Speaker 1>They could be different. So if I were to just

0:08:54.240 --> 0:08:59.360
<v Speaker 1>write out a phrase with no tone, no body language,

0:08:59.600 --> 0:09:03.920
<v Speaker 1>not emphasizing any one word over another, it might be

0:09:04.040 --> 0:09:08.319
<v Speaker 1>very difficult to detect what my intent was. It may

0:09:08.360 --> 0:09:11.559
<v Speaker 1>seem like I'm being sincere, when in fact I'm being insincere.

0:09:11.840 --> 0:09:16.280
<v Speaker 1>For example, if I were to say that guy

0:09:16.400 --> 0:09:22.040
<v Speaker 1>is super tall, but I'm being sarcastic, then just in

0:09:22.080 --> 0:09:25.440
<v Speaker 1>that phrase the way I write it out, you would think, oh, well,

0:09:25.480 --> 0:09:29.959
<v Speaker 1>that person he's looking at must be super tall. How

0:09:30.000 --> 0:09:34.120
<v Speaker 1>do you recognize sarcasm? How can you detect that this

0:09:34.200 --> 0:09:37.280
<v Speaker 1>is in place and then understand what the meaning underneath

0:09:37.320 --> 0:09:41.760
<v Speaker 1>it is? One of the approaches that has been put

0:09:41.800 --> 0:09:48.480
<v Speaker 1>forward relates to IBM's Watson platform. Now, Watson first made

0:09:48.480 --> 0:09:52.440
<v Speaker 1>headlines back when it was a contestant on Jeopardy. It

0:09:52.720 --> 0:09:56.880
<v Speaker 1>went up against two former champions, including Ken Jennings, who

0:09:57.000 --> 0:10:00.240
<v Speaker 1>shows up on a How Stuff Works podcast. Anyway, Watson

0:10:00.280 --> 0:10:03.840
<v Speaker 1>went up against these two former champions and it was

0:10:03.920 --> 0:10:07.160
<v Speaker 1>able to interpret natural language. It had to in order

0:10:07.200 --> 0:10:09.120
<v Speaker 1>to play the game of Jeopardy. And for those who

0:10:09.200 --> 0:10:11.920
<v Speaker 1>do not know what Jeopardy is or they're not familiar

0:10:11.920 --> 0:10:15.120
<v Speaker 1>with the game show, Jeopardy is a game where you

0:10:15.160 --> 0:10:21.079
<v Speaker 1>are presented with categories of trivia and each category has

0:10:21.200 --> 0:10:27.679
<v Speaker 1>multiple questions or multiple entries in it, and they

0:10:27.800 --> 0:10:33.360
<v Speaker 1>range in dollar value, and the lower dollar value ones

0:10:33.400 --> 0:10:37.000
<v Speaker 1>are easier to answer than the higher dollar value ones,

0:10:38.120 --> 0:10:41.680
<v Speaker 1>and typically the way Jeopardy works is that

0:10:41.720 --> 0:10:44.600
<v Speaker 1>you're given quote unquote the answer and you have

0:10:44.679 --> 0:10:49.840
<v Speaker 1>to provide the question. So if the answer were

0:10:51.360 --> 0:10:57.440
<v Speaker 1>this film that detailed the adventures of a young playwright

0:10:57.640 --> 0:11:01.920
<v Speaker 1>in sixteenth century England won Best Picture, you would say, what

0:11:02.080 --> 0:11:06.240
<v Speaker 1>was Shakespeare in Love? So this computer is playing against

0:11:06.240 --> 0:11:08.920
<v Speaker 1>these two former champions. This was sort of an exhibition

0:11:09.480 --> 0:11:14.160
<v Speaker 1>series of games. It wasn't meant for a competition

0:11:14.200 --> 0:11:16.480
<v Speaker 1>in the way the typical Jeopardy games were. There was

0:11:16.559 --> 0:11:19.960
<v Speaker 1>money on the line. It was an exhibition and Watson

0:11:20.000 --> 0:11:23.160
<v Speaker 1>won. It beat both of the champions, and it did

0:11:23.160 --> 0:11:26.440
<v Speaker 1>what I was telling you. It would analyze the

0:11:26.600 --> 0:11:30.719
<v Speaker 1>clue that was given, the answer that was given, it

0:11:30.760 --> 0:11:33.959
<v Speaker 1>would try and generate a question to correspond with that answer,

0:11:34.360 --> 0:11:37.480
<v Speaker 1>and only if the question met a certain threshold of

0:11:37.520 --> 0:11:40.600
<v Speaker 1>confidence would Watson buzz in. If it did not meet

0:11:40.960 --> 0:11:45.040
<v Speaker 1>that level of confidence, Watson would remain quiet. And most importantly,

0:11:45.320 --> 0:11:47.920
<v Speaker 1>Watson was not at all connected to the Internet. All

0:11:48.000 --> 0:11:53.640
<v Speaker 1>the information was contained within a massive series of servers

0:11:54.559 --> 0:11:57.080
<v Speaker 1>more than gosh, I can't even remember. There's a ton

0:11:57.160 --> 0:12:02.440
<v Speaker 1>of processors attached to it. So a very powerful machine,

0:12:03.520 --> 0:12:09.640
<v Speaker 1>but it still wasn't exactly able to detect sarcasm. It

0:12:09.720 --> 0:12:14.040
<v Speaker 1>could work with wordplay, and it could work with riddles,

0:12:14.040 --> 0:12:16.960
<v Speaker 1>so that was really impressive. But what it really did

0:12:17.000 --> 0:12:19.560
<v Speaker 1>was it gave IBM the opportunity to say, we have

0:12:19.720 --> 0:12:24.360
<v Speaker 1>this platform here, and we're welcoming developers to create applications

0:12:24.400 --> 0:12:28.160
<v Speaker 1>that tap into this platform and make use of this

0:12:28.880 --> 0:12:32.640
<v Speaker 1>in order to do interesting stuff with it. And IBM

0:12:32.720 --> 0:12:35.319
<v Speaker 1>was largely working with the medical industry at that point

0:12:35.360 --> 0:12:41.600
<v Speaker 1>to try and help doctors treat and diagnose patients, and

0:12:41.679 --> 0:12:43.760
<v Speaker 1>it was sort of computer guidance. It wasn't that you

0:12:43.840 --> 0:12:47.960
<v Speaker 1>had an automatic doctor, but rather the doctor had what

0:12:48.320 --> 0:12:53.480
<v Speaker 1>equates to a medical expert to confer with when trying

0:12:53.520 --> 0:12:56.760
<v Speaker 1>to determine what's the best course of action for a patient.

0:12:57.800 --> 0:13:01.120
<v Speaker 1>IBM put up an application programming interface or API

0:13:01.640 --> 0:13:06.320
<v Speaker 1>and let developers create their own cognitive computing applications built

0:13:06.400 --> 0:13:10.600
<v Speaker 1>on top of Watson. One of those was called the

0:13:10.640 --> 0:13:14.680
<v Speaker 1>tone analyzer. It still exists. Back when we were doing

0:13:14.679 --> 0:13:18.120
<v Speaker 1>this episode for Forward Thinking, it was in the form

0:13:18.400 --> 0:13:21.520
<v Speaker 1>of analyzing some text and telling you whether or not

0:13:22.040 --> 0:13:26.120
<v Speaker 1>that text would come across as agreeable or argumentative, or

0:13:26.200 --> 0:13:31.439
<v Speaker 1>positive or negative, and it would assign tone to those pieces.

0:13:32.040 --> 0:13:35.040
<v Speaker 1>I'll explain more about how it did and what it

0:13:35.120 --> 0:13:37.560
<v Speaker 1>did in just a minute, but first let's take a

0:13:37.640 --> 0:13:48.360
<v Speaker 1>quick break to thank our sponsor. So how did this

0:13:48.440 --> 0:13:53.920
<v Speaker 1>tone analyzer work? It would search for cues in any

0:13:54.080 --> 0:13:59.480
<v Speaker 1>written text, social cues, written cues, emotional cues in order

0:13:59.520 --> 0:14:02.760
<v Speaker 1>to determine the overall tone of a piece, which

0:14:02.800 --> 0:14:07.640
<v Speaker 1>actually meant that the analyzer would tag individual words within

0:14:07.960 --> 0:14:13.160
<v Speaker 1>a text, words that it recognized and had already

0:14:13.280 --> 0:14:17.319
<v Speaker 1>pre-labeled as falling into various categories. So words that might

0:14:17.360 --> 0:14:23.880
<v Speaker 1>have a positive meaning like happy, glad, joy, things like that.

0:14:23.880 --> 0:14:27.480
<v Speaker 1>Those would get tagged as cheerful. But then it would

0:14:27.480 --> 0:14:31.040
<v Speaker 1>assign all the individual words tags and then tally

0:14:31.120 --> 0:14:33.680
<v Speaker 1>everything up. So let's say you've got a bunch of

0:14:33.680 --> 0:14:39.000
<v Speaker 1>sentences and it starts individually labeling certain words as being

0:14:39.120 --> 0:14:44.240
<v Speaker 1>cheerful or sad or angry or helpful, and then it

0:14:44.280 --> 0:14:46.680
<v Speaker 1>adds it all up and then would give you a percentage.
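
NOTE
A rough Python sketch of the tag-and-tally idea described above. The lexicon
entries and the tone_density name are hypothetical; the real analyzer's word
lists and scoring are not described in this episode.
```python
import string
from collections import Counter
# Hypothetical lexicon: the words and tone labels are examples, not IBM's actual data.
LEXICON = {
    "happy": "cheerful",
    "glad": "cheerful",
    "joy": "cheerful",
    "sad": "sad",
    "furious": "angry",
}
def tone_density(text):
    """Tag each known word with a tone and return each tone's share of all words, in percent."""
    words = [w.strip(string.punctuation).lower() for w in text.split()]
    if not words:
        return {}
    tally = Counter(LEXICON[w] for w in words if w in LEXICON)
    return {tone: 100 * count / len(words) for tone, count in tally.items()}
#
print(tone_density("I am so glad and happy today, not sad, friends."))
# prints {'cheerful': 20.0, 'sad': 10.0}
```
In this sketch "not sad" still counts toward the sad tone, because each word
is looked up on its own with no regard for the words around it.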

0:14:47.120 --> 0:14:52.880
<v Speaker 1>So a message might be agreeable or thirty percent conscientious. You

0:14:52.880 --> 0:14:55.760
<v Speaker 1>would actually get multiples of these, and that would just

0:14:55.800 --> 0:14:59.600
<v Speaker 1>really indicate the density of those types of words within

0:14:59.640 --> 0:15:04.240
<v Speaker 1>the message itself. Now, in an ideal world, if

0:15:04.320 --> 0:15:08.960
<v Speaker 1>language were very simple to understand and interpret by machines,

0:15:09.480 --> 0:15:12.960
<v Speaker 1>this would help you gauge how people would respond to

0:15:13.080 --> 0:15:17.360
<v Speaker 1>your work, right? So you could write a message. Before

0:15:17.400 --> 0:15:20.400
<v Speaker 1>you send it, you put it through the tone analyzer

0:15:20.800 --> 0:15:25.000
<v Speaker 1>and it tells you what sort of a tone you

0:15:25.040 --> 0:15:28.360
<v Speaker 1>are setting. So if you wanted to create a business letter,

0:15:28.960 --> 0:15:30.840
<v Speaker 1>you could send it through this tone analyzer, and if

0:15:30.840 --> 0:15:33.760
<v Speaker 1>it came back as saying it's coming across as

0:15:33.840 --> 0:15:37.320
<v Speaker 1>indecisive, you might want to go back in and

0:15:37.480 --> 0:15:40.680
<v Speaker 1>edit that message so that you can make a more

0:15:41.080 --> 0:15:46.640
<v Speaker 1>straightforward and decisive message and not give the wrong impression

0:15:46.720 --> 0:15:50.320
<v Speaker 1>before you send the message out to your actual human recipient,

0:15:50.680 --> 0:15:53.280
<v Speaker 1>and come up with alternate word choices in order to

0:15:53.280 --> 0:15:55.200
<v Speaker 1>make sure that your message is received the way you

0:15:55.240 --> 0:15:58.560
<v Speaker 1>intended it. And anyone who has communicated over the internet

0:15:58.600 --> 0:16:01.280
<v Speaker 1>can think of ways that this might have been helpful

0:16:01.320 --> 0:16:05.400
<v Speaker 1>in the past, because again, language depends on so many

0:16:05.520 --> 0:16:09.800
<v Speaker 1>different elements to get your meaning across, and when you

0:16:09.840 --> 0:16:14.520
<v Speaker 1>reduce it to the written form, especially the written form online,

0:16:14.560 --> 0:16:19.239
<v Speaker 1>where we tend to be very short with our communication,

0:16:19.400 --> 0:16:22.880
<v Speaker 1>it comes in very quick bursts, a couple of sentences

0:16:22.880 --> 0:16:25.960
<v Speaker 1>here or there. We lack all that body language, we

0:16:26.040 --> 0:16:29.320
<v Speaker 1>lack that tone. It's very easy to misinterpret. I'm sure

0:16:29.360 --> 0:16:32.440
<v Speaker 1>there's been an example in your life where either you

0:16:32.520 --> 0:16:35.080
<v Speaker 1>got offended from receiving something that was meant in a

0:16:35.120 --> 0:16:38.360
<v Speaker 1>way that was different from the way you interpreted it,

0:16:38.480 --> 0:16:40.920
<v Speaker 1>or the reverse happened where you sent a message and

0:16:41.000 --> 0:16:45.320
<v Speaker 1>somebody had a reaction you did not anticipate because they

0:16:45.360 --> 0:16:48.240
<v Speaker 1>could not tell what tone you were using just from

0:16:48.280 --> 0:16:51.960
<v Speaker 1>the words you were using. Machines have that same problem.

0:16:52.200 --> 0:16:55.760
<v Speaker 1>In the future, an analyzer like this tone analyzer, it

0:16:55.760 --> 0:17:00.280
<v Speaker 1>could be incorporated into word processors or email servers,

0:17:00.360 --> 0:17:03.920
<v Speaker 1>or email services, I should say, or social media platforms.

0:17:04.240 --> 0:17:06.879
<v Speaker 1>So you start typing in your message, and before you

0:17:06.960 --> 0:17:11.159
<v Speaker 1>hit publish or post or send, you could analyze that text.

0:17:11.680 --> 0:17:13.560
<v Speaker 1>It could tell you what the tone is, and then

0:17:13.600 --> 0:17:16.440
<v Speaker 1>you could say, oh, no, that's gonna come across totally

0:17:16.600 --> 0:17:18.840
<v Speaker 1>the wrong way, and you could actually fix it before

0:17:18.920 --> 0:17:21.000
<v Speaker 1>you posted it or sent it, and then you wouldn't

0:17:21.040 --> 0:17:24.680
<v Speaker 1>have that awkward decision of whether or not to edit something, or,

0:17:24.720 --> 0:17:27.639
<v Speaker 1>in the case of Twitter, which continues to refuse to

0:17:27.680 --> 0:17:30.919
<v Speaker 1>allow you to edit tweets, to delete a tweet. I

0:17:31.000 --> 0:17:33.960
<v Speaker 1>deleted a tweet the other day when I posted a

0:17:34.040 --> 0:17:36.679
<v Speaker 1>link to a news story, and I had done a

0:17:36.760 --> 0:17:40.080
<v Speaker 1>rookie mistake, one that I try to avoid, but I

0:17:40.640 --> 0:17:43.800
<v Speaker 1>did it this past time, which is that I didn't think

0:17:43.840 --> 0:17:46.040
<v Speaker 1>to look at the date when the news item had

0:17:46.080 --> 0:17:49.240
<v Speaker 1>been published, and it had been published a full year earlier,

0:17:49.600 --> 0:17:51.919
<v Speaker 1>so it was not new news, it was old news.

0:17:52.440 --> 0:17:55.240
<v Speaker 1>And I then deleted the tweet and it wasn't up

0:17:55.280 --> 0:17:57.520
<v Speaker 1>for long, but I still felt dumb about it. It

0:17:57.520 --> 0:17:59.239
<v Speaker 1>would have been nice to have been able to check that.

0:17:59.440 --> 0:18:02.119
<v Speaker 1>Although that's not tone, obviously, it's similar in

0:18:02.840 --> 0:18:06.200
<v Speaker 1>the idea that you want to check before you

0:18:06.920 --> 0:18:10.240
<v Speaker 1>end up offending someone, unless you're one of those jerk

0:18:10.320 --> 0:18:13.000
<v Speaker 1>faces that just sets out to offend people, in which case,

0:18:14.000 --> 0:18:16.960
<v Speaker 1>rethink your strategy. There are better things to do. You

0:18:17.080 --> 0:18:19.240
<v Speaker 1>can make just as big an impact

0:18:19.320 --> 0:18:21.960
<v Speaker 1>being a positive person as you can being a jerk face.

0:18:22.320 --> 0:18:23.960
<v Speaker 1>I know it can seem like it's more work, but

0:18:24.000 --> 0:18:27.600
<v Speaker 1>it's also more rewarding in the long run. Okay, soapbox done. So,

0:18:27.960 --> 0:18:31.440
<v Speaker 1>there is a demo of the tone analyzer that's available online,

0:18:32.080 --> 0:18:36.080
<v Speaker 1>and back when we were recording Forward Thinking, the demo

0:18:36.480 --> 0:18:39.240
<v Speaker 1>worked in a way where it would tell you about

0:18:39.280 --> 0:18:42.760
<v Speaker 1>emotional tone and break it down by percentage. It's a

0:18:42.760 --> 0:18:46.199
<v Speaker 1>little different now, but I want to tell you the

0:18:46.920 --> 0:18:50.639
<v Speaker 1>words and the results we got in the past

0:18:50.760 --> 0:18:53.840
<v Speaker 1>because they were so much fun. Granted you would get

0:18:53.880 --> 0:18:56.520
<v Speaker 1>a different result now because the tone analyzer has been

0:18:56.560 --> 0:19:00.000
<v Speaker 1>tweaked since we recorded that episode. So when we recorded

0:19:00.040 --> 0:19:03.680
<v Speaker 1>that episode, one of my co-hosts decided to put

0:19:03.760 --> 0:19:08.560
<v Speaker 1>a sentence that is somewhat known in literary circles into

0:19:08.560 --> 0:19:10.879
<v Speaker 1>this tone analyzer and find out what it said. And

0:19:10.960 --> 0:19:14.879
<v Speaker 1>the sentence used was it is a truth universally acknowledged

0:19:15.080 --> 0:19:17.640
<v Speaker 1>that a single man in possession of a good fortune

0:19:17.960 --> 0:19:21.240
<v Speaker 1>must be in want of a wife. Now, the analyzer

0:19:21.800 --> 0:19:26.560
<v Speaker 1>said that this emotional tone was cheerful, the social tone

0:19:26.680 --> 0:19:31.000
<v Speaker 1>was seventy six percent open and fifty agreeable, and the

0:19:31.080 --> 0:19:35.760
<v Speaker 1>writing tone was analytical. You can also view the sentence

0:19:35.840 --> 0:19:38.520
<v Speaker 1>in terms of word count as opposed to the weighted

0:19:38.600 --> 0:19:41.840
<v Speaker 1>value of individual words, and using that view, five percent

0:19:41.960 --> 0:19:46.440
<v Speaker 1>of the sentence was in an emotional tone, in

0:19:46.480 --> 0:19:49.879
<v Speaker 1>a social tone, and five percent in a writing tone. Now,

0:19:50.280 --> 0:19:54.240
<v Speaker 1>the analyzer highlights each word according to how it classifies them,

0:19:54.680 --> 0:19:58.520
<v Speaker 1>So emotional words would be highlighted in red or pink

0:19:58.600 --> 0:20:01.439
<v Speaker 1>in that older version of the tone analyzer, social words

0:20:01.680 --> 0:20:05.280
<v Speaker 1>would show up in blue, and writing tones would be

0:20:05.359 --> 0:20:07.879
<v Speaker 1>in green. And you could click on any word and

0:20:07.880 --> 0:20:10.720
<v Speaker 1>the analyzer would offer alternative words that you might want

0:20:10.720 --> 0:20:14.159
<v Speaker 1>to use and classify those words in the tones that

0:20:14.320 --> 0:20:16.639
<v Speaker 1>they are associated with, so you could shape your message

0:20:16.680 --> 0:20:19.439
<v Speaker 1>to meet the tone you wish to convey. Also, the

0:20:19.560 --> 0:20:24.320
<v Speaker 1>tone analyzer demo used the business letter format as the

0:20:24.320 --> 0:20:28.440
<v Speaker 1>means of comparison, So, in other words, we compared Jane

0:20:28.480 --> 0:20:32.320
<v Speaker 1>Austen to a business letter. Presumably if you were to

0:20:32.480 --> 0:20:34.960
<v Speaker 1>use a full version of the analyzer, not just the

0:20:34.960 --> 0:20:37.720
<v Speaker 1>demo version. You would have other options so you could

0:20:38.080 --> 0:20:42.160
<v Speaker 1>compare it with other models, not just a business letter.

0:20:42.600 --> 0:20:49.640
<v Speaker 1>Joe McCormick included an excerpt from Dostoyevsky's Notes from Underground.

0:20:49.680 --> 0:20:53.639
<v Speaker 1>That excerpt was, I could not become anything, neither good

0:20:53.680 --> 0:20:57.280
<v Speaker 1>nor bad, neither a scoundrel nor an honest man, neither

0:20:57.359 --> 0:21:00.800
<v Speaker 1>a hero nor an insect. And now I am eking out

0:21:00.920 --> 0:21:04.760
<v Speaker 1>my days in my corner, taunting myself with the bitter

0:21:04.960 --> 0:21:09.879
<v Speaker 1>and entirely useless consolation that an intelligent man cannot seriously

0:21:09.960 --> 0:21:14.600
<v Speaker 1>become anything, that only a fool can become something. The

0:21:14.640 --> 0:21:19.480
<v Speaker 1>feedback was that the emotional tone had anger at cheerfulness

0:21:19.560 --> 0:21:24.879
<v Speaker 1>at so happy anger negative at. The social tone was

0:21:25.880 --> 0:21:31.080
<v Speaker 1>agreeable zero percent conscientious, zero percent open. The writing tone

0:21:31.119 --> 0:21:36.600
<v Speaker 1>was analytical, zero percent confident and tentative. Joe would actually

0:21:36.720 --> 0:21:39.760
<v Speaker 1>end up highlighting some of the words to find out

0:21:39.920 --> 0:21:42.359
<v Speaker 1>which words were the ones that ended up giving that

0:21:43.600 --> 0:21:47.920
<v Speaker 1>cheerfulness result. Those four words were good, honest, hero,

0:21:48.200 --> 0:21:55.040
<v Speaker 1>and intelligent, and that's important

0:21:55.280 --> 0:21:59.399
<v Speaker 1>because those words, the way they are used in

0:21:59.480 --> 0:22:03.680
<v Speaker 1>that passage are not used in a positive sense. They

0:22:03.720 --> 0:22:09.000
<v Speaker 1>are positive words, but they're meant to show kind of

0:22:09.040 --> 0:22:15.280
<v Speaker 1>a negation there, not an assertion. So that

0:22:15.359 --> 0:22:18.720
<v Speaker 1>really highlights a big problem in this tone analyzer, which

0:22:18.760 --> 0:22:24.719
<v Speaker 1>is that it's tagging these words individually without context. So

0:22:24.800 --> 0:22:28.680
<v Speaker 1>if I wrote the phrase I am not glad, it

0:22:28.720 --> 0:22:31.520
<v Speaker 1>would tag the word glad and say that's a cheerful word.
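
NOTE
Editor's sketch, not part of the spoken episode. The "I am not glad" example
can be made concrete with a toy tagger in Python. The word lists, the function
names, and the two-word negation window are all assumptions for illustration;
the episode does not describe how the real Tone Analyzer was built.
```python
# Hypothetical word lists; the real analyzer's lexicon is not given in the episode.
CHEERFUL_WORDS = {"glad", "good", "honest", "hero", "intelligent"}
NEGATORS = {"not", "no", "never", "neither", "nor", "cannot"}
def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [w.strip(".,;:!?\"'").lower() for w in text.split()]
def tag_without_context(text):
    """Flag every cheerful word on its own, ignoring the words around it."""
    return [w for w in tokenize(text) if w in CHEERFUL_WORDS]
def tag_with_negation(text, window=2):
    """Skip a cheerful word when a negator appears within `window` words before it."""
    words = tokenize(text)
    hits = []
    for i, w in enumerate(words):
        # Look only at the few words immediately preceding the cheerful word.
        if w in CHEERFUL_WORDS and not NEGATORS & set(words[max(0, i - window):i]):
            hits.append(w)
    return hits
```
With these assumed lists, tag_without_context("I am not glad") still flags "glad",
while tag_with_negation("I am not glad") flags nothing. A fixed window is a crude
stand-in for real context and would still miss irony or sarcasm.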

0:22:32.200 --> 0:22:35.879
<v Speaker 1>But I said I am not glad. If I

0:22:35.960 --> 0:22:38.960
<v Speaker 1>told you I am not glad, you would not think, oh, well,

0:22:38.960 --> 0:22:40.919
<v Speaker 1>that's a cheerful thing to say or a positive thing

0:22:40.960 --> 0:22:44.560
<v Speaker 1>to say. But according to the tone analyzer, it would

0:22:44.600 --> 0:22:47.920
<v Speaker 1>come across as a cheerful statement because it had tagged

0:22:47.920 --> 0:22:50.119
<v Speaker 1>that word as being cheerful, and the other words

0:22:50.359 --> 0:22:53.880
<v Speaker 1>are not that strong; they don't warrant being

0:22:53.880 --> 0:22:58.280
<v Speaker 1>tagged in a way like that. Now, over time, we

0:22:58.359 --> 0:23:01.360
<v Speaker 1>might have a tone analyzer that can actually take context

0:23:01.600 --> 0:23:05.879
<v Speaker 1>into account, and then you would learn a lot more

0:23:05.920 --> 0:23:09.679
<v Speaker 1>about the actual meaning behind a phrase. It would be

0:23:09.720 --> 0:23:12.520
<v Speaker 1>more than just tone. So if you were trying to

0:23:12.520 --> 0:23:18.240
<v Speaker 1>get across tone by using more complicated and subtle word choice,

0:23:18.760 --> 0:23:23.520
<v Speaker 1>where you're sort of being kind of poetic in

0:23:23.560 --> 0:23:28.200
<v Speaker 1>your expression, you're trying to get across a feeling by

0:23:28.280 --> 0:23:33.399
<v Speaker 1>using irony or sarcasm. Then a tone analyzer like this

0:23:33.440 --> 0:23:36.040
<v Speaker 1>would totally miss it because it would just be counting

0:23:36.040 --> 0:23:40.280
<v Speaker 1>the hits and not understanding the usage there, the hidden

0:23:40.359 --> 0:23:44.520
<v Speaker 1>meaning, the wordplay. So that is going to be

0:23:44.960 --> 0:23:49.880
<v Speaker 1>a real challenge. So it's kind of another interesting use

0:23:49.880 --> 0:23:52.120
<v Speaker 1>of IBM's Watson. There are a lot of other ones

0:23:52.160 --> 0:23:54.600
<v Speaker 1>that we could talk about, like Chef Watson, which was

0:23:54.680 --> 0:23:58.960
<v Speaker 1>my favorite. Chef Watson would generate new recipes based upon

0:23:59.160 --> 0:24:01.600
<v Speaker 1>ingredients that you would tell it that you had on hand,

0:24:02.040 --> 0:24:07.000
<v Speaker 1>and it wouldn't go and reference old recipes

0:24:07.040 --> 0:24:09.800
<v Speaker 1>and pull one up for you. Instead, it would make

0:24:09.840 --> 0:24:13.520
<v Speaker 1>flavor profiles based upon all the different combinations of food

0:24:13.560 --> 0:24:16.280
<v Speaker 1>that were found in various recipe books and generate a

0:24:16.280 --> 0:24:18.879
<v Speaker 1>brand new recipe for you, right there on the spot.

0:24:19.240 --> 0:24:24.000
<v Speaker 1>And sometimes they were wackadoodle crazy, y'all. So in

0:24:24.040 --> 0:24:26.240
<v Speaker 1>a way you could say that Chef Watson was

0:24:26.640 --> 0:24:29.760
<v Speaker 1>another way of seeing how IBM's Watson has a

0:24:29.800 --> 0:24:33.480
<v Speaker 1>lot of promise, but it requires a ton of work

0:24:34.000 --> 0:24:37.600
<v Speaker 1>on the app level in order to leverage it and

0:24:37.640 --> 0:24:40.440
<v Speaker 1>make actual practical use out of it. I have more

0:24:40.480 --> 0:24:45.280
<v Speaker 1>to say about computers detecting sarcasm. But first let's take

0:24:45.960 --> 0:24:58.520
<v Speaker 1>a quick word from our sponsor. So back in twent

0:24:59.600 --> 0:25:03.240
<v Speaker 1>there were some researchers at the Hebrew University in Israel

0:25:03.359 --> 0:25:08.760
<v Speaker 1>who designed a system called the Semi-Supervised Algorithm for

0:25:08.800 --> 0:25:15.639
<v Speaker 1>Sarcasm Identification, or SASI, and they used SASI to analyze

0:25:15.640 --> 0:25:20.520
<v Speaker 1>collections of nearly six million tweets and also around sixty

0:25:20.600 --> 0:25:25.680
<v Speaker 1>six thousand product reviews from Amazon. They wanted to find

0:25:26.480 --> 0:25:31.160
<v Speaker 1>rich treasure troves of sarcasm, and it turns out reviews and

0:25:31.200 --> 0:25:37.119
<v Speaker 1>tweets fit the bill. Sarcasm is typically

0:25:37.200 --> 0:25:40.960
<v Speaker 1>conveyed in vocal tone, right, and nonverbal cues.

0:25:41.760 --> 0:25:45.840
<v Speaker 1>So you have to first go someplace where sarcasm is

0:25:45.840 --> 0:25:49.240
<v Speaker 1>rampant in text form to be able to really

0:25:49.400 --> 0:25:54.280
<v Speaker 1>fine tune how you can identify sarcasm versus something that's

0:25:54.320 --> 0:25:57.400
<v Speaker 1>meant exactly the way it's written on the surface level.

0:25:57.760 --> 0:26:03.120
<v Speaker 1>So they started to map out the various features that

0:26:03.200 --> 0:26:07.520
<v Speaker 1>were common in sarcastic comments online. So they were looking

0:26:07.520 --> 0:26:11.520
<v Speaker 1>for things like hyperbolic words and if you're using a

0:26:11.520 --> 0:26:15.440
<v Speaker 1>lot of exaggeration, that could be a key. Excessive punctuation

0:26:15.760 --> 0:26:19.040
<v Speaker 1>was another one, especially ellipses, which I tend to use

0:26:19.160 --> 0:26:21.480
<v Speaker 1>a lot, though I don't know if I use it

0:26:21.520 --> 0:26:24.680
<v Speaker 1>so much for sarcasm as I do for just timing purposes.

0:26:24.720 --> 0:26:27.399
<v Speaker 1>To indicate this is the beat I would take if

0:26:27.400 --> 0:26:30.159
<v Speaker 1>I were saying this out loud. I guess that's just

0:26:30.240 --> 0:26:34.560
<v Speaker 1>as irritating, though. Also, how straightforward is the sentence structure?

0:26:35.040 --> 0:26:37.600
<v Speaker 1>And they gave it examples of sarcasm. They fed it

0:26:37.680 --> 0:26:43.919
<v Speaker 1>tweets that were tagged hashtag sarcasm, so that the machine

0:26:43.960 --> 0:26:47.600
<v Speaker 1>quote unquote knew that that was already a sarcastic tweet

0:26:47.840 --> 0:26:50.919
<v Speaker 1>and could start to analyze it and build out a

0:26:51.040 --> 0:26:53.240
<v Speaker 1>model for what sarcasm is. They also fed it a

0:26:53.320 --> 0:26:57.080
<v Speaker 1>bunch of one star Amazon reviews that had been judged

0:26:57.160 --> 0:27:01.480
<v Speaker 1>to be sarcastic by a panel consisting of fifteen human beings,

0:27:02.040 --> 0:27:06.880
<v Speaker 1>and the system was told it had to rate sentences

0:27:06.920 --> 0:27:10.440
<v Speaker 1>on a scale of one to five, one being not sarcastic,

0:27:10.880 --> 0:27:16.040
<v Speaker 1>they mean exactly what the sentence says, and five being holy cow,

0:27:16.200 --> 0:27:20.440
<v Speaker 1>this person should write for the Onion, this is incredibly sarcastic.
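
NOTE
Editor's sketch, not part of the spoken episode. The surface cues described
above (exaggerated words, excessive punctuation, ellipses) and the one-to-five
rating can be illustrated with a few lines of Python. This is not the
researchers' algorithm: the cue list, the function names, and the scoring rule
are assumptions made up for this example.
```python
import re
# Hypothetical cue list; the actual features are not spelled out in the episode.
HYPERBOLE = {"best", "worst", "amazing", "totally", "absolutely", "ever", "love"}
def surface_features(text):
    """Count exaggerated words, runs of ! or ?, and ellipses in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "hyperbole": sum(w in HYPERBOLE for w in words),
        "punct_runs": len(re.findall(r"[!?]{2,}", text)),
        "ellipses": len(re.findall(r"\.{3,}", text)),
    }
def sarcasm_score(text):
    """Map the cue count onto a 1 (literal) to 5 (very sarcastic) scale."""
    cues = sum(surface_features(text).values())
    return min(5, 1 + cues)
```
Under these assumptions a plain sentence such as "The package arrived on Tuesday."
scores 1, and "Oh great... best purchase ever!!!" scores 5. A learned model would
weigh such cues from labeled examples instead of simply adding them up.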

0:27:21.000 --> 0:27:27.800
<v Speaker 1>SASI could identify sarcastic Amazon reviews with precision, not bad,

0:27:28.840 --> 0:27:31.440
<v Speaker 1>but when it came to Twitter, it did even better,

0:27:32.200 --> 0:27:36.159
<v Speaker 1>I think, probably because there had to be very short

0:27:36.200 --> 0:27:39.280
<v Speaker 1>messages on Twitter. This was before Twitter had even expanded

0:27:39.280 --> 0:27:42.560
<v Speaker 1>to characters, so it's still back in the one character days.

0:27:43.080 --> 0:27:47.760
<v Speaker 1>The precision rate for SASI for Twitter was so it

0:27:47.840 --> 0:27:52.560
<v Speaker 1>was really good at detecting straightforward sarcasm, the kind that

0:27:52.600 --> 0:27:55.000
<v Speaker 1>a lot of people on Twitter use, because you have

0:27:55.160 --> 0:27:57.240
<v Speaker 1>limited space so you can't really set it up in

0:27:57.320 --> 0:28:01.720
<v Speaker 1>a more complex way, but it was also

0:28:02.080 --> 0:28:08.199
<v Speaker 1>more prone to judging things as false negative evaluations rather

0:28:08.240 --> 0:28:10.960
<v Speaker 1>than false positives. In other words, it was more likely

0:28:11.600 --> 0:28:15.600
<v Speaker 1>to look at a negative sarcastic message and say that's

0:28:15.600 --> 0:28:18.440
<v Speaker 1>not sarcastic than it was to look at a straightforward

0:28:18.440 --> 0:28:21.960
<v Speaker 1>message and say, no, that is sarcastic. So that was

0:28:22.040 --> 0:28:27.040
<v Speaker 1>kind of interesting. Back to Watson. Another use of Watson

0:28:27.480 --> 0:28:31.520
<v Speaker 1>came out of the Milken Institute Global Conference, where

0:28:31.720 --> 0:28:35.720
<v Speaker 1>IBM showed off some research that it had been working

0:28:35.840 --> 0:28:40.520
<v Speaker 1>on internally, and it was calling this research Debating Technologies.

0:28:41.280 --> 0:28:44.600
<v Speaker 1>This was a project in which IBM was trying to

0:28:44.640 --> 0:28:48.440
<v Speaker 1>see if they could feed a computer raw information, have

0:28:48.640 --> 0:28:53.640
<v Speaker 1>the computer synthesize the information, understand that information, at least

0:28:53.640 --> 0:29:00.840
<v Speaker 1>on a computational level and then create a debating

0:29:00.880 --> 0:29:05.000
<v Speaker 1>strategy for both pros and cons based on that information.

0:29:05.400 --> 0:29:09.080
<v Speaker 1>So it would take a huge amount of content like

0:29:09.720 --> 0:29:13.280
<v Speaker 1>all of Wikipedia, for example, and then on any given

0:29:13.280 --> 0:29:15.920
<v Speaker 1>subject that would be covered in Wikipedia, it would be

0:29:15.960 --> 0:29:19.800
<v Speaker 1>asked to form an argument that is in favor of or

0:29:19.960 --> 0:29:25.000
<v Speaker 1>is against a concept, whatever that concept might be. John

0:29:25.120 --> 0:29:27.560
<v Speaker 1>Kelly of IBM showed off in a demo how the

0:29:27.560 --> 0:29:31.080
<v Speaker 1>tool could be used to predict pro or con arguments

0:29:31.120 --> 0:29:35.360
<v Speaker 1>about a subject based on a body of information. So

0:29:36.400 --> 0:29:40.360
<v Speaker 1>you might be able to use this technology in order

0:29:40.400 --> 0:29:47.000
<v Speaker 1>to anticipate what an opposing person might say on any

0:29:47.040 --> 0:29:49.360
<v Speaker 1>given subject. Let's say that you are getting ready to

0:29:49.440 --> 0:29:55.200
<v Speaker 1>debate a topic. You might feed that information to a

0:29:55.280 --> 0:29:58.480
<v Speaker 1>computer system using this Watson platform. You might feed in

0:29:58.560 --> 0:30:02.400
<v Speaker 1>a ton of information, and then you might say,

0:30:03.760 --> 0:30:08.000
<v Speaker 1>imagine someone who is against this particular topic, whatever it

0:30:08.080 --> 0:30:12.560
<v Speaker 1>might be. Let's say it's renewable energy

0:30:12.960 --> 0:30:17.040
<v Speaker 1>and the efficiency of solar panels, whether or

0:30:17.040 --> 0:30:20.000
<v Speaker 1>not it makes sense to invest in solar panels. Let's

0:30:20.000 --> 0:30:22.480
<v Speaker 1>say that your stance is that you have to argue

0:30:22.640 --> 0:30:26.200
<v Speaker 1>for solar panels. You might say, what would someone who

0:30:26.200 --> 0:30:31.040
<v Speaker 1>wants to argue against solar panels, say, and then Watson

0:30:31.120 --> 0:30:36.160
<v Speaker 1>would analyze this information and return to you what it

0:30:36.280 --> 0:30:40.480
<v Speaker 1>thinks would be an argument someone would use to support

0:30:40.560 --> 0:30:45.040
<v Speaker 1>that stance, and then you could prepare for that,

0:30:45.640 --> 0:30:47.640
<v Speaker 1>which would be an incredible tool. I mean, you could

0:30:47.640 --> 0:30:50.000
<v Speaker 1>think of this as for political debates. It would be amazing.

0:30:50.200 --> 0:30:53.000
<v Speaker 1>You could think of how you might want to prepare

0:30:53.320 --> 0:30:56.480
<v Speaker 1>so that you can argue intelligently against an opponent, and

0:30:56.480 --> 0:30:58.920
<v Speaker 1>you can already anticipate what that opponent is going to

0:30:58.960 --> 0:31:01.959
<v Speaker 1>say because you know their general stance on a topic,

0:31:02.240 --> 0:31:04.760
<v Speaker 1>but you might not know what tactics they might use

0:31:04.840 --> 0:31:08.760
<v Speaker 1>to support that stance. Maybe politics isn't a great choice

0:31:08.800 --> 0:31:11.440
<v Speaker 1>because that's not always in the realm of rationality. That

0:31:11.480 --> 0:31:17.840
<v Speaker 1>often falls into a call toward emotional response rather than

0:31:17.960 --> 0:31:22.640
<v Speaker 1>rational response. That's more of a commentary on politics

0:31:22.640 --> 0:31:25.440
<v Speaker 1>in general, regardless of what side you might be on,

0:31:25.680 --> 0:31:29.240
<v Speaker 1>all sides do this anyway. He actually showed at this

0:31:29.320 --> 0:31:32.680
<v Speaker 1>demo a different example. He said, what if you were

0:31:32.720 --> 0:31:35.800
<v Speaker 1>to take the sale of violent video games to minors

0:31:35.960 --> 0:31:40.280
<v Speaker 1>should be banned, that's the topic, and the computer

0:31:40.320 --> 0:31:43.040
<v Speaker 1>would then go through all the information it had access

0:31:43.080 --> 0:31:46.720
<v Speaker 1>to. It would end up sorting out all the parts

0:31:46.760 --> 0:31:49.840
<v Speaker 1>that were relevant to the discussion, so it just put

0:31:49.880 --> 0:31:52.840
<v Speaker 1>those aside and that would become the core of the

0:31:52.920 --> 0:31:55.960
<v Speaker 1>data it would reference. It would then go through and

0:31:56.040 --> 0:32:00.760
<v Speaker 1>identify basic statements as either being a pro stance for

0:32:01.880 --> 0:32:07.080
<v Speaker 1>banning the sale of violent video games to minors or a con stance against

0:32:07.160 --> 0:32:09.680
<v Speaker 1>that, saying no, we should be able to sell violent

0:32:09.760 --> 0:32:14.520
<v Speaker 1>video games to minors. The tool scanned four million articles.

0:32:14.560 --> 0:32:17.280
<v Speaker 1>It returned the top ten articles that were determined to

0:32:17.320 --> 0:32:21.200
<v Speaker 1>be the most relevant to that particular debate, and it

0:32:21.320 --> 0:32:26.280
<v Speaker 1>scanned approximately three thousand sentences from top to bottom,

0:32:26.440 --> 0:32:31.000
<v Speaker 1>and it then identified sentences that contained candidate claims that

0:32:31.320 --> 0:32:34.560
<v Speaker 1>would be statements that would either be interpreted as being

0:32:34.600 --> 0:32:37.920
<v Speaker 1>pro or con for the stance. Then it identified the

0:32:37.920 --> 0:32:41.000
<v Speaker 1>parameters of those claims. Then it assessed the claims for

0:32:41.120 --> 0:32:44.400
<v Speaker 1>the pro and con polarity, then constructed a sample pro

0:32:44.640 --> 0:32:47.840
<v Speaker 1>or con statement. And the statements in the demo were

0:32:48.240 --> 0:32:51.360
<v Speaker 1>kind of interesting. And since the computer is constructing arguments

0:32:51.440 --> 0:32:55.880
<v Speaker 1>based upon what people have already written, it would reflect

0:32:55.880 --> 0:32:58.960
<v Speaker 1>a lot of vague statements that aren't a firm stance. So,

0:32:59.000 --> 0:33:01.080
<v Speaker 1>in other words, it can't take a bunch

0:33:01.320 --> 0:33:05.560
<v Speaker 1>of stuff that was written that itself did not take

0:33:05.920 --> 0:33:09.440
<v Speaker 1>either a pro or con stance and then transform that magically

0:33:09.640 --> 0:33:13.200
<v Speaker 1>into the perfect pro stance or the perfect con stance.
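
NOTE
Editor's sketch, not part of the spoken episode. The demo steps described
earlier (pick the most relevant articles, pull out candidate claims, judge
each claim as pro or con) can be mimicked with a toy Python pipeline. This is
not IBM's method: the keyword matching, the cue words, and every function name
are assumptions for illustration. It also shows the limitation just mentioned,
since a sentence with no clear stance simply comes back as "unclear".
```python
import re
# Hypothetical cue words for the motion "the sale should be banned".
SUPPORT_CUES = {"harmful", "dangerous", "aggression", "protect"}
OPPOSE_CUES = {"harmless", "censorship", "freedom", "rights"}
def words_of(text):
    """Set of lowercase words with four or more letters (drops most filler words)."""
    return set(re.findall(r"[a-z]{4,}", text.lower()))
def top_articles(articles, topic, k=10):
    """Rank articles by how many topic words they share and keep the top k."""
    topic_words = words_of(topic)
    return sorted(articles, key=lambda a: len(words_of(a) & topic_words), reverse=True)[:k]
def candidate_claims(article, topic, min_overlap=2):
    """Keep sentences that share at least min_overlap words with the topic."""
    topic_words = words_of(topic)
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", article) if s.strip()]
    return [s for s in sentences if len(words_of(s) & topic_words) >= min_overlap]
def polarity(claim):
    """Label a claim pro, con, or unclear by counting cue words."""
    found = words_of(claim)
    pro, con = len(found & SUPPORT_CUES), len(found & OPPOSE_CUES)
    return "pro" if pro > con else "con" if con > pro else "unclear"
```
For example, with the topic "sale of violent video games to minors", a sentence like
"Video games are popular with minors." passes the relevance filter but gets the label
"unclear", because nothing in it takes a side.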

0:33:13.280 --> 0:33:16.360
<v Speaker 1>It's dependent upon the words that human beings have already written,

0:33:16.800 --> 0:33:19.600
<v Speaker 1>So it could not magically come up with a killer

0:33:19.960 --> 0:33:22.760
<v Speaker 1>argument if the data that had been written about this

0:33:22.920 --> 0:33:27.280
<v Speaker 1>subject didn't come down on a firm stance one way

0:33:27.360 --> 0:33:32.560
<v Speaker 1>or the other. The point of the demonstration wasn't

0:33:32.600 --> 0:33:36.800
<v Speaker 1>to create a tool that could either troll people or

0:33:36.960 --> 0:33:39.880
<v Speaker 1>counter trolls. It was to show that a computer could

0:33:39.960 --> 0:33:43.120
<v Speaker 1>be useful to aid in the reasoning process when you're

0:33:43.200 --> 0:33:46.520
<v Speaker 1>making a critical decision. Again, to go back to that

0:33:46.600 --> 0:33:49.600
<v Speaker 1>medical example, it could be used to help a doctor

0:33:50.120 --> 0:33:54.320
<v Speaker 1>determine which diagnosis is the most likely to be accurate

0:33:54.360 --> 0:33:58.240
<v Speaker 1>for a patient, what course of treatment might be

0:33:58.320 --> 0:34:03.160
<v Speaker 1>the most helpful for that patient, and thus it could

0:34:03.240 --> 0:34:07.480
<v Speaker 1>have real practical use outside of this more esoteric, interesting

0:34:08.239 --> 0:34:13.160
<v Speaker 1>debate use. Now, will we see computers in the

0:34:13.280 --> 0:34:17.759
<v Speaker 1>future able to detect sarcasm just as easily as your

0:34:17.760 --> 0:34:23.440
<v Speaker 1>typical human being can when given the right circumstances? And

0:34:23.480 --> 0:34:27.040
<v Speaker 1>I use the word typical reluctantly, but you get what

0:34:27.080 --> 0:34:30.439
<v Speaker 1>I mean. I don't know. It's gonna take some time.

0:34:30.719 --> 0:34:32.960
<v Speaker 1>It takes an awful lot of processing power too. You

0:34:32.960 --> 0:34:37.400
<v Speaker 1>have to remember that for these neural network systems, the

0:34:37.400 --> 0:34:40.879
<v Speaker 1>ones that are running these various platforms and programs

0:34:40.880 --> 0:34:44.600
<v Speaker 1>and strategies, they take up a lot of processing power.

0:34:45.840 --> 0:34:52.120
<v Speaker 1>Because our brains have billion neurons in them, so we

0:34:52.200 --> 0:34:56.560
<v Speaker 1>have a very sophisticated supercomputer sitting in our heads. Moreover,

0:34:56.960 --> 0:35:01.160
<v Speaker 1>our brains are insanely energy efficient. They require about the

0:35:01.200 --> 0:35:05.239
<v Speaker 1>equivalent of twenty watts of power. A supercomputer needs a

0:35:05.320 --> 0:35:09.680
<v Speaker 1>lot more power than that. So while we're seeing advances

0:35:09.719 --> 0:35:13.720
<v Speaker 1>in this, it requires so much processing power, so much energy.

0:35:14.080 --> 0:35:19.319
<v Speaker 1>It is not a practical approach to most forms of computing,

0:35:19.800 --> 0:35:23.400
<v Speaker 1>at least from a consumer standpoint. You might see a

0:35:23.440 --> 0:35:25.759
<v Speaker 1>future where this sort of stuff is all in the

0:35:25.800 --> 0:35:29.720
<v Speaker 1>cloud and then we can access it through an app

0:35:29.840 --> 0:35:32.239
<v Speaker 1>or a program or whatever. That way, you don't have

0:35:32.320 --> 0:35:35.160
<v Speaker 1>to have a supercomputer sitting on your desk in order

0:35:35.160 --> 0:35:38.239
<v Speaker 1>to tap into those capabilities, but you have

0:35:38.280 --> 0:35:41.239
<v Speaker 1>to have an Internet connection, which most of us these

0:35:41.320 --> 0:35:44.200
<v Speaker 1>days tend to have fairly frequently. I mean, there are

0:35:44.200 --> 0:35:46.200
<v Speaker 1>a lot of people out there who at this point

0:35:46.520 --> 0:35:49.680
<v Speaker 1>have had a persistent Internet connection for pretty much their

0:35:49.680 --> 0:35:53.440
<v Speaker 1>whole lives, which blows my mind. But that's the kind

0:35:53.480 --> 0:35:55.480
<v Speaker 1>of world we'd have to live in in order to

0:35:55.560 --> 0:35:58.640
<v Speaker 1>really take advantage of this, at least in the near term.

0:35:58.680 --> 0:36:00.560
<v Speaker 1>I don't know if we are going to see a

0:36:00.600 --> 0:36:04.480
<v Speaker 1>computer that can analyze, say, an article from the Onion

0:36:05.280 --> 0:36:09.200
<v Speaker 1>and not only point out that it's being sarcastic or ironic,

0:36:09.320 --> 0:36:11.920
<v Speaker 1>but also point out why it's funny. I think at

0:36:11.960 --> 0:36:14.759
<v Speaker 1>one point, when you start analyzing comedy, that gets to

0:36:14.840 --> 0:36:18.400
<v Speaker 1>be a level where nothing is ever funny ever again.

0:36:18.920 --> 0:36:23.440
<v Speaker 1>But it is a really interesting problem. So that's

0:36:23.520 --> 0:36:26.319
<v Speaker 1>this look back on whether AI is ever

0:36:26.360 --> 0:36:28.920
<v Speaker 1>going to understand sarcasm. I'm curious to hear what you

0:36:28.960 --> 0:36:34.719
<v Speaker 1>guys think. Do you think we're closer than I am suggesting? Maybe. Well,

0:36:34.760 --> 0:36:36.480
<v Speaker 1>I mean, we're definitely closer than we were when we

0:36:36.520 --> 0:36:38.320
<v Speaker 1>did this episode on Forward Thinking, because that was a

0:36:38.320 --> 0:36:42.200
<v Speaker 1>few years ago. But I don't know that we're you know,

0:36:42.320 --> 0:36:46.160
<v Speaker 1>significantly closer. It's a real tough problem. Or

0:36:46.200 --> 0:36:48.279
<v Speaker 1>do you think that sarcasm is one of those things

0:36:48.360 --> 0:36:50.640
<v Speaker 1>that's just innately human and machines are never really going

0:36:50.680 --> 0:36:53.520
<v Speaker 1>to be able to handle it. We've got a lot

0:36:53.600 --> 0:36:56.160
<v Speaker 1>of programs out there that appear to be sarcastic, but

0:36:56.200 --> 0:37:01.080
<v Speaker 1>that's because they're acting on preprogrammed responses to

0:37:01.080 --> 0:37:03.480
<v Speaker 1>things that we ask them. It's not exactly the same.

0:37:03.480 --> 0:37:06.080
<v Speaker 1>It's kind of cheating, but I'm curious to hear what

0:37:06.120 --> 0:37:09.080
<v Speaker 1>you guys think. Also, make sure you go to our

0:37:09.160 --> 0:37:13.359
<v Speaker 1>brand new website for tech stuff. That's tech Stuff Podcast

0:37:13.560 --> 0:37:16.160
<v Speaker 1>dot com. That's where you're going to find all the

0:37:16.239 --> 0:37:18.480
<v Speaker 1>links to all sorts of stuff like how to contact

0:37:18.520 --> 0:37:21.640
<v Speaker 1>me. In case you're wondering, the best way is through email.

0:37:21.680 --> 0:37:23.759
<v Speaker 1>It's tech Stuff at how stuff Works dot com, or

0:37:23.800 --> 0:37:25.880
<v Speaker 1>through Facebook or Twitter, that's Tech Stuff HSW. But

0:37:25.920 --> 0:37:28.320
<v Speaker 1>all that information is also on the website, as is

0:37:28.360 --> 0:37:31.040
<v Speaker 1>a link to our store at TeePublic. Remember, every

0:37:31.080 --> 0:37:34.280
<v Speaker 1>single purchase you make at that store helps out the show.

0:37:34.880 --> 0:37:38.080
<v Speaker 1>Don't forget to follow us on Instagram and I'll talk

0:37:38.120 --> 0:37:46.640
<v Speaker 1>to you again really soon. For more on this and

0:37:46.680 --> 0:37:59.359
<v Speaker 1>thousands of other topics, visit How Stuff Works dot com.