WEBVTT - TechStuff Rerun: Could we make a sarcastic supercomputer?

0:00:04.400 --> 0:00:12.600
<v Speaker 1>Welcome to Tech Stuff, a production from I Heart Radio. Hey there,

0:00:12.640 --> 0:00:16.200
<v Speaker 1>and welcome to Tech Stuff. I'm your host, Jonathan Strickland.

0:00:16.239 --> 0:00:18.400
<v Speaker 1>I'm an executive producer with I Heart Radio and a

0:00:18.480 --> 0:00:21.520
<v Speaker 1>lover of all things tech, and I'm going to bring

0:00:21.560 --> 0:00:24.800
<v Speaker 1>you guys a little bit of a rerun today. I

0:00:24.840 --> 0:00:27.960
<v Speaker 1>am trying to get some stuff put together for a

0:00:28.040 --> 0:00:32.199
<v Speaker 1>special series of episodes as well as prepare for some

0:00:32.280 --> 0:00:35.199
<v Speaker 1>other stuff. So great things right around the corner. I

0:00:35.200 --> 0:00:37.600
<v Speaker 1>did not want to leave you without an episode at all,

0:00:37.680 --> 0:00:40.000
<v Speaker 1>So we're going to listen to this one that originally

0:00:40.040 --> 0:00:43.280
<v Speaker 1>published on October eighteenth, two thousand eighteen, and it kind

0:00:43.280 --> 0:00:45.559
<v Speaker 1>of goes in line with some other stuff we've been

0:00:45.560 --> 0:00:49.040
<v Speaker 1>covering in recent episodes of Tech Stuff. This episode was

0:00:49.080 --> 0:00:54.600
<v Speaker 1>titled Could We Make a Sarcastic Supercomputer? And yeah, it

0:00:54.640 --> 0:00:59.920
<v Speaker 1>really dives into the whole concept of artificial intelligence, natural language,

0:01:00.040 --> 0:01:03.960
<v Speaker 1>and just kind of understanding the quirks of what

0:01:04.040 --> 0:01:07.120
<v Speaker 1>it is to be human and the whole concept of sarcasm.

0:01:07.360 --> 0:01:10.680
<v Speaker 1>I hope you guys enjoy it. I mean that without

0:01:10.720 --> 0:01:13.280
<v Speaker 1>even a hint of sarcasm. And I'll chat with you

0:01:13.480 --> 0:01:18.200
<v Speaker 1>after the episode. Today. I want to talk to you

0:01:18.319 --> 0:01:21.920
<v Speaker 1>about an interesting topic that I got to explore a

0:01:21.959 --> 0:01:25.760
<v Speaker 1>couple of years ago with Joe McCormick and Lauren Vogelbaum

0:01:26.200 --> 0:01:30.880
<v Speaker 1>as we debated the possibilities of computers learning how to

0:01:31.280 --> 0:01:36.440
<v Speaker 1>understand sarcasm. We did it for a podcast called Forward Thinking,

0:01:36.760 --> 0:01:38.760
<v Speaker 1>which was around for a couple of years. It was

0:01:38.800 --> 0:01:40.760
<v Speaker 1>a lot of fun to work on that. That show

0:01:41.319 --> 0:01:43.840
<v Speaker 1>is over, but I thought I would revisit the topic

0:01:44.200 --> 0:01:47.160
<v Speaker 1>and talk about it for you guys and kind of

0:01:47.200 --> 0:01:50.360
<v Speaker 1>go over what would it take to have a computer

0:01:50.440 --> 0:01:54.360
<v Speaker 1>that could actually understand when someone's being sarcastic. Now to

0:01:54.840 --> 0:01:57.360
<v Speaker 1>understand why this is a big deal, it helps to

0:01:57.440 --> 0:02:01.520
<v Speaker 1>have a refresher course on how computers process information. And

0:02:01.560 --> 0:02:03.960
<v Speaker 1>I know I talked about this a lot, but I

0:02:04.000 --> 0:02:07.000
<v Speaker 1>still think it's important to cover the basics when you

0:02:07.000 --> 0:02:10.160
<v Speaker 1>want to talk about something as advanced as being able

0:02:10.200 --> 0:02:16.200
<v Speaker 1>to detect and understand sarcasm. So computers understand machine code

0:02:16.280 --> 0:02:19.960
<v Speaker 1>or assembly language. This is a language that corresponds with

0:02:20.080 --> 0:02:25.400
<v Speaker 1>the actual physical architecture of the computers. So the way

0:02:25.440 --> 0:02:28.639
<v Speaker 1>the computer is built, that's how this language interacts. It's

0:02:28.680 --> 0:02:32.440
<v Speaker 1>essentially how the physical components of the computer are

0:02:32.520 --> 0:02:38.480
<v Speaker 1>able to handle electric current or voltage differences in order

0:02:38.520 --> 0:02:45.600
<v Speaker 1>to process information, and computers can interpret this and execute

0:02:45.720 --> 0:02:49.919
<v Speaker 1>upon this language very quickly. It is the basic language

0:02:49.960 --> 0:02:55.600
<v Speaker 1>of those physical components. However, it is almost impossible for

0:02:55.960 --> 0:02:58.320
<v Speaker 1>humans to work with this, at least in a way

0:02:58.400 --> 0:03:02.720
<v Speaker 1>that is at all efficient, because it ultimately for

0:03:02.919 --> 0:03:08.520
<v Speaker 1>most computers boils down to binary language, right, zeros and ones.

0:03:09.360 --> 0:03:13.079
<v Speaker 1>So you see a huge block of zeros and ones,

0:03:13.120 --> 0:03:15.640
<v Speaker 1>and unless you are Neo from The Matrix, it means

0:03:15.680 --> 0:03:20.080
<v Speaker 1>nothing to you. So we speak in natural language to

0:03:20.120 --> 0:03:24.120
<v Speaker 1>one another. Natural language, however, is filled with a lot

0:03:24.160 --> 0:03:28.600
<v Speaker 1>of components that make it very very challenging for machines

0:03:28.639 --> 0:03:33.280
<v Speaker 1>to interpret, like ambiguity, or there might be double meanings

0:03:33.320 --> 0:03:36.320
<v Speaker 1>in a phrase and you may mean both meanings at

0:03:36.320 --> 0:03:40.640
<v Speaker 1>the same time, and that is too complicated for most

0:03:40.680 --> 0:03:43.680
<v Speaker 1>machines to be able to process. They just can't deal

0:03:43.760 --> 0:03:47.520
<v Speaker 1>with that. So to bridge the gap between the way

0:03:47.600 --> 0:03:51.840
<v Speaker 1>we humans communicate and the way that computers process language,

0:03:52.280 --> 0:03:58.000
<v Speaker 1>we have created programming languages and compilers. Now, programming languages

0:03:58.080 --> 0:04:02.120
<v Speaker 1>fall into two broad categories. It's more like a spectrum,

0:04:02.440 --> 0:04:04.840
<v Speaker 1>and you could be further on one end than the other,

0:04:05.280 --> 0:04:08.920
<v Speaker 1>and we typically call them high level programming languages and

0:04:09.000 --> 0:04:13.920
<v Speaker 1>low level programming languages. The lower the level of programming language,

0:04:14.000 --> 0:04:17.719
<v Speaker 1>the closer it is to machine code, and the easier

0:04:17.760 --> 0:04:20.800
<v Speaker 1>it is for a computer to understand, but the harder

0:04:20.839 --> 0:04:22.880
<v Speaker 1>it is to work with. If you happen to be,

0:04:22.960 --> 0:04:27.240
<v Speaker 1>you know, a human being. High level programming languages are

0:04:27.320 --> 0:04:30.640
<v Speaker 1>easier for humans to understand. Now, if you have never

0:04:30.720 --> 0:04:33.919
<v Speaker 1>taken any courses in programming and you're looking at a

0:04:34.000 --> 0:04:38.039
<v Speaker 1>page of code, it could seem indecipherable to you. It

0:04:38.200 --> 0:04:44.160
<v Speaker 1>is just meaningless strings of characters. But once you learn

0:04:44.200 --> 0:04:49.200
<v Speaker 1>the rules of that programming language, how you construct an instruction,

0:04:49.640 --> 0:04:51.720
<v Speaker 1>and a series of instructions, how you go from one

0:04:51.720 --> 0:04:55.080
<v Speaker 1>instruction to the next. Once you understand the rules, it

0:04:55.120 --> 0:04:58.159
<v Speaker 1>actually becomes quite easy to use in the grand scheme

0:04:58.160 --> 0:05:00.640
<v Speaker 1>of things, much more easy than machine language would be.

0:05:01.720 --> 0:05:04.680
<v Speaker 1>But again, the problem here is that computers don't understand

0:05:04.720 --> 0:05:09.880
<v Speaker 1>programming languages, not natively. Even though this is not exactly

0:05:09.920 --> 0:05:12.320
<v Speaker 1>the same as human natural language, it's also not the

0:05:12.400 --> 0:05:15.920
<v Speaker 1>same as machine language. That's why you need compilers. A

0:05:15.920 --> 0:05:21.840
<v Speaker 1>compiler is essentially a translator. It takes this high level

0:05:21.920 --> 0:05:26.240
<v Speaker 1>programming language or higher level anyway and then converts it

0:05:26.320 --> 0:05:29.440
<v Speaker 1>into a machine readable language for the computer to actually

0:05:29.440 --> 0:05:32.680
<v Speaker 1>execute upon. And this is all in the design of

0:05:32.680 --> 0:05:37.320
<v Speaker 1>the programming languages and the compilers. So this is the

0:05:37.400 --> 0:05:41.960
<v Speaker 1>way that for decades we have interacted with computers, when

0:05:41.960 --> 0:05:43.960
<v Speaker 1>you're talking about it on a direct level,

0:05:44.000 --> 0:05:48.080
<v Speaker 1>not just executing a program, but creating code, creating programs

0:05:48.080 --> 0:05:53.160
<v Speaker 1>for computers to run. Over the last few decades, we've

0:05:53.200 --> 0:05:58.880
<v Speaker 1>had some very very smart people working on natural language

0:05:58.880 --> 0:06:05.760
<v Speaker 1>systems for machines, which would allow a computer to interpret

0:06:06.360 --> 0:06:09.880
<v Speaker 1>natural language in a way that would make some sort

0:06:09.920 --> 0:06:11.640
<v Speaker 1>of sense, and for the computer to be able to

0:06:11.680 --> 0:06:15.280
<v Speaker 1>act upon that language. And we've seen this in plenty

0:06:15.320 --> 0:06:20.640
<v Speaker 1>of examples recently. Most smartphones have some sort of smart assistant.

0:06:21.120 --> 0:06:25.279
<v Speaker 1>You have standalone products like Amazon's Echo, you have Google Home,

0:06:25.400 --> 0:06:30.560
<v Speaker 1>You've got tons of devices that can interact with people.

0:06:30.839 --> 0:06:35.600
<v Speaker 1>It can be activated by typically an alert phrase, which

0:06:35.600 --> 0:06:37.279
<v Speaker 1>I'm not going to say because I don't want any

0:06:37.320 --> 0:06:39.120
<v Speaker 1>of you guys to have to deal with that. I

0:06:39.200 --> 0:06:41.559
<v Speaker 1>know how irritating it is when I'm watching a video

0:06:41.680 --> 0:06:48.120
<v Speaker 1>and someone activates their specific system and then mine begins

0:06:48.160 --> 0:06:50.200
<v Speaker 1>to respond, and all my lights started going on and

0:06:50.279 --> 0:06:53.760
<v Speaker 1>off because the people on YouTube were talking funny. I

0:06:53.800 --> 0:06:56.640
<v Speaker 1>know how irritating that is, but you say that, it activates

0:06:56.920 --> 0:07:00.839
<v Speaker 1>and then you can speak. And typically you can say

0:07:00.880 --> 0:07:06.279
<v Speaker 1>the same thing several different ways and the device appears

0:07:06.279 --> 0:07:09.240
<v Speaker 1>to understand you no matter how you word it. And

0:07:09.320 --> 0:07:11.960
<v Speaker 1>this is a real challenge because we human beings can

0:07:12.000 --> 0:07:14.760
<v Speaker 1>find lots of different ways to say the same thing.

0:07:15.120 --> 0:07:17.800
<v Speaker 1>For example, if I say what is the weather today,

0:07:18.640 --> 0:07:20.640
<v Speaker 1>it could be very similar to if I

0:07:20.680 --> 0:07:24.040
<v Speaker 1>ask a question is it going to rain today? Both

0:07:24.080 --> 0:07:27.280
<v Speaker 1>of those are asking for information about the weather, but

0:07:27.400 --> 0:07:31.160
<v Speaker 1>are very different ways of saying that. A good natural

0:07:31.280 --> 0:07:34.560
<v Speaker 1>language recognition program will be able to parse that information

0:07:35.320 --> 0:07:40.760
<v Speaker 1>and then return the appropriate response. This is not an

0:07:40.760 --> 0:07:45.040
<v Speaker 1>easy thing to do. Typically it involves creating a neural

0:07:45.080 --> 0:07:49.000
<v Speaker 1>network structure, and I've talked about artificial neural networks recently.

0:07:49.640 --> 0:07:56.520
<v Speaker 1>That's a typically a network that can accept multiple binary inputs,

0:07:56.560 --> 0:07:59.600
<v Speaker 1>so either a zero or a one input that represents

0:07:59.680 --> 0:08:03.760
<v Speaker 1>something, uh, some sort of yes, no or on

0:08:03.920 --> 0:08:09.000
<v Speaker 1>off kind of feature. It can accept multiple multiple inputs

0:08:09.160 --> 0:08:12.360
<v Speaker 1>of that nature, so multiple zeros or ones that all

0:08:12.600 --> 0:08:15.520
<v Speaker 1>factor into making a decision, and then it has a

0:08:15.560 --> 0:08:19.800
<v Speaker 1>weighting for each of those components, and then it produces

0:08:19.880 --> 0:08:22.960
<v Speaker 1>a single output that's also binary in nature, either a

0:08:23.040 --> 0:08:26.320
<v Speaker 1>zero or a one, and it passes that on to other artificial

0:08:26.360 --> 0:08:30.040
<v Speaker 1>neurons further down the chain. Sometimes that will come back

0:08:30.080 --> 0:08:33.760
<v Speaker 1>around and you have a recursive artificial neural network. The

0:08:33.880 --> 0:08:40.080
<v Speaker 1>goal here is for this process to ultimately result in

0:08:40.640 --> 0:08:46.000
<v Speaker 1>a response that is reasonably certain to meet the requirements

0:08:46.040 --> 0:08:49.199
<v Speaker 1>of the person asking the question. This tends to be

0:08:49.720 --> 0:08:53.240
<v Speaker 1>talked about in the realm of probabilities. We talk

0:08:53.240 --> 0:08:56.680
<v Speaker 1>about how certain the machine is that the response is

0:08:56.720 --> 0:09:00.440
<v Speaker 1>the appropriate one, and if it falls below a certain threshold,

0:09:01.120 --> 0:09:04.079
<v Speaker 1>then the machine would typically respond with I'm sorry, I

0:09:04.160 --> 0:09:06.600
<v Speaker 1>don't know what you're asking for, or something similar to that.
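
NOTE
The weighted yes/no inputs and the confidence cutoff described above can be sketched roughly as follows. This is a minimal illustration, not how any real assistant is built: the feature meanings, weights, bias, threshold value, and candidate replies are all made-up assumptions.
```python
def neuron(inputs, weights, bias):
    # Weighted sum of binary (0 or 1) inputs; the output is also binary.
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0
#
def respond(candidates, threshold=0.6):
    # candidates maps each possible reply to a confidence between 0 and 1.
    reply, confidence = max(candidates.items(), key=lambda item: item[1])
    if confidence < threshold:
        # Below the cutoff, fall back to the apology described in the episode.
        return "I'm sorry, I don't know what you're asking for."
    return reply
#
# Hypothetical yes/no features: mentions "rain", mentions "today", mentions "music".
print(neuron([1, 1, 0], weights=[0.7, 0.4, -0.9], bias=-1.0))
print(respond({"Here is today's forecast.": 0.82, "Playing music.": 0.10}))
print(respond({"Here is today's forecast.": 0.41, "Playing music.": 0.38}))
```
In practice the weights would be learned from examples rather than typed in by hand, and a confidently wrong reply (the misinterpretation case) would still pass this cutoff.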

0:09:08.000 --> 0:09:10.520
<v Speaker 1>There are cases where you just get misinterpreted and you'll

0:09:10.520 --> 0:09:13.240
<v Speaker 1>get a response that does not reflect whatever you ask.

0:09:13.400 --> 0:09:16.080
<v Speaker 1>That's a little different. That's where the machine has drawn

0:09:16.120 --> 0:09:19.320
<v Speaker 1>a conclusion, has been reasonably certain that it came to

0:09:19.320 --> 0:09:21.320
<v Speaker 1>the right conclusion. It turns out it was wrong the

0:09:21.320 --> 0:09:26.000
<v Speaker 1>whole way, but that's the process. Now, when it comes

0:09:26.040 --> 0:09:34.000
<v Speaker 1>to sarcasm, that adds yet another layer of difficulty, because

0:09:34.400 --> 0:09:38.360
<v Speaker 1>now a machine isn't just parsing what you are saying.

0:09:38.760 --> 0:09:42.720
<v Speaker 1>It has to understand what you mean, the meaning of

0:09:42.760 --> 0:09:47.679
<v Speaker 1>your words and the meaning of the way you deliver them.

0:09:47.720 --> 0:09:50.320
<v Speaker 1>It could be different. So if I were to just

0:09:50.480 --> 0:09:55.480
<v Speaker 1>write out a phrase with no tone, no body language, uh,

0:09:55.840 --> 0:10:00.200
<v Speaker 1>not emphasizing any one word over another, it might be

0:10:00.320 --> 0:10:04.520
<v Speaker 1>very difficult to detect what my intent was. It may

0:10:04.559 --> 0:10:07.760
<v Speaker 1>seem like I'm being sincere, when in fact I'm being insincere.

0:10:08.080 --> 0:10:12.520
<v Speaker 1>For example, Uh, if I were to say that guy

0:10:12.600 --> 0:10:18.240
<v Speaker 1>is super tall, but I'm being sarcastic, then just in

0:10:18.280 --> 0:10:21.640
<v Speaker 1>that phrase the way I write it out, you would think, oh, well,

0:10:21.679 --> 0:10:26.160
<v Speaker 1>that person he's looking at must be super tall. How

0:10:26.200 --> 0:10:30.319
<v Speaker 1>do you recognize sarcasm? How can you detect that this

0:10:30.400 --> 0:10:33.520
<v Speaker 1>is in place and then understand what the meaning underneath

0:10:33.520 --> 0:10:37.960
<v Speaker 1>it is. One of the approaches that has been put

0:10:38.000 --> 0:10:44.720
<v Speaker 1>forward relates to IBM's Watson platform. Now. Watson first made

0:10:44.720 --> 0:10:48.680
<v Speaker 1>headlines back when it was a contestant on Jeopardy. It

0:10:48.960 --> 0:10:53.120
<v Speaker 1>went up against two former champions, including Ken Jennings, who

0:10:53.200 --> 0:10:56.440
<v Speaker 1>shows up on a How Stuff Works podcast. Anyway, Watson

0:10:56.480 --> 0:11:00.120
<v Speaker 1>went up against these two former champions, and it is

0:11:00.160 --> 0:11:03.400
<v Speaker 1>able to interpret natural language. It had to in order

0:11:03.400 --> 0:11:05.400
<v Speaker 1>to play the game of Jeopardy, And for those who

0:11:05.440 --> 0:11:08.120
<v Speaker 1>do not know what Jeopardy is or they're not familiar

0:11:08.160 --> 0:11:11.319
<v Speaker 1>with the game show, Jeopardy is a game where you

0:11:11.360 --> 0:11:17.320
<v Speaker 1>are presented with categories of trivia and each category has

0:11:17.400 --> 0:11:23.920
<v Speaker 1>multiple uh questions or multiple entries in it, and they

0:11:24.040 --> 0:11:29.600
<v Speaker 1>range in dollar value, and the lower dollar value ones

0:11:29.640 --> 0:11:33.240
<v Speaker 1>are easier to answer than the higher dollar value ones,

0:11:34.320 --> 0:11:37.920
<v Speaker 1>and, uh, typically the way Jeopardy works is that

0:11:37.960 --> 0:11:40.840
<v Speaker 1>you're given quote unquote the answer and you have

0:11:40.920 --> 0:11:46.199
<v Speaker 1>to provide the question. So, uh, if the answer were

0:11:47.600 --> 0:11:53.640
<v Speaker 1>this film that detailed the adventures of a young playwright

0:11:53.880 --> 0:11:57.560
<v Speaker 1>in sixteenth century England, won Best Picture, you would say,

0:11:57.880 --> 0:12:02.160
<v Speaker 1>what was Shakespeare in Love? So this computer is playing

0:12:02.160 --> 0:12:04.480
<v Speaker 1>against these two former champions. This was sort of an

0:12:04.520 --> 0:12:09.800
<v Speaker 1>exhibition series of games. It wasn't meant for, uh, a

0:12:09.840 --> 0:12:12.600
<v Speaker 1>competition in the way the typical Jeopardy games were, where there

0:12:12.640 --> 0:12:16.440
<v Speaker 1>was money on the line. It was an exhibition, and Watson won.

0:12:16.880 --> 0:12:19.480
<v Speaker 1>It beat both of the champions, and it did what

0:12:19.559 --> 0:12:23.160
<v Speaker 1>I was telling you. It would analyze the clue

0:12:23.600 --> 0:12:27.200
<v Speaker 1>that was given, the answer that was given, It would

0:12:27.240 --> 0:12:30.199
<v Speaker 1>try and generate a question to correspond with that answer,

0:12:30.559 --> 0:12:33.720
<v Speaker 1>and only if the question met a certain threshold of

0:12:33.760 --> 0:12:37.120
<v Speaker 1>confidence would Watson buzz in. If it did not meet

0:12:37.160 --> 0:12:41.280
<v Speaker 1>that level of confidence, Watson would remain quiet. And most importantly,

0:12:41.520 --> 0:12:44.160
<v Speaker 1>Watson was not at all connected to the Internet. All

0:12:44.240 --> 0:12:49.880
<v Speaker 1>the information was contained within a massive series of servers

0:12:50.760 --> 0:12:53.320
<v Speaker 1>more than gosh I can't even remember. There's a ton

0:12:53.400 --> 0:12:58.680
<v Speaker 1>of processors attached to it. Um so a very powerful machine,

0:12:59.720 --> 0:13:05.880
<v Speaker 1>but it still wasn't exactly able to detect sarcasm. It

0:13:05.960 --> 0:13:10.240
<v Speaker 1>could work with word play and it could work with riddles,

0:13:10.280 --> 0:13:13.200
<v Speaker 1>so that was really impressive. But what it really did

0:13:13.240 --> 0:13:15.800
<v Speaker 1>was it gave IBM the opportunity to say, we have

0:13:15.960 --> 0:13:20.600
<v Speaker 1>this platform here and we're welcoming developers to create applications

0:13:20.600 --> 0:13:24.360
<v Speaker 1>that tap into this platform and make use of this

0:13:25.120 --> 0:13:28.800
<v Speaker 1>in order to do interesting stuff with it. And IBM

0:13:28.960 --> 0:13:31.559
<v Speaker 1>was largely working with the medical industry at that point

0:13:31.600 --> 0:13:37.839
<v Speaker 1>to try and help doctors treat and diagnose patients, and

0:13:37.920 --> 0:13:40.000
<v Speaker 1>it was sort of computer guidance. It wasn't that you

0:13:40.040 --> 0:13:44.200
<v Speaker 1>had an automatic doctor, but rather the doctor had what

0:13:44.559 --> 0:13:49.720
<v Speaker 1>equates to a medical expert to confer with when trying

0:13:49.760 --> 0:13:53.000
<v Speaker 1>to determine what's the best course of action for a patient.

0:13:54.000 --> 0:13:58.000
<v Speaker 1>IBM put up an application Program Interface or API, and

0:13:58.080 --> 0:14:02.720
<v Speaker 1>let developers create their own cognitive computing applications built on

0:14:02.840 --> 0:14:08.000
<v Speaker 1>top of Watson. One of those was called the tone analyzer.

0:14:08.240 --> 0:14:11.439
<v Speaker 1>It still exists. Back when we were doing this episode

0:14:11.440 --> 0:14:15.679
<v Speaker 1>for Forward Thinking, it was in the form of analyzing

0:14:15.720 --> 0:14:18.720
<v Speaker 1>some text and telling you whether or not that text

0:14:18.720 --> 0:14:23.760
<v Speaker 1>would come across as agreeable or argumentative, or positive or negative,

0:14:24.240 --> 0:14:28.880
<v Speaker 1>and it would assign tone to those pieces. I'll explain

0:14:28.960 --> 0:14:32.160
<v Speaker 1>more about how it did and what it did in

0:14:32.240 --> 0:14:34.480
<v Speaker 1>just a minute, but first let's take a quick break

0:14:34.680 --> 0:14:46.040
<v Speaker 1>to thank our sponsor. So how did this tone analyzer work.

0:14:46.520 --> 0:14:52.960
<v Speaker 1>It would search for cues in any written text, social cues,

0:14:53.040 --> 0:14:56.960
<v Speaker 1>written cues, emotional cues in order to determine the overall

0:14:57.040 --> 0:14:59.960
<v Speaker 1>tone of a piece, which actually meant that the analyzer

0:15:00.040 --> 0:15:05.200
<v Speaker 1>would tag individual words within a text, words that

0:15:05.280 --> 0:15:11.000
<v Speaker 1>it recognized and had already pre labeled as falling into

0:15:11.080 --> 0:15:14.560
<v Speaker 1>various categories. So words that might have a positive meaning

0:15:14.960 --> 0:15:20.600
<v Speaker 1>like happy, glad, joy, things like that, those would get

0:15:20.680 --> 0:15:24.600
<v Speaker 1>tagged as cheerful. But then it would then assign all

0:15:24.640 --> 0:15:28.560
<v Speaker 1>the individual words tags and then tally everything up. So

0:15:28.840 --> 0:15:31.000
<v Speaker 1>let's say you've got a bunch of sentences and it

0:15:31.080 --> 0:15:36.960
<v Speaker 1>starts individually labeling certain words as being cheerful or sad,

0:15:37.120 --> 0:15:41.000
<v Speaker 1>or angry or helpful, and then it adds it all

0:15:41.080 --> 0:15:43.520
<v Speaker 1>up and then would give you a percentage. So a

0:15:43.560 --> 0:15:49.520
<v Speaker 1>message might be agreeable or thirty percent conscientious, you would actually

0:15:49.560 --> 0:15:52.640
<v Speaker 1>get multiples of these, and that would just really indicate

0:15:52.680 --> 0:15:58.240
<v Speaker 1>the density of those types of words within the message itself. Now,
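
NOTE
The tag-and-tally approach described above can be sketched as a simple word-density count. This is an illustration only: the tiny lexicon and the function name are hypothetical, and the episode does not describe the real analyzer's word lists or weighting.
```python
from collections import Counter
# Hypothetical, hand-made lexicon mapping known words to a tone label.
TONE_LEXICON = {
    "happy": "cheerful", "glad": "cheerful", "joy": "cheerful",
    "sorry": "sad", "furious": "angry", "help": "helpful",
}
def tone_percentages(text):
    # Normalize words, tag the ones the lexicon recognizes, then tally.
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    words = [w for w in words if w]
    tags = Counter(TONE_LEXICON[w] for w in words if w in TONE_LEXICON)
    # Each tone's share of all words in the message, as a percentage.
    return {tone: 100 * count / len(words) for tone, count in tags.items()}
print(tone_percentages("So glad to help! What a joy."))
```
A tally like this only measures how dense certain words are, so a sarcastic message built from cheerful words would still score as cheerful.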

0:15:58.920 --> 0:16:02.000
<v Speaker 1>in an ideal world, if language were very simple to

0:16:02.800 --> 0:16:07.160
<v Speaker 1>understand and interpret by machines, this would help you gauge

0:16:07.360 --> 0:16:11.200
<v Speaker 1>how people would respond to your work. Right, So, you

0:16:11.200 --> 0:16:15.240
<v Speaker 1>could write a message. Before you send it, you put

0:16:15.280 --> 0:16:18.240
<v Speaker 1>it through the tone analyzer and it tells you what

0:16:18.440 --> 0:16:22.480
<v Speaker 1>sort of a tone you are setting. So if you

0:16:22.520 --> 0:16:25.680
<v Speaker 1>wanted to create a business letter, you could send it

0:16:25.680 --> 0:16:27.560
<v Speaker 1>through this tone analyzer and if it came back as

0:16:27.600 --> 0:16:32.200
<v Speaker 1>saying it's coming across as indecisive, you might

0:16:32.240 --> 0:16:35.360
<v Speaker 1>want to go back in and edit that message so

0:16:35.400 --> 0:16:40.200
<v Speaker 1>that you can make a more straightforward and uh decisive

0:16:40.400 --> 0:16:43.800
<v Speaker 1>message and not give the wrong impression before you send

0:16:43.880 --> 0:16:47.160
<v Speaker 1>the message out to your actual human recipient and come

0:16:47.240 --> 0:16:49.760
<v Speaker 1>up with alternate word choices in order to make sure

0:16:49.800 --> 0:16:52.000
<v Speaker 1>that your message is received the way you intended it.

0:16:52.440 --> 0:16:55.240
<v Speaker 1>And anyone who has communicated over the Internet can think

0:16:55.320 --> 0:16:58.040
<v Speaker 1>of ways that this might have been helpful in the past,

0:16:58.160 --> 0:17:03.600
<v Speaker 1>because again, language depends on so many different elements to

0:17:03.760 --> 0:17:06.840
<v Speaker 1>get your meaning across, and when you reduce it to

0:17:07.920 --> 0:17:11.040
<v Speaker 1>the written form, especially the written form online, where we

0:17:11.119 --> 0:17:15.639
<v Speaker 1>tend to be very short with our communication, it

0:17:15.840 --> 0:17:19.200
<v Speaker 1>comes in very quick bursts, a couple of sentences here

0:17:19.280 --> 0:17:22.400
<v Speaker 1>or there. We lack all that body language, we lack

0:17:22.480 --> 0:17:25.720
<v Speaker 1>that tone. It's very easy to misinterpret. I'm sure there's

0:17:25.760 --> 0:17:28.880
<v Speaker 1>been an example in your life where either you got

0:17:28.880 --> 0:17:31.520
<v Speaker 1>offended from receiving something that was meant in a way

0:17:31.760 --> 0:17:34.520
<v Speaker 1>that was different from the way you interpreted it,

0:17:34.640 --> 0:17:37.080
<v Speaker 1>or the reverse happened where you sent a message and

0:17:37.160 --> 0:17:41.440
<v Speaker 1>somebody had a reaction you did not anticipate because they

0:17:41.520 --> 0:17:44.400
<v Speaker 1>could not tell what tone you were using just from

0:17:44.440 --> 0:17:48.119
<v Speaker 1>the words you were using. Machines have that same problem.

0:17:48.359 --> 0:17:51.879
<v Speaker 1>In the future, an analyzer like this tone analyzer, it

0:17:51.920 --> 0:17:56.560
<v Speaker 1>could be incorporated into word processors, or email servers, or

0:17:56.920 --> 0:18:00.520
<v Speaker 1>email services, I should say, or social media platforms. So

0:18:00.560 --> 0:18:03.320
<v Speaker 1>you start typing in your message and before you hit

0:18:03.760 --> 0:18:07.320
<v Speaker 1>publish or post or send, you could analyze that text.

0:18:07.840 --> 0:18:09.719
<v Speaker 1>It could tell you what the tone is and then

0:18:09.760 --> 0:18:12.600
<v Speaker 1>you could say, oh, no, that's gonna come across totally

0:18:12.760 --> 0:18:15.000
<v Speaker 1>the wrong way, and you could actually fix it before

0:18:15.080 --> 0:18:17.160
<v Speaker 1>you posted it or sent it, and then you wouldn't

0:18:17.200 --> 0:18:20.840
<v Speaker 1>have that awkward decision of whether or not to edit something, or,

0:18:20.840 --> 0:18:23.800
<v Speaker 1>in the case of Twitter, which continues to refuse to

0:18:23.840 --> 0:18:27.080
<v Speaker 1>allow you to edit tweets, to delete a tweet. I

0:18:27.160 --> 0:18:30.119
<v Speaker 1>deleted a tweet the other day when I posted a

0:18:30.200 --> 0:18:32.840
<v Speaker 1>link to a news story, and I had done a

0:18:32.920 --> 0:18:36.240
<v Speaker 1>rookie mistake, one that I try to avoid, but I

0:18:36.800 --> 0:18:39.960
<v Speaker 1>did it this past time, which is that I didn't think

0:18:40.000 --> 0:18:42.199
<v Speaker 1>to look at the date when the news item had

0:18:42.240 --> 0:18:45.400
<v Speaker 1>been published, and it had been published a full year earlier,

0:18:45.760 --> 0:18:48.080
<v Speaker 1>so it was not new news, it was old news.

0:18:48.600 --> 0:18:51.399
<v Speaker 1>And, uh, I then deleted the tweet and it wasn't up

0:18:51.440 --> 0:18:53.680
<v Speaker 1>for long, but I still felt dumb about it. It

0:18:53.680 --> 0:18:55.399
<v Speaker 1>would have been nice to have been able to check that.

0:18:55.600 --> 0:18:58.280
<v Speaker 1>Although that's not tone, obviously, it's similar in

0:18:59.000 --> 0:19:02.360
<v Speaker 1>the idea that you want to check before you

0:19:03.080 --> 0:19:06.400
<v Speaker 1>end up offending someone, unless you're one of those jerk

0:19:06.480 --> 0:19:08.880
<v Speaker 1>faces that just sets out to offend people, in which

0:19:08.880 --> 0:19:12.760
<v Speaker 1>case rethink your strategy. There are better things to do.

0:19:12.920 --> 0:19:15.080
<v Speaker 1>You can make just as big an

0:19:15.080 --> 0:19:17.520
<v Speaker 1>impact being a positive person as you can being a

0:19:17.560 --> 0:19:20.000
<v Speaker 1>jerk face. I know it can seem like it's more work,

0:19:20.080 --> 0:19:22.359
<v Speaker 1>but it's also more rewarding in the long run. Okay,

0:19:22.400 --> 0:19:25.720
<v Speaker 1>soapbox done. So. There is a demo of the tone

0:19:25.760 --> 0:19:30.119
<v Speaker 1>analyzer that's available online, and back when we were recording

0:19:30.400 --> 0:19:33.840
<v Speaker 1>Forward Thinking, the demo worked in a way where it

0:19:33.840 --> 0:19:38.080
<v Speaker 1>would tell you about emotional tone and break it down

0:19:38.080 --> 0:19:41.120
<v Speaker 1>by percentage. It's a little different now, but I want

0:19:41.119 --> 0:19:45.000
<v Speaker 1>to tell you the words and the results we

0:19:45.080 --> 0:19:48.680
<v Speaker 1>got in the past because they were so much fun.

0:19:49.280 --> 0:19:51.359
<v Speaker 1>Granted you would get a different result now because the

0:19:51.359 --> 0:19:55.399
<v Speaker 1>tone analyzer has been tweaked since we recorded that episode. So,

0:19:55.440 --> 0:19:58.760
<v Speaker 1>when we recorded that episode, one of my co hosts

0:19:59.119 --> 0:20:02.600
<v Speaker 1>decided to put a sentence that is somewhat known

0:20:02.720 --> 0:20:06.000
<v Speaker 1>in literary circles into this tone analyzer and find out

0:20:06.000 --> 0:20:09.280
<v Speaker 1>what it said. And the sentence used was it is

0:20:09.320 --> 0:20:13.040
<v Speaker 1>a truth universally acknowledged that a single man in possession

0:20:13.040 --> 0:20:16.680
<v Speaker 1>of a good fortune must be in want of a wife. Now,

0:20:16.680 --> 0:20:21.879
<v Speaker 1>the analyzer said that this emotional tone was cheerful, the

0:20:21.960 --> 0:20:26.720
<v Speaker 1>social tone was seventy six percent open and fifty one percent agreeable,

0:20:26.960 --> 0:20:31.360
<v Speaker 1>and the writing tone was analytical. You can also view

0:20:31.400 --> 0:20:34.200
<v Speaker 1>the sentence in terms of word count as opposed to

0:20:34.240 --> 0:20:37.359
<v Speaker 1>the weighted value of individual words, and using that view,

0:20:37.440 --> 0:20:41.720
<v Speaker 1>five percent of the words in the sentence were in an emotional tone,

0:20:41.760 --> 0:20:44.399
<v Speaker 1>eighty nine percent in a social tone, and five percent

0:20:44.440 --> 0:20:48.159
<v Speaker 1>in a writing tone. Now, the analyzer highlights each word

0:20:48.200 --> 0:20:52.600
<v Speaker 1>according to how it classifies them, so emotional words would

0:20:52.600 --> 0:20:55.879
<v Speaker 1>be highlighted in red or pink in that older version

0:20:55.880 --> 0:20:58.919
<v Speaker 1>of the Tone Analyzer, social words would show up in blue,

0:20:59.359 --> 0:21:02.960
<v Speaker 1>and writing tones would be in green. And you could

0:21:02.960 --> 0:21:05.800
<v Speaker 1>click on any word and the analyzer would offer alternative

0:21:05.880 --> 0:21:08.320
<v Speaker 1>words that you might want to use and classify those

0:21:08.320 --> 0:21:11.520
<v Speaker 1>words in the tones that they are associated with, so

0:21:11.560 --> 0:21:13.520
<v Speaker 1>that you could shape your message to meet the tone

0:21:13.520 --> 0:21:17.679
<v Speaker 1>you wish to convey. Also, the Tone Analyzer demo used

0:21:17.840 --> 0:21:22.560
<v Speaker 1>the business letter format as the means of comparison, so,

0:21:23.040 --> 0:21:27.920
<v Speaker 1>in other words, we compared Jane Austen to a business letter. Presumably,

0:21:27.960 --> 0:21:30.680
<v Speaker 1>if you were to use a full version of the analyzer,

0:21:30.760 --> 0:21:33.360
<v Speaker 1>not just the demo version, you would have other options

0:21:33.359 --> 0:21:37.520
<v Speaker 1>so you could compare it with other models, not just

0:21:37.560 --> 0:21:42.960
<v Speaker 1>a business letter. Joe McCormick included an excerpt from

0:21:43.560 --> 0:21:47.840
<v Speaker 1>Dostoyevsky's Notes from Underground. That excerpt was, I could not

0:21:47.960 --> 0:21:51.960
<v Speaker 1>become anything, neither good nor bad, neither a scoundrel nor

0:21:52.000 --> 0:21:55.520
<v Speaker 1>an honest man, neither a hero nor an insect. And

0:21:55.560 --> 0:21:58.400
<v Speaker 1>now I am eking out my days in my corner,

0:21:58.760 --> 0:22:03.480
<v Speaker 1>taunting myself with the bitter and entirely useless consolation that

0:22:03.600 --> 0:22:07.959
<v Speaker 1>an intelligent man cannot seriously become anything, that only a

0:22:08.000 --> 0:22:12.320
<v Speaker 1>fool can become something. The feedback was that the emotional

0:22:12.359 --> 0:22:19.120
<v Speaker 1>tone had anger at cheerfulness at so happy anger negative at.

0:22:20.119 --> 0:22:25.840
<v Speaker 1>The social tone was agreeable zero percent conscientious, zero percent open.

0:22:26.320 --> 0:22:31.320
<v Speaker 1>The writing tone was analytical, zero percent confident and tentative.

0:22:31.920 --> 0:22:34.479
<v Speaker 1>Joe would actually end up highlighting some of the words

0:22:34.960 --> 0:22:37.960
<v Speaker 1>to find out which words were the ones that ended

0:22:38.040 --> 0:22:44.080
<v Speaker 1>up giving that cheerfulness result. Those four words were good, honest, hero,

0:22:44.320 --> 0:22:51.200
<v Speaker 1>and intelligent. And that's important

0:22:51.440 --> 0:22:55.600
<v Speaker 1>because those words, the way they are used in

0:22:55.640 --> 0:22:59.800
<v Speaker 1>that passage are not used in a positive sense. They

0:23:00.000 --> 0:23:05.159
<v Speaker 1>are positive words, but they're meant to show kind of

0:23:05.200 --> 0:23:11.440
<v Speaker 1>a negation there and not an assertion. So that

0:23:11.520 --> 0:23:14.880
<v Speaker 1>really highlights a big problem in this tone analyzer, which

0:23:14.920 --> 0:23:20.879
<v Speaker 1>is that it's tagging these words individually without context. So

0:23:20.920 --> 0:23:24.840
<v Speaker 1>if I wrote the phrase I am not glad, it

0:23:24.880 --> 0:23:27.680
<v Speaker 1>would tag the word glad and say that's a cheerful word.

0:23:28.359 --> 0:23:32.040
<v Speaker 1>But I said I am not glad. If I

0:23:32.119 --> 0:23:35.120
<v Speaker 1>told you I am not glad, you would not think, oh, well,

0:23:35.119 --> 0:23:37.080
<v Speaker 1>that's a cheerful thing to say or a positive thing

0:23:37.119 --> 0:23:40.720
<v Speaker 1>to say. But according to the tone analyzer, it would

0:23:40.760 --> 0:23:44.040
<v Speaker 1>come across as a cheerful statement because it had tagged

0:23:44.080 --> 0:23:46.280
<v Speaker 1>that word as being cheerful, and the other words

0:23:46.520 --> 0:23:50.040
<v Speaker 1>are not that strong. They don't warrant being

0:23:50.040 --> 0:23:54.439
<v Speaker 1>tagged in a way like that. Now, over time, we

0:23:54.520 --> 0:23:57.520
<v Speaker 1>might have a tone analyzer that can actually take context

0:23:57.760 --> 0:24:02.040
<v Speaker 1>into account, and then you would learn a lot more

0:24:02.080 --> 0:24:05.840
<v Speaker 1>about the actual meaning behind a phrase. It would be

0:24:05.880 --> 0:24:08.679
<v Speaker 1>more than just tone. So if you were trying to

0:24:08.680 --> 0:24:13.960
<v Speaker 1>get across tone by using more complicated and subtle word

0:24:14.040 --> 0:24:19.160
<v Speaker 1>choice where you're sort of being kind of poetic

0:24:19.560 --> 0:24:23.680
<v Speaker 1>in your expression, you're trying to get across a feeling

0:24:24.160 --> 0:24:29.439
<v Speaker 1>by using irony or sarcasm, then a tone analyzer like

0:24:29.480 --> 0:24:31.879
<v Speaker 1>this would totally miss it because it would just be

0:24:31.920 --> 0:24:36.000
<v Speaker 1>counting the hits and not understanding the usage there, the

0:24:36.080 --> 0:24:40.480
<v Speaker 1>hidden meaning, the wordplay. So that is going to

0:24:40.520 --> 0:24:45.760
<v Speaker 1>be a real challenge. So it's kind of another interesting

0:24:45.880 --> 0:24:48.120
<v Speaker 1>use of IBM's Watson. There are a lot of other

0:24:48.119 --> 0:24:50.600
<v Speaker 1>ones that we could talk about, like Chef Watson, which

0:24:50.680 --> 0:24:54.840
<v Speaker 1>was my favorite. Chef Watson would generate new recipes based

0:24:54.920 --> 0:24:57.280
<v Speaker 1>upon ingredients that you would tell it that you had

0:24:57.320 --> 0:25:01.600
<v Speaker 1>on hand, and it wouldn't go and reference

0:25:01.840 --> 0:25:05.520
<v Speaker 1>old recipes and pull one up for you. Instead, it

0:25:05.520 --> 0:25:09.200
<v Speaker 1>would make flavor profiles based upon all the different combinations

0:25:09.240 --> 0:25:11.800
<v Speaker 1>of food that were found in various recipe books and

0:25:11.840 --> 0:25:14.520
<v Speaker 1>generate a brand new recipe for you right there on

0:25:14.560 --> 0:25:18.280
<v Speaker 1>the spot. And sometimes they were wackadoodle crazy, y'all.

0:25:18.760 --> 0:25:21.959
<v Speaker 1>So in a way, you could say that Chef Watson

0:25:22.000 --> 0:25:25.320
<v Speaker 1>was another way of seeing how IBM's

0:25:25.359 --> 0:25:28.119
<v Speaker 1>Watson has a lot of promise, but it requires

0:25:28.200 --> 0:25:32.240
<v Speaker 1>a ton of work on the app level in order

0:25:32.280 --> 0:25:35.760
<v Speaker 1>to leverage it and make actual practical use out of it.

0:25:36.160 --> 0:25:40.560
<v Speaker 1>I have more to say about computers detecting sarcasm, but

0:25:40.680 --> 0:25:52.960
<v Speaker 1>first let's take a quick word from our sponsor. So

0:25:53.400 --> 0:25:57.720
<v Speaker 1>back in two thousand ten there were some researchers at the Hebrew

0:25:57.840 --> 0:26:02.760
<v Speaker 1>University in Israel who designed a system called the Semi

0:26:02.800 --> 0:26:10.080
<v Speaker 1>Supervised Algorithm for Sarcasm Identification or SASI, and they used

0:26:10.160 --> 0:26:15.239
<v Speaker 1>SASI to analyze collections of nearly six million tweets and

0:26:15.359 --> 0:26:20.320
<v Speaker 1>also around sixty six thousand product reviews from Amazon. They

0:26:20.400 --> 0:26:25.879
<v Speaker 1>wanted to find rich treasure troves of sarcasm, and it turns

0:26:25.880 --> 0:26:30.159
<v Speaker 1>out reviews and tweets, they fit the bill. Sarcasm is,

0:26:30.760 --> 0:26:35.600
<v Speaker 1>really, typically conveyed in some vocal tone, right, and

0:26:35.680 --> 0:26:40.760
<v Speaker 1>nonverbal cues. So you have to first go someplace where

0:26:40.800 --> 0:26:44.520
<v Speaker 1>sarcasm is rampant in text form to be able

0:26:44.560 --> 0:26:49.679
<v Speaker 1>to really fine tune how you can identify sarcasm versus

0:26:49.720 --> 0:26:52.600
<v Speaker 1>something that's meant exactly the way it's written on the

0:26:52.640 --> 0:26:57.920
<v Speaker 1>surface level. So they started to map out the various

0:26:58.040 --> 0:27:03.040
<v Speaker 1>features that were common in sarcastic comments online. So they

0:27:03.040 --> 0:27:06.080
<v Speaker 1>were looking for things like hyperbolic words and if you're

0:27:06.160 --> 0:27:09.640
<v Speaker 1>using a lot of exaggeration, that could be a key.

0:27:10.200 --> 0:27:14.600
<v Speaker 1>Excessive punctuation was another one, especially ellipses, which I tend

0:27:14.640 --> 0:27:17.080
<v Speaker 1>to use a lot, though I don't know if I

0:27:17.160 --> 0:27:19.159
<v Speaker 1>use it so much for sarcasm as I do for

0:27:19.359 --> 0:27:22.720
<v Speaker 1>just timing purposes. To indicate this is the beat I

0:27:22.760 --> 0:27:25.439
<v Speaker 1>would take if I were saying this out loud, I

0:27:25.480 --> 0:27:29.560
<v Speaker 1>guess that's just as irritating, though. Also, how straightforward is

0:27:29.560 --> 0:27:32.840
<v Speaker 1>the sentence structure? And they gave it examples of sarcasm.

0:27:33.040 --> 0:27:37.719
<v Speaker 1>They fed it tweets that were tagged hashtag sarcasm, so

0:27:37.840 --> 0:27:42.400
<v Speaker 1>that the machine quote unquote knew that that was already

0:27:42.440 --> 0:27:45.399
<v Speaker 1>a sarcastic tweet and could start to analyze it and

0:27:45.480 --> 0:27:48.760
<v Speaker 1>build out a model for what sarcasm is. They also

0:27:48.800 --> 0:27:51.720
<v Speaker 1>fed it a bunch of one star Amazon reviews that

0:27:51.800 --> 0:27:55.880
<v Speaker 1>had been judged to be sarcastic by a panel consisting

0:27:55.920 --> 0:28:00.639
<v Speaker 1>of fifteen human beings, and the system was told it

0:28:00.720 --> 0:28:04.440
<v Speaker 1>had to rate sentences on a scale of one to five,

0:28:04.800 --> 0:28:09.120
<v Speaker 1>one being not sarcastic, they mean exactly what the sentence says,

0:28:09.560 --> 0:28:13.560
<v Speaker 1>five being holy cow, this person should write for the Onion.

0:28:13.920 --> 0:28:20.200
<v Speaker 1>This is incredibly sarcastic. SASI could identify sarcastic Amazon reviews

0:28:20.720 --> 0:28:26.200
<v Speaker 1>with precision. Not bad, But when it came to Twitter

0:28:26.480 --> 0:28:31.040
<v Speaker 1>it did even better. I think, probably because there had

0:28:31.080 --> 0:28:33.560
<v Speaker 1>to be very short messages on Twitter. This was before

0:28:33.600 --> 0:28:36.960
<v Speaker 1>Twitter had even expanded to two hundred eighty characters, so it was still

0:28:37.000 --> 0:28:40.240
<v Speaker 1>back in the one hundred forty character days. The precision rate for

0:28:40.360 --> 0:28:45.200
<v Speaker 1>SASI for Twitter was so it was really good at

0:28:45.280 --> 0:28:49.040
<v Speaker 1>detecting straightforward sarcasm, the kind that a lot of people

0:28:49.040 --> 0:28:52.160
<v Speaker 1>on Twitter use because you have limited space so you

0:28:52.200 --> 0:28:55.120
<v Speaker 1>can't really set it up in a more complex way.

0:28:55.160 --> 0:29:01.840
<v Speaker 1>But it was also more prone to judging things

0:29:01.960 --> 0:29:05.600
<v Speaker 1>as false negative evaluations rather than false positives. In other words,

0:29:05.960 --> 0:29:09.920
<v Speaker 1>it was more likely to look at a negative sarcastic

0:29:10.440 --> 0:29:13.200
<v Speaker 1>message and say that's not sarcastic than it was to

0:29:13.280 --> 0:29:16.640
<v Speaker 1>look at a straightforward message and say, no, that is sarcastic.

0:29:17.320 --> 0:29:20.920
<v Speaker 1>So that was kind of interesting. Back to Watson. Another

0:29:22.160 --> 0:29:25.480
<v Speaker 1>use of Watson came out of the Milken Institute

0:29:25.480 --> 0:29:30.800
<v Speaker 1>Global Conference, where IBM showed off some research that it

0:29:30.880 --> 0:29:34.280
<v Speaker 1>had been working on internally, and it was calling this

0:29:34.360 --> 0:29:40.000
<v Speaker 1>research Debating Technologies. This was a project in which IBM

0:29:40.120 --> 0:29:42.480
<v Speaker 1>was trying to see if they could feed a computer

0:29:42.760 --> 0:29:48.400
<v Speaker 1>raw information, have the computer synthesize the information, understand that

0:29:48.520 --> 0:29:53.440
<v Speaker 1>information at least on a computational level, and then create

0:29:54.520 --> 0:29:59.880
<v Speaker 1>a debating strategy for both pros and cons based

0:30:00.120 --> 0:30:02.960
<v Speaker 1>on that information. So it would take a huge amount

0:30:03.160 --> 0:30:08.360
<v Speaker 1>of content like all of Wikipedia, for example, and then

0:30:08.480 --> 0:30:11.440
<v Speaker 1>on any given subject that would be covered in Wikipedia.

0:30:11.480 --> 0:30:14.600
<v Speaker 1>It would be asked to form an argument that is in

0:30:14.680 --> 0:30:19.520
<v Speaker 1>favor of or is against a concept, whatever that concept

0:30:19.560 --> 0:30:22.560
<v Speaker 1>might be. John Kelly of IBM showed off in a

0:30:22.640 --> 0:30:25.600
<v Speaker 1>demo how the tool could be used to predict pro

0:30:25.840 --> 0:30:29.760
<v Speaker 1>or con arguments about a subject based on a body

0:30:29.960 --> 0:30:34.680
<v Speaker 1>of information. So you might be able to use this

0:30:34.920 --> 0:30:41.360
<v Speaker 1>technology in order to anticipate what an opposing person might

0:30:41.480 --> 0:30:44.440
<v Speaker 1>say on any given subject. Let's say that you are

0:30:44.560 --> 0:30:49.400
<v Speaker 1>getting ready to debate a topic. You might feed that

0:30:49.720 --> 0:30:53.600
<v Speaker 1>information to a computer system using this Watson platform. You

0:30:53.680 --> 0:30:56.480
<v Speaker 1>might feed in a ton of information, and then you

0:30:56.560 --> 0:31:00.760
<v Speaker 1>might say, imagine someone who is

0:31:00.800 --> 0:31:05.840
<v Speaker 1>against this particular topic, whatever it might be. Let's

0:31:05.840 --> 0:31:10.880
<v Speaker 1>say it's renewable energy, and the

0:31:11.120 --> 0:31:13.840
<v Speaker 1>efficiency of solar panels, whether or not it makes sense

0:31:13.880 --> 0:31:17.040
<v Speaker 1>to invest in solar panels. Let's say that your stance

0:31:17.320 --> 0:31:20.400
<v Speaker 1>is that you have to argue for solar panels. You

0:31:20.480 --> 0:31:23.640
<v Speaker 1>might say, what would someone who wants to argue against

0:31:23.920 --> 0:31:29.000
<v Speaker 1>solar panels say? And then Watson would analyze this information

0:31:29.560 --> 0:31:33.880
<v Speaker 1>and return to you what it thinks would be an

0:31:33.960 --> 0:31:39.840
<v Speaker 1>argument someone would use to support that stance, and

0:31:39.880 --> 0:31:42.120
<v Speaker 1>then you could prepare for that, which would be an

0:31:42.160 --> 0:31:44.120
<v Speaker 1>incredible tool. I mean, you could think of this as

0:31:44.160 --> 0:31:46.760
<v Speaker 1>for political debates. It would be amazing. You could think

0:31:46.760 --> 0:31:49.760
<v Speaker 1>of how you might want to prepare so that you

0:31:49.840 --> 0:31:52.960
<v Speaker 1>can argue intelligently against an opponent, and you can already

0:31:53.000 --> 0:31:55.880
<v Speaker 1>anticipate what that opponent is going to say, because you

0:31:55.880 --> 0:31:58.680
<v Speaker 1>know their general stance on a topic, but you might

0:31:58.720 --> 0:32:02.080
<v Speaker 1>not know what tactic they might use to support that stance.

0:32:03.000 --> 0:32:05.920
<v Speaker 1>Maybe politics isn't a great choice because that's not always

0:32:05.960 --> 0:32:09.680
<v Speaker 1>in the realm of rationality. That often falls into a

0:32:11.040 --> 0:32:15.440
<v Speaker 1>call toward emotional response rather than rational response. That's more

0:32:15.480 --> 0:32:20.120
<v Speaker 1>of a commentary on politics in general, regardless of

0:32:20.160 --> 0:32:23.240
<v Speaker 1>what side you might be on, all sides do this. Anyway,

0:32:23.600 --> 0:32:27.680
<v Speaker 1>he actually showed at this demo a different example. He said,

0:32:28.040 --> 0:32:30.400
<v Speaker 1>what if you were to take the sale of violent

0:32:30.520 --> 0:32:34.240
<v Speaker 1>video games to minors should be banned. That's the topic,

0:32:35.160 --> 0:32:37.840
<v Speaker 1>and the computer would then go through all the

0:32:37.840 --> 0:32:41.480
<v Speaker 1>information it had access to. It would end up sorting

0:32:41.480 --> 0:32:44.520
<v Speaker 1>out all the parts that were relevant to the discussion,

0:32:45.120 --> 0:32:47.120
<v Speaker 1>so it just put those aside and that would become

0:32:47.200 --> 0:32:51.360
<v Speaker 1>the core of the data it would reference. It would

0:32:51.360 --> 0:32:54.080
<v Speaker 1>then go through and identify basic statements as either being

0:32:54.480 --> 0:32:59.880
<v Speaker 1>a pro stance on banning violent video game sales to

0:33:00.000 --> 0:33:04.120
<v Speaker 1>minors or a con stance on that, saying no, we should

0:33:04.160 --> 0:33:08.160
<v Speaker 1>be able to sell violent video games to minors. The

0:33:08.160 --> 0:33:11.720
<v Speaker 1>tool scanned four million articles, it returned the top ten

0:33:11.920 --> 0:33:14.800
<v Speaker 1>articles that were determined to be the most relevant to

0:33:15.040 --> 0:33:19.760
<v Speaker 1>that particular debate, and it scanned approximately three thousand sentences

0:33:20.720 --> 0:33:24.120
<v Speaker 1>from top to bottom, and it then identified sentences

0:33:24.120 --> 0:33:28.640
<v Speaker 1>that contained candidate claims that would be statements that would

0:33:28.680 --> 0:33:32.280
<v Speaker 1>either be interpreted as being pro or con for the stance.

0:33:32.920 --> 0:33:35.720
<v Speaker 1>Then it identified the parameters of those claims. Then it

0:33:35.760 --> 0:33:38.960
<v Speaker 1>assessed the claims for the pro and con polarity, then

0:33:39.000 --> 0:33:42.760
<v Speaker 1>constructed a sample pro or con statement. And the statements

0:33:42.800 --> 0:33:45.760
<v Speaker 1>in the demo were kind of interesting. And since the

0:33:45.760 --> 0:33:50.440
<v Speaker 1>computer is constructing arguments based upon what people have already written,

0:33:51.080 --> 0:33:53.640
<v Speaker 1>it would reflect a lot of vague statements that aren't

0:33:53.640 --> 0:33:56.200
<v Speaker 1>a firm stance. So, in other words, like it couldn't

0:33:56.200 --> 0:33:59.560
<v Speaker 1>take a bunch of stuff that was written that

0:33:59.680 --> 0:34:03.640
<v Speaker 1>itself did not take either a pro or con stance, and

0:34:03.680 --> 0:34:07.080
<v Speaker 1>then transform that magically into the perfect pro stance or

0:34:07.120 --> 0:34:10.640
<v Speaker 1>the perfect con stance. It's dependent upon the words that

0:34:10.760 --> 0:34:14.440
<v Speaker 1>human beings have already written, so it could not magically

0:34:14.440 --> 0:34:17.440
<v Speaker 1>come up with a killer argument if the data that

0:34:17.480 --> 0:34:21.279
<v Speaker 1>had been written about this subject didn't come down on

0:34:21.480 --> 0:34:26.640
<v Speaker 1>a firm stance one way or the other. The

0:34:26.680 --> 0:34:29.319
<v Speaker 1>point of the demonstration wasn't to create a tool that

0:34:29.360 --> 0:34:34.680
<v Speaker 1>could either troll people or counter trolls. It was to

0:34:34.719 --> 0:34:37.279
<v Speaker 1>show that a computer could be useful to aid in

0:34:37.320 --> 0:34:41.760
<v Speaker 1>the reasoning process when you're making a critical decision. Again,

0:34:41.800 --> 0:34:44.360
<v Speaker 1>to go back to that medical example, it could be

0:34:44.440 --> 0:34:48.600
<v Speaker 1>used to help a doctor determine which diagnosis is the

0:34:48.640 --> 0:34:51.800
<v Speaker 1>most likely to be accurate for a patient, what

0:34:52.560 --> 0:34:55.720
<v Speaker 1>course of treatment might be the most helpful for that patient,

0:34:56.520 --> 0:35:01.080
<v Speaker 1>and thus it could have real practical use outside

0:35:01.120 --> 0:35:07.799
<v Speaker 1>of this more esoteric, interesting debate use. Now, will

0:35:07.840 --> 0:35:12.520
<v Speaker 1>we see computers in the future able to detect sarcasm

0:35:12.560 --> 0:35:16.279
<v Speaker 1>just as easily as your typical human being can when

0:35:16.320 --> 0:35:21.960
<v Speaker 1>given the right circumstances? And I use the word typical reluctantly,

0:35:22.320 --> 0:35:25.319
<v Speaker 1>but you get what I mean. I don't know. It's

0:35:25.360 --> 0:35:27.680
<v Speaker 1>gonna take some time. It takes an awful lot of

0:35:27.680 --> 0:35:30.800
<v Speaker 1>processing power too. You have to remember that for these

0:35:30.920 --> 0:35:34.719
<v Speaker 1>neural network systems, the ones that are running these

0:35:34.800 --> 0:35:39.360
<v Speaker 1>various platforms and programs and strategies. They take up a

0:35:39.400 --> 0:35:46.640
<v Speaker 1>lot of processing power because our brains have billions of neurons

0:35:46.640 --> 0:35:50.680
<v Speaker 1>in them, so we have a very sophisticated supercomputer sitting

0:35:50.719 --> 0:35:55.920
<v Speaker 1>in our heads. Moreover, our brains are insanely energy efficient.

0:35:56.040 --> 0:35:58.960
<v Speaker 1>They require about the equivalent of twenty watts of power.

0:36:00.000 --> 0:36:03.600
<v Speaker 1>A supercomputer needs a lot more power than that. So

0:36:04.400 --> 0:36:07.919
<v Speaker 1>while we're seeing advances in this, it requires so much

0:36:07.960 --> 0:36:11.239
<v Speaker 1>processing power, so much energy, it is not a practical

0:36:12.120 --> 0:36:16.440
<v Speaker 1>approach to most forms of computing, at least from a

0:36:16.480 --> 0:36:20.640
<v Speaker 1>consumer standpoint. You might see a future where this sort

0:36:20.680 --> 0:36:23.239
<v Speaker 1>of stuff is all in the cloud and then we

0:36:23.280 --> 0:36:26.920
<v Speaker 1>can access it through an app or a program or whatever.

0:36:27.400 --> 0:36:29.959
<v Speaker 1>That way, you don't have to have a supercomputer sitting

0:36:30.000 --> 0:36:32.920
<v Speaker 1>on your desk in order to tap into

0:36:33.040 --> 0:36:35.360
<v Speaker 1>those capabilities, but you have to have an Internet connection,

0:36:35.400 --> 0:36:39.680
<v Speaker 1>which most of us these days tend to have fairly frequently.

0:36:39.760 --> 0:36:41.040
<v Speaker 1>I mean, there are a lot of people out there

0:36:41.040 --> 0:36:44.680
<v Speaker 1>who at this point have had a persistent Internet connection

0:36:44.800 --> 0:36:48.280
<v Speaker 1>for pretty much their whole lives, which blows my mind.

0:36:48.840 --> 0:36:50.319
<v Speaker 1>But that's the kind of world we'd have to live

0:36:50.360 --> 0:36:52.759
<v Speaker 1>in in order to really take advantage of this at

0:36:52.800 --> 0:36:55.839
<v Speaker 1>least in the near term. I don't know if we're

0:36:55.840 --> 0:36:59.040
<v Speaker 1>ever going to see a computer that can analyze, say,

0:36:59.040 --> 0:37:02.799
<v Speaker 1>an article from the Onion and not only point out

0:37:02.800 --> 0:37:06.279
<v Speaker 1>that it's being sarcastic or ironic, but also point out

0:37:06.280 --> 0:37:08.759
<v Speaker 1>why it's funny. I think at one point, when you

0:37:08.800 --> 0:37:12.440
<v Speaker 1>start analyzing comedy, there gets to be a level where

0:37:12.480 --> 0:37:15.360
<v Speaker 1>nothing is ever funny ever again, but it is a

0:37:15.400 --> 0:37:20.520
<v Speaker 1>really interesting problem. So that's this look

0:37:20.560 --> 0:37:25.840
<v Speaker 1>back on if AI is ever going to understand sarcasm. Well, guys,

0:37:25.840 --> 0:37:29.200
<v Speaker 1>I hope you enjoyed that classic episode of Tech Stuff.

0:37:29.239 --> 0:37:31.839
<v Speaker 1>I guess two years old isn't old enough

0:37:31.880 --> 0:37:36.280
<v Speaker 1>to be classic. That only somewhat less

0:37:36.280 --> 0:37:40.400
<v Speaker 1>than fresh episode of Tech Stuff about artificial intelligence and

0:37:40.480 --> 0:37:44.920
<v Speaker 1>sarcasm and things of that nature. I am constantly impressed

0:37:45.200 --> 0:37:49.560
<v Speaker 1>with how artificial intelligence is advancing year over year. But

0:37:50.160 --> 0:37:52.600
<v Speaker 1>when you look at what it means to be human

0:37:53.000 --> 0:37:56.280
<v Speaker 1>and the ways that we humans interact with one another,

0:37:56.480 --> 0:38:00.480
<v Speaker 1>and the ways that we can communicate complicated things,

0:38:00.560 --> 0:38:03.960
<v Speaker 1>sometimes just through you know, subtle methods that are not

0:38:04.520 --> 0:38:09.520
<v Speaker 1>overt or, you know, directly spoken, it reminds us

0:38:09.560 --> 0:38:11.880
<v Speaker 1>that machines have got a long way to go in

0:38:12.000 --> 0:38:14.640
<v Speaker 1>order to really grasp what it is to be human,

0:38:14.920 --> 0:38:19.279
<v Speaker 1>So unless you're Commander Data, you're probably struggling a bit.

0:38:20.000 --> 0:38:22.040
<v Speaker 1>So I hope you guys enjoyed this. If you have

0:38:22.080 --> 0:38:24.799
<v Speaker 1>suggestions for future episodes of Tech Stuff, I've got a

0:38:24.800 --> 0:38:28.560
<v Speaker 1>few episodes based on listener suggestions coming up soon. But

0:38:28.680 --> 0:38:31.440
<v Speaker 1>if you want to get your suggestions in tweet me.

0:38:31.920 --> 0:38:35.880
<v Speaker 1>The Twitter handle is TechStuffHSW, and

0:38:35.920 --> 0:38:44.040
<v Speaker 1>I'll talk to you again really soon. Tech Stuff is

0:38:44.080 --> 0:38:47.200
<v Speaker 1>an iHeartRadio production. For more podcasts from

0:38:47.360 --> 0:38:50.960
<v Speaker 1>iHeartRadio, visit the iHeartRadio app, Apple Podcasts,

0:38:51.080 --> 0:38:53.080
<v Speaker 1>or wherever you listen to your favorite shows.