WEBVTT - The challenge of natural language processing 0:00:04.120 --> 0:00:07.160 Get in touch with technology with Tech Stuff from How 0:00:07.200 --> 0:00:14.120 Stuff Works dot com. Hey there, and welcome to Tech Stuff. 0:00:14.160 --> 0:00:17.400 I'm your host, Jonathan Strickland. I'm an executive producer at 0:00:17.440 --> 0:00:20.680 How Stuff Works and I love all things tech. And 0:00:20.720 --> 0:00:24.000 in the last episode, I covered the history and technology 0:00:24.040 --> 0:00:28.639 behind speech recognition. So today we're going to look at 0:00:28.680 --> 0:00:34.440 a related concept called natural language processing or natural language understanding. 0:00:34.479 --> 0:00:38.920 The two are related. This technology and speech recognition 0:00:39.000 --> 0:00:42.800 are both part of what makes voice assistants like Siri, 0:00:43.120 --> 0:00:46.840 Alexa and Google Assistant work, though there are other technologies 0:00:46.880 --> 0:00:49.040 that also go into that. Now, this is a huge 0:00:49.120 --> 0:00:53.120 topic and has a long and fascinating history, so this 0:00:53.200 --> 0:00:55.120 episode is just going to be the start of it. 0:00:55.320 --> 0:00:58.320 In the next episode, I will conclude the discussion on 0:00:58.480 --> 0:01:01.360 natural language processing and go into the history of these 0:01:01.400 --> 0:01:05.920 actual voice assistants. So, on a high level, what is 0:01:06.120 --> 0:01:11.080 natural language processing? Well, simply put, it's programming a machine 0:01:11.160 --> 0:01:14.720 to interpret language the way we human beings use it. 0:01:14.840 --> 0:01:19.640 So in an ideal implementation, which would also require advanced 0:01:19.720 --> 0:01:23.680 artificial intelligence, you could speak to a machine or type 0:01:23.720 --> 0:01:25.760 whatever you like into a terminal and it would be 0:01:25.800 --> 0:01:29.080 able to understand what you meant. 
What your commands were, 0:01:29.200 --> 0:01:32.800 no matter how you worded the phrase. In turn, the 0:01:32.880 --> 0:01:36.440 machine would be able to generate responses that made linguistic 0:01:36.560 --> 0:01:39.959 sense to us, and we could in effect hold entire 0:01:40.080 --> 0:01:44.840 conversations with those machines. This, as it turns out, is 0:01:44.880 --> 0:01:49.000 a very difficult challenge. Even creating a machine that can 0:01:49.040 --> 0:01:52.560 respond to basic commands delivered in a natural language is 0:01:52.720 --> 0:01:56.080 really, really hard to do, and we haven't yet cracked 0:01:56.240 --> 0:02:00.520 the nut on making a machine that can actually hold 0:02:00.560 --> 0:02:04.040 a real conversation with us. Yet we can sometimes forget 0:02:04.520 --> 0:02:09.520 that machines do not natively understand human language. Machines process 0:02:09.600 --> 0:02:13.639 information in machine code, which is difficult for humans to understand. 0:02:14.120 --> 0:02:17.480 I almost said impossible for humans to understand, but really 0:02:17.880 --> 0:02:22.600 it's just impractical. It's incredibly difficult. So, for example, computers 0:02:22.600 --> 0:02:26.639 that run on binary systems process all information in zeros 0:02:26.760 --> 0:02:29.840 and ones, ultimately, when you get down to it. So 0:02:29.880 --> 0:02:31.880 if you were to look at a sheet of zeros 0:02:31.919 --> 0:02:36.280 and ones, it would probably seem completely incomprehensible to you, 0:02:36.440 --> 0:02:40.560 although to a computer it could seem perfectly logical. Our 0:02:40.680 --> 0:02:46.000 language is equally incomprehensible to machines. Programming languages make it 0:02:46.080 --> 0:02:49.079 easier for humans to make machines do what we want 0:02:49.160 --> 0:02:52.960 them to do. Programming languages create a level of abstraction 0:02:53.200 --> 0:02:56.200 between human language and machine language. 
It's kind of a 0:02:56.600 --> 0:02:59.600 meeting ground in the middle. Programming languages tend to be 0:02:59.720 --> 0:03:05.079 highly structured, with specific, strict sets of rules. Programming within 0:03:05.160 --> 0:03:08.200 those rules will get you the results you want, assuming 0:03:08.360 --> 0:03:11.960 your code is good, but if you stray outside those rules, 0:03:12.160 --> 0:03:15.359 you start to get errors. Human language is much more 0:03:15.440 --> 0:03:20.200 variable and complicated and ambiguous, and that's something that machines 0:03:20.200 --> 0:03:22.880 are not very good at handling. Now, if you've ever 0:03:22.880 --> 0:03:26.600 played a text-based adventure from way back in the day, 0:03:26.639 --> 0:03:29.800 like Zork, you know that those adventure games have a 0:03:29.880 --> 0:03:34.080 very limited vocabulary. The game can accept certain commands, but 0:03:34.200 --> 0:03:37.200 only because the programmer built the option into the game. 0:03:37.280 --> 0:03:40.880 They incorporated that in the game's design. So you might 0:03:40.920 --> 0:03:44.200 be able to type something like go north or just north, 0:03:44.280 --> 0:03:46.840 and the game understands you want your character to move 0:03:46.880 --> 0:03:49.240 to a new location that's to the north of your 0:03:49.240 --> 0:03:52.480 current location. But maybe you type something else, maybe you 0:03:52.520 --> 0:03:57.120 type jog north or saunter north, and the programmer didn't 0:03:57.160 --> 0:03:58.880 think of that. They didn't come up with all the 0:03:58.920 --> 0:04:01.560 different ways you could describe the way you want to 0:04:01.640 --> 0:04:04.240 move north, so you might get a result that says 0:04:04.280 --> 0:04:07.440 something like I didn't understand that, or you can't do 0:04:07.480 --> 0:04:12.360 that here. Computers only have the illusion of understanding us. 
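To make the text adventure point concrete, here is a rough sketch of a fixed-vocabulary command parser. It is not how Zork itself was written; the verb list, the direction list, and the reply strings are all made up for illustration.

```python
# Hypothetical vocabulary: the only verbs and directions this toy game knows.
MOVE_VERBS = {"go", "walk"}
DIRECTIONS = {"north", "south", "east", "west"}


def parse_command(text):
    """Return a reply for a typed command, accepting only built-in phrasings."""
    words = text.lower().split()
    # A bare direction such as "north" counts as a move.
    if len(words) == 1 and words[0] in DIRECTIONS:
        return f"You move {words[0]}."
    # A known verb followed by a direction, such as "go north".
    if len(words) == 2 and words[0] in MOVE_VERBS and words[1] in DIRECTIONS:
        return f"You move {words[1]}."
    # Anything the programmer did not anticipate falls through to here.
    return "I didn't understand that."
```

With this sketch, "go north" and "north" both move the character, while "saunter north" falls through to the error reply, because "saunter" was never added to the verb list.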
0:04:12.400 --> 0:04:15.720 They don't actually know what we mean when we say something, 0:04:15.760 --> 0:04:19.599 at least not natively. Now, that meant that for most 0:04:19.640 --> 0:04:22.640 of our history with computers, humans have had to learn 0:04:22.720 --> 0:04:25.560 how to work with machines, not the other way around. 0:04:26.000 --> 0:04:30.719 We have had to learn commands and syntax that machines accept, 0:04:31.120 --> 0:04:32.960 and if we try to word those commands in a 0:04:33.000 --> 0:04:36.760 different way, we tend to get an error. Natural language 0:04:36.760 --> 0:04:40.000 processing attempts to flip the tables on this relationship and 0:04:40.000 --> 0:04:43.039 teach machines how to work with humans so that we 0:04:43.080 --> 0:04:45.599 don't have to go through any sort of learning curve. 0:04:45.640 --> 0:04:48.960 We don't need to formulate our commands in a 0:04:49.000 --> 0:04:53.360 specific way to be understood. The technology works on our terms, 0:04:53.640 --> 0:04:56.640 or as close to those as we can manage. That 0:04:56.720 --> 0:04:59.800 means that programmers have to build systems that can parse 0:05:00.160 --> 0:05:03.680 language for meaning, and it also means having to build 0:05:03.760 --> 0:05:07.160 tools and machines that can handle stuff that you typically 0:05:07.240 --> 0:05:11.600 encounter in higher level language courses. So here's a quick 0:05:11.720 --> 0:05:16.480 rundown of some of the stuff a natural language processing 0:05:16.480 --> 0:05:21.000 approach has to take into account. First, you have grammar. 
Now, 0:05:21.000 --> 0:05:25.120 grammar can refer to the study of language, but generally speaking, 0:05:25.120 --> 0:05:27.200 when we say grammar, or at least when I'm using 0:05:27.240 --> 0:05:30.640 the term in the context of natural language processing, I 0:05:30.680 --> 0:05:35.320 mean a set of rules for the organization of components 0:05:35.360 --> 0:05:39.760 of a language into meaningful statements or sentences. This is 0:05:39.800 --> 0:05:43.520 a broad concept. It is a big, big idea. It 0:05:43.560 --> 0:05:47.479 actually encompasses a couple of other big ideas that 0:05:47.520 --> 0:05:50.880 are important in natural language processing. One of those is 0:05:50.920 --> 0:05:56.400 the concept of morphology. Morphology has to do with word forms. 0:05:57.240 --> 0:06:01.080 Words consist of morphemes, and a word can actually 0:06:01.120 --> 0:06:04.599 have multiple morphemes. So, for example, let's take a 0:06:04.640 --> 0:06:10.080 word like skydivers. Skydivers technically has four morphemes, 0:06:10.120 --> 0:06:16.840 and they are sky, dive, er, and s. Skydivers. 0:06:16.880 --> 0:06:20.080 The morphemes only make sense if we put them 0:06:20.120 --> 0:06:24.760 in that particular order for the word skydivers. Dive-sky-ers 0:06:24.839 --> 0:06:27.760 does not mean the same thing. Actually, it doesn't mean 0:06:27.880 --> 0:06:30.200 anything at all. So a good system will have to 0:06:30.240 --> 0:06:34.200 understand morphology and know how words can and cannot be formed. 0:06:34.600 --> 0:06:38.039 So again, with skydivers, it knows, all right, well, I 0:06:38.200 --> 0:06:40.320 know the word sky, I know what that means. I 0:06:40.360 --> 0:06:43.279 know what the word dive means. Er means that this 0:06:43.360 --> 0:06:47.040 is not an action. This is actually an entity that 0:06:47.160 --> 0:06:50.599 engages in that action. Right. 
A skydiver is someone 0:06:50.640 --> 0:06:54.919 who skydives, and the s says it's plural, so 0:06:54.960 --> 0:06:59.200 there's more than one skydiver. That's what morphology is 0:06:59.240 --> 0:07:02.880 all about. This is the sort of internal logic of 0:07:02.920 --> 0:07:09.240 word formation. Syntax is another big concept within grammar. Syntax, however, 0:07:09.320 --> 0:07:13.560 does not refer to word formation. It refers to sentence structure. 0:07:13.600 --> 0:07:18.680 How do we arrange words to make meaningful sentences? For example, 0:07:18.880 --> 0:07:23.200 the sentence you must have patience, my young Padawan. That 0:07:23.240 --> 0:07:27.560 follows good syntax, but patience you must have, my young 0:07:27.640 --> 0:07:31.360 Padawan, is a bit wonky because Yoda is all over 0:07:31.400 --> 0:07:35.760 the place with his syntax. In addition to grammar, you 0:07:35.840 --> 0:07:39.240 also have to take into account semantics. Now, that is 0:07:39.280 --> 0:07:43.240 the study of the meaning within language. This is a 0:07:43.240 --> 0:07:46.160 tricky one because there's a lot to unwrap here. For example, 0:07:46.480 --> 0:07:50.440 words and phrases can actually stand for different meanings. They 0:07:50.440 --> 0:07:54.960 can denote different ideas. We might use many different phrases 0:07:55.120 --> 0:07:58.320 or words to describe the same concept. Right? So we 0:07:58.400 --> 0:08:02.320 might use a dozen or more different ways to say 0:08:02.360 --> 0:08:05.840 the same thing, or we might use two similar words 0:08:05.960 --> 0:08:09.240 or phrases to describe very different concepts. We might even 0:08:09.360 --> 0:08:13.880 use the same phrase to describe wildly different things or 0:08:13.920 --> 0:08:16.840 with very different meanings. Semantics gets down to what we 0:08:16.880 --> 0:08:20.320 actually mean when we say something. 
If you've ever had 0:08:20.360 --> 0:08:23.920 a discussion with someone and that person says, you know 0:08:24.000 --> 0:08:27.800 what I meant, that's essentially a statement that indicates semantically 0:08:28.280 --> 0:08:31.800 the meaning was clear, even if the phrasing did not 0:08:32.000 --> 0:08:35.800 indicate it on the face of things. Then there is 0:08:35.880 --> 0:08:41.600 pragmatics. That's all about context. Contextual information is incredibly important 0:08:41.600 --> 0:08:45.240 in communication, and it relates a little bit to semantics. 0:08:45.320 --> 0:08:50.000 Semantics is about meaning, and pragmatics is about context. So 0:08:50.040 --> 0:08:53.920 if I say the weather sure is nice today, on 0:08:54.080 --> 0:08:55.880 the face of it, that sounds like I'm in favor 0:08:56.080 --> 0:08:58.520 of the way the weather is. Right? It sounds like, oh, 0:08:58.640 --> 0:09:01.120 I like how the weather is. But if I say 0:09:01.120 --> 0:09:04.800 that same phrase while I'm standing in a downpour and 0:09:04.880 --> 0:09:08.960 I'm clearly not happy, I'm obviously being sarcastic. I mean 0:09:09.000 --> 0:09:12.600 the opposite of what I actually said. The context of 0:09:12.640 --> 0:09:16.240 the situation changes the meaning of what I am saying, 0:09:16.600 --> 0:09:19.839 even though the actual phrasing would seem to indicate the 0:09:19.920 --> 0:09:23.959 opposite of what my meaning was. As we develop more 0:09:24.000 --> 0:09:26.600 technology that can communicate with us, we have to take 0:09:26.600 --> 0:09:30.120 pragmatics into consideration, or else machines are going to be 0:09:30.160 --> 0:09:34.080 misinterpreting what we actually mean when we say stuff. So 0:09:34.320 --> 0:09:36.160 machines are going to have to learn how to deal 0:09:36.200 --> 0:09:41.280 with stuff like sarcasm. Yeah. Right. Then we have phonology, 0:09:41.400 --> 0:09:44.680 that is the sound of a language. 
I talked a 0:09:44.679 --> 0:09:48.000 little bit about this in the Speech Recognition podcast, about 0:09:48.000 --> 0:09:51.000 how different languages have different phonemes. So I'm not going 0:09:51.040 --> 0:09:52.960 to dwell on that again. You can listen to the 0:09:53.000 --> 0:09:56.200 Speech Recognition podcast to learn more about it. But it 0:09:56.320 --> 0:09:59.439 is an important element in languages, especially when you get 0:09:59.480 --> 0:10:05.000 into natural language processing that is taking verbal input 0:10:05.120 --> 0:10:09.520 and not just textual input. Then you have lexicons. That's 0:10:09.559 --> 0:10:14.240 the total vocabulary for a system. Ideally, a lexicon has not 0:10:14.360 --> 0:10:18.240 just the words, but some sort of metadata attached that 0:10:18.360 --> 0:10:22.000 indicates the meaning of words or the relationship of words 0:10:22.080 --> 0:10:24.760 with one another, though you can fudge this a little 0:10:24.760 --> 0:10:27.280 bit depending upon the implementation of the system. I'll talk 0:10:27.320 --> 0:10:30.719 a lot more about that throughout these podcasts. Now, these 0:10:30.760 --> 0:10:34.840 can be tricky concepts for human beings, let alone for machines. 0:10:35.160 --> 0:10:39.640 Machines are very good at following strict sets of instructions, 0:10:40.120 --> 0:10:43.760 but language can sometimes defy logic. Think of rules that 0:10:43.840 --> 0:10:47.960 apply to your native language, then just think of the 0:10:48.000 --> 0:10:52.040 exceptions that exist to those rules. Every language has exceptions 0:10:52.080 --> 0:10:55.520 for rules that are established, and depending upon the rule 0:10:55.679 --> 0:10:58.160 and the exception, there may seem to be no rhyme 0:10:58.440 --> 0:11:01.600 or reason for the deviation from the rule. 
Moreover, 0:11:02.240 --> 0:11:05.480 if we want machines that are capable of understanding us 0:11:05.640 --> 0:11:08.680 and responding to our language in a meaningful way, those 0:11:08.720 --> 0:11:12.040 machines need to be able to handle the idiosyncrasies of 0:11:12.120 --> 0:11:16.360 individual speakers, to some extent. There may be regional turns 0:11:16.400 --> 0:11:19.880 of phrase or vocabulary that don't extend to the general 0:11:19.920 --> 0:11:24.199 population of speakers of the respective language. So you might 0:11:24.440 --> 0:11:29.680 encounter a person who speaks in local idioms quite a bit, 0:11:30.320 --> 0:11:33.520 and if those are not frequently used in the broader 0:11:33.559 --> 0:11:37.320 general population of that language, then you're gonna have a 0:11:37.320 --> 0:11:40.680 lot of communication errors between that person and a machine 0:11:40.800 --> 0:11:44.880 that is trying to process that language. Ideally, machines would 0:11:44.880 --> 0:11:48.520 be able to understand whatever we say and interpret the 0:11:48.600 --> 0:11:52.360 meaning correctly, although we haven't even gotten to a world 0:11:52.360 --> 0:11:54.920 where human beings can do that reliably, so I don't 0:11:54.920 --> 0:11:57.360 know why I'm holding machines up to such a high standard. 0:11:57.600 --> 0:11:59.960 We definitely would want them to reach a certain level 0:12:00.200 --> 0:12:05.319 of confidence and capability. However, machines just are 0:12:05.360 --> 0:12:09.200 not quite there yet. I'm going to talk a lot 0:12:09.360 --> 0:12:13.640 more about the history of natural language processing in just 0:12:13.679 --> 0:12:16.680 a moment, but first let's take a quick break to 0:12:16.800 --> 0:12:27.960 thank our sponsor. The history of natural language processing is 0:12:28.120 --> 0:12:32.920 pretty darn complicated because it involves multiple lines of research 0:12:33.120 --> 0:12:37.559 and lots of different disciplines. 
So we have all sorts 0:12:37.559 --> 0:12:40.480 of things that play into this, like hidden Markov models, 0:12:40.520 --> 0:12:45.000 I talked about those in the Speech Recognition podcast, neural networks, 0:12:45.360 --> 0:12:50.239 representing language using mathematical vectors, and a lot more contributing 0:12:50.240 --> 0:12:53.240 to the evolution of natural language processing, and a lot 0:12:53.280 --> 0:12:58.359 of disciplines, not just computer science, but linguistics and psychology. 0:12:58.520 --> 0:13:02.880 So there's not a single line I can follow 0:13:03.240 --> 0:13:07.040 where it's A led to B led to C. So 0:13:07.080 --> 0:13:10.240 we're gonna be jumping around a little bit. However, one 0:13:10.280 --> 0:13:12.160 of the sources I want to call out that I 0:13:12.240 --> 0:13:15.040 used while I was researching this episode was a paper 0:13:15.080 --> 0:13:20.160 written by Karen Spärck Jones called Natural Language Processing: A 0:13:20.240 --> 0:13:24.320 Historical Review. It's pretty dense, it's pretty technical, but it's 0:13:24.360 --> 0:13:26.800 also available to read online if you want a more 0:13:26.840 --> 0:13:29.840 thorough treatment of the history of the technology up to 0:13:29.880 --> 0:13:32.400 two thousand. I'm gonna be skimming over quite a bit 0:13:32.440 --> 0:13:35.000 of it because, as I say, it gets really deep 0:13:35.040 --> 0:13:38.200 and really technical, and it uses a lot of shorthand 0:13:38.240 --> 0:13:40.600 to reference things, which meant that I had to do 0:13:40.640 --> 0:13:43.800 a lot of jumping down research rabbit holes to learn more. 0:13:43.840 --> 0:13:47.960 But it was a very useful starting point for this research. 0:13:48.440 --> 0:13:51.040 And also it was published in two thousand one. Obviously 0:13:51.480 --> 0:13:54.320 a lot has happened since then. We're almost two decades 0:13:54.400 --> 0:13:58.280 out from that. 
But I'm gonna start at the beginning 0:13:58.320 --> 0:14:01.360 and then work my way up to what's going on today. 0:14:01.400 --> 0:14:05.320 So, early work in natural language processing actually surprised me. 0:14:05.360 --> 0:14:07.640 I was surprised at how old it was. It actually 0:14:07.720 --> 0:14:11.079 dates all the way back to the nineteen forties. Physicist 0:14:11.120 --> 0:14:15.360 and computer scientist Andrew Donald Booth proposed using computers to 0:14:15.400 --> 0:14:19.360 translate passages from one language into another, which is a 0:14:19.400 --> 0:14:21.640 type of natural language processing. You have to be able 0:14:21.640 --> 0:14:25.200 to recognize the words of one language and then map 0:14:25.320 --> 0:14:28.800 them to a similar meaning in a different language. Now, 0:14:28.840 --> 0:14:33.000 Booth's approach involved creating a word for word model. If 0:14:33.000 --> 0:14:36.440 the model couldn't find a match between two words, it 0:14:36.480 --> 0:14:40.440 would automatically discard the last letter of the input word 0:14:40.760 --> 0:14:43.840 and try again. It would do this until it found 0:14:43.840 --> 0:14:45.840 a match, or if it didn't find a match, you 0:14:45.840 --> 0:14:48.720 got an error. But if it did find a match, it 0:14:48.760 --> 0:14:51.080 would search its memory to see if the ending of 0:14:51.120 --> 0:14:54.320 the input word could give information about what the ending 0:14:54.440 --> 0:14:57.920 does to the meaning of the word. So, for example, 0:14:58.120 --> 0:15:01.240 if you were using this to translate from English 0:15:01.320 --> 0:15:06.680 into Russian and you use the word writer, maybe writer 0:15:07.040 --> 0:15:11.240 does not show up in the Russian lexicon, but write 0:15:11.720 --> 0:15:17.840 does, w-r-i-t-e. 
So the translating program 0:15:17.880 --> 0:15:21.800 tries to translate writer from English into Russian, cannot find 0:15:21.840 --> 0:15:25.760 a Russian equivalent to writer, drops the r, looks for 0:15:25.800 --> 0:15:28.440 the Russian word for write, and it finds it. Then it says, 0:15:28.480 --> 0:15:31.640 all right, well, in English, what does writer mean? What 0:15:31.680 --> 0:15:36.200 does that r do to the word write? And it 0:15:36.240 --> 0:15:38.480 looks at its memory and finds out that the letter 0:15:38.720 --> 0:15:43.040 r makes a noun out of the verb, that 0:15:43.240 --> 0:15:47.160 it creates an entity that does the action, which is 0:15:47.400 --> 0:15:51.280 to write. Then it looks in the Russian lexicon and says, 0:15:51.520 --> 0:15:54.720 all right, well, is there a word in that lexicon 0:15:55.160 --> 0:15:59.400 that matches this meaning? It's kind of a slow, laborious 0:15:59.480 --> 0:16:02.240 way of doing things, but it was also very, very early. 0:16:02.280 --> 0:16:07.360 I mean, this was the nineteen forties. Then, in nineteen forty nine, Warren 0:16:07.440 --> 0:16:12.160 Weaver produced a memorandum about machine translation, and Weaver admitted 0:16:12.200 --> 0:16:14.840 in the memorandum that such an application would likely be 0:16:14.960 --> 0:16:17.760 much more challenging than what he understood it to be, 0:16:18.360 --> 0:16:21.560 but that he was, quote, willing to expose my ignorance, 0:16:21.600 --> 0:16:24.920 hoping that it will be slightly shielded by my intentions, end 0:16:25.000 --> 0:16:28.640 quote. And I think that's rather charming. In that memo, 0:16:29.080 --> 0:16:32.560 Weaver cites a letter he wrote to Professor Norbert Wiener 0:16:32.680 --> 0:16:36.800 of MIT, and that included the following paragraph. 0:16:36.880 --> 0:16:40.600 So here's a full paragraph. 
Actually it's two paragraphs from 0:16:40.640 --> 0:16:45.920 the memorandum. Recognizing fully, even though necessarily vaguely, the semantic 0:16:46.000 --> 0:16:50.400 difficulties because of multiple meanings, etcetera, I have wondered if 0:16:50.440 --> 0:16:54.119 it were unthinkable to design a computer which would translate. 0:16:54.520 --> 0:16:58.360 Even if it would translate only scientific material, where 0:16:58.360 --> 0:17:02.280 the semantic difficulties are very notably less, and even if 0:17:02.320 --> 0:17:06.440 it did produce an inelegant but intelligible result, it would 0:17:06.440 --> 0:17:10.960 seem to me worthwhile. Also, knowing nothing official about, but 0:17:11.280 --> 0:17:16.199 having guessed and inferred considerable about, powerful new mechanized methods 0:17:16.200 --> 0:17:20.040 in cryptography, methods which I believe succeed even when one 0:17:20.080 --> 0:17:23.560 does not know what language has been coded, one naturally 0:17:23.640 --> 0:17:27.400 wonders if the problem of translation could conceivably be treated 0:17:27.440 --> 0:17:30.600 as a problem in cryptography. When I look at an 0:17:30.680 --> 0:17:34.040 article in Russian, I say, this is really written in English, 0:17:34.119 --> 0:17:36.960 but it has been coded in some strange symbols. I 0:17:37.000 --> 0:17:40.439 will now proceed to decode. So he got this idea 0:17:40.480 --> 0:17:43.720 because of activities that were going on in World War Two, 0:17:44.240 --> 0:17:48.240 where teams were trying to decode messages. And they might 0:17:48.400 --> 0:17:52.520 decode the message, they might figure out what letters correspond 0:17:52.600 --> 0:17:55.159 to the code, but it may even be in a 0:17:55.160 --> 0:17:58.400 totally different language than the one they speak. So while they 0:17:58.400 --> 0:18:01.680 are able to decode the message into its native language, 0:18:01.720 --> 0:18:04.320 they are not able to speak that language. 
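Stepping back to Booth's word for word lookup for a moment, the drop-a-letter-and-retry procedure can be sketched in a few lines. This is only an illustration under assumptions: the one-entry dictionary, the table of endings, and the transliterated Russian word are toy data, not Booth's actual dictionary or notation.

```python
# Hypothetical toy data; not Booth's actual dictionary or notation.
STEMS = {"write": "pisat'"}  # rough transliteration of the Russian infinitive
ENDINGS = {"r": "agent noun: one who performs the action"}


def lookup(word):
    """Strip trailing letters until a stem matches, then explain the ending."""
    for cut in range(len(word), 0, -1):
        stem, ending = word[:cut], word[cut:]
        if stem in STEMS:
            # ENDINGS.get returns None when the ending is empty or unknown.
            return STEMS[stem], ENDINGS.get(ending)
    # No stem matched even after discarding every letter but the first.
    raise KeyError(f"no dictionary entry for {word!r}")
```

Looking up "writer" fails on the full word, succeeds on "write", and reports what the leftover "r" does to the meaning. Finding the Russian word that carries that meaning would still be a separate step.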
He says, well, 0:18:04.320 --> 0:18:07.000 what if we just take that same step, and now 0:18:07.040 --> 0:18:10.399 we treat the other language as a code in and of 0:18:10.480 --> 0:18:13.320 itself and try to translate that into English, or 0:18:13.560 --> 0:18:17.600 decrypt it into English. Weaver acknowledged that the word 0:18:17.800 --> 0:18:21.320 for word approach that Booth and his contemporaries were relying 0:18:21.400 --> 0:18:25.080 upon had limited utility. He wrote, quote, it is in 0:18:25.160 --> 0:18:28.639 fact amply clear that a translation procedure that does little 0:18:28.640 --> 0:18:31.200 more than handle a one to one correspondence of words 0:18:31.520 --> 0:18:35.440 cannot hope to be useful for problems of literary translation 0:18:35.760 --> 0:18:38.680 in which style is important, and in which the problems 0:18:38.720 --> 0:18:42.879 of idiom, multiple meanings, etcetera, are frequent. End quote. So 0:18:42.920 --> 0:18:46.679 there he's saying, you can't just take a foreign word, 0:18:47.160 --> 0:18:51.560 translate it into whatever the closest equivalent in English is, 0:18:52.080 --> 0:18:55.520 and hope to get the same meaning, especially in literary works, 0:18:55.640 --> 0:18:58.720 because there are all these different turns of phrase and 0:18:58.840 --> 0:19:03.199 cultural meanings that will get lost in that translation. You 0:19:03.240 --> 0:19:07.280 would have something that might technically be considered more or 0:19:07.359 --> 0:19:10.520 less correct, but would not be actually correct. You wouldn't 0:19:10.520 --> 0:19:14.840 be getting across the meaning of the author in that translation. 0:19:15.080 --> 0:19:19.800 You would just have words in a syntactical order that 0:19:19.880 --> 0:19:24.360 would make sense from a syntax perspective. 
In other words, 0:19:24.400 --> 0:19:28.600 you would have sentences that held up grammatically, but they 0:19:28.600 --> 0:19:33.240 wouldn't necessarily have the meaning of the original writing. Weaver's 0:19:33.240 --> 0:19:36.600 proposal was to perhaps expand the word for word model 0:19:36.680 --> 0:19:39.000 and create a system that would analyze not just the 0:19:39.040 --> 0:19:42.800 target word, but the words adjacent to the target in 0:19:42.920 --> 0:19:46.479 order to determine the context of the word, the meaning 0:19:46.680 --> 0:19:48.720 of the word. As we'll see when we get a 0:19:48.760 --> 0:19:51.359 little bit further down in the timeline, this is one 0:19:51.400 --> 0:19:54.520 of the methods that folks working in natural language 0:19:54.520 --> 0:19:58.080 processing incorporated into their approach. So this was incredibly forward 0:19:58.080 --> 0:20:02.800 thinking of Weaver. On January seven, nineteen fifty four, researchers from 0:20:02.800 --> 0:20:06.720 IBM and Georgetown University demonstrated a system that was able 0:20:06.760 --> 0:20:12.760 to translate around sixty sentences from Russian into English automatically. Now, 0:20:12.760 --> 0:20:16.359 the process wasn't exactly painless. It required an operator to 0:20:16.440 --> 0:20:19.800 take a sentence written in Russian but transcribed into the 0:20:19.800 --> 0:20:23.439 English alphabet. It wasn't in the Cyrillic alphabet. The person 0:20:23.520 --> 0:20:27.800 would then encode that sentence on punch cards. They would 0:20:27.800 --> 0:20:30.800 feed the punch cards into a seven oh one computer. 0:20:31.359 --> 0:20:34.480 The seven oh one was an IBM system, 0:20:34.520 --> 0:20:37.080 and I mentioned it in the previous episode on speech recognition. 0:20:37.359 --> 0:20:40.440 Then they would wait for the translation program's response, which 0:20:40.440 --> 0:20:43.000 would take a few seconds. 
The program would attempt to 0:20:43.040 --> 0:20:47.480 translate the words from Russian to English. The demonstration was impressive, 0:20:47.600 --> 0:20:50.480 but it was limited in scope. The program had a lexicon 0:20:50.560 --> 0:20:53.640 of only two hundred fifty words or so, and it required 0:20:53.680 --> 0:20:58.199 extensive programming to cope with syntax because word order in 0:20:58.320 --> 0:21:02.679 Russian is different than word order in English, and you 0:21:02.720 --> 0:21:06.560 can think of the programming as including metadata. The researchers 0:21:06.560 --> 0:21:11.000 would tag Russian words with little signs that related to 0:21:11.119 --> 0:21:14.480 specific rules. So, for example, one of the terms the 0:21:14.520 --> 0:21:18.760 system could translate was a Russian two word phrase. It 0:21:18.840 --> 0:21:25.200 was gyeneral mayor, and I'm butchering the Russian pronunciation, 0:21:25.280 --> 0:21:28.760 but in English it means major general. But the word 0:21:28.880 --> 0:21:32.119 order is reversed in Russian. If you did a strict 0:21:32.160 --> 0:21:35.760 word to word translation, you would get general major as 0:21:35.840 --> 0:21:39.600 the translation, because that's the order that the Russian phrase 0:21:39.600 --> 0:21:42.560 would put it in. So the programmers would tag each 0:21:42.600 --> 0:21:45.879 word with a rule to kind of give the idea 0:21:45.920 --> 0:21:48.880 of what you should follow 0:21:48.920 --> 0:21:51.520 when you're making these translations, and by you I mean 0:21:51.720 --> 0:21:55.359 the computer system. So the word for general got the 0:21:55.400 --> 0:22:00.679 assignment of rule twenty one and the word for major 0:22:01.200 --> 0:22:04.879 got assigned rule one ten. So when the system encountered a word, 0:22:05.240 --> 0:22:08.320 it would look up any rules related to that word. 
0:22:08.720 --> 0:22:11.200 So if it comes across a word that has the 0:22:11.240 --> 0:22:14.760 associated rule one ten, it would say, all right, this rule 0:22:14.800 --> 0:22:17.200 tells me I have to go back over the message 0:22:17.240 --> 0:22:19.240 and look to see if there was a rule twenty 0:22:19.280 --> 0:22:22.720 one word in that same phrase. And if it finds 0:22:22.760 --> 0:22:25.159 a rule twenty one word, it would then know I 0:22:25.240 --> 0:22:29.479 need to reverse the order of these two words. This 0:22:29.480 --> 0:22:33.080 word order that appears in Russian needs to 0:22:33.119 --> 0:22:36.280 be flipped for English. Now that's a pretty laborious process 0:22:36.840 --> 0:22:40.280 and it doesn't work great for larger lexicons. The larger 0:22:40.440 --> 0:22:43.879 the vocabulary, the more complex the sentences can become, the 0:22:43.920 --> 0:22:47.000 more exceptions and rules you're going to encounter. It would 0:22:47.040 --> 0:22:49.640 be really hard to implement this on a big scale, 0:22:49.680 --> 0:22:53.119 but it was an impressive display of machine translation. The 0:22:53.160 --> 0:22:56.440 system was essentially a vocabulary list and a long series 0:22:56.480 --> 0:23:00.919 of if then rules. If the word is this, then 0:23:01.040 --> 0:23:04.919 look for this. If that is there, then switch the 0:23:05.119 --> 0:23:09.720 word order, essentially. According to articles, it could translate sentences 0:23:09.760 --> 0:23:13.640 designed for the system in about six seconds. But again, 0:23:13.720 --> 0:23:17.640 those were designed for the system, with a very limited vocabulary, so 0:23:18.359 --> 0:23:21.639 limited implementation there. And it's good to point out that 0:23:21.680 --> 0:23:24.200 a lot of work in machine translation around this time 0:23:24.240 --> 0:23:27.919 focused on English and Russian, which is no big surprise. 
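The vocabulary list plus if then rules idea can be sketched roughly like this. It is a toy illustration, not the actual IBM and Georgetown program: the three-word glossary and its transliterated spellings are assumptions, and LOOK_BACK_RULE is a placeholder number chosen for this sketch. Only rule twenty one and the general and major example come from the description above.

```python
# Toy glossary: transliterated Russian word -> (English word, rule tag or None).
# The spellings and LOOK_BACK_RULE are assumptions made for this sketch.
SWAP_RULE = 21
LOOK_BACK_RULE = 110
GLOSSARY = {
    "gyeneral": ("general", SWAP_RULE),
    "mayor": ("major", LOOK_BACK_RULE),
    "pribyl": ("arrived", None),
}


def translate(sentence):
    """Word-for-word lookup, then apply the if-then reordering rule."""
    output = []
    tags = []
    for word in sentence.lower().split():
        english, tag = GLOSSARY[word]
        # If this word carries the look-back rule and the previous word
        # carries rule 21, then flip the two words in the output.
        if tag == LOOK_BACK_RULE and tags and tags[-1] == SWAP_RULE:
            output.insert(len(output) - 1, english)
            tags.insert(len(tags) - 1, tag)
        else:
            output.append(english)
            tags.append(tag)
    return " ".join(output)
```

Here "gyeneral mayor pribyl" comes out as "major general arrived". Every extra word-order quirk would need its own tag and its own branch, which is why this style gets hard to maintain as the vocabulary grows.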
Keep in mind the time scale. We're talking about the 0:23:30.800 --> 0:23:34.719 nineteen fifties here. The USA and the then USSR 0:23:34.880 --> 0:23:38.000 were not on great terms. Both countries were using pretty 0:23:38.080 --> 0:23:41.439 much every means at their disposal to analyze one another, 0:23:41.960 --> 0:23:44.919 to spy on one another, to maneuver to make certain 0:23:44.960 --> 0:23:47.800 the other nation didn't get a superior position. And we 0:23:47.800 --> 0:23:50.640 saw a lot of technological development during this period, including 0:23:50.680 --> 0:23:53.800 the space race, which was all wrapped up in this 0:23:53.880 --> 0:23:57.840 Cold War issue as well, and perhaps to no big surprise, 0:23:57.920 --> 0:24:00.520 the US government was pretty keen to fund research and 0:24:00.560 --> 0:24:03.919 development in machine translation, up to a point, that is. 0:24:04.240 --> 0:24:09.440 In nineteen sixty six, Joseph Weizenbaum published a computer program 0:24:09.480 --> 0:24:12.919 called Eliza. I've talked about Eliza in previous episodes of 0:24:12.960 --> 0:24:16.800 Tech Stuff. This was a primitive chatbot, a text-based 0:24:17.000 --> 0:24:22.119 chatbot. It mimicked a Rogerian psychotherapist. That's a discipline 0:24:22.160 --> 0:24:26.440 that was pioneered by the psychologist Carl Rogers. It's sometimes 0:24:26.480 --> 0:24:31.679 also called person-centered therapy. Eliza was strictly a text 0:24:31.680 --> 0:24:34.760 based terminal operation. You would see a line of text 0:24:34.800 --> 0:24:37.000 pop up. It would ask you how you're doing. 0:24:37.440 --> 0:24:39.080 You can type stuff in and then it would respond 0:24:39.119 --> 0:24:43.399 to you, so you would get responses that appeared 0:24:43.440 --> 0:24:46.400 to be semi intelligent. 
Typically it would be a question 0:24:46.440 --> 0:24:49.600 to ask for more information, or sometimes it would be 0:24:49.640 --> 0:24:52.600 a phrase to change the subject. So you might say 0:24:53.240 --> 0:24:56.640 something along the lines of I'm so angry right now, 0:24:56.880 --> 0:24:59.800 and Eliza might respond with what has made you angry? 0:25:00.320 --> 0:25:03.880