1 00:00:02,480 --> 00:00:15,760 Speaker 1: Bloomberg Audio Studios, Podcasts, Radio, News. 2 00:00:17,920 --> 00:00:21,200 Speaker 2: Hello and welcome to another episode of The Odd Lots podcast. 3 00:00:21,280 --> 00:00:24,439 Speaker 2: I'm Tracy Alloway and I'm Joe Weisenthal. Joe, you know, 4 00:00:24,640 --> 00:00:27,480 Speaker 2: I had a life realization recently. 5 00:00:27,720 --> 00:00:30,240 Speaker 3: Okay, this should be good, go on. 6 00:00:30,800 --> 00:00:34,440 Speaker 2: It struck me that I am spending a non-negligible 7 00:00:34,520 --> 00:00:37,440 Speaker 2: amount of my time proving that I am in fact 8 00:00:37,880 --> 00:00:38,680 Speaker 2: a human being. 9 00:00:39,120 --> 00:00:42,120 Speaker 3: It's getting harder and harder. I know what you're talking about. 10 00:00:42,159 --> 00:00:44,320 Speaker 3: So we're talking, you know, you go to a website 11 00:00:44,360 --> 00:00:46,160 Speaker 3: and you have to enter in the CAPTCHA, and it's 12 00:00:46,240 --> 00:00:49,600 Speaker 3: like, click all these squares that have like a crosswalk 13 00:00:49,640 --> 00:00:51,920 Speaker 3: on them or a truck, and like it feels like 14 00:00:51,960 --> 00:00:54,400 Speaker 3: it's just getting harder. And sometimes I'm like, no, trust me, 15 00:00:54,440 --> 00:00:54,960 Speaker 3: I'm a human. 16 00:00:55,760 --> 00:00:58,080 Speaker 2: This is it. And every time it happens, I kind 17 00:00:58,080 --> 00:01:01,040 Speaker 2: of have a moment of self-doubt about whether or not 18 00:01:02,000 --> 00:01:05,319 Speaker 2: it's just me. Am I particularly bad at picking 19 00:01:05,360 --> 00:01:08,720 Speaker 2: out all the motorcycles in a set of pictures? Or 20 00:01:08,800 --> 00:01:13,559 Speaker 2: are they just becoming increasingly weird, or perhaps increasingly sophisticated, 21 00:01:13,760 --> 00:01:16,679 Speaker 2: in the face of new types of technology? 22 00:01:17,040 --> 00:01:19,600 Speaker 3: It's not just you. I've heard this from multiple people. 23 00:01:19,800 --> 00:01:24,360 Speaker 3: In fact, prepping for this episode, I heard people talking 24 00:01:24,400 --> 00:01:27,280 Speaker 3: about exactly this. But you know, it's like a big problem. 25 00:01:27,319 --> 00:01:29,240 Speaker 3: You know, we did that Worldcoin episode. Like, everyone 26 00:01:29,400 --> 00:01:32,039 Speaker 3: is trying to figure out, like, how, in a world 27 00:01:32,080 --> 00:01:35,240 Speaker 3: of AI and bots and artificial intelligence and all that stuff, 28 00:01:35,560 --> 00:01:38,039 Speaker 3: how do you know whether someone you're interacting with is 29 00:01:38,080 --> 00:01:38,880 Speaker 3: in fact a person? 30 00:01:39,120 --> 00:01:43,039 Speaker 2: Yeah, and I'm glad you mentioned AI, because obviously part 31 00:01:43,080 --> 00:01:46,119 Speaker 2: of this dynamic is AI seems to be getting better 32 00:01:46,240 --> 00:01:50,560 Speaker 2: at solving these particular types of problems, but also they're 33 00:01:50,600 --> 00:01:54,720 Speaker 2: being used more, right, to train AI models. So at 34 00:01:54,720 --> 00:01:56,840 Speaker 2: this point, I think we all know why we're constantly 35 00:01:57,000 --> 00:02:00,640 Speaker 2: trying to identify bikes in a bunch of photos. But 36 00:02:00,960 --> 00:02:06,680 Speaker 2: the whole idea behind CAPTCHAs is, or was, that humans 37 00:02:06,720 --> 00:02:09,320 Speaker 2: still have an edge. So there are some things that 38 00:02:09,400 --> 00:02:13,480 Speaker 2: humans are better able to do versus machines.
And one 39 00:02:13,480 --> 00:02:15,400 Speaker 2: of the things that we used to talk about humans 40 00:02:15,440 --> 00:02:18,840 Speaker 2: having an edge in was linguistics. So there was this 41 00:02:18,919 --> 00:02:23,000 Speaker 2: idea that human language was so complex, so nuanced, that 42 00:02:23,080 --> 00:02:27,359 Speaker 2: machines would maybe never be able to fully appreciate all 43 00:02:27,360 --> 00:02:31,160 Speaker 2: the intricacies and subtleties of human language. But obviously, 44 00:02:31,200 --> 00:02:35,200 Speaker 2: since the arrival of generative AI and natural language processing, 45 00:02:35,560 --> 00:02:38,640 Speaker 2: I think there's more of a question mark around that. 46 00:02:38,720 --> 00:02:41,440 Speaker 3: Yeah. I mean, look, I think, like, a typical chatbot 47 00:02:41,520 --> 00:02:44,080 Speaker 3: right now is probably better than most people at just 48 00:02:44,200 --> 00:02:47,280 Speaker 3: typing out several paragraphs. It all sort of seems, 49 00:02:47,280 --> 00:02:48,880 Speaker 3: as they say on the internet, kind 50 00:02:48,919 --> 00:02:51,080 Speaker 3: of mid-curve to me. It never, like, strikes me 51 00:02:51,120 --> 00:02:55,919 Speaker 3: as, like, incredibly intelligent, but clearly computers can talk about 52 00:02:55,919 --> 00:02:58,480 Speaker 3: as well as humans, and so it raises all sorts 53 00:02:58,480 --> 00:03:01,320 Speaker 3: of interesting questions. You mentioned that CAPTCHAs are 54 00:03:01,400 --> 00:03:04,200 Speaker 3: part of this, like, training of computers. A big part of 55 00:03:04,240 --> 00:03:07,440 Speaker 3: these chatbots is the so-called reinforcement learning from human feedback, 56 00:03:07,480 --> 00:03:09,680 Speaker 3: where people say this answer is better than another, this 57 00:03:09,720 --> 00:03:12,240 Speaker 3: answer is better than another, as they refine the models, et cetera. 58 00:03:12,720 --> 00:03:13,840 Speaker 3: So I think there's, like, 59 00:03:13,800 --> 00:03:16,920 Speaker 3: an interesting moment where, like, we're learning from computers and 60 00:03:16,960 --> 00:03:21,720 Speaker 3: computers are learning from us, maybe collaboratively, the two sides, 61 00:03:22,240 --> 00:03:25,120 Speaker 3: carbon and silicon, working together. 62 00:03:25,680 --> 00:03:27,560 Speaker 2: I think that's a great way of putting it. Also, 63 00:03:27,800 --> 00:03:31,880 Speaker 2: mid-curve is such an underappreciated insult. Like, calling people 64 00:03:31,960 --> 00:03:34,760 Speaker 2: top of the bell curve is one of my favorite 65 00:03:34,760 --> 00:03:37,320 Speaker 2: things to do online. Anyway, I am very pleased to 66 00:03:37,400 --> 00:03:41,240 Speaker 2: say that today we actually have the perfect guest. We're 67 00:03:41,280 --> 00:03:45,680 Speaker 2: going to be speaking to someone who was very instrumental 68 00:03:45,800 --> 00:03:49,440 Speaker 2: in the development of things like CAPTCHA, and someone who 69 00:03:49,520 --> 00:03:53,440 Speaker 2: is doing a lot with AI right now, particularly in the field 70 00:03:53,600 --> 00:03:56,800 Speaker 2: of linguistics and language. We're going to be 71 00:03:56,800 --> 00:03:59,440 Speaker 2: speaking with Luis von Ahn. He is, of course, the 72 00:03:59,560 --> 00:04:02,920 Speaker 2: CEO and co-founder of Duolingo. So, Luis, thank 73 00:04:03,000 --> 00:04:04,200 Speaker 2: you so much for coming on Odd Lots. 74 00:04:04,040 --> 00:04:06,119 Speaker 4: Thank you, thank you for having me.
75 00:04:06,800 --> 00:04:09,680 Speaker 2: So maybe to begin with, talk to us about the 76 00:04:09,800 --> 00:04:14,160 Speaker 2: idea behind CAPTCHA and why it seems to have become, 77 00:04:14,320 --> 00:04:17,039 Speaker 2: I don't want to say, a significant portion of my life, 78 00:04:17,080 --> 00:04:20,080 Speaker 2: but I certainly spend a couple minutes every day doing 79 00:04:20,080 --> 00:04:21,000 Speaker 2: at least one version. 80 00:04:21,680 --> 00:04:24,359 Speaker 4: Yeah. So the original CAPTCHA, the idea of a CAPTCHA, 81 00:04:24,560 --> 00:04:28,279 Speaker 4: was a test to distinguish humans from computers. There are reasons 82 00:04:28,320 --> 00:04:31,120 Speaker 4: why you may want to distinguish whether you're interacting with 83 00:04:31,160 --> 00:04:34,000 Speaker 4: a human or a computer online. For example, and this 84 00:04:34,120 --> 00:04:37,240 Speaker 4: is kind of the original motivation for it, companies offer 85 00:04:37,279 --> 00:04:40,359 Speaker 4: free email services, and, you know, they have the problem 86 00:04:40,440 --> 00:04:43,599 Speaker 4: that if you allow anything to sign up for a 87 00:04:43,600 --> 00:04:46,920 Speaker 4: free email service, either a computer or a human, somebody could 88 00:04:46,960 --> 00:04:49,560 Speaker 4: write a program to obtain millions of free email accounts, 89 00:04:49,880 --> 00:04:53,919 Speaker 4: whereas humans, because they are usually not that patient, cannot 90 00:04:54,240 --> 00:04:56,520 Speaker 4: get millions of email accounts for themselves. They can only 91 00:04:56,520 --> 00:05:00,360 Speaker 4: get one or two. So the original motivation for CAPTCHA 92 00:05:00,440 --> 00:05:02,359 Speaker 4: was to make a test to make sure that whoever 93 00:05:02,640 --> 00:05:04,760 Speaker 4: is getting a free email account is actually a human and 94 00:05:04,800 --> 00:05:07,760 Speaker 4: not a computer program that was written to obtain millions 95 00:05:07,760 --> 00:05:11,000 Speaker 4: of email accounts. And the way it worked, 96 00:05:11,120 --> 00:05:13,400 Speaker 4: there are many kinds of tests. Originally, the way it 97 00:05:13,440 --> 00:05:16,560 Speaker 4: worked is distorted letters. So you would get a bunch 98 00:05:16,600 --> 00:05:18,800 Speaker 4: of letters that were distorted and you had to type 99 00:05:18,800 --> 00:05:21,640 Speaker 4: what they were. And the reason that worked is because 100 00:05:22,240 --> 00:05:25,560 Speaker 4: human beings are very good at reading distorted letters. But at 101 00:05:25,600 --> 00:05:27,720 Speaker 4: the time, and this was, you know, more than twenty years ago, 102 00:05:28,000 --> 00:05:31,720 Speaker 4: computers just could not recognize distorted letters very well. So 103 00:05:31,760 --> 00:05:34,280 Speaker 4: that was a great test to determine whether you were 104 00:05:34,279 --> 00:05:36,880 Speaker 4: talking to a human or a computer. But what happened 105 00:05:36,880 --> 00:05:40,880 Speaker 4: is, over time, computers got quite good at 106 00:05:40,920 --> 00:05:45,919 Speaker 4: deciphering distorted text, so it was no longer possible 107 00:05:45,960 --> 00:05:48,640 Speaker 4: to give an image with distorted text and distinguish a 108 00:05:48,680 --> 00:05:50,840 Speaker 4: human from a computer, because computers pretty much got as 109 00:05:50,880 --> 00:05:54,360 Speaker 4: good as a human. At that point, these tests started 110 00:05:54,520 --> 00:05:56,480 Speaker 4: changing to other things.
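To make the distorted-letters test concrete, here is a minimal sketch of how such a challenge could be generated and graded, assuming the Pillow imaging library is available. The function name, distortion choices, and parameters are assumptions for illustration, not the original CAPTCHA implementation:

```python
# Minimal sketch of a distorted-text challenge (illustrative only; not the
# original CAPTCHA code). Assumes Pillow is installed: pip install Pillow
import random
import string
from PIL import Image, ImageDraw, ImageFilter, ImageFont

def make_challenge(n_chars=6):
    """Return (answer, image). The server knows `answer` because it
    generated it, so it can grade a reply without solving the image."""
    answer = "".join(random.choices(string.ascii_uppercase, k=n_chars))
    img = Image.new("L", (40 * n_chars, 60), color=255)  # white canvas
    font = ImageFont.load_default()
    for i, ch in enumerate(answer):
        # Draw each letter on its own tile and rotate it a random amount,
        # so every character is individually distorted.
        tile = Image.new("L", (40, 60), color=255)
        ImageDraw.Draw(tile).text((12, 22), ch, fill=0, font=font)
        img.paste(tile.rotate(random.uniform(-35, 35), fillcolor=255), (40 * i, 0))
    draw = ImageDraw.Draw(img)
    for _ in range(4):  # noise lines to defeat naive character segmentation
        draw.line([(random.randrange(img.width), random.randrange(img.height)),
                   (random.randrange(img.width), random.randrange(img.height))], fill=0)
    return answer, img.filter(ImageFilter.GaussianBlur(0.6))

answer, image = make_challenge()
# A human types what they see; the server simply compares it to `answer`.
```

The design point this sketch captures is the grading constraint Luis mentions later: the test-setter can verify a response without being able to solve the puzzle itself, because it generated the answer in the first place.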
I mean, one of the more 111 00:05:56,520 --> 00:05:59,400 Speaker 4: popular ones that you see nowadays is kind of clicking 112 00:05:59,520 --> 00:06:02,520 Speaker 4: on the images of something. So you can see a grid, 113 00:06:02,680 --> 00:06:05,240 Speaker 4: like a four-by-four grid, and it may say 114 00:06:05,640 --> 00:06:07,920 Speaker 4: click on all the traffic lights, or click on all 115 00:06:07,960 --> 00:06:12,960 Speaker 4: the bicycles, et cetera. And by clicking on them, you know, 116 00:06:13,120 --> 00:06:17,279 Speaker 4: you're showing that you can actually recognize these things. 117 00:06:17,480 --> 00:06:20,520 Speaker 4: And the reason they're getting harder is because computers are 118 00:06:20,520 --> 00:06:23,960 Speaker 4: getting better and better at deciphering which ones are traffic lights, 119 00:06:24,000 --> 00:06:27,160 Speaker 4: et cetera. And by now, what you're getting are 120 00:06:27,200 --> 00:06:30,280 Speaker 4: the things that we still think computers are not very 121 00:06:30,279 --> 00:06:33,920 Speaker 4: good at. So the image may be very blurry, or, 122 00:06:34,080 --> 00:06:35,920 Speaker 4: you know, you may just get a tiny little corner 123 00:06:36,000 --> 00:06:38,599 Speaker 4: of it, and things like that. So that's why they're 124 00:06:38,600 --> 00:06:40,960 Speaker 4: getting harder, and I expect that to continue happening. 125 00:06:41,680 --> 00:06:45,040 Speaker 3: So you founded a company called 126 00:06:45,200 --> 00:06:49,000 Speaker 3: reCAPTCHA, which you sold to Google several years ago. 127 00:06:49,480 --> 00:06:52,040 Speaker 3: Is there gonna be a point where... I mean, I 128 00:06:52,080 --> 00:06:56,479 Speaker 3: assume computer vision and its ability to decode images or 129 00:06:56,520 --> 00:06:59,800 Speaker 3: recognize images is not done improving. I assume it's going 130 00:06:59,800 --> 00:07:03,479 Speaker 3: to get better, whereas humans' ability to decode images, I 131 00:07:03,560 --> 00:07:06,159 Speaker 3: doubt it's really getting any better. We've probably been about 132 00:07:06,160 --> 00:07:09,120 Speaker 3: the same for a couple thousand years now. Like, is 133 00:07:09,160 --> 00:07:11,920 Speaker 3: there going to be a point at which it's impossible 134 00:07:12,040 --> 00:07:14,600 Speaker 3: to create a visual test that humans are better at 135 00:07:14,600 --> 00:07:15,480 Speaker 3: than computers? 136 00:07:15,680 --> 00:07:18,320 Speaker 4: I believe that will happen at some point. Yeah, it's 137 00:07:18,480 --> 00:07:21,840 Speaker 4: very hard to say when exactly, but, you know, you 138 00:07:21,880 --> 00:07:24,440 Speaker 4: can just see at this point, you know, 139 00:07:24,480 --> 00:07:27,200 Speaker 4: computers are getting better and better. And, you know, the 140 00:07:27,240 --> 00:07:30,120 Speaker 4: other thing that is important to mention is this type 141 00:07:30,120 --> 00:07:33,200 Speaker 4: of test has extra constraints. It also has to be 142 00:07:33,280 --> 00:07:36,360 Speaker 4: the case that it's not just that humans can do 143 00:07:36,400 --> 00:07:38,040 Speaker 4: it. It's like, really, humans should be able to do it 144 00:07:38,040 --> 00:07:43,600 Speaker 4: pretty quickly and, you know, successfully.
145 00:07:43,360 --> 00:07:46,400 Speaker 3: Quickly, and on a mobile phone and a very small 146 00:07:46,480 --> 00:07:48,840 Speaker 3: screen in which, like, my thumb is like half the 147 00:07:48,880 --> 00:07:49,520 Speaker 3: size of the screen. 148 00:07:49,640 --> 00:07:51,600 Speaker 4: Yeah. Yeah. And it may not be, you know, quick. 149 00:07:51,680 --> 00:07:53,520 Speaker 4: I mean, it may take you, I don't know, thirty 150 00:07:53,520 --> 00:07:55,480 Speaker 4: seconds or a minute. But we cannot make a test 151 00:07:55,480 --> 00:07:59,200 Speaker 4: that takes you an hour. We can't do that. So 152 00:07:59,240 --> 00:08:01,200 Speaker 4: it has to be quick. It has to be done 153 00:08:01,200 --> 00:08:02,480 Speaker 4: on a mobile phone. It has to be the case 154 00:08:02,480 --> 00:08:04,440 Speaker 4: that the computer should be able to grade it. The computer 155 00:08:04,480 --> 00:08:06,160 Speaker 4: should be able to know what the right answer was, 156 00:08:06,280 --> 00:08:09,400 Speaker 4: even though it can't solve it. So because of all 157 00:08:09,400 --> 00:08:11,880 Speaker 4: of these constraints, I mean, my sense is, at some 158 00:08:12,000 --> 00:08:14,160 Speaker 4: point this is just going to be impossible. I mean, 159 00:08:14,320 --> 00:08:17,360 Speaker 4: we knew this when we started the original CAPTCHA, that 160 00:08:17,400 --> 00:08:19,640 Speaker 4: at some point computers were going to get good enough, 161 00:08:20,800 --> 00:08:22,960 Speaker 4: but we just had no idea how long it was 162 00:08:23,000 --> 00:08:25,520 Speaker 4: going to take. And I still don't know how long 163 00:08:25,600 --> 00:08:27,520 Speaker 4: it's going to take. But, you know, I would not 164 00:08:27,600 --> 00:08:29,960 Speaker 4: be surprised if in five to ten years there's just 165 00:08:30,040 --> 00:08:32,679 Speaker 4: not much that you can do that is really quick 166 00:08:33,080 --> 00:08:36,200 Speaker 4: online to be able to differentiate humans from computers. 167 00:08:36,360 --> 00:08:39,760 Speaker 2: Yeah, that's when we get the eyeball-scanning orbs. But 168 00:08:40,000 --> 00:08:42,360 Speaker 2: I mean, you mentioned that you can't have a test 169 00:08:42,679 --> 00:08:45,760 Speaker 2: that takes an hour or something like that. But this 170 00:08:45,880 --> 00:08:49,160 Speaker 2: kind of begs the question in my mind of why 171 00:08:49,200 --> 00:08:51,839 Speaker 2: are people using these tests at all? So, like, okay, 172 00:08:51,920 --> 00:08:56,160 Speaker 2: obviously you want to distinguish between humans and robots, but 173 00:08:56,280 --> 00:08:59,160 Speaker 2: I sometimes get the sense that these are basically free 174 00:08:59,240 --> 00:09:03,600 Speaker 2: labor AI training programs, right? So even if you can 175 00:09:03,760 --> 00:09:07,439 Speaker 2: verify identity in some other way, why not get people 176 00:09:07,679 --> 00:09:10,920 Speaker 2: on a mass scale to spend two minutes training self-driving 177 00:09:11,000 --> 00:09:11,680 Speaker 2: cars? 178 00:09:12,200 --> 00:09:14,240 Speaker 4: Yeah, I mean, this is what these things are doing. 179 00:09:14,240 --> 00:09:17,320 Speaker 4: That was the original idea of reCAPTCHA, which was my company.
180 00:09:17,400 --> 00:09:21,120 Speaker 4: The idea was that, at the same time 181 00:09:21,160 --> 00:09:23,000 Speaker 4: as you were proving that you are a human, you 182 00:09:23,040 --> 00:09:25,400 Speaker 4: could be doing something that computers could not yet do, 183 00:09:25,800 --> 00:09:29,080 Speaker 4: and that data could be used to improve computer programs 184 00:09:29,080 --> 00:09:32,520 Speaker 4: to do it. So certainly, when you're clicking on bicycles 185 00:09:32,600 --> 00:09:35,280 Speaker 4: or when you're clicking on traffic lights or whatever, that 186 00:09:35,440 --> 00:09:38,600 Speaker 4: is likely data that is being used. I say likely 187 00:09:38,600 --> 00:09:40,800 Speaker 4: because, you know, I don't know what CAPTCHA you're using. 188 00:09:41,000 --> 00:09:42,360 Speaker 4: There may be some that are not doing that, but 189 00:09:42,800 --> 00:09:47,000 Speaker 4: overall that data is being used to improve things like 190 00:09:47,559 --> 00:09:51,800 Speaker 4: self-driving cars, image recognition programs, et cetera. So that 191 00:09:51,920 --> 00:09:54,800 Speaker 4: is happening, and that's, you know, generally a good thing, 192 00:09:54,840 --> 00:09:59,000 Speaker 4: because that's making AI smarter and smarter. But, you know, 193 00:09:59,480 --> 00:10:01,520 Speaker 4: we still need it to be the case that it's a 194 00:10:01,559 --> 00:10:05,480 Speaker 4: good security mechanism. So if at some point computers 195 00:10:05,480 --> 00:10:09,080 Speaker 4: can just do that, then, you know, it's just not a 196 00:10:09,080 --> 00:10:10,959 Speaker 4: great security mechanism and it's not going to be used. 197 00:10:10,960 --> 00:10:13,480 Speaker 4: And my sense is, if we're going to want to do something, 198 00:10:13,480 --> 00:10:16,280 Speaker 4: we are going to need something like real identity. Like, 199 00:10:16,600 --> 00:10:18,040 Speaker 4: I don't know if it's going to be eyeball scanning 200 00:10:18,120 --> 00:10:20,520 Speaker 4: or whatever. But, you know, the 201 00:10:20,840 --> 00:10:23,360 Speaker 4: nice thing about a CAPTCHA is it doesn't tie you 202 00:10:23,400 --> 00:10:26,040 Speaker 4: to you. It just proves that you're a human, right? 203 00:10:26,440 --> 00:10:29,040 Speaker 4: We're probably going to need something that ties you to you. 204 00:10:29,760 --> 00:10:31,760 Speaker 4: We're probably going to need something that says, well, I 205 00:10:31,960 --> 00:10:35,400 Speaker 4: just know this is this specific person, because, you know, whatever, 206 00:10:35,800 --> 00:10:39,040 Speaker 4: we're scanning their eyeball, we're looking at their fingerprint, whatever 207 00:10:39,080 --> 00:10:41,040 Speaker 4: it is, and it is actually a real person, and 208 00:10:41,080 --> 00:10:42,000 Speaker 4: it is this person. 209 00:10:43,000 --> 00:10:45,280 Speaker 3: Why don't we sort of zoom out and back up 210 00:10:45,320 --> 00:10:48,240 Speaker 3: for a second. So currently you are the CEO of 211 00:10:48,360 --> 00:10:54,120 Speaker 3: Duolingo, the popular language-learning app, a publicly traded company, 212 00:10:54,600 --> 00:10:58,160 Speaker 3: one that has done much better, sort of stock-wise, than many companies that 213 00:10:58,240 --> 00:11:01,480 Speaker 3: came public in twenty twenty one, better than I would have expected. You know, 214 00:11:01,640 --> 00:11:03,760 Speaker 3: there was a boom when people had a bunch of time 215 00:11:03,800 --> 00:11:06,520 Speaker 3: on their hands, and then that's gone down.
You're also sort of one 216 00:11:06,520 --> 00:11:10,240 Speaker 3: of the most respected computer science thinkers coming 217 00:11:10,240 --> 00:11:13,520 Speaker 3: out of Carnegie Mellon University. What is the through 218 00:11:13,600 --> 00:11:16,120 Speaker 3: line of your work, or how would you characterize what 219 00:11:16,200 --> 00:11:20,280 Speaker 3: connects something like CAPTCHAs to language learning and Duolingo? 220 00:11:20,760 --> 00:11:23,600 Speaker 4: It's similar to what you were talking about. I was smiling when 221 00:11:23,600 --> 00:11:25,320 Speaker 4: you were mentioning that. I mean, I think the general 222 00:11:25,360 --> 00:11:29,319 Speaker 4: through line is a combination of humans learning from computers 223 00:11:29,320 --> 00:11:32,480 Speaker 4: and computers learning from humans. And, you know, CAPTCHA had 224 00:11:32,520 --> 00:11:35,480 Speaker 4: that: while you were typing a CAPTCHA, computers were learning 225 00:11:35,520 --> 00:11:38,040 Speaker 4: from what you were doing. In the case of Duolingo, 226 00:11:38,600 --> 00:11:41,760 Speaker 4: it's really a symbiotic thing, in that both are learning: 227 00:11:41,800 --> 00:11:45,160 Speaker 4: humans are learning a language, and, in the case 228 00:11:45,160 --> 00:11:47,080 Speaker 4: of Duolingo, Duolingo is learning how to teach 229 00:11:47,160 --> 00:11:51,520 Speaker 4: humans better by interacting with humans a lot. So, you know, 230 00:11:51,600 --> 00:11:54,960 Speaker 4: Duolingo just gets better with time, because we figure 231 00:11:55,000 --> 00:11:58,520 Speaker 4: out different ways in which humans just learn better. 232 00:11:59,160 --> 00:12:01,440 Speaker 4: You know, humans are getting better with a language, and 233 00:12:01,520 --> 00:12:03,439 Speaker 4: Duolingo is getting better at teaching you languages. 234 00:12:19,120 --> 00:12:20,640 Speaker 2: Joe, have you used Duolingo? 235 00:12:21,400 --> 00:12:25,520 Speaker 3: I haven't. Well, okay, I hadn't up until recently. So 236 00:12:26,080 --> 00:12:29,040 Speaker 3: last week, as it turns out, I visited my mother, 237 00:12:29,120 --> 00:12:32,199 Speaker 3: who lives in Guatemala, which, Luis, I understand you're from. 238 00:12:32,280 --> 00:12:35,280 Speaker 3: And oh, wow, yeah. She's, uh, she's not 239 00:12:35,360 --> 00:12:38,440 Speaker 3: from there, but she visited a friend there eight years 240 00:12:38,440 --> 00:12:39,880 Speaker 3: ago and she loved it, and she's like, I'm just 241 00:12:39,920 --> 00:12:42,720 Speaker 3: gonna stay, and she basically never left. She 242 00:12:42,800 --> 00:12:44,440 Speaker 3: loved it so much. And so I visited her for 243 00:12:44,480 --> 00:12:47,240 Speaker 3: the first time at her house near Lake Atitlán, and 244 00:12:47,240 --> 00:12:48,679 Speaker 3: then I was like, oh, that's a great life, and 245 00:12:48,720 --> 00:12:51,640 Speaker 3: maybe one day I'll even have that house. And I 246 00:12:51,640 --> 00:12:54,560 Speaker 3: should learn Spanish. And so I did, partly because of 247 00:12:54,559 --> 00:12:57,280 Speaker 3: that trip and partly to prepare for this episode. I 248 00:12:57,400 --> 00:12:59,880 Speaker 3: downloaded it and have started. I know a little bit 249 00:12:59,920 --> 00:13:02,280 Speaker 3: of Spanish, not much, like, I can, you know, ask 250 00:13:02,320 --> 00:13:04,079 Speaker 3: for the bill and stuff, but it's like, oh, 251 00:13:04,120 --> 00:13:05,040 Speaker 3: I should start to learn it.
252 00:13:05,160 --> 00:13:09,160 Speaker 2: That's funny, because I also started learning Spanish right before 253 00:13:09,280 --> 00:13:12,040 Speaker 2: a trip to Guatemala, there you go, with Duolingo. And 254 00:13:12,280 --> 00:13:16,000 Speaker 2: I'm not the best advertisement for the app, I'm afraid. 255 00:13:16,080 --> 00:13:18,959 Speaker 2: Like, the only thing I remember is basically, like, "Quisiera 256 00:13:19,120 --> 00:13:24,000 Speaker 2: una hapatas personas." That's all I remember from it. 257 00:13:23,920 --> 00:13:24,600 Speaker 3: It's pretty good. 258 00:13:25,000 --> 00:13:26,160 Speaker 4: Thanks, that's pretty good. 259 00:13:26,920 --> 00:13:28,720 Speaker 2: All right, I need to get back on it. But 260 00:13:29,080 --> 00:13:31,600 Speaker 2: why don't you talk to us a little bit about 261 00:13:31,640 --> 00:13:37,040 Speaker 2: the opportunity with AI in this sort of language-learning space, 262 00:13:37,280 --> 00:13:41,280 Speaker 2: because, intuitively, it would seem like things like chatbots 263 00:13:41,320 --> 00:13:44,800 Speaker 2: and generative AI and natural language processing and things like 264 00:13:44,840 --> 00:13:48,840 Speaker 2: that would be an amazing fit for this type of business. 265 00:13:49,120 --> 00:13:51,600 Speaker 4: Yeah, it's a really good fit. So, okay, so, you know, 266 00:13:51,600 --> 00:13:55,320 Speaker 4: we teach languages at Duolingo. Historically, you know, 267 00:13:55,400 --> 00:13:57,720 Speaker 4: learning a language just has a lot of different components. 268 00:13:57,760 --> 00:14:00,440 Speaker 4: You've got to learn how to read the language, 269 00:14:00,440 --> 00:14:02,760 Speaker 4: you've got to learn some vocabulary, you've got to learn 270 00:14:02,760 --> 00:14:05,480 Speaker 4: how to listen to it. If there's a different writing system, 271 00:14:05,520 --> 00:14:07,839 Speaker 4: you've got to learn the writing system. You've got to 272 00:14:07,920 --> 00:14:09,800 Speaker 4: learn how to have a conversation. There are a lot of 273 00:14:09,800 --> 00:14:14,480 Speaker 4: different skills that are required in learning a language. Historically, 274 00:14:14,520 --> 00:14:17,720 Speaker 4: we have done pretty well in all the skills except 275 00:14:17,760 --> 00:14:21,080 Speaker 4: for one of them, which is having a multi-turn, 276 00:14:21,120 --> 00:14:24,960 Speaker 4: fluid conversation. So, historically, we 277 00:14:25,000 --> 00:14:27,320 Speaker 4: could teach you vocabulary really well. 278 00:14:27,360 --> 00:14:29,000 Speaker 4: We could teach you how to listen to a language, 279 00:14:29,040 --> 00:14:30,880 Speaker 4: you know, generally just by getting you to 280 00:14:30,880 --> 00:14:32,920 Speaker 4: listen a lot to something. So we could teach you 281 00:14:32,960 --> 00:14:37,280 Speaker 4: all those things, but being able to practice an actual multi-turn 282 00:14:37,320 --> 00:14:40,160 Speaker 4: conversation was not something that we could do with 283 00:14:40,320 --> 00:14:42,840 Speaker 4: just a computer. Historically, that needed us to pair you 284 00:14:42,880 --> 00:14:45,240 Speaker 4: with another human.
Now, at Duolingo, we never 285 00:14:45,280 --> 00:14:47,280 Speaker 4: paired people up with other humans, because it turns out 286 00:14:47,800 --> 00:14:50,400 Speaker 4: a very small fraction of people actually want to be 287 00:14:50,480 --> 00:14:53,600 Speaker 4: paired with a random person over the internet who speaks 288 00:14:53,600 --> 00:14:56,720 Speaker 4: a different language. It's just, it's kind of too embarrassing 289 00:14:56,760 --> 00:15:00,640 Speaker 4: for most people. I never did that. Well, it may 290 00:15:00,680 --> 00:15:04,640 Speaker 4: be dangerous, yes, but also it's just, it's like 291 00:15:04,720 --> 00:15:08,320 Speaker 4: ninety percent of people are just not extroverted enough, yeah, to 292 00:15:08,400 --> 00:15:11,120 Speaker 4: do that. They just don't want to do it. So 293 00:15:11,600 --> 00:15:14,440 Speaker 4: we always, you know, kind of, we did these kind 294 00:15:14,440 --> 00:15:18,000 Speaker 4: of wonky things to try to emulate short conversations, but 295 00:15:18,040 --> 00:15:20,360 Speaker 4: we could never do anything like what we can do 296 00:15:20,480 --> 00:15:24,720 Speaker 4: now, because with large language models, we really can get 297 00:15:24,760 --> 00:15:27,840 Speaker 4: you to practice. You know, it may not be a 298 00:15:27,920 --> 00:15:30,160 Speaker 4: three-hour conversation, but we can get you to practice 299 00:15:30,160 --> 00:15:32,440 Speaker 4: a multi-turn, you know, ten-minute conversation, and it's 300 00:15:32,480 --> 00:15:34,680 Speaker 4: pretty good. So that's what we're doing with 301 00:15:34,680 --> 00:15:38,680 Speaker 4: Duolingo. We're using it to help you learn conversational 302 00:15:38,720 --> 00:15:41,000 Speaker 4: skills a lot better, and that's helping out quite a bit. 303 00:15:41,840 --> 00:15:44,320 Speaker 3: There are so many questions I have. And, you know, 304 00:15:44,880 --> 00:15:46,920 Speaker 3: I think my mom will really like this episode because, 305 00:15:46,960 --> 00:15:50,320 Speaker 3: in addition to the Guatemala connection, she is a linguist. 306 00:15:50,520 --> 00:15:54,440 Speaker 3: She speaks like seven languages, including Spanish, and, like, basically, 307 00:15:55,240 --> 00:15:57,080 Speaker 3: you know, not all the others, but 308 00:15:57,680 --> 00:16:01,040 Speaker 3: many, many others. But, you know, something 309 00:16:01,080 --> 00:16:03,600 Speaker 3: that I was curious about, and maybe this is a 310 00:16:03,640 --> 00:16:05,600 Speaker 3: little bit of a random jumping-off point: you know, I think 311 00:16:05,640 --> 00:16:09,480 Speaker 3: about, like, chess computers, and originally they were sort of 312 00:16:09,520 --> 00:16:12,680 Speaker 3: trained on a corpus of famous chess games, and then 313 00:16:12,720 --> 00:16:13,240 Speaker 3: with some... 314 00:16:13,120 --> 00:16:14,120 Speaker 4: Compute, they got better. 315 00:16:14,120 --> 00:16:18,720 Speaker 3: And then the new generation essentially relearned chess from just 316 00:16:18,800 --> 00:16:21,640 Speaker 3: the rules, from first principles, and it turns out that 317 00:16:21,640 --> 00:16:24,560 Speaker 3: they're way better.
And I'm wondering if you're learning, through 318 00:16:24,560 --> 00:16:26,520 Speaker 3: the process of building out and improving Duolingo: like, 319 00:16:27,160 --> 00:16:30,960 Speaker 3: are there forms of pedagogy in language learning, whether 320 00:16:31,040 --> 00:16:33,960 Speaker 3: it's the need for immersion or the need for rote drills, 321 00:16:34,040 --> 00:16:37,640 Speaker 3: or certain things that linguists have always thought were necessary 322 00:16:37,640 --> 00:16:41,880 Speaker 3: components of good language learning, that, when rebuilding education from 323 00:16:41,920 --> 00:16:46,240 Speaker 3: the ground up, like, old dictums just turn out to 324 00:16:46,240 --> 00:16:49,000 Speaker 3: be completely wrong, and, when you rebuild the process from 325 00:16:49,040 --> 00:16:52,600 Speaker 3: the beginning, like, novel forms of pedagogy emerge? 326 00:16:53,160 --> 00:16:56,240 Speaker 4: It's a great question, and it's a hard question to 327 00:16:56,280 --> 00:16:59,840 Speaker 4: answer, for the following reason. At least for us, we 328 00:17:00,200 --> 00:17:04,760 Speaker 4: teach a language from an app. Historically, the way people 329 00:17:04,840 --> 00:17:08,280 Speaker 4: learn languages is basically by practicing with another human or 330 00:17:08,400 --> 00:17:10,840 Speaker 4: being in a classroom or whatever, whereas we teach from 331 00:17:10,880 --> 00:17:14,240 Speaker 4: an app. The setting is just very different, for one 332 00:17:14,680 --> 00:17:18,600 Speaker 4: key reason, which is that it is so easy to 333 00:17:18,720 --> 00:17:21,879 Speaker 4: leave the app, whereas leaving a classroom is just not 334 00:17:21,920 --> 00:17:23,720 Speaker 4: that easy. You kind of have to go; you're usually 335 00:17:23,720 --> 00:17:25,800 Speaker 4: forced by your parents to go to a classroom, and, like, 336 00:17:26,119 --> 00:17:29,760 Speaker 4: you know... So generally, the thing about learning something by 337 00:17:29,800 --> 00:17:33,240 Speaker 4: yourself, when you're just learning it through a computer, is 338 00:17:33,280 --> 00:17:37,439 Speaker 4: that the hardest thing is motivation. It turns out that 339 00:17:37,320 --> 00:17:41,040 Speaker 4: the pedagogy is important, of course it is, but, much 340 00:17:41,119 --> 00:17:44,359 Speaker 4: like exercising, what matters the most is that you're actually 341 00:17:44,440 --> 00:17:46,720 Speaker 4: motivated to do it every day. So, like, is the 342 00:17:46,760 --> 00:17:51,560 Speaker 4: elliptical better than the step climber, or better than the treadmill? Like, yeah, 343 00:17:51,600 --> 00:17:55,000 Speaker 4: there are probably differences, but the reality is, what's most important 344 00:17:55,040 --> 00:17:57,280 Speaker 4: is that you kind of do it often. And so 345 00:17:57,760 --> 00:17:59,760 Speaker 4: what we have found with Duolingo is that, if 346 00:17:59,800 --> 00:18:01,960 Speaker 4: we're going to teach it with an app, there are 347 00:18:01,960 --> 00:18:05,480 Speaker 4: a lot of things that, historically, you know, language teachers 348 00:18:05,640 --> 00:18:09,920 Speaker 4: or linguists didn't think were the best ways to teach languages, 349 00:18:10,000 --> 00:18:11,359 Speaker 4: but, if you're going to do it with an app, 350 00:18:11,359 --> 00:18:13,960 Speaker 4: you have to make it engaging.
And we've had to 351 00:18:13,960 --> 00:18:16,320 Speaker 4: do it that way, and we have found that we 352 00:18:16,359 --> 00:18:20,320 Speaker 4: can do some things significantly better than human teachers, and 353 00:18:20,359 --> 00:18:23,560 Speaker 4: some things not as good, because it's a very different system. 354 00:18:23,640 --> 00:18:26,040 Speaker 4: But again, the most important thing is just to keep 355 00:18:26,080 --> 00:18:29,480 Speaker 4: you motivated. So, examples of things that we've had to 356 00:18:29,480 --> 00:18:32,320 Speaker 4: do to keep people motivated: quote-unquote classes, which 357 00:18:32,359 --> 00:18:35,000 Speaker 4: is a lesson on Duolingo, are not thirty minutes 358 00:18:35,080 --> 00:18:37,280 Speaker 4: or forty-five minutes; they're two and a half minutes. 359 00:18:38,119 --> 00:18:41,960 Speaker 4: If they're any longer, we start losing people's attention. So 360 00:18:42,000 --> 00:18:44,359 Speaker 4: stuff like that, I think, has been really important. Now, 361 00:18:44,400 --> 00:18:47,440 Speaker 4: I'll say, related to your question, one thing that has 362 00:18:47,440 --> 00:18:50,160 Speaker 4: been amazing is that, you know, we start out with 363 00:18:50,840 --> 00:18:53,720 Speaker 4: language experts, you know, people with PhDs in second 364 00:18:53,760 --> 00:18:56,200 Speaker 4: language acquisition, who tell us how to best teach something. 365 00:18:56,240 --> 00:18:58,280 Speaker 4: But then the computer takes it from there 366 00:18:58,359 --> 00:19:01,760 Speaker 4: and optimizes it, and so the computer starts finding different ways. 367 00:19:01,800 --> 00:19:05,399 Speaker 4: There are different orderings of things that are actually better 368 00:19:05,880 --> 00:19:09,520 Speaker 4: than what the people with PhDs in second language acquisition thought. 369 00:19:09,600 --> 00:19:12,040 Speaker 4: But it's because they just didn't have the data to 370 00:19:12,119 --> 00:19:14,240 Speaker 4: optimize this, whereas now, you know, at Duolingo, 371 00:19:14,320 --> 00:19:17,239 Speaker 4: we have... it's something like one billion exercises. One 372 00:19:17,280 --> 00:19:20,600 Speaker 4: billion exercises are solved every day by people using Duolingo, 373 00:19:21,119 --> 00:19:22,840 Speaker 4: and that just gives us a lot of data that helps 374 00:19:22,880 --> 00:19:23,480 Speaker 4: us teach better. 375 00:19:23,880 --> 00:19:26,280 Speaker 2: This is exactly what I wanted to ask you, which 376 00:19:26,320 --> 00:19:30,119 Speaker 2: is, how iterative is this technology? So how much is 377 00:19:30,119 --> 00:19:33,320 Speaker 2: it about the AI model sort of developing off the 378 00:19:33,400 --> 00:19:36,199 Speaker 2: data that you feed it, and then the AI model 379 00:19:36,480 --> 00:19:41,600 Speaker 2: improving the outcome for users, and thereby generating more data 380 00:19:41,680 --> 00:19:42,600 Speaker 2: from which it can train? 381 00:19:43,000 --> 00:19:47,080 Speaker 4: We're exactly doing that, and, in particular, one 382 00:19:47,119 --> 00:19:49,480 Speaker 4: of the things that we've been able to optimize a 383 00:19:49,520 --> 00:19:53,000 Speaker 4: lot is which exercise we give to which person. So 384 00:19:53,040 --> 00:19:54,840 Speaker 4: when you start a lesson on Duolingo, you 385 00:19:54,880 --> 00:19:56,800 Speaker 4: may think that all lessons are the same for everybody. 386 00:19:56,840 --> 00:20:00,119 Speaker 4: They're absolutely not.
When you use Duolingo, we 387 00:20:00,200 --> 00:20:04,040 Speaker 4: watch what you do, and, you know, the computer makes 388 00:20:04,040 --> 00:20:06,680 Speaker 4: a model of you as a student. So it sees 389 00:20:06,760 --> 00:20:08,879 Speaker 4: everything you get right, everything you get wrong, and, based 390 00:20:08,880 --> 00:20:11,080 Speaker 4: on that, it starts realizing you're not very good at 391 00:20:11,080 --> 00:20:14,000 Speaker 4: the past tense, or you're not very good at the 392 00:20:14,000 --> 00:20:16,639 Speaker 4: future tense, or whatever. And whenever you start a lesson, 393 00:20:17,160 --> 00:20:19,560 Speaker 4: it uses that model specifically for you, and it knows 394 00:20:19,560 --> 00:20:21,119 Speaker 4: that you're not very good at the past tense, so 395 00:20:21,119 --> 00:20:24,080 Speaker 4: it may give you more past tense; it does 396 00:20:24,119 --> 00:20:26,560 Speaker 4: stuff like that. And that definitely gets better with more 397 00:20:26,560 --> 00:20:28,600 Speaker 4: and more data. And I'll say another thing that is 398 00:20:28,640 --> 00:20:31,240 Speaker 4: really important: if we were to give you a lesson 399 00:20:32,000 --> 00:20:35,280 Speaker 4: only with the things that you're not good at, that 400 00:20:35,320 --> 00:20:38,560 Speaker 4: would be a horrible lesson, because that would be extremely frustrating. 401 00:20:38,600 --> 00:20:40,239 Speaker 4: It's just, basically, here are the things you're bad at, 402 00:20:40,400 --> 00:20:42,479 Speaker 4: just that, and we do a lot more of that. So, 403 00:20:42,680 --> 00:20:45,080 Speaker 4: in addition to that, we have a system, and it 404 00:20:45,119 --> 00:20:48,119 Speaker 4: gets better and better over time, that 405 00:20:48,200 --> 00:20:51,359 Speaker 4: is tuned so that, for every exercise we have on Duolingo that 406 00:20:51,640 --> 00:20:54,159 Speaker 4: we could give you, it knows the probability that you're going 407 00:20:54,200 --> 00:20:57,520 Speaker 4: to get that exercise correct. And whenever we are giving 408 00:20:57,520 --> 00:21:00,720 Speaker 4: you an exercise, we optimize so that we try to 409 00:21:00,840 --> 00:21:03,160 Speaker 4: only give you exercises that you have about an eighty 410 00:21:03,200 --> 00:21:06,760 Speaker 4: percent chance of getting right. And that has been quite 411 00:21:06,800 --> 00:21:09,320 Speaker 4: good, because it turns out eighty percent is kind of 412 00:21:09,359 --> 00:21:13,080 Speaker 4: at this zone of proximal development, where basically it's not 413 00:21:13,760 --> 00:21:16,399 Speaker 4: too easy, because having a one hundred 414 00:21:16,400 --> 00:21:18,440 Speaker 4: percent chance of getting it right, if it's too easy, 415 00:21:18,480 --> 00:21:20,600 Speaker 4: has two problems. Not only is it boring if it's 416 00:21:20,640 --> 00:21:23,399 Speaker 4: too easy, but also you're probably not learning anything if 417 00:21:23,400 --> 00:21:24,920 Speaker 4: you have a hundred percent chance of getting it right. 418 00:21:25,240 --> 00:21:28,400 Speaker 4: And it's also not too hard, because humans get frustrated 419 00:21:28,440 --> 00:21:30,880 Speaker 4: if you're getting things right only thirty percent of the time.
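For concreteness, here is a minimal sketch of the selection rule being described: a per-student model estimates the probability of answering each candidate exercise correctly, and the system picks the exercise closest to the eighty percent target. The data structures and the predict_correct stand-in are assumptions for illustration, not Duolingo's actual Birdbrain system:

```python
# Sketch of target-difficulty exercise selection (illustrative only).
TARGET_SUCCESS = 0.80

def predict_correct(student_skills, exercise):
    """Hypothetical student model: estimated P(correct) for this exercise,
    here just the average of the student's tracked accuracy on the
    skills the exercise uses (unseen skills default to 0.5)."""
    skills = exercise["skills"]
    return sum(student_skills.get(s, 0.5) for s in skills) / len(skills)

def pick_exercise(student_skills, candidates):
    # Choose the exercise whose predicted success probability is nearest
    # the target: hard enough to teach, easy enough not to frustrate.
    return min(candidates,
               key=lambda ex: abs(predict_correct(student_skills, ex) - TARGET_SUCCESS))

# Example: a student who is weak on the past tense gets steered toward
# an exercise they have roughly an 80% chance of getting right.
student = {"past_tense": 0.55, "present_tense": 0.95, "food_vocab": 0.90}
candidates = [
    {"id": 1, "skills": ["past_tense"]},                   # ~55% predicted
    {"id": 2, "skills": ["present_tense", "food_vocab"]},  # ~92% predicted
    {"id": 3, "skills": ["past_tense", "food_vocab"]},     # ~72% predicted
]
print(pick_exercise(student, candidates)["id"])  # -> 3, closest to 0.80
```

The argmin-over-distance choice is the simplest way to encode the "about eighty percent" rule; a production system would presumably also balance skill coverage and variety, which this sketch ignores.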
420 00:21:31,160 --> 00:21:32,720 Speaker 4: So it turns out that we should give you things 421 00:21:32,720 --> 00:21:34,320 Speaker 4: that you have an eighty percent chance of getting right, 422 00:21:34,320 --> 00:21:36,879 Speaker 4: and that has been really successful, and, you know, we 423 00:21:37,000 --> 00:21:40,080 Speaker 4: keep getting better and better at finding that exact 424 00:21:40,119 --> 00:21:42,400 Speaker 4: exercise that you have an eighty percent chance of getting right. 425 00:21:42,840 --> 00:21:46,000 Speaker 3: Okay, I have another, I guess I would say, theory-of-language 426 00:21:46,040 --> 00:21:48,840 Speaker 3: question, and I think I read in one 427 00:21:48,840 --> 00:21:51,200 Speaker 3: of your interviews, you know, as part of the process 428 00:21:51,240 --> 00:21:54,000 Speaker 3: of making the Duolingo app better, you're always A/B 429 00:21:54,080 --> 00:21:57,639 Speaker 3: testing things, like: should people learn vocabulary first, should 430 00:21:57,640 --> 00:22:01,040 Speaker 3: people learn adjectives before adverbs, or adverbs before verbs, 431 00:22:01,160 --> 00:22:04,000 Speaker 3: whatever it is, and there's this constant process of, 432 00:22:04,320 --> 00:22:08,639 Speaker 3: what is the correct sequence? Do rules about the sequence 433 00:22:08,720 --> 00:22:12,400 Speaker 3: of what you learn differ across languages? So, let's say, 434 00:22:12,400 --> 00:22:16,240 Speaker 3: someone learning Portuguese may have a different optimal path of 435 00:22:16,280 --> 00:22:20,040 Speaker 3: what to learn first, grammatically or vocabulary-wise, versus, say, 436 00:22:20,160 --> 00:22:24,640 Speaker 3: someone learning Chinese or Polish. Because I'm curious about whether 437 00:22:24,680 --> 00:22:29,119 Speaker 3: we can uncover deep facts about common grammar and language 438 00:22:29,480 --> 00:22:33,240 Speaker 3: from the sort of learning sequence that is optimal across languages. 439 00:22:33,960 --> 00:22:38,159 Speaker 4: Yes, they definitely vary a lot based on the language 440 00:22:38,200 --> 00:22:41,200 Speaker 4: that you're learning, and, even more so, they also vary 441 00:22:41,280 --> 00:22:45,040 Speaker 4: based on your native language. So we actually have a 442 00:22:45,119 --> 00:22:50,159 Speaker 4: different course to learn English for Spanish speakers than the 443 00:22:50,200 --> 00:22:52,639 Speaker 4: course we have to learn English for Chinese speakers. They 444 00:22:52,640 --> 00:22:55,359 Speaker 4: are different courses, and there's a reason for that. It 445 00:22:55,400 --> 00:22:58,320 Speaker 4: turns out that what's hard for Spanish speakers in learning 446 00:22:58,359 --> 00:23:01,760 Speaker 4: English is different than what's hard for Chinese speakers in 447 00:23:01,840 --> 00:23:05,280 Speaker 4: learning English. Typically, you know, the things that are common 448 00:23:05,440 --> 00:23:08,159 Speaker 4: between languages are easy, and the things that are very 449 00:23:08,160 --> 00:23:10,840 Speaker 4: different between languages are hard. So, just a stupid example: 450 00:23:10,880 --> 00:23:14,679 Speaker 4: I mean, when you're learning English from Spanish, there are, you know, 451 00:23:14,720 --> 00:23:18,600 Speaker 4: a couple of thousand cognates, that is, words that are the 452 00:23:18,640 --> 00:23:21,359 Speaker 4: same or very close to the same, so you immediately 453 00:23:21,400 --> 00:23:23,600 Speaker 4: know those. We don't even need to teach you those words.
454 00:23:23,640 --> 00:23:26,240 Speaker 4: If you're learning English from Spanish, you 455 00:23:26,560 --> 00:23:30,120 Speaker 4: know them automatically, because they are the same word. That's 456 00:23:30,160 --> 00:23:33,639 Speaker 4: not quite true from Chinese. Other examples are, you know... 457 00:23:33,720 --> 00:23:36,720 Speaker 4: for me in particular, I started learning German, and, for me, 458 00:23:37,000 --> 00:23:40,000 Speaker 4: German was quite hard to learn, because, you know, 459 00:23:40,040 --> 00:23:43,680 Speaker 4: my native language is Spanish, and Spanish just does not have 460 00:23:44,119 --> 00:23:48,040 Speaker 4: a very developed concept of grammatical cases, whereas German does. 461 00:23:48,920 --> 00:23:52,919 Speaker 4: But learning German from, like, Russian? That's just not 462 00:23:53,000 --> 00:23:56,679 Speaker 4: a very hard concept to grasp. So it kind of 463 00:23:56,680 --> 00:24:00,320 Speaker 4: depends on what concepts your language has. You know, also, 464 00:24:00,480 --> 00:24:03,960 Speaker 4: not exactly concepts, but in terms of pronunciation: everybody says 465 00:24:03,960 --> 00:24:06,840 Speaker 4: that Spanish pronunciation is really easy, and it's true. Vowels 466 00:24:06,840 --> 00:24:09,320 Speaker 4: in Spanish are really easy, because there are only really about 467 00:24:09,359 --> 00:24:11,040 Speaker 4: five vowel sounds. It's a little more than that, but 468 00:24:11,040 --> 00:24:13,639 Speaker 4: it's about five vowel sounds, whereas, you know, there are 469 00:24:13,640 --> 00:24:16,399 Speaker 4: other languages that have, you know, fifteen vowel sounds. So 470 00:24:16,520 --> 00:24:18,879 Speaker 4: learning Spanish is easy, but, vice versa, if you're a 471 00:24:18,960 --> 00:24:22,240 Speaker 4: native Spanish speaker, learning the languages that have a lot 472 00:24:22,240 --> 00:24:24,640 Speaker 4: of vowel sounds is really hard, because you 473 00:24:24,720 --> 00:24:27,280 Speaker 4: can't even hear the difference. You know, it's very 474 00:24:27,280 --> 00:24:30,479 Speaker 4: funny: when you're learning English as a native Spanish speaker, 475 00:24:30,480 --> 00:24:33,439 Speaker 4: you cannot hear the difference between beach and bitch. You 476 00:24:33,480 --> 00:24:36,960 Speaker 4: cannot hear that difference, and, you know, people make funny 477 00:24:37,000 --> 00:24:37,879 Speaker 4: mistakes because of that. 478 00:24:37,960 --> 00:24:39,640 Speaker 2: But I think there were a lot of T-shirts 479 00:24:39,720 --> 00:24:43,120 Speaker 2: that involved that at one point in time. 480 00:24:43,720 --> 00:24:46,479 Speaker 4: Well, because really, if you're a native Spanish speaker, you 481 00:24:46,520 --> 00:24:47,600 Speaker 4: just cannot hear that difference. 482 00:24:48,280 --> 00:24:50,439 Speaker 2: So one thing I wanted to ask you is about the 483 00:24:50,640 --> 00:24:53,600 Speaker 2: type of model that you're actually using. So I believe 484 00:24:53,680 --> 00:24:58,160 Speaker 2: you're using GPT-4 for some things, like your premium 485 00:24:58,200 --> 00:25:01,640 Speaker 2: subscription, Duolingo Max, but then you've also developed 486 00:25:01,640 --> 00:25:06,360 Speaker 2: your own proprietary AI model, called Birdbrain.
And I'm 487 00:25:06,440 --> 00:25:10,520 Speaker 2: curious about the decision to both use an off-the-shelf 488 00:25:10,520 --> 00:25:15,720 Speaker 2: solution or platform and to also develop your own 489 00:25:15,800 --> 00:25:18,800 Speaker 2: model at the same time. How did you end up 490 00:25:18,840 --> 00:25:20,160 Speaker 2: going down that path? 491 00:25:20,680 --> 00:25:22,879 Speaker 4: Yeah, it's a great question. I mean, I think the 492 00:25:23,040 --> 00:25:27,000 Speaker 4: difference is, these are just very different things. Since, 493 00:25:27,040 --> 00:25:29,640 Speaker 4: I don't know, two years ago, when large language 494 00:25:29,640 --> 00:25:35,120 Speaker 4: models, or generative AI, became very popular... Before that, there 495 00:25:35,119 --> 00:25:37,600 Speaker 4: were just different things that AI could be used 496 00:25:37,760 --> 00:25:40,280 Speaker 4: for, for us. We were not using AI, for example, for 497 00:25:40,440 --> 00:25:44,000 Speaker 4: practicing conversation, but we were using AI to determine which 498 00:25:44,000 --> 00:25:47,959 Speaker 4: exercise to give to which person. For that, we built our 499 00:25:48,000 --> 00:25:50,840 Speaker 4: own: that is, the Birdbrain model is a model 500 00:25:50,840 --> 00:25:52,680 Speaker 4: that tries to figure out which exercise to give to 501 00:25:52,720 --> 00:25:55,959 Speaker 4: which person. You know, for 502 00:25:56,000 --> 00:25:58,600 Speaker 4: the last two years or so, when people talk about models, 503 00:25:58,680 --> 00:26:02,480 Speaker 4: they usually mean language models. And it's 504 00:26:02,520 --> 00:26:05,560 Speaker 4: this specific type of AI model: what it does 505 00:26:05,560 --> 00:26:08,560 Speaker 4: is, it predicts the next word given the previous words. 506 00:26:08,600 --> 00:26:11,639 Speaker 4: That's what a language model does. The large language models 507 00:26:11,640 --> 00:26:14,919 Speaker 4: are particularly good at doing this, and we did not 508 00:26:15,000 --> 00:26:18,320 Speaker 4: develop our own large language model. We decided it's a 509 00:26:18,320 --> 00:26:21,199 Speaker 4: lot easier to just use something like GPT-4. But 510 00:26:21,280 --> 00:26:23,000 Speaker 4: we have our own model for something else that is 511 00:26:23,040 --> 00:26:24,840 Speaker 4: not a language model, but it is 512 00:26:24,880 --> 00:26:28,679 Speaker 4: an AI model to predict which exercise to give to 513 00:26:28,680 --> 00:26:46,960 Speaker 4: which user, which is a pretty different problem. 514 00:26:47,240 --> 00:26:51,119 Speaker 3: Speaking of AI, all these, especially the really big companies, 515 00:26:51,480 --> 00:26:55,560 Speaker 3: are making an extraordinary show of almost bragging about how much 516 00:26:55,600 --> 00:26:58,280 Speaker 3: money they give to Jensen Huang and Nvidia. It's like, 517 00:26:58,440 --> 00:27:01,480 Speaker 3: we just spent, you know, we're spending twenty billion dollars 518 00:27:01,480 --> 00:27:04,040 Speaker 3: over the next two years to just acquire H100 519 00:27:04,080 --> 00:27:07,119 Speaker 3: chips, or whatever it is, and it almost seems 520 00:27:07,200 --> 00:27:09,719 Speaker 3: like there's an arms race.
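A toy illustration of the next-word objective Luis describes above, that a language model predicts the next word given the previous words: the bigram counter below is an assumed, minimal stand-in, nothing like GPT-4's architecture, though the prediction task is the same idea at vastly larger scale:

```python
# Toy next-word predictor: count which word follows which in a tiny
# corpus, then predict the most frequent follower. Illustrative only.
from collections import Counter, defaultdict

corpus = "i would like a coffee . i would like a ticket . i would go".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1  # count each observed (word, next-word) pair

def predict_next(word):
    """Most common word seen after `word`, or None if never seen."""
    counts = follows.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("would"))  # -> 'like' (seen twice, versus 'go' once)
```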
And then there is also 521 00:27:09,880 --> 00:27:14,879 Speaker 3: this view that actually the best models will not necessarily 522 00:27:14,920 --> 00:27:17,240 Speaker 3: be the ones strictly with access to the most 523 00:27:17,280 --> 00:27:21,719 Speaker 3: compute, but the ones with access to data sets that other models 524 00:27:21,720 --> 00:27:24,520 Speaker 3: simply don't have. And I'm curious, sort of, like, you know, 525 00:27:24,720 --> 00:27:28,359 Speaker 3: you at Duolingo must have an extraordinary amount of 526 00:27:28,440 --> 00:27:32,959 Speaker 3: proprietary data just from all of your user interactions. In 527 00:27:33,040 --> 00:27:35,760 Speaker 3: your experience, when you think about who the winners will 528 00:27:35,760 --> 00:27:38,159 Speaker 3: be in this space, is it going to be the 529 00:27:38,200 --> 00:27:41,600 Speaker 3: ones that just have the most electricity and energy and chips, 530 00:27:41,720 --> 00:27:44,560 Speaker 3: or is it going to be whoever has access to 531 00:27:44,600 --> 00:27:46,560 Speaker 3: some sort of data that they can fine-tune their 532 00:27:46,560 --> 00:27:48,320 Speaker 3: model on that the other models can't? 533 00:27:49,000 --> 00:27:52,000 Speaker 4: It depends on what you're talking about. You know, certainly 534 00:27:52,040 --> 00:27:54,720 Speaker 4: we at Duolingo have a lot of, you know, data 535 00:27:54,880 --> 00:27:57,640 Speaker 4: nobody else has, which is the data on how each 536 00:27:57,680 --> 00:28:00,919 Speaker 4: person is learning a language. I mean, that's not data you can 537 00:28:00,960 --> 00:28:02,600 Speaker 4: find on the web or anything like that. That is 538 00:28:02,680 --> 00:28:04,399 Speaker 4: just data that we have, that we're generating, and 539 00:28:04,440 --> 00:28:07,040 Speaker 4: we're going to train our own models with that. I 540 00:28:07,040 --> 00:28:12,200 Speaker 4: don't think there's enough electricity to train a model without 541 00:28:12,200 --> 00:28:14,800 Speaker 4: this data to be as good as ours with our data, 542 00:28:15,320 --> 00:28:19,760 Speaker 4: but that is specifically for language learning. If you're talking 543 00:28:19,800 --> 00:28:23,399 Speaker 4: about training a general model, 544 00:28:23,440 --> 00:28:26,520 Speaker 4: you know, a language model that is general, for being 545 00:28:26,520 --> 00:28:29,960 Speaker 4: able to have conversations, et cetera, usually you can get 546 00:28:30,000 --> 00:28:32,679 Speaker 4: that... there's pretty good data out there. You know, 547 00:28:32,760 --> 00:28:35,840 Speaker 4: YouTube videos that are free, or a lot of kind 548 00:28:35,840 --> 00:28:38,880 Speaker 4: of Reddit conversations, or whatever. There's a lot of 549 00:28:39,000 --> 00:28:42,200 Speaker 4: data in there. There, probably, power is going to matter. 550 00:28:42,720 --> 00:28:44,239 Speaker 4: So it depends on what you're going to use your 551 00:28:44,280 --> 00:28:46,400 Speaker 4: model for. If you're using it 552 00:28:46,440 --> 00:28:49,440 Speaker 4: for a very specific purpose and you have very specific 553 00:28:49,520 --> 00:28:52,160 Speaker 4: data for that that is proprietary, that's going to be 554 00:28:52,160 --> 00:28:56,000 Speaker 4: better for the specific purpose. But my sense is that, 555 00:28:56,080 --> 00:28:58,800 Speaker 4: you know, both are going to matter:
You know, what 556 00:28:59,000 --> 00:29:02,840 Speaker 4: data you have, and also how much electricity you spend. 557 00:29:03,360 --> 00:29:06,080 Speaker 4: But I also think that, over time, hopefully we're going 558 00:29:06,160 --> 00:29:08,239 Speaker 4: to get better and better at these algorithms. And if 559 00:29:08,280 --> 00:29:10,600 Speaker 4: you think about it, the human brain uses something like 560 00:29:10,640 --> 00:29:13,680 Speaker 4: thirty watts, and the human brain is pretty good, and 561 00:29:13,720 --> 00:29:15,760 Speaker 4: we don't need, you know... Some of these models, people 562 00:29:15,760 --> 00:29:17,960 Speaker 4: are saying, oh, this uses the amount of 563 00:29:18,000 --> 00:29:20,640 Speaker 4: electricity that all of New York City uses; we used 564 00:29:20,720 --> 00:29:24,440 Speaker 4: that to train a model. You know, our brain uses much, much, 565 00:29:24,520 --> 00:29:28,520 Speaker 4: much less electricity than that, and, you know, it's pretty good. 566 00:29:28,760 --> 00:29:31,600 Speaker 4: So my sense is that, also over time, hopefully we'll 567 00:29:31,640 --> 00:29:34,200 Speaker 4: be able to get to the point where we're not 568 00:29:34,480 --> 00:29:36,960 Speaker 4: as crazy about using electricity as we are today. 569 00:29:37,280 --> 00:29:40,160 Speaker 2: I'm glad our brains are energy efficient. That's nice to know. 570 00:29:40,800 --> 00:29:42,800 Speaker 4: A lot better than computers. 571 00:29:43,760 --> 00:29:46,280 Speaker 2: We've been talking a lot about the use of AI 572 00:29:46,720 --> 00:29:51,440 Speaker 2: in the product itself, so, improving the experience of learning 573 00:29:51,480 --> 00:29:55,080 Speaker 2: a language. But one of the things that we hear 574 00:29:55,200 --> 00:29:58,320 Speaker 2: a lot about nowadays is also, you know, angst over 575 00:29:58,560 --> 00:30:01,880 Speaker 2: the role of AI in the wider economy, in terms 576 00:30:01,880 --> 00:30:05,040 Speaker 2: of the labor force, job security, and stuff like that, 577 00:30:05,120 --> 00:30:08,480 Speaker 2: as companies try to be more efficient. So I guess 578 00:30:08,520 --> 00:30:12,440 Speaker 2: I'm wondering, on the sort of corporate side, how much 579 00:30:12,480 --> 00:30:15,760 Speaker 2: does AI play into the business model right now, in 580 00:30:15,880 --> 00:30:21,000 Speaker 2: terms of streamlining things like costs or reducing workforce? And 581 00:30:21,040 --> 00:30:23,760 Speaker 2: I believe there were quite a few headlines around Duolingo 582 00:30:23,880 --> 00:30:26,600 Speaker 2: on this exact topic late last year. 583 00:30:26,960 --> 00:30:29,360 Speaker 4: Yeah. First of all, those headlines were upsetting to me, 584 00:30:29,440 --> 00:30:31,320 Speaker 4: because they were wrong. You know, there were a lot 585 00:30:31,320 --> 00:30:33,320 Speaker 4: of headlines saying that we had done a massive layoff; 586 00:30:33,760 --> 00:30:37,120 Speaker 4: that was not actually true. So, what is true is that, 587 00:30:37,160 --> 00:30:39,360 Speaker 4: you know, we really are leaning into AI. You know, 588 00:30:39,640 --> 00:30:42,480 Speaker 4: it just makes sense. This is a very transformative technology, 589 00:30:42,520 --> 00:30:44,720 Speaker 4: so we're leaning into it. And it is also true 590 00:30:44,760 --> 00:30:48,280 Speaker 4: that many workflows are a lot more efficient.
And so 591 00:30:48,320 --> 00:30:51,959 Speaker 4: what happened late last year was that we realized we 592 00:30:52,000 --> 00:30:54,040 Speaker 4: have full time employees, but we also have some 593 00:30:54,200 --> 00:30:58,120 Speaker 4: hourly contractors. We realized that we needed fewer hourly 594 00:30:58,200 --> 00:31:01,360 Speaker 4: contractors, and so for, you know, a small fraction of 595 00:31:01,360 --> 00:31:03,760 Speaker 4: our hourly contractors, we did not renew their contracts, because 596 00:31:03,800 --> 00:31:06,840 Speaker 4: we realized we needed fewer of them for doing 597 00:31:06,920 --> 00:31:10,440 Speaker 4: some tasks that, you know, honestly, computers were just as 598 00:31:10,440 --> 00:31:13,200 Speaker 4: good at as a human. And that's, you know, that 599 00:31:13,320 --> 00:31:16,080 Speaker 4: may be true for something like an hourly 600 00:31:16,120 --> 00:31:18,800 Speaker 4: contractor force that was 601 00:31:18,840 --> 00:31:22,800 Speaker 4: basically being asked to do very rote kinds of language 602 00:31:22,840 --> 00:31:25,760 Speaker 4: tasks that computers just got very good at. I think 603 00:31:25,800 --> 00:31:28,959 Speaker 4: if you're talking about, you know, our full time employees 604 00:31:29,000 --> 00:31:31,360 Speaker 4: and people who are not necessarily just doing 605 00:31:31,560 --> 00:31:36,280 Speaker 4: rote repetitive stuff, that's going to take a while to replace, 606 00:31:36,320 --> 00:31:38,200 Speaker 4: I don't think, and certainly this is not what we 607 00:31:38,240 --> 00:31:39,800 Speaker 4: want to do as a company. You know, I heard 608 00:31:39,800 --> 00:31:43,080 Speaker 4: a really good saying recently, which is, your job's not 609 00:31:43,120 --> 00:31:44,840 Speaker 4: going to be replaced by AI. It's going to be 610 00:31:44,840 --> 00:31:48,000 Speaker 4: replaced by somebody who knows how to use AI. So 611 00:31:48,120 --> 00:31:50,120 Speaker 4: what we're seeing in the company, at least for our 612 00:31:50,120 --> 00:31:52,880 Speaker 4: full time employees, is not that we're able or even 613 00:31:52,920 --> 00:31:55,160 Speaker 4: want to replace them. What we're seeing is just way 614 00:31:55,160 --> 00:31:58,920 Speaker 4: more productivity, to the point where people are able to 615 00:31:59,000 --> 00:32:03,120 Speaker 4: concentrate on kind of higher level cognitive tasks rather than 616 00:32:03,200 --> 00:32:06,280 Speaker 4: rote things. I don't know. One hundred years ago, people 617 00:32:06,320 --> 00:32:10,240 Speaker 4: were being hired to add numbers or multiply numbers. The 618 00:32:10,280 --> 00:32:13,200 Speaker 4: original quote unquote computers were actually humans who were being 619 00:32:13,720 --> 00:32:17,600 Speaker 4: hired to multiply numbers. We were able to mechanize that 620 00:32:17,880 --> 00:32:19,680 Speaker 4: and use an actual computer to do that so that 621 00:32:19,720 --> 00:32:21,880 Speaker 4: people didn't have to do that. Instead, they spent time, 622 00:32:22,000 --> 00:32:25,719 Speaker 4: you know, planning something at a higher level rather than 623 00:32:25,760 --> 00:32:29,000 Speaker 4: having to do the multiplication. We're seeing something similar to 624 00:32:29,120 --> 00:32:31,800 Speaker 4: that now. And the other thing that we're seeing is 625 00:32:31,960 --> 00:32:35,080 Speaker 4: really amazing.
So we are saving costs because 626 00:32:35,160 --> 00:32:38,840 Speaker 4: a single person can do more, but also we're 627 00:32:38,840 --> 00:32:42,640 Speaker 4: able to do things much, much faster, and in particular 628 00:32:42,640 --> 00:32:44,400 Speaker 4: in data creation. I mean, one of the ways in 629 00:32:44,400 --> 00:32:45,960 Speaker 4: which we teach you how to read is having you read 630 00:32:46,000 --> 00:32:48,760 Speaker 4: short stories. We need to 631 00:32:48,760 --> 00:32:51,280 Speaker 4: create a lot of short stories, and we used to be 632 00:32:51,280 --> 00:32:54,720 Speaker 4: able to create short stories, you know, at a certain pace. 633 00:32:55,080 --> 00:32:58,400 Speaker 4: We can now create them like ten times faster. And 634 00:32:58,440 --> 00:33:00,880 Speaker 4: what's beautiful about being able to create them ten times 635 00:33:00,920 --> 00:33:03,920 Speaker 4: faster is that you can actually make the quality better, 636 00:33:03,960 --> 00:33:06,360 Speaker 4: because if you create one ten times faster and 637 00:33:06,520 --> 00:33:08,280 Speaker 4: you don't like it, you can start over and do 638 00:33:08,360 --> 00:33:11,240 Speaker 4: it again with certain changes, and then, oh, you didn't 639 00:33:11,280 --> 00:33:13,320 Speaker 4: like it, okay, try it again. So you 640 00:33:13,320 --> 00:33:16,400 Speaker 4: can try ten times at this, you know, whereas before 641 00:33:16,400 --> 00:33:18,840 Speaker 4: you could only try once. And generally you don't have 642 00:33:18,880 --> 00:33:20,240 Speaker 4: to try ten times, you have to try a few 643 00:33:20,320 --> 00:33:21,800 Speaker 4: more times. So this is able to at the same 644 00:33:21,880 --> 00:33:25,800 Speaker 4: time lower costs for us, but also make the speed 645 00:33:25,840 --> 00:33:28,360 Speaker 4: faster and the quality better. So, I mean, we're very 646 00:33:28,360 --> 00:33:30,280 Speaker 4: happy with that from the corporate side. 647 00:33:30,440 --> 00:33:33,360 Speaker 3: Could you talk more about benchmarking AI? Because there's all 648 00:33:33,400 --> 00:33:36,160 Speaker 3: these tests, right, and you see these websites and they're like, 649 00:33:36,160 --> 00:33:38,560 Speaker 3: well, this one got this on the LSATs, and 650 00:33:38,560 --> 00:33:40,360 Speaker 3: this one got this on the SATs, and I can 651 00:33:40,360 --> 00:33:42,880 Speaker 3: never quite tell, and a lot of it seems inscrutable 652 00:33:43,000 --> 00:33:46,320 Speaker 3: to me. From your perspective, like, what are sort of 653 00:33:46,360 --> 00:33:51,560 Speaker 3: your basic approaches to benchmarking different models and determining when 654 00:33:51,600 --> 00:33:54,640 Speaker 3: it's like, okay, this makes sense as some sort of 655 00:33:54,760 --> 00:33:58,719 Speaker 3: task to employ AI instead of a person doing it? 656 00:33:59,000 --> 00:34:01,200 Speaker 4: Yeah, I have felt the same as you have. My 657 00:34:01,240 --> 00:34:02,520 Speaker 4: sense is that a lot of these 658 00:34:02,560 --> 00:34:05,920 Speaker 4: benchmarks are from marketing teams. You know, what we do 659 00:34:06,080 --> 00:34:09,600 Speaker 4: internally is two things.
First of all, we just try stuff, 660 00:34:09,680 --> 00:34:11,160 Speaker 4: and then we look at it, and we look at 661 00:34:11,280 --> 00:34:13,759 Speaker 4: the very specific thing. You know, it's nice that an AI 662 00:34:13,800 --> 00:34:15,759 Speaker 4: can pass the LSAT or whatever, but, you know, 663 00:34:15,760 --> 00:34:18,120 Speaker 4: we're not in the business of passing LSATs. We're 664 00:34:18,160 --> 00:34:19,799 Speaker 4: in the business of doing whatever it is we're doing, 665 00:34:19,880 --> 00:34:22,279 Speaker 4: you know, creating short stories or whatever. So whatever task, 666 00:34:22,360 --> 00:34:25,800 Speaker 4: we just try it and then we judge the quality ourselves. 667 00:34:25,880 --> 00:34:28,719 Speaker 4: So far, we have found that the quality of the 668 00:34:28,760 --> 00:34:31,680 Speaker 4: OpenAI models is a little better than everybody else's, 669 00:34:32,360 --> 00:34:35,600 Speaker 4: but not that much better. I mean, two years ago 670 00:34:35,680 --> 00:34:38,239 Speaker 4: it was way better. It seems like everybody else is 671 00:34:38,280 --> 00:34:40,520 Speaker 4: catching up. But so far, that's what we have found 672 00:34:40,600 --> 00:34:42,799 Speaker 4: just when we do our tests. And again, this is, 673 00:34:43,280 --> 00:34:45,400 Speaker 4: you know, just an n of one, one company. I'm 674 00:34:45,400 --> 00:34:47,600 Speaker 4: sure that other companies are finding maybe different stuff, but 675 00:34:47,640 --> 00:34:50,879 Speaker 4: for us, for our specific use cases, we find time 676 00:34:50,920 --> 00:34:53,920 Speaker 4: and again that GPT four does better. And I don't know, 677 00:34:53,960 --> 00:34:56,000 Speaker 4: of course, everybody's now announcing, like, there's going to be 678 00:34:56,040 --> 00:34:58,040 Speaker 4: GPT five, et cetera, et cetera. I don't know how 679 00:34:58,040 --> 00:35:00,840 Speaker 4: those will be, but that's what we're finding. You know, generally, 680 00:35:00,880 --> 00:35:01,960 Speaker 4: we just do our own testing. 681 00:35:02,160 --> 00:35:04,640 Speaker 3: Yeah, Tracy, I find that so fascinating, especially, I think 682 00:35:04,640 --> 00:35:07,560 Speaker 3: we've talked about this, like it definitely seems like TBD 683 00:35:07,760 --> 00:35:10,480 Speaker 3: whether like one model would just prove to be head 684 00:35:10,480 --> 00:35:12,840 Speaker 3: and shoulders better than the others, the way that Google 685 00:35:12,960 --> 00:35:15,160 Speaker 3: was just head and shoulders above everyone else for twenty 686 00:35:15,280 --> 00:35:18,880 Speaker 3: years basically and still is. Kind of like, it's unclear 687 00:35:18,920 --> 00:35:19,960 Speaker 3: to me whether that'll be the 688 00:35:19,920 --> 00:35:21,719 Speaker 2: case with AI. Right, the idea that we're in 689 00:35:21,760 --> 00:35:25,080 Speaker 2: the, I don't know, the Bing era of chat models 690 00:35:25,080 --> 00:35:28,360 Speaker 2: and eventually we're all going to migrate to something else. Luis, 691 00:35:28,440 --> 00:35:30,600 Speaker 2: one thing I wanted to ask you, and this is 692 00:35:30,640 --> 00:35:32,400 Speaker 2: sort of going back to the very beginning of the 693 00:35:32,440 --> 00:35:36,360 Speaker 2: conversation and some of the, you know, older thoughts around language.
694 00:35:36,400 --> 00:35:38,680 Speaker 2: There used to be, I don't want to say a consensus, 695 00:35:38,719 --> 00:35:42,040 Speaker 2: but there used to be some thinking that language was 696 00:35:42,520 --> 00:35:45,359 Speaker 2: very complicated in many ways, and so much of it 697 00:35:45,440 --> 00:35:49,719 Speaker 2: was sort of ambiguous or maybe context dependent, that it 698 00:35:49,760 --> 00:35:52,640 Speaker 2: would be very hard for AI to sort of wrap 699 00:35:52,680 --> 00:35:55,960 Speaker 2: its head around it. And I'm wondering now, with something 700 00:35:56,000 --> 00:35:59,880 Speaker 2: like Duolingo, how do your models take into account 701 00:36:00,239 --> 00:36:04,320 Speaker 2: that sort of context dependency? And I'm thinking, you know, 702 00:36:04,400 --> 00:36:09,279 Speaker 2: I'm thinking specifically about things like Mandarin, where the pronunciation 703 00:36:09,600 --> 00:36:13,720 Speaker 2: is kind of tricky and a lot of understanding depends 704 00:36:13,760 --> 00:36:17,560 Speaker 2: on the context in which a particular word is said. 705 00:36:17,680 --> 00:36:19,600 Speaker 2: So how do you sort of deal with that? 706 00:36:20,000 --> 00:36:21,719 Speaker 4: Yeah, I mean, it's an interesting thing. You know, when 707 00:36:21,719 --> 00:36:24,319 Speaker 4: you were asking the question, I thought 708 00:36:24,320 --> 00:36:26,880 Speaker 4: of this thing. You know, I've been around AI since 709 00:36:26,960 --> 00:36:31,200 Speaker 4: the late nineties, and I remember it's just this 710 00:36:31,280 --> 00:36:34,000 Speaker 4: moving goalpost. I remember everybody just kept on saying, look, 711 00:36:34,080 --> 00:36:36,640 Speaker 4: if a computer can play chess, surely we all agree 712 00:36:36,760 --> 00:36:39,279 Speaker 4: it has human level intelligence. This is kind of what 713 00:36:39,320 --> 00:36:42,319 Speaker 4: everybody said. Then it turned out computers could play chess, 714 00:36:42,320 --> 00:36:44,680 Speaker 4: and nobody agreed that it had human level intelligence. It's 715 00:36:44,719 --> 00:36:47,120 Speaker 4: just like, oh, fine, it can play chess, next thing. 716 00:36:47,360 --> 00:36:49,120 Speaker 4: And it would just keep coming up with stuff like, 717 00:36:49,200 --> 00:36:51,839 Speaker 4: surely if a computer can, you know, play the game 718 00:36:51,880 --> 00:36:54,160 Speaker 4: of Go, or if a computer could do this, then, 719 00:36:54,680 --> 00:36:56,920 Speaker 4: you know... And one of the last few things was, 720 00:36:57,239 --> 00:37:01,840 Speaker 4: if a computer can, whatever, write poetry so well or 721 00:37:01,960 --> 00:37:05,800 Speaker 4: understand text, then surely it's intelligent. And at this point, 722 00:37:06,320 --> 00:37:09,839 Speaker 4: models like GPT four are really good at doing things, 723 00:37:09,880 --> 00:37:11,480 Speaker 4: certainly better than the average human. They may not be 724 00:37:11,520 --> 00:37:13,400 Speaker 4: as good as the best poet in the world, but 725 00:37:13,440 --> 00:37:16,439 Speaker 4: certainly better than the average human writing poetry, certainly better 726 00:37:16,440 --> 00:37:19,680 Speaker 4: than the average human at almost anything with text manipulation. Actually, 727 00:37:19,719 --> 00:37:21,360 Speaker 4: if you look at your average human, they're just not 728 00:37:21,400 --> 00:37:24,080 Speaker 4: particularly good at writing.
729 00:37:23,719 --> 00:37:26,160 Speaker 3: So many professional writers... oh yeah, yeah. 730 00:37:26,120 --> 00:37:28,680 Speaker 4: Yeah, I mean, these models are just excellent. And in fact, 731 00:37:28,680 --> 00:37:30,799 Speaker 4: you can write something that is half well written and 732 00:37:30,840 --> 00:37:32,520 Speaker 4: you can ask the model to make it better and 733 00:37:32,560 --> 00:37:35,600 Speaker 4: it does that. It like makes your text better. So 734 00:37:35,880 --> 00:37:39,479 Speaker 4: it's this funny thing with AI: we keep coming 735 00:37:39,560 --> 00:37:41,600 Speaker 4: up with things where it's like, if AI can crack that, 736 00:37:41,600 --> 00:37:43,719 Speaker 4: that's it, that's it. You know, I don't know what 737 00:37:43,760 --> 00:37:45,480 Speaker 4: the next one will be, but, you know, we keep 738 00:37:45,520 --> 00:37:47,440 Speaker 4: coming up with stuff like that. You know, in terms 739 00:37:47,440 --> 00:37:51,160 Speaker 4: of the language, it just turns out that language can 740 00:37:51,200 --> 00:37:55,920 Speaker 4: be mostly captured by these models. It turns out that, 741 00:37:55,960 --> 00:37:58,920 Speaker 4: and this, you know, 742 00:37:58,960 --> 00:38:01,719 Speaker 4: nobody could have guessed this, but it just turns out 743 00:38:02,120 --> 00:38:04,919 Speaker 4: that if you make this neural network architecture that's 744 00:38:04,960 --> 00:38:08,760 Speaker 4: called the transformer, and you train it with a gazillion 745 00:38:09,080 --> 00:38:11,759 Speaker 4: pieces of text, it just turns out it pretty much 746 00:38:11,760 --> 00:38:14,200 Speaker 4: can capture almost any nuance of the language. Again, 747 00:38:14,239 --> 00:38:15,919 Speaker 4: nobody could have figured this out, but it just turns 748 00:38:15,960 --> 00:38:17,640 Speaker 4: out that this is the case. So at this point, 749 00:38:17,640 --> 00:38:19,200 Speaker 4: when you ask about, you know, what 750 00:38:19,320 --> 00:38:23,080 Speaker 4: we do with context or whatever, it just works. 751 00:38:23,080 --> 00:38:26,040 Speaker 4: You know, some of it we do with handwritten 752 00:38:26,080 --> 00:38:28,520 Speaker 4: rules, because we write the rules. But generally, if you're 753 00:38:28,520 --> 00:38:31,319 Speaker 4: going to use an AI, it just works. And you 754 00:38:31,360 --> 00:38:32,960 Speaker 4: can ask me why it works, and I don't know 755 00:38:33,000 --> 00:38:35,640 Speaker 4: why it works. I don't think anybody does. But it turns 756 00:38:35,640 --> 00:38:37,919 Speaker 4: out that the statistics are kind of strong enough there 757 00:38:38,000 --> 00:38:40,880 Speaker 4: that if you train it with a gazillion pieces of text, 758 00:38:41,239 --> 00:38:42,000 Speaker 4: it just works. 759 00:38:43,120 --> 00:38:45,399 Speaker 3: I just want to go back to the sort of, 760 00:38:45,480 --> 00:38:48,920 Speaker 3: like, you know, where AI is going. And you mentioned 761 00:38:48,960 --> 00:38:52,640 Speaker 3: that AI can generate thousands of, you know, very 762 00:38:52,760 --> 00:38:55,560 Speaker 3: rapidly, numerous short stories, and then a human can say, okay, 763 00:38:55,640 --> 00:38:57,200 Speaker 3: these are the good ones, we can improve, and so 764 00:38:57,280 --> 00:39:00,560 Speaker 3: you not only get the efficiency savings, you actually can get 765 00:39:00,600 --> 00:39:03,399 Speaker 3: a better, higher quality for the lessons and so forth.
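[Editor's note: a minimal sketch of the "train it on a gazillion pieces of text and it just works" idea Luis describes above. This is a bigram counter rather than a transformer, and the toy corpus is made up, but it shows next-word prediction as pure statistics, which is the training objective he is pointing at.]

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word follows which in a corpus.
# A transformer is vastly more powerful, but the objective is the same:
# predict the next token given what came before.
corpus = "the cat sat on the mat the dog sat on the rug the cat ran".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict(word):
    """Return the most frequent next word observed after `word` in training."""
    options = follows[word]
    return options.most_common(1)[0][0] if options else None

print(predict("the"))  # 'cat' -- seen twice after 'the' in the toy corpus
print(predict("sat"))  # 'on'
```

Scale the same counting idea up to a deep network and trillions of tokens, and "capturing the nuances of language" starts to look like what Luis describes: statistics strong enough that it just works.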
766 00:39:03,480 --> 00:39:05,920 Speaker 3: But, you know, sort of moving up the 767 00:39:06,000 --> 00:39:09,279 Speaker 3: abstraction layer, like, will there be a moment at some 768 00:39:09,440 --> 00:39:13,480 Speaker 3: point in the future in which the entire concept of 769 00:39:13,560 --> 00:39:16,880 Speaker 3: learning a language, or the entire sequence, is almost entirely 770 00:39:16,960 --> 00:39:20,320 Speaker 3: something that AI can do from scratch? Again, I'm thinking 771 00:39:20,400 --> 00:39:23,360 Speaker 3: sort of back to that chess analogy of not having 772 00:39:23,400 --> 00:39:27,040 Speaker 3: to use the entire history of games to learn, but 773 00:39:27,200 --> 00:39:29,839 Speaker 3: just knowing the basic rules and then coming up with 774 00:39:29,880 --> 00:39:32,759 Speaker 3: something further. Like, will AI eventually be able to sort 775 00:39:32,800 --> 00:39:35,520 Speaker 3: of, like, design the architecture of what it means to 776 00:39:35,600 --> 00:39:36,520 Speaker 3: learn a language? 777 00:39:37,000 --> 00:39:39,160 Speaker 4: I mean, sure, I think at some point AI is 778 00:39:39,160 --> 00:39:41,000 Speaker 4: going to be able to do pretty much everything. 779 00:39:41,360 --> 00:39:41,560 Speaker 2: Right. 780 00:39:41,600 --> 00:39:44,239 Speaker 4: It's very hard to know how long this will take. 781 00:39:44,239 --> 00:39:47,279 Speaker 4: I mean, it's just very hard, and honestly, for our 782 00:39:47,320 --> 00:39:50,000 Speaker 4: own society, I'm hoping that the process is gradual and 783 00:39:50,080 --> 00:39:52,600 Speaker 4: not from one day to the next, because if we 784 00:39:52,680 --> 00:39:55,239 Speaker 4: find that at some point AI really goes from one day to the next, if 785 00:39:55,280 --> 00:39:57,560 Speaker 4: tomorrow somebody announces, okay, I have an AI that can 786 00:39:57,560 --> 00:40:00,680 Speaker 4: pretty much do everything perfectly, I think this will be 787 00:40:00,719 --> 00:40:04,000 Speaker 4: a major societal problem, because we won't know what to do. 788 00:40:04,040 --> 00:40:06,680 Speaker 4: But if this process takes twenty, thirty years, at least 789 00:40:06,680 --> 00:40:09,440 Speaker 4: we'll be able to, as a society, figure out what 790 00:40:09,480 --> 00:40:13,000 Speaker 4: to do with ourselves. But generally, I mean, I think 791 00:40:13,000 --> 00:40:14,319 Speaker 4: at some point AI is going to be able to 792 00:40:14,320 --> 00:40:15,080 Speaker 4: do everything we can. 793 00:40:16,640 --> 00:40:19,640 Speaker 2: What's the big challenge when it comes to AI at 794 00:40:19,640 --> 00:40:22,680 Speaker 2: the moment? I realize we've been talking a lot about opportunities, 795 00:40:22,719 --> 00:40:24,840 Speaker 2: but what are some of the issues that you're trying 796 00:40:24,840 --> 00:40:28,560 Speaker 2: to surmount at the moment, whether it's something like getting 797 00:40:28,640 --> 00:40:34,200 Speaker 2: enough compute or securing the best engineers, or, I guess, 798 00:40:34,280 --> 00:40:37,440 Speaker 2: being in competition with a number of other companies that 799 00:40:37,480 --> 00:40:40,800 Speaker 2: are also using AI, maybe in the same business? 800 00:40:41,400 --> 00:40:44,440 Speaker 4: I mean, certainly, securing good engineers has been a challenge 801 00:40:44,480 --> 00:40:47,200 Speaker 4: for anything related to engineering for a while.
You know, 802 00:40:47,520 --> 00:40:49,359 Speaker 4: you want the best engineers, and there's just not very 803 00:40:49,360 --> 00:40:51,200 Speaker 4: many of them, so there's a lot of competition. So 804 00:40:51,200 --> 00:40:54,880 Speaker 4: that's certainly true. In terms of AI in particular, I 805 00:40:54,880 --> 00:40:57,360 Speaker 4: would say that, I don't know, it depends on what 806 00:40:57,400 --> 00:41:00,760 Speaker 4: you're trying to achieve. These models are getting better and better. 807 00:41:01,480 --> 00:41:06,040 Speaker 4: What they're not yet quite exhibiting is actual kind of 808 00:41:06,200 --> 00:41:09,160 Speaker 4: deduction and understanding as good as we would want them 809 00:41:09,200 --> 00:41:11,240 Speaker 4: to. I mean, you still see, really because 810 00:41:11,280 --> 00:41:12,719 Speaker 4: of the way they work... I mean, these are just 811 00:41:12,719 --> 00:41:15,719 Speaker 4: predicting the next word. Because of the way they work, 812 00:41:15,760 --> 00:41:18,439 Speaker 4: you can see them do funky stuff, like they get 813 00:41:18,560 --> 00:41:22,440 Speaker 4: adding numbers wrong sometimes, because they're not actually adding numbers. 814 00:41:22,480 --> 00:41:24,680 Speaker 4: They're just predicting the next word. And it turns out 815 00:41:24,719 --> 00:41:26,600 Speaker 4: you can predict a lot of things, but, you 816 00:41:26,640 --> 00:41:29,520 Speaker 4: know, it doesn't quite have a concept of addition. 817 00:41:29,680 --> 00:41:31,600 Speaker 4: So I think, you know, if what you're looking for 818 00:41:31,719 --> 00:41:34,480 Speaker 4: is kind of general intelligence, I think there's some amount 819 00:41:34,600 --> 00:41:38,560 Speaker 4: that's going to be required in terms of actually understanding 820 00:41:38,600 --> 00:41:41,919 Speaker 4: certain concepts that these models don't yet have. And, 821 00:41:42,080 --> 00:41:43,959 Speaker 4: you know, my sense is that new ideas are needed 822 00:41:44,000 --> 00:41:45,600 Speaker 4: for that. I don't know what they are. If I knew, 823 00:41:45,640 --> 00:41:48,640 Speaker 4: I would do them, but new ideas are needed for that. 824 00:41:48,920 --> 00:41:51,080 Speaker 3: Yeah, it's still like mind blowing. Like, you see the 825 00:41:51,200 --> 00:41:54,920 Speaker 3: AI produce some sort of amazing output or explanation, and 826 00:41:54,960 --> 00:41:57,160 Speaker 3: then it'll, like, get wrong a question of, like, 827 00:41:57,200 --> 00:42:00,840 Speaker 3: what weighs more, a kilogram of feathers or a kilo of steel, 828 00:42:00,920 --> 00:42:02,800 Speaker 3: like something really simple. 829 00:42:02,680 --> 00:42:05,040 Speaker 4: Yeah, because it doesn't... yeah, right, because... 830 00:42:05,600 --> 00:42:07,799 Speaker 3: Because there's no, there's no actual intuition. I just have 831 00:42:07,880 --> 00:42:11,520 Speaker 3: one last question, and it's sort of... there are not 832 00:42:11,680 --> 00:42:15,000 Speaker 3: many sort of, like, cutting edge tech companies based in Pittsburgh. 833 00:42:15,040 --> 00:42:18,920 Speaker 3: I understand, like, CMU has historically been a bastion of 834 00:42:19,000 --> 00:42:22,160 Speaker 3: advanced AI research. I think at one point, like, Uber 835 00:42:22,280 --> 00:42:24,719 Speaker 3: bought out, like, the entire robotics department when it was 836 00:42:24,719 --> 00:42:27,719 Speaker 3: trying to do self driving cars.
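[Editor's note: a tiny sketch of the point Luis makes above about next-word prediction versus actually having a concept of addition. The "model" below only recalls sums it has seen in its made-up training text; anything unseen leaves it guessing, which is loosely why pure next-token predictors can fumble arithmetic.]

```python
from collections import Counter, defaultdict

# Made-up "training text": a few additions written out as strings.
training_text = ["2+2=4", "3+5=8", "2+2=4", "7+1=8"]

# A pure pattern-completer: for each prefix before '=', remember which
# completions followed it. No arithmetic happens anywhere in this code.
seen = defaultdict(Counter)
for line in training_text:
    prefix, answer = line.split("=")
    seen[prefix][answer] += 1

def complete(prefix):
    """Echo the most common continuation seen in training, else give up."""
    options = seen[prefix]
    return options.most_common(1)[0][0] if options else "???"

print(complete("2+2"))      # '4'   -- looks like it can add
print(complete("123+456"))  # '???' -- no concept of addition, only recall
```

Real models generalize far better than this caricature, but the underlying objective is still prediction rather than calculation, which is the gap Luis is describing.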
But how do you 837 00:42:27,760 --> 00:42:29,920 Speaker 3: see that when it comes to this sort of recruiting 838 00:42:30,200 --> 00:42:34,160 Speaker 3: of talent that's already scarce? What are the advantages 839 00:42:34,239 --> 00:42:37,560 Speaker 3: and disadvantages of being based in Pittsburgh rather than the 840 00:42:37,560 --> 00:42:38,680 Speaker 3: Bay Area or somewhere else? 841 00:42:38,920 --> 00:42:41,560 Speaker 4: Yeah, we've been headquartered in Pittsburgh since the beginning. 842 00:42:41,600 --> 00:42:43,759 Speaker 4: We've loved being there. There are good things and bad things. 843 00:42:43,800 --> 00:42:45,920 Speaker 4: I mean, certainly a good thing is being close to 844 00:42:45,920 --> 00:42:48,960 Speaker 4: Carnegie Mellon. Carnegie Mellon produces, you know, some of the 845 00:42:48,960 --> 00:42:51,880 Speaker 4: best engineers in the world, and certainly relating to AI. 846 00:42:52,719 --> 00:42:55,560 Speaker 4: Another good thing about being in a city like Pittsburgh, 847 00:42:55,640 --> 00:42:58,439 Speaker 4: there are two good things. One of them is that 848 00:42:58,640 --> 00:43:01,799 Speaker 4: people don't leave jobs that easily. And, you know, when 849 00:43:01,800 --> 00:43:03,560 Speaker 4: you're in a place like Silicon Valley, you get these 850 00:43:03,600 --> 00:43:07,200 Speaker 4: people that leave jobs every eighteen months. Our average employee 851 00:43:07,200 --> 00:43:10,240 Speaker 4: stays around for a very long time, and that's actually 852 00:43:10,239 --> 00:43:12,600 Speaker 4: a major advantage, because you don't have to retrain them. 853 00:43:12,640 --> 00:43:14,440 Speaker 4: They really know how to do the job because they've 854 00:43:14,480 --> 00:43:16,360 Speaker 4: been doing it for the last seven years. So 855 00:43:16,600 --> 00:43:19,719 Speaker 4: that's been an advantage. And I think another advantage that 856 00:43:19,760 --> 00:43:22,760 Speaker 4: we've had is, in terms of Silicon Valley, there's usually 857 00:43:22,800 --> 00:43:24,880 Speaker 4: one or two companies that are kind of the darlings 858 00:43:24,880 --> 00:43:27,319 Speaker 4: of Silicon Valley, and everybody wants to work there, and 859 00:43:27,400 --> 00:43:30,640 Speaker 4: the darling company changes every two, three years, and 860 00:43:30,640 --> 00:43:32,520 Speaker 4: kind of all the good people go there. The 861 00:43:32,560 --> 00:43:36,000 Speaker 4: good news in Pittsburgh is that fad type thing doesn't happen. 862 00:43:36,040 --> 00:43:38,359 Speaker 4: We're lucky that right now 863 00:43:38,360 --> 00:43:40,279 Speaker 4: our stock is doing very well, so we're kind of 864 00:43:40,280 --> 00:43:42,840 Speaker 4: a fad company. But there have been times when we 865 00:43:42,960 --> 00:43:45,319 Speaker 4: just weren't, and we still were able to get really 866 00:43:45,360 --> 00:43:48,200 Speaker 4: good talent. So I think that's been really good. You know, 867 00:43:48,239 --> 00:43:50,480 Speaker 4: on the flip side, of course, there are certain roles 868 00:43:50,560 --> 00:43:52,720 Speaker 4: for which it is hard to hire people in Pittsburgh. 869 00:43:52,760 --> 00:43:56,400 Speaker 4: Particularly, product managers are hard to hire in Pittsburgh.
So 870 00:43:56,640 --> 00:43:58,359 Speaker 4: because of that, we have an office in New York, 871 00:43:58,520 --> 00:44:00,440 Speaker 4: and we complement that; we have a pretty large office 872 00:44:00,440 --> 00:44:02,560 Speaker 4: in New York, and we complement that. 873 00:44:03,120 --> 00:44:05,840 Speaker 2: All right, Luis von Ahn from Duolingo, thank you 874 00:44:05,840 --> 00:44:07,800 Speaker 2: so much for coming on Odd Lots. That was great. 875 00:44:08,480 --> 00:44:23,719 Speaker 2: Thank you. Excellent. Joe, I enjoyed that conversation. You know 876 00:44:23,760 --> 00:44:26,600 Speaker 2: what I was thinking about when Luis was talking about how 877 00:44:26,680 --> 00:44:28,439 Speaker 2: it's not that AI is going to take your job, 878 00:44:28,480 --> 00:44:30,719 Speaker 2: it's someone who knows how to use AI who is going 879 00:44:30,760 --> 00:44:33,400 Speaker 2: to take your job. I was thinking about, just before 880 00:44:33,440 --> 00:44:35,759 Speaker 2: we came on this recording, you were telling me that 881 00:44:35,840 --> 00:44:39,120 Speaker 2: you used, was it ChatGPT or Claude, to learn 882 00:44:39,200 --> 00:44:40,560 Speaker 2: something that I normally do. 883 00:44:40,880 --> 00:44:43,560 Speaker 3: Oh yeah. So for those who don't know, we have 884 00:44:43,600 --> 00:44:47,040 Speaker 3: a weekly Odd Lots newsletter and it usually comes out 885 00:44:47,080 --> 00:44:50,840 Speaker 3: every Friday. You should go subscribe. And Tracy usually 886 00:44:50,840 --> 00:44:52,600 Speaker 3: sends an email to one of the guests each week 887 00:44:52,719 --> 00:44:56,040 Speaker 3: asking what books they recommend, you know, people like reading books. 888 00:44:56,400 --> 00:44:58,680 Speaker 3: And then she goes into MS Paint and then, like, 889 00:44:58,719 --> 00:45:03,480 Speaker 3: puts the covers of, like, the four books together, and 890 00:45:03,560 --> 00:45:05,879 Speaker 3: I did that because Tracy was out a couple weeks ago. 891 00:45:06,360 --> 00:45:08,799 Speaker 3: And I've never, like, learned Photoshop 892 00:45:08,880 --> 00:45:11,160 Speaker 3: or even MS Paint, so, just like, I'm very dumb. 893 00:45:11,239 --> 00:45:13,760 Speaker 3: Like, just the process of putting four images together 894 00:45:14,520 --> 00:45:16,640 Speaker 3: was not something I exactly knew how to do. So 895 00:45:16,680 --> 00:45:18,560 Speaker 3: I went to Claude and I said, I'm putting together 896 00:45:18,640 --> 00:45:21,560 Speaker 3: four book images in an MS Paint thing. Please tell 897 00:45:21,560 --> 00:45:23,360 Speaker 3: me how to do it and walk me through the steps. 898 00:45:23,400 --> 00:45:24,040 Speaker 1: And I did it. 899 00:45:24,080 --> 00:45:25,080 Speaker 3: Tracy, you were proud 900 00:45:24,840 --> 00:45:27,360 Speaker 2: of me, right? I was very proud. I do think 901 00:45:27,480 --> 00:45:30,960 Speaker 2: it's somewhat ironic that the pinnacle of AI usage is 902 00:45:30,960 --> 00:45:34,080 Speaker 2: teaching someone how to use MS Paint, but it's fine, 903 00:45:34,120 --> 00:45:36,279 Speaker 2: I'll take it. Yeah, no, there's so much to pull 904 00:45:36,320 --> 00:45:38,719 Speaker 2: out of that conversation.
One thing I'll say, and maybe 905 00:45:38,719 --> 00:45:41,360 Speaker 2: it's a little bit trite, but it does seem like 906 00:45:42,040 --> 00:45:46,120 Speaker 2: language learning is sort of ground zero for the application 907 00:45:46,400 --> 00:45:50,879 Speaker 2: of a lot of this natural language and chatbot technology. 908 00:45:50,960 --> 00:45:52,840 Speaker 2: So it was interesting to come at it from a 909 00:45:52,880 --> 00:45:56,319 Speaker 2: sort of pure language or linguistics perspective. 910 00:45:57,000 --> 00:45:58,839 Speaker 3: Yeah, I mean, like, I feel like we could 911 00:45:58,880 --> 00:46:03,200 Speaker 3: have talked to Luis for hours, just on, like, theory 912 00:46:03,239 --> 00:46:08,040 Speaker 3: of language itself, which I find endlessly fascinating. And I 913 00:46:08,120 --> 00:46:10,160 Speaker 3: really... I can only speak one language. I used to 914 00:46:10,200 --> 00:46:12,200 Speaker 3: be able to speak French, so I don't know if 915 00:46:12,200 --> 00:46:15,360 Speaker 3: I told you, but I did one semester in Geneva, Switzerland, 916 00:46:15,360 --> 00:46:17,440 Speaker 3: and I lived with a family that only spoke French, 917 00:46:17,800 --> 00:46:19,680 Speaker 3: and I'd never spoken a word of French before I 918 00:46:19,680 --> 00:46:22,080 Speaker 3: got there. And after one semester, I came home and 919 00:46:22,120 --> 00:46:24,800 Speaker 3: I passed out of four years worth of my college 920 00:46:24,840 --> 00:46:27,560 Speaker 3: requirements from that four months living there. And then I 921 00:46:27,600 --> 00:46:29,560 Speaker 3: didn't speak French again for twenty years and I lost 922 00:46:29,600 --> 00:46:32,000 Speaker 3: it all. But I was gonna go somewhere with that. 923 00:46:32,040 --> 00:46:32,960 Speaker 3: I don't really know. 924 00:46:33,080 --> 00:46:36,400 Speaker 2: It's okay. I speak multiple languages poorly. 925 00:46:36,800 --> 00:46:38,760 Speaker 3: But you know the other thing I was thinking about? 926 00:46:39,040 --> 00:46:41,319 Speaker 3: You know, so Duolingo has obviously been around for 927 00:46:41,400 --> 00:46:43,960 Speaker 3: quite a long time, before anyone was talking about generative 928 00:46:44,000 --> 00:46:45,919 Speaker 3: AI or anything. And one of the things you hear, 929 00:46:46,800 --> 00:46:49,400 Speaker 3: and it's sort of used pejoratively, is, like, some company 930 00:46:49,440 --> 00:46:52,560 Speaker 3: will be called, like, a ChatGPT wrapper, right? So 931 00:46:52,680 --> 00:46:56,560 Speaker 3: basically they're just taking GPT four, whatever the latest model is, 932 00:46:56,880 --> 00:46:59,680 Speaker 3: and then building some slick interface to do a specific 933 00:46:59,719 --> 00:47:02,279 Speaker 3: task on top of it.
And what's interesting about 934 00:47:02,360 --> 00:47:04,960 Speaker 3: Duolingo is it feels like it's backwards, or going in 935 00:47:05,000 --> 00:47:09,600 Speaker 3: the opposite sequence, where they already had this extremely popular 936 00:47:10,440 --> 00:47:16,120 Speaker 3: app for language learning, and then over time they incorporated 937 00:47:16,160 --> 00:47:18,520 Speaker 3: more. So rather than starting off as a wrapper 938 00:47:18,560 --> 00:47:22,080 Speaker 3: for someone else's technology, they already have the audience, they 939 00:47:22,080 --> 00:47:24,480 Speaker 3: already have the thing, and then they find more ways 940 00:47:24,960 --> 00:47:28,000 Speaker 3: that the AI can be used to actually, like, rebuild 941 00:47:28,200 --> 00:47:28,799 Speaker 3: the core app. 942 00:47:29,200 --> 00:47:30,960 Speaker 2: Yeah, that's a really good way of putting it. And 943 00:47:31,000 --> 00:47:34,160 Speaker 2: also just the iterative nature of all of this technology. 944 00:47:34,239 --> 00:47:36,920 Speaker 2: So the idea that, you know, you're sort of training it, 945 00:47:37,040 --> 00:47:39,520 Speaker 2: I know, again, it's sort of an obvious point, yeah, 946 00:47:39,560 --> 00:47:43,480 Speaker 2: but also I didn't realize how customized a lot of 947 00:47:43,480 --> 00:47:45,880 Speaker 2: the Duolingo stuff is at this point. And the 948 00:47:45,920 --> 00:47:48,640 Speaker 2: idea that if you speak one language, the way you learn, 949 00:47:48,920 --> 00:47:52,200 Speaker 2: say, German, is going to be completely different to someone 950 00:47:52,280 --> 00:47:56,040 Speaker 2: who grew up speaking another language. And I'm very intrigued 951 00:47:56,280 --> 00:47:59,640 Speaker 2: by the amount of data that something like a Duolingo 952 00:48:00,200 --> 00:48:02,919 Speaker 2: has at this point, and I guess maybe we should 953 00:48:02,920 --> 00:48:05,879 Speaker 2: have asked Luis about this. But also other business opportunities 954 00:48:05,880 --> 00:48:09,120 Speaker 2: in terms of, like, licensing that data, or, maybe, I 955 00:48:09,160 --> 00:48:11,799 Speaker 2: don't know. I think they were doing a partnership for 956 00:48:11,840 --> 00:48:15,399 Speaker 2: a while with BuzzFeed where the 957 00:48:15,440 --> 00:48:19,360 Speaker 2: app was, like, actually translating news articles or something. 958 00:48:19,520 --> 00:48:21,879 Speaker 3: Right, there was going to be something like that, I think. 959 00:48:21,880 --> 00:48:24,480 Speaker 3: I recall it didn't really take off, but the idea 960 00:48:24,600 --> 00:48:27,960 Speaker 3: was BuzzFeed would get its news articles translated into Spanish 961 00:48:28,000 --> 00:48:31,280 Speaker 3: and other languages from the process of Duolingo users 962 00:48:31,360 --> 00:48:34,520 Speaker 3: learning. I forget why it didn't take off, 963 00:48:34,520 --> 00:48:35,560 Speaker 3: but yeah, absolutely. 964 00:48:35,840 --> 00:48:38,960 Speaker 2: I also find it funny, like, in some senses, 965 00:48:39,239 --> 00:48:43,120 Speaker 2: that we're sort of, I guess, the thing that AI 966 00:48:43,280 --> 00:48:46,160 Speaker 2: is feeding off of now, right? And, like, all those 967 00:48:46,320 --> 00:48:50,239 Speaker 2: minutes, which I'm sure add up to days eventually, of 968 00:48:50,320 --> 00:48:54,160 Speaker 2: going through Captcha, it's all sort of unpaid labor for 969 00:48:54,320 --> 00:48:56,279 Speaker 2: training our future AI overlords.
970 00:48:56,400 --> 00:48:59,680 Speaker 3: So he mentioned that he was upset about headlines last 971 00:48:59,719 --> 00:49:02,319 Speaker 3: year implying that they had laid off a bunch of 972 00:49:02,360 --> 00:49:05,560 Speaker 3: people due to AI. But he did say that there 973 00:49:05,560 --> 00:49:08,040 Speaker 3: were people, they were contractors, so they weren't full 974 00:49:08,080 --> 00:49:11,640 Speaker 3: time employees, but it sounds like a very crisp example 975 00:49:11,800 --> 00:49:14,279 Speaker 3: of AI being able to do a job, even if 976 00:49:14,320 --> 00:49:17,120 Speaker 3: they were contractors, that was done by humans. And I'm 977 00:49:17,239 --> 00:49:20,480 Speaker 3: generally skeptical of most articles that I read where 978 00:49:20,520 --> 00:49:23,000 Speaker 3: a company says, oh, we're getting all this 979 00:49:23,200 --> 00:49:25,680 Speaker 3: labor savings and we're gonna do AI, because I sort 980 00:49:25,680 --> 00:49:28,200 Speaker 3: of think that is often a smokescreen for just, like, 981 00:49:28,360 --> 00:49:30,680 Speaker 3: a business that wants to cut jobs and make it 982 00:49:30,719 --> 00:49:33,799 Speaker 3: sound like they're progressive. But this did sound like an 983 00:49:33,840 --> 00:49:37,000 Speaker 3: actual example in which there was some form of human 984 00:49:37,120 --> 00:49:41,000 Speaker 3: labor that is no longer needed because of AI. 985 00:49:41,320 --> 00:49:44,440 Speaker 2: Yes, AI will come for us all. Shall we leave 986 00:49:44,440 --> 00:49:44,640 Speaker 2: it there? 987 00:49:44,719 --> 00:49:45,479 Speaker 4: Let's leave it there. 988 00:49:45,719 --> 00:49:48,400 Speaker 2: This has been another episode of the Odd Lots podcast. 989 00:49:48,520 --> 00:49:51,800 Speaker 2: I'm Tracy Alloway. You can follow me at Tracy Alloway. 990 00:49:51,520 --> 00:49:54,400 Speaker 3: And I'm Joe Wisenthal. You can follow me at the Stalwart. 991 00:49:54,480 --> 00:49:57,279 Speaker 3: Follow our guest Luis von Ahn. He's at Luis von Ahn. 992 00:49:57,760 --> 00:50:01,120 Speaker 3: Follow our producers Carmen Rodriguez at Carman Armann, Dashiell 993 00:50:01,120 --> 00:50:04,360 Speaker 3: Bennett at Dashbot and Kale Brooks at Kale Brooks. Thank you 994 00:50:04,400 --> 00:50:07,400 Speaker 3: to our producer Moses Andem. For more Odd Lots content, go 995 00:50:07,440 --> 00:50:10,399 Speaker 3: to Bloomberg dot com slash odd lots, where we have transcripts, 996 00:50:10,480 --> 00:50:13,359 Speaker 3: a blog and a newsletter, and you can chat about all 997 00:50:13,360 --> 00:50:16,120 Speaker 3: of these topics twenty four seven in the Discord. 998 00:50:16,200 --> 00:50:19,200 Speaker 3: In fact, this episode came about because someone in the 999 00:50:19,200 --> 00:50:22,440 Speaker 3: Discord wanted to hear an interview with Luis von Ahn, 1000 00:50:22,920 --> 00:50:24,960 Speaker 3: so you can go there, you can talk about AI, 1001 00:50:25,080 --> 00:50:27,600 Speaker 3: you can suggest future episodes. 1002 00:50:28,120 --> 00:50:31,279 Speaker 2: Check it out, and if you enjoy Odd Lots, if 1003 00:50:31,320 --> 00:50:35,200 Speaker 2: you like it when we speak bad Spanish, I guess, 1004 00:50:35,440 --> 00:50:38,400 Speaker 2: then please leave us a positive review on your favorite 1005 00:50:38,400 --> 00:50:42,279 Speaker 2: podcast platform.
And remember, if you are a Bloomberg subscriber, 1006 00:50:42,360 --> 00:50:45,560 Speaker 2: you can listen to all of our episodes absolutely ad free. 1007 00:50:45,840 --> 00:50:48,440 Speaker 2: All you need to do is connect your Bloomberg subscription 1008 00:50:48,719 --> 00:50:51,200 Speaker 2: with Apple Podcasts. Thanks for listening.