1
00:00:04,440 --> 00:00:12,600
Speaker 1: Welcome to Tech Stuff, a production from iHeartRadio. Today, we

2
00:00:12,680 --> 00:00:15,640
Speaker 1: are witness to one of those rare moments in history,

3
00:00:16,000 --> 00:00:19,239
Speaker 1: the rise of an innovative technology with the potential to

4
00:00:19,360 --> 00:00:24,080
Speaker 1: radically transform business and society forever. That technology, of course,

5
00:00:24,560 --> 00:00:28,120
Speaker 1: is artificial intelligence, and it's the central focus for this

6
00:00:28,280 --> 00:00:32,320
Speaker 1: new season of Smart Talks with IBM. Join hosts from

7
00:00:32,320 --> 00:00:36,040
Speaker 1: your favorite Pushkin podcasts as they talk with industry experts

8
00:00:36,080 --> 00:00:39,640
Speaker 1: and leaders to explore how businesses can integrate AI into

9
00:00:39,720 --> 00:00:43,040
Speaker 1: their workflows and help drive real change in this new

10
00:00:43,120 --> 00:00:46,800
Speaker 1: era of AI, and of course, host Malcolm Gladwell will

11
00:00:46,840 --> 00:00:49,120
Speaker 1: be there to guide you through the season and throw

12
00:00:49,240 --> 00:00:52,120
Speaker 1: in his two cents as well. Look out for new

13
00:00:52,159 --> 00:00:55,040
Speaker 1: episodes of Smart Talks with IBM every other week on

14
00:00:55,080 --> 00:00:59,320
Speaker 1: the iHeartRadio app, Apple Podcasts, wherever you get your podcasts,

15
00:00:59,520 --> 00:01:03,760
Speaker 1: and learn more at IBM dot com slash Smart Talks.

16
00:01:04,840 --> 00:01:08,560
Speaker 2: Hello, hello. Welcome to Smart Talks with IBM, a podcast

17
00:01:08,560 --> 00:01:14,959
Speaker 2: from Pushkin Industries, iHeartRadio and IBM. I'm Malcolm Gladwell. This season,

18
00:01:15,160 --> 00:01:19,760
Speaker 2: we're continuing our conversation with New Creators, visionaries who are

19
00:01:19,840 --> 00:01:23,880
Speaker 2: creatively applying technology in business to drive change, but with

20
00:01:23,920 --> 00:01:28,760
Speaker 2: a focus on the transformative power of artificial intelligence and

21
00:01:28,840 --> 00:01:31,840
Speaker 2: what it means to leverage AI as a game-changing

22
00:01:31,920 --> 00:01:36,399
Speaker 2: multiplier for your business. Our guest today is Jeff Boudier,

23
00:01:36,840 --> 00:01:40,360
Speaker 2: head of Product and Growth at Hugging Face, the leading

24
00:01:40,480 --> 00:01:45,640
Speaker 2: open source and open science artificial intelligence platform. An engineer

25
00:01:45,680 --> 00:01:49,560
Speaker 2: by background, he has a self-professed obsession with the

26
00:01:49,600 --> 00:01:54,160
Speaker 2: business of technology. Recently, IBM and Hugging Face announced a

27
00:01:54,240 --> 00:01:58,880
Speaker 2: collaboration bringing together Hugging Face's repositories of open source AI

28
00:01:58,960 --> 00:02:03,800
Speaker 2: models with IBM's watsonx platform. It's a move that

29
00:02:03,840 --> 00:02:08,239
Speaker 2: gives businesses even more access to AI while staying true

30
00:02:08,280 --> 00:02:13,920
Speaker 2: to IBM's long-standing philosophy of supporting open source technology.

31
00:02:14,840 --> 00:02:18,720
Speaker 2: With open source, businesses can build better AI models that

32
00:02:18,800 --> 00:02:23,320
Speaker 2: suit their specific needs using their own proprietary data while

33
00:02:23,360 --> 00:02:28,440
Speaker 2: browsing a ready catalog of pretrained models. In today's episode,

34
00:02:28,639 --> 00:02:31,760
Speaker 2: you'll hear why open source is so crucial to the

35
00:02:31,800 --> 00:02:36,400
Speaker 2: advancement of AI, how IBM's watsonx interacts with open

36
00:02:36,440 --> 00:02:40,919
Speaker 2: source AI, and Jeff's thoughts on why this singular omnipotent

37
00:02:41,160 --> 00:02:45,240
Speaker 2: AI model is a myth. Jeff spoke with Tim Harford,

38
00:02:45,440 --> 00:02:49,680
Speaker 2: host of the Pushkin podcast Cautionary Tales, a longtime columnist

39
00:02:49,760 --> 00:02:53,239
Speaker 2: at the Financial Times, where he writes the Undercover Economist.

40
00:02:53,520 --> 00:02:57,320
Speaker 2: Tim is also a BBC broadcaster with his show More

41
00:02:57,480 --> 00:03:01,160
Speaker 2: or Less. Okay, let's get to the interview.

42
00:03:08,600 --> 00:03:11,480
Speaker 3: I am Jeff Boudier, and I'm a product director

43
00:03:11,560 --> 00:03:12,600
Speaker 3: at Hugging Face.

44
00:03:12,280 --> 00:03:16,800
Speaker 4: So I'm immediately intrigued. Hugging Face. Is this a

45
00:03:16,880 --> 00:03:18,840
Speaker 4: reference to the Alien movie or something else?

46
00:03:20,200 --> 00:03:24,040
Speaker 3: It is not, and it may not be obvious to

47
00:03:24,160 --> 00:03:27,800
Speaker 3: a listener, but Hugging Face is the name of that

48
00:03:27,960 --> 00:03:30,799
Speaker 3: cute emoji, you know, the one that's smiling with his

49
00:03:30,960 --> 00:03:34,120
Speaker 3: two hands extended like that to give you a big hug.

50
00:03:34,360 --> 00:03:37,640
Speaker 3: That's Hugging Face. So basically we named the company after

51
00:03:37,760 --> 00:03:39,120
Speaker 3: an emoji.

52
00:03:40,000 --> 00:03:42,400
Speaker 4: And it is. I saw your website, and it is

53
00:03:42,440 --> 00:03:45,480
Speaker 4: a very friendly emoji. So that's nice. So tell

54
00:03:45,560 --> 00:03:47,880
Speaker 4: us a little bit about Hugging Face and about what

55
00:03:47,920 --> 00:03:48,320
Speaker 4: you do there.

56
00:03:48,960 --> 00:03:53,440
Speaker 3: Of course, Hugging Face is the leading open platform for

57
00:03:53,800 --> 00:03:59,000
Speaker 3: AI builders, and it's the place that all the AI

58
00:03:59,080 --> 00:04:04,600
Speaker 3: researchers use to share their work, their new AI models

59
00:04:04,880 --> 00:04:09,600
Speaker 3: and collaborate around them. It's the place where the data

60
00:04:09,840 --> 00:04:15,080
Speaker 3: scientists go and find those pretrained models and access

61
00:04:15,160 --> 00:04:19,000
Speaker 3: them and use them and work with them, and increasingly

62
00:04:19,040 --> 00:04:23,360
Speaker 3: it's the place where developers are coming to turn all

63
00:04:23,400 --> 00:04:28,480
Speaker 3: of these AI models and data sets into their own applications,

64
00:04:28,520 --> 00:04:29,680
Speaker 3: their own features.

65
00:04:30,360 --> 00:04:33,039
Speaker 4: So it's like the I don't know, the Facebook group

66
00:04:33,120 --> 00:04:36,000
Speaker 4: or the Reddit or the Twitter for people who are

67
00:04:36,040 --> 00:04:40,280
Speaker 4: interested in particularly generative language AI, or all kinds of

68
00:04:40,360 --> 00:04:41,920
Speaker 4: artificial intelligence.

69
00:04:42,160 --> 00:04:46,400
Speaker 3: All kinds of AI really, and of course generative AI is

70
00:04:46,600 --> 00:04:51,320
Speaker 3: this new wave that has caught the world by storm.

71
00:04:51,680 --> 00:04:55,360
Speaker 3: But on Hugging Face you can find any kind of model,

72
00:04:55,680 --> 00:04:59,560
Speaker 3: the new sort of transformers models to do anything from

73
00:05:00,000 --> 00:05:04,080
Speaker 3: translation or if you wanted to transcribe what I'm saying

74
00:05:04,120 --> 00:05:07,560
Speaker 3: into text, then you would use a transformer model. If

75
00:05:07,560 --> 00:05:10,880
Speaker 3: you wanted to then take that text and make a summary,

76
00:05:11,320 --> 00:05:15,000
Speaker 3: that would be another transformer model. If you wanted to

77
00:05:15,360 --> 00:05:19,400
Speaker 3: create a nice little thumbnail for this podcast by typing

78
00:05:19,440 --> 00:05:23,280
Speaker 3: a sentence, that would be another type of model. So

79
00:05:23,360 --> 00:05:26,480
Speaker 3: all these models you can find. There's actually three hundred

80
00:05:26,640 --> 00:05:31,119
Speaker 3: thousand that are free and publicly accessible. You can find

81
00:05:31,160 --> 00:05:34,840
Speaker 3: them on our website at Hugging Face dot co and use

82
00:05:34,920 --> 00:05:37,520
Speaker 3: them using our open source libraries.

83
00:05:38,360 --> 00:05:41,160
Speaker 4: And so this is this is fascinating. So there are

84
00:05:41,160 --> 00:05:44,520
Speaker 4: three hundred thousand models. Now when you say model, I'm

85
00:05:44,560 --> 00:05:46,400
Speaker 4: thinking in my head, oh, it's kind of like a

86
00:05:47,360 --> 00:05:50,080
Speaker 4: computer program. There are three hundred thousand computer programs. Is

87
00:05:50,680 --> 00:05:52,359
Speaker 4: that roughly right, or not really?

88
00:05:52,440 --> 00:05:57,839
Speaker 3: It's a general idea. A model is a giant

89
00:05:59,680 --> 00:06:05,120
Speaker 3: set of numbers that are working together to sift through

90
00:06:05,760 --> 00:06:08,960
Speaker 3: some inputs that you're going to give it. So think

91
00:06:09,000 --> 00:06:13,480
Speaker 3: of it as a big black box filled with numbers,

92
00:06:14,440 --> 00:06:19,240
Speaker 3: and you give it as an input, maybe some text,

93
00:06:19,960 --> 00:06:23,880
Speaker 3: maybe a prompt, so you're asking, you're giving an instruction

94
00:06:24,120 --> 00:06:26,719
Speaker 3: to the model, or maybe you give it an image

95
00:06:26,800 --> 00:06:31,240
Speaker 3: as an input, and then it will sift through that

96
00:06:31,400 --> 00:06:35,400
Speaker 3: information thanks to all of these numbers, which we call

97
00:06:35,440 --> 00:06:39,880
Speaker 3: in the field parameters, and it will produce an output.

98
00:06:40,480 --> 00:06:43,039
Speaker 3: So when I told you, hey, we can transcribe this

99
00:06:43,279 --> 00:06:47,200
Speaker 3: conversation into text, the input would have been the conversation

100
00:06:47,800 --> 00:06:50,440
Speaker 3: in an audio file, and then the output would have

101
00:06:50,480 --> 00:06:53,479
Speaker 3: been the text of the transcription. If you want to

102
00:06:53,560 --> 00:06:57,599
Speaker 3: create a thumbnail for this podcast episode, then the input

103
00:06:57,640 --> 00:07:00,520
Speaker 3: would be what we call the prompt, which is really

104
00:07:00,520 --> 00:07:05,040
Speaker 3: a text description like a Frenchman in San Francisco talking

105
00:07:05,080 --> 00:07:10,720
Speaker 3: about machine learning, and the output would be a completely original image.

106
00:07:11,280 --> 00:07:14,600
Speaker 3: So that's how I think about what an AI model is,

107
00:07:15,080 --> 00:07:19,200
Speaker 3: and I think what we're starting to realize is that

108
00:07:20,080 --> 00:07:24,280
Speaker 3: this is becoming the new way of building technology in

109
00:07:24,320 --> 00:07:28,320
Speaker 3: the world. It has been for the field of dealing with,

110
00:07:28,440 --> 00:07:32,400
Speaker 3: understanding, and generating text for quite some time, but now it's

111
00:07:32,520 --> 00:07:36,440
Speaker 3: sort of moving across every field of technology. We have

112
00:07:36,840 --> 00:07:40,960
Speaker 3: models to create images, as I say, but also to

113
00:07:41,120 --> 00:07:46,600
Speaker 3: generate new proteins, to make predictions on numerical data. So

114
00:07:46,640 --> 00:07:51,160
Speaker 3: every kind of field of machine learning is now using

115
00:07:52,600 --> 00:07:56,480
Speaker 3: this new type of model. But what's interesting is that

116
00:07:57,080 --> 00:08:00,720
Speaker 3: if you're, say, a product manager at a company, and

117
00:08:00,760 --> 00:08:03,840
Speaker 3: you say, hey, I want to build a feature that

118
00:08:03,960 --> 00:08:07,320
Speaker 3: does this. A few years ago, the approach would have

119
00:08:07,360 --> 00:08:11,360
Speaker 3: been to ask a software developer to write a thousand

120
00:08:11,400 --> 00:08:14,840
Speaker 3: lines of code in order to build a prototype. And

121
00:08:14,920 --> 00:08:18,360
Speaker 3: the new way of doing things today is to go

122
00:08:18,520 --> 00:08:23,040
Speaker 3: look for an off-the-shelf pretrained model that

123
00:08:23,080 --> 00:08:27,200
Speaker 3: does a pretty good job at solving exactly that problem,

124
00:08:27,320 --> 00:08:30,400
Speaker 3: so you can create a prototype of that feature fast.

125
00:08:30,440 --> 00:08:33,000
Speaker 3: So it's a new approach of building tech.

126
00:08:33,200 --> 00:08:36,320
Speaker 4: I'm not a programmer, but I'm aware that there was

127
00:08:36,520 --> 00:08:39,080
Speaker 4: this idea of open source code, and now we have

128
00:08:39,160 --> 00:08:42,120
Speaker 4: open source models. So what does it mean for something

129
00:08:42,120 --> 00:08:43,040
Speaker 4: to be open source?

130
00:08:43,640 --> 00:08:49,400
Speaker 3: Open source AI actually means a lot of different specific things.

131
00:08:50,080 --> 00:08:54,280
Speaker 3: It's the open source implementation of the model. So if

132
00:08:54,320 --> 00:08:58,600
Speaker 3: you use the Hugging Face Transformers library to use a model,

133
00:08:58,640 --> 00:09:03,000
Speaker 3: you're using an open source code library to use that model.

134
00:09:03,080 --> 00:09:06,320
Speaker 4: Just to end up on the transformers. These are these

135
00:09:06,400 --> 00:09:09,640
Speaker 4: kind of ways of turning a picture of a dog

136
00:09:09,760 --> 00:09:12,440
Speaker 4: into a text output that says, hey, this is a

137
00:09:12,440 --> 00:09:15,079
Speaker 4: picture of a dog, or this is a French text

138
00:09:15,080 --> 00:09:17,920
Speaker 4: and the transformer's helping you turn it into English text,

139
00:09:18,000 --> 00:09:19,880
Speaker 4: or it's doing all of these things that you've been describing.

140
00:09:19,960 --> 00:09:23,920
Speaker 4: The transformer is kind of the engine at

141
00:09:23,920 --> 00:09:24,760
Speaker 4: the heart of that.

142
00:09:25,559 --> 00:09:29,960
Speaker 3: Yes, exactly. And we call them transformers because they correspond

143
00:09:30,000 --> 00:09:33,920
Speaker 3: to this new way of building machine learning models that

144
00:09:34,080 --> 00:09:38,800
Speaker 3: was introduced by Google actually with a very important paper

145
00:09:39,120 --> 00:09:41,920
Speaker 3: called Attention Is All You Need, and that was published

146
00:09:41,920 --> 00:09:46,440
Speaker 3: in twenty seventeen by researchers out of Google DeepMind.

147
00:09:47,400 --> 00:09:50,680
Speaker 4: Well, that's just six years, so new.

148
00:09:51,960 --> 00:09:55,240
Speaker 3: It is very new, and ever since the pace of

149
00:09:55,480 --> 00:10:00,920
Speaker 3: innovation of, like, new model architectures has really, really accelerated.

150
00:10:01,240 --> 00:10:06,000
Speaker 3: But it really started from this inflection point that came

151
00:10:06,120 --> 00:10:10,400
Speaker 3: from this paper and its implementation in what is now

152
00:10:10,440 --> 00:10:16,240
Speaker 3: called Transformer models, the transformer that has conquered every area

153
00:10:16,360 --> 00:10:18,080
Speaker 3: of machine learning since.

154
00:10:18,280 --> 00:10:21,840
Speaker 4: Okay, so say turned up. So you've got this library

155
00:10:21,840 --> 00:10:26,120
Speaker 4: of Transformer models and that's open source, and that means

156
00:10:26,280 --> 00:10:28,480
Speaker 4: that means what? Anyone can use them for free, or

157
00:10:29,240 --> 00:10:31,320
Speaker 4: that anybody can implement them for free. What does it mean?

158
00:10:32,840 --> 00:10:35,800
Speaker 3: So again, there's lots that go into it, but the

159
00:10:35,840 --> 00:10:40,240
Speaker 3: most important thing is for the model itself to be

160
00:10:40,480 --> 00:10:44,600
Speaker 3: available so that a data scientist or an engineer can

161
00:10:45,000 --> 00:10:49,400
Speaker 3: download them and use them. And also there are a

162
00:10:49,400 --> 00:10:54,240
Speaker 3: lot of considerations about how you make them accessible, and

163
00:10:54,280 --> 00:10:58,240
Speaker 3: a very important one is whether or not you give

164
00:10:58,480 --> 00:11:03,520
Speaker 3: access to the training data, all the information that went

165
00:11:03,679 --> 00:11:07,920
Speaker 3: into training that model and teaching it to do what

166
00:11:08,720 --> 00:11:09,640
Speaker 3: it's trained to do.

167
00:11:09,800 --> 00:11:12,800
Speaker 4: So I might have fed millions of words into a

168
00:11:12,920 --> 00:11:16,040
Speaker 4: language transformer, or I might have fed millions

169
00:11:16,040 --> 00:11:18,640
Speaker 4: of photographs into a picture transformer.

170
00:11:18,720 --> 00:11:22,160
Speaker 3: Yeah, yes, and now it's trillions, and the

171
00:11:22,520 --> 00:11:26,160
Speaker 3: accessibility of that training data is very very important.

172
00:11:27,160 --> 00:11:32,960
Speaker 4: What's the relationship between the Hugging Face libraries and GitHub, which,

173
00:11:34,080 --> 00:11:38,360
Speaker 4: if I understand GitHub correctly, it's the repository of

174
00:11:38,400 --> 00:11:42,360
Speaker 4: open source code lots and lots of lines of code

175
00:11:42,360 --> 00:11:47,280
Speaker 4: and routines and programs that are shared and updated and tracked,

176
00:11:47,320 --> 00:11:50,480
Speaker 4: and they're all available on GitHub, which sounds similar to

177
00:11:50,520 --> 00:11:52,959
Speaker 4: what you're doing with Hugging Face for AI. So what

178
00:11:52,960 --> 00:11:55,600
Speaker 4: is the interaction or the relationship there?

179
00:11:56,200 --> 00:11:58,640
Speaker 3: Yeah, I think you nailed it on the head there.

180
00:11:58,679 --> 00:12:02,839
Speaker 3: So Hugging Face is to AI what GitHub is to code, right?

181
00:12:02,840 --> 00:12:08,959
Speaker 3: It's this central platform where AI builders can go find

182
00:12:09,440 --> 00:12:14,720
Speaker 3: and collaborate around AI artifacts, which are models and data sets.

183
00:12:14,760 --> 00:12:18,719
Speaker 3: So it's quite different than software, but we play this

184
00:12:18,840 --> 00:12:23,079
Speaker 3: central role in the community to share and collaborate and

185
00:12:24,080 --> 00:12:28,880
Speaker 3: access all of those artifacts for AI, like GitHub offers

186
00:12:28,880 --> 00:12:29,839
Speaker 3: for code.

187
00:12:30,679 --> 00:12:33,600
Speaker 4: And that community must be incredibly important. I mean, the

188
00:12:33,640 --> 00:12:36,240
Speaker 4: open source is nothing if you don't have a community

189
00:12:36,280 --> 00:12:38,640
Speaker 4: of people working on it. So how have you been

190
00:12:38,679 --> 00:12:41,800
Speaker 4: able to foster and nurture that community.

191
00:12:42,400 --> 00:12:45,760
Speaker 3: Well, I think it goes to the origins of the

192
00:12:45,840 --> 00:12:49,960
Speaker 3: transformer model and Hugging Face's role in that. So

193
00:12:50,600 --> 00:12:55,160
Speaker 3: when the first sort of open model came out, it

194
00:12:55,280 --> 00:12:58,440
Speaker 3: was called BERT and it came out of Google. The

195
00:12:58,480 --> 00:13:02,720
Speaker 3: only way you could access it was to use

196
00:13:02,920 --> 00:13:07,360
Speaker 3: a tool called TensorFlow. But it happened that most of

197
00:13:07,400 --> 00:13:12,840
Speaker 3: the AI community was using a different tool called PyTorch,

198
00:13:13,960 --> 00:13:18,920
Speaker 3: and something that Hugging Face did is to make that

199
00:13:19,000 --> 00:13:25,480
Speaker 3: new model BERT accessible to all PyTorch users, and they

200
00:13:25,480 --> 00:13:28,680
Speaker 3: did it in open source. It was a project called

201
00:13:29,200 --> 00:13:32,720
Speaker 3: BERT Pretrained PyTorch, or PyTorch Pretrained BERT.

202
00:13:33,240 --> 00:13:35,360
Speaker 4: So this is like being able to play my Zelda

203
00:13:35,400 --> 00:13:39,440
Speaker 4: game on an Xbox or a PlayStation, right or am

204
00:13:39,480 --> 00:13:41,120
Speaker 4: I not really understanding what's going on?

205
00:13:41,559 --> 00:13:43,920
Speaker 3: No, That's exactly what it is. And the thing is

206
00:13:44,120 --> 00:13:48,080
Speaker 3: everybody was using the Game Boy, and so it became

207
00:13:48,440 --> 00:13:53,200
Speaker 3: very popular, and from there the community sort of

208
00:13:53,280 --> 00:13:56,839
Speaker 3: gathered to make all the other models that were then

209
00:13:56,960 --> 00:14:01,360
Speaker 3: published by AI researchers available with that library, which was

210
00:14:01,440 --> 00:14:07,000
Speaker 3: quickly renamed from BERT Pretrained PyTorch into Transformers to welcome

211
00:14:07,120 --> 00:14:12,280
Speaker 3: like all of these different new models, and today that

212
00:14:12,440 --> 00:14:17,440
Speaker 3: open source library, Transformers, is what all AI builders are

213
00:14:17,559 --> 00:14:20,880
Speaker 3: using when they want to access those models, see how

214
00:14:20,920 --> 00:14:22,400
Speaker 3: they work, and build upon them.

215
00:14:23,720 --> 00:14:26,880
Speaker 4: What's striking about this field is that it's changing so fast,

216
00:14:26,920 --> 00:14:30,720
Speaker 4: it's improving so quickly. So how do open source models

217
00:14:31,440 --> 00:14:35,320
Speaker 4: keep up with that? How do they get iterated and improved?

218
00:14:35,440 --> 00:14:38,400
Speaker 3: Actually, it's not so much that open source is keeping

219
00:14:38,480 --> 00:14:41,440
Speaker 3: up with it. It's actually open source that is driving

220
00:14:42,160 --> 00:14:45,600
Speaker 3: this pace of change. And that's because

221
00:14:46,320 --> 00:14:51,680
Speaker 3: with open source and open research, data scientists and researchers can

222
00:14:51,800 --> 00:14:55,480
Speaker 3: build upon each other's work, they can reproduce each other's work,

223
00:14:55,760 --> 00:14:59,760
Speaker 3: they can access each other's work using our open source library,

224
00:15:00,000 --> 00:15:02,320
Speaker 3: et cetera. So in a sense, it's not really that

225
00:15:02,720 --> 00:15:07,320
Speaker 3: open source AI is a new idea. It's rather the opposite.

226
00:15:07,480 --> 00:15:11,600
Speaker 3: There's been a blip of time in which closed source

227
00:15:11,840 --> 00:15:15,560
Speaker 3: AI seemed to be the dominant way, but it's really

228
00:15:16,120 --> 00:15:19,840
Speaker 3: a blip. In fact, you know, none of the incredible

229
00:15:19,880 --> 00:15:24,480
Speaker 3: advances that we marvel at today would be possible without

230
00:15:24,680 --> 00:15:27,680
Speaker 3: open source. We're standing upon the shoulders of fifty years

231
00:15:27,680 --> 00:15:32,120
Speaker 3: of research and open source software. So I think that

232
00:15:32,120 --> 00:15:35,000
Speaker 3: that's really important. If it wasn't for that, we'd probably

233
00:15:35,000 --> 00:15:39,880
Speaker 3: be fifty years away from having these amazing experiences like

234
00:15:40,040 --> 00:15:45,840
Speaker 3: ChatGPT or Stable Diffusion, et cetera. So it's really open

235
00:15:45,880 --> 00:15:50,240
Speaker 3: source that is fueling this pace of change, all these

236
00:15:50,280 --> 00:15:53,800
Speaker 3: new models, all these new capabilities. To give you an example,

237
00:15:54,120 --> 00:15:58,640
Speaker 3: so Meta released the Llama large language model just a

238
00:15:58,680 --> 00:16:02,960
Speaker 3: few months ago, and ever since, there's been this Cambrian

239
00:16:03,120 --> 00:16:07,520
Speaker 3: explosion of variations and improvements upon the original models, and

240
00:16:07,560 --> 00:16:10,600
Speaker 3: today there are over a thousand of them that we

241
00:16:11,160 --> 00:16:16,560
Speaker 3: host and track and evaluate. So yeah, open source is

242
00:16:16,600 --> 00:16:20,280
Speaker 3: really the gas and the engine for that.

243
00:16:21,560 --> 00:16:24,400
Speaker 2: Jeff just made it clear that it is open source,

244
00:16:24,640 --> 00:16:28,640
Speaker 2: not closed, that sets the pace for AI innovation. If

245
00:16:28,680 --> 00:16:33,240
Speaker 2: that's true, then forward thinking businesses shouldn't shy from leveraging

246
00:16:33,320 --> 00:16:37,680
Speaker 2: open source AI to solve their own proprietary challenges. But

247
00:16:37,880 --> 00:16:42,800
Speaker 2: how? Businesses can face serious obstacles when trying to adopt

248
00:16:43,040 --> 00:16:47,600
Speaker 2: open source technologies, like complying with government regulation or making

249
00:16:47,640 --> 00:16:51,880
Speaker 2: sure their customers' data stays protected. In the next part

250
00:16:51,920 --> 00:16:56,200
Speaker 2: of their conversation, Jeff and Tim discuss how IBM's collaboration

251
00:16:56,360 --> 00:17:00,520
Speaker 2: with Hugging Face empowers businesses to tap into the open

252
00:17:00,560 --> 00:17:04,879
Speaker 2: source AI community and how the watsonx platform can enable

253
00:17:04,920 --> 00:17:08,720
Speaker 2: them to customize those AI models to their needs.

254
00:17:09,400 --> 00:17:11,920
Speaker 4: Just want to ask about the partnership between Hugging Face

255
00:17:11,960 --> 00:17:14,720
Speaker 4: and IBM. How did that come about?

256
00:17:16,680 --> 00:17:23,280
Speaker 3: Well, it came through a conversation, a conversation between our CEO,

257
00:17:24,080 --> 00:17:29,320
Speaker 3: Clement Delangue, and Bill Higgins at IBM, who's really, really

258
00:17:29,400 --> 00:17:34,280
Speaker 3: close to all the amazing research work and open source

259
00:17:34,400 --> 00:17:39,399
Speaker 3: work that's happening at IBM, and that conversation sort of

260
00:17:39,680 --> 00:17:44,240
Speaker 3: sparked the evidence that we needed to do something together.

261
00:17:44,840 --> 00:17:48,840
Speaker 3: We share a lot of values in terms of the

262
00:17:48,880 --> 00:17:53,600
Speaker 3: importance of open source, which is fundamental to us, with

263
00:17:54,000 --> 00:17:58,800
Speaker 3: the importance of doing things in an ethics first way

264
00:17:58,920 --> 00:18:04,040
Speaker 3: to enable the community to incorporate ethical considerations in how

265
00:18:04,520 --> 00:18:09,760
Speaker 3: they're building AI. And we sort of have a different

266
00:18:10,040 --> 00:18:14,119
Speaker 3: audience to start with, which is all the AI builders

267
00:18:14,240 --> 00:18:18,840
Speaker 3: use Hugging Face today to access all the models we

268
00:18:18,960 --> 00:18:22,879
Speaker 3: talked about, to use them using our open source and

269
00:18:22,920 --> 00:18:27,320
Speaker 3: build with them. And IBM has this incredible history of

270
00:18:27,440 --> 00:18:32,920
Speaker 3: working with enterprise companies and enabling them to make use

271
00:18:32,960 --> 00:18:37,000
Speaker 3: of that technology in a way that's compliant with everything

272
00:18:37,040 --> 00:18:40,800
Speaker 3: that an enterprise requires, and so being able to marry

273
00:18:40,840 --> 00:18:45,000
Speaker 3: these two things together is an amazing opportunity. And now

274
00:18:45,040 --> 00:18:49,280
Speaker 3: we can enable the largest corporations that have sort of

275
00:18:49,520 --> 00:18:54,920
Speaker 3: complex requirements in order to deploy machine learning systems and

276
00:18:55,720 --> 00:18:59,080
Speaker 3: give them an easy experience to take advantage of all

277
00:18:59,119 --> 00:19:01,600
Speaker 3: the latest and greatest that AI has to offer

278
00:19:02,119 --> 00:19:02,920
Speaker 3: through our platform.

279
00:19:04,480 --> 00:19:08,040
Speaker 4: Let's talk about this idea of a single model or

280
00:19:08,080 --> 00:19:11,600
Speaker 4: a variety of models, because what I've been hearing you say.

281
00:19:12,160 --> 00:19:14,000
Speaker 4: You've been saying, oh, there are lots of models, there

282
00:19:14,040 --> 00:19:18,119
Speaker 4: are hundreds of thousands of models available on Hugging Face.

283
00:19:18,280 --> 00:19:21,640
Speaker 4: But you've also said there's this single thing, the transformer,

284
00:19:22,280 --> 00:19:26,720
Speaker 4: and they're all transformers. So if they're all basically the

285
00:19:26,760 --> 00:19:31,480
Speaker 4: same thing, why can't you just build one super clever

286
00:19:31,560 --> 00:19:32,640
Speaker 4: model that can do everything.

287
00:19:34,760 --> 00:19:39,679
Speaker 3: That's a really interesting idea and very much a new idea.

288
00:19:40,520 --> 00:19:44,400
Speaker 3: The reason we have over a million repositories, three hundred

289
00:19:44,480 --> 00:19:48,119
Speaker 3: thousand free and accessible models on the Hugging Face platform,

290
00:19:48,560 --> 00:19:52,320
Speaker 3: is that models are typically trained to do one thing,

291
00:19:52,680 --> 00:19:55,920
Speaker 3: and they're typically trained to do one thing with specific

292
00:19:55,960 --> 00:20:02,439
Speaker 3: types of data. And what became new and evident in

293
00:20:02,480 --> 00:20:04,920
Speaker 3: the research that came out over the last couple of

294
00:20:05,000 --> 00:20:09,120
Speaker 3: years is that if you train a big enough model

295
00:20:09,600 --> 00:20:14,680
Speaker 3: with enough data, then those models start to have sort

296
00:20:14,680 --> 00:20:18,720
Speaker 3: of general capabilities. You can ask them to do different things.

297
00:20:19,000 --> 00:20:22,480
Speaker 3: You can even train them to respond to instructions. So

298
00:20:22,600 --> 00:20:26,840
Speaker 3: with the same model, you can say, hey, summarize this paragraph,

299
00:20:27,240 --> 00:20:30,960
Speaker 3: translate this into English, start a conversation in French, and

300
00:20:30,960 --> 00:20:34,560
Speaker 3: pivot to German. And so these are general sort of

301
00:20:34,680 --> 00:20:42,000
Speaker 3: language capabilities. And I think when ChatGPT came online and

302
00:20:42,320 --> 00:20:47,000
Speaker 3: the world sort of discovered these new capabilities, there was,

303
00:20:47,560 --> 00:20:50,480
Speaker 3: at least for a short period, this sort of idea,

304
00:20:50,600 --> 00:20:54,480
Speaker 3: this sort of myth that the endgame of all this

305
00:20:55,440 --> 00:20:59,199
Speaker 3: is maybe one or a handful of models that are

306
00:20:59,400 --> 00:21:03,640
Speaker 3: so much better than anything else that exists that they

307
00:21:03,640 --> 00:21:06,280
Speaker 3: can do anything that we can ask them to do,

308
00:21:07,080 --> 00:21:10,560
Speaker 3: and that's the only model that we will need. And I,

309
00:21:10,800 --> 00:21:15,080
Speaker 3: for one, think it is a myth. I don't think

310
00:21:15,119 --> 00:21:19,200
Speaker 3: it is practical for a variety of reasons. Say you're

311
00:21:19,600 --> 00:21:23,760
Speaker 3: writing an email and you have like this great suggestion

312
00:21:23,920 --> 00:21:28,199
Speaker 3: of text to sort of complete your sentence. Well, that's AI.

313
00:21:28,640 --> 00:21:31,159
Speaker 3: That's a large language model, that's a transformer model that

314
00:21:31,200 --> 00:21:33,840
Speaker 3: does that. So there are a ton of existing use

315
00:21:33,880 --> 00:21:37,520
Speaker 3: cases like this, and these use cases are powered by

316
00:21:38,320 --> 00:21:41,280
Speaker 3: specific models that have been trained to do one thing

317
00:21:41,400 --> 00:21:44,479
Speaker 3: well and to do it fast. If you wanted to

318
00:21:44,600 --> 00:21:51,200
Speaker 3: apply this sort of all-knowing, powerful oracle type of model,

319
00:21:51,600 --> 00:21:55,639
Speaker 3: you would not be able to serve millions of customers

320
00:21:55,680 --> 00:21:58,359
Speaker 3: through a search engine. You will not be able to

321
00:22:00,080 --> 00:22:04,119
Speaker 3: complete people's sentences because the amount of money that you

322
00:22:04,160 --> 00:22:07,400
Speaker 3: would need, the number of computers that you would need

323
00:22:07,640 --> 00:22:13,240
Speaker 3: to run such a service just exceeds what is available

324
00:22:13,359 --> 00:22:18,760
Speaker 3: on the planet. So one reason for which it's not

325
00:22:18,880 --> 00:22:24,359
Speaker 3: a practical scenario is that it's just very expensive to

326
00:22:24,600 --> 00:22:27,440
Speaker 3: run those very very large models.

327
00:22:27,760 --> 00:22:29,920
Speaker 4: What I'm hearing is it's like, look, if you want

328
00:22:29,920 --> 00:22:33,679
Speaker 4: to screw in a screw, you need a screwdriver. You

329
00:22:33,720 --> 00:22:37,720
Speaker 4: don't want an entire tool shed full of tools if

330
00:22:37,800 --> 00:22:39,960
Speaker 4: the task is to screw in a screw, and sure,

331
00:22:40,040 --> 00:22:43,240
Speaker 4: you could bring the tool shed that has all the tools.

332
00:22:43,280 --> 00:22:47,320
Speaker 4: There's a screwdriver there, but it's not necessary. It's incredibly expensive,

333
00:22:47,320 --> 00:22:52,119
Speaker 4: it's incredibly cumbersome, and that cost exists even though maybe

334
00:22:52,200 --> 00:22:54,879
Speaker 4: it's the user who's just typing into a

335
00:22:54,920 --> 00:22:57,680
Speaker 4: prompt box. The user may not see it, but it's

336
00:22:57,680 --> 00:22:58,720
Speaker 4: still very real.

337
00:23:00,040 --> 00:23:03,480
Speaker 3: That's right. And then another one is performance. So taking

338
00:23:03,520 --> 00:23:06,760
Speaker 3: the screwdriver example, so and by the way, like we're

339
00:23:06,800 --> 00:23:09,560
Speaker 3: not quite there at this moment where we have this

340
00:23:09,720 --> 00:23:13,240
Speaker 3: all-knowing, powerful oracle. That is still sort of a

341
00:23:13,320 --> 00:23:16,919
Speaker 3: sci-fi scenario. But we have screwdrivers, but we

342
00:23:17,040 --> 00:23:21,680
Speaker 3: also have the Leatherman, right, the multitool, the Swiss Army knife.

343
00:23:21,920 --> 00:23:24,919
Speaker 3: And that's sort of the moment that we are in today.

344
00:23:24,960 --> 00:23:28,600
Speaker 3: But now if I'm trying to open up my computer,

345
00:23:29,200 --> 00:23:32,439
Speaker 3: turns out that it requires a specific kind of screw

346
00:23:32,600 --> 00:23:36,760
Speaker 3: like these tiny little Torx screws, and having a Torx

347
00:23:36,800 --> 00:23:40,520
Speaker 3: screwdriver will get me much further than trying to use

348
00:23:40,760 --> 00:23:43,399
Speaker 3: my Leatherman, where maybe I'll get the knife blade

349
00:23:43,440 --> 00:23:46,520
Speaker 3: and it will mess up the screw and maybe eventually

350
00:23:46,520 --> 00:23:49,160
Speaker 3: I'll get to what I need. But my point is

351
00:23:49,280 --> 00:23:54,399
Speaker 3: that if you take a very specifically trained model for

352
00:23:54,480 --> 00:23:57,960
Speaker 3: a particular problem, it will work much better. It will

353
00:23:57,960 --> 00:24:02,760
Speaker 3: give you better results than a very very generalistic, big

354
00:24:02,840 --> 00:24:05,800
Speaker 3: model that can do a lot of things. And so

355
00:24:05,880 --> 00:24:10,119
Speaker 3: for things like search engines or things like translation, for

356
00:24:10,280 --> 00:24:15,000
Speaker 3: things that are very specific, companies are much better off

357
00:24:15,119 --> 00:24:19,680
Speaker 3: using smaller, more efficient models that produce better results.

358
00:24:19,480 --> 00:24:24,000
Speaker 4: That's really interesting. And presumably then being able to know

359
00:24:24,040 --> 00:24:26,800
Speaker 4: which model to use, or being able to know who

360
00:24:26,840 --> 00:24:30,640
Speaker 4: to ask which model to use, becomes a very important capability.

361
00:24:31,480 --> 00:24:35,000
Speaker 3: Yes, and that's what we're trying to make easy through

362
00:24:35,040 --> 00:24:35,800
Speaker 3: our platform.

363
00:24:37,160 --> 00:24:41,160
Speaker 4: So tell me about how this works with IBM's

364
00:24:41,160 --> 00:24:44,760
Speaker 4: watsonx platform? How do you see Hugging Face's customers

365
00:24:44,800 --> 00:24:45,640
Speaker 4: benefiting from that?

366
00:24:47,560 --> 00:24:51,640
Speaker 3: The end goal is to make it really easy for

367
00:24:51,760 --> 00:24:56,240
Speaker 3: watsonx customers to make use of all the

368
00:24:56,320 --> 00:25:00,600
Speaker 3: great models and libraries that we talked about, all the

369
00:25:00,600 --> 00:25:03,320
Speaker 3: three hundred thousand models that are on Hugging Face today,

370
00:25:04,160 --> 00:25:08,440
Speaker 3: and to do this we need to really collaborate deeply

371
00:25:08,520 --> 00:25:12,080
Speaker 3: with the IBM teams that build the watsonx

372
00:25:12,160 --> 00:25:17,360
Speaker 3: platform so that our libraries, our open source, our models

373
00:25:17,760 --> 00:25:21,480
Speaker 3: are well integrated into the platform. If you are a

374
00:25:21,640 --> 00:25:24,560
Speaker 3: single user, if you are a data science student and

375
00:25:24,600 --> 00:25:26,680
Speaker 3: you want to use a model, we make it

376
00:25:26,720 --> 00:25:29,399
Speaker 3: super easy, right. We have our open source library. You

377
00:25:29,440 --> 00:25:32,159
Speaker 3: can download the model on your computer and run with

378
00:25:32,240 --> 00:25:37,320
Speaker 3: it. But in enterprises, there is a vast complexity

379
00:25:37,560 --> 00:25:42,800
Speaker 3: of infrastructure and rules around what people can do and

380
00:25:43,400 --> 00:25:47,600
Speaker 3: how the data can be accessed, and all this complexity

381
00:25:48,280 --> 00:25:52,879
Speaker 3: is sort of solved by the watsonx platform.

382
00:25:53,560 --> 00:25:57,520
Speaker 4: This season of the Smart Talks podcast features what we're

383
00:25:57,520 --> 00:26:00,399
Speaker 4: calling new creators. Do you see yourself as being a

384
00:26:00,440 --> 00:26:01,440
Speaker 4: creative person?

385
00:26:02,359 --> 00:26:05,960
Speaker 3: Ah, I think it's a requirement for the job. I mean,

386
00:26:05,960 --> 00:26:10,720
Speaker 3: we're in such a new and rapidly evolving industry that

387
00:26:11,000 --> 00:26:15,000
Speaker 3: we have to be creative in order to invent the

388
00:26:15,080 --> 00:26:19,640
Speaker 3: business models, the use cases of tomorrow. My role within

389
00:26:19,680 --> 00:26:24,680
Speaker 3: the company is really to create the business around all

390
00:26:24,840 --> 00:26:28,840
Speaker 3: the great work of our science and open source and

391
00:26:28,960 --> 00:26:33,080
Speaker 3: product team, and by and large, the business model of

392
00:26:33,240 --> 00:26:38,200
Speaker 3: AI within the whole ecosystem is still something that companies

393
00:26:38,240 --> 00:26:43,200
Speaker 3: are trying to figure out. So creativity is really important

394
00:26:43,320 --> 00:26:47,120
Speaker 3: to really have the conversation with companies, understand what they're

395
00:26:47,160 --> 00:26:49,240
Speaker 3: trying to do, and then build the right kind of solution.

396
00:26:49,840 --> 00:26:54,080
Speaker 3: So that's like where creativity comes into play.

397
00:26:54,800 --> 00:26:59,000
Speaker 4: And one of the things that you've been talking

398
00:26:59,040 --> 00:27:02,520
Speaker 4: about is just this growing number of models, this growing

399
00:27:02,600 --> 00:27:09,040
Speaker 4: number of capabilities, this growing number of use cases, enormously

400
00:27:09,080 --> 00:27:15,000
Speaker 4: exciting, but also, I think, completely bewildering for most people

401
00:27:16,000 --> 00:27:20,640
Speaker 4: who are trying to navigate their way through this maze

402
00:27:20,680 --> 00:27:23,960
Speaker 4: of possibilities that is growing faster than they can even

403
00:27:24,200 --> 00:27:28,200
Speaker 4: learn about it. So how are you helping people navigate

404
00:27:28,400 --> 00:27:30,879
Speaker 4: and make choices in that environment? And how does the

405
00:27:30,920 --> 00:27:32,840
Speaker 4: partnership with IBM help with that?

406
00:27:35,640 --> 00:27:39,520
Speaker 3: Well, as I said, our vision is that AI machine

407
00:27:39,600 --> 00:27:44,639
Speaker 3: learning is becoming the default way of creating technology and

408
00:27:44,680 --> 00:27:48,520
Speaker 3: that means like every product, app, service that you're going

409
00:27:48,600 --> 00:27:52,159
Speaker 3: to be using is going to be using AI to

410
00:27:52,280 --> 00:27:57,280
Speaker 3: do whatever it is better, faster. And I guess there

411
00:27:57,280 --> 00:28:01,400
Speaker 3: are two competing visions of the world coming from that.

412
00:28:01,480 --> 00:28:07,639
Speaker 3: There is this vision of the oracle, the all-powerful model

413
00:28:07,720 --> 00:28:12,159
Speaker 3: that can do everything, and our vision is different. Our

414
00:28:12,240 --> 00:28:17,640
Speaker 3: vision is that every single company will be able to

415
00:28:17,720 --> 00:28:22,680
Speaker 3: create their own models that they own, that they can use,

416
00:28:22,760 --> 00:28:27,560
Speaker 3: that they control, and that's the vision that we're trying

417
00:28:27,600 --> 00:28:31,440
Speaker 3: to bring to life through our open source tools that

418
00:28:31,760 --> 00:28:35,560
Speaker 3: make this work easy, and through our platform, where you can

419
00:28:35,600 --> 00:28:38,640
Speaker 3: find all those pretrained models that are shared by the community.

420
00:28:39,080 --> 00:28:41,840
Speaker 3: So we really want to empower companies to build their

421
00:28:41,880 --> 00:28:45,640
Speaker 3: own stuff, not to outsource all the intelligence to a

422
00:28:45,720 --> 00:28:51,120
Speaker 3: third party. And the watsonx platform from IBM

423
00:28:51,920 --> 00:28:56,920
Speaker 3: gives those tools to enterprise companies, so that you can

424
00:28:57,600 --> 00:29:02,680
Speaker 3: use the open source models Hugging Face offers, then you

425
00:29:02,760 --> 00:29:07,480
Speaker 3: can improve them with your own data without sharing that

426
00:29:07,600 --> 00:29:10,520
Speaker 3: data with a third party, and then you could do

427
00:29:11,160 --> 00:29:16,680
Speaker 3: all of this work in compliance with whatever governance requirements

428
00:29:17,080 --> 00:29:20,800
Speaker 3: that you have for your company. Maybe you're a financial services

429
00:29:20,800 --> 00:29:24,680
Speaker 3: company and you have a specific set of rules. Maybe

430
00:29:25,000 --> 00:29:30,120
Speaker 3: you're a healthcare company and you have very strong privacy requirements

431
00:29:30,320 --> 00:29:35,480
Speaker 3: for patients' data. Maybe you're a tech company, and you have

432
00:29:35,600 --> 00:29:40,560
Speaker 3: your customers', your users' personal information, so you need to

433
00:29:40,560 --> 00:29:43,320
Speaker 3: be able to do this work respecting all of that.

434
00:29:44,360 --> 00:29:46,280
Speaker 4: Jeff Boudier, thank you very much.

435
00:29:46,960 --> 00:29:48,640
Speaker 3: Thanks so much, Tim. It's fun.

436
00:29:50,320 --> 00:29:53,280
Speaker 2: To create the AI models of the future. We're going

437
00:29:53,280 --> 00:29:55,800
Speaker 2: to need open source. That means there's a place for

438
00:29:55,960 --> 00:29:58,960
Speaker 2: business in the open source community to harness the game

439
00:29:59,080 --> 00:30:04,600
Speaker 2: changing potential of AI innovation. Like Jeff said, businesses face

440
00:30:04,840 --> 00:30:08,800
Speaker 2: unique challenges they need to solve at scale. Without proper

441
00:30:08,840 --> 00:30:13,000
Speaker 2: support systems, tapping into open source AI at enterprise level

442
00:30:13,320 --> 00:30:16,600
Speaker 2: is daunting: finding the right size model for the job,

443
00:30:16,920 --> 00:30:21,480
Speaker 2: fine tuning its purpose, all while addressing governance requirements around

444
00:30:21,560 --> 00:30:27,520
Speaker 2: data privacy and ethics. So for businesses, IBM's collaboration with

445
00:30:27,640 --> 00:30:31,440
Speaker 2: Hugging Face is a mark of progress because it signifies that

446
00:30:31,600 --> 00:30:36,040
Speaker 2: business can tap into open source AI while preserving enterprise

447
00:30:36,120 --> 00:30:41,280
Speaker 2: level integrity. Businesses should embrace the open source community and

448
00:30:41,360 --> 00:30:45,120
Speaker 2: the AI future, much like Hugging Face and its emoji

449
00:30:45,200 --> 00:30:49,720
Speaker 2: namesake suggests. I'm Malcolm Gladwell. This is a paid advertisement

450
00:30:49,840 --> 00:30:54,479
Speaker 2: from IBM. Smart Talks with IBM is produced by Matt Romano,

451
00:30:54,960 --> 00:30:59,200
Speaker 2: David Zha, Nisha Venkat, and Royston Beserve, with Jacob Goldstein.

452
00:31:00,320 --> 00:31:04,240
Speaker 2: We're edited by Lidia Jean Kott. Our engineers are Jason Gambrell, Sarah

453
00:31:04,280 --> 00:31:09,720
Speaker 2: Bruguiere, and Ben Tolliday. Theme song by Gramoscope. Special thanks

454
00:31:09,720 --> 00:31:13,400
Speaker 2: to Carly Migliori, Andy Kelly, Kathy Callaghan, and the 8

455
00:31:13,440 --> 00:31:17,440
Speaker 2: Bar and IBM teams, as well as the Pushkin marketing team.

456
00:31:17,640 --> 00:31:20,600
Speaker 2: Smart Talks with IBM is a production of Pushkin Industries

457
00:31:20,960 --> 00:31:25,720
Speaker 2: and Ruby Studio at iHeartMedia. To find more Pushkin podcasts,

458
00:31:25,920 --> 00:31:30,560
Speaker 2: listen on the iHeartRadio app, Apple Podcasts, or wherever you

459
00:31:30,720 --> 00:31:42,360
Speaker 2: listen to podcasts.