WEBVTT - Smart Talks with IBM - Hugging Face and watsonx: Why Open Source Is the Future of AI in Business

0:00:04.440 --> 0:00:12.600
<v Speaker 1>Welcome to Tech Stuff, a production from iHeartRadio. Today, we

0:00:12.680 --> 0:00:15.640
<v Speaker 1>are witness to one of those rare moments in history,

0:00:16.000 --> 0:00:19.239
<v Speaker 1>the rise of an innovative technology with the potential to

0:00:19.360 --> 0:00:24.080
<v Speaker 1>radically transform business and society forever. That technology, of course,

0:00:24.560 --> 0:00:28.120
<v Speaker 1>is artificial intelligence, and it's the central focus for this

0:00:28.280 --> 0:00:32.320
<v Speaker 1>new season of Smart Talks with IBM. Join hosts from

0:00:32.320 --> 0:00:36.040
<v Speaker 1>your favorite Pushkin podcasts as they talk with industry experts

0:00:36.080 --> 0:00:39.640
<v Speaker 1>and leaders to explore how businesses can integrate AI into

0:00:39.720 --> 0:00:43.040
<v Speaker 1>their workflows and help drive real change in this new

0:00:43.120 --> 0:00:46.800
<v Speaker 1>era of AI, and of course, host Malcolm Gladwell will

0:00:46.840 --> 0:00:49.120
<v Speaker 1>be there to guide you through the season and throw

0:00:49.240 --> 0:00:52.120
<v Speaker 1>in his two cents as well. Look out for new

0:00:52.159 --> 0:00:55.040
<v Speaker 1>episodes of Smart Talks with IBM every other week on

0:00:55.080 --> 0:00:59.320
<v Speaker 1>the iHeartRadio app, Apple Podcasts, wherever you get your podcasts,

0:00:59.520 --> 0:01:03.760
<v Speaker 1>and learn more at IBM dot com slash smart Talks.

0:01:04.840 --> 0:01:08.560
<v Speaker 2>Hello, Hello, Welcome to Smart Talks with IBM, a podcast

0:01:08.560 --> 0:01:14.959
<v Speaker 2>from Pushkin Industries, iHeartRadio and IBM. I'm Malcolm Gladwell. This season,

0:01:15.160 --> 0:01:19.760
<v Speaker 2>we're continuing our conversation with new creators, visionaries who are

0:01:19.840 --> 0:01:23.880
<v Speaker 2>creatively applying technology in business to drive change, but with

0:01:23.920 --> 0:01:28.760
<v Speaker 2>a focus on the transformative power of artificial intelligence and

0:01:28.840 --> 0:01:31.840
<v Speaker 2>what it means to leverage AI as a game-changing

0:01:31.920 --> 0:01:36.399
<v Speaker 2>multiplier for your business. Our guest today is Jeff Boudier,

0:01:36.840 --> 0:01:40.360
<v Speaker 2>head of Product and Growth at Hugging Face, the leading

0:01:40.480 --> 0:01:45.640
<v Speaker 2>open source and open science artificial intelligence platform. An engineer

0:01:45.680 --> 0:01:49.560
<v Speaker 2>by background, he has a self professed obsession with the

0:01:49.600 --> 0:01:54.160
<v Speaker 2>business of technology. Recently, IBM and Hugging Face announced a

0:01:54.240 --> 0:01:58.880
<v Speaker 2>collaboration bringing together Hugging Face's repositories of open source AI

0:01:58.960 --> 0:02:03.800
<v Speaker 2>models with IBM's watsonx platform. It's a move that

0:02:03.840 --> 0:02:08.239
<v Speaker 2>gives businesses even more access to AI while staying true

0:02:08.280 --> 0:02:13.920
<v Speaker 2>to IBM's long standing philosophy of supporting open source technology.

0:02:14.840 --> 0:02:18.720
<v Speaker 2>With open source, businesses can build better AI models that

0:02:18.800 --> 0:02:23.320
<v Speaker 2>suit their specific needs using their own proprietary data while

0:02:23.360 --> 0:02:28.440
<v Speaker 2>browsing a ready catalog of pre-trained models. In today's episode,

0:02:28.639 --> 0:02:31.760
<v Speaker 2>you'll hear why open source is so crucial to the

0:02:31.800 --> 0:02:36.400
<v Speaker 2>advancement of AI, how IBM's watsonx interacts with open

0:02:36.440 --> 0:02:40.919
<v Speaker 2>source AI, and Jeff's thoughts on why this singular omnipotent

0:02:41.160 --> 0:02:45.240
<v Speaker 2>AI model is a myth. Jeff spoke with Tim Harford,

0:02:45.440 --> 0:02:49.680
<v Speaker 2>host of the Pushkin podcast Cautionary Tales, a longtime columnist

0:02:49.760 --> 0:02:53.239
<v Speaker 2>at the Financial Times, where he writes the Undercover Economist.

0:02:53.520 --> 0:02:57.320
<v Speaker 2>Tim is also a BBC broadcaster with his show More

0:02:57.480 --> 0:03:01.160
<v Speaker 2>or Less. Okay, let's get to the interview.

0:03:08.600 --> 0:03:11.480
<v Speaker 3>I am Jeff Boudier and I'm a product director

0:03:11.560 --> 0:03:12.600
<v Speaker 3>at Hugging Face.

0:03:12.280 --> 0:03:16.800
<v Speaker 4>So I'm immediately intrigued. Hugging Face. Is this a

0:03:16.880 --> 0:03:18.840
<v Speaker 4>reference to the Alien movie or something else?

0:03:20.200 --> 0:03:24.040
<v Speaker 3>It is not, and it may be not obvious to

0:03:24.160 --> 0:03:27.800
<v Speaker 3>a listener, but Hugging Face is the name of that

0:03:27.960 --> 0:03:30.799
<v Speaker 3>cute emoji, you know, the one that's smiling with his

0:03:30.960 --> 0:03:34.120
<v Speaker 3>two hands extended like that to give you a big hug.

0:03:34.360 --> 0:03:37.640
<v Speaker 3>That's Hugging Face. So basically we named the company after

0:03:37.760 --> 0:03:39.120
<v Speaker 3>an emoji.

0:03:40.000 --> 0:03:42.400
<v Speaker 4>And I saw your website, and it is

0:03:42.440 --> 0:03:45.480
<v Speaker 4>a very friendly emoji. So that's nice. So tell

0:03:45.560 --> 0:03:47.880
<v Speaker 4>us a little bit about Hugging Face and about what

0:03:47.920 --> 0:03:48.320
<v Speaker 4>you do there.

0:03:48.960 --> 0:03:53.440
<v Speaker 3>Of course. Hugging Face is the leading open platform for

0:03:53.800 --> 0:03:59.000
<v Speaker 3>AI builders, and it's the place that all the AI

0:03:59.080 --> 0:04:04.600
<v Speaker 3>researchers use to share their work, their new AI models

0:04:04.880 --> 0:04:09.600
<v Speaker 3>and collaborate around them. It's the place where the data

0:04:09.840 --> 0:04:15.080
<v Speaker 3>scientists go and find those pre-trained models and access

0:04:15.160 --> 0:04:19.000
<v Speaker 3>them and use them and work with them, and increasingly

0:04:19.040 --> 0:04:23.360
<v Speaker 3>it's the place where developers are coming to turn all

0:04:23.400 --> 0:04:28.480
<v Speaker 3>of these AI models and data sets into their own applications,

0:04:28.520 --> 0:04:29.680
<v Speaker 3>their own features.

0:04:30.360 --> 0:04:33.039
<v Speaker 4>So it's like the I don't know, the Facebook group

0:04:33.120 --> 0:04:36.000
<v Speaker 4>or the Reddit or the Twitter for people who are

0:04:36.040 --> 0:04:40.280
<v Speaker 4>interested in particularly generative language AI, or all kinds of

0:04:40.360 --> 0:04:41.920
<v Speaker 4>artificial intelligence.

0:04:42.160 --> 0:04:46.400
<v Speaker 3>All kinds of AI really, and of course generative AI is

0:04:46.600 --> 0:04:51.320
<v Speaker 3>this new wave that has caught the world by storm.

0:04:51.680 --> 0:04:55.360
<v Speaker 3>But on Hugging Face you can find any kind of model,

0:04:55.680 --> 0:04:59.560
<v Speaker 3>the new sort of transformers models to do anything from

0:05:00.000 --> 0:05:04.080
<v Speaker 3>translation or if you wanted to transcribe what I'm saying

0:05:04.120 --> 0:05:07.560
<v Speaker 3>into text, then you would use a transformer model. If

0:05:07.560 --> 0:05:10.880
<v Speaker 3>you wanted to then take that text and make a summary,

0:05:11.320 --> 0:05:15.000
<v Speaker 3>that would be another transformer model. If you wanted to

0:05:15.360 --> 0:05:19.400
<v Speaker 3>create a nice little thumbnail for this podcast by typing

0:05:19.440 --> 0:05:23.280
<v Speaker 3>a sentence, that would be another type of model. So

0:05:23.360 --> 0:05:26.480
<v Speaker 3>all these models you can find. There's actually three hundred

0:05:26.640 --> 0:05:31.119
<v Speaker 3>thousand that are free and publicly accessible. You can find

0:05:31.160 --> 0:05:34.840
<v Speaker 3>them on our website at huggingface dot co and use

0:05:34.920 --> 0:05:37.520
<v Speaker 3>them using our open source libraries.

0:05:38.360 --> 0:05:41.160
<v Speaker 4>And so this is fascinating. So there are

0:05:41.160 --> 0:05:44.520
<v Speaker 4>three hundred thousand models. Now when you say model, I'm

0:05:44.560 --> 0:05:46.400
<v Speaker 4>thinking in my head, oh, it's kind of like a

0:05:47.360 --> 0:05:50.080
<v Speaker 4>computer program. There are three hundred thousand computer programs. Is

0:05:50.680 --> 0:05:52.359
<v Speaker 4>that roughly right, or is it not?

0:05:52.440 --> 0:05:57.839
<v Speaker 3>Really, it's a general idea. A model is a giant

0:05:59.680 --> 0:06:05.120
<v Speaker 3>set of numbers that are working together to sift through

0:06:05.760 --> 0:06:08.960
<v Speaker 3>some inputs that you're going to give it. So think

0:06:09.000 --> 0:06:13.480
<v Speaker 3>of it as a big black box filled with numbers,

0:06:14.440 --> 0:06:19.240
<v Speaker 3>and you give it as an input, maybe some text,

0:06:19.960 --> 0:06:23.880
<v Speaker 3>maybe a prompt, so you're asking, you're giving an instruction

0:06:24.120 --> 0:06:26.719
<v Speaker 3>to the model, or maybe you give it an image

0:06:26.800 --> 0:06:31.240
<v Speaker 3>as an input, and then it will sift through that

0:06:31.400 --> 0:06:35.400
<v Speaker 3>information thanks to all of these numbers, which we call

0:06:35.440 --> 0:06:39.880
<v Speaker 3>in the field parameters, and it will produce an output.

0:06:40.480 --> 0:06:43.039
<v Speaker 3>So when I told you, hey, we can transcribe this

0:06:43.279 --> 0:06:47.200
<v Speaker 3>conversation into text, the input would have been the conversation

0:06:47.800 --> 0:06:50.440
<v Speaker 3>in an audio file, and then the output would have

0:06:50.480 --> 0:06:53.479
<v Speaker 3>been the text of the transcription. If you want to

0:06:53.560 --> 0:06:57.599
<v Speaker 3>create a thumbnail for this podcast episode, then the input

0:06:57.640 --> 0:07:00.520
<v Speaker 3>would be what we call the prompt, which is really

0:07:00.520 --> 0:07:05.040
<v Speaker 3>a text description like a Frenchman in San Francisco talking

0:07:05.080 --> 0:07:10.720
<v Speaker 3>about machine learning, and the output would be a completely original image.

0:07:11.280 --> 0:07:14.600
<v Speaker 3>So that's how I think about what an AI model is,

0:07:15.080 --> 0:07:19.200
<v Speaker 3>and I think what we're starting to realize is that

0:07:20.080 --> 0:07:24.280
<v Speaker 3>this is becoming the new way of building technology in

0:07:24.320 --> 0:07:28.320
<v Speaker 3>the world. It has been for the field dealing with

0:07:28.440 --> 0:07:32.400
<v Speaker 3>understanding and generating text for quite some time, but now it's

0:07:32.520 --> 0:07:36.440
<v Speaker 3>sort of moving across every field of technology. We have

0:07:36.840 --> 0:07:40.960
<v Speaker 3>models to create images, as I say, but also to

0:07:41.120 --> 0:07:46.600
<v Speaker 3>generate new proteins to make predictions on numerical data. So

0:07:46.640 --> 0:07:51.160
<v Speaker 3>every kind of field of machine learning is now using

0:07:52.600 --> 0:07:56.480
<v Speaker 3>this new type of model. But what's interesting is that

0:07:57.080 --> 0:08:00.720
<v Speaker 3>if you're, say a product manager at a company, and

0:08:00.760 --> 0:08:03.840
<v Speaker 3>you say, hey, I want to build a feature that

0:08:03.960 --> 0:08:07.320
<v Speaker 3>does this. A few years ago, the approach would have

0:08:07.360 --> 0:08:11.360
<v Speaker 3>been to ask a software developer to write a thousand

0:08:11.400 --> 0:08:14.840
<v Speaker 3>lines of code in order to build a prototype. And

0:08:14.920 --> 0:08:18.360
<v Speaker 3>the new way of doing things today is to go

0:08:18.520 --> 0:08:23.040
<v Speaker 3>look for an off-the-shelf pre-trained model that

0:08:23.080 --> 0:08:27.200
<v Speaker 3>does a pretty good job at solving exactly that problem,

0:08:27.320 --> 0:08:30.400
<v Speaker 3>so you can create a prototype of that feature fast.

0:08:30.440 --> 0:08:33.000
<v Speaker 3>So it's a new approach of building tech.

0:08:33.200 --> 0:08:36.320
<v Speaker 4>I'm not a programmer, but I'm aware that there was

0:08:36.520 --> 0:08:39.080
<v Speaker 4>this idea of open source code, and now we have

0:08:39.160 --> 0:08:42.120
<v Speaker 4>open source models. So what does it mean for something

0:08:42.120 --> 0:08:43.040
<v Speaker 4>to be open source?

0:08:43.640 --> 0:08:49.400
<v Speaker 3>Open source AI actually means a lot of different specific things.

0:08:50.080 --> 0:08:54.280
<v Speaker 3>It's the open source implementation of the model. So if

0:08:54.320 --> 0:08:58.600
<v Speaker 3>you use the Hugging Face Transformers library to use a model,

0:08:58.640 --> 0:09:03.000
<v Speaker 3>you're using an open source code library to use that model.

0:09:03.080 --> 0:09:06.320
<v Speaker 4>Just to end up on the transformers. These are these

0:09:06.400 --> 0:09:09.640
<v Speaker 4>kind of ways of turning a picture of a dog

0:09:09.760 --> 0:09:12.440
<v Speaker 4>into a text output that says, hey, this is a

0:09:12.440 --> 0:09:15.079
<v Speaker 4>picture of a dog, or this is a French text

0:09:15.080 --> 0:09:17.920
<v Speaker 4>and with the transformers helping you turn it into English text,

0:09:18.000 --> 0:09:19.880
<v Speaker 4>or it's doing all of these things that you've been describing.

0:09:19.960 --> 0:09:23.920
<v Speaker 4>That's the transformer is the kind of the engine at

0:09:23.920 --> 0:09:24.760
<v Speaker 4>the heart of that.

0:09:25.559 --> 0:09:29.960
<v Speaker 3>Yes, exactly. And we call them transformers because they correspond

0:09:30.000 --> 0:09:33.920
<v Speaker 3>to this new way of building machine learning models that

0:09:34.080 --> 0:09:38.800
<v Speaker 3>was introduced by Google actually with a very important paper

0:09:39.120 --> 0:09:41.920
<v Speaker 3>called Attention Is All You Need, and that was published

0:09:41.920 --> 0:09:46.440
<v Speaker 3>in twenty seventeen by researchers out of Google DeepMind.

0:09:47.400 --> 0:09:50.680
<v Speaker 4>Well, that's just six years, so new.

0:09:51.960 --> 0:09:55.240
<v Speaker 3>It is very new, and ever since the pace of

0:09:55.480 --> 0:10:00.920
<v Speaker 3>innovation of like new model architectures has really accelerated.

0:10:01.240 --> 0:10:06.000
<v Speaker 3>But it really started from this inflection point that came

0:10:06.120 --> 0:10:10.400
<v Speaker 3>from this paper and its implementation in what is now

0:10:10.440 --> 0:10:16.240
<v Speaker 3>called Transformer models, the transformer that has conquered every area

0:10:16.360 --> 0:10:18.080
<v Speaker 3>of machine learning since.

0:10:18.280 --> 0:10:21.840
<v Speaker 4>Okay, so to sum up. So you've got this library

0:10:21.840 --> 0:10:26.120
<v Speaker 4>of Transformer models and that's open source, and that means

0:10:26.280 --> 0:10:28.480
<v Speaker 4>what? Anyone can use them for free, or

0:10:29.240 --> 0:10:31.320
<v Speaker 4>that anybody can implement them for free. What does it mean?

0:10:32.840 --> 0:10:35.800
<v Speaker 3>So again, there's lots that go into it, but the

0:10:35.840 --> 0:10:40.240
<v Speaker 3>most important thing is for the model itself to be

0:10:40.480 --> 0:10:44.600
<v Speaker 3>available so that a data scientist or an engineer can

0:10:45.000 --> 0:10:49.400
<v Speaker 3>download them and use them. And also there are a

0:10:49.400 --> 0:10:54.240
<v Speaker 3>lot of considerations about how you make them accessible, and

0:10:54.280 --> 0:10:58.240
<v Speaker 3>a very important one is whether or not you give

0:10:58.480 --> 0:11:03.520
<v Speaker 3>access to the training data, all the information that went

0:11:03.679 --> 0:11:07.920
<v Speaker 3>into training that model and teaching it to do what

0:11:08.720 --> 0:11:09.640
<v Speaker 3>it's trained to do.

0:11:09.800 --> 0:11:12.800
<v Speaker 4>So I might have fed millions of words into a

0:11:12.920 --> 0:11:16.040
<v Speaker 4>language transformer, or I might have fed millions

0:11:16.040 --> 0:11:18.640
<v Speaker 4>of photographs into a picture transformer.

0:11:18.720 --> 0:11:22.160
<v Speaker 3>Yes, and now it's trillions, and the

0:11:22.520 --> 0:11:26.160
<v Speaker 3>accessibility of that training data is very very important.

0:11:27.160 --> 0:11:32.960
<v Speaker 4>What's the relationship between the Hugging Face libraries and GitHub, which,

0:11:34.080 --> 0:11:38.360
<v Speaker 4>if I understand GitHub correctly, is the repository of

0:11:38.400 --> 0:11:42.360
<v Speaker 4>open source code lots and lots of lines of code

0:11:42.360 --> 0:11:47.280
<v Speaker 4>and routines and programs that are shared and updated and tracked,

0:11:47.320 --> 0:11:50.480
<v Speaker 4>and they're all available on GitHub, which sounds similar to

0:11:50.520 --> 0:11:52.959
<v Speaker 4>what you're doing with Hugging Face for AI. So what

0:11:52.960 --> 0:11:55.600
<v Speaker 4>is the interaction or the relationship there?

0:11:56.200 --> 0:11:58.640
<v Speaker 3>Yeah, I think you nailed it on the head there.

0:11:58.679 --> 0:12:02.839
<v Speaker 3>So Hugging Face is to AI what GitHub is to code, right?

0:12:02.840 --> 0:12:08.959
<v Speaker 3>It's this central platform where AI builders can go find

0:12:09.440 --> 0:12:14.720
<v Speaker 3>and collaborate around AI artifacts, which are models and data sets.

0:12:14.760 --> 0:12:18.719
<v Speaker 3>So it's quite different than software, but we play this

0:12:18.840 --> 0:12:23.079
<v Speaker 3>central role in the community to share and collaborate and

0:12:24.080 --> 0:12:28.880
<v Speaker 3>access all of those artifacts for AI, like GitHub offers

0:12:28.880 --> 0:12:29.839
<v Speaker 3>for code.

0:12:30.679 --> 0:12:33.600
<v Speaker 4>And that community must be incredibly important. I mean, the

0:12:33.640 --> 0:12:36.240
<v Speaker 4>open source is nothing if you don't have a community

0:12:36.280 --> 0:12:38.640
<v Speaker 4>of people working on it. So how have you been

0:12:38.679 --> 0:12:41.800
<v Speaker 4>able to foster and nurture that community.

0:12:42.400 --> 0:12:45.760
<v Speaker 3>Well, I think it goes to the origins of the

0:12:45.840 --> 0:12:49.960
<v Speaker 3>transformer model and Hugging Face's role in that. So

0:12:50.600 --> 0:12:55.160
<v Speaker 3>when the first sort of open model came out, it

0:12:55.280 --> 0:12:58.440
<v Speaker 3>was called BERT and it came out of Google. The

0:12:58.480 --> 0:13:02.720
<v Speaker 3>only way you could access it was to use

0:13:02.920 --> 0:13:07.360
<v Speaker 3>a tool called TensorFlow. But it happened that most of

0:13:07.400 --> 0:13:12.840
<v Speaker 3>the AI community was using a different tool called PyTorch,

0:13:13.960 --> 0:13:18.920
<v Speaker 3>and something that Hugging Face did is to make that

0:13:19.000 --> 0:13:25.480
<v Speaker 3>new model BERT accessible to all PyTorch users, and they

0:13:25.480 --> 0:13:28.680
<v Speaker 3>did it in open source. It was a project called

0:13:29.200 --> 0:13:32.720
<v Speaker 3>BERT Pretrained PyTorch, or PyTorch Pretrained BERT.

0:13:33.240 --> 0:13:35.360
<v Speaker 4>So this is like being able to play my Zelda

0:13:35.400 --> 0:13:39.440
<v Speaker 4>game on an Xbox or a PlayStation, right or am

0:13:39.480 --> 0:13:41.120
<v Speaker 4>I not really understanding what's going on?

0:13:41.559 --> 0:13:43.920
<v Speaker 3>No, That's exactly what it is. And the thing is

0:13:44.120 --> 0:13:48.080
<v Speaker 3>everybody was using the Game Boy, and so it became

0:13:48.440 --> 0:13:53.200
<v Speaker 3>very popular, and from there the community sort of

0:13:53.280 --> 0:13:56.839
<v Speaker 3>gathered to make all the other models that were then

0:13:56.960 --> 0:14:01.360
<v Speaker 3>published by AI researchers available with that library, which was

0:14:01.440 --> 0:14:07.000
<v Speaker 3>quickly renamed from BERT Pretrained PyTorch into Transformers to welcome

0:14:07.120 --> 0:14:12.280
<v Speaker 3>like all of these different new models, and today that

0:14:12.440 --> 0:14:17.440
<v Speaker 3>open source library, Transformers, is what all AI builders are

0:14:17.559 --> 0:14:20.880
<v Speaker 3>using when they want to access those models, see how

0:14:20.920 --> 0:14:22.400
<v Speaker 3>they work, and build upon them.

0:14:23.720 --> 0:14:26.880
<v Speaker 4>What's striking about this field is that it's changing so fast,

0:14:26.920 --> 0:14:30.720
<v Speaker 4>it's improving so quickly. So how do open source models

0:14:31.440 --> 0:14:35.320
<v Speaker 4>keep up with that? How do they get iterated and improved?

0:14:35.440 --> 0:14:38.400
<v Speaker 3>Actually? It's not so much that open source is keeping

0:14:38.480 --> 0:14:41.440
<v Speaker 3>up with it. It's actually open source that is driving

0:14:42.160 --> 0:14:45.600
<v Speaker 3>that is driving this piece of change. And that's because

0:14:46.320 --> 0:14:51.680
<v Speaker 3>with open source and open research, data scientists and researchers can

0:14:51.800 --> 0:14:55.480
<v Speaker 3>build upon each other's work, they can reproduce each other's work,

0:14:55.760 --> 0:14:59.760
<v Speaker 3>they can access each other's work using our open source library,

0:15:00.000 --> 0:15:02.320
<v Speaker 3>et cetera. So in a sense, it's not really that

0:15:02.720 --> 0:15:07.320
<v Speaker 3>open source AI is a new idea. It's rather the opposite.

0:15:07.480 --> 0:15:11.600
<v Speaker 3>There's been a blip of time in which closed source

0:15:11.840 --> 0:15:15.560
<v Speaker 3>AI seemed to be the dominant way, but it's really

0:15:16.120 --> 0:15:19.840
<v Speaker 3>a blip. In fact, you know, none of the incredible

0:15:19.880 --> 0:15:24.480
<v Speaker 3>advances that we marvel at today would be possible without

0:15:24.680 --> 0:15:27.680
<v Speaker 3>open source. We're standing upon the shoulders of fifty years

0:15:27.680 --> 0:15:32.120
<v Speaker 3>of research and open source software. So I think that

0:15:32.120 --> 0:15:35.000
<v Speaker 3>that's really important. If it wasn't for that, we'd probably

0:15:35.000 --> 0:15:39.880
<v Speaker 3>be fifty years away from having these amazing experiences like

0:15:40.040 --> 0:15:45.840
<v Speaker 3>ChatGPT or Stable Diffusion, et cetera. So it's really open

0:15:45.880 --> 0:15:50.240
<v Speaker 3>source that is fueling this pace of change, all these

0:15:50.280 --> 0:15:53.800
<v Speaker 3>new models, all these new capabilities. To give you an example,

0:15:54.120 --> 0:15:58.640
<v Speaker 3>so Meta released the LLaMA large language model just a

0:15:58.680 --> 0:16:02.960
<v Speaker 3>few months ago, and ever since, there's been this Cambrian

0:16:03.120 --> 0:16:07.520
<v Speaker 3>explosion of variations and improvements upon the original models, and

0:16:07.560 --> 0:16:10.600
<v Speaker 3>today there are over a thousand of them that we

0:16:11.160 --> 0:16:16.560
<v Speaker 3>host and track and evaluate. So yeah, open source is

0:16:16.600 --> 0:16:20.280
<v Speaker 3>really the gas and the engine for that.

0:16:21.560 --> 0:16:24.400
<v Speaker 2>Jeff just made it clear that it is open source,

0:16:24.640 --> 0:16:28.640
<v Speaker 2>not closed that sets the pace for AI innovation. If

0:16:28.680 --> 0:16:33.240
<v Speaker 2>that's true, then forward thinking businesses shouldn't shy from leveraging

0:16:33.320 --> 0:16:37.680
<v Speaker 2>open source AI to solve their own proprietary challenges. But

0:16:37.880 --> 0:16:42.800
<v Speaker 2>how? Businesses can face serious obstacles when trying to adopt

0:16:43.040 --> 0:16:47.600
<v Speaker 2>open source technologies, like complying with government regulation or making

0:16:47.640 --> 0:16:51.880
<v Speaker 2>sure their customers' data stays protected. In the next part

0:16:51.920 --> 0:16:56.200
<v Speaker 2>of their conversation, Jeff and Tim discuss how IBM's collaboration

0:16:56.360 --> 0:17:00.520
<v Speaker 2>with Hugging Face empowers businesses to tap into the open

0:17:00.560 --> 0:17:04.879
<v Speaker 2>source AI community and how the watsonx platform can enable

0:17:04.920 --> 0:17:08.720
<v Speaker 2>them to customize those AI models to their needs.

0:17:09.400 --> 0:17:11.920
<v Speaker 4>Just want to ask about the partnership between Hugging Face

0:17:11.960 --> 0:17:14.720
<v Speaker 4>and IBM. How did that come about?

0:17:16.680 --> 0:17:23.280
<v Speaker 3>Well, it came through a conversation, a conversation between our CEO,

0:17:24.080 --> 0:17:29.320
<v Speaker 3>Clement Delangue, and Bill Higgins at IBM, who's really really

0:17:29.400 --> 0:17:34.280
<v Speaker 3>close to all the amazing research work and open source

0:17:34.400 --> 0:17:39.399
<v Speaker 3>work that's happening at IBM, and that conversation sort of

0:17:39.680 --> 0:17:44.240
<v Speaker 3>sparked the evidence that we needed to do something together.

0:17:44.840 --> 0:17:48.840
<v Speaker 3>We share a lot of values in terms of the

0:17:48.880 --> 0:17:53.600
<v Speaker 3>importance of open source, which is fundamental to us, with

0:17:54.000 --> 0:17:58.800
<v Speaker 3>the importance of doing things in an ethics first way

0:17:58.920 --> 0:18:04.040
<v Speaker 3>to enable the community to incorporate ethical considerations in how

0:18:04.520 --> 0:18:09.760
<v Speaker 3>they're building AI. And we sort of have a different

0:18:10.040 --> 0:18:14.119
<v Speaker 3>audience to start with, which is all the AI builders

0:18:14.240 --> 0:18:18.840
<v Speaker 3>use Hugging Face today to access all the models we

0:18:18.960 --> 0:18:22.879
<v Speaker 3>talked about, to use them using our open source and

0:18:22.920 --> 0:18:27.320
<v Speaker 3>build with them. And IBM has this incredible history of

0:18:27.440 --> 0:18:32.920
<v Speaker 3>working with enterprise companies and enabling them to make use

0:18:32.960 --> 0:18:37.000
<v Speaker 3>of that technology in a way that's compliant with everything

0:18:37.040 --> 0:18:40.800
<v Speaker 3>that an enterprise requires, and so being able to marry

0:18:40.840 --> 0:18:45.000
<v Speaker 3>these two things together is an amazing opportunity. And now

0:18:45.040 --> 0:18:49.280
<v Speaker 3>we can enable the largest corporations that have sort of

0:18:49.520 --> 0:18:54.920
<v Speaker 3>complex requirements in order to deploy machine learning systems and

0:18:55.720 --> 0:18:59.080
<v Speaker 3>give them an easy experience to take advantage of all

0:18:59.119 --> 0:19:01.600
<v Speaker 3>the latest and greatest that AI has to offer

0:19:02.119 --> 0:19:02.920
<v Speaker 3>through our platform.

0:19:04.480 --> 0:19:08.040
<v Speaker 4>Let's talk about this idea of a single model or

0:19:08.080 --> 0:19:11.600
<v Speaker 4>a variety of models, because what I've been hearing you say.

0:19:12.160 --> 0:19:14.000
<v Speaker 4>You've been saying, oh, there are lots of models, there

0:19:14.040 --> 0:19:18.119
<v Speaker 4>are hundreds of thousands of models available on Hugging Face.

0:19:18.280 --> 0:19:21.640
<v Speaker 4>But you've also said there's this single thing, the transformer,

0:19:22.280 --> 0:19:26.720
<v Speaker 4>and they're all transformers. So if they're all basically the

0:19:26.760 --> 0:19:31.480
<v Speaker 4>same thing, why can't you just build one super clever

0:19:31.560 --> 0:19:32.640
<v Speaker 4>model that can do everything.

0:19:34.760 --> 0:19:39.679
<v Speaker 3>That's a really interesting idea and very much a new idea.

0:19:40.520 --> 0:19:44.400
<v Speaker 3>The reason we have over a million repositories three hundred

0:19:44.480 --> 0:19:48.119
<v Speaker 3>thousand free and accessible models on the Hugging Face platform

0:19:48.560 --> 0:19:52.320
<v Speaker 3>is that models are typically trained to do one thing,

0:19:52.680 --> 0:19:55.920
<v Speaker 3>and they're typically trained to do one thing with specific

0:19:55.960 --> 0:20:02.439
<v Speaker 3>types of data. And what became new and evident in

0:20:02.480 --> 0:20:04.920
<v Speaker 3>the research that came out over the last couple of

0:20:05.000 --> 0:20:09.120
<v Speaker 3>years is that if you train a big enough model

0:20:09.600 --> 0:20:14.680
<v Speaker 3>with enough data, then those models start to have sort

0:20:14.680 --> 0:20:18.720
<v Speaker 3>of general capabilities. You can ask them to do different things.

0:20:19.000 --> 0:20:22.480
<v Speaker 3>You can even train them to respond to instructions. So

0:20:22.600 --> 0:20:26.840
<v Speaker 3>with the same model, you can say, hey, summarize this paragraph,

0:20:27.240 --> 0:20:30.960
<v Speaker 3>translate this into English, start a conversation in French, and

0:20:30.960 --> 0:20:34.560
<v Speaker 3>pivot to German. And so these are general sort of

0:20:34.680 --> 0:20:42.000
<v Speaker 3>language capabilities. And I think when ChatGPT came online and

0:20:42.320 --> 0:20:47.000
<v Speaker 3>the world sort of discovered these new capabilities, there was,

0:20:47.560 --> 0:20:50.480
<v Speaker 3>at least for a short period, this sort of idea,

0:20:50.600 --> 0:20:54.480
<v Speaker 3>this sort of myth that the endgame of all this

0:20:55.440 --> 0:20:59.199
<v Speaker 3>is maybe one or a handful of models that are

0:20:59.400 --> 0:21:03.640
<v Speaker 3>so much better than anything else that exists, that they

0:21:03.640 --> 0:21:06.280
<v Speaker 3>can do anything that we can ask them to do,

0:21:07.080 --> 0:21:10.560
<v Speaker 3>and that's the only model that we will need. And I,

0:21:10.800 --> 0:21:15.080
<v Speaker 3>for one, think it is a myth. I don't think

0:21:15.119 --> 0:21:19.200
<v Speaker 3>it is practical for a variety of reasons. Say you're

0:21:19.600 --> 0:21:23.760
<v Speaker 3>writing an email and you have like this great suggestion

0:21:23.920 --> 0:21:28.199
<v Speaker 3>of text to sort of complete your sentence, Well, that's AI.

0:21:28.640 --> 0:21:31.159
<v Speaker 3>That's a large language model, that's a transformer model that

0:21:31.200 --> 0:21:33.840
<v Speaker 3>does that. So there are a ton of existing use

0:21:33.880 --> 0:21:37.520
<v Speaker 3>cases like this, and these use cases are powered by

0:21:38.320 --> 0:21:41.280
<v Speaker 3>specific models that have been trained to do one thing

0:21:41.400 --> 0:21:44.479
<v Speaker 3>well and to do it fast. If you wanted to

0:21:44.600 --> 0:21:51.200
<v Speaker 3>apply this sort of all-knowing, powerful oracle type of model,

0:21:51.600 --> 0:21:55.639
<v Speaker 3>you would not be able to serve millions of customers

0:21:55.680 --> 0:21:58.359
<v Speaker 3>through a search engine. You will not be able to

0:22:00.080 --> 0:22:04.119
<v Speaker 3>complete people's sentences because the amount of money that you

0:22:04.160 --> 0:22:07.400
<v Speaker 3>would need, the number of computers that you would need

0:22:07.640 --> 0:22:13.240
<v Speaker 3>to run such a service just exceeds what is available

0:22:13.359 --> 0:22:18.760
<v Speaker 3>on the planet. So one reason for which it's not

0:22:18.880 --> 0:22:24.359
<v Speaker 3>a practical scenario is that it's just very expensive to

0:22:24.600 --> 0:22:27.440
<v Speaker 3>run those very very large models.

0:22:27.760 --> 0:22:29.920
<v Speaker 4>What I'm hearing is it's like, look, if you want

0:22:29.920 --> 0:22:33.679
<v Speaker 4>to screw in a screw you need a screwdriver. You

0:22:33.720 --> 0:22:37.720
<v Speaker 4>don't want an entire tool shed full of tools if

0:22:37.800 --> 0:22:39.960
<v Speaker 4>the task is to screw in a screw, and sure

0:22:40.040 --> 0:22:43.240
<v Speaker 4>you could bring the tool shed that has all the tools.

0:22:43.280 --> 0:22:47.320
<v Speaker 4>There's a screwdriver there, but it's not necessary. It's incredibly expensive,

0:22:47.320 --> 0:22:52.119
<v Speaker 4>it's incredibly cumbersome, and that cost exists even though maybe

0:22:52.200 --> 0:22:54.879
<v Speaker 4>it's the user who's just typing into a

0:22:54.920 --> 0:22:57.680
<v Speaker 4>prompt box. The user may not see it, but it's

0:22:57.680 --> 0:22:58.720
<v Speaker 4>still very real.

0:23:00.040 --> 0:23:03.480
<v Speaker 3>That's right. And then another one is performance. So taking

0:23:03.520 --> 0:23:06.760
<v Speaker 3>the screwdriver example, so and by the way, like we're

0:23:06.800 --> 0:23:09.560
<v Speaker 3>not quite there at this moment where we have this

0:23:09.720 --> 0:23:13.240
<v Speaker 3>all-knowing, powerful oracle. That is still sort of a

0:23:13.320 --> 0:23:16.919
<v Speaker 3>sci-fi scenario. But we have screwdrivers, but we

0:23:17.040 --> 0:23:21.680
<v Speaker 3>also have the Leatherman, right, the multitool Swiss Army knife.

0:23:21.920 --> 0:23:24.919
<v Speaker 3>And that's sort of the moment that we are in today.

0:23:24.960 --> 0:23:28.600
<v Speaker 3>But now if I'm trying to open up my computer,

0:23:29.200 --> 0:23:32.439
<v Speaker 3>turns out that it requires a specific kind of screw

0:23:32.600 --> 0:23:36.760
<v Speaker 3>like these tiny little Torx screws, and having a Torx

0:23:36.800 --> 0:23:40.520
<v Speaker 3>screwdriver will get me much further than trying to use

0:23:40.760 --> 0:23:43.399
<v Speaker 3>my Leatherman, where maybe I'll get the knife blade

0:23:43.440 --> 0:23:46.520
<v Speaker 3>and it will mess up the screw and maybe eventually

0:23:46.520 --> 0:23:49.160
<v Speaker 3>I'll get to what I need. But my point is

0:23:49.280 --> 0:23:54.399
<v Speaker 3>that if you take a very specifically trained model for

0:23:54.480 --> 0:23:57.960
<v Speaker 3>a particular problem, it will work much better. It will

0:23:57.960 --> 0:24:02.760
<v Speaker 3>give you better results than a very very generalistic, big

0:24:02.840 --> 0:24:05.800
<v Speaker 3>model that can do a lot of things. And so

0:24:05.880 --> 0:24:10.119
<v Speaker 3>for things like search engines or things like translation, for

0:24:10.280 --> 0:24:15.000
<v Speaker 3>things that are very specific, companies are much better off

0:24:15.119 --> 0:24:19.680
<v Speaker 3>using smaller, more efficient models that produce better results.

0:24:19.480 --> 0:24:24.000
<v Speaker 4>That's really interesting. And presumably then being able to know

0:24:24.040 --> 0:24:26.800
<v Speaker 4>which model to use, or being able to know who

0:24:26.840 --> 0:24:30.640
<v Speaker 4>to ask which model to use, becomes a very important capability.

0:24:31.480 --> 0:24:35.000
<v Speaker 3>Yes, and that's what we're trying to make easy through

0:24:35.040 --> 0:24:35.800
<v Speaker 3>our platform.

0:24:37.160 --> 0:24:41.160
<v Speaker 4>So tell me about how this works with IBM's watsonx

0:24:41.160 --> 0:24:44.760
<v Speaker 4>platform? How do you see Hugging Face's customers

0:24:44.800 --> 0:24:45.640
<v Speaker 4>benefiting from that?

0:24:47.560 --> 0:24:51.640
<v Speaker 3>The end goal is to make it really easy for

0:24:51.760 --> 0:24:56.240
<v Speaker 3>watsonx customers to make use of all the

0:24:56.320 --> 0:25:00.600
<v Speaker 3>great models and libraries that we talked about, all the

0:25:00.600 --> 0:25:03.320
<v Speaker 3>three hundred thousand models that are today on Hugging Face,

0:25:04.160 --> 0:25:08.440
<v Speaker 3>and to do this we need to really collaborate deeply

0:25:08.520 --> 0:25:12.080
<v Speaker 3>with the IBM teams that build the watsonx

0:25:12.160 --> 0:25:17.360
<v Speaker 3>platform so that our libraries, our open source, our models

0:25:17.760 --> 0:25:21.480
<v Speaker 3>are well integrated into the platform. If you are a

0:25:21.640 --> 0:25:24.560
<v Speaker 3>single user, if you are a data science student and

0:25:24.600 --> 0:25:26.680
<v Speaker 3>you want to use a model, we make it

0:25:26.720 --> 0:25:29.399
<v Speaker 3>super easy, right. We have our open source library. You

0:25:29.440 --> 0:25:32.159
<v Speaker 3>can download the model on your computer and run with

0:25:32.240 --> 0:25:37.320
<v Speaker 3>it then. But in enterprises there is a vast complexity

0:25:37.560 --> 0:25:42.800
<v Speaker 3>of infrastructure and rules around what people can do and

0:25:43.400 --> 0:25:47.600
<v Speaker 3>how the data can be accessed, and all this complexity

0:25:48.280 --> 0:25:52.879
<v Speaker 3>is sort of solved by the watsonx platform.

0:25:53.560 --> 0:25:57.520
<v Speaker 4>This season of the Smart Talks podcast features what we're

0:25:57.520 --> 0:26:00.399
<v Speaker 4>calling new creators. Do you see yourself as being a

0:26:00.440 --> 0:26:01.440
<v Speaker 4>creative person?

0:26:02.359 --> 0:26:05.960
<v Speaker 3>Ah, I think it's a requirement for the job. I mean,

0:26:05.960 --> 0:26:10.720
<v Speaker 3>we're in such a new and rapidly evolving industry that

0:26:11.000 --> 0:26:15.000
<v Speaker 3>we have to be creative in order to invent the

0:26:15.080 --> 0:26:19.640
<v Speaker 3>business models the use cases of tomorrow. My role within

0:26:19.680 --> 0:26:24.680
<v Speaker 3>the company is really to create the business around all

0:26:24.840 --> 0:26:28.840
<v Speaker 3>the great work of our science and open source and

0:26:28.960 --> 0:26:33.080
<v Speaker 3>product team, and by and large, the business model of

0:26:33.240 --> 0:26:38.200
<v Speaker 3>AI within the whole ecosystem is still something that companies

0:26:38.240 --> 0:26:43.200
<v Speaker 3>are trying to figure out. So creativity is really important

0:26:43.320 --> 0:26:47.120
<v Speaker 3>to really have the conversation with companies, understand what they're

0:26:47.160 --> 0:26:49.240
<v Speaker 3>trying to do, and then build the right kind of solution.

0:26:49.840 --> 0:26:54.080
<v Speaker 3>So that's like where creativity comes into play.

0:26:54.800 --> 0:26:59.000
<v Speaker 4>And one of the things that you've you've been talking

0:26:59.040 --> 0:27:02.520
<v Speaker 4>about is just this growing number of models, this growing

0:27:02.600 --> 0:27:09.040
<v Speaker 4>number of capabilities, this growing number of use cases enormously

0:27:09.080 --> 0:27:15.000
<v Speaker 4>exciting but also I think completely bewildering for most people

0:27:16.000 --> 0:27:20.640
<v Speaker 4>who are trying to navigate their way through this maze

0:27:20.680 --> 0:27:23.960
<v Speaker 4>of possibilities that is growing faster than they can even

0:27:24.200 --> 0:27:28.200
<v Speaker 4>learn about it. So how are you helping people navigate

0:27:28.400 --> 0:27:30.879
<v Speaker 4>and make choices in that environment? And how does the

0:27:30.920 --> 0:27:32.840
<v Speaker 4>partnership with IBM help with that?

0:27:35.640 --> 0:27:39.520
<v Speaker 3>Well, as I said, our vision is that AI, machine

0:27:39.600 --> 0:27:44.639
<v Speaker 3>learning is becoming the default way of creating technology and

0:27:44.680 --> 0:27:48.520
<v Speaker 3>that means like every product, app, service that you're going

0:27:48.600 --> 0:27:52.159
<v Speaker 3>to be using is going to be using AI to

0:27:52.280 --> 0:27:57.280
<v Speaker 3>do whatever it is better, faster. And I guess there

0:27:57.280 --> 0:28:01.400
<v Speaker 3>are two competing visions of the world coming from that.

0:28:01.480 --> 0:28:07.639
<v Speaker 3>There is this vision of the oracle, all powerful model

0:28:07.720 --> 0:28:12.159
<v Speaker 3>that can do everything, and our vision is different. Our

0:28:12.240 --> 0:28:17.640
<v Speaker 3>vision is that every single company will be able to

0:28:17.720 --> 0:28:22.680
<v Speaker 3>create their own models that they own, that they can use,

0:28:22.760 --> 0:28:27.560
<v Speaker 3>that they control, and that's the vision that we're trying

0:28:27.600 --> 0:28:31.440
<v Speaker 3>to bring to life through our open source tools that

0:28:31.760 --> 0:28:35.560
<v Speaker 3>make this work easy. Through our platform where you can

0:28:35.600 --> 0:28:38.640
<v Speaker 3>find all those pre-trained models that are shared by the community.

0:28:39.080 --> 0:28:41.840
<v Speaker 3>So we really want to empower companies to build their

0:28:41.880 --> 0:28:45.640
<v Speaker 3>own stuff, not to outsource all the intelligence to a

0:28:45.720 --> 0:28:51.120
<v Speaker 3>third party. And the watsonx platform from IBM

0:28:51.920 --> 0:28:56.920
<v Speaker 3>gives those tools to enterprise companies, so that you can

0:28:57.600 --> 0:29:02.680
<v Speaker 3>use the open source models Hugging Face offers, then you

0:29:02.760 --> 0:29:07.480
<v Speaker 3>can improve them with your own data without sharing that

0:29:07.600 --> 0:29:10.520
<v Speaker 3>data to a third party, and then you could do

0:29:11.160 --> 0:29:16.680
<v Speaker 3>all of this work in compliance with whatever governance requirements

0:29:17.080 --> 0:29:20.800
<v Speaker 3>that you have for your company. Maybe you're a financial services

0:29:20.800 --> 0:29:24.680
<v Speaker 3>company and you have a specific set of rules, maybe

0:29:25.000 --> 0:29:30.120
<v Speaker 3>you're a healthcare company and you have very strong privacy requirements

0:29:30.320 --> 0:29:35.480
<v Speaker 3>for patients' data. Maybe you're a tech company, and you have

0:29:35.600 --> 0:29:40.560
<v Speaker 3>your customers', your users' personal information, so you need to

0:29:40.560 --> 0:29:43.320
<v Speaker 3>be able to do this work respecting all of that.

0:29:44.360 --> 0:29:46.280
<v Speaker 4>Jeff Boudier, thank you very much.

0:29:46.960 --> 0:29:48.640
<v Speaker 3>Thanks so much to it's fun.

0:29:50.320 --> 0:29:53.280
<v Speaker 2>To create the AI models of the future, we're going

0:29:53.280 --> 0:29:55.800
<v Speaker 2>to need open source. That means there's a place for

0:29:55.960 --> 0:29:58.960
<v Speaker 2>business in the open source community to harness the game

0:29:59.080 --> 0:30:04.600
<v Speaker 2>changing potential of AI innovation. Like Jeff said, businesses face

0:30:04.840 --> 0:30:08.800
<v Speaker 2>unique challenges they need to solve at scale. Without proper

0:30:08.840 --> 0:30:13.000
<v Speaker 2>support systems, tapping into open source AI at enterprise level

0:30:13.320 --> 0:30:16.600
<v Speaker 2>is daunting: finding the right size model for the job,

0:30:16.920 --> 0:30:21.480
<v Speaker 2>fine tuning its purpose, all while addressing governance requirements around

0:30:21.560 --> 0:30:27.520
<v Speaker 2>data privacy and ethics. So for businesses, IBM's collaboration with

0:30:27.640 --> 0:30:31.440
<v Speaker 2>Hugging Face is a mark of progress because it signifies that

0:30:31.600 --> 0:30:36.040
<v Speaker 2>business can tap into open source AI while preserving enterprise

0:30:36.120 --> 0:30:41.280
<v Speaker 2>level integrity. Businesses should embrace the open source community and

0:30:41.360 --> 0:30:45.120
<v Speaker 2>the AI future, much like Hugging Face and its emoji

0:30:45.200 --> 0:30:49.720
<v Speaker 2>namesake suggests. I'm Malcolm Gladwell. This is a paid advertisement

0:30:49.840 --> 0:30:54.479
<v Speaker 2>from IBM. Smart Talks with IBM is produced by Matt Romano,

0:30:54.960 --> 0:30:59.200
<v Speaker 2>David Jha, Nisha Venkat, and Royston Beserve with Jacob Goldstein.

0:31:00.320 --> 0:31:04.240
<v Speaker 2>by Lidia Jean Kott. Our engineers are Jason Gambrel, Sarah

0:31:04.280 --> 0:31:09.720
<v Speaker 2>Bruger and Ben Tolliday. Theme song by Gramoscope. Special thanks

0:31:09.720 --> 0:31:13.400
<v Speaker 2>to Carlei Migliori, Andy Kelly, Kathy Callahan, and the eight

0:31:13.440 --> 0:31:17.440
<v Speaker 2>Bar and IBM teams, as well as the Pushkin marketing team.

0:31:17.640 --> 0:31:20.600
<v Speaker 2>Smart Talks with IBM is a production of Pushkin Industries

0:31:20.960 --> 0:31:25.720
<v Speaker 2>and Ruby Studio at iHeartMedia. To find more Pushkin podcasts,

0:31:25.920 --> 0:31:30.560
<v Speaker 2>listen on the iHeartRadio app, Apple Podcasts, or wherever you

0:31:30.720 --> 0:31:42.360
<v Speaker 2>listen to podcasts.