WEBVTT - Smart Talks with IBM: The power of Granite in business 0:00:00.160 --> 0:00:02.920 Hey everyone, it's Robert and Joe here. Today we've got 0:00:02.920 --> 0:00:05.160 something a little different to share with you. It's a 0:00:05.200 --> 0:00:08.799 new season of the Smart Talks with IBM podcast series. 0:00:09.280 --> 0:00:12.080 This season, on Smart Talks, Malcolm Gladwell and team are 0:00:12.119 --> 0:00:15.320 diving into the transformative world of artificial intelligence with a 0:00:15.360 --> 0:00:18.680 fresh perspective on the concept of open. What does open 0:00:18.760 --> 0:00:21.960 really mean in the context of AI? It can mean 0:00:22.079 --> 0:00:25.680 open source code or open data, but it also encompasses 0:00:25.760 --> 0:00:30.840 fostering an ecosystem of ideas, ensuring diverse perspectives are heard, 0:00:31.200 --> 0:00:33.599 and enabling new levels of transparency. 0:00:33.920 --> 0:00:37.159 Join hosts from your favorite Pushkin podcasts as they explore 0:00:37.159 --> 0:00:41.000 how openness in AI is reshaping industries, driving innovation, and 0:00:41.040 --> 0:00:44.920 redefining what's possible. You'll hear from industry experts and leaders 0:00:44.920 --> 0:00:48.400 about the implications and possibilities of open AI, and of course, 0:00:48.760 --> 0:00:50.960 Malcolm Gladwell will be there to guide you through the 0:00:51.000 --> 0:00:52.760 season with his unique insights. 0:00:53.040 --> 0:00:55.600 Look out for new episodes of Smart Talks every other 0:00:55.680 --> 0:00:59.240 week on the iHeartRadio app, Apple Podcasts, or wherever you 0:00:59.280 --> 0:01:02.560 get your podcasts, and learn more at IBM dot com 0:01:02.600 --> 0:01:13.240 slash Smart Talks. 0:01:10.760 --> 0:01:11.720 Pushkin. 0:01:15.959 --> 0:01:19.200 Hello, hello. Welcome to Smart Talks with IBM, a podcast 0:01:19.200 --> 0:01:25.160 from Pushkin Industries, iHeartRadio and IBM. I'm Malcolm Gladwell.
This season, 0:01:25.400 --> 0:01:28.560 we're diving back into the world of artificial intelligence, but 0:01:28.640 --> 0:01:34.560 with a focus on the powerful concept of open: its possibilities, implications, 0:01:34.600 --> 0:01:38.080 and misconceptions. We'll look at openness from a variety of 0:01:38.120 --> 0:01:41.880 angles and explore how the concept is already reshaping industries, 0:01:42.400 --> 0:01:46.240 ways of doing business, and our very notion of what's possible. 0:01:47.040 --> 0:01:51.360 In today's episode, Jacob Goldstein sat down with Maryam Ashoori, 0:01:51.760 --> 0:01:54.680 the Director of Product Management and Head of Product 0:01:54.920 --> 0:01:59.160 for IBM's watsonx.ai, where she spearheads the 0:01:59.160 --> 0:02:04.960 product strategy and delivery of IBM's watsonx foundation models. She 0:02:05.040 --> 0:02:08.480 is a technologist with more than fifteen years of experience 0:02:08.840 --> 0:02:14.320 developing data-driven technologies. The conversation focused on how enterprises 0:02:14.520 --> 0:02:19.079 can use technology to build and deliver greater transparency in AI 0:02:19.639 --> 0:02:24.320 with Granite. Maryam explained how Granite can be utilized to 0:02:24.360 --> 0:02:29.640 improve efficiency across various domains. She discussed how these models 0:02:29.639 --> 0:02:33.640 are being used in real-world business applications, particularly in 0:02:33.720 --> 0:02:38.080 areas like customer care, where AI can help enable quick, 0:02:38.560 --> 0:02:44.320 accurate responses based on internal company data. Maryam provided a 0:02:44.360 --> 0:02:49.120 fascinating look into how enterprises have moved from mere experimentation 0:02:49.360 --> 0:02:55.120 with generative AI to actual production, navigating challenges such as 0:02:55.160 --> 0:03:00.480 increased latency, cost, and energy consumption.
She highlighted how the 0:03:00.560 --> 0:03:05.119 emerging trend of smaller models customized with proprietary data can 0:03:05.160 --> 0:03:08.840 potentially deliver high performance at a fraction of the cost, 0:03:09.520 --> 0:03:14.760 marking a significant shift in how enterprises leverage AI. Whether 0:03:14.800 --> 0:03:17.560 you're an AI enthusiast or a business leader looking to 0:03:17.639 --> 0:03:22.360 harness the power of artificial intelligence, this episode is packed 0:03:22.880 --> 0:03:26.560 with valuable insights and forward-thinking strategies. 0:03:30.880 --> 0:03:32.840 Let's just start with your background. How did you come 0:03:32.880 --> 0:03:34.040 to work at IBM? 0:03:34.720 --> 0:03:38.200 I joined IBM right after I graduated. I have an 0:03:38.200 --> 0:03:44.280 AI background, and throughout the years, I've held many roles 0:03:44.320 --> 0:03:49.760 in design, engineering, development, research, mostly focused on AI application 0:03:49.920 --> 0:03:54.040 development and design. In my current job, I'm the product 0:03:54.040 --> 0:03:58.520 owner for watsonx.ai, which is the IBM 0:03:58.560 --> 0:04:02.640 platform for enterprise AI. What excites me about this job, 0:04:02.680 --> 0:04:06.480 I would say, is the technology advancements over the last 0:04:06.480 --> 0:04:09.920 eighteen months in the market. We've been witnessing how generative AI 0:04:10.000 --> 0:04:12.680 has been changing the market. The way that I see 0:04:12.720 --> 0:04:16.360 that is gen AI has been perhaps one of the largest 0:04:16.400 --> 0:04:19.960 paradigm shifts when we think about productivity.
The same way 0:04:20.000 --> 0:04:25.160 that the internet and personal computers impacted the productivity of the workforce, 0:04:25.320 --> 0:04:30.200 now we are witnessing another wave of all those opportunities 0:04:30.240 --> 0:04:33.480 that it can unlock, especially for enterprise AI, when it 0:04:33.520 --> 0:04:37.600 comes to enhancing the productivity of the workforce and releasing 0:04:37.680 --> 0:04:42.320 some time that can potentially be put into creating more 0:04:42.440 --> 0:04:46.840 high-value work for the enterprise. So that's the major reason that 0:04:47.000 --> 0:04:50.840 I picked this team: to have an impact on the 0:04:50.880 --> 0:04:54.880 market and the community, but also, of course, using the 0:04:55.520 --> 0:04:58.440 skills that I gained through all these years at IBM 0:04:58.640 --> 0:05:02.599 to help establish IBM as the market leader for 0:05:02.760 --> 0:05:03.520 enterprise AI. 0:05:04.080 --> 0:05:08.159 So you talked about gen AI as this sort of generational, 0:05:08.320 --> 0:05:13.360 transformational technological force, and I'm curious just in terms of 0:05:13.800 --> 0:05:16.280 how it's going to come into the world. Like, how 0:05:16.279 --> 0:05:20.360 do you see market adoption of gen AI sort of evolving 0:05:20.400 --> 0:05:20.880 from here? 0:05:21.680 --> 0:05:25.239 Well, last year was the year of excitement about generative AI. 0:05:25.400 --> 0:05:28.520 Most of the companies were experimenting and exploring with gen AI. 0:05:29.240 --> 0:05:32.800 We see that energy shifted towards how to best monetize 0:05:32.839 --> 0:05:35.719 that technology. Almost half of the market has moved from 0:05:35.960 --> 0:05:41.200 investigation to pilots. Ten percent has moved to production. When 0:05:41.240 --> 0:05:44.880 you're exploring with this technology, you're looking for a wow factor. 0:05:45.160 --> 0:05:48.679 You're looking for an aha moment. That's why very large 0:05:48.680 --> 0:05:53.279 general-purpose models shine.
But as companies move toward production 0:05:53.400 --> 0:05:56.320 and scale, they soon realize the path to success is not 0:05:56.360 --> 0:06:01.440 that straightforward. For example, the larger the model, the larger the 0:06:01.480 --> 0:06:06.039 compute resources it requires. That translates to increased latency, that's 0:06:06.040 --> 0:06:10.440 your response time. That translates to increased cost. That translates 0:06:10.480 --> 0:06:14.039 to increased carbon footprint and energy consumption. So think 0:06:14.040 --> 0:06:17.720 about that. At the scale of enterprise in production, some 0:06:17.800 --> 0:06:19.680 of them can be a showstopper. 0:06:20.040 --> 0:06:20.760 Because of this 0:06:20.920 --> 0:06:25.240 reason, what we actually see emerging in the market is, 0:06:25.520 --> 0:06:31.520 instead of focusing on very large general-purpose models, coming 0:06:31.600 --> 0:06:37.320 back to very small, trustworthy models that they can customize 0:06:37.480 --> 0:06:41.280 on their own proprietary data, that's the data about their customers, 0:06:41.320 --> 0:06:45.080 that's the data about their specific domains, to create something 0:06:45.160 --> 0:06:49.760 differentiated that is much smaller and delivers the performance that 0:06:49.800 --> 0:06:53.120 they want on a target use case for a fraction 0:06:53.200 --> 0:06:53.760 of the cost. 0:06:54.080 --> 0:06:58.000 Uh huh. So let's talk a little bit more specifically 0:06:58.040 --> 0:07:02.440 about what you're working on. Talk about Granite. First of all, 0:07:02.480 --> 0:07:03.799 tell me what is Granite. 0:07:04.400 --> 0:07:09.960 Granite is our industry-leading family of models, flagship IBM models. 0:07:10.680 --> 0:07:14.520 These are the models that we train from scratch. When 0:07:14.600 --> 0:07:18.000 offered through our platform, we offer indemnification and we stand 0:07:18.000 --> 0:07:23.720 behind them today.
It comes in four flavors: language, code, 0:07:24.400 --> 0:07:31.880 time series, and geospatial models. The Granite language series covers English, Spanish, German, 0:07:32.320 --> 0:07:37.120 Portuguese, and Japanese. We have a combination of commercial and 0:07:37.360 --> 0:07:41.320 open-source language models in Granite. For example, we recently 0:07:41.520 --> 0:07:46.680 released the Granite 7B language model, a small, powerful English model. 0:07:47.400 --> 0:07:50.720 On the code front, our models are state of the 0:07:50.840 --> 0:07:55.120 art models ranging from three billion to thirty-four billion parameters. 0:07:55.760 --> 0:08:00.960 These are very powerful models that match or outperform, in 0:08:00.960 --> 0:08:04.800 some cases, the popular open-source models in their weight class. 0:08:04.840 --> 0:08:06.200 So very powerful models. 0:08:06.400 --> 0:08:09.080 So I get the big-picture idea about these models, 0:08:09.120 --> 0:08:10.800 but it would be helpful to just get a sense 0:08:10.840 --> 0:08:12.960 specifically of what they're doing. Like, can you give me 0:08:13.000 --> 0:08:16.640 any specific examples of how these models are being used 0:08:17.440 --> 0:08:19.720 in businesses in the real world right now? 0:08:20.880 --> 0:08:24.119 Well, the top use cases for generative AI are really 0:08:24.240 --> 0:08:31.120 content generation, summarization, information extraction. Perhaps the most popular use 0:08:31.160 --> 0:08:34.840 case that we are seeing in enterprise is content-grounded 0:08:34.920 --> 0:08:39.160 question answering. So using these models as a base 0:08:39.440 --> 0:08:42.320 to connect them to a body of information, let's say 0:08:42.360 --> 0:08:46.680 their policies, their documents, that is internal to the enterprise, 0:08:46.960 --> 0:08:51.160 and get the model to provide answers based on that question.
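The content-grounded question answering described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not IBM's implementation: the `DOCS` store, the `tokenize`, `retrieve`, and `build_prompt` helpers, and the keyword-overlap scoring are all hypothetical names and choices made for the example, and the final call to a language model is deliberately left out.

```python
import re
from typing import List, Tuple

# Hypothetical internal documents; a real deployment would index company content.
DOCS = {
    "returns-policy": "Customers may return a product within 30 days of delivery.",
    "shipping-policy": "Standard shipping takes five business days.",
}


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, k: int = 1) -> List[Tuple[str, str]]:
    """Return the k documents that share the most words with the question."""
    q = tokenize(question)
    ranked = sorted(
        DOCS.items(),
        key=lambda item: len(q & tokenize(item[1])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str) -> str:
    """Ground the question in retrieved passages, labeled so a human can check sources."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in retrieve(question))
    return (
        "Answer using only the sources below and cite them.\n"
        f"{context}\n"
        f"Question: {question}"
    )


question = "How many days do customers have to return a product?"
print(build_prompt(question))
```

The prompt that comes out would then be sent to whichever model is in use. Keeping the source labels in the prompt is what lets a human agent verify where an answer came from before passing it on.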
0:08:51.559 --> 0:08:55.520 One example of that is for customer agents in customer care, 0:08:55.920 --> 0:09:00.520 when a customer is asking a question. Previously, the agent 0:09:00.559 --> 0:09:04.080 that responds to the customer had to answer the question 0:09:04.200 --> 0:09:06.959 and, if they didn't know the answer, escalate it to the product 0:09:07.120 --> 0:09:10.600 specialist, keeping people on hold on the line to 0:09:10.760 --> 0:09:13.720 go figure out the answer for that and then come back. 0:09:13.800 --> 0:09:16.160 You can think of the time it takes to resolve 0:09:16.200 --> 0:09:19.880 an issue. But now with LLMs, we have an opportunity 0:09:19.920 --> 0:09:23.960 to automatically retrieve the information based on the internal documents 0:09:24.000 --> 0:09:27.000 of the company, formulate an answer, show it to the 0:09:27.080 --> 0:09:30.400 human agent, and then if they verify it with the sources 0:09:30.440 --> 0:09:33.160 of where it's coming from, they can just translate it directly 0:09:33.200 --> 0:09:33.880 to the customer. 0:09:34.800 --> 0:09:35.360 This is a 0:09:35.360 --> 0:09:39.200 very simple example of how it's impacting customer care. 0:09:39.679 --> 0:09:44.160 So one big theme of this season is this idea 0:09:44.160 --> 0:09:47.560 of open, and one of the things that's interesting to 0:09:47.679 --> 0:09:51.480 me about the work you're doing is you are using 0:09:51.559 --> 0:09:55.200 not only Granite, this model IBM developed, but you're also 0:09:55.360 --> 0:09:58.960 using third-party models, right, from other places. So tell 0:09:59.000 --> 0:10:01.000 me about that work and how that is sort of 0:10:01.040 --> 0:10:05.120 fitting into your kind of real-world, typically enterprise, gen AI work. 0:10:05.760 --> 0:10:08.600 When it comes to a model strategy, our strategy is 0:10:08.800 --> 0:10:13.160 really focused on two pillars: multi-model and multi-deployment.
It 0:10:13.280 --> 0:10:16.559 means that we don't believe one single model rules all 0:10:16.559 --> 0:10:19.000 the use cases. And I think at this point the 0:10:19.040 --> 0:10:22.520 market has also realized that: enterprises on average today 0:10:22.559 --> 0:10:27.040 are using five to ten different models for different use cases. 0:10:27.200 --> 0:10:28.199 Oh, interesting. 0:10:28.520 --> 0:10:30.800 So in our portfolio, if you look into watsonx.ai 0:10:30.840 --> 0:10:33.640 today, we are offering a large set of 0:10:33.880 --> 0:10:36.760 high-performing, state-of-the-art models coming from open 0:10:36.800 --> 0:10:41.000 source, commercial models that we are bringing through our partners, 0:10:41.320 --> 0:10:45.200 and also IBM-developed models. In addition to all of these, 0:10:45.400 --> 0:10:48.400 we also have an option for bring your own model 0:10:48.720 --> 0:10:51.680 from outside the platform. Let's say you have a custom 0:10:51.760 --> 0:10:54.440 model that you made yourself, you can bring it 0:10:54.480 --> 0:10:59.360 to the platform. We are really helping the customers to navigate 0:10:59.400 --> 0:11:03.040 through a wide range of models and pick the right model 0:11:03.320 --> 0:11:06.960 for their target use case. Throughout that, we've been heavily 0:11:07.000 --> 0:11:10.200 working with our partners, and you know, this is a 0:11:10.240 --> 0:11:13.720 market that is evolving rapidly. We've been at the forefront 0:11:13.720 --> 0:11:15.880 of speed to delivery. One example that I like 0:11:15.960 --> 0:11:21.400 to highlight is recently Meta released Llama 405 billion, 0:11:21.720 --> 0:11:24.240 such a powerful model. On the same day that it 0:11:24.440 --> 0:11:27.400 was released to the market, we made it available in 0:11:27.480 --> 0:11:30.520 our platform to our customers the same day. And not 0:11:30.600 --> 0:11:33.040 only did we deliver it on the same day.
We are 0:11:33.040 --> 0:11:37.520 offering competitive pricing but also flexibility in where to deploy. 0:11:37.640 --> 0:11:40.559 So we are giving an option to enterprises to deploy 0:11:40.640 --> 0:11:44.880 these models on the platform of their choice, either 0:11:44.880 --> 0:11:48.760 multi-cloud (it can be GCP, AWS, Azure, or IBM Cloud) or 0:11:48.800 --> 0:11:54.160 on premises. The same for Mistral AI. Mistral AI recently 0:11:54.320 --> 0:11:57.320 released the model Mistral Large 2. On the same day, 0:11:57.600 --> 0:12:00.600 we delivered that through the platform. That's an example of 0:12:00.640 --> 0:12:04.960 a commercial model. Llama is open source, but Mistral Large 0:12:04.960 --> 0:12:08.000 2 is a commercial model that we made available through 0:12:08.040 --> 0:12:08.800 the platform. 0:12:09.320 --> 0:12:14.920 Great. So I want to talk about enterprise-grade foundation models. 0:12:15.640 --> 0:12:18.520 Just to get into it briefly, what's a foundation model? 0:12:19.000 --> 0:12:22.719 People associate foundation models with large language models, but 0:12:22.840 --> 0:12:26.160 large language models are really a subset of foundation models. 0:12:26.320 --> 0:12:30.240 Large language models are focused on language, but foundation models 0:12:30.280 --> 0:12:34.120 can be code generators, can be focused on the time series 0:12:34.120 --> 0:12:36.720 models we talked about, they can be images, they can 0:12:36.760 --> 0:12:41.200 be geospatial models. So foundation models, as the term 0:12:41.320 --> 0:12:46.439 suggests, are your foundations to create a series of subsequent 0:12:46.640 --> 0:12:51.000 models that can be customized for a downstream use case. 0:12:51.040 --> 0:12:54.439 And that's why they are calling them foundation models. An 0:12:54.480 --> 0:12:56.480 LLM is a good example of that, as a subset 0:12:56.520 --> 0:13:00.240 for language that you can further customize on your specific
0:13:00.480 --> 0:13:04.040 data to get the model to do other work. 0:13:04.080 --> 0:13:07.280 So the core of these foundation models, they are basically 0:13:07.920 --> 0:13:11.680 trained on an absurd amount of data, data sets 0:13:11.880 --> 0:13:15.120 that most of the institutions today are sourcing from 0:13:15.120 --> 0:13:18.080 the internet. So you can imagine what can potentially go 0:13:18.120 --> 0:13:20.880 into those models, and then it comes to the enterprise 0:13:21.000 --> 0:13:25.200 and they start using it. So for us also, when 0:13:25.240 --> 0:13:29.440 we started looking into this in particular, it was triggered by 0:13:29.720 --> 0:13:33.880 customers asking us to provide client protections on these models, 0:13:33.920 --> 0:13:36.440 and we started thinking about, let's look into how the 0:13:36.520 --> 0:13:40.120 models are trained and if we are comfortable offering 0:13:40.200 --> 0:13:43.680 client protections on the models that are available in the market. 0:13:43.800 --> 0:13:45.199 And guess what? For a 0:13:45.200 --> 0:13:49.280 majority of these models there is absolutely no visibility into 0:13:49.360 --> 0:13:52.760 what data went into those models, not much transparency into 0:13:52.840 --> 0:13:56.880 how the model is trained, and the responsibility lies on you 0:13:56.960 --> 0:13:59.240 as the customer who starts using those models. 0:13:59.240 --> 0:14:03.080 So just to be clear, that is presenting, like, potential risk, 0:14:03.200 --> 0:14:06.480 real potential risk to a company that is using these models. 0:14:06.720 --> 0:14:07.120 It is. 0:14:07.240 --> 0:14:10.720 It is a potential risk, in particular for the customers 0:14:10.760 --> 0:14:15.319 in highly regulated industries. So what we did for Granite 0:14:15.880 --> 0:14:19.120 was, when we started training these models from scratch, basically 0:14:19.160 --> 0:14:22.280 we went to the corpus of data that was available 0:14:22.320 --> 0:14:25.440 to us.
So, for example, the very first version of 0:14:25.800 --> 0:14:29.800 Granite was exposed to twenty percent of its data from 0:14:29.880 --> 0:14:33.680 finance and legal because we have a lot of financial 0:14:33.680 --> 0:14:38.120 institutions as our clients. We worked directly with IBM 0:14:38.160 --> 0:14:43.080 Research to identify detectors for harmful information, like hate, abuse, 0:14:43.160 --> 0:14:44.600 and profanity detectors. 0:14:45.160 --> 0:14:47.480 Okay, so we're talking about Granite, we're talking about this 0:14:47.680 --> 0:14:51.000 set of models IBM has developed. Let's talk about using 0:14:51.000 --> 0:14:55.840 Granite on watsonx compared to downloading open-source models. 0:14:55.960 --> 0:14:56.880 Like, how do those differ? 0:14:57.520 --> 0:15:01.160 By using Granite in watsonx, you get two things. 0:15:01.520 --> 0:15:05.280 The first one is the client protection and the indemnification that 0:15:05.320 --> 0:15:07.520 we talked about. You get that if the model is 0:15:07.560 --> 0:15:08.960 consumed through our platform. 0:15:09.440 --> 0:15:10.440 And the second 0:15:10.120 --> 0:15:14.600 one is really the ecosystem of platform capabilities that we 0:15:14.640 --> 0:15:17.760 are offering to help you create value on top of 0:15:17.800 --> 0:15:21.960 those data. So for example, bringing your data to customize 0:15:22.000 --> 0:15:25.720 Granite for your own specific use case. But also one 0:15:25.720 --> 0:15:28.520 thing that I like to highlight in particular is the 0:15:28.560 --> 0:15:31.800 AI governance. So when you get one of these 0:15:31.880 --> 0:15:35.040 pretrained models, you put it in front of your own users. 0:15:35.840 --> 0:15:39.600 Through the input and instructions that the user provides for 0:15:39.760 --> 0:15:44.080 the model, they can nudge the model to potentially create 0:15:44.400 --> 0:15:48.000 undesired behavior and change the behavior of the model.
And 0:15:48.040 --> 0:15:52.120 because of this, it is extremely important to automatically document the 0:15:52.240 --> 0:15:56.400 lineage of who touched the model at what point, so 0:15:56.480 --> 0:15:58.880 if something happens, you can trace it back and see 0:15:58.920 --> 0:16:02.920 where it's coming from. And that's what watsonx.governance 0:16:03.040 --> 0:16:07.160 is offering: automatically documenting the lineage. When you use 0:16:07.200 --> 0:16:10.200 Granite within the platform, you get all of those: you 0:16:10.240 --> 0:16:13.320 can have the end-to-end governance, you can have 0:16:13.640 --> 0:16:17.720 access to all these scalable deployment opportunities that are available 0:16:17.760 --> 0:16:20.560 for you, to allow you to deploy them on the 0:16:20.600 --> 0:16:23.320 platform of your choice that we talked about, either multi- 0:16:23.960 --> 0:16:27.440 cloud or on-prem, and it also helps you to 0:16:27.520 --> 0:16:32.080 have access to a wide range of model customization approaches: prompt tuning, 0:16:32.160 --> 0:16:36.080 fine-tuning, retrieval-augmented generation, agents. There is a series 0:16:36.120 --> 0:16:38.960 of them available to use and apply to your model. 0:16:39.760 --> 0:16:44.240 This distinction between large language models and foundation models is 0:16:44.280 --> 0:16:48.760 eye-opening. Maryam emphasized that foundation models can be tailored 0:16:48.760 --> 0:16:53.760 to specific tasks, but with that versatility comes a significant 0:16:53.840 --> 0:16:58.200 challenge: the lack of transparency in how these models are trained. 0:16:59.040 --> 0:17:05.280 This can pose a real risk, especially in highly regulated industries like finance. Essentially, 0:17:05.359 --> 0:17:10.160 by using Granite and watsonx together, enterprises get powerful and 0:17:10.200 --> 0:17:11.560 customizable tools. 0:17:12.760 --> 0:17:14.960 So let's talk about the future a little bit.
What 0:17:15.040 --> 0:17:17.120 do you think are some of the big developments we're 0:17:17.200 --> 0:17:20.040 likely to see in the realm of AI models? 0:17:20.400 --> 0:17:21.280 Very good question. 0:17:22.040 --> 0:17:26.199 I feel like the generative AI of the past was 0:17:26.400 --> 0:17:30.800 powered by large language models. The generative AI of the 0:17:30.840 --> 0:17:35.439 future is going to reason, plan, act, and reflect. 0:17:35.960 --> 0:17:39.359 Huh, and so I mean in the context of Granite 0:17:39.560 --> 0:17:43.000 in particular, like, what are we likely to see both, 0:17:43.160 --> 0:17:45.040 you know, in the near term and in the sort 0:17:45.080 --> 0:17:46.320 of medium to long term? 0:17:46.920 --> 0:17:51.239 There are multiple elements to implement an agentic workflow that 0:17:51.280 --> 0:17:54.800 I just mentioned. One element of that is the LLM 0:17:54.880 --> 0:17:59.000 itself to be able to do the planning and reasoning 0:17:59.080 --> 0:18:03.439 and acting and doing something that we call tool calling. 0:18:03.840 --> 0:18:07.439 So basically, a series of tools are available to the model. 0:18:08.000 --> 0:18:10.480 You ask the model to call those and 0:18:10.400 --> 0:18:10.880 make a call. 0:18:11.040 --> 0:18:14.199 For example, we can say, hey, Granite, what is the 0:18:14.200 --> 0:18:19.960 weather like where Jacob lives? It's going to connect to a web search API, 0:18:20.520 --> 0:18:23.280 look up your location. Then it's going to connect to 0:18:23.720 --> 0:18:28.080 a weather API, calculate the weather, and come back and formulate 0:18:28.119 --> 0:18:31.680 an answer and respond to that. So during this process, 0:18:32.240 --> 0:18:34.720 it first has to plan the task of how to 0:18:34.760 --> 0:18:37.639 answer that question, look into what are the tools that 0:18:37.680 --> 0:18:40.360 are available to it, and call them, and that's an 0:18:40.359 --> 0:18:43.040 ability of the model to do that.
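The plan, call, and answer flow in the weather example can be sketched roughly as follows. Everything here is a hypothetical stand-in and not how Granite or any specific API works: `lookup_location` and `get_weather` return canned values in place of real web-search and weather services, and `plan` is a hard-coded substitute for the planning step a model would perform.

```python
from typing import Callable, Dict, List


def lookup_location(person: str) -> str:
    """Pretend web-search tool: map a person to a city (canned data)."""
    return {"Jacob": "New York"}.get(person, "unknown")


def get_weather(city: str) -> str:
    """Pretend weather tool: return a canned forecast for a city."""
    return {"New York": "sunny, 22 C"}.get(city, "no data")


# The set of tools made available to the model, keyed by name.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_location": lookup_location,
    "get_weather": get_weather,
}


def plan(question: str) -> List[str]:
    """Stand-in for the model's planning step: a fixed two-tool plan."""
    return ["lookup_location", "get_weather"]


def answer(question: str, person: str) -> str:
    """Run the planned tool calls in order, then formulate a response."""
    value = person
    for tool_name in plan(question):
        value = TOOLS[tool_name](value)  # each tool's output feeds the next call
    return f"The weather where {person} lives is {value}."


print(answer("What is the weather like where Jacob lives?", "Jacob"))
# prints: The weather where Jacob lives is sunny, 22 C.
```

In a real agentic workflow the model itself would choose which tools to call and with what arguments. The fixed plan only makes the order of operations visible.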
What we did 0:18:43.080 --> 0:18:47.159 with Granite was we expanded the Granite capabilities to be 0:18:47.240 --> 0:18:50.880 able to do function calling. So for example, today we 0:18:51.240 --> 0:18:54.320 have an open-source Granite 20B function calling model 0:18:54.400 --> 0:18:57.320 that is available on Hugging Face to try, and 0:18:57.400 --> 0:18:59.960 you can grab the model, and the model has the capability 0:19:00.080 --> 0:19:03.359 to do tool calling. I'm anticipating that in the 0:19:03.400 --> 0:19:07.639 near future the planning and reasoning and acting and reflecting 0:19:07.680 --> 0:19:10.760 capabilities of the large language models are going to continue 0:19:10.800 --> 0:19:11.280 to evolve. 0:19:12.680 --> 0:19:16.720 So thinking now from the point of view of buyers 0:19:16.760 --> 0:19:20.400 and users of AI, really people who are listening from 0:19:20.400 --> 0:19:26.840 that perspective, as people are evaluating AI tools and solutions, 0:19:27.480 --> 0:19:30.359 what is the most important thing they should be thinking about? 0:19:30.440 --> 0:19:32.879 How do you think about kind of that process? 0:19:33.920 --> 0:19:37.240 I think they should always start with the area 0:19:37.320 --> 0:19:41.400 that they think would benefit from AI, and then 0:19:41.720 --> 0:19:45.720 within that area, look into what data they have available 0:19:45.880 --> 0:19:50.080 to potentially fit into those AI service architectures. Do they 0:19:50.080 --> 0:19:53.639 have access to quality data? And the second question that 0:19:53.680 --> 0:19:55.560 they have to ask themselves is, do I have a 0:19:55.600 --> 0:19:59.520 trusted partner that can supply what I need to be 0:19:59.560 --> 0:20:03.320 able to implement AI?
That can be a collection of 0:20:03.359 --> 0:20:05.920 the foundation models that you're going to need, that can 0:20:05.960 --> 0:20:10.000 be a collection of the platform capabilities that the trusted 0:20:10.040 --> 0:20:13.399 partner can offer you to implement such a thing. The 0:20:13.480 --> 0:20:18.600 third thing is, go and evaluate the regulations. Does regulation 0:20:19.000 --> 0:20:23.240 allow you to apply AI to the specific area that 0:20:23.760 --> 0:20:27.159 you are investigating and you're targeting for AI? And the 0:20:27.280 --> 0:20:30.520 last part, but not least, is back to the principles 0:20:30.560 --> 0:20:34.200 of design thinking: what is the problem in that area 0:20:34.680 --> 0:20:39.120 I'm solving with AI, and if AI is even appropriate, 0:20:39.640 --> 0:20:41.639 because we want to make sure that you use AI 0:20:41.800 --> 0:20:44.680 not just because it's a cool, hot toy in the market, 0:20:44.720 --> 0:20:48.600 but you are convinced that it can significantly enhance the 0:20:49.119 --> 0:20:52.960 user experience of your customers in that area. And once 0:20:53.000 --> 0:20:55.520 you have an answer to all these four questions, 0:20:55.600 --> 0:20:58.840 then maybe you have a good candidate to start applying AI. 0:21:00.720 --> 0:21:03.880 What about from the side of project managers who are 0:21:04.040 --> 0:21:07.400 trying to just keep up with how fast things are changing, 0:21:07.440 --> 0:21:11.359 how fast innovation is happening? Like, what advice would you 0:21:11.440 --> 0:21:12.280 give those people? 0:21:12.880 --> 0:21:17.159 My advice would be: focus on agility. This is a 0:21:17.160 --> 0:21:20.879 market that is evolving rapidly, and the winners of the 0:21:20.960 --> 0:21:24.439 market would be those that are able to take advantage 0:21:24.440 --> 0:21:27.680 of the best the market can offer at any point 0:21:27.680 --> 0:21:30.680 of time.
So in order to do that, they need 0:21:30.720 --> 0:21:39.000 to be open to experimentation, continuous learning, and to rapidly 0:21:39.320 --> 0:21:40.880 adopting the new ideas. 0:21:42.080 --> 0:21:45.520 And when you think about the future and gen AI, is 0:21:45.600 --> 0:21:49.480 there a particular, say, problem that you are most excited 0:21:49.520 --> 0:21:50.040 to solve? 0:21:50.720 --> 0:21:53.600 I think that would be productivity. If you look into 0:21:53.640 --> 0:21:57.040 the stats that are out there, there are surveys that 0:21:57.320 --> 0:22:01.000 confirm that sixty to seventy percent of the time of 0:22:01.000 --> 0:22:07.000 our employees can be potentially enhanced through the productivity gains 0:22:07.000 --> 0:22:10.440 of generative AI. For example, I personally myself use my 0:22:10.520 --> 0:22:14.040 product for content generation a lot, so the time that 0:22:14.080 --> 0:22:19.080 it frees up can be potentially put into generating 0:22:19.160 --> 0:22:23.359 higher-value work. And because of that, I'm super excited 0:22:23.480 --> 0:22:27.919 about all the opportunities that it represents for enterprises to 0:22:28.359 --> 0:22:31.480 go and dedicate the time of the employees to higher- 0:22:31.560 --> 0:22:32.640 value items. 0:22:32.880 --> 0:22:37.399 Great. Okay, a couple of Granite-specific questions. So what 0:22:37.520 --> 0:22:40.280 are, like, the key things you want the world to 0:22:40.400 --> 0:22:41.760 know about Granite? 0:22:42.320 --> 0:22:48.320 Granite is open, trusted, and targeted. Two ways to think 0:22:48.359 --> 0:22:52.840 about openness. One, open as in open weights: it's available for 0:22:52.880 --> 0:22:57.120 the public to download. And the second one is open as 0:22:57.200 --> 0:23:02.080 in there are fewer restrictions on how the customers can 0:23:02.200 --> 0:23:05.280 legally use these models for a range of use cases.
0:23:05.400 --> 0:23:08.760 We have released Granite open-source models under the Apache 0:23:08.960 --> 0:23:12.760 license that is enabling a large range of use cases. 0:23:13.240 --> 0:23:16.399 The second one was trusted. We talked about that, like, 0:23:16.520 --> 0:23:20.720 it's rooted in the trustworthy governance process that we established 0:23:20.760 --> 0:23:24.760 around how we are training these models and the responsibility 0:23:24.800 --> 0:23:27.280 that we take for these models. And the third one 0:23:27.320 --> 0:23:31.800 is targeted, targeted for enterprise. We talked about, like, exposing 0:23:31.800 --> 0:23:36.159 Granite to enterprise data, or the domain-specific Granite models, some 0:23:36.240 --> 0:23:39.600 of them like COBOL-to-Java translation, that are targeting to 0:23:39.760 --> 0:23:44.840 solve the specific enterprise needs. And that's Granite: open, trusted, 0:23:44.920 --> 0:23:45.560 and targeted. 0:23:46.280 --> 0:23:48.080 So there are a lot of models out in the 0:23:48.119 --> 0:23:51.240 world all of a sudden, right? It's a crowded market. 0:23:51.840 --> 0:23:54.679 Where does Granite fit in that universe? What is the 0:23:54.720 --> 0:23:55.600 market for Granite? 0:23:56.600 --> 0:24:00.480 We talked about the enterprise market shifting away from very 0:24:00.600 --> 0:24:05.560 large general-purpose models to targeted, smaller models, and 0:24:05.680 --> 0:24:10.400 Granite is a small model that enterprises can pick up 0:24:10.680 --> 0:24:15.600 and customize on their proprietary data to create something that 0:24:15.720 --> 0:24:19.720 is differentiated for a target use case. So Granite is 0:24:19.760 --> 0:24:24.520 well suited as a small, domain-specific, business-ready model, tailored 0:24:24.520 --> 0:24:29.800 for business and trained on enterprise data to solve enterprise questions.
0:24:30.200 --> 0:24:33.080 You mentioned small as one of the things that Granite 0:24:33.200 --> 0:24:38.240 is. Why is that useful in some contexts for enterprise, 0:24:38.320 --> 0:24:39.360 for businesses? 0:24:40.160 --> 0:24:44.640 The larger the model, the larger the compute resources it requires. 0:24:45.320 --> 0:24:50.560 It translates to increased latency, that's your response time. It 0:24:50.600 --> 0:24:57.240 translates to increased cost, and it translates to increased carbon 0:24:57.240 --> 0:25:01.760 footprint and energy consumption. So at the scale of enterprise transactions, 0:25:01.800 --> 0:25:04.160 when you move to production and you want to scale, 0:25:05.000 --> 0:25:10.159 some of these challenges can be multiple times stronger. Like, 0:25:10.280 --> 0:25:13.560 costs can add up, the energy consumption can be a 0:25:13.640 --> 0:25:17.240