WEBVTT - Google's Journey to the Edge of Search

0:00:15.316 --> 0:00:23.156
<v Speaker 1>Pushkin. It feels like searching the web is a problem

0:00:23.236 --> 0:00:26.756
<v Speaker 1>that's been solved. You know, it's ridiculously easy for me

0:00:26.796 --> 0:00:30.036
<v Speaker 1>to say, find out when Alexander Hamilton was shot eighteen

0:00:30.076 --> 0:00:33.796
<v Speaker 1>oh four, or whether they are making Sing three. Not yet,

0:00:34.156 --> 0:00:38.876
<v Speaker 1>but Matthew McConaughey has expressed interest. And yet, and maybe

0:00:38.876 --> 0:00:42.556
<v Speaker 1>this is not surprising, the people who spend their lives

0:00:42.716 --> 0:00:46.876
<v Speaker 1>working on search do not think search is solved. This

0:00:46.956 --> 0:00:50.116
<v Speaker 1>is partly because the people at the frontier of search

0:00:50.596 --> 0:00:53.796
<v Speaker 1>don't just want to search the web. They want to

0:00:53.796 --> 0:00:57.876
<v Speaker 1>answer every question that might cross your mind, even questions

0:00:57.996 --> 0:01:06.796
<v Speaker 1>you can't put into words. I'm Jacob Goldstein, and this

0:01:06.916 --> 0:01:09.636
<v Speaker 1>is What's Your Problem?, the show where entrepreneurs and

0:01:09.676 --> 0:01:11.916
<v Speaker 1>engineers talk about how they're going to change the world

0:01:12.236 --> 0:01:15.156
<v Speaker 1>once they solve a few problems. My guest today is

0:01:15.316 --> 0:01:19.396
<v Speaker 1>Cathy Edwards, vice president and GM of Search at Google.

0:01:20.516 --> 0:01:23.956
<v Speaker 1>Cathy's problem is this, how do you teach computers to

0:01:23.956 --> 0:01:26.676
<v Speaker 1>tell people what they want to know, even if they

0:01:26.716 --> 0:01:30.676
<v Speaker 1>don't know how to ask? Later in the conversation we

0:01:30.756 --> 0:01:33.356
<v Speaker 1>get to the frontier of what Cathy and Google are

0:01:33.396 --> 0:01:36.356
<v Speaker 1>working on now, but we started with the problem they

0:01:36.396 --> 0:01:39.196
<v Speaker 1>have largely solved in the six years Cathy has been

0:01:39.236 --> 0:01:42.916
<v Speaker 1>at Google, the jump from search results based on keywords

0:01:43.316 --> 0:01:46.676
<v Speaker 1>to search results based on natural language, the way people

0:01:46.716 --> 0:01:51.316
<v Speaker 1>talk in everyday life. So one of the problems that

0:01:51.356 --> 0:01:54.556
<v Speaker 1>we were working on around six years ago is this

0:01:54.716 --> 0:01:58.276
<v Speaker 1>problem of natural language queries. So, if you're old enough

0:01:58.316 --> 0:02:01.756
<v Speaker 1>to remember the early days of search on the Internet,

0:02:02.116 --> 0:02:05.836
<v Speaker 1>there was this idea of keywordese, right, that you had

0:02:05.876 --> 0:02:09.236
<v Speaker 1>to sort of take this idea you had in your

0:02:09.236 --> 0:02:11.756
<v Speaker 1>mind of what you what you needed to know and

0:02:11.916 --> 0:02:15.436
<v Speaker 1>figure out what were the exact right keywords to enter

0:02:15.476 --> 0:02:17.756
<v Speaker 1>into the search engine to get your results back right.

0:02:17.796 --> 0:02:20.716
<v Speaker 1>I mean an example as an example, very early in

0:02:20.716 --> 0:02:23.436
<v Speaker 1>the you know, I remember being taught how to query

0:02:23.876 --> 0:02:28.716
<v Speaker 1>back in you know, nineteen ninety nine and being told

0:02:28.796 --> 0:02:32.356
<v Speaker 1>never use the word "and," never use the word "the," because

0:02:32.716 --> 0:02:35.676
<v Speaker 1>the word "and" or the word "the" is in almost

0:02:35.796 --> 0:02:38.516
<v Speaker 1>every document on the Internet. And the way it worked

0:02:38.556 --> 0:02:41.796
<v Speaker 1>back then is you did this word matching, right, and

0:02:41.836 --> 0:02:44.436
<v Speaker 1>so if you had a word that was in your

0:02:44.516 --> 0:02:47.076
<v Speaker 1>query and there was that same word in the document,

0:02:47.476 --> 0:02:50.876
<v Speaker 1>then that document would be returned and potentially scored right.

0:02:51.276 --> 0:02:54.436
<v Speaker 1>And that was very helpful if it was a word

0:02:54.476 --> 0:02:58.716
<v Speaker 1>like genetics, right, which was highly specific and wasn't in

0:02:58.796 --> 0:03:00.996
<v Speaker 1>a heap of documents on the internet. But the word

0:03:00.996 --> 0:03:04.196
<v Speaker 1>"and," not very specific. And you know, in the very

0:03:04.236 --> 0:03:08.276
<v Speaker 1>early days of the Internet, these words weren't even weighted particularly, right?

0:03:08.316 --> 0:03:11.956
<v Speaker 1>The word "and" counted for as much as the word "genetics,"

0:03:11.956 --> 0:03:14.716
<v Speaker 1>and so a document might have a ton of the

0:03:14.796 --> 0:03:17.276
<v Speaker 1>uses of the word "and" and one use of the

0:03:17.276 --> 0:03:20.116
<v Speaker 1>word genetics, and it would score really highly, even though

0:03:20.116 --> 0:03:22.436
<v Speaker 1>it wasn't particularly genetics-focused. Now, by the time you

0:03:22.436 --> 0:03:25.556
<v Speaker 1>get to Google, that part is solved, right? Google by...

0:03:25.716 --> 0:03:29.156
<v Speaker 1>that part is years ago... is weighting "genetics" more

0:03:29.196 --> 0:03:33.556
<v Speaker 1>heavily than it's weighting "the." But what part

0:03:33.636 --> 0:03:36.516
<v Speaker 1>six years ago was not solved? That's solved now or

0:03:36.556 --> 0:03:40.436
<v Speaker 1>solved-ish now. But we were still seeing people do

0:03:40.596 --> 0:03:44.596
<v Speaker 1>these very keyword oriented queries. So they weren't saying things

0:03:44.676 --> 0:03:48.516
<v Speaker 1>like what wine pairs best with chicken? Or if they were,

0:03:48.876 --> 0:03:53.036
<v Speaker 1>they were doing those queries and getting not the best results,

0:03:53.076 --> 0:03:56.116
<v Speaker 1>because not only is there a question of word matching

0:03:56.116 --> 0:03:59.156
<v Speaker 1>and how much each word counts for, there's also the

0:03:59.276 --> 0:04:03.556
<v Speaker 1>question of like does the word what appear at all? Right?

0:04:03.676 --> 0:04:07.276
<v Speaker 1>Like are the answers to that question actually just documents

0:04:07.356 --> 0:04:09.716
<v Speaker 1>that talk about the best wine to pair with chicken

0:04:10.076 --> 0:04:16.716
<v Speaker 1>is you know, chardonnay, right, and not so much talking.

0:04:16.916 --> 0:04:18.596
<v Speaker 1>You know, they didn't include the question, and so we

0:04:18.676 --> 0:04:23.756
<v Speaker 1>sort of saw these like SEO documents that would spring

0:04:23.876 --> 0:04:26.236
<v Speaker 1>up that would have the questions kind of baked in

0:04:26.276 --> 0:04:29.116
<v Speaker 1>in an attempt to match. But those documents weren't necessarily

0:04:29.156 --> 0:04:33.436
<v Speaker 1>the best answers. And so this is when we started

0:04:33.436 --> 0:04:36.356
<v Speaker 1>to go just that next level deeper in our language

0:04:36.476 --> 0:04:41.956
<v Speaker 1>understanding with these AI models, these language models that really

0:04:42.036 --> 0:04:46.196
<v Speaker 1>can start to map out in a concept space, things

0:04:46.316 --> 0:04:49.356
<v Speaker 1>like this sort of translation between how you might ask

0:04:49.396 --> 0:04:51.356
<v Speaker 1>a query and then what that might look like in

0:04:51.396 --> 0:04:53.916
<v Speaker 1>the document. So, to take the example you gave of

0:04:54.196 --> 0:04:57.916
<v Speaker 1>what wine pairs best with chicken, even as late as

0:04:57.956 --> 0:04:59.956
<v Speaker 1>six years ago when you got to Google, you're saying

0:05:00.636 --> 0:05:05.316
<v Speaker 1>Google wasn't great at delivering the best results to a

0:05:05.396 --> 0:05:08.956
<v Speaker 1>query like that because it was written as speech, not

0:05:09.156 --> 0:05:11.396
<v Speaker 1>written as a series of keywords. So six years ago,

0:05:11.516 --> 0:05:16.596
<v Speaker 1>I would have been better off typing chicken wine pairing.

0:05:17.356 --> 0:05:19.116
<v Speaker 1>I would have got better results if I did that,

0:05:19.156 --> 0:05:21.636
<v Speaker 1>you're saying, because that's kind of the way. That's the

0:05:21.636 --> 0:05:23.476
<v Speaker 1>way Google had mapped the web. It was like a

0:05:23.516 --> 0:05:26.076
<v Speaker 1>series of important words and what sites are reliable and

0:05:26.116 --> 0:05:29.876
<v Speaker 1>they it just the technology wasn't there to actually try

0:05:29.876 --> 0:05:35.316
<v Speaker 1>and understand the way people ask questions in real life. Absolutely,

0:05:35.716 --> 0:05:40.636
<v Speaker 1>And it was this idea of bringing AI into search

0:05:41.036 --> 0:05:44.596
<v Speaker 1>and having these like large scale language models. That first

0:05:44.676 --> 0:05:48.036
<v Speaker 1>one was called BERT. We now use one called MUM,

0:05:48.076 --> 0:05:50.796
<v Speaker 1>which is... We'll get to MUM. But let's talk about... Let's

0:05:50.796 --> 0:05:53.836
<v Speaker 1>try talking about BERT for a second. So how do

0:05:53.916 --> 0:05:58.996
<v Speaker 1>you get from search results that are fundamentally keyword based

0:05:59.476 --> 0:06:03.636
<v Speaker 1>to search results that are fundamentally you know, answering questions

0:06:03.676 --> 0:06:06.396
<v Speaker 1>that are posed in a more natural way, like how

0:06:06.396 --> 0:06:11.076
<v Speaker 1>do you make that leap? So the fundamental insight is

0:06:11.476 --> 0:06:14.236
<v Speaker 1>you go from looking at these words as tokens that

0:06:14.716 --> 0:06:18.556
<v Speaker 1>get matched against each other to suddenly you look at

0:06:18.596 --> 0:06:21.756
<v Speaker 1>all the words in all the documents on the Internet

0:06:22.076 --> 0:06:25.156
<v Speaker 1>and you create what's called an embedding space, which is

0:06:25.276 --> 0:06:27.436
<v Speaker 1>essentially you can think of it as a map of

0:06:27.476 --> 0:06:32.676
<v Speaker 1>the concepts that these documents know about. And suddenly, by

0:06:32.716 --> 0:06:34.636
<v Speaker 1>being able to say, okay, you can take a query,

0:06:34.796 --> 0:06:39.396
<v Speaker 1>map that into this concept embedding space. You take these documents,

0:06:39.396 --> 0:06:42.476
<v Speaker 1>map that into the concept embedding space. You can start

0:06:42.516 --> 0:06:47.116
<v Speaker 1>to actually match together not these words, but what people

0:06:47.236 --> 0:06:50.716
<v Speaker 1>actually mean what they actually mean when they ask these questions,

0:06:50.716 --> 0:06:53.676
<v Speaker 1>and what they actually mean when they write these web

0:06:53.716 --> 0:06:56.836
<v Speaker 1>pages on the Internet. That seems I mean, A, it

0:06:56.836 --> 0:07:00.196
<v Speaker 1>seems super hard, right, and B, as I'm parsing that,

0:07:00.356 --> 0:07:03.916
<v Speaker 1>I'm tempted to use a lot of anthropomorphic language, right,

0:07:03.956 --> 0:07:06.676
<v Speaker 1>I'm tempted to say, like, you have to go from

0:07:06.716 --> 0:07:09.076
<v Speaker 1>the computer just sort of having a list of words

0:07:09.316 --> 0:07:11.556
<v Speaker 1>and kind of weights around those words to a computer

0:07:12.076 --> 0:07:16.316
<v Speaker 1>understanding what people mean. Like, am I right to say that?

0:07:16.436 --> 0:07:19.196
<v Speaker 1>Or is that just my like layperson intuition getting in

0:07:19.196 --> 0:07:22.716
<v Speaker 1>the way of what's going on? I mean, the first

0:07:22.716 --> 0:07:25.836
<v Speaker 1>thing I'll say is I think we're very far away

0:07:25.876 --> 0:07:29.996
<v Speaker 1>from the computer having any sort of sentience and truly understanding.

0:07:30.356 --> 0:07:32.396
<v Speaker 1>But I think it is true. It is fair to

0:07:32.476 --> 0:07:37.236
<v Speaker 1>say that there is a level of deeper understanding that

0:07:37.276 --> 0:07:40.956
<v Speaker 1>you're not just looking at these words as, you know,

0:07:41.036 --> 0:07:44.076
<v Speaker 1>bits in a computer, but you're actually starting to model

0:07:44.236 --> 0:07:46.676
<v Speaker 1>in a way that a human might, a brain might

0:07:46.756 --> 0:07:50.956
<v Speaker 1>model what the concepts are. And I do think that's

0:07:50.956 --> 0:07:53.636
<v Speaker 1>a first step of getting closer to this sort of

0:07:53.796 --> 0:07:59.356
<v Speaker 1>natural human understanding. So is there a way to talk

0:07:59.396 --> 0:08:04.316
<v Speaker 1>about how that works? It's pattern matching, effectively, right?

0:08:04.356 --> 0:08:07.916
<v Speaker 1>And it just so happens that if you magnify pattern

0:08:07.956 --> 0:08:10.316
<v Speaker 1>matching on a very large scale, that can be a

0:08:10.476 --> 0:08:15.396
<v Speaker 1>pretty compelling understanding. And so that's the sort of big idea,

0:08:15.636 --> 0:08:19.036
<v Speaker 1>the theory of how it works. I'm sure in actually

0:08:19.236 --> 0:08:23.116
<v Speaker 1>building the thing, in building BERT, which was this big

0:08:23.636 --> 0:08:28.716
<v Speaker 1>model that did work, it wasn't that easy, right, I mean,

0:08:28.836 --> 0:08:33.076
<v Speaker 1>is there a is there a story version of how

0:08:33.116 --> 0:08:36.276
<v Speaker 1>you built it? So I think there were two hard

0:08:36.316 --> 0:08:39.796
<v Speaker 1>points along the journey. The first hard point was just

0:08:40.996 --> 0:08:43.956
<v Speaker 1>these models were being built at a scale that was

0:08:44.076 --> 0:08:48.516
<v Speaker 1>unprecedented, the amount of information. You know, traditional neural networks

0:08:48.556 --> 0:08:53.756
<v Speaker 1>would run on thousands, maybe millions of training examples. Suddenly

0:08:53.796 --> 0:08:58.076
<v Speaker 1>you're trying to model all the words on the Internet

0:08:58.116 --> 0:09:01.956
<v Speaker 1>and this scale. Firstly, this scale is what gets you

0:09:02.036 --> 0:09:04.756
<v Speaker 1>the amount of training to actually get the concepts model

0:09:04.836 --> 0:09:09.436
<v Speaker 1>to be compelling. But frankly, the computers just couldn't process.

0:09:09.476 --> 0:09:12.236
<v Speaker 1>So you're you're building this model and saying, okay, now

0:09:12.236 --> 0:09:15.596
<v Speaker 1>to learn what you need to learn, read literally every

0:09:15.596 --> 0:09:18.476
<v Speaker 1>word on the Internet. Is that right? Yes, and not

0:09:18.516 --> 0:09:22.076
<v Speaker 1>read it once, because every layer of the neural net needs

0:09:22.116 --> 0:09:25.236
<v Speaker 1>to read it and reprocess it. Right. So you're reading

0:09:25.396 --> 0:09:29.196
<v Speaker 1>every word, you know, a massive number of times. And

0:09:29.236 --> 0:09:31.956
<v Speaker 1>at the time we didn't really have the compute power.

0:09:32.236 --> 0:09:36.836
<v Speaker 1>You just needed more more computers, essentially more and more chips,

0:09:37.036 --> 0:09:40.756
<v Speaker 1>more more engines to just process and process and process.

0:09:41.996 --> 0:09:47.236
<v Speaker 1>So our research team had developed these these chips that

0:09:47.836 --> 0:09:51.516
<v Speaker 1>these processors that were really optimized for doing a sort

0:09:51.556 --> 0:09:55.036
<v Speaker 1>of deep learning work. And it was these chips

0:09:55.076 --> 0:09:56.876
<v Speaker 1>and the way that we could sort of put all

0:09:56.916 --> 0:09:59.236
<v Speaker 1>the chips together to work in concert to solve

0:09:59.316 --> 0:10:03.356
<v Speaker 1>this problem that really unlocked the amount of processing power

0:10:03.356 --> 0:10:05.756
<v Speaker 1>needed to even build these models in the first place.

0:10:05.836 --> 0:10:09.196
<v Speaker 1>So the binding constraint wasn't like the theory or the

0:10:09.236 --> 0:10:10.916
<v Speaker 1>ideas of it, like you knew how to do it,

0:10:10.956 --> 0:10:14.956
<v Speaker 1>you just didn't have enough enough horsepower to actually make

0:10:14.996 --> 0:10:18.396
<v Speaker 1>it happen. Well, we knew that we could do it,

0:10:18.436 --> 0:10:21.356
<v Speaker 1>we didn't know if it'd be any good, right? It

0:10:20.916 --> 0:10:24.676
<v Speaker 1>wasn't... And you couldn't even try, right? Yeah, right.

0:10:25.236 --> 0:10:28.636
<v Speaker 1>And so then we tried it and we found out, actually,

0:10:28.676 --> 0:10:32.476
<v Speaker 1>this thing is pretty compelling. It can understand things that

0:10:32.516 --> 0:10:36.676
<v Speaker 1>our models previously have never understood. You know. But I

0:10:36.716 --> 0:10:39.716
<v Speaker 1>will say the second and this gets to the second

0:10:39.796 --> 0:10:44.956
<v Speaker 1>hard part. Once we had these large scale language models,

0:10:44.956 --> 0:10:49.436
<v Speaker 1>we didn't quite know how to put them into search ranking.

0:10:49.676 --> 0:10:52.876
<v Speaker 1>This was not something that had been done before. So

0:10:52.956 --> 0:10:57.796
<v Speaker 1>we have in search this incredibly rigorous methodology for testing

0:10:57.956 --> 0:11:01.676
<v Speaker 1>any given change to our algorithm, and it's based

0:11:01.676 --> 0:11:05.556
<v Speaker 1>in statistics, and it statistically samples queries, and we look

0:11:05.556 --> 0:11:08.556
<v Speaker 1>at the before and the after, and there's a scoring

0:11:08.596 --> 0:11:11.716
<v Speaker 1>system to say is it better or not? And I

0:11:11.796 --> 0:11:17.756
<v Speaker 1>remember looking at the early experiments from this BERT integration

0:11:17.876 --> 0:11:22.236
<v Speaker 1>into our search engine, and the queries that it was

0:11:22.596 --> 0:11:27.916
<v Speaker 1>impacting were just queries that, honestly before we would have said,

0:11:28.396 --> 0:11:30.796
<v Speaker 1>we don't know how we can solve this query. And

0:11:30.956 --> 0:11:35.796
<v Speaker 1>suddenly the model was just able to figure out these

0:11:35.836 --> 0:11:41.596
<v Speaker 1>sort of unspoken concepts that just our previous technology just

0:11:41.716 --> 0:11:43.956
<v Speaker 1>would not have even been able to come close to.

0:11:44.516 --> 0:11:46.356
<v Speaker 1>Like give me an example, like what kind of thing?

0:11:47.836 --> 0:11:50.956
<v Speaker 1>So here's a really great example. This is directly from

0:11:51.556 --> 0:11:56.956
<v Speaker 1>one of the very first BERT evaluations that we

0:11:56.996 --> 0:12:01.796
<v Speaker 1>did internally, and the query is "can you get medicine

0:12:01.876 --> 0:12:05.836
<v Speaker 1>for someone pharmacy," right? And so what's interesting about this

0:12:05.916 --> 0:12:09.876
<v Speaker 1>question is the user's looking for something very specific. They're

0:12:09.916 --> 0:12:13.356
<v Speaker 1>looking like maybe my partner is sick. Can I go

0:12:13.356 --> 0:12:16.156
<v Speaker 1>and pick up their prescription at the pharmacy for them?

0:12:16.316 --> 0:12:18.516
<v Speaker 1>Or do they have to go and get it? Right?

0:12:18.676 --> 0:12:22.636
<v Speaker 1>It's also a good, like, janky query, where it's half in natural

0:12:22.716 --> 0:12:25.676
<v Speaker 1>language, "can you get medicine for someone," and half in

0:12:25.756 --> 0:12:28.676
<v Speaker 1>like keywordese, they're just typing "pharmacy" at the end, right?

0:12:28.716 --> 0:12:35.156
<v Speaker 1>It's a weird... Exactly, yeah. And so previously we didn't

0:12:35.196 --> 0:12:38.676
<v Speaker 1>know how to parse out this intent, right, this idea.

0:12:38.796 --> 0:12:42.156
<v Speaker 1>You know, we could tell that it was about getting

0:12:42.156 --> 0:12:45.916
<v Speaker 1>a prescription from a pharmacy, but this notion of "for

0:12:46.116 --> 0:12:53.236
<v Speaker 1>someone" was a concept that was just slightly too complex. Oh,

0:12:53.556 --> 0:12:55.636
<v Speaker 1>I didn't even understand it until now. What they mean

0:12:55.756 --> 0:12:58.596
<v Speaker 1>is can I pick up someone else's prescription? That's what

0:12:58.636 --> 0:13:02.076
<v Speaker 1>they're actually asking, But it's very it's poorly worded, frankly,

0:13:02.116 --> 0:13:06.876
<v Speaker 1>and therefore hard to figure out. Exactly right. And so previously,

0:13:06.956 --> 0:13:11.876
<v Speaker 1>before BERT, we would return these wonderful web pages saying

0:13:11.956 --> 0:13:14.996
<v Speaker 1>this is how you get a prescription filled, which you

0:13:15.036 --> 0:13:19.116
<v Speaker 1>can imagine if you're this user doing this query, you're like, yeah,

0:13:19.156 --> 0:13:21.316
<v Speaker 1>I already know how to get a prescription filled,

0:13:21.436 --> 0:13:24.956
<v Speaker 1>thanks. For me. What I need is filled for somebody

0:13:24.956 --> 0:13:29.396
<v Speaker 1>else. Exactly. And with BERT we were able to understand,

0:13:29.476 --> 0:13:32.556
<v Speaker 1>pick up this idea of the "for someone" and put

0:13:32.596 --> 0:13:35.396
<v Speaker 1>the appropriate weight on it, that that was the sort

0:13:35.396 --> 0:13:38.956
<v Speaker 1>of you know, discriminating thing in the query, that that

0:13:39.036 --> 0:13:41.356
<v Speaker 1>was the key thing that the query turned on. And

0:13:42.196 --> 0:13:46.636
<v Speaker 1>then we were able to show this web page

0:13:46.676 --> 0:13:48.556
<v Speaker 1>that talked about can I have a friend or family

0:13:48.596 --> 0:13:51.436
<v Speaker 1>member pick up a prescription for me? And that was

0:13:51.476 --> 0:13:54.196
<v Speaker 1>the sort of like aha moment where we could all

0:13:54.236 --> 0:13:56.316
<v Speaker 1>just sit around and be like, Wow, this is a

0:13:56.436 --> 0:13:59.876
<v Speaker 1>new level of understanding that we haven't got to previously.

0:14:02.516 --> 0:14:05.156
<v Speaker 1>So with BERT, Google got to the point where it

0:14:05.236 --> 0:14:08.996
<v Speaker 1>was very very good at dealing with words in a deep,

0:14:09.236 --> 0:14:12.316
<v Speaker 1>complex way. But words make up less and less of

0:14:12.356 --> 0:14:16.156
<v Speaker 1>the Internet. Pictures and videos are a whole other story

0:14:16.716 --> 0:14:25.916
<v Speaker 1>that's coming up in a minute. Now back to the show.

0:14:26.396 --> 0:14:29.156
<v Speaker 1>So you have got to this place now where you

0:14:28.396 --> 0:14:32.876
<v Speaker 1>have you Google have made the leap from keyword based

0:14:32.876 --> 0:14:35.876
<v Speaker 1>searches to intention-based searches: what do people mean, right?

0:14:35.876 --> 0:14:39.436
<v Speaker 1>Which is this big interesting leap? And so I'm interested

0:14:39.476 --> 0:14:42.436
<v Speaker 1>in kind of the next leap, like what's the next

0:14:42.636 --> 0:14:47.116
<v Speaker 1>big hard problem you're trying to solve? What's really interesting

0:14:47.196 --> 0:14:50.876
<v Speaker 1>to me is this idea of how many questions you

0:14:50.916 --> 0:14:54.556
<v Speaker 1>don't ask because you don't even know the words, right,

0:14:54.996 --> 0:14:58.316
<v Speaker 1>Like this is a bit of a sad story. But

0:14:58.436 --> 0:15:01.676
<v Speaker 1>I have at my house this oak tree, and the

0:15:01.716 --> 0:15:04.756
<v Speaker 1>oak tree I think is dead, and it's very sad

0:15:04.756 --> 0:15:08.196
<v Speaker 1>for me because it's a very beautiful oak tree. And what's

0:15:08.236 --> 0:15:11.556
<v Speaker 1>interesting is, you know, I looked at the oak tree,

0:15:11.596 --> 0:15:14.716
<v Speaker 1>I'm like, wow, those leaves are kind of brown, Like,

0:15:14.796 --> 0:15:17.276
<v Speaker 1>that's not it doesn't seem right to me. I wonder

0:15:17.316 --> 0:15:20.676
<v Speaker 1>if something's wrong with the oak tree, right? But

0:15:20.756 --> 0:15:24.676
<v Speaker 1>I can't necessarily right now really articulate that to a

0:15:24.716 --> 0:15:27.596
<v Speaker 1>computer this fundamental question of is this oak tree dead?

0:15:27.636 --> 0:15:30.156
<v Speaker 1>And if not, what can I do to save it? Right?

0:15:30.676 --> 0:15:32.436
<v Speaker 1>So what I do is I go and type in

0:15:32.476 --> 0:15:36.356
<v Speaker 1>some queries, I say, you know, oak tree dead? How

0:15:36.356 --> 0:15:38.956
<v Speaker 1>do I know if my oak tree is dead? You know?

0:15:39.036 --> 0:15:42.556
<v Speaker 1>And I get back results. But those results aren't necessarily

0:15:42.636 --> 0:15:45.876
<v Speaker 1>taking what they're not taking in any context of this

0:15:46.036 --> 0:15:49.196
<v Speaker 1>particular tree and what do the leaves look like? And

0:15:49.236 --> 0:15:53.756
<v Speaker 1>so this idea of how can you start to ask

0:15:53.836 --> 0:15:58.796
<v Speaker 1>these questions using all of the information around you, using

0:15:59.036 --> 0:16:03.396
<v Speaker 1>your camera to actually capture this particular oak tree, using

0:16:03.396 --> 0:16:07.716
<v Speaker 1>your location to know, you know, what are the native

0:16:07.716 --> 0:16:11.436
<v Speaker 1>oaks in this area? And what's the current incidence of

0:16:12.196 --> 0:16:14.476
<v Speaker 1>sudden oak death syndrome, which is a thing that I

0:16:14.516 --> 0:16:17.116
<v Speaker 1>have recently learned exists. Okay, so I get why this

0:16:17.196 --> 0:16:22.116
<v Speaker 1>is a hard thing to search in a text box, right,

0:16:22.236 --> 0:16:24.556
<v Speaker 1>And so the thing that's interesting to me is how

0:16:24.596 --> 0:16:29.436
<v Speaker 1>can we facilitate asking those types of questions where it's

0:16:29.476 --> 0:16:33.876
<v Speaker 1>a mix of here's something that you're looking at, Here's

0:16:33.916 --> 0:16:37.156
<v Speaker 1>something that you're saying with your words that adds to

0:16:37.236 --> 0:16:40.876
<v Speaker 1>the picture. You know, here's a lemon tree that's got

0:16:40.956 --> 0:16:43.156
<v Speaker 1>some black spots on it? What's wrong with it? Like?

0:16:43.196 --> 0:16:46.156
<v Speaker 1>How can you help me understand what I should do

0:16:46.236 --> 0:16:49.996
<v Speaker 1>about this? You know, these sorts of questions I think

0:16:50.156 --> 0:16:54.156
<v Speaker 1>are... Right now, we have to do a tremendous amount

0:16:54.156 --> 0:16:57.676
<v Speaker 1>of work to try and translate these questions into text

0:16:57.876 --> 0:17:01.276
<v Speaker 1>that we would issue to a search engine. And yeah,

0:17:01.516 --> 0:17:05.316
<v Speaker 1>we use that. Yeah, yeah, normal people. Yes, we're all

0:17:05.356 --> 0:17:08.196
<v Speaker 1>doing it. And when you think it's you know, sometimes

0:17:08.196 --> 0:17:12.876
<v Speaker 1>it's very easy, right, but sometimes you're like really having

0:17:12.916 --> 0:17:14.836
<v Speaker 1>to work hard to come up with a query that

0:17:14.836 --> 0:17:16.636
<v Speaker 1>will actually get you the answers that you need. And

0:17:16.676 --> 0:17:19.276
<v Speaker 1>I think that's really the next frontier for us is

0:17:19.716 --> 0:17:23.196
<v Speaker 1>how do we on the query side help users just

0:17:23.436 --> 0:17:30.156
<v Speaker 1>naturally intuitively express whatever information need they have. And then

0:17:30.276 --> 0:17:34.276
<v Speaker 1>how do we understand the whole universe of information, not

0:17:34.396 --> 0:17:38.356
<v Speaker 1>just the web pages, but all the images and video

0:17:38.516 --> 0:17:43.636
<v Speaker 1>and audio out there, and take that next level of

0:17:43.676 --> 0:17:47.756
<v Speaker 1>like concept understanding to match those together so that we

0:17:47.796 --> 0:17:53.716
<v Speaker 1>can get users even more precise answers that really help them. Great.

0:17:53.836 --> 0:18:00.356
<v Speaker 1>So that's the like vast dream slash big problem. Can

0:18:00.396 --> 0:18:02.356
<v Speaker 1>we reduce it a little bit so we can talk

0:18:02.396 --> 0:18:05.036
<v Speaker 1>in sort of practical terms about what you're working on.

0:18:05.076 --> 0:18:08.556
<v Speaker 1>I mean, I know there's this new AI model that

0:18:09.156 --> 0:18:12.516
<v Speaker 1>integrates images, like you can, you know, whatever, take a

0:18:12.556 --> 0:18:14.876
<v Speaker 1>picture with the camera on your phone and put in text.

0:18:14.996 --> 0:18:17.156
<v Speaker 1>So like, well, you have this new model, and it,

0:18:17.356 --> 0:18:20.676
<v Speaker 1>like the old one, has this warm, fuzzy acronym. Right,

0:18:20.716 --> 0:18:23.516
<v Speaker 1>it's called MUM, which stands for hold on, I gotta

0:18:23.516 --> 0:18:27.756
<v Speaker 1>look at my notes, the multitask unified model. So like,

0:18:28.076 --> 0:18:33.036
<v Speaker 1>tell me about MUM. So MUM is our next-level

0:18:33.076 --> 0:18:36.676
<v Speaker 1>model. You know, BERT was about language. MUM is

0:18:36.716 --> 0:18:41.316
<v Speaker 1>about all these different modalities of information coming together, particularly

0:18:41.516 --> 0:18:44.636
<v Speaker 1>images and language. I mean, it's really images

0:18:44.676 --> 0:18:49.596
<v Speaker 1>and language and we've got some limited applications of it

0:18:49.636 --> 0:18:53.876
<v Speaker 1>in search today. So for example, you can take the

0:18:54.276 --> 0:18:58.756
<v Speaker 1>take the photo of somebody's handbag and say you want

0:18:58.756 --> 0:19:01.396
<v Speaker 1>to shop it, and that will work today. And that

0:19:01.556 --> 0:19:04.036
<v Speaker 1>is like we were not able to do this previously,

0:19:04.116 --> 0:19:06.076
<v Speaker 1>and that in and of itself is a big breakthrough.

0:19:06.516 --> 0:19:09.516
<v Speaker 1>But there's still so much headroom, right? Like, there's still

0:19:09.556 --> 0:19:13.516
<v Speaker 1>so much ability to say, you know, you can add

0:19:13.556 --> 0:19:18.476
<v Speaker 1>sort of I would I would classify our current ability

0:19:18.516 --> 0:19:22.236
<v Speaker 1>to process words in this multimodal context as you know,

0:19:22.396 --> 0:19:24.396
<v Speaker 1>kind of like back in the early days of

0:19:24.436 --> 0:19:26.636
<v Speaker 1>the internet. You can say "near me" to find where

0:19:26.636 --> 0:19:29.196
<v Speaker 1>you can buy it near me, you can say "buy,"

0:19:29.396 --> 0:19:34.396
<v Speaker 1>but you can't necessarily like ask an incredibly complicated question

0:19:34.956 --> 0:19:38.116
<v Speaker 1>about a picture, right like, so we're kind of back

0:19:38.156 --> 0:19:42.276
<v Speaker 1>to keywords in this new pictures plus words universe. Let

0:19:42.316 --> 0:19:45.356
<v Speaker 1>me ask a dumb question, why why can't you just

0:19:45.436 --> 0:19:51.036
<v Speaker 1>take all of your brilliant intent AI and copy and

0:19:51.116 --> 0:19:56.516
<v Speaker 1>paste it to fit with the image AI. So a

0:19:56.556 --> 0:20:01.716
<v Speaker 1>couple of things. The first is that anytime we develop

0:20:02.236 --> 0:20:04.876
<v Speaker 1>sort of this new technology, we also need to see

0:20:04.916 --> 0:20:07.916
<v Speaker 1>how users start using it, right And so I think

0:20:07.956 --> 0:20:12.156
<v Speaker 1>it's also fair to say that we don't have... you know,

0:20:12.196 --> 0:20:14.196
<v Speaker 1>we have a ton of people using this, but we

0:20:15.036 --> 0:20:18.156
<v Speaker 1>haven't yet. There hasn't been time for that new technology

0:20:18.196 --> 0:20:21.156
<v Speaker 1>to really be accepted by the world. And then we

0:20:21.196 --> 0:20:25.396
<v Speaker 1>have this vast set of queries that we're doing poorly on, right.

0:20:25.436 --> 0:20:27.276
<v Speaker 1>So that's the other thing you should know about Google.

0:20:27.516 --> 0:20:29.556
<v Speaker 1>We spend a lot of time looking at the queries

0:20:29.596 --> 0:20:31.356
<v Speaker 1>where we're failing. That's one of the other reasons we

0:20:31.396 --> 0:20:33.876
<v Speaker 1>have a deep appreciation of how search is an unsolved

0:20:33.916 --> 0:20:37.236
<v Speaker 1>problem, because we're just constantly looking at queries where the

0:20:37.316 --> 0:20:40.716
<v Speaker 1>user's clearly not getting what they're looking for. And I'll

0:20:41.116 --> 0:20:45.156
<v Speaker 1>use that as a seed to figure out

0:20:45.196 --> 0:20:47.836
<v Speaker 1>how to make things better. So do I understand you

0:20:47.916 --> 0:20:50.196
<v Speaker 1>that the fundamental thing you need now is just lots

0:20:50.236 --> 0:20:52.716
<v Speaker 1>of people to use this thing so you can see

0:20:53.436 --> 0:20:56.156
<v Speaker 1>the weird ways people search and the things they sort

0:20:56.196 --> 0:21:00.236
<v Speaker 1>of do that are hard to understand? That's certainly one

0:21:00.276 --> 0:21:03.756
<v Speaker 1>of the things we need. I mean, it is fundamentally

0:21:04.316 --> 0:21:07.636
<v Speaker 1>search works in service of our users, right, and so

0:21:07.876 --> 0:21:13.196
<v Speaker 1>understanding the failures is critical to how we get better.

0:21:13.396 --> 0:21:15.476
<v Speaker 1>I think there are also just things that we know

0:21:15.556 --> 0:21:17.956
<v Speaker 1>that we need to do on the AI and the

0:21:17.996 --> 0:21:22.396
<v Speaker 1>model side that we'll continue working through, right, the ability

0:21:22.476 --> 0:21:28.156
<v Speaker 1>to really bring together more of the two step process

0:21:28.196 --> 0:21:31.636
<v Speaker 1>of how do you conceptually understand the words, conceptually understand

0:21:31.636 --> 0:21:35.076
<v Speaker 1>the image, and then bring those two things together and

0:21:35.156 --> 0:21:37.556
<v Speaker 1>have that be a bit deeper on both sides rather

0:21:37.636 --> 0:21:40.716
<v Speaker 1>than just the combination together and those sorts of things.

0:21:41.676 --> 0:21:44.236
<v Speaker 1>But yeah, I mean people coming in and using it

0:21:44.276 --> 0:21:46.636
<v Speaker 1>and then having a bad time will then make it better.

0:21:48.276 --> 0:21:53.316
<v Speaker 1>It seems like there have been two main threads of

0:21:53.396 --> 0:21:57.476
<v Speaker 1>AI research. One is basically language and the other is

0:21:57.876 --> 0:22:02.796
<v Speaker 1>basically vision and images. I mean, is it right

0:22:02.836 --> 0:22:05.196
<v Speaker 1>to think of what you're trying to do as the

0:22:05.236 --> 0:22:10.756
<v Speaker 1>synthesis of those two sort of main AI traditions? Yeah,

0:22:10.796 --> 0:22:15.956
<v Speaker 1>I think so. I think it is clearly the case

0:22:16.076 --> 0:22:20.196
<v Speaker 1>that, just like, you know, with BERT, we took

0:22:20.196 --> 0:22:23.236
<v Speaker 1>all these words and we got down to concepts. Right,

0:22:23.476 --> 0:22:26.236
<v Speaker 1>It is clearly the case that human beings understand the

0:22:26.276 --> 0:22:28.756
<v Speaker 1>world through concepts, and they do that visually, and they

0:22:28.796 --> 0:22:32.876
<v Speaker 1>do that with language, and ultimately the concepts are the same. Right,

0:22:32.956 --> 0:22:36.516
<v Speaker 1>So being able to say, okay, here's

0:22:36.516 --> 0:22:39.516
<v Speaker 1>a concept, and we can attach to that what that

0:22:39.596 --> 0:22:43.076
<v Speaker 1>concept looks like or that visual representation of that concept

0:22:43.116 --> 0:22:45.876
<v Speaker 1>as much as it has one and the words surrounding

0:22:45.916 --> 0:22:50.196
<v Speaker 1>that concept. That's when we can really unlock this true

0:22:51.596 --> 0:22:56.076
<v Speaker 1>natural way of understanding the world that we think is

0:22:56.436 --> 0:22:58.836
<v Speaker 1>going to enable people to ask all those questions that

0:22:58.876 --> 0:23:02.356
<v Speaker 1>they have that they're not asking right now. Are there

0:23:03.276 --> 0:23:08.756
<v Speaker 1>applications that go beyond search that come to mind if

0:23:08.756 --> 0:23:13.316
<v Speaker 1>you figure this out? Yeah, I mean I think that

0:23:16.316 --> 0:23:20.596
<v Speaker 1>search has this connotation of kind of find what's out there.

0:23:22.316 --> 0:23:26.476
<v Speaker 1>I think there's something, you know, we're thinking about what

0:23:26.516 --> 0:23:31.156
<v Speaker 1>this looks like in the generative space. So for example,

0:23:31.196 --> 0:23:35.836
<v Speaker 1>if you're looking for you know, I bake birthday cakes,

0:23:36.356 --> 0:23:40.836
<v Speaker 1>for my kids, and sometimes what my

0:23:40.916 --> 0:23:43.636
<v Speaker 1>kids want in a birthday cake just actually doesn't exist

0:23:43.796 --> 0:23:46.956
<v Speaker 1>on the internet, right? Or there's like only one or two,

0:23:47.156 --> 0:23:48.956
<v Speaker 1>So then I have to come up with it myself.

0:23:48.996 --> 0:23:54.716
<v Speaker 1>And like, what if AI could help us generate a

0:23:54.756 --> 0:23:59.156
<v Speaker 1>sample image just based on these concepts that I could

0:23:59.196 --> 0:24:02.716
<v Speaker 1>then use for inspiration? I think that's a pretty interesting concept.

0:24:03.476 --> 0:24:05.356
<v Speaker 1>There's obviously a lot of things that we need to

0:24:05.356 --> 0:24:07.996
<v Speaker 1>be very thoughtful about in this space as we do it,

0:24:09.396 --> 0:24:12.996
<v Speaker 1>But I think this idea of extending search past the

0:24:13.076 --> 0:24:15.916
<v Speaker 1>notion of connecting you with the information that's out there

0:24:15.956 --> 0:24:21.676
<v Speaker 1>to actually synthesizing new information for you is pretty interesting

0:24:21.876 --> 0:24:25.156
<v Speaker 1>and something we're talking about a lot. You know, one

0:24:25.196 --> 0:24:27.316
<v Speaker 1>of the things that has become clear to me talking

0:24:27.356 --> 0:24:32.716
<v Speaker 1>with you is, clearly, I think too narrowly about search, right?

0:24:32.756 --> 0:24:35.756
<v Speaker 1>I have this very kind of twenty years ago idea

0:24:35.836 --> 0:24:41.636
<v Speaker 1>of like searching text on the web, and the web

0:24:41.716 --> 0:24:45.516
<v Speaker 1>has become much less text based in that time. Right,

0:24:46.876 --> 0:24:50.196
<v Speaker 1>the web includes Instagram, the web includes TikTok, and those

0:24:50.196 --> 0:24:54.196
<v Speaker 1>are places where, weirdly to me, lots of people go

0:24:54.316 --> 0:24:56.956
<v Speaker 1>to search like people go on TikTok to find whatever,

0:24:56.996 --> 0:24:58.716
<v Speaker 1>where to go out to eat, which would never occur

0:24:58.796 --> 0:25:01.796
<v Speaker 1>to me. So, I mean, is that part of the

0:25:01.916 --> 0:25:06.596
<v Speaker 1>sort of motivation on some level for you to figure out, Oh, right, text,

0:25:06.636 --> 0:25:08.516
<v Speaker 1>that's not enough. Cool, we got to figure out how

0:25:08.556 --> 0:25:10.596
<v Speaker 1>to search in video, and what does that even mean?

0:25:12.156 --> 0:25:15.596
<v Speaker 1>I think we're really driven by what our users are

0:25:15.596 --> 0:25:19.396
<v Speaker 1>telling us, and we just have really robust mechanisms for

0:25:20.156 --> 0:25:23.556
<v Speaker 1>understanding what our users are doing. And it's pretty clear

0:25:23.756 --> 0:25:29.316
<v Speaker 1>that people around the world find image and video content

0:25:29.396 --> 0:25:34.276
<v Speaker 1>to be pretty compelling, right? I mean, that's sort of

0:25:34.316 --> 0:25:38.916
<v Speaker 1>a very obvious statement, but you know the Internet in

0:25:38.956 --> 0:25:41.796
<v Speaker 1>the early days, it was bandwidth constrained. It was

0:25:41.876 --> 0:25:45.276
<v Speaker 1>technology constrained. It had to be words because that's what

0:25:45.396 --> 0:25:50.556
<v Speaker 1>the technology enabled, not necessarily because that's what human beings

0:25:50.596 --> 0:25:54.276
<v Speaker 1>most enjoy in terms of an information consumption experience. And

0:25:54.356 --> 0:25:57.876
<v Speaker 1>so we really are driven by what we're seeing in

0:25:57.916 --> 0:26:00.676
<v Speaker 1>the user trends, and we're really driven by just this

0:26:00.716 --> 0:26:03.076
<v Speaker 1>mission of how do we keep helping people get the

0:26:03.116 --> 0:26:10.996
<v Speaker 1>best answers to their questions that we can give them.

0:26:11.036 --> 0:26:14.036
<v Speaker 1>In a minute, the lightning round, including what you learn

0:26:14.116 --> 0:26:16.756
<v Speaker 1>about the Internet when you spend six years working at

0:26:16.756 --> 0:26:27.196
<v Speaker 1>Google Search. Now, let's get back to the show. I

0:26:27.236 --> 0:26:29.316
<v Speaker 1>want to do a lightning round. We close usually with

0:26:29.356 --> 0:26:32.316
<v Speaker 1>a lightning round on this show, just a bunch of

0:26:32.556 --> 0:26:37.876
<v Speaker 1>relatively quick questions. So in this instance, I googled best

0:26:37.956 --> 0:26:41.196
<v Speaker 1>Lightning Round questions, and right at the top of the

0:26:41.196 --> 0:26:43.076
<v Speaker 1>search results page I didn't even have to click through,

0:26:43.236 --> 0:26:45.036
<v Speaker 1>is this bulleted list. I'm just going to give you

0:26:45.076 --> 0:26:48.716
<v Speaker 1>a few from there. Sounds good. Favorite day of the week?

0:26:50.476 --> 0:26:57.076
<v Speaker 1>Oh, Monday, because I get to go to work and

0:26:57.516 --> 0:26:59.716
<v Speaker 1>not deal with my kids all day, who I love

0:26:59.916 --> 0:27:06.116
<v Speaker 1>very dearly. Good. Favorite city in the US besides the one

0:27:06.156 --> 0:27:10.716
<v Speaker 1>you live in? Just reading here. New York City. Thank you.

0:27:12.156 --> 0:27:14.956
<v Speaker 1>Would you rather be able to speak every language in

0:27:14.956 --> 0:27:19.196
<v Speaker 1>the world or be able to talk to animals? Speak

0:27:19.236 --> 0:27:22.636
<v Speaker 1>every language in the world. I'm shocked, to be honest,

0:27:22.676 --> 0:27:24.796
<v Speaker 1>although I guess, like, Google might actually figure

0:27:24.796 --> 0:27:27.276
<v Speaker 1>that out. Does it not seem like you can already

0:27:27.316 --> 0:27:30.596
<v Speaker 1>get a translator for every language? Talking to animals would

0:27:30.636 --> 0:27:34.236
<v Speaker 1>be like a revolution in human understanding of the natural world.

0:27:35.076 --> 0:27:38.676
<v Speaker 1>I guess. But I do not speak any languages, really,

0:27:38.916 --> 0:27:42.316
<v Speaker 1>and I constantly feel bad about it. So maybe that

0:27:42.396 --> 0:27:48.676
<v Speaker 1>was just a very personal feeling of weakness. So okay,

0:27:48.836 --> 0:27:51.116
<v Speaker 1>so now we're pivoting out of the Google lightning

0:27:51.156 --> 0:27:57.116
<v Speaker 1>round questions into my own bespoke lightning round questions. What's

0:27:57.156 --> 0:28:02.196
<v Speaker 1>your favorite kind of cake to bake? Oh, well, so

0:28:03.636 --> 0:28:09.476
<v Speaker 1>I really make these, like quite elaborate cakes for my

0:28:09.596 --> 0:28:13.156
<v Speaker 1>children because I want them to be able to grow

0:28:13.196 --> 0:28:15.476
<v Speaker 1>up and say, wow, I remember you making great cakes

0:28:15.476 --> 0:28:20.396
<v Speaker 1>for us, mostly. So I recently made... one of

0:28:20.396 --> 0:28:23.196
<v Speaker 1>my kids plays Minecraft a lot, and there is a

0:28:23.276 --> 0:28:26.276
<v Speaker 1>character called a slime, which is a sort of jelly

0:28:26.356 --> 0:28:28.316
<v Speaker 1>blob that kind of jumps on top of you and

0:28:28.436 --> 0:28:30.556
<v Speaker 1>kills you if you don't fight it off. And so

0:28:30.596 --> 0:28:33.556
<v Speaker 1>I made a slime cake with a cake embedded in

0:28:34.316 --> 0:28:38.396
<v Speaker 1>the jelly. So, big idea one here: what do you

0:28:38.436 --> 0:28:43.876
<v Speaker 1>think you understand about the Internet that most people don't understand? Oh,

0:28:43.916 --> 0:28:47.356
<v Speaker 1>I like this one. I think most people don't understand

0:28:47.396 --> 0:28:50.876
<v Speaker 1>how much it changes every day. And you know, so

0:28:50.916 --> 0:28:54.076
<v Speaker 1>we have this astonishing stat that even I didn't believe

0:28:54.116 --> 0:28:56.116
<v Speaker 1>when I heard it, which is that fifteen percent of

0:28:56.116 --> 0:28:59.196
<v Speaker 1>the queries Google sees every day we have never seen before.

0:28:59.396 --> 0:29:01.916
<v Speaker 1>And that happens every day, that there's fifteen percent that's

0:29:02.036 --> 0:29:06.356
<v Speaker 1>just completely new. And the same happens on the internet side.

0:29:06.396 --> 0:29:09.396
<v Speaker 1>Every day we index a ton of new content we've

0:29:09.436 --> 0:29:13.116
<v Speaker 1>never seen before about ideas that are completely new to

0:29:13.236 --> 0:29:18.036
<v Speaker 1>humanity at that time, right, And you know, we have

0:29:18.116 --> 0:29:22.116
<v Speaker 1>to be able to continually understand that and keep up.

0:29:22.156 --> 0:29:25.036
<v Speaker 1>And I think that people sort of have this idea

0:29:25.116 --> 0:29:27.196
<v Speaker 1>that there's a fixed amount of information out there, but

0:29:27.236 --> 0:29:31.756
<v Speaker 1>actually human beings are astonishingly productive and are constantly coming

0:29:31.836 --> 0:29:35.476
<v Speaker 1>up with new ideas. If everything goes well, what problem

0:29:35.476 --> 0:29:38.556
<v Speaker 1>will you be trying to solve in five years? I

0:29:38.596 --> 0:29:41.636
<v Speaker 1>will still be working on making Google Search better for

0:29:41.716 --> 0:29:44.396
<v Speaker 1>all our users. I think we

0:29:44.436 --> 0:29:48.156
<v Speaker 1>will be working on this for the next hundred years.

0:29:48.236 --> 0:29:51.836
<v Speaker 1>Is there a narrower answer, like this particular problem you're

0:29:51.836 --> 0:29:57.076
<v Speaker 1>working on now of integrating image and words basically like

0:29:57.516 --> 0:30:00.036
<v Speaker 1>you think, obviously it won't be completely solved, but

0:30:00.076 --> 0:30:02.076
<v Speaker 1>you think that'll basically work. And if so, is there

0:30:02.116 --> 0:30:09.916
<v Speaker 1>a next thing? I think the problem of video

0:30:10.316 --> 0:30:12.916
<v Speaker 1>I think will continue to be hard because there's just

0:30:13.036 --> 0:30:15.756
<v Speaker 1>such a large amount of information in a given video.

0:30:17.276 --> 0:30:21.956
<v Speaker 1>The other problem that I'm really interested in is helping

0:30:22.676 --> 0:30:27.516
<v Speaker 1>people parse information with helpful context. So, like, how, you

0:30:27.556 --> 0:30:31.676
<v Speaker 1>know, we've unleashed all of the world's information on people,

0:30:32.356 --> 0:30:36.156
<v Speaker 1>how do you actually help them sift through that

0:30:36.236 --> 0:30:39.756
<v Speaker 1>and make good decisions, whether it's choosing a reliable merchant

0:30:39.796 --> 0:30:43.676
<v Speaker 1>to buy from or finding reliable medical information? How do

0:30:43.716 --> 0:30:46.196
<v Speaker 1>you help people make those decisions for themselves and be

0:30:46.316 --> 0:30:49.476
<v Speaker 1>literate with their information choices. What's one piece of advice

0:30:49.476 --> 0:30:54.996
<v Speaker 1>you'd give to someone trying to solve a hard problem.

0:30:55.036 --> 0:30:58.596
<v Speaker 1>I would say, find a really great group of people

0:30:58.716 --> 0:31:02.556
<v Speaker 1>to help solve it with you, because generally trying to

0:31:02.556 --> 0:31:05.836
<v Speaker 1>solve hard things by yourself ends up being an act of frustration.

0:31:09.436 --> 0:31:12.356
<v Speaker 1>Kathy Edwards is vice president and GM of Search

0:31:12.596 --> 0:31:18.236
<v Speaker 1>at Google. Today's show was edited by Robert Smith, produced

0:31:18.276 --> 0:31:22.596
<v Speaker 1>by Edith Russelo, and engineered by Amanda k Waugh. I'm

0:31:22.676 --> 0:31:25.116
<v Speaker 1>Jacob Goldstein, and we'll be back next week with another

0:31:25.116 --> 0:31:31.636
<v Speaker 1>episode of What's Your Problem.