WEBVTT - Google Goes Magenta

0:00:00.160 --> 0:00:07.160
<v Speaker 1>Brought to you by Toyota. Let's go places. Welcome to

0:00:07.360 --> 0:00:14.920
<v Speaker 1>Forward Thinking. Hey there, and welcome to Forward Thinking, the

0:00:15.120 --> 0:00:18.239
<v Speaker 1>podcast that looks at the future and says rewritten by

0:00:18.239 --> 0:00:24.840
<v Speaker 1>machine on new technology. I'm John and I'm Joe McCormick. So, guys,

0:00:25.239 --> 0:00:28.560
<v Speaker 1>you know, we've talked a little bit about AI and

0:00:28.840 --> 0:00:32.400
<v Speaker 1>its potential role in creativity a few times. Back in September,

0:00:33.320 --> 0:00:36.400
<v Speaker 1>we recorded an episode called the Future of Music Composition,

0:00:36.920 --> 0:00:39.559
<v Speaker 1>and we talked about electronically aided composition as well as

0:00:39.600 --> 0:00:43.400
<v Speaker 1>the idea of AI being able to compose music all

0:00:43.440 --> 0:00:45.879
<v Speaker 1>by itself. You know. One of the funny things is

0:00:46.400 --> 0:00:49.159
<v Speaker 1>on that episode, I think we talked about, you know,

0:00:49.200 --> 0:00:52.599
<v Speaker 1>will we ever have something that and and then it's

0:00:52.640 --> 0:00:55.600
<v Speaker 1>stuff we already have right now. Yeah. Yeah, to be fair,

0:00:55.680 --> 0:00:57.880
<v Speaker 1>some of it may not have been as publicly

0:00:57.920 --> 0:01:01.400
<v Speaker 1>available in as it is. No, I mean now today,

0:01:01.520 --> 0:01:07.080
<v Speaker 1>not necessarily now, back then we had earlier versions of them. Yeah. Oh,

0:01:07.400 --> 0:01:09.399
<v Speaker 1>the base technology that we're going to be talking

0:01:09.440 --> 0:01:13.600
<v Speaker 1>about today, really deep learning didn't kind of was not

0:01:13.680 --> 0:01:18.880
<v Speaker 1>announced to the people. Yeah, we can. We cannot be blamed,

0:01:19.360 --> 0:01:22.000
<v Speaker 1>is I think what we're trying to say here, But

0:01:22.120 --> 0:01:26.240
<v Speaker 1>we're futurists, not psychics. We're not even really futurists. Reporters, Yeah,

0:01:26.240 --> 0:01:29.480
<v Speaker 1>we're not even really reporters. Kind of sit in front

0:01:29.480 --> 0:01:32.080
<v Speaker 1>of microphones. We're just gonna keep stepping back. We don't

0:01:32.080 --> 0:01:34.800
<v Speaker 1>know what we are, but we know what we're excited about.

0:01:35.000 --> 0:01:38.360
<v Speaker 1>We're excited about this idea of AI being able to

0:01:38.440 --> 0:01:42.920
<v Speaker 1>engage in works of creativity. And before, just before we

0:01:43.040 --> 0:01:46.560
<v Speaker 1>jumped into the studio, Joe, you came across an example

0:01:46.760 --> 0:01:50.560
<v Speaker 1>of AI that works with music. Yeah, so I

0:01:50.640 --> 0:01:53.560
<v Speaker 1>was wondering if you guys had listened to any of

0:01:53.600 --> 0:01:57.600
<v Speaker 1>the creations of this app called Jukedeck. And before

0:01:57.720 --> 0:02:00.640
<v Speaker 1>you asked us, I had not, But Lauren and I

0:02:00.680 --> 0:02:04.960
<v Speaker 1>listened to a track. I decided to generate a music

0:02:05.040 --> 0:02:09.840
<v Speaker 1>track that was in the electronica genre with a mood

0:02:10.280 --> 0:02:14.080
<v Speaker 1>of aggressive. Was it aggressive? No, it was kind of...

0:02:14.160 --> 0:02:18.920
<v Speaker 1>It was kind of a little bouncy. Yeah, yeah, aggressively bouncy. Yeah.

0:02:19.120 --> 0:02:22.560
<v Speaker 1>What Jukedeck does is it will let you specify

0:02:22.560 --> 0:02:26.040
<v Speaker 1>a mood and specify a genre from like, you know,

0:02:26.120 --> 0:02:28.800
<v Speaker 1>four different choices. You can pick a piano thing, a

0:02:28.880 --> 0:02:31.720
<v Speaker 1>folk thing, an electronic thing, or something else. I can't

0:02:31.720 --> 0:02:34.200
<v Speaker 1>remember the last one, and then it'll let you pick

0:02:34.200 --> 0:02:35.919
<v Speaker 1>a mood to go with that and say how long

0:02:35.960 --> 0:02:38.760
<v Speaker 1>of a track you want, and then it'll just generate

0:02:38.960 --> 0:02:42.720
<v Speaker 1>for you an original track of music that is royalty

0:02:42.800 --> 0:02:46.880
<v Speaker 1>free that was created by AI. So I first

0:02:46.919 --> 0:02:49.480
<v Speaker 1>told it I wanted a ninety second original track of

0:02:49.520 --> 0:02:52.840
<v Speaker 1>electronic music for a chilled mood, and it gave me

0:02:52.880 --> 0:02:57.200
<v Speaker 1>a track called Holistic Adventure. Ours was called Sweltering Seas.

0:02:57.520 --> 0:03:00.720
<v Speaker 1>That's pretty good. And then it also made me a

0:03:00.800 --> 0:03:04.240
<v Speaker 1>sixty second original track of folk style music for a

0:03:04.360 --> 0:03:08.359
<v Speaker 1>melancholic mood, which was called Infinite Atoms. And beyond that,

0:03:08.600 --> 0:03:11.160
<v Speaker 1>if you were to go further into it, they actually

0:03:11.240 --> 0:03:16.320
<v Speaker 1>have other genres available as well, including cinematic, and, uh,

0:03:16.880 --> 0:03:18.840
<v Speaker 1>you just have to pay up for those. Yeah, it's when

0:03:18.840 --> 0:03:22.040
<v Speaker 1>you've subscribed to the service. And one thing we should

0:03:22.080 --> 0:03:24.960
<v Speaker 1>point out is that not every mood is available with

0:03:25.040 --> 0:03:27.720
<v Speaker 1>every style of music, So you know, you can't really

0:03:27.720 --> 0:03:31.240
<v Speaker 1>get a lot of aggressive folk, necessarily. Yeah. So

0:03:31.320 --> 0:03:34.000
<v Speaker 1>I'm not exactly sure how this works. So the way

0:03:34.040 --> 0:03:37.280
<v Speaker 1>they explain it is, quote, our AI uses machine learning

0:03:37.320 --> 0:03:39.880
<v Speaker 1>to understand how to write music chord by chord and

0:03:39.960 --> 0:03:42.840
<v Speaker 1>note by note, this means that every track you create

0:03:42.960 --> 0:03:45.880
<v Speaker 1>using Jukedeck is truly unique. And so what they're

0:03:45.920 --> 0:03:50.760
<v Speaker 1>saying is that it truly is ground up auto generated. Um,

0:03:50.840 --> 0:03:54.960
<v Speaker 1>you know, artificial intelligence created music and they're not just

0:03:55.080 --> 0:03:59.400
<v Speaker 1>like working from templates or something like that. But then again,

0:03:59.440 --> 0:04:02.120
<v Speaker 1>I mean, it's hard to know what's really

0:04:02.120 --> 0:04:04.360
<v Speaker 1>going on on the back end, but at least if

0:04:04.400 --> 0:04:08.160
<v Speaker 1>it works as advertised, I'm impressed. Neither one of the

0:04:08.160 --> 0:04:10.840
<v Speaker 1>tracks I heard was, like, great music. You know,

0:04:10.880 --> 0:04:13.920
<v Speaker 1>it wasn't blowing my mind. I wasn't just amazed, oh man,

0:04:13.960 --> 0:04:17.440
<v Speaker 1>that's so good, but it was absolutely passable for its genre,

0:04:17.520 --> 0:04:20.040
<v Speaker 1>totally good enough to be elevator music or the kind

0:04:20.040 --> 0:04:22.040
<v Speaker 1>of music you'd play in a store or something like that.

0:04:22.080 --> 0:04:24.160
<v Speaker 1>I was telling Lauren that it reminded me of the

0:04:24.200 --> 0:04:26.320
<v Speaker 1>kind of music you would encounter in like a small

0:04:26.400 --> 0:04:29.560
<v Speaker 1>independent video game. Like you're playing a little video game

0:04:29.560 --> 0:04:31.599
<v Speaker 1>where clearly it's maybe one or two people who have

0:04:31.680 --> 0:04:34.160
<v Speaker 1>worked on that, and it totally was something that you

0:04:34.160 --> 0:04:37.640
<v Speaker 1>would hear from that. Not bad, not like a spectacular

0:04:37.760 --> 0:04:39.680
<v Speaker 1>entry into the field, as you were saying, Joe, but

0:04:39.720 --> 0:04:43.640
<v Speaker 1>not bad. Um, also, I don't know for a

0:04:43.720 --> 0:04:45.760
<v Speaker 1>fact that you can do this, But I was playing

0:04:45.760 --> 0:04:47.279
<v Speaker 1>with it and it looked like there might even be

0:04:47.360 --> 0:04:50.400
<v Speaker 1>the possibility of combining genres. So I wanted to find

0:04:50.400 --> 0:04:53.839
<v Speaker 1>out what would happen if I did folk electronic with

0:04:53.920 --> 0:04:57.840
<v Speaker 1>a really kind of action oriented music. But I didn't

0:04:57.880 --> 0:04:59.600
<v Speaker 1>get a chance to get far enough into it to

0:04:59.640 --> 0:05:04.120
<v Speaker 1>find out if, in fact, that is possible. Well, we

0:05:04.240 --> 0:05:07.040
<v Speaker 1>have that little example there. And I don't mean to

0:05:07.080 --> 0:05:09.680
<v Speaker 1>say little as in like I'm dismissing it. I

0:05:09.720 --> 0:05:14.719
<v Speaker 1>mean it's a relatively, um, modest approach

0:05:14.800 --> 0:05:17.480
<v Speaker 1>to it, and there are some really intelligent people behind it.

0:05:17.480 --> 0:05:21.520
<v Speaker 1>It's a it's a group of folks from originally Yeah,

0:05:21.640 --> 0:05:26.680
<v Speaker 1>so smart folks. Meanwhile, we also covered another related topic

0:05:26.800 --> 0:05:28.920
<v Speaker 1>in August two thousand and fourteen, we did a show

0:05:28.960 --> 0:05:31.280
<v Speaker 1>called the Future of Art, and we talked about the

0:05:31.320 --> 0:05:34.240
<v Speaker 1>merger of technology and art. And today we're gonna talk

0:05:34.320 --> 0:05:38.080
<v Speaker 1>about a project that's really trying to push this entire

0:05:38.160 --> 0:05:42.040
<v Speaker 1>idea forward, Google Magenta, And of course it's by Google.

0:05:42.120 --> 0:05:44.880
<v Speaker 1>And you're wearing your "Google Fiber is coming" shirt. I

0:05:44.920 --> 0:05:47.680
<v Speaker 1>am. That was not... I didn't think about it when

0:05:47.720 --> 0:05:49.560
<v Speaker 1>I put it on this morning. But yeah, I'm wearing

0:05:49.839 --> 0:05:53.760
<v Speaker 1>my my Google Fiber Georgia shirt. Uh and and I

0:05:53.880 --> 0:05:57.400
<v Speaker 1>understand they're laying Google Fiber along North Avenue right now

0:05:57.839 --> 0:06:00.760
<v Speaker 1>as we speak. Yeah, which is which is right outside

0:06:00.760 --> 0:06:04.600
<v Speaker 1>our office building. In fact, is that what all

0:06:04.600 --> 0:06:07.560
<v Speaker 1>those police cars were escorting this morning, the Google Fiber?

0:06:08.279 --> 0:06:10.760
<v Speaker 1>Escorting the fiber down Ponce de Leon Avenue, right? Like it

0:06:10.800 --> 0:06:14.240
<v Speaker 1>gets like a presidential motorcade level of welcome. One can

0:06:14.279 --> 0:06:17.000
<v Speaker 1>only hope, right, I'm still I'm really holding out hope

0:06:17.000 --> 0:06:19.760
<v Speaker 1>that I get Google Fiber before too long. At any rate,

0:06:20.360 --> 0:06:23.440
<v Speaker 1>you might be wondering what exactly is Google Magenta. Well,

0:06:23.440 --> 0:06:27.240
<v Speaker 1>it's a project that falls under the Google Brain team.

0:06:27.320 --> 0:06:30.800
<v Speaker 1>That's the department within Google dedicated to using machine intelligence

0:06:30.800 --> 0:06:35.679
<v Speaker 1>focused on deep learning, and in deep learning, engineers program

0:06:35.680 --> 0:06:40.000
<v Speaker 1>networks of like virtual neurons that can look at data.

0:06:40.400 --> 0:06:42.720
<v Speaker 1>So let's use, like, images, for example. Okay, the

0:06:43.040 --> 0:06:46.280
<v Speaker 1>neurons can look at a picture and assess some factor

0:06:46.360 --> 0:06:49.320
<v Speaker 1>of it, the shapes or the colors maybe, and then

0:06:49.600 --> 0:06:52.520
<v Speaker 1>the neurons can decide what those shapes or colors remind

0:06:52.520 --> 0:06:57.400
<v Speaker 1>them of and assign that decision a probability. Um, probably

0:06:57.440 --> 0:07:00.800
<v Speaker 1>this is a boat. Maybe it's a banana. Most likely

0:07:00.839 --> 0:07:03.440
<v Speaker 1>it's not a centipede, but it's slightly centipede-ish.

0:07:03.600 --> 0:07:09.200
<v Speaker 1>Would it recognize boats that don't have sails? I mean, you know,

0:07:09.320 --> 0:07:12.400
<v Speaker 1>it all depends on what you fed it to tell it

0:07:12.400 --> 0:07:14.520
<v Speaker 1>it was a boat. Remember, this is very similar

0:07:14.520 --> 0:07:16.680
<v Speaker 1>to that idea we talked about, of feeding all those

0:07:16.720 --> 0:07:20.280
<v Speaker 1>pictures of cats to machine learning, so that eventually the computer,

0:07:20.640 --> 0:07:23.320
<v Speaker 1>without being told this is a cat, starts to learn

0:07:23.440 --> 0:07:26.480
<v Speaker 1>what a cat is. It knows that it's got certain

0:07:26.520 --> 0:07:29.040
<v Speaker 1>features: it's really aloof, and it really doesn't care if

0:07:29.040 --> 0:07:32.640
<v Speaker 1>you live or die. Exactly. So the neurons

0:07:32.680 --> 0:07:36.440
<v Speaker 1>pass this information around to other neurons in their layer,

0:07:36.880 --> 0:07:39.800
<v Speaker 1>and then that layer kind of compiles and passes its

0:07:39.800 --> 0:07:43.200
<v Speaker 1>information up through other layers, and and the system makes

0:07:43.240 --> 0:07:46.560
<v Speaker 1>increasingly educated guesses about what exactly is going on in

0:07:46.560 --> 0:07:48.520
<v Speaker 1>this picture. And as you feed the system more and

0:07:48.520 --> 0:07:51.480
<v Speaker 1>more pictures, it makes more and more associations with particular

0:07:51.480 --> 0:07:55.880
<v Speaker 1>shapes and colors, so it learns right. And we also

0:07:55.920 --> 0:07:59.920
<v Speaker 1>talked about a related, uh, project that came out of

0:08:00.000 --> 0:08:03.400
<v Speaker 1>this group, Deep Dream. Oh, this is the project that

0:08:03.520 --> 0:08:06.600
<v Speaker 1>turned my dog into a big old mess of caterpillars.

0:08:07.120 --> 0:08:10.239
<v Speaker 1>It turned me into a big old mess of dogs. Yeah.

0:08:10.240 --> 0:08:12.120
<v Speaker 1>This is This also came out of the Brain Team group.

0:08:12.160 --> 0:08:15.320
<v Speaker 1>That's the artificial neural network project that teaches computers how

0:08:15.360 --> 0:08:18.600
<v Speaker 1>to recognize patterns out of visual data, even when the

0:08:18.600 --> 0:08:21.840
<v Speaker 1>patterns aren't necessarily there. It's not it's not too different

0:08:21.880 --> 0:08:24.920
<v Speaker 1>from when you look up at the clouds and you say, oh,

0:08:24.960 --> 0:08:27.680
<v Speaker 1>there's a it's very like a whale. Right, So we

0:08:27.880 --> 0:08:31.480
<v Speaker 1>were teaching computer programs how to hallucinate. Yeah, pretty much.

0:08:31.720 --> 0:08:34.480
<v Speaker 1>And we talked about that in an episode called Deep

0:08:34.640 --> 0:08:37.960
<v Speaker 1>Dreaming with Google and that published in July two thousand fifteen.

0:08:38.280 --> 0:08:41.400
<v Speaker 1>And basically what is going on with Deep Dream

0:08:41.480 --> 0:08:43.520
<v Speaker 1>here is that instead of just making a guess that

0:08:43.559 --> 0:08:47.320
<v Speaker 1>a picture contained a dog, for example, um, it changed

0:08:47.360 --> 0:08:50.200
<v Speaker 1>the image so that everything that appeared to be a

0:08:50.280 --> 0:08:53.280
<v Speaker 1>dog in it would look more dog like. So like

0:08:53.320 --> 0:08:57.240
<v Speaker 1>a layer would say, well, this this is probably a

0:08:57.320 --> 0:08:59.800
<v Speaker 1>dog based on the shapes in it, So let's enhance

0:09:00.160 --> 0:09:04.200
<v Speaker 1>the doggie shapes to really emphasize the dogginess. And and

0:09:04.240 --> 0:09:08.520
<v Speaker 1>then as that extrapolated image is passed through Deep Dream's layers,

0:09:08.559 --> 0:09:11.520
<v Speaker 1>each one emphasizes whatever it guesses is going on in

0:09:11.559 --> 0:09:14.439
<v Speaker 1>the photo, and yes, frequently what it guesses is going

0:09:14.440 --> 0:09:16.880
<v Speaker 1>on in the photo is a nightmarescape of dog faces

0:09:16.920 --> 0:09:18.920
<v Speaker 1>as far as the eye can see. Right. So well,

0:09:18.960 --> 0:09:21.680
<v Speaker 1>I think the most popular ones were definitely the animal

0:09:21.720 --> 0:09:24.800
<v Speaker 1>recognition ones, but you could tweak the algorithm to recognize

0:09:24.840 --> 0:09:27.520
<v Speaker 1>all kinds of stuff. And it was because they

0:09:27.520 --> 0:09:31.000
<v Speaker 1>had started with just feeding so many different animal images.

0:09:31.040 --> 0:09:33.120
<v Speaker 1>That was sort of their starting

0:09:33.120 --> 0:09:35.520
<v Speaker 1>point over at Google when they were training this how

0:09:35.559 --> 0:09:38.400
<v Speaker 1>to recognize different visual patterns. So if a fold in

0:09:38.440 --> 0:09:43.200
<v Speaker 1>your clothing looked even remotely similar to a dog, guess

0:09:43.280 --> 0:09:48.880
<v Speaker 1>what you're wearing? Dogs? Now, according to arms, your arms

0:09:48.880 --> 0:09:52.120
<v Speaker 1>have bugs in them, your shirt and your shoulders

0:09:52.200 --> 0:09:55.560
<v Speaker 1>are dog heads, your dog actually is not a dog but

0:09:55.600 --> 0:09:58.200
<v Speaker 1>a bunch of cats taped together. Yeah. I like that.

0:09:58.520 --> 0:10:02.200
<v Speaker 1>Everything became very dogly for a while. But one

0:10:02.200 --> 0:10:04.280
<v Speaker 1>of the other projects under the Brain team is one

0:10:04.320 --> 0:10:06.800
<v Speaker 1>we're going to mention a little bit later called TensorFlow,

0:10:06.840 --> 0:10:09.319
<v Speaker 1>which is a machine learning engine, and it's an open

0:10:09.360 --> 0:10:13.440
<v Speaker 1>source project, meaning lots of people anyone really can access

0:10:13.440 --> 0:10:16.720
<v Speaker 1>those tools and not just access them, but tweak them,

0:10:16.760 --> 0:10:20.480
<v Speaker 1>improve them, evolve them, and and grow them, and that's

0:10:20.480 --> 0:10:25.120
<v Speaker 1>really an interesting and potentially exciting development in machine learning. Yeah,

0:10:25.160 --> 0:10:28.360
<v Speaker 1>but you can see how projects like this might

0:10:28.480 --> 0:10:33.199
<v Speaker 1>be sort of evolving toward ultimately generative powers in AI,

0:10:33.320 --> 0:10:37.560
<v Speaker 1>not just recognition and modification of visual images and other

0:10:37.600 --> 0:10:42.319
<v Speaker 1>types of sensory input and data, but but actually building

0:10:42.400 --> 0:10:45.559
<v Speaker 1>things from the ground up, recognizing patterns and saying I've

0:10:45.559 --> 0:10:47.600
<v Speaker 1>got enough sense of what the patterns are that I

0:10:47.600 --> 0:10:50.240
<v Speaker 1>can make one on my own. Right, exactly. Like, you

0:10:50.240 --> 0:10:53.520
<v Speaker 1>no longer are taking a, uh, you know,

0:10:53.960 --> 0:10:58.199
<v Speaker 1>an image or a concept and then saying, here, enhance

0:10:58.320 --> 0:11:01.680
<v Speaker 1>this or alter this in some way or recognize it.

0:11:01.679 --> 0:11:03.560
<v Speaker 1>And you're now saying, hey, you know what one of

0:11:03.559 --> 0:11:06.520
<v Speaker 1>those happens to look like? Make one? Yeah, And that's

0:11:06.920 --> 0:11:09.559
<v Speaker 1>that's a big leap, right, that's a huge leap. And

0:11:09.920 --> 0:11:12.240
<v Speaker 1>you can do the same thing. Somebody could say,

0:11:12.280 --> 0:11:14.200
<v Speaker 1>tell me the plot of a James Bond movie that

0:11:14.240 --> 0:11:16.600
<v Speaker 1>doesn't exist. You've seen enough of them. You know, there's

0:11:16.600 --> 0:11:19.520
<v Speaker 1>a standard pattern. You gotta have the gadgets, you've got

0:11:19.640 --> 0:11:24.160
<v Speaker 1>to have the you know the yeah, yeah, yeah, you

0:11:24.240 --> 0:11:26.520
<v Speaker 1>gotta have a pit of sharks or something. Like,

0:11:26.520 --> 0:11:28.360
<v Speaker 1>you can put the pieces together. You gotta have at

0:11:28.440 --> 0:11:31.960
<v Speaker 1>least one one lady who is pretty much a good

0:11:31.960 --> 0:11:33.679
<v Speaker 1>guy and one lady who's pretty much a bad guy,

0:11:33.679 --> 0:11:35.680
<v Speaker 1>and Bond's gotta mess with both of them. You know,

0:11:35.720 --> 0:11:40.679
<v Speaker 1>there's certain rules that you probably... Both? What's that,

0:11:41.040 --> 0:11:45.079
<v Speaker 1>"messed with"? Hey, I'm being very, uh, you know, being

0:11:45.160 --> 0:11:49.400
<v Speaker 1>very family friendly here, so like James Bond, Yes, exactly,

0:11:49.480 --> 0:11:53.200
<v Speaker 1>James Bond, so family friendly. So the Magenta project is

0:11:53.240 --> 0:11:58.439
<v Speaker 1>aimed at developing artificial intelligence capable of actually creating art,

0:11:58.679 --> 0:12:02.719
<v Speaker 1>not not just altering something so that it looks like art,

0:12:02.920 --> 0:12:06.360
<v Speaker 1>but to create art, both visual art and music. Now,

0:12:06.400 --> 0:12:09.079
<v Speaker 1>the official launch date of Magenta is June one, two

0:12:09.080 --> 0:12:12.320
<v Speaker 1>thousand sixteen. We are recording this on May twenty six,

0:12:12.480 --> 0:12:15.000
<v Speaker 1>two thousand sixteen, so it has not launched as of

0:12:15.040 --> 0:12:17.800
<v Speaker 1>the time we're recording this, UM, but we wanted to

0:12:17.840 --> 0:12:20.400
<v Speaker 1>kind of talk about it, and it was just recently

0:12:20.480 --> 0:12:24.439
<v Speaker 1>announced to the world, although not officially unveiled. Douglas Eck,

0:12:24.480 --> 0:12:28.120
<v Speaker 1>who's working on the project, announced Magenta at Moogfest,

0:12:29.040 --> 0:12:33.959
<v Speaker 1>I assume named after the Moog synthesizer, which everybody loves,

0:12:34.360 --> 0:12:36.920
<v Speaker 1>and um that's actually a music and technology festival that

0:12:36.920 --> 0:12:39.000
<v Speaker 1>takes place in North Carolina, so it was not too

0:12:39.040 --> 0:12:42.760
<v Speaker 1>far away. Eck also stressed that while they're working on

0:12:42.800 --> 0:12:45.120
<v Speaker 1>this project and while they have high hopes for it,

0:12:45.320 --> 0:12:48.520
<v Speaker 1>he says, AI is still a very long way from

0:12:48.520 --> 0:12:52.360
<v Speaker 1>creating long narrative arcs. So it's not like this is

0:12:52.400 --> 0:12:54.720
<v Speaker 1>going to be, you know, within a year, we're going

0:12:54.760 --> 0:12:57.280
<v Speaker 1>to have computers writing the next great American novel or

0:12:57.320 --> 0:12:59.560
<v Speaker 1>anything along those lines, but that this is the first

0:12:59.559 --> 0:13:04.120
<v Speaker 1>step toward computers doing creative problem solving and

0:13:04.280 --> 0:13:06.839
<v Speaker 1>moving to a point where that problem solving isn't about

0:13:07.000 --> 0:13:11.480
<v Speaker 1>tackling a question, but about creating something new, like music

0:13:11.640 --> 0:13:15.240
<v Speaker 1>or a painting or anything along those lines, or even video. Well,

0:13:15.280 --> 0:13:18.760
<v Speaker 1>I mean, I think something literary and long and coherent

0:13:18.920 --> 0:13:21.480
<v Speaker 1>like a novel would would be one of the most

0:13:21.480 --> 0:13:25.680
<v Speaker 1>difficult things because that involves just the most uh what

0:13:25.800 --> 0:13:29.720
<v Speaker 1>might you call it, semantically diverse array of things that

0:13:29.760 --> 0:13:32.400
<v Speaker 1>you're working with. I mean, a lot of moving parts,

0:13:33.320 --> 0:13:35.439
<v Speaker 1>not just the words that have to make

0:13:35.480 --> 0:13:38.719
<v Speaker 1>sense in sentences and in paragraphs and in chapters, but

0:13:38.800 --> 0:13:42.480
<v Speaker 1>also character development, character motivation. I mean, it's not even

0:13:42.880 --> 0:13:47.040
<v Speaker 1>that doesn't even make sense to a computer, world building, etcetera. Right, Yeah, exactly.

0:13:47.040 --> 0:13:48.840
<v Speaker 1>There are a lot of things, a lot of elements

0:13:48.880 --> 0:13:53.360
<v Speaker 1>in creating a long narrative that would take a

0:13:53.400 --> 0:13:56.280
<v Speaker 1>long long time to teach a computer. What does this mean?

0:13:56.440 --> 0:13:58.280
<v Speaker 1>I mean you can imagine that a book written by

0:13:58.320 --> 0:14:00.920
<v Speaker 1>a computer before it has a full grasp on that,

0:14:01.160 --> 0:14:03.840
<v Speaker 1>it could end up being incredibly dull. You're just you're

0:14:03.840 --> 0:14:07.720
<v Speaker 1>reading a very mundane account of a person, or

0:14:07.720 --> 0:14:10.560
<v Speaker 1>it could be the opposite. It could be like well,

0:14:10.600 --> 0:14:12.520
<v Speaker 1>it could be like a Dan Brown novel where every

0:14:12.559 --> 0:14:15.240
<v Speaker 1>page ends with a cliffhanger and you think I need

0:14:15.320 --> 0:14:19.720
<v Speaker 1>a break. You know, I was looking up Douglas Eck,

0:14:19.720 --> 0:14:21.560
<v Speaker 1>so I was reading a little about him on his

0:14:21.840 --> 0:14:24.880
<v Speaker 1>Google research page, and one of the things he had

0:14:24.880 --> 0:14:30.200
<v Speaker 1>previously worked on was content... What would you call it?

0:14:30.360 --> 0:14:33.800
<v Speaker 1>Uh oh, I'm losing the word for it. He worked

0:14:33.800 --> 0:14:37.440
<v Speaker 1>on Google Play Music, delivering you the kind of music

0:14:37.560 --> 0:14:41.280
<v Speaker 1>you would want. There's a term for that, curation. Curation sure, yeah,

0:14:41.320 --> 0:14:44.440
<v Speaker 1>based on yeah, yeah, yeah, based on what you like

0:14:44.560 --> 0:14:47.320
<v Speaker 1>to listen to. Figuring out, Okay, what's the other type

0:14:47.360 --> 0:14:50.200
<v Speaker 1>of music you haven't heard yet that fits in with

0:14:50.280 --> 0:14:53.360
<v Speaker 1>the profile of the stuff that you like, which is,

0:14:53.680 --> 0:14:56.000
<v Speaker 1>you know, that's kind of a tough job, because how

0:14:56.000 --> 0:14:58.600
<v Speaker 1>does the computer know that one song? Oh you know,

0:14:58.880 --> 0:15:02.400
<v Speaker 1>if Jonathan really likes this song by They Might Be Giants,

0:15:02.480 --> 0:15:05.920
<v Speaker 1>He'll probably really also like this song by Slayer. Yeah

0:15:06.080 --> 0:15:08.200
<v Speaker 1>I think that. Yeah, they're like, hey, like like you

0:15:08.240 --> 0:15:11.200
<v Speaker 1>like rock and roll Joe, so probably you like this

0:15:11.400 --> 0:15:15.920
<v Speaker 1>Nickelback song which is rock and roll. Yes, yeah, yes,

0:15:16.080 --> 0:15:18.720
<v Speaker 1>So at least with the Music Genome Project, which is

0:15:18.760 --> 0:15:21.600
<v Speaker 1>what Pandora is based off of, the way that works

0:15:21.720 --> 0:15:25.360
<v Speaker 1>is they have human beings who meta tag every song

0:15:25.400 --> 0:15:29.080
<v Speaker 1>with every kind of descriptor that would be relevant to

0:15:29.200 --> 0:15:33.400
<v Speaker 1>that particular song, and then the algorithm starts looking for

0:15:33.480 --> 0:15:37.440
<v Speaker 1>other songs that have several of those same meta tags

0:15:37.480 --> 0:15:39.840
<v Speaker 1>associated with it and say yeah, yeah, and so it's

0:15:39.880 --> 0:15:43.080
<v Speaker 1>going like you like melancholy guitar solos because you listened

0:15:43.160 --> 0:15:45.800
<v Speaker 1>to the Decemberists, so probably you're going to like

0:15:45.880 --> 0:15:49.720
<v Speaker 1>Dashboard Confessional. Right. So that may be similar to

0:15:49.760 --> 0:15:52.400
<v Speaker 1>the way Google Play Music does it. I don't know,

0:15:52.440 --> 0:15:56.400
<v Speaker 1>because I don't know. He made it sound more like

0:15:56.480 --> 0:15:59.200
<v Speaker 1>this was a more automated process. That's really interesting,

0:15:59.280 --> 0:16:02.520
<v Speaker 1>more difficult. Yeah, yeah, you're not having a human hold

0:16:02.560 --> 0:16:05.200
<v Speaker 1>your hand taking you through Okay, now this is what

0:16:05.320 --> 0:16:11.240
<v Speaker 1>moody sounds like. This is melancholy, right, that's really interesting. Well.

0:16:12.000 --> 0:16:15.760
<v Speaker 1>At the Moogfest conference, Magenta team member Adam

0:16:15.880 --> 0:16:19.280
<v Speaker 1>Roberts showed off a digital synthesizer program where he could

0:16:19.320 --> 0:16:21.240
<v Speaker 1>feed a few musical notes into it. I think it

0:16:21.240 --> 0:16:24.520
<v Speaker 1>was a sequence of four notes, and then allowed the

0:16:24.560 --> 0:16:27.800
<v Speaker 1>program to build a melody off those basic notes. And

0:16:27.840 --> 0:16:30.600
<v Speaker 1>what I'm picturing, kind of going off the Deep

0:16:30.680 --> 0:16:33.160
<v Speaker 1>Dream concept, is, you know, like, okay, well, these notes

0:16:33.680 --> 0:16:36.240
<v Speaker 1>sound like the beginning of a big band slow dance,

0:16:36.240 --> 0:16:38.080
<v Speaker 1>So I'll just add more notes to make it look

0:16:38.280 --> 0:16:40.760
<v Speaker 1>more like that. Sound more like that. Yeah. The the

0:16:40.920 --> 0:16:44.560
<v Speaker 1>example that they showed in the in the actual festival,

0:16:44.560 --> 0:16:47.760
<v Speaker 1>when I when I watched it, I thought, ah, they

0:16:47.800 --> 0:16:51.360
<v Speaker 1>picked a bad four initial notes. Just because you

0:16:51.440 --> 0:16:53.360
<v Speaker 1>listen to it and you're like, to me, that sounds

0:16:53.400 --> 0:16:57.600
<v Speaker 1>like just someone aimlessly plunking on a synthesizer keyboard. It

0:16:57.600 --> 0:17:01.040
<v Speaker 1>didn't sound like someone actually creating melody. But I would

0:17:01.120 --> 0:17:05.119
<v Speaker 1>argue that really, in this case, the way this works,

0:17:05.680 --> 0:17:08.600
<v Speaker 1>the final tune is really only going to be

0:17:08.640 --> 0:17:13.000
<v Speaker 1>as good as the initial input you give to the computer.

0:17:13.160 --> 0:17:16.840
<v Speaker 1>Because it's not creating it out of whole cloth. It's

0:17:16.880 --> 0:17:19.760
<v Speaker 1>taking a foundation and then building upon it. If the

0:17:19.800 --> 0:17:22.240
<v Speaker 1>foundation is faulty, then you can't really expect the rest

0:17:22.280 --> 0:17:26.040
<v Speaker 1>of it to be awesome. Um So, but maybe that

0:17:26.119 --> 0:17:28.440
<v Speaker 1>was just me. I also don't have the best ear

0:17:28.760 --> 0:17:33.080
<v Speaker 1>so perhaps I'm being particularly harsh. Also, this this is

0:17:33.119 --> 0:17:35.760
<v Speaker 1>an early prototype that he's showing off, right, Yeah, and

0:17:35.800 --> 0:17:38.120
<v Speaker 1>maybe what we see on June one will

0:17:38.160 --> 0:17:43.400
<v Speaker 1>be a more comprehensive demonstration of the abilities of Magenta.

0:17:43.480 --> 0:17:49.440
<v Speaker 1>And keep in mind that Magenta... "We had it compose

0:17:49.440 --> 0:17:54.119
<v Speaker 1>this song and destroy these four countries." What, what did

0:17:54.240 --> 0:17:57.719
<v Speaker 1>you do? Now, um, we think, uh, for one thing,

0:17:57.760 --> 0:18:00.359
<v Speaker 1>Magenta is supposed to be an ongoing project, right. This

0:18:00.440 --> 0:18:03.520
<v Speaker 1>is not something that's a fully fleshed out product and

0:18:03.600 --> 0:18:06.080
<v Speaker 1>they're going to reveal it on June one for people

0:18:06.160 --> 0:18:10.399
<v Speaker 1>to play with. It's more like, here's the concept behind

0:18:10.400 --> 0:18:12.960
<v Speaker 1>the project, here's how we're going to try and accomplish

0:18:13.000 --> 0:18:16.159
<v Speaker 1>our goals. Here's where we are now. That's more likely

0:18:16.280 --> 0:18:20.160
<v Speaker 1>to be the announcement on June one. So, uh, Eck's

0:18:20.280 --> 0:18:24.520
<v Speaker 1>hope is that by feeding enough musical information to Magenta,

0:18:24.720 --> 0:18:28.000
<v Speaker 1>enough songs, in other words, it will be able to

0:18:28.000 --> 0:18:31.359
<v Speaker 1>produce its own music that is esthetically pleasing to human

0:18:31.440 --> 0:18:36.639
<v Speaker 1>like people persons, and that those yeah, I mean that

0:18:36.760 --> 0:18:39.520
<v Speaker 1>my list is pretty small, but I know a couple.

0:18:39.880 --> 0:18:42.800
<v Speaker 1>So the program first has to learn what makes music work?

0:18:42.880 --> 0:18:46.720
<v Speaker 1>What are the rules of music? So what are the

0:18:46.720 --> 0:18:49.280
<v Speaker 1>sort of things that we like to listen to? And

0:18:49.320 --> 0:18:51.879
<v Speaker 1>once you know those rules, when is it okay or

0:18:51.960 --> 0:18:55.639
<v Speaker 1>even preferable to break those rules? When is

0:18:55.640 --> 0:18:59.240
<v Speaker 1>it all right to stray from the conventions of any

0:18:59.240 --> 0:19:02.680
<v Speaker 1>particular musical genre and do so in a way that's

0:19:02.760 --> 0:19:06.560
<v Speaker 1>interesting and and maybe it's pleasing, maybe it's maybe it's

0:19:06.600 --> 0:19:09.200
<v Speaker 1>not pleasing, but it's the sort of thing that catches

0:19:09.240 --> 0:19:12.639
<v Speaker 1>your attention and that's what makes the music really stand

0:19:12.640 --> 0:19:16.280
<v Speaker 1>out to you. Right, So these are not easy concepts,

0:19:16.320 --> 0:19:19.080
<v Speaker 1>even for musicians, like human musicians it you know, it

0:19:19.080 --> 0:19:22.000
<v Speaker 1>can take years of study to really understand musical theory

0:19:22.480 --> 0:19:27.280
<v Speaker 1>and be able to craft something that is most likely

0:19:27.320 --> 0:19:30.600
<v Speaker 1>to evoke the reaction you hope to get from your audience.

0:19:31.160 --> 0:19:33.920
<v Speaker 1>You know, I mean, I know, I've written a couple

0:19:33.960 --> 0:19:38.480
<v Speaker 1>of songs. They're terrible, uh, but at any rate. So

0:19:38.640 --> 0:19:40.800
<v Speaker 1>music is just the first type of art that Magenta

0:19:40.880 --> 0:19:43.159
<v Speaker 1>is going to tackle. They're going to actually use the

0:19:43.200 --> 0:19:45.560
<v Speaker 1>same sort of approach to generate images and perhaps even

0:19:45.720 --> 0:19:49.119
<v Speaker 1>video in the future. They mentioned eventually text too. Yeah,

0:19:49.160 --> 0:19:51.639
<v Speaker 1>so again getting to that point where maybe we can

0:19:51.640 --> 0:19:53.479
<v Speaker 1>get to that narrative arc. And this also kind of

0:19:53.480 --> 0:19:55.880
<v Speaker 1>feeds into when we were talking of our I don't

0:19:55.880 --> 0:19:57.760
<v Speaker 1>know if we've ever talked about on the podcast, but

0:19:57.840 --> 0:20:01.320
<v Speaker 1>Google recently in the news is uh, a lot of

0:20:01.320 --> 0:20:03.920
<v Speaker 1>people poked fun at Google because it was feeding romance

0:20:03.960 --> 0:20:07.640
<v Speaker 1>novels to its, um, digital assistant, so it would learn

0:20:07.680 --> 0:20:11.560
<v Speaker 1>better about how people converse. The important thing being that

0:20:11.880 --> 0:20:16.159
<v Speaker 1>romance novels tend to follow a very similar uh uh

0:20:16.720 --> 0:20:20.480
<v Speaker 1>pattern, right? They're very formulaic, but different romance

0:20:20.520 --> 0:20:23.399
<v Speaker 1>novels say the same thing in different ways. And so

0:20:23.960 --> 0:20:26.600
<v Speaker 1>the hope is that by feeding this kind of information,

0:20:26.640 --> 0:20:28.880
<v Speaker 1>and romance novels were just one genre that were fed

0:20:28.880 --> 0:20:30.680
<v Speaker 1>to it, but everyone focused on it because of course

0:20:30.680 --> 0:20:33.840
<v Speaker 1>it's funny. You know, this idea that your digital assistant

0:20:34.240 --> 0:20:38.480
<v Speaker 1>is going to be making some very sassy recommendations to

0:20:38.520 --> 0:20:41.600
<v Speaker 1>you or explain the weather in ways that are probably inappropriate.

0:20:41.920 --> 0:20:44.240
<v Speaker 1>But the idea being that by feeding all this information

0:20:44.320 --> 0:20:47.280
<v Speaker 1>using machine learning, that you would be able to get

0:20:47.400 --> 0:20:51.600
<v Speaker 1>your finished product to be more capable of interacting with

0:20:51.640 --> 0:20:55.480
<v Speaker 1>people using natural language, very similar to what Magenta is doing,

0:20:55.520 --> 0:20:57.800
<v Speaker 1>except in that case it's music and art. Right. You're

0:20:57.840 --> 0:21:01.040
<v Speaker 1>feeding more and more music in to Magenta so that

0:21:01.119 --> 0:21:04.879
<v Speaker 1>it has a quote unquote understanding of what music is

0:21:04.920 --> 0:21:08.520
<v Speaker 1>and can more likely produce something that is similar to

0:21:08.560 --> 0:21:11.359
<v Speaker 1>what a human would make without it actually just copying

0:21:12.080 --> 0:21:15.280
<v Speaker 1>something that a human has already made. Yeah, I mean

0:21:15.320 --> 0:21:19.720
<v Speaker 1>I wonder sort of what the what the end goal, like,

0:21:19.760 --> 0:21:22.720
<v Speaker 1>what the expectation is here, because like, on one hand,

0:21:22.760 --> 0:21:25.359
<v Speaker 1>if we're to believe what's supposedly going on in the

0:21:25.359 --> 0:21:28.240
<v Speaker 1>back end that creates these tracks we listened to at

0:21:28.359 --> 0:21:31.359
<v Speaker 1>Jukedeck, I feel like, here's the system that is

0:21:31.400 --> 0:21:36.160
<v Speaker 1>already creating perfectly passable music. Again, like I said, it's

0:21:36.160 --> 0:21:39.520
<v Speaker 1>it's not amazing. So I'm wondering, is is Magenta aiming

0:21:39.560 --> 0:21:44.080
<v Speaker 1>to create music that's really going to be amazing and

0:21:44.119 --> 0:21:47.399
<v Speaker 1>people will be like, wow, I love that song. I

0:21:47.440 --> 0:21:50.879
<v Speaker 1>think Magenta's not to put words into the mouths of

0:21:50.920 --> 0:21:54.760
<v Speaker 1>the people in the project, but I think Magenta, by

0:21:55.000 --> 0:22:01.240
<v Speaker 1>using a very specific goal, is really about driving machine

0:22:01.320 --> 0:22:04.000
<v Speaker 1>learning further. Yeah, so I agree. I think that the

0:22:04.119 --> 0:22:07.199
<v Speaker 1>art is absolutely secondary and kind of the headline grabber,

0:22:07.440 --> 0:22:11.000
<v Speaker 1>right? Um, and that the other applications, like

0:22:11.080 --> 0:22:13.320
<v Speaker 1>in voice recognition or something like that, are what

0:22:13.359 --> 0:22:15.840
<v Speaker 1>they're really aiming for. Right. So a great

0:22:15.840 --> 0:22:18.080
<v Speaker 1>example of this would be looking at the private space

0:22:18.160 --> 0:22:22.240
<v Speaker 1>industry and saying, you know, you've got some pretty big,

0:22:22.280 --> 0:22:25.360
<v Speaker 1>big ideas, how do you focus that in a way

0:22:25.359 --> 0:22:28.560
<v Speaker 1>where you can actually engineer toward a solution? And then

0:22:28.600 --> 0:22:31.359
<v Speaker 1>you just start taking specific questions like how do we

0:22:31.440 --> 0:22:35.000
<v Speaker 1>make sure that astronauts can can breathe in space? And

0:22:35.040 --> 0:22:36.960
<v Speaker 1>you take that first question and you start trying to

0:22:37.000 --> 0:22:40.000
<v Speaker 1>solve that and and it ends up being that it's

0:22:40.080 --> 0:22:42.720
<v Speaker 1>one part of a much bigger picture. I think that

0:22:42.720 --> 0:22:44.840
<v Speaker 1>that's what we're going to see with Magenta. Yeah,

0:22:44.880 --> 0:22:47.560
<v Speaker 1>I'd agree. I mean, I just wondered about the art itself,

0:22:47.640 --> 0:22:50.400
<v Speaker 1>like how what do they think it's gonna be? Like,

0:22:50.600 --> 0:22:53.720
<v Speaker 1>I'm very curious about that too. And obviously since we're

0:22:53.760 --> 0:22:56.879
<v Speaker 1>so early into the project, it's hard to say. I

0:22:56.920 --> 0:22:59.560
<v Speaker 1>would love to be able to to revisit this. In fact,

0:22:59.560 --> 0:23:02.560
<v Speaker 1>maybe we will be able to revisit this, uh sometime

0:23:02.600 --> 0:23:05.119
<v Speaker 1>into the future and listen to some of the stuff

0:23:05.119 --> 0:23:07.560
<v Speaker 1>Magenta has produced and say, does this sound like a

0:23:07.640 --> 0:23:09.800
<v Speaker 1>human being made it? Or you know, the fact that

0:23:09.800 --> 0:23:11.679
<v Speaker 1>we know a computer made it, does that change what

0:23:11.760 --> 0:23:14.760
<v Speaker 1>we feel about it? And anyway you might wonder how

0:23:14.760 --> 0:23:17.680
<v Speaker 1>the heck is this thing working well? To to kind

0:23:17.680 --> 0:23:22.080
<v Speaker 1>of build upon what Lauren's point was with the neural networks. Uh,

0:23:22.080 --> 0:23:25.640
<v Speaker 1>they're specifically using this set of tools called TensorFlow, which

0:23:25.680 --> 0:23:29.479
<v Speaker 1>falls into that category. UM, that's the open source machine

0:23:29.560 --> 0:23:33.280
<v Speaker 1>learning set of tools. UH. And generally speaking, first, it's

0:23:33.280 --> 0:23:36.280
<v Speaker 1>going to accept MIDI files, that's a very common version

0:23:36.320 --> 0:23:39.680
<v Speaker 1>of music files. UH. And that's going to be submitted

0:23:39.680 --> 0:23:41.640
<v Speaker 1>by a community of contributors to teach itself the basic

0:23:41.720 --> 0:23:45.080
<v Speaker 1>rules and concepts around music. But TensorFlow itself, according to

0:23:45.080 --> 0:23:48.280
<v Speaker 1>its web page, is an open source software library for

0:23:48.400 --> 0:23:51.840
<v Speaker 1>numerical computation using data flow graphs. Nodes in the graph

0:23:51.920 --> 0:23:55.760
<v Speaker 1>represent mathematical operations, while the graph edges represent the multidimensional

0:23:55.840 --> 0:23:59.840
<v Speaker 1>data arrays or tensors communicated between them. The flexible architecture

0:24:00.000 --> 0:24:02.520
<v Speaker 1>allows you to deploy computation to one or more

0:24:02.640 --> 0:24:05.760
<v Speaker 1>CPUs or GPUs in a desktop, server, or mobile device

0:24:05.800 --> 0:24:09.480
<v Speaker 1>with a single API. Sounds fantastic to me, couldn't be

0:24:10.000 --> 0:24:13.600
<v Speaker 1>more transparent or simple. So I mean, I think the

0:24:13.680 --> 0:24:17.400
<v Speaker 1>simple version is just that TensorFlow is. They say it's

0:24:17.440 --> 0:24:20.760
<v Speaker 1>open source software for machine intelligence. Yes, and it's and

0:24:20.840 --> 0:24:25.480
<v Speaker 1>it's essentially modeled after, at least inspired by, rather the

0:24:25.560 --> 0:24:30.320
<v Speaker 1>way brains work. Yeah. Now, TensorFlow's particularly well suited

0:24:30.359 --> 0:24:33.800
<v Speaker 1>to process information for visual analysis and recognition as well

0:24:33.840 --> 0:24:37.080
<v Speaker 1>as speech recognition. So that's that's what they were primarily

0:24:37.119 --> 0:24:39.720
<v Speaker 1>intending it for when they made it open source and

0:24:39.800 --> 0:24:43.440
<v Speaker 1>said that different developers could use this kind of tool

0:24:43.440 --> 0:24:48.480
<v Speaker 1>set in order to give their various apps or projects

0:24:48.520 --> 0:24:51.840
<v Speaker 1>the capabilities that they would need, uh, in order to

0:24:51.880 --> 0:24:55.359
<v Speaker 1>process visual information or to do some sort of speech

0:24:55.359 --> 0:25:01.760
<v Speaker 1>recognition or speech activation kind of process. Now, the idea is

0:25:01.800 --> 0:25:04.879
<v Speaker 1>that developers can use these tools to create the stuff

0:25:04.880 --> 0:25:07.560
<v Speaker 1>they need. They can test the stuff that they have

0:25:07.760 --> 0:25:10.320
<v Speaker 1>planned and see if it works, and if it does work,

0:25:10.440 --> 0:25:12.560
<v Speaker 1>they don't have to do any more new code. They

0:25:12.560 --> 0:25:16.639
<v Speaker 1>can actually use the code from TensorFlow as is and incorporate it

0:25:16.720 --> 0:25:20.320
<v Speaker 1>into their project. Um, and like I said, it's

0:25:20.359 --> 0:25:22.560
<v Speaker 1>not a complete set. It's something that will continue to

0:25:22.640 --> 0:25:26.800
<v Speaker 1>evolve over time as people find shortcuts or they might

0:25:26.800 --> 0:25:28.879
<v Speaker 1>find a more elegant way to do something or a

0:25:28.880 --> 0:25:31.400
<v Speaker 1>more robust way, which is part of why it's open source.

0:25:31.440 --> 0:25:34.240
<v Speaker 1>I'm sure that they're hoping that the people who want

0:25:34.280 --> 0:25:35.880
<v Speaker 1>to get in there and work with it will also

0:25:35.920 --> 0:25:39.639
<v Speaker 1>want to improve it. Right, so uh and and again

0:25:39.680 --> 0:25:42.199
<v Speaker 1>going back to Lauren's point about probabilities, it really is

0:25:42.280 --> 0:25:47.200
<v Speaker 1>all about assigning probabilities, assessing those probabilities, refining them so

0:25:47.240 --> 0:25:52.120
<v Speaker 1>that you start looking at your individual options and determining

0:25:52.160 --> 0:25:55.200
<v Speaker 1>which option is the best out of all of them.

0:25:55.240 --> 0:25:57.920
<v Speaker 1>This might sound familiar to you if you remember IBM's

0:25:57.920 --> 0:26:00.359
<v Speaker 1>Watson when it was on Jeopardy. There was a big

0:26:00.480 --> 0:26:03.440
<v Speaker 1>bit about how does it come up with the answers?

0:26:03.440 --> 0:26:06.200
<v Speaker 1>How does it know what the answer is? Well, IBM's

0:26:06.240 --> 0:26:10.000
<v Speaker 1>Watson would receive the clue and then it would come

0:26:10.080 --> 0:26:13.320
<v Speaker 1>up with potential answers and assign each one a probability

0:26:13.359 --> 0:26:16.520
<v Speaker 1>of how certain it is that's the correct answer. And

0:26:16.560 --> 0:26:19.760
<v Speaker 1>if the probability was above a certain threshold, that is

0:26:20.080 --> 0:26:23.320
<v Speaker 1>when Watson would buzz in and and send that message in.

0:26:23.840 --> 0:26:25.840
<v Speaker 1>And I think it was something like an eight percent

0:26:26.480 --> 0:26:29.280
<v Speaker 1>uh certainty, something along those lines. It was somewhere around there.
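
NOTE
A minimal sketch of the "only answer above a confidence threshold" idea described here. It is an illustration under assumptions, not IBM's actual Watson code: the function name pick_answer, the candidate scores, and the 0.5 threshold are all hypothetical.
```python
from typing import Optional
def pick_answer(candidates: dict[str, float], threshold: float) -> Optional[str]:
    """Return the most confident candidate, or None if it falls below the threshold."""
    if not candidates:
        return None
    # Choose the candidate with the highest confidence score.
    best = max(candidates, key=candidates.get)
    # Only "buzz in" when that confidence clears the threshold.
    return best if candidates[best] >= threshold else None
# Hypothetical confidence scores for one clue; not real Watson output.
scores = {"Chicago": 0.12, "Toronto": 0.31, "Boston": 0.05}
answer = pick_answer(scores, threshold=0.5)
print(answer if answer is not None else "stay silent")  # prints "stay silent"
```
With these made-up scores the best candidate sits below the threshold, so the sketch declines to answer, which mirrors the description of when Watson would or would not buzz in.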

0:26:29.680 --> 0:26:31.760
<v Speaker 1>And this is very similar to when we were talking

0:26:31.760 --> 0:26:34.880
<v Speaker 1>about the Kepler space telescope and how if you had

0:26:34.880 --> 0:26:39.960
<v Speaker 1>a probability greater than of a signal being an

0:26:40.000 --> 0:26:44.440
<v Speaker 1>exoplanet, that was considered a verified exoplanet. So TensorFlow

0:26:44.480 --> 0:26:46.120
<v Speaker 1>does the same sort of thing. It looks through these

0:26:46.119 --> 0:26:49.320
<v Speaker 1>probabilities and then it goes with the highest option. Uh.

0:26:49.400 --> 0:26:51.520
<v Speaker 1>It kind of makes you wonder how that works with music.
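
NOTE
One hedged way to picture "go with the highest option" for music: a toy table of next-note probabilities, where the melody is extended by always taking the most likely next note. This is only a sketch of the idea as described in the conversation. The table values and the function names are hypothetical, and nothing here is Magenta's or TensorFlow's actual method.
```python
def most_likely_next(probabilities: dict[str, float]) -> str:
    """Return the note with the highest probability."""
    return max(probabilities, key=probabilities.get)
# Hypothetical probabilities of the next note given the current note.
next_note_given = {
    "C": {"D": 0.5, "E": 0.3, "G": 0.2},
    "D": {"E": 0.6, "C": 0.4},
    "E": {"G": 0.7, "C": 0.3},
    "G": {"C": 1.0},
}
def continue_melody(seed: list[str], length: int) -> list[str]:
    """Extend a non-empty seed melody by always picking the likeliest next note."""
    melody = list(seed)
    while len(melody) < length:
        melody.append(most_likely_next(next_note_given[melody[-1]]))
    return melody
print(continue_melody(["C"], 6))  # prints ['C', 'D', 'E', 'G', 'C', 'D']
```
Always taking the top option makes the tune loop predictably, which is one reason a real system would likely sample from the probabilities instead. That too is an assumption, since the hosts say they do not know how Magenta handles it.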

0:26:51.760 --> 0:26:54.280
<v Speaker 1>And I don't know the answer to that because we've

0:26:54.280 --> 0:26:57.359
<v Speaker 1>reached the limit of my understanding of this particular approach

0:26:57.400 --> 0:27:00.840
<v Speaker 1>to machine learning. But I wanted to talk a little

0:27:00.880 --> 0:27:05.160
<v Speaker 1>bit about a related project, not directly related to the

0:27:05.200 --> 0:27:10.200
<v Speaker 1>Google team, but one about a computer trying to generate art. Yeah,

0:27:10.280 --> 0:27:14.840
<v Speaker 1>and this was one that you covered for our video show. Now. Yes,

0:27:15.320 --> 0:27:18.200
<v Speaker 1>this would be a project that a group of art

0:27:18.320 --> 0:27:23.480
<v Speaker 1>historians and computer scientists and researchers and developers tackled together.

0:27:23.840 --> 0:27:26.919
<v Speaker 1>And it was in the efforts of creating a new

0:27:26.960 --> 0:27:30.159
<v Speaker 1>painting in the style of Rembrandt, and the idea

0:27:30.160 --> 0:27:33.119
<v Speaker 1>being that this should be a painting that Rembrandt could

0:27:33.200 --> 0:27:36.880
<v Speaker 1>have painted himself. But if only he'd had a three

0:27:36.960 --> 0:27:39.440
<v Speaker 1>D printer, if only you had a three D printer. So, yeah,

0:27:39.440 --> 0:27:42.320
<v Speaker 1>they specifically use machine learning, computer algorithms and a three

0:27:42.440 --> 0:27:47.040
<v Speaker 1>D printer to create a new quote unquote new Rembrandt painting,

0:27:48.040 --> 0:27:51.360
<v Speaker 1>at least in the style of Rembrandt. And it's not

0:27:51.440 --> 0:27:55.760
<v Speaker 1>quite the same thing as generating completely new art because again,

0:27:55.760 --> 0:27:59.960
<v Speaker 1>you're using Rembrandt's style as your starting point, that's your foundation.

0:28:00.000 --> 0:28:02.720
<v Speaker 1>It's kind of like those intro notes when they were

0:28:02.760 --> 0:28:05.399
<v Speaker 1>building out that melody. Yeah yeah. It also like a

0:28:05.480 --> 0:28:09.520
<v Speaker 1>very specific type of Rembrandt painting in order to create

0:28:10.119 --> 0:28:12.479
<v Speaker 1>a very specific type of Rembrandt painting. Right, right,

0:28:12.520 --> 0:28:15.440
<v Speaker 1>instead of feeding it every single Rembrandt painting that ever

0:28:15.640 --> 0:28:20.199
<v Speaker 1>was ever, they took a specific subtype of Rembrandt painting. Now,

0:28:20.200 --> 0:28:22.480
<v Speaker 1>granted it was a specific subtype that Rembrandt did a

0:28:22.720 --> 0:28:27.800
<v Speaker 1>whole lot of. Sure. It was of a dude, white white,

0:28:28.359 --> 0:28:31.880
<v Speaker 1>an old white dude, well you know, old ish, not

0:28:32.520 --> 0:28:35.080
<v Speaker 1>probably younger than I am actually, but at any rate,

0:28:35.480 --> 0:28:38.200
<v Speaker 1>sitting sitting and not quite you know. It's a little

0:28:38.200 --> 0:28:40.960
<v Speaker 1>bit of a profile shot, not a full profile um

0:28:41.040 --> 0:28:45.040
<v Speaker 1>wearing black with a big old white collar and a

0:28:45.080 --> 0:28:50.400
<v Speaker 1>big old black hat, looking somewhat pensive, as Rembrandt's subjects

0:28:50.440 --> 0:28:53.600
<v Speaker 1>often did, probably thinking is this guy gonna let me

0:28:53.640 --> 0:28:57.360
<v Speaker 1>sneeze, or something along those lines. And, uh, the

0:28:57.360 --> 0:28:59.040
<v Speaker 1>way they did this was they actually had the

0:28:59.040 --> 0:29:03.240
<v Speaker 1>computer analyze portrait after portrait after portrait in this style,

0:29:03.960 --> 0:29:07.080
<v Speaker 1>and the computer began to take measurements of all the

0:29:07.080 --> 0:29:11.560
<v Speaker 1>different little elements of these portraits to determine what

0:29:11.720 --> 0:29:15.040
<v Speaker 1>is the typical Rembrandt portrait, like, like, what is the

0:29:15.160 --> 0:29:19.080
<v Speaker 1>spacing of the eyes right, how is the nose shaped

0:29:19.120 --> 0:29:21.640
<v Speaker 1>in comparison to the eyes right when you get to

0:29:21.680 --> 0:29:23.760
<v Speaker 1>the corner of the mouth, how does that look in

0:29:23.800 --> 0:29:26.920
<v Speaker 1>a Rembrandt painting? All of these ideas, and of course,

0:29:27.200 --> 0:29:30.400
<v Speaker 1>how do the brush strokes look. So they used very

0:29:30.480 --> 0:29:33.640
<v Speaker 1>high tech scanning technology to get the texture of the

0:29:33.640 --> 0:29:37.000
<v Speaker 1>brush strokes as well. Once they did all this, they

0:29:37.040 --> 0:29:41.440
<v Speaker 1>then fed all that information and generated a Rembrandt style

0:29:41.640 --> 0:29:45.320
<v Speaker 1>portrait using all of those points of data as kind

0:29:45.360 --> 0:29:47.960
<v Speaker 1>of a roadmap, a guide, saying, make sure that the

0:29:48.040 --> 0:29:50.080
<v Speaker 1>eyes are this far apart, make sure that they are

0:29:50.160 --> 0:29:53.360
<v Speaker 1>this large, make sure that they're spaced this far from

0:29:53.360 --> 0:29:56.479
<v Speaker 1>the nose, all these little basic rules that they had

0:29:56.600 --> 0:30:00.240
<v Speaker 1>established through the analysis of all those other portraits, And

0:30:00.280 --> 0:30:02.160
<v Speaker 1>so the computer generated one and then they sent it

0:30:02.200 --> 0:30:04.400
<v Speaker 1>to a three D printer which was able to replicate

0:30:04.760 --> 0:30:08.400
<v Speaker 1>the ridges you would find from brushstrokes, and the end

0:30:08.440 --> 0:30:11.240
<v Speaker 1>result was a painting that looked an awful lot like

0:30:11.280 --> 0:30:14.040
<v Speaker 1>a Rembrandt portrait, enough so that if you put it

0:30:14.080 --> 0:30:16.760
<v Speaker 1>in a gallery of Rembrandt portraits and you brought a

0:30:16.840 --> 0:30:19.360
<v Speaker 1>non-expert into the room, someone who was not

0:30:19.480 --> 0:30:22.600
<v Speaker 1>familiar with every painting Rembrandt has ever done, and said,

0:30:22.600 --> 0:30:24.520
<v Speaker 1>pick out the one that was done by a computer,

0:30:24.680 --> 0:30:26.440
<v Speaker 1>I bet it would have been really hard to do

0:30:27.240 --> 0:30:30.120
<v Speaker 1>because it looked pretty much like every other Rembrandt.

0:30:31.440 --> 0:30:35.720
<v Speaker 1>But again, that was more about copying a specific style, right.

0:30:35.800 --> 0:30:39.920
<v Speaker 1>It was a little It's incredibly impressive. I don't want

0:30:39.920 --> 0:30:43.520
<v Speaker 1>to downplay how impressive the the achievement was. It took

0:30:43.560 --> 0:30:46.840
<v Speaker 1>them two years to do this. But it's not the

0:30:46.920 --> 0:30:50.840
<v Speaker 1>same as trying to teach a computer what art is

0:30:51.040 --> 0:30:54.280
<v Speaker 1>and then tell the computer, now, make something right. So

0:30:54.320 --> 0:30:56.800
<v Speaker 1>it's a little different because you're giving the computer way

0:30:56.840 --> 0:31:02.400
<v Speaker 1>more of a roadmap in the Rembrandt approach. Now, assuming

0:31:02.440 --> 0:31:06.360
<v Speaker 1>we get to a point where we actually are able

0:31:06.640 --> 0:31:11.239
<v Speaker 1>to have AI produce music and art that we

0:31:11.400 --> 0:31:15.160
<v Speaker 1>think has value to it, it doesn't just seem like

0:31:15.360 --> 0:31:19.920
<v Speaker 1>a random representation of whatever, right, But we're not just

0:31:20.040 --> 0:31:24.400
<v Speaker 1>tuning in for the novelty of it. What does that mean?

0:31:26.200 --> 0:31:29.160
<v Speaker 1>I mean, that's a big question. What does it mean

0:31:29.240 --> 0:31:34.200
<v Speaker 1>to us if AI is capable of of creating something

0:31:34.680 --> 0:31:37.120
<v Speaker 1>that falls into the realm of art. Now, I would argue,

0:31:38.120 --> 0:31:42.200
<v Speaker 1>at least in the foreseeable future, we wouldn't say that

0:31:42.240 --> 0:31:46.960
<v Speaker 1>the computer is actually expressing itself. Uh no, no, And

0:31:47.320 --> 0:31:49.440
<v Speaker 1>I mean for them more like the computer isn't really

0:31:49.520 --> 0:31:52.440
<v Speaker 1>the artist, Like the programmers of the computer are kind

0:31:52.480 --> 0:31:55.720
<v Speaker 1>of the artists. Yeah, just several stages removed. Yeah, the

0:31:55.760 --> 0:31:58.680
<v Speaker 1>computer is the tool that they are using. I think

0:31:58.720 --> 0:32:00.880
<v Speaker 1>of I think of it personally, and let me know

0:32:00.920 --> 0:32:03.120
<v Speaker 1>if you guys disagree. I'm really curious to hear your

0:32:03.120 --> 0:32:05.400
<v Speaker 1>thoughts on this. I think of it as the computer

0:32:05.440 --> 0:32:08.240
<v Speaker 1>could be thought of as an early stage artist, someone

0:32:08.240 --> 0:32:11.600
<v Speaker 1>who is learning their craft by copying the work of others,

0:32:11.600 --> 0:32:15.480
<v Speaker 1>not directly copying their work, but copying the style and

0:32:15.480 --> 0:32:19.120
<v Speaker 1>and not necessarily expressing themselves through that art, but rather

0:32:19.240 --> 0:32:22.920
<v Speaker 1>just trying to master the tools of creation for that art,

0:32:23.200 --> 0:32:26.720
<v Speaker 1>but has not made that next step where they are

0:32:26.760 --> 0:32:31.280
<v Speaker 1>the ones who are being able to express something deeper,

0:32:31.600 --> 0:32:34.840
<v Speaker 1>something new, something unique to themselves, right, rather

0:32:34.880 --> 0:32:38.800
<v Speaker 1>than simply I now understand how this works. I think

0:32:38.800 --> 0:32:41.760
<v Speaker 1>that the computers will be on that first step uh

0:32:41.800 --> 0:32:44.680
<v Speaker 1>and not in the second one. But if you disagree,

0:32:44.760 --> 0:32:47.360
<v Speaker 1>or if you feel like I am being way too

0:32:48.600 --> 0:32:53.640
<v Speaker 1>uh or narrow minded of how to define this, or

0:32:53.680 --> 0:32:56.360
<v Speaker 1>that you don't even think like I don't even think

0:32:56.400 --> 0:32:58.800
<v Speaker 1>that's a... You're looking at it in far too utilitarian

0:32:58.800 --> 0:33:01.200
<v Speaker 1>a way. I mean, that is perfectly fine. And I

0:33:01.240 --> 0:33:02.800
<v Speaker 1>didn't put this in the note. So that's why I'm

0:33:02.880 --> 0:33:04.640
<v Speaker 1>just springing it on you to find out what you

0:33:04.680 --> 0:33:07.280
<v Speaker 1>think I mean. I mean, I guess the question is

0:33:07.280 --> 0:33:11.760
<v Speaker 1>really like, do you think that uh, an artist needs

0:33:12.200 --> 0:33:14.280
<v Speaker 1>for for for lack of a better term, a soul

0:33:14.840 --> 0:33:19.480
<v Speaker 1>in order to actually create art? Or can a computer,

0:33:19.600 --> 0:33:26.480
<v Speaker 1>not being truly self aware, actually create art? Right? Here's

0:33:26.520 --> 0:33:29.960
<v Speaker 1>another question. If an artist accidentally spills a bunch of

0:33:30.000 --> 0:33:33.239
<v Speaker 1>paint onto a canvas and it turns out to be

0:33:33.320 --> 0:33:35.760
<v Speaker 1>something that people think is very beautiful and they want

0:33:35.800 --> 0:33:39.560
<v Speaker 1>to look at, is that art, well, it depends what

0:33:39.720 --> 0:33:42.480
<v Speaker 1>is the artist's opinion of this. Did the artist feel that

0:33:42.520 --> 0:33:46.480
<v Speaker 1>the act of accident was truly a moment of chaos

0:33:46.600 --> 0:33:49.360
<v Speaker 1>or was it something that was instigated through I mean, like,

0:33:49.520 --> 0:33:51.760
<v Speaker 1>I mean, these are questions people have been asking all

0:33:51.800 --> 0:33:54.080
<v Speaker 1>of that, you know, what actually counts as art. Right.

0:33:54.440 --> 0:33:56.680
<v Speaker 1>In fact, this is actually one of the things I

0:33:56.760 --> 0:34:00.200
<v Speaker 1>wanted to look at. So one of the back to

0:34:00.320 --> 0:34:04.360
<v Speaker 1>Douglas Eck's, uh, Google research page. One of the things

0:34:04.400 --> 0:34:08.359
<v Speaker 1>he says on that page about Magenta is he wants

0:34:08.400 --> 0:34:12.440
<v Speaker 1>to ask the question, can machines make music and art?

0:34:12.719 --> 0:34:17.439
<v Speaker 1>If so, how? If not? Why not? And that very

0:34:17.520 --> 0:34:20.400
<v Speaker 1>last sentence was actually the most interesting part to me.

0:34:20.600 --> 0:34:23.480
<v Speaker 1>I like, I like the inclusion of this question why not,

0:34:24.520 --> 0:34:27.040
<v Speaker 1>because it's it's an interesting way to phrase it. It

0:34:27.080 --> 0:34:29.560
<v Speaker 1>illuminates one of the roles Magenta could play in the

0:34:29.640 --> 0:34:34.040
<v Speaker 1>larger world of AI development. So the study of artificial intelligence,

0:34:34.040 --> 0:34:36.239
<v Speaker 1>to me, it's not just about can I get a

0:34:36.239 --> 0:34:40.320
<v Speaker 1>computer program to perform X or Y intelligent behavior?

0:34:40.760 --> 0:34:44.000
<v Speaker 1>It's about understanding the nature of the behavior to begin

0:34:44.080 --> 0:34:47.640
<v Speaker 1>with the nature of intelligence and intelligence based labor. And

0:34:47.680 --> 0:34:50.840
<v Speaker 1>in this case that might mean that, you know, computers

0:34:50.880 --> 0:34:54.000
<v Speaker 1>could help give us insights into questions like what does it

0:34:54.040 --> 0:34:55.719
<v Speaker 1>mean to create a piece of art, what we were just

0:34:55.719 --> 0:34:57.719
<v Speaker 1>talking about a minute ago? I mean, think about in

0:34:57.760 --> 0:35:01.560
<v Speaker 1>the past, um all the for weird pieces of abstract

0:35:01.640 --> 0:35:03.600
<v Speaker 1>art that people looked at they said, that doesn't that's

0:35:03.640 --> 0:35:06.239
<v Speaker 1>not art, you know, Jackson Pollock, does that count as art?

0:35:07.239 --> 0:35:10.239
<v Speaker 1>John Cage? Is this really music? He's just playing one

0:35:10.280 --> 0:35:13.120
<v Speaker 1>note over and over or he's just like turning radios

0:35:13.160 --> 0:35:17.800
<v Speaker 1>on and off? Is that really music? And it? I

0:35:17.800 --> 0:35:20.720
<v Speaker 1>don't know. It makes me think what makes us want

0:35:20.760 --> 0:35:25.080
<v Speaker 1>to pay special attention to a particular display of shapes

0:35:25.080 --> 0:35:30.399
<v Speaker 1>and colors or a particular ordered sequence of sounds. And

0:35:31.200 --> 0:35:34.680
<v Speaker 1>there's difficulty there because when we encounter art in our lives,

0:35:35.120 --> 0:35:39.560
<v Speaker 1>it's not devoid of context. Like so when you encounter

0:35:39.640 --> 0:35:42.880
<v Speaker 1>a particular sequence of sounds or group of shapes and colors,

0:35:43.120 --> 0:35:46.600
<v Speaker 1>it might come with social pressures, right, like people saying, like,

0:35:46.840 --> 0:35:48.960
<v Speaker 1>you know, this thing here is a piece of art.

0:35:49.000 --> 0:35:51.680
<v Speaker 1>You should know and the mere fact that a piece

0:35:51.680 --> 0:35:54.400
<v Speaker 1>of art is hanging in a museum gives it weight exactly,

0:35:54.960 --> 0:35:57.279
<v Speaker 1>So people people are telling you to pay attention to

0:35:57.360 --> 0:35:59.239
<v Speaker 1>this thing, and that might make you pay attention to

0:35:59.320 --> 0:36:02.320
<v Speaker 1>something that you wouldn't pay attention to otherwise, or maybe

0:36:02.320 --> 0:36:04.440
<v Speaker 1>you would who knows. But then again, this could also

0:36:04.520 --> 0:36:08.440
<v Speaker 1>apply to artificial intelligence, because what if it's a an

0:36:08.520 --> 0:36:11.160
<v Speaker 1>AI making a piece of music and you're saying, well,

0:36:11.200 --> 0:36:13.000
<v Speaker 1>maybe I should listen to this because I want to

0:36:13.040 --> 0:36:16.560
<v Speaker 1>hear what artificial intelligence can come up with, and you wouldn't. Really,

0:36:16.920 --> 0:36:19.680
<v Speaker 1>it wouldn't be all that interesting to you. Otherwise what

0:36:19.760 --> 0:36:21.920
<v Speaker 1>you were talking about earlier actually is it? Is it

0:36:22.040 --> 0:36:25.360
<v Speaker 1>just the novelty of it? Sure? Um, which I I

0:36:25.440 --> 0:36:28.680
<v Speaker 1>guess really, I mean, I don't know. I would suppose that, uh,

0:36:29.120 --> 0:36:32.520
<v Speaker 1>post postmodernism, I would define art as a thing that

0:36:32.640 --> 0:36:37.480
<v Speaker 1>makes you think or feel, and therefore a machine

0:36:37.520 --> 0:36:41.000
<v Speaker 1>can totally create absolutely by that definition. I would say

0:36:41.000 --> 0:36:46.240
<v Speaker 1>that if you were to have your your AI create

0:36:46.320 --> 0:36:49.080
<v Speaker 1>some music and that made you feel something, and you

0:36:49.239 --> 0:36:52.280
<v Speaker 1>define that as music that makes me feel something, is

0:36:52.280 --> 0:36:55.200
<v Speaker 1>is art, whether it's you know, happy or sad or

0:36:55.280 --> 0:36:58.319
<v Speaker 1>energetic or whatever, then it would by that definition you

0:36:58.400 --> 0:37:01.399
<v Speaker 1>have to say that the computer was able to generate art.

0:37:01.760 --> 0:37:04.520
<v Speaker 1>And also, let's not forget that there are a lot

0:37:04.560 --> 0:37:08.480
<v Speaker 1>of pieces of music out there throughout the ages where

0:37:08.600 --> 0:37:12.760
<v Speaker 1>musicians were taking very calculated decisions on how to craft

0:37:12.840 --> 0:37:16.799
<v Speaker 1>that music to get a specific kind of feeling for it,

0:37:16.960 --> 0:37:19.480
<v Speaker 1>to the point where you could even be cynical about

0:37:19.480 --> 0:37:23.000
<v Speaker 1>it and be like that song was manufactured day one

0:37:23.120 --> 0:37:26.600
<v Speaker 1>to be a single and get play on like the

0:37:26.640 --> 0:37:30.399
<v Speaker 1>pop charts. And first of all, I don't think there's

0:37:30.400 --> 0:37:32.520
<v Speaker 1>anything wrong with that. I don't look down

0:37:32.560 --> 0:37:35.920
<v Speaker 1>my nose at the idea of a manufactured piece of music.

0:37:35.960 --> 0:37:39.000
<v Speaker 1>If it makes someone happy, that's awesome, and that's all

0:37:39.040 --> 0:37:41.239
<v Speaker 1>that really matters in my eyes. I know a lot

0:37:41.280 --> 0:37:43.919
<v Speaker 1>of people have other opinions about it, um and that's

0:37:43.960 --> 0:37:47.160
<v Speaker 1>not bad either. But if a computer were capable of

0:37:47.200 --> 0:37:49.680
<v Speaker 1>doing that, I would I would agree with you, Lauren,

0:37:49.719 --> 0:37:52.719
<v Speaker 1>I'd say, well, that counts as art. On

0:37:52.760 --> 0:37:56.400
<v Speaker 1>a related note, so let's imagine that we've got computers

0:37:56.440 --> 0:38:00.440
<v Speaker 1>capable of making music. And they talked about the work

0:38:00.480 --> 0:38:03.400
<v Speaker 1>of trying to create text. One wonders if then you

0:38:03.440 --> 0:38:07.400
<v Speaker 1>could actually have computers capable of creating entire songs, like,

0:38:07.440 --> 0:38:09.839
<v Speaker 1>not just music, but music with lyrics. We talked about

0:38:09.880 --> 0:38:12.719
<v Speaker 1>this last time. We were saying that, you know, I

0:38:13.080 --> 0:38:18.399
<v Speaker 1>can easily imagine computers creating perfectly passable instrumental music. Not

0:38:18.560 --> 0:38:22.359
<v Speaker 1>so much music with vocals. That seems a lot more

0:38:22.400 --> 0:38:26.680
<v Speaker 1>difficult to me. So let me give you some poetry

0:38:27.280 --> 0:38:31.160
<v Speaker 1>written by machine. Remember when I told you that Google

0:38:31.200 --> 0:38:36.760
<v Speaker 1>had fed its artificial intelligence all these novels. It started

0:38:36.800 --> 0:38:41.160
<v Speaker 1>to try and create sentences, and it wasn't attempting to

0:38:41.160 --> 0:38:43.759
<v Speaker 1>create poetry. It was creating a series of sentences that

0:38:43.840 --> 0:38:47.479
<v Speaker 1>other people have looked at and said, this is there's

0:38:47.520 --> 0:38:51.480
<v Speaker 1>something interesting here. Now there's not an intent behind it, necessarily,

0:38:51.520 --> 0:38:54.880
<v Speaker 1>but there is something that feels like really morose poetry.

0:38:54.920 --> 0:39:00.520
<v Speaker 1>So here's here's a poem from Google's AI. It made

0:39:00.560 --> 0:39:03.760
<v Speaker 1>me want to cry. No one had seen him since

0:39:04.160 --> 0:39:08.879
<v Speaker 1>it made me feel uneasy. No one had seen him.

0:39:08.920 --> 0:39:12.640
<v Speaker 1>The thought made me smile. The pain was unbearable. The

0:39:12.800 --> 0:39:18.160
<v Speaker 1>crowd was silent. The man called out, the old man said.

0:39:19.040 --> 0:39:26.399
<v Speaker 1>The man asked, it's a poem by Google AI. Yeah,

0:39:26.440 --> 0:39:30.359
<v Speaker 1>I can dig that. I mean, it's just it's it's interesting, though,

0:39:30.400 --> 0:39:32.880
<v Speaker 1>because this is just this is just AI trying to

0:39:32.920 --> 0:39:37.480
<v Speaker 1>suss out the meaning or or the intent behind words

0:39:37.480 --> 0:39:41.920
<v Speaker 1>so that it can better understand when we communicate to

0:39:42.160 --> 0:39:44.920
<v Speaker 1>it and ask for something using different language, how it

0:39:44.960 --> 0:39:48.320
<v Speaker 1>should respond. It's not attempting to create anything of meaning,

0:39:48.520 --> 0:39:51.600
<v Speaker 1>but because we're humans, we find meaning where perhaps there

0:39:51.680 --> 0:39:54.680
<v Speaker 1>was none intended. And I thought that you know, you

0:39:54.680 --> 0:39:57.760
<v Speaker 1>could argue that maybe the art is created

0:39:57.840 --> 0:40:01.080
<v Speaker 1>not through the computer writing it down, but through us

0:40:01.120 --> 0:40:04.200
<v Speaker 1>reading it. Maybe that's where the art is created. And I mean,

0:40:04.239 --> 0:40:07.400
<v Speaker 1>and I've definitely read some some like like found poetry

0:40:07.719 --> 0:40:10.720
<v Speaker 1>from from spam emails for example. You know, back before

0:40:10.960 --> 0:40:14.000
<v Speaker 1>all of our email filters were so good that they

0:40:14.000 --> 0:40:17.279
<v Speaker 1>don't let spam through anymore that often, um, that

0:40:17.200 --> 0:40:20.480
<v Speaker 1>the like just terrific keyword salad that you would get

0:40:20.520 --> 0:40:24.520
<v Speaker 1>that was beautiful, not intending to be. So the funny

0:40:24.520 --> 0:40:27.120
<v Speaker 1>thing with found poetry is who gets the byline on

0:40:27.200 --> 0:40:30.839
<v Speaker 1>found poetry. It's the person who put it together. I mean,

0:40:30.880 --> 0:40:33.600
<v Speaker 1>it's not where the text originally came from.

0:40:33.680 --> 0:40:36.719
<v Speaker 1>It's the person who well, I I edited together this

0:40:36.800 --> 0:40:39.360
<v Speaker 1>found poem out of some text I put this in

0:40:39.440 --> 0:40:43.560
<v Speaker 1>this order. So so so is the suggestion here that

0:40:43.880 --> 0:40:48.280
<v Speaker 1>the artist, when we read works by AI is the reader.

0:40:49.120 --> 0:40:51.400
<v Speaker 1>I mean, that's a possibility. It's it's a question that

0:40:51.440 --> 0:40:54.880
<v Speaker 1>I think has merit. Uh yeah, I don't know. Well,

0:40:54.920 --> 0:40:58.040
<v Speaker 1>I mean, in the case of yours, Jonathan, I suspect

0:40:58.200 --> 0:41:01.320
<v Speaker 1>that somebody was that just it was that a sequence

0:41:01.440 --> 0:41:05.560
<v Speaker 1>fully generated by the AI itself or did somebody pull pieces?

0:41:05.719 --> 0:41:08.200
<v Speaker 1>That's an excellent question. I suspect it was the latter,

0:41:09.000 --> 0:41:10.920
<v Speaker 1>And in that case, I would say that the poet

0:41:11.080 --> 0:41:14.440
<v Speaker 1>is the is the person who pulled those pieces together. Actually,

0:41:14.480 --> 0:41:16.240
<v Speaker 1>here it is. Here's the way it works. The team.

0:41:16.360 --> 0:41:18.600
<v Speaker 1>Now that I'm reading this more because when I first

0:41:18.640 --> 0:41:20.560
<v Speaker 1>saw this, I just saw the examples and I didn't

0:41:20.600 --> 0:41:23.479
<v Speaker 1>see how they were generating it. What they were doing

0:41:23.600 --> 0:41:26.800
<v Speaker 1>was they gave the computer a starting word or sentence

0:41:26.880 --> 0:41:29.320
<v Speaker 1>and an ending word or sentence, and then the computer

0:41:29.400 --> 0:41:34.560
<v Speaker 1>had to generate a series of uh sentences to link

0:41:34.640 --> 0:41:39.080
<v Speaker 1>the two together. And they began they began to say

0:41:39.080 --> 0:41:44.160
<v Speaker 1>that these were telling stories. Now they were sometimes abstract stories. Um,

0:41:44.200 --> 0:41:46.120
<v Speaker 1>some of them are some of them are really weird.

0:41:46.160 --> 0:41:47.640
<v Speaker 1>I wish we could find the one that I that

0:41:47.800 --> 0:41:50.799
<v Speaker 1>I saw where it was really really sad, and then

0:41:50.840 --> 0:41:53.560
<v Speaker 1>it became about horses at the end. It was amazing.

0:41:54.000 --> 0:41:56.520
<v Speaker 1>I was like it was like Tina Belcher from Bob's

0:41:56.520 --> 0:41:59.280
<v Speaker 1>Burgers had written a poem. But it was pretty

0:41:59.280 --> 0:42:02.759
<v Speaker 1>phenomenal and and you know it's I bring it up

0:42:02.840 --> 0:42:06.360
<v Speaker 1>mainly just to say that we're seeing some really interesting

0:42:06.400 --> 0:42:10.919
<v Speaker 1>work in this field of machine learning and creation that

0:42:11.480 --> 0:42:17.000
<v Speaker 1>could ultimately lead to things that perhaps don't replace any

0:42:17.040 --> 0:42:22.040
<v Speaker 1>sort of human creativity, but instead enhance something. Maybe

0:42:22.040 --> 0:42:24.759
<v Speaker 1>you do a partnership quote unquote with a computer to

0:42:24.840 --> 0:42:28.680
<v Speaker 1>create something, just to put that out there like this

0:42:28.719 --> 0:42:31.239
<v Speaker 1>is part of me, part machine. There's something interesting with

0:42:31.280 --> 0:42:35.080
<v Speaker 1>that as well, or maybe just it'll be you know,

0:42:35.120 --> 0:42:38.759
<v Speaker 1>another another option. One of the things that the team

0:42:38.800 --> 0:42:41.879
<v Speaker 1>talked about or or that people have chatted about as

0:42:41.920 --> 0:42:46.120
<v Speaker 1>far as the prospect of computer-generated music is using

0:42:46.160 --> 0:42:50.440
<v Speaker 1>it to enhance or suppress certain moods. So, for example,

0:42:50.480 --> 0:42:54.839
<v Speaker 1>you're wearing a smart watch. It's got a uh activity

0:42:54.960 --> 0:43:00.000
<v Speaker 1>tracker on it exactly, and it detects perhaps that you're

0:43:00.000 --> 0:43:02.160
<v Speaker 1>being stressed out. It knows that you're not moving around,

0:43:02.480 --> 0:43:05.040
<v Speaker 1>but it detects, because of your physiological changes, you're getting

0:43:05.040 --> 0:43:07.680
<v Speaker 1>stressed out. And so the headphones you're wearing you start

0:43:07.719 --> 0:43:10.719
<v Speaker 1>hearing music that's more soothing to you, and it's

0:43:10.840 --> 0:43:14.799
<v Speaker 1>generated on the fly. It's unique music. It's not something

0:43:14.800 --> 0:43:16.440
<v Speaker 1>that you're gonna listen to and then just tune out

0:43:16.440 --> 0:43:19.400
<v Speaker 1>because you've heard it a billion times before, or maybe

0:43:19.440 --> 0:43:21.680
<v Speaker 1>it detects that you're working out, and it says, Oh,

0:43:21.719 --> 0:43:25.080
<v Speaker 1>we need to generate some nice, fun, up-tempo type

0:43:25.120 --> 0:43:28.080
<v Speaker 1>of stuff to keep the activity going at the right level.

0:43:28.440 --> 0:43:31.319
<v Speaker 1>And it starts to create that on the fly. So

0:43:31.360 --> 0:43:34.640
<v Speaker 1>that's a possible application for this, and that that would

0:43:34.640 --> 0:43:37.560
<v Speaker 1>be fascinating because there's all this research into if you

0:43:37.560 --> 0:43:41.120
<v Speaker 1>listen to music that has a similar beats per

0:43:41.120 --> 0:43:44.560
<v Speaker 1>minute to your active heart rate, then you will keep

0:43:44.600 --> 0:43:47.840
<v Speaker 1>going at that active heart rate for longer. I've found

0:43:48.200 --> 0:43:51.319
<v Speaker 1>I've found just anecdotally, yeah, I think I think by

0:43:51.320 --> 0:43:55.600
<v Speaker 1>your research, I mean, like sports blog right right, Anecdotally,

0:43:55.680 --> 0:43:57.279
<v Speaker 1>I have certainly found that to be the case. Like

0:43:57.320 --> 0:43:59.520
<v Speaker 1>if you know, I walked to and from the office

0:44:00.040 --> 0:44:02.480
<v Speaker 1>and if I'm listening to music and I'm listening to

0:44:02.520 --> 0:44:05.120
<v Speaker 1>a podcast, I'm just strolling. If I'm listening to music

0:44:05.280 --> 0:44:07.759
<v Speaker 1>and and something with a beat goes on there, if

0:44:07.800 --> 0:44:10.000
<v Speaker 1>I'm not paying it, if I do suddenly pay attention,

0:44:10.080 --> 0:44:13.240
<v Speaker 1>I realize I'm stepping on the beat. You're staying alive,

0:44:13.280 --> 0:44:17.800
<v Speaker 1>absolutely exactly, Yeah, doing that CPR. So it's it's pretty cool.

0:44:18.080 --> 0:44:21.680
<v Speaker 1>And also, you know, to kind of conclude this discussion, really,

0:44:21.719 --> 0:44:25.120
<v Speaker 1>I think ultimately what this is going to do is,

0:44:26.120 --> 0:44:30.200
<v Speaker 1>uh make a more robust machine learning system for problem solving.

0:44:30.280 --> 0:44:33.560
<v Speaker 1>In general, you could think of creating a piece of

0:44:33.719 --> 0:44:36.960
<v Speaker 1>art as a problem, not a problem in the sense of, oh, gosh,

0:44:37.000 --> 0:44:39.840
<v Speaker 1>I've got a problem, but like an engineering problem, right,

0:44:40.040 --> 0:44:42.279
<v Speaker 1>but imagine that you're able to create machine learning so

0:44:42.320 --> 0:44:45.480
<v Speaker 1>that you could present a computer with a problem in

0:44:45.520 --> 0:44:48.600
<v Speaker 1>the more colloquial sense, the more like I've got a problem,

0:44:48.880 --> 0:44:51.480
<v Speaker 1>I don't know how to fix this, and the computer,

0:44:51.520 --> 0:44:54.000
<v Speaker 1>because it's studied all of the analysis, you've got a problem.

0:44:54.000 --> 0:44:56.360
<v Speaker 1>Yo, I'll solve it, and then it takes your problem

0:44:56.400 --> 0:45:00.000
<v Speaker 1>and gives you the solution that is most

0:45:00.280 --> 0:45:04.040
<v Speaker 1>probably the right one according to its machine learning algorithm.

0:45:04.080 --> 0:45:06.160
<v Speaker 1>And ultimately we could get to a point where we

0:45:06.280 --> 0:45:11.080
<v Speaker 1>consult the great oracle, perhaps made by Oracle, that tells

0:45:11.200 --> 0:45:15.480
<v Speaker 1>us what we should do for some questions that are

0:45:15.480 --> 0:45:18.799
<v Speaker 1>particularly tricky, where you've got lots of different variables, and

0:45:18.920 --> 0:45:22.880
<v Speaker 1>the suggestion is always have a huge party, like I

0:45:22.920 --> 0:45:26.520
<v Speaker 1>got the dance mix right here, get going, and it's

0:45:26.520 --> 0:45:30.919
<v Speaker 1>all generated, has really morose poetry in it. Um, who knows,

0:45:30.960 --> 0:45:33.960
<v Speaker 1>Maybe that's the answer. I will not say no to

0:45:34.000 --> 0:45:36.960
<v Speaker 1>a party, but I'm just throwing that out there. But

0:45:37.120 --> 0:45:41.200
<v Speaker 1>I think that this particular project is really interesting. I

0:45:41.239 --> 0:45:46.200
<v Speaker 1>think the the possible outcomes could be really cool, not

0:45:46.320 --> 0:45:49.000
<v Speaker 1>just for the art that it creates, but how it

0:45:49.040 --> 0:45:52.560
<v Speaker 1>advances machine learning in general. And like I said, maybe

0:45:52.560 --> 0:45:54.440
<v Speaker 1>in a year we'll come back to this and take

0:45:54.480 --> 0:45:59.040
<v Speaker 1>a look and see if Magenta has produced anything

0:45:59.440 --> 0:46:03.560
<v Speaker 1>you know, noteworthy. Yeah, that's a pun. Musical notes.

0:46:04.480 --> 0:46:07.520
<v Speaker 1>All right, so I'm gonna wrap this up, guys. If

0:46:07.520 --> 0:46:10.400
<v Speaker 1>you have any suggestions for future episodes of Forward Thinking,

0:46:10.400 --> 0:46:13.160
<v Speaker 1>you should write us. Our email address is FW Thinking

0:46:13.200 --> 0:46:15.600
<v Speaker 1>at HowStuffWorks dot com, or you can drop

0:46:15.640 --> 0:46:17.799
<v Speaker 1>us a line on Twitter. We are FW Thinking there,

0:46:18.239 --> 0:46:20.480
<v Speaker 1>or you can go to Facebook, search FW Thinking

0:46:20.520 --> 0:46:23.040
<v Speaker 1>In the little search field, our profile will pop up.

0:46:23.080 --> 0:46:25.160
<v Speaker 1>You can leave us a message there. We love hearing

0:46:25.200 --> 0:46:27.520
<v Speaker 1>from you, guys. So if you've got any suggestions for

0:46:27.600 --> 0:46:30.560
<v Speaker 1>future episodes or questions or comments or anything, leave them there.

0:46:30.640 --> 0:46:32.480
<v Speaker 1>We read all of them, and we will talk to

0:46:32.520 --> 0:46:40.880
<v Speaker 1>you again really soon. For more on this topic and

0:46:40.920 --> 0:46:55.239
<v Speaker 1>the future of technology, visit forward thinking dot com, brought

0:46:55.280 --> 0:46:57.800
<v Speaker 1>to you by Toyota. Let's go places