WEBVTT - The Complicated Story of an AI Artist

0:00:04.120 --> 0:00:07.200
<v Speaker 1>Get in touch with technology with TechStuff from

0:00:07.200 --> 0:00:13.680
<v Speaker 1>HowStuffWorks.com. Hey there, and welcome to TechStuff.

0:00:13.720 --> 0:00:16.480
<v Speaker 1>I'm your host, Jonathan Strickland. I'm an executive producer with

0:00:16.520 --> 0:00:19.639
<v Speaker 1>HowStuffWorks and I love all things tech, and

0:00:19.680 --> 0:00:23.920
<v Speaker 1>today we're going to tackle a story that unfolded recently.

0:00:24.000 --> 0:00:26.480
<v Speaker 1>As of the recording of this show, I'm sitting in

0:00:26.560 --> 0:00:30.639
<v Speaker 1>the recording studio in October 2018. It's not

0:00:30.680 --> 0:00:33.159
<v Speaker 1>my normal studio either, so if you hear other noises,

0:00:33.360 --> 0:00:36.240
<v Speaker 1>that's because we've got noisy people walking around the office

0:00:36.240 --> 0:00:39.400
<v Speaker 1>and I'm in a different studio. That's commentary. But this

0:00:39.479 --> 0:00:44.120
<v Speaker 1>story unfolded just at the very end of October. That

0:00:44.280 --> 0:00:48.440
<v Speaker 1>was when the auction house Christie's put a special item

0:00:48.560 --> 0:00:52.200
<v Speaker 1>up on the auctioning block. It was a somewhat blurry

0:00:52.280 --> 0:00:56.480
<v Speaker 1>portrait of a man dressed in antiquated clothing. It looked

0:00:56.520 --> 0:00:58.560
<v Speaker 1>like a painting that could have come from the eighteenth

0:00:58.680 --> 0:01:02.240
<v Speaker 1>century from one of any number of artists, but it

0:01:02.360 --> 0:01:06.279
<v Speaker 1>was in fact a much more recent painting. The artist

0:01:06.360 --> 0:01:09.880
<v Speaker 1>was not a famous painter. In fact, the artist wasn't

0:01:10.000 --> 0:01:14.600
<v Speaker 1>a person. It was an artificially intelligent algorithm that created

0:01:14.640 --> 0:01:18.800
<v Speaker 1>the portrait through the process of machine learning. And what's more,

0:01:19.280 --> 0:01:23.360
<v Speaker 1>the group of human artists who supplied the AI generated

0:01:23.400 --> 0:01:27.440
<v Speaker 1>portrait had taken a great deal of direction, let's say,

0:01:27.560 --> 0:01:31.280
<v Speaker 1>from a different computer programmer, but perhaps did not do

0:01:31.720 --> 0:01:36.040
<v Speaker 1>as much to attribute that coder's work to the creation

0:01:36.080 --> 0:01:39.039
<v Speaker 1>of this portrait as they should have done. So what

0:01:39.160 --> 0:01:41.919
<v Speaker 1>we have here sounds a bit like a twenty first

0:01:42.040 --> 0:01:45.800
<v Speaker 1>century futuristic art heist, only this isn't about stealing a

0:01:45.840 --> 0:01:50.440
<v Speaker 1>work of art, but rather a means of generating art itself,

0:01:50.720 --> 0:01:54.240
<v Speaker 1>and it's creating a lot of interesting conversations about concepts,

0:01:54.360 --> 0:01:57.600
<v Speaker 1>ranging from what is art in the first place, to

0:01:57.680 --> 0:02:00.960
<v Speaker 1>the practical applications of machine learning to the nature of

0:02:01.040 --> 0:02:04.840
<v Speaker 1>open source code. So let's dive down into this, because

0:02:04.880 --> 0:02:08.520
<v Speaker 1>when it comes to discussing how technology interacts with

0:02:08.520 --> 0:02:11.720
<v Speaker 1>our lives, this is a doozy of a story. It

0:02:11.800 --> 0:02:15.000
<v Speaker 1>highlights not just technological issues but human ones that just

0:02:15.120 --> 0:02:18.880
<v Speaker 1>happen to intersect with technology. So to begin with, let's

0:02:18.919 --> 0:02:22.440
<v Speaker 1>talk about the tech behind generating this portrait in the

0:02:22.480 --> 0:02:27.119
<v Speaker 1>first place. It is an application of machine learning. That's

0:02:27.120 --> 0:02:29.880
<v Speaker 1>one of those topics we've talked about a lot on

0:02:30.040 --> 0:02:34.960
<v Speaker 1>TechStuff, especially recently. But basically, machine learning is all

0:02:35.000 --> 0:02:38.920
<v Speaker 1>about designing processes that allow machines to parse data in

0:02:38.960 --> 0:02:42.480
<v Speaker 1>some useful way and then apply the results of those

0:02:42.520 --> 0:02:46.280
<v Speaker 1>operations to future problems. But that's pretty darn vague, right,

0:02:46.360 --> 0:02:48.640
<v Speaker 1>that doesn't really tell you anything useful. If

0:02:48.639 --> 0:02:51.280
<v Speaker 1>you dive down a bit further, it's about creating a

0:02:51.320 --> 0:02:55.160
<v Speaker 1>framework within which machines can learn to perform a task

0:02:55.600 --> 0:02:58.800
<v Speaker 1>without having to be programmed to do it. So let's

0:02:58.880 --> 0:03:01.280
<v Speaker 1>use an example, and it's one I've talked about a

0:03:01.280 --> 0:03:03.600
<v Speaker 1>lot because it was one of the early examples of

0:03:03.600 --> 0:03:06.240
<v Speaker 1>what machine learning could do once it reached a certain

0:03:06.320 --> 0:03:10.280
<v Speaker 1>level of sophistication. Back in two thousand twelve, Google showed

0:03:10.320 --> 0:03:14.799
<v Speaker 1>how their computer science teams had taught an AI algorithm

0:03:14.919 --> 0:03:19.840
<v Speaker 1>or neural network to recognize images of cats. Now, this

0:03:19.919 --> 0:03:21.920
<v Speaker 1>was perhaps a funny way of showing an approach to

0:03:21.960 --> 0:03:25.200
<v Speaker 1>a difficult problem. So if you want a computer to

0:03:25.280 --> 0:03:28.720
<v Speaker 1>recognize an image of a cat, if it's a specific

0:03:28.760 --> 0:03:31.079
<v Speaker 1>image of a cat, you have a couple of different options.

0:03:31.360 --> 0:03:34.280
<v Speaker 1>One is, you can program the computer so that when

0:03:34.320 --> 0:03:38.600
<v Speaker 1>it encounters a specific arrangement of pixels for this particular image,

0:03:38.880 --> 0:03:41.880
<v Speaker 1>it recognizes that as the image of a cat, and

0:03:41.920 --> 0:03:45.440
<v Speaker 1>that you have programmed the computer to say, when you

0:03:45.840 --> 0:03:50.080
<v Speaker 1>see this arrangement of pixels, then that means this is

0:03:50.200 --> 0:03:52.720
<v Speaker 1>a cat. The computer doesn't understand what a cat is,

0:03:52.800 --> 0:03:56.400
<v Speaker 1>it doesn't have any context. It doesn't understand what any

0:03:56.440 --> 0:03:59.280
<v Speaker 1>other picture of a cat might be because that would

0:03:59.320 --> 0:04:03.600
<v Speaker 1>be a different arrangement of pixels. So you could program

0:04:03.600 --> 0:04:05.480
<v Speaker 1>a computer to do this and it would be able

0:04:05.520 --> 0:04:07.880
<v Speaker 1>to do it with that one image. But if you

0:04:07.880 --> 0:04:09.680
<v Speaker 1>gave it a different image of a cat, or even

0:04:09.680 --> 0:04:12.320
<v Speaker 1>an image of the same cat, but it's a different picture,

0:04:12.880 --> 0:04:14.960
<v Speaker 1>the computer would not be able to identify it. You

0:04:14.960 --> 0:04:18.560
<v Speaker 1>would have to repeat the entire process from beginning to

0:04:18.720 --> 0:04:21.400
<v Speaker 1>end to get the same result. And once you start

0:04:21.400 --> 0:04:25.440
<v Speaker 1>adding up images, you realize this is not really an

0:04:25.440 --> 0:04:30.680
<v Speaker 1>efficient means of teaching a computer anything. Or you could

0:04:30.680 --> 0:04:34.560
<v Speaker 1>create an artificial neural network that examines the pixels in

0:04:34.600 --> 0:04:37.719
<v Speaker 1>an image, and each neuron might be looking at a

0:04:37.760 --> 0:04:40.440
<v Speaker 1>different element of the data to determine if that data

0:04:40.600 --> 0:04:44.200
<v Speaker 1>was consistent with images of cat pictures. So we've talked

0:04:44.200 --> 0:04:48.160
<v Speaker 1>about this recently too. An artificial neuron can take in

0:04:48.320 --> 0:04:53.080
<v Speaker 1>multiple binary points of data, zeros and ones, and

0:04:53.080 --> 0:04:56.920
<v Speaker 1>then create a single binary output. So it might be

0:04:56.920 --> 0:05:00.840
<v Speaker 1>looking at specific features that might have to do with ears,

0:05:00.880 --> 0:05:03.600
<v Speaker 1>for example, and if it detects that the ears are

0:05:03.680 --> 0:05:06.840
<v Speaker 1>consistent with those of a cat, it might pass a

0:05:06.880 --> 0:05:10.080
<v Speaker 1>positive response further down the neural network, and a full

0:05:10.120 --> 0:05:12.839
<v Speaker 1>collection of all these looking at multiple points of data

0:05:13.160 --> 0:05:17.320
<v Speaker 1>would allow the computer to come to a decision: does

0:05:17.400 --> 0:05:21.640
<v Speaker 1>this image represent a cat or does it represent something else? So,

0:05:21.720 --> 0:05:26.120
<v Speaker 1>in this way, by feeding thousands or tens of thousands

0:05:26.200 --> 0:05:29.320
<v Speaker 1>or hundreds of thousands of images to a computer, you

0:05:29.360 --> 0:05:32.160
<v Speaker 1>can train it to recognize cats. And the more you

0:05:32.240 --> 0:05:35.560
<v Speaker 1>train it and the more closely you're able to tweak

0:05:35.720 --> 0:05:39.600
<v Speaker 1>the network so that it weights certain elements more than others,

0:05:40.240 --> 0:05:43.640
<v Speaker 1>the better it gets. So the tweaking makes the network

0:05:43.680 --> 0:05:47.400
<v Speaker 1>more capable, and eventually you get to a point where it

0:05:47.480 --> 0:05:50.840
<v Speaker 1>can identify a picture as either being a cat or

0:05:50.920 --> 0:05:56.080
<v Speaker 1>not a cat with pretty good results. Back in

0:05:56.120 --> 0:05:59.719
<v Speaker 1>two thousand twelve when Google was talking about this, it

0:05:59.839 --> 0:06:03.720
<v Speaker 1>was still a little janky. It could sometimes recognize a cat,

0:06:04.000 --> 0:06:06.479
<v Speaker 1>and sometimes it would think that a person was a

0:06:06.480 --> 0:06:09.240
<v Speaker 1>cat or that a cat was a person, So it

0:06:09.400 --> 0:06:13.799
<v Speaker 1>was not infallible, but it was pretty good. Now, because

0:06:13.839 --> 0:06:16.920
<v Speaker 1>I've covered artificial neural networks in recent episodes of TechStuff,

0:06:17.240 --> 0:06:20.119
<v Speaker 1>I'm not gonna go through the whole thing all over again.

0:06:20.160 --> 0:06:22.279
<v Speaker 1>That high-level view I just gave you, that's a pretty

0:06:22.279 --> 0:06:24.919
<v Speaker 1>good starting point. It's just important to remember that the

0:06:25.000 --> 0:06:28.560
<v Speaker 1>general output here is through training a network using that

0:06:29.120 --> 0:06:32.000
<v Speaker 1>input data set in this case or in the case

0:06:32.040 --> 0:06:35.080
<v Speaker 1>of that example, hundreds of thousands of images of cats.

0:06:36.000 --> 0:06:40.400
<v Speaker 1>Machine learning can actually take a few different approaches. The

0:06:40.440 --> 0:06:44.120
<v Speaker 1>one that I sort of outlined earlier would kind of

0:06:44.160 --> 0:06:48.040
<v Speaker 1>fall into the category of supervised machine learning. See in

0:06:48.120 --> 0:06:50.880
<v Speaker 1>that approach, we human beings are trying to teach a

0:06:50.920 --> 0:06:56.640
<v Speaker 1>machine through algorithms and data sets to recognize something that

0:06:56.680 --> 0:07:00.000
<v Speaker 1>we already know the answer for. Right, you can look

0:07:00.000 --> 0:07:02.640
<v Speaker 1>at a picture, and you can recognize whether that picture

0:07:02.680 --> 0:07:05.120
<v Speaker 1>is of a cat or not, so you already know

0:07:05.200 --> 0:07:07.320
<v Speaker 1>the answer. You're not asking the computer to give you

0:07:07.400 --> 0:07:10.200
<v Speaker 1>new information. You're trying to teach the computer to do

0:07:10.280 --> 0:07:16.360
<v Speaker 1>something that you already can do. So we human beings

0:07:16.440 --> 0:07:20.160
<v Speaker 1>are able to supervise the machine as it is learning

0:07:20.200 --> 0:07:23.560
<v Speaker 1>this process and make those minor adjustments that are

0:07:23.600 --> 0:07:26.160
<v Speaker 1>needed throughout the system in order for it to get

0:07:26.200 --> 0:07:29.920
<v Speaker 1>better at its job. That is supervised machine learning. We

0:07:29.960 --> 0:07:32.280
<v Speaker 1>can keep working with it until it reaches what we

0:07:32.320 --> 0:07:36.080
<v Speaker 1>consider to be an acceptable level of success, which doesn't

0:07:36.080 --> 0:07:37.480
<v Speaker 1>mean it has to be perfect. It just has to

0:07:37.480 --> 0:07:39.840
<v Speaker 1>be good enough for whatever it is we're building it for.

0:07:40.480 --> 0:07:46.160
<v Speaker 1>But there's another approach called unsupervised machine learning, and as

0:07:46.400 --> 0:07:50.040
<v Speaker 1>you might imagine, this is different from the previous one.

0:07:50.160 --> 0:07:53.520
<v Speaker 1>In this approach, you only have input data and your

0:07:53.520 --> 0:07:56.120
<v Speaker 1>goal as a human is to learn more about that

0:07:56.240 --> 0:07:59.640
<v Speaker 1>data itself. So you don't have a correct answer in mind.

0:08:00.040 --> 0:08:03.400
<v Speaker 1>You don't already know that the data represents, say a

0:08:03.520 --> 0:08:06.360
<v Speaker 1>cat in a photo. It's a different type of problem

0:08:06.400 --> 0:08:09.680
<v Speaker 1>you're looking at. The machine is learning about the

0:08:09.760 --> 0:08:13.600
<v Speaker 1>nature of the information itself, including how different points of

0:08:13.680 --> 0:08:17.360
<v Speaker 1>data relate to one another or correspond with other data,

0:08:17.680 --> 0:08:21.080
<v Speaker 1>and you in turn can learn more about the information

0:08:21.120 --> 0:08:24.000
<v Speaker 1>as well. So within this category you have a couple

0:08:24.240 --> 0:08:29.040
<v Speaker 1>of subcategories. There are clustering problems. With a clustering problem,

0:08:29.120 --> 0:08:32.800
<v Speaker 1>you're learning about the groupings within data. So one example

0:08:32.880 --> 0:08:35.079
<v Speaker 1>might be that you have a population of customers. Let's

0:08:35.080 --> 0:08:37.520
<v Speaker 1>say you own a business. You've got customers. You have

0:08:37.600 --> 0:08:40.920
<v Speaker 1>data that represents all these different customers, and you're using

0:08:40.920 --> 0:08:45.080
<v Speaker 1>the collective behaviors of those customers to sort them into

0:08:45.160 --> 0:08:48.360
<v Speaker 1>meaningful groups so that you can better serve each of

0:08:48.400 --> 0:08:52.600
<v Speaker 1>those groups. Maybe you learn that there are four basic

0:08:52.679 --> 0:08:55.439
<v Speaker 1>types of customers, and that helps you plan out your

0:08:55.440 --> 0:08:59.160
<v Speaker 1>business so that you can cater it to those four types.

0:09:00.000 --> 0:09:03.280
<v Speaker 1>But another type of problem in unsupervised machine learning is

0:09:03.280 --> 0:09:06.960
<v Speaker 1>called an association problem. Now, in those problems, you want

0:09:07.000 --> 0:09:09.880
<v Speaker 1>to learn rules that describe large parts of the data

0:09:09.960 --> 0:09:12.440
<v Speaker 1>that you're feeding into the system. So, for example, let's

0:09:12.440 --> 0:09:15.000
<v Speaker 1>go back to you run a business. You've got this

0:09:15.040 --> 0:09:17.880
<v Speaker 1>big pool of customers, and you're feeding all the customer

0:09:18.120 --> 0:09:22.280
<v Speaker 1>behavior data into your system. It might tell you that, hey,

0:09:22.520 --> 0:09:26.280
<v Speaker 1>it turns out that many of the customers who are buying

0:09:26.880 --> 0:09:30.840
<v Speaker 1>widgets go on to buy sprockets. So that would tell you, hey,

0:09:31.000 --> 0:09:33.320
<v Speaker 1>now I know more information. I know that if I

0:09:33.360 --> 0:09:35.480
<v Speaker 1>sell a widget to someone, there's a good chance I

0:09:35.520 --> 0:09:38.600
<v Speaker 1>can upsell that and include a sprocket as well. So

0:09:38.640 --> 0:09:41.760
<v Speaker 1>I'm going to tailor my business approach to try and

0:09:41.800 --> 0:09:44.440
<v Speaker 1>take advantage of that. Now, the reason I went through

0:09:44.480 --> 0:09:47.240
<v Speaker 1>all of this is to explain that the type of

0:09:47.320 --> 0:09:51.439
<v Speaker 1>artificial intelligence algorithm that was used to produce the painting

0:09:51.559 --> 0:09:53.360
<v Speaker 1>I was talking about at the top of the show,

0:09:53.920 --> 0:09:58.400
<v Speaker 1>falls into a group called generative adversarial networks, or

0:09:58.559 --> 0:10:03.120
<v Speaker 1>G-A-N, or a GAN. These are used in unsupervised

0:10:03.240 --> 0:10:07.400
<v Speaker 1>machine learning applications. So it's in that second category I

0:10:07.480 --> 0:10:11.360
<v Speaker 1>was just talking about. So what is with this name?

0:10:11.440 --> 0:10:17.720
<v Speaker 1>What is a generative adversarial network? Well, for one thing,

0:10:18.360 --> 0:10:23.120
<v Speaker 1>it actually uses a pair of deep neural network architectures.

0:10:23.600 --> 0:10:27.200
<v Speaker 1>These two nets are in competition with one another. That's

0:10:27.200 --> 0:10:31.480
<v Speaker 1>why it's called an adversarial network. You have these two

0:10:31.520 --> 0:10:37.800
<v Speaker 1>different constructs that are working against each other. The approach

0:10:37.880 --> 0:10:40.840
<v Speaker 1>was first proposed by researchers at the University of Montreal,

0:10:41.240 --> 0:10:44.559
<v Speaker 1>and we chiefly associate the concept with a guy named

0:10:44.600 --> 0:10:49.560
<v Speaker 1>Ian Goodfellow. Ian Goodfellow wrote the definitive paper on the

0:10:49.600 --> 0:10:53.559
<v Speaker 1>subject back in two thousand and fourteen, and it is fascinating.

0:10:53.679 --> 0:10:56.480
<v Speaker 1>So from a very high level, what's happening is that

0:10:57.160 --> 0:11:00.320
<v Speaker 1>you have a neural network called the generator and you

0:11:00.360 --> 0:11:04.120
<v Speaker 1>have a second neural network called the discriminator. So

0:11:04.280 --> 0:11:08.840
<v Speaker 1>you're feeding the discriminator your input data. Let's again go

0:11:08.960 --> 0:11:12.880
<v Speaker 1>with pictures of cats, so actual pictures of cats, photographs

0:11:12.960 --> 0:11:16.040
<v Speaker 1>of cats, if you will. You're feeding photographs of

0:11:16.080 --> 0:11:20.360
<v Speaker 1>cats to the discriminator. The generator's job is to create

0:11:21.280 --> 0:11:26.120
<v Speaker 1>an image that fools the discriminator into thinking that

0:11:26.120 --> 0:11:29.520
<v Speaker 1>that's a legitimate photograph of a cat, but in fact

0:11:29.600 --> 0:11:34.040
<v Speaker 1>it was created or generated by the generator. So you've

0:11:34.040 --> 0:11:37.760
<v Speaker 1>got two processes going on at the same time. The

0:11:37.840 --> 0:11:41.280
<v Speaker 1>generator is trying to create essentially a forgery or a counterfeit.

0:11:41.880 --> 0:11:46.720
<v Speaker 1>It's creating something from scratch to fool the discriminator

0:11:46.760 --> 0:11:50.880
<v Speaker 1>into thinking this is a legitimate piece of data from

0:11:50.920 --> 0:11:55.320
<v Speaker 1>the training data set. The discriminator is looking at each

0:11:55.360 --> 0:11:58.360
<v Speaker 1>image and thinking, all right, now does this represent a

0:11:58.440 --> 0:12:01.800
<v Speaker 1>real picture or is this something that is coming from

0:12:01.840 --> 0:12:04.800
<v Speaker 1>the generator that's designed to fool me? And the two

0:12:04.880 --> 0:12:08.240
<v Speaker 1>are working against each other. Both networks learn as this

0:12:08.280 --> 0:12:11.199
<v Speaker 1>goes on. If the discriminator gets an image and rejects it,

0:12:11.720 --> 0:12:15.160
<v Speaker 1>that becomes feedback to the generator, and the message is, essentially,

0:12:15.800 --> 0:12:18.360
<v Speaker 1>this was not good enough, and the generator starts to

0:12:18.800 --> 0:12:22.960
<v Speaker 1>try again, taking a slightly different approach. If the discriminator

0:12:23.040 --> 0:12:25.840
<v Speaker 1>accepts it, the generator says, ah ha, you're onto something.

0:12:26.160 --> 0:12:31.240
<v Speaker 1>But then you can tweak the discriminator and say this

0:12:31.320 --> 0:12:33.440
<v Speaker 1>was wrong. You got this part wrong, and it

0:12:33.480 --> 0:12:36.000
<v Speaker 1>can start to try and look for signs that might

0:12:36.080 --> 0:12:40.440
<v Speaker 1>otherwise fool it. The goal here is that you are

0:12:40.480 --> 0:12:44.320
<v Speaker 1>going to have a generator producing better and better versions

0:12:44.480 --> 0:12:48.440
<v Speaker 1>of whatever it is you're trying to create. And that

0:12:48.520 --> 0:12:52.520
<v Speaker 1>could be a picture, it could be text, it could

0:12:52.559 --> 0:12:56.400
<v Speaker 1>be music. You could feed any sort of data to

0:12:56.559 --> 0:13:00.520
<v Speaker 1>both of these systems in an effort to produce a

0:13:00.600 --> 0:13:05.080
<v Speaker 1>computer generated version of that thing, and as long as

0:13:05.080 --> 0:13:08.680
<v Speaker 1>it reaches a certain level of quality, the discriminator won't

0:13:08.679 --> 0:13:10.600
<v Speaker 1>be able to tell the difference, and then you've got

0:13:10.640 --> 0:13:14.480
<v Speaker 1>yourself a computer generated whatever it might be, in this case,

0:13:15.200 --> 0:13:18.480
<v Speaker 1>a painting. I'll explain more about the specifics of this

0:13:18.559 --> 0:13:20.560
<v Speaker 1>case in just a moment, but first let's take a

0:13:20.640 --> 0:13:30.640
<v Speaker 1>quick break to thank our sponsor. So a couple of

0:13:30.720 --> 0:13:34.320
<v Speaker 1>years ago, there were computer scientists at Microsoft as well

0:13:34.360 --> 0:13:38.200
<v Speaker 1>as TU Delft, and they were working together with

0:13:38.280 --> 0:13:41.080
<v Speaker 1>a banking company, ING, to create a brand

0:13:41.080 --> 0:13:45.440
<v Speaker 1>new painting in the style of the painter Rembrandt. This

0:13:45.559 --> 0:13:49.640
<v Speaker 1>project involved processing high resolution digital scans of three hundred

0:13:49.840 --> 0:13:56.320
<v Speaker 1>forty six different images of Rembrandt's works, specifically portraits of men.

0:13:56.840 --> 0:14:00.480
<v Speaker 1>That information was fed to a deep learning algorithm that

0:14:00.600 --> 0:14:05.560
<v Speaker 1>analyzed Rembrandt's style and also the techniques that were common

0:14:05.640 --> 0:14:08.160
<v Speaker 1>across all the images. What were the common elements that

0:14:08.200 --> 0:14:12.600
<v Speaker 1>were found in those numerous paintings, And eventually this machine

0:14:12.679 --> 0:14:15.480
<v Speaker 1>was told, or this system was told to produce a

0:14:15.600 --> 0:14:20.680
<v Speaker 1>new painting based on those common factors. And

0:14:20.720 --> 0:14:23.160
<v Speaker 1>so it narrowed down the approach to be a portrait

0:14:23.480 --> 0:14:26.320
<v Speaker 1>of a Caucasian white male because that's what most of

0:14:26.360 --> 0:14:30.760
<v Speaker 1>Rembrandt's portraits were of, somewhere between the ages of thirty

0:14:30.760 --> 0:14:33.720
<v Speaker 1>and forty, wearing white and black clothing, because again that

0:14:33.800 --> 0:14:37.600
<v Speaker 1>was the vast majority of the portraits that Rembrandt created,

0:14:37.880 --> 0:14:42.640
<v Speaker 1>and the focus of the subject was off to the right,

0:14:42.720 --> 0:14:46.280
<v Speaker 1>like looking slightly off to the right, because a lot

0:14:46.280 --> 0:14:48.480
<v Speaker 1>of the subjects in the other paintings were doing the same.

0:14:49.040 --> 0:14:52.520
<v Speaker 1>The algorithm also analyzed the faces of all those portraits

0:14:52.520 --> 0:14:54.560
<v Speaker 1>and came up with sort of a

0:14:54.640 --> 0:14:57.600
<v Speaker 1>mishmash average of them to produce the face of the

0:14:57.640 --> 0:15:00.840
<v Speaker 1>fictional Dutch gentleman in the new painting. To go a

0:15:00.880 --> 0:15:04.400
<v Speaker 1>step further, the team then added depth to this painting.

0:15:04.440 --> 0:15:06.800
<v Speaker 1>It was a two dimensional image, and then they decided

0:15:06.840 --> 0:15:09.000
<v Speaker 1>to add some depth. They included some ridges and some

0:15:09.080 --> 0:15:13.080
<v Speaker 1>bumps that would have been created from brush strokes onto

0:15:13.240 --> 0:15:17.320
<v Speaker 1>a two dimensional surface. So if you're using paint, then

0:15:17.560 --> 0:15:19.800
<v Speaker 1>it's actually a three dimensional image. You know, if you

0:15:19.840 --> 0:15:23.120
<v Speaker 1>get close enough, you can see raised areas and

0:15:23.560 --> 0:15:26.680
<v Speaker 1>dips and trenches and stuff like that that the brush leaves.

0:15:26.800 --> 0:15:31.560
<v Speaker 1>And it all depends upon your painting technique how these

0:15:31.600 --> 0:15:35.280
<v Speaker 1>get laid out on canvas. So the team added those

0:15:35.320 --> 0:15:39.640
<v Speaker 1>details in to make it look even more authentic. Ultimately,

0:15:39.720 --> 0:15:43.640
<v Speaker 1>the design was printed using thirteen layers of ultraviolet-based

0:15:43.720 --> 0:15:46.840
<v Speaker 1>ink, and the result is a work that looks

0:15:46.880 --> 0:15:49.600
<v Speaker 1>like it could have come from Rembrandt, complete with techniques

0:15:49.600 --> 0:15:53.760
<v Speaker 1>Rembrandt used in actually making his brushstrokes. And that's just

0:15:53.880 --> 0:15:57.480
<v Speaker 1>one high profile example of computers generating paintings after being

0:15:57.520 --> 0:16:01.040
<v Speaker 1>fed information about works that human artists have created. Now,

0:16:01.040 --> 0:16:05.040
<v Speaker 1>let's get back to the story of the recently auctioned painting. Now,

0:16:05.600 --> 0:16:07.440
<v Speaker 1>to do that, we have to talk about a young

0:16:07.480 --> 0:16:12.560
<v Speaker 1>man named Robbie Barrat. Barrat is nineteen years old and

0:16:12.680 --> 0:16:16.200
<v Speaker 1>is attending Stanford and has been doing some really interesting

0:16:16.240 --> 0:16:19.840
<v Speaker 1>work in machine learning. It was his code that would

0:16:19.840 --> 0:16:22.640
<v Speaker 1>be the basis for the computer generated portrait that was

0:16:22.760 --> 0:16:26.040
<v Speaker 1>recently auctioned off. Barrat's work was going a step further

0:16:26.560 --> 0:16:30.800
<v Speaker 1>than copying the style of an established artist. Barrat's algorithms

0:16:30.920 --> 0:16:34.640
<v Speaker 1>would work to create new images after having analyzed numerous

0:16:34.720 --> 0:16:38.120
<v Speaker 1>real world examples. So just a couple of years ago,

0:16:38.480 --> 0:16:42.000
<v Speaker 1>the state of the art in GAN networks, or G-A-N

0:16:42.040 --> 0:16:46.680
<v Speaker 1>networks might produce some really disturbing images, like there are

0:16:46.720 --> 0:16:50.200
<v Speaker 1>early pictures of GAN attempts at making realistic human faces

0:16:50.680 --> 0:16:53.960
<v Speaker 1>that were not terribly successful, and that's because those networks

0:16:53.960 --> 0:16:57.560
<v Speaker 1>were able to recognize certain basic visual elements in images,

0:16:58.160 --> 0:17:02.880
<v Speaker 1>but not understand the relationships between multiple elements within

0:17:02.960 --> 0:17:05.200
<v Speaker 1>an image, so you could end up with a face

0:17:05.480 --> 0:17:11.040
<v Speaker 1>with really extreme features like pronounced asymmetry. But over just

0:17:11.080 --> 0:17:13.040
<v Speaker 1>a short amount of time, people have developed much more

0:17:13.040 --> 0:17:17.160
<v Speaker 1>sophisticated GAN algorithms and performance has improved, and there are, of

0:17:17.160 --> 0:17:20.440
<v Speaker 1>course, artists who have taken a different approach, specifically

0:17:21.240 --> 0:17:25.600
<v Speaker 1>emphasizing some of these more absurd elements in order to

0:17:25.640 --> 0:17:29.920
<v Speaker 1>get that kind of a result when you're actually producing art.

0:17:30.560 --> 0:17:33.439
<v Speaker 1>Barrat created GAN algorithms that could generate all sorts of

0:17:33.440 --> 0:17:37.800
<v Speaker 1>interesting images. He was enabling computers to make art themselves.

0:17:38.240 --> 0:17:41.760
<v Speaker 1>And sure, these computers were learning to create art after

0:17:41.800 --> 0:17:45.679
<v Speaker 1>being fed numerous paintings and images from human artists. But

0:17:45.800 --> 0:17:47.920
<v Speaker 1>you could argue that if you want to become a

0:17:48.000 --> 0:17:50.639
<v Speaker 1>human artist, you have to do the same thing. You

0:17:50.680 --> 0:17:53.240
<v Speaker 1>have to study art that was created by other people.

0:17:53.440 --> 0:17:57.960
<v Speaker 1>So computers are no different. The computers weren't replicating specific works,

0:17:57.960 --> 0:18:00.840
<v Speaker 1>they weren't trying to make a copy. They were learning

0:18:01.160 --> 0:18:07.280
<v Speaker 1>various styles. Barrat would frequently put these images and also

0:18:07.320 --> 0:18:10.439
<v Speaker 1>the algorithms he used to create those images up on

0:18:10.560 --> 0:18:13.920
<v Speaker 1>GitHub for free, as open source. He also had

0:18:14.560 --> 0:18:18.760
<v Speaker 1>people download these and upload their own art, and

0:18:18.800 --> 0:18:21.720
<v Speaker 1>it was all in the spirit of this open source community.

0:18:23.200 --> 0:18:25.439
<v Speaker 1>This way, not only could people use the tools that

0:18:25.480 --> 0:18:28.399
<v Speaker 1>Barrat had created, they could understand how those tools worked,

0:18:28.840 --> 0:18:31.440
<v Speaker 1>and perhaps in the future they can make their own tools,

0:18:32.000 --> 0:18:36.640
<v Speaker 1>tweaking the approach that Barrat had used, maybe making art

0:18:36.720 --> 0:18:41.639
<v Speaker 1>that was even more indistinguishable from human art, or perhaps

0:18:41.640 --> 0:18:44.760
<v Speaker 1>going in a totally different direction, making something truly new

0:18:44.760 --> 0:18:47.560
<v Speaker 1>and alien. By the way, some of the images created

0:18:47.560 --> 0:18:51.320
<v Speaker 1>by Barrat's algorithms are a little unsettling. They can be

0:18:51.440 --> 0:18:54.359
<v Speaker 1>surreal and absurd, and some of them even come across

0:18:54.359 --> 0:18:58.000
<v Speaker 1>a little sinister to me. But that's my own interpretation.

0:18:58.040 --> 0:18:59.919
<v Speaker 1>I mean, that is what art is all about, is

0:19:00.040 --> 0:19:02.520
<v Speaker 1>the interpretation of the person looking at art. But they

0:19:02.560 --> 0:19:05.679
<v Speaker 1>remind me of some of the horror movie effects you

0:19:05.760 --> 0:19:08.639
<v Speaker 1>might see where the visual effects artists will distort a

0:19:08.680 --> 0:19:10.840
<v Speaker 1>person's face for the effect of horror, like in the

0:19:10.880 --> 0:19:16.000
<v Speaker 1>movie The Ring. Anyway, Barrett created several GAN algorithms and

0:19:16.040 --> 0:19:18.600
<v Speaker 1>put them up online for others to use, and this

0:19:18.720 --> 0:19:21.520
<v Speaker 1>in itself was not unusual. There are many in the

0:19:21.560 --> 0:19:24.800
<v Speaker 1>digital art field who work on AI who have done

0:19:24.840 --> 0:19:29.760
<v Speaker 1>similar things. Now, he creates this code. Let's take a

0:19:29.800 --> 0:19:33.160
<v Speaker 1>trip across the world from Stanford over to France. That's

0:19:33.160 --> 0:19:37.000
<v Speaker 1>where three artists in their mid twenties were working in

0:19:37.040 --> 0:19:40.920
<v Speaker 1>a group they had called Obvious and their stated goal

0:19:41.119 --> 0:19:45.280
<v Speaker 1>is to promote GANism, that is, the art that has

0:19:45.320 --> 0:19:50.040
<v Speaker 1>been generated through AI algorithms running on this GAN approach. Now,

0:19:50.080 --> 0:19:53.159
<v Speaker 1>according to an article on Medium written by one of

0:19:53.200 --> 0:19:57.520
<v Speaker 1>these artists, they quote want to send out an update

0:19:57.640 --> 0:20:00.600
<v Speaker 1>of the state of the research in AI end quote.

0:20:01.200 --> 0:20:03.879
<v Speaker 1>They want to do this: they want to tell the

0:20:03.880 --> 0:20:06.560
<v Speaker 1>world what is going on in the world of AI

0:20:06.680 --> 0:20:10.040
<v Speaker 1>research through showing off artwork made by AI, so kind

0:20:10.040 --> 0:20:14.159
<v Speaker 1>of a creative artistic way of talking about artificial intelligence.

0:20:14.960 --> 0:20:18.000
<v Speaker 1>The group says that the value of the art may

0:20:18.040 --> 0:20:21.280
<v Speaker 1>not be in the art itself, but rather the discussions

0:20:21.359 --> 0:20:25.040
<v Speaker 1>that the art inspires, like what is it that makes

0:20:25.240 --> 0:20:30.720
<v Speaker 1>art art? Can machines be creative? Who ultimately would you

0:20:30.800 --> 0:20:33.199
<v Speaker 1>say is the artist in a work that was created

0:20:33.240 --> 0:20:37.160
<v Speaker 1>by a machine? What does that art mean? Who does

0:20:37.160 --> 0:20:40.640
<v Speaker 1>it belong to? That's a big one. So the artists

0:20:40.720 --> 0:20:44.200
<v Speaker 1>reached out to Barrett when they were tackling this project.

0:20:44.560 --> 0:20:47.800
<v Speaker 1>They wanted to use a GAN algorithm to generate a

0:20:47.840 --> 0:20:50.480
<v Speaker 1>portrait in a style similar to what you see in

0:20:50.600 --> 0:20:54.120
<v Speaker 1>eighteenth century paintings out of Europe. The students have made

0:20:54.119 --> 0:20:56.720
<v Speaker 1>it clear that Barrett had been a big part of

0:20:56.760 --> 0:20:59.880
<v Speaker 1>their inspiration. More on that in just a second. Now,

0:21:00.080 --> 0:21:03.440
<v Speaker 1>members of Obvious began using GAN code to generate portraits,

0:21:03.840 --> 0:21:06.760
<v Speaker 1>and they created several of them, eleven in fact, of

0:21:06.800 --> 0:21:14.119
<v Speaker 1>a fictional noble family they named the Belamy family, B-E-L-A-M-Y.

0:21:14.280 --> 0:21:16.600
<v Speaker 1>The name Belamy itself was a bit of a pun

0:21:16.720 --> 0:21:19.919
<v Speaker 1>and a reference to Ian Goodfellow, the guy who wrote

0:21:19.960 --> 0:21:23.520
<v Speaker 1>that main paper on GANs in the first place. Belamy

0:21:23.680 --> 0:21:27.920
<v Speaker 1>can be broken down into bel and ami. That would

0:21:27.920 --> 0:21:30.679
<v Speaker 1>mean, with all the different spellings, it would mean good friend

0:21:30.880 --> 0:21:34.120
<v Speaker 1>or good fellow, which is kind of cute, right? Well,

0:21:34.119 --> 0:21:38.320
<v Speaker 1>the artists produced these portraits, and they are all of

0:21:38.440 --> 0:21:42.680
<v Speaker 1>hollow-eyed nobles that will stare right into the void

0:21:42.720 --> 0:21:45.920
<v Speaker 1>in a way that, actually, that's getting off track. Never

0:21:45.960 --> 0:21:48.440
<v Speaker 1>mind. It creeps me out a little bit. But

0:21:48.520 --> 0:21:51.600
<v Speaker 1>the last in the line of portraits would be Edmond

0:21:51.840 --> 0:21:55.480
<v Speaker 1>de Belamy, the fictional noble whose portrait would go up

0:21:55.520 --> 0:22:00.640
<v Speaker 1>on auction in October and fetched way more money than

0:22:00.920 --> 0:22:05.160
<v Speaker 1>was anticipated. And so Obvious had fed to the algorithms

0:22:05.600 --> 0:22:09.520
<v Speaker 1>numerous paintings from the eighteenth century to guide its efforts,

0:22:10.160 --> 0:22:13.160
<v Speaker 1>and once they started producing these, they had each one

0:22:13.240 --> 0:22:16.680
<v Speaker 1>signed with a line of code referencing the algorithm. They

0:22:16.760 --> 0:22:20.600
<v Speaker 1>framed the machine generated portraits in golden frames, and when

0:22:20.760 --> 0:22:23.720
<v Speaker 1>Edmond de Belamy went up for auction, the best guess

0:22:23.720 --> 0:22:26.320
<v Speaker 1>was that it would probably fetch between seven thousand and

0:22:26.359 --> 0:22:31.040
<v Speaker 1>eleven thousand dollars. Instead, the winning bid was for more

0:22:31.119 --> 0:22:37.199
<v Speaker 1>than four hundred thirty thousand dollars. So that raises a

0:22:37.280 --> 0:22:41.400
<v Speaker 1>good question: who the heck should get that money? Who

0:22:41.600 --> 0:22:46.840
<v Speaker 1>was responsible for this painting? And that would become something

0:22:46.960 --> 0:22:49.480
<v Speaker 1>of a controversy. I'll explain more in just a second,

0:22:49.520 --> 0:22:52.600
<v Speaker 1>but first let's take another quick break to thank our sponsor.

0:23:00.560 --> 0:23:04.639
<v Speaker 1>So as the group Obvious was getting press coverage for

0:23:04.720 --> 0:23:08.040
<v Speaker 1>the AI-produced Belamy portraits, this is before they had

0:23:08.119 --> 0:23:11.720
<v Speaker 1>even put one up for auction, some people, including Barrett,

0:23:12.920 --> 0:23:17.239
<v Speaker 1>expressed some disappointment with the group. They said that it

0:23:17.280 --> 0:23:21.480
<v Speaker 1>looked like they had used Barrett's code to produce these portraits,

0:23:21.520 --> 0:23:24.440
<v Speaker 1>and yet they weren't quick to attribute him. They didn't

0:23:24.440 --> 0:23:29.560
<v Speaker 1>give him credit, at least not readily and not visibly

0:23:29.760 --> 0:23:33.440
<v Speaker 1>in a lot of locations. And so his code, while

0:23:33.440 --> 0:23:36.920
<v Speaker 1>it was open source and he didn't begrudge anyone from

0:23:37.119 --> 0:23:40.240
<v Speaker 1>being able to use it, would have usually meant that

0:23:40.240 --> 0:23:44.360
<v Speaker 1>people would give him credit. Typically in the open source community,

0:23:44.359 --> 0:23:47.679
<v Speaker 1>it's considered bad form or even gauche, if you prefer,

0:23:48.040 --> 0:23:52.080
<v Speaker 1>to not give credit where credit is due. As to

0:23:52.119 --> 0:23:55.880
<v Speaker 1>how much of the code was actually used unaltered, that

0:23:56.200 --> 0:23:58.840
<v Speaker 1>is a bit of an open question. The artists at

0:23:58.920 --> 0:24:01.600
<v Speaker 1>Obvious have admitted that they did use his code and

0:24:01.640 --> 0:24:05.520
<v Speaker 1>they changed it a little bit. Some other artists say

0:24:05.520 --> 0:24:09.560
<v Speaker 1>they believe that or more of the code was unaltered.

0:24:10.200 --> 0:24:13.200
<v Speaker 1>One such artist, a New Zealander named Tom White, said

0:24:13.240 --> 0:24:17.280
<v Speaker 1>he downloaded Barrett's code and ran it unaltered to see

0:24:17.280 --> 0:24:20.640
<v Speaker 1>if he could produce images similar to those that Obvious

0:24:20.680 --> 0:24:24.439
<v Speaker 1>had generated, and he said they look pretty close. So

0:24:24.480 --> 0:24:26.320
<v Speaker 1>I took a look as well. I would say

0:24:26.400 --> 0:24:29.760
<v Speaker 1>that the ones that White had created with that

0:24:29.880 --> 0:24:33.160
<v Speaker 1>AI have a little bit more of the weird facial

0:24:33.240 --> 0:24:35.520
<v Speaker 1>distortion thing going on than the ones that were made

0:24:35.560 --> 0:24:41.080
<v Speaker 1>by Obvious, but they are fairly similar. Throughout the project,

0:24:41.440 --> 0:24:44.280
<v Speaker 1>members of Obvious reached out to Barrett for

0:24:44.400 --> 0:24:48.119
<v Speaker 1>help in getting the GAN algorithms to run properly on computers.

0:24:48.480 --> 0:24:50.840
<v Speaker 1>Those communications are up on GitHub, so I mean

0:24:51.440 --> 0:24:54.679
<v Speaker 1>they definitely happened. Anyone can see them. So that's definitely

0:24:54.720 --> 0:24:57.720
<v Speaker 1>a sign that a significant portion of the code used

0:24:57.720 --> 0:25:01.879
<v Speaker 1>to create the expensive painting came from Barrett. So we

0:25:01.920 --> 0:25:06.200
<v Speaker 1>get into that tricky question: who owns the art? Before

0:25:06.400 --> 0:25:10.440
<v Speaker 1>it gets purchased at auction, obviously. So does the computer

0:25:10.520 --> 0:25:14.440
<v Speaker 1>scientist who created the code own anything that the code produces?

0:25:15.000 --> 0:25:17.320
<v Speaker 1>I mean, the code has to have a programmer. Without

0:25:17.359 --> 0:25:20.960
<v Speaker 1>a programmer, there's no code. So without the code, you

0:25:21.000 --> 0:25:25.080
<v Speaker 1>get no artistic output. But then again, you could say

0:25:25.119 --> 0:25:28.840
<v Speaker 1>that human artists learn from their teachers. There's a long

0:25:29.000 --> 0:25:33.200
<v Speaker 1>history of artists taking on apprentices, and those apprentices later

0:25:33.240 --> 0:25:35.920
<v Speaker 1>on go on to become great artists of their own.

0:25:36.480 --> 0:25:38.840
<v Speaker 1>So maybe you could argue that Barrett was a teacher

0:25:39.200 --> 0:25:43.440
<v Speaker 1>and the AI was the student, and therefore Barrett wouldn't

0:25:43.440 --> 0:25:46.080
<v Speaker 1>own the art. He didn't make it. He just taught

0:25:46.119 --> 0:25:50.119
<v Speaker 1>the student how to make art, not in a traditional sense,

0:25:50.359 --> 0:25:56.359
<v Speaker 1>but that's how it happened. But here's another problem. AI

0:25:56.520 --> 0:26:01.199
<v Speaker 1>cannot own stuff. Artificial intelligence can't have property. We have

0:26:01.280 --> 0:26:05.560
<v Speaker 1>no legal means to assign ownership, so that a program,

0:26:05.640 --> 0:26:09.920
<v Speaker 1>or an algorithm or an artificial neural network could own property.

0:26:10.000 --> 0:26:12.239
<v Speaker 1>And even if we did, what good would it do?

0:26:12.400 --> 0:26:16.000
<v Speaker 1>The AI doesn't want or need anything. It doesn't even

0:26:16.040 --> 0:26:21.000
<v Speaker 1>have will or self awareness. So maybe Obvious could claim

0:26:21.040 --> 0:26:25.199
<v Speaker 1>ownership because they were the ones who fed the information

0:26:25.240 --> 0:26:28.520
<v Speaker 1>to the algorithm. They're the ones who gave the algorithm

0:26:28.640 --> 0:26:31.880
<v Speaker 1>the access to all the different portraits. They made some

0:26:32.040 --> 0:26:35.520
<v Speaker 1>changes to the code, and the algorithms ran on computers

0:26:35.560 --> 0:26:40.080
<v Speaker 1>that they controlled, so if the code was using their assets,

0:26:40.600 --> 0:26:43.760
<v Speaker 1>maybe they own the output. But this is also complicated.

0:26:43.800 --> 0:26:46.800
<v Speaker 1>They didn't build the algorithm. They made use of it,

0:26:47.240 --> 0:26:50.639
<v Speaker 1>but they didn't design it from the ground up. But

0:26:50.680 --> 0:26:53.199
<v Speaker 1>if someone else could have run the code and used

0:26:53.320 --> 0:26:56.560
<v Speaker 1>the same general pool of images and trained the code,

0:26:56.840 --> 0:27:00.880
<v Speaker 1>they might have seen similar results, which means someone else

0:27:00.880 --> 0:27:03.480
<v Speaker 1>could have done the exact same thing that Obvious did,

0:27:03.800 --> 0:27:08.359
<v Speaker 1>and so that raises questions as well. Maybe there's nothing

0:27:08.359 --> 0:27:11.919
<v Speaker 1>special about owning the machine. In other words, in the

0:27:11.960 --> 0:27:15.920
<v Speaker 1>digital world, using open source code to make something new

0:27:16.000 --> 0:27:19.240
<v Speaker 1>and then profit from it, sell it, that happens regularly,

0:27:19.320 --> 0:27:21.760
<v Speaker 1>but again it's all on how you do it. If

0:27:21.800 --> 0:27:25.200
<v Speaker 1>you follow the general rules of etiquette, you're typically pretty good.

0:27:25.400 --> 0:27:28.000
<v Speaker 1>But if not, people think of that as being kind

0:27:28.000 --> 0:27:33.480
<v Speaker 1>of a jerk face. So it's frowned

0:27:33.560 --> 0:27:37.879
<v Speaker 1>upon in the open source community. Barrett is quoted in

0:27:37.920 --> 0:27:40.600
<v Speaker 1>a piece on The Verge as saying, quote, I'm more

0:27:40.680 --> 0:27:44.360
<v Speaker 1>concerned about the fact that actual artists using AI are

0:27:44.359 --> 0:27:47.760
<v Speaker 1>being deprived of the spotlight. It's a very bad first

0:27:47.800 --> 0:27:51.520
<v Speaker 1>impression for the field to have end quote. So he's

0:27:51.520 --> 0:27:55.280
<v Speaker 1>not saying he's upset and missing out on money, but

0:27:55.880 --> 0:28:00.920
<v Speaker 1>rather that the whole field is getting misrepresented.

0:28:01.520 --> 0:28:03.920
<v Speaker 1>The Verge piece also does a great job pointing out

0:28:04.000 --> 0:28:07.520
<v Speaker 1>how many in the AI digital art field feel that

0:28:07.600 --> 0:28:11.160
<v Speaker 1>Obvious is painting a misleading picture, to use a pun,

0:28:11.680 --> 0:28:13.720
<v Speaker 1>that if you were to look at the press release

0:28:14.520 --> 0:28:16.520
<v Speaker 1>that the group has put out and the way that

0:28:16.600 --> 0:28:19.360
<v Speaker 1>they've presented the art, it would seem as if these

0:28:19.359 --> 0:28:23.840
<v Speaker 1>programs were largely undirected or even fully autonomous, and they aren't.

0:28:24.440 --> 0:28:27.679
<v Speaker 1>Just because it's called unsupervised machine learning doesn't mean that

0:28:27.720 --> 0:28:31.080
<v Speaker 1>there's no human component. So there's a debate going on

0:28:31.840 --> 0:28:35.480
<v Speaker 1>within the digital art world on where in the spectrum

0:28:36.080 --> 0:28:40.960
<v Speaker 1>these algorithms should fall. Are they closer to being tools

0:28:41.000 --> 0:28:44.640
<v Speaker 1>like what a paint brush would be to a traditional painter,

0:28:45.520 --> 0:28:49.960
<v Speaker 1>or are they more closely connected to a collaborator, maybe

0:28:50.080 --> 0:28:53.680
<v Speaker 1>someone who's assisting a painter? But they certainly are not

0:28:53.800 --> 0:28:57.200
<v Speaker 1>fully autonomous robots. Now, in a way, this question of

0:28:57.240 --> 0:29:00.920
<v Speaker 1>ownership actually makes me think of an earlier incident involving

0:29:00.920 --> 0:29:05.760
<v Speaker 1>a different art form. It involved a monkey, a digital camera,

0:29:06.120 --> 0:29:09.000
<v Speaker 1>and a lawsuit. So back in two thousand and eleven,

0:29:09.240 --> 0:29:13.080
<v Speaker 1>a photographer named David Slater was working on an assignment

0:29:13.120 --> 0:29:18.000
<v Speaker 1>in Indonesia, and that's where he met Naruto. Naruto was

0:29:18.040 --> 0:29:22.520
<v Speaker 1>a seven year old crested macaque, so Naruto was a

0:29:22.520 --> 0:29:27.520
<v Speaker 1>monkey. Now, on this assignment, Naruto at one point grabbed

0:29:27.680 --> 0:29:32.200
<v Speaker 1>Slater's camera, and while handling Slater's camera, Naruto took a

0:29:32.280 --> 0:29:36.320
<v Speaker 1>photo of himself. So it's a monkey selfie, and it's

0:29:36.360 --> 0:29:38.480
<v Speaker 1>a great photo. If you've not seen it, you've got

0:29:38.480 --> 0:29:42.760
<v Speaker 1>to look up monkey selfie because it is amazing. The

0:29:42.800 --> 0:29:45.360
<v Speaker 1>monkey obviously didn't understand what it was doing, but the

0:29:45.400 --> 0:29:50.080
<v Speaker 1>selfie is just about perfect. So then this image goes

0:29:50.160 --> 0:29:53.480
<v Speaker 1>up online and it goes viral. It gets posted all

0:29:53.520 --> 0:29:58.000
<v Speaker 1>over the place, including on Wikipedia, and David Slater would

0:29:58.000 --> 0:30:00.640
<v Speaker 1>reach out to Wikipedia and say, hey, you can't just

0:30:00.720 --> 0:30:03.160
<v Speaker 1>put my photograph up on your site without asking for

0:30:03.240 --> 0:30:07.560
<v Speaker 1>permission or paying a licensing fee. Wikipedia said, dude,

0:30:08.160 --> 0:30:11.520
<v Speaker 1>you didn't take the photograph. It doesn't belong to you.

0:30:12.000 --> 0:30:14.600
<v Speaker 1>It was taken on your camera, but you didn't snap

0:30:14.640 --> 0:30:18.560
<v Speaker 1>the picture. A monkey took the photo, so you don't

0:30:18.560 --> 0:30:22.160
<v Speaker 1>have copyright to that image. In fact, no one has

0:30:22.200 --> 0:30:26.360
<v Speaker 1>copyright to that image because news flash, animals can't hold

0:30:26.360 --> 0:30:31.040
<v Speaker 1>copyrights to any work. But then PETA, a.k.a. People

0:30:31.080 --> 0:30:34.479
<v Speaker 1>for the Ethical Treatment of Animals, would sue David Slater

0:30:34.680 --> 0:30:38.920
<v Speaker 1>and a publishing company called Blurb for copyright infringement, saying, Hey,

0:30:39.040 --> 0:30:42.360
<v Speaker 1>Naruto took that photo, so Naruto should hold the copyright.

0:30:42.720 --> 0:30:46.160
<v Speaker 1>The judge in that case would ultimately say that animals

0:30:46.200 --> 0:30:51.040
<v Speaker 1>can't hold copyright, backing up what Wikipedia had said, and

0:30:51.160 --> 0:30:55.480
<v Speaker 1>that this whole argument was invalid. PETA appealed the decision,

0:30:55.760 --> 0:30:57.840
<v Speaker 1>it went to, or it was scheduled to go to,

0:30:58.080 --> 0:31:01.600
<v Speaker 1>a higher court, but ultimately the various parties came to

0:31:01.640 --> 0:31:04.560
<v Speaker 1>a settlement out of court. And this is where I

0:31:04.640 --> 0:31:08.680
<v Speaker 1>kind of roll my eyes at PETA. But this situation,

0:31:08.720 --> 0:31:12.800
<v Speaker 1>while silly on the surface, raises questions that also apply

0:31:12.800 --> 0:31:15.880
<v Speaker 1>to artificial intelligence. In a case like this, who has

0:31:15.920 --> 0:31:19.680
<v Speaker 1>the right to use or exploit a work? Now, I

0:31:19.680 --> 0:31:23.480
<v Speaker 1>would argue that in the case with artificial intelligence, it gets

0:31:23.600 --> 0:31:27.280
<v Speaker 1>even thornier than that. Right now, we're talking about paintings.

0:31:27.560 --> 0:31:30.240
<v Speaker 1>But as I said earlier, GAN algorithms could produce all

0:31:30.360 --> 0:31:33.760
<v Speaker 1>sorts of different stuff, including text. So we could have

0:31:33.880 --> 0:31:37.440
<v Speaker 1>a computer generated novel or a screenplay in the future,

0:31:37.840 --> 0:31:41.840
<v Speaker 1>and sure, the first versions of those will probably be terrible,

0:31:42.200 --> 0:31:45.280
<v Speaker 1>And to be fair, we already have a surplus of

0:31:45.480 --> 0:31:49.560
<v Speaker 1>terrible books and terrible movies and terrible TV shows that

0:31:49.600 --> 0:31:51.680
<v Speaker 1>are made by real human beings. We don't

0:31:51.720 --> 0:31:54.960
<v Speaker 1>need robots to make more of those, but we could

0:31:55.000 --> 0:31:57.680
<v Speaker 1>also end up with some that are interesting or that

0:31:57.840 --> 0:32:01.840
<v Speaker 1>say something surprising that people will value. In those cases,

0:32:01.960 --> 0:32:04.800
<v Speaker 1>who has a claim to that intellectual property? Who should

0:32:04.840 --> 0:32:07.240
<v Speaker 1>profit from it? Maybe it should be the person who

0:32:07.240 --> 0:32:09.600
<v Speaker 1>wrote the code in the first place. But if that's

0:32:09.640 --> 0:32:12.880
<v Speaker 1>the case, let's take this thought experiment in another direction.

0:32:13.240 --> 0:32:15.880
<v Speaker 1>Let's say someone creates code for an AI that does

0:32:15.920 --> 0:32:20.800
<v Speaker 1>something entirely different. There, it's not generating any content. Let's

0:32:20.800 --> 0:32:24.200
<v Speaker 1>say it's the artificial intelligence you would need to power

0:32:24.280 --> 0:32:28.080
<v Speaker 1>an autonomous car. Now, let's say one of those cars

0:32:28.240 --> 0:32:31.560
<v Speaker 1>is found to have caused a really bad accident. So

0:32:31.560 --> 0:32:34.360
<v Speaker 1>should the person who wrote the code be held responsible?

0:32:35.160 --> 0:32:38.160
<v Speaker 1>What if the scenario that led up to the accident

0:32:38.320 --> 0:32:41.640
<v Speaker 1>was so unusual that no one would have ever predicted it.

0:32:42.360 --> 0:32:45.320
<v Speaker 1>Because it's one thing to overlook a common event, Like

0:32:45.400 --> 0:32:49.520
<v Speaker 1>if someone were to program an autonomous car and say, oh, crap,

0:32:49.600 --> 0:32:54.000
<v Speaker 1>I totally forgot about stop signs, that would be demonstrably bad,

0:32:54.280 --> 0:32:57.280
<v Speaker 1>And you could say, well, that is endangerment,

0:32:57.440 --> 0:33:01.080
<v Speaker 1>That is definitely not cool. But it's a totally different

0:33:01.080 --> 0:33:04.840
<v Speaker 1>thing if you just don't predict an accident that involves

0:33:04.880 --> 0:33:08.720
<v Speaker 1>a lot of unique factors, because those happen too. There's

0:33:08.720 --> 0:33:12.120
<v Speaker 1>stuff that happens on the road every single day that

0:33:12.200 --> 0:33:16.000
<v Speaker 1>happens in a way that nobody anticipated. And because we

0:33:16.080 --> 0:33:19.640
<v Speaker 1>have so many people driving so many cars on so

0:33:19.680 --> 0:33:23.120
<v Speaker 1>many roads under so many conditions on a daily basis,

0:33:23.720 --> 0:33:26.560
<v Speaker 1>it's inevitable that we're going to have moments where those

0:33:26.680 --> 0:33:29.640
<v Speaker 1>unique situations pop up and it would be impossible to

0:33:29.920 --> 0:33:35.080
<v Speaker 1>identify or predict them. So in those cases, would you

0:33:35.200 --> 0:33:38.480
<v Speaker 1>still hold someone who made the code responsible that

0:33:38.560 --> 0:33:41.680
<v Speaker 1>they weren't able to predict something that nobody could predict?

0:33:41.960 --> 0:33:47.080
<v Speaker 1>Or does that put them at an unreasonable standard? Is

0:33:47.120 --> 0:33:50.280
<v Speaker 1>it the fault of the car manufacturer? Is it the

0:33:50.320 --> 0:33:53.280
<v Speaker 1>fault of the person who designed the road. I mean,

0:33:53.280 --> 0:33:56.520
<v Speaker 1>there's so many different questions and we don't have all

0:33:56.520 --> 0:34:00.320
<v Speaker 1>the answers, But I think in this case, with the painting,

0:34:00.800 --> 0:34:04.760
<v Speaker 1>we have this high profile example of AI producing something.

0:34:05.440 --> 0:34:08.520
<v Speaker 1>It leads us to get into a deeper conversation about

0:34:08.520 --> 0:34:11.759
<v Speaker 1>those ideas, and my guess is we will ultimately come

0:34:11.840 --> 0:34:17.000
<v Speaker 1>up with answers that are not entirely satisfactory for all situations,

0:34:17.040 --> 0:34:20.120
<v Speaker 1>but maybe some people will even go so far as

0:34:20.160 --> 0:34:26.160
<v Speaker 1>to vehemently disagree with them. But more importantly, we

0:34:26.200 --> 0:34:30.399
<v Speaker 1>will actually have, maybe, answers, right? So, yeah, it might

0:34:30.400 --> 0:34:33.239
<v Speaker 1>be answers that not everyone is happy with, but at

0:34:33.320 --> 0:34:35.400
<v Speaker 1>least they would be answers. Right now we have nothing.

0:34:36.080 --> 0:34:39.680
<v Speaker 1>So this is a good case study for us to say,

0:34:39.920 --> 0:34:43.759
<v Speaker 1>we've got to start thinking about this stuff because the

0:34:43.840 --> 0:34:47.319
<v Speaker 1>era of AI playing a more pivotal role in our

0:34:47.360 --> 0:34:49.560
<v Speaker 1>lives is right around the corner, and it would be

0:34:49.560 --> 0:34:52.680
<v Speaker 1>better for us to figure this out now rather than

0:34:52.760 --> 0:34:55.520
<v Speaker 1>have to react to it later, when it's too late.

0:34:55.920 --> 0:34:58.120
<v Speaker 1>I'm curious to hear what you guys have to say

0:34:58.160 --> 0:35:01.120
<v Speaker 1>about this subject. Why don't you pop on over to

0:35:01.280 --> 0:35:05.000
<v Speaker 1>tech Stuff podcast dot com. That's our website. Get in

0:35:05.080 --> 0:35:07.239
<v Speaker 1>touch with me and let me know what you think.

0:35:07.560 --> 0:35:09.960
<v Speaker 1>If you have suggestions for future episodes of tech Stuff,

0:35:10.480 --> 0:35:12.560
<v Speaker 1>I'd love to hear those too. Make sure you go

0:35:12.600 --> 0:35:15.080
<v Speaker 1>over to teepublic dot com slash tech stuff. Check

0:35:15.120 --> 0:35:18.000
<v Speaker 1>out our store. There are lots of cool things over there.

0:35:18.239 --> 0:35:22.120
<v Speaker 1>Get yourself something fun for the holidays, because every purchase

0:35:22.160 --> 0:35:23.840
<v Speaker 1>you make goes to help the show, and I greatly

0:35:23.880 --> 0:35:27.719
<v Speaker 1>appreciate it, and I'll talk to you again really soon.

0:35:33.600 --> 0:35:36.000
<v Speaker 1>For more on this and thousands of other topics, visit

0:35:36.040 --> 0:35:47.200
<v Speaker 1>how stuff works dot com.